Management Cluster Ceph » History » Version 29
Laurent GUERBY, 13/08/2017 10:36
1 | 10 | Mehdi Abaakouk | {{>toc}} |
---|---|---|---|
2 | 1 | Mehdi Abaakouk | |
3 | 10 | Mehdi Abaakouk | h1. Management Cluster Ceph |
4 | 9 | Mehdi Abaakouk | |
5 | 8 | Mehdi Abaakouk | h2. Links |
6 | 8 | Mehdi Abaakouk | |
7 | 8 | Mehdi Abaakouk | * [[Openstack Management TTNN]] |
8 | 8 | Mehdi Abaakouk | * [[Openstack Setup VM pas dans openstack]] |
9 | 8 | Mehdi Abaakouk | * [[Openstack Installation nouvelle node du cluster]] |
10 | 8 | Mehdi Abaakouk | * [[Openstack Installation TTNN]] |
11 | 8 | Mehdi Abaakouk | * "Openstack tools for ttnn":/projects/git-tetaneutral-net/repository/openstack-tools |
12 | 8 | Mehdi Abaakouk | |
13 | 2 | Mehdi Abaakouk | h2. Adding a standard OSD |
14 | 2 | Mehdi Abaakouk | |
15 | 20 | Mehdi Abaakouk | |
16 | 20 | Mehdi Abaakouk | |
17 | 20 | Mehdi Abaakouk | <pre> |
18 | 1 | Mehdi Abaakouk | $ ceph-disk prepare --zap-disk --cluster-uuid 1fe74663-8dfa-486c-bb80-3bd94c90c967 --fs-type=ext4 /dev/sdX |
19 | 27 | Mehdi Abaakouk | $ tune2fs -c 0 -i 0 -m 0 /dev/sdX |
20 | 1 | Mehdi Abaakouk | $ smartctl --smart=on /dev/sdX # For monitoring. |
21 | 1 | Mehdi Abaakouk | </pre> |
22 | 1 | Mehdi Abaakouk | |
23 | 1 | Mehdi Abaakouk | Get the OSD id with the command below (it is the one at the very bottom, not attached to the tree): |
24 | 1 | Mehdi Abaakouk | |
25 | 1 | Mehdi Abaakouk | <pre> |
26 | 1 | Mehdi Abaakouk | ceph osd tree |
27 | 1 | Mehdi Abaakouk | </pre> |
28 | 1 | Mehdi Abaakouk | |
29 | 26 | Mehdi Abaakouk | *BEGIN WORKAROUND FOR THE PREPARE BUG* |
30 | 26 | Mehdi Abaakouk | |
31 | 26 | Mehdi Abaakouk | If the OSD is DOWN after the prepare step, this bug is the likely cause. |
32 | 26 | Mehdi Abaakouk | |
33 | 26 | Mehdi Abaakouk | ID is the first free OSD number counting from zero (shown at the bottom of ceph osd tree). |
34 | 26 | Mehdi Abaakouk | |
35 | 26 | Mehdi Abaakouk | <pre> |
36 | 26 | Mehdi Abaakouk | mkdir /var/lib/ceph/osd/ceph-<ID> |
37 | 26 | Mehdi Abaakouk | chown ceph:ceph /var/lib/ceph/osd/ceph-<ID> |
38 | 26 | Mehdi Abaakouk | ceph-disk activate /dev/sd<X>1 |
39 | 26 | Mehdi Abaakouk | systemctl status ceph-osd@<ID> |
40 | 26 | Mehdi Abaakouk | </pre> |
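The rule for choosing ID (first free OSD number counting from zero) can be sketched in shell. This is a minimal sketch and not part of the original procedure: the function name `first_free_osd_id` is hypothetical, and it assumes the existing ids are listed one per line, for example by `ceph osd ls`.

```shell
# Hypothetical helper: print the smallest non-negative integer that is
# not present in the list of OSD ids read from stdin (one id per line).
first_free_osd_id() {
    sort -n | awk '
        BEGIN { want = 0 }
        $1 == want { want++ }   # id is taken, try the next one
        END { print want }
    '
}

# Assumed usage on a cluster node:
#   ceph osd ls | first_free_osd_id
```

The result should still be checked against the bottom of `ceph osd tree` before creating the directory.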
41 | 26 | Mehdi Abaakouk | |
42 | 26 | Mehdi Abaakouk | *END WORKAROUND FOR THE PREPARE BUG* |
43 | 26 | Mehdi Abaakouk | |
44 | 17 | Mehdi Abaakouk | For an HDD: |
45 | 2 | Mehdi Abaakouk | <pre> |
46 | 2 | Mehdi Abaakouk | $ ceph osd crush add osd.<ID> 0 root=default host=<host> |
47 | 1 | Mehdi Abaakouk | </pre> |
48 | 2 | Mehdi Abaakouk | |
49 | 17 | Mehdi Abaakouk | For an SSD: |
50 | 2 | Mehdi Abaakouk | <pre> |
51 | 2 | Mehdi Abaakouk | $ ceph osd crush add osd.<ID> 0 root=ssd host=<host>-ssd |
52 | 2 | Mehdi Abaakouk | </pre> |
53 | 2 | Mehdi Abaakouk | |
54 | 2 | Mehdi Abaakouk | Then allow Ceph to place data on it: |
55 | 2 | Mehdi Abaakouk | |
56 | 2 | Mehdi Abaakouk | <pre> |
57 | 2 | Mehdi Abaakouk | $ /root/tools/ceph-reweight-osds.sh osd.<ID> |
58 | 19 | Mehdi Abaakouk | </pre> |
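The HDD/SSD naming convention used above (HDDs under `root=default host=<host>`, SSDs under `root=ssd host=<host>-ssd`) can be captured in a small helper. This is only a sketch: the function name `crush_add_cmd` and its arguments are hypothetical, and it prints the command instead of running it so that it can be reviewed first.

```shell
# Hypothetical helper: print (without running it) the crush add command
# for a new OSD, following the HDD/SSD convention of this page.
#   $1 = device type (hdd or ssd), $2 = OSD id, $3 = host name
crush_add_cmd() {
    kind="$1" id="$2" host="$3"
    case "$kind" in
        hdd) echo "ceph osd crush add osd.${id} 0 root=default host=${host}" ;;
        ssd) echo "ceph osd crush add osd.${id} 0 root=ssd host=${host}-ssd" ;;
        *)   echo "unknown device type: ${kind}" >&2; return 1 ;;
    esac
}

# Example: crush_add_cmd ssd 7 g3
```

The initial weight stays at 0, as in the commands above, so no data moves until the reweight script is run.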
59 | 19 | Mehdi Abaakouk | |
60 | 19 | Mehdi Abaakouk | h2. Draining an OSD: |
61 | 19 | Mehdi Abaakouk | |
62 | 19 | Mehdi Abaakouk | <pre> |
63 | 19 | Mehdi Abaakouk | vider_osd(){ |
64 | 19 | Mehdi Abaakouk | name="$1" |
65 | 19 | Mehdi Abaakouk | ceph osd out ${name} |
66 | 19 | Mehdi Abaakouk | ceph osd crush reweight ${name} 0 |
67 | 19 | Mehdi Abaakouk | ceph osd reweight ${name} 0 |
68 | 19 | Mehdi Abaakouk | } |
69 | 19 | Mehdi Abaakouk | </pre> |
70 | 19 | Mehdi Abaakouk | |
71 | 19 | Mehdi Abaakouk | h2. Removing an OSD: |
72 | 19 | Mehdi Abaakouk | |
73 | 19 | Mehdi Abaakouk | <pre> |
74 | 19 | Mehdi Abaakouk | remove_osd(){ |
75 | 19 | Mehdi Abaakouk | name="$1" |
76 | 19 | Mehdi Abaakouk | ceph osd out ${name} |
77 | 19 | Mehdi Abaakouk | systemctl stop ceph-osd@${name#osd.} |
78 | 19 | Mehdi Abaakouk | ceph osd crush remove ${name} |
79 | 19 | Mehdi Abaakouk | ceph auth del ${name} |
80 | 19 | Mehdi Abaakouk | ceph osd rm ${name} |
81 | 19 | Mehdi Abaakouk | ceph osd tree |
82 | 19 | Mehdi Abaakouk | } |
83 | 19 | Mehdi Abaakouk | </pre> |
84 | 19 | Mehdi Abaakouk | |
85 | 19 | Mehdi Abaakouk | h2. Stopping recovery IO: |
86 | 19 | Mehdi Abaakouk | |
87 | 19 | Mehdi Abaakouk | <pre> |
88 | 19 | Mehdi Abaakouk | ceph osd set nobackfill |
89 | 19 | Mehdi Abaakouk | ceph osd set norebalance |
90 | 19 | Mehdi Abaakouk | ceph osd set norecover |
91 | 19 | Mehdi Abaakouk | </pre> |
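The three flags stay set until they are removed with `ceph osd unset`. A minimal sketch of a helper that handles both directions follows; the function name `recovery_flags` is hypothetical and not part of the original tooling.

```shell
# Hypothetical helper: set or unset the three recovery-related flags.
#   $1 = "set" to pause recovery IO, "unset" to resume it
recovery_flags() {
    action="$1"
    case "$action" in
        set|unset) ;;
        *) echo "usage: recovery_flags set|unset" >&2; return 1 ;;
    esac
    for flag in nobackfill norebalance norecover; do
        ceph osd "$action" "$flag" || return 1
    done
}

# Example: recovery_flags unset
```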
92 | 19 | Mehdi Abaakouk | |
93 | 23 | Mehdi Abaakouk | h2. Upgrade procedure |
94 | 23 | Mehdi Abaakouk | |
95 | 23 | Mehdi Abaakouk | + |
96 | 23 | Mehdi Abaakouk | _*/!\ Read the release notes (they very often contain extra steps to perform) /!\*_+ |
97 | 23 | Mehdi Abaakouk | |
98 | 23 | Mehdi Abaakouk | |
99 | 23 | Mehdi Abaakouk | h4. Upgrading the MONs: |
100 | 23 | Mehdi Abaakouk | |
101 | 23 | Mehdi Abaakouk | Set the noout flag: |
102 | 23 | Mehdi Abaakouk | |
103 | 23 | Mehdi Abaakouk | <pre>ceph osd set noout</pre> |
104 | 23 | Mehdi Abaakouk | |
105 | 23 | Mehdi Abaakouk | On each MON (g1/g2/g3): |
106 | 23 | Mehdi Abaakouk | <pre> |
107 | 23 | Mehdi Abaakouk | apt-get upgrade -y |
108 | 23 | Mehdi Abaakouk | systemctl restart ceph-mon@g* |
109 | 23 | Mehdi Abaakouk | ceph -s |
110 | 23 | Mehdi Abaakouk | </pre> |
111 | 23 | Mehdi Abaakouk | |
112 | 23 | Mehdi Abaakouk | Note that only restarting the 'leader/master' node causes a very brief interruption, and it often goes unnoticed. |
113 | 23 | Mehdi Abaakouk | |
114 | 23 | Mehdi Abaakouk | h4. Upgrading the OSDs: |
115 | 23 | Mehdi Abaakouk | |
116 | 23 | Mehdi Abaakouk | On each machine: |
117 | 23 | Mehdi Abaakouk | <pre> |
118 | 23 | Mehdi Abaakouk | apt-get upgrade -y |
119 | 23 | Mehdi Abaakouk | systemctl restart ceph-osd@* |
120 | 23 | Mehdi Abaakouk | </pre> |
121 | 23 | Mehdi Abaakouk | |
122 | 23 | Mehdi Abaakouk | Then wait for recovery to finish before moving on to the next machine. |
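Waiting for recovery can be scripted, with one caveat: while noout is set the cluster stays in HEALTH_WARN, so waiting for HEALTH_OK would never finish. The sketch below instead scans the health text for PG states that suggest recovery is still running. The function name `recovery_done` and the keyword list are assumptions, and the wording of the health output varies between Ceph versions, so treat it as a heuristic and confirm with `ceph -s`.

```shell
# Hypothetical helper: succeed only if the health text read from stdin
# mentions no PG state that suggests recovery is still in progress.
# The keyword list is an assumption and is not exhaustive.
recovery_done() {
    text="$(cat)"
    [ -n "$text" ] || return 1    # no output at all: assume not done
    ! printf '%s\n' "$text" |
        grep -Eqi 'degraded|undersized|recover|backfill|peering|remapped'
}

# Assumed usage between two machines:
#   until ceph health detail | recovery_done; do sleep 30; done
```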
123 | 23 | Mehdi Abaakouk | |
124 | 23 | Mehdi Abaakouk | Once all OSDs have been upgraded and restarted, run: |
125 | 23 | Mehdi Abaakouk | |
126 | 23 | Mehdi Abaakouk | <pre>ceph osd unset noout</pre> |
127 | 23 | Mehdi Abaakouk | |
128 | 19 | Mehdi Abaakouk | h2. Cold replacement of a cache tier: |
129 | 19 | Mehdi Abaakouk | |
130 | 19 | Mehdi Abaakouk | upstream doc: http://docs.ceph.com/docs/master/rados/operations/cache-tiering/ |
131 | 19 | Mehdi Abaakouk | |
132 | 19 | Mehdi Abaakouk | <pre> |
133 | 19 | Mehdi Abaakouk | ceph osd tier cache-mode ec8p2c forward |
134 | 19 | Mehdi Abaakouk | rados -p ec8p2c cache-flush-evict-all |
135 | 19 | Mehdi Abaakouk | ceph osd tier remove-overlay ec8p2 |
136 | 19 | Mehdi Abaakouk | ceph osd tier remove ec8p2 ec8p2c |
137 | 19 | Mehdi Abaakouk | |
138 | 19 | Mehdi Abaakouk | rados rmpool ec8p2c ec8p2c --yes-i-really-really-mean-it |
139 | 19 | Mehdi Abaakouk | ceph osd pool create ec8p2c 128 128 replicated |
140 | 19 | Mehdi Abaakouk | |
141 | 19 | Mehdi Abaakouk | ceph osd tier add ec8p2 ec8p2c |
142 | 19 | Mehdi Abaakouk | ceph osd tier cache-mode ec8p2c writeback |
143 | 19 | Mehdi Abaakouk | ceph osd tier set-overlay ec8p2 ec8p2c |
144 | 19 | Mehdi Abaakouk | |
145 | 19 | Mehdi Abaakouk | ceph osd pool set ec8p2c size 3 |
146 | 19 | Mehdi Abaakouk | ceph osd pool set ec8p2c min_size 2 |
147 | 19 | Mehdi Abaakouk | ceph osd pool set ec8p2c hit_set_type bloom |
148 | 19 | Mehdi Abaakouk | |
149 | 19 | Mehdi Abaakouk | ceph osd pool set ec8p2c hit_set_count 1 |
150 | 19 | Mehdi Abaakouk | ceph osd pool set ec8p2c hit_set_period 3600 |
151 | 19 | Mehdi Abaakouk | ceph osd pool set ec8p2c target_max_bytes 200000000000 |
152 | 19 | Mehdi Abaakouk | ceph osd pool set ec8p2c target_max_objects 10000000 |
153 | 19 | Mehdi Abaakouk | ceph osd pool set ec8p2c cache_target_dirty_ratio 0.4 |
154 | 19 | Mehdi Abaakouk | ceph osd pool set ec8p2c cache_target_full_ratio 0.8 |
155 | 19 | Mehdi Abaakouk | </pre> |
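The two ratios are relative to the pool targets, so with `target_max_bytes 200000000000` they translate into absolute byte thresholds. A small sketch of that arithmetic follows; the function name `cache_threshold_bytes` is hypothetical.

```shell
# Hypothetical helper: byte threshold for a given ratio of target_max_bytes.
#   $1 = target_max_bytes, $2 = ratio (e.g. 0.4)
cache_threshold_bytes() {
    awk -v max="$1" -v ratio="$2" 'BEGIN { printf "%.0f\n", max * ratio }'
}

cache_threshold_bytes 200000000000 0.4   # cache_target_dirty_ratio
cache_threshold_bytes 200000000000 0.8   # cache_target_full_ratio
```

With the values above, the tiering agent starts flushing dirty objects at about 80 GB and starts evicting at about 160 GB of the 200 GB target. The same ratios also apply to `target_max_objects`.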
156 | 19 | Mehdi Abaakouk | |
157 | 16 | Mehdi Abaakouk | h2. Adding an OSD that shares the SSD with the OS (OBSOLETE, NO LONGER COMPATIBLE WITH FUTURE CEPH VERSIONS) |
158 | 2 | Mehdi Abaakouk | |
159 | 2 | Mehdi Abaakouk | |
160 | 2 | Mehdi Abaakouk | Normally Ceph is given a whole disk and creates two partitions on it, one for the OSD journal and one for the data. |
161 | 2 | Mehdi Abaakouk | For the tetaneutral SSD, which also holds the OS, use the method below. |
162 | 2 | Mehdi Abaakouk | |
163 | 2 | Mehdi Abaakouk | Manually create the Ceph data partition (/dev/sda2 here). |
164 | 7 | Mehdi Abaakouk | |
165 | 7 | Mehdi Abaakouk | Debian (MBR format): |
166 | 2 | Mehdi Abaakouk | <pre> |
167 | 5 | Mehdi Abaakouk | apt-get install parted |
168 | 5 | Mehdi Abaakouk | fdisk /dev/sda |
169 | 5 | Mehdi Abaakouk | |
170 | 2 | Mehdi Abaakouk | n |
171 | 14 | Mehdi Abaakouk | p |
172 | 14 | Mehdi Abaakouk | <enter> |
173 | 14 | Mehdi Abaakouk | <enter> |
174 | 14 | Mehdi Abaakouk | <enter> |
175 | 14 | Mehdi Abaakouk | <enter> |
176 | 14 | Mehdi Abaakouk | w |
177 | 14 | Mehdi Abaakouk | |
178 | 14 | Mehdi Abaakouk | $ partprobe |
179 | 14 | Mehdi Abaakouk | </pre> |
180 | 14 | Mehdi Abaakouk | |
181 | 14 | Mehdi Abaakouk | Ubuntu (GPT format): |
182 | 2 | Mehdi Abaakouk | <pre> |
183 | 2 | Mehdi Abaakouk | # parted /dev/sdb |
184 | 2 | Mehdi Abaakouk | GNU Parted 2.3 |
185 | 13 | Mehdi Abaakouk | Using /dev/sdb |
186 | 13 | Mehdi Abaakouk | Welcome to GNU Parted! Type 'help' to view a list of commands. |
187 | 13 | Mehdi Abaakouk | (parted) print |
188 | 18 | Mehdi Abaakouk | Model: ATA SAMSUNG MZ7KM480 (scsi) |
189 | 13 | Mehdi Abaakouk | Disk /dev/sdb: 480GB |
190 | 13 | Mehdi Abaakouk | Sector size (logical/physical): 512B/512B |
191 | 13 | Mehdi Abaakouk | Partition Table: msdos |
192 | 13 | Mehdi Abaakouk | |
193 | 13 | Mehdi Abaakouk | Number Start End Size Type File system Flags |
194 | 2 | Mehdi Abaakouk | 1 1049kB 20.0GB 20.0GB primary ext4 boot |
195 | 2 | Mehdi Abaakouk | 2 20.0GB 36.0GB 16.0GB primary linux-swap(v1) |
196 | 15 | Mehdi Abaakouk | |
197 | 15 | Mehdi Abaakouk | (parted) mkpart |
198 | 15 | Mehdi Abaakouk | Partition type? primary/extended? |
199 | 15 | Mehdi Abaakouk | Partition type? primary/extended? primary |
200 | 15 | Mehdi Abaakouk | File system type? [ext2]? xfs |
201 | 15 | Mehdi Abaakouk | Start? |
202 | 15 | Mehdi Abaakouk | Start? 36.0GB |
203 | 15 | Mehdi Abaakouk | End? 100% |
204 | 1 | Mehdi Abaakouk | (parted) print |
205 | 1 | Mehdi Abaakouk | Model: ATA SAMSUNG MZ7KM480 (scsi) |
206 | 1 | Mehdi Abaakouk | Disk /dev/sdb: 480GB |
207 | 1 | Mehdi Abaakouk | Sector size (logical/physical): 512B/512B |
208 | 1 | Mehdi Abaakouk | Partition Table: msdos |
209 | 1 | Mehdi Abaakouk | |
210 | 1 | Mehdi Abaakouk | Number Start End Size Type File system Flags |
211 | 1 | Mehdi Abaakouk | 1 1049kB 20.0GB 20.0GB primary ext4 boot |
212 | 1 | Mehdi Abaakouk | 2 20.0GB 36.0GB 16.0GB primary linux-swap(v1) |
213 | 1 | Mehdi Abaakouk | 3 36.0GB 480GB 444GB primary |
214 | 1 | Mehdi Abaakouk | |
215 | 1 | Mehdi Abaakouk | (parted) quit |
216 | 1 | Mehdi Abaakouk | Information: You may need to update /etc/fstab. |
217 | 1 | Mehdi Abaakouk | </pre> |
218 | 1 | Mehdi Abaakouk | |
219 | 1 | Mehdi Abaakouk | Prepare the disk as usual: |
220 | 1 | Mehdi Abaakouk | |
221 | 1 | Mehdi Abaakouk | <pre> |
222 | 1 | Mehdi Abaakouk | ceph-disk prepare --fs-type=ext4 --cluster-uuid 1fe74663-8dfa-486c-bb80-3bd94c90c967 /dev/sda2 |
223 | 1 | Mehdi Abaakouk | ceph-disk activate /dev/sda2 |
224 | 1 | Mehdi Abaakouk | ceph osd crush add osd.<ID> 0 root=ssd host=g3-ssd |
225 | 1 | Mehdi Abaakouk | </pre> |
226 | 1 | Mehdi Abaakouk | |
227 | 1 | Mehdi Abaakouk | Then allow Ceph to place data on it: |
228 | 1 | Mehdi Abaakouk | |
229 | 1 | Mehdi Abaakouk | <pre> |
230 | 1 | Mehdi Abaakouk | $ /root/tools/ceph-reweight-osds.sh osd.<ID> |
231 | 1 | Mehdi Abaakouk | </pre> |
232 | 28 | Laurent GUERBY | |
233 | 28 | Laurent GUERBY | h2. inconsistent pg |
234 | 28 | Laurent GUERBY | |
235 | 28 | Laurent GUERBY | <pre> |
236 | 28 | Laurent GUERBY | root@g1:~# ceph health detail |
237 | 28 | Laurent GUERBY | HEALTH_ERR 1 pgs inconsistent; 2 scrub errors |
238 | 28 | Laurent GUERBY | pg 58.22d is active+clean+inconsistent, acting [9,47,37] |
239 | 28 | Laurent GUERBY | 2 scrub errors |
240 | 28 | Laurent GUERBY | root@g1:~# rados list-inconsistent-obj 58.22d --format=json-pretty |
241 | 28 | Laurent GUERBY | { |
242 | 28 | Laurent GUERBY | "epoch": 269000, |
243 | 28 | Laurent GUERBY | "inconsistents": [ |
244 | 28 | Laurent GUERBY | { |
245 | 28 | Laurent GUERBY | "object": { |
246 | 28 | Laurent GUERBY | "name": "rbd_data.11f20f75aac8266.00000000000f79f9", |
247 | 28 | Laurent GUERBY | "nspace": "", |
248 | 28 | Laurent GUERBY | "locator": "", |
249 | 28 | Laurent GUERBY | "snap": "head", |
250 | 28 | Laurent GUERBY | "version": 9894452 |
251 | 28 | Laurent GUERBY | }, |
252 | 28 | Laurent GUERBY | "errors": [ |
253 | 28 | Laurent GUERBY | "data_digest_mismatch" |
254 | 28 | Laurent GUERBY | ], |
255 | 28 | Laurent GUERBY | "union_shard_errors": [ |
256 | 28 | Laurent GUERBY | "data_digest_mismatch_oi" |
257 | 28 | Laurent GUERBY | ], |
258 | 28 | Laurent GUERBY | "selected_object_info": "58:b453643a:::rbd_data.11f20f75aac8266.00000000000f79f9:head(261163'9281748 osd.9.0:6221608 dirty|data_digest|omap_digest s 4194304 uv 9894452 dd 2193d055 od ffffffff alloc_hint [0 0])", |
259 | 28 | Laurent GUERBY | "shards": [ |
260 | 28 | Laurent GUERBY | { |
261 | 28 | Laurent GUERBY | "osd": 9, |
262 | 28 | Laurent GUERBY | "errors": [], |
263 | 28 | Laurent GUERBY | "size": 4194304, |
264 | 28 | Laurent GUERBY | "omap_digest": "0xffffffff", |
265 | 28 | Laurent GUERBY | "data_digest": "0x2193d055" |
266 | 28 | Laurent GUERBY | }, |
267 | 28 | Laurent GUERBY | { |
268 | 28 | Laurent GUERBY | "osd": 37, |
269 | 28 | Laurent GUERBY | "errors": [ |
270 | 28 | Laurent GUERBY | "data_digest_mismatch_oi" |
271 | 28 | Laurent GUERBY | ], |
272 | 28 | Laurent GUERBY | "size": 4194304, |
273 | 28 | Laurent GUERBY | "omap_digest": "0xffffffff", |
274 | 28 | Laurent GUERBY | "data_digest": "0x05891fb4" |
275 | 28 | Laurent GUERBY | }, |
276 | 28 | Laurent GUERBY | { |
277 | 28 | Laurent GUERBY | "osd": 47, |
278 | 28 | Laurent GUERBY | "errors": [], |
279 | 28 | Laurent GUERBY | "size": 4194304, |
280 | 28 | Laurent GUERBY | "omap_digest": "0xffffffff", |
281 | 28 | Laurent GUERBY | "data_digest": "0x2193d055" |
282 | 28 | Laurent GUERBY | } |
283 | 28 | Laurent GUERBY | ] |
284 | 28 | Laurent GUERBY | } |
285 | 28 | Laurent GUERBY | ] |
286 | 28 | Laurent GUERBY | } |
287 | 29 | Laurent GUERBY | root@g1:~# ceph osd map disks rbd_data.11f20f75aac8266.00000000000f79f9 |
288 | 29 | Laurent GUERBY | osdmap e269110 pool 'disks' (58) object 'rbd_data.11f20f75aac8266.00000000000f79f9' -> pg 58.5c26ca2d (58.22d) -> up ([9,47,37], p9) acting ([9,47,37], p9) |
289 | 29 | Laurent GUERBY | |
290 | 28 | Laurent GUERBY | </pre> |
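In the output above, OSDs 9 and 47 agree on data_digest 0x2193d055 while OSD 37 reports data_digest_mismatch_oi with 0x05891fb4, so the divergent copy is the one on OSD 37. Picking out the shards with errors can be sketched with awk. The function name `bad_shards` is hypothetical, and the parsing assumes the one-key-per-line layout of `--format=json-pretty` with "osd" printed before "errors" in each shard, as shown above.

```shell
# Hypothetical helper: print the ids of the OSDs whose shard has a
# non-empty "errors" list. Reads the json-pretty output on stdin and
# relies on its one-key-per-line layout (an assumption).
bad_shards() {
    awk '
        /"osd":/ { gsub(/[^0-9]/, ""); osd = $0; next }
        /"errors": *\[\]/ { osd = ""; next }    # shard without errors
        /"errors": *\[/ { if (osd != "") print osd; osd = "" }
    '
}

# Assumed usage:
#   rados list-inconsistent-obj 58.22d --format=json-pretty | bad_shards
```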
291 | 29 | Laurent GUERBY | |
292 | 29 | Laurent GUERBY | * http://cephnotes.ksperis.com/blog/2013/08/20/ceph-osd-where-is-my-data |