The 'setattr' VFS method (an inode operation) modifies the attributes
associated with a resource. The attributes are the same ones returned
by a 'stat' operation: mode, uid, gid, size, atime, ctime, and mtime.

Changing the File Mode Attribute
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

If only the file 'mode' is being modified (by a 'chmod' command, for
instance) then the interaction is relatively simple, as shown in the
following figure.

.Setattr RPCs for Changing the Resource's Mode
image::chmod_rpcs.png["setattr RPCs for changing mode",height=100]

//////////////////////////////////////////////////////////////////////
The chmod_rpcs.png diagram resembles this text art:

------- ------- -------
//////////////////////////////////////////////////////////////////////

*1 - Client1 issues an MDS_REINT with the REINT_SETATTR sub-operation.*

In addition to the 'ptlrpc_body' (Lustre RPC descriptor), the MDS_REINT
request RPC from the client has the REINT structure 'mdt_rec_setattr', and a
lock request 'ldlm_request'. For a detailed discussion of all the fields in
the 'mdt_rec_setattr' and 'ldlm_request' refer to <<mdt-rec-setattr>>
and <<struct-ldlm-request>>.

.MDS_REINT:REINT_SETATTR Request Packet Structure
image::mds-reint-setattr-request.png["MDS_REINT:REINT_SETATTR Request Packet Structure",height=50]

//////////////////////////////////////////////////////////////////////
The mds-reint-setattr-request.png diagram resembles this text art:

--REINT_SETATTR-request-------------------------
| ptlrpc_body | mdt_rec_setattr | ldlm_request |
------------------------------------------------
//////////////////////////////////////////////////////////////////////

In this case the 'setattr' wants to set the mode attribute on
the resource. The 'mdt_rec_setattr' identifies the resource with the
'sa_fid' field, and the 'sa_valid' field is set to 0x2041:

.Flags for 'sa_valid' field of 'struct mdt_rec_setattr'
|====
| MDS_ATTR_MODE | mode attribute
| MDS_ATTR_CTIME | ctime attribute
| MDS_ATTR_CTIME_SET | ctime is being set
|====

So the 'ctime' is also updated on the MDT. The mode and time values
are put in the corresponding fields of the 'mdt_rec_setattr', and the
other attribute fields will be ignored.
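
The 'sa_valid' value can be cross-checked against the flag table by
decoding its bits. The following is a minimal sketch, not Lustre code:
the numeric values are assumed to match the MDS_ATTR_* definitions in
Lustre's 'lustre_idl.h' and should be verified against the headers for
the release in use, and 'decode_sa_valid' is a hypothetical helper name.

```python
# Assumed MDS_ATTR_* bit values (verify against lustre_idl.h for your release).
MDS_ATTR = {
    0x0001: "MDS_ATTR_MODE",
    0x0002: "MDS_ATTR_UID",
    0x0004: "MDS_ATTR_GID",
    0x0008: "MDS_ATTR_SIZE",
    0x0010: "MDS_ATTR_ATIME",
    0x0020: "MDS_ATTR_MTIME",
    0x0040: "MDS_ATTR_CTIME",
    0x0080: "MDS_ATTR_ATIME_SET",
    0x0100: "MDS_ATTR_MTIME_SET",
    0x2000: "MDS_ATTR_CTIME_SET",
}


def decode_sa_valid(sa_valid):
    """Return (names of known flags set in sa_valid, leftover unknown bits)."""
    names = [name for bit, name in sorted(MDS_ATTR.items()) if sa_valid & bit]
    known = sum(bit for bit in MDS_ATTR if sa_valid & bit)
    return names, sa_valid & ~known


names, unknown = decode_sa_valid(0x2041)
print(names, hex(unknown))
```

Under these assumed values, 0x2041 decodes to exactly the three flags
in the table, with no bits left over.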

The 'ldlm_request' structure carries an early lock cancellation
(see <<early-lock-cancellation>>) for the lock that the client had
previously acquired on the target resource. The lock handle
identifies this lock. Only 'lock_count' and 'lock_handle' are used; the
rest of the 'ldlm_request' is cleared, i.e. all other fields are set
to zero.

*2 - The MDS_REINT reply acknowledges the updated attributes.*

In addition to the 'ptlrpc_body' (Lustre RPC descriptor), the MDS_REINT
reply RPC to the client has the 'mdt_body' structure. For a detailed
discussion of the fields in the 'mdt_body' refer to <<struct-mdt-body>>.

.MDS_REINT:REINT_SETATTR Reply Packet Structure
image::mds-reint-setattr-reply.png["MDS_REINT:REINT_SETATTR Reply Packet Structure",height=50]

//////////////////////////////////////////////////////////////////////
The mds-reint-setattr-reply.png diagram resembles this text art:

--REINT_SETATTR-reply-----
| ptlrpc_body | mdt_body |
--------------------------
//////////////////////////////////////////////////////////////////////

The reply from the MDT after the setattr operation has these valid
flags set:

.Flags for 'mbo_valid' field of 'struct mdt_body'
|====
| OBD_MD_FLMTIME | mtime attribute
| OBD_MD_FLSIZE | size attribute
| OBD_MD_FLBLOCKS | blocks attribute
| OBD_MD_FLBLKSZ | block size attribute
| OBD_MD_FLTYPE | type attribute
| OBD_MD_FLNLINK | number of links attribute
| OBD_MD_FLRDEV | device attribute
|====

So the client is updated with any other information the MDT has after
the attributes were set at the client's request.

Changing the File Time Attributes
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The RPCs that get sent for a 'setattr' depend on which values are
being set. If the time values are being set (as in a "touch" command)
then more RPCs are needed than the MDS_REINT with the REINT_SETATTR
sub-operation, which updates the time values on the MDT. That
operation is followed by an OST_SETATTR that sets the time values on
the OST (or OSTs if there are several). But in order to know which
OSTs to contact the client must first get the layout of the resource.
Then it can send the OST_SETATTR RPC to the appropriate OSTs and
update the time attributes.

.Setattr RPCs for Changing the Resource's Time Attributes
image::touch_rpcs.png["setattr RPCs for the time attributes",height=200]

//////////////////////////////////////////////////////////////////////
The touch_rpcs.png diagram resembles this text art:

------- ------- -------
4 <-------LDLM_ENQUEUE
5 OST_SETATTR------------------>
6 <--------------------OST_SETATTR
//////////////////////////////////////////////////////////////////////

*1 - The client issues an MDS_REINT with the REINT_SETATTR
sub-operation.*

The MDS_REINT request RPC closely resembles the one described above,
but in this case the 'setattr' wants to set the time attributes on the
resource. The 'mdt_rec_setattr' again identifies the resource with the
'sa_fid' field, and the 'sa_valid' field is set to 0x21f0:

.Flags for 'sa_valid' field of 'struct mdt_rec_setattr'
|====
| MDS_ATTR_ATIME | atime attribute
| MDS_ATTR_MTIME | mtime attribute
| MDS_ATTR_CTIME | ctime attribute
| MDS_ATTR_ATIME_SET | atime is being set
| MDS_ATTR_MTIME_SET | mtime is being set
| MDS_ATTR_CTIME_SET | ctime is being set
|====

The time values are put in the corresponding fields of the
'mdt_rec_setattr', and the other attribute fields will be ignored.
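
As a quick consistency check, the six flags in the table can be OR-ed
together and compared with the quoted 'sa_valid'. This is only a
sketch: the 'MdsAttr' class and 'touch_sa_valid' function are
hypothetical names, and the bit values are assumed to match the
MDS_ATTR_* definitions in 'lustre_idl.h'.

```python
from enum import IntFlag


class MdsAttr(IntFlag):
    # Assumed values; verify against lustre_idl.h for your release.
    ATIME = 0x10
    MTIME = 0x20
    CTIME = 0x40
    ATIME_SET = 0x80
    MTIME_SET = 0x100
    CTIME_SET = 0x2000


def touch_sa_valid():
    """OR together the flags listed for the time-setting request."""
    times = MdsAttr.ATIME | MdsAttr.MTIME | MdsAttr.CTIME
    explicit = MdsAttr.ATIME_SET | MdsAttr.MTIME_SET | MdsAttr.CTIME_SET
    return times | explicit


sa_valid = touch_sa_valid()
print(hex(sa_valid))
```

Each time attribute appears twice in the mask: once to mark the
attribute as valid and once (the *_SET flag) to mark that an explicit
value is supplied.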

There is again an early lock cancellation, since the client knows it
no longer needs to hold a lock on the MDT resource attributes.

*2 - The MDS_REINT reply acknowledges the updated times.*

The MDS_REINT reply is identical to the previous case in every way,
including which valid attributes it echoes back.

*3 - The client asks for an intent lock on the layout data for the
resource.*

Before communicating with the OSTs the client needs to know which ones
are involved with this resource, and before it can ask for that
'layout' information it must acquire a 'layout lock'. The
LDLM_ENQUEUE RPC in this case has (in addition to the 'ptlrpc_body'
structure) an 'ldlm_request', an 'ldlm_intent', and a 'layout_intent'.

.LDLM_ENQUEUE Intent:Layout Request Packet Structure
image::ldlm-enqueue-intent-layout-request.png["LDLM_ENQUEUE Intent:Layout request Packet Structure",height=50]

//////////////////////////////////////////////////////////////////////
The ldlm-enqueue-intent-layout-request.png diagram resembles this text
art:

--intent:layout request------------------------------------
| ptlrpc_body | ldlm_request |ldlm_intent | layout_intent |
-----------------------------------------------------------
//////////////////////////////////////////////////////////////////////

The 'ldlm_request' asks for a read lock on the resource and has its
intent flag set. The 'ldlm_intent' has the intent opcode 0x800:
IT_LAYOUT. The 'layout_intent' has the 'li_opc' value 0:
LAYOUT_INTENT_ACCESS.

*4 - The MDS replies with a read lock on the layout.*

The LDLM_ENQUEUE reply that the MDS sends back grants the read lock on
the layout and provides a Lock Value Block (LVB) describing the
layout of the resource. That layout is from the extended attribute
'trusted.lov' and has the structure 'lov_mds_md'.

.LDLM_ENQUEUE Intent:Layout Reply Packet Structure
image::ldlm-enqueue-intent-layout-reply.png["LDLM_ENQUEUE Intent:Layout reply Packet Structure",height=50]

//////////////////////////////////////////////////////////////////////
The ldlm-enqueue-intent-layout-reply.png diagram resembles this text
art:

--intent:layout reply--------------------
| ptlrpc_body | ldlm_reply | lov_mds_md |
-----------------------------------------
//////////////////////////////////////////////////////////////////////

*5 - The client issues an OST_SETATTR with the updated times, which
are maintained on the OST.*

At last the client can send an update to the OST. The OST_SETATTR RPC
has an 'ost_body' structure.

.OST_SETATTR Request Packet Structure
image::ost-setattr-request.png["OST_SETATTR Request Packet Structure",height=50]

//////////////////////////////////////////////////////////////////////
The ost-setattr-request.png diagram resembles this text art:

--request-----------------
| ptlrpc_body | ost_body |
--------------------------
//////////////////////////////////////////////////////////////////////

The 'ost_body' structure is documented in <<struct-ost-body>>. In
this case the 'o_valid' field is 0x300400f, so the valid fields are:

.Flags for 'o_valid' field of 'struct ost_body'
|====
| OBD_MD_FLATIME | atime attribute
| OBD_MD_FLMTIME | mtime attribute
| OBD_MD_FLCTIME | ctime attribute
| OBD_MD_FLGENER | generation
| OBD_MD_FLGROUP | group
|====
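
The 'o_valid' mask can be walked bit by bit to see which fields the
request marks as valid. The sketch below is illustrative only: the
OBD_MD_* values are assumed to match 'lustre_idl.h' and should be
checked against the headers in use, and 'describe_o_valid' is a
hypothetical helper. Unknown bits are reported as hex values instead
of being silently dropped.

```python
# Assumed OBD_MD_* bit values (verify against lustre_idl.h for your release).
OBD_MD = {
    "OBD_MD_FLID": 0x00000001,
    "OBD_MD_FLATIME": 0x00000002,
    "OBD_MD_FLMTIME": 0x00000004,
    "OBD_MD_FLCTIME": 0x00000008,
    "OBD_MD_FLGENER": 0x00004000,
    "OBD_MD_FLGROUP": 0x01000000,
    "OBD_MD_FLFID": 0x02000000,
}


def describe_o_valid(o_valid):
    """Map each set bit of o_valid to a flag name, or to hex if unknown."""
    by_bit = {bit: name for name, bit in OBD_MD.items()}
    out = []
    while o_valid:
        low = o_valid & -o_valid      # isolate the lowest set bit
        out.append(by_bit.get(low, hex(low)))
        o_valid &= ~low               # clear that bit and continue
    return out


flags = describe_o_valid(0x300400F)
print(flags)
```

Under these assumed values the mask also carries the bits for
OBD_MD_FLID and OBD_MD_FLFID in addition to the time, generation, and
group flags.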

*6 - The OST acknowledges the update.*

The reply RPC for the OST_SETATTR operation has the same form as the
request.

.OST_SETATTR Reply Packet Structure
image::ost-setattr-reply.png["OST_SETATTR Reply Packet Structure",height=50]

//////////////////////////////////////////////////////////////////////
The ost-setattr-reply.png diagram resembles this text art:

--reply-------------------
| ptlrpc_body | ost_body |
--------------------------
//////////////////////////////////////////////////////////////////////

The OST_SETATTR reply acknowledges the update and sends back an
'o_valid' of 0x10007bf, which indicates the fields:

.Flags for 'o_valid' field of 'struct ost_body'
|====
| OBD_MD_FLID | OST ID
| OBD_MD_FLATIME | atime attribute
| OBD_MD_FLMTIME | mtime attribute
| OBD_MD_FLCTIME | ctime attribute
| OBD_MD_FLSIZE | size attribute
| OBD_MD_FLBLOCKS | blocks attribute
| OBD_MD_FLMODE | mode attribute
| OBD_MD_FLTYPE | type attribute
| OBD_MD_FLUID | UID attribute
| OBD_MD_FLGID | GID attribute
| OBD_MD_FLGROUP | group
|====
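
Comparing the request and reply 'o_valid' masks shows which fields the
OST echoes back and which it adds. The sketch below uses only the two
hexadecimal values quoted in the text; 'compare_masks' is a
hypothetical helper name.

```python
REQUEST_O_VALID = 0x300400F   # OST_SETATTR request, as quoted in the text
REPLY_O_VALID = 0x10007BF     # OST_SETATTR reply, as quoted in the text


def compare_masks(request, reply):
    """Split the bits into echoed, reply-only, and request-only groups."""
    return {
        "echoed": request & reply,
        "reply_only": reply & ~request,
        "request_only": request & ~reply,
    }


groups = compare_masks(REQUEST_O_VALID, REPLY_O_VALID)
for group, bits in groups.items():
    print(f"{group:13s} {bits:#010x}")
```

The reply-only bits (0x7b0) line up with the size, blocks, mode, type,
UID, and GID rows that appear in the reply table but not in the
request table.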

Changing the Size Attribute
^^^^^^^^^^^^^^^^^^^^^^^^^^^

If the size is being set (as in a "truncate" command) then the client
(Client1) will issue an LDLM_ENQUEUE to the OST for a write lock on
the extent attributes of the resource. If another client (Client2)
holds a lock on the resource, then before the OST can grant the lock
to Client1 it has to interact with Client2. The OST sends an
LDLM_BL_CALLBACK request to Client2 asking Client2 to finish up with
the lock it has. Client2 replies with a simple acknowledgment. When
Client2 is no longer using the lock it sends an LDLM_CANCEL RPC to
the OST. At that point the OST grants the original request, sending an
LDLM_CP_CALLBACK request to Client1 to notify it. With that taken care
of, Client1 is finally able to issue the OST_PUNCH request that
actually modifies the size attribute of the affected
resources. Meanwhile, the OST also replies to Client2, acknowledging
the LDLM_CANCEL.

.Setattr RPCs for Changing the Resource's Size Attribute
image::truncate_rpcs.png["setattr RPCs for the size attribute",height=250]

//////////////////////////////////////////////////////////////////////
The truncate_rpcs.png diagram resembles this text art:

Step  Client1         MDT             OST                 Client2
      -------         -------         -------             -------
3     LDLM_ENQUEUE----------------->
4                                     LDLM_BL_CALLBACK---->
5                                     <----LDLM_BL_CALLBACK
6     <-----------------LDLM_ENQUEUE
7                                     <--------LDLM_CANCEL
8     <-----------LDLM_CP_CALLBACK
9     LDLM_CP_CALLBACK----------->
10    OST_PUNCH-------------------->
11                                    LDLM_CANCEL------>
12    <--------------------OST_PUNCH
//////////////////////////////////////////////////////////////////////
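
The exchange between Client1, the OST, and Client2 can be written
down as data and checked for completeness. The following sketch
encodes steps 3 through 12 as described in this section and verifies
that every request is eventually answered by a reply travelling in
the opposite direction. The tuple layout and the 'unmatched_requests'
helper are hypothetical and exist only for illustration.

```python
# (step, sender, receiver, opcode, kind) for steps 3-12 of the truncate case.
STEPS = [
    (3, "Client1", "OST", "LDLM_ENQUEUE", "request"),
    (4, "OST", "Client2", "LDLM_BL_CALLBACK", "request"),
    (5, "Client2", "OST", "LDLM_BL_CALLBACK", "reply"),
    (6, "OST", "Client1", "LDLM_ENQUEUE", "reply"),
    (7, "Client2", "OST", "LDLM_CANCEL", "request"),
    (8, "OST", "Client1", "LDLM_CP_CALLBACK", "request"),
    (9, "Client1", "OST", "LDLM_CP_CALLBACK", "reply"),
    (10, "Client1", "OST", "OST_PUNCH", "request"),
    (11, "OST", "Client2", "LDLM_CANCEL", "reply"),
    (12, "OST", "Client1", "OST_PUNCH", "reply"),
]


def unmatched_requests(steps):
    """Return requests that lack a later reply going the opposite way."""
    missing = []
    for step, src, dst, op, kind in steps:
        if kind != "request":
            continue
        answered = any(
            s > step and (a, b, o, k) == (dst, src, op, "reply")
            for s, a, b, o, k in steps
        )
        if not answered:
            missing.append((step, op))
    return missing


print(unmatched_requests(STEPS))
```

Note that the LDLM_CANCEL request (step 7) is not answered until step
11, after the OST has notified Client1 in step 8.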

*1 - The client issues an MDS_REINT with the REINT_SETATTR
sub-operation.*

The MDS_REINT request RPC closely resembles the one described above,
but in this case the 'setattr' wants to modify the size attribute on the
resource. The 'mdt_rec_setattr' again identifies the resource with the
'sa_fid' field, and the 'sa_valid' field is set to 0x2002168:

.Flags for 'sa_valid' field of 'struct mdt_rec_setattr'
|====
| MDS_ATTR_SIZE | size attribute
| MDS_ATTR_MTIME | mtime attribute
| MDS_ATTR_CTIME | ctime attribute
| MDS_ATTR_MTIME_SET | mtime being set
| MDS_ATTR_CTIME_SET | ctime being set
|====

The size and time values are put in the corresponding fields of the
'mdt_rec_setattr', and the other attribute fields will be ignored.
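
It can be useful to check whether the listed flags account for every
bit of the quoted 'sa_valid'. The sketch below is illustrative only:
the bit values are assumed to match the MDS_ATTR_* definitions in
'lustre_idl.h', and 'residual_bits' is a hypothetical helper.

```python
# Flags listed in the table, with assumed values (verify against lustre_idl.h).
LISTED = {
    "MDS_ATTR_SIZE": 0x0008,
    "MDS_ATTR_MTIME": 0x0020,
    "MDS_ATTR_CTIME": 0x0040,
    "MDS_ATTR_MTIME_SET": 0x0100,
    "MDS_ATTR_CTIME_SET": 0x2000,
}

SA_VALID = 0x2002168  # value quoted in the text


def residual_bits(value, listed):
    """Return the bits of value that none of the listed flags account for."""
    covered = 0
    for bit in listed.values():
        covered |= bit
    return value & ~covered


print(hex(residual_bits(SA_VALID, LISTED)))
```

Under these assumptions the five listed flags cover 0x2168 and one
bit, 0x2000000, is left over. That bit may correspond to
MDS_ATTR_OVERRIDE in some Lustre releases, but that identification is
an assumption and should be confirmed against the headers.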

There is again an 'ldlm_request' structure in the RPC, but in this
case it is empty (all fields set to zero), so no early lock
cancellation takes place.

*2 - The MDS_REINT reply acknowledges the updated size and times.*

The MDS_REINT reply is identical to the previous cases in every way,
including which valid attributes it echoes back.

*3 - The client asks the OST for a write lock of type LDLM_EXTENT.*

The 'ldlm_request' asks for a write lock with the lock descriptor
resource's type set to LDLM_EXTENT, the policy data covering the whole
file, and the lock handle set to identify this request. The rest of
the lock request is blank (zeroes). The RPC resembles the simplest
request form in <<ldlm-enqueue-rpc>>.

*4 - The OST contacts Client2 to ask for the return of the lock.*

The LDLM_BL_CALLBACK is initiated by the OST and sent to Client2,
identifying the resource in question. The content of the 'ldlm_request'
is otherwise identical to the one sent from Client1 to the OST
('l_req_mode' == LCK_PW, 'l_granted_mode' == LCK_MINMODE).

.LDLM_BL_CALLBACK Request Packet Structure
image::ldlm-bl-callback-request.png["LDLM_BL_CALLBACK Request Packet Structure", height=50]

//////////////////////////////////////////////////////////////////////
The ldlm-bl-callback-request.png diagram resembles this text
art:

--request---------------------
| ptlrpc_body | ldlm_request |
------------------------------
//////////////////////////////////////////////////////////////////////

*5 - Client2 acknowledges the request.*

The LDLM_BL_CALLBACK reply is an "empty" RPC in that it only has the
LDLM_BL_CALLBACK opcode and no other content beyond the
'ptlrpc_body'.

.LDLM_BL_CALLBACK Reply Packet Structure
image::ldlm-bl-callback-reply.png["LDLM_BL_CALLBACK Reply Packet Structure",height=50]

//////////////////////////////////////////////////////////////////////
The ldlm-bl-callback-reply.png diagram resembles this text
art:
//////////////////////////////////////////////////////////////////////

Its effect is to notify the OST that the callback was received. The
lock itself is returned later, with the LDLM_CANCEL in step 7.

*6 - The OST replies acknowledging the lock request.*

The 'ldlm_reply' lock descriptor acknowledges the request for an
extent write lock without granting it ('l_req_mode' == LCK_PW,
'l_granted_mode' == LCK_MINMODE, 'lock_flags' == 0x2 ==
LDLM_FL_BLOCK_GRANTED). The lock is not granted because it is
blocked. Additional attribute data accompanies the LDLM_ENQUEUE reply
to tell the client about the resource attributes on the OST.
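
The difference between an acknowledged lock request and a granted one
can be expressed as a pair of simple predicates. The sketch below is
a simplified, hypothetical model ('LockReply' is not a Lustre
structure). The numeric values for LCK_MINMODE and LCK_PW are assumed
from Lustre's lock mode enumeration, the flag value 0x2 is taken from
the text, and the zero 'lock_flags' used for the granted case is an
assumption made for illustration.

```python
from dataclasses import dataclass

# Assumed values; verify against the Lustre headers for your release.
LCK_MINMODE = 0
LCK_PW = 2
LDLM_FL_BLOCK_GRANTED = 0x2


@dataclass
class LockReply:
    """Hypothetical, simplified view of the fields discussed in the text."""
    l_req_mode: int
    l_granted_mode: int
    lock_flags: int

    def is_granted(self):
        # The lock is granted once the granted mode matches the request.
        return self.l_granted_mode == self.l_req_mode

    def is_blocked(self):
        return bool(self.lock_flags & LDLM_FL_BLOCK_GRANTED)


# Step 6: request acknowledged but blocked by Client2's lock.
step6 = LockReply(l_req_mode=LCK_PW, l_granted_mode=LCK_MINMODE, lock_flags=0x2)
# Step 8: completion callback; lock_flags=0 is assumed for illustration.
step8 = LockReply(l_req_mode=LCK_PW, l_granted_mode=LCK_PW, lock_flags=0x0)
print(step6.is_granted(), step8.is_granted())
```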

.LDLM_ENQUEUE Extent LVB Reply Packet Structure
image::ldlm-enqueue-extent-lvb-reply.png["LDLM_ENQUEUE Extent LVB reply Packet Structure",height=50]

//////////////////////////////////////////////////////////////////////
The ldlm-enqueue-extent-lvb-reply.png diagram resembles this text
art:

--extent lvb reply--------------------
| ptlrpc_body | ldlm_reply | ost_lvb |
--------------------------------------
//////////////////////////////////////////////////////////////////////

*7 - Client2 cancels its lock.*

Having received an LDLM_BL_CALLBACK, Client2 must finish up with its
lock. Once it does, it sends an LDLM_CANCEL request to the OST to
signal that it is done.

.LDLM_CANCEL Request Packet Structure
image::ldlm-cancel-request.png["LDLM_CANCEL Request Packet Structure",height=50]

//////////////////////////////////////////////////////////////////////
The ldlm-cancel-request.png diagram resembles this text art:

--request---------------------
| ptlrpc_body | ldlm_request |
------------------------------
//////////////////////////////////////////////////////////////////////

The 'ldlm_request' indicates which lock is being canceled in its
(first) 'lock_handle' field. The OST then looks for anyone else
waiting on that lock, which it finds is Client1. It waits to reply to
Client2 with an LDLM_CANCEL reply until after it has notified Client1.

*8 - The OST notifies Client1 that it now has the lock.*

The 'ldlm_request' structure now has the granted mode set to protected
write. The OST also sends along any updated attributes, as would be
needed if, for example, Client2 had flushed its dirty write cache.

.LDLM_CP_CALLBACK Request Packet Structure
image::ldlm-cp-callback-request.png["LDLM_CP_CALLBACK Request Packet Structure",height=50]

//////////////////////////////////////////////////////////////////////
The ldlm-cp-callback-request.png diagram resembles this text
art:

--request-------------------------------
| ptlrpc_body | ldlm_request | ost_lvb |
----------------------------------------
//////////////////////////////////////////////////////////////////////

*9 - Client1 acknowledges the lock update.*

The reply is "empty" in this case as well. The opcode in the
'ptlrpc_body' is sufficient to inform the OST that Client1 received
the notification.

.LDLM_CP_CALLBACK Reply Packet Structure
image::ldlm-cp-callback-reply.png["LDLM_CP_CALLBACK Reply Packet Structure",height=50]

//////////////////////////////////////////////////////////////////////
The ldlm-cp-callback-reply.png diagram resembles this text
art:
//////////////////////////////////////////////////////////////////////

*10 - Client1 issues an OST_PUNCH request.*

As with the OST_SETATTR RPC there is an 'ost_body' structure.

.OST_PUNCH Request Packet Structure
image::ost-punch-request.png["OST_PUNCH Request Packet Structure",height=50]

//////////////////////////////////////////////////////////////////////
The ost-punch-request.png diagram resembles this text art:

--request-----------------
| ptlrpc_body | ost_body |
--------------------------
//////////////////////////////////////////////////////////////////////

In this case the 'o_valid' field is 0x30403d:

.Flags for 'o_valid' field of 'struct ost_body'
|====
| OBD_MD_FLID | OST ID
| OBD_MD_FLMTIME | mtime attribute
| OBD_MD_FLCTIME | ctime attribute
| OBD_MD_FLSIZE | size attribute
| OBD_MD_FLBLOCKS | blocks attribute
| OBD_MD_FLGENER | generation
| OBD_MD_FLCKSUM | checksum
| OBD_MD_FLQOS | quality of service
|====
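
The eight flags in the table can be folded into a single mask to
confirm that they reproduce the quoted 'o_valid'. This is a sketch
under the assumption that the bit values below match the OBD_MD_*
definitions in 'lustre_idl.h'; 'PUNCH_FLAGS' is a hypothetical name.

```python
from functools import reduce
from operator import or_

# Flags from the table, with assumed values (verify against lustre_idl.h).
PUNCH_FLAGS = {
    "OBD_MD_FLID": 0x000001,
    "OBD_MD_FLMTIME": 0x000004,
    "OBD_MD_FLCTIME": 0x000008,
    "OBD_MD_FLSIZE": 0x000010,
    "OBD_MD_FLBLOCKS": 0x000020,
    "OBD_MD_FLGENER": 0x004000,
    "OBD_MD_FLCKSUM": 0x100000,
    "OBD_MD_FLQOS": 0x200000,
}

# OR all flag bits together, starting from an empty mask.
o_valid = reduce(or_, PUNCH_FLAGS.values(), 0)
print(hex(o_valid))
```

Unlike the OST_SETATTR request, this mask omits the atime flag and
includes the size and blocks flags, which fits an operation that
changes the object's length.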

*11 - The OST acknowledges the LDLM_CANCEL (step 7) from Client2.*

The OST finishes up with the lock cancel (after having notified
Client1) by replying to Client2. This happens asynchronously with
respect to the arrival of the OST_PUNCH request. In <<truncate-rpcs>>
it is shown occurring after the OST_PUNCH, but that ordering is not
required.

.LDLM_CANCEL Reply Packet Structure
image::ldlm-cancel-reply.png["LDLM_CANCEL Reply Packet Structure",height=50]

//////////////////////////////////////////////////////////////////////
The ldlm-cancel-reply.png diagram resembles this text art:
//////////////////////////////////////////////////////////////////////

The LDLM_CANCEL reply is a so-called "empty" RPC. Its only purpose is
to acknowledge receipt of the LDLM_CANCEL request.

*12 - The OST sends an OST_PUNCH reply.*

The OST_PUNCH reply also resembles the OST_SETATTR reply:

.OST_PUNCH Reply Packet Structure
image::ost-punch-reply.png["OST_PUNCH Reply Packet Structure",height=50]

//////////////////////////////////////////////////////////////////////
The ost-punch-reply.png diagram resembles this text art:

--reply-------------------
| ptlrpc_body | ost_body |
--------------------------
//////////////////////////////////////////////////////////////////////

The 'o_valid' field is 0x1, so only the 'o_id' field is
interpreted. It just acknowledges that the requested change has been
made.