The 'setattr' VFS method is used to modify the attributes associated
with a resource (it is an inode operation). The attributes are the
same ones returned by a 'stat' operation: mode, uid, gid, size,
atime, ctime, and mtime.

Changing the File Mode Attribute
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

If only the file 'mode' is being modified (a 'chmod' command, for
instance) then the interaction is relatively simple, as shown in
the following figure.

.Setattr RPCs for Changing the Resource's Mode
image::chmod_rpcs.png["setattr RPCs for changing mode",height=100]

//////////////////////////////////////////////////////////////////////
The chmod_rpcs.png diagram resembles this text art:

------- ------- -------
//////////////////////////////////////////////////////////////////////

1 - Client1 issues an MDS_REINT with the REINT_SETATTR sub-operation.

In addition to the 'ptlrpc_body' (RPC message header), the MDS_REINT
request RPC from the client has the REINT structure 'mdt_rec_setattr' and a
lock request 'ldlm_request'. For a detailed discussion of all the fields in
'mdt_rec_setattr' and 'ldlm_request', refer to <<mdt-rec-setattr>>
and <<struct-ldlm-request>>.

.MDS_REINT:REINT_SETATTR Request Packet Structure
image::mds-reint-setattr-request.png["MDS_REINT:REINT_SETATTR Request Packet Structure",height=50]

//////////////////////////////////////////////////////////////////////
The mds-reint-setattr-request.png diagram resembles this text art:

--REINT_SETATTR-request-------------------------
| ptlrpc_body | mdt_rec_setattr | ldlm_request |
------------------------------------------------
//////////////////////////////////////////////////////////////////////

In this case the 'setattr' wants to set the mode attribute on
the resource. The 'mdt_rec_setattr' identifies the resource with the
'sa_fid' field, and the 'sa_valid' field is set to 0x2041:

.Flags for 'sa_valid' field of 'struct mdt_rec_setattr'
|====
| MDS_ATTR_MODE      | mode attribute
| MDS_ATTR_CTIME     | ctime attribute
| MDS_ATTR_CTIME_SET | ctime is being set
|====

So the 'ctime' is also updated on the MDT. The mode and time values
are put in the corresponding fields of the 'mdt_rec_setattr', and the
other attribute fields will be ignored.

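The way 0x2041 breaks down into the three flags above can be checked with a short Python sketch. The numeric bit values below are assumptions based on the usual 'MDS_ATTR_*' definitions and should be verified against the 'lustre_idl.h' of the release in question. The helper name 'decode_sa_valid' is hypothetical.

```python
# Assumed MDS_ATTR_* bit values (not taken from this document);
# verify against lustre_idl.h before relying on them.
MDS_ATTR_FLAGS = {
    0x0001: "MDS_ATTR_MODE",
    0x0002: "MDS_ATTR_UID",
    0x0004: "MDS_ATTR_GID",
    0x0008: "MDS_ATTR_SIZE",
    0x0010: "MDS_ATTR_ATIME",
    0x0020: "MDS_ATTR_MTIME",
    0x0040: "MDS_ATTR_CTIME",
    0x0080: "MDS_ATTR_ATIME_SET",
    0x0100: "MDS_ATTR_MTIME_SET",
    0x2000: "MDS_ATTR_CTIME_SET",
}


def decode_sa_valid(sa_valid):
    """Return (names of the known flags that are set, leftover bits)."""
    names = []
    known = 0
    for bit, name in sorted(MDS_ATTR_FLAGS.items()):
        if sa_valid & bit:
            names.append(name)
            known |= bit
    return names, sa_valid & ~known


names, leftover = decode_sa_valid(0x2041)
print(names, hex(leftover))
```

Under these assumed values the chmod mask decodes to exactly the three flags listed in the table, with no leftover bits.
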
The 'ldlm_request' structure encompasses an early lock cancellation
(see <<early-lock-cancellation>>) on the lock that the client had
previously acquired for the target resource. The lock handle
identifies this lock. Only 'lock_count' and 'lock_handle' are used, and
the rest of the 'ldlm_request' is cleared, i.e. all other fields are set
to zero.

2 - The MDS_REINT reply acknowledges the updated attributes.

In addition to the 'ptlrpc_body' (RPC message header), the MDS_REINT
reply RPC to the client has the 'mdt_body' structure. For a detailed
discussion of the fields in the 'mdt_body' refer to <<struct-mdt-body>>.

.MDS_REINT:REINT_SETATTR Reply Packet Structure
image::mds-reint-setattr-reply.png["MDS_REINT:REINT_SETATTR Reply Packet Structure",height=50]

//////////////////////////////////////////////////////////////////////
The mds-reint-setattr-reply.png diagram resembles this text art:

--REINT_SETATTR-reply-----
| ptlrpc_body | mdt_body |
--------------------------
//////////////////////////////////////////////////////////////////////

The reply from the MDT after the setattr operation has these valid
flags set:

.Flags for 'mbo_valid' field of 'struct mdt_body'
|====
| OBD_MD_FLMTIME  | mtime attribute
| OBD_MD_FLSIZE   | size attribute
| OBD_MD_FLBLOCKS | blocks attribute
| OBD_MD_FLBLKSZ  | block size attribute
| OBD_MD_FLTYPE   | type attribute
| OBD_MD_FLNLINK  | number of links attribute
| OBD_MD_FLRDEV   | device attribute
|====

So the client is updated with any other information the MDT has after
the attributes were set at the client's request.

Changing the File Time Attributes
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The RPCs sent for a 'setattr' depend on which values are being set.
If the time values are being set (as in a "touch" command) then there
are RPCs in addition to the MDS_REINT with the REINT_SETATTR
sub-operation, which updates the time values on the MDT. That operation
is followed by an OST_SETATTR that sets the time values on the OST (or
OSTs if there are several). But in order to know which OSTs to contact,
the client must first get the layout of the resource. Then it can send
the OST_SETATTR RPC to the appropriate OSTs and update the time
attributes.

.Setattr RPCs for Changing the Resource's Time Attributes
image::touch_rpcs.png["setattr RPCs for the time attributes",height=200]

//////////////////////////////////////////////////////////////////////
The touch_rpcs.png diagram resembles this text art:

------- ------- -------
4 <-------LDLM_ENQUEUE
5 OST_SETATTR------------------>
6 <--------------------OST_SETATTR
//////////////////////////////////////////////////////////////////////

1 - The client issues an MDS_REINT with the REINT_SETATTR sub-operation.

The MDS_REINT request RPC closely resembles the one described above,
but in this case the 'setattr' wants to set the time attributes on the
resource. The 'mdt_rec_setattr' again identifies the resource with the
'sa_fid' field, and the 'sa_valid' field is set to 0x21f0:

.Flags for 'sa_valid' field of 'struct mdt_rec_setattr'
|====
| MDS_ATTR_ATIME     | atime attribute
| MDS_ATTR_MTIME     | mtime attribute
| MDS_ATTR_CTIME     | ctime attribute
| MDS_ATTR_ATIME_SET | atime is being set
| MDS_ATTR_MTIME_SET | mtime is being set
| MDS_ATTR_CTIME_SET | ctime is being set
|====

The time values are put in the corresponding fields of the
'mdt_rec_setattr', and the other attribute fields will be ignored.

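As a cross-check, the 0x21f0 mask can be rebuilt by OR-ing the six flags from the table. This is only a sketch: the numeric values are assumed from the usual 'MDS_ATTR_*' definitions in 'lustre_idl.h', and the class and function names are hypothetical.

```python
from enum import IntFlag


class MdsAttr(IntFlag):
    # Assumed bit values; confirm against lustre_idl.h.
    ATIME = 0x0010
    MTIME = 0x0020
    CTIME = 0x0040
    ATIME_SET = 0x0080
    MTIME_SET = 0x0100
    CTIME_SET = 0x2000


def touch_sa_valid():
    """Combine the flags listed for a time-only setattr into one mask."""
    return (
        MdsAttr.ATIME
        | MdsAttr.MTIME
        | MdsAttr.CTIME
        | MdsAttr.ATIME_SET
        | MdsAttr.MTIME_SET
        | MdsAttr.CTIME_SET
    )


print(hex(int(touch_sa_valid())))
```

With the assumed values the combined mask equals the 'sa_valid' value quoted in the text.
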
There is again an early lock cancellation, since the client knows it
no longer needs to have a lock on the MDT resource attributes.

2 - The MDS_REINT reply acknowledges the updated times.

The MDS_REINT reply is identical to the previous case in every way,
including which valid attributes it echoes back.

3 - The client asks for an intent lock on the layout data for the
resource.

Before communicating with the OSTs the client needs to know which ones
are involved with this resource, and before it can ask for that
'layout' information it must acquire a 'layout lock'. The
LDLM_ENQUEUE RPC in this case has (in addition to the 'ptlrpc_body'
structure) an 'ldlm_request', an 'ldlm_intent', and a 'layout_intent'.

.LDLM_ENQUEUE Intent:Layout Request Packet Structure
image::ldlm-enqueue-intent-layout-request.png["LDLM_ENQUEUE Intent:Layout request Packet Structure",height=50]

//////////////////////////////////////////////////////////////////////
The ldlm-enqueue-intent-layout-request.png diagram resembles this text
art:

--intent:layout request------------------------------------
| ptlrpc_body | ldlm_request |ldlm_intent | layout_intent |
-----------------------------------------------------------
//////////////////////////////////////////////////////////////////////

The 'ldlm_request' asks for a read lock on the resource and has its
intent flag set. In the 'ldlm_intent' the intent opcode is 0x800
(IT_LAYOUT). The 'layout_intent' has the 'li_opc' value 0
(LAYOUT_INTENT_ACCESS).

4 - The MDS replies with a read lock on the layout.

The LDLM_ENQUEUE reply that the MDS sends back grants the read lock on
the layout and provides a Lock Value Block (LVB) describing the
layout of the resource. That layout is from the extended attribute
'trusted.lov' and has the structure 'lov_mds_md'.

.LDLM_ENQUEUE Intent:Layout Reply Packet Structure
image::ldlm-enqueue-intent-layout-reply.png["LDLM_ENQUEUE Intent:Layout reply Packet Structure",height=50]

//////////////////////////////////////////////////////////////////////
The ldlm-enqueue-intent-layout-reply.png diagram resembles this text
art:

--intent:layout reply--------------------
| ptlrpc_body | ldlm_reply | lov_mds_md |
-----------------------------------------
//////////////////////////////////////////////////////////////////////

5 - The client issues an OST_SETATTR with the updated times, which are
maintained on the OST.

At last the client can send an update to the OST. The OST_SETATTR RPC
has an 'ost_body' structure.

.OST_SETATTR Request Packet Structure
image::ost-setattr-request.png["OST_SETATTR Request Packet Structure",height=50]

//////////////////////////////////////////////////////////////////////
The ost-setattr-request.png diagram resembles this text art:

--request-----------------
| ptlrpc_body | ost_body |
--------------------------
//////////////////////////////////////////////////////////////////////

The 'ost_body' structure is documented in <<struct-ost-body>>. In
this case the 'o_valid' field is 0x300400f, so the valid fields are:

.Flags for 'o_valid' field of 'struct ost_body'
|====
| OBD_MD_FLATIME | atime attribute
| OBD_MD_FLMTIME | mtime attribute
| OBD_MD_FLCTIME | ctime attribute
| OBD_MD_FLGENER | generation
| OBD_MD_FLGROUP | group
|====

6 - The OST acknowledges the update.

The reply RPC for the OST_SETATTR operation has the same form as the
request.

.OST_SETATTR Reply Packet Structure
image::ost-setattr-reply.png["OST_SETATTR Reply Packet Structure",height=50]

//////////////////////////////////////////////////////////////////////
The ost-setattr-reply.png diagram resembles this text art:

--reply-------------------
| ptlrpc_body | ost_body |
--------------------------
//////////////////////////////////////////////////////////////////////

The OST_SETATTR reply acknowledges the update and sends back an
'o_valid' of 0x10007bf, which indicates the fields:

.Flags for 'o_valid' field of 'struct ost_body'
|====
| OBD_MD_FLID     | OST ID
| OBD_MD_FLATIME  | atime attribute
| OBD_MD_FLMTIME  | mtime attribute
| OBD_MD_FLCTIME  | ctime attribute
| OBD_MD_FLSIZE   | size attribute
| OBD_MD_FLBLOCKS | blocks attribute
| OBD_MD_FLMODE   | mode attribute
| OBD_MD_FLTYPE   | type attribute
| OBD_MD_FLUID    | UID attribute
| OBD_MD_FLGID    | GID attribute
| OBD_MD_FLGROUP  | group
|====

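The reply mask 0x10007bf can be walked bit by bit to recover the list above. The sketch below assumes the usual 'OBD_MD_*' bit values from 'lustre_idl.h' (they are not given in this document) and uses a hypothetical helper name, 'decode_o_valid'.

```python
# Assumed OBD_MD_* bit values; verify against lustre_idl.h.
OBD_MD_FLAGS = {
    0x00000001: "OBD_MD_FLID",
    0x00000002: "OBD_MD_FLATIME",
    0x00000004: "OBD_MD_FLMTIME",
    0x00000008: "OBD_MD_FLCTIME",
    0x00000010: "OBD_MD_FLSIZE",
    0x00000020: "OBD_MD_FLBLOCKS",
    0x00000080: "OBD_MD_FLMODE",
    0x00000100: "OBD_MD_FLTYPE",
    0x00000200: "OBD_MD_FLUID",
    0x00000400: "OBD_MD_FLGID",
    0x00004000: "OBD_MD_FLGENER",
    0x01000000: "OBD_MD_FLGROUP",
}


def decode_o_valid(o_valid):
    """Walk the mask from the lowest bit up.

    Returns (names of set bits found in OBD_MD_FLAGS,
             mask of set bits that have no entry in OBD_MD_FLAGS).
    """
    names = []
    unnamed = 0
    bit = 1
    while bit <= o_valid:
        if o_valid & bit:
            if bit in OBD_MD_FLAGS:
                names.append(OBD_MD_FLAGS[bit])
            else:
                unnamed |= bit
        bit <<= 1
    return names, unnamed


reply_names, reply_unnamed = decode_o_valid(0x10007BF)
for name in reply_names:
    print(name)
```

With these assumed values the decoded names come out in the same order as the table rows, and no set bit is left unnamed.
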
Changing the Size Attribute
^^^^^^^^^^^^^^^^^^^^^^^^^^^

If the size is being set (as in a "truncate" command) then the client
(Client1) will issue an LDLM_ENQUEUE to the OST for a write lock on
the extent attributes of the resource. If another client (Client2) has
a lock on the resource, then before the OST can grant the lock to
Client1 it has to interact with Client2. The OST sends an
LDLM_BL_CALLBACK request to Client2 asking Client2 to finish up with
the lock it has. Client2 replies with a simple acknowledgment. When
Client2 is no longer using the lock it sends an LDLM_CANCEL RPC to
the OST. At that point the OST grants the original request, sending an
LDLM_CP_CALLBACK request to Client1 to notify it. With that taken care
of, Client1 is finally able to issue the OST_PUNCH request that
actually modifies the size attribute of the affected
resources. Meanwhile, the OST also replies to Client2, acknowledging
the LDLM_CANCEL.

311 image::truncate_rpcs.png["setattr RPCs for the size attribute",height=250]
313 //////////////////////////////////////////////////////////////////////
314 The truncate_rpcs.png diagram resembles this text art:
317 Step Client1 MDT OST Client2
318 ------- ------- ------- -------
321 3 LDLM_ENQUEUE----------------->
322 4 LDLM_BL_CALLBACK---->
323 5 <----LDLM_BL_CALLBACK
324 6 <-----------------LDLM_ENQUEUE
325 7 <--------LDLM_CANCEL
326 8 <-----------LDLM_CP_CALLBACK
327 9 LDLM_CP_CALLBACK----------->
328 10 OST_PUNCH-------------------->
329 11 LDLM_CANCEL------>
330 12 <--------------------OST_PUNCH
331 //////////////////////////////////////////////////////////////////////
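The twelve-step exchange for a truncate can also be written down as an ordered trace, which makes it easy to check that every request is eventually answered and that the lock hand-off precedes the OST_PUNCH. This is an illustrative model only: the 'Msg' tuple and both helper functions are hypothetical names, and the trace is transcribed from the step descriptions in this section.

```python
from collections import namedtuple

Msg = namedtuple("Msg", "step sender receiver opcode kind")

# Transcribed from the numbered steps of the truncate exchange.
TRUNCATE_TRACE = [
    Msg(1, "Client1", "MDT", "MDS_REINT", "request"),
    Msg(2, "MDT", "Client1", "MDS_REINT", "reply"),
    Msg(3, "Client1", "OST", "LDLM_ENQUEUE", "request"),
    Msg(4, "OST", "Client2", "LDLM_BL_CALLBACK", "request"),
    Msg(5, "Client2", "OST", "LDLM_BL_CALLBACK", "reply"),
    Msg(6, "OST", "Client1", "LDLM_ENQUEUE", "reply"),
    Msg(7, "Client2", "OST", "LDLM_CANCEL", "request"),
    Msg(8, "OST", "Client1", "LDLM_CP_CALLBACK", "request"),
    Msg(9, "Client1", "OST", "LDLM_CP_CALLBACK", "reply"),
    Msg(10, "Client1", "OST", "OST_PUNCH", "request"),
    Msg(11, "OST", "Client2", "LDLM_CANCEL", "reply"),
    Msg(12, "OST", "Client1", "OST_PUNCH", "reply"),
]


def unanswered_requests(trace):
    """Return requests with no later reply of the same opcode
    travelling in the opposite direction."""
    missing = []
    for i, msg in enumerate(trace):
        if msg.kind != "request":
            continue
        answered = any(
            later.kind == "reply"
            and later.opcode == msg.opcode
            and later.sender == msg.receiver
            and later.receiver == msg.sender
            for later in trace[i + 1:]
        )
        if not answered:
            missing.append(msg)
    return missing


def step_of(trace, opcode, kind):
    """Return the step number of the first message with this opcode and kind."""
    return next(m.step for m in trace if m.opcode == opcode and m.kind == kind)


print(unanswered_requests(TRUNCATE_TRACE))
```

The ordering checks that matter here are that Client2's LDLM_CANCEL request precedes the LDLM_CP_CALLBACK to Client1, and that the LDLM_CP_CALLBACK precedes the OST_PUNCH request. The position of the LDLM_CANCEL reply relative to the OST_PUNCH is not fixed.
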
1 - The client issues an MDS_REINT with the REINT_SETATTR sub-operation.

The MDS_REINT request RPC closely resembles the one described above,
but in this case the 'setattr' wants to modify the size attribute on the
resource. The 'mdt_rec_setattr' again identifies the resource with the
'sa_fid' field, and the 'sa_valid' field is set to 0x2002168:

.Flags for 'sa_valid' field of 'struct mdt_rec_setattr'
|====
| MDS_ATTR_SIZE      | size attribute
| MDS_ATTR_MTIME     | mtime attribute
| MDS_ATTR_CTIME     | ctime attribute
| MDS_ATTR_MTIME_SET | mtime being set
| MDS_ATTR_CTIME_SET | ctime being set
|====

The size and time values are put in the corresponding fields of the
'mdt_rec_setattr', and the other attribute fields will be ignored.

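A quick way to see how much of the 0x2002168 mask the five listed flags account for is to split the value into covered and remaining bits. The bit values below are assumptions taken from the usual 'MDS_ATTR_*' definitions in 'lustre_idl.h', and 'split_mask' is a hypothetical helper.

```python
# Assumed bit values for the five flags in the table; verify against
# lustre_idl.h for the release being examined.
SIZE_TABLE_FLAGS = {
    "MDS_ATTR_SIZE": 0x0008,
    "MDS_ATTR_MTIME": 0x0020,
    "MDS_ATTR_CTIME": 0x0040,
    "MDS_ATTR_MTIME_SET": 0x0100,
    "MDS_ATTR_CTIME_SET": 0x2000,
}


def split_mask(value, flags):
    """Return (bits of value covered by flags, bits of value not covered)."""
    covered = 0
    for bit in flags.values():
        covered |= value & bit
    return covered, value & ~covered


covered, remainder = split_mask(0x2002168, SIZE_TABLE_FLAGS)
print(hex(covered), hex(remainder))
```

Under these assumptions the five flags cover 0x2168, and the bit 0x2000000 is not accounted for by any flag in the table, so it is worth confirming the quoted value against a captured RPC.
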
There is again an 'ldlm_request' structure in the RPC, but in this
case it is empty (all fields set to zero), so no early lock
cancellation is requested.

2 - The MDS_REINT reply acknowledges the updated times.

The MDS_REINT reply is identical to the previous cases in every way,
including which valid attributes it echoes back.

3 - The client asks the OST for a write lock of type LDLM_EXTENT.

The 'ldlm_request' asks for a write lock with the lock descriptor
resource's type set to LDLM_EXTENT, the policy data covering the whole
file, and the lock handle set to identify this request. The rest of
the lock request is blank (zeroes). The RPC resembles the simplest
request form in <<ldlm-enqueue-rpc>>.

4 - The OST contacts Client2 to ask for the return of the lock.

The LDLM_BL_CALLBACK is initiated by the OST and sent to Client2,
identifying the resource in question. The content of the 'ldlm_request'
is otherwise identical to the one sent from Client1 to the OST
('l_req_mode' == LCK_PW, 'l_granted_mode' == LCK_MINMODE).

.LDLM_BL_CALLBACK Request Packet Structure
image::ldlm-bl-callback-request.png["LDLM_BL_CALLBACK Request Packet Structure", height=50]

//////////////////////////////////////////////////////////////////////
The ldlm-bl-callback-request.png diagram resembles this text
art:

--request---------------------
| ptlrpc_body | ldlm_request |
------------------------------
//////////////////////////////////////////////////////////////////////

5 - Client2 acknowledges the request and returns the lock.

The LDLM_BL_CALLBACK reply is an "empty" RPC in that it only has the
LDLM_BL_CALLBACK opcode and no other content beyond the
'ptlrpc_body'.

.LDLM_BL_CALLBACK Reply Packet Structure
image::ldlm-bl-callback-reply.png["LDLM_BL_CALLBACK Reply Packet Structure",height=50]

//////////////////////////////////////////////////////////////////////
The ldlm-bl-callback-reply.png diagram resembles this text
art:
//////////////////////////////////////////////////////////////////////

Its effect is to notify the OST that the lock has been returned.

6 - The OST replies acknowledging the lock request.

The lock descriptor in the 'ldlm_reply' acknowledges the request for an
extent write lock without granting it ('l_req_mode' == LCK_PW,
'l_granted_mode' == LCK_MINMODE, 'lock_flags' == 0x2 ==
LDLM_FL_BLOCK_GRANTED). The lock is not granted because it is
blocked. Additional attribute data accompanies the LDLM_ENQUEUE reply
to tell the client about the resource attributes on the OST.

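The relationship between the requested and granted modes can be illustrated with a small sketch. It follows the reading given in the text (a lock counts as granted once the granted mode equals the requested mode) and is not how the Lustre client itself evaluates replies. The numeric lock-mode values are assumptions to be checked against the LDLM headers, and 'LockDesc' and 'lock_state' are hypothetical names.

```python
from dataclasses import dataclass

# Assumed numeric values for the lock modes; verify against the LDLM headers.
LCK_MINMODE = 0
LCK_PW = 2

# Flag value as quoted in the text.
LDLM_FL_BLOCK_GRANTED = 0x2


@dataclass
class LockDesc:
    l_req_mode: int
    l_granted_mode: int


def lock_state(desc, lock_flags):
    """Classify a lock descriptor following the description in the text."""
    if desc.l_granted_mode == desc.l_req_mode:
        return "granted"
    if lock_flags & LDLM_FL_BLOCK_GRANTED:
        return "blocked"
    return "not granted"


# Step 6: the enqueue reply acknowledges the request without granting it.
print(lock_state(LockDesc(LCK_PW, LCK_MINMODE), 0x2))
# Step 8: the completion callback carries the granted mode.
print(lock_state(LockDesc(LCK_PW, LCK_PW), 0x0))
```

The first call models the enqueue reply in this step; the second models the later LDLM_CP_CALLBACK, where the granted mode has been set to protected write.
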
.LDLM_ENQUEUE Extent LVB Reply Packet Structure
image::ldlm-enqueue-extent-lvb-reply.png["LDLM_ENQUEUE Extent LVB reply Packet Structure",height=50]

//////////////////////////////////////////////////////////////////////
The ldlm-enqueue-extent-lvb-reply.png diagram resembles this text
art:

--extent lvb reply--------------------
| ptlrpc_body | ldlm_reply | ost_lvb |
--------------------------------------
//////////////////////////////////////////////////////////////////////

7 - Client2 cancels its lock.

Having received an LDLM_BL_CALLBACK, Client2 must finish up with its
lock. Once it does, it sends an LDLM_CANCEL request to the OST to
signal that it is done.

.LDLM_CANCEL Request Packet Structure
image::ldlm-cancel-request.png["LDLM_CANCEL Request Packet Structure",height=50]

//////////////////////////////////////////////////////////////////////
The ldlm-cancel-request.png diagram resembles this text art:

--request---------------------
| ptlrpc_body | ldlm_request |
------------------------------
//////////////////////////////////////////////////////////////////////

The 'ldlm_request' indicates which lock is being canceled in its
(first) 'lock_handle' field. The OST then looks for anyone else
waiting on that lock, which it finds is Client1. It waits to reply to
Client2 with an LDLM_CANCEL reply until after it has notified Client1.

8 - The OST notifies Client1 that it now has the lock.

The 'ldlm_request' structure now has the granted mode set to protected
write. The OST also sends along any updated attributes, as would be the
case if, for example, Client1 had flushed its dirty write cache.

.LDLM_CP_CALLBACK Request Packet Structure
image::ldlm-cp-callback-request.png["LDLM_CP_CALLBACK Request Packet Structure",height=50]

//////////////////////////////////////////////////////////////////////
The ldlm-cp-callback-request.png diagram resembles this text
art:

--request-------------------------------
| ptlrpc_body | ldlm_request | ost_lvb |
----------------------------------------
//////////////////////////////////////////////////////////////////////

9 - Client1 acknowledges the lock update.

The reply is "empty" in this case as well. The opcode in the
'ptlrpc_body' is sufficient to inform the OST that Client1 got its
lock.

.LDLM_CP_CALLBACK Reply Packet Structure
image::ldlm-cp-callback-reply.png["LDLM_CP_CALLBACK Reply Packet Structure",height=50]

//////////////////////////////////////////////////////////////////////
The ldlm-cp-callback-reply.png diagram resembles this text
art:
//////////////////////////////////////////////////////////////////////

10 - Client1 issues an OST_PUNCH request.

As with the OST_SETATTR RPC there is an 'ost_body' structure.

.OST_PUNCH Request Packet Structure
image::ost-punch-request.png["OST_PUNCH Request Packet Structure",height=50]

//////////////////////////////////////////////////////////////////////
The ost-punch-request.png diagram resembles this text art:

--request-----------------
| ptlrpc_body | ost_body |
--------------------------
//////////////////////////////////////////////////////////////////////

In this case the 'o_valid' field is 0x30403d:

.Flags for 'o_valid' field of 'struct ost_body'
|====
| OBD_MD_FLID     | OST ID
| OBD_MD_FLMTIME  | mtime attribute
| OBD_MD_FLCTIME  | ctime attribute
| OBD_MD_FLSIZE   | size attribute
| OBD_MD_FLBLOCKS | blocks attribute
| OBD_MD_FLGENER  | generation
| OBD_MD_FLCKSUM  | checksum
| OBD_MD_FLQOS    | quality of service
|====

11 - The OST acknowledges the LDLM_CANCEL (step 7) from Client2.

The OST finishes up with the lock cancel (after having notified
Client1) by replying to Client2. This happens asynchronously with the
arrival of the OST_PUNCH request. In <<truncate-rpcs>> it is shown
occurring after the OST_PUNCH, but that order is not required.

.LDLM_CANCEL Reply Packet Structure
image::ldlm-cancel-reply.png["LDLM_CANCEL Reply Packet Structure",height=50]

//////////////////////////////////////////////////////////////////////
The ldlm-cancel-reply.png diagram resembles this text art:
//////////////////////////////////////////////////////////////////////

The LDLM_CANCEL reply is a so-called "empty" RPC. Its only purpose is
to acknowledge receipt of the LDLM_CANCEL request.

12 - The OST sends an OST_PUNCH reply.

The OST_PUNCH reply also resembles the OST_SETATTR reply:

.OST_PUNCH Reply Packet Structure
image::ost-punch-reply.png["OST_PUNCH Reply Packet Structure",height=50]

//////////////////////////////////////////////////////////////////////
The ost-punch-reply.png diagram resembles this text art:

--reply-------------------
| ptlrpc_body | ost_body |
--------------------------
//////////////////////////////////////////////////////////////////////

The 'o_valid' field is 0x1, so only the 'o_id' field is
interpreted. It just acknowledges that the requested change has been
made.