figures/client_mgs_connect_rpcs.png \
figures/client_mdt_connect_rpcs.png \
figures/client_ost_connect_rpcs.png \
- figures/umount_rpcs.png
+ figures/umount_rpcs.png \
+ figures/ost-connect-generic.png \
+ figures/ost-connect-request.png \
+ figures/ost-connect-reply.png \
+ figures/mds-connect-generic.png \
+ figures/mds-connect-request.png \
+ figures/mds-connect-reply.png \
+ figures/mgs-connect-generic.png \
+ figures/mgs-connect-request.png \
+ figures/mgs-connect-reply.png \
+ figures/ost-disconnect-generic.png \
+ figures/mds-disconnect-generic.png \
+ figures/mgs-disconnect-generic.png \
+ figures/mds-getattr-generic.png \
+ figures/mds-getstatus-generic.png \
+ figures/mgs-config-read-generic.png \
+ figures/llog-origin-handle-create-generic.png \
+ figures/llog-origin-handle-create-reply.png \
+ figures/llog-origin-handle-create-request.png \
+ figures/llog-origin-handle-next-block-generic.png \
+ figures/llog-origin-handle-next-block-request.png \
+ figures/llog-origin-handle-next-block-reply.png \
+ figures/llog-origin-handle-read-header-generic.png \
+ figures/llog-origin-handle-read-header-request.png \
+ figures/llog-origin-handle-read-header-reply.png
TEXT = protocol.txt \
introduction.txt \
+ transno.txt \
connection.txt \
+ struct_obd_connect_data.txt \
import.txt \
export.txt \
+ struct_obd_uuid.txt \
timeouts.txt \
eviction.txt \
recovery.txt \
- file_id.txt \
path_lookup.txt \
+ lov_index.txt \
+ grant.txt \
ldlm.txt \
- layout_intent.txt \
- ldlm_resource_id.txt \
- ldlm_intent.txt \
- ldlm_resource_desc.txt \
- ldlm_lock_desc.txt \
- ldlm_request.txt \
- ldlm_reply.txt \
- ost_lvb.txt \
early_lock_cancellation.txt \
llog.txt \
security.txt \
setxattr.txt \
lustre_rpcs.txt \
ost_setattr.txt \
+ struct_ptlrpc_body.txt \
+ struct_lustre_handle.txt \
ost_connect.txt \
ost_disconnect.txt \
ost_punch.txt \
ost_statfs.txt \
+ struct_obd_statfs.txt \
mds_getattr.txt \
+ struct_mdt_body.txt \
+ struct_lu_fid.txt \
mds_reint.txt \
+ struct_mdt_rec_reint.txt \
+ struct_mdt_rec_setattr.txt \
+ struct_mdt_rec_setxattr.txt \
mds_connect.txt \
mds_disconnect.txt \
mds_getstatus.txt \
mds_statfs.txt \
mds_getxattr.txt \
ldlm_enqueue.txt \
+ struct_ldlm_request.txt \
+ struct_ldlm_intent.txt \
+ struct_layout_intent.txt \
+ struct_ldlm_reply.txt \
+ struct_ost_lvb.txt \
+ struct_lov_mds_md.txt \
+ struct_ost_id.txt \
ldlm_cancel.txt \
ldlm_bl_callback.txt \
ldlm_cp_callback.txt \
mgs_connect.txt \
mgs_disconnect.txt \
mgs_config_read.txt \
+ struct_mgs_config_body.txt \
llog_origin_handle_create.txt \
+ struct_llogd_body.txt \
llog_origin_handle_next_block.txt \
llog_origin_handle_read_header.txt \
- data_types.txt \
- lustre_handle.txt \
- ptlrpc_body.txt \
- mdt_structs.txt \
- mdt_body.txt \
- obd_statfs.txt \
- mds_reint_structs.txt \
- mdt_rec_reint.txt \
- mdt_rec_setattr.txt \
- mdt_rec_setxattr.txt \
- ost_setattr_structs.txt \
+ struct_llog_log_hdr.txt \
+ struct_lustre_msg.txt \
glossary.txt
.SUFFIXES : .gnuplot .gv .pdf .png .fig
reverse-import uses the same structure as a regular import, and the
reverse-export uses the same structure as a regular export.
-include::obd_connect_data.txt[]
+include::struct_obd_connect_data.txt[]
include::import.txt[]
+++ /dev/null
-Data Structures and Defines
-~~~~~~~~~~~~~~~~~~~~~~~~~~~
-[[data-structs]]
-
-The following data types are used in the Lustre protocol description.
-
-.Basic Data Types
-[options="header"]
-|=====
-| data types | size
-| __u8 | an 8-bit unsigned integer
-| __u16 | a 16-bit unsigned integer
-| __u32 | a 32-bit unsigned integer
-| __u64 | a 64-bit unsigned integer
-| __s64 | a 64-bit signed integer
-| obd_time | an __s64
-|=====
-
-
-The following topics introduce the various kinds of data that are
-represented and manipulated in Lustre messages and representations of
-the shared state on clients and servers.
-
-Grant
-^^^^^
-[[grant]]
-
-A grant value is part of a client's state for a given target. It
-provides an upper bound to the amount of dirty cache data the client
-will allow that is destined for the target. The value is established
-by agreement between the server and the client and represents a
-guarantee by the server that the target storage has enough free space
-for at least the amount of granted dirty data. The client can ask for
-additional grant with each write RPC, which the server may provide
-depending on how much available (ungranted and unallocated) space the
-target has.
-
-LOV Index
-^^^^^^^^^
-[[lov-index]]
-
-Each target is assigned an LOV index (by the 'mkfs.lustre' command line)
-as the target is added to the file system. This value is stored by the
-target locally as well as on the MGS in order to serve as a unique
-identifier in the file system.
-
-Transaction Number
-^^^^^^^^^^^^^^^^^^
-[[transno]]
-
-For each target there is a sequence of values (a strictly increasing
-series of numbers) where each operation that can modify the file
-system is assigned the next number in the series. This is the
-transaction number, and it imposes a strict serial ordering for all
-file system modifying operations. For file system modifying
-requests the server assigns the next value in the sequence and
-informs the client of the value in the 'pb_transno' field of the
-'ptlrpc_body' of its reply to the client's request. In replies to
-requests that do not modify the file system, the 'pb_transno' field
-in the 'ptlrpc_body' is simply set to 0.
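The sequencing described above can be sketched as follows. This is a minimal illustration, not Lustre code; the 'target_seq' structure and 'assign_transno' helper are hypothetical, standing in for the server's per-target bookkeeping:

```c
#include <stdint.h>

/* Hypothetical per-target state: the last transaction number handed out. */
struct target_seq {
	uint64_t last_transno;
};

/* File-system-modifying requests get the next value in a strictly
 * increasing sequence, reported in pb_transno of the reply; requests
 * that modify nothing report pb_transno = 0. */
static uint64_t assign_transno(struct target_seq *t, int modifies_fs)
{
	if (!modifies_fs)
		return 0;
	return ++t->last_transno;
}
```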
-
-Extended Attributes
-^^^^^^^^^^^^^^^^^^^
-
-I have not figured out how the so-called 'eadata' buffers are
-handled yet. I am told that this is not just for extended attributes,
-but is a generic structure.
-
-Also, see <<struct-lu-fid>>.
-
-MGS Configuration Data
-^^^^^^^^^^^^^^^^^^^^^^
-
-----
-#define MTI_NAME_MAXLEN 64
-struct mgs_config_body {
- char mcb_name[MTI_NAME_MAXLEN]; /* logname */
- __u64 mcb_offset; /* next index of config log to request */
- __u16 mcb_type; /* type of log: CONFIG_T_[CONFIG|RECOVER] */
- __u8 mcb_reserved;
- __u8 mcb_bits; /* bits unit size of config log */
- __u32 mcb_units; /* # of units for bulk transfer */
-};
-----
-
-The 'mgs_config_body' structure identifies to the MGS which Lustre
-file system the client is requesting configuration information
-for. 'mcb_name' contains the filesystem name (fsname). 'mcb_offset'
-contains the next record number in the configuration llog to process
-(see <<llog>> for details), not the byte offset or bulk transfer units.
-'mcb_bits' is the log2 of the minimum bulk transfer unit size,
-typically 4096 or 8192 bytes, while 'mcb_units' is the maximum number
-of 2^mcb_bits sized units that can be transferred in a single request.
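The relationship between 'mcb_bits' and 'mcb_units' can be made concrete with a small sketch. The helper below is hypothetical (not part of Lustre) and just computes the largest payload a single reply may carry under this scheme:

```c
#include <stdint.h>

/* The maximum bulk transfer size is the number of units times the
 * unit size, where the unit size is 2^mcb_bits bytes. */
static uint64_t mgs_config_max_transfer(uint8_t mcb_bits, uint32_t mcb_units)
{
	return (uint64_t)mcb_units << mcb_bits;
}
```

For example, with 4096-byte units (mcb_bits = 12) and 8 units, a single request can move 32 KiB of configuration log data.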
-
-----
-struct mgs_config_res {
- __u64 mcr_offset; /* index of last config log */
- __u64 mcr_size; /* size of the log */
-};
-----
-
-The 'mgs_config_res' structure returns information describing the
-replied configuration llog data requested in 'mgs_config_body'.
-'mcr_offset' contains the last configuration record number returned
-by this reply. 'mcr_size' contains the maximum record index in the
-entire configuration llog. When 'mcr_offset' equals 'mcr_size' there
-are no more records to process in the log.
-
-include::lustre_handle.txt[]
-
-Lustre Message Header
-^^^^^^^^^^^^^^^^^^^^^
-[[struct-lustre-msg]]
-
-Every message has an initial header that informs the receiver about
-the number of buffers and their size for the rest of the message to
-follow, along with other important information about the request or
-reply message.
-
-----
-#define LUSTRE_MSG_MAGIC_V2 0x0BD00BD3
-#define MSGHDR_AT_SUPPORT 0x1
-struct lustre_msg_v2 {
- __u32 lm_bufcount;
- __u32 lm_secflvr;
- __u32 lm_magic;
- __u32 lm_repsize;
- __u32 lm_cksum;
- __u32 lm_flags;
- __u32 lm_padding_2;
- __u32 lm_padding_3;
- __u32 lm_buflens[0];
-};
-#define lustre_msg lustre_msg_v2
-----
-
-The 'lm_bufcount' field holds the number of buffers that will follow
-the header. The header and sequence of buffers constitutes one
-message. Each of the buffers is a sequence of bytes whose contents
-corresponds to one of the structures described in this section. Each
-message will always have at least one buffer, and no message can have
-more than thirty-one buffers.
-
-The 'lm_secflvr' field gives an indication of whether any sort of
-cryptographic encoding of the subsequent buffers will be in force. The
-value is zero if there is no "crypto" and gives a code identifying the
-"flavor" of crypto if it is employed. Further, if crypto is employed
-there will only be one buffer following (i.e. 'lm_bufcount' = 1), and
-that buffer holds an encoding of what would otherwise have been the
-sequence of buffers normally following the header. Cryptography will
-be discussed in a separate chapter.
-
-The 'lm_magic' field is a "magic" value (LUSTRE_MSG_MAGIC_V2 = 0x0BD00BD3,
-'OBD' for 'object based device') that is
-checked in order to positively identify that the message is intended
-for the use to which it is being put. That is, we are indeed dealing
-with a Lustre message, and not, for example, corrupted memory or a bad
-pointer.
-
-The 'lm_repsize' field in a request indicates the maximum available
-space that has been reserved for any reply to the request. A reply
-that attempts to use more than the reserved space will be discarded.
-
-The 'lm_cksum' field contains a checksum of the 'ptlrpc_body' buffer
-to allow the receiver to verify that the message is intact. This is
-used to verify that an 'early reply' has not been overwritten by the
-actual reply message. If the 'MSGHDR_CKSUM_INCOMPAT18' flag is set
-in a request, a server that understands the flag will send early
-reply messages with the appropriate 'lm_cksum'. The flag has been
-available since Lustre 1.8 and is mandatory in Lustre 2.8 and later.
-
-The 'lm_flags' field contains flags that affect the low-level RPC
-protocol. The 'MSGHDR_AT_SUPPORT' (0x1) bit indicates that the sender
-understands adaptive timeouts and can receive 'early reply' messages
-to extend its waiting period rather than timing out. This flag was
-introduced in Lustre 1.6. The 'MSGHDR_CKSUM_INCOMPAT18' (0x2) bit
-indicates that 'lm_cksum' is computed on the full 'ptlrpc_body'
-message buffer rather than on the original 'ptlrpc_body_v2' structure
-size (88 bytes). It was introduced in Lustre 1.8 and is mandatory
-for all requests in Lustre 2.8 and later.
-
-The 'lm_padding*' fields are reserved for future use.
-
-The array of 'lm_buflens' values has 'lm_bufcount' entries. Each
-entry corresponds to, and gives the length in bytes of, one of the
-buffers that will follow. The entire header, and each of the buffers,
-is required to be a multiple of eight bytes long to ensure the buffers
-are properly aligned to hold __u64 values. Thus there may be an extra
-four bytes of padding after the 'lm_buflens' array if that array has
-an odd number of entries.
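The alignment rule above can be sketched in C. These helpers are illustrative only (Lustre has its own equivalents with different names); they show how the total wire size of a message follows from 'lm_bufcount' and the 'lm_buflens' array, with every component rounded up to a multiple of eight bytes:

```c
#include <stdint.h>

/* Round a buffer length up to the next multiple of 8 bytes so the
 * following buffer stays aligned for __u64 access. */
static uint32_t lustre_msg_round(uint32_t len)
{
	return (len + 7) & ~7u;
}

/* Total message size: the 8 fixed __u32 header fields, the lm_buflens
 * array (the pair padded together to 8 bytes), then each buffer
 * individually rounded up. */
static uint32_t lustre_msg_size(uint32_t bufcount, const uint32_t *buflens)
{
	uint32_t size = lustre_msg_round(8 * 4 + bufcount * 4);
	uint32_t i;

	for (i = 0; i < bufcount; i++)
		size += lustre_msg_round(buflens[i]);
	return size;
}
```

With an odd 'lm_bufcount' the header rounding contributes the four bytes of padding mentioned above; with an even count it contributes none.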
-
-include::ptlrpc_body.txt[]
-
-Object Based Disk UUID
-^^^^^^^^^^^^^^^^^^^^^^
-[[struct-obd-uuid]]
-
-----
-#define UUID_MAX 40
-struct obd_uuid {
- char uuid[UUID_MAX];
-};
-----
-
-The 'obd_uuid' contains an ASCII-formatted string that identifies
-the entity uniquely within the filesystem. Clients use an RFC-4122
-hexadecimal UUID of the form ''de305d54-75b4-431b-adb2-eb6b9e546014''
-that is randomly generated. Servers may use a string-based identifier
-of the form ''fsname-TGTindx_UUID''.
-
-File IDentifier (FID)
-^^^^^^^^^^^^^^^^^^^^^
-
-See <<struct-lu-fid>>.
-
-OST ID
-^^^^^^
-[[struct-ost-id]]
-The 'ost_id' identifies a single object on a particular OST.
-
-----
-struct ost_id {
- union {
- struct ostid {
- __u64 oi_id;
- __u64 oi_seq;
- } oi;
- struct lu_fid oi_fid;
- };
-};
-----
-
-The 'ost_id' structure contains an identifier for a single OST object.
-The 'oi' structure holds the OST object identifier as used with Lustre
-1.8 and earlier, where the 'oi_seq' field is typically zero, and the
-'oi_id' field is an integer identifying an object on a particular
-OST (which is identified separately). Since Lustre 2.5 it is possible
-for OST objects to also be identified with a unique FID that identifies
-both the OST on which it resides as well as the object identifier itself.
Export
^^^^^^
[[obd-export]]
+
An 'obd_export' structure for a given target is created on a server
for each client that connects to that target. The exports for all the
clients for a given target are managed together. The export records
the connection state on the server, which is what allows a client
to reconnect and participate in recovery; a client without
any export data will not be allowed to participate in recovery.
+[source,c]
----
struct obd_export {
struct portals_handle exp_handle;
////^^^^
//////////////////////////////////////////////////////////////////////
+include::struct_obd_uuid.txt[]
+
The 'exp_client_uuid' holds the UUID of the client connected to this
export. This UUID is randomly generated by the client and the same
UUID is used by the client for connecting to all servers, so that
//////////////////////////////////////////////////////////////////////
////vvvv
The 'exp_flags' field encodes three directives as follows:
+[source,c]
----
enum obd_option {
OBD_OPT_FORCE = 0x0001,
--- /dev/null
+#FIG 3.2 Produced by xfig version 3.2.5b
+Landscape
+Center
+Inches
+Letter
+100.00
+Single
+-2
+1200 2
+2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 9
+ 3000 900 3000 975 3000 1050 3000 1125 3000 1200 3000 1275
+ 3000 1350 3000 1425 3000 1500
+2 1 0 2 0 7 50 -1 -1 0.000 0 0 7 0 0 5
+ 1815 1875 4575 1875 4575 2475 1215 2475 1215 2025
+2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 9
+ 3015 1875 3015 1950 3015 2025 3015 2100 3015 2175 3015 2250
+ 3015 2325 3015 2400 3015 2475
+2 1 0 2 0 7 50 -1 -1 0.000 0 0 7 0 0 5
+ 2025 835 5475 825 5475 1425 1200 1435 1200 985
+2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 9
+ 4575 825 4575 900 4575 975 4575 1050 4575 1125 4575 1200
+ 4575 1275 4575 1350 4575 1425
+4 0 0 50 -1 16 18 0.0000 4 255 930 1050 1005 request\001
+4 0 0 50 -1 16 18 0.0000 4 270 615 1140 1950 reply\001
+4 0 0 50 -1 16 18 0.0000 4 270 1530 1365 2250 ptlrpc_body\001
+4 0 0 50 -1 16 18 0.0000 4 270 1425 3075 2250 llogd_body\001
+4 0 0 50 -1 16 18 0.0000 4 270 690 4650 1275 string\001
+4 0 0 50 -1 16 18 0.0000 4 270 1425 3075 1275 llogd_body\001
+4 0 0 50 -1 16 18 0.0000 4 270 1530 1350 1275 ptlrpc_body\001
+4 0 0 50 -1 16 18 0.0000 4 270 4695 1050 675 LLOG_ORIGIN_HANDLE_CREATE\001
--- /dev/null
+#FIG 3.2 Produced by xfig version 3.2.5b
+Landscape
+Center
+Inches
+Letter
+100.00
+Single
+-2
+1200 2
+2 1 0 2 0 7 50 -1 -1 0.000 0 0 7 0 0 5
+ 1815 975 4575 975 4575 1575 1215 1575 1215 1125
+2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 9
+ 2925 975 2925 1050 2925 1125 2925 1200 2925 1275 2925 1350
+ 2925 1425 2925 1500 2925 1575
+4 0 0 50 -1 16 18 0.0000 4 270 4695 1050 675 LLOG_ORIGIN_HANDLE_CREATE\001
+4 0 0 50 -1 16 18 0.0000 4 270 615 1200 1050 reply\001
+4 0 0 50 -1 16 18 0.0000 4 270 1530 1275 1425 ptlrpc_body\001
+4 0 0 50 -1 16 18 0.0000 4 270 1425 3000 1425 llogd_body\001
--- /dev/null
+#FIG 3.2 Produced by xfig version 3.2.5b
+Landscape
+Center
+Inches
+Letter
+100.00
+Single
+-2
+1200 2
+2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 9
+ 3000 900 3000 975 3000 1050 3000 1125 3000 1200 3000 1275
+ 3000 1350 3000 1425 3000 1500
+2 1 0 2 0 7 50 -1 -1 0.000 0 0 7 0 0 5
+ 2025 835 5475 825 5475 1425 1200 1435 1200 985
+2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 9
+ 4575 825 4575 900 4575 975 4575 1050 4575 1125 4575 1200
+ 4575 1275 4575 1350 4575 1425
+4 0 0 50 -1 16 18 0.0000 4 255 930 1050 1005 request\001
+4 0 0 50 -1 16 18 0.0000 4 270 690 4650 1275 string\001
+4 0 0 50 -1 16 18 0.0000 4 270 1425 3075 1275 llogd_body\001
+4 0 0 50 -1 16 18 0.0000 4 270 1530 1350 1275 ptlrpc_body\001
+4 0 0 50 -1 16 18 0.0000 4 270 4695 1050 675 LLOG_ORIGIN_HANDLE_CREATE\001
--- /dev/null
+#FIG 3.2 Produced by xfig version 3.2.5b
+Landscape
+Center
+Inches
+Letter
+100.00
+Single
+-2
+1200 2
+2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 9
+ 3000 900 3000 975 3000 1050 3000 1125 3000 1200 3000 1275
+ 3000 1350 3000 1425 3000 1500
+2 1 0 2 0 7 50 -1 -1 0.000 0 0 7 0 0 5
+ 1815 1875 5775 1875 5775 2475 1215 2475 1215 2025
+2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 9
+ 3015 1875 3015 1950 3015 2025 3015 2100 3015 2175 3015 2250
+ 3015 2325 3015 2400 3015 2475
+2 1 0 2 0 7 50 -1 -1 0.000 0 0 7 0 0 5
+ 2025 835 4575 825 4575 1425 1200 1435 1200 985
+2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 9
+ 4650 1875 4650 1950 4650 2025 4650 2100 4650 2175 4650 2250
+ 4650 2325 4650 2400 4650 2475
+4 0 0 50 -1 16 18 0.0000 4 255 930 1050 1005 request\001
+4 0 0 50 -1 16 18 0.0000 4 270 615 1140 1950 reply\001
+4 0 0 50 -1 16 18 0.0000 4 270 1530 1365 2250 ptlrpc_body\001
+4 0 0 50 -1 16 18 0.0000 4 270 1425 3075 2250 llogd_body\001
+4 0 0 50 -1 16 18 0.0000 4 270 1425 3075 1275 llogd_body\001
+4 0 0 50 -1 16 18 0.0000 4 270 1530 1350 1275 ptlrpc_body\001
+4 0 0 50 -1 16 18 0.0000 4 210 900 4725 2250 eadata\001
+4 0 0 50 -1 16 18 0.0000 4 270 5445 1050 675 LLOG_ORIGIN_HANDLE_NEXT_BLOCK\001
--- /dev/null
+#FIG 3.2 Produced by xfig version 3.2.5b
+Landscape
+Center
+Inches
+Letter
+100.00
+Single
+-2
+1200 2
+2 1 0 2 0 7 50 -1 -1 0.000 0 0 7 0 0 5
+ 1815 900 5775 900 5775 1500 1215 1500 1215 1050
+2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 9
+ 2925 900 2925 975 2925 1050 2925 1125 2925 1200 2925 1275
+ 2925 1350 2925 1425 2925 1500
+2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 9
+ 4500 900 4500 975 4500 1050 4500 1125 4500 1200 4500 1275
+ 4500 1350 4500 1425 4500 1500
+4 0 0 50 -1 16 18 0.0000 4 270 5445 1050 675 LLOG_ORIGIN_HANDLE_NEXT_BLOCK\001
+4 0 0 50 -1 16 18 0.0000 4 270 615 1200 975 reply\001
+4 0 0 50 -1 16 18 0.0000 4 270 1530 1275 1350 ptlrpc_body\001
+4 0 0 50 -1 16 18 0.0000 4 270 1425 3000 1350 llogd_body\001
+4 0 0 50 -1 16 18 0.0000 4 210 900 4575 1350 eadata\001
--- /dev/null
+#FIG 3.2 Produced by xfig version 3.2.5b
+Landscape
+Center
+Inches
+Letter
+100.00
+Single
+-2
+1200 2
+2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 9
+ 3000 900 3000 975 3000 1050 3000 1125 3000 1200 3000 1275
+ 3000 1350 3000 1425 3000 1500
+2 1 0 2 0 7 50 -1 -1 0.000 0 0 7 0 0 5
+ 2025 835 4575 825 4575 1425 1200 1435 1200 985
+4 0 0 50 -1 16 18 0.0000 4 255 930 1050 1005 request\001
+4 0 0 50 -1 16 18 0.0000 4 270 1425 3075 1275 llogd_body\001
+4 0 0 50 -1 16 18 0.0000 4 270 1530 1350 1275 ptlrpc_body\001
+4 0 0 50 -1 16 18 0.0000 4 270 5445 1050 675 LLOG_ORIGIN_HANDLE_NEXT_BLOCK\001
--- /dev/null
+#FIG 3.2 Produced by xfig version 3.2.5b
+Landscape
+Center
+Inches
+Letter
+100.00
+Single
+-2
+1200 2
+2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 9
+ 3000 900 3000 975 3000 1050 3000 1125 3000 1200 3000 1275
+ 3000 1350 3000 1425 3000 1500
+2 1 0 2 0 7 50 -1 -1 0.000 0 0 7 0 0 5
+ 1815 1875 4725 1875 4725 2475 1215 2475 1215 2025
+2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 9
+ 3015 1875 3015 1950 3015 2025 3015 2100 3015 2175 3015 2250
+ 3015 2325 3015 2400 3015 2475
+2 1 0 2 0 7 50 -1 -1 0.000 0 0 7 0 0 5
+ 2025 835 4575 825 4575 1425 1200 1435 1200 985
+4 0 0 50 -1 16 18 0.0000 4 255 930 1050 1005 request\001
+4 0 0 50 -1 16 18 0.0000 4 270 615 1140 1950 reply\001
+4 0 0 50 -1 16 18 0.0000 4 270 1530 1365 2250 ptlrpc_body\001
+4 0 0 50 -1 16 18 0.0000 4 270 1425 3075 1275 llogd_body\001
+4 0 0 50 -1 16 18 0.0000 4 270 1530 1350 1275 ptlrpc_body\001
+4 0 0 50 -1 16 18 0.0000 4 270 1575 3075 2250 llog_log_hdr\001
+4 0 0 50 -1 16 18 0.0000 4 270 5700 1050 675 LLOG_ORIGIN_HANDLE_READ_HEADER\001
--- /dev/null
+#FIG 3.2 Produced by xfig version 3.2.5b
+Landscape
+Center
+Inches
+Letter
+100.00
+Single
+-2
+1200 2
+2 1 0 2 0 7 50 -1 -1 0.000 0 0 7 0 0 5
+ 1815 900 4725 900 4725 1500 1215 1500 1215 1050
+2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 9
+ 2925 900 2925 975 2925 1050 2925 1125 2925 1200 2925 1275
+ 2925 1350 2925 1425 2925 1500
+4 0 0 50 -1 16 18 0.0000 4 270 5700 1050 675 LLOG_ORIGIN_HANDLE_READ_HEADER\001
+4 0 0 50 -1 16 18 0.0000 4 270 615 1200 975 reply\001
+4 0 0 50 -1 16 18 0.0000 4 270 1530 1350 1350 ptlrpc_body\001
+4 0 0 50 -1 16 18 0.0000 4 270 1575 3000 1350 llog_log_hdr\001
--- /dev/null
+#FIG 3.2 Produced by xfig version 3.2.5b
+Landscape
+Center
+Inches
+Letter
+100.00
+Single
+-2
+1200 2
+2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 9
+ 3000 900 3000 975 3000 1050 3000 1125 3000 1200 3000 1275
+ 3000 1350 3000 1425 3000 1500
+2 1 0 2 0 7 50 -1 -1 0.000 0 0 7 0 0 5
+ 2025 835 4575 825 4575 1425 1200 1435 1200 985
+4 0 0 50 -1 16 18 0.0000 4 255 930 1050 1005 request\001
+4 0 0 50 -1 16 18 0.0000 4 270 1425 3075 1275 llogd_body\001
+4 0 0 50 -1 16 18 0.0000 4 270 1530 1350 1275 ptlrpc_body\001
+4 0 0 50 -1 16 18 0.0000 4 270 5700 1050 675 LLOG_ORIGIN_HANDLE_READ_HEADER\001
--- /dev/null
+#FIG 3.2 Produced by xfig version 3.2.5b
+Landscape
+Center
+Inches
+Letter
+100.00
+Single
+-2
+1200 2
+2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 9
+ 3000 900 3000 975 3000 1050 3000 1125 3000 1200 3000 1275
+ 3000 1350 3000 1425 3000 1500
+2 1 0 2 0 7 50 -1 -1 0.000 0 0 7 0 0 5
+ 1815 1875 5745 1860 5730 2460 1215 2475 1215 2025
+2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 9
+ 3015 1875 3015 1950 3015 2025 3015 2100 3015 2175 3015 2250
+ 3015 2325 3015 2400 3015 2475
+2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 9
+ 4350 825 4350 900 4350 975 4350 1050 4350 1125 4350 1200
+ 4350 1275 4350 1350 4350 1425
+2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 9
+ 5775 825 5775 900 5775 975 5775 1050 5775 1125 5775 1200
+ 5775 1275 5775 1350 5775 1425
+2 1 0 2 0 7 50 -1 -1 0.000 0 0 7 0 0 5
+ 2025 835 10350 840 10335 1425 1200 1435 1200 985
+2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 9
+ 7650 825 7650 900 7650 975 7650 1050 7650 1125 7650 1200
+ 7650 1275 7650 1350 7650 1425
+4 0 0 50 -1 16 18 0.0000 4 255 930 1050 1005 request\001
+4 0 0 50 -1 16 18 0.0000 4 270 1530 1350 1305 ptlrpc_body\001
+4 0 0 50 -1 16 18 0.0000 4 270 615 1140 1950 reply\001
+4 0 0 50 -1 16 18 0.0000 4 270 1530 1365 2250 ptlrpc_body\001
+4 0 0 50 -1 16 18 0.0000 4 270 1185 3105 1320 obd_uuid\001
+4 0 0 50 -1 16 18 0.0000 4 270 1185 4500 1275 obd_uuid\001
+4 0 0 50 -1 16 18 0.0000 4 270 2415 7770 1275 obd_connect_data\001
+4 0 0 50 -1 16 18 0.0000 4 270 1680 5895 1275 lustre_handle\001
+4 0 0 50 -1 16 18 0.0000 4 270 2415 3150 2250 obd_connect_data\001
+4 0 0 50 -1 16 18 0.0000 4 270 2250 1050 675 MDS_CONNECT\001
--- /dev/null
+#FIG 3.2 Produced by xfig version 3.2.5b
+Landscape
+Center
+Inches
+Letter
+100.00
+Single
+-2
+1200 2
+2 1 0 2 0 7 50 -1 -1 0.000 0 0 7 0 0 5
+ 1815 976 5745 961 5730 1561 1215 1576 1215 1126
+2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 9
+ 3000 975 3000 1050 3000 1125 3000 1200 3000 1275 3000 1350
+ 3000 1425 3000 1500 3000 1575
+4 0 0 50 -1 16 18 0.0000 4 270 615 1200 1050 reply\001
+4 0 0 50 -1 16 18 0.0000 4 270 1530 1350 1425 ptlrpc_body\001
+4 0 0 50 -1 16 18 0.0000 4 270 2415 3150 1350 obd_connect_data\001
+4 0 0 50 -1 16 18 0.0000 4 270 2250 1050 675 MDS_CONNECT\001
--- /dev/null
+#FIG 3.2 Produced by xfig version 3.2.5b
+Landscape
+Center
+Inches
+Letter
+100.00
+Single
+-2
+1200 2
+2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 9
+ 3000 900 3000 975 3000 1050 3000 1125 3000 1200 3000 1275
+ 3000 1350 3000 1425 3000 1500
+2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 9
+ 4350 825 4350 900 4350 975 4350 1050 4350 1125 4350 1200
+ 4350 1275 4350 1350 4350 1425
+2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 9
+ 5775 825 5775 900 5775 975 5775 1050 5775 1125 5775 1200
+ 5775 1275 5775 1350 5775 1425
+2 1 0 2 0 7 50 -1 -1 0.000 0 0 7 0 0 5
+ 2025 835 10350 840 10335 1425 1200 1435 1200 985
+2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 9
+ 7650 825 7650 900 7650 975 7650 1050 7650 1125 7650 1200
+ 7650 1275 7650 1350 7650 1425
+4 0 0 50 -1 16 18 0.0000 4 255 930 1050 1005 request\001
+4 0 0 50 -1 16 18 0.0000 4 270 1530 1350 1305 ptlrpc_body\001
+4 0 0 50 -1 16 18 0.0000 4 270 1185 3105 1320 obd_uuid\001
+4 0 0 50 -1 16 18 0.0000 4 270 1185 4500 1275 obd_uuid\001
+4 0 0 50 -1 16 18 0.0000 4 270 2415 7770 1275 obd_connect_data\001
+4 0 0 50 -1 16 18 0.0000 4 270 1680 5895 1275 lustre_handle\001
+4 0 0 50 -1 16 18 0.0000 4 270 2250 1050 675 MDS_CONNECT\001
--- /dev/null
+#FIG 3.2 Produced by xfig version 3.2.5b
+Landscape
+Center
+Inches
+Letter
+100.00
+Single
+-2
+1200 2
+2 1 0 2 0 7 50 -1 -1 0.000 0 0 7 0 0 5
+ 1815 1875 3120 1860 3120 2475 1215 2475 1215 2025
+2 1 0 2 0 7 50 -1 -1 0.000 0 0 7 0 0 5
+ 2025 975 3120 960 3120 1575 1215 1575 1215 1125
+4 0 0 50 -1 16 18 0.0000 4 255 930 1050 1005 request\001
+4 0 0 50 -1 16 18 0.0000 4 270 1530 1350 1305 ptlrpc_body\001
+4 0 0 50 -1 16 18 0.0000 4 270 615 1140 1950 reply\001
+4 0 0 50 -1 16 18 0.0000 4 270 1530 1365 2250 ptlrpc_body\001
+4 0 0 50 -1 16 18 0.0000 4 270 2745 1050 675 MDS_DISCONNECT\001
--- /dev/null
+#FIG 3.2 Produced by xfig version 3.2.5b
+Landscape
+Center
+Inches
+Letter
+100.00
+Single
+-2
+1200 2
+2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 9
+ 3000 900 3000 975 3000 1050 3000 1125 3000 1200 3000 1275
+ 3000 1350 3000 1425 3000 1500
+2 1 0 2 0 7 50 -1 -1 0.000 0 0 7 0 0 5
+ 1815 1875 10050 1875 10050 2475 1215 2475 1215 2025
+2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 9
+ 3015 1875 3015 1950 3015 2025 3015 2100 3015 2175 3015 2250
+ 3015 2325 3015 2400 3015 2475
+2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 9
+ 4470 1890 4470 1965 4470 2040 4470 2115 4470 2190 4470 2265
+ 4470 2340 4470 2415 4470 2490
+2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 9
+ 4545 915 4545 990 4545 1065 4545 1140 4545 1215 4545 1290
+ 4545 1365 4545 1440 4545 1515
+2 1 0 2 0 7 50 -1 -1 0.000 0 0 7 0 0 5
+ 2025 900 6300 900 6300 1500 1200 1500 1200 1050
+2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 9
+ 5925 1875 5925 1950 5925 2025 5925 2100 5925 2175 5925 2250
+ 5925 2325 5925 2400 5925 2475
+2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 9
+ 6750 1875 6750 1950 6750 2025 6750 2100 6750 2175 6750 2250
+ 6750 2325 6750 2400 6750 2475
+2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 9
+ 8400 1875 8400 1950 8400 2025 8400 2100 8400 2175 8400 2250
+ 8400 2325 8400 2400 8400 2475
+4 0 0 50 -1 16 18 0.0000 4 255 930 1050 1005 request\001
+4 0 0 50 -1 16 18 0.0000 4 270 1530 1350 1305 ptlrpc_body\001
+4 0 0 50 -1 16 18 0.0000 4 270 615 1140 1950 reply\001
+4 0 0 50 -1 16 18 0.0000 4 270 1530 1365 2250 ptlrpc_body\001
+4 0 0 50 -1 16 18 0.0000 4 270 1305 3105 1320 mdt_body\001
+4 0 0 50 -1 16 18 0.0000 4 270 1485 4680 1305 lustre_capa\001
+4 0 0 50 -1 16 18 0.0000 4 270 1305 3090 2265 mdt_body\001
+4 0 0 50 -1 16 18 0.0000 4 270 2175 1050 675 MDS_GETATTR\001
+4 0 0 50 -1 16 18 0.0000 4 270 1245 4620 2250 MDT_MD\001
+4 0 0 50 -1 16 18 0.0000 4 210 570 6075 2250 ACL\001
+4 0 0 50 -1 16 18 0.0000 4 270 1485 6825 2250 lustre_capa\001
+4 0 0 50 -1 16 18 0.0000 4 270 1485 8475 2250 lustre_capa\001
--- /dev/null
+#FIG 3.2 Produced by xfig version 3.2.5b
+Landscape
+Center
+Inches
+Letter
+100.00
+Single
+-2
+1200 2
+2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 9
+ 3000 900 3000 975 3000 1050 3000 1125 3000 1200 3000 1275
+ 3000 1350 3000 1425 3000 1500
+2 1 0 2 0 7 50 -1 -1 0.000 0 0 7 0 0 5
+ 1815 1875 4575 1875 4575 2475 1215 2475 1215 2025
+2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 9
+ 3015 1875 3015 1950 3015 2025 3015 2100 3015 2175 3015 2250
+ 3015 2325 3015 2400 3015 2475
+2 1 0 2 0 7 50 -1 -1 0.000 0 0 7 0 0 5
+ 2025 900 4575 900 4575 1500 1200 1500 1200 1050
+4 0 0 50 -1 16 18 0.0000 4 255 930 1050 1005 request\001
+4 0 0 50 -1 16 18 0.0000 4 270 1530 1350 1305 ptlrpc_body\001
+4 0 0 50 -1 16 18 0.0000 4 270 615 1140 1950 reply\001
+4 0 0 50 -1 16 18 0.0000 4 270 1530 1365 2250 ptlrpc_body\001
+4 0 0 50 -1 16 18 0.0000 4 270 1305 3105 1320 mdt_body\001
+4 0 0 50 -1 16 18 0.0000 4 270 1305 3090 2265 mdt_body\001
+4 0 0 50 -1 16 18 0.0000 4 270 2565 1050 675 MDS_GETSTATUS\001
--- /dev/null
+#FIG 3.2 Produced by xfig version 3.2.5b
+Landscape
+Center
+Inches
+Letter
+100.00
+Single
+-2
+1200 2
+2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 9
+ 3000 900 3000 975 3000 1050 3000 1125 3000 1200 3000 1275
+ 3000 1350 3000 1425 3000 1500
+2 1 0 2 0 7 50 -1 -1 0.000 0 0 7 0 0 5
+ 1815 1875 5475 1875 5475 2475 1215 2475 1215 2025
+2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 9
+ 3015 1875 3015 1950 3015 2025 3015 2100 3015 2175 3015 2250
+ 3015 2325 3015 2400 3015 2475
+2 1 0 2 0 7 50 -1 -1 0.000 0 0 7 0 0 5
+ 2025 835 5475 825 5475 1425 1200 1435 1200 985
+4 0 0 50 -1 16 18 0.0000 4 255 930 1050 1005 request\001
+4 0 0 50 -1 16 18 0.0000 4 270 1530 1350 1305 ptlrpc_body\001
+4 0 0 50 -1 16 18 0.0000 4 270 615 1140 1950 reply\001
+4 0 0 50 -1 16 18 0.0000 4 270 1530 1365 2250 ptlrpc_body\001
+4 0 0 50 -1 16 18 0.0000 4 270 2940 1050 675 MGS_CONFIG_READ\001
+4 0 0 50 -1 16 18 0.0000 4 270 2295 3105 1320 mgs_config_body\001
+4 0 0 50 -1 16 18 0.0000 4 270 2295 3075 2250 mgs_config_body\001
--- /dev/null
+#FIG 3.2 Produced by xfig version 3.2.5b
+Landscape
+Center
+Inches
+Letter
+100.00
+Single
+-2
+1200 2
+2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 9
+ 3000 900 3000 975 3000 1050 3000 1125 3000 1200 3000 1275
+ 3000 1350 3000 1425 3000 1500
+2 1 0 2 0 7 50 -1 -1 0.000 0 0 7 0 0 5
+ 1815 1875 5745 1860 5730 2460 1215 2475 1215 2025
+2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 9
+ 3015 1875 3015 1950 3015 2025 3015 2100 3015 2175 3015 2250
+ 3015 2325 3015 2400 3015 2475
+2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 9
+ 4350 825 4350 900 4350 975 4350 1050 4350 1125 4350 1200
+ 4350 1275 4350 1350 4350 1425
+2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 9
+ 5775 825 5775 900 5775 975 5775 1050 5775 1125 5775 1200
+ 5775 1275 5775 1350 5775 1425
+2 1 0 2 0 7 50 -1 -1 0.000 0 0 7 0 0 5
+ 2025 835 10350 840 10335 1425 1200 1435 1200 985
+2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 9
+ 7650 825 7650 900 7650 975 7650 1050 7650 1125 7650 1200
+ 7650 1275 7650 1350 7650 1425
+4 0 0 50 -1 16 18 0.0000 4 255 930 1050 1005 request\001
+4 0 0 50 -1 16 18 0.0000 4 270 1530 1350 1305 ptlrpc_body\001
+4 0 0 50 -1 16 18 0.0000 4 270 615 1140 1950 reply\001
+4 0 0 50 -1 16 18 0.0000 4 270 1530 1365 2250 ptlrpc_body\001
+4 0 0 50 -1 16 18 0.0000 4 270 1185 3105 1320 obd_uuid\001
+4 0 0 50 -1 16 18 0.0000 4 270 1185 4500 1275 obd_uuid\001
+4 0 0 50 -1 16 18 0.0000 4 270 2415 7770 1275 obd_connect_data\001
+4 0 0 50 -1 16 18 0.0000 4 270 1680 5895 1275 lustre_handle\001
+4 0 0 50 -1 16 18 0.0000 4 270 2415 3150 2250 obd_connect_data\001
+4 0 0 50 -1 16 18 0.0000 4 270 2265 1050 675 MGS_CONNECT\001
--- /dev/null
+#FIG 3.2 Produced by xfig version 3.2.5b
+Landscape
+Center
+Inches
+Letter
+100.00
+Single
+-2
+1200 2
+2 1 0 2 0 7 50 -1 -1 0.000 0 0 7 0 0 5
+ 1815 976 5745 961 5730 1561 1215 1576 1215 1126
+2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 9
+ 3000 975 3000 1050 3000 1125 3000 1200 3000 1275 3000 1350
+ 3000 1425 3000 1500 3000 1575
+4 0 0 50 -1 16 18 0.0000 4 270 2265 1050 675 MGS_CONNECT\001
+4 0 0 50 -1 16 18 0.0000 4 270 615 1200 1050 reply\001
+4 0 0 50 -1 16 18 0.0000 4 270 1530 1350 1425 ptlrpc_body\001
+4 0 0 50 -1 16 18 0.0000 4 270 2415 3150 1350 obd_connect_data\001
--- /dev/null
+#FIG 3.2 Produced by xfig version 3.2.5b
+Landscape
+Center
+Inches
+Letter
+100.00
+Single
+-2
+1200 2
+2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 9
+ 3000 900 3000 975 3000 1050 3000 1125 3000 1200 3000 1275
+ 3000 1350 3000 1425 3000 1500
+2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 9
+ 4350 825 4350 900 4350 975 4350 1050 4350 1125 4350 1200
+ 4350 1275 4350 1350 4350 1425
+2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 9
+ 5775 825 5775 900 5775 975 5775 1050 5775 1125 5775 1200
+ 5775 1275 5775 1350 5775 1425
+2 1 0 2 0 7 50 -1 -1 0.000 0 0 7 0 0 5
+ 2025 835 10350 840 10335 1425 1200 1435 1200 985
+2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 9
+ 7650 825 7650 900 7650 975 7650 1050 7650 1125 7650 1200
+ 7650 1275 7650 1350 7650 1425
+4 0 0 50 -1 16 18 0.0000 4 255 930 1050 1005 request\001
+4 0 0 50 -1 16 18 0.0000 4 270 1530 1350 1305 ptlrpc_body\001
+4 0 0 50 -1 16 18 0.0000 4 270 1185 3105 1320 obd_uuid\001
+4 0 0 50 -1 16 18 0.0000 4 270 1185 4500 1275 obd_uuid\001
+4 0 0 50 -1 16 18 0.0000 4 270 2415 7770 1275 obd_connect_data\001
+4 0 0 50 -1 16 18 0.0000 4 270 1680 5895 1275 lustre_handle\001
+4 0 0 50 -1 16 18 0.0000 4 270 2265 1050 675 MGS_CONNECT\001
--- /dev/null
+#FIG 3.2 Produced by xfig version 3.2.5b
+Landscape
+Center
+Inches
+Letter
+100.00
+Single
+-2
+1200 2
+2 1 0 2 0 7 50 -1 -1 0.000 0 0 7 0 0 5
+ 1815 1875 3120 1860 3120 2475 1215 2475 1215 2025
+2 1 0 2 0 7 50 -1 -1 0.000 0 0 7 0 0 5
+ 2025 975 3120 960 3120 1575 1215 1575 1215 1125
+4 0 0 50 -1 16 18 0.0000 4 255 930 1050 1005 request\001
+4 0 0 50 -1 16 18 0.0000 4 270 1530 1350 1305 ptlrpc_body\001
+4 0 0 50 -1 16 18 0.0000 4 270 615 1140 1950 reply\001
+4 0 0 50 -1 16 18 0.0000 4 270 1530 1365 2250 ptlrpc_body\001
+4 0 0 50 -1 16 18 0.0000 4 270 2760 1050 675 MGS_DISCONNECT\001
--- /dev/null
+#FIG 3.2 Produced by xfig version 3.2.5b
+Landscape
+Center
+Inches
+Letter
+100.00
+Single
+-2
+1200 2
+2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 9
+ 3000 900 3000 975 3000 1050 3000 1125 3000 1200 3000 1275
+ 3000 1350 3000 1425 3000 1500
+2 1 0 2 0 7 50 -1 -1 0.000 0 0 7 0 0 5
+ 1815 1875 5745 1860 5730 2460 1215 2475 1215 2025
+2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 9
+ 3015 1875 3015 1950 3015 2025 3015 2100 3015 2175 3015 2250
+ 3015 2325 3015 2400 3015 2475
+2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 9
+ 4350 825 4350 900 4350 975 4350 1050 4350 1125 4350 1200
+ 4350 1275 4350 1350 4350 1425
+2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 9
+ 5775 825 5775 900 5775 975 5775 1050 5775 1125 5775 1200
+ 5775 1275 5775 1350 5775 1425
+2 1 0 2 0 7 50 -1 -1 0.000 0 0 7 0 0 5
+ 2025 835 10350 840 10335 1425 1200 1435 1200 985
+2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 9
+ 7650 825 7650 900 7650 975 7650 1050 7650 1125 7650 1200
+ 7650 1275 7650 1350 7650 1425
+4 0 0 50 -1 16 18 0.0000 4 255 930 1050 1005 request\001
+4 0 0 50 -1 16 18 0.0000 4 270 1530 1350 1305 ptlrpc_body\001
+4 0 0 50 -1 16 18 0.0000 4 270 615 1140 1950 reply\001
+4 0 0 50 -1 16 18 0.0000 4 270 1530 1365 2250 ptlrpc_body\001
+4 0 0 50 -1 16 18 0.0000 4 270 2205 1050 675 OST_CONNECT\001
+4 0 0 50 -1 16 18 0.0000 4 270 1185 3105 1320 obd_uuid\001
+4 0 0 50 -1 16 18 0.0000 4 270 1185 4500 1275 obd_uuid\001
+4 0 0 50 -1 16 18 0.0000 4 270 2415 7770 1275 obd_connect_data\001
+4 0 0 50 -1 16 18 0.0000 4 270 1680 5895 1275 lustre_handle\001
+4 0 0 50 -1 16 18 0.0000 4 270 2415 3150 2250 obd_connect_data\001
--- /dev/null
+#FIG 3.2 Produced by xfig version 3.2.5b
+Landscape
+Center
+Inches
+Letter
+100.00
+Single
+-2
+1200 2
+2 1 0 2 0 7 50 -1 -1 0.000 0 0 7 0 0 5
+ 1815 976 5745 961 5730 1561 1215 1576 1215 1126
+2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 9
+ 3000 975 3000 1050 3000 1125 3000 1200 3000 1275 3000 1350
+ 3000 1425 3000 1500 3000 1575
+4 0 0 50 -1 16 18 0.0000 4 270 615 1200 1050 reply\001
+4 0 0 50 -1 16 18 0.0000 4 270 1530 1350 1425 ptlrpc_body\001
+4 0 0 50 -1 16 18 0.0000 4 270 2415 3150 1350 obd_connect_data\001
+4 0 0 50 -1 16 18 0.0000 4 270 2205 1050 675 OST_CONNECT\001
--- /dev/null
+#FIG 3.2 Produced by xfig version 3.2.5b
+Landscape
+Center
+Inches
+Letter
+100.00
+Single
+-2
+1200 2
+2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 9
+ 3000 900 3000 975 3000 1050 3000 1125 3000 1200 3000 1275
+ 3000 1350 3000 1425 3000 1500
+2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 9
+ 4350 825 4350 900 4350 975 4350 1050 4350 1125 4350 1200
+ 4350 1275 4350 1350 4350 1425
+2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 9
+ 5775 825 5775 900 5775 975 5775 1050 5775 1125 5775 1200
+ 5775 1275 5775 1350 5775 1425
+2 1 0 2 0 7 50 -1 -1 0.000 0 0 7 0 0 5
+ 2025 835 10350 840 10335 1425 1200 1435 1200 985
+2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 9
+ 7650 825 7650 900 7650 975 7650 1050 7650 1125 7650 1200
+ 7650 1275 7650 1350 7650 1425
+4 0 0 50 -1 16 18 0.0000 4 255 930 1050 1005 request\001
+4 0 0 50 -1 16 18 0.0000 4 270 1530 1350 1305 ptlrpc_body\001
+4 0 0 50 -1 16 18 0.0000 4 270 1185 3105 1320 obd_uuid\001
+4 0 0 50 -1 16 18 0.0000 4 270 1185 4500 1275 obd_uuid\001
+4 0 0 50 -1 16 18 0.0000 4 270 2415 7770 1275 obd_connect_data\001
+4 0 0 50 -1 16 18 0.0000 4 270 1680 5895 1275 lustre_handle\001
+4 0 0 50 -1 16 18 0.0000 4 270 2205 1050 675 OST_CONNECT\001
--- /dev/null
+#FIG 3.2 Produced by xfig version 3.2.5b
+Landscape
+Center
+Inches
+Letter
+100.00
+Single
+-2
+1200 2
+2 1 0 2 0 7 50 -1 -1 0.000 0 0 7 0 0 5
+ 1815 1875 3120 1860 3120 2475 1215 2475 1215 2025
+2 1 0 2 0 7 50 -1 -1 0.000 0 0 7 0 0 5
+ 2025 975 3120 960 3120 1575 1215 1575 1215 1125
+4 0 0 50 -1 16 18 0.0000 4 255 930 1050 1005 request\001
+4 0 0 50 -1 16 18 0.0000 4 270 1530 1350 1305 ptlrpc_body\001
+4 0 0 50 -1 16 18 0.0000 4 270 615 1140 1950 reply\001
+4 0 0 50 -1 16 18 0.0000 4 270 1530 1365 2250 ptlrpc_body\001
+4 0 0 50 -1 16 18 0.0000 4 270 2700 1050 675 OST_DISCONNECT\001
Single
-2
1200 2
-6 1125 1875 4650 2700
2 1 0 2 0 7 50 -1 -1 0.000 0 0 7 0 0 5
- 1800 2025 4575 2025 4575 2625 1200 2625 1200 2175
-2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 9
- 3000 2025 3000 2100 3000 2175 3000 2250 3000 2325 3000 2400
- 3000 2475 3000 2550 3000 2625
-4 0 0 50 -1 16 18 0.0000 4 270 615 1125 2100 reply\001
-4 0 0 50 -1 16 18 0.0000 4 270 1530 1350 2400 ptlrpc_body\001
-4 0 0 50 -1 16 18 0.0000 4 270 1365 3075 2400 obd_statfs\001
--6
+ 2025 900 3120 885 3120 1500 1215 1500 1215 1050
2 1 0 2 0 7 50 -1 -1 0.000 0 0 7 0 0 5
- 2025 975 3150 975 3150 1575 1140 1575 1140 1125
-4 0 0 50 -1 16 18 0.0000 4 270 1530 1290 1350 ptlrpc_body\001
-4 0 0 50 -1 16 18 0.0000 4 255 930 1065 1050 request\001
+ 1800 1800 4950 1800 4950 2400 1200 2400 1200 1950
+2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 9
+ 3075 1800 3075 1875 3075 1950 3075 2025 3075 2100 3075 2175
+ 3075 2250 3075 2325 3075 2400
+4 0 0 50 -1 16 18 0.0000 4 255 930 1050 1005 request\001
+4 0 0 50 -1 16 18 0.0000 4 270 1530 1350 1305 ptlrpc_body\001
+4 0 0 50 -1 16 18 0.0000 4 270 615 1140 1950 reply\001
+4 0 0 50 -1 16 18 0.0000 4 270 1530 1365 2250 ptlrpc_body\001
4 0 0 50 -1 16 18 0.0000 4 270 1890 1050 675 OST_STATFS\001
+4 0 0 50 -1 16 18 0.0000 4 270 1365 3225 2250 obd_statfs\001
--- /dev/null
+Grant
+~~~~~
+[[grant]]
+
+A grant value is part of a client's state for a given target. It
+places an upper bound on the amount of dirty cached data, destined
+for that target, that the client will allow. The value is established
+by agreement between the server and the client, and represents a
+guarantee by the server that the target storage has enough free space
+for at least the amount of dirty data granted. The client can ask for
+additional grant with each write RPC, which the server may provide
+depending on how much available (ungranted and unallocated) space the
+target has.
+
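The accounting this implies can be illustrated with a minimal C sketch. This is a hypothetical model, not Lustre source: 'client_grant', 'grant_can_cache', and 'grant_write_done' are illustrative names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-target grant accounting on the client. */
struct client_grant {
	uint64_t cg_avail;	/* grant bytes not yet consumed */
	uint64_t cg_dirty;	/* dirty cached bytes outstanding */
};

/* Cache 'bytes' more dirty data only if it fits under the grant;
 * otherwise the client must write synchronously instead. */
static bool grant_can_cache(struct client_grant *cg, uint64_t bytes)
{
	if (cg->cg_avail < bytes)
		return false;
	cg->cg_avail -= bytes;
	cg->cg_dirty += bytes;
	return true;
}

/* On write RPC completion: the dirty data has been flushed, and the
 * server may have returned additional grant. */
static void grant_write_done(struct client_grant *cg,
			     uint64_t flushed, uint64_t new_grant)
{
	cg->cg_dirty -= flushed;
	cg->cg_avail += new_grant;
}
```

In this model a write that does not fit under the remaining grant cannot be cached, which is exactly the guarantee the server relies on when reserving space.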
The 'obd_import' structure holds a client's connection state for each
target it is connected to.
+[source,c]
----
struct obd_import {
enum lustre_imp_state imp_state;
processes exchange along with the rules governing the behavior of
those processes.
-include::connection.txt[]
+include::transno.txt[]
-include::file_id.txt[]
+include::connection.txt[]
include::path_lookup.txt[]
+include::lov_index.txt[]
+
+include::grant.txt[]
+
include::ldlm.txt[]
include::llog.txt[]
Details about the operation of the LDLM will be introduced as they
become relevant to the discussion of various file system operations.
-#################################################################
-Fixme: Move the ldlm sturucture includes to where they first
-get introduced. In the sections that have ldlm_enqueue operations.
-#################################################################
-
-include::layout_intent.txt[]
-
-include::ldlm_resource_id.txt[]
-
-include::ldlm_intent.txt[]
-
-include::ldlm_resource_desc.txt[]
-
-include::ldlm_lock_desc.txt[]
-
-include::ldlm_request.txt[]
-
-include::ldlm_reply.txt[]
-
-include::ost_lvb.txt[]
-
-include::lov_mds_md.txt[]
-
-####
-This one can stay here:
-####
-
include::early_lock_cancellation.txt[]
'ptlrpc_body'::
RPC descriptor <<struct-ptlrpc-body>>.
-'ldlm_request'::
-Description of the lock being requested. Which resource is the target,
-what lock is current, and what lock desired. <<struct-ldlm-request>>
+include::struct_ldlm_request.txt[]
-'ldlm_intent'::
-Description of the intent being included with the lock request.
-<<struct-ldlm-intent>>
+include::struct_ldlm_intent.txt[]
-'layout_intent'::
-Description of the layout information that is the subject of a layout
-intent.
+include::struct_layout_intent.txt[]
'mdt_body'::
In a request, an indication (in the 'mbo_valid' field) of what
'name'::
A text field supplying the name of the desired resource.
-'ldlm_reply'::
-Resembling the 'ldlm_request', but in this case indicating what the
-LDLM actually granted as well as relevant policy data. <<struct-ldlm-reply>>
-
-'mdt_body'::
-Metadata about a given resource. <<struct-mdt-body>>
+include::struct_ldlm_reply.txt[]
'ACL data'::
Access Control List data associated with a resource.
to quotas (and not yet incorporated into this document). LDLM_ENQUEUE
reply RPCs may include a zero length instance of an LVB.
-'ost_lvb'::
-An LVB to communicate attribute data for an extent associated with a
-resource on a lock. It is returned from an OST to a client requesting
-an extent lock. <<struct-ost-lvb>>
-
-'lov_mds_md'::
-Layout data associated with a resource. It is
-returned from an MDT to a client requesting a lock a lock with a
-layout intent. In an intent request (as opposed to a reply and as yet
-unimplemanted) it will modify the layout. It will not be included
-(zero length) in requests in current releases. <<struct-lov-mds-md>>
+include::struct_ost_lvb.txt[]
+
+include::struct_lov_mds_md.txt[]
+
+++ /dev/null
-LDLM Lock Descriptor
-^^^^^^^^^^^^^^^^^^^^
-[[struct-ldlm-lock-desc]]
-
-The lock descriptor conveys the specific details about a particular
-lock being requested or granted. It appears in
-<<struct-ldlm-request>>.
-
-----
-struct ldlm_lock_desc {
- struct ldlm_resource_desc l_resource;
- enum ldlm_mode l_req_mode;
- enum ldlm_mode l_granted_mode;
- union ldlm_wire_policy_data l_policy_data;
-};
-----
-
-The 'l_resource' field identifies the resource upon which the lock is
-being requested or granted. See the description of
-<<struct-ldlm-resource-desc>>.
-
-The 'l_req_mode' and 'l_granted_mode' fields give the kind of lock
-being requested and the kind of lock that has been granted. The field
-values are:
-
-----
-enum ldlm_mode {
- LCK_EX = 1, /* exclusive */
- LCK_PW = 2, /* privileged write */
- LCK_PR = 4, /* privileged read */
- LCK_CW = 8, /* concurrent write */
- LCK_CR = 16, /* concurrent read */
- LCK_NL = 32, /* */
- LCK_GROUP = 64, /* */
- LCK_COS = 128, /* */
-};
-----
-[[struct-ldlm-mode]]
-
-Despite the fact that the lock modes are not overlapping, these lock
-modes are exclusive. In addition the mode value 0 is the MINMODE,
-i.e. no lock at all.
-
-In a request 'l_req_mode' is the value actually being requested and
-'l_granted_mode' is the value that currently is in place on for the
-requester. In a reply the 'l_req_mode' may be modified if more or
-fewer privileges were granted than requested, and the
-'l_granted_mode' is what has, in fact, been granted.
-
-The 'l_policy_data' field gives the kind of resource being
-requested/granted. It is a union of these struct definitions:
-[[struct-ldlm-wire-policy-data]]
-
-----
-union ldlm_wire_policy_data {
- struct ldlm_extent l_extent;
- struct ldlm_flock_wire l_flock;
- struct ldlm_inodebits l_inodebits;
-};
-----
-
-----
-struct ldlm_extent {
- __u64 start;
- __u64 end;
- __u64 gid;
-};
-----
-[[struct-ldlm-extent]]
-----
-struct ldlm_flock_wire {
- __u64 lfw_start;
- __u64 lfw_end;
- __u64 lfw_owner;
- __u32 lfw_padding;
- __u32 lfw_pid;
-};
-----
-[[struct-ldlm-flock-wire]]
-----
-struct ldlm_inodebits {
- __u64 bits;
-};
-----
-[[struct-ldlm-inodebits]]
-
-Thus the lock may be on an 'extent', a contiguous sequence of bytes
-in a regular file; an 'flock wire', whatever to heck that is; or a
-portion of an inode. For a "plain" lock (or one with no type at all)
-the 'l_policy_data' field has zero length.
+++ /dev/null
-LDLM Request
-^^^^^^^^^^^^
-[[struct-ldlm-request]]
-
-The 'ldlm_request' structure describes the details of a lock request.
-
-----
-struct ldlm_request {
- __u32 lock_flags;
- __u32 lock_count;
- struct ldlm_lock_desc lock_desc;
- struct lustre_handle lock_handle[2];
-};
-----
-
-The 'lock_flags' field governs how the lock request is to be
-interpreted. The flags are:
-
-----
-#define LDLM_FL_LOCK_CHANGED 0x0000000000000001ULL // bit 0
-#define LDLM_FL_BLOCK_GRANTED 0x0000000000000002ULL // bit 1
-#define LDLM_FL_BLOCK_CONV 0x0000000000000004ULL // bit 2
-#define LDLM_FL_BLOCK_WAIT 0x0000000000000008ULL // bit 3
-#define LDLM_FL_AST_SENT 0x0000000000000020ULL // bit 5
-#define LDLM_FL_REPLAY 0x0000000000000100ULL // bit 8
-#define LDLM_FL_INTENT_ONLY 0x0000000000000200ULL // bit 9
-#define LDLM_FL_HAS_INTENT 0x0000000000001000ULL // bit 12
-#define LDLM_FL_FLOCK_DEADLOCK 0x0000000000008000ULL // bit 15
-#define LDLM_FL_DISCARD_DATA 0x0000000000010000ULL // bit 16
-#define LDLM_FL_NO_TIMEOUT 0x0000000000020000ULL // bit 17
-#define LDLM_FL_BLOCK_NOWAIT 0x0000000000040000ULL // bit 18
-#define LDLM_FL_TEST_LOCK 0x0000000000080000ULL // bit 19
-#define LDLM_FL_CANCEL_ON_BLOCK 0x0000000000800000ULL // bit 23
-#define LDLM_FL_DENY_ON_CONTENTION 0x0000000040000000ULL // bit 30
-#define LDLM_FL_AST_DISCARD_DATA 0x0000000080000000ULL // bit 31
-#define LDLM_FL_FAIL_LOC 0x0000000100000000ULL // bit 32
-#define LDLM_FL_SKIPPED 0x0000000200000000ULL // bit 33
-#define LDLM_FL_CBPENDING 0x0000000400000000ULL // bit 34
-#define LDLM_FL_WAIT_NOREPROC 0x0000000800000000ULL // bit 35
-#define LDLM_FL_CANCEL 0x0000001000000000ULL // bit 36
-#define LDLM_FL_LOCAL_ONLY 0x0000002000000000ULL // bit 37
-#define LDLM_FL_FAILED 0x0000004000000000ULL // bit 38
-#define LDLM_FL_CANCELING 0x0000008000000000ULL // bit 39
-#define LDLM_FL_LOCAL 0x0000010000000000ULL // bit 40
-#define LDLM_FL_LVB_READY 0x0000020000000000ULL // bit 41
-#define LDLM_FL_KMS_IGNORE 0x0000040000000000ULL // bit 42
-#define LDLM_FL_CP_REQD 0x0000080000000000ULL // bit 43
-#define LDLM_FL_CLEANED 0x0000100000000000ULL // bit 44
-#define LDLM_FL_ATOMIC_CB 0x0000200000000000ULL // bit 45
-#define LDLM_FL_BL_AST 0x0000400000000000ULL // bit 46
-#define LDLM_FL_BL_DONE 0x0000800000000000ULL // bit 47
-#define LDLM_FL_NO_LRU 0x0001000000000000ULL // bit 48
-#define LDLM_FL_FAIL_NOTIFIED 0x0002000000000000ULL // bit 49
-#define LDLM_FL_DESTROYED 0x0004000000000000ULL // bit 50
-#define LDLM_FL_SERVER_LOCK 0x0008000000000000ULL // bit 51
-#define LDLM_FL_RES_LOCKED 0x0010000000000000ULL // bit 52
-#define LDLM_FL_WAITED 0x0020000000000000ULL // bit 53
-#define LDLM_FL_NS_SRV 0x0040000000000000ULL // bit 54
-#define LDLM_FL_EXCL 0x0080000000000000ULL // bit 55
-----
-
-The 'lock_count' field represents how many requests are queued on this
-resource.
-
-The 'lock_desc' field holds the lock descriptor as described in
-<<struct-ldlm-lock-desc>.
-
-The 'lock_handle' array's first element holds the handle for the lock
-manager (see the description of <<struct-lustre-handle>>) involved in
-the operation. There is only one lock manager involved in any given
-RPC. The second handle is set to zero except in the rare case that
-there is also an early lock cancellation. The latter case will be
-discussed elsewhere.
-
+++ /dev/null
-LDLM Resource Descriptor
-^^^^^^^^^^^^^^^^^^^^^^^^
-[[struct-ldlm-resource-desc]]
-
-The resource descriptor identifies the individual resource that is
-being locked, along with what sort of thing it is.
-
-----
-struct ldlm_resource_desc {
- struct ldlm_type lr_type;
- __u32 lr_padding; /* also fix lustre_swab_ldlm_resource_desc */
- struct ldlm_res_id lr_name;
-};
-----
-
-The 'lr_type' field identifies one of the four types of lock that
-might be placed on a resource. A "plain" lock type just locks a
-particular resource. An "extent" lock type only locks a contiguous
-sequence of byte offsets within a regular file. An "flock" lock type
-represents an application layer advisory lock from the 'flock()'
-system call. While Lustre manages "flock" types locks on behalf of the
-application, they do not affect Lustre operation. An "ibits" lock
-type allows fine grained locking of different parts of a single
-resource. A single lock request or cancellation may operate on one or
-more lock bits, or individual lock bits may be granted on the same
-resource separately. See also <<ldlm-wire-policy-data-t>>. A lock
-descriptor may also have no type at all, in which case the 'lr_type'
-field is 0, meaning "no lock".
-
-The 'lr_name' field identifies (by name) the resource(s) that are the
-objects of the locking operation. See the discussion of
-<<struct-ldlm-res-id>>.
-
-----
-enum ldlm_type {
- LDLM_PLAIN = 10,
- LDLM_EXTENT = 11,
- LDLM_FLOCK = 12,
- LDLM_IBITS = 13,
-};
-----
-[[struct-ldlm-type]]
+++ /dev/null
-LDLM Resource ID
-^^^^^^^^^^^^^^^^
-[[struct-ldlm-res-id]]
-
-This structure gets used in <<struct-ldlm-resource-desc>>.
-
-----
-struct ldlm_res_id {
- __u64 name[4];
-};
-----
-
-The 'name' array holds identifiers for the resource in question. Those
-identifiers may be the elements of a 'struct lu_fid' file ID, or they
-may be other uniquely identifying values for the resource. See <<struct-lu-fid>>.
-
transactions such as unlink or ownership changes, configuration records
for targets and clients, or ChangeLog records to track changes to the
filesystem for external consumption, among others.
-
-Each llog file begins with an 'llog_log_hdr' that describes the llog
-file itself, followed by a series of log records that are appended
-sequentially to the file. Each record, including the header itself,
-begins with an 'llog_rec_hdr' and ends with an 'llog_rec_tail'.
-
-LLOG Log ID
-^^^^^^^^^^^
-[[struct-llog-logid]]
-
-----
-struct llog_logid {
- struct ost_id lgl_oi;
- __u32 lgl_ogen;
-};
-----
-
-The 'llog_logid' structure is used to identify a single Lustre log file.
-It holds a <<struct-ost-id>> in 'lgl_oi', which is typically a FID.
-
-LLog Information
-^^^^^^^^^^^^^^^^
-[[struct-llogd-body]]
-----
-struct llogd_body {
- struct llog_logid lgd_logid;
- __u32 lgd_ctxt_idx;
- __u32 lgd_llh_flags;
- __u32 lgd_index;
- __u32 lgd_saved_index;
- __u32 lgd_len;
- __u64 lgd_cur_offset;
-};
-----
-
-The lgd_llh_flags are:
-----
-enum llog_flag {
- LLOG_F_ZAP_WHEN_EMPTY = 0x1,
- LLOG_F_IS_CAT = 0x2,
- LLOG_F_IS_PLAIN = 0x4,
-};
-----
-
-LLog Record Header
-^^^^^^^^^^^^^^^^^^
-[[struct-llog-rec-hdr]]
-----
-struct llog_rec_hdr {
- __u32 lrh_len;
- __u32 lrh_index;
- __u32 lrh_type;
- __u32 lrh_id;
-};
-----
-
-The 'llog_rec_hdr' is at the start of each llog record and describes
-the log record. 'lrh_len' holds the record size in bytes, including
-the header and tail. 'lrh_index' is the record index within the llog
-file and is sequentially increasing for each subsequent record. It
-can be used to determine the offset within the llog file when searching
-for an arbitrary record within the file. 'lrh_type' describes the type
-of data stored in this record.
-
-----
-enum llog_op_type {
- LLOG_PAD_MAGIC = LLOG_OP_MAGIC | 0x00000,
- OST_SZ_REC = LLOG_OP_MAGIC | 0x00f00,
- MDS_UNLINK64_REC = LLOG_OP_MAGIC | 0x90000 | (MDS_REINT << 8) |
- REINT_UNLINK,
- MDS_SETATTR64_REC = LLOG_OP_MAGIC | 0x90000 | (MDS_REINT << 8) |
- REINT_SETATTR,
- OBD_CFG_REC = LLOG_OP_MAGIC | 0x20000,
- LLOG_GEN_REC = LLOG_OP_MAGIC | 0x40000,
- CHANGELOG_REC = LLOG_OP_MAGIC | 0x60000,
- CHANGELOG_USER_REC = LLOG_OP_MAGIC | 0x70000,
- HSM_AGENT_REC = LLOG_OP_MAGIC | 0x80000,
- UPDATE_REC = LLOG_OP_MAGIC | 0xa0000,
- LLOG_HDR_MAGIC = LLOG_OP_MAGIC | 0x45539,
- LLOG_LOGID_MAGIC = LLOG_OP_MAGIC | 0x4553b,
-};
-----
-
-LLog Record Tail
-^^^^^^^^^^^^^^^^
-[[struct-llog-rec-tail]]
-----
-struct llog_rec_tail {
- __u32 lrt_len;
- __u32 lrt_index;
-};
-----
-
-The 'llog_rec_tail' is at the end of each llog record. The 'lrt_len'
-and 'lrt_index' fields must be the same as 'lrh_len' and 'lrh_index'
-in the header. They can be used to verify record integrity, as well
-as allowing processing the llog records in reverse order.
-
-LLog Log Header Information
-^^^^^^^^^^^^^^^^^^^^^^^^^^^
-[[struct-llog-log-hdr]]
-----
-struct llog_log_hdr {
- struct llog_rec_hdr llh_hdr;
- obd_time llh_timestamp;
- __u32 llh_count;
- __u32 llh_bitmap_offset;
- __u32 llh_size;
- __u32 llh_flags;
- __u32 llh_cat_idx;
- /* for a catalog the first plain slot is next to it */
- struct obd_uuid llh_tgtuuid;
- __u32 llh_reserved[LLOG_HEADER_SIZE/sizeof(__u32) - 23];
- __u32 llh_bitmap[LLOG_BITMAP_BYTES/sizeof(__u32)];
- struct llog_rec_tail llh_tail;
-};
-----
-
-The llog records start and end on a record size boundary, typically
-8192 bytes, or as stored in 'llh_size', which allows some degree of
-random access within the llog file, even with variable record sizes.
-It is possible to interpolate the offset of an arbitrary record
-within the file by estimating the byte offset of a particular record
-index using 'llh_count' and the llog file size and aligning it to
-the chunk boundary 'llh_size'. The record index in the 'llog_rec_hdr'
-of the first record in that chunk can be used to further refine the
-estimate of the offset of the desired index down to a single chunk,
-and then sequential access can be used to find the actual record.
-
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[[llog-origin-handle-create-rpc]]
-.LLOG_ORIGIN_HANDLE_CREATE (510)
-[options="header"]
-|====
-| request | reply
-| llog_origin_handle_create_client | llogd_body_only
-|====
+.LLOG_ORIGIN_HANDLE_CREATE Generic Packet Structure
+image::llog-origin-handle-create-generic.png["LLOG_ORIGIN_HANDLE_CREATE Generic Packet Structure",height=100]
+//////////////////////////////////////////////////////////////////////
+The llog-origin-handle-create-generic.png diagram resembles this text
+art:
+
+ LLOG_ORIGIN_HANDLE_CREATE:
+ --request----------------------------
+ | ptlrpc_body | llogd_body | string |
+ -------------------------------------
+ --reply---------------------
+ | ptlrpc_body | llogd_body |
+ ----------------------------
+//////////////////////////////////////////////////////////////////////
+
+'ptlrpc_body':: RPC descriptor. See <<struct-ptlrpc-body>>.
+
+include::struct_llogd_body.txt[]
+
+'string':: The name of the log.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[[llog-origin-handle-next-block-rpc]]
-.LLOG_ORIGIN_HANDLE_NEXT_BLOCK (502)
-[options="header"]
-|====
-| request | reply
-| llogd_body_only | llog_origin_handle_next_block_server
-|====
+.LLOG_ORIGIN_HANDLE_NEXT_BLOCK Generic Packet Structure
+image::llog-origin-handle-next-block-generic.png["LLOG_ORIGIN_HANDLE_NEXT_BLOCK Generic Packet Structure",height=100]
+
+//////////////////////////////////////////////////////////////////////
+The llog-origin-handle-next-block-generic.png diagram resembles this
+text art:
+
+ LLOG_ORIGIN_HANDLE_NEXT_BLOCK:
+ --request-------------------
+ | ptlrpc_body | llogd_body |
+ ----------------------------
+ --reply------------------------------
+ | ptlrpc_body | llogd_body | eadata |
+ -------------------------------------
+//////////////////////////////////////////////////////////////////////
+
+'ptlrpc_body':: RPC descriptor. See <<struct-ptlrpc-body>>.
+
+'llogd_body':: See <<struct-llogd-body>>.
+
+'eadata':: Extended attributes information.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[[llog-origin-handle-read-header-rpc]]
-.LLOG_ORIGIN_HANDLE_READ_HEADER (503)
-[options="header"]
-|====
-| request | reply
-| llogd_body_only | llog_log_hdr_only
-|====
+.LLOG_ORIGIN_HANDLE_READ_HEADER Generic Packet Structure
+image::llog-origin-handle-read-header-generic.png["LLOG_ORIGIN_HANDLE_READ_HEADER Generic Packet Structure",height=100]
+
+//////////////////////////////////////////////////////////////////////
+The llog-origin-handle-read-header-generic.png diagram resembles this
+text art:
+
+ LLOG_ORIGIN_HANDLE_READ_HEADER:
+ --request-------------------
+ | ptlrpc_body | llogd_body |
+ ----------------------------
+ --reply--------------------------
+ | ptlrpc_body | llog_log_header |
+ ---------------------------------
+//////////////////////////////////////////////////////////////////////
+
+'ptlrpc_body':: RPC descriptor. See <<struct-ptlrpc-body>>.
+
+'llogd_body':: See <<struct-llogd-body>>.
+
+include::struct_llog_log_hdr.txt[]
--- /dev/null
+LOV Index
+~~~~~~~~~
+[[lov-index]]
+
+Each target is assigned an LOV index (via the 'mkfs.lustre' command
+line) as it is added to the file system. The value is stored locally
+on the target as well as on the MGS, and serves as a unique identifier
+for the target within the file system.
+
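The LOV index also appears, as four hexadecimal digits, in the name of the target (for example, 'lustre-OST000a' for index 10). That naming convention can be sketched as follows ('target_name' is a hypothetical helper, not a Lustre function):

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: compose an OST target name from the file
 * system name and the target's LOV index (four hex digits). */
static void target_name(char *buf, size_t len, const char *fsname,
			unsigned int lov_index)
{
	snprintf(buf, len, "%s-OST%04x", fsname, lov_index);
}
```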
include::llog_origin_handle_read_header.txt[]
-#################################################################
-Fixme: Move the RPC message sturucture includes to where they
-first gets introduced. In the sections that have the relevant
-operations.
-#################################################################
+include::struct_lustre_msg.txt[]
-include::data_types.txt[]
-
-include::mdt_structs.txt[]
-
-include::mds_reint_structs.txt[]
-
-include::ost_setattr_structs.txt[]
-
-include::statfs_structs.txt[]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[[mds-connect-rpc]]
-.MDS_CONNECT (38)
-[options="header"]
-|====
-| request
-| ptlrpc_body | tgt_uuid | client_uuid | lustre_handle | obd_connect_data |
-|
-| reply
-| ptlrpc_body | obd_connect_data |
-|====
+.MDS_CONNECT Generic Packet Structure
+image::mds-connect-generic.png["MDS_CONNECT Generic Packet Structure",height=100]
-N.B. This is nearly identical to the explanation for OST_CONNECT and
-for MGS_CONNECT. We may want to simplify and/or unify the discussion
-and only call out how this one differs from a generic CONNECT
-operation.
+//////////////////////////////////////////////////////////////////////
+The mds-connect-generic.png diagram resembles this text art:
-When a client initiates a connection to a specific target on an MDS,
-it does so by sending an 'obd_connect_client' message and awaiting the
-reply from the MDS of an 'obd_connect_server' message. From a previous
-interaction with the MGS the client knows the UUID of the target MDT,
-and must fill that value into the 'tgt_uuid' buffer of the request.
+ MDS_CONNECT:
+ --request--------------------------------------------
+ | ptlrpc_body | obd_uuid | obd_uuid | lustre_handle |
+ -----------------------------------------------------
+ | obd_connect_data |
+ ---------------------
+ --reply---------------------------
+ | ptlrpc_body | obd_connect_data |
+ ----------------------------------
+//////////////////////////////////////////////////////////////////////
-The 'client_uuid' buffer holds the randomly-generated 128-bit UUID of
-the client. The 'client_uuid' is unique for each mount of the client.
-Even if the same client mounts the same filesystem multiple times it
-will generate a new 'client_uuid' value for each mount.
+'ptlrpc_body'::
+RPC descriptor. See <<struct-ptlrpc-body>>.
-The 'lustre_handle' buffer contains the cookie for this connection,
-and is zero for a new mount. If the client is reconnecting to a
-server after a loss of communication, the 'lustre_handle' contains
-the connection cookie previously assigned by the server and returned
-to the client in the reply. This allows the server to determine if
-the client connection matches any previous connection from this client.
+'obd_uuid'::
+UUIDs of the target (first) and client (second) entities. See
+<<struct-obd-uuid>>.
+
+'lustre_handle'::
+See <<struct-lustre-handle>>.
+
+'obd_connect_data'::
+See <<struct-obd-connect-data>>.
The 'ocd_connect_flags' buffer is initialized to the set of features
that the client supports for metadata targets. For Lustre 2.7 clients
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[[mds-disconnect-rpc]]
-.MDS_DISCONNECT (39)
-[options="header"]
-|====
-| request | reply
-| empty | empty
-|====
+.MDS_DISCONNECT Generic Packet Structure
+image::mds-disconnect-generic.png["MDS_DISCONNECT Generic Packet Structure",height=100]
+
+//////////////////////////////////////////////////////////////////////
+The mds-disconnect-generic.png diagram resembles this text art:
+
+ MDS_DISCONNECT:
+ --request------
+ | ptlrpc_body |
+ ---------------
+ --reply--------
+ | ptlrpc_body |
+ ---------------
+//////////////////////////////////////////////////////////////////////
The information exchanged in a DISCONNECT message is that normally
conveyed in the RPC descriptor, as described in <<struct-ptlrpc-body>>.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[[mds-getattr-rpc]]
-.MDS_GETATTR (33)
-[options="header"]
-|====
-| request | reply
-| mdt_body_capa | mds_getattr_server
-|====
+.MDS_GETATTR Generic Packet Structure
+image::mds-getattr-generic.png["MDS_GETATTR Generic Packet Structure",height=100]
+//////////////////////////////////////////////////////////////////////
+The mds-getattr-generic.png diagram resembles this text art:
+
+ MDS_GETATTR:
+ --request-------------------------------
+ | ptlrpc_body | mdt_body | lustre_capa |
+ ----------------------------------------
+ --reply------------------------------------------------
+ | ptlrpc_body | mdt_body | MDS_MD | ACL | lustre_capa |
+ -------------------------------------------------------
+ | lustre_capa |
+ ---------------
+//////////////////////////////////////////////////////////////////////
+
+'ptlrpc_body'::
+RPC descriptor. See <<struct-ptlrpc-body>>.
+
+include::struct_mdt_body.txt[]
+
+'MDS_MD'::
+Layout data associated with the resource. See <<struct-lov-mds-md>>.
+
+'ACL'::
+Access Control List data associated with the resource.
+
+'lustre_capa'::
+So-called "capabilities" structure. This is deprecated in recent
+versions of Lustre, and commonly appears in the packet header as a
+zero-length buffer.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[[mds-getstatus-rpc]]
-The MDS_GETSTATUS request is used to determine the filesystem ROOT FID
-during initial mount. It is always sent to MDT0000 initially.
+.MDS_GETSTATUS Generic Packet Structure
+image::mds-getstatus-generic.png["MDS_GETSTATUS Generic Packet Structure",height=100]
-.MDS_GETSTATUS (40)
-[options="header"]
-|====
-| request | reply
-| mdt_body | mdt_body
-|====
+//////////////////////////////////////////////////////////////////////
+The mds-getstatus-generic.png diagram resembles this text art:
-The request message 'mdt_body' is normally empty.
+ MDS_GETSTATUS:
+ --request-----------------
+ | ptlrpc_body | mdt_body |
+ --------------------------
+ --reply-------------------
+ | ptlrpc_body | mdt_body |
+ --------------------------
+//////////////////////////////////////////////////////////////////////
-The reply message 'mdt_body' contains only the FID of the filesystem
-ROOT in 'mbo_fid1'. The client can then use the returned FID to
-fetch inode attributes for the root inode of the local mountpoint.
+'ptlrpc_body'::
+RPC descriptor. See <<struct-ptlrpc-body>>.
+
+'mdt_body'::
+See <<struct-mdt-body>>.
+
+In the request message, the 'mdt_body' is normally empty.
+
+In the reply message, the 'mdt_body' contains only the FID of the
+filesystem ROOT in 'mbo_fid1'. The client can then use the returned
+FID to fetch inode attributes for the root inode of the local
+mountpoint.
} mds_reint_t, mdt_reint_t;
----
-REINT_SETATTR
-^^^^^^^^^^^^^
+include::struct_mdt_rec_reint.txt[]
+
+REINT_SETATTR RPC
+^^^^^^^^^^^^^^^^^
[[mds-reint-setattr-rpc]]
An RPC that implements the 'setattr' sub-command of the MDS_REINT.
-------------------------------------------------------
//////////////////////////////////////////////////////////////////////
-The second buffer ('mdt_rec_setattr' in the above) is one of the
-variants specific to the particular REINT as given by the
-'mdt_reint_t' opcode. Each such variant has the same number and size
-of fields, but how the fields are interpreted varies slightly between
-variiants. For all the variant structures refer to
-<<mds-reint-structs>>.
-
-REINT_SETXATTR
-^^^^^^^^^^^^^^
+REINT_SETXATTR RPC
+^^^^^^^^^^^^^^^^^^
[[mds-reint-setxattr-rpc]]
An RPC that implements the 'setxattr' sub-command of the MDS_REINT.
'ptlrpc_body'::
RPC descriptor.
-'mdt_rec_setattr'::
-Information pertinent to setting attributes on the MDT.
+include::struct_mdt_rec_setattr.txt[]
-'mdt_rec_setxattr'::
-Information pertinent to setting extended attributes on the MDT.
+include::struct_mdt_rec_setxattr.txt[]
'lustre_capa'::
So called "capabilities" structure. This is deprecated in recent
+++ /dev/null
-MDS_REINT Structures
-^^^^^^^^^^^^^^^^^^^^
-[[mds-reint-structs]]
-
-MDS_REINT RPCs are those that get issued from a client to request an
-operation on an MDT that will modify the MDT, so the MDT will initiate
-a transaction to carry out the operation. The transactional nature of
-the operation allows it to be replayed in the event of failure. Note:
-The only other MDT-modifying operation is MDS_CLOSE.
-
-include::mdt_rec_reint.txt[]
-
-include::mdt_rec_setattr.txt[]
-
-include::mdt_rec_setxattr.txt[]
~~~~~~~~~~~~~~~~~~
[[mds-statfs-rpc]]
-MDS_STATFS is an RPC that queries data about the underlying file
-system for a given MDT.
-////
-It is generated in response to an explicit call for 'statfs'
-information from the VFS via the 'statfs(2)' function.
-////
-
-The MDS_STATFS request message is a so-called "empty" message in that
-it only has a buffer for the 'ptlrpc_body' with the 'pb_opc' value
-MDS_STATFS (41).
-
-The reply message conveys 'statfs' data when it succeeds, and an error
-code if it doesn't.
+MDS_STATFS ('pb_opc' = 41) is an RPC that queries data about the
+underlying file system for a given MDT. Its form and use are nearly
+identical to the OST_STATFS RPC. Refer to <<ost-statfs-rpc>> for
+details. The only differences in MDS_STATFS are that it has a distinct
+'pb_opc' value and it carries information about an MDT (instead of an
+OST). An MDT will send regular OST_STATFS RPCs to each OST in order
+to keep its information about free space and utilization updated.
+That allows the MDS to make better file allocation decisions.
.MDS_STATFS Generic Packet Structure
+image::mds-statfs-generic.png["MDS_STATFS Generic Packet Structure",height=100]
+
+//////////////////////////////////////////////////////////////////////
+The mds-statfs-generic.png diagram resembles this text art:
-:frame: none
-:grid: none
-[width="50%", cols="2a"]
-|====
-| request
-[cols="1"]
-!===================
-! <<struct-ptlrpc-body,ptlrpc_body>> !
-!===================
-| reply
-[cols="2"]
-!===================
-! <<struct-ptlrpc-body,ptlrpc_body>> ! <<struct-obd_statfs,obd_statfs>> !
-!===================
-|====
+ MDS_STATFS:
+ --request------
+ | ptlrpc_body |
+ ---------------
+ --reply---------------------
+ | ptlrpc_body | obd_statfs |
+ ----------------------------
+//////////////////////////////////////////////////////////////////////
-'ptlrpc_body':: RPC descriptor. Only the 'pb_opc' value (MDS_STATFS =
-41) is directly relevant to the MDS_STATFS request message. The rest
-of the 'ptlrpc_body' fields handle generic information about the
-RPC, as discussed in <<struct-ptlrpc-body>>, including generic error
-conditions. In a normal reply ('pb_type' = PTL_RPC_MSG_REPLY) the
-'pb_status' field is 0. The one error that can be returned in
-'pb_status' that is speficially from OST_STATFS' handling is -ENOMEM,
-which occurs if there is not enough memory to allocate a temporary
-buffer for the 'statfs' data.
+'ptlrpc_body':: RPC descriptor. See <<struct-ptlrpc-body>>.
-'obd_statfs'::
-File system wide statistics corresponding to 'struct statfs' as well
-as Lustre-specific information. See <<struct-obd-statfs>> for a
-detailed discussion.
+'obd_statfs':: Statfs information about the target. See
+<<struct-obd-statfs>>.
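The block counts in the reply are in units of 'os_bsize' bytes. A minimal sketch of computing usable space from an 'obd_statfs'-style reply; the struct here is an illustrative subset, not the full wire structure.

```c
#include <stdint.h>

/* Illustrative subset of 'struct obd_statfs'; the real definition in
 * the Lustre headers has many more fields. */
struct obd_statfs_lite {
        uint64_t os_blocks;  /* total blocks on the target */
        uint64_t os_bfree;   /* free blocks */
        uint64_t os_bavail;  /* blocks available to non-root users */
        uint32_t os_bsize;   /* block size in bytes */
};

/* Bytes a client could still write to this target. */
static uint64_t target_bytes_avail(const struct obd_statfs_lite *st)
{
        return st->os_bavail * (uint64_t)st->os_bsize;
}
```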
+++ /dev/null
-MDT Structures
-~~~~~~~~~~~~~~
-[[mdt-structs]]
-
-These structures convey information to or from an MDT concerning the
-metadata about a resource.
-
-include::mdt_body.txt[]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[[mgs-config-read-rpc]]
-.MGS_CONFIG_READ (256)
-[options="header"]
-|====
-| request | reply
-| mgs_config_read_client | mgs_config_read_server
-|====
+.MGS_CONFIG_READ Generic Packet Structure
+image::mgs-config-read-generic.png["MGS_CONFIG_READ Generic Packet Structure",height=100]
+
+//////////////////////////////////////////////////////////////////////
+The mgs-config-read-generic.png diagram resembles this text art:
+
+ MGS_CONFIG_READ:
+ --request------------------------
+ | ptlrpc_body | mgs_config_body |
+ ---------------------------------
+ --reply--------------------------
+ | ptlrpc_body | mgs_config_body |
+ ---------------------------------
+//////////////////////////////////////////////////////////////////////
+
+'ptlrpc_body':: RPC descriptor. See <<struct-ptlrpc-body>>.
+
+include::struct_mgs_config_body.txt[]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[[mgs-connect-rpc]]
-.MGS_CONNECT (250)
-[options="header"]
-|====
-| request | reply
-| obd_connect_client | obd_connect_server
-|====
-
-When a client initiates a connection to the MGS,
-it does so by sending an 'obd_connect_client' message and awaiting the
-reply from the MGS of an 'obd_connect_server' message. This is the
-first operation carried out by a client upon the issue of a 'mount'
-command, and the target UUID is provided on the command line.
+.MGS_CONNECT Generic Packet Structure
+image::mgs-connect-generic.png["MGS_CONNECT Generic Packet Structure",height=100]
+
+//////////////////////////////////////////////////////////////////////
+The mgs-connect-generic.png diagram resembles this text art:
+
+ MGS_CONNECT:
+ --request--------------------------------------------
+ | ptlrpc_body | obd_uuid | obd_uuid | lustre_handle |
+ -----------------------------------------------------
+ | obd_connect_data |
+ ---------------------
+ --reply---------------------------
+ | ptlrpc_body | obd_connect_data |
+ ----------------------------------
+//////////////////////////////////////////////////////////////////////
+
+'ptlrpc_body'::
+RPC descriptor. See <<struct-ptlrpc-body>>.
+
+'obd_uuid'::
+UUIDs of the target (first) and client (second) entities. See
+<<struct-obd-uuid>>.
+
+'lustre_handle'::
+See <<struct-lustre-handle>>.
+
+'obd_connect_data'::
+See <<struct-obd-connect-data>>.
The target UUID is just "MGS", and the client UUID is set to the
32byte string it gets from ... where?
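The two 'obd_uuid' buffers are fixed-size NUL-terminated strings. A minimal sketch of filling one, assuming the 40-byte buffer size used by the Lustre headers (the struct layout here is illustrative):

```c
#include <string.h>

/* 'struct obd_uuid' is a fixed 40-byte NUL-terminated string buffer;
 * sketched here for illustration. */
struct obd_uuid {
        char uuid[40];
};

/* Fill a UUID buffer, always NUL-terminating. For an MGS_CONNECT
 * request the target UUID is simply "MGS". */
static void fill_obd_uuid(struct obd_uuid *out, const char *name)
{
        strncpy(out->uuid, name, sizeof(out->uuid) - 1);
        out->uuid[sizeof(out->uuid) - 1] = '\0';
}
```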
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[[mgs-disconnect-rpc]]
-.MGS_DISCONNECT (251)
-[options="header"]
-|====
-| request | reply
-| empty | empty
-|====
+.MGS_DISCONNECT Generic Packet Structure
+image::mgs-disconnect-generic.png["MGS_DISCONNECT Generic Packet Structure",height=100]
-N.B. The usual 'struct req_format' definition does not exist for
-MGS_DISCONNECT.
+//////////////////////////////////////////////////////////////////////
+The mgs-disconnect-generic.png diagram resembles this text art:
+
+ MGS_DISCONNECT:
+ --request------
+ | ptlrpc_body |
+ ---------------
+ --reply--------
+ | ptlrpc_body |
+ ---------------
+//////////////////////////////////////////////////////////////////////
The information exchanged in a DISCONNECT message is only that normally
conveyed in the RPC descriptor, as described in <<struct-ptlrpc-body>>.
there is an import on the client for each target it connects to. On
the server this connection state is referred to as an 'export', and
again the server has an export for each client that has connected to
-it. There a separate export for each client for each target.
+it.
+
+*1 - Client issues an MGS_CONNECT to the MGS.*
+
+.MGS_CONNECT Request Packet Structure
+image::mgs-connect-request.png["MGS_CONNECT Request Packet Structure",height=50]
+
+//////////////////////////////////////////////////////////////////////
+The mgs-connect-request.png diagram resembles this text art:
+
+ MGS_CONNECT:
+ --request--------------------------------------------
+ | ptlrpc_body | obd_uuid | obd_uuid | lustre_handle |
+ -----------------------------------------------------
+ | obd_connect_data |
+ ---------------------
+//////////////////////////////////////////////////////////////////////
The client begins by carrying out the <<mgs-connect-rpc,MGS_CONNECT>>
-Lustre operation, which establishes the connection (creates the
+Lustre RPC, which establishes the connection (creates the
import and the export) between the client and the MGS. The connect
message from the client includes a 'lustre_handle' to uniquely
identify itself. Subsequent request messages to the MGS will refer
| OBD_CONNECT_PINGLESS
|====
+*2 - The MGS sends an MGS_CONNECT reply to the client.*
+
+.MGS_CONNECT Reply Packet Structure
+image::mgs-connect-reply.png["MGS_CONNECT Reply Packet Structure",height=50]
+
+//////////////////////////////////////////////////////////////////////
+The mgs-connect-reply.png diagram resembles this text art:
+
+ MGS_CONNECT:
+ --reply---------------------------
+ | ptlrpc_body | obd_connect_data |
+ ----------------------------------
+//////////////////////////////////////////////////////////////////////
+
The MGS's reply to the connection request will include the handle that
the server and client will both use to identify this connection in
subsequent messages. This is the 'connection-handle' (as opposed to
Once the connection is established the client gets configuration
information for the file system from the MGS in four stages. First,
the two exchange messages establishing the file system wide security
-policy that will be followed in all subsequent communications. Second,
-the client gets a configuration <<llog>> starting with a bitmap
-instructing it as to which among the
-configuration records on the MGS it needs. Third, reading those
-records from the MGS gives the client the list of all the servers and
-targets it will need to communicate with. Fourth, the client reads
-the cluster wide configuration data (the sort that might be set at the
-client command line with a 'lctl conf_param' command). The following
-paragraphs go into these four stages in more detail.
+policy that will be followed in all subsequent communications.
Each time the client is going to read information from server storage
it needs to first acquire the appropriate lock. Since the client is
sort of modification to the MGS data then the lock exchange might
result in a delay while the client waits. More details about the
behavior of the <<ldlm,Distributed Lock Manager>> are in that
-section. For now, let's assume the locks are granted for each of these
-four operations. The first LLOG_ORIGIN_HANDLE_CREATE operation (the
-client is creating its own local handle not the target's file) asks
+section.
+
+*3 - The Client sends an LDLM_ENQUEUE to the MGS.*
+
+.LDLM_ENQUEUE Request Packet Structure
+image::ldlm-enqueue-request.png["LDLM_ENQUEUE Request Packet Structure",height=50]
+
+//////////////////////////////////////////////////////////////////////
+The ldlm-enqueue-request.png diagram resembles this text art:
+
+ LDLM_ENQUEUE:
+ --request---------------------
+ | ptlrpc_body | ldlm_request |
+ ------------------------------
+//////////////////////////////////////////////////////////////////////
+
+The client seeks a 'concurrent read' lock on the MGS.
+
+*4 - The MGS sends an LDLM_ENQUEUE reply to the client.*
+
+.LDLM_ENQUEUE Reply Packet Structure
+image::ldlm-enqueue-reply.png["LDLM_ENQUEUE Reply Packet Structure",height=50]
+
+//////////////////////////////////////////////////////////////////////
+The ldlm-enqueue-reply.png diagram resembles this text
+art:
+
+ LDLM_ENQUEUE:
+ --reply---------------------
+ | ptlrpc_body | ldlm_reply |
+ ----------------------------
+//////////////////////////////////////////////////////////////////////
+
+The MGS grants that lock.
+
+*5 - The Client sends an LLOG_ORIGIN_HANDLE_CREATE to the MGS.*
+
+
+.LLOG_ORIGIN_HANDLE_CREATE Request Packet Structure
+image::llog-origin-handle-create-request.png["LLOG_ORIGIN_HANDLE_CREATE Request Packet Structure",height=50]
+
+//////////////////////////////////////////////////////////////////////
+The llog-origin-handle-create-request.png diagram resembles this text
+art:
+
+ LLOG_ORIGIN_HANDLE_CREATE:
+ --request----------------------------
+ | ptlrpc_body | llogd_body | string |
+ -------------------------------------
+//////////////////////////////////////////////////////////////////////
+
+The first LLOG_ORIGIN_HANDLE_CREATE operation asks
for the security configuration file ("lfs-sptlrpc"). <<security>>
-discusses security, and for now let's assume there is nothing to be
-done for security. That is, subsequent messages will all use an "empty
-security flavor" and no encryption will take place. In this case the
-MGS's reply ('pb_status' == -2, ENOENT) indicated that there was no
-such file, so nothing actually gets read.
-
-Another LDLM_ENQUEUE and LLOG_ORIGIN_HANDLE_CREATE pair of operations
-identifies the configuration client data ("lfs-client") file, and in
-this case there is data to read. The LLOG_ORIGIN_HANDLE_CREATE reply
-identifies the actual object of interest on the MGS via the
-'llog_logid' field in the 'struct llogd_body'. The MGS stores
+discusses security.
+
+*6 - The MGS sends an LLOG_ORIGIN_HANDLE_CREATE reply to the client.*
+
+
+.LLOG_ORIGIN_HANDLE_CREATE Reply Packet Structure
+image::llog-origin-handle-create-reply.png["LLOG_ORIGIN_HANDLE_CREATE Reply Packet Structure",height=50]
+
+//////////////////////////////////////////////////////////////////////
+The llog-origin-handle-create-reply.png diagram resembles this text
+art:
+
+ LLOG_ORIGIN_HANDLE_CREATE:
+ --reply---------------------
+ | ptlrpc_body | llogd_body |
+ ----------------------------
+//////////////////////////////////////////////////////////////////////
+
+In this case there is nothing to be done for security. That is,
+subsequent messages will all use an "empty security flavor" and no
+encryption will take place. The MGS's reply ('pb_status' == -2,
+-ENOENT) indicates that there is no such file, so nothing actually
+gets read.
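The 'pb_status' convention can be sketched simply: zero means success, and a negative value is the negated POSIX errno. The helper below models 'pb_status' as a plain int for illustration.

```c
#include <errno.h>

/* Convert a reply's 'pb_status' to a positive errno value, or 0 on
 * success. For the "lfs-sptlrpc" case above, pb_status == -2 maps to
 * ENOENT. */
static int reply_errno(int pb_status)
{
        return pb_status < 0 ? -pb_status : 0;
}
```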
+
+Next, the client gets a configuration <<llog>> starting with a bitmap
+instructing it as to which among the configuration records on the MGS
+it needs. Another LDLM_ENQUEUE and LLOG_ORIGIN_HANDLE_CREATE pair of
+operations identifies the configuration client data ("lfs-client")
+file.
+
+*7 - The Client sends an LDLM_ENQUEUE to the MGS.*
+
+The client again seeks a 'concurrent read' lock on the MGS.
+
+*8 - The MGS sends an LDLM_ENQUEUE reply to the client.*
+
+The MGS grants that lock.
+
+*9 - The Client sends an LLOG_ORIGIN_HANDLE_CREATE to the MGS.*
+
+The client asks for the 'lfs-client' log file, which holds a bitmap
+indicating the available configuration records.
+
+*10 - The MGS sends an LLOG_ORIGIN_HANDLE_CREATE reply to the client.*
+
+In this case there is data to read. The LLOG_ORIGIN_HANDLE_CREATE
+reply identifies the actual object of interest on the MGS via the
+'llog_logid' field in the 'struct llogd_body'.
+
+*11 - The Client sends an LLOG_ORIGIN_HANDLE_READ_HEADER to the MGS.*
+
+.LLOG_ORIGIN_HANDLE_READ_HEADER Request Packet Structure
+image::llog-origin-handle-read-header-request.png["LLOG_ORIGIN_HANDLE_READ_HEADER Request Packet Structure",height=50]
+
+//////////////////////////////////////////////////////////////////////
+The llog-origin-handle-read-header-request.png diagram resembles this
+text art:
+
+ LLOG_ORIGIN_HANDLE_READ_HEADER:
+ --request-------------------
+ | ptlrpc_body | llogd_body |
+ ----------------------------
+//////////////////////////////////////////////////////////////////////
+
+The client asks for that log file's header. The MGS stores
configuration data in log records. A header at the beginning of
"lfs-client" uses a bitmap to identify the log records that are
actually needed. The header includes both which records to retrieve
and how large those records are. The LLOG_ORIGIN_HANDLE_READ_HEADER
request uses the 'llog_logid' to identify the desired log file, and the
reply provides the bitmap and size information identifying the
-records that are actually needed. The
-LLOG_ORIGIN_HANDLE_NEXT_BLOCK operations retrieves the data thus
-identified.
+records that are actually needed.
+
+*12 - The MGS sends an LLOG_ORIGIN_HANDLE_READ_HEADER reply to the client.*
+
+.LLOG_ORIGIN_HANDLE_READ_HEADER Reply Packet Structure
+image::llog-origin-handle-read-header-reply.png["LLOG_ORIGIN_HANDLE_READ_HEADER Reply Packet Structure",height=50]
+
+//////////////////////////////////////////////////////////////////////
+The llog-origin-handle-read-header-reply.png diagram resembles this
+text art:
+
+ LLOG_ORIGIN_HANDLE_READ_HEADER:
+ --reply--------------------------
+ | ptlrpc_body | llog_log_header |
+ ---------------------------------
+//////////////////////////////////////////////////////////////////////
+
+The MGS responds with the header, which holds the actual bitmap.
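Scanning the header's bitmap for the record indices the client needs can be sketched as below. This assumes the bitmap is stored as an array of 32-bit words, as in 'struct llog_log_header' in the Lustre headers; the helper itself is illustrative.

```c
#include <stdint.h>
#include <stddef.h>

/* Collect the indices of set bits from an llog-style bitmap stored as
 * 32-bit words. Writes at most 'max' indices into 'out' and returns
 * the number written. */
static size_t bitmap_used_records(const uint32_t *bitmap, size_t nwords,
                                  size_t *out, size_t max)
{
        size_t n = 0;

        for (size_t w = 0; w < nwords; w++)
                for (unsigned b = 0; b < 32 && n < max; b++)
                        if (bitmap[w] & (1u << b))
                                out[n++] = w * 32 + b;
        return n;
}
```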
+
+*13 - The Client sends an LLOG_ORIGIN_HANDLE_NEXT_BLOCK to the MGS.*
+
+.LLOG_ORIGIN_HANDLE_NEXT_BLOCK Request Packet Structure
+image::llog-origin-handle-next-block-request.png["LLOG_ORIGIN_HANDLE_NEXT_BLOCK Request Packet Structure",height=50]
+
+//////////////////////////////////////////////////////////////////////
+The llog-origin-handle-next-block-request.png diagram resembles this
+text art:
+
+ LLOG_ORIGIN_HANDLE_NEXT_BLOCK:
+ --request-------------------
+ | ptlrpc_body | llogd_body |
+ ----------------------------
+//////////////////////////////////////////////////////////////////////
+
+The client asks for the configuration record.
+
+*14 - The MGS sends an LLOG_ORIGIN_HANDLE_NEXT_BLOCK reply to the client.*
+
+.LLOG_ORIGIN_HANDLE_NEXT_BLOCK Reply Packet Structure
+image::llog-origin-handle-next-block-reply.png["LLOG_ORIGIN_HANDLE_NEXT_BLOCK Reply Packet Structure",height=50]
+
+//////////////////////////////////////////////////////////////////////
+The llog-origin-handle-next-block-reply.png diagram resembles this
+text art:
+
+ LLOG_ORIGIN_HANDLE_NEXT_BLOCK:
+ --reply------------------------------
+ | ptlrpc_body | llogd_body | eadata |
+ -------------------------------------
+//////////////////////////////////////////////////////////////////////
+
+The MGS replies with the location, within the configuration data, of
+the desired records.
Knowing the specific configuration records it wants, the client then
proceeds to retrieve them. This requires another LDLM_ENQUEUE
get the UUIDs for the servers and targets from the configuration log
("lfs-cliir").
-A final LDLM_ENQUEUE, LLOG_ORIGIN_HANDLE_CREATE, and
-LLOG_ORIGIN_HANDLE_READ_HEADER then retrieve the cluster wide
-configuration data ("params").
+*15 - The Client sends an LDLM_ENQUEUE to the MGS.*
+
+The client again seeks a 'concurrent read' lock on the MGS.
+
+*16 - The MGS sends an LDLM_ENQUEUE reply to the client.*
+
+The MGS grants that lock.
+
+*17 - The Client sends an MGS_CONFIG_READ to the MGS.*
+
+The client identifies the desired record in the 'lfs-cliir' file,
+which contains the current details of the configuration for this file
+system.
+
+*18 - The MGS sends an MGS_CONFIG_READ reply to the client.*
+
+The MGS responds with the actual configuration data. This gives the
+client the list of all the servers and targets it will need to
+communicate with.
+
+Finally, the client reads the cluster wide configuration data (the
+sort that might be set at the client command line with a 'lctl
+conf_param' command).
+
+*19 - The Client sends an LDLM_ENQUEUE to the MGS.*
+
+The client again seeks a 'concurrent read' lock on the MGS.
+
+*20 - The MGS sends an LDLM_ENQUEUE reply to the client.*
+
+The MGS grants that lock.
+
+*21 - The Client sends an LLOG_ORIGIN_HANDLE_CREATE to the MGS.*
+
+The client asks for the 'params' log file.
+
+*22 - The MGS sends an LLOG_ORIGIN_HANDLE_CREATE reply to the client.*
+
+The MGS responds that the log file is available.
+
+*23 - The Client sends an LLOG_ORIGIN_HANDLE_READ_HEADER to the MGS.*
+
+The client asks for that log file's header.
+
+*24 - The MGS sends an LLOG_ORIGIN_HANDLE_READ_HEADER reply to the client.*
+
+The MGS responds with the header, which holds the actual bitmap. In
+this case there are no 'params' to report.
Messages Between the Client and the MDSs
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
and a specific target (MDT) on an MDS. Thus, if an MDS has multiple
targets, there is a separate MDS_CONNECT operation for each. This
creates an import for the target on the client and an export for the
-client and target on the MDS. As with the connect operation for the
-MGS, the connect message from the client includes a UUID to uniquely
-identify this connection, and subsequent messages to the lock manager
-on the server will refer to that UUID. The connection data from the
-client also proposes the set of <<connect-flags,connection flags>>
-appropriate to connecting to an MDS. The following are the flags
-always included.
+client and target on the MDS.
+
+*1 - Client issues an MDS_CONNECT to each MDT.*
+
+.MDS_CONNECT Request Packet Structure
+image::mds-connect-request.png["MDS_CONNECT Request Packet Structure",height=50]
+
+//////////////////////////////////////////////////////////////////////
+The mds-connect-request.png diagram resembles this text art:
+
+ MDS_CONNECT:
+ --request--------------------------------------------
+ | ptlrpc_body | obd_uuid | obd_uuid | lustre_handle |
+ -----------------------------------------------------
+ | obd_connect_data |
+ ---------------------
+//////////////////////////////////////////////////////////////////////
+
+As with the connect operation for the MGS, the connect message from
+the client includes a UUID to uniquely identify this connection, and
+subsequent messages to the lock manager on the server will refer to
+that UUID. The connection data from the client also proposes the set
+of <<connect-flags,connection flags>> appropriate to connecting to an
+MDS. The following are the flags always included.
.Always included flags for the client connection to an MDS
[options="header"]
| OBD_CONNECT_RMT_CLIENT_FORCE
|====
+*2 - The MDS sends an MDS_CONNECT reply to the client.*
+
+.MDS_CONNECT Reply Packet Structure
+image::mds-connect-reply.png["MDS_CONNECT Reply Packet Structure",height=50]
+
+//////////////////////////////////////////////////////////////////////
+The mds-connect-reply.png diagram resembles this text art:
+
+ MDS_CONNECT:
+ --reply---------------------------
+ | ptlrpc_body | obd_connect_data |
+ ----------------------------------
+//////////////////////////////////////////////////////////////////////
+
The MDS replies to the connect message with a subset of the flags
proposed by the client, and the client notes those values in its
import. The MDS's reply to the connection request will include a UUID
that the server and client will both use to identify this connection
in subsequent messages.
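The subset-of-proposed-flags negotiation amounts to a bitwise intersection. A minimal sketch; the flag names and bit values below are invented for the example (the real OBD_CONNECT_* bits are defined in the Lustre headers).

```c
#include <stdint.h>

/* Illustrative flag bits; not the real OBD_CONNECT_* values. */
#define EX_CONNECT_AT        0x1ULL
#define EX_CONNECT_BRW_SIZE  0x2ULL
#define EX_CONNECT_PINGLESS  0x4ULL

/* The server grants only those proposed flags it also supports; the
 * client records the result in its import. */
static uint64_t negotiate_flags(uint64_t proposed, uint64_t supported)
{
        return proposed & supported;
}
```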
+*3 - The Client sends an MDS_STATFS to the MDS.*
+
The client next uses an MDS_STATFS operation to request 'statfs'
information from the target, and that data is returned in the reply
message. The actual fields closely resemble the results of a 'statfs'
system call. See the 'obd_statfs' structure in the <<data-structs,Data
Structures and Defines Section>>.
+*4 - The MDS sends an MDS_STATFS reply to the client.*
+
+The MDS replies with the 'statfs' information.
+
+*5 - The Client sends an MDS_GETSTATUS to the MDS.*
+
The client uses the MDS_GETSTATUS operation to request information
-about the mount point of the file system. fixme: Does MDS_GETSTATUS
-only ask about the root (so it would seem)? The server reply contains
-the 'fid' of the root directory of the file system being mounted. If
-there is a security policy the capabilities of that security policy
-are included in the reply.
+about the mount point of the file system.
+
+*6 - The MDS sends an MDS_GETSTATUS reply to the client.*
+
+The server reply contains the 'fid' of the root directory of the file
+system being mounted. If there is a security policy the capabilities
+of that security policy are included in the reply.
+
+*7 - The Client sends an MDS_GETATTR to the MDS.*
The client then uses the MDS_GETATTR operation to get further
information about the root directory of the file system. The request
-message includes the above fid. It will also include the security
-capability (if appropriate). The reply also holds the same fid, and in
-this case the 'mdt_body' has several additional fields filled
-in. These include the mtime, atime, ctime, mode, uid, and gid. It also
-includes the size of the extended attributes and the size of the ACL
-information. The reply message also includes the extended attributes
-and the ACL. From the extended attributes the client can find out
-about striping information for the root, if any.
+message includes the above fid. It will also include the security
+capability (if appropriate).
+
+*8 - The MDS sends an MDS_GETATTR reply to the client.*
+
+The reply holds the same fid, and in this case the 'mdt_body' has
+several additional fields filled in. These include the mtime, atime,
+ctime, mode, uid, and gid. It also includes the size of the extended
+attributes and the size of the ACL information. The reply message also
+includes the extended attributes and the ACL. From the extended
+attributes the client can find out about striping information for the
+root, if any.
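The attribute fields in the reply can be sketched in C. This is an illustrative subset of the 'mdt_body' attributes; the names follow the mbo_* convention from the Lustre headers but the layout is not wire-accurate.

```c
#include <stdint.h>

/* Illustrative subset of the 'mdt_body' attribute fields returned by
 * MDS_GETATTR. */
struct mdt_body_attrs {
        uint64_t mbo_mtime;
        uint64_t mbo_atime;
        uint64_t mbo_ctime;
        uint32_t mbo_mode;       /* type and permission bits */
        uint32_t mbo_uid;
        uint32_t mbo_gid;
        uint32_t mbo_eadatasize; /* size of extended attributes */
        uint32_t mbo_aclsize;    /* size of ACL data */
};

/* The client only needs the xattr/ACL buffers in the reply if the
 * sizes the MDS reported are non-zero. */
static int needs_ea_buffers(const struct mdt_body_attrs *b)
{
        return b->mbo_eadatasize > 0 || b->mbo_aclsize > 0;
}
```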
Messages Between the Client and the OSSs
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2 <-----------------------------------OST_CONNECT
//////////////////////////////////////////////////////////////////////
+The OST_CONNECT operation establishes a connection between the client
+and a specific target (OST) on an OSS. Thus, if an OSS has multiple
+targets, there is a separate OST_CONNECT operation for each. This
+creates an import for the target on the client and an export for the
+client and target on the OSS.
+
+*1 - Client issues an OST_CONNECT to each OST.*
+
+.OST_CONNECT Request Packet Structure
+image::ost-connect-request.png["OST_CONNECT Request Packet Structure",height=50]
+
+//////////////////////////////////////////////////////////////////////
+The ost-connect-request.png diagram resembles this text art:
+
+ OST_CONNECT:
+ --request--------------------------------------------
+ | ptlrpc_body | obd_uuid | obd_uuid | lustre_handle |
+ -----------------------------------------------------
+ | obd_connect_data |
+ ---------------------
+//////////////////////////////////////////////////////////////////////
+
+As with the connect operations for the MGS and MDTs, the connect
+message from the client includes a UUID to uniquely identify this
+connection, and subsequent messages to the lock manager on the server
+will refer to that UUID. The connection data from the client also
+proposes the set of <<connect-flags,connection flags>> appropriate to
+connecting to an OST. The following are the flags always included.
+
+.Flags for the client connection to an OST
+[options="header"]
+|====
+| obd_connect_data->ocd_connect_flags
+| OBD_CONNECT_GRANT
+| OBD_CONNECT_SRVLOCK
+| OBD_CONNECT_VERSION
+| OBD_CONNECT_REQPORTAL
+| OBD_CONNECT_TRUNCLOCK
+| OBD_CONNECT_RMT_CLIENT
+| OBD_CONNECT_BRW_SIZE
+| OBD_CONNECT_OSS_CAPA
+| OBD_CONNECT_CANCELSET
+| OBD_CONNECT_AT
+| OBD_CONNECT_LRU_RESIZE
+| OBD_CONNECT_CKSUM
+| OBD_CONNECT_FID
+| OBD_CONNECT_VBR
+| OBD_CONNECT_FULL20
+| OBD_CONNECT_LAYOUTLOCK
+| OBD_CONNECT_64BITHASH
+| OBD_CONNECT_MAXBYTES
+| OBD_CONNECT_JOBSTATS
+| OBD_CONNECT_EINPROGRESS
+| OBD_CONNECT_LVB_TYPE
+| OBD_CONNECT_PINGLESS
+|====
+
+*2 - The OST sends an OST_CONNECT reply to the client.*
+
+.OST_CONNECT Reply Packet Structure
+image::ost-connect-reply.png["OST_CONNECT Reply Packet Structure",height=50]
+
+//////////////////////////////////////////////////////////////////////
+The ost-connect-reply.png diagram resembles this text art:
+
+ OST_CONNECT:
+ --reply---------------------------
+ | ptlrpc_body | obd_connect_data |
+ ----------------------------------
+//////////////////////////////////////////////////////////////////////
+
+The OST replies to the connect message with a subset of the flags
+proposed by the client, and the client notes those values in its
+import. The OST's reply to the connection request will include a UUID
+that the server and client will both use to identify this connection
+in subsequent messages. The flags that the server doesn't include are:
+OBD_CONNECT_RMT_CLIENT, OBD_CONNECT_OSS_CAPA, and
+OBD_CONNECT_PINGLESS.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[[ost-connect-rpc]]
-.OST_CONNECT (8)
-[options="header"]
-|====
-| request | reply
-| obd_connect_client | obd_connect_server
-|====
-
When a client initiates a connection to a specific target on an OSS,
-it does so by sending an 'obd_connect_client' message and awaiting the
-reply from the OSS of an 'obd_connect_server' message. From a previous
+it does so via an OST_CONNECT RPC ('pb_opc' = 8). From a previous
interaction with the MGS the client knows the UUID of the target OST,
and can fill that value into the request message.
-The 'ocd_connect_flags' field is set to (fixme: what?) reflecting the
-capabilities appropriate to the client. The 'ocd_brw_size' is set to the
-largest value for the size of an RPC that the client can handle. The
-'ocd_ibits_known' and 'ocd_checksum_types' values are set to what the client
-considers appropriate. Other fields in the descriptor and
-'obd_connect_data' structures are zero, as is the 'lustre_handle'
-element.
-
-Once the server receives the 'obd_connect_client' message on behalf of
-the given target it replies with an 'obd_connect_server' message. In
-that message the server sends the 'pb__handle' to uniquely
-identify the connection for subsequent communication. The client notes
-that handle in its import for the given target.
-
-fixme: Are there circumstances that could lead to the 'status'
-value in the reply being non-zero? What would lead to that and what
-error values would result?
-
-The target maintains the last committed transaction for a client in
-its export for that client. If this is the first connection, then that
-last transaction value would just be zero. If there were previous
-transactions for the client, then the transaction number for the last
-such committed transaction is put in the 'pb_last_committed' field.
-
-In a connection request the operation does not modify the filesystem,
-so the 'pb_transno' value will be zero in the reply as well.
-
-fixme: there is still some work to be done about how the fields are
-managed.
+.OST_CONNECT Generic Packet Structure
+image::ost-connect-generic.png["OST_CONNECT Generic Packet Structure",height=100]
+
+//////////////////////////////////////////////////////////////////////
+The ost-connect-generic.png diagram resembles this text art:
+
+ OST_CONNECT:
+ --request--------------------------------------------
+ | ptlrpc_body | obd_uuid | obd_uuid | lustre_handle |
+ -----------------------------------------------------
+ | obd_connect_data |
+ ---------------------
+ --reply---------------------------
+ | ptlrpc_body | obd_connect_data |
+ ----------------------------------
+//////////////////////////////////////////////////////////////////////
+
+'ptlrpc_body'::
+RPC descriptor. See <<struct-ptlrpc-body>>.
+
+'obd_uuid'::
+UUIDs of the target (first) and client (second) entities. See
+<<struct-obd-uuid>>.
+
+'lustre_handle'::
+See <<struct-lustre-handle>>.
+
+'obd_connect_data'::
+See <<struct-obd-connect-data>>.
+
+In the OST_CONNECT RPC reply from the server the 'ptlrpc_body' field
+'pb_handle' is set to uniquely identify the connection for subsequent
+communication. The client notes that handle in its import for the
+given target.
+
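The connection handle itself is small. A minimal sketch, assuming the single-cookie form of 'struct lustre_handle' from the Lustre headers:

```c
#include <stdint.h>

/* Sketch of 'struct lustre_handle': a 64-bit cookie that the server
 * hands out and the client echoes back in later requests. */
struct lustre_handle {
        uint64_t cookie;
};

/* Subsequent RPCs are matched to the connection by cookie equality. */
static int lustre_handle_equal(const struct lustre_handle *a,
                               const struct lustre_handle *b)
{
        return a->cookie == b->cookie;
}
```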
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[[ost-disconnect-rpc]]
-.OST_DISCONNECT (9)
-[options="header"]
-|====
-| request | reply
-| empty | empty
-|====
+.OST_DISCONNECT Generic Packet Structure
+image::ost-disconnect-generic.png["OST_DISCONNECT Generic Packet Structure",height=100]
+
+//////////////////////////////////////////////////////////////////////
+The ost-disconnect-generic.png diagram resembles this text art:
+
+ OST_DISCONNECT:
+ --request------
+ | ptlrpc_body |
+ ---------------
+ --reply--------
+ | ptlrpc_body |
+ ---------------
+//////////////////////////////////////////////////////////////////////
The information exchanged in an OST_DISCONNECT request message is
only that normally conveyed in the RPC descriptor, as described in
--------------------------
//////////////////////////////////////////////////////////////////////
-'ptlrpc_body'::
-RPC descriptor.
+include::struct_ptlrpc_body.txt[]
+
+include::struct_ost_body.txt[]
-'ost_body'::
-Metadata about the resource, as it appears on the OST.
~~~~~~~~~~~~~~~~~~
[[ost-statfs-rpc]]
-OST_STATFS ('pb_opc' = 13) is an RPC that queries data about the
-underlying file system for a given OST. It's form and use are nearly
-identical to the MDS_STATFS RPC. Refer to <<mds-statfs-rpc>> for
-details. The only differences in OST_STATFS are that it has a distinct
-'pb_opc' value, it carries information about an OST (instead of an
-MDT). The MDT may send regular OST_STATFS RPCs to each OST in order
-to keep its information about free space and utilization updated.
-That allows the MDS to make more optimal file allocation decisions.
+OST_STATFS is an RPC that queries data about the underlying file
+system for a given OST.
+
+The OST_STATFS request message is a so-called "empty" message in that
+it only has a buffer for the 'ptlrpc_body' with the 'pb_opc' value
+OST_STATFS (13).
+
+The reply message conveys 'statfs' data when it succeeds, and an error
+code if it doesn't.
+
+.OST_STATFS Generic Packet Structure
+image::ost-statfs-generic.png["OST_STATFS Generic Packet Structure",height=100]
+
+//////////////////////////////////////////////////////////////////////
+The ost-statfs-generic.png diagram resembles this text art:
+
+ OST_STATFS:
+ --request------
+ | ptlrpc_body |
+ ---------------
+ --reply---------------------
+ | ptlrpc_body | obd_statfs |
+ ----------------------------
+//////////////////////////////////////////////////////////////////////
+
+'ptlrpc_body':: RPC descriptor. Only the 'pb_opc' value (OST_STATFS =
+13) is directly relevant to the OST_STATFS request message. The rest
+of the 'ptlrpc_body' fields handle generic information about the
+RPC, as discussed in <<struct-ptlrpc-body>>, including generic error
+conditions. In a normal reply ('pb_type' = PTL_RPC_MSG_REPLY) the
+'pb_status' field is 0. The one error that can be returned in
+'pb_status' specifically from OST_STATFS handling is -ENOMEM,
+which occurs if there is not enough memory to allocate a temporary
+buffer for the 'statfs' data.
+
+include::struct_obd_statfs.txt[]
In addition to the 'ptlrpc_body' (Lustre RPC descriptor), the MDS_REINT
request RPC from the client has the REINT structure 'mdt_rec_setattr', and a
lock request 'ldlm_request'. For a detailed discussion of all the fields in
-the 'mdt_rec_setattr' and 'ldlm_request' refer to <<mdt-rec-setattr>>
+the 'mdt_rec_setattr' and 'ldlm_request' refer to <<struct-mdt-rec-setattr>>
and <<struct-ldlm-request>>.
.MDS_REINT:REINT_SETATTR Request Packet Structure
'mdt_rec_setxattr', the name of the extended attribute in question,
the extended attribute data to put in place, and a lock request
'ldlm_request'. For a detailed discussion of all the fields in the
-'mdt_rec_setxattr' and 'ldlm_request' refer to <<mdt-rec-setxattr>>
-and <<struct-ldlm-request>>.
+'mdt_rec_setxattr' and 'ldlm_request' refer to
+<<struct-mdt-rec-setxattr>> and <<struct-ldlm-request>>.
.MDS_REINT:REINT_SETXATTR Request Packet Structure
image::mds-reint-setxattr-request.png["MDS_REINT:REINT_SETXATTR Request Packet Structure",height=50]
+++ /dev/null
-Statfs Structures
-~~~~~~~~~~~~~~~~~
-[[statfs-structs]]
-
-These structures convey statfs information to or from an MDTs and OSTs.
-
-include::obd_statfs.txt[]
-
An LDLM_ENQUEUE RPC with a layout intent uses the 'layout_intent'
structure to specify the desired use for the layout.
+[source,c]
----
struct layout_intent {
__u32 li_opc;
The uses for the layout that the 'li_opc' field can specify are:
+[source,c]
----
enum {
LAYOUT_INTENT_ACCESS = 0,
A lock request can include an 'intent' operation. Which operation is
encoded in the 'ldlm_intent' 'opc'.
+[source,c]
----
struct ldlm_intent {
__u64 opc;
The available operations are:
+[source,c]
----
#define IT_OPEN (1 << 0)
#define IT_CREAT (1 << 1)
The 'ldlm_reply' structure is the reciprocal of the 'ldlm_request'.
+[source,c]
----
struct ldlm_reply {
__u32 lock_flags;
--- /dev/null
+LDLM Request
+^^^^^^^^^^^^
+[[struct-ldlm-request]]
+
+The 'ldlm_request' structure is a description of the lock being
+requested. Which resource is the target, what lock is current, and
+what lock desired.
+
+[source,c]
+----
+struct ldlm_request {
+ __u32 lock_flags;
+ __u32 lock_count;
+ struct ldlm_lock_desc lock_desc;
+ struct lustre_handle lock_handle[2];
+};
+----
+
+The 'lock_flags' field governs how the lock request is to be
+interpreted. The flags are:
+
+[source,c]
+----
+#define LDLM_FL_LOCK_CHANGED 0x0000000000000001ULL // bit 0
+#define LDLM_FL_BLOCK_GRANTED 0x0000000000000002ULL // bit 1
+#define LDLM_FL_BLOCK_CONV 0x0000000000000004ULL // bit 2
+#define LDLM_FL_BLOCK_WAIT 0x0000000000000008ULL // bit 3
+#define LDLM_FL_AST_SENT 0x0000000000000020ULL // bit 5
+#define LDLM_FL_REPLAY 0x0000000000000100ULL // bit 8
+#define LDLM_FL_INTENT_ONLY 0x0000000000000200ULL // bit 9
+#define LDLM_FL_HAS_INTENT 0x0000000000001000ULL // bit 12
+#define LDLM_FL_FLOCK_DEADLOCK 0x0000000000008000ULL // bit 15
+#define LDLM_FL_DISCARD_DATA 0x0000000000010000ULL // bit 16
+#define LDLM_FL_NO_TIMEOUT 0x0000000000020000ULL // bit 17
+#define LDLM_FL_BLOCK_NOWAIT 0x0000000000040000ULL // bit 18
+#define LDLM_FL_TEST_LOCK 0x0000000000080000ULL // bit 19
+#define LDLM_FL_CANCEL_ON_BLOCK 0x0000000000800000ULL // bit 23
+#define LDLM_FL_DENY_ON_CONTENTION 0x0000000040000000ULL // bit 30
+#define LDLM_FL_AST_DISCARD_DATA 0x0000000080000000ULL // bit 31
+#define LDLM_FL_FAIL_LOC 0x0000000100000000ULL // bit 32
+#define LDLM_FL_SKIPPED 0x0000000200000000ULL // bit 33
+#define LDLM_FL_CBPENDING 0x0000000400000000ULL // bit 34
+#define LDLM_FL_WAIT_NOREPROC 0x0000000800000000ULL // bit 35
+#define LDLM_FL_CANCEL 0x0000001000000000ULL // bit 36
+#define LDLM_FL_LOCAL_ONLY 0x0000002000000000ULL // bit 37
+#define LDLM_FL_FAILED 0x0000004000000000ULL // bit 38
+#define LDLM_FL_CANCELING 0x0000008000000000ULL // bit 39
+#define LDLM_FL_LOCAL 0x0000010000000000ULL // bit 40
+#define LDLM_FL_LVB_READY 0x0000020000000000ULL // bit 41
+#define LDLM_FL_KMS_IGNORE 0x0000040000000000ULL // bit 42
+#define LDLM_FL_CP_REQD 0x0000080000000000ULL // bit 43
+#define LDLM_FL_CLEANED 0x0000100000000000ULL // bit 44
+#define LDLM_FL_ATOMIC_CB 0x0000200000000000ULL // bit 45
+#define LDLM_FL_BL_AST 0x0000400000000000ULL // bit 46
+#define LDLM_FL_BL_DONE 0x0000800000000000ULL // bit 47
+#define LDLM_FL_NO_LRU 0x0001000000000000ULL // bit 48
+#define LDLM_FL_FAIL_NOTIFIED 0x0002000000000000ULL // bit 49
+#define LDLM_FL_DESTROYED 0x0004000000000000ULL // bit 50
+#define LDLM_FL_SERVER_LOCK 0x0008000000000000ULL // bit 51
+#define LDLM_FL_RES_LOCKED 0x0010000000000000ULL // bit 52
+#define LDLM_FL_WAITED 0x0020000000000000ULL // bit 53
+#define LDLM_FL_NS_SRV 0x0040000000000000ULL // bit 54
+#define LDLM_FL_EXCL 0x0080000000000000ULL // bit 55
+----
+
+The 'lock_count' field gives the number of lock handles being
+conveyed, as when multiple locks are cancelled in a single request.
+
+[[struct-ldlm-lock-desc]]
+
+The lock descriptor conveys the specific details about a particular
+lock being requested or granted. It appears in
+<<struct-ldlm-request>>.
+
+[source,c]
+----
+struct ldlm_lock_desc {
+ struct ldlm_resource_desc l_resource;
+ ldlm_mode_t l_req_mode;
+ ldlm_mode_t l_granted_mode;
+ ldlm_wire_policy_data_t l_policy_data;
+};
+----
+
+[[struct-ldlm-resource-desc]]
+
+The resource descriptor identifies the individual resource that is
+being locked, along with what sort of thing it is.
+
+[source,c]
+----
+struct ldlm_resource_desc {
+ ldlm_type_t lr_type;
+ __u32 lr_padding; /* also fix lustre_swab_ldlm_resource_desc */
+ struct ldlm_res_id lr_name;
+};
+----
+
+[[ldlm-type-t]]
+The 'lr_type' field identifies one of the four types of lock that
+might be placed on a resource. A "plain" lock type just locks a
+particular resource. An "extent" lock type only locks a contiguous
+sequence of byte offsets within a regular file. A "flock" lock type
+represents an application layer advisory lock from the 'flock()'
+system call. While Lustre manages "flock" type locks on behalf of the
+application, they do not affect Lustre operation. An "ibits" lock
+type allows fine grained locking of different parts of a single
+resource. A single lock request or cancellation may operate on one or
+more lock bits, or individual lock bits may be granted on the same
+resource separately. See also <<ldlm-wire-policy-data-t>>. A lock
+descriptor may also have no type at all, in which case the 'lr_type'
+field is 0, meaning "no lock".
+
+[source,c]
+----
+enum {
+ LDLM_PLAIN = 10,
+ LDLM_EXTENT = 11,
+ LDLM_FLOCK = 12,
+ LDLM_IBITS = 13,
+} ldlm_type_t;
+----
+
+[[struct-ldlm-res-id]]
+The 'lr_name' field identifies (by name) the resource(s) that are the
+objects of the locking operation.
+
+[source,c]
+----
+struct ldlm_res_id {
+ __u64 name[4];
+};
+----
+
+The 'name' array holds identifiers for the resource in question. Those
+identifiers may be the elements of a 'struct lu_fid' file ID, or they
+may be other uniquely identifying values for the resource. See
+<<struct-lu-fid>>.
+
+[[ldlm-mode-t]]
+The 'l_req_mode' and 'l_granted_mode' fields give the kind of lock
+being requested and the kind of lock that has been granted. The field
+values are:
+
+[source,c]
+----
+enum {
+    LCK_EX = 1,      /* exclusive */
+    LCK_PW = 2,      /* protected write */
+    LCK_PR = 4,      /* protected read */
+    LCK_CW = 8,      /* concurrent write */
+    LCK_CR = 16,     /* concurrent read */
+    LCK_NL = 32,     /* null */
+    LCK_GROUP = 64,  /* group */
+    LCK_COS = 128,   /* commit on sharing */
+} ldlm_mode_t;
+----
+
+Although the lock mode values are distinct bits, the modes are
+mutually exclusive: a lock holds exactly one mode at a time. In
+addition the mode value 0 is the MINMODE, i.e. no lock at all.
+
+In a request 'l_req_mode' is the value actually being requested and
+'l_granted_mode' is the value that is currently in place for the
+requester. In a reply the 'l_req_mode' may be modified if more or
+fewer privileges were granted than requested, and the
+'l_granted_mode' is what has, in fact, been granted.
+
+[[ldlm-wire-policy-data-t]]
+The 'l_policy_data' field gives the kind of resource being
+requested/granted. It is a union of these struct definitions:
+
+[source,c]
+----
+typedef union {
+ struct ldlm_extent l_extent;
+ struct ldlm_flock_wire l_flock;
+ struct ldlm_inodebits l_inodebits;
+} ldlm_wire_policy_data_t;
+----
+
+[[struct-ldlm-extent]]
+[source,c]
+----
+struct ldlm_extent {
+ __u64 start;
+ __u64 end;
+ __u64 gid;
+};
+----
+
+[[struct-ldlm-flock-wire]]
+[source,c]
+----
+struct ldlm_flock_wire {
+ __u64 lfw_start;
+ __u64 lfw_end;
+ __u64 lfw_owner;
+ __u32 lfw_padding;
+ __u32 lfw_pid;
+};
+----
+
+[[struct-ldlm-inodebits]]
+[source,c]
+----
+struct ldlm_inodebits {
+ __u64 bits;
+};
+----
+
+Thus the lock may be on an 'extent', a contiguous sequence of bytes
+in a regular file; a 'flock', representing an advisory 'flock()'
+lock; or a portion of an inode ('inodebits'). For a "plain" lock (or
+one with no type at all) the 'l_policy_data' field has zero length.
+
+The 'lock_handle' array's first element holds the handle for the lock
+manager (see the description of <<struct-lustre-handle>>) involved in
+the operation. There is only one lock manager involved in any given
+RPC. The second handle is set to zero except in the rare case that
+there is also an early lock cancellation. The latter case will be
+discussed elsewhere.
+
--- /dev/null
+LLOG Log Header
+^^^^^^^^^^^^^^^
+[[struct-llog-log-hdr]]
+
+[source,c]
+----
+struct llog_log_hdr {
+ struct llog_rec_hdr llh_hdr;
+ obd_time llh_timestamp;
+ __u32 llh_count;
+ __u32 llh_bitmap_offset;
+ __u32 llh_size;
+ __u32 llh_flags;
+ __u32 llh_cat_idx;
+ /* for a catalog the first plain slot is next to it */
+ struct obd_uuid llh_tgtuuid;
+ __u32 llh_reserved[LLOG_HEADER_SIZE/sizeof(__u32) - 23];
+ __u32 llh_bitmap[LLOG_BITMAP_BYTES/sizeof(__u32)];
+ struct llog_rec_tail llh_tail;
+};
+----
+
+The llog records start and end on a record size boundary, typically
+8192 bytes, or as stored in 'llh_size', which allows some degree of
+random access within the llog file, even with variable record sizes.
+It is possible to interpolate the offset of an arbitrary record
+within the file by estimating the byte offset of a particular record
+index using 'llh_count' and the llog file size and aligning it to
+the chunk boundary 'llh_size'. The record index in the 'llog_rec_hdr'
+of the first record in that chunk can be used to further refine the
+estimate of the offset of the desired index down to a single chunk,
+and then sequential access can be used to find the actual record.
+
+
+Each llog file begins with an 'llog_log_hdr' that describes the llog
+file itself, followed by a series of log records that are appended
+sequentially to the file. Each record, including the header itself,
+begins with an 'llog_rec_hdr' and ends with an 'llog_rec_tail'.
+
+
+The 'llh_flags' values are:
+[source,c]
+----
+enum llog_flag {
+ LLOG_F_ZAP_WHEN_EMPTY = 0x1,
+ LLOG_F_IS_CAT = 0x2,
+ LLOG_F_IS_PLAIN = 0x4,
+};
+----
+
+LLog Record Header
+^^^^^^^^^^^^^^^^^^
+[[struct-llog-rec-hdr]]
+[source,c]
+----
+struct llog_rec_hdr {
+ __u32 lrh_len;
+ __u32 lrh_index;
+ __u32 lrh_type;
+ __u32 lrh_id;
+};
+----
+
+The 'llog_rec_hdr' is at the start of each llog record and describes
+the log record. 'lrh_len' holds the record size in bytes, including
+the header and tail. 'lrh_index' is the record index within the llog
+file and is sequentially increasing for each subsequent record. It
+can be used to determine the offset within the llog file when searching
+for an arbitrary record within the file. 'lrh_type' describes the type
+of data stored in this record.
+
+[source,c]
+----
+enum llog_op_type {
+ LLOG_PAD_MAGIC = LLOG_OP_MAGIC | 0x00000,
+ OST_SZ_REC = LLOG_OP_MAGIC | 0x00f00,
+ MDS_UNLINK64_REC = LLOG_OP_MAGIC | 0x90000 | (MDS_REINT << 8) |
+ REINT_UNLINK,
+ MDS_SETATTR64_REC = LLOG_OP_MAGIC | 0x90000 | (MDS_REINT << 8) |
+ REINT_SETATTR,
+ OBD_CFG_REC = LLOG_OP_MAGIC | 0x20000,
+ LLOG_GEN_REC = LLOG_OP_MAGIC | 0x40000,
+ CHANGELOG_REC = LLOG_OP_MAGIC | 0x60000,
+ CHANGELOG_USER_REC = LLOG_OP_MAGIC | 0x70000,
+ HSM_AGENT_REC = LLOG_OP_MAGIC | 0x80000,
+ UPDATE_REC = LLOG_OP_MAGIC | 0xa0000,
+ LLOG_HDR_MAGIC = LLOG_OP_MAGIC | 0x45539,
+ LLOG_LOGID_MAGIC = LLOG_OP_MAGIC | 0x4553b,
+};
+----
+
+LLog Record Tail
+^^^^^^^^^^^^^^^^
+[[struct-llog-rec-tail]]
+[source,c]
+----
+struct llog_rec_tail {
+ __u32 lrt_len;
+ __u32 lrt_index;
+};
+----
+
+The 'llog_rec_tail' is at the end of each llog record. The 'lrt_len'
+and 'lrt_index' fields must be the same as 'lrh_len' and 'lrh_index'
+in the header. They can be used to verify record integrity, as well
+as allowing processing the llog records in reverse order.
+
--- /dev/null
+LLOGD Body
+^^^^^^^^^^
+[[struct-llogd-body]]
+
+[source,c]
+----
+struct llogd_body {
+ struct llog_logid lgd_logid;
+ __u32 lgd_ctxt_idx;
+ __u32 lgd_llh_flags;
+ __u32 lgd_index;
+ __u32 lgd_saved_index;
+ __u32 lgd_len;
+ __u64 lgd_cur_offset;
+};
+----
+
+[[struct-llog-logid]]
+
+[source,c]
+----
+struct llog_logid {
+ struct ost_id lgl_oi;
+ __u32 lgl_ogen;
+};
+----
+
+The 'llog_logid' structure is used to identify a single Lustre log file.
+It holds a <<struct-ost-id>> in 'lgl_oi', which is typically a FID.
^^^^^^^^^^^^^^^^^^^^
[[struct-lov-mds-md]]
-The 'lov_mds_md' structure contains the layout of a single file.
-In replies to lock requests and other situations requiring
-layout information from an MDT the 'lov_mds_md' information provides
-details about the layout of a file across the OSTs. There may be
-different types of layouts for different files, either 'lov_mds_md_v1'
-or 'lov_mds_md_v3' as of Lustre 2.7, though they are very similar in
-structure.
+The 'lov_mds_md' structure contains the layout of a resource. In
+replies to lock requests and other situations requiring layout
+information from an MDT the 'lov_mds_md' information provides details
+about the layout of a file across the OSTs. There may be different
+types of layouts for different files, either 'lov_mds_md_v1' or
+'lov_mds_md_v3' as of Lustre 2.7, though they are very similar in
+structure. In an intent request (as opposed to a reply, and as yet
+unimplemented) it will modify the layout. It will not be included
+(zero length) in requests in current releases.
+[source,c]
----
struct lov_mds_md_v1 {
__u32 lmm_magic;
entries in the 'lmm_objects' array, which can be determined by the overall
layout size.
+[source,c]
----
struct lov_ost_data_v1 { /* per-stripe data structure (little-endian)*/
struct ost_id l_ost_oi; /* OST object ID */
layout, one per object.
'l_ost_id' identifies the object on the OST specified by 'l_ost_idx'.
-It may contain a OST object ID or a FID as described in <<struct-ost-id>>.
-The 'l_ost_gen' field is currently unused.
+It may contain an OST object ID or a FID. The 'l_ost_gen' field is
+currently unused.
+include::struct_ost_id.txt[]
Lustre File Identifiers
-~~~~~~~~~~~~~~~~~~~~~~~
+^^^^^^^^^^^^^^^^^^^^^^^
[[struct-lu-fid]]
Each resource stored on a target is assigned an identifier that is
different hosts might be referring to different versions of the same
resource. It has never been used as of Lustre 2.8.
+[source,c]
----
enum fid_seq {
FID_SEQ_OST_MDT0 = 0,
etc. The meaning of the handle is dependent on the context in which it
is used.
+[source,c]
----
struct lustre_handle {
__u64 cookie;
--- /dev/null
+Lustre Message Header
+~~~~~~~~~~~~~~~~~~~~~
+[[struct-lustre-msg]]
+
+Every message has an initial header that informs the receiver about
+the number of buffers and their size for the rest of the message to
+follow, along with other important information about the request or
+reply message.
+
+[source,c]
+----
+#define LUSTRE_MSG_MAGIC_V2 0x0BD00BD3
+#define MSGHDR_AT_SUPPORT 0x1
+struct lustre_msg {
+ __u32 lm_bufcount;
+ __u32 lm_secflvr;
+ __u32 lm_magic;
+ __u32 lm_repsize;
+ __u32 lm_cksum;
+ __u32 lm_flags;
+ __u32 lm_padding_2;
+ __u32 lm_padding_3;
+ __u32 lm_buflens[0];
+};
+----
+
+The 'lm_bufcount' field holds the number of buffers that will follow
+the header. The header and sequence of buffers constitutes one
+message. Each of the buffers is a sequence of bytes whose contents
+corresponds to one of the structures described in this section. Each
+message will always have at least one buffer, and no message can have
+more than thirty-one buffers.
+
+The 'lm_secflvr' field gives an indication of whether any sort of
+cryptographic encoding of the subsequent buffers will be in force. The
+value is zero if there is no "crypto" and gives a code identifying the
+"flavor" of crypto if it is employed. Further, if crypto is employed
+there will only be one buffer following (i.e. 'lm_bufcount' = 1), and
+that buffer holds an encoding of what would otherwise have been the
+sequence of buffers normally following the header. Cryptography will
+be discussed in a separate chapter.
+
+The 'lm_magic' field is a "magic" value (LUSTRE_MSG_MAGIC_V2 = 0x0BD00BD3,
+'OBD' for 'object based device') that is
+checked in order to positively identify that the message is intended
+for the use to which it is being put. That is, we are indeed dealing
+with a Lustre message, and not, for example, corrupted memory or a bad
+pointer.
+
+The 'lm_repsize' field in a request indicates the maximum available
+space that has been reserved for any reply to the request. A reply
+that attempts to use more than the reserved space will be discarded.
+
+The 'lm_cksum' field contains a checksum of the 'ptlrpc_body' buffer
+to allow the receiver to verify that the message is intact. This is
+used to verify that an 'early reply' has not been overwritten by the
+actual reply message. If the 'MSGHDR_CKSUM_INCOMPAT18' flag is set in
+a request (possible since Lustre 1.8), the server will send early
+reply messages with the appropriate 'lm_cksum' if it understands the
+flag. The flag is mandatory in Lustre 2.8 and later.
+
+The 'lm_flags' field contains flags that affect the low-level RPC
+protocol. The 'MSGHDR_AT_SUPPORT' (0x1) bit indicates that the sender
+understands adaptive timeouts and can receive 'early reply' messages
+to extend its waiting period rather than timing out. This flag was
+introduced in Lustre 1.6. The 'MSGHDR_CKSUM_INCOMPAT18' (0x2) bit
+indicates that 'lm_cksum' is computed on the full 'ptlrpc_body'
+message buffer rather than on the original 'ptlrpc_body_v2' structure
+size (88 bytes). It was introduced in Lustre 1.8 and is mandatory
+for all requests in Lustre 2.8 and later.
+
+The 'lm_padding*' fields are reserved for future use.
+
+The array of 'lm_buflens' values has 'lm_bufcount' entries. Each
+entry corresponds to, and gives the length in bytes of, one of the
+buffers that will follow. The entire header, and each of the buffers,
+is required to be a multiple of eight bytes long to ensure the buffers
+are properly aligned to hold __u64 values. Thus there may be an extra
+four bytes of padding after the 'lm_buflens' array if that array has
+an odd number of entries.
+
An 'mdt_body' structure conveys information about the metadata for a
single resource, typically an MDT inode.
+[source,c]
----
struct mdt_body {
struct lu_fid mbo_fid1; /* OBD_MD_FLID */
For an operation that involves two resources both the 'mbo_fid1' and
'mbo_fid2' fields will be filled in. If the 'mdt_body' is part of an
RPC affecting or involving only a single resource then 'mbo_fid1' will
-designate that resource and 'mbo_fid2' will be cleared (see
-<<struct-lu-fid>>).
+designate that resource and 'mbo_fid2' will be cleared.
+
+include::struct_lu_fid.txt[]
The 'mbo_handle' field indicates the identity of an open file related
to the operation, if any. If there is no lock then it is just 0.
The 'mbo_valid' field identifies which of the remaining fields are
actually in force. The flags in 'mbo_valid' are:
+[source,c]
----
#define OBD_MD_FLID (0x00000001ULL)
#define OBD_MD_FLATIME (0x00000002ULL)
Generic MDS_REINT
^^^^^^^^^^^^^^^^^
+[[struct-mdt-rec-reint]]
An 'mdt_rec_reint' structure specifies the generic form for MDS_REINT
requests. Each sub-operation, as defined by the 'rr_opcode' field, has
the sequence of field sizes must be the same in every variant as it is
in the generic version (not just the overall size of the structure).
+[source,c]
----
struct mdt_rec_reint {
__u32 rr_opcode;
The 'rr_opcode' field defines one among the several sub-commands for
MDS REINT RPCs. Those opcodes are:
+[source,c]
----
typedef enum {
REINT_SETATTR = 1,
The 'rr_bias' field adds additional optional information to the
REINT. The possible values are:
+[source,c]
----
enum mds_op_bias {
MDS_CHECK_SPLIT = 1 << 0,
-REINT_SETATTR
-^^^^^^^^^^^^^
-[[mdt-rec-setattr]]
+REINT_SETATTR Structure
+^^^^^^^^^^^^^^^^^^^^^^^
+[[struct-mdt-rec-setattr]]
The variant of the 'mdt_rec_reint' for the 'setattr' operation is:
+[source,c]
----
struct mdt_rec_setattr {
__u32 sa_opcode;
then the value of the corresponding field is to be ignored. The flags
are:
+[source,c]
----
#define MDS_ATTR_MODE 0x1ULL /* = 1 */
#define MDS_ATTR_UID 0x2ULL /* = 2 */
-REINT_SETXATTR
-^^^^^^^^^^^^^^
-[[mdt-rec-setxattr]]
+REINT_SETXATTR Structure
+^^^^^^^^^^^^^^^^^^^^^^^^
+[[struct-mdt-rec-setxattr]]
The variant of the 'mdt_rec_reint' for the 'setxattr' operation is:
+[source,c]
----
struct mdt_rec_setxattr {
__u32 sx_opcode;
The 'sx_valid' field identifies which of the other fields in the
structure are to be honored. If the corresponding flag bit is not set
then the value of the corresponding field is to be ignored. The flag
-values draw from the same set of definitions as <<mdt-rec-setattr>>.
+values draw from the same set of definitions as
+<<struct-mdt-rec-setattr>>.
.Flags for 'sx_valid' field of 'struct mdt_rec_setxattr'
[options="header"]
--- /dev/null
+MGS Configuration Data
+^^^^^^^^^^^^^^^^^^^^^^
+
+[source,c]
+----
+#define MTI_NAME_MAXLEN 64
+struct mgs_config_body {
+ char mcb_name[MTI_NAME_MAXLEN]; /* logname */
+ __u64 mcb_offset; /* next index of config log to request */
+ __u16 mcb_type; /* type of log: CONFIG_T_[CONFIG|RECOVER] */
+ __u8 mcb_reserved;
+ __u8 mcb_bits; /* bits unit size of config log */
+ __u32 mcb_units; /* # of units for bulk transfer */
+};
+----
+
+The 'mgs_config_body' structure has information identifying to the MGS
+which Lustre file system the client is requesting configuration information
+from. 'mcb_name' contains the filesystem name (fsname). 'mcb_offset'
+contains the next record number in the configuration llog to process
+(see <<llog>> for details), not the byte offset or bulk transfer units.
+'mcb_bits' is the log2 of the minimum bulk transfer unit size,
+typically 4096 or 8192 bytes, while 'mcb_units' is the maximum number
+of 2^mcb_bits sized units that can be transferred in a single request.
+
+[source,c]
+----
+struct mgs_config_res {
+ __u64 mcr_offset; /* index of last config log */
+ __u64 mcr_size; /* size of the log */
+};
+----
+
+The 'mgs_config_res' structure returns information describing the
+replied configuration llog data requested in 'mgs_config_body'.
+'mcr_offset' contains the last configuration record number returned
+by this reply. 'mcr_size' contains the maximum record index in the
+entire configuration llog. When 'mcr_offset' equals 'mcr_size' there
+are no more records to process in the log.
+
with the subset of feature flags that it understands and intends to honour.
The server may set fields in the reply for mutually-understood features.
+[source,c]
----
struct obd_connect_data {
__u64 ocd_connect_flags;
actually control whether the remaining fields of 'obd_connect_data'
get used. The [[obd-connect-flags]] flags are:
+[source,c]
----
#define OBD_CONNECT_RDONLY 0x1ULL /*client has read-only access*/
#define OBD_CONNECT_INDEX 0x2ULL /*connect specific LOV idx */
of inodes and extended attributes.
[[mds-inode-bits-locks]]
+[source,c]
----
#define MDS_INODELOCK_LOOKUP 0x000001 /* For namespace, dentry etc, and also
* was used to protect permission (mode,
The OBD_CONNECT_LOV_V3 flag is set if the client supports LOV_MAGIC_V3
(0x0BD30BD0) style layouts. This type of the layout was introduced
-along with OST pools support and added the 'lov_mds_md_v3' layout. The
+along with OST pools support and added the 'lov_mds_md' layout. The
OBD_CONNECT_LOV_V3 flag notifies the server that the client supports
this type of LOV EA so that it can handle requests from it properly.
An 'obd_statfs' structure conveys file-system-wide information for the
back-end file system of a given target (MDT or OST).
+[source,c]
----
struct obd_statfs {
__u64 os_type;
--- /dev/null
+Object Based Disk UUID
+^^^^^^^^^^^^^^^^^^^^^^
+[[struct-obd-uuid]]
+
+[source,c]
+----
+#define UUID_MAX 40
+struct obd_uuid {
+ char uuid[UUID_MAX];
+};
+----
+
+The 'uuid' contains an ASCII-formatted string that identifies
+the entity uniquely within the filesystem. Clients use an RFC-4122
+hexadecimal UUID of the form ''de305d54-75b4-431b-adb2-eb6b9e546014''
+that is randomly generated. Servers may use a string-based identifier
+of the form ''fsname-TGTindx_UUID''.
+
-OST_SETATTR Structures
-~~~~~~~~~~~~~~~~~~~~~~
-[[ost-setattr]]
-
OST Body
^^^^^^^^
[[struct-ost-body]]
-The 'ost_body' structure just hold a 'struct 'obdo', which is where
+The 'ost_body' structure just holds a 'struct obdo', which is where
all the actual information is conveyed.
+[source,c]
----
struct ost_body {
struct obdo oa;
The 'obdo' structure conveys metadata about a resource on an OST.
+[source,c]
----
struct obdo {
__u64 o_valid;
to be interpreted. The flags are the same set (with additions) used
for the 'mdt_body' field 'mbo_valid' (see <<struct-mdt-body>>).
+[source,c]
----
#define OBD_MD_FLID (0x00000001ULL)
#define OBD_MD_FLATIME (0x00000002ULL)
--- /dev/null
+OST ID
+^^^^^^
+[[struct-ost-id]]
+
+The 'ost_id' identifies a single object on a particular OST.
+
+[source,c]
+----
+struct ost_id {
+ union {
+ struct ostid {
+ __u64 oi_id;
+ __u64 oi_seq;
+ } oi;
+ struct lu_fid oi_fid;
+ };
+};
+----
+
+The 'ost_id' structure contains an identifier for a single OST object.
+The 'oi' structure holds the OST object identifier as used with Lustre
+1.8 and earlier, where the 'oi_seq' field is typically zero, and the
+'oi_id' field is an integer identifying an object on a particular
+OST (which is identified separately). Since Lustre 2.5 it is possible
+for OST objects to also be identified with a unique FID that identifies
+both the OST on which it resides as well as the object identifier itself.
[[struct-ost-lvb]]
The 'ost_lvb' structure is a "lock value block", and encompasses
-attribute data for resources on the OST. It is an optional part of an
+attribute data for resources on the OST. It is returned from an OST to
+a client requesting an extent lock. It is an optional part of an
LDLM_ENQUEUE reply RPC for an MDT as well.
+[source,c]
----
struct ost_lvb {
__u64 lvb_size;
encoded in the 'pb_opc' Lustre operation number. The value of that
opcode, as well as whether it is an RPC 'request' or 'reply',
determines what else will be in the message following the preamble.
+
+[source,c]
----
#define PTLRPC_NUM_VERSIONS 4
#define JOBSTATS_JOBID_SIZE 32
};
----
+include::struct_lustre_handle.txt[]
+
In a connection request, sent by a client to a server and regarding a
specific target, the 'pb_handle' is 0. In the reply to a connection
request, sent by the target, the handle is a value uniquely
PTL_RPC_MSG_ERR in a reply to convey that a message was received that
could not be interpreted, that is, if it was corrupt or
incomplete. The encoding of those type values is given by:
+
+[source,c]
----
#define PTL_RPC_MSG_REQUEST 4711
#define PTL_RPC_MSG_ERR 4712
version of PtlRPC being employed in the message, and the upper two
bytes encode the role of the host for the service being
requested. That role is one of OBD, MDS, OST, DLM, LOG, or MGS.
+
+[source,c]
----
#define PTLRPC_MSG_VERSION 0x00000003
#define LUSTRE_VERSION_MASK 0xffff0000
that is the subject of this message. For example, MDS_CONNECT is a
Lustre operation (number 38). The following list gives the name used
and the value for each operation.
+
+[source,c]
----
typedef enum {
OST_REPLY = 0,
what the states and transitions are of this state machine. Currently,
only the bottom two bytes are used, and they encode state according to
the following values:
+
+[source,c]
----
#define MSG_GEN_FLAG_MASK 0x0000ffff
#define MSG_LAST_REPLAY 0x0001
for connect operations governs the client connection status state
machine.
+[source,c]
----
#define MSG_CONNECT_RECOVERING 0x00000001
#define MSG_CONNECT_RECONNECT 0x00000002
--- /dev/null
+Transaction Number
+~~~~~~~~~~~~~~~~~~
+[[transno]]
+
+For each target there is a sequence of values (a strictly increasing
+series of numbers) where each operation that can modify the file
+system is assigned the next number in the series. This is the
+transaction number, and it imposes a strict serial ordering for all
+file system modifying operations. For file system modifying
+requests the server assigns the next value in the sequence and
+informs the client of the value in the 'pb_transno' field of the
+'ptlrpc_body' of its reply to the client's request. For replies to
+requests that do not modify the file system the 'pb_transno' field in
+the 'ptlrpc_body' is just set to 0.
+