This set of structure specifications includes all those directly referenced in the message formats and all subsidiary structures mentioned in them.

acl
^^^

define LUSTRE_POSIX_ACL_MAX_SIZE (sizeof(posix_acl_xattr_header) + LUSTRE_POSIX_ACL_MAX_ENTRIES * sizeof(posix_acl_xattr_entry))

mdt_md
^^^^^^

MIN_MD_SIZE (sizeof(struct lov_mds_md) + 1 * sizeof(struct lov_ost_data))

close_data
^^^^^^^^^^
.close_data
[options="header"]
|=====
| type | field
| struct lustre_handle | cd_handle
| struct lu_fid | cd_fid
| __u64 | cd_data_version
| __u64 | cd_reserved[8]
|=====

hsm_current_action
^^^^^^^^^^^^^^^^^^
.hsm_current_action
[options="header"]
|=====
| type | field
| __u32 | hca_state
| __u32 | hca_action
| struct hsm_extent | hca_location
|=====

hsm_extent
^^^^^^^^^^
.hsm_extent
[options="header"]
|=====
| type | field
| __u64 | offset
| __u64 | length
|=====

hsm_progress_kernel
^^^^^^^^^^^^^^^^^^^
.hsm_progress_kernel
[options="header"]
|=====
| type | field
| lustre_fid | hpk_fid
| __u64 | hpk_cookie
| struct hsm_extent | hpk_extent
| __u16 | hpk_flags
| __u16 | hpk_errval
| __u32 | hpk_padding1
| __u64 | hpk_data_version
| __u64 | hpk_padding2
|=====

hsm_request
^^^^^^^^^^^
.hsm_request
[options="header"]
|=====
| type | field
| __u32 | hr_action
| __u32 | hr_archive_id
| __u64 | hr_flags
| __u32 | hr_itemcount
| __u32 | hr_data_len
|=====

hsm_state_set
^^^^^^^^^^^^^
.hsm_state_set
[options="header"]
|=====
| type | field
| __u32 | hss_valid
| __u32 | hss_archive_id
| __u64 | hss_setmask
| __u64 | hss_clearmask
|=====

hsm_user_item
^^^^^^^^^^^^^
.hsm_user_item
[options="header"]
|=====
| type | field
| lustre_fid | hui_fid
| struct hsm_extent | hui_extent
|=====

hsm_user_state
^^^^^^^^^^^^^^
.hsm_user_state
[options="header"]
|=====
| type | field
| __u32 | hus_states
| __u32 | hus_archive_id
| __u32 | hus_in_progress_state
| __u32 | hus_in_progress_action
| struct hsm_extent | hus_in_progress_location
| char | hus_extended_info[]
|=====

idx_info
^^^^^^^^
.idx_info
[options="header"]
|=====
| type | field
| __u32 | ii_magic
| __u32 | ii_flags
| __u16 | ii_count
| __u16 | ii_pad0
| __u32 | ii_attrs
| struct lu_fid | ii_fid
| __u64 | ii_version
| __u64 | ii_hash_start
| __u64 | ii_hash_end
| __u16 | ii_keysize
| __u16 | ii_recsize
| __u32 | ii_pad1
| __u64 | ii_pad2
| __u64 | ii_pad3
|=====

layout_intent
^^^^^^^^^^^^^
.layout_intent
[options="header"]
|=====
| type | field
| __u32 | li_opc
| __u32 | li_flags
| __u64 | li_start
| __u64 | li_end
|=====

ldlm_gl_lquota_desc
^^^^^^^^^^^^^^^^^^^
.ldlm_gl_lquota_desc
[options="header"]
|=====
| type | field
| union lquota_id | gl_id
| __u64 | gl_flags
| __u64 | gl_ver
| __u64 | gl_hardlimit
| __u64 | gl_softlimit
| __u64 | gl_time
| __u64 | gl_pad2
|=====

ldlm_intent
^^^^^^^^^^^
.ldlm_intent
[options="header"]
|=====
| type | field
| __u64 | opc
|=====

ldlm_lock_desc
^^^^^^^^^^^^^^
.ldlm_lock_desc
[options="header"]
|=====
| type | field
| struct ldlm_resource_desc | l_resource
| ldlm_mode_t | l_req_mode
| ldlm_mode_t | l_granted_mode
| ldlm_wire_policy_data_t | l_policy_data
|=====

ldlm_reply
^^^^^^^^^^
.ldlm_reply
[options="header"]
|=====
| type | field
| __u32 | lock_flags
| __u32 | lock_padding
| struct ldlm_lock_desc | lock_desc
| struct lustre_handle | lock_handle
| __u64 | lock_policy_res1
| __u64 | lock_policy_res2
|=====

ldlm_request
^^^^^^^^^^^^
.ldlm_request
[options="header"]
|=====
| type | field
| __u32 | lock_flags
| __u32 | lock_count
| struct ldlm_lock_desc | lock_desc
| struct lustre_handle | lock_handle[LDLM_LOCKREQ_HANDLES]
|=====

ldlm_res_id
^^^^^^^^^^^
.ldlm_res_id
[options="header"]
|=====
| type | field
| __u64 | name[RES_NAME_SIZE]
|=====

ldlm_resource_desc
^^^^^^^^^^^^^^^^^^
.ldlm_resource_desc
[options="header"]
|=====
| type | field
| ldlm_type_t | lr_type
| __u32 | lr_padding
| struct ldlm_res_id | lr_name
|=====

lfsck_reply
^^^^^^^^^^^
.lfsck_reply
[options="header"]
|=====
| type | field
| __u32 | lr_status
| __u32 | lr_padding_1
| __u64 | lr_padding_2
|=====

lfsck_request
^^^^^^^^^^^^^
.lfsck_request
[options="header"]
|=====
| type | field
| __u32 | lr_event
| __u32 | lr_index
| __u32 | lr_flags
| __u32 | lr_valid
| union __u32 | lr_speed, lr_status, lr_type
| __u16 | lr_version
| __u16 | lr_active
| __u16 | lr_param
| __u16 | lr_async_windows
| __u32 | lr_flags2
| struct lu_fid | lr_fid
| struct lu_fid | lr_fid2
| struct lu_fid | lr_fid3
| __u64 | lr_padding_1
| __u64 | lr_padding_2
|=====

ll_fiemap_info_key
^^^^^^^^^^^^^^^^^^
.ll_fiemap_info_key
[options="header"]
|=====
| type | field
| char | name[8]
| struct obdo | oa
| struct ll_user_fiemap | fiemap
|=====

ll_user_fiemap
^^^^^^^^^^^^^^
.ll_user_fiemap
[options="header"]
|=====
| type | field
| __u64 | fm_start
| __u64 | fm_length
| __u32 | fm_flags
| __u32 | fm_mapped_extents
| __u32 | fm_extent_count
| __u32 | fm_reserved
| struct ll_fiemap_extent | fm_extents[0]
|=====

ll_fiemap_extent
^^^^^^^^^^^^^^^^
.ll_fiemap_extent
[options="header"]
|=====
| type | field
| __u64 | fe_logical
| __u64 | fe_physical
| __u64 | fe_length
| __u64 | fe_reserved64[2]
| __u32 | fe_flags
| __u32 | fe_device
| __u32 | fe_reserved[2]
|=====

llog_cookie
^^^^^^^^^^^
.llog_cookie
[options="header"]
|=====
| type | field
| struct llog_logid | lgc_lgl
| __u32 | lgc_subsys
| __u32 | lgc_index
| __u32 | lgc_padding
|=====

llog_gen
^^^^^^^^
.llog_gen
[options="header"]
|=====
| type | field
| __u64 | mnt_cnt
| __u64 | conn_cnt
|=====

llog_log_hdr
^^^^^^^^^^^^
.llog_log_hdr
[options="header"]
|=====
| type | field
| struct llog_rec_hdr | llh_hdr
| __s64 | llh_timestamp
| __u32 | llh_count
| __u32 | llh_bitmap_offset
| __u32 | llh_size
| __u32 | llh_flags
| __u32 | llh_cat_idx
| struct obd_uuid | llh_tgtuuid
| __u32 | llh_reserved[LLOG_HEADER_SIZE/sizeof(__u32) - 23]
| __u32 | llh_bitmap[LLOG_BITMAP_BYTES/sizeof(__u32)]
| struct llog_rec_tail | llh_tail
|=====

llog_rec_hdr
^^^^^^^^^^^^
.llog_rec_hdr
[options="header"]
|=====
| type | field
| __u32 | lrh_len
| __u32 | lrh_index
| __u32 | lrh_type
| __u32 | lrh_id
|=====

llog_rec_tail
^^^^^^^^^^^^^
.llog_rec_tail
[options="header"]
|=====
| type | field
| __u32 | lrt_len
| __u32 | lrt_index
|=====

llog_logid
^^^^^^^^^^
.llog_logid
[options="header"]
|=====
| type | field
| struct ost_id | lgl_oi
| __u32 | lgl_ogen
|=====

llogd_body
^^^^^^^^^^
.llogd_body
[options="header"]
|=====
| type | field
| struct llog_logid | lgd_logid
| __u32 | lgd_ctxt_idx
| __u32 | lgd_llh_flags
| __u32 | lgd_index
| __u32 | lgd_saved_index
| __u32 | lgd_len
| __u64 | lgd_cur_offset
|=====

llogd_conn_body
^^^^^^^^^^^^^^^
.llogd_conn_body
[options="header"]
|=====
| type | field
| struct llog_gen | lgdc_gen
| struct llog_logid | lgdc_logid
| __u32 | lgdc_ctxt_idx
|=====

lov_mds_md_v1
^^^^^^^^^^^^^
.lov_mds_md_v1
[options="header"]
|=====
| type | field
| __u32 | lmm_magic
| __u32 | lmm_pattern
| struct ost_id | lmm_oi
| __u32 | lmm_stripe_size
| __u16 | lmm_stripe_count
| __u16 | lmm_layout_gen
| struct lov_ost_data | lmm_objects[0]
|=====

lov_ost_data
^^^^^^^^^^^^
.lov_ost_data
[options="header"]
|=====
| type | field
| struct ost_id | l_ost_oi
| __u32 | l_ost_gen
| __u32 | l_ost_idx
|=====

lu_fid
^^^^^^
.lu_fid
[options="header"]
|=====
| type | field
| __u64 | f_seq
| __u32 | f_oid
| __u32 | f_ver
|=====

lu_seq_range
^^^^^^^^^^^^
.lu_seq_range
[options="header"]
|=====
| type | field
| __u64 | lsr_start
| __u64 | lsr_end
| __u32 | lsr_index
| __u32 | lsr_flags
|=====

lustre_capa
^^^^^^^^^^^
.lustre_capa
[options="header"]
|=====
| type | field
| struct lu_fid | lc_fid
| __u64 | lc_opc
| __u64 | lc_uid
| __u64 | lc_gid
| __u32 | lc_flags
| __u32 | lc_keyid
| __u32 | lc_timeout
| __u32 | lc_expiry
| __u8 | lc_hmac[CAPA_HMAC_MAX_LEN]
|=====

lustre_handle
^^^^^^^^^^^^^
.lustre_handle
[options="header"]
|=====
| type | field
| __u64 | cookie
|=====

mdc_swap_layouts
^^^^^^^^^^^^^^^^
.mdc_swap_layouts
[options="header"]
|=====
| type | field
| __u64 | msl_flags
|=====

mdt_body
^^^^^^^^
.mdt_body
[options="header"]
|=====
| type | field
| struct lu_fid | mbo_fid1
| struct lu_fid | mbo_fid2
| struct lustre_handle | mbo_handle
| __u64 | mbo_valid
| __u64 | mbo_size
| __s64 | mbo_mtime
| __s64 | mbo_atime
| __s64 | mbo_ctime
| __u64 | mbo_blocks
| __u64 | mbo_ioepoch
| __u64 | mbo_t_state
| __u32 | mbo_fsuid
| __u32 | mbo_fsgid
| __u32 | mbo_capability
| __u32 | mbo_mode
| __u32 | mbo_uid
| __u32 | mbo_gid
| __u32 | mbo_flags
| __u32 | mbo_rdev
| __u32 | mbo_nlink
| __u32 | mbo_unused2
| __u32 | mbo_suppgid
| __u32 | mbo_eadatasize
| __u32 | mbo_aclsize
| __u32 | mbo_max_mdsize
| __u32 | mbo_max_cookiesize
| __u32 | mbo_uid_h
| __u32 | mbo_gid_h
| __u32 | mbo_padding_5
| __u64 | mbo_padding_6
| __u64 | mbo_padding_7
| __u64 | mbo_padding_8
| __u64 | mbo_padding_9
| __u64 | mbo_padding_10
|=====

mdt_ioepoch
^^^^^^^^^^^
.mdt_ioepoch
[options="header"]
|=====
| type | field
| struct lustre_handle | handle
| __u64 | ioepoch
| __u32 | flags
| __u32 | padding
|=====

mdt_rec_reint
^^^^^^^^^^^^^
.mdt_rec_reint
[options="header"]
|=====
| type | field
| __u32 | rr_opcode
| __u32 | rr_cap
| __u32 | rr_fsuid
| __u32 | rr_fsuid_h
| __u32 | rr_fsgid
| __u32 | rr_fsgid_h
| __u32 | rr_suppgid1
| __u32 | rr_suppgid1_h
| __u32 | rr_suppgid2
| __u32 | rr_suppgid2_h
| struct lu_fid | rr_fid1
| struct lu_fid | rr_fid2
| __s64 | rr_mtime
| __s64 | rr_atime
| __s64 | rr_ctime
| __u64 | rr_size
| __u64 | rr_blocks
| __u32 | rr_bias
| __u32 | rr_mode
| __u32 | rr_flags
| __u32 | rr_flags_h
| __u32 | rr_umask
| __u32 | rr_padding_4
|=====

mgs_config_res
^^^^^^^^^^^^^^
.mgs_config_res
[options="header"]
|=====
| type | field
| __u64 | mcr_offset
| __u64 | mcr_size
|=====

mgs_send_param
^^^^^^^^^^^^^^
.mgs_send_param
[options="header"]
|=====
| type | field
| char | mgs_param[MGS_PARAM_MAXLEN]
|=====

mgs_target_info
^^^^^^^^^^^^^^^
.mgs_target_info
[options="header"]
|=====
| type | field
| __u32 | mti_lustre_ver
| __u32 | mti_stripe_index
| __u32 | mti_config_ver
| __u32 | mti_flags
| __u32 | mti_nid_count
| __u32 | mti_instance
| char | mti_fsname[MTI_NAME_MAXLEN]
| char | mti_svname[MTI_NAME_MAXLEN]
| char | mti_uuid[sizeof(struct obd_uuid)]
| __u64 | mti_nids[MTI_NIDS_MAX]
| char | mti_params[MTI_PARAM_MAXLEN]
|=====

niobuf_remote
^^^^^^^^^^^^^
.niobuf_remote
[options="header"]
|=====
| type | field
| __u64 | rnb_offset
| __u32 | rnb_len
| __u32 | rnb_flags
|=====

obd_connect_data
^^^^^^^^^^^^^^^^
.obd_connect_data
[options="header"]
|=====
| type | field
| __u64 | ocd_connect_flags
| __u32 | ocd_version
| __u32 | ocd_grant
| __u32 | ocd_index
| __u32 | ocd_brw_size
| __u64 | ocd_ibits_known
| __u8 | ocd_blocksize
| __u8 | ocd_inodespace
| __u16 | ocd_grant_extent
| __u32 | ocd_unused
| __u64 | ocd_transno
| __u32 | ocd_group
| __u32 | ocd_cksum_types
| __u32 | ocd_max_easize
| __u32 | ocd_instance
| __u64 | ocd_maxbytes
| __u64 | padding1
| __u64 | padding2
| __u64 | padding3
| __u64 | padding4
| __u64 | padding5
| __u64 | padding6
| __u64 | padding7
| __u64 | padding8
| __u64 | padding9
| __u64 | paddingA
| __u64 | paddingB
| __u64 | paddingC
| __u64 | paddingD
| __u64 | paddingE
| __u64 | paddingF
|=====

obd_dqblk
^^^^^^^^^
.obd_dqblk
[options="header"]
|=====
| type | field
| __u64 | dqb_bhardlimit
| __u64 | dqb_bsoftlimit
| __u64 | dqb_curspace
| __u64 | dqb_ihardlimit
| __u64 | dqb_isoftlimit
| __u64 | dqb_curinodes
| __u64 | dqb_btime
| __u64 | dqb_itime
| __u32 | dqb_valid
| __u32 | dqb_padding
|=====

obd_dqinfo
^^^^^^^^^^
.obd_dqinfo
[options="header"]
|=====
| type | field
| __u64 | dqi_bgrace
| __u64 | dqi_igrace
| __u32 | dqi_flags
| __u32 | dqi_valid
|=====

obd_ioobj
^^^^^^^^^
.obd_ioobj
[options="header"]
|=====
| type | field
| struct ost_id | ioo_oid
| __u32 | ioo_max_brw
| __u32 | ioo_bufcnt
|=====

obd_quotactl
^^^^^^^^^^^^
.obd_quotactl
[options="header"]
|=====
| type | field
| __u32 | qc_cmd
| __u32 | qc_type
| __u32 | qc_id
| __u32 | qc_stat
| struct obd_dqinfo | qc_dqinfo
| struct obd_dqblk | qc_dqblk
|=====

obd_statfs
^^^^^^^^^^
.obd_statfs
[options="header"]
|=====
| type | field
| __u64 | os_type
| __u64 | os_blocks
| __u64 | os_bfree
| __u64 | os_bavail
| __u64 | os_files
| __u64 | os_ffree
| __u8 | os_fsid[40]
| __u32 | os_bsize
| __u32 | os_namelen
| __u64 | os_maxbytes
| __u32 | os_state
| __u32 | os_fprecreated
| __u32 | os_spare2
| __u32 | os_spare3
| __u32 | os_spare4
| __u32 | os_spare5
| __u32 | os_spare6
| __u32 | os_spare7
| __u32 | os_spare8
| __u32 | os_spare9
|=====

obd_uuid
^^^^^^^^
.obd_uuid
[options="header"]
|=====
| type | field
| char | uuid[UUID_MAX]
|=====

obdo
^^^^
.obdo
[options="header"]
|=====
| type | field
| __u64 | o_valid
| struct ost_id | o_oi
| __u64 | o_parent_seq
| __u64 | o_size
| __s64 | o_mtime
| __s64 | o_atime
| __s64 | o_ctime
| __u64 | o_blocks
| __u64 | o_grant
| __u32 | o_blksize
| __u32 | o_mode
| __u32 | o_uid
| __u32 | o_gid
| __u32 | o_flags
| __u32 | o_nlink
| __u32 | o_parent_oid
| __u32 | o_misc
| __u64 | o_ioepoch
| __u32 | o_stripe_idx
| __u32 | o_parent_ver
| struct lustre_handle | o_handle
| struct llog_cookie | o_lcookie
| __u32 | o_uid_h
| __u32 | o_gid_h
| __u64 | o_data_version
| __u64 | o_padding_4
| __u64 | o_padding_5
| __u64 | o_padding_6
|=====

ost_body
^^^^^^^^
.ost_body
[options="header"]
|=====
| type | field
| struct obdo | oa
|=====

ost_id
^^^^^^
.ost_id
[options="header"]
|=====
| type | field
| union struct __u64 | oi_id, oi_seq
| struct lu_fid | oi_fid
|=====

ptlrpc_body
^^^^^^^^^^^

Each buffer has additional structure imposed on it, and the first buffer always has the format given by a 'ptlrpc_body' structure.

.ptlrpc_body
[options="header"]
|=====
| type | field
| struct lustre_handle | pb_handle
| __u32 | pb_type
| __u32 | pb_version
| __u32 | pb_opc
| __u32 | pb_status
| __u64 | pb_last_xid
| __u64 | pb_last_seen
| __u64 | pb_last_committed
| __u64 | pb_transno
| __u32 | pb_flags
| __u32 | pb_op_flags
| __u32 | pb_conn_cnt
| __u32 | pb_timeout
| __u32 | pb_service_time
| __u32 | pb_limit
| __u64 | pb_slv
| __u64 | pb_pre_versions[PTLRPC_NUM_VERSIONS]
| __u64 | pb_padding[4]
| char | pb_jobid[LUSTRE_JOBID_SIZE]
|=====

A 'struct lustre_handle' contains a single 64-bit field called 'cookie' that ...

The semantics of each field may differ between request messages and replies.

'pb_handle' is a 64-bit value that uniquely identifies the shared state between a sender and a receiver. When communication is initiated, as in a "connect" message (e.g. MDS_CONNECT, from a client to a server), the value is 0. The reply (from the server back to the client) to this message contains a value (a "cookie") that identifies the shared state information (the "export") that the server maintains for the client. The client then associates this cookie with the shared state information (the "import") that it maintains about the server. Subsequent messages between this client and this server refer to the same shared state by using this cookie as the handle in this field.

'pb_type' is one of the three message types PTL_RPC_MSG_REQUEST, PTL_RPC_MSG_ERR, or PTL_RPC_MSG_REPLY. As one might expect, "request" and "reply" are the two usual message types, one for initiating an exchange and the other for completing it. The "err" message type is used only to respond to a PtlRPC message that could not be interpreted as an actual message. Note that other errors, such as those that emerge from processing the actual message content, do not use the PTL_RPC_MSG_ERR symbol.

'pb_version' is a field that encodes the Lustre protocol version in combination ('or'-ed) with one of the service type versions.
One of:

 LUSTRE_OBD_VERSION
 LUSTRE_MDS_VERSION
 LUSTRE_OST_VERSION
 LUSTRE_DLM_VERSION
 LUSTRE_LOG_VERSION
 LUSTRE_MGS_VERSION

What exactly is the significance of these?

'pb_opc' gives the actual operation that is the subject of this PtlRPC. There is a long list of such "op codes". List a few.

'pb_status' allows for the return of a status code or error code (e.g. "permission denied"). It reports an error that arose while processing the particular pb_opc. The other way to return an error applies when an RPC could not even be interpreted, which results in a pb_type of PTL_RPC_MSG_ERR. A value of zero signifies that the request was successfully executed. Note that for operations that modify the file system this indicates that the operation has been initiated, not necessarily completed (cf. pb_last_committed). The actual status values are consistent with standard Linux kernel (POSIX) error codes (e.g. ENOENT). This field is always zero in requests.

'pb_last_xid' is not used.

'pb_last_seen' is not used.

'pb_last_committed' is the highest transaction number that has been committed to storage. Transaction numbers are maintained on a per-target basis, and each such sequence is monotonically increasing. This field is only set in reply messages and can accompany any kind of message, including pings and non-modifying transactions.

'pb_transno' is the 64-bit number that the server assigns to any operation that modifies the file system. It is unique for all time within each target of that server. It is zero for any message that does not modify the file system.
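
The way 'pb_type', 'pb_status', 'pb_transno', and 'pb_last_committed' combine on the receiving side can be sketched in C. This is only an illustration of the descriptions above, not Lustre client code: the `reply_view` structure, the `classify_reply` function, and the `SKETCH_MSG_*` values are hypothetical stand-ins (the real PTL_RPC_MSG_* constants are defined in the Lustre headers), and treating "pb_transno no greater than pb_last_committed" as "committed" is an inference from the field descriptions.

```c
#include <assert.h>
#include <stdint.h>

/* Placeholder values for illustration only; the real PTL_RPC_MSG_*
 * constants come from the Lustre headers. */
enum sketch_msg_type {
    SKETCH_MSG_REQUEST = 1,
    SKETCH_MSG_ERR     = 2,
    SKETCH_MSG_REPLY   = 3
};

/* Hypothetical view of the four ptlrpc_body fields this sketch reads. */
struct reply_view {
    uint32_t pb_type;
    uint32_t pb_status;
    uint64_t pb_last_committed;
    uint64_t pb_transno;
};

/* The view itself is 4 + 4 + 8 + 8 bytes with no hidden padding. */
static_assert(sizeof(struct reply_view) == 24, "unexpected padding");

enum reply_outcome {
    REPLY_UNPARSED, /* the server could not interpret the request */
    REPLY_FAILED,   /* the request was parsed, but processing failed */
    REPLY_PENDING,  /* a modification was executed but is not yet committed */
    REPLY_DONE      /* non-modifying, or the modification is committed */
};

static enum reply_outcome classify_reply(const struct reply_view *r)
{
    if (r->pb_type == SKETCH_MSG_ERR)
        return REPLY_UNPARSED;
    if (r->pb_status != 0)
        return REPLY_FAILED;
    /* pb_transno is zero for messages that do not modify the file system. */
    if (r->pb_transno != 0 && r->pb_transno > r->pb_last_committed)
        return REPLY_PENDING;
    return REPLY_DONE;
}
```

For example, a reply with 'pb_transno' 12 and 'pb_last_committed' 10 would be classified as `REPLY_PENDING`, while the same reply with 'pb_last_committed' 12 would be `REPLY_DONE`.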

'pb_flags' is one among:

 MSG_LAST_REPLAY
 MSG_RESENT
 MSG_REPLAY
 MSG_DELAY_REPLAY
 MSG_VERSION_REPLAY
 MSG_REQ_REPLAY_DONE
 MSG_LOCK_REPLAY_DONE

'pb_op_flags' is one among:

 MSG_CONNECT_RECOVERING  0x00000001
 MSG_CONNECT_RECONNECT   0x00000002
 MSG_CONNECT_REPLAYABLE  0x00000004
 MSG_CONNECT_LIBCLIENT   0x00000010
 MSG_CONNECT_INITIAL     0x00000020
 MSG_CONNECT_ASYNC       0x00000040
 MSG_CONNECT_NEXT_VER    0x00000080 /* use next version of lustre_msg */
 MSG_CONNECT_TRANSNO     0x00000100 /* report transno */

'pb_conn_cnt' is a monotonically increasing number that identifies to the server the connection era of the client that was current when the message was constructed. The era for a client is the portion of the shared state that reflects its connection count. This count is initialized to one at the first connection, and each subsequent eviction and reconnect event increments it. This enables the server to discard requests from clients whose era has expired.

'pb_timeout' in a request tells how long the client is willing to wait for its specific reply message. In the reply, it signifies how long the service is estimated to take for this type of request (op code). There are multiple request queues, called "portals". The server may send an "early reply" for the express purpose of extending the client's timeout. Such an "early reply" will still be followed by the actual reply.

'pb_service_time' is how long this particular operation actually took, from the time it first arrived in the request queue to the time the server replied. Note that the client can use this value and the local elapsed time to calculate network latency.

'pb_limit' is a value, in a reply message, sent from a lock server to a client to set the maximum number of locks available to the client. When dynamic lock LRUs are enabled, this allows for managing their sizes.

'pb_slv' is the "server lock volume", which is the product of the number of locks and their age. It is used to estimate the lock traffic load.
In the reply it is this client's share of the total lock load on the server. It is prescriptive.

'pb_pre_versions[PTLRPC_NUM_VERSIONS]' has up to four entries (PTLRPC_NUM_VERSIONS = 4). The values are sent in reply messages. Each entry returns the previous version of an object modified by this operation. The version being communicated is the transaction number (pb_transno) of the request that last modified that object.

'pb_padding[4]' is reserved for future use and must also respect the 8-byte alignment requirement.

'pb_jobid[LUSTRE_JOBID_SIZE]' gives a unique identifier associated with the process on behalf of which this message was generated. The identifier is assigned to the user process by a job scheduler, if any.

quota_body
^^^^^^^^^^
.quota_body
[options="header"]
|=====
| type | field
| struct lu_fid | qb_fid
| union lquota_id | qb_id
| __u32 | qb_flags
| __u32 | qb_padding
| __u64 | qb_count
| __u64 | qb_usage
| __u64 | qb_slv_ver
| struct lustre_handle | qb_lockh
| struct lustre_handle | qb_glb_lockh
| __u64 | qb_padding1[4]
|=====
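
Returning to two of the 'ptlrpc_body' fields described earlier in this section, the following C sketch illustrates how 'pb_conn_cnt' lets a server discard requests from an expired era and how a client might use 'pb_service_time' to estimate time spent on the network. Both function names are hypothetical, the server-side count passed as `export_conn_cnt` is an assumed piece of per-client state, and the sketch assumes the elapsed time and 'pb_service_time' are expressed in the same unit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Server side (hypothetical helper): a request is stale when it was built
 * in an earlier connection era than the one the server currently records
 * for this client. The count starts at one at the first connection. */
static bool request_is_stale(uint32_t export_conn_cnt, uint32_t pb_conn_cnt)
{
    assert(export_conn_cnt >= 1);
    return pb_conn_cnt < export_conn_cnt;
}

/* Client side (hypothetical helper): the locally measured elapsed time
 * minus the server's reported service time approximates the time the
 * request and reply spent on the network. Clamps to zero if the reported
 * service time exceeds the local measurement. */
static uint32_t estimate_network_time(uint32_t local_elapsed,
                                      uint32_t pb_service_time)
{
    if (local_elapsed <= pb_service_time)
        return 0;
    return local_elapsed - pb_service_time;
}
```

With these helpers, a request carrying 'pb_conn_cnt' 2 would be discarded once the server's count for that client has reached 3, and a reply that took 10 units locally with a 'pb_service_time' of 7 suggests roughly 3 units of network time for the round trip.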