LU-17357 mgc: wait for sptlrpc config log The sptlrpc config log is mandatory to establish connections to targets with proper security context. So wait for its retrieval. Add sanity-sec test_68 to exercise this, and improve test_32 for mgssec. Signed-off-by: Sebastien Buisson <sbuisson@ddn.com> Change-Id: I5352e926dc6a9a68db1224629c68a42b74bee8a4 Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/53423 Reviewed-by: Mikhail Pershin <mpershin@whamcloud.com> Reviewed-by: Andreas Dilger <adilger@whamcloud.com> Reviewed-by: Oleg Drokin <green@whamcloud.com> Tested-by: jenkins <devops@whamcloud.com> Tested-by: Maloo <maloo@whamcloud.com>
LU-6142 ptlrpc: Fix style issues for niobuf.c This patch fixes issues reported by checkpatch for file lustre/ptlrpc/niobuf.c Test-Parameters: trivial Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com> Change-Id: I2b431ef591fe3e920e57ce173250e600dc3b5f1f Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54061 Reviewed-by: Andreas Dilger <adilger@whamcloud.com> Reviewed-by: Timothy Day <timday@amazon.com> Reviewed-by: Oleg Drokin <green@whamcloud.com> Tested-by: jenkins <devops@whamcloud.com> Tested-by: Maloo <maloo@whamcloud.com>
LU-15246 ptlrpc: per-device adaptive timeout parameters When a client is mounting multiple filesystems with different MGSes setting global parameters at_min, at_max, etc., then the settings from one filesystem's MGS config will also apply to RPCs sent for the OSC, MDC, and MGC devices on the other filesystem(s). Typically the settings of the last filesystem to mount on the client override the earlier values, and there is no way to separate them. Moving the parameters to be per-device values allows them to be set independently for each set of client devices, so that the client can interact properly with each set of servers. This allows e.g. different timeouts for local and remote mounts, or for flash and HDD filesystems that have different load and performance. Add per-device adaptive timeout parameters that can optionally replace global parameters of the same name: at_min -> *.<fsname>*.at_min at_max -> *.<fsname>*.at_max at_history -> *.<fsname>*.at_history ldlm_enqueue_min -> *.<fsname>*.ldlm_enqueue_min These parameters should always be set with fsname in the device name, rather than pure wildcard '*' device names, or they will behave the same as the global parameters in the end (settings from one MGS will apply to devices on other filesystems). That is a bug in how "lctl set_param -P" works, but will be fixed separately. Signed-off-by: Lei Feng <flei@whamcloud.com> Change-Id: I5b04c9aa53a446fb5a78bfaff372b4f236c9eb8a Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/45598 Tested-by: jenkins <devops@whamcloud.com> Tested-by: Maloo <maloo@whamcloud.com> Reviewed-by: Andreas Dilger <adilger@whamcloud.com> Reviewed-by: Mikhail Pershin <mpershin@whamcloud.com> Reviewed-by: Oleg Drokin <green@whamcloud.com>
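The per-device override described in LU-15246 can be illustrated with a minimal sketch. This is not the code from the patch: the struct names, the `has_own_at` flag, and both helper functions are hypothetical, and the sketch only assumes that a device with its own settings uses them in place of the global at_min/at_max when bounding a timeout estimate.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified parameter set; not a real Lustre structure. */
struct at_params_sketch {
	unsigned int at_min;	/* lower bound for the timeout, in seconds */
	unsigned int at_max;	/* upper bound for the timeout, in seconds */
};

/* Hypothetical client device (think OSC/MDC/MGC of one filesystem). */
struct device_sketch {
	const char *name;
	bool has_own_at;		/* assumption: set when per-device values exist */
	struct at_params_sketch at;	/* per-device values, valid if has_own_at */
};

/* Pick the per-device parameters when present, else fall back to global. */
static const struct at_params_sketch *
effective_at(const struct device_sketch *dev,
	     const struct at_params_sketch *global)
{
	assert(dev != NULL && global != NULL);
	return dev->has_own_at ? &dev->at : global;
}

/* Bound a service-time estimate by the chosen at_min/at_max. */
static unsigned int clamp_timeout(const struct at_params_sketch *p,
				  unsigned int estimate)
{
	assert(p != NULL);
	if (estimate < p->at_min)
		return p->at_min;
	if (estimate > p->at_max)
		return p->at_max;
	return estimate;
}
```

With this shape, devices of a flash filesystem can carry tight bounds while devices of another filesystem keep following the global values, which is the separation the commit message asks for.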
LU-16483 tests: replay-single test_200 fixes Modify test to ensure idle disconnect is enabled for all targets except OST0000. This prevents an issue where an idle ping is sent to another target instead of OST0000. Re-work test to check the debug log for all relevant messages. rcli is not set correctly when RCLIENTS contains multiple hostnames. Fix it by not surrounding RCLIENTS with double quotes. Added a debug statement to ptl_send_rpc(), and moved an existing one, to facilitate debugging any future test failures. Test-Parameters: trivial clients=3 testlist=replay-single env=ONLY=200,ONLY_REPEAT=100 Fixes: eb1f4a5 ("LU-16483 ptlrpc: Track highest reply XID") Signed-off-by: Chris Horn <chris.horn@hpe.com> Change-Id: If0a214092dad1e40f1b9e785e179ef67f686b85a Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50891 Tested-by: jenkins <devops@whamcloud.com> Tested-by: Maloo <maloo@whamcloud.com> Reviewed-by: Andreas Dilger <adilger@whamcloud.com> Reviewed-by: Alex Deiter <alex.deiter@gmail.com> Reviewed-by: Oleg Drokin <green@whamcloud.com>
LU-12610 ptlrpc: replace OBD_ -> CFS_ macros Replace OBD macros that are simply redefinitions of CFS macros. Signed-off-by: Timothy Day <timday@amazon.com> Signed-off-by: Ben Evans <beevans@whamcloud.com> Change-Id: I634f364d33ac56de678d273d87c9ac54d1f8c1ef Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50684 Reviewed-by: Andreas Dilger <adilger@whamcloud.com> Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com> Reviewed-by: Neil Brown <neilb@suse.de> Reviewed-by: James Simmons <jsimmons@infradead.org> Reviewed-by: Oleg Drokin <green@whamcloud.com> Tested-by: jenkins <devops@whamcloud.com> Tested-by: Maloo <maloo@whamcloud.com>
LU-16483 ptlrpc: Track highest reply XID Keep track of the highest XID that we've received a reply for. When an OBD_PING expires, do not disconnect the import if the failed XID is less than or equal to the last reply XID. This avoids a situation where a lost OBD_PING RPC causes a reconnect even though we've completed other RPCs in the meantime. HPE-bug-id: LUS-11474 Signed-off-by: Chris Horn <chris.horn@hpe.com> Change-Id: I7e66bcc1368fa41ec86ffd843abac676f8d29254 Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49807 Tested-by: jenkins <devops@whamcloud.com> Tested-by: Maloo <maloo@whamcloud.com> Reviewed-by: Andreas Dilger <adilger@whamcloud.com> Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com> Reviewed-by: Oleg Drokin <green@whamcloud.com>
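The rule stated in LU-16483 (ignore an expired OBD_PING whose XID is not newer than the last reply seen) can be sketched as follows. The struct and function names here are hypothetical stand-ins chosen for illustration, not the fields or helpers the patch actually adds.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified import; not the real struct obd_import. */
struct import_sketch {
	uint64_t last_reply_xid;	/* highest XID a reply was received for */
};

/* Record a received reply, keeping only the highest XID seen so far. */
static void note_reply(struct import_sketch *imp, uint64_t xid)
{
	assert(imp != NULL);
	if (xid > imp->last_reply_xid)
		imp->last_reply_xid = xid;
}

/*
 * Decide whether an expired ping should trigger a disconnect. If a reply
 * for an equal or higher XID has already arrived, the connection has made
 * progress since the ping was sent, so the expiry is ignored.
 */
static bool ping_expiry_should_disconnect(const struct import_sketch *imp,
					  uint64_t failed_xid)
{
	assert(imp != NULL);
	return failed_xid > imp->last_reply_xid;
}
```

The comparison is deliberately `>` rather than `>=`, matching the "less than or equal to" wording of the commit message for the case that must not disconnect.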
LU-16297 ptlrpc: don't panic during reconnection ptlrpc_send_rpc() could race with ptlrpc_connect_import_locked() in the middle of the assertion check, which leads to a spurious panic. The assertion checks (AT_OFF || imp->imp_state != LUSTRE_IMP_FULL || (imp->imp_msghdr_flags & MSGHDR_AT_SUPPORT) || !(imp->imp_connect_data.ocd_connect_flags & OBD_CONNECT_AT)). A reconnect changes the import state and flags between the evaluation of the imp_state check and the second part of the expression, and MSGHDR_AT_SUPPORT is disabled during client reconnection. It is not good to use locking in this hot path, so the fix changes the assertion to a report. HPE-bug-id: LUS-10985 Signed-off-by: Alexander Boyko <alexander.boyko@hpe.com> Change-Id: Ifc9e413c679c3e8a4c8f4f541251bebabae41c82 Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49029 Tested-by: jenkins <devops@whamcloud.com> Tested-by: Maloo <maloo@whamcloud.com> Reviewed-by: Andreas Dilger <adilger@whamcloud.com> Reviewed-by: Alexander Zarochentsev <alexander.zarochentsev@hpe.com> Reviewed-by: Mikhail Pershin <mpershin@whamcloud.com> Reviewed-by: Oleg Drokin <green@whamcloud.com>
LU-10391 ptlrpc: pass lnet_nid for self to ptl_send_buf() The 'self' arg to ptl_send_buf() is now a pointer to a 'struct lnet_nid', and can be NULL meaning "ANY NID". LNetPut() already accepts NULL as the self pointer. Signed-off-by: Mr NeilBrown <neilb@suse.de> Change-Id: I859dfa10e2f5e50c029c6926fe25ac036fb4f494 Reviewed-on: https://review.whamcloud.com/44641 Tested-by: jenkins <devops@whamcloud.com> Tested-by: Maloo <maloo@whamcloud.com> Reviewed-by: Frank Sehr <fsehr@whamcloud.com> Reviewed-by: James Simmons <jsimmons@infradead.org> Reviewed-by: Chris Horn <chris.horn@hpe.com> Reviewed-by: Oleg Drokin <green@whamcloud.com>
LU-10391 ptlrpc: change rq_source to struct lnet_nid rq_source in struct ptlrpc_request can now store large NIDs. ptl_send_buf() now takes a struct lnet_processid for the peer. Signed-off-by: Mr NeilBrown <neilb@suse.de> Change-Id: I2fe7da2332955c69f6252d44fb3ae28d2ef4e517 Reviewed-on: https://review.whamcloud.com/44639 Tested-by: jenkins <devops@whamcloud.com> Tested-by: Maloo <maloo@whamcloud.com> Reviewed-by: Frank Sehr <fsehr@whamcloud.com> Reviewed-by: James Simmons <jsimmons@infradead.org> Reviewed-by: Chris Horn <chris.horn@hpe.com> Reviewed-by: Oleg Drokin <green@whamcloud.com>
LU-10391 ptlrpc: change rq_peer to struct lnet_nid rq_peer in struct ptlrpc_request can now store large NIDs. ptlrpc_connection_get() and others now take a struct lnet_processid Signed-off-by: Mr NeilBrown <neilb@suse.de> Change-Id: I3bb419720434714301946d278413ce6090aa2cdd Reviewed-on: https://review.whamcloud.com/44638 Tested-by: jenkins <devops@whamcloud.com> Tested-by: Maloo <maloo@whamcloud.com> Reviewed-by: Frank Sehr <fsehr@whamcloud.com> Reviewed-by: James Simmons <jsimmons@infradead.org> Reviewed-by: Chris Horn <chris.horn@hpe.com> Reviewed-by: Oleg Drokin <green@whamcloud.com>
LU-10391 ptlrpc: change rq_self to struct lnet_nid rq_self in struct ptlrpc_request can now store large NIDs. ptlrpc_connection_get() is also changed to receive a 'struct lnet_nid'. Signed-off-by: Mr NeilBrown <neilb@suse.de> Change-Id: If2ea7770e967e2f044f2b2300950b612463e130c Reviewed-on: https://review.whamcloud.com/44636 Tested-by: jenkins <devops@whamcloud.com> Tested-by: Maloo <maloo@whamcloud.com> Reviewed-by: Frank Sehr <fsehr@whamcloud.com> Reviewed-by: James Simmons <jsimmons@infradead.org> Reviewed-by: Chris Horn <chris.horn@hpe.com> Reviewed-by: Oleg Drokin <green@whamcloud.com>
LU-15451 sec: read-only nodemap flag Add a new 'readonly_mount' property to nodemaps. When set, we return -EROFS from server side if the client is not mounting read-only. So the client will have to specify the read-only mount option to be allowed to mount. Fixes: 928714dddabb ("LU-5092 nodemap: save id maps to targets in new index file") Signed-off-by: Sebastien Buisson <sbuisson@ddn.com> Change-Id: I9931844ae46dfd5d724f592f8dfacc4a8011c7e3 Reviewed-on: https://review.whamcloud.com/46149 Tested-by: jenkins <devops@whamcloud.com> Tested-by: Maloo <maloo@whamcloud.com> Reviewed-by: Andreas Dilger <adilger@whamcloud.com> Reviewed-by: James Simmons <jsimmons@infradead.org> Reviewed-by: Oleg Drokin <green@whamcloud.com>
LU-10391 lnet: change LNetGet to take 16byte nid and pid. "self" is now passed to LNetGet as a pointer to a 16-byte-addr nid, or NULL for "ANY". "target" is passed as a 16-bytes-addr process_id. Test-Parameters: trivial Test-Parameters: serverversion=2.12 serverdistro=el7.9 testlist=runtests Test-Parameters: clientversion=2.12 testlist=runtests Signed-off-by: Mr NeilBrown <neilb@suse.de> Change-Id: I8e0fcd442d5195991b799a8db3ec8030c81f9400 Reviewed-on: https://review.whamcloud.com/43620 Reviewed-by: James Simmons <jsimmons@infradead.org> Reviewed-by: Chris Horn <chris.horn@hpe.com> Tested-by: jenkins <devops@whamcloud.com> Tested-by: Maloo <maloo@whamcloud.com> Reviewed-by: Oleg Drokin <green@whamcloud.com>
LU-10391 lnet: convert LNetPut to take 16byte nid and pid. LNetPut() now takes a 16byte nid for self and similar process_id for target. Test-Parameters: trivial Test-Parameters: serverversion=2.12 serverdistro=el7.9 testlist=runtests Test-Parameters: clientversion=2.12 testlist=runtests Signed-off-by: Mr NeilBrown <neilb@suse.de> Change-Id: I240caf6fb8b02b1814b9d4883aceda33894786a4 Reviewed-on: https://review.whamcloud.com/43619 Reviewed-by: James Simmons <jsimmons@infradead.org> Reviewed-by: Chris Horn <chris.horn@hpe.com> Reviewed-by: Amir Shehata <ashehata@whamcloud.com> Tested-by: jenkins <devops@whamcloud.com> Tested-by: Maloo <maloo@whamcloud.com> Reviewed-by: Oleg Drokin <green@whamcloud.com>
LU-10391 lnet: use large nids in struct lnet_event All nids, including those in process_id, are changed to struct lnet_nid / struct lnet_processid. Test-Parameters: trivial Test-Parameters: serverversion=2.12 serverdistro=el7.9 testlist=runtests Test-Parameters: clientversion=2.12 testlist=runtests Signed-off-by: Mr NeilBrown <neilb@suse.de> Change-Id: I799dbbc22f7cfe403f07eb22f4bfc4e4b5dc23ea Reviewed-on: https://review.whamcloud.com/43600 Tested-by: jenkins <devops@whamcloud.com> Reviewed-by: Andreas Dilger <adilger@whamcloud.com> Tested-by: Maloo <maloo@whamcloud.com> Reviewed-by: James Simmons <jsimmons@infradead.org> Reviewed-by: Oleg Drokin <green@whamcloud.com>
LU-12678 ptlrpc: remove bogus LASSERT In the error case, it isn't possible for rc to be both -ENOMEM and 0 at the same time, so remove the incorrect LASSERT(rc == 0) to avoid crashing the system on an allocation failure. Improve error messages to conform to code style. Fixes: ceeeae4271fd ("LU-12678 lnet: me: discard struct lnet_handle_me") Signed-off-by: Andreas Dilger <adilger@whamcloud.com> Change-Id: I61ac5d735d7b2658dae76213a2d40cbfd2bb8bb9 Reviewed-on: https://review.whamcloud.com/45421 Tested-by: jenkins <devops@whamcloud.com> Tested-by: Maloo <maloo@whamcloud.com> Reviewed-by: James Simmons <jsimmons@infradead.org> Reviewed-by: Chris Horn <chris.horn@hpe.com> Reviewed-by: Oleg Drokin <green@whamcloud.com>
LU-10391 lnet: switch to large lnet_processid for matching Change lnet_libhandle.me_match_id and lnet_match_info.mi_id to struct lnet_processid, so they support large nids. This requires changing LNetMEAttach(), lnet_mt_match_head(), lnet_mt_of_attach(), lnet_ptl_match_type(), lnet_match2mt() to accept a pointer to lnet_processid rather than an lnet_process_id. Test-Parameters: trivial Test-Parameters: serverversion=2.12 serverdistro=el7.9 testlist=runtests Test-Parameters: clientversion=2.12 testlist=runtests Signed-off-by: Mr NeilBrown <neilb@suse.de> Change-Id: I6957b467bb9af84e20a4525db6351694f4d2a7af Reviewed-on: https://review.whamcloud.com/43597 Reviewed-by: James Simmons <jsimmons@infradead.org> Reviewed-by: Chris Horn <chris.horn@hpe.com> Tested-by: jenkins <devops@whamcloud.com> Tested-by: Maloo <maloo@whamcloud.com> Reviewed-by: Amir Shehata <ashehata@whamcloud.com> Reviewed-by: Oleg Drokin <green@whamcloud.com>
LU-14594 ptlrpc: do not match reply with resent RPC The server is able to filter by the connection ID, and drop late coming RPCs of previous connections, however it does not happen for replies. At the same time, this is a problem in some cases. Allocate new matchbits for resends and check replies by them, instead of xid. Connect RPCs are exceptions due to interop with old server - at the time of connect we do not know yet if the server supports it. Signed-off-by: Vitaly Fertman <c17818@cray.com> Change-Id: I2aad037002b488b0c3371544ede0c47940f87efe HPE-bug-id: LUS-9596 Reviewed-on: https://es-gerrit.dev.cray.com/158446 Reviewed-by: Alexey Lyashkov <c17817@cray.com> Reviewed-by: Andriy Skulysh <c17819@cray.com> Tested-by: Elena Gryaznova <c17455@cray.com> Reviewed-on: https://review.whamcloud.com/43242 Reviewed-by: Andreas Dilger <adilger@whamcloud.com> Tested-by: jenkins <devops@whamcloud.com> Tested-by: Maloo <maloo@whamcloud.com> Reviewed-by: Mike Pershin <mpershin@whamcloud.com> Reviewed-by: Oleg Drokin <green@whamcloud.com>
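The matching scheme in LU-14594 can be sketched roughly as below. This is an illustration under stated assumptions, not the patch itself: the struct, the `next_mbits` counter, and both functions are hypothetical, and the sketch assumes that a first send uses the XID as its match bits while a resend of a non-connect RPC gets freshly allocated match bits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified request; not the real struct ptlrpc_request. */
struct req_sketch {
	uint64_t xid;		/* identifies the RPC, unchanged across resends */
	uint64_t mbits;		/* match bits the reply is expected under */
	bool is_connect;	/* connect RPCs keep mbits == xid for interop */
};

/*
 * Choose match bits before (re)sending. next_mbits is a hypothetical
 * allocator state that hands out values not used by earlier sends.
 */
static void prepare_send(struct req_sketch *req, bool resend,
			 uint64_t *next_mbits)
{
	assert(req != NULL && next_mbits != NULL);
	if (resend && !req->is_connect)
		req->mbits = (*next_mbits)++;	/* fresh bits for a resend */
	else
		req->mbits = req->xid;		/* first send or connect RPC */
}

/* A reply is accepted only if it arrives under the current match bits. */
static bool reply_matches(const struct req_sketch *req, uint64_t reply_mbits)
{
	assert(req != NULL);
	return reply_mbits == req->mbits;
}
```

Under this scheme a late reply to the original send still carries the old match bits and no longer matches the resent request, which is the behaviour the commit message describes.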
LU-14487 modules: remove references to Sun Trademark. "lustre" is no longer a Trademark of Sun Microsystems. There is no need to acknowledge the trademark in every file, so just remove all these claims. Test-Parameters: trivial Signed-off-by: Mr NeilBrown <neilb@suse.de> Change-Id: I66941494eabc54bedf85079c5b85701187f2a8f1 Reviewed-on: https://review.whamcloud.com/42139 Tested-by: jenkins <devops@whamcloud.com> Tested-by: Maloo <maloo@whamcloud.com> Reviewed-by: Aurelien Degremont <degremoa@amazon.com> Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
LU-13368 lnet: discard the callback Lustre needs a completion callback for the event that a request has been sent, and another callback when the reply arrives. Sometimes the request completion callback may be lost for some reason even though the reply has been received, and the system then waits forever, even past the timeout. There is no need to wait for request completion in that case, so provide a way to discard the callback. Signed-off-by: Yang Sheng <ys@whamcloud.com> Change-Id: If9cd8420ee76947ee5053180e0f5219f76bb94c2 Reviewed-on: https://review.whamcloud.com/38845 Tested-by: jenkins <devops@whamcloud.com> Tested-by: Maloo <maloo@whamcloud.com> Reviewed-by: Amir Shehata <ashehata@whamcloud.com> Reviewed-by: Cyril Bordage <cbordage@whamcloud.com> Reviewed-by: Oleg Drokin <green@whamcloud.com>