Whamcloud - gitweb
LU-16297 ptlrpc: don't panic during reconnection
authorAlexander Boyko <alexander.boyko@hpe.com>
Thu, 3 Nov 2022 11:23:20 +0000 (07:23 -0400)
committerAndreas Dilger <adilger@whamcloud.com>
Sat, 24 Feb 2024 03:47:15 +0000 (03:47 +0000)
commitbf198406102d2f653243f637d3541846b67c42e5
tree4546441fd120f870fdaa14f6372a8fb17a07136d
parenta94c5c0f3b8363aaf82264094047a095d76dcf49
LU-16297 ptlrpc: don't panic during reconnection

ptlrpc_send_rpc() could race with ptlrpc_connect_import_locked()
in the middle of assertion check and this leads to a wrong panic.
Assertion checks

(AT_OFF || imp->imp_state != LUSTRE_IMP_FULL ||

reconnect changes import state and flags
and second part

(imp->imp_msghdr_flags & MSGHDR_AT_SUPPORT) ||
!(imp->imp_connect_data.ocd_connect_flags & OBD_CONNECT_AT)))

MSGHDR_AT_SUPPORT is disabled during client reconnection.
It is not good to use locking at this hot part, so fix changes
assertion to a report.

Lustre-change: https://review.whamcloud.com/49029
Lustre-commit: df31c4c0b39b8845911344e6fadc008bcba40bb1

HPE-bug-id: LUS-10985
Signed-off-by: Alexander Boyko <alexander.boyko@hpe.com>
Change-Id: Ifc9e413c679c3e8a4c8f4f541251bebabae41c82
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alexander Zarochentsev <alexander.zarochentsev@hpe.com>
Reviewed-by: Mikhail Pershin <mpershin@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/54086
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
lustre/ptlrpc/niobuf.c