Whamcloud - gitweb
LU-17484 gss: reply error for SEC_CTX_INIT on wrong node
authorSebastien Buisson <sbuisson@ddn.com>
Thu, 8 Feb 2024 12:44:21 +0000 (13:44 +0100)
committerAndreas Dilger <adilger@whamcloud.com>
Sat, 24 Feb 2024 03:46:45 +0000 (03:46 +0000)
commit10f98eb8a95a9efc7a123b69d2f05cf0f8ac2fb7
tree1e00273dcde8b6169ceeace8ad3221a329c6f668
parent5c0d597e24203c4364cdf03be34860a28824feab
LU-17484 gss: reply error for SEC_CTX_INIT on wrong node

When a server receives a SEC_CTX_INIT request for a target that is not
available (either stopping, or not set up yet, or moved to a failover
node), the request gets dropped. This makes the client-side RPC time
out, increasing the time it takes to establish a proper gss context
with the target, because it slows down the HA mechanism that tries
alternate failover NIDs.
Instead of dropping the request reply for SEC_CTX_INIT, the server
needs to send back a proper error reply. The client will then be able
to immediately try alternate failover NIDs, speeding mount/reconnect
process up, and avoiding potential eviction.

Lustre-change: https://review.whamcloud.com/53970
Lustre-commit: 3d635dd3f24421c181aca5673cd81ed8f3e2c622

Test-Parameters: trivial
Test-Parameters: kerberos=true testlist=sanity-krb5
Test-Parameters: testgroup=review-dne-selinux-ssk-part-2
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: Id2cefaa7d54729a63c7be13b65d7ace579bcaa78
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/54157
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
lustre/ldlm/ldlm_lib.c
lustre/ptlrpc/gss/lproc_gss.c
lustre/ptlrpc/gss/sec_gss.c
lustre/ptlrpc/import.c
lustre/utils/gss/svcgssd_proc.c