From: Chris Horn
Date: Tue, 5 Dec 2023 09:56:57 +0000 (-0600)
Subject: LU-14810 lnet: Cancel discovery ping/push on shutdown
X-Git-Url: https://git.whamcloud.com/?a=commitdiff_plain;h=5844ace62c8b53fe928048585050b166a37713ba;p=fs%2Flustre-release.git

LU-14810 lnet: Cancel discovery ping/push on shutdown

Discovery shutdown can race with ping and push events. In some cases
this can result in failing to unlink ping/push MDs on shutdown.
Protect against this by checking for PING/PUSH_FAILED state on peers
on the request queue.

Lustre-change: https://review.whamcloud.com/53356
Lustre-commit: c3b9597742d5118a96f56129e7dd30d84468d2c8

Test-Parameters: trivial
Test-Parameters: testlist=sanity-lnet env=ONLY=500,ONLY_REPEAT=50
Signed-off-by: Chris Horn
Change-Id: I84a1f5beb6508651bc62e1dd93271f9e72f5081c
Reviewed-by: Frank Sehr
Reviewed-by: James Simmons
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53848
Tested-by: jenkins
Tested-by: Maloo
Reviewed-by: Andreas Dilger
---

diff --git a/lnet/lnet/peer.c b/lnet/lnet/peer.c
index 7833576..7362e0c 100644
--- a/lnet/lnet/peer.c
+++ b/lnet/lnet/peer.c
@@ -3873,6 +3873,14 @@ static int lnet_peer_discovery(void *arg)
 	while (!list_empty(&the_lnet.ln_dc_request)) {
 		lp = list_first_entry(&the_lnet.ln_dc_request,
 				      struct lnet_peer, lp_dc_list);
+		lnet_net_unlock(LNET_LOCK_EX);
+		spin_lock(&lp->lp_lock);
+		if (lp->lp_state & LNET_PEER_PING_FAILED)
+			rc = lnet_peer_ping_failed(lp);
+		if (lp->lp_state & LNET_PEER_PUSH_FAILED)
+			rc = lnet_peer_push_failed(lp);
+		spin_unlock(&lp->lp_lock);
+		lnet_net_lock(LNET_LOCK_EX);
 		lnet_peer_discovery_complete(lp, -ESHUTDOWN);
 	}
 	lnet_net_unlock(LNET_LOCK_EX);
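
The lock hand-off in this change can be illustrated outside the kernel. What follows is a minimal user-space sketch, not LNet code: it assumes pthread mutexes as stand-ins for lnet_net_lock(LNET_LOCK_EX) and lp->lp_lock, and every identifier in it (struct peer, PEER_PING_FAILED, PEER_PUSH_FAILED, peer_ping_failed, peer_push_failed, peer_discovery_complete, drain_request_queue) is hypothetical. It shows only the ordering: drop the outer lock, take the per-peer lock, clean up any pending failed ping/push state, then retake the outer lock before completing the peer.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical state bits, loosely modelled on PING/PUSH_FAILED. */
#define PEER_PING_FAILED 0x1u
#define PEER_PUSH_FAILED 0x2u

struct peer {
	struct peer *next;      /* link on the request queue */
	pthread_mutex_t lock;   /* per-peer lock (stand-in for lp_lock) */
	unsigned int state;     /* PEER_*_FAILED bits */
	int ping_md_linked;     /* non-zero while a ping MD is still linked */
	int push_md_linked;     /* non-zero while a push MD is still linked */
	int completed_rc;       /* result recorded when discovery completes */
};

/* Outer lock (stand-in for the exclusive net lock) and the queue it guards. */
static pthread_mutex_t net_lock = PTHREAD_MUTEX_INITIALIZER;
static struct peer *request_queue;

/* Caller holds p->lock: unlink the ping MD and clear the failed bit. */
static void peer_ping_failed(struct peer *p)
{
	p->ping_md_linked = 0;
	p->state &= ~PEER_PING_FAILED;
}

/* Caller holds p->lock: unlink the push MD and clear the failed bit. */
static void peer_push_failed(struct peer *p)
{
	p->push_md_linked = 0;
	p->state &= ~PEER_PUSH_FAILED;
}

/* Caller holds net_lock: take p off the queue and record the result. */
static void peer_discovery_complete(struct peer *p, int rc)
{
	struct peer **pp = &request_queue;

	assert(p != NULL);
	while (*pp != NULL && *pp != p)
		pp = &(*pp)->next;
	if (*pp == p)
		*pp = p->next;
	p->next = NULL;
	p->completed_rc = rc;
}

/* Drain the request queue at shutdown, cleaning up failed ping/push first. */
static void drain_request_queue(int rc)
{
	pthread_mutex_lock(&net_lock);
	while (request_queue != NULL) {
		struct peer *p = request_queue;

		/* Never hold both locks at once: swap outer for per-peer. */
		pthread_mutex_unlock(&net_lock);
		pthread_mutex_lock(&p->lock);
		if (p->state & PEER_PING_FAILED)
			peer_ping_failed(p);
		if (p->state & PEER_PUSH_FAILED)
			peer_push_failed(p);
		pthread_mutex_unlock(&p->lock);
		pthread_mutex_lock(&net_lock);

		peer_discovery_complete(p, rc);
	}
	pthread_mutex_unlock(&net_lock);
}
```

A caller would queue peers and then invoke drain_request_queue(-ESHUTDOWN). The sketch assumes that only the draining thread removes entries from the queue while the outer lock is dropped; it does not model the event callbacks that set the failed bits.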