From: Bruno Faccini
Date: Fri, 23 Feb 2024 12:16:36 +0000 (+0100)
Subject: LU-17578 lnet: fix &the_lnet.ln_mt_peerNIRecovq race
X-Git-Tag: 2.15.62~128
X-Git-Url: https://git.whamcloud.com/?a=commitdiff_plain;h=0a0e881d8884a220c485c0384351da12dc8aed9f;p=fs%2Flustre-release.git

LU-17578 lnet: fix &the_lnet.ln_mt_peerNIRecovq race

To avoid race &the_lnet.ln_mt_peerNIRecovq must always be
accessed with lnet_net_lock(0) protection.

Test-Parameters: trivial
Fixes: da23037 ("LU-16563 lnet: use discovered ni status to set initial health")
Change-Id: Ic5e0194020200afdecba4cbf5afed274b14da388
Signed-off-by: Bruno Faccini
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54163
Reviewed-by: Chris Horn
Reviewed-by: Frank Sehr
Reviewed-by: Oleg Drokin
Tested-by: Maloo
Tested-by: jenkins
---

diff --git a/lnet/lnet/peer.c b/lnet/lnet/peer.c
index 5c3f3d3..f890d3f 100644
--- a/lnet/lnet/peer.c
+++ b/lnet/lnet/peer.c
@@ -3191,9 +3191,11 @@ int ping_info_count_entries(struct lnet_ping_buffer *pbuf)
 static inline void
 handle_disc_lpni_health(struct lnet_peer_ni *lpni)
 {
-	if (lpni->lpni_ns_status == LNET_NI_STATUS_DOWN)
+	if (lpni->lpni_ns_status == LNET_NI_STATUS_DOWN) {
+		lnet_net_lock(0);
 		lnet_handle_remote_failure_locked(lpni);
-	else if (lpni->lpni_ns_status == LNET_NI_STATUS_UP &&
+		lnet_net_unlock(0);
+	} else if (lpni->lpni_ns_status == LNET_NI_STATUS_UP &&
 		 !lpni->lpni_last_alive)
 		atomic_set(&lpni->lpni_healthv, LNET_MAX_HEALTH_VALUE);
 }
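
Note on the locking pattern: the fix follows the usual "_locked" suffix
convention, where a function with that suffix expects its caller to already
hold the lock guarding the shared structure. The sketch below illustrates
that convention in user-space C with a pthread mutex. It is not LNet code:
struct peer_ni, net_lock, recovq_head and every function name in it are
hypothetical stand-ins, and the held-flag assertion is only a crude
single-threaded debugging aid, not a real ownership check.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a peer NI; not the LNet structure. */
struct peer_ni {
	bool down;		/* discovered status says the NI is down */
	bool queued;		/* already on the recovery queue */
	struct peer_ni *next;
};

/* One lock guards the shared recovery queue (assumed names). */
static pthread_mutex_t net_lock = PTHREAD_MUTEX_INITIALIZER;
static bool net_lock_held;		/* debug aid, written under net_lock */
static struct peer_ni *recovq_head;	/* shared queue, guarded by net_lock */
static size_t recovq_len;

static void net_lock_acquire(void)
{
	pthread_mutex_lock(&net_lock);
	net_lock_held = true;
}

static void net_lock_release(void)
{
	net_lock_held = false;
	pthread_mutex_unlock(&net_lock);
}

/* "_locked" suffix: the caller must hold net_lock before calling. */
static void queue_for_recovery_locked(struct peer_ni *p)
{
	assert(net_lock_held);
	if (p->queued)
		return;
	p->next = recovq_head;
	recovq_head = p;
	p->queued = true;
	recovq_len++;
}

/* Caller-side shape of the fix: take the lock around the _locked call. */
static void handle_discovered_status(struct peer_ni *p)
{
	if (p->down) {
		net_lock_acquire();
		queue_for_recovery_locked(p);
		net_lock_release();
	}
}
```

Calling queue_for_recovery_locked() without the surrounding acquire/release
pair would trip the assertion here; in the pre-fix kernel code the analogous
unprotected call could race with other users of the queue instead.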