From: Chris Horn
Date: Tue, 17 Jun 2025 19:22:48 +0000 (-0600)
Subject: LU-14810 lnet: Avoid multiple PUSH to same peer
X-Git-Url: https://git.whamcloud.com/gitweb?a=commitdiff_plain;h=99153d212d8c3c1d58cb088b23224d17eefb0fd7;p=fs%2Flustre-release.git

LU-14810 lnet: Avoid multiple PUSH to same peer

It is possible to send multiple PUSHes to the same peer when the
LNET_PEER_FORCE_PUSH bit is set in the peer state. A partial solution
was added in https://review.whamcloud.com/55559/ where we modified
lnet_peer_needs_push() to check for the PUSH_SENT flag. However, we
missed that the main loop in lnet_peer_discovery() will check for the
LNET_PEER_FORCE_PUSH bit prior to calling lnet_peer_needs_push().

Update lnet_peer_discovery() to remove the problematic check for
LNET_PEER_FORCE_PUSH. Also refactor the checks for sending a ping
into a new function, lnet_peer_needs_ping().

Test-Parameters: trivial
Test-Parameters: testlist=sanity-lnet env=ONLY=212,ONLY_REPEAT=100
Fixes: 72726a3118 ("LU-14810 lnet: Do not issue multiple PUSHes")
Signed-off-by: Chris Horn
Change-Id: Ie25089a07ac1d0fcc0e6c56ec69337d22371cc32
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59815
Tested-by: jenkins
Tested-by: Maloo
Reviewed-by: Serguei Smirnov
Reviewed-by: Frank Sehr
Reviewed-by: Oleg Drokin
---

diff --git a/lnet/include/lnet/lib-lnet.h b/lnet/include/lnet/lib-lnet.h
index d723bc2..23a109d 100644
--- a/lnet/include/lnet/lib-lnet.h
+++ b/lnet/include/lnet/lib-lnet.h
@@ -1127,6 +1127,17 @@ lnet_peer_needs_push(struct lnet_peer *lp)
 	return false;
 }
 
+static inline bool
+lnet_peer_needs_ping(struct lnet_peer *lp)
+{
+	if (lp->lp_state & LNET_PEER_FORCE_PING)
+		return true;
+	else if (!(lp->lp_state & LNET_PEER_NIDS_UPTODATE))
+		return true;
+
+	return false;
+}
+
 static inline unsigned int
 lnet_get_next_recovery_ping(unsigned int ping_count, time64_t now)
 {
diff --git a/lnet/lnet/peer.c b/lnet/lnet/peer.c
index bee5725..7eda4dc 100644
--- a/lnet/lnet/peer.c
+++ b/lnet/lnet/peer.c
@@ -4124,11 +4124,7 @@ static int lnet_peer_discovery(void *arg)
 				rc = lnet_peer_ping_failed(lp);
 			else if (lp->lp_state & LNET_PEER_PUSH_FAILED)
 				rc = lnet_peer_push_failed(lp);
-			else if (lp->lp_state & LNET_PEER_FORCE_PING)
-				rc = lnet_peer_send_ping(lp);
-			else if (lp->lp_state & LNET_PEER_FORCE_PUSH)
-				rc = lnet_peer_send_push(lp);
-			else if (!(lp->lp_state & LNET_PEER_NIDS_UPTODATE))
+			else if (lnet_peer_needs_ping(lp))
 				rc = lnet_peer_send_ping(lp);
 			else if (lnet_peer_needs_push(lp))
 				rc = lnet_peer_send_push(lp);