LU-14810 lnet: Avoid multiple PUSH to same peer 15/59815/3
author Chris Horn <chris.horn@hpe.com>
Tue, 17 Jun 2025 19:22:48 +0000 (13:22 -0600)
committer Oleg Drokin <green@whamcloud.com>
Tue, 8 Jul 2025 03:52:10 +0000 (03:52 +0000)
It is possible to send multiple PUSHes to the same peer when the
LNET_PEER_FORCE_PUSH bit is set in the peer state. A partial solution
was added in https://review.whamcloud.com/55559/ where we modified
lnet_peer_needs_push() to check for the PUSH_SENT flag. However, we
missed that the main loop in lnet_peer_discovery() checks for the
LNET_PEER_FORCE_PUSH bit before calling lnet_peer_needs_push(), so the
PUSH_SENT check can be bypassed when that bit is set. Update
lnet_peer_discovery() to remove the problematic check for
LNET_PEER_FORCE_PUSH.
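
The ordering problem described above can be illustrated with a small
standalone model. This is only a sketch under stated assumptions: the
flag values are toy values, needs_push() is a simplified stand-in that
models only the PUSH_SENT and FORCE_PUSH checks, and dispatch_old() and
dispatch_new() are hypothetical names covering only the tail of the
if/else chain in lnet_peer_discovery().

```c
#include <assert.h>
#include <stdbool.h>

/* Toy flag values for illustration only; the real definitions in the
 * LNet headers differ. */
#define LNET_PEER_NIDS_UPTODATE (1u << 0)
#define LNET_PEER_FORCE_PING    (1u << 1)
#define LNET_PEER_FORCE_PUSH    (1u << 2)
#define LNET_PEER_PUSH_SENT     (1u << 3)

enum action { ACT_NONE, ACT_PING, ACT_PUSH };

/* Simplified stand-in for lnet_peer_needs_push(): a push already in
 * flight (PUSH_SENT) suppresses any further push. */
static bool needs_push(unsigned int state)
{
	if (state & LNET_PEER_PUSH_SENT)
		return false;
	return (state & LNET_PEER_FORCE_PUSH) != 0;
}

/* Mirrors the lnet_peer_needs_ping() helper added by this change. */
static bool needs_ping(unsigned int state)
{
	if (state & LNET_PEER_FORCE_PING)
		return true;
	return !(state & LNET_PEER_NIDS_UPTODATE);
}

/* Tail of the dispatch chain before this change (hypothetical name). */
static enum action dispatch_old(unsigned int state)
{
	if (state & LNET_PEER_FORCE_PING)
		return ACT_PING;
	if (state & LNET_PEER_FORCE_PUSH)
		return ACT_PUSH;	/* never consults PUSH_SENT */
	if (!(state & LNET_PEER_NIDS_UPTODATE))
		return ACT_PING;
	if (needs_push(state))
		return ACT_PUSH;
	return ACT_NONE;
}

/* Tail of the dispatch chain after this change (hypothetical name). */
static enum action dispatch_new(unsigned int state)
{
	if (needs_ping(state))
		return ACT_PING;
	if (needs_push(state))
		return ACT_PUSH;
	return ACT_NONE;
}
```

In this model, a peer whose state has NIDS_UPTODATE, FORCE_PUSH and
PUSH_SENT all set is sent another push by dispatch_old() but not by
dispatch_new(), because the latter only reaches a push through
needs_push().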

Also refactor the checks for sending a ping into a new function,
lnet_peer_needs_ping().
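
As a quick sanity check of the new helper, the sketch below exercises
it over every combination of the two flags it reads. The flag values
and the one-field struct lnet_peer are toy stand-ins chosen for this
example, and count_mismatches() is a hypothetical name; only the body
of lnet_peer_needs_ping() is taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy flag values for illustration only; the real definitions differ. */
#define LNET_PEER_NIDS_UPTODATE (1u << 0)
#define LNET_PEER_FORCE_PING    (1u << 1)

/* Toy stand-in for struct lnet_peer: only the field the helper reads. */
struct lnet_peer {
	unsigned int lp_state;
};

/* Body as added to lib-lnet.h by this change. */
static inline bool
lnet_peer_needs_ping(struct lnet_peer *lp)
{
	if (lp->lp_state & LNET_PEER_FORCE_PING)
		return true;
	else if (!(lp->lp_state & LNET_PEER_NIDS_UPTODATE))
		return true;

	return false;
}

/* Count the flag combinations for which the helper disagrees with the
 * single expression FORCE_PING || !NIDS_UPTODATE. */
static int count_mismatches(void)
{
	int mismatches = 0;
	unsigned int state;

	for (state = 0; state < 4; state++) {
		struct lnet_peer lp = { .lp_state = state };
		bool expected = (state & LNET_PEER_FORCE_PING) ||
				!(state & LNET_PEER_NIDS_UPTODATE);

		if (lnet_peer_needs_ping(&lp) != expected)
			mismatches++;
	}
	return mismatches;
}
```

The only combination for which the helper returns false is
NIDS_UPTODATE set with FORCE_PING clear, which is the case where the
discovery loop falls through to lnet_peer_needs_push().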

Test-Parameters: trivial
Test-Parameters: testlist=sanity-lnet env=ONLY=212,ONLY_REPEAT=100
Fixes: 72726a3118 ("LU-14810 lnet: Do not issue multiple PUSHes")
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: Ie25089a07ac1d0fcc0e6c56ec69337d22371cc32
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59815
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
lnet/include/lnet/lib-lnet.h
lnet/lnet/peer.c

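diff --git a/lnet/include/lnet/lib-lnet.h b/lnet/include/lnet/lib-lnet.h
index d723bc2..23a109d 100644
--- a/lnet/include/lnet/lib-lnet.h
+++ b/lnet/include/lnet/lib-lnet.h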
@@ -1127,6 +1127,17 @@ lnet_peer_needs_push(struct lnet_peer *lp)
        return false;
 }
 
+static inline bool
+lnet_peer_needs_ping(struct lnet_peer *lp)
+{
+       if (lp->lp_state & LNET_PEER_FORCE_PING)
+               return true;
+       else if (!(lp->lp_state & LNET_PEER_NIDS_UPTODATE))
+               return true;
+
+       return false;
+}
+
 static inline unsigned int
 lnet_get_next_recovery_ping(unsigned int ping_count, time64_t now)
 {
diff --git a/lnet/lnet/peer.c b/lnet/lnet/peer.c
index bee5725..7eda4dc 100644
--- a/lnet/lnet/peer.c
+++ b/lnet/lnet/peer.c
@@ -4124,11 +4124,7 @@ static int lnet_peer_discovery(void *arg)
                                rc = lnet_peer_ping_failed(lp);
                        else if (lp->lp_state & LNET_PEER_PUSH_FAILED)
                                rc = lnet_peer_push_failed(lp);
-                       else if (lp->lp_state & LNET_PEER_FORCE_PING)
-                               rc = lnet_peer_send_ping(lp);
-                       else if (lp->lp_state & LNET_PEER_FORCE_PUSH)
-                               rc = lnet_peer_send_push(lp);
-                       else if (!(lp->lp_state & LNET_PEER_NIDS_UPTODATE))
+                       else if (lnet_peer_needs_ping(lp))
                                rc = lnet_peer_send_ping(lp);
                        else if (lnet_peer_needs_push(lp))
                                rc = lnet_peer_send_push(lp);