LU-14654 tests: Ensure recovery_limit zero works as expected
author: Chris Horn <chris.horn@hpe.com>
Thu, 29 Apr 2021 18:09:07 +0000 (13:09 -0500)
committer: Andreas Dilger <adilger@whamcloud.com>
Sat, 23 Mar 2024 20:32:35 +0000 (20:32 +0000)
When lnet_recovery_limit is set to zero (the default), peer NIs are
eligible for recovery pings indefinitely. Verify this behavior by
modifying sanity-lnet test_211 to set recovery_limit to 0, which
makes a peer NI eligible for recovery again.

Lustre-change: https://review.whamcloud.com/43502
Lustre-commit: 8d1895f2f69bd2eec3ff6af5eb356740fa2c8766

Test-Parameters: trivial testlist=sanity-lnet
HPE-bug-id: LUS-9953
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: I00cb0940133e15ec73491e875d08b6db2bff3fe5
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/54449
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
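
As a rough illustration of the sequence the test exercises, the two lnetctl invocations from the patch can be wrapped in a small helper. This is only a sketch: make_peer_ni_recoverable is a hypothetical name that does not exist in sanity-lnet.sh, the NID is a placeholder, and LNETCTL is assumed to point at the lnetctl binary.

```shell
# Hypothetical helper, not part of sanity-lnet.sh.
# Assumes $LNETCTL names the lnetctl binary (defaults to "lnetctl").
LNETCTL=${LNETCTL:-lnetctl}

make_peer_ni_recoverable() {
	local nid=$1
	local health=${2:-500}

	# recovery_limit 0 means peer NIs stay eligible for recovery
	# pings indefinitely (per the commit message above).
	$LNETCTL set recovery_limit 0 || return 1

	# Lowering the health value puts the peer NI back on the
	# recovery queue.
	$LNETCTL peer set --nid "$nid" --health "$health"
}
```

A caller would pass the primary NID of the peer, for example `make_peer_ni_recoverable "$prim_nid"`, and then check the recovery queue as the test does.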
diff --git a/lustre/tests/sanity-lnet.sh b/lustre/tests/sanity-lnet.sh
index 2dc364b..8171e85 100755
--- a/lustre/tests/sanity-lnet.sh
+++ b/lustre/tests/sanity-lnet.sh
@@ -2121,13 +2121,16 @@ test_211() {
        # Set health to force it back onto the recovery queue. Set to 500 means
        # in 5 seconds it should be back at maximum value. We'll wait a couple
        # more seconds than that to be safe.
-       # NB: we need to increase the recovery limit so the peer NI is
+       # NB: we reset the recovery limit to 0 (indefinite) so the peer NI is
        # eligible again
-       do_lnetctl set recovery_limit 50 ||
+       do_lnetctl set recovery_limit 0 ||
                error "failed to set recovery_limit"
 
        $LNETCTL peer set --nid $prim_nid --health 500
 
+       check_nid_in_recovq "-p" 1
+       check_ping_count "peer_ni" "2"
+
        sleep 7
 
        check_nid_in_recovq "-p" 0
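
The 7-second sleep in the hunk above follows from the arithmetic in the test comment: a health value of 500 should return to maximum in about 5 seconds, plus a safety margin. A minimal sketch of that calculation is below. It assumes a maximum health value of 1000, an increment of 100 per successful recovery ping, and one recovery ping per second. These numbers are assumptions consistent with the comment, not values taken from the patch, and seconds_to_full_health is a hypothetical helper.

```shell
# Hypothetical helper, not part of sanity-lnet.sh.
# Assumptions: max health 1000, +100 per successful recovery ping,
# one recovery ping per second.
seconds_to_full_health() {
	local health=$1
	local max=${2:-1000}
	local step=${3:-100}
	local secs=0

	# Count the pings needed to bring health back up to the maximum.
	while [ "$health" -lt "$max" ]; do
		health=$((health + step))
		secs=$((secs + 1))
	done

	echo "$secs"
}

seconds_to_full_health 500	# prints 5
```

Under these assumptions, sleeping for 7 seconds leaves a 2-second margin before the test expects the recovery queue to be empty.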