LU-15791 tests: Get health before removing drop rules
author Chris Horn <chris.horn@hpe.com>
Wed, 20 Jul 2022 15:44:39 +0000 (09:44 -0600)
committer Andreas Dilger <adilger@whamcloud.com>
Sat, 23 Mar 2024 20:31:28 +0000 (20:31 +0000)
lnet_health_post() can race with recovery pings, so we should
wait to delete the drop rules until after we've gathered the
health and resend values. Move the net_drop_del call out of
test_209 and into lnet_health_post().

Lustre-change: https://review.whamcloud.com/47998
Lustre-commit: 8caec97d5e89eefe250edb64e6f7ad61e12a9d71

Test-Parameters: trivial testlist=sanity-lnet
Fixes: 79ab053562 ("LU-13569 lnet: Deprecate lnet_recovery_interval")
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: Ia7595e015809f796cafcc40382d98ab66a708a49
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/54439
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
lustre/tests/sanity-lnet.sh

index 8f05c69..615db63 100755 (executable)
@@ -1458,6 +1458,8 @@ function lnet_health_post() {
 
        restore_lnet_params
 
+       $LCTL net_drop_del -a
+
        do_lnetctl peer set --health 1000 --all
        do_lnetctl net set --health 1000 --all
 
@@ -1870,7 +1872,6 @@ test_209() {
 
        do_lnetctl discover ${RNIDS[0]} &&
                error "Should have failed"
-       $LCTL net_drop_del -a
 
        lnet_health_post
 
@@ -1890,7 +1891,6 @@ test_209() {
 
        do_lnetctl discover ${RNIDS[0]} &&
                error "Should have failed"
-       $LCTL net_drop_del -a
 
        lnet_health_post
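
The ordering this change enforces can be sketched as a standalone function. This is a minimal sketch, not the code in sanity-lnet.sh: health_post_sketch and its snapshot callback are hypothetical names, and $LCTL and $LNETCTL are assumed to point at the lctl and lnetctl binaries. It leaves out restore_lnet_params and the do_lnetctl wrapper to keep the sequence visible.

```shell
# Sketch only: health_post_sketch and the snapshot callback are
# hypothetical; LCTL and LNETCTL are assumed to name lctl and lnetctl.
LCTL=${LCTL:-lctl}
LNETCTL=${LNETCTL:-lnetctl}

health_post_sketch() {
	# Hypothetical callback that records the health and resend values.
	local snapshot_cmd=$1

	# 1. Read the values while the drop rules are still installed, so a
	#    recovery ping cannot succeed and change them mid-read.
	"$snapshot_cmd" || return 1

	# 2. Only now remove all drop rules.
	$LCTL net_drop_del -a || return 1

	# 3. Reset peer and NI health for the next test.
	$LNETCTL peer set --health 1000 --all &&
		$LNETCTL net set --health 1000 --all
}
```

Because the rule deletion lives inside the function, callers such as test_209 no longer need their own net_drop_del call before invoking it.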