When lnet_recovery_limit is set to zero (the default), peer NIs
remain eligible for recovery pings indefinitely. Verify this behavior
by modifying sanity-lnet test_211 to set recovery_limit to 0, which
makes the peer NI eligible for recovery again.

Test-Parameters: trivial testlist=sanity-lnet
HPE-bug-id: LUS-9953
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: I00cb0940133e15ec73491e875d08b6db2bff3fe5
Reviewed-on: https://review.whamcloud.com/43502
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Reviewed-by: Cyril Bordage <cbordage@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
# Set health to force it back onto the recovery queue. Set to 500 means
# in 5 seconds it should be back at maximum value. We'll wait a couple
# more seconds than that to be safe.
- # NB: we need to increase the recovery limit so the peer NI is
+ # NB: we reset the recovery limit to 0 (indefinite) so the peer NI is
# eligible again
- do_lnetctl set recovery_limit 50 ||
+ do_lnetctl set recovery_limit 0 ||
error "failed to set recovery_limit"
$LNETCTL peer set --nid $prim_nid --health 500
+ check_nid_in_recovq "-p" 1
+ check_ping_count "peer_ni" "2"
+
sleep 7
check_nid_in_recovq "-p" 0