LU-13569 lnet: Recover peer NI w/exponential backoff interval

Perform LNet recovery pings of peer NIs with an exponential backoff
interval.
- The interval is 2^(number of failed pings) seconds, up to a maximum
of 900 seconds (15 minutes).
- When a message is received the count of failed pings for the
associated peer NI is reset to 0 so that recovery can happen more
quickly.
Lustre-change: https://review.whamcloud.com/39720
Lustre-commit: 917553c537a8860f57a50dc9752e3ac69d06c11c
Test-Parameters: trivial
HPE-bug-id: LUS-9109
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: Ic7e60455015a0236a96010c07fc0ddd02078cf92
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/54402
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Cyril Bordage <cbordage@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>