LU-12292 lnet: keep health even if recovery failed 21/36921/11
author Amir Shehata <ashehata@whamcloud.com>
Tue, 3 Dec 2019 17:22:03 +0000 (09:22 -0800)
committer Oleg Drokin <green@whamcloud.com>
Tue, 31 Mar 2020 07:00:15 +0000 (07:00 +0000)
commit 610a7542107d5a8ab0a12dc8bda7a4f44f9f0b60
tree acda29b963fe623917e5746f4096db530d4c9e30
parent 1d94a29dbc018fd00aa1c8a7a7ae343e0c9a4b83
LU-12292 lnet: keep health even if recovery failed

Don't decrement the interface's health value when a recovery
message fails. If we've already determined that an interface
is unhealthy, there is no need to keep decrementing
its health every second. Otherwise it will take too long to
come back into service once it becomes healthy.
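
A minimal sketch of the rule above, not the code from this patch.
All names and values (sketch_ni, sketch_send_failed(),
SKETCH_MAX_HEALTH, SKETCH_SENSITIVITY) are hypothetical and chosen
only for illustration:

```c
#include <stdbool.h>

/* Hypothetical values for illustration; not the LNet definitions. */
#define SKETCH_MAX_HEALTH   1000
#define SKETCH_SENSITIVITY  100

/* Hypothetical stand-in for an interface with a health value. */
struct sketch_ni {
	int health;	/* SKETCH_MAX_HEALTH means fully healthy */
};

/* Called when a send on this interface fails. */
static void sketch_send_failed(struct sketch_ni *ni, bool is_recovery_msg)
{
	/*
	 * A failed recovery message only confirms what is already
	 * known, so leave the health value untouched.
	 */
	if (is_recovery_msg)
		return;

	/* Ordinary traffic failed: decrement once, clamped at zero. */
	ni->health -= SKETCH_SENSITIVITY;
	if (ni->health < 0)
		ni->health = 0;
}
```

In this model a single failed message costs one decrement, and any
number of failed recovery attempts afterwards leaves the value where
it was, so the interface has less ground to regain once it recovers.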

Clean up where health is decremented so that it is not
decremented repeatedly for the same failure. There is no need
to decrement in lnet_notify() because, for the LND to call it,
an existing transmit must have failed. That failed message
already results in the health being decremented.

When a recovery send fails, make sure to flag the recovery as
failed, because no reply is expected in this case.
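
The recovery-state handling can be pictured roughly as follows. This
is a sketch under assumptions, not the patch itself: sketch_recovery
and sketch_send_recovery() are hypothetical names, and send_rc stands
in for the return code of whatever call sends the recovery message.

```c
#include <stdbool.h>

/* Hypothetical recovery state for one interface. */
struct sketch_recovery {
	bool pending;	/* a recovery message is in flight */
	bool failed;	/* the last recovery attempt failed */
};

/*
 * Record the outcome of trying to send a recovery message.
 * send_rc is 0 on success and nonzero if the send itself failed.
 */
static int sketch_send_recovery(struct sketch_recovery *rec, int send_rc)
{
	rec->pending = true;
	rec->failed = false;

	if (send_rc != 0) {
		/*
		 * The message never went out, so no reply will arrive
		 * to resolve the attempt. Mark it failed right away
		 * instead of leaving it pending.
		 */
		rec->pending = false;
		rec->failed = true;
	}

	return send_rc;
}
```

Without the failure flag, an attempt whose send failed would stay
pending indefinitely in this model, since nothing else would clear it.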

Test-parameters: trivial

Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: Ifb3500a77a5a5be51e7079269c8ddba85ed0c2a7
Reviewed-on: https://review.whamcloud.com/36921
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
lnet/lnet/lib-move.c
lnet/lnet/lib-msg.c
lnet/lnet/router.c