LU-16563 lnet: use discovered ni status to set initial health
author Serguei Smirnov <ssmirnov@whamcloud.com>
Thu, 16 Feb 2023 18:34:03 +0000 (10:34 -0800)
committer Andreas Dilger <adilger@whamcloud.com>
Tue, 25 Apr 2023 04:01:46 +0000 (04:01 +0000)
commit c4df48116d953162ca16f995e887643eacab22f9
tree b76e40b687d093497cc27a84a67bcb0d047da287
parent 6654057165226980ba4aaf205f7689f31d3375b2

If not routing, track local NI status in the ping buffer so that
a locally recognized "down" state (for example, one caused by
a downed network interface or link) is available
to any discovering peer.
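
The passive side described above can be sketched roughly as follows. This is an illustration only, not the code from this change: the `pb_` types, the status constants, and the function name are all hypothetical, and the assumption that routing nodes manage NI status elsewhere is inferred from the "if not routing" condition.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical status codes; the values LNet actually uses are not shown here. */
enum pb_ni_status { PB_NI_UP = 1, PB_NI_DOWN = 2 };

/* Hypothetical stand-in for one local NI entry in the ping buffer. */
struct pb_entry {
	uint64_t nid;     /* identifier of the local NI */
	uint32_t status;  /* status reported to discovering peers */
};

/*
 * Record the locally observed link state in the ping buffer entry.
 * Returns true if the entry changed, false if nothing was updated.
 */
static bool pb_update_status(struct pb_entry *e, bool routing, bool link_up)
{
	uint32_t status = link_up ? PB_NI_UP : PB_NI_DOWN;

	/* Assumption: when routing, NI status is maintained by other code. */
	if (routing)
		return false;

	if (e->status == status)
		return false;

	e->status = status;
	return true;
}
```

A caller would invoke something like `pb_update_status()` whenever the LND notices a link state change, so the next ping reply carries the current state.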

On the active side of discovery, check each peer NI's status: if the NI
is down, decrement its health score and queue it for recovery.
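
A minimal sketch of the active-side check, under stated assumptions: the `disc_` structures, the maximum health of 1000, the decrement of 100, and the singly linked recovery queue are placeholders chosen for illustration and are not taken from this change.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical constants; real values depend on LNet health tunables. */
#define DISC_MAX_HEALTH  1000
#define DISC_HEALTH_STEP 100

enum disc_ni_status { DISC_NI_UP = 1, DISC_NI_DOWN = 2 };

/* Hypothetical peer NI record kept by the discovering node. */
struct disc_peer_ni {
	uint64_t nid;
	int health;
	bool on_recovery_queue;
	struct disc_peer_ni *next_recovery;
};

/* Hypothetical recovery queue: a simple singly linked list. */
struct disc_recovery_queue {
	struct disc_peer_ni *head;
	size_t len;
};

/*
 * Apply the status learned through discovery to a peer NI.
 * A "down" NI loses health and is queued for recovery once.
 */
static void disc_apply_ni_status(struct disc_peer_ni *lpni, uint32_t status,
				 struct disc_recovery_queue *rq)
{
	if (status != DISC_NI_DOWN)
		return;

	lpni->health -= DISC_HEALTH_STEP;
	if (lpni->health < 0)
		lpni->health = 0;

	/* Avoid queuing the same peer NI twice. */
	if (!lpni->on_recovery_queue) {
		lpni->next_recovery = rq->head;
		rq->head = lpni;
		rq->len++;
		lpni->on_recovery_queue = true;
	}
}
```

Starting a peer NI that reports "down" below full health means it is less likely to be selected for traffic until recovery raises its score again.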

Lustre-change: https://review.whamcloud.com/50027/
Lustre-commit: da230373bd14306cb97fb48748ebce205f09d468

Test-Parameters: trivial testlist=sanity-lnet
Signed-off-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Change-Id: I513c7942099c0da9088fa6d4460f76386ea91d3b
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/50040
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Cyril Bordage <cbordage@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
lnet/include/lnet/lib-lnet.h
lnet/klnds/o2iblnd/o2iblnd.c
lnet/klnds/socklnd/socklnd.c
lnet/lnet/api-ni.c
lnet/lnet/peer.c