LU-13912 lnet: Correct the router ping interval calculation 94/39694/7
author Chris Horn <chris.horn@hpe.com>
Mon, 17 Aug 2020 21:02:10 +0000 (16:02 -0500)
committer Oleg Drokin <green@whamcloud.com>
Tue, 11 May 2021 22:54:02 +0000 (22:54 +0000)
commit 0131d39a622f1efc07dc49df7bceed1bbe16357d
tree 93900f2664dbf50174f8a299af76a2931ab3e735
parent 0f81c5ae973bf7fba45b6ba7f9c5f4fb1f6eadcb
LU-13912 lnet: Correct the router ping interval calculation

The router ping interval is being divided by the number of local nets,
which results in pings being sent more frequently than
alive_router_check_interval specifies. In addition, the current code is
structured such that we may not find a peer net in need of a ping until
after inspecting the router list multiple times. Rework the code so that
the loop that inspects a router's peer nets looks at all of them,
stopping only when it finds one that actually needs to be pinged or when
it wraps back around to where it started.
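
As a rough illustration only (this is not the code in lnet/lnet/router.c),
the two points above can be sketched in standalone C. The names
struct peer_net, next_ping, interval_divided and find_net_to_ping are
hypothetical and chosen for this sketch; times are plain seconds.

```c
#include <assert.h>

/* Hypothetical stand-in for a router's peer net; not the LNet structure. */
struct peer_net {
	long next_ping;	/* earliest time, in seconds, the next ping is due */
};

/*
 * The kind of calculation the commit message describes as wrong:
 * dividing the check interval by the number of local nets shortens
 * the time between pings.
 */
static long interval_divided(long check_interval, int local_nets)
{
	assert(local_nets > 0);
	return check_interval / local_nets;
}

/*
 * Inspect every peer net at most once, starting at 'start' and wrapping
 * around the array. Returns the index of the first peer net whose ping
 * is due, or -1 if the scan wraps back to 'start' without finding one.
 */
static int find_net_to_ping(const struct peer_net *nets, int count,
			    int start, long now)
{
	int i;

	assert(count > 0 && start >= 0 && start < count);
	for (i = 0; i < count; i++) {
		int idx = (start + i) % count;

		if (now >= nets[idx].next_ping)
			return idx;
	}
	return -1;
}
```

With a 60 second interval and 3 local nets, interval_divided() yields 20
seconds, i.e. pings three times as often as configured. A single call to
find_net_to_ping() either returns a peer net that is due or reports that
none is, so the caller does not need several passes to find one.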

We also move the check of LNET_PEER_RTR_DISCOVERY so that we avoid the
work of inspecting the router's peer nets if the router is already being
discovered.
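
The reordering can be pictured with a small standalone C sketch. It is
an assumption-laden illustration, not the patch itself: struct router,
RTR_DISCOVERY_FLAG (standing in for LNET_PEER_RTR_DISCOVERY),
nets_inspected and check_router are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag standing in for LNET_PEER_RTR_DISCOVERY. */
#define RTR_DISCOVERY_FLAG 0x1u

/* Hypothetical router record; not the LNet peer structure. */
struct router {
	unsigned int state;	/* bit flags */
	int nets_inspected;	/* counts peer-net inspections, for illustration */
	int net_count;		/* number of peer nets on this router */
};

/*
 * Test the discovery flag first so that a router already being
 * discovered costs no peer-net inspection at all.
 * Returns 1 if the peer nets were scanned, 0 if the router was skipped.
 */
static int check_router(struct router *rtr)
{
	assert(rtr != NULL);

	if (rtr->state & RTR_DISCOVERY_FLAG)
		return 0;

	/* Placeholder for the per-peer-net scan. */
	rtr->nets_inspected += rtr->net_count;
	return 1;
}
```

Checking the flag before the scan changes no outcome for a router under
discovery; it only avoids work whose result would be discarded.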

Test-Parameters: trivial
HPE-bug-id: LUS-9237
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: I5a4733002f29c0ade6aee62b4424313d5d245556
Reviewed-on: https://review.whamcloud.com/39694
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
lnet/include/lnet/lib-types.h
lnet/lnet/router.c