LU-14750 lnet: use ni fatal error when calculating net health 62/43962/2
author Serguei Smirnov <ssmirnov@whamcloud.com>
Wed, 9 Jun 2021 21:22:12 +0000 (14:22 -0700)
committer Oleg Drokin <green@whamcloud.com>
Thu, 8 Jul 2021 02:06:31 +0000 (02:06 +0000)
When an ni is flagged with "fatal_error" by the LND, its health
score remains unaffected. As a result, the net containing such an
ni can still be selected for tx even if that is the only ni in the
net. Take the "fatal_error" status of the ni into account when
calculating the net health score.

Test-Parameters: trivial
Signed-off-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Change-Id: Ib76245f835f1458873f0c05ad9b6727d295857de
Reviewed-on: https://review.whamcloud.com/43962
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: Cyril Bordage <cbordage@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
lnet/lnet/api-ni.c

index d60c905..835d96a 100644
@@ -3177,11 +3177,12 @@ int lnet_get_net_healthv_locked(struct lnet_net *net)
 {
        struct lnet_ni *ni;
        int best_healthv = 0;
-       int healthv;
+       int healthv, ni_fatal;
 
        list_for_each_entry(ni, &net->net_ni_list, ni_netlist) {
                healthv = atomic_read(&ni->ni_healthv);
-               if (healthv > best_healthv)
+               ni_fatal = atomic_read(&ni->ni_fatal_error_on);
+               if (!ni_fatal && healthv > best_healthv)
                        best_healthv = healthv;
        }
 
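
The effect of the change can be illustrated with a small user-space sketch. This is not the kernel code: struct mock_ni, struct mock_net and mock_net_healthv() are hypothetical stand-ins, a plain array replaces the kernel list, and plain fields replace the atomic counters. It only mirrors the patched condition, in which an ni with its fatal error flag set does not contribute to the best health value of the net.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct lnet_ni (illustration only). */
struct mock_ni {
	int  healthv;        /* models ni->ni_healthv */
	bool fatal_error_on; /* models ni->ni_fatal_error_on */
};

/* Hypothetical stand-in for struct lnet_net: an array instead of a list. */
struct mock_net {
	const struct mock_ni *nis;
	size_t                ni_count;
};

/*
 * Return the best health value among the nis that are not flagged
 * with a fatal error, or 0 if every ni is flagged (or there are none).
 */
int mock_net_healthv(const struct mock_net *net)
{
	int best_healthv = 0;
	size_t i;

	assert(net != NULL);

	for (i = 0; i < net->ni_count; i++) {
		const struct mock_ni *ni = &net->nis[i];

		/* Same test as the patched loop: skip fatal nis. */
		if (!ni->fatal_error_on && ni->healthv > best_healthv)
			best_healthv = ni->healthv;
	}

	return best_healthv;
}
```

Under these assumptions, a net whose only ni has the fatal flag set reports a health value of 0, while a net that also contains a non-fatal ni reports that ni's health value.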