LU-13571 lnd: Use NETWORK_TIMEOUT for txs on ibp_tx_queue 99/39899/15
author Chris Horn <chris.horn@hpe.com>
Fri, 11 Sep 2020 18:42:42 +0000 (13:42 -0500)
committer Oleg Drokin <green@whamcloud.com>
Thu, 3 Dec 2020 07:26:25 +0000 (07:26 +0000)
TXs on the ibp_tx_queue are waiting for a connection to be
established. Failure to establish a connection could be due to a
problem with either the local NI or the remote NI, and o2iblnd cannot
currently distinguish between these failures. As such, it should
return LNET_MSG_STATUS_NETWORK_TIMEOUT to LNet so that LNet will
decrement the health value of both the local NI and the remote NI and
future sends can take these health values into account.

Test-Parameters: trivial
HPE-bug-id: LUS-9342
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: Idbbbe95483d25ec48b83e33a00685f72fa5292e6
Reviewed-on: https://review.whamcloud.com/39899
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
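
The effect described in the commit message can be illustrated with a minimal, standalone sketch. It is not LNet code: the `sketch_` names, the health ceiling of 1000, and the decrement step of 100 are hypothetical stand-ins chosen for illustration. The only behavior taken from the commit message is that a network timeout lowers the health of both the local and the remote NI; the sketch assumes, by contrast, that a local timeout lowers only the local NI.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the status codes named in the commit. */
enum sketch_status {
	SKETCH_STATUS_OK,
	SKETCH_STATUS_LOCAL_TIMEOUT,
	SKETCH_STATUS_NETWORK_TIMEOUT,
};

/* Hypothetical per-interface health record (not the real LNet struct). */
struct sketch_ni {
	const char *name;
	int health;
};

/* Illustrative values only; the real defaults are not stated in the commit. */
#define SKETCH_HEALTH_MAX  1000
#define SKETCH_HEALTH_STEP 100

/* Lower one interface's health, clamping at zero. */
static void sketch_dec_health(struct sketch_ni *ni)
{
	assert(ni != NULL);
	ni->health -= SKETCH_HEALTH_STEP;
	if (ni->health < 0)
		ni->health = 0;
}

/*
 * Apply a send status to the health of both endpoints.
 * LOCAL_TIMEOUT is assumed to blame only the local NI;
 * NETWORK_TIMEOUT cannot tell which side failed, so both are penalized.
 */
static void sketch_handle_status(enum sketch_status st,
				 struct sketch_ni *local,
				 struct sketch_ni *remote)
{
	switch (st) {
	case SKETCH_STATUS_LOCAL_TIMEOUT:
		sketch_dec_health(local);
		break;
	case SKETCH_STATUS_NETWORK_TIMEOUT:
		sketch_dec_health(local);
		sketch_dec_health(remote);
		break;
	case SKETCH_STATUS_OK:
		break;
	}
}
```

Under these assumptions, a tx that timed out while still waiting on the connection queue now penalizes both endpoints, so a later path selection that compares health values would see the remote NI as degraded as well.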

index 7e078b2..70ec9e8 100644
@@ -3374,7 +3374,7 @@ kiblnd_check_conns (int idx)
        if (!list_empty(&timedout_txs))
                kiblnd_txlist_done(&timedout_txs, -ETIMEDOUT,
-                                  LNET_MSG_STATUS_LOCAL_TIMEOUT);
+                                  LNET_MSG_STATUS_NETWORK_TIMEOUT);
        /* Handle timeout by closing the whole
         * connection. We can only be sure RDMA activity