From: Amir Shehata
Date: Wed, 20 May 2020 05:21:10 +0000 (-0700)
Subject: LU-13553 lnd: gracefully handle unexpected events
X-Git-Tag: 2.13.54~1
X-Git-Url: https://git.whamcloud.com/?p=fs%2Flustre-release.git;a=commitdiff_plain;h=60f9f539e686fc19b080a3cda15ade7111bbd4a7

LU-13553 lnd: gracefully handle unexpected events

When a tx completes kiblnd_tx_complete() callback is invoked.
We ensure:
	LASSERT (tx->tx_sending > 0);
However this assert is being triggered in some rare scenarios.
The reason tx_sending would be 0 at this point is because:
1. ib_post_send() failed but OFED stack is still sending a
   tx complete event.
2. We're getting two different events for the same tx

Instead of asserting, ignore that tx_complete event and print
the tx pointer and its status.

Signed-off-by: Amir Shehata
Change-Id: I8cd192538c0c80abaef23a4b6e6906936043060b
Reviewed-on: https://review.whamcloud.com/38669
Tested-by: jenkins
Tested-by: Maloo
Reviewed-by: Andreas Dilger
Reviewed-by: Serguei Smirnov
Reviewed-by: Oleg Drokin
---

diff --git a/lnet/klnds/o2iblnd/o2iblnd_cb.c b/lnet/klnds/o2iblnd/o2iblnd_cb.c
index b116f2b..0117d40 100644
--- a/lnet/klnds/o2iblnd/o2iblnd_cb.c
+++ b/lnet/klnds/o2iblnd/o2iblnd_cb.c
@@ -1020,24 +1020,28 @@ kiblnd_check_sends_locked(struct kib_conn *conn)
 static void
 kiblnd_tx_complete(struct kib_tx *tx, int status)
 {
-        int failed = (status != IB_WC_SUCCESS);
+	int failed = (status != IB_WC_SUCCESS);
 	struct kib_conn *conn = tx->tx_conn;
-        int idle;
+	int idle;
 
-        LASSERT (tx->tx_sending > 0);
+	if (tx->tx_sending <= 0) {
+		CERROR("Received an event on a freed tx: %p status %d\n",
+		       tx, tx->tx_status);
+		return;
+	}
 
-        if (failed) {
-                if (conn->ibc_state == IBLND_CONN_ESTABLISHED)
+	if (failed) {
+		if (conn->ibc_state == IBLND_CONN_ESTABLISHED)
 			CNETERR("Tx -> %s cookie %#llx"
-                                " sending %d waiting %d: failed %d\n",
-                                libcfs_nid2str(conn->ibc_peer->ibp_nid),
-                                tx->tx_cookie, tx->tx_sending, tx->tx_waiting,
-                                status);
+				" sending %d waiting %d: failed %d\n",
+				libcfs_nid2str(conn->ibc_peer->ibp_nid),
+				tx->tx_cookie, tx->tx_sending, tx->tx_waiting,
+				status);
 
-                kiblnd_close_conn(conn, -EIO);
-        } else {
-                kiblnd_peer_alive(conn->ibc_peer);
-        }
+		kiblnd_close_conn(conn, -EIO);
+	} else {
+		kiblnd_peer_alive(conn->ibc_peer);
+	}
 
 	spin_lock(&conn->ibc_lock);