From: Serguei Smirnov
Date: Thu, 30 Nov 2023 18:55:11 +0000 (-0800)
Subject: LU-17325 o2iblnd: CM_EVENT_UNREACHABLE on established conn
X-Git-Url: https://git.whamcloud.com/?a=commitdiff_plain;h=4ee7b6d1639e6b043da2b8edbef8add8581435db;p=fs%2Flustre-release.git

LU-17325 o2iblnd: CM_EVENT_UNREACHABLE on established conn

There were examples in the field with RoCE setups
which demonstrate that CM_EVENT_UNREACHABLE may be received
when connection is already in ESTABLISHED state. This causes
an assert in kiblnd_cm_callback to fail.

Handle this in a more gracious manner: report the event
as unexpected and allow the flow to continue.
If there are indeed issues on the connection, it is expected
to report transaction errors later and get cleaned up
without crashing the whole system.

Lustre-change: https://review.whamcloud.com/53298
Lustre-commit: TBD (from cbde71bf893dba0de752a190c3b16d653ef75085)

Test-Parameters: trivial testlist=sanity-lnet
Signed-off-by: Serguei Smirnov
Change-Id: If32166fe9fc59e025609c2035cb1c03d3bed22f2
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53301
Tested-by: jenkins
Tested-by: Maloo
Reviewed-by: Frank Sehr
Reviewed-by: Cyril Bordage
Reviewed-by: Andreas Dilger
---

diff --git a/lnet/klnds/o2iblnd/o2iblnd_cb.c b/lnet/klnds/o2iblnd/o2iblnd_cb.c
index 1cd22f4..3d2a181 100644
--- a/lnet/klnds/o2iblnd/o2iblnd_cb.c
+++ b/lnet/klnds/o2iblnd/o2iblnd_cb.c
@@ -3259,8 +3259,7 @@ kiblnd_cm_callback(struct rdma_cm_id *cmid, struct rdma_cm_event *event)
 			libcfs_nid2str(conn->ibc_peer->ibp_nid),
 			event->status,
 			conn->ibc_state);
-		LASSERT(conn->ibc_state != IBLND_CONN_ESTABLISHED &&
-			conn->ibc_state != IBLND_CONN_INIT);
+		LASSERT(conn->ibc_state != IBLND_CONN_INIT);
 		if (conn->ibc_state == IBLND_CONN_ACTIVE_CONNECT ||
 		    conn->ibc_state == IBLND_CONN_PASSIVE_WAIT) {
 			kiblnd_connreq_done(conn, -ENETDOWN);