Field reports from RoCE setups show that CM_EVENT_UNREACHABLE may be
received when the connection is already in the ESTABLISHED state.
This causes an assert in kiblnd_cm_callback to fail.
Handle this more gracefully: report the event as unexpected and
allow the flow to continue. If the connection really does have a
problem, it is expected to report transaction errors later and be
cleaned up without crashing the whole system.
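
The handling described above can be sketched as a small standalone
decision function. This is an illustrative model only, not the o2iblnd
code: the enum values and the function name are hypothetical stand-ins,
and the real handler calls kiblnd_connreq_done() directly rather than
returning an action.

```c
/* Hypothetical, simplified stand-ins for the IBLND_CONN_* states. */
enum conn_state {
	CONN_INIT,
	CONN_ACTIVE_CONNECT,
	CONN_PASSIVE_WAIT,
	CONN_ESTABLISHED,
};

/* What the handler should do when an UNREACHABLE event arrives. */
enum unreachable_action {
	ACTION_BUG,          /* still asserted: INIT must never see the event */
	ACTION_FAIL_CONNREQ, /* connect in progress: fail the request */
	ACTION_REPORT_ONLY,  /* e.g. ESTABLISHED: report as unexpected, continue */
};

static enum unreachable_action on_unreachable(enum conn_state state)
{
	/* Only INIT remains a hard error after this change. */
	if (state == CONN_INIT)
		return ACTION_BUG;

	/* A pending connection attempt is completed with an error. */
	if (state == CONN_ACTIVE_CONNECT || state == CONN_PASSIVE_WAIT)
		return ACTION_FAIL_CONNREQ;

	/* Any other state, including ESTABLISHED, is tolerated. */
	return ACTION_REPORT_ONLY;
}
```

In this model, an ESTABLISHED connection maps to ACTION_REPORT_ONLY, so
the event no longer brings the node down; later transaction errors are
left to trigger the normal cleanup path.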
Lustre-change: https://review.whamcloud.com/53298
Lustre-commit: TBD (from cbde71bf893dba0de752a190c3b16d653ef75085)
Test-Parameters: trivial testlist=sanity-lnet
Signed-off-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Change-Id: If32166fe9fc59e025609c2035cb1c03d3bed22f2
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53301
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Cyril Bordage <cbordage@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
libcfs_nid2str(conn->ibc_peer->ibp_nid),
event->status,
conn->ibc_state);
- LASSERT(conn->ibc_state != IBLND_CONN_ESTABLISHED &&
- conn->ibc_state != IBLND_CONN_INIT);
+ LASSERT(conn->ibc_state != IBLND_CONN_INIT);
if (conn->ibc_state == IBLND_CONN_ACTIVE_CONNECT ||
conn->ibc_state == IBLND_CONN_PASSIVE_WAIT) {
kiblnd_connreq_done(conn, -ENETDOWN);