LU-15860 socklnd: Duplicate ksock_conn_cb 11/48911/2
author Chris Horn <chris.horn@hpe.com>
Thu, 12 May 2022 18:16:10 +0000 (13:16 -0500)
committer Oleg Drokin <green@whamcloud.com>
Tue, 11 Apr 2023 00:06:14 +0000 (00:06 +0000)
commit ea34ee7b40271ec23b6d9ed916a43971dd73fad5
tree e1fe2c4e866d313fe4d8fee4d6bf59dd1c896282
parent 727638a1d0e72a043c2798f081f166a0e3a39268
LU-15860 socklnd: Duplicate ksock_conn_cb

If two threads enter ksocknal_add_peer(), the first one to acquire
the ksnd_global_lock will create a ksock_peer_ni and associate a
ksock_conn_cb with it.

When the second thread acquires the ksnd_global_lock it will find the
existing ksock_peer_ni, but it does not check for an existing
ksock_conn_cb. As a result, it overwrites the existing ksock_conn_cb
(ksock_peer_ni::ksnp_conn_cb) and the ksock_conn_cb from the first
thread becomes stranded.

Modify ksocknal_add_peer() to check whether the peer_ni has an
existing ksock_conn_cb associated with it.
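
The check described above can be illustrated with a minimal userspace
sketch. This is not the socklnd code: struct peer_ni, struct conn_cb,
global_lock, global_peer, add_peer() and current_cb_id() are hypothetical
stand-ins, a pthread mutex replaces ksnd_global_lock, and a single pointer
replaces the peer table. It also assumes both objects are allocated before
the lock is taken. It only shows the idea of testing for an existing
conn_cb under the lock instead of overwriting it.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-ins for ksock_conn_cb / ksock_peer_ni. */
struct conn_cb { int id; };
struct peer_ni { struct conn_cb *conn_cb; };

/* Stand-in for ksnd_global_lock and a one-entry "peer table". */
static pthread_mutex_t global_lock = PTHREAD_MUTEX_INITIALIZER;
static struct peer_ni *global_peer;

/*
 * Returns 1 if the caller's conn_cb was attached, 0 if the peer already
 * had one (the caller's copy is freed), -1 on allocation failure.
 */
int add_peer(int cb_id)
{
	struct peer_ni *peer = calloc(1, sizeof(*peer));
	struct conn_cb *cb = calloc(1, sizeof(*cb));
	int attached = 0;

	if (peer == NULL || cb == NULL) {
		free(peer);
		free(cb);
		return -1;
	}
	cb->id = cb_id;

	pthread_mutex_lock(&global_lock);
	if (global_peer != NULL) {
		/* Another thread created the peer first: reuse it. */
		free(peer);
		peer = global_peer;
	} else {
		global_peer = peer;
	}

	/* The check in question: attach only if no conn_cb exists yet. */
	if (peer->conn_cb == NULL) {
		peer->conn_cb = cb;
		attached = 1;
	}
	assert(peer->conn_cb != NULL);
	pthread_mutex_unlock(&global_lock);

	if (!attached)
		free(cb);	/* drop the duplicate instead of stranding the original */
	return attached;
}

/* Returns the id of the conn_cb currently attached, or -1 if none. */
int current_cb_id(void)
{
	int id = -1;

	pthread_mutex_lock(&global_lock);
	if (global_peer != NULL && global_peer->conn_cb != NULL)
		id = global_peer->conn_cb->id;
	pthread_mutex_unlock(&global_lock);
	return id;
}
```

In this sketch, a first call to add_peer(1) attaches its conn_cb and a
second call to add_peer(2) finds the existing one and discards its own.
Without the NULL check, the second call would overwrite the pointer and
the first conn_cb would no longer be reachable.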

Lustre-change: https://review.whamcloud.com/47361
Lustre-commit: 0c91d49a44e1214b5c65d4a557f6969b3d217881

Fixes: 7766f01e89 ("LU-13641 socklnd: replace route construct")
HPE-bug-id: LUS-10956
Test-Parameters: trivial
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: I6c0190a0c1d3321ddd85c763b86ad1f0d32cf2b9
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andriy Skulysh <andriy.skulysh@hpe.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48911
lnet/klnds/socklnd/socklnd.c