LU-17271 kfilnd: Allocate tn_mr_key before kfilnd_peer 29/53029/4
author Chris Horn <chris.horn@hpe.com>
Tue, 7 Nov 2023 22:19:26 +0000 (15:19 -0700)
committer Oleg Drokin <green@whamcloud.com>
Thu, 15 Feb 2024 07:07:21 +0000 (07:07 +0000)
commit 2740cf66c88e2d04126f7016cb1958ca976cd323
tree 87648d349378d6a21f70e5ec4a82e966bf92a2b5
parent f054443e9c03b71a934fd7d05475fb55e50be8c8
LU-17271 kfilnd: Allocate tn_mr_key before kfilnd_peer

A race exists between kfilnd_peer and tn_mr_key allocation that could
result in RKEY re-use and data corruption.

Thread 1: Posts tagged receive with RKEY based on
          peerA::kp_local_session_key X and tn_mr_key Y
Thread 2: Fetches peerA with kp_local_session_key X
Thread 1: Cancels tagged receive, marks peerA for removal, and
          releases tn_mr_key Y
Thread 2: Allocates tn_mr_key Y
At this point, thread 2 has the same RKEY used by thread 1.

The fix is to always allocate the tn_mr_key before looking up the
peer, and always mark peers for removal before releasing tn_mr_key.
This commit modifies the TN allocation to ensure the tn_mr_key is
allocated before looking up the target peer.
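
The ordering argument above can be checked with a small single-threaded
model that replays the interleavings by hand. This is only a sketch under
stated assumptions, not kfilnd code: struct model, key_alloc(),
key_release(), peer_lookup(), make_rkey() and the two scenario functions
are all hypothetical names, the lowest-free-slot allocator is a stand-in
for the real tn_mr_key allocator, and the way the RKEY is packed from the
session key and the MR key is assumed purely for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model, not kfilnd code: every name here is illustrative. */
#define NKEYS 4

struct model {
	bool key_in_use[NKEYS];	/* stand-in for the tn_mr_key allocator */
	uint32_t session_key;	/* stand-in for kp_local_session_key */
	bool peer_removed;	/* peer has been marked for removal */
};

/* Return the lowest free key, or -1 if none is left. */
static int key_alloc(struct model *m)
{
	for (int i = 0; i < NKEYS; i++) {
		if (!m->key_in_use[i]) {
			m->key_in_use[i] = true;
			return i;
		}
	}
	return -1;
}

static void key_release(struct model *m, int key)
{
	assert(key >= 0 && key < NKEYS && m->key_in_use[key]);
	m->key_in_use[key] = false;
}

/* A peer marked for removal is replaced by one with a new session key. */
static uint32_t peer_lookup(struct model *m)
{
	if (m->peer_removed) {
		m->session_key++;
		m->peer_removed = false;
	}
	return m->session_key;
}

/* Assumed packing of session key and MR key into one RKEY. */
static uint32_t make_rkey(uint32_t session, int mr_key)
{
	return (session << 16) | (uint32_t)mr_key;
}

/* Old ordering: thread 2 fetches the peer first, then allocates its key. */
bool rkey_reused_lookup_first(void)
{
	struct model m = { .session_key = 7 };
	uint32_t s1 = peer_lookup(&m);
	int k1 = key_alloc(&m);
	uint32_t rkey1 = make_rkey(s1, k1);	/* thread 1 posts its receive */

	uint32_t s2 = peer_lookup(&m);	/* thread 2 fetches the peer */
	m.peer_removed = true;		/* thread 1 marks the peer for removal */
	key_release(&m, k1);		/* thread 1 releases its key */
	int k2 = key_alloc(&m);		/* thread 2 is handed the same key */

	return make_rkey(s2, k2) == rkey1;
}

/* New ordering: key before peer lookup, removal mark before key release. */
bool rkey_reused_alloc_first(bool alloc_after_release)
{
	struct model m = { .session_key = 7 };
	int k1 = key_alloc(&m);
	uint32_t rkey1 = make_rkey(peer_lookup(&m), k1);
	uint32_t s2;
	int k2;

	if (alloc_after_release) {
		m.peer_removed = true;	/* thread 1 marks the peer first */
		key_release(&m, k1);	/* and only then releases its key */
		k2 = key_alloc(&m);	/* same key, but the old peer is gone */
		s2 = peer_lookup(&m);	/* new peer, new session key */
	} else {
		k2 = key_alloc(&m);	/* thread 1 still holds its key */
		s2 = peer_lookup(&m);	/* same peer, different key */
		m.peer_removed = true;
		key_release(&m, k1);
	}
	return make_rkey(s2, k2) == rkey1;
}
```

In this model rkey_reused_lookup_first() reports a collision, while
rkey_reused_alloc_first() reports none for either interleaving: thread 2
either receives a different key because thread 1 still holds its own, or
it sees a replacement peer with a different session key.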

HPE-bug-id: LUS-11972
Test-Parameters: trivial
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: I2e0948ae4fe7c5dfb86e297a3437213f193bf67c
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/53029
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Ron Gredvig <ron.gredvig@hpe.com>
Reviewed-by: Ian Ziemba <ian.ziemba@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
lnet/klnds/kfilnd/kfilnd.c
lnet/klnds/kfilnd/kfilnd_tn.c
lnet/klnds/kfilnd/kfilnd_tn.h