LU-19076 ptlrpc: resend can hit original req 97/59497/12
author Alex Zhuravlev <bzzz@whamcloud.com>
Fri, 30 May 2025 15:45:41 +0000 (18:45 +0300)
committer Oleg Drokin <green@whamcloud.com>
Thu, 12 Jun 2025 06:37:12 +0000 (06:37 +0000)
commit 3f2bfd1b0e2ec78b12bd64f58d0a48a9b4f04978
tree d2e131a4f936d09f4fd3aad62f4bbfcbcfa7d32d
parent 7aba5126752c5734eae7de5363ad914aecd064d4
LU-19076 ptlrpc: resend can hit original req

The client may need to resend a request if the reply does not fit
in the reply buffer (for example, because the LOVEA has just changed).
In some environments (e.g. when server and client share the same
node), the resent RPC can find the original RPC still on the export's
list, and the server drops the resend, treating it as a duplicate.
The client then gets no reply to the resent RPC and times out.

If this happens during a layout refresh, where the client holds the
layout lock while requesting the LOVEA with MDS_GETXATTR, the
server can evict the client.

The patch removes the RPC from the export's list just before the
reply is sent: by then the RPC has already been processed, and a
resend of a non-idempotent request should be handled by
reconstruction.
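
The ordering issue can be illustrated with a small toy model. This is
only a sketch, not code from the patch: every toy_* name is
hypothetical, the export's list is reduced to an array of xids, and it
assumes that the server detects duplicates by finding the same xid
already on that list and that the resend arrives right after the reply
is sent.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy model; none of these names exist in the Lustre tree. */
#define TOY_MAX_REQS 8

struct toy_export {
	uint64_t xids[TOY_MAX_REQS];	/* requests the server still tracks */
	size_t   count;
};

/* Return true if a request with this xid is already on the list. */
static bool toy_export_has(const struct toy_export *exp, uint64_t xid)
{
	for (size_t i = 0; i < exp->count; i++)
		if (exp->xids[i] == xid)
			return true;
	return false;
}

static void toy_export_add(struct toy_export *exp, uint64_t xid)
{
	assert(exp->count < TOY_MAX_REQS);
	exp->xids[exp->count++] = xid;
}

/* Remove the xid from the list (order is not preserved). */
static void toy_export_del(struct toy_export *exp, uint64_t xid)
{
	for (size_t i = 0; i < exp->count; i++) {
		if (exp->xids[i] == xid) {
			exp->xids[i] = exp->xids[--exp->count];
			return;
		}
	}
}

/* Incoming request: drop it if it looks like a duplicate, else track it. */
static bool toy_incoming(struct toy_export *exp, uint64_t xid)
{
	if (toy_export_has(exp, xid))
		return false;		/* dropped, client gets no reply */
	toy_export_add(exp, xid);
	return true;
}

/*
 * Original request arrives, is processed and replied to; the client
 * finds the reply does not fit and resends with the same xid at once.
 * Returns true if the server accepts the resend.
 */
static bool toy_resend_accepted(bool unlink_before_reply)
{
	struct toy_export exp = { .count = 0 };
	const uint64_t xid = 42;
	bool accepted;

	accepted = toy_incoming(&exp, xid);	/* original request */
	assert(accepted);

	if (unlink_before_reply)
		toy_export_del(&exp, xid);	/* order described by the patch */

	/* reply is sent here; the resend races with late cleanup */
	accepted = toy_incoming(&exp, xid);

	if (!unlink_before_reply)
		toy_export_del(&exp, xid);	/* late cleanup of the original */

	return accepted;
}
```

In this model toy_resend_accepted(false) reports the resend as
dropped, while toy_resend_accepted(true) reports it as accepted, which
mirrors the behaviour change described above.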

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I48437ad018b9b43b9fff4157203906fd84b6cfd3
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59497
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
lustre/include/lustre_net.h
lustre/include/obd_support.h
lustre/mdt/mdt_handler.c
lustre/ptlrpc/niobuf.c
lustre/ptlrpc/service.c
lustre/tests/replay-single.sh