From: Lei Feng
Date: Thu, 30 Jun 2022 02:46:31 +0000 (+0800)
Subject: LU-15986 ptlrpc: protect rq_repmsg in ptlrpc_req_drop_rs()
X-Git-Tag: 2.15.53~207
X-Git-Url: https://git.whamcloud.com/?p=fs%2Flustre-release.git;a=commitdiff_plain;h=aaef545cff2dd958418ec9fb364d4bbe1408edb9

LU-15986 ptlrpc: protect rq_repmsg in ptlrpc_req_drop_rs()

There is a race condition on the server side: one thread has sent
the reply message and is deleting it, while another thread is
searching for an existing request and prints some debug information
in _debug_req() if the request is a duplicate. Both threads operate
on req->rq_repmsg, but it is not protected in ptlrpc_req_drop_rs().
Protect it with req->rq_early_free_lock.

Signed-off-by: Lei Feng
Change-Id: Ied55427ee15c3ef84bdd2d579844eba398dbf010
Reviewed-on: https://review.whamcloud.com/47839
Reviewed-by: Andreas Dilger
Reviewed-by: Li Xi
Reviewed-by: Qian Yingjin
Reviewed-by: Andrew Perepechko
Tested-by: jenkins
Tested-by: Maloo
Reviewed-by: Oleg Drokin
---

diff --git a/lustre/include/lustre_net.h b/lustre/include/lustre_net.h
index 897302c..105ef088 100644
--- a/lustre/include/lustre_net.h
+++ b/lustre/include/lustre_net.h
@@ -2497,11 +2497,18 @@ ptlrpc_rs_decref(struct ptlrpc_reply_state *rs)
 /* Should only be called once per req */
 static inline void ptlrpc_req_drop_rs(struct ptlrpc_request *req)
 {
-        if (req->rq_reply_state == NULL)
-                return; /* shouldn't occur */
-        ptlrpc_rs_decref(req->rq_reply_state);
-        req->rq_reply_state = NULL;
-        req->rq_repmsg = NULL;
+	if (req->rq_reply_state == NULL)
+		return; /* shouldn't occur */
+
+	/* req_repmsg equals rq_reply_state->rs_msg,
+	 * so set it to NULL before rq_reply_state is possibly freed
+	 */
+	spin_lock(&req->rq_early_free_lock);
+	req->rq_repmsg = NULL;
+	spin_unlock(&req->rq_early_free_lock);
+
+	ptlrpc_rs_decref(req->rq_reply_state);
+	req->rq_reply_state = NULL;
 }
 
 static inline __u32 lustre_request_magic(struct ptlrpc_request *req)
diff --git a/lustre/ptlrpc/service.c b/lustre/ptlrpc/service.c
index 9e7c34c..aeaaaca 100644
--- a/lustre/ptlrpc/service.c
+++ b/lustre/ptlrpc/service.c
@@ -1446,6 +1446,7 @@ static int ptlrpc_at_send_early_reply(struct ptlrpc_request *req)
 		GOTO(out_free, rc = -ENOMEM);
 
 	*reqcopy = *req;
+	spin_lock_init(&reqcopy->rq_early_free_lock);
 	reqcopy->rq_reply_state = NULL;
 	reqcopy->rq_rep_swab_mask = 0;
 	reqcopy->rq_pack_bulk = 0;
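
The ordering this patch relies on can be sketched in user space. This is a minimal illustration, not Lustre code: struct request, struct reply_state, request_peek() and request_drop_rs() are hypothetical stand-ins for ptlrpc_request, ptlrpc_reply_state, the debug-print path and ptlrpc_req_drop_rs(), and a pthread mutex is assumed in place of the kernel spinlock rq_early_free_lock. The alias pointer is cleared under the lock before the reference is dropped, so a reader holding the same lock sees either a valid message or NULL, never freed memory.

```c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for struct ptlrpc_reply_state. */
struct reply_state {
	int  refcount;
	char msg[32];
};

/* Hypothetical stand-in for struct ptlrpc_request. */
struct request {
	pthread_mutex_t     lock;   /* plays the role of rq_early_free_lock */
	struct reply_state *rs;     /* owning pointer, like rq_reply_state */
	const char         *repmsg; /* alias into rs->msg, like rq_repmsg */
};

static void request_init(struct request *req, const char *text)
{
	pthread_mutex_init(&req->lock, NULL);
	req->rs = calloc(1, sizeof(*req->rs));
	assert(req->rs != NULL);
	req->rs->refcount = 1;
	/* calloc() zeroed the buffer, so the copy stays NUL-terminated. */
	strncpy(req->rs->msg, text, sizeof(req->rs->msg) - 1);
	req->repmsg = req->rs->msg;
}

/*
 * Reader side: copy the message out while holding the lock.
 * Returns 1 if a message was copied, 0 if the reply is already gone.
 */
static int request_peek(struct request *req, char *buf, size_t len)
{
	int found = 0;

	pthread_mutex_lock(&req->lock);
	if (req->repmsg != NULL) {
		snprintf(buf, len, "%s", req->repmsg);
		found = 1;
	}
	pthread_mutex_unlock(&req->lock);
	return found;
}

/* Writer side: same ordering as the patched ptlrpc_req_drop_rs(). */
static void request_drop_rs(struct request *req)
{
	if (req->rs == NULL)
		return;

	/* Clear the alias first, under the lock readers also take. */
	pthread_mutex_lock(&req->lock);
	req->repmsg = NULL;
	pthread_mutex_unlock(&req->lock);

	/* Only now drop the reference that may free the message. */
	if (--req->rs->refcount == 0)
		free(req->rs);
	req->rs = NULL;
}
```

Calling request_peek() before request_drop_rs() copies the message out; calling it afterwards returns 0 without touching the freed buffer. The second hunk of the patch has no counterpart here: it re-initializes the lock in the copied request because a lock copied by struct assignment may carry the state of the original.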