LU-14027 ldlm: Do not wait for lock replay sending if import disconnected 23/41223/2
author     Oleg Drokin <green@whamcloud.com>
           Fri, 16 Oct 2020 14:25:58 +0000 (10:25 -0400)
committer  Oleg Drokin <green@whamcloud.com>
           Thu, 4 Mar 2021 08:36:39 +0000 (08:36 +0000)
If the import disconnected while we were preparing to send lock replays,
the replay RPCs would get stuck on the sending list and keep the
reconnected import state from progressing from REPLAY to REPLAY_LOCKS,
waiting for the queued replay RPCs to finish.

Mark the replay RPCs as no_delay to ensure they don't wait.

LU-13600 exacerbated this issue somewhat, but it certainly existed
before that change as well.
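
For context, the sketch below illustrates how a request flagged rq_no_delay
is expected to be handled when the import has not yet reached the state named
in rq_send_state: rather than being parked until recovery advances, it fails
out immediately so the caller can drop it and let recovery move on. This is a
simplified illustration, not the actual ptlrpc code; the helper name
sketch_should_delay_req and the exact error code are assumptions made for
this example.

#include <linux/errno.h>
#include <lustre_import.h>      /* struct obd_import, imp_state */
#include <lustre_net.h>         /* struct ptlrpc_request, rq_send_state, rq_no_delay */

/* Hypothetical helper, loosely modelled on the import delay check:
 * returns 1 when the request should be queued until the import reaches
 * req->rq_send_state, 0 when the caller may proceed (or must fail the
 * request right away with the code stored in *status). */
static int sketch_should_delay_req(struct obd_import *imp,
                                   struct ptlrpc_request *req,
                                   int *status)
{
        if (imp->imp_state == req->rq_send_state) {
                *status = 0;
                return 0;               /* import ready: send the RPC now */
        }

        if (req->rq_no_delay) {
                *status = -EAGAIN;      /* do not park on the delayed list */
                return 0;               /* caller gives up immediately */
        }

        return 1;                       /* wait for the import to catch up */
}

With the fix, replay_one_lock() sets req->rq_no_delay = 1 (see the diff
below), so a lock replay RPC that races with another disconnect no longer
blocks the REPLAY to REPLAY_LOCKS transition.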

Lustre-change: https://review.whamcloud.com/40272
Lustre-commit: f06a4efe13faca21ae2a6afcf5718d748bb6ac5d

Change-Id: Id276a0be7657d9ad6cf40ad8e7a165d5cd341cb8
Signed-off-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Signed-off-by: Etienne AUJAMES <eaujames@ddn.com>
Reviewed-on: https://review.whamcloud.com/41223
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
lustre/ldlm/ldlm_request.c

index 566db66..be0bb8e 100644
@@ -2355,6 +2355,8 @@ static int replay_one_lock(struct obd_import *imp, struct ldlm_lock *lock)
 
         /* We're part of recovery, so don't wait for it. */
         req->rq_send_state = LUSTRE_IMP_REPLAY_LOCKS;
+       /* If the state changed while we were preparing, don't wait */
+       req->rq_no_delay = 1;
 
         body = req_capsule_client_get(&req->rq_pill, &RMF_DLM_REQ);
         ldlm_lock2desc(lock, &body->lock_desc);