LU-14522 ldlm: reprocess locks if enqueue failed 31/42031/7
author Alex Zhuravlev <bzzz@whamcloud.com>
Sun, 14 Mar 2021 04:29:11 +0000 (07:29 +0300)
committer Oleg Drokin <green@whamcloud.com>
Tue, 6 Apr 2021 03:02:51 +0000 (03:02 +0000)
If the export gets disconnected during enqueue, ldlm_handle_enqueue0()
drops the lock but can skip reprocessing, so all subsequent waiting
locks that conflict with the dropped one may get stuck.

With the patch, most racer runs succeed; without it, 1/4 of runs get stuck.
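
The stall described above can be illustrated with a toy model. This is a
minimal sketch, not Lustre code: every toy_* name is hypothetical, the
resource holds only exclusive locks, and toy_reprocess() merely stands in
for the role ldlm_reprocess_all() plays in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model (hypothetical, not Lustre code): one resource whose locks
 * are all mutually exclusive. */
enum toy_state { TOY_WAITING, TOY_GRANTED, TOY_DESTROYED };

struct toy_lock {
	enum toy_state state;
};

#define TOY_MAX_LOCKS 8

struct toy_resource {
	struct toy_lock *locks[TOY_MAX_LOCKS];	/* in arrival order */
	size_t nr;
};

/* Return true if any lock on the resource is currently granted. */
bool toy_busy(const struct toy_resource *res)
{
	for (size_t i = 0; i < res->nr; i++)
		if (res->locks[i]->state == TOY_GRANTED)
			return true;
	return false;
}

/* Enqueue: grant at once if the resource is free, otherwise wait. */
void toy_enqueue(struct toy_resource *res, struct toy_lock *lk)
{
	assert(res->nr < TOY_MAX_LOCKS);
	lk->state = toy_busy(res) ? TOY_WAITING : TOY_GRANTED;
	res->locks[res->nr++] = lk;
}

/* Reprocess: if nothing is granted, grant the oldest waiting lock. */
void toy_reprocess(struct toy_resource *res)
{
	if (toy_busy(res))
		return;
	for (size_t i = 0; i < res->nr; i++) {
		if (res->locks[i]->state == TOY_WAITING) {
			res->locks[i]->state = TOY_GRANTED;
			return;
		}
	}
}

/* Drop a lock, as on a failed enqueue. Nothing else wakes the waiters,
 * so skipping the reprocess step leaves them in TOY_WAITING. */
void toy_drop(struct toy_resource *res, struct toy_lock *lk, bool reprocess)
{
	lk->state = TOY_DESTROYED;
	if (reprocess)
		toy_reprocess(res);
}
```

In this model, enqueueing lock A and then lock B leaves B waiting.
toy_drop(&res, &a, false) keeps B waiting indefinitely, while
toy_drop(&res, &a, true) grants it.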

Fixes: 37932c4beb ("LU-10175 ldlm: IBITS lock convert instead of cancel")
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I584b0de2656840da5dfa86a894fe02f138e1389d
Reviewed-on: https://review.whamcloud.com/42031
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
lustre/ldlm/ldlm_lockd.c

index 52930be..63d345d 100644
@@ -1467,9 +1467,9 @@ existing_lock:
 out:
        req->rq_status = rc ?: err; /* return either error - b=11190 */
        if (!req->rq_packed_final) {
-               err = lustre_pack_reply(req, 1, NULL, NULL);
+               int rc1 = lustre_pack_reply(req, 1, NULL, NULL);
                if (rc == 0)
-                       rc = err;
+                       rc = rc1;
        }
 
        /*
@@ -1547,8 +1547,8 @@ retry:
                                ldlm_resource_unlink_lock(lock);
                                ldlm_lock_destroy_nolock(lock);
                                unlock_res_and_lock(lock);
-
                        }
+                       ldlm_reprocess_all(lock->l_resource, lock);
                }
 
                if (!err && !ldlm_is_cbpending(lock) &&