LU-11762 ldlm: ensure the recovery timer is armed
author Hongchao Zhang <hongchao@whamcloud.com>
Wed, 10 Jul 2019 08:22:15 +0000 (04:22 -0400)
committer Oleg Drokin <green@whamcloud.com>
Fri, 6 Dec 2019 01:06:46 +0000 (01:06 +0000)
During recovery, when the recovery timer expires, the VBR phase
should be initiated only if the current recovery timeout is less than
the hard recovery timeout. Otherwise recovery gets stuck in
wait_event_timeout(), because no timer is armed and nothing can wake
it up.
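
The guard described above can be modelled outside the kernel as a small
predicate. This is only a sketch: struct recovery_state and both helper
functions are hypothetical stand-ins invented for illustration, and the
three fields merely mirror the obd_device members that the patch reads.
The units (seconds) are an assumption as well.

```c
#include <assert.h>   /* not used by the helpers; handy for caller-side checks */
#include <stdbool.h>

/* Hypothetical stand-in for the obd_device fields read by the patch. */
struct recovery_state {
	bool recovery_expired;   /* models obd->obd_recovery_expired */
	long recovery_timeout;   /* models obd->obd_recovery_timeout (assumed seconds) */
	long recovery_time_hard; /* models obd->obd_recovery_time_hard (assumed seconds) */
};

/*
 * Patched condition: take the "recovery expired" branch only while the
 * current timeout is still strictly below the hard limit.
 */
static bool takes_expired_branch(const struct recovery_state *s)
{
	return s->recovery_expired &&
	       s->recovery_timeout < s->recovery_time_hard;
}

/* Pre-patch condition, kept only for comparison. */
static bool takes_expired_branch_old(const struct recovery_state *s)
{
	return s->recovery_expired;
}
```

With an expired timer and a timeout of 900 against a hard limit of 900,
takes_expired_branch_old() returns true while takes_expired_branch()
returns false, which is the case the patch changes. What the caller does
once the branch is skipped is outside the scope of this sketch.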

Change-Id: I32467afa45393e37f255e2b14f160c9da710461b
Signed-off-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/35627
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
lustre/ldlm/ldlm_lib.c

index 2688d07..6c311e2 100644
@@ -2173,7 +2173,8 @@ repeat:
                /** evict exports which didn't finish recovery yet */
                class_disconnect_stale_exports(obd, exp_finished);
                return 1;
-       } else if (obd->obd_recovery_expired) {
+       } else if (obd->obd_recovery_expired &&
+                  obd->obd_recovery_timeout < obd->obd_recovery_time_hard) {
                obd->obd_recovery_expired = 0;
 
                /** If some clients died being recovered, evict them */