From: Niu Yawei
Date: Mon, 16 Dec 2013 08:08:08 +0000 (-0500)
Subject: LU-4365 quota: wait for global lock cancel
X-Git-Tag: 2.4.2-RC1~3
X-Git-Url: https://git.whamcloud.com/?a=commitdiff_plain;h=refs%2Fchanges%2F86%2F8586%2F3;p=fs%2Flustre-release.git

LU-4365 quota: wait for global lock cancel

In qsd_qtype_fini(), we'd wait for the global lock cancel done.

Test-Parameters: envdefinitions=SLOW=yes,ENABLE_QUOTA=yes \
  mdtfilesystemtype=zfs mdsfilesystemtype=zfs ostfilesystemtype=zfs \
  testlist=recovery-small
Signed-off-by: Niu Yawei
Change-Id: Id235c7d6e07f5ce436655a6d5382e4c8c161fa3b
Reviewed-on: http://review.whamcloud.com/8586
Tested-by: Jenkins
Tested-by: Maloo
Reviewed-by: Oleg Drokin
---

diff --git a/lustre/quota/qsd_lib.c b/lustre/quota/qsd_lib.c
index b1a29b1..f8b3b5b 100644
--- a/lustre/quota/qsd_lib.c
+++ b/lustre/quota/qsd_lib.c
@@ -273,6 +273,7 @@ static void qsd_qtype_fini(const struct lu_env *env, struct qsd_instance *qsd,
 			   int qtype)
 {
 	struct qsd_qtype_info *qqi;
+	int repeat = 0;
 	ENTRY;
 
 	if (qsd->qsd_type_array[qtype] == NULL)
@@ -290,6 +291,29 @@ static void qsd_qtype_fini(const struct lu_env *env, struct qsd_instance *qsd,
 		qqi->qqi_site = NULL;
 	}
 
+	/* The qqi may still be holding by global locks which are being
+	 * canceled asynchronously (LU-4365), see the following steps:
+	 *
+	 * - On server umount, we try to clear all quota locks first by
+	 *   disconnecting LWP (which will invalidate import and cleanup
+	 *   all locks on it), however, if quota reint process is holding
+	 *   the global lock for reintegration at that time, global lock
+	 *   will fail to be cleared on LWP disconnection.
+	 *
+	 * - Umount process goes on and stops reint process, the global
+	 *   lock will be dropped on reint process exit, however, the lock
+	 *   cancel in done in asynchronous way, so the
+	 *   qsd_glb_blocking_ast() might haven't been called yet when we
+	 *   get here.
+	 */
+	while (cfs_atomic_read(&qqi->qqi_ref) > 1) {
+		CDEBUG(D_QUOTA, "qqi reference count %u, repeat: %d\n",
+		       cfs_atomic_read(&qqi->qqi_ref), repeat);
+		repeat++;
+		cfs_schedule_timeout_and_set_state(TASK_INTERRUPTIBLE,
+						   cfs_time_seconds(1));
+	}
+
 	/* by now, all qqi users should have gone away */
 	LASSERT(cfs_atomic_read(&qqi->qqi_ref) == 1);
 	lu_ref_fini(&qqi->qqi_reference);
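
Note (not part of the commit): the wait loop added by this patch can be modelled in user space roughly as follows. This is only a minimal sketch under stated assumptions, not the Lustre implementation. `demo_obj`, `demo_sleep_tick()`, `demo_wait_for_last_ref()` and `demo_run()` are hypothetical names. C11 atomics stand in for `cfs_atomic_read()`, and a simulated tick replaces the one-second sleep so that the example runs deterministically. As in the patch, the loop has no upper bound, so it relies on the extra reference eventually being dropped.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-in for struct qsd_qtype_info: only the fields
 * needed to model the reference count and the delayed release. */
struct demo_obj {
	atomic_int ref;			/* models qqi_ref */
	int ticks_until_release;	/* simulated delay of the async cancel */
};

/* Stand-in for the one-second sleep in the patch. Instead of sleeping,
 * it advances simulated time; when the delay expires it drops the extra
 * reference, as the asynchronous lock cancel eventually would. */
static void demo_sleep_tick(struct demo_obj *obj)
{
	if (obj->ticks_until_release > 0 && --obj->ticks_until_release == 0)
		atomic_fetch_sub(&obj->ref, 1);
}

/* Poll until only the caller's own reference remains.
 * Returns how many times the loop had to wait. */
static int demo_wait_for_last_ref(struct demo_obj *obj)
{
	int repeat = 0;

	while (atomic_load(&obj->ref) > 1) {
		repeat++;
		demo_sleep_tick(obj);
	}

	/* Mirrors the LASSERT that follows the loop in the patch. */
	assert(atomic_load(&obj->ref) == 1);
	return repeat;
}

/* Run one scenario: the extra reference (if any) is dropped after
 * delay_ticks simulated sleeps. A delay of 0 models the case where
 * the cancel has already completed before teardown starts. */
static int demo_run(int delay_ticks)
{
	struct demo_obj obj;

	atomic_init(&obj.ref, delay_ticks > 0 ? 2 : 1);
	obj.ticks_until_release = delay_ticks;
	return demo_wait_for_last_ref(&obj);
}
```

Calling `demo_run(3)` models a cancel that completes three sleeps after teardown begins. `demo_run(0)` models the case the original code already handled, where no extra reference is left.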