LU-1428 ldlm: fix a race in ldlm_lock_destroy_internal
author Liang Zhen <liang@whamcloud.com>
Tue, 5 Jun 2012 08:34:34 +0000 (16:34 +0800)
committer Andreas Dilger <adilger@whamcloud.com>
Tue, 19 Jun 2012 19:42:13 +0000 (15:42 -0400)
commit f432efadf096764778702a6249a3e7fd4d15c844
tree 7bd1523bfea9b6e1586ac91cb1425c4599680b7b
parent 845d4b760ccdc685b6bc9746985532429e717224

ldlm_lock::l_exp_hash should be protected by the internal lock of
cfs_hash, but ldlm_lock_destroy_internal() called
cfs_hlist_unhashed(lock::l_exp_hash) without holding the cfs_hash
lock.

If someone calls ldlm_lock_cancel() on a lock while
export::exp_lock_hash is being rehashed (in the thread context of
cfs_workitem), there is a tiny window between deleting the lock from
bucket[A] and re-adding it to bucket[B] of the hash. In this window
cfs_hlist_unhashed(lock::l_exp_hash) returns 1, so the lock is
destroyed without being removed from the hash, and the rehash then
re-adds it.

The lock stays on the hash forever: lock::l_destroyed is already set
to 1, so ldlm_lock_destroy_internal() can no longer remove the lock
via l_exp_hash, no matter how many times it is called from
ldlm_cancel_locks_for_export_cb().
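
The interleaving described above can be replayed deterministically in
a single thread. The sketch below is illustrative only: struct node,
struct lock, destroy() and replay() are hypothetical stand-ins, not
libcfs or ldlm code. The "check under the hash lock" case is modelled
by running the check after the rehash has finished, which is the
ordering that blocking on the hash lock would force.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical hlist-style node; pprev is NULL while on no bucket. */
struct node {
	struct node *next;
	struct node **pprev;
};

/* Hypothetical stand-in for the two ldlm_lock fields involved. */
struct lock {
	struct node exp_hash;	/* models ldlm_lock::l_exp_hash */
	bool destroyed;		/* models ldlm_lock::l_destroyed */
};

static bool node_unhashed(const struct node *n)
{
	return n->pprev == NULL;
}

static void node_add(struct node **head, struct node *n)
{
	n->next = *head;
	if (*head != NULL)
		(*head)->pprev = &n->next;
	*head = n;
	n->pprev = head;
}

static void node_del(struct node *n)
{
	*n->pprev = n->next;
	if (n->next != NULL)
		n->next->pprev = n->pprev;
	n->next = NULL;
	n->pprev = NULL;
}

/* Destroy path: unlink only if the node looks hashed right now. */
static void destroy(struct lock *lk)
{
	if (lk->destroyed)
		return;		/* repeated calls do nothing */
	lk->destroyed = true;
	if (!node_unhashed(&lk->exp_hash))
		node_del(&lk->exp_hash);
}

/* Returns true if the lock ends up destroyed but still hashed. */
static bool replay(bool check_under_hash_lock)
{
	struct node *bucket_a = NULL, *bucket_b = NULL;
	struct lock lk = { { NULL, NULL }, false };

	node_add(&bucket_a, &lk.exp_hash);

	node_del(&lk.exp_hash);			/* rehash, step 1 */
	if (!check_under_hash_lock)
		destroy(&lk);			/* runs inside the window */
	node_add(&bucket_b, &lk.exp_hash);	/* rehash, step 2 */
	if (check_under_hash_lock)
		destroy(&lk);			/* serialized after rehash */

	destroy(&lk);	/* a retry cannot help once destroyed is set */
	return lk.destroyed && !node_unhashed(&lk.exp_hash);
}

/* Self-check covering both interleavings. */
static void check_replay(void)
{
	assert(replay(false));	/* unlocked check: lock is leaked */
	assert(!replay(true));	/* serialized check: lock is removed */
}
```

Calling check_replay() exercises both orderings: replay(false) leaves
the destroyed lock in bucket_b, while replay(true) unlinks it.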

This patch also adds some debug information to
ldlm_cancel_locks_for_export_cb() in case the fix does not resolve
the problem.

Signed-off-by: Liang Zhen <liang@whamcloud.com>
Change-Id: Ia0932658b3f085a55535e36bee4fb833e74fa242
Reviewed-on: http://review.whamcloud.com/3028
Tested-by: Hudson
Reviewed-by: Fan Yong <yong.fan@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Lai Siyao <laisiyao@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
libcfs/libcfs/hash.c
lustre/ldlm/ldlm_lock.c
lustre/ldlm/ldlm_lockd.c