LU-17427 mdt: reduce hold time for BFL rename lock
In non-parallel rename, resources are locked in a fixed order:
first the global BFL, then (up to) 4 FID locks. This means the
BFL can be held for a long time if any client holding one of the
FID locks is unresponsive for some reason. This is especially
likely in a big cluster where hundreds or thousands of clients
run various workloads.
To reduce the hold time of the BFL, we restructure the locking
as follows:
1. get the 4 child locks first (to cancel the majority of
lock holders)
2. try to get the BFL resource lock if it is uncontended
3a. if BFL is contended then drop child locks and wait for it
3b. re-get the child locks if they were dropped in 3a
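
The steps above can be sketched in user space as follows. This is
only an illustration, not the MDT code: plain pthread mutexes stand
in for LDLM locks, and bfl_sketch, rename_lock_all_sketch() and the
other *_sketch names are hypothetical. It also assumes the caller
passes the child locks already sorted in a deadlock-free order.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_CHILD_LOCKS_SKETCH 4

/* Hypothetical stand-in for the global rename lock (BFL). */
static pthread_mutex_t bfl_sketch = PTHREAD_MUTEX_INITIALIZER;

static void lock_children_sketch(pthread_mutex_t **locks, size_t n)
{
	for (size_t i = 0; i < n; i++)
		pthread_mutex_lock(locks[i]);
}

static void unlock_children_sketch(pthread_mutex_t **locks, size_t n)
{
	/* Release in reverse acquisition order. */
	while (n-- > 0)
		pthread_mutex_unlock(locks[n]);
}

/*
 * Take the child locks and the global lock.
 * Returns true if the child locks had to be dropped and re-taken.
 * On return the caller holds bfl_sketch and all n child locks.
 */
static bool rename_lock_all_sketch(pthread_mutex_t **children, size_t n)
{
	assert(n <= MAX_CHILD_LOCKS_SKETCH);

	/* 1. get the child locks first */
	lock_children_sketch(children, n);

	/* 2. take the global lock only if it is uncontended */
	if (pthread_mutex_trylock(&bfl_sketch) == 0)
		return false;

	/* 3a. contended: drop the child locks, then wait for it */
	unlock_children_sketch(children, n);
	pthread_mutex_lock(&bfl_sketch);

	/* 3b. re-get the child locks, now under the global lock */
	lock_children_sketch(children, n);
	return true;
}

static void rename_unlock_all_sketch(pthread_mutex_t **children, size_t n)
{
	unlock_children_sketch(children, n);
	pthread_mutex_unlock(&bfl_sketch);
}
```

In this sketch a waiter never sleeps on the global lock while holding
child locks, and the uncontended path takes the global lock last, so
it is not held while the child locks are being acquired.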
Note that step 1 no longer runs under the BFL umbrella, which in
turn may introduce deadlocks. Hence some ordering logic is added,
e.g. mdt_rename_determine_lock_order() orders by FID if the
objects belong to different parents.
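
As a rough illustration of ordering by FID, the sketch below sorts a
small set of FIDs into one global ascending order, so that every
thread would acquire the matching locks in the same sequence. It is
not the body of mdt_rename_determine_lock_order(); struct fid_sketch
and the fid_sketch_*() helpers are hypothetical, and the field layout
(sequence, object ID, version) is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical FID stand-in: sequence, object ID, version. */
struct fid_sketch {
	uint64_t seq;
	uint32_t oid;
	uint32_t ver;
};

/* Total order on FIDs: compare seq, then oid, then ver. */
static int fid_sketch_cmp(const void *a, const void *b)
{
	const struct fid_sketch *x = a;
	const struct fid_sketch *y = b;

	if (x->seq != y->seq)
		return x->seq < y->seq ? -1 : 1;
	if (x->oid != y->oid)
		return x->oid < y->oid ? -1 : 1;
	if (x->ver != y->ver)
		return x->ver < y->ver ? -1 : 1;
	return 0;
}

/* Sort FIDs in place; locks would then be taken in array order. */
static void fid_sketch_sort(struct fid_sketch *fids, size_t n)
{
	assert(fids != NULL || n == 0);
	if (n > 1)
		qsort(fids, n, sizeof(*fids), fid_sketch_cmp);
}
```

Because two threads that sort the same FIDs get the same sequence,
neither can hold a later lock while waiting for an earlier one, which
is the usual way to avoid lock-order deadlocks without an umbrella
lock.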
Perf results:
1. two active clients with only renames:
   max_dirs    without-patch    with-patch
   500x500     14418            14276
   150x150     20419            20210
2. slow-client scenario, QPS=10 with a 5s pause, where P(X)=1%
for master and P(X)=0.1% for the patch (not quite sure whether
these P values are reasonable):
   Metric (ms/req)    without-patch    with-patch
   Average Latency    205.76           21.50
Signed-off-by: Keguang Xu <squalfof@gmail.com>
Change-Id: I806f52a7c23acdb0f9cb7e7b4c8e306db2fad8d5
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/57741
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>