LU-3031 ldlm: disconnect speedup 43/5843/31
author Vitaly Fertman <vitaly_fertman@xyratex.com>
Wed, 22 Jul 2015 14:52:03 +0000 (10:52 -0400)
committer Oleg Drokin <oleg.drokin@intel.com>
Tue, 18 Aug 2015 11:12:55 +0000 (11:12 +0000)
commit 79e81d228320a26f6ea39a174b4bef2ac1dd1fd9
tree 83cc0eb2e76e4e2cf984353c8dd8aab084856ced
parent 76040fdbbb739053695386e7ed6d0dcb1ea7539a
LU-3031 ldlm: disconnect speedup

Disconnect takes too long if there are many locks to cancel.
Besides the time spent on each lock cancel, there is a
resched() in cfs_hash_for_each_relax(), so disconnect or eviction
may take an unexpectedly long time.
- do not cancel locks on disconnect_export;
- export will be left in obd_unlinked_exports list pinned by live
  locks;
- new re-connects will create other non-conflicting exports;
- new locks will cancel obsolete locks on conflicts;
- once all the locks on the disconnected export are cancelled,
  the export will be destroyed on the last ref put;
- do not cancel in small portions; cancel all together in a single
  dedicated thread, using the server-side blocking thread for that;
- cancel blocked locks first so that waiting locks can proceed;
- take care of blocked waiting locks, so that they get
  cancelled quickly too;
- on AST error, do not remove the lock from the waiting list before
  moving it to the elt_expired_locks list, because that removal also
  takes it off the export list; otherwise this blocked lock will not
  be cancelled immediately on a failed export;
- cancel the lock instead of just destroying it for a failed export,
  to do a full cleanup, i.e. remove it from the export list.
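
The export lifetime described in the list above can be sketched as a toy
reference-counting model. This is only an illustration under stated
assumptions: struct toy_export and every toy_* function are hypothetical
names invented here, not the Lustre structures or API, and the model
assumes one reference for the connection plus one per live lock.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for an export; not Lustre code. */
struct toy_export {
	int  refcount;     /* assumed: 1 for the connection + 1 per live lock */
	bool disconnected; /* set on disconnect; no locks are cancelled then */
	bool destroyed;    /* set when the last reference is dropped */
};

static void toy_export_init(struct toy_export *exp)
{
	exp->refcount = 1; /* the connection's own reference */
	exp->disconnected = false;
	exp->destroyed = false;
}

/* Drop one reference; the export is destroyed on the last put. */
static void toy_export_put(struct toy_export *exp)
{
	assert(exp->refcount > 0);
	if (--exp->refcount == 0)
		exp->destroyed = true;
}

/* A granted lock pins the export. */
static void toy_lock_grant(struct toy_export *exp)
{
	exp->refcount++;
}

/* Cancelling a lock releases its pin on the export. */
static void toy_lock_cancel(struct toy_export *exp)
{
	toy_export_put(exp);
}

/* Disconnect drops only the connection reference and cancels nothing. */
static void toy_export_disconnect(struct toy_export *exp)
{
	exp->disconnected = true;
	toy_export_put(exp);
}
```

In this model, disconnecting an export that still holds two locks leaves
it alive but marked disconnected; it is destroyed only when the second
lock is cancelled.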

Also enforce the proper order of events on umount:
- disconnect export;
- cleanup namespace, to cancel all the locks before export barrier;
- exports barrier;
- lprocfs_free_per_client_stats (requires nid_exp_ref_count == 0);
- namespace_free_post is left in cleanup, to ensure we do not
  segfault on an absent namespace.
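
The umount ordering above can be sketched as a toy state machine that
refuses to run a step before its precondition holds. This is a sketch
under assumptions, not the patch itself: struct toy_target and the toy_*
functions are hypothetical, and the way nid_exp_ref_count drops to zero
at the barrier is a simplification made for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a target being unmounted; not Lustre code. */
struct toy_target {
	int  live_locks;        /* locks still pinning the export */
	int  nid_exp_ref_count; /* name from the message; semantics assumed */
	bool disconnected;
	bool stats_freed;
	bool namespace_freed;
};

/* Step 1: disconnect; locks are deliberately left alone. */
static void toy_disconnect_export(struct toy_target *t)
{
	t->disconnected = true;
}

/* Step 2: namespace cleanup cancels all remaining locks. */
static void toy_namespace_cleanup(struct toy_target *t)
{
	t->live_locks = 0;
}

/* Step 3: the barrier can pass only once no lock pins an export.
 * A real barrier would wait; this model reports failure instead. */
static bool toy_exports_barrier(struct toy_target *t)
{
	if (!t->disconnected || t->live_locks > 0)
		return false;
	t->nid_exp_ref_count = 0; /* assumed: exports are gone now */
	return true;
}

/* Step 4: per-client stats require nid_exp_ref_count == 0. */
static bool toy_free_per_client_stats(struct toy_target *t)
{
	if (t->nid_exp_ref_count != 0)
		return false;
	t->stats_freed = true;
	return true;
}

/* Step 5: the namespace itself is freed last. */
static void toy_namespace_free_post(struct toy_target *t)
{
	t->namespace_freed = true;
}

/* Run the steps in the order given in the list above. */
static bool toy_umount(struct toy_target *t)
{
	toy_disconnect_export(t);
	toy_namespace_cleanup(t);
	if (!toy_exports_barrier(t))
		return false;
	if (!toy_free_per_client_stats(t))
		return false;
	toy_namespace_free_post(t);
	return true;
}
```

In this model, skipping the namespace cleanup makes the barrier fail,
which in turn keeps the per-client stats from being freed.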

Signed-off-by: Vitaly Fertman <vitaly_fertman@xyratex.com>
Change-Id: Ia39b09ce967237ed5078c8a71e760b1e103c6f55
Xyratex-bug-id: MRP-395 MRP-1366 MRP-1366
Reviewed-by: Andriy Skulysh <Andriy_Skulysh@xyratex.com>
Reviewed-by: Alexey Lyashkov <Alexey_Lyashkov@xyratex.com>
Tested-by: Elena Gryaznova <Elena_Gryaznova@xyratex.com>
Reviewed-on: http://review.whamcloud.com/5843
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
17 files changed:
lustre/include/lustre_dlm.h
lustre/include/lustre_export.h
lustre/include/obd_class.h
lustre/include/obd_support.h
lustre/ldlm/ldlm_internal.h
lustre/ldlm/ldlm_lib.c
lustre/ldlm/ldlm_lock.c
lustre/ldlm/ldlm_lockd.c
lustre/ldlm/ldlm_pool.c
lustre/ldlm/ldlm_resource.c
lustre/mdt/mdt_handler.c
lustre/mgs/mgs_fs.c
lustre/mgs/mgs_handler.c
lustre/obdclass/class_obd.c
lustre/obdclass/genops.c
lustre/ofd/ofd_dev.c
lustre/tests/recovery-small.sh