LU-11926 ldlm: Lost lease lock on migrate error 82/34182/4
author Andriy Skulysh <c17819@cray.com>
Tue, 4 Dec 2018 13:27:58 +0000 (15:27 +0200)
committer Oleg Drokin <green@whamcloud.com>
Thu, 21 Mar 2019 03:43:41 +0000 (03:43 +0000)
commit ae7ca90713b444647e682599398b28c8c16b68f7
tree cb7aedfb14195344a6d02caccc83e374d5449d3b
parent ee009da38d016a129b975aa3db77c675a17c1c3d
LU-11926 ldlm: Lost lease lock on migrate error

All file operations take their locks in the same order: parent first,
then child. Once a lock on the child has been returned to the client,
subsequent operations on that file are performed by the child FID.

However, migrate is an exception: it takes the lease lock first and
then takes the PW parent lock during MDS_REINT.

Meanwhile, a racing parallel operation (open) that has already taken
a lock on the parent (conflicting with the upcoming MDS_REINT) and is
trying to take a lock on the child is blocked until the lease cancel
arrives.

The lease cancel is piggy-backed on the MDS_REINT RPC and is handled
at the end of the operation, which first tries to take the conflicting
parent lock. Thus a deadlock occurs.
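
The cycle described above can be sketched as a tiny wait-for model. This
is only an illustration: every name in it (toy_state, toy_deadlocked,
MIGRATE, OPEN_OP and so on) is hypothetical and none of it is Lustre
code. It treats the lease as the lock held on the child.

```c
#include <stdbool.h>

/* Toy model only: these names are hypothetical, not Lustre APIs. */
enum toy_actor { MIGRATE, OPEN_OP, NACTORS };
enum toy_res { PARENT, CHILD, NRES };

struct toy_state {
	int holder[NRES];   /* actor holding each resource, -1 if free */
	int wants[NACTORS]; /* resource each actor waits for, -1 if none */
};

/*
 * Follow "actor waits for a resource held by another actor" edges,
 * beginning at 'start'. Getting back to 'start' means there is a
 * wait-for cycle, i.e. a deadlock.
 */
static bool toy_deadlocked(const struct toy_state *s, int start)
{
	int cur = start;

	for (int hops = 0; hops < NACTORS; hops++) {
		int res = s->wants[cur];

		if (res < 0)
			return false; /* cur is not waiting for anything */
		if (s->holder[res] < 0)
			return false; /* resource is free, no wait */
		if (s->holder[res] == start)
			return true;  /* back at the start: cycle */
		cur = s->holder[res];
	}
	return false;
}

/*
 * The state from the description: migrate holds the lease (child) and
 * wants the parent; open holds the parent and wants the child.
 */
static struct toy_state toy_migrate_vs_open(void)
{
	struct toy_state s = {
		.holder = { [PARENT] = OPEN_OP, [CHILD] = MIGRATE },
		.wants  = { [MIGRATE] = PARENT, [OPEN_OP] = CHILD },
	};

	return s;
}
```

In this model, toy_deadlocked() reports a cycle from either actor, and
the cycle disappears as soon as the child is no longer held.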

The lease lock, however, is not supposed to block anything; it is
just an indicator on the server that no conflicting operation has
occurred during the migration. Thus, set LDLM_FL_CANCEL_ON_BLOCK on
it so that the conflicting operation is not blocked.
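
The intended effect of a cancel-on-block flag can be sketched with a toy
lock. The flag TOY_FL_CANCEL_ON_BLOCK, its value, and the struct and
function names are all assumptions made for illustration; this is not
the real LDLM flag definition or the code path changed by this patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the real flag; the value is illustrative. */
#define TOY_FL_CANCEL_ON_BLOCK (UINT64_C(1) << 0)

struct toy_lock {
	uint64_t flags;
	bool granted;
};

/*
 * Called when a conflicting request meets 'held'. Returns true if the
 * requester has to wait, false if it can proceed. A lock marked
 * cancel-on-block is dropped at once instead of making others wait.
 */
static bool toy_conflict_blocks(struct toy_lock *held)
{
	if (!held->granted)
		return false;
	if (held->flags & TOY_FL_CANCEL_ON_BLOCK) {
		held->granted = false; /* the holder loses the lock */
		return false;
	}
	return true;
}
```

With the flag set, the conflicting request never waits; the cost is that
the holder must later notice that its lock is gone.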

In this case, MDS_REINT returns -EAGAIN because the lease has been
cancelled, and the client retries its migration.
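
A retry on -EAGAIN might look roughly like the sketch below. The
callback type, the function names, and the max_tries bound are
assumptions made for illustration; the description does not say whether
the real client bounds its retries, and this is not the llite code.

```c
#include <errno.h>

/* Hypothetical: one migration attempt, returning 0 or -errno. */
typedef int (*toy_migrate_fn)(void *ctx);

/*
 * Repeat the attempt while it fails with -EAGAIN, up to max_tries
 * times. Any other result (success or a different error) is returned
 * immediately.
 */
static int toy_migrate_with_retry(toy_migrate_fn attempt, void *ctx,
				  int max_tries)
{
	int rc = -EAGAIN;

	for (int i = 0; i < max_tries && rc == -EAGAIN; i++)
		rc = attempt(ctx);
	return rc;
}

/* Stub attempt: fails with -EAGAIN until the counter reaches zero. */
static int toy_flaky_attempt(void *ctx)
{
	int *remaining_failures = ctx;

	if (*remaining_failures > 0) {
		(*remaining_failures)--;
		return -EAGAIN;
	}
	return 0;
}
```

The stub stands in for a migration whose lease is cancelled a few times
by racing opens before an attempt finally goes through.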

Change-Id: Ib6cdc24ffe4ecb99d314a5466bcbb066a1d04dc1
Cray-bug-id: LUS-6811
Signed-off-by: Andriy Skulysh <c17819@cray.com>
Reviewed-by: Vitaly Fertman <c17818@cray.com>
Reviewed-by: Alexander Boyko <c17825@cray.com>
Reviewed-on: https://review.whamcloud.com/34182
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alexandr Boyko <c17825@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
lustre/include/obd_support.h
lustre/ldlm/ldlm_lockd.c
lustre/ldlm/ldlm_request.c
lustre/llite/file.c
lustre/mdt/mdt_handler.c
lustre/mdt/mdt_open.c
lustre/tests/sanity.sh