git://git.whamcloud.com - fs/lustre-release.git/commit

author	Alexander Boyko <alexander.boyko@hpe.com>
	Sat, 15 Jun 2024 10:04:34 +0000 (06:04 -0400)
committer	Oleg Drokin <green@whamcloud.com>
	Thu, 8 Aug 2024 00:16:36 +0000 (00:16 +0000)
commit	27f787daa7f25f1f14f8e041582ef969f87cd77a
tree	f0ba0e829af5716c4b39f5cf870dc717f582cf63
parent	768c1d65836d5f91c4ac51fbc73ff69ce48a1048

LU-15737 ofd: don't block destroys

ofd_destroy_by_fid could sleep indefinitely on a GROUP lock
conflict. If all of an MDT's osp_sync_inflight is spent on such
destroys, the MDT cannot send any further destroys or setattrs.
The result is OST free space leakage.

This fix makes ldlm_cli_enqueue_local non-blocking for group locks,
and makes the MDT repeat the sync requests that failed with errors.
The patch also adds a debugfs file for checking hung osp jobs:
lctl get_param osp.lustre-OST0000-osc-MDT0000.error_list
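
The two halves of the fix (a lock attempt that fails fast instead of
sleeping, and a sender that repeats failed requests) can be modelled
with a small toy sketch. It is not the Lustre code: every toy_* name,
the fixed-size list, and the use of -EAGAIN as the error code are
assumptions made for illustration only.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_JOBS 8

/* Hypothetical stand-in for an OST object. group_locked models a
 * conflicting GROUP lock held by some client. */
struct toy_object {
	bool group_locked;
	bool destroyed;
};

/* Non-blocking destroy: on a lock conflict, return an error at once
 * instead of sleeping until the lock is released.
 * Assumption: -EAGAIN is only a placeholder error code. */
static int toy_destroy(struct toy_object *obj)
{
	if (obj->group_locked)
		return -EAGAIN;
	obj->destroyed = true;
	return 0;
}

/* Hypothetical sender-side list of jobs that failed and must be
 * repeated later. */
struct toy_error_list {
	struct toy_object *jobs[TOY_MAX_JOBS];
	size_t count;
};

/* Send one destroy; on failure, remember the job for a later retry
 * so the in-flight slot is freed rather than held forever. */
static int toy_send_destroy(struct toy_error_list *el,
			    struct toy_object *obj)
{
	int rc = toy_destroy(obj);

	if (rc != 0) {
		assert(el->count < TOY_MAX_JOBS);
		el->jobs[el->count++] = obj;
	}
	return rc;
}

/* Repeat every remembered job and keep only those that still fail.
 * Returns the number of jobs still pending. */
static size_t toy_retry_errors(struct toy_error_list *el)
{
	size_t kept = 0;

	for (size_t i = 0; i < el->count; i++) {
		if (toy_destroy(el->jobs[i]) != 0)
			el->jobs[kept++] = el->jobs[i];
	}
	el->count = kept;
	return kept;
}
```

In this model a destroy against a group-locked object returns
immediately and lands on the error list, so other destroys can still
be sent. The job completes on a later toy_retry_errors() call once
the conflicting lock is gone.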

Add recovery-small test 160. It reproduces the situation in which
the MDT sends object destroys and they hang on the OST side
because of a conflicting GROUP lock.

Lustre: ll_ost02_068: service thread pid 51278 was inactive for
204.776 seconds. The thread might be hung...
Call Trace TBD:
ldlm_completion_ast+0x7ac/0x900 [ptlrpc]
ldlm_cli_enqueue_local+0x307/0x880 [ptlrpc]
ofd_destroy_by_fid+0x235/0x4a0 [ofd]
ofd_destroy_hdl+0x263/0xa10 [ofd]
tgt_request_handle+0xcc9/0x1a20 [ptlrpc]
ptlrpc_server_handle_request+0x23f/0xc60 [ptlrpc]
ptlrpc_main+0xc8b/0x15d0 [ptlrpc]

HPE-bug-id: LUS-12350
Signed-off-by: Alexander Boyko <alexander.boyko@hpe.com>
Change-Id: I396bf48d3d29f058f65095cbb4dbba11581534cc
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/55598
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andriy Skulysh <andriy.skulysh@hpe.com>
Reviewed-by: Vitaly Fertman <vitaly.fertman@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

lustre/ldlm/ldlm_extent.c
lustre/ofd/ofd_obd.c
lustre/osp/lproc_osp.c
lustre/osp/osp_internal.h
lustre/osp/osp_sync.c
lustre/tests/recovery-small.sh