LU-16032 osd: move unlink of large objects to separate thread
Final unlink and freeing of blocks for large objects can leave a
service thread hung with this call stack:
Net: Service thread pid 1739 was inactive for 200.16s.
The thread might be hung, or it might only be slow and will
resume later.
Dumping the stack trace for debugging purposes:
__wait_on_buffer+0x2a/0x30
ldiskfs_wait_block_bitmap+0xe0/0xf0 [ldiskfs]
ldiskfs_read_block_bitmap+0x31/0x60 [ldiskfs]
ldiskfs_free_blocks+0x329/0xbb0 [ldiskfs]
ldiskfs_ext_remove_space+0x8a9/0x1150 [ldiskfs]
ldiskfs_ext_truncate+0xb0/0xe0 [ldiskfs]
ldiskfs_truncate+0x3b7/0x3f0 [ldiskfs]
ldiskfs_evict_inode+0x58a/0x630 [ldiskfs]
evict+0xb4/0x180
iput+0xfc/0x190
osd_object_delete+0x1f8/0x370 [osd_ldiskfs]
lu_object_free.isra.30+0x68/0x170 [obdclass]
lu_object_put+0xc5/0x3e0 [obdclass]
ofd_destroy_by_fid+0x20e/0x500 [ofd]
ofd_destroy_hdl+0x267/0x9f0 [ofd]
tgt_request_handle+0xaee/0x15f0 [ptlrpc]
ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
ptlrpc_main+0xb34/0x1470 [ptlrpc]
kthread+0xd1/0xe0
Move the final unlink to a workqueue if the inode size is > 1GB. The
size threshold can be configured by setting the minimum async
truncate size with the "osd-ldiskfs.*.delay_unlink_mb" parameter.
Writes to the "osd-ldiskfs.*.force_sync" parameter will flush pending
delayed unlinks so that space can be reclaimed as needed.
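
The size check described above can be sketched roughly as follows. This is a minimal user-space illustration, not the code from this patch: the helper name unlink_should_be_delayed() and the macro DEFAULT_DELAY_UNLINK_MB are hypothetical, and the strict "greater than" comparison is an assumption taken from the "> 1GB" wording.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical default: 1GB expressed in MB, matching the "> 1GB" rule. */
#define DEFAULT_DELAY_UNLINK_MB 1024ULL

/*
 * Hypothetical helper: decide whether the final unlink of an object
 * should be handed off to a workqueue instead of running in the
 * service thread. delay_unlink_mb mirrors the value of the
 * "osd-ldiskfs.*.delay_unlink_mb" parameter (in megabytes).
 */
static bool unlink_should_be_delayed(uint64_t inode_size,
				     uint64_t delay_unlink_mb)
{
	/* Convert the threshold from megabytes to bytes. */
	uint64_t threshold_bytes = delay_unlink_mb << 20;

	/* Assumption: only objects strictly larger than the threshold
	 * are deferred; smaller ones are unlinked synchronously. */
	return inode_size > threshold_bytes;
}
```

With the assumed default, an object of exactly 1GB would still be unlinked in the service thread, while anything larger would be deferred.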
Change-Id: Id535ae4c58732769effabee42835bc2da8cb5cc1
Signed-off-by: Artem Blagodarenko <ablagodarenko@whamcloud.com>
DDN-bug-id: DDN-3144
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/47995
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>