LU-10859 ldiskfs: fix deadlock under heavy memory pressure
At one customer site, we hit the following deadlock:
Thread 1:
ofd_object_punch
osd_punch
ldiskfs_truncate
ldiskfs_inode_attach_jinode
...
do_try_to_free_pages
lu_cache_shrink
mutex_lock --> tries to take @lu_sites_guard
Thread 2 (kswapd):
kthread
shrink_slab
lu_cache_shrink
mutex_lock --> already holds @lu_sites_guard
...
dqget
ldiskfs_acquire_dquot
jbd2__journal_start --> blocked waiting for more credits
Thread 3:
kthread
kjournald2
jbd2_journal_commit_transaction --> blocked waiting for Thread 1 to finish,
since Thread 1 added a handle to the transaction.
So the deadlock happens because Thread 1 waits for Thread 2, Thread 2 waits
for Thread 3, and Thread 3 waits for Thread 1.
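The circular wait described above can be modeled as a small wait-for graph.
The following is only an illustrative user-space sketch, not kernel code; the
enum names, the waits_for table, and has_wait_cycle() are hypothetical names
introduced here.

```c
#include <stdbool.h>

/* Hypothetical labels for the three threads in the traces above. */
enum { T1_TRUNCATE, T2_KSWAPD, T3_KJOURNALD, NTHREADS };

/*
 * waits_for[i] is the thread that thread i is blocked on, or -1 if it
 * is not blocked. This encodes: T1 -> T2 (lu_sites_guard),
 * T2 -> T3 (journal credits), T3 -> T1 (open handle).
 */
static const int waits_for[NTHREADS] = {
	[T1_TRUNCATE]  = T2_KSWAPD,
	[T2_KSWAPD]    = T3_KJOURNALD,
	[T3_KJOURNALD] = T1_TRUNCATE,
};

/* Follow the wait chain from 'start'; true if it leads back to 'start'. */
static bool has_wait_cycle(const int *edges, int n, int start)
{
	int cur = start;

	for (int step = 0; step < n; step++) {
		cur = edges[cur];
		if (cur < 0)
			return false;	/* chain ends: no deadlock */
		if (cur == start)
			return true;	/* came back around: deadlock */
	}
	return false;
}
```

Removing any single edge breaks the cycle. In this model the edge to remove
is the first one, Thread 1 waiting on @lu_sites_guard.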
This problem still exists even though we have switched @lu_sites_guard
to a read/write lock, since we take the write lock in lu_cache_shrink().
Fix the problem by making ldiskfs_inode_attach_jinode() use
GFP_NOFS, so that the allocation does not recurse into filesystem reclaim.
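As a rough illustration of why GFP_NOFS helps, the user-space sketch below
mocks the relevant behaviour. It assumes that the shrinker declines to run
when the allocation mask lacks the filesystem bit. All MOCK_* flags and
mock_* functions are hypothetical stand-ins, not the real kernel or Lustre
definitions.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel allocation flag bits. */
#define MOCK_GFP_IO	0x1u
#define MOCK_GFP_FS	0x2u
#define MOCK_GFP_KERNEL	(MOCK_GFP_IO | MOCK_GFP_FS)
#define MOCK_GFP_NOFS	(MOCK_GFP_IO)

/* Models kswapd (Thread 2) already holding @lu_sites_guard. */
static bool mock_guard_held_by_other = true;

/*
 * Mock shrinker: returns true if the caller would block on the guard.
 * Assumption: without the FS bit the shrinker returns early and never
 * touches the lock.
 */
static bool mock_cache_shrink_blocks(unsigned int gfp_mask)
{
	if (!(gfp_mask & MOCK_GFP_FS))
		return false;		/* reclaim must not re-enter the fs */
	return mock_guard_held_by_other;
}

/* Mock allocation under memory pressure: direct reclaim calls the shrinker. */
static bool mock_alloc_blocks(unsigned int gfp_mask)
{
	return mock_cache_shrink_blocks(gfp_mask);
}
```

In this model, mock_alloc_blocks(MOCK_GFP_KERNEL) reports that the caller
would block, while mock_alloc_blocks(MOCK_GFP_NOFS) does not. This mirrors
Thread 1 no longer waiting on @lu_sites_guard.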
Test-Parameters: testgroup=review-ldiskfs \
mdtfilesystemtype=ldiskfs ostfilesystemtype=ldiskfs
Change-Id: I0ab143fc0cdb8e1b0c490c2c25e8af483c491a81
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Signed-off-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-on: https://review.whamcloud.com/31806
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>