LU-10499 pcc: avoid deadlock for auto attach in PCC-RO
This patch releases the pcc inode lock before calling
ll_layout_refresh() in @pcc_try_auto_attach(), because holding it
across that call may cause the following deadlock:
1. The client is writing or truncating a file in readonly mode.
At this time, it sends a write layout intent lock request to
clear the readonly state of the layout on the MDT.
2. A read process tries to auto attach the file while holding the
pcc inode lock. During auto attach, it calls
ll_layout_refresh(). The client-side enqueue request for a
layout lock returns a blocked lock, so the process sleeps and
waits for the lock to be granted.
3. The MDT takes an EX layout lock to cancel all cached layout
locks on the client so that it can change the layout and clear
the PCC-RO state.
4. When the client handles the revocation of the layout lock, it
needs to invalidate the PCC state, which must be done under the
protection of the pcc inode lock. That lock is still held by the
sleeping reader from step 2, so neither side can make progress.
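The lock ordering problem above can be illustrated with a minimal
userspace sketch. It is not the Lustre code: the demo_* names are
hypothetical, a pthread mutex stands in for the pcc inode lock, and
the revocation handler uses trylock so that the "would deadlock" case
shows up as -EDEADLK instead of hanging.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-inode PCC state. */
struct demo_inode {
	pthread_mutex_t pcc_lock;	/* plays the role of the pcc inode lock */
	bool		attached;	/* PCC copy is currently usable */
};

/*
 * Layout lock revocation: must invalidate the PCC state under pcc_lock.
 * trylock is used only so the demo can report the deadlock instead of
 * hanging; a real handler would block here.
 */
static int demo_revoke_layout(struct demo_inode *di)
{
	if (pthread_mutex_trylock(&di->pcc_lock) != 0)
		return -EDEADLK;
	di->attached = false;
	pthread_mutex_unlock(&di->pcc_lock);
	return 0;
}

/* Layout refresh: the grant only arrives after the revocation is handled. */
static int demo_layout_refresh(struct demo_inode *di)
{
	return demo_revoke_layout(di);
}

/* Auto attach, with or without dropping the lock around the refresh. */
static int demo_try_auto_attach(struct demo_inode *di, bool drop_lock)
{
	int rc;

	pthread_mutex_lock(&di->pcc_lock);
	if (drop_lock)
		pthread_mutex_unlock(&di->pcc_lock);

	rc = demo_layout_refresh(di);

	if (drop_lock)
		pthread_mutex_lock(&di->pcc_lock);
	if (rc == 0)
		di->attached = true;	/* state is re-established under the lock */
	pthread_mutex_unlock(&di->pcc_lock);
	return rc;
}
```

With drop_lock set to false the refresh fails with -EDEADLK, which is
the stand-in for the hang described in steps 2 to 4. With drop_lock
set to true the revocation can take the lock and the attach completes.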
EX-3191 pcc: add test for mmap | write | detach racer
This patch adds an mmap racer test, sanity-pcc/test_99, which races
write | read | mmap_cat | detach | unlink against each other.
Was-Change-Id: I5db160851a95937275fea6ae32f40dcd0fe69f46
EX-3478 pcc: avoid uninitialized pcc mutex lock in cleanup
Running racer concurrently crashed in the following way:
RIP: 0010:[...] [...] __list_add+0x1b/0xc0
__mutex_lock_slowpath+0xa6/0x1d0
mutex_lock+0x1f/0x2f
pcc_inode_free+0x1e/0x60 [lustre]
ll_clear_inode+0x64/0x6a0 [lustre]
ll_delete_inode+0x5d/0x220 [lustre]
evict+0xb4/0x180
iput+0xfc/0x190
ll_iget+0x156/0x350 [lustre]
ll_prep_inode+0x212/0x9b0 [lustre]
After analysis, we found that the mutex @lli_pcc_lock was not
initialized, because ll_lli_init() had not been called to
initialize @lli.
When pcc_inode_free() is called, it calls mutex_lock() on the
uninitialized @lli_pcc_lock and thus crashes the kernel.
Also set errno on error return in liblustreapi_pcc.c.
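One general way to rule out this class of crash is to initialize the
mutex in the same routine that creates the object, so that every
cleanup path sees a valid lock. The userspace sketch below shows that
idea only; the demo_* names and the lock_ready flag are hypothetical,
and it does not claim to be how the patch itself fixes the problem.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for the per-inode info that embeds the PCC lock. */
struct demo_inode_info {
	pthread_mutex_t pcc_lock;	/* plays the role of @lli_pcc_lock */
	bool		lock_ready;	/* debugging aid: lock was initialized */
	void	       *pcc_state;	/* whatever the lock protects */
};

/* Allocate and fully initialize, so no caller can skip the mutex setup. */
static struct demo_inode_info *demo_alloc_inode(void)
{
	struct demo_inode_info *lli = calloc(1, sizeof(*lli));

	if (lli == NULL)
		return NULL;
	if (pthread_mutex_init(&lli->pcc_lock, NULL) != 0) {
		free(lli);
		return NULL;
	}
	lli->lock_ready = true;
	return lli;
}

/* Cleanup path: safe only because the lock is known to be initialized. */
static void demo_pcc_inode_free(struct demo_inode_info *lli)
{
	assert(lli->lock_ready);
	pthread_mutex_lock(&lli->pcc_lock);
	free(lli->pcc_state);
	lli->pcc_state = NULL;
	pthread_mutex_unlock(&lli->pcc_lock);
}

/* Final teardown of the object itself. */
static void demo_destroy_inode(struct demo_inode_info *lli)
{
	demo_pcc_inode_free(lli);
	pthread_mutex_destroy(&lli->pcc_lock);
	free(lli);
}
```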
Was-Change-Id: I612c79a5b8eb4fa9daeb9e446a457e95c666c04a
EX-3636 pcc: reset file mapping for a file once mmapped
For a file that was once mmapped and cached on PCC, a new open sets
the mapping of the file handle of the PCC copy (@file->f_mapping) to
the one of the Lustre file handle. When the file is detached from
PCC due to a manual detach or layout lock shrinking, normal I/O
(read/write) will auto-attach the file into PCC again, as the layout
version is unchanged. However, the file mapping
(@pcc_file->f_mapping) still needs to be reset to the mapping of
the PCC copy. Otherwise it causes a panic as follows:
[ 935.516823] RIP: 0010:_raw_read_lock+0xa/0x20
[ 935.517077] ll_cl_find+0x19/0x60 [lustre]
[ 935.517098] ll_readpage+0x51/0x820 [lustre]
[ 935.517110] read_pages+0x122/0x190
[ 935.517119] __do_page_cache_readahead+0x1c1/0x1e0
[ 935.517131] ondemand_readahead+0x1f9/0x2c0
[ 935.517142] pagecache_get_page+0x30/0x2c0
[ 935.517165] generic_file_buffered_read+0x556/0xa00
[ 935.517189] pcc_try_auto_attach+0x3ac/0x400 [lustre]
[ 935.517552] pcc_io_init+0x146/0x560 [lustre]
[ 935.517906] pcc_file_read_iter+0x24d/0x2b0 [lustre]
[ 935.518259] ll_file_read_iter+0x74/0x2e0 [lustre]
[ 935.518604] new_sync_read+0x121/0x170
[ 935.518937] vfs_read+0x8a/0x140
This patch adds sanity-pcc test_98 to verify it.
I/O on a file handle that was opened before the attach into PCC, or
opened while the file was in ATTACHING state, falls back to Lustre
OSTs. A later mmap() on the file must also fall back to Lustre OSTs
and cannot read directly from the valid local PCC copy until all
fallback file handles are closed, because the mapping of the PCC
copy is replaced with the one of the Lustre file when a file is
mmapped.
Add sanity-pcc test_97 to verify it.
We also forbid auto attach of a file that is still under mmapped
I/O.
EX-3636 pcc: auto attach should skip if already attached
When trying to auto attach a file into PCC, if the file is found to
be already attached into PCC, the auto attach processing should be
skipped. Otherwise, the PCC inode refcount becomes wrong when
multiple threads try to auto attach the file at the same time.
For a file once mmapped into PCC and then detached due to layout
lock shrinking or a manual detach command, if @pcc_mmap_io_init()
finds that the file is still validly cached (attached into PCC again
by another thread), it should set the mapping of the PCC copy to
the one of the Lustre file again.
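The skip rule for auto attach can be sketched as a check and update
done under one lock. This is a simplified userspace model: the demo_*
names are hypothetical, and the single refcount only stands in for
the real PCC inode reference counting.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for the PCC inode. */
struct demo_pcc_inode {
	pthread_mutex_t lock;		/* plays the role of the pcc inode lock */
	bool		attached;	/* file is currently attached into PCC */
	int		refcount;	/* reference taken by the attach */
};

/*
 * Auto attach: take the attach reference only if nobody attached first.
 * Returns true if this caller performed the attach, false if it skipped.
 */
static bool demo_auto_attach(struct demo_pcc_inode *pi)
{
	bool did_attach = false;

	pthread_mutex_lock(&pi->lock);
	if (!pi->attached) {
		pi->attached = true;
		pi->refcount++;
		did_attach = true;
	}
	/* else: already attached by another thread, leave refcount alone */
	pthread_mutex_unlock(&pi->lock);
	return did_attach;
}
```

Because the check and the increment happen under the same lock, a
second caller that loses the race sees attached already set and leaves
the reference count at one.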
Was-Change-Id: I5f049ca7d6db8708712e79e9ad459fc60b80f2be
LU-17964 pcc: set mapneg bit in all cases of normal I/O fallback
While data is being copied from Lustre OSTs to the PCC copy, the
file is in PCC ATTACHING state. New opens and I/O on this file
fall back to the normal I/O path (Lustre OSTs) until the attach is
finished, and the file handle is marked with the fallback and mapneg
bits. Currently we only clear the fallback and mapneg bits when the
file handle is closed.
To support mmap() I/O, we replace the mapping of the PCC copy with
the one of the Lustre file. However, we can do that only if the
Lustre file has no open file handle with the mapneg bit set.
Otherwise, we cannot switch the mapping, and the mmap() I/O also
falls back to Lustre OSTs and uses the mapping of the Lustre
file.
Once an mmap()ed file is detached from the PCC backend due to a
manual detach command or the revocation of the LAYOUT ibit lock
(which protects the validity of the PCC cache), we should reset the
mapping of the PCC file accordingly and set the fallback and mapneg
bits if the I/O is falling back to the normal path
(Lustre OSTs).
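The mapneg rule for mmap() can be sketched as a scan over the open
file handles. This is an illustration under assumptions: the demo_*
names and the flat array of handles are hypothetical and do not
reflect how Lustre tracks its file handles.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-open-handle flags. */
struct demo_file_handle {
	bool fallback;	/* I/O on this handle goes to Lustre OSTs */
	bool mapneg;	/* handle forbids switching the PCC copy's mapping */
};

/* The mapping may be switched only if no open handle has mapneg set. */
static bool demo_can_switch_mapping(const struct demo_file_handle *fhs,
				    size_t count)
{
	for (size_t i = 0; i < count; i++) {
		if (fhs[i].mapneg)
			return false;
	}
	return true;
}

/*
 * mmap() setup for handle @self. Returns true if the mapping is switched
 * and the PCC copy serves the I/O, false if the I/O falls back to Lustre
 * OSTs. On fallback the handle is marked with both bits.
 */
static bool demo_mmap_io_init(struct demo_file_handle *self,
			      const struct demo_file_handle *open_fhs,
			      size_t count)
{
	if (demo_can_switch_mapping(open_fhs, count))
		return true;

	self->fallback = true;
	self->mapneg = true;
	return false;
}
```

In this model a single handle opened during ATTACHING is enough to
force every later mmap() onto the fallback path until that handle is
closed.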
Was-Change-Id: Ibd152aaf724dcff48efbe022dc7f3e70848b4e0d
EX-bug-id: EX-3080 EX-3191 EX-3478 EX-3480 EX-3636
Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: I18890d19d03726a5991c923505e8c5363382fdc2
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54390
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>