LU-12752 mdt: commitrw_write() - check dying object under lock
If a process writes to an unlinked file, the following race between
mdt_commitrw_write() and mdd_close() may occur, because
mdt_commitrw_write() checks whether the object is dying without
holding the object lock:
1. mdt_commitrw_write() checks lu_object_is_dying(&mo->mot_header) and
   the object is not yet dying
2. mdd_close() interposes and destroys the object via
     mdo_destroy()
       lod_destroy()
         lod_sub_destroy()
           osd_destroy()
             obj->oo_destroyed = 1;
3. mdt_commitrw_write() continues, locks the object and returns -ENOENT
   from
     dt_attr_get()
       osd_attr_get()
         if (unlikely(obj->oo_destroyed))
                 return -ENOENT;
If the file consists of a DoM component and a raid component,
ll_delete_inode() calls cl_sync_file_range(), which is supposed to
iterate over both the mdt and raid components via mdc_io_fsync_start()
and osc_io_fsync_start(). Because mdc_io_fsync_start() fails with
-ENOENT due to the failed write RPC, osc_io_fsync_start() is never
called. truncate_inode_pages_final() then finds pages that were not
discarded and fails with:
(osc_page.c:183:osc_page_delete()) Trying to teardown failed: -16
(osc_page.c:184:osc_page_delete()) ASSERTION( 0 ) failed:
(osc_page.c:184:osc_page_delete()) LBUG
A test illustrating the issue is added.
The fix is to call lu_object_is_dying() under the object lock.
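
The ordering change can be sketched with a toy model. This is not the
Lustre code: struct toy_obj, the toy_*() helpers and the "racer" hook
are hypothetical names used only for illustration. The sketch assumes
that the destroy path takes the same lock as the write path, and that a
write to a dying object is skipped with rc = 0; neither detail is
stated above. The hook runs at the point where mdd_close() would
interpose, so the outcome is deterministic without real threads.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an MDT object, not the real structure. */
struct toy_obj {
	pthread_mutex_t lock;
	bool dying;	/* what lu_object_is_dying() would report */
	bool destroyed;	/* plays the role of obj->oo_destroyed */
};

#define TOY_OBJ_INIT { PTHREAD_MUTEX_INITIALIZER, false, false }

/* Close path: marks the object dying and destroyed (assumed: under lock). */
static void toy_close_destroy(struct toy_obj *o)
{
	pthread_mutex_lock(&o->lock);
	o->dying = true;
	o->destroyed = true;
	pthread_mutex_unlock(&o->lock);
}

/* Mirrors the osd_attr_get() behaviour quoted above. */
static int toy_attr_get(const struct toy_obj *o)
{
	return o->destroyed ? -ENOENT : 0;
}

/* Old ordering: the dying check happens before the lock is taken. */
static int toy_write_unlocked_check(struct toy_obj *o,
				    void (*racer)(struct toy_obj *))
{
	int rc;

	assert(o != NULL);
	if (o->dying)
		return 0;	/* assumed: skip the write silently */
	if (racer != NULL)
		racer(o);	/* close interposes inside the window */
	pthread_mutex_lock(&o->lock);
	rc = toy_attr_get(o);	/* sees destroyed == true */
	pthread_mutex_unlock(&o->lock);
	return rc;
}

/* New ordering: the dying check happens after the lock is taken. */
static int toy_write_locked_check(struct toy_obj *o,
				  void (*racer)(struct toy_obj *))
{
	int rc = 0;

	assert(o != NULL);
	if (racer != NULL)
		racer(o);	/* close can only run before the lock */
	pthread_mutex_lock(&o->lock);
	if (!o->dying)
		rc = toy_attr_get(o);
	pthread_mutex_unlock(&o->lock);
	return rc;
}
```

With toy_close_destroy() passed as the racer, the unlocked-check
variant returns -ENOENT while the locked-check variant returns 0,
because the destroy can no longer land between the check and the
attribute read.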
Change-Id: I463c8a6f85d4f5fd934b167c6194f50ae9d4b7d4
HPE-bug-id: LUS-7189
Signed-off-by: Vladimir Saveliev <c17830@cray.com>
Reviewed-on: https://review.whamcloud.com/41797
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>