LU-19046 mgc: mgc_fs_setup() should wait interruptibly
When a target mounts, it fetches a copy of its config log from the
MGS to store in the local filesystem. However, the MGC can currently
only fetch the config log for one target filesystem at a time.
This should be improved in a separate patch.
If the MGS is inaccessible, or there is a problem during setup, the
mounting target will wait for the MGS while holding cl_mgc_mutex. Other
targets on the same server will then be unable to mount, and will block
on cl_mgc_mutex, possibly dumping a stack trace like:
    INFO: task mount.lustre:93138 blocked for more than 90 seconds.
    "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" to disable this
    task:mount.lustre state:D stack:0 pid:93138 ppid:93135
    Call Trace:
     __schedule+0x2d1/0x870
     schedule+0x55/0xf0
     schedule_preempt_disabled+0xa/0x10
     __mutex_lock.isra.11+0x349/0x420
     mgc_fs_setup.isra.12+0x65/0x7a0 [mgc]
     mgc_set_info_async+0x99f/0xb30 [mgc]
     server_start_targets+0x452/0x2c30 [obdclass]
     server_fill_super+0x94e/0x10a0 [obdclass]
     lustre_fill_super+0x388/0x3d0 [lustre]
     mount_nodev+0x49/0xa0
     legacy_get_tree+0x27/0x50
     vfs_get_tree+0x25/0xc0
     do_mount+0x2e9/0x950
     ksys_mount+0xbe/0xe0
Use wait_event_interruptible() in mgc_fs_setup() so the server's mount
thread can be interrupted and killed. This does not fix the underlying
reason the mount is blocked, but it does allow the stuck mount thread
to be killed instead of hanging indefinitely.
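As a rough illustration only (a userspace model, not the code in this
patch), the difference between an uninterruptible and an interruptible
wait might be sketched as below. The struct, its fields, and both
function names are hypothetical, a bounded polling loop replaces real
sleeping, and -EINTR stands in for the -ERESTARTSYS that the kernel's
wait_event_interruptible() returns when a signal arrives.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical userspace model, not Lustre code: "slot_busy" stands in
 * for the MGC being set up for another target, and "signal_pending"
 * for a signal having been delivered to the waiting mount thread.
 */
struct mount_waiter {
	bool slot_busy;
	bool signal_pending;
};

/* Uninterruptible flavour: only a free slot ends the wait. */
static int wait_uninterruptible(const struct mount_waiter *w, int max_polls)
{
	assert(w != NULL);
	for (int i = 0; i < max_polls; i++) {
		if (!w->slot_busy)
			return 0;
		/* A pending signal is ignored, like a task in state D. */
	}
	/* Model only: a real uninterruptible wait would keep sleeping. */
	return -ETIMEDOUT;
}

/* Interruptible flavour: a pending signal also ends the wait. */
static int wait_interruptible(const struct mount_waiter *w, int max_polls)
{
	assert(w != NULL);
	for (int i = 0; i < max_polls; i++) {
		if (!w->slot_busy)
			return 0;
		if (w->signal_pending)
			return -EINTR; /* kernel macro: -ERESTARTSYS */
	}
	return -ETIMEDOUT;
}
```

In this model a waiter with slot_busy and signal_pending both set never
leaves wait_uninterruptible() early, while wait_interruptible() returns
an error that the caller can propagate to abort the mount.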
Rename mgc_fs_cleanup() to mgc_fs_clear() so it is not confused with
actually cleaning up the MGC.
Avoid printing an error if the sptlrpc log is not available. This is
common for most filesystems, and is not an error.
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I0bafa5dae0eadecb112efaf61f8bcf7ea8c4c296
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59396
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Mikhail Pershin <mpershin@whamcloud.com>
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>