LU-19046 mgc: mgc_fs_setup() should wait interruptibly 96/59396/2
author    Andreas Dilger <adilger@whamcloud.com>
          Fri, 23 May 2025 03:32:26 +0000 (21:32 -0600)
committer Oleg Drokin <green@whamcloud.com>
          Sat, 7 Jun 2025 23:05:13 +0000 (23:05 +0000)
commit    cde3df1cfe121eba8796dddc37d6501b0bcd89aa
tree      4e894c0c6aaf3e8b90e9dce8d0a97b39fc9a4c1e
parent    3ce73a088b155121f454804108ffa808982615a7

When a target mounts, it fetches a copy of its config log from the
MGS to store in the local filesystem. However, the MGC can currently
only fetch the config log for one target filesystem at a time.
This should be improved in a separate patch.

If the MGS is inaccessible, or there is a problem during setup, the
server will wait for it while holding cl_mgc_mutex.  Other targets on
the same server will be unable to mount, and block on cl_mgc_mutex,
possibly dumping a stack trace like:

    INFO: task mount.lustre:93138 blocked for more than 90 seconds.
    "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" to disable this
    task:mount.lustre    state:D stack:0     pid:93138 ppid:93135
    Call Trace:
    __schedule+0x2d1/0x870
    schedule+0x55/0xf0
    schedule_preempt_disabled+0xa/0x10
    __mutex_lock.isra.11+0x349/0x420
    mgc_fs_setup.isra.12+0x65/0x7a0 [mgc]
    mgc_set_info_async+0x99f/0xb30 [mgc]
    server_start_targets+0x452/0x2c30 [obdclass]
    server_fill_super+0x94e/0x10a0 [obdclass]
    lustre_fill_super+0x388/0x3d0 [lustre]
    mount_nodev+0x49/0xa0
    legacy_get_tree+0x27/0x50
    vfs_get_tree+0x25/0xc0
    do_mount+0x2e9/0x950
    ksys_mount+0xbe/0xe0

Use wait_event_interruptible() in mgc_fs_setup() so the server's mount
thread can be interrupted and killed.  This does not fix the underlying
cause of the hang, but it does allow the blocked mount to be killed.
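
The effect of an interruptible wait can be sketched in userspace C.
This is only an analogy and not the code from this patch: the kernel's
wait_event_interruptible() sleeps on a wait queue and returns
-ERESTARTSYS when a signal arrives, whereas the sketch polls a flag.
All names here (struct fake_mgc, fake_fs_setup_wait(), fake_tick(),
FAKE_ERESTARTSYS) and the poll limit are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel-internal ERESTARTSYS value. */
#define FAKE_ERESTARTSYS 512

/* Hypothetical model of an MGC whose local-filesystem slot may be busy. */
struct fake_mgc {
	bool fs_in_use;        /* another target is still using the slot */
	bool signal_pending;   /* models a signal sent to mount.lustre */
	int  polls_until_free; /* slot frees after this many polls; <= 0: never */
};

/* Advance simulated time by one poll and release the slot when due. */
static void fake_tick(struct fake_mgc *m)
{
	if (m->polls_until_free > 0 && --m->polls_until_free == 0)
		m->fs_in_use = false;
}

/*
 * Wait for the slot like an interruptible wait would: check the
 * condition first, then give up if a signal is pending.  The poll
 * limit exists only so this sketch always terminates.
 */
int fake_fs_setup_wait(struct fake_mgc *m, int max_polls)
{
	assert(m != NULL);

	for (int i = 0; i < max_polls; i++) {
		if (!m->fs_in_use) {
			m->fs_in_use = true;	/* claim the slot */
			return 0;
		}
		if (m->signal_pending)
			return -FAKE_ERESTARTSYS;
		fake_tick(m);
	}
	return -ETIMEDOUT;
}
```

With an uninterruptible wait the second check is absent, so a mount
blocked behind an unreachable MGS cannot be killed until the slot is
released.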

Rename mgc_fs_cleanup() to mgc_fs_clear() so it is not confused with
actually cleaning up the MGC.

Avoid printing an error if the sptlrpc log is not available.  This is
common for most filesystems, and is not an error.
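
A minimal sketch of this kind of filtering, in userspace C.  It
assumes that a missing log is reported as -ENOENT, which this message
does not state.  The helper name report_sptlrpc_log_rc() and the
message text are hypothetical and are not taken from the Lustre
sources.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/*
 * Hypothetical helper: decide whether a return code from processing
 * the sptlrpc log deserves an error message.  Returns true if a
 * message was printed.
 */
bool report_sptlrpc_log_rc(const char *fsname, int rc)
{
	/* Assumption: an absent log shows up as -ENOENT and is normal. */
	if (rc == 0 || rc == -ENOENT)
		return false;

	fprintf(stderr, "%s: sptlrpc log processing failed: rc = %d\n",
		fsname, rc);
	return true;
}
```

Any other failure, such as -EIO, is still reported.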

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I0bafa5dae0eadecb112efaf61f8bcf7ea8c4c296
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/59396
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Mikhail Pershin <mpershin@whamcloud.com>
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
lustre/mgc/mgc_request_server.c
lustre/target/tgt_mount.c