LU-17809 osp: make disconnect asynchronous
author Alexander Boyko <alexander.boyko@hpe.com>
Sat, 20 Apr 2024 22:02:54 +0000 (18:02 -0400)
committer Andreas Dilger <adilger@whamcloud.com>
Wed, 3 Jul 2024 04:36:20 +0000 (04:36 +0000)
commit 1a3a9c573e8582f68ccf5c2f9e985b8616160dfe
tree 8a2acd81387f78f8e985eb1be0d4cc2270f1453d
parent 6ab9f6917fa9fdccbc9ef3c369c05102c781ed3e

An MDT can have many OSP devices. During umount, their disconnect
requests can time out one after another (cascading timeouts), which
can make the umount time unpredictably long.

This patch adds the ability to disconnect OSP devices in parallel.
During LCFG_PRECLEANUP, osp_disconnect() sends the disconnect requests,
and osp_shutdown() then waits for them. The cascading timeouts are
thereby reduced to a single request wait.
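
The effect on umount time can be illustrated with a small model. This
is only a sketch, not Lustre code: the function names and the
per-device wait array are hypothetical, and it assumes each disconnect
request takes a fixed time to complete or time out.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical model, not Lustre code: wait[i] is the time in seconds
 * that the disconnect request of OSP device i takes to complete or
 * time out.
 */

/* Serial teardown: each disconnect is sent and waited for in turn,
 * so the individual waits add up. */
unsigned serial_umount_wait(const unsigned *wait, size_t n)
{
	unsigned total = 0;

	assert(wait != NULL || n == 0);
	for (size_t i = 0; i < n; i++)
		total += wait[i];
	return total;
}

/* Two-phase teardown: all disconnects are sent first and waited for
 * afterwards. The requests overlap, so the total wait is that of the
 * slowest request. */
unsigned parallel_umount_wait(const unsigned *wait, size_t n)
{
	unsigned longest = 0;

	assert(wait != NULL || n == 0);
	for (size_t i = 0; i < n; i++)
		if (wait[i] > longest)
			longest = wait[i];
	return longest;
}
```

In this model, three devices that each wait 10 seconds cost 30 seconds
when handled serially, but 10 seconds when the requests overlap.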

Don't drop the obd_force flag from upper layers.

Add replay-single test 201, which simulates delays of OSP disconnects.
Handled one after another, such delays add up to a high cumulative
umount time.

Lustre-change: https://review.whamcloud.com/54995
Lustre-commit: ffedcbae21f7aefe5c2258a94b36fe286f46182c

HPE-bug-id: LUS-12251
Signed-off-by: Alexander Boyko <alexander.boyko@hpe.com>
Change-Id: Id788b22c494147bdc7f0d36968629e7b7f660e01
Reviewed-by: Alexey Lyashkov <alexey.lyashkov@hpe.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/55498
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
lustre/include/lustre_net.h
lustre/osp/osp_dev.c
lustre/osp/osp_internal.h
lustre/ptlrpc/import.c
lustre/tests/replay-single.sh