LU-11243 lod: fix assertion and hang upon lod_add_device failure
There are two problems:
The first is the following assertion failure:
lod_add_device() lustre-OSTe42a-osc-MDT0000:
can't set up pool, failed with -12
osp_disconnect() ASSERTION( imp != ((void *)0) ) failed:
osp_disconnect() LBUG
CPU: 1 PID: 10059 Comm: llog_process_th
The problem is that obd_disconnect() cleans up @imp and sets it to NULL:
->osp_obd_disconnect
->class_manual_cleanup
->class_process_config
->class_cleanup
->obd_precleanup
->osp_device_fini
->client_obd_cleanup
Then ldo_process_config() tries to access @imp again:
->ldo_process_config
->osp_shutdown
->osp_disconnect
->LASSERT(imp != NULL)
The second problem is that if we fail before obd_connect(), the mount
hangs:
->ldo_process_config
->osp_shutdown
->osp_disconnect
->ptlrpc_disconnect_import
->rc = l_wait_event(imp->imp_recovery_waitq,
!ptlrpc_import_in_recovery(imp), &lwi);
Since connect is never called, the import state stays LUSTRE_IMP_NEW
and the wait never completes.
Fix this by properly checking whether we are in recovery: only consider
the import to be in recovery if it is in one of the following states:
LUSTRE_IMP_CONNECTING = 4,
LUSTRE_IMP_REPLAY = 5,
LUSTRE_IMP_REPLAY_LOCKS = 6,
LUSTRE_IMP_REPLAY_WAIT = 7,
LUSTRE_IMP_RECOVER = 8,
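The check described above can be sketched as a simple range test on the
import state. This is a minimal standalone illustration, not the patched
function itself: the sketch_* names are hypothetical, the numeric values of
SKETCH_IMP_NEW and SKETCH_IMP_FULL are assumptions, and the real code
presumably also takes the import lock before reading the state.

```c
#include <stdbool.h>

/*
 * The five recovery states and their values are copied from the list
 * above. SKETCH_IMP_NEW and SKETCH_IMP_FULL are stand-ins for states
 * outside that range; their numeric values are assumptions.
 */
enum sketch_imp_state {
	SKETCH_IMP_NEW          = 2,	/* assumed value */
	SKETCH_IMP_CONNECTING   = 4,
	SKETCH_IMP_REPLAY       = 5,
	SKETCH_IMP_REPLAY_LOCKS = 6,
	SKETCH_IMP_REPLAY_WAIT  = 7,
	SKETCH_IMP_RECOVER      = 8,
	SKETCH_IMP_FULL         = 9,	/* assumed value */
};

/* Hypothetical helper: true only for the five states listed above. */
static bool sketch_import_in_recovery(enum sketch_imp_state state)
{
	return state >= SKETCH_IMP_CONNECTING &&
	       state <= SKETCH_IMP_RECOVER;
}
```

With a check of this shape, an import that never left LUSTRE_IMP_NEW is not
reported as being in recovery, so a waiter blocked on
!ptlrpc_import_in_recovery(imp) can proceed instead of hanging.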
Lustre-change: https://review.whamcloud.com/32994
Lustre-commit: f28353b3d810cfbec018a263556ceac84ab9413e
Change-Id: I2113b95a421bae7117f3057d5f0fdf78db95caa3
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Gu Zheng <gzheng@ddn.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/34450
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>