LU-12946 kernel: fix to handle BLK_MQ_RQ_QUEUE_DEV_BUSY event
author Wang Shilong <wshilong@ddn.com>
Thu, 7 Nov 2019 02:18:15 +0000 (10:18 +0800)
committer Oleg Drokin <green@whamcloud.com>
Fri, 22 Nov 2019 20:00:39 +0000 (15:00 -0500)
commit 3cb7636a04921c2b6d1ccb1db4848af1ca9d1701
tree 5d032e8df16494e9a475efb69a8147e4f1861930
parent afa8f63200c29313e7e9fb5dd3a89807551b8a9c

It looks like, when dm_dispatch_clone_request dispatches the "clone"
I/O request from the multipath device to the underlying (real) device,
the SCSI driver can return BLK_MQ_RQ_QUEUE_DEV_BUSY, often under load.
dm_dispatch_clone_request does not special-case that return value the
way it does BLK_MQ_RQ_QUEUE_BUSY, so it calls dm_complete_request,
which propagates the BLK_MQ_RQ_QUEUE_DEV_BUSY error code up the stack.
As a result, multipath_end_io sees an error value set, calls fail_path,
and fails the path.
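
The decision described above can be modelled with a small userspace sketch.
This is not the kernel code and not the patch itself: every name below
(sketch_queue_rc, sketch_action, sketch_dispatch, handle_dev_busy) is
hypothetical, and the "fixed" branch assumes the fix treats the device-busy
return the same way as the plain busy return, which this message does not
spell out.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the blk-mq return codes named in the
 * message; the numeric values are illustrative only. */
enum sketch_queue_rc {
	SKETCH_QUEUE_OK,
	SKETCH_QUEUE_BUSY,
	SKETCH_QUEUE_DEV_BUSY,
	SKETCH_QUEUE_ERROR,
};

/* What the (modelled) dispatcher does with the clone request. */
enum sketch_action {
	SKETCH_IN_FLIGHT,	/* lower device accepted the request */
	SKETCH_REQUEUE,		/* retry later, path stays healthy */
	SKETCH_FAIL_PATH,	/* error propagated, path is failed */
};

/*
 * Model of the dispatch decision. With handle_dev_busy == false the
 * device-busy code falls through to the error case, as the message
 * describes. With handle_dev_busy == true it is requeued like the
 * plain busy code (an assumption about the fix, not taken from the
 * patch).
 */
static enum sketch_action sketch_dispatch(enum sketch_queue_rc rc,
					  bool handle_dev_busy)
{
	switch (rc) {
	case SKETCH_QUEUE_OK:
		return SKETCH_IN_FLIGHT;
	case SKETCH_QUEUE_BUSY:
		return SKETCH_REQUEUE;
	case SKETCH_QUEUE_DEV_BUSY:
		if (handle_dev_busy)
			return SKETCH_REQUEUE;
		/* Unhandled: treated like any other error. */
		return SKETCH_FAIL_PATH;
	case SKETCH_QUEUE_ERROR:
	default:
		return SKETCH_FAIL_PATH;
	}
}
```

In this model, sketch_dispatch(SKETCH_QUEUE_DEV_BUSY, false) yields
SKETCH_FAIL_PATH, while the same call with handle_dev_busy set to true
yields SKETCH_REQUEUE, so a transient busy condition no longer takes the
path offline.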

Signed-off-by: Wang Shilong <wshilong@ddn.com>
Change-Id: If17ea5b3ab33a89a17d49e5dfb2e9f9f19371564
Reviewed-on: https://review.whamcloud.com/36699
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Li Dongyang <dongyangli@ddn.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
lustre/kernel_patches/patches/dm-fix-handle-BLK_MQ_RQ_QUEUE_DEV_BUSY-rhel7.6.patch [new file with mode: 0644]
lustre/kernel_patches/series/3.10-rhel7.6.series
lustre/kernel_patches/series/3.10-rhel7.7.series