Whamcloud - gitweb
LU-12946 kernel: fix to handle BLK_MQ_RQ_QUEUE_DEV_BUSY event 68/36868/3
author Wang Shilong <wshilong@ddn.com>
Thu, 7 Nov 2019 02:18:15 +0000 (10:18 +0800)
committer Oleg Drokin <green@whamcloud.com>
Thu, 12 Dec 2019 23:06:25 +0000 (23:06 +0000)
commit 849e1a5cbcd7025a19611277b14c5605c0dffefa
tree 45062f551df38ec98c2fa33a2411fd46340d6321
parent 6341522cc8088a367cd156a6f284823c69b92f7b
LU-12946 kernel: fix to handle BLK_MQ_RQ_QUEUE_DEV_BUSY event

It appears that when dm_dispatch_clone_request dispatches the "clone"
I/O request from the multipath device to the underlying (real) device,
the SCSI driver can return BLK_MQ_RQ_QUEUE_DEV_BUSY, often under load.
dm_dispatch_clone_request treats BLK_MQ_RQ_QUEUE_BUSY as an exception
but not BLK_MQ_RQ_QUEUE_DEV_BUSY, so it calls dm_complete_request,
which propagates the BLK_MQ_RQ_QUEUE_DEV_BUSY code up the stack as an
error. multipath_end_io then sees an error value set, calls fail_path,
and fails the path.
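
The decision described above can be modelled in user space. This is a
minimal sketch, not the kernel code or the patch itself: the enums, their
values, and the helpers dispatch_action and path_gets_failed are
hypothetical stand-ins, and the handle_dev_busy flag assumes the fix treats
the device-busy status the same way as the existing busy status.

```c
#include <stdbool.h>

/* Illustrative stand-ins only; NOT the kernel's definitions or values. */
enum rq_status { RQ_OK, RQ_BUSY, RQ_DEV_BUSY, RQ_ERROR };

/* What the dispatcher does with the clone after the lower driver answers. */
enum clone_action { CLONE_IN_FLIGHT, CLONE_REQUEUE, CLONE_COMPLETE_ERROR };

/*
 * Hypothetical model of the dispatch decision.
 * handle_dev_busy == false: only RQ_BUSY is an exception (old behaviour).
 * handle_dev_busy == true:  RQ_DEV_BUSY is assumed to be requeued as well.
 */
enum clone_action dispatch_action(enum rq_status rc, bool handle_dev_busy)
{
    switch (rc) {
    case RQ_OK:
        return CLONE_IN_FLIGHT;       /* request accepted by the device */
    case RQ_BUSY:
        return CLONE_REQUEUE;         /* transient: retry later */
    case RQ_DEV_BUSY:
        return handle_dev_busy ? CLONE_REQUEUE : CLONE_COMPLETE_ERROR;
    default:
        return CLONE_COMPLETE_ERROR;  /* genuine error */
    }
}

/* In this model, completing with an error is what leads to a failed path. */
bool path_gets_failed(enum clone_action action)
{
    return action == CLONE_COMPLETE_ERROR;
}
```

In this model, path_gets_failed(dispatch_action(RQ_DEV_BUSY, false)) is
true, which mirrors the spurious path failure, while passing true for
handle_dev_busy yields a requeue and leaves the path alone.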

Lustre-change: https://review.whamcloud.com/36699
Lustre-commit: 5c8b1e87a97bbe7b05f0b8325e98c16a0de1ff4c

Signed-off-by: Wang Shilong <wshilong@ddn.com>
Change-Id: If17ea5b3ab33a89a17d49e5dfb2e9f9f19371564
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Li Dongyang <dongyangli@ddn.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/36868
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
lustre/kernel_patches/patches/dm-fix-handle-BLK_MQ_RQ_QUEUE_DEV_BUSY-rhel7.6.patch [new file with mode: 0644]
lustre/kernel_patches/series/3.10-rhel7.6.series
lustre/kernel_patches/series/3.10-rhel7.7.series