Whamcloud - gitweb
LU-12946 kernel: fix to handle BLK_MQ_RQ_QUEUE_DEV_BUSY event 99/36699/3
author Wang Shilong <wshilong@ddn.com>
Thu, 7 Nov 2019 02:18:15 +0000 (10:18 +0800)
committer Oleg Drokin <green@whamcloud.com>
Fri, 22 Nov 2019 19:59:49 +0000 (19:59 +0000)
commit 5c8b1e87a97bbe7b05f0b8325e98c16a0de1ff4c
tree f8763e27986ee6eed7c5e2ce3778d8b75d19d481
parent 32528a689889989607a34b21efa583429bda1422
LU-12946 kernel: fix to handle BLK_MQ_RQ_QUEUE_DEV_BUSY event

It appears that when dm_dispatch_clone_request dispatches the "clone"
I/O request from the multipath device to the underlying (real) device,
the SCSI driver can return BLK_MQ_RQ_QUEUE_DEV_BUSY, often under load.
dm_dispatch_clone_request special-cases BLK_MQ_RQ_QUEUE_BUSY but not
BLK_MQ_RQ_QUEUE_DEV_BUSY, so it calls dm_complete_request, which
propagates the BLK_MQ_RQ_QUEUE_DEV_BUSY code up the stack. Because an
error value is set, multipath_end_io then calls fail_path and fails
the path.
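
The control flow described above can be modelled with a small
standalone sketch. This is not the kernel code and not the patch
itself: the enum names, their values, and the helper functions are
hypothetical stand-ins, and it assumes the fix treats the device-busy
result the same way as the existing busy result (requeue instead of
completing with an error).

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the dispatch results named in the commit
 * message; the numeric values are illustrative only. */
enum dispatch_result {
    RQ_OK,       /* lower device accepted the clone */
    RQ_BUSY,     /* stands in for BLK_MQ_RQ_QUEUE_BUSY */
    RQ_DEV_BUSY, /* stands in for BLK_MQ_RQ_QUEUE_DEV_BUSY */
    RQ_ERROR     /* genuine I/O error */
};

/* What the dispatcher decides to do with the clone (hypothetical). */
enum clone_action {
    ACTION_NONE,               /* dispatched, nothing more to do */
    ACTION_REQUEUE,            /* retry the clone later */
    ACTION_COMPLETE_WITH_ERROR /* complete the request with an error */
};

/* Behaviour described as the bug: only RQ_BUSY is special-cased, so
 * RQ_DEV_BUSY falls through to the error completion. */
static enum clone_action handle_before(enum dispatch_result r)
{
    switch (r) {
    case RQ_OK:
        return ACTION_NONE;
    case RQ_BUSY:
        return ACTION_REQUEUE;
    default:
        return ACTION_COMPLETE_WITH_ERROR;
    }
}

/* Assumed shape of the fix: RQ_DEV_BUSY is handled like RQ_BUSY. */
static enum clone_action handle_after(enum dispatch_result r)
{
    switch (r) {
    case RQ_OK:
        return ACTION_NONE;
    case RQ_BUSY:
    case RQ_DEV_BUSY:
        return ACTION_REQUEUE;
    default:
        return ACTION_COMPLETE_WITH_ERROR;
    }
}

/* Models the end-io step: a completion carrying an error value is what
 * leads to the path being failed. */
static bool path_would_fail(enum clone_action a)
{
    return a == ACTION_COMPLETE_WITH_ERROR;
}
```

In this model, path_would_fail(handle_before(RQ_DEV_BUSY)) is true,
while path_would_fail(handle_after(RQ_DEV_BUSY)) is false. A genuine
error (RQ_ERROR) still fails the path in both versions.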

Signed-off-by: Wang Shilong <wshilong@ddn.com>
Change-Id: If17ea5b3ab33a89a17d49e5dfb2e9f9f19371564
Reviewed-on: https://review.whamcloud.com/36699
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Li Dongyang <dongyangli@ddn.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
lustre/kernel_patches/patches/dm-fix-handle-BLK_MQ_RQ_QUEUE_DEV_BUSY-rhel7.6.patch [new file with mode: 0644]
lustre/kernel_patches/series/3.10-rhel7.6.series
lustre/kernel_patches/series/3.10-rhel7.7.series