EX-2010 scsi: requeue aborted commands instead of retry
author Trung Nguyen <trunguyen@ddn.com>
Mon, 9 Nov 2020 06:46:44 +0000 (23:46 -0700)
committer Andreas Dilger <adilger@whamcloud.com>
Sat, 6 Mar 2021 20:35:41 +0000 (20:35 +0000)
If the underlying SCSI command returns an abort, do not retry it
immediately in a loop, since the whole loop can finish within a few
milliseconds. Instead, requeue it with a delay so that the hardware
has a chance to recover.

Each command requeue takes several seconds, which gives the problem
more chance to be resolved at the SCSI layer instead of returning an
error to the filesystem and causing a server failover.

Test-Parameters: trivial testlist=sanity
Signed-off-by: Trung Nguyen <trunguyen@ddn.com>
Change-Id: Ibdf1b3a52dd0a1b388c7f5f97aa7a51620138845
Tested-by: Shuichi Ihara <sihara@ddn.com>
Reviewed-on: https://review.whamcloud.com/41852
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
lustre/kernel_patches/patches/scsi-requeue-aborted-commands-instead-of-retry.patch [new file with mode: 0644]
lustre/kernel_patches/series/3.10-rhel7.7.series

diff --git a/lustre/kernel_patches/patches/scsi-requeue-aborted-commands-instead-of-retry.patch b/lustre/kernel_patches/patches/scsi-requeue-aborted-commands-instead-of-retry.patch
new file mode 100644
index 0000000..61e7304
--- /dev/null
@@ -0,0 +1,22 @@
+DDN-1501 scsi: requeue aborted commands instead of retry
+
+If the underlying SCSI command returns an abort, rather than retry
+it quickly in a loop, which can finish within a few milliseconds,
+requeue it with delay so that the hardware has a chance to recover.
+
+The command requeue will take several seconds each time and allows
+more chance for the problem to be resolved at the SCSI layer instead
+of returning an error to the filesystem and causing server failover.
+
+Signed-off-by: Trung Nguyen <trunguyen@ddn.com>
+--- ./drivers/scsi/scsi_error.c.orig   2020-02-12 06:45:22.000000000 -0700
++++ ./drivers/scsi/scsi_error.c        2020-11-08 23:11:41.045007688 -0700
+@@ -510,7 +510,7 @@ static int scsi_check_sense(struct scsi_cmnd *scmd)
+               if (sshdr.asc == 0x10) /* DIF */
+                       return SUCCESS;
+-              return NEEDS_RETRY;
++              return ADD_TO_MLQUEUE;
+       case NOT_READY:
+       case UNIT_ATTENTION:
+               /*
index 001175d..994bcba 100644
@@ -3,6 +3,7 @@ blkdev_tunables-3.9.patch
 vfs-project-quotas-rhel7.patch
 fix-integrity-verify-rhel7.patch
 fix-sd-dif-complete-rhel7.patch
+scsi-requeue-aborted-commands-instead-of-retry.patch
 block-integrity-allow-optional-integrity-functions-rhel7.patch
 block-pass-bio-into-integrity_processing_fn-rhel7.patch
 block-Ensure-we-only-enable-integrity-metadata-for-reads-and-writes-rhel7.patch
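
The one-line change to scsi_check_sense() in the patch above can be read as a small decision table for the aborted-command branch. The following user-space sketch models only that branch and is not kernel code: the enum values, the function name aborted_command_disposition(), and the patched flag are hypothetical names introduced for illustration, and the numeric values are not the kernel's.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Illustrative stand-ins for the dispositions named in the patch.
 * Assumption: the numeric values are arbitrary, not the kernel's.
 */
enum disposition {
	DISP_SUCCESS,
	DISP_NEEDS_RETRY,	/* pre-patch: retry right away */
	DISP_ADD_TO_MLQUEUE	/* post-patch: requeue with a delay */
};

/*
 * Model of the aborted-command branch only.
 * asc     - additional sense code reported with the abort
 * patched - hypothetical switch between old and new behaviour
 */
enum disposition aborted_command_disposition(unsigned char asc, bool patched)
{
	if (asc == 0x10)	/* DIF, unchanged by the patch */
		return DISP_SUCCESS;

	return patched ? DISP_ADD_TO_MLQUEUE : DISP_NEEDS_RETRY;
}

/* Small self-check covering the four cases of the table. */
void self_check(void)
{
	assert(aborted_command_disposition(0x10, false) == DISP_SUCCESS);
	assert(aborted_command_disposition(0x10, true) == DISP_SUCCESS);
	assert(aborted_command_disposition(0x00, false) == DISP_NEEDS_RETRY);
	assert(aborted_command_disposition(0x00, true) == DISP_ADD_TO_MLQUEUE);
}
```

Only the non-DIF case changes: an aborted command that used to be retried immediately is instead handed back for requeueing, which is where the delay described in the commit message comes from.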