LU-8376 ost: enhance end to end bulk cksum error report 60/23960/20
author Bruno Faccini <bruno.faccini@intel.com>
Fri, 25 Nov 2016 14:57:20 +0000 (15:57 +0100)
committer Oleg Drokin <oleg.drokin@intel.com>
Tue, 9 May 2017 03:44:21 +0000 (03:44 +0000)
commit 672986cbae63e90262d55bf277643ea046bfa8b2
tree 3ee2f03d388a4c7f968b7221ebb17910dbfc7e49
parent 6dc05f00218798e400433feeda7ad6f271b535d8
LU-8376 ost: enhance end to end bulk cksum error report

Some sites have experienced spurious checksum errors during bulk
xfers, where it is very difficult to determine the source of the
corruption.
With this patch, a full dump of all pages in a bulk xfer can be
taken upon cksum error (enabled via a /proc tunable) on both the
Client and OSS sides, to ease root cause identification.
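
The dump-on-error idea can be illustrated with a minimal user-space
sketch. This is not the code from the patch: every name below
(dump_on_cksum_error, bulk_checksum, dump_pages, verify_bulk, PAGE_SZ)
is hypothetical, the pages are assumed to sit in one contiguous
buffer, and Adler-32 is used only as a simple stand-in checksum.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define PAGE_SZ 4096 /* hypothetical page size for this sketch */

/* Hypothetical switch standing in for the /proc tunable; off by default. */
int dump_on_cksum_error = 0;

/* Adler-32 over npages contiguous pages (illustrative checksum only). */
uint32_t bulk_checksum(const unsigned char *buf, size_t npages)
{
	uint32_t a = 1, b = 0;
	size_t i, len = npages * PAGE_SZ;

	for (i = 0; i < len; i++) {
		a = (a + buf[i]) % 65521;
		b = (b + a) % 65521;
	}
	return (b << 16) | a;
}

/* Write all pages to path; return the page count, or -1 on I/O error. */
int dump_pages(const char *path, const unsigned char *buf, size_t npages)
{
	FILE *f = fopen(path, "wb");
	size_t written;

	if (f == NULL)
		return -1;
	written = fwrite(buf, PAGE_SZ, npages, f);
	if (fclose(f) != 0 || written != npages)
		return -1;
	return (int)npages;
}

/*
 * Compare the computed checksum with the expected one.
 * Returns 0 on match and 1 on mismatch; on mismatch the pages are
 * dumped to dump_path only if dump_on_cksum_error is set.
 */
int verify_bulk(const unsigned char *buf, size_t npages,
		uint32_t expected, const char *dump_path)
{
	uint32_t got;

	assert(buf != NULL && npages > 0);
	got = bulk_checksum(buf, npages);
	if (got == expected)
		return 0;

	fprintf(stderr, "bulk cksum mismatch: got %08x, expected %08x\n",
		(unsigned)got, (unsigned)expected);
	if (dump_on_cksum_error && dump_pages(dump_path, buf, npages) < 0)
		fprintf(stderr, "could not dump pages to %s\n", dump_path);
	return 1;
}
```

In this sketch the sender would compute bulk_checksum() before the
transfer and the receiver would call verify_bulk() on what arrived;
keeping the dump behind a switch avoids writing large files for every
error unless someone is actively chasing a corruption.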

The results of the existing sanity.sh/test_77[b,d,f,g]() sub-tests
can already be used to show the effects of this patch, by injecting
bulk cksum errors/corruption using the
OBD_FAIL_[OSC,OST]_CHECKSUM_[SEND,RECEIVE] fail codes.

sanity.sh/test_77c has been created to specifically test the new
dump-on-cksum-error functionality.

Signed-off-by: Bruno Faccini <bruno.faccini@intel.com>
Change-Id: I0d200bb6d5c41c55a66ac012fd9cbd8d702d2f3a
Reviewed-on: https://review.whamcloud.com/23960
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
libcfs/libcfs/debug.c
lustre/include/lprocfs_status.h
lustre/include/obd.h
lustre/obdclass/lprocfs_status_server.c
lustre/ofd/lproc_ofd.c
lustre/osc/lproc_osc.c
lustre/osc/osc_request.c
lustre/target/tgt_handler.c
lustre/tests/sanity.sh