LU-1026 ldiskfs: make bitmaps corruption not fatal 79/16679/8
author Wang Shilong <wshilong@ddn.com>
Sat, 11 Jul 2015 03:49:55 +0000 (11:49 +0800)
committer Oleg Drokin <oleg.drokin@intel.com>
Fri, 4 Dec 2015 17:58:44 +0000 (17:58 +0000)
commit e727c383db8b2485d9e6137895136699d57ea047
tree 85162f8ad212790f0d37200cdbf7ebd1dbb33a58
parent 343364d0f06c34f403eb009de498455b5ebe14be
LU-1026 ldiskfs: make bitmaps corruption not fatal

We still hit bitmap problems on RHEL6-series kernels. When the
ext4_mb_check_ondisk_bitmap() check fails, the corruption is
treated as fatal and the filesystem becomes read-only again:

ldiskfs_mb_check_ondisk_bitmap: on-disk bitmap for group
294corrupted: 20180 blocks free in bitmap, 20181 - in gd
Aborting journal on device dm-6-8.
LDISKFS-fs (dm-6): Remounting filesystem read-only
ldiskfs_mb_new_blocks: Updating bitmap error: [err -30]
 [pa ffff880d9d6e4d68] [phy 14974976] [logic 8192] [len 3072]
 [free 3072] [error 1] [inode 278678]
ldiskfs_ext_new_extent_cb: Journal has aborted

This might be caused by internal ext4 bugs. This patch does
the following:

1. ext4_read_block_bitmap() already reports why it failed,
so callers do not need to call ext4_error() again.
2. Mark the block group as corrupt and use ext4_warning()
instead of ext4_error().
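
The policy change in the two points above can be illustrated with a
small userspace model. This is only a sketch under stated assumptions,
not the ldiskfs code: the names group_model, fs_model, count_free,
check_fatal, check_tolerant and pick_group are all hypothetical, and the
idea that an allocator then skips groups flagged as corrupt is an
assumption made for illustration rather than something this commit
message states.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical userspace model; none of these names come from the kernel. */
struct group_model {
	uint32_t gd_free;  /* free-block count recorded in the group descriptor */
	bool corrupt;      /* set once the bitmap is found to be inconsistent */
};

struct fs_model {
	bool read_only;    /* models the remount read-only that a fatal error causes */
};

/* Count clear bits (free blocks) among the first nbits of the bitmap. */
static uint32_t count_free(const uint8_t *bitmap, size_t nbits)
{
	uint32_t free_blocks = 0;

	assert(bitmap != NULL);
	for (size_t i = 0; i < nbits; i++)
		if (!(bitmap[i / 8] & (1u << (i % 8))))
			free_blocks++;
	return free_blocks;
}

/* Old behaviour (modelled): a mismatch takes the whole filesystem read-only. */
static int check_fatal(struct fs_model *fs, const struct group_model *g,
		       const uint8_t *bitmap, size_t nbits)
{
	if (count_free(bitmap, nbits) == g->gd_free)
		return 0;
	fs->read_only = true;
	return -1;
}

/* New behaviour (modelled): warn, flag only this group, keep the fs writable. */
static int check_tolerant(struct group_model *g, const uint8_t *bitmap,
			  size_t nbits)
{
	uint32_t in_bitmap = count_free(bitmap, nbits);

	if (in_bitmap == g->gd_free)
		return 0;
	fprintf(stderr, "warning: %u blocks free in bitmap, %u in gd\n",
		(unsigned)in_bitmap, (unsigned)g->gd_free);
	g->corrupt = true;
	return -1;
}

/* Assumed allocator policy: use the first group that is not flagged corrupt. */
static int pick_group(const struct group_model *groups, size_t ngroups)
{
	for (size_t i = 0; i < ngroups; i++)
		if (!groups[i].corrupt && groups[i].gd_free > 0)
			return (int)i;
	return -1;
}
```

In this model a one-block mismatch such as the "20180 blocks free in
bitmap, 20181 - in gd" case from the log would flag a single group
instead of setting read_only for the whole filesystem.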

Some places that can hit bitmap corruption are still not handled.
Leave them as they are for now; if they cause real problems, the
same handling logic can be added there later.

Tested with the following script:

TEST_DEV="/dev/sdb"
TEST_MNT="/mnt/ext4"

mkdir -p "$TEST_MNT"
mkfs.ext4 -F "$TEST_DEV" >/dev/null 2>&1

# Fill the filesystem with a large file, then unmount it.
mount -t ldiskfs "$TEST_DEV" "$TEST_MNT"
dd if=/dev/zero of="$TEST_MNT/largefile" \
	oflag=direct bs=10485760 count=200
umount "$TEST_MNT"
# Zero 10 blocks starting at block 641 to simulate bitmap corruption.
dd if=/dev/zero of="$TEST_DEV" bs=4096 seek=641 \
	count=10 oflag=direct
# Remount and check that the filesystem can still be written to.
mount -t ldiskfs "$TEST_DEV" "$TEST_MNT"
rm -f "$TEST_MNT/largefile"
dd if=/dev/zero of="$TEST_MNT/largefile" oflag=direct \
	bs=10485760 count=200 && echo \
	"FILESYSTEM still usable after bitmaps corrupts happen"
dmesg | tail
umount "$TEST_MNT"
e2fsck -y "$TEST_DEV"

Signed-off-by: Wang Shilong <wshilong@ddn.com>
Change-Id: Iabb6ebf719d80d9ba4f41bee0b237e304212832b
Reviewed-on: http://review.whamcloud.com/16679
Tested-by: Jenkins
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Yang Sheng <yang.sheng@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
ldiskfs/kernel_patches/patches/rhel6.6/ext4-corrupted-inode-block-bitmaps-handling-patches.patch
ldiskfs/kernel_patches/patches/rhel7/ext4-corrupted-inode-block-bitmaps-handling-patches.patch
ldiskfs/kernel_patches/patches/sles11sp2/ext4-corrupted-inode-block-bitmaps-handling-patches.patch
ldiskfs/kernel_patches/series/ldiskfs-2.6-rhel6.6.series