LU-1026 ldiskfs: make bitmaps corruption not fatal
We still hit bitmaps problems for rhel6 series kernel,
corruptions happen because ext4_mb_check_ondisk_bitmap()
check failed and FS become RO again:
ldiskfs_mb_check_ondisk_bitmap: on-disk bitmap for group
294corrupted: 20180 blocks free in bitmap, 20181 - in gd
Aborting journal on device dm-6-8.
LDISKFS-fs (dm-6): Remounting filesystem read-only
ldiskfs_mb_new_blocks: Updating bitmap error: [err -30]
[pa
ffff880d9d6e4d68] [phy
14974976] [logic 8192] [len 3072]
[free 3072] [error 1] [inode 278678]
ldiskfs_ext_new_extent_cb: Journal has aborted
this might be caused by some ext4 internal bugs, this patch
did the following things:
1.Inside ext4_read_block_bitmap() have gaven reasons
why it failed, so caller don't need call ext4_error() again.
2. mark block group corrupt and use ext4_warning() instead
of ext4_error().
There are still some bitmaps corruption places not handling,
let's keep it for now, and if it really hurt, let's add the
same handling codes logic later.
Tested by following scripts:
TEST_DEV="/dev/sdb"
TEST_MNT="/mnt/ext4"
mkdir -p $TEST_MNT
mkfs.ext4 -F $TEST_DEV >&/dev/null
mount -t ldiskfs $TEST_DEV $TEST_MNT
dd if=/dev/zero of=$TEST_MNT/largefile
oflag=direct bs=
10485760 count=200
umount $TEST_MNT
dd if=/dev/zero of=$TEST_DEV bs=4096 seek=641
count=10 oflag=direct
mount -t ldiskfs $TEST_DEV $TEST_MNT
rm -f $TEST_MNT/largefile
dd if=/dev/zero of=$TEST_MNT/largefile oflag=direct
bs=
10485760 count=200 && echo
"FILESYSTEM still usable after bitmaps corrupts happen"
dmesg | tail
umount $TEST_MNT
e2fsck $TEST_DEV -y
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Change-Id: Iabb6ebf719d80d9ba4f41bee0b237e304212832b
Reviewed-on: http://review.whamcloud.com/16679
Tested-by: Jenkins
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Yang Sheng <yang.sheng@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>