LU-824 corrupted ldiskfs after md rebuild (bz24264)
author yangsheng <ys@whamcloud.com>
Fri, 4 Nov 2011 19:49:49 +0000 (03:49 +0800)
committer Oleg Drokin <green@whamcloud.com>
Mon, 12 Dec 2011 18:34:31 +0000 (13:34 -0500)
Pick up a patch from upstream to fix an md bug that may
cause corruption after a rebuild.
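
The failure mode described in the patch header can be illustrated with a toy model. This is a minimal sketch, not kernel code: the one-byte "blocks", the struct, and every function name below are hypothetical, and it assumes a three-device stripe with plain XOR parity in which the device holding d1 is still being rebuilt.

```c
#include <assert.h>
#include <stdint.h>

/* Toy stripe: two data blocks and one XOR parity block, one byte each. */
struct toy_stripe {
    uint8_t d0, d1, p;
};

/* A stripe is consistent when parity equals the XOR of the data blocks. */
static int toy_consistent(const struct toy_stripe *s)
{
    return (uint8_t)(s->d0 ^ s->d1) == s->p;
}

/*
 * Hypothetical reconstruct-write of d0 while d1's device is not in-sync.
 * d1 is computed from the other blocks instead of being read, and only
 * d0 and the new parity are written back to disk.
 */
static void toy_reconstruct_write(struct toy_stripe *disk,
                                  struct toy_stripe *mem, uint8_t new_d0)
{
    mem->d1 = disk->d0 ^ disk->p;   /* computed, never read from disk */
    mem->d0 = new_d0;
    mem->p  = mem->d0 ^ mem->d1;    /* parity over the in-memory blocks */

    disk->d0 = mem->d0;             /* written */
    disk->p  = mem->p;              /* written */
    /* disk->d1 is left stale: the computed block is not written out. */
}

/* Returns 1 when the on-disk stripe is inconsistent after the write. */
static int toy_disk_out_of_sync_after_write(void)
{
    /* d1 should hold 0x22 (0x11 ^ 0x33), but its device is stale. */
    struct toy_stripe disk = { 0x11, 0x00, 0x33 };
    struct toy_stripe mem  = { 0, 0, 0 };

    toy_reconstruct_write(&disk, &mem, 0x55);

    assert(toy_consistent(&mem));   /* the in-memory stripe is in-sync */
    return !toy_consistent(&disk);  /* the on-disk stripe is not */
}
```

In this model the in-memory stripe passes the parity check while the on-disk stripe does not, so flagging the stripe as in-sync at that point would let the rebuild skip a block that still needs to be written.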

Change-Id: I802ff3b3d5e86b9d9e77e57d1d98004c17e800a6
Signed-off-by: Yang Sheng <ys@whamcloud.com>
Reviewed-on: http://review.whamcloud.com/1650
Reviewed-by: Jinshan Xiong <jinshan.xiong@whamcloud.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
lustre/kernel_patches/patches/raid5-rebuild-corrupt-bug.patch [new file with mode: 0644]
lustre/kernel_patches/series/2.6-rhel5.series

diff --git a/lustre/kernel_patches/patches/raid5-rebuild-corrupt-bug.patch b/lustre/kernel_patches/patches/raid5-rebuild-corrupt-bug.patch
new file mode 100644
index 0000000..c434498
--- /dev/null
+++ b/lustre/kernel_patches/patches/raid5-rebuild-corrupt-bug.patch
@@ -0,0 +1,26 @@
+While the stripe in-memory must be in-sync, the stripe on disk might not be
+because if we computed a block rather than reading it from an in-sync disk,
+the in-memory stripe can be different from the on-disk stripe.
+
+If this bug were still in mainline I would probably want a bigger patch which
+would leave this code but also set R5_LOCKED on all blocks that have been
+computed.  But as it is a stabilisation patch, the above is simple and more
+clearly correct.
+
+Thanks for your patience - I look forward to your success/failure report.
+
+NeilBrown
+
+diff -up /drivers/md/raid5.c
+===========================================
+--- a/drivers/md/raid5.c
++++ b/drivers/md/raid5.c
+@@ -2466,8 +2466,6 @@
+                                       locked++;
+                                       set_bit(R5_Wantwrite, &sh->dev[i].flags);
+                               }
+-                      /* after a RECONSTRUCT_WRITE, the stripe MUST be in-sync */
+-                      set_bit(STRIPE_INSYNC, &sh->state);
+
+                       if (test_and_clear_bit(STRIPE_PREREAD_ACTIVE, &sh->state)) {
+                               atomic_dec(&conf->preread_active_stripes);
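diff --git a/lustre/kernel_patches/series/2.6-rhel5.series b/lustre/kernel_patches/series/2.6-rhel5.series
index f1b1346..97402e9 100644
--- a/lustre/kernel_patches/series/2.6-rhel5.series
+++ b/lustre/kernel_patches/series/2.6-rhel5.series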
@@ -15,6 +15,7 @@ raid5-stripe-by-stripe-handling-rhel5.patch
 raid5-merge-ios-rhel5.patch
 raid5-zerocopy-rhel5.patch
 raid5-maxsectors-rhel5.patch
+raid5-rebuild-corrupt-bug.patch
 md-rebuild-policy.patch
 jbd-journal-chksum-2.6.18-vanilla.patch
 quota-large-limits-rhel5.patch