From patchwork Mon Jul 29 15:34:27 2013 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Luis Henriques X-Patchwork-Id: 262859 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from huckleberry.canonical.com (huckleberry.canonical.com [91.189.94.19]) by ozlabs.org (Postfix) with ESMTP id 81C202C00FE for ; Tue, 30 Jul 2013 01:44:55 +1000 (EST) Received: from localhost ([127.0.0.1] helo=huckleberry.canonical.com) by huckleberry.canonical.com with esmtp (Exim 4.76) (envelope-from ) id 1V3pcu-0005mr-Ol; Mon, 29 Jul 2013 15:44:48 +0000 Received: from youngberry.canonical.com ([91.189.89.112]) by huckleberry.canonical.com with esmtp (Exim 4.76) (envelope-from ) id 1V3pSy-0004Ar-5q for kernel-team@lists.ubuntu.com; Mon, 29 Jul 2013 15:34:32 +0000 Received: from bl20-146-220.dsl.telepac.pt ([2.81.146.220] helo=localhost) by youngberry.canonical.com with esmtpsa (TLS1.0:DHE_RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1V3pSu-0007kN-Hn; Mon, 29 Jul 2013 15:34:28 +0000 From: Luis Henriques To: NeilBrown Subject: [ 3.5.y.z extended stable ] Patch "md/raid5: fix interaction of 'replace' and 'recovery'." has been added to staging queue Date: Mon, 29 Jul 2013 16:34:27 +0100 Message-Id: <1375112067-3485-1-git-send-email-luis.henriques@canonical.com> X-Mailer: git-send-email 1.8.1.2 X-Extended-Stable: 3.5 Cc: kernel-team@lists.ubuntu.com, qindehua <13691222965@163.com> X-BeenThere: kernel-team@lists.ubuntu.com X-Mailman-Version: 2.1.14 Precedence: list List-Id: Kernel team discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , MIME-Version: 1.0 Errors-To: kernel-team-bounces@lists.ubuntu.com Sender: kernel-team-bounces@lists.ubuntu.com This is a note to let you know that I have just added a patch titled md/raid5: fix interaction of 'replace' and 'recovery'. to the linux-3.5.y-queue branch of the 3.5.y.z extended stable tree which can be found at: http://kernel.ubuntu.com/git?p=ubuntu/linux.git;a=shortlog;h=refs/heads/linux-3.5.y-queue If you, or anyone else, feels it should not be added to this tree, please reply to this email. For more information about the 3.5.y.z tree, see https://wiki.ubuntu.com/Kernel/Dev/ExtendedStable Thanks. -Luis ------ From 0563a6842d3643bf1b5cc78e95689442bf1fa244 Mon Sep 17 00:00:00 2001 From: NeilBrown Date: Mon, 22 Jul 2013 12:57:21 +1000 Subject: [PATCH] md/raid5: fix interaction of 'replace' and 'recovery'. commit f94c0b6658c7edea8bc19d13be321e3860a3fa54 upstream. If a device in a RAID4/5/6 is being replaced while another is being recovered, then the writes to the replacement device currently don't happen, resulting in corruption when the replacement completes and the new drive takes over. This is because the replacement writes are only triggered when 's.replacing' is set and not when the similar 's.sync' is set (which is the case during resync and recovery - it means all devices need to be read). So schedule those writes when s.replacing is set as well. In this case we cannot use "STRIPE_INSYNC" to record that the replacement has happened as that is needed for recording that any parity calculation is complete. So introduce STRIPE_REPLACED to record if the replacement has happened. For safety we should also check that STRIPE_COMPUTE_RUN is not set. This has a similar effect to the "s.locked == 0" test. The latter ensure that now IO has been flagged but not started. The former checks if any parity calculation has been flagged by not started. We must wait for both of these to complete before triggering the 'replace'. Add a similar test to the subsequent check for "are we finished yet". This possibly isn't needed (is subsumed in the STRIPE_INSYNC test), but it makes it more obvious that the REPLACE will happen before we think we are finished. Finally if a NeedReplace device is not UPTODATE then that is an error. We really must trigger a warning. This bug was introduced in commit 9a3e1101b827a59ac9036a672f5fa8d5279d0fe2 (md/raid5: detect and handle replacements during recovery.) which introduced replacement for raid5. That was in 3.3-rc3, so any stable kernel since then would benefit from this fix. Reported-by: qindehua <13691222965@163.com> Tested-by: qindehua Signed-off-by: NeilBrown [ luis: backported to 3.5: adjusted context ] Signed-off-by: Luis Henriques --- drivers/md/raid5.c | 15 ++++++++++----- drivers/md/raid5.h | 1 + 2 files changed, 11 insertions(+), 5 deletions(-) -- 1.8.1.2 diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c index bcd096b..ecf09ec 100644 --- a/drivers/md/raid5.c +++ b/drivers/md/raid5.c @@ -3372,6 +3372,7 @@ static void handle_stripe(struct stripe_head *sh) if (test_and_clear_bit(STRIPE_SYNC_REQUESTED, &sh->state)) { set_bit(STRIPE_SYNCING, &sh->state); clear_bit(STRIPE_INSYNC, &sh->state); + clear_bit(STRIPE_REPLACED, &sh->state); } clear_bit(STRIPE_DELAYED, &sh->state); @@ -3511,19 +3512,23 @@ static void handle_stripe(struct stripe_head *sh) handle_parity_checks5(conf, sh, &s, disks); } - if (s.replacing && s.locked == 0 - && !test_bit(STRIPE_INSYNC, &sh->state)) { + if ((s.replacing || s.syncing) && s.locked == 0 + && !test_bit(STRIPE_COMPUTE_RUN, &sh->state) + && !test_bit(STRIPE_REPLACED, &sh->state)) { /* Write out to replacement devices where possible */ for (i = 0; i < conf->raid_disks; i++) - if (test_bit(R5_UPTODATE, &sh->dev[i].flags) && - test_bit(R5_NeedReplace, &sh->dev[i].flags)) { + if (test_bit(R5_NeedReplace, &sh->dev[i].flags)) { + WARN_ON(!test_bit(R5_UPTODATE, &sh->dev[i].flags)); set_bit(R5_WantReplace, &sh->dev[i].flags); set_bit(R5_LOCKED, &sh->dev[i].flags); s.locked++; } - set_bit(STRIPE_INSYNC, &sh->state); + if (s.replacing) + set_bit(STRIPE_INSYNC, &sh->state); + set_bit(STRIPE_REPLACED, &sh->state); } if ((s.syncing || s.replacing) && s.locked == 0 && + !test_bit(STRIPE_COMPUTE_RUN, &sh->state) && test_bit(STRIPE_INSYNC, &sh->state)) { md_done_sync(conf->mddev, STRIPE_SECTORS, 1); clear_bit(STRIPE_SYNCING, &sh->state); diff --git a/drivers/md/raid5.h b/drivers/md/raid5.h index 2164021..43d22fc 100644 --- a/drivers/md/raid5.h +++ b/drivers/md/raid5.h @@ -307,6 +307,7 @@ enum { STRIPE_SYNC_REQUESTED, STRIPE_SYNCING, STRIPE_INSYNC, + STRIPE_REPLACED, STRIPE_PREREAD_ACTIVE, STRIPE_DELAYED, STRIPE_DEGRADED,