From patchwork Thu Feb 19 13:51:44 2015 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Luis Henriques X-Patchwork-Id: 441619 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from huckleberry.canonical.com (huckleberry.canonical.com [91.189.94.19]) by ozlabs.org (Postfix) with ESMTP id 57FEC14008F; Fri, 20 Feb 2015 00:52:47 +1100 (AEDT) Received: from localhost ([127.0.0.1] helo=huckleberry.canonical.com) by huckleberry.canonical.com with esmtp (Exim 4.76) (envelope-from ) id 1YORX0-0001hc-Gl; Thu, 19 Feb 2015 13:52:42 +0000 Received: from youngberry.canonical.com ([91.189.89.112]) by huckleberry.canonical.com with esmtp (Exim 4.76) (envelope-from ) id 1YORWU-0001Rg-Nb for kernel-team@lists.ubuntu.com; Thu, 19 Feb 2015 13:52:10 +0000 Received: from av-217-129-142-138.netvisao.pt ([217.129.142.138] helo=localhost) by youngberry.canonical.com with esmtpsa (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1YORWU-0008D8-Ex; Thu, 19 Feb 2015 13:52:10 +0000 From: Luis Henriques To: linux-kernel@vger.kernel.org, stable@vger.kernel.org, kernel-team@lists.ubuntu.com Subject: [PATCH 3.16.y-ckt 14/58] md/raid5: fix another livelock caused by non-aligned writes. Date: Thu, 19 Feb 2015 13:51:44 +0000 Message-Id: <1424353948-31863-15-git-send-email-luis.henriques@canonical.com> X-Mailer: git-send-email 2.1.4 In-Reply-To: <1424353948-31863-1-git-send-email-luis.henriques@canonical.com> References: <1424353948-31863-1-git-send-email-luis.henriques@canonical.com> X-Extended-Stable: 3.16 Cc: NeilBrown X-BeenThere: kernel-team@lists.ubuntu.com X-Mailman-Version: 2.1.14 Precedence: list List-Id: Kernel team discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , MIME-Version: 1.0 Errors-To: kernel-team-bounces@lists.ubuntu.com Sender: kernel-team-bounces@lists.ubuntu.com 3.16.7-ckt7 -stable review patch. If anyone has any objections, please let me know. ------------------ From: NeilBrown commit b1b02fe97f75b12ab34b2303bfd4e3526d903a58 upstream. If a non-page-aligned write is destined for a device which is missing/faulty, we can deadlock. As the target device is missing, a read-modify-write cycle is not possible. As the write is not for a full-page, a recontruct-write cycle is not possible. This should be handled by logic in fetch_block() which notices there is a non-R5_OVERWRITE write to a missing device, and so loads all blocks. However since commit 67f455486d2ea2, that code requires STRIPE_PREREAD_ACTIVE before it will active, and those circumstances never set STRIPE_PREREAD_ACTIVE. So: in handle_stripe_dirtying, if neither rmw or rcw was possible, set STRIPE_DELAYED, which will cause STRIPE_PREREAD_ACTIVE be set after a suitable delay. Fixes: 67f455486d2ea20b2d94d6adf5b9b783d079e321 Reported-by: Mikulas Patocka Tested-by: Heinz Mauelshagen Signed-off-by: NeilBrown Signed-off-by: Luis Henriques --- drivers/md/raid5.c | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c index 222aa7521877..68a03d7f25ee 100644 --- a/drivers/md/raid5.c +++ b/drivers/md/raid5.c @@ -3204,6 +3204,11 @@ static void handle_stripe_dirtying(struct r5conf *conf, (unsigned long long)sh->sector, rcw, qread, test_bit(STRIPE_DELAYED, &sh->state)); } + + if (rcw > disks && rmw > disks && + !test_bit(STRIPE_PREREAD_ACTIVE, &sh->state)) + set_bit(STRIPE_DELAYED, &sh->state); + /* now if nothing is locked, and if we have enough data, * we can start a write request */