From patchwork Thu Dec 2 04:42:06 2010 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Artem Bityutskiy X-Patchwork-Id: 73932 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from canuck.infradead.org (canuck.infradead.org [134.117.69.58]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client did not present a certificate) by ozlabs.org (Postfix) with ESMTPS id EBF5BB6EE9 for ; Thu, 2 Dec 2010 15:50:06 +1100 (EST) Received: from localhost ([127.0.0.1] helo=canuck.infradead.org) by canuck.infradead.org with esmtp (Exim 4.72 #1 (Red Hat Linux)) id 1PO13S-0004iT-U9; Thu, 02 Dec 2010 04:46:02 +0000 Received: from mail-fx0-f49.google.com ([209.85.161.49]) by canuck.infradead.org with esmtp (Exim 4.72 #1 (Red Hat Linux)) id 1PO13O-0004g9-Vv for linux-mtd@lists.infradead.org; Thu, 02 Dec 2010 04:46:00 +0000 Received: by fxm19 with SMTP id 19so2898682fxm.36 for ; Wed, 01 Dec 2010 20:45:56 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:received:received:subject:from:reply-to:to:cc :in-reply-to:references:content-type:date:message-id:mime-version :x-mailer:content-transfer-encoding; bh=157/uWE4wcyW2BpZX2fkuZi6sdzZxAMEM+S5A4Mlio4=; b=oAvSSceAGRFCLnl+MsJNvNmZLzNcKiOzm2JbfWClrUg6E+/43prEgIEaqwyhFv13qt XhM4rUa1gLDPaMY3n9epAgXzTHt+zcK5VddjWSmUHA8Aup37kSOJvZWuTJ24GBDkGqWo DFnkLwIKYIVumsUCYrMeJNA/R3KqHTUKf/8wY= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=subject:from:reply-to:to:cc:in-reply-to:references:content-type :date:message-id:mime-version:x-mailer:content-transfer-encoding; b=oG4ULnlL4McpqCAm9SVuP4mjEbgKdMJIH2sFE4Vp1tRZcBflTK12zdvzTcqFHjbDYT 0lFoSsJxzBsyLM6paSv9sLVAG6AUDh7ijsqiSvMfLO2ZqB8a88+M/lvu8/9uCG1ZWIpk kTkbXez4WcYsZlXYwymAmkZyCgeJNtOcj4Ubo= Received: by 10.223.81.79 with SMTP id w15mr747910fak.72.1291264930508; Wed, 01 Dec 2010 20:42:10 -0800 (PST) Received: from [192.168.255.4] (a91-152-69-107.elisa-laajakaista.fi [91.152.69.107]) by mx.google.com with ESMTPS id y1sm27914fak.15.2010.12.01.20.42.08 (version=SSLv3 cipher=RC4-MD5); Wed, 01 Dec 2010 20:42:09 -0800 (PST) Subject: Re: UBIFS partition on NOR flash not mountable after power cut test From: Artem Bityutskiy To: Anatolij Gustschin In-Reply-To: <20101201164447.2215bc58@wker> References: <20101129195014.19224240@wker> <20101201130534.5b95ce83@wker> <20101201164447.2215bc58@wker> Date: Thu, 02 Dec 2010 06:42:06 +0200 Message-ID: <1291264926.14534.32.camel@koala> Mime-Version: 1.0 X-Mailer: Evolution 2.32.1 (2.32.1-1.fc14) X-CRM114-Version: 20090807-BlameThorstenAndJenny ( TRE 0.7.6 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20101201_234559_413022_EBB8DE25 X-CRM114-Status: GOOD ( 32.98 ) X-Spam-Score: 1.4 (+) X-Spam-Report: SpamAssassin version 3.3.1 on canuck.infradead.org summary: Content analysis details: (1.4 points) pts rule name description ---- ---------------------- -------------------------------------------------- -0.7 RCVD_IN_DNSWL_LOW RBL: Sender listed at http://www.dnswl.org/, low trust [209.85.161.49 listed in list.dnswl.org] 0.0 FREEMAIL_FROM Sender email is freemail (dedekind1[at]gmail.com) 2.2 FREEMAIL_ENVFROM_END_DIGIT Envelope-from freemail username ends in digit (dedekind1[at]gmail.com) -0.1 DKIM_VALID_AU Message has a valid DKIM or DK signature from author's domain 0.1 DKIM_SIGNED Message has a DKIM or DK signature, not necessarily valid -0.1 DKIM_VALID Message has at least one valid DKIM or DK signature Cc: linux-mtd@lists.infradead.org, Detlev Zundel X-BeenThere: linux-mtd@lists.infradead.org X-Mailman-Version: 2.1.12 Precedence: list Reply-To: dedekind1@gmail.com List-Id: Linux MTD discussion mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: linux-mtd-bounces@lists.infradead.org Errors-To: linux-mtd-bounces+incoming=patchwork.ozlabs.org@lists.infradead.org On Wed, 2010-12-01 at 16:44 +0100, Anatolij Gustschin wrote: > On Wed, 1 Dec 2010 13:05:34 +0100 > Anatolij Gustschin wrote: > ... > > My question is: > > Is it possible that the reset occured while running in > > nor_erase_prepare() just after clearing the VID header magic, > > but before clearing the EC header magic? > > yes, it is possible and it seems this is the reason for > that next issue we've observed. Adding the call to the > machine restart callback in nor_erase_prepare() just before > clearing EC header magic to simulate this hypothetical case > we see: You are right, Anatoli. Looking closer to my own code, I see that I treat PEB as "corrupted and should be preserved" if: 1. EC header is OK. 2. VID header is corrupted. 3. data area is not "all 0xFFs". And in 'nor_erase_prepare()' we first invalidate the VID header, and then invalidate the EC header. So there is a small window where you can end up with all 3 conditions to be true. The solution is to first invalidate the EC header, and only then the VID header. Then in case of the race, we just lose the EC header, but VID header will be all-right, and UBI will handle this - it'll move the data from this PEB to another one, re-create EC header and use average EC count. But if you test this scenario, it will be great!! This patch should help (compile-tested only). From 7ddb9cb80ab54d118e2a055ce7a35649070578b1 Mon Sep 17 00:00:00 2001 From: Artem Bityutskiy Date: Thu, 2 Dec 2010 06:34:01 +0200 Subject: [PATCH] UBI: fix corrupted PEB detection for NOR flash My new shiny code for corrupted PEB detection has NOR specific bug. We tread PEB as corrupted and preserve it, if 1. EC header is OK. 2. VID header is corrupted. 3. data area is not "all 0xFFs" In case of NOR we have 'nor_erase_prepare()' quirk, which invalidates the headers before erasing the PEB. And we invalidate first the VID header, and then the EC header. So if a power cut happens after we have invalidated the VID header, but before we have invalidated the EC header, we end up with a PEB which satisfies the above 3 conditions, and the scanning code will treat it as corrupted, and will print scary warnings, wrongly. This patch fixes the issue by firt invalidating the EC header, then invalidating the VID header. In case of power cut inbetween, we still just lose the EC header, and UBI can deal with this situation gracefully. This patch also adds one irrelevant comment about defining VID header buffer on stack. Thanks to Anatolij Gustschin for tracking this down. Signed-off-by: Artem Bityutskiy Reported-by: Anatolij Gustschin Tested-by: Anatolij Gustschin --- drivers/mtd/ubi/io.c | 37 ++++++++++++++++++++++++++++--------- drivers/mtd/ubi/scan.c | 4 ++++ 2 files changed, 32 insertions(+), 9 deletions(-) diff --git a/drivers/mtd/ubi/io.c b/drivers/mtd/ubi/io.c index c2960ac..3839253 100644 --- a/drivers/mtd/ubi/io.c +++ b/drivers/mtd/ubi/io.c @@ -480,12 +480,26 @@ static int nor_erase_prepare(struct ubi_device *ubi, int pnum) size_t written; loff_t addr; uint32_t data = 0; + /* + * Note, we cannot generally define VID header buffers on stack, + * because of the way we deal with these buffers (see the heder + * comment). But we know this is NOR-specific piece of code, so we can + * do this. But yes, this is error-prone and we should (pre-)allocate + * VID header buffer instead. + */ struct ubi_vid_hdr vid_hdr; - addr = (loff_t)pnum * ubi->peb_size + ubi->vid_hdr_aloffset; + /* + * Note, it is important to first invalidate the EC header, and then + * VID header. Otherwise a power cut may result in valid EC header and + * invalid VID header, then UBI will treat this PEB as corrupted and + * will try to preserve it, print scary warnings. See scan.c header + * comment for more information. + */ + addr = (loff_t)pnum * ubi->peb_size; err = ubi->mtd->write(ubi->mtd, addr, 4, &written, (void *)&data); if (!err) { - addr -= ubi->vid_hdr_aloffset; + addr += ubi->vid_hdr_aloffset; err = ubi->mtd->write(ubi->mtd, addr, 4, &written, (void *)&data); if (!err) @@ -499,13 +513,18 @@ static int nor_erase_prepare(struct ubi_device *ubi, int pnum) * In this case we probably anyway have garbage in this PEB. */ err1 = ubi_io_read_vid_hdr(ubi, pnum, &vid_hdr, 0); - if (err1 == UBI_IO_BAD_HDR_EBADMSG || err1 == UBI_IO_BAD_HDR) - /* - * The VID header is corrupted, so we can safely erase this - * PEB and not afraid that it will be treated as a valid PEB in - * case of an unclean reboot. - */ - return 0; + if (err1 == UBI_IO_BAD_HDR_EBADMSG || err1 == UBI_IO_BAD_HDR) { + struct ubi_ec_hdr ec_hdr; + + err1 = ubi_io_read_ec_hdr(ubi, pnum, &ec_hdr, 0); + if (err1 == UBI_IO_BAD_HDR_EBADMSG || err1 == UBI_IO_BAD_HDR) + /* + * The VID and EC headers are corrupted, so we can + * safely erase this PEB and not afraid that it will be + * treated as a valid PEB in case of an unclean reboot. + */ + return 0; + } /* * The PEB contains a valid VID header, but we cannot invalidate it. diff --git a/drivers/mtd/ubi/scan.c b/drivers/mtd/ubi/scan.c index 3c63186..d4b1115 100644 --- a/drivers/mtd/ubi/scan.c +++ b/drivers/mtd/ubi/scan.c @@ -951,6 +951,10 @@ static int process_eb(struct ubi_device *ubi, struct ubi_scan_info *si, * impossible to distinguish it from a PEB which just * contains garbage because of a power cut during erase * operation. So we just schedule this PEB for erasure. + * + * Besides, in case of NOR flash, we deliberatly + * corrupt both headers because NOR flash erasure is + * slow and can start from the end. */ err = 0; else