From patchwork Mon Dec 2 10:33:06 2013
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Ezequiel Garcia
X-Patchwork-Id: 295863
Return-Path:
X-Original-To: incoming@patchwork.ozlabs.org
Delivered-To: patchwork-incoming@bilbo.ozlabs.org
Received: from casper.infradead.org (unknown [IPv6:2001:770:15f::2])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (Client did not present a certificate)
 by ozlabs.org (Postfix) with ESMTPS id 68C632C009A
 for ; Mon, 2 Dec 2013 21:34:56 +1100 (EST)
Received: from merlin.infradead.org ([2001:4978:20e::2])
 by casper.infradead.org with esmtps (Exim 4.80.1 #2 (Red Hat Linux))
 id 1VnQpc-0002Dm-3j; Mon, 02 Dec 2013 10:34:24 +0000
Received: from localhost ([::1] helo=merlin.infradead.org)
 by merlin.infradead.org with esmtp (Exim 4.80.1 #2 (Red Hat Linux))
 id 1VnQpR-0000oJ-SH; Mon, 02 Dec 2013 10:34:13 +0000
Received: from top.free-electrons.com ([176.31.233.9] helo=mail.free-electrons.com)
 by merlin.infradead.org with esmtp (Exim 4.80.1 #2 (Red Hat Linux))
 id 1VnQpH-0000m9-Th; Mon, 02 Dec 2013 10:34:06 +0000
Received: by mail.free-electrons.com (Postfix, from userid 106)
 id B53EC8A3; Mon, 2 Dec 2013 11:33:37 +0100 (CET)
X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on mail.free-electrons.com
X-Spam-Level:
X-Spam-Status: No, score=-1.0 required=5.0 tests=ALL_TRUSTED,SHORTCIRCUIT,
 URIBL_BLOCKED shortcircuit=ham autolearn=disabled version=3.3.2
Received: from localhost (unknown [190.2.98.212])
 by mail.free-electrons.com (Postfix) with ESMTPSA id 98D8089F;
 Mon, 2 Dec 2013 11:33:08 +0100 (CET)
Date: Mon, 2 Dec 2013 07:33:06 -0300
From: Ezequiel Garcia
To: Arnaud Ebalard
Subject: Re: [PATCH v5 00/14] Armada 370/XP NAND support
Message-ID: <20131202103305.GB2466@localhost>
References: <1384464339-6817-1-git-send-email-ezequiel.garcia@free-electrons.com>
 <87d2lp28pd.fsf@natisbad.org>
 <20131125120335.GD2408@localhost>
 <87r4a4f5gr.fsf@natisbad.org>
 <20131126124003.GA2344@localhost>
 <87zjopd240.fsf@natisbad.org>
 <87wqjtbm8r.fsf@natisbad.org>
 <20131128185040.GA13182@localhost>
 <87bo12kcyt.fsf@natisbad.org>
MIME-Version: 1.0
Content-Disposition: inline
In-Reply-To: <87bo12kcyt.fsf@natisbad.org>
User-Agent: Mutt/1.5.21 (2010-09-15)
X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3
X-CRM114-CacheID: sfid-20131202_053404_168230_25131DF9
X-CRM114-Status: GOOD ( 25.02 )
X-Spam-Score: -1.2 (-)
X-Spam-Report: SpamAssassin version 3.3.2 on merlin.infradead.org summary:
 Content analysis details: (-1.2 points)
 pts  rule name              description
 ---- ---------------------- --------------------------------------------------
  0.7 SPF_SOFTFAIL           SPF: sender does not match SPF record (softfail)
 -0.0 RP_MATCHES_RCVD        Envelope sender domain matches handover relay domain
 -1.9 BAYES_00               BODY: Bayes spam probability is 0 to 1%
                             [score: 0.0000]
Cc: Lior Amsalem, Thomas Petazzoni, Jason Cooper,
 linux-mtd@lists.infradead.org, Gregory Clement, Brian Norris,
 linux-arm-kernel@lists.infradead.org
X-BeenThere: linux-mtd@lists.infradead.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: Linux MTD discussion mailing list
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Sender: "linux-mtd"
Errors-To: linux-mtd-bounces+incoming=patchwork.ozlabs.org@lists.infradead.org

Hi Arnaud,

First of all: thanks for such great testing!

On Sat, Nov 30, 2013 at 12:25:14AM +0100, Arnaud Ebalard wrote:
> Ezequiel Garcia writes:
>
> >> Ezequiel, I am back in business to test a v2 ;-)
> >
> > Well, I'm not sure yet what's going on. Do you have any spare NAND partition
> > to run some destructive testings?
> >
> > In that case, please run:
> >
> > $ nandtest /dev/mtd{X}
>
> Here is the nandtest run:
>
> nandtest /dev/mtd4
> ECC corrections: 0
> ECC failures : 0
> Bad blocks : 8
> BBT blocks : 0
> Bad block at 0x06700000
> Bad block at 0x06720000
> Bad block at 0x06740000
> Bad block at 0x06760000
> Bad block at 0x06780000
> Bad block at 0x067a0000
> Bad block at 0x067c0000
> Bad block at 0x067e0000
>
> Finished pass 1 successfully
>

Hm, so something *is* working properly. Notice that mtd4 has its last 8 blocks
marked 'bad' because they hold the bad block table.

If you don't mind running a few more rounds, then it would be nice to do:

$ nandtest --passes {N} /dev/mtd{X}

That way the test runs a few more times, just to be sure.

>
> Then, doing a simple read and copying back the data:
>
> root@mood:~# dd if=/dev/mtd4 of=/tmp/toto
> 212992+0 records in
> 212992+0 records out
> 109051904 bytes (109 MB) copied, 10.8671 s, 10.0 MB/s
>
> root@mood:~# flash_erase /dev/mtd4 0 0
>
> root@mood:~# nandwrite -p /dev/mtd4 /tmp/toto
> ...
> Writing data to block 795 at offset 0x6360000
> Writing data to block 796 at offset 0x6380000
> Writing data to block 797 at offset 0x63a0000
> Writing data to block 798 at offset 0x63c0000
> Writing data to block 799 at offset 0x63e0000
> Writing data to block 800 at offset 0x6400000
> [ 1509.210395] pxa3xx-nand d00d0000.nand: Ready time out!!!
> libmtd: error!: cannot write 2048 bytes to mtd4 (eraseblock 800, offset 0)
> error 5 (Input/output error)
[..]
> [ 1513.810387] pxa3xx-nand d00d0000.nand: Ready time out!!!
> libmtd: error!: cannot write 2048 bytes to mtd4 (eraseblock 823, offset 0)
> error 5 (Input/output error)
> Erasing failed write from 0x66e0000 to 0x66fffff

Hm.. so you get errors when writing to mtd4 blocks from 800 to 823. Is that
completely reproducible? IOW, do you always get the error on those blocks?

And what happens if you write directly to those blocks only?
(you can use nandwrite --start)

> Writing data to block 824 at offset 0x6700000
> Bad block at 6700000, 1 block(s) from 6700000 will be skipped
> Writing data to block 825 at offset 0x6720000
> Bad block at 6720000, 1 block(s) from 6720000 will be skipped
> Writing data to block 826 at offset 0x6740000
> Bad block at 6740000, 1 block(s) from 6740000 will be skipped
> Writing data to block 827 at offset 0x6760000
> Bad block at 6760000, 1 block(s) from 6760000 will be skipped
> Writing data to block 828 at offset 0x6780000
> Bad block at 6780000, 1 block(s) from 6780000 will be skipped
> Writing data to block 829 at offset 0x67a0000
> Bad block at 67a0000, 1 block(s) from 67a0000 will be skipped
> Writing data to block 830 at offset 0x67c0000
> Bad block at 67c0000, 1 block(s) from 67c0000 will be skipped
> Writing data to block 831 at offset 0x67e0000
> Bad block at 67e0000, 1 block(s) from 67e0000 will be skipped
> Writing data to block 832 at offset 0x6800000
> libmtd: error!: bad eraseblock number 832, mtd4 has 832 eraseblocks
> nandwrite: error!: /dev/mtd4: MTD get bad block failed
> error 22 (Invalid argument)
> nandwrite: error!: Data was only partially written due to error
> error 22 (Invalid argument)
>

These 8 blocks (824-831) that have been skipped are the ones marked as 'bad'
because they hold the bad block table.

> This is the kind of errors I got last time but I think am starting to
> understand the root cause now. Tell me if I get it right: what is
> understood as bad blocks above (and in nandtest) is in fact the two bad
> block tables reported during boot:
>
> NAND device: Manufacturer ID: 0xad, Chip ID: 0xf1 (Hynix H27U1G8F2BTR-BC)
> NAND device: 128MiB, SLC, page size: 2048, OOB size: 64
> Bad block table found at page 65472, version 0x01
> Bad block table found at page 65408, version 0x01
>

Yes and no :-)

The bad block table consists of 8 blocks at the end of the flash device.
These blocks are marked as 'reserved', and nandtest or any other userspace
writing/erasing tool will skip them. Hence, the bad block table is what
explains the skipping of the group of blocks [824..831].

However, you're getting errors when writing data to [800..823], and it's
a "Ready time out" condition. I'm not sure exactly what's going on, but we
can say that:

* Either the waiting time is not enough, or ...

* The commands (maybe some race) were badly issued, so there's nothing
  to wait for at all.

To check the former, you only need to hack the driver with the patch below
(it raises CHIP_DELAY_TIMEOUT from 2 * HZ/10 to 10 * HZ/10), and then retry
the nandwrite. It's just a test.

To check the latter, I cannot think of anything but adding printk all over
the place and inspecting the command sequence.

Brian: Do you have any better idea?

Thanks again for these tests and for your patience!

diff --git a/drivers/mtd/nand/pxa3xx_nand.c b/drivers/mtd/nand/pxa3xx_nand.c
index 3d143fe..9bb7d35 100644
--- a/drivers/mtd/nand/pxa3xx_nand.c
+++ b/drivers/mtd/nand/pxa3xx_nand.c
@@ -39,7 +39,7 @@
 #include
 #define NAND_DEV_READY_TIMEOUT 50
-#define CHIP_DELAY_TIMEOUT (2 * HZ/10)
+#define CHIP_DELAY_TIMEOUT (10 * HZ/10)
 #define NAND_STOP_DELAY (2 * HZ/50)
 #define PAGE_CHUNK_SIZE (2048)
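
For reference, the numbers in this thread can be cross-checked with a bit of
arithmetic. The sketch below is not driver code. It assumes a 128 KiB erase
block (inferred from the 0x20000 step between offsets in the nandwrite log)
and the 2048-byte page size from the boot log. The helper names are made up
for illustration, and HZ is passed as a parameter because it depends on the
kernel configuration.

```c
#include <stdint.h>

/* Assumed geometry: 2048-byte pages (from the boot log) and 128 KiB erase
 * blocks (inferred from the 0x20000 offset step in the nandwrite log). */
#define PAGE_SIZE_BYTES   2048u
#define ERASE_BLOCK_BYTES 0x20000u
#define PAGES_PER_BLOCK   (ERASE_BLOCK_BYTES / PAGE_SIZE_BYTES)

/* Hypothetical helper: byte offset of an erase block within a partition. */
uint32_t block_to_offset(uint32_t block)
{
    return block * ERASE_BLOCK_BYTES;
}

/* Hypothetical helper: erase block (chip-relative) that contains a page. */
uint32_t page_to_block(uint32_t page)
{
    return page / PAGES_PER_BLOCK;
}

/* CHIP_DELAY_TIMEOUT in jiffies before and after the test patch. */
unsigned int chip_delay_old(unsigned int hz)
{
    return 2 * hz / 10;
}

unsigned int chip_delay_new(unsigned int hz)
{
    return 10 * hz / 10;
}

/* Hypothetical helper: convert a jiffies count to milliseconds. */
unsigned int jiffies_as_ms(unsigned int jiffies, unsigned int hz)
{
    return jiffies * 1000u / hz;
}
```

Under those assumptions, block_to_offset(800) is 0x6400000 and
block_to_offset(823) is 0x66e0000, which matches the failing range in the
log. page_to_block(65472) and page_to_block(65408) give 1023 and 1022, the
last two blocks of a 1024-block (128MiB) chip. With HZ=100 the timeout goes
from 20 jiffies (200 ms) to 100 jiffies (1 s).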