diff mbox series

[U-Boot] cmd: mtd: solve bad block support in erase command

Message ID 20190920072012.17841-1-patrick.delaunay@st.com
State Accepted
Commit b1b147f2b839c45b896dfacf5a2929313d88aa63
Delegated to: Tom Rini
Headers show
Series [U-Boot] cmd: mtd: solve bad block support in erase command | expand

Commit Message

Patrick DELAUNAY Sept. 20, 2019, 7:20 a.m. UTC
This patch modify the loop in mtd erase command to erase one by one
the blocks in the requested area.

It solves issue on "mtd erase" command on nand with existing bad block,
the command is interrupted on the first bad block with the trace:
	"Skipping bad block at 0xffffffffffffffff"

In MTD driver (nand/raw), when a bad block is present on the MTD
device, the erase_op.fail_addr is not updated and we have the initial
value MTD_FAIL_ADDR_UNKNOWN = (ULL)-1.

This case seems normal in nand_base.c:nand_erase_nand(),
we have the 2 exit cases during the loop:

1/ we have a bad block (nand_block_checkbad)
	instr->state = MTD_ERASE_FAILED
	loop interrupted (goto erase_exit)

2/ if block erase failed (status & NAND_STATUS_FAIL)
	instr->state = MTD_ERASE_FAILED;
	instr->fail_addr =
				((loff_t)page << chip->page_shift);
	loop interrupted (goto erase_exit)

So erase_op.fail_addr can't be used if bad blocks were present
in the erased area; we need to use mtd_erase only one block to detect
and skip these existing bad blocks (as it is done in nand_util.c).

Signed-off-by: Patrick Delaunay <patrick.delaunay@st.com>
---

Hi,

Found a correct in the mtd erase command.

I detect the issue and test the patch on STM32MP157C-EV1 board,
with nor and nand. We have the block table at the end of the nand
so the 4 last blocks are marked bad.

And I try to erase all the nand with the command "mtd erase".

Before the patch:

The "nand erase" command behavior is OK:

STM32MP> nand erase 0x0 0x000040000000

NAND erase: device 0 whole chip
Skipping bad block at  0x3ff00000
Skipping bad block at  0x3ff40000
Skipping bad block at  0x3ff80000
Skipping bad block at  0x3ffc0000

But the "mtd erase" command is not correct:

STM32MP> mtd list
SF: Detected mx66l51235l with page size 256 Bytes, erase size 64 KiB, total 64 MiB
List of MTD devices:
* nand0
  - type: NAND flash
  - block size: 0x40000 bytes
  - min I/O: 0x1000 bytes
  - OOB size: 224 bytes
  - OOB available: 118 bytes
  - ECC strength: 8 bits
  - ECC step size: 512 bytes
  - bitflip threshold: 6 bits
  - 0x000000000000-0x000040000000 : "nand0"
          - 0x000000000000-0x000000200000 : "fsbl"
          - 0x000000200000-0x000000400000 : "ssbl1"
          - 0x000000400000-0x000000600000 : "ssbl2"
          - 0x000000600000-0x000040000000 : "UBI"
* nor0
  - type: NOR flash
  - block size: 0x10000 bytes
  - min I/O: 0x1 bytes
  - 0x000000000000-0x000004000000 : "nor0"
          - 0x000000000000-0x000000040000 : "fsbl1"
          - 0x000000040000-0x000000080000 : "fsbl2"
          - 0x000000080000-0x000000280000 : "ssbl"
          - 0x000000280000-0x000000300000 : "u-boot-env"
          - 0x000000300000-0x000004000000 : "nor_user"

STM32MP> mtd erase nand0 0x0 0x000040000000
Erasing 0x00000000 ... 0x3fffffff (4096 eraseblock(s))
Skipping bad block at 0xffffffffffffffff

OK

The 4 bad blocks are not correctly skipped,
the command is stopped on the first error.

After the patch, the "mtd erase" command skips the 4 bad block
exactly as the "nand erase" command:

STM32MP> mtd erase nand0 0x000000000000 0x000040000000
SF: Detected mx66l51235l with page size 256 Bytes, erase size 64 KiB, total 64 MiB
Erasing 0x00000000 ... 0x3fffffff (4096 eraseblock(s))
Skipping bad block at 0x3ff00000
Skipping bad block at 0x3ff40000
Skipping bad block at 0x3ff80000
Skipping bad block at 0x3ffc0000

Regards

Patrick


 cmd/mtd.c | 24 ++++++++++++------------
 1 file changed, 12 insertions(+), 12 deletions(-)

Comments

Stefan Roese Sept. 20, 2019, 9:20 a.m. UTC | #1
Hi Patrick,

On 20.09.19 09:20, Patrick Delaunay wrote:
> This patch modify the loop in mtd erase command to erase one by one
> the blocks in the requested area.
> 
> It solves issue on "mtd erase" command on nand with existing bad block,
> the command is interrupted on the first bad block with the trace:
> 	"Skipping bad block at 0xffffffffffffffff"
> 
> In MTD driver (nand/raw), when a bad block is present on the MTD
> device, the erase_op.fail_addr is not updated and we have the initial
> value MTD_FAIL_ADDR_UNKNOWN = (ULL)-1.

So here is the difference? I remember testing this on a board with
SPI NAND and here it worked correctly. But your test case is with
RAW NAND?

Do you have a change to also test this on a board with SPI NAND?

Thanks,
Stefan
  
> This case seems normal in nand_base.c:nand_erase_nand(),
> we have the 2 exit cases during the loop:
> 
> 1/ we have a bad block (nand_block_checkbad)
> 	instr->state = MTD_ERASE_FAILED
> 	loop interrupted (goto erase_exit)
> 
> 2/ if block erase failed (status & NAND_STATUS_FAIL)
> 	instr->state = MTD_ERASE_FAILED;
> 	instr->fail_addr =
> 				((loff_t)page << chip->page_shift);
> 	loop interrupted (goto erase_exit)
> 
> So erase_op.fail_addr can't be used if bad blocks were present
> in the erased area; we need to use mtd_erase only one block to detect
> and skip these existing bad blocks (as it is done in nand_util.c).
> 
> Signed-off-by: Patrick Delaunay <patrick.delaunay@st.com>
> ---
> 
> Hi,
> 
> Found a correct in the mtd erase command.
> 
> I detect the issue and test the patch on STM32MP157C-EV1 board,
> with nor and nand. We have the block table at the end of the nand
> so the 4 last blocks are marked bad.
> 
> And I try to erase all the nand with the command "mtd erase".
> 
> Before the patch:
> 
> The "nand erase" command behavior is OK:
> 
> STM32MP> nand erase 0x0 0x000040000000
> 
> NAND erase: device 0 whole chip
> Skipping bad block at  0x3ff00000
> Skipping bad block at  0x3ff40000
> Skipping bad block at  0x3ff80000
> Skipping bad block at  0x3ffc0000
> 
> But the "mtd erase" command is not correct:
> 
> STM32MP> mtd list
> SF: Detected mx66l51235l with page size 256 Bytes, erase size 64 KiB, total 64 MiB
> List of MTD devices:
> * nand0
>    - type: NAND flash
>    - block size: 0x40000 bytes
>    - min I/O: 0x1000 bytes
>    - OOB size: 224 bytes
>    - OOB available: 118 bytes
>    - ECC strength: 8 bits
>    - ECC step size: 512 bytes
>    - bitflip threshold: 6 bits
>    - 0x000000000000-0x000040000000 : "nand0"
>            - 0x000000000000-0x000000200000 : "fsbl"
>            - 0x000000200000-0x000000400000 : "ssbl1"
>            - 0x000000400000-0x000000600000 : "ssbl2"
>            - 0x000000600000-0x000040000000 : "UBI"
> * nor0
>    - type: NOR flash
>    - block size: 0x10000 bytes
>    - min I/O: 0x1 bytes
>    - 0x000000000000-0x000004000000 : "nor0"
>            - 0x000000000000-0x000000040000 : "fsbl1"
>            - 0x000000040000-0x000000080000 : "fsbl2"
>            - 0x000000080000-0x000000280000 : "ssbl"
>            - 0x000000280000-0x000000300000 : "u-boot-env"
>            - 0x000000300000-0x000004000000 : "nor_user"
> 
> STM32MP> mtd erase nand0 0x0 0x000040000000
> Erasing 0x00000000 ... 0x3fffffff (4096 eraseblock(s))
> Skipping bad block at 0xffffffffffffffff
> 
> OK
> 
> The 4 bad blocks are not correctly skipped,
> the command is stopped on the first error.
> 
> After the patch, the "mtd erase" command skips the 4 bad block
> exactly as the "nand erase" command:
> 
> STM32MP> mtd erase nand0 0x000000000000 0x000040000000
> SF: Detected mx66l51235l with page size 256 Bytes, erase size 64 KiB, total 64 MiB
> Erasing 0x00000000 ... 0x3fffffff (4096 eraseblock(s))
> Skipping bad block at 0x3ff00000
> Skipping bad block at 0x3ff40000
> Skipping bad block at 0x3ff80000
> Skipping bad block at 0x3ffc0000
> 
> Regards
> 
> Patrick
> 
> 
>   cmd/mtd.c | 24 ++++++++++++------------
>   1 file changed, 12 insertions(+), 12 deletions(-)
> 
> diff --git a/cmd/mtd.c b/cmd/mtd.c
> index 1b6b8dda2b..a559b5a4a3 100644
> --- a/cmd/mtd.c
> +++ b/cmd/mtd.c
> @@ -387,7 +387,7 @@ static int do_mtd_erase(cmd_tbl_t *cmdtp, int flag, int argc,
>   	struct mtd_info *mtd;
>   	u64 off, len;
>   	bool scrub;
> -	int ret;
> +	int ret = 0;
>   
>   	if (argc < 2)
>   		return CMD_RET_USAGE;
> @@ -423,22 +423,22 @@ static int do_mtd_erase(cmd_tbl_t *cmdtp, int flag, int argc,
>   
>   	erase_op.mtd = mtd;
>   	erase_op.addr = off;
> -	erase_op.len = len;
> +	erase_op.len = mtd->erasesize;
>   	erase_op.scrub = scrub;
>   
> -	while (erase_op.len) {
> +	while (len) {
>   		ret = mtd_erase(mtd, &erase_op);
>   
> -		/* Abort if its not a bad block error */
> -		if (ret != -EIO)
> -			break;
> -
> -		printf("Skipping bad block at 0x%08llx\n", erase_op.fail_addr);
> +		if (ret) {
> +			/* Abort if its not a bad block error */
> +			if (ret != -EIO)
> +				break;
> +			printf("Skipping bad block at 0x%08llx\n",
> +			       erase_op.addr);
> +		}
>   
> -		/* Skip bad block and continue behind it */
> -		erase_op.len -= erase_op.fail_addr - erase_op.addr;
> -		erase_op.len -= mtd->erasesize;
> -		erase_op.addr = erase_op.fail_addr + mtd->erasesize;
> +		len -= mtd->erasesize;
> +		erase_op.addr += mtd->erasesize;
>   	}
>   
>   	if (ret && ret != -EIO)
>
Miquel Raynal Sept. 20, 2019, 9:55 a.m. UTC | #2
Hi Patrick,

Patrick Delaunay <patrick.delaunay@st.com> wrote on Fri, 20 Sep 2019
09:20:12 +0200:

> This patch modify the loop in mtd erase command to erase one by one
> the blocks in the requested area.
> 
> It solves issue on "mtd erase" command on nand with existing bad block,
> the command is interrupted on the first bad block with the trace:
> 	"Skipping bad block at 0xffffffffffffffff"
> 
> In MTD driver (nand/raw), when a bad block is present on the MTD
> device, the erase_op.fail_addr is not updated and we have the initial
> value MTD_FAIL_ADDR_UNKNOWN = (ULL)-1.
> 
> This case seems normal in nand_base.c:nand_erase_nand(),
> we have the 2 exit cases during the loop:
> 
> 1/ we have a bad block (nand_block_checkbad)
> 	instr->state = MTD_ERASE_FAILED
> 	loop interrupted (goto erase_exit)
> 
> 2/ if block erase failed (status & NAND_STATUS_FAIL)
> 	instr->state = MTD_ERASE_FAILED;
> 	instr->fail_addr =
> 				((loff_t)page << chip->page_shift);
> 	loop interrupted (goto erase_exit)
> 
> So erase_op.fail_addr can't be used if bad blocks were present
> in the erased area; we need to use mtd_erase only one block to detect
> and skip these existing bad blocks (as it is done in nand_util.c).
> 
> Signed-off-by: Patrick Delaunay <patrick.delaunay@st.com>
> ---
> 
> Hi,
> 
> Found a correct in the mtd erase command.
> 
> I detect the issue and test the patch on STM32MP157C-EV1 board,
> with nor and nand. We have the block table at the end of the nand
> so the 4 last blocks are marked bad.
> 
> And I try to erase all the nand with the command "mtd erase".
> 
> Before the patch:
> 
> The "nand erase" command behavior is OK:
> 
> STM32MP> nand erase 0x0 0x000040000000  
> 
> NAND erase: device 0 whole chip
> Skipping bad block at  0x3ff00000
> Skipping bad block at  0x3ff40000
> Skipping bad block at  0x3ff80000
> Skipping bad block at  0x3ffc0000
> 
> But the "mtd erase" command is not correct:
> 
> STM32MP> mtd list  
> SF: Detected mx66l51235l with page size 256 Bytes, erase size 64 KiB, total 64 MiB
> List of MTD devices:
> * nand0
>   - type: NAND flash
>   - block size: 0x40000 bytes
>   - min I/O: 0x1000 bytes
>   - OOB size: 224 bytes
>   - OOB available: 118 bytes
>   - ECC strength: 8 bits
>   - ECC step size: 512 bytes
>   - bitflip threshold: 6 bits
>   - 0x000000000000-0x000040000000 : "nand0"
>           - 0x000000000000-0x000000200000 : "fsbl"
>           - 0x000000200000-0x000000400000 : "ssbl1"
>           - 0x000000400000-0x000000600000 : "ssbl2"
>           - 0x000000600000-0x000040000000 : "UBI"
> * nor0
>   - type: NOR flash
>   - block size: 0x10000 bytes
>   - min I/O: 0x1 bytes
>   - 0x000000000000-0x000004000000 : "nor0"
>           - 0x000000000000-0x000000040000 : "fsbl1"
>           - 0x000000040000-0x000000080000 : "fsbl2"
>           - 0x000000080000-0x000000280000 : "ssbl"
>           - 0x000000280000-0x000000300000 : "u-boot-env"
>           - 0x000000300000-0x000004000000 : "nor_user"
> 
> STM32MP> mtd erase nand0 0x0 0x000040000000  
> Erasing 0x00000000 ... 0x3fffffff (4096 eraseblock(s))
> Skipping bad block at 0xffffffffffffffff
> 
> OK
> 
> The 4 bad blocks are not correctly skipped,
> the command is stopped on the first error.
> 
> After the patch, the "mtd erase" command skips the 4 bad block
> exactly as the "nand erase" command:
> 
> STM32MP> mtd erase nand0 0x000000000000 0x000040000000  
> SF: Detected mx66l51235l with page size 256 Bytes, erase size 64 KiB, total 64 MiB
> Erasing 0x00000000 ... 0x3fffffff (4096 eraseblock(s))
> Skipping bad block at 0x3ff00000
> Skipping bad block at 0x3ff40000
> Skipping bad block at 0x3ff80000
> Skipping bad block at 0x3ffc0000
> 
> Regards
> 
> Patrick
> 
> 
>  cmd/mtd.c | 24 ++++++++++++------------
>  1 file changed, 12 insertions(+), 12 deletions(-)
> 
> diff --git a/cmd/mtd.c b/cmd/mtd.c
> index 1b6b8dda2b..a559b5a4a3 100644
> --- a/cmd/mtd.c
> +++ b/cmd/mtd.c
> @@ -387,7 +387,7 @@ static int do_mtd_erase(cmd_tbl_t *cmdtp, int flag, int argc,
>  	struct mtd_info *mtd;
>  	u64 off, len;
>  	bool scrub;
> -	int ret;
> +	int ret = 0;
>  
>  	if (argc < 2)
>  		return CMD_RET_USAGE;
> @@ -423,22 +423,22 @@ static int do_mtd_erase(cmd_tbl_t *cmdtp, int flag, int argc,
>  
>  	erase_op.mtd = mtd;
>  	erase_op.addr = off;
> -	erase_op.len = len;
> +	erase_op.len = mtd->erasesize;
>  	erase_op.scrub = scrub;
>  
> -	while (erase_op.len) {
> +	while (len) {
>  		ret = mtd_erase(mtd, &erase_op);
>  
> -		/* Abort if its not a bad block error */
> -		if (ret != -EIO)
> -			break;
> -
> -		printf("Skipping bad block at 0x%08llx\n", erase_op.fail_addr);
> +		if (ret) {
> +			/* Abort if its not a bad block error */
> +			if (ret != -EIO)
> +				break;
> +			printf("Skipping bad block at 0x%08llx\n",
> +			       erase_op.addr);
> +		}
>  
> -		/* Skip bad block and continue behind it */
> -		erase_op.len -= erase_op.fail_addr - erase_op.addr;
> -		erase_op.len -= mtd->erasesize;
> -		erase_op.addr = erase_op.fail_addr + mtd->erasesize;
> +		len -= mtd->erasesize;
> +		erase_op.addr += mtd->erasesize;
>  	}
>  
>  	if (ret && ret != -EIO)

Nice catch!

Reviewed-by: Miquel Raynal <miquel.raynal@bootlin.com>


Thanks,
Miquèl
Patrick DELAUNAY Sept. 26, 2019, 9:31 a.m. UTC | #3
Hi Stefan,

> From: Stefan Roese <sr@denx.de>
> Sent: vendredi 20 septembre 2019 11:20
> 
> Hi Patrick,
> 
> On 20.09.19 09:20, Patrick Delaunay wrote:
> > This patch modify the loop in mtd erase command to erase one by one
> > the blocks in the requested area.
> >
> > It solves issue on "mtd erase" command on nand with existing bad
> > block, the command is interrupted on the first bad block with the trace:
> > 	"Skipping bad block at 0xffffffffffffffff"
> >
> > In MTD driver (nand/raw), when a bad block is present on the MTD
> > device, the erase_op.fail_addr is not updated and we have the initial
> > value MTD_FAIL_ADDR_UNKNOWN = (ULL)-1.
> 
> So here is the difference? I remember testing this on a board with SPI NAND and
> here it worked correctly. But your test case is with RAW NAND?

Yes with RAW nand.

it is the difference the U-Boot code, for SPI nan use:
	int nanddev_mtd_erase()

the fail address is always updated 
	=> einfo->fail_addr = nanddev_pos_to_offs(nand, &pos);

 
> Do you have a change to also test this on a board with SPI NAND?

I do the test  a SPI-NAND today.

The mtd erase command was functional on SPI-ANND before my patch : 
I create 2 bad block manually and they are correctly skipped.

STM32MP> mtd list  
List of MTD devices:
* spi-nand0
  - device: spi-nand@0
  - parent: qspi@58003000
  - driver: spi_nand
  - type: NAND flash
  - block size: 0x20000 bytes
  - min I/O: 0x800 bytes
  - OOB size: 128 bytes
  - OOB available: 62 bytes
  - 0x000000000000-0x000010000000 : "spi-nand0"
	  - 0x000000000000-0x000000200000 : "fsbl"
	  - 0x000000200000-0x000000400000 : "ssbl1"
	  - 0x000000400000-0x000000600000 : "ssbl2"
	  - 0x000000600000-0x000010000000 : "UBI"

STM32MP> mtd erase spi-nand0 0x00000000 0x10000000            
Erasing 0x00000000 ... 0x0fffffff (2048 eraseblock(s))
0x0fd00000: bad block
0x0fd20000: bad block
attempt to erase a bad/reserved block @fd00000
Skipping bad block at 0x0fd00000
attempt to erase a bad/reserved block @fd20000
Skipping bad block at 0x0fd20000
0x0fd00000: bad block
0x0fd20000: bad block


> Thanks,
> Stefan
> 

What it is the better solution for you ?

 update the MTD command (my patch) or allign the behavior of the 2 MTD devices 
- MTD RAW NAND (nand_base.c:: nand_erase_nand)
- MTD SPI NAND (core.c:: nanddev_mtd_erase)

Best regards 

Patrick
Miquel Raynal Sept. 26, 2019, 9:42 a.m. UTC | #4
Hi Patrick,

Patrick DELAUNAY <patrick.delaunay@st.com> wrote on Thu, 26 Sep 2019
09:31:46 +0000:

> Hi Stefan,
> 
> > From: Stefan Roese <sr@denx.de>
> > Sent: vendredi 20 septembre 2019 11:20
> > 
> > Hi Patrick,
> > 
> > On 20.09.19 09:20, Patrick Delaunay wrote:  
> > > This patch modify the loop in mtd erase command to erase one by one
> > > the blocks in the requested area.
> > >
> > > It solves issue on "mtd erase" command on nand with existing bad
> > > block, the command is interrupted on the first bad block with the trace:
> > > 	"Skipping bad block at 0xffffffffffffffff"
> > >
> > > In MTD driver (nand/raw), when a bad block is present on the MTD
> > > device, the erase_op.fail_addr is not updated and we have the initial
> > > value MTD_FAIL_ADDR_UNKNOWN = (ULL)-1.  
> > 
> > So here is the difference? I remember testing this on a board with SPI NAND and
> > here it worked correctly. But your test case is with RAW NAND?  
> 
> Yes with RAW nand.
> 
> it is the difference the U-Boot code, for SPI nan use:
> 	int nanddev_mtd_erase()
> 
> the fail address is always updated 
> 	=> einfo->fail_addr = nanddev_pos_to_offs(nand, &pos);  
> 
>  
> > Do you have a change to also test this on a board with SPI NAND?  
> 
> I do the test  a SPI-NAND today.
> 
> The mtd erase command was functional on SPI-ANND before my patch : 
> I create 2 bad block manually and they are correctly skipped.
> 
> STM32MP> mtd list    
> List of MTD devices:
> * spi-nand0
>   - device: spi-nand@0
>   - parent: qspi@58003000
>   - driver: spi_nand
>   - type: NAND flash
>   - block size: 0x20000 bytes
>   - min I/O: 0x800 bytes
>   - OOB size: 128 bytes
>   - OOB available: 62 bytes
>   - 0x000000000000-0x000010000000 : "spi-nand0"
> 	  - 0x000000000000-0x000000200000 : "fsbl"
> 	  - 0x000000200000-0x000000400000 : "ssbl1"
> 	  - 0x000000400000-0x000000600000 : "ssbl2"
> 	  - 0x000000600000-0x000010000000 : "UBI"
> 
> STM32MP> mtd erase spi-nand0 0x00000000 0x10000000              
> Erasing 0x00000000 ... 0x0fffffff (2048 eraseblock(s))
> 0x0fd00000: bad block
> 0x0fd20000: bad block
> attempt to erase a bad/reserved block @fd00000
> Skipping bad block at 0x0fd00000
> attempt to erase a bad/reserved block @fd20000
> Skipping bad block at 0x0fd20000
> 0x0fd00000: bad block
> 0x0fd20000: bad block
> 
> 
> > Thanks,
> > Stefan
> >   
> 
> What it is the better solution for you ?
> 
>  update the MTD command (my patch) or allign the behavior of the 2 MTD devices 
> - MTD RAW NAND (nand_base.c:: nand_erase_nand)
> - MTD SPI NAND (core.c:: nanddev_mtd_erase)

Do you think it is easy to make use of nanddev_mtd_erase() from the raw
NAND core? It is probably a little bit more elegant (and efficient)
to do all in one go than iterating over each block (while there is a
helper in the core to do that).


Thanks,
Miquèl
Patrick DELAUNAY Sept. 27, 2019, 11:23 a.m. UTC | #5
Hi Miquel

> From: Miquel Raynal <miquel.raynal@bootlin.com>
> Sent: jeudi 26 septembre 2019 11:43
> 
> Hi Patrick,
> 
> Patrick DELAUNAY <patrick.delaunay@st.com> wrote on Thu, 26 Sep 2019
> 09:31:46 +0000:
> 
> > Hi Stefan,
> >
> > > From: Stefan Roese <sr@denx.de>
> > > Sent: vendredi 20 septembre 2019 11:20
> > >
> > > Hi Patrick,
> > >
> > > On 20.09.19 09:20, Patrick Delaunay wrote:
> > > > This patch modify the loop in mtd erase command to erase one by
> > > > one the blocks in the requested area.
> > > >
> > > > It solves issue on "mtd erase" command on nand with existing bad
> > > > block, the command is interrupted on the first bad block with the trace:
> > > > 	"Skipping bad block at 0xffffffffffffffff"
> > > >
> > > > In MTD driver (nand/raw), when a bad block is present on the MTD
> > > > device, the erase_op.fail_addr is not updated and we have the
> > > > initial value MTD_FAIL_ADDR_UNKNOWN = (ULL)-1.
> > >
> > > So here is the difference? I remember testing this on a board with
> > > SPI NAND and here it worked correctly. But your test case is with RAW
> NAND?
> >
> > Yes with RAW nand.
> >
> > it is the difference the U-Boot code, for SPI nan use:
> > 	int nanddev_mtd_erase()
> >
> > the fail address is always updated
> > 	=> einfo->fail_addr = nanddev_pos_to_offs(nand, &pos);
> >
> >
> > > Do you have a change to also test this on a board with SPI NAND?
> >
> > I do the test  a SPI-NAND today.
> >
> > The mtd erase command was functional on SPI-ANND before my patch :
> > I create 2 bad block manually and they are correctly skipped.
> >
> > STM32MP> mtd list
> > List of MTD devices:
> > * spi-nand0
> >   - device: spi-nand@0
> >   - parent: qspi@58003000
> >   - driver: spi_nand
> >   - type: NAND flash
> >   - block size: 0x20000 bytes
> >   - min I/O: 0x800 bytes
> >   - OOB size: 128 bytes
> >   - OOB available: 62 bytes
> >   - 0x000000000000-0x000010000000 : "spi-nand0"
> > 	  - 0x000000000000-0x000000200000 : "fsbl"
> > 	  - 0x000000200000-0x000000400000 : "ssbl1"
> > 	  - 0x000000400000-0x000000600000 : "ssbl2"
> > 	  - 0x000000600000-0x000010000000 : "UBI"
> >
> > STM32MP> mtd erase spi-nand0 0x00000000 0x10000000
> > Erasing 0x00000000 ... 0x0fffffff (2048 eraseblock(s))
> > 0x0fd00000: bad block
> > 0x0fd20000: bad block
> > attempt to erase a bad/reserved block @fd00000 Skipping bad block at
> > 0x0fd00000 attempt to erase a bad/reserved block @fd20000 Skipping bad
> > block at 0x0fd20000
> > 0x0fd00000: bad block
> > 0x0fd20000: bad block
> >
> >
> > > Thanks,
> > > Stefan
> > >
> >
> > What it is the better solution for you ?
> >
> >  update the MTD command (my patch) or allign the behavior of the 2 MTD
> > devices
> > - MTD RAW NAND (nand_base.c:: nand_erase_nand)
> > - MTD SPI NAND (core.c:: nanddev_mtd_erase)
> 
> Do you think it is easy to make use of nanddev_mtd_erase() from the raw NAND
> core? It is probably a little bit more elegant (and efficient) to do all in one go than
> iterating over each block (while there is a helper in the core to do that).


Yes, I agree: it will be more elegant.

But,  I am not comfortable with MTD Raw NAND framework.

Based on a quick check between Linux Kernel 5.3 and U-Boot, it seems that U-Boot
Raw NAND framework is aligned with Kernel 4.19 Raw NAND framework.
To be able to use nanddev_mtd_erase API, we should backport Raw NAND framework
from Kernel 5.3 because nanddev_mtd_erase can be used only if memorg structure
is properly set (has been done on Kernel 5.2).

I have not checked all potential impacts to use this API, but a backport form Kernel
Raw NAND framework is needed in U-Boot in a first step.

As  I am not comfortable with MTD frameworks, I think that my patch could be currently
applied to solve this issue, and in a second step, when a MTD expert will backport the
framework, it could be removed.

PS: A other solution with minimize the impacts in MTD, it is to change
      only nand_base.c:nand_erase_nand(), to update the fail_addr:

----------------------- drivers/mtd/nand/raw/nand_base.c ----------------------- index aba8ac019d..50542a2b9a 100644 @@ -3554,6 +3554,8 @@ int nand_erase_nand(struct mtd_info *mtd, struct erase_info *instr,
 			pr_warn("%s: attempt to erase a bad block at page 0x%08x\n",
 				    __func__, page);
 			instr->state = MTD_ERASE_FAILED;
+			instr->fail_addr =
+				((loff_t)page << chip->page_shift);
 			goto erase_exit;
 		}

But as it is also a common MTD part with Linux kernel so I continue to prefer
my patch on U-Boot only code.

> 
> Thanks,
> Miquèl

Regards
Patrick
Miquel Raynal Sept. 29, 2019, 7:46 p.m. UTC | #6
Hi Patrick,

Patrick DELAUNAY <patrick.delaunay@st.com> wrote on Fri, 27 Sep 2019
11:23:09 +0000:

> Hi Miquel
> 
> > From: Miquel Raynal <miquel.raynal@bootlin.com>
> > Sent: jeudi 26 septembre 2019 11:43
> > 
> > Hi Patrick,
> > 
> > Patrick DELAUNAY <patrick.delaunay@st.com> wrote on Thu, 26 Sep 2019
> > 09:31:46 +0000:
> >   
> > > Hi Stefan,
> > >  
> > > > From: Stefan Roese <sr@denx.de>
> > > > Sent: vendredi 20 septembre 2019 11:20
> > > >
> > > > Hi Patrick,
> > > >
> > > > On 20.09.19 09:20, Patrick Delaunay wrote:  
> > > > > This patch modify the loop in mtd erase command to erase one by
> > > > > one the blocks in the requested area.
> > > > >
> > > > > It solves issue on "mtd erase" command on nand with existing bad
> > > > > block, the command is interrupted on the first bad block with the trace:
> > > > > 	"Skipping bad block at 0xffffffffffffffff"
> > > > >
> > > > > In MTD driver (nand/raw), when a bad block is present on the MTD
> > > > > device, the erase_op.fail_addr is not updated and we have the
> > > > > initial value MTD_FAIL_ADDR_UNKNOWN = (ULL)-1.  
> > > >
> > > > So here is the difference? I remember testing this on a board with
> > > > SPI NAND and here it worked correctly. But your test case is with RAW  
> > NAND?  
> > >
> > > Yes with RAW nand.
> > >
> > > it is the difference the U-Boot code, for SPI nan use:
> > > 	int nanddev_mtd_erase()
> > >
> > > the fail address is always updated  
> > > 	=> einfo->fail_addr = nanddev_pos_to_offs(nand, &pos);  
> > >
> > >  
> > > > Do you have a change to also test this on a board with SPI NAND?  
> > >
> > > I do the test  a SPI-NAND today.
> > >
> > > The mtd erase command was functional on SPI-ANND before my patch :
> > > I create 2 bad block manually and they are correctly skipped.
> > >  
> > > STM32MP> mtd list  
> > > List of MTD devices:
> > > * spi-nand0
> > >   - device: spi-nand@0
> > >   - parent: qspi@58003000
> > >   - driver: spi_nand
> > >   - type: NAND flash
> > >   - block size: 0x20000 bytes
> > >   - min I/O: 0x800 bytes
> > >   - OOB size: 128 bytes
> > >   - OOB available: 62 bytes
> > >   - 0x000000000000-0x000010000000 : "spi-nand0"
> > > 	  - 0x000000000000-0x000000200000 : "fsbl"
> > > 	  - 0x000000200000-0x000000400000 : "ssbl1"
> > > 	  - 0x000000400000-0x000000600000 : "ssbl2"
> > > 	  - 0x000000600000-0x000010000000 : "UBI"
> > >  
> > > STM32MP> mtd erase spi-nand0 0x00000000 0x10000000  
> > > Erasing 0x00000000 ... 0x0fffffff (2048 eraseblock(s))
> > > 0x0fd00000: bad block
> > > 0x0fd20000: bad block
> > > attempt to erase a bad/reserved block @fd00000 Skipping bad block at
> > > 0x0fd00000 attempt to erase a bad/reserved block @fd20000 Skipping bad
> > > block at 0x0fd20000
> > > 0x0fd00000: bad block
> > > 0x0fd20000: bad block
> > >
> > >  
> > > > Thanks,
> > > > Stefan
> > > >  
> > >
> > > What it is the better solution for you ?
> > >
> > >  update the MTD command (my patch) or allign the behavior of the 2 MTD
> > > devices
> > > - MTD RAW NAND (nand_base.c:: nand_erase_nand)
> > > - MTD SPI NAND (core.c:: nanddev_mtd_erase)  
> > 
> > Do you think it is easy to make use of nanddev_mtd_erase() from the raw NAND
> > core? It is probably a little bit more elegant (and efficient) to do all in one go than
> > iterating over each block (while there is a helper in the core to do that).  
> 
> 
> Yes, I agree: it will be more elegant.
> 
> But,  I am not comfortable with MTD Raw NAND framework.
> 
> Based on a quick check between Linux Kernel 5.3 and U-Boot, it seems that U-Boot
> Raw NAND framework is aligned with Kernel 4.19 Raw NAND framework.
> To be able to use nanddev_mtd_erase API, we should backport Raw NAND framework
> from Kernel 5.3 because nanddev_mtd_erase can be used only if memorg structure
> is properly set (has been done on Kernel 5.2).
> 
> I have not checked all potential impacts to use this API, but a backport form Kernel
> Raw NAND framework is needed in U-Boot in a first step.
> 
> As  I am not comfortable with MTD frameworks, I think that my patch could be currently
> applied to solve this issue, and in a second step, when a MTD expert will backport the
> framework, it could be removed.
> 
> PS: A other solution with minimize the impacts in MTD, it is to change
>       only nand_base.c:nand_erase_nand(), to update the fail_addr:
> 
> ----------------------- drivers/mtd/nand/raw/nand_base.c ----------------------- index aba8ac019d..50542a2b9a 100644 @@ -3554,6 +3554,8 @@ int nand_erase_nand(struct mtd_info *mtd, struct erase_info *instr,
>  			pr_warn("%s: attempt to erase a bad block at page 0x%08x\n",
>  				    __func__, page);
>  			instr->state = MTD_ERASE_FAILED;
> +			instr->fail_addr =
> +				((loff_t)page << chip->page_shift);
>  			goto erase_exit;
>  		}
> 
> But as it is also a common MTD part with Linux kernel so I continue to prefer
> my patch on U-Boot only code.

I understand, I'm fine with the cmd/mtd.c to be changed only.

Thanks,
Miquèl
Tom Rini Jan. 25, 2020, 5:08 p.m. UTC | #7
On Fri, Sep 20, 2019 at 09:20:12AM +0200, Patrick Delaunay wrote:

> This patch modify the loop in mtd erase command to erase one by one
> the blocks in the requested area.
> 
> It solves issue on "mtd erase" command on nand with existing bad block,
> the command is interrupted on the first bad block with the trace:
> 	"Skipping bad block at 0xffffffffffffffff"
> 
> In MTD driver (nand/raw), when a bad block is present on the MTD
> device, the erase_op.fail_addr is not updated and we have the initial
> value MTD_FAIL_ADDR_UNKNOWN = (ULL)-1.
> 
> This case seems normal in nand_base.c:nand_erase_nand(),
> we have the 2 exit cases during the loop:
> 
> 1/ we have a bad block (nand_block_checkbad)
> 	instr->state = MTD_ERASE_FAILED
> 	loop interrupted (goto erase_exit)
> 
> 2/ if block erase failed (status & NAND_STATUS_FAIL)
> 	instr->state = MTD_ERASE_FAILED;
> 	instr->fail_addr =
> 				((loff_t)page << chip->page_shift);
> 	loop interrupted (goto erase_exit)
> 
> So erase_op.fail_addr can't be used if bad blocks were present
> in the erased area; we need to use mtd_erase only one block to detect
> and skip these existing bad blocks (as it is done in nand_util.c).
> 
> Signed-off-by: Patrick Delaunay <patrick.delaunay@st.com>
> Reviewed-by: Miquel Raynal <miquel.raynal@bootlin.com>

Applied to u-boot/master, thanks!
diff mbox series

Patch

diff --git a/cmd/mtd.c b/cmd/mtd.c
index 1b6b8dda2b..a559b5a4a3 100644
--- a/cmd/mtd.c
+++ b/cmd/mtd.c
@@ -387,7 +387,7 @@  static int do_mtd_erase(cmd_tbl_t *cmdtp, int flag, int argc,
 	struct mtd_info *mtd;
 	u64 off, len;
 	bool scrub;
-	int ret;
+	int ret = 0;
 
 	if (argc < 2)
 		return CMD_RET_USAGE;
@@ -423,22 +423,22 @@  static int do_mtd_erase(cmd_tbl_t *cmdtp, int flag, int argc,
 
 	erase_op.mtd = mtd;
 	erase_op.addr = off;
-	erase_op.len = len;
+	erase_op.len = mtd->erasesize;
 	erase_op.scrub = scrub;
 
-	while (erase_op.len) {
+	while (len) {
 		ret = mtd_erase(mtd, &erase_op);
 
-		/* Abort if its not a bad block error */
-		if (ret != -EIO)
-			break;
-
-		printf("Skipping bad block at 0x%08llx\n", erase_op.fail_addr);
+		if (ret) {
+			/* Abort if its not a bad block error */
+			if (ret != -EIO)
+				break;
+			printf("Skipping bad block at 0x%08llx\n",
+			       erase_op.addr);
+		}
 
-		/* Skip bad block and continue behind it */
-		erase_op.len -= erase_op.fail_addr - erase_op.addr;
-		erase_op.len -= mtd->erasesize;
-		erase_op.addr = erase_op.fail_addr + mtd->erasesize;
+		len -= mtd->erasesize;
+		erase_op.addr += mtd->erasesize;
 	}
 
 	if (ret && ret != -EIO)