diff mbox series

mtd: spi-nor-core: call WATCHDOG_RESET() in spi_nor_ready()

Message ID 20200316201832.28693-1-rasmus.villemoes@prevas.dk
State Changes Requested
Delegated to: Jagannadha Sutradharudu Teki
Headers show
Series mtd: spi-nor-core: call WATCHDOG_RESET() in spi_nor_ready() | expand

Commit Message

Rasmus Villemoes March 16, 2020, 8:18 p.m. UTC
I have a board for which doing "sf erase 0x100000 0x80000"
consistently causes the external watchdog circuit to reset the
board. Make sure to pet the watchdog during slow operations such as
erasing or writing large areas of a spi nor flash.

Signed-off-by: Rasmus Villemoes <rasmus.villemoes@prevas.dk>
---
 drivers/mtd/spi/spi-nor-core.c | 2 ++
 1 file changed, 2 insertions(+)

Comments

Vignesh Raghavendra March 19, 2020, 11:25 a.m. UTC | #1
Hi,

On 17/03/20 1:48 am, Rasmus Villemoes wrote:
> I have a board for which doing "sf erase 0x100000 0x80000"
> consistently causes the external watchdog circuit to reset the
> board. Make sure to pet the watchdog during slow operations such as
> erasing or writing large areas of a spi nor flash.
> 
> Signed-off-by: Rasmus Villemoes <rasmus.villemoes@prevas.dk>
> ---
>  drivers/mtd/spi/spi-nor-core.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/drivers/mtd/spi/spi-nor-core.c b/drivers/mtd/spi/spi-nor-core.c
> index 4076646225..b7f7aa7b28 100644
> --- a/drivers/mtd/spi/spi-nor-core.c
> +++ b/drivers/mtd/spi/spi-nor-core.c
> @@ -20,6 +20,7 @@
>  #include <linux/mtd/spi-nor.h>
>  #include <spi-mem.h>
>  #include <spi.h>
> +#include <watchdog.h>
>  
>  #include "sf_internal.h"
>  
> @@ -399,6 +400,7 @@ static int spi_nor_ready(struct spi_nor *nor)
>  {
>  	int sr, fsr;
>  
> +	WATCHDOG_RESET();

Is it necessary to reset watchdog for every status register read? How
small is the timeout? Resetting too often will impact performance
depending on overhead of this call.

Is it not sufficient to reset watchdog once per page write (in
spi_nor_write()) and once per sector erase (in spi_nor_erase())?


>  	sr = spi_nor_sr_ready(nor);
>  	if (sr < 0)
>  		return sr;
> 

Regards
Vignesh
Rasmus Villemoes March 19, 2020, 11:39 a.m. UTC | #2
On 19/03/2020 12.25, Vignesh Raghavendra wrote:
> Hi,
> 
> On 17/03/20 1:48 am, Rasmus Villemoes wrote:
>> I have a board for which doing "sf erase 0x100000 0x80000"
>> consistently causes the external watchdog circuit to reset the
>> board. Make sure to pet the watchdog during slow operations such as
>> erasing or writing large areas of a spi nor flash.
>>
>> Signed-off-by: Rasmus Villemoes <rasmus.villemoes@prevas.dk>
>> ---
>>  drivers/mtd/spi/spi-nor-core.c | 2 ++
>>  1 file changed, 2 insertions(+)
>>
>> diff --git a/drivers/mtd/spi/spi-nor-core.c b/drivers/mtd/spi/spi-nor-core.c
>> index 4076646225..b7f7aa7b28 100644
>> --- a/drivers/mtd/spi/spi-nor-core.c
>> +++ b/drivers/mtd/spi/spi-nor-core.c
>> @@ -20,6 +20,7 @@
>>  #include <linux/mtd/spi-nor.h>
>>  #include <spi-mem.h>
>>  #include <spi.h>
>> +#include <watchdog.h>
>>  
>>  #include "sf_internal.h"
>>  
>> @@ -399,6 +400,7 @@ static int spi_nor_ready(struct spi_nor *nor)
>>  {
>>  	int sr, fsr;
>>  
>> +	WATCHDOG_RESET();
> 
> Is it necessary to reset watchdog for every status register read? How
> small is the timeout? Resetting too often will impact performance
> depending on overhead of this call.
> 
> Is it not sufficient to reset watchdog once per page write (in
> spi_nor_write()) and once per sector erase (in spi_nor_erase())?
> 

Probably, yes. I was a bit torn between putting the call here or in
spi_nor_wait_till_ready(). That should do it once per erase/page write
which should be fine (well, if the busy-looping for spi_nor_ready takes
more than the watchdog timeout, the board will reset, but I don't think
the flash is that bad).

I'll test that, but I just found out I'll need something in the read
path as well. Reading 1MB works fine, reading 2MB resets:

[2020-03-19 12:31:11.923] => echo a ; sf read $loadaddr 0 0x100000 ; echo b
[2020-03-19 12:31:32.724] a
[2020-03-19 12:31:32.724] device 0 offset 0x0, size 0x100000
[2020-03-19 12:31:33.586] SF: 1048576 bytes @ 0x0 Read: OK
[2020-03-19 12:31:33.586] b
[2020-03-19 12:31:33.586] => echo a ; sf read $loadaddr 0 0x200000 ; echo b
[2020-03-19 12:31:40.771] a
[2020-03-19 12:31:40.771] device 0 offset 0x0, size 0x200000
[2020-03-19 12:31:42.666]
[2020-03-19 12:31:42.666] U-Boot SPL 2020.01-00078-g058da1a-dirty (Mar
17 2020 - 16:27:58 +0000)

Rasmus
Vignesh Raghavendra March 19, 2020, 12:28 p.m. UTC | #3
On 19/03/20 5:09 pm, Rasmus Villemoes wrote:
> On 19/03/2020 12.25, Vignesh Raghavendra wrote:
>> Hi,
[...]
>>> @@ -399,6 +400,7 @@ static int spi_nor_ready(struct spi_nor *nor)
>>>  {
>>>  	int sr, fsr;
>>>  
>>> +	WATCHDOG_RESET();
>>
>> Is it necessary to reset watchdog for every status register read? How
>> small is the timeout? Resetting too often will impact performance
>> depending on overhead of this call.
>>
>> Is it not sufficient to reset watchdog once per page write (in
>> spi_nor_write()) and once per sector erase (in spi_nor_erase())?
>>
> 
> Probably, yes. I was a bit torn between putting the call here or in
> spi_nor_wait_till_ready(). That should do it once per erase/page write
> which should be fine (well, if the busy-looping for spi_nor_ready takes
> more than the watchdog timeout, the board will reset, but I don't think
> the flash is that bad).>
> I'll test that, but I just found out I'll need something in the read
> path as well. Reading 1MB works fine, reading 2MB resets:
> 
> [2020-03-19 12:31:11.923] => echo a ; sf read $loadaddr 0 0x100000 ; echo b
> [2020-03-19 12:31:32.724] a
> [2020-03-19 12:31:32.724] device 0 offset 0x0, size 0x100000
> [2020-03-19 12:31:33.586] SF: 1048576 bytes @ 0x0 Read: OK
> [2020-03-19 12:31:33.586] b
> [2020-03-19 12:31:33.586] => echo a ; sf read $loadaddr 0 0x200000 ; echo b
> [2020-03-19 12:31:40.771] a
> [2020-03-19 12:31:40.771] device 0 offset 0x0, size 0x200000
> [2020-03-19 12:31:42.666]
> [2020-03-19 12:31:42.666] U-Boot SPL 2020.01-00078-g058da1a-dirty (Mar
> 17 2020 - 16:27:58 +0000)
> 

Hmm, Watchdog seems to be set to trigger after ~2s of inactivity. Isn't
that too small? WATCHDOG_RESET() resets only once per second, so 2
second timeout is too close IMO.

Typical watchdog timeouts are usually in tens of seconds
Rasmus Villemoes March 19, 2020, 12:44 p.m. UTC | #4
On 19/03/2020 13.28, Vignesh Raghavendra wrote:
> 
> 
> On 19/03/20 5:09 pm, Rasmus Villemoes wrote:
>> On 19/03/2020 12.25, Vignesh Raghavendra wrote:
>>> Hi,
> [...]
>>>> @@ -399,6 +400,7 @@ static int spi_nor_ready(struct spi_nor *nor)
>>>>  {
>>>>  	int sr, fsr;
>>>>  
>>>> +	WATCHDOG_RESET();
>>>
>>> Is it necessary to reset watchdog for every status register read? How
>>> small is the timeout? Resetting too often will impact performance
>>> depending on overhead of this call.
>>>
>>> Is it not sufficient to reset watchdog once per page write (in
>>> spi_nor_write()) and once per sector erase (in spi_nor_erase())?
>>>
>>
>> Probably, yes. I was a bit torn between putting the call here or in
>> spi_nor_wait_till_ready(). That should do it once per erase/page write
>> which should be fine (well, if the busy-looping for spi_nor_ready takes
>> more than the watchdog timeout, the board will reset, but I don't think
>> the flash is that bad).>
>> I'll test that, but I just found out I'll need something in the read
>> path as well. Reading 1MB works fine, reading 2MB resets:
>>
>> [2020-03-19 12:31:11.923] => echo a ; sf read $loadaddr 0 0x100000 ; echo b
>> [2020-03-19 12:31:32.724] a
>> [2020-03-19 12:31:32.724] device 0 offset 0x0, size 0x100000
>> [2020-03-19 12:31:33.586] SF: 1048576 bytes @ 0x0 Read: OK
>> [2020-03-19 12:31:33.586] b
>> [2020-03-19 12:31:33.586] => echo a ; sf read $loadaddr 0 0x200000 ; echo b
>> [2020-03-19 12:31:40.771] a
>> [2020-03-19 12:31:40.771] device 0 offset 0x0, size 0x200000
>> [2020-03-19 12:31:42.666]
>> [2020-03-19 12:31:42.666] U-Boot SPL 2020.01-00078-g058da1a-dirty (Mar
>> 17 2020 - 16:27:58 +0000)
>>
> 
> Hmm, Watchdog seems to be set to trigger after ~2s of inactivity. Isn't
> that too small? WATCHDOG_RESET() resets only once per second, so 2
> second timeout is too close IMO.
> 
> Typical watchdog timeouts are usually in tens of seconds

Nope, not with this external gpio-triggered one. The data sheet says

Watchdog timeout period 1.12/1.60/2.24 (min/typ/max)

so ~2 seconds sounds about right - and I have to account for other
instances of the board that may be a lot closer to the minimum. The
timeout is fixed, nothing in software can affect it. So you see why I
cannot afford to let spi flash operations take several seconds to
complete without a WATCHDOG_RESET().

And yes, I'm very well aware of the rate-limiting imposed by the
wdt-provided watchdog_reset() function - that's mostly a solved problem:
https://patchwork.ozlabs.org/project/uboot/list/?series=164254

For the read side, it seems that just replacing the UINT_MAX in the two
copies of spi_nor_read_data with some smaller constant should suffice.
But I don't know if I should make that smaller constant a CONFIG_*
option or just pick something like 256K. Thoughts?

Rasmus
Vignesh Raghavendra March 19, 2020, 1:23 p.m. UTC | #5
On 19/03/20 6:14 pm, Rasmus Villemoes wrote:
>> Hmm, Watchdog seems to be set to trigger after ~2s of inactivity. Isn't
>> that too small? WATCHDOG_RESET() resets only once per second, so 2
>> second timeout is too close IMO.
>>
>> Typical watchdog timeouts are usually in tens of seconds
> Nope, not with this external gpio-triggered one. The data sheet says
> 
> Watchdog timeout period 1.12/1.60/2.24 (min/typ/max)
> 
> so ~2 seconds sounds about right - and I have to account for other
> instances of the board that may be a lot closer to the minimum. The
> timeout is fixed, nothing in software can affect it. So you see why I
> cannot afford to let spi flash operations take several seconds to
> complete without a WATCHDOG_RESET().
> 
> And yes, I'm very well aware of the rate-limiting imposed by the
> wdt-provided watchdog_reset() function - that's mostly a solved problem:
> https://patchwork.ozlabs.org/project/uboot/list/?series=164254
> 

Understood.

> For the read side, it seems that just replacing the UINT_MAX in the two
> copies of spi_nor_read_data with some smaller constant should suffice.
> But I don't know if I should make that smaller constant a CONFIG_*
> option or just pick something like 256K. Thoughts?

Breaking reads into smaller units unconditionally will cause performance
regressions. But I would like to avoid adding new CONFIG option as well.

How about resetting the watchdog in the SPI driver's read
implementation? Or setting max_read_size (in struct spi_slave) to
smaller value in the SPI controller driver to force fragmentation of reads?
Rasmus Villemoes March 19, 2020, 1:50 p.m. UTC | #6
On 19/03/2020 14.23, Vignesh Raghavendra wrote:
> 

>> For the read side, it seems that just replacing the UINT_MAX in the two
>> copies of spi_nor_read_data with some smaller constant should suffice.
>> But I don't know if I should make that smaller constant a CONFIG_*
>> option or just pick something like 256K. Thoughts?
> 
> Breaking reads into smaller units unconditionally will cause performance
> regressions. But I would like to avoid adding new CONFIG option as well.

Hm, but how much are we talking about? I can't imagine WATCHDOG_RESET()
taking much more than 10us - especially the rate-limited one that has an
early return just based on reading the current time will be practically
free to call. For me, reading 256K takes about 200ms, which I assume is
a rather typical value. Even if I'm off by an order of magnitude on both
numbers, we're talking about adding an extra 100us for every 20ms, i.e.
0.5%. And that's extremely pessimistic.

> How about resetting the watchdog in the SPI driver's read
> implementation? 

I'd prefer something done in the generic layer, not something specific
to this SOC (because next month I'll have another customer with another
board based on some ARM SOC that also decided to put on an aggressive
gpio watchdog, and next month yet another...).

Or setting max_read_size (in struct spi_slave) to
> smaller value in the SPI controller driver to force fragmentation of reads?

Even if one forces fragmentation in that way, the generic layer would
still need to grow a WATCHDOG_RESET() call in the read loop, no? It also
seems to be an abuse of max_read_size.

Rasmus
Kuldeep Singh March 19, 2020, 2:52 p.m. UTC | #7
Hi Vignesh,

> -----Original Message-----
> From: U-Boot <u-boot-bounces@lists.denx.de> On Behalf Of Rasmus
> Villemoes
> Sent: Tuesday, March 17, 2020 1:49 AM
> To: u-boot@lists.denx.de
> Cc: Jagan Teki <jagan@amarulasolutions.com>; Vignesh R
> <vigneshr@ti.com>; Rasmus Villemoes <rasmus.villemoes@prevas.dk>
> Subject: [EXT] [PATCH] mtd: spi-nor-core: call WATCHDOG_RESET() in
> spi_nor_ready()
> 
> Caution: EXT Email
> 
> I have a board for which doing "sf erase 0x100000 0x80000"
> consistently causes the external watchdog circuit to reset the board. Make
> sure to pet the watchdog during slow operations such as erasing or writing
> large areas of a spi nor flash.

I also stumbled with the same problem of board resetting but it was in order of MBs and not in KBs.
Board gets reset while performing burst erase/write(for 48M erase, resets at 16M~) and read operation on 16M+.
I provided fix [1] in driver itself(add 1us delay) assuming it to be soc specific which solves problem for me and able to perform complete flash size operations.

Thanks 
Kuldeep

[1] https://patchwork.ozlabs.org/patch/1243005/
Rasmus Villemoes March 19, 2020, 3:05 p.m. UTC | #8
On 19/03/2020 15.52, Kuldeep Singh wrote:
> Hi Vignesh,
> 

>> I have a board for which doing "sf erase 0x100000 0x80000"
>> consistently causes the external watchdog circuit to reset the board. Make
>> sure to pet the watchdog during slow operations such as erasing or writing
>> large areas of a spi nor flash.
> 
> I also stumbled with the same problem of board resetting but it was in order of MBs and not in KBs.
> Board gets reset while performing burst erase/write(for 48M erase, resets at 16M~) and read operation on 16M+.
> I provided fix [1] in driver itself(add 1us delay) assuming it to be soc specific which solves problem for me and able to perform complete flash size operations.

Eh, but is it the short delay that fixes the problem you saw, or the
implicit WATCHDOG_RESET() done inside the udelay() function?

Rasmus
Vignesh Raghavendra March 20, 2020, 7:43 a.m. UTC | #9
On 19/03/20 7:20 pm, Rasmus Villemoes wrote:
> On 19/03/2020 14.23, Vignesh Raghavendra wrote:
>>
> 
>>> For the read side, it seems that just replacing the UINT_MAX in the two
>>> copies of spi_nor_read_data with some smaller constant should suffice.
>>> But I don't know if I should make that smaller constant a CONFIG_*
>>> option or just pick something like 256K. Thoughts?
>>
>> Breaking reads into smaller units unconditionally will cause performance
>> regressions. But I would like to avoid adding new CONFIG option as well.
> 
> Hm, but how much are we talking about? I can't imagine WATCHDOG_RESET()
> taking much more than 10us - especially the rate-limited one that has an
> early return just based on reading the current time will be practically
> free to call. For me, reading 256K takes about 200ms, which I assume is
> a rather typical value. Even if I'm off by an order of magnitude on both
> numbers, we're talking about adding an extra 100us for every 20ms, i.e.
> 0.5%. And that's extremely pessimistic.
> 

256K for 200ms is <2 MB/s which is pretty slow IMO.

QSPI and OSPIs flashes are capable of up to 400MB/s read throughput.
Some of the drivers that support such faster flashes use DMA to achieve
this and therefore have higher setup overhead per read request. Most SPI
drivers are not optimized and reconfigure registers for every read
requests which adds to the overhead.
Splitting of transfers into 256K unconditionally would degrade
performance for such platforms (irrespective of whether or not watchdog
is present).


>> How about resetting the watchdog in the SPI driver's read
>> implementation? 
> 
> I'd prefer something done in the generic layer, not something specific
> to this SOC (because next month I'll have another customer with another
> board based on some ARM SOC that also decided to put on an aggressive
> gpio watchdog, and next month yet another...).
> 

So, there should at least be a config option at the minimum. OR
determine the length of transfer possible before watchdog timeout
happens by looking at bus frequency and watchdog timeout value.

> Or setting max_read_size (in struct spi_slave) to
>> smaller value in the SPI controller driver to force fragmentation of reads?
> 
> Even if one forces fragmentation in that way, the generic layer would
> still need to grow a WATCHDOG_RESET() call in the read loop, no? It also
> seems to be an abuse of max_read_size.
> 

I agree that read loop should call WATCHDOG_RESET()

Regards
Vignesh
diff mbox series

Patch

diff --git a/drivers/mtd/spi/spi-nor-core.c b/drivers/mtd/spi/spi-nor-core.c
index 4076646225..b7f7aa7b28 100644
--- a/drivers/mtd/spi/spi-nor-core.c
+++ b/drivers/mtd/spi/spi-nor-core.c
@@ -20,6 +20,7 @@ 
 #include <linux/mtd/spi-nor.h>
 #include <spi-mem.h>
 #include <spi.h>
+#include <watchdog.h>
 
 #include "sf_internal.h"
 
@@ -399,6 +400,7 @@  static int spi_nor_ready(struct spi_nor *nor)
 {
 	int sr, fsr;
 
+	WATCHDOG_RESET();
 	sr = spi_nor_sr_ready(nor);
 	if (sr < 0)
 		return sr;