diff mbox

mtd: nand: Disable subpage writes for drivers without ecc->hwctl

Message ID 1397034804-27161-1-git-send-email-helmut.schaa@googlemail.com
State Accepted
Commit 44991b3d19cd71eabe68019ae7cb91df0c929614
Headers show

Commit Message

Helmut Schaa April 9, 2014, 9:13 a.m. UTC
nand_write_subpage_hwecc causes a crash if the driver did not register
ecc->hwctl or ecc->calculate. Fix this by disabling subpage writes if
ecc->hwctl or ecc->calculate is not provided by the driver.

This behavior was introduced in commit 837a6ba4f3b6d23026674e6af6b6849a4634fff9
"mtd: nand: subpage write support for hardware based ECC schemes".

This fixes a crash with fsl_elbc_nand and maybe others:

Unable to handle kernel paging request for instruction fetch
Faulting instruction address: 0x00000000
Oops: Kernel access of bad area, sig: 11 [#1]
SMP NR_CPUS=2 P1020 RDB
Modules linked in: ath9k ath9k_common pppoe ppp_async option iptable_nat ath9k_hw ath usb_wwan pppox ppp_generic nf_nat_ipv4 nf_conntrack_ipv4 mac80211 ipt_MASQUERADE cfg80211 xt_time xt_tcpudp xt_state xt_quota xt_policy xt_pkttype xt_owner xt_nat xt_multiport xt_mh
CPU: 1 PID: 2161 Comm: ubiformat Not tainted 3.10.26 #6
task: efbc2700 ti: c7950000 task.ti: c7950000
NIP: 00000000 LR: c01a495c CTR: 00000000
REGS: c7951cb0 TRAP: 0400   Not tainted  (3.10.26)
MSR: 00029000 <CE,EE,ME>  CR: 24002028  XER: 00000000

GPR00: c01a4b6c c7951d60 efbc2700 ef84b000 00000001 00000000 000001ff c7800500
GPR08: 00000000 00000000 efae5e40 c01a4ae4 24002022 10023418 c7951e5c c7800500
GPR16: c017b6a8 00000000 0000003f c053404c 00000000 00000004 00000000 00000003
GPR24: 00000010 00000200 ef84b000 c7800d00 c7800000 c7800500 ef84b1c8 00000000
NIP [00000000]   (null)
LR [c01a495c] nand_write_subpage_hwecc+0x74/0x174
Call Trace:
[c7951d60] [c7951d64] 0xc7951d64 (unreliable)
[c7951da0] [c01a4b6c] nand_write_page+0x88/0x198
[c7951dd0] [c01a5f7c] nand_do_write_ops+0x2f4/0x39c
[c7951e40] [c01a61e0] nand_write+0x58/0x84
[c7951e80] [c019e29c] mtdchar_write+0x1dc/0x28c
[c7951ef0] [c00aba84] vfs_write+0xcc/0x1ac
[c7951f10] [c00ac04c] SyS_write+0x4c/0x90
[c7951f40] [c000cd84] ret_from_syscall+0x0/0x3c
 --- Exception: c01 at 0x48050ed8
     LR = 0x100071b8
 Instruction dump:
 XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
 XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
 ---[ end trace 161d3c65a2a15cb8 ]---

Kernel panic - not syncing: Fatal exception

Cc: Gupta, Pekon <pekon@ti.com>
Cc: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Cc: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: Helmut Schaa <helmut.schaa@googlemail.com>
---
I noticed this bug on openwrt with kernel 3.10 but it looks as if this can
still happen in more recent kernels.

 drivers/mtd/nand/nand_base.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

pekon gupta April 9, 2014, 9:38 a.m. UTC | #1
>From: Helmut Schaa [mailto:helmut.schaa@googlemail.com]
>
>nand_write_subpage_hwecc causes a crash if the driver did not register
>ecc->hwctl or ecc->calculate. Fix this by disabling subpage writes if
>ecc->hwctl or ecc->calculate is not provided by the driver.
>
>This behavior was introduced in commit 837a6ba4f3b6d23026674e6af6b6849a4634fff9
>"mtd: nand: subpage write support for hardware based ECC schemes".
>
>This fixes a crash with fsl_elbc_nand and maybe others:
>
>Unable to handle kernel paging request for instruction fetch
>Faulting instruction address: 0x00000000
>Oops: Kernel access of bad area, sig: 11 [#1]
>SMP NR_CPUS=2 P1020 RDB
>Modules linked in: ath9k ath9k_common pppoe ppp_async option iptable_nat ath9k_hw ath usb_wwan pppox ppp_generic
>nf_nat_ipv4 nf_conntrack_ipv4 mac80211 ipt_MASQUERADE cfg80211 xt_time xt_tcpudp xt_state xt_quota xt_policy xt_pkttype
>xt_owner xt_nat xt_multiport xt_mh
>CPU: 1 PID: 2161 Comm: ubiformat Not tainted 3.10.26 #6
>task: efbc2700 ti: c7950000 task.ti: c7950000
>NIP: 00000000 LR: c01a495c CTR: 00000000
>REGS: c7951cb0 TRAP: 0400   Not tainted  (3.10.26)
>MSR: 00029000 <CE,EE,ME>  CR: 24002028  XER: 00000000
>
>GPR00: c01a4b6c c7951d60 efbc2700 ef84b000 00000001 00000000 000001ff c7800500
>GPR08: 00000000 00000000 efae5e40 c01a4ae4 24002022 10023418 c7951e5c c7800500
>GPR16: c017b6a8 00000000 0000003f c053404c 00000000 00000004 00000000 00000003
>GPR24: 00000010 00000200 ef84b000 c7800d00 c7800000 c7800500 ef84b1c8 00000000
>NIP [00000000]   (null)
>LR [c01a495c] nand_write_subpage_hwecc+0x74/0x174
>Call Trace:
>[c7951d60] [c7951d64] 0xc7951d64 (unreliable)
>[c7951da0] [c01a4b6c] nand_write_page+0x88/0x198
>[c7951dd0] [c01a5f7c] nand_do_write_ops+0x2f4/0x39c
>[c7951e40] [c01a61e0] nand_write+0x58/0x84
>[c7951e80] [c019e29c] mtdchar_write+0x1dc/0x28c
>[c7951ef0] [c00aba84] vfs_write+0xcc/0x1ac
>[c7951f10] [c00ac04c] SyS_write+0x4c/0x90
>[c7951f40] [c000cd84] ret_from_syscall+0x0/0x3c
> --- Exception: c01 at 0x48050ed8
>     LR = 0x100071b8
> Instruction dump:
> XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
> XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
> ---[ end trace 161d3c65a2a15cb8 ]---
>
>Kernel panic - not syncing: Fatal exception
>
>Cc: Gupta, Pekon <pekon@ti.com>
>Cc: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
>Cc: David Woodhouse <David.Woodhouse@intel.com>
>Signed-off-by: Helmut Schaa <helmut.schaa@googlemail.com>
>---
>I noticed this bug on openwrt with kernel 3.10 but it looks as if this can
>still happen in more recent kernels.
>
> drivers/mtd/nand/nand_base.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
>diff --git a/drivers/mtd/nand/nand_base.c b/drivers/mtd/nand/nand_base.c
>index 9715a7b..2298289 100644
>--- a/drivers/mtd/nand/nand_base.c
>+++ b/drivers/mtd/nand/nand_base.c
>@@ -3768,7 +3768,7 @@ int nand_scan_tail(struct mtd_info *mtd)
> 			ecc->write_oob = nand_write_oob_std;
> 		if (!ecc->read_subpage)
> 			ecc->read_subpage = nand_read_subpage;
>-		if (!ecc->write_subpage)
>+		if (!ecc->write_subpage && ecc->hwctl && ecc->calculate)

I don't think this is correct because nand_write_subpage_hwecc() is a
replaceable function (default generic implementation). So

(1) if chip->ecc.hwctl() and chip->ecc.calculate are not implemented but you
 still want to use subpage write feature, then you need to provide custom
 implementation for chip->ecc.write_subpage().
 that's same for other interfaces of nand_chip like chip->ecc.write_page().

(2) If you don't want to use subpage write feature then just disable it using
	chip->options |= NAND_NO_SUBPAGE_WRITE;

Can you please tell which NAND controller driver is causing this ?
We need to fix that..

with regards, pekon
Helmut Schaa April 9, 2014, 10:06 a.m. UTC | #2
On Wed, Apr 9, 2014 at 11:38 AM, Gupta, Pekon <pekon@ti.com> wrote:
>>diff --git a/drivers/mtd/nand/nand_base.c b/drivers/mtd/nand/nand_base.c
>>index 9715a7b..2298289 100644
>>--- a/drivers/mtd/nand/nand_base.c
>>+++ b/drivers/mtd/nand/nand_base.c
>>@@ -3768,7 +3768,7 @@ int nand_scan_tail(struct mtd_info *mtd)
>>                       ecc->write_oob = nand_write_oob_std;
>>               if (!ecc->read_subpage)
>>                       ecc->read_subpage = nand_read_subpage;
>>-              if (!ecc->write_subpage)
>>+              if (!ecc->write_subpage && ecc->hwctl && ecc->calculate)
>
> I don't think this is correct because nand_write_subpage_hwecc() is a
> replaceable function (default generic implementation). So
>
> (1) if chip->ecc.hwctl() and chip->ecc.calculate are not implemented but you
>  still want to use subpage write feature, then you need to provide custom
>  implementation for chip->ecc.write_subpage().
>  that's same for other interfaces of nand_chip like chip->ecc.write_page().

But these don't cause panics :)

> (2) If you don't want to use subpage write feature then just disable it using
>         chip->options |= NAND_NO_SUBPAGE_WRITE;
>
> Can you please tell which NAND controller driver is causing this ?
> We need to fix that..

This happens with fsl_elbc_nand (while trying to run ubiformat on a
mtd dev) but the
crash was caused by the introduction of nand_write_subpage_hwecc. So, in this
case I think instead of trying to fix every possible driver we should
let the nand core
code handle this issue gracefully. Maybe we could add a WARN_ON_ONCE to
notice which drivers require adjustments.

Regards,
Helmut
Brian Norris Dec. 9, 2015, 11:58 p.m. UTC | #3
Hi,

I have my archaeology hat on, in the deep corners of my mailbox...

On Wed, Apr 09, 2014 at 11:13:24AM +0200, Helmut Schaa wrote:
> nand_write_subpage_hwecc causes a crash if the driver did not register
> ecc->hwctl or ecc->calculate. Fix this by disabling subpage writes if
> ecc->hwctl or ecc->calculate is not provided by the driver.
> 
> This behavior was introduced in commit 837a6ba4f3b6d23026674e6af6b6849a4634fff9
> "mtd: nand: subpage write support for hardware based ECC schemes".
> 
> This fixes a crash with fsl_elbc_nand and maybe others:
> 
[...]
> 
> Cc: Gupta, Pekon <pekon@ti.com>
> Cc: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
> Cc: David Woodhouse <David.Woodhouse@intel.com>
> Signed-off-by: Helmut Schaa <helmut.schaa@googlemail.com>
> ---
> I noticed this bug on openwrt with kernel 3.10 but it looks as if this can
> still happen in more recent kernels.
> 
>  drivers/mtd/nand/nand_base.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/mtd/nand/nand_base.c b/drivers/mtd/nand/nand_base.c
> index 9715a7b..2298289 100644
> --- a/drivers/mtd/nand/nand_base.c
> +++ b/drivers/mtd/nand/nand_base.c
> @@ -3768,7 +3768,7 @@ int nand_scan_tail(struct mtd_info *mtd)
>  			ecc->write_oob = nand_write_oob_std;
>  		if (!ecc->read_subpage)
>  			ecc->read_subpage = nand_read_subpage;
> -		if (!ecc->write_subpage)
> +		if (!ecc->write_subpage && ecc->hwctl && ecc->calculate)
>  			ecc->write_subpage = nand_write_subpage_hwecc;
>  
>  	case NAND_ECC_HW_SYNDROME:

I realize we've merged a fix for fsl_elbc_nand long ago (commit
f034d87def51 "mtd: eLBC NAND: fix subpage write support"), but this
change still looks sane, applies, and could possibly fix some other
drivers that are lurking somewhere. Any reason I shouldn't apply it?

Regards,
Brian
Helmut Schaa Dec. 10, 2015, 9:31 a.m. UTC | #4
On Thu, Dec 10, 2015 at 12:58 AM, Brian Norris
<computersforpeace@gmail.com> wrote:
> Hi,
>
> I have my archaeology hat on, in the deep corners of my mailbox...
>
> On Wed, Apr 09, 2014 at 11:13:24AM +0200, Helmut Schaa wrote:
>> nand_write_subpage_hwecc causes a crash if the driver did not register
>> ecc->hwctl or ecc->calculate. Fix this by disabling subpage writes if
>> ecc->hwctl or ecc->calculate is not provided by the driver.
>>
>> This behavior was introduced in commit 837a6ba4f3b6d23026674e6af6b6849a4634fff9
>> "mtd: nand: subpage write support for hardware based ECC schemes".
>>
>> This fixes a crash with fsl_elbc_nand and maybe others:
>>
> [...]
>>
>> Cc: Gupta, Pekon <pekon@ti.com>
>> Cc: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
>> Cc: David Woodhouse <David.Woodhouse@intel.com>
>> Signed-off-by: Helmut Schaa <helmut.schaa@googlemail.com>
>> ---
>> I noticed this bug on openwrt with kernel 3.10 but it looks as if this can
>> still happen in more recent kernels.
>>
>>  drivers/mtd/nand/nand_base.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/mtd/nand/nand_base.c b/drivers/mtd/nand/nand_base.c
>> index 9715a7b..2298289 100644
>> --- a/drivers/mtd/nand/nand_base.c
>> +++ b/drivers/mtd/nand/nand_base.c
>> @@ -3768,7 +3768,7 @@ int nand_scan_tail(struct mtd_info *mtd)
>>                       ecc->write_oob = nand_write_oob_std;
>>               if (!ecc->read_subpage)
>>                       ecc->read_subpage = nand_read_subpage;
>> -             if (!ecc->write_subpage)
>> +             if (!ecc->write_subpage && ecc->hwctl && ecc->calculate)
>>                       ecc->write_subpage = nand_write_subpage_hwecc;
>>
>>       case NAND_ECC_HW_SYNDROME:
>
> I realize we've merged a fix for fsl_elbc_nand long ago (commit
> f034d87def51 "mtd: eLBC NAND: fix subpage write support"), but this
> change still looks sane, applies, and could possibly fix some other
> drivers that are lurking somewhere. Any reason I shouldn't apply it?

I'm not carrying this patch in my tree anymore and did not see any more
crashes with the flash configuration I'm using. So, I think the original patch
is kind of superfluous.

Regards,
Helmut
Brian Norris Dec. 12, 2015, 12:40 a.m. UTC | #5
On Thu, Dec 10, 2015 at 10:31:32AM +0100, Helmut Schaa wrote:
> On Thu, Dec 10, 2015 at 12:58 AM, Brian Norris
> <computersforpeace@gmail.com> wrote:
> > On Wed, Apr 09, 2014 at 11:13:24AM +0200, Helmut Schaa wrote:
> >> nand_write_subpage_hwecc causes a crash if the driver did not register
> >> ecc->hwctl or ecc->calculate. Fix this by disabling subpage writes if
> >> ecc->hwctl or ecc->calculate is not provided by the driver.
> >>
> >> This behavior was introduced in commit 837a6ba4f3b6d23026674e6af6b6849a4634fff9
> >> "mtd: nand: subpage write support for hardware based ECC schemes".
> >>
> >> This fixes a crash with fsl_elbc_nand and maybe others:
> >>
> > [...]
> >>
> >> --- a/drivers/mtd/nand/nand_base.c
> >> +++ b/drivers/mtd/nand/nand_base.c
> >> @@ -3768,7 +3768,7 @@ int nand_scan_tail(struct mtd_info *mtd)
> >>                       ecc->write_oob = nand_write_oob_std;
> >>               if (!ecc->read_subpage)
> >>                       ecc->read_subpage = nand_read_subpage;
> >> -             if (!ecc->write_subpage)
> >> +             if (!ecc->write_subpage && ecc->hwctl && ecc->calculate)
> >>                       ecc->write_subpage = nand_write_subpage_hwecc;
> >>
> >>       case NAND_ECC_HW_SYNDROME:
> >
> > I realize we've merged a fix for fsl_elbc_nand long ago (commit
> > f034d87def51 "mtd: eLBC NAND: fix subpage write support"), but this
> > change still looks sane, applies, and could possibly fix some other
> > drivers that are lurking somewhere. Any reason I shouldn't apply it?
> 
> I'm not carrying this patch in my tree anymore and did not see any more
> crashes with the flash configuration I'm using. So, I think the original patch
> is kind of superfluous.

There could easily be other drivers that don't have hwctl() or
calculate() callbacks yet use NAND_ECC_HW. By inspection, I see that
maybe the new hisi504_nand.c is susceptible, as well as sh_flctl.c and
maybe even sunxi_nand.c (?). Some of these might not be hit if they
don't use NAND that support subpage writes.

Anyway, I think this patch is helpful, because it prevents drivers from
having to fill in a 'subpage' write callback that looks just like their
write_page() callback.

Brian
Boris Brezillon Dec. 12, 2015, 7:59 a.m. UTC | #6
Hi Brian,

On Fri, 11 Dec 2015 16:40:25 -0800
Brian Norris <computersforpeace@gmail.com> wrote:

> On Thu, Dec 10, 2015 at 10:31:32AM +0100, Helmut Schaa wrote:
> > On Thu, Dec 10, 2015 at 12:58 AM, Brian Norris
> > <computersforpeace@gmail.com> wrote:
> > > On Wed, Apr 09, 2014 at 11:13:24AM +0200, Helmut Schaa wrote:
> > >> nand_write_subpage_hwecc causes a crash if the driver did not register
> > >> ecc->hwctl or ecc->calculate. Fix this by disabling subpage writes if
> > >> ecc->hwctl or ecc->calculate is not provided by the driver.
> > >>
> > >> This behavior was introduced in commit 837a6ba4f3b6d23026674e6af6b6849a4634fff9
> > >> "mtd: nand: subpage write support for hardware based ECC schemes".
> > >>
> > >> This fixes a crash with fsl_elbc_nand and maybe others:
> > >>
> > > [...]
> > >>
> > >> --- a/drivers/mtd/nand/nand_base.c
> > >> +++ b/drivers/mtd/nand/nand_base.c
> > >> @@ -3768,7 +3768,7 @@ int nand_scan_tail(struct mtd_info *mtd)
> > >>                       ecc->write_oob = nand_write_oob_std;
> > >>               if (!ecc->read_subpage)
> > >>                       ecc->read_subpage = nand_read_subpage;

Shouldn't we have the same kind of test for ->read_subpage.

		if (!ecc->read_subpage && ecc->correct &&
		    ecc->calculate)
			ecc->read_subpage = nand_read_subpage;


> > >> -             if (!ecc->write_subpage)
> > >> +             if (!ecc->write_subpage && ecc->hwctl && ecc->calculate)
> > >>                       ecc->write_subpage = nand_write_subpage_hwecc;
> > >>
> > >>       case NAND_ECC_HW_SYNDROME:
> > >
> > > I realize we've merged a fix for fsl_elbc_nand long ago (commit
> > > f034d87def51 "mtd: eLBC NAND: fix subpage write support"), but this
> > > change still looks sane, applies, and could possibly fix some other
> > > drivers that are lurking somewhere. Any reason I shouldn't apply it?
> > 
> > I'm not carrying this patch in my tree anymore and did not see any more
> > crashes with the flash configuration I'm using. So, I think the original patch
> > is kind of superfluous.
> 
> There could easily be other drivers that don't have hwctl() or
> calculate() callbacks yet use NAND_ECC_HW. By inspection, I see that
> maybe the new hisi504_nand.c is susceptible, as well as sh_flctl.c and
> maybe even sunxi_nand.c (?). Some of these might not be hit if they
> don't use NAND that support subpage writes.

Yep, sunxi platforms could be hit by this bug, but it seems nobody
decided to design a board with an SLC NAND (and MLCs are not supporting
subpage writes).

> 
> Anyway, I think this patch is helpful, because it prevents drivers from
> having to fill in a 'subpage' write callback that looks just like their
> write_page() callback.

I agree.

Best Regards,

Boris
Helmut Schaa Dec. 14, 2015, 2:04 p.m. UTC | #7
On Sat, Dec 12, 2015 at 8:59 AM, Boris Brezillon
<boris.brezillon@free-electrons.com> wrote:
> Hi Brian,
>
> On Fri, 11 Dec 2015 16:40:25 -0800
> Brian Norris <computersforpeace@gmail.com> wrote:
>
>> On Thu, Dec 10, 2015 at 10:31:32AM +0100, Helmut Schaa wrote:
>> > On Thu, Dec 10, 2015 at 12:58 AM, Brian Norris
>> > <computersforpeace@gmail.com> wrote:
>> > > On Wed, Apr 09, 2014 at 11:13:24AM +0200, Helmut Schaa wrote:
>> > >> nand_write_subpage_hwecc causes a crash if the driver did not register
>> > >> ecc->hwctl or ecc->calculate. Fix this by disabling subpage writes if
>> > >> ecc->hwctl or ecc->calculate is not provided by the driver.
>> > >>
>> > >> This behavior was introduced in commit 837a6ba4f3b6d23026674e6af6b6849a4634fff9
>> > >> "mtd: nand: subpage write support for hardware based ECC schemes".
>> > >>
>> > >> This fixes a crash with fsl_elbc_nand and maybe others:
>> > >>
>> > > [...]
>> > >>
>> > >> --- a/drivers/mtd/nand/nand_base.c
>> > >> +++ b/drivers/mtd/nand/nand_base.c
>> > >> @@ -3768,7 +3768,7 @@ int nand_scan_tail(struct mtd_info *mtd)
>> > >>                       ecc->write_oob = nand_write_oob_std;
>> > >>               if (!ecc->read_subpage)
>> > >>                       ecc->read_subpage = nand_read_subpage;
>
> Shouldn't we have the same kind of test for ->read_subpage.
>
>                 if (!ecc->read_subpage && ecc->correct &&
>                     ecc->calculate)
>                         ecc->read_subpage = nand_read_subpage;
>
>
>> > >> -             if (!ecc->write_subpage)
>> > >> +             if (!ecc->write_subpage && ecc->hwctl && ecc->calculate)
>> > >>                       ecc->write_subpage = nand_write_subpage_hwecc;
>> > >>
>> > >>       case NAND_ECC_HW_SYNDROME:
>> > >
>> > > I realize we've merged a fix for fsl_elbc_nand long ago (commit
>> > > f034d87def51 "mtd: eLBC NAND: fix subpage write support"), but this
>> > > change still looks sane, applies, and could possibly fix some other
>> > > drivers that are lurking somewhere. Any reason I shouldn't apply it?
>> >
>> > I'm not carrying this patch in my tree anymore and did not see any more
>> > crashes with the flash configuration I'm using. So, I think the original patch
>> > is kind of superfluous.
>>
>> There could easily be other drivers that don't have hwctl() or
>> calculate() callbacks yet use NAND_ECC_HW. By inspection, I see that
>> maybe the new hisi504_nand.c is susceptible, as well as sh_flctl.c and
>> maybe even sunxi_nand.c (?). Some of these might not be hit if they
>> don't use NAND that support subpage writes.
>
> Yep, sunxi platforms could be hit by this bug, but it seems nobody
> decided to design a board with an SLC NAND (and MLCs are not supporting
> subpage writes).
>
>>
>> Anyway, I think this patch is helpful, because it prevents drivers from
>> having to fill in a 'subpage' write callback that looks just like their
>> write_page() callback.
>
> I agree.

Fine with me then.
Helmut
Brian Norris Dec. 19, 2015, 2:19 a.m. UTC | #8
On Sat, Dec 12, 2015 at 08:59:00AM +0100, Boris Brezillon wrote:
> On Fri, 11 Dec 2015 16:40:25 -0800
> Brian Norris <computersforpeace@gmail.com> wrote:
> 
> > On Thu, Dec 10, 2015 at 10:31:32AM +0100, Helmut Schaa wrote:
> > > On Thu, Dec 10, 2015 at 12:58 AM, Brian Norris
> > > <computersforpeace@gmail.com> wrote:
> > > > On Wed, Apr 09, 2014 at 11:13:24AM +0200, Helmut Schaa wrote:
> > > >> nand_write_subpage_hwecc causes a crash if the driver did not register
> > > >> ecc->hwctl or ecc->calculate. Fix this by disabling subpage writes if
> > > >> ecc->hwctl or ecc->calculate is not provided by the driver.
> > > >>
> > > >> This behavior was introduced in commit 837a6ba4f3b6d23026674e6af6b6849a4634fff9
> > > >> "mtd: nand: subpage write support for hardware based ECC schemes".
> > > >>
> > > >> This fixes a crash with fsl_elbc_nand and maybe others:
> > > >>
> > > > [...]
> > > >>
> > > >> --- a/drivers/mtd/nand/nand_base.c
> > > >> +++ b/drivers/mtd/nand/nand_base.c
> > > >> @@ -3768,7 +3768,7 @@ int nand_scan_tail(struct mtd_info *mtd)
> > > >>                       ecc->write_oob = nand_write_oob_std;
> > > >>               if (!ecc->read_subpage)
> > > >>                       ecc->read_subpage = nand_read_subpage;
> 
> Shouldn't we have the same kind of test for ->read_subpage.
> 
> 		if (!ecc->read_subpage && ecc->correct &&
> 		    ecc->calculate)
> 			ecc->read_subpage = nand_read_subpage;

We could do that, though right now we have a little less danger for
read_subpage() than we do with write_subpage(), because drivers have to
opt in with the NAND_SUBPAGE_READ flag before this function will
actually be used. Feel free to send a patch if you'd like.

> > > >> -             if (!ecc->write_subpage)
> > > >> +             if (!ecc->write_subpage && ecc->hwctl && ecc->calculate)
> > > >>                       ecc->write_subpage = nand_write_subpage_hwecc;
> > > >>
> > > >>       case NAND_ECC_HW_SYNDROME:
> > > >
> > > > I realize we've merged a fix for fsl_elbc_nand long ago (commit
> > > > f034d87def51 "mtd: eLBC NAND: fix subpage write support"), but this
> > > > change still looks sane, applies, and could possibly fix some other
> > > > drivers that are lurking somewhere. Any reason I shouldn't apply it?
> > > 
> > > I'm not carrying this patch in my tree anymore and did not see any more
> > > crashes with the flash configuration I'm using. So, I think the original patch
> > > is kind of superfluous.
> > 
> > There could easily be other drivers that don't have hwctl() or
> > calculate() callbacks yet use NAND_ECC_HW. By inspection, I see that
> > maybe the new hisi504_nand.c is susceptible, as well as sh_flctl.c and
> > maybe even sunxi_nand.c (?). Some of these might not be hit if they
> > don't use NAND that support subpage writes.
> 
> Yep, sunxi platforms could be hit by this bug, but it seems nobody
> decided to design a board with an SLC NAND (and MLCs are not supporting
> subpage writes).
> 
> > 
> > Anyway, I think this patch is helpful, because it prevents drivers from
> > having to fill in a 'subpage' write callback that looks just like their
> > write_page() callback.
> 
> I agree.

OK. Applied to l2-mtd.git, with additional commentary about how the
reported issue is already fixed, but there could be other drivers who
run across this problem.

Brian
diff mbox

Patch

diff --git a/drivers/mtd/nand/nand_base.c b/drivers/mtd/nand/nand_base.c
index 9715a7b..2298289 100644
--- a/drivers/mtd/nand/nand_base.c
+++ b/drivers/mtd/nand/nand_base.c
@@ -3768,7 +3768,7 @@  int nand_scan_tail(struct mtd_info *mtd)
 			ecc->write_oob = nand_write_oob_std;
 		if (!ecc->read_subpage)
 			ecc->read_subpage = nand_read_subpage;
-		if (!ecc->write_subpage)
+		if (!ecc->write_subpage && ecc->hwctl && ecc->calculate)
 			ecc->write_subpage = nand_write_subpage_hwecc;
 
 	case NAND_ECC_HW_SYNDROME: