[PULL,v2] mtd: nand: Changes for 4.13

Message ID 20170703134609.0ea9cd63@bbrezillon
State Not Applicable
Headers show

Pull-request

ssh://git.infradead.org/srv/git/l2-mtd.git tags/nand/for-4.13

Message

Boris Brezillon July 3, 2017, 11:46 a.m.
Hi,

Here's my PR for 4.13. This v2 includes Dan's fix ("mtd: nand: mtk:
release lock on error path").

As usual, let me know if you see any problem.

Thanks,

Boris

The following changes since commit 2ea659a9ef488125eb46da6eb571de5eae5c43f6:

  Linux 4.12-rc1 (2017-05-13 13:19:49 -0700)

are available in the git repository at:

  ssh://git.infradead.org/srv/git/l2-mtd.git tags/nand/for-4.13

for you to fetch changes up to 81667e9c8ad827365f2d1e8b924caf062a19c593:

  mtd: nand: mtk: release lock on error path (2017-07-03 13:39:09 +0200)

----------------------------------------------------------------
This pull request contains the following core changes:

* addition of on-ecc support to Micron driver
* addition of helpers to help drivers choose most appropriate ECC
  settings
* deletion of dead-code (cached programming and ->errstat() hook)
* make sure drivers that do not support the SET/GET FEATURES command
  return ENOTSUPP use a dummy ->set/get_features implementation
  returning -ENOTSUPP (required for Micron on-die ECC)
* change the semantic of ecc->write_page() for drivers setting the
  NAND_ECC_CUSTOM_PAGE_ACCESS flag
* support exiting 'GET STATUS' command in default ->cmdfunc()
  implementations
* change the prototype of ->setup_data_interface()

A bunch of driver related changes:

* various cleanup, fixes and improvements of the MTK driver
* OMAP DT bindings fixes
* support for ->setup_data_interface() in the fsmc driver
* support for imx7 in the gpmi driver
* finalization of the denali driver rework (thanks to Masahiro for the
  work he's done on this driver)
* fix "bitflips in erased pages" handling in the ifc driver
* addition of PM ops and dynamic timing configuration to the atmel
  driver

And as usual we also have a few minor cleanup/fixes/improvements
patches across the subsystem.

----------------------------------------------------------------
Alexander Couzens (1):
      mtd: nand: davinci: set ECC algorithm explicitly for HW based ECC

Alexandre Belloni (1):
      mtd: nand: atmel: drop unused include

Arnd Bergmann (1):
      mtd: nand: atmel: mark resume function __maybe_unused

Arvind Yadav (1):
      mtd: nand: orion: Handle return value of clk_prepare_enable

Boris Brezillon (15):
      mtd: nand: jz4780: Use mtd_set_ooblayout() to set the ooblayout
      mtd: nand: Make sure drivers not supporting SET/GET_FEATURES return -ENOTSUPP
      mtd: nand: gpmi: Fix gpmi_nand_init() error path
      mtd: nand: gpmi: Kill gpmi_nand_exit()
      mtd: nand: Pass the CS line to ->setup_data_interface()
      mtd: nand: atmel: Add ->setup_data_interface() hooks
      mtd: nand: atmel: Add PM ops
      mtd: nand: Drop unused cached programming support
      mtd: nand: Drop the ->errstat() hook
      mtd: nand: sunxi: Actually use DMA for subpage reads
      mtd: nand: sunxi: Remove unneeded ->cmdfunc(NAND_CMD_READ0, 0, page)
      mtd: nand: tango: Fix incorrect use of SEQIN command
      mtd: nand: Wait for PAGEPROG to finish in drivers setting NAND_ECC_CUSTOM_PAGE_ACCESS
      mtd: nand: Support 'EXIT GET STATUS' command in nand_command[_lp]()
      mtd: nand: fsl_ifc: fix handing of bit flips in erased pages

Dan Carpenter (1):
      mtd: nand: mtk: release lock on error path

Ezequiel Garcia (2):
      mtd: nand: Add Hisilicon machine dependency
      mtd: nand: Add Mediatek machine dependency

Masahiro Yamada (25):
      mtd: nand: check ecc->total sanity in nand_scan_tail
      mtd: nand: denali_dt: clean up resource ioremap
      mtd: nand: denali: use BIT() and GENMASK() for register macros
      mtd: nand: add generic helpers to check, match, maximize ECC settings
      mtd: nand: add a shorthand to generate nand_ecc_caps structure
      mtd: nand: denali: avoid hard-coding ECC step, strength, bytes
      mtd: nand: denali: remove Toshiba and Hynix specific fixup code
      mtd: nand: denali_dt: add compatible strings for UniPhier SoC variants
      mtd: nand: denali: set NAND_ECC_CUSTOM_PAGE_ACCESS
      mtd: nand: denali: remove unneeded find_valid_banks()
      mtd: nand: denali: handle timing parameters by setup_data_interface()
      mtd: nand: denali: rework interrupt handling
      mtd: nand: denali: switch over to cmd_ctrl instead of cmdfunc
      mtd: nand: denali: fix bank reset function to detect the number of chips
      mtd: nand: denali: use interrupt instead of polling for bank reset
      mtd: nand: denali: propagate page to helpers via function argument
      mtd: nand: denali: merge struct nand_buf into struct denali_nand_info
      mtd: nand: denali: use flag instead of register macro for direction
      mtd: nand: denali: fix raw and oob accessors for syndrome page layout
      mtd: nand: denali: support hardware-assisted erased page detection
      mtd: nand: denali: skip driver internal bounce buffer when possible
      mtd: nand: denali: use non-managed kmalloc() for DMA buffer
      mtd: nand: denali: enable bad block table scan
      mtd: nand: denali: avoid magic numbers and rename for clarification
      MAINTAINERS: add entry for Denali NAND controller driver

Matthias Lange (2):
      mtd: nand: gpmi: Fix typo in data structure name
      mtd: nand: gpmi: fix typo in comment

Pavel Machek (1):
      mtd: nand: Optimize checking of erased buffers

Prabhakar Kushwaha (1):
      mtd: nand: ifc: Initialize SRAM for all version >= 1.0

Stefan Agner (3):
      mtd: nand: gpmi: unify clock handling
      mtd: nand: gpmi: add i.MX 7 SoC support
      mtd: gpmi: document current clock requirements

Thomas Petazzoni (8):
      mtd: nand: fsmc: reduce number of arguments of fsmc_nand_setup()
      mtd: nand: fsmc: add support for SDR timings
      mtd: nand: fsmc: remove default timings
      dt-bindings: mtd: document new "on-die" nand-ecc-mode
      mtd: nand: add core support for on-die ECC
      mtd: nand: export nand_{read,write}_page_raw()
      mtd: nand: add support for Micron on-die ECC
      mtd: nand: fsmc_nand: handle on-die ECC case

Tom Rini (2):
      dt-bindings: mtd: elm: Correct compatible string requirement
      dt-bindings: gpmc: Correct location of generic gpmc binding

Xiaolei Li (9):
      mtd: nand: mediatek: update DT bindings
      mtd: nand: mediatek: refine register NFI_PAGEFMT setting
      mtd: nand: mediatek: add support for different MTK NAND FLASH Controller IP
      mtd: nand: mediatek: add support for MT2712 NAND FLASH Controller
      mtd: nand: mtk: fix incorrect register setting order about ecc irq
      mtd: nand: mtk: disable ecc irq when writing page with hwecc
      mtd: nand: mtk: remove unneeded mtk_nfc_hw_init from mtk_nfc_resume
      mtd: nand: mtk: remove unneeded mtk_ecc_hw_init from mtk_ecc_resume
      mtd: nand: mtk: add ->setup_data_interface() hook

 Documentation/devicetree/bindings/mtd/denali-nand.txt  |   13 +
 Documentation/devicetree/bindings/mtd/elm.txt          |    2 +-
 Documentation/devicetree/bindings/mtd/gpmc-nand.txt    |    2 +-
 Documentation/devicetree/bindings/mtd/gpmc-nor.txt     |    4 +-
 Documentation/devicetree/bindings/mtd/gpmc-onenand.txt |    2 +-
 Documentation/devicetree/bindings/mtd/gpmi-nand.txt    |   14 +-
 Documentation/devicetree/bindings/mtd/mtk-nand.txt     |    5 +-
 Documentation/devicetree/bindings/mtd/nand.txt         |    2 +-
 Documentation/devicetree/bindings/net/gpmc-eth.txt     |    4 +-
 MAINTAINERS                                            |    6 +
 drivers/mtd/nand/Kconfig                               |    3 +
 drivers/mtd/nand/atmel/nand-controller.c               |  354 ++++++++++-
 drivers/mtd/nand/bcm47xxnflash/ops_bcm4706.c           |    2 +
 drivers/mtd/nand/cafe_nand.c                           |    2 +
 drivers/mtd/nand/davinci_nand.c                        |    3 +
 drivers/mtd/nand/denali.c                              | 1831 ++++++++++++++++++++++++-----------------------------
 drivers/mtd/nand/denali.h                              |  315 +++++----
 drivers/mtd/nand/denali_dt.c                           |   53 +-
 drivers/mtd/nand/denali_pci.c                          |   26 +-
 drivers/mtd/nand/docg4.c                               |    2 +
 drivers/mtd/nand/fsl_elbc_nand.c                       |    2 +
 drivers/mtd/nand/fsl_ifc_nand.c                        |   81 +--
 drivers/mtd/nand/fsmc_nand.c                           |  122 +++-
 drivers/mtd/nand/gpmi-nand/gpmi-lib.c                  |    6 +-
 drivers/mtd/nand/gpmi-nand/gpmi-nand.c                 |   75 ++-
 drivers/mtd/nand/gpmi-nand/gpmi-nand.h                 |   13 +-
 drivers/mtd/nand/hisi504_nand.c                        |    2 +
 drivers/mtd/nand/jz4780_nand.c                         |    2 +-
 drivers/mtd/nand/mpc5121_nfc.c                         |    2 +
 drivers/mtd/nand/mtk_ecc.c                             |  228 +++----
 drivers/mtd/nand/mtk_ecc.h                             |    2 +-
 drivers/mtd/nand/mtk_nand.c                            |  279 ++++----
 drivers/mtd/nand/mxc_nand.c                            |   12 +-
 drivers/mtd/nand/nand_base.c                           |  349 ++++++++--
 drivers/mtd/nand/nand_micron.c                         |  222 +++++++
 drivers/mtd/nand/orion_nand.c                          |    6 +-
 drivers/mtd/nand/pxa3xx_nand.c                         |    2 +
 drivers/mtd/nand/qcom_nandc.c                          |    2 +
 drivers/mtd/nand/s3c2410.c                             |    5 +-
 drivers/mtd/nand/sh_flctl.c                            |    2 +
 drivers/mtd/nand/sunxi_nand.c                          |    9 +-
 drivers/mtd/nand/tango_nand.c                          |   22 +-
 drivers/mtd/nand/vf610_nfc.c                           |    2 +
 drivers/staging/mt29f_spinand/mt29f_spinand.c          |    2 +
 include/linux/mtd/nand.h                               |   80 ++-
 45 files changed, 2546 insertions(+), 1628 deletions(-)

Comments

Brian Norris July 12, 2017, 1:01 a.m. | #1
+ Thomas, as I don't really want to dig up the original patch to review
right now

On Mon, Jul 03, 2017 at 01:46:09PM +0200, Boris Brezillon wrote:
> Hi,
> 
> Here's my PR for 4.13. This v2 includes Dan's fix ("mtd: nand: mtk:
> release lock on error path").
> 
> As usual, let me know if you see any problem.
> 
> Thanks,
> 
> Boris
> 
> The following changes since commit 2ea659a9ef488125eb46da6eb571de5eae5c43f6:
> 
>   Linux 4.12-rc1 (2017-05-13 13:19:49 -0700)
> 
> are available in the git repository at:
> 
>   ssh://git.infradead.org/srv/git/l2-mtd.git tags/nand/for-4.13

I happen to have SSH access to that repository :) but a
publicly-readable link is usually a better idea, in the highly unlikely
case that someone else wants to review your pull request.

> for you to fetch changes up to 81667e9c8ad827365f2d1e8b924caf062a19c593:
> 
>   mtd: nand: mtk: release lock on error path (2017-07-03 13:39:09 +0200)
> 
> ----------------------------------------------------------------
> This pull request contains the following core changes:
> 
> * addition of on-ecc support to Micron driver

This still works by overriding chip->ecc.{read,write}_page{,_raw}()
callbacks... I never liked that, and there have been multiple
submissions that tried this already, IIRC. It's for sure not going to
work with at least one in-tree driver (brcmnand); I suspect others
(qcom_nand? mtk_nand?). Can this be improved to either bail out when
this clobbers a controller's existing callback, or even better, to
utilize a controller's existing _raw() implementation, instead of
assuming it can use the nand_base defaults?

The latter seems difficult to do in time for this merge, but maybe the
former can be done within the release?

I'm still tempted to pull this, since at least it does require somebody
to opt in via the device tree binding. So they should probably make sure
it works with their driver before using it.

> * addition of helpers to help drivers choose most appropriate ECC
>   settings
> * deletion of dead-code (cached programming and ->errstat() hook)
> * make sure drivers that do not support the SET/GET FEATURES command
>   return ENOTSUPP use a dummy ->set/get_features implementation
>   returning -ENOTSUPP (required for Micron on-die ECC)
> * change the semantic of ecc->write_page() for drivers setting the
>   NAND_ECC_CUSTOM_PAGE_ACCESS flag
> * support exiting 'GET STATUS' command in default ->cmdfunc()
>   implementations
> * change the prototype of ->setup_data_interface()
> 
> A bunch of driver related changes:
> 
> * various cleanup, fixes and improvements of the MTK driver
> * OMAP DT bindings fixes
> * support for ->setup_data_interface() in the fsmc driver
> * support for imx7 in the gpmi driver
> * finalization of the denali driver rework (thanks to Masahiro for the
>   work he's done on this driver)
> * fix "bitflips in erased pages" handling in the ifc driver
> * addition of PM ops and dynamic timing configuration to the atmel
>   driver
> 
> And as usual we also have a few minor cleanup/fixes/improvements
> patches across the subsystem.
> 

Brian
Boris Brezillon July 12, 2017, 7:19 a.m. | #2
Le Tue, 11 Jul 2017 18:01:15 -0700,
Brian Norris <computersforpeace@gmail.com> a écrit :

> + Thomas, as I don't really want to dig up the original patch to review
> right now
> 
> On Mon, Jul 03, 2017 at 01:46:09PM +0200, Boris Brezillon wrote:
> > Hi,
> > 
> > Here's my PR for 4.13. This v2 includes Dan's fix ("mtd: nand: mtk:
> > release lock on error path").
> > 
> > As usual, let me know if you see any problem.
> > 
> > Thanks,
> > 
> > Boris
> > 
> > The following changes since commit 2ea659a9ef488125eb46da6eb571de5eae5c43f6:
> > 
> >   Linux 4.12-rc1 (2017-05-13 13:19:49 -0700)
> > 
> > are available in the git repository at:
> > 
> >   ssh://git.infradead.org/srv/git/l2-mtd.git tags/nand/for-4.13  
> 
> I happen to have SSH access to that repository :) but a
> publicly-readable link is usually a better idea, in the highly unlikely
> case that someone else wants to review your pull request.

My bad. I have 2 remotes for l2-mtd (l2-mtd and l2-mtd-pub), just passed
the wrong one to 'git request-pull' :-/.

> 
> > for you to fetch changes up to 81667e9c8ad827365f2d1e8b924caf062a19c593:
> > 
> >   mtd: nand: mtk: release lock on error path (2017-07-03 13:39:09 +0200)
> > 
> > ----------------------------------------------------------------
> > This pull request contains the following core changes:
> > 
> > * addition of on-ecc support to Micron driver  
> 
> This still works by overriding chip->ecc.{read,write}_page{,_raw}()
> callbacks... I never liked that, and there have been multiple
> submissions that tried this already, IIRC. It's for sure not going to
> work with at least one in-tree driver (brcmnand); I suspect others
> (qcom_nand? mtk_nand?).

Yes, I know it doesn't work with all drivers, but given that those
drivers are already not supporting ECC_NONE or ECC_SW, I don't think
it's a real program. 

> Can this be improved to either bail out when
> this clobbers a controller's existing callback, or even better, to
> utilize a controller's existing _raw() implementation, instead of
> assuming it can use the nand_base defaults?

IMO, if there's something to fix it's the NAND controller drivers that
are only partially complying with core expectations. I mean,
'->cmdfunc(READ0) + ->read_buf()' is a valid sequence and should work
with all NAND controllers, but for some (good?) reasons, some drivers
are taking liberties with this semantic.

Note that we are working on improving the situation with the
->exec_op() approach [4], which should provide NAND controller drivers
with all the information they need to execute a NAND operation
(CMD, ADDR, DATA, WAIT cycles), which, AFAIU was one blocking
aspect with '->cmdfunc()/->cmd_ctrl() + ->read/write_buf/byte/word()'. 

It will also let drivers return -ENOTSUPP to report when they simply
don't support a specific NAND operation (for example, when the
CMD+ADDR+DATA+WAIT sequence is not supported).

> 
> The latter seems difficult to do in time for this merge, but maybe the
> former can be done within the release?
> 
> I'm still tempted to pull this, since at least it does require somebody
> to opt in via the device tree binding. So they should probably make sure
> it works with their driver before using it.

There really no risk with this new feature because:

1/ the option has to be opted-in by the NAND controller driver (or
   through a DT property)
2/ even if it's enabled by mistake on one of these controllers, they
   either check ecc->mode [1][2] or simply override chip->ecc content
   unconditionally [3]

If one NAND controller driver is not doing #2, the driver should be
fixed not the on-die ECC logic.

> 
> > * addition of helpers to help drivers choose most appropriate ECC
> >   settings
> > * deletion of dead-code (cached programming and ->errstat() hook)
> > * make sure drivers that do not support the SET/GET FEATURES command
> >   return ENOTSUPP use a dummy ->set/get_features implementation
> >   returning -ENOTSUPP (required for Micron on-die ECC)
> > * change the semantic of ecc->write_page() for drivers setting the
> >   NAND_ECC_CUSTOM_PAGE_ACCESS flag
> > * support exiting 'GET STATUS' command in default ->cmdfunc()
> >   implementations
> > * change the prototype of ->setup_data_interface()
> > 
> > A bunch of driver related changes:
> > 
> > * various cleanup, fixes and improvements of the MTK driver
> > * OMAP DT bindings fixes
> > * support for ->setup_data_interface() in the fsmc driver
> > * support for imx7 in the gpmi driver
> > * finalization of the denali driver rework (thanks to Masahiro for the
> >   work he's done on this driver)
> > * fix "bitflips in erased pages" handling in the ifc driver
> > * addition of PM ops and dynamic timing configuration to the atmel
> >   driver
> > 
> > And as usual we also have a few minor cleanup/fixes/improvements
> > patches across the subsystem.
> >   
> 
> Brian

[1]http://elixir.free-electrons.com/linux/latest/source/drivers/mtd/nand/brcmnand/brcmnand.c#L2121
[2]http://elixir.free-electrons.com/linux/latest/source/drivers/mtd/nand/mtk_nand.c#L1171
[3]http://elixir.free-electrons.com/linux/latest/source/drivers/mtd/nand/qcom_nandc.c#L1840
[4]https://github.com/linux-nand/linux/commits/nand/exec_op
Brian Norris July 13, 2017, 1:44 a.m. | #3
Hi,

On Wed, Jul 12, 2017 at 09:19:19AM +0200, Boris Brezillon wrote:
> Le Tue, 11 Jul 2017 18:01:15 -0700,
> Brian Norris <computersforpeace@gmail.com> a écrit :
> 
> > + Thomas, as I don't really want to dig up the original patch to review
> > right now
> > 
> > On Mon, Jul 03, 2017 at 01:46:09PM +0200, Boris Brezillon wrote:
> > > Hi,
> > > 
> > > Here's my PR for 4.13. This v2 includes Dan's fix ("mtd: nand: mtk:
> > > release lock on error path").
> > > 
> > > As usual, let me know if you see any problem.
> > > 
> > > Thanks,
> > > 
> > > Boris
> > > 
> > > The following changes since commit 2ea659a9ef488125eb46da6eb571de5eae5c43f6:
> > > 
> > >   Linux 4.12-rc1 (2017-05-13 13:19:49 -0700)
> > > 
> > > are available in the git repository at:
> > > 
> > >   ssh://git.infradead.org/srv/git/l2-mtd.git tags/nand/for-4.13  
> > 
> > I happen to have SSH access to that repository :) but a
> > publicly-readable link is usually a better idea, in the highly unlikely
> > case that someone else wants to review your pull request.
> 
> My bad. I have 2 remotes for l2-mtd (l2-mtd and l2-mtd-pub), just passed
> the wrong one to 'git request-pull' :-/.
> 
> > 
> > > for you to fetch changes up to 81667e9c8ad827365f2d1e8b924caf062a19c593:
> > > 
> > >   mtd: nand: mtk: release lock on error path (2017-07-03 13:39:09 +0200)
> > > 
> > > ----------------------------------------------------------------
> > > This pull request contains the following core changes:
> > > 
> > > * addition of on-ecc support to Micron driver  
> > 
> > This still works by overriding chip->ecc.{read,write}_page{,_raw}()
> > callbacks... I never liked that, and there have been multiple
> > submissions that tried this already, IIRC. It's for sure not going to
> > work with at least one in-tree driver (brcmnand); I suspect others
> > (qcom_nand? mtk_nand?).
> 
> Yes, I know it doesn't work with all drivers, but given that those
> drivers are already not supporting ECC_NONE or ECC_SW, I don't think
> it's a real program. 

I'll take 's/program/problem/' as a fun typo? :D

I suppose that's mostly fine reasoning (though ECC_NONE is a pretty
stupid mode). And I see that the "unsupported drivers" are mostly going
to be resilient to this anyway, as noted below.

> > Can this be improved to either bail out when
> > this clobbers a controller's existing callback, or even better, to
> > utilize a controller's existing _raw() implementation, instead of
> > assuming it can use the nand_base defaults?
> 
> IMO, if there's something to fix it's the NAND controller drivers that
> are only partially complying with core expectations. I mean,
> '->cmdfunc(READ0) + ->read_buf()' is a valid sequence and should work
> with all NAND controllers, but for some (good?) reasons, some drivers
> are taking liberties with this semantic.

How is that supposed to communicate things like "read with ECC" vs. "read
without ECC"? Overall, that command semantic looked designed mostly for
stupid controllers that integrated very little, and I completely
understand why many vendors went with drivers that sidestepped that and
implemented operations differently. (Disclosure: I implemented one.)

If you expect cmdfunc(READ0) + ->read_buf() to always give you raw data,
then why should any controller driver be allowed to implement
ecc.read_page_raw()?

Or conversely, if there are good reasons to allow controllers to
override ecc.read_page_raw(), then why should the on-die *flash*
implementation circumvent that? Flash support should be orthogonal to
controller support.

> Note that we are working on improving the situation with the
> ->exec_op() approach [4], which should provide NAND controller drivers
> with all the information they need to execute a NAND operation
> (CMD, ADDR, DATA, WAIT cycles), which, AFAIU was one blocking
> aspect with '->cmdfunc()/->cmd_ctrl() + ->read/write_buf/byte/word()'. 
> 
> It will also let drivers return -ENOTSUPP to report when they simply
> don't support a specific NAND operation (for example, when the
> CMD+ADDR+DATA+WAIT sequence is not supported).

I'm pretty sure that won't cover why several of the drivers I pointed at
don't implement the "semantics" you expect.

> > 
> > The latter seems difficult to do in time for this merge, but maybe the
> > former can be done within the release?
> > 
> > I'm still tempted to pull this, since at least it does require somebody
> > to opt in via the device tree binding. So they should probably make sure
> > it works with their driver before using it.
> 
> There really no risk with this new feature because:

The following reasons are OK, but I object to the "no risk" assessment.
There's a reason this approach was rejected before; it's setting a
precedent of supporting new flash features with a "works for me", but
without regard for some drivers that don't use the same approach. Yes,
sometimes controller drivers need to be reworked or updated, but at the
same time, there are perfectly reasonable ways to implement this while
seamlessly supporting these drivers -- namely, *wrap* the existing
ecc->{read,write}_page_raw() instead of replacing them.

> 1/ the option has to be opted-in by the NAND controller driver (or
>    through a DT property)

I noted that already, which is why I'm not necessarily rejecting the
pull request.

> 2/ even if it's enabled by mistake on one of these controllers, they
>    either check ecc->mode [1][2]

Right, thanks for pointing those out. That helps a bit more.

> or simply override chip->ecc content
>    unconditionally [3]

Really? That seems a little error prone to have two different setters
trying to clobber each other. Before this patch set, at least we mostly
just respect the controller driver (see all the 'if (!ecc->foo)' in
nand_scan_tail()); the only clobbering was on the 'mode' field, because
it can be read from the device tree, in nand_scan_ident().

> If one NAND controller driver is not doing #2, the driver should be
> fixed not the on-die ECC logic.

I'm not sure those are mutually exclusive. The (new) on-die ECC logic
can easily recognize that ecc.foo is already configured, and skip
clobbering it. As noted above, nand_scan_tail() already does this for
most other similar features.

If I don't see someone else fix this up soon, I'll look at doing it
myself.

> > 
> > > * addition of helpers to help drivers choose most appropriate ECC
> > >   settings
> > > * deletion of dead-code (cached programming and ->errstat() hook)
> > > * make sure drivers that do not support the SET/GET FEATURES command
> > >   return ENOTSUPP use a dummy ->set/get_features implementation
> > >   returning -ENOTSUPP (required for Micron on-die ECC)
> > > * change the semantic of ecc->write_page() for drivers setting the
> > >   NAND_ECC_CUSTOM_PAGE_ACCESS flag
> > > * support exiting 'GET STATUS' command in default ->cmdfunc()
> > >   implementations
> > > * change the prototype of ->setup_data_interface()
> > > 
> > > A bunch of driver related changes:
> > > 
> > > * various cleanup, fixes and improvements of the MTK driver
> > > * OMAP DT bindings fixes
> > > * support for ->setup_data_interface() in the fsmc driver
> > > * support for imx7 in the gpmi driver
> > > * finalization of the denali driver rework (thanks to Masahiro for the
> > >   work he's done on this driver)
> > > * fix "bitflips in erased pages" handling in the ifc driver
> > > * addition of PM ops and dynamic timing configuration to the atmel
> > >   driver
> > > 
> > > And as usual we also have a few minor cleanup/fixes/improvements
> > > patches across the subsystem.

Anyway, I think I'm satisfied enough that we're OK for now. I've pulled
the set.

Brian

> [1]http://elixir.free-electrons.com/linux/latest/source/drivers/mtd/nand/brcmnand/brcmnand.c#L2121
> [2]http://elixir.free-electrons.com/linux/latest/source/drivers/mtd/nand/mtk_nand.c#L1171
> [3]http://elixir.free-electrons.com/linux/latest/source/drivers/mtd/nand/qcom_nandc.c#L1840
> [4]https://github.com/linux-nand/linux/commits/nand/exec_op
Boris Brezillon July 13, 2017, 9:09 a.m. | #4
On Wed, 12 Jul 2017 18:44:33 -0700
Brian Norris <computersforpeace@gmail.com> wrote:

> Hi,
> 
> On Wed, Jul 12, 2017 at 09:19:19AM +0200, Boris Brezillon wrote:
> > Le Tue, 11 Jul 2017 18:01:15 -0700,
> > Brian Norris <computersforpeace@gmail.com> a écrit :
> >   
> > > + Thomas, as I don't really want to dig up the original patch to review
> > > right now
> > > 
> > > On Mon, Jul 03, 2017 at 01:46:09PM +0200, Boris Brezillon wrote:  
> > > > Hi,
> > > > 
> > > > Here's my PR for 4.13. This v2 includes Dan's fix ("mtd: nand: mtk:
> > > > release lock on error path").
> > > > 
> > > > As usual, let me know if you see any problem.
> > > > 
> > > > Thanks,
> > > > 
> > > > Boris
> > > > 
> > > > The following changes since commit 2ea659a9ef488125eb46da6eb571de5eae5c43f6:
> > > > 
> > > >   Linux 4.12-rc1 (2017-05-13 13:19:49 -0700)
> > > > 
> > > > are available in the git repository at:
> > > > 
> > > >   ssh://git.infradead.org/srv/git/l2-mtd.git tags/nand/for-4.13    
> > > 
> > > I happen to have SSH access to that repository :) but a
> > > publicly-readable link is usually a better idea, in the highly unlikely
> > > case that someone else wants to review your pull request.  
> > 
> > My bad. I have 2 remotes for l2-mtd (l2-mtd and l2-mtd-pub), just passed
> > the wrong one to 'git request-pull' :-/.
> >   
> > >   
> > > > for you to fetch changes up to 81667e9c8ad827365f2d1e8b924caf062a19c593:
> > > > 
> > > >   mtd: nand: mtk: release lock on error path (2017-07-03 13:39:09 +0200)
> > > > 
> > > > ----------------------------------------------------------------
> > > > This pull request contains the following core changes:
> > > > 
> > > > * addition of on-ecc support to Micron driver    
> > > 
> > > This still works by overriding chip->ecc.{read,write}_page{,_raw}()
> > > callbacks... I never liked that, and there have been multiple
> > > submissions that tried this already, IIRC. It's for sure not going to
> > > work with at least one in-tree driver (brcmnand); I suspect others
> > > (qcom_nand? mtk_nand?).  
> > 
> > Yes, I know it doesn't work with all drivers, but given that those
> > drivers are already not supporting ECC_NONE or ECC_SW, I don't think
> > it's a real program.   
> 
> I'll take 's/program/problem/' as a fun typo? :D

Hehe. Yep, it's a typo.

> 
> I suppose that's mostly fine reasoning (though ECC_NONE is a pretty
> stupid mode).

I don't want to debate the usefulness of ECC_NONE here, but I don't
find it stupid, just not so useful in real-life :-P.

> And I see that the "unsupported drivers" are mostly going
> to be resilient to this anyway, as noted below.

I'm more optimistic here. Of course, it requires reworking the logic in
most of them, and a few might never be able to support on-die ECC
because of hardcoded logic in the NAND controller IP preventing one
from disabling the ECC engine.

> 
> > > Can this be improved to either bail out when
> > > this clobbers a controller's existing callback, or even better, to
> > > utilize a controller's existing _raw() implementation, instead of
> > > assuming it can use the nand_base defaults?  
> > 
> > IMO, if there's something to fix it's the NAND controller drivers that
> > are only partially complying with core expectations. I mean,
> > '->cmdfunc(READ0) + ->read_buf()' is a valid sequence and should work
> > with all NAND controllers, but for some (good?) reasons, some drivers
> > are taking liberties with this semantic.  
> 
> How is that supposed to communicate things like "read with ECC" vs. "read
> without ECC"?

It's not supposed to.

> Overall, that command semantic looked designed mostly for
> stupid controllers that integrated very little, and I completely
> understand why many vendors went with drivers that sidestepped that and
> implemented operations differently. (Disclosure: I implemented one.)

Well, that's where most of the misunderstanding comes from. People
tend to mix the ECC engine and NAND controller concepts, simply because
it's most of the time embedded in the same HW block.

Let me explain my understanding of the NAND framework. You have 3
different concepts:

 1/ the NAND chip (struct nand_chip)
 2/ the NAND controller (struct nand_hw_ctrl)
 3/ the ECC engine (struct nand_ecc_ctrl)

The NAND chip and NAND controller are pretty easy to identify and
separate, and you don't have a choice here.

It's a bit more complicated for the ECC engine, because it may be in
the placed in the NAND controller, directly on the chip or handled
completely in SW.

ecc->read/write_page[_raw]() methods are here to let the ECC engine
take the appropriate action based on the required operation, but those
functions have to be implemented by the block taking care of ECC:

- on-die ECC: NAND vendor driver sends a command to disable/enable ECC
  before launching the read/prog page operation, and check for errors
  with another NAND command (only for READ page operations).
- ECC engine embedded in NAND controllers: NAND controller driver sets
  the ECC_EN bit before trigerring the NAND read/prog operation and
  check for errors when the read is done

> 
> If you expect cmdfunc(READ0) + ->read_buf() to always give you raw data,
> then why should any controller driver be allowed to implement
> ecc.read_page_raw()?

Because ECC engine might have to be disabled before you launch the
cmdfunc(READ0) + ->read_buf() sequence, and only the ECC controller can
do that. For ECC engines embedded in NAND controllers it's usually about
setting the ECC_EN bit to 0 and skipping ECC correction step. For
on-die ECC, it may require sending an extra NAND command.
If you let the NAND controller override ecc->read_page_raw() and still
want to use on-die ECC, then you're screwed, because on-die ECC logic
embedded in the NAND might stay enabled even when ecc->read_page_raw()
is called.

> 
> Or conversely, if there are good reasons to allow controllers to
> override ecc.read_page_raw(), then why should the on-die *flash*
> implementation circumvent that? Flash support should be orthogonal to
> controller support.

The only case where on-die ECC logic should override NAND controller
hooks is when ecc->mode is set to ONDIE_ECC. But that also means that,
when on-die ECC is selected by the NAND controller driver (at probe/init
time), it should make sure that the ECC engine embedded in the NAND
controller is disabled when this particular NAND chip is accessed (can
be done in ->select_chip() for instance, or it can be disabled at
probe time assuming you only have one NAND chip connected to the NAND
controller).

> 
> > Note that we are working on improving the situation with the  
> > ->exec_op() approach [4], which should provide NAND controller drivers  
> > with all the information they need to execute a NAND operation
> > (CMD, ADDR, DATA, WAIT cycles), which, AFAIU was one blocking
> > aspect with '->cmdfunc()/->cmd_ctrl() + ->read/write_buf/byte/word()'. 
> > 
> > It will also let drivers return -ENOTSUPP to report when they simply
> > don't support a specific NAND operation (for example, when the
> > CMD+ADDR+DATA+WAIT sequence is not supported).  
> 
> I'm pretty sure that won't cover why several of the drivers I pointed at
> don't implement the "semantics" you expect.

Of course, this ->exec_op() hook won't magically solve all problems, but
it will help transition to something that is easier to maintain.

As mentioned above, the NAND controller drivers will still be able to
decide whether it supports on-die ECC or not, and when it does, it
should make sure the ECC engine embedded in the controller is disabled
when read/prog accesses are done to this specific chip.

Now, maybe I'm wrong, and ->exec_op() will not fit all controllers.
But in this case, I'd rather add chip->read/write_page() hooks to let
the NAND controller do raw accesses to the NAND rather than abuse the
ecc->read/write_page[_raw]() ones, because we're likely to need a
specific implementation for these methods for some funky on-die ECC
engines (a specific action might be required to disable ECC before a
read/write access).

> 
> > > 
> > > The latter seems difficult to do in time for this merge, but maybe the
> > > former can be done within the release?
> > > 
> > > I'm still tempted to pull this, since at least it does require somebody
> > > to opt in via the device tree binding. So they should probably make sure
> > > it works with their driver before using it.  
> > 
> > There really no risk with this new feature because:  
> 
> The following reasons are OK, but I object to the "no risk" assessment.
> There's a reason this approach was rejected before; it's setting a
> precedent of supporting new flash features with a "works for me", but
> without regard for some drivers that don't use the same approach.

But that's not what Thomas did here. The feature has to be explicitly
enabled, and most (all?) drivers check/override ecc->mode before
calling nand_scan_tail(). So, unless someone decides to remove
ecc->mode checks without testing, we should be safe.

> Yes,
> sometimes controller drivers need to be reworked or updated, but at the
> same time, there are perfectly reasonable ways to implement this while
> seamlessly supporting these drivers -- namely, *wrap* the existing
> ecc->{read,write}_page_raw() instead of replacing them.

Then we need to clarify things, and adding chip->read/write_page()
page hooks in nand_chip to do all raw accesses to the NAND is one
solution. By raw, I mean raw from the NAND controller PoV. That is, if
the controller embeds an ECC engine it has to disable it.
Note that we still need the ecc->read/write_page_raw() methods, because
accessing the NAND in raw mode from the NAND controller PoV does not
necessarily mean we've disabled ECC (this is the case when on-die ECC
is in use).

> 
> > 1/ the option has to be opted-in by the NAND controller driver (or
> >    through a DT property)  
> 
> I noted that already, which is why I'm not necessarily rejecting the
> pull request.
> 
> > 2/ even if it's enabled by mistake on one of these controllers, they
> >    either check ecc->mode [1][2]  
> 
> Right, thanks for pointing those out. That helps a bit more.
> 
> > or simply override chip->ecc content
> >    unconditionally [3]  
> 
> Really? That seems a little error prone to have two different setters
> trying to clobber each other. Before this patch set, at least we mostly
> just respect the controller driver (see all the 'if (!ecc->foo)' in
> nand_scan_tail()); the only clobbering was on the 'mode' field, because
> it can be read from the device tree, in nand_scan_ident().
> 
> > If one NAND controller driver is not doing #2, the driver should be
> > fixed not the on-die ECC logic.  
> 
> I'm not sure those are mutually exclusive. The (new) on-die ECC logic
> can easily recognize that ecc.foo is already configured, and skip
> clobbering it. As noted above, nand_scan_tail() already does this for
> most other similar features.
> 
> If I don't see someone else fix this up soon, I'll look at doing it
> myself.

Actually, this patch [1] should fix the problem. By moving the vendor
init step in nand_scan_tail() we make sure ecc->mode is set to its final
value when Micron's driver decides whether on-die ECC should be
activated or not.

[1]https://patchwork.ozlabs.org/patch/770228/