mbox series

[v2,0/5] spi: spi-mtk-nor: Add mt8192 support.

Message ID 20200918083124.3921207-1-ikjn@chromium.org
Headers show
Series spi: spi-mtk-nor: Add mt8192 support. | expand

Message

Ikjoon Jang Sept. 18, 2020, 8:31 a.m. UTC
This patchset adds 36bit dma address and power management
supports for mt8192-nor.

Changes in v2:
- Add power management support
- Fix bugs in checking spi memory operation.
- use dma_alloc_coherent for allocating bounce buffer
- code cleanups

Ikjoon Jang (5):
  dt-bindings: spi: add mt8192-nor compatible string
  spi: spi-mtk-nor: fix mishandled logics in checking SPI memory
    operation
  spi: spi-mtk-nor: use dma_alloc_coherent() for bounce buffer
  spi: spi-mtk-nor: support 36bit dma addressing to mediatek
  spi: spi-mtk-nor: Add power management support

 .../bindings/spi/mediatek,spi-mtk-nor.yaml    |   1 +
 drivers/spi/spi-mtk-nor.c                     | 242 ++++++++++++------
 2 files changed, 170 insertions(+), 73 deletions(-)

Comments

Chuanhong Guo Sept. 18, 2020, 1:09 p.m. UTC | #1
Hi!

On Fri, Sep 18, 2020 at 4:34 PM Ikjoon Jang <ikjn@chromium.org> wrote:
>
> Fix a simple bug which can limits its transfer size,
> and add a simple helper function for code cleanups.
>
> Fixes: a59b2c7c56bf ("spi: spi-mtk-nor: support standard spi properties")
> Signed-off-by: Ikjoon Jang <ikjn@chromium.org>
>
> ---
>
> (no changes since v1)
>
>  drivers/spi/spi-mtk-nor.c | 62 ++++++++++++++++++++++++---------------
>  1 file changed, 38 insertions(+), 24 deletions(-)
>
> diff --git a/drivers/spi/spi-mtk-nor.c b/drivers/spi/spi-mtk-nor.c
> index 6e6ca2b8e6c8..54b2c0fde95b 100644
> --- a/drivers/spi/spi-mtk-nor.c
> +++ b/drivers/spi/spi-mtk-nor.c
> @@ -167,52 +167,63 @@ static bool mtk_nor_match_read(const struct spi_mem_op *op)
>         return false;
>  }
>
> -static int mtk_nor_adjust_op_size(struct spi_mem *mem, struct spi_mem_op *op)
> +static bool need_bounce(void *cpu_addr, unsigned long len)
>  {
> -       size_t len;
> +       return !!(((uintptr_t)cpu_addr) & MTK_NOR_DMA_ALIGN_MASK);
> +}

parameter 'len' isn't used in this function.

>
> +static int mtk_nor_adjust_op_size(struct spi_mem *mem, struct spi_mem_op *op)
> +{
>         if (!op->data.nbytes)
>                 return 0;
>
>         if ((op->addr.nbytes == 3) || (op->addr.nbytes == 4)) {
> -               if ((op->data.dir == SPI_MEM_DATA_IN) &&
> -                   mtk_nor_match_read(op)) {

I think replacing a if/else if with a two-case switch is more
of a personal code preference rather than code cleanup.
I'd prefer only adding need_bounce to replace alignment
check using a separated commit and leave other stuff
untouched because:
1. This "cleanup" made unintended logic changes (see below)
2. The "cleanup" itself actually becomes the major part of
    this patch, while the actual fix mentioned in commit
    message is the minor part.
3. A fix commit should contain the fix itself. It shouldn't
    mix with these code changes.

> +               switch (op->data.dir) {
> +               case SPI_MEM_DATA_IN:
> +                       if (!mtk_nor_match_read(op))
> +                               return -EINVAL;

You are changing the code logic here.
mtk_nor_match_read checks if the operation can be executed
using controller PIO/DMA reading. Even if it's not supported,
we can still use PRG mode to execute the operation.
One example of such an operation is SPI NOR SFDP reading.
Your change breaks that which then breaks 1_2_2 and 1_4_4
reading capability because spi-nor driver parses these op formats
from SFDP table.

> +                       /* check if it's DMAable */
>                         if ((op->addr.val & MTK_NOR_DMA_ALIGN_MASK) ||
> -                           (op->data.nbytes < MTK_NOR_DMA_ALIGN))
> +                           (op->data.nbytes < MTK_NOR_DMA_ALIGN)) {
>                                 op->data.nbytes = 1;
> -                       else if (!((ulong)(op->data.buf.in) &
> -                                  MTK_NOR_DMA_ALIGN_MASK))
> +                       } else {
> +                               if (need_bounce(op->data.buf.in, op->data.nbytes) &&
> +                                   (op->data.nbytes > MTK_NOR_BOUNCE_BUF_SIZE))
> +                                       op->data.nbytes = MTK_NOR_BOUNCE_BUF_SIZE;
>                                 op->data.nbytes &= ~MTK_NOR_DMA_ALIGN_MASK;
> -                       else if (op->data.nbytes > MTK_NOR_BOUNCE_BUF_SIZE)
> -                               op->data.nbytes = MTK_NOR_BOUNCE_BUF_SIZE;

data length alignment is intentionally done only for DMA reading
without the bounce buffer.
My intention here:
If we use the bounce buffer, we can read more data than needed to.
Say we want 25 bytes of data, reading 32 bytes using DMA and
bounce buffer should be faster than reading 16 bytes with DMA
and another 9 bytes with PIO, because for every single byte of PIO
reading, adjust_op_size and exec_op is called once, we
program controller with new cmd/address, and controller need
to send extra cmd/address to flash.
I noticed that you removed this part of logic from DMA reading
execution in 3/5 as well. Please revert the logic change here
add in DMA reading function (see later comment in 3/5).

> -                       return 0;
> -               } else if (op->data.dir == SPI_MEM_DATA_OUT) {
> +                       }
> +                       break;
> +               case SPI_MEM_DATA_OUT:
>                         if (op->data.nbytes >= MTK_NOR_PP_SIZE)
>                                 op->data.nbytes = MTK_NOR_PP_SIZE;
>                         else
>                                 op->data.nbytes = 1;
> -                       return 0;
> +                       break;
> +               default:
> +                       break;
>                 }
> +       } else {
> +               u8 len = op->cmd.nbytes + op->addr.nbytes + op->dummy.nbytes;
> +
> +               if (len > MTK_NOR_PRG_MAX_SIZE)
> +                       return -EINVAL;
> +               if (op->data.nbytes && !(MTK_NOR_PRG_MAX_SIZE - len))
> +                       return -EINVAL;
> +               if (op->data.nbytes > (MTK_NOR_PRG_MAX_SIZE - len))
> +                       op->data.nbytes = MTK_NOR_PRG_MAX_SIZE - len;
>         }
>
> -       len = MTK_NOR_PRG_MAX_SIZE - op->cmd.nbytes - op->addr.nbytes -
> -             op->dummy.nbytes;
> -       if (op->data.nbytes > len)
> -               op->data.nbytes = len;
> -
>         return 0;
>  }
>
>  static bool mtk_nor_supports_op(struct spi_mem *mem,
>                                 const struct spi_mem_op *op)
>  {
> -       size_t len;
> -
>         if (op->cmd.buswidth != 1)
>                 return false;
>
>         if ((op->addr.nbytes == 3) || (op->addr.nbytes == 4)) {
> -               switch(op->data.dir) {
> +               switch (op->data.dir) {
>                 case SPI_MEM_DATA_IN:
>                         if (!mtk_nor_match_read(op))
>                                 return false;
> @@ -226,11 +237,14 @@ static bool mtk_nor_supports_op(struct spi_mem *mem,
>                 default:
>                         break;
>                 }
> +       } else {
> +               u8 len = op->cmd.nbytes + op->addr.nbytes + op->dummy.nbytes;
> +
> +               if (len > MTK_NOR_PRG_MAX_SIZE)
> +                       return false;
> +               if (op->data.nbytes && !(MTK_NOR_PRG_MAX_SIZE - len))
> +                       return false;
>         }
> -       len = op->cmd.nbytes + op->addr.nbytes + op->dummy.nbytes;
> -       if ((len > MTK_NOR_PRG_MAX_SIZE) ||
> -           ((op->data.nbytes) && (len == MTK_NOR_PRG_MAX_SIZE)))
> -               return false;
>
>         return spi_mem_default_supports_op(mem, op);
>  }
> --
> 2.28.0.681.g6f77f65b4e-goog
>
Chuanhong Guo Sept. 18, 2020, 1:25 p.m. UTC | #2
Hi!

On Fri, Sep 18, 2020 at 4:35 PM Ikjoon Jang <ikjn@chromium.org> wrote:
>
> Use dma_alloc_coherent() for bounce buffer instead of kmalloc.

The commit message should explain why such a change is
needed. (i.e. why using dma_alloc_coherent here is better
than kmalloc.) And if there's no benefit for this change I'd prefer
leaving it untouched.
I remembered reading somewhere that stream DMA api is
prefered over dma_alloc_coherent for this kind of single-direction
DMA operation.

>
> Signed-off-by: Ikjoon Jang <ikjn@chromium.org>
>
> ---
>
> (no changes since v1)
>
>  drivers/spi/spi-mtk-nor.c | 60 +++++++++++++++++++++++----------------
>  1 file changed, 35 insertions(+), 25 deletions(-)
>
> diff --git a/drivers/spi/spi-mtk-nor.c b/drivers/spi/spi-mtk-nor.c
> index 54b2c0fde95b..e14798a6e7d0 100644
> --- a/drivers/spi/spi-mtk-nor.c
> +++ b/drivers/spi/spi-mtk-nor.c
> @@ -96,6 +96,7 @@ struct mtk_nor {
>         struct device *dev;
>         void __iomem *base;
>         u8 *buffer;
> +       dma_addr_t buffer_dma;
>         struct clk *spi_clk;
>         struct clk *ctlr_clk;
>         unsigned int spi_freq;
> @@ -275,19 +276,16 @@ static void mtk_nor_setup_bus(struct mtk_nor *sp, const struct spi_mem_op *op)
>         mtk_nor_rmw(sp, MTK_NOR_REG_BUSCFG, reg, MTK_NOR_BUS_MODE_MASK);
>  }
>
> -static int mtk_nor_read_dma(struct mtk_nor *sp, u32 from, unsigned int length,
> -                           u8 *buffer)
> +static int read_dma(struct mtk_nor *sp, u32 from, unsigned int length,

This name is a bit confusing considering there's a mtk_nor_read_dma
below.
As this function now only executes dma readings and wait it to finish,
what about mtk_nor_dma_exec instead?

> +                   dma_addr_t dma_addr)
>  {
>         int ret = 0;
>         ulong delay;
>         u32 reg;
> -       dma_addr_t dma_addr;
>
> -       dma_addr = dma_map_single(sp->dev, buffer, length, DMA_FROM_DEVICE);
> -       if (dma_mapping_error(sp->dev, dma_addr)) {
> -               dev_err(sp->dev, "failed to map dma buffer.\n");
> +       if (WARN_ON((length & MTK_NOR_DMA_ALIGN_MASK) ||
> +                   (dma_addr & MTK_NOR_DMA_ALIGN_MASK)))

These alignment is guaranteed by callers of this function if all
my comments below are addressed. This check isn't needed.

>                 return -EINVAL;
> -       }
>
>         writel(from, sp->base + MTK_NOR_REG_DMA_FADR);
>         writel(dma_addr, sp->base + MTK_NOR_REG_DMA_DADR);
> @@ -312,30 +310,39 @@ static int mtk_nor_read_dma(struct mtk_nor *sp, u32 from, unsigned int length,
>                                          (delay + 1) * 100);
>         }
>
> -       dma_unmap_single(sp->dev, dma_addr, length, DMA_FROM_DEVICE);
>         if (ret < 0)
>                 dev_err(sp->dev, "dma read timeout.\n");
>
>         return ret;
>  }
>
> -static int mtk_nor_read_bounce(struct mtk_nor *sp, u32 from,
> -                              unsigned int length, u8 *buffer)
> +static int mtk_nor_read_dma(struct mtk_nor *sp, u32 from,
> +                           unsigned int length, u8 *buffer)
>  {
> -       unsigned int rdlen;
>         int ret;
> +       dma_addr_t dma_addr;
> +       bool bounce = need_bounce(buffer, length);
>
> -       if (length & MTK_NOR_DMA_ALIGN_MASK)
> -               rdlen = (length + MTK_NOR_DMA_ALIGN) & ~MTK_NOR_DMA_ALIGN_MASK;

The intention of this rdlen alignment is explained in 2/5.
Please make sure this rdlen alignment logic is present
only for PIO reading.

> -       else
> -               rdlen = length;
> +       if (!bounce) {
> +               dma_addr = dma_map_single(sp->dev, buffer, length,
> +                                         DMA_FROM_DEVICE);
> +               if (dma_mapping_error(sp->dev, dma_addr)) {
> +                       dev_err(sp->dev, "failed to map dma buffer.\n");
> +                       return -EINVAL;
> +               }
> +       } else {
> +               dma_addr = sp->buffer_dma;
> +       }
>
> -       ret = mtk_nor_read_dma(sp, from, rdlen, sp->buffer);
> -       if (ret)
> -               return ret;
> +       ret = read_dma(sp, from, length, dma_addr);
>
> -       memcpy(buffer, sp->buffer, length);
> -       return 0;
> +       if (!bounce)
> +               dma_unmap_single(sp->dev, dma_addr, length,
> +                                DMA_FROM_DEVICE);
> +       else
> +               memcpy(buffer, sp->buffer, length);
> +
> +       return ret;
>  }

I think a separated read_dma and read_bounce function will be
cleaner than this if-else implementation:
read_dma:
1. call dma_map_single to get physical address
2. call read_dma to execute operation
3. call dma_unmap_single

read_bounce:
1. align reading length
2. call read_dma
3. call memcpy

>
>  static int mtk_nor_read_pio(struct mtk_nor *sp, const struct spi_mem_op *op)
> @@ -439,11 +446,6 @@ static int mtk_nor_exec_op(struct spi_mem *mem, const struct spi_mem_op *op)
>                 if (op->data.nbytes == 1) {
>                         mtk_nor_set_addr(sp, op);
>                         return mtk_nor_read_pio(sp, op);
> -               } else if (((ulong)(op->data.buf.in) &
> -                           MTK_NOR_DMA_ALIGN_MASK)) {
> -                       return mtk_nor_read_bounce(sp, op->addr.val,
> -                                                  op->data.nbytes,
> -                                                  op->data.buf.in);
>                 } else {
>                         return mtk_nor_read_dma(sp, op->addr.val,
>                                                 op->data.nbytes,
> @@ -654,6 +656,10 @@ static int mtk_nor_probe(struct platform_device *pdev)
>         sp->dev = &pdev->dev;
>         sp->spi_clk = spi_clk;
>         sp->ctlr_clk = ctlr_clk;

There is extra memory allocation code for sp->buffer in mtk_nor_probe.
If you intend to replace this with dma_alloc_coherent you should
drop those devm_kmalloc code as well.

> +       sp->buffer = dma_alloc_coherent(&pdev->dev, MTK_NOR_BOUNCE_BUF_SIZE,
> +                                       &sp->buffer_dma, GFP_KERNEL);

There's a devm variant: dmam_alloc_coherent(dev, size, dma_handle, gfp)

> +       if (!sp->buffer)
> +               return -ENOMEM;

This spi-nor controller requires all addresses to be 16-byte aligned.
Although it should be guaranteed by a usually way larger page
alignment address from dma_alloc_coherent I'd prefer an explicit
check for address alignment here rather than letting it probe
successfully and fail for every dma_read with bounce buffer.


>
>         irq = platform_get_irq_optional(pdev, 0);
>         if (irq < 0) {
> @@ -674,6 +680,8 @@ static int mtk_nor_probe(struct platform_device *pdev)
>         ret = mtk_nor_init(sp);
>         if (ret < 0) {
>                 kfree(ctlr);
> +               dma_free_coherent(&pdev->dev, MTK_NOR_BOUNCE_BUF_SIZE,
> +                                 sp->buffer, sp->buffer_dma);
>                 return ret;
>         }
>
> @@ -692,6 +700,8 @@ static int mtk_nor_remove(struct platform_device *pdev)
>
>         mtk_nor_disable_clk(sp);
>
> +       dma_free_coherent(&pdev->dev, MTK_NOR_BOUNCE_BUF_SIZE,
> +                         sp->buffer, sp->buffer_dma);
>         return 0;
>  }
>
> --
> 2.28.0.681.g6f77f65b4e-goog
>


--
Regards,
Chuanhong Guo
Chuanhong Guo Sept. 18, 2020, 1:33 p.m. UTC | #3
Hi!

On Fri, Sep 18, 2020 at 9:09 PM Chuanhong Guo <gch981213@gmail.com> wrote:
> On Fri, Sep 18, 2020 at 4:34 PM Ikjoon Jang <ikjn@chromium.org> wrote:
> > [...]
> > +               switch (op->data.dir) {
> > +               case SPI_MEM_DATA_IN:
> > +                       if (!mtk_nor_match_read(op))
> > +                               return -EINVAL;
>
> You are changing the code logic here.
> mtk_nor_match_read checks if the operation can be executed
> using controller PIO/DMA reading. Even if it's not supported,
> we can still use PRG mode to execute the operation.
> One example of such an operation is SPI NOR SFDP reading.
> Your change breaks that which then breaks 1_2_2 and 1_4_4
> reading capability because spi-nor driver parses these op formats
> from SFDP table.

I just noticed that you already broke this in:
spi: spi-mtk-nor: support standard spi properties
Please also fix the same logic in mtk_nor_supports_op in your v3.
Yingjoe Chen Sept. 19, 2020, 3:26 p.m. UTC | #4
On Fri, 2020-09-18 at 16:31 +0800, Ikjoon Jang wrote:
> This patch enables 36bit dma address support to spi-mtk-nor.
> Currently 36bit dma addressing is enabled only for mt8192-nor.
> 
> Signed-off-by: Ikjoon Jang <ikjn@chromium.org>
> ---
> 
> (no changes since v1)
> 
>  drivers/spi/spi-mtk-nor.c | 19 ++++++++++++++++++-
>  1 file changed, 18 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/spi/spi-mtk-nor.c b/drivers/spi/spi-mtk-nor.c
> index e14798a6e7d0..99dd5dca744e 100644
> --- a/drivers/spi/spi-mtk-nor.c
> +++ b/drivers/spi/spi-mtk-nor.c
> @@ -78,6 +78,8 @@
>  #define MTK_NOR_REG_DMA_FADR		0x71c
>  #define MTK_NOR_REG_DMA_DADR		0x720
>  #define MTK_NOR_REG_DMA_END_DADR	0x724
> +#define MTK_NOR_REG_DMA_DADR_HB		0x738
> +#define MTK_NOR_REG_DMA_END_DADR_HB	0x73c
>  
>  #define MTK_NOR_PRG_MAX_SIZE		6
>  // Reading DMA src/dst addresses have to be 16-byte aligned
> @@ -102,6 +104,7 @@ struct mtk_nor {
>  	unsigned int spi_freq;
>  	bool wbuf_en;
>  	bool has_irq;
> +	bool high_dma;
>  	struct completion op_done;
>  };
>  
> @@ -291,6 +294,11 @@ static int read_dma(struct mtk_nor *sp, u32 from, unsigned int length,
>  	writel(dma_addr, sp->base + MTK_NOR_REG_DMA_DADR);
>  	writel(dma_addr + length, sp->base + MTK_NOR_REG_DMA_END_DADR);
>  
> +	if (sp->high_dma) {
> +		writel(dma_addr >> 32, sp->base + MTK_NOR_REG_DMA_DADR_HB);
> +		writel((dma_addr + length) >> 32, sp->base + MTK_NOR_REG_DMA_END_DADR_HB);
> +	}
> +
>  	if (sp->has_irq) {
>  		reinit_completion(&sp->op_done);
>  		mtk_nor_rmw(sp, MTK_NOR_REG_IRQ_EN, MTK_NOR_IRQ_DMA, 0);
> @@ -594,7 +602,8 @@ static const struct spi_controller_mem_ops mtk_nor_mem_ops = {
>  };
>  
>  static const struct of_device_id mtk_nor_match[] = {
> -	{ .compatible = "mediatek,mt8173-nor" },
> +	{ .compatible = "mediatek,mt8192-nor", .data = (void *)36 },
> +	{ .compatible = "mediatek,mt8173-nor", .data = (void *)32 },
>  	{ /* sentinel */ }
>  };
>  MODULE_DEVICE_TABLE(of, mtk_nor_match);
> @@ -607,6 +616,7 @@ static int mtk_nor_probe(struct platform_device *pdev)
>  	u8 *buffer;
>  	struct clk *spi_clk, *ctlr_clk;
>  	int ret, irq;
> +	unsigned long dma_bits;
>  
>  	base = devm_platform_ioremap_resource(pdev, 0);
>  	if (IS_ERR(base))
> @@ -623,6 +633,13 @@ static int mtk_nor_probe(struct platform_device *pdev)
>  	buffer = devm_kmalloc(&pdev->dev,
>  			      MTK_NOR_BOUNCE_BUF_SIZE + MTK_NOR_DMA_ALIGN,
>  			      GFP_KERNEL);
> +
> +	dma_bits = (unsigned long)of_device_get_match_data(&pdev->dev);
> +	if (dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(dma_bits))) {
> +		dev_err(&pdev->dev, "failed to set dma mask(%lu)\n", dma_bits);
> +		return -EINVAL;
> +	}
> +

Do we need to set sp->high_dma when we have >32bits DMA?

Joe.C
Ikjoon Jang Sept. 21, 2020, 6:10 a.m. UTC | #5
On Fri, Sep 18, 2020 at 9:09 PM Chuanhong Guo <gch981213@gmail.com> wrote:
>
> Hi!
>
> On Fri, Sep 18, 2020 at 4:34 PM Ikjoon Jang <ikjn@chromium.org> wrote:
> >
> > Fix a simple bug which can limits its transfer size,
> > and add a simple helper function for code cleanups.
> >
> > Fixes: a59b2c7c56bf ("spi: spi-mtk-nor: support standard spi properties")
> > Signed-off-by: Ikjoon Jang <ikjn@chromium.org>
> >
> > ---
> >
> > (no changes since v1)
> >
> >  drivers/spi/spi-mtk-nor.c | 62 ++++++++++++++++++++++++---------------
> >  1 file changed, 38 insertions(+), 24 deletions(-)
> >
> > diff --git a/drivers/spi/spi-mtk-nor.c b/drivers/spi/spi-mtk-nor.c
> > index 6e6ca2b8e6c8..54b2c0fde95b 100644
> > --- a/drivers/spi/spi-mtk-nor.c
> > +++ b/drivers/spi/spi-mtk-nor.c
> > @@ -167,52 +167,63 @@ static bool mtk_nor_match_read(const struct spi_mem_op *op)
> >         return false;
> >  }
> >
> > -static int mtk_nor_adjust_op_size(struct spi_mem *mem, struct spi_mem_op *op)
> > +static bool need_bounce(void *cpu_addr, unsigned long len)
> >  {
> > -       size_t len;
> > +       return !!(((uintptr_t)cpu_addr) & MTK_NOR_DMA_ALIGN_MASK);
> > +}
>
> parameter 'len' isn't used in this function.

ACK.

>
> >
> > +static int mtk_nor_adjust_op_size(struct spi_mem *mem, struct spi_mem_op *op)
> > +{
> >         if (!op->data.nbytes)
> >                 return 0;
> >
> >         if ((op->addr.nbytes == 3) || (op->addr.nbytes == 4)) {
> > -               if ((op->data.dir == SPI_MEM_DATA_IN) &&
> > -                   mtk_nor_match_read(op)) {
>
> I think replacing a if/else if with a two-case switch is more
> of a personal code preference rather than code cleanup.
> I'd prefer only adding need_bounce to replace alignment
> check using a separated commit and leave other stuff
> untouched because:
> 1. This "cleanup" made unintended logic changes (see below)
> 2. The "cleanup" itself actually becomes the major part of
>     this patch, while the actual fix mentioned in commit
>     message is the minor part.
> 3. A fix commit should contain the fix itself. It shouldn't
>     mix with these code changes.

Got it, v3 will only include the fix itself.

>
> > +               switch (op->data.dir) {
> > +               case SPI_MEM_DATA_IN:
> > +                       if (!mtk_nor_match_read(op))
> > +                               return -EINVAL;
>
> You are changing the code logic here.
> mtk_nor_match_read checks if the operation can be executed
> using controller PIO/DMA reading. Even if it's not supported,
> we can still use PRG mode to execute the operation.
> One example of such an operation is SPI NOR SFDP reading.
> Your change breaks that which then breaks 1_2_2 and 1_4_4
> reading capability because spi-nor driver parses these op formats
> from SFDP table.

As you already noticed it, this is a bug from the last patch I made
and v2 isn't fixing this problem. This should be restored & fixed in v3.
And will not include other changes in a same patch.

>
> > +                       /* check if it's DMAable */
> >                         if ((op->addr.val & MTK_NOR_DMA_ALIGN_MASK) ||
> > -                           (op->data.nbytes < MTK_NOR_DMA_ALIGN))
> > +                           (op->data.nbytes < MTK_NOR_DMA_ALIGN)) {
> >                                 op->data.nbytes = 1;
> > -                       else if (!((ulong)(op->data.buf.in) &
> > -                                  MTK_NOR_DMA_ALIGN_MASK))
> > +                       } else {
> > +                               if (need_bounce(op->data.buf.in, op->data.nbytes) &&
> > +                                   (op->data.nbytes > MTK_NOR_BOUNCE_BUF_SIZE))
> > +                                       op->data.nbytes = MTK_NOR_BOUNCE_BUF_SIZE;
> >                                 op->data.nbytes &= ~MTK_NOR_DMA_ALIGN_MASK;
> > -                       else if (op->data.nbytes > MTK_NOR_BOUNCE_BUF_SIZE)
> > -                               op->data.nbytes = MTK_NOR_BOUNCE_BUF_SIZE;
>
> data length alignment is intentionally done only for DMA reading
> without the bounce buffer.
> My intention here:
> If we use the bounce buffer, we can read more data than needed to.
> Say we want 25 bytes of data, reading 32 bytes using DMA and
> bounce buffer should be faster than reading 16 bytes with DMA
> and another 9 bytes with PIO, because for every single byte of PIO
> reading, adjust_op_size and exec_op is called once, we
> program controller with new cmd/address, and controller need
> to send extra cmd/address to flash.
> I noticed that you removed this part of logic from DMA reading
> execution in 3/5 as well. Please revert the logic change here
> add in DMA reading function (see later comment in 3/5).

In v2, I wasn't sure whether this behavior is sane (read more than requested)
Now I think this is okay, I've missed the fact that flash address is
also aligned together.
I'll just keep the previous logics with padding in v3.

Thanks!

>
> > -                       return 0;
> > -               } else if (op->data.dir == SPI_MEM_DATA_OUT) {
> > +                       }
> > +                       break;
> > +               case SPI_MEM_DATA_OUT:
> >                         if (op->data.nbytes >= MTK_NOR_PP_SIZE)
> >                                 op->data.nbytes = MTK_NOR_PP_SIZE;
> >                         else
> >                                 op->data.nbytes = 1;
> > -                       return 0;
> > +                       break;
> > +               default:
> > +                       break;
> >                 }
> > +       } else {
> > +               u8 len = op->cmd.nbytes + op->addr.nbytes + op->dummy.nbytes;
> > +
> > +               if (len > MTK_NOR_PRG_MAX_SIZE)
> > +                       return -EINVAL;
> > +               if (op->data.nbytes && !(MTK_NOR_PRG_MAX_SIZE - len))
> > +                       return -EINVAL;
> > +               if (op->data.nbytes > (MTK_NOR_PRG_MAX_SIZE - len))
> > +                       op->data.nbytes = MTK_NOR_PRG_MAX_SIZE - len;
> >         }
> >
> > -       len = MTK_NOR_PRG_MAX_SIZE - op->cmd.nbytes - op->addr.nbytes -
> > -             op->dummy.nbytes;
> > -       if (op->data.nbytes > len)
> > -               op->data.nbytes = len;
> > -
> >         return 0;
> >  }
> >
> >  static bool mtk_nor_supports_op(struct spi_mem *mem,
> >                                 const struct spi_mem_op *op)
> >  {
> > -       size_t len;
> > -
> >         if (op->cmd.buswidth != 1)
> >                 return false;
> >
> >         if ((op->addr.nbytes == 3) || (op->addr.nbytes == 4)) {
> > -               switch(op->data.dir) {
> > +               switch (op->data.dir) {
> >                 case SPI_MEM_DATA_IN:
> >                         if (!mtk_nor_match_read(op))
> >                                 return false;
> > @@ -226,11 +237,14 @@ static bool mtk_nor_supports_op(struct spi_mem *mem,
> >                 default:
> >                         break;
> >                 }
> > +       } else {
> > +               u8 len = op->cmd.nbytes + op->addr.nbytes + op->dummy.nbytes;
> > +
> > +               if (len > MTK_NOR_PRG_MAX_SIZE)
> > +                       return false;
> > +               if (op->data.nbytes && !(MTK_NOR_PRG_MAX_SIZE - len))
> > +                       return false;
> >         }
> > -       len = op->cmd.nbytes + op->addr.nbytes + op->dummy.nbytes;
> > -       if ((len > MTK_NOR_PRG_MAX_SIZE) ||
> > -           ((op->data.nbytes) && (len == MTK_NOR_PRG_MAX_SIZE)))
> > -               return false;
> >
> >         return spi_mem_default_supports_op(mem, op);
> >  }
> > --
> > 2.28.0.681.g6f77f65b4e-goog
> >
>
>
> --
> Regards,
> Chuanhong Guo
Ikjoon Jang Sept. 21, 2020, 6:52 a.m. UTC | #6
On Fri, Sep 18, 2020 at 9:25 PM Chuanhong Guo <gch981213@gmail.com> wrote:
>
> Hi!
>
> On Fri, Sep 18, 2020 at 4:35 PM Ikjoon Jang <ikjn@chromium.org> wrote:
> >
> > Use dma_alloc_coherent() for bounce buffer instead of kmalloc.
>
> The commit message should explain why such a change is
> needed. (i.e. why using dma_alloc_coherent here is better
> than kmalloc.) And if there's no benefit for this change I'd prefer
> leaving it untouched.
> I remembered reading somewhere that stream DMA api is
> prefered over dma_alloc_coherent for this kind of single-direction
> DMA operation.
>

I will add more description on why I changed it to dma_alloc_coherent():
- to explictly support devices like mt8173-nor which only supports
32bit addressing for dma.

And it also reminded me an another problem, (I won't address this
issue for now in v3):
as this device is using dma range as [start, end) format
where 'end' can be zero in that corner case of {start = 0xffff_f000; end = 0; }

> >
> > Signed-off-by: Ikjoon Jang <ikjn@chromium.org>
> >
> > ---
> >
> > (no changes since v1)
> >
> >  drivers/spi/spi-mtk-nor.c | 60 +++++++++++++++++++++++----------------
> >  1 file changed, 35 insertions(+), 25 deletions(-)
> >
> > diff --git a/drivers/spi/spi-mtk-nor.c b/drivers/spi/spi-mtk-nor.c
> > index 54b2c0fde95b..e14798a6e7d0 100644
> > --- a/drivers/spi/spi-mtk-nor.c
> > +++ b/drivers/spi/spi-mtk-nor.c
> > @@ -96,6 +96,7 @@ struct mtk_nor {
> >         struct device *dev;
> >         void __iomem *base;
> >         u8 *buffer;
> > +       dma_addr_t buffer_dma;
> >         struct clk *spi_clk;
> >         struct clk *ctlr_clk;
> >         unsigned int spi_freq;
> > @@ -275,19 +276,16 @@ static void mtk_nor_setup_bus(struct mtk_nor *sp, const struct spi_mem_op *op)
> >         mtk_nor_rmw(sp, MTK_NOR_REG_BUSCFG, reg, MTK_NOR_BUS_MODE_MASK);
> >  }
> >
> > -static int mtk_nor_read_dma(struct mtk_nor *sp, u32 from, unsigned int length,
> > -                           u8 *buffer)
> > +static int read_dma(struct mtk_nor *sp, u32 from, unsigned int length,
>
> This name is a bit confusing considering there's a mtk_nor_read_dma
> below.
> As this function now only executes dma readings and wait it to finish,
> what about mtk_nor_dma_exec instead?

yeah, good idea.

>
> > +                   dma_addr_t dma_addr)
> >  {
> >         int ret = 0;
> >         ulong delay;
> >         u32 reg;
> > -       dma_addr_t dma_addr;
> >
> > -       dma_addr = dma_map_single(sp->dev, buffer, length, DMA_FROM_DEVICE);
> > -       if (dma_mapping_error(sp->dev, dma_addr)) {
> > -               dev_err(sp->dev, "failed to map dma buffer.\n");
> > +       if (WARN_ON((length & MTK_NOR_DMA_ALIGN_MASK) ||
> > +                   (dma_addr & MTK_NOR_DMA_ALIGN_MASK)))
>
> These alignment is guaranteed by callers of this function if all
> my comments below are addressed. This check isn't needed.

ACK.

>
> >                 return -EINVAL;
> > -       }
> >
> >         writel(from, sp->base + MTK_NOR_REG_DMA_FADR);
> >         writel(dma_addr, sp->base + MTK_NOR_REG_DMA_DADR);
> > @@ -312,30 +310,39 @@ static int mtk_nor_read_dma(struct mtk_nor *sp, u32 from, unsigned int length,
> >                                          (delay + 1) * 100);
> >         }
> >
> > -       dma_unmap_single(sp->dev, dma_addr, length, DMA_FROM_DEVICE);
> >         if (ret < 0)
> >                 dev_err(sp->dev, "dma read timeout.\n");
> >
> >         return ret;
> >  }
> >
> > -static int mtk_nor_read_bounce(struct mtk_nor *sp, u32 from,
> > -                              unsigned int length, u8 *buffer)
> > +static int mtk_nor_read_dma(struct mtk_nor *sp, u32 from,
> > +                           unsigned int length, u8 *buffer)
> >  {
> > -       unsigned int rdlen;
> >         int ret;
> > +       dma_addr_t dma_addr;
> > +       bool bounce = need_bounce(buffer, length);
> >
> > -       if (length & MTK_NOR_DMA_ALIGN_MASK)
> > -               rdlen = (length + MTK_NOR_DMA_ALIGN) & ~MTK_NOR_DMA_ALIGN_MASK;
>
> The intention of this rdlen alignment is explained in 2/5.
> Please make sure this rdlen alignment logic is present
> only for PIO reading.

okay, I'll use padding again in v3.

>
> > -       else
> > -               rdlen = length;
> > +       if (!bounce) {
> > +               dma_addr = dma_map_single(sp->dev, buffer, length,
> > +                                         DMA_FROM_DEVICE);
> > +               if (dma_mapping_error(sp->dev, dma_addr)) {
> > +                       dev_err(sp->dev, "failed to map dma buffer.\n");
> > +                       return -EINVAL;
> > +               }
> > +       } else {
> > +               dma_addr = sp->buffer_dma;
> > +       }
> >
> > -       ret = mtk_nor_read_dma(sp, from, rdlen, sp->buffer);
> > -       if (ret)
> > -               return ret;
> > +       ret = read_dma(sp, from, length, dma_addr);
> >
> > -       memcpy(buffer, sp->buffer, length);
> > -       return 0;
> > +       if (!bounce)
> > +               dma_unmap_single(sp->dev, dma_addr, length,
> > +                                DMA_FROM_DEVICE);
> > +       else
> > +               memcpy(buffer, sp->buffer, length);
> > +
> > +       return ret;
> >  }
>
> I think a separated read_dma and read_bounce function will be
> cleaner than this if-else implementation:
> read_dma:
> 1. call dma_map_single to get physical address
> 2. call read_dma to execute operation
> 3. call dma_unmap_single
>
> read_bounce:
> 1. align reading length
> 2. call read_dma
> 3. call memcpy

ACK

>
> >
> >  static int mtk_nor_read_pio(struct mtk_nor *sp, const struct spi_mem_op *op)
> > @@ -439,11 +446,6 @@ static int mtk_nor_exec_op(struct spi_mem *mem, const struct spi_mem_op *op)
> >                 if (op->data.nbytes == 1) {
> >                         mtk_nor_set_addr(sp, op);
> >                         return mtk_nor_read_pio(sp, op);
> > -               } else if (((ulong)(op->data.buf.in) &
> > -                           MTK_NOR_DMA_ALIGN_MASK)) {
> > -                       return mtk_nor_read_bounce(sp, op->addr.val,
> > -                                                  op->data.nbytes,
> > -                                                  op->data.buf.in);
> >                 } else {
> >                         return mtk_nor_read_dma(sp, op->addr.val,
> >                                                 op->data.nbytes,
> > @@ -654,6 +656,10 @@ static int mtk_nor_probe(struct platform_device *pdev)
> >         sp->dev = &pdev->dev;
> >         sp->spi_clk = spi_clk;
> >         sp->ctlr_clk = ctlr_clk;
>
> There is extra memory allocation code for sp->buffer in mtk_nor_probe.
> If you intend to replace this with dma_alloc_coherent you should
> drop those devm_kmalloc code as well.
>
> > +       sp->buffer = dma_alloc_coherent(&pdev->dev, MTK_NOR_BOUNCE_BUF_SIZE,
> > +                                       &sp->buffer_dma, GFP_KERNEL);
>
> There's a devm variant: dmam_alloc_coherent(dev, size, dma_handle, gfp)

ACK

>
> > +       if (!sp->buffer)
> > +               return -ENOMEM;
>
> This spi-nor controller requires all addresses to be 16-byte aligned.
> Although it should be guaranteed by a usually way larger page
> alignment address from dma_alloc_coherent I'd prefer an explicit
> check for address alignment here rather than letting it probe
> successfully and fail for every dma_read with bounce buffer.
>

Yep, I'll restore the padding.

>
> >
> >         irq = platform_get_irq_optional(pdev, 0);
> >         if (irq < 0) {
> > @@ -674,6 +680,8 @@ static int mtk_nor_probe(struct platform_device *pdev)
> >         ret = mtk_nor_init(sp);
> >         if (ret < 0) {
> >                 kfree(ctlr);
> > +               dma_free_coherent(&pdev->dev, MTK_NOR_BOUNCE_BUF_SIZE,
> > +                                 sp->buffer, sp->buffer_dma);
> >                 return ret;
> >         }
> >
> > @@ -692,6 +700,8 @@ static int mtk_nor_remove(struct platform_device *pdev)
> >
> >         mtk_nor_disable_clk(sp);
> >
> > +       dma_free_coherent(&pdev->dev, MTK_NOR_BOUNCE_BUF_SIZE,
> > +                         sp->buffer, sp->buffer_dma);
> >         return 0;
> >  }
> >
> > --
> > 2.28.0.681.g6f77f65b4e-goog
> >
>
>
> --
> Regards,
> Chuanhong Guo
Ikjoon Jang Sept. 21, 2020, 6:54 a.m. UTC | #7
On Sat, Sep 19, 2020 at 11:26 PM Yingjoe Chen <yingjoe.chen@mediatek.com> wrote:
>
> On Fri, 2020-09-18 at 16:31 +0800, Ikjoon Jang wrote:
> > This patch enables 36bit dma address support to spi-mtk-nor.
> > Currently 36bit dma addressing is enabled only for mt8192-nor.
[snip]
>
> Do we need to set sp->high_dma when we have >32bits DMA?
>

Yes, to do so, you need to add new compatible strings if there are
more SoCs support >32bit other than mt8192
or just ust mt8192-nor for binding.

> Joe.C