diff mbox series

hw/block: m25p80: Fix fast read for SST flashes

Message ID 1606704602-59435-1-git-send-email-bmeng.cn@gmail.com
State New
Headers show
Series hw/block: m25p80: Fix fast read for SST flashes | expand

Commit Message

Bin Meng Nov. 30, 2020, 2:50 a.m. UTC
From: Bin Meng <bin.meng@windriver.com>

SST flashes require a dummy byte after the address bits.

Signed-off-by: Bin Meng <bin.meng@windriver.com>
---

 hw/block/m25p80.c | 3 +++
 1 file changed, 3 insertions(+)

Comments

Alistair Francis Dec. 2, 2020, 7:40 p.m. UTC | #1
On Sun, Nov 29, 2020 at 6:55 PM Bin Meng <bmeng.cn@gmail.com> wrote:
>
> From: Bin Meng <bin.meng@windriver.com>
>
> SST flashes require a dummy byte after the address bits.
>
> Signed-off-by: Bin Meng <bin.meng@windriver.com>

I couldn't find a datasheet that says this... But the actual code
change looks fine, so:

Acked-by: Alistair Francis <alistair.francis@wdc.com>

Alistair

> ---
>
>  hw/block/m25p80.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/hw/block/m25p80.c b/hw/block/m25p80.c
> index 483925f..9b36762 100644
> --- a/hw/block/m25p80.c
> +++ b/hw/block/m25p80.c
> @@ -825,6 +825,9 @@ static void decode_fast_read_cmd(Flash *s)
>      s->needed_bytes = get_addr_length(s);
>      switch (get_man(s)) {
>      /* Dummy cycles - modeled with bytes writes instead of bits */
> +    case MAN_SST:
> +        s->needed_bytes += 1;
> +        break;
>      case MAN_WINBOND:
>          s->needed_bytes += 8;
>          break;
> --
> 2.7.4
>
>
Bin Meng Dec. 2, 2020, 11:09 p.m. UTC | #2
Hi Alistair,

On Thu, Dec 3, 2020 at 3:52 AM Alistair Francis <alistair23@gmail.com> wrote:
>
> On Sun, Nov 29, 2020 at 6:55 PM Bin Meng <bmeng.cn@gmail.com> wrote:
> >
> > From: Bin Meng <bin.meng@windriver.com>
> >
> > SST flashes require a dummy byte after the address bits.
> >
> > Signed-off-by: Bin Meng <bin.meng@windriver.com>
>
> I couldn't find a datasheet that says this... But the actual code
> change looks fine, so:
>

Please find the SST25VF016B datasheet at
http://ww1.microchip.com/downloads/en/devicedoc/s71271_04.pdf. The
fast read sequence is on page 11.

> Acked-by: Alistair Francis <alistair.francis@wdc.com>
>

Thanks!

Regards,
Bin
Alistair Francis Dec. 2, 2020, 11:14 p.m. UTC | #3
On Wed, Dec 2, 2020 at 3:09 PM Bin Meng <bmeng.cn@gmail.com> wrote:
>
> Hi Alistair,
>
> On Thu, Dec 3, 2020 at 3:52 AM Alistair Francis <alistair23@gmail.com> wrote:
> >
> > On Sun, Nov 29, 2020 at 6:55 PM Bin Meng <bmeng.cn@gmail.com> wrote:
> > >
> > > From: Bin Meng <bin.meng@windriver.com>
> > >
> > > SST flashes require a dummy byte after the address bits.
> > >
> > > Signed-off-by: Bin Meng <bin.meng@windriver.com>
> >
> > I couldn't find a datasheet that says this... But the actual code
> > change looks fine, so:
> >
>
> Please find the SST25VF016B datasheet at
> http://ww1.microchip.com/downloads/en/devicedoc/s71271_04.pdf. The
> fast read sequence is on page 11.

Ah cool. I thought it would be somewhere, I just couldn't find it.

Alistair

>
> > Acked-by: Alistair Francis <alistair.francis@wdc.com>
> >
>
> Thanks!
>
> Regards,
> Bin
Francisco Iglesias Dec. 3, 2020, 8:38 a.m. UTC | #4
Hi Bin and Alistair,

On [2020 Dec 02] Wed 11:40:11, Alistair Francis wrote:
> On Sun, Nov 29, 2020 at 6:55 PM Bin Meng <bmeng.cn@gmail.com> wrote:
> >
> > From: Bin Meng <bin.meng@windriver.com>
> >
> > SST flashes require a dummy byte after the address bits.
> >
> > Signed-off-by: Bin Meng <bin.meng@windriver.com>
> 
> I couldn't find a datasheet that says this... But the actual code
> change looks fine, so:
> 
> Acked-by: Alistair Francis <alistair.francis@wdc.com>
> 
> Alistair
> 
> > ---
> >
> >  hw/block/m25p80.c | 3 +++
> >  1 file changed, 3 insertions(+)
> >
> > diff --git a/hw/block/m25p80.c b/hw/block/m25p80.c
> > index 483925f..9b36762 100644
> > --- a/hw/block/m25p80.c
> > +++ b/hw/block/m25p80.c
> > @@ -825,6 +825,9 @@ static void decode_fast_read_cmd(Flash *s)
> >      s->needed_bytes = get_addr_length(s);
> >      switch (get_man(s)) {
> >      /* Dummy cycles - modeled with bytes writes instead of bits */
> > +    case MAN_SST:
> > +        s->needed_bytes += 1;

1 dummy clk cycle is modelled as 1 byte write (see the comment above), so 1
dummy byte (8 dummy clk cycles) will need +8 above. An option could be to fall
through to the Windbond case below instead (since it seems to operate
likewise). 

Best regards,
Francisco Iglesias


> > +        break;
> >      case MAN_WINBOND:
> >          s->needed_bytes += 8;
> >          break;
> > --
> > 2.7.4
> >
> >
>
Bin Meng Dec. 4, 2020, 7:52 a.m. UTC | #5
Hi Francisco,

On Thu, Dec 3, 2020 at 4:38 PM Francisco Iglesias
<frasse.iglesias@gmail.com> wrote:
>
> Hi Bin and Alistair,
>
> On [2020 Dec 02] Wed 11:40:11, Alistair Francis wrote:
> > On Sun, Nov 29, 2020 at 6:55 PM Bin Meng <bmeng.cn@gmail.com> wrote:
> > >
> > > From: Bin Meng <bin.meng@windriver.com>
> > >
> > > SST flashes require a dummy byte after the address bits.
> > >
> > > Signed-off-by: Bin Meng <bin.meng@windriver.com>
> >
> > I couldn't find a datasheet that says this... But the actual code
> > change looks fine, so:
> >
> > Acked-by: Alistair Francis <alistair.francis@wdc.com>
> >
> > Alistair
> >
> > > ---
> > >
> > >  hw/block/m25p80.c | 3 +++
> > >  1 file changed, 3 insertions(+)
> > >
> > > diff --git a/hw/block/m25p80.c b/hw/block/m25p80.c
> > > index 483925f..9b36762 100644
> > > --- a/hw/block/m25p80.c
> > > +++ b/hw/block/m25p80.c
> > > @@ -825,6 +825,9 @@ static void decode_fast_read_cmd(Flash *s)
> > >      s->needed_bytes = get_addr_length(s);
> > >      switch (get_man(s)) {
> > >      /* Dummy cycles - modeled with bytes writes instead of bits */
> > > +    case MAN_SST:
> > > +        s->needed_bytes += 1;
>
> 1 dummy clk cycle is modelled as 1 byte write (see the comment above), so 1
> dummy byte (8 dummy clk cycles) will need +8 above.

I think you were confused by the WINBOND codes. The comments are
correct. It is modeled with bytes instead of bits, so we should +=1.

> An option could be to fall
> through to the Windbond case below instead (since it seems to operate
> likewise).
>
> Best regards,
> Francisco Iglesias
>
>
> > > +        break;
> > >      case MAN_WINBOND:
> > >          s->needed_bytes += 8;

Actually this is wrong. This should be corrected to +=1. I will
prepare a patch for it.

> > >          break;
> > > --

Regards,
Bin
Francisco Iglesias Dec. 4, 2020, 10:46 a.m. UTC | #6
Hello Bin,

On [2020 Dec 04] Fri 15:52:12, Bin Meng wrote:
> Hi Francisco,
> 
> On Thu, Dec 3, 2020 at 4:38 PM Francisco Iglesias
> <frasse.iglesias@gmail.com> wrote:
> >
> > Hi Bin and Alistair,
> >
> > On [2020 Dec 02] Wed 11:40:11, Alistair Francis wrote:
> > > On Sun, Nov 29, 2020 at 6:55 PM Bin Meng <bmeng.cn@gmail.com> wrote:
> > > >
> > > > From: Bin Meng <bin.meng@windriver.com>
> > > >
> > > > SST flashes require a dummy byte after the address bits.
> > > >
> > > > Signed-off-by: Bin Meng <bin.meng@windriver.com>
> > >
> > > I couldn't find a datasheet that says this... But the actual code
> > > change looks fine, so:
> > >
> > > Acked-by: Alistair Francis <alistair.francis@wdc.com>
> > >
> > > Alistair
> > >
> > > > ---
> > > >
> > > >  hw/block/m25p80.c | 3 +++
> > > >  1 file changed, 3 insertions(+)
> > > >
> > > > diff --git a/hw/block/m25p80.c b/hw/block/m25p80.c
> > > > index 483925f..9b36762 100644
> > > > --- a/hw/block/m25p80.c
> > > > +++ b/hw/block/m25p80.c
> > > > @@ -825,6 +825,9 @@ static void decode_fast_read_cmd(Flash *s)
> > > >      s->needed_bytes = get_addr_length(s);
> > > >      switch (get_man(s)) {
> > > >      /* Dummy cycles - modeled with bytes writes instead of bits */
> > > > +    case MAN_SST:
> > > > +        s->needed_bytes += 1;
> >
> > 1 dummy clk cycle is modelled as 1 byte write (see the comment above), so 1
> > dummy byte (8 dummy clk cycles) will need +8 above.
> 
> I think you were confused by the WINBOND codes. The comments are
> correct. It is modeled with bytes instead of bits, so we should +=1.

What the comment says is (perhaps not superclear) that 1 dummy clock cycle
is modeled as one 1 byte write into the flash (meaining that 8 byte writes
are needed for 1 dummy byte). Perhaps it is easier to understand
looking into how the controllers issue the command towards the flash model
(for example the xilinx_spips), the start of the FAST_READ cmd is issued
as writing the following into the flash: 1 byte (cmd), 3 bytes (address),
8 bytes (8 dummy cycles -> 1 dummy byte).

Best regards,
Francisco Iglesias

> 
> > An option could be to fall
> > through to the Windbond case below instead (since it seems to operate
> > likewise).
> >
> > Best regards,
> > Francisco Iglesias
> >
> >
> > > > +        break;
> > > >      case MAN_WINBOND:
> > > >          s->needed_bytes += 8;
> 
> Actually this is wrong. This should be corrected to +=1. I will
> prepare a patch for it.
> 
> > > >          break;
> > > > --
> 
> Regards,
> Bin
Bin Meng Dec. 4, 2020, 10:52 a.m. UTC | #7
Hi Francisco,

On Fri, Dec 4, 2020 at 6:46 PM Francisco Iglesias
<frasse.iglesias@gmail.com> wrote:
>
> Hello Bin,
>
> On [2020 Dec 04] Fri 15:52:12, Bin Meng wrote:
> > Hi Francisco,
> >
> > On Thu, Dec 3, 2020 at 4:38 PM Francisco Iglesias
> > <frasse.iglesias@gmail.com> wrote:
> > >
> > > Hi Bin and Alistair,
> > >
> > > On [2020 Dec 02] Wed 11:40:11, Alistair Francis wrote:
> > > > On Sun, Nov 29, 2020 at 6:55 PM Bin Meng <bmeng.cn@gmail.com> wrote:
> > > > >
> > > > > From: Bin Meng <bin.meng@windriver.com>
> > > > >
> > > > > SST flashes require a dummy byte after the address bits.
> > > > >
> > > > > Signed-off-by: Bin Meng <bin.meng@windriver.com>
> > > >
> > > > I couldn't find a datasheet that says this... But the actual code
> > > > change looks fine, so:
> > > >
> > > > Acked-by: Alistair Francis <alistair.francis@wdc.com>
> > > >
> > > > Alistair
> > > >
> > > > > ---
> > > > >
> > > > >  hw/block/m25p80.c | 3 +++
> > > > >  1 file changed, 3 insertions(+)
> > > > >
> > > > > diff --git a/hw/block/m25p80.c b/hw/block/m25p80.c
> > > > > index 483925f..9b36762 100644
> > > > > --- a/hw/block/m25p80.c
> > > > > +++ b/hw/block/m25p80.c
> > > > > @@ -825,6 +825,9 @@ static void decode_fast_read_cmd(Flash *s)
> > > > >      s->needed_bytes = get_addr_length(s);
> > > > >      switch (get_man(s)) {
> > > > >      /* Dummy cycles - modeled with bytes writes instead of bits */
> > > > > +    case MAN_SST:
> > > > > +        s->needed_bytes += 1;
> > >
> > > 1 dummy clk cycle is modelled as 1 byte write (see the comment above), so 1
> > > dummy byte (8 dummy clk cycles) will need +8 above.
> >
> > I think you were confused by the WINBOND codes. The comments are
> > correct. It is modeled with bytes instead of bits, so we should +=1.
>
> What the comment says is (perhaps not superclear) that 1 dummy clock cycle
> is modeled as one 1 byte write into the flash (meaining that 8 byte writes
> are needed for 1 dummy byte). Perhaps it is easier to understand
> looking into how the controllers issue the command towards the flash model
> (for example the xilinx_spips), the start of the FAST_READ cmd is issued
> as writing the following into the flash: 1 byte (cmd), 3 bytes (address),
> 8 bytes (8 dummy cycles -> 1 dummy byte).
>

My interpretation of the comments are opposite: one cycle is a bit,
but we are not using bits, instead we are using bytes.

Testing shows that +=1 is the correct way with the imx_spi controller,
and with my SiFive SPI model in my local tree (not upstreamed yet)

Regards,
Bin
Francisco Iglesias Dec. 4, 2020, 11:28 a.m. UTC | #8
Hello Bin,

On [2020 Dec 04] Fri 18:52:50, Bin Meng wrote:
> Hi Francisco,
> 
> On Fri, Dec 4, 2020 at 6:46 PM Francisco Iglesias
> <frasse.iglesias@gmail.com> wrote:
> >
> > Hello Bin,
> >
> > On [2020 Dec 04] Fri 15:52:12, Bin Meng wrote:
> > > Hi Francisco,
> > >
> > > On Thu, Dec 3, 2020 at 4:38 PM Francisco Iglesias
> > > <frasse.iglesias@gmail.com> wrote:
> > > >
> > > > Hi Bin and Alistair,
> > > >
> > > > On [2020 Dec 02] Wed 11:40:11, Alistair Francis wrote:
> > > > > On Sun, Nov 29, 2020 at 6:55 PM Bin Meng <bmeng.cn@gmail.com> wrote:
> > > > > >
> > > > > > From: Bin Meng <bin.meng@windriver.com>
> > > > > >
> > > > > > SST flashes require a dummy byte after the address bits.
> > > > > >
> > > > > > Signed-off-by: Bin Meng <bin.meng@windriver.com>
> > > > >
> > > > > I couldn't find a datasheet that says this... But the actual code
> > > > > change looks fine, so:
> > > > >
> > > > > Acked-by: Alistair Francis <alistair.francis@wdc.com>
> > > > >
> > > > > Alistair
> > > > >
> > > > > > ---
> > > > > >
> > > > > >  hw/block/m25p80.c | 3 +++
> > > > > >  1 file changed, 3 insertions(+)
> > > > > >
> > > > > > diff --git a/hw/block/m25p80.c b/hw/block/m25p80.c
> > > > > > index 483925f..9b36762 100644
> > > > > > --- a/hw/block/m25p80.c
> > > > > > +++ b/hw/block/m25p80.c
> > > > > > @@ -825,6 +825,9 @@ static void decode_fast_read_cmd(Flash *s)
> > > > > >      s->needed_bytes = get_addr_length(s);
> > > > > >      switch (get_man(s)) {
> > > > > >      /* Dummy cycles - modeled with bytes writes instead of bits */
> > > > > > +    case MAN_SST:
> > > > > > +        s->needed_bytes += 1;
> > > >
> > > > 1 dummy clk cycle is modelled as 1 byte write (see the comment above), so 1
> > > > dummy byte (8 dummy clk cycles) will need +8 above.
> > >
> > > I think you were confused by the WINBOND codes. The comments are
> > > correct. It is modeled with bytes instead of bits, so we should +=1.
> >
> > What the comment says is (perhaps not superclear) that 1 dummy clock cycle
> > is modeled as one 1 byte write into the flash (meaining that 8 byte writes
> > are needed for 1 dummy byte). Perhaps it is easier to understand
> > looking into how the controllers issue the command towards the flash model
> > (for example the xilinx_spips), the start of the FAST_READ cmd is issued
> > as writing the following into the flash: 1 byte (cmd), 3 bytes (address),
> > 8 bytes (8 dummy cycles -> 1 dummy byte).
> >
> 
> My interpretation of the comments are opposite: one cycle is a bit,
> but we are not using bits, instead we are using bytes.

Yes, the mentioning of 'bits' in the comment makes it not very clear at first read.
Maybe just bellow would have been better:

/* Dummy clock cycles - modeled with bytes writes */

> 
> Testing shows that +=1 is the correct way with the imx_spi controller,
> and with my SiFive SPI model in my local tree (not upstreamed yet)

Perhaps an option could be to look into how the aspeed_smc, xilinx_spips or the
npcm7xx_fiu generate dummy clock cycles and see if a similar solution to one of
those could work aswell for the imx_spi?

Regarding this patch, with += 8 it looks correct to me (and will work with
above controllers as far as I can see).

Best regards,
Francisco Iglesias

> 
> Regards,
> Bin
Bin Meng Dec. 11, 2020, 6:07 a.m. UTC | #9
Hi Francisco,

On Fri, Dec 4, 2020 at 7:28 PM Francisco Iglesias
<frasse.iglesias@gmail.com> wrote:
>
> Hello Bin,
>
> On [2020 Dec 04] Fri 18:52:50, Bin Meng wrote:
> > Hi Francisco,
> >
> > On Fri, Dec 4, 2020 at 6:46 PM Francisco Iglesias
> > <frasse.iglesias@gmail.com> wrote:
> > >
> > > Hello Bin,
> > >
> > > On [2020 Dec 04] Fri 15:52:12, Bin Meng wrote:
> > > > Hi Francisco,
> > > >
> > > > On Thu, Dec 3, 2020 at 4:38 PM Francisco Iglesias
> > > > <frasse.iglesias@gmail.com> wrote:
> > > > >
> > > > > Hi Bin and Alistair,
> > > > >
> > > > > On [2020 Dec 02] Wed 11:40:11, Alistair Francis wrote:
> > > > > > On Sun, Nov 29, 2020 at 6:55 PM Bin Meng <bmeng.cn@gmail.com> wrote:
> > > > > > >
> > > > > > > From: Bin Meng <bin.meng@windriver.com>
> > > > > > >
> > > > > > > SST flashes require a dummy byte after the address bits.
> > > > > > >
> > > > > > > Signed-off-by: Bin Meng <bin.meng@windriver.com>
> > > > > >
> > > > > > I couldn't find a datasheet that says this... But the actual code
> > > > > > change looks fine, so:
> > > > > >
> > > > > > Acked-by: Alistair Francis <alistair.francis@wdc.com>
> > > > > >
> > > > > > Alistair
> > > > > >
> > > > > > > ---
> > > > > > >
> > > > > > >  hw/block/m25p80.c | 3 +++
> > > > > > >  1 file changed, 3 insertions(+)
> > > > > > >
> > > > > > > diff --git a/hw/block/m25p80.c b/hw/block/m25p80.c
> > > > > > > index 483925f..9b36762 100644
> > > > > > > --- a/hw/block/m25p80.c
> > > > > > > +++ b/hw/block/m25p80.c
> > > > > > > @@ -825,6 +825,9 @@ static void decode_fast_read_cmd(Flash *s)
> > > > > > >      s->needed_bytes = get_addr_length(s);
> > > > > > >      switch (get_man(s)) {
> > > > > > >      /* Dummy cycles - modeled with bytes writes instead of bits */
> > > > > > > +    case MAN_SST:
> > > > > > > +        s->needed_bytes += 1;
> > > > >
> > > > > 1 dummy clk cycle is modelled as 1 byte write (see the comment above), so 1
> > > > > dummy byte (8 dummy clk cycles) will need +8 above.
> > > >
> > > > I think you were confused by the WINBOND codes. The comments are
> > > > correct. It is modeled with bytes instead of bits, so we should +=1.
> > >
> > > What the comment says is (perhaps not superclear) that 1 dummy clock cycle
> > > is modeled as one 1 byte write into the flash (meaining that 8 byte writes
> > > are needed for 1 dummy byte). Perhaps it is easier to understand
> > > looking into how the controllers issue the command towards the flash model
> > > (for example the xilinx_spips), the start of the FAST_READ cmd is issued
> > > as writing the following into the flash: 1 byte (cmd), 3 bytes (address),
> > > 8 bytes (8 dummy cycles -> 1 dummy byte).
> > >
> >
> > My interpretation of the comments are opposite: one cycle is a bit,
> > but we are not using bits, instead we are using bytes.
>
> Yes, the mentioning of 'bits' in the comment makes it not very clear at first read.
> Maybe just bellow would have been better:
>
> /* Dummy clock cycles - modeled with bytes writes */
>
> >
> > Testing shows that +=1 is the correct way with the imx_spi controller,
> > and with my SiFive SPI model in my local tree (not upstreamed yet)
>
> Perhaps an option could be to look into how the aspeed_smc, xilinx_spips or the
> npcm7xx_fiu generate dummy clock cycles and see if a similar solution to one of
> those could work aswell for the imx_spi?
>

Thanks for pointing this out. So there is some inconsistency among
different SPI controller modeling.

Or maybe fixing aspeed_smc, xilinx_spips and npcm7xx_fiu to work like
imx_spi? Which one is the expected behavior for dummy cycles?

> Regarding this patch, with += 8 it looks correct to me (and will work with
> above controllers as far as I can see).
>

Regards,
Bin
Francisco Iglesias Dec. 11, 2020, 3:16 p.m. UTC | #10
Hello Bin,

On [2020 Dec 11] Fri 14:07:21, Bin Meng wrote:
> Hi Francisco,
> 
> On Fri, Dec 4, 2020 at 7:28 PM Francisco Iglesias
> <frasse.iglesias@gmail.com> wrote:
> >
> > Hello Bin,
> >
> > On [2020 Dec 04] Fri 18:52:50, Bin Meng wrote:
> > > Hi Francisco,
> > >
> > > On Fri, Dec 4, 2020 at 6:46 PM Francisco Iglesias
> > > <frasse.iglesias@gmail.com> wrote:
> > > >
> > > > Hello Bin,
> > > >
> > > > On [2020 Dec 04] Fri 15:52:12, Bin Meng wrote:
> > > > > Hi Francisco,
> > > > >
> > > > > On Thu, Dec 3, 2020 at 4:38 PM Francisco Iglesias
> > > > > <frasse.iglesias@gmail.com> wrote:
> > > > > >
> > > > > > Hi Bin and Alistair,
> > > > > >
> > > > > > On [2020 Dec 02] Wed 11:40:11, Alistair Francis wrote:
> > > > > > > On Sun, Nov 29, 2020 at 6:55 PM Bin Meng <bmeng.cn@gmail.com> wrote:
> > > > > > > >
> > > > > > > > From: Bin Meng <bin.meng@windriver.com>
> > > > > > > >
> > > > > > > > SST flashes require a dummy byte after the address bits.
> > > > > > > >
> > > > > > > > Signed-off-by: Bin Meng <bin.meng@windriver.com>
> > > > > > >
> > > > > > > I couldn't find a datasheet that says this... But the actual code
> > > > > > > change looks fine, so:
> > > > > > >
> > > > > > > Acked-by: Alistair Francis <alistair.francis@wdc.com>
> > > > > > >
> > > > > > > Alistair
> > > > > > >
> > > > > > > > ---
> > > > > > > >
> > > > > > > >  hw/block/m25p80.c | 3 +++
> > > > > > > >  1 file changed, 3 insertions(+)
> > > > > > > >
> > > > > > > > diff --git a/hw/block/m25p80.c b/hw/block/m25p80.c
> > > > > > > > index 483925f..9b36762 100644
> > > > > > > > --- a/hw/block/m25p80.c
> > > > > > > > +++ b/hw/block/m25p80.c
> > > > > > > > @@ -825,6 +825,9 @@ static void decode_fast_read_cmd(Flash *s)
> > > > > > > >      s->needed_bytes = get_addr_length(s);
> > > > > > > >      switch (get_man(s)) {
> > > > > > > >      /* Dummy cycles - modeled with bytes writes instead of bits */
> > > > > > > > +    case MAN_SST:
> > > > > > > > +        s->needed_bytes += 1;
> > > > > >
> > > > > > 1 dummy clk cycle is modelled as 1 byte write (see the comment above), so 1
> > > > > > dummy byte (8 dummy clk cycles) will need +8 above.
> > > > >
> > > > > I think you were confused by the WINBOND codes. The comments are
> > > > > correct. It is modeled with bytes instead of bits, so we should +=1.
> > > >
> > > > What the comment says is (perhaps not superclear) that 1 dummy clock cycle
> > > > is modeled as one 1 byte write into the flash (meaining that 8 byte writes
> > > > are needed for 1 dummy byte). Perhaps it is easier to understand
> > > > looking into how the controllers issue the command towards the flash model
> > > > (for example the xilinx_spips), the start of the FAST_READ cmd is issued
> > > > as writing the following into the flash: 1 byte (cmd), 3 bytes (address),
> > > > 8 bytes (8 dummy cycles -> 1 dummy byte).
> > > >
> > >
> > > My interpretation of the comments are opposite: one cycle is a bit,
> > > but we are not using bits, instead we are using bytes.
> >
> > Yes, the mentioning of 'bits' in the comment makes it not very clear at first read.
> > Maybe just bellow would have been better:
> >
> > /* Dummy clock cycles - modeled with bytes writes */
> >
> > >
> > > Testing shows that +=1 is the correct way with the imx_spi controller,
> > > and with my SiFive SPI model in my local tree (not upstreamed yet)
> >
> > Perhaps an option could be to look into how the aspeed_smc, xilinx_spips or the
> > npcm7xx_fiu generate dummy clock cycles and see if a similar solution to one of
> > those could work aswell for the imx_spi?
> >
> 
> Thanks for pointing this out. So there is some inconsistency among
> different SPI controller modeling.

I'm not sure I understand you correctly but the controllers supporting
commands with dummy clock cycles can only do it following the modeled
approach, so I would rather say it is pretty consistent across the
controllers (not all controllers support these commands though).

> 
> Or maybe fixing aspeed_smc, xilinx_spips and npcm7xx_fiu to work like
> imx_spi?

For me I would say no to above (it makes more sense to let new controllers
implement the currently modeled approach).

> Which one is the expected behavior for dummy cycles?

Dummy clock cycles are modeled as 1 byte written to the flash per dummy clock
cycle (expected behavior).

Best regards,
Francisco Iglesias

> 
> > Regarding this patch, with += 8 it looks correct to me (and will work with
> > above controllers as far as I can see).
> >
> 
> Regards,
> Bin
Bin Meng Dec. 11, 2020, 3:29 p.m. UTC | #11
Hi Francisco,

On Fri, Dec 11, 2020 at 11:16 PM Francisco Iglesias
<frasse.iglesias@gmail.com> wrote:
>
> Hello Bin,
>
> On [2020 Dec 11] Fri 14:07:21, Bin Meng wrote:
> > Hi Francisco,
> >
> > On Fri, Dec 4, 2020 at 7:28 PM Francisco Iglesias
> > <frasse.iglesias@gmail.com> wrote:
> > >
> > > Hello Bin,
> > >
> > > On [2020 Dec 04] Fri 18:52:50, Bin Meng wrote:
> > > > Hi Francisco,
> > > >
> > > > On Fri, Dec 4, 2020 at 6:46 PM Francisco Iglesias
> > > > <frasse.iglesias@gmail.com> wrote:
> > > > >
> > > > > Hello Bin,
> > > > >
> > > > > On [2020 Dec 04] Fri 15:52:12, Bin Meng wrote:
> > > > > > Hi Francisco,
> > > > > >
> > > > > > On Thu, Dec 3, 2020 at 4:38 PM Francisco Iglesias
> > > > > > <frasse.iglesias@gmail.com> wrote:
> > > > > > >
> > > > > > > Hi Bin and Alistair,
> > > > > > >
> > > > > > > On [2020 Dec 02] Wed 11:40:11, Alistair Francis wrote:
> > > > > > > > On Sun, Nov 29, 2020 at 6:55 PM Bin Meng <bmeng.cn@gmail.com> wrote:
> > > > > > > > >
> > > > > > > > > From: Bin Meng <bin.meng@windriver.com>
> > > > > > > > >
> > > > > > > > > SST flashes require a dummy byte after the address bits.
> > > > > > > > >
> > > > > > > > > Signed-off-by: Bin Meng <bin.meng@windriver.com>
> > > > > > > >
> > > > > > > > I couldn't find a datasheet that says this... But the actual code
> > > > > > > > change looks fine, so:
> > > > > > > >
> > > > > > > > Acked-by: Alistair Francis <alistair.francis@wdc.com>
> > > > > > > >
> > > > > > > > Alistair
> > > > > > > >
> > > > > > > > > ---
> > > > > > > > >
> > > > > > > > >  hw/block/m25p80.c | 3 +++
> > > > > > > > >  1 file changed, 3 insertions(+)
> > > > > > > > >
> > > > > > > > > diff --git a/hw/block/m25p80.c b/hw/block/m25p80.c
> > > > > > > > > index 483925f..9b36762 100644
> > > > > > > > > --- a/hw/block/m25p80.c
> > > > > > > > > +++ b/hw/block/m25p80.c
> > > > > > > > > @@ -825,6 +825,9 @@ static void decode_fast_read_cmd(Flash *s)
> > > > > > > > >      s->needed_bytes = get_addr_length(s);
> > > > > > > > >      switch (get_man(s)) {
> > > > > > > > >      /* Dummy cycles - modeled with bytes writes instead of bits */
> > > > > > > > > +    case MAN_SST:
> > > > > > > > > +        s->needed_bytes += 1;
> > > > > > >
> > > > > > > 1 dummy clk cycle is modelled as 1 byte write (see the comment above), so 1
> > > > > > > dummy byte (8 dummy clk cycles) will need +8 above.
> > > > > >
> > > > > > I think you were confused by the WINBOND codes. The comments are
> > > > > > correct. It is modeled with bytes instead of bits, so we should +=1.
> > > > >
> > > > > What the comment says is (perhaps not superclear) that 1 dummy clock cycle
> > > > > is modeled as one 1 byte write into the flash (meaining that 8 byte writes
> > > > > are needed for 1 dummy byte). Perhaps it is easier to understand
> > > > > looking into how the controllers issue the command towards the flash model
> > > > > (for example the xilinx_spips), the start of the FAST_READ cmd is issued
> > > > > as writing the following into the flash: 1 byte (cmd), 3 bytes (address),
> > > > > 8 bytes (8 dummy cycles -> 1 dummy byte).
> > > > >
> > > >
> > > > My interpretation of the comments are opposite: one cycle is a bit,
> > > > but we are not using bits, instead we are using bytes.
> > >
> > > Yes, the mentioning of 'bits' in the comment makes it not very clear at first read.
> > > Maybe just bellow would have been better:
> > >
> > > /* Dummy clock cycles - modeled with bytes writes */
> > >
> > > >
> > > > Testing shows that +=1 is the correct way with the imx_spi controller,
> > > > and with my SiFive SPI model in my local tree (not upstreamed yet)
> > >
> > > Perhaps an option could be to look into how the aspeed_smc, xilinx_spips or the
> > > npcm7xx_fiu generate dummy clock cycles and see if a similar solution to one of
> > > those could work aswell for the imx_spi?
> > >
> >
> > Thanks for pointing this out. So there is some inconsistency among
> > different SPI controller modeling.
>
> I'm not sure I understand you correctly but the controllers supporting
> commands with dummy clock cycles can only do it following the modeled
> approach, so I would rather say it is pretty consistent across the
> controllers (not all controllers support these commands though).

I mean there are 2 approaches to emulate the dummy cycles for
different SPI controller models, yet we only have one m25p80 flash
model to work with both of them. Some controllers may choose 1 byte to
emulate 1 dummy clock cycle, but some others choose 1 bit to emulate 1
dummy cycle. This is inconsistent.

>
> >
> > Or maybe fixing aspeed_smc, xilinx_spips and npcm7xx_fiu to work like
> > imx_spi?
>
> For me I would say no to above (it makes more sense to let new controllers
> implement the currently modeled approach).

Yes, we can certainly make them consistent. But the question is which
one is the correct one? I tried to search in the doc but in vain.

>
> > Which one is the expected behavior for dummy cycles?
>
> Dummy clock cycles are modeled as 1 byte written to the flash per dummy clock
> cycle (expected behavior).
>

Like I mentioned before, 8 bytes to emulate 8 dummy clock cycles does
not work with current imx_spi model.

Regards,
Bin
Francisco Iglesias Dec. 11, 2020, 4:11 p.m. UTC | #12
Hello Bin,

On [2020 Dec 11] Fri 23:29:16, Bin Meng wrote:
> Hi Francisco,
> 
> On Fri, Dec 11, 2020 at 11:16 PM Francisco Iglesias
> <frasse.iglesias@gmail.com> wrote:
> >
> > Hello Bin,
> >
> > On [2020 Dec 11] Fri 14:07:21, Bin Meng wrote:
> > > Hi Francisco,
> > >
> > > On Fri, Dec 4, 2020 at 7:28 PM Francisco Iglesias
> > > <frasse.iglesias@gmail.com> wrote:
> > > >
> > > > Hello Bin,
> > > >
> > > > On [2020 Dec 04] Fri 18:52:50, Bin Meng wrote:
> > > > > Hi Francisco,
> > > > >
> > > > > On Fri, Dec 4, 2020 at 6:46 PM Francisco Iglesias
> > > > > <frasse.iglesias@gmail.com> wrote:
> > > > > >
> > > > > > Hello Bin,
> > > > > >
> > > > > > On [2020 Dec 04] Fri 15:52:12, Bin Meng wrote:
> > > > > > > Hi Francisco,
> > > > > > >
> > > > > > > On Thu, Dec 3, 2020 at 4:38 PM Francisco Iglesias
> > > > > > > <frasse.iglesias@gmail.com> wrote:
> > > > > > > >
> > > > > > > > Hi Bin and Alistair,
> > > > > > > >
> > > > > > > > On [2020 Dec 02] Wed 11:40:11, Alistair Francis wrote:
> > > > > > > > > On Sun, Nov 29, 2020 at 6:55 PM Bin Meng <bmeng.cn@gmail.com> wrote:
> > > > > > > > > >
> > > > > > > > > > From: Bin Meng <bin.meng@windriver.com>
> > > > > > > > > >
> > > > > > > > > > SST flashes require a dummy byte after the address bits.
> > > > > > > > > >
> > > > > > > > > > Signed-off-by: Bin Meng <bin.meng@windriver.com>
> > > > > > > > >
> > > > > > > > > I couldn't find a datasheet that says this... But the actual code
> > > > > > > > > change looks fine, so:
> > > > > > > > >
> > > > > > > > > Acked-by: Alistair Francis <alistair.francis@wdc.com>
> > > > > > > > >
> > > > > > > > > Alistair
> > > > > > > > >
> > > > > > > > > > ---
> > > > > > > > > >
> > > > > > > > > >  hw/block/m25p80.c | 3 +++
> > > > > > > > > >  1 file changed, 3 insertions(+)
> > > > > > > > > >
> > > > > > > > > > diff --git a/hw/block/m25p80.c b/hw/block/m25p80.c
> > > > > > > > > > index 483925f..9b36762 100644
> > > > > > > > > > --- a/hw/block/m25p80.c
> > > > > > > > > > +++ b/hw/block/m25p80.c
> > > > > > > > > > @@ -825,6 +825,9 @@ static void decode_fast_read_cmd(Flash *s)
> > > > > > > > > >      s->needed_bytes = get_addr_length(s);
> > > > > > > > > >      switch (get_man(s)) {
> > > > > > > > > >      /* Dummy cycles - modeled with bytes writes instead of bits */
> > > > > > > > > > +    case MAN_SST:
> > > > > > > > > > +        s->needed_bytes += 1;
> > > > > > > >
> > > > > > > > 1 dummy clk cycle is modelled as 1 byte write (see the comment above), so 1
> > > > > > > > dummy byte (8 dummy clk cycles) will need +8 above.
> > > > > > >
> > > > > > > I think you were confused by the WINBOND codes. The comments are
> > > > > > > correct. It is modeled with bytes instead of bits, so we should +=1.
> > > > > >
> > > > > > What the comment says is (perhaps not superclear) that 1 dummy clock cycle
> > > > > > is modeled as one 1 byte write into the flash (meaining that 8 byte writes
> > > > > > are needed for 1 dummy byte). Perhaps it is easier to understand
> > > > > > looking into how the controllers issue the command towards the flash model
> > > > > > (for example the xilinx_spips), the start of the FAST_READ cmd is issued
> > > > > > as writing the following into the flash: 1 byte (cmd), 3 bytes (address),
> > > > > > 8 bytes (8 dummy cycles -> 1 dummy byte).
> > > > > >
> > > > >
> > > > > My interpretation of the comments are opposite: one cycle is a bit,
> > > > > but we are not using bits, instead we are using bytes.
> > > >
> > > > Yes, the mentioning of 'bits' in the comment makes it not very clear at first read.
> > > > Maybe just bellow would have been better:
> > > >
> > > > /* Dummy clock cycles - modeled with bytes writes */
> > > >
> > > > >
> > > > > Testing shows that +=1 is the correct way with the imx_spi controller,
> > > > > and with my SiFive SPI model in my local tree (not upstreamed yet)
> > > >
> > > > Perhaps an option could be to look into how the aspeed_smc, xilinx_spips or the
> > > > npcm7xx_fiu generate dummy clock cycles and see if a similar solution to one of
> > > > those could work aswell for the imx_spi?
> > > >
> > >
> > > Thanks for pointing this out. So there is some inconsistency among
> > > different SPI controller modeling.
> >
> > I'm not sure I understand you correctly but the controllers supporting
> > commands with dummy clock cycles can only do it following the modeled
> > approach, so I would rather say it is pretty consistent across the
> > controllers (not all controllers support these commands though).
> 
> I mean there are 2 approaches to emulate the dummy cycles for

There is currently only 1 way of modeling dummy clock cycles. All commands that
require / support them in m25p80 goes with that approach. An the controllers
that support dummy clock cycles uses that approach. 

> different SPI controller models, yet we only have one m25p80 flash
> model to work with both of them. Some controllers may choose 1 byte to
> emulate 1 dummy clock cycle, but some others choose 1 bit to emulate 1
> dummy cycle. This is inconsistent.
> 
> >
> > >
> > > Or maybe fixing aspeed_smc, xilinx_spips and npcm7xx_fiu to work like
> > > imx_spi?
> >
> > For me I would say no to above (it makes more sense to let new controllers
> > implement the currently modeled approach).
> 
> Yes, we can certainly make them consistent. But the question is which
> one is the correct one? I tried to search in the doc but in vain.

I think it is better to look into the code instead of doc/comments since it
might be outdated/misinterpreted.

> 
> >
> > > Which one is the expected behavior for dummy cycles?
> >
> > Dummy clock cycles are modeled as 1 byte written to the flash per dummy clock
> > cycle (expected behavior).
> >
> 
> Like I mentioned before, 8 bytes to emulate 8 dummy clock cycles does
> not work with current imx_spi model.

I've understood this and that is why I'm trying to explain that it needs to
convert the dummy byte to 8 writes (1 for each clock cycle). At the moment the
imx_spi does not support dummy clock cycle generation as far as I can see.

This patch works for the imx_xpi with the SST flash connected because it's
hardcoding the flash to use 1 dummy clock cycle (speaking m25p80.c language)
and then issues 1 byte write (for the dummy byte) from the model (matching it).
If you switch to another flash, for example Windbond, the FAST_READ command
won't work for the imx_spi anymore. And if you use an SST flash (using this
patch) with another controller that has implemented dummy cycle generation
the FAST_READ command won't work.

Best regards,
Francisco Iglesias

> 
> Regards,
> Bin
Bin Meng Dec. 12, 2020, 8:16 a.m. UTC | #13
Hi Francisco,

On Sat, Dec 12, 2020 at 12:11 AM Francisco Iglesias
<frasse.iglesias@gmail.com> wrote:
>
> Hello Bin,
>
> On [2020 Dec 11] Fri 23:29:16, Bin Meng wrote:
> > Hi Francisco,
> >
> > On Fri, Dec 11, 2020 at 11:16 PM Francisco Iglesias
> > <frasse.iglesias@gmail.com> wrote:
> > >
> > > Hello Bin,
> > >
> > > On [2020 Dec 11] Fri 14:07:21, Bin Meng wrote:
> > > > Hi Francisco,
> > > >
> > > > On Fri, Dec 4, 2020 at 7:28 PM Francisco Iglesias
> > > > <frasse.iglesias@gmail.com> wrote:
> > > > >
> > > > > Hello Bin,
> > > > >
> > > > > On [2020 Dec 04] Fri 18:52:50, Bin Meng wrote:
> > > > > > Hi Francisco,
> > > > > >
> > > > > > On Fri, Dec 4, 2020 at 6:46 PM Francisco Iglesias
> > > > > > <frasse.iglesias@gmail.com> wrote:
> > > > > > >
> > > > > > > Hello Bin,
> > > > > > >
> > > > > > > On [2020 Dec 04] Fri 15:52:12, Bin Meng wrote:
> > > > > > > > Hi Francisco,
> > > > > > > >
> > > > > > > > On Thu, Dec 3, 2020 at 4:38 PM Francisco Iglesias
> > > > > > > > <frasse.iglesias@gmail.com> wrote:
> > > > > > > > >
> > > > > > > > > Hi Bin and Alistair,
> > > > > > > > >
> > > > > > > > > On [2020 Dec 02] Wed 11:40:11, Alistair Francis wrote:
> > > > > > > > > > On Sun, Nov 29, 2020 at 6:55 PM Bin Meng <bmeng.cn@gmail.com> wrote:
> > > > > > > > > > >
> > > > > > > > > > > From: Bin Meng <bin.meng@windriver.com>
> > > > > > > > > > >
> > > > > > > > > > > SST flashes require a dummy byte after the address bits.
> > > > > > > > > > >
> > > > > > > > > > > Signed-off-by: Bin Meng <bin.meng@windriver.com>
> > > > > > > > > >
> > > > > > > > > > I couldn't find a datasheet that says this... But the actual code
> > > > > > > > > > change looks fine, so:
> > > > > > > > > >
> > > > > > > > > > Acked-by: Alistair Francis <alistair.francis@wdc.com>
> > > > > > > > > >
> > > > > > > > > > Alistair
> > > > > > > > > >
> > > > > > > > > > > ---
> > > > > > > > > > >
> > > > > > > > > > >  hw/block/m25p80.c | 3 +++
> > > > > > > > > > >  1 file changed, 3 insertions(+)
> > > > > > > > > > >
> > > > > > > > > > > diff --git a/hw/block/m25p80.c b/hw/block/m25p80.c
> > > > > > > > > > > index 483925f..9b36762 100644
> > > > > > > > > > > --- a/hw/block/m25p80.c
> > > > > > > > > > > +++ b/hw/block/m25p80.c
> > > > > > > > > > > @@ -825,6 +825,9 @@ static void decode_fast_read_cmd(Flash *s)
> > > > > > > > > > >      s->needed_bytes = get_addr_length(s);
> > > > > > > > > > >      switch (get_man(s)) {
> > > > > > > > > > >      /* Dummy cycles - modeled with bytes writes instead of bits */
> > > > > > > > > > > +    case MAN_SST:
> > > > > > > > > > > +        s->needed_bytes += 1;
> > > > > > > > >
> > > > > > > > > 1 dummy clk cycle is modelled as 1 byte write (see the comment above), so 1
> > > > > > > > > dummy byte (8 dummy clk cycles) will need +8 above.
> > > > > > > >
> > > > > > > > I think you were confused by the WINBOND codes. The comments are
> > > > > > > > correct. It is modeled with bytes instead of bits, so we should +=1.
> > > > > > >
> > > > > > > What the comment says is (perhaps not superclear) that 1 dummy clock cycle
> > > > > > > is modeled as one 1 byte write into the flash (meaining that 8 byte writes
> > > > > > > are needed for 1 dummy byte). Perhaps it is easier to understand
> > > > > > > looking into how the controllers issue the command towards the flash model
> > > > > > > (for example the xilinx_spips), the start of the FAST_READ cmd is issued
> > > > > > > as writing the following into the flash: 1 byte (cmd), 3 bytes (address),
> > > > > > > 8 bytes (8 dummy cycles -> 1 dummy byte).
> > > > > > >
> > > > > >
> > > > > > My interpretation of the comments are opposite: one cycle is a bit,
> > > > > > but we are not using bits, instead we are using bytes.
> > > > >
> > > > > Yes, the mentioning of 'bits' in the comment makes it not very clear at first read.
> > > > > Maybe just bellow would have been better:
> > > > >
> > > > > /* Dummy clock cycles - modeled with bytes writes */
> > > > >
> > > > > >
> > > > > > Testing shows that +=1 is the correct way with the imx_spi controller,
> > > > > > and with my SiFive SPI model in my local tree (not upstreamed yet)
> > > > >
> > > > > Perhaps an option could be to look into how the aspeed_smc, xilinx_spips or the
> > > > > npcm7xx_fiu generate dummy clock cycles and see if a similar solution to one of
> > > > > those could work aswell for the imx_spi?
> > > > >
> > > >
> > > > Thanks for pointing this out. So there is some inconsistency among
> > > > different SPI controller modeling.
> > >
> > > I'm not sure I understand you correctly but the controllers supporting
> > > commands with dummy clock cycles can only do it following the modeled
> > > approach, so I would rather say it is pretty consistent across the
> > > controllers (not all controllers support these commands though).
> >
> > I mean there are 2 approaches to emulate the dummy cycles for
>
> There is currently only 1 way of modeling dummy clock cycles. All commands that
> require / support them in m25p80 goes with that approach. An the controllers
> that support dummy clock cycles uses that approach.

No, there are 2 ways. One way is how aspeed_smc, xilinx_spips and
npcm7xx do for dummy cycles. For these controllers, there are hardware
registers for dummy cycles, and software does not need to write
anything into the tx fifo. These models emulate one dummy cycle by
transferring one byte one the SPI line so we see there are actually a
number of (bit * 8) bytes needed in decode_fast_read_cmd(). The other
way is how imx_spi, mss-spi, pl022, stm32f2xx_spi and xilinx_spi. For
these controllers, they just transfer whatever is written by guest
software to tx fifo without any special awareness of dummy cycles.

>
> > different SPI controller models, yet we only have one m25p80 flash
> > model to work with both of them. Some controllers may choose 1 byte to
> > emulate 1 dummy clock cycle, but some others choose 1 bit to emulate 1
> > dummy cycle. This is inconsistent.
> >
> > >
> > > >
> > > > Or maybe fixing aspeed_smc, xilinx_spips and npcm7xx_fiu to work like
> > > > imx_spi?
> > >
> > > For me I would say no to above (it makes more sense to let new controllers
> > > implement the currently modeled approach).
> >
> > Yes, we can certainly make them consistent. But the question is which
> > one is the correct one? I tried to search in the doc but in vain.
>
> I think it is better to look into the code instead of doc/comments since it
> might be outdated/misinterpreted.
>

But the codes have 2 different approaches and I don't know which one
was the original intended behavior.

> >
> > >
> > > > Which one is the expected behavior for dummy cycles?
> > >
> > > Dummy clock cycles are modeled as 1 byte written to the flash per dummy clock
> > > cycle (expected behavior).
> > >
> >
> > Like I mentioned before, 8 bytes to emulate 8 dummy clock cycles does
> > not work with current imx_spi model.
>
> I've understood this and that is why I'm trying to explain that it needs to
> convert the dummy byte to 8 writes (1 for each clock cycle). At the moment the
> imx_spi does not support dummy clock cycle generation as far as I can see.
>
> This patch works for the imx_xpi with the SST flash connected because it's
> hardcoding the flash to use 1 dummy clock cycle (speaking m25p80.c language)
> and then issues 1 byte write (for the dummy byte) from the model (matching it).

Please see above, the imx_spi does not hard coding 1 dummy clock cycle
to be used.

> If you switch to another flash, for example Windbond, the FAST_READ command
> won't work for the imx_spi anymore. And if you use an SST flash (using this
> patch) with another controller that has implemented dummy cycle generation
> the FAST_READ command won't work.

Dividing the Winbound dummy cycles by 8 should work with imx_spi. But
like I mentioned, we should firstly take a consistent approach to
handle dummy cycles in the SPI models.

Regards,
Bin
Francisco Iglesias Dec. 12, 2020, 9:24 a.m. UTC | #14
Hi bin,

On [2020 Dec 12] Sat 16:16:59, Bin Meng wrote:
> Hi Francisco,
> 
> On Sat, Dec 12, 2020 at 12:11 AM Francisco Iglesias
> <frasse.iglesias@gmail.com> wrote:
> >
> > Hello Bin,
> >
> > On [2020 Dec 11] Fri 23:29:16, Bin Meng wrote:
> > > Hi Francisco,
> > >
> > > On Fri, Dec 11, 2020 at 11:16 PM Francisco Iglesias
> > > <frasse.iglesias@gmail.com> wrote:
> > > >
> > > > Hello Bin,
> > > >
> > > > On [2020 Dec 11] Fri 14:07:21, Bin Meng wrote:
> > > > > Hi Francisco,
> > > > >
> > > > > On Fri, Dec 4, 2020 at 7:28 PM Francisco Iglesias
> > > > > <frasse.iglesias@gmail.com> wrote:
> > > > > >
> > > > > > Hello Bin,
> > > > > >
> > > > > > On [2020 Dec 04] Fri 18:52:50, Bin Meng wrote:
> > > > > > > Hi Francisco,
> > > > > > >
> > > > > > > On Fri, Dec 4, 2020 at 6:46 PM Francisco Iglesias
> > > > > > > <frasse.iglesias@gmail.com> wrote:
> > > > > > > >
> > > > > > > > Hello Bin,
> > > > > > > >
> > > > > > > > On [2020 Dec 04] Fri 15:52:12, Bin Meng wrote:
> > > > > > > > > Hi Francisco,
> > > > > > > > >
> > > > > > > > > On Thu, Dec 3, 2020 at 4:38 PM Francisco Iglesias
> > > > > > > > > <frasse.iglesias@gmail.com> wrote:
> > > > > > > > > >
> > > > > > > > > > Hi Bin and Alistair,
> > > > > > > > > >
> > > > > > > > > > On [2020 Dec 02] Wed 11:40:11, Alistair Francis wrote:
> > > > > > > > > > > On Sun, Nov 29, 2020 at 6:55 PM Bin Meng <bmeng.cn@gmail.com> wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > From: Bin Meng <bin.meng@windriver.com>
> > > > > > > > > > > >
> > > > > > > > > > > > SST flashes require a dummy byte after the address bits.
> > > > > > > > > > > >
> > > > > > > > > > > > Signed-off-by: Bin Meng <bin.meng@windriver.com>
> > > > > > > > > > >
> > > > > > > > > > > I couldn't find a datasheet that says this... But the actual code
> > > > > > > > > > > change looks fine, so:
> > > > > > > > > > >
> > > > > > > > > > > Acked-by: Alistair Francis <alistair.francis@wdc.com>
> > > > > > > > > > >
> > > > > > > > > > > Alistair
> > > > > > > > > > >
> > > > > > > > > > > > ---
> > > > > > > > > > > >
> > > > > > > > > > > >  hw/block/m25p80.c | 3 +++
> > > > > > > > > > > >  1 file changed, 3 insertions(+)
> > > > > > > > > > > >
> > > > > > > > > > > > diff --git a/hw/block/m25p80.c b/hw/block/m25p80.c
> > > > > > > > > > > > index 483925f..9b36762 100644
> > > > > > > > > > > > --- a/hw/block/m25p80.c
> > > > > > > > > > > > +++ b/hw/block/m25p80.c
> > > > > > > > > > > > @@ -825,6 +825,9 @@ static void decode_fast_read_cmd(Flash *s)
> > > > > > > > > > > >      s->needed_bytes = get_addr_length(s);
> > > > > > > > > > > >      switch (get_man(s)) {
> > > > > > > > > > > >      /* Dummy cycles - modeled with bytes writes instead of bits */
> > > > > > > > > > > > +    case MAN_SST:
> > > > > > > > > > > > +        s->needed_bytes += 1;
> > > > > > > > > >
> > > > > > > > > > 1 dummy clk cycle is modelled as 1 byte write (see the comment above), so 1
> > > > > > > > > > dummy byte (8 dummy clk cycles) will need +8 above.
> > > > > > > > >
> > > > > > > > > I think you were confused by the WINBOND codes. The comments are
> > > > > > > > > correct. It is modeled with bytes instead of bits, so we should +=1.
> > > > > > > >
> > > > > > > > What the comment says is (perhaps not superclear) that 1 dummy clock cycle
> > > > > > > > is modeled as one 1 byte write into the flash (meaining that 8 byte writes
> > > > > > > > are needed for 1 dummy byte). Perhaps it is easier to understand
> > > > > > > > looking into how the controllers issue the command towards the flash model
> > > > > > > > (for example the xilinx_spips), the start of the FAST_READ cmd is issued
> > > > > > > > as writing the following into the flash: 1 byte (cmd), 3 bytes (address),
> > > > > > > > 8 bytes (8 dummy cycles -> 1 dummy byte).
> > > > > > > >
> > > > > > >
> > > > > > > My interpretation of the comments are opposite: one cycle is a bit,
> > > > > > > but we are not using bits, instead we are using bytes.
> > > > > >
> > > > > > Yes, the mentioning of 'bits' in the comment makes it not very clear at first read.
> > > > > > Maybe just bellow would have been better:
> > > > > >
> > > > > > /* Dummy clock cycles - modeled with bytes writes */
> > > > > >
> > > > > > >
> > > > > > > Testing shows that +=1 is the correct way with the imx_spi controller,
> > > > > > > and with my SiFive SPI model in my local tree (not upstreamed yet)
> > > > > >
> > > > > > Perhaps an option could be to look into how the aspeed_smc, xilinx_spips or the
> > > > > > npcm7xx_fiu generate dummy clock cycles and see if a similar solution to one of
> > > > > > those could work aswell for the imx_spi?
> > > > > >
> > > > >
> > > > > Thanks for pointing this out. So there is some inconsistency among
> > > > > different SPI controller modeling.
> > > >
> > > > I'm not sure I understand you correctly but the controllers supporting
> > > > commands with dummy clock cycles can only do it following the modeled
> > > > approach, so I would rather say it is pretty consistent across the
> > > > controllers (not all controllers support these commands though).
> > >
> > > I mean there are 2 approaches to emulate the dummy cycles for
> >
> > There is currently only 1 way of modeling dummy clock cycles. All commands that
> > require / support them in m25p80 goes with that approach. An the controllers
> > that support dummy clock cycles uses that approach.
> 
> No, there are 2 ways. One way is how aspeed_smc, xilinx_spips and
> npcm7xx do for dummy cycles. For these controllers, there are hardware
> registers for dummy cycles, and software does not need to write
> anything into the tx fifo. These models emulate one dummy cycle by
> transferring one byte one the SPI line so we see there are actually a
> number of (bit * 8) bytes needed in decode_fast_read_cmd(). The other
> way is how imx_spi, mss-spi, pl022, stm32f2xx_spi and xilinx_spi. For
> these controllers, they just transfer whatever is written by guest
> software to tx fifo without any special awareness of dummy cycles.


The xilinx_spips supports above way of transferring a command so you can look
into that one for an example of how to handle a command pushed into a txfifo
with regards to the dummy clock cycles. Not all controllers support generating
dummy clock cycles, meaning that not all do the dummy byte -> dummy clock cycle
conversion. The controllers that do not do this currently does not support
issuing commands requiring them towards m25p80 (as FAST_READ, DOR, QOR etc..).

Best regards,
Francisco Iglesias


> 
> >
> > > different SPI controller models, yet we only have one m25p80 flash
> > > model to work with both of them. Some controllers may choose 1 byte to
> > > emulate 1 dummy clock cycle, but some others choose 1 bit to emulate 1
> > > dummy cycle. This is inconsistent.
> > >
> > > >
> > > > >
> > > > > Or maybe fixing aspeed_smc, xilinx_spips and npcm7xx_fiu to work like
> > > > > imx_spi?
> > > >
> > > > For me I would say no to above (it makes more sense to let new controllers
> > > > implement the currently modeled approach).
> > >
> > > Yes, we can certainly make them consistent. But the question is which
> > > one is the correct one? I tried to search in the doc but in vain.
> >
> > I think it is better to look into the code instead of doc/comments since it
> > might be outdated/misinterpreted.
> >
> 
> But the codes have 2 different approaches and I don't know which one
> was the original intended behavior.
> 
> > >
> > > >
> > > > > Which one is the expected behavior for dummy cycles?
> > > >
> > > > Dummy clock cycles are modeled as 1 byte written to the flash per dummy clock
> > > > cycle (expected behavior).
> > > >
> > >
> > > Like I mentioned before, 8 bytes to emulate 8 dummy clock cycles does
> > > not work with current imx_spi model.
> >
> > I've understood this and that is why I'm trying to explain that it needs to
> > convert the dummy byte to 8 writes (1 for each clock cycle). At the moment the
> > imx_spi does not support dummy clock cycle generation as far as I can see.
> >
> > This patch works for the imx_xpi with the SST flash connected because it's
> > hardcoding the flash to use 1 dummy clock cycle (speaking m25p80.c language)
> > and then issues 1 byte write (for the dummy byte) from the model (matching it).
> 
> Please see above, the imx_spi does not hard coding 1 dummy clock cycle
> to be used.
> 
> > If you switch to another flash, for example Windbond, the FAST_READ command
> > won't work for the imx_spi anymore. And if you use an SST flash (using this
> > patch) with another controller that has implemented dummy cycle generation
> > the FAST_READ command won't work.
> 
> Dividing the Winbound dummy cycles by 8 should work with imx_spi. But
> like I mentioned, we should firstly take a consistent approach to
> handle dummy cycles in the SPI models.
> 
> Regards,
> Bin
Bin Meng Dec. 12, 2020, 9:44 a.m. UTC | #15
Hi Francisco,

On Sat, Dec 12, 2020 at 5:24 PM Francisco Iglesias
<frasse.iglesias@gmail.com> wrote:
>
> Hi bin,
>
> On [2020 Dec 12] Sat 16:16:59, Bin Meng wrote:
> > Hi Francisco,
> >
> > On Sat, Dec 12, 2020 at 12:11 AM Francisco Iglesias
> > <frasse.iglesias@gmail.com> wrote:
> > >
> > > Hello Bin,
> > >
> > > On [2020 Dec 11] Fri 23:29:16, Bin Meng wrote:
> > > > Hi Francisco,
> > > >
> > > > On Fri, Dec 11, 2020 at 11:16 PM Francisco Iglesias
> > > > <frasse.iglesias@gmail.com> wrote:
> > > > >
> > > > > Hello Bin,
> > > > >
> > > > > On [2020 Dec 11] Fri 14:07:21, Bin Meng wrote:
> > > > > > Hi Francisco,
> > > > > >
> > > > > > On Fri, Dec 4, 2020 at 7:28 PM Francisco Iglesias
> > > > > > <frasse.iglesias@gmail.com> wrote:
> > > > > > >
> > > > > > > Hello Bin,
> > > > > > >
> > > > > > > On [2020 Dec 04] Fri 18:52:50, Bin Meng wrote:
> > > > > > > > Hi Francisco,
> > > > > > > >
> > > > > > > > On Fri, Dec 4, 2020 at 6:46 PM Francisco Iglesias
> > > > > > > > <frasse.iglesias@gmail.com> wrote:
> > > > > > > > >
> > > > > > > > > Hello Bin,
> > > > > > > > >
> > > > > > > > > On [2020 Dec 04] Fri 15:52:12, Bin Meng wrote:
> > > > > > > > > > Hi Francisco,
> > > > > > > > > >
> > > > > > > > > > On Thu, Dec 3, 2020 at 4:38 PM Francisco Iglesias
> > > > > > > > > > <frasse.iglesias@gmail.com> wrote:
> > > > > > > > > > >
> > > > > > > > > > > Hi Bin and Alistair,
> > > > > > > > > > >
> > > > > > > > > > > On [2020 Dec 02] Wed 11:40:11, Alistair Francis wrote:
> > > > > > > > > > > > On Sun, Nov 29, 2020 at 6:55 PM Bin Meng <bmeng.cn@gmail.com> wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > From: Bin Meng <bin.meng@windriver.com>
> > > > > > > > > > > > >
> > > > > > > > > > > > > SST flashes require a dummy byte after the address bits.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Signed-off-by: Bin Meng <bin.meng@windriver.com>
> > > > > > > > > > > >
> > > > > > > > > > > > I couldn't find a datasheet that says this... But the actual code
> > > > > > > > > > > > change looks fine, so:
> > > > > > > > > > > >
> > > > > > > > > > > > Acked-by: Alistair Francis <alistair.francis@wdc.com>
> > > > > > > > > > > >
> > > > > > > > > > > > Alistair
> > > > > > > > > > > >
> > > > > > > > > > > > > ---
> > > > > > > > > > > > >
> > > > > > > > > > > > >  hw/block/m25p80.c | 3 +++
> > > > > > > > > > > > >  1 file changed, 3 insertions(+)
> > > > > > > > > > > > >
> > > > > > > > > > > > > diff --git a/hw/block/m25p80.c b/hw/block/m25p80.c
> > > > > > > > > > > > > index 483925f..9b36762 100644
> > > > > > > > > > > > > --- a/hw/block/m25p80.c
> > > > > > > > > > > > > +++ b/hw/block/m25p80.c
> > > > > > > > > > > > > @@ -825,6 +825,9 @@ static void decode_fast_read_cmd(Flash *s)
> > > > > > > > > > > > >      s->needed_bytes = get_addr_length(s);
> > > > > > > > > > > > >      switch (get_man(s)) {
> > > > > > > > > > > > >      /* Dummy cycles - modeled with bytes writes instead of bits */
> > > > > > > > > > > > > +    case MAN_SST:
> > > > > > > > > > > > > +        s->needed_bytes += 1;
> > > > > > > > > > >
> > > > > > > > > > > 1 dummy clk cycle is modelled as 1 byte write (see the comment above), so 1
> > > > > > > > > > > dummy byte (8 dummy clk cycles) will need +8 above.
> > > > > > > > > >
> > > > > > > > > > I think you were confused by the WINBOND codes. The comments are
> > > > > > > > > > correct. It is modeled with bytes instead of bits, so we should +=1.
> > > > > > > > >
> > > > > > > > > What the comment says is (perhaps not superclear) that 1 dummy clock cycle
> > > > > > > > > is modeled as one 1 byte write into the flash (meaining that 8 byte writes
> > > > > > > > > are needed for 1 dummy byte). Perhaps it is easier to understand
> > > > > > > > > looking into how the controllers issue the command towards the flash model
> > > > > > > > > (for example the xilinx_spips), the start of the FAST_READ cmd is issued
> > > > > > > > > as writing the following into the flash: 1 byte (cmd), 3 bytes (address),
> > > > > > > > > 8 bytes (8 dummy cycles -> 1 dummy byte).
> > > > > > > > >
> > > > > > > >
> > > > > > > > My interpretation of the comments are opposite: one cycle is a bit,
> > > > > > > > but we are not using bits, instead we are using bytes.
> > > > > > >
> > > > > > > Yes, the mentioning of 'bits' in the comment makes it not very clear at first read.
> > > > > > > Maybe just bellow would have been better:
> > > > > > >
> > > > > > > /* Dummy clock cycles - modeled with bytes writes */
> > > > > > >
> > > > > > > >
> > > > > > > > Testing shows that +=1 is the correct way with the imx_spi controller,
> > > > > > > > and with my SiFive SPI model in my local tree (not upstreamed yet)
> > > > > > >
> > > > > > > Perhaps an option could be to look into how the aspeed_smc, xilinx_spips or the
> > > > > > > npcm7xx_fiu generate dummy clock cycles and see if a similar solution to one of
> > > > > > > those could work aswell for the imx_spi?
> > > > > > >
> > > > > >
> > > > > > Thanks for pointing this out. So there is some inconsistency among
> > > > > > different SPI controller modeling.
> > > > >
> > > > > I'm not sure I understand you correctly but the controllers supporting
> > > > > commands with dummy clock cycles can only do it following the modeled
> > > > > approach, so I would rather say it is pretty consistent across the
> > > > > controllers (not all controllers support these commands though).
> > > >
> > > > I mean there are 2 approaches to emulate the dummy cycles for
> > >
> > > There is currently only 1 way of modeling dummy clock cycles. All commands that
> > > require / support them in m25p80 goes with that approach. An the controllers
> > > that support dummy clock cycles uses that approach.
> >
> > No, there are 2 ways. One way is how aspeed_smc, xilinx_spips and
> > npcm7xx do for dummy cycles. For these controllers, there are hardware
> > registers for dummy cycles, and software does not need to write
> > anything into the tx fifo. These models emulate one dummy cycle by
> > transferring one byte one the SPI line so we see there are actually a
> > number of (bit * 8) bytes needed in decode_fast_read_cmd(). The other
> > way is how imx_spi, mss-spi, pl022, stm32f2xx_spi and xilinx_spi. For
> > these controllers, they just transfer whatever is written by guest
> > software to tx fifo without any special awareness of dummy cycles.
>
>
> The xilinx_spips supports above way of transferring a command so you can look
> into that one for an example of how to handle a command pushed into a txfifo
> with regards to the dummy clock cycles. Not all controllers support generating
> dummy clock cycles, meaning that not all do the dummy byte -> dummy clock cycle
> conversion. The controllers that do not do this currently does not support
> issuing commands requiring them towards m25p80 (as FAST_READ, DOR, QOR etc..).

No, I don't think inspecting tx fifo to decode the SPI command, and to
insert dummy cycles when seeing FAST_READ is the correct way for these
SPI controllers like imx_spi. The real hardware does not do this and
we should not make them behave like xilinx_spips.

Regards,
Bin
Francisco Iglesias Dec. 15, 2020, 4:40 p.m. UTC | #16
Hello Bin,

On [2020 Dec 12] Sat 17:44:27, Bin Meng wrote:
> Hi Francisco,
> 
> On Sat, Dec 12, 2020 at 5:24 PM Francisco Iglesias
> <frasse.iglesias@gmail.com> wrote:
> >
> > Hi bin,
> >
> > On [2020 Dec 12] Sat 16:16:59, Bin Meng wrote:
> > > Hi Francisco,
> > >
> > > On Sat, Dec 12, 2020 at 12:11 AM Francisco Iglesias
> > > <frasse.iglesias@gmail.com> wrote:
> > > >
> > > > Hello Bin,
> > > >
> > > > On [2020 Dec 11] Fri 23:29:16, Bin Meng wrote:
> > > > > Hi Francisco,
> > > > >
> > > > > On Fri, Dec 11, 2020 at 11:16 PM Francisco Iglesias
> > > > > <frasse.iglesias@gmail.com> wrote:
> > > > > >
> > > > > > Hello Bin,
> > > > > >
> > > > > > On [2020 Dec 11] Fri 14:07:21, Bin Meng wrote:
> > > > > > > Hi Francisco,
> > > > > > >
> > > > > > > On Fri, Dec 4, 2020 at 7:28 PM Francisco Iglesias
> > > > > > > <frasse.iglesias@gmail.com> wrote:
> > > > > > > >
> > > > > > > > Hello Bin,
> > > > > > > >
> > > > > > > > On [2020 Dec 04] Fri 18:52:50, Bin Meng wrote:
> > > > > > > > > Hi Francisco,
> > > > > > > > >
> > > > > > > > > On Fri, Dec 4, 2020 at 6:46 PM Francisco Iglesias
> > > > > > > > > <frasse.iglesias@gmail.com> wrote:
> > > > > > > > > >
> > > > > > > > > > Hello Bin,
> > > > > > > > > >
> > > > > > > > > > On [2020 Dec 04] Fri 15:52:12, Bin Meng wrote:
> > > > > > > > > > > Hi Francisco,
> > > > > > > > > > >
> > > > > > > > > > > On Thu, Dec 3, 2020 at 4:38 PM Francisco Iglesias
> > > > > > > > > > > <frasse.iglesias@gmail.com> wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > Hi Bin and Alistair,
> > > > > > > > > > > >
> > > > > > > > > > > > On [2020 Dec 02] Wed 11:40:11, Alistair Francis wrote:
> > > > > > > > > > > > > On Sun, Nov 29, 2020 at 6:55 PM Bin Meng <bmeng.cn@gmail.com> wrote:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > From: Bin Meng <bin.meng@windriver.com>
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > SST flashes require a dummy byte after the address bits.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Signed-off-by: Bin Meng <bin.meng@windriver.com>
> > > > > > > > > > > > >
> > > > > > > > > > > > > I couldn't find a datasheet that says this... But the actual code
> > > > > > > > > > > > > change looks fine, so:
> > > > > > > > > > > > >
> > > > > > > > > > > > > Acked-by: Alistair Francis <alistair.francis@wdc.com>
> > > > > > > > > > > > >
> > > > > > > > > > > > > Alistair
> > > > > > > > > > > > >
> > > > > > > > > > > > > > ---
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >  hw/block/m25p80.c | 3 +++
> > > > > > > > > > > > > >  1 file changed, 3 insertions(+)
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > diff --git a/hw/block/m25p80.c b/hw/block/m25p80.c
> > > > > > > > > > > > > > index 483925f..9b36762 100644
> > > > > > > > > > > > > > --- a/hw/block/m25p80.c
> > > > > > > > > > > > > > +++ b/hw/block/m25p80.c
> > > > > > > > > > > > > > @@ -825,6 +825,9 @@ static void decode_fast_read_cmd(Flash *s)
> > > > > > > > > > > > > >      s->needed_bytes = get_addr_length(s);
> > > > > > > > > > > > > >      switch (get_man(s)) {
> > > > > > > > > > > > > >      /* Dummy cycles - modeled with bytes writes instead of bits */
> > > > > > > > > > > > > > +    case MAN_SST:
> > > > > > > > > > > > > > +        s->needed_bytes += 1;
> > > > > > > > > > > >
> > > > > > > > > > > > 1 dummy clk cycle is modelled as 1 byte write (see the comment above), so 1
> > > > > > > > > > > > dummy byte (8 dummy clk cycles) will need +8 above.
> > > > > > > > > > >
> > > > > > > > > > > I think you were confused by the WINBOND codes. The comments are
> > > > > > > > > > > correct. It is modeled with bytes instead of bits, so we should +=1.
> > > > > > > > > >
> > > > > > > > > > What the comment says is (perhaps not superclear) that 1 dummy clock cycle
> > > > > > > > > > is modeled as one 1 byte write into the flash (meaining that 8 byte writes
> > > > > > > > > > are needed for 1 dummy byte). Perhaps it is easier to understand
> > > > > > > > > > looking into how the controllers issue the command towards the flash model
> > > > > > > > > > (for example the xilinx_spips), the start of the FAST_READ cmd is issued
> > > > > > > > > > as writing the following into the flash: 1 byte (cmd), 3 bytes (address),
> > > > > > > > > > 8 bytes (8 dummy cycles -> 1 dummy byte).
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > > My interpretation of the comments are opposite: one cycle is a bit,
> > > > > > > > > but we are not using bits, instead we are using bytes.
> > > > > > > >
> > > > > > > > Yes, the mentioning of 'bits' in the comment makes it not very clear at first read.
> > > > > > > > Maybe just bellow would have been better:
> > > > > > > >
> > > > > > > > /* Dummy clock cycles - modeled with bytes writes */
> > > > > > > >
> > > > > > > > >
> > > > > > > > > Testing shows that +=1 is the correct way with the imx_spi controller,
> > > > > > > > > and with my SiFive SPI model in my local tree (not upstreamed yet)
> > > > > > > >
> > > > > > > > Perhaps an option could be to look into how the aspeed_smc, xilinx_spips or the
> > > > > > > > npcm7xx_fiu generate dummy clock cycles and see if a similar solution to one of
> > > > > > > > those could work aswell for the imx_spi?
> > > > > > > >
> > > > > > >
> > > > > > > Thanks for pointing this out. So there is some inconsistency among
> > > > > > > different SPI controller modeling.
> > > > > >
> > > > > > I'm not sure I understand you correctly but the controllers supporting
> > > > > > commands with dummy clock cycles can only do it following the modeled
> > > > > > approach, so I would rather say it is pretty consistent across the
> > > > > > controllers (not all controllers support these commands though).
> > > > >
> > > > > I mean there are 2 approaches to emulate the dummy cycles for
> > > >
> > > > There is currently only 1 way of modeling dummy clock cycles. All commands that
> > > > require / support them in m25p80 goes with that approach. An the controllers
> > > > that support dummy clock cycles uses that approach.
> > >
> > > No, there are 2 ways. One way is how aspeed_smc, xilinx_spips and
> > > npcm7xx do for dummy cycles. For these controllers, there are hardware
> > > registers for dummy cycles, and software does not need to write
> > > anything into the tx fifo. These models emulate one dummy cycle by
> > > transferring one byte one the SPI line so we see there are actually a
> > > number of (bit * 8) bytes needed in decode_fast_read_cmd(). The other
> > > way is how imx_spi, mss-spi, pl022, stm32f2xx_spi and xilinx_spi. For
> > > these controllers, they just transfer whatever is written by guest
> > > software to tx fifo without any special awareness of dummy cycles.
> >
> >
> > The xilinx_spips supports above way of transferring a command so you can look
> > into that one for an example of how to handle a command pushed into a txfifo
> > with regards to the dummy clock cycles. Not all controllers support generating
> > dummy clock cycles, meaning that not all do the dummy byte -> dummy clock cycle
> > conversion. The controllers that do not do this currently does not support
> > issuing commands requiring them towards m25p80 (as FAST_READ, DOR, QOR etc..).
> 
> No, I don't think inspecting tx fifo to decode the SPI command, and to
> insert dummy cycles when seeing FAST_READ is the correct way for these
> SPI controllers like imx_spi. The real hardware does not do this and
> we should not make them behave like xilinx_spips.

Above is not correct, the xilinx_spips does not insert dummy clock cycles, it
converts the dummy bytes in the txfifo into the correct amount of dummy clock
cycles needed to be generated for the dummy byte based on the command and state
of the controller. For example if the command (as DOR) uses 2 lines when
transmitting the dummy byte it will issue 4 dummy clock cycles, if the command
uses 4 lines (example QIOR) it converts the dummy bytes into 2 dummy clock
cycles each.

How the hardware really works and how QEMU models it is not necessarly the
same, for the FAST_READ command the hardware will generate 8 dummy cycles of
the dummy byte (probably by shifting out the byte), currently the only way to
model this in QEMU is by making the controller convert the dummy
byte into 8 byte writes into m25p80 (there's no command in m25p80 modeling
this differently). If you would like to change this I think it is better if you
post a patch series demonstrating a solution that is better and also solves all
problems currently solved by the current one. Example problems (but not all)
are the ones mentioned above.

Thanks,
Francisco Iglesias


> 
> Regards,
> Bin
Bin Meng Dec. 16, 2020, 10:11 a.m. UTC | #17
Hi Francisco,

On Wed, Dec 16, 2020 at 12:40 AM Francisco Iglesias
<frasse.iglesias@gmail.com> wrote:
>
> Hello Bin,
>
> On [2020 Dec 12] Sat 17:44:27, Bin Meng wrote:
> > Hi Francisco,
> >
> > On Sat, Dec 12, 2020 at 5:24 PM Francisco Iglesias
> > <frasse.iglesias@gmail.com> wrote:
> > >
> > > Hi bin,
> > >
> > > On [2020 Dec 12] Sat 16:16:59, Bin Meng wrote:
> > > > Hi Francisco,
> > > >
> > > > On Sat, Dec 12, 2020 at 12:11 AM Francisco Iglesias
> > > > <frasse.iglesias@gmail.com> wrote:
> > > > >
> > > > > Hello Bin,
> > > > >
> > > > > On [2020 Dec 11] Fri 23:29:16, Bin Meng wrote:
> > > > > > Hi Francisco,
> > > > > >
> > > > > > On Fri, Dec 11, 2020 at 11:16 PM Francisco Iglesias
> > > > > > <frasse.iglesias@gmail.com> wrote:
> > > > > > >
> > > > > > > Hello Bin,
> > > > > > >
> > > > > > > On [2020 Dec 11] Fri 14:07:21, Bin Meng wrote:
> > > > > > > > Hi Francisco,
> > > > > > > >
> > > > > > > > On Fri, Dec 4, 2020 at 7:28 PM Francisco Iglesias
> > > > > > > > <frasse.iglesias@gmail.com> wrote:
> > > > > > > > >
> > > > > > > > > Hello Bin,
> > > > > > > > >
> > > > > > > > > On [2020 Dec 04] Fri 18:52:50, Bin Meng wrote:
> > > > > > > > > > Hi Francisco,
> > > > > > > > > >
> > > > > > > > > > On Fri, Dec 4, 2020 at 6:46 PM Francisco Iglesias
> > > > > > > > > > <frasse.iglesias@gmail.com> wrote:
> > > > > > > > > > >
> > > > > > > > > > > Hello Bin,
> > > > > > > > > > >
> > > > > > > > > > > On [2020 Dec 04] Fri 15:52:12, Bin Meng wrote:
> > > > > > > > > > > > Hi Francisco,
> > > > > > > > > > > >
> > > > > > > > > > > > On Thu, Dec 3, 2020 at 4:38 PM Francisco Iglesias
> > > > > > > > > > > > <frasse.iglesias@gmail.com> wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > Hi Bin and Alistair,
> > > > > > > > > > > > >
> > > > > > > > > > > > > On [2020 Dec 02] Wed 11:40:11, Alistair Francis wrote:
> > > > > > > > > > > > > > On Sun, Nov 29, 2020 at 6:55 PM Bin Meng <bmeng.cn@gmail.com> wrote:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > From: Bin Meng <bin.meng@windriver.com>
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > SST flashes require a dummy byte after the address bits.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Signed-off-by: Bin Meng <bin.meng@windriver.com>
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > I couldn't find a datasheet that says this... But the actual code
> > > > > > > > > > > > > > change looks fine, so:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Acked-by: Alistair Francis <alistair.francis@wdc.com>
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Alistair
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > ---
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >  hw/block/m25p80.c | 3 +++
> > > > > > > > > > > > > > >  1 file changed, 3 insertions(+)
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > diff --git a/hw/block/m25p80.c b/hw/block/m25p80.c
> > > > > > > > > > > > > > > index 483925f..9b36762 100644
> > > > > > > > > > > > > > > --- a/hw/block/m25p80.c
> > > > > > > > > > > > > > > +++ b/hw/block/m25p80.c
> > > > > > > > > > > > > > > @@ -825,6 +825,9 @@ static void decode_fast_read_cmd(Flash *s)
> > > > > > > > > > > > > > >      s->needed_bytes = get_addr_length(s);
> > > > > > > > > > > > > > >      switch (get_man(s)) {
> > > > > > > > > > > > > > >      /* Dummy cycles - modeled with bytes writes instead of bits */
> > > > > > > > > > > > > > > +    case MAN_SST:
> > > > > > > > > > > > > > > +        s->needed_bytes += 1;
> > > > > > > > > > > > >
> > > > > > > > > > > > > 1 dummy clk cycle is modelled as 1 byte write (see the comment above), so 1
> > > > > > > > > > > > > dummy byte (8 dummy clk cycles) will need +8 above.
> > > > > > > > > > > >
> > > > > > > > > > > > I think you were confused by the WINBOND codes. The comments are
> > > > > > > > > > > > correct. It is modeled with bytes instead of bits, so we should +=1.
> > > > > > > > > > >
> > > > > > > > > > > What the comment says is (perhaps not superclear) that 1 dummy clock cycle
> > > > > > > > > > > is modeled as one 1 byte write into the flash (meaining that 8 byte writes
> > > > > > > > > > > are needed for 1 dummy byte). Perhaps it is easier to understand
> > > > > > > > > > > looking into how the controllers issue the command towards the flash model
> > > > > > > > > > > (for example the xilinx_spips), the start of the FAST_READ cmd is issued
> > > > > > > > > > > as writing the following into the flash: 1 byte (cmd), 3 bytes (address),
> > > > > > > > > > > 8 bytes (8 dummy cycles -> 1 dummy byte).
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > My interpretation of the comments are opposite: one cycle is a bit,
> > > > > > > > > > but we are not using bits, instead we are using bytes.
> > > > > > > > >
> > > > > > > > > Yes, the mentioning of 'bits' in the comment makes it not very clear at first read.
> > > > > > > > > Maybe just bellow would have been better:
> > > > > > > > >
> > > > > > > > > /* Dummy clock cycles - modeled with bytes writes */
> > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > Testing shows that +=1 is the correct way with the imx_spi controller,
> > > > > > > > > > and with my SiFive SPI model in my local tree (not upstreamed yet)
> > > > > > > > >
> > > > > > > > > Perhaps an option could be to look into how the aspeed_smc, xilinx_spips or the
> > > > > > > > > npcm7xx_fiu generate dummy clock cycles and see if a similar solution to one of
> > > > > > > > > those could work aswell for the imx_spi?
> > > > > > > > >
> > > > > > > >
> > > > > > > > Thanks for pointing this out. So there is some inconsistency among
> > > > > > > > different SPI controller modeling.
> > > > > > >
> > > > > > > I'm not sure I understand you correctly but the controllers supporting
> > > > > > > commands with dummy clock cycles can only do it following the modeled
> > > > > > > approach, so I would rather say it is pretty consistent across the
> > > > > > > controllers (not all controllers support these commands though).
> > > > > >
> > > > > > I mean there are 2 approaches to emulate the dummy cycles for
> > > > >
> > > > > There is currently only 1 way of modeling dummy clock cycles. All commands that
> > > > > require / support them in m25p80 goes with that approach. An the controllers
> > > > > that support dummy clock cycles uses that approach.
> > > >
> > > > No, there are 2 ways. One way is how aspeed_smc, xilinx_spips and
> > > > npcm7xx do for dummy cycles. For these controllers, there are hardware
> > > > registers for dummy cycles, and software does not need to write
> > > > anything into the tx fifo. These models emulate one dummy cycle by
> > > > transferring one byte one the SPI line so we see there are actually a
> > > > number of (bit * 8) bytes needed in decode_fast_read_cmd(). The other
> > > > way is how imx_spi, mss-spi, pl022, stm32f2xx_spi and xilinx_spi. For
> > > > these controllers, they just transfer whatever is written by guest
> > > > software to tx fifo without any special awareness of dummy cycles.
> > >
> > >
> > > The xilinx_spips supports above way of transferring a command so you can look
> > > into that one for an example of how to handle a command pushed into a txfifo
> > > with regards to the dummy clock cycles. Not all controllers support generating
> > > dummy clock cycles, meaning that not all do the dummy byte -> dummy clock cycle
> > > conversion. The controllers that do not do this currently does not support
> > > issuing commands requiring them towards m25p80 (as FAST_READ, DOR, QOR etc..).
> >
> > No, I don't think inspecting tx fifo to decode the SPI command, and to
> > insert dummy cycles when seeing FAST_READ is the correct way for these
> > SPI controllers like imx_spi. The real hardware does not do this and
> > we should not make them behave like xilinx_spips.
>
> Above is not correct, the xilinx_spips does not insert dummy clock cycles, it
> converts the dummy bytes in the txfifo into the correct amount of dummy clock
> cycles needed to be generated for the dummy byte based on the command and state

I was referring to aspeed_smc codes. It multiplies the dummy cycles by
8 and calls ssi_transfer(). This to me is wrong. However I am not
familiar with xilinx-spips and I surely could be wrong, like you said.

> of the controller. For example if the command (as DOR) uses 2 lines when
> transmitting the dummy byte it will issue 4 dummy clock cycles, if the command
> uses 4 lines (example QIOR) it converts the dummy bytes into 2 dummy clock
> cycles each.

I will take some time to check the xilinx-spips spec. This might need some time.

>
> How the hardware really works and how QEMU models it is not necessarly the
> same, for the FAST_READ command the hardware will generate 8 dummy cycles of

Agree, they should not be necessarily the same but for this case I was
saying that type of hardware has no way to inspect the content being
transferred hence for QEMU it's impossible to do that either.

> the dummy byte (probably by shifting out the byte), currently the only way to
> model this in QEMU is by making the controller convert the dummy
> byte into 8 byte writes into m25p80 (there's no command in m25p80 modeling
> this differently). If you would like to change this I think it is better if you
> post a patch series demonstrating a solution that is better and also solves all
> problems currently solved by the current one. Example problems (but not all)
> are the ones mentioned above.
>

Anyway I will take a look at the xilinx_spips and propose something later.

Regards,
Bin
Bin Meng Dec. 22, 2020, 1:40 a.m. UTC | #18
Hi Francisco,

On Wed, Dec 16, 2020 at 6:11 PM Bin Meng <bmeng.cn@gmail.com> wrote:
>
> Hi Francisco,
>
> On Wed, Dec 16, 2020 at 12:40 AM Francisco Iglesias
> <frasse.iglesias@gmail.com> wrote:
> >
> > Hello Bin,
> >
> > On [2020 Dec 12] Sat 17:44:27, Bin Meng wrote:
> > > Hi Francisco,
> > >
> > > On Sat, Dec 12, 2020 at 5:24 PM Francisco Iglesias
> > > <frasse.iglesias@gmail.com> wrote:
> > > >
> > > > Hi bin,
> > > >
> > > > On [2020 Dec 12] Sat 16:16:59, Bin Meng wrote:
> > > > > Hi Francisco,
> > > > >
> > > > > On Sat, Dec 12, 2020 at 12:11 AM Francisco Iglesias
> > > > > <frasse.iglesias@gmail.com> wrote:
> > > > > >
> > > > > > Hello Bin,
> > > > > >
> > > > > > On [2020 Dec 11] Fri 23:29:16, Bin Meng wrote:
> > > > > > > Hi Francisco,
> > > > > > >
> > > > > > > On Fri, Dec 11, 2020 at 11:16 PM Francisco Iglesias
> > > > > > > <frasse.iglesias@gmail.com> wrote:
> > > > > > > >
> > > > > > > > Hello Bin,
> > > > > > > >
> > > > > > > > On [2020 Dec 11] Fri 14:07:21, Bin Meng wrote:
> > > > > > > > > Hi Francisco,
> > > > > > > > >
> > > > > > > > > On Fri, Dec 4, 2020 at 7:28 PM Francisco Iglesias
> > > > > > > > > <frasse.iglesias@gmail.com> wrote:
> > > > > > > > > >
> > > > > > > > > > Hello Bin,
> > > > > > > > > >
> > > > > > > > > > On [2020 Dec 04] Fri 18:52:50, Bin Meng wrote:
> > > > > > > > > > > Hi Francisco,
> > > > > > > > > > >
> > > > > > > > > > > On Fri, Dec 4, 2020 at 6:46 PM Francisco Iglesias
> > > > > > > > > > > <frasse.iglesias@gmail.com> wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > Hello Bin,
> > > > > > > > > > > >
> > > > > > > > > > > > On [2020 Dec 04] Fri 15:52:12, Bin Meng wrote:
> > > > > > > > > > > > > Hi Francisco,
> > > > > > > > > > > > >
> > > > > > > > > > > > > On Thu, Dec 3, 2020 at 4:38 PM Francisco Iglesias
> > > > > > > > > > > > > <frasse.iglesias@gmail.com> wrote:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Hi Bin and Alistair,
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > On [2020 Dec 02] Wed 11:40:11, Alistair Francis wrote:
> > > > > > > > > > > > > > > On Sun, Nov 29, 2020 at 6:55 PM Bin Meng <bmeng.cn@gmail.com> wrote:
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > From: Bin Meng <bin.meng@windriver.com>
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > SST flashes require a dummy byte after the address bits.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Signed-off-by: Bin Meng <bin.meng@windriver.com>
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > I couldn't find a datasheet that says this... But the actual code
> > > > > > > > > > > > > > > change looks fine, so:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Acked-by: Alistair Francis <alistair.francis@wdc.com>
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Alistair
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > ---
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >  hw/block/m25p80.c | 3 +++
> > > > > > > > > > > > > > > >  1 file changed, 3 insertions(+)
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > diff --git a/hw/block/m25p80.c b/hw/block/m25p80.c
> > > > > > > > > > > > > > > > index 483925f..9b36762 100644
> > > > > > > > > > > > > > > > --- a/hw/block/m25p80.c
> > > > > > > > > > > > > > > > +++ b/hw/block/m25p80.c
> > > > > > > > > > > > > > > > @@ -825,6 +825,9 @@ static void decode_fast_read_cmd(Flash *s)
> > > > > > > > > > > > > > > >      s->needed_bytes = get_addr_length(s);
> > > > > > > > > > > > > > > >      switch (get_man(s)) {
> > > > > > > > > > > > > > > >      /* Dummy cycles - modeled with bytes writes instead of bits */
> > > > > > > > > > > > > > > > +    case MAN_SST:
> > > > > > > > > > > > > > > > +        s->needed_bytes += 1;
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > 1 dummy clk cycle is modelled as 1 byte write (see the comment above), so 1
> > > > > > > > > > > > > > dummy byte (8 dummy clk cycles) will need +8 above.
> > > > > > > > > > > > >
> > > > > > > > > > > > > I think you were confused by the WINBOND codes. The comments are
> > > > > > > > > > > > > correct. It is modeled with bytes instead of bits, so we should +=1.
> > > > > > > > > > > >
> > > > > > > > > > > > What the comment says is (perhaps not superclear) that 1 dummy clock cycle
> > > > > > > > > > > > is modeled as one 1 byte write into the flash (meaining that 8 byte writes
> > > > > > > > > > > > are needed for 1 dummy byte). Perhaps it is easier to understand
> > > > > > > > > > > > looking into how the controllers issue the command towards the flash model
> > > > > > > > > > > > (for example the xilinx_spips), the start of the FAST_READ cmd is issued
> > > > > > > > > > > > as writing the following into the flash: 1 byte (cmd), 3 bytes (address),
> > > > > > > > > > > > 8 bytes (8 dummy cycles -> 1 dummy byte).
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > My interpretation of the comments are opposite: one cycle is a bit,
> > > > > > > > > > > but we are not using bits, instead we are using bytes.
> > > > > > > > > >
> > > > > > > > > > Yes, the mentioning of 'bits' in the comment makes it not very clear at first read.
> > > > > > > > > > Maybe just bellow would have been better:
> > > > > > > > > >
> > > > > > > > > > /* Dummy clock cycles - modeled with bytes writes */
> > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > Testing shows that +=1 is the correct way with the imx_spi controller,
> > > > > > > > > > > and with my SiFive SPI model in my local tree (not upstreamed yet)
> > > > > > > > > >
> > > > > > > > > > Perhaps an option could be to look into how the aspeed_smc, xilinx_spips or the
> > > > > > > > > > npcm7xx_fiu generate dummy clock cycles and see if a similar solution to one of
> > > > > > > > > > those could work aswell for the imx_spi?
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > > Thanks for pointing this out. So there is some inconsistency among
> > > > > > > > > different SPI controller modeling.
> > > > > > > >
> > > > > > > > I'm not sure I understand you correctly but the controllers supporting
> > > > > > > > commands with dummy clock cycles can only do it following the modeled
> > > > > > > > approach, so I would rather say it is pretty consistent across the
> > > > > > > > controllers (not all controllers support these commands though).
> > > > > > >
> > > > > > > I mean there are 2 approaches to emulate the dummy cycles for
> > > > > >
> > > > > > There is currently only 1 way of modeling dummy clock cycles. All commands that
> > > > > > require / support them in m25p80 goes with that approach. An the controllers
> > > > > > that support dummy clock cycles uses that approach.
> > > > >
> > > > > No, there are 2 ways. One way is how aspeed_smc, xilinx_spips and
> > > > > npcm7xx do for dummy cycles. For these controllers, there are hardware
> > > > > registers for dummy cycles, and software does not need to write
> > > > > anything into the tx fifo. These models emulate one dummy cycle by
> > > > > transferring one byte one the SPI line so we see there are actually a
> > > > > number of (bit * 8) bytes needed in decode_fast_read_cmd(). The other
> > > > > way is how imx_spi, mss-spi, pl022, stm32f2xx_spi and xilinx_spi. For
> > > > > these controllers, they just transfer whatever is written by guest
> > > > > software to tx fifo without any special awareness of dummy cycles.
> > > >
> > > >
> > > > The xilinx_spips supports above way of transferring a command so you can look
> > > > into that one for an example of how to handle a command pushed into a txfifo
> > > > with regards to the dummy clock cycles. Not all controllers support generating
> > > > dummy clock cycles, meaning that not all do the dummy byte -> dummy clock cycle
> > > > conversion. The controllers that do not do this currently does not support
> > > > issuing commands requiring them towards m25p80 (as FAST_READ, DOR, QOR etc..).
> > >
> > > No, I don't think inspecting tx fifo to decode the SPI command, and to
> > > insert dummy cycles when seeing FAST_READ is the correct way for these
> > > SPI controllers like imx_spi. The real hardware does not do this and
> > > we should not make them behave like xilinx_spips.
> >
> > Above is not correct, the xilinx_spips does not insert dummy clock cycles, it
> > converts the dummy bytes in the txfifo into the correct amount of dummy clock
> > cycles needed to be generated for the dummy byte based on the command and state
>
> I was referring to aspeed_smc codes. It multiplies the dummy cycles by
> 8 and calls ssi_transfer(). This to me is wrong. However I am not
> familiar with xilinx-spips and I surely could be wrong, like you said.
>
> > of the controller. For example if the command (as DOR) uses 2 lines when
> > transmitting the dummy byte it will issue 4 dummy clock cycles, if the command
> > uses 4 lines (example QIOR) it converts the dummy bytes into 2 dummy clock
> > cycles each.
>
> I will take some time to check the xilinx-spips spec. This might need some time.
>
> >
> > How the hardware really works and how QEMU models it is not necessarly the
> > same, for the FAST_READ command the hardware will generate 8 dummy cycles of
>
> Agree, they should not be necessarily the same but for this case I was
> saying that type of hardware has no way to inspect the content being
> transferred hence for QEMU it's impossible to do that either.
>
> > the dummy byte (probably by shifting out the byte), currently the only way to
> > model this in QEMU is by making the controller convert the dummy
> > byte into 8 byte writes into m25p80 (there's no command in m25p80 modeling
> > this differently). If you would like to change this I think it is better if you
> > post a patch series demonstrating a solution that is better and also solves all
> > problems currently solved by the current one. Example problems (but not all)
> > are the ones mentioned above.
> >
>
> Anyway I will take a look at the xilinx_spips and propose something later.

It looks like the xilinx_spips.c supports 3 types of SPI controllers:

static void xilinx_spips_register_types(void)
{
    type_register_static(&xilinx_spips_info);
    type_register_static(&xilinx_qspips_info);
    type_register_static(&xlnx_zynqmp_qspips_info);
}

Dummy cycle handling happens in xilinx_qspips and xilinz_zynqmp_qspips codes.

Could you please provide some instructions on how to test these 2
controllers with upstream U-Boot or Linux?
I tried Zynqmp upstream U-Boot but it does not boot on QEMU. I
probably missed something.

Regards,
Bin
Bin Meng Jan. 6, 2021, 2:21 p.m. UTC | #19
Hi Francisco,

On Tue, Dec 22, 2020 at 9:40 AM Bin Meng <bmeng.cn@gmail.com> wrote:
>
> Hi Francisco,
>
> On Wed, Dec 16, 2020 at 6:11 PM Bin Meng <bmeng.cn@gmail.com> wrote:
> >
> > Hi Francisco,
> >
> > On Wed, Dec 16, 2020 at 12:40 AM Francisco Iglesias
> > <frasse.iglesias@gmail.com> wrote:
> > >
> > > Hello Bin,
> > >
> > > On [2020 Dec 12] Sat 17:44:27, Bin Meng wrote:
> > > > Hi Francisco,
> > > >
> > > > On Sat, Dec 12, 2020 at 5:24 PM Francisco Iglesias
> > > > <frasse.iglesias@gmail.com> wrote:
> > > > >
> > > > > Hi bin,
> > > > >
> > > > > On [2020 Dec 12] Sat 16:16:59, Bin Meng wrote:
> > > > > > Hi Francisco,
> > > > > >
> > > > > > On Sat, Dec 12, 2020 at 12:11 AM Francisco Iglesias
> > > > > > <frasse.iglesias@gmail.com> wrote:
> > > > > > >
> > > > > > > Hello Bin,
> > > > > > >
> > > > > > > On [2020 Dec 11] Fri 23:29:16, Bin Meng wrote:
> > > > > > > > Hi Francisco,
> > > > > > > >
> > > > > > > > On Fri, Dec 11, 2020 at 11:16 PM Francisco Iglesias
> > > > > > > > <frasse.iglesias@gmail.com> wrote:
> > > > > > > > >
> > > > > > > > > Hello Bin,
> > > > > > > > >
> > > > > > > > > On [2020 Dec 11] Fri 14:07:21, Bin Meng wrote:
> > > > > > > > > > Hi Francisco,
> > > > > > > > > >
> > > > > > > > > > On Fri, Dec 4, 2020 at 7:28 PM Francisco Iglesias
> > > > > > > > > > <frasse.iglesias@gmail.com> wrote:
> > > > > > > > > > >
> > > > > > > > > > > Hello Bin,
> > > > > > > > > > >
> > > > > > > > > > > On [2020 Dec 04] Fri 18:52:50, Bin Meng wrote:
> > > > > > > > > > > > Hi Francisco,
> > > > > > > > > > > >
> > > > > > > > > > > > On Fri, Dec 4, 2020 at 6:46 PM Francisco Iglesias
> > > > > > > > > > > > <frasse.iglesias@gmail.com> wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > Hello Bin,
> > > > > > > > > > > > >
> > > > > > > > > > > > > On [2020 Dec 04] Fri 15:52:12, Bin Meng wrote:
> > > > > > > > > > > > > > Hi Francisco,
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > On Thu, Dec 3, 2020 at 4:38 PM Francisco Iglesias
> > > > > > > > > > > > > > <frasse.iglesias@gmail.com> wrote:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Hi Bin and Alistair,
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > On [2020 Dec 02] Wed 11:40:11, Alistair Francis wrote:
> > > > > > > > > > > > > > > > On Sun, Nov 29, 2020 at 6:55 PM Bin Meng <bmeng.cn@gmail.com> wrote:
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > From: Bin Meng <bin.meng@windriver.com>
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > SST flashes require a dummy byte after the address bits.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Signed-off-by: Bin Meng <bin.meng@windriver.com>
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > I couldn't find a datasheet that says this... But the actual code
> > > > > > > > > > > > > > > > change looks fine, so:
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Acked-by: Alistair Francis <alistair.francis@wdc.com>
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Alistair
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > ---
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >  hw/block/m25p80.c | 3 +++
> > > > > > > > > > > > > > > > >  1 file changed, 3 insertions(+)
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > diff --git a/hw/block/m25p80.c b/hw/block/m25p80.c
> > > > > > > > > > > > > > > > > index 483925f..9b36762 100644
> > > > > > > > > > > > > > > > > --- a/hw/block/m25p80.c
> > > > > > > > > > > > > > > > > +++ b/hw/block/m25p80.c
> > > > > > > > > > > > > > > > > @@ -825,6 +825,9 @@ static void decode_fast_read_cmd(Flash *s)
> > > > > > > > > > > > > > > > >      s->needed_bytes = get_addr_length(s);
> > > > > > > > > > > > > > > > >      switch (get_man(s)) {
> > > > > > > > > > > > > > > > >      /* Dummy cycles - modeled with bytes writes instead of bits */
> > > > > > > > > > > > > > > > > +    case MAN_SST:
> > > > > > > > > > > > > > > > > +        s->needed_bytes += 1;
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > 1 dummy clk cycle is modelled as 1 byte write (see the comment above), so 1
> > > > > > > > > > > > > > > dummy byte (8 dummy clk cycles) will need +8 above.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > I think you were confused by the WINBOND codes. The comments are
> > > > > > > > > > > > > > correct. It is modeled with bytes instead of bits, so we should +=1.
> > > > > > > > > > > > >
> > > > > > > > > > > > > What the comment says is (perhaps not superclear) that 1 dummy clock cycle
> > > > > > > > > > > > > is modeled as one 1 byte write into the flash (meaining that 8 byte writes
> > > > > > > > > > > > > are needed for 1 dummy byte). Perhaps it is easier to understand
> > > > > > > > > > > > > looking into how the controllers issue the command towards the flash model
> > > > > > > > > > > > > (for example the xilinx_spips), the start of the FAST_READ cmd is issued
> > > > > > > > > > > > > as writing the following into the flash: 1 byte (cmd), 3 bytes (address),
> > > > > > > > > > > > > 8 bytes (8 dummy cycles -> 1 dummy byte).
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > My interpretation of the comments are opposite: one cycle is a bit,
> > > > > > > > > > > > but we are not using bits, instead we are using bytes.
> > > > > > > > > > >
> > > > > > > > > > > Yes, the mentioning of 'bits' in the comment makes it not very clear at first read.
> > > > > > > > > > > Maybe just bellow would have been better:
> > > > > > > > > > >
> > > > > > > > > > > /* Dummy clock cycles - modeled with bytes writes */
> > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > Testing shows that +=1 is the correct way with the imx_spi controller,
> > > > > > > > > > > > and with my SiFive SPI model in my local tree (not upstreamed yet)
> > > > > > > > > > >
> > > > > > > > > > > Perhaps an option could be to look into how the aspeed_smc, xilinx_spips or the
> > > > > > > > > > > npcm7xx_fiu generate dummy clock cycles and see if a similar solution to one of
> > > > > > > > > > > those could work aswell for the imx_spi?
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > Thanks for pointing this out. So there is some inconsistency among
> > > > > > > > > > different SPI controller modeling.
> > > > > > > > >
> > > > > > > > > I'm not sure I understand you correctly but the controllers supporting
> > > > > > > > > commands with dummy clock cycles can only do it following the modeled
> > > > > > > > > approach, so I would rather say it is pretty consistent across the
> > > > > > > > > controllers (not all controllers support these commands though).
> > > > > > > >
> > > > > > > > I mean there are 2 approaches to emulate the dummy cycles for
> > > > > > >
> > > > > > > There is currently only 1 way of modeling dummy clock cycles. All commands that
> > > > > > > require / support them in m25p80 goes with that approach. An the controllers
> > > > > > > that support dummy clock cycles uses that approach.
> > > > > >
> > > > > > No, there are 2 ways. One way is how aspeed_smc, xilinx_spips and
> > > > > > npcm7xx do for dummy cycles. For these controllers, there are hardware
> > > > > > registers for dummy cycles, and software does not need to write
> > > > > > anything into the tx fifo. These models emulate one dummy cycle by
> > > > > > transferring one byte one the SPI line so we see there are actually a
> > > > > > number of (bit * 8) bytes needed in decode_fast_read_cmd(). The other
> > > > > > way is how imx_spi, mss-spi, pl022, stm32f2xx_spi and xilinx_spi. For
> > > > > > these controllers, they just transfer whatever is written by guest
> > > > > > software to tx fifo without any special awareness of dummy cycles.
> > > > >
> > > > >
> > > > > The xilinx_spips supports above way of transferring a command so you can look
> > > > > into that one for an example of how to handle a command pushed into a txfifo
> > > > > with regards to the dummy clock cycles. Not all controllers support generating
> > > > > dummy clock cycles, meaning that not all do the dummy byte -> dummy clock cycle
> > > > > conversion. The controllers that do not do this currently does not support
> > > > > issuing commands requiring them towards m25p80 (as FAST_READ, DOR, QOR etc..).
> > > >
> > > > No, I don't think inspecting tx fifo to decode the SPI command, and to
> > > > insert dummy cycles when seeing FAST_READ is the correct way for these
> > > > SPI controllers like imx_spi. The real hardware does not do this and
> > > > we should not make them behave like xilinx_spips.
> > >
> > > Above is not correct, the xilinx_spips does not insert dummy clock cycles, it
> > > converts the dummy bytes in the txfifo into the correct amount of dummy clock
> > > cycles needed to be generated for the dummy byte based on the command and state
> >
> > I was referring to aspeed_smc codes. It multiplies the dummy cycles by
> > 8 and calls ssi_transfer(). This to me is wrong. However I am not
> > familiar with xilinx-spips and I surely could be wrong, like you said.
> >
> > > of the controller. For example if the command (as DOR) uses 2 lines when
> > > transmitting the dummy byte it will issue 4 dummy clock cycles, if the command
> > > uses 4 lines (example QIOR) it converts the dummy bytes into 2 dummy clock
> > > cycles each.
> >
> > I will take some time to check the xilinx-spips spec. This might need some time.
> >
> > >
> > > How the hardware really works and how QEMU models it is not necessarly the
> > > same, for the FAST_READ command the hardware will generate 8 dummy cycles of
> >
> > Agree, they should not be necessarily the same but for this case I was
> > saying that type of hardware has no way to inspect the content being
> > transferred hence for QEMU it's impossible to do that either.
> >
> > > the dummy byte (probably by shifting out the byte), currently the only way to
> > > model this in QEMU is by making the controller convert the dummy
> > > byte into 8 byte writes into m25p80 (there's no command in m25p80 modeling
> > > this differently). If you would like to change this I think it is better if you
> > > post a patch series demonstrating a solution that is better and also solves all
> > > problems currently solved by the current one. Example problems (but not all)
> > > are the ones mentioned above.
> > >
> >
> > Anyway I will take a look at the xilinx_spips and propose something later.
>
> It looks like the xilinx_spips.c supports 3 types of SPI controllers:
>
> static void xilinx_spips_register_types(void)
> {
>     type_register_static(&xilinx_spips_info);
>     type_register_static(&xilinx_qspips_info);
>     type_register_static(&xlnx_zynqmp_qspips_info);
> }
>
> Dummy cycle handling happens in xilinx_qspips and xilinz_zynqmp_qspips codes.
>
> Could you please provide some instructions on how to test these 2
> controllers with upstream U-Boot or Linux?
> I tried Zynqmp upstream U-Boot but it does not boot on QEMU. I
> probably missed something.

Ping?

Could you kindly provide instructions of using QEMU to boot upstream
U-Boot or Linux?

Regards,
Bin
Bin Meng Jan. 12, 2021, 1:10 a.m. UTC | #20
On Wed, Jan 6, 2021 at 10:21 PM Bin Meng <bmeng.cn@gmail.com> wrote:
>
> Hi Francisco,
>
> On Tue, Dec 22, 2020 at 9:40 AM Bin Meng <bmeng.cn@gmail.com> wrote:
> >
> > Hi Francisco,
> >
> > On Wed, Dec 16, 2020 at 6:11 PM Bin Meng <bmeng.cn@gmail.com> wrote:
> > >
> > > Hi Francisco,
> > >
> > > On Wed, Dec 16, 2020 at 12:40 AM Francisco Iglesias
> > > <frasse.iglesias@gmail.com> wrote:
> > > >
> > > > Hello Bin,
> > > >
> > > > On [2020 Dec 12] Sat 17:44:27, Bin Meng wrote:
> > > > > Hi Francisco,
> > > > >
> > > > > On Sat, Dec 12, 2020 at 5:24 PM Francisco Iglesias
> > > > > <frasse.iglesias@gmail.com> wrote:
> > > > > >
> > > > > > Hi bin,
> > > > > >
> > > > > > On [2020 Dec 12] Sat 16:16:59, Bin Meng wrote:
> > > > > > > Hi Francisco,
> > > > > > >
> > > > > > > On Sat, Dec 12, 2020 at 12:11 AM Francisco Iglesias
> > > > > > > <frasse.iglesias@gmail.com> wrote:
> > > > > > > >
> > > > > > > > Hello Bin,
> > > > > > > >
> > > > > > > > On [2020 Dec 11] Fri 23:29:16, Bin Meng wrote:
> > > > > > > > > Hi Francisco,
> > > > > > > > >
> > > > > > > > > On Fri, Dec 11, 2020 at 11:16 PM Francisco Iglesias
> > > > > > > > > <frasse.iglesias@gmail.com> wrote:
> > > > > > > > > >
> > > > > > > > > > Hello Bin,
> > > > > > > > > >
> > > > > > > > > > On [2020 Dec 11] Fri 14:07:21, Bin Meng wrote:
> > > > > > > > > > > Hi Francisco,
> > > > > > > > > > >
> > > > > > > > > > > On Fri, Dec 4, 2020 at 7:28 PM Francisco Iglesias
> > > > > > > > > > > <frasse.iglesias@gmail.com> wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > Hello Bin,
> > > > > > > > > > > >
> > > > > > > > > > > > On [2020 Dec 04] Fri 18:52:50, Bin Meng wrote:
> > > > > > > > > > > > > Hi Francisco,
> > > > > > > > > > > > >
> > > > > > > > > > > > > On Fri, Dec 4, 2020 at 6:46 PM Francisco Iglesias
> > > > > > > > > > > > > <frasse.iglesias@gmail.com> wrote:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Hello Bin,
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > On [2020 Dec 04] Fri 15:52:12, Bin Meng wrote:
> > > > > > > > > > > > > > > Hi Francisco,
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > On Thu, Dec 3, 2020 at 4:38 PM Francisco Iglesias
> > > > > > > > > > > > > > > <frasse.iglesias@gmail.com> wrote:
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Hi Bin and Alistair,
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > On [2020 Dec 02] Wed 11:40:11, Alistair Francis wrote:
> > > > > > > > > > > > > > > > > On Sun, Nov 29, 2020 at 6:55 PM Bin Meng <bmeng.cn@gmail.com> wrote:
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > From: Bin Meng <bin.meng@windriver.com>
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > SST flashes require a dummy byte after the address bits.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Signed-off-by: Bin Meng <bin.meng@windriver.com>
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > I couldn't find a datasheet that says this... But the actual code
> > > > > > > > > > > > > > > > > change looks fine, so:
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Acked-by: Alistair Francis <alistair.francis@wdc.com>
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Alistair
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > ---
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > >  hw/block/m25p80.c | 3 +++
> > > > > > > > > > > > > > > > > >  1 file changed, 3 insertions(+)
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > diff --git a/hw/block/m25p80.c b/hw/block/m25p80.c
> > > > > > > > > > > > > > > > > > index 483925f..9b36762 100644
> > > > > > > > > > > > > > > > > > --- a/hw/block/m25p80.c
> > > > > > > > > > > > > > > > > > +++ b/hw/block/m25p80.c
> > > > > > > > > > > > > > > > > > @@ -825,6 +825,9 @@ static void decode_fast_read_cmd(Flash *s)
> > > > > > > > > > > > > > > > > >      s->needed_bytes = get_addr_length(s);
> > > > > > > > > > > > > > > > > >      switch (get_man(s)) {
> > > > > > > > > > > > > > > > > >      /* Dummy cycles - modeled with bytes writes instead of bits */
> > > > > > > > > > > > > > > > > > +    case MAN_SST:
> > > > > > > > > > > > > > > > > > +        s->needed_bytes += 1;
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > 1 dummy clk cycle is modelled as 1 byte write (see the comment above), so 1
> > > > > > > > > > > > > > > > dummy byte (8 dummy clk cycles) will need +8 above.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > I think you were confused by the WINBOND codes. The comments are
> > > > > > > > > > > > > > > correct. It is modeled with bytes instead of bits, so we should +=1.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > What the comment says is (perhaps not superclear) that 1 dummy clock cycle
> > > > > > > > > > > > > > is modeled as one 1 byte write into the flash (meaining that 8 byte writes
> > > > > > > > > > > > > > are needed for 1 dummy byte). Perhaps it is easier to understand
> > > > > > > > > > > > > > looking into how the controllers issue the command towards the flash model
> > > > > > > > > > > > > > (for example the xilinx_spips), the start of the FAST_READ cmd is issued
> > > > > > > > > > > > > > as writing the following into the flash: 1 byte (cmd), 3 bytes (address),
> > > > > > > > > > > > > > 8 bytes (8 dummy cycles -> 1 dummy byte).
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > My interpretation of the comments are opposite: one cycle is a bit,
> > > > > > > > > > > > > but we are not using bits, instead we are using bytes.
> > > > > > > > > > > >
> > > > > > > > > > > > Yes, the mentioning of 'bits' in the comment makes it not very clear at first read.
> > > > > > > > > > > > Maybe just bellow would have been better:
> > > > > > > > > > > >
> > > > > > > > > > > > /* Dummy clock cycles - modeled with bytes writes */
> > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > Testing shows that +=1 is the correct way with the imx_spi controller,
> > > > > > > > > > > > > and with my SiFive SPI model in my local tree (not upstreamed yet)
> > > > > > > > > > > >
> > > > > > > > > > > > Perhaps an option could be to look into how the aspeed_smc, xilinx_spips or the
> > > > > > > > > > > > npcm7xx_fiu generate dummy clock cycles and see if a similar solution to one of
> > > > > > > > > > > > those could work aswell for the imx_spi?
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > Thanks for pointing this out. So there is some inconsistency among
> > > > > > > > > > > different SPI controller modeling.
> > > > > > > > > >
> > > > > > > > > > I'm not sure I understand you correctly but the controllers supporting
> > > > > > > > > > commands with dummy clock cycles can only do it following the modeled
> > > > > > > > > > approach, so I would rather say it is pretty consistent across the
> > > > > > > > > > controllers (not all controllers support these commands though).
> > > > > > > > >
> > > > > > > > > I mean there are 2 approaches to emulate the dummy cycles for
> > > > > > > >
> > > > > > > > There is currently only 1 way of modeling dummy clock cycles. All commands that
> > > > > > > > require / support them in m25p80 goes with that approach. An the controllers
> > > > > > > > that support dummy clock cycles uses that approach.
> > > > > > >
> > > > > > > No, there are 2 ways. One way is how aspeed_smc, xilinx_spips and
> > > > > > > npcm7xx do for dummy cycles. For these controllers, there are hardware
> > > > > > > registers for dummy cycles, and software does not need to write
> > > > > > > anything into the tx fifo. These models emulate one dummy cycle by
> > > > > > > transferring one byte one the SPI line so we see there are actually a
> > > > > > > number of (bit * 8) bytes needed in decode_fast_read_cmd(). The other
> > > > > > > way is how imx_spi, mss-spi, pl022, stm32f2xx_spi and xilinx_spi. For
> > > > > > > these controllers, they just transfer whatever is written by guest
> > > > > > > software to tx fifo without any special awareness of dummy cycles.
> > > > > >
> > > > > >
> > > > > > The xilinx_spips supports above way of transferring a command so you can look
> > > > > > into that one for an example of how to handle a command pushed into a txfifo
> > > > > > with regards to the dummy clock cycles. Not all controllers support generating
> > > > > > dummy clock cycles, meaning that not all do the dummy byte -> dummy clock cycle
> > > > > > conversion. The controllers that do not do this currently does not support
> > > > > > issuing commands requiring them towards m25p80 (as FAST_READ, DOR, QOR etc..).
> > > > >
> > > > > No, I don't think inspecting tx fifo to decode the SPI command, and to
> > > > > insert dummy cycles when seeing FAST_READ is the correct way for these
> > > > > SPI controllers like imx_spi. The real hardware does not do this and
> > > > > we should not make them behave like xilinx_spips.
> > > >
> > > > Above is not correct, the xilinx_spips does not insert dummy clock cycles, it
> > > > converts the dummy bytes in the txfifo into the correct amount of dummy clock
> > > > cycles needed to be generated for the dummy byte based on the command and state
> > >
> > > I was referring to aspeed_smc codes. It multiplies the dummy cycles by
> > > 8 and calls ssi_transfer(). This to me is wrong. However I am not
> > > familiar with xilinx-spips and I surely could be wrong, like you said.
> > >
> > > > of the controller. For example if the command (as DOR) uses 2 lines when
> > > > transmitting the dummy byte it will issue 4 dummy clock cycles, if the command
> > > > uses 4 lines (example QIOR) it converts the dummy bytes into 2 dummy clock
> > > > cycles each.
> > >
> > > I will take some time to check the xilinx-spips spec. This might need some time.
> > >
> > > >
> > > > How the hardware really works and how QEMU models it is not necessarly the
> > > > same, for the FAST_READ command the hardware will generate 8 dummy cycles of
> > >
> > > Agree, they should not be necessarily the same but for this case I was
> > > saying that type of hardware has no way to inspect the content being
> > > transferred hence for QEMU it's impossible to do that either.
> > >
> > > > the dummy byte (probably by shifting out the byte), currently the only way to
> > > > model this in QEMU is by making the controller convert the dummy
> > > > byte into 8 byte writes into m25p80 (there's no command in m25p80 modeling
> > > > this differently). If you would like to change this I think it is better if you
> > > > post a patch series demonstrating a solution that is better and also solves all
> > > > problems currently solved by the current one. Example problems (but not all)
> > > > are the ones mentioned above.
> > > >
> > >
> > > Anyway I will take a look at the xilinx_spips and propose something later.
> >
> > It looks like the xilinx_spips.c supports 3 types of SPI controllers:
> >
> > static void xilinx_spips_register_types(void)
> > {
> >     type_register_static(&xilinx_spips_info);
> >     type_register_static(&xilinx_qspips_info);
> >     type_register_static(&xlnx_zynqmp_qspips_info);
> > }
> >
> > Dummy cycle handling happens in xilinx_qspips and xilinz_zynqmp_qspips codes.
> >
> > Could you please provide some instructions on how to test these 2
> > controllers with upstream U-Boot or Linux?
> > I tried Zynqmp upstream U-Boot but it does not boot on QEMU. I
> > probably missed something.
>
> Ping?
>
> Could you kindly provide instructions of using QEMU to boot upstream
> U-Boot or Linux?

Ping?
Bin Meng Jan. 14, 2021, 1:38 p.m. UTC | #21
On Tue, Jan 12, 2021 at 9:10 AM Bin Meng <bmeng.cn@gmail.com> wrote:
>
> On Wed, Jan 6, 2021 at 10:21 PM Bin Meng <bmeng.cn@gmail.com> wrote:
> >
> > Hi Francisco,
> >
> > On Tue, Dec 22, 2020 at 9:40 AM Bin Meng <bmeng.cn@gmail.com> wrote:
> > >
> > > Hi Francisco,
> > >
> > > On Wed, Dec 16, 2020 at 6:11 PM Bin Meng <bmeng.cn@gmail.com> wrote:
> > > >
> > > > Hi Francisco,
> > > >
> > > > On Wed, Dec 16, 2020 at 12:40 AM Francisco Iglesias
> > > > <frasse.iglesias@gmail.com> wrote:
> > > > >
> > > > > Hello Bin,
> > > > >
> > > > > On [2020 Dec 12] Sat 17:44:27, Bin Meng wrote:
> > > > > > Hi Francisco,
> > > > > >
> > > > > > On Sat, Dec 12, 2020 at 5:24 PM Francisco Iglesias
> > > > > > <frasse.iglesias@gmail.com> wrote:
> > > > > > >
> > > > > > > Hi bin,
> > > > > > >
> > > > > > > On [2020 Dec 12] Sat 16:16:59, Bin Meng wrote:
> > > > > > > > Hi Francisco,
> > > > > > > >
> > > > > > > > On Sat, Dec 12, 2020 at 12:11 AM Francisco Iglesias
> > > > > > > > <frasse.iglesias@gmail.com> wrote:
> > > > > > > > >
> > > > > > > > > Hello Bin,
> > > > > > > > >
> > > > > > > > > On [2020 Dec 11] Fri 23:29:16, Bin Meng wrote:
> > > > > > > > > > Hi Francisco,
> > > > > > > > > >
> > > > > > > > > > On Fri, Dec 11, 2020 at 11:16 PM Francisco Iglesias
> > > > > > > > > > <frasse.iglesias@gmail.com> wrote:
> > > > > > > > > > >
> > > > > > > > > > > Hello Bin,
> > > > > > > > > > >
> > > > > > > > > > > On [2020 Dec 11] Fri 14:07:21, Bin Meng wrote:
> > > > > > > > > > > > Hi Francisco,
> > > > > > > > > > > >
> > > > > > > > > > > > On Fri, Dec 4, 2020 at 7:28 PM Francisco Iglesias
> > > > > > > > > > > > <frasse.iglesias@gmail.com> wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > Hello Bin,
> > > > > > > > > > > > >
> > > > > > > > > > > > > On [2020 Dec 04] Fri 18:52:50, Bin Meng wrote:
> > > > > > > > > > > > > > Hi Francisco,
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > On Fri, Dec 4, 2020 at 6:46 PM Francisco Iglesias
> > > > > > > > > > > > > > <frasse.iglesias@gmail.com> wrote:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Hello Bin,
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > On [2020 Dec 04] Fri 15:52:12, Bin Meng wrote:
> > > > > > > > > > > > > > > > Hi Francisco,
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > On Thu, Dec 3, 2020 at 4:38 PM Francisco Iglesias
> > > > > > > > > > > > > > > > <frasse.iglesias@gmail.com> wrote:
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Hi Bin and Alistair,
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > On [2020 Dec 02] Wed 11:40:11, Alistair Francis wrote:
> > > > > > > > > > > > > > > > > > On Sun, Nov 29, 2020 at 6:55 PM Bin Meng <bmeng.cn@gmail.com> wrote:
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > From: Bin Meng <bin.meng@windriver.com>
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > SST flashes require a dummy byte after the address bits.
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > Signed-off-by: Bin Meng <bin.meng@windriver.com>
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > I couldn't find a datasheet that says this... But the actual code
> > > > > > > > > > > > > > > > > > change looks fine, so:
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Acked-by: Alistair Francis <alistair.francis@wdc.com>
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Alistair
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > ---
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > >  hw/block/m25p80.c | 3 +++
> > > > > > > > > > > > > > > > > > >  1 file changed, 3 insertions(+)
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > diff --git a/hw/block/m25p80.c b/hw/block/m25p80.c
> > > > > > > > > > > > > > > > > > > index 483925f..9b36762 100644
> > > > > > > > > > > > > > > > > > > --- a/hw/block/m25p80.c
> > > > > > > > > > > > > > > > > > > +++ b/hw/block/m25p80.c
> > > > > > > > > > > > > > > > > > > @@ -825,6 +825,9 @@ static void decode_fast_read_cmd(Flash *s)
> > > > > > > > > > > > > > > > > > >      s->needed_bytes = get_addr_length(s);
> > > > > > > > > > > > > > > > > > >      switch (get_man(s)) {
> > > > > > > > > > > > > > > > > > >      /* Dummy cycles - modeled with bytes writes instead of bits */
> > > > > > > > > > > > > > > > > > > +    case MAN_SST:
> > > > > > > > > > > > > > > > > > > +        s->needed_bytes += 1;
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > 1 dummy clk cycle is modelled as 1 byte write (see the comment above), so 1
> > > > > > > > > > > > > > > > > dummy byte (8 dummy clk cycles) will need +8 above.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > I think you were confused by the WINBOND codes. The comments are
> > > > > > > > > > > > > > > > correct. It is modeled with bytes instead of bits, so we should +=1.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > What the comment says is (perhaps not superclear) that 1 dummy clock cycle
> > > > > > > > > > > > > > > is modeled as one 1 byte write into the flash (meaining that 8 byte writes
> > > > > > > > > > > > > > > are needed for 1 dummy byte). Perhaps it is easier to understand
> > > > > > > > > > > > > > > looking into how the controllers issue the command towards the flash model
> > > > > > > > > > > > > > > (for example the xilinx_spips), the start of the FAST_READ cmd is issued
> > > > > > > > > > > > > > > as writing the following into the flash: 1 byte (cmd), 3 bytes (address),
> > > > > > > > > > > > > > > 8 bytes (8 dummy cycles -> 1 dummy byte).
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > My interpretation of the comments are opposite: one cycle is a bit,
> > > > > > > > > > > > > > but we are not using bits, instead we are using bytes.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Yes, the mentioning of 'bits' in the comment makes it not very clear at first read.
> > > > > > > > > > > > > Maybe just bellow would have been better:
> > > > > > > > > > > > >
> > > > > > > > > > > > > /* Dummy clock cycles - modeled with bytes writes */
> > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Testing shows that +=1 is the correct way with the imx_spi controller,
> > > > > > > > > > > > > > and with my SiFive SPI model in my local tree (not upstreamed yet)
> > > > > > > > > > > > >
> > > > > > > > > > > > > Perhaps an option could be to look into how the aspeed_smc, xilinx_spips or the
> > > > > > > > > > > > > npcm7xx_fiu generate dummy clock cycles and see if a similar solution to one of
> > > > > > > > > > > > > those could work aswell for the imx_spi?
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > Thanks for pointing this out. So there is some inconsistency among
> > > > > > > > > > > > different SPI controller modeling.
> > > > > > > > > > >
> > > > > > > > > > > I'm not sure I understand you correctly but the controllers supporting
> > > > > > > > > > > commands with dummy clock cycles can only do it following the modeled
> > > > > > > > > > > approach, so I would rather say it is pretty consistent across the
> > > > > > > > > > > controllers (not all controllers support these commands though).
> > > > > > > > > >
> > > > > > > > > > I mean there are 2 approaches to emulate the dummy cycles for
> > > > > > > > >
> > > > > > > > > There is currently only 1 way of modeling dummy clock cycles. All commands that
> > > > > > > > > require / support them in m25p80 goes with that approach. An the controllers
> > > > > > > > > that support dummy clock cycles uses that approach.
> > > > > > > >
> > > > > > > > No, there are 2 ways. One way is how aspeed_smc, xilinx_spips and
> > > > > > > > npcm7xx do for dummy cycles. For these controllers, there are hardware
> > > > > > > > registers for dummy cycles, and software does not need to write
> > > > > > > > anything into the tx fifo. These models emulate one dummy cycle by
> > > > > > > > transferring one byte one the SPI line so we see there are actually a
> > > > > > > > number of (bit * 8) bytes needed in decode_fast_read_cmd(). The other
> > > > > > > > way is how imx_spi, mss-spi, pl022, stm32f2xx_spi and xilinx_spi. For
> > > > > > > > these controllers, they just transfer whatever is written by guest
> > > > > > > > software to tx fifo without any special awareness of dummy cycles.
> > > > > > >
> > > > > > >
> > > > > > > The xilinx_spips supports above way of transferring a command so you can look
> > > > > > > into that one for an example of how to handle a command pushed into a txfifo
> > > > > > > with regards to the dummy clock cycles. Not all controllers support generating
> > > > > > > dummy clock cycles, meaning that not all do the dummy byte -> dummy clock cycle
> > > > > > > conversion. The controllers that do not do this currently does not support
> > > > > > > issuing commands requiring them towards m25p80 (as FAST_READ, DOR, QOR etc..).
> > > > > >
> > > > > > No, I don't think inspecting tx fifo to decode the SPI command, and to
> > > > > > insert dummy cycles when seeing FAST_READ is the correct way for these
> > > > > > SPI controllers like imx_spi. The real hardware does not do this and
> > > > > > we should not make them behave like xilinx_spips.
> > > > >
> > > > > Above is not correct, the xilinx_spips does not insert dummy clock cycles, it
> > > > > converts the dummy bytes in the txfifo into the correct amount of dummy clock
> > > > > cycles needed to be generated for the dummy byte based on the command and state
> > > >
> > > > I was referring to aspeed_smc codes. It multiplies the dummy cycles by
> > > > 8 and calls ssi_transfer(). This to me is wrong. However I am not
> > > > familiar with xilinx-spips and I surely could be wrong, like you said.
> > > >
> > > > > of the controller. For example if the command (as DOR) uses 2 lines when
> > > > > transmitting the dummy byte it will issue 4 dummy clock cycles, if the command
> > > > > uses 4 lines (example QIOR) it converts the dummy bytes into 2 dummy clock
> > > > > cycles each.
> > > >
> > > > I will take some time to check the xilinx-spips spec. This might need some time.
> > > >
> > > > >
> > > > > How the hardware really works and how QEMU models it is not necessarly the
> > > > > same, for the FAST_READ command the hardware will generate 8 dummy cycles of
> > > >
> > > > Agree, they should not be necessarily the same but for this case I was
> > > > saying that type of hardware has no way to inspect the content being
> > > > transferred hence for QEMU it's impossible to do that either.
> > > >
> > > > > the dummy byte (probably by shifting out the byte), currently the only way to
> > > > > model this in QEMU is by making the controller convert the dummy
> > > > > byte into 8 byte writes into m25p80 (there's no command in m25p80 modeling
> > > > > this differently). If you would like to change this I think it is better if you
> > > > > post a patch series demonstrating a solution that is better and also solves all
> > > > > problems currently solved by the current one. Example problems (but not all)
> > > > > are the ones mentioned above.
> > > > >
> > > >
> > > > Anyway I will take a look at the xilinx_spips and propose something later.
> > >
> > > It looks like the xilinx_spips.c supports 3 types of SPI controllers:
> > >
> > > static void xilinx_spips_register_types(void)
> > > {
> > >     type_register_static(&xilinx_spips_info);
> > >     type_register_static(&xilinx_qspips_info);
> > >     type_register_static(&xlnx_zynqmp_qspips_info);
> > > }
> > >
> > > Dummy cycle handling happens in xilinx_qspips and xilinz_zynqmp_qspips codes.
> > >
> > > Could you please provide some instructions on how to test these 2
> > > controllers with upstream U-Boot or Linux?
> > > I tried Zynqmp upstream U-Boot but it does not boot on QEMU. I
> > > probably missed something.
> >
> > Ping?
> >
> > Could you kindly provide instructions of using QEMU to boot upstream
> > U-Boot or Linux?
>
> Ping?

Since there is no response, I am going to send a series that fixed the
Xilinx QSPIPS model dummy cycle issue, based on my experiment on a
real ZCU102 board.

Of course the m25p80 flash model mess will be fixed too.

Regards,
Bin
diff mbox series

Patch

diff --git a/hw/block/m25p80.c b/hw/block/m25p80.c
index 483925f..9b36762 100644
--- a/hw/block/m25p80.c
+++ b/hw/block/m25p80.c
@@ -825,6 +825,9 @@  static void decode_fast_read_cmd(Flash *s)
     s->needed_bytes = get_addr_length(s);
     switch (get_man(s)) {
     /* Dummy cycles - modeled with bytes writes instead of bits */
+    case MAN_SST:
+        s->needed_bytes += 1;
+        break;
     case MAN_WINBOND:
         s->needed_bytes += 8;
         break;