Patchwork [U-Boot,2/2] EXYNOS: SPI: Support SPI_PREAMBLE mode

Submitter Vadim Bendebury
Date May 3, 2013, 5:12 a.m.
Message ID <CANy1buL7pxzB9DfDHMV8ZTwbMohF1ai6GcB=h69nUUpxNq4Mgw@mail.gmail.com>
Permalink /patch/241147/
State Superseded
Delegated to: Jagannadha Sutradharudu Teki

Comments

Vadim Bendebury - May 3, 2013, 5:12 a.m.
[the original patch removed, re-sending from a registered address]

So, I spent some more time debugging a system which requires this
patch: a system where, on a SPI interface, a response to a command
could come way later than the command data transmission completes.

The original patch was trying to address many corner cases, but come
to think of it, in this situation the slave does not care about extra
data sent on the transmit interface. The extra data is needed anyway:
without it there is no clock, and no data could be transferred from
the slave.

Then, for this SPI interface we do not need to set the counter of
clocks, and do not need to keep adding more clocks if the data has not
been received yet, the clocks could be just free running. And then the
patch becomes much simpler, what do you think:

         * Bytes are transmitted/received in pairs. Wait to receive all the
@@ -243,13 +249,23 @@ static void spi_rx_tx(struct exynos_spi_slave *spi_slave, int todo,

                /* Keep the fifos full/empty. */
                spi_get_fifo_levels(regs, &rx_lvl, &tx_lvl);
-               if (tx_lvl < spi_slave->fifo_size && out_bytes) {
-                       temp = txp ? *txp++ : 0xff;
+               if (tx_lvl < spi_slave->fifo_size) {
+                       if (txp && out_bytes) {
+                               temp = *txp++;
+                               out_bytes--;
+                       } else {
+                               temp = 0xff;
+                       }
                        writel(temp, &regs->tx_data);
-                       out_bytes--;
                }
                if (rx_lvl > 0 && in_bytes) {
                        temp = readl(&regs->rx_data);
+                       if (hunting) {
+                               if ((temp & 0xff) != PREAMBLE_VALUE)
+                                       continue;
+                               else
+                                       hunting = 0;
+                       }
                        if (rxp)
                                *rxp++ = temp;
                        in_bytes--;
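
The hunting step in the hunk above can be looked at in isolation. The
following is a minimal sketch, not the driver code: it assumes the
received bytes are already in a buffer, and the helper name
hunt_and_copy and the value given to PREAMBLE_VALUE are made up for
illustration (the thread does not show the real definition). As in the
hunk, the preamble byte itself is stored, because clearing "hunting"
falls through to the store.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder value, assumed for illustration only. */
#define PREAMBLE_VALUE 0xec

/*
 * Hypothetical helper: copy up to 'want' bytes from 'stream' into 'out',
 * discarding everything that arrives before the first PREAMBLE_VALUE byte.
 * The preamble byte itself is kept. Returns the number of bytes stored.
 */
size_t hunt_and_copy(const uint8_t *stream, size_t stream_len,
                     uint8_t *out, size_t want)
{
    int hunting = 1;
    size_t stored = 0;

    assert(stream != NULL && out != NULL);

    for (size_t i = 0; i < stream_len && stored < want; i++) {
        uint8_t temp = stream[i];

        if (hunting) {
            if (temp != PREAMBLE_VALUE)
                continue;   /* idle byte: drop it and keep clocking */
            hunting = 0;    /* preamble seen: the response starts here */
        }
        out[stored++] = temp;
    }
    return stored;
}
```

Bytes dropped while hunting do not count against the requested length,
which is why the sketch only advances 'stored' after the preamble has
been seen.
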
Simon Glass - May 6, 2013, 2:37 p.m.
Hi Vadim,

On Thu, May 2, 2013 at 11:12 PM, Vadim Bendebury <vbendeb@chromium.org> wrote:
> [the original patch removed, re-sending from a registered address]
>
> So, I spent some more time debugging a system which requires this
> patch: a system, where on a SPI interface a response to a command
> could come way later than the command data transmission completes.
>
> The original patch was trying to address many corner cases, but come
> to think of it, in this situation the slave does not care about extra
> data sent on the transmit interface, as otherwise there is no clock
> and no data could be transferred from the slave.
>
> Then, for this SPI interface we do not need to set the counter of
> clocks, and do not need to keep adding more clocks if the data has not
> been received yet, the clocks could be just free running. And then the
> patch becomes much simpler, what do you think:

Does this deal with the performance problems that the old driver code
had? There were a number of other patches sent upstream by Rajeshwari
also. I wonder if it might be easier to do your improvement as a
separate patch on top of those instead. Then it can be considered on
its merits.

Regards,
Simon
Vadim Bendebury - May 6, 2013, 4:01 p.m.
On Mon, May 6, 2013 at 7:37 AM, Simon Glass <sjg@chromium.org> wrote:
>
> Hi Vadim,
>
> On Thu, May 2, 2013 at 11:12 PM, Vadim Bendebury <vbendeb@chromium.org> wrote:
> > [the original patch removed, re-sending from a registered address]
> >
> > So, I spent some more time debugging a system which requires this
> > patch: a system, where on a SPI interface a response to a command
> > could come way later than the command data transmission completes.
> >
> > The original patch was trying to address many corner cases, but come
> > to think of it, in this situation the slave does not care about extra
> > data sent on the transmit interface, as otherwise there is no clock
> > and no data could be transferred from the slave.
> >
> > Then, for this SPI interface we do not need to set the counter of
> > clocks, and do not need to keep adding more clocks if the data has not
> > been received yet, the clocks could be just free running. And then the
> > patch becomes much simpler, what do you think:
>
> Does this deal with the performance problems that the old driver code
> had? There were a number of other patches sent upstream by Rajeshwari
> also. I wonder if it might be easier to do your improvement as a
> separate patch on top of those instead. Then it can be considered on
> its merits.
>

Hi Simon,

what performance problems are there? Do you mean that u-boot is not
fast enough when polling the SPI interface? I thought about this -
even when clocking at 50MHz (resulting in 6.25 MB/s transfer rate)
with 64 byte FIFOs there should be no problem when serving the
interface, especially when receive and transmit flows are split in
time.
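
The timing budget behind this argument can be worked out directly. A
small sketch, assuming one bit is moved per clock cycle and nothing
else stalls the bus; the function names are made up for illustration:

```c
#include <assert.h>

/* Bytes per second on a SPI bus that moves one bit per clock cycle. */
double spi_bytes_per_sec(double clock_hz)
{
    assert(clock_hz > 0.0);
    return clock_hz / 8.0;
}

/* Microseconds the bus needs to fill (or drain) a FIFO of 'fifo_bytes'. */
double fifo_turnover_us(double clock_hz, unsigned fifo_bytes)
{
    return (double)fifo_bytes / spi_bytes_per_sec(clock_hz) * 1e6;
}
```

With a 50 MHz clock, spi_bytes_per_sec() gives 6.25 MB/s, and
fifo_turnover_us() gives roughly 10.24 microseconds for a 64 byte
FIFO. That is the interval within which the polling loop has to come
back to the FIFO before it overflows or runs dry.
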

Has there been any evidence of performance problems?  Also, I noticed
that the driver does not pay any respect to error conditions. I am
planning to add error monitoring/processing code, as this would be a
good way to know if there indeed are performance problems.

cheers, --vb


> Regards,
> Simon
Simon Glass - May 11, 2013, 9:38 p.m.
Hi Vadim,

On Mon, May 6, 2013 at 10:01 AM, Vadim Bendebury <vbendeb@chromium.org> wrote:
> On Mon, May 6, 2013 at 7:37 AM, Simon Glass <sjg@chromium.org> wrote:
>>
>> Hi Vadim,
>>
>> On Thu, May 2, 2013 at 11:12 PM, Vadim Bendebury <vbendeb@chromium.org> wrote:
>> > [the original patch removed, re-sending from a registered address]
>> >
>> > So, I spent some more time debugging a system which requires this
>> > patch: a system, where on a SPI interface a response to a command
>> > could come way later than the command data transmission completes.
>> >
>> > The original patch was trying to address many corner cases, but come
>> > to think of it, in this situation the slave does not care about extra
>> > data sent on the transmit interface, as otherwise there is no clock
>> > and no data could be transferred from the slave.
>> >
>> > Then, for this SPI interface we do not need to set the counter of
>> > clocks, and do not need to keep adding more clocks if the data has not
>> > been received yet, the clocks could be just free running. And then the
>> > patch becomes much simpler, what do you think:
>>
>> Does this deal with the performance problems that the old driver code
>> had? There were a number of other patches sent upstream by Rajeshwari
>> also. I wonder if it might be easier to do your improvement as a
>> separate patch on top of those instead. Then it can be considered on
>> its merits.
>>
>
> Hi Simon,
>
> what performance problems are there? Do you mean that u-boot is not
> fast enough when polling the SPI interface? I thought about this -
> even when clocking at 50MHz (resulting in 6.25 MB/s transfer rate)
> with 64 byte FIFOs there should be no problem when serving the
> interface, especially when receive and transmit flows are split in
> time.
>
> Has there been any evidence of performance problems?  Also, I noticed
> that the driver does not pay any respect to error conditions. I am
> planning to add error monitoring/processing code, as this would be a
> good way to know if there indeed are performance problems.

The issue is not the hardware but the software. Yes, the hardware is
well able to keep up, but it does have some oddities. For example,
reading the FIFO level registers seems to take a while, as does
reading/writing data to the FIFO.
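
To illustrate why the number of level register reads matters, here is
a toy model. It is only a sketch of the general idea of reading the
level once and then draining that many bytes; it is not taken from
Rajeshwari's patches, and struct fake_fifo and all function names are
hypothetical stand-ins for the real registers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the controller; counts slow status reads. */
struct fake_fifo {
    const uint8_t *data;   /* bytes waiting in the RX FIFO */
    size_t level;          /* total number of bytes available */
    size_t pos;            /* how many have been read so far */
    unsigned level_reads;  /* number of level register reads */
};

static size_t read_level(struct fake_fifo *f)
{
    f->level_reads++;
    return f->level - f->pos;
}

static uint8_t read_data(struct fake_fifo *f)
{
    assert(f->pos < f->level);
    return f->data[f->pos++];
}

/* Poll the level register before every single byte. */
size_t drain_per_byte(struct fake_fifo *f, uint8_t *out, size_t want)
{
    size_t got = 0;

    while (got < want && read_level(f) > 0)
        out[got++] = read_data(f);
    return got;
}

/* Read the level once, then drain that many bytes without re-polling. */
size_t drain_batched(struct fake_fifo *f, uint8_t *out, size_t want)
{
    size_t got = 0;

    while (got < want) {
        size_t avail = read_level(f);

        if (avail == 0)
            break;
        while (avail-- && got < want)
            out[got++] = read_data(f);
    }
    return got;
}
```

In this model, draining 8 waiting bytes costs 8 level reads with
drain_per_byte() and 1 with drain_batched(). If a level read is slow
on the real controller, that difference adds up over a large transfer.
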

I did a bit of benchmarking comparing the original upstream driver
with the driver after Rajeshwari's patches are applied. I posted that
to the list earlier today, but roughly speaking it is 100x faster, and
SPI boot time is reduced by about half a second. Unfortunately it is a
bit more complicated, but it is reliable and the code is well tested
with lots of units in the field :-)

Regards,
Simon

Patch

diff --git a/drivers/spi/exynos_spi.c b/drivers/spi/exynos_spi.c
index c697db8..fff8310 100644
--- a/drivers/spi/exynos_spi.c
+++ b/drivers/spi/exynos_spi.c
@@ -211,10 +211,10 @@  static void spi_get_fifo_levels(struct exynos_spi *regs,
  */
 static void spi_request_bytes(struct exynos_spi *regs, int count)
 {
-       assert(count && count < (1 << 16));
        setbits_le32(&regs->ch_cfg, SPI_CH_RST);
        clrbits_le32(&regs->ch_cfg, SPI_CH_RST);
-       writel(count | SPI_PACKET_CNT_EN, &regs->pkt_cnt);
+       if (count)
+               writel(count | SPI_PACKET_CNT_EN, &regs->pkt_cnt);
 }

 static void spi_rx_tx(struct exynos_spi_slave *spi_slave, int todo,
@@ -225,14 +225,20 @@  static void spi_rx_tx(struct exynos_spi_slave *spi_slave, int todo,
        const uchar *txp = *doutp;
        int rx_lvl, tx_lvl;
        uint out_bytes, in_bytes;
-
+       int hunting;
+
+       if (spi_slave->free_running_mode) {
+               spi_request_bytes(regs, 0);
+               hunting = 1;
+       } else {
+               hunting = 0;
+               spi_request_bytes(regs, todo);
+       }
        out_bytes = in_bytes = todo;

-       /*
-        * If there's something to send, do a software reset and set a
-        * transaction size.
-        */
-       spi_request_bytes(regs, todo);
+       /* Software reset the channel. */
+       setbits_le32(&regs->ch_cfg, SPI_CH_RST);
+       clrbits_le32(&regs->ch_cfg, SPI_CH_RST);

        /*