Message ID | 538D9DB3.4080905@solarflare.com |
---|---|
State | Changes Requested, archived |
Delegated to: | David Miller |
Headers | show |
Hello. On 06/03/2014 02:04 PM, Shradha Shah wrote: > From: Jon Cooper <jcooper@solarflare.com> > Patch to open-code the memory copy routines. > 32bit writes over the PCI bus causes data corruption. > Fixes:ee45fd92c739 > ("sfc: Use TX PIO for sufficiently small packets") > Signed-off-by: Shradha Shah <sshah@solarflare.com> > --- > drivers/net/ethernet/sfc/tx.c | 24 +++++++++++++++++++----- > 1 file changed, 19 insertions(+), 5 deletions(-) > diff --git a/drivers/net/ethernet/sfc/tx.c b/drivers/net/ethernet/sfc/tx.c > index fa94753..d2c9ca0 100644 > --- a/drivers/net/ethernet/sfc/tx.c > +++ b/drivers/net/ethernet/sfc/tx.c > @@ -189,6 +189,20 @@ struct efx_short_copy_buffer { > u8 buf[L1_CACHE_BYTES]; > }; > > +/* Copy in explicit 64-bit writes. */ > +static void efx_memcpy_64(void __iomem *dest, void *src, size_t len) > +{ > + u64 *src64 = src, __iomem *dest64 = dest; > + size_t i, l64 = len / 8; > + > + WARN_ON_ONCE(len % 8 != 0); > + WARN_ON_ONCE(((u8 __iomem *)dest - (u8 __iomem *)0) % 8 != 0); > + BUILD_BUG_ON(sizeof(uint64_t) != 8); > + > + for (i = 0; i < l64; ++i) > + writeq(src64[i], dest64+i); Could you please surround + by spaces for consistency? > +} > + WBR, Sergei -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, 2014-06-03 at 20:45 +0400, Sergei Shtylyov wrote: > Hello. > > On 06/03/2014 02:04 PM, Shradha Shah wrote: > > > From: Jon Cooper <jcooper@solarflare.com> > > > Patch to open-code the memory copy routines. > > 32bit writes over the PCI bus causes data corruption. > > > Fixes:ee45fd92c739 > > ("sfc: Use TX PIO for sufficiently small packets") > > > Signed-off-by: Shradha Shah <sshah@solarflare.com> > > --- > > drivers/net/ethernet/sfc/tx.c | 24 +++++++++++++++++++----- > > 1 file changed, 19 insertions(+), 5 deletions(-) > > > diff --git a/drivers/net/ethernet/sfc/tx.c b/drivers/net/ethernet/sfc/tx.c > > index fa94753..d2c9ca0 100644 > > --- a/drivers/net/ethernet/sfc/tx.c > > +++ b/drivers/net/ethernet/sfc/tx.c > > @@ -189,6 +189,20 @@ struct efx_short_copy_buffer { > > u8 buf[L1_CACHE_BYTES]; > > }; > > > > +/* Copy in explicit 64-bit writes. */ > > +static void efx_memcpy_64(void __iomem *dest, void *src, size_t len) > > +{ > > + u64 *src64 = src, __iomem *dest64 = dest; > > + size_t i, l64 = len / 8; > > + > > + WARN_ON_ONCE(len % 8 != 0); > > + WARN_ON_ONCE(((u8 __iomem *)dest - (u8 __iomem *)0) % 8 != 0); > > + BUILD_BUG_ON(sizeof(uint64_t) != 8); > > + > > + for (i = 0; i < l64; ++i) > > + writeq(src64[i], dest64+i); > > Could you please surround + by spaces for consistency? > > > +} > > + The BUILD_BUG_ON seems unnecessary. The separate WARN_ON_ONCEs could be combined. The subtraction of 0 just seems odd. Would this be clearer as: static void efx_memcpy_64(void __iomem *dest, void *src, size_t len) { u64 *src64 = src, u64 __iomem *dest64 = dest; size_t l64 = len / 8; size_t i; WARN_ON_ONCE(len % 8 || dest64 % 8); for (i = 0; i < l64; i++) writeq(src64[i], &dest64[i]); } -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
From: Shradha Shah <sshah@solarflare.com> Date: Tue, 3 Jun 2014 11:04:35 +0100 > + writeq(src64[i], dest64+i); What does writeq() do on a 32-bit machine? Did you do any functional testing of this on such a machine? I'm extremely disappointed in this patch submission, because you didn't even _compile_ test this in the environment where you claim the problem exists. A 32-bit build with this patch applied results in: CC drivers/net/ethernet/sfc/tx.o drivers/net/ethernet/sfc/tx.c: In function ‘efx_memcpy_64’: drivers/net/ethernet/sfc/tx.c:203:3: error: implicit declaration of function ‘writeq’ [-Werror=implicit-function-declaration] cc1: some warnings being treated as errors make[1]: *** [drivers/net/ethernet/sfc/tx.o] Error 1 make: *** [drivers/net/ethernet/sfc/tx.o] Error 2 This is an extremely _LOW_ quality patch submission. You put very little time and effort into the changes I asked you to make. And clearly you did absolutely not functional testing of this change, what if with the modification it didn't fix the bug any longer? It is imperitive that you take your time and go implement these changes properly. Yes, you can clearly see now that a proper interface for writeq() is not provided that actually uses 64-bit operations on 32-bit systems. And yes, _you_ will need to resolve this somehow. And most importantly, you will need to both compile and functionally test the change before you even think about posting it again here. Thanks.
On 03/06/14 23:58, David Miller wrote: > Did you do any functional testing of this on such a machine? > > I'm extremely disappointed in this patch submission, because you > didn't even_compile_ test this in the environment where you claim > the problem exists. The next patch posted in the series would have meant that this change was limited to x86_64 systems. I am sorry that by posting the patches in an incorrect order/not merging them we gave the impression that we do not test these patches (but hands up, patch 1/2 did break 32 bit compiles and certainly did need more work. Thanks to Sergei, Joe and yourself for comments) For x86_64 machines we checked writeq() implementations in io.h and disassembled the module to ensure that a single 64bit access was used. I would like to get an improved bug fix upstream, as I know this rare hardware issue has been hit in the wild and with TX checksum offload enabled corrupted data can get to the application. The reason for limiting to x86_64 is that in this code path there is an IO write to a write-combined mapping. It was a very hard case to hit (believed to depend on the timing of accesses and resulting write-combining behaviour), but we never reproduced it in our lab. Limiting this IO to WC mappings to x86_64 only was a safety measure given the amount of hours of testing we have on x86_64 vs.other 64 bit platforms that we test with (PPC64) This feature gives a small latency win on this hardware family for small packets. Unless there are more comments to consider the next post will be a merged patch, addressing all points raised so far. Regards Rob -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/drivers/net/ethernet/sfc/tx.c b/drivers/net/ethernet/sfc/tx.c index fa94753..d2c9ca0 100644 --- a/drivers/net/ethernet/sfc/tx.c +++ b/drivers/net/ethernet/sfc/tx.c @@ -189,6 +189,20 @@ struct efx_short_copy_buffer { u8 buf[L1_CACHE_BYTES]; }; +/* Copy in explicit 64-bit writes. */ +static void efx_memcpy_64(void __iomem *dest, void *src, size_t len) +{ + u64 *src64 = src, __iomem *dest64 = dest; + size_t i, l64 = len / 8; + + WARN_ON_ONCE(len % 8 != 0); + WARN_ON_ONCE(((u8 __iomem *)dest - (u8 __iomem *)0) % 8 != 0); + BUILD_BUG_ON(sizeof(uint64_t) != 8); + + for (i = 0; i < l64; ++i) + writeq(src64[i], dest64+i); +} + /* Copy to PIO, respecting that writes to PIO buffers must be dword aligned. * Advances piobuf pointer. Leaves additional data in the copy buffer. */ @@ -198,7 +212,7 @@ static void efx_memcpy_toio_aligned(struct efx_nic *efx, u8 __iomem **piobuf, { int block_len = len & ~(sizeof(copy_buf->buf) - 1); - memcpy_toio(*piobuf, data, block_len); + efx_memcpy_64(*piobuf, data, block_len); *piobuf += block_len; len -= block_len; @@ -230,7 +244,7 @@ static void efx_memcpy_toio_aligned_cb(struct efx_nic *efx, u8 __iomem **piobuf, if (copy_buf->used < sizeof(copy_buf->buf)) return; - memcpy_toio(*piobuf, copy_buf->buf, sizeof(copy_buf->buf)); + efx_memcpy_64(*piobuf, copy_buf->buf, sizeof(copy_buf->buf)); *piobuf += sizeof(copy_buf->buf); data += copy_to_buf; len -= copy_to_buf; @@ -245,7 +259,7 @@ static void efx_flush_copy_buffer(struct efx_nic *efx, u8 __iomem *piobuf, { /* if there's anything in it, write the whole buffer, including junk */ if (copy_buf->used) - memcpy_toio(piobuf, copy_buf->buf, sizeof(copy_buf->buf)); + efx_memcpy_64(piobuf, copy_buf->buf, sizeof(copy_buf->buf)); } /* Traverse skb structure and copy fragments in to PIO buffer. @@ -304,8 +318,8 @@ efx_enqueue_skb_pio(struct efx_tx_queue *tx_queue, struct sk_buff *skb) */ BUILD_BUG_ON(L1_CACHE_BYTES > SKB_DATA_ALIGN(sizeof(struct skb_shared_info))); - memcpy_toio(tx_queue->piobuf, skb->data, - ALIGN(skb->len, L1_CACHE_BYTES)); + efx_memcpy_64(tx_queue->piobuf, skb->data, + ALIGN(skb->len, L1_CACHE_BYTES)); } EFX_POPULATE_QWORD_5(buffer->option,