diff mbox

stmmac: explicitly zero des0 & des1 on init

Message ID 1434476441-18241-1-git-send-email-abrodkin@synopsys.com
State Changes Requested, archived
Delegated to: David Miller
Headers show

Commit Message

Alexey Brodkin June 16, 2015, 5:40 p.m. UTC
Current implementtion of descriptor init procedure only takes care about
ownership flag. While it is perfectly possible to have underlying memory
filled with garbage on boot or driver installation.

And randomly set flags in non-zeroed des0 and des1 fields may lead to
unpredictable behavior of the GMAC DMA block.

Solution to this problem is as simple as explicit zeroing of both des0
and des1 fields of all buffer descriptors.

Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Cc: arc-linux-dev@synopsys.com
Cc: linux-kernel@vger.kernel.org
Cc: stable@vger.kernel.org
---
 drivers/net/ethernet/stmicro/stmmac/descs.h     | 2 ++
 drivers/net/ethernet/stmicro/stmmac/enh_desc.c  | 3 ++-
 drivers/net/ethernet/stmicro/stmmac/norm_desc.c | 3 ++-
 3 files changed, 6 insertions(+), 2 deletions(-)

Comments

Vineet Gupta June 17, 2015, 7:03 a.m. UTC | #1
+CC linux-arch, linux-mm, Arnd and Marek

On Tuesday 16 June 2015 11:11 PM, Alexey Brodkin wrote:

Current implementtion of descriptor init procedure only takes care about
ownership flag. While it is perfectly possible to have underlying memory
filled with garbage on boot or driver installation.

And randomly set flags in non-zeroed des0 and des1 fields may lead to
unpredictable behavior of the GMAC DMA block.

Solution to this problem is as simple as explicit zeroing of both des0
and des1 fields of all buffer descriptors.

Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com><mailto:abrodkin@synopsys.com>
Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com><mailto:peppe.cavallaro@st.com>
Cc: arc-linux-dev@synopsys.com<mailto:arc-linux-dev@synopsys.com>
Cc: linux-kernel@vger.kernel.org<mailto:linux-kernel@vger.kernel.org>
Cc: stable@vger.kernel.org<mailto:stable@vger.kernel.org>

FWIW, this was causing sporadic/random networking flakiness on ARC SDP platform (scheduled for upstream inclusion in next window)

This also leads to an interesting question - should arch/*/dma_alloc_coherent() and friends unconditionally zero out memory (vs. the current semantics of letting only doing it based on gfp, as requested by driver). This is the second instance we ran into stale descriptor memory, the first one was in dw_mmc driver which was recently fixed in upstream as well (although debugged independently by Alexey and using the upstream fix)

http://www.spinics.net/lists/linux-mmc/msg31600.html

The pros is better out of box experience (despite buggy drivers) while the cons are they remain broken and perhaps increased boot time due to extra memzero....

Thx,
-Vineet
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Miller June 21, 2015, 4:29 p.m. UTC | #2
From: Alexey Brodkin <Alexey.Brodkin@synopsys.com>
Date: Tue, 16 Jun 2015 20:40:41 +0300

> Current implementtion of descriptor init procedure only takes care about
> ownership flag. While it is perfectly possible to have underlying memory
> filled with garbage on boot or driver installation.
> 
> And randomly set flags in non-zeroed des0 and des1 fields may lead to
> unpredictable behavior of the GMAC DMA block.
> 
> Solution to this problem is as simple as explicit zeroing of both des0
> and des1 fields of all buffer descriptors.
> 
> Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>

If you need the memory zero initialized, use dma_zalloc_coherent().
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
Alexey Brodkin June 22, 2015, 6:43 a.m. UTC | #3
Hi David,

On Sun, 2015-06-21 at 09:29 -0700, David Miller wrote:
> From: Alexey Brodkin <Alexey.Brodkin@synopsys.com>
> Date: Tue, 16 Jun 2015 20:40:41 +0300
> 
> > Current implementtion of descriptor init procedure only takes care 
> > about
> > ownership flag. While it is perfectly possible to have underlying 
> > memory
> > filled with garbage on boot or driver installation.
> > 
> > And randomly set flags in non-zeroed des0 and des1 fields may lead 
> > to
> > unpredictable behavior of the GMAC DMA block.
> > 
> > Solution to this problem is as simple as explicit zeroing of both 
> > des0
> > and des1 fields of all buffer descriptors.
> > 
> > Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
> 
> If you need the memory zero initialized, use dma_zalloc_coherent().

Indeed usage of dma_zalloc_coherent() will resolve observed issue.
But since buffer descriptors are reused extensively I would say that
explicit zeroing of fields with flags is useful. Probably I need to add
this clarification in commit message.

And then if we do that explicit zeroing of flags and other fields which
hold data size and addresses of data buffer and the next descriptor in
chain are all get set later we may not care about allocation of zeroed
memory.

-Alexey--
To unsubscribe from this list: send the line "unsubscribe netdev" in
Alexey Brodkin June 22, 2015, 8:08 a.m. UTC | #4
SGkgYWxsLA0KDQpPbiBXZWQsIDIwMTUtMDYtMTcgYXQgMDc6MDMgKzAwMDAsIFZpbmVldCBHdXB0
YSB3cm90ZToNCitDQyBsaW51eC1hcmNoLCBsaW51eC1tbSwgQXJuZCBhbmQgTWFyZWsNCg0KT24g
VHVlc2RheSAxNiBKdW5lIDIwMTUgMTE6MTEgUE0sIEFsZXhleSBCcm9ka2luIHdyb3RlOg0KDQpD
dXJyZW50IGltcGxlbWVudHRpb24gb2YgZGVzY3JpcHRvciBpbml0IHByb2NlZHVyZSBvbmx5IHRh
a2VzIGNhcmUgYWJvdXQNCm93bmVyc2hpcCBmbGFnLiBXaGlsZSBpdCBpcyBwZXJmZWN0bHkgcG9z
c2libGUgdG8gaGF2ZSB1bmRlcmx5aW5nIG1lbW9yeQ0KZmlsbGVkIHdpdGggZ2FyYmFnZSBvbiBi
b290IG9yIGRyaXZlciBpbnN0YWxsYXRpb24uDQoNCkFuZCByYW5kb21seSBzZXQgZmxhZ3MgaW4g
bm9uLXplcm9lZCBkZXMwIGFuZCBkZXMxIGZpZWxkcyBtYXkgbGVhZCB0bw0KdW5wcmVkaWN0YWJs
ZSBiZWhhdmlvciBvZiB0aGUgR01BQyBETUEgYmxvY2suDQoNClNvbHV0aW9uIHRvIHRoaXMgcHJv
YmxlbSBpcyBhcyBzaW1wbGUgYXMgZXhwbGljaXQgemVyb2luZyBvZiBib3RoIGRlczANCmFuZCBk
ZXMxIGZpZWxkcyBvZiBhbGwgYnVmZmVyIGRlc2NyaXB0b3JzLg0KDQpTaWduZWQtb2ZmLWJ5OiBB
bGV4ZXkgQnJvZGtpbiA8YWJyb2RraW5Ac3lub3BzeXMuY29tPjxtYWlsdG86YWJyb2RraW5Ac3lu
b3BzeXMuY29tPg0KQ2M6IEdpdXNlcHBlIENhdmFsbGFybyA8cGVwcGUuY2F2YWxsYXJvQHN0LmNv
bT48bWFpbHRvOnBlcHBlLmNhdmFsbGFyb0BzdC5jb20+DQpDYzogYXJjLWxpbnV4LWRldkBzeW5v
cHN5cy5jb208bWFpbHRvOmFyYy1saW51eC1kZXZAc3lub3BzeXMuY29tPg0KQ2M6IGxpbnV4LWtl
cm5lbEB2Z2VyLmtlcm5lbC5vcmc8bWFpbHRvOmxpbnV4LWtlcm5lbEB2Z2VyLmtlcm5lbC5vcmc+
DQpDYzogc3RhYmxlQHZnZXIua2VybmVsLm9yZzxtYWlsdG86c3RhYmxlQHZnZXIua2VybmVsLm9y
Zz4NCg0KRldJVywgdGhpcyB3YXMgY2F1c2luZyBzcG9yYWRpYy9yYW5kb20gbmV0d29ya2luZyBm
bGFraW5lc3Mgb24gQVJDIFNEUCBwbGF0Zm9ybSAoc2NoZWR1bGVkIGZvciB1cHN0cmVhbSBpbmNs
dXNpb24gaW4gbmV4dCB3aW5kb3cpDQoNClRoaXMgYWxzbyBsZWFkcyB0byBhbiBpbnRlcmVzdGlu
ZyBxdWVzdGlvbiAtIHNob3VsZCBhcmNoLyovZG1hX2FsbG9jX2NvaGVyZW50KCkgYW5kIGZyaWVu
ZHMgdW5jb25kaXRpb25hbGx5IHplcm8gb3V0IG1lbW9yeSAodnMuIHRoZSBjdXJyZW50IHNlbWFu
dGljcyBvZiBsZXR0aW5nIG9ubHkgZG9pbmcgaXQgYmFzZWQgb24gZ2ZwLCBhcyByZXF1ZXN0ZWQg
YnkgZHJpdmVyKS4gVGhpcyBpcyB0aGUgc2Vjb25kIGluc3RhbmNlIHdlIHJhbiBpbnRvIHN0YWxl
IGRlc2NyaXB0b3IgbWVtb3J5LCB0aGUgZmlyc3Qgb25lIHdhcyBpbiBkd19tbWMgZHJpdmVyIHdo
aWNoIHdhcyByZWNlbnRseSBmaXhlZCBpbiB1cHN0cmVhbSBhcyB3ZWxsIChhbHRob3VnaCBkZWJ1
Z2dlZCBpbmRlcGVuZGVudGx5IGJ5IEFsZXhleSBhbmQgdXNpbmcgdGhlIHVwc3RyZWFtIGZpeCkN
Cg0KaHR0cDovL3d3dy5zcGluaWNzLm5ldC9saXN0cy9saW51eC1tbWMvbXNnMzE2MDAuaHRtbA0K
DQpUaGUgcHJvcyBpcyBiZXR0ZXIgb3V0IG9mIGJveCBleHBlcmllbmNlIChkZXNwaXRlIGJ1Z2d5
IGRyaXZlcnMpIHdoaWxlIHRoZSBjb25zIGFyZSB0aGV5IHJlbWFpbiBicm9rZW4gYW5kIHBlcmhh
cHMgaW5jcmVhc2VkIGJvb3QgdGltZSBkdWUgdG8gZXh0cmEgbWVtemVyby4uLi4NCg0KUHJvYmFi
bHkgaWYgd2UgYWxyZWFkeSBoYXZlIGRtYV96YWxsb2NfY29oZXJlbnQoKSB0aGF0IGRvZXMgZXhw
bGljaXQgemVyb2luZyBvZiByZXR1cm5lZCBtZW1vcnkgdGhlbiB0aGVyZSdzIG5vIG5lZWQgdG8g
ZG8gaW1wbGljaXQgemVyb2luZyBpbiBkbWFfYWxsb2NfY29oZXJlbnQoKT8NCg0KLUFsZXhleQ0K
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
Vineet Gupta June 22, 2015, 8:34 a.m. UTC | #5
On Monday 22 June 2015 01:38 PM, Alexey Brodkin wrote:
> Hi all,
> 
> On Wed, 2015-06-17 at 07:03 +0000, Vineet Gupta wrote:
> +CC linux-arch, linux-mm, Arnd and Marek
> 
> On Tuesday 16 June 2015 11:11 PM, Alexey Brodkin wrote:
> 
> Current implementtion of descriptor init procedure only takes care about
> ownership flag. While it is perfectly possible to have underlying memory
> filled with garbage on boot or driver installation.
> 
> And randomly set flags in non-zeroed des0 and des1 fields may lead to
> unpredictable behavior of the GMAC DMA block.
> 
> Solution to this problem is as simple as explicit zeroing of both des0
> and des1 fields of all buffer descriptors.
> 
> Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com><mailto:abrodkin@synopsys.com>
> Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com><mailto:peppe.cavallaro@st.com>
> Cc: arc-linux-dev@synopsys.com<mailto:arc-linux-dev@synopsys.com>
> Cc: linux-kernel@vger.kernel.org<mailto:linux-kernel@vger.kernel.org>
> Cc: stable@vger.kernel.org<mailto:stable@vger.kernel.org>
> 
> FWIW, this was causing sporadic/random networking flakiness on ARC SDP platform (scheduled for upstream inclusion in next window)
> 
> This also leads to an interesting question - should arch/*/dma_alloc_coherent() and friends unconditionally zero out memory (vs. the current semantics of letting only doing it based on gfp, as requested by driver). This is the second instance we ran into stale descriptor memory, the first one was in dw_mmc driver which was recently fixed in upstream as well (although debugged independently by Alexey and using the upstream fix)
> 
> http://www.spinics.net/lists/linux-mmc/msg31600.html
> 
> The pros is better out of box experience (despite buggy drivers) while the cons are they remain broken and perhaps increased boot time due to extra memzero....
> 
> Probably if we already have dma_zalloc_coherent() that does explicit zeroing of returned memory then there's no need to do implicit zeroing in dma_alloc_coherent()?


The question is, when drivers don't have dma_zalloc_coherent() - meaning they
don't pass __GFP_ZERO, which causes these random issues, do we need to be more
conservative in arch code (ARC at least is) or do we need to debug and fix these
drivers - one by one.

FWIW, ARC needs to fix __GFP_ZERO case, since we are doing memzero twice.

-Vineet
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
Alexey Brodkin June 23, 2015, 3:04 p.m. UTC | #6
Hi David,

On Mon, 2015-06-22 at 09:43 +0300, Alexey Brodkin wrote:
> Hi David,
> 
> On Sun, 2015-06-21 at 09:29 -0700, David Miller wrote:
> > From: Alexey Brodkin <Alexey.Brodkin@synopsys.com>
> > Date: Tue, 16 Jun 2015 20:40:41 +0300
> > 
> > > Current implementtion of descriptor init procedure only takes 
> > > care 
> > > about
> > > ownership flag. While it is perfectly possible to have underlying 
> > > 
> > > memory
> > > filled with garbage on boot or driver installation.
> > > 
> > > And randomly set flags in non-zeroed des0 and des1 fields may 
> > > lead 
> > > to
> > > unpredictable behavior of the GMAC DMA block.
> > > 
> > > Solution to this problem is as simple as explicit zeroing of both 
> > > 
> > > des0
> > > and des1 fields of all buffer descriptors.
> > > 
> > > Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
> > 
> > If you need the memory zero initialized, use dma_zalloc_coherent().
> 
> Indeed usage of dma_zalloc_coherent() will resolve observed issue.
> But since buffer descriptors are reused extensively I would say that
> explicit zeroing of fields with flags is useful. Probably I need to 
> add
> this clarification in commit message.
> 
> And then if we do that explicit zeroing of flags and other fields 
> which
> hold data size and addresses of data buffer and the next descriptor 
> in
> chain are all get set later we may not care about allocation of 
> zeroed
> memory.

I'm wondering if my comment makes sense and should I just change commit
message or you'd prefer to still use dma_zalloc_coherent() during
driver probe?

-Alexey--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Miller June 24, 2015, 7:18 a.m. UTC | #7
From: Alexey Brodkin <Alexey.Brodkin@synopsys.com>
Date: Tue, 23 Jun 2015 15:04:31 +0000

> I'm wondering if my comment makes sense and should I just change commit
> message or you'd prefer to still use dma_zalloc_coherent() during
> driver probe?

I think you should do both, convert to dma_zalloc_coherent() and perform
the explicit clearing in the routine in question.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/drivers/net/ethernet/stmicro/stmmac/descs.h b/drivers/net/ethernet/stmicro/stmmac/descs.h
index ad39960..799c292 100644
--- a/drivers/net/ethernet/stmicro/stmmac/descs.h
+++ b/drivers/net/ethernet/stmicro/stmmac/descs.h
@@ -158,6 +158,8 @@  struct dma_desc {
 			u32 buffer2_size:13;
 			u32 reserved4:3;
 		} etx;		/* -- enhanced -- */
+
+		u64 all_flags;
 	} des01;
 	unsigned int des2;
 	unsigned int des3;
diff --git a/drivers/net/ethernet/stmicro/stmmac/enh_desc.c b/drivers/net/ethernet/stmicro/stmmac/enh_desc.c
index 1e2bcf5..7d94444 100644
--- a/drivers/net/ethernet/stmicro/stmmac/enh_desc.c
+++ b/drivers/net/ethernet/stmicro/stmmac/enh_desc.c
@@ -240,6 +240,7 @@  static int enh_desc_get_rx_status(void *data, struct stmmac_extra_stats *x,
 static void enh_desc_init_rx_desc(struct dma_desc *p, int disable_rx_ic,
 				  int mode, int end)
 {
+	p->des01.all_flags = 0;
 	p->des01.erx.own = 1;
 	p->des01.erx.buffer1_size = BUF_SIZE_8KiB - 1;
 
@@ -254,7 +255,7 @@  static void enh_desc_init_rx_desc(struct dma_desc *p, int disable_rx_ic,
 
 static void enh_desc_init_tx_desc(struct dma_desc *p, int mode, int end)
 {
-	p->des01.etx.own = 0;
+	p->des01.all_flags = 0;
 	if (mode == STMMAC_CHAIN_MODE)
 		ehn_desc_tx_set_on_chain(p, end);
 	else
diff --git a/drivers/net/ethernet/stmicro/stmmac/norm_desc.c b/drivers/net/ethernet/stmicro/stmmac/norm_desc.c
index 35ad4f4..48c3456 100644
--- a/drivers/net/ethernet/stmicro/stmmac/norm_desc.c
+++ b/drivers/net/ethernet/stmicro/stmmac/norm_desc.c
@@ -123,6 +123,7 @@  static int ndesc_get_rx_status(void *data, struct stmmac_extra_stats *x,
 static void ndesc_init_rx_desc(struct dma_desc *p, int disable_rx_ic, int mode,
 			       int end)
 {
+	p->des01.all_flags = 0;
 	p->des01.rx.own = 1;
 	p->des01.rx.buffer1_size = BUF_SIZE_2KiB - 1;
 
@@ -137,7 +138,7 @@  static void ndesc_init_rx_desc(struct dma_desc *p, int disable_rx_ic, int mode,
 
 static void ndesc_init_tx_desc(struct dma_desc *p, int mode, int end)
 {
-	p->des01.tx.own = 0;
+	p->des01.all_flags = 0;
 	if (mode == STMMAC_CHAIN_MODE)
 		ndesc_tx_set_on_chain(p, end);
 	else