Patchwork [3.8-rc] regression: NETDEV WATCHDOG: eth0 (r8169): transmit queue 0 timed out

Submitter françois romieu
Date Jan. 5, 2013, 4:57 p.m.
Message ID <20130105165735.GB4906@electric-eye.fr.zoreil.com>
Permalink /patch/209687/
State RFC
Delegated to: David Miller

Comments

françois romieu - Jan. 5, 2013, 4:57 p.m.
Jörg Otte <jrg.otte@gmail.com> :
[...]
> jojo@ahorn:~$ dmesg | grep XID
> [    1.808847] r8169 0000:02:00.0 eth0: RTL8168evl/8111evl at
> 0xffffc90000054000, 5c:9a:d8:69:2b:39, XID 0c900800 IRQ 42

Can you check if things improve with v3.8-rc2 after removing :

1. 9ecb9aabaf634677c77af467f4e3028b09d7bcda 
   r8169: workaround for missing extended GigaMAC registers
2. d64ec841517a25f6d468bde9f67e5b4cffdc67c7
   r8169: enable internal ASPM and clock request settings
3. e0c075577965d1c01b30038d38bf637b027a1df3
   r8169: enable ALDPS for power saving

(you can directly try v3.7 r8169.c with v3.8-rc2 if it worked for you
so far) 

If the regression is still there, please apply the patch below to both
an unpatched v3.8-rc2 and a known working version, then send me the dmesg
output from each after running 'ip link set dev eth0 up'.

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jörg Otte - Jan. 6, 2013, 1:03 p.m.
2013/1/5 Francois Romieu <romieu@fr.zoreil.com>:
> Can you check if things improve with v3.8-rc2 after removing :
>
> 1. 9ecb9aabaf634677c77af467f4e3028b09d7bcda
>    r8169: workaround for missing extended GigaMAC registers
> 2. d64ec841517a25f6d468bde9f67e5b4cffdc67c7
>    r8169: enable internal ASPM and clock request settings

Doesn't help for this problem.

However this fixes a second issue for me:
In 3.7.1 at startup the link came up after 15 sec.:
grep r8169 dmesg.3.7.1
[    1.956842] r8169 Gigabit Ethernet driver 2.3LK-NAPI loaded
[    1.957059] r8169 0000:02:00.0: irq 42 for MSI/MSI-X
[    1.957161] r8169 0000:02:00.0 eth0: RTL8168evl/8111evl at..
[    1.957163] r8169 0000:02:00.0 eth0: jumbo features [frames..
[   13.575452] r8169 0000:02:00.0 eth0: link down
[   13.575475] r8169 0000:02:00.0 eth0: link down
[   15.181317] r8169 0000:02:00.0 eth0: link up

In 3.8rc the time increased to 24 seconds:
grep r8169 dmesg.3.8.0
[    1.852546] r8169 Gigabit Ethernet driver 2.3LK-NAPI loaded
[    1.852765] r8169 0000:02:00.0: irq 42 for MSI/MSI-X
[    1.852872] r8169 0000:02:00.0 eth0: RTL8168evl/8111evl at...
[    1.852874] r8169 0000:02:00.0 eth0: jumbo features [frames...
[   14.150212] r8169 0000:02:00.0 eth0: link down
[   14.150229] r8169 0000:02:00.0 eth0: link down
[   24.140263] r8169 0000:02:00.0 eth0: link up

But with this revert I get the old performance:
dmesg | grep r8169
[    1.816613] r8169 Gigabit Ethernet driver 2.3LK-NAPI loaded
[    1.816832] r8169 0000:02:00.0: irq 42 for MSI/MSI-X
[    1.816947] r8169 0000:02:00.0 eth0: RTL8168evl/8111evl at...
[    1.816948] r8169 0000:02:00.0 eth0: jumbo features [frames...
[   13.986401] r8169 0000:02:00.0 eth0: link down
[   13.986422] r8169 0000:02:00.0 eth0: link down
[   15.623631] r8169 0000:02:00.0 eth0: link up

Thus I recommend reverting this too.

> 3. e0c075577965d1c01b30038d38bf637b027a1df3
>    r8169: enable ALDPS for power saving

That's it! This fixes the problem for me!


Thanks, Jörg
Jörg Otte - Feb. 3, 2013, 3:34 p.m.
2013/1/6 Jörg Otte <jrg.otte@gmail.com>:
> 2013/1/5 Francois Romieu <romieu@fr.zoreil.com>:
>> Can you check if things improve with v3.8-rc2 after removing :
>>
>> 1. 9ecb9aabaf634677c77af467f4e3028b09d7bcda
>>    r8169: workaround for missing extended GigaMAC registers
>> 2. d64ec841517a25f6d468bde9f67e5b4cffdc67c7
>>    r8169: enable internal ASPM and clock request settings
>
> Doesn't help for this problem.
>
> However this fixes a second issue for me:
> In 3.7.1 at startup the link came up after 15 sec.:
> grep r8169 dmesg.3.7.1
> [    1.956842] r8169 Gigabit Ethernet driver 2.3LK-NAPI loaded
> [    1.957059] r8169 0000:02:00.0: irq 42 for MSI/MSI-X
> [    1.957161] r8169 0000:02:00.0 eth0: RTL8168evl/8111evl at..
> [    1.957163] r8169 0000:02:00.0 eth0: jumbo features [frames..
> [   13.575452] r8169 0000:02:00.0 eth0: link down
> [   13.575475] r8169 0000:02:00.0 eth0: link down
> [   15.181317] r8169 0000:02:00.0 eth0: link up
>
> In 3.8rc the time increased to 24 seconds:
> grep r8169 dmesg.3.8.0
> [    1.852546] r8169 Gigabit Ethernet driver 2.3LK-NAPI loaded
> [    1.852765] r8169 0000:02:00.0: irq 42 for MSI/MSI-X
> [    1.852872] r8169 0000:02:00.0 eth0: RTL8168evl/8111evl at...
> [    1.852874] r8169 0000:02:00.0 eth0: jumbo features [frames...
> [   14.150212] r8169 0000:02:00.0 eth0: link down
> [   14.150229] r8169 0000:02:00.0 eth0: link down
> [   24.140263] r8169 0000:02:00.0 eth0: link up
>
> But with this revert I get the old performance:
> dmesg | grep r8169
> [    1.816613] r8169 Gigabit Ethernet driver 2.3LK-NAPI loaded
> [    1.816832] r8169 0000:02:00.0: irq 42 for MSI/MSI-X
> [    1.816947] r8169 0000:02:00.0 eth0: RTL8168evl/8111evl at...
> [    1.816948] r8169 0000:02:00.0 eth0: jumbo features [frames...
> [   13.986401] r8169 0000:02:00.0 eth0: link down
> [   13.986422] r8169 0000:02:00.0 eth0: link down
> [   15.623631] r8169 0000:02:00.0 eth0: link up
>
> Thus I recommend reverting this too.
>
>> 3. e0c075577965d1c01b30038d38bf637b027a1df3
>>    r8169: enable ALDPS for power saving
>
> That's it! This fixes the problem for me!
>


We are close to the v3.8 release and I haven't seen a solution
so far.
What is the plan regarding these issues?

Thanks, Jörg
Jörg Otte - Feb. 6, 2013, 4:55 p.m.
2013/2/3 Jörg Otte <jrg.otte@gmail.com>:
> 2013/1/6 Jörg Otte <jrg.otte@gmail.com>:
>> 2013/1/5 Francois Romieu <romieu@fr.zoreil.com>:
>>> Can you check if things improve with v3.8-rc2 after removing :
>>>
>>> 2. d64ec841517a25f6d468bde9f67e5b4cffdc67c7
>>>    r8169: enable internal ASPM and clock request settings
>>
>> this fixes a second issue for me:
>> In 3.7.1 at startup the link came up after 15 sec.:
>> grep r8169 dmesg.3.7.1
>> [    1.956842] r8169 Gigabit Ethernet driver 2.3LK-NAPI loaded
>> [    1.957059] r8169 0000:02:00.0: irq 42 for MSI/MSI-X
>> [    1.957161] r8169 0000:02:00.0 eth0: RTL8168evl/8111evl at..
>> [    1.957163] r8169 0000:02:00.0 eth0: jumbo features [frames..
>> [   13.575452] r8169 0000:02:00.0 eth0: link down
>> [   13.575475] r8169 0000:02:00.0 eth0: link down
>> [   15.181317] r8169 0000:02:00.0 eth0: link up
>>
>> In 3.8rc the time increased to 24 seconds:
>> grep r8169 dmesg.3.8.0
>> [    1.852546] r8169 Gigabit Ethernet driver 2.3LK-NAPI loaded
>> [    1.852765] r8169 0000:02:00.0: irq 42 for MSI/MSI-X
>> [    1.852872] r8169 0000:02:00.0 eth0: RTL8168evl/8111evl at...
>> [    1.852874] r8169 0000:02:00.0 eth0: jumbo features [frames...
>> [   14.150212] r8169 0000:02:00.0 eth0: link down
>> [   14.150229] r8169 0000:02:00.0 eth0: link down
>> [   24.140263] r8169 0000:02:00.0 eth0: link up
>>
>> But with this revert I get the old performance:
>> dmesg | grep r8169
>> [    1.816613] r8169 Gigabit Ethernet driver 2.3LK-NAPI loaded
>> [    1.816832] r8169 0000:02:00.0: irq 42 for MSI/MSI-X
>> [    1.816947] r8169 0000:02:00.0 eth0: RTL8168evl/8111evl at...
>> [    1.816948] r8169 0000:02:00.0 eth0: jumbo features [frames...
>> [   13.986401] r8169 0000:02:00.0 eth0: link down
>> [   13.986422] r8169 0000:02:00.0 eth0: link down
>> [   15.623631] r8169 0000:02:00.0 eth0: link up
>>
>>
>>> 3. e0c075577965d1c01b30038d38bf637b027a1df3
>>>    r8169: enable ALDPS for power saving
>>
>> That's it! This fixes the problem for me!
>>
>> Thanks, Jörg
>
>
> We are close to the v3.8 release and I haven't seen a solution
> so far.
> What is the plan regarding these issues?
>
> Thanks, Jörg

No response, so I am Cc'ing Linus:

To summarize: two net regressions were introduced in v3.8 (driver r8169):

1) NETDEV WATCHDOG: eth0 (r8169): transmit queue 0 timed out
was introduced by commit
e0c075577965d1c01b30038d38bf637b027a1df3
("r8169: enable ALDPS for power saving")

2) Time until link-up at boot increased from 15 sec (v3.7) to 24 sec (v3.8-rc)
by commit:
d64ec841517a25f6d468bde9f67e5b4cffdc67c7
("r8169: enable internal ASPM and clock request settings")

Reverting these commits resolves both problems entirely.

Until the issues are fixed, the commits should be reverted.

Thanks, Jörg
françois romieu - Feb. 7, 2013, 12:46 a.m.
Jörg Otte <jrg.otte@gmail.com> :
[...]
> To summarize: two net regressions were introduced in v3.8 (driver r8169):
> 
> 1) NETDEV WATCHDOG: eth0 (r8169): transmit queue 0 timed out
> was introduced by commit
> e0c075577965d1c01b30038d38bf637b027a1df3
> ("r8169: enable ALDPS for power saving")

Hayes Wang <hayeswang@realtek.com> authored it. You should ask him
why commit e0c075577965d1c01b30038d38bf637b027a1df3 sometimes chokes
with the 8168evl. 

And you can ask him if there is a chance that the non-8168evl that are
handled by the patch (mis-)behave the same too.

Patch

diff --git a/drivers/net/ethernet/realtek/r8169.c b/drivers/net/ethernet/realtek/r8169.c
index ed96f30..3d2d2446 100644
--- a/drivers/net/ethernet/realtek/r8169.c
+++ b/drivers/net/ethernet/realtek/r8169.c
@@ -90,10 +90,28 @@  static const int multicast_filter_limit = 32;
 #define RTL8169_TX_TIMEOUT	(6*HZ)
 #define RTL8169_PHY_TIMEOUT	(10*HZ)
 
+static void rw8(void __iomem *ioaddr, u8 b)
+{
+	printk(KERN_DEBUG PFX "w %p %02x\n", ioaddr, b);
+	writeb(b, ioaddr);
+}
+
+static void rw16(void __iomem *ioaddr, u16 w)
+{
+	printk(KERN_DEBUG PFX "w %p %04x\n", ioaddr, w);
+	writew(w, ioaddr);
+}
+
+static void rw32(void __iomem *ioaddr, u32 d)
+{
+	printk(KERN_DEBUG PFX "w %p %08x\n", ioaddr, d);
+	writel(d, ioaddr);
+}
+
 /* write/read MMIO register */
-#define RTL_W8(reg, val8)	writeb ((val8), ioaddr + (reg))
-#define RTL_W16(reg, val16)	writew ((val16), ioaddr + (reg))
-#define RTL_W32(reg, val32)	writel ((val32), ioaddr + (reg))
+#define RTL_W8(reg, val8)	rw8(ioaddr + (reg), (val8))
+#define RTL_W16(reg, val16)	rw16(ioaddr + (reg), (val16))
+#define RTL_W32(reg, val32)	rw32(ioaddr + (reg), (val32))
 #define RTL_R8(reg)		readb (ioaddr + (reg))
 #define RTL_R16(reg)		readw (ioaddr + (reg))
 #define RTL_R32(reg)		readl (ioaddr + (reg))