Message ID | 20120216074637.GA6208@electric-eye.fr.zoreil.com |
---|---|
State | RFC, archived |
Delegated to: | David Miller |
Francois Romieu <romieu@fr.zoreil.com> :
[...]
> I am testing it now.
It does not fix the problem. It seems to make a difference, but I still
see a few percent packet loss. The pattern has changed a bit too: there
are more bogus IP fragment offsets.
Trying with 'ping -qf -c 100 -l 2 -s 65507 10.0.7.1'
Good:
239 0.020107 10.0.7.1 -> 10.0.7.5 IP Fragmented IP protocol (proto=ICMP 0x01, off=0, ID=9f97)
240 0.020116 10.0.7.1 -> 10.0.7.5 IP Fragmented IP protocol (proto=ICMP 0x01, off=8976, ID=9f97)
241 0.020122 10.0.7.1 -> 10.0.7.5 IP Fragmented IP protocol (proto=ICMP 0x01, off=17952, ID=9f97)
242 0.020128 10.0.7.1 -> 10.0.7.5 IP Fragmented IP protocol (proto=ICMP 0x01, off=26928, ID=9f97)
243 0.020134 10.0.7.1 -> 10.0.7.5 IP Fragmented IP protocol (proto=ICMP 0x01, off=35904, ID=9f97)
244 0.020139 10.0.7.1 -> 10.0.7.5 IP Fragmented IP protocol (proto=ICMP 0x01, off=44880, ID=9f97)
245 0.020145 10.0.7.1 -> 10.0.7.5 IP Fragmented IP protocol (proto=ICMP 0x01, off=53856, ID=9f97)
246 0.020150 10.0.7.1 -> 10.0.7.5 ICMP Echo (ping) reply (id=0x04c7, seq(be/le)=16/4096, ttl=64)
Bad:
247 0.020809 10.0.7.5 -> 10.0.7.1 IP Fragmented IP protocol (proto=ICMP 0x01, off=0, ID=dce9)
248 0.020977 10.0.7.5 -> 10.0.7.1 IP Fragmented IP protocol (proto=ICMP 0x01, off=8976, ID=dce9)
249 0.021046 10.0.7.5 -> 10.0.7.1 IP Fragmented IP protocol (proto=ICMP 0x01, off=17952, ID=dce9)
250 0.021192 10.0.7.5 -> 10.0.7.1 IP Fragmented IP protocol (proto=ICMP 0x01, off=26928, ID=dce9)
251 0.021199 10.0.7.5 -> 10.0.7.1 IP Fragmented IP protocol (proto=ICMP 0x01, off=35904, ID=dce9)
252 0.021291 10.0.7.5 -> 10.0.7.1 IP Fragmented IP protocol (proto=ICMP 0x01, off=44880, ID=dce9)
253 0.021298 10.0.7.5 -> 10.0.7.1 IP Fragmented IP protocol (proto=ICMP 0x01, off=62832, ID=dce9)
So far I have only seen it with the 2nd and last fragment.
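The traces above advance in fixed steps of 8976 bytes, so in the "Bad" trace the offset 53856 would be expected after 44880, not 62832. A minimal sketch of that check in C follows. It assumes (this is not stated in the thread) that every non-final fragment carries the same payload size and that offsets are listed in transmit order; `first_offset_gap` is a hypothetical helper name, not part of any tool used here.

```c
#include <stddef.h>

/*
 * Return the index of the first fragment whose offset differs from
 * index * step, or -1 if the sequence is contiguous.
 * Assumes the list starts at offset 0 and is in transmit order.
 */
int first_offset_gap(const unsigned int *off, size_t n, unsigned int step)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (off[i] != i * step)
			return (int)i;
	}
	return -1;
}
```

Fed the seven offsets of the "Bad" trace (0, 8976, 17952, 26928, 35904, 44880, 62832) with a step of 8976, the function reports index 6, the fragment that should have carried offset 53856.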
> From: Francois Romieu [mailto:romieu@fr.zoreil.com]
> Sent: Thursday, February 16, 2012 8:05 PM
> To: Hayeswang
> Cc: 'Eric Dumazet'; 'Nick Bowler'; netdev@vger.kernel.org
> Subject: Re: Bogus frames transmitted with r8169 & fragmentation & large mtu
>
> Francois Romieu <romieu@fr.zoreil.com> :
> [...]
> > I am testing it now.
>
> It does not fix the problem. It seems to make a difference but I still
> see a few percent packet loss. The pattern has changed a bit too: there
> are more bogus IP fragment offsets.
>
> Trying with 'ping -qf -c 100 -l 2 -s 65507 10.0.7.1'

It works fine for me. I use Linux platform (Fedora 16 with kernel 3.2.0 with
that patch) with the realtek nic to ping Windows platform with the NVIDIA nic,
and I connect the two PCs directly without the switch. It works well except for
the first packet after I set the new mtu.

> So far I have only seen it with the 2nd and last fragment.

Best Regards,
Hayes
hayeswang <hayeswang@realtek.com> :
> > Francois Romieu <romieu@fr.zoreil.com> :
[...]
> > Trying with 'ping -qf -c 100 -l 2 -s 65507 10.0.7.1'
>
> It works fine for me. I use Linux platform (Fedora 16 with kernel 3.2.0 with
> that patch) with the realtek nic to ping Windows platform with the NVIDIA nic,
> and I connect the two PCs directly without the switch. It works well except for
> the first packet after I set the new mtu.

No switch, old PCI e1000, Fedora 15, d5ef8a4d87ab21d575ac86366599c9152a28028d
(post -rc3) from davem here.

Imvho you are not trying hard enough to break your toys :o)

What about :
# sysctl -w net.core.rmem_max=1000000
# ping -qf -c 10000 -l 8 -s 65507 10.0.7.1

If it's not enough you can increase '-c' and '-l' further. I see no loss
with '-l 1', even after 100k packets.

> > So far I have only seen it with the 2nd and last fragment.

... which should have read as "the hardware problem is gone, stop using SLUB
if you do not want to see intra-packet corruption".

However I still see it with SLAB (.config attached):

# ping -qf -c 10000 -l 2 -s 65507 10.0.7.1
PING 10.0.7.1 (10.0.7.1) 65507(65535) bytes of data.

--- 10.0.7.1 ping statistics ---
10000 packets transmitted, 9916 received, 0% packet loss, time 15898ms
rtt min/avg/max/mdev = 2.154/2.847/3.256/0.138 ms, pipe 2, ipg/ewma 1.589/2.867 ms

Almost mundane .5 %. Alas:

# ping -qf -c 6 -l 4 -s 65507 10.0.7.1
PING 10.0.7.1 (10.0.7.1) 65507(65535) bytes of data.

--- 10.0.7.1 ping statistics ---
6 packets transmitted, 4 received, 33% packet loss, time 4ms
rtt min/avg/max/mdev = 2.831/3.872/4.896/0.765 ms, pipe 4, ipg/ewma 0.874/3.309 ms

I'll follow Eric's suggestion and check the packet content in start_xmit.
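One way to check packet content before it reaches the hardware is to validate the IPv4 header and read back its fragment offset. The following is a user-space sketch only, not driver code and not necessarily what Eric had in mind: it assumes a raw IPv4 header in a byte buffer, and the names `ipv4_hdr_csum_ok` and `ipv4_frag_offset` are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Verify the IPv4 header checksum: the one's-complement sum of all
 * 16-bit header words, checksum field included, must fold to 0xffff.
 * Returns 1 if the header looks valid, 0 otherwise.
 */
int ipv4_hdr_csum_ok(const uint8_t *hdr, size_t len)
{
	size_t ihl, i;
	uint32_t sum = 0;

	if (len < 20 || (hdr[0] >> 4) != 4)
		return 0;
	ihl = (size_t)(hdr[0] & 0x0f) * 4;	/* header length in bytes */
	if (ihl < 20 || ihl > len)
		return 0;

	for (i = 0; i < ihl; i += 2)
		sum += ((uint32_t)hdr[i] << 8) | hdr[i + 1];
	while (sum >> 16)			/* fold carries back in */
		sum = (sum & 0xffff) + (sum >> 16);
	return sum == 0xffff;
}

/*
 * Fragment offset in bytes: the low 13 bits of header bytes 6-7,
 * expressed on the wire in units of 8 bytes.
 */
unsigned int ipv4_frag_offset(const uint8_t *hdr)
{
	return ((((unsigned int)hdr[6] & 0x1f) << 8) | hdr[7]) * 8;
}
```

A header that passes the checksum test but carries an offset outside the expected sequence would point at corruption before the checksum was computed; a failing checksum would point at corruption afterwards.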
diff --git a/drivers/net/ethernet/realtek/r8169.c b/drivers/net/ethernet/realtek/r8169.c
index 5eb6858..81e6ea2 100644
--- a/drivers/net/ethernet/realtek/r8169.c
+++ b/drivers/net/ethernet/realtek/r8169.c
@@ -3833,12 +3833,20 @@ static void rtl8169_init_ring_indexes(struct rtl8169_private *tp)
 
 static void rtl_hw_jumbo_enable(struct rtl8169_private *tp)
 {
+	void __iomem *ioaddr = tp->mmio_addr;
+
+	RTL_W8(Cfg9346, Cfg9346_Unlock);
 	rtl_generic_op(tp, tp->jumbo_ops.enable);
+	RTL_W8(Cfg9346, Cfg9346_Lock);
 }
 
 static void rtl_hw_jumbo_disable(struct rtl8169_private *tp)
 {
+	void __iomem *ioaddr = tp->mmio_addr;
+
+	RTL_W8(Cfg9346, Cfg9346_Unlock);
 	rtl_generic_op(tp, tp->jumbo_ops.disable);
+	RTL_W8(Cfg9346, Cfg9346_Lock);
 }
 
 static void r8168c_hw_jumbo_enable(struct rtl8169_private *tp)