Message ID | 1356718263.21409.430.camel@edumazet-glaptop |
---|---|
State | RFC, archived |
Delegated to: | David Miller |
Hello,

On Fri, 28 Dec 2012 10:11:03 -0800 Eric Dumazet wrote:
> On Sun, 2012-12-23 at 15:06 +0400, Andrew Savchenko wrote:
[...]
> > I hit this bug again on uptime 11 days on 3.7.0 vanilla kernel.
> > See kernel config, /proc/net/udp, netstat -s and dropwatch logs
> > attached to this mail. This bug happens on UDP DNS requests only,
> > TCP requests work fine, see dig.log attached.
> >
> > Increasing of net.ipv4.udp_mem from
> > 24150 32201 48300
> > to
> > 100000 150000 200000
> > helps, but I'm afraid only temporary again.
> >
> > Dropwatch data was collected in the following way:
> > - dropwatch.bug.* files contain data obtained after bug occurred;
> > - dropwatch.*.background files contain background data when no
> > host or dig test was running; this system has active firewall
> > and complicated routing, ipv6 disabled via sysctl, etc, so some
> > drops are normal;
> > - dropwatch.*.host.request shows dropped packets recorded during
> > host ya.ru request; of course, during this time some background
> > packets were recorded as well (dropwatch doesn't support filtering
> > at this moment);
> > - dropwatch.nobug.* data was collected after the bug was
> > workarounded via net.ipv4.udp_mem as described above.
> >
> > As can be seen from dropwatch logs, drop in __udp_queue_rcv_skb+61
> > happens only on host request on bug conditions, thus something is
> > wrong there.
> >
> > Best regards,
> > Andrew Savchenko
>
> Thanks a lot !
>
> I see strange drops in dev_hard_start_xmit()
>
> l2tp needs some care.
>
> Please try the following patch, as skb_cow_head() API
> doesnt really ease skb->truesize exact tracking anyway, better not mess
> with it.

Sorry for the delay, but I was able to reboot kernel only today.
Your patch is applied on top of the 3.7.2 vanilla kernel.

l2tp works fine and /proc/net/udp tx_queue values are normal now, see
attached /proc/net/udp output. This is a good hint that problem is
probably solved, but we need to wait at least several weeks to be
sure.

Best regards,
Andrew Savchenko

Hello,

On Wed, 16 Jan 2013 20:36:44 +0400 Andrew Savchenko wrote:
> On Fri, 28 Dec 2012 10:11:03 -0800 Eric Dumazet wrote:
[...]
> > Thanks a lot !
> >
> > I see strange drops in dev_hard_start_xmit()
> >
> > l2tp needs some care.
> >
> > Please try the following patch, as skb_cow_head() API
> > doesnt really ease skb->truesize exact tracking anyway, better not mess
> > with it.
>
> Sorry for the delay, but I was able to reboot kernel only today.
> Your patch is applied on top of the 3.7.2 vanilla kernel.
>
> l2tp works fine and /proc/net/udp tx_queue values are normal now, see
> attached /proc/net/udp output. This is a good hint that problem is
> probably solved, but we need to wait at least several weeks to be
> sure.

With 16-days uptime system works fine.

Also I was able to reproduce this bug on another box: an embedded
system running openwrt with 3.7.5 kernel with openl2tpd client and
dnsmasq server. Due to limited memory resources this bug happened to
be easily reproducible: several thousands of dns queries were
sufficient to reproduce this bug. Full debug on embedded box was not
possible due to constrained resources, but bug appearance was the same
and /proc/net/udp is apparently broken (see attached log).

I applied your patch on openwrt's 3.7.5 kernel and it fixed the bug
on this box too.

So we've found a solution and I'm looking forward for it in the main
kernel :)

Best regards,
Andrew Savchenko

On Mon, 2013-02-04 at 17:39 +0400, Andrew Savchenko wrote:
> With 16-days uptime system works fine.
>
> Also I was able to reproduce this bug on another box: an embedded
> system running openwrt with 3.7.5 kernel with openl2tpd client and
> dnsmasq server. Due to limited memory resources this bug happened to
> be easily reproducible: several thousands of dns queries were
> sufficient to reproduce this bug. Full debug on embedded box was not
> possible due to constrained resources, but bug appearance was the same
> and /proc/net/udp is apparently broken (see attached log).
>
> I applied your patch on openwrt's 3.7.5 kernel and it fixed the bug
> on this box too.
>
> So we've found a solution and I'm looking forward for it in the main
> kernel :)

Thanks a lot for testing, I'll send the official patch asap.
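
The tx_queue values discussed in the thread come from /proc/net/udp, where each socket row carries a hexadecimal `tx_queue:rx_queue` pair after the state field. A minimal C sketch of extracting that pair from one row is given here for readers who want to check their own system; `struct udp_row` and `parse_udp_row` are hypothetical names introduced for illustration, and only the leading fields of the row are read.

```c
#include <stdio.h>

/* Hypothetical holder for the few /proc/net/udp fields used here. */
struct udp_row {
    unsigned int local_port; /* local UDP port */
    unsigned int tx_queue;   /* send-side memory accounted to the socket */
    unsigned int rx_queue;   /* receive-side memory accounted to the socket */
};

/*
 * Parse one line of /proc/net/udp.
 * Returns 1 for a socket row, 0 for the header or a malformed line.
 * All address, port and queue fields are printed in hexadecimal.
 */
int parse_udp_row(const char *line, struct udp_row *row)
{
    int slot;
    unsigned int laddr, raddr, rport, state;
    int n = sscanf(line, " %d: %x:%x %x:%x %x %x:%x",
                   &slot, &laddr, &row->local_port,
                   &raddr, &rport, &state,
                   &row->tx_queue, &row->rx_queue);

    return n == 8;
}
```

To use it, read /proc/net/udp line by line with `fgets()` and call `parse_udp_row()` on each line. A tx_queue that stays non-zero on an idle socket, or that looks like a wrapped negative number, is the kind of value the thread describes as broken.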

diff --git a/net/l2tp/l2tp_core.c b/net/l2tp/l2tp_core.c
index 1a9f372..d77e655 100644
--- a/net/l2tp/l2tp_core.c
+++ b/net/l2tp/l2tp_core.c
@@ -1123,8 +1123,6 @@ int l2tp_xmit_skb(struct l2tp_session *session, struct sk_buff *skb, int hdr_len
 	struct udphdr *uh;
 	struct inet_sock *inet;
 	__wsum csum;
-	int old_headroom;
-	int new_headroom;
 	int headroom;
 	int uhlen = (tunnel->encap == L2TP_ENCAPTYPE_UDP) ? sizeof(struct udphdr) : 0;
 	int udp_len;
@@ -1136,16 +1134,12 @@ int l2tp_xmit_skb(struct l2tp_session *session, struct sk_buff *skb, int hdr_len
 	 */
 	headroom = NET_SKB_PAD + sizeof(struct iphdr) +
 		uhlen + hdr_len;
-	old_headroom = skb_headroom(skb);
 	if (skb_cow_head(skb, headroom)) {
 		kfree_skb(skb);
 		return NET_XMIT_DROP;
 	}
 
-	new_headroom = skb_headroom(skb);
 	skb_orphan(skb);
-	skb->truesize += new_headroom - old_headroom;
-
 	/* Setup L2TP header */
 	session->build_header(session, __skb_push(skb, hdr_len));
 
diff --git a/net/l2tp/l2tp_ppp.c b/net/l2tp/l2tp_ppp.c
index 286366e..716605c 100644
--- a/net/l2tp/l2tp_ppp.c
+++ b/net/l2tp/l2tp_ppp.c
@@ -388,8 +388,6 @@ static int pppol2tp_xmit(struct ppp_channel *chan, struct sk_buff *skb)
 	struct l2tp_session *session;
 	struct l2tp_tunnel *tunnel;
 	struct pppol2tp_session *ps;
-	int old_headroom;
-	int new_headroom;
 	int uhlen, headroom;
 
 	if (sock_flag(sk, SOCK_DEAD) || !(sk->sk_state & PPPOX_CONNECTED))
@@ -408,7 +406,6 @@ static int pppol2tp_xmit(struct ppp_channel *chan, struct sk_buff *skb)
 	if (tunnel == NULL)
 		goto abort_put_sess;
 
-	old_headroom = skb_headroom(skb);
 	uhlen = (tunnel->encap == L2TP_ENCAPTYPE_UDP) ? sizeof(struct udphdr) : 0;
 	headroom = NET_SKB_PAD +
 		   sizeof(struct iphdr) + /* IP header */
@@ -418,9 +415,6 @@ static int pppol2tp_xmit(struct ppp_channel *chan, struct sk_buff *skb)
 	if (skb_cow_head(skb, headroom))
 		goto abort_put_sess_tun;
 
-	new_headroom = skb_headroom(skb);
-	skb->truesize += new_headroom - old_headroom;
-
 	/* Setup PPP header */
 	__skb_push(skb, sizeof(ppph));
 	skb->data[0] = ppph[0];
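
The thread does not spell out why adjusting skb->truesize by hand corrupts the socket counters, so the following is only one plausible reading, sketched as a userspace toy model rather than kernel code. It assumes that a buffer is charged to its owning socket at the truesize it has when ownership is set, and uncharged at whatever truesize it has when it is released. All names (`toy_sock`, `toy_skb`, `toy_set_owner`, `toy_free`, `toy_drift`) are hypothetical.

```c
#include <stddef.h>

/* Toy stand-ins for struct sock and struct sk_buff; NOT kernel code. */
struct toy_sock { long wmem_alloc; };
struct toy_skb  { struct toy_sock *owner; long truesize; };

/* Charge the buffer to its owner at the size it has right now. */
void toy_set_owner(struct toy_skb *skb, struct toy_sock *sk)
{
    skb->owner = sk;
    sk->wmem_alloc += skb->truesize;
}

/* Uncharge using the truesize the buffer has at release time. */
void toy_free(struct toy_skb *skb)
{
    if (skb->owner)
        skb->owner->wmem_alloc -= skb->truesize;
    skb->owner = NULL;
}

/*
 * Send `count` buffers of `size` bytes, changing truesize by `delta`
 * while each buffer is still charged. Returns the residual counter,
 * which would be 0 if charge and uncharge always matched.
 */
long toy_drift(int count, long size, long delta)
{
    struct toy_sock sk = { 0 };

    for (int i = 0; i < count; i++) {
        struct toy_skb skb = { NULL, size };

        toy_set_owner(&skb, &sk);
        skb.truesize += delta;   /* the kind of adjustment the patch removes */
        toy_free(&skb);
    }
    return sk.wmem_alloc;
}
```

In this model any non-zero `delta` leaves a residue that grows linearly with the number of packets, which is at least consistent with a bug that shows up after days of uptime on one host and after a few thousand DNS queries on a memory-constrained one. Leaving truesize untouched, as the patch does, keeps charge and uncharge symmetric.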