Message ID | 20150915234848.GO24810@breakpoint.cc |
---|---|
State | RFC, archived |
Delegated to: | David Miller |
On Wed, 2015-09-16 at 01:48 +0200, Florian Westphal wrote:
>
> What I don't understand is why you see this with fragmented ipv6
> packets only (and not with all ipv6 forwarded skbs).
>
> Something like this copy-pastry from ip_finish_output2 should fix it:

That works; thanks.

Tested-by: David Woodhouse <David.Woodhouse@intel.com>

A little extra debugging output shows that the offending fragments were
arriving here with skb_headroom(skb)==10. Which is reasonable, being the
Solos ADSL card's header of 8 bytes followed by 2 bytes of PPP frame
type.

The non-fragmented packets, on the other hand, are arriving with a
headroom of 42 bytes. Could something else already have reallocated them
before they get that far? (Do we have any way to gather statistics on
such reallocations? It seems that might be useful for performance
investigation.)

Johannes and I were talking on IRC yesterday about trying to make this
kind of thing easier to reproduce without odd hardware. We postulated a
skb_torture() function which, when an appropriate debugging option was
enabled, would randomly screw around with the skb in various interesting
ways: shifting the data down so that there's no headroom, deliberately
making it *non-linear*, temporarily cloning it and freeing the clone a
couple of seconds later, etc.

Then we could insert calls to skb_torture() in interesting places like
netif_rx(), ip6_finish_output2() and anywhere else that seems
appropriate (perhaps with flags to indicate *what* kind of torture is
permissible in certain locations). And see what breaks...
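To make the skb_torture() idea a little more concrete, here is a minimal userspace sketch of just one of the proposed modes (shifting the data down so that no headroom is left), gated by a flag mask as suggested above. Nothing here is kernel code: struct toy_pkt, toy_torture() and TOY_TORTURE_NO_HEADROOM are hypothetical names invented for illustration, the non-linear and clone modes are not modelled, and the choice is made deterministic via the flags rather than random so that the effect is easy to check.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical flag mask: which kinds of torture a call site permits. */
enum toy_torture_flags {
	TOY_TORTURE_NO_HEADROOM = 1u << 0,	/* shift data down to offset 0 */
};

/* Toy stand-in for an skb: a fixed buffer with the payload at data_off. */
struct toy_pkt {
	unsigned char buf[64];
	size_t data_off;	/* bytes of headroom in front of the payload */
	size_t len;		/* payload length */
};

static size_t toy_pkt_headroom(const struct toy_pkt *p)
{
	return p->data_off;
}

/* Apply only the torture modes that the caller allows. */
static void toy_torture(struct toy_pkt *p, unsigned int allowed)
{
	assert(p->data_off + p->len <= sizeof(p->buf));

	if ((allowed & TOY_TORTURE_NO_HEADROOM) && p->data_off > 0) {
		/* Regions may overlap, so memmove() rather than memcpy(). */
		memmove(p->buf, p->buf + p->data_off, p->len);
		p->data_off = 0;
	}
}
```

A call site that must not lose its headroom would pass 0 and see the packet untouched; one that is supposed to cope would pass TOY_TORTURE_NO_HEADROOM and then has to reallocate before pushing any header.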
diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -62,6 +62,7 @@ static int ip6_finish_output2(struct sock *sk, struct sk_buff *skb)
 	struct net_device *dev = dst->dev;
 	struct neighbour *neigh;
 	struct in6_addr *nexthop;
+	unsigned int hh_len;
 	int ret;
 
 	skb->protocol = htons(ETH_P_IPV6);
@@ -104,6 +105,21 @@ static int ip6_finish_output2(struct sock *sk, struct sk_buff *skb)
 		}
 	}
 
+	hh_len = LL_RESERVED_SPACE(dev);
+	if (unlikely(skb_headroom(skb) < hh_len && dev->header_ops)) {
+		struct sk_buff *skb2;
+
+		skb2 = skb_realloc_headroom(skb, hh_len);
+		if (!skb2) {
+			kfree_skb(skb);
+			return -ENOMEM;
+		}
+		if (skb->sk)
+			skb_set_owner_w(skb2, skb->sk);
+		consume_skb(skb);
+		skb = skb2;
+	}
+
 	rcu_read_lock_bh();
 	nexthop = rt6_nexthop((struct rt6_info *)dst, &ipv6_hdr(skb)->daddr);
 	neigh = __ipv6_neigh_lookup_noref(dst->dev, nexthop);
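For readers without a kernel tree to hand, the shape of the check added above can be mimicked in userspace. The following is only a rough model under stated assumptions: struct toy_buf and the toy_*() helpers are hypothetical names, a single malloc'd region stands in for the skb's linear data, and socket ownership (the skb_set_owner_w() step) is not modelled. It shows the same pattern as the hunk: compare the headroom against what is needed, copy into a larger buffer if it falls short, and release the old buffer either way.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for an skb: one allocation, payload starts at 'data'. */
struct toy_buf {
	unsigned char *head;	/* start of the allocation */
	unsigned char *data;	/* start of the payload */
	size_t len;		/* payload length */
};

static size_t toy_headroom(const struct toy_buf *b)
{
	return (size_t)(b->data - b->head);
}

static void toy_free(struct toy_buf *b)
{
	if (b) {
		free(b->head);
		free(b);
	}
}

/* Allocate a buffer with 'headroom' spare bytes in front of a payload copy. */
static struct toy_buf *toy_alloc(size_t headroom, const void *payload, size_t len)
{
	size_t total = headroom + len;
	struct toy_buf *b = malloc(sizeof(*b));

	if (!b)
		return NULL;
	b->head = malloc(total ? total : 1);	/* avoid a zero-size malloc */
	if (!b->head) {
		free(b);
		return NULL;
	}
	b->data = b->head + headroom;
	b->len = len;
	if (len)
		memcpy(b->data, payload, len);
	return b;
}

/*
 * Return a buffer with at least 'needed' bytes of headroom. The buffer
 * passed in is consumed: it is either returned unchanged or freed. A NULL
 * return means the reallocation failed and the packet is gone.
 */
static struct toy_buf *toy_ensure_headroom(struct toy_buf *b, size_t needed)
{
	struct toy_buf *b2;

	assert(b != NULL);
	if (toy_headroom(b) >= needed)
		return b;

	b2 = toy_alloc(needed, b->data, b->len);
	toy_free(b);	/* freed on both the success and the failure path */
	return b2;
}
```

In this model a buffer arriving with 10 bytes of headroom, as the fragments did, gets copied if the caller asks for more (16, say, chosen arbitrarily here), whereas one arriving with 42 bytes passes through without a copy.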