Message ID | A6A1774AFD79E346AE6D49A33CB294530DC19EB5@EX-BE-017-SFO.shared.themessagecenter.com |
---|---|
State | Changes Requested, archived |
Delegated to: | David Miller |
From: "Ben Menchaca (ben@bigfootnetworks.com)" <ben@bigfootnetworks.com>
Date: Sat, 20 Mar 2010 12:54:59 -0700

> We are seeing some random skb data length errors on RX after
> long-running, full-gigabit traffic. First, my debugging and solution
> are based on the following invariant assumption:
>   (skb->tail - skb->data) == skb->len
>
> If this is wrong, please educate.
>
> After some tracing, here is where the error packets seem to originate:
> 1. We are cleaning rx, in gfar_clean_rx_ring;
> 2. A new RX skb is drawn from the rx_recycle queue, and obey the above
>    invariant (so, in gfar_new_skb(), __skb_dequeue returns an skb);
> 3. At this point skb_reserve is called, which moves data and tail by
>    the same calculated alignamount;
> 4. So, newskb is not NULL. However,
>    !(bdp->status & RXBD_LAST) || (bdp->status & RXBD_ERR)) is evaluates
>    to true;
> 5. Since newskb is not NULL, we arrive at the else if (skb), which is
>    true;
> 6. skb->data = skb->head + NET_SKB_PAD is applied, and then the skb is
>    requeued for recycling.
>
> At this point, skb->data != skb->tail, but skb->len == 0. When this skb
> is used for the next RX, it is causing issues later when we skb_put
> trailers, and then trust skb->len.
>
> I would propose something like:

Thanks for debugging this, some gianfar developers CC:'d.

> @@ -2540,6 +2540,7 @@
>                  * recycle list.
>                  */
>                 skb->data = skb->head + NET_SKB_PAD;
> +               skb_reset_tail_pointer(skb);
>                 __skb_queue_head(&priv->rx_recycle, skb);
>             }
>         } else {

This code is essentially trying to undo skb_reserve()
but as you found it's doing so in a buggy manner.

skb_reserve() adjusts both the 'data' and 'tail' pointers,
but this attempt at a reversal is only modifying 'data'.

Your fix is fine, but really any by-hand modification of
skb->data is a bug, and we should provide an skb_unreserve()
or similar to hide such details away, and use it here.

Anton?
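
To make the failure mode above concrete, here is a minimal userspace sketch of the pointer arithmetic. It is not kernel code: `struct model_skb`, `MODEL_SKB_PAD`, and every `model_*` function are hypothetical stand-ins, and the model assumes the pointer-based form of `tail` rather than the offset-based form some configurations use.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_SKB_PAD  32   /* illustrative stand-in for NET_SKB_PAD */
#define MODEL_BUF_SIZE 256  /* arbitrary buffer size for the model */

/* Simplified stand-in for struct sk_buff; NOT the kernel layout. */
struct model_skb {
    unsigned char buf[MODEL_BUF_SIZE];
    unsigned char *head;   /* start of the buffer */
    unsigned char *data;   /* start of packet data */
    unsigned char *tail;   /* end of packet data */
    unsigned int len;      /* bytes of packet data */
};

/* Fresh, empty buffer: data == tail, len == 0. */
static void model_init(struct model_skb *skb)
{
    skb->head = skb->buf;
    skb->data = skb->head + MODEL_SKB_PAD;
    skb->tail = skb->data;
    skb->len = 0;
}

/* Mirrors what skb_reserve() does: data and tail advance together. */
static void model_reserve(struct model_skb *skb, unsigned int n)
{
    assert(skb->tail + n <= skb->buf + MODEL_BUF_SIZE);
    skb->data += n;
    skb->tail += n;
}

/* The invariant from the report: (tail - data) == len. */
static int model_invariant_holds(const struct model_skb *skb)
{
    return (size_t)(skb->tail - skb->data) == skb->len;
}

/* Recycle path before the patch: rewinds data only, tail is left behind. */
static void model_recycle_buggy(struct model_skb *skb)
{
    skb->data = skb->head + MODEL_SKB_PAD;
}

/* Recycle path with the proposed fix: tail is reset to data as well. */
static void model_recycle_fixed(struct model_skb *skb)
{
    skb->data = skb->head + MODEL_SKB_PAD;
    skb->tail = skb->data;  /* models skb_reset_tail_pointer() */
}
```

In this model, reserving a non-zero alignment and then calling `model_recycle_buggy()` leaves `tail` ahead of `data` while `len` is still 0, which is the state described in the report. `model_recycle_fixed()` restores the invariant.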
On Sun, Mar 21, 2010 at 09:46:42PM -0700, David Miller wrote:
[...]
> >                  * recycle list.
> >                  */
> >                 skb->data = skb->head + NET_SKB_PAD;
> > +               skb_reset_tail_pointer(skb);
> >                 __skb_queue_head(&priv->rx_recycle, skb);
> >             }
> >         } else {
>
> This code is essentially trying to undo skb_reserve()
> but as you found it's doing so in a buggy manner.
>
> skb_reserve() adjusts both the 'data' and 'tail' pointers,
> but this attempt at a reversal is only modifying 'data'.
>
> Your fix is fine, but really any by-hand modification of
> skb->data is a bug, and we should provide an skb_unreserve()
> or similar to hide such details away, and use it here.
>
> Anton?

Yes, skb_unreserve() (or skb_reset_reserved() for naming
consistency?) would be great.

Ben, note that ucc_geth.c driver is also affected by that bug,
so I guess it needs a similar fix.

Thanks,
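
As a rough illustration of the helper proposed in this exchange, the sketch below models an "unreserve" operation in userspace. No such helper is shown in the thread, so the name `mini_skb_unreserve()`, its signature, and the `struct mini_skb` type are all assumptions made for the example. The point is only that the inverse of a reserve moves `data` and `tail` back together, so callers never touch `skb->data` by hand.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for struct sk_buff; NOT the kernel layout. */
struct mini_skb {
    unsigned char buf[128];
    unsigned char *head;
    unsigned char *data;
    unsigned char *tail;
    unsigned int len;
};

/* Empty buffer with no headroom reserved yet. */
static void mini_skb_init(struct mini_skb *skb)
{
    skb->head = skb->buf;
    skb->data = skb->head;
    skb->tail = skb->head;
    skb->len = 0;
}

/* Models skb_reserve(): only valid on an empty buffer. */
static void mini_skb_reserve(struct mini_skb *skb, unsigned int n)
{
    assert(skb->len == 0);
    assert(skb->tail + n <= skb->buf + sizeof(skb->buf));
    skb->data += n;
    skb->tail += n;
}

/*
 * Hypothetical inverse of the reserve above: gives back n bytes of
 * headroom by moving data and tail together, keeping tail - data == len.
 */
static void mini_skb_unreserve(struct mini_skb *skb, unsigned int n)
{
    assert(skb->len == 0);
    assert((size_t)(skb->data - skb->head) >= n);
    skb->data -= n;
    skb->tail -= n;
}
```

A driver-side caller in this model would pair each reserve with an unreserve of the same amount before putting the buffer back on its recycle list. Whether a real helper should take the amount as a parameter or reset to a fixed headroom is a design question the thread leaves open.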
--- a/drivers/net/gianfar.c
+++ b/drivers/net/gianfar.c
@@ -2540,6 +2540,7 @@
                  * recycle list.
                  */
                 skb->data = skb->head + NET_SKB_PAD;
+                skb_reset_tail_pointer(skb);
                 __skb_queue_head(&priv->rx_recycle, skb);
             }
         } else {