[2/2] xen: netfront: Drop GSO SKBs which do not have csum_blank.

Message ID 1295975400-538-2-git-send-email-ian.campbell@citrix.com
State Changes Requested, archived
Delegated to: David Miller

Commit Message

Ian Campbell Jan. 25, 2011, 5:10 p.m. UTC
The Linux network stack expects all GSO SKBs to have ip_summed ==
CHECKSUM_PARTIAL (which implies that the frame contains a partial
checksum) and the Xen network ring protocol similarly expects an SKB
which has GSO set to also have NETRX_csum_blank (which also implies a
partial checksum). Therefore drop such frames on receive otherwise
they will trigger the warning in skb_gso_segment.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: xen-devel@lists.xensource.com
Cc: netdev@vger.kernel.org
---
 drivers/net/xen-netfront.c |    4 ++++
 1 files changed, 4 insertions(+), 0 deletions(-)

Comments

David Miller Jan. 26, 2011, 3:44 a.m. UTC | #1
From: Ian Campbell <ian.campbell@citrix.com>
Date: Tue, 25 Jan 2011 17:10:00 +0000

> The Linux network stack expects all GSO SKBs to have ip_summed ==
> CHECKSUM_PARTIAL (which implies that the frame contains a partial
> checksum) and the Xen network ring protocol similarly expects an SKB
> which has GSO set to also have NETRX_csum_blank (which also implies a
> partial checksum). Therefore drop such frames on receive otherwise
> they will trigger the warning in skb_gso_segment.
> 
> Signed-off-by: Ian Campbell <ian.campbell@citrix.com>

The GSO code does in fact warn in the logs about this situation, but
it _DOES NOT_ drop the packet.  Therefore, either you guys should do
the same or we should make the generic code drop too.

I think the generic code is doing the right thing, therefore what you
should probably do is put the checksum of the SKB into the right state
when you detect this situation (and perhaps bump an ethtool driver
local statistic which specifically tracks this exact event).

Or, even better, you should fix whatever causes this in the first
place.

Dropping frames ought to be the last option, stuff like this is
impossible to debug if someone starts wondering why they are getting
frame drops.

You don't even account for this in a unique statistic somewhere, so
people can figure out the actual specific _reason_ for the drop.  They
will just see "rx_error" and scratch their heads.

Anyways, I think dropping is fundamentally wrong, so I'm not applying
this.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Ian Campbell Jan. 26, 2011, 11:56 a.m. UTC | #2
On Wed, 2011-01-26 at 03:44 +0000, David Miller wrote:
> From: Ian Campbell <ian.campbell@citrix.com>
> Date: Tue, 25 Jan 2011 17:10:00 +0000
> 
> > The Linux network stack expects all GSO SKBs to have ip_summed ==
> > CHECKSUM_PARTIAL (which implies that the frame contains a partial
> > checksum) and the Xen network ring protocol similarly expects an SKB
> > which has GSO set to also have NETRX_csum_blank (which also implies a
> > partial checksum). Therefore drop such frames on receive otherwise
> > they will trigger the warning in skb_gso_segment.
> > 
> > Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
> 
> The GSO code does in fact warn in the logs about this situation, but
> it _DOES NOT_ drop the packet.  Therefore, either you guys should do
> the same or we should make the generic code drop too.

Ah, yes. I misread the handling of an error from pskb_expand_head() in
skb_gso_segment() and thought it was a more general error return
covering the entire case.

> I think the generic code is doing the right thing, therefore what you
> should probably do is put the checksum of the SKB into the right state
> when you detect this situation (and perhaps bump an ethtool driver
> local statistic which specifically tracks this exact event).

Yes, I think this is a good idea. I'll come up with a patch which does
this.

> Or, even better, you should fix whatever causes this in the first
> place.

Sure, that has already been done but the proper fix is in another guest,
with a secondary robustness fix in netback (similar to this one, so I'll
take your advice from above on board in that context too).

The intention here was to be robust in the face of unfixed guests
sharing the same host or future netback bugs etc.

> Dropping frames ought to be the last option, stuff like this is
> impossible to debug if someone starts wondering why they are getting
> frame drops.
> 
> You don't even account for this in a unique statistic somewhere, so
> people can figure out the actual specific _reason_ for the drop.  They
> will just see "rx_error" and scratch their heads.
> 
> Anyways, I think dropping is fundamentally wrong, so I'm not applying
> this.

You've convinced me too, thanks for the feedback.

Thanks,
Ian.

Patch

diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c
index 4dc347b..0ea47da 100644
--- a/drivers/net/xen-netfront.c
+++ b/drivers/net/xen-netfront.c
@@ -818,6 +818,10 @@  static int validate_incoming_skb(struct sk_buff *skb)
 	if (skb->ip_summed == CHECKSUM_PARTIAL && skb_checksum_setup(skb))
 		return 0;
 
+	/* A GSO SKB must be partial. */
+	if (skb->ip_summed != CHECKSUM_PARTIAL && skb_is_gso(skb))
+		return 0;
+
 	return 1;
 }
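
For illustration, the alternative suggested in comment #1 (put the checksum
into the right state and count the event, rather than drop the frame) can be
modelled in user space roughly as follows. This is a minimal sketch under
stated assumptions, not kernel code and not the follow-up patch: struct
mock_skb, struct mock_stats, the rx_gso_checksum_fixup counter and
fixup_incoming_skb() are hypothetical names, and the CHECKSUM_* enum only
mirrors the kernel identifiers used in the patch above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Local stand-ins that mirror the kernel names; NOT the kernel definitions. */
enum ip_summed_state {
	CHECKSUM_NONE,
	CHECKSUM_UNNECESSARY,
	CHECKSUM_COMPLETE,
	CHECKSUM_PARTIAL
};

/* Hypothetical, heavily reduced model of an SKB. */
struct mock_skb {
	enum ip_summed_state ip_summed;
	unsigned short gso_size;	/* non-zero means the frame is GSO */
};

/* Hypothetical driver-local statistics block. */
struct mock_stats {
	unsigned long rx_gso_checksum_fixup;
};

/* Model of skb_is_gso(): a frame is GSO when it carries a segment size. */
static bool mock_skb_is_gso(const struct mock_skb *skb)
{
	return skb->gso_size != 0;
}

/*
 * Instead of rejecting a GSO frame that is not CHECKSUM_PARTIAL, force it
 * into the expected state and record that a fixup happened. Always returns
 * true, i.e. the frame is never dropped by this check.
 */
static bool fixup_incoming_skb(struct mock_skb *skb, struct mock_stats *stats)
{
	assert(skb != NULL && stats != NULL);

	if (skb->ip_summed != CHECKSUM_PARTIAL && mock_skb_is_gso(skb)) {
		stats->rx_gso_checksum_fixup++;	/* makes the event visible */
		skb->ip_summed = CHECKSUM_PARTIAL;
	}
	return true;
}
```

Only the state change and the dedicated counter are modelled here. A real
driver would presumably also have to set up the partial checksum itself after
forcing CHECKSUM_PARTIAL and export the counter via ethtool; both steps are
left out of this sketch.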