diff mbox

sky2: don't do GRO on second port

Message ID 20100830105117.0f0cf140@nehalam
State Accepted, archived
Delegated to: David Miller
Headers show

Commit Message

stephen hemminger Aug. 30, 2010, 5:51 p.m. UTC
There's something very important I forgot to tell you.
 What?

 Don't cross the GRO streams.
 Why?

 It would be bad.
 I'm fuzzy on the whole good/bad thing. What do you mean, "bad"?

 Try to imagine all the Internet as you know it stopping instantaneously
  and every bit in every packet swapping at the speed of light.
 Total packet reordering.
 Right. That's bad. Okay. All right. Important safety tip. Thanks, Hubert

The simplest way to stop this is just avoid doing GRO on the second port.
Very few Marvell boards support two ports per ring, and GRO is just
an optimization.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>


--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Jarek Poplawski Aug. 30, 2010, 7:09 p.m. UTC | #1
On Mon, Aug 30, 2010 at 10:51:17AM -0700, Stephen Hemminger wrote:
> 
>  There's something very important I forgot to tell you.
>  What?
> 
>  Don't cross the GRO streams.
>  Why?
> 
>  It would be bad.
>  I'm fuzzy on the whole good/bad thing. What do you mean, "bad"?
> 
>  Try to imagine all the Internet as you know it stopping instantaneously
>   and every bit in every packet swapping at the speed of light.
>  Total packet reordering.
>  Right. That's bad. Okay. All right. Important safety tip. Thanks, Hubert

Looks really bad to me, so... let's forget it! ;-) (At least until
next next.)

Jarek P.

> 
> The simplest way to stop this is just avoid doing GRO on the second port.
> Very few Marvell boards support two ports per ring, and GRO is just
> an optimization.
> 
> Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
> 
> 
> --- a/drivers/net/sky2.c	2010-08-30 10:13:28.211536096 -0700
> +++ b/drivers/net/sky2.c	2010-08-30 10:22:01.347183151 -0700
> @@ -2520,24 +2520,27 @@ static inline void sky2_tx_done(struct n
>  	}
>  }
>  
> -static inline void sky2_skb_rx(const struct sky2_port *sky2,
> +static inline void sky2_skb_rx(struct napi_struct *napi,
> +			       const struct sky2_port *sky2,
>  			       u32 status, struct sk_buff *skb)
>  {
>  #ifdef SKY2_VLAN_TAG_USED
> -	u16 vlan_tag = be16_to_cpu(sky2->rx_tag);
>  	if (sky2->vlgrp && (status & GMR_FS_VLAN)) {
> -		if (skb->ip_summed == CHECKSUM_NONE)
> +		u16 vlan_tag = be16_to_cpu(sky2->rx_tag);
> +
> +		if (skb->ip_summed == CHECKSUM_NONE ||
> +		    sky2->netdev != napi->dev)
>  			vlan_hwaccel_receive_skb(skb, sky2->vlgrp, vlan_tag);
>  		else
> -			vlan_gro_receive(&sky2->hw->napi, sky2->vlgrp,
> -					 vlan_tag, skb);
> +			vlan_gro_receive(napi, sky2->vlgrp, vlan_tag, skb);
>  		return;
>  	}
>  #endif
> -	if (skb->ip_summed == CHECKSUM_NONE)
> +	if (skb->ip_summed == CHECKSUM_NONE ||
> +	    sky2->netdev != napi->dev)
>  		netif_receive_skb(skb);
>  	else
> -		napi_gro_receive(&sky2->hw->napi, skb);
> +		napi_gro_receive(napi, skb);
>  }
>  
>  static inline void sky2_rx_done(struct sky2_hw *hw, unsigned port,
> @@ -2638,7 +2641,7 @@ static int sky2_status_intr(struct sky2_
>  
>  			skb->protocol = eth_type_trans(skb, dev);
>  
> -			sky2_skb_rx(sky2, status, skb);
> +			sky2_skb_rx(&hw->napi, sky2, status, skb);
>  
>  			/* Stop after net poll weight */
>  			if (++work_done >= to_do)
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Miller Sept. 1, 2010, 9:51 p.m. UTC | #2
From: Jarek Poplawski <jarkao2@gmail.com>
Date: Mon, 30 Aug 2010 21:09:00 +0200

> On Mon, Aug 30, 2010 at 10:51:17AM -0700, Stephen Hemminger wrote:
>> 
>>  There's something very important I forgot to tell you.
>>  What?
>> 
>>  Don't cross the GRO streams.
>>  Why?
>> 
>>  It would be bad.
>>  I'm fuzzy on the whole good/bad thing. What do you mean, "bad"?
>> 
>>  Try to imagine all the Internet as you know it stopping instantaneously
>>   and every bit in every packet swapping at the speed of light.
>>  Total packet reordering.
>>  Right. That's bad. Okay. All right. Important safety tip. Thanks, Hubert
> 
> Looks really bad to me, so... let's forget it! ;-) (At least until
> next next.)

I'm applying this patch.

Note that for us, devices act as domains, or a key for networking
traffic, whether we like it or not.  Yes, even for the same IP
addresses on the same host.

The reason is that we can do ingress packet editing and the realm of
those rules are per-ingress-qdisc, which are per-device.

The only scenerio you can guarentee that all packets for a given
flow key will be treated the same is the one where the input device
is the same as well.

When there is a single napi --> device mapping, it works, but without
that invariant it doesn't.


--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
stephen hemminger Sept. 1, 2010, 9:55 p.m. UTC | #3
On Wed, 01 Sep 2010 14:51:51 -0700 (PDT)
David Miller <davem@davemloft.net> wrote:

> From: Jarek Poplawski <jarkao2@gmail.com>
> Date: Mon, 30 Aug 2010 21:09:00 +0200
> 
> > On Mon, Aug 30, 2010 at 10:51:17AM -0700, Stephen Hemminger wrote:
> >> 
> >>  There's something very important I forgot to tell you.
> >>  What?
> >> 
> >>  Don't cross the GRO streams.
> >>  Why?
> >> 
> >>  It would be bad.
> >>  I'm fuzzy on the whole good/bad thing. What do you mean, "bad"?
> >> 
> >>  Try to imagine all the Internet as you know it stopping instantaneously
> >>   and every bit in every packet swapping at the speed of light.
> >>  Total packet reordering.
> >>  Right. That's bad. Okay. All right. Important safety tip. Thanks, Hubert
> > 
> > Looks really bad to me, so... let's forget it! ;-) (At least until
> > next next.)

The patch wasn't that bad, but the movie quote probably confused you.

Alternatively, the driver could keep track of "current GRO device"
and manually complete the GRO if the next packet changes
the current device.  But it didn't seem worth the effort.
Jarek Poplawski Sept. 2, 2010, 8:33 a.m. UTC | #4
On Wed, Sep 01, 2010 at 02:51:51PM -0700, David Miller wrote:
> The only scenerio you can guarentee that all packets for a given
> flow key will be treated the same is the one where the input device
> is the same as well.
> 
> When there is a single napi --> device mapping, it works, but without
> that invariant it doesn't.

Do you mean a single napi can't be used eg. for vlan_gro and napi_gro?
(napi_gro handles flows for skb->dev different from napi->dev or I
miss something?)

Jarek P.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jarek Poplawski Sept. 2, 2010, 9:18 a.m. UTC | #5
On Wed, Sep 01, 2010 at 02:55:54PM -0700, Stephen Hemminger wrote:
> On Wed, 01 Sep 2010 14:51:51 -0700 (PDT)
> David Miller <davem@davemloft.net> wrote:
> 
> > From: Jarek Poplawski <jarkao2@gmail.com>
> > Date: Mon, 30 Aug 2010 21:09:00 +0200
> > 
> > > On Mon, Aug 30, 2010 at 10:51:17AM -0700, Stephen Hemminger wrote:
> > >> 
> > >>  There's something very important I forgot to tell you.
> > >>  What?
> > >> 
> > >>  Don't cross the GRO streams.
> > >>  Why?
> > >> 
> > >>  It would be bad.
> > >>  I'm fuzzy on the whole good/bad thing. What do you mean, "bad"?
> > >> 
> > >>  Try to imagine all the Internet as you know it stopping instantaneously
> > >>   and every bit in every packet swapping at the speed of light.
> > >>  Total packet reordering.
> > >>  Right. That's bad. Okay. All right. Important safety tip. Thanks, Hubert
> > > 
> > > Looks really bad to me, so... let's forget it! ;-) (At least until
> > > next next.)
> 
> The patch wasn't that bad, but the movie quote probably confused you.

Yes, I was equally confused by both of them. Good to know it's only
fiction... ;-)

Jarek P.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Eric Dumazet Sept. 2, 2010, 9:31 a.m. UTC | #6
Le jeudi 02 septembre 2010 à 08:33 +0000, Jarek Poplawski a écrit :
> On Wed, Sep 01, 2010 at 02:51:51PM -0700, David Miller wrote:
> > The only scenerio you can guarentee that all packets for a given
> > flow key will be treated the same is the one where the input device
> > is the same as well.
> > 
> > When there is a single napi --> device mapping, it works, but without
> > that invariant it doesn't.
> 
> Do you mean a single napi can't be used eg. for vlan_gro and napi_gro?
> (napi_gro handles flows for skb->dev different from napi->dev or I
> miss something?)
> 

Same napi can be used both for vlan tagged trafic and "non tagged
trafic".

vlan_gro_common() does the right thing, when initializing skb->dev to
the vlan device, before doing the GRO loop.

So if we receive two packets on same ethernet device, two different
vlans, vlan_gro_common() automatically say they are not part of the same
flow, even if upper layers would say "it's ok for these two packets to
merge". Of course, we wont ask upper layers what they think :)

So we must keep the test against skb->dev, because of vlans,

diffs = (unsigned long)p->dev ^ (unsigned long)skb->dev;

(both in vlan_gro_common() and __napi_gro_receive())


--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jarek Poplawski Sept. 2, 2010, 9:32 a.m. UTC | #7
On Thu, Sep 02, 2010 at 08:33:27AM +0000, Jarek Poplawski wrote:
> On Wed, Sep 01, 2010 at 02:51:51PM -0700, David Miller wrote:
> > The only scenerio you can guarentee that all packets for a given
> > flow key will be treated the same is the one where the input device
> > is the same as well.
> > 
> > When there is a single napi --> device mapping, it works, but without
> > that invariant it doesn't.
> 
> Do you mean a single napi can't be used eg. for vlan_gro and napi_gro?
> (napi_gro handles flows for skb->dev different from napi->dev or I
Of course I meant:
(vlan_gro handles flows for skb->dev different from napi->dev or I

> miss something?)

Sorry,
Jarek P.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jarek Poplawski Sept. 2, 2010, 9:55 a.m. UTC | #8
On Thu, Sep 02, 2010 at 11:31:49AM +0200, Eric Dumazet wrote:
> Le jeudi 02 septembre 2010 ?? 08:33 +0000, Jarek Poplawski a écrit :
> > On Wed, Sep 01, 2010 at 02:51:51PM -0700, David Miller wrote:
> > > The only scenerio you can guarentee that all packets for a given
> > > flow key will be treated the same is the one where the input device
> > > is the same as well.
> > > 
> > > When there is a single napi --> device mapping, it works, but without
> > > that invariant it doesn't.
> > 
> > Do you mean a single napi can't be used eg. for vlan_gro and napi_gro?
> > (napi_gro handles flows for skb->dev different from napi->dev or I
> > miss something?)
> > 
> 
> Same napi can be used both for vlan tagged trafic and "non tagged
> trafic".
> 
> vlan_gro_common() does the right thing, when initializing skb->dev to
> the vlan device, before doing the GRO loop.
> 
> So if we receive two packets on same ethernet device, two different
> vlans, vlan_gro_common() automatically say they are not part of the same
> flow, even if upper layers would say "it's ok for these two packets to
> merge". Of course, we wont ask upper layers what they think :)
> 
> So we must keep the test against skb->dev, because of vlans,
> 
> diffs = (unsigned long)p->dev ^ (unsigned long)skb->dev;
> 
> (both in vlan_gro_common() and __napi_gro_receive())
> 

Exactly, but there is no "a single napi --> device mapping". And sky2
uses the same model. So, there is only a question of cost of this test,
and a question of probability of gro errors on collisions without such
a test in normal use. (And if gro can never do such errors for other
reasons?)

Jarek P.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jarek Poplawski Sept. 2, 2010, 12:53 p.m. UTC | #9
On Thu, Sep 02, 2010 at 09:18:39AM +0000, Jarek Poplawski wrote:
> On Wed, Sep 01, 2010 at 02:55:54PM -0700, Stephen Hemminger wrote:
> > On Wed, 01 Sep 2010 14:51:51 -0700 (PDT)
> > David Miller <davem@davemloft.net> wrote:
> > 
> > > From: Jarek Poplawski <jarkao2@gmail.com>
> > > Date: Mon, 30 Aug 2010 21:09:00 +0200
> > > 
> > > > On Mon, Aug 30, 2010 at 10:51:17AM -0700, Stephen Hemminger wrote:
> > > >> 
> > > >>  There's something very important I forgot to tell you.
> > > >>  What?
> > > >> 
> > > >>  Don't cross the GRO streams.
> > > >>  Why?
> > > >> 
> > > >>  It would be bad.
> > > >>  I'm fuzzy on the whole good/bad thing. What do you mean, "bad"?
> > > >> 
> > > >>  Try to imagine all the Internet as you know it stopping instantaneously
> > > >>   and every bit in every packet swapping at the speed of light.
> > > >>  Total packet reordering.
> > > >>  Right. That's bad. Okay. All right. Important safety tip. Thanks, Hubert
> > > > 
> > > > Looks really bad to me, so... let's forget it! ;-) (At least until
> > > > next next.)
> > 
> > The patch wasn't that bad, but the movie quote probably confused you.
> 
> Yes, I was equally confused by both of them. Good to know it's only
> fiction... ;-)

Stephen, after Eric's explanation, I really think this patch was a bad
idea, and I apologize my false opinion started this.

I hope David, will revert this patch, please.

Sorry,
Jarek P.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Miller Sept. 2, 2010, 4:30 p.m. UTC | #10
From: Jarek Poplawski <jarkao2@gmail.com>
Date: Thu, 2 Sep 2010 12:53:26 +0000

> Stephen, after Eric's explanation, I really think this patch was a bad
> idea, and I apologize my false opinion started this.
> 
> I hope David, will revert this patch, please.

Ok, now that I've reread everything over I agree, I'll revert the
sky2 change, thanks.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jarek Poplawski Sept. 2, 2010, 4:48 p.m. UTC | #11
On Thu, Sep 02, 2010 at 09:30:23AM -0700, David Miller wrote:
> From: Jarek Poplawski <jarkao2@gmail.com>
> Date: Thu, 2 Sep 2010 12:53:26 +0000
> 
> > Stephen, after Eric's explanation, I really think this patch was a bad
> > idea, and I apologize my false opinion started this.
> > 
> > I hope David, will revert this patch, please.
> 
> Ok, now that I've reread everything over I agree, I'll revert the
> sky2 change, thanks.

Thanks!
Jarek P.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

--- a/drivers/net/sky2.c	2010-08-30 10:13:28.211536096 -0700
+++ b/drivers/net/sky2.c	2010-08-30 10:22:01.347183151 -0700
@@ -2520,24 +2520,27 @@  static inline void sky2_tx_done(struct n
 	}
 }
 
-static inline void sky2_skb_rx(const struct sky2_port *sky2,
+static inline void sky2_skb_rx(struct napi_struct *napi,
+			       const struct sky2_port *sky2,
 			       u32 status, struct sk_buff *skb)
 {
 #ifdef SKY2_VLAN_TAG_USED
-	u16 vlan_tag = be16_to_cpu(sky2->rx_tag);
 	if (sky2->vlgrp && (status & GMR_FS_VLAN)) {
-		if (skb->ip_summed == CHECKSUM_NONE)
+		u16 vlan_tag = be16_to_cpu(sky2->rx_tag);
+
+		if (skb->ip_summed == CHECKSUM_NONE ||
+		    sky2->netdev != napi->dev)
 			vlan_hwaccel_receive_skb(skb, sky2->vlgrp, vlan_tag);
 		else
-			vlan_gro_receive(&sky2->hw->napi, sky2->vlgrp,
-					 vlan_tag, skb);
+			vlan_gro_receive(napi, sky2->vlgrp, vlan_tag, skb);
 		return;
 	}
 #endif
-	if (skb->ip_summed == CHECKSUM_NONE)
+	if (skb->ip_summed == CHECKSUM_NONE ||
+	    sky2->netdev != napi->dev)
 		netif_receive_skb(skb);
 	else
-		napi_gro_receive(&sky2->hw->napi, skb);
+		napi_gro_receive(napi, skb);
 }
 
 static inline void sky2_rx_done(struct sky2_hw *hw, unsigned port,
@@ -2638,7 +2641,7 @@  static int sky2_status_intr(struct sky2_
 
 			skb->protocol = eth_type_trans(skb, dev);
 
-			sky2_skb_rx(sky2, status, skb);
+			sky2_skb_rx(&hw->napi, sky2, status, skb);
 
 			/* Stop after net poll weight */
 			if (++work_done >= to_do)