diff mbox

[net-next-2.6,2/5] ixgbe: drop support for UDP in RSS hash generation

Message ID 20100719235925.14112.65890.stgit@localhost.localdomain
State Accepted, archived
Delegated to: David Miller
Headers show

Commit Message

Kirsher, Jeffrey T July 19, 2010, 11:59 p.m. UTC
From: Alexander Duyck <alexander.h.duyck@intel.com>

This change removes UDP from the supported protocols for RSS hashing.  The
reason for removing this protocol is because IP fragmentation was causing a
network flow to be broken into two streams, one for fragmented, and one for
non-fragmented and this in turn was causing out-of-order issues.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Acked-by: Don Skidmore <donald.c.skidmore@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---

 drivers/net/ixgbe/ixgbe_main.c |    4 +---
 1 files changed, 1 insertions(+), 3 deletions(-)


--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

David Miller July 20, 2010, 3:24 a.m. UTC | #1
From: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Date: Mon, 19 Jul 2010 16:59:27 -0700

> From: Alexander Duyck <alexander.h.duyck@intel.com>
> 
> This change removes UDP from the supported protocols for RSS hashing.  The
> reason for removing this protocol is because IP fragmentation was causing a
> network flow to be broken into two streams, one for fragmented, and one for
> non-fragmented and this in turn was causing out-of-order issues.
> 
> Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
> Acked-by: Don Skidmore <donald.c.skidmore@intel.com>
> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

Applied.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Bill Fink July 20, 2010, 6:07 a.m. UTC | #2
On Mon, 19 Jul 2010, David Miller wrote:

> From: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
> Date: Mon, 19 Jul 2010 16:59:27 -0700
> 
> > From: Alexander Duyck <alexander.h.duyck@intel.com>
> > 
> > This change removes UDP from the supported protocols for RSS hashing.  The
> > reason for removing this protocol is because IP fragmentation was causing a
> > network flow to be broken into two streams, one for fragmented, and one for
> > non-fragmented and this in turn was causing out-of-order issues.
> > 
> > Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
> > Acked-by: Don Skidmore <donald.c.skidmore@intel.com>
> > Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
> 
> Applied.

Should there be a /proc or ethtool setting for whether or not to
use RSS hashing for UDP flows?  I would think that for many common
UDP applications, IP fragmentation would not be an issue because
they often tend to use sub-MTU sized datagrams.  And of course
UDP does not guarantee in-order delivery in any event.  Then a
remaining issue is what the default setting of such an option
should be.  I would lean to having it enabled by default, but
I can also see the safety argument for having it off by default.

						-Bill
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Eric Dumazet July 20, 2010, 6:30 a.m. UTC | #3
Le mardi 20 juillet 2010 à 02:07 -0400, Bill Fink a écrit :
> On Mon, 19 Jul 2010, David Miller wrote:
> 

> Should there be a /proc or ethtool setting for whether or not to
> use RSS hashing for UDP flows?  I would think that for many common
> UDP applications, IP fragmentation would not be an issue because
> they often tend to use sub-MTU sized datagrams.  And of course
> UDP does not guarantee in-order delivery in any event.  Then a
> remaining issue is what the default setting of such an option
> should be.  I would lean to having it enabled by default, but
> I can also see the safety argument for having it off by default.
> 

Their are several issues here.

1) Ability for the NIC to spread UDP loads on several queues.

2) Ability for the NIC to provide the hash to our stack, to speedup a
bit RPS.


If the patch is about 1), ie disables NIC ability to split UDP flows on
several RX queues, then yes : its probably _not_ wanted.



Commit message is not very clear on this topic.

By nature, UDP flows are subject to out of order issues, so what is this
patch tries to avoid ?



--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Eric Dumazet July 20, 2010, 6:39 a.m. UTC | #4
Le lundi 19 juillet 2010 à 16:59 -0700, Jeff Kirsher a écrit :
> From: Alexander Duyck <alexander.h.duyck@intel.com>
> 
> This change removes UDP from the supported protocols for RSS hashing.  The
> reason for removing this protocol is because IP fragmentation was causing a
> network flow to be broken into two streams, one for fragmented, and one for
> non-fragmented and this in turn was causing out-of-order issues.
> 

Jeff, does it mean all UDP packets are going to be delivered to a single
queue ?

This would be a serious regression.

Many UDP applications try hard to not use fragments. 

They are going to pay the price because some application :
- Use big segments, fragmented.
- Is subject to OOO artifacts.

We would like some clarifications please :)



--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Miller July 20, 2010, 6:39 a.m. UTC | #5
From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Tue, 20 Jul 2010 08:30:00 +0200

> By nature, UDP flows are subject to out of order issues, so what is this
> patch tries to avoid ?

UDP being subject to out-of-order issues really doesn't matter one
bit.

We should never, _knowingly_ create out-of-order packets in our
networking stack.  And this is regardless of protocol.

If there is no way to make ixgbe respect in-order packet delivery for
UDP frames vis-a-vis fragmented frames, we must disable RX flow
spreading for UDP.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Miller July 20, 2010, 6:44 a.m. UTC | #6
From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Tue, 20 Jul 2010 08:39:40 +0200

> This would be a serious regression.

The regression is in the hardware Eric.

> Many UDP applications try hard to not use fragments. 
> 
> They are going to pay the price because some application :
> - Use big segments, fragmented.
> - Is subject to OOO artifacts.

None of this matters.  If the hardware can't flow seperate properly
it's tough cookies.

We never may reorder packets in our stack by our own doing.  The
only safe default is to turn UDP flow seperation off on chips
like ixgbe.

If you want an ethtool knob to turn it back on for your machines,
fine.  But never can it be enabled by default.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Duyck, Alexander H July 20, 2010, 4:11 p.m. UTC | #7
Eric Dumazet wrote:
> Le lundi 19 juillet 2010 à 16:59 -0700, Jeff Kirsher a écrit :
>> From: Alexander Duyck <alexander.h.duyck@intel.com>
>>
>> This change removes UDP from the supported protocols for RSS hashing.  The
>> reason for removing this protocol is because IP fragmentation was causing a
>> network flow to be broken into two streams, one for fragmented, and one for
>> non-fragmented and this in turn was causing out-of-order issues.
>>
> 
> Jeff, does it mean all UDP packets are going to be delivered to a single
> queue ?
> 
> This would be a serious regression.
> 
> Many UDP applications try hard to not use fragments. 
> 
> They are going to pay the price because some application :
> - Use big segments, fragmented.
> - Is subject to OOO artifacts.
> 
> We would like some clarifications please :)
> 
> 
> 

The packets will still be hashed on source and destination IPv4/IPv6 
addresses.  The change just drops reading the UDP source/destination 
ports since in the case of fragmented packets they are not available and 
as such were being parsed as IPv4/IPv6 packets.  By making this change 
the queue selection is consistent between all packets in the UDP stream.

The only regression I would expect to see would be in testing between 
two fixed systems since the IP addresses of the two systems would be 
fixed and so running multiple flows between the two would yield the same 
  RSS hash for multiple UDP streams.  As long as multiple ip addresses 
are used  you should see multiple RSS hashes generated and as such the 
load should still be distributed.

Thanks,

Alex
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Eric Dumazet July 20, 2010, 4:39 p.m. UTC | #8
Le mardi 20 juillet 2010 à 09:11 -0700, Alexander Duyck a écrit :

> The packets will still be hashed on source and destination IPv4/IPv6 
> addresses.  The change just drops reading the UDP source/destination 
> ports since in the case of fragmented packets they are not available and 
> as such were being parsed as IPv4/IPv6 packets.  By making this change 
> the queue selection is consistent between all packets in the UDP stream.
> 

Excellent, this is perfect IMHO.

> The only regression I would expect to see would be in testing between 
> two fixed systems since the IP addresses of the two systems would be 
> fixed and so running multiple flows between the two would yield the same 
>   RSS hash for multiple UDP streams.  As long as multiple ip addresses 
> are used  you should see multiple RSS hashes generated and as such the 
> load should still be distributed.
> 

Ack. Fortunately, one can still use RPS to spread load onto multiple
cpus in this case.

This until ixgpe fills skb->rxhash with a non null value.
If it happens one day, we shall remind _not_ filling it for UDP packets.

BTW, this reminds me a netdev discussion we had for bnx2x

http://www.kerneltrap.com/mailarchive/linux-netdev/2010/4/23/6275415/thread

And now, I understand why Toepliz hash doesnt use src port/dst port,
since this is not available on fragments, obviously...

Thanks !


--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/drivers/net/ixgbe/ixgbe_main.c b/drivers/net/ixgbe/ixgbe_main.c
index b235aa1..813d2cb 100644
--- a/drivers/net/ixgbe/ixgbe_main.c
+++ b/drivers/net/ixgbe/ixgbe_main.c
@@ -2800,10 +2800,8 @@  static void ixgbe_configure_rx(struct ixgbe_adapter *adapter)
 		    /* Perform hash on these packet types */
 		mrqc |= IXGBE_MRQC_RSS_FIELD_IPV4
 		      | IXGBE_MRQC_RSS_FIELD_IPV4_TCP
-		      | IXGBE_MRQC_RSS_FIELD_IPV4_UDP
 		      | IXGBE_MRQC_RSS_FIELD_IPV6
-		      | IXGBE_MRQC_RSS_FIELD_IPV6_TCP
-		      | IXGBE_MRQC_RSS_FIELD_IPV6_UDP;
+		      | IXGBE_MRQC_RSS_FIELD_IPV6_TCP;
 	}
 	IXGBE_WRITE_REG(hw, IXGBE_MRQC, mrqc);