Patchwork netdev: simple_tx_hash shouldn't hash inside fragments

Submitter Alexander Duyck
Date Sept. 19, 2008, 12:43 a.m.
Message ID <20080919004337.10035.3409.stgit@localhost.localdomain>
Permalink /patch/581/
State Accepted
Delegated to: David Miller

Comments

Alexander Duyck - Sept. 19, 2008, 12:43 a.m.
Currently simple_tx_hash is hashing inside of UDP fragments.  As a result,
packets are getting sent to all queues when they shouldn't be.
This causes a serious performance regression, which can be seen by sending
UDP frames larger than the MTU on multiqueue devices.  This change makes
fragments hash only as IP datagrams, without any protocol
information.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
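
For readers following along, the fragment test the patch relies on can be sketched in userspace as below. The first fragment of a datagram has the "more fragments" bit set and a zero offset, and later fragments have a non-zero offset, so masking frag_off with IP_MF | IP_OFFSET catches both. Only the first fragment carries the UDP header, so hashing the port bytes would spread one datagram across queues. This is only an illustration: the helper names is_ip_fragment and proto_for_hash are hypothetical and do not exist in the kernel, and the constants are redefined locally with the values the kernel's <net/ip.h> uses.

```c
#include <arpa/inet.h>  /* htons */
#include <stdint.h>

/* Local copies of the values defined in the kernel's <net/ip.h>. */
#ifndef IP_MF
#define IP_MF     0x2000  /* "more fragments" flag */
#endif
#ifndef IP_OFFSET
#define IP_OFFSET 0x1FFF  /* mask for the 13-bit fragment offset */
#endif

/*
 * Hypothetical helper (not a kernel function).
 * frag_off is the IPv4 header field exactly as it appears on the wire,
 * i.e. in network byte order. Returns non-zero for any fragment,
 * first or later; the DF bit alone does not count.
 */
static int is_ip_fragment(uint16_t frag_off)
{
	return (frag_off & htons(IP_MF | IP_OFFSET)) != 0;
}

/*
 * Hypothetical helper (not a kernel function).
 * Returns the protocol value a hash should take into account:
 * 0 for fragments (hash on addresses only), the real protocol otherwise.
 */
static uint8_t proto_for_hash(uint16_t frag_off, uint8_t protocol)
{
	return is_ip_fragment(frag_off) ? 0 : protocol;
}
```

With this check, an unfragmented UDP packet (protocol 17) still hashes on its ports, while every fragment of an oversized datagram yields protocol 0 and so falls back to the address-only hash, keeping the whole datagram on one queue.
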
David Miller - Sept. 21, 2008, 5:06 a.m.
From: Alexander Duyck <alexander.h.duyck@intel.com>
Date: Thu, 18 Sep 2008 17:43:37 -0700

> Currently simple_tx_hash is hashing inside of UDP fragments.  As a result,
> packets are getting sent to all queues when they shouldn't be.
> This causes a serious performance regression, which can be seen by sending
> UDP frames larger than the MTU on multiqueue devices.  This change makes
> fragments hash only as IP datagrams, without any protocol
> information.
> 
> Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>

As we discussed on IRC this fixes a pretty serious performance regression
compared to previous releases, so I added this to net-2.6 and will push
to Linus.

Thanks!

Patch

diff --git a/net/core/dev.c b/net/core/dev.c
index 60c51f7..5d9fa1e 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -122,6 +122,7 @@ 
 #include <linux/if_arp.h>
 #include <linux/if_vlan.h>
 #include <linux/ip.h>
+#include <net/ip.h>
 #include <linux/ipv6.h>
 #include <linux/in.h>
 #include <linux/jhash.h>
@@ -1667,7 +1668,7 @@  static u16 simple_tx_hash(struct net_device *dev, struct sk_buff *skb)
 {
 	u32 addr1, addr2, ports;
 	u32 hash, ihl;
-	u8 ip_proto;
+	u8 ip_proto = 0;
 
 	if (unlikely(!simple_tx_hashrnd_initialized)) {
 		get_random_bytes(&simple_tx_hashrnd, 4);
@@ -1676,7 +1677,8 @@  static u16 simple_tx_hash(struct net_device *dev, struct sk_buff *skb)
 
 	switch (skb->protocol) {
 	case __constant_htons(ETH_P_IP):
-		ip_proto = ip_hdr(skb)->protocol;
+		if (!(ip_hdr(skb)->frag_off & htons(IP_MF | IP_OFFSET)))
+			ip_proto = ip_hdr(skb)->protocol;
 		addr1 = ip_hdr(skb)->saddr;
 		addr2 = ip_hdr(skb)->daddr;
 		ihl = ip_hdr(skb)->ihl;