diff mbox

[RFC,net-next-2.6] gro: drivers should feed GRO only with TCP packets

Message ID 1283360686.2556.381.camel@edumazet-laptop
State RFC, archived
Delegated to: David Miller
Headers show

Commit Message

Eric Dumazet Sept. 1, 2010, 5:04 p.m. UTC
Many network devices can tell if an incoming frame is a TCP one for a
small cost.

Instead of feeding GRO with all packets, we could filter packets on this
information ?

This should help machines handling a mixed UDP/TCP workload, keeping gro
overhead as small as possible.

patch against tg3 as an example...

Alternative would be to set a bit "is_tcp" on napi_struct to let this
choice being done in network stack, not by each driver. 

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
---
 drivers/net/tg3.c |   22 +++++++++++++++++-----
 1 files changed, 17 insertions(+), 5 deletions(-)



--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

stephen hemminger Sept. 1, 2010, 5:42 p.m. UTC | #1
On Wed, 01 Sep 2010 19:04:46 +0200
Eric Dumazet <eric.dumazet@gmail.com> wrote:

> Many network devices can tell if an incoming frame is a TCP one for a
> small cost.
> 
> Instead of feeding GRO with all packets, we could filter packets on this
> information ?
> 
> This should help machines handling a mixed UDP/TCP workload, keeping gro
> overhead as small as possible.
> 
> patch against tg3 as an example...
> 
> Alternative would be to set a bit "is_tcp" on napi_struct to let this
> choice being done in network stack, not by each driver. 
> 
> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>

There was talk of doing GRO for UDP as well bug it never
got implemented.
David Miller Sept. 1, 2010, 5:48 p.m. UTC | #2
From: Stephen Hemminger <shemminger@vyatta.com>
Date: Wed, 1 Sep 2010 10:42:35 -0700

> There was talk of doing GRO for UDP as well but it never
> got implemented. 

I suppose the most direct application would be for IP fragmentation.

In fact I could see that working very well.

But for non-fragmented UDP... I can't see much value to that.

We'd have to preserve the packet boundaries, and process each chunk of
data as one packet at a time within those boundaries, to retain
datagram recvmsg() semantics.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
stephen hemminger Sept. 1, 2010, 5:55 p.m. UTC | #3
On Wed, 01 Sep 2010 10:48:05 -0700 (PDT)
David Miller <davem@davemloft.net> wrote:

> From: Stephen Hemminger <shemminger@vyatta.com>
> Date: Wed, 1 Sep 2010 10:42:35 -0700
> 
> > There was talk of doing GRO for UDP as well but it never
> > got implemented. 
> 
> I suppose the most direct application would be for IP fragmentation.
> 
> In fact I could see that working very well.
> 
> But for non-fragmented UDP... I can't see much value to that.
> 
> We'd have to preserve the packet boundaries, and process each chunk of
> data as one packet at a time within those boundaries, to retain
> datagram recvmsg() semantics.

I was thinking for fragmented UDP, which still unfortunately
gets used. 
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Miller Sept. 1, 2010, 8:10 p.m. UTC | #4
From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Wed, 01 Sep 2010 19:04:46 +0200

> Many network devices can tell if an incoming frame is a TCP one for a
> small cost.
> 
> Instead of feeding GRO with all packets, we could filter packets on this
> information ?
> 
> This should help machines handling a mixed UDP/TCP workload, keeping gro
> overhead as small as possible.
> 
> patch against tg3 as an example...
> 
> Alternative would be to set a bit "is_tcp" on napi_struct to let this
> choice being done in network stack, not by each driver. 
> 
> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>

I would rather have each and every driver unconditionally go through
the GRO receive routines, than do things like this.  Especially since
if we apply a patch like your's, the flood gates open and if IP
fragmentation support is added to GRO we have to undo all of this
stuff.

So what I actually want to see is that we remove the distinction of
GRO, and just have things like "net_receive_skb()".  It will have the
same call signature as napi_gro_receive().

Yes we'll still have things like the skge driver which currently need
to have a-priori knowledge of how to complete a NAPI sequence for the
sake of GRO, but we can at least rename the function it calls to
remove the GRO'ness of it.

Even non-NAPI drivers could use this, passing NULL for 'napi' and a GRO
usable NAPI context can be placed somewhere else and used purely for
the sake of GRO.  NULL would also mean "in hardware interrupt", and
we can make this check cost absolutely nothing by doing something inline
in net_receive_skb() like:

	if (__builtin_constant_p(napi) && napi == NULL)
		net_receive_skb_in_irq(skb, dev);

or similar.

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/drivers/net/tg3.c b/drivers/net/tg3.c
index bc3af78..d530a0a 100644
--- a/drivers/net/tg3.c
+++ b/drivers/net/tg3.c
@@ -4748,13 +4748,25 @@  static int tg3_rx(struct tg3_napi *tnapi, int budget)
 			}
 		}
 
+		/*
+		 * give to GRO stack only TCP frames, with good checksums.
+		 */
+		if ((desc->type_flags & RXD_FLAG_IS_TCP) &&
+		    skb->ip_summed == CHECKSUM_UNNECESSARY) {
 #if TG3_VLAN_TAG_USED
-		if (hw_vlan)
-			vlan_gro_receive(&tnapi->napi, tp->vlgrp, vtag, skb);
-		else
+			if (hw_vlan)
+				vlan_gro_receive(&tnapi->napi, tp->vlgrp, vtag, skb);
+			else
 #endif
-			napi_gro_receive(&tnapi->napi, skb);
-
+				napi_gro_receive(&tnapi->napi, skb);
+		} else {
+#if TG3_VLAN_TAG_USED
+			if (hw_vlan)
+				vlan_hwaccel_receive_skb(skb, tp->vlgrp, vtag);
+			else
+#endif
+				netif_receive_skb(skb);
+		}
 		received++;
 		budget--;