diff mbox

[RFC] napi: limit GRO latency

Message ID 20121008115835.3a3bfed6@nehalam.linuxnetplumber.net
State RFC, archived
Delegated to: David Miller
Headers show

Commit Message

stephen hemminger Oct. 8, 2012, 6:58 p.m. UTC
Limit the latency of pending GRO in NAPI processing to 2*HZ.
When the system is under heavy network load, NAPI will go into
poll mode via soft irq, and only stay in the loop for
two jiffies. If this occurs, process the GRO pending list
to make sure and not delay outstanding TCP frames for too long.

Rearrange the exit path to get rid of unnecessary goto logic.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>


--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

David Miller Oct. 8, 2012, 7:10 p.m. UTC | #1
From: Stephen Hemminger <shemminger@vyatta.com>
Date: Mon, 8 Oct 2012 11:58:35 -0700

> Limit the latency of pending GRO in NAPI processing to 2*HZ.
> When the system is under heavy network load, NAPI will go into
> poll mode via soft irq, and only stay in the loop for
> two jiffies. If this occurs, process the GRO pending list
> to make sure and not delay outstanding TCP frames for too long.
> 
> Rearrange the exit path to get rid of unnecessary goto logic.
> 
> Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>

Did you see Eric's patch I just applied which limits it to 1ms?
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
stephen hemminger Oct. 8, 2012, 7:12 p.m. UTC | #2
On Mon, 08 Oct 2012 15:10:35 -0400 (EDT)
David Miller <davem@davemloft.net> wrote:

> From: Stephen Hemminger <shemminger@vyatta.com>
> Date: Mon, 8 Oct 2012 11:58:35 -0700
> 
> > Limit the latency of pending GRO in NAPI processing to 2*HZ.
> > When the system is under heavy network load, NAPI will go into
> > poll mode via soft irq, and only stay in the loop for
> > two jiffies. If this occurs, process the GRO pending list
> > to make sure and not delay outstanding TCP frames for too long.
> > 
> > Rearrange the exit path to get rid of unnecessary goto logic.
> > 
> > Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
> 
> Did you see Eric's patch I just applied which limits it to 1ms?

Think this is a different problem.
After leaving softirq, it may be a long time until
ksoftirqd is run which is a different problem.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Eric Dumazet Oct. 8, 2012, 7:21 p.m. UTC | #3
On Mon, 2012-10-08 at 11:58 -0700, Stephen Hemminger wrote:
> Limit the latency of pending GRO in NAPI processing to 2*HZ.
> When the system is under heavy network load, NAPI will go into
> poll mode via soft irq, and only stay in the loop for
> two jiffies. If this occurs, process the GRO pending list
> to make sure and not delay outstanding TCP frames for too long.
> 
> Rearrange the exit path to get rid of unnecessary goto logic.
> 
> Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
> 
> 
> --- a/net/core/dev.c	2012-10-08 09:21:27.466049785 -0700
> +++ b/net/core/dev.c	2012-10-08 11:56:41.714618745 -0700
> @@ -3937,8 +3937,16 @@ static void net_rx_action(struct softirq
>  		 * Allow this to run for 2 jiffies since which will allow
>  		 * an average latency of 1.5/HZ.
>  		 */
> -		if (unlikely(budget <= 0 || time_after(jiffies, time_limit)))
> -			goto softnet_break;
> +		if (unlikely(budget <= 0 || time_after(jiffies, time_limit))) {
> +			/* Cleanup all pending GRO */
> +
> +			list_for_each_entry(n, &sd->poll_list, poll_list)
> +				napi_gro_flush(n);
> +
> +			sd->time_squeeze++;
> +			__raise_softirq_irqoff(NET_RX_SOFTIRQ);
> +			break;
> +		}
>  

Hmm, I really dont think its a good idea.

I am working in making GRO storing more packets, so any kind of flushes
like that will completely defeat GRO intent.



--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Eric Dumazet Oct. 8, 2012, 7:30 p.m. UTC | #4
On Mon, 2012-10-08 at 12:12 -0700, Stephen Hemminger wrote:
> On Mon, 08 Oct 2012 15:10:35 -0400 (EDT)
> David Miller <davem@davemloft.net> wrote:
> 
> > From: Stephen Hemminger <shemminger@vyatta.com>
> > Date: Mon, 8 Oct 2012 11:58:35 -0700
> > 
> > > Limit the latency of pending GRO in NAPI processing to 2*HZ.
> > > When the system is under heavy network load, NAPI will go into
> > > poll mode via soft irq, and only stay in the loop for
> > > two jiffies. If this occurs, process the GRO pending list
> > > to make sure and not delay outstanding TCP frames for too long.
> > > 
> > > Rearrange the exit path to get rid of unnecessary goto logic.
> > > 
> > > Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
> > 
> > Did you see Eric's patch I just applied which limits it to 1ms?
> 
> Think this is a different problem.
> After leaving softirq, it may be a long time until
> ksoftirqd is run which is a different problem.

System is under stress, you dont want to increase load at this point,
but increase bandwidth.

Really try this on a 40Gbe link and see how bad it is.

On a server, we probably dedicate one or several cpus only to run NAPI,
so yes we hit net_rx_action() break, but we'll reenter it a few us
later.



--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
stephen hemminger Oct. 8, 2012, 7:40 p.m. UTC | #5
On Mon, 08 Oct 2012 21:30:12 +0200
Eric Dumazet <eric.dumazet@gmail.com> wrote:

> System is under stress, you dont want to increase load at this point,
> but increase bandwidth.
> 
> Really try this on a 40Gbe link and see how bad it is.

Not everybody has 40Gbe hardware just lying around available for testing :-)
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Eric Dumazet Oct. 8, 2012, 7:46 p.m. UTC | #6
On Mon, 2012-10-08 at 12:40 -0700, Stephen Hemminger wrote:
> On Mon, 08 Oct 2012 21:30:12 +0200
> Eric Dumazet <eric.dumazet@gmail.com> wrote:
> 
> > System is under stress, you dont want to increase load at this point,
> > but increase bandwidth.
> > 
> > Really try this on a 40Gbe link and see how bad it is.
> 
> Not everybody has 40Gbe hardware just lying around available for testing :-)

;)

In fact, the current strategy to flush napi_gro when napi completes
might be not good for moderate traffic (thats 99% of the linux machines
I believe). We could have a deferred flush instead...

I dont know what strategy is used by hardware. But surely hardware uses
timers.


--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

--- a/net/core/dev.c	2012-10-08 09:21:27.466049785 -0700
+++ b/net/core/dev.c	2012-10-08 11:56:41.714618745 -0700
@@ -3937,8 +3937,16 @@  static void net_rx_action(struct softirq
 		 * Allow this to run for 2 jiffies since which will allow
 		 * an average latency of 1.5/HZ.
 		 */
-		if (unlikely(budget <= 0 || time_after(jiffies, time_limit)))
-			goto softnet_break;
+		if (unlikely(budget <= 0 || time_after(jiffies, time_limit))) {
+			/* Cleanup all pending GRO */
+
+			list_for_each_entry(n, &sd->poll_list, poll_list)
+				napi_gro_flush(n);
+
+			sd->time_squeeze++;
+			__raise_softirq_irqoff(NET_RX_SOFTIRQ);
+			break;
+		}
 
 		local_irq_enable();
 
@@ -3987,7 +3995,6 @@  static void net_rx_action(struct softirq
 
 		netpoll_poll_unlock(have);
 	}
-out:
 	net_rps_action_and_irq_enable(sd);
 
 #ifdef CONFIG_NET_DMA
@@ -3997,13 +4004,6 @@  out:
 	 */
 	dma_issue_pending_all();
 #endif
-
-	return;
-
-softnet_break:
-	sd->time_squeeze++;
-	__raise_softirq_irqoff(NET_RX_SOFTIRQ);
-	goto out;
 }
 
 static gifconf_func_t *gifconf_list[NPROTO];