Message ID | 20121008115835.3a3bfed6@nehalam.linuxnetplumber.net |
---|---|
State | RFC, archived |
Delegated to: | David Miller |
Headers | show |
From: Stephen Hemminger <shemminger@vyatta.com> Date: Mon, 8 Oct 2012 11:58:35 -0700 > Limit the latency of pending GRO in NAPI processing to 2*HZ. > When the system is under heavy network load, NAPI will go into > poll mode via soft irq, and only stay in the loop for > two jiffies. If this occurs, process the GRO pending list > to make sure and not delay outstanding TCP frames for too long. > > Rearrange the exit path to get rid of unnecessary goto logic. > > Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Did you see Eric's patch I just applied which limits it to 1ms? -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, 08 Oct 2012 15:10:35 -0400 (EDT) David Miller <davem@davemloft.net> wrote: > From: Stephen Hemminger <shemminger@vyatta.com> > Date: Mon, 8 Oct 2012 11:58:35 -0700 > > > Limit the latency of pending GRO in NAPI processing to 2*HZ. > > When the system is under heavy network load, NAPI will go into > > poll mode via soft irq, and only stay in the loop for > > two jiffies. If this occurs, process the GRO pending list > > to make sure and not delay outstanding TCP frames for too long. > > > > Rearrange the exit path to get rid of unnecessary goto logic. > > > > Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> > > Did you see Eric's patch I just applied which limits it to 1ms? Think this is a different problem. After leaving softirq, it may be a long time until ksoftirqd is run which is a different problem. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, 2012-10-08 at 11:58 -0700, Stephen Hemminger wrote: > Limit the latency of pending GRO in NAPI processing to 2*HZ. > When the system is under heavy network load, NAPI will go into > poll mode via soft irq, and only stay in the loop for > two jiffies. If this occurs, process the GRO pending list > to make sure and not delay outstanding TCP frames for too long. > > Rearrange the exit path to get rid of unnecessary goto logic. > > Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> > > > --- a/net/core/dev.c 2012-10-08 09:21:27.466049785 -0700 > +++ b/net/core/dev.c 2012-10-08 11:56:41.714618745 -0700 > @@ -3937,8 +3937,16 @@ static void net_rx_action(struct softirq > * Allow this to run for 2 jiffies since which will allow > * an average latency of 1.5/HZ. > */ > - if (unlikely(budget <= 0 || time_after(jiffies, time_limit))) > - goto softnet_break; > + if (unlikely(budget <= 0 || time_after(jiffies, time_limit))) { > + /* Cleanup all pending GRO */ > + > + list_for_each_entry(n, &sd->poll_list, poll_list) > + napi_gro_flush(n); > + > + sd->time_squeeze++; > + __raise_softirq_irqoff(NET_RX_SOFTIRQ); > + break; > + } > Hmm, I really dont think its a good idea. I am working in making GRO storing more packets, so any kind of flushes like that will completely defeat GRO intent. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, 2012-10-08 at 12:12 -0700, Stephen Hemminger wrote: > On Mon, 08 Oct 2012 15:10:35 -0400 (EDT) > David Miller <davem@davemloft.net> wrote: > > > From: Stephen Hemminger <shemminger@vyatta.com> > > Date: Mon, 8 Oct 2012 11:58:35 -0700 > > > > > Limit the latency of pending GRO in NAPI processing to 2*HZ. > > > When the system is under heavy network load, NAPI will go into > > > poll mode via soft irq, and only stay in the loop for > > > two jiffies. If this occurs, process the GRO pending list > > > to make sure and not delay outstanding TCP frames for too long. > > > > > > Rearrange the exit path to get rid of unnecessary goto logic. > > > > > > Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> > > > > Did you see Eric's patch I just applied which limits it to 1ms? > > Think this is a different problem. > After leaving softirq, it may be a long time until > ksoftirqd is run which is a different problem. System is under stress, you dont want to increase load at this point, but increase bandwidth. Really try this on a 40Gbe link and see how bad it is. On a server, we probably dedicate one or several cpus only to run NAPI, so yes we hit net_rx_action() break, but we'll reenter it a few us later. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, 08 Oct 2012 21:30:12 +0200 Eric Dumazet <eric.dumazet@gmail.com> wrote: > System is under stress, you dont want to increase load at this point, > but increase bandwidth. > > Really try this on a 40Gbe link and see how bad it is. Not everybody has 40Gbe hardware just lying around available for testing :-) -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, 2012-10-08 at 12:40 -0700, Stephen Hemminger wrote: > On Mon, 08 Oct 2012 21:30:12 +0200 > Eric Dumazet <eric.dumazet@gmail.com> wrote: > > > System is under stress, you dont want to increase load at this point, > > but increase bandwidth. > > > > Really try this on a 40Gbe link and see how bad it is. > > Not everybody has 40Gbe hardware just lying around available for testing :-) ;) In fact, the current strategy to flush napi_gro when napi completes might be not good for moderate traffic (thats 99% of the linux machines I believe). We could have a deferred flush instead... I dont know what strategy is used by hardware. But surely hardware uses timers. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
--- a/net/core/dev.c 2012-10-08 09:21:27.466049785 -0700 +++ b/net/core/dev.c 2012-10-08 11:56:41.714618745 -0700 @@ -3937,8 +3937,16 @@ static void net_rx_action(struct softirq * Allow this to run for 2 jiffies since which will allow * an average latency of 1.5/HZ. */ - if (unlikely(budget <= 0 || time_after(jiffies, time_limit))) - goto softnet_break; + if (unlikely(budget <= 0 || time_after(jiffies, time_limit))) { + /* Cleanup all pending GRO */ + + list_for_each_entry(n, &sd->poll_list, poll_list) + napi_gro_flush(n); + + sd->time_squeeze++; + __raise_softirq_irqoff(NET_RX_SOFTIRQ); + break; + } local_irq_enable(); @@ -3987,7 +3995,6 @@ static void net_rx_action(struct softirq netpoll_poll_unlock(have); } -out: net_rps_action_and_irq_enable(sd); #ifdef CONFIG_NET_DMA @@ -3997,13 +4004,6 @@ out: */ dma_issue_pending_all(); #endif - - return; - -softnet_break: - sd->time_squeeze++; - __raise_softirq_irqoff(NET_RX_SOFTIRQ); - goto out; } static gifconf_func_t *gifconf_list[NPROTO];
Limit the latency of pending GRO in NAPI processing to 2*HZ. When the system is under heavy network load, NAPI will go into poll mode via soft irq, and only stay in the loop for two jiffies. If this occurs, process the GRO pending list to make sure and not delay outstanding TCP frames for too long. Rearrange the exit path to get rid of unnecessary goto logic. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html