Patchwork [v2,net-next] softirq: reduce latencies

login
register
mail settings
Submitter Eric Dumazet
Date Jan. 4, 2013, 7:49 a.m.
Message ID <1357285780.21409.28416.camel@edumazet-glaptop>
Download mbox | patch
Permalink /patch/209369/
State Not Applicable
Delegated to: David Miller
Headers show

Comments

Eric Dumazet - Jan. 4, 2013, 7:49 a.m.
From: Eric Dumazet <edumazet@google.com>

In various network workloads, __do_softirq() latencies can be up
to 20 ms if HZ=1000, and 200 ms if HZ=100.

This is because we iterate 10 times in the softirq dispatcher,
and some actions can consume a lot of cycles.

This patch changes the fallback to ksoftirqd condition to :

- A time limit of 2 ms.
- need_resched() being set on current task

When one of this condition is met, we wakeup ksoftirqd for further
softirq processing if we still have pending softirqs.

Using need_resched() as the only condition can trigger RCU stalls,
as we can keep BH disabled for too long.

I ran several benchmarks and got no significant difference in
throughput, but a very significant reduction of latencies (one order
of magnitude) :

In following bench, 200 antagonist "netperf -t TCP_RR" are started in
background, using all available cpus.

Then we start one "netperf -t TCP_RR", bound to the cpu handling the NIC
IRQ (hard+soft)

Before patch :

# netperf -H 7.7.7.84 -t TCP_RR -T2,2 -- -k
RT_LATENCY,MIN_LATENCY,MAX_LATENCY,P50_LATENCY,P90_LATENCY,P99_LATENCY,MEAN_LATENCY,STDDEV_LATENCY
MIGRATED TCP REQUEST/RESPONSE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET
to 7.7.7.84 () port 0 AF_INET : first burst 0 : cpu bind
RT_LATENCY=550110.424
MIN_LATENCY=146858
MAX_LATENCY=997109
P50_LATENCY=305000
P90_LATENCY=550000
P99_LATENCY=710000
MEAN_LATENCY=376989.12
STDDEV_LATENCY=184046.92

After patch :

# netperf -H 7.7.7.84 -t TCP_RR -T2,2 -- -k
RT_LATENCY,MIN_LATENCY,MAX_LATENCY,P50_LATENCY,P90_LATENCY,P99_LATENCY,MEAN_LATENCY,STDDEV_LATENCY
MIGRATED TCP REQUEST/RESPONSE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET
to 7.7.7.84 () port 0 AF_INET : first burst 0 : cpu bind
RT_LATENCY=40545.492
MIN_LATENCY=9834
MAX_LATENCY=78366
P50_LATENCY=33583
P90_LATENCY=59000
P99_LATENCY=69000
MEAN_LATENCY=38364.67
STDDEV_LATENCY=12865.26

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: David Miller <davem@davemloft.net>
Cc: Tom Herbert <therbert@google.com>
Cc: Ben Hutchings <bhutchings@solarflare.com>
---
v2: min(1, (2*HZ/1000)) -> max(1, (2*HZ/1000)), as spotted by Ben

 kernel/softirq.c |   17 +++++++++--------
 1 file changed, 9 insertions(+), 8 deletions(-)



--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Joe Perches - Jan. 4, 2013, 8:15 a.m.
On Thu, 2013-01-03 at 23:49 -0800, Eric Dumazet wrote:
> In various network workloads, __do_softirq() latencies can be up
> to 20 ms if HZ=1000, and 200 ms if HZ=100.
> This patch changes the fallback to ksoftirqd condition to :
> - A time limit of 2 ms.

[]
> diff --git a/kernel/softirq.c b/kernel/softirq.c
[]
> +#define MAX_SOFTIRQ_TIME  max(1, (2*HZ/1000))

And if HZ is 10000?
 
>  asmlinkage void __do_softirq(void)
>  {
[]
> +	unsigned long end = jiffies + MAX_SOFTIRQ_TIME;

Perhaps MAX_SOFTIRQ_TIME should be

#define MAX_SOFTIRQ_TIME msecs_to_jiffies(2)

though it would be nicer if it were a compile time constant.


--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Eric Dumazet - Jan. 4, 2013, 8:23 a.m.
On Fri, 2013-01-04 at 00:15 -0800, Joe Perches wrote:
> On Thu, 2013-01-03 at 23:49 -0800, Eric Dumazet wrote:
> > In various network workloads, __do_softirq() latencies can be up
> > to 20 ms if HZ=1000, and 200 ms if HZ=100.
> > This patch changes the fallback to ksoftirqd condition to :
> > - A time limit of 2 ms.
> 
> []
> > diff --git a/kernel/softirq.c b/kernel/softirq.c
> []
> > +#define MAX_SOFTIRQ_TIME  max(1, (2*HZ/1000))
> 
> And if HZ is 10000?
>  

Then its OK.  2*10000/1000 -> 20 ticks -> 2 ms


> >  asmlinkage void __do_softirq(void)
> >  {
> []
> > +	unsigned long end = jiffies + MAX_SOFTIRQ_TIME;
> 
> Perhaps MAX_SOFTIRQ_TIME should be
> 
> #define MAX_SOFTIRQ_TIME msecs_to_jiffies(2)
> 
> though it would be nicer if it were a compile time constant.

If you send a patch to convert msecs_to_jiffies() to an inline function
when HZ = 1000, I will gladly use it instead of (2*HZ/1000)

Right now, max(1, msecs_to_jiffies(2)) uses way too many instructions,
while it should be the constant 2, known at compile time.



--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Miller - Jan. 4, 2013, 9:49 p.m.
From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Thu, 03 Jan 2013 23:49:40 -0800

> From: Eric Dumazet <edumazet@google.com>
> 
> In various network workloads, __do_softirq() latencies can be up
> to 20 ms if HZ=1000, and 200 ms if HZ=100.
> 
> This is because we iterate 10 times in the softirq dispatcher,
> and some actions can consume a lot of cycles.
> 
> This patch changes the fallback to ksoftirqd condition to :
> 
> - A time limit of 2 ms.
> - need_resched() being set on current task
> 
> When one of this condition is met, we wakeup ksoftirqd for further
> softirq processing if we still have pending softirqs.
 ...
> Signed-off-by: Eric Dumazet <edumazet@google.com>

Acked-by: David S. Miller <davem@davemloft.net>
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Patch

diff --git a/kernel/softirq.c b/kernel/softirq.c
index ed567ba..8d5e4be 100644
--- a/kernel/softirq.c
+++ b/kernel/softirq.c
@@ -195,21 +195,21 @@  void local_bh_enable_ip(unsigned long ip)
 EXPORT_SYMBOL(local_bh_enable_ip);
 
 /*
- * We restart softirq processing MAX_SOFTIRQ_RESTART times,
- * and we fall back to softirqd after that.
+ * We restart softirq processing for at most 2 ms,
+ * and if need_resched() is not set.
  *
- * This number has been established via experimentation.
+ * These limits have been established via experimentation.
  * The two things to balance is latency against fairness -
  * we want to handle softirqs as soon as possible, but they
  * should not be able to lock up the box.
  */
-#define MAX_SOFTIRQ_RESTART 10
+#define MAX_SOFTIRQ_TIME  max(1, (2*HZ/1000))
 
 asmlinkage void __do_softirq(void)
 {
 	struct softirq_action *h;
 	__u32 pending;
-	int max_restart = MAX_SOFTIRQ_RESTART;
+	unsigned long end = jiffies + MAX_SOFTIRQ_TIME;
 	int cpu;
 	unsigned long old_flags = current->flags;
 
@@ -264,11 +264,12 @@  restart:
 	local_irq_disable();
 
 	pending = local_softirq_pending();
-	if (pending && --max_restart)
-		goto restart;
+	if (pending) {
+		if (time_before(jiffies, end) && !need_resched())
+			goto restart;
 
-	if (pending)
 		wakeup_softirqd();
+	}
 
 	lockdep_softirq_exit();