net: use synchronize_rcu_expedited()

Message ID 1306228052.3026.16.camel@edumazet-laptop
State Accepted, archived
Delegated to: David Miller

Commit Message

Eric Dumazet May 24, 2011, 9:07 a.m. UTC
synchronize_rcu() is very slow in various situations (HZ=100,
CONFIG_NO_HZ=y, CONFIG_PREEMPT=n)

Extract from my (mostly idle) 8-core machine:

 synchronize_rcu() in 99985 us
 synchronize_rcu() in 79982 us
 synchronize_rcu() in 87612 us
 synchronize_rcu() in 79827 us
 synchronize_rcu() in 109860 us
 synchronize_rcu() in 98039 us
 synchronize_rcu() in 89841 us
 synchronize_rcu() in 79842 us
 synchronize_rcu() in 80151 us
 synchronize_rcu() in 119833 us
 synchronize_rcu() in 99858 us
 synchronize_rcu() in 73999 us
 synchronize_rcu() in 79855 us
 synchronize_rcu() in 79853 us


When we hold the RTNL mutex, we would rather spend some CPU cycles than
block other processes waiting for this mutex for too long.

We also want to setup/dismantle network features as fast as possible at
boot/shutdown time.

This patch makes synchronize_net() call the expedited version if RTNL is
locked.

synchronize_rcu_expedited() typical delay is about 20 us on my machine.

 synchronize_rcu_expedited() in 18 us
 synchronize_rcu_expedited() in 18 us
 synchronize_rcu_expedited() in 18 us
 synchronize_rcu_expedited() in 18 us
 synchronize_rcu_expedited() in 20 us
 synchronize_rcu_expedited() in 16 us
 synchronize_rcu_expedited() in 20 us
 synchronize_rcu_expedited() in 18 us
 synchronize_rcu_expedited() in 18 us
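
To put the two sets of measurements side by side, here is a small
userspace sketch (not part of the patch) that averages the samples
listed above. The array, macro and function names are made up for this
illustration; only the numbers come from the logs.

```c
#include <assert.h>
#include <stddef.h>

/* Timings in microseconds, copied from the logs in the commit message. */
static const unsigned long sync_rcu_us[] = {
	99985, 79982, 87612, 79827, 109860, 98039, 89841,
	79842, 80151, 119833, 99858, 73999, 79855, 79853,
};

static const unsigned long sync_rcu_expedited_us[] = {
	18, 18, 18, 18, 20, 16, 20, 18, 18,
};

/* Hypothetical helper macro: number of elements in a static array. */
#define ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))

/* Arithmetic mean of n samples; returns 0.0 for an empty set. */
static double mean_us(const unsigned long *v, size_t n)
{
	unsigned long sum = 0;
	size_t i;

	assert(v != NULL || n == 0);
	if (n == 0)
		return 0.0;
	for (i = 0; i < n; i++)
		sum += v[i];
	return (double)sum / (double)n;
}

/* Ratio of the two means: how much faster the expedited call was. */
static double expedited_speedup(void)
{
	return mean_us(sync_rcu_us, ARRAY_LEN(sync_rcu_us)) /
	       mean_us(sync_rcu_expedited_us,
		       ARRAY_LEN(sync_rcu_expedited_us));
}
```

On these samples the means come out to roughly 89.9 ms for
synchronize_rcu() and 18.2 us for synchronize_rcu_expedited(), a ratio
of about 4900.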


Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
CC: Ben Greear <greearb@candelatech.com>
---
 net/core/dev.c |    5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)



--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Paul E. McKenney May 24, 2011, 3:44 p.m. UTC | #1
On Tue, May 24, 2011 at 11:07:32AM +0200, Eric Dumazet wrote:
> synchronize_rcu() is very slow in various situations (HZ=100,
> CONFIG_NO_HZ=y, CONFIG_PREEMPT=n)
> 
> Extract from my (mostly idle) 8 core machine :
> 
>  synchronize_rcu() in 99985 us
>  synchronize_rcu() in 79982 us
>  synchronize_rcu() in 87612 us
>  synchronize_rcu() in 79827 us
>  synchronize_rcu() in 109860 us
>  synchronize_rcu() in 98039 us
>  synchronize_rcu() in 89841 us
>  synchronize_rcu() in 79842 us
>  synchronize_rcu() in 80151 us
>  synchronize_rcu() in 119833 us
>  synchronize_rcu() in 99858 us
>  synchronize_rcu() in 73999 us
>  synchronize_rcu() in 79855 us
>  synchronize_rcu() in 79853 us
> 
> 
> When we hold RTNL mutex, we would like to spend some cpu cycles but not
> block too long other processes waiting for this mutex.
> 
> We also want to setup/dismantle network features as fast as possible at
> boot/shutdown time.
> 
> This patch makes synchronize_net() call the expedited version if RTNL is
> locked.
> 
> synchronize_rcu_expedited() typical delay is about 20 us on my machine.
> 
>  synchronize_rcu_expedited() in 18 us
>  synchronize_rcu_expedited() in 18 us
>  synchronize_rcu_expedited() in 18 us
>  synchronize_rcu_expedited() in 18 us
>  synchronize_rcu_expedited() in 20 us
>  synchronize_rcu_expedited() in 16 us
>  synchronize_rcu_expedited() in 20 us
>  synchronize_rcu_expedited() in 18 us
>  synchronize_rcu_expedited() in 18 us

Cool!!!

Just out of curiosity, how many CPUs does your system have?

Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
> CC: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> CC: Ben Greear <greearb@candelatech.com>
> ---
>  net/core/dev.c |    5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/net/core/dev.c b/net/core/dev.c
> index bcb05cb..ec11d75 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -5954,7 +5954,10 @@ EXPORT_SYMBOL(free_netdev);
>  void synchronize_net(void)
>  {
>  	might_sleep();
> -	synchronize_rcu();
> +	if (rtnl_is_locked())
> +		synchronize_rcu_expedited();
> +	else
> +		synchronize_rcu();
>  }
>  EXPORT_SYMBOL(synchronize_net);
> 
> 
> 
Eric Dumazet May 24, 2011, 3:52 p.m. UTC | #2
On Tuesday, May 24, 2011 at 08:44 -0700, Paul E. McKenney wrote:
> On Tue, May 24, 2011 at 11:07:32AM +0200, Eric Dumazet wrote:
> > synchronize_rcu() is very slow in various situations (HZ=100,
> > CONFIG_NO_HZ=y, CONFIG_PREEMPT=n)
> > 
> > Extract from my (mostly idle) 8 core machine :
> > 
> >  synchronize_rcu() in 99985 us
> >  synchronize_rcu() in 79982 us
> >  synchronize_rcu() in 87612 us
> >  synchronize_rcu() in 79827 us
> >  synchronize_rcu() in 109860 us
> >  synchronize_rcu() in 98039 us
> >  synchronize_rcu() in 89841 us
> >  synchronize_rcu() in 79842 us
> >  synchronize_rcu() in 80151 us
> >  synchronize_rcu() in 119833 us
> >  synchronize_rcu() in 99858 us
> >  synchronize_rcu() in 73999 us
> >  synchronize_rcu() in 79855 us
> >  synchronize_rcu() in 79853 us
> > 
> > 
> > When we hold RTNL mutex, we would like to spend some cpu cycles but not
> > block too long other processes waiting for this mutex.
> > 
> > We also want to setup/dismantle network features as fast as possible at
> > boot/shutdown time.
> > 
> > This patch makes synchronize_net() call the expedited version if RTNL is
> > locked.
> > 
> > synchronize_rcu_expedited() typical delay is about 20 us on my machine.
> > 
> >  synchronize_rcu_expedited() in 18 us
> >  synchronize_rcu_expedited() in 18 us
> >  synchronize_rcu_expedited() in 18 us
> >  synchronize_rcu_expedited() in 18 us
> >  synchronize_rcu_expedited() in 20 us
> >  synchronize_rcu_expedited() in 16 us
> >  synchronize_rcu_expedited() in 20 us
> >  synchronize_rcu_expedited() in 18 us
> >  synchronize_rcu_expedited() in 18 us
> 
> Cool!!!
> 
> Just out of curiosity, how many CPUs does your system have?

16 (2x4x2)  [ processor.max_cstate=1 ]

I am now trying to optimize rcu_barrier(). Do you have an idea for
getting an expedited version of it as well?

The following trace shows three groups of callbacks, spaced one jiffy
apart (10 ms at HZ=100).

Maybe we can avoid sending a call_rcu() to a CPU that has no pending RCU
work?

[  835.189996] cpu0 synchronize_rcu_expedited() in 30 us 
   -> begin rcu_barrier() immediately
[  835.259702] cpu15 rcu_barrier_callback()
[  835.259705] cpu14 rcu_barrier_callback()
[  835.259708] cpu7 rcu_barrier_callback()
[  835.259711] cpu12 rcu_barrier_callback()
[  835.259714] cpu8 rcu_barrier_callback()
[  835.259716] cpu1 rcu_barrier_callback()
[  835.259719] cpu0 rcu_barrier_callback()

[  835.269691] cpu13 rcu_barrier_callback()
[  835.269695] cpu11 rcu_barrier_callback()
[  835.269698] cpu5 rcu_barrier_callback()
[  835.269700] cpu6 rcu_barrier_callback()
[  835.269702] cpu10 rcu_barrier_callback()
[  835.269705] cpu3 rcu_barrier_callback()
[  835.269707] cpu2 rcu_barrier_callback()

[  835.279687] cpu4 rcu_barrier_callback()
[  835.279689] cpu9 rcu_barrier_callback()
[  835.279744] cpu0 rcu_barrier() in 89499 us
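
As a rough check of the one-jiffy spacing, the sketch below (userspace
C, not kernel code) takes the first timestamp of each group from the
trace and compares the gaps with the jiffy length at HZ=100. The names
group_start_us, jiffy_us() and group_gap_us() are invented for this
example.

```c
#include <assert.h>
#include <stddef.h>

/*
 * First rcu_barrier_callback() timestamp of each group in the trace,
 * converted from seconds to microseconds since boot.
 */
static const unsigned long long group_start_us[] = {
	835259702ULL,	/* cpu15 */
	835269691ULL,	/* cpu13 */
	835279687ULL,	/* cpu4  */
};

#define NR_GROUPS (sizeof(group_start_us) / sizeof(group_start_us[0]))

/* Length of one jiffy in microseconds for a given HZ value. */
static unsigned long jiffy_us(unsigned int hz)
{
	assert(hz != 0);
	return 1000000UL / hz;
}

/* Gap in microseconds between group i and group i + 1. */
static unsigned long long group_gap_us(size_t i)
{
	assert(i + 1 < NR_GROUPS);
	return group_start_us[i + 1] - group_start_us[i];
}
```

The two gaps come out to 9989 us and 9996 us against a 10000 us jiffy.
The first callback is logged about 69.7 ms after the
synchronize_rcu_expedited() line, so most of the 89.5 ms appears to be
spent before any callback runs.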

Thanks


David Miller May 24, 2011, 5:28 p.m. UTC | #3
From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Tue, 24 May 2011 11:07:32 +0200

> synchronize_rcu() is very slow in various situations (HZ=100,
> CONFIG_NO_HZ=y, CONFIG_PREEMPT=n)
> 
> Extract from my (mostly idle) 8 core machine :
 ...
> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>

Applied, thanks Eric.
Paul E. McKenney May 24, 2011, 7:24 p.m. UTC | #4
On Tue, May 24, 2011 at 05:52:44PM +0200, Eric Dumazet wrote:
> On Tuesday, May 24, 2011 at 08:44 -0700, Paul E. McKenney wrote:
> > On Tue, May 24, 2011 at 11:07:32AM +0200, Eric Dumazet wrote:
> > > synchronize_rcu() is very slow in various situations (HZ=100,
> > > CONFIG_NO_HZ=y, CONFIG_PREEMPT=n)
> > > 
> > > Extract from my (mostly idle) 8 core machine :
> > > 
> > >  synchronize_rcu() in 99985 us
> > >  synchronize_rcu() in 79982 us
> > >  synchronize_rcu() in 87612 us
> > >  synchronize_rcu() in 79827 us
> > >  synchronize_rcu() in 109860 us
> > >  synchronize_rcu() in 98039 us
> > >  synchronize_rcu() in 89841 us
> > >  synchronize_rcu() in 79842 us
> > >  synchronize_rcu() in 80151 us
> > >  synchronize_rcu() in 119833 us
> > >  synchronize_rcu() in 99858 us
> > >  synchronize_rcu() in 73999 us
> > >  synchronize_rcu() in 79855 us
> > >  synchronize_rcu() in 79853 us
> > > 
> > > 
> > > When we hold RTNL mutex, we would like to spend some cpu cycles but not
> > > block too long other processes waiting for this mutex.
> > > 
> > > We also want to setup/dismantle network features as fast as possible at
> > > boot/shutdown time.
> > > 
> > > This patch makes synchronize_net() call the expedited version if RTNL is
> > > locked.
> > > 
> > > synchronize_rcu_expedited() typical delay is about 20 us on my machine.
> > > 
> > >  synchronize_rcu_expedited() in 18 us
> > >  synchronize_rcu_expedited() in 18 us
> > >  synchronize_rcu_expedited() in 18 us
> > >  synchronize_rcu_expedited() in 18 us
> > >  synchronize_rcu_expedited() in 20 us
> > >  synchronize_rcu_expedited() in 16 us
> > >  synchronize_rcu_expedited() in 20 us
> > >  synchronize_rcu_expedited() in 18 us
> > >  synchronize_rcu_expedited() in 18 us
> > 
> > Cool!!!
> > 
> > Just out of curiosity, how many CPUs does your system have?
> 
> 16 (2x4x2)  [ processor.max_cstate=1 ]
> 
> I am now trying to optimize rcu_barrier(), if you have an idea to get an
> expedited version as well ?
> 
> We can see in following trace 3 groups, spaced by one jiffie (HZ=100)
> 
> Maybe we can avoid sending a call_rcu() to a cpu that has no pending rcu
> work ?

Might make sense, though most of the gains would need to come from
kicking the grace-period machinery hard in order to make it go faster.

Interesting -- I will give this some thought.

							Thanx, Paul

> [  835.189996] cpu0 synchronize_rcu_expedited() in 30 us 
>    -> begin rcu_barrier() immediately
> [  835.259702] cpu15 rcu_barrier_callback()
> [  835.259705] cpu14 rcu_barrier_callback()
> [  835.259708] cpu7 rcu_barrier_callback()
> [  835.259711] cpu12 rcu_barrier_callback()
> [  835.259714] cpu8 rcu_barrier_callback()
> [  835.259716] cpu1 rcu_barrier_callback()
> [  835.259719] cpu0 rcu_barrier_callback()
> 
> [  835.269691] cpu13 rcu_barrier_callback()
> [  835.269695] cpu11 rcu_barrier_callback()
> [  835.269698] cpu5 rcu_barrier_callback()
> [  835.269700] cpu6 rcu_barrier_callback()
> [  835.269702] cpu10 rcu_barrier_callback()
> [  835.269705] cpu3 rcu_barrier_callback()
> [  835.269707] cpu2 rcu_barrier_callback()
> 
> [  835.279687] cpu4 rcu_barrier_callback()
> [  835.279689] cpu9 rcu_barrier_callback()
> [  835.279744] cpu0 rcu_barrier() in 89499 us
> 
> Thanks
> 
> 
Eric Dumazet May 24, 2011, 7:44 p.m. UTC | #5
On Tuesday, May 24, 2011 at 12:24 -0700, Paul E. McKenney wrote:

> Might make sense, though most of the gains would need to come from
> kicking the grace-period machinery hard in order to make it go faster.
> 
> Interesting -- I will give this some thought.
> 

I am working on a final step, using a workqueue so that the
rcu_barrier() is not done under RTNL, so it won't be a blocking
point anymore when dismantling hundreds of devices per second...

Thanks


Paul E. McKenney May 24, 2011, 7:56 p.m. UTC | #6
On Tue, May 24, 2011 at 09:44:45PM +0200, Eric Dumazet wrote:
> On Tuesday, May 24, 2011 at 12:24 -0700, Paul E. McKenney wrote:
> 
> > Might make sense, though most of the gains would need to come from
> > kicking the grace-period machinery hard in order to make it go faster.
> > 
> > Interesting -- I will give this some thought.
> 
> I am working on a final step, using a workqueue so that the
> rcu_barrier() is not done under RTNL, so it wont be anymore a blocking
> point to dismantle hundred of devices per second...

OK, I will keep rcu_barrier_expedited() on the "might be useful" list,
but will keep to the current plan: finishing up rough edges on RCU
priority boosting, merging SRCU into TREE_RCU and TINY_RCU, and so forth.

But let me know if you do need it.

							Thanx, Paul
Patch

diff --git a/net/core/dev.c b/net/core/dev.c
index bcb05cb..ec11d75 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -5954,7 +5954,10 @@  EXPORT_SYMBOL(free_netdev);
 void synchronize_net(void)
 {
 	might_sleep();
-	synchronize_rcu();
+	if (rtnl_is_locked())
+		synchronize_rcu_expedited();
+	else
+		synchronize_rcu();
 }
 EXPORT_SYMBOL(synchronize_net);