[net-next] ip6mr: remove synchronize_rcu() in favor of SOCK_RCU_FREE

Message ID 1520440999.109662.48.camel@gmail.com
State Accepted, archived
Delegated to: David Miller
Series [net-next] ip6mr: remove synchronize_rcu() in favor of SOCK_RCU_FREE

Commit Message

Eric Dumazet March 7, 2018, 4:43 p.m. UTC
From: Eric Dumazet <edumazet@google.com>

Kirill found that recently added synchronize_rcu() call in ip6mr_sk_done()
was slowing down netns dismantle and posted a patch to use it only if the
socket was found.

I instead suggested to get rid of this call, and use instead SOCK_RCU_FREE.

We might later change IPv4 side to use the same technique and unify
both stacks. IPv4 does not use synchronize_rcu() but has a call_rcu()
that could be replaced by SOCK_RCU_FREE.

Tested:
 time for i in {1..1000}; do unshare -n /bin/false;done

 Before : real 7m18.911s
 After : real 10.187s

Fixes: 8571ab479a6e ("ip6mr: Make mroute_sk rcu-based")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Cc: Yuval Mintz <yuvalm@mellanox.com>
---
 net/ipv6/ip6mr.c |    6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)
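
As a rough reading of the Tested numbers above, the two wall-clock totals can be turned into an average cost per `unshare -n` iteration and a speedup ratio. The sketch below is only arithmetic on the figures quoted in the commit message. The helper names (to_seconds, per_iteration_ms, speedup, print_summary) are made up for illustration, and the averages cover everything the loop does, not only the RCU wait.

```c
#include <stdio.h>

/* Convert a "real" time given as minutes plus seconds into seconds. */
static double to_seconds(int minutes, double seconds)
{
	return minutes * 60.0 + seconds;
}

/* Average wall-clock cost of one loop iteration, in milliseconds. */
static double per_iteration_ms(double total_seconds, int iterations)
{
	return total_seconds * 1000.0 / iterations;
}

/* Ratio of the "before" total to the "after" total. */
static double speedup(double before_seconds, double after_seconds)
{
	return before_seconds / after_seconds;
}

/* Print the summary for the figures quoted in the commit message. */
static void print_summary(void)
{
	const int iterations = 1000;                   /* {1..1000} in the test loop */
	const double before = to_seconds(7, 18.911);   /* Before : real 7m18.911s */
	const double after = to_seconds(0, 10.187);    /* After  : real 10.187s */

	printf("before: %.1f ms per iteration\n", per_iteration_ms(before, iterations));
	printf("after:  %.1f ms per iteration\n", per_iteration_ms(after, iterations));
	printf("speedup: %.1fx\n", speedup(before, after));
}
```

Calling print_summary() from a main() gives roughly 439 ms per iteration before the patch and about 10 ms after, a speedup of about 43x.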

Comments

Kirill Tkhai March 7, 2018, 10:11 p.m. UTC | #1
On 07.03.2018 19:43, Eric Dumazet wrote:
> From: Eric Dumazet <edumazet@google.com>
> 
> Kirill found that recently added synchronize_rcu() call in ip6mr_sk_done()
> was slowing down netns dismantle and posted a patch to use it only if the
> socket was found.
> 
> I instead suggested to get rid of this call, and use instead SOCK_RCU_FREE.
> 
> We might later change IPv4 side to use the same technique and unify
> both stacks. IPv4 does not use synchronize_rcu() but has a call_rcu()
> that could be replaced by SOCK_RCU_FREE.
> 
> Tested:
>  time for i in {1..1000}; do unshare -n /bin/false;done
> 
>  Before : real 7m18.911s
>  After : real 10.187s
> 
> Fixes: 8571ab479a6e ("ip6mr: Make mroute_sk rcu-based")
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Reported-by: Kirill Tkhai <ktkhai@virtuozzo.com>
> Cc: Yuval Mintz <yuvalm@mellanox.com>

Reviewed-by: Kirill Tkhai <ktkhai@virtuozzo.com>

> ---
>  net/ipv6/ip6mr.c |    6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/net/ipv6/ip6mr.c b/net/ipv6/ip6mr.c
> index 2a38f9de45d399fafc9f4bcc662b44be17279e51..7345bd6c4b7dda39c0d73d542e9ca9a5366542ff 100644
> --- a/net/ipv6/ip6mr.c
> +++ b/net/ipv6/ip6mr.c
> @@ -1443,6 +1443,7 @@ static int ip6mr_sk_init(struct mr_table *mrt, struct sock *sk)
>  		err = -EADDRINUSE;
>  	} else {
>  		rcu_assign_pointer(mrt->mroute_sk, sk);
> +		sock_set_flag(sk, SOCK_RCU_FREE);
>  		net->ipv6.devconf_all->mc_forwarding++;
>  	}
>  	write_unlock_bh(&mrt_lock);
> @@ -1472,6 +1473,10 @@ int ip6mr_sk_done(struct sock *sk)
>  		if (sk == rtnl_dereference(mrt->mroute_sk)) {
>  			write_lock_bh(&mrt_lock);
>  			RCU_INIT_POINTER(mrt->mroute_sk, NULL);
> +			/* Note that mroute_sk had SOCK_RCU_FREE set,
> +			 * so the RCU grace period before sk freeing
> +			 * is guaranteed by sk_destruct()
> +			 */
>  			net->ipv6.devconf_all->mc_forwarding--;
>  			write_unlock_bh(&mrt_lock);
>  			inet6_netconf_notify_devconf(net, RTM_NEWNETCONF,
> @@ -1485,7 +1490,6 @@ int ip6mr_sk_done(struct sock *sk)
>  		}
>  	}
>  	rtnl_unlock();
> -	synchronize_rcu();
>  
>  	return err;
>  }
>
David Miller March 7, 2018, 11:14 p.m. UTC | #2
From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Wed, 07 Mar 2018 08:43:19 -0800

> From: Eric Dumazet <edumazet@google.com>
> 
> Kirill found that recently added synchronize_rcu() call in ip6mr_sk_done()
> was slowing down netns dismantle and posted a patch to use it only if the
> socket was found.
> 
> I instead suggested to get rid of this call, and use instead SOCK_RCU_FREE.
> 
> We might later change IPv4 side to use the same technique and unify
> both stacks. IPv4 does not use synchronize_rcu() but has a call_rcu()
> that could be replaced by SOCK_RCU_FREE.
> 
> Tested:
>  time for i in {1..1000}; do unshare -n /bin/false;done
> 
>  Before : real 7m18.911s
>  After : real 10.187s
> 
> Fixes: 8571ab479a6e ("ip6mr: Make mroute_sk rcu-based")
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Reported-by: Kirill Tkhai <ktkhai@virtuozzo.com>
> Cc: Yuval Mintz <yuvalm@mellanox.com>

Looks great, applied, thanks everyone.
Patch

diff --git a/net/ipv6/ip6mr.c b/net/ipv6/ip6mr.c
index 2a38f9de45d399fafc9f4bcc662b44be17279e51..7345bd6c4b7dda39c0d73d542e9ca9a5366542ff 100644
--- a/net/ipv6/ip6mr.c
+++ b/net/ipv6/ip6mr.c
@@ -1443,6 +1443,7 @@  static int ip6mr_sk_init(struct mr_table *mrt, struct sock *sk)
 		err = -EADDRINUSE;
 	} else {
 		rcu_assign_pointer(mrt->mroute_sk, sk);
+		sock_set_flag(sk, SOCK_RCU_FREE);
 		net->ipv6.devconf_all->mc_forwarding++;
 	}
 	write_unlock_bh(&mrt_lock);
@@ -1472,6 +1473,10 @@  int ip6mr_sk_done(struct sock *sk)
 		if (sk == rtnl_dereference(mrt->mroute_sk)) {
 			write_lock_bh(&mrt_lock);
 			RCU_INIT_POINTER(mrt->mroute_sk, NULL);
+			/* Note that mroute_sk had SOCK_RCU_FREE set,
+			 * so the RCU grace period before sk freeing
+			 * is guaranteed by sk_destruct()
+			 */
 			net->ipv6.devconf_all->mc_forwarding--;
 			write_unlock_bh(&mrt_lock);
 			inet6_netconf_notify_devconf(net, RTM_NEWNETCONF,
@@ -1485,7 +1490,6 @@  int ip6mr_sk_done(struct sock *sk)
 		}
 	}
 	rtnl_unlock();
-	synchronize_rcu();
 
 	return err;
 }
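
The idea behind the change can be illustrated with a small userspace toy model. This is a sketch under stated assumptions, not kernel code: struct toy_sock, its rcu_free field, and every toy_* function are hypothetical stand-ins, and a "grace period" is reduced to an explicit function call. It only contrasts two shapes of teardown: one where every caller blocks for its own grace period (as with the removed synchronize_rcu()), and one where the object carries a flag and its free is queued until a later grace period (the role SOCK_RCU_FREE plays via sk_destruct(), per the comment added in the patch).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Toy stand-in for a socket; NOT the kernel's struct sock. */
struct toy_sock {
	bool rcu_free;          /* models the SOCK_RCU_FREE flag */
	struct toy_sock *next;  /* link in the deferred-free queue */
};

static struct toy_sock *toy_deferred;   /* objects awaiting a grace period */
static unsigned long toy_caller_waits;  /* teardowns that had to block */

/* Allocate a toy socket, optionally flagged for deferred freeing. */
static struct toy_sock *toy_sock_new(bool rcu_free)
{
	struct toy_sock *sk = calloc(1, sizeof(*sk));

	assert(sk != NULL);
	sk->rcu_free = rcu_free;
	return sk;
}

/* Old shape: every teardown waits for its own grace period, then frees. */
static void toy_done_blocking(struct toy_sock *sk)
{
	toy_caller_waits++;     /* models the removed synchronize_rcu() */
	free(sk);
}

/* New shape: teardown never blocks; a flagged object is queued instead. */
static void toy_done_deferred(struct toy_sock *sk)
{
	if (sk->rcu_free) {
		sk->next = toy_deferred;
		toy_deferred = sk;
	} else {
		free(sk);
	}
}

/* Models one grace period ending: free everything queued, return the count. */
static unsigned long toy_grace_period_end(void)
{
	unsigned long n = 0;

	while (toy_deferred) {
		struct toy_sock *sk = toy_deferred;

		toy_deferred = sk->next;
		free(sk);
		n++;
	}
	return n;
}
```

In this model, 1000 calls to toy_done_blocking() record 1000 caller waits, while 1000 calls to toy_done_deferred() on flagged objects record none and are all released by a single toy_grace_period_end(). That batching is the intuition behind the netns dismantle speedup, with the caveat that the real kernel paths are considerably more involved.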