[ovs-dev,2/4] datapath: Introduce net_rwsem and remove rtnl_lock()

Message ID 1531788961-46115-3-git-send-email-yihung.wei@gmail.com
State Accepted
Series Kernel backports from net-next

Commit Message

Yi-Hung Wei July 17, 2018, 12:55 a.m. UTC
This patch backports the following two upstream commits and
adds a new symbol, HAVE_NET_RWSEM, in acinclude.m4 to determine
whether to use the newly introduced rw_semaphore, net_rwsem.

Upstream commit:
    commit f0b07bb151b098d291fd1fd71ef7a2df56fb124a
    Author: Kirill Tkhai <ktkhai@virtuozzo.com>
    Date:   Thu Mar 29 19:20:32 2018 +0300

    net: Introduce net_rwsem to protect net_namespace_list

    rtnl_lock() is used everywhere, and contention is very high.
    When someone wants to iterate over alive net namespaces,
    he/she has no a possibility to do that without exclusive lock.
    But the exclusive rtnl_lock() in such places is overkill,
    and it just increases the contention. Yes, there is already
    for_each_net_rcu() in kernel, but it requires rcu_read_lock(),
    and this can't be sleepable. Also, sometimes it may be need
    really prevent net_namespace_list growth, so for_each_net_rcu()
    is not fit there.

    This patch introduces new rw_semaphore, which will be used
    instead of rtnl_mutex to protect net_namespace_list. It is
    sleepable and allows not-exclusive iterations over net
    namespaces list. It allows to stop using rtnl_lock()
    in several places (what is made in next patches) and makes
    less the time, we keep rtnl_mutex. Here we just add new lock,
    while the explanation of we can remove rtnl_lock() there are
    in next patches.

    Fine grained locks generally are better, then one big lock,
    so let's do that with net_namespace_list, while the situation
    allows that.

    Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Upstream commit:
    commit ec9c780925c57588637e1dbd8650d294107311c0
    Author: Kirill Tkhai <ktkhai@virtuozzo.com>
    Date:   Thu Mar 29 19:21:09 2018 +0300

    ovs: Remove rtnl_lock() from ovs_exit_net()

    Here we iterate for_each_net() and removes
    vport from alive net to the exiting net.

    ovs_net::dps are protected by ovs_mutex(),
    and the others, who change it (ovs_dp_cmd_new(),
    __dp_destroy()) also take it.
    The same with datapath::ports list.

    So, we remove rtnl_lock() here.

    Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com>
---
 acinclude.m4        | 1 +
 datapath/datapath.c | 8 ++++++++
 2 files changed, 9 insertions(+)

Comments

Gregory Rose July 17, 2018, 8:48 p.m. UTC | #1
On 7/16/2018 5:55 PM, Yi-Hung Wei wrote:
> This patch backports the following two upstream commits and
> add a new symbol HAVE_NET_RWSEM in acinclude.m4 to determine
> whether to use new introduced rw_semaphore, net_rwsem.
>
> Upstream commit:
>      commit f0b07bb151b098d291fd1fd71ef7a2df56fb124a
>      Author: Kirill Tkhai <ktkhai@virtuozzo.com>
>      Date:   Thu Mar 29 19:20:32 2018 +0300
>
>      net: Introduce net_rwsem to protect net_namespace_list
>
>      rtnl_lock() is used everywhere, and contention is very high.
>      When someone wants to iterate over alive net namespaces,
>      he/she has no a possibility to do that without exclusive lock.
>      But the exclusive rtnl_lock() in such places is overkill,
>      and it just increases the contention. Yes, there is already
>      for_each_net_rcu() in kernel, but it requires rcu_read_lock(),
>      and this can't be sleepable. Also, sometimes it may be need
>      really prevent net_namespace_list growth, so for_each_net_rcu()
>      is not fit there.
>
>      This patch introduces new rw_semaphore, which will be used
>      instead of rtnl_mutex to protect net_namespace_list. It is
>      sleepable and allows not-exclusive iterations over net
>      namespaces list. It allows to stop using rtnl_lock()
>      in several places (what is made in next patches) and makes
>      less the time, we keep rtnl_mutex. Here we just add new lock,
>      while the explanation of we can remove rtnl_lock() there are
>      in next patches.
>
>      Fine grained locks generally are better, then one big lock,
>      so let's do that with net_namespace_list, while the situation
>      allows that.
>
>      Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
>      Signed-off-by: David S. Miller <davem@davemloft.net>
>
> Upstream commit:
>      commit ec9c780925c57588637e1dbd8650d294107311c0
>      Author: Kirill Tkhai <ktkhai@virtuozzo.com>
>      Date:   Thu Mar 29 19:21:09 2018 +0300
>
>      ovs: Remove rtnl_lock() from ovs_exit_net()
>
>      Here we iterate for_each_net() and removes
>      vport from alive net to the exiting net.
>
>      ovs_net::dps are protected by ovs_mutex(),
>      and the others, who change it (ovs_dp_cmd_new(),
>      __dp_destroy()) also take it.
>      The same with datapath::ports list.
>
>      So, we remove rtnl_lock() here.
>
>      Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
>      Signed-off-by: David S. Miller <davem@davemloft.net>
>
> Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com>

I haven't had a chance to test this but it looks good.

Reviewed-by: Greg Rose <gvrose8192@gmail.com>


Patch

diff --git a/acinclude.m4 b/acinclude.m4
index 991a6275b978..ae8e66fc4967 100644
--- a/acinclude.m4
+++ b/acinclude.m4
@@ -634,6 +634,7 @@  AC_DEFUN([OVS_CHECK_LINUX_COMPAT], [
                   [OVS_GREP_IFELSE([$KSRC/include/linux/rtnetlink.h],
                                    [rcu_read_lock_held])])
   OVS_GREP_IFELSE([$KSRC/include/linux/rtnetlink.h], [lockdep_rtnl_is_held])
+  OVS_GREP_IFELSE([$KSRC/include/linux/rtnetlink.h], [net_rwsem])
 
   # Check for the proto_data_valid member in struct sk_buff.  The [^@]
   # is necessary because some versions of this header remove the
diff --git a/datapath/datapath.c b/datapath/datapath.c
index 43f0d7432593..72b5e8b5c29c 100644
--- a/datapath/datapath.c
+++ b/datapath/datapath.c
@@ -2385,10 +2385,18 @@  static void __net_exit ovs_exit_net(struct net *dnet)
 	list_for_each_entry_safe(dp, dp_next, &ovs_net->dps, list_node)
 		__dp_destroy(dp);
 
+#ifdef HAVE_NET_RWSEM
+	down_read(&net_rwsem);
+#else
 	rtnl_lock();
+#endif
 	for_each_net(net)
 		list_vports_from_net(net, dnet, &head);
+#ifdef HAVE_NET_RWSEM
+	up_read(&net_rwsem);
+#else
 	rtnl_unlock();
+#endif
 
 	/* Detach all vports from given namespace. */
 	list_for_each_entry_safe(vport, vport_next, &head, detach_list) {
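
The locking choice in the datapath.c hunk can be illustrated outside the kernel. The following is a minimal userspace sketch, not code from this patch: pthread locks stand in for net_rwsem and rtnl_mutex, and struct ns_node, ns_list, ns_add() and count_namespaces() are hypothetical names made up for the illustration. It assumes HAVE_NET_RWSEM is supplied at build time (for example with -DHAVE_NET_RWSEM), in the same way the acinclude.m4 check would supply it for the real module.

```c
/* Userspace analogy only: pthread locks stand in for the kernel's
 * net_rwsem and rtnl_mutex.  All names below are hypothetical.
 */
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

struct ns_node {			/* stand-in for struct net */
	int id;
	struct ns_node *next;
};

static struct ns_node *ns_list;		/* stand-in for net_namespace_list */

#ifdef HAVE_NET_RWSEM
/* Fine-grained lock: many readers may walk the list at once. */
static pthread_rwlock_t ns_rwlock = PTHREAD_RWLOCK_INITIALIZER;
#else
/* Fallback: one big exclusive lock, as with rtnl_lock(). */
static pthread_mutex_t big_lock = PTHREAD_MUTEX_INITIALIZER;
#endif

static void ns_list_read_lock(void)
{
#ifdef HAVE_NET_RWSEM
	int err = pthread_rwlock_rdlock(&ns_rwlock);
#else
	int err = pthread_mutex_lock(&big_lock);
#endif
	assert(err == 0);
	(void)err;
}

static void ns_list_write_lock(void)
{
#ifdef HAVE_NET_RWSEM
	int err = pthread_rwlock_wrlock(&ns_rwlock);
#else
	int err = pthread_mutex_lock(&big_lock);
#endif
	assert(err == 0);
	(void)err;
}

static void ns_list_unlock(void)
{
#ifdef HAVE_NET_RWSEM
	int err = pthread_rwlock_unlock(&ns_rwlock);
#else
	int err = pthread_mutex_unlock(&big_lock);
#endif
	assert(err == 0);
	(void)err;
}

/* Writer: changes the list, so it always needs exclusive access. */
static void ns_add(struct ns_node *node)
{
	ns_list_write_lock();
	node->next = ns_list;
	ns_list = node;
	ns_list_unlock();
}

/* Reader: only iterates, so a shared lock is enough when available. */
static int count_namespaces(void)
{
	const struct ns_node *p;
	int n = 0;

	ns_list_read_lock();
	for (p = ns_list; p != NULL; p = p->next)
		n++;
	ns_list_unlock();
	return n;
}
```

Built either with or without -DHAVE_NET_RWSEM, the callers are unchanged; only the lock behind the helpers differs. The patch itself keeps the #ifdef inline in ovs_exit_net() rather than wrapping it in helpers, which is a reasonable choice for a single call site.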