From patchwork Wed Dec 6 23:03:19 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Cong Wang
X-Patchwork-Id: 845393
X-Patchwork-Delegate: davem@davemloft.net
From: Cong Wang
To: netdev@vger.kernel.org
Cc: Cong Wang, Daniel Borkmann, Kevin Cernekee
Subject: [Patch net-next 1/2] netlink: make netlink tap per netns
Date: Wed, 6 Dec 2017 15:03:19 -0800
Message-Id: <20171206230320.22191-2-xiyou.wangcong@gmail.com>
X-Mailer: git-send-email 2.9.4
In-Reply-To: <20171206230320.22191-1-xiyou.wangcong@gmail.com>
References: <20171206230320.22191-1-xiyou.wangcong@gmail.com>
Sender: netdev-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: netdev@vger.kernel.org

nlmon device is not supposed to capture netlink events from
other netns, so instead of filtering events, we can simply
make netlink tap itself per netns.

Cc: Daniel Borkmann
Cc: Kevin Cernekee
Signed-off-by: Cong Wang
---
 net/netlink/af_netlink.c | 66 +++++++++++++++++++++++++++++++++++-------------
 1 file changed, 49 insertions(+), 17 deletions(-)

diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c
index b9e0ee4e22f5..de5324cc98d4 100644
--- a/net/netlink/af_netlink.c
+++ b/net/netlink/af_netlink.c
@@ -65,6 +65,7 @@
 #include
 
 #include
+#include
 #include
 #include
 #include
@@ -145,8 +146,6 @@ static atomic_t nl_table_users = ATOMIC_INIT(0);
 
 static BLOCKING_NOTIFIER_HEAD(netlink_chain);
 
-static DEFINE_SPINLOCK(netlink_tap_lock);
-static struct list_head netlink_tap_all __read_mostly;
 
 static const struct rhashtable_params netlink_rhashtable_params;
 
@@ -173,14 +172,24 @@ static struct sk_buff *netlink_to_full_skb(const struct sk_buff *skb,
 	return new;
 }
 
+static unsigned int netlink_tap_net_id;
+
+struct netlink_tap_net {
+	struct list_head netlink_tap_all;
+	spinlock_t netlink_tap_lock;
+};
+
 int netlink_add_tap(struct netlink_tap *nt)
 {
+	struct net *net = dev_net(nt->dev);
+	struct netlink_tap_net *nn = net_generic(net, netlink_tap_net_id);
+
 	if (unlikely(nt->dev->type != ARPHRD_NETLINK))
 		return -EINVAL;
 
-	spin_lock(&netlink_tap_lock);
-	list_add_rcu(&nt->list, &netlink_tap_all);
-	spin_unlock(&netlink_tap_lock);
+	spin_lock(&nn->netlink_tap_lock);
+	list_add_rcu(&nt->list, &nn->netlink_tap_all);
+	spin_unlock(&nn->netlink_tap_lock);
 
 	__module_get(nt->module);
 
@@ -190,12 +199,14 @@ EXPORT_SYMBOL_GPL(netlink_add_tap);
 
 static int __netlink_remove_tap(struct netlink_tap *nt)
 {
+	struct net *net = dev_net(nt->dev);
+	struct netlink_tap_net *nn = net_generic(net, netlink_tap_net_id);
 	bool found = false;
 	struct netlink_tap *tmp;
 
-	spin_lock(&netlink_tap_lock);
+	spin_lock(&nn->netlink_tap_lock);
 
-	list_for_each_entry(tmp, &netlink_tap_all, list) {
+	list_for_each_entry(tmp, &nn->netlink_tap_all, list) {
 		if (nt == tmp) {
 			list_del_rcu(&nt->list);
 			found = true;
@@ -205,7 +216,7 @@ static int __netlink_remove_tap(struct netlink_tap *nt)
 
 	pr_warn("__netlink_remove_tap: %p not found\n", nt);
 out:
-	spin_unlock(&netlink_tap_lock);
+	spin_unlock(&nn->netlink_tap_lock);
 
 	if (found)
 		module_put(nt->module);
@@ -224,6 +235,26 @@ int netlink_remove_tap(struct netlink_tap *nt)
 }
 EXPORT_SYMBOL_GPL(netlink_remove_tap);
 
+static __net_init int netlink_tap_init_net(struct net *net)
+{
+	struct netlink_tap_net *nn = net_generic(net, netlink_tap_net_id);
+
+	INIT_LIST_HEAD(&nn->netlink_tap_all);
+	spin_lock_init(&nn->netlink_tap_lock);
+	return 0;
+}
+
+static void __net_exit netlink_tap_exit_net(struct net *net)
+{
+}
+
+static struct pernet_operations netlink_tap_net_ops = {
+	.init = netlink_tap_init_net,
+	.exit = netlink_tap_exit_net,
+	.id = &netlink_tap_net_id,
+	.size = sizeof(struct netlink_tap_net),
+};
+
 static bool netlink_filter_tap(const struct sk_buff *skb)
 {
 	struct sock *sk = skb->sk;
@@ -274,7 +305,7 @@ static int __netlink_deliver_tap_skb(struct sk_buff *skb,
 	return ret;
 }
 
-static void __netlink_deliver_tap(struct sk_buff *skb)
+static void __netlink_deliver_tap(struct sk_buff *skb, struct netlink_tap_net *nn)
 {
 	int ret;
 	struct netlink_tap *tmp;
@@ -282,19 +313,21 @@ static void __netlink_deliver_tap(struct sk_buff *skb)
 	if (!netlink_filter_tap(skb))
 		return;
 
-	list_for_each_entry_rcu(tmp, &netlink_tap_all, list) {
+	list_for_each_entry_rcu(tmp, &nn->netlink_tap_all, list) {
 		ret = __netlink_deliver_tap_skb(skb, tmp->dev);
 		if (unlikely(ret))
 			break;
 	}
 }
 
-static void netlink_deliver_tap(struct sk_buff *skb)
+static void netlink_deliver_tap(struct net *net, struct sk_buff *skb)
 {
+	struct netlink_tap_net *nn = net_generic(net, netlink_tap_net_id);
+
 	rcu_read_lock();
 
-	if (unlikely(!list_empty(&netlink_tap_all)))
-		__netlink_deliver_tap(skb);
+	if (unlikely(!list_empty(&nn->netlink_tap_all)))
+		__netlink_deliver_tap(skb, nn);
 
 	rcu_read_unlock();
 }
@@ -303,7 +336,7 @@ static void netlink_deliver_tap_kernel(struct sock *dst, struct sock *src,
 				       struct sk_buff *skb)
 {
 	if (!(netlink_is_kernel(dst) && netlink_is_kernel(src)))
-		netlink_deliver_tap(skb);
+		netlink_deliver_tap(sock_net(dst), skb);
 }
 
 static void netlink_overrun(struct sock *sk)
@@ -1213,7 +1246,7 @@ static int __netlink_sendskb(struct sock *sk, struct sk_buff *skb)
 {
 	int len = skb->len;
 
-	netlink_deliver_tap(skb);
+	netlink_deliver_tap(sock_net(sk), skb);
 
 	skb_queue_tail(&sk->sk_receive_queue, skb);
 	sk->sk_data_ready(sk);
@@ -2730,12 +2763,11 @@ static int __init netlink_proto_init(void)
 		}
 	}
 
-	INIT_LIST_HEAD(&netlink_tap_all);
-
 	netlink_add_usersock_entry();
 
 	sock_register(&netlink_family_ops);
 	register_pernet_subsys(&netlink_net_ops);
+	register_pernet_subsys(&netlink_tap_net_ops);
 	/* The netlink device handler may be needed early. */
 	rtnetlink_init();
 out: