{"id":816374,"url":"http://patchwork.ozlabs.org/api/patches/816374/?format=json","web_url":"http://patchwork.ozlabs.org/project/netdev/patch/db75c6a6872040712a9ab97b0bac04b697c42a4c.1505926196.git.pabeni@redhat.com/","project":{"id":7,"url":"http://patchwork.ozlabs.org/api/projects/7/?format=json","name":"Linux network development","link_name":"netdev","list_id":"netdev.vger.kernel.org","list_email":"netdev@vger.kernel.org","web_url":null,"scm_url":null,"webscm_url":null,"list_archive_url":"","list_archive_url_format":"","commit_url_format":""},"msgid":"<db75c6a6872040712a9ab97b0bac04b697c42a4c.1505926196.git.pabeni@redhat.com>","list_archive_url":null,"date":"2017-09-20T16:54:02","name":"[net-next,2/5] net: allow early demux to fetch noref socket","commit_ref":null,"pull_url":null,"state":"changes-requested","archived":true,"hash":"7b3fad33d7541a7a4c4ab82e8077c56d1722e715","submitter":{"id":67312,"url":"http://patchwork.ozlabs.org/api/people/67312/?format=json","name":"Paolo Abeni","email":"pabeni@redhat.com"},"delegate":{"id":34,"url":"http://patchwork.ozlabs.org/api/users/34/?format=json","username":"davem","first_name":"David","last_name":"Miller","email":"davem@davemloft.net"},"mbox":"http://patchwork.ozlabs.org/project/netdev/patch/db75c6a6872040712a9ab97b0bac04b697c42a4c.1505926196.git.pabeni@redhat.com/mbox/","series":[{"id":4180,"url":"http://patchwork.ozlabs.org/api/series/4180/?format=json","web_url":"http://patchwork.ozlabs.org/project/netdev/list/?series=4180","date":"2017-09-20T16:54:00","name":"net: introduce noref 
sk","version":1,"mbox":"http://patchwork.ozlabs.org/series/4180/mbox/"}],"comments":"http://patchwork.ozlabs.org/api/patches/816374/comments/","check":"pending","checks":"http://patchwork.ozlabs.org/api/patches/816374/checks/","tags":{},"related":[],"headers":{"Return-Path":"<netdev-owner@vger.kernel.org>","X-Original-To":"patchwork-incoming@ozlabs.org","Delivered-To":"patchwork-incoming@ozlabs.org","Authentication-Results":["ozlabs.org;\n\tspf=none (mailfrom) smtp.mailfrom=vger.kernel.org\n\t(client-ip=209.132.180.67; helo=vger.kernel.org;\n\tenvelope-from=netdev-owner@vger.kernel.org;\n\treceiver=<UNKNOWN>)","ext-mx05.extmail.prod.ext.phx2.redhat.com;\n\tdmarc=none (p=none dis=none) header.from=redhat.com","ext-mx05.extmail.prod.ext.phx2.redhat.com;\n\tspf=fail smtp.mailfrom=pabeni@redhat.com"],"Received":["from vger.kernel.org (vger.kernel.org [209.132.180.67])\n\tby ozlabs.org (Postfix) with ESMTP id 3xy5Xn1h8tz9sP1\n\tfor <patchwork-incoming@ozlabs.org>;\n\tThu, 21 Sep 2017 02:58:17 +1000 (AEST)","(majordomo@vger.kernel.org) by vger.kernel.org via listexpand\n\tid S1751521AbdITQ6O (ORCPT <rfc822;patchwork-incoming@ozlabs.org>);\n\tWed, 20 Sep 2017 12:58:14 -0400","from mx1.redhat.com ([209.132.183.28]:34370 \"EHLO mx1.redhat.com\"\n\trhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP\n\tid S1750938AbdITQ6M (ORCPT <rfc822;netdev@vger.kernel.org>);\n\tWed, 20 Sep 2017 12:58:12 -0400","from smtp.corp.redhat.com\n\t(int-mx02.intmail.prod.int.phx2.redhat.com [10.5.11.12])\n\t(using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))\n\t(No client certificate requested)\n\tby mx1.redhat.com (Postfix) with ESMTPS id 531DF1297;\n\tWed, 20 Sep 2017 16:58:12 +0000 (UTC)","from localhost.mxp.redhat.com (unknown [10.32.181.195])\n\tby smtp.corp.redhat.com (Postfix) with ESMTP id 11A3760C9A;\n\tWed, 20 Sep 2017 16:58:10 +0000 (UTC)"],"DMARC-Filter":"OpenDMARC Filter v1.3.2 mx1.redhat.com 531DF1297","From":"Paolo Abeni 
<pabeni@redhat.com>","To":"netdev@vger.kernel.org","Cc":"\"David S. Miller\" <davem@davemloft.net>,\n\tPablo Neira Ayuso <pablo@netfilter.org>, Florian Westphal <fw@strlen.de>,\n\tEric Dumazet <edumazet@google.com>,\n\tHannes Frederic Sowa <hannes@stressinduktion.org>","Subject":"[PATCH net-next 2/5] net: allow early demux to fetch noref socket","Date":"Wed, 20 Sep 2017 18:54:02 +0200","Message-Id":"<db75c6a6872040712a9ab97b0bac04b697c42a4c.1505926196.git.pabeni@redhat.com>","In-Reply-To":"<cover.1505926196.git.pabeni@redhat.com>","References":"<cover.1505926196.git.pabeni@redhat.com>","X-Scanned-By":"MIMEDefang 2.79 on 10.5.11.12","X-Greylist":"Sender IP whitelisted, not delayed by milter-greylist-4.5.16\n\t(mx1.redhat.com [10.5.110.29]);\n\tWed, 20 Sep 2017 16:58:12 +0000 (UTC)","Sender":"netdev-owner@vger.kernel.org","Precedence":"bulk","List-ID":"<netdev.vger.kernel.org>","X-Mailing-List":"netdev@vger.kernel.org"},"content":"We must be careful to avoid leaking such sockets outside\nthe RCU section containing the early demux call; we clear\nthem on nonlocal delivery.\n\nFor ipv4 we must take care of local mcast delivery, too,\nsince udp early demux also works for mcast addresses.\n\nAlso update all iptables/nftables extensions that can\nrun in the input chain and can transmit the skb outside\nsuch path, namely TEE, nft_dup and nfqueue.\n\nSigned-off-by: Paolo Abeni <pabeni@redhat.com>\n---\n net/ipv4/ip_input.c              | 12 ++++++++++++\n net/ipv4/ipmr.c                  | 18 ++++++++++++++----\n net/ipv4/netfilter/nf_dup_ipv4.c |  3 +++\n net/ipv6/ip6_input.c             |  7 ++++++-\n net/ipv6/netfilter/nf_dup_ipv6.c |  3 +++\n net/netfilter/nf_queue.c         |  3 +++\n 6 files changed, 41 insertions(+), 5 deletions(-)","diff":"diff --git a/net/ipv4/ip_input.c b/net/ipv4/ip_input.c\nindex fa2dc8f692c6..e71abc8b698c 100644\n--- a/net/ipv4/ip_input.c\n+++ b/net/ipv4/ip_input.c\n@@ -349,6 +349,18 @@ static int ip_rcv_finish(struct net *net, struct sock 
*sk, struct sk_buff *skb)\n \t\t\t\t__NET_INC_STATS(net, LINUX_MIB_IPRPFILTER);\n \t\t\tgoto drop;\n \t\t}\n+\n+\t\t/* Since the sk has no reference to the socket, we must\n+\t\t * clear it before escaping this RCU section.\n+\t\t * The sk is just an hint and we know we are not going to use\n+\t\t * it outside the input path.\n+\t\t */\n+\t\tif (skb_dst(skb)->input != ip_local_deliver\n+#ifdef CONFIG_IP_MROUTE\n+\t\t    && skb_dst(skb)->input != ip_mr_input\n+#endif\n+\t\t    )\n+\t\t\tskb_clear_noref_sk(skb);\n \t}\n \n #ifdef CONFIG_IP_ROUTE_CLASSID\ndiff --git a/net/ipv4/ipmr.c b/net/ipv4/ipmr.c\nindex c9b3e6e069ae..76642af79038 100644\n--- a/net/ipv4/ipmr.c\n+++ b/net/ipv4/ipmr.c\n@@ -1978,11 +1978,12 @@ static struct mr_table *ipmr_rt_fib_lookup(struct net *net, struct sk_buff *skb)\n  */\n int ip_mr_input(struct sk_buff *skb)\n {\n-\tstruct mfc_cache *cache;\n-\tstruct net *net = dev_net(skb->dev);\n \tint local = skb_rtable(skb)->rt_flags & RTCF_LOCAL;\n-\tstruct mr_table *mrt;\n+\tstruct net *net = dev_net(skb->dev);\n+\tstruct mfc_cache *cache;\n \tstruct net_device *dev;\n+\tstruct mr_table *mrt;\n+\tstruct sock *sk;\n \n \t/* skb->dev passed in is the loX master dev for vrfs.\n \t * As there are no vifs associated with loopback devices,\n@@ -2052,6 +2053,9 @@ int ip_mr_input(struct sk_buff *skb)\n \t\t\tskb = skb2;\n \t\t}\n \n+\t\t/* avoid leaking the noref sk on forward path */\n+\t\tskb_clear_noref_sk(skb);\n+\n \t\tread_lock(&mrt_lock);\n \t\tvif = ipmr_find_vif(mrt, dev);\n \t\tif (vif >= 0) {\n@@ -2065,12 +2069,18 @@ int ip_mr_input(struct sk_buff *skb)\n \t\treturn -ENODEV;\n \t}\n \n+\t/* avoid leaking the noref sk on forward path... */\n+\tsk = skb_clear_noref_sk(skb);\n \tread_lock(&mrt_lock);\n \tip_mr_forward(net, mrt, dev, skb, cache, local);\n \tread_unlock(&mrt_lock);\n \n-\tif (local)\n+\tif (local) {\n+\t\t/* ... 
but preserve it for local delivery */\n+\t\tif (sk)\n+\t\t\tskb_set_noref_sk(skb, sk);\n \t\treturn ip_local_deliver(skb);\n+\t}\n \n \treturn 0;\n \ndiff --git a/net/ipv4/netfilter/nf_dup_ipv4.c b/net/ipv4/netfilter/nf_dup_ipv4.c\nindex 39895b9ddeb9..bf8b78492fc8 100644\n--- a/net/ipv4/netfilter/nf_dup_ipv4.c\n+++ b/net/ipv4/netfilter/nf_dup_ipv4.c\n@@ -71,6 +71,9 @@ void nf_dup_ipv4(struct net *net, struct sk_buff *skb, unsigned int hooknum,\n \tnf_reset(skb);\n \tnf_ct_set(skb, NULL, IP_CT_UNTRACKED);\n #endif\n+\t/* Avoid leaking noref sk outside the input path */\n+\tskb_clear_noref_sk(skb);\n+\n \t/*\n \t * If we are in PREROUTING/INPUT, decrease the TTL to mitigate potential\n \t * loops between two hosts.\ndiff --git a/net/ipv6/ip6_input.c b/net/ipv6/ip6_input.c\nindex 9ee208a348f5..9aa6baffd4b9 100644\n--- a/net/ipv6/ip6_input.c\n+++ b/net/ipv6/ip6_input.c\n@@ -65,9 +65,14 @@ int ip6_rcv_finish(struct net *net, struct sock *sk, struct sk_buff *skb)\n \t\tif (ipprot && (edemux = READ_ONCE(ipprot->early_demux)))\n \t\t\tedemux(skb);\n \t}\n-\tif (!skb_valid_dst(skb))\n+\tif (!skb_valid_dst(skb)) {\n \t\tip6_route_input(skb);\n \n+\t\t/* see comment on ipv4 edmux */\n+\t\tif (skb_dst(skb)->input != ip6_input)\n+\t\t\tskb_clear_noref_sk(skb);\n+\t}\n+\n \treturn dst_input(skb);\n }\n \ndiff --git a/net/ipv6/netfilter/nf_dup_ipv6.c b/net/ipv6/netfilter/nf_dup_ipv6.c\nindex 4a7ddeddbaab..939f6a2238f9 100644\n--- a/net/ipv6/netfilter/nf_dup_ipv6.c\n+++ b/net/ipv6/netfilter/nf_dup_ipv6.c\n@@ -60,6 +60,9 @@ void nf_dup_ipv6(struct net *net, struct sk_buff *skb, unsigned int hooknum,\n \tnf_reset(skb);\n \tnf_ct_set(skb, NULL, IP_CT_UNTRACKED);\n #endif\n+\t/* Avoid leaking noref sk outside the input path */\n+\tskb_clear_noref_sk(skb);\n+\n \tif (hooknum == NF_INET_PRE_ROUTING ||\n \t    hooknum == NF_INET_LOCAL_IN) {\n \t\tstruct ipv6hdr *iph = ipv6_hdr(skb);\ndiff --git a/net/netfilter/nf_queue.c b/net/netfilter/nf_queue.c\nindex f7e21953b1de..100eff08cb51 
100644\n--- a/net/netfilter/nf_queue.c\n+++ b/net/netfilter/nf_queue.c\n@@ -145,6 +145,9 @@ static int __nf_queue(struct sk_buff *skb, const struct nf_hook_state *state,\n \t\t.size\t= sizeof(*entry) + afinfo->route_key_size,\n \t};\n \n+\t/* Avoid leaking noref sk outside the input path */\n+\tskb_clear_noref_sk(skb);\n+\n \tnf_queue_entry_get_refs(entry);\n \tskb_dst_force(skb);\n \tafinfo->saveroute(skb, entry);\n","prefixes":["net-next","2/5"]}