{"id":818814,"url":"http://patchwork.ozlabs.org/api/patches/818814/?format=json","web_url":"http://patchwork.ozlabs.org/project/netdev/patch/e0d6b84bd95f975183dd96d34909867bf2617d19.1506114055.git.pabeni@redhat.com/","project":{"id":7,"url":"http://patchwork.ozlabs.org/api/projects/7/?format=json","name":"Linux network development","link_name":"netdev","list_id":"netdev.vger.kernel.org","list_email":"netdev@vger.kernel.org","web_url":null,"scm_url":null,"webscm_url":null,"list_archive_url":"","list_archive_url_format":"","commit_url_format":""},"msgid":"<e0d6b84bd95f975183dd96d34909867bf2617d19.1506114055.git.pabeni@redhat.com>","list_archive_url":null,"date":"2017-09-26T20:18:52","name":"[RFC,10/11] IP: early demux can return an error code","commit_ref":null,"pull_url":null,"state":"rfc","archived":true,"hash":"bb09d8caeda5db131f0eb062000e6e13b2f4c6a0","submitter":{"id":67312,"url":"http://patchwork.ozlabs.org/api/people/67312/?format=json","name":"Paolo Abeni","email":"pabeni@redhat.com"},"delegate":{"id":34,"url":"http://patchwork.ozlabs.org/api/users/34/?format=json","username":"davem","first_name":"David","last_name":"Miller","email":"davem@davemloft.net"},"mbox":"http://patchwork.ozlabs.org/project/netdev/patch/e0d6b84bd95f975183dd96d34909867bf2617d19.1506114055.git.pabeni@redhat.com/mbox/","series":[{"id":4709,"url":"http://patchwork.ozlabs.org/api/series/4709/?format=json","web_url":"http://patchwork.ozlabs.org/project/netdev/list/?series=4709","date":"2017-09-22T21:06:24","name":"udp: full early demux for unconnected 
sockets","version":1,"mbox":"http://patchwork.ozlabs.org/series/4709/mbox/"}],"comments":"http://patchwork.ozlabs.org/api/patches/818814/comments/","check":"pending","checks":"http://patchwork.ozlabs.org/api/patches/818814/checks/","tags":{},"related":[],"headers":{"Return-Path":"<netdev-owner@vger.kernel.org>","X-Original-To":"patchwork-incoming@ozlabs.org","Delivered-To":"patchwork-incoming@ozlabs.org","Authentication-Results":["ozlabs.org;\n\tspf=none (mailfrom) smtp.mailfrom=vger.kernel.org\n\t(client-ip=209.132.180.67; helo=vger.kernel.org;\n\tenvelope-from=netdev-owner@vger.kernel.org;\n\treceiver=<UNKNOWN>)","ext-mx09.extmail.prod.ext.phx2.redhat.com;\n\tdmarc=none (p=none dis=none) header.from=redhat.com","ext-mx09.extmail.prod.ext.phx2.redhat.com;\n\tspf=fail smtp.mailfrom=pabeni@redhat.com"],"Received":["from vger.kernel.org (vger.kernel.org [209.132.180.67])\n\tby ozlabs.org (Postfix) with ESMTP id 3y1sjq2z41z9t4X\n\tfor <patchwork-incoming@ozlabs.org>;\n\tWed, 27 Sep 2017 06:19:11 +1000 (AEST)","(majordomo@vger.kernel.org) by vger.kernel.org via listexpand\n\tid S967016AbdIZUTI (ORCPT <rfc822;patchwork-incoming@ozlabs.org>);\n\tTue, 26 Sep 2017 16:19:08 -0400","from mx1.redhat.com ([209.132.183.28]:6184 \"EHLO mx1.redhat.com\"\n\trhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP\n\tid S935846AbdIZUTH (ORCPT <rfc822;netdev@vger.kernel.org>);\n\tTue, 26 Sep 2017 16:19:07 -0400","from smtp.corp.redhat.com\n\t(int-mx03.intmail.prod.int.phx2.redhat.com [10.5.11.13])\n\t(using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))\n\t(No client certificate requested)\n\tby mx1.redhat.com (Postfix) with ESMTPS id BA9B812988B;\n\tTue, 26 Sep 2017 20:19:06 +0000 (UTC)","from dhcppc1.redhat.com (ovpn-116-44.ams2.redhat.com\n\t[10.36.116.44])\n\tby smtp.corp.redhat.com (Postfix) with ESMTP id B99F017DD4;\n\tTue, 26 Sep 2017 20:19:04 +0000 (UTC)"],"DMARC-Filter":"OpenDMARC Filter v1.3.2 mx1.redhat.com BA9B812988B","From":"Paolo Abeni 
<pabeni@redhat.com>","To":"netdev@vger.kernel.org","Cc":"\"David S. Miller\" <davem@davemloft.net>,\n\tPablo Neira Ayuso <pablo@netfilter.org>, Florian Westphal <fw@strlen.de>,\n\tEric Dumazet <edumazet@google.com>,\n\tHannes Frederic Sowa <hannes@stressinduktion.org>","Subject":"[RFC PATCH 10/11] IP: early demux can return an error code","Date":"Tue, 26 Sep 2017 22:18:52 +0200","Message-Id":"<e0d6b84bd95f975183dd96d34909867bf2617d19.1506114055.git.pabeni@redhat.com>","In-Reply-To":"<cover.1506114055.git.pabeni@redhat.com>","References":"<cover.1506114055.git.pabeni@redhat.com>","X-Scanned-By":"MIMEDefang 2.79 on 10.5.11.13","X-Greylist":"Sender IP whitelisted, not delayed by milter-greylist-4.5.16\n\t(mx1.redhat.com [10.5.110.38]);\n\tTue, 26 Sep 2017 20:19:06 +0000 (UTC)","Sender":"netdev-owner@vger.kernel.org","Precedence":"bulk","List-ID":"<netdev.vger.kernel.org>","X-Mailing-List":"netdev@vger.kernel.org"},"content":"It will be used by a later patch to cope with unconnected sockets.\nSince early demux can do a route lookup, and an IPv4 route\nlookup can return an error code, this is consistent with the\ncurrent IPv4 route infrastructure.\n\nSigned-off-by: Paolo Abeni <pabeni@redhat.com>\n---\nThis patch and the next one did not land on the ML previously\ndue to PEBKAC, appending now to give the complete picture of\nthis RFC series.\nSide note: currently the early demux lookup for mcast sockets\ndoes not perform source address validation and we need (also)\nsomething like this commit to fix the issue without causing\nlarge performance regressions.\n---\n include/net/protocol.h |  4 ++--\n include/net/tcp.h      |  2 +-\n include/net/udp.h      |  2 +-\n net/ipv4/ip_input.c    | 25 +++++++++++++++----------\n net/ipv4/tcp_ipv4.c    |  9 +++++----\n net/ipv4/udp.c         | 11 ++++++-----\n 6 files changed, 30 insertions(+), 23 deletions(-)","diff":"diff --git a/include/net/protocol.h b/include/net/protocol.h\nindex 65ba335b0e7e..4fc75f7ae23b 100644\n--- 
a/include/net/protocol.h\n+++ b/include/net/protocol.h\n@@ -39,8 +39,8 @@\n \n /* This is used to register protocols. */\n struct net_protocol {\n-\tvoid\t\t\t(*early_demux)(struct sk_buff *skb);\n-\tvoid                    (*early_demux_handler)(struct sk_buff *skb);\n+\tint\t\t\t(*early_demux)(struct sk_buff *skb);\n+\tint\t\t\t(*early_demux_handler)(struct sk_buff *skb);\n \tint\t\t\t(*handler)(struct sk_buff *skb);\n \tvoid\t\t\t(*err_handler)(struct sk_buff *skb, u32 info);\n \tunsigned int\t\tno_policy:1,\ndiff --git a/include/net/tcp.h b/include/net/tcp.h\nindex 49a8a46466f3..cf0bb918c52d 100644\n--- a/include/net/tcp.h\n+++ b/include/net/tcp.h\n@@ -345,7 +345,7 @@ void tcp_v4_err(struct sk_buff *skb, u32);\n \n void tcp_shutdown(struct sock *sk, int how);\n \n-void tcp_v4_early_demux(struct sk_buff *skb);\n+int tcp_v4_early_demux(struct sk_buff *skb);\n int tcp_v4_rcv(struct sk_buff *skb);\n \n int tcp_v4_tw_remember_stamp(struct inet_timewait_sock *tw);\ndiff --git a/include/net/udp.h b/include/net/udp.h\nindex 12dfbfe2e2d7..6c759c8594e2 100644\n--- a/include/net/udp.h\n+++ b/include/net/udp.h\n@@ -259,7 +259,7 @@ static inline struct sk_buff *skb_recv_udp(struct sock *sk, unsigned int flags,\n \treturn __skb_recv_udp(sk, flags, noblock, &peeked, &off, err);\n }\n \n-void udp_v4_early_demux(struct sk_buff *skb);\n+int udp_v4_early_demux(struct sk_buff *skb);\n bool udp_sk_rx_dst_set(struct sock *sk, struct dst_entry *dst);\n int udp_get_port(struct sock *sk, unsigned short snum,\n \t\t int (*saddr_cmp)(const struct sock *,\ndiff --git a/net/ipv4/ip_input.c b/net/ipv4/ip_input.c\nindex 5690ef09da28..f172be87674f 100644\n--- a/net/ipv4/ip_input.c\n+++ b/net/ipv4/ip_input.c\n@@ -311,9 +311,10 @@ static inline bool ip_rcv_options(struct sk_buff *skb)\n static int ip_rcv_finish(struct net *net, struct sock *sk, struct sk_buff *skb)\n {\n \tconst struct iphdr *iph = ip_hdr(skb);\n-\tstruct rtable *rt;\n+\tint (*edemux)(struct sk_buff *skb);\n \tstruct net_device 
*dev = skb->dev;\n-\tvoid (*edemux)(struct sk_buff *skb);\n+\tstruct rtable *rt;\n+\tint err;\n \n \t/* if ingress device is enslaved to an L3 master device pass the\n \t * skb to its handler for processing\n@@ -331,7 +332,9 @@ static int ip_rcv_finish(struct net *net, struct sock *sk, struct sk_buff *skb)\n \n \t\tipprot = rcu_dereference(inet_protos[protocol]);\n \t\tif (ipprot && (edemux = READ_ONCE(ipprot->early_demux))) {\n-\t\t\tedemux(skb);\n+\t\t\terr = edemux(skb);\n+\t\t\tif (unlikely(err))\n+\t\t\t\tgoto drop_error;\n \t\t\t/* must reload iph, skb->head might have changed */\n \t\t\tiph = ip_hdr(skb);\n \t\t}\n@@ -342,13 +345,10 @@ static int ip_rcv_finish(struct net *net, struct sock *sk, struct sk_buff *skb)\n \t *\thow the packet travels inside Linux networking.\n \t */\n \tif (!skb_valid_dst(skb)) {\n-\t\tint err = ip_route_input_noref(skb, iph->daddr, iph->saddr,\n-\t\t\t\t\t       iph->tos, dev);\n-\t\tif (unlikely(err)) {\n-\t\t\tif (err == -EXDEV)\n-\t\t\t\t__NET_INC_STATS(net, LINUX_MIB_IPRPFILTER);\n-\t\t\tgoto drop;\n-\t\t}\n+\t\terr = ip_route_input_noref(skb, iph->daddr, iph->saddr,\n+\t\t\t\t\t   iph->tos, dev);\n+\t\tif (unlikely(err))\n+\t\t\tgoto drop_error;\n \t}\n \n \t/* Since the sk has no reference to the socket, we must\n@@ -407,6 +407,11 @@ static int ip_rcv_finish(struct net *net, struct sock *sk, struct sk_buff *skb)\n drop:\n \tkfree_skb(skb);\n \treturn NET_RX_DROP;\n+\n+drop_error:\n+\tif (err == -EXDEV)\n+\t\t__NET_INC_STATS(net, LINUX_MIB_IPRPFILTER);\n+\tgoto drop;\n }\n \n /*\ndiff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c\nindex d9416b5162bc..85164d4d3e53 100644\n--- a/net/ipv4/tcp_ipv4.c\n+++ b/net/ipv4/tcp_ipv4.c\n@@ -1503,23 +1503,23 @@ int tcp_v4_do_rcv(struct sock *sk, struct sk_buff *skb)\n }\n EXPORT_SYMBOL(tcp_v4_do_rcv);\n \n-void tcp_v4_early_demux(struct sk_buff *skb)\n+int tcp_v4_early_demux(struct sk_buff *skb)\n {\n \tconst struct iphdr *iph;\n \tconst struct tcphdr *th;\n \tstruct sock *sk;\n \n 
\tif (skb->pkt_type != PACKET_HOST)\n-\t\treturn;\n+\t\treturn 0;\n \n \tif (!pskb_may_pull(skb, skb_transport_offset(skb) + sizeof(struct tcphdr)))\n-\t\treturn;\n+\t\treturn 0;\n \n \tiph = ip_hdr(skb);\n \tth = tcp_hdr(skb);\n \n \tif (th->doff < sizeof(struct tcphdr) / 4)\n-\t\treturn;\n+\t\treturn 0;\n \n \tsk = __inet_lookup_established(dev_net(skb->dev), &tcp_hashinfo,\n \t\t\t\t       iph->saddr, th->source,\n@@ -1538,6 +1538,7 @@ void tcp_v4_early_demux(struct sk_buff *skb)\n \t\t\t\tskb_dst_set_noref(skb, dst);\n \t\t}\n \t}\n+\treturn 0;\n }\n \n bool tcp_add_backlog(struct sock *sk, struct sk_buff *skb)\ndiff --git a/net/ipv4/udp.c b/net/ipv4/udp.c\nindex 5cbbd78024dc..b7202a15f360 100644\n--- a/net/ipv4/udp.c\n+++ b/net/ipv4/udp.c\n@@ -2215,7 +2215,7 @@ void udp_set_skb_rx_dst(struct sock *sk, struct sk_buff *skb, u32 cookie)\n }\n EXPORT_SYMBOL_GPL(udp_set_skb_rx_dst);\n \n-void udp_v4_early_demux(struct sk_buff *skb)\n+int udp_v4_early_demux(struct sk_buff *skb)\n {\n \tstruct net *net = dev_net(skb->dev);\n \tint dif = skb->dev->ifindex;\n@@ -2227,7 +2227,7 @@ void udp_v4_early_demux(struct sk_buff *skb)\n \n \t/* validate the packet */\n \tif (!pskb_may_pull(skb, skb_transport_offset(skb) + sizeof(struct udphdr)))\n-\t\treturn;\n+\t\treturn 0;\n \n \tiph = ip_hdr(skb);\n \tuh = udp_hdr(skb);\n@@ -2237,14 +2237,14 @@ void udp_v4_early_demux(struct sk_buff *skb)\n \t\tstruct in_device *in_dev = __in_dev_get_rcu(skb->dev);\n \n \t\tif (!in_dev)\n-\t\t\treturn;\n+\t\t\treturn 0;\n \n \t\t/* we are supposed to accept bcast packets */\n \t\tif (skb->pkt_type == PACKET_MULTICAST) {\n \t\t\tours = ip_check_mc_rcu(in_dev, iph->daddr, iph->saddr,\n \t\t\t\t\t       iph->protocol);\n \t\t\tif (!ours)\n-\t\t\t\treturn;\n+\t\t\t\treturn 0;\n \t\t}\n \n \t\tsk = __udp4_lib_mcast_demux_lookup(net, uh->dest, iph->daddr,\n@@ -2256,11 +2256,12 @@ void udp_v4_early_demux(struct sk_buff *skb)\n \t}\n \n \tif (!sk)\n-\t\treturn;\n+\t\treturn 0;\n \n 
\tskb_set_noref_sk(skb, sk);\n \tif (udp_use_rx_dst_cache(sk, skb))\n \t\tudp_set_skb_rx_dst(sk, skb, 0);\n+\treturn 0;\n }\n \n int udp_rcv(struct sk_buff *skb)\n","prefixes":["RFC","10/11"]}