From patchwork Mon Oct 1 08:43:11 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mike Manning X-Patchwork-Id: 977044 X-Patchwork-Delegate: davem@davemloft.net Return-Path: X-Original-To: patchwork-incoming-netdev@ozlabs.org Delivered-To: patchwork-incoming-netdev@ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=netdev-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=none (p=none dis=none) header.from=vyatta.att-mail.com Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 42NwmW3nMSz9s3x for ; Mon, 1 Oct 2018 18:43:39 +1000 (AEST) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728902AbeJAPUQ (ORCPT ); Mon, 1 Oct 2018 11:20:16 -0400 Received: from mx0b-00191d01.pphosted.com ([67.231.157.136]:43082 "EHLO mx0a-00191d01.pphosted.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1728804AbeJAPUP (ORCPT ); Mon, 1 Oct 2018 11:20:15 -0400 Received: from pps.filterd (m0083689.ppops.net [127.0.0.1]) by m0083689.ppops.net-00191d01. (8.16.0.22/8.16.0.22) with SMTP id w918ZBdL036605 for ; Mon, 1 Oct 2018 04:43:36 -0400 Received: from tlpd255.enaf.dadc.sbc.com (sbcsmtp3.sbc.com [144.160.112.28]) by m0083689.ppops.net-00191d01. with ESMTP id 2mu5smaccs-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT) for ; Mon, 01 Oct 2018 04:43:36 -0400 Received: from enaf.dadc.sbc.com (localhost [127.0.0.1]) by tlpd255.enaf.dadc.sbc.com (8.14.5/8.14.5) with ESMTP id w918hZlf088902 for ; Mon, 1 Oct 2018 03:43:35 -0500 Received: from zlp30496.vci.att.com (zlp30496.vci.att.com [135.46.181.157]) by tlpd255.enaf.dadc.sbc.com (8.14.5/8.14.5) with ESMTP id w918hVxM088811 for ; Mon, 1 Oct 2018 03:43:31 -0500 Received: from zlp30496.vci.att.com (zlp30496.vci.att.com [127.0.0.1]) by zlp30496.vci.att.com (Service) with ESMTP id 6730E40002AC for ; Mon, 1 Oct 2018 08:43:31 +0000 (GMT) Received: from clpi183.sldc.sbc.com (unknown [135.41.1.46]) by zlp30496.vci.att.com (Service) with ESMTP id 3E73E40002A5 for ; Mon, 1 Oct 2018 08:43:31 +0000 (GMT) Received: from sldc.sbc.com (localhost [127.0.0.1]) by clpi183.sldc.sbc.com (8.14.5/8.14.5) with ESMTP id w918hU9B012098 for ; Mon, 1 Oct 2018 03:43:31 -0500 Received: from mail.eng.vyatta.net (mail.eng.vyatta.net [10.156.50.82]) by clpi183.sldc.sbc.com (8.14.5/8.14.5) with ESMTP id w918hTmo012010 for ; Mon, 1 Oct 2018 03:43:29 -0500 Received: from MM-7520.vyatta.net (unknown [10.156.47.144]) by mail.eng.vyatta.net (Postfix) with ESMTPA id 59A7536013D; Mon, 1 Oct 2018 01:43:28 -0700 (PDT) From: Mike Manning To: netdev@vger.kernel.org Cc: Robert Shearman Subject: [PATCH net-next v2 01/10] net: allow binding socket in a VRF when there's an unbound socket Date: Mon, 1 Oct 2018 09:43:11 +0100 Message-Id: <20181001084320.32453-2-mmanning@vyatta.att-mail.com> X-Mailer: git-send-email 2.11.0 In-Reply-To: <20181001084320.32453-1-mmanning@vyatta.att-mail.com> References: <20181001084320.32453-1-mmanning@vyatta.att-mail.com> X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10434:, , definitions=2018-10-01_04:, , signatures=0 X-Proofpoint-Spam-Details: rule=outbound_policy_notspam policy=outbound_policy score=0 priorityscore=1501 malwarescore=0 suspectscore=3 phishscore=0 bulkscore=0 spamscore=0 clxscore=1015 lowpriorityscore=0 mlxscore=0 impostorscore=0 mlxlogscore=999 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1807170000 definitions=main-1810010088 Sender: netdev-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org From: Robert Shearman There is no easy way currently for applications that want to receive packets in the default VRF to be isolated from packets arriving in VRFs, which makes using VRF-unaware applications in a VRF-aware system a potential security risk. So change the inet socket lookup to avoid packets arriving on a device enslaved to an l3mdev from matching unbound sockets by removing the wildcard for non sk_bound_dev_if and instead relying on check against the secondary device index, which will be 0 when the input device is not enslaved to an l3mdev and so match against an unbound socket and not match when the input device is enslaved. Change the socket binding to take the l3mdev into account to allow an unbound socket to not conflict sockets bound to an l3mdev given the datapath isolation now guaranteed. Signed-off-by: Robert Shearman Signed-off-by: Mike Manning --- Documentation/networking/vrf.txt | 9 +++++---- include/net/inet6_hashtables.h | 5 ++--- include/net/inet_hashtables.h | 13 ++++++------- include/net/inet_sock.h | 13 +++++++++++++ net/ipv4/inet_connection_sock.c | 13 ++++++++++--- net/ipv4/inet_hashtables.c | 20 +++++++++++++++----- 6 files changed, 51 insertions(+), 22 deletions(-) diff --git a/Documentation/networking/vrf.txt b/Documentation/networking/vrf.txt index 8ff7b4c8f91b..d4b129402d57 100644 --- a/Documentation/networking/vrf.txt +++ b/Documentation/networking/vrf.txt @@ -103,6 +103,11 @@ VRF device: or to specify the output device using cmsg and IP_PKTINFO. +By default the scope of the port bindings for unbound sockets is +limited to the default VRF. That is, it will not be matched by packets +arriving on interfaces enslaved to an l3mdev and processes may bind to +the same port if they bind to an l3mdev. + TCP & UDP services running in the default VRF context (ie., not bound to any VRF device) can work across all VRF domains by enabling the tcp_l3mdev_accept and udp_l3mdev_accept sysctl options: @@ -112,10 +117,6 @@ tcp_l3mdev_accept and udp_l3mdev_accept sysctl options: netfilter rules on the VRF device can be used to limit access to services running in the default VRF context as well. -The default VRF does not have limited scope with respect to port bindings. -That is, if a process does a wildcard bind to a port in the default VRF it -owns the port across all VRF domains within the network namespace. - ################################################################################ Using iproute2 for VRFs diff --git a/include/net/inet6_hashtables.h b/include/net/inet6_hashtables.h index 6e91e38a31da..9db98af46985 100644 --- a/include/net/inet6_hashtables.h +++ b/include/net/inet6_hashtables.h @@ -115,9 +115,8 @@ int inet6_hash(struct sock *sk); ((__sk)->sk_family == AF_INET6) && \ ipv6_addr_equal(&(__sk)->sk_v6_daddr, (__saddr)) && \ ipv6_addr_equal(&(__sk)->sk_v6_rcv_saddr, (__daddr)) && \ - (!(__sk)->sk_bound_dev_if || \ - ((__sk)->sk_bound_dev_if == (__dif)) || \ - ((__sk)->sk_bound_dev_if == (__sdif))) && \ + (((__sk)->sk_bound_dev_if == (__dif)) || \ + ((__sk)->sk_bound_dev_if == (__sdif))) && \ net_eq(sock_net(__sk), (__net))) #endif /* _INET6_HASHTABLES_H */ diff --git a/include/net/inet_hashtables.h b/include/net/inet_hashtables.h index 9141e95529e7..4ae060b4bac2 100644 --- a/include/net/inet_hashtables.h +++ b/include/net/inet_hashtables.h @@ -79,6 +79,7 @@ struct inet_ehash_bucket { struct inet_bind_bucket { possible_net_t ib_net; + int l3mdev; unsigned short port; signed char fastreuse; signed char fastreuseport; @@ -191,7 +192,7 @@ static inline void inet_ehash_locks_free(struct inet_hashinfo *hashinfo) struct inet_bind_bucket * inet_bind_bucket_create(struct kmem_cache *cachep, struct net *net, struct inet_bind_hashbucket *head, - const unsigned short snum); + const unsigned short snum, int l3mdev); void inet_bind_bucket_destroy(struct kmem_cache *cachep, struct inet_bind_bucket *tb); @@ -282,9 +283,8 @@ static inline struct sock *inet_lookup_listener(struct net *net, #define INET_MATCH(__sk, __net, __cookie, __saddr, __daddr, __ports, __dif, __sdif) \ (((__sk)->sk_portpair == (__ports)) && \ ((__sk)->sk_addrpair == (__cookie)) && \ - (!(__sk)->sk_bound_dev_if || \ - ((__sk)->sk_bound_dev_if == (__dif)) || \ - ((__sk)->sk_bound_dev_if == (__sdif))) && \ + (((__sk)->sk_bound_dev_if == (__dif)) || \ + ((__sk)->sk_bound_dev_if == (__sdif))) && \ net_eq(sock_net(__sk), (__net))) #else /* 32-bit arch */ #define INET_ADDR_COOKIE(__name, __saddr, __daddr) \ @@ -294,9 +294,8 @@ static inline struct sock *inet_lookup_listener(struct net *net, (((__sk)->sk_portpair == (__ports)) && \ ((__sk)->sk_daddr == (__saddr)) && \ ((__sk)->sk_rcv_saddr == (__daddr)) && \ - (!(__sk)->sk_bound_dev_if || \ - ((__sk)->sk_bound_dev_if == (__dif)) || \ - ((__sk)->sk_bound_dev_if == (__sdif))) && \ + (((__sk)->sk_bound_dev_if == (__dif)) || \ + ((__sk)->sk_bound_dev_if == (__sdif))) && \ net_eq(sock_net(__sk), (__net))) #endif /* 64-bit arch */ diff --git a/include/net/inet_sock.h b/include/net/inet_sock.h index e03b93360f33..92e0aa3958f6 100644 --- a/include/net/inet_sock.h +++ b/include/net/inet_sock.h @@ -130,6 +130,19 @@ static inline int inet_request_bound_dev_if(const struct sock *sk, return sk->sk_bound_dev_if; } +static inline int inet_sk_bound_l3mdev(const struct sock *sk) +{ +#ifdef CONFIG_NET_L3_MASTER_DEV + struct net *net = sock_net(sk); + + if (!net->ipv4.sysctl_tcp_l3mdev_accept) + return l3mdev_master_ifindex_by_index(net, + sk->sk_bound_dev_if); +#endif + + return 0; +} + static inline struct ip_options_rcu *ireq_opt_deref(const struct inet_request_sock *ireq) { return rcu_dereference_check(ireq->ireq_opt, diff --git a/net/ipv4/inet_connection_sock.c b/net/ipv4/inet_connection_sock.c index dfd5009f96ef..97bba5b3d69f 100644 --- a/net/ipv4/inet_connection_sock.c +++ b/net/ipv4/inet_connection_sock.c @@ -183,7 +183,9 @@ inet_csk_find_open_port(struct sock *sk, struct inet_bind_bucket **tb_ret, int * int i, low, high, attempt_half; struct inet_bind_bucket *tb; u32 remaining, offset; + int l3mdev; + l3mdev = inet_sk_bound_l3mdev(sk); attempt_half = (sk->sk_reuse == SK_CAN_REUSE) ? 1 : 0; other_half_scan: inet_get_local_port_range(net, &low, &high); @@ -219,7 +221,8 @@ inet_csk_find_open_port(struct sock *sk, struct inet_bind_bucket **tb_ret, int * hinfo->bhash_size)]; spin_lock_bh(&head->lock); inet_bind_bucket_for_each(tb, &head->chain) - if (net_eq(ib_net(tb), net) && tb->port == port) { + if (net_eq(ib_net(tb), net) && tb->l3mdev == l3mdev && + tb->port == port) { if (!inet_csk_bind_conflict(sk, tb, false, false)) goto success; goto next_port; @@ -293,6 +296,9 @@ int inet_csk_get_port(struct sock *sk, unsigned short snum) struct net *net = sock_net(sk); struct inet_bind_bucket *tb = NULL; kuid_t uid = sock_i_uid(sk); + int l3mdev; + + l3mdev = inet_sk_bound_l3mdev(sk); if (!port) { head = inet_csk_find_open_port(sk, &tb, &port); @@ -306,11 +312,12 @@ int inet_csk_get_port(struct sock *sk, unsigned short snum) hinfo->bhash_size)]; spin_lock_bh(&head->lock); inet_bind_bucket_for_each(tb, &head->chain) - if (net_eq(ib_net(tb), net) && tb->port == port) + if (net_eq(ib_net(tb), net) && tb->l3mdev == l3mdev && + tb->port == port) goto tb_found; tb_not_found: tb = inet_bind_bucket_create(hinfo->bind_bucket_cachep, - net, head, port); + net, head, port, l3mdev); if (!tb) goto fail_unlock; tb_found: diff --git a/net/ipv4/inet_hashtables.c b/net/ipv4/inet_hashtables.c index f5c9ef2586de..260531dc6458 100644 --- a/net/ipv4/inet_hashtables.c +++ b/net/ipv4/inet_hashtables.c @@ -65,12 +65,14 @@ static u32 sk_ehashfn(const struct sock *sk) struct inet_bind_bucket *inet_bind_bucket_create(struct kmem_cache *cachep, struct net *net, struct inet_bind_hashbucket *head, - const unsigned short snum) + const unsigned short snum, + int l3mdev) { struct inet_bind_bucket *tb = kmem_cache_alloc(cachep, GFP_ATOMIC); if (tb) { write_pnet(&tb->ib_net, net); + tb->l3mdev = l3mdev; tb->port = snum; tb->fastreuse = 0; tb->fastreuseport = 0; @@ -135,6 +137,7 @@ int __inet_inherit_port(const struct sock *sk, struct sock *child) table->bhash_size); struct inet_bind_hashbucket *head = &table->bhash[bhash]; struct inet_bind_bucket *tb; + int l3mdev; spin_lock(&head->lock); tb = inet_csk(sk)->icsk_bind_hash; @@ -143,6 +146,8 @@ int __inet_inherit_port(const struct sock *sk, struct sock *child) return -ENOENT; } if (tb->port != port) { + l3mdev = inet_sk_bound_l3mdev(sk); + /* NOTE: using tproxy and redirecting skbs to a proxy * on a different listener port breaks the assumption * that the listener socket's icsk_bind_hash is the same @@ -150,12 +155,13 @@ int __inet_inherit_port(const struct sock *sk, struct sock *child) * create a new bind bucket for the child here. */ inet_bind_bucket_for_each(tb, &head->chain) { if (net_eq(ib_net(tb), sock_net(sk)) && - tb->port == port) + tb->l3mdev == l3mdev && tb->port == port) break; } if (!tb) { tb = inet_bind_bucket_create(table->bind_bucket_cachep, - sock_net(sk), head, port); + sock_net(sk), head, port, + l3mdev); if (!tb) { spin_unlock(&head->lock); return -ENOMEM; @@ -675,6 +681,7 @@ int __inet_hash_connect(struct inet_timewait_death_row *death_row, u32 remaining, offset; int ret, i, low, high; static u32 hint; + int l3mdev; if (port) { head = &hinfo->bhash[inet_bhashfn(net, port, @@ -693,6 +700,8 @@ int __inet_hash_connect(struct inet_timewait_death_row *death_row, return ret; } + l3mdev = inet_sk_bound_l3mdev(sk); + inet_get_local_port_range(net, &low, &high); high++; /* [32768, 60999] -> [32768, 61000[ */ remaining = high - low; @@ -719,7 +728,8 @@ int __inet_hash_connect(struct inet_timewait_death_row *death_row, * the established check is already unique enough. */ inet_bind_bucket_for_each(tb, &head->chain) { - if (net_eq(ib_net(tb), net) && tb->port == port) { + if (net_eq(ib_net(tb), net) && tb->l3mdev == l3mdev && + tb->port == port) { if (tb->fastreuse >= 0 || tb->fastreuseport >= 0) goto next_port; @@ -732,7 +742,7 @@ int __inet_hash_connect(struct inet_timewait_death_row *death_row, } tb = inet_bind_bucket_create(hinfo->bind_bucket_cachep, - net, head, port); + net, head, port, l3mdev); if (!tb) { spin_unlock_bh(&head->lock); return -ENOMEM; From patchwork Mon Oct 1 08:43:12 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mike Manning X-Patchwork-Id: 977047 X-Patchwork-Delegate: davem@davemloft.net Return-Path: X-Original-To: patchwork-incoming-netdev@ozlabs.org Delivered-To: patchwork-incoming-netdev@ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=netdev-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=none (p=none dis=none) header.from=vyatta.att-mail.com Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 42Nwmg4VX0z9s4V for ; Mon, 1 Oct 2018 18:43:47 +1000 (AEST) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729000AbeJAPUV (ORCPT ); Mon, 1 Oct 2018 11:20:21 -0400 Received: from mx0b-00191d01.pphosted.com ([67.231.157.136]:52060 "EHLO mx0a-00191d01.pphosted.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1728945AbeJAPUU (ORCPT ); Mon, 1 Oct 2018 11:20:20 -0400 Received: from pps.filterd (m0049458.ppops.net [127.0.0.1]) by m0049458.ppops.net-00191d01. (8.16.0.22/8.16.0.22) with SMTP id w918ZD0j040907 for ; Mon, 1 Oct 2018 04:43:41 -0400 Received: from alpi155.enaf.aldc.att.com (sbcsmtp7.sbc.com [144.160.229.24]) by m0049458.ppops.net-00191d01. with ESMTP id 2mube5n0u2-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT) for ; Mon, 01 Oct 2018 04:43:40 -0400 Received: from enaf.aldc.att.com (localhost [127.0.0.1]) by alpi155.enaf.aldc.att.com (8.14.5/8.14.5) with ESMTP id w918he2i003119 for ; Mon, 1 Oct 2018 04:43:40 -0400 Received: from zlp27129.vci.att.com (zlp27129.vci.att.com [135.66.87.42]) by alpi155.enaf.aldc.att.com (8.14.5/8.14.5) with ESMTP id w918hYk2003059 for ; Mon, 1 Oct 2018 04:43:34 -0400 Received: from zlp27129.vci.att.com (zlp27129.vci.att.com [127.0.0.1]) by zlp27129.vci.att.com (Service) with ESMTP id C9A8B400042A for ; Mon, 1 Oct 2018 08:43:34 +0000 (GMT) Received: from mlpi432.sfdc.sbc.com (unknown [144.151.223.11]) by zlp27129.vci.att.com (Service) with ESMTP id AAF264000408 for ; Mon, 1 Oct 2018 08:43:34 +0000 (GMT) Received: from sfdc.sbc.com (localhost [127.0.0.1]) by mlpi432.sfdc.sbc.com (8.14.5/8.14.5) with ESMTP id w918hYA2003779 for ; Mon, 1 Oct 2018 04:43:34 -0400 Received: from mail.eng.vyatta.net (mail.eng.vyatta.net [10.156.50.82]) by mlpi432.sfdc.sbc.com (8.14.5/8.14.5) with ESMTP id w918hUOV003680 for ; Mon, 1 Oct 2018 04:43:30 -0400 Received: from MM-7520.vyatta.net (unknown [10.156.47.144]) by mail.eng.vyatta.net (Postfix) with ESMTPA id 6B5E336004A for ; Mon, 1 Oct 2018 01:43:29 -0700 (PDT) From: Mike Manning To: netdev@vger.kernel.org Subject: [PATCH net-next v2 02/10] net: ensure unbound stream socket to be chosen when not in a VRF Date: Mon, 1 Oct 2018 09:43:12 +0100 Message-Id: <20181001084320.32453-3-mmanning@vyatta.att-mail.com> X-Mailer: git-send-email 2.11.0 In-Reply-To: <20181001084320.32453-1-mmanning@vyatta.att-mail.com> References: <20181001084320.32453-1-mmanning@vyatta.att-mail.com> X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10434:, , definitions=2018-10-01_04:, , signatures=0 X-Proofpoint-Spam-Details: rule=outbound_policy_notspam policy=outbound_policy score=0 priorityscore=1501 malwarescore=0 suspectscore=3 phishscore=0 bulkscore=0 spamscore=0 clxscore=1015 lowpriorityscore=0 mlxscore=0 impostorscore=0 mlxlogscore=999 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1807170000 definitions=main-1810010088 Sender: netdev-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org The commit a04a480d4392 ("net: Require exact match for TCP socket lookups if dif is l3mdev") only ensures that the correct socket is selected for packets in a VRF. However, there is no guarantee that the unbound socket will be selected for packets when not in a VRF. By checking for a device match in compute_score() also for the case when there is no bound device and attaching a score to this, the unbound socket is selected. And if a failure is returned when there is no device match, this ensures that bound sockets are never selected, even if there is no unbound socket. Signed-off-by: Mike Manning --- include/net/inet_hashtables.h | 11 +++++++++++ include/net/inet_sock.h | 8 ++++++++ net/ipv4/inet_hashtables.c | 14 ++++++-------- net/ipv6/inet6_hashtables.c | 14 ++++++-------- 4 files changed, 31 insertions(+), 16 deletions(-) diff --git a/include/net/inet_hashtables.h b/include/net/inet_hashtables.h index 4ae060b4bac2..5de2d9f24c05 100644 --- a/include/net/inet_hashtables.h +++ b/include/net/inet_hashtables.h @@ -189,6 +189,17 @@ static inline void inet_ehash_locks_free(struct inet_hashinfo *hashinfo) hashinfo->ehash_locks = NULL; } +static inline bool inet_sk_bound_dev_eq(struct net *net, int bound_dev_if, + int dif, int sdif) +{ +#if IS_ENABLED(CONFIG_NET_L3_MASTER_DEV) + return inet_bound_dev_eq(net->ipv4.sysctl_tcp_l3mdev_accept, + bound_dev_if, dif, sdif); +#else + return inet_bound_dev_eq(1, bound_dev_if, dif, sdif); +#endif +} + struct inet_bind_bucket * inet_bind_bucket_create(struct kmem_cache *cachep, struct net *net, struct inet_bind_hashbucket *head, diff --git a/include/net/inet_sock.h b/include/net/inet_sock.h index 92e0aa3958f6..47c03ea989ad 100644 --- a/include/net/inet_sock.h +++ b/include/net/inet_sock.h @@ -143,6 +143,14 @@ static inline int inet_sk_bound_l3mdev(const struct sock *sk) return 0; } +static inline bool inet_bound_dev_eq(bool l3mdev_accept, int bound_dev_if, + int dif, int sdif) +{ + if (!bound_dev_if) + return !sdif || l3mdev_accept; + return bound_dev_if == dif || bound_dev_if == sdif; +} + static inline struct ip_options_rcu *ireq_opt_deref(const struct inet_request_sock *ireq) { return rcu_dereference_check(ireq->ireq_opt, diff --git a/net/ipv4/inet_hashtables.c b/net/ipv4/inet_hashtables.c index 260531dc6458..2ec684057ebd 100644 --- a/net/ipv4/inet_hashtables.c +++ b/net/ipv4/inet_hashtables.c @@ -235,6 +235,7 @@ static inline int compute_score(struct sock *sk, struct net *net, { int score = -1; struct inet_sock *inet = inet_sk(sk); + bool dev_match; if (net_eq(sock_net(sk), net) && inet->inet_num == hnum && !ipv6_only_sock(sk)) { @@ -245,15 +246,12 @@ static inline int compute_score(struct sock *sk, struct net *net, return -1; score += 4; } - if (sk->sk_bound_dev_if || exact_dif) { - bool dev_match = (sk->sk_bound_dev_if == dif || - sk->sk_bound_dev_if == sdif); + dev_match = inet_sk_bound_dev_eq(net, sk->sk_bound_dev_if, + dif, sdif); + if (!dev_match) + return -1; + score += 4; - if (!dev_match) - return -1; - if (sk->sk_bound_dev_if) - score += 4; - } if (sk->sk_incoming_cpu == raw_smp_processor_id()) score++; } diff --git a/net/ipv6/inet6_hashtables.c b/net/ipv6/inet6_hashtables.c index 3d7c7460a0c5..5eeeba7181a1 100644 --- a/net/ipv6/inet6_hashtables.c +++ b/net/ipv6/inet6_hashtables.c @@ -99,6 +99,7 @@ static inline int compute_score(struct sock *sk, struct net *net, const int dif, const int sdif, bool exact_dif) { int score = -1; + bool dev_match; if (net_eq(sock_net(sk), net) && inet_sk(sk)->inet_num == hnum && sk->sk_family == PF_INET6) { @@ -109,15 +110,12 @@ static inline int compute_score(struct sock *sk, struct net *net, return -1; score++; } - if (sk->sk_bound_dev_if || exact_dif) { - bool dev_match = (sk->sk_bound_dev_if == dif || - sk->sk_bound_dev_if == sdif); + dev_match = inet_sk_bound_dev_eq(net, sk->sk_bound_dev_if, + dif, sdif); + if (!dev_match) + return -1; + score++; - if (!dev_match) - return -1; - if (sk->sk_bound_dev_if) - score++; - } if (sk->sk_incoming_cpu == raw_smp_processor_id()) score++; } From patchwork Mon Oct 1 08:43:13 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mike Manning X-Patchwork-Id: 977046 X-Patchwork-Delegate: davem@davemloft.net Return-Path: X-Original-To: patchwork-incoming-netdev@ozlabs.org Delivered-To: patchwork-incoming-netdev@ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=netdev-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=none (p=none dis=none) header.from=vyatta.att-mail.com Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 42Nwmb1Z5yz9s4V for ; Mon, 1 Oct 2018 18:43:43 +1000 (AEST) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728962AbeJAPUU (ORCPT ); Mon, 1 Oct 2018 11:20:20 -0400 Received: from mx0b-00191d01.pphosted.com ([67.231.157.136]:39604 "EHLO mx0a-00191d01.pphosted.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1728863AbeJAPUS (ORCPT ); Mon, 1 Oct 2018 11:20:18 -0400 Received: from pps.filterd (m0049463.ppops.net [127.0.0.1]) by m0049463.ppops.net-00191d01. (8.16.0.22/8.16.0.22) with SMTP id w918ZEfY017312 for ; Mon, 1 Oct 2018 04:43:39 -0400 Received: from alpi155.enaf.aldc.att.com (sbcsmtp7.sbc.com [144.160.229.24]) by m0049463.ppops.net-00191d01. with ESMTP id 2mua6sx5by-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT) for ; Mon, 01 Oct 2018 04:43:39 -0400 Received: from enaf.aldc.att.com (localhost [127.0.0.1]) by alpi155.enaf.aldc.att.com (8.14.5/8.14.5) with ESMTP id w918hcvc003084 for ; Mon, 1 Oct 2018 04:43:39 -0400 Received: from zlp27126.vci.att.com (zlp27126.vci.att.com [135.66.87.47]) by alpi155.enaf.aldc.att.com (8.14.5/8.14.5) with ESMTP id w918hYYZ003058 for ; Mon, 1 Oct 2018 04:43:34 -0400 Received: from zlp27126.vci.att.com (zlp27126.vci.att.com [127.0.0.1]) by zlp27126.vci.att.com (Service) with ESMTP id B2FB54014782 for ; Mon, 1 Oct 2018 08:43:34 +0000 (GMT) Received: from mlpi432.sfdc.sbc.com (unknown [144.151.223.11]) by zlp27126.vci.att.com (Service) with ESMTP id 9B8BF4000374 for ; Mon, 1 Oct 2018 08:43:34 +0000 (GMT) Received: from sfdc.sbc.com (localhost [127.0.0.1]) by mlpi432.sfdc.sbc.com (8.14.5/8.14.5) with ESMTP id w918hYkB003768 for ; Mon, 1 Oct 2018 04:43:34 -0400 Received: from mail.eng.vyatta.net (mail.eng.vyatta.net [10.156.50.82]) by mlpi432.sfdc.sbc.com (8.14.5/8.14.5) with ESMTP id w918hUdx003735 for ; Mon, 1 Oct 2018 04:43:31 -0400 Received: from MM-7520.vyatta.net (unknown [10.156.47.144]) by mail.eng.vyatta.net (Postfix) with ESMTPA id 53B1836013D for ; Mon, 1 Oct 2018 01:43:30 -0700 (PDT) From: Mike Manning To: netdev@vger.kernel.org Subject: [PATCH net-next v2 03/10] net: ensure unbound datagram socket to be chosen when not in a VRF Date: Mon, 1 Oct 2018 09:43:13 +0100 Message-Id: <20181001084320.32453-4-mmanning@vyatta.att-mail.com> X-Mailer: git-send-email 2.11.0 In-Reply-To: <20181001084320.32453-1-mmanning@vyatta.att-mail.com> References: <20181001084320.32453-1-mmanning@vyatta.att-mail.com> X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10434:, , definitions=2018-10-01_04:, , signatures=0 X-Proofpoint-Spam-Details: rule=outbound_policy_notspam policy=outbound_policy score=0 priorityscore=1501 malwarescore=0 suspectscore=1 phishscore=0 bulkscore=0 spamscore=0 clxscore=1015 lowpriorityscore=0 mlxscore=0 impostorscore=0 mlxlogscore=999 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1807170000 definitions=main-1810010088 Sender: netdev-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org Ensure an unbound datagram skt is chosen when not in a VRF. The check for a device match in compute_score() for UDP must be performed when there is no device match. For this, a failure is returned when there is no device match. This ensures that bound sockets are never selected, even if there is no unbound socket. Allow IPv6 packets to be sent over a datagram skt bound to a VRF. These packets are currently blocked, as flowi6_oif was set to that of the master vrf device, and the ipi6_ifindex is that of the slave device. Allow these packets to be sent by checking the device with ipi6_ifindex has the same L3 scope as that of the bound device of the skt, which is the master vrf device. Note that this check always succeeds if the skt is unbound. Even though the right datagram skt is now selected by compute_score(), a different skt is being returned that is bound to the wrong vrf. The difference between these and stream sockets is the handling of the skt option for SO_REUSEPORT. While the handling when adding a skt for reuse correctly checks that the bound device of the skt is a match, the skts in the hashslot are already incorrect. So for the same hash, a skt for the wrong vrf may be selected for the required port. The root cause is that the skt is immediately placed into a slot when it is created, but when the skt is then bound using SO_BINDTODEVICE, it remains in the same slot. The solution is to move the skt to the correct slot by forcing a rehash. Signed-off-by: Mike Manning --- include/net/udp.h | 11 +++++++++++ net/core/sock.c | 2 ++ net/ipv4/udp.c | 15 ++++++--------- net/ipv6/datagram.c | 5 ++++- net/ipv6/udp.c | 14 +++++--------- 5 files changed, 28 insertions(+), 19 deletions(-) diff --git a/include/net/udp.h b/include/net/udp.h index 8482a990b0bb..1e4fb8feaf50 100644 --- a/include/net/udp.h +++ b/include/net/udp.h @@ -252,6 +252,17 @@ static inline int udp_rqueue_get(struct sock *sk) return sk_rmem_alloc_get(sk) - READ_ONCE(udp_sk(sk)->forward_deficit); } +static inline bool udp_sk_bound_dev_eq(struct net *net, int bound_dev_if, + int dif, int sdif) +{ +#if IS_ENABLED(CONFIG_NET_L3_MASTER_DEV) + return inet_bound_dev_eq(net->ipv4.sysctl_udp_l3mdev_accept, + bound_dev_if, dif, sdif); +#else + return inet_bound_dev_eq(1, bound_dev_if, dif, sdif); +#endif +} + /* net/ipv4/udp.c */ void udp_destruct_sock(struct sock *sk); void skb_consume_udp(struct sock *sk, struct sk_buff *skb, int len); diff --git a/net/core/sock.c b/net/core/sock.c index 3730eb855095..da1cbb88a6bf 100644 --- a/net/core/sock.c +++ b/net/core/sock.c @@ -567,6 +567,8 @@ static int sock_setbindtodevice(struct sock *sk, char __user *optval, lock_sock(sk); sk->sk_bound_dev_if = index; + if (sk->sk_prot->rehash) + sk->sk_prot->rehash(sk); sk_dst_reset(sk); release_sock(sk); diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c index 3386b3b0218c..0559a7f4c83a 100644 --- a/net/ipv4/udp.c +++ b/net/ipv4/udp.c @@ -371,6 +371,7 @@ static int compute_score(struct sock *sk, struct net *net, { int score; struct inet_sock *inet; + bool dev_match; if (!net_eq(sock_net(sk), net) || udp_sk(sk)->udp_port_hash != hnum || @@ -398,15 +399,11 @@ static int compute_score(struct sock *sk, struct net *net, score += 4; } - if (sk->sk_bound_dev_if || exact_dif) { - bool dev_match = (sk->sk_bound_dev_if == dif || - sk->sk_bound_dev_if == sdif); - - if (!dev_match) - return -1; - if (sk->sk_bound_dev_if) - score += 4; - } + dev_match = udp_sk_bound_dev_eq(net, sk->sk_bound_dev_if, + dif, sdif); + if (!dev_match) + return -1; + score += 4; if (sk->sk_incoming_cpu == raw_smp_processor_id()) score++; diff --git a/net/ipv6/datagram.c b/net/ipv6/datagram.c index 1ede7a16a0be..4813293d4fad 100644 --- a/net/ipv6/datagram.c +++ b/net/ipv6/datagram.c @@ -782,7 +782,10 @@ int ip6_datagram_send_ctl(struct net *net, struct sock *sk, if (src_info->ipi6_ifindex) { if (fl6->flowi6_oif && - src_info->ipi6_ifindex != fl6->flowi6_oif) + src_info->ipi6_ifindex != fl6->flowi6_oif && + (sk->sk_bound_dev_if != fl6->flowi6_oif || + !sk_dev_equal_l3scope( + sk, src_info->ipi6_ifindex))) return -EINVAL; fl6->flowi6_oif = src_info->ipi6_ifindex; } diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c index 83f4c77c79d8..6722490c87b9 100644 --- a/net/ipv6/udp.c +++ b/net/ipv6/udp.c @@ -117,6 +117,7 @@ static int compute_score(struct sock *sk, struct net *net, { int score; struct inet_sock *inet; + bool dev_match; if (!net_eq(sock_net(sk), net) || udp_sk(sk)->udp_port_hash != hnum || @@ -144,15 +145,10 @@ static int compute_score(struct sock *sk, struct net *net, score++; } - if (sk->sk_bound_dev_if || exact_dif) { - bool dev_match = (sk->sk_bound_dev_if == dif || - sk->sk_bound_dev_if == sdif); - - if (!dev_match) - return -1; - if (sk->sk_bound_dev_if) - score++; - } + dev_match = udp_sk_bound_dev_eq(net, sk->sk_bound_dev_if, dif, sdif); + if (!dev_match) + return -1; + score++; if (sk->sk_incoming_cpu == raw_smp_processor_id()) score++; From patchwork Mon Oct 1 08:43:14 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mike Manning X-Patchwork-Id: 977049 X-Patchwork-Delegate: davem@davemloft.net Return-Path: X-Original-To: patchwork-incoming-netdev@ozlabs.org Delivered-To: patchwork-incoming-netdev@ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=netdev-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=none (p=none dis=none) header.from=vyatta.att-mail.com Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 42Nwmk2H1bz9s5c for ; Mon, 1 Oct 2018 18:43:50 +1000 (AEST) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729019AbeJAPUY (ORCPT ); Mon, 1 Oct 2018 11:20:24 -0400 Received: from mx0b-00191d01.pphosted.com ([67.231.157.136]:39688 "EHLO mx0a-00191d01.pphosted.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1728937AbeJAPUX (ORCPT ); Mon, 1 Oct 2018 11:20:23 -0400 Received: from pps.filterd (m0049463.ppops.net [127.0.0.1]) by m0049463.ppops.net-00191d01. (8.16.0.22/8.16.0.22) with SMTP id w918ZFYQ017320 for ; Mon, 1 Oct 2018 04:43:43 -0400 Received: from tlpd255.enaf.dadc.sbc.com (sbcsmtp3.sbc.com [144.160.112.28]) by m0049463.ppops.net-00191d01. with ESMTP id 2mua6sx5cu-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT) for ; Mon, 01 Oct 2018 04:43:43 -0400 Received: from enaf.dadc.sbc.com (localhost [127.0.0.1]) by tlpd255.enaf.dadc.sbc.com (8.14.5/8.14.5) with ESMTP id w918hgcC089055 for ; Mon, 1 Oct 2018 03:43:42 -0500 Received: from zlp30494.vci.att.com (zlp30494.vci.att.com [135.46.181.159]) by tlpd255.enaf.dadc.sbc.com (8.14.5/8.14.5) with ESMTP id w918hc8g088940 for ; Mon, 1 Oct 2018 03:43:38 -0500 Received: from zlp30494.vci.att.com (zlp30494.vci.att.com [127.0.0.1]) by zlp30494.vci.att.com (Service) with ESMTP id 25FA94000490 for ; Mon, 1 Oct 2018 08:43:38 +0000 (GMT) Received: from clpi183.sldc.sbc.com (unknown [135.41.1.46]) by zlp30494.vci.att.com (Service) with ESMTP id F3044400048E for ; Mon, 1 Oct 2018 08:43:37 +0000 (GMT) Received: from sldc.sbc.com (localhost [127.0.0.1]) by clpi183.sldc.sbc.com (8.14.5/8.14.5) with ESMTP id w918hbd1012482 for ; Mon, 1 Oct 2018 03:43:37 -0500 Received: from mail.eng.vyatta.net (mail.eng.vyatta.net [10.156.50.82]) by clpi183.sldc.sbc.com (8.14.5/8.14.5) with ESMTP id w918hVC4012166 for ; Mon, 1 Oct 2018 03:43:32 -0500 Received: from MM-7520.vyatta.net (unknown [10.156.47.144]) by mail.eng.vyatta.net (Postfix) with ESMTPA id 384C636004A for ; Mon, 1 Oct 2018 01:43:31 -0700 (PDT) From: Mike Manning To: netdev@vger.kernel.org Subject: [PATCH net-next v2 04/10] net: provide a sysctl raw_l3mdev_accept for raw socket lookup with VRFs Date: Mon, 1 Oct 2018 09:43:14 +0100 Message-Id: <20181001084320.32453-5-mmanning@vyatta.att-mail.com> X-Mailer: git-send-email 2.11.0 In-Reply-To: <20181001084320.32453-1-mmanning@vyatta.att-mail.com> References: <20181001084320.32453-1-mmanning@vyatta.att-mail.com> X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10434:, , definitions=2018-10-01_04:, , signatures=0 X-Proofpoint-Spam-Details: rule=outbound_policy_notspam policy=outbound_policy score=0 priorityscore=1501 malwarescore=0 suspectscore=1 phishscore=0 bulkscore=0 spamscore=0 clxscore=1015 lowpriorityscore=0 mlxscore=0 impostorscore=0 mlxlogscore=933 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1807170000 definitions=main-1810010088 Sender: netdev-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org Add a sysctl raw_l3mdev_accept to control raw socket lookup in a manner similar to use of tcp_l3mdev_accept for stream and of udp_l3mdev_accept for datagram sockets. Have this default to off as this is what users expect, given that there is no explicit mechanism to set unmodified VRF-unaware application into a default VRF. Signed-off-by: Mike Manning --- Documentation/networking/ip-sysctl.txt | 9 +++++++++ Documentation/networking/vrf.txt | 8 +++++--- include/net/netns/ipv4.h | 3 +++ net/ipv4/sysctl_net_ipv4.c | 11 +++++++++++ 4 files changed, 28 insertions(+), 3 deletions(-) diff --git a/Documentation/networking/ip-sysctl.txt b/Documentation/networking/ip-sysctl.txt index 8313a636dd53..a46be4a5b7a0 100644 --- a/Documentation/networking/ip-sysctl.txt +++ b/Documentation/networking/ip-sysctl.txt @@ -788,6 +788,15 @@ udp_wmem_min - INTEGER total pages of UDP sockets exceed udp_mem pressure. The unit is byte. Default: 4K +RAW variables: + +raw_l3mdev_accept - BOOLEAN + Enabling this option allows a "global" bound socket to work + across L3 master domains (e.g., VRFs) with packets capable of + being received regardless of the L3 domain in which they + originated. Only valid when the kernel was compiled with + CONFIG_NET_L3_MASTER_DEV. + CIPSOv4 Variables: cipso_cache_enable - BOOLEAN diff --git a/Documentation/networking/vrf.txt b/Documentation/networking/vrf.txt index d4b129402d57..deb798342f1e 100644 --- a/Documentation/networking/vrf.txt +++ b/Documentation/networking/vrf.txt @@ -108,11 +108,13 @@ limited to the default VRF. That is, it will not be matched by packets arriving on interfaces enslaved to an l3mdev and processes may bind to the same port if they bind to an l3mdev. -TCP & UDP services running in the default VRF context (ie., not bound -to any VRF device) can work across all VRF domains by enabling the -tcp_l3mdev_accept and udp_l3mdev_accept sysctl options: +TCP & UDP services & services using RAW sockets that are running in the +default VRF context (ie., not bound to any VRF device) can work across +all VRF domains by enabling the tcp_l3mdev_accept, udp_l3mdev_accept and +raw_l3mdev_accept sysctl options: sysctl -w net.ipv4.tcp_l3mdev_accept=1 sysctl -w net.ipv4.udp_l3mdev_accept=1 + sysctl -w net.ipv4.raw_l3mdev_accept=1 netfilter rules on the VRF device can be used to limit access to services running in the default VRF context as well. diff --git a/include/net/netns/ipv4.h b/include/net/netns/ipv4.h index e47503b4e4d1..104a6669e344 100644 --- a/include/net/netns/ipv4.h +++ b/include/net/netns/ipv4.h @@ -103,6 +103,9 @@ struct netns_ipv4 { /* Shall we try to damage output packets if routing dev changes? */ int sysctl_ip_dynaddr; int sysctl_ip_early_demux; +#ifdef CONFIG_NET_L3_MASTER_DEV + int sysctl_raw_l3mdev_accept; +#endif int sysctl_tcp_early_demux; int sysctl_udp_early_demux; diff --git a/net/ipv4/sysctl_net_ipv4.c b/net/ipv4/sysctl_net_ipv4.c index b92f422f2fa8..d173337040ee 100644 --- a/net/ipv4/sysctl_net_ipv4.c +++ b/net/ipv4/sysctl_net_ipv4.c @@ -601,6 +601,17 @@ static struct ctl_table ipv4_net_table[] = { .mode = 0644, .proc_handler = ipv4_ping_group_range, }, +#ifdef CONFIG_NET_L3_MASTER_DEV + { + .procname = "raw_l3mdev_accept", + .data = &init_net.ipv4.sysctl_raw_l3mdev_accept, + .maxlen = sizeof(int), + .mode = 0644, + .proc_handler = proc_dointvec_minmax, + .extra1 = &zero, + .extra2 = &one, + }, +#endif { .procname = "tcp_ecn", .data = &init_net.ipv4.sysctl_tcp_ecn, From patchwork Mon Oct 1 08:43:15 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mike Manning X-Patchwork-Id: 977045 X-Patchwork-Delegate: davem@davemloft.net Return-Path: X-Original-To: patchwork-incoming-netdev@ozlabs.org Delivered-To: patchwork-incoming-netdev@ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=netdev-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=none (p=none dis=none) header.from=vyatta.att-mail.com Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 42NwmX2VsVz9s7T for ; Mon, 1 Oct 2018 18:43:40 +1000 (AEST) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728927AbeJAPUR (ORCPT ); Mon, 1 Oct 2018 11:20:17 -0400 Received: from mx0b-00191d01.pphosted.com ([67.231.157.136]:43084 "EHLO mx0a-00191d01.pphosted.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1728863AbeJAPUQ (ORCPT ); Mon, 1 Oct 2018 11:20:16 -0400 Received: from pps.filterd (m0083689.ppops.net [127.0.0.1]) by m0083689.ppops.net-00191d01. (8.16.0.22/8.16.0.22) with SMTP id w918Z9Hn036559 for ; Mon, 1 Oct 2018 04:43:37 -0400 Received: from alpi155.enaf.aldc.att.com (sbcsmtp7.sbc.com [144.160.229.24]) by m0083689.ppops.net-00191d01. with ESMTP id 2mu5smaccw-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT) for ; Mon, 01 Oct 2018 04:43:36 -0400 Received: from enaf.aldc.att.com (localhost [127.0.0.1]) by alpi155.enaf.aldc.att.com (8.14.5/8.14.5) with ESMTP id w918haUH003067 for ; Mon, 1 Oct 2018 04:43:36 -0400 Received: from zlp27127.vci.att.com (zlp27127.vci.att.com [135.66.87.31]) by alpi155.enaf.aldc.att.com (8.14.5/8.14.5) with ESMTP id w918hYas003056 for ; Mon, 1 Oct 2018 04:43:34 -0400 Received: from zlp27127.vci.att.com (zlp27127.vci.att.com [127.0.0.1]) by zlp27127.vci.att.com (Service) with ESMTP id A11314048B43 for ; Mon, 1 Oct 2018 08:43:34 +0000 (GMT) Received: from mlpi432.sfdc.sbc.com (unknown [144.151.223.11]) by zlp27127.vci.att.com (Service) with ESMTP id 8C06840002BD for ; Mon, 1 Oct 2018 08:43:34 +0000 (GMT) Received: from sfdc.sbc.com (localhost [127.0.0.1]) by mlpi432.sfdc.sbc.com (8.14.5/8.14.5) with ESMTP id w918hYk9003768 for ; Mon, 1 Oct 2018 04:43:34 -0400 Received: from mail.eng.vyatta.net (mail.eng.vyatta.net [10.156.50.82]) by mlpi432.sfdc.sbc.com (8.14.5/8.14.5) with ESMTP id w918hW5Y003754 for ; Mon, 1 Oct 2018 04:43:33 -0400 Received: from MM-7520.vyatta.net (unknown [10.156.47.144]) by mail.eng.vyatta.net (Postfix) with ESMTPA id 2027936013D; Mon, 1 Oct 2018 01:43:31 -0700 (PDT) From: Mike Manning To: netdev@vger.kernel.org Cc: Duncan Eastoe Subject: [PATCH net-next v2 05/10] net: fix raw socket lookup device bind matching with VRFs Date: Mon, 1 Oct 2018 09:43:15 +0100 Message-Id: <20181001084320.32453-6-mmanning@vyatta.att-mail.com> X-Mailer: git-send-email 2.11.0 In-Reply-To: <20181001084320.32453-1-mmanning@vyatta.att-mail.com> References: <20181001084320.32453-1-mmanning@vyatta.att-mail.com> X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10434:, , definitions=2018-10-01_04:, , signatures=0 X-Proofpoint-Spam-Details: rule=outbound_policy_notspam policy=outbound_policy score=0 priorityscore=1501 malwarescore=0 suspectscore=1 phishscore=0 bulkscore=0 spamscore=0 clxscore=1015 lowpriorityscore=0 mlxscore=0 impostorscore=0 mlxlogscore=999 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1807170000 definitions=main-1810010088 Sender: netdev-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org From: Duncan Eastoe When there exist a pair of raw sockets one unbound and one bound to a VRF but equal in all other respects, when a packet is received in the VRF context, __raw_v4_lookup() matches on both sockets. This results in the packet being delivered over both sockets, instead of only the raw socket bound to the VRF. The bound device checks in __raw_v4_lookup() are replaced with a call to raw_sk_bound_dev_eq() which correctly handles whether the packet should be delivered over the unbound socket in such cases. In __raw_v6_lookup() the match on the device binding of the socket is similarly updated to use raw_sk_bound_dev_eq() which matches the handling in __raw_v4_lookup(). Importantly raw_sk_bound_dev_eq() takes the raw_l3mdev_accept sysctl into account. Signed-off-by: Duncan Eastoe Signed-off-by: Mike Manning --- include/net/raw.h | 12 ++++++++++++ net/ipv4/raw.c | 4 ++-- net/ipv6/raw.c | 6 +++--- 3 files changed, 17 insertions(+), 5 deletions(-) diff --git a/include/net/raw.h b/include/net/raw.h index 9c9fa98a91a4..ce88fdd68933 100644 --- a/include/net/raw.h +++ b/include/net/raw.h @@ -18,6 +18,7 @@ #define _RAW_H +#include #include #include @@ -74,4 +75,15 @@ static inline struct raw_sock *raw_sk(const struct sock *sk) return (struct raw_sock *)sk; } +static inline bool raw_sk_bound_dev_eq(struct net *net, int bound_dev_if, + int dif, int sdif) +{ +#if IS_ENABLED(CONFIG_NET_L3_MASTER_DEV) + return inet_bound_dev_eq(net->ipv4.sysctl_raw_l3mdev_accept, + bound_dev_if, dif, sdif); +#else + return inet_bound_dev_eq(1, bound_dev_if, dif, sdif); +#endif +} + #endif /* _RAW_H */ diff --git a/net/ipv4/raw.c b/net/ipv4/raw.c index 8ca3eb06ba04..6d8006c86dc0 100644 --- a/net/ipv4/raw.c +++ b/net/ipv4/raw.c @@ -70,6 +70,7 @@ #include #include #include +#include #include #include #include @@ -131,8 +132,7 @@ struct sock *__raw_v4_lookup(struct net *net, struct sock *sk, if (net_eq(sock_net(sk), net) && inet->inet_num == num && !(inet->inet_daddr && inet->inet_daddr != raddr) && !(inet->inet_rcv_saddr && inet->inet_rcv_saddr != laddr) && - !(sk->sk_bound_dev_if && sk->sk_bound_dev_if != dif && - sk->sk_bound_dev_if != sdif)) + raw_sk_bound_dev_eq(net, sk->sk_bound_dev_if, dif, sdif)) goto found; /* gotcha */ } sk = NULL; diff --git a/net/ipv6/raw.c b/net/ipv6/raw.c index 413d98bf24f4..5b363d315b06 100644 --- a/net/ipv6/raw.c +++ b/net/ipv6/raw.c @@ -49,6 +49,7 @@ #include #include #include +#include #include #if IS_ENABLED(CONFIG_IPV6_MIP6) #include @@ -86,9 +87,8 @@ struct sock *__raw_v6_lookup(struct net *net, struct sock *sk, !ipv6_addr_equal(&sk->sk_v6_daddr, rmt_addr)) continue; - if (sk->sk_bound_dev_if && - sk->sk_bound_dev_if != dif && - sk->sk_bound_dev_if != sdif) + if (!raw_sk_bound_dev_eq(net, sk->sk_bound_dev_if, + dif, sdif)) continue; if (!ipv6_addr_any(&sk->sk_v6_rcv_saddr)) { From patchwork Mon Oct 1 08:43:16 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mike Manning X-Patchwork-Id: 977053 X-Patchwork-Delegate: davem@davemloft.net Return-Path: X-Original-To: patchwork-incoming-netdev@ozlabs.org Delivered-To: patchwork-incoming-netdev@ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=netdev-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=none (p=none dis=none) header.from=vyatta.att-mail.com Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 42Nwn01Hhbz9s3x for ; Mon, 1 Oct 2018 18:44:04 +1000 (AEST) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728956AbeJAPUU (ORCPT ); Mon, 1 Oct 2018 11:20:20 -0400 Received: from mx0a-00191d01.pphosted.com ([67.231.149.140]:54198 "EHLO mx0a-00191d01.pphosted.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728935AbeJAPUT (ORCPT ); Mon, 1 Oct 2018 11:20:19 -0400 Received: from pps.filterd (m0053301.ppops.net [127.0.0.1]) by mx0a-00191d01.pphosted.com (8.16.0.22/8.16.0.22) with SMTP id w918Z9cJ037029 for ; Mon, 1 Oct 2018 04:43:40 -0400 Received: from alpi155.enaf.aldc.att.com (sbcsmtp7.sbc.com [144.160.229.24]) by mx0a-00191d01.pphosted.com with ESMTP id 2mu1fxq49p-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT) for ; Mon, 01 Oct 2018 04:43:39 -0400 Received: from enaf.aldc.att.com (localhost [127.0.0.1]) by alpi155.enaf.aldc.att.com (8.14.5/8.14.5) with ESMTP id w918hcvY003084 for ; Mon, 1 Oct 2018 04:43:38 -0400 Received: from zlp27125.vci.att.com (zlp27125.vci.att.com [135.66.87.52]) by alpi155.enaf.aldc.att.com (8.14.5/8.14.5) with ESMTP id w918hYrm003057 for ; Mon, 1 Oct 2018 04:43:34 -0400 Received: from zlp27125.vci.att.com (zlp27125.vci.att.com [127.0.0.1]) by zlp27125.vci.att.com (Service) with ESMTP id B204F16A3EE for ; Mon, 1 Oct 2018 08:43:34 +0000 (GMT) Received: from mlpi432.sfdc.sbc.com (unknown [144.151.223.11]) by zlp27125.vci.att.com (Service) with ESMTP id 9F2CF16A3ED for ; Mon, 1 Oct 2018 08:43:34 +0000 (GMT) Received: from sfdc.sbc.com (localhost [127.0.0.1]) by mlpi432.sfdc.sbc.com (8.14.5/8.14.5) with ESMTP id w918hYA0003779 for ; Mon, 1 Oct 2018 04:43:34 -0400 Received: from mail.eng.vyatta.net (mail.eng.vyatta.net [10.156.50.82]) by mlpi432.sfdc.sbc.com (8.14.5/8.14.5) with ESMTP id w918hYOQ003758 for ; Mon, 1 Oct 2018 04:43:34 -0400 Received: from MM-7520.vyatta.net (unknown [10.156.47.144]) by mail.eng.vyatta.net (Postfix) with ESMTPA id 31F8036004A; Mon, 1 Oct 2018 01:43:33 -0700 (PDT) From: Mike Manning To: netdev@vger.kernel.org Cc: Duncan Eastoe Subject: [PATCH net-next v2 06/10] net: IP[V6]_MULTICAST_IF constraint on unbound socket if VRFs present Date: Mon, 1 Oct 2018 09:43:16 +0100 Message-Id: <20181001084320.32453-7-mmanning@vyatta.att-mail.com> X-Mailer: git-send-email 2.11.0 In-Reply-To: <20181001084320.32453-1-mmanning@vyatta.att-mail.com> References: <20181001084320.32453-1-mmanning@vyatta.att-mail.com> X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10434:, , definitions=2018-10-01_04:, , signatures=0 X-Proofpoint-Spam-Details: rule=outbound_policy_notspam policy=outbound_policy score=0 priorityscore=1501 malwarescore=0 suspectscore=4 phishscore=0 bulkscore=0 spamscore=0 clxscore=1015 lowpriorityscore=0 mlxscore=0 impostorscore=0 mlxlogscore=543 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1807170000 definitions=main-1810010088 Sender: netdev-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org From: Duncan Eastoe If setsockopt(IP_MULTICAST_IF) or setsockopt(IPV6_MULTICAST_IF) is called on a socket which is not bound to a VRF then we should ensure that the output device chosen is also not bound to a VRF master. This avoids inadvertently sending traffic out of the wrong interface. This can be particularly problematic for IP_MULTICAST_IF since the interface lookup can be performed by address as well as ifindex. If there are interfaces with the same address, one unbound and one bound to a VRF, then the interface bound to the VRF may be chosen when the sockopt is called on an unbound socket. Signed-off-by: Duncan Eastoe Signed-off-by: Mike Manning --- net/ipv4/ip_sockglue.c | 3 +++ net/ipv6/ipv6_sockglue.c | 3 +++ 2 files changed, 6 insertions(+) diff --git a/net/ipv4/ip_sockglue.c b/net/ipv4/ip_sockglue.c index c0fe5ad996f2..026971314c43 100644 --- a/net/ipv4/ip_sockglue.c +++ b/net/ipv4/ip_sockglue.c @@ -892,6 +892,9 @@ static int do_ip_setsockopt(struct sock *sk, int level, dev_put(dev); err = -EINVAL; + if (!sk->sk_bound_dev_if && midx) + break; + if (sk->sk_bound_dev_if && mreq.imr_ifindex != sk->sk_bound_dev_if && (!midx || midx != sk->sk_bound_dev_if)) diff --git a/net/ipv6/ipv6_sockglue.c b/net/ipv6/ipv6_sockglue.c index c0cac9cc3a28..7dfbc797b130 100644 --- a/net/ipv6/ipv6_sockglue.c +++ b/net/ipv6/ipv6_sockglue.c @@ -626,6 +626,9 @@ static int do_ipv6_setsockopt(struct sock *sk, int level, int optname, rcu_read_unlock(); + if (!sk->sk_bound_dev_if && midx) + goto e_inval; + if (sk->sk_bound_dev_if && sk->sk_bound_dev_if != val && (!midx || midx != sk->sk_bound_dev_if)) From patchwork Mon Oct 1 08:43:17 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mike Manning X-Patchwork-Id: 977048 X-Patchwork-Delegate: davem@davemloft.net Return-Path: X-Original-To: patchwork-incoming-netdev@ozlabs.org Delivered-To: patchwork-incoming-netdev@ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=netdev-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=none (p=none dis=none) header.from=vyatta.att-mail.com Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 42Nwmj1tj0z9s4V for ; Mon, 1 Oct 2018 18:43:49 +1000 (AEST) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729023AbeJAPUZ (ORCPT ); Mon, 1 Oct 2018 11:20:25 -0400 Received: from mx0a-00191d01.pphosted.com ([67.231.149.140]:54252 "EHLO mx0a-00191d01.pphosted.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729006AbeJAPUX (ORCPT ); Mon, 1 Oct 2018 11:20:23 -0400 Received: from pps.filterd (m0053301.ppops.net [127.0.0.1]) by mx0a-00191d01.pphosted.com (8.16.0.22/8.16.0.22) with SMTP id w918Z5PE036971 for ; Mon, 1 Oct 2018 04:43:44 -0400 Received: from alpi155.enaf.aldc.att.com (sbcsmtp7.sbc.com [144.160.229.24]) by mx0a-00191d01.pphosted.com with ESMTP id 2mu1fxq4ad-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT) for ; Mon, 01 Oct 2018 04:43:43 -0400 Received: from enaf.aldc.att.com (localhost [127.0.0.1]) by alpi155.enaf.aldc.att.com (8.14.5/8.14.5) with ESMTP id w918hgi0003139 for ; Mon, 1 Oct 2018 04:43:42 -0400 Received: from zlp27125.vci.att.com (zlp27125.vci.att.com [135.66.87.52]) by alpi155.enaf.aldc.att.com (8.14.5/8.14.5) with ESMTP id w918hcxK003079 for ; Mon, 1 Oct 2018 04:43:38 -0400 Received: from zlp27125.vci.att.com (zlp27125.vci.att.com [127.0.0.1]) by zlp27125.vci.att.com (Service) with ESMTP id 9E91016A3EE for ; Mon, 1 Oct 2018 08:43:38 +0000 (GMT) Received: from mlpi432.sfdc.sbc.com (unknown [144.151.223.11]) by zlp27125.vci.att.com (Service) with ESMTP id 82DB616A3ED for ; Mon, 1 Oct 2018 08:43:38 +0000 (GMT) Received: from sfdc.sbc.com (localhost [127.0.0.1]) by mlpi432.sfdc.sbc.com (8.14.5/8.14.5) with ESMTP id w918hcFI003851 for ; Mon, 1 Oct 2018 04:43:38 -0400 Received: from mail.eng.vyatta.net (mail.eng.vyatta.net [10.156.50.82]) by mlpi432.sfdc.sbc.com (8.14.5/8.14.5) with ESMTP id w918hYNP003805 for ; Mon, 1 Oct 2018 04:43:35 -0400 Received: from MM-7520.vyatta.net (unknown [10.156.47.144]) by mail.eng.vyatta.net (Postfix) with ESMTPA id 4429F36013D for ; Mon, 1 Oct 2018 01:43:34 -0700 (PDT) From: Mike Manning To: netdev@vger.kernel.org Subject: [PATCH net-next v2 07/10] vrf: mark skb for multicast or link-local as enslaved to VRF Date: Mon, 1 Oct 2018 09:43:17 +0100 Message-Id: <20181001084320.32453-8-mmanning@vyatta.att-mail.com> X-Mailer: git-send-email 2.11.0 In-Reply-To: <20181001084320.32453-1-mmanning@vyatta.att-mail.com> References: <20181001084320.32453-1-mmanning@vyatta.att-mail.com> X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10434:, , definitions=2018-10-01_04:, , signatures=0 X-Proofpoint-Spam-Details: rule=outbound_policy_notspam policy=outbound_policy score=0 priorityscore=1501 malwarescore=0 suspectscore=4 phishscore=0 bulkscore=0 spamscore=0 clxscore=1015 lowpriorityscore=0 mlxscore=0 impostorscore=0 mlxlogscore=999 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1807170000 definitions=main-1810010088 Sender: netdev-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org The skb for packets that are multicast or to a link-local address are not marked as being enslaved to a VRF, if they are received on a socket bound to the VRF. This is needed for ND and it is preferable for the kernel not to have to deal with the additional use-cases if ll or mcast packets are handled as enslaved. However, this does not allow service instances listening on unbound and bound to VRF sockets to distinguish the VRF used, if packets are sent as multicast or to a link-local address. The fix is for the VRF driver to also mark these skb as being enslaved to the VRF. Signed-off-by: Mike Manning --- drivers/net/vrf.c | 19 +++++++++---------- 1 file changed, 9 insertions(+), 10 deletions(-) diff --git a/drivers/net/vrf.c b/drivers/net/vrf.c index 69b7227c637e..21ad4b1d7f03 100644 --- a/drivers/net/vrf.c +++ b/drivers/net/vrf.c @@ -981,24 +981,23 @@ static struct sk_buff *vrf_ip6_rcv(struct net_device *vrf_dev, struct sk_buff *skb) { int orig_iif = skb->skb_iif; - bool need_strict; + bool need_strict = rt6_need_strict(&ipv6_hdr(skb)->daddr); + bool is_ndisc = ipv6_ndisc_frame(skb); - /* loopback traffic; do not push through packet taps again. - * Reset pkt_type for upper layers to process skb + /* loopback, multicast & non-ND link-local traffic; do not push through + * packet taps again. Reset pkt_type for upper layers to process skb */ - if (skb->pkt_type == PACKET_LOOPBACK) { + if (skb->pkt_type == PACKET_LOOPBACK || (need_strict && !is_ndisc)) { skb->dev = vrf_dev; skb->skb_iif = vrf_dev->ifindex; IP6CB(skb)->flags |= IP6SKB_L3SLAVE; - skb->pkt_type = PACKET_HOST; + if (skb->pkt_type == PACKET_LOOPBACK) + skb->pkt_type = PACKET_HOST; goto out; } - /* if packet is NDISC or addressed to multicast or link-local - * then keep the ingress interface - */ - need_strict = rt6_need_strict(&ipv6_hdr(skb)->daddr); - if (!ipv6_ndisc_frame(skb) && !need_strict) { + /* if packet is NDISC then keep the ingress interface */ + if (!is_ndisc) { vrf_rx_stats(vrf_dev, skb->len); skb->dev = vrf_dev; skb->skb_iif = vrf_dev->ifindex; From patchwork Mon Oct 1 08:43:18 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mike Manning X-Patchwork-Id: 977052 X-Patchwork-Delegate: davem@davemloft.net Return-Path: X-Original-To: patchwork-incoming-netdev@ozlabs.org Delivered-To: patchwork-incoming-netdev@ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=netdev-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=none (p=none dis=none) header.from=vyatta.att-mail.com Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 42Nwmq2hnnz9s3x for ; Mon, 1 Oct 2018 18:43:55 +1000 (AEST) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729034AbeJAPU1 (ORCPT ); Mon, 1 Oct 2018 11:20:27 -0400 Received: from mx0b-00191d01.pphosted.com ([67.231.157.136]:52120 "EHLO mx0a-00191d01.pphosted.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1729012AbeJAPUY (ORCPT ); Mon, 1 Oct 2018 11:20:24 -0400 Received: from pps.filterd (m0049458.ppops.net [127.0.0.1]) by m0049458.ppops.net-00191d01. (8.16.0.22/8.16.0.22) with SMTP id w918Z8Em040829 for ; Mon, 1 Oct 2018 04:43:44 -0400 Received: from tlpd255.enaf.dadc.sbc.com (sbcsmtp3.sbc.com [144.160.112.28]) by m0049458.ppops.net-00191d01. with ESMTP id 2mube5n0v0-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT) for ; Mon, 01 Oct 2018 04:43:44 -0400 Received: from enaf.dadc.sbc.com (localhost [127.0.0.1]) by tlpd255.enaf.dadc.sbc.com (8.14.5/8.14.5) with ESMTP id w918hh2Y089084 for ; Mon, 1 Oct 2018 03:43:43 -0500 Received: from zlp30497.vci.att.com (zlp30497.vci.att.com [135.46.181.156]) by tlpd255.enaf.dadc.sbc.com (8.14.5/8.14.5) with ESMTP id w918hgm4089042 for ; Mon, 1 Oct 2018 03:43:42 -0500 Received: from zlp30497.vci.att.com (zlp30497.vci.att.com [127.0.0.1]) by zlp30497.vci.att.com (Service) with ESMTP id 025B14014201 for ; Mon, 1 Oct 2018 08:43:42 +0000 (GMT) Received: from tlpd252.dadc.sbc.com (unknown [135.31.184.157]) by zlp30497.vci.att.com (Service) with ESMTP id E257A40141F7 for ; Mon, 1 Oct 2018 08:43:41 +0000 (GMT) Received: from dadc.sbc.com (localhost [127.0.0.1]) by tlpd252.dadc.sbc.com (8.14.5/8.14.5) with ESMTP id w918hfxw014442 for ; Mon, 1 Oct 2018 03:43:41 -0500 Received: from mail.eng.vyatta.net (mail.eng.vyatta.net [10.156.50.82]) by tlpd252.dadc.sbc.com (8.14.5/8.14.5) with ESMTP id w918hZWQ014268 for ; Mon, 1 Oct 2018 03:43:36 -0500 Received: from MM-7520.vyatta.net (unknown [10.156.47.144]) by mail.eng.vyatta.net (Postfix) with ESMTPA id 2ADA736004A for ; Mon, 1 Oct 2018 01:43:35 -0700 (PDT) From: Mike Manning To: netdev@vger.kernel.org Subject: [PATCH net-next v2 08/10] ipv6: allow ping to link-local address in VRF Date: Mon, 1 Oct 2018 09:43:18 +0100 Message-Id: <20181001084320.32453-9-mmanning@vyatta.att-mail.com> X-Mailer: git-send-email 2.11.0 In-Reply-To: <20181001084320.32453-1-mmanning@vyatta.att-mail.com> References: <20181001084320.32453-1-mmanning@vyatta.att-mail.com> X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10434:, , definitions=2018-10-01_04:, , signatures=0 X-Proofpoint-Spam-Details: rule=outbound_policy_notspam policy=outbound_policy score=0 priorityscore=1501 malwarescore=0 suspectscore=1 phishscore=0 bulkscore=0 spamscore=0 clxscore=1015 lowpriorityscore=0 mlxscore=0 impostorscore=0 mlxlogscore=999 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1807170000 definitions=main-1810010088 Sender: netdev-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org If link-local packets are marked as enslaved to a VRF, then to allow ping to the link-local from a vrf, the error handling for IPV6_PKTINFO needs to be relaxed to also allow the pkt ipi6_ifindex to be that of a slave device to the vrf. Note that the real device also needs to be retrieved in icmp6_iif() to set the ipv6 flow oif to this for icmp echo reply handling. The recent commit 24b711edfc34 ("net/ipv6: Fix linklocal to global address with VRF") takes care of this, so the sdif does not need checking here. This fix makes ping to link-local consistent with that to global addresses, in that this can now be done from within the same VRF that the address is in. Signed-off-by: Mike Manning --- net/ipv6/ipv6_sockglue.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/net/ipv6/ipv6_sockglue.c b/net/ipv6/ipv6_sockglue.c index 7dfbc797b130..4ebd395dd3df 100644 --- a/net/ipv6/ipv6_sockglue.c +++ b/net/ipv6/ipv6_sockglue.c @@ -486,7 +486,7 @@ static int do_ipv6_setsockopt(struct sock *sk, int level, int optname, retv = -EFAULT; break; } - if (sk->sk_bound_dev_if && pkt.ipi6_ifindex != sk->sk_bound_dev_if) + if (!sk_dev_equal_l3scope(sk, pkt.ipi6_ifindex)) goto e_inval; np->sticky_pktinfo.ipi6_ifindex = pkt.ipi6_ifindex; From patchwork Mon Oct 1 08:43:19 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mike Manning X-Patchwork-Id: 977050 X-Patchwork-Delegate: davem@davemloft.net Return-Path: X-Original-To: patchwork-incoming-netdev@ozlabs.org Delivered-To: patchwork-incoming-netdev@ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=netdev-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=none (p=none dis=none) header.from=vyatta.att-mail.com Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 42Nwml1D7pz9sB5 for ; Mon, 1 Oct 2018 18:43:51 +1000 (AEST) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729041AbeJAPU2 (ORCPT ); Mon, 1 Oct 2018 11:20:28 -0400 Received: from mx0b-00191d01.pphosted.com ([67.231.157.136]:39720 "EHLO mx0a-00191d01.pphosted.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1728940AbeJAPUY (ORCPT ); Mon, 1 Oct 2018 11:20:24 -0400 Received: from pps.filterd (m0049463.ppops.net [127.0.0.1]) by m0049463.ppops.net-00191d01. (8.16.0.22/8.16.0.22) with SMTP id w918ZNUA017445 for ; Mon, 1 Oct 2018 04:43:44 -0400 Received: from tlpd255.enaf.dadc.sbc.com (sbcsmtp3.sbc.com [144.160.112.28]) by m0049463.ppops.net-00191d01. with ESMTP id 2mua6sx5d3-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT) for ; Mon, 01 Oct 2018 04:43:44 -0400 Received: from enaf.dadc.sbc.com (localhost [127.0.0.1]) by tlpd255.enaf.dadc.sbc.com (8.14.5/8.14.5) with ESMTP id w918hh2c089084 for ; Mon, 1 Oct 2018 03:43:44 -0500 Received: from zlp30496.vci.att.com (zlp30496.vci.att.com [135.46.181.157]) by tlpd255.enaf.dadc.sbc.com (8.14.5/8.14.5) with ESMTP id w918hg5Y089051 for ; Mon, 1 Oct 2018 03:43:42 -0500 Received: from zlp30496.vci.att.com (zlp30496.vci.att.com [127.0.0.1]) by zlp30496.vci.att.com (Service) with ESMTP id 408F140F6CFB for ; Mon, 1 Oct 2018 08:43:42 +0000 (GMT) Received: from tlpd252.dadc.sbc.com (unknown [135.31.184.157]) by zlp30496.vci.att.com (Service) with ESMTP id 208D540002A5 for ; Mon, 1 Oct 2018 08:43:42 +0000 (GMT) Received: from dadc.sbc.com (localhost [127.0.0.1]) by tlpd252.dadc.sbc.com (8.14.5/8.14.5) with ESMTP id w918hgTd014462 for ; Mon, 1 Oct 2018 03:43:42 -0500 Received: from mail.eng.vyatta.net (mail.eng.vyatta.net [10.156.50.82]) by tlpd252.dadc.sbc.com (8.14.5/8.14.5) with ESMTP id w918haOD014290 for ; Mon, 1 Oct 2018 03:43:37 -0500 Received: from MM-7520.vyatta.net (unknown [10.156.47.144]) by mail.eng.vyatta.net (Postfix) with ESMTPA id 0F82236013D; Mon, 1 Oct 2018 01:43:35 -0700 (PDT) From: Mike Manning To: netdev@vger.kernel.org Cc: Dewi Morgan Subject: [PATCH net-next v2 09/10] ipv6: handling of multicast packets received in VRF Date: Mon, 1 Oct 2018 09:43:19 +0100 Message-Id: <20181001084320.32453-10-mmanning@vyatta.att-mail.com> X-Mailer: git-send-email 2.11.0 In-Reply-To: <20181001084320.32453-1-mmanning@vyatta.att-mail.com> References: <20181001084320.32453-1-mmanning@vyatta.att-mail.com> X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10434:, , definitions=2018-10-01_04:, , signatures=0 X-Proofpoint-Spam-Details: rule=outbound_policy_notspam policy=outbound_policy score=0 priorityscore=1501 malwarescore=0 suspectscore=1 phishscore=0 bulkscore=0 spamscore=0 clxscore=1015 lowpriorityscore=0 mlxscore=0 impostorscore=0 mlxlogscore=869 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1807170000 definitions=main-1810010088 Sender: netdev-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org If the skb for multicast packets marked as enslaved to a VRF are received, then the secondary device index should be used to obtain the real device. And verify the multicast address against the enslaved rather than the l3mdev device. Signed-off-by: Dewi Morgan Signed-off-by: Mike Manning --- net/ipv6/ip6_input.c | 35 ++++++++++++++++++++++++++++++++--- 1 file changed, 32 insertions(+), 3 deletions(-) diff --git a/net/ipv6/ip6_input.c b/net/ipv6/ip6_input.c index 96577e742afd..df58e1100226 100644 --- a/net/ipv6/ip6_input.c +++ b/net/ipv6/ip6_input.c @@ -359,6 +359,8 @@ static int ip6_input_finish(struct net *net, struct sock *sk, struct sk_buff *sk } } else if (ipprot->flags & INET6_PROTO_FINAL) { const struct ipv6hdr *hdr; + int sdif = inet6_sdif(skb); + struct net_device *dev; /* Only do this once for first final protocol */ have_final = true; @@ -371,9 +373,19 @@ static int ip6_input_finish(struct net *net, struct sock *sk, struct sk_buff *sk skb_postpull_rcsum(skb, skb_network_header(skb), skb_network_header_len(skb)); hdr = ipv6_hdr(skb); + + /* skb->dev passed may be master dev for vrfs. */ + if (sdif) { + dev = dev_get_by_index_rcu(net, sdif); + if (!dev) + goto discard; + } else { + dev = skb->dev; + } + if (ipv6_addr_is_multicast(&hdr->daddr) && - !ipv6_chk_mcast_addr(skb->dev, &hdr->daddr, - &hdr->saddr) && + !ipv6_chk_mcast_addr(dev, &hdr->daddr, + &hdr->saddr) && !ipv6_is_mld(skb, nexthdr, skb_network_header_len(skb))) goto discard; } @@ -432,15 +444,32 @@ EXPORT_SYMBOL_GPL(ip6_input); int ip6_mc_input(struct sk_buff *skb) { + int sdif = inet6_sdif(skb); const struct ipv6hdr *hdr; + struct net_device *dev; bool deliver; __IP6_UPD_PO_STATS(dev_net(skb_dst(skb)->dev), __in6_dev_get_safely(skb->dev), IPSTATS_MIB_INMCAST, skb->len); + /* skb->dev passed may be master dev for vrfs. */ + if (sdif) { + rcu_read_lock(); + dev = dev_get_by_index_rcu(dev_net(skb->dev), sdif); + if (!dev) { + rcu_read_unlock(); + kfree_skb(skb); + return -ENODEV; + } + } else { + dev = skb->dev; + } + hdr = ipv6_hdr(skb); - deliver = ipv6_chk_mcast_addr(skb->dev, &hdr->daddr, NULL); + deliver = ipv6_chk_mcast_addr(dev, &hdr->daddr, NULL); + if (sdif) + rcu_read_unlock(); #ifdef CONFIG_IPV6_MROUTE /* From patchwork Mon Oct 1 08:43:20 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mike Manning X-Patchwork-Id: 977051 X-Patchwork-Delegate: davem@davemloft.net Return-Path: X-Original-To: patchwork-incoming-netdev@ozlabs.org Delivered-To: patchwork-incoming-netdev@ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=netdev-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=none (p=none dis=none) header.from=vyatta.att-mail.com Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 42Nwmm3LYrz9s3x for ; Mon, 1 Oct 2018 18:43:52 +1000 (AEST) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729048AbeJAPU3 (ORCPT ); Mon, 1 Oct 2018 11:20:29 -0400 Received: from mx0b-00191d01.pphosted.com ([67.231.157.136]:41630 "EHLO mx0a-00191d01.pphosted.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1728749AbeJAPU2 (ORCPT ); Mon, 1 Oct 2018 11:20:28 -0400 Received: from pps.filterd (m0049462.ppops.net [127.0.0.1]) by m0049462.ppops.net-00191d01. (8.16.0.22/8.16.0.22) with SMTP id w918Z9hv022602 for ; Mon, 1 Oct 2018 04:43:49 -0400 Received: from tlpd255.enaf.dadc.sbc.com (sbcsmtp3.sbc.com [144.160.112.28]) by m0049462.ppops.net-00191d01. with ESMTP id 2mu9hd6xqf-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT) for ; Mon, 01 Oct 2018 04:43:48 -0400 Received: from enaf.dadc.sbc.com (localhost [127.0.0.1]) by tlpd255.enaf.dadc.sbc.com (8.14.5/8.14.5) with ESMTP id w918hlgU089171 for ; Mon, 1 Oct 2018 03:43:47 -0500 Received: from zlp30494.vci.att.com (zlp30494.vci.att.com [135.46.181.159]) by tlpd255.enaf.dadc.sbc.com (8.14.5/8.14.5) with ESMTP id w918hkEP089131 for ; Mon, 1 Oct 2018 03:43:46 -0500 Received: from zlp30494.vci.att.com (zlp30494.vci.att.com [127.0.0.1]) by zlp30494.vci.att.com (Service) with ESMTP id 2D03B400048E for ; Mon, 1 Oct 2018 08:43:46 +0000 (GMT) Received: from tlpd252.dadc.sbc.com (unknown [135.31.184.157]) by zlp30494.vci.att.com (Service) with ESMTP id 0F3424000490 for ; Mon, 1 Oct 2018 08:43:46 +0000 (GMT) Received: from dadc.sbc.com (localhost [127.0.0.1]) by tlpd252.dadc.sbc.com (8.14.5/8.14.5) with ESMTP id w918hkUc014588 for ; Mon, 1 Oct 2018 03:43:46 -0500 Received: from mail.eng.vyatta.net (mail.eng.vyatta.net [10.156.50.82]) by tlpd252.dadc.sbc.com (8.14.5/8.14.5) with ESMTP id w918hbcZ014325 for ; Mon, 1 Oct 2018 03:43:38 -0500 Received: from MM-7520.vyatta.net (unknown [10.156.47.144]) by mail.eng.vyatta.net (Postfix) with ESMTPA id 1F31336004A; Mon, 1 Oct 2018 01:43:36 -0700 (PDT) From: Mike Manning To: netdev@vger.kernel.org Cc: Dewi Morgan Subject: [PATCH net-next v2 10/10] ipv6: do not drop vrf udp multicast packets Date: Mon, 1 Oct 2018 09:43:20 +0100 Message-Id: <20181001084320.32453-11-mmanning@vyatta.att-mail.com> X-Mailer: git-send-email 2.11.0 In-Reply-To: <20181001084320.32453-1-mmanning@vyatta.att-mail.com> References: <20181001084320.32453-1-mmanning@vyatta.att-mail.com> X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10434:, , definitions=2018-10-01_04:, , signatures=0 X-Proofpoint-Spam-Details: rule=outbound_policy_notspam policy=outbound_policy score=0 priorityscore=1501 malwarescore=0 suspectscore=1 phishscore=0 bulkscore=0 spamscore=0 clxscore=1015 lowpriorityscore=0 mlxscore=0 impostorscore=0 mlxlogscore=999 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1807170000 definitions=main-1810010088 Sender: netdev-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org From: Dewi Morgan For bound udp sockets in a vrf, also check the sdif to get the index for ingress devices enslaved to an l3mdev. Signed-off-by: Dewi Morgan Signed-off-by: Mike Manning --- net/ipv6/udp.c | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c index 6722490c87b9..821fdc31dbc0 100644 --- a/net/ipv6/udp.c +++ b/net/ipv6/udp.c @@ -637,7 +637,7 @@ static int udpv6_queue_rcv_skb(struct sock *sk, struct sk_buff *skb) static bool __udp_v6_is_mcast_sock(struct net *net, struct sock *sk, __be16 loc_port, const struct in6_addr *loc_addr, __be16 rmt_port, const struct in6_addr *rmt_addr, - int dif, unsigned short hnum) + int dif, int sdif, unsigned short hnum) { struct inet_sock *inet = inet_sk(sk); @@ -649,7 +649,7 @@ static bool __udp_v6_is_mcast_sock(struct net *net, struct sock *sk, (inet->inet_dport && inet->inet_dport != rmt_port) || (!ipv6_addr_any(&sk->sk_v6_daddr) && !ipv6_addr_equal(&sk->sk_v6_daddr, rmt_addr)) || - (sk->sk_bound_dev_if && sk->sk_bound_dev_if != dif) || + !udp_sk_bound_dev_eq(net, sk->sk_bound_dev_if, dif, sdif) || (!ipv6_addr_any(&sk->sk_v6_rcv_saddr) && !ipv6_addr_equal(&sk->sk_v6_rcv_saddr, loc_addr))) return false; @@ -683,6 +683,7 @@ static int __udp6_lib_mcast_deliver(struct net *net, struct sk_buff *skb, unsigned int offset = offsetof(typeof(*sk), sk_node); unsigned int hash2 = 0, hash2_any = 0, use_hash2 = (hslot->count > 10); int dif = inet6_iif(skb); + int sdif = inet6_sdif(skb); struct hlist_node *node; struct sk_buff *nskb; @@ -697,7 +698,8 @@ static int __udp6_lib_mcast_deliver(struct net *net, struct sk_buff *skb, sk_for_each_entry_offset_rcu(sk, node, &hslot->head, offset) { if (!__udp_v6_is_mcast_sock(net, sk, uh->dest, daddr, - uh->source, saddr, dif, hnum)) + uh->source, saddr, dif, sdif, + hnum)) continue; /* If zero checksum and no_check is not on for * the socket then skip it.