diff mbox

[net-next,1/2] net: l3mdev: Add master device lookup by index

Message ID 1447965121-2345-1-git-send-email-dsa@cumulusnetworks.com
State Rejected, archived
Delegated to: David Miller
Headers show

Commit Message

David Ahern Nov. 19, 2015, 8:32 p.m. UTC
Add helper to lookup master index given a device index.

Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
---
 include/net/l3mdev.h | 23 +++++++++++++++++++++++
 1 file changed, 23 insertions(+)

Comments

David Miller Nov. 22, 2015, 5:23 p.m. UTC | #1
From: David Ahern <dsa@cumulusnetworks.com>
Date: Thu, 19 Nov 2015 12:32:00 -0800

> Add helper to lookup master index given a device index.
> 
> Signed-off-by: David Ahern <dsa@cumulusnetworks.com>

I don't like where this is going.

sk->sk_bound_dev_if is for device bindings which the user has
explicitly asked for.

We should never, therefore, automatically set it without the user's
consent.

I'm not applying these patches, sorry.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Ahern Nov. 22, 2015, 5:30 p.m. UTC | #2
On 11/22/15 10:23 AM, David Miller wrote:
> From: David Ahern <dsa@cumulusnetworks.com>
> Date: Thu, 19 Nov 2015 12:32:00 -0800
>
>> Add helper to lookup master index given a device index.
>>
>> Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
>
> I don't like where this is going.
>
> sk->sk_bound_dev_if is for device bindings which the user has
> explicitly asked for.
>
> We should never, therefore, automatically set it without the user's
> consent.

In this case the user is running a daemon (bgpd) where a single instance 
works across all VRFs. The listen socket is not bound to a device, so 
this does not override what the user ask for. Child sockets are then 
bound to the VRF device the connection originates over, so it narrows 
the scope of accepted connections to a single VRF.

If you look at the change, e.g.,:

     ireq->ir_iif = sk->sk_bound_dev_if ? : 
l3mdev_master_ifindex_by_index(sock_net(sk), skb->skb_iif);

It keeps user requested sk_bound_dev_if if it is set. If not, applies 
the limited scope of a VRF device if the skb originated on a device 
enslaved to a VRF device.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Miller Nov. 22, 2015, 6:17 p.m. UTC | #3
From: David Ahern <dsa@cumulusnetworks.com>
Date: Sun, 22 Nov 2015 10:30:32 -0700

> In this case ...

I understand the problem you are trying to solve, but I am saying
you can't use sk_bound_dev_if to use it.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Ahern Nov. 23, 2015, 4:02 a.m. UTC | #4
On 11/22/15 11:17 AM, David Miller wrote:
> From: David Ahern <dsa@cumulusnetworks.com>
> Date: Sun, 22 Nov 2015 10:30:32 -0700
>
>> In this case ...
>
> I understand the problem you are trying to solve, but I am saying
> you can't use sk_bound_dev_if to use it.
>

I am confused by that response given that sk_bound_dev_if is one of the 
key principals for the VRF implementation. Applications wanting to 
communicate over interfaces in a VRF have to set sk_bound_dev_if. If 
sk_bound_dev_if is not set by the kernel when the child socket is 
created the TCP handshake will not complete. It is not something that 
can be deferred until after accept.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Miller Nov. 23, 2015, 4:35 a.m. UTC | #5
From: David Ahern <dsa@cumulusnetworks.com>
Date: Sun, 22 Nov 2015 21:02:04 -0700

> I am confused by that response given that sk_bound_dev_if is one of
> the key principals for the VRF implementation. Applications wanting to
> communicate over interfaces in a VRF have to set sk_bound_dev_if.

Yes, they have to set it explicitly.

You are setting it for them in response to the connection
creation, and that's what I object to.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Ahern Nov. 23, 2015, 6:28 p.m. UTC | #6
On 11/22/15 9:35 PM, David Miller wrote:
> From: David Ahern <dsa@cumulusnetworks.com>
> Date: Sun, 22 Nov 2015 21:02:04 -0700
>
>> I am confused by that response given that sk_bound_dev_if is one of
>> the key principals for the VRF implementation. Applications wanting to
>> communicate over interfaces in a VRF have to set sk_bound_dev_if.
>
> Yes, they have to set it explicitly.
>
> You are setting it for them in response to the connection
> creation, and that's what I object to.
>

The intent is to not require having N-listen sockets/threads/tasks to 
support N-vrfs for scalability reasons. Having a special DEVICE_ANY 
index adds complexity to socket lookups, so I dropped that idea long 
ago. Would guarding this behavior by a sysctl be acceptable?
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/include/net/l3mdev.h b/include/net/l3mdev.h
index 774d85b2d5d9..786226f8e77b 100644
--- a/include/net/l3mdev.h
+++ b/include/net/l3mdev.h
@@ -51,6 +51,24 @@  static inline int l3mdev_master_ifindex(struct net_device *dev)
 	return ifindex;
 }
 
+static inline int l3mdev_master_ifindex_by_index(struct net *net, int ifindex)
+{
+	struct net_device *dev;
+	int rc = 0;
+
+	if (likely(ifindex)) {
+		rcu_read_lock();
+
+		dev = dev_get_by_index_rcu(net, ifindex);
+		if (dev)
+			rc = l3mdev_master_ifindex_rcu(dev);
+
+		rcu_read_unlock();
+	}
+
+	return rc;
+}
+
 /* get index of an interface to use for FIB lookups. For devices
  * enslaved to an L3 master device FIB lookups are based on the
  * master index
@@ -167,6 +185,11 @@  static inline int l3mdev_master_ifindex(struct net_device *dev)
 	return 0;
 }
 
+static inline int l3mdev_master_ifindex_by_index(struct net *net, int ifindex)
+{
+	return 0;
+}
+
 static inline int l3mdev_fib_oif_rcu(struct net_device *dev)
 {
 	return dev ? dev->ifindex : 0;