diff mbox

[v2,2/5] vxlan: Generalize udp based tunnel offload

Message ID 1441749356-92927-2-git-send-email-anjali.singhai@intel.com
State Changes Requested
Delegated to: Jeff Kirsher
Headers show

Commit Message

Singhai, Anjali Sept. 8, 2015, 9:55 p.m. UTC
Replace add/del ndo ops for vxlan_port with tunnel_port so that all UDP
based tunnels can use the same ndo op. Add a parameter to pass tunnel
type to the ndo_op.

Signed-off-by: Kiran Patil <kiran.patil@intel.com>
Signed-off-by: Anjali Singhai Jain <anjali.singhai@intel.com>
---
 drivers/net/vxlan.c       | 20 ++++++++++----------
 include/linux/netdevice.h | 33 ++++++++++++++++-----------------
 include/net/udp_tunnel.h  |  6 ++++++
 3 files changed, 32 insertions(+), 27 deletions(-)

Comments

Jesse Gross Sept. 9, 2015, 9:30 p.m. UTC | #1
On Tue, Sep 8, 2015 at 2:55 PM, Anjali Singhai Jain
<anjali.singhai@intel.com> wrote:
> Replace add/del ndo ops for vxlan_port with tunnel_port so that all UDP
> based tunnels can use the same ndo op. Add a parameter to pass tunnel
> type to the ndo_op.
>
> Signed-off-by: Kiran Patil <kiran.patil@intel.com>
> Signed-off-by: Anjali Singhai Jain <anjali.singhai@intel.com>

You should squash the patch that comes after this into this one -
otherwise the kernel won't compile after this patch.

Did you mean to also add calls from the Geneve code to this? I think
you had it in the previous version.

I believe we also need a generalization of vxlan_get_rx_port().
Otherwise, there's no way for drivers to get the non-VXLAN tunnels
that were created when they weren't up.

> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
> index c69072b..b6f6beb 100644
> --- a/include/linux/netdevice.h
> +++ b/include/linux/netdevice.h
> + * void (*ndo_add_tunnel_port)(struct  net_device *dev,
> + *                           sa_family_t sa_family, __be16 port, u32 type);
> + *     Called by UDP based tunnel modules to notify a driver about a UDP
> + *     port and socket address family that the tunnel is listening to. It is
> + *     called only when a new port starts listening. The operation is
> + *     protected by tunnel socket sock_lock.

This locking doesn't look safe to me. Each type of tunnel will have
its own socket lock but the data structures that are used in drivers
implementing these functions may be shared (such as in your i40e
patch). I believe that RTNL is currently held by both VXLAN and Geneve
when calling this with the possible exception of vxlan_get_rx_port().
In practice, my guess is that it is likely held then too, so it might
only require checking this and documenting/asserting it.

I might just call this ndo_add_udp_tunnel_port since that's what it
does anyways.

> diff --git a/include/net/udp_tunnel.h b/include/net/udp_tunnel.h
> index c491c12..328f3ab 100644
> --- a/include/net/udp_tunnel.h
> +++ b/include/net/udp_tunnel.h
[...]
> +enum udp_tunnel_type {
> +       UDP_TUNNEL_NONE,
> +       UDP_TUNNEL_VXLAN,
> +       UDP_TUNNEL_GENEVE,
> +};

What does UDP_TUNNEL_NONE mean?
Singhai, Anjali Sept. 10, 2015, 5:36 p.m. UTC | #2
> -----Original Message-----
> From: Jesse Gross [mailto:jesse@nicira.com]
> Sent: Wednesday, September 09, 2015 2:30 PM
> To: Singhai, Anjali
> Cc: intel-wired-lan
> Subject: Re: [Intel-wired-lan] [PATCH v2 2/5] vxlan: Generalize udp based
> tunnel offload
> 
> On Tue, Sep 8, 2015 at 2:55 PM, Anjali Singhai Jain <anjali.singhai@intel.com>
> wrote:
> > Replace add/del ndo ops for vxlan_port with tunnel_port so that all
> > UDP based tunnels can use the same ndo op. Add a parameter to pass
> > tunnel type to the ndo_op.
> >
> > Signed-off-by: Kiran Patil <kiran.patil@intel.com>
> > Signed-off-by: Anjali Singhai Jain <anjali.singhai@intel.com>
> 
> You should squash the patch that comes after this into this one - otherwise
> the kernel won't compile after this patch.
> 
Ok
> Did you mean to also add calls from the Geneve code to this? I think you had
> it in the previous version.
> 
Good catch, I will fix it.

> I believe we also need a generalization of vxlan_get_rx_port().
> Otherwise, there's no way for drivers to get the non-VXLAN tunnels that
> were created when they weren't up.
>
I plan to send this out as a separate patch.
 
> > diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
> > index c69072b..b6f6beb 100644
> > --- a/include/linux/netdevice.h
> > +++ b/include/linux/netdevice.h
> > + * void (*ndo_add_tunnel_port)(struct  net_device *dev,
> > + *                           sa_family_t sa_family, __be16 port, u32 type);
> > + *     Called by UDP based tunnel modules to notify a driver about a UDP
> > + *     port and socket address family that the tunnel is listening to. It is
> > + *     called only when a new port starts listening. The operation is
> > + *     protected by tunnel socket sock_lock.
> 
> This locking doesn't look safe to me. Each type of tunnel will have its own
> socket lock but the data structures that are used in drivers implementing
> these functions may be shared (such as in your i40e patch). I believe that
> RTNL is currently held by both VXLAN and Geneve when calling this with the
> possible exception of vxlan_get_rx_port().
> In practice, my guess is that it is likely held then too, so it might only require
> checking this and documenting/asserting it.
> 
ASSERT_RTNL happens in dev.c for both, so do you suggest I just document in the code that we have a common data structure and is being guarded by RTNL acquired by the stack.

> I might just call this ndo_add_udp_tunnel_port since that's what it does
> anyways.
>
ok
 
> > diff --git a/include/net/udp_tunnel.h b/include/net/udp_tunnel.h index
> > c491c12..328f3ab 100644
> > --- a/include/net/udp_tunnel.h
> > +++ b/include/net/udp_tunnel.h
> [...]
> > +enum udp_tunnel_type {
> > +       UDP_TUNNEL_NONE,
> > +       UDP_TUNNEL_VXLAN,
> > +       UDP_TUNNEL_GENEVE,
> > +};
> 
> What does UDP_TUNNEL_NONE mean?
It's a way to avoid 0 based
Jesse Gross Sept. 11, 2015, 3:01 a.m. UTC | #3
On Thu, Sep 10, 2015 at 10:36 AM, Singhai, Anjali
<anjali.singhai@intel.com> wrote:
>> -----Original Message-----
>> From: Jesse Gross [mailto:jesse@nicira.com]
>> Sent: Wednesday, September 09, 2015 2:30 PM
>> To: Singhai, Anjali
>> Cc: intel-wired-lan
>> Subject: Re: [Intel-wired-lan] [PATCH v2 2/5] vxlan: Generalize udp based
>> tunnel offload
>>
>> I believe we also need a generalization of vxlan_get_rx_port().
>> Otherwise, there's no way for drivers to get the non-VXLAN tunnels that
>> were created when they weren't up.
>>
> I plan to send this out as a separate patch.

I would probably suggest doing it as part of this series.

By the way - I do support this series and I'm not trying to add more
work to your plate. I'm slightly worried about this being
controversial though and so the more complete the patch is, the better
off you'll be. Plus, I guess you probably need it in order for the
later patches to work correctly in all circumstances.

>> > diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
>> > index c69072b..b6f6beb 100644
>> > --- a/include/linux/netdevice.h
>> > +++ b/include/linux/netdevice.h
>> > + * void (*ndo_add_tunnel_port)(struct  net_device *dev,
>> > + *                           sa_family_t sa_family, __be16 port, u32 type);
>> > + *     Called by UDP based tunnel modules to notify a driver about a UDP
>> > + *     port and socket address family that the tunnel is listening to. It is
>> > + *     called only when a new port starts listening. The operation is
>> > + *     protected by tunnel socket sock_lock.
>>
>> This locking doesn't look safe to me. Each type of tunnel will have its own
>> socket lock but the data structures that are used in drivers implementing
>> these functions may be shared (such as in your i40e patch). I believe that
>> RTNL is currently held by both VXLAN and Geneve when calling this with the
>> possible exception of vxlan_get_rx_port().
>> In practice, my guess is that it is likely held then too, so it might only require
>> checking this and documenting/asserting it.
>>
> ASSERT_RTNL happens in dev.c for both, so do you suggest I just document in the code that we have a common data structure and is being guarded by RTNL acquired by the stack.

I think mostly yes. The part that I would be concerned about is that
vxlan_get_rx_port() also uses this function and that could be called
from different places in drivers. In practice though, I think they are
mostly also protected by RTNL. So I would check the callsites and
assuming they are OK, also document the requirement for
vxlan_get_rx_port() and maybe add an ASSERT_RTNL there.

>> > diff --git a/include/net/udp_tunnel.h b/include/net/udp_tunnel.h index
>> > c491c12..328f3ab 100644
>> > --- a/include/net/udp_tunnel.h
>> > +++ b/include/net/udp_tunnel.h
>> [...]
>> > +enum udp_tunnel_type {
>> > +       UDP_TUNNEL_NONE,
>> > +       UDP_TUNNEL_VXLAN,
>> > +       UDP_TUNNEL_GENEVE,
>> > +};
>>
>> What does UDP_TUNNEL_NONE mean?
> It's a way to avoid 0 based

Sorry, what's the significance of it being zero? These aren't flags so
I'm not sure why zero isn't a valid value.
diff mbox

Patch

diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan.c
index 61b457b..538e846 100644
--- a/drivers/net/vxlan.c
+++ b/drivers/net/vxlan.c
@@ -629,9 +629,9 @@  static void vxlan_notify_add_rx_port(struct vxlan_sock *vs)
 
 	rcu_read_lock();
 	for_each_netdev_rcu(net, dev) {
-		if (dev->netdev_ops->ndo_add_vxlan_port)
-			dev->netdev_ops->ndo_add_vxlan_port(dev, sa_family,
-							    port);
+		if (dev->netdev_ops->ndo_add_tunnel_port)
+			dev->netdev_ops->ndo_add_tunnel_port(dev, sa_family,
+							port, UDP_TUNNEL_VXLAN);
 	}
 	rcu_read_unlock();
 }
@@ -647,9 +647,9 @@  static void vxlan_notify_del_rx_port(struct vxlan_sock *vs)
 
 	rcu_read_lock();
 	for_each_netdev_rcu(net, dev) {
-		if (dev->netdev_ops->ndo_del_vxlan_port)
-			dev->netdev_ops->ndo_del_vxlan_port(dev, sa_family,
-							    port);
+		if (dev->netdev_ops->ndo_del_tunnel_port)
+			dev->netdev_ops->ndo_del_tunnel_port(dev, sa_family,
+							port, UDP_TUNNEL_VXLAN);
 	}
 	rcu_read_unlock();
 
@@ -2379,9 +2379,9 @@  static struct device_type vxlan_type = {
 	.name = "vxlan",
 };
 
-/* Calls the ndo_add_vxlan_port of the caller in order to
+/* Calls the ndo_add_tunnel_port of the caller in order to
  * supply the listening VXLAN udp ports. Callers are expected
- * to implement the ndo_add_vxlan_port.
+ * to implement the ndo_add_tunnel_port.
  */
 void vxlan_get_rx_port(struct net_device *dev)
 {
@@ -2397,8 +2397,8 @@  void vxlan_get_rx_port(struct net_device *dev)
 		hlist_for_each_entry_rcu(vs, &vn->sock_list[i], hlist) {
 			port = inet_sk(vs->sock->sk)->inet_sport;
 			sa_family = vxlan_get_sk_family(vs);
-			dev->netdev_ops->ndo_add_vxlan_port(dev, sa_family,
-							    port);
+			dev->netdev_ops->ndo_add_tunnel_port(dev, sa_family,
+							port, UDP_TUNNEL_VXLAN);
 		}
 	}
 	spin_unlock(&vn->sock_lock);
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index c69072b..b6f6beb 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -1003,18 +1003,18 @@  typedef u16 (*select_queue_fallback_t)(struct net_device *dev,
  *	not implement this, it is assumed that the hw is not able to have
  *	multiple net devices on single physical port.
  *
- * void (*ndo_add_vxlan_port)(struct  net_device *dev,
- *			      sa_family_t sa_family, __be16 port);
- *	Called by vxlan to notiy a driver about the UDP port and socket
- *	address family that vxlan is listnening to. It is called only when
- *	a new port starts listening. The operation is protected by the
- *	vxlan_net->sock_lock.
- *
- * void (*ndo_del_vxlan_port)(struct  net_device *dev,
- *			      sa_family_t sa_family, __be16 port);
- *	Called by vxlan to notify the driver about a UDP port and socket
- *	address family that vxlan is not listening to anymore. The operation
- *	is protected by the vxlan_net->sock_lock.
+ * void (*ndo_add_tunnel_port)(struct  net_device *dev,
+ *			      sa_family_t sa_family, __be16 port, u32 type);
+ *	Called by UDP based tunnel modules to notify a driver about a UDP
+ *	port and socket address family that the tunnel is listening to. It is
+ *	called only when a new port starts listening. The operation is
+ *	protected by tunnel socket sock_lock.
+ *
+ * void (*ndo_del_tunnel_port)(struct  net_device *dev,
+ *			      sa_family_t sa_family, __be16 port, u32 type);
+ *	Called by UDP based tunnel modules to notify the driver about a UDP port
+ *	and socket address family that tunnel is not listening to anymore.
+ *	The operation is protected by tunnel socket sock_lock.
  *
  * void* (*ndo_dfwd_add_station)(struct net_device *pdev,
  *				 struct net_device *dev)
@@ -1204,13 +1204,12 @@  struct net_device_ops {
 							struct netdev_phys_item_id *ppid);
 	int			(*ndo_get_phys_port_name)(struct net_device *dev,
 							  char *name, size_t len);
-	void			(*ndo_add_vxlan_port)(struct  net_device *dev,
+	void			(*ndo_add_tunnel_port)(struct  net_device *dev,
 						      sa_family_t sa_family,
-						      __be16 port);
-	void			(*ndo_del_vxlan_port)(struct  net_device *dev,
+						      __be16 port, u32 type);
+	void			(*ndo_del_tunnel_port)(struct  net_device *dev,
 						      sa_family_t sa_family,
-						      __be16 port);
-
+						      __be16 port, u32 type);
 	void*			(*ndo_dfwd_add_station)(struct net_device *pdev,
 							struct net_device *dev);
 	void			(*ndo_dfwd_del_station)(struct net_device *pdev,
diff --git a/include/net/udp_tunnel.h b/include/net/udp_tunnel.h
index c491c12..328f3ab 100644
--- a/include/net/udp_tunnel.h
+++ b/include/net/udp_tunnel.h
@@ -9,6 +9,12 @@ 
 #include <net/addrconf.h>
 #endif
 
+enum udp_tunnel_type {
+	UDP_TUNNEL_NONE,
+	UDP_TUNNEL_VXLAN,
+	UDP_TUNNEL_GENEVE,
+};
+
 struct udp_port_cfg {
 	u8			family;