Message ID | 1439966648-26195-2-git-send-email-vincent@bernat.im |
---|---|
State | Changes Requested, archived |
Delegated to: | David Miller |
Headers | show |
On Wed, 19 Aug 2015 08:44:08 +0200, Vincent Bernat wrote: > While the documentation doesn't say exactly what kind of relationship > iflink should represent, until a45253, only lower devices were > advertised this way. While veth cannot have a lower device, using iflink > to advertise the peer may create infinite loops in programs using iflink > to discover device topology. > > Instead of advertising the peer link with iflink, a symbolic link "peer" > is added to each peer. By removing veth_get_iflink, you're also stopping IFLA_LINK being advertised in netlink messages, which consequently makes impossible to reliably match veth peers across name spaces again. This would be a huge step backwards. Jiri
❦ 19 août 2015 13:00 +0200, Jiri Benc <jbenc@redhat.com> : >> While the documentation doesn't say exactly what kind of relationship >> iflink should represent, until a45253, only lower devices were >> advertised this way. While veth cannot have a lower device, using iflink >> to advertise the peer may create infinite loops in programs using iflink >> to discover device topology. >> >> Instead of advertising the peer link with iflink, a symbolic link "peer" >> is added to each peer. > > By removing veth_get_iflink, you're also stopping IFLA_LINK being > advertised in netlink messages, which consequently makes impossible to > reliably match veth peers across name spaces again. This would be a > huge step backwards. That's the main goal of this patch: advertising the peer link as IFLA_LINK attribute triggers an infinite loop in userland software when they follow iflink to discover network devices topology. iflink has always been the index of a lower device. If a sysfs symbolic link is not good enough, I can propose a new IFLA_PEER attribute instead.
On Wed, 19 Aug 2015 14:13:00 +0200, Vincent Bernat wrote: > That's the main goal of this patch: advertising the peer link as > IFLA_LINK attribute triggers an infinite loop in userland software when > they follow iflink to discover network devices topology. iflink has > always been the index of a lower device. If a sysfs symbolic link is not > good enough, I can propose a new IFLA_PEER attribute instead. This would cause regression and break applications for those of us who started relying on the netnsid feature to match interfaces across net name spaces. This is tough. If you're going to do such thing, you would at least need to also introduce IFLA_PEER_NETNSID. Jiri
❦ 19 août 2015 14:38 +0200, Jiri Benc <jbenc@redhat.com> : >> That's the main goal of this patch: advertising the peer link as >> IFLA_LINK attribute triggers an infinite loop in userland software when >> they follow iflink to discover network devices topology. iflink has >> always been the index of a lower device. If a sysfs symbolic link is not >> good enough, I can propose a new IFLA_PEER attribute instead. > > This would cause regression and break applications for those of us who > started relying on the netnsid feature to match interfaces across net > name spaces. Yes. Unfortunately. > This is tough. If you're going to do such thing, you would at least > need to also introduce IFLA_PEER_NETNSID. Yes I can. In my opinion, the change of semantics of IFLA_LINK is a break of API. However, I can live with it since it's easy to workaround it. It just seemed easier to start the discussion with a patch.
Le 19/08/2015 14:48, Vincent Bernat a écrit : > ❦ 19 août 2015 14:38 +0200, Jiri Benc <jbenc@redhat.com> : > >>> That's the main goal of this patch: advertising the peer link as >>> IFLA_LINK attribute triggers an infinite loop in userland software when >>> they follow iflink to discover network devices topology. iflink has >>> always been the index of a lower device. If a sysfs symbolic link is not >>> good enough, I can propose a new IFLA_PEER attribute instead. >> >> This would cause regression and break applications for those of us who >> started relying on the netnsid feature to match interfaces across net >> name spaces. > > Yes. Unfortunately. > >> This is tough. If you're going to do such thing, you would at least >> need to also introduce IFLA_PEER_NETNSID. Probably better to introduce veth netlink attribute then, something like IFLA_VETH_PEER and keeps IFLA_LINK_NETNSID. > > Yes I can. > > In my opinion, the change of semantics of IFLA_LINK is a break of > API. However, I can live with it since it's easy to workaround it. It > just seemed easier to start the discussion with a patch. > I also don't know what is the best way to handle this. veth advertises its peer via IFLA_LINK since 4.1, so it's too late to change it for this release. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, 19 Aug 2015 18:33:14 +0200, Nicolas Dichtel wrote: > Probably better to introduce veth netlink attribute then, something like > IFLA_VETH_PEER and keeps IFLA_LINK_NETNSID. I'd prefer IFLA_PEER. More generic attribute will be helpful should we introduce an interface similar to veth in the future. Also, I'd not combine IFLA_LINK_NETNSID with IFLA_PEER. There might very well be an interface in the future that will need both IFLA_LINK and IFLA_PEER and this would just create a confusion. It may be unlikely but the attributes are cheap and it doesn't make sense to design uAPI in a way that might bring problems in the future. > I also don't know what is the best way to handle this. veth advertises > its peer via IFLA_LINK since 4.1, so it's too late to change it for this > release. Apparently we need to pick our poison. Either way, we break something. Jiri
Le 20/08/2015 13:53, Jiri Benc a écrit : > On Wed, 19 Aug 2015 18:33:14 +0200, Nicolas Dichtel wrote: >> Probably better to introduce veth netlink attribute then, something like >> IFLA_VETH_PEER and keeps IFLA_LINK_NETNSID. > > I'd prefer IFLA_PEER. More generic attribute will be helpful should we > introduce an interface similar to veth in the future.s Ok. > > Also, I'd not combine IFLA_LINK_NETNSID with IFLA_PEER. There might > very well be an interface in the future that will need both IFLA_LINK and > IFLA_PEER and this would just create a confusion. It may be unlikely > but the attributes are cheap and it doesn't make sense to design uAPI > in a way that might bring problems in the future. Ok, but then this IFLA_PEER can include the ifindex and the nsid. No need to have two new attributes. > >> I also don't know what is the best way to handle this. veth advertises >> its peer via IFLA_LINK since 4.1, so it's too late to change it for this >> release. > > Apparently we need to pick our poison. Either way, we break something. Sure. I would prefer to have the same mechanism in all version, but I can live with the other solution. David, any thoughts about this? -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
From: Nicolas Dichtel <nicolas.dichtel@6wind.com> Date: Thu, 20 Aug 2015 16:31:11 +0200 > Le 20/08/2015 13:53, Jiri Benc a écrit : >> On Wed, 19 Aug 2015 18:33:14 +0200, Nicolas Dichtel wrote: >>> Probably better to introduce veth netlink attribute then, something >>> like >>> IFLA_VETH_PEER and keeps IFLA_LINK_NETNSID. >> >> I'd prefer IFLA_PEER. More generic attribute will be helpful should we >> introduce an interface similar to veth in the future.s > Ok. > >> >> Also, I'd not combine IFLA_LINK_NETNSID with IFLA_PEER. There might >> very well be an interface in the future that will need both IFLA_LINK >> and >> IFLA_PEER and this would just create a confusion. It may be unlikely >> but the attributes are cheap and it doesn't make sense to design uAPI >> in a way that might bring problems in the future. > Ok, but then this IFLA_PEER can include the ifindex and the nsid. No > need > to have two new attributes. > >> >>> I also don't know what is the best way to handle this. veth advertises >>> its peer via IFLA_LINK since 4.1, so it's too late to change it for >>> this >>> release. >> >> Apparently we need to pick our poison. Either way, we break something. > Sure. I would prefer to have the same mechanism in all version, but I > can > live with the other solution. > > David, any thoughts about this? You can't break the 4.1 semantics, it's in a released kernel and people _ARE_ using it. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
❦ 20 août 2015 14:07 -0700, David Miller <davem@davemloft.net> : >>>> I also don't know what is the best way to handle this. veth advertises >>>> its peer via IFLA_LINK since 4.1, so it's too late to change it for >>>> this >>>> release. >>> >>> Apparently we need to pick our poison. Either way, we break something. >> Sure. I would prefer to have the same mechanism in all version, but I >> can live with the other solution. >> >> David, any thoughts about this? > > You can't break the 4.1 semantics, it's in a released kernel and people > _ARE_ using it. I had a look at what other kind of daemons may exploit the pre-4.1 semantics (of not having an infinite loop when following iflink) and failed to find any other users than "lldpd". Other LLDP daemons (lldpad, ladvd, openlldpd) have other ways to find the lower interface. I would also have thought that NetSNMP would use it to implement ifStackTable but it doesn't in fact implement this table.
diff --git a/drivers/net/veth.c b/drivers/net/veth.c index c8186ffda1a3..47f165bc3107 100644 --- a/drivers/net/veth.c +++ b/drivers/net/veth.c @@ -105,6 +105,17 @@ static const struct ethtool_ops veth_ethtool_ops = { .get_ethtool_stats = veth_get_ethtool_stats, }; +static int veth_peer_sysfs_add(struct net_device *dev, + struct net_device *peer_dev) +{ + return sysfs_create_link(&(dev->dev.kobj), &(peer_dev->dev.kobj), + "peer"); +} +static void veth_peer_sysfs_del(struct net_device *dev) +{ + sysfs_remove_link(&(dev->dev.kobj), "peer"); +} + static netdev_tx_t veth_xmit(struct sk_buff *skb, struct net_device *dev) { struct veth_priv *priv = netdev_priv(dev); @@ -263,20 +274,6 @@ static void veth_poll_controller(struct net_device *dev) } #endif /* CONFIG_NET_POLL_CONTROLLER */ -static int veth_get_iflink(const struct net_device *dev) -{ - struct veth_priv *priv = netdev_priv(dev); - struct net_device *peer; - int iflink; - - rcu_read_lock(); - peer = rcu_dereference(priv->peer); - iflink = peer ? peer->ifindex : 0; - rcu_read_unlock(); - - return iflink; -} - static const struct net_device_ops veth_netdev_ops = { .ndo_init = veth_dev_init, .ndo_open = veth_open, @@ -289,7 +286,6 @@ static const struct net_device_ops veth_netdev_ops = { #ifdef CONFIG_NET_POLL_CONTROLLER .ndo_poll_controller = veth_poll_controller, #endif - .ndo_get_iflink = veth_get_iflink, }; #define VETH_FEATURES (NETIF_F_SG | NETIF_F_FRAGLIST | NETIF_F_ALL_TSO | \ @@ -436,6 +432,13 @@ static int veth_newlink(struct net *src_net, struct net_device *dev, netif_carrier_off(dev); + err = veth_peer_sysfs_add(dev, peer); + if (err < 0) + goto err_configure_dev; + err = veth_peer_sysfs_add(peer, dev); + if (err < 0) + goto err_configure_dev; + /* * tie the deviced together */ @@ -447,9 +450,13 @@ static int veth_newlink(struct net *src_net, struct net_device *dev, rcu_assign_pointer(priv->peer, dev); return 0; +err_configure_dev: + veth_peer_sysfs_del(dev); + veth_peer_sysfs_del(peer); err_register_dev: /* nothing to do */ err_configure_peer: + veth_peer_sysfs_del(dev); unregister_netdevice(peer); return err; @@ -466,6 +473,9 @@ static void veth_dellink(struct net_device *dev, struct list_head *head) priv = netdev_priv(dev); peer = rtnl_dereference(priv->peer); + veth_peer_sysfs_del(dev); + if (peer) veth_peer_sysfs_del(dev); + /* Note : dellink() is called from default_device_exit_batch(), * before a rcu_synchronize() point. The devices are guaranteed * not being freed before one RCU grace period.
While the documentation doesn't say exactly what kind of relationship iflink should represent, until a45253, only lower devices were advertised this way. While veth cannot have a lower device, using iflink to advertise the peer may create infinite loops in programs using iflink to discover device topology. Instead of advertising the peer link with iflink, a symbolic link "peer" is added to each peer. Signed-off-by: Vincent Bernat <vincent@bernat.im> --- drivers/net/veth.c | 40 +++++++++++++++++++++++++--------------- 1 file changed, 25 insertions(+), 15 deletions(-)