diff mbox

veth: delay peer link configuration after interfaces are tied

Message ID 1464520637-19784-1-git-send-email-vincent@bernat.im
State Superseded, archived
Delegated to: David Miller
Headers show

Commit Message

Vincent Bernat May 29, 2016, 11:17 a.m. UTC
When the peer link is created, its "iflink" information is not
advertised through netlink. If a user is maintaining a cache from all
updates, it will miss this information:

    2: veth0@NONE: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default
        link/ether ae:0e:08:af:fb:a0 brd ff:ff:ff:ff:ff:ff
    3: veth1@veth0: <BROADCAST,MULTICAST,M-DOWN> mtu 1500 qdisc noop state DOWN group default
        link/ether 3a:31:f1:36:2e:e5 brd ff:ff:ff:ff:ff:ff

To avoid this situation, the peer link is only configured after both
interfaces are tied together:

    3: veth0@veth1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default
        link/ether ee:0d:80:46:36:fe brd ff:ff:ff:ff:ff:ff
    4: veth1@veth0: <BROADCAST,MULTICAST,M-DOWN> mtu 1500 qdisc noop state DOWN group default
        link/ether ba:25:bc:7a:0d:c8 brd ff:ff:ff:ff:ff:ff

Signed-off-by: Vincent Bernat <vincent@bernat.im>
---
 drivers/net/veth.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

Comments

Nicolas Dichtel May 30, 2016, 9:23 a.m. UTC | #1
Le 29/05/2016 13:17, Vincent Bernat a écrit :
[snip]
> diff --git a/drivers/net/veth.c b/drivers/net/veth.c
> index f37a6e61d4ad..9726c4dbf659 100644
> --- a/drivers/net/veth.c
> +++ b/drivers/net/veth.c
> @@ -432,10 +432,6 @@ static int veth_newlink(struct net *src_net, struct net_device *dev,
>  
>  	netif_carrier_off(peer);
>  
> -	err = rtnl_configure_link(peer, ifmp);
> -	if (err < 0)
> -		goto err_configure_peer;
> -
>  	/*
>  	 * register dev last
>  	 *
> @@ -466,6 +462,10 @@ static int veth_newlink(struct net *src_net, struct net_device *dev,
>  
>  	priv = netdev_priv(peer);
>  	rcu_assign_pointer(priv->peer, dev);
> +
> +	err = rtnl_configure_link(peer, ifmp);
> +	if (err < 0)
> +		goto err_configure_peer;
You should fix the error path. 'unregister_netdevice(dev)' is missing.
Vincent Bernat May 30, 2016, 10:11 a.m. UTC | #2
❦ 30 mai 2016 11:23 CEST, Nicolas Dichtel <nicolas.dichtel@6wind.com> :

>> @@ -466,6 +462,10 @@ static int veth_newlink(struct net *src_net, struct net_device *dev,
>>  
>>  	priv = netdev_priv(peer);
>>  	rcu_assign_pointer(priv->peer, dev);
>> +
>> +	err = rtnl_configure_link(peer, ifmp);
>> +	if (err < 0)
>> +		goto err_configure_peer;

> You should fix the error path. 'unregister_netdevice(dev)' is missing.

I am sending another patch to fix that. I am quite unsure if I do the
right thing here.
Nicolas Dichtel May 30, 2016, 3:19 p.m. UTC | #3
Le 30/05/2016 12:11, Vincent Bernat a écrit :
>  ❦ 30 mai 2016 11:23 CEST, Nicolas Dichtel <nicolas.dichtel@6wind.com> :
> 
>>> @@ -466,6 +462,10 @@ static int veth_newlink(struct net *src_net, struct net_device *dev,
>>>  
>>>  	priv = netdev_priv(peer);
>>>  	rcu_assign_pointer(priv->peer, dev);
>>> +
>>> +	err = rtnl_configure_link(peer, ifmp);
>>> +	if (err < 0)
>>> +		goto err_configure_peer;
> 
>> You should fix the error path. 'unregister_netdevice(dev)' is missing.
> 
> I am sending another patch to fix that. I am quite unsure if I do the
> right thing here.
> 
A less intrusive fix is to call 'rtmsg_ifinfo(RTM_NEWLINK, peer, ~0U,
GFP_KERNEL);' a the end of veth_newlink().
Vincent Bernat May 30, 2016, 3:26 p.m. UTC | #4
❦ 30 mai 2016 17:19 CEST, Nicolas Dichtel <nicolas.dichtel@6wind.com> :

>>>>  	priv = netdev_priv(peer);
>>>>  	rcu_assign_pointer(priv->peer, dev);
>>>> +
>>>> +	err = rtnl_configure_link(peer, ifmp);
>>>> +	if (err < 0)
>>>> +		goto err_configure_peer;
>> 
>>> You should fix the error path. 'unregister_netdevice(dev)' is missing.
>> 
>> I am sending another patch to fix that. I am quite unsure if I do the
>> right thing here.
>> 
> A less intrusive fix is to call 'rtmsg_ifinfo(RTM_NEWLINK, peer, ~0U,
> GFP_KERNEL);' a the end of veth_newlink().

I did that at first. Maybe this would make more sense to do
that. Otherwise, the first message contains an iflink value that we
cannot resolve with just the received netlink messages (since the
information is in the next netlink message). "ip monitor" seems to be
able to get the info, but I suppose it does an additional
lookup.
Nicolas Dichtel May 30, 2016, 3:47 p.m. UTC | #5
Le 30/05/2016 17:26, Vincent Bernat a écrit :
>  ❦ 30 mai 2016 17:19 CEST, Nicolas Dichtel <nicolas.dichtel@6wind.com> :
> 
>>>>>  	priv = netdev_priv(peer);
>>>>>  	rcu_assign_pointer(priv->peer, dev);
>>>>> +
>>>>> +	err = rtnl_configure_link(peer, ifmp);
>>>>> +	if (err < 0)
>>>>> +		goto err_configure_peer;
>>>
>>>> You should fix the error path. 'unregister_netdevice(dev)' is missing.
>>>
>>> I am sending another patch to fix that. I am quite unsure if I do the
>>> right thing here.
>>>
>> A less intrusive fix is to call 'rtmsg_ifinfo(RTM_NEWLINK, peer, ~0U,
>> GFP_KERNEL);' a the end of veth_newlink().
> 
> I did that at first. Maybe this would make more sense to do
> that. Otherwise, the first message contains an iflink value that we
> cannot resolve with just the received netlink messages (since the
> information is in the next netlink message). "ip monitor" seems to be
> able to get the info, but I suppose it does an additional
> lookup.
> 
Yes, it's a chicken and egg problem ;-)
I think that the first message with an iflink set to '0' is not a problem if a
second one update this value. Daemons that listen to those rtnl messages must
always update their caches. When the peer is put in another netns, its ifindex
may change again.
diff mbox

Patch

diff --git a/drivers/net/veth.c b/drivers/net/veth.c
index f37a6e61d4ad..9726c4dbf659 100644
--- a/drivers/net/veth.c
+++ b/drivers/net/veth.c
@@ -432,10 +432,6 @@  static int veth_newlink(struct net *src_net, struct net_device *dev,
 
 	netif_carrier_off(peer);
 
-	err = rtnl_configure_link(peer, ifmp);
-	if (err < 0)
-		goto err_configure_peer;
-
 	/*
 	 * register dev last
 	 *
@@ -466,6 +462,10 @@  static int veth_newlink(struct net *src_net, struct net_device *dev,
 
 	priv = netdev_priv(peer);
 	rcu_assign_pointer(priv->peer, dev);
+
+	err = rtnl_configure_link(peer, ifmp);
+	if (err < 0)
+		goto err_configure_peer;
 	return 0;
 
 err_register_dev: