diff mbox

net: clear iflink when moving to a new netns

Message ID 1392162690-6647-1-git-send-email-xiyou.wangcong@gmail.com
State Rejected, archived
Delegated to: David Miller
Headers show

Commit Message

Cong Wang Feb. 11, 2014, 11:51 p.m. UTC
From: Cong Wang <cwang@twopensource.com>

BZ: https://bugzilla.kernel.org/show_bug.cgi?id=66691

macvlan and vlan both use iflink to identify its lower device,
however, after such device is moved to the new netns, its iflink
would become meaningless as ifindex is per netns. So, instead of
forbid them moving to another netns, just clear this field so that
it will not be dumped at least.

Cc: David S. Miller <davem@davemloft.net>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>,
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Cong Wang <cwang@twopensource.com>
---
 net/core/dev.c | 9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

Comments

Nicolas Dichtel Feb. 12, 2014, 3:43 p.m. UTC | #1
Le 12/02/2014 00:51, Cong Wang a écrit :
> From: Cong Wang <cwang@twopensource.com>
>
> BZ: https://bugzilla.kernel.org/show_bug.cgi?id=66691
>
> macvlan and vlan both use iflink to identify its lower device,
> however, after such device is moved to the new netns, its iflink
> would become meaningless as ifindex is per netns. So, instead of
> forbid them moving to another netns, just clear this field so that
> it will not be dumped at least.
>
> Cc: David S. Miller <davem@davemloft.net>
> Cc: Eric W. Biederman <ebiederm@xmission.com>
> Cc: Eric Dumazet <eric.dumazet@gmail.com>
> Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>,
> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
> Signed-off-by: Cong Wang <cwang@twopensource.com>
I wonder if this patch breaks things in ip tunnels.
For example, ip6_tunnel uses iflink to find tunnels that are bound to an interface.
If you reset this field, ipip6_tunnel_lookup() will fail when the tunnel moves
to another netns.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Stephen Hemminger Feb. 12, 2014, 4:33 p.m. UTC | #2
On Tue, 11 Feb 2014 15:51:28 -0800
Cong Wang <xiyou.wangcong@gmail.com> wrote:

> From: Cong Wang <cwang@twopensource.com>
> 
> BZ: https://bugzilla.kernel.org/show_bug.cgi?id=66691
> 
> macvlan and vlan both use iflink to identify its lower device,
> however, after such device is moved to the new netns, its iflink
> would become meaningless as ifindex is per netns. So, instead of
> forbid them moving to another netns, just clear this field so that
> it will not be dumped at least.
> 
> Cc: David S. Miller <davem@davemloft.net>
> Cc: Eric W. Biederman <ebiederm@xmission.com>
> Cc: Eric Dumazet <eric.dumazet@gmail.com>
> Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>,
> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
> Signed-off-by: Cong Wang <cwang@twopensource.com>
> ---
>  net/core/dev.c | 9 ++++-----
>  1 file changed, 4 insertions(+), 5 deletions(-)
> 
> diff --git a/net/core/dev.c b/net/core/dev.c
> index 4ad1b78..5e88b0c2 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -6608,12 +6608,11 @@ int dev_change_net_namespace(struct net_device *dev, struct net *net, const char
>  	dev_net_set(dev, net);
>  
>  	/* If there is an ifindex conflict assign a new one */
> -	if (__dev_get_by_index(net, dev->ifindex)) {
> -		int iflink = (dev->iflink == dev->ifindex);
> +	if (__dev_get_by_index(net, dev->ifindex))
>  		dev->ifindex = dev_new_index(net);
> -		if (iflink)
> -			dev->iflink = dev->ifindex;
> -	}
> +
> +	/* Old iflink is meaningless in the new namespace */
> +	dev->iflink = dev->ifindex;
>  
>  	/* Send a netdev-add uevent to the new namespace */
>  	kobject_uevent(&dev->dev.kobj, KOBJ_ADD);

This also breaks propogation of state changes from lower device
to upper device. Things like carrier and up/down.

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Ben Hutchings Feb. 12, 2014, 11:18 p.m. UTC | #3
On Tue, 2014-02-11 at 15:51 -0800, Cong Wang wrote:
> From: Cong Wang <cwang@twopensource.com>
> 
> BZ: https://bugzilla.kernel.org/show_bug.cgi?id=66691
> 
> macvlan and vlan both use iflink to identify its lower device,
> however, after such device is moved to the new netns, its iflink
> would become meaningless as ifindex is per netns. So, instead of
> forbid them moving to another netns, just clear this field so that
> it will not be dumped at least.
[...]

And what if it's moved back into the same netns?

I think iflink should be changed to a net_device pointer, so it remains
valid for in-kernel users but rtnetlink can do the netns check before
revealing it to userland.

Ben.
Cong Wang Feb. 13, 2014, 1:18 a.m. UTC | #4
On Wed, Feb 12, 2014 at 7:43 AM, Nicolas Dichtel
<nicolas.dichtel@6wind.com> wrote:
> Le 12/02/2014 00:51, Cong Wang a écrit :
>
>> From: Cong Wang <cwang@twopensource.com>
>>
>> BZ: https://bugzilla.kernel.org/show_bug.cgi?id=66691
>>
>> macvlan and vlan both use iflink to identify its lower device,
>> however, after such device is moved to the new netns, its iflink
>> would become meaningless as ifindex is per netns. So, instead of
>> forbid them moving to another netns, just clear this field so that
>> it will not be dumped at least.
>>
>> Cc: David S. Miller <davem@davemloft.net>
>> Cc: Eric W. Biederman <ebiederm@xmission.com>
>> Cc: Eric Dumazet <eric.dumazet@gmail.com>
>> Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>,
>> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
>> Signed-off-by: Cong Wang <cwang@twopensource.com>
>
> I wonder if this patch breaks things in ip tunnels.
> For example, ip6_tunnel uses iflink to find tunnels that are bound to an
> interface.
> If you reset this field, ipip6_tunnel_lookup() will fail when the tunnel
> moves
> to another netns.

Most tunnels set NETIF_F_NETNS_LOCAL, ip6_tunnel should set it too
(need a patch). So this is not a problem.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Cong Wang Feb. 13, 2014, 1:20 a.m. UTC | #5
On Wed, Feb 12, 2014 at 8:33 AM, Stephen Hemminger
<stephen@networkplumber.org> wrote:
>
> This also breaks propogation of state changes from lower device
> to upper device. Things like carrier and up/down.
>

macvlan_device_event() handles this pretty well by using a pointer to
net_device.
I don't see it is a problem from the code.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Cong Wang Feb. 13, 2014, 1:34 a.m. UTC | #6
On Wed, Feb 12, 2014 at 3:18 PM, Ben Hutchings <ben@decadent.org.uk> wrote:
> On Tue, 2014-02-11 at 15:51 -0800, Cong Wang wrote:
>> From: Cong Wang <cwang@twopensource.com>
>>
>> BZ: https://bugzilla.kernel.org/show_bug.cgi?id=66691
>>
>> macvlan and vlan both use iflink to identify its lower device,
>> however, after such device is moved to the new netns, its iflink
>> would become meaningless as ifindex is per netns. So, instead of
>> forbid them moving to another netns, just clear this field so that
>> it will not be dumped at least.
> [...]
>
> And what if it's moved back into the same netns?
>

Good point!

> I think iflink should be changed to a net_device pointer, so it remains
> valid for in-kernel users but rtnetlink can do the netns check before
> revealing it to userland.
>

I will try this approach.

Thanks.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Eric W. Biederman Feb. 13, 2014, 2 a.m. UTC | #7
Cong Wang <cwang@twopensource.com> writes:

> On Wed, Feb 12, 2014 at 7:43 AM, Nicolas Dichtel
> <nicolas.dichtel@6wind.com> wrote:
>> Le 12/02/2014 00:51, Cong Wang a écrit :
>>
>>> From: Cong Wang <cwang@twopensource.com>
>>>
>>> BZ: https://bugzilla.kernel.org/show_bug.cgi?id=66691
>>>
>>> macvlan and vlan both use iflink to identify its lower device,
>>> however, after such device is moved to the new netns, its iflink
>>> would become meaningless as ifindex is per netns. So, instead of
>>> forbid them moving to another netns, just clear this field so that
>>> it will not be dumped at least.
>>>
>>> Cc: David S. Miller <davem@davemloft.net>
>>> Cc: Eric W. Biederman <ebiederm@xmission.com>
>>> Cc: Eric Dumazet <eric.dumazet@gmail.com>
>>> Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>,
>>> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
>>> Signed-off-by: Cong Wang <cwang@twopensource.com>
>>
>> I wonder if this patch breaks things in ip tunnels.
>> For example, ip6_tunnel uses iflink to find tunnels that are bound to an
>> interface.
>> If you reset this field, ipip6_tunnel_lookup() will fail when the tunnel
>> moves
>> to another netns.
>
> Most tunnels set NETIF_F_NETNS_LOCAL, ip6_tunnel should set it too
> (need a patch). So this is not a problem.

There was an effort not long ago to make tunnels safe to pass between
namespaces.  NETIF_F_NETNS_LOCAL was removed from ip6_tunnel in that
effort.  Apparently something was overlooked.

Making iflink a netdevice reference or finding a way to remove it
entirely seems better that masking the problem.

Eric
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Eric W. Biederman Feb. 13, 2014, 2:01 a.m. UTC | #8
Cong Wang <cwang@twopensource.com> writes:

> On Wed, Feb 12, 2014 at 8:33 AM, Stephen Hemminger
> <stephen@networkplumber.org> wrote:
>>
>> This also breaks propogation of state changes from lower device
>> to upper device. Things like carrier and up/down.
>>
>
> macvlan_device_event() handles this pretty well by using a pointer to
> net_device.
> I don't see it is a problem from the code.

A quick grep shows two uses of iflink in net/core/link_watch.c that are
likely broken by your proposed change.

Eric

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Cong Wang Feb. 13, 2014, 10:44 p.m. UTC | #9
On Wed, Feb 12, 2014 at 6:00 PM, Eric W. Biederman
<ebiederm@xmission.com> wrote:
>
> There was an effort not long ago to make tunnels safe to pass between
> namespaces.  NETIF_F_NETNS_LOCAL was removed from ip6_tunnel in that
> effort.  Apparently something was overlooked.
>
> Making iflink a netdevice reference or finding a way to remove it
> entirely seems better that masking the problem.
>

Yeah, looks like the trend is allowing more stacked devices to move to
a new netns.
Sounds good. I am working on a draft patch now.

Thanks.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/net/core/dev.c b/net/core/dev.c
index 4ad1b78..5e88b0c2 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -6608,12 +6608,11 @@  int dev_change_net_namespace(struct net_device *dev, struct net *net, const char
 	dev_net_set(dev, net);
 
 	/* If there is an ifindex conflict assign a new one */
-	if (__dev_get_by_index(net, dev->ifindex)) {
-		int iflink = (dev->iflink == dev->ifindex);
+	if (__dev_get_by_index(net, dev->ifindex))
 		dev->ifindex = dev_new_index(net);
-		if (iflink)
-			dev->iflink = dev->ifindex;
-	}
+
+	/* Old iflink is meaningless in the new namespace */
+	dev->iflink = dev->ifindex;
 
 	/* Send a netdev-add uevent to the new namespace */
 	kobject_uevent(&dev->dev.kobj, KOBJ_ADD);