macvlan: Support creating macvlans from macvlans

Message ID m1ocwfsgw7.fsf@fess.ebiederm.org
State Accepted, archived
Delegated to: David Miller

Commit Message

Eric W. Biederman March 5, 2009, 11:12 p.m. UTC
When running in a network namespace whose only link to
the outside world is a macvlan device, not being
able to create another macvlan is a real pain.

So modify macvlan creation to automatically forward the
creation of a macvlan on a macvlan, so that it becomes the creation
of a macvlan on the underlying network device.

Signed-off-by: Eric Biederman <ebiederm@aristanetworks.com>
---
 drivers/net/macvlan.c |   11 ++++++-----
 1 files changed, 6 insertions(+), 5 deletions(-)

Comments

Patrick McHardy March 6, 2009, 1:54 p.m. UTC | #1
Eric W. Biederman wrote:
> When running in a network namespace whose only link to
> the outside world is a macvlan device, not being
> able to create another macvlan is a real pain.
> 
> So modify macvlan creation to automatically forward the
> creation of a macvlan on a macvlan, so that it becomes the creation
> of a macvlan on the underlying network device.

I'm not sure I understand the constellation, what is the underlying
device in this case? A device outside the namespace?

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Eric W. Biederman March 6, 2009, 2:17 p.m. UTC | #2
Patrick McHardy <kaber@trash.net> writes:

> Eric W. Biederman wrote:
>> When running in a network namespace whose only link to
>> the outside world is a macvlan device, not being
>> able to create another macvlan is a real pain.
>>
>> So modify macvlan creation to automatically forward the
>> creation of a macvlan on a macvlan, so that it becomes the creation
>> of a macvlan on the underlying network device.
>
> I'm not sure I understand the constellation, what is the underlying
> device in this case? A device outside the namespace?

Yes.

Typical usage would be:

eth0 in the initial namespace.
A macvlan off of eth0 in each child namespace.

Which works fine until I do things like create a network namespace
when I am already inside of a network namespace.  A child of a child.
In which case I have to start rigging up something like a pair of
veths and bridging or routing to get outside connectivity.

Or roughly:
ip link add mv0 link eth0 type macvlan
ip link add mv1 link eth0 type macvlan
ip link set mv0 netns 1234
ip link set mv1 netns 6789

Then later I would find it very handy to do:
echo $$ -> 1234
ip link add mv3 link mv0 type macvlan
ip link set mv3 netns 101112

The fact that we only bridge traffic on the ingress from the external
world is also a pain, but that isn't trivial to fix, and fixing
it might possibly break valid macvlan users.

Eric
Patrick McHardy March 6, 2009, 2:25 p.m. UTC | #3
Eric W. Biederman wrote:
> Patrick McHardy <kaber@trash.net> writes:
> 
>>> So modify macvlan creation to automatically forward the
>>> creation of a macvlan on a macvlan, so that it becomes the creation
>>> of a macvlan on the underlying network device.
>> I'm not sure I understand the constellation, what is the underlying
>> device in this case? A device outside the namespace?
> 
> Yes.
> 
> Typical usage would be:
> 
> eth0 in the initial namespace.
> A macvlan off of eth0 in each child namespace.
> 
> Which works fine until I do things like create a network namespace
> when I am already inside of a network namespace.  A child of a child.
> In which case I have to start rigging up something like a pair of
> veths and bridging or routing to get outside connectivity.
> 
> Or roughly:
> ip link add mv0 link eth0 type macvlan
> ip link add mv1 link eth0 type macvlan
> ip link set mv0 netns 1234
> ip link set mv1 netns 6789
> 
> Then later I would find it very handy to do:
> echo $$ -> 1234
> ip link add mv3 link mv0 type macvlan
> ip link set mv3 netns 101112

That makes sense of course. I'm mainly wondering whether a namespace
should be able to directly affect the real device like this. This
might move it to promiscuous mode, or affect other performance-relevant
settings. It also looks like you can steal the MAC address of a
different macvlan device this way and have the packets directed to you
(new devices are added to the beginning of the hash chains, so they
are found first on lookups).
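
The lookup-order concern above can be illustrated with a minimal userspace sketch. It is not the driver code: `chain_entry`, `chain_add_head` and `chain_lookup` are hypothetical names, and the model assumes only what the paragraph states, namely that new devices go to the head of a hash chain and that a lookup returns the first match.

```c
#include <stddef.h>
#include <string.h>

#define MAC_LEN 6

/* Hypothetical stand-in for one macvlan sitting on a single hash chain. */
struct chain_entry {
	const char *owner;
	unsigned char mac[MAC_LEN];
	struct chain_entry *next;
};

/* New entries are linked in at the head of the chain. */
static void chain_add_head(struct chain_entry **head, struct chain_entry *e)
{
	e->next = *head;
	*head = e;
}

/* Return the first entry whose MAC matches, as a receive path would. */
static struct chain_entry *chain_lookup(struct chain_entry *head,
					const unsigned char *mac)
{
	for (struct chain_entry *e = head; e != NULL; e = e->next)
		if (memcmp(e->mac, mac, MAC_LEN) == 0)
			return e;
	return NULL;
}
```

In this model, a second entry added later with the same MAC shadows the first one: every lookup for that address now returns the newcomer.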
Eric W. Biederman March 6, 2009, 3:03 p.m. UTC | #4
Patrick McHardy <kaber@trash.net> writes:

> That makes sense of course. I'm mainly wondering whether a namespace
> should be able to directly affect the real device like this. This
> might move it to promiscuous mode, or affect other performance-relevant
> settings. It also looks like you can steal the MAC address of a
> different macvlan device this way and have the packets directed to you
> (new devices are added to the beginning of the hash chains, so they
> are found first on lookups).

To a large extent those are things that we already can do, simply by
having multiple macvlans in different network namespaces.  I could
push it into promiscuous mode by adding more multicast listeners,
and I could steal the mac address of another macvlan by changing
my mac address if I happen to come first in the hash chain.

Hmm.  Actually that appears to be a macvlan bug.  It looks like if I
change the macaddress on a macvlan we don't update the hash chains.
So unless we have the same low byte we will be on the wrong hash chain
and not receive the packets for the mac we specified.  Ouch!

It is also trivial to spoof a different macvlan device by using
PF_PACKET and sending packets with the source mac address of
another macvlan.

Also this still requires CAP_NET_ADMIN, as much as I would like
to remove that restriction.

Your concerns don't appear to be new to allowing the creation
of a macvlan from a macvlan or fundamental to creating
a macvlan from a macvlan.  You still must have access to at
least a macvlan in your namespace to create a new one.  So
I don't think those issues bear on my patch.



That said I am not opposed conceptually to something that is much
harder to abuse, and works better for the network namespace case.

Eric
Patrick McHardy March 6, 2009, 3:08 p.m. UTC | #5
Eric W. Biederman wrote:
> Patrick McHardy <kaber@trash.net> writes:
> 
>> That makes sense of course. I'm mainly wondering whether a namespace
>> should be able to directly affect the real device like this. This
>> might move it to promiscuous mode, or affect other performance-relevant
>> settings. It also looks like you can steal the MAC address of a
>> different macvlan device this way and have the packets directed to you
>> (new devices are added to the beginning of the hash chains, so they
>> are found first on lookups).
> 
> To a large extent those are things that we already can do, simply by
> having multiple macvlans in different network namespaces.  I could
> push it into promiscuous mode by adding more multicast listeners,
> and I could steal the mac address of another macvlan by changing
> my mac address if I happen to come first in the hash chain.
> 
> Hmm.  Actually that appears to be a macvlan bug.  It looks like if I
> change the macaddress on a macvlan we don't update the hash chains.
> So unless we have the same low byte we will be on the wrong hash chain
> and not receive the packets for the mac we specified.  Ouch!

The address can only be changed while the device is down and unhashed.

> It is also trivial to spoof a different macvlan device by using
> PF_PACKET and sending packets with the source mac address of
> another macvlan.

Yes, but that doesn't allow one namespace to deny service to
a different one.

> Also this still requires CAP_NET_ADMIN, as much as I would like
> to remove that restriction.
> 
> Your concerns don't appear to be new to allowing the creation
> of a macvlan from a macvlan or fundamental to creating
> a macvlan from a macvlan.  You still must have access to at
> least a macvlan in your namespace to create a new one.  So
> I don't think those issues bear on my patch.

No, they're not, but it seemed worth pointing out. Your patch
looks perfectly fine.

Acked-by: Patrick McHardy <kaber@trash.net>
Eric W. Biederman March 6, 2009, 3:24 p.m. UTC | #6
Patrick McHardy <kaber@trash.net> writes:

> Eric W. Biederman wrote:
>> Patrick McHardy <kaber@trash.net> writes:
>>
>>> That makes sense of course. I'm mainly wondering whether a namespace
>>> should be able to directly affect the real device like this. This
>>> might move it to promiscuous mode, or affect other performance-relevant
>>> settings. It also looks like you can steal the MAC address of a
>>> different macvlan device this way and have the packets directed to you
>>> (new devices are added to the beginning of the hash chains, so they
>>> are found first on lookups).
>>
>> To a large extent those are things that we already can do, simply by
>> having multiple macvlans in different network namespaces.  I could
>> push it into promiscuous mode by adding more multicast listeners,
>> and I could steal the mac address of another macvlan by changing
>> my mac address if I happen to come first in the hash chain.
>>
>> Hmm.  Actually that appears to be a macvlan bug.  It looks like if I
>> change the macaddress on a macvlan we don't update the hash chains.
>> So unless we have the same low byte we will be on the wrong hash chain
>> and not receive the packets for the mac we specified.  Ouch!
>
> The address can only be changed while the device is down and unhashed.

Point.  The dev_unicast/dev_unicast_delete in macvlan_set_mac_address
appears to be completely unnecessary then.

>> It is also trivial to spoof a different macvlan device by using
>> PF_PACKET and sending packets with the source mac address of
>> another macvlan.
>
> Yes, but that doesn't allow one namespace to deny service to
> a different one.

No, for that I guess I just need to down the interface,
change the mac, and up the interface again.

>> Also this still requires CAP_NET_ADMIN, as much as I would like
>> to remove that restriction.
>>
>> Your concerns don't appear to be new to allowing the creation
>> of a macvlan from a macvlan or fundamental to creating
>> a macvlan from a macvlan.  You still must have access to at
>> least a macvlan in your namespace to create a new one.  So
>> I don't think those issues bear on my patch.
>
> No, they're not, but it seemed worth pointing out. Your patch
> looks perfectly fine.

Thanks.

Would you be opposed to changes that made macvlan more robust,
such as refusing to come up if the mac address is already in use,
and perhaps denying the sending of packets with the wrong source
mac?

> Acked-by: Patrick McHardy <kaber@trash.net>
Patrick McHardy March 6, 2009, 3:33 p.m. UTC | #7
Eric W. Biederman wrote:
> Patrick McHardy <kaber@trash.net> writes:
> 
>>> Hmm.  Actually that appears to be a macvlan bug.  It looks like if I
>>> change the macaddress on a macvlan we don't update the hash chains.
>>> So unless we have the same low byte we will be on the wrong hash chain
>>> and not receive the packets for the mac we specified.  Ouch!
>> The address can only be changed while the device is down and unhashed.
> 
> Point.  The dev_unicast/dev_unicast_delete in macvlan_set_mac_address
> appears to be completely unnecessary then.

I think that's correct.

>> No, they're not, but it seemed worth pointing out. Your patch
>> looks perfectly fine.
> 
> Thanks.
> 
> Would you be opposed to changes that made macvlan more robust,
> such as refusing to come up if the mac address is already in use,
> and perhaps denying the sending of packets with the wrong source
> mac?

Refusing duplicate MACs (on one underlying device) makes sense, the
results are undefined currently. About the filtering - I don't like
the idea too much given that we already have multiple possibilities
to do that.
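
A check that refuses duplicate MACs on one underlying device might look roughly like the following userspace sketch. Everything here is an assumption made for illustration: the `port` and `port_vlan` types, the fixed-size array, and the choice of `-EBUSY` as the error code are hypothetical and not taken from the driver.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAC_LEN 6
#define MAX_VLANS 8

/* Hypothetical model of one macvlan attached to a lower device. */
struct port_vlan {
	unsigned char mac[MAC_LEN];
	bool up;
};

/* Hypothetical model of the lower device and the macvlans stacked on it. */
struct port {
	struct port_vlan *vlans[MAX_VLANS];
	size_t count;
};

/* True if another running macvlan on the same port already uses self->mac. */
static bool port_addr_busy(const struct port *p, const struct port_vlan *self)
{
	for (size_t i = 0; i < p->count; i++) {
		const struct port_vlan *other = p->vlans[i];

		if (other != self && other->up &&
		    memcmp(other->mac, self->mac, MAC_LEN) == 0)
			return true;
	}
	return false;
}

/* Bring the macvlan up, or fail with -EBUSY (an assumed error code)
 * if its address is already in use on this port. */
static int port_vlan_open(const struct port *p, struct port_vlan *v)
{
	if (port_addr_busy(p, v))
		return -EBUSY;
	v->up = true;
	return 0;
}
```

The sketch only compares against macvlans that are already up, and only within one port, which matches the "on one underlying device" scope mentioned above. A real check would presumably also need to cover address changes on a running device.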
Eric W. Biederman March 6, 2009, 3:50 p.m. UTC | #8
Patrick McHardy <kaber@trash.net> writes:

> Eric W. Biederman wrote:
>> Patrick McHardy <kaber@trash.net> writes:
>>
>>>> Hmm.  Actually that appears to be a macvlan bug.  It looks like if I
>>>> change the macaddress on a macvlan we don't update the hash chains.
>>>> So unless we have the same low byte we will be on the wrong hash chain
>>>> and not receive the packets for the mac we specified.  Ouch!
>>> The address can only be changed while the device is down and unhashed.
>>
>> Point.  The dev_unicast/dev_unicast_delete in macvlan_set_mac_address
>> appears to be completely unnecessary then.
>
> I think that's correct.

Actually it is really weird.  We can change the mac address while
the device is running but the code is broken because it does
not update the hash table.
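
The failure mode can be modelled in a few lines of userspace C. This is a sketch under stated assumptions, not the driver code: it assumes, based on the earlier "same low byte" remark, that the chain is selected by the last byte of the MAC, and all type and function names (`vlan_table`, `set_mac_no_rehash`, `set_mac_rehash`, ...) are hypothetical.

```c
#include <stddef.h>
#include <string.h>

#define MAC_LEN 6
#define HASH_SIZE 256 /* one chain per value of the low MAC byte */

struct hashed_vlan {
	unsigned char mac[MAC_LEN];
	struct hashed_vlan *next;
};

struct vlan_table {
	struct hashed_vlan *bucket[HASH_SIZE];
};

/* Link v at the head of the chain selected by its current low byte. */
static void table_add(struct vlan_table *t, struct hashed_vlan *v)
{
	unsigned char idx = v->mac[MAC_LEN - 1];

	v->next = t->bucket[idx];
	t->bucket[idx] = v;
}

/* Unlink v from whichever chain it is on (its MAC may be stale). */
static void table_del(struct vlan_table *t, struct hashed_vlan *v)
{
	for (size_t i = 0; i < HASH_SIZE; i++) {
		for (struct hashed_vlan **pp = &t->bucket[i]; *pp; pp = &(*pp)->next) {
			if (*pp == v) {
				*pp = v->next;
				v->next = NULL;
				return;
			}
		}
	}
}

/* Search only the chain selected by the low byte of the wanted MAC. */
static struct hashed_vlan *table_lookup(const struct vlan_table *t,
					const unsigned char *mac)
{
	for (struct hashed_vlan *v = t->bucket[mac[MAC_LEN - 1]]; v; v = v->next)
		if (memcmp(v->mac, mac, MAC_LEN) == 0)
			return v;
	return NULL;
}

/* Problematic pattern: change the address, leave the entry where it is. */
static void set_mac_no_rehash(struct hashed_vlan *v, const unsigned char *mac)
{
	memcpy(v->mac, mac, MAC_LEN);
}

/* One possible remedy: unhash, change the address, hash again. */
static void set_mac_rehash(struct vlan_table *t, struct hashed_vlan *v,
			   const unsigned char *mac)
{
	table_del(t, v);
	memcpy(v->mac, mac, MAC_LEN);
	table_add(t, v);
}
```

With `set_mac_no_rehash`, a new address whose low byte differs from the old one leaves the entry on its old chain, so lookups for the new address miss it. A new address with the same low byte happens to keep working.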

> Refusing duplicate MACs (on one underlying device) makes sense, the
> results are undefined currently.

Then I will do that just for consistency.

> About the filtering - I don't like
> the idea too much given that we already have multiple possibilities
> to do that.

Agreed, and what is more, I can think of some reasons why we would not
want that.  My observation is that if you really want separation you
need separate vlans to the switch, as the similar hardware-based
features also don't perform egress filtering.

Eric

Patrick McHardy March 6, 2009, 3:56 p.m. UTC | #9
Eric W. Biederman wrote:
> Patrick McHardy <kaber@trash.net> writes:
> 
>> Eric W. Biederman wrote:
>>> Patrick McHardy <kaber@trash.net> writes:
>>>
>>>>> Hmm.  Actually that appears to be a macvlan bug.  It looks like if I
>>>>> change the macaddress on a macvlan we don't update the hash chains.
>>>>> So unless we have the same low byte we will be on the wrong hash chain
>>>>> and not receive the packets for the mac we specified.  Ouch!
>>>> The address can only be changed while the device is down and unhashed.
>>> Point.  The dev_unicast/dev_unicast_delete in macvlan_set_mac_address
>>> appears to be completely unnecessary then.
>> I think that's correct.
> 
> Actually it is really weird.  We can change the mac address while
> the device is running but the code is broken because it does
> not update the hash table.

That's strange. I know the assumption was that this can't be done.
But I can't find anything preventing it, not even in older versions.

>> Refusing duplicate MACs (on one underlying device) makes sense, the
>> results are undefined currently.
> 
> Then I will do that just for consistency.

Thanks.
Ben Greear March 6, 2009, 5:07 p.m. UTC | #10
Eric W. Biederman wrote:
> Thanks.
> Would you be opposed to changes that made macvlan more robust,
> such as refusing to come up if the mac address is already in use,
> and perhaps denying the sending of packets with the wrong source
> mac?
>   
I wouldn't deny sending with the wrong source mac. Ethernet interfaces can
do this, and mac-vlan should look as much like ethernet as possible.

I'm all for making it harder to mis-configure things (like dup macs on a
single interface, etc).

Thanks,
Ben
David Miller March 13, 2009, 8:15 p.m. UTC | #11
From: Patrick McHardy <kaber@trash.net>
Date: Fri, 06 Mar 2009 16:08:45 +0100

> Acked-by: Patrick McHardy <kaber@trash.net>

Applied to net-next-2.6
Patch

diff --git a/drivers/net/macvlan.c b/drivers/net/macvlan.c
index 7e24b50..b5241fc 100644
--- a/drivers/net/macvlan.c
+++ b/drivers/net/macvlan.c
@@ -461,12 +461,13 @@  static int macvlan_newlink(struct net_device *dev,
 	if (lowerdev == NULL)
 		return -ENODEV;
 
-	/* Don't allow macvlans on top of other macvlans - its not really
-	 * wrong, but lockdep can't handle it and its not useful for anything
-	 * you couldn't do directly on top of the real device.
+	/* When creating macvlans on top of other macvlans - use
+	 * the real device as the lowerdev.
 	 */
-	if (lowerdev->rtnl_link_ops == dev->rtnl_link_ops)
-		return -ENODEV;
+	if (lowerdev->rtnl_link_ops == dev->rtnl_link_ops) {
+		struct macvlan_dev *lowervlan = netdev_priv(lowerdev);
+		lowerdev = lowervlan->lowerdev;
+	}
 
 	if (!tb[IFLA_MTU])
 		dev->mtu = lowerdev->mtu;
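
The effect of the changed check can be modelled outside the kernel. The following is a simplified userspace sketch, not the driver code: `model_dev`, `resolve_lowerdev` and `model_newlink` are hypothetical names, and the `kind` field stands in for the `rtnl_link_ops` comparison in the patch.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
enum dev_kind { DEV_ETHERNET, DEV_MACVLAN };

struct model_dev {
	const char *name;
	enum dev_kind kind;
	struct model_dev *lowerdev; /* only meaningful for DEV_MACVLAN */
};

/* Mirrors the patched check: a macvlan requested on top of a macvlan is
 * attached to that macvlan's own lower device instead of being refused. */
static struct model_dev *resolve_lowerdev(struct model_dev *requested)
{
	if (requested == NULL)
		return NULL;
	if (requested->kind == DEV_MACVLAN)
		return requested->lowerdev;
	return requested;
}

/* Returns 0 on success, -1 if there is no device to attach to. */
static int model_newlink(struct model_dev *dev, const char *name,
			 struct model_dev *requested)
{
	struct model_dev *lower = resolve_lowerdev(requested);

	if (lower == NULL)
		return -1;
	dev->name = name;
	dev->kind = DEV_MACVLAN;
	dev->lowerdev = lower;
	return 0;
}
```

In this model a single step of indirection is enough: every macvlan is created through `model_newlink`, so its `lowerdev` never points at another macvlan. Creating `mv3` with `mv0` as the requested link therefore leaves `mv3` attached to the same device as `mv0`.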