diff mbox series

[net-next,1/1] veth: tweak creation of veth device

Message ID 1507666124-8780-1-git-send-email-mrv@mojatatu.com
State Rejected, archived
Delegated to: David Miller
Headers show
Series [net-next,1/1] veth: tweak creation of veth device | expand

Commit Message

Roman Mashak Oct. 10, 2017, 8:08 p.m. UTC
When creating veth pair, at first rtnl_new_link() creates veth_dev, i.e.
one end of the veth pipe, but not registers it; then veth_newlink() gets
invoked, where peer dev is created _and_ registered, followed by veth_dev
registration, which may fail if peer information, that is VETH_INFO_PEER
attribute, has not been provided and the kernel will allocate unique veth
name.

So, we should ask the kernel to allocate unique name for veth_dev only
when peer info is not available.

Example:

% ip link dev veth0 type veth
RTNETLINK answers: File exists

After fix:
% ip link dev veth0 type veth
% ip link show dev veth0
5: veth0@veth1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether f6:ef:8b:96:f4:ec brd ff:ff:ff:ff:ff:ff
%

Signed-off-by: Roman Mashak <mrv@mojatatu.com>
---
 drivers/net/veth.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

David Miller Oct. 11, 2017, 10:17 p.m. UTC | #1
From: Roman Mashak <mrv@mojatatu.com>
Date: Tue, 10 Oct 2017 16:08:44 -0400

> When creating veth pair, at first rtnl_new_link() creates veth_dev, i.e.
> one end of the veth pipe, but not registers it; then veth_newlink() gets
> invoked, where peer dev is created _and_ registered, followed by veth_dev
> registration, which may fail if peer information, that is VETH_INFO_PEER
> attribute, has not been provided and the kernel will allocate unique veth
> name.
> 
> So, we should ask the kernel to allocate unique name for veth_dev only
> when peer info is not available.
> 
> Example:
> 
> % ip link dev veth0 type veth
> RTNETLINK answers: File exists
> 
> After fix:
> % ip link dev veth0 type veth
> % ip link show dev veth0
> 5: veth0@veth1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
>     link/ether f6:ef:8b:96:f4:ec brd ff:ff:ff:ff:ff:ff
> %
> 
> Signed-off-by: Roman Mashak <mrv@mojatatu.com>

I'm not so sure about this.

If we specify an explicit tb[IFLA_NAME], we shouldn't completely ignore that
request from the user just because they didn't give any peer information.

I see what happens in this case, the peer gets 'veth0' and then since
the user asked for 'veth0' for the non-peer it conflicts.

Well, too bad.  The user must work to orchestrate things such that
this doesn't happen.  That means either providing the IFLA_NAME for
both the peer and the non-peer, or specifying neither.

I'm not applying this, sorry.
Roman Mashak Oct. 12, 2017, 4:49 p.m. UTC | #2
David Miller <davem@davemloft.net> writes:

>> When creating veth pair, at first rtnl_new_link() creates veth_dev, i.e.
>> one end of the veth pipe, but not registers it; then veth_newlink() gets
>> invoked, where peer dev is created _and_ registered, followed by veth_dev
>> registration, which may fail if peer information, that is VETH_INFO_PEER
>> attribute, has not been provided and the kernel will allocate unique veth
>> name.
>> 
>> So, we should ask the kernel to allocate unique name for veth_dev only
>> when peer info is not available.
>> 
>> Example:
>> 
>> % ip link dev veth0 type veth
>> RTNETLINK answers: File exists
>> 
>> After fix:
>> % ip link dev veth0 type veth
>> % ip link show dev veth0
>> 5: veth0@veth1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
>>     link/ether f6:ef:8b:96:f4:ec brd ff:ff:ff:ff:ff:ff
>> %
>> 
>> Signed-off-by: Roman Mashak <mrv@mojatatu.com>
>
> I'm not so sure about this.
>
> If we specify an explicit tb[IFLA_NAME], we shouldn't completely ignore that
> request from the user just because they didn't give any peer information.
>
> I see what happens in this case, the peer gets 'veth0' and then since
> the user asked for 'veth0' for the non-peer it conflicts.

So, the only way is to require user space to _always_ pass in
VETH_INFO_PEER, which may break existing code (fixing iproute2 is easiest).

Otherwise ignore netlink messages lacking of VETH_INFO_PEER and return
error.

IMO, neither of these solutions seem reasonable.

Also, there are valid use cases where a user does not care about veth
name sitting in container, but assigns a name following certain
pattern to a host-side veth.

> Well, too bad.  The user must work to orchestrate things such that
> this doesn't happen.  That means either providing the IFLA_NAME for
> both the peer and the non-peer, or specifying neither.
>
> I'm not applying this, sorry.
David Miller Oct. 12, 2017, 5:19 p.m. UTC | #3
From: Roman Mashak <mrv@mojatatu.com>
Date: Thu, 12 Oct 2017 12:49:37 -0400

> IMO, neither of these solutions seem reasonable.

I think the current behavior is reasonable.  The user is given tools,
one of which does dynamic allocation of names, and the user must simply
take that into consideration.
diff mbox series

Patch

diff --git a/drivers/net/veth.c b/drivers/net/veth.c
index f5438d0..00dce15 100644
--- a/drivers/net/veth.c
+++ b/drivers/net/veth.c
@@ -432,7 +432,7 @@  static int veth_newlink(struct net *src_net, struct net_device *dev,
 	if (tb[IFLA_ADDRESS] == NULL)
 		eth_hw_addr_random(dev);
 
-	if (tb[IFLA_IFNAME])
+	if (ifmp && tb[IFLA_IFNAME])
 		nla_strlcpy(dev->name, tb[IFLA_IFNAME], IFNAMSIZ);
 	else
 		snprintf(dev->name, IFNAMSIZ, DRV_NAME "%%d");