PATCH: Network Device Naming mechanism and policy

Message ID 20091009140000.GA18765@mock.linuxdev.us.dell.com
State RFC, archived
Delegated to: David Miller

Commit Message

Narendra K Oct. 9, 2009, 2 p.m. UTC
On Fri, Oct 09, 2009 at 07:12:07PM +0530, K, Narendra wrote:
> > example udev config:
> > SUBSYSTEM=="net", SYMLINK+="net/by-mac/$sysfs{ifindex}.$sysfs{address}"
> 
> work as well.  But coupling the ifindex to the MAC address like this
> doesn't work.  (In general, coupling any two unrelated attributes when
> trying to do persistent names doesn't work.)
> 
Attaching the latest patch incorporating review comments.

By creating character devices for every network device, we can use
udev to maintain alternate naming policies for devices, including
additional names for the same device, without interfering with the
name that the kernel assigns a device.

This is conditionalized on CONFIG_NET_CDEV.  If enabled (the default),
device nodes will automatically be created in /dev/netdev/ for each
network device.  (/dev/net/ is already populated by the tun device.)

These device nodes are not functional at the moment - open() returns
-ENOSYS.  Their only purpose is to provide userspace with a kernel
name to ifindex mapping, in a form that udev can easily manage.

Signed-off-by: Jordan Hargrave <Jordan_Hargrave@dell.com>
Signed-off-by: Narendra K <narendra_k@dell.com>
Signed-off-by: Matt Domsch <Matt_Domsch@dell.com>
---
 include/linux/netdevice.h |    4 ++++
 net/Kconfig               |   10 ++++++++++
 net/core/Makefile         |    1 +
 net/core/cdev.c           |   42 ++++++++++++++++++++++++++++++++++++++++++
 net/core/cdev.h           |   13 +++++++++++++
 net/core/dev.c            |   10 ++++++++++
 net/core/net-sysfs.c      |   13 +++++++++++++
 7 files changed, 93 insertions(+), 0 deletions(-)
 create mode 100644 net/core/cdev.c
 create mode 100644 net/core/cdev.h

Comments

Bryan Kadzban Oct. 9, 2009, 4:23 p.m. UTC | #1
Matt Domsch wrote:
> Let me also note that we are prepared to have userspace consumers of 
> this new character device node.
> 
> http://linux.dell.com/wiki/index.php/Oss/libnetdevname
> 
> notes how the kernel patch will interact with udev, describes the new
> library helper function in libnetdevname, and has patches for 
> net-tools, iproute2, and ethtool to make use of the helper function.
> 
> As has been noted here, MAC addresses are not necessarily unique to
> an interface.

Only in the case of e.g. qemu (virtual hardware), I think.  (Or some
kinds of broken hardware.  Anything not on the udev whitelist from
75-persistent-net-generator.rules.)

The combination of (MAC, ifindex) is not unique, which is what I meant
earlier -- but the setup on the wiki seems to handle this properly.
Assuming there was a /dev/net/by-mac/00:01:02:03:04:05 link, it should
work fine...
Greg KH Oct. 9, 2009, 4:36 p.m. UTC | #2
On Fri, Oct 09, 2009 at 09:00:01AM -0500, Narendra K wrote:
> On Fri, Oct 09, 2009 at 07:12:07PM +0530, K, Narendra wrote:
> > > example udev config:
> > > SUBSYSTEM=="net",
> > SYMLINK+="net/by-mac/$sysfs{ifindex}.$sysfs{address}"
> > 
> > work as well.  But coupling the ifindex to the MAC address like this
> > doesn't work.  (In general, coupling any two unrelated attributes when
> > trying to do persistent names doesn't work.)
> > 
> Attaching the latest patch incorporating review comments.
> 
> By creating character devices for every network device, we can use
> udev to maintain alternate naming policies for devices, including
> additional names for the same device, without interfering with the
> name that the kernel assigns a device.
> 
> This is conditionalized on CONFIG_NET_CDEV.  If enabled (the default),
> device nodes will automatically be created in /dev/netdev/ for each
> network device.  (/dev/net/ is already populated by the tun device.)
> 
> These device nodes are not functional at the moment - open() returns
> -ENOSYS.  Their only purpose is to provide userspace with a kernel
> name to ifindex mapping, in a form that udev can easily manage.

How does this patch work with the network namespace functionality?

thanks,

greg k-h
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Marco d'Itri Oct. 9, 2009, 4:56 p.m. UTC | #3
On Oct 09, Bryan Kadzban <bryan@kadzban.is-a-geek.net> wrote:

> > As has been noted here, MAC addresses are not necessarily unique to
> > an interface.
> Only in the case of e.g. qemu (virtual hardware), I think.  (Or some
> kinds of broken hardware.
Some Sun products have multiple interfaces sharing the same MAC address.
Matt Domsch Oct. 9, 2009, 5:17 p.m. UTC | #4
On Fri, Oct 09, 2009 at 09:36:13AM -0700, Greg KH wrote:
> On Fri, Oct 09, 2009 at 09:00:01AM -0500, Narendra K wrote:
> > On Fri, Oct 09, 2009 at 07:12:07PM +0530, K, Narendra wrote:
> > > > example udev config:
> > > > SUBSYSTEM=="net",
> > > SYMLINK+="net/by-mac/$sysfs{ifindex}.$sysfs{address}"
> > > 
> > > work as well.  But coupling the ifindex to the MAC address like this
> > > doesn't work.  (In general, coupling any two unrelated attributes when
> > > trying to do persistent names doesn't work.)
> > > 
> > Attaching the latest patch incorporating review comments.
> > 
> > By creating character devices for every network device, we can use
> > udev to maintain alternate naming policies for devices, including
> > additional names for the same device, without interfering with the
> > name that the kernel assigns a device.
> > 
> > This is conditionalized on CONFIG_NET_CDEV.  If enabled (the default),
> > device nodes will automatically be created in /dev/netdev/ for each
> > network device.  (/dev/net/ is already populated by the tun device.)
> > 
> > These device nodes are not functional at the moment - open() returns
> > -ENOSYS.  Their only purpose is to provide userspace with a kernel
> > name to ifindex mapping, in a form that udev can easily manage.
> 
> How does this patch work with the network namespace functionality?

There is a monotonically increasing static ifindex kept in
net/core/dev.c:dev_new_index(), which is shared across all namespaces.
struct net_device ifindex field is assigned from this.  So two devices
in two different namespaces can't share an ifindex value.  However,
the device can be present (or not) in the per-namespace dev_name_hash
and dev_index_hashes.  This patch doesn't change this at all.

uevents aren't namespaced.  Presumably that means /dev can't be
polyinstantiated.  Therefore, all devnodes in /dev/netdev/* will be
visible to all processes, where 'ifconfig' and friends would only show
device names in the process's namespace.  This doesn't mean the app
can _do_ anything (it's the same as if it tried to act on a device
using an ifindex for a device not in its namespace), but yes, the fact
that such a device exists will be exposed.
Greg KH Oct. 9, 2009, 5:22 p.m. UTC | #5
On Fri, Oct 09, 2009 at 12:17:24PM -0500, Matt Domsch wrote:
> 
> uevents aren't namespaced.  Presumably that means /dev can't be
> polyinstantiated.  Therefore, all devnodes in /dev/netdev/* will be
> visible to all processes, where 'ifconfig' and friends would only show
> device names in the processes namespace.  This doesn't mean the app
> can _do_ anything (it's the same as if it tried to act on a device
> using an ifindex for a device not in its namespace), but yes, the fact
> that such a device exists will be exposed.

That's the problem that the sysfs namespace patches were trying to
address.

Now I'm not saying it is a valid thing to try to work with this kind of
crazy, I was just wondering how it would work out.  Looks like it
doesn't :)

thanks,

greg k-h
Scott James Remnant Oct. 12, 2009, 10:41 a.m. UTC | #6
On Fri, 2009-10-09 at 09:51 -0500, Matt Domsch wrote:

> As has been noted here, MAC addresses are not necessarily unique to an
> interface.  As such, we are not proposing a net/by-mac/* symlink to
> /dev/netdev/*.
> 
On the other hand, they *tend* to be unique for a wide range of systems.
This makes them pretty comparable to LABELs on disks, and we have
a /dev/disk/by-label

Remember that udev already supports symlink stacking, and priorities and
such.

I don't think there's any danger of supporting a /dev/netdev/by-mac by
default, it'll be a benefit to most and those who don't have unique MACs
will just ignore it.

Scott
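
For illustration only, here is a rough sketch of how a by-mac policy might cope with non-unique MACs. It is not udev code and does not model udev's symlink stacking or priorities; the function name `plan_by_mac_links` and the choice to skip contested addresses are assumptions made up for this example.

```python
from collections import defaultdict


def plan_by_mac_links(interfaces):
    """Plan by-mac symlinks for (kernel_name, mac) pairs.

    Hypothetical helper: returns (links, conflicts), where links maps a
    relative symlink name to a kernel name, and conflicts maps each MAC
    shared by several interfaces to the sorted names that share it.
    """
    by_mac = defaultdict(list)
    for name, mac in interfaces:
        # Normalise case so 0A and 0a count as the same address.
        by_mac[mac.lower()].append(name)

    links = {}
    conflicts = {}
    for mac, names in by_mac.items():
        if len(names) == 1:
            links["by-mac/" + mac] = names[0]
        else:
            # Ambiguous address: report it instead of guessing a target.
            conflicts[mac] = sorted(names)
    return links, conflicts
```

On a system where every MAC is unique, `conflicts` stays empty and every interface gets a link; where MACs are shared, the affected addresses are simply left without a by-mac name.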
Ben Hutchings Oct. 12, 2009, 11:31 a.m. UTC | #7
On Mon, 2009-10-12 at 11:41 +0100, Scott James Remnant wrote:
> On Fri, 2009-10-09 at 09:51 -0500, Matt Domsch wrote:
> 
> > As has been noted here, MAC addresses are not necessarily unique to an
> > interface.  As such, we are not proposing a net/by-mac/* symlink to
> > /dev/netdev/*.
> > 
> On the other hand, they *tend* to be unique for a wide range of systems.
> This makes them pretty comparable to LABELs on disks, and we have
> a /dev/disk/by-label
[...]

MAC addresses are normally assigned automatically but can be overridden
if necessary.  In that respect they are more like UUIDs for disks.

I don't see any analogue of disk labels, though labels could conceivably
be added to some NICs using VPD.

Ben.
Bill Nottingham Oct. 12, 2009, 5:37 p.m. UTC | #8
Scott James Remnant (scott@ubuntu.com) said: 
> On the other hand, they *tend* to be unique for a wide range of systems.
> This makes them pretty comparable to LABELs on disks, and we have
> a /dev/disk/by-label
> 
> Remember that udev already supports symlink stacking, and priorities and
> such.
> 
> I don't think there's any danger of supporting a /dev/netdev/by-mac by
> default, it'll be a benefit to most and those who don't have unique MACs
> will just ignore it.

At the moment, we do not appear to get the proper change uevents from things
like 'ip link set dev <foo> address <bar>', so we can't currently maintain
these symlinks.

Bill
dann frazier Oct. 13, 2009, 3:08 p.m. UTC | #9
On Fri, Oct 09, 2009 at 09:00:01AM -0500, Narendra K wrote:
> On Fri, Oct 09, 2009 at 07:12:07PM +0530, K, Narendra wrote:
> > > example udev config:
> > > SUBSYSTEM=="net",
> > SYMLINK+="net/by-mac/$sysfs{ifindex}.$sysfs{address}"
> > 
> > work as well.  But coupling the ifindex to the MAC address like this
> > doesn't work.  (In general, coupling any two unrelated attributes when
> > trying to do persistent names doesn't work.)
> > 
> Attaching the latest patch incorporating review comments.
> 
> By creating character devices for every network device, we can use
> udev to maintain alternate naming policies for devices, including
> additional names for the same device, without interfering with the
> name that the kernel assigns a device.
> 
> This is conditionalized on CONFIG_NET_CDEV.  If enabled (the default),
> device nodes will automatically be created in /dev/netdev/ for each
> network device.  (/dev/net/ is already populated by the tun device.)
> 
> These device nodes are not functional at the moment - open() returns
> -ENOSYS.  Their only purpose is to provide userspace with a kernel
> name to ifindex mapping, in a form that udev can easily manage.

If the idea is just to provide a userspace-visible mapping (and
presumably take advantage of udev's infrastructure for naming) does
this need kernel changes? Could this be a hierarchy under
e.g. /etc/udev instead, using plain text files? It still means we need
something like libnetdevname for apps to do the translation, but I'm
not seeing why it matters how this map is stored. Is there some
special property of the character devices (e.g. uevents) that we're
not already getting with the existing interfaces?
Narendra K Oct. 13, 2009, 5:13 p.m. UTC | #10
>> These device nodes are not functional at the moment - open() returns 
>> -ENOSYS.  Their only purpose is to provide userspace with a kernel 
>> name to ifindex mapping, in a form that udev can easily manage.
>
>If the idea is just to provide a userspace-visible mapping 
>(and presumably take advantage of udev's infrastructure for 
>naming) does this need kernel changes? Could this be a 
>hierarchy under e.g. /etc/udev instead, using plain text 
>files? It still means we need something like libnetdevname for 
>apps to do the translation, but I'm not seeing why it matters 
>how this map is stored. Is there some special property of the 
>character devices (e.g. uevents) that we're not already 
>getting with the existing interfaces?

Yes. The char device by itself doesn't help in any way. But it provides
a flexible mechanism to provide multiple names for the same device, just
the way it is for disks.

With regards,
Narendra K
dann frazier Oct. 13, 2009, 5:36 p.m. UTC | #11
On Tue, Oct 13, 2009 at 10:43:49PM +0530, Narendra_K@Dell.com wrote:
> 
> >> These device nodes are not functional at the moment - open() returns 
> >> -ENOSYS.  Their only purpose is to provide userspace with a kernel 
> >> name to ifindex mapping, in a form that udev can easily manage.
> >
> >If the idea is just to provide a userspace-visible mapping 
> >(and presumably take advantage of udev's infrastructure for 
> >naming) does this need kernel changes? Could this be a 
> >hierarchy under e.g. /etc/udev instead, using plain text 
> >files? It still means we need something like libnetdevname for 
> >apps to do the translation, but I'm not seeing why it matters 
> >how this map is stored. Is there some special property of the 
> >character devices (e.g. uevents) that we're not already 
> >getting with the existing interfaces?
> 
> Yes. The char device by itself doesn't help in any way. But it provides
> a flexible mechanism to provide multiple names for the same device, just
> the way it is for disks.

Right - so any reason this couldn't be implemented completely in
userspace by having udev manipulate plain text files under say
/etc/udev/net/?

I do agree that it would be nice for admins/installers to tweak/use
nic names in a similar way to storage names (udev rules), and it might
let us take advantage of a lot of the existing udev code.
Dan Williams Oct. 13, 2009, 6:06 p.m. UTC | #12
On Mon, 2009-10-12 at 13:37 -0400, Bill Nottingham wrote:
> Scott James Remnant (scott@ubuntu.com) said: 
> > On the other hand, they *tend* to be unique for a wide range of systems.
> > This makes them pretty comparable to LABELs on disks, and we have
> > a /dev/disk/by-label
> > 
> > Remember that udev already supports symlink stacking, and priorities and
> > such.
> > 
> > I don't think there's any danger of supporting a /dev/netdev/by-mac by
> > default, it'll be a benefit to most and those who don't have unique MACs
> > will just ignore it.
> 
> At the moment, we do not appear to get the proper change uevents from things
> like 'ip link set dev <foo> address <bar>', so we can't currently maintain
> these symlinks.

And if we really want seamless support for MAC spoofing, we want
ETHTOOL_GPERMADDR for all drivers too, so that if your configuration
says "rename device XX:XX:XX:XX:XX:XX to YY:YY:YY:YY:YY:YY" we can
actually figure stuff out after the spoof.

Dan


Ben Hutchings Oct. 13, 2009, 6:53 p.m. UTC | #13
On Tue, 2009-10-13 at 11:06 -0700, Dan Williams wrote:
> On Mon, 2009-10-12 at 13:37 -0400, Bill Nottingham wrote:
> > Scott James Remnant (scott@ubuntu.com) said: 
> > > On the other hand, they *tend* to be unique for a wide range of systems.
> > > This makes them pretty comparable to LABELs on disks, and we have
> > > a /dev/disk/by-label
> > > 
> > > Remember that udev already supports symlink stacking, and priorities and
> > > such.
> > > 
> > > I don't think there's any danger of supporting a /dev/netdev/by-mac by
> > > default, it'll be a benefit to most and those who don't have unique MACs
> > > will just ignore it.
> > 
> > At the moment, we do not appear to get the proper change uevents from things
> > like 'ip link set dev <foo> address <bar>', so we can't currently maintain
> > these symlinks.
> 
> And if we really want seamless support for MAC spoofing, we want
> ETHTOOL_GPERMADDR for all drivers too, so that if your configuration
> says "rename device XX:XX:XX:XX:XX:XX to YY:YY:YY:YY:YY:YY" we can
> actually figure stuff out after the spoof.

ETHTOOL_GPERMADDR is handled in the ethtool core now.  Are you thinking
of drivers that don't have ethtool ops?  Maybe it's time to add default
operations.

Ben.
Greg KH Oct. 13, 2009, 7:51 p.m. UTC | #14
On Tue, Oct 13, 2009 at 10:43:49PM +0530, Narendra_K@Dell.com wrote:
> 
> >> These device nodes are not functional at the moment - open() returns 
> >> -ENOSYS.  Their only purpose is to provide userspace with a kernel 
> >> name to ifindex mapping, in a form that udev can easily manage.
> >
> >If the idea is just to provide a userspace-visible mapping 
> >(and presumably take advantage of udev's infrastructure for 
> >naming) does this need kernel changes? Could this be a 
> >hierarchy under e.g. /etc/udev instead, using plain text 
> >files? It still means we need something like libnetdevname for 
> >apps to do the translation, but I'm not seeing why it matters 
> >how this map is stored. Is there some special property of the 
> >character devices (e.g. uevents) that we're not already 
> >getting with the existing interfaces?
> 
> Yes. The char device by itself doesn't help in any way. But it provides
> a flexible mechanism to provide multiple names for the same device, just
> the way it is for disks.

No, it's quite different than disks in that the symlinks, _and_ the
device nodes do absolutely nothing.  And any reference to a name that is
a symlink will not work with any existing network tool, you will have to
do some kind of lookup to determine which network device you really were
referring to.

These links end up being useless, and confusing, I still don't see how
you can use them for anything.

thanks,

greg k-h
John W. Linville Oct. 13, 2009, 7:53 p.m. UTC | #15
On Tue, Oct 13, 2009 at 07:53:04PM +0100, Ben Hutchings wrote:
> On Tue, 2009-10-13 at 11:06 -0700, Dan Williams wrote:

> > And if we really want seamless support for MAC spoofing, we want
> > ETHTOOL_GPERMADDR for all drivers too, so that if your configuration
> > says "rename device XX:XX:XX:XX:XX:XX to YY:YY:YY:YY:YY:YY" we can
> > actually figure stuff out after the spoof.
> 
> ETHTOOL_GPERMADDR is handled in the ethtool core now.  Are you thinking
> of drivers that don't have ethtool ops?  Maybe it's time to add default
> operations.

Not quite true -- dev->perm_addr still has to be set by the driver.

John
Jordan_Hargrave@Dell.com Oct. 13, 2009, 8 p.m. UTC | #16
We have developed a mapping library that will convert the
user-friendly symlink names to the kernel names necessary for socket
ioctls.  All network tools that normally take ethX as argument have
been modified to use this mapping library.  Usually it's just a
one-line addition when parsing the command line arguments.


--jordan hargrave
Dell Enterprise Linux Engineering
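
To make the "one-line addition" concrete, here is a rough sketch of what such a lookup could look like. This is not the libnetdevname API. The function name `resolve_ifname`, its parameters, and above all the assumption that the character device's minor number equals the ifindex are illustrative guesses; the thread only says the node provides a name to ifindex mapping.

```python
import os
import socket
import stat


def resolve_ifname(arg, stat_fn=os.stat, index_to_name=socket.if_indextoname):
    """Return a kernel interface name for a command-line argument.

    Hypothetical helper, not part of any real library.  Plain names such
    as "eth0" pass through unchanged; absolute paths are treated as
    (symlinks to) per-device character nodes.
    """
    if not arg.startswith("/"):
        return arg  # already a kernel name

    st = stat_fn(arg)  # os.stat follows symlinks to the real node
    if not stat.S_ISCHR(st.st_mode):
        raise ValueError(f"{arg} is not a character device")

    # ASSUMPTION: the node's minor number carries the ifindex.
    ifindex = os.minor(st.st_rdev)
    return index_to_name(ifindex)
```

A tool would then call something like `ifname = resolve_ifname(argv_value)` once while parsing its arguments and keep using the kernel name for its socket ioctls as before.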



-----Original Message-----
From: Greg KH [mailto:greg@kroah.com]
Sent: Tue 10/13/2009 14:51
To: K, Narendra
Cc: dannf@hp.com; netdev@vger.kernel.org; linux-hotplug@vger.kernel.org; Domsch, Matt; Hargrave, Jordan; Rose, Charles
Subject: Re: PATCH: Network Device Naming mechanism and policy
 
On Tue, Oct 13, 2009 at 10:43:49PM +0530, Narendra_K@Dell.com wrote:
> 
> >> These device nodes are not functional at the moment - open() returns 
> >> -ENOSYS.  Their only purpose is to provide userspace with a kernel 
> >> name to ifindex mapping, in a form that udev can easily manage.
> >
> >If the idea is just to provide a userspace-visible mapping 
> >(and presumably take advantage of udev's infrastructure for 
> >naming) does this need kernel changes? Could this be a 
> >hierarchy under e.g. /etc/udev instead, using plain text 
> >files? It still means we need something like libnetdevname for 
> >apps to do the translation, but I'm not seeing why it matters 
> >how this map is stored. Is there some special property of the 
> >character devices (e.g. uevents) that we're not already 
> >getting with the existing interfaces?
> 
> Yes. The char device by itself doesn't help in any way. But it provides
> a flexible mechanism to provide multiple names for the same device, just
> the way it is for disks.

No, it's quite different than disks in that the symlinks, _and_ the
device nodes do absolutely nothing.  And any reference to a name that is
a symlink will not work with any existing network tool, you will have to
do some kind of lookup to determine which network device you really were
referring to.

These links end up being useless, and confusing, I still don't see how
you can use them for anything.

thanks,

greg k-h

Greg KH Oct. 13, 2009, 8:19 p.m. UTC | #17
A: No.
Q: Should I include quotations after my reply?

http://daringfireball.net/2007/07/on_top

On Tue, Oct 13, 2009 at 03:00:59PM -0500, Jordan_Hargrave@Dell.com wrote:
> We have developed a mapping library that will convert the
> user-friendly symlink names to the kernel names necessary for socket
> ioctls.  All network tools that normally take ethX as argument have
> been modified to use this mapping library.  Usually it's just a
> one-line addition when parsing the command line arguments.

Either I missed this in the first message in this thread, or this was
never stated before, but that is nice.  Where is this library, and will
it be accepted by the upstream tool maintainers?

thanks,

greg k-h
Matt Domsch Oct. 13, 2009, 10:05 p.m. UTC | #18
On Tue, Oct 13, 2009 at 01:19:31PM -0700, Greg KH wrote:
> On Tue, Oct 13, 2009 at 03:00:59PM -0500, Jordan_Hargrave@Dell.com wrote:
> > We have developed a mapping library that will convert the
> > user-friendly symlink names to the kernel names necessary for socket
> > ioctls.  All network tools that normally take ethX as argument have
> > been modified to use this mapping library.  Usually it's just a
> > one-line addition when parsing the command line arguments.
> 
> Either I missed this in the first message in this thread, or this was
> never stated before, but that is nice.  Where is this library,

It was not noted in the initial patch post, but I did note it
immediately thereafter.

 Let me also note that we are prepared to have userspace consumers of
 this new character device node.
 
 http://linux.dell.com/wiki/index.php/Oss/libnetdevname
 
 notes how the kernel patch will interact with udev, describes the new
 library helper function in libnetdevname, and has patches for
 net-tools, iproute2, and ethtool to make use of the helper function.
 
 As has been noted here, MAC addresses are not necessarily unique to an
 interface.  As such, we are not proposing a net/by-mac/* symlink to
 /dev/netdev/*.


> and will it be accepted by the upstream tool maintainers?

Unknown, we haven't proposed it to any yet as it's irrelevant until
there is general acceptance of the approach (kernel or otherwise).  I
figured we'd start with the kernel discussion, and show how it could
be used.
dann frazier Oct. 13, 2009, 10:08 p.m. UTC | #19
On Tue, Oct 13, 2009 at 01:19:31PM -0700, Greg KH wrote:
> 
> A: No.
> Q: Should I include quotations after my reply?
> 
> http://daringfireball.net/2007/07/on_top
> 
> On Tue, Oct 13, 2009 at 03:00:59PM -0500, Jordan_Hargrave@Dell.com wrote:
> > We have developed a mapping library that will convert the
> > user-friendly symlink names to the kernel names necessary for socket
> > ioctls.  All network tools that normally take ethX as argument have
> > been modified to use this mapping library.  Usually it's just a
> > one-line addition when parsing the command line arguments.
> 
> Either I missed this in the first message in this thread, or this was
> never stated before, but that is nice.  Where is this library,

I read about it here:
  http://linux.dell.com/wiki/index.php/Oss/libnetdevname#libnetdevname

Source appears to be here:
  http://linux.dell.com/git/?p=libnetdevname.git;a=summary

> and will
> it be accepted by the upstream tool maintainers?
dann frazier Oct. 16, 2009, 12:32 a.m. UTC | #20
On Tue, Oct 13, 2009 at 11:36:38AM -0600, dann frazier wrote:
> On Tue, Oct 13, 2009 at 10:43:49PM +0530, Narendra_K@Dell.com wrote:
> > 
> > >> These device nodes are not functional at the moment - open() returns 
> > >> -ENOSYS.  Their only purpose is to provide userspace with a kernel 
> > >> name to ifindex mapping, in a form that udev can easily manage.
> > >
> > >If the idea is just to provide a userspace-visible mapping 
> > >(and presumably take advantage of udev's infrastructure for 
> > >naming) does this need kernel changes? Could this be a 
> > >hierarchy under e.g. /etc/udev instead, using plain text 
> > >files? It still means we need something like libnetdevname for 
> > >apps to do the translation, but I'm not seeing why it matters 
> > >how this map is stored. Is there some special property of the 
> > >character devices (e.g. uevents) that we're not already 
> > >getting with the existing interfaces?
> > 
> > Yes. The char device by itself doesn't help in any way. But it provides
> > a flexible mechanism to provide multiple names for the same device, just
> > the way it is for disks.
> 
> Right - so any reason this couldn't be implemented completely in
> userspace by having udev manipulate plain text files under say
> /etc/udev/net/?
> 
> I do agree that it would be nice for admins/installers to tweak/use
> nic names in a similar way to storage names (udev rules), and it might
> let us take advantage of a lot of the existing udev code.

Is there interest in this approach?

 - modify udev to manage network devices names as regular (non-device)
   files (stored in /etc/udev, /dev/netdev, or wherever)
 - use the existing udev rules to manage symlinks to these files
 - point libnetdevname at these text files for its name resolution

I've started prototyping this, and it certainly looks possible w/o any
kernel changes. However, I could probably use some advice from a udev
person to do a proper implementation.
Narendra K Oct. 16, 2009, 2:02 p.m. UTC | #21
>On Tue, Oct 13, 2009 at 11:36:38AM -0600, dann frazier wrote:
>> Right - so any reason this couldn't be implemented completely in 
>> userspace by having udev manipulate plain text files under say 
>> /etc/udev/net/?
>> 
>> I do agree that it would be nice for admins/installers to tweak/use 
>> nic names in a similar way to storage names (udev rules), 
>and it might 
>> let us take advantage of a lot of the existing udev code.
>
>Is there interest in this approach?
> - modify udev to manage network devices names as regular (non-device)
>   files (stored in /etc/udev, /dev/netdev, or wherever)

Yes. Would you elaborate a little more on "modify udev to manage network
devices as regular files"? Does it mean some custom rules which will
generate a regular file under, say, /dev/netdev/, or extending udev itself?
And what would the regular file look like in terms of holding the ifindex of
the interface, which can be passed to libnetdevname?


> - use the existing udev rules to manage symlinks to these files
> - point libnetdevname at these text files for its name resolution
>
>I've started prototyping this, and it certainly looks possible 
>w/o any kernel changes. However, I could probably use some 
>advice from a udev person to do a proper implementation.

With regards,
Narendra K  
dann frazier Oct. 16, 2009, 3:20 p.m. UTC | #22
On Fri, Oct 16, 2009 at 07:32:50PM +0530, Narendra_K@Dell.com wrote:
> 
> >On Tue, Oct 13, 2009 at 11:36:38AM -0600, dann frazier wrote:
> >> Right - so any reason this couldn't be implemented completely in 
> >> userspace by having udev manipulate plain text files under say 
> >> /etc/udev/net/?
> >> 
> >> I do agree that it would be nice for admins/installers to tweak/use 
> >> nic names in a similar way to storage names (udev rules), 
> >and it might 
> >> let us take advantage of a lot of the existing udev code.
> >
> >Is there interest in this approach?
> > - modify udev to manage network devices names as regular (non-device)
> >   files (stored in /etc/udev, /dev/netdev, or wherever)
> 
> Yes. Would you elaborate little more on "modify udev to manage network
> devices as regular files".

Sure. We already get an event when netifs get added/removed - udev
just doesn't create a device file for it. And since all we care about
is the file's name (and the symlinks to it), there's really no point
in creating a real device file anyway.

So, instead of 'mknod /dev/netdev/eth0', why not just 'touch
/dev/netdev/eth0'? A file exists, so we can still maintain
aliases as symlinks, we just don't need to modify the kernel.
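
A minimal userspace sketch of the touch-plus-symlink idea, for illustration. The helper names `register_netif` and `resolve_alias` and the alias layout are made up for this example, and the directory is a parameter so the code can be tried in a scratch directory instead of /dev/netdev.

```python
import os


def register_netif(root, kernel_name, aliases=()):
    """Create an empty regular file for a netif plus alias symlinks.

    Hypothetical helper: stands in for what udev would do with creat()
    instead of mknod().  Returns the path of the regular file.
    """
    os.makedirs(root, exist_ok=True)
    node = os.path.join(root, kernel_name)
    with open(node, "a"):
        pass  # equivalent of 'touch': only the file's name matters

    for alias in aliases:
        link = os.path.join(root, alias)
        link_dir = os.path.dirname(link)
        os.makedirs(link_dir, exist_ok=True)
        if os.path.islink(link):
            os.unlink(link)  # replace a stale alias
        # Relative target, so the tree stays valid if root is moved.
        os.symlink(os.path.relpath(node, link_dir), link)
    return node


def resolve_alias(root, alias):
    """Follow an alias symlink and return the kernel name it points at."""
    return os.path.basename(os.path.realpath(os.path.join(root, alias)))
```

The alias resolves purely through the file name, which is all this approach records about the device.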

> Does it mean some custom rules which will
> generate a regular file under, say, /dev/netdev/ or extend udev
> itself ?

I believe we have to extend udev itself. We could probably do this
completely within udev rules by running programs that do the touching
and symlinking, but it would be nicer and more consistent/familiar to
take advantage of the udev syntax (SYMLINK) to do this
natively. Besides, udev already has the logic to know when/how to
instantiate and unlink symlinks, it would suck to duplicate that.

So, udev would need to be modified to know how to go through the
normal "node" creation for net devices, and to call creat() instead of
mknod().

> And how would the regular file look like in terms of holding ifindex of
> the interface, which can be passed to libnetdevname.

I can't think of anything we need to store in the regular file. If we
have the kernel name for the device, we can look up the ifindex in
/sys. Correct me if I'm wrong, but storing it ourselves seems
redundant.
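
As a small sketch of the /sys lookup mentioned above: the kernel exposes each interface's index in the `ifindex` attribute under /sys/class/net/<name>/. The function name and the `sysfs_net` parameter are just for this example; the parameter exists so the lookup can be pointed at a fake tree.

```python
import os


def ifindex_from_sysfs(kernel_name, sysfs_net="/sys/class/net"):
    """Read the ifindex of an interface from its sysfs attribute."""
    path = os.path.join(sysfs_net, kernel_name, "ifindex")
    with open(path) as f:
        return int(f.read().strip())
```

Note that the lookup is keyed on the kernel name, so it is only as stable as that name.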

> 
> 
> > - use the existing udev rules to manage symlinks to these files
> > - point libnetdevname at these text files for its name resolution
> >
> >I've started prototyping this, and it certainly looks possible 
> >w/o any kernel changes. However, I could probably use some 
> >advice from a udev person to do a proper implementation.
> 
> With regards,
> Narendra K
Ben Hutchings Oct. 16, 2009, 3:33 p.m. UTC | #23
On Fri, 2009-10-16 at 09:20 -0600, dann frazier wrote:
> On Fri, Oct 16, 2009 at 07:32:50PM +0530, Narendra_K@Dell.com wrote:
[...]
> > And how would the regular file look like in terms of holding ifindex of
> > the interface, which can be passed to libnetdevname.
> 
> I can't think of anything we need to store in the regular file. If we
> have the kernel name for the device, we can look up the ifindex in
> /sys. Correct me if I'm wrong, but storing it ourselves seems
> redundant.

But the name of a netdev can change whereas its ifindex never does.
Identifying netdevs by name would require additional work to update the
links when a netdev is renamed and would still be prone to race
conditions.  This is why Narendra and Matt were proposing to store the
ifindex in the node all along...

Ben.
dann frazier Oct. 16, 2009, 3:41 p.m. UTC | #24
On Fri, Oct 16, 2009 at 04:33:13PM +0100, Ben Hutchings wrote:
> On Fri, 2009-10-16 at 09:20 -0600, dann frazier wrote:
> > On Fri, Oct 16, 2009 at 07:32:50PM +0530, Narendra_K@Dell.com wrote:
> [...]
> > > And how would the regular file look like in terms of holding ifindex of
> > > the interface, which can be passed to libnetdevname.
> > 
> > I can't think of anything we need to store in the regular file. If we
> > have the kernel name for the device, we can look up the ifindex in
> > /sys. Correct me if I'm wrong, but storing it ourselves seems
> > redundant.
> 
> But the name of a netdev can change whereas its ifindex never does.
> Identifying netdevs by name would require additional work to update the
> links when a netdev is renamed and would still be prone to race
> conditions.  This is why Narendra and Matt were proposing to store the
> ifindex in the node all along...

ah, yes - I see that now - the ability to rename an interface is what
prevents this from working. Thanks for the explanation.
dann frazier Oct. 16, 2009, 9:40 p.m. UTC | #25
On Fri, Oct 16, 2009 at 04:33:13PM +0100, Ben Hutchings wrote:
> On Fri, 2009-10-16 at 09:20 -0600, dann frazier wrote:
> > On Fri, Oct 16, 2009 at 07:32:50PM +0530, Narendra_K@Dell.com wrote:
> [...]
> > > And how would the regular file look like in terms of holding ifindex of
> > > the interface, which can be passed to libnetdevname.
> > 
> > I can't think of anything we need to store in the regular file. If we
> > have the kernel name for the device, we can look up the ifindex in
> > /sys. Correct me if I'm wrong, but storing it ourselves seems
> > redundant.
> 
> But the name of a netdev can change whereas its ifindex never does.
> Identifying netdevs by name would require additional work to update the
> links when a netdev is renamed and would still be prone to race
> conditions.  This is why Narendra and Matt were proposing to store the
> ifindex in the node all along...

Matt, Ben and I talked about a few other possibilities on IRC.
The one I like the most at the moment is an idea Ben had to create
dummy files named after the ifindex. Then, use symlinks for the kernel
name and the various by-$property subdirectories. This means the KOBJ
events will need to expose the ifindex.

I'm a novice at net programming, but I'm told that ifindex is
the information apps ultimately require here.
Narendra K Oct. 19, 2009, 11:30 a.m. UTC | #26
>> > > And how would the regular file look like in terms of holding 
>> > > ifindex of the interface, which can be passed to libnetdevname.
>> > 
>> > I can't think of anything we need to store in the regular file. If 
>> > we have the kernel name for the device, we can look up the ifindex 
>> > in /sys. Correct me if I'm wrong, but storing it ourselves seems 
>> > redundant.
>> 
>> But the name of a netdev can change whereas its ifindex never does.
>> Identifying netdevs by name would require additional work to update 
>> the links when a netdev is renamed and would still be prone to race 
>> conditions.  This is why Narendra and Matt were proposing to store the
>> ifindex in the node all along...
>
>Matt, Ben and I talked about a few other possibilities on IRC.
>The one I like the most at the moment is an idea Ben had to 
>create dummy files named after the ifindex. Then, use symlinks 
>for the kernel name and the various by-$property 
>subdirectories. This means the KOBJ events will need to expose 
>the ifindex.
>

I suppose the KOBJ events already expose the ifindex of a network
interface. The file "/sys/class/net/ethN/uevent" contains INTERFACE=ethN
and IFINDEX=n already. But it looks like udev doesn't use it in any way.
For example, with the kernel patch the "/sys/class/net/ethN/uevent"
contains in addition to the above details, MAJOR=M and MINOR=m which the
udev knows how to make use of with a rule like 

SUBSYSTEM=="net", KERNEL!="tun", NAME="netdev/%k", MODE="0600".

>I'm a novice at net programming, but I'm told that ifindex is 
>the information apps ultimately require here.

Yes. libnetdevname retrieves the minor number of the device node by
"stat"ing the pathname. The minor number happens to be the ifindex of the
device, and it is mapped to the corresponding kernel name by an
"if_indextoname" call.
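
The lookup described above can be sketched roughly as follows. This is not the libnetdevname source; the function names node_to_ifindex() and node_to_ifname() are hypothetical, and the sketch assumes the node was created by the patch in this thread, so that its minor number equals the ifindex.

```c
#include <net/if.h>
#include <sys/stat.h>
#include <sys/sysmacros.h>
#include <sys/types.h>

/* stat() a character device node and report its minor number.
 * stat() follows symlinks, so a by-* alias resolves to the real node.
 * Returns 0 on success, -1 if the path is missing or not a char device. */
static int node_to_ifindex(const char *path, unsigned int *ifindex)
{
	struct stat st;

	if (stat(path, &st) != 0 || !S_ISCHR(st.st_mode))
		return -1;
	*ifindex = minor(st.st_rdev);
	return 0;
}

/* Map a node path to the interface's current kernel name.
 * 'name' must have room for IF_NAMESIZE bytes.
 * Returns 0 on success, -1 if the node or the interface is not found. */
static int node_to_ifname(const char *path, char *name)
{
	unsigned int ifindex;

	if (node_to_ifindex(path, &ifindex) != 0)
		return -1;
	return if_indextoname(ifindex, name) ? 0 : -1;
}
```

Because the mapping goes through the ifindex at call time, the result is the current kernel name even if the interface was renamed after the node was created.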

With regards,
Narendra K  
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Bryan Kadzban Oct. 19, 2009, 4:14 p.m. UTC | #27
Narendra_K@Dell.com wrote:
>>>>> And how would the regular file look like in terms of holding
>>>>>  ifindex of the interface, which can be passed to
>>>>> libnetdevname.
>>>> I can't think of anything we need to store in the regular file.
>>>> If we have the kernel name for the device, we can look up the
>>>> ifindex in /sys. Correct me if I'm wrong, but storing it
>>>> ourselves seems redundant.
>>> But the name of a netdev can change whereas its ifindex never
>>> does. Identifying netdevs by name would require additional work
>>> to update the links when a netdev is renamed and would still be
>>> prone to race conditions.  This is why Narendra and Matt were
>>> proposing to store the ifindex in the node all along...
>> Matt, Ben and I talked about a few other possibilities on IRC. The
>> one I like the most at the moment is an idea Ben had to create dummy
>> files named after the ifindex. Then, use symlinks for the kernel
>> name and the various by-$property subdirectories. This means the
>> KOBJ events will need to expose the ifindex.
>> 
> 
> I suppose the KOBJ events already expose the ifindex of a network 
> interface. The file "/sys/class/net/ethN/uevent" contains
> INTERFACE=ethN and IFINDEX=n already. But it looks like udev doesn't
> use it in any way.

Right; it could simply do the equivalent of:

touch /dev/netdev/$env{IFINDEX}

instead of its normal mknod(2), and then do normal SYMLINK processing.
That last part is what would link /dev/netdev/by-name/$env{INTERFACE} to
that device, along with /dev/netdev/by-mac/*, /dev/netdev/by-path/*,
etc., etc., in as many different ways as people want to add rules.
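
As a rough sketch of what that would amount to outside of udev (udev itself does not do this today), the code below creates an empty regular file named after the ifindex and a relative by-name symlink pointing at it. The function name make_netdev_entry() and the directory layout passed in are assumptions for illustration only.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/* Create <dir>/<ifindex> as an empty regular file and point
 * <dir>/by-name/<ifname> at it with a relative symlink.
 * Returns 0 on success, -1 on error (errno describes the failure). */
static int make_netdev_entry(const char *dir, unsigned int ifindex,
			     const char *ifname)
{
	char file[512], subdir[512], linkpath[512], target[32];
	int fd;

	if (snprintf(file, sizeof(file), "%s/%u", dir, ifindex) >= (int)sizeof(file) ||
	    snprintf(subdir, sizeof(subdir), "%s/by-name", dir) >= (int)sizeof(subdir) ||
	    snprintf(linkpath, sizeof(linkpath), "%s/by-name/%s", dir, ifname) >= (int)sizeof(linkpath)) {
		errno = ENAMETOOLONG;
		return -1;
	}
	snprintf(target, sizeof(target), "../%u", ifindex);

	/* Equivalent of "touch": create the file if missing, leave it empty. */
	fd = open(file, O_WRONLY | O_CREAT, 0600);
	if (fd < 0)
		return -1;
	close(fd);

	if (mkdir(subdir, 0755) != 0 && errno != EEXIST)
		return -1;

	/* Drop a stale link of the same name before creating the new one. */
	unlink(linkpath);
	return symlink(target, linkpath);
}
```

Calling make_netdev_entry("/dev/netdev", 2, "eth0") would leave /dev/netdev/2 and /dev/netdev/by-name/eth0 -> ../2. The unlink-then-symlink step is not atomic, which is one reason to prefer udev's existing SYMLINK handling over a hand-rolled helper.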

(Or /dev/net/by-* instead of netdev; I'm mostly ambivalent about the
first-level directory under /dev.  Looks like libnetdevname requires
/dev/netdev though.)

> For example, with the kernel patch the "/sys/class/net/ethN/uevent" 
> contains in addition to the above details, MAJOR=M and MINOR=m which
> the udev knows how to make use of with a rule like
> 
> SUBSYSTEM=="net", KERNEL!="tun", NAME="netdev/%k", MODE="0600".

And if the only point is to get the ifindex via stat(2) on the resulting
symlinks, but people don't like device files, then why not get the
ifindex via readlink(2) (and a bit of string parsing, and a strtol(3) or
strtoul(3) call) instead?  :-)
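
A minimal sketch of that readlink(2) approach follows. It assumes the symlink target's last path component is the decimal ifindex (for example by-name/eth0 -> ../2); the function name symlink_to_ifindex() is hypothetical.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Read a symlink and parse the last component of its target as an ifindex.
 * Returns 0 on success, -1 if the link cannot be read or the component
 * is not a positive decimal number. */
static int symlink_to_ifindex(const char *linkpath, unsigned int *ifindex)
{
	char target[4096];
	const char *base;
	char *end;
	unsigned long v;
	ssize_t n;

	n = readlink(linkpath, target, sizeof(target) - 1);
	if (n < 0)
		return -1;
	target[n] = '\0';	/* readlink(2) does not NUL-terminate */

	/* The "bit of string parsing": keep only the final path component. */
	base = strrchr(target, '/');
	base = base ? base + 1 : target;
	if (*base < '0' || *base > '9')
		return -1;

	errno = 0;
	v = strtoul(base, &end, 10);
	if (errno || *end != '\0' || v == 0 || v > UINT_MAX)
		return -1;
	*ifindex = (unsigned int)v;
	return 0;
}
```

Unlike stat(2), this never touches the link target, so it works whether the target is a device node, a regular file, or missing entirely.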
Narendra K Nov. 4, 2009, 2:23 p.m. UTC | #28
>>> Matt, Ben and I talked about a few other possibilities on IRC. The 
>>> one I like the most at the moment is an idea Ben had to create dummy
>>> files named after the ifindex. Then, use symlinks for the kernel name
>>> and the various by-$property subdirectories. This means the KOBJ
>>> events will need to expose the ifindex.
>>> 
>> 
>> I suppose the KOBJ events already expose the ifindex of a network 
>> interface. The file "/sys/class/net/ethN/uevent" contains 
>> INTERFACE=ethN and IFINDEX=n already. But it looks like udev doesn't 
>> use it in any way.
>
>Right; it could simply do the equivalent of:
>
>touch /dev/netdev/$env{IFINDEX}
>
>instead of its normal mknod(2), and then do normal SYMLINK processing.
>That last part is what would link 
>/dev/netdev/by-name/$env{INTERFACE} to that device, along with 
>/dev/netdev/by-mac/*, /dev/netdev/by-path/*, etc., etc., in as 
>many different ways as people want to add rules.
>
>(Or /dev/net/by-* instead of netdev; I'm mostly ambivalent 
>about the first-level directory under /dev.  Looks like 
>libnetdevname requires /dev/netdev though.)
>
>> For example, with the kernel patch the "/sys/class/net/ethN/uevent" 
>> contains in addition to the above details, MAJOR=M and MINOR=m which 
>> the udev knows how to make use of with a rule like
>> 
>> SUBSYSTEM=="net", KERNEL!="tun", NAME="netdev/%k", MODE="0600".
>
>And if the only point is to get the ifindex via stat(2) on the 
>resulting symlinks, but people don't like device files, then 
>why not get the ifindex via readlink(2) (and a bit of string 
>parsing, and a strtol(3) or
>strtoul(3) call) instead?  :-)


I suppose this issue can also be addressed in another way. Currently,
sysfs exposes various attributes of a network interface under the
directory "/sys/class/net/ethN", for example
"/sys/class/net/ethN/address". udev can match on such an attribute as below -

SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*",
ATTR{address}=="00:1d:09:6a:78:ec", ATTR{type}=="1", KERNEL=="eth*",
NAME="eth1".

Similarly, export an attribute named "smbios_name" to sysfs, i.e.
"/sys/class/net/eth0/smbios_name". "cat /sys/class/net/eth0/smbios_name"
would show "Embedded_NIC_1[23..]" and this can be used by udev in
70-persistent-net.rules as 

SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*",
ATTR{smbios_name}=="Embedded_NIC_1", ATTR{type}=="1", KERNEL=="eth*",
NAME="eth0".

I suppose this would not need any changes to the udev code, and the existing
udev infrastructure can be used, as udev is capable of handling
ATTR{something}.

This would also ensure that whichever device is "Embedded_NIC_1" as per
the BIOS will also be "eth0" in the OS.

Netdev, what are your views on this idea?

With regards,
Narendra K  
Marco d'Itri Nov. 6, 2009, 8:49 a.m. UTC | #29
On Nov 04, Narendra_K@Dell.com wrote:

> SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*",
> ATTR{smbios_name}=="Embedded_NIC_1", ATTR{type}=="1", KERNEL=="eth*",
> NAME="eth0".
As a distribution developer I highly value solutions like this which do
not require patching every application which deals with interface names
and then teaching users about aliases which only work in some places and
are unknown to the kernel.
Matt Domsch Nov. 6, 2009, 10:05 p.m. UTC | #30
On Wed, Nov 04, 2009 at 08:23:38AM -0600, K, Narendra wrote:
> Similarly, export an attribute named "smbios_name" to sysfs, i.e.
> "/sys/class/net/eth0/smbios_name". "cat /sys/class/net/eth0/smbios_name"
> would show "Embedded_NIC_1[23..]" and this can be used by udev in
> 70-persistent-net.rules as 
> 
> SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*",
> ATTR{smbios_name}=="Embedded_NIC_1", ATTR{type}=="1", KERNEL=="eth*",
> NAME="eth0".
> 
> I suppose this would not need any changes to the udev code, and the existing
> udev infrastructure can be used, as udev is capable of handling
> ATTR{something}.
> 
> This would also ensure that whichever device is "Embedded_NIC_1" as per
> the BIOS will also be "eth0" in the OS.

We can grab the smbios_name value using biosdevname in a PROGRAM= part
of the udev rule.  But it doesn't actually solve the problem.  We
haven't changed the network device naming scheme from "eth%d" to
something else.  Therefore, by having rules which simply try to
re-order names within that scheme, when they're being enumerated in
parallel and racing, we get collisions.  Take for example, this which
tries to rename the 4 onboard NICs in a particular order, in the
absence of any other rules:

PROGRAM="/sbin/biosdevname --policy=smbios_names -i %k", RESULT=="Embedded NIC 1", NAME="eth0"
PROGRAM="/sbin/biosdevname --policy=smbios_names -i %k", RESULT=="Embedded NIC 2", NAME="eth1"
PROGRAM="/sbin/biosdevname --policy=smbios_names -i %k", RESULT=="Embedded NIC 3", NAME="eth2"
PROGRAM="/sbin/biosdevname --policy=smbios_names -i %k", RESULT=="Embedded NIC 4", NAME="eth3"

Instead, I wind up with this in ifconfig -a:

eth0        00:1B:21:42:66:30  
eth1        00:1B:21:42:66:31  
eth2        00:22:19:59:8E:5A  
eth2_rename 00:22:19:59:8E:56  
eth3        00:22:19:59:8E:5C  
eth3_rename 00:22:19:59:8E:58  

What I would have expected is:

eth0 00:22:19:59:8E:56
eth1 00:22:19:59:8E:58
eth2 00:22:19:59:8E:5A
eth3 00:22:19:59:8E:5C
eth4 00:1B:21:42:66:30  
eth5 00:1B:21:42:66:31  


I can't use eth%d as the scheme - that's the kernel's scheme.  I have
to switch the scheme to something else.
Matt Domsch Nov. 6, 2009, 10:06 p.m. UTC | #31
On Fri, Nov 06, 2009 at 09:49:21AM +0100, Marco d'Itri wrote:
> On Nov 04, Narendra_K@Dell.com wrote:
> 
> > SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*",
> > ATTR{smbios_name}=="Embedded_NIC_1", ATTR{type}=="1", KERNEL=="eth*",
> > NAME="eth0".
> As a distribution developer I highly value solutions like this which do
> not require patching every application which deals with interface names
> and then teaching users about aliases which only work in some places and
> are unknown to the kernel.

Fair enough - but would you object if we changed the naming scheme
from eth%d to something else?
Marco d'Itri Nov. 6, 2009, 10:35 p.m. UTC | #32
On Nov 06, Matt Domsch <Matt_Domsch@Dell.com> wrote:

> > As a distribution developer I highly value solutions like this which do
> > not require patching every application which deals with interface names
> > and then teaching users about aliases which only work in some places and
> > are unknown to the kernel.
> Fair enough - but would you object if we changed the naming scheme
> from eth%d to something else?
I suppose that this would depend on what else. :-)
Since you want radical changes I recommend that you design the new
persistent naming infrastructure in a way that will allow root to choose
to use the classic naming scheme, or many users will scream a lot and at
least some distributions will do it anyway.
I also expect that providing choice at the beginning of development may
lead to more acceptance later if and when the new scheme will have
proved itself to be superior (at least in some situations).
You have thought about this for a long time and if so far you have not
found a solution which is widely considered superior then I doubt that
one will appear soon. Providing your favourite naming scheme as an
optional add on will immediately benefit those who like it and greatly
reduce opposition from those who do not.
dann frazier Nov. 6, 2009, 11:17 p.m. UTC | #33
On Fri, Nov 06, 2009 at 11:35:24PM +0100, Marco d'Itri wrote:
> On Nov 06, Matt Domsch <Matt_Domsch@Dell.com> wrote:
> 
> > > As a distribution developer I highly value solutions like this which do
> > > not require patching every application which deals with interface names
> > > and then teaching users about aliases which only work in some places and
> > > are unknown to the kernel.
> > Fair enough - but would you object if we changed the naming scheme
> > from eth%d to something else?
> I suppose that this would depend on what else. :-)
> Since you want radical changes I recommend that you design the new
> persistent naming infrastructure in a way that will allow root to choose
> to use the classic naming scheme, or many users will scream a lot and at
> least some distributions will do it anyway.
> I also expect that providing choice at the beginning of development may
> lead to more acceptance later if and when the new scheme will have
> proved itself to be superior (at least in some situations).
> You have thought about this for a long time and if so far you have not
> found a solution which is widely considered superior then I doubt that
> one will appear soon. Providing your favourite naming scheme as an
> optional add on will immediately benefit those who like it and greatly
> reduce opposition from those who do not.

This seems to me like a good installer feature - give the user an
option to enter a name for an interface, with the default option
to use the eth* names. To illustrate by example, I imagine an
installer flow that looks like this:

 [Do Hardware Discovery]
 [Automatically reorder kernel names for reasonable defaults;
  eth0-eth{n-1} map to n onboard nics]

  Sample user interface for network configuration:

 ------------Choose an interface to configure --------------
 | Multiple unconfigured interfaces detected.              |
 | Select an interface to configure by:                    |
 |   1. Kernel name (eth0, eth1, etc)                      |
 |   2. Mac Address                                        |
 |   3. Chassis name                                       |
 |   4. PCI Slot                                           |
 -----------------------------------------------------------

 ----Choose an interface to configure (by chassis name)-----
 |   1. LOM0                                               |
 |   2. LOM1                                               |
 |   3. Undefined                                          |
 |   4. Undefined                                          |
 -----------------------------------------------------------

 ----------------Name interface - (chassis name LOM0)-------
 |   Name to use for this interface [eth0]: __mynet0_      |
 -----------------------------------------------------------

 -----------------------------------------------------------
 | Configure interface - mynet0                            |
 |   1. DHCP                                               |
 |   2. Static                                             |
 |   ...                                                   |
 -----------------------------------------------------------
 
[Generate udev rules that bind the user-selected name to
 the user-selected attribute]
Narendra K Nov. 9, 2009, 2:41 p.m. UTC | #34
>> > As a distribution developer I highly value solutions like this which
>> > do not require patching every application which deals with interface
>> > names and then teaching users about aliases which only work in some
>> > places and are unknown to the kernel.
>> Fair enough - but would you object if we changed the naming scheme 
>> from eth%d to something else?
>I suppose that this would depend on what else. :-) Since you 
>want radical changes I recommend that you design the new 
>persistent naming infrastructure in a way that will allow root 
>to choose to use the classic naming scheme, or many users will 
>scream a lot and at least some distributions will do it anyway.
>I also expect that providing choice at the beginning of 
>development may lead to more acceptance later if and when the 
>new scheme will have proved itself to be superior (at least in 
>some situations).
>You have thought about this for a long time and if so far you 
>have not found a solution which is widely considered superior 
>then I doubt that one will appear soon. Providing your 
>favourite naming scheme as an optional add on will immediately 
>benefit those who like it and greatly reduce opposition from 
>those who do not.

In that way, I suppose the char device node solution fits the scheme
perfectly. It doesn't change or interfere with the kernel's default
naming scheme (ethN) in any way. Also, the applications continue to work
the way they did and, in addition to supporting traditional names, they
would also support pathnames. Whether all the user space applications
need to be patched can be discussed and debated. But we can patch
applications such as installers and firewall code, which fail with very
high impact when they don't see determinism ("eth0 mapping to
integrated port 1"). Since users are already familiar with
pathnames like /dev/disk/by-id{label, uuid}, I suppose it might not be
very difficult to get used to pathnames like
/dev/netdev/by-chassis-label/Embedded_NIC_1. Would that be acceptable?


With regards,
Narendra K  
stephen hemminger Nov. 10, 2009, 5:23 p.m. UTC | #35
On Mon, 9 Nov 2009 20:11:47 +0530
<Narendra_K@Dell.com> wrote:

> 
> >> > As a distribution developer I highly value solutions like this which
> >> > do not require patching every application which deals with interface
> >> > names and then teaching users about aliases which only work in some
> >> > places and are unknown to the kernel.
> >> Fair enough - but would you object if we changed the naming scheme 
> >> from eth%d to something else?
> >I suppose that this would depend on what else. :-) Since you 
> >want radical changes I recommend that you design the new 
> >persistent naming infrastructure in a way that will allow root 
> >to choose to use the classic naming scheme, or many users will 
> >scream a lot and at least some distributions will do it anyway.
> >I also expect that providing choice at the beginning of 
> >development may lead to more acceptance later if and when the 
> >new scheme will have proved itself to be superior (at least in 
> >some situations).
> >You have thought about this for a long time and if so far you 
> >have not found a solution which is widely considered superior 
> >then I doubt that one will appear soon. Providing your 
> >favourite naming scheme as an optional add on will immediately 
> >benefit those who like it and greatly reduce opposition from 
> >those who do not.
> 
> In that way, I suppose the char device node solution fits the scheme
> perfectly. It doesn't change or interfere with the kernel's default
> naming scheme (ethN) in any way. Also, the applications continue to work
> the way they did and, in addition to supporting traditional names, they
> would also support pathnames. Whether all the user space applications
> need to be patched can be discussed and debated. But we can patch
> applications such as installers and firewall code, which fail with very
> high impact when they don't see determinism ("eth0 mapping to
> integrated port 1"). Since users are already familiar with
> pathnames like /dev/disk/by-id{label, uuid}, I suppose it might not be
> very difficult to get used to pathnames like
> /dev/netdev/by-chassis-label/Embedded_NIC_1. Would that be acceptable?
>

IFNAMSIZ = 16 is hardwired as part of the kernel binary user space API.

Have you observed that the only developers arguing for this come from
outside the normal circle of networking? It seems to be favored only
by those who come to networking from a system or disk point of view.
Narendra K Nov. 11, 2009, 6:31 a.m. UTC | #36
>> >> Fair enough - but would you object if we changed the naming scheme
>> >> from eth%d to something else?
>> >I suppose that this would depend on what else. :-) Since you want
>> >radical changes I recommend that you design the new persistent naming
>> >infrastructure in a way that will allow root to choose to use the
>> >classic naming scheme, or many users will scream a lot and at least
>> >some distributions will do it anyway.
>> >I also expect that providing choice at the beginning of development
>> >may lead to more acceptance later if and when the new scheme will
>> >have proved itself to be superior (at least in some situations).
>> >You have thought about this for a long time and if so far you have not
>> >found a solution which is widely considered superior then I doubt
>> >that one will appear soon. Providing your favourite naming scheme as
>> >an optional add on will immediately benefit those who like it and
>> >greatly reduce opposition from those who do not.
>>
>> In that way, I suppose the char device node solution fits the scheme
>> perfectly. It doesn't change or interfere with the kernel's default
>> naming scheme (ethN) in any way. Also, the applications continue to
>> work the way they did and, in addition to supporting traditional names,
>> they would also support pathnames. Whether all the user space
>> applications need to be patched can be discussed and debated. But we
>> can patch applications such as installers and firewall code, which
>> fail with very high impact when they don't see determinism ("eth0
>> mapping to integrated port 1"). Since users are already
>> familiar with pathnames like /dev/disk/by-id{label, uuid}, I suppose
>> it might not be very difficult to get used to pathnames like
>> /dev/netdev/by-chassis-label/Embedded_NIC_1. Would that be acceptable?
>>
>
>IFNAMSIZ = 16 is hardwired as part of the kernel binary user space API.

This factor is taken into consideration. The user space applications
take this pathname, map it to the kernel name and use the kernel name to
issue ioctls (http://linux.dell.com/wiki/index.php/Oss/libnetdevname).
The pathname was suggested because it provides a way to get to the right
interface when "integrated port 1" doesn't get the expected name "eth0".

With regards,
Narendra K  

Patch

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 94958c1..7c0fc81 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -44,6 +44,7 @@ 
 #include <linux/workqueue.h>
 
 #include <linux/ethtool.h>
+#include <linux/cdev.h>
 #include <net/net_namespace.h>
 #include <net/dsa.h>
 #ifdef CONFIG_DCB
@@ -916,6 +917,9 @@  struct net_device
 	/* max exchange id for FCoE LRO by ddp */
 	unsigned int		fcoe_ddp_xid;
 #endif
+#ifdef CONFIG_NET_CDEV
+	struct cdev cdev;
+#endif
 };
 #define to_net_dev(d) container_of(d, struct net_device, dev)
 
diff --git a/net/Kconfig b/net/Kconfig
index 041c35e..bdc5bd7 100644
--- a/net/Kconfig
+++ b/net/Kconfig
@@ -43,6 +43,16 @@  config COMPAT_NETLINK_MESSAGES
 	  Newly written code should NEVER need this option but do
 	  compat-independent messages instead!
 
+config NET_CDEV
+       bool "/dev files for network devices"
+       default y
+       help
+         This option causes /dev entries to be created for each
+         network device.  This allows the use of udev to create
+         alternate device naming policies.
+
+	 If unsure, say Y.
+
 menu "Networking options"
 
 source "net/packet/Kconfig"
diff --git a/net/core/Makefile b/net/core/Makefile
index 796f46e..0b40d2c 100644
--- a/net/core/Makefile
+++ b/net/core/Makefile
@@ -19,4 +19,5 @@  obj-$(CONFIG_NET_DMA) += user_dma.o
 obj-$(CONFIG_FIB_RULES) += fib_rules.o
 obj-$(CONFIG_TRACEPOINTS) += net-traces.o
 obj-$(CONFIG_NET_DROP_MONITOR) += drop_monitor.o
+obj-$(CONFIG_NET_CDEV) += cdev.o
 
diff --git a/net/core/cdev.c b/net/core/cdev.c
new file mode 100644
index 0000000..1f36076
--- /dev/null
+++ b/net/core/cdev.c
@@ -0,0 +1,42 @@ 
+#include <linux/fs.h>
+#include <linux/cdev.h>
+#include <linux/netdevice.h>
+#include <linux/device.h>
+
+/* Used for network dynamic major number */
+static dev_t netdev_devt;
+
+static int netdev_cdev_open(struct inode *inode, struct file *filep)
+{
+	/* no operations on this device are implemented */
+	return -ENOSYS;
+}
+
+static const struct file_operations netdev_cdev_fops = {
+	.owner = THIS_MODULE,
+	.open = netdev_cdev_open,
+};
+
+void netdev_cdev_alloc(void)
+{
+	alloc_chrdev_region(&netdev_devt, 0, 1<<20, "net");
+}
+
+void netdev_cdev_init(struct net_device *dev)
+{
+	cdev_init(&dev->cdev, &netdev_cdev_fops);
+	cdev_add(&dev->cdev, MKDEV(MAJOR(netdev_devt), dev->ifindex), 1);
+
+}
+
+void netdev_cdev_del(struct net_device *dev)
+{
+	if (dev->cdev.dev)
+		cdev_del(&dev->cdev);
+}
+
+void netdev_cdev_kobj_init(struct device *dev, struct net_device *net)
+{
+	if (net->cdev.dev)
+		dev->devt = net->cdev.dev;
+}
diff --git a/net/core/cdev.h b/net/core/cdev.h
new file mode 100644
index 0000000..9cf5a90
--- /dev/null
+++ b/net/core/cdev.h
@@ -0,0 +1,13 @@ 
+#include <linux/netdevice.h>
+
+#ifdef CONFIG_NET_CDEV
+void netdev_cdev_alloc(void);
+void netdev_cdev_init(struct net_device *dev);
+void netdev_cdev_del(struct net_device *dev);
+void netdev_cdev_kobj_init(struct device *dev, struct net_device *net);
+#else
+static inline void netdev_cdev_alloc(void) {}
+static inline void netdev_cdev_init(struct net_device *dev) {}
+static inline void netdev_cdev_del(struct net_device *dev) {}
+static inline void netdev_cdev_kobj_init(struct device *dev, struct net_device *net) {}
+#endif
diff --git a/net/core/dev.c b/net/core/dev.c
index b8f74cf..c4ebfcd 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -129,6 +129,7 @@ 
 #include <trace/events/napi.h>
 
 #include "net-sysfs.h"
+#include "cdev.h"
 
 /* Instead of increasing this, you should create a hash table. */
 #define MAX_GRO_SKBS 8
@@ -4684,6 +4685,7 @@  static void rollback_registered(struct net_device *dev)
 
 	/* Remove entries from kobject tree */
 	netdev_unregister_kobject(dev);
+	netdev_cdev_del(dev);
 
 	synchronize_net();
 
@@ -4835,6 +4837,8 @@  int register_netdevice(struct net_device *dev)
 	if (dev->features & NETIF_F_SG)
 		dev->features |= NETIF_F_GSO;
 
+	netdev_cdev_init(dev);
+
 	netdev_initialize_kobject(dev);
 	ret = netdev_register_kobject(dev);
 	if (ret)
@@ -4864,6 +4868,7 @@  out:
 	return ret;
 
 err_uninit:
+	netdev_cdev_del(dev);
 	if (dev->netdev_ops->ndo_uninit)
 		dev->netdev_ops->ndo_uninit(dev);
 	goto out;
@@ -5371,6 +5376,7 @@  int dev_change_net_namespace(struct net_device *dev, struct net *net, const char
 	dev_addr_discard(dev);
 
 	netdev_unregister_kobject(dev);
+	netdev_cdev_del(dev);
 
 	/* Actually switch the network namespace */
 	dev_net_set(dev, net);
@@ -5387,6 +5393,8 @@  int dev_change_net_namespace(struct net_device *dev, struct net *net, const char
 			dev->iflink = dev->ifindex;
 	}
 
+	netdev_cdev_init(dev);
+
 	/* Fixup kobjects */
 	err = netdev_register_kobject(dev);
 	WARN_ON(err);
@@ -5620,6 +5628,8 @@  static int __init net_dev_init(void)
 
 	BUG_ON(!dev_boot_phase);
 
+	netdev_cdev_alloc();
+
 	if (dev_proc_init())
 		goto out;
 
diff --git a/net/core/net-sysfs.c b/net/core/net-sysfs.c
index 821d309..ba0af79 100644
--- a/net/core/net-sysfs.c
+++ b/net/core/net-sysfs.c
@@ -19,6 +19,7 @@ 
 #include <net/wext.h>
 
 #include "net-sysfs.h"
+#include "cdev.h"
 
 #ifdef CONFIG_SYSFS
 static const char fmt_hex[] = "%#x\n";
@@ -461,6 +462,14 @@  static void netdev_release(struct device *d)
 	kfree((char *)dev - dev->padded);
 }
 
+#ifdef CONFIG_NET_CDEV
+static char *netdev_devnode(struct device *d, mode_t *mode)
+{
+	struct net_device *dev = to_net_dev(d);
+	return kasprintf(GFP_KERNEL, "netdev/%s", dev->name);
+}
+#endif
+
 static struct class net_class = {
 	.name = "net",
 	.dev_release = netdev_release,
@@ -470,6 +479,9 @@  static struct class net_class = {
 #ifdef CONFIG_HOTPLUG
 	.dev_uevent = netdev_uevent,
 #endif
+#ifdef CONFIG_NET_CDEV
+	.devnode = netdev_devnode,
+#endif
 };
 
 /* Delete sysfs entries but hold kobject reference until after all
@@ -496,6 +508,7 @@  int netdev_register_kobject(struct net_device *net)
 	dev->class = &net_class;
 	dev->platform_data = net;
 	dev->groups = groups;
+	netdev_cdev_kobj_init(dev, net);
 
 	dev_set_name(dev, "%s", net->name);