diff mbox

[V2] Use firmware provided index to register a network interface

Message ID 20101007142319.GB2641@libnet-test.oslab.blr.amer.dell.com
State Not Applicable, archived
Delegated to: David Miller
Headers show

Commit Message

Narendra K Oct. 7, 2010, 2:23 p.m. UTC
Hello,

V1 -> V2:

This patch addresses the scenario of buggy firmware/BIOS tables. The
patch introduces a command line parameter 'no_netfwindex', passing which
firmware provided index will not be used to derive 'eth' names. By
default, firmware index will be used and the parameter can be used to
work around buggy firmware/BIOS tables.

Please find the patch below.

From: Narendra K <narendra_k@dell.com>
Subject: [PATCH] Use firmware provided index to register a network device

This patch uses the firmware provided index to derive the ethN name.
If the firmware provides an index for the corresponding pdev, the N
is derived from the index.

As an example, consider a PowerEdge R710 which has 4 BCM5709
Lan-On-Motherboard ports,1 Intel 82572EI port and 4 82575GB ports.
The system firmware communicates the order of the 4 Lan-On-Motherboard
ports by assigning indexes to each one of them. This is available to
the OS as the SMBIOS type 41 record(for onboard devices), in the field
'device type index'. It looks like below -

Handle 0x2900, DMI type 41, 11 bytes
Onboard Device
	Reference Designation: Embedded NIC 1
	Type: Ethernet
	Status: Enabled
	Type Instance: 1
	Bus Address: 0000:01:00.0

Handle 0x2901, DMI type 41, 11 bytes
Onboard Device
	Reference Designation: Embedded NIC 2
	Type: Ethernet
	Status: Enabled
	Type Instance: 2
	Bus Address: 0000:01:00.1

Handle 0x2902, DMI type 41, 11 bytes
Onboard Device
	Reference Designation: Embedded NIC 3
	Type: Ethernet
	Status: Enabled
	Type Instance: 3
	Bus Address: 0000:02:00.0

Handle 0x2903, DMI type 41, 11 bytes
Onboard Device
	Reference Designation: Embedded NIC 4
	Type: Ethernet
	Status: Enabled
	Type Instance: 4
	Bus Address: 0000:02:00.1

The OS can use this index to name the network interfaces as below.

Onboard devices -

Interface		Fwindex		Driver
Name
eth[fwindex - 1] =eth0	1		bnx2
eth[fwindex - 1] =eth1	2		bnx2
eth[fwindex - 1] =eth2	3		bnx2
eth[fwindex - 1] =eth3	4		bnx2

The add-in devices do not get any index and they will get names from
eth4 onwards.

Add-in interfaces -

eth4			e1000e
eth5			igb
eth6			igb
eth7			igb
eth8			igb

With this patch,

1. This patch adheres to the established ABI of ethN namespace with
IFNAMSIZ length and ensures that onboard network interfaces get
expected names at the first instance itself and avoids any renaming
later.

2. The 'eth0' of the OS always corresponds to the 'Gb1' as labeled on
the system chassis. There is determinism in the way Lan-On-Motherboard
ports get named.

3. The add-in devices will always be named from beyond what the
Lan-On-Motherboard names as show above. But there is no determinism
as to which add-in interface gets what ethN name.

Passing 'no_netfwindex' command line parameter would result in
firmware index not being used to derive the names as described above.

Signed-off-by: Narendra K <narendra_k@dell.com>
---
 Documentation/kernel-parameters.txt |    6 +++++
 drivers/pci/pci-label.c             |    1 +
 drivers/pci/pci-sysfs.c             |    5 ++++
 include/linux/netdevice.h           |    2 +
 include/linux/pci.h                 |    1 +
 net/core/dev.c                      |   42 ++++++++++++++++++++++++++++++-----
 6 files changed, 51 insertions(+), 6 deletions(-)

Comments

Greg KH Oct. 7, 2010, 3:11 p.m. UTC | #1
On Thu, Oct 07, 2010 at 07:23:35AM -0700, Narendra_K@Dell.com wrote:
> Hello,
> 
> V1 -> V2:
> 
> This patch addresses the scenario of buggy firmware/BIOS tables. The
> patch introduces a command line parameter 'no_netfwindex', passing which
> firmware provided index will not be used to derive 'eth' names. By
> default, firmware index will be used and the parameter can be used to
> work around buggy firmware/BIOS tables.
> 
> Please find the patch below.
> 
> From: Narendra K <narendra_k@dell.com>
> Subject: [PATCH] Use firmware provided index to register a network device
> 
> This patch uses the firmware provided index to derive the ethN name.
> If the firmware provides an index for the corresponding pdev, the N
> is derived from the index.

No, this has the very big chance to reorder existing network names and
should all be done in userspace with the firmware information exported
in sysfs, if the user so desires.

So consider this patch rejected from me.

thanks,

greg k-h
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
stephen hemminger Oct. 7, 2010, 4:14 p.m. UTC | #2
On Thu, 7 Oct 2010 07:23:35 -0700
<Narendra_K@Dell.com> wrote:

> Hello,
> 
> V1 -> V2:
> 
> This patch addresses the scenario of buggy firmware/BIOS tables. The
> patch introduces a command line parameter 'no_netfwindex', passing which
> firmware provided index will not be used to derive 'eth' names. By
> default, firmware index will be used and the parameter can be used to
> work around buggy firmware/BIOS tables.
> 
> Please find the patch below.
> 
> From: Narendra K <narendra_k@dell.com>
> Subject: [PATCH] Use firmware provided index to register a network device
> 
> This patch uses the firmware provided index to derive the ethN name.
> If the firmware provides an index for the corresponding pdev, the N
> is derived from the index.
> 
> As an example, consider a PowerEdge R710 which has 4 BCM5709
> Lan-On-Motherboard ports,1 Intel 82572EI port and 4 82575GB ports.
> The system firmware communicates the order of the 4 Lan-On-Motherboard
> ports by assigning indexes to each one of them. This is available to
> the OS as the SMBIOS type 41 record(for onboard devices), in the field
> 'device type index'. It looks like below -

I agree with Greg. Provide the firmware index in sysfs for use
in udev userspace and let udev do the policy on the naming.
The firmware index could be missing, wrong, or interesting on only
some devices.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Matt Domsch Oct. 7, 2010, 4:31 p.m. UTC | #3
On Thu, Oct 07, 2010 at 08:11:34AM -0700, Greg KH wrote:
> On Thu, Oct 07, 2010 at 07:23:35AM -0700, Narendra_K@Dell.com wrote:
> > Hello,
> > 
> > V1 -> V2:
> > 
> > This patch addresses the scenario of buggy firmware/BIOS tables. The
> > patch introduces a command line parameter 'no_netfwindex', passing which
> > firmware provided index will not be used to derive 'eth' names. By
> > default, firmware index will be used and the parameter can be used to
> > work around buggy firmware/BIOS tables.
> > 
> > Please find the patch below.
> > 
> > From: Narendra K <narendra_k@dell.com>
> > Subject: [PATCH] Use firmware provided index to register a network device
> > 
> > This patch uses the firmware provided index to derive the ethN name.
> > If the firmware provides an index for the corresponding pdev, the N
> > is derived from the index.
> 
> No, this has the very big chance to reorder existing network names and
> should all be done in userspace with the firmware information exported
> in sysfs, if the user so desires.

Existing names that use 70-persistent-net.rules won't change.  They
were non-deterministically assigned and recorded in that file, so
they'll persist.  At most, they'll have to deal with the same
"eth0_rename" garbage that anyone wanting to get a consistent name
from other than MAC address would get.  The only reason the
auto-created 70-persistent-net.rules works is that most people don't
change their NIC configuration after install, so it has no renaming to do.

The kernel patch has the advantage of not requiring users to change
their scripts, by reserving the first N values for onboard devices as
BIOS describes them.

Userspace udev rules require us to change user behavior, and likely
their scripts, to use a new namespace such as ethlom1, in order to get
deterministic naming.


> > It took some time to find out the details asked above. Right,
> > windows does not use SMBIOS type 41 record to derive names. But as
> > a datapoint, windows also has the same problem.

> If windows does not use this field, then you can guarantee it will
> show up incorrectly in BIOSes and never be tested by manufacturers
> before they ship their machines.

There are 2 methods of exporting the information that have gotten
confusing here.

1) SMBIOS type 41 method.  Windows does not use this today, and I
   can't speak to their future plans.  Narendra's kernel patch does,
   as has biosdevname, the udev helper we first wrote for this purpose, for several years.

2) soon-to-be-released PCIe Firmware Spec, exporting label and index
   as an ACPI _DSM().  It is anticipated that Windows will use this
   information in some fashion at some point, per our conversations
   with Microsoft as part of the PCI SIG proposal process.  We've held
   off submitting a kernel patch until the spec is public.



Cases:
1) BIOS doesn't provide this information (the common case today for
   all but Dell 10G and newer servers), nothing is reserved, and
   therefore the existing 70-persistent-net.rules take effect to name
   devices.  No change in behavior for existing systems.  New installs
   will have NICs named non-deterministically, and recorded in
   70-persistent-net.rules.  Users desiring consistent naming must
   adjust 70-persistent-net.rules after install, and to avoid
   eth0_rename collisions, must rename into another namespace.

2) BIOS provides this information correctly (Dell 10G and newer servers, or
   anyone else implementing the spec).  The first N values for onboard
   devices are reserved and assigned by the kernel to onboard
   devices.  70-persistent-net.rules may still try to rename the
   ports, and may fail with eth0_rename collisions.  Users would have
   to adjust 70-persistent-net.rules to accommodate, but they do not
   have to change the namespace from ethX to something else. New
   installs will have onboard NICs named deterministically, and recorded in
   70-persistent-net.rules.


3) BIOS provides this information incorrectly (none known today).
   Some number of N values may be reserved, more or fewer than
   actually present.  In either case, it's possible that
   70-persistent-net.rules would have to be adjusted to accommodate,
   but they do not have to change the namespace from ethX to something
   else. New installs will have onboard NICs named deterministically,
   if not ideally due to buggy BIOS, and recorded in
   70-persistent-net.rules.


What I can't find here is a solution where the user gets determistic
naming, without being forced to change the namespace and adjust their
scripts.  Can anyone?  I'm not opposed to "do it all in userspace" -
really, I'm not.  But the 'where' is only part of the problem.  No
userspace solution has yet been proposed that doesn't require a
namespace change, and I'm still hoping to avoid that.

Thanks,
Matt
Greg KH Oct. 7, 2010, 4:48 p.m. UTC | #4
On Thu, Oct 07, 2010 at 11:31:13AM -0500, Matt Domsch wrote:
> On Thu, Oct 07, 2010 at 08:11:34AM -0700, Greg KH wrote:
> > On Thu, Oct 07, 2010 at 07:23:35AM -0700, Narendra_K@Dell.com wrote:
> > > Hello,
> > > 
> > > V1 -> V2:
> > > 
> > > This patch addresses the scenario of buggy firmware/BIOS tables. The
> > > patch introduces a command line parameter 'no_netfwindex', passing which
> > > firmware provided index will not be used to derive 'eth' names. By
> > > default, firmware index will be used and the parameter can be used to
> > > work around buggy firmware/BIOS tables.
> > > 
> > > Please find the patch below.
> > > 
> > > From: Narendra K <narendra_k@dell.com>
> > > Subject: [PATCH] Use firmware provided index to register a network device
> > > 
> > > This patch uses the firmware provided index to derive the ethN name.
> > > If the firmware provides an index for the corresponding pdev, the N
> > > is derived from the index.
> > 
> > No, this has the very big chance to reorder existing network names and
> > should all be done in userspace with the firmware information exported
> > in sysfs, if the user so desires.
> 
> Existing names that use 70-persistent-net.rules won't change.

But users that don't use it would cause their names to change, and you
can't break that, no matter how naturally fragile those types of systems
are.  Sorry.

> The kernel patch has the advantage of not requiring users to change
> their scripts, by reserving the first N values for onboard devices as
> BIOS describes them.

So you feel that updating a kernel is easier than getting a user to
update their userspace scripts?  While I might also feel it is that way,
in reality, it isn't, sorry.

> Userspace udev rules require us to change user behavior, and likely
> their scripts, to use a new namespace such as ethlom1, in order to get
> deterministic naming.

Then use that, don't put this in the kernel please.

> > > It took some time to find out the details asked above. Right,
> > > windows does not use SMBIOS type 41 record to derive names. But as
> > > a datapoint, windows also has the same problem.
> 
> > If windows does not use this field, then you can guarantee it will
> > show up incorrectly in BIOSes and never be tested by manufacturers
> > before they ship their machines.
> 
> There are 2 methods of exporting the information that have gotten
> confusing here.
> 
> 1) SMBIOS type 41 method.  Windows does not use this today, and I
>    can't speak to their future plans.  Narendra's kernel patch does,
>    as has biosdevname, the udev helper we first wrote for this
>    purpose, for several years.

Then stick with that udev helper please :)

> 2) soon-to-be-released PCIe Firmware Spec, exporting label and index
>    as an ACPI _DSM().  It is anticipated that Windows will use this
>    information in some fashion at some point, per our conversations
>    with Microsoft as part of the PCI SIG proposal process.  We've held
>    off submitting a kernel patch until the spec is public.

Let's worry about that _when_ it is public, and when there is a Windows
version that supports it please.  Until then, there will not be support
for this in BIOSes in any usable way.

> Cases:
> 1) BIOS doesn't provide this information (the common case today for
>    all but Dell 10G and newer servers), nothing is reserved, and
>    therefore the existing 70-persistent-net.rules take effect to name
>    devices.  No change in behavior for existing systems.  New installs
>    will have NICs named non-deterministically, and recorded in
>    70-persistent-net.rules.  Users desiring consistent naming must
>    adjust 70-persistent-net.rules after install, and to avoid
>    eth0_rename collisions, must rename into another namespace.

That's fine and is how things work today, right?

> 2) BIOS provides this information correctly (Dell 10G and newer servers, or
>    anyone else implementing the spec).  The first N values for onboard
>    devices are reserved and assigned by the kernel to onboard
>    devices.

And you just broke people's machines that don't use udev for network
naming.  Sorry, that's not going to work.

>    70-persistent-net.rules may still try to rename the
>    ports, and may fail with eth0_rename collisions.

So you just broke their machines as well, when it was working, again,
not acceptable.

> 3) BIOS provides this information incorrectly (none known today).

Only because we have no way of testing for this :)

> What I can't find here is a solution where the user gets determistic
> naming, without being forced to change the namespace and adjust their
> scripts.  Can anyone?

No, and they shoudn't.

You should only have "deterministic" naming if you want it, and you set
it up to do so, from userspace.  The kernel is not responsible for this,
sorry.

If you want to do it in userspace, you have all the tools, and the data
exported from the kernel to do so.

> I'm not opposed to "do it all in userspace" - really, I'm not.  But
> the 'where' is only part of the problem.  No userspace solution has
> yet been proposed that doesn't require a namespace change, and I'm
> still hoping to avoid that.

Just use a new script for people who want this, they will have to know
they want it anyway, right?

thanks,

greg k-h
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Lamparter Oct. 7, 2010, 4:49 p.m. UTC | #5
Hi Matt,

On Thu, Oct 07, 2010 at 11:31:13AM -0500, Matt Domsch wrote:
> At most, they'll have to deal with the same "eth0_rename"

Getting that "eth0_rename" stuff is not something unchangeably imposed
by the kernel. I think this should be considered an udev bug most of all
things, since if the udev scripts did it right, this wouldn't happen. A
"right" way to do this would for example be to first rename all devices
to "tmpwhatever0", then rename them back to their proper ethX names.

Now I don't know even a single bit about udev scripting, so maybe this
isn't possible with the current udev infrastructure, but there's
certainly no kernel part stopping a proper implementation.

Btw, what happened to nameif(8)? It's totally outdated (and deprecated I
guess?), but it got the job done...


-David

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Kay Sievers Oct. 7, 2010, 5:05 p.m. UTC | #6
On Thu, Oct 7, 2010 at 18:48, Greg KH <greg@kroah.com> wrote:
> On Thu, Oct 07, 2010 at 11:31:13AM -0500, Matt Domsch wrote:
>> 1) SMBIOS type 41 method.  Windows does not use this today, and I
>>    can't speak to their future plans.  Narendra's kernel patch does,
>>    as has biosdevname, the udev helper we first wrote for this
>>    purpose, for several years.
>
> Then stick with that udev helper please :)

What about just exporting this information in sysfs, and not touch the naming?

Anyway, I'm pretty sure all of this naming of onboard devices should
happen only at install time, or from a system management tool and not
at hotplug time.

We should not get confused by the way the (very simple)
automatic-rule-creater for persistent netdev naming in udev works.
This is really just a tool for the common case, and works fine for the
majority of people.

I'm not sure, if we should put all these special use cases in the
hotplug path. I mean it's not that people add and remove 4 port
network cards with special BIOS all the time, and expect proper naming
on the first bootup, right? The installer, or the system management
tool could just create/edit udev rules to provide proper device naming
on whatever property is available at a specific hardware, be it the
MAC address or some other persistent match?

Kay
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Kay Sievers Oct. 7, 2010, 5:13 p.m. UTC | #7
On Thu, Oct 7, 2010 at 18:49, David Lamparter <equinox@diac24.net> wrote:
> On Thu, Oct 07, 2010 at 11:31:13AM -0500, Matt Domsch wrote:
>> At most, they'll have to deal with the same "eth0_rename"
>
> Getting that "eth0_rename" stuff is not something unchangeably imposed
> by the kernel. I think this should be considered an udev bug most of all
> things, since if the udev scripts did it right, this wouldn't happen. A
> "right" way to do this would for example be to first rename all devices
> to "tmpwhatever0", then rename them back to their proper ethX names.

Whats the difference between eth0_rename and tmpeth0? I don't see any. :)

> Now I don't know even a single bit about udev scripting, so maybe this
> isn't possible with the current udev infrastructure, but there's
> certainly no kernel part stopping a proper implementation.
>
> Btw, what happened to nameif(8)? It's totally outdated (and deprecated I
> guess?), but it got the job done...

It can't swap already taken device names on hotplug, and that job
might get complicated very easily. Also stuff needs to be renamed
before the device is announced to userspace, otherwise the interface
gets configured and can't be renamed anymore.

Kay
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Greg KH Oct. 7, 2010, 5:15 p.m. UTC | #8
On Thu, Oct 07, 2010 at 07:05:14PM +0200, Kay Sievers wrote:
> On Thu, Oct 7, 2010 at 18:48, Greg KH <greg@kroah.com> wrote:
> > On Thu, Oct 07, 2010 at 11:31:13AM -0500, Matt Domsch wrote:
> >> 1) SMBIOS type 41 method.  Windows does not use this today, and I
> >>    can't speak to their future plans.  Narendra's kernel patch does,
> >>    as has biosdevname, the udev helper we first wrote for this
> >>    purpose, for several years.
> >
> > Then stick with that udev helper please :)
> 
> What about just exporting this information in sysfs, and not touch the naming?

I think this is now exported in userspace, right Narendra?

> Anyway, I'm pretty sure all of this naming of onboard devices should
> happen only at install time, or from a system management tool and not
> at hotplug time.
> 
> We should not get confused by the way the (very simple)
> automatic-rule-creater for persistent netdev naming in udev works.
> This is really just a tool for the common case, and works fine for the
> majority of people.
> 
> I'm not sure, if we should put all these special use cases in the
> hotplug path. I mean it's not that people add and remove 4 port
> network cards with special BIOS all the time, and expect proper naming
> on the first bootup, right? The installer, or the system management
> tool could just create/edit udev rules to provide proper device naming
> on whatever property is available at a specific hardware, be it the
> MAC address or some other persistent match?

I totally agree.

thanks,

greg k-h
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Narendra K Oct. 7, 2010, 5:44 p.m. UTC | #9
On Thu, Oct 07, 2010 at 10:45:32PM +0530, Greg KH wrote:
>    On Thu, Oct 07, 2010 at 07:05:14PM +0200, Kay Sievers wrote:
>    > On Thu, Oct 7, 2010 at 18:48, Greg KH <greg@kroah.com> wrote:
>    > > On Thu, Oct 07, 2010 at 11:31:13AM -0500, Matt Domsch wrote:
>    > >> 1) SMBIOS type 41 method.  Windows does not use this today, and I
>    > >>    can't speak to their future plans.  Narendra's kernel patch does,
>    > >>    as has biosdevname, the udev helper we first wrote for this
>    > >>    purpose, for several years.
>    > >
>    > > Then stick with that udev helper please :)
>    >
>    > What about just exporting this information in sysfs, and not touch the
>    naming?
> 
>    I think this is now exported in userspace, right Narendra?
> 

Right. It is in the mainline kernel.
( commit 911e1c9b05a8e3559a7aa89083930700a0b9e7ee)
Narendra K Oct. 7, 2010, 6:44 p.m. UTC | #10
On Thu, Oct 07, 2010 at 10:35:14PM +0530, Kay Sievers wrote:
>    On Thu, Oct 7, 2010 at 18:48, Greg KH <greg@kroah.com> wrote:
>    > On Thu, Oct 07, 2010 at 11:31:13AM -0500, Matt Domsch wrote:
>    >> 1) SMBIOS type 41 method.  Windows does not use this today, and I
>    >>    can't speak to their future plans.  Narendra's kernel patch does,
>    >>    as has biosdevname, the udev helper we first wrote for this
>    >>    purpose, for several years.
>    >
>    > Then stick with that udev helper please :)
> 
>    What about just exporting this information in sysfs, and not touch the
>    naming?
> 
>    Anyway, I'm pretty sure all of this naming of onboard devices should
>    happen only at install time, or from a system management tool and not
>    at hotplug time.
> 
>    We should not get confused by the way the (very simple)
>    automatic-rule-creater for persistent netdev naming in udev works.
>    This is really just a tool for the common case, and works fine for the
>    majority of people.

Right. It works as the automatic rule creator saves the snapshot of the
registered network interfaces at run time. 

> 
>    I'm not sure, if we should put all these special use cases in the
>    hotplug path. I mean it's not that people add and remove 4 port
>    network cards with special BIOS all the time, and expect proper naming
>    on the first bootup, right? The installer, or the system management
>    tool could just create/edit udev rules to provide proper device naming
>    on whatever property is available at a specific hardware, be it the
>    MAC address or some other persistent match?
> 

The proposal made is not expecting deterministic naming when an add-in
card with 'N' ports is plugged in/out. It is specific to onboard devices
only. Expectation is onboard devices have deterministic naming at first
bootup. And no special BIOS required as SMBIOS tables are in use for
sometime now.

I did explore using rules based on the exported attribute ATTRS{index}
on a system with 4 Onboard devices and two add-in devices.(where add-in
device becomes eth0 and eth1)

# PCI device 0x14e4:0x164c (bnx2) (custom name provided by external
# tool) SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*",
# ATTRS{index}=="1", ATTR{type}=="1", KERNEL=="eth*", NAME="eth0"

(and similary for eth1..eth3)

And for add-in devices.

# PCI device 0x8086:0x10c9 (igb) (custom name provided by external tool)
# SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*",
# ATTR{address}=="00:1b:21:54:33:3c", ATTR{type}=="1", KERNEL=="eth*",
# NAME="eth4" (and similar for eth5 as they do not have an index)

This works as i edited the file manually.

If this has to be done on a large number of systems where an image
based deployment is preferred, getting onboard device names as expected
is an issue and is important.
Matt Domsch Oct. 7, 2010, 8:42 p.m. UTC | #11
On Thu, Oct 07, 2010 at 10:05:14AM -0700, Kay Sievers wrote:
> On Thu, Oct 7, 2010 at 18:48, Greg KH <greg@kroah.com> wrote:
> > On Thu, Oct 07, 2010 at 11:31:13AM -0500, Matt Domsch wrote:
> >> 1) SMBIOS type 41 method.  Windows does not use this today, and I
> >>    can't speak to their future plans.  Narendra's kernel patch does,
> >>    as has biosdevname, the udev helper we first wrote for this
> >>    purpose, for several years.
> >
> > Then stick with that udev helper please :)
> 
> What about just exporting this information in sysfs, and not touch the naming?

The config tools all take a netdevice name as their argument.  What
would it look like then?


$ ifconfig $(netdevname 'Embedded NIC 1')

repeat for each tool that's called?  This is similar to what we
proposed with the userspace patch and libnetdevname, so the lookup can
happen inside each app, rather than the system admin having to do the
translation themselves.  That was rejected too...

otherwise, it's up to a human to do the translation in their head,
which isn't script-friendly.

-Matt

--
Matt Domsch
Technology Strategist
Dell | Office of the CTO
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index 3cdb4d8..73edbc0 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -1529,6 +1529,12 @@  and is between 256 and 4096 characters. It is defined in the file
 			This usage is only documented in each driver source
 			file if at all.
 
+	no_netfwindex	[NET] Do not use firmware index to derive ethN names
+			Names for onboard network interfaces are derived from
+			the firmware provided index for these devices. Using
+			this parameter would result in firmware index not being
+			used to derive ethN names.
+
 	nf_conntrack.acct=
 			[NETFILTER] Enable connection tracking flow accounting
 			0 to disable accounting
diff --git a/drivers/pci/pci-label.c b/drivers/pci/pci-label.c
index 90c0a72..8086268 100644
--- a/drivers/pci/pci-label.c
+++ b/drivers/pci/pci-label.c
@@ -55,6 +55,7 @@  find_smbios_instance_string(struct pci_dev *pdev, char *buf,
 							 "%s\n",
 							 dmi->name);
 			}
+			pdev->firmware_index = donboard->instance;
 			return strlen(dmi->name);
 		}
 	}
diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
index b5a7d9b..448ed9d 100644
--- a/drivers/pci/pci-sysfs.c
+++ b/drivers/pci/pci-sysfs.c
@@ -28,6 +28,7 @@ 
 #include "pci.h"
 
 static int sysfs_initialized;	/* = 0 */
+int pci_netdevs_with_fwindex;
 
 /* show configuration fields */
 #define pci_config_attr(field, format_string)				\
@@ -1167,6 +1168,10 @@  int __must_check pci_create_sysfs_dev_files (struct pci_dev *pdev)
 
 	pci_create_firmware_label_files(pdev);
 
+	if (pdev->firmware_index && (pdev->class >> 16) ==
+						PCI_BASE_CLASS_NETWORK)
+		pci_netdevs_with_fwindex++;
+
 	return 0;
 
 err_vga_file:
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 46c36ff..4398dcf 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -1080,6 +1080,8 @@  struct net_device {
 
 #define	NETDEV_ALIGN		32
 
+extern int pci_netdevs_with_fwindex;
+
 static inline
 struct netdev_queue *netdev_get_tx_queue(const struct net_device *dev,
 					 unsigned int index)
diff --git a/include/linux/pci.h b/include/linux/pci.h
index b1d1795..90113bb 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -243,6 +243,7 @@  struct pci_dev {
 	unsigned short	subsystem_vendor;
 	unsigned short	subsystem_device;
 	unsigned int	class;		/* 3 bytes: (base,sub,prog-if) */
+	unsigned int	firmware_index; /* Firmware provided index */
 	u8		revision;	/* PCI revision, low byte of class word */
 	u8		hdr_type;	/* PCI header type (`multi' flag masked out) */
 	u8		pcie_cap;	/* PCI-E capability offset */
diff --git a/net/core/dev.c b/net/core/dev.c
index 1ae6543..f7982c4 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -853,9 +853,18 @@  int dev_valid_name(const char *name)
 }
 EXPORT_SYMBOL(dev_valid_name);
 
+int netdev_use_fwindex = 1;
+
+static int __init netdev_use_fwindex_to_register(char *str)
+{
+	netdev_use_fwindex = 0;
+	return 0;
+}
+early_param("no_netfwindex", netdev_use_fwindex_to_register);
+
 /**
  *	__dev_alloc_name - allocate a name for a device
- *	@net: network namespace to allocate the device name in
+ *	@dev:  device
  *	@name: name format string
  *	@buf:  scratch buffer and result name string
  *
@@ -868,13 +877,15 @@  EXPORT_SYMBOL(dev_valid_name);
  *	Returns the number of the unit assigned or a negative errno code.
  */
 
-static int __dev_alloc_name(struct net *net, const char *name, char *buf)
+static int __dev_alloc_name(struct net_device *dev, const char *name, char *buf)
 {
 	int i = 0;
 	const char *p;
 	const int max_netdevices = 8*PAGE_SIZE;
 	unsigned long *inuse;
 	struct net_device *d;
+	struct net *net;
+	struct pci_dev *pdev;
 
 	p = strnchr(name, IFNAMSIZ-1, '%');
 	if (p) {
@@ -886,15 +897,36 @@  static int __dev_alloc_name(struct net *net, const char *name, char *buf)
 		if (p[1] != 'd' || strchr(p + 2, '%'))
 			return -EINVAL;
 
+		if (likely(netdev_use_fwindex)) {
+			pdev = to_pci_dev(dev->dev.parent);
+			if (pdev && pdev->firmware_index) {
+				snprintf(buf, IFNAMSIZ, name,
+					 pdev->firmware_index - 1);
+				return pdev->firmware_index - 1;
+			}
+		}
+
 		/* Use one page as a bit array of possible slots */
 		inuse = (unsigned long *) get_zeroed_page(GFP_ATOMIC);
 		if (!inuse)
 			return -ENOMEM;
 
+		/* Reserve 0 to < pci_netdevs_with_fwindex for integrated
+		 * ports with fwindex and allocate from pci_netdevs_with_fwindex
+		 * onwards for add-in devices
+		 */
+		if (likely(netdev_use_fwindex)) {
+			for (i = 0; i < pci_netdevs_with_fwindex; i++)
+				set_bit(i, inuse);
+		} else
+			pci_netdevs_with_fwindex = 0;
+
+		net = dev_net(dev);
+
 		for_each_netdev(net, d) {
 			if (!sscanf(d->name, name, &i))
 				continue;
-			if (i < 0 || i >= max_netdevices)
+			if (i < pci_netdevs_with_fwindex || i >= max_netdevices)
 				continue;
 
 			/*  avoid cases where sscanf is not exact inverse of printf */
@@ -936,12 +968,10 @@  static int __dev_alloc_name(struct net *net, const char *name, char *buf)
 int dev_alloc_name(struct net_device *dev, const char *name)
 {
 	char buf[IFNAMSIZ];
-	struct net *net;
 	int ret;
 
 	BUG_ON(!dev_net(dev));
-	net = dev_net(dev);
-	ret = __dev_alloc_name(net, name, buf);
+	ret = __dev_alloc_name(dev, name, buf);
 	if (ret >= 0)
 		strlcpy(dev->name, buf, IFNAMSIZ);
 	return ret;