netconsole fun

Message ID 1355345957.2687.18.camel@thor
State RFC, archived
Delegated to: David Miller

Commit Message

Peter Hurley Dec. 12, 2012, 8:59 p.m. UTC
On Tue, 2012-12-11 at 11:45 -0500, Neil Horman wrote:
> On Tue, Dec 11, 2012 at 10:16:51AM -0500, Peter Hurley wrote:
> > On Tue, 2012-12-11 at 09:30 -0500, Neil Horman wrote:
> > > On Tue, Dec 11, 2012 at 09:19:52AM -0500, Peter Hurley wrote:
> > > > On Tue, 2012-12-11 at 04:51 +0000, Cong Wang wrote:
> > > > > On Mon, 10 Dec 2012 at 14:17 GMT, Peter Hurley <peter@hurleysoftware.com> wrote:
> > > > > > Now that netpoll has been disabled for slaved devices, is there a
> > > > > > recommended method of running netconsole on a machine that has a slaved
> > > > > > device?
> > > > > >
> > > > > 
> > > > > Yes, running it on the master device instead.
> > > > 
> > > > Thanks for the suggestion, but:
> > > > 
> > > > [ 0.000000] Kernel command line: BOOT_IMAGE=/boot/vmlinuz-3.7.0-rc8-xeon ...... netconsole=@192.168.10.99/br0,30000@192.168.10.100/xx:xx:xx:xx:xx:xx
> > > > ...
> > > > [ 5.289869] netpoll: netconsole: local port 6665
> > > > [ 5.289885] netpoll: netconsole: local IP 192.168.10.99
> > > > [ 5.289892] netpoll: netconsole: interface 'br0'
> > > > [ 5.289898] netpoll: netconsole: remote port 30000
> > > > [ 5.289907] netpoll: netconsole: remote IP 192.168.10.100
> > > > [ 5.289914] netpoll: netconsole: remote ethernet address xx:xx:xx:xx:xx:xx
> > > > [ 5.289922] netpoll: netconsole: br0 doesn't exist, aborting
> > > > [ 5.289929] netconsole: cleaning up
> > > > ...
> > > > [ 9.392291] Bridge firewalling registered
> > > > [ 9.396805] device eth1 entered promiscuous mode
> > > > [ 9.418350] eth1:  setting full-duplex.
> > > > [ 9.421268] br0: port 1(eth1) entered forwarding state
> > > > [ 9.423354] br0: port 1(eth1) entered forwarding state
> > > > 
> > > > 
> > > > Is there a way to control or associate network device names prior to
> > > > udev renaming?
> > > > 
> > > That looks like a systemd problem (or more specifically a boot dependency
> > > problem).  You need to modify your netconsole unit/service file to start after
> > > all your networking is up.  NetworkManager provides a dummy service file for
> > > this purpose, called NetworkManager-wait-online.service.
> > 
> > Ok. So with a single physical network interface that will be bridged,
> > netconsole cannot be used for kernel boot messages.
> > 
> > With a machine with multiple NICs, is there a way to control device
> > naming so that the interface name to be used by netconsole specified on
> > the boot command line will actually correspond to the intended
> > device? For example,
> > 
> > [ 0.000000] Kernel command line: BOOT_IMAGE=/boot/vmlinuz-3.7.0-rc8-xeon ...... netconsole=@192.168.1.123/eth0,30000@192.168.1.139/xx:xx:xx:xx:xx:xx
> > ....
> > [ 4.092184] 3c59x: Donald Becker and others.
> > [ 4.092204] 0000:07:05.0: 3Com PCI 3c905C Tornado at ffffc9000186cf80.
> > [ 4.094035] tg3.c:v3.125 (September 26, 2012)
> > ....
> > [ 4.125038] tg3 0000:08:00.0 eth1: Tigon3 [partno(BCM95754) rev b002] (PCI Express) MAC address xx:xx:xx:xx:xx:xx
> > [ 4.125055] tg3 0000:08:00.0 eth1: attached PHY is 5787 (10/100/1000Base-T Ethernet) (WireSpeed[1], EEE[0])
> > [ 4.125062] tg3 0000:08:00.0 eth1: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0] TSOcap[1]
> > [ 4.125068] tg3 0000:08:00.0 eth1: dma_rwctrl[76180000] dma_mask[64-bit]
> > 
> > This is attaching netconsole to the wrong device because bus
> > enumeration, and therefore load order, is not consistent from boot to
> > boot.
> > 
> No, there's no way to do that.  As you note, device enumeration isn't consistent
> across boots; that's why udev creates rules to rename devices based on immutable
> (or semi-immutable) data, like MAC addresses or PCI bus locations.  Once that
> happens, you'll have consistent names for your interfaces, and that work will be
> guaranteed to be done after NetworkManager has finished opening all the
> interfaces that it needs (hence my suggestion to make the netconsole service
> dependent on NetworkManager service startup completing).

Just wondering if you think something like the patch below is
suitable/acceptable for insulating netconsole from inconsistent device
name scenarios without changing the existing semantics. The basic idea
is to allow an ethernet MAC address in the <dev> field of the
netconsole= options, and if a MAC address was specified rather than a
device name, to do the dev lookup from the MAC address instead.

This doesn't extend to, but also doesn't interfere with, the dynamic
config of netconsole via configfs.

Would you mind reviewing it?

Regards,
Peter

-- >% --
Subject: [PATCH] netconsole: allow mac addr to specify local interface device

Allow the local interface device to be specified by ethernet
MAC address. For example,

  netconsole=@10.0.0.1/12:34:56:78:9a:bc,30000@10.0.0.3/cb:a9:87:65:43:21

This alternate form enables netconsole to start and log boot messages
even if the network device name varies (e.g., on a machine with multiple NICs).

Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
---
 Documentation/networking/netconsole.txt |  9 +++++++--
 drivers/net/netconsole.c                |  2 ++
 include/linux/netpoll.h                 |  1 +
 net/core/netpoll.c                      | 19 +++++++++++++++++--
 4 files changed, 27 insertions(+), 4 deletions(-)
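
The diff itself is not reproduced on this page, so the following is only a userspace sketch of the idea described in the commit message, not the patch's kernel code. It splits a `netconsole=` value of the form `[src-port]@[src-ip]/[<dev>],[tgt-port]@<tgt-ip>/[tgt-macaddr]` and treats the `<dev>` field as a MAC address when it looks like one. The names `parse_netconsole` and `NetconsoleOpts` are hypothetical; the defaults (ports 6665 and 6666, `eth0`, broadcast MAC) are the ones netconsole documents, and IPv6 addresses are assumed not to be used.

```python
import re
from typing import NamedTuple, Optional

# Six colon-separated hex octets, e.g. 12:34:56:78:9a:bc
MAC_RE = re.compile(r"^[0-9A-Fa-f]{2}(:[0-9A-Fa-f]{2}){5}$")


class NetconsoleOpts(NamedTuple):
    local_port: int
    local_ip: str
    dev_name: Optional[str]  # set when <dev> is an interface name
    dev_mac: Optional[str]   # set when <dev> is a MAC address
    remote_port: int
    remote_ip: str
    remote_mac: str


def parse_netconsole(value: str) -> NetconsoleOpts:
    """Parse '[src-port]@[src-ip]/[<dev>],[tgt-port]@<tgt-ip>/[tgt-macaddr]'."""
    local, remote = value.split(",", 1)
    lport, _, rest = local.partition("@")
    lip, _, dev = rest.partition("/")
    rport, _, rest = remote.partition("@")
    rip, _, rmac = rest.partition("/")

    # The proposed extension: a MAC in the <dev> field selects the
    # interface by hardware address instead of by name.
    is_mac = bool(MAC_RE.match(dev))
    return NetconsoleOpts(
        local_port=int(lport) if lport else 6665,
        local_ip=lip,
        dev_name=None if is_mac else (dev or "eth0"),
        dev_mac=dev.lower() if is_mac else None,
        remote_port=int(rport) if rport else 6666,
        remote_ip=rip,
        remote_mac=(rmac or "ff:ff:ff:ff:ff:ff").lower(),
    )


if __name__ == "__main__":
    print(parse_netconsole(
        "@10.0.0.1/12:34:56:78:9a:bc,30000@10.0.0.3/cb:a9:87:65:43:21"))
```

Because an interface name cannot match the six-octet pattern in practice, existing `netconsole=` strings that name a device would parse exactly as before, which is the "without changing the existing semantics" property the cover note asks about.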

Comments

Cong Wang Dec. 13, 2012, 10:33 a.m. UTC | #1
On Wed, 12 Dec 2012 at 20:59 GMT, Peter Hurley <peter@hurleysoftware.com> wrote:
>
> Just wondering if you think something like the patch below is
> suitable/acceptable for insulating netconsole from inconsistent device
> name scenarios without changing the existing semantics. The basic idea
> is to allow an ethernet MAC address in the <dev> field of the
> netconsole= options, and if a MAC address was specified rather than a
> device name, to do the dev lookup from the MAC address instead.
>
> This doesn't extend to, but also doesn't interfere with, the dynamic
> config of netconsole via configfs.
>
> Would you mind reviewing it?
>

This is a good idea. Just that you need to complete the configfs
interface too.

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Neil Horman Dec. 13, 2012, 12:36 p.m. UTC | #2
On Wed, Dec 12, 2012 at 03:59:17PM -0500, Peter Hurley wrote:
> Just wondering if you think something like the patch below is
> suitable/acceptable for insulating netconsole from inconsistent device
> name scenarios without changing the existing semantics. The basic idea
> is to allow an ethernet MAC address in the <dev> field of the
> netconsole= options, and if a MAC address was specified rather than a
> device name, to do the dev lookup from the MAC address instead.
> 
> This doesn't extend to, but also doesn't interfere with, the dynamic
> config of netconsole via configfs.
> 
> Would you mind reviewing it?
> 
> Regards,
> Peter
> 
This looks like a pretty good idea to me.  That said, something occurred to me
when you wrote your summary above.  Have you looked at the netconsole service
scripts that most distros provide in their packaging?  I'm almost positive Red
Hat/Fedora (and likely also SUSE and Ubuntu) already implement this functionality
from user space.  Basically, instead of people just modprobing netconsole, they
create a service script that parses a config file that contains all the
options needed to load the netconsole module, and it has the intelligence to see
if you specified a MAC address rather than a device.  If you did, it finds
the device with the corresponding MAC address and uses that as the device.  I'm
sorry, I don't know why I didn't think of that before.  Check that out though,
that will likely give you exactly what you need.

Neil

P.S. Actually, looking at it, I think it does one better: it lets you specify the
destination netconsole address, and then dynamically looks up the routing table
entry that gets you there, and uses the output device specified in the routing
table.

http://www.cyberciti.biz/tips/linux-netconsole-log-management-tutorial.html
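
As a rough illustration of the MAC-to-device step such a service script performs, here is a minimal sketch. It is not taken from any distro's script; the function name `find_iface_by_mac` is hypothetical. It relies only on the sysfs attribute `/sys/class/net/<iface>/address`, which holds each interface's hardware address.

```python
import os
from typing import Optional


def find_iface_by_mac(mac: str, sysfs_net: str = "/sys/class/net") -> Optional[str]:
    """Return the first interface (in sorted order) whose address matches mac."""
    wanted = mac.strip().lower()
    for iface in sorted(os.listdir(sysfs_net)):
        addr_path = os.path.join(sysfs_net, iface, "address")
        try:
            with open(addr_path) as f:
                if f.read().strip().lower() == wanted:
                    return iface
        except OSError:
            # Entry without a readable address attribute; skip it.
            continue
    return None


if __name__ == "__main__":
    # Hypothetical MAC address, for illustration only.
    print(find_iface_by_mac("12:34:56:78:9a:bc"))
```

One caveat with this approach: a bridge or bond can report the same hardware address as one of its member ports, so a MAC match is not always unique, and a real script would need a rule for which match to prefer.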

Peter Hurley Dec. 13, 2012, 2:49 p.m. UTC | #3
On Thu, 2012-12-13 at 07:36 -0500, Neil Horman wrote:
> This looks like a pretty good idea to me.  That said, something occurred to me
> when you wrote your summary above.  Have you looked at the netconsole service
> scripts that most distros provide in their packaging?  I'm almost positive Red
> Hat/Fedora (and likely also SUSE and Ubuntu) already implement this functionality
> from user space.  Basically, instead of people just modprobing netconsole, they
> create a service script that parses a config file that contains all the
> options needed to load the netconsole module, and it has the intelligence to see
> if you specified a MAC address rather than a device.  If you did, it finds
> the device with the corresponding MAC address and uses that as the device.  I'm
> sorry, I don't know why I didn't think of that before.  Check that out though,
> that will likely give you exactly what you need.

Even with a udev rule to load netconsole that runs immediately after
device renaming (so before scripting), most of the dynamic module
loading has already happened so netconsole misses it. At least with the
patch, netconsole will load and attach to the proper interface much
earlier in the boot so that module-load-time messages will be caught.

There is an unforeseen consequence of the patch: it breaks device
renaming because the device will already be in use by netconsole. Which
is the whole problem with userspace device renaming to begin with...

I guess that leaves only the option of building in netconsole and the
driver that supplies the interface.

Oh well.

Regards,
Peter

Neil Horman Dec. 13, 2012, 6:08 p.m. UTC | #4
On Thu, Dec 13, 2012 at 09:49:31AM -0500, Peter Hurley wrote:
> Even with a udev rule to load netconsole that runs immediately after
> device renaming (so before scripting), most of the dynamic module
> loading has already happened so netconsole misses it. At least with the
> patch, netconsole will load and attach to the proper interface much
> earlier in the boot so that module-load-time messages will be caught.
> 
I'm not sure what you mean by this.  I get that other drivers are loaded and
that if the netconsole service runs too soon, you might not have a network path
to the netconsole host you want, but as long as the network is up, the service
script (if properly written), shouldn't make you specify a network device.
Instead of having to specify the device to the netconsole module directly, the
config file for the service script lets you specify the destination ip address
of the netconsole server, and figures out the output device based on the routing
tables, whatever that device name happens to be on that boot.  So you don't
really need this patch, because user space can do this work (and more) for you.
You just have to wait for the network to come up, which you need to do anyway.

> There is an unforeseen consequence of the patch: it breaks device
> renaming because the device will already be in use by netconsole. Which
> is the whole problem with userspace device renaming to begin with...
> 
That is bad, but see above, the netconsole service can work around this for you,
allowing you to never have to specify a particular device at all.
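
A minimal sketch of the routing-table lookup described above, assuming the iproute2 `ip` tool is installed. It is an illustration, not the actual distro service script, and the function names are hypothetical. `ip route get <addr>` prints the route the kernel would use, including a `dev <name>` pair.

```python
import subprocess
from typing import Optional


def dev_from_route_output(text: str) -> Optional[str]:
    """Extract the token following 'dev' from `ip route get` output."""
    tokens = text.split()
    for i, tok in enumerate(tokens[:-1]):
        if tok == "dev":
            return tokens[i + 1]
    return None


def output_dev_for(dest_ip: str) -> Optional[str]:
    """Ask the kernel which interface would carry traffic to dest_ip."""
    result = subprocess.run(
        ["ip", "route", "get", dest_ip],
        capture_output=True, text=True, check=True,
    )
    return dev_from_route_output(result.stdout)


if __name__ == "__main__":
    # Address of the netconsole receiver used earlier in this thread.
    print(output_dev_for("192.168.10.100"))
```

This only gives a useful answer once addresses and routes are configured, which is why the approach requires waiting for the network to come up.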

> I guess that leaves only the option of building in netconsole and the
> driver that supplies the interface.
> 

> Oh well.
> 
> Regards,
> Peter
> 
> 
Peter Hurley Dec. 13, 2012, 7:27 p.m. UTC | #5
On Thu, 2012-12-13 at 13:08 -0500, Neil Horman wrote:
> On Thu, Dec 13, 2012 at 09:49:31AM -0500, Peter Hurley wrote:
> > On Thu, 2012-12-13 at 07:36 -0500, Neil Horman wrote:
> > > On Wed, Dec 12, 2012 at 03:59:17PM -0500, Peter Hurley wrote:
> > > > On Tue, 2012-12-11 at 11:45 -0500, Neil Horman wrote:
> > > > > On Tue, Dec 11, 2012 at 10:16:51AM -0500, Peter Hurley wrote:
> > > > > > On Tue, 2012-12-11 at 09:30 -0500, Neil Horman wrote:
> > > > > > > On Tue, Dec 11, 2012 at 09:19:52AM -0500, Peter Hurley wrote:
> > > > > > > > On Tue, 2012-12-11 at 04:51 +0000, Cong Wang wrote:
> > > > > > > > > On Mon, 10 Dec 2012 at 14:17 GMT, Peter Hurley <peter@hurleysoftware.com> wrote:
> > > > > > > > > > Now that netpoll has been disabled for slaved devices, is there a
> > > > > > > > > > recommended method of running netconsole on a machine that has a slaved
> > > > > > > > > > device?
> > > > > > > > > >
> > > > > > > > > 
> > > > > > > > > Yes, running it on the master device instead.
> > > > > > > > 
> > > > > > > > Thanks for the suggestion, but:
> > > > > > > > 
> > > > > > > > [ 0.000000] Kernel command line: BOOT_IMAGE=/boot/vmlinuz-3.7.0-rc8-xeon ...... netconsole=@192.168.10.99/br0,30000@192.168.10.100/xx:xx:xx:xx:xx:xx
> > > > > > > > ...
> > > > > > > > [ 5.289869] netpoll: netconsole: local port 6665
> > > > > > > > [ 5.289885] netpoll: netconsole: local IP 192.168.10.99
> > > > > > > > [ 5.289892] netpoll: netconsole: interface 'br0'
> > > > > > > > [ 5.289898] netpoll: netconsole: remote port 30000
> > > > > > > > [ 5.289907] netpoll: netconsole: remote IP 192.168.10.100
> > > > > > > > [ 5.289914] netpoll: netconsole: remote ethernet address xx:xx:xx:xx:xx:xx
> > > > > > > > [ 5.289922] netpoll: netconsole: br0 doesn't exist, aborting
> > > > > > > > [ 5.289929] netconsole: cleaning up
> > > > > > > > ...
> > > > > > > > [ 9.392291] Bridge firewalling registered
> > > > > > > > [ 9.396805] device eth1 entered promiscuous mode
> > > > > > > > [ 9.418350] eth1:  setting full-duplex.
> > > > > > > > [ 9.421268] br0: port 1(eth1) entered forwarding state
> > > > > > > > [ 9.423354] br0: port 1(eth1) entered forwarding state
> > > > > > > > 
> > > > > > > > 
> > > > > > > > Is there a way to control or associate network device names prior to
> > > > > > > > udev renaming?
> > > > > > > > 
> > > > > > > That looks like a systemd problem (or more specifically a boot dependency
> > > > > > > problem).  You need to modify your netconsole unit/service file to start after
> > > > > > > all your networking is up.  NetworkManager provides a dummy service file for
> > > > > > > this purpose, called networkmanager-wait-online.service
> > > > > > 
> > > > > > Ok. So with a single physical network interface that will be bridged,
> > > > > > netconsole cannot used for kernel boot messages.
> > > > > > 
> > > > > > With a machine with multiple nics, is there a way to control device
> > > > > > naming so that the interface name to be used by netconsole specified on
> > > > > > the boot command line will actually corresponding to the intended
> > > > > > device. For example,
> > > > > > 
> > > > > > [ 0.000000] Kernel command line: BOOT_IMAGE=/boot/vmlinuz-3.7.0-rc8-xeon ...... netconsole=@192.168.1.123/eth0,30000@192.168.1.139/xx:xx:xx:xx:xx:xx
> > > > > > ....
> > > > > > [ 4.092184] 3c59x: Donald Becker and others.
> > > > > > [ 4.092204] 0000:07:05.0: 3Com PCI 3c905C Tornado at ffffc9000186cf80.
> > > > > > [ 4.094035] tg3.c:v3.125 (September 26, 2012)
> > > > > > ....
> > > > > > [ 4.125038] tg3 0000:08:00.0 eth1: Tigon3 [partno(BCM95754) rev b002] (PCI Express) MAC address xx:xx:xx:xx:xx:xx
> > > > > > [ 4.125055] tg3 0000:08:00.0 eth1: attached PHY is 5787 (10/100/1000Base-T Ethernet) (WireSpeed[1], EEE[0])
> > > > > > [ 4.125062] tg3 0000:08:00.0 eth1: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0] TSOcap[1]
> > > > > > [ 4.125068] tg3 0000:08:00.0 eth1: dma_rwctrl[76180000] dma_mask[64-bit]
> > > > > > 
> > > > > > This is attaching netconsole to the wrong device because bus
> > > > > > enumeration, and therefore load order, is not consistent from boot to
> > > > > > boot.
> > > > > > 
> > > > > No, theres no way to do that.  As you note device ennumeration isn't consistent
> > > > > accross boots, thats why udev creates rules to rename devices based on immutable
> > > > > (or semi-immutable) data, like mac addresses, or pci bus locations).  Once that
> > > > > happens, you'll have consistent names for your interfaces, and that work will be
> > > > > guaranteed to be done after networkmanager has finished opening all the
> > > > > interfaces that it needs (hence my suggestion to make netconsole service
> > > > > dependent on networkmanager service startup completing).
> > > > 
> > > > Just wondering if you think something like the patch below is
> > > > suitable/acceptable for insulating netconsole from inconsistent device
> > > > name scenarios without changing the existing semantics. The basic idea
> > > > is to allow an ethernet MAC address in the <dev> field of the
> > > > netconsole= options, and if a MAC address was specified rather than a
> > > > device name, to do the dev lookup from the MAC address instead.
> > > > 
> > > > This doesn't extend to, but also doesn't interfere with, the dynamic
> > > > config of netconsole via configfs.
> > > > 
> > > > Would you mind reviewing it?
> > > > 
> > > > Regards,
> > > > Peter
> > > > 
> > > This looks like a pretty good idea to me.  That said, something occurred to me
> > > when you wrote your summary above.  Have you looked at the netconsole service
> > > scripts that most distros provide in their packaging?  I'm almost positive Red
> > > Hat/Fedora (and likely also SUSE and Ubuntu) already implement this functionality
> > > from user space.  Basically, instead of people just modprobing netconsole, they
> > > create a service script that parses a config file that contains all the
> > > options needed to load the netconsole module, and it has the intelligence to see
> > > if you specified a MAC address rather than a device.  If you did, it finds
> > > the device with that MAC address and uses that as the device.  I'm sorry, I
> > > don't know why I didn't think of that before.  Check that out though; that will
> > > likely give you exactly what you need.
> > 
> > Even with a udev rule to load netconsole that runs immediately after
> > device renaming (so before scripting), most of the dynamic module
> > loading has already happened so netconsole misses it. At least with the
> > patch, netconsole will load and attach to the proper interface much
> > earlier in the boot so that module-load-time messages will be caught.
> > 
> I'm not sure what you mean by this.

This is the beginning of my netconsole log if I use userspace scripts to
start it.

[   19.125314] ip_tables: (C) 2000-2006 Netfilter Core Team
[   20.060925] nf_conntrack version 0.5.0 (16384 buckets, 65536 max)
[   21.829331] ip6_tables: (C) 2000-2006 Netfilter Core Team
[   25.728370] at-spi-registry[1862]: segfault at 18 ip 00007f6dd1dd45f1 sp 00007fff49bcd760 error 4 in libgconf-2.so.4.1.5[7f6dd1dbd000+2d000]
[   26.778848] EXT4-fs (dm-3): re-mounted. Opts: errors=remount-ro,commit=0
[   30.643469] Bluetooth: RFCOMM TTY layer initialized
[   30.643509] Bluetooth: RFCOMM socket layer initialized
[   30.643512] Bluetooth: RFCOMM ver 1.11
[   30.784550] Bluetooth: BNEP (Ethernet Emulation) ver 1.3
[   30.784567] Bluetooth: BNEP filters: protocol multicast
[   30.784584] Bluetooth: BNEP socket layer initialized
[   34.010813] init: plymouth-stop pre-start process (2205) terminated with status 1

This is the beginning of my netconsole log if I am able to specify
netconsole= options on the boot command line. Netconsole starts logging
much earlier because it is loaded much earlier.

[    8.764336] EXT4-fs (dm-4): mounted filesystem with ordered data mode. Opts: (null)
[    9.409379] firewire_core 0000:07:06.0: created device fw1: GUID 0800460301c2d69e, S400
[    9.567395] init: ureadahead main process (500) terminated with status 5
[   10.400338] Adding 10996456k swap on /dev/mapper/isw_cbdbfhdjad_Raid0p5.  Priority:-1 extents:1 across:10996456k 
[   10.496974] udevd[541]: starting version 173
[   10.725906] EXT4-fs (dm-4): re-mounted. Opts: errors=remount-ro
[   11.288352] lp: driver loaded but no devices found
[   12.240058] parport_pc 00:05: reported by Plug and Play ACPI
[   12.240145] parport0: PC-style at 0x378 (0x778), irq 7, using FIFO [PCSPP,TRISTATE,COMPAT,ECP]
[   12.336161] lp0: using parport0 (interrupt-driven).
[   12.342867] microcode: CPU0 sig=0x10676, pf=0x40, revision=0x60f
[   12.436657] shpchp: Standard Hot Plug PCI Controller Driver version: 0.4
[   12.442245] ppdev: user-space parallel port driver
[   12.451592] net firewire0: IPv4 over IEEE 1394 on card 0000:07:06.0

Does that make more sense now?

Thanks again,
Peter

> > There is an unforeseen consequence of the patch: it breaks device
> > renaming because the device will already be in use by netconsole. Which
> > is the whole problem with userspace device renaming to begin with...
> > 
> That is bad, but see above, the netconsole service can work around this for you,
> allowing you to never have to specify a particular device at all.

Just to be clear here,



--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Neil Horman Dec. 13, 2012, 9:17 p.m. UTC | #6
On Thu, Dec 13, 2012 at 02:27:01PM -0500, Peter Hurley wrote:
> On Thu, 2012-12-13 at 13:08 -0500, Neil Horman wrote:
> > On Thu, Dec 13, 2012 at 09:49:31AM -0500, Peter Hurley wrote:
> > > On Thu, 2012-12-13 at 07:36 -0500, Neil Horman wrote:
> > > > On Wed, Dec 12, 2012 at 03:59:17PM -0500, Peter Hurley wrote:
> > > > > On Tue, 2012-12-11 at 11:45 -0500, Neil Horman wrote:
> > > > > > On Tue, Dec 11, 2012 at 10:16:51AM -0500, Peter Hurley wrote:
> > > > > > > On Tue, 2012-12-11 at 09:30 -0500, Neil Horman wrote:
> > > > > > > > On Tue, Dec 11, 2012 at 09:19:52AM -0500, Peter Hurley wrote:
> > > > > > > > > On Tue, 2012-12-11 at 04:51 +0000, Cong Wang wrote:
> > > > > > > > > > On Mon, 10 Dec 2012 at 14:17 GMT, Peter Hurley <peter@hurleysoftware.com> wrote:
> > > > > > > > > > > Now that netpoll has been disabled for slaved devices, is there a
> > > > > > > > > > > recommended method of running netconsole on a machine that has a slaved
> > > > > > > > > > > device?
> > > > > > > > > > >
> > > > > > > > > > 
> > > > > > > > > > Yes, running it on the master device instead.
> > > > > > > > > 
> > > > > > > > > Thanks for the suggestion, but:
> > > > > > > > > 
> > > > > > > > > [ 0.000000] Kernel command line: BOOT_IMAGE=/boot/vmlinuz-3.7.0-rc8-xeon ...... netconsole=@192.168.10.99/br0,30000@192.168.10.100/xx:xx:xx:xx:xx:xx
> > > > > > > > > ...
> > > > > > > > > [ 5.289869] netpoll: netconsole: local port 6665
> > > > > > > > > [ 5.289885] netpoll: netconsole: local IP 192.168.10.99
> > > > > > > > > [ 5.289892] netpoll: netconsole: interface 'br0'
> > > > > > > > > [ 5.289898] netpoll: netconsole: remote port 30000
> > > > > > > > > [ 5.289907] netpoll: netconsole: remote IP 192.168.10.100
> > > > > > > > > [ 5.289914] netpoll: netconsole: remote ethernet address xx:xx:xx:xx:xx:xx
> > > > > > > > > [ 5.289922] netpoll: netconsole: br0 doesn't exist, aborting
> > > > > > > > > [ 5.289929] netconsole: cleaning up
> > > > > > > > > ...
> > > > > > > > > [ 9.392291] Bridge firewalling registered
> > > > > > > > > [ 9.396805] device eth1 entered promiscuous mode
> > > > > > > > > [ 9.418350] eth1:  setting full-duplex.
> > > > > > > > > [ 9.421268] br0: port 1(eth1) entered forwarding state
> > > > > > > > > [ 9.423354] br0: port 1(eth1) entered forwarding state
> > > > > > > > > 
> > > > > > > > > 
> > > > > > > > > Is there a way to control or associate network device names prior to
> > > > > > > > > udev renaming?
> > > > > > > > > 
> > > > > > > > That looks like a systemd problem (or more specifically a boot dependency
> > > > > > > > problem).  You need to modify your netconsole unit/service file to start after
> > > > > > > > all your networking is up.  NetworkManager provides a dummy service file for
> > > > > > > > this purpose, called NetworkManager-wait-online.service
> > > > > > > 
> > > > > > > Ok. So with a single physical network interface that will be bridged,
> > > > > > > netconsole cannot be used for kernel boot messages.
> > > > > > > 
> > > > > > > On a machine with multiple NICs, is there a way to control device
> > > > > > > naming so that the interface name specified for netconsole on the
> > > > > > > boot command line will actually correspond to the intended
> > > > > > > device? For example,
> > > > > > > 
> > > > > > > [ 0.000000] Kernel command line: BOOT_IMAGE=/boot/vmlinuz-3.7.0-rc8-xeon ...... netconsole=@192.168.1.123/eth0,30000@192.168.1.139/xx:xx:xx:xx:xx:xx
> > > > > > > ....
> > > > > > > [ 4.092184] 3c59x: Donald Becker and others.
> > > > > > > [ 4.092204] 0000:07:05.0: 3Com PCI 3c905C Tornado at ffffc9000186cf80.
> > > > > > > [ 4.094035] tg3.c:v3.125 (September 26, 2012)
> > > > > > > ....
> > > > > > > [ 4.125038] tg3 0000:08:00.0 eth1: Tigon3 [partno(BCM95754) rev b002] (PCI Express) MAC address xx:xx:xx:xx:xx:xx
> > > > > > > [ 4.125055] tg3 0000:08:00.0 eth1: attached PHY is 5787 (10/100/1000Base-T Ethernet) (WireSpeed[1], EEE[0])
> > > > > > > [ 4.125062] tg3 0000:08:00.0 eth1: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0] TSOcap[1]
> > > > > > > [ 4.125068] tg3 0000:08:00.0 eth1: dma_rwctrl[76180000] dma_mask[64-bit]
> > > > > > > 
> > > > > > > This is attaching netconsole to the wrong device because bus
> > > > > > > enumeration, and therefore load order, is not consistent from boot to
> > > > > > > boot.
> > > > > > > 
> > > > > > No, there's no way to do that.  As you note, device enumeration isn't consistent
> > > > > > across boots; that's why udev creates rules to rename devices based on immutable
> > > > > > (or semi-immutable) data, like MAC addresses or PCI bus locations.  Once that
> > > > > > happens, you'll have consistent names for your interfaces, and that work is
> > > > > > guaranteed to be done once NetworkManager has finished opening all the
> > > > > > interfaces that it needs (hence my suggestion to make the netconsole service
> > > > > > dependent on NetworkManager service startup completing).
> > > > > 
> > > > > Just wondering if you think something like the patch below is
> > > > > suitable/acceptable for insulating netconsole from inconsistent device
> > > > > name scenarios without changing the existing semantics. The basic idea
> > > > > is to allow an ethernet MAC address in the <dev> field of the
> > > > > netconsole= options, and if a MAC address was specified rather than a
> > > > > device name, to do the dev lookup from the MAC address instead.
> > > > > 
> > > > > This doesn't extend to, but also doesn't interfere with, the dynamic
> > > > > config of netconsole via configfs.
> > > > > 
> > > > > Would you mind reviewing it?
> > > > > 
> > > > > Regards,
> > > > > Peter
> > > > > 
> > > > This looks like a pretty good idea to me.  That said, something occurred to me
> > > > when you wrote your summary above.  Have you looked at the netconsole service
> > > > scripts that most distros provide in their packaging?  I'm almost positive Red
> > > > Hat/Fedora (and likely also SUSE and Ubuntu) already implement this functionality
> > > > from user space.  Basically, instead of people just modprobing netconsole, they
> > > > create a service script that parses a config file that contains all the
> > > > options needed to load the netconsole module, and it has the intelligence to see
> > > > if you specified a MAC address rather than a device.  If you did, it finds
> > > > the device with that MAC address and uses that as the device.  I'm sorry, I
> > > > don't know why I didn't think of that before.  Check that out though; that will
> > > > likely give you exactly what you need.
> > > 
> > > Even with a udev rule to load netconsole that runs immediately after
> > > device renaming (so before scripting), most of the dynamic module
> > > loading has already happened so netconsole misses it. At least with the
> > > patch, netconsole will load and attach to the proper interface much
> > > earlier in the boot so that module-load-time messages will be caught.
> > > 
> > I'm not sure what you mean by this.
> 
> This is the beginning of my netconsole log if I use userspace scripts to
> start it.
> 
> [   19.125314] ip_tables: (C) 2000-2006 Netfilter Core Team
> [   20.060925] nf_conntrack version 0.5.0 (16384 buckets, 65536 max)
> [   21.829331] ip6_tables: (C) 2000-2006 Netfilter Core Team
> [   25.728370] at-spi-registry[1862]: segfault at 18 ip 00007f6dd1dd45f1 sp 00007fff49bcd760 error 4 in libgconf-2.so.4.1.5[7f6dd1dbd000+2d000]
> [   26.778848] EXT4-fs (dm-3): re-mounted. Opts: errors=remount-ro,commit=0
> [   30.643469] Bluetooth: RFCOMM TTY layer initialized
> [   30.643509] Bluetooth: RFCOMM socket layer initialized
> [   30.643512] Bluetooth: RFCOMM ver 1.11
> [   30.784550] Bluetooth: BNEP (Ethernet Emulation) ver 1.3
> [   30.784567] Bluetooth: BNEP filters: protocol multicast
> [   30.784584] Bluetooth: BNEP socket layer initialized
> [   34.010813] init: plymouth-stop pre-start process (2205) terminated with status 1
> 
> This is the beginning of my netconsole log if I am able to specify
> netconsole= options on the boot command line. Netconsole starts logging
> much earlier because it is loaded much earlier.
> 
> [    8.764336] EXT4-fs (dm-4): mounted filesystem with ordered data mode. Opts: (null)
> [    9.409379] firewire_core 0000:07:06.0: created device fw1: GUID 0800460301c2d69e, S400
> [    9.567395] init: ureadahead main process (500) terminated with status 5
> [   10.400338] Adding 10996456k swap on /dev/mapper/isw_cbdbfhdjad_Raid0p5.  Priority:-1 extents:1 across:10996456k 
> [   10.496974] udevd[541]: starting version 173
> [   10.725906] EXT4-fs (dm-4): re-mounted. Opts: errors=remount-ro
> [   11.288352] lp: driver loaded but no devices found
> [   12.240058] parport_pc 00:05: reported by Plug and Play ACPI
> [   12.240145] parport0: PC-style at 0x378 (0x778), irq 7, using FIFO [PCSPP,TRISTATE,COMPAT,ECP]
> [   12.336161] lp0: using parport0 (interrupt-driven).
> [   12.342867] microcode: CPU0 sig=0x10676, pf=0x40, revision=0x60f
> [   12.436657] shpchp: Standard Hot Plug PCI Controller Driver version: 0.4
> [   12.442245] ppdev: user-space parallel port driver
> [   12.451592] net firewire0: IPv4 over IEEE 1394 on card 0000:07:06.0
> 
> Does that make more sense now?
> 
No, actually, what exactly are you trying to show me here?  I don't see any
indication of netconsole doing anything in either of these log snippets.  I'm
also not sure why you're specifying netconsole options on the kernel command
line at all.
Can you elaborate?
Neil
 

> Thanks again,
> Peter
> 
> > > There is an unforeseen consequence of the patch: it breaks device
> > > renaming because the device will already be in use by netconsole. Which
> > > is the whole problem with userspace device renaming to begin with...
> > > 
> > That is bad, but see above, the netconsole service can work around this for you,
> > allowing you to never have to specify a particular device at all.
> 
> Just to be clear here,
> 
> 
> 
> 
Peter Hurley Dec. 13, 2012, 10:24 p.m. UTC | #7
On Thu, 2012-12-13 at 16:17 -0500, Neil Horman wrote:
> On Thu, Dec 13, 2012 at 02:27:01PM -0500, Peter Hurley wrote:
> > On Thu, 2012-12-13 at 13:08 -0500, Neil Horman wrote:
> > > On Thu, Dec 13, 2012 at 09:49:31AM -0500, Peter Hurley wrote:
> > > > On Thu, 2012-12-13 at 07:36 -0500, Neil Horman wrote:
> > > > > On Wed, Dec 12, 2012 at 03:59:17PM -0500, Peter Hurley wrote:
> > > > > > On Tue, 2012-12-11 at 11:45 -0500, Neil Horman wrote:
> > > > > > > On Tue, Dec 11, 2012 at 10:16:51AM -0500, Peter Hurley wrote:
> > > > > > > > On Tue, 2012-12-11 at 09:30 -0500, Neil Horman wrote:
> > > > > > > > > On Tue, Dec 11, 2012 at 09:19:52AM -0500, Peter Hurley wrote:
> > > > > > > > > > On Tue, 2012-12-11 at 04:51 +0000, Cong Wang wrote:
> > > > > > > > > > > On Mon, 10 Dec 2012 at 14:17 GMT, Peter Hurley <peter@hurleysoftware.com> wrote:
> > > > > > > > > > > > Now that netpoll has been disabled for slaved devices, is there a
> > > > > > > > > > > > recommended method of running netconsole on a machine that has a slaved
> > > > > > > > > > > > device?
> > > > > > > > > > > >
> > > > > > > > > > > 
> > > > > > > > > > > Yes, running it on the master device instead.
> > > > > > > > > > 
> > > > > > > > > > Thanks for the suggestion, but:
> > > > > > > > > > 
> > > > > > > > > > [ 0.000000] Kernel command line: BOOT_IMAGE=/boot/vmlinuz-3.7.0-rc8-xeon ...... netconsole=@192.168.10.99/br0,30000@192.168.10.100/xx:xx:xx:xx:xx:xx
> > > > > > > > > > ...
> > > > > > > > > > [ 5.289869] netpoll: netconsole: local port 6665
> > > > > > > > > > [ 5.289885] netpoll: netconsole: local IP 192.168.10.99
> > > > > > > > > > [ 5.289892] netpoll: netconsole: interface 'br0'
> > > > > > > > > > [ 5.289898] netpoll: netconsole: remote port 30000
> > > > > > > > > > [ 5.289907] netpoll: netconsole: remote IP 192.168.10.100
> > > > > > > > > > [ 5.289914] netpoll: netconsole: remote ethernet address xx:xx:xx:xx:xx:xx
> > > > > > > > > > [ 5.289922] netpoll: netconsole: br0 doesn't exist, aborting
> > > > > > > > > > [ 5.289929] netconsole: cleaning up
> > > > > > > > > > ...
> > > > > > > > > > [ 9.392291] Bridge firewalling registered
> > > > > > > > > > [ 9.396805] device eth1 entered promiscuous mode
> > > > > > > > > > [ 9.418350] eth1:  setting full-duplex.
> > > > > > > > > > [ 9.421268] br0: port 1(eth1) entered forwarding state
> > > > > > > > > > [ 9.423354] br0: port 1(eth1) entered forwarding state
> > > > > > > > > > 
> > > > > > > > > > 
> > > > > > > > > > Is there a way to control or associate network device names prior to
> > > > > > > > > > udev renaming?
> > > > > > > > > > 
> > > > > > > > > That looks like a systemd problem (or more specifically a boot dependency
> > > > > > > > > problem).  You need to modify your netconsole unit/service file to start after
> > > > > > > > > all your networking is up.  NetworkManager provides a dummy service file for
> > > > > > > > > this purpose, called NetworkManager-wait-online.service
> > > > > > > > 
> > > > > > > > Ok. So with a single physical network interface that will be bridged,
> > > > > > > > netconsole cannot be used for kernel boot messages.
> > > > > > > > 
> > > > > > > > On a machine with multiple NICs, is there a way to control device
> > > > > > > > naming so that the interface name specified for netconsole on the
> > > > > > > > boot command line will actually correspond to the intended
> > > > > > > > device? For example,
> > > > > > > > 
> > > > > > > > [ 0.000000] Kernel command line: BOOT_IMAGE=/boot/vmlinuz-3.7.0-rc8-xeon ...... netconsole=@192.168.1.123/eth0,30000@192.168.1.139/xx:xx:xx:xx:xx:xx
> > > > > > > > ....
> > > > > > > > [ 4.092184] 3c59x: Donald Becker and others.
> > > > > > > > [ 4.092204] 0000:07:05.0: 3Com PCI 3c905C Tornado at ffffc9000186cf80.
> > > > > > > > [ 4.094035] tg3.c:v3.125 (September 26, 2012)
> > > > > > > > ....
> > > > > > > > [ 4.125038] tg3 0000:08:00.0 eth1: Tigon3 [partno(BCM95754) rev b002] (PCI Express) MAC address xx:xx:xx:xx:xx:xx
> > > > > > > > [ 4.125055] tg3 0000:08:00.0 eth1: attached PHY is 5787 (10/100/1000Base-T Ethernet) (WireSpeed[1], EEE[0])
> > > > > > > > [ 4.125062] tg3 0000:08:00.0 eth1: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0] TSOcap[1]
> > > > > > > > [ 4.125068] tg3 0000:08:00.0 eth1: dma_rwctrl[76180000] dma_mask[64-bit]
> > > > > > > > 
> > > > > > > > This is attaching netconsole to the wrong device because bus
> > > > > > > > enumeration, and therefore load order, is not consistent from boot to
> > > > > > > > boot.
> > > > > > > > 
> > > > > > > No, there's no way to do that.  As you note, device enumeration isn't consistent
> > > > > > > across boots; that's why udev creates rules to rename devices based on immutable
> > > > > > > (or semi-immutable) data, like MAC addresses or PCI bus locations.  Once that
> > > > > > > happens, you'll have consistent names for your interfaces, and that work is
> > > > > > > guaranteed to be done once NetworkManager has finished opening all the
> > > > > > > interfaces that it needs (hence my suggestion to make the netconsole service
> > > > > > > dependent on NetworkManager service startup completing).
> > > > > > 
> > > > > > Just wondering if you think something like the patch below is
> > > > > > suitable/acceptable for insulating netconsole from inconsistent device
> > > > > > name scenarios without changing the existing semantics. The basic idea
> > > > > > is to allow an ethernet MAC address in the <dev> field of the
> > > > > > netconsole= options, and if a MAC address was specified rather than a
> > > > > > device name, to do the dev lookup from the MAC address instead.
> > > > > > 
> > > > > > This doesn't extend to, but also doesn't interfere with, the dynamic
> > > > > > config of netconsole via configfs.
> > > > > > 
> > > > > > Would you mind reviewing it?
> > > > > > 
> > > > > > Regards,
> > > > > > Peter
> > > > > > 
> > > > > This looks like a pretty good idea to me.  That said, something occurred to me
> > > > > when you wrote your summary above.  Have you looked at the netconsole service
> > > > > scripts that most distros provide in their packaging?  I'm almost positive Red
> > > > > Hat/Fedora (and likely also SUSE and Ubuntu) already implement this functionality
> > > > > from user space.  Basically, instead of people just modprobing netconsole, they
> > > > > create a service script that parses a config file that contains all the
> > > > > options needed to load the netconsole module, and it has the intelligence to see
> > > > > if you specified a MAC address rather than a device.  If you did, it finds
> > > > > the device with that MAC address and uses that as the device.  I'm sorry, I
> > > > > don't know why I didn't think of that before.  Check that out though; that will
> > > > > likely give you exactly what you need.
> > > > 
> > > > Even with a udev rule to load netconsole that runs immediately after
> > > > device renaming (so before scripting), most of the dynamic module
> > > > loading has already happened so netconsole misses it. At least with the
> > > > patch, netconsole will load and attach to the proper interface much
> > > > earlier in the boot so that module-load-time messages will be caught.
> > > > 
> > > I'm not sure what you mean by this.
> > 
> > This is the beginning of my netconsole log if I use userspace scripts to
> > start it.
> > 
> > [   19.125314] ip_tables: (C) 2000-2006 Netfilter Core Team
> > [   20.060925] nf_conntrack version 0.5.0 (16384 buckets, 65536 max)
> > [   21.829331] ip6_tables: (C) 2000-2006 Netfilter Core Team
> > [   25.728370] at-spi-registry[1862]: segfault at 18 ip 00007f6dd1dd45f1 sp 00007fff49bcd760 error 4 in libgconf-2.so.4.1.5[7f6dd1dbd000+2d000]
> > [   26.778848] EXT4-fs (dm-3): re-mounted. Opts: errors=remount-ro,commit=0
> > [   30.643469] Bluetooth: RFCOMM TTY layer initialized
> > [   30.643509] Bluetooth: RFCOMM socket layer initialized
> > [   30.643512] Bluetooth: RFCOMM ver 1.11
> > [   30.784550] Bluetooth: BNEP (Ethernet Emulation) ver 1.3
> > [   30.784567] Bluetooth: BNEP filters: protocol multicast
> > [   30.784584] Bluetooth: BNEP socket layer initialized
> > [   34.010813] init: plymouth-stop pre-start process (2205) terminated with status 1
> > 
> > This is the beginning of my netconsole log if I am able to specify
> > netconsole= options on the boot command line. Netconsole starts logging
> > much earlier because it is loaded much earlier.
> > 
> > [    8.764336] EXT4-fs (dm-4): mounted filesystem with ordered data mode. Opts: (null)
> > [    9.409379] firewire_core 0000:07:06.0: created device fw1: GUID 0800460301c2d69e, S400
> > [    9.567395] init: ureadahead main process (500) terminated with status 5
> > [   10.400338] Adding 10996456k swap on /dev/mapper/isw_cbdbfhdjad_Raid0p5.  Priority:-1 extents:1 across:10996456k 
> > [   10.496974] udevd[541]: starting version 173
> > [   10.725906] EXT4-fs (dm-4): re-mounted. Opts: errors=remount-ro
> > [   11.288352] lp: driver loaded but no devices found
> > [   12.240058] parport_pc 00:05: reported by Plug and Play ACPI
> > [   12.240145] parport0: PC-style at 0x378 (0x778), irq 7, using FIFO [PCSPP,TRISTATE,COMPAT,ECP]
> > [   12.336161] lp0: using parport0 (interrupt-driven).
> > [   12.342867] microcode: CPU0 sig=0x10676, pf=0x40, revision=0x60f
> > [   12.436657] shpchp: Standard Hot Plug PCI Controller Driver version: 0.4
> > [   12.442245] ppdev: user-space parallel port driver
> > [   12.451592] net firewire0: IPv4 over IEEE 1394 on card 0000:07:06.0
> > 
> > Does that make more sense now?
> > 
> No, actually, what exactly are you trying to show me here?  I don't see any
> indication of netconsole doing anything in either of these log snippets.
                 ^^^^^^^^^^^^^^^^^^^^^^
      except that it was netconsole that wrote these logs

Note the log times.

In the first log, which is from a netconsole loaded by userspace
scripts, the first printk it logs isn't until 19.125314 secs into boot.
Most kernel + module init has already happened by this time.

In the second log, which is from a netconsole (still built as a module)
loaded as a result of using the boot command line, it starts logging at
8.764336 secs into boot -- more than 10 secs earlier than using userspace
scripting to load netconsole.
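
As a rough illustration of that comparison (a sketch only, not something
used in this thread), the gap can be computed from the first timestamp in
each log. The helper name first_timestamp and the regular expression are
assumptions made up for this example; the two sample lines are the first
entries of the logs quoted above.

```python
import re

# Matches the "[   seconds.micros]" prefix that printk puts on each line.
STAMP = re.compile(r"^\[\s*(\d+\.\d+)\]")

def first_timestamp(log_text):
    """Return the timestamp (in seconds) of the first stamped line, or None."""
    for line in log_text.splitlines():
        m = STAMP.match(line.strip())
        if m:
            return float(m.group(1))
    return None

# First line of each netconsole log quoted in this thread.
script_log = "[   19.125314] ip_tables: (C) 2000-2006 Netfilter Core Team"
cmdline_log = ("[    8.764336] EXT4-fs (dm-4): mounted filesystem "
               "with ordered data mode. Opts: (null)")

gap = first_timestamp(script_log) - first_timestamp(cmdline_log)
print(f"boot-command-line netconsole starts logging {gap:.6f} s earlier")
```

For these two logs the difference works out to roughly 10.36 secs.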

> I'm
> also not sure why you're specifying netconsole options on the kernel command
> line at all.
> Can you elaborate?

Specifying netconsole= on the boot command line loads the netconsole
module at the earliest possible time (and much earlier than scripting
will do). It also leaves netconsole as an optional configuration (at
least for me it does, since I use grub2 and can edit the boot command
line before booting).

AFAIK, there are 5 ways to load netconsole:
1. On the boot command line with netconsole=
   If netconsole is built-in, this is the only way to initialize it.
   If netconsole is a module, this forces netconsole to load at the
   earliest possible time.
2. Via /etc/modules
   This happens before network device renaming, so suffers from the
   problems we've been discussing.
3. Via a custom udev rule in /etc/udev/rules.d/
   Earliest userspace method to modprobe netconsole and can be used
   after device renaming, but is still fairly late in the boot process.
4. Via init/service scripting
5. At the user shell
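
For methods 3-5, which run after device renaming, a script can resolve a
MAC address to the current interface name before calling modprobe. The
sketch below is only a guess at how such a script might do it, not the
code of any distro's netconsole service. It reads the per-interface
"address" file under /sys/class/net; the function name find_iface_by_mac
and the sysfs_root parameter are invented for this example.

```python
import os

def find_iface_by_mac(mac, sysfs_root="/sys/class/net"):
    """Return the name of the interface whose MAC equals `mac`, or None.

    Hypothetical helper: sysfs_root is a parameter only so the lookup can
    be pointed at a test directory instead of the real sysfs tree.
    """
    wanted = mac.strip().lower()
    for name in sorted(os.listdir(sysfs_root)):
        addr_file = os.path.join(sysfs_root, name, "address")
        try:
            with open(addr_file) as f:
                if f.read().strip().lower() == wanted:
                    return name
        except OSError:
            # Entry without a readable address file (e.g. bonding_masters).
            continue
    return None
```

One caveat with this approach: a bridge usually shares its MAC address
with one of its ports, so on a bridged machine more than one entry can
match. The sketch simply returns the first match in sorted name order.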

AFAIK, there are 4 ways to specify the necessary options to netconsole:
a. On the boot command line with netconsole=
   Also loads netconsole
b. In a .conf file in /etc/modprobe.d
c. On the modprobe command line
d. Via configfs
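
For reference, the option string used in a-c has the form
[src-port]@[src-ip]/[<dev>],[tgt-port]@<tgt-ip>/[tgt-macaddr]. The
following is a simplified userspace sketch of splitting that string and
checking whether the <dev> field looks like a MAC address, which is the
distinction the proposed patch would need to make. It is not the kernel's
parser; parse_netconsole and MAC_RE are names made up for this example,
and the defaults (6665, 6666, eth0) are taken from the kernel's
netconsole documentation.

```python
import re

# Six colon-separated hex octets, e.g. 00:11:22:33:44:55.
MAC_RE = re.compile(r"^[0-9a-fA-F]{2}(:[0-9a-fA-F]{2}){5}$")

def parse_netconsole(opt):
    """Split a netconsole= value into its fields (illustrative only)."""
    local, _, remote = opt.partition(",")
    src_port, _, rest = local.partition("@")
    src_ip, _, dev = rest.partition("/")
    tgt_port, _, rest = remote.partition("@")
    tgt_ip, _, tgt_mac = rest.partition("/")
    return {
        "src_port": int(src_port) if src_port else 6665,  # documented default
        "src_ip": src_ip,
        "dev": dev or "eth0",                             # documented default
        # True when <dev> holds a MAC address instead of a name; accepting
        # a MAC here is what the patch proposes.
        "dev_is_mac": bool(MAC_RE.match(dev)),
        "tgt_port": int(tgt_port) if tgt_port else 6666,  # documented default
        "tgt_ip": tgt_ip,
        "tgt_mac": tgt_mac,
    }

# The MAC addresses below are placeholders, not values from this thread.
by_name = parse_netconsole(
    "@192.168.10.99/br0,30000@192.168.10.100/00:11:22:33:44:55")
by_mac = parse_netconsole(
    "@192.168.1.123/66:77:88:99:aa:bb,30000@192.168.1.139/")
print(by_name["dev"], by_name["dev_is_mac"])
print(by_mac["dev"], by_mac["dev_is_mac"])
```

An interface cannot be named with colons in the way a MAC address is
written here, so a name and a MAC in the <dev> field should not be
confused by a check like this.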

Regards,
Peter



Neil Horman Dec. 14, 2012, 2:20 p.m. UTC | #8
On Thu, Dec 13, 2012 at 05:24:40PM -0500, Peter Hurley wrote:
> On Thu, 2012-12-13 at 16:17 -0500, Neil Horman wrote:
> > On Thu, Dec 13, 2012 at 02:27:01PM -0500, Peter Hurley wrote:
> > > On Thu, 2012-12-13 at 13:08 -0500, Neil Horman wrote:
> > > > On Thu, Dec 13, 2012 at 09:49:31AM -0500, Peter Hurley wrote:
> > > > > On Thu, 2012-12-13 at 07:36 -0500, Neil Horman wrote:
> > > > > > On Wed, Dec 12, 2012 at 03:59:17PM -0500, Peter Hurley wrote:
> > > > > > > On Tue, 2012-12-11 at 11:45 -0500, Neil Horman wrote:
> > > > > > > > On Tue, Dec 11, 2012 at 10:16:51AM -0500, Peter Hurley wrote:
> > > > > > > > > On Tue, 2012-12-11 at 09:30 -0500, Neil Horman wrote:
> > > > > > > > > > On Tue, Dec 11, 2012 at 09:19:52AM -0500, Peter Hurley wrote:
> > > > > > > > > > > On Tue, 2012-12-11 at 04:51 +0000, Cong Wang wrote:
> > > > > > > > > > > > On Mon, 10 Dec 2012 at 14:17 GMT, Peter Hurley <peter@hurleysoftware.com> wrote:
> > > > > > > > > > > > > Now that netpoll has been disabled for slaved devices, is there a
> > > > > > > > > > > > > recommended method of running netconsole on a machine that has a slaved
> > > > > > > > > > > > > device?
> > > > > > > > > > > > >
> > > > > > > > > > > > 
> > > > > > > > > > > > Yes, running it on the master device instead.
> > > > > > > > > > > 
> > > > > > > > > > > Thanks for the suggestion, but:
> > > > > > > > > > > 
> > > > > > > > > > > [ 0.000000] Kernel command line: BOOT_IMAGE=/boot/vmlinuz-3.7.0-rc8-xeon ...... netconsole=@192.168.10.99/br0,30000@192.168.10.100/xx:xx:xx:xx:xx:xx
> > > > > > > > > > > ...
> > > > > > > > > > > [ 5.289869] netpoll: netconsole: local port 6665
> > > > > > > > > > > [ 5.289885] netpoll: netconsole: local IP 192.168.10.99
> > > > > > > > > > > [ 5.289892] netpoll: netconsole: interface 'br0'
> > > > > > > > > > > [ 5.289898] netpoll: netconsole: remote port 30000
> > > > > > > > > > > [ 5.289907] netpoll: netconsole: remote IP 192.168.10.100
> > > > > > > > > > > [ 5.289914] netpoll: netconsole: remote ethernet address xx:xx:xx:xx:xx:xx
> > > > > > > > > > > [ 5.289922] netpoll: netconsole: br0 doesn't exist, aborting
> > > > > > > > > > > [ 5.289929] netconsole: cleaning up
> > > > > > > > > > > ...
> > > > > > > > > > > [ 9.392291] Bridge firewalling registered
> > > > > > > > > > > [ 9.396805] device eth1 entered promiscuous mode
> > > > > > > > > > > [ 9.418350] eth1:  setting full-duplex.
> > > > > > > > > > > [ 9.421268] br0: port 1(eth1) entered forwarding state
> > > > > > > > > > > [ 9.423354] br0: port 1(eth1) entered forwarding state
> > > > > > > > > > > 
> > > > > > > > > > > 
> > > > > > > > > > > Is there a way to control or associate network device names prior to
> > > > > > > > > > > udev renaming?
> > > > > > > > > > > 
> > > > > > > > > > That looks like a systemd problem (or more specifically a boot dependency
> > > > > > > > > > problem).  You need to modify your netconsole unit/service file to start after
> > > > > > > > > > all your networking is up.  NetworkManager provides a dummy service file for
> > > > > > > > > > this purpose, called NetworkManager-wait-online.service
> > > > > > > > > 
> > > > > > > > > Ok. So with a single physical network interface that will be bridged,
> > > > > > > > > netconsole cannot be used for kernel boot messages.
> > > > > > > > > 
> > > > > > > > > On a machine with multiple NICs, is there a way to control device
> > > > > > > > > naming so that the interface name specified for netconsole on the
> > > > > > > > > boot command line will actually correspond to the intended
> > > > > > > > > device? For example,
> > > > > > > > > 
> > > > > > > > > [ 0.000000] Kernel command line: BOOT_IMAGE=/boot/vmlinuz-3.7.0-rc8-xeon ...... netconsole=@192.168.1.123/eth0,30000@192.168.1.139/xx:xx:xx:xx:xx:xx
> > > > > > > > > ....
> > > > > > > > > [ 4.092184] 3c59x: Donald Becker and others.
> > > > > > > > > [ 4.092204] 0000:07:05.0: 3Com PCI 3c905C Tornado at ffffc9000186cf80.
> > > > > > > > > [ 4.094035] tg3.c:v3.125 (September 26, 2012)
> > > > > > > > > ....
> > > > > > > > > [ 4.125038] tg3 0000:08:00.0 eth1: Tigon3 [partno(BCM95754) rev b002] (PCI Express) MAC address xx:xx:xx:xx:xx:xx
> > > > > > > > > [ 4.125055] tg3 0000:08:00.0 eth1: attached PHY is 5787 (10/100/1000Base-T Ethernet) (WireSpeed[1], EEE[0])
> > > > > > > > > [ 4.125062] tg3 0000:08:00.0 eth1: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0] TSOcap[1]
> > > > > > > > > [ 4.125068] tg3 0000:08:00.0 eth1: dma_rwctrl[76180000] dma_mask[64-bit]
> > > > > > > > > 
> > > > > > > > > This is attaching netconsole to the wrong device because bus
> > > > > > > > > enumeration, and therefore load order, is not consistent from boot to
> > > > > > > > > boot.
> > > > > > > > > 
> > > > > > > > No, there's no way to do that.  As you note, device enumeration isn't consistent
> > > > > > > > across boots; that's why udev creates rules to rename devices based on immutable
> > > > > > > > (or semi-immutable) data, like MAC addresses or PCI bus locations.  Once that
> > > > > > > > happens, you'll have consistent names for your interfaces, and that work is
> > > > > > > > guaranteed to be done once NetworkManager has finished opening all the
> > > > > > > > interfaces that it needs (hence my suggestion to make the netconsole service
> > > > > > > > dependent on NetworkManager service startup completing).
> > > > > > > 
> > > > > > > Just wondering if you think something like the patch below is
> > > > > > > suitable/acceptable for insulating netconsole from inconsistent device
> > > > > > > name scenarios without changing the existing semantics. The basic idea
> > > > > > > is to allow an ethernet MAC address in the <dev> field of the
> > > > > > > netconsole= options, and if a MAC address was specified rather than a
> > > > > > > device name, to do the dev lookup from the MAC address instead.
> > > > > > > 
> > > > > > > This doesn't extend to, but also doesn't interfere with, the dynamic
> > > > > > > config of netconsole via configfs.
> > > > > > > 
> > > > > > > Would you mind reviewing it?
> > > > > > > 
> > > > > > > Regards,
> > > > > > > Peter
> > > > > > > 
> > > > > > This looks like a pretty good idea to me.  That said, something occurred to me
> > > > > > when you wrote your summary above.  Have you looked at the netconsole service
> > > > > > scripts that most distros provide in their packaging?  I'm almost positive Red
> > > > > > Hat/Fedora (and also likely Suse and Ubuntu) already implement this functionality
> > > > > > from user space.  Basically, instead of people just modprobing netconsole, they
> > > > > > create a service script that parses a config file that contains all the
> > > > > > options needed to load the netconsole module, and it has the intelligence to see
> > > > > > if you specified a mac address rather than a device.  If you did, it finds
> > > > > > the device with the corresponding mac address and uses that as the device.  I'm sorry, I
> > > > > > don't know why I didn't think of that before.  Check that out though, that will
> > > > > > likely give you exactly what you need.
> > > > > 
> > > > > Even with a udev rule to load netconsole that runs immediately after
> > > > > device renaming (so before scripting), most of the dynamic module
> > > > > loading has already happened so netconsole misses it. At least with the
> > > > > patch, netconsole will load and attach to the proper interface much
> > > > > earlier in the boot so that module-load-time messages will be caught.
> > > > > 
> > > > I'm not sure what you mean by this.
> > > 
> > > This is the beginning of my netconsole log if I use userspace scripts to
> > > start it.
> > > 
> > > [   19.125314] ip_tables: (C) 2000-2006 Netfilter Core Team
> > > [   20.060925] nf_conntrack version 0.5.0 (16384 buckets, 65536 max)
> > > [   21.829331] ip6_tables: (C) 2000-2006 Netfilter Core Team
> > > [   25.728370] at-spi-registry[1862]: segfault at 18 ip 00007f6dd1dd45f1 sp 00007fff49bcd760 error 4 in libgconf-2.so.4.1.5[7f6dd1dbd000+2d000]
> > > [   26.778848] EXT4-fs (dm-3): re-mounted. Opts: errors=remount-ro,commit=0
> > > [   30.643469] Bluetooth: RFCOMM TTY layer initialized
> > > [   30.643509] Bluetooth: RFCOMM socket layer initialized
> > > [   30.643512] Bluetooth: RFCOMM ver 1.11
> > > [   30.784550] Bluetooth: BNEP (Ethernet Emulation) ver 1.3
> > > [   30.784567] Bluetooth: BNEP filters: protocol multicast
> > > [   30.784584] Bluetooth: BNEP socket layer initialized
> > > [   34.010813] init: plymouth-stop pre-start process (2205) terminated with status 1
> > > 
> > > This is the beginning of my netconsole log if I am able to specify
> > > netconsole= options on the boot command line. Netconsole starts logging
> > > much earlier because it is loaded much earlier.
> > > 
> > > [    8.764336] EXT4-fs (dm-4): mounted filesystem with ordered data mode. Opts: (null)
> > > [    9.409379] firewire_core 0000:07:06.0: created device fw1: GUID 0800460301c2d69e, S400
> > > [    9.567395] init: ureadahead main process (500) terminated with status 5
> > > [   10.400338] Adding 10996456k swap on /dev/mapper/isw_cbdbfhdjad_Raid0p5.  Priority:-1 extents:1 across:10996456k 
> > > [   10.496974] udevd[541]: starting version 173
> > > [   10.725906] EXT4-fs (dm-4): re-mounted. Opts: errors=remount-ro
> > > [   11.288352] lp: driver loaded but no devices found
> > > [   12.240058] parport_pc 00:05: reported by Plug and Play ACPI
> > > [   12.240145] parport0: PC-style at 0x378 (0x778), irq 7, using FIFO [PCSPP,TRISTATE,COMPAT,ECP]
> > > [   12.336161] lp0: using parport0 (interrupt-driven).
> > > [   12.342867] microcode: CPU0 sig=0x10676, pf=0x40, revision=0x60f
> > > [   12.436657] shpchp: Standard Hot Plug PCI Controller Driver version: 0.4
> > > [   12.442245] ppdev: user-space parallel port driver
> > > [   12.451592] net firewire0: IPv4 over IEEE 1394 on card 0000:07:06.0
> > > 
> > > Does that make more sense now?
> > > 
> > No, actually, what exactly are you trying to show me here?  I don't see any
> > indication of netconsole doing anything in either of these log snippets.
>                  ^^^^^^^^^^^^^^^^^^^^^^
>       except that it was netconsole that wrote these logs
> 
> Note the log times.
> 
> In the first log, which is from a netconsole loaded by userspace
> scripts, the first printk it logs isn't until 19.125314 secs into boot.
> Most kernel + module init has already happened by this time.
> 
> The second log is from a netconsole (still built as a module) loaded
> as a result of using the boot command line. It starts logging at
> 8.764336 secs into boot -- more than 10 secs earlier than using userspace
> scripting to load netconsole.
> 
> > I'm
> > also not sure why you're specifying netconsole options on the kernel command
> > line at all.
> > Can you elaborate?
> 
> Specifying netconsole= on the boot command line loads the netconsole
> module at the earliest possible time (and much earlier than scripting
> will do). And also leaves it as an optional configuration (at least for
> me it does since I use grub2 and can edit the boot command line before
> booting).
> 
> AFAIK, there are 5 ways to load netconsole:
> 1. On the boot command line with netconsole=
>    If netconsole is built-in, this is the only way to initialize it.
>    If netconsole is a module, this forces netconsole to load at the
>    earliest possible time.
> 2. Via /etc/modules
>    This happens before network device renaming, so suffers from the
>    problems we've been discussing.
> 3. Via a custom udev rule in /etc/udev/rules.d/
>    Earliest userspace method to modprobe netconsole and can be used
>    after device renaming, but is still fairly late in the boot process.
> 4. Via init/service scripting
> 5. At the user shell
> 
> AFAIK, there are 4 ways to specify the necessary options to netconsole:
> a. On the boot command line with netconsole=
>    Also loads netconsole
> b. In a .conf file in /etc/modprobe.d
> c. On the modprobe command line
> d. Via configfs
> 
Ah!  I'm sorry, I didn't realize this was really about getting netconsole up
early in the boot, rather than just getting it up robustly using the startup
script.  If that's the case, then I would recommend that you modify the initramfs
to do something similar to the startup script (since that's where the netconsole
module will get loaded anyway).  You can write a script there that will let you
specify the destination ip address and figure out the output dev based on the
routing tables.  If you're using dracut to build your initramfs, then this
should be pretty straightforward.
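
To make the "figure out the output dev based on the routing tables" step concrete, here is a minimal userspace sketch in C. It is only an illustration under stated assumptions: `struct route_entry` and `pick_output_dev()` are hypothetical names, the table is passed in by the caller rather than read from the system, and the selection rule is a plain longest-netmask match over IPv4 routes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <arpa/inet.h>

/* Hypothetical, simplified route entry (not a kernel structure). */
struct route_entry {
	const char *dest;	/* network address, dotted quad */
	const char *mask;	/* netmask, dotted quad */
	const char *dev;	/* outgoing interface name */
};

/* Convert a dotted quad to a host-order 32-bit value; 0 on success. */
static int ipv4_to_host(const char *s, uint32_t *out)
{
	struct in_addr a;

	if (inet_pton(AF_INET, s, &a) != 1)
		return -1;
	*out = ntohl(a.s_addr);
	return 0;
}

/*
 * Return the interface of the most specific route (longest netmask)
 * covering target_ip, or NULL if nothing matches or target_ip is invalid.
 */
static const char *pick_output_dev(const struct route_entry *tbl, size_t n,
				   const char *target_ip)
{
	const char *best = NULL;
	uint32_t best_mask = 0;
	uint32_t ip, dest, mask;
	size_t i;

	assert(tbl != NULL && target_ip != NULL);
	if (ipv4_to_host(target_ip, &ip))
		return NULL;

	for (i = 0; i < n; i++) {
		/* Skip malformed entries instead of failing the whole lookup. */
		if (ipv4_to_host(tbl[i].dest, &dest) ||
		    ipv4_to_host(tbl[i].mask, &mask))
			continue;
		if ((ip & mask) != (dest & mask))
			continue;
		/* For contiguous netmasks, a larger value is a longer prefix. */
		if (best == NULL || mask > best_mask) {
			best = tbl[i].dev;
			best_mask = mask;
		}
	}
	return best;
}
```

An initramfs script would more likely query the live routing table than carry its own copy; the sketch only shows the selection logic once the routes are known.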

Neil

> Regards,
> Peter
> 
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Peter Hurley Dec. 15, 2012, 2:13 p.m. UTC | #9
On Fri, 2012-12-14 at 09:20 -0500, Neil Horman wrote:
> Ah!  I'm sorry, I didn't realize this was really about getting netconsole up
> early in the boot, rather than just getting it up robustly using the startup
> script.

Well, it's both but I should have been clearer here. Sorry about that.

> If that's the case, then I would recommend that you modify the initramfs
> to do something similar to the startup script (since that's where the netconsole
> module will get loaded anyway).  You can write a script there that will let you
> specify the destination ip address and figure out the output dev based on the
> routing tables.  If you're using dracut to build your initramfs, then this
> should be pretty straightforward.

When I get some more free time I'll experiment with this approach.

Just to clarify something from earlier in the discussion:

On Thu, 2012-12-13 at 13:08 -0500, Neil Horman wrote:
> On Thu, Dec 13, 2012 at 09:49:31AM -0500, Peter Hurley wrote:
....
> > There is an unforeseen consequence of the patch: it breaks device
> > renaming because the device will already be in use by netconsole. Which
> > is the whole problem with userspace device renaming to begin with...
> > 
> That is bad, but see above, the netconsole service can work around this for you,
> allowing you to never have to specify a particular device at all.

The breakage is a normal consequence of being able to load netconsole
before the udev rules that do device renaming. The same thing would
happen modifying initramfs.

Basically, once netconsole attaches to a device, that device cannot be
renamed. Unfortunately, the default udev behavior messes things up
further because it will try to do this:
  eth0->eth1
  eth1->eth0
which means neither device will be renamed.
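
The failed swap can be illustrated with a toy model in C. This is not kernel code: `struct toy_dev`, `toy_rename()`, the `busy` flag standing in for "in use by netconsole", and the specific error codes are all assumptions made for the sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define NAME_LEN 16	/* same size as the kernel's IFNAMSIZ */

/* Toy stand-in for a network device. */
struct toy_dev {
	char name[NAME_LEN];
	int busy;	/* nonzero models "in use by netconsole" */
};

static struct toy_dev *toy_find(struct toy_dev *devs, size_t n,
				const char *name)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (strcmp(devs[i].name, name) == 0)
			return &devs[i];
	return NULL;
}

/*
 * Rename "from" to "to". Fails with -ENODEV if "from" is missing,
 * -EBUSY if the device is in use, -EEXIST if "to" is already taken,
 * and -EINVAL if the new name does not fit.
 */
static int toy_rename(struct toy_dev *devs, size_t n,
		      const char *from, const char *to)
{
	struct toy_dev *d;

	assert(devs != NULL && from != NULL && to != NULL);
	d = toy_find(devs, n, from);
	if (!d)
		return -ENODEV;
	if (d->busy)
		return -EBUSY;
	if (toy_find(devs, n, to))
		return -EEXIST;
	if (strlen(to) >= NAME_LEN)
		return -EINVAL;
	strcpy(d->name, to);
	return 0;
}
```

With eth0 marked busy, the rename of eth0 is refused, and the rename of eth1 to eth0 is refused because the name eth0 is still taken, so both devices keep their original names.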

Maybe the net core should just implement persistent device names ;)

Thanks again for all your time,
Peter Hurley

Neil Horman Dec. 17, 2012, 2:20 p.m. UTC | #10
On Sat, Dec 15, 2012 at 09:13:58AM -0500, Peter Hurley wrote:
> On Fri, 2012-12-14 at 09:20 -0500, Neil Horman wrote:
> > Ah!  I'm sorry, I didn't realize this was really about getting netconsole up
> > early in the boot, rather than just getting it up robustly using the startup
> > script.
> 
> Well, it's both but I should have been clearer here. Sorry about that.
> 
> > If that's the case, then I would recommend that you modify the initramfs
> > to do something similar to the startup script (since that's where the netconsole
> > module will get loaded anyway).  You can write a script there that will let you
> > specify the destination ip address and figure out the output dev based on the
> > routing tables.  If you're using dracut to build your initramfs, then this
> > should be pretty straightforward.
> 
> When I get some more free time I'll experiment with this approach.
> 
> Just to clarify something from earlier in the discussion:
> 
> On Thu, 2012-12-13 at 13:08 -0500, Neil Horman wrote:
> > On Thu, Dec 13, 2012 at 09:49:31AM -0500, Peter Hurley wrote:
> ....
> > > There is an unforeseen consequence of the patch: it breaks device
> > > renaming because the device will already be in use by netconsole. Which
> > > is the whole problem with userspace device renaming to begin with...
> > > 
> > That is bad, but see above, the netconsole service can work around this for you,
> > allowing you to never have to specify a particular device at all.
> 
> The breakage is a normal consequence of being able to load netconsole
> before the udev rules that do device renaming. The same thing would
> happen modifying initramfs.
> 
> Basically, once netconsole attaches to a device, that device cannot be
> renamed. Unfortunately, the default udev behavior messes things up
> further because it will try to do this:
>   eth0->eth1
>   eth1->eth0
> which means neither device will be renamed.
> 
> Maybe the net core should just implement persistent device names ;)
> 
There's no good way for the kernel to do that, as persistent naming in this case
is a matter of user policy, not kernel hardware management (i.e. do you want a
network name to follow a mac address, a pci slot, or the network it's connected
to)?  You can use smbios to get some modicum of persistent device naming
currently, but I don't recall if that requires udev rules to implement as well.

Your best bet is to simply make your initramfs more robust.  I understand what
you're saying regarding renaming after you've taken a reference on a device not
being possible, but you can run udev within the initramfs, and do your renaming
prior to your netconsole load.

Thanks
Neil

> Thanks again for all your time,
> Peter Hurley
> 
> 
Peter Hurley April 29, 2013, 5:28 p.m. UTC | #11
On Mon, 2012-12-17 at 09:20 -0500, Neil Horman wrote:
> On Sat, Dec 15, 2012 at 09:13:58AM -0500, Peter Hurley wrote:
> > On Fri, 2012-12-14 at 09:20 -0500, Neil Horman wrote:
> > > Ah!  I'm sorry, I didn't realize this was really about getting netconsole up
> > > early in the boot, rather than just getting it up robustly using the startup
> > > script.
> > 
> > Well, it's both but I should have been clearer here. Sorry about that.
> > 
> > > If that's the case, then I would recommend that you modify the initramfs
> > > to do something similar to the startup script (since that's where the netconsole
> > > module will get loaded anyway).  You can write a script there that will let you
> > > specify the destination ip address and figure out the output dev based on the
> > > routing tables.  If you're using dracut to build your initramfs, then this
> > > should be pretty straightforward.
> > 
> > When I get some more free time I'll experiment with this approach.
> > 
> > Just to clarify something from earlier in the discussion:
> > 
> > On Thu, 2012-12-13 at 13:08 -0500, Neil Horman wrote:
> > > On Thu, Dec 13, 2012 at 09:49:31AM -0500, Peter Hurley wrote:
> > ....
> > > > There is an unforeseen consequence of the patch: it breaks device
> > > > renaming because the device will already be in use by netconsole. Which
> > > > is the whole problem with userspace device renaming to begin with...
> > > > 
> > > That is bad, but see above, the netconsole service can work around this for you,
> > > allowing you to never have to specify a particular device at all.
> > 
> > The breakage is a normal consequence of being able to load netconsole
> > before the udev rules that do device renaming. The same thing would
> > happen modifying initramfs.
> > 
> > Basically, once netconsole attaches to a device, that device cannot be
> > renamed. Unfortunately, the default udev behavior messes things up
> > further because it will try to do this:
> >   eth0->eth1
> >   eth1->eth0
> > which means neither device will be renamed.
> > 
> > Maybe the net core should just implement persistent device names ;)
> > 
> There's no good way for the kernel to do that, as persistent naming in this case
> is a matter of user policy, not kernel hardware management (i.e. do you want a
> network name to follow a mac address, a pci slot, or the network it's connected
> to)?  You can use smbios to get some modicum of persistent device naming
> currently, but I don't recall if that requires udev rules to implement as well.
> 
> Your best bet is to simply make your initramfs more robust.  I understand what
> you're saying regarding renaming after you've taken a reference on a device not
> being possible, but you can run udev within the initramfs, and do your renaming
> prior to your netconsole load.

Hi Neil,

I plan to re-submit 'netconsole: allow mac addr to specify local
interface device' which you originally objected to because you asserted
that the same effect could be obtained through udev scripts in the
initramfs.

When you shot down this patch, did you actually try what you suggested
in the initramfs or were you just hypothesizing that it would be possible?

Regards,
Peter Hurley

Neil Horman April 29, 2013, 6:21 p.m. UTC | #12
On Mon, Apr 29, 2013 at 01:28:45PM -0400, Peter Hurley wrote:
> On Mon, 2012-12-17 at 09:20 -0500, Neil Horman wrote:
> > On Sat, Dec 15, 2012 at 09:13:58AM -0500, Peter Hurley wrote:
> > > On Fri, 2012-12-14 at 09:20 -0500, Neil Horman wrote:
> > > > Ah!  I'm sorry, I didn't realize this was really about getting netconsole up
> > > > early in the boot, rather than just getting it up robustly using the startup
> > > > script.
> > > 
> > > Well, it's both but I should have been clearer here. Sorry about that.
> > > 
> > > > If that's the case, then I would recommend that you modify the initramfs
> > > > to do something similar to the startup script (since that's where the netconsole
> > > > module will get loaded anyway).  You can write a script there that will let you
> > > > specify the destination ip address and figure out the output dev based on the
> > > > routing tables.  If you're using dracut to build your initramfs, then this
> > > > should be pretty straightforward.
> > > 
> > > When I get some more free time I'll experiment with this approach.
> > > 
> > > Just to clarify something from earlier in the discussion:
> > > 
> > > On Thu, 2012-12-13 at 13:08 -0500, Neil Horman wrote:
> > > > On Thu, Dec 13, 2012 at 09:49:31AM -0500, Peter Hurley wrote:
> > > ....
> > > > > There is an unforeseen consequence of the patch: it breaks device
> > > > > renaming because the device will already be in use by netconsole. Which
> > > > > is the whole problem with userspace device renaming to begin with...
> > > > > 
> > > > That is bad, but see above, the netconsole service can work around this for you,
> > > > allowing you to never have to specify a particular device at all.
> > > 
> > > The breakage is a normal consequence of being able to load netconsole
> > > before the udev rules that do device renaming. The same thing would
> > > happen modifying initramfs.
> > > 
> > > Basically, once netconsole attaches to a device, that device cannot be
> > > renamed. Unfortunately, the default udev behavior messes things up
> > > further because it will try to do this:
> > >   eth0->eth1
> > >   eth1->eth0
> > > which means neither device will be renamed.
> > > 
> > > Maybe the net core should just implement persistent device names ;)
> > > 
> > There's no good way for the kernel to do that, as persistent naming in this case
> > is a matter of user policy, not kernel hardware management (i.e. do you want a
> > network name to follow a mac address, a pci slot, or the network it's connected
> > to)?  You can use smbios to get some modicum of persistent device naming
> > currently, but I don't recall if that requires udev rules to implement as well.
> > 
> > Your best bet is to simply make your initramfs more robust.  I understand what
> > you're saying regarding renaming after you've taken a reference on a device not
> > being possible, but you can run udev within the initramfs, and do your renaming
> > prior to your netconsole load.
> 
> Hi Neil,
> 
> I plan to re-submit 'netconsole: allow mac addr to specify local
> interface device' which you originally objected to because you asserted
> that the same effect could be obtained through udev scripts in the
> initramfs.
> 
> When you shot down this patch, did you actually try what you suggested
> in the initramfs or were you just hypothesizing that it would be possible?
> 
I've not tried specifically what you want to do, no, but I've done interface
renaming plenty of times in the initramfs back when I did kdump work (we had to
rename devices in the initramfs to align them with whatever udev renamed them to
once we pivot_root-ed to the rootfs).

I presume you're sending me this note because you've for some reason decided
that doing this in the initramfs isn't feasible?  I'm happy to help you through
it if you like.

You're also welcome to resubmit your patch, but you're going to have to justify
why doing this in user space isn't a sufficient solution.

Regards
Neil

Cong Wang April 30, 2013, 2:44 a.m. UTC | #13
On Tue, Apr 30, 2013 at 2:21 AM, Neil Horman <nhorman@tuxdriver.com> wrote:
> I've not tried specifically what you want to do, no, but I've done interface
> renaming plenty of times in the initramfs back when I did kdump work (we had to
> rename devices in the initramfs to align them with whatever udev renamed them to
> once we pivot_root-ed to the rootfs).
>

The new kdump infrastructure, which uses dracut, has already switched to udev
completely; it has a rule file that maps each mac address to its interface name.
Peter Hurley May 3, 2013, 7:07 p.m. UTC | #14
On 04/29/2013 02:21 PM, Neil Horman wrote:
> On Mon, Apr 29, 2013 at 01:28:45PM -0400, Peter Hurley wrote:
>> On Mon, 2012-12-17 at 09:20 -0500, Neil Horman wrote:
>>> On Sat, Dec 15, 2012 at 09:13:58AM -0500, Peter Hurley wrote:
>>>> On Fri, 2012-12-14 at 09:20 -0500, Neil Horman wrote:
>>>>> Ah!  I'm sorry, I didn't realize this was really about getting netconsole up
>>>>> early in the boot, rather than just getting it up robustly using the startup
>>>>> script.
>>>>
>>>> Well, it's both but I should have been clearer here. Sorry about that.
>>>>
>>>>> If that's the case, then I would recommend that you modify the initramfs
>>>>> to do something similar to the startup script (since that's where the netconsole
>>>>> module will get loaded anyway).  You can write a script there that will let you
>>>>> specify the destination ip address and figure out the output dev based on the
>>>>> routing tables.  If you're using dracut to build your initramfs, then this
>>>>> should be pretty straightforward.
>>>>
>>>> When I get some more free time I'll experiment with this approach.
>>>>
>>>> Just to clarify something from earlier in the discussion:
>>>>
>>>> On Thu, 2012-12-13 at 13:08 -0500, Neil Horman wrote:
>>>>> On Thu, Dec 13, 2012 at 09:49:31AM -0500, Peter Hurley wrote:
>>>> ....
>>>>>> There is an unforeseen consequence of the patch: it breaks device
>>>>>> renaming because the device will already be in use by netconsole. Which
>>>>>> is the whole problem with userspace device renaming to begin with...
>>>>>>
>>>>> That is bad, but see above, the netconsole service can work around this for you,
>>>>> allowing you to never have to specify a particular device at all.
>>>>
>>>> The breakage is a normal consequence of being able to load netconsole
>>>> before the udev rules that do device renaming. The same thing would
>>>> happen modifying initramfs.
>>>>
>>>> Basically, once netconsole attaches to a device, that device cannot be
>>>> renamed. Unfortunately, the default udev behavior messes things up
>>>> further because it will try to do this:
>>>>    eth0->eth1
>>>>    eth1->eth0
>>>> which means neither device will be renamed.
>>>>
>>>> Maybe the net core should just implement persistent device names ;)
>>>>
>>> There's no good way for the kernel to do that, as persistent naming in this case
>>> is a matter of user policy, not kernel hardware management (i.e. do you want a
>>> network name to follow a mac address, a pci slot, or the network it's connected
>>> to)?  You can use smbios to get some modicum of persistent device naming
>>> currently, but I don't recall if that requires udev rules to implement as well.
>>>
>>> Your best bet is to simply make your initramfs more robust.  I understand what
>>> you're saying regarding renaming after you've taken a reference on a device not
>>> being possible, but you can run udev within the initramfs, and do your renaming
>>> prior to your netconsole load.
>>
>> Hi Neil,
>>
>> I plan to re-submit 'netconsole: allow mac addr to specify local
>> interface device' which you originally objected to because you asserted
>> that the same effect could be obtained through udev scripts in the
>> initramfs.
>>
>> When you shot down this patch, did you actually try what you suggested
>> in the initramfs or were you just hypothesizing that it would be possible?
>>
> I've not tried specifically what you want to do, no, but I've done interface
> renaming plenty of times in the initramfs back when I did kdump work (we had to
> rename devices in the initramfs to align them with whatever udev renamed them to
> once we pivot_root-ed to the rootfs).
>
> I presume you're sending me this note because you've for some reason decided
> that doing this in the initramfs isn't feasible?  I'm happy to help you through
> it if you like.

Neil,

I owe you an apology.

Performing the udev device renaming and modprobing netconsole
with the renamed device interface is indeed possible within initramfs.
Once I had managed to get udev device renaming working in the initramfs,
I had confused myself regarding which interface went with which MAC address.
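
For reference, the parameter handed to modprobe in that last step follows the format from Documentation/networking/netconsole.txt, `[src-port]@[src-ip]/[<dev>],[tgt-port]@<tgt-ip>/[tgt-macaddr]`. The C sketch below assembles such a string once the renamed interface is known; `build_netconsole_param()` is a hypothetical helper written for this illustration, not part of any existing tool.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Build "[src-port]@[src-ip]/[dev],[tgt-port]@<tgt-ip>/[tgt-macaddr]".
 * Optional string fields may be NULL and optional ports may be 0; they
 * are then left empty so that netconsole applies its defaults.
 * Returns 0 on success, -1 if tgt_ip is missing, a port is out of
 * range, or the result does not fit in buf.
 */
static int build_netconsole_param(char *buf, size_t len,
				  unsigned int src_port, const char *src_ip,
				  const char *dev,
				  unsigned int tgt_port, const char *tgt_ip,
				  const char *tgt_mac)
{
	char sport[16] = "";
	char tport[16] = "";
	int n;

	assert(buf != NULL && len > 0);
	if (tgt_ip == NULL || strlen(tgt_ip) == 0)
		return -1;	/* the target IP is the only mandatory field */
	if (src_port > 65535 || tgt_port > 65535)
		return -1;

	if (src_port)
		snprintf(sport, sizeof(sport), "%u", src_port);
	if (tgt_port)
		snprintf(tport, sizeof(tport), "%u", tgt_port);

	n = snprintf(buf, len, "%s@%s/%s,%s@%s/%s",
		     sport, src_ip ? src_ip : "", dev ? dev : "",
		     tport, tgt_ip, tgt_mac ? tgt_mac : "");
	if (n < 0 || (size_t)n >= len)
		return -1;	/* output error or truncated result */
	return 0;
}
```

Leaving a field empty relies on the defaults listed in netconsole.txt (for example source port 6665 and a broadcast target MAC).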

Have a good weekend.

Regards,
Peter Hurley


Patch

diff --git a/Documentation/networking/netconsole.txt b/Documentation/networking/netconsole.txt
index 2e9e0ae2..2dfd703 100644
--- a/Documentation/networking/netconsole.txt
+++ b/Documentation/networking/netconsole.txt
@@ -23,12 +23,13 @@  Sender and receiver configuration:
 It takes a string configuration parameter "netconsole" in the
 following format:
 
- netconsole=[src-port]@[src-ip]/[<dev>],[tgt-port]@<tgt-ip>/[tgt-macaddr]
+ netconsole=[src-port]@[src-ip]/[dev|macaddr],[tgt-port]@<tgt-ip>/[tgt-macaddr]
 
    where
         src-port      source for UDP packets (defaults to 6665)
         src-ip        source IP to use (interface address)
-        dev           network interface (eth0)
+        dev|macaddr   network interface (eth0)
+		      alternate: ethernet MAC address of network interface
         tgt-port      port for logging agent (6666)
         tgt-ip        IP address for logging agent
         tgt-macaddr   ethernet MAC address for logging agent (broadcast)
@@ -47,6 +48,10 @@  complete string enclosed in "quotes", thusly:
 
  modprobe netconsole netconsole="@/,@10.0.0.2/;@/eth1,6892@10.0.0.3/"
 
+The alternate form for specifying the local network interface with the
+ethernet MAC address is useful when the device names are inconsistent from
+boot to boot (e.g., if the machine has multiple NICs).
+
 Built-in netconsole starts immediately after the TCP stack is
 initialized and attempts to bring up the supplied dev at the supplied
 address.
diff --git a/drivers/net/netconsole.c b/drivers/net/netconsole.c
index 6989ebe..3808a31 100644
--- a/drivers/net/netconsole.c
+++ b/drivers/net/netconsole.c
@@ -180,6 +180,7 @@  static struct netconsole_target *alloc_param_target(char *target_config)
 	strlcpy(nt->np.dev_name, "eth0", IFNAMSIZ);
 	nt->np.local_port = 6665;
 	nt->np.remote_port = 6666;
+	memset(nt->np.local_mac, 0, ETH_ALEN);
 	memset(nt->np.remote_mac, 0xff, ETH_ALEN);
 
 	/* Parse parameters and setup netpoll */
@@ -560,6 +561,7 @@  static struct config_item *make_netconsole_target(struct config_group *group,
 	strlcpy(nt->np.dev_name, "eth0", IFNAMSIZ);
 	nt->np.local_port = 6665;
 	nt->np.remote_port = 6666;
+	memset(nt->np.local_mac, 0, ETH_ALEN);
 	memset(nt->np.remote_mac, 0xff, ETH_ALEN);
 
 	/* Initialize the config_item member */
diff --git a/include/linux/netpoll.h b/include/linux/netpoll.h
index 66d5379..d646b26 100644
--- a/include/linux/netpoll.h
+++ b/include/linux/netpoll.h
@@ -20,6 +20,7 @@  struct netpoll {
 
 	__be32 local_ip, remote_ip;
 	u16 local_port, remote_port;
+	u8 local_mac[ETH_ALEN];
 	u8 remote_mac[ETH_ALEN];
 
 	struct list_head rx; /* rx_np list element */
diff --git a/net/core/netpoll.c b/net/core/netpoll.c
index 77a0388..8910a95 100644
--- a/net/core/netpoll.c
+++ b/net/core/netpoll.c
@@ -660,6 +660,7 @@  void netpoll_print_options(struct netpoll *np)
 	np_info(np, "local port %d\n", np->local_port);
 	np_info(np, "local IP %pI4\n", &np->local_ip);
 	np_info(np, "interface '%s'\n", np->dev_name);
+	np_info(np, "local ethernet address %pM\n", np->local_mac);
 	np_info(np, "remote port %d\n", np->remote_port);
 	np_info(np, "remote IP %pI4\n", &np->remote_ip);
 	np_info(np, "remote ethernet address %pM\n", np->remote_mac);
@@ -693,7 +694,8 @@  int netpoll_parse_options(struct netpoll *np, char *opt)
 		if ((delim = strchr(cur, ',')) == NULL)
 			goto parse_failed;
 		*delim = 0;
-		strlcpy(np->dev_name, cur, sizeof(np->dev_name));
+		if (!mac_pton(cur, np->local_mac))
+			strlcpy(np->dev_name, cur, sizeof(np->dev_name));
 		cur = delim;
 	}
 	cur++;
@@ -806,8 +808,21 @@  int netpoll_setup(struct netpoll *np)
 	struct in_device *in_dev;
 	int err;
 
-	if (np->dev_name)
+	if (!is_zero_ether_addr(np->local_mac)) {
+		rcu_read_lock();
+		ndev = dev_getbyhwaddr_rcu(&init_net, ARPHRD_ETHER, np->local_mac);
+		if (!ndev) {
+			rcu_read_unlock();
+			np_err(np, "%pM doesn't exist, aborting\n", np->local_mac);
+			return -ENODEV;
+		}
+		dev_hold(ndev);
+		rcu_read_unlock();
+		strlcpy(np->dev_name, ndev->name, IFNAMSIZ);
+
+	} else if (np->dev_name)
 		ndev = dev_get_by_name(&init_net, np->dev_name);
+
 	if (!ndev) {
 		np_err(np, "%s doesn't exist, aborting\n", np->dev_name);
 		return -ENODEV;
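
The parsing rule the patch adds to netpoll_parse_options() (try the `<dev>` field as a MAC address first, otherwise treat it as an interface name) can be tried out in userspace with the sketch below. It is an approximation only: `struct local_if`, `parse_mac()` and `parse_local_if()` are hypothetical names, and `parse_mac()` is a stand-in for the kernel's mac_pton() that accepts exactly the 17-character colon-separated form.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

#define MAC_LEN  6	/* bytes in an Ethernet address (ETH_ALEN in the kernel) */
#define NAME_LEN 16	/* interface name buffer size (IFNAMSIZ in the kernel) */

/* Hypothetical userspace mirror of the two fields the patch fills in. */
struct local_if {
	char dev_name[NAME_LEN];
	unsigned char local_mac[MAC_LEN];
};

/* Value of one hex digit, or -1 if c is not a hex digit. */
static int hex_val(char c)
{
	if (c >= '0' && c <= '9')
		return c - '0';
	c = (char)tolower((unsigned char)c);
	if (c >= 'a' && c <= 'f')
		return c - 'a' + 10;
	return -1;
}

/* Parse exactly "xx:xx:xx:xx:xx:xx"; returns 1 on success, 0 otherwise. */
static int parse_mac(const char *s, unsigned char mac[MAC_LEN])
{
	unsigned char tmp[MAC_LEN];
	int i, hi, lo;

	if (strlen(s) != 3 * MAC_LEN - 1)
		return 0;
	for (i = 0; i < MAC_LEN; i++) {
		hi = hex_val(s[3 * i]);
		lo = hex_val(s[3 * i + 1]);
		if (hi < 0 || lo < 0)
			return 0;
		if (i < MAC_LEN - 1 && s[3 * i + 2] != ':')
			return 0;
		tmp[i] = (unsigned char)((hi << 4) | lo);
	}
	/* Only touch the output once the whole string has been validated. */
	memcpy(mac, tmp, MAC_LEN);
	return 1;
}

/*
 * Returns 1 if the field was a MAC address (stored in local_mac), or 0
 * if it was copied into dev_name as an interface name instead.
 */
static int parse_local_if(const char *field, struct local_if *out)
{
	assert(field != NULL && out != NULL);
	if (parse_mac(field, out->local_mac))
		return 1;
	snprintf(out->dev_name, sizeof(out->dev_name), "%s", field);
	return 0;
}
```

As in the patch, an all-zero local_mac means "no MAC given", so a caller would initialise the structure with zeros and the default name before parsing.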