diff mbox

[1/1] Tell linkwatch about new interfaces

Message ID cb0375e10904010840n1543a890w9039465d0f80ffcc@mail.gmail.com
State Rejected, archived
Delegated to: David Miller
Headers show

Commit Message

Andrew Lutomirski April 1, 2009, 3:40 p.m. UTC
From: Andrew Lutomirski <amluto@gmail.com>

When a network driver registers a new interface, linkwatch will not notice,
and hence not set the rfc2863 operstate, until netif_carrier_on gets called.
If the new interface has no carrier when it is connected, then a status of
"unknown" is reported to userspace, which confuses various tools
(NetworkManager, for example).

This fires a linkwatch event for all new interfaces, so that operstate
gets set reasonably quickly.

Signed-off-by: Andrew Lutomirski <amluto@gmail.com>

---

This looks like the root cause of the bug I reported here:

http://lkml.org/lkml/2009/3/24/499

Without knowing all the locking and ordering constraints imposed on network
drivers, this seemed like the safest way to fix the bug.  Alternative
approaches would be to call rfc2863_policy directly or to initialize the
relevent fields to sane values (for the carrier off state) in alloc_netdev.

This applies to 2.6.29.  I can rediff it for any other tree, but I didn't see
any changes that would conflict.

the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

David Miller April 5, 2009, 12:05 a.m. UTC | #1
From: Andrew Lutomirski <amluto@gmail.com>
Date: Wed, 1 Apr 2009 11:40:06 -0400

> When a network driver registers a new interface, linkwatch will not notice,
> and hence not set the rfc2863 operstate, until netif_carrier_on gets called.
> If the new interface has no carrier when it is connected, then a status of
> "unknown" is reported to userspace, which confuses various tools
> (NetworkManager, for example).
> 
> This fires a linkwatch event for all new interfaces, so that operstate
> gets set reasonably quickly.
> 
> Signed-off-by: Andrew Lutomirski <amluto@gmail.com>

The default assumed state for a freshly registered network
device is that the link is up.

If that disagrees from reality, the driver should make the
appropriate netif_carrier_off() call.

I'm sure you'll find that the e1000 driver is not doing this
and that is what causes the bug you are seeing.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Andrew Lutomirski April 5, 2009, 4 a.m. UTC | #2
On Sat, Apr 4, 2009 at 8:05 PM, David Miller <davem@davemloft.net> wrote:
> From: Andrew Lutomirski <amluto@gmail.com>
> Date: Wed, 1 Apr 2009 11:40:06 -0400
>
>> When a network driver registers a new interface, linkwatch will not notice,
>> and hence not set the rfc2863 operstate, until netif_carrier_on gets called.
>> If the new interface has no carrier when it is connected, then a status of
>> "unknown" is reported to userspace, which confuses various tools
>> (NetworkManager, for example).
>>
>> This fires a linkwatch event for all new interfaces, so that operstate
>> gets set reasonably quickly.
>>
>> Signed-off-by: Andrew Lutomirski <amluto@gmail.com>
>
> The default assumed state for a freshly registered network
> device is that the link is up.
>
> If that disagrees from reality, the driver should make the
> appropriate netif_carrier_off() call.
>
> I'm sure you'll find that the e1000 driver is not doing this
> and that is what causes the bug you are seeing.
>

Nope.  The end of e1000_probe (in e1000e) is:

        /* tell the stack to leave us alone until e1000_open() is called */
        netif_carrier_off(netdev);
        netif_tx_stop_all_queues(netdev);

        strcpy(netdev->name, "eth%d");
        err = register_netdev(netdev);
        if (err)
                goto err_register;

        e1000_print_device_info(adapter);

netif_carrier_off does:

void netif_carrier_off(struct net_device *dev)
{
        if (!test_and_set_bit(__LINK_STATE_NOCARRIER, &dev->state)) {
                if (dev->reg_state == NETREG_UNINITIALIZED)
                        return;
                linkwatch_fire_event(dev);
        }
}

So, either it should be illegal to call netif_carrier_off on an
unregistered net_device (and there should be a WARN() in
netif_carrier_off), or it should work correctly, and
register_netdevice should do the right thing when there is no carrier
(i.e. something like my patch).

--Andy
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Sergio Luis July 14, 2009, 5:17 p.m. UTC | #3
Hello Dave,

On Sat, Apr 4, 2009 at 9:05 PM, David Miller<davem@davemloft.net> wrote:
> From: Andrew Lutomirski <amluto@gmail.com>
> Date: Wed, 1 Apr 2009 11:40:06 -0400
>
>> When a network driver registers a new interface, linkwatch will not notice,
>> and hence not set the rfc2863 operstate, until netif_carrier_on gets called.
>> If the new interface has no carrier when it is connected, then a status of
>> "unknown" is reported to userspace, which confuses various tools
>> (NetworkManager, for example).
>>
>> This fires a linkwatch event for all new interfaces, so that operstate
>> gets set reasonably quickly.
>>
>> Signed-off-by: Andrew Lutomirski <amluto@gmail.com>
>
> The default assumed state for a freshly registered network
> device is that the link is up.
>
> If that disagrees from reality, the driver should make the
> appropriate netif_carrier_off() call.
>
> I'm sure you'll find that the e1000 driver is not doing this
> and that is what causes the bug you are seeing.
> --

is this patch incorrect, though? with the linkwatch_fire_event() call,
the rfc2863 operstate will be set for everyone at device register
time.
in here I am having the interface operstate as 'unknown', but I do
ifconfig down and up or unplug/plug the cable again it will finally
set the correct rfc2863 operstate.

or should this be fixed on a per-driver basis, like it apparently was
in this case, for his e1000? (drivers/net/skge.c in here).

thanks,
sergio
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Miller July 14, 2009, 6:33 p.m. UTC | #4
From: Sergio Luis <eeeesti@gmail.com>
Date: Tue, 14 Jul 2009 14:17:21 -0300

> is this patch incorrect, though? with the linkwatch_fire_event() call,
> the rfc2863 operstate will be set for everyone at device register
> time.

The issue is dumb drivers that do not manage their link state
at all.  We want them to always have their links up, from the
moment they are registered.

This is especially important for virtual devices.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Sergio Luis July 14, 2009, 6:37 p.m. UTC | #5
On Tue, Jul 14, 2009 at 3:33 PM, David Miller<davem@davemloft.net> wrote:
> From: Sergio Luis <eeeesti@gmail.com>
> Date: Tue, 14 Jul 2009 14:17:21 -0300
>
>> is this patch incorrect, though? with the linkwatch_fire_event() call,
>> the rfc2863 operstate will be set for everyone at device register
>> time.
>
> The issue is dumb drivers that do not manage their link state
> at all.  We want them to always have their links up, from the
> moment they are registered.
>
> This is especially important for virtual devices.
>

I see. I will try to take a look at the driver in question, then. Thanks.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Andrew Lutomirski July 14, 2009, 6:58 p.m. UTC | #6
On Tue, Jul 14, 2009 at 2:33 PM, David Miller<davem@davemloft.net> wrote:
> From: Sergio Luis <eeeesti@gmail.com>
> Date: Tue, 14 Jul 2009 14:17:21 -0300
>
>> is this patch incorrect, though? with the linkwatch_fire_event() call,
>> the rfc2863 operstate will be set for everyone at device register
>> time.
>
> The issue is dumb drivers that do not manage their link state
> at all.  We want them to always have their links up, from the
> moment they are registered.

Such dumb drivers still end up with bogus operstate.

>
> This is especially important for virtual devices.

$ ip link show lo
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue state UNKNOWN
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00

I've never noticed this causing a problem, but it seems a little
silly.  Presumably lo should be "UP."

--Andy

>
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/net/core/dev.c b/net/core/dev.c
index e438f54..45911fd 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -4445,6 +4445,9 @@  int register_netdevice(struct net_device *dev)
 		dev->reg_state = NETREG_UNREGISTERED;
 	}

+	/* Update link state. */
+	linkwatch_fire_event(dev);
+
 out:
 	return ret;
--
To unsubscribe from this list: send the line "unsubscribe netdev" in