Patchwork A race in register_netdevice()

login
register
mail settings
Submitter Kalle Valo
Date April 28, 2011, 10:36 p.m.
Message ID <87y62ugg0a.fsf@purkki.adurom.net>
Download mbox | patch
Permalink /patch/93347/
State RFC
Delegated to: David Miller
Headers show

Comments

Kalle Valo - April 28, 2011, 10:36 p.m.
Hi,

there seems to be a race in register_netdevice(), which is reported here:

https://bugzilla.kernel.org/show_bug.cgi?id=15606

This is visible at least with flimflam and ath6kl. Basically what
happens is this:

Apr 29 00:21:35 roska flimflamd[2598]: src/udev.c:add_net_device() 
Apr 29 00:21:35 roska flimflamd[2598]: connman_inet_ifname: SIOCGIFNAME(index
4): No such device
Apr 29 00:21:45 roska flimflamd[2598]: src/rtnl.c:rtnl_message() buf
0xbfefda3c len 1004
Apr 29 00:21:45 roska flimflamd[2598]: src/rtnl.c:rtnl_message()
NEWLINK len 1004 type 16 flags 0x0000 seq 0

(ignore the 10 s delay, I added that to reproduce the issue easily)

There are two ways to fix this, first is to move kobject registration
after the call to list_netdevice():

           return -EFAULT;

I have confirmed that both of these patches fix the issue. Now I'm
wondering which one is the best way forward. Or is there a better way
to fix this?
stephen hemminger - April 28, 2011, 11:52 p.m.
On Fri, 29 Apr 2011 01:36:37 +0300
Kalle Valo <kvalo@adurom.com> wrote:

> Hi,
> 
> there seems to be a race in register_netdevice(), which is reported here:
> 
> https://bugzilla.kernel.org/show_bug.cgi?id=15606
> 
> This is visible at least with flimflam and ath6kl. Basically what
> happens is this:
> 
> Apr 29 00:21:35 roska flimflamd[2598]: src/udev.c:add_net_device() 
> Apr 29 00:21:35 roska flimflamd[2598]: connman_inet_ifname: SIOCGIFNAME(index
> 4): No such device
> Apr 29 00:21:45 roska flimflamd[2598]: src/rtnl.c:rtnl_message() buf
> 0xbfefda3c len 1004
> Apr 29 00:21:45 roska flimflamd[2598]: src/rtnl.c:rtnl_message()
> NEWLINK len 1004 type 16 flags 0x0000 seq 0
> 
> (ignore the 10 s delay, I added that to reproduce the issue easily)
> 
> There are two ways to fix this, first is to move kobject registration
> after the call to list_netdevice():
> 
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -5425,11 +5425,6 @@ int register_netdevice(struct net_device *dev)
>         if (ret)
>                 goto err_uninit;
>  
> -       ret = netdev_register_kobject(dev);
> -       if (ret)
> -               goto err_uninit;
> -       dev->reg_state = NETREG_REGISTERED;
> -
>         netdev_update_features(dev);
>  
>         /*
> @@ -5443,6 +5438,11 @@ int register_netdevice(struct net_device *dev)
>         dev_hold(dev);
>         list_netdevice(dev);
>  
> +       ret = netdev_register_kobject(dev);
> +       if (ret)
> +               goto err_uninit;
> +       dev->reg_state = NETREG_REGISTERED;
> +
>         /* Notify protocols, that a new device appeared. */
>         ret = call_netdevice_notifiers(NETDEV_REGISTER, dev);
>         ret = notifier_to_errno(ret);
> 
> Other option, noticed by Jouni Malinen, is to take rtnl for
> SIOCGIFNAME. For some reason it's currently unprotected:
> 
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -4917,8 +4917,12 @@ int dev_ioctl(struct net *net, unsigned int
> cmd, void __user *arg)
>           rtnl_unlock();
>                 return ret;
>                 }
> -               if (cmd == SIOCGIFNAME)
> -                  return dev_ifname(net, (struct ifreq __user *)arg);
> +                  if (cmd == SIOCGIFNAME) {
> +                     rtnl_lock();
> +                       ret = dev_ifname(net, (struct ifreq __user
> -                  *)arg);
> +                       rtnl_unlock();
> +                               return ret;
> +                               }
>  
>         if (copy_from_user(&ifr, arg, sizeof(struct ifreq)))
>            return -EFAULT;
> 
> I have confirmed that both of these patches fix the issue. Now I'm
> wondering which one is the best way forward. Or is there a better way
> to fix this?
> 

I see no problem with moving this.
SIOCGIFNAME should not need to hold rtnl.
Kalle Valo - April 29, 2011, 5:20 p.m.
Stephen Hemminger <shemminger@vyatta.com> writes:

> On Fri, 29 Apr 2011 01:36:37 +0300
>
>> there seems to be a race in register_netdevice(), which is reported here:
>> 
>> https://bugzilla.kernel.org/show_bug.cgi?id=15606
>> 
>> This is visible at least with flimflam and ath6kl. Basically what
>> happens is this:
>> 
>> Apr 29 00:21:35 roska flimflamd[2598]: src/udev.c:add_net_device() 
>> Apr 29 00:21:35 roska flimflamd[2598]: connman_inet_ifname: SIOCGIFNAME(index
>> 4): No such device
>> Apr 29 00:21:45 roska flimflamd[2598]: src/rtnl.c:rtnl_message() buf
>> 0xbfefda3c len 1004
>> Apr 29 00:21:45 roska flimflamd[2598]: src/rtnl.c:rtnl_message()
>> NEWLINK len 1004 type 16 flags 0x0000 seq 0

[...]

>> I have confirmed that both of these patches fix the issue. Now I'm
>> wondering which one is the best way forward. Or is there a better way
>> to fix this?
>> 
>
> I see no problem with moving this.
> SIOCGIFNAME should not need to hold rtnl.

Ok, thanks for comments. I'll send a proper patch.
Kalle Valo - May 3, 2011, 11:18 p.m.
Hi Stephen,

Stephen Hemminger <shemminger@vyatta.com> writes:

> On Fri, 29 Apr 2011 01:36:37 +0300
> Kalle Valo <kvalo@adurom.com> wrote:
>
>> there seems to be a race in register_netdevice(), which is reported here:
>> 
>> https://bugzilla.kernel.org/show_bug.cgi?id=15606
>> 
>> This is visible at least with flimflam and ath6kl. Basically what
>> happens is this:
>> 
>> Apr 29 00:21:35 roska flimflamd[2598]: src/udev.c:add_net_device() 
>> Apr 29 00:21:35 roska flimflamd[2598]: connman_inet_ifname: SIOCGIFNAME(index
>> 4): No such device
>> Apr 29 00:21:45 roska flimflamd[2598]: src/rtnl.c:rtnl_message() buf
>> 0xbfefda3c len 1004
>> Apr 29 00:21:45 roska flimflamd[2598]: src/rtnl.c:rtnl_message()
>> NEWLINK len 1004 type 16 flags 0x0000 seq 0

[...]

>> I have confirmed that both of these patches fix the issue. Now I'm
>> wondering which one is the best way forward. Or is there a better way
>> to fix this?
>> 
>
> I see no problem with moving this.
> SIOCGIFNAME should not need to hold rtnl.

I'm having difficulties of fixing the race and exploring other
options. Is there any particular issue why SIOCGIFNAME should not take
rtnl?
stephen hemminger - May 3, 2011, 11:41 p.m.
On Wed, 04 May 2011 02:18:11 +0300
Kalle Valo <kvalo@adurom.com> wrote:

> Hi Stephen,
> 
> Stephen Hemminger <shemminger@vyatta.com> writes:
> 
> > On Fri, 29 Apr 2011 01:36:37 +0300
> > Kalle Valo <kvalo@adurom.com> wrote:
> >
> >> there seems to be a race in register_netdevice(), which is reported here:
> >> 
> >> https://bugzilla.kernel.org/show_bug.cgi?id=15606
> >> 
> >> This is visible at least with flimflam and ath6kl. Basically what
> >> happens is this:
> >> 
> >> Apr 29 00:21:35 roska flimflamd[2598]: src/udev.c:add_net_device() 
> >> Apr 29 00:21:35 roska flimflamd[2598]: connman_inet_ifname: SIOCGIFNAME(index
> >> 4): No such device
> >> Apr 29 00:21:45 roska flimflamd[2598]: src/rtnl.c:rtnl_message() buf
> >> 0xbfefda3c len 1004
> >> Apr 29 00:21:45 roska flimflamd[2598]: src/rtnl.c:rtnl_message()
> >> NEWLINK len 1004 type 16 flags 0x0000 seq 0
> 
> [...]
> 
> >> I have confirmed that both of these patches fix the issue. Now I'm
> >> wondering which one is the best way forward. Or is there a better way
> >> to fix this?
> >> 
> >
> > I see no problem with moving this.
> > SIOCGIFNAME should not need to hold rtnl.
> 
> I'm having difficulties of fixing the race and exploring other
> options. Is there any particular issue why SIOCGIFNAME should not take
> rtnl?

None really, but the answer given by SIOCGIFNAME is going to race
anyway. I.e if ioctl returns a value, by the time user space sees it
the result may have changed.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Patch

--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -5425,11 +5425,6 @@  int register_netdevice(struct net_device *dev)
        if (ret)
                goto err_uninit;
 
-       ret = netdev_register_kobject(dev);
-       if (ret)
-               goto err_uninit;
-       dev->reg_state = NETREG_REGISTERED;
-
        netdev_update_features(dev);
 
        /*
@@ -5443,6 +5438,11 @@  int register_netdevice(struct net_device *dev)
        dev_hold(dev);
        list_netdevice(dev);
 
+       ret = netdev_register_kobject(dev);
+       if (ret)
+               goto err_uninit;
+       dev->reg_state = NETREG_REGISTERED;
+
        /* Notify protocols, that a new device appeared. */
        ret = call_netdevice_notifiers(NETDEV_REGISTER, dev);
        ret = notifier_to_errno(ret);

Other option, noticed by Jouni Malinen, is to take rtnl for
SIOCGIFNAME. For some reason it's currently unprotected:

--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -4917,8 +4917,12 @@  int dev_ioctl(struct net *net, unsigned int
cmd, void __user *arg)
          rtnl_unlock();
                return ret;
                }
-               if (cmd == SIOCGIFNAME)
-                  return dev_ifname(net, (struct ifreq __user *)arg);
+                  if (cmd == SIOCGIFNAME) {
+                     rtnl_lock();
+                       ret = dev_ifname(net, (struct ifreq __user
-                  *)arg);
+                       rtnl_unlock();
+                               return ret;
+                               }
 
        if (copy_from_user(&ifr, arg, sizeof(struct ifreq)))