Patchwork dev_get_valid_name buggy with hash collision

login
register
mail settings
Submitter Daniel Lezcano
Date May 18, 2010, 10:17 a.m.
Message ID <4BF26926.4070507@free.fr>
Download mbox | patch
Permalink /patch/52859/
State Superseded
Delegated to: David Miller
Headers show

Comments

Daniel Lezcano - May 18, 2010, 10:17 a.m.
Hi all,

the commit:

commit d90310243fd750240755e217c5faa13e24f41536
Author: Octavian Purdila <opurdila@ixiacom.com>
Date:   Wed Nov 18 02:36:59 2009 +0000

     net: device name allocation cleanups

introduced a bug when there is a hash collision making impossible to 
rename a device with eth%d

This bug is very hard to reproduce and appears rarely, but finally I 
succeed to reproduce it with the program in attachment which fail to 
rename a device with errno ENFILE.

The test program creates a new network namespace in order to avoid 
messing the real network and to be sure we don't have any ethernet 
devices. Hence, we know if we create one ethernet device with the name 
eth%d the result will be 'eth0'.

The first step is to find a conflicting name with 'eth0':
  1) We compute the hash of 'eth0' with the hashing functions imported 
from the kernel
  2) We create a temporary name, compute its hash
  3) We compare the hash with the one we found for 'eth0'.

We loop until the hashes are different. When they are the same, then the 
temporary name is a conflicting name.

We create a dummy device with the temporary conflicting name and then we 
try to rename it with 'eth%d' (expecting 'eth0'), that fails with ENFILE 
due to the kernel bug.

 From the kernel POV, this is what happen:

dev_change_name does:
---------------------

     [ ... ]

     dev_get_valid_name(net, newname, dev->name, 1);

     [ ... ]

Note the dev->name parameter and newname is 'eth%d'.

dev_get_valid_name does:
------------------------

     [ ... ]

     if (fmt && strchr(name, '%'))
         return __dev_alloc_name(net, name, buf);

     [ ... ]

Note the 'buf' parameter is the 'dev->name' parameter and 'name' is "eth%d"

__dev_alloc_name does:
----------------------

     [ ... ]

         for_each_netdev(net, d) {
             if (!sscanf(d->name, name, &i))
                 continue;
             if (i < 0 || i >= max_netdevices)
                 continue;

             /*  avoid cases where sscanf is not exact inverse of printf */
             snprintf(buf, IFNAMSIZ, name, i);
             if (!strncmp(buf, d->name, IFNAMSIZ))
                 set_bit(i, inuse);
         }

     [ ... ]

Here the buf parameter is 'dev->name', so while we are browsing the 
network devices, we change the name of our eth device we want to rename. 
But in the context of our test program, that does not happen because 
there is no "eth[0-9]" network devices in the namespace, then we exit 
the loop with 'i == 0'.

Right after we do:

     if (buf != name)
         snprintf(buf, IFNAMSIZ, name, i);

Here buf and name pointers are different, so we modify 'buf' which is 
'dev->name', that is the network device name directly. So we have 
'dev->name' == "eth0" after this line.


So right after we are trying to find ourself :)

     [ ... ]

     if (!__dev_get_by_name(net, buf))
         return i;

     [ ... ]

When hashing are the same for the oldname and the newname, the function 
'__dev_get_by_name':

     [ ... ]

     struct hlist_head *head = dev_name_hash(net, name);

     hlist_for_each_entry(dev, p, head, name_hlist)
         if (!strncmp(dev->name, name, IFNAMSIZ))
             return dev;

     [ ... ]

will find the network device because we do __dev_get_by_name(net, 
"eth0"), the dev->name is already modified with "eth0" and the hashing 
of the temporary name and "eth0" are the same so returning the same hash 
entry.

We are lucky, most of the time, because the name of the network device 
and the new name have different hash entry, so we compare to ourself 
very rarely.


IMO, the bug is to pass the 'dev->name' parameter to __dev_alloc_name 
directly instead of a temporary name.

The patch in attachment fix the problem but I am not sure we shouldn't 
go further and do more cleanup around this bug, so you may consider it 
as a RFC more than a fix. If it is acceptable, I will send it as a patch 
against net-2.6.

Thanks
   -- Daniel
Octavian Purdila - May 18, 2010, 12:29 p.m.
On Tuesday 18 May 2010 13:17:10 you wrote:

> the commit:
> 
> commit d90310243fd750240755e217c5faa13e24f41536
> Author: Octavian Purdila <opurdila@ixiacom.com>
> Date:   Wed Nov 18 02:36:59 2009 +0000
> 
>      net: device name allocation cleanups
> 
> introduced a bug when there is a hash collision making impossible to
> rename a device with eth%d
<snip>

Hi Daniel,

Thanks for the detailed explanation ! 

> 
> IMO, the bug is to pass the 'dev->name' parameter to __dev_alloc_name
> directly instead of a temporary name.

I agree.

> 
> The patch in attachment fix the problem but I am not sure we shouldn't
> go further and do more cleanup around this bug, so you may consider it
> as a RFC more than a fix. If it is acceptable, I will send it as a patch
> against net-2.6.
> 

The patch looks good to me, just one doubt here:

>--- net-2.6.orig/net/core/dev.c
>+++ net-2.6/net/core/dev.c
>@@ -936,18 +936,22 @@ int dev_alloc_name(struct net_device *de
> }
> EXPORT_SYMBOL(dev_alloc_name);
> 
>-static int dev_get_valid_name(struct net *net, const char *name, char *buf,
>-                             bool fmt)
>+static int dev_get_valid_name(struct net_device *dev, const char *name, bool 
fmt)
> {
>+       struct net *net;
>+
>+       BUG_ON(!dev_net(dev));
>+       net = dev_net(dev);
>+
>        if (!dev_valid_name(name))
>                return -EINVAL;
> 
>        if (fmt && strchr(name, '%'))
>-               return __dev_alloc_name(net, name, buf);
>+               return dev_alloc_name(dev, name);
>        else if (__dev_get_by_name(net, name))
>                return -EEXIST;
>-       else if (buf != name)
>-               strlcpy(buf, name, IFNAMSIZ);
>+       else if (strncmp(dev->name, name, IFNAMSIZ))
>+                strlcpy(dev->name, name, IFNAMSIZ);
> 

Why do the strncmp, can't we preserve the (buf != name) condition?

Thanks,
tavi
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Daniel Lezcano - May 18, 2010, 2:55 p.m.
On 05/18/2010 02:29 PM, Octavian Purdila wrote:
> On Tuesday 18 May 2010 13:17:10 you wrote:
>
>    
>> the commit:
>>
>> commit d90310243fd750240755e217c5faa13e24f41536
>> Author: Octavian Purdila<opurdila@ixiacom.com>
>> Date:   Wed Nov 18 02:36:59 2009 +0000
>>
>>       net: device name allocation cleanups
>>
>> introduced a bug when there is a hash collision making impossible to
>> rename a device with eth%d
>>
>>      

[ ... ]

>> --- net-2.6.orig/net/core/dev.c
>> +++ net-2.6/net/core/dev.c
>> @@ -936,18 +936,22 @@ int dev_alloc_name(struct net_device *de
>> }
>> EXPORT_SYMBOL(dev_alloc_name);
>>
>> -static int dev_get_valid_name(struct net *net, const char *name, char *buf,
>> -                             bool fmt)
>> +static int dev_get_valid_name(struct net_device *dev, const char *name, bool
>>      
> fmt)
>    
>> {
>> +       struct net *net;
>> +
>> +       BUG_ON(!dev_net(dev));
>> +       net = dev_net(dev);
>> +
>>         if (!dev_valid_name(name))
>>                 return -EINVAL;
>>
>>         if (fmt&&  strchr(name, '%'))
>> -               return __dev_alloc_name(net, name, buf);
>> +               return dev_alloc_name(dev, name);
>>         else if (__dev_get_by_name(net, name))
>>                 return -EEXIST;
>> -       else if (buf != name)
>> -               strlcpy(buf, name, IFNAMSIZ);
>> +       else if (strncmp(dev->name, name, IFNAMSIZ))
>> +                strlcpy(dev->name, name, IFNAMSIZ);
>>
>>      
> Why do the strncmp, can't we preserve the (buf != name) condition

The 'buf' parameter is no longer passed to the function. We have the 
'dev'  and the 'newname' parameters.
The pointer test was just to check 'dev_get_valid_name' was called from 
the 'register_netdevice' function context with 'dev_get_valid_name(net, 
dev->name, dev->name, 0)'. Comparing the strings is valid in this case.

Otherwise dev_get_valid_name is called from:

  *  "dev_change_net_namespace" with "dev%d" or "ifname" specified 
within the netlink message. Both are different pointers, the first will 
fall in the "if (fmt && strchr(name, '%'))".

  * "dev_change_name", where the pointers are different and the strings 
are different.

I think it is safe to do the string comparison here. But maybe there are 
a few simplifications (eg. remove fmt) to do.
If you agree, I will send this patch against net-2.6 and the 
simplifications against net-next-2.6.

Thanks
   -- Daniel


--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Octavian Purdila - May 19, 2010, 5:05 p.m.
On Tuesday 18 May 2010 17:55:36 you wrote:

> >>         if (!dev_valid_name(name))
> >>                 return -EINVAL;
> >>
> >>         if (fmt&&  strchr(name, '%'))
> >> -               return __dev_alloc_name(net, name, buf);
> >> +               return dev_alloc_name(dev, name);
> >>         else if (__dev_get_by_name(net, name))
> >>                 return -EEXIST;
> >> -       else if (buf != name)
> >> -               strlcpy(buf, name, IFNAMSIZ);
> >> +       else if (strncmp(dev->name, name, IFNAMSIZ))
> >> +                strlcpy(dev->name, name, IFNAMSIZ);
> >
> > Why do the strncmp, can't we preserve the (buf != name) condition
> 
> The 'buf' parameter is no longer passed to the function. We have the
> 'dev'  and the 'newname' parameters.
> The pointer test was just to check 'dev_get_valid_name' was called from
> the 'register_netdevice' function context with 'dev_get_valid_name(net,
> dev->name, dev->name, 0)'. Comparing the strings is valid in this case.
> 
> Otherwise dev_get_valid_name is called from:
> 
>   *  "dev_change_net_namespace" with "dev%d" or "ifname" specified
> within the netlink message. Both are different pointers, the first will
> fall in the "if (fmt && strchr(name, '%'))".
> 
>   * "dev_change_name", where the pointers are different and the strings
> are different.
> 

True, but we why not use "if (dev->name !=name)" instead of strncmp? It should 
yield the same results and it is lighter then full strncmp.

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Daniel Lezcano - May 19, 2010, 7:39 p.m.
On 05/19/2010 07:05 PM, Octavian Purdila wrote:
> On Tuesday 18 May 2010 17:55:36 you wrote:
>
>    
>>>>          if (!dev_valid_name(name))
>>>>                  return -EINVAL;
>>>>
>>>>          if (fmt&&   strchr(name, '%'))
>>>> -               return __dev_alloc_name(net, name, buf);
>>>> +               return dev_alloc_name(dev, name);
>>>>          else if (__dev_get_by_name(net, name))
>>>>                  return -EEXIST;
>>>> -       else if (buf != name)
>>>> -               strlcpy(buf, name, IFNAMSIZ);
>>>> +       else if (strncmp(dev->name, name, IFNAMSIZ))
>>>> +                strlcpy(dev->name, name, IFNAMSIZ);
>>>>          
>>> Why do the strncmp, can't we preserve the (buf != name) condition
>>>        
>> The 'buf' parameter is no longer passed to the function. We have the
>> 'dev'  and the 'newname' parameters.
>> The pointer test was just to check 'dev_get_valid_name' was called from
>> the 'register_netdevice' function context with 'dev_get_valid_name(net,
>> dev->name, dev->name, 0)'. Comparing the strings is valid in this case.
>>
>> Otherwise dev_get_valid_name is called from:
>>
>>    *  "dev_change_net_namespace" with "dev%d" or "ifname" specified
>> within the netlink message. Both are different pointers, the first will
>> fall in the "if (fmt&&  strchr(name, '%'))".
>>
>>    * "dev_change_name", where the pointers are different and the strings
>> are different.
>>
>>      
> True, but we why not use "if (dev->name !=name)" instead of strncmp? It should
> yield the same results and it is lighter then full strncmp.
>    

Yes, I agree. In the context of the different callers, that's correct.
Will resend it with the pointer comparison.

Thanks
   -- Daniel
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Patch

Subject: fix dev_get_valid_name
From: Daniel Lezcano <daniel.lezcano@free.fr>

the commit:

commit d90310243fd750240755e217c5faa13e24f41536
Author: Octavian Purdila <opurdila@ixiacom.com>
Date:   Wed Nov 18 02:36:59 2009 +0000

    net: device name allocation cleanups

introduced a bug when there is a hash collision making impossible
to rename a device with eth%d. This bug is very hard to reproduce
and appears rarely.

The problem is coming from we don't pass a temporary buffer to
__dev_alloc_name but 'dev->name' which is modified by the function.

Signed-off-by: Daniel Lezcano <daniel.lezcano@free.fr>
---
 net/core/dev.c |   20 ++++++++++++--------
 1 file changed, 12 insertions(+), 8 deletions(-)

Index: net-2.6/net/core/dev.c
===================================================================
--- net-2.6.orig/net/core/dev.c
+++ net-2.6/net/core/dev.c
@@ -936,18 +936,22 @@  int dev_alloc_name(struct net_device *de
 }
 EXPORT_SYMBOL(dev_alloc_name);
 
-static int dev_get_valid_name(struct net *net, const char *name, char *buf,
-			      bool fmt)
+static int dev_get_valid_name(struct net_device *dev, const char *name, bool fmt)
 {
+	struct net *net;
+
+	BUG_ON(!dev_net(dev));
+	net = dev_net(dev);
+
 	if (!dev_valid_name(name))
 		return -EINVAL;
 
 	if (fmt && strchr(name, '%'))
-		return __dev_alloc_name(net, name, buf);
+		return dev_alloc_name(dev, name);
 	else if (__dev_get_by_name(net, name))
 		return -EEXIST;
-	else if (buf != name)
-		strlcpy(buf, name, IFNAMSIZ);
+	else if (strncmp(dev->name, name, IFNAMSIZ))
+		 strlcpy(dev->name, name, IFNAMSIZ);
 
 	return 0;
 }
@@ -979,7 +983,7 @@  int dev_change_name(struct net_device *d
 
 	memcpy(oldname, dev->name, IFNAMSIZ);
 
-	err = dev_get_valid_name(net, newname, dev->name, 1);
+	err = dev_get_valid_name(dev, newname, 1);
 	if (err < 0)
 		return err;
 
@@ -5083,7 +5087,7 @@  int register_netdevice(struct net_device
 		}
 	}
 
-	ret = dev_get_valid_name(net, dev->name, dev->name, 0);
+	ret = dev_get_valid_name(dev, dev->name, 0);
 	if (ret)
 		goto err_uninit;
 
@@ -5661,7 +5665,7 @@  int dev_change_net_namespace(struct net_
 		/* We get here if we can't use the current device name */
 		if (!pat)
 			goto out;
-		if (dev_get_valid_name(net, pat, dev->name, 1))
+		if (dev_get_valid_name(dev, pat, 1))
 			goto out;
 	}