Patchwork Fix netdev_printk null dereference

login
register
mail settings
Submitter Steve Glendinning
Date March 5, 2010, 12:47 p.m.
Message ID <1267793225-426-1-git-send-email-steve.glendinning@smsc.com>
Download mbox | patch
Permalink /patch/47006/
State Rejected
Delegated to: David Miller
Headers show

Comments

Steve Glendinning - March 5, 2010, 12:47 p.m.
This patch fixes a reproducible null dereference in smsc95xx (and I
suspect others) when the device is removed during a control register
access.  This can be reproduced by rapidly plugging and unplugging
the device during its initialisation.

Signed-off-by: Steve Glendinning <steve.glendinning@smsc.com>
---
 include/linux/netdevice.h |    5 +++--
 1 files changed, 3 insertions(+), 2 deletions(-)
David Miller - March 5, 2010, 2:39 p.m.
From: Steve Glendinning <steve.glendinning@smsc.com>
Date: Fri,  5 Mar 2010 12:47:05 +0000

> This patch fixes a reproducible null dereference in smsc95xx (and I
> suspect others) when the device is removed during a control register
> access.  This can be reproduced by rapidly plugging and unplugging
> the device during its initialisation.
> 
> Signed-off-by: Steve Glendinning <steve.glendinning@smsc.com>

The parent shouldn't become NULL until the device is totally quiesced
and is no longer accesses.

Maybe you can instead fix the smsc95xx driver to abide by this rule
instead of adding a conditional check to thousands of other drivers in
the tree that do not need this?

I really have no intention of adding your change, please fix this
properly, thanks.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Joe Perches - March 5, 2010, 2:50 p.m.
On Fri, 2010-03-05 at 06:39 -0800, David Miller wrote:
> From: Steve Glendinning <steve.glendinning@smsc.com>
> Date: Fri,  5 Mar 2010 12:47:05 +0000
> 
> > This patch fixes a reproducible null dereference in smsc95xx (and I
> > suspect others) when the device is removed during a control register
> > access.  This can be reproduced by rapidly plugging and unplugging
> > the device during its initialisation.
> > 
> > Signed-off-by: Steve Glendinning <steve.glendinning@smsc.com>
> 
> The parent shouldn't become NULL until the device is totally quiesced
> and is no longer accesses.
> 
> Maybe you can instead fix the smsc95xx driver to abide by this rule
> instead of adding a conditional check to thousands of other drivers in
> the tree that do not need this?
> 
> I really have no intention of adding your change, please fix this
> properly, thanks.
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

Perhaps something like this is appropriate in the mean time:

const char *get_netdev_parent_name(const struct net_device *dev)
{
	if (!dev->dev.parent)
		return "Unparented net_device, please report this";
	return netdev->dev.parent;
}


--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Miller - March 5, 2010, 3:03 p.m.
From: Joe Perches <joe@perches.com>
Date: Fri, 05 Mar 2010 06:50:20 -0800

> On Fri, 2010-03-05 at 06:39 -0800, David Miller wrote:
>> From: Steve Glendinning <steve.glendinning@smsc.com>
>> Date: Fri,  5 Mar 2010 12:47:05 +0000
>> 
>> > This patch fixes a reproducible null dereference in smsc95xx (and I
>> > suspect others) when the device is removed during a control register
>> > access.  This can be reproduced by rapidly plugging and unplugging
>> > the device during its initialisation.
>> > 
>> > Signed-off-by: Steve Glendinning <steve.glendinning@smsc.com>
>> 
>> The parent shouldn't become NULL until the device is totally quiesced
>> and is no longer accesses.
>> 
>> Maybe you can instead fix the smsc95xx driver to abide by this rule
>> instead of adding a conditional check to thousands of other drivers in
>> the tree that do not need this?
>> 
>> I really have no intention of adding your change, please fix this
>> properly, thanks.
>> --
>> To unsubscribe from this list: send the line "unsubscribe netdev" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
> Perhaps something like this is appropriate in the mean time:
> 
> const char *get_netdev_parent_name(const struct net_device *dev)
> {
> 	if (!dev->dev.parent)
> 		return "Unparented net_device, please report this";
> 	return netdev->dev.parent;
> }

Yes,  but in the smsc95xx driver. :-)

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Steve Glendinning - March 5, 2010, 3:29 p.m.
David Miller <davem@davemloft.net> wrote on 05/03/2010 14:39:09:

> From: Steve Glendinning <steve.glendinning@smsc.com>
> Date: Fri,  5 Mar 2010 12:47:05 +0000
> 
> > This patch fixes a reproducible null dereference in smsc95xx (and I
> > suspect others) when the device is removed during a control register
> > access.  This can be reproduced by rapidly plugging and unplugging
> > the device during its initialisation.
> > 
> > Signed-off-by: Steve Glendinning <steve.glendinning@smsc.com>
> 
> The parent shouldn't become NULL until the device is totally quiesced
> and is no longer accesses.

The failure I'm seeing is caused when the usb device is disconnected.
smsc95xx detects that a pending USB control operation failed
and tries to print a message via netdev_printk to report this.

Unfortunately, something else (the USB subsystem?) has already set
parent to null at this time so the netdev_printk causes a null 
dereference.

So netdev_printk suddenly changes from safe to use to unsafe to use?

I could change all instances in smsc95xx to defensively check this
at each call, but what should I test to see if the device is still
valid?  Is testing parent != null the correct thing to do?

I think other usbnet drivers may have the same problem, but I don't have
any hardware to test them.

> Maybe you can instead fix the smsc95xx driver to abide by this rule
> instead of adding a conditional check to thousands of other drivers in
> the tree that do not need this?

I agree it'd be better to avoid this check where it's unnecessary, but
if netdev_printk isn't necessarily safe to call for *removable* interfaces
then shouldn't all such callers be checking that it's safe to do so?

Steve
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Miller - March 5, 2010, 3:43 p.m.
From: Steve.Glendinning@smsc.com
Date: Fri, 5 Mar 2010 15:29:41 +0000

> The failure I'm seeing is caused when the usb device is disconnected.
> smsc95xx detects that a pending USB control operation failed
> and tries to print a message via netdev_printk to report this.
> 
> Unfortunately, something else (the USB subsystem?) has already set
> parent to null at this time so the netdev_printk causes a null 
> dereference.
> 
> So netdev_printk suddenly changes from safe to use to unsafe to use?

It seems to me that really you only need this parent NULL check where
you notice the USB control operation failed and want to print a
message about that.

That should cover all the necessary cases shouldn't it?

Even more importantly, why does a USB disconnect NULL out the netdev
parent device pointer?  Until you actually release this USB device in
the driver, the parent pointer should stay there.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Steve Glendinning - March 5, 2010, 4:32 p.m.
David Miller <davem@davemloft.net> wrote on 05/03/2010 15:43:35:

> It seems to me that really you only need this parent NULL check where
> you notice the USB control operation failed and want to print a
> message about that.
> 
> That should cover all the necessary cases shouldn't it?

That's the easily reproducible case - during initialisation there are
a lot of register reads and writes, so it's quite easy to trigger by
manually unplugging ~500ms after connection. In this case usbnet is
still in usbnet_probe.

Theoretically though, this device could be disconnected at any time,
and there are other places in the code where we print messages not
triggered by a usb failure (for example set_multicast).  Could it
be possibly unsafe to call netdev_printk here?

> Even more importantly, why does a USB disconnect NULL out the netdev
> parent device pointer?  Until you actually release this USB device in
> the driver, the parent pointer should stay there.

Most of the time it's not nulled out, and the code succesfully prints
errors as expected, but maybe 1 time in 20 dev.parent is NULL.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Miller - March 5, 2010, 4:42 p.m.
From: Steve.Glendinning@smsc.com
Date: Fri, 5 Mar 2010 16:32:03 +0000

> David Miller <davem@davemloft.net> wrote on 05/03/2010 15:43:35:
> 
>> Even more importantly, why does a USB disconnect NULL out the netdev
>> parent device pointer?  Until you actually release this USB device in
>> the driver, the parent pointer should stay there.
> 
> Most of the time it's not nulled out, and the code succesfully prints
> errors as expected, but maybe 1 time in 20 dev.parent is NULL.

I think until the device driver puts it's references and whatnot
of the device it's driving, that parent pointer should be kept
non-NULL.

As long as the netdevice exists and is registered, for example, people
can get at the parent device chain via SYSFS file accesses and
similar.

So it seems to me this is a huge problem waiting to happen anyways and
this netdev_printk() issue is merely making the problem more obvious
:-)
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Patch

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index c79a88b..250dbc0 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -2132,9 +2132,10 @@  static inline const char *netdev_name(const struct net_device *dev)
 }
 
 #define netdev_printk(level, netdev, format, args...)		\
-	dev_printk(level, (netdev)->dev.parent,			\
+	({ if ((netdev)->dev.parent)				\
+		dev_printk(level, (netdev)->dev.parent,		\
 		   "%s: " format,				\
-		   netdev_name(netdev), ##args)
+		   netdev_name(netdev), ##args); })
 
 #define netdev_emerg(dev, format, args...)			\
 	netdev_printk(KERN_EMERG, dev, format, ##args)