diff mbox

Driver-core: Always create class directories fixing the broken network drivers.

Message ID m139wiuswa.fsf_-_@fess.ebiederm.org
State Not Applicable, archived
Delegated to: David Miller
Headers show

Commit Message

Eric W. Biederman June 20, 2010, 6:20 a.m. UTC
In get_device_parent there is a check to not add a class directory
when a class device was put under another class device.  The check was
put in place as a just in case measure to not break old userspace if
any existing code happened to depend on it.  Currently the only known
way that we get a class device under a class device is due to the
rearrangement of devices that happened when the new sysfs layout was
introduced.

With the introduction of tagged sysfs directories for properly
handling network namespace support this omission in creating the class
directories went from a bad thing in terms of namespace pollution, to
actually breaking device_remove.

Currently there are two reported network device drivers that break
because the class directory was not created by the device layer.  The
usb bnep driver and the mac80211_hwsim driver.

Every solution proposed changes the sysfs layout for the affected
devices, and thus has the potential to break userspace.

Since we are changing the sysfs layout anyway, and since we are now
talking about several devices all with the same problem, all caused by
the same over conservative bit of code.  Let's kill that bit of code.

There have been other proposals to fix this but they all have been
more complicated, and none of them have actually resulted in working
code.

Any userspace that works with both the old and the new sysfs layouts
should not be affected by this change, and even if someone depends
on it we are talking a very small number of drivers overall that
are affected.

My apologoies for not fully catching this hole in the logic the
when this code was originally added.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
---
 drivers/base/core.c |    2 --
 1 files changed, 0 insertions(+), 2 deletions(-)

Comments

Kay Sievers June 20, 2010, 10:52 a.m. UTC | #1
On Sun, Jun 20, 2010 at 08:20, Eric W. Biederman <ebiederm@xmission.com> wrote:
>
> In get_device_parent there is a check to not add a class directory
> when a class device was put under another class device.  The check was
> put in place as a just in case measure to not break old userspace if
> any existing code happened to depend on it.  Currently the only known
> way that we get a class device under a class device is due to the
> rearrangement of devices that happened when the new sysfs layout was
> introduced.
>
> With the introduction of tagged sysfs directories for properly
> handling network namespace support this omission in creating the class
> directories went from a bad thing in terms of namespace pollution, to
> actually breaking device_remove.
>
> Currently there are two reported network device drivers that break
> because the class directory was not created by the device layer.  The
> usb bnep driver and the mac80211_hwsim driver.
>
> Every solution proposed changes the sysfs layout for the affected
> devices, and thus has the potential to break userspace.
>
> Since we are changing the sysfs layout anyway, and since we are now
> talking about several devices all with the same problem, all caused by
> the same over conservative bit of code.  Let's kill that bit of code.
>
> There have been other proposals to fix this but they all have been
> more complicated, and none of them have actually resulted in working
> code.
>
> Any userspace that works with both the old and the new sysfs layouts
> should not be affected by this change, and even if someone depends
> on it we are talking a very small number of drivers overall that
> are affected.
>
> My apologoies for not fully catching this hole in the logic the
> when this code was originally added.

We can not do this. Simply comparing the sysfs tree before and after
shows that it breaks 'input'. inputX and mouseX are now spearated by a
subdirectory, which is wrong.

As mentioned earlier, It's pretty fragile to change things in this
area, and I prefer the broken network driver-core interactions to be
fixed instead - even when they are more complicated.

Thanks,
Kay
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Johannes Berg June 20, 2010, 11:33 a.m. UTC | #2
On Sun, 2010-06-20 at 12:52 +0200, Kay Sievers wrote:

> As mentioned earlier, It's pretty fragile to change things in this
> area, and I prefer the broken network driver-core interactions to be
> fixed instead - even when they are more complicated.

Can you _please_ offer a proper way to fix it then?

johannes

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Kay Sievers June 20, 2010, 11:46 a.m. UTC | #3
On Sun, Jun 20, 2010 at 13:33, Johannes Berg <johannes@sipsolutions.net> wrote:
> On Sun, 2010-06-20 at 12:52 +0200, Kay Sievers wrote:
>
>> As mentioned earlier, It's pretty fragile to change things in this
>> area, and I prefer the broken network driver-core interactions to be
>> fixed instead - even when they are more complicated.
>
> Can you _please_ offer a proper way to fix it then?

Sorry, I have no real experience with the issues created by the
assumption that network driver need to be able to get unloaded while
in use. That's very special, always requires a
compiled-into-the-kernel part of the subsystem, and makes it hard to
work with, as we can not use any of the usual core infrastructure to
solve that.

The only real simple thing that works is splitting the module in two
modules, which isn't really something I would propose.

Maybe the wait-for in the module-exit like your recent mail suggests
works, but I did not try that. Otherwise we can solve this by changing
the net driver and by adding some needed stuff to the core to allow
in-core bus device cleanup.

The class device hierarchy should be removed for proper network
namespace support, it's nothing we properly support with the current
core code. We better don't fiddle around with stuff nobody really
knows what it breaks. Just like I ran into the 'input' stuff now,
which was a really simple case to find.

Kay
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Eric W. Biederman June 20, 2010, 12:29 p.m. UTC | #4
Kay Sievers <kay.sievers@vrfy.org> writes:

> On Sun, Jun 20, 2010 at 13:33, Johannes Berg <johannes@sipsolutions.net> wrote:
>> On Sun, 2010-06-20 at 12:52 +0200, Kay Sievers wrote:
>>
>>> As mentioned earlier, It's pretty fragile to change things in this
>>> area, and I prefer the broken network driver-core interactions to be
>>> fixed instead - even when they are more complicated.
>>
>> Can you _please_ offer a proper way to fix it then?
>
> Sorry, I have no real experience with the issues created by the
> assumption that network driver need to be able to get unloaded while
> in use. That's very special, always requires a
> compiled-into-the-kernel part of the subsystem, and makes it hard to
> work with, as we can not use any of the usual core infrastructure to
> solve that.

So please look at https://bugzilla.kernel.org/show_bug.cgi?id=16215

That simply creates and destroys the network device as things come
and go.

I think the bnep case is much more serious because it is real hardware
not a testing simulation, and it is the second instance of this.

Calling the change broken when I can boot up and run X in that
configuration just fine is a vast overstatement.  Especially
when you don't acknowledge that the device layer is broken.

I will agree that insane amounts of backwards compatibility are a good
idea.  So I will cook up a version of my patch that adds a hack to the
device layer to only apply this change to devices of class net.

That should save let us postpone the architectural dreams for another
day.

Eric
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Kay Sievers June 20, 2010, 1:37 p.m. UTC | #5
On Sun, Jun 20, 2010 at 14:29, Eric W. Biederman <ebiederm@xmission.com> wrote:
> Kay Sievers <kay.sievers@vrfy.org> writes:
>> On Sun, Jun 20, 2010 at 13:33, Johannes Berg <johannes@sipsolutions.net> wrote:
>>> On Sun, 2010-06-20 at 12:52 +0200, Kay Sievers wrote:
>>>
>>>> As mentioned earlier, It's pretty fragile to change things in this
>>>> area, and I prefer the broken network driver-core interactions to be
>>>> fixed instead - even when they are more complicated.
>>>
>>> Can you _please_ offer a proper way to fix it then?
>>
>> Sorry, I have no real experience with the issues created by the
>> assumption that network driver need to be able to get unloaded while
>> in use. That's very special, always requires a
>> compiled-into-the-kernel part of the subsystem, and makes it hard to
>> work with, as we can not use any of the usual core infrastructure to
>> solve that.
>
> So please look at https://bugzilla.kernel.org/show_bug.cgi?id=16215
>
> That simply creates and destroys the network device as things come
> and go.

I'm still not sure, any help here would be appreciated.

> I think the bnep case is much more serious because it is real hardware
> not a testing simulation, and it is the second instance of this.
>
> Calling the change broken when I can boot up and run X in that
> configuration just fine is a vast overstatement.

Oh, I seriously would love this rule - it would make my work so much
easier. But I need to make it totally clear: "Adding intermediate
directories into 'input' sysfs it absolutely broken, regardless if
your box comes up or not. :)

X is using udev, and udev aggressively hides these details and forbids
matching such details, but many other tools which read sysfs directly,
including ones using the conceptually broken 'device' symlink will for
sure break with such changes.

> Especially
> when you don't acknowledge that the device layer is broken.

Stacking devices from different classes is broken, and not a direct
problem of the core. It is just not supported. The core might just
need to refuse that in the first place, but that's a different issue.

> I will agree that insane amounts of backwards compatibility are a good
> idea.  So I will cook up a version of my patch that adds a hack to the
> device layer to only apply this change to devices of class net.
>
> That should save let us postpone the architectural dreams for another
> day.

It's not a dream, it needs to be fixed where it is used. We can not
allow to stack classes.

Kay
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/drivers/base/core.c b/drivers/base/core.c
index 9630fbd..7b1c4d4 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -673,8 +673,6 @@  static struct kobject *get_device_parent(struct device *dev,
 		 */
 		if (parent == NULL)
 			parent_kobj = virtual_device_parent(dev);
-		else if (parent->class)
-			return &parent->kobj;
 		else
 			parent_kobj = &parent->kobj;