diff mbox

netns: fix net_alloc_generic()

Message ID m1zkdacbsx.fsf@fess.ebiederm.org
State RFC, archived
Delegated to: David Miller
Headers show

Commit Message

Eric W. Biederman Jan. 26, 2012, 10:57 p.m. UTC
Eric Dumazet <eric.dumazet@gmail.com> writes:

> Le jeudi 26 janvier 2012 à 14:44 +0400, Pavel Emelyanov a écrit :
>> > I believe the problem is in net_namespace infrastructure, not in CAIF.
>> > 
>> > Could you test following patch instead ?
>> > 
>> > [PATCH] netns: fix net_alloc_generic()
>> > 
>> > When a new net namespace is created, we should attach to it a "struct
>> > net_generic" with enough slots (even empty), or we can hit the following
>> > BUG_ON() :
>> > 
>> > [  200.752016] kernel BUG at include/net/netns/generic.h:40!
>> > ...
>> > [  200.752016]  [<ffffffff825c3cea>] ? get_cfcnfg+0x3a/0x180
>> > [  200.752016]  [<ffffffff821cf0b0>] ? lockdep_rtnl_is_held+0x10/0x20
>> > [  200.752016]  [<ffffffff825c41be>] caif_device_notify+0x2e/0x530
>> > [  200.752016]  [<ffffffff810d61b7>] notifier_call_chain+0x67/0x110
>> > [  200.752016]  [<ffffffff810d67c1>] raw_notifier_call_chain+0x11/0x20
>> > [  200.752016]  [<ffffffff821bae82>] call_netdevice_notifiers+0x32/0x60
>> > [  200.752016]  [<ffffffff821c2b26>] register_netdevice+0x196/0x300
>> > [  200.752016]  [<ffffffff821c2ca9>] register_netdev+0x19/0x30
>> > [  200.752016]  [<ffffffff81c1c67a>] loopback_net_init+0x4a/0xa0
>> > [  200.752016]  [<ffffffff821b5e62>] ops_init+0x42/0x180
>> > [  200.752016]  [<ffffffff821b600b>] setup_net+0x6b/0x100
>> > [  200.752016]  [<ffffffff821b6466>] copy_net_ns+0x86/0x110
>> > [  200.752016]  [<ffffffff810d5789>] create_new_namespaces+0xd9/0x190
>> > 
>> > net_alloc_generic() should take into account the maximum index into the
>> > ptr array, as a subsystem might use net_generic() anytime.
>> 
>> I'm not sure I understand it correctly, but subsystem can only use the
>> net_generic() only (!) after the net_assign_generic() is performed.
>
> Yes, but here, loopback_net_init() calls register_netdev()
>
> So every subsystems _notify are called, even if subsystem _init_net()
> was not yet called.
>
> Its a chicken and egg problem.

It is not a chicken and egg problem.  It is a bug in caif.
caif is claiming to be a network device when it is acting as a subsytem.
That means it is being initialized too late.

Untested but this should trivially fix the problem, and a bunch
of others of the same ilk.

It is not safe to shutdown subsystems until all of the devices
are gone, otherwise there will be problems with packets in flight.



Eric
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

David Miller Jan. 26, 2012, 11:07 p.m. UTC | #1
From: ebiederm@xmission.com (Eric W. Biederman)
Date: Thu, 26 Jan 2012 14:57:02 -0800

> It is not a chicken and egg problem.  It is a bug in caif.
> caif is claiming to be a network device when it is acting as a subsytem.
> That means it is being initialized too late.
> 
> Untested but this should trivially fix the problem, and a bunch
> of others of the same ilk.
> 
> It is not safe to shutdown subsystems until all of the devices
> are gone, otherwise there will be problems with packets in flight.

If you guys can agree that this is the better fix, and we can
get testing showing that it works, I'll revert Eric D.'s patch
and apply the final version of this one.

Thanks.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Eric W. Biederman Jan. 26, 2012, 11:57 p.m. UTC | #2
David Miller <davem@davemloft.net> writes:

> From: ebiederm@xmission.com (Eric W. Biederman)
> Date: Thu, 26 Jan 2012 14:57:02 -0800
>
>> It is not a chicken and egg problem.  It is a bug in caif.
>> caif is claiming to be a network device when it is acting as a subsytem.
>> That means it is being initialized too late.
>> 
>> Untested but this should trivially fix the problem, and a bunch
>> of others of the same ilk.
>> 
>> It is not safe to shutdown subsystems until all of the devices
>> are gone, otherwise there will be problems with packets in flight.
>
> If you guys can agree that this is the better fix, and we can
> get testing showing that it works, I'll revert Eric D.'s patch
> and apply the final version of this one.

Eric D.'s change is correct, but it is just an optimization of
net_assign_generic.  Eric D.'s change is not a bug fix.

Tested fix in a moment.

Eric
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Eric Dumazet Jan. 27, 2012, 6:09 a.m. UTC | #3
Le jeudi 26 janvier 2012 à 14:57 -0800, Eric W. Biederman a écrit :

> It is not a chicken and egg problem.  It is a bug in caif.
> caif is claiming to be a network device when it is acting as a subsytem.
> That means it is being initialized too late.
> 

Ah ok !

> Untested but this should trivially fix the problem, and a bunch
> of others of the same ilk.
> 

Hmm, please refrain from using "trivially" or "trivial", you're not
fooling anyone.

Truth is this netns layer is horribly complex, since this CAIF bug
needed no more than four patch attempts and lastly your own work before
finding the root cause.

Thanks


--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Eric W. Biederman Jan. 27, 2012, 6:54 a.m. UTC | #4
Eric Dumazet <eric.dumazet@gmail.com> writes:

> Le jeudi 26 janvier 2012 à 14:57 -0800, Eric W. Biederman a écrit :
>
>> It is not a chicken and egg problem.  It is a bug in caif.
>> caif is claiming to be a network device when it is acting as a subsytem.
>> That means it is being initialized too late.
>> 
>
> Ah ok !
>
>> Untested but this should trivially fix the problem, and a bunch
>> of others of the same ilk.
>> 
>
> Hmm, please refrain from using "trivially" or "trivial", you're not
> fooling anyone.

All I meant is that the change was trivial.

> Truth is this netns layer is horribly complex, since this CAIF bug
> needed no more than four patch attempts and lastly your own work before
> finding the root cause.

As for the complexity I don't know that it is noticeably worse than the
initialization complexity of the network stack in general.

I do think that it is non-obvious that serious initialization ordering
problems can be caused by such a small difference.  The non-locality
and of cause and effect, combined with unfamiliarity of the code seems
to be what hides problems like this.

Once you know that initialization ordering problems tend to registering
the wrong way.  Aka as a device instead of a subsys the solution to
problems like this tend to jump out at you.

Now the common plumbing in net/core/net_namespace.c does count as
complex.  The fact we missed such an obvious optimization opportunity
for so long seems to confirm that.

I am open for ideas on how to simply things.  

Eric
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Eric Dumazet Jan. 27, 2012, 7:07 a.m. UTC | #5
Le jeudi 26 janvier 2012 à 22:54 -0800, Eric W. Biederman a écrit :

> Now the common plumbing in net/core/net_namespace.c does count as
> complex.  The fact we missed such an obvious optimization opportunity
> for so long seems to confirm that.

This was probably masked by the 13 pointers initial value.

I find hard to believe some real setups need more than that !



--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/net/caif/caif_dev.c b/net/caif/caif_dev.c
index 673728a..cf5bdd3 100644
--- a/net/caif/caif_dev.c
+++ b/net/caif/caif_dev.c
@@ -569,7 +569,7 @@  static int __init caif_device_init(void)
 {
 	int result;
 
-	result = register_pernet_device(&caif_net_ops);
+	result = register_pernet_subsys(&caif_net_ops);
 
 	if (result)
 		return result;
@@ -582,7 +582,7 @@  static int __init caif_device_init(void)
 
 static void __exit caif_device_exit(void)
 {
-	unregister_pernet_device(&caif_net_ops);
+	unregister_pernet_subsys(&caif_net_ops);
 	unregister_netdevice_notifier(&caif_device_notifier);
 	dev_remove_pack(&caif_packet_type);
 }