Message ID | m1zkdacbsx.fsf@fess.ebiederm.org |
---|---|
State | RFC, archived |
Delegated to: | David Miller |
Headers | show |
From: ebiederm@xmission.com (Eric W. Biederman) Date: Thu, 26 Jan 2012 14:57:02 -0800 > It is not a chicken and egg problem. It is a bug in caif. > caif is claiming to be a network device when it is acting as a subsytem. > That means it is being initialized too late. > > Untested but this should trivially fix the problem, and a bunch > of others of the same ilk. > > It is not safe to shutdown subsystems until all of the devices > are gone, otherwise there will be problems with packets in flight. If you guys can agree that this is the better fix, and we can get testing showing that it works, I'll revert Eric D.'s patch and apply the final version of this one. Thanks. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
David Miller <davem@davemloft.net> writes: > From: ebiederm@xmission.com (Eric W. Biederman) > Date: Thu, 26 Jan 2012 14:57:02 -0800 > >> It is not a chicken and egg problem. It is a bug in caif. >> caif is claiming to be a network device when it is acting as a subsytem. >> That means it is being initialized too late. >> >> Untested but this should trivially fix the problem, and a bunch >> of others of the same ilk. >> >> It is not safe to shutdown subsystems until all of the devices >> are gone, otherwise there will be problems with packets in flight. > > If you guys can agree that this is the better fix, and we can > get testing showing that it works, I'll revert Eric D.'s patch > and apply the final version of this one. Eric D.'s change is correct, but it is just an optimization of net_assign_generic. Eric D.'s change is not a bug fix. Tested fix in a moment. Eric -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Le jeudi 26 janvier 2012 à 14:57 -0800, Eric W. Biederman a écrit : > It is not a chicken and egg problem. It is a bug in caif. > caif is claiming to be a network device when it is acting as a subsytem. > That means it is being initialized too late. > Ah ok ! > Untested but this should trivially fix the problem, and a bunch > of others of the same ilk. > Hmm, please refrain from using "trivially" or "trivial", you're not fooling anyone. Truth is this netns layer is horribly complex, since this CAIF bug needed no more than four patch attempts and lastly your own work before finding the root cause. Thanks -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Eric Dumazet <eric.dumazet@gmail.com> writes: > Le jeudi 26 janvier 2012 à 14:57 -0800, Eric W. Biederman a écrit : > >> It is not a chicken and egg problem. It is a bug in caif. >> caif is claiming to be a network device when it is acting as a subsytem. >> That means it is being initialized too late. >> > > Ah ok ! > >> Untested but this should trivially fix the problem, and a bunch >> of others of the same ilk. >> > > Hmm, please refrain from using "trivially" or "trivial", you're not > fooling anyone. All I meant is that the change was trivial. > Truth is this netns layer is horribly complex, since this CAIF bug > needed no more than four patch attempts and lastly your own work before > finding the root cause. As for the complexity I don't know that it is noticeably worse than the initialization complexity of the network stack in general. I do think that it is non-obvious that serious initialization ordering problems can be caused by such a small difference. The non-locality and of cause and effect, combined with unfamiliarity of the code seems to be what hides problems like this. Once you know that initialization ordering problems tend to registering the wrong way. Aka as a device instead of a subsys the solution to problems like this tend to jump out at you. Now the common plumbing in net/core/net_namespace.c does count as complex. The fact we missed such an obvious optimization opportunity for so long seems to confirm that. I am open for ideas on how to simply things. Eric -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Le jeudi 26 janvier 2012 à 22:54 -0800, Eric W. Biederman a écrit : > Now the common plumbing in net/core/net_namespace.c does count as > complex. The fact we missed such an obvious optimization opportunity > for so long seems to confirm that. This was probably masked by the 13 pointers initial value. I find hard to believe some real setups need more than that ! -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/net/caif/caif_dev.c b/net/caif/caif_dev.c index 673728a..cf5bdd3 100644 --- a/net/caif/caif_dev.c +++ b/net/caif/caif_dev.c @@ -569,7 +569,7 @@ static int __init caif_device_init(void) { int result; - result = register_pernet_device(&caif_net_ops); + result = register_pernet_subsys(&caif_net_ops); if (result) return result; @@ -582,7 +582,7 @@ static int __init caif_device_init(void) static void __exit caif_device_exit(void) { - unregister_pernet_device(&caif_net_ops); + unregister_pernet_subsys(&caif_net_ops); unregister_netdevice_notifier(&caif_device_notifier); dev_remove_pack(&caif_packet_type); }