Message ID | 498ABDA2.5040603@dev.6wind.com |
---|---|
State | Rejected, archived |
Delegated to: | David Miller |
Headers | show |
From: Nicolas Dichtel <nicolas.dichtel@dev.6wind.com> Date: Thu, 05 Feb 2009 11:21:22 +0100 > In dev_change_net_namespace(), synchronize_net() is called at the > end of the function, but there is no reason (no deletion occurs). > > Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> It is necessary to make sure all cpus stop looking at the previous namespace the device was attached to, and only see the new mapping. That's why this function has two synchronize_net() calls. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Le 06.02.2009 08:45, David Miller a écrit : > From: Nicolas Dichtel <nicolas.dichtel@dev.6wind.com> > Date: Thu, 05 Feb 2009 11:21:22 +0100 > >> In dev_change_net_namespace(), synchronize_net() is called at the >> end of the function, but there is no reason (no deletion occurs). >> >> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> > > It is necessary to make sure all cpus stop looking at the > previous namespace the device was attached to, and only > see the new mapping. I didn't really understand why it is 'necessary'. If namespace is destroyed after this function, then cleanup_net() will ensure that nobody is looking at it There is only two callers, rtnetlink and default_device_exit(). Thank you for your answer, Nicolas > > That's why this function has two synchronize_net() calls. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
From: Nicolas Dichtel <nicolas.dichtel@dev.6wind.com> Date: Fri, 06 Feb 2009 14:50:53 +0100 > If namespace is destroyed after this function, then cleanup_net() > will ensure that nobody is looking at it Maybe, but you better get some opinions from the people who wrote and maintain the network namespace code before I can consider your change seriously. None of them responded to your patch posting, probably because you failed to CC: any of them. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Le 06.02.2009 23:10, David Miller a écrit : > From: Nicolas Dichtel <nicolas.dichtel@dev.6wind.com> > Date: Fri, 06 Feb 2009 14:50:53 +0100 > >> If namespace is destroyed after this function, then cleanup_net() >> will ensure that nobody is looking at it > > Maybe, but you better get some opinions from the people who wrote > and maintain the network namespace code before I can consider > your change seriously. > > None of them responded to your patch posting, probably because > you failed to CC: any of them. Sorry, I forget to cc them, now it's done. The thread can be found here: http://marc.info/?l=linux-netdev&m=123382930115535&w=2 So, I'm waiting for maintainers's opinions. Thank you, Nicolas -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Nicolas Dichtel wrote: > Le 06.02.2009 23:10, David Miller a écrit : >> From: Nicolas Dichtel <nicolas.dichtel@dev.6wind.com> >> Date: Fri, 06 Feb 2009 14:50:53 +0100 >> >>> If namespace is destroyed after this function, then cleanup_net() >>> will ensure that nobody is looking at it >> >> Maybe, but you better get some opinions from the people who wrote >> and maintain the network namespace code before I can consider >> your change seriously. >> >> None of them responded to your patch posting, probably because >> you failed to CC: any of them. > Sorry, I forget to cc them, now it's done. > The thread can be found here: > http://marc.info/?l=linux-netdev&m=123382930115535&w=2 > > So, I'm waiting for maintainers's opinions. We can move one network device from one namespace to another namespace, and that do not necessarily implies the network namespace will die and call cleanup_net. Without synchronize_net, it would be possible to have netif_receive_skb and dev_change_net_namespace to be executed concurrently, no ? Wouldn't the execution of one of this function be problematic if we are in the delivery of a packet to the upper protocol in the big rcu_read_lock section of netif_receive_skb ? dev_shutdown(dev); /* Notify protocols, that we are about to destroy this device. They should clean all the things. */ call_netdevice_notifiers(NETDEV_UNREGISTER, dev); /* * Flush the unicast and multicast chains */ dev_addr_discard(dev); netdev_unregister_kobject(dev); /* Actually switch the network namespace */ dev_net_set(dev, net); Thanks -- Daniel -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Le 10.02.2009 17:40, Daniel Lezcano a écrit : > Nicolas Dichtel wrote: >> Le 06.02.2009 23:10, David Miller a écrit : >>> From: Nicolas Dichtel <nicolas.dichtel@dev.6wind.com> >>> Date: Fri, 06 Feb 2009 14:50:53 +0100 >>> >>>> If namespace is destroyed after this function, then cleanup_net() >>>> will ensure that nobody is looking at it >>> >>> Maybe, but you better get some opinions from the people who wrote >>> and maintain the network namespace code before I can consider >>> your change seriously. >>> >>> None of them responded to your patch posting, probably because >>> you failed to CC: any of them. >> Sorry, I forget to cc them, now it's done. >> The thread can be found here: >> http://marc.info/?l=linux-netdev&m=123382930115535&w=2 >> >> So, I'm waiting for maintainers's opinions. > We can move one network device from one namespace to another namespace, > and that do not necessarily implies the network namespace will die and > call cleanup_net. > Without synchronize_net, it would be possible to have netif_receive_skb > and dev_change_net_namespace to be executed concurrently, no ? > Wouldn't the execution of one of this function be problematic if we are > in the delivery of a packet to the upper protocol in the big > rcu_read_lock section of netif_receive_skb ? Just to be sure: there is two synchronize_net() in dev_change_net_namespace(), and I was talking about the second one. The second one is called just before exiting the function. Regards, Nicolas > > dev_shutdown(dev); > > /* Notify protocols, that we are about to destroy > this device. They should clean all the things. > */ > call_netdevice_notifiers(NETDEV_UNREGISTER, dev); > > /* > * Flush the unicast and multicast chains > */ > dev_addr_discard(dev); > > netdev_unregister_kobject(dev); > > /* Actually switch the network namespace */ > dev_net_set(dev, net); > > > Thanks > -- Daniel > -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Nicolas Dichtel wrote: > Le 10.02.2009 17:40, Daniel Lezcano a écrit : >> Nicolas Dichtel wrote: >>> Le 06.02.2009 23:10, David Miller a écrit : >>>> From: Nicolas Dichtel <nicolas.dichtel@dev.6wind.com> >>>> Date: Fri, 06 Feb 2009 14:50:53 +0100 >>>> >>>>> If namespace is destroyed after this function, then cleanup_net() >>>>> will ensure that nobody is looking at it >>>> >>>> Maybe, but you better get some opinions from the people who wrote >>>> and maintain the network namespace code before I can consider >>>> your change seriously. >>>> >>>> None of them responded to your patch posting, probably because >>>> you failed to CC: any of them. >>> Sorry, I forget to cc them, now it's done. >>> The thread can be found here: >>> http://marc.info/?l=linux-netdev&m=123382930115535&w=2 >>> >>> So, I'm waiting for maintainers's opinions. >> We can move one network device from one namespace to another >> namespace, and that do not necessarily implies the network namespace >> will die and call cleanup_net. >> Without synchronize_net, it would be possible to have >> netif_receive_skb and dev_change_net_namespace to be executed >> concurrently, no ? >> Wouldn't the execution of one of this function be problematic if we >> are in the delivery of a packet to the upper protocol in the big >> rcu_read_lock section of netif_receive_skb ? > Just to be sure: there is two synchronize_net() in > dev_change_net_namespace(), and I was talking about the second one. > The second one is called just before exiting the function. > > > Regards, > Nicolas Ah, ok :) Hmm, at the first glance I would say it is useless but perhaps there is a trick here I do not understand. Eric, is there any particular reason to call synchronize_net before exiting the dev_change_net_namespace function ? -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Daniel Lezcano <daniel.lezcano@free.fr> writes: > > Hmm, at the first glance I would say it is useless but perhaps there is a trick > here I do not understand. > Eric, is there any particular reason to call synchronize_net before exiting the > dev_change_net_namespace function ? I haven't thought about that part of the code path in detail in a long time. dev_change_net_namespace() is a condensed version of register_netdevice() unregister_netdevice(). With the calls down into the driver removed. On a side note. It looks like we now cope with: call_netdevice_notifiers(NETDEV_REGISTER, dev); failing in register_netdev, but no one updated dev_change_net_namespace to handle the change, looks like a real pain to cope with. As for the synchronize_net, and in response to the original comment as best as I can tell we do have things being being deleted that are at least candidates for synchronize_net. dev_addr_discard(dev); dev_net_set(dev, net); netdev_unregister_kobject(dev); We very much do access dev->net with only rcu protection. Hmm. It looks like I originally took the second synchronize_net from what became rollback_registered, which happens just before we start freeing the netdevice. The nastiest case that I can envision is if we happen to receive a packet (on another cpu) for the network device that we are moving, just after it has registered in the new network namespace. If we read the old network namespace and forward it up the network stack in that context I can imagine it being a recipe for all kinds of strange non-deterministic behavior. So unless there is a reason for this change beyond general cleanup I would prefer not to think about it potential weirdness, and keep the code the way it is. I seem to remember a conversation about this synchronize_net when the code was merged as well so if we are going to change it, let's look up those arguments if we can and see if there was something useful said. Eric -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Eric W. Biederman wrote: > Daniel Lezcano <daniel.lezcano@free.fr> writes: > >> Hmm, at the first glance I would say it is useless but perhaps there is a trick >> here I do not understand. >> Eric, is there any particular reason to call synchronize_net before exiting the >> dev_change_net_namespace function ? >> > > I haven't thought about that part of the code path in detail in a long > time. dev_change_net_namespace() is a condensed version of > register_netdevice() unregister_netdevice(). With the calls down into > the driver removed. > > On a side note. It looks like we now cope with: > call_netdevice_notifiers(NETDEV_REGISTER, dev); failing in > register_netdev, but no one updated dev_change_net_namespace to handle > the change, looks like a real pain to cope with. > > As for the synchronize_net, and in response to the original > comment as best as I can tell we do have things being being > deleted that are at least candidates for synchronize_net. > > dev_addr_discard(dev); > dev_net_set(dev, net); > netdev_unregister_kobject(dev); > > We very much do access dev->net with only rcu protection. > > Hmm. > > It looks like I originally took the second synchronize_net from what > became rollback_registered, which happens just before we start freeing > the netdevice. > > The nastiest case that I can envision is if we happen to receive a > packet (on another cpu) for the network device that we are moving, > just after it has registered in the new network namespace. If we read > the old network namespace and forward it up the network stack in that > context I can imagine it being a recipe for all kinds of strange > non-deterministic behavior. > The code does: dev_close dev_deactive synchronize_rcu synchronize_net ... dev_shutdown ... synchronize_net The network device can no longer receive packets after dev_deactive, no ? The first synchronize_net will wait for the outstanding packets to be delivered to the upper layer and we change the nd_net field after. Your scenario makes sense for the first synchronize_net but I am not sure that can happen if we remove the second synchronize_net. > So unless there is a reason for this change beyond general cleanup I > would prefer not to think about it potential weirdness, and keep the > code the way it is. > > I seem to remember a conversation about this synchronize_net when the > code was merged as well so if we are going to change it, let's look > up those arguments if we can and see if there was something useful > said. > > Eric > > > -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Daniel Lezcano <daniel.lezcano@free.fr> writes: > Eric W. Biederman wrote: >> Daniel Lezcano <daniel.lezcano@free.fr> writes: >> >>> Hmm, at the first glance I would say it is useless but perhaps there is a > trick >>> here I do not understand. >>> Eric, is there any particular reason to call synchronize_net before exiting > the >>> dev_change_net_namespace function ? >>> >> >> I haven't thought about that part of the code path in detail in a long >> time. dev_change_net_namespace() is a condensed version of >> register_netdevice() unregister_netdevice(). With the calls down into >> the driver removed. >> >> On a side note. It looks like we now cope with: >> call_netdevice_notifiers(NETDEV_REGISTER, dev); failing in >> register_netdev, but no one updated dev_change_net_namespace to handle >> the change, looks like a real pain to cope with. >> >> As for the synchronize_net, and in response to the original >> comment as best as I can tell we do have things being being >> deleted that are at least candidates for synchronize_net. >> >> dev_addr_discard(dev); >> dev_net_set(dev, net); >> netdev_unregister_kobject(dev); >> >> We very much do access dev->net with only rcu protection. >> >> Hmm. >> >> It looks like I originally took the second synchronize_net from what >> became rollback_registered, which happens just before we start freeing >> the netdevice. >> >> The nastiest case that I can envision is if we happen to receive a >> packet (on another cpu) for the network device that we are moving, >> just after it has registered in the new network namespace. If we read >> the old network namespace and forward it up the network stack in that >> context I can imagine it being a recipe for all kinds of strange >> non-deterministic behavior. >> > > The code does: > > dev_close > dev_deactive > synchronize_rcu > synchronize_net > ... > dev_shutdown > ... > synchronize_net > > The network device can no longer receive packets after dev_deactive, no ? > The first synchronize_net will wait for the outstanding packets to be delivered > to the upper layer and we change the nd_net field after. > Your scenario makes sense for the first synchronize_net but I am not sure that > can happen if we remove the second synchronize_net. Good point. Visibility is key. What can find us after we call list_netdevice() ? Aren't there some pieces of code that do for_each_netdevice under the rcu lock? Eric -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Eric W. Biederman wrote: > Daniel Lezcano <daniel.lezcano@free.fr> writes: > > >> Eric W. Biederman wrote: >> >>> Daniel Lezcano <daniel.lezcano@free.fr> writes: >>> >>> >>>> Hmm, at the first glance I would say it is useless but perhaps there is a >>>> >> trick >> >>>> here I do not understand. >>>> Eric, is there any particular reason to call synchronize_net before exiting >>>> >> the >> >>>> dev_change_net_namespace function ? >>>> >>>> >>> I haven't thought about that part of the code path in detail in a long >>> time. dev_change_net_namespace() is a condensed version of >>> register_netdevice() unregister_netdevice(). With the calls down into >>> the driver removed. >>> >>> On a side note. It looks like we now cope with: >>> call_netdevice_notifiers(NETDEV_REGISTER, dev); failing in >>> register_netdev, but no one updated dev_change_net_namespace to handle >>> the change, looks like a real pain to cope with. >>> >>> As for the synchronize_net, and in response to the original >>> comment as best as I can tell we do have things being being >>> deleted that are at least candidates for synchronize_net. >>> >>> dev_addr_discard(dev); >>> dev_net_set(dev, net); >>> netdev_unregister_kobject(dev); >>> >>> We very much do access dev->net with only rcu protection. >>> >>> Hmm. >>> >>> It looks like I originally took the second synchronize_net from what >>> became rollback_registered, which happens just before we start freeing >>> the netdevice. >>> >>> The nastiest case that I can envision is if we happen to receive a >>> packet (on another cpu) for the network device that we are moving, >>> just after it has registered in the new network namespace. If we read >>> the old network namespace and forward it up the network stack in that >>> context I can imagine it being a recipe for all kinds of strange >>> non-deterministic behavior. >>> >>> >> The code does: >> >> dev_close >> dev_deactive >> synchronize_rcu >> synchronize_net >> ... >> dev_shutdown >> ... >> synchronize_net >> >> The network device can no longer receive packets after dev_deactive, no ? >> The first synchronize_net will wait for the outstanding packets to be delivered >> to the upper layer and we change the nd_net field after. >> Your scenario makes sense for the first synchronize_net but I am not sure that >> can happen if we remove the second synchronize_net. >> > > Good point. Visibility is key. What can find us after we > call list_netdevice() ? Aren't there some pieces of code that > do for_each_netdevice under the rcu lock? > AFAIR, no. for_each_netdev is protected by rtnl_lock. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Daniel Lezcano wrote: > Eric W. Biederman wrote: > >> Daniel Lezcano <daniel.lezcano@free.fr> writes: >> >> >> >>> Eric W. Biederman wrote: >>> >>> >>>> Daniel Lezcano <daniel.lezcano@free.fr> writes: >>>> >>>> >>>> >>>>> Hmm, at the first glance I would say it is useless but perhaps there is a >>>>> >>>>> >>> trick >>> >>> >>>>> here I do not understand. >>>>> Eric, is there any particular reason to call synchronize_net before exiting >>>>> >>>>> >>> the >>> >>> >>>>> dev_change_net_namespace function ? >>>>> >>>>> >>>>> >>>> I haven't thought about that part of the code path in detail in a long >>>> time. dev_change_net_namespace() is a condensed version of >>>> register_netdevice() unregister_netdevice(). With the calls down into >>>> the driver removed. >>>> >>>> On a side note. It looks like we now cope with: >>>> call_netdevice_notifiers(NETDEV_REGISTER, dev); failing in >>>> register_netdev, but no one updated dev_change_net_namespace to handle >>>> the change, looks like a real pain to cope with. >>>> >>>> As for the synchronize_net, and in response to the original >>>> comment as best as I can tell we do have things being being >>>> deleted that are at least candidates for synchronize_net. >>>> >>>> dev_addr_discard(dev); >>>> dev_net_set(dev, net); >>>> netdev_unregister_kobject(dev); >>>> >>>> We very much do access dev->net with only rcu protection. >>>> >>>> Hmm. >>>> >>>> It looks like I originally took the second synchronize_net from what >>>> became rollback_registered, which happens just before we start freeing >>>> the netdevice. >>>> >>>> The nastiest case that I can envision is if we happen to receive a >>>> packet (on another cpu) for the network device that we are moving, >>>> just after it has registered in the new network namespace. If we read >>>> the old network namespace and forward it up the network stack in that >>>> context I can imagine it being a recipe for all kinds of strange >>>> non-deterministic behavior. >>>> >>>> >>>> >>> The code does: >>> >>> dev_close >>> dev_deactive >>> synchronize_rcu >>> synchronize_net >>> ... >>> dev_shutdown >>> ... >>> synchronize_net >>> >>> The network device can no longer receive packets after dev_deactive, no ? >>> The first synchronize_net will wait for the outstanding packets to be delivered >>> to the upper layer and we change the nd_net field after. >>> Your scenario makes sense for the first synchronize_net but I am not sure that >>> can happen if we remove the second synchronize_net. >>> >>> >> Good point. Visibility is key. What can find us after we >> call list_netdevice() ? Aren't there some pieces of code that >> do for_each_netdevice under the rcu lock? >> >> > AFAIR, no. for_each_netdev is protected by rtnl_lock. > Nicolas, At the first glance it looks like the removing of the second synchronize_net is fine, but before posting the patch do you mind to wait a little ? I would like to do some tests with your patch to check if we don't missed something. Thanks -- Daniel -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Daniel Lezcano wrote: > Daniel Lezcano wrote: >> Eric W. Biederman wrote: >> >>> Daniel Lezcano <daniel.lezcano@free.fr> writes: >>> >>> >>>> Eric W. Biederman wrote: >>>> >>>>> Daniel Lezcano <daniel.lezcano@free.fr> writes: >>>>> >>>>>> Hmm, at the first glance I would say it is useless but perhaps >>>>>> there is a >>>>>> >>>> trick >>>> >>>>>> here I do not understand. >>>>>> Eric, is there any particular reason to call synchronize_net >>>>>> before exiting >>>>>> >>>> the >>>> >>>>>> dev_change_net_namespace function ? >>>>>> >>>>> I haven't thought about that part of the code path in detail in a long >>>>> time. dev_change_net_namespace() is a condensed version of >>>>> register_netdevice() unregister_netdevice(). With the calls down into >>>>> the driver removed. >>>>> >>>>> On a side note. It looks like we now cope with: >>>>> call_netdevice_notifiers(NETDEV_REGISTER, dev); failing in >>>>> register_netdev, but no one updated dev_change_net_namespace to handle >>>>> the change, looks like a real pain to cope with. >>>>> >>>>> As for the synchronize_net, and in response to the original >>>>> comment as best as I can tell we do have things being being >>>>> deleted that are at least candidates for synchronize_net. >>>>> >>>>> dev_addr_discard(dev); >>>>> dev_net_set(dev, net); >>>>> netdev_unregister_kobject(dev); >>>>> >>>>> We very much do access dev->net with only rcu protection. >>>>> >>>>> Hmm. >>>>> >>>>> It looks like I originally took the second synchronize_net from what >>>>> became rollback_registered, which happens just before we start freeing >>>>> the netdevice. >>>>> >>>>> The nastiest case that I can envision is if we happen to receive a >>>>> packet (on another cpu) for the network device that we are moving, >>>>> just after it has registered in the new network namespace. If we read >>>>> the old network namespace and forward it up the network stack in that >>>>> context I can imagine it being a recipe for all kinds of strange >>>>> non-deterministic behavior. >>>>> >>>> The code does: >>>> >>>> dev_close >>>> dev_deactive >>>> synchronize_rcu >>>> synchronize_net >>>> ... >>>> dev_shutdown >>>> ... >>>> synchronize_net >>>> >>>> The network device can no longer receive packets after dev_deactive, >>>> no ? >>>> The first synchronize_net will wait for the outstanding packets to >>>> be delivered >>>> to the upper layer and we change the nd_net field after. >>>> Your scenario makes sense for the first synchronize_net but I am not >>>> sure that >>>> can happen if we remove the second synchronize_net. >>>> >>> Good point. Visibility is key. What can find us after we >>> call list_netdevice() ? Aren't there some pieces of code that >>> do for_each_netdevice under the rcu lock? >>> >> AFAIR, no. for_each_netdev is protected by rtnl_lock. >> > > Nicolas, > > At the first glance it looks like the removing of the second > synchronize_net is fine, but before posting the patch do you mind to > wait a little ? > I would like to do some tests with your patch to check if we don't > missed something. > Hi Daniel, no problem, there is no hurry. Let me know the result of your tests. Thanks, Nicolas -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
--- linux-2.6.28.2/net/core/dev.c 2009-01-24 19:42:07.000000000 -0500 +++ linux-2.6.28.2-new/net/core/dev.c 2009-02-05 04:16:35.000000000 -0500 @@ -4546,7 +4546,6 @@ int dev_change_net_namespace(struct net_ /* Notify protocols, that a new device appeared. */ call_netdevice_notifiers(NETDEV_REGISTER, dev); - synchronize_net(); err = 0; out: return err;
In dev_change_net_namespace(), synchronize_net() is called at the end of the function, but there is no reason (no deletion occurs). Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>