netns: remove useless synchronize_net()

Message ID 498ABDA2.5040603@dev.6wind.com
State Rejected, archived
Delegated to: David Miller

Commit Message

Nicolas Dichtel Feb. 5, 2009, 10:21 a.m. UTC
In dev_change_net_namespace(), synchronize_net() is called at the end of the
function, but there is no reason for it (no deletion occurs).

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>

Comments

David Miller Feb. 6, 2009, 7:45 a.m. UTC | #1
From: Nicolas Dichtel <nicolas.dichtel@dev.6wind.com>
Date: Thu, 05 Feb 2009 11:21:22 +0100

> In dev_change_net_namespace(), synchronize_net() is called at the
> end of the function, but there is no reason (no deletion occurs).
>
> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>

It is necessary to make sure all cpus stop looking at the
previous namespace the device was attached to, and only
see the new mapping.

That's why this function has two synchronize_net() calls.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Nicolas Dichtel Feb. 6, 2009, 1:50 p.m. UTC | #2
On 06.02.2009 08:45, David Miller wrote:
> From: Nicolas Dichtel <nicolas.dichtel@dev.6wind.com>
> Date: Thu, 05 Feb 2009 11:21:22 +0100
> 
>> In dev_change_net_namespace(), synchronize_net() is called at the
>> end of the function, but there is no reason (no deletion occurs).
>>
>> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
> 
> It is necessary to make sure all cpus stop looking at the
> previous namespace the device was attached to, and only
> see the new mapping.
I don't really understand why it is 'necessary'.
If the namespace is destroyed after this function returns, then cleanup_net()
will ensure that nobody is still looking at it.
There are only two callers: rtnetlink and default_device_exit().


Thank you for your answer,
Nicolas

> 
> That's why this function has two synchronize_net() calls.
--
David Miller Feb. 6, 2009, 10:10 p.m. UTC | #3
From: Nicolas Dichtel <nicolas.dichtel@dev.6wind.com>
Date: Fri, 06 Feb 2009 14:50:53 +0100

> If namespace is destroyed after this function, then cleanup_net()
> will ensure that nobody is looking at it

Maybe, but you better get some opinions from the people who wrote
and maintain the network namespace code before I can consider
your change seriously.

None of them responded to your patch posting, probably because
you failed to CC: any of them.
--
Nicolas Dichtel Feb. 10, 2009, 3:40 p.m. UTC | #4
On 06.02.2009 23:10, David Miller wrote:
> From: Nicolas Dichtel <nicolas.dichtel@dev.6wind.com>
> Date: Fri, 06 Feb 2009 14:50:53 +0100
> 
>> If namespace is destroyed after this function, then cleanup_net()
>> will ensure that nobody is looking at it
> 
> Maybe, but you better get some opinions from the people who wrote
> and maintain the network namespace code before I can consider
> your change seriously.
> 
> None of them responded to your patch posting, probably because
> you failed to CC: any of them.
Sorry, I forgot to CC them; it's done now.
The thread can be found here: http://marc.info/?l=linux-netdev&m=123382930115535&w=2

So I'm waiting for the maintainers' opinions.

Thank you,
Nicolas
--
Daniel Lezcano Feb. 10, 2009, 4:40 p.m. UTC | #5
Nicolas Dichtel wrote:
> On 06.02.2009 23:10, David Miller wrote:
>> From: Nicolas Dichtel <nicolas.dichtel@dev.6wind.com>
>> Date: Fri, 06 Feb 2009 14:50:53 +0100
>>
>>> If namespace is destroyed after this function, then cleanup_net()
>>> will ensure that nobody is looking at it
>>
>> Maybe, but you better get some opinions from the people who wrote
>> and maintain the network namespace code before I can consider
>> your change seriously.
>>
>> None of them responded to your patch posting, probably because
>> you failed to CC: any of them.
> Sorry, I forget to cc them, now it's done.
> The thread can be found here: 
> http://marc.info/?l=linux-netdev&m=123382930115535&w=2
>
> So, I'm waiting for maintainers's opinions.
We can move a network device from one namespace to another, and that
does not necessarily imply that the old network namespace will die and
call cleanup_net.
Without synchronize_net, wouldn't it be possible for netif_receive_skb
and dev_change_net_namespace to be executed concurrently?
Wouldn't the execution of one of these functions be problematic if we are
in the middle of delivering a packet to the upper protocol inside the big
rcu_read_lock section of netif_receive_skb?

    dev_shutdown(dev);

    /* Notify protocols, that we are about to destroy
       this device. They should clean all the things.
    */
    call_netdevice_notifiers(NETDEV_UNREGISTER, dev);

    /*
     *    Flush the unicast and multicast chains
     */
    dev_addr_discard(dev);

    netdev_unregister_kobject(dev);

    /* Actually switch the network namespace */
    dev_net_set(dev, net);


Thanks
  -- Daniel


--
Nicolas Dichtel Feb. 10, 2009, 4:48 p.m. UTC | #6
On 10.02.2009 17:40, Daniel Lezcano wrote:
> Nicolas Dichtel wrote:
>> On 06.02.2009 23:10, David Miller wrote:
>>> From: Nicolas Dichtel <nicolas.dichtel@dev.6wind.com>
>>> Date: Fri, 06 Feb 2009 14:50:53 +0100
>>>
>>>> If namespace is destroyed after this function, then cleanup_net()
>>>> will ensure that nobody is looking at it
>>>
>>> Maybe, but you better get some opinions from the people who wrote
>>> and maintain the network namespace code before I can consider
>>> your change seriously.
>>>
>>> None of them responded to your patch posting, probably because
>>> you failed to CC: any of them.
>> Sorry, I forget to cc them, now it's done.
>> The thread can be found here: 
>> http://marc.info/?l=linux-netdev&m=123382930115535&w=2
>>
>> So, I'm waiting for maintainers's opinions.
> We can move one network device from one namespace to another namespace, 
> and that do not necessarily implies the network namespace will die and 
> call cleanup_net.
> Without synchronize_net, it would be possible to have netif_receive_skb 
> and dev_change_net_namespace to be executed concurrently, no ?
> Wouldn't the execution of one of this function be problematic if we are 
> in the delivery of a packet to the upper protocol in the big 
> rcu_read_lock section of netif_receive_skb ?
Just to be sure: there are two synchronize_net() calls in
dev_change_net_namespace(), and I was talking about the second one, which is
called just before exiting the function.


Regards,
Nicolas

> 
>    dev_shutdown(dev);
> 
>    /* Notify protocols, that we are about to destroy
>       this device. They should clean all the things.
>    */
>    call_netdevice_notifiers(NETDEV_UNREGISTER, dev);
> 
>    /*
>     *    Flush the unicast and multicast chains
>     */
>    dev_addr_discard(dev);
> 
>    netdev_unregister_kobject(dev);
> 
>    /* Actually switch the network namespace */
>    dev_net_set(dev, net);
> 
> 
> Thanks
>  -- Daniel
> 
--
Daniel Lezcano Feb. 10, 2009, 5:13 p.m. UTC | #7
Nicolas Dichtel wrote:
> On 10.02.2009 17:40, Daniel Lezcano wrote:
>> Nicolas Dichtel wrote:
>>> On 06.02.2009 23:10, David Miller wrote:
>>>> From: Nicolas Dichtel <nicolas.dichtel@dev.6wind.com>
>>>> Date: Fri, 06 Feb 2009 14:50:53 +0100
>>>>
>>>>> If namespace is destroyed after this function, then cleanup_net()
>>>>> will ensure that nobody is looking at it
>>>>
>>>> Maybe, but you better get some opinions from the people who wrote
>>>> and maintain the network namespace code before I can consider
>>>> your change seriously.
>>>>
>>>> None of them responded to your patch posting, probably because
>>>> you failed to CC: any of them.
>>> Sorry, I forget to cc them, now it's done.
>>> The thread can be found here: 
>>> http://marc.info/?l=linux-netdev&m=123382930115535&w=2
>>>
>>> So, I'm waiting for maintainers's opinions.
>> We can move one network device from one namespace to another 
>> namespace, and that do not necessarily implies the network namespace 
>> will die and call cleanup_net.
>> Without synchronize_net, it would be possible to have 
>> netif_receive_skb and dev_change_net_namespace to be executed 
>> concurrently, no ?
>> Wouldn't the execution of one of this function be problematic if we 
>> are in the delivery of a packet to the upper protocol in the big 
>> rcu_read_lock section of netif_receive_skb ?
> Just to be sure: there is two synchronize_net() in 
> dev_change_net_namespace(), and I was talking about the second one. 
> The second one is called just before exiting the function.
>
>
> Regards,
> Nicolas

Ah, ok :)

Hmm, at first glance I would say it is useless, but perhaps there is
a trick here that I do not understand.
Eric, is there any particular reason to call synchronize_net before
exiting the dev_change_net_namespace function?


--
Eric W. Biederman Feb. 11, 2009, 7:51 a.m. UTC | #8
Daniel Lezcano <daniel.lezcano@free.fr> writes:
>
> Hmm, at the first glance I would say it is useless but perhaps there is a trick
> here I do not understand.
> Eric, is there any particular reason to call synchronize_net before exiting the
> dev_change_net_namespace function ?

I haven't thought about that part of the code path in detail in a long
time.  dev_change_net_namespace() is a condensed version of
register_netdevice() and unregister_netdevice(), with the calls down
into the driver removed.

On a side note: it looks like we now cope with
call_netdevice_notifiers(NETDEV_REGISTER, dev) failing in
register_netdev, but no one updated dev_change_net_namespace to handle
the change.  That looks like a real pain to cope with.

As for the synchronize_net, and in response to the original
comment: as best as I can tell, we do have things being
deleted that are at least candidates for synchronize_net.

dev_addr_discard(dev);
dev_net_set(dev, net);
netdev_unregister_kobject(dev);

We very much do access dev->net with only rcu protection.

Hmm.

It looks like I originally took the second synchronize_net from what
became rollback_registered, which happens just before we start freeing
the netdevice.

The nastiest case that I can envision is if we happen to receive a
packet (on another cpu) for the network device that we are moving,
just after it has registered in the new network namespace.  If we read
the old network namespace and forward it up the network stack in that
context I can imagine it being a recipe for all kinds of strange
non-deterministic behavior.
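
A minimal userspace sketch of that scenario, assuming C11 atomics as a stand-in for the kernel's RCU pointer accessors. Every toy_* name is hypothetical and none of this is kernel code; it only shows that a reader which sampled the namespace pointer before the switch keeps using the old namespace afterwards:

```c
#include <assert.h>
#include <stdatomic.h>

/* Toy userspace model, not kernel code: every toy_* name is hypothetical. */
struct toy_ns {
	int id;
};

struct toy_netdev {
	_Atomic(struct toy_ns *) ns; /* stands in for the namespace pointer */
};

/* Reader side: sample the current namespace pointer. */
static struct toy_ns *toy_read_ns(struct toy_netdev *dev)
{
	return atomic_load_explicit(&dev->ns, memory_order_acquire);
}

/* Updater side: publish the new namespace pointer. */
static void toy_set_ns(struct toy_netdev *dev, struct toy_ns *ns)
{
	atomic_store_explicit(&dev->ns, ns, memory_order_release);
}

/*
 * Returns 1 when a receive path that sampled the pointer before the switch
 * still holds the old namespace while new readers already see the new one.
 */
static int toy_stale_reader_demo(void)
{
	struct toy_ns old_ns = { 1 }, new_ns = { 2 };
	struct toy_netdev dev;
	struct toy_ns *seen;

	atomic_init(&dev.ns, &old_ns);
	seen = toy_read_ns(&dev);  /* receive path samples the namespace */
	toy_set_ns(&dev, &new_ns); /* device is moved meanwhile */

	assert(seen->id == 1);     /* the in-flight reader is still in the old one */
	return seen == &old_ns && toy_read_ns(&dev) == &new_ns;
}
```

Waiting for a grace period is what guarantees that no reader like `seen` is still running; the model does not say whether such a reader can exist at the point of the second synchronize_net().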

So unless there is a reason for this change beyond general cleanup, I
would prefer not to think about its potential weirdness, and keep the
code the way it is.

I seem to remember a conversation about this synchronize_net when the
code was merged as well so if we are going to change it, let's look
up those arguments if we can and see if there was something useful
said.

Eric
--
Daniel Lezcano Feb. 11, 2009, 3:49 p.m. UTC | #9
Eric W. Biederman wrote:
> Daniel Lezcano <daniel.lezcano@free.fr> writes:
>   
>> Hmm, at the first glance I would say it is useless but perhaps there is a trick
>> here I do not understand.
>> Eric, is there any particular reason to call synchronize_net before exiting the
>> dev_change_net_namespace function ?
>>     
>
> I haven't thought about that part of the code path in detail in a long
> time.  dev_change_net_namespace() is a condensed version of
> register_netdevice() unregister_netdevice().  With the calls down into
> the driver removed.
>
> On a side note.  It looks like we now cope with:
> call_netdevice_notifiers(NETDEV_REGISTER, dev); failing in
> register_netdev, but no one updated dev_change_net_namespace to handle
> the change, looks like a real pain to cope with.
>
> As for the synchronize_net, and in response to the original
> comment as best as I can tell we do have things being being
> deleted that are at least candidates for synchronize_net.
>
> dev_addr_discard(dev);
> dev_net_set(dev, net);
> netdev_unregister_kobject(dev);
>
> We very much do access dev->net with only rcu protection.
>
> Hmm.
>
> It looks like I originally took the second synchronize_net from what
> became rollback_registered, which happens just before we start freeing
> the netdevice.
>
> The nastiest case that I can envision is if we happen to receive a
> packet (on another cpu) for the network device that we are moving,
> just after it has registered in the new network namespace.  If we read
> the old network namespace and forward it up the network stack in that
> context I can imagine it being a recipe for all kinds of strange
> non-deterministic behavior.
>   

The code does:

    dev_close
       dev_deactivate
          synchronize_rcu
    synchronize_net
    ...
    dev_shutdown
    ...
    synchronize_net

The network device can no longer receive packets after dev_deactivate, can it?
The first synchronize_net waits for the outstanding packets to be
delivered to the upper layer, and only afterwards do we change the nd_net field.
Your scenario makes sense for the first synchronize_net, but I am not
sure it can happen if we remove the second synchronize_net.
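
The ordering argument above can be sketched as a toy, single-threaded userspace model. All toy_* names are hypothetical, and the model simply assumes that closing the device stops new receive paths and that synchronize_net() drains the ones already in flight. It proves nothing about the real kernel; it only restates the reasoning:

```c
#include <assert.h>

/* Toy userspace model, not kernel code: every toy_* name is hypothetical. */
struct toy_net {
	int id;
};

struct toy_dev {
	struct toy_net *net; /* stands in for the device's namespace pointer */
	int up;              /* nonzero while new receive paths may start */
	int readers;         /* receive paths currently in flight */
};

/* A receive path may only start while the device is up. */
static int toy_rx_begin(struct toy_dev *dev)
{
	if (!dev->up)
		return 0;
	dev->readers++;
	return 1;
}

/*
 * Stand-in for synchronize_net(): the real call blocks until all
 * pre-existing readers have finished; this model just drains them.
 */
static void toy_synchronize_net(struct toy_dev *dev)
{
	dev->readers = 0;
}

/*
 * Follows the ordering listed above and returns how many receive paths
 * were still in flight when the namespace pointer was switched.
 */
static int toy_change_net_namespace(struct toy_dev *dev, struct toy_net *net,
				    int keep_second_sync)
{
	int readers_at_switch;

	dev->up = 0;                      /* close: no new receive paths */
	toy_synchronize_net(dev);         /* first synchronize_net() */
	readers_at_switch = dev->readers;
	dev->net = net;                   /* the namespace switch */
	if (keep_second_sync)
		toy_synchronize_net(dev); /* second synchronize_net() */
	return readers_at_switch;
}

/* Start one receive path, then move the device with or without the second sync. */
static int toy_readers_at_switch(int keep_second_sync)
{
	struct toy_net old_ns = { 1 }, new_ns = { 2 };
	struct toy_dev dev = { &old_ns, 1, 0 };
	int started = toy_rx_begin(&dev);
	int in_flight = toy_change_net_namespace(&dev, &new_ns, keep_second_sync);

	assert(started == 1);
	assert(dev.net == &new_ns);
	assert(toy_rx_begin(&dev) == 0);  /* device is down: nothing new starts */
	return in_flight;
}
```

Under these assumptions the second call has no reader left to wait for, whichever value keep_second_sync takes. Whether the assumptions hold for every path that can reach the device is exactly what is being asked here.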

> So unless there is a reason for this change beyond general cleanup I
> would prefer not to think about it potential weirdness, and keep the
> code the way it is.
>
> I seem to remember a conversation about this synchronize_net when the
> code was merged as well so if we are going to change it, let's look
> up those arguments if we can and see if there was something useful
> said.
>
> Eric
>
>
>   

--
Eric W. Biederman Feb. 11, 2009, 11:03 p.m. UTC | #10
Daniel Lezcano <daniel.lezcano@free.fr> writes:

> Eric W. Biederman wrote:
>> Daniel Lezcano <daniel.lezcano@free.fr> writes:
>>
>>> Hmm, at the first glance I would say it is useless but perhaps there is a
>>> trick here I do not understand.
>>> Eric, is there any particular reason to call synchronize_net before exiting
>>> the dev_change_net_namespace function ?
>>
>> I haven't thought about that part of the code path in detail in a long
>> time.  dev_change_net_namespace() is a condensed version of
>> register_netdevice() unregister_netdevice().  With the calls down into
>> the driver removed.
>>
>> On a side note.  It looks like we now cope with:
>> call_netdevice_notifiers(NETDEV_REGISTER, dev); failing in
>> register_netdev, but no one updated dev_change_net_namespace to handle
>> the change, looks like a real pain to cope with.
>>
>> As for the synchronize_net, and in response to the original
>> comment as best as I can tell we do have things being being
>> deleted that are at least candidates for synchronize_net.
>>
>> dev_addr_discard(dev);
>> dev_net_set(dev, net);
>> netdev_unregister_kobject(dev);
>>
>> We very much do access dev->net with only rcu protection.
>>
>> Hmm.
>>
>> It looks like I originally took the second synchronize_net from what
>> became rollback_registered, which happens just before we start freeing
>> the netdevice.
>>
>> The nastiest case that I can envision is if we happen to receive a
>> packet (on another cpu) for the network device that we are moving,
>> just after it has registered in the new network namespace.  If we read
>> the old network namespace and forward it up the network stack in that
>> context I can imagine it being a recipe for all kinds of strange
>> non-deterministic behavior.
>>   
>
> The code does:
>
>    dev_close
>       dev_deactive
>          synchronize_rcu
>    synchronize_net
>    ...
>    dev_shutdown
>    ...
>    synchronize_net
>
> The network device can no longer receive packets after dev_deactive, no ?
> The first synchronize_net will wait for the outstanding packets to be delivered
> to the upper layer and we change the nd_net field after.
> Your scenario makes sense for the first synchronize_net but I am not sure that
> can happen if we remove the second synchronize_net.

Good point.  Visibility is key.  What can find us after we
call list_netdevice()?  Aren't there some pieces of code that
do for_each_netdev under the rcu lock?

Eric
--
Daniel Lezcano Feb. 12, 2009, 3:11 p.m. UTC | #11
Eric W. Biederman wrote:
> Daniel Lezcano <daniel.lezcano@free.fr> writes:
>
>> Eric W. Biederman wrote:
>>> Daniel Lezcano <daniel.lezcano@free.fr> writes:
>>>
>>>> Hmm, at the first glance I would say it is useless but perhaps there is a
>>>> trick here I do not understand.
>>>> Eric, is there any particular reason to call synchronize_net before exiting
>>>> the dev_change_net_namespace function ?
>>>>
>>> I haven't thought about that part of the code path in detail in a long
>>> time.  dev_change_net_namespace() is a condensed version of
>>> register_netdevice() unregister_netdevice().  With the calls down into
>>> the driver removed.
>>>
>>> On a side note.  It looks like we now cope with:
>>> call_netdevice_notifiers(NETDEV_REGISTER, dev); failing in
>>> register_netdev, but no one updated dev_change_net_namespace to handle
>>> the change, looks like a real pain to cope with.
>>>
>>> As for the synchronize_net, and in response to the original
>>> comment as best as I can tell we do have things being being
>>> deleted that are at least candidates for synchronize_net.
>>>
>>> dev_addr_discard(dev);
>>> dev_net_set(dev, net);
>>> netdev_unregister_kobject(dev);
>>>
>>> We very much do access dev->net with only rcu protection.
>>>
>>> Hmm.
>>>
>>> It looks like I originally took the second synchronize_net from what
>>> became rollback_registered, which happens just before we start freeing
>>> the netdevice.
>>>
>>> The nastiest case that I can envision is if we happen to receive a
>>> packet (on another cpu) for the network device that we are moving,
>>> just after it has registered in the new network namespace.  If we read
>>> the old network namespace and forward it up the network stack in that
>>> context I can imagine it being a recipe for all kinds of strange
>>> non-deterministic behavior.
>>>   
>>>       
>> The code does:
>>
>>    dev_close
>>       dev_deactive
>>          synchronize_rcu
>>    synchronize_net
>>    ...
>>    dev_shutdown
>>    ...
>>    synchronize_net
>>
>> The network device can no longer receive packets after dev_deactive, no ?
>> The first synchronize_net will wait for the outstanding packets to be delivered
>> to the upper layer and we change the nd_net field after.
>> Your scenario makes sense for the first synchronize_net but I am not sure that
>> can happen if we remove the second synchronize_net.
>>     
>
> Good point.  Visibility is key.  What can find us after we
> call list_netdevice() ?  Aren't there some pieces of code that
> do for_each_netdevice under the rcu lock?
>   
AFAIR, no. for_each_netdev is protected by rtnl_lock.

--
Daniel Lezcano Feb. 15, 2009, 4:13 p.m. UTC | #12
Daniel Lezcano wrote:
> Eric W. Biederman wrote:
>
>> Daniel Lezcano <daniel.lezcano@free.fr> writes:
>>
>>> Eric W. Biederman wrote:
>>>> Daniel Lezcano <daniel.lezcano@free.fr> writes:
>>>>
>>>>> Hmm, at the first glance I would say it is useless but perhaps there is a
>>>>> trick here I do not understand.
>>>>> Eric, is there any particular reason to call synchronize_net before exiting
>>>>> the dev_change_net_namespace function ?
>>>>>
>>>> I haven't thought about that part of the code path in detail in a long
>>>> time.  dev_change_net_namespace() is a condensed version of
>>>> register_netdevice() unregister_netdevice().  With the calls down into
>>>> the driver removed.
>>>>
>>>> On a side note.  It looks like we now cope with:
>>>> call_netdevice_notifiers(NETDEV_REGISTER, dev); failing in
>>>> register_netdev, but no one updated dev_change_net_namespace to handle
>>>> the change, looks like a real pain to cope with.
>>>>
>>>> As for the synchronize_net, and in response to the original
>>>> comment as best as I can tell we do have things being being
>>>> deleted that are at least candidates for synchronize_net.
>>>>
>>>> dev_addr_discard(dev);
>>>> dev_net_set(dev, net);
>>>> netdev_unregister_kobject(dev);
>>>>
>>>> We very much do access dev->net with only rcu protection.
>>>>
>>>> Hmm.
>>>>
>>>> It looks like I originally took the second synchronize_net from what
>>>> became rollback_registered, which happens just before we start freeing
>>>> the netdevice.
>>>>
>>>> The nastiest case that I can envision is if we happen to receive a
>>>> packet (on another cpu) for the network device that we are moving,
>>>> just after it has registered in the new network namespace.  If we read
>>>> the old network namespace and forward it up the network stack in that
>>>> context I can imagine it being a recipe for all kinds of strange
>>>> non-deterministic behavior.
>>>>   
>>>>       
>>>>         
>>> The code does:
>>>
>>>    dev_close
>>>       dev_deactive
>>>          synchronize_rcu
>>>    synchronize_net
>>>    ...
>>>    dev_shutdown
>>>    ...
>>>    synchronize_net
>>>
>>> The network device can no longer receive packets after dev_deactive, no ?
>>> The first synchronize_net will wait for the outstanding packets to be delivered
>>> to the upper layer and we change the nd_net field after.
>>> Your scenario makes sense for the first synchronize_net but I am not sure that
>>> can happen if we remove the second synchronize_net.
>>>     
>>>       
>> Good point.  Visibility is key.  What can find us after we
>> call list_netdevice() ?  Aren't there some pieces of code that
>> do for_each_netdevice under the rcu lock?
>>   
>>     
> AFAIR, no. for_each_netdev is protected by rtnl_lock.
>   

Nicolas,

At first glance, removing the second synchronize_net looks fine, but
before posting the patch, would you mind waiting a little?
I would like to run some tests with your patch to check that we haven't
missed anything.

Thanks
  -- Daniel
--
Nicolas Dichtel Feb. 16, 2009, 1:46 p.m. UTC | #13
Daniel Lezcano wrote:
> Daniel Lezcano wrote:
>> Eric W. Biederman wrote:
>>
>>> Daniel Lezcano <daniel.lezcano@free.fr> writes:
>>>
>>>> Eric W. Biederman wrote:
>>>>> Daniel Lezcano <daniel.lezcano@free.fr> writes:
>>>>>
>>>>>> Hmm, at the first glance I would say it is useless but perhaps there is a
>>>>>> trick here I do not understand.
>>>>>> Eric, is there any particular reason to call synchronize_net before exiting
>>>>>> the dev_change_net_namespace function ?
>>>>>>
>>>>> I haven't thought about that part of the code path in detail in a long
>>>>> time.  dev_change_net_namespace() is a condensed version of
>>>>> register_netdevice() unregister_netdevice().  With the calls down into
>>>>> the driver removed.
>>>>>
>>>>> On a side note.  It looks like we now cope with:
>>>>> call_netdevice_notifiers(NETDEV_REGISTER, dev); failing in
>>>>> register_netdev, but no one updated dev_change_net_namespace to handle
>>>>> the change, looks like a real pain to cope with.
>>>>>
>>>>> As for the synchronize_net, and in response to the original
>>>>> comment as best as I can tell we do have things being being
>>>>> deleted that are at least candidates for synchronize_net.
>>>>>
>>>>> dev_addr_discard(dev);
>>>>> dev_net_set(dev, net);
>>>>> netdev_unregister_kobject(dev);
>>>>>
>>>>> We very much do access dev->net with only rcu protection.
>>>>>
>>>>> Hmm.
>>>>>
>>>>> It looks like I originally took the second synchronize_net from what
>>>>> became rollback_registered, which happens just before we start freeing
>>>>> the netdevice.
>>>>>
>>>>> The nastiest case that I can envision is if we happen to receive a
>>>>> packet (on another cpu) for the network device that we are moving,
>>>>> just after it has registered in the new network namespace.  If we read
>>>>> the old network namespace and forward it up the network stack in that
>>>>> context I can imagine it being a recipe for all kinds of strange
>>>>> non-deterministic behavior.
>>>>>                 
>>>> The code does:
>>>>
>>>>    dev_close
>>>>       dev_deactive
>>>>          synchronize_rcu
>>>>    synchronize_net
>>>>    ...
>>>>    dev_shutdown
>>>>    ...
>>>>    synchronize_net
>>>>
>>>> The network device can no longer receive packets after dev_deactive, 
>>>> no ?
>>>> The first synchronize_net will wait for the outstanding packets to 
>>>> be delivered
>>>> to the upper layer and we change the nd_net field after.
>>>> Your scenario makes sense for the first synchronize_net but I am not 
>>>> sure that
>>>> can happen if we remove the second synchronize_net.
>>>>           
>>> Good point.  Visibility is key.  What can find us after we
>>> call list_netdevice() ?  Aren't there some pieces of code that
>>> do for_each_netdevice under the rcu lock?
>>>       
>> AFAIR, no. for_each_netdev is protected by rtnl_lock.
>>   
> 
> Nicolas,
> 
> At the first glance it looks like the removing of the second 
> synchronize_net is fine, but before posting the patch do you mind to 
> wait a little ?
> I would like to do some tests with your patch to check if we don't 
> missed something.
> 

Hi Daniel,

No problem, there is no hurry. Let me know the results of your tests.

Thanks,
Nicolas
--
Patch

--- linux-2.6.28.2/net/core/dev.c	2009-01-24 19:42:07.000000000 -0500
+++ linux-2.6.28.2-new/net/core/dev.c	2009-02-05 04:16:35.000000000 -0500
@@ -4546,7 +4546,6 @@  int dev_change_net_namespace(struct net_
 	/* Notify protocols, that a new device appeared. */
 	call_netdevice_notifiers(NETDEV_REGISTER, dev);
 
-	synchronize_net();
 	err = 0;
 out:
 	return err;