diff mbox series

net: Fix double free and memory corruption in get_net_ns_by_id()

Message ID 87d13aaaqr.fsf@xmission.com
State Accepted, archived
Delegated to: David Miller
Headers show
Series net: Fix double free and memory corruption in get_net_ns_by_id() | expand

Commit Message

Eric W. Biederman Dec. 19, 2017, 5:27 p.m. UTC
(I can trivially verify that that idr_remove in cleanup_net happens
 after the network namespace count has dropped to zero --EWB)

Function get_net_ns_by_id() does not check for net::count
after it has found a peer in netns_ids idr.

It may dereference a peer, after its count has already been
finaly decremented. This leads to double free and memory
corruption:

put_net(peer)                                   rtnl_lock()
atomic_dec_and_test(&peer->count) [count=0]     ...
__put_net(peer)                                 get_net_ns_by_id(net, id)
  spin_lock(&cleanup_list_lock)
  list_add(&net->cleanup_list, &cleanup_list)
  spin_unlock(&cleanup_list_lock)
queue_work()                                      peer = idr_find(&net->netns_ids, id)
  |                                               get_net(peer) [count=1]
  |                                               ...
  |                                               (use after final put)
  v                                               ...
  cleanup_net()                                   ...
    spin_lock(&cleanup_list_lock)                 ...
    list_replace_init(&cleanup_list, ..)          ...
    spin_unlock(&cleanup_list_lock)               ...
    ...                                           ...
    ...                                           put_net(peer)
    ...                                             atomic_dec_and_test(&peer->count) [count=0]
    ...                                               spin_lock(&cleanup_list_lock)
    ...                                               list_add(&net->cleanup_list, &cleanup_list)
    ...                                               spin_unlock(&cleanup_list_lock)
    ...                                             queue_work()
    ...                                           rtnl_unlock()
    rtnl_lock()                                   ...
    for_each_net(tmp) {                           ...
      id = __peernet2id(tmp, peer)                ...
      spin_lock_irq(&tmp->nsid_lock)              ...
      idr_remove(&tmp->netns_ids, id)             ...
      ...                                         ...
      net_drop_ns()                               ...
	net_free(peer)                            ...
    }                                             ...
  |
  v
  cleanup_net()
    ...
    (Second free of peer)

Also, put_net() on the right cpu may reorder with left's cpu
list_replace_init(&cleanup_list, ..), and then cleanup_list
will be corrupted.

Since cleanup_net() is executed in worker thread, while
put_net(peer) can happen everywhere, there should be
enough time for concurrent get_net_ns_by_id() to pick
the peer up, and the race does not seem to be unlikely.
The patch fixes the problem in standard way.

(Also, there is possible problem in peernet2id_alloc(), which requires
check for net::count under nsid_lock and maybe_get_net(peer), but
in current stable kernel it's used under rtnl_lock() and it has to be
safe. Openswitch begun to use peernet2id_alloc(), and possibly it should
be fixed too. While this is not in stable kernel yet, so I'll send
a separate message to netdev@ later).

Cc: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Fixes: 0c7aecd4bde4 "netns: add rtnl cmd to add and get peer netns ids"
Reviewed-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Reviewed-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
---
 net/core/net_namespace.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Eric Dumazet Dec. 19, 2017, 6:31 p.m. UTC | #1
On Tue, 2017-12-19 at 11:27 -0600, Eric W. Biederman wrote:
> (I can trivially verify that that idr_remove in cleanup_net happens
>  after the network namespace count has dropped to zero --EWB)
> 

Reviewed-by: Eric Dumazet <edumazet@google.com>

Thanks !
Nicolas Dichtel Dec. 20, 2017, 3 p.m. UTC | #2
Le 19/12/2017 à 18:27, Eric W. Biederman a écrit :
> 
> (I can trivially verify that that idr_remove in cleanup_net happens
>  after the network namespace count has dropped to zero --EWB)
> 
> Function get_net_ns_by_id() does not check for net::count
> after it has found a peer in netns_ids idr.
> 
> It may dereference a peer, after its count has already been
> finaly decremented. This leads to double free and memory
> corruption:
> 
> put_net(peer)                                   rtnl_lock()
> atomic_dec_and_test(&peer->count) [count=0]     ...
> __put_net(peer)                                 get_net_ns_by_id(net, id)
>   spin_lock(&cleanup_list_lock)
>   list_add(&net->cleanup_list, &cleanup_list)
>   spin_unlock(&cleanup_list_lock)
> queue_work()                                      peer = idr_find(&net->netns_ids, id)
>   |                                               get_net(peer) [count=1]
>   |                                               ...
>   |                                               (use after final put)
>   v                                               ...
>   cleanup_net()                                   ...
>     spin_lock(&cleanup_list_lock)                 ...
>     list_replace_init(&cleanup_list, ..)          ...
>     spin_unlock(&cleanup_list_lock)               ...
>     ...                                           ...
>     ...                                           put_net(peer)
>     ...                                             atomic_dec_and_test(&peer->count) [count=0]
>     ...                                               spin_lock(&cleanup_list_lock)
>     ...                                               list_add(&net->cleanup_list, &cleanup_list)
>     ...                                               spin_unlock(&cleanup_list_lock)
>     ...                                             queue_work()
>     ...                                           rtnl_unlock()
>     rtnl_lock()                                   ...
>     for_each_net(tmp) {                           ...
>       id = __peernet2id(tmp, peer)                ...
>       spin_lock_irq(&tmp->nsid_lock)              ...
>       idr_remove(&tmp->netns_ids, id)             ...
>       ...                                         ...
>       net_drop_ns()                               ...
> 	net_free(peer)                            ...
>     }                                             ...
>   |
>   v
>   cleanup_net()
>     ...
>     (Second free of peer)
> 
> Also, put_net() on the right cpu may reorder with left's cpu
> list_replace_init(&cleanup_list, ..), and then cleanup_list
> will be corrupted.
> 
> Since cleanup_net() is executed in worker thread, while
> put_net(peer) can happen everywhere, there should be
> enough time for concurrent get_net_ns_by_id() to pick
> the peer up, and the race does not seem to be unlikely.
> The patch fixes the problem in standard way.
> 
> (Also, there is possible problem in peernet2id_alloc(), which requires
> check for net::count under nsid_lock and maybe_get_net(peer), but
> in current stable kernel it's used under rtnl_lock() and it has to be
> safe. Openswitch begun to use peernet2id_alloc(), and possibly it should
> be fixed too. While this is not in stable kernel yet, so I'll send
> a separate message to netdev@ later).
> 
> Cc: Nicolas Dichtel <nicolas.dichtel@6wind.com>
> Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
> Fixes: 0c7aecd4bde4 "netns: add rtnl cmd to add and get peer netns ids"
> Reviewed-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
> Reviewed-by: "Eric W. Biederman" <ebiederm@xmission.com>
> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Good catch, thank you.

Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
David Miller Dec. 20, 2017, 5:43 p.m. UTC | #3
From: ebiederm@xmission.com (Eric W. Biederman)
Date: Tue, 19 Dec 2017 11:27:56 -0600

> (I can trivially verify that that idr_remove in cleanup_net happens
>  after the network namespace count has dropped to zero --EWB)
> 
> Function get_net_ns_by_id() does not check for net::count
> after it has found a peer in netns_ids idr.
> 
> It may dereference a peer, after its count has already been
> finaly decremented. This leads to double free and memory
> corruption:
 ...
> Also, put_net() on the right cpu may reorder with left's cpu
> list_replace_init(&cleanup_list, ..), and then cleanup_list
> will be corrupted.
> 
> Since cleanup_net() is executed in worker thread, while
> put_net(peer) can happen everywhere, there should be
> enough time for concurrent get_net_ns_by_id() to pick
> the peer up, and the race does not seem to be unlikely.
> The patch fixes the problem in standard way.
> 
> (Also, there is possible problem in peernet2id_alloc(), which requires
> check for net::count under nsid_lock and maybe_get_net(peer), but
> in current stable kernel it's used under rtnl_lock() and it has to be
> safe. Openswitch begun to use peernet2id_alloc(), and possibly it should
> be fixed too. While this is not in stable kernel yet, so I'll send
> a separate message to netdev@ later).
> 
> Cc: Nicolas Dichtel <nicolas.dichtel@6wind.com>
> Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
> Fixes: 0c7aecd4bde4 "netns: add rtnl cmd to add and get peer netns ids"
> Reviewed-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
> Reviewed-by: "Eric W. Biederman" <ebiederm@xmission.com>
> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>

Applied and queued up for -stable, thanks.
diff mbox series

Patch

diff --git a/net/core/net_namespace.c b/net/core/net_namespace.c
index b797832565d3..60a71be75aea 100644
--- a/net/core/net_namespace.c
+++ b/net/core/net_namespace.c
@@ -267,7 +267,7 @@  struct net *get_net_ns_by_id(struct net *net, int id)
 	spin_lock_bh(&net->nsid_lock);
 	peer = idr_find(&net->netns_ids, id);
 	if (peer)
-		get_net(peer);
+		peer = maybe_get_net(peer);
 	spin_unlock_bh(&net->nsid_lock);
 	rcu_read_unlock();