diff mbox series

[bpf,1/2] bpf, sockhash: Fix memory leak when unlinking sockets in sock_hash_free

Message ID 20200607205229.2389672-2-jakub@cloudflare.com
State Accepted
Delegated to: BPF Maintainers
Headers show
Series Fixes for sock_hash_free | expand

Commit Message

Jakub Sitnicki June 7, 2020, 8:52 p.m. UTC
When sockhash gets destroyed while sockets are still linked to it, we will
walk the bucket lists and delete the links. However, we are not freeing the
list elements after processing them, leaking the memory.

The leak can be triggered by close()'ing a sockhash map when it still
contains sockets, and observed with kmemleak:

  unreferenced object 0xffff888116e86f00 (size 64):
    comm "race_sock_unlin", pid 223, jiffies 4294731063 (age 217.404s)
    hex dump (first 32 bytes):
      00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
      81 de e8 41 00 00 00 00 c0 69 2f 15 81 88 ff ff  ...A.....i/.....
    backtrace:
      [<00000000dd089ebb>] sock_hash_update_common+0x4ca/0x760
      [<00000000b8219bd5>] sock_hash_update_elem+0x1d2/0x200
      [<000000005e2c23de>] __do_sys_bpf+0x2046/0x2990
      [<00000000d0084618>] do_syscall_64+0xad/0x9a0
      [<000000000d96f263>] entry_SYSCALL_64_after_hwframe+0x49/0xb3

Fix it by freeing the list element when we're done with it.

Fixes: 604326b41a6f ("bpf, sockmap: convert to generic sk_msg interface")
Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
---
 net/core/sock_map.c | 1 +
 1 file changed, 1 insertion(+)

Comments

John Fastabend June 9, 2020, 4:56 p.m. UTC | #1
Jakub Sitnicki wrote:
> When sockhash gets destroyed while sockets are still linked to it, we will
> walk the bucket lists and delete the links. However, we are not freeing the
> list elements after processing them, leaking the memory.
> 
> The leak can be triggered by close()'ing a sockhash map when it still
> contains sockets, and observed with kmemleak:
> 
>   unreferenced object 0xffff888116e86f00 (size 64):
>     comm "race_sock_unlin", pid 223, jiffies 4294731063 (age 217.404s)
>     hex dump (first 32 bytes):
>       00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
>       81 de e8 41 00 00 00 00 c0 69 2f 15 81 88 ff ff  ...A.....i/.....
>     backtrace:
>       [<00000000dd089ebb>] sock_hash_update_common+0x4ca/0x760
>       [<00000000b8219bd5>] sock_hash_update_elem+0x1d2/0x200
>       [<000000005e2c23de>] __do_sys_bpf+0x2046/0x2990
>       [<00000000d0084618>] do_syscall_64+0xad/0x9a0
>       [<000000000d96f263>] entry_SYSCALL_64_after_hwframe+0x49/0xb3
> 
> Fix it by freeing the list element when we're done with it.
> 
> Fixes: 604326b41a6f ("bpf, sockmap: convert to generic sk_msg interface")
> Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
> ---
>  net/core/sock_map.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/net/core/sock_map.c b/net/core/sock_map.c
> index 00a26cf2cfe9..ea46f07a22d8 100644
> --- a/net/core/sock_map.c
> +++ b/net/core/sock_map.c
> @@ -1031,6 +1031,7 @@ static void sock_hash_free(struct bpf_map *map)
>  			sock_map_unref(elem->sk, elem);
>  			rcu_read_unlock();
>  			release_sock(elem->sk);
> +			sock_hash_free_elem(htab, elem);
>  		}
>  	}
>  
> -- 
> 2.25.4
> 

In Cilium we pin the map and never release it thanks for catching this.

Acked-by: John Fastabend <john.fastabend@gmail.com>
diff mbox series

Patch

diff --git a/net/core/sock_map.c b/net/core/sock_map.c
index 00a26cf2cfe9..ea46f07a22d8 100644
--- a/net/core/sock_map.c
+++ b/net/core/sock_map.c
@@ -1031,6 +1031,7 @@  static void sock_hash_free(struct bpf_map *map)
 			sock_map_unref(elem->sk, elem);
 			rcu_read_unlock();
 			release_sock(elem->sk);
+			sock_hash_free_elem(htab, elem);
 		}
 	}