[RFC,bpf-next,2/5] bpf, sockmap: Allow inserting listening TCP sockets into SOCKMAP

Message ID 20191022113730.29303-3-jakub@cloudflare.com
State RFC
Delegated to: BPF Maintainers
Series Extend SOCKMAP to store listening sockets

Commit Message

Jakub Sitnicki Oct. 22, 2019, 11:37 a.m. UTC
In order for SOCKMAP type to become a generic collection for storing socket
references we need to loosen the checks in update callback.

Currently SOCKMAP requires the TCP socket to be in established state, which
prevents us from using it to keep references to listening sockets.

Change the update pre-checks so that it is sufficient for socket to be in a
hash table, i.e. have a local address/port, to be inserted.

Return -EINVAL if the condition is not met to be consistent with
REUSEPORT_SOCKARRAY map type.

Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
---
 net/core/sock_map.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)
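
For illustration, here is a minimal user-space sketch of the operation this
patch is meant to allow: creating a SOCKMAP and storing a listening TCP socket
in it. It uses the raw bpf(2) syscall rather than libbpf. The helper names
sys_bpf() and try_insert_listener() are made up for this example and are not
part of the patch. Creating the map needs sufficient privileges, and on a
kernel without this change the update step is expected to fail with
-EOPNOTSUPP because the socket is not in the established state.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <linux/bpf.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical thin wrapper around the bpf(2) syscall. */
static int sys_bpf(int cmd, union bpf_attr *attr)
{
	return (int)syscall(__NR_bpf, cmd, attr, sizeof(*attr));
}

/*
 * Hypothetical helper: create a one-slot SOCKMAP, open a TCP listener on
 * loopback, and try to store the listener at index 0.
 * Returns 0 on success or -errno of the first step that failed.
 */
static int try_insert_listener(void)
{
	union bpf_attr attr;
	struct sockaddr_in addr;
	__u32 key = 0;
	__u32 value;
	int map_fd, srv_fd, ret;

	memset(&attr, 0, sizeof(attr));
	attr.map_type = BPF_MAP_TYPE_SOCKMAP;
	attr.key_size = sizeof(key);
	attr.value_size = sizeof(value);
	attr.max_entries = 1;
	map_fd = sys_bpf(BPF_MAP_CREATE, &attr);
	if (map_fd < 0)
		return -errno;

	srv_fd = socket(AF_INET, SOCK_STREAM, 0);
	if (srv_fd < 0) {
		ret = -errno;
		goto close_map;
	}

	memset(&addr, 0, sizeof(addr));
	addr.sin_family = AF_INET;
	addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
	addr.sin_port = 0;	/* let the kernel pick a free port */
	if (bind(srv_fd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
	    listen(srv_fd, 1) < 0) {
		ret = -errno;
		goto close_sock;
	}

	/* The map value passed from user space is the socket's fd. */
	value = (__u32)srv_fd;
	memset(&attr, 0, sizeof(attr));
	attr.map_fd = map_fd;
	attr.key = (__u64)(unsigned long)&key;
	attr.value = (__u64)(unsigned long)&value;
	attr.flags = BPF_ANY;
	ret = sys_bpf(BPF_MAP_UPDATE_ELEM, &attr) < 0 ? -errno : 0;

close_sock:
	close(srv_fd);
close_map:
	close(map_fd);
	return ret;
}
```

Calling try_insert_listener() from a small main() and printing the result is
enough to compare behavior before and after the change. The exact error code
depends on the running kernel and on the caller's privileges.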

Comments

John Fastabend Oct. 24, 2019, 5:06 p.m. UTC | #1
Jakub Sitnicki wrote:
> In order for SOCKMAP type to become a generic collection for storing socket
> references we need to loosen the checks in update callback.
> 
> Currently SOCKMAP requires the TCP socket to be in established state, which
> prevents us from using it to keep references to listening sockets.
> 
> Change the update pre-checks so that it is sufficient for socket to be in a
> hash table, i.e. have a local address/port, to be inserted.
> 
> Return -EINVAL if the condition is not met to be consistent with
> REUSEPORT_SOCKARRAY map type.
> 
> Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
> ---

We also need some tests to verify that redirecting to this listening socket
does the correct thing. Once it's in the map we can redirect (ingress or egress)
to it, and we need to be sure the semantics are sane.

>  net/core/sock_map.c | 7 +++++--
>  1 file changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/net/core/sock_map.c b/net/core/sock_map.c
> index facacc296e6c..222036393b90 100644
> --- a/net/core/sock_map.c
> +++ b/net/core/sock_map.c
> @@ -415,11 +415,14 @@ static int sock_map_update_elem(struct bpf_map *map, void *key,
>  		ret = -EINVAL;
>  		goto out;
>  	}
> -	if (!sock_map_sk_is_suitable(sk) ||
> -	    sk->sk_state != TCP_ESTABLISHED) {
> +	if (!sock_map_sk_is_suitable(sk)) {
>  		ret = -EOPNOTSUPP;
>  		goto out;
>  	}
> +	if (!sk_hashed(sk)) {
> +		ret = -EINVAL;
> +		goto out;
> +	}
>  
>  	sock_map_sk_acquire(sk);
>  	ret = sock_map_update_common(map, idx, sk, flags);
> -- 
> 2.20.1
>
Jakub Sitnicki Oct. 25, 2019, 9:41 a.m. UTC | #2
On Thu, Oct 24, 2019 at 07:06 PM CEST, John Fastabend wrote:
> Jakub Sitnicki wrote:
>> In order for SOCKMAP type to become a generic collection for storing socket
>> references we need to loosen the checks in update callback.
>>
>> Currently SOCKMAP requires the TCP socket to be in established state, which
>> prevents us from using it to keep references to listening sockets.
>>
>> Change the update pre-checks so that it is sufficient for socket to be in a
>> hash table, i.e. have a local address/port, to be inserted.
>>
>> Return -EINVAL if the condition is not met to be consistent with
>> REUSEPORT_SOCKARRAY map type.
>>
>> Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
>> ---
>
> We also need some tests to verify that redirecting to this listening socket
> does the correct thing. Once it's in the map we can redirect (ingress or egress)
> to it, and we need to be sure the semantics are sane.

You're right. The redirect BPF helpers that operate on SOCKMAP might be
relying on an assumption that sockets are in established state. I need
to look into that.

Thanks,
Jakub

Patch

diff --git a/net/core/sock_map.c b/net/core/sock_map.c
index facacc296e6c..222036393b90 100644
--- a/net/core/sock_map.c
+++ b/net/core/sock_map.c
@@ -415,11 +415,14 @@  static int sock_map_update_elem(struct bpf_map *map, void *key,
 		ret = -EINVAL;
 		goto out;
 	}
-	if (!sock_map_sk_is_suitable(sk) ||
-	    sk->sk_state != TCP_ESTABLISHED) {
+	if (!sock_map_sk_is_suitable(sk)) {
 		ret = -EOPNOTSUPP;
 		goto out;
 	}
+	if (!sk_hashed(sk)) {
+		ret = -EINVAL;
+		goto out;
+	}
 
 	sock_map_sk_acquire(sk);
 	ret = sock_map_update_common(map, idx, sk, flags);
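
The effect of the hunk above on the update pre-checks can be modeled in plain
user-space C. This is only a simplified sketch: struct model_sock and both
functions are hypothetical stand-ins, with boolean fields in place of the
kernel's sock_map_sk_is_suitable(), sk->sk_state and sk_hashed() checks. It
shows the order of the checks and which error code each one yields.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the socket properties the pre-checks look at. */
struct model_sock {
	bool suitable;		/* models sock_map_sk_is_suitable(sk) */
	bool established;	/* models sk->sk_state == TCP_ESTABLISHED */
	bool hashed;		/* models sk_hashed(sk) */
};

/* Pre-checks before the patch: one combined test, one error code. */
static int model_prechecks_old(const struct model_sock *sk)
{
	if (!sk->suitable || !sk->established)
		return -EOPNOTSUPP;
	return 0;
}

/* Pre-checks after the patch: suitability first, then the hashed test. */
static int model_prechecks_new(const struct model_sock *sk)
{
	if (!sk->suitable)
		return -EOPNOTSUPP;
	if (!sk->hashed)
		return -EINVAL;
	return 0;
}
```

In this model, a listening socket (suitable, hashed, not established) is
rejected with -EOPNOTSUPP by model_prechecks_old() and accepted by
model_prechecks_new(). A suitable socket that is not hashed now gets -EINVAL.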