diff mbox series

[bpf-next] bpf: sockmap, fix crash when ipv6 sock is added

Message ID 20180604152125.6930.88723.stgit@john-Precision-Tower-5810
State Changes Requested, archived
Delegated to: BPF Maintainers
Headers show
Series [bpf-next] bpf: sockmap, fix crash when ipv6 sock is added | expand

Commit Message

John Fastabend June 4, 2018, 3:21 p.m. UTC
This fixes a crash where we assign tcp_prot to IPv6 sockets instead
of tcpv6_prot.

Previously we overwrote the sk->prot field with tcp_prot even in the
AF_INET6 case. This patch ensures the correct tcp_prot and tcpv6_prot
are used. Further, only allow ESTABLISHED connections to join the
map per note in TLS ULP,

   /* The TLS ulp is currently supported only for TCP sockets
    * in ESTABLISHED state.
    * Supporting sockets in LISTEN state will require us
    * to modify the accept implementation to clone rather then
    * share the ulp context.
    */

Also tested with 'netserver -6' and 'netperf -H [IPv6]' as well as
'netperf -H [IPv4]'. The ESTABLISHED check resolves the previously
crashing case here.

Fixes: 174a79ff9515 ("bpf: sockmap with sk redirect support")
Reported-by: syzbot+5c063698bdbfac19f363@syzkaller.appspotmail.com
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Wei Wang <weiwan@google.com>
---
 kernel/bpf/sockmap.c |   23 ++++++++++++++++++++++-
 1 file changed, 22 insertions(+), 1 deletion(-)

Comments

Daniel Borkmann June 4, 2018, 7:59 p.m. UTC | #1
On 06/04/2018 05:21 PM, John Fastabend wrote:
> This fixes a crash where we assign tcp_prot to IPv6 sockets instead
> of tcpv6_prot.
> 
> Previously we overwrote the sk->prot field with tcp_prot even in the
> AF_INET6 case. This patch ensures the correct tcp_prot and tcpv6_prot
> are used. Further, only allow ESTABLISHED connections to join the
> map per note in TLS ULP,
> 
>    /* The TLS ulp is currently supported only for TCP sockets
>     * in ESTABLISHED state.
>     * Supporting sockets in LISTEN state will require us
>     * to modify the accept implementation to clone rather then
>     * share the ulp context.
>     */
> 
> Also tested with 'netserver -6' and 'netperf -H [IPv6]' as well as
> 'netperf -H [IPv4]'. The ESTABLISHED check resolves the previously
> crashing case here.
> 
> Fixes: 174a79ff9515 ("bpf: sockmap with sk redirect support")
> Reported-by: syzbot+5c063698bdbfac19f363@syzkaller.appspotmail.com
> Signed-off-by: John Fastabend <john.fastabend@gmail.com>
> Signed-off-by: Wei Wang <weiwan@google.com>

Applied to bpf-next, thanks everyone!
John Fastabend June 4, 2018, 11:08 p.m. UTC | #2
On 06/04/2018 12:59 PM, Daniel Borkmann wrote:
> On 06/04/2018 05:21 PM, John Fastabend wrote:
>> This fixes a crash where we assign tcp_prot to IPv6 sockets instead
>> of tcpv6_prot.
>>
>> Previously we overwrote the sk->prot field with tcp_prot even in the
>> AF_INET6 case. This patch ensures the correct tcp_prot and tcpv6_prot
>> are used. Further, only allow ESTABLISHED connections to join the
>> map per note in TLS ULP,
>>
>>    /* The TLS ulp is currently supported only for TCP sockets
>>     * in ESTABLISHED state.
>>     * Supporting sockets in LISTEN state will require us
>>     * to modify the accept implementation to clone rather then
>>     * share the ulp context.
>>     */
>>
>> Also tested with 'netserver -6' and 'netperf -H [IPv6]' as well as
>> 'netperf -H [IPv4]'. The ESTABLISHED check resolves the previously
>> crashing case here.
>>
>> Fixes: 174a79ff9515 ("bpf: sockmap with sk redirect support")
>> Reported-by: syzbot+5c063698bdbfac19f363@syzkaller.appspotmail.com
>> Signed-off-by: John Fastabend <john.fastabend@gmail.com>
>> Signed-off-by: Wei Wang <weiwan@google.com>
> 
> Applied to bpf-next, thanks everyone!
> 

Thanks Daniel, this has the unfortunate side-effect though of
making it hard to add sockets transitioning from LISTEN into
ESTABLISHED states to a sockmap. Before this patch we could add
sockets from the sock_ops event BPF_SOCK_OPS_PASSIVE_ESTABLISHED_CB
to a sock{map|hash}. However, this is before a socket is in established
state so risked crashing and wasn't valid at all per this thread. So
I believe its correct to block this action, seeing it will crash a
system in many (most!) cases.

That said we still would like to support pushing sockets into a
sock{map|hash} in this case. I thought about adding a new hook but
we already have a few sock op hooks in the TCP stack so its too bad we
don't have one that fires after the ESTABLISHED state has transitioned.
Right now I'm looking into seeing if the following would have any
issues,

--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -2206,9 +2206,6 @@ void tcp_set_state(struct sock *sk, int state)
        BUILD_BUG_ON((int)BPF_TCP_NEW_SYN_RECV != (int)TCP_NEW_SYN_RECV);
        BUILD_BUG_ON((int)BPF_TCP_MAX_STATES != (int)TCP_MAX_STATES);
 
-       if (BPF_SOCK_OPS_TEST_FLAG(tcp_sk(sk), BPF_SOCK_OPS_STATE_CB_FLAG))
-               tcp_call_bpf_2arg(sk, BPF_SOCK_OPS_STATE_CB, oldstate, state);
-
        switch (state) {
        case TCP_ESTABLISHED:
                if (oldstate != TCP_ESTABLISHED)
@@ -2234,6 +2231,9 @@ void tcp_set_state(struct sock *sk, int state)
         */
        inet_sk_state_store(sk, state);
 
+       if (BPF_SOCK_OPS_TEST_FLAG(tcp_sk(sk), BPF_SOCK_OPS_STATE_CB_FLAG))
+               tcp_call_bpf_2arg(sk, BPF_SOCK_OPS_STATE_CB, oldstate, state);
+
 #ifdef STATE_TRACE
        SOCK_DEBUG(sk, "TCP sk=%p, State %s -> %s\n", sk, statename[oldstate], statename[state]);
 #endif


This would change the call hook slightly, moving it to after the state
change. However unless the unhash is some how visible from the bpf program
I don't think it should impact existing BPF programs.

Thanks,
John
Daniel Borkmann June 4, 2018, 11:20 p.m. UTC | #3
On 06/05/2018 01:08 AM, John Fastabend wrote:
> On 06/04/2018 12:59 PM, Daniel Borkmann wrote:
>> On 06/04/2018 05:21 PM, John Fastabend wrote:
>>> This fixes a crash where we assign tcp_prot to IPv6 sockets instead
>>> of tcpv6_prot.
>>>
>>> Previously we overwrote the sk->prot field with tcp_prot even in the
>>> AF_INET6 case. This patch ensures the correct tcp_prot and tcpv6_prot
>>> are used. Further, only allow ESTABLISHED connections to join the
>>> map per note in TLS ULP,
>>>
>>>    /* The TLS ulp is currently supported only for TCP sockets
>>>     * in ESTABLISHED state.
>>>     * Supporting sockets in LISTEN state will require us
>>>     * to modify the accept implementation to clone rather then
>>>     * share the ulp context.
>>>     */
>>>
>>> Also tested with 'netserver -6' and 'netperf -H [IPv6]' as well as
>>> 'netperf -H [IPv4]'. The ESTABLISHED check resolves the previously
>>> crashing case here.
>>>
>>> Fixes: 174a79ff9515 ("bpf: sockmap with sk redirect support")
>>> Reported-by: syzbot+5c063698bdbfac19f363@syzkaller.appspotmail.com
>>> Signed-off-by: John Fastabend <john.fastabend@gmail.com>
>>> Signed-off-by: Wei Wang <weiwan@google.com>
>>
>> Applied to bpf-next, thanks everyone!
> 
> Thanks Daniel, this has the unfortunate side-effect though of
> making it hard to add sockets transitioning from LISTEN into
> ESTABLISHED states to a sockmap. Before this patch we could add
> sockets from the sock_ops event BPF_SOCK_OPS_PASSIVE_ESTABLISHED_CB
> to a sock{map|hash}. However, this is before a socket is in established
> state so risked crashing and wasn't valid at all per this thread. So
> I believe its correct to block this action, seeing it will crash a
> system in many (most!) cases.
> 
> That said we still would like to support pushing sockets into a
> sock{map|hash} in this case. I thought about adding a new hook but
> we already have a few sock op hooks in the TCP stack so its too bad we
> don't have one that fires after the ESTABLISHED state has transitioned.
> Right now I'm looking into seeing if the following would have any
> issues,
> 
> --- a/net/ipv4/tcp.c
> +++ b/net/ipv4/tcp.c
> @@ -2206,9 +2206,6 @@ void tcp_set_state(struct sock *sk, int state)
>         BUILD_BUG_ON((int)BPF_TCP_NEW_SYN_RECV != (int)TCP_NEW_SYN_RECV);
>         BUILD_BUG_ON((int)BPF_TCP_MAX_STATES != (int)TCP_MAX_STATES);
>  
> -       if (BPF_SOCK_OPS_TEST_FLAG(tcp_sk(sk), BPF_SOCK_OPS_STATE_CB_FLAG))
> -               tcp_call_bpf_2arg(sk, BPF_SOCK_OPS_STATE_CB, oldstate, state);
> -
>         switch (state) {
>         case TCP_ESTABLISHED:
>                 if (oldstate != TCP_ESTABLISHED)
> @@ -2234,6 +2231,9 @@ void tcp_set_state(struct sock *sk, int state)
>          */
>         inet_sk_state_store(sk, state);
>  
> +       if (BPF_SOCK_OPS_TEST_FLAG(tcp_sk(sk), BPF_SOCK_OPS_STATE_CB_FLAG))
> +               tcp_call_bpf_2arg(sk, BPF_SOCK_OPS_STATE_CB, oldstate, state);
> +
>  #ifdef STATE_TRACE
>         SOCK_DEBUG(sk, "TCP sk=%p, State %s -> %s\n", sk, statename[oldstate], statename[state]);
>  #endif
> 
> This would change the call hook slightly, moving it to after the state
> change. However unless the unhash is some how visible from the bpf program
> I don't think it should impact existing BPF programs.

Hmm, the current fix also breaks compilation when IPv6 is compiled out, so I had
to take it out for now. :-( I think this needs similar workaround as in kTLS case
in tls_init(). Given this and your above seen side-effect, lets respin all with a
clean fix.

tree:   https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git master
head:   4e1f687e302835e45e2f296392f21cfeb5671303
commit: 4e1f687e302835e45e2f296392f21cfeb5671303 [3/3] bpf: sockmap, fix crash when ipv6 sock is added
config: i386-randconfig-a0-06041847 (attached as .config)
compiler: gcc-4.9 (Debian 4.9.4-2) 4.9.4
reproduce:
        git checkout 4e1f687e302835e45e2f296392f21cfeb5671303
        # save the attached .config to linux build tree
        make ARCH=i386

All errors (new ones prefixed by >>):

   kernel/bpf/sockmap.o: In function `bpf_tcp_ulp_register':
>> kernel/bpf/sockmap.c:1127: undefined reference to `tcpv6_prot'
>> kernel/bpf/sockmap.c:1127: undefined reference to `tcpv6_prot'

vim +1127 kernel/bpf/sockmap.c

  1122	
  1123	static int bpf_tcp_ulp_register(void)
  1124	{
  1125		tcp_bpf_proto = tcp_prot;
  1126		tcp_bpf_proto.close = bpf_tcp_close;
> 1127		tcpv6_bpf_proto = tcpv6_prot;
  1128		tcpv6_bpf_proto.close = bpf_tcp_close;
  1129		/* Once BPF TX ULP is registered it is never unregistered. It
  1130		 * will be in the ULP list for the lifetime of the system. Doing
  1131		 * duplicate registers is not a problem.
  1132		 */
  1133		return tcp_register_ulp(&bpf_tcp_ulp_ops);
  1134	}
  1135	

Thanks,
Daniel
diff mbox series

Patch

diff --git a/kernel/bpf/sockmap.c b/kernel/bpf/sockmap.c
index 52a91d8..9e80118 100644
--- a/kernel/bpf/sockmap.c
+++ b/kernel/bpf/sockmap.c
@@ -41,6 +41,7 @@ 
 #include <linux/mm.h>
 #include <net/strparser.h>
 #include <net/tcp.h>
+#include <net/transp_v6.h>
 #include <linux/ptr_ring.h>
 #include <net/inet_common.h>
 #include <linux/sched/signal.h>
@@ -162,6 +163,8 @@  static bool bpf_tcp_stream_read(const struct sock *sk)
 }
 
 static struct proto tcp_bpf_proto;
+static struct proto tcpv6_bpf_proto;
+
 static int bpf_tcp_init(struct sock *sk)
 {
 	struct smap_psock *psock;
@@ -182,13 +185,21 @@  static int bpf_tcp_init(struct sock *sk)
 	psock->sk_proto = sk->sk_prot;
 
 	if (psock->bpf_tx_msg) {
+		tcpv6_bpf_proto.sendmsg = bpf_tcp_sendmsg;
+		tcpv6_bpf_proto.sendpage = bpf_tcp_sendpage;
+		tcpv6_bpf_proto.recvmsg = bpf_tcp_recvmsg;
+		tcpv6_bpf_proto.stream_memory_read = bpf_tcp_stream_read;
 		tcp_bpf_proto.sendmsg = bpf_tcp_sendmsg;
 		tcp_bpf_proto.sendpage = bpf_tcp_sendpage;
 		tcp_bpf_proto.recvmsg = bpf_tcp_recvmsg;
 		tcp_bpf_proto.stream_memory_read = bpf_tcp_stream_read;
 	}
 
-	sk->sk_prot = &tcp_bpf_proto;
+	if (sk->sk_family == AF_INET6)
+		sk->sk_prot = &tcpv6_bpf_proto;
+	else
+		sk->sk_prot = &tcp_bpf_proto;
+
 	rcu_read_unlock();
 	return 0;
 }
@@ -1113,6 +1124,8 @@  static int bpf_tcp_ulp_register(void)
 {
 	tcp_bpf_proto = tcp_prot;
 	tcp_bpf_proto.close = bpf_tcp_close;
+	tcpv6_bpf_proto = tcpv6_prot;
+	tcpv6_bpf_proto.close = bpf_tcp_close;
 	/* Once BPF TX ULP is registered it is never unregistered. It
 	 * will be in the ULP list for the lifetime of the system. Doing
 	 * duplicate registers is not a problem.
@@ -1716,6 +1729,14 @@  static int __sock_map_ctx_update_elem(struct bpf_map *map,
 	bool new = false;
 	int err = 0;
 
+	/* ULPs are currently supported only for TCP sockets in ESTABLISHED
+	 * state. Supporting sockets in LISTEN state will require us to
+	 * modify the accept implementation to clone rather then share the
+	 * ulp context.
+	 */
+	if (sock->sk_state != TCP_ESTABLISHED)
+		return -ENOTSUPP;
+
 	/* 1. If sock map has BPF programs those will be inherited by the
 	 * sock being added. If the sock is already attached to BPF programs
 	 * this results in an error.