Message ID | 20180604152125.6930.88723.stgit@john-Precision-Tower-5810 |
---|---|
State | Changes Requested, archived |
Delegated to: | BPF Maintainers |
Headers | show |
Series | [bpf-next] bpf: sockmap, fix crash when ipv6 sock is added | expand |
On 06/04/2018 05:21 PM, John Fastabend wrote: > This fixes a crash where we assign tcp_prot to IPv6 sockets instead > of tcpv6_prot. > > Previously we overwrote the sk->prot field with tcp_prot even in the > AF_INET6 case. This patch ensures the correct tcp_prot and tcpv6_prot > are used. Further, only allow ESTABLISHED connections to join the > map per note in TLS ULP, > > /* The TLS ulp is currently supported only for TCP sockets > * in ESTABLISHED state. > * Supporting sockets in LISTEN state will require us > * to modify the accept implementation to clone rather then > * share the ulp context. > */ > > Also tested with 'netserver -6' and 'netperf -H [IPv6]' as well as > 'netperf -H [IPv4]'. The ESTABLISHED check resolves the previously > crashing case here. > > Fixes: 174a79ff9515 ("bpf: sockmap with sk redirect support") > Reported-by: syzbot+5c063698bdbfac19f363@syzkaller.appspotmail.com > Signed-off-by: John Fastabend <john.fastabend@gmail.com> > Signed-off-by: Wei Wang <weiwan@google.com> Applied to bpf-next, thanks everyone!
On 06/04/2018 12:59 PM, Daniel Borkmann wrote: > On 06/04/2018 05:21 PM, John Fastabend wrote: >> This fixes a crash where we assign tcp_prot to IPv6 sockets instead >> of tcpv6_prot. >> >> Previously we overwrote the sk->prot field with tcp_prot even in the >> AF_INET6 case. This patch ensures the correct tcp_prot and tcpv6_prot >> are used. Further, only allow ESTABLISHED connections to join the >> map per note in TLS ULP, >> >> /* The TLS ulp is currently supported only for TCP sockets >> * in ESTABLISHED state. >> * Supporting sockets in LISTEN state will require us >> * to modify the accept implementation to clone rather then >> * share the ulp context. >> */ >> >> Also tested with 'netserver -6' and 'netperf -H [IPv6]' as well as >> 'netperf -H [IPv4]'. The ESTABLISHED check resolves the previously >> crashing case here. >> >> Fixes: 174a79ff9515 ("bpf: sockmap with sk redirect support") >> Reported-by: syzbot+5c063698bdbfac19f363@syzkaller.appspotmail.com >> Signed-off-by: John Fastabend <john.fastabend@gmail.com> >> Signed-off-by: Wei Wang <weiwan@google.com> > > Applied to bpf-next, thanks everyone! > Thanks Daniel, this has the unfortunate side-effect though of making it hard to add sockets transitioning from LISTEN into ESTABLISHED states to a sockmap. Before this patch we could add sockets from the sock_ops event BPF_SOCK_OPS_PASSIVE_ESTABLISHED_CB to a sock{map|hash}. However, this is before a socket is in established state so risked crashing and wasn't valid at all per this thread. So I believe its correct to block this action, seeing it will crash a system in many (most!) cases. That said we still would like to support pushing sockets into a sock{map|hash} in this case. I thought about adding a new hook but we already have a few sock op hooks in the TCP stack so its too bad we don't have one that fires after the ESTABLISHED state has transitioned. Right now I'm looking into seeing if the following would have any issues, --- a/net/ipv4/tcp.c +++ b/net/ipv4/tcp.c @@ -2206,9 +2206,6 @@ void tcp_set_state(struct sock *sk, int state) BUILD_BUG_ON((int)BPF_TCP_NEW_SYN_RECV != (int)TCP_NEW_SYN_RECV); BUILD_BUG_ON((int)BPF_TCP_MAX_STATES != (int)TCP_MAX_STATES); - if (BPF_SOCK_OPS_TEST_FLAG(tcp_sk(sk), BPF_SOCK_OPS_STATE_CB_FLAG)) - tcp_call_bpf_2arg(sk, BPF_SOCK_OPS_STATE_CB, oldstate, state); - switch (state) { case TCP_ESTABLISHED: if (oldstate != TCP_ESTABLISHED) @@ -2234,6 +2231,9 @@ void tcp_set_state(struct sock *sk, int state) */ inet_sk_state_store(sk, state); + if (BPF_SOCK_OPS_TEST_FLAG(tcp_sk(sk), BPF_SOCK_OPS_STATE_CB_FLAG)) + tcp_call_bpf_2arg(sk, BPF_SOCK_OPS_STATE_CB, oldstate, state); + #ifdef STATE_TRACE SOCK_DEBUG(sk, "TCP sk=%p, State %s -> %s\n", sk, statename[oldstate], statename[state]); #endif This would change the call hook slightly, moving it to after the state change. However unless the unhash is some how visible from the bpf program I don't think it should impact existing BPF programs. Thanks, John
On 06/05/2018 01:08 AM, John Fastabend wrote: > On 06/04/2018 12:59 PM, Daniel Borkmann wrote: >> On 06/04/2018 05:21 PM, John Fastabend wrote: >>> This fixes a crash where we assign tcp_prot to IPv6 sockets instead >>> of tcpv6_prot. >>> >>> Previously we overwrote the sk->prot field with tcp_prot even in the >>> AF_INET6 case. This patch ensures the correct tcp_prot and tcpv6_prot >>> are used. Further, only allow ESTABLISHED connections to join the >>> map per note in TLS ULP, >>> >>> /* The TLS ulp is currently supported only for TCP sockets >>> * in ESTABLISHED state. >>> * Supporting sockets in LISTEN state will require us >>> * to modify the accept implementation to clone rather then >>> * share the ulp context. >>> */ >>> >>> Also tested with 'netserver -6' and 'netperf -H [IPv6]' as well as >>> 'netperf -H [IPv4]'. The ESTABLISHED check resolves the previously >>> crashing case here. >>> >>> Fixes: 174a79ff9515 ("bpf: sockmap with sk redirect support") >>> Reported-by: syzbot+5c063698bdbfac19f363@syzkaller.appspotmail.com >>> Signed-off-by: John Fastabend <john.fastabend@gmail.com> >>> Signed-off-by: Wei Wang <weiwan@google.com> >> >> Applied to bpf-next, thanks everyone! > > Thanks Daniel, this has the unfortunate side-effect though of > making it hard to add sockets transitioning from LISTEN into > ESTABLISHED states to a sockmap. Before this patch we could add > sockets from the sock_ops event BPF_SOCK_OPS_PASSIVE_ESTABLISHED_CB > to a sock{map|hash}. However, this is before a socket is in established > state so risked crashing and wasn't valid at all per this thread. So > I believe its correct to block this action, seeing it will crash a > system in many (most!) cases. > > That said we still would like to support pushing sockets into a > sock{map|hash} in this case. I thought about adding a new hook but > we already have a few sock op hooks in the TCP stack so its too bad we > don't have one that fires after the ESTABLISHED state has transitioned. > Right now I'm looking into seeing if the following would have any > issues, > > --- a/net/ipv4/tcp.c > +++ b/net/ipv4/tcp.c > @@ -2206,9 +2206,6 @@ void tcp_set_state(struct sock *sk, int state) > BUILD_BUG_ON((int)BPF_TCP_NEW_SYN_RECV != (int)TCP_NEW_SYN_RECV); > BUILD_BUG_ON((int)BPF_TCP_MAX_STATES != (int)TCP_MAX_STATES); > > - if (BPF_SOCK_OPS_TEST_FLAG(tcp_sk(sk), BPF_SOCK_OPS_STATE_CB_FLAG)) > - tcp_call_bpf_2arg(sk, BPF_SOCK_OPS_STATE_CB, oldstate, state); > - > switch (state) { > case TCP_ESTABLISHED: > if (oldstate != TCP_ESTABLISHED) > @@ -2234,6 +2231,9 @@ void tcp_set_state(struct sock *sk, int state) > */ > inet_sk_state_store(sk, state); > > + if (BPF_SOCK_OPS_TEST_FLAG(tcp_sk(sk), BPF_SOCK_OPS_STATE_CB_FLAG)) > + tcp_call_bpf_2arg(sk, BPF_SOCK_OPS_STATE_CB, oldstate, state); > + > #ifdef STATE_TRACE > SOCK_DEBUG(sk, "TCP sk=%p, State %s -> %s\n", sk, statename[oldstate], statename[state]); > #endif > > This would change the call hook slightly, moving it to after the state > change. However unless the unhash is some how visible from the bpf program > I don't think it should impact existing BPF programs. Hmm, the current fix also breaks compilation when IPv6 is compiled out, so I had to take it out for now. :-( I think this needs similar workaround as in kTLS case in tls_init(). Given this and your above seen side-effect, lets respin all with a clean fix. tree: https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git master head: 4e1f687e302835e45e2f296392f21cfeb5671303 commit: 4e1f687e302835e45e2f296392f21cfeb5671303 [3/3] bpf: sockmap, fix crash when ipv6 sock is added config: i386-randconfig-a0-06041847 (attached as .config) compiler: gcc-4.9 (Debian 4.9.4-2) 4.9.4 reproduce: git checkout 4e1f687e302835e45e2f296392f21cfeb5671303 # save the attached .config to linux build tree make ARCH=i386 All errors (new ones prefixed by >>): kernel/bpf/sockmap.o: In function `bpf_tcp_ulp_register': >> kernel/bpf/sockmap.c:1127: undefined reference to `tcpv6_prot' >> kernel/bpf/sockmap.c:1127: undefined reference to `tcpv6_prot' vim +1127 kernel/bpf/sockmap.c 1122 1123 static int bpf_tcp_ulp_register(void) 1124 { 1125 tcp_bpf_proto = tcp_prot; 1126 tcp_bpf_proto.close = bpf_tcp_close; > 1127 tcpv6_bpf_proto = tcpv6_prot; 1128 tcpv6_bpf_proto.close = bpf_tcp_close; 1129 /* Once BPF TX ULP is registered it is never unregistered. It 1130 * will be in the ULP list for the lifetime of the system. Doing 1131 * duplicate registers is not a problem. 1132 */ 1133 return tcp_register_ulp(&bpf_tcp_ulp_ops); 1134 } 1135 Thanks, Daniel
diff --git a/kernel/bpf/sockmap.c b/kernel/bpf/sockmap.c index 52a91d8..9e80118 100644 --- a/kernel/bpf/sockmap.c +++ b/kernel/bpf/sockmap.c @@ -41,6 +41,7 @@ #include <linux/mm.h> #include <net/strparser.h> #include <net/tcp.h> +#include <net/transp_v6.h> #include <linux/ptr_ring.h> #include <net/inet_common.h> #include <linux/sched/signal.h> @@ -162,6 +163,8 @@ static bool bpf_tcp_stream_read(const struct sock *sk) } static struct proto tcp_bpf_proto; +static struct proto tcpv6_bpf_proto; + static int bpf_tcp_init(struct sock *sk) { struct smap_psock *psock; @@ -182,13 +185,21 @@ static int bpf_tcp_init(struct sock *sk) psock->sk_proto = sk->sk_prot; if (psock->bpf_tx_msg) { + tcpv6_bpf_proto.sendmsg = bpf_tcp_sendmsg; + tcpv6_bpf_proto.sendpage = bpf_tcp_sendpage; + tcpv6_bpf_proto.recvmsg = bpf_tcp_recvmsg; + tcpv6_bpf_proto.stream_memory_read = bpf_tcp_stream_read; tcp_bpf_proto.sendmsg = bpf_tcp_sendmsg; tcp_bpf_proto.sendpage = bpf_tcp_sendpage; tcp_bpf_proto.recvmsg = bpf_tcp_recvmsg; tcp_bpf_proto.stream_memory_read = bpf_tcp_stream_read; } - sk->sk_prot = &tcp_bpf_proto; + if (sk->sk_family == AF_INET6) + sk->sk_prot = &tcpv6_bpf_proto; + else + sk->sk_prot = &tcp_bpf_proto; + rcu_read_unlock(); return 0; } @@ -1113,6 +1124,8 @@ static int bpf_tcp_ulp_register(void) { tcp_bpf_proto = tcp_prot; tcp_bpf_proto.close = bpf_tcp_close; + tcpv6_bpf_proto = tcpv6_prot; + tcpv6_bpf_proto.close = bpf_tcp_close; /* Once BPF TX ULP is registered it is never unregistered. It * will be in the ULP list for the lifetime of the system. Doing * duplicate registers is not a problem. @@ -1716,6 +1729,14 @@ static int __sock_map_ctx_update_elem(struct bpf_map *map, bool new = false; int err = 0; + /* ULPs are currently supported only for TCP sockets in ESTABLISHED + * state. Supporting sockets in LISTEN state will require us to + * modify the accept implementation to clone rather then share the + * ulp context. + */ + if (sock->sk_state != TCP_ESTABLISHED) + return -ENOTSUPP; + /* 1. If sock map has BPF programs those will be inherited by the * sock being added. If the sock is already attached to BPF programs * this results in an error.