Message ID | 5be9e5da904e5930b3a376c4aae6e2c908a6ba3f.1591884399.git.dcaratti@redhat.com |
---|---|
State | Accepted, archived |
Delegated to: | Matthieu Baerts |
Headers | show |
Series | [net-next] mptcp: fallback in case of simultaneous connect | expand |
On Thu, 11 Jun 2020, Davide Caratti wrote: > when a MPTCP client tries to connect to itself, tcp_finish_connect() is > never reached. Because of this, depending on the socket current state, > multiple faulty behaviours can be observed: > > 1) a WARN_ON() in subflow_data_ready() is hit > WARNING: CPU: 2 PID: 882 at net/mptcp/subflow.c:911 subflow_data_ready+0x18b/0x230 > [...] > CPU: 2 PID: 882 Comm: gh35 Not tainted 5.7.0+ #187 > [...] > RIP: 0010:subflow_data_ready+0x18b/0x230 > [...] > Call Trace: > tcp_data_queue+0xd2f/0x4250 > tcp_rcv_state_process+0xb1c/0x49d3 > tcp_v4_do_rcv+0x2bc/0x790 > __release_sock+0x153/0x2d0 > release_sock+0x4f/0x170 > mptcp_shutdown+0x167/0x4e0 > __sys_shutdown+0xe6/0x180 > __x64_sys_shutdown+0x50/0x70 > do_syscall_64+0x9a/0x370 > entry_SYSCALL_64_after_hwframe+0x44/0xa9 > > 2) client is stuck forever in mptcp_sendmsg() because the socket is not > TCP_ESTABLISHED > > crash> bt 4847 > PID: 4847 TASK: ffff88814b2fb100 CPU: 1 COMMAND: "gh35" > #0 [ffff8881376ff680] __schedule at ffffffff97248da4 > #1 [ffff8881376ff778] schedule at ffffffff9724a34f > #2 [ffff8881376ff7a0] schedule_timeout at ffffffff97252ba0 > #3 [ffff8881376ff8a8] wait_woken at ffffffff958ab4ba > #4 [ffff8881376ff940] sk_stream_wait_connect at ffffffff96c2d859 > #5 [ffff8881376ffa28] mptcp_sendmsg at ffffffff97207fca > #6 [ffff8881376ffbc0] sock_sendmsg at ffffffff96be1b5b > #7 [ffff8881376ffbe8] sock_write_iter at ffffffff96be1daa > #8 [ffff8881376ffce8] new_sync_write at ffffffff95e5cb52 > #9 [ffff8881376ffe50] vfs_write at ffffffff95e6547f > #10 [ffff8881376ffe90] ksys_write at ffffffff95e65d26 > #11 [ffff8881376fff28] do_syscall_64 at ffffffff956088ba > #12 [ffff8881376fff50] entry_SYSCALL_64_after_hwframe at ffffffff9740008c > RIP: 00007f126f6956ed RSP: 00007ffc2a320278 RFLAGS: 00000217 > RAX: ffffffffffffffda RBX: 0000000020000044 RCX: 00007f126f6956ed > RDX: 0000000000000004 RSI: 00000000004007b8 RDI: 0000000000000003 > RBP: 00007ffc2a3202a0 R8: 0000000000400720 R9: 0000000000400720 > R10: 0000000000400720 R11: 0000000000000217 R12: 00000000004004b0 > R13: 00007ffc2a320380 R14: 0000000000000000 R15: 0000000000000000 > ORIG_RAX: 0000000000000001 CS: 0033 SS: 002b > > 3) tcpdump captures show that DSS is exchanged even when MP_CAPABLE handshake > didn't complete. > > $ tcpdump -tnnr bad.pcap > IP 127.0.0.1.20000 > 127.0.0.1.20000: Flags [S], seq 3208913911, win 65483, options [mss 65495,sackOK,TS val 3291706876 ecr 3291694721,nop,wscale 7,mptcp capable v1], length 0 > IP 127.0.0.1.20000 > 127.0.0.1.20000: Flags [S.], seq 3208913911, ack 3208913912, win 65483, options [mss 65495,sackOK,TS val 3291706876 ecr 3291706876,nop,wscale 7,mptcp capable v1], length 0 > IP 127.0.0.1.20000 > 127.0.0.1.20000: Flags [.], ack 1, win 512, options [nop,nop,TS val 3291706876 ecr 3291706876], length 0 > IP 127.0.0.1.20000 > 127.0.0.1.20000: Flags [F.], seq 1, ack 1, win 512, options [nop,nop,TS val 3291707876 ecr 3291706876,mptcp dss fin seq 0 subseq 0 len 1,nop,nop], length 0 > IP 127.0.0.1.20000 > 127.0.0.1.20000: Flags [.], ack 2, win 512, options [nop,nop,TS val 3291707876 ecr 3291707876], length 0 > > force a fallback to TCP in these cases, and adjust the main socket > state to avoid hanging in mptcp_sendmsg(). > > BugLink: https://github.com/multipath-tcp/mptcp_net-next/issues/35 > Reported-by: Christoph Paasch <cpaasch@apple.com> > Suggested-by: Paolo Abeni <pabeni@redhat.com> > Signed-off-by: Davide Caratti <dcaratti@redhat.com> > --- > net/mptcp/protocol.h | 10 ++++++++++ > net/mptcp/subflow.c | 10 ++++++++++ > 2 files changed, 20 insertions(+) > > diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h > index dc3bfcbc5ba29..634216e03f777 100644 > --- a/net/mptcp/protocol.h > +++ b/net/mptcp/protocol.h > @@ -507,4 +507,14 @@ static inline void mptcp_do_fallback(struct sock *sk) > #define pr_fallback(a) do { pr_debug("%s:fallback to TCP (msk=%p)",\ > __FUNCTION__, a); } while (0) > > +static inline bool subflow_simultaneous_connect(struct sock *sk) > +{ > + struct mptcp_subflow_context *subflow = mptcp_subflow_ctx(sk); > + struct sock *parent = subflow->conn; > + > + return sk->sk_state == TCP_ESTABLISHED && > + !mptcp_sk(parent)->pm.server_side && > + !subflow->conn_finished; > +} > + > #endif /* __MPTCP_PROTOCOL_H */ > diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c > index 417a82c12dddd..5a1e87301124f 100644 > --- a/net/mptcp/subflow.c > +++ b/net/mptcp/subflow.c > @@ -1119,6 +1119,16 @@ static void subflow_state_change(struct sock *sk) > > __subflow_state_change(sk); > > + if (subflow_simultaneous_connect(sk)) { > + mptcp_do_fallback(sk); > + pr_fallback(mptcp_sk(parent)); > + subflow->conn_finished = 1; > + if (inet_sk_state_load(parent) == TCP_SYN_SENT) { > + inet_sk_state_store(parent, TCP_ESTABLISHED); > + parent->sk_state_change(parent); > + } > + } > + > /* as recvmsg() does not acquire the subflow socket for ssk selection > * a fin packet carrying a DSS can be unnoticed if we don't trigger > * the data available machinery here. > -- > 2.26.2 Thanks for the v2 Davide, looks good to me. -- Mat Martineau Intel
Hi Davide, Mat, On 12/06/2020 00:57, Mat Martineau wrote: > On Thu, 11 Jun 2020, Davide Caratti wrote: > >> when a MPTCP client tries to connect to itself, tcp_finish_connect() is >> never reached. Because of this, depending on the socket current state, >> multiple faulty behaviours can be observed: (...) > Thanks for the v2 Davide, looks good to me. Thank you for the patch and the review! Just added after "net: mptcp: improve fallback to TCP": - 295da4ec8264: mptcp: fallback in case of simultaneous connect (I replaced "BugFixes:" with "Closes:" tag to have the reference in Github.) Tests + export are in progress! Cheers, Matt
diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h index dc3bfcbc5ba29..634216e03f777 100644 --- a/net/mptcp/protocol.h +++ b/net/mptcp/protocol.h @@ -507,4 +507,14 @@ static inline void mptcp_do_fallback(struct sock *sk) #define pr_fallback(a) do { pr_debug("%s:fallback to TCP (msk=%p)",\ __FUNCTION__, a); } while (0) +static inline bool subflow_simultaneous_connect(struct sock *sk) +{ + struct mptcp_subflow_context *subflow = mptcp_subflow_ctx(sk); + struct sock *parent = subflow->conn; + + return sk->sk_state == TCP_ESTABLISHED && + !mptcp_sk(parent)->pm.server_side && + !subflow->conn_finished; +} + #endif /* __MPTCP_PROTOCOL_H */ diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c index 417a82c12dddd..5a1e87301124f 100644 --- a/net/mptcp/subflow.c +++ b/net/mptcp/subflow.c @@ -1119,6 +1119,16 @@ static void subflow_state_change(struct sock *sk) __subflow_state_change(sk); + if (subflow_simultaneous_connect(sk)) { + mptcp_do_fallback(sk); + pr_fallback(mptcp_sk(parent)); + subflow->conn_finished = 1; + if (inet_sk_state_load(parent) == TCP_SYN_SENT) { + inet_sk_state_store(parent, TCP_ESTABLISHED); + parent->sk_state_change(parent); + } + } + /* as recvmsg() does not acquire the subflow socket for ssk selection * a fin packet carrying a DSS can be unnoticed if we don't trigger * the data available machinery here.
when a MPTCP client tries to connect to itself, tcp_finish_connect() is never reached. Because of this, depending on the socket current state, multiple faulty behaviours can be observed: 1) a WARN_ON() in subflow_data_ready() is hit WARNING: CPU: 2 PID: 882 at net/mptcp/subflow.c:911 subflow_data_ready+0x18b/0x230 [...] CPU: 2 PID: 882 Comm: gh35 Not tainted 5.7.0+ #187 [...] RIP: 0010:subflow_data_ready+0x18b/0x230 [...] Call Trace: tcp_data_queue+0xd2f/0x4250 tcp_rcv_state_process+0xb1c/0x49d3 tcp_v4_do_rcv+0x2bc/0x790 __release_sock+0x153/0x2d0 release_sock+0x4f/0x170 mptcp_shutdown+0x167/0x4e0 __sys_shutdown+0xe6/0x180 __x64_sys_shutdown+0x50/0x70 do_syscall_64+0x9a/0x370 entry_SYSCALL_64_after_hwframe+0x44/0xa9 2) client is stuck forever in mptcp_sendmsg() because the socket is not TCP_ESTABLISHED crash> bt 4847 PID: 4847 TASK: ffff88814b2fb100 CPU: 1 COMMAND: "gh35" #0 [ffff8881376ff680] __schedule at ffffffff97248da4 #1 [ffff8881376ff778] schedule at ffffffff9724a34f #2 [ffff8881376ff7a0] schedule_timeout at ffffffff97252ba0 #3 [ffff8881376ff8a8] wait_woken at ffffffff958ab4ba #4 [ffff8881376ff940] sk_stream_wait_connect at ffffffff96c2d859 #5 [ffff8881376ffa28] mptcp_sendmsg at ffffffff97207fca #6 [ffff8881376ffbc0] sock_sendmsg at ffffffff96be1b5b #7 [ffff8881376ffbe8] sock_write_iter at ffffffff96be1daa #8 [ffff8881376ffce8] new_sync_write at ffffffff95e5cb52 #9 [ffff8881376ffe50] vfs_write at ffffffff95e6547f #10 [ffff8881376ffe90] ksys_write at ffffffff95e65d26 #11 [ffff8881376fff28] do_syscall_64 at ffffffff956088ba #12 [ffff8881376fff50] entry_SYSCALL_64_after_hwframe at ffffffff9740008c RIP: 00007f126f6956ed RSP: 00007ffc2a320278 RFLAGS: 00000217 RAX: ffffffffffffffda RBX: 0000000020000044 RCX: 00007f126f6956ed RDX: 0000000000000004 RSI: 00000000004007b8 RDI: 0000000000000003 RBP: 00007ffc2a3202a0 R8: 0000000000400720 R9: 0000000000400720 R10: 0000000000400720 R11: 0000000000000217 R12: 00000000004004b0 R13: 00007ffc2a320380 R14: 0000000000000000 R15: 0000000000000000 ORIG_RAX: 0000000000000001 CS: 0033 SS: 002b 3) tcpdump captures show that DSS is exchanged even when MP_CAPABLE handshake didn't complete. $ tcpdump -tnnr bad.pcap IP 127.0.0.1.20000 > 127.0.0.1.20000: Flags [S], seq 3208913911, win 65483, options [mss 65495,sackOK,TS val 3291706876 ecr 3291694721,nop,wscale 7,mptcp capable v1], length 0 IP 127.0.0.1.20000 > 127.0.0.1.20000: Flags [S.], seq 3208913911, ack 3208913912, win 65483, options [mss 65495,sackOK,TS val 3291706876 ecr 3291706876,nop,wscale 7,mptcp capable v1], length 0 IP 127.0.0.1.20000 > 127.0.0.1.20000: Flags [.], ack 1, win 512, options [nop,nop,TS val 3291706876 ecr 3291706876], length 0 IP 127.0.0.1.20000 > 127.0.0.1.20000: Flags [F.], seq 1, ack 1, win 512, options [nop,nop,TS val 3291707876 ecr 3291706876,mptcp dss fin seq 0 subseq 0 len 1,nop,nop], length 0 IP 127.0.0.1.20000 > 127.0.0.1.20000: Flags [.], ack 2, win 512, options [nop,nop,TS val 3291707876 ecr 3291707876], length 0 force a fallback to TCP in these cases, and adjust the main socket state to avoid hanging in mptcp_sendmsg(). BugLink: https://github.com/multipath-tcp/mptcp_net-next/issues/35 Reported-by: Christoph Paasch <cpaasch@apple.com> Suggested-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Davide Caratti <dcaratti@redhat.com> --- net/mptcp/protocol.h | 10 ++++++++++ net/mptcp/subflow.c | 10 ++++++++++ 2 files changed, 20 insertions(+)