Message ID | 1322059124.17693.24.camel@edumazet-HP-Compaq-6005-Pro-SFF-PC |
---|---|
State | Accepted, archived |
Delegated to: | David Miller |
From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Wed, 23 Nov 2011 15:38:44 +0100

> [PATCH] ipv6: tcp: fix tcp_v6_conn_request()
>
> Since linux 2.6.26 (commit c6aefafb7ec6 : Add IPv6 support to TCP SYN
> cookies), we can drop a SYN packet reusing a TIME_WAIT socket.
>
> (As a matter of fact we fail to send the SYNACK answer)
>
> As the client resends its SYN packet after a one second timeout, we
> accept it, because first packet removed the TIME_WAIT socket before
> being dropped.
>
> This probably explains why nobody ever noticed or complained.
>
> Reported-by: Jesse Young <jlyo@jlyo.org>
> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>

Applied and queued up for -stable, thanks!
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
On Fri, 2013-03-22 at 19:03 -0700, parasytic@gmail.com wrote:
> Hi List!
>
> First, I'm sorry for resurrecting an extremely old thread, but I've
> exhausted all other resources. We're experiencing this same "1 second
> retransmit" with ipv4 (including loopback). And the best part is, it
> can be replicated very easily using the 'closed' and 'tcping' tests
> provided by Jesse Young in the initial post. For reference:
>
> $ git clone git://github.com/jlyo/tcping.git
> $ cd tcping && make
>
> $ git clone git://github.com/jlyo/closed.git
> $ cd closed && make
>
> $ ./closed 0.0.0.0
>
> $ time ./tcping -f -p8009 0.0.0.0
>
> Results:
>
> ...
> response from 0.0.0.0:8009, seq=1907 time=0.02 ms
> response from 0.0.0.0:8009, seq=1908 time=0.03 ms
> response from 0.0.0.0:8009, seq=1909 time=999.11 ms
> --- 0.0.0.0:8009 ping statistics ---
> 1909 responses, 1910 ok, 0.00% failed
> round-trip min/avg/max = 0.0/0.6/999.1 ms
>
> real 0m1.125s
> user 0m0.008s
> sys 0m0.104s
>
> Packet captures from tcpdump look remarkably similar to what Eric
> Dumazet shared. That eventually led me to this thread.
>
> This happens on a fresh Ubuntu 12.10 install, and also with our tuning
> parameters. (Includes increasing the syn backlog, open file
> descriptors, TCP memory, max orphans, etc.) I've also seen the
> problem with other kernels, within EC2 and Azure. I have not been able
> to test with ipv6 yet.
>
> $ uname -a
> Linux test 3.5.0-21-generic #32-Ubuntu SMP Tue Dec 11 18:51:59 UTC
> 2012 x86_64 x86_64 x86_64 GNU/Linux
>
> I'm hoping to spark some interest in revisiting this issue (with focus
> on ipv4, this time).
>
> Thanks everyone!
> Jay
>

Hi Jay

Not reproducible on current kernels (net-next tree for example)

ip netns add eric
ip netns exec eric ifconfig -a
ip netns exec eric ifconfig lo 127.0.0.1 up
ip netns exec eric ./closed 0.0.0.0 &
ip netns exec eric nstat
ip netns exec eric ./tcping -f -p8009 0.0.0.0

127.0.0.1:40832 Connected...response from 0.0.0.0:8009, seq=32799 time=0.04 ms
closed
127.0.0.1:40999 Connected...response from 0.0.0.0:8009, seq=32800 time=0.04 ms
closed
127.0.0.1:42795 Connected...response from 0.0.0.0:8009, seq=32801 time=0.20 ms
closed
127.0.0.1:43226 Connected...response from 0.0.0.0:8009, seq=32802 time=0.07 ms
closed
error connecting to host (99): Cannot assign requested address
................................................................ [...] ................................................................^C.--- 0.0.0.0:8009 ping statistics ---
33765 responses, 32803 ok, 0.00% failed
round-trip min/avg/max = 0.0/0.0/0.5 ms

# ip netns exec eric nstat
#kernel
IpInReceives                    197087             0.0
IpInDelivers                    197087             0.0
IpOutRequests                   197087             0.0
TcpActiveOpens                  32803              0.0
TcpPassiveOpens                 32803              0.0
TcpInSegs                       197087             0.0
TcpOutSegs                      197084             0.0
TcpRetransSegs                  3                  0.0
TcpOutRsts                      11                 0.0
TcpExtSyncookiesFailed          11                 0.0
TcpExtDelayedACKs               238                0.0
TcpExtDelayedACKLocked          248                0.0
TcpExtTCPPureAcks               65838              0.0
TcpExtTCPTimeouts               3                  0.0
IpExtInOctets                   10773240           0.0
IpExtOutOctets                  10773240           0.0

But yes, on 3.5.X kernel you might hit a bug somewhere.

Since the same sequence gives suspect TcpExtListenDrops :

# ip netns exec eric nstat
#kernel
IpInReceives                    49367              0.0
IpInDelivers                    49367              0.0
IpOutRequests                   49367              0.0
TcpActiveOpens                  8184               0.0
TcpPassiveOpens                 8184               0.0
TcpInSegs                       49367              0.0
TcpOutSegs                      49362              0.0
TcpRetransSegs                  5                  0.0
TcpExtDelayedACKs               63                 0.0
TcpExtDelayedACKLocked          32                 0.0
TcpExtListenOverflows           4                  0.0
TcpExtListenDrops               4                  0.0
TcpExtTCPPureAcks               16624              0.0
TcpExtTCPLossUndo               1                  0.0
TcpExtTCPTimeouts               5                  0.0
IpExtInOctets                   2698036            0.0
IpExtOutOctets                  2698036            0.0
Hello Eric,

On Mar 23, 2013, at 9:58 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote:

> Hi Jay
>
> Not reproducible on current kernels (net-next tree for example)

Thank you for looking into this so quickly! And it sounds like promising news, too. I'll experiment with newer kernels right away.

Thanks again,
Jay
Hi again,

Resending this; it didn't make it to the netdev list.

On Mar 23, 2013, at 12:17 PM, Jason Oster <parasytic@gmail.com> wrote:

> Hello Eric,
>
> Thank you for looking into this so quickly! And it sounds like promising news, too. I'll experiment with newer kernels right away.
>
> Thanks again,
> Jay

FYI net-next may have some changes not available in 3.9.0-rc3, because I can still reproduce it there:

$ uname -a
Linux test 3.9.0-030900rc3-generic #201303171935 SMP Sun Mar 17 23:36:17 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux

This is from the Ubuntu kernel mainline PPA (on Raring Ringtail-dev). I haven't built the kernel yet.
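
For readers who want to see how a ~1 second outlier like the one reported in this thread shows up from userspace, here is a minimal sketch of a connect() latency probe. It is not the tcping tool referenced above, only an illustration of the same idea; the function name connect_latency_ms is made up for this example. A dropped SYN that is retransmitted after the initial timeout would appear as a return value close to 1000.

```c
#define _POSIX_C_SOURCE 200809L
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of tcping): time one blocking TCP
 * connect() to ip:port. Returns the latency in milliseconds, or
 * -1.0 if the address is invalid or the connection fails.
 */
static double connect_latency_ms(const char *ip, unsigned short port)
{
	struct sockaddr_in addr;
	struct timespec t0, t1;
	int fd, rc;

	memset(&addr, 0, sizeof(addr));
	addr.sin_family = AF_INET;
	addr.sin_port = htons(port);
	if (inet_pton(AF_INET, ip, &addr.sin_addr) != 1)
		return -1.0;

	fd = socket(AF_INET, SOCK_STREAM, 0);
	if (fd < 0)
		return -1.0;

	/* Only the three-way handshake is timed, not socket setup. */
	clock_gettime(CLOCK_MONOTONIC, &t0);
	rc = connect(fd, (struct sockaddr *)&addr, sizeof(addr));
	clock_gettime(CLOCK_MONOTONIC, &t1);
	close(fd);

	if (rc < 0)
		return -1.0;

	return (t1.tv_sec - t0.tv_sec) * 1e3 +
	       (t1.tv_nsec - t0.tv_nsec) / 1e6;
}
```

Calling this in a loop against a listener on port 8009 and printing any result above a few hundred milliseconds should be enough to spot the retransmit case, assuming the listener closes each connection promptly so that TIME_WAIT sockets accumulate.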
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index 36131d1..2dea4bb 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -1255,6 +1255,13 @@ static int tcp_v6_conn_request(struct sock *sk, struct sk_buff *skb)
 	if (!want_cookie || tmp_opt.tstamp_ok)
 		TCP_ECN_create_request(req, tcp_hdr(skb));
 
+	treq->iif = sk->sk_bound_dev_if;
+
+	/* So that link locals have meaning */
+	if (!sk->sk_bound_dev_if &&
+	    ipv6_addr_type(&treq->rmt_addr) & IPV6_ADDR_LINKLOCAL)
+		treq->iif = inet6_iif(skb);
+
 	if (!isn) {
 		struct inet_peer *peer = NULL;
 
@@ -1264,12 +1271,6 @@ static int tcp_v6_conn_request(struct sock *sk, struct sk_buff *skb)
 			atomic_inc(&skb->users);
 			treq->pktopts = skb;
 		}
-		treq->iif = sk->sk_bound_dev_if;
-
-		/* So that link locals have meaning */
-		if (!sk->sk_bound_dev_if &&
-		    ipv6_addr_type(&treq->rmt_addr) & IPV6_ADDR_LINKLOCAL)
-			treq->iif = inet6_iif(skb);
 
 		if (want_cookie) {
 			isn = cookie_v6_init_sequence(sk, skb, &req->mss);
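
The control-flow change in the diff can be illustrated with a small userspace model. This is only a sketch, not kernel code: struct model_req and both functions are hypothetical stand-ins, and it assumes that a SYN reusing a TIME_WAIT socket arrives with a nonzero isn, so the `if (!isn)` block is skipped. Under that assumption the old placement never initializes iif for such a SYN, while the new placement always does.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the request field touched by the patch. */
struct model_req {
	int iif;       /* interface index recorded for the request */
	bool iif_set;  /* tracks whether iif was ever assigned */
};

/* Old layout: iif is assigned only inside the !isn branch. */
static void conn_request_before(struct model_req *req, unsigned int isn,
				int bound_dev_if)
{
	if (!isn) {
		req->iif = bound_dev_if;
		req->iif_set = true;
		/* ISN selection would follow here. */
	}
}

/* New layout: iif is assigned unconditionally, before the !isn branch. */
static void conn_request_after(struct model_req *req, unsigned int isn,
			       int bound_dev_if)
{
	req->iif = bound_dev_if;
	req->iif_set = true;

	if (!isn) {
		/* ISN selection only; iif is already valid. */
	}
}
```

With a nonzero isn, conn_request_before() leaves iif_set false, whereas conn_request_after() sets it, which mirrors why moving the assignment out of the branch lets the SYNACK be sent on the first SYN.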