From patchwork Wed Mar 26 19:48:38 2014 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kamal Mostafa X-Patchwork-Id: 334064 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from huckleberry.canonical.com (huckleberry.canonical.com [91.189.94.19]) by ozlabs.org (Postfix) with ESMTP id A2FD01400A2 for ; Thu, 27 Mar 2014 06:49:08 +1100 (EST) Received: from localhost ([127.0.0.1] helo=huckleberry.canonical.com) by huckleberry.canonical.com with esmtp (Exim 4.76) (envelope-from ) id 1WStou-0006Kw-58; Wed, 26 Mar 2014 19:49:04 +0000 Received: from youngberry.canonical.com ([91.189.89.112]) by huckleberry.canonical.com with esmtp (Exim 4.76) (envelope-from ) id 1WStoX-0006D6-B9 for kernel-team@lists.ubuntu.com; Wed, 26 Mar 2014 19:48:41 +0000 Received: from c-67-160-228-185.hsd1.ca.comcast.net ([67.160.228.185] helo=fourier) by youngberry.canonical.com with esmtpsa (TLS1.0:DHE_RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1WStoW-00085x-Qz; Wed, 26 Mar 2014 19:48:41 +0000 Received: from kamal by fourier with local (Exim 4.80) (envelope-from ) id 1WStoV-0002eM-0C; Wed, 26 Mar 2014 12:48:39 -0700 From: Kamal Mostafa To: John Ogness Subject: [3.8.y.z extended stable] Patch "tcp: tsq: fix nonagle handling" has been added to staging queue Date: Wed, 26 Mar 2014 12:48:38 -0700 Message-Id: <1395863318-10156-1-git-send-email-kamal@canonical.com> X-Mailer: git-send-email 1.8.3.2 X-Extended-Stable: 3.8 Cc: Kamal Mostafa , Eric Dumazet , Thomas Glanzmann , "David S. Miller" , kernel-team@lists.ubuntu.com X-BeenThere: kernel-team@lists.ubuntu.com X-Mailman-Version: 2.1.14 Precedence: list List-Id: Kernel team discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , MIME-Version: 1.0 Errors-To: kernel-team-bounces@lists.ubuntu.com Sender: kernel-team-bounces@lists.ubuntu.com This is a note to let you know that I have just added a patch titled tcp: tsq: fix nonagle handling to the linux-3.8.y-queue branch of the 3.8.y.z extended stable tree which can be found at: http://kernel.ubuntu.com/git?p=ubuntu/linux.git;a=shortlog;h=refs/heads/linux-3.8.y-queue This patch is scheduled to be released in version 3.8.13.21. If you, or anyone else, feels it should not be added to this tree, please reply to this email. For more information about the 3.8.y.z tree, see https://wiki.ubuntu.com/Kernel/Dev/ExtendedStable Thanks. -Kamal ------ From 661bd35c89572d5050196e8e65162dac6a72d173 Mon Sep 17 00:00:00 2001 From: John Ogness Date: Sun, 9 Feb 2014 18:40:11 -0800 Subject: tcp: tsq: fix nonagle handling [ Upstream commit bf06200e732de613a1277984bf34d1a21c2de03d ] Commit 46d3ceabd8d9 ("tcp: TCP Small Queues") introduced a possible regression for applications using TCP_NODELAY. If TCP session is throttled because of tsq, we should consult tp->nonagle when TX completion is done and allow us to send additional segment, especially if this segment is not a full MSS. Otherwise this segment is sent after an RTO. [edumazet] : Cooked the changelog, added another fix about testing sk_wmem_alloc twice because TX completion can happen right before setting TSQ_THROTTLED bit. This problem is particularly visible with recent auto corking, but might also be triggered with low tcp_limit_output_bytes values or NIC drivers delaying TX completion by hundred of usec, and very low rtt. Thomas Glanzmann for example reported an iscsi regression, caused by tcp auto corking making this bug quite visible. Fixes: 46d3ceabd8d9 ("tcp: TCP Small Queues") Signed-off-by: John Ogness Signed-off-by: Eric Dumazet Reported-by: Thomas Glanzmann Signed-off-by: David S. Miller Signed-off-by: Kamal Mostafa --- net/ipv4/tcp_output.c | 13 +++++++++++-- 1 file changed, 11 insertions(+), 2 deletions(-) -- 1.8.3.2 diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c index aa215b8..df953b3 100644 --- a/net/ipv4/tcp_output.c +++ b/net/ipv4/tcp_output.c @@ -851,7 +851,8 @@ static void tcp_tsq_handler(struct sock *sk) if ((1 << sk->sk_state) & (TCPF_ESTABLISHED | TCPF_FIN_WAIT1 | TCPF_CLOSING | TCPF_CLOSE_WAIT | TCPF_LAST_ACK)) - tcp_write_xmit(sk, tcp_current_mss(sk), 0, 0, GFP_ATOMIC); + tcp_write_xmit(sk, tcp_current_mss(sk), tcp_sk(sk)->nonagle, + 0, GFP_ATOMIC); } /* * One tasklest per cpu tries to send more skbs. @@ -2028,7 +2029,15 @@ static bool tcp_write_xmit(struct sock *sk, unsigned int mss_now, int nonagle, if (atomic_read(&sk->sk_wmem_alloc) > limit) { set_bit(TSQ_THROTTLED, &tp->tsq_flags); - break; + /* It is possible TX completion already happened + * before we set TSQ_THROTTLED, so we must + * test again the condition. + * We abuse smp_mb__after_clear_bit() because + * there is no smp_mb__after_set_bit() yet + */ + smp_mb__after_clear_bit(); + if (atomic_read(&sk->sk_wmem_alloc) > limit) + break; } limit = mss_now;