From patchwork Tue Jan 27 20:34:43 2015
X-Patchwork-Submitter: Neal Cardwell
X-Patchwork-Id: 433667
X-Patchwork-Delegate: davem@davemloft.net
From: Neal Cardwell
To: David Miller
Cc: netdev@vger.kernel.org, Neal Cardwell, Yuchung Cheng, Eric Dumazet
Subject: [PATCH net-next 5/5] tcp: fix timing issue in CUBIC slope calculation
Date: Tue, 27 Jan 2015 15:34:43 -0500
Message-Id: <1422390883-15603-6-git-send-email-ncardwell@google.com>
In-Reply-To: <1422390883-15603-1-git-send-email-ncardwell@google.com>
References: <1422390883-15603-1-git-send-email-ncardwell@google.com>
X-Mailing-List: netdev@vger.kernel.org

This patch fixes a bug in CUBIC that causes cwnd to increase slightly
too slowly when multiple ACKs arrive in the same jiffy.

If cwnd is supposed to increase at a rate of more than once per jiffy,
then CUBIC was sometimes too slow.
Because the bic_target is calculated for a future point in time,
calculated with time in jiffies, the cwnd can increase over the course
of the jiffy while the bic_target calculated as the proper CUBIC cwnd
at time t=tcp_time_stamp+rtt does not increase, because tcp_time_stamp
only increases on jiffy tick boundaries.

So since the cnt is set to:
	ca->cnt = cwnd / (bic_target - cwnd);
as cwnd increases but bic_target does not increase due to jiffy
granularity, the cnt becomes too large, causing cwnd to increase too
slowly.

For example:
- suppose at the beginning of a jiffy, cwnd=40, bic_target=44
- so CUBIC sets:
   ca->cnt = cwnd / (bic_target - cwnd) = 40 / (44 - 40) = 40/4 = 10
- suppose we get 10 acks, each for 1 segment, so tcp_cong_avoid_ai()
   increases cwnd to 41
- so CUBIC sets:
   ca->cnt = cwnd / (bic_target - cwnd) = 41 / (44 - 41) = 41 / 3 = 13

So now CUBIC will wait for 13 packets to be ACKed before increasing
cwnd to 42, instead of 10 as it should.

The fix is to avoid adjusting the slope (determined by ca->cnt)
multiple times within a jiffy, and instead skip to compute the Reno
cwnd, the "TCP friendliness" code path.

Reported-by: Eyal Perry
Signed-off-by: Neal Cardwell
Signed-off-by: Yuchung Cheng
Signed-off-by: Eric Dumazet
---
 net/ipv4/tcp_cubic.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/net/ipv4/tcp_cubic.c b/net/ipv4/tcp_cubic.c
index e036958..4f91141 100644
--- a/net/ipv4/tcp_cubic.c
+++ b/net/ipv4/tcp_cubic.c
@@ -213,6 +213,13 @@ static inline void bictcp_update(struct bictcp *ca, u32 cwnd, u32 acked)
 	    (s32)(tcp_time_stamp - ca->last_time) <= HZ / 32)
 		return;
 
+	/* The CUBIC function can update ca->cnt at most once per jiffy.
+	 * On all cwnd reduction events, ca->epoch_start is set to 0,
+	 * which will force a recalculation of ca->cnt.
+	 */
+	if (ca->epoch_start && tcp_time_stamp == ca->last_time)
+		goto tcp_friendliness;
+
 	ca->last_cwnd = cwnd;
 	ca->last_time = tcp_time_stamp;
 
@@ -280,6 +287,7 @@ static inline void bictcp_update(struct bictcp *ca, u32 cwnd, u32 acked)
 	if (ca->last_max_cwnd == 0 && ca->cnt > 20)
 		ca->cnt = 20;	/* increase cwnd 5% per RTT */
 
+tcp_friendliness:
 	/* TCP Friendly */
 	if (tcp_friendliness) {
 		u32 scale = beta_scale;
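
As an aside for reviewers: the compounding effect described in the
changelog can be reproduced with a small user-space sketch (not part of
the patch, and not kernel code). It makes simplifying assumptions: all
ACKs land in the same jiffy, bic_target stays at 44 for that whole
jiffy, every ACK covers exactly one segment, the HZ/32 short-circuit
and the TCP-friendliness path are ignored, and tcp_cong_avoid_ai() is
approximated with a plain counter.

#include <stdio.h>

int main(void)
{
	unsigned int bic_target = 44;
	unsigned int cwnd, cnt, acked, acks;

	/* Old behaviour: cnt is recomputed each time cwnd changes within
	 * the jiffy, even though bic_target is stale.
	 */
	cwnd = 40;
	acked = 0;
	acks = 0;
	while (cwnd < bic_target) {
		cnt = cwnd / (bic_target - cwnd);	/* 10, 13, 21, 43 */
		acks++;
		if (++acked >= cnt) {	/* rough tcp_cong_avoid_ai() model */
			cwnd++;
			acked = 0;
		}
	}
	printf("cnt recomputed per cwnd change: %u ACKs to reach cwnd=%u\n",
	       acks, cwnd);

	/* Fixed behaviour: cnt is computed once for the jiffy. */
	cwnd = 40;
	cnt = cwnd / (bic_target - cwnd);		/* 40 / 4 = 10 */
	acked = 0;
	acks = 0;
	while (cwnd < bic_target) {
		acks++;
		if (++acked >= cnt) {
			cwnd++;
			acked = 0;
		}
	}
	printf("cnt held for the jiffy        : %u ACKs to reach cwnd=%u\n",
	       acks, cwnd);
	return 0;
}

Under those assumptions the sketch reports 87 ACKs to grow from
cwnd=40 to cwnd=44 when cnt is recomputed against the stale bic_target
on every cwnd change, versus 40 ACKs when cnt is held fixed for the
jiffy, which is the behaviour the goto in the patch restores.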