From patchwork Fri Sep 21 15:51:46 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Eric Dumazet X-Patchwork-Id: 973292 X-Patchwork-Delegate: davem@davemloft.net Return-Path: X-Original-To: patchwork-incoming-netdev@ozlabs.org Delivered-To: patchwork-incoming-netdev@ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=netdev-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=google.com header.i=@google.com header.b="P2pGoVF9"; dkim-atps=neutral Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 42GylP2NYFz9sCD for ; Sat, 22 Sep 2018 01:52:01 +1000 (AEST) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2390337AbeIUVl2 (ORCPT ); Fri, 21 Sep 2018 17:41:28 -0400 Received: from mail-pf1-f195.google.com ([209.85.210.195]:41541 "EHLO mail-pf1-f195.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727392AbeIUVl0 (ORCPT ); Fri, 21 Sep 2018 17:41:26 -0400 Received: by mail-pf1-f195.google.com with SMTP id h79-v6so6163404pfk.8 for ; Fri, 21 Sep 2018 08:51:58 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=Rw7Mo7BguBdV7TLXFLEgADJwARUEPYgEF4bWe62imt8=; b=P2pGoVF9LTtFS8uNHPQJo0vmwP1o1AI0nN2aetaelt0Il/uPIEVOqYTu8Fut+1hgNX UjVfoJeESxpwv188VsLgx6DTS6ghs9z7wsfbFVCj79arGrpbK7DVNHRjYbn1V6yaXM1H 4ZRgIMNsIGcvYuZA5tZr9Z7smcVEwFffh+rzlpVH00bDR/cYfJY4VfLLhcN7SQwwGX/r OAkMwBmnkGiSEDOISh5wfvxPUmureuSHN6sS3j9lMdJoMf0Sou6M76CVxbQ6Ybj5O13p Ivx4grP1Q5RiepTprsSUBSC7DXvAtq0fOlr5E+aM4+kxZ7pzop7H7Be+VZAN4crnUYaG CwtA== 
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=Rw7Mo7BguBdV7TLXFLEgADJwARUEPYgEF4bWe62imt8=; b=ZxjvwulsMY7q5CxWonk1QFBR19tQJoiWJWVLC7nRSpf9JEwLDbQCN1x4jZetppQ3c3 Ae+rqyeCo0ocp0VqsThiC7dQe2fnF2lETUGySC1A3dwD0KuJNZKBJDC8osT/lrBr2ydt +fqcetPtBobdhpfYBoIO1brueoVU5B5zjfE8q+su3aPxf3o1LZo//fCxzgdpbY5FffYI skBG6LrLfnpw1TzXNF3qWLFDSkmpYErkm9Df7xIqoxyFvh5gTBxz+rbVMblemvXXwrA+ O3NIyvcbax4jWl2FAgwluRjziFlKo1lowkntelK6aAD2aKAesWU44dCXZOF6k3wvQwuR QNDA== X-Gm-Message-State: APzg51B+CZ7r2O3IsFEv2Qs6UX2/9LL0MHV/Lwrob3vj7W0wuUq7Xhps vQMq9lCg50fRfaHdb7o2f001qVqSIZ0= X-Google-Smtp-Source: ANB0VdavPnknukHIGeVsPfOGY7mLYwv2FNSbsWQyQewZVQ+NVKlL1WLQ73JpbHD39qDuHu89aRxjBw== X-Received: by 2002:a62:9683:: with SMTP id s3-v6mr47241301pfk.191.1537545117329; Fri, 21 Sep 2018 08:51:57 -0700 (PDT) Received: from localhost ([2620:15c:2c4:201:f5a:7eca:440a:3ead]) by smtp.gmail.com with ESMTPSA id j184-v6sm35568627pge.77.2018.09.21.08.51.56 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Fri, 21 Sep 2018 08:51:56 -0700 (PDT) From: Eric Dumazet To: "David S . Miller" Cc: netdev , Van Jacobson , Neal Cardwell , Yuchung Cheng , Soheil Hassas Yeganeh , Willem de Bruijn , Eric Dumazet , Eric Dumazet Subject: [PATCH net-next 1/9] tcp: switch tcp_clock_ns() to CLOCK_TAI base Date: Fri, 21 Sep 2018 08:51:46 -0700 Message-Id: <20180921155154.49489-2-edumazet@google.com> X-Mailer: git-send-email 2.19.0.444.g18242da7ef-goog In-Reply-To: <20180921155154.49489-1-edumazet@google.com> References: <20180921155154.49489-1-edumazet@google.com> MIME-Version: 1.0 Sender: netdev-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org TCP pacing is either implemented in sch_fq or internally. We have the goal of being able to offload pacing on the NICS. 
TCP will soon provide per skb skb->tstamp as early departure time. Like ETF in commit 25db26a91364 ("net/sched: Introduce the ETF Qdisc") we chose CLOCK_TAI as the clock base, so that TCP and pacers can share a common clock, to get better RTT samples (without pacing artificially inflating these samples). Signed-off-by: Eric Dumazet --- include/net/tcp.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/include/net/tcp.h b/include/net/tcp.h index 770917d0caa71896b6adac06a62b150bfdc72836..c6f0bc1dc6782a1976c06932e846b3f6d708ba9f 100644 --- a/include/net/tcp.h +++ b/include/net/tcp.h @@ -732,7 +732,7 @@ void tcp_send_window_probe(struct sock *sk); static inline u64 tcp_clock_ns(void) { - return local_clock(); + return ktime_get_tai_ns(); } static inline u64 tcp_clock_us(void) From patchwork Fri Sep 21 15:51:47 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Eric Dumazet X-Patchwork-Id: 973293 X-Patchwork-Delegate: davem@davemloft.net Return-Path: X-Original-To: patchwork-incoming-netdev@ozlabs.org Delivered-To: patchwork-incoming-netdev@ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=netdev-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=google.com header.i=@google.com header.b="D+Dufmhn"; dkim-atps=neutral Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 42GylQ0HPYz9sCP for ; Sat, 22 Sep 2018 01:52:02 +1000 (AEST) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2390363AbeIUVl3 (ORCPT ); Fri, 21 Sep 2018 17:41:29 -0400 Received: from mail-pf1-f193.google.com ([209.85.210.193]:37046 "EHLO mail-pf1-f193.google.com" 
rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2390320AbeIUVl3 (ORCPT ); Fri, 21 Sep 2018 17:41:29 -0400 Received: by mail-pf1-f193.google.com with SMTP id h69-v6so6164715pfd.4 for ; Fri, 21 Sep 2018 08:52:00 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=OGf+q+wFW48QiHl4812+I5S4jvylty2EQCe/zu2wPrI=; b=D+DufmhnrjaG7hzZZdq51MypBrCPGzIWL5X6JTgo+uunkrluMHGwUg4UezcPr8dKd2 nhwAEXkOKxWUY5+0Tv1Wrvws96No0H+IVapKeEYvCNYHTkQ6t1Cno1IrYpunJlAtcuim l/4xupyAOVqH8UJ/RWM/POd7qEPcvXCL4dJ9IGzHScyWlYeKyV5Hd5F1G59PiS2AeQGV Dug9tN+W92EHRlZoXmNo88YY1nsPAkBlsuJ27AGgzG8kZ8cej9Hb+Zxegw+J44dJafnY 5MdvJRFyvcfcQwa9wclmiZrvOUx1oqZEpMOwnVXgLA3x4TjyCJD2VSC7goIRgVa0ndIG vbeg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=OGf+q+wFW48QiHl4812+I5S4jvylty2EQCe/zu2wPrI=; b=QZM+6cqHPyuTbeg/eDrKM7CiY+y4ZbC1ADjPTLtrWoDzLrQYY0OIEs5xrlUkdK0xFe JjkHdeT+VSdz93O0YG5vrIVtuOJz9Wr/fGuGJfOD5VbuDIOvOqTII7IkoHRTTQGuNVf9 GiYxRT3ZVJd8r/5hnT2yb1M5w3yMdAIoEmilY86zJb8uTZxHvI85T1SdgYhO6tg0Jz7f J/iTdDZ1oERIO5Q0pHAYwpAk8JyAjNuS0/eVJxzRlXqqjLtdy4e++jwS/tmUwthiIPGX +2D0HTxQG5FI2cVyS6WiiE30pTliPI+oBGGz4FdgdiVhAcA/dXshec5d0OdbKmi1wsc+ cElA== X-Gm-Message-State: APzg51BAbi2hl4znwzw/EZXj7RYAfWfn75SCedAEg2KEOqji4qpIUlvn mxrVvNRzf/9zyAFfXu5htxIrLg== X-Google-Smtp-Source: ANB0VdaHWkFm3VeT6gRYl26P2VRS8ia505vFFhappCO5XIshEQ7IWYaWhl8w8xvI9NylT2WqdwDsrw== X-Received: by 2002:a63:2605:: with SMTP id m5-v6mr40310225pgm.225.1537545119353; Fri, 21 Sep 2018 08:51:59 -0700 (PDT) Received: from localhost ([2620:15c:2c4:201:f5a:7eca:440a:3ead]) by smtp.gmail.com with ESMTPSA id u13-v6sm36860928pfn.59.2018.09.21.08.51.58 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Fri, 21 Sep 2018 08:51:58 -0700 (PDT) 
From: Eric Dumazet To: "David S . Miller" Cc: netdev , Van Jacobson , Neal Cardwell , Yuchung Cheng , Soheil Hassas Yeganeh , Willem de Bruijn , Eric Dumazet , Eric Dumazet Subject: [PATCH net-next 2/9] tcp: introduce tcp_skb_timestamp_us() helper Date: Fri, 21 Sep 2018 08:51:47 -0700 Message-Id: <20180921155154.49489-3-edumazet@google.com> X-Mailer: git-send-email 2.19.0.444.g18242da7ef-goog In-Reply-To: <20180921155154.49489-1-edumazet@google.com> References: <20180921155154.49489-1-edumazet@google.com> MIME-Version: 1.0 Sender: netdev-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org There are few places where TCP reads skb->skb_mstamp expecting a value in usec unit. skb->tstamp (aka skb->skb_mstamp) will soon store CLOCK_TAI nsec value. Add tcp_skb_timestamp_us() to provide proper conversion when needed. Signed-off-by: Eric Dumazet --- include/net/tcp.h | 8 +++++++- net/ipv4/tcp_input.c | 11 ++++++----- net/ipv4/tcp_ipv4.c | 2 +- net/ipv4/tcp_output.c | 2 +- net/ipv4/tcp_rate.c | 17 +++++++++-------- net/ipv4/tcp_recovery.c | 5 +++-- 6 files changed, 27 insertions(+), 18 deletions(-) diff --git a/include/net/tcp.h b/include/net/tcp.h index c6f0bc1dc6782a1976c06932e846b3f6d708ba9f..0ca5ea10dc06f3552597c94de31dcd0c8e0ecc32 100644 --- a/include/net/tcp.h +++ b/include/net/tcp.h @@ -774,6 +774,12 @@ static inline u32 tcp_skb_timestamp(const struct sk_buff *skb) return div_u64(skb->skb_mstamp, USEC_PER_SEC / TCP_TS_HZ); } +/* provide the departure time in us unit */ +static inline u64 tcp_skb_timestamp_us(const struct sk_buff *skb) +{ + return skb->skb_mstamp; +} + #define tcp_flag_byte(th) (((u_int8_t *)th)[13]) @@ -1940,7 +1946,7 @@ static inline s64 tcp_rto_delta_us(const struct sock *sk) { const struct sk_buff *skb = tcp_rtx_queue_head(sk); u32 rto = inet_csk(sk)->icsk_rto; - u64 rto_time_stamp_us = skb->skb_mstamp + jiffies_to_usecs(rto); + u64 rto_time_stamp_us = tcp_skb_timestamp_us(skb) + jiffies_to_usecs(rto); return 
rto_time_stamp_us - tcp_sk(sk)->tcp_mstamp; } diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c index d9034073138ce49c423f7a22143bac415415bc09..d703a0b3b6a2f0efd8607354c1c74ac1a8e78d4f 100644 --- a/net/ipv4/tcp_input.c +++ b/net/ipv4/tcp_input.c @@ -1305,7 +1305,7 @@ static bool tcp_shifted_skb(struct sock *sk, struct sk_buff *prev, */ tcp_sacktag_one(sk, state, TCP_SKB_CB(skb)->sacked, start_seq, end_seq, dup_sack, pcount, - skb->skb_mstamp); + tcp_skb_timestamp_us(skb)); tcp_rate_skb_delivered(sk, skb, state->rate); if (skb == tp->lost_skb_hint) @@ -1580,7 +1580,7 @@ static struct sk_buff *tcp_sacktag_walk(struct sk_buff *skb, struct sock *sk, TCP_SKB_CB(skb)->end_seq, dup_sack, tcp_skb_pcount(skb), - skb->skb_mstamp); + tcp_skb_timestamp_us(skb)); tcp_rate_skb_delivered(sk, skb, state->rate); if (TCP_SKB_CB(skb)->sacked & TCPCB_SACKED_ACKED) list_del_init(&skb->tcp_tsorted_anchor); @@ -3103,7 +3103,7 @@ static int tcp_clean_rtx_queue(struct sock *sk, u32 prior_fack, tp->retrans_out -= acked_pcount; flag |= FLAG_RETRANS_DATA_ACKED; } else if (!(sacked & TCPCB_SACKED_ACKED)) { - last_ackt = skb->skb_mstamp; + last_ackt = tcp_skb_timestamp_us(skb); WARN_ON_ONCE(last_ackt == 0); if (!first_ackt) first_ackt = last_ackt; @@ -3121,7 +3121,7 @@ static int tcp_clean_rtx_queue(struct sock *sk, u32 prior_fack, tp->delivered += acked_pcount; if (!tcp_skb_spurious_retrans(tp, skb)) tcp_rack_advance(tp, sacked, scb->end_seq, - skb->skb_mstamp); + tcp_skb_timestamp_us(skb)); } if (sacked & TCPCB_LOST) tp->lost_out -= acked_pcount; @@ -3215,7 +3215,8 @@ static int tcp_clean_rtx_queue(struct sock *sk, u32 prior_fack, tp->lost_cnt_hint -= min(tp->lost_cnt_hint, delta); } } else if (skb && rtt_update && sack_rtt_us >= 0 && - sack_rtt_us > tcp_stamp_us_delta(tp->tcp_mstamp, skb->skb_mstamp)) { + sack_rtt_us > tcp_stamp_us_delta(tp->tcp_mstamp, + tcp_skb_timestamp_us(skb))) { /* Do not re-arm RTO if the sack RTT is measured from data sent * after when the head was last 
(re)transmitted. Otherwise the * timeout may continue to extend in loss recovery. diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c index 09547ef9c4c644fba0f7887afad0a6393e3dd03a..1f2496e8620dd78cecefbb0dceb8570fc92661e5 100644 --- a/net/ipv4/tcp_ipv4.c +++ b/net/ipv4/tcp_ipv4.c @@ -544,7 +544,7 @@ void tcp_v4_err(struct sk_buff *icmp_skb, u32 info) BUG_ON(!skb); tcp_mstamp_refresh(tp); - delta_us = (u32)(tp->tcp_mstamp - skb->skb_mstamp); + delta_us = (u32)(tp->tcp_mstamp - tcp_skb_timestamp_us(skb)); remaining = icsk->icsk_rto - usecs_to_jiffies(delta_us); diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c index 597dbd749f05dc72e53962a5821861fc218774d6..b95aa72d88233dd6376a70ccd7cbb13744444889 100644 --- a/net/ipv4/tcp_output.c +++ b/net/ipv4/tcp_output.c @@ -1966,7 +1966,7 @@ static bool tcp_tso_should_defer(struct sock *sk, struct sk_buff *skb, head = tcp_rtx_queue_head(sk); if (!head) goto send_now; - age = tcp_stamp_us_delta(tp->tcp_mstamp, head->skb_mstamp); + age = tcp_stamp_us_delta(tp->tcp_mstamp, tcp_skb_timestamp_us(head)); /* If next ACK is likely to come too late (half srtt), do not defer */ if (age < (tp->srtt_us >> 4)) goto send_now; diff --git a/net/ipv4/tcp_rate.c b/net/ipv4/tcp_rate.c index 4dff40dad4dc5ccc372f5108b0d6ba38497ab81f..baed2186c7c623737c739cbc1e35a3c772a8b15a 100644 --- a/net/ipv4/tcp_rate.c +++ b/net/ipv4/tcp_rate.c @@ -55,8 +55,10 @@ void tcp_rate_skb_sent(struct sock *sk, struct sk_buff *skb) * bandwidth estimate. 
*/ if (!tp->packets_out) { - tp->first_tx_mstamp = skb->skb_mstamp; - tp->delivered_mstamp = skb->skb_mstamp; + u64 tstamp_us = tcp_skb_timestamp_us(skb); + + tp->first_tx_mstamp = tstamp_us; + tp->delivered_mstamp = tstamp_us; } TCP_SKB_CB(skb)->tx.first_tx_mstamp = tp->first_tx_mstamp; @@ -88,13 +90,12 @@ void tcp_rate_skb_delivered(struct sock *sk, struct sk_buff *skb, rs->is_app_limited = scb->tx.is_app_limited; rs->is_retrans = scb->sacked & TCPCB_RETRANS; - /* Find the duration of the "send phase" of this window: */ - rs->interval_us = tcp_stamp_us_delta( - skb->skb_mstamp, - scb->tx.first_tx_mstamp); - /* Record send time of most recently ACKed packet: */ - tp->first_tx_mstamp = skb->skb_mstamp; + tp->first_tx_mstamp = tcp_skb_timestamp_us(skb); + /* Find the duration of the "send phase" of this window: */ + rs->interval_us = tcp_stamp_us_delta(tp->first_tx_mstamp, + scb->tx.first_tx_mstamp); + } /* Mark off the skb delivered once it's sacked to avoid being * used again when it's cumulatively acked. 
For acked packets diff --git a/net/ipv4/tcp_recovery.c b/net/ipv4/tcp_recovery.c index c81aadff769b2c3eee02e6de3a5545c27e8cbc38..fdb715bdd2d11dd33a1474d02892546bbac66f41 100644 --- a/net/ipv4/tcp_recovery.c +++ b/net/ipv4/tcp_recovery.c @@ -50,7 +50,7 @@ static u32 tcp_rack_reo_wnd(const struct sock *sk) s32 tcp_rack_skb_timeout(struct tcp_sock *tp, struct sk_buff *skb, u32 reo_wnd) { return tp->rack.rtt_us + reo_wnd - - tcp_stamp_us_delta(tp->tcp_mstamp, skb->skb_mstamp); + tcp_stamp_us_delta(tp->tcp_mstamp, tcp_skb_timestamp_us(skb)); } /* RACK loss detection (IETF draft draft-ietf-tcpm-rack-01): @@ -91,7 +91,8 @@ static void tcp_rack_detect_loss(struct sock *sk, u32 *reo_timeout) !(scb->sacked & TCPCB_SACKED_RETRANS)) continue; - if (!tcp_rack_sent_after(tp->rack.mstamp, skb->skb_mstamp, + if (!tcp_rack_sent_after(tp->rack.mstamp, + tcp_skb_timestamp_us(skb), tp->rack.end_seq, scb->end_seq)) break; From patchwork Fri Sep 21 15:51:48 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Eric Dumazet X-Patchwork-Id: 973294 X-Patchwork-Delegate: davem@davemloft.net Return-Path: X-Original-To: patchwork-incoming-netdev@ozlabs.org Delivered-To: patchwork-incoming-netdev@ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=netdev-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=google.com header.i=@google.com header.b="O2/stR4v"; dkim-atps=neutral Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 42GylR6bh8z9sCD for ; Sat, 22 Sep 2018 01:52:03 +1000 (AEST) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2390389AbeIUVla (ORCPT ); Fri, 21 Sep 2018 17:41:30 -0400 
Received: from mail-pf1-f194.google.com ([209.85.210.194]:41544 "EHLO mail-pf1-f194.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2390320AbeIUVla (ORCPT ); Fri, 21 Sep 2018 17:41:30 -0400 Received: by mail-pf1-f194.google.com with SMTP id h79-v6so6163478pfk.8 for ; Fri, 21 Sep 2018 08:52:01 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=NhyDmyy2VI48q7zR4G7zuss31D3qhr9msNQjMjgbEOc=; b=O2/stR4vbyZ5+hmo7dmWYcGevwoEgVEAIYaJNUJkfycedp5DtHo4ey8x1ijh+fyElo FyF1q4mQaSQ90X7C32vQfATWaWWt2PBJHuDOExaQ5VtNAV71DvAt5/W3VebOmhIp5fw5 NpeB+i+PRPUcnujveYVAKIGhRqts2dr1iY0CMGs6+HEFM0TPia9SGjORC+cKmT/DX0Hf zn2Wx4SQEyTlztdvKtiE11Nd8DNJYuf3le4XEWIk4U5rqcUGoSunXECxoI1KyefKhemn Jjj9wxf8ekF182A8aA/Go9cf/iru5/vYfnQYT/WYJzJDB9AaJNkB55DFsTVm2otpW8hF hZBw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=NhyDmyy2VI48q7zR4G7zuss31D3qhr9msNQjMjgbEOc=; b=h9UHZMMaO61o3bY2F5bmihEtJ1aBdRQCeTaW3En2UOShckizC/CF7SnWtZGa1AzwhK tLgxxo1PPXNWe7UfWDYlw6onniQfZ9s6XJ/ixTIa2pRz2DXX8dp2ZP0i21q5y3cXvyZ2 Y3y1OqV0LNy1fUIN/7qlh4UY5MuCrXYUfMyizkFWwa5HE0cKk3gFVFvpGvjSighSPEoz XNSmjP3gZbxS1cgWITPNhG7ClnK0HvsiCNti4bb6qiLOfAhTCiw5kpYBfb7dOZqoI2IM KFi0mdZQgF5wtcqXWJ6h3z0HnU9HIzKsmIpO9JviUO6MaVjmvQbpnTHwn/bu/64VTnBw ZoHA== X-Gm-Message-State: APzg51C+SzgzyuErCtbwHsh2f5oNwZRDjaO157nl+Tx5fys7YnJb45Do mn44DHCvgCzzPURYqTCJQkifsA== X-Google-Smtp-Source: ANB0VdbElOT5+CHSm1W/wnM0Xv199vrGnBeACqBXBm2s5LqvPgsk1rVYMbdFrzM3Mxqd4HdX2v8H0A== X-Received: by 2002:a63:3dc6:: with SMTP id k189-v6mr41876769pga.191.1537545121098; Fri, 21 Sep 2018 08:52:01 -0700 (PDT) Received: from localhost ([2620:15c:2c4:201:f5a:7eca:440a:3ead]) by smtp.gmail.com with ESMTPSA id r25-v6sm33006119pgm.59.2018.09.21.08.52.00 
(version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Fri, 21 Sep 2018 08:52:00 -0700 (PDT) From: Eric Dumazet To: "David S . Miller" Cc: netdev , Van Jacobson , Neal Cardwell , Yuchung Cheng , Soheil Hassas Yeganeh , Willem de Bruijn , Eric Dumazet , Eric Dumazet Subject: [PATCH net-next 3/9] net_sched: sch_fq: switch to CLOCK_TAI Date: Fri, 21 Sep 2018 08:51:48 -0700 Message-Id: <20180921155154.49489-4-edumazet@google.com> X-Mailer: git-send-email 2.19.0.444.g18242da7ef-goog In-Reply-To: <20180921155154.49489-1-edumazet@google.com> References: <20180921155154.49489-1-edumazet@google.com> MIME-Version: 1.0 Sender: netdev-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org TCP will soon provide per skb->tstamp with earliest departure time, so that sch_fq does not have to determine departure time by looking at socket sk_pacing_rate. We chose in linux-4.19 CLOCK_TAI as the clock base for transports, qdiscs, and NIC offloads. Signed-off-by: Eric Dumazet --- net/sched/sch_fq.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/net/sched/sch_fq.c b/net/sched/sch_fq.c index b27ba36a269cc72cd716da19dcfa27018ec01490..d5185c44e9a5f521ca99243b6e9b53ec05b84d49 100644 --- a/net/sched/sch_fq.c +++ b/net/sched/sch_fq.c @@ -460,7 +460,7 @@ static void fq_check_throttled(struct fq_sched_data *q, u64 now) static struct sk_buff *fq_dequeue(struct Qdisc *sch) { struct fq_sched_data *q = qdisc_priv(sch); - u64 now = ktime_get_ns(); + u64 now = ktime_get_tai_ns(); struct fq_flow_head *head; struct sk_buff *skb; struct fq_flow *f; @@ -823,7 +823,7 @@ static int fq_init(struct Qdisc *sch, struct nlattr *opt, q->fq_trees_log = ilog2(1024); q->orphan_mask = 1024 - 1; q->low_rate_threshold = 550000 / 8; - qdisc_watchdog_init(&q->watchdog, sch); + qdisc_watchdog_init_clockid(&q->watchdog, sch, CLOCK_TAI); if (opt) err = fq_change(sch, opt, extack); @@ -878,7 +878,7 @@ static int fq_dump_stats(struct Qdisc *sch, struct 
gnet_dump *d) st.flows_plimit = q->stat_flows_plimit; st.pkts_too_long = q->stat_pkts_too_long; st.allocation_errors = q->stat_allocation_errors; - st.time_next_delayed_flow = q->time_next_delayed_flow - ktime_get_ns(); + st.time_next_delayed_flow = q->time_next_delayed_flow - ktime_get_tai_ns(); st.flows = q->flows; st.inactive_flows = q->inactive_flows; st.throttled_flows = q->throttled_flows; From patchwork Fri Sep 21 15:51:49 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Eric Dumazet X-Patchwork-Id: 973295 X-Patchwork-Delegate: davem@davemloft.net Return-Path: X-Original-To: patchwork-incoming-netdev@ozlabs.org Delivered-To: patchwork-incoming-netdev@ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=netdev-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=google.com header.i=@google.com header.b="u4HB0UdF"; dkim-atps=neutral Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 42GylT5mHRz9sCD for ; Sat, 22 Sep 2018 01:52:05 +1000 (AEST) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2390418AbeIUVld (ORCPT ); Fri, 21 Sep 2018 17:41:33 -0400 Received: from mail-pf1-f193.google.com ([209.85.210.193]:46380 "EHLO mail-pf1-f193.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2390320AbeIUVlc (ORCPT ); Fri, 21 Sep 2018 17:41:32 -0400 Received: by mail-pf1-f193.google.com with SMTP id u24-v6so6146603pfn.13 for ; Fri, 21 Sep 2018 08:52:03 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; 
bh=hWQSq8Fc/QhIt2s13+tGFLYMGtyGuwX1J3tJVzxEO2Y=; b=u4HB0UdFMd51r5Gqh7Hq86f1a1yNQuwflqPispvwhJ2EWLp0bIfFmODcrNpxzGme70 ZoVL31fEiLZ8Aov1VGBX5Fa0qnbzqisA5K5KTkduqvepIdfhHYg8UwpVlxS+p8ndW0DZ Vsy16NVLGrs5m9pHbEHNwLXhWVRenNfAyDz1FU4HU8v4NIDYQOsjByvykDTccSJxgTaH Chq33gyf0RLScRxH9Y2vbb69EJmF7abU/nOSqi/IiiBHZ5V0QGN37a7g0icNgRfPfiEU 2fw3Xb78F72+TgfyUovg4SyW3bw9Qqe+4OggBDtsbKKbMGrheY4iUGpMKvyg9t1n95u2 LjSg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=hWQSq8Fc/QhIt2s13+tGFLYMGtyGuwX1J3tJVzxEO2Y=; b=m+qVIVWiYvzLvDHXpeWCKM0qBEAz2odtaL/x2UPVq3tGwlFhpFIB9DUh38LdBAGTEY 39RUtAPc2tCQu8wNNmLTRE55DoEAdeNVIsmUBYq/Wewa+IfGRuB20r4G6zWq7wkpxyAA IQ/+LtonCrqr7TwpNBky51s8GAb5XpGZGB0qqvKqQIMofnKm/IzXT2grAb6Ygdv8Bvlm 54WHgAoBupykXpDKWBROB3T6OWESf68SrIMfi90J948T8eDEJLWN47Eo3A3vB0kaTwGp SX4R/NNGvXZtTQjudCrOdZj7szcxM0UaPE/a+DyNALwVb4l3L25UijmGeVzypV8nS2m0 U2GA== X-Gm-Message-State: APzg51BnA/OPO5a2toLzpfu2hXNQO+v7qGaxX9Srd4V8cNSwrpLlPZet J3uC4S5T/bdJAG2puw5eYJKK8vsXTKE= X-Google-Smtp-Source: ANB0VdbToBhMiiXSx41cZ4J1AUQmcWMtC0UF5UK6524F8DRpxlUrym+gpL34ZqErkoQ/FT4myUwuAA== X-Received: by 2002:a62:9541:: with SMTP id p62-v6mr47783622pfd.194.1537545123017; Fri, 21 Sep 2018 08:52:03 -0700 (PDT) Received: from localhost ([2620:15c:2c4:201:f5a:7eca:440a:3ead]) by smtp.gmail.com with ESMTPSA id v72-v6sm46934583pfj.22.2018.09.21.08.52.02 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Fri, 21 Sep 2018 08:52:02 -0700 (PDT) From: Eric Dumazet To: "David S . 
Miller" Cc: netdev , Van Jacobson , Neal Cardwell , Yuchung Cheng , Soheil Hassas Yeganeh , Willem de Bruijn , Eric Dumazet , Eric Dumazet Subject: [PATCH net-next 4/9] tcp: add tcp_wstamp_ns socket field Date: Fri, 21 Sep 2018 08:51:49 -0700 Message-Id: <20180921155154.49489-5-edumazet@google.com> X-Mailer: git-send-email 2.19.0.444.g18242da7ef-goog In-Reply-To: <20180921155154.49489-1-edumazet@google.com> References: <20180921155154.49489-1-edumazet@google.com> MIME-Version: 1.0 Sender: netdev-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org TCP will soon provide earliest departure time on TX skbs. It needs to track this in a new variable. tcp_mstamp_refresh() needs to update this variable, and became too big to stay an inline. Signed-off-by: Eric Dumazet --- include/linux/tcp.h | 2 ++ include/net/tcp.h | 12 +----------- net/ipv4/tcp_output.c | 16 ++++++++++++++++ 3 files changed, 19 insertions(+), 11 deletions(-) diff --git a/include/linux/tcp.h b/include/linux/tcp.h index 263e37271afda18f3d61c99272d34da15dfdca29..848f5b25e178288ce870637b68a692ab88dc7d4d 100644 --- a/include/linux/tcp.h +++ b/include/linux/tcp.h @@ -248,6 +248,8 @@ struct tcp_sock { syn_smc:1; /* SYN includes SMC */ u32 tlp_high_seq; /* snd_nxt at the time of TLP retransmit. */ + u64 tcp_wstamp_ns; /* departure time for next sent data packet */ + /* RTT measurement */ u64 tcp_mstamp; /* most recent packet received/sent */ u32 srtt_us; /* smoothed round trip time << 3 in usecs */ diff --git a/include/net/tcp.h b/include/net/tcp.h index 0ca5ea10dc06f3552597c94de31dcd0c8e0ecc32..370198fdc65d3e863104665e20faefd0e5a09b92 100644 --- a/include/net/tcp.h +++ b/include/net/tcp.h @@ -752,17 +752,7 @@ static inline u32 tcp_time_stamp_raw(void) return div_u64(tcp_clock_ns(), NSEC_PER_SEC / TCP_TS_HZ); } - -/* Refresh 1us clock of a TCP socket, - * ensuring monotically increasing values. 
- */ -static inline void tcp_mstamp_refresh(struct tcp_sock *tp) -{ - u64 val = tcp_clock_us(); - - if (val > tp->tcp_mstamp) - tp->tcp_mstamp = val; -} +void tcp_mstamp_refresh(struct tcp_sock *tp); static inline u32 tcp_stamp_us_delta(u64 t1, u64 t0) { diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c index b95aa72d88233dd6376a70ccd7cbb13744444889..5a8105e84f7c1a876bbd15e8050c2574c1fbe162 100644 --- a/net/ipv4/tcp_output.c +++ b/net/ipv4/tcp_output.c @@ -45,6 +45,22 @@ #include +/* Refresh clocks of a TCP socket, + * ensuring monotically increasing values. + */ +void tcp_mstamp_refresh(struct tcp_sock *tp) +{ + u64 val = tcp_clock_ns(); + + /* departure time for next data packet */ + if (val > tp->tcp_wstamp_ns) + tp->tcp_wstamp_ns = val; + + val = div_u64(val, NSEC_PER_USEC); + if (val > tp->tcp_mstamp) + tp->tcp_mstamp = val; +} + static bool tcp_write_xmit(struct sock *sk, unsigned int mss_now, int nonagle, int push_one, gfp_t gfp); From patchwork Fri Sep 21 15:51:50 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Eric Dumazet X-Patchwork-Id: 973296 X-Patchwork-Delegate: davem@davemloft.net Return-Path: X-Original-To: patchwork-incoming-netdev@ozlabs.org Delivered-To: patchwork-incoming-netdev@ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=netdev-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=google.com header.i=@google.com header.b="fFWNX68o"; dkim-atps=neutral Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 42GylW4thPz9sCD for ; Sat, 22 Sep 2018 01:52:07 +1000 (AEST) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id 
S2390448AbeIUVlf (ORCPT ); Fri, 21 Sep 2018 17:41:35 -0400 Received: from mail-pf1-f194.google.com ([209.85.210.194]:34337 "EHLO mail-pf1-f194.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2390320AbeIUVle (ORCPT ); Fri, 21 Sep 2018 17:41:34 -0400 Received: by mail-pf1-f194.google.com with SMTP id k19-v6so6173703pfi.1 for ; Fri, 21 Sep 2018 08:52:05 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=gB2rwY4QajoBKDB/xhl8xPRqx1wVEI84AJAawQ9d3jI=; b=fFWNX68oxBzD8bUM3qXt5dzhtNnFSpA8mc8GVRvx+hN3mlMx9m76vOizlJO+/E8glL SEk4jFWHqbYhPvuWxIN8/nfiSquk8I0+K/R6KTWz8+fVgtRtTPisPksBAaESwhStsUhn AZuqGAJSH4PhaE7f3TbwcfpF0g+XojCMiOL1Ulf8RvWeEwJHK5ST7USTzpucmDMlf+9+ g0qLFTGMMb0+dJBcRdT8SdwonqPE6o2EvSjdQat0ervAYVhDQC0oImiNASP3bHaToxpH ElHSvDgVHvc5mN2rHQuM1l/C1ZD6bNNLtE10t184Y02YueKl9t65xGngBpraEcM/fEdV FLaA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=gB2rwY4QajoBKDB/xhl8xPRqx1wVEI84AJAawQ9d3jI=; b=N3Weh/kwRQc5Q6qRrMejsEzQND5fjlAvo4r2ruyObir60dzKDfv6FxfypZzNgsdLsG aCali663WNuLqZfBzSEsT4FfdQ1jfVMU2BByXWrsKRC+kyBv3v4F5rPw1gvE+JUTi2S9 BjnxaXctAcM4xEwuZ/4AA8dwzYU8apui3TNjM2pICvEbgrRBbHOeeFZD+p38Egq4VG5I 6Q/CX7C6GbHI12sI8V0GSxGFR+kczVTWnugbNVakZP8pOLcrXE7sPcqYPqEEp99UYbFL fqO31BhUy+Hc7ddwOytbilfbjurzn4T2a5cWrEeFp1neJ0jxq0g8CDXgR+fCOKfsmVWr 0GKQ== X-Gm-Message-State: APzg51B5IYMeIJsQs7P7oMLfNjsLVqXtA0ubflexrWmc8S90uCcHWHqb 42X1hazkbuC4ldNsghCl9I4iOw== X-Google-Smtp-Source: ANB0VdbrbyY5u5pyj5Q9xVug5BV2K61KH9b7TkCP3TSFe7PeIf80lEnHRqUQzCVyCtzr5PpmvXqg+g== X-Received: by 2002:a62:12c7:: with SMTP id 68-v6mr47184688pfs.216.1537545124794; Fri, 21 Sep 2018 08:52:04 -0700 (PDT) Received: from localhost ([2620:15c:2c4:201:f5a:7eca:440a:3ead]) by smtp.gmail.com with 
ESMTPSA id d22-v6sm66217505pfm.48.2018.09.21.08.52.04 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Fri, 21 Sep 2018 08:52:04 -0700 (PDT) From: Eric Dumazet To: "David S . Miller" Cc: netdev , Van Jacobson , Neal Cardwell , Yuchung Cheng , Soheil Hassas Yeganeh , Willem de Bruijn , Eric Dumazet , Eric Dumazet Subject: [PATCH net-next 5/9] tcp: provide earliest departure time in skb->tstamp Date: Fri, 21 Sep 2018 08:51:50 -0700 Message-Id: <20180921155154.49489-6-edumazet@google.com> X-Mailer: git-send-email 2.19.0.444.g18242da7ef-goog In-Reply-To: <20180921155154.49489-1-edumazet@google.com> References: <20180921155154.49489-1-edumazet@google.com> MIME-Version: 1.0 Sender: netdev-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org Switch internal TCP skb->skb_mstamp to skb->skb_mstamp_ns, from usec units to nsec units. Do not clear skb->tstamp before entering IP stacks in TX, so that qdisc or devices can implement pacing based on the earliest departure time instead of socket sk->sk_pacing_rate Packets are fed with tcp_wstamp_ns, and following patch will update tcp_wstamp_ns when both TCP and sch_fq switch to the earliest departure time mechanism. Signed-off-by: Eric Dumazet --- include/linux/skbuff.h | 2 +- include/net/tcp.h | 6 +++--- net/ipv4/syncookies.c | 2 +- net/ipv4/tcp.c | 2 +- net/ipv4/tcp_output.c | 13 ++++++------- net/ipv4/tcp_timer.c | 2 +- 6 files changed, 13 insertions(+), 14 deletions(-) diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h index e3a53ca4a9b51b84b7d75ce87485d4d9109a4cf2..86f337e9a81d5eff360335a19ab09f26ae48fca8 100644 --- a/include/linux/skbuff.h +++ b/include/linux/skbuff.h @@ -689,7 +689,7 @@ struct sk_buff { union { ktime_t tstamp; - u64 skb_mstamp; + u64 skb_mstamp_ns; /* earliest departure time */ }; /* * This is the control buffer. 
It is free to use for every diff --git a/include/net/tcp.h b/include/net/tcp.h index 370198fdc65d3e863104665e20faefd0e5a09b92..ff15d8e0d525715b17671e64f6abdead9df0a8f3 100644 --- a/include/net/tcp.h +++ b/include/net/tcp.h @@ -761,13 +761,13 @@ static inline u32 tcp_stamp_us_delta(u64 t1, u64 t0) static inline u32 tcp_skb_timestamp(const struct sk_buff *skb) { - return div_u64(skb->skb_mstamp, USEC_PER_SEC / TCP_TS_HZ); + return div_u64(skb->skb_mstamp_ns, NSEC_PER_SEC / TCP_TS_HZ); } /* provide the departure time in us unit */ static inline u64 tcp_skb_timestamp_us(const struct sk_buff *skb) { - return skb->skb_mstamp; + return div_u64(skb->skb_mstamp_ns, NSEC_PER_USEC); } @@ -813,7 +813,7 @@ struct tcp_skb_cb { #define TCPCB_SACKED_RETRANS 0x02 /* SKB retransmitted */ #define TCPCB_LOST 0x04 /* SKB is lost */ #define TCPCB_TAGBITS 0x07 /* All tag bits */ -#define TCPCB_REPAIRED 0x10 /* SKB repaired (no skb_mstamp) */ +#define TCPCB_REPAIRED 0x10 /* SKB repaired (no skb_mstamp_ns) */ #define TCPCB_EVER_RETRANS 0x80 /* Ever retransmitted frame */ #define TCPCB_RETRANS (TCPCB_SACKED_RETRANS|TCPCB_EVER_RETRANS| \ TCPCB_REPAIRED) diff --git a/net/ipv4/syncookies.c b/net/ipv4/syncookies.c index c3387dfd725bf99bcddefb9fb4f1dc98f5dd7f23..606f868d9f3fde1c3140aa7eecde87d2ec32b5f2 100644 --- a/net/ipv4/syncookies.c +++ b/net/ipv4/syncookies.c @@ -88,7 +88,7 @@ u64 cookie_init_timestamp(struct request_sock *req) ts <<= TSBITS; ts |= options; } - return (u64)ts * (USEC_PER_SEC / TCP_TS_HZ); + return (u64)ts * (NSEC_PER_SEC / TCP_TS_HZ); } diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c index 67670fac7c8de510df351fe3a835b554cc4759a9..69c236943f56bd0749e5efb18de97e69898f1bde 100644 --- a/net/ipv4/tcp.c +++ b/net/ipv4/tcp.c @@ -1295,7 +1295,7 @@ int tcp_sendmsg_locked(struct sock *sk, struct msghdr *msg, size_t size) copy = size_goal; /* All packets are restored as if they have - * already been sent. skb_mstamp isn't set to + * already been sent. 
skb_mstamp_ns isn't set to * avoid wrong rtt estimation. */ if (tp->repair) diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c index 5a8105e84f7c1a876bbd15e8050c2574c1fbe162..957f7a0e21c06cae9f0d3bed57017bbc0a36c880 100644 --- a/net/ipv4/tcp_output.c +++ b/net/ipv4/tcp_output.c @@ -1014,7 +1014,7 @@ static void tcp_internal_pacing(struct sock *sk, const struct sk_buff *skb) static void tcp_update_skb_after_send(struct tcp_sock *tp, struct sk_buff *skb) { - skb->skb_mstamp = tp->tcp_mstamp; + skb->skb_mstamp_ns = tp->tcp_wstamp_ns; list_move_tail(&skb->tcp_tsorted_anchor, &tp->tsorted_sent_queue); } @@ -1061,7 +1061,7 @@ static int __tcp_transmit_skb(struct sock *sk, struct sk_buff *skb, if (unlikely(!skb)) return -ENOBUFS; } - skb->skb_mstamp = tp->tcp_mstamp; + skb->skb_mstamp_ns = tp->tcp_wstamp_ns; inet = inet_sk(sk); tcb = TCP_SKB_CB(skb); @@ -1165,8 +1165,7 @@ static int __tcp_transmit_skb(struct sock *sk, struct sk_buff *skb, skb_shinfo(skb)->gso_segs = tcp_skb_pcount(skb); skb_shinfo(skb)->gso_size = tcp_skb_mss(skb); - /* Our usage of tstamp should remain private */ - skb->tstamp = 0; + /* Leave earliest departure time in skb->tstamp (skb->skb_mstamp_ns) */ /* Cleanup our debris for IP stacks */ memset(skb->cb, 0, max(sizeof(struct inet_skb_parm), @@ -3221,10 +3220,10 @@ struct sk_buff *tcp_make_synack(const struct sock *sk, struct dst_entry *dst, memset(&opts, 0, sizeof(opts)); #ifdef CONFIG_SYN_COOKIES if (unlikely(req->cookie_ts)) - skb->skb_mstamp = cookie_init_timestamp(req); + skb->skb_mstamp_ns = cookie_init_timestamp(req); else #endif - skb->skb_mstamp = tcp_clock_us(); + skb->skb_mstamp_ns = tcp_clock_ns(); #ifdef CONFIG_TCP_MD5SIG rcu_read_lock(); @@ -3440,7 +3439,7 @@ static int tcp_send_syn_data(struct sock *sk, struct sk_buff *syn) err = tcp_transmit_skb(sk, syn_data, 1, sk->sk_allocation); - syn->skb_mstamp = syn_data->skb_mstamp; + syn->skb_mstamp_ns = syn_data->skb_mstamp_ns; /* Now full SYN+DATA was cloned and sent (or not), * 
remove the SYN from the original skb (syn_data) diff --git a/net/ipv4/tcp_timer.c b/net/ipv4/tcp_timer.c index 7fdf222a0bdfe9775970082f6b5dcdcc82b2ae1a..61023d50cd604d5e19464a32c33b65d29c75c81e 100644 --- a/net/ipv4/tcp_timer.c +++ b/net/ipv4/tcp_timer.c @@ -360,7 +360,7 @@ static void tcp_probe_timer(struct sock *sk) */ start_ts = tcp_skb_timestamp(skb); if (!start_ts) - skb->skb_mstamp = tp->tcp_mstamp; + skb->skb_mstamp_ns = tp->tcp_wstamp_ns; else if (icsk->icsk_user_timeout && (s32)(tcp_time_stamp(tp) - start_ts) > icsk->icsk_user_timeout) goto abort; From patchwork Fri Sep 21 15:51:51 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Eric Dumazet X-Patchwork-Id: 973297 X-Patchwork-Delegate: davem@davemloft.net Return-Path: X-Original-To: patchwork-incoming-netdev@ozlabs.org Delivered-To: patchwork-incoming-netdev@ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=netdev-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=google.com header.i=@google.com header.b="PuF2kh+I"; dkim-atps=neutral Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 42GylY4TDBz9sCD for ; Sat, 22 Sep 2018 01:52:09 +1000 (AEST) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2390460AbeIUVlh (ORCPT ); Fri, 21 Sep 2018 17:41:37 -0400 Received: from mail-pg1-f193.google.com ([209.85.215.193]:43268 "EHLO mail-pg1-f193.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2390320AbeIUVlf (ORCPT ); Fri, 21 Sep 2018 17:41:35 -0400 Received: by mail-pg1-f193.google.com with SMTP id q19-v6so5378071pgn.10 for ; Fri, 21 Sep 2018 08:52:07 -0700 (PDT) DKIM-Signature: v=1; 
a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=6MKllOEASOYWRqrzxUABY97cMzHTictFgcRnblBIFbw=; b=PuF2kh+IhONgXGlYh8oQUDtIkjKxLSgwIH28kpHSuFp6jlWHCTWqqbF+GvaSr1zi7O oeelbC0EsoyLl7zsv8w1PsuaAG81fuCydUsouZe/+0QQgtjhKpCT5lzwOCttSd8fQWrI AKxHJlJJbtw80BMnEUhL0ZFQASFOmxptVEUjuQ2R7e3VxAX7kj8YWoOUAsXwS0Q/5Nx1 GroymEc24TwfMlPUUGO7NRfsm8cUNXj/P120Rr+65iSR+BOKO+HrWFeviDwc5Z8WVL9O 5jdkBBRjkkc1rc+qKnCYawNJcqsnvGnTbFIB3ulf98CR5uJFu+U21dEI5dFjxxiB80uj uHgA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=6MKllOEASOYWRqrzxUABY97cMzHTictFgcRnblBIFbw=; b=AjW+KIbTG2O+VJKZoUR3Pm8Ot4AogGKrEOMzg1ciVXw50IP1JIJqxmcox3r95RT4Bw MWAJVj2c+dc1O6mEolKYPnEDQQqwn/+REH6Z/IRWWQXj6MR7MH0lO1fneYFfPlLEVfmg oGCZRP8rwRRiTG6qZmwSnGJsUIvPcWK3/j9IyhNc2LpLy3xRYp+h3EPlfymJmJZhCXZ7 us6s3mF2+uC7J8HUhADHFdx/comCZBPIh7pQ/uXpzKArudjfp0Q08jJEPXFYPYVxTsoG d5KcjO+VD+uUGuxDyX/o+76sQUpOp2sLvlf1XIvtkLPtNumkxt7pSnmn07lMysnML6RV zMgA== X-Gm-Message-State: APzg51BoA9xcV2hkoTaCRG1fVaxLfYln1glCOU9jHvvrelgS+UNvUT15 nLPsK71RGPotclZXRpu8uDJRbQ== X-Google-Smtp-Source: ANB0Vdaivu49beEohV2vIOj5+TkMkwXq/3yC0LJMKEZ5noxZ4OShkyAGO1FEhRBpl8sgQWC3w+HkxA== X-Received: by 2002:a63:3285:: with SMTP id y127-v6mr5951579pgy.104.1537545126612; Fri, 21 Sep 2018 08:52:06 -0700 (PDT) Received: from localhost ([2620:15c:2c4:201:f5a:7eca:440a:3ead]) by smtp.gmail.com with ESMTPSA id p4-v6sm37745047pgs.75.2018.09.21.08.52.05 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Fri, 21 Sep 2018 08:52:05 -0700 (PDT) From: Eric Dumazet To: "David S . 
Miller"
Cc: netdev, Van Jacobson, Neal Cardwell, Yuchung Cheng, Soheil Hassas Yeganeh, Willem de Bruijn, Eric Dumazet, Eric Dumazet
Subject: [PATCH net-next 6/9] tcp: switch internal pacing timer to CLOCK_TAI
Date: Fri, 21 Sep 2018 08:51:51 -0700
Message-Id: <20180921155154.49489-7-edumazet@google.com>
X-Mailer: git-send-email 2.19.0.444.g18242da7ef-goog
In-Reply-To: <20180921155154.49489-1-edumazet@google.com>
References: <20180921155154.49489-1-edumazet@google.com>
MIME-Version: 1.0
Sender: netdev-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: netdev@vger.kernel.org

A later patch will use tcp_wstamp_ns to feed the internal
TCP pacing timer, so switch the timer to CLOCK_TAI to share
the same base.

Signed-off-by: Eric Dumazet
---
 net/ipv4/tcp_output.c | 2 +-
 net/ipv4/tcp_timer.c  | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 957f7a0e21c06cae9f0d3bed57017bbc0a36c880..a87068fa9b1aa582310df6371966fd2d6461edb8 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -1007,7 +1007,7 @@ static void tcp_internal_pacing(struct sock *sk, const struct sk_buff *skb)
 	len_ns = (u64)skb->len * NSEC_PER_SEC;
 	do_div(len_ns, rate);
 	hrtimer_start(&tcp_sk(sk)->pacing_timer,
-		      ktime_add_ns(ktime_get(), len_ns),
+		      ktime_add_ns(ktime_get_tai_ns(), len_ns),
 		      HRTIMER_MODE_ABS_PINNED_SOFT);
 	sock_hold(sk);
 }
diff --git a/net/ipv4/tcp_timer.c b/net/ipv4/tcp_timer.c
index 61023d50cd604d5e19464a32c33b65d29c75c81e..4f661e178da8465203266ff4dfa3e8743e60ff82 100644
--- a/net/ipv4/tcp_timer.c
+++ b/net/ipv4/tcp_timer.c
@@ -758,7 +758,7 @@ void tcp_init_xmit_timers(struct sock *sk)
 {
 	inet_csk_init_xmit_timers(sk, &tcp_write_timer, &tcp_delack_timer,
 				  &tcp_keepalive_timer);
-	hrtimer_init(&tcp_sk(sk)->pacing_timer, CLOCK_MONOTONIC,
+	hrtimer_init(&tcp_sk(sk)->pacing_timer, CLOCK_TAI,
 		     HRTIMER_MODE_ABS_PINNED_SOFT);
 	tcp_sk(sk)->pacing_timer.function = tcp_pace_kick;
 

From patchwork Fri Sep 21 15:51:52 2018
Content-Type:
text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Eric Dumazet X-Patchwork-Id: 973298 X-Patchwork-Delegate: davem@davemloft.net Return-Path: X-Original-To: patchwork-incoming-netdev@ozlabs.org Delivered-To: patchwork-incoming-netdev@ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=netdev-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=google.com header.i=@google.com header.b="cLeHTA7+"; dkim-atps=neutral Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 42Gylb5TsPz9sCD for ; Sat, 22 Sep 2018 01:52:11 +1000 (AEST) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2390499AbeIUVli (ORCPT ); Fri, 21 Sep 2018 17:41:38 -0400 Received: from mail-pg1-f194.google.com ([209.85.215.194]:35457 "EHLO mail-pg1-f194.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2390320AbeIUVli (ORCPT ); Fri, 21 Sep 2018 17:41:38 -0400 Received: by mail-pg1-f194.google.com with SMTP id 205-v6so5298631pgd.2 for ; Fri, 21 Sep 2018 08:52:09 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=x37/02ZN+x22WErdvqZId4sZQrvts3s0o/rCyF1BcN0=; b=cLeHTA7+IitC0DCxI2CNp9qbOnAhLJH8mpZ2Q2zK2+7Z314R10ryltX1/OQjrp66vy 3FQRKL5Dzry2P/HtQ+JBh2IQLhDl9+4JEUh405/JFYG6W13k2CSvCOEd1lJZOStV7ZX1 GI6/Rqz7phaoEaKoK980a8ErZTpPDCRQIiyeUGDn5aXPIMo2KDxdq00NJyOto0pj/7O4 s94A6EfW/cVA8jRyUZrO2UZz6Reu1SP8cUxDt/l3GO9cTVHdl7bWXaC18jbq7iEHKUmw oEdZ/0vHkeWuwW546667EIH7GYCKpSzVozpX7Wur78/Ptv6H0Xgr0i/gwGdKrVqDSOG9 gdmQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; 
c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=x37/02ZN+x22WErdvqZId4sZQrvts3s0o/rCyF1BcN0=; b=Iw4lWpWbIRSjUgNvmY7VJ64eU7Q6Y65l8ToBIjHZD6yRkQtx3h1uGI00YTq+4wL/KE wIhdHPgJuPoMzfbz7hn2xjob6NaYtrxGrSv/UmFDY3KSnxTo4SG0fb9h/hytaFGptQER AJUf7XwnQbwktgOlRg7o0WSZPtAbAiod8JJNx4+1cPcAo4tsV6NKStwjCbmavctEkRc0 e/lSM9nL0022sgllkQ9TlYiWe8VKC4o/p1qBeg9FKfQhv3ZlFLZ5NichfNkUB89SmSkZ PeSuY6NCV76lDuPzufIH5U8ZOGN9s4HzHbfxnxF1AFNC8GzwJ6EVjpvcihK2Yd2hwf3E uuHQ== X-Gm-Message-State: APzg51CU6eQtAxtoNlkI5al5BIGYmcjER1wuTuegkdGd2wljZNHPVsB+ YOXdxXiiAQtCl8bsJHcER604OA== X-Google-Smtp-Source: ANB0VdZQPU+VWC3krwg75nAfSe2E4O928kFgJjnlVDNBkc+PycFdvL8l30KcvPmyb2ISfffs9NsNcw== X-Received: by 2002:a65:5245:: with SMTP id q5-v6mr41277228pgp.67.1537545128486; Fri, 21 Sep 2018 08:52:08 -0700 (PDT) Received: from localhost ([2620:15c:2c4:201:f5a:7eca:440a:3ead]) by smtp.gmail.com with ESMTPSA id j184-v6sm35569005pge.77.2018.09.21.08.52.07 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Fri, 21 Sep 2018 08:52:07 -0700 (PDT) From: Eric Dumazet To: "David S . Miller" Cc: netdev , Van Jacobson , Neal Cardwell , Yuchung Cheng , Soheil Hassas Yeganeh , Willem de Bruijn , Eric Dumazet , Eric Dumazet Subject: [PATCH net-next 7/9] tcp: switch tcp and sch_fq to new earliest departure time model Date: Fri, 21 Sep 2018 08:51:52 -0700 Message-Id: <20180921155154.49489-8-edumazet@google.com> X-Mailer: git-send-email 2.19.0.444.g18242da7ef-goog In-Reply-To: <20180921155154.49489-1-edumazet@google.com> References: <20180921155154.49489-1-edumazet@google.com> MIME-Version: 1.0 Sender: netdev-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org TCP keeps track of tcp_wstamp_ns by itself, meaning sch_fq no longer has to do it. Thanks to this model, TCP can get more accurate RTT samples, since pacing no longer inflates them. 
This has the nice effect of removing some delays caused by the FQ
quantum mechanism, which were inflating max/P99 latencies.

Also, we might relax the tight TCP Small Queue limits in the future:
this new model allows TCP to build bigger batches, since sch_fq
(or a device with earliest departure time offload) ensures these
packets will be delivered on time.

Note that other protocols are not converted (they probably never
will be), so sch_fq still supports SO_MAX_PACING_RATE.

Tested:

The test shows the FQ pacing quantum artifact for low-rate flows:
it adds unexpected throttles for RPC flows, inflating max and P99
latencies.

The parameters were chosen to show what typically happens when a
TCP flow has a reduced pacing rate (this can be caused by a reduced
cwnd after a few losses, and/or an RTT above a few ms).

MIBS="MIN_LATENCY,MEAN_LATENCY,MAX_LATENCY,P99_LATENCY,STDDEV_LATENCY"

Before:
$ netperf -H 10.246.7.133 -t TCP_RR -Cc -T6,6 -- -q 2000000 -r 100,100 -o $MIBS
MIGRATED TCP REQUEST/RESPONSE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 10.246.7.133 () port 0 AF_INET : first burst 0 : cpu bind
Minimum Latency Microseconds,Mean Latency Microseconds,Maximum Latency Microseconds,99th Percentile Latency Microseconds,Stddev Latency Microseconds
19,82.78,5279,3825,482.02

After:
$ netperf -H 10.246.7.133 -t TCP_RR -Cc -T6,6 -- -q 2000000 -r 100,100 -o $MIBS
MIGRATED TCP REQUEST/RESPONSE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 10.246.7.133 () port 0 AF_INET : first burst 0 : cpu bind
Minimum Latency Microseconds,Mean Latency Microseconds,Maximum Latency Microseconds,99th Percentile Latency Microseconds,Stddev Latency Microseconds
20,49.94,128,63,3.18

Signed-off-by: Eric Dumazet
---
 net/ipv4/tcp_bbr.c    |  7 ++++---
 net/ipv4/tcp_output.c | 22 ++++++++++++++++++----
 net/sched/sch_fq.c    | 21 +++++++++++----------
 3 files changed, 33 insertions(+), 17 deletions(-)

diff --git a/net/ipv4/tcp_bbr.c b/net/ipv4/tcp_bbr.c
index
02ff2dde96094cf33b662a20994424a7adea509e..a5786e3e2c16ce53a332f29c9a55b9a641eec791 100644
--- a/net/ipv4/tcp_bbr.c
+++ b/net/ipv4/tcp_bbr.c
@@ -128,6 +128,9 @@ static const u32 bbr_probe_rtt_mode_ms = 200;
 /* Skip TSO below the following bandwidth (bits/sec): */
 static const int bbr_min_tso_rate = 1200000;
 
+/* Pace at ~1% below estimated bw, on average, to reduce queue at bottleneck. */
+static const int bbr_pacing_margin_percent = 1;
+
 /* We use a high_gain value of 2/ln(2) because it's the smallest pacing gain
  * that will allow a smoothly increasing pacing rate that will double each RTT
  * and send the same number of packets per RTT that an un-paced, slow-starting
@@ -208,12 +211,10 @@ static u64 bbr_rate_bytes_per_sec(struct sock *sk, u64 rate, int gain)
 {
 	unsigned int mss = tcp_sk(sk)->mss_cache;
 
-	if (!tcp_needs_internal_pacing(sk))
-		mss = tcp_mss_to_mtu(sk, mss);
 	rate *= mss;
 	rate *= gain;
 	rate >>= BBR_SCALE;
-	rate *= USEC_PER_SEC;
+	rate *= USEC_PER_SEC / 100 * (100 - bbr_pacing_margin_percent);
 	return rate >> BW_SCALE;
 }
 
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index a87068fa9b1aa582310df6371966fd2d6461edb8..2adb719e97b89021becfa1243d33c87df6cdf8a5 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -1012,9 +1012,23 @@ static void tcp_internal_pacing(struct sock *sk, const struct sk_buff *skb)
 	sock_hold(sk);
 }
 
-static void tcp_update_skb_after_send(struct tcp_sock *tp, struct sk_buff *skb)
+static void tcp_update_skb_after_send(struct sock *sk, struct sk_buff *skb)
 {
+	struct tcp_sock *tp = tcp_sk(sk);
+
 	skb->skb_mstamp_ns = tp->tcp_wstamp_ns;
+	if (sk->sk_pacing_status != SK_PACING_NONE) {
+		u32 rate = sk->sk_pacing_rate;
+
+		/* Original sch_fq does not pace first 10 MSS
+		 * Note that tp->data_segs_out overflows after 2^32 packets,
+		 * this is a minor annoyance.
+ */ + if (rate != ~0U && rate && tp->data_segs_out >= 10) { + tp->tcp_wstamp_ns += div_u64((u64)skb->len * NSEC_PER_SEC, rate); + /* TODO: update internal pacing here */ + } + } list_move_tail(&skb->tcp_tsorted_anchor, &tp->tsorted_sent_queue); } @@ -1178,7 +1192,7 @@ static int __tcp_transmit_skb(struct sock *sk, struct sk_buff *skb, err = net_xmit_eval(err); } if (!err && oskb) { - tcp_update_skb_after_send(tp, oskb); + tcp_update_skb_after_send(sk, oskb); tcp_rate_skb_sent(sk, oskb); } return err; @@ -2327,7 +2341,7 @@ static bool tcp_write_xmit(struct sock *sk, unsigned int mss_now, int nonagle, if (unlikely(tp->repair) && tp->repair_queue == TCP_SEND_QUEUE) { /* "skb_mstamp" is used as a start point for the retransmit timer */ - tcp_update_skb_after_send(tp, skb); + tcp_update_skb_after_send(sk, skb); goto repair; /* Skip network transmission */ } @@ -2902,7 +2916,7 @@ int __tcp_retransmit_skb(struct sock *sk, struct sk_buff *skb, int segs) } tcp_skb_tsorted_restore(skb); if (!err) { - tcp_update_skb_after_send(tp, skb); + tcp_update_skb_after_send(sk, skb); tcp_rate_skb_sent(sk, skb); } } else { diff --git a/net/sched/sch_fq.c b/net/sched/sch_fq.c index d5185c44e9a5f521ca99243b6e9b53ec05b84d49..77692ad6741de14025bd848741604e775742430b 100644 --- a/net/sched/sch_fq.c +++ b/net/sched/sch_fq.c @@ -491,11 +491,16 @@ static struct sk_buff *fq_dequeue(struct Qdisc *sch) } skb = f->head; - if (unlikely(skb && now < f->time_next_packet && - !skb_is_tcp_pure_ack(skb))) { - head->first = f->next; - fq_flow_set_throttled(q, f); - goto begin; + if (skb && !skb_is_tcp_pure_ack(skb)) { + u64 time_next_packet = max_t(u64, ktime_to_ns(skb->tstamp), + f->time_next_packet); + + if (now < time_next_packet) { + head->first = f->next; + f->time_next_packet = time_next_packet; + fq_flow_set_throttled(q, f); + goto begin; + } } skb = fq_dequeue_head(sch, f); @@ -513,11 +518,7 @@ static struct sk_buff *fq_dequeue(struct Qdisc *sch) prefetch(&skb->end); f->credit -= 
qdisc_pkt_len(skb); - if (!q->rate_enable) - goto out; - - /* Do not pace locally generated ack packets */ - if (skb_is_tcp_pure_ack(skb)) + if (ktime_to_ns(skb->tstamp) || !q->rate_enable) goto out; rate = q->flow_max_rate; From patchwork Fri Sep 21 15:51:53 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Eric Dumazet X-Patchwork-Id: 973299 X-Patchwork-Delegate: davem@davemloft.net Return-Path: X-Original-To: patchwork-incoming-netdev@ozlabs.org Delivered-To: patchwork-incoming-netdev@ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=netdev-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=google.com header.i=@google.com header.b="Bi0UoC7f"; dkim-atps=neutral Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 42Gylf4VLhz9sCD for ; Sat, 22 Sep 2018 01:52:14 +1000 (AEST) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2390512AbeIUVll (ORCPT ); Fri, 21 Sep 2018 17:41:41 -0400 Received: from mail-pl1-f193.google.com ([209.85.214.193]:47026 "EHLO mail-pl1-f193.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2390320AbeIUVlk (ORCPT ); Fri, 21 Sep 2018 17:41:40 -0400 Received: by mail-pl1-f193.google.com with SMTP id t20-v6so2710015ply.13 for ; Fri, 21 Sep 2018 08:52:11 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=q5PwJ94hhI+3Gk2VgqSLl0/t1bavBXPLRy+BX2VNDE0=; b=Bi0UoC7f1ca7UZ1YY10WiCvqicjwQqcUeWEwSVCjkQQW7zGFkDnqlEyI2MX6i+2fjo 
JZKKapElRYaKT+6LCAVdA91fIKNvPGwhS8E7vuqDEn7HhJOl7EC0iWgZzFeFcLZJ4qBM 6BIDVSQy3mt5ComBgu5ALrzUd5yQKcHwljr/5mIQIem02O7Ait8m/3EBN4fIK4uFqcNb Xg0hE9FZr3xrYWQQbf9tuzeVjwNZ3jbAT0PsVzJkeNb7tqjRoQXpDO+5I+QtOrBhOoH9 2sJHQk4dfgNoUzk5ukbMwyrQHzYkp2C66vlT3Kp+j2vSPG6J3Zaj54MFZRRIqJhxCff5 L3NQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=q5PwJ94hhI+3Gk2VgqSLl0/t1bavBXPLRy+BX2VNDE0=; b=Ma8ClEwPoO0hoXHppRUXL1/LKQ4SiYwbt47A2Bv59GAP39IPxHQSCf6aiXXWuxuBmL Tm61G8m2jVH0zk2qA8Mbtw5CkAp/WGk9ZbjZgtiZgHiJMbSwdP2tmomywJinRrqOHlnJ 50NHRFit2100SwNi+JxCPdtdNXoubdPhcLP0mN6GFWSJLhWF8mx9+BHxh89AJfJ17swe N5Dsqw1O4uJ+l4ypMTjZz56lo9opL++2BLKCDHWqyydqxpkuKbkRgG5vcS37RRrh6xtw bzGqnE3DbdoQ2jgT60beSJ3OkMPxYCFLK/jr/y+avAw8RsbAStjCUlvn/mmDEX7zswpP BNWA== X-Gm-Message-State: APzg51ATsZ0F3CsIWU00Iv7NMQFccZIMbvB2SnU6orfyujshtCv/HCFt PujrLBDWboXxwUgN0RDjUUOzyIlenfw= X-Google-Smtp-Source: ANB0VdYN9AmbCgMKyODDvi8BxoMroHQp5wmDFaGEclXJNGtGOIny5rLkIffM1Krfr/mNSmRxXhhA2w== X-Received: by 2002:a17:902:a504:: with SMTP id s4-v6mr46222422plq.101.1537545130472; Fri, 21 Sep 2018 08:52:10 -0700 (PDT) Received: from localhost ([2620:15c:2c4:201:f5a:7eca:440a:3ead]) by smtp.gmail.com with ESMTPSA id o20-v6sm72683088pfj.35.2018.09.21.08.52.09 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Fri, 21 Sep 2018 08:52:09 -0700 (PDT) From: Eric Dumazet To: "David S . 
Miller"
Cc: netdev, Van Jacobson, Neal Cardwell, Yuchung Cheng, Soheil Hassas Yeganeh, Willem de Bruijn, Eric Dumazet, Eric Dumazet
Subject: [PATCH net-next 8/9] tcp: switch tcp_internal_pacing() to tcp_wstamp_ns
Date: Fri, 21 Sep 2018 08:51:53 -0700
Message-Id: <20180921155154.49489-9-edumazet@google.com>
X-Mailer: git-send-email 2.19.0.444.g18242da7ef-goog
In-Reply-To: <20180921155154.49489-1-edumazet@google.com>
References: <20180921155154.49489-1-edumazet@google.com>
MIME-Version: 1.0
Sender: netdev-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: netdev@vger.kernel.org

Now that TCP keeps track of tcp_wstamp_ns, recording the earliest
departure time of the next packet, we can remove duplicate code
from tcp_internal_pacing().

This removes one ktime_get_tai_ns() call and a divide.

Signed-off-by: Eric Dumazet
---
 net/ipv4/tcp_output.c | 17 ++++-------------
 1 file changed, 4 insertions(+), 13 deletions(-)

diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 2adb719e97b89021becfa1243d33c87df6cdf8a5..fe7855b090e4feed6a7d1ba6ee874cdb23a9bd0c 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -993,21 +993,12 @@ enum hrtimer_restart tcp_pace_kick(struct hrtimer *timer)
 	return HRTIMER_NORESTART;
 }
 
-static void tcp_internal_pacing(struct sock *sk, const struct sk_buff *skb)
+static void tcp_internal_pacing(struct sock *sk)
 {
-	u64 len_ns;
-	u32 rate;
-
 	if (!tcp_needs_internal_pacing(sk))
 		return;
-	rate = sk->sk_pacing_rate;
-	if (!rate || rate == ~0U)
-		return;
-
-	len_ns = (u64)skb->len * NSEC_PER_SEC;
-	do_div(len_ns, rate);
 	hrtimer_start(&tcp_sk(sk)->pacing_timer,
-		      ktime_add_ns(ktime_get_tai_ns(), len_ns),
+		      ns_to_ktime(tcp_sk(sk)->tcp_wstamp_ns),
 		      HRTIMER_MODE_ABS_PINNED_SOFT);
 	sock_hold(sk);
 }
@@ -1026,7 +1017,8 @@ static void tcp_update_skb_after_send(struct sock *sk, struct sk_buff *skb)
 		 */
 		if (rate != ~0U && rate && tp->data_segs_out >= 10) {
 			tp->tcp_wstamp_ns += div_u64((u64)skb->len * NSEC_PER_SEC, rate);
-			/* TODO:
update internal pacing here */ + + tcp_internal_pacing(sk); } } list_move_tail(&skb->tcp_tsorted_anchor, &tp->tsorted_sent_queue); @@ -1167,7 +1159,6 @@ static int __tcp_transmit_skb(struct sock *sk, struct sk_buff *skb, tcp_event_data_sent(tp, sk); tp->data_segs_out += tcp_skb_pcount(skb); tp->bytes_sent += skb->len - tcp_header_size; - tcp_internal_pacing(sk, skb); } if (after(tcb->end_seq, tp->snd_nxt) || tcb->seq == tcb->end_seq) From patchwork Fri Sep 21 15:51:54 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Eric Dumazet X-Patchwork-Id: 973300 X-Patchwork-Delegate: davem@davemloft.net Return-Path: X-Original-To: patchwork-incoming-netdev@ozlabs.org Delivered-To: patchwork-incoming-netdev@ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=netdev-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=google.com header.i=@google.com header.b="GlMny86a"; dkim-atps=neutral Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 42Gylg2K7hz9sCP for ; Sat, 22 Sep 2018 01:52:15 +1000 (AEST) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2390519AbeIUVlm (ORCPT ); Fri, 21 Sep 2018 17:41:42 -0400 Received: from mail-pg1-f193.google.com ([209.85.215.193]:42064 "EHLO mail-pg1-f193.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2390505AbeIUVll (ORCPT ); Fri, 21 Sep 2018 17:41:41 -0400 Received: by mail-pg1-f193.google.com with SMTP id y4-v6so6241458pgp.9 for ; Fri, 21 Sep 2018 08:52:13 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references 
:mime-version:content-transfer-encoding; bh=U0sF/yeVs4aCHRi55n96y5hF56/ueKxNkTSc0C2uXVQ=; b=GlMny86aAnu6h1rVluIFxy6XqmHifn+uR4ks4lJtNkZp2ge9ifx+jw6QNlUxuWIdhk FkkUmvPzHo1Zs5l84rUsZ56/qoiEJch70pp/KxwKQ+Wdbm6tUQ/J5FtAPvI0VdrAnmY0 WET4+8sN8X4R19AIScd8744X4cHYjCSPOGXH0pYlyPXwi3tW21QwQXVJr+xYqmu91z7n Rqe2zuvF7imPr9ByyfVPzNPDGY90CLhjaj6jX7b0dcEXJXjdA8GmXpmYVEbpfx3C7E6n 1gqu8H+4ZR6cV48jxEzOD4S4hpLKTLXEEKZQyeSMLt22O3qIFEL7ENBxojp7w65VNZYD bjZg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=U0sF/yeVs4aCHRi55n96y5hF56/ueKxNkTSc0C2uXVQ=; b=rkYSCh6X4fUKQO5I6ynfEpzbmeX+S+A4CbBuDgCeCbhPKrak1PNF8bGFD3wka6navv /Zm8/XZagyRqo5BWiHPMqru3DrgXN0Ag8Jbr2qQGIbczEXil+T7cxOecJYvo2XL6WX7E sFx/u5OTOQrqgis/SjvMRmRLCCGG+tCLrimoGOADRiCqfsCGAHgIN95AJLHLhoQXqNBE 9GTF+kTaxPjYON/xdugwEyPpJcPBhdLvcJQkPpPjmL2H3m4Hp3fRtlOyJCfo+uHzXphZ ZZ7ANuNktghZNAxOyh0FWjUxPdiz/Yb5ensptTQr4YYFqZzU98pEfoazefG9b8BU+iwI dNoQ== X-Gm-Message-State: APzg51BjN+Yw0nsiAwYxH2hsC2vjZKKh7t+fmR0Nwnv3DEKIvcVeqNmb KHqBkxaaNTjMTdOSC+JMmbJXQg== X-Google-Smtp-Source: ANB0VdYA34sKu70SvvkEqQkR9PMwjHSOcSsd0/AYjAhsF+ze2ZRVY28rwsTvpAzeHtR+SURrA6XpGQ== X-Received: by 2002:a62:71c4:: with SMTP id m187-v6mr46991895pfc.232.1537545132345; Fri, 21 Sep 2018 08:52:12 -0700 (PDT) Received: from localhost ([2620:15c:2c4:201:f5a:7eca:440a:3ead]) by smtp.gmail.com with ESMTPSA id s23-v6sm36273131pgo.44.2018.09.21.08.52.11 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Fri, 21 Sep 2018 08:52:11 -0700 (PDT) From: Eric Dumazet To: "David S . 
Miller"
Cc: netdev, Van Jacobson, Neal Cardwell, Yuchung Cheng, Soheil Hassas Yeganeh, Willem de Bruijn, Eric Dumazet, Eric Dumazet
Subject: [PATCH net-next 9/9] net_sched: sch_fq: remove dead code dealing with retransmits
Date: Fri, 21 Sep 2018 08:51:54 -0700
Message-Id: <20180921155154.49489-10-edumazet@google.com>
X-Mailer: git-send-email 2.19.0.444.g18242da7ef-goog
In-Reply-To: <20180921155154.49489-1-edumazet@google.com>
References: <20180921155154.49489-1-edumazet@google.com>
MIME-Version: 1.0
Sender: netdev-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: netdev@vger.kernel.org

With the earliest departure time model, we no longer plan to
special-case TCP retransmits.

We therefore remove the dead code (dead because most compilers
already understood that skb_is_retransmit() was always false).

Signed-off-by: Eric Dumazet
---
 net/sched/sch_fq.c | 58 ++++------------------------------------------
 1 file changed, 5 insertions(+), 53 deletions(-)

diff --git a/net/sched/sch_fq.c b/net/sched/sch_fq.c
index 77692ad6741de14025bd848741604e775742430b..628a2cdcfc6f2fa69d9402f06881949d2e1423d9 100644
--- a/net/sched/sch_fq.c
+++ b/net/sched/sch_fq.c
@@ -106,7 +106,6 @@ struct fq_sched_data {
 
 	u64		stat_gc_flows;
 	u64		stat_internal_packets;
-	u64		stat_tcp_retrans;
 	u64		stat_throttled;
 	u64		stat_flows_plimit;
 	u64		stat_pkts_too_long;
@@ -327,62 +326,17 @@ static struct sk_buff *fq_dequeue_head(struct Qdisc *sch, struct fq_flow *flow)
 	return skb;
 }
 
-/* We might add in the future detection of retransmits
- * For the time being, just return false
- */
-static bool skb_is_retransmit(struct sk_buff *skb)
-{
-	return false;
-}
-
-/* add skb to flow queue
- * flow queue is a linked list, kind of FIFO, except for TCP retransmits
- * We special case tcp retransmits to be transmitted before other packets.
- * We rely on fact that TCP retransmits are unlikely, so we do not waste
- * a separate queue or a pointer.
- * head-> [retrans pkt 1] - * [retrans pkt 2] - * [ normal pkt 1] - * [ normal pkt 2] - * [ normal pkt 3] - * tail-> [ normal pkt 4] - */ static void flow_queue_add(struct fq_flow *flow, struct sk_buff *skb) { - struct sk_buff *prev, *head = flow->head; + struct sk_buff *head = flow->head; skb->next = NULL; - if (!head) { + if (!head) flow->head = skb; - flow->tail = skb; - return; - } - if (likely(!skb_is_retransmit(skb))) { + else flow->tail->next = skb; - flow->tail = skb; - return; - } - /* This skb is a tcp retransmit, - * find the last retrans packet in the queue - */ - prev = NULL; - while (skb_is_retransmit(head)) { - prev = head; - head = head->next; - if (!head) - break; - } - if (!prev) { /* no rtx packet in queue, become the new head */ - skb->next = flow->head; - flow->head = skb; - } else { - if (prev == flow->tail) - flow->tail = skb; - else - skb->next = prev->next; - prev->next = skb; - } + flow->tail = skb; } static int fq_enqueue(struct sk_buff *skb, struct Qdisc *sch, @@ -401,8 +355,6 @@ static int fq_enqueue(struct sk_buff *skb, struct Qdisc *sch, } f->qlen++; - if (skb_is_retransmit(skb)) - q->stat_tcp_retrans++; qdisc_qstats_backlog_inc(sch, skb); if (fq_flow_is_detached(f)) { struct sock *sk = skb->sk; @@ -874,7 +826,7 @@ static int fq_dump_stats(struct Qdisc *sch, struct gnet_dump *d) st.gc_flows = q->stat_gc_flows; st.highprio_packets = q->stat_internal_packets; - st.tcp_retrans = q->stat_tcp_retrans; + st.tcp_retrans = 0; st.throttled = q->stat_throttled; st.flows_plimit = q->stat_flows_plimit; st.pkts_too_long = q->stat_pkts_too_long;
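
Note for readers of the series: with the retransmit special case gone, flow_queue_add() in the patch above reduces to a plain FIFO tail append. The following is a minimal userspace sketch of that logic, not kernel code. The types struct pkt and struct flow and the function flow_append() are hypothetical stand-ins chosen for illustration; they are not the kernel's struct sk_buff, struct fq_flow, or flow_queue_add().

```c
#include <stddef.h>

/* Hypothetical stand-in for struct sk_buff: only the list link matters. */
struct pkt {
	struct pkt *next;
	int id;
};

/* Hypothetical stand-in for struct fq_flow: a singly linked FIFO. */
struct flow {
	struct pkt *head;
	struct pkt *tail;
};

/* Append p at the tail of f, following the shape of the simplified
 * flow_queue_add(): no scan for retransmits, O(1) per packet.
 */
static void flow_append(struct flow *f, struct pkt *p)
{
	p->next = NULL;
	if (!f->head)
		f->head = p;		/* empty queue: p is also the head */
	else
		f->tail->next = p;	/* link behind the current tail */
	f->tail = p;
}
```

Packets then leave in exactly the order they were appended; any ordering by departure time is assumed to come from skb->tstamp, as the earlier patches in the series describe.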