From patchwork Wed Mar 7 12:59:25 2018
X-Patchwork-Id: 882615
X-Patchwork-Delegate: davem@davemloft.net
From: Ilpo Järvinen
To: netdev@vger.kernel.org
Subject: [PATCH net 1/5] tcp: feed correct number of pkts acked to cc modules also in recovery
Date: Wed, 7 Mar 2018 14:59:25 +0200
Message-Id: <1520427569-14365-2-git-send-email-ilpo.jarvinen@helsinki.fi>
In-Reply-To: <1520427569-14365-1-git-send-email-ilpo.jarvinen@helsinki.fi>
References: <1520427569-14365-1-git-send-email-ilpo.jarvinen@helsinki.fi>

In the slow start that follows an RTO, pkts_acked is clearly not the
correct value to feed to the congestion control modules, as it may
include segments that were not lost and therefore did not need to be
retransmitted. tcp_slow_start will then add the excess to cwnd,
bloating it and triggering a burst. Instead, we want to pass only the
number of retransmitted segments that were covered by the cumulative
ACK (and potentially newly sent data segments too, if the cumulative
ACK covers that far).
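For illustration only (not part of this change), the slow start helper
that consumes the acked count; this is quoted from memory of
net/ipv4/tcp_cong.c around this kernel version, so the exact body may
differ in a given tree:

	/* Each reported acked segment grows snd_cwnd by one, capped at
	 * ssthresh; the remainder is returned for congestion avoidance.
	 * Over-reporting acked segments after an RTO therefore inflates
	 * cwnd directly.
	 */
	u32 tcp_slow_start(struct tcp_sock *tp, u32 acked)
	{
		u32 cwnd = min(tp->snd_cwnd + acked, tp->snd_ssthresh);

		acked -= cwnd - tp->snd_cwnd;
		tp->snd_cwnd = min(cwnd, tp->snd_cwnd_clamp);

		return acked;
	}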
Signed-off-by: Ilpo Järvinen
---
 net/ipv4/tcp_input.c | 19 +++++++++++++++++--
 1 file changed, 17 insertions(+), 2 deletions(-)

diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 9a1b3c1..0305f6d 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -3027,6 +3027,8 @@ static int tcp_clean_rtx_queue(struct sock *sk, u32 prior_fack,
 	long seq_rtt_us = -1L;
 	long ca_rtt_us = -1L;
 	u32 pkts_acked = 0;
+	u32 rexmit_acked = 0;
+	u32 newdata_acked = 0;
 	u32 last_in_flight = 0;
 	bool rtt_update;
 	int flag = 0;
@@ -3056,8 +3058,10 @@ static int tcp_clean_rtx_queue(struct sock *sk, u32 prior_fack,
 		}
 
 		if (unlikely(sacked & TCPCB_RETRANS)) {
-			if (sacked & TCPCB_SACKED_RETRANS)
+			if (sacked & TCPCB_SACKED_RETRANS) {
 				tp->retrans_out -= acked_pcount;
+				rexmit_acked += acked_pcount;
+			}
 			flag |= FLAG_RETRANS_DATA_ACKED;
 		} else if (!(sacked & TCPCB_SACKED_ACKED)) {
 			last_ackt = skb->skb_mstamp;
@@ -3068,8 +3072,11 @@ static int tcp_clean_rtx_queue(struct sock *sk, u32 prior_fack,
 			last_in_flight = TCP_SKB_CB(skb)->tx.in_flight;
 			if (before(start_seq, reord))
 				reord = start_seq;
-			if (!after(scb->end_seq, tp->high_seq))
+			if (!after(scb->end_seq, tp->high_seq)) {
 				flag |= FLAG_ORIG_SACK_ACKED;
+			} else {
+				newdata_acked += acked_pcount;
+			}
 		}
 
 		if (sacked & TCPCB_SACKED_ACKED) {
@@ -3151,6 +3158,14 @@ static int tcp_clean_rtx_queue(struct sock *sk, u32 prior_fack,
 	}
 
 	if (tcp_is_reno(tp)) {
+		/* Due to discontinuity on RTO in the artificial
+		 * sacked_out calculations, TCP must restrict
+		 * pkts_acked without SACK to rexmits and new data
+		 * segments
+		 */
+		if (icsk->icsk_ca_state == TCP_CA_Loss)
+			pkts_acked = rexmit_acked + newdata_acked;
+
 		tcp_remove_reno_sacks(sk, pkts_acked);
 	} else {
 		int delta;

From patchwork Wed Mar 7 12:59:26 2018
X-Patchwork-Id: 882614
X-Patchwork-Delegate: davem@davemloft.net
From: Ilpo Järvinen
To: netdev@vger.kernel.org
Subject: [PATCH net 2/5] tcp: prevent bogus FRTO undos with non-SACK flows
Date: Wed, 7 Mar 2018 14:59:26 +0200
Message-Id: <1520427569-14365-3-git-send-email-ilpo.jarvinen@helsinki.fi>
In-Reply-To: <1520427569-14365-1-git-send-email-ilpo.jarvinen@helsinki.fi>
References: <1520427569-14365-1-git-send-email-ilpo.jarvinen@helsinki.fi>

In the non-SACK case, any acknowledged non-retransmitted segment sets
FLAG_ORIG_SACK_ACKED in tcp_clean_rtx_queue even when there is no
indication that it was actually delivered as the original transmission
(the scoreboard is not kept with TCPCB_SACKED_ACKED bits in the
non-SACK case). This causes bogus undos in ordinary RTO recoveries
where segments are lost here and there, with a few delivered segments
in between the losses: a cumulative ACK covers the retransmitted
segments at the bottom of the window and the non-retransmitted ones
above them, so FLAG_ORIG_SACK_ACKED gets set in tcp_clean_rtx_queue
and a spurious FRTO undo results.

Make the check stricter for the non-SACK case and require that none of
the cumulatively ACKed segments were retransmitted, which is what
holds on the last step of the FRTO algorithm because only new segments
were sent there. Only then allow the FRTO undo to proceed in the
non-SACK case.
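For reference, the setter in question (the excerpt below is taken from
the tcp_clean_rtx_queue hunk of patch 1/5 in this series, with
unrelated lines elided): in the non-SACK case every cumulatively ACKed,
never-retransmitted skb at or below high_seq takes this branch, so the
flag by itself says nothing about whether the original transmission was
delivered.

		} else if (!(sacked & TCPCB_SACKED_ACKED)) {
			last_ackt = skb->skb_mstamp;
			/* ... */
			if (!after(scb->end_seq, tp->high_seq))
				flag |= FLAG_ORIG_SACK_ACKED;
		}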
Signed-off-by: Ilpo Järvinen
---
 net/ipv4/tcp_input.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 0305f6d..1a33752 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -2629,8 +2629,13 @@ static void tcp_process_loss(struct sock *sk, int flag, bool is_dupack,
 	if (tp->frto) { /* F-RTO RFC5682 sec 3.1 (sack enhanced version). */
 		/* Step 3.b. A timeout is spurious if not all data are
 		 * lost, i.e., never-retransmitted data are (s)acked.
+		 *
+		 * As the non-SACK case does not keep track of TCPCB_SACKED_ACKED,
+		 * we need to ensure that none of the cumulative ACKed segments
+		 * was retransmitted to confirm the validity of FLAG_ORIG_SACK_ACKED.
 		 */
 		if ((flag & FLAG_ORIG_SACK_ACKED) &&
+		    (tcp_is_sack(tp) || !(flag & FLAG_RETRANS_DATA_ACKED)) &&
 		    tcp_try_undo_loss(sk, true))
 			return;
 

From patchwork Wed Mar 7 12:59:27 2018
X-Patchwork-Id: 882613
X-Patchwork-Delegate: davem@davemloft.net
From: Ilpo Järvinen
To: netdev@vger.kernel.org
Subject: [PATCH net 3/5] tcp: move false FR condition into tcp_false_fast_retrans_possible()
Date: Wed, 7 Mar 2018 14:59:27 +0200
Message-Id: <1520427569-14365-4-git-send-email-ilpo.jarvinen@helsinki.fi>
In-Reply-To: <1520427569-14365-1-git-send-email-ilpo.jarvinen@helsinki.fi>
References: <1520427569-14365-1-git-send-email-ilpo.jarvinen@helsinki.fi>

Move the false fast retransmit check out of tcp_try_undo_recovery()
into a helper, tcp_false_fast_retrans_possible(), so that a later
patch in this series can reuse it. No functional changes.

Signed-off-by: Ilpo Järvinen
---
 net/ipv4/tcp_input.c | 21 +++++++++++++++++----
 1 file changed, 17 insertions(+), 4 deletions(-)

diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 1a33752..e20f9ad 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -2211,6 +2211,19 @@ static void tcp_update_scoreboard(struct sock *sk, int fast_rexmit)
 	}
 }
 
+/* False fast retransmits may occur when SACK is not in use under certain
+ * conditions (RFC6582). The sender MUST hold old state until something
+ * *above* high_seq is ACKed to prevent triggering such false fast
+ * retransmits. SACK TCP is safe.
+ */
+static bool tcp_false_fast_retrans_possible(const struct sock *sk,
+					    const u32 snd_una)
+{
+	const struct tcp_sock *tp = tcp_sk(sk);
+
+	return ((snd_una == tp->high_seq) && tcp_is_reno(tp));
+}
+
 static bool tcp_tsopt_ecr_before(const struct tcp_sock *tp, u32 when)
 {
 	return tp->rx_opt.saw_tstamp && tp->rx_opt.rcv_tsecr &&
@@ -2350,10 +2363,10 @@ static bool tcp_try_undo_recovery(struct sock *sk)
 	} else if (tp->rack.reo_wnd_persist) {
 		tp->rack.reo_wnd_persist--;
 	}
-	if (tp->snd_una == tp->high_seq && tcp_is_reno(tp)) {
-		/* Hold old state until something *above* high_seq
-		 * is ACKed. For Reno it is MUST to prevent false
-		 * fast retransmits (RFC2582). SACK TCP is safe. */
+	if (tcp_false_fast_retrans_possible(sk, tp->snd_una)) {
+		/* Hold old state until something *above* high_seq is ACKed
+		 * if false fast retransmit is possible.
+		 */
 		if (!tcp_any_retrans_done(sk))
 			tp->retrans_stamp = 0;
 		return true;

From patchwork Wed Mar 7 12:59:28 2018
X-Patchwork-Id: 882616
X-Patchwork-Delegate: davem@davemloft.net
From: Ilpo Järvinen
To: netdev@vger.kernel.org
Subject: [PATCH net 4/5] tcp: prevent bogus undos when SACK is not enabled
Date: Wed, 7 Mar 2018 14:59:28 +0200
Message-Id: <1520427569-14365-5-git-send-email-ilpo.jarvinen@helsinki.fi>
In-Reply-To: <1520427569-14365-1-git-send-email-ilpo.jarvinen@helsinki.fi>
References: <1520427569-14365-1-git-send-email-ilpo.jarvinen@helsinki.fi>

A bogus undo can trigger when the loss recovery state is held until
snd_una is above high_seq. If tcp_any_retrans_done returns zero,
retrans_stamp is cleared in this transient state. On the next ACK,
tcp_try_undo_recovery executes again and tcp_may_undo will always
return true because tcp_packet_delayed has this condition:

	return !tp->retrans_stamp || ...

Check for the false fast retransmit transient condition in
tcp_packet_delayed to avoid such bogus undos. Since snd_una may have
advanced on this ACK while the CA state still remains unchanged,
prior_snd_una needs to be passed instead of tp->snd_una.
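For reference, the pre-patch pair (as removed by the diff below): once
retrans_stamp has been cleared in the transient state, the first
disjunct makes tcp_packet_delayed, and hence tcp_may_undo, return true
regardless of whether the timestamps actually indicate a delayed
packet.

	static inline bool tcp_packet_delayed(const struct tcp_sock *tp)
	{
		/* With retrans_stamp == 0 this is unconditionally true */
		return !tp->retrans_stamp ||
		       tcp_tsopt_ecr_before(tp, tp->retrans_stamp);
	}

	static inline bool tcp_may_undo(const struct tcp_sock *tp)
	{
		return tp->undo_marker && (!tp->undo_retrans || tcp_packet_delayed(tp));
	}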
Signed-off-by: Ilpo Järvinen
---
 net/ipv4/tcp_input.c | 50 ++++++++++++++++++++++++++++++++++----------------
 1 file changed, 34 insertions(+), 16 deletions(-)

diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index e20f9ad..b689915 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -98,6 +98,7 @@ int sysctl_tcp_max_orphans __read_mostly = NR_FILE;
 #define FLAG_UPDATE_TS_RECENT	0x4000 /* tcp_replace_ts_recent() */
 #define FLAG_NO_CHALLENGE_ACK	0x8000 /* do not call tcp_send_challenge_ack()	*/
 #define FLAG_ACK_MAYBE_DELAYED	0x10000 /* Likely a delayed ACK */
+#define FLAG_PACKET_DELAYED	0x20000 /* 0 rexmits or tstamps reveal delayed pkt */
 
 #define FLAG_ACKED		(FLAG_DATA_ACKED|FLAG_SYN_ACKED)
 #define FLAG_NOT_DUP		(FLAG_DATA|FLAG_WIN_UPDATE|FLAG_ACKED)
@@ -2243,10 +2244,19 @@ static bool tcp_skb_spurious_retrans(const struct tcp_sock *tp,
 /* Nothing was retransmitted or returned timestamp is less
  * than timestamp of the first retransmission.
  */
-static inline bool tcp_packet_delayed(const struct tcp_sock *tp)
+static inline bool tcp_packet_delayed(const struct sock *sk,
+				      const u32 prior_snd_una)
 {
-	return !tp->retrans_stamp ||
-	       tcp_tsopt_ecr_before(tp, tp->retrans_stamp);
+	const struct tcp_sock *tp = tcp_sk(sk);
+
+	if (!tp->retrans_stamp) {
+		/* Sender will be in a transient state with cleared
+		 * retrans_stamp during false fast retransmit prevention
+		 * mechanism
+		 */
+		return !tcp_false_fast_retrans_possible(sk, prior_snd_una);
+	}
+	return tcp_tsopt_ecr_before(tp, tp->retrans_stamp);
 }
 
 /* Undo procedures. */
@@ -2336,17 +2346,20 @@ static void tcp_undo_cwnd_reduction(struct sock *sk, bool unmark_loss)
 	tp->rack.advanced = 1; /* Force RACK to re-exam losses */
 }
 
-static inline bool tcp_may_undo(const struct tcp_sock *tp)
+static inline bool tcp_may_undo(const struct sock *sk, const int flag)
 {
-	return tp->undo_marker && (!tp->undo_retrans || tcp_packet_delayed(tp));
+	const struct tcp_sock *tp = tcp_sk(sk);
+
+	return tp->undo_marker &&
+	       (!tp->undo_retrans || (flag & FLAG_PACKET_DELAYED));
 }
 
 /* People celebrate: "We love our President!" */
-static bool tcp_try_undo_recovery(struct sock *sk)
+static bool tcp_try_undo_recovery(struct sock *sk, const int flag)
 {
 	struct tcp_sock *tp = tcp_sk(sk);
 
-	if (tcp_may_undo(tp)) {
+	if (tcp_may_undo(sk, flag)) {
 		int mib_idx;
 
 		/* Happy end! We did not retransmit anything
@@ -2393,11 +2406,11 @@ static bool tcp_try_undo_dsack(struct sock *sk)
 }
 
 /* Undo during loss recovery after partial ACK or using F-RTO. */
-static bool tcp_try_undo_loss(struct sock *sk, bool frto_undo)
+static bool tcp_try_undo_loss(struct sock *sk, bool frto_undo, const int flag)
 {
 	struct tcp_sock *tp = tcp_sk(sk);
 
-	if (frto_undo || tcp_may_undo(tp)) {
+	if (frto_undo || tcp_may_undo(sk, flag)) {
 		tcp_undo_cwnd_reduction(sk, true);
 
 		DBGUNDO(sk, "partial loss");
@@ -2636,7 +2649,7 @@ static void tcp_process_loss(struct sock *sk, int flag, bool is_dupack,
 	bool recovered = !before(tp->snd_una, tp->high_seq);
 
 	if ((flag & FLAG_SND_UNA_ADVANCED) &&
-	    tcp_try_undo_loss(sk, false))
+	    tcp_try_undo_loss(sk, false, flag))
 		return;
 
 	if (tp->frto) { /* F-RTO RFC5682 sec 3.1 (sack enhanced version). */
@@ -2649,7 +2662,7 @@ static void tcp_process_loss(struct sock *sk, int flag, bool is_dupack,
 		 */
 		if ((flag & FLAG_ORIG_SACK_ACKED) &&
 		    (tcp_is_sack(tp) || !(flag & FLAG_RETRANS_DATA_ACKED)) &&
-		    tcp_try_undo_loss(sk, true))
+		    tcp_try_undo_loss(sk, true, flag))
 			return;
 
 		if (after(tp->snd_nxt, tp->high_seq)) {
@@ -2672,7 +2685,7 @@ static void tcp_process_loss(struct sock *sk, int flag, bool is_dupack,
 
 	if (recovered) {
 		/* F-RTO RFC5682 sec 3.1 step 2.a and 1st part of step 3.a */
-		tcp_try_undo_recovery(sk);
+		tcp_try_undo_recovery(sk, flag);
 		return;
 	}
 	if (tcp_is_reno(tp)) {
@@ -2688,11 +2701,12 @@ static void tcp_process_loss(struct sock *sk, int flag, bool is_dupack,
 }
 
 /* Undo during fast recovery after partial ACK. */
-static bool tcp_try_undo_partial(struct sock *sk, u32 prior_snd_una)
+static bool tcp_try_undo_partial(struct sock *sk, u32 prior_snd_una,
+				 const int flag)
 {
 	struct tcp_sock *tp = tcp_sk(sk);
 
-	if (tp->undo_marker && tcp_packet_delayed(tp)) {
+	if (tp->undo_marker && (flag & FLAG_PACKET_DELAYED)) {
 		/* Plain luck! Hole if filled with delayed
 		 * packet, rather than with a retransmit. Check reordering.
 		 */
@@ -2795,13 +2809,17 @@ static void tcp_fastretrans_alert(struct sock *sk, const u32 prior_snd_una,
 		case TCP_CA_Recovery:
 			if (tcp_is_reno(tp))
 				tcp_reset_reno_sack(tp);
-			if (tcp_try_undo_recovery(sk))
+			if (tcp_try_undo_recovery(sk, flag))
 				return;
 			tcp_end_cwnd_reduction(sk);
 			break;
 		}
 	}
 
+	if (icsk->icsk_ca_state >= TCP_CA_Recovery &&
+	    tcp_packet_delayed(sk, prior_snd_una))
+		flag |= FLAG_PACKET_DELAYED;
+
 	/* E. Process state. */
 	switch (icsk->icsk_ca_state) {
 	case TCP_CA_Recovery:
@@ -2809,7 +2827,7 @@ static void tcp_fastretrans_alert(struct sock *sk, const u32 prior_snd_una,
 			if (tcp_is_reno(tp) && is_dupack)
 				tcp_add_reno_sack(sk);
 		} else {
-			if (tcp_try_undo_partial(sk, prior_snd_una))
+			if (tcp_try_undo_partial(sk, prior_snd_una, flag))
 				return;
 			/* Partial ACK arrived. Force fast retransmit. */
 			do_lost = tcp_is_reno(tp) ||

From patchwork Wed Mar 7 12:59:29 2018
X-Patchwork-Id: 882617
X-Patchwork-Delegate: davem@davemloft.net
From: Ilpo Järvinen
To: netdev@vger.kernel.org
Subject: [PATCH net 5/5] tcp: send real dupACKs by locking advertized window for non-SACK flows
Date: Wed, 7 Mar 2018 14:59:29 +0200
Message-Id: <1520427569-14365-6-git-send-email-ilpo.jarvinen@helsinki.fi>
In-Reply-To: <1520427569-14365-1-git-send-email-ilpo.jarvinen@helsinki.fi>
References: <1520427569-14365-1-git-send-email-ilpo.jarvinen@helsinki.fi>

Currently, the TCP code is overly eager to update the advertised window
on every ACK. As a result, some of the ACKs that the receiver should
send as dupACKs carry a window update, and window-updating ACKs are not
counted as real dupACKs by the non-SACK sender-side code.

Make sure that when an out-of-order segment is received, no window
change is applied if we are going to send a dupACK. It is still fine to
change the window on non-dupACKs (such as the first ACK after
out-of-order arrivals begin, if that ACK was a delayed ACK).
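For context (not part of the patch), the sender-side classification
this aims at; the FLAG_NOT_DUP define is visible in the tcp_input.c
context of patch 4/5 above, while the is_dupack line is quoted from
memory of tcp_ack() and may differ slightly in a given tree:

	#define FLAG_NOT_DUP	(FLAG_DATA|FLAG_WIN_UPDATE|FLAG_ACKED)

	/* An ACK that carries a window update sets FLAG_WIN_UPDATE and
	 * therefore never counts as a duplicate ACK here, which is why
	 * the receiver must keep the advertised window unchanged on the
	 * ACKs it intends as dupACKs.
	 */
	is_dupack = !(flag & (FLAG_SND_UNA_ADVANCED | FLAG_NOT_DUP));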
Signed-off-by: Ilpo Järvinen
---
 include/linux/tcp.h   |  3 ++-
 net/ipv4/tcp_input.c  |  5 ++++-
 net/ipv4/tcp_output.c | 43 +++++++++++++++++++++++++------------------
 3 files changed, 31 insertions(+), 20 deletions(-)

diff --git a/include/linux/tcp.h b/include/linux/tcp.h
index 8f4c549..e239662 100644
--- a/include/linux/tcp.h
+++ b/include/linux/tcp.h
@@ -225,7 +225,8 @@ struct tcp_sock {
 		fastopen_connect:1, /* FASTOPEN_CONNECT sockopt */
 		fastopen_no_cookie:1, /* Allow send/recv SYN+data without a cookie */
 		is_sack_reneg:1,    /* in recovery from loss with SACK reneg? */
-		unused:2;
+		dupack_wnd_lock :1, /* Non-SACK constant rwnd dupacks needed? */
+		unused:1;
 	u8	nonagle     : 4,/* Disable Nagle algorithm?             */
 		thin_lto    : 1,/* Use linear timeouts for thin streams */
 		unused1	    : 1,
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index b689915..77a289f 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -4633,6 +4633,7 @@ int tcp_send_rcvq(struct sock *sk, struct msghdr *msg, size_t size)
 static void tcp_data_queue(struct sock *sk, struct sk_buff *skb)
 {
 	struct tcp_sock *tp = tcp_sk(sk);
+	struct inet_connection_sock *icsk = inet_csk(sk);
 	bool fragstolen;
 	int eaten;
 
@@ -4676,7 +4677,7 @@ static void tcp_data_queue(struct sock *sk, struct sk_buff *skb)
 		 * gap in queue is filled.
 		 */
 		if (RB_EMPTY_ROOT(&tp->out_of_order_queue))
-			inet_csk(sk)->icsk_ack.pingpong = 0;
+			icsk->icsk_ack.pingpong = 0;
 	}
 
 	if (tp->rx_opt.num_sacks)
@@ -4726,6 +4727,8 @@ static void tcp_data_queue(struct sock *sk, struct sk_buff *skb)
 		goto queue_and_out;
 	}
 
+	if (tcp_is_reno(tp) && !(icsk->icsk_ack.pending & ICSK_ACK_TIMER))
+		tp->dupack_wnd_lock = 1;
 	tcp_data_queue_ofo(sk, skb);
 }
 
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 6818042..45fbe92 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -249,25 +249,32 @@ static u16 tcp_select_window(struct sock *sk)
 {
 	struct tcp_sock *tp = tcp_sk(sk);
 	u32 old_win = tp->rcv_wnd;
-	u32 cur_win = tcp_receive_window(tp);
-	u32 new_win = __tcp_select_window(sk);
-
-	/* Never shrink the offered window */
-	if (new_win < cur_win) {
-		/* Danger Will Robinson!
-		 * Don't update rcv_wup/rcv_wnd here or else
-		 * we will not be able to advertise a zero
-		 * window in time. --DaveM
-		 *
-		 * Relax Will Robinson.
-		 */
-		if (new_win == 0)
-			NET_INC_STATS(sock_net(sk),
-				      LINUX_MIB_TCPWANTZEROWINDOWADV);
-		new_win = ALIGN(cur_win, 1 << tp->rx_opt.rcv_wscale);
+	u32 cur_win;
+	u32 new_win;
+
+	if (tp->dupack_wnd_lock) {
+		new_win = old_win;
+		tp->dupack_wnd_lock = 0;
+	} else {
+		cur_win = tcp_receive_window(tp);
+		new_win = __tcp_select_window(sk);
+		/* Never shrink the offered window */
+		if (new_win < cur_win) {
+			/* Danger Will Robinson!
+			 * Don't update rcv_wup/rcv_wnd here or else
+			 * we will not be able to advertise a zero
+			 * window in time. --DaveM
+			 *
+			 * Relax Will Robinson.
+			 */
+			if (new_win == 0)
+				NET_INC_STATS(sock_net(sk),
+					      LINUX_MIB_TCPWANTZEROWINDOWADV);
+			new_win = ALIGN(cur_win, 1 << tp->rx_opt.rcv_wscale);
+		}
+		tp->rcv_wnd = new_win;
+		tp->rcv_wup = tp->rcv_nxt;
 	}
-	tp->rcv_wnd = new_win;
-	tp->rcv_wup = tp->rcv_nxt;
 
 	/* Make sure we do not exceed the maximum possible
 	 * scaled window.