diff mbox

[net-next] tcp: Fix CWV being too strict on thin streams

Message ID CADVnQy=BXvWMXUGeeABjsHDpfa4FF+RUfCvJXpupcyYKG+=+ig@mail.gmail.com
State RFC, archived
Delegated to: David Miller
Headers show

Commit Message

Neal Cardwell Sept. 19, 2015, 12:46 p.m. UTC
On Fri, Sep 18, 2015 at 7:38 PM, Bendik Rønning Opstad
<bro.devel@gmail.com> wrote:
>
> Application limited streams such as thin streams, that transmit small
> amounts of payload in relatively few packets per RTT, are prevented from
> growing the CWND after experiencing loss. This leads to increased
> sojourn times for data segments in streams that often transmit
> time-dependent data.
>
> After the CWND is reduced due to loss, and an ACK has made room in the
> send window for more packets to be transmitted, the CWND will not grow
> unless there is more unsent data buffered in the output queue than the
> CWND permits to be sent. That is because tcp_cwnd_validate(), which
> updates tp->is_cwnd_limited, is only called in tcp_write_xmit() when at
> least one packet with new data has been sent. However, if all the
> buffered data in the output queue was sent within the current CWND,
> is_cwnd_limited will remain false even when there is no more room in the
> CWND. While the CWND is fully utilized, any new data put on the output
> queue will be held back (i.e. limited by the CWND), but
> tp->is_cwnd_limited will not be updated as no packets were transmitted.
>
> Fix by updating tp->is_cwnd_limited if no packets are sent due to the
> CWND being fully utilized.

Thanks for this report!

When you say "CWND is reduced due to loss", are you talking about RTO
or Fast Recovery? Do you have any traces you can share that illustrate
this issue?

Have you verified that this patch fixes the issue you identified? I
think the fix may be in the wrong place for that scenario, or at least
incomplete.

In the scenario you describe, you say "all the buffered data in the
output queue was sent within the current CWND", which means that there
is no unsent data, so in tcp_write_xmit() the call to tcp_send_head()
would return NULL, so we would not enter the while loop, and could not
set is_cwnd_limited to true.

In the scenario you describe, I think we'd need to check for being
cwnd-limited in tcp_xmit_retransmit_queue().

How about something like the following patch:


neal
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index d0ad355..8e6a772 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -2145,6 +2145,7 @@  repair:
                tcp_cwnd_validate(sk, is_cwnd_limited);
                return false;
        }
+       tp->is_cwnd_limited |= is_cwnd_limited;
        return !tp->packets_out && tcp_send_head(sk);
 }

@@ -2762,8 +2763,10 @@  void tcp_xmit_retransmit_queue(struct sock *sk)
                 * packet to be MSS sized and all the
                 * packet counting works out.
                 */
-               if (tcp_packets_in_flight(tp) >= tp->snd_cwnd)
+               if (tcp_packets_in_flight(tp) >= tp->snd_cwnd) {
+                       tp->is_cwnd_limited = true;
                        return;
+               }

                if (fwd_rexmitting) {
 begin_fwd: