tcp_cubic: enable TCP timestamps

Message ID 20110308080926.GA22641@xanadu.blop.info
State Superseded, archived
Delegated to: David Miller

Commit Message

Lucas Nussbaum March 8, 2011, 8:09 a.m. UTC
The Hystart slow start algorithm requires precise RTT delay measurements
to decide when to leave slow start. However, currently, CUBIC doesn't
enable TCP timestamps. This can cause Hystart to mis-estimate the RTT,
and to leave slow start too early, generating bad performance since
convergence to the optimal cwnd is slower.

Timestamps are already used by TCP Illinois, LP, Vegas, Veno and Yeah.

Signed-off-by: Lucas Nussbaum <lucas.nussbaum@loria.fr>

Comments

stephen hemminger March 8, 2011, 6:42 p.m. UTC | #1
On Tue, 8 Mar 2011 09:09:26 +0100
Lucas Nussbaum <lucas.nussbaum@loria.fr> wrote:

> The Hystart slow start algorithm requires precise RTT delay measurements
> to decide when to leave slow start. However, currently, CUBIC doesn't
> enable TCP timestamps. This can cause Hystart to mis-estimate the RTT,
> and to leave slow start too early, generating bad performance since
> convergence to the optimal cwnd is slower.
> 
> Timestamps are already used by TCP Illinois, LP, Vegas, Veno and Yeah.
> 
> Signed-off-by: Lucas Nussbaum <lucas.nussbaum@loria.fr>

Just to explain what RTT_STAMP does: it causes the TCP receive code
to compute the RTT using high-resolution clocks rather than just
jiffies. It requires a call to ktime_get_real, which means reading the
clock source. This is cheap for TSC, a little expensive for HPET, and
expensive for PIT. I worry that enabling it may hurt regular users
on old desktops. But without enabling RTT_STAMP, packets that
get acked in less than a jiffy (1 - 10 ms) will 

Also I should have used ktime_get rather than ktime_get_real
because real time is altered by NTP and other actions.
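
The distinction between ktime_get and ktime_get_real has a userspace analogue: CLOCK_MONOTONIC versus CLOCK_REALTIME. The sketch below is userspace C, not kernel code, and the helper names now_ns and elapsed_monotonic_ns are made up for illustration. It shows interval timing against the monotonic clock, which never jumps backwards, whereas the realtime clock can be stepped by NTP or by an administrator.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L  /* expose clock_gettime() */
#endif
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Read the given POSIX clock and return it as nanoseconds. */
static int64_t now_ns(clockid_t clk)
{
    struct timespec ts;
    int rc = clock_gettime(clk, &ts);

    assert(rc == 0);
    (void)rc;  /* silence "unused" warning when NDEBUG is defined */
    return (int64_t)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

/* Time elapsed since start_ns, measured on the monotonic clock.
 * start_ns must itself come from now_ns(CLOCK_MONOTONIC). */
static int64_t elapsed_monotonic_ns(int64_t start_ns)
{
    return now_ns(CLOCK_MONOTONIC) - start_ns;
}
```

An interval measured as now_ns(CLOCK_REALTIME) minus an earlier realtime reading can come out negative or inflated if the wall clock is stepped in between, which is the concern raised above for RTT samples.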
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Miller March 8, 2011, 6:55 p.m. UTC | #2
From: Stephen Hemminger <shemminger@vyatta.com>
Date: Tue, 8 Mar 2011 10:42:11 -0800

> On Tue, 8 Mar 2011 09:09:26 +0100
> Lucas Nussbaum <lucas.nussbaum@loria.fr> wrote:
> 
>> The Hystart slow start algorithm requires precise RTT delay measurements
>> to decide when to leave slow start. However, currently, CUBIC doesn't
>> enable TCP timestamps. This can cause Hystart to mis-estimate the RTT,
>> and to leave slow start too early, generating bad performance since
>> convergence to the optimal cwnd is slower.
>> 
>> Timestamps are already used by TCP Illinois, LP, Vegas, Veno and Yeah.
>> 
>> Signed-off-by: Lucas Nussbaum <lucas.nussbaum@loria.fr>
> 
> Just to explain what RTT_STAMP does. It causes the tcp receive code
> to compute the rtt using high resolution clocks rather than just
> jiffies. It requires access to ktime_get_real which means accessing
> clock source. This is cheap for TSC, a little expensive for HPET but
> expensive for PIT. I worry that enabling it may hurt regular users
> on old desktops.  But without it enabling RTT_STAMP, packets that
> get acked in less than a jiffie (1 - 10 ms) will 
> 
> Also I should have used ktime_get rather than ktime_get_real
> because real time is altered by NTP and other actions.

This also means that whatever testing was originally done on Hystart
is dependent upon whatever value CONFIG_HZ had in the test kernel.
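
To make the CONFIG_HZ dependence concrete, here is a small userspace sketch (not kernel code) of how coarse a jiffy-based RTT becomes at different tick rates. The helper names are hypothetical, and rounding up to whole jiffies is an assumption made for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Length of one jiffy in microseconds for a given tick rate (CONFIG_HZ). */
static uint32_t jiffy_us(uint32_t hz)
{
    assert(hz > 0);
    return 1000000u / hz;
}

/* Hypothetical helper: convert a microsecond RTT to whole jiffies,
 * rounding up (assumed rounding mode for this sketch). */
static uint32_t rtt_us_to_jiffies(uint32_t rtt_us, uint32_t hz)
{
    uint32_t tick = jiffy_us(hz);

    return (rtt_us + tick - 1) / tick;
}

/* The RTT value that survives the conversion, back in microseconds. */
static uint32_t rtt_seen_us(uint32_t rtt_us, uint32_t hz)
{
    return rtt_us_to_jiffies(rtt_us, hz) * jiffy_us(hz);
}
```

Under these assumptions an 11000 us sample stays 11000 us at HZ=1000, becomes 12000 us at HZ=250, and becomes 20000 us at HZ=100, so a delay threshold tuned on one kernel configuration may behave differently on another.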
Sangtae Ha March 8, 2011, 7:15 p.m. UTC | #3
Yes. I remember that CONFIG_HZ was 1000 at that time and the value of
CONFIG_HZ could affect the algorithm.
I don't think HyStart needs this extra RTT_STAMP since we only need a
rough delay estimate.
Let me check HyStart with the latest git and with different CONFIG_HZ values.

Sangtae

On Tue, Mar 8, 2011 at 1:55 PM, David Miller <davem@davemloft.net> wrote:
>
> From: Stephen Hemminger <shemminger@vyatta.com>
> Date: Tue, 8 Mar 2011 10:42:11 -0800
>
> > On Tue, 8 Mar 2011 09:09:26 +0100
> > Lucas Nussbaum <lucas.nussbaum@loria.fr> wrote:
> >
> >> The Hystart slow start algorithm requires precise RTT delay measurements
> >> to decide when to leave slow start. However, currently, CUBIC doesn't
> >> enable TCP timestamps. This can cause Hystart to mis-estimate the RTT,
> >> and to leave slow start too early, generating bad performance since
> >> convergence to the optimal cwnd is slower.
> >>
> >> Timestamps are already used by TCP Illinois, LP, Vegas, Veno and Yeah.
> >>
> >> Signed-off-by: Lucas Nussbaum <lucas.nussbaum@loria.fr>
> >
> > Just to explain what RTT_STAMP does. It causes the tcp receive code
> > to compute the rtt using high resolution clocks rather than just
> > jiffies. It requires access to ktime_get_real which means accessing
> > clock source. This is cheap for TSC, a little expensive for HPET but
> > expensive for PIT. I worry that enabling it may hurt regular users
> > on old desktops.  But without it enabling RTT_STAMP, packets that
> > get acked in less than a jiffie (1 - 10 ms) will
> >
> > Also I should have used ktime_get rather than ktime_get_real
> > because real time is altered by NTP and other actions.
>
> This also means that whatever testing was originally done on Hystart
> is dependent upon whatever value CONFIG_HZ had in the test kernel.
Lucas Nussbaum March 8, 2011, 7:36 p.m. UTC | #4
On 08/03/11 at 14:15 -0500, Sangtae Ha wrote:
> Yes. I remember that CONFIG_HZ was 1000 at that time and the value of
> CONFIG_HZ could affect the algorithm.
> I don't think HyStart needs this extra RTT_STAMP since we only need a
> rough delay estimate.
> Let me check HyStart with the latest git and with different CONFIG_HZ values.

One of the problems I discovered is that ca->delay_min is completely
off when RTT_STAMP is disabled. For example, on a link with an 11 ms RTT,
I could get delay_min = 4 ms. But even with RTT_STAMP, there's the problem
that the code rounds the computed RTT value to a jiffy (in
bictcp_acked()).
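
As a rough illustration of why a tick-based minimum can land below the true RTT, the userspace sketch below (not the kernel code; all names are hypothetical) timestamps both the send and the ACK on a clock that only advances once per tick, then keeps the running minimum. It does not attempt to reproduce the exact 4 ms figure above, only the direction of the error.

```c
#include <assert.h>
#include <stdint.h>

/* RTT as seen by a clock that advances once per tick: both the send
 * time and the ACK time are truncated to a tick before subtracting. */
static uint32_t tick_rtt_us(uint32_t send_us, uint32_t rtt_us,
                            uint32_t tick_us)
{
    assert(tick_us > 0);
    uint32_t sent_tick = send_us / tick_us;
    uint32_t ack_tick = (send_us + rtt_us) / tick_us;

    return (ack_tick - sent_tick) * tick_us;
}

/* Running minimum over `samples` packets sent step_us apart, in the
 * spirit of a delay_min tracker. Returns UINT32_MAX if samples == 0. */
static uint32_t min_tick_rtt_us(uint32_t rtt_us, uint32_t tick_us,
                                uint32_t step_us, uint32_t samples)
{
    uint32_t delay_min = UINT32_MAX;

    for (uint32_t i = 0; i < samples; i++) {
        uint32_t d = tick_rtt_us(i * step_us, rtt_us, tick_us);

        if (d < delay_min)
            delay_min = d;
    }
    return delay_min;
}
```

With a true RTT of 11000 us and a 4000 us tick, individual samples come out as either 8000 us or 12000 us depending on where in the tick the packet was sent, and the minimum settles on 8000 us. With a 1000 us tick the same RTT is measured as 11000 us every time.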

My other patch mitigates the performance problem by making it harder for
Hystart to abort slow start. But after a few days working on this issue,
I'm wondering whether Hystart shouldn't be completely rewritten to work
on the usec values, or disabled by default.
David Miller March 8, 2011, 7:38 p.m. UTC | #5
From: Lucas Nussbaum <lucas.nussbaum@loria.fr>
Date: Tue, 8 Mar 2011 20:36:02 +0100

> My other patch mitigates the performance problem by making it harder for
> Hystart to abort slow start. But after a few days working on this issue,
> I'm wondering whether Hystart shouldn't be completely rewritten to work
> on the usec values, or disabled by default.

I am of the opinion that Hystart, like TCP VEGAS, fundamentally cannot
work.

It's disabled in several major distributions because of the performance
problems you noticed, and this was on my own personal recommendation.
Patch

diff --git a/net/ipv4/tcp_cubic.c b/net/ipv4/tcp_cubic.c
index 71d5f2f..3a73509 100644
--- a/net/ipv4/tcp_cubic.c
+++ b/net/ipv4/tcp_cubic.c
@@ -406,6 +406,7 @@  static void bictcp_acked(struct sock *sk, u32 cnt, s32 rtt_us)
 }
 
 static struct tcp_congestion_ops cubictcp = {
+       .flags          = TCP_CONG_RTT_STAMP,
        .init           = bictcp_init,
        .ssthresh       = bictcp_recalc_ssthresh,
        .cong_avoid     = bictcp_cong_avoid,