diff mbox series

[net-next] tcp: add TCP_INFO status for failed client TFO

Message ID 1571425340-7082-1-git-send-email-jbaron@akamai.com
State Changes Requested
Delegated to: David Miller
Headers show
Series [net-next] tcp: add TCP_INFO status for failed client TFO | expand

Commit Message

Jason Baron Oct. 18, 2019, 7:02 p.m. UTC
The TCPI_OPT_SYN_DATA bit as part of tcpi_options currently reports whether
or not data-in-SYN was ack'd on both the client and server side. We'd like
to gather more information on the client-side in the failure case in order
to indicate the reason for the failure. This can be useful for not only
debugging TFO, but also for creating TFO socket policies. For example, if
a middle box removes the TFO option or drops a data-in-SYN, we can
can detect this case, and turn off TFO for these connections saving the
extra retransmits.

The newly added tcpi_fastopen_client_fail status is 2 bits and has 4
states:

1) TFO_STATUS_UNSPEC

catch-all.

2) TFO_NO_COOKIE_SENT

If TFO_CLIENT_NO_COOKIE mode is off, this state indicates that no cookie
was sent because we don't have one yet, its not in cache or black-holing
may be enabled (already indicated by the global
LINUX_MIB_TCPFASTOPENBLACKHOLE).

3) TFO_NO_SYN_DATA

Data was sent with SYN, we received a SYN/ACK but it did not cover the data
portion. Cookie is not accepted by server because the cookie may be invalid
or the server may be overloaded.


4) TFO_NO_SYN_DATA_TIMEOUT

Data was sent with SYN, we received a SYN/ACK which did not cover the data
after at least 1 additional SYN was sent (without data). It may be the case
that a middle-box is dropping data-in-SYN packets. Thus, it would be more
efficient to not use TFO on this connection to avoid extra retransmits
during connection establishment.

These new fields certainly not cover all the cases where TFO may fail, but
other failures, such as SYN/ACK + data being dropped, will result in the
connection not becoming established. And a connection blackhole after
session establishment shows up as a stalled connection.

Signed-off-by: Jason Baron <jbaron@akamai.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Christoph Paasch <cpaasch@apple.com>
---
 include/linux/tcp.h      |  2 +-
 include/uapi/linux/tcp.h | 10 +++++++++-
 net/ipv4/tcp.c           |  1 +
 net/ipv4/tcp_fastopen.c  |  9 +++++++--
 net/ipv4/tcp_input.c     |  5 +++++
 5 files changed, 23 insertions(+), 4 deletions(-)

Comments

Neal Cardwell Oct. 18, 2019, 11:57 p.m. UTC | #1
On Fri, Oct 18, 2019 at 3:03 PM Jason Baron <jbaron@akamai.com> wrote:
>
> The TCPI_OPT_SYN_DATA bit as part of tcpi_options currently reports whether
> or not data-in-SYN was ack'd on both the client and server side. We'd like
> to gather more information on the client-side in the failure case in order
> to indicate the reason for the failure. This can be useful for not only
> debugging TFO, but also for creating TFO socket policies. For example, if
> a middle box removes the TFO option or drops a data-in-SYN, we can
> can detect this case, and turn off TFO for these connections saving the
> extra retransmits.
>
> The newly added tcpi_fastopen_client_fail status is 2 bits and has 4
> states:
>
> 1) TFO_STATUS_UNSPEC
>
> catch-all.
>
> 2) TFO_NO_COOKIE_SENT
>
> If TFO_CLIENT_NO_COOKIE mode is off, this state indicates that no cookie
> was sent because we don't have one yet, its not in cache or black-holing
> may be enabled (already indicated by the global
> LINUX_MIB_TCPFASTOPENBLACKHOLE).
>
> 3) TFO_NO_SYN_DATA
>
> Data was sent with SYN, we received a SYN/ACK but it did not cover the data
> portion. Cookie is not accepted by server because the cookie may be invalid
> or the server may be overloaded.
>
>
> 4) TFO_NO_SYN_DATA_TIMEOUT
>
> Data was sent with SYN, we received a SYN/ACK which did not cover the data
> after at least 1 additional SYN was sent (without data). It may be the case
> that a middle-box is dropping data-in-SYN packets. Thus, it would be more
> efficient to not use TFO on this connection to avoid extra retransmits
> during connection establishment.
>
> These new fields certainly not cover all the cases where TFO may fail, but
> other failures, such as SYN/ACK + data being dropped, will result in the
> connection not becoming established. And a connection blackhole after
> session establishment shows up as a stalled connection.
>
> Signed-off-by: Jason Baron <jbaron@akamai.com>
> Cc: Eric Dumazet <edumazet@google.com>
> Cc: Neal Cardwell <ncardwell@google.com>
> Cc: Christoph Paasch <cpaasch@apple.com>
> ---

Thanks for adding this!

It would be good to reset tp->fastopen_client_fail to 0 in tcp_disconnect().

> +/* why fastopen failed from client perspective */
> +enum tcp_fastopen_client_fail {
> +       TFO_STATUS_UNSPEC, /* catch-all */
> +       TFO_NO_COOKIE_SENT, /* if not in TFO_CLIENT_NO_COOKIE mode */
> +       TFO_NO_SYN_DATA, /* SYN-ACK did not ack SYN data */

I found the "TFO_NO_SYN_DATA" name a little unintuitive; it sounded to
me like this means the client didn't send a SYN+DATA. What about
"TFO_DATA_NOT_ACKED", or something like that?

If you don't mind, it would be great to cc: Yuchung on the next rev.

thanks,
neal
Yuchung Cheng Oct. 21, 2019, 6:02 p.m. UTC | #2
Thanks for the patch. Detailed comments below

On Fri, Oct 18, 2019 at 4:58 PM Neal Cardwell <ncardwell@google.com> wrote:
>
> On Fri, Oct 18, 2019 at 3:03 PM Jason Baron <jbaron@akamai.com> wrote:
> >
> > The TCPI_OPT_SYN_DATA bit as part of tcpi_options currently reports whether
> > or not data-in-SYN was ack'd on both the client and server side. We'd like
> > to gather more information on the client-side in the failure case in order
> > to indicate the reason for the failure. This can be useful for not only
> > debugging TFO, but also for creating TFO socket policies. For example, if
> > a middle box removes the TFO option or drops a data-in-SYN, we can
> > can detect this case, and turn off TFO for these connections saving the
> > extra retransmits.
> >
> > The newly added tcpi_fastopen_client_fail status is 2 bits and has 4
> > states:
> >
> > 1) TFO_STATUS_UNSPEC
> >
> > catch-all.
> >
> > 2) TFO_NO_COOKIE_SENT
> >
> > If TFO_CLIENT_NO_COOKIE mode is off, this state indicates that no cookie
> > was sent because we don't have one yet, its not in cache or black-holing
> > may be enabled (already indicated by the global
> > LINUX_MIB_TCPFASTOPENBLACKHOLE).

It'd be useful to separate the two that cookie is available but is
prohibited to use due to BH checking. We've seen users internally get
confused due to lack of this info (after seeing cookies from ip
metrics).

> >
> > 3) TFO_NO_SYN_DATA
> >
> > Data was sent with SYN, we received a SYN/ACK but it did not cover the data
> > portion. Cookie is not accepted by server because the cookie may be invalid
> > or the server may be overloaded.
> >
> >
> > 4) TFO_NO_SYN_DATA_TIMEOUT
> >
> > Data was sent with SYN, we received a SYN/ACK which did not cover the data
> > after at least 1 additional SYN was sent (without data). It may be the case
> > that a middle-box is dropping data-in-SYN packets. Thus, it would be more
> > efficient to not use TFO on this connection to avoid extra retransmits
> > during connection establishment.
> >
> > These new fields certainly not cover all the cases where TFO may fail, but
> > other failures, such as SYN/ACK + data being dropped, will result in the
> > connection not becoming established. And a connection blackhole after
> > session establishment shows up as a stalled connection.
> >
> > Signed-off-by: Jason Baron <jbaron@akamai.com>
> > Cc: Eric Dumazet <edumazet@google.com>
> > Cc: Neal Cardwell <ncardwell@google.com>
> > Cc: Christoph Paasch <cpaasch@apple.com>
> > ---
>
> Thanks for adding this!
>
> It would be good to reset tp->fastopen_client_fail to 0 in tcp_disconnect().
>
> > +/* why fastopen failed from client perspective */
> > +enum tcp_fastopen_client_fail {
> > +       TFO_STATUS_UNSPEC, /* catch-all */
> > +       TFO_NO_COOKIE_SENT, /* if not in TFO_CLIENT_NO_COOKIE mode */
> > +       TFO_NO_SYN_DATA, /* SYN-ACK did not ack SYN data */
>
> I found the "TFO_NO_SYN_DATA" name a little unintuitive; it sounded to
> me like this means the client didn't send a SYN+DATA. What about
> "TFO_DATA_NOT_ACKED", or something like that?
>
> If you don't mind, it would be great to cc: Yuchung on the next rev.
TFO_DATA_NOT_ACKED is already available from the inverse of TCPI_OPT_SYN_DATA
#define TCPI_OPT_SYN_DATA       32 /* SYN-ACK acked data in SYN sent or rcvd */

It occurs (3)(4) are already available indirectly from
TCPI_OPT_SYN_DATA and tcpi_total_retrans together, but the socket must
query tcpi_total_retrans right after connect/sendto returns which may
not be preferred.

How about an alternative proposal to the types to catch more TFO issues:

TFO_STATUS_UNSPEC
TFO_DISABLED_BLACKHOLE_DETECTED
TFO_COOKIE_UNAVAILABLE
TFO_SYN_RETRANSMITTED  // use in conjunction w/ TCPI_OPT_SYN_DATA for (3)(4)

>
> thanks,
> neal
Jason Baron Oct. 21, 2019, 6:27 p.m. UTC | #3
On 10/21/19 2:02 PM, Yuchung Cheng wrote:
> Thanks for the patch. Detailed comments below
> 
> On Fri, Oct 18, 2019 at 4:58 PM Neal Cardwell <ncardwell@google.com> wrote:
>>
>> On Fri, Oct 18, 2019 at 3:03 PM Jason Baron <jbaron@akamai.com> wrote:
>>>
>>> The TCPI_OPT_SYN_DATA bit as part of tcpi_options currently reports whether
>>> or not data-in-SYN was ack'd on both the client and server side. We'd like
>>> to gather more information on the client-side in the failure case in order
>>> to indicate the reason for the failure. This can be useful for not only
>>> debugging TFO, but also for creating TFO socket policies. For example, if
>>> a middle box removes the TFO option or drops a data-in-SYN, we can
>>> can detect this case, and turn off TFO for these connections saving the
>>> extra retransmits.
>>>
>>> The newly added tcpi_fastopen_client_fail status is 2 bits and has 4
>>> states:
>>>
>>> 1) TFO_STATUS_UNSPEC
>>>
>>> catch-all.
>>>
>>> 2) TFO_NO_COOKIE_SENT
>>>
>>> If TFO_CLIENT_NO_COOKIE mode is off, this state indicates that no cookie
>>> was sent because we don't have one yet, its not in cache or black-holing
>>> may be enabled (already indicated by the global
>>> LINUX_MIB_TCPFASTOPENBLACKHOLE).
> 
> It'd be useful to separate the two that cookie is available but is
> prohibited to use due to BH checking. We've seen users internally get
> confused due to lack of this info (after seeing cookies from ip
> metrics).
> 

ok, yeah i had been thinking about splitting these out but thought that
the LINUX_MIB_TCPFASTOPENBLACKHOLE counter could help differentiate
these cases - but I'm ok making it explicit.

>>>
>>> 3) TFO_NO_SYN_DATA
>>>
>>> Data was sent with SYN, we received a SYN/ACK but it did not cover the data
>>> portion. Cookie is not accepted by server because the cookie may be invalid
>>> or the server may be overloaded.
>>>
>>>
>>> 4) TFO_NO_SYN_DATA_TIMEOUT
>>>
>>> Data was sent with SYN, we received a SYN/ACK which did not cover the data
>>> after at least 1 additional SYN was sent (without data). It may be the case
>>> that a middle-box is dropping data-in-SYN packets. Thus, it would be more
>>> efficient to not use TFO on this connection to avoid extra retransmits
>>> during connection establishment.
>>>
>>> These new fields certainly not cover all the cases where TFO may fail, but
>>> other failures, such as SYN/ACK + data being dropped, will result in the
>>> connection not becoming established. And a connection blackhole after
>>> session establishment shows up as a stalled connection.
>>>
>>> Signed-off-by: Jason Baron <jbaron@akamai.com>
>>> Cc: Eric Dumazet <edumazet@google.com>
>>> Cc: Neal Cardwell <ncardwell@google.com>
>>> Cc: Christoph Paasch <cpaasch@apple.com>
>>> ---
>>
>> Thanks for adding this!
>>
>> It would be good to reset tp->fastopen_client_fail to 0 in tcp_disconnect().
>>
>>> +/* why fastopen failed from client perspective */
>>> +enum tcp_fastopen_client_fail {
>>> +       TFO_STATUS_UNSPEC, /* catch-all */
>>> +       TFO_NO_COOKIE_SENT, /* if not in TFO_CLIENT_NO_COOKIE mode */
>>> +       TFO_NO_SYN_DATA, /* SYN-ACK did not ack SYN data */
>>
>> I found the "TFO_NO_SYN_DATA" name a little unintuitive; it sounded to
>> me like this means the client didn't send a SYN+DATA. What about
>> "TFO_DATA_NOT_ACKED", or something like that?
>>
>> If you don't mind, it would be great to cc: Yuchung on the next rev.
> TFO_DATA_NOT_ACKED is already available from the inverse of TCPI_OPT_SYN_DATA
> #define TCPI_OPT_SYN_DATA       32 /* SYN-ACK acked data in SYN sent or rcvd */
> 
> It occurs (3)(4) are already available indirectly from
> TCPI_OPT_SYN_DATA and tcpi_total_retrans together, but the socket must
> query tcpi_total_retrans right after connect/sendto returns which may
> not be preferred.
> 
> How about an alternative proposal to the types to catch more TFO issues:
> 
> TFO_STATUS_UNSPEC
> TFO_DISABLED_BLACKHOLE_DETECTED
> TFO_COOKIE_UNAVAILABLE
> TFO_SYN_RETRANSMITTED  // use in conjunction w/ TCPI_OPT_SYN_DATA for (3)(4)

Ok, that set works for me. I will re-spin with these states for v2.
Thanks for the suggestion!

-Jason
William Dauchy Oct. 21, 2019, 6:49 p.m. UTC | #4
Hello Jason,

On Sat, Oct 19, 2019 at 11:10 AM Jason Baron <jbaron@akamai.com> wrote:
> The TCPI_OPT_SYN_DATA bit as part of tcpi_options currently reports whether
> or not data-in-SYN was ack'd on both the client and server side. We'd like
> to gather more information on the client-side in the failure case in order
> to indicate the reason for the failure. This can be useful for not only
> debugging TFO, but also for creating TFO socket policies. For example, if
> a middle box removes the TFO option or drops a data-in-SYN, we can
> can detect this case, and turn off TFO for these connections saving the
> extra retransmits.
>
> The newly added tcpi_fastopen_client_fail status is 2 bits and has 4
> states:
>
> 1) TFO_STATUS_UNSPEC
>
> catch-all.
>
> 2) TFO_NO_COOKIE_SENT
>
> If TFO_CLIENT_NO_COOKIE mode is off, this state indicates that no cookie
> was sent because we don't have one yet, its not in cache or black-holing
> may be enabled (already indicated by the global
> LINUX_MIB_TCPFASTOPENBLACKHOLE).
>
> 3) TFO_NO_SYN_DATA
>
> Data was sent with SYN, we received a SYN/ACK but it did not cover the data
> portion. Cookie is not accepted by server because the cookie may be invalid
> or the server may be overloaded.
>
>
> 4) TFO_NO_SYN_DATA_TIMEOUT
>
> Data was sent with SYN, we received a SYN/ACK which did not cover the data
> after at least 1 additional SYN was sent (without data). It may be the case
> that a middle-box is dropping data-in-SYN packets. Thus, it would be more
> efficient to not use TFO on this connection to avoid extra retransmits
> during connection establishment.
>
> These new fields certainly not cover all the cases where TFO may fail, but
> other failures, such as SYN/ACK + data being dropped, will result in the
> connection not becoming established. And a connection blackhole after
> session establishment shows up as a stalled connection.

I'm curious what would be the arguments compared to creating a new
getsockopt() to fetch this data?

Thanks,
Eric Dumazet Oct. 21, 2019, 7:02 p.m. UTC | #5
On 10/21/19 11:49 AM, William Dauchy wrote:
 
> I'm curious what would be the arguments compared to creating a new
> getsockopt() to fetch this data?

Reporting tsval/tsecr values were not well defined and seemed quite experimental to me.

TCP fastopen would use 2 unused bits, not real extra cost here.

This is persistent information after the connect(), while your tsval/tsecr report
seems quite dynamic stuff can be stale as soon as the info is fetched from the kernel.
William Dauchy Oct. 21, 2019, 7:07 p.m. UTC | #6
Hello Eric,

On Mon, Oct 21, 2019 at 9:02 PM Eric Dumazet <eric.dumazet@gmail.com> wrote:
> Reporting tsval/tsecr values were not well defined and seemed quite experimental to me.
>
> TCP fastopen would use 2 unused bits, not real extra cost here.
>
> This is persistent information after the connect(), while your tsval/tsecr report
> seems quite dynamic stuff can be stale as soon as the info is fetched from the kernel.

Thanks for your answer. I better understand the context.
Christoph Paasch Oct. 21, 2019, 7:51 p.m. UTC | #7
On 21/10/19 - 14:27:24, Jason Baron wrote:
> 
> 
> On 10/21/19 2:02 PM, Yuchung Cheng wrote:
> > Thanks for the patch. Detailed comments below
> > 
> > On Fri, Oct 18, 2019 at 4:58 PM Neal Cardwell <ncardwell@google.com> wrote:
> >>
> >> On Fri, Oct 18, 2019 at 3:03 PM Jason Baron <jbaron@akamai.com> wrote:
> >>>
> >>> The TCPI_OPT_SYN_DATA bit as part of tcpi_options currently reports whether
> >>> or not data-in-SYN was ack'd on both the client and server side. We'd like
> >>> to gather more information on the client-side in the failure case in order
> >>> to indicate the reason for the failure. This can be useful for not only
> >>> debugging TFO, but also for creating TFO socket policies. For example, if
> >>> a middle box removes the TFO option or drops a data-in-SYN, we can
> >>> can detect this case, and turn off TFO for these connections saving the
> >>> extra retransmits.
> >>>
> >>> The newly added tcpi_fastopen_client_fail status is 2 bits and has 4
> >>> states:
> >>>
> >>> 1) TFO_STATUS_UNSPEC
> >>>
> >>> catch-all.
> >>>
> >>> 2) TFO_NO_COOKIE_SENT
> >>>
> >>> If TFO_CLIENT_NO_COOKIE mode is off, this state indicates that no cookie
> >>> was sent because we don't have one yet, its not in cache or black-holing
> >>> may be enabled (already indicated by the global
> >>> LINUX_MIB_TCPFASTOPENBLACKHOLE).
> > 
> > It'd be useful to separate the two that cookie is available but is
> > prohibited to use due to BH checking. We've seen users internally get
> > confused due to lack of this info (after seeing cookies from ip
> > metrics).
> > 
> 
> ok, yeah i had been thinking about splitting these out but thought that
> the LINUX_MIB_TCPFASTOPENBLACKHOLE counter could help differentiate
> these cases - but I'm ok making it explicit.
> 
> >>>
> >>> 3) TFO_NO_SYN_DATA
> >>>
> >>> Data was sent with SYN, we received a SYN/ACK but it did not cover the data
> >>> portion. Cookie is not accepted by server because the cookie may be invalid
> >>> or the server may be overloaded.
> >>>
> >>>
> >>> 4) TFO_NO_SYN_DATA_TIMEOUT
> >>>
> >>> Data was sent with SYN, we received a SYN/ACK which did not cover the data
> >>> after at least 1 additional SYN was sent (without data). It may be the case
> >>> that a middle-box is dropping data-in-SYN packets. Thus, it would be more
> >>> efficient to not use TFO on this connection to avoid extra retransmits
> >>> during connection establishment.
> >>>
> >>> These new fields certainly not cover all the cases where TFO may fail, but
> >>> other failures, such as SYN/ACK + data being dropped, will result in the
> >>> connection not becoming established. And a connection blackhole after
> >>> session establishment shows up as a stalled connection.
> >>>
> >>> Signed-off-by: Jason Baron <jbaron@akamai.com>
> >>> Cc: Eric Dumazet <edumazet@google.com>
> >>> Cc: Neal Cardwell <ncardwell@google.com>
> >>> Cc: Christoph Paasch <cpaasch@apple.com>
> >>> ---
> >>
> >> Thanks for adding this!
> >>
> >> It would be good to reset tp->fastopen_client_fail to 0 in tcp_disconnect().
> >>
> >>> +/* why fastopen failed from client perspective */
> >>> +enum tcp_fastopen_client_fail {
> >>> +       TFO_STATUS_UNSPEC, /* catch-all */
> >>> +       TFO_NO_COOKIE_SENT, /* if not in TFO_CLIENT_NO_COOKIE mode */
> >>> +       TFO_NO_SYN_DATA, /* SYN-ACK did not ack SYN data */
> >>
> >> I found the "TFO_NO_SYN_DATA" name a little unintuitive; it sounded to
> >> me like this means the client didn't send a SYN+DATA. What about
> >> "TFO_DATA_NOT_ACKED", or something like that?
> >>
> >> If you don't mind, it would be great to cc: Yuchung on the next rev.
> > TFO_DATA_NOT_ACKED is already available from the inverse of TCPI_OPT_SYN_DATA
> > #define TCPI_OPT_SYN_DATA       32 /* SYN-ACK acked data in SYN sent or rcvd */
> > 
> > It occurs (3)(4) are already available indirectly from
> > TCPI_OPT_SYN_DATA and tcpi_total_retrans together, but the socket must
> > query tcpi_total_retrans right after connect/sendto returns which may
> > not be preferred.
> > 
> > How about an alternative proposal to the types to catch more TFO issues:
> > 
> > TFO_STATUS_UNSPEC
> > TFO_DISABLED_BLACKHOLE_DETECTED
> > TFO_COOKIE_UNAVAILABLE
> > TFO_SYN_RETRANSMITTED  // use in conjunction w/ TCPI_OPT_SYN_DATA for (3)(4)
> 
> Ok, that set works for me. I will re-spin with these states for v2.
> Thanks for the suggestion!

Actually, longterm I hope we would be able to get rid of the
blackhole-detection and fallback heuristics. In a far distant future where
these middleboxes have been weeded out ;-)

So, do we really want to eternalize this as part of the API in tcp_info ?


Christoph
Eric Dumazet Oct. 21, 2019, 8:36 p.m. UTC | #8
On Mon, Oct 21, 2019 at 12:53 PM Christoph Paasch <cpaasch@apple.com> wrote:
>

> Actually, longterm I hope we would be able to get rid of the
> blackhole-detection and fallback heuristics. In a far distant future where
> these middleboxes have been weeded out ;-)
>
> So, do we really want to eternalize this as part of the API in tcp_info ?
>

A getsockopt() option won't be available for netlink diag (ss command).

So it all depends on plans to use this FASTOPEN information ?
Jason Baron Oct. 21, 2019, 9:10 p.m. UTC | #9
On 10/21/19 4:36 PM, Eric Dumazet wrote:
> On Mon, Oct 21, 2019 at 12:53 PM Christoph Paasch <cpaasch@apple.com> wrote:
>>
> 
>> Actually, longterm I hope we would be able to get rid of the
>> blackhole-detection and fallback heuristics. In a far distant future where
>> these middleboxes have been weeded out ;-)
>>
>> So, do we really want to eternalize this as part of the API in tcp_info ?
>>
> 
> A getsockopt() option won't be available for netlink diag (ss command).
> 
> So it all depends on plans to use this FASTOPEN information ?
> 

The original proposal I had 4 states of interest:

First, we are interested in knowing when a socket has TFO set but
actually requires a retransmit of a non-TFO syn to become established.
In this case, we'd be better off not using TFO.

A second case is when the server immediately rejects the DATA and just
acks the syn (but not the data). Again in that case, we don't want to be
sending syn+data.

The third case was whether or not we sent a cookie. Perhaps, the server
doesn't have TFO enabled in which case, it really doesn't make make
sense to enable TFO in the first place. Or if one also controls the
server its helpful in understanding if the server is mis-configured. So
that was the motivation I had for the original four states that I
proposed (final state was a catch-all state).

Yuchung suggested dividing the 3rd case into 2 for - no cookie sent
because of blackhole or no cookie sent because its not in cache. And
then dropping the second state because we already have the
TCPI_OPT_SYN_DATA bit. However, the TCPI_OPT_SYN_DATA may not be set
because we may fallback in tcp_send_syn_data() due to send failure. So
I'm inclined to say that the second state is valuable. And since
blackhole is already a global thing via MIB, I didn't see a strong need
for it. But it sounded like Yuchung had an interest in it, and I'd
obviously like a set of states that is generally useful.

Thanks,

-Jason
Neal Cardwell Oct. 22, 2019, 2:13 a.m. UTC | #10
On Mon, Oct 21, 2019 at 5:11 PM Jason Baron <jbaron@akamai.com> wrote:
>
>
>
> On 10/21/19 4:36 PM, Eric Dumazet wrote:
> > On Mon, Oct 21, 2019 at 12:53 PM Christoph Paasch <cpaasch@apple.com> wrote:
> >>
> >
> >> Actually, longterm I hope we would be able to get rid of the
> >> blackhole-detection and fallback heuristics. In a far distant future where
> >> these middleboxes have been weeded out ;-)
> >>
> >> So, do we really want to eternalize this as part of the API in tcp_info ?
> >>
> >
> > A getsockopt() option won't be available for netlink diag (ss command).
> >
> > So it all depends on plans to use this FASTOPEN information ?
> >
>
> The original proposal I had 4 states of interest:
>
> First, we are interested in knowing when a socket has TFO set but
> actually requires a retransmit of a non-TFO syn to become established.
> In this case, we'd be better off not using TFO.
>
> A second case is when the server immediately rejects the DATA and just
> acks the syn (but not the data). Again in that case, we don't want to be
> sending syn+data.
>
> The third case was whether or not we sent a cookie. Perhaps, the server
> doesn't have TFO enabled in which case, it really doesn't make make
> sense to enable TFO in the first place. Or if one also controls the
> server its helpful in understanding if the server is mis-configured. So
> that was the motivation I had for the original four states that I
> proposed (final state was a catch-all state).
>
> Yuchung suggested dividing the 3rd case into 2 for - no cookie sent
> because of blackhole or no cookie sent because its not in cache. And
> then dropping the second state because we already have the
> TCPI_OPT_SYN_DATA bit. However, the TCPI_OPT_SYN_DATA may not be set
> because we may fallback in tcp_send_syn_data() due to send failure. So
> I'm inclined to say that the second state is valuable. And since
> blackhole is already a global thing via MIB, I didn't see a strong need
> for it. But it sounded like Yuchung had an interest in it, and I'd
> obviously like a set of states that is generally useful.

I have not kept up with all the proposals in this thread, but would it
work to include all of the cases proposed in this thread? It seems
like there are enough bits available in holes in tcp_sock and tcp_info
to have up to 7 bits of information saved in tcp_sock and exported to
tcp_info. That seems like more than enough?

The pahole tool shows the following small holes in tcp_sock:

        u8                         compressed_ack;       /*  1530     1 */

        /* XXX 1 byte hole, try to pack */

        u32                        chrono_start;         /*  1532     4 */
...
        u8                         bpf_sock_ops_cb_flags; /*  2076     1 */

        /* XXX 3 bytes hole, try to pack */

        u32                        rcv_ooopack;          /*  2080     4 */

...and the following hole in tcp_info:
        __u8                       tcpi_delivery_rate_app_limited:1;
/*     7: 7  1 */

        /* XXX 7 bits hole, try to pack */

        __u32                      tcpi_rto;             /*     8     4 */

cheers,
neal
Yuchung Cheng Oct. 22, 2019, 6:17 p.m. UTC | #11
On Mon, Oct 21, 2019 at 7:14 PM Neal Cardwell <ncardwell@google.com> wrote:
>
> On Mon, Oct 21, 2019 at 5:11 PM Jason Baron <jbaron@akamai.com> wrote:
> >
> >
> >
> > On 10/21/19 4:36 PM, Eric Dumazet wrote:
> > > On Mon, Oct 21, 2019 at 12:53 PM Christoph Paasch <cpaasch@apple.com> wrote:
> > >>
> > >
> > >> Actually, longterm I hope we would be able to get rid of the
> > >> blackhole-detection and fallback heuristics. In a far distant future where
> > >> these middleboxes have been weeded out ;-)
> > >>
> > >> So, do we really want to eternalize this as part of the API in tcp_info ?
> > >>
> > >
> > > A getsockopt() option won't be available for netlink diag (ss command).
> > >
> > > So it all depends on plans to use this FASTOPEN information ?
> > >
> >
> > The original proposal I had 4 states of interest:
> >
> > First, we are interested in knowing when a socket has TFO set but
> > actually requires a retransmit of a non-TFO syn to become established.
> > In this case, we'd be better off not using TFO.
> >
> > A second case is when the server immediately rejects the DATA and just
> > acks the syn (but not the data). Again in that case, we don't want to be
> > sending syn+data.
> >
> > The third case was whether or not we sent a cookie. Perhaps, the server
> > doesn't have TFO enabled in which case, it really doesn't make make
> > sense to enable TFO in the first place. Or if one also controls the
> > server its helpful in understanding if the server is mis-configured. So
> > that was the motivation I had for the original four states that I
> > proposed (final state was a catch-all state).
> >
> > Yuchung suggested dividing the 3rd case into 2 for - no cookie sent
> > because of blackhole or no cookie sent because its not in cache. And
> > then dropping the second state because we already have the
> > TCPI_OPT_SYN_DATA bit. However, the TCPI_OPT_SYN_DATA may not be set
> > because we may fallback in tcp_send_syn_data() due to send failure. So
but sendto would return -1 w/ EINPROGRESS in this case already so the
application shouldn't expect TCPI_OPT_SYN_DATA?


> > I'm inclined to say that the second state is valuable. And since
> > blackhole is already a global thing via MIB, I didn't see a strong need
> > for it. But it sounded like Yuchung had an interest in it, and I'd
> > obviously like a set of states that is generally useful.
>
> I have not kept up with all the proposals in this thread, but would it
> work to include all of the cases proposed in this thread? It seems
> like there are enough bits available in holes in tcp_sock and tcp_info
> to have up to 7 bits of information saved in tcp_sock and exported to
> tcp_info. That seems like more than enough?
I would rather use only at most 2-bits for TFO errors to be
parsimonious on (bloated) tcp sock. I don't mind if the next patch
skip my idea of BH detection.
my experience is reading host global stats for most applications are a
hassle (or sometimes not even feasible). they mostly care about
information of their own sockets.



>
> The pahole tool shows the following small holes in tcp_sock:
>
>         u8                         compressed_ack;       /*  1530     1 */
>
>         /* XXX 1 byte hole, try to pack */
>
>         u32                        chrono_start;         /*  1532     4 */
> ...
>         u8                         bpf_sock_ops_cb_flags; /*  2076     1 */
>
>         /* XXX 3 bytes hole, try to pack */
>
>         u32                        rcv_ooopack;          /*  2080     4 */
>
> ...and the following hole in tcp_info:
>         __u8                       tcpi_delivery_rate_app_limited:1;
> /*     7: 7  1 */
>
>         /* XXX 7 bits hole, try to pack */
>
>         __u32                      tcpi_rto;             /*     8     4 */
>
> cheers,
> neal
Jason Baron Oct. 22, 2019, 7:32 p.m. UTC | #12
On 10/22/19 2:17 PM, Yuchung Cheng wrote:
> On Mon, Oct 21, 2019 at 7:14 PM Neal Cardwell <ncardwell@google.com> wrote:
>>
>> On Mon, Oct 21, 2019 at 5:11 PM Jason Baron <jbaron@akamai.com> wrote:
>>>
>>>
>>>
>>> On 10/21/19 4:36 PM, Eric Dumazet wrote:
>>>> On Mon, Oct 21, 2019 at 12:53 PM Christoph Paasch <cpaasch@apple.com> wrote:
>>>>>
>>>>
>>>>> Actually, longterm I hope we would be able to get rid of the
>>>>> blackhole-detection and fallback heuristics. In a far distant future where
>>>>> these middleboxes have been weeded out ;-)
>>>>>
>>>>> So, do we really want to eternalize this as part of the API in tcp_info ?
>>>>>
>>>>
>>>> A getsockopt() option won't be available for netlink diag (ss command).
>>>>
>>>> So it all depends on plans to use this FASTOPEN information ?
>>>>
>>>
>>> The original proposal I had 4 states of interest:
>>>
>>> First, we are interested in knowing when a socket has TFO set but
>>> actually requires a retransmit of a non-TFO syn to become established.
>>> In this case, we'd be better off not using TFO.
>>>
>>> A second case is when the server immediately rejects the DATA and just
>>> acks the syn (but not the data). Again in that case, we don't want to be
>>> sending syn+data.
>>>
>>> The third case was whether or not we sent a cookie. Perhaps, the server
>>> doesn't have TFO enabled in which case, it really doesn't make make
>>> sense to enable TFO in the first place. Or if one also controls the
>>> server its helpful in understanding if the server is mis-configured. So
>>> that was the motivation I had for the original four states that I
>>> proposed (final state was a catch-all state).
>>>
>>> Yuchung suggested dividing the 3rd case into 2 for - no cookie sent
>>> because of blackhole or no cookie sent because its not in cache. And
>>> then dropping the second state because we already have the
>>> TCPI_OPT_SYN_DATA bit. However, the TCPI_OPT_SYN_DATA may not be set
>>> because we may fallback in tcp_send_syn_data() due to send failure. So
> but sendto would return -1 w/ EINPROGRESS in this case already so the
> application shouldn't expect TCPI_OPT_SYN_DATA?

Ok, but let's say the sk_stream_alloc_skb() fails in
tcp_send_syn_data(), in that case we aren't going to send a TFO cookie
(just a regular syn). The user isn't going to get any error and would
expect TCPI_OPT_SYN_DATA. Now, TCPI_OPT_SYN_DATA wouldn't be set but we
can't assume then that a SYN+data was sent and the SYN_ACK didn't cover
the data part. Instead, the reason for failure is really -ENOMEM, which
in the proposed states would fall into TFO_STATUS_UNSPEC, but it does
mean that I think we shouldn't have a TFO_DATA_NOT_ACKED state otherwise
we can't differentiate the two.

> 
> 
>>> I'm inclined to say that the second state is valuable. And since
>>> blackhole is already a global thing via MIB, I didn't see a strong need
>>> for it. But it sounded like Yuchung had an interest in it, and I'd
>>> obviously like a set of states that is generally useful.
>>
>> I have not kept up with all the proposals in this thread, but would it
>> work to include all of the cases proposed in this thread? It seems
>> like there are enough bits available in holes in tcp_sock and tcp_info
>> to have up to 7 bits of information saved in tcp_sock and exported to
>> tcp_info. That seems like more than enough?
> I would rather use only at most 2-bits for TFO errors to be
> parsimonious on (bloated) tcp sock. I don't mind if the next patch
> skip my idea of BH detection.
> my experience is reading host global stats for most applications are a
> hassle (or sometimes not even feasible). they mostly care about
> information of their own sockets.

hmmm, if you are ok without having the BH failure state which Christoph
also pushed back on a bit, but we still only want 2 bits as you
suggested how about:

1) TFO_STATUS_UNSPEC - includes black hole state and other failures
2) TFO_COOKIE_UNAVAILABLE - we don't have a cookie in the cache
3) TFO_DATA_NOT_ACKED - syn+data sent but no ack for data part
4) TFO_SYN_RETRANSMITTED - same as 3 but at least 1 additional syn sent

Thanks,

-Jason
Yuchung Cheng Oct. 22, 2019, 7:45 p.m. UTC | #13
On Tue, Oct 22, 2019 at 12:34 PM Jason Baron <jbaron@akamai.com> wrote:
>
>
>
> On 10/22/19 2:17 PM, Yuchung Cheng wrote:
> > On Mon, Oct 21, 2019 at 7:14 PM Neal Cardwell <ncardwell@google.com> wrote:
> >>
> >> On Mon, Oct 21, 2019 at 5:11 PM Jason Baron <jbaron@akamai.com> wrote:
> >>>
> >>>
> >>>
> >>> On 10/21/19 4:36 PM, Eric Dumazet wrote:
> >>>> On Mon, Oct 21, 2019 at 12:53 PM Christoph Paasch <cpaasch@apple.com> wrote:
> >>>>>
> >>>>
> >>>>> Actually, longterm I hope we would be able to get rid of the
> >>>>> blackhole-detection and fallback heuristics. In a far distant future where
> >>>>> these middleboxes have been weeded out ;-)
> >>>>>
> >>>>> So, do we really want to eternalize this as part of the API in tcp_info ?
> >>>>>
> >>>>
> >>>> A getsockopt() option won't be available for netlink diag (ss command).
> >>>>
> >>>> So it all depends on plans to use this FASTOPEN information ?
> >>>>
> >>>
> >>> The original proposal I had 4 states of interest:
> >>>
> >>> First, we are interested in knowing when a socket has TFO set but
> >>> actually requires a retransmit of a non-TFO syn to become established.
> >>> In this case, we'd be better off not using TFO.
> >>>
> >>> A second case is when the server immediately rejects the DATA and just
> >>> acks the syn (but not the data). Again in that case, we don't want to be
> >>> sending syn+data.
> >>>
> >>> The third case was whether or not we sent a cookie. Perhaps, the server
> >>> doesn't have TFO enabled in which case, it really doesn't make make
> >>> sense to enable TFO in the first place. Or if one also controls the
> >>> server its helpful in understanding if the server is mis-configured. So
> >>> that was the motivation I had for the original four states that I
> >>> proposed (final state was a catch-all state).
> >>>
> >>> Yuchung suggested dividing the 3rd case into 2 for - no cookie sent
> >>> because of blackhole or no cookie sent because its not in cache. And
> >>> then dropping the second state because we already have the
> >>> TCPI_OPT_SYN_DATA bit. However, the TCPI_OPT_SYN_DATA may not be set
> >>> because we may fallback in tcp_send_syn_data() due to send failure. So
> > but sendto would return -1 w/ EINPROGRESS in this case already so the
> > application shouldn't expect TCPI_OPT_SYN_DATA?
>
> Ok, but let's say the sk_stream_alloc_skb() fails in
> tcp_send_syn_data(), in that case we aren't going to send a TFO cookie
> (just a regular syn). The user isn't going to get any error and would
> expect TCPI_OPT_SYN_DATA. Now, TCPI_OPT_SYN_DATA wouldn't be set but we
> can't assume then that a SYN+data was sent and the SYN_ACK didn't cover
> the data part. Instead, the reason for failure is really -ENOMEM, which
> in the proposed states would fall into TFO_STATUS_UNSPEC, but it does
> mean that I think we shouldn't have a TFO_DATA_NOT_ACKED state otherwise
> we can't differentiate the two.
>
> >
> >
> >>> I'm inclined to say that the second state is valuable. And since
> >>> blackhole is already a global thing via MIB, I didn't see a strong need
> >>> for it. But it sounded like Yuchung had an interest in it, and I'd
> >>> obviously like a set of states that is generally useful.
> >>
> >> I have not kept up with all the proposals in this thread, but would it
> >> work to include all of the cases proposed in this thread? It seems
> >> like there are enough bits available in holes in tcp_sock and tcp_info
> >> to have up to 7 bits of information saved in tcp_sock and exported to
> >> tcp_info. That seems like more than enough?
> > I would rather use only at most 2-bits for TFO errors to be
> > parsimonious on (bloated) tcp sock. I don't mind if the next patch
> > skip my idea of BH detection.
> > my experience is reading host global stats for most applications are a
> > hassle (or sometimes not even feasible). they mostly care about
> > information of their own sockets.
>
> hmmm, if you are ok without having the BH failure state which Christoph
> also pushed back on a bit, but we still only want 2 bits as you
> suggested how about:
>
> 1) TFO_STATUS_UNSPEC - includes black hole state and other failures
> 2) TFO_COOKIE_UNAVAILABLE - we don't have a cookie in the cache
> 3) TFO_DATA_NOT_ACKED - syn+data sent but no ack for data part
> 4) TFO_SYN_RETRANSMITTED - same as 3 but at least 1 additional syn sent
sounds good to me. thanks
>
> Thanks,
>
> -Jason
diff mbox series

Patch

diff --git a/include/linux/tcp.h b/include/linux/tcp.h
index 99617e5..7790f28 100644
--- a/include/linux/tcp.h
+++ b/include/linux/tcp.h
@@ -223,7 +223,7 @@  struct tcp_sock {
 		fastopen_connect:1, /* FASTOPEN_CONNECT sockopt */
 		fastopen_no_cookie:1, /* Allow send/recv SYN+data without a cookie */
 		is_sack_reneg:1,    /* in recovery from loss with SACK reneg? */
-		unused:2;
+		fastopen_client_fail:2; /* reason why fastopen failed */
 	u8	nonagle     : 4,/* Disable Nagle algorithm?             */
 		thin_lto    : 1,/* Use linear timeouts for thin streams */
 		recvmsg_inq : 1,/* Indicate # of bytes in queue upon recvmsg */
diff --git a/include/uapi/linux/tcp.h b/include/uapi/linux/tcp.h
index 81e6979..dbee3ed 100644
--- a/include/uapi/linux/tcp.h
+++ b/include/uapi/linux/tcp.h
@@ -155,6 +155,14 @@  enum {
 	TCP_QUEUES_NR,
 };
 
+/* why fastopen failed from client perspective */
+enum tcp_fastopen_client_fail {
+	TFO_STATUS_UNSPEC, /* catch-all */
+	TFO_NO_COOKIE_SENT, /* if not in TFO_CLIENT_NO_COOKIE mode */
+	TFO_NO_SYN_DATA, /* SYN-ACK did not ack SYN data */
+	TFO_NO_SYN_DATA_TIMEOUT /* SYN-ACK did not ack SYN data after timeout */
+};
+
 /* for TCP_INFO socket option */
 #define TCPI_OPT_TIMESTAMPS	1
 #define TCPI_OPT_SACK		2
@@ -211,7 +219,7 @@  struct tcp_info {
 	__u8	tcpi_backoff;
 	__u8	tcpi_options;
 	__u8	tcpi_snd_wscale : 4, tcpi_rcv_wscale : 4;
-	__u8	tcpi_delivery_rate_app_limited:1;
+	__u8	tcpi_delivery_rate_app_limited:1, tcpi_fastopen_client_fail:2;
 
 	__u32	tcpi_rto;
 	__u32	tcpi_ato;
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 9f41a76..764f623 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -3296,6 +3296,7 @@  void tcp_get_info(struct sock *sk, struct tcp_info *info)
 	info->tcpi_reord_seen = tp->reord_seen;
 	info->tcpi_rcv_ooopack = tp->rcv_ooopack;
 	info->tcpi_snd_wnd = tp->snd_wnd;
+	info->tcpi_fastopen_client_fail = tp->fastopen_client_fail;
 	unlock_sock_fast(sk, slow);
 }
 EXPORT_SYMBOL_GPL(tcp_get_info);
diff --git a/net/ipv4/tcp_fastopen.c b/net/ipv4/tcp_fastopen.c
index 3fd4512..d88286d 100644
--- a/net/ipv4/tcp_fastopen.c
+++ b/net/ipv4/tcp_fastopen.c
@@ -413,7 +413,7 @@  bool tcp_fastopen_cookie_check(struct sock *sk, u16 *mss,
 	/* Firewall blackhole issue check */
 	if (tcp_fastopen_active_should_disable(sk)) {
 		cookie->len = -1;
-		return false;
+		goto no_cookie;
 	}
 
 	dst = __sk_dst_get(sk);
@@ -422,7 +422,12 @@  bool tcp_fastopen_cookie_check(struct sock *sk, u16 *mss,
 		cookie->len = -1;
 		return true;
 	}
-	return cookie->len > 0;
+	if (cookie->len > 0)
+		return true;
+
+no_cookie:
+	tcp_sk(sk)->fastopen_client_fail = TFO_NO_COOKIE_SENT;
+	return false;
 }
 
 /* This function checks if we want to defer sending SYN until the first
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 3578357a..357f757 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -5819,6 +5819,11 @@  static bool tcp_rcv_fastopen_synack(struct sock *sk, struct sk_buff *synack,
 		tcp_rearm_rto(sk);
 		NET_INC_STATS(sock_net(sk),
 				LINUX_MIB_TCPFASTOPENACTIVEFAIL);
+		if (syn_drop)
+			tp->fastopen_client_fail = TFO_NO_SYN_DATA_TIMEOUT;
+		else
+			tp->fastopen_client_fail = TFO_NO_SYN_DATA;
+
 		return true;
 	}
 	tp->syn_data_acked = tp->syn_data;