diff mbox

[-next,v2,1/2] syncookies: remove ecn_ok validation when decoding option timestamp

Message ID 1414757602-27637-2-git-send-email-fw@strlen.de
State Changes Requested, archived
Delegated to: David Miller
Headers show

Commit Message

Florian Westphal Oct. 31, 2014, 12:13 p.m. UTC
Won't work anymore when tcp_ecn=0 and RTAX_FEATURES route attribute did
allow ecn.

Just turn on ecn if the client ts says so.

This means that while syn cookies are in use clients can turn on ecn
even if it is off on the server.

However, there seems to be no harm in permitting this.

Alternatively one can extend the test to also perform route lookup and
check RTAX_FEATURES, but it simply doesn't appear to be worth the effort.

Signed-off-by: Florian Westphal <fw@strlen.de>
---
 Changes since v1:
  - reword commit message

 include/net/tcp.h     | 3 +--
 net/ipv4/syncookies.c | 6 ++----
 net/ipv6/syncookies.c | 2 +-
 3 files changed, 4 insertions(+), 7 deletions(-)

Comments

Eric Dumazet Oct. 31, 2014, 1:32 p.m. UTC | #1
On Fri, 2014-10-31 at 13:13 +0100, Florian Westphal wrote:
> Won't work anymore when tcp_ecn=0 and RTAX_FEATURES route attribute did
> allow ecn.
> 
> Just turn on ecn if the client ts says so.
> 
> This means that while syn cookies are in use clients can turn on ecn
> even if it is off on the server.
> 
> However, there seems to be no harm in permitting this.
> 
> Alternatively one can extend the test to also perform route lookup and
> check RTAX_FEATURES, but it simply doesn't appear to be worth the effort.
> 
> Signed-off-by: Florian Westphal <fw@strlen.de>
> ---
>  Changes since v1:
>   - reword commit message

Sorry.

Google chose to disable ecn on their GFE, so we set sysctl_tcp_ecn to 0

If I understand your patch, if a synflood is going on, some innocent
connections could get ECN enabled, while we do not want this to ever
happen. ECN really hurts our customers, this is a known fact.

You cannot change this like that, it would force us (and maybe others)
to either revert this patch, or add a knob.

If sysctl_tcp_ecn = 0, there is no way a connection should have ECN
enabled. This was documented years ago.

For the record :

vi +247 Documentation/networking/ip-sysctl.txt

tcp_ecn - INTEGER
        Control use of Explicit Congestion Notification (ECN) by TCP.
        ECN is used only when both ends of the TCP connection indicate
        support for it.  This feature is useful in avoiding losses due
        to congestion by allowing supporting routers to signal
        congestion before having to drop packets.
        Possible values are:
                0 Disable ECN.  Neither initiate nor accept ECN.
                1 Enable ECN when requested by incoming connections and
                  also request ECN on outgoing connection attempts.
                2 Enable ECN when requested by incoming connections
                  but do not request ECN on outgoing connections.
        Default: 2



--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Florian Westphal Oct. 31, 2014, 1:39 p.m. UTC | #2
Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On Fri, 2014-10-31 at 13:13 +0100, Florian Westphal wrote:
> > Won't work anymore when tcp_ecn=0 and RTAX_FEATURES route attribute did
> > allow ecn.
> > 
> > Just turn on ecn if the client ts says so.
> > 
> > This means that while syn cookies are in use clients can turn on ecn
> > even if it is off on the server.
> > 
> > However, there seems to be no harm in permitting this.
> > 
> > Alternatively one can extend the test to also perform route lookup and
> > check RTAX_FEATURES, but it simply doesn't appear to be worth the effort.
> > 
> > Signed-off-by: Florian Westphal <fw@strlen.de>
> > ---
> >  Changes since v1:
> >   - reword commit message
> 
> Sorry.
> 
> Google chose to disable ecn on their GFE, so we set sysctl_tcp_ecn to 0
> 
> If I understand your patch, if a synflood is going on, some innocent
> connections could get ECN enabled, while we do not want this to ever
> happen. ECN really hurts our customers, this is a known fact.
>
> You cannot change this like that, it would force us (and maybe others)
> to either revert this patch, or add a knob.

Mot needed, if you think its wrong to remove the check, I will respin
with a proper validation.

> If sysctl_tcp_ecn = 0, there is no way a connection should have ECN
> enabled. This was documented years ago.

It would only get enabled if the echoed timestamp (ie the timestamp we
sent in the synack) indicates that ecn was enabled, i.e. the client or
a middlebox would have to munge/modify it to set the 'ecn on' bit in the
timestamp.

If that is too fragile in your opinion I will respin the patch to include
the additional validation via dst.  We already need to fetch the dst
object anyway to fetch certain route attributes not in the timestamp or
cookie, so its only a matter of reorganizing code first to avoid two lookups.

Let me know what you prefer.

Thanks,
Florian
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Eric Dumazet Oct. 31, 2014, 2:04 p.m. UTC | #3
On Fri, 2014-10-31 at 14:39 +0100, Florian Westphal wrote:

> It would only get enabled if the echoed timestamp (ie the timestamp we
> sent in the synack) indicates that ecn was enabled, i.e. the client or
> a middlebox would have to munge/modify it to set the 'ecn on' bit in the
> timestamp.
> 
> If that is too fragile in your opinion I will respin the patch to include
> the additional validation via dst.  We already need to fetch the dst
> object anyway to fetch certain route attributes not in the timestamp or
> cookie, so its only a matter of reorganizing code first to avoid two lookups.

Well, your changelog is so confusing, I have no idea what is your
intent.

I do not really understand why you need to change something.

Maybe this is because I have not yet took my coffee ;)

Thanks


--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Florian Westphal Oct. 31, 2014, 2:15 p.m. UTC | #4
Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On Fri, 2014-10-31 at 14:39 +0100, Florian Westphal wrote:
> 
> > It would only get enabled if the echoed timestamp (ie the timestamp we
> > sent in the synack) indicates that ecn was enabled, i.e. the client or
> > a middlebox would have to munge/modify it to set the 'ecn on' bit in the
> > timestamp.
> > 
> > If that is too fragile in your opinion I will respin the patch to include
> > the additional validation via dst.  We already need to fetch the dst
> > object anyway to fetch certain route attributes not in the timestamp or
> > cookie, so its only a matter of reorganizing code first to avoid two lookups.
> 
> Well, your changelog is so confusing, I have no idea what is your
> intent.

Sorry :-/

So if you have a per route ecn setting, and syncookies are used,
and tcp_ecn sysctl is 0:

1. we receive syn with ecn on and timestamps
2. we send cookie synack, with timestamp and ecn (route allowed it),
the lower bits of the timestamp have a "magic" bit set that allows
us to infer that ecn was negotiated successfully.
3. we drop the ack from the client, since timestamp decoding sees
"ecn is on according to timestamp, but the tcp_ecn sysctl is off".

So to fix this, step 3 either has to check the dst setting
in addition to the global sysctl, or to rely on the timestamp alone
that ecn was requested by the original client and allowed by our host
at the time synack timestamp was generated/sent.

I hope that explains the reason behind patch #1 up.

> I do not really understand why you need to change something.

Yes, unfortunately you're not the first person saying that my
changelogs are not precise enough sometimes, I hope to do
a better job next time around.

> Maybe this is because I have not yet took my coffee ;)

Oh, well, that could also explain it 8-)
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Eric Dumazet Oct. 31, 2014, 3:47 p.m. UTC | #5
On Fri, 2014-10-31 at 15:15 +0100, Florian Westphal wrote:

> So if you have a per route ecn setting, and syncookies are used,
> and tcp_ecn sysctl is 0:

This part I do not understand.

Why should tcp_ecn be 0 here, and not 2 (default value) ?

> 
> 1. we receive syn with ecn on and timestamps
> 2. we send cookie synack, with timestamp and ecn (route allowed it),
> the lower bits of the timestamp have a "magic" bit set that allows
> us to infer that ecn was negotiated successfully.
> 3. we drop the ack from the client, since timestamp decoding sees
> "ecn is on according to timestamp, but the tcp_ecn sysctl is off".
> 
> So to fix this, step 3 either has to check the dst setting
> in addition to the global sysctl, or to rely on the timestamp alone
> that ecn was requested by the original client and allowed by our host
> at the time synack timestamp was generated/sent.
> 
> I hope that explains the reason behind patch #1 up.



--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Florian Westphal Oct. 31, 2014, 4 p.m. UTC | #6
Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On Fri, 2014-10-31 at 15:15 +0100, Florian Westphal wrote:
> 
> > So if you have a per route ecn setting, and syncookies are used,
> > and tcp_ecn sysctl is 0:
> 
> This part I do not understand.
> 
> Why should tcp_ecn be 0 here, and not 2 (default value) ?

Because admin might have changed it.
There is no problem if tcp_ecn sysctl is nonzero (1 or 2).

This problem will only manifest itself iff tcp_ecn sysctl was set to 0,
and the remote peer requests ecn and a route specific setting enabled
ecn for the source network and syncookies are used.

Current timestamp cookie validation will think "client is lying about
ecn in the timestamp as sysctl is off", since it does not consider a
per-route ecn knob.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/include/net/tcp.h b/include/net/tcp.h
index 3a35b15..57521de 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -493,8 +493,7 @@  __u32 cookie_v4_init_sequence(struct sock *sk, const struct sk_buff *skb,
 #endif
 
 __u32 cookie_init_timestamp(struct request_sock *req);
-bool cookie_check_timestamp(struct tcp_options_received *opt, struct net *net,
-			    bool *ecn_ok);
+bool cookie_check_timestamp(struct tcp_options_received *opt, bool *ecn_ok);
 
 /* From net/ipv6/syncookies.c */
 int __cookie_v6_check(const struct ipv6hdr *iph, const struct tcphdr *th,
diff --git a/net/ipv4/syncookies.c b/net/ipv4/syncookies.c
index 4ac7bca..c4e5e2d 100644
--- a/net/ipv4/syncookies.c
+++ b/net/ipv4/syncookies.c
@@ -225,7 +225,7 @@  static inline struct sock *get_cookie_sock(struct sock *sk, struct sk_buff *skb,
  * return false if we decode an option that should not be.
  */
 bool cookie_check_timestamp(struct tcp_options_received *tcp_opt,
-			struct net *net, bool *ecn_ok)
+			    bool *ecn_ok)
 {
 	/* echoed timestamp, lowest bits contain options */
 	u32 options = tcp_opt->rcv_tsecr & TSMASK;
@@ -240,8 +240,6 @@  bool cookie_check_timestamp(struct tcp_options_received *tcp_opt,
 
 	tcp_opt->sack_ok = (options & (1 << 4)) ? TCP_SACK_SEEN : 0;
 	*ecn_ok = (options >> 5) & 1;
-	if (*ecn_ok && !net->ipv4.sysctl_tcp_ecn)
-		return false;
 
 	if (tcp_opt->sack_ok && !sysctl_tcp_sack)
 		return false;
@@ -290,7 +288,7 @@  struct sock *cookie_v4_check(struct sock *sk, struct sk_buff *skb)
 	memset(&tcp_opt, 0, sizeof(tcp_opt));
 	tcp_parse_options(skb, &tcp_opt, 0, NULL);
 
-	if (!cookie_check_timestamp(&tcp_opt, sock_net(sk), &ecn_ok))
+	if (!cookie_check_timestamp(&tcp_opt, &ecn_ok))
 		goto out;
 
 	ret = NULL;
diff --git a/net/ipv6/syncookies.c b/net/ipv6/syncookies.c
index be291ba..a08062c 100644
--- a/net/ipv6/syncookies.c
+++ b/net/ipv6/syncookies.c
@@ -186,7 +186,7 @@  struct sock *cookie_v6_check(struct sock *sk, struct sk_buff *skb)
 	memset(&tcp_opt, 0, sizeof(tcp_opt));
 	tcp_parse_options(skb, &tcp_opt, 0, NULL);
 
-	if (!cookie_check_timestamp(&tcp_opt, sock_net(sk), &ecn_ok))
+	if (!cookie_check_timestamp(&tcp_opt, &ecn_ok))
 		goto out;
 
 	ret = NULL;