diff mbox

[net] tcp: don't allow syn packets without timestamps to pass tcp_tw_recycle logic

Message ID 69ff43477a795a1117302b11583bc8ea8c5dc811.1407802666.git.hannes@stressinduktion.org
State Rejected, archived
Delegated to: David Miller
Headers show

Commit Message

Hannes Frederic Sowa Aug. 12, 2014, 12:21 a.m. UTC
If tw_recycle is enabled, non-timestamped SYN packets could get past
the tw_recycle check and create a new connection. This is dangerous
as we cannot verify that segments from an old connection won't be
accepted by the new one in tcp_validate_incoming because of the missing
timestamps. Note that Windows seems to have timestamps disabled by
default. Thus this broken situation could easily arise by a Linux and
Windows box sharing one IP address and talking to a tcp_tw_recycle
enabled server.

We don't change the behavior regarding how many SYNs we queue up from
non timestamping hosts (the second tcp_peer_is_proven check), because the
second call to tcp_peer_is_proven does not use the new boolean timestamp
argument at all because PAWS check is disabled.

Cc: Florian Westphal <fw@strlen.de>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
---
 include/net/tcp.h      | 2 +-
 net/ipv4/tcp_input.c   | 9 ++++++---
 net/ipv4/tcp_metrics.c | 6 ++++--
 3 files changed, 11 insertions(+), 6 deletions(-)

Comments

Eric Dumazet Aug. 12, 2014, 1:32 a.m. UTC | #1
On Tue, 2014-08-12 at 02:21 +0200, Hannes Frederic Sowa wrote:
> If tw_recycle is enabled, non-timestamped SYN packets could get past
> the tw_recycle check and create a new connection. This is dangerous
> as we cannot verify that segments from an old connection won't be
> accepted by the new one in tcp_validate_incoming because of the missing
> timestamps. Note that Windows seems to have timestamps disabled by
> default. Thus this broken situation could easily arise by a Linux and
> Windows box sharing one IP address and talking to a tcp_tw_recycle
> enabled server.
> 
> We don't change the behavior regarding how many SYNs we queue up from
> non timestamping hosts (the second tcp_peer_is_proven check), because the
> second call to tcp_peer_is_proven does not use the new boolean timestamp
> argument at all because PAWS check is disabled.
> 
> Cc: Florian Westphal <fw@strlen.de>
> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
> ---

Not sure what you try to achieve here.

tw_recycle can only be used in very controlled environment, no NAT, and
all hosts using timestamps.

If using NAT, then tw_recycle can not be used, even if all hosts are
linux boxes with timestamps enabled.



--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Miller Aug. 12, 2014, 3:08 a.m. UTC | #2
From: Hannes Frederic Sowa <hannes@stressinduktion.org>
Date: Tue, 12 Aug 2014 02:21:36 +0200

> Thus this broken situation could easily arise by a Linux and Windows
> box sharing one IP address and talking to a tcp_tw_recycle enabled
> server.

As Eric Dumazet mentioned, timewait recycling does not work if any
traffic goes through a NAT box.

So this situation of two boxes "sharing one IP address" fundamentally
makes timewait recycling unusable.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Hannes Frederic Sowa Aug. 12, 2014, 8:03 a.m. UTC | #3
On Tue, Aug 12, 2014, at 03:32, Eric Dumazet wrote:
> On Tue, 2014-08-12 at 02:21 +0200, Hannes Frederic Sowa wrote:
> > If tw_recycle is enabled, non-timestamped SYN packets could get past
> > the tw_recycle check and create a new connection. This is dangerous
> > as we cannot verify that segments from an old connection won't be
> > accepted by the new one in tcp_validate_incoming because of the missing
> > timestamps. Note that Windows seems to have timestamps disabled by
> > default. Thus this broken situation could easily arise by a Linux and
> > Windows box sharing one IP address and talking to a tcp_tw_recycle
> > enabled server.
> > 
> > We don't change the behavior regarding how many SYNs we queue up from
> > non timestamping hosts (the second tcp_peer_is_proven check), because the
> > second call to tcp_peer_is_proven does not use the new boolean timestamp
> > argument at all because PAWS check is disabled.
> > 
> > Cc: Florian Westphal <fw@strlen.de>
> > Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
> > ---
> 
> Not sure what you try to achieve here.
>
> tw_recycle can only be used in very controlled environment, no NAT, and
> all hosts using timestamps.
> 
> If using NAT, then tw_recycle can not be used, even if all hosts are
> linux boxes with timestamps enabled.

Mostly me being pessimistic. ;)

I noticed that tw_recycle nonetheless tries to cope with the fact that
sometimes non-timestamped SYNs arrive. E.g. the scheduling of the
time-wait timeout only happens for only RTO in case the host saw
timestamps on the connection, otherwise normal TIMEWAIT_MSL applies.

So I wanted to stop "illegal" connection setups and trade that against
possible data corruption in case someone switches this knob on in a not
controlled environment.

Bye,
Hannes
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Hannes Frederic Sowa Aug. 12, 2014, 8:08 a.m. UTC | #4
On Tue, Aug 12, 2014, at 05:08, David Miller wrote:
> From: Hannes Frederic Sowa <hannes@stressinduktion.org>
> Date: Tue, 12 Aug 2014 02:21:36 +0200
> 
> > Thus this broken situation could easily arise by a Linux and Windows
> > box sharing one IP address and talking to a tcp_tw_recycle enabled
> > server.
> 
> As Eric Dumazet mentioned, timewait recycling does not work if any
> traffic goes through a NAT box.
> 
> So this situation of two boxes "sharing one IP address" fundamentally
> makes timewait recycling unusable.

Exactly, I'll just throw away the SYN packet instead of opening a
connection where we couldn't very if the preconditions for timewait
recycling did not hold.

Bye,
Hannes
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Hannes Frederic Sowa Aug. 14, 2014, 9:37 a.m. UTC | #5
Hi David,

On Di, 2014-08-12 at 10:08 +0200, Hannes Frederic Sowa wrote:
> 
> On Tue, Aug 12, 2014, at 05:08, David Miller wrote:
> > From: Hannes Frederic Sowa <hannes@stressinduktion.org>
> > Date: Tue, 12 Aug 2014 02:21:36 +0200
> > 
> > > Thus this broken situation could easily arise by a Linux and Windows
> > > box sharing one IP address and talking to a tcp_tw_recycle enabled
> > > server.
> > 
> > As Eric Dumazet mentioned, timewait recycling does not work if any
> > traffic goes through a NAT box.
> > 
> > So this situation of two boxes "sharing one IP address" fundamentally
> > makes timewait recycling unusable.
> 
> Exactly, I'll just throw away the SYN packet instead of opening a
> connection where we couldn't very if the preconditions for timewait
> recycling did not hold.

did you have a chance to look at this patch again?

I found this during code review. Non time stamped SYN packets could
eventually trigger the completion of a 3WHS even though we had
tw_recycle enabled and the SYN arrived in a TCP_PAWS_MSL of this host
period.

I don't want to make this feature more general usable (without time
stamps), they are absolutely required. It just adds protection against
accidental 3WHS completion of 3WHS if a packet without time stamps
arrived.

I don't have a strong opinion on that but it just seems to be natural,
as we also conditional schedule the timeout for the tw buckets depending
on if we saw time stamps on the prior connection.

Thanks,
Hannes


--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Eric Dumazet Aug. 14, 2014, 3:38 p.m. UTC | #6
On Thu, 2014-08-14 at 11:37 +0200, Hannes Frederic Sowa wrote:

> did you have a chance to look at this patch again?
> 
> I found this during code review. Non time stamped SYN packets could
> eventually trigger the completion of a 3WHS even though we had
> tw_recycle enabled and the SYN arrived in a TCP_PAWS_MSL of this host
> period.


> 
> I don't want to make this feature more general usable (without time
> stamps), they are absolutely required. It just adds protection against
> accidental 3WHS completion of 3WHS if a packet without time stamps
> arrived.
> 
> I don't have a strong opinion on that but it just seems to be natural,
> as we also conditional schedule the timeout for the tw buckets depending
> on if we saw time stamps on the prior connection.

I believe this patch gives a wrong sense of comfort, and honestly this
is caused by its changelog.

Sane people should not use tw_recycle, and eventually we should remove
its support.

Your changelog is misleading because it could give bad incentive about
_using_ tw_recycle.

Please rephrase it so that no doubt is possible.

Thanks

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Hannes Frederic Sowa Aug. 14, 2014, 6:39 p.m. UTC | #7
On Do, 2014-08-14 at 08:38 -0700, Eric Dumazet wrote:
> On Thu, 2014-08-14 at 11:37 +0200, Hannes Frederic Sowa wrote:
> 
> > did you have a chance to look at this patch again?
> > 
> > I found this during code review. Non time stamped SYN packets could
> > eventually trigger the completion of a 3WHS even though we had
> > tw_recycle enabled and the SYN arrived in a TCP_PAWS_MSL of this host
> > period.
> 
> 
> > 
> > I don't want to make this feature more general usable (without time
> > stamps), they are absolutely required. It just adds protection against
> > accidental 3WHS completion of 3WHS if a packet without time stamps
> > arrived.
> > 
> > I don't have a strong opinion on that but it just seems to be natural,
> > as we also conditional schedule the timeout for the tw buckets depending
> > on if we saw time stamps on the prior connection.
> 
> I believe this patch gives a wrong sense of comfort, and honestly this
> is caused by its changelog.
> 
> Sane people should not use tw_recycle, and eventually we should remove
> its support.
> 
> Your changelog is misleading because it could give bad incentive about
> _using_ tw_recycle.
> 
> Please rephrase it so that no doubt is possible.

Yep, I also thought the changelog might be too poor after your response.
Will resend soon with updated changelog.

Thanks,
Hannes


--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/include/net/tcp.h b/include/net/tcp.h
index dafa1cb..68425af 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -417,7 +417,7 @@  void tcp_update_metrics(struct sock *sk);
 void tcp_init_metrics(struct sock *sk);
 void tcp_metrics_init(void);
 bool tcp_peer_is_proven(struct request_sock *req, struct dst_entry *dst,
-			bool paws_check);
+			bool paws_check, bool timestamps);
 bool tcp_remember_stamp(struct sock *sk);
 bool tcp_tw_remember_stamp(struct inet_timewait_sock *tw);
 void tcp_fetch_timewait_stamp(struct sock *sk, struct dst_entry *dst);
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index a3d47af..a0eb435 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -5979,12 +5979,14 @@  int tcp_conn_request(struct request_sock_ops *rsk_ops,
 		 * timewait bucket, so that all the necessary checks
 		 * are made in the function processing timewait state.
 		 */
-		if (tmp_opt.saw_tstamp && tcp_death_row.sysctl_tw_recycle) {
+		if (tcp_death_row.sysctl_tw_recycle) {
 			bool strict;
 
 			dst = af_ops->route_req(sk, &fl, req, &strict);
+
 			if (dst && strict &&
-			    !tcp_peer_is_proven(req, dst, true)) {
+			    !tcp_peer_is_proven(req, dst, true,
+						tmp_opt.saw_tstamp)) {
 				NET_INC_STATS_BH(sock_net(sk), LINUX_MIB_PAWSPASSIVEREJECTED);
 				goto drop_and_release;
 			}
@@ -5993,7 +5995,8 @@  int tcp_conn_request(struct request_sock_ops *rsk_ops,
 		else if (!sysctl_tcp_syncookies &&
 			 (sysctl_max_syn_backlog - inet_csk_reqsk_queue_len(sk) <
 			  (sysctl_max_syn_backlog >> 2)) &&
-			 !tcp_peer_is_proven(req, dst, false)) {
+			 !tcp_peer_is_proven(req, dst, false,
+					     tmp_opt.saw_tstamp)) {
 			/* Without syncookies last quarter of
 			 * backlog is filled with destinations,
 			 * proven to be alive.
diff --git a/net/ipv4/tcp_metrics.c b/net/ipv4/tcp_metrics.c
index 0d54e59..ed9c9a9 100644
--- a/net/ipv4/tcp_metrics.c
+++ b/net/ipv4/tcp_metrics.c
@@ -576,7 +576,8 @@  reset:
 	tp->snd_cwnd_stamp = tcp_time_stamp;
 }
 
-bool tcp_peer_is_proven(struct request_sock *req, struct dst_entry *dst, bool paws_check)
+bool tcp_peer_is_proven(struct request_sock *req, struct dst_entry *dst,
+			bool paws_check, bool timestamps)
 {
 	struct tcp_metrics_block *tm;
 	bool ret;
@@ -589,7 +590,8 @@  bool tcp_peer_is_proven(struct request_sock *req, struct dst_entry *dst, bool pa
 	if (paws_check) {
 		if (tm &&
 		    (u32)get_seconds() - tm->tcpm_ts_stamp < TCP_PAWS_MSL &&
-		    (s32)(tm->tcpm_ts - req->ts_recent) > TCP_PAWS_WINDOW)
+		    ((s32)(tm->tcpm_ts - req->ts_recent) > TCP_PAWS_WINDOW ||
+		     !timestamps))
 			ret = false;
 		else
 			ret = true;