diff mbox series

[net,v3,1/3] tcp: fix rejected syncookies due to stale timestamps

Message ID 3f38a305b3a07fe7b1c275d299f003f290009e13.1575595670.git.gnault@redhat.com
State Changes Requested
Delegated to: David Miller
Headers show
Series tcp: fix handling of stale syncookies timestamps | expand

Commit Message

Guillaume Nault Dec. 6, 2019, 1:49 a.m. UTC
If no synflood happens for a long enough period of time, then the
synflood timestamp isn't refreshed and jiffies can advance so much
that time_after32() can't accurately compare them any more.

Therefore, we can end up in a situation where time_after32(now,
last_overflow + HZ) returns false, just because these two values are
too far apart. In that case, the synflood timestamp isn't updated as
it should be, which can trick tcp_synq_no_recent_overflow() into
rejecting valid syncookies.

For example, let's consider the following scenario on a system
with HZ=1000:

  * The synflood timestamp is 0, either because that's the timestamp
    of the last synflood or, more commonly, because we're working with
    a freshly created socket.

  * We receive a new SYN, which triggers synflood protection. Let's say
    that this happens when jiffies == 2147484649 (that is,
    'synflood timestamp' + HZ + 2^31 + 1).

  * Then tcp_synq_overflow() doesn't update the synflood timestamp,
    because time_after32(2147484649, 1000) returns false.
    With:
      - 2147484649: the value of jiffies, aka. 'now'.
      - 1000: the value of 'last_overflow' + HZ.

  * A bit later, we receive the ACK completing the 3WHS. But
    cookie_v[46]_check() rejects it because tcp_synq_no_recent_overflow()
    says that we're not under synflood. That's because
    time_after32(2147484649, 120000) returns false.
    With:
      - 2147484649: the value of jiffies, aka. 'now'.
      - 120000: the value of 'last_overflow' + TCP_SYNCOOKIE_VALID.

    Of course, in reality jiffies would have increased a bit, but this
    condition will last for the next 119 seconds, which is far enough
    to accommodate for jiffie's growth.

Fix this by updating the overflow timestamp whenever jiffies isn't
within the [last_overflow, last_overflow + HZ] range. That shouldn't
have any performance impact since the update still happens at most once
per second.

Now we're guaranteed to have fresh timestamps while under synflood, so
tcp_synq_no_recent_overflow() can safely use it with time_after32() in
such situations.

Stale timestamps can still make tcp_synq_no_recent_overflow() return
the wrong verdict when not under synflood. This will be handled in the
next patch.

For 64 bits architectures, the problem was introduced with the
conversion of ->tw_ts_recent_stamp to 32 bits integer by commit
cca9bab1b72c ("tcp: use monotonic timestamps for PAWS").
The problem has always been there on 32 bits architectures.

Fixes: cca9bab1b72c ("tcp: use monotonic timestamps for PAWS")
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Guillaume Nault <gnault@redhat.com>
---
 include/linux/time.h | 13 +++++++++++++
 include/net/tcp.h    |  5 +++--
 2 files changed, 16 insertions(+), 2 deletions(-)

Comments

Eric Dumazet Dec. 6, 2019, 3:41 a.m. UTC | #1
On 12/5/19 5:49 PM, Guillaume Nault wrote:
> If no synflood happens for a long enough period of time, then the
> synflood timestamp isn't refreshed and jiffies can advance so much
> that time_after32() can't accurately compare them any more.
> 
> Therefore, we can end up in a situation where time_after32(now,
> last_overflow + HZ) returns false, just because these two values are
> too far apart. In that case, the synflood timestamp isn't updated as
> it should be, which can trick tcp_synq_no_recent_overflow() into
> rejecting valid syncookies.
> 
> For example, let's consider the following scenario on a system
> with HZ=1000:
> 
>   * The synflood timestamp is 0, either because that's the timestamp
>     of the last synflood or, more commonly, because we're working with
>     a freshly created socket.
> 
>   * We receive a new SYN, which triggers synflood protection. Let's say
>     that this happens when jiffies == 2147484649 (that is,
>     'synflood timestamp' + HZ + 2^31 + 1).
> 
>   * Then tcp_synq_overflow() doesn't update the synflood timestamp,
>     because time_after32(2147484649, 1000) returns false.
>     With:
>       - 2147484649: the value of jiffies, aka. 'now'.
>       - 1000: the value of 'last_overflow' + HZ.
> 
>   * A bit later, we receive the ACK completing the 3WHS. But
>     cookie_v[46]_check() rejects it because tcp_synq_no_recent_overflow()
>     says that we're not under synflood. That's because
>     time_after32(2147484649, 120000) returns false.
>     With:
>       - 2147484649: the value of jiffies, aka. 'now'.
>       - 120000: the value of 'last_overflow' + TCP_SYNCOOKIE_VALID.
> 
>     Of course, in reality jiffies would have increased a bit, but this
>     condition will last for the next 119 seconds, which is far enough
>     to accommodate for jiffie's growth.
> 
> Fix this by updating the overflow timestamp whenever jiffies isn't
> within the [last_overflow, last_overflow + HZ] range. That shouldn't
> have any performance impact since the update still happens at most once
> per second.
> 
> Now we're guaranteed to have fresh timestamps while under synflood, so
> tcp_synq_no_recent_overflow() can safely use it with time_after32() in
> such situations.
> 
> Stale timestamps can still make tcp_synq_no_recent_overflow() return
> the wrong verdict when not under synflood. This will be handled in the
> next patch.
> 
> For 64 bits architectures, the problem was introduced with the
> conversion of ->tw_ts_recent_stamp to 32 bits integer by commit
> cca9bab1b72c ("tcp: use monotonic timestamps for PAWS").
> The problem has always been there on 32 bits architectures.
> 
> Fixes: cca9bab1b72c ("tcp: use monotonic timestamps for PAWS")
> Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
> Signed-off-by: Guillaume Nault <gnault@redhat.com>
> ---
>  include/linux/time.h | 13 +++++++++++++
>  include/net/tcp.h    |  5 +++--
>  2 files changed, 16 insertions(+), 2 deletions(-)
> 
> diff --git a/include/linux/time.h b/include/linux/time.h
> index 0760a4f5a15c..30efc1f0f67d 100644
> --- a/include/linux/time.h
> +++ b/include/linux/time.h
> @@ -97,4 +97,17 @@ static inline bool itimerspec64_valid(const struct itimerspec64 *its)
>   */
>  #define time_after32(a, b)	((s32)((u32)(b) - (u32)(a)) < 0)
>  #define time_before32(b, a)	time_after32(a, b)
> +
> +/**
> + * time_before32 - check if a 32-bit timestamp is within a given time range

Wrong name ? This should be time_between32

> + * @t:	the time which may be within [l,h]
> + * @l:	the lower bound of the range
> + * @h:	the higher bound of the range
> + *
> + * time_before32(t, l, h) returns true if @l <= @t <= @h. All operands are
> + * treated as 32-bit integers.
> + *
> + * Equivalent to !(time_before32(@t, @l) || time_after32(@t, @h)).
> + */
> +#define time_between32(t, l, h) ((u32)(h) - (u32)(l) >= (u32)(t) - (u32)(l))
>  #endif
> diff --git a/include/net/tcp.h b/include/net/tcp.h
> index 36f195fb576a..7d734ba391fc 100644
> --- a/include/net/tcp.h
> +++ b/include/net/tcp.h
> @@ -494,14 +494,15 @@ static inline void tcp_synq_overflow(const struct sock *sk)
>  		reuse = rcu_dereference(sk->sk_reuseport_cb);
>  		if (likely(reuse)) {
>  			last_overflow = READ_ONCE(reuse->synq_overflow_ts);
> -			if (time_after32(now, last_overflow + HZ))
> +			if (!time_between32(now, last_overflow,
> +					    last_overflow + HZ))
>  				WRITE_ONCE(reuse->synq_overflow_ts, now);
>  			return;
>  		}
>  	}
>  
>  	last_overflow = tcp_sk(sk)->rx_opt.ts_recent_stamp;
> -	if (time_after32(now, last_overflow + HZ))
> +	if (!time_between32(now, last_overflow, last_overflow + HZ))
>  		tcp_sk(sk)->rx_opt.ts_recent_stamp = now;
>  }
>  
>
Guillaume Nault Dec. 6, 2019, 11:03 a.m. UTC | #2
On Thu, Dec 05, 2019 at 07:41:34PM -0800, Eric Dumazet wrote:
> 
> 
> On 12/5/19 5:49 PM, Guillaume Nault wrote:
> > +/**
> > + * time_before32 - check if a 32-bit timestamp is within a given time range
> 
> Wrong name ? This should be time_between32
> 
Ouch, yes. Will fix in v4.
diff mbox series

Patch

diff --git a/include/linux/time.h b/include/linux/time.h
index 0760a4f5a15c..30efc1f0f67d 100644
--- a/include/linux/time.h
+++ b/include/linux/time.h
@@ -97,4 +97,17 @@  static inline bool itimerspec64_valid(const struct itimerspec64 *its)
  */
 #define time_after32(a, b)	((s32)((u32)(b) - (u32)(a)) < 0)
 #define time_before32(b, a)	time_after32(a, b)
+
+/**
+ * time_before32 - check if a 32-bit timestamp is within a given time range
+ * @t:	the time which may be within [l,h]
+ * @l:	the lower bound of the range
+ * @h:	the higher bound of the range
+ *
+ * time_before32(t, l, h) returns true if @l <= @t <= @h. All operands are
+ * treated as 32-bit integers.
+ *
+ * Equivalent to !(time_before32(@t, @l) || time_after32(@t, @h)).
+ */
+#define time_between32(t, l, h) ((u32)(h) - (u32)(l) >= (u32)(t) - (u32)(l))
 #endif
diff --git a/include/net/tcp.h b/include/net/tcp.h
index 36f195fb576a..7d734ba391fc 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -494,14 +494,15 @@  static inline void tcp_synq_overflow(const struct sock *sk)
 		reuse = rcu_dereference(sk->sk_reuseport_cb);
 		if (likely(reuse)) {
 			last_overflow = READ_ONCE(reuse->synq_overflow_ts);
-			if (time_after32(now, last_overflow + HZ))
+			if (!time_between32(now, last_overflow,
+					    last_overflow + HZ))
 				WRITE_ONCE(reuse->synq_overflow_ts, now);
 			return;
 		}
 	}
 
 	last_overflow = tcp_sk(sk)->rx_opt.ts_recent_stamp;
-	if (time_after32(now, last_overflow + HZ))
+	if (!time_between32(now, last_overflow, last_overflow + HZ))
 		tcp_sk(sk)->rx_opt.ts_recent_stamp = now;
 }