[ovs-dev,v1,7/9] conntrack: Do not rate limit ct-sweep

Message ID	3871314632b1a1b09b3d83dad1f73122ee1377ef.1613557616.git.grive@u256.net
State	Changes Requested
Headers	show Return-Path: <ovs-dev-bounces@openvswitch.org> From: Gaetan Rivet <grive@u256.net> To: dev@openvswitch.org Date: Wed, 17 Feb 2021 17:34:41 +0100 Message-Id: <3871314632b1a1b09b3d83dad1f73122ee1377ef.1613557616.git.grive@u256.net> In-Reply-To: <cover.1613557616.git.grive@u256.net> References: <cover.1613557616.git.grive@u256.net> MIME-Version: 1.0 Cc: Eli Britstein <elibr@nvidia.com>, ameerm@nvidia.com, majd@nvidia.com Subject: [ovs-dev] [PATCH v1 7/9] conntrack: Do not rate limit ct-sweep Precedence: list Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Errors-To: ovs-dev-bounces@openvswitch.org Sender: "dev" <ovs-dev-bounces@openvswitch.org>
Series	conntrack: improve multithread scalability \| expand [ovs-dev,v1,0/9] conntrack: improve multithread scalability [ovs-dev,v1,1/9] conntrack: Use rcu-lists to store conn expirations [ovs-dev,v1,2/9] conntrack: Use a cmap to store zone limits [ovs-dev,v1,3/9] conntrack: Init hash basis first at creation [ovs-dev,v1,4/9] conntrack-tp: Use a cmap to store timeout policies [ovs-dev,v1,5/9] conntrack: Inverse conn and ct lock precedence [ovs-dev,v1,6/9] conntrack: Do not schedule zero ms timers [ovs-dev,v1,7/9] conntrack: Do not rate limit ct-sweep [ovs-dev,v1,8/9] conntrack: Do not log empty ct-sweep [ovs-dev,v1,9/9] conntrack: Use an atomic conn expiration value

Message ID

3871314632b1a1b09b3d83dad1f73122ee1377ef.1613557616.git.grive@u256.net

State

Changes Requested

Headers

From: Gaetan Rivet <grive@u256.net>
To: dev@openvswitch.org
Date: Wed, 17 Feb 2021 17:34:41 +0100
Message-Id: 
 <3871314632b1a1b09b3d83dad1f73122ee1377ef.1613557616.git.grive@u256.net>
In-Reply-To: <cover.1613557616.git.grive@u256.net>
References: <cover.1613557616.git.grive@u256.net>
MIME-Version: 1.0
Cc: Eli Britstein <elibr@nvidia.com>, ameerm@nvidia.com, majd@nvidia.com
Subject: [ovs-dev] [PATCH v1 7/9] conntrack: Do not rate limit ct-sweep
Precedence: list
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Errors-To: ovs-dev-bounces@openvswitch.org
Sender: "dev" <ovs-dev-bounces@openvswitch.org>

Series

conntrack: improve multithread scalability | expand

Commit Message

Gaetan Rivet Feb. 17, 2021, 4:34 p.m. UTC

The current rate limit is set to allow other threads to update the
connections when applicable. This was valid when taking the 'ct_lock'
was needed with a global critical section.

Now that the size of the critical section for 'ct_lock' is reduced, it
is not necessary to rate limit calls to ct_sweep() anymore.

Signed-off-by: Gaetan Rivet <grive@u256.net>
Reviewed-by: Eli Britstein <elibr@nvidia.com>
---
 lib/conntrack.c | 24 +++++++-----------------
 1 file changed, 7 insertions(+), 17 deletions(-)

Comments

William Tu Feb. 23, 2021, 10:55 p.m. UTC | #1

On Wed, Feb 17, 2021 at 8:34 AM Gaetan Rivet <grive@u256.net> wrote:
>
> The current rate limit is set to allow other threads to update the
> connections when applicable. This was valid when taking the 'ct_lock'
> was needed with a global critical section.
>
> Now that the size of the critical section for 'ct_lock' is reduced, it
> is not necessary to rate limit calls to ct_sweep() anymore.
>
> Signed-off-by: Gaetan Rivet <grive@u256.net>
> Reviewed-by: Eli Britstein <elibr@nvidia.com>
> ---
>  lib/conntrack.c | 24 +++++++-----------------
>  1 file changed, 7 insertions(+), 17 deletions(-)
>
> diff --git a/lib/conntrack.c b/lib/conntrack.c
> index 71f79a790..1b21b79bd 100644
> --- a/lib/conntrack.c
> +++ b/lib/conntrack.c
> @@ -1602,20 +1602,12 @@ conntrack_clean(struct conntrack *ct, long long now)
>   * there is an actual connection that expires, or because a new connection
>   * might be created with the minimum timeout).
>   *
> - * The logic below has two goals:
> - *
> - * - We want to reduce the number of wakeups and batch connection cleanup
> - *   when the load is not very high.  CT_CLEAN_INTERVAL ensures that if we
> - *   are coping with the current cleanup tasks, then we wait at least
> - *   5 seconds to do further cleanup.
> - *
> - * - We don't want to keep the map locked too long, as we might prevent
> - *   traffic from flowing.  CT_CLEAN_MIN_INTERVAL ensures that if cleanup is
> - *   behind, there is at least some 200ms blocks of time when the map will be
> - *   left alone, so the datapath can operate unhindered.
> + * We want to reduce the number of wakeups and batch connection cleanup
> + * when the load is not very high.  CT_CLEAN_INTERVAL ensures that if we
> + * are coping with the current cleanup tasks, then we wait at least
> + * 5 seconds to do further cleanup.
>   */

IIUC, it's either wait for next 5-second interval, or keep cleaning when behind.
It depends on how fine grained people program the timeout value.
If users program s.t like 2-second, probably in reality it takes longer
to timeout.


William

>  #define CT_CLEAN_INTERVAL 5000 /* 5 seconds */
> -#define CT_CLEAN_MIN_INTERVAL 200  /* 0.2 seconds */
>
>  static void *
>  clean_thread_main(void *f_)
> @@ -1627,12 +1619,10 @@ clean_thread_main(void *f_)
>          long long now = time_msec();
>          next_wake = conntrack_clean(ct, now);
>
> -        if (next_wake < now) {
> -            poll_immediate_wake();
> -        } else if (next_wake < now + CT_CLEAN_MIN_INTERVAL) {
> -            poll_timer_wait_until(now + CT_CLEAN_MIN_INTERVAL);
> +        if (next_wake > now) {
> +            poll_timer_wait_until(MIN(next_wake, now + CT_CLEAN_INTERVAL));
>          } else {
> -            poll_timer_wait_until(MAX(next_wake, now + CT_CLEAN_INTERVAL));
> +            poll_immediate_wake();
>          }
>          latch_wait(&ct->clean_thread_exit);
>          poll_block();
> --
> 2.30.0
>

Gaetan Rivet Feb. 24, 2021, 12:35 a.m. UTC | #2

On Tue, Feb 23, 2021, at 22:55, William Tu wrote:
> On Wed, Feb 17, 2021 at 8:34 AM Gaetan Rivet <grive@u256.net> wrote:
> >
> > The current rate limit is set to allow other threads to update the
> > connections when applicable. This was valid when taking the 'ct_lock'
> > was needed with a global critical section.
> >
> > Now that the size of the critical section for 'ct_lock' is reduced, it
> > is not necessary to rate limit calls to ct_sweep() anymore.
> >
> > Signed-off-by: Gaetan Rivet <grive@u256.net>
> > Reviewed-by: Eli Britstein <elibr@nvidia.com>
> > ---
> >  lib/conntrack.c | 24 +++++++-----------------
> >  1 file changed, 7 insertions(+), 17 deletions(-)
> >
> > diff --git a/lib/conntrack.c b/lib/conntrack.c
> > index 71f79a790..1b21b79bd 100644
> > --- a/lib/conntrack.c
> > +++ b/lib/conntrack.c
> > @@ -1602,20 +1602,12 @@ conntrack_clean(struct conntrack *ct, long long now)
> >   * there is an actual connection that expires, or because a new connection
> >   * might be created with the minimum timeout).
> >   *
> > - * The logic below has two goals:
> > - *
> > - * - We want to reduce the number of wakeups and batch connection cleanup
> > - *   when the load is not very high.  CT_CLEAN_INTERVAL ensures that if we
> > - *   are coping with the current cleanup tasks, then we wait at least
> > - *   5 seconds to do further cleanup.
> > - *
> > - * - We don't want to keep the map locked too long, as we might prevent
> > - *   traffic from flowing.  CT_CLEAN_MIN_INTERVAL ensures that if cleanup is
> > - *   behind, there is at least some 200ms blocks of time when the map will be
> > - *   left alone, so the datapath can operate unhindered.
> > + * We want to reduce the number of wakeups and batch connection cleanup
> > + * when the load is not very high.  CT_CLEAN_INTERVAL ensures that if we
> > + * are coping with the current cleanup tasks, then we wait at least
> > + * 5 seconds to do further cleanup.
> >   */
> 
> IIUC, it's either wait for next 5-second interval, or keep cleaning when behind.
> It depends on how fine grained people program the timeout value.
> If users program s.t like 2-second, probably in reality it takes longer
> to timeout.
> 
> 
> William
> 

Yes, if the timeout is less than CT_CLEAN_INTERVAL, it will be scheduled sooner, but it is still subject to potential backlog from ct_clean().

Removing the lower rate limit is a way to reduce this backlog, which should help getting closer to the programmed timeouts.

As the ct_lock is now taken for shorter spans, I thought it was appropriate to revisit these limits, but maybe others would prefer ct_clean to take less resources.

> >  #define CT_CLEAN_INTERVAL 5000 /* 5 seconds */
> > -#define CT_CLEAN_MIN_INTERVAL 200  /* 0.2 seconds */
> >
> >  static void *
> >  clean_thread_main(void *f_)
> > @@ -1627,12 +1619,10 @@ clean_thread_main(void *f_)
> >          long long now = time_msec();
> >          next_wake = conntrack_clean(ct, now);
> >
> > -        if (next_wake < now) {
> > -            poll_immediate_wake();
> > -        } else if (next_wake < now + CT_CLEAN_MIN_INTERVAL) {
> > -            poll_timer_wait_until(now + CT_CLEAN_MIN_INTERVAL);
> > +        if (next_wake > now) {
> > +            poll_timer_wait_until(MIN(next_wake, now + CT_CLEAN_INTERVAL));
> >          } else {
> > -            poll_timer_wait_until(MAX(next_wake, now + CT_CLEAN_INTERVAL));
> > +            poll_immediate_wake();
> >          }
> >          latch_wait(&ct->clean_thread_exit);
> >          poll_block();
> > --
> > 2.30.0
> >
>

diff --git a/lib/conntrack.c b/lib/conntrack.c
index 71f79a790..1b21b79bd 100644
--- a/lib/conntrack.c
+++ b/lib/conntrack.c
@@ -1602,20 +1602,12 @@  conntrack_clean(struct conntrack *ct, long long now)
  * there is an actual connection that expires, or because a new connection
  * might be created with the minimum timeout).
  *
- * The logic below has two goals:
- *
- * - We want to reduce the number of wakeups and batch connection cleanup
- *   when the load is not very high.  CT_CLEAN_INTERVAL ensures that if we
- *   are coping with the current cleanup tasks, then we wait at least
- *   5 seconds to do further cleanup.
- *
- * - We don't want to keep the map locked too long, as we might prevent
- *   traffic from flowing.  CT_CLEAN_MIN_INTERVAL ensures that if cleanup is
- *   behind, there is at least some 200ms blocks of time when the map will be
- *   left alone, so the datapath can operate unhindered.
+ * We want to reduce the number of wakeups and batch connection cleanup
+ * when the load is not very high.  CT_CLEAN_INTERVAL ensures that if we
+ * are coping with the current cleanup tasks, then we wait at least
+ * 5 seconds to do further cleanup.
  */
 #define CT_CLEAN_INTERVAL 5000 /* 5 seconds */
-#define CT_CLEAN_MIN_INTERVAL 200  /* 0.2 seconds */
 
 static void *
 clean_thread_main(void *f_)
@@ -1627,12 +1619,10 @@  clean_thread_main(void *f_)
         long long now = time_msec();
         next_wake = conntrack_clean(ct, now);
 
-        if (next_wake < now) {
-            poll_immediate_wake();
-        } else if (next_wake < now + CT_CLEAN_MIN_INTERVAL) {
-            poll_timer_wait_until(now + CT_CLEAN_MIN_INTERVAL);
+        if (next_wake > now) {
+            poll_timer_wait_until(MIN(next_wake, now + CT_CLEAN_INTERVAL));
         } else {
-            poll_timer_wait_until(MAX(next_wake, now + CT_CLEAN_INTERVAL));
+            poll_immediate_wake();
         }
         latch_wait(&ct->clean_thread_exit);
         poll_block();

[ovs-dev,v1,7/9] conntrack: Do not rate limit ct-sweep

Commit Message

Comments

Patch