[ovs-dev,v4,1/3] dpif-netdev: Add parameters to configure auto load balance.

Message ID 20201217192331.549753-2-ktraynor@redhat.com
State Superseded
Series Add auto load balance parameters

Commit Message

Kevin Traynor Dec. 17, 2020, 7:23 p.m. UTC
From: Christophe Fontaine <cfontain@redhat.com>

Two important parts of how auto load balance operates are how
loaded a core needs to be and how much improvement is estimated
before an auto load balance can trigger.

Previously they were hardcoded to 95% loaded and 25% variance
improvement.

These default values may not be suitable for all use cases and
we may want to use a more (or less) aggressive rebalance, either
on the pmd load threshold or on the minimum variance improvement
threshold.

The defaults are not changed, but "pmd-auto-lb-pmd-load" and
"pmd-auto-lb-improvement" parameters are added to override the defaults.

$ ovs-vsctl set open_vswitch . other_config:pmd-auto-lb-pmd-load="70"
$ ovs-vsctl set open_vswitch . other_config:pmd-auto-lb-improvement="20"

Signed-off-by: Christophe Fontaine <cfontain@redhat.com>
Signed-off-by: Kevin Traynor <ktraynor@redhat.com>
---
 NEWS                 |  1 +
 lib/dpif-netdev.c    | 38 ++++++++++++++++++++++++++++++++------
 vswitchd/vswitch.xml | 27 ++++++++++++++++++++++++++-
 3 files changed, 59 insertions(+), 7 deletions(-)
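
As a rough illustration of the two thresholds described in the commit
message, the sketch below models the trigger logic in standalone C. The
helper names are hypothetical and do not come from the patch; only the
percentage comparison and the ((curr - new) * 100) / curr expression mirror
what the diff shows, and the guard for zero or non-improving variance is an
assumption made so the sketch is safe on its own.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: a PMD counts as overloaded for an interval once its
 * load (percent of used cycles) reaches the configured load threshold
 * ("pmd-auto-lb-pmd-load", default 95). */
static bool
pmd_is_overloaded(uint8_t pmd_load_pct, uint8_t load_thresh_pct)
{
    assert(load_thresh_pct <= 100);
    return pmd_load_pct >= load_thresh_pct;
}

/* Hypothetical helper: estimated variance improvement in percent.
 * Assumption: report 0 when there is nothing to improve or the new
 * assignment is no better, which also avoids dividing by zero. */
static uint64_t
variance_improvement_pct(uint64_t curr_variance, uint64_t new_variance)
{
    if (curr_variance == 0 || new_variance >= curr_variance) {
        return 0;
    }
    return ((curr_variance - new_variance) * 100) / curr_variance;
}

/* Hypothetical helper: a dry run permits a rebalance only if the estimated
 * improvement is not below the improvement threshold
 * ("pmd-auto-lb-improvement", default 25). */
static bool
rebalance_allowed(uint64_t curr_variance, uint64_t new_variance,
                  uint8_t improve_thresh_pct)
{
    return variance_improvement_pct(curr_variance, new_variance)
           >= improve_thresh_pct;
}
```

With the defaults (95 and 25), a PMD at 94% load is not counted as
overloaded, and a dry run whose variance drops from 400 to 320 (a 20%
improvement) does not permit a rebalance. Setting pmd-auto-lb-improvement
to 20 would permit it, and setting it to 0 permits a rebalance even with
no improvement at all.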

Comments

0-day Robot Dec. 17, 2020, 7:59 p.m. UTC | #1
Bleep bloop.  Greetings Kevin Traynor, I am a robot and I have tried out your patch.
Thanks for your contribution.

I encountered some error that I wasn't expecting.  See the details below.


checkpatch:
WARNING: Unexpected sign-offs from developers who are not authors or co-authors or committers: Kevin Traynor <ktraynor@redhat.com>
Lines checked: 189, Warnings: 1, Errors: 0


Please check this out.  If you feel there has been an error, please email aconole@redhat.com

Thanks,
0-day Robot
David Marchand Dec. 18, 2020, 3:55 p.m. UTC | #2
On Thu, Dec 17, 2020 at 8:23 PM Kevin Traynor <ktraynor@redhat.com> wrote:
>
> From: Christophe Fontaine <cfontain@redhat.com>
>
> Two important parts of how auto load balance operates is how
> loaded a core needs to be and how much improvement is estimated
> before an auto load balance can trigger.
>
> Previously they were hardcoded to 95% loaded and 25% variance
> improvement.
>
> These default values may not be suitable for all use cases and
> we may want to use a more (or less) aggressive rebalance, either
> on the pmd load threshold or on the minimum variance improvement
> threshold.
>
> The defaults are not changed, but "pmd-auto-lb-pmd-load" and
> "pmd-auto-lb-improvement" parameters are added to override the defaults.
>
> $ ovs-vsctl set open_vswitch . other_config:pmd-auto-lb-pmd-load="70"
> $ ovs-vsctl set open_vswitch . other_config:pmd-auto-lb-improvement="20"
>
> Signed-off-by: Christophe Fontaine <cfontain@redhat.com>
> Signed-off-by: Kevin Traynor <ktraynor@redhat.com>

Acked-by: David Marchand <david.marchand@redhat.com>

Just a nit: we could use a single denomination like *PMD* auto load
balancing/balance in both NEWS and the database fields description for
consistency and to avoid confusion with other OVS features.


--
David Marchand
Kevin Traynor Dec. 18, 2020, 5:56 p.m. UTC | #3
On 18/12/2020 15:55, David Marchand wrote:
> On Thu, Dec 17, 2020 at 8:23 PM Kevin Traynor <ktraynor@redhat.com> wrote:
>>
>> From: Christophe Fontaine <cfontain@redhat.com>
>>
>> Two important parts of how auto load balance operates is how
>> loaded a core needs to be and how much improvement is estimated
>> before an auto load balance can trigger.
>>
>> Previously they were hardcoded to 95% loaded and 25% variance
>> improvement.
>>
>> These default values may not be suitable for all use cases and
>> we may want to use a more (or less) aggressive rebalance, either
>> on the pmd load threshold or on the minimum variance improvement
>> threshold.
>>
>> The defaults are not changed, but "pmd-auto-lb-pmd-load" and
>> "pmd-auto-lb-improvement" parameters are added to override the defaults.
>>
>> $ ovs-vsctl set open_vswitch . other_config:pmd-auto-lb-pmd-load="70"
>> $ ovs-vsctl set open_vswitch . other_config:pmd-auto-lb-improvement="20"
>>
>> Signed-off-by: Christophe Fontaine <cfontain@redhat.com>
>> Signed-off-by: Kevin Traynor <ktraynor@redhat.com>
> 
> Acked-by: David Marchand <david.marchand@redhat.com>
> 
> Just a nit: we could use a single denomination like *PMD* auto load
> balancing/balance in both NEWS and the database fields description for
> consistency and avoid confusion with other OVS features.
> 

Thanks David. I was also missing a Co-Authored-by, so I made this more
consistent and sent a v5.

> 
> --
> David Marchand
>

Patch

diff --git a/NEWS b/NEWS
index 1a39cc661..9aa5a65c8 100644
--- a/NEWS
+++ b/NEWS
@@ -18,4 +18,5 @@  Post-v2.14.0
      * New 'options:dpdk-vf-mac' field for DPDK interface of VF ports,
        that allows configuring the MAC address of a VF representor.
+     * Add parameters to configure auto load balance behaviour.
    - The environment variable OVS_UNBOUND_CONF, if set, is now used
      as the DNS resolver's (unbound) configuration file.
diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index 300861ca5..eb19a3afc 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -86,7 +86,7 @@  VLOG_DEFINE_THIS_MODULE(dpif_netdev);
 
 /* Auto Load Balancing Defaults */
-#define ALB_ACCEPTABLE_IMPROVEMENT       25
-#define ALB_PMD_LOAD_THRESHOLD           95
-#define ALB_PMD_REBALANCE_POLL_INTERVAL  1 /* 1 Min */
+#define ALB_IMPROVEMENT_THRESHOLD    25
+#define ALB_LOAD_THRESHOLD           95
+#define ALB_REBALANCE_INTERVAL       1 /* 1 Min */
 #define MIN_TO_MSEC                  60000
 
@@ -301,4 +301,6 @@  struct pmd_auto_lb {
     uint64_t rebalance_intvl;
     uint64_t rebalance_poll_timer;
+    uint8_t rebalance_improve_thresh;
+    atomic_uint8_t rebalance_load_thresh;
 };
 
@@ -4260,4 +4262,6 @@  dpif_netdev_set_config(struct dpif *dpif, const struct smap *other_config)
     uint32_t tx_flush_interval, cur_tx_flush_interval;
     uint64_t rebalance_intvl;
+    uint8_t rebalance_load, cur_rebalance_load;
+    uint8_t rebalance_improve;
 
     tx_flush_interval = smap_get_int(other_config, "tx-flush-interval",
@@ -4337,5 +4341,5 @@  dpif_netdev_set_config(struct dpif *dpif, const struct smap *other_config)
 
     rebalance_intvl = smap_get_int(other_config, "pmd-auto-lb-rebal-interval",
-                              ALB_PMD_REBALANCE_POLL_INTERVAL);
+                                   ALB_REBALANCE_INTERVAL);
 
     /* Input is in min, convert it to msec. */
@@ -4347,4 +4351,23 @@  dpif_netdev_set_config(struct dpif *dpif, const struct smap *other_config)
     }
 
+    rebalance_improve = smap_get_int(other_config, "pmd-auto-lb-improvement",
+                                     ALB_IMPROVEMENT_THRESHOLD);
+    if (rebalance_improve > 100) {
+        rebalance_improve = ALB_IMPROVEMENT_THRESHOLD;
+    }
+    if (rebalance_improve != pmd_alb->rebalance_improve_thresh) {
+        pmd_alb->rebalance_improve_thresh = rebalance_improve;
+    }
+
+    rebalance_load = smap_get_int(other_config, "pmd-auto-lb-pmd-load",
+                                  ALB_LOAD_THRESHOLD);
+    if (rebalance_load > 100) {
+        rebalance_load = ALB_LOAD_THRESHOLD;
+    }
+    atomic_read_relaxed(&pmd_alb->rebalance_load_thresh, &cur_rebalance_load);
+    if (rebalance_load != cur_rebalance_load) {
+        atomic_store_relaxed(&pmd_alb->rebalance_load_thresh,
+                             rebalance_load);
+    }
     set_pmd_auto_lb(dp);
     return 0;
@@ -5675,5 +5698,5 @@  pmd_rebalance_dry_run(struct dp_netdev *dp)
                 ((curr_variance - new_variance) * 100) / curr_variance;
         }
-        if (improvement < ALB_ACCEPTABLE_IMPROVEMENT) {
+        if (improvement < dp->pmd_alb.rebalance_improve_thresh) {
             ret = false;
         }
@@ -8710,4 +8733,5 @@  dp_netdev_pmd_try_optimize(struct dp_netdev_pmd_thread *pmd,
     if (pmd->ctx.now > pmd->rxq_next_cycle_store) {
         uint64_t curr_tsc;
+        uint8_t rebalance_load_trigger;
         struct pmd_auto_lb *pmd_alb = &pmd->dp->pmd_alb;
         if (pmd_alb->is_enabled && !pmd->isolated
@@ -8726,5 +8750,7 @@  dp_netdev_pmd_try_optimize(struct dp_netdev_pmd_thread *pmd,
             }
 
-            if (pmd_load >= ALB_PMD_LOAD_THRESHOLD) {
+            atomic_read_relaxed(&pmd_alb->rebalance_load_thresh,
+                                &rebalance_load_trigger);
+            if (pmd_load >= rebalance_load_trigger) {
                 atomic_count_inc(&pmd->pmd_overloaded);
             } else {
diff --git a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml
index 89a876796..b974c8b21 100644
--- a/vswitchd/vswitch.xml
+++ b/vswitchd/vswitch.xml
@@ -655,5 +655,5 @@ 
          Configures PMD Auto Load Balancing that allows automatic assignment of
          RX queues to PMDs if any of PMDs is overloaded (i.e. processing cycles
-         > 95%).
+         > other_config:pmd-auto-lb-pmd-load).
         </p>
         <p>
@@ -691,4 +691,29 @@ 
         </p>
       </column>
+      <column name="other_config" key="pmd-auto-lb-pmd-load"
+              type='{"type": "integer", "minInteger": 0, "maxInteger": 100}'>
+        <p>
+         Specifies the minimum pmd load threshold (% of used cycles) of
+         any non-isolated pmds when an auto load balance may be triggered.
+        </p>
+        <p>
+         The default value is <code>95%</code>.
+        </p>
+      </column>
+      <column name="other_config" key="pmd-auto-lb-improvement"
+              type='{"type": "integer", "minInteger": 0, "maxInteger": 100}'>
+        <p>
+         Specifies the minimum evaluated % improvement in load distribution
+         across the non-isolated pmds that will allow an auto load balance to
+         occur.
+        </p>
+        <p>
+         Warning: setting this parameter to 0 will always allow an auto load
+         balance to occur, regardless of any estimated improvement.
+        </p>
+        <p>
+         The default value is <code>25%</code>.
+        </p>
+      </column>
       <column name="other_config" key="userspace-tso-enable"
               type='{"type": "boolean"}'>
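
One detail of the parsing in dpif_netdev_set_config() may be worth a second
look: the value returned by smap_get_int() is an int, but the patch stores it
in a uint8_t before the "> 100" check. In C that conversion wraps modulo 256,
so an input such as 300 would become 44 and pass the check instead of falling
back to the default. The sketch below shows one way to range-check before
narrowing. The helper name percent_or_default is hypothetical and not part of
the patch, and whether the reviewers would want this change is an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: validate a percentage while it is still an int,
 * and only then narrow it to uint8_t. Values outside 0..100 fall back to
 * the supplied default, matching the fallback behaviour in the patch. */
static uint8_t
percent_or_default(int value, uint8_t def)
{
    assert(def <= 100);
    if (value < 0 || value > 100) {
        return def;
    }
    return (uint8_t) value;
}
```

Under this sketch, percent_or_default(300, 25) yields 25, whereas a direct
(uint8_t) 300 conversion yields 44.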