Message ID | 20200813090713.48316-1-cfontain@redhat.com |
---|---|
State | Superseded |
Headers | show |
Series | [ovs-dev,v2] dpif-netdev: add parameters to configure autolb | expand |
cfontain@redhat.com writes: > From: Christophe Fontaine <cfontain@redhat.com> > > ALB_ACCEPTABLE_IMPROVEMENT and ALB_PMD_LOAD_THRESHOLD default values > can be overriden with "pmd-auto-lb-acc-improvement" and "pmd-auto-lb-threshold". > > Default values may not be suitable for all use cases, and we may want to > experiment a more (or less) aggressive rebalance, either on the threshold > (ie CPU usage which triggers a rebalance) or on the acceptable improvement > (ie if the new queue assignation will be applied or discarded). > > $ ovs-vsctl set open_vswitch . other_config:pmd-auto-lb-acc-improvement=20 > $ ovs-vsctl set open_vswitch . other_config:pmd-auto-lb-threshold=70 > > Signed-off-by: Christophe Fontaine <cfontain@redhat.com> > --- Hi Christophe, Thanks for the contribution. I think it should have a NEWS item. Also, add an entry into .mailmap and AUTHORS.rst with your name and email :) > lib/dpif-netdev.c | 12 ++++++++++-- > vswitchd/vswitch.xml | 28 +++++++++++++++++++++++++++- > 2 files changed, 37 insertions(+), 3 deletions(-) > > diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c > index 02df8f11e..b981706b5 100644 > --- a/lib/dpif-netdev.c > +++ b/lib/dpif-netdev.c > @@ -300,6 +300,8 @@ struct pmd_auto_lb { > bool is_enabled; /* Current status of Auto load balancing. */ > uint64_t rebalance_intvl; > uint64_t rebalance_poll_timer; > + uint64_t rebalance_acc_improvement; > + uint64_t rebalance_threshold; > }; > > /* Datapath based on the network device interface from netdev.h. > @@ -4346,6 +4348,12 @@ dpif_netdev_set_config(struct dpif *dpif, const struct smap *other_config) > pmd_alb->rebalance_intvl = rebalance_intvl; > } > > + pmd_alb->rebalance_acc_improvement = smap_get_int(other_config, > + "pmd-auto-lb-acc-improvement", ALB_ACCEPTABLE_IMPROVEMENT); > + > + pmd_alb->rebalance_threshold = smap_get_int(other_config, > + "pmd-auto-lb-threshold", ALB_PMD_LOAD_THRESHOLD); > + > set_pmd_auto_lb(dp); > return 0; > } > @@ -5674,7 +5682,7 @@ pmd_rebalance_dry_run(struct dp_netdev *dp) > improvement = > ((curr_variance - new_variance) * 100) / curr_variance; > } > - if (improvement < ALB_ACCEPTABLE_IMPROVEMENT) { > + if (improvement < dp->pmd_alb.rebalance_acc_improvement) { > ret = false; > } > } > @@ -8724,7 +8732,7 @@ dp_netdev_pmd_try_optimize(struct dp_netdev_pmd_thread *pmd, > pmd_load = ((tot_proc * 100) / (tot_idle + tot_proc)); > } > > - if (pmd_load >= ALB_PMD_LOAD_THRESHOLD) { > + if (pmd_load >= pmd_alb->rebalance_threshold) { > atomic_count_inc(&pmd->pmd_overloaded); > } else { > atomic_count_set(&pmd->pmd_overloaded, 0); > diff --git a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml > index 81c84927f..53f9be591 100644 > --- a/vswitchd/vswitch.xml > +++ b/vswitchd/vswitch.xml > @@ -654,7 +654,7 @@ > <p> > Configures PMD Auto Load Balancing that allows automatic assignment of > RX queues to PMDs if any of PMDs is overloaded (i.e. processing cycles > - > 95%). > + > other_config:pmd-auto-lb-threshold). > </p> > <p> > It uses current scheme of cycle based assignment of RX queues that > @@ -690,6 +690,32 @@ > once in few hours or a day or a week. > </p> > </column> > + <column name="other_config" key="pmd-auto-lb-threshold" > + type='{"type": "integer", > + "minInteger": 0, "maxInteger": 100}'> > + <p> > + Specifies the threshold defining when a PMD is overloaded. > + When this threshold is reached, it will trigger a request to > + rebalance the different queues between the non-pinned pmds. > + </p> > + <p> > + The default value is <code>95</code>. > + </p> > + </column> > + <column name="other_config" key="pmd-auto-lb-acc-improvement" > + type='{"type": "integer", > + "minInteger": 0, "maxInteger": 100}'> > + <p> > + When an auto load balance iteration is requested, this value > + defines if the new queue assignation will be applied: comparing > + the current queue assignation and the new one, the new assignation > + will be applied only if the improvement of CPU usage is above > + this value. > + </p> > + <p> > + The default value is <code>25</code>. > + </p> > + </column> > <column name="other_config" key="userspace-tso-enable" > type='{"type": "boolean"}'> > <p>
Hi Christophe, Thanks for sending the patch. Few comments below. On 13/08/2020 10:07, cfontain@redhat.com wrote: > From: Christophe Fontaine <cfontain@redhat.com> > > ALB_ACCEPTABLE_IMPROVEMENT and ALB_PMD_LOAD_THRESHOLD default values > can be overriden with "pmd-auto-lb-acc-improvement" and "pmd-auto-lb-threshold". > > Default values may not be suitable for all use cases, and we may want to > experiment a more (or less) aggressive rebalance, either on the threshold > (ie CPU usage which triggers a rebalance) or on the acceptable improvement > (ie if the new queue assignation will be applied or discarded). > > $ ovs-vsctl set open_vswitch . other_config:pmd-auto-lb-acc-improvement=20 > $ ovs-vsctl set open_vswitch . other_config:pmd-auto-lb-threshold=70 > Naming is hard. Trying to think what is a good name for a user not familiar with the internals... How about something like, pmd-auto-lb-variance/pmd-auto-lb-improvement/pmd-auto-lb-var-improvement and pmd-auto-lb-cpu. Just suggestions, there could be better names. > Signed-off-by: Christophe Fontaine <cfontain@redhat.com> > --- > lib/dpif-netdev.c | 12 ++++++++++-- > vswitchd/vswitch.xml | 28 +++++++++++++++++++++++++++- > 2 files changed, 37 insertions(+), 3 deletions(-) > > diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c > index 02df8f11e..b981706b5 100644 > --- a/lib/dpif-netdev.c > +++ b/lib/dpif-netdev.c > @@ -300,6 +300,8 @@ struct pmd_auto_lb { > bool is_enabled; /* Current status of Auto load balancing. */ > uint64_t rebalance_intvl; > uint64_t rebalance_poll_timer; > + uint64_t rebalance_acc_improvement; > + uint64_t rebalance_threshold; afaict, this one is being accessed in main and pmd threads and should be type atomic. > }; > > /* Datapath based on the network device interface from netdev.h. > @@ -4346,6 +4348,12 @@ dpif_netdev_set_config(struct dpif *dpif, const struct smap *other_config) > pmd_alb->rebalance_intvl = rebalance_intvl; > } > > + pmd_alb->rebalance_acc_improvement = smap_get_int(other_config, > + "pmd-auto-lb-acc-improvement", ALB_ACCEPTABLE_IMPROVEMENT); > + > + pmd_alb->rebalance_threshold = smap_get_int(other_config, > + "pmd-auto-lb-threshold", ALB_PMD_LOAD_THRESHOLD); You could check if these have increased (see other args above) and if so, clear the pmd_overloaded for each pmd (like in https://github.com/openvswitch/ovs/blob/master/lib/dpif-netdev.c#L5657). Otherwise, overloads by the old threshold may have already been recorded and contribute towards a rebalance happening. OTOH, you could argue that they were legitimate when they were recorded and should stay. > + > set_pmd_auto_lb(dp); set_pmd_auto_lib() prints the rebalance interval when it is enabled, I think the two new parameters should be printed too now. > return 0; > } > @@ -5674,7 +5682,7 @@ pmd_rebalance_dry_run(struct dp_netdev *dp) > improvement = > ((curr_variance - new_variance) * 100) / curr_variance; > } > - if (improvement < ALB_ACCEPTABLE_IMPROVEMENT) { > + if (improvement < dp->pmd_alb.rebalance_acc_improvement) { > ret = false; > } > } Now that the user is setting the variance % improvement required it makes sense to add some debug so they can see %'s rather than just the variance values. > @@ -8724,7 +8732,7 @@ dp_netdev_pmd_try_optimize(struct dp_netdev_pmd_thread *pmd, > pmd_load = ((tot_proc * 100) / (tot_idle + tot_proc)); > } > > - if (pmd_load >= ALB_PMD_LOAD_THRESHOLD) { > + if (pmd_load >= pmd_alb->rebalance_threshold) { > atomic_count_inc(&pmd->pmd_overloaded); > } else { > atomic_count_set(&pmd->pmd_overloaded, 0); > diff --git a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml > index 81c84927f..53f9be591 100644 > --- a/vswitchd/vswitch.xml > +++ b/vswitchd/vswitch.xml > @@ -654,7 +654,7 @@ > <p> > Configures PMD Auto Load Balancing that allows automatic assignment of > RX queues to PMDs if any of PMDs is overloaded (i.e. processing cycles > - > 95%). > + > other_config:pmd-auto-lb-threshold). We can't say overloaded as it may be a lowish value now. It's not worded great in the first place as CPU is not the only consideration. Could be something like "...if CPU thresholds are exceeded and there is an imbalance between PMDs." > </p> > <p> > It uses current scheme of cycle based assignment of RX queues that > @@ -690,6 +690,32 @@ > once in few hours or a day or a week. > </p> > </column> > + <column name="other_config" key="pmd-auto-lb-threshold" > + type='{"type": "integer", > + "minInteger": 0, "maxInteger": 100}'> > + <p> > + Specifies the threshold defining when a PMD is overloaded. > + When this threshold is reached, it will trigger a request to > + rebalance the different queues between the non-pinned pmds. Need to mention the units - % of used cycles on the PMD. > + </p> > + <p> > + The default value is <code>95</code>. > + </p> > + </column> > + <column name="other_config" key="pmd-auto-lb-acc-improvement" > + type='{"type": "integer", > + "minInteger": 0, "maxInteger": 100}'> > + <p> > + When an auto load balance iteration is requested, this value > + defines if the new queue assignation will be applied: comparing > + the current queue assignation and the new one, the new assignation > + will be applied only if the improvement of CPU usage is above > + this value. Need to mention the units - % variance between the loads on the PMDs. > + </p> > + <p> > + The default value is <code>25</code>. > + </p> > + </column> > <column name="other_config" key="userspace-tso-enable" > type='{"type": "boolean"}'> > <p> > You could add a few sentences to the auto-load balance section in pmd.rst also, as it's a bit more humanly readable than here. thanks, Kevin.
diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c index 02df8f11e..b981706b5 100644 --- a/lib/dpif-netdev.c +++ b/lib/dpif-netdev.c @@ -300,6 +300,8 @@ struct pmd_auto_lb { bool is_enabled; /* Current status of Auto load balancing. */ uint64_t rebalance_intvl; uint64_t rebalance_poll_timer; + uint64_t rebalance_acc_improvement; + uint64_t rebalance_threshold; }; /* Datapath based on the network device interface from netdev.h. @@ -4346,6 +4348,12 @@ dpif_netdev_set_config(struct dpif *dpif, const struct smap *other_config) pmd_alb->rebalance_intvl = rebalance_intvl; } + pmd_alb->rebalance_acc_improvement = smap_get_int(other_config, + "pmd-auto-lb-acc-improvement", ALB_ACCEPTABLE_IMPROVEMENT); + + pmd_alb->rebalance_threshold = smap_get_int(other_config, + "pmd-auto-lb-threshold", ALB_PMD_LOAD_THRESHOLD); + set_pmd_auto_lb(dp); return 0; } @@ -5674,7 +5682,7 @@ pmd_rebalance_dry_run(struct dp_netdev *dp) improvement = ((curr_variance - new_variance) * 100) / curr_variance; } - if (improvement < ALB_ACCEPTABLE_IMPROVEMENT) { + if (improvement < dp->pmd_alb.rebalance_acc_improvement) { ret = false; } } @@ -8724,7 +8732,7 @@ dp_netdev_pmd_try_optimize(struct dp_netdev_pmd_thread *pmd, pmd_load = ((tot_proc * 100) / (tot_idle + tot_proc)); } - if (pmd_load >= ALB_PMD_LOAD_THRESHOLD) { + if (pmd_load >= pmd_alb->rebalance_threshold) { atomic_count_inc(&pmd->pmd_overloaded); } else { atomic_count_set(&pmd->pmd_overloaded, 0); diff --git a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml index 81c84927f..53f9be591 100644 --- a/vswitchd/vswitch.xml +++ b/vswitchd/vswitch.xml @@ -654,7 +654,7 @@ <p> Configures PMD Auto Load Balancing that allows automatic assignment of RX queues to PMDs if any of PMDs is overloaded (i.e. processing cycles - > 95%). + > other_config:pmd-auto-lb-threshold). </p> <p> It uses current scheme of cycle based assignment of RX queues that @@ -690,6 +690,32 @@ once in few hours or a day or a week. </p> </column> + <column name="other_config" key="pmd-auto-lb-threshold" + type='{"type": "integer", + "minInteger": 0, "maxInteger": 100}'> + <p> + Specifies the threshold defining when a PMD is overloaded. + When this threshold is reached, it will trigger a request to + rebalance the different queues between the non-pinned pmds. + </p> + <p> + The default value is <code>95</code>. + </p> + </column> + <column name="other_config" key="pmd-auto-lb-acc-improvement" + type='{"type": "integer", + "minInteger": 0, "maxInteger": 100}'> + <p> + When an auto load balance iteration is requested, this value + defines if the new queue assignation will be applied: comparing + the current queue assignation and the new one, the new assignation + will be applied only if the improvement of CPU usage is above + this value. + </p> + <p> + The default value is <code>25</code>. + </p> + </column> <column name="other_config" key="userspace-tso-enable" type='{"type": "boolean"}'> <p>