diff mbox series

[ovs-dev] ovn-controller: Reduce size of the SB monitor condition.

Message ID 20230606145706.175849-1-dceara@redhat.com
State Changes Requested
Headers show
Series [ovs-dev] ovn-controller: Reduce size of the SB monitor condition. | expand

Checks

Context Check Description
ovsrobot/apply-robot success apply and check: success
ovsrobot/github-robot-_Build_and_Test success github build: passed
ovsrobot/github-robot-_ovn-kubernetes fail github build: failed

Commit Message

Dumitru Ceara June 6, 2023, 2:57 p.m. UTC
We don't need to explicitly add port bindings that were already bound
locally.  We implicitly get those because we monitor the datapaths
they're attached to.

When performing an ovn-heater 500-node density-heavy scale test [0], with
conditional monitoring enabled, the unreasonably long poll intervals on
the Southbound database (the ones that took more than one second) are
measured as:

  ---------------------------------------------------------------------
       Count      Min        Max       Median      Mean   95 percentile
  ---------------------------------------------------------------------
        56.0     1010.0     2664.0     1455.5     1544.9     2163.0
        77.0     1015.0     3460.0     1896.0     1858.2     2907.8
        69.0     1010.0     3118.0     1618.0     1688.1     2582.4
  ---------------------------------------------------------------------
       202.0     1010.0     3460.0     1610.0     1713.3     2711.5

Compared to the baseline results (also with conditional monitoring
enabled):
  ---------------------------------------------------------------------
       Count      Min        Max       Median      Mean   95 percentile
  ---------------------------------------------------------------------
       141.0     1003.0    18338.0     1721.0     2625.4     7626.0
       151.0     1019.0    80297.0     1808.0     3410.7     9089.0
       165.0     1007.0    50736.0     2354.0     3067.7     7309.8
  ---------------------------------------------------------------------
       457.0     1003.0    80297.0     2131.0     3044.6     7799.6

We see significant improvement on the server side.  This is explained
by the fact that now the Southbound database server doesn't need to
apply as many conditions as before when filtering individual monitor
contents.

Note: Sub-ports - OVN switch ports with parent_port set - have to be
monitored unconditionally as we cannot efficiently determine their local
datapaths without including all local OVS interfaces in the monitor.
This, however, should not be a huge issue because the majority of ports
are regular VIFs, not sub-ports.

[0] https://github.com/ovn-org/ovn-heater/blob/main/test-scenarios/ocp-500-density-heavy.yml

Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=2139194
Reported-by: Ilya Maximets <i.maximets@ovn.org>
Signed-off-by: Dumitru Ceara <dceara@redhat.com>
---
 controller/binding.c        |  2 +-
 controller/binding.h        |  2 +-
 controller/ovn-controller.c | 19 ++++++++++++++++---
 3 files changed, 18 insertions(+), 5 deletions(-)

Comments

Dumitru Ceara June 6, 2023, 7:05 p.m. UTC | #1
On 6/6/23 20:00, 0-day Robot wrote:
> From: robot@bytheb.org
> 
> Test-Label: github-robot: ovn-kubernetes
> Test-Status: fail
> http://patchwork.ozlabs.org/api/patches/1791271/
> 
> _github build: failed_
> Build URL: https://github.com/ovsrobot/ovn/actions/runs/5190342903
> Build Logs:

The actual failure is here:

https://github.com/ovsrobot/ovn/actions/runs/5190342903/jobs/9357157532#step:12:16795

"[Fail] External Gateway test suite With Admin Policy Based External
Route CRs e2e multiple external gateway validation [BeforeEach] Should
validate TCP/UDP connectivity to multiple external gateways for a UDP /
TCP scenario IPV4 udp"

I reached out to Surya (in CC) and she confirmed that this is a new
ovn-kubernetes test that also occasionally fails in upstream
ovn-org/ovn-kubernetes.  It's probably safe to ignore for now and should
be fixed in the ovn-kubernetes code base in the near future.

Regards,
Dumitru

> -----------------------Summary of failed steps-----------------------
> "e2e (control-plane, HA, shared, ipv4, noSnatGW)" failed at step Run Tests
> ----------------------End summary of failed steps--------------------
> 
> -------------------------------BEGIN LOGS----------------------------
> ####################################################################################
> #### [Begin job log] "e2e (control-plane, HA, shared, ipv4, noSnatGW)" at step Run Tests
> ####################################################################################
> FAIL! -- 125 Passed | 1 Failed | 0 Flaked | 0 Pending | 97 Skipped
> 
> You're using deprecated Ginkgo functionality:
> =============================================
> Ginkgo 2.0 is under active development and will introduce several new features, improvements, and a small handful of breaking changes.
> A release candidate for 2.0 is now available and 2.0 should GA in Fall 2021.  Please give the RC a try and send us feedback!
>   - To learn more, view the migration guide at https://github.com/onsi/ginkgo/blob/ver2/docs/MIGRATING_TO_V2.md
>   - For instructions on using the Release Candidate visit https://github.com/onsi/ginkgo/blob/ver2/docs/MIGRATING_TO_V2.md#using-the-beta
>   - To comment, chime in at https://github.com/onsi/ginkgo/issues/711
> 
>   You are using a custom reporter.  Support for custom reporters will likely be removed in V2.  Most users were using them to generate junit or teamcity reports and this functionality will be merged into the core reporter.  In addition, Ginkgo 2.0 will support emitting a JSON-formatted report that users can then manipulate to generate custom reports.
> 
>   If this change will be impactful to you please leave a comment on https://github.com/onsi/ginkgo/issues/711
>   Learn more at: https://github.com/onsi/ginkgo/blob/ver2/docs/MIGRATING_TO_V2.md#removed-custom-reporters
> 
> To silence deprecations that can be silenced set the following environment variable:
>   ACK_GINKGO_DEPRECATIONS=1.16.5
> 
> --- FAIL: TestE2e (7183.72s)
> FAIL
> FAIL	github.com/ovn-org/ovn-kubernetes/test/e2e	7183.785s
> FAIL
> make: *** [Makefile:32: control-plane] Error 1
> make: Leaving directory '/home/runner/work/ovn/ovn/src/github.com/ovn-org/ovn-kubernetes/test'
> ##[error]Process completed with exit code 2.
> ####################################################################################
> #### [End job log] "e2e (control-plane, HA, shared, ipv4, noSnatGW)" at step Run Tests
> ####################################################################################
> --------------------------------END LOGS-----------------------------
>
Han Zhou June 20, 2023, 1:49 a.m. UTC | #2
On Tue, Jun 6, 2023 at 7:57 AM Dumitru Ceara <dceara@redhat.com> wrote:
>
> We don't need to explicitly add port bindings that were already bound
> locally.  We implicitly get those because we monitor the datapaths
> they're attached to.
>
> When performing an ovn-heater 500-node density-heavy scale test [0], with
> conditional monitoring enabled, the unreasonably long poll intervals on
> the Southbound database (the ones that took more than one second) are
> measured as:
>
>   ---------------------------------------------------------------------
>        Count      Min        Max       Median      Mean   95 percentile
>   ---------------------------------------------------------------------
>         56.0     1010.0     2664.0     1455.5     1544.9     2163.0
>         77.0     1015.0     3460.0     1896.0     1858.2     2907.8
>         69.0     1010.0     3118.0     1618.0     1688.1     2582.4
>   ---------------------------------------------------------------------
>        202.0     1010.0     3460.0     1610.0     1713.3     2711.5
>
> Compared to the baseline results (also with conditional monitoring
> enabled):
>   ---------------------------------------------------------------------
>        Count      Min        Max       Median      Mean   95 percentile
>   ---------------------------------------------------------------------
>        141.0     1003.0    18338.0     1721.0     2625.4     7626.0
>        151.0     1019.0    80297.0     1808.0     3410.7     9089.0
>        165.0     1007.0    50736.0     2354.0     3067.7     7309.8
>   ---------------------------------------------------------------------
>        457.0     1003.0    80297.0     2131.0     3044.6     7799.6
>
> We see significant improvement on the server side.  This is explained
> by the fact that now the Southbound database server doesn't need to
> apply as many conditions as before when filtering individual monitor
> contents.
>
Thanks Dumitru for the great improvement! This is very helpful for the high
port-density environment.
Just to make sure I understand the test result correctly, in [0], it shows
22500 pods and 500 nodes, so is it 45 pods per node?

> Note: Sub-ports - OVN switch ports with parent_port set - have to be
> monitored unconditionally as we cannot efficiently determine their local
> datapaths without including all local OVS interfaces in the monitor.
> This, however, should not be a huge issue because the majority of ports
> are regular VIFs, not sub-ports.

I am not sure if we can make such a conclusion. For the current ovn-k8s or
environments similar to that, it shouldn't be a problem.
However, for environments that model pods/containers as sub-ports of the VM
VIFs, probably most of the majority of the ports would be sub-ports. This
is what sub-ports are designed for, right?
So, I think this would be a significant change of data monitored for those
environments. I'd suggest at least we should properly document the
implication in the documents (such as ovn-monitor-all, and also the
sub-port related parts). There may also be such users who prefer not
monitoring all sub-ports (for efficiency of ovn-controller) sacrificing SB
DB performance (probably they don't have very high port density so the
conditional monitoring impact is not that big). I am not aware of any such
users yet, but if they complain, we will have to provide a knob, if no
better ideas.

Other than this, the patch looks good to me.

Acked-by: Han Zhou <hzhou@ovn.org>

>
> [0]
https://github.com/ovn-org/ovn-heater/blob/main/test-scenarios/ocp-500-density-heavy.yml
>
> Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=2139194
> Reported-by: Ilya Maximets <i.maximets@ovn.org>
> Signed-off-by: Dumitru Ceara <dceara@redhat.com>
> ---
>  controller/binding.c        |  2 +-
>  controller/binding.h        |  2 +-
>  controller/ovn-controller.c | 19 ++++++++++++++++---
>  3 files changed, 18 insertions(+), 5 deletions(-)
>
> diff --git a/controller/binding.c b/controller/binding.c
> index 9b0647b70e..ad03332915 100644
> --- a/controller/binding.c
> +++ b/controller/binding.c
> @@ -761,7 +761,7 @@ local_binding_data_destroy(struct local_binding_data
*lbinding_data)
>  }
>
>  const struct sbrec_port_binding *
> -local_binding_get_primary_pb(struct shash *local_bindings,
> +local_binding_get_primary_pb(const struct shash *local_bindings,
>                               const char *pb_name)
>  {
>      struct local_binding *lbinding =
> diff --git a/controller/binding.h b/controller/binding.h
> index 0e57f02ee5..319cbd263c 100644
> --- a/controller/binding.h
> +++ b/controller/binding.h
> @@ -150,7 +150,7 @@ void local_binding_data_init(struct
local_binding_data *);
>  void local_binding_data_destroy(struct local_binding_data *);
>
>  const struct sbrec_port_binding *local_binding_get_primary_pb(
> -    struct shash *local_bindings, const char *pb_name);
> +    const struct shash *local_bindings, const char *pb_name);
>  ofp_port_t local_binding_get_lport_ofport(const struct shash
*local_bindings,
>                                            const char *pb_name);
>
> diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c
> index 3a81a13fb0..c82f0697e8 100644
> --- a/controller/ovn-controller.c
> +++ b/controller/ovn-controller.c
> @@ -199,6 +199,7 @@ static unsigned int
>  update_sb_monitors(struct ovsdb_idl *ovnsb_idl,
>                     const struct sbrec_chassis *chassis,
>                     const struct sset *local_ifaces,
> +                   const struct shash *local_bindings,
>                     struct hmap *local_datapaths,
>                     bool monitor_all)
>  {
> @@ -297,10 +298,21 @@ update_sb_monitors(struct ovsdb_idl *ovnsb_idl,
>
>      if (local_ifaces) {
>          const char *name;
> +
> +        ovs_assert(local_bindings);
>          SSET_FOR_EACH (name, local_ifaces) {
> +            /* Skip the VIFs we bound already, we should have a local
datapath
> +             * for those. */
> +            const struct sbrec_port_binding *local_pb
> +                = local_binding_get_primary_pb(local_bindings, name);
> +            if (local_pb && get_lport_type(local_pb) == LP_VIF) {
> +                continue;
> +            }
>              sbrec_port_binding_add_clause_logical_port(&pb, OVSDB_F_EQ,
name);
> -            sbrec_port_binding_add_clause_parent_port(&pb, OVSDB_F_EQ,
name);
>          }
> +        /* Monitor all sub-ports unconditionally; we don't expect a lot
of
> +         * them in the SB database. */
> +        sbrec_port_binding_add_clause_parent_port(&pb, OVSDB_F_NE, NULL);
>      }
>      if (local_datapaths) {
>          const struct local_datapath *ld;
> @@ -609,7 +621,7 @@ update_sb_db(struct ovsdb_idl *ovs_idl, struct
ovsdb_idl *ovnsb_idl,
>           * extra cost. Instead, it is called after the engine execution
only
>           * when it is necessary. */
>          unsigned int next_cond_seqno =
> -            update_sb_monitors(ovnsb_idl, NULL, NULL, NULL, true);
> +            update_sb_monitors(ovnsb_idl, NULL, NULL, NULL, NULL, true);
>          if (sb_cond_seqno) {
>              *sb_cond_seqno = next_cond_seqno;
>          }
> @@ -4712,7 +4724,7 @@ main(int argc, char *argv[])
>      ovsdb_idl_omit(ovnsb_idl_loop.idl,
>                     &sbrec_chassis_private_col_external_ids);
>
> -    update_sb_monitors(ovnsb_idl_loop.idl, NULL, NULL, NULL, false);
> +    update_sb_monitors(ovnsb_idl_loop.idl, NULL, NULL, NULL, NULL,
false);
>
>      stopwatch_create(CONTROLLER_LOOP_STOPWATCH_NAME, SW_MS);
>      stopwatch_create(OFCTRL_PUT_STOPWATCH_NAME, SW_MS);
> @@ -5355,6 +5367,7 @@ main(int argc, char *argv[])
>                                  update_sb_monitors(
>                                      ovnsb_idl_loop.idl, chassis,
>                                      &runtime_data->local_lports,
> +
 &runtime_data->lbinding_data.bindings,
>                                      &runtime_data->local_datapaths,
>                                      sb_monitor_all);
>                          }
> --
> 2.31.1
>
> _______________________________________________
> dev mailing list
> dev@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Dumitru Ceara June 20, 2023, 7:48 a.m. UTC | #3
On 6/20/23 03:49, Han Zhou wrote:
> On Tue, Jun 6, 2023 at 7:57 AM Dumitru Ceara <dceara@redhat.com> wrote:
>>
>> We don't need to explicitly add port bindings that were already bound
>> locally.  We implicitly get those because we monitor the datapaths
>> they're attached to.
>>
>> When performing an ovn-heater 500-node density-heavy scale test [0], with
>> conditional monitoring enabled, the unreasonably long poll intervals on
>> the Southbound database (the ones that took more than one second) are
>> measured as:
>>
>>   ---------------------------------------------------------------------
>>        Count      Min        Max       Median      Mean   95 percentile
>>   ---------------------------------------------------------------------
>>         56.0     1010.0     2664.0     1455.5     1544.9     2163.0
>>         77.0     1015.0     3460.0     1896.0     1858.2     2907.8
>>         69.0     1010.0     3118.0     1618.0     1688.1     2582.4
>>   ---------------------------------------------------------------------
>>        202.0     1010.0     3460.0     1610.0     1713.3     2711.5
>>
>> Compared to the baseline results (also with conditional monitoring
>> enabled):
>>   ---------------------------------------------------------------------
>>        Count      Min        Max       Median      Mean   95 percentile
>>   ---------------------------------------------------------------------
>>        141.0     1003.0    18338.0     1721.0     2625.4     7626.0
>>        151.0     1019.0    80297.0     1808.0     3410.7     9089.0
>>        165.0     1007.0    50736.0     2354.0     3067.7     7309.8
>>   ---------------------------------------------------------------------
>>        457.0     1003.0    80297.0     2131.0     3044.6     7799.6
>>
>> We see significant improvement on the server side.  This is explained
>> by the fact that now the Southbound database server doesn't need to
>> apply as many conditions as before when filtering individual monitor
>> contents.
>>
> Thanks Dumitru for the great improvement! This is very helpful for the high
> port-density environment.
> Just to make sure I understand the test result correctly, in [0], it shows
> 22500 pods and 500 nodes, so is it 45 pods per node?
> 

Yes, for density-heavy tests (load balancers are also configured) the
pod density is 45 per node.

>> Note: Sub-ports - OVN switch ports with parent_port set - have to be
>> monitored unconditionally as we cannot efficiently determine their local
>> datapaths without including all local OVS interfaces in the monitor.
>> This, however, should not be a huge issue because the majority of ports
>> are regular VIFs, not sub-ports.
> 
> I am not sure if we can make such a conclusion. For the current ovn-k8s or
> environments similar to that, it shouldn't be a problem.
> However, for environments that model pods/containers as sub-ports of the VM
> VIFs, probably most of the majority of the ports would be sub-ports. This
> is what sub-ports are designed for, right?

My impression was that this was one of the use cases for OpenStack and
that it's only one of the different ways of providing container
connectivity in a given deployment.  But I might be wrong.  I can remove
this sentence, it makes a lot of assumptions indeed.

> So, I think this would be a significant change of data monitored for those
> environments. I'd suggest at least we should properly document the
> implication in the documents (such as ovn-monitor-all, and also the
> sub-port related parts). There may also be such users who prefer not
> monitoring all sub-ports (for efficiency of ovn-controller) sacrificing SB
> DB performance (probably they don't have very high port density so the
> conditional monitoring impact is not that big). I am not aware of any such
> users yet, but if they complain, we will have to provide a knob, if no
> better ideas.
> 

I agree, if really needed, we can easily add a knob.

What do you think of the following incremental?  I can fold it in if it
looks good to you.

diff --git a/TODO.rst b/TODO.rst
index 2a5c68ea3f..115cf54e70 100644
--- a/TODO.rst
+++ b/TODO.rst
@@ -189,3 +189,9 @@ OVN To-do List
 * Load Balancer templates
 
   * Support combining the VIP IP and port into a single template variable.
+
+* ovn-controller conditional monitoring
+
+  * Improve sub-ports (with parent_port set) conditional monitoring; these
+    are currently unconditionally monitored, even if ovn-monitor-all is
+    set to false.
diff --git a/controller/ovn-controller.8.xml b/controller/ovn-controller.8.xml
index f61f430084..0883d8da9b 100644
--- a/controller/ovn-controller.8.xml
+++ b/controller/ovn-controller.8.xml
@@ -128,6 +128,12 @@
           to <code>true</code> for environments that all workloads need to be
           reachable from each other.
         </p>
+        <p>
+          NOTE: for efficiency and scalability in common scenarios
+          <code>ovn-controller</code> unconditionally monitors all sub-ports
+          (ports with <code>parent_port</code> set) regardless of the
+          <code>ovn-monitor-all</code> value.
+        </p>
         <p>
           Default value is <var>false</var>.
         </p>
---

> Other than this, the patch looks good to me.
> 
> Acked-by: Han Zhou <hzhou@ovn.org>
> 

Thanks for the review!

Regards,
Dumitru

>>
>> [0]
> https://github.com/ovn-org/ovn-heater/blob/main/test-scenarios/ocp-500-density-heavy.yml
Han Zhou June 20, 2023, 7:01 p.m. UTC | #4
On Tue, Jun 20, 2023 at 12:48 AM Dumitru Ceara <dceara@redhat.com> wrote:
>
> On 6/20/23 03:49, Han Zhou wrote:
> > On Tue, Jun 6, 2023 at 7:57 AM Dumitru Ceara <dceara@redhat.com> wrote:
> >>
> >> We don't need to explicitly add port bindings that were already bound
> >> locally.  We implicitly get those because we monitor the datapaths
> >> they're attached to.
> >>
> >> When performing an ovn-heater 500-node density-heavy scale test [0],
with
> >> conditional monitoring enabled, the unreasonably long poll intervals on
> >> the Southbound database (the ones that took more than one second) are
> >> measured as:
> >>
> >>   ---------------------------------------------------------------------
> >>        Count      Min        Max       Median      Mean   95 percentile
> >>   ---------------------------------------------------------------------
> >>         56.0     1010.0     2664.0     1455.5     1544.9     2163.0
> >>         77.0     1015.0     3460.0     1896.0     1858.2     2907.8
> >>         69.0     1010.0     3118.0     1618.0     1688.1     2582.4
> >>   ---------------------------------------------------------------------
> >>        202.0     1010.0     3460.0     1610.0     1713.3     2711.5
> >>
> >> Compared to the baseline results (also with conditional monitoring
> >> enabled):
> >>   ---------------------------------------------------------------------
> >>        Count      Min        Max       Median      Mean   95 percentile
> >>   ---------------------------------------------------------------------
> >>        141.0     1003.0    18338.0     1721.0     2625.4     7626.0
> >>        151.0     1019.0    80297.0     1808.0     3410.7     9089.0
> >>        165.0     1007.0    50736.0     2354.0     3067.7     7309.8
> >>   ---------------------------------------------------------------------
> >>        457.0     1003.0    80297.0     2131.0     3044.6     7799.6
> >>
> >> We see significant improvement on the server side.  This is explained
> >> by the fact that now the Southbound database server doesn't need to
> >> apply as many conditions as before when filtering individual monitor
> >> contents.
> >>
> > Thanks Dumitru for the great improvement! This is very helpful for the
high
> > port-density environment.
> > Just to make sure I understand the test result correctly, in [0], it
shows
> > 22500 pods and 500 nodes, so is it 45 pods per node?
> >
>
> Yes, for density-heavy tests (load balancers are also configured) the
> pod density is 45 per node.
>
> >> Note: Sub-ports - OVN switch ports with parent_port set - have to be
> >> monitored unconditionally as we cannot efficiently determine their
local
> >> datapaths without including all local OVS interfaces in the monitor.
> >> This, however, should not be a huge issue because the majority of ports
> >> are regular VIFs, not sub-ports.
> >
> > I am not sure if we can make such a conclusion. For the current ovn-k8s
or
> > environments similar to that, it shouldn't be a problem.
> > However, for environments that model pods/containers as sub-ports of
the VM
> > VIFs, probably most of the majority of the ports would be sub-ports.
This
> > is what sub-ports are designed for, right?
>
> My impression was that this was one of the use cases for OpenStack and
> that it's only one of the different ways of providing container
> connectivity in a given deployment.  But I might be wrong.  I can remove
> this sentence, it makes a lot of assumptions indeed.
>
> > So, I think this would be a significant change of data monitored for
those
> > environments. I'd suggest at least we should properly document the
> > implication in the documents (such as ovn-monitor-all, and also the
> > sub-port related parts). There may also be such users who prefer not
> > monitoring all sub-ports (for efficiency of ovn-controller) sacrificing
SB
> > DB performance (probably they don't have very high port density so the
> > conditional monitoring impact is not that big). I am not aware of any
such
> > users yet, but if they complain, we will have to provide a knob, if no
> > better ideas.
> >
>
> I agree, if really needed, we can easily add a knob.
>
> What do you think of the following incremental?  I can fold it in if it
> looks good to you.

Thanks Dumitru. The below documentation looks good. In addition, I think we
should add some notes in the ovn-nb.xml under the section <group
title="Containers"> of Logical_Switch_Port, which is the place where the
"sub-port" feature is described. Could you add it as well?

Regards,
Han
>
> diff --git a/TODO.rst b/TODO.rst
> index 2a5c68ea3f..115cf54e70 100644
> --- a/TODO.rst
> +++ b/TODO.rst
> @@ -189,3 +189,9 @@ OVN To-do List
>  * Load Balancer templates
>
>    * Support combining the VIP IP and port into a single template
variable.
> +
> +* ovn-controller conditional monitoring
> +
> +  * Improve sub-ports (with parent_port set) conditional monitoring;
these
> +    are currently unconditionally monitored, even if ovn-monitor-all is
> +    set to false.
> diff --git a/controller/ovn-controller.8.xml
b/controller/ovn-controller.8.xml
> index f61f430084..0883d8da9b 100644
> --- a/controller/ovn-controller.8.xml
> +++ b/controller/ovn-controller.8.xml
> @@ -128,6 +128,12 @@
>            to <code>true</code> for environments that all workloads need
to be
>            reachable from each other.
>          </p>
> +        <p>
> +          NOTE: for efficiency and scalability in common scenarios
> +          <code>ovn-controller</code> unconditionally monitors all
sub-ports
> +          (ports with <code>parent_port</code> set) regardless of the
> +          <code>ovn-monitor-all</code> value.
> +        </p>
>          <p>
>            Default value is <var>false</var>.
>          </p>
> ---
>
> > Other than this, the patch looks good to me.
> >
> > Acked-by: Han Zhou <hzhou@ovn.org>
> >
>
> Thanks for the review!
>
> Regards,
> Dumitru
>
> >>
> >> [0]
> >
https://github.com/ovn-org/ovn-heater/blob/main/test-scenarios/ocp-500-density-heavy.yml
>
>
Dumitru Ceara June 21, 2023, 9:49 a.m. UTC | #5
On 6/20/23 21:01, Han Zhou wrote:
> On Tue, Jun 20, 2023 at 12:48 AM Dumitru Ceara <dceara@redhat.com> wrote:
>>
>> On 6/20/23 03:49, Han Zhou wrote:
>>> On Tue, Jun 6, 2023 at 7:57 AM Dumitru Ceara <dceara@redhat.com> wrote:
>>>>
>>>> We don't need to explicitly add port bindings that were already bound
>>>> locally.  We implicitly get those because we monitor the datapaths
>>>> they're attached to.
>>>>
>>>> When performing an ovn-heater 500-node density-heavy scale test [0],
> with
>>>> conditional monitoring enabled, the unreasonably long poll intervals on
>>>> the Southbound database (the ones that took more than one second) are
>>>> measured as:
>>>>
>>>>   ---------------------------------------------------------------------
>>>>        Count      Min        Max       Median      Mean   95 percentile
>>>>   ---------------------------------------------------------------------
>>>>         56.0     1010.0     2664.0     1455.5     1544.9     2163.0
>>>>         77.0     1015.0     3460.0     1896.0     1858.2     2907.8
>>>>         69.0     1010.0     3118.0     1618.0     1688.1     2582.4
>>>>   ---------------------------------------------------------------------
>>>>        202.0     1010.0     3460.0     1610.0     1713.3     2711.5
>>>>
>>>> Compared to the baseline results (also with conditional monitoring
>>>> enabled):
>>>>   ---------------------------------------------------------------------
>>>>        Count      Min        Max       Median      Mean   95 percentile
>>>>   ---------------------------------------------------------------------
>>>>        141.0     1003.0    18338.0     1721.0     2625.4     7626.0
>>>>        151.0     1019.0    80297.0     1808.0     3410.7     9089.0
>>>>        165.0     1007.0    50736.0     2354.0     3067.7     7309.8
>>>>   ---------------------------------------------------------------------
>>>>        457.0     1003.0    80297.0     2131.0     3044.6     7799.6
>>>>
>>>> We see significant improvement on the server side.  This is explained
>>>> by the fact that now the Southbound database server doesn't need to
>>>> apply as many conditions as before when filtering individual monitor
>>>> contents.
>>>>
>>> Thanks Dumitru for the great improvement! This is very helpful for the
> high
>>> port-density environment.
>>> Just to make sure I understand the test result correctly, in [0], it
> shows
>>> 22500 pods and 500 nodes, so is it 45 pods per node?
>>>
>>
>> Yes, for density-heavy tests (load balancers are also configured) the
>> pod density is 45 per node.
>>
>>>> Note: Sub-ports - OVN switch ports with parent_port set - have to be
>>>> monitored unconditionally as we cannot efficiently determine their
> local
>>>> datapaths without including all local OVS interfaces in the monitor.
>>>> This, however, should not be a huge issue because the majority of ports
>>>> are regular VIFs, not sub-ports.
>>>
>>> I am not sure if we can make such a conclusion. For the current ovn-k8s
> or
>>> environments similar to that, it shouldn't be a problem.
>>> However, for environments that model pods/containers as sub-ports of
> the VM
>>> VIFs, probably most of the majority of the ports would be sub-ports.
> This
>>> is what sub-ports are designed for, right?
>>
>> My impression was that this was one of the use cases for OpenStack and
>> that it's only one of the different ways of providing container
>> connectivity in a given deployment.  But I might be wrong.  I can remove
>> this sentence, it makes a lot of assumptions indeed.
>>
>>> So, I think this would be a significant change of data monitored for
> those
>>> environments. I'd suggest at least we should properly document the
>>> implication in the documents (such as ovn-monitor-all, and also the
>>> sub-port related parts). There may also be such users who prefer not
>>> monitoring all sub-ports (for efficiency of ovn-controller) sacrificing
> SB
>>> DB performance (probably they don't have very high port density so the
>>> conditional monitoring impact is not that big). I am not aware of any
> such
>>> users yet, but if they complain, we will have to provide a knob, if no
>>> better ideas.
>>>
>>
>> I agree, if really needed, we can easily add a knob.
>>
>> What do you think of the following incremental?  I can fold it in if it
>> looks good to you.
> 
> Thanks Dumitru. The below documentation looks good. In addition, I think we
> should add some notes in the ovn-nb.xml under the section <group
> title="Containers"> of Logical_Switch_Port, which is the place where the
> "sub-port" feature is described. Could you add it as well?
> 

To avoid an "incremental on top of another incremental" I posted v2.  I
also added a NEWS item as this is a user-visible change.

Please let me know what you think.

https://patchwork.ozlabs.org/project/ovn/patch/20230621094753.150072-1-dceara@redhat.com/

Thanks,
Dumitru
diff mbox series

Patch

diff --git a/controller/binding.c b/controller/binding.c
index 9b0647b70e..ad03332915 100644
--- a/controller/binding.c
+++ b/controller/binding.c
@@ -761,7 +761,7 @@  local_binding_data_destroy(struct local_binding_data *lbinding_data)
 }
 
 const struct sbrec_port_binding *
-local_binding_get_primary_pb(struct shash *local_bindings,
+local_binding_get_primary_pb(const struct shash *local_bindings,
                              const char *pb_name)
 {
     struct local_binding *lbinding =
diff --git a/controller/binding.h b/controller/binding.h
index 0e57f02ee5..319cbd263c 100644
--- a/controller/binding.h
+++ b/controller/binding.h
@@ -150,7 +150,7 @@  void local_binding_data_init(struct local_binding_data *);
 void local_binding_data_destroy(struct local_binding_data *);
 
 const struct sbrec_port_binding *local_binding_get_primary_pb(
-    struct shash *local_bindings, const char *pb_name);
+    const struct shash *local_bindings, const char *pb_name);
 ofp_port_t local_binding_get_lport_ofport(const struct shash *local_bindings,
                                           const char *pb_name);
 
diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c
index 3a81a13fb0..c82f0697e8 100644
--- a/controller/ovn-controller.c
+++ b/controller/ovn-controller.c
@@ -199,6 +199,7 @@  static unsigned int
 update_sb_monitors(struct ovsdb_idl *ovnsb_idl,
                    const struct sbrec_chassis *chassis,
                    const struct sset *local_ifaces,
+                   const struct shash *local_bindings,
                    struct hmap *local_datapaths,
                    bool monitor_all)
 {
@@ -297,10 +298,21 @@  update_sb_monitors(struct ovsdb_idl *ovnsb_idl,
 
     if (local_ifaces) {
         const char *name;
+
+        ovs_assert(local_bindings);
         SSET_FOR_EACH (name, local_ifaces) {
+            /* Skip the VIFs we bound already, we should have a local datapath
+             * for those. */
+            const struct sbrec_port_binding *local_pb
+                = local_binding_get_primary_pb(local_bindings, name);
+            if (local_pb && get_lport_type(local_pb) == LP_VIF) {
+                continue;
+            }
             sbrec_port_binding_add_clause_logical_port(&pb, OVSDB_F_EQ, name);
-            sbrec_port_binding_add_clause_parent_port(&pb, OVSDB_F_EQ, name);
         }
+        /* Monitor all sub-ports unconditionally; we don't expect a lot of
+         * them in the SB database. */
+        sbrec_port_binding_add_clause_parent_port(&pb, OVSDB_F_NE, NULL);
     }
     if (local_datapaths) {
         const struct local_datapath *ld;
@@ -609,7 +621,7 @@  update_sb_db(struct ovsdb_idl *ovs_idl, struct ovsdb_idl *ovnsb_idl,
          * extra cost. Instead, it is called after the engine execution only
          * when it is necessary. */
         unsigned int next_cond_seqno =
-            update_sb_monitors(ovnsb_idl, NULL, NULL, NULL, true);
+            update_sb_monitors(ovnsb_idl, NULL, NULL, NULL, NULL, true);
         if (sb_cond_seqno) {
             *sb_cond_seqno = next_cond_seqno;
         }
@@ -4712,7 +4724,7 @@  main(int argc, char *argv[])
     ovsdb_idl_omit(ovnsb_idl_loop.idl,
                    &sbrec_chassis_private_col_external_ids);
 
-    update_sb_monitors(ovnsb_idl_loop.idl, NULL, NULL, NULL, false);
+    update_sb_monitors(ovnsb_idl_loop.idl, NULL, NULL, NULL, NULL, false);
 
     stopwatch_create(CONTROLLER_LOOP_STOPWATCH_NAME, SW_MS);
     stopwatch_create(OFCTRL_PUT_STOPWATCH_NAME, SW_MS);
@@ -5355,6 +5367,7 @@  main(int argc, char *argv[])
                                 update_sb_monitors(
                                     ovnsb_idl_loop.idl, chassis,
                                     &runtime_data->local_lports,
+                                    &runtime_data->lbinding_data.bindings,
                                     &runtime_data->local_datapaths,
                                     sb_monitor_all);
                         }