diff mbox series

[ovs-dev,4/4] ovn-controller: Support configuration option ovn-delay-nb-cfg-report.

Message ID 1599002805-103574-4-git-send-email-hzhou@ovn.org
State Changes Requested
Headers show
Series [ovs-dev,1/4] ovn-nbctl.8.xml: Fix section OVN_NBCTL_OPTIONS. | expand

Commit Message

Han Zhou Sept. 1, 2020, 11:26 p.m. UTC
This option is added to avoid the flooding of nb_cfg updates from
a large nubmer of hypervisors interfere with the original SB DB
data distribution and handling, so that the e2e control plane latency
can be measured more accurately during scale testing.

Signed-off-by: Han Zhou <hzhou@ovn.org>
---
 controller/ovn-controller.8.xml | 12 ++++++++++++
 controller/ovn-controller.c     |  9 +++++++++
 2 files changed, 21 insertions(+)

Comments

Numan Siddique Sept. 4, 2020, 6:18 p.m. UTC | #1
On Wed, Sep 2, 2020 at 4:57 AM Han Zhou <hzhou@ovn.org> wrote:

> This option is added to avoid the flooding of nb_cfg updates from
> a large nubmer of hypervisors interfere with the original SB DB
> data distribution and handling, so that the e2e control plane latency
> can be measured more accurately during scale testing.
>
> Signed-off-by: Han Zhou <hzhou@ovn.org>
> ---
>  controller/ovn-controller.8.xml | 12 ++++++++++++
>  controller/ovn-controller.c     |  9 +++++++++
>  2 files changed, 21 insertions(+)
>
> diff --git a/controller/ovn-controller.8.xml
> b/controller/ovn-controller.8.xml
> index 6687731..4cb3f41 100644
> --- a/controller/ovn-controller.8.xml
> +++ b/controller/ovn-controller.8.xml
> @@ -233,6 +233,18 @@
>          The boolean flag indicates if the chassis is used as an
>          interconnection gateway.
>        </dd>
> +
> +      <dt><code>external_ids:ovn-delay-nb-cfg-report</code></dt>
> +      <dd>
> +        An integer that represents the time (in seconds) to sleep before
> +        <code>ovn-controller</code> updating <code>nb_cfg</code> column
> +        of <code>Chassis_Private</code> table after it completes
> processing
> +        the corresponding changes. By default it is 0 (meaning do not
> sleep).
> +        This value can be configured when doing control plane latency
> testing,
> +        to make sure the updates of <code>nb_cfg</code> to the Southbound
> DB
> +        from a large number of hypervisors do not interfere with the
> original
> +        Southbound DB data processing.
> +      </dd>
>      </dl>
>
>      <p>
> diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c
> index 89f10a6..5ab5a13 100644
> --- a/controller/ovn-controller.c
> +++ b/controller/ovn-controller.c
> @@ -2568,6 +2568,15 @@ main(int argc, char *argv[])
>                      sbrec_chassis_private_set_nb_cfg(chassis_private,
> cur_cfg);
>                      sbrec_chassis_private_set_nb_cfg_timestamp(
>                          chassis_private, time_wall_msec());
> +                    const struct ovsrec_open_vswitch *cfg =
> +                        ovsrec_open_vswitch_first(ovs_idl_loop.idl);
> +                    if (cfg) {
> +                        int delay_cfg_report =
> smap_get_int(&cfg->external_ids,
> +                                               "ovn-delay-nb-cfg-report",
> 0);
> +                        if (delay_cfg_report) {
> +                            xsleep(delay_cfg_report);
> +                        }
> +                    }
>

Hi Han,

Maybe we can also sync this option from the ovs config to chassis
other_config in south db ?

What do you think ?

Thanks
Numan


>                  }
>              }
>
> --
> 2.1.0
>
> _______________________________________________
> dev mailing list
> dev@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>
>
Han Zhou Sept. 4, 2020, 6:33 p.m. UTC | #2
On Fri, Sep 4, 2020 at 11:18 AM Numan Siddique <numans@ovn.org> wrote:
>
>
>
> On Wed, Sep 2, 2020 at 4:57 AM Han Zhou <hzhou@ovn.org> wrote:
>>
>> This option is added to avoid the flooding of nb_cfg updates from
>> a large nubmer of hypervisors interfere with the original SB DB
>> data distribution and handling, so that the e2e control plane latency
>> can be measured more accurately during scale testing.
>>
>> Signed-off-by: Han Zhou <hzhou@ovn.org>
>> ---
>>  controller/ovn-controller.8.xml | 12 ++++++++++++
>>  controller/ovn-controller.c     |  9 +++++++++
>>  2 files changed, 21 insertions(+)
>>
>> diff --git a/controller/ovn-controller.8.xml
b/controller/ovn-controller.8.xml
>> index 6687731..4cb3f41 100644
>> --- a/controller/ovn-controller.8.xml
>> +++ b/controller/ovn-controller.8.xml
>> @@ -233,6 +233,18 @@
>>          The boolean flag indicates if the chassis is used as an
>>          interconnection gateway.
>>        </dd>
>> +
>> +      <dt><code>external_ids:ovn-delay-nb-cfg-report</code></dt>
>> +      <dd>
>> +        An integer that represents the time (in seconds) to sleep before
>> +        <code>ovn-controller</code> updating <code>nb_cfg</code> column
>> +        of <code>Chassis_Private</code> table after it completes
processing
>> +        the corresponding changes. By default it is 0 (meaning do not
sleep).
>> +        This value can be configured when doing control plane latency
testing,
>> +        to make sure the updates of <code>nb_cfg</code> to the
Southbound DB
>> +        from a large number of hypervisors do not interfere with the
original
>> +        Southbound DB data processing.
>> +      </dd>
>>      </dl>
>>
>>      <p>
>> diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c
>> index 89f10a6..5ab5a13 100644
>> --- a/controller/ovn-controller.c
>> +++ b/controller/ovn-controller.c
>> @@ -2568,6 +2568,15 @@ main(int argc, char *argv[])
>>                      sbrec_chassis_private_set_nb_cfg(chassis_private,
cur_cfg);
>>                      sbrec_chassis_private_set_nb_cfg_timestamp(
>>                          chassis_private, time_wall_msec());
>> +                    const struct ovsrec_open_vswitch *cfg =
>> +                        ovsrec_open_vswitch_first(ovs_idl_loop.idl);
>> +                    if (cfg) {
>> +                        int delay_cfg_report =
smap_get_int(&cfg->external_ids,
>> +
"ovn-delay-nb-cfg-report", 0);
>> +                        if (delay_cfg_report) {
>> +                            xsleep(delay_cfg_report);
>> +                        }
>> +                    }
>
>
> Hi Han,
>
> Maybe we can also sync this option from the ovs config to chassis
other_config in south db ?
>
> What do you think ?
>

Hi Numan,

I am not sure what's the purpose of syncing this to SB? It is used only by
ovn-controller, something similar to ovn-remote-probe-interval, etc.

Thanks,
Han

> Thanks
> Numan
>
>>
>>                  }
>>              }
>>
>> --
>> 2.1.0
>>
>> _______________________________________________
>> dev mailing list
>> dev@openvswitch.org
>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>>
Numan Siddique Sept. 4, 2020, 7:16 p.m. UTC | #3
On Sat, Sep 5, 2020, 12:04 AM Han Zhou <hzhou@ovn.org> wrote:

> On Fri, Sep 4, 2020 at 11:18 AM Numan Siddique <numans@ovn.org> wrote:
> >
> >
> >
> > On Wed, Sep 2, 2020 at 4:57 AM Han Zhou <hzhou@ovn.org> wrote:
> >>
> >> This option is added to avoid the flooding of nb_cfg updates from
> >> a large nubmer of hypervisors interfere with the original SB DB
> >> data distribution and handling, so that the e2e control plane latency
> >> can be measured more accurately during scale testing.
> >>
> >> Signed-off-by: Han Zhou <hzhou@ovn.org>
> >> ---
> >>  controller/ovn-controller.8.xml | 12 ++++++++++++
> >>  controller/ovn-controller.c     |  9 +++++++++
> >>  2 files changed, 21 insertions(+)
> >>
> >> diff --git a/controller/ovn-controller.8.xml
> b/controller/ovn-controller.8.xml
> >> index 6687731..4cb3f41 100644
> >> --- a/controller/ovn-controller.8.xml
> >> +++ b/controller/ovn-controller.8.xml
> >> @@ -233,6 +233,18 @@
> >>          The boolean flag indicates if the chassis is used as an
> >>          interconnection gateway.
> >>        </dd>
> >> +
> >> +      <dt><code>external_ids:ovn-delay-nb-cfg-report</code></dt>
> >> +      <dd>
> >> +        An integer that represents the time (in seconds) to sleep
> before
> >> +        <code>ovn-controller</code> updating <code>nb_cfg</code> column
> >> +        of <code>Chassis_Private</code> table after it completes
> processing
> >> +        the corresponding changes. By default it is 0 (meaning do not
> sleep).
> >> +        This value can be configured when doing control plane latency
> testing,
> >> +        to make sure the updates of <code>nb_cfg</code> to the
> Southbound DB
> >> +        from a large number of hypervisors do not interfere with the
> original
> >> +        Southbound DB data processing.
> >> +      </dd>
> >>      </dl>
> >>
> >>      <p>
> >> diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c
> >> index 89f10a6..5ab5a13 100644
> >> --- a/controller/ovn-controller.c
> >> +++ b/controller/ovn-controller.c
> >> @@ -2568,6 +2568,15 @@ main(int argc, char *argv[])
> >>                      sbrec_chassis_private_set_nb_cfg(chassis_private,
> cur_cfg);
> >>                      sbrec_chassis_private_set_nb_cfg_timestamp(
> >>                          chassis_private, time_wall_msec());
> >> +                    const struct ovsrec_open_vswitch *cfg =
> >> +                        ovsrec_open_vswitch_first(ovs_idl_loop.idl);
> >> +                    if (cfg) {
> >> +                        int delay_cfg_report =
> smap_get_int(&cfg->external_ids,
> >> +
> "ovn-delay-nb-cfg-report", 0);
> >> +                        if (delay_cfg_report) {
> >> +                            xsleep(delay_cfg_report);
> >> +                        }
> >> +                    }
> >
> >
> > Hi Han,
> >
> > Maybe we can also sync this option from the ovs config to chassis
> other_config in south db ?
> >
> > What do you think ?
> >
>
> Hi Numan,
>
> I am not sure what's the purpose of syncing this to SB? It is used only by
> ovn-controller, something similar to ovn-remote-probe-interval, etc.
>

We do copy some of the configs from ovs db to chassis other_config. But I
agree this config doesnt need to go to sb and probably CMS wouldn't care
much know about it.

Thanks
Numan


> Thanks,
> Han
>
> > Thanks
> > Numan
> >
> >>
> >>                  }
> >>              }
> >>
> >> --
> >> 2.1.0
> >>
> >> _______________________________________________
> >> dev mailing list
> >> dev@openvswitch.org
> >> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
> >>
> _______________________________________________
> dev mailing list
> dev@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>
>
Numan Siddique Sept. 7, 2020, 6:02 p.m. UTC | #4
On Sat, Sep 5, 2020 at 12:46 AM Numan Siddique <numans@ovn.org> wrote:

>
>
> On Sat, Sep 5, 2020, 12:04 AM Han Zhou <hzhou@ovn.org> wrote:
>
>> On Fri, Sep 4, 2020 at 11:18 AM Numan Siddique <numans@ovn.org> wrote:
>> >
>> >
>> >
>> > On Wed, Sep 2, 2020 at 4:57 AM Han Zhou <hzhou@ovn.org> wrote:
>> >>
>> >> This option is added to avoid the flooding of nb_cfg updates from
>> >> a large nubmer of hypervisors interfere with the original SB DB
>> >> data distribution and handling, so that the e2e control plane latency
>> >> can be measured more accurately during scale testing.
>> >>
>> >> Signed-off-by: Han Zhou <hzhou@ovn.org>
>>
>
Hi Han,

If I understand correctly, if the option is configured, ovn-controller will
sleep before it updates
the OVN Southbound database. Can you please explain a bit on how it helps
in latency measurement.

Also I don't think this option will be used during live deployments. So
instead of adding this as configuration
in the ovs db, I would suggest to add an ovn-appctl command -
debug/set-delay-nb-cfg-report (or  a more
appropriate name) so that a control plane latency test suite can set this
option -
   ovn-appctl -t ovn-controller  debug/set-delay-nb-cfg-report 3 (for
example).

Thanks
Numan



>> ---
>> >>  controller/ovn-controller.8.xml | 12 ++++++++++++
>> >>  controller/ovn-controller.c     |  9 +++++++++
>> >>  2 files changed, 21 insertions(+)
>> >>
>> >> diff --git a/controller/ovn-controller.8.xml
>> b/controller/ovn-controller.8.xml
>> >> index 6687731..4cb3f41 100644
>> >> --- a/controller/ovn-controller.8.xml
>> >> +++ b/controller/ovn-controller.8.xml
>> >> @@ -233,6 +233,18 @@
>> >>          The boolean flag indicates if the chassis is used as an
>> >>          interconnection gateway.
>> >>        </dd>
>> >> +
>> >> +      <dt><code>external_ids:ovn-delay-nb-cfg-report</code></dt>
>> >> +      <dd>
>> >> +        An integer that represents the time (in seconds) to sleep
>> before
>> >> +        <code>ovn-controller</code> updating <code>nb_cfg</code>
>> column
>> >> +        of <code>Chassis_Private</code> table after it completes
>> processing
>> >> +        the corresponding changes. By default it is 0 (meaning do not
>> sleep).
>> >> +        This value can be configured when doing control plane latency
>> testing,
>> >> +        to make sure the updates of <code>nb_cfg</code> to the
>> Southbound DB
>> >> +        from a large number of hypervisors do not interfere with the
>> original
>> >> +        Southbound DB data processing.
>> >> +      </dd>
>> >>      </dl>
>> >>
>> >>      <p>
>> >> diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c
>> >> index 89f10a6..5ab5a13 100644
>> >> --- a/controller/ovn-controller.c
>> >> +++ b/controller/ovn-controller.c
>> >> @@ -2568,6 +2568,15 @@ main(int argc, char *argv[])
>> >>                      sbrec_chassis_private_set_nb_cfg(chassis_private,
>> cur_cfg);
>> >>                      sbrec_chassis_private_set_nb_cfg_timestamp(
>> >>                          chassis_private, time_wall_msec());
>> >> +                    const struct ovsrec_open_vswitch *cfg =
>> >> +                        ovsrec_open_vswitch_first(ovs_idl_loop.idl);
>> >> +                    if (cfg) {
>> >> +                        int delay_cfg_report =
>> smap_get_int(&cfg->external_ids,
>> >> +
>> "ovn-delay-nb-cfg-report", 0);
>> >> +                        if (delay_cfg_report) {
>> >> +                            xsleep(delay_cfg_report);
>> >> +                        }
>> >> +                    }
>> >
>> >
>> > Hi Han,
>> >
>> > Maybe we can also sync this option from the ovs config to chassis
>> other_config in south db ?
>> >
>> > What do you think ?
>> >
>>
>> Hi Numan,
>>
>> I am not sure what's the purpose of syncing this to SB? It is used only by
>> ovn-controller, something similar to ovn-remote-probe-interval, etc.
>>
>
> We do copy some of the configs from ovs db to chassis other_config. But I
> agree this config doesnt need to go to sb and probably CMS wouldn't care
> much know about it.
>
> Thanks
> Numan
>
>
>> Thanks,
>> Han
>>
>> > Thanks
>> > Numan
>> >
>> >>
>> >>                  }
>> >>              }
>> >>
>> >> --
>> >> 2.1.0
>> >>
>> >> _______________________________________________
>> >> dev mailing list
>> >> dev@openvswitch.org
>> >> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>> >>
>> _______________________________________________
>> dev mailing list
>> dev@openvswitch.org
>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>>
>>
Han Zhou Sept. 8, 2020, 1:22 a.m. UTC | #5
On Mon, Sep 7, 2020 at 11:02 AM Numan Siddique <numans@ovn.org> wrote:
>
>
>
> On Sat, Sep 5, 2020 at 12:46 AM Numan Siddique <numans@ovn.org> wrote:
>>
>>
>>
>> On Sat, Sep 5, 2020, 12:04 AM Han Zhou <hzhou@ovn.org> wrote:
>>>
>>> On Fri, Sep 4, 2020 at 11:18 AM Numan Siddique <numans@ovn.org> wrote:
>>> >
>>> >
>>> >
>>> > On Wed, Sep 2, 2020 at 4:57 AM Han Zhou <hzhou@ovn.org> wrote:
>>> >>
>>> >> This option is added to avoid the flooding of nb_cfg updates from
>>> >> a large nubmer of hypervisors interfere with the original SB DB
>>> >> data distribution and handling, so that the e2e control plane latency
>>> >> can be measured more accurately during scale testing.
>>> >>
>>> >> Signed-off-by: Han Zhou <hzhou@ovn.org>
>
>
> Hi Han,
>
> If I understand correctly, if the option is configured, ovn-controller
will sleep before it updates
> the OVN Southbound database. Can you please explain a bit on how it helps
in latency measurement.
>

Thanks Numan.

In my testing with 3K HVs, when there is a change in SB DB, all the 3K HVs
will be notified by SB DB server, process the change, and update its nb_cfg
back to SB DB.
Because the OVSDB server is single threaded, there can be HVs start
updating nb_cfg back to SB before all HVs are notified of the SB change,
and once the updating starts, the server would become more busy which could
further delay the notification sending. At the same time, although nb_cfg
update wouldn't get flooded when private_chassis is introduced, with RAFT
cluster, the _Server database monitor still triggers the flooding of raft
index change for every nb_cfg update from every HV to every other HVs,
which can interfere the HVs for processing the actual change. To avoid
these problems, this patch provides the capability of delaying nb_cfg
updates so that in scale tests these noises will be isolated.

> Also I don't think this option will be used during live deployments. So
instead of adding this as configuration
> in the ovs db, I would suggest to add an ovn-appctl command -
debug/set-delay-nb-cfg-report (or  a more
> appropriate name) so that a control plane latency test suite can set this
option -
>    ovn-appctl -t ovn-controller  debug/set-delay-nb-cfg-report 3 (for
example).
>

Agree. I will send a v2.

> Thanks
> Numan
>
>
>
>>> >> ---
>>> >>  controller/ovn-controller.8.xml | 12 ++++++++++++
>>> >>  controller/ovn-controller.c     |  9 +++++++++
>>> >>  2 files changed, 21 insertions(+)
>>> >>
>>> >> diff --git a/controller/ovn-controller.8.xml
>>> b/controller/ovn-controller.8.xml
>>> >> index 6687731..4cb3f41 100644
>>> >> --- a/controller/ovn-controller.8.xml
>>> >> +++ b/controller/ovn-controller.8.xml
>>> >> @@ -233,6 +233,18 @@
>>> >>          The boolean flag indicates if the chassis is used as an
>>> >>          interconnection gateway.
>>> >>        </dd>
>>> >> +
>>> >> +      <dt><code>external_ids:ovn-delay-nb-cfg-report</code></dt>
>>> >> +      <dd>
>>> >> +        An integer that represents the time (in seconds) to sleep
before
>>> >> +        <code>ovn-controller</code> updating <code>nb_cfg</code>
column
>>> >> +        of <code>Chassis_Private</code> table after it completes
>>> processing
>>> >> +        the corresponding changes. By default it is 0 (meaning do
not
>>> sleep).
>>> >> +        This value can be configured when doing control plane
latency
>>> testing,
>>> >> +        to make sure the updates of <code>nb_cfg</code> to the
>>> Southbound DB
>>> >> +        from a large number of hypervisors do not interfere with the
>>> original
>>> >> +        Southbound DB data processing.
>>> >> +      </dd>
>>> >>      </dl>
>>> >>
>>> >>      <p>
>>> >> diff --git a/controller/ovn-controller.c
b/controller/ovn-controller.c
>>> >> index 89f10a6..5ab5a13 100644
>>> >> --- a/controller/ovn-controller.c
>>> >> +++ b/controller/ovn-controller.c
>>> >> @@ -2568,6 +2568,15 @@ main(int argc, char *argv[])
>>> >>
 sbrec_chassis_private_set_nb_cfg(chassis_private,
>>> cur_cfg);
>>> >>                      sbrec_chassis_private_set_nb_cfg_timestamp(
>>> >>                          chassis_private, time_wall_msec());
>>> >> +                    const struct ovsrec_open_vswitch *cfg =
>>> >> +                        ovsrec_open_vswitch_first(ovs_idl_loop.idl);
>>> >> +                    if (cfg) {
>>> >> +                        int delay_cfg_report =
>>> smap_get_int(&cfg->external_ids,
>>> >> +
>>> "ovn-delay-nb-cfg-report", 0);
>>> >> +                        if (delay_cfg_report) {
>>> >> +                            xsleep(delay_cfg_report);
>>> >> +                        }
>>> >> +                    }
>>> >
>>> >
>>> > Hi Han,
>>> >
>>> > Maybe we can also sync this option from the ovs config to chassis
>>> other_config in south db ?
>>> >
>>> > What do you think ?
>>> >
>>>
>>> Hi Numan,
>>>
>>> I am not sure what's the purpose of syncing this to SB? It is used only
by
>>> ovn-controller, something similar to ovn-remote-probe-interval, etc.
>>
>>
>> We do copy some of the configs from ovs db to chassis other_config. But
I agree this config doesnt need to go to sb and probably CMS wouldn't care
much know about it.
>>
>> Thanks
>> Numan
>>
>>>
>>> Thanks,
>>> Han
>>>
>>> > Thanks
>>> > Numan
>>> >
>>> >>
>>> >>                  }
>>> >>              }
>>> >>
>>> >> --
>>> >> 2.1.0
>>> >>
>>> >> _______________________________________________
>>> >> dev mailing list
>>> >> dev@openvswitch.org
>>> >> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>>> >>
>>> _______________________________________________
>>> dev mailing list
>>> dev@openvswitch.org
>>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>>>
Han Zhou Sept. 8, 2020, 8:12 p.m. UTC | #6
On Mon, Sep 7, 2020 at 6:22 PM Han Zhou <hzhou@ovn.org> wrote:
>
>
>
> On Mon, Sep 7, 2020 at 11:02 AM Numan Siddique <numans@ovn.org> wrote:
> >
> >
> >
> > On Sat, Sep 5, 2020 at 12:46 AM Numan Siddique <numans@ovn.org> wrote:
> >>
> >>
> >>
> >> On Sat, Sep 5, 2020, 12:04 AM Han Zhou <hzhou@ovn.org> wrote:
> >>>
> >>> On Fri, Sep 4, 2020 at 11:18 AM Numan Siddique <numans@ovn.org> wrote:
> >>> >
> >>> >
> >>> >
> >>> > On Wed, Sep 2, 2020 at 4:57 AM Han Zhou <hzhou@ovn.org> wrote:
> >>> >>
> >>> >> This option is added to avoid the flooding of nb_cfg updates from
> >>> >> a large nubmer of hypervisors interfere with the original SB DB
> >>> >> data distribution and handling, so that the e2e control plane
latency
> >>> >> can be measured more accurately during scale testing.
> >>> >>
> >>> >> Signed-off-by: Han Zhou <hzhou@ovn.org>
> >
> >
> > Hi Han,
> >
> > If I understand correctly, if the option is configured, ovn-controller
will sleep before it updates
> > the OVN Southbound database. Can you please explain a bit on how it
helps in latency measurement.
> >
>
> Thanks Numan.
>
> In my testing with 3K HVs, when there is a change in SB DB, all the 3K
HVs will be notified by SB DB server, process the change, and update its
nb_cfg back to SB DB.
> Because the OVSDB server is single threaded, there can be HVs start
updating nb_cfg back to SB before all HVs are notified of the SB change,
and once the updating starts, the server would become more busy which could
further delay the notification sending. At the same time, although nb_cfg
update wouldn't get flooded when private_chassis is introduced, with RAFT
cluster, the _Server database monitor still triggers the flooding of raft
index change for every nb_cfg update from every HV to every other HVs,
which can interfere the HVs for processing the actual change. To avoid
these problems, this patch provides the capability of delaying nb_cfg
updates so that in scale tests these noises will be isolated.
>
> > Also I don't think this option will be used during live deployments. So
instead of adding this as configuration
> > in the ovs db, I would suggest to add an ovn-appctl command -
debug/set-delay-nb-cfg-report (or  a more
> > appropriate name) so that a control plane latency test suite can set
this option -
> >    ovn-appctl -t ovn-controller  debug/set-delay-nb-cfg-report 3 (for
example).
> >
>
> Agree. I will send a v2.

Hi Numan, I sent a new patch for this:
https://patchwork.ozlabs.org/project/ovn/patch/1599595835-124391-1-git-send-email-hzhou@ovn.org/

Please take a look.

Thanks,
Han

>
> > Thanks
> > Numan
> >
> >
> >
> >>> >> ---
> >>> >>  controller/ovn-controller.8.xml | 12 ++++++++++++
> >>> >>  controller/ovn-controller.c     |  9 +++++++++
> >>> >>  2 files changed, 21 insertions(+)
> >>> >>
> >>> >> diff --git a/controller/ovn-controller.8.xml
> >>> b/controller/ovn-controller.8.xml
> >>> >> index 6687731..4cb3f41 100644
> >>> >> --- a/controller/ovn-controller.8.xml
> >>> >> +++ b/controller/ovn-controller.8.xml
> >>> >> @@ -233,6 +233,18 @@
> >>> >>          The boolean flag indicates if the chassis is used as an
> >>> >>          interconnection gateway.
> >>> >>        </dd>
> >>> >> +
> >>> >> +      <dt><code>external_ids:ovn-delay-nb-cfg-report</code></dt>
> >>> >> +      <dd>
> >>> >> +        An integer that represents the time (in seconds) to sleep
before
> >>> >> +        <code>ovn-controller</code> updating <code>nb_cfg</code>
column
> >>> >> +        of <code>Chassis_Private</code> table after it completes
> >>> processing
> >>> >> +        the corresponding changes. By default it is 0 (meaning do
not
> >>> sleep).
> >>> >> +        This value can be configured when doing control plane
latency
> >>> testing,
> >>> >> +        to make sure the updates of <code>nb_cfg</code> to the
> >>> Southbound DB
> >>> >> +        from a large number of hypervisors do not interfere with
the
> >>> original
> >>> >> +        Southbound DB data processing.
> >>> >> +      </dd>
> >>> >>      </dl>
> >>> >>
> >>> >>      <p>
> >>> >> diff --git a/controller/ovn-controller.c
b/controller/ovn-controller.c
> >>> >> index 89f10a6..5ab5a13 100644
> >>> >> --- a/controller/ovn-controller.c
> >>> >> +++ b/controller/ovn-controller.c
> >>> >> @@ -2568,6 +2568,15 @@ main(int argc, char *argv[])
> >>> >>
 sbrec_chassis_private_set_nb_cfg(chassis_private,
> >>> cur_cfg);
> >>> >>                      sbrec_chassis_private_set_nb_cfg_timestamp(
> >>> >>                          chassis_private, time_wall_msec());
> >>> >> +                    const struct ovsrec_open_vswitch *cfg =
> >>> >> +
 ovsrec_open_vswitch_first(ovs_idl_loop.idl);
> >>> >> +                    if (cfg) {
> >>> >> +                        int delay_cfg_report =
> >>> smap_get_int(&cfg->external_ids,
> >>> >> +
> >>> "ovn-delay-nb-cfg-report", 0);
> >>> >> +                        if (delay_cfg_report) {
> >>> >> +                            xsleep(delay_cfg_report);
> >>> >> +                        }
> >>> >> +                    }
> >>> >
> >>> >
> >>> > Hi Han,
> >>> >
> >>> > Maybe we can also sync this option from the ovs config to chassis
> >>> other_config in south db ?
> >>> >
> >>> > What do you think ?
> >>> >
> >>>
> >>> Hi Numan,
> >>>
> >>> I am not sure what's the purpose of syncing this to SB? It is used
only by
> >>> ovn-controller, something similar to ovn-remote-probe-interval, etc.
> >>
> >>
> >> We do copy some of the configs from ovs db to chassis other_config.
But I agree this config doesnt need to go to sb and probably CMS wouldn't
care much know about it.
> >>
> >> Thanks
> >> Numan
> >>
> >>>
> >>> Thanks,
> >>> Han
> >>>
> >>> > Thanks
> >>> > Numan
> >>> >
> >>> >>
> >>> >>                  }
> >>> >>              }
> >>> >>
> >>> >> --
> >>> >> 2.1.0
> >>> >>
> >>> >> _______________________________________________
> >>> >> dev mailing list
> >>> >> dev@openvswitch.org
> >>> >> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
> >>> >>
> >>> _______________________________________________
> >>> dev mailing list
> >>> dev@openvswitch.org
> >>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
> >>>
diff mbox series

Patch

diff --git a/controller/ovn-controller.8.xml b/controller/ovn-controller.8.xml
index 6687731..4cb3f41 100644
--- a/controller/ovn-controller.8.xml
+++ b/controller/ovn-controller.8.xml
@@ -233,6 +233,18 @@ 
         The boolean flag indicates if the chassis is used as an
         interconnection gateway.
       </dd>
+
+      <dt><code>external_ids:ovn-delay-nb-cfg-report</code></dt>
+      <dd>
+        An integer that represents the time (in seconds) to sleep before
+        <code>ovn-controller</code> updating <code>nb_cfg</code> column
+        of <code>Chassis_Private</code> table after it completes processing
+        the corresponding changes. By default it is 0 (meaning do not sleep).
+        This value can be configured when doing control plane latency testing,
+        to make sure the updates of <code>nb_cfg</code> to the Southbound DB
+        from a large number of hypervisors do not interfere with the original
+        Southbound DB data processing.
+      </dd>
     </dl>
 
     <p>
diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c
index 89f10a6..5ab5a13 100644
--- a/controller/ovn-controller.c
+++ b/controller/ovn-controller.c
@@ -2568,6 +2568,15 @@  main(int argc, char *argv[])
                     sbrec_chassis_private_set_nb_cfg(chassis_private, cur_cfg);
                     sbrec_chassis_private_set_nb_cfg_timestamp(
                         chassis_private, time_wall_msec());
+                    const struct ovsrec_open_vswitch *cfg =
+                        ovsrec_open_vswitch_first(ovs_idl_loop.idl);
+                    if (cfg) {
+                        int delay_cfg_report = smap_get_int(&cfg->external_ids,
+                                               "ovn-delay-nb-cfg-report", 0);
+                        if (delay_cfg_report) {
+                            xsleep(delay_cfg_report);
+                        }
+                    }
                 }
             }