diff mbox series

[ovs-dev] northd: Change the default value of ignore_lsp_down to true.

Message ID 20211002071139.4068061-1-hzhou@ovn.org
State Accepted
Headers show
Series [ovs-dev] northd: Change the default value of ignore_lsp_down to true. | expand

Checks

Context Check Description
ovsrobot/apply-robot success apply and check: success
ovsrobot/github-robot-_Build_and_Test success github build: passed
ovsrobot/github-robot-_ovn-kubernetes success github build: passed

Commit Message

Han Zhou Oct. 2, 2021, 7:11 a.m. UTC
The current default behavior is that ARP responder flows for VIFs are
added by northd after the port-binding state is UP, which creates more
trouble than benefit in most use cases. To make the default behavior
desirable for majority of the use cases, set the option ignore_lsp_down
to true by default. This would help saving the control plane cost in
large scale environment, reduce the e2e latency for all flows to be
installed for a VIF, and making the VIF readiness checking more convenient
in test cases and likely in CMS as well. User can still set it to false
in circumstances (if any) when this behavior is not desired.

Signed-off-by: Han Zhou <hzhou@ovn.org>
---
 NEWS                 | 4 ++++
 northd/northd.c      | 2 +-
 northd/ovn_northd.dl | 2 +-
 ovn-nb.xml           | 2 +-
 tests/ovn-northd.at  | 3 ++-
 5 files changed, 9 insertions(+), 4 deletions(-)

Comments

Mark Gray Oct. 4, 2021, 10:10 a.m. UTC | #1
On 02/10/2021 08:11, Han Zhou wrote:
> The current default behavior is that ARP responder flows for VIFs are
> added by northd after the port-binding state is UP, which creates more
> trouble than benefit in most use cases. To make the default behavior
> desirable for majority of the use cases, set the option ignore_lsp_down
> to true by default. This would help saving the control plane cost in
> large scale environment, reduce the e2e latency for all flows to be
> installed for a VIF, and making the VIF readiness checking more convenient
> in test cases and likely in CMS as well. User can still set it to false
> in circumstances (if any) when this behavior is not desired.
> 
> Signed-off-by: Han Zhou <hzhou@ovn.org>

Thanks Han. This seems to be good to me but I think someone else who is
more familiar with some of the original problems should give it a look
over in case I am missing something.

Acked-by: Mark D. Gray <mark.d.gray@redhat.com>

> ---
>  NEWS                 | 4 ++++
>  northd/northd.c      | 2 +-
>  northd/ovn_northd.dl | 2 +-
>  ovn-nb.xml           | 2 +-
>  tests/ovn-northd.at  | 3 ++-
>  5 files changed, 9 insertions(+), 4 deletions(-)
> 
> diff --git a/NEWS b/NEWS
> index 8a21c029e..348f3f548 100644
> --- a/NEWS
> +++ b/NEWS
> @@ -12,6 +12,10 @@ OVN v21.09.0 - xx xxx xxxx
>    - Allow static routes without nexthops.
>    - Enabled logical dp groups as a default.  CMS should disable it if not
>      desired.
> +  - Set ignore_lsp_down to true as default, so that ARP responder flows are
> +    installed together with other flows when a logical switch port is created,
> +    without having to wait for the port to be UP.  CMS should set it to false
> +    if not desired.
>  
>  OVN v21.06.0 - 18 Jun 2021
>  -------------------------
> diff --git a/northd/northd.c b/northd/northd.c
> index cf2467fe1..5ffd6b8eb 100644
> --- a/northd/northd.c
> +++ b/northd/northd.c
> @@ -14304,7 +14304,7 @@ ovnnb_db_run(struct northd_context *ctx,
>      controller_event_en = smap_get_bool(&nb->options,
>                                          "controller_event", false);
>      check_lsp_is_up = !smap_get_bool(&nb->options,
> -                                     "ignore_lsp_down", false);
> +                                     "ignore_lsp_down", true);
>  
>      build_datapaths(ctx, datapaths, lr_list);
>      build_ovn_lbs(ctx, datapaths, &lbs);
> diff --git a/northd/ovn_northd.dl b/northd/ovn_northd.dl
> index 22913f05a..7154fed13 100644
> --- a/northd/ovn_northd.dl
> +++ b/northd/ovn_northd.dl
> @@ -753,7 +753,7 @@ Northd_Probe_Interval[interval] :-
>  relation CheckLspIsUp[bool]
>  CheckLspIsUp[check_lsp_is_up] :-
>      nb in nb::NB_Global(),
> -    var check_lsp_is_up = not nb.options.get_bool_def(i"ignore_lsp_down", false).
> +    var check_lsp_is_up = not nb.options.get_bool_def(i"ignore_lsp_down", true).
>  CheckLspIsUp[true] :-
>      Unit(),
>      not nb in nb::NB_Global().
> diff --git a/ovn-nb.xml b/ovn-nb.xml
> index 390cc5a44..d8266ed4d 100644
> --- a/ovn-nb.xml
> +++ b/ovn-nb.xml
> @@ -236,7 +236,7 @@
>            resolved even before the relevant VM/container is running. For
>            environments where this is not an issue, setting it to
>            <code>true</code> can reduce the load and latency of the control
> -          plane. The default value is <code>false</code>.
> +          plane. The default value is <code>true</code>.
>          </p>
>        </column>
>  
> diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
> index c5400d806..3eebb55b6 100644
> --- a/tests/ovn-northd.at
> +++ b/tests/ovn-northd.at
> @@ -1918,6 +1918,7 @@ OVN_FOR_EACH_NORTHD([
>  AT_SETUP([ignore_lsp_down])
>  ovn_start
>  
> +ovn-nbctl set NB_Global . options:ignore_lsp_down=false
>  ovn-nbctl ls-add sw0
>  ovn-nbctl lsp-add sw0 sw0-p1 -- lsp-set-addresses sw0-p1 "aa:aa:aa:aa:aa:aa 10.0.0.1"
>  
> @@ -4922,7 +4923,7 @@ check ovn-nbctl lsp-add sw0 lsp0 \
>  
>  check ovn-nbctl --wait=sb sync
>  AT_CHECK([ovn-sbctl --columns=tags list logical_flow | grep lsp0 -c], [0], [dnl
> -9
> +10
>  ])
>  
>  AT_CLEANUP
>
Han Zhou Oct. 4, 2021, 5:36 p.m. UTC | #2
On Mon, Oct 4, 2021 at 3:10 AM Mark Gray <mark.d.gray@redhat.com> wrote:
>
> On 02/10/2021 08:11, Han Zhou wrote:
> > The current default behavior is that ARP responder flows for VIFs are
> > added by northd after the port-binding state is UP, which creates more
> > trouble than benefit in most use cases. To make the default behavior
> > desirable for majority of the use cases, set the option ignore_lsp_down
> > to true by default. This would help saving the control plane cost in
> > large scale environment, reduce the e2e latency for all flows to be
> > installed for a VIF, and making the VIF readiness checking more
convenient
> > in test cases and likely in CMS as well. User can still set it to false
> > in circumstances (if any) when this behavior is not desired.
> >
> > Signed-off-by: Han Zhou <hzhou@ovn.org>
>
> Thanks Han. This seems to be good to me but I think someone else who is
> more familiar with some of the original problems should give it a look
> over in case I am missing something.
>
> Acked-by: Mark D. Gray <mark.d.gray@redhat.com>

Thanks Mark. I couldn't find the source of the original problem that
required checking LSP state before adding the ARP responder flows. But let
me add Numan who was the author for this lsp status check back in 2015, and
could have better judgement regarding this default behavior change.
(b891c4c6a ovn-northd: Only add ARP reply flows for logical ports that are
up.)

Thanks,
Han
>
> > ---
> >  NEWS                 | 4 ++++
> >  northd/northd.c      | 2 +-
> >  northd/ovn_northd.dl | 2 +-
> >  ovn-nb.xml           | 2 +-
> >  tests/ovn-northd.at  | 3 ++-
> >  5 files changed, 9 insertions(+), 4 deletions(-)
> >
> > diff --git a/NEWS b/NEWS
> > index 8a21c029e..348f3f548 100644
> > --- a/NEWS
> > +++ b/NEWS
> > @@ -12,6 +12,10 @@ OVN v21.09.0 - xx xxx xxxx
> >    - Allow static routes without nexthops.
> >    - Enabled logical dp groups as a default.  CMS should disable it if
not
> >      desired.
> > +  - Set ignore_lsp_down to true as default, so that ARP responder
flows are
> > +    installed together with other flows when a logical switch port is
created,
> > +    without having to wait for the port to be UP.  CMS should set it
to false
> > +    if not desired.
> >
> >  OVN v21.06.0 - 18 Jun 2021
> >  -------------------------
> > diff --git a/northd/northd.c b/northd/northd.c
> > index cf2467fe1..5ffd6b8eb 100644
> > --- a/northd/northd.c
> > +++ b/northd/northd.c
> > @@ -14304,7 +14304,7 @@ ovnnb_db_run(struct northd_context *ctx,
> >      controller_event_en = smap_get_bool(&nb->options,
> >                                          "controller_event", false);
> >      check_lsp_is_up = !smap_get_bool(&nb->options,
> > -                                     "ignore_lsp_down", false);
> > +                                     "ignore_lsp_down", true);
> >
> >      build_datapaths(ctx, datapaths, lr_list);
> >      build_ovn_lbs(ctx, datapaths, &lbs);
> > diff --git a/northd/ovn_northd.dl b/northd/ovn_northd.dl
> > index 22913f05a..7154fed13 100644
> > --- a/northd/ovn_northd.dl
> > +++ b/northd/ovn_northd.dl
> > @@ -753,7 +753,7 @@ Northd_Probe_Interval[interval] :-
> >  relation CheckLspIsUp[bool]
> >  CheckLspIsUp[check_lsp_is_up] :-
> >      nb in nb::NB_Global(),
> > -    var check_lsp_is_up = not
nb.options.get_bool_def(i"ignore_lsp_down", false).
> > +    var check_lsp_is_up = not
nb.options.get_bool_def(i"ignore_lsp_down", true).
> >  CheckLspIsUp[true] :-
> >      Unit(),
> >      not nb in nb::NB_Global().
> > diff --git a/ovn-nb.xml b/ovn-nb.xml
> > index 390cc5a44..d8266ed4d 100644
> > --- a/ovn-nb.xml
> > +++ b/ovn-nb.xml
> > @@ -236,7 +236,7 @@
> >            resolved even before the relevant VM/container is running.
For
> >            environments where this is not an issue, setting it to
> >            <code>true</code> can reduce the load and latency of the
control
> > -          plane. The default value is <code>false</code>.
> > +          plane. The default value is <code>true</code>.
> >          </p>
> >        </column>
> >
> > diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
> > index c5400d806..3eebb55b6 100644
> > --- a/tests/ovn-northd.at
> > +++ b/tests/ovn-northd.at
> > @@ -1918,6 +1918,7 @@ OVN_FOR_EACH_NORTHD([
> >  AT_SETUP([ignore_lsp_down])
> >  ovn_start
> >
> > +ovn-nbctl set NB_Global . options:ignore_lsp_down=false
> >  ovn-nbctl ls-add sw0
> >  ovn-nbctl lsp-add sw0 sw0-p1 -- lsp-set-addresses sw0-p1
"aa:aa:aa:aa:aa:aa 10.0.0.1"
> >
> > @@ -4922,7 +4923,7 @@ check ovn-nbctl lsp-add sw0 lsp0 \
> >
> >  check ovn-nbctl --wait=sb sync
> >  AT_CHECK([ovn-sbctl --columns=tags list logical_flow | grep lsp0 -c],
[0], [dnl
> > -9
> > +10
> >  ])
> >
> >  AT_CLEANUP
> >
>
Numan Siddique Oct. 4, 2021, 6:30 p.m. UTC | #3
On Mon, Oct 4, 2021 at 1:37 PM Han Zhou <hzhou@ovn.org> wrote:
>
> On Mon, Oct 4, 2021 at 3:10 AM Mark Gray <mark.d.gray@redhat.com> wrote:
> >
> > On 02/10/2021 08:11, Han Zhou wrote:
> > > The current default behavior is that ARP responder flows for VIFs are
> > > added by northd after the port-binding state is UP, which creates more
> > > trouble than benefit in most use cases. To make the default behavior
> > > desirable for majority of the use cases, set the option ignore_lsp_down
> > > to true by default. This would help saving the control plane cost in
> > > large scale environment, reduce the e2e latency for all flows to be
> > > installed for a VIF, and making the VIF readiness checking more
> convenient
> > > in test cases and likely in CMS as well. User can still set it to false
> > > in circumstances (if any) when this behavior is not desired.
> > >
> > > Signed-off-by: Han Zhou <hzhou@ovn.org>
> >
> > Thanks Han. This seems to be good to me but I think someone else who is
> > more familiar with some of the original problems should give it a look
> > over in case I am missing something.
> >
> > Acked-by: Mark D. Gray <mark.d.gray@redhat.com>
>
> Thanks Mark. I couldn't find the source of the original problem that
> required checking LSP state before adding the ARP responder flows. But let
> me add Numan who was the author for this lsp status check back in 2015, and
> could have better judgement regarding this default behavior change.
> (b891c4c6a ovn-northd: Only add ARP reply flows for logical ports that are
> up.)
>

I think I submitted the patch because, when a logical port P1 sends
arp for an IP of lport P2,
then it would get the arp response even if P2 is down  which at that
time I thought is not
reflecting the real case scenario (with the physical deployments).

But I'm fine with this patch.

Acked-by: Numan Siddique <numans@ovn.org>

Thanks
Numan


> Han
> >
> > > ---
> > >  NEWS                 | 4 ++++
> > >  northd/northd.c      | 2 +-
> > >  northd/ovn_northd.dl | 2 +-
> > >  ovn-nb.xml           | 2 +-
> > >  tests/ovn-northd.at  | 3 ++-
> > >  5 files changed, 9 insertions(+), 4 deletions(-)
> > >
> > > diff --git a/NEWS b/NEWS
> > > index 8a21c029e..348f3f548 100644
> > > --- a/NEWS
> > > +++ b/NEWS
> > > @@ -12,6 +12,10 @@ OVN v21.09.0 - xx xxx xxxx
> > >    - Allow static routes without nexthops.
> > >    - Enabled logical dp groups as a default.  CMS should disable it if
> not
> > >      desired.
> > > +  - Set ignore_lsp_down to true as default, so that ARP responder
> flows are
> > > +    installed together with other flows when a logical switch port is
> created,
> > > +    without having to wait for the port to be UP.  CMS should set it
> to false
> > > +    if not desired.
> > >
> > >  OVN v21.06.0 - 18 Jun 2021
> > >  -------------------------
> > > diff --git a/northd/northd.c b/northd/northd.c
> > > index cf2467fe1..5ffd6b8eb 100644
> > > --- a/northd/northd.c
> > > +++ b/northd/northd.c
> > > @@ -14304,7 +14304,7 @@ ovnnb_db_run(struct northd_context *ctx,
> > >      controller_event_en = smap_get_bool(&nb->options,
> > >                                          "controller_event", false);
> > >      check_lsp_is_up = !smap_get_bool(&nb->options,
> > > -                                     "ignore_lsp_down", false);
> > > +                                     "ignore_lsp_down", true);
> > >
> > >      build_datapaths(ctx, datapaths, lr_list);
> > >      build_ovn_lbs(ctx, datapaths, &lbs);
> > > diff --git a/northd/ovn_northd.dl b/northd/ovn_northd.dl
> > > index 22913f05a..7154fed13 100644
> > > --- a/northd/ovn_northd.dl
> > > +++ b/northd/ovn_northd.dl
> > > @@ -753,7 +753,7 @@ Northd_Probe_Interval[interval] :-
> > >  relation CheckLspIsUp[bool]
> > >  CheckLspIsUp[check_lsp_is_up] :-
> > >      nb in nb::NB_Global(),
> > > -    var check_lsp_is_up = not
> nb.options.get_bool_def(i"ignore_lsp_down", false).
> > > +    var check_lsp_is_up = not
> nb.options.get_bool_def(i"ignore_lsp_down", true).
> > >  CheckLspIsUp[true] :-
> > >      Unit(),
> > >      not nb in nb::NB_Global().
> > > diff --git a/ovn-nb.xml b/ovn-nb.xml
> > > index 390cc5a44..d8266ed4d 100644
> > > --- a/ovn-nb.xml
> > > +++ b/ovn-nb.xml
> > > @@ -236,7 +236,7 @@
> > >            resolved even before the relevant VM/container is running.
> For
> > >            environments where this is not an issue, setting it to
> > >            <code>true</code> can reduce the load and latency of the
> control
> > > -          plane. The default value is <code>false</code>.
> > > +          plane. The default value is <code>true</code>.
> > >          </p>
> > >        </column>
> > >
> > > diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
> > > index c5400d806..3eebb55b6 100644
> > > --- a/tests/ovn-northd.at
> > > +++ b/tests/ovn-northd.at
> > > @@ -1918,6 +1918,7 @@ OVN_FOR_EACH_NORTHD([
> > >  AT_SETUP([ignore_lsp_down])
> > >  ovn_start
> > >
> > > +ovn-nbctl set NB_Global . options:ignore_lsp_down=false
> > >  ovn-nbctl ls-add sw0
> > >  ovn-nbctl lsp-add sw0 sw0-p1 -- lsp-set-addresses sw0-p1
> "aa:aa:aa:aa:aa:aa 10.0.0.1"
> > >
> > > @@ -4922,7 +4923,7 @@ check ovn-nbctl lsp-add sw0 lsp0 \
> > >
> > >  check ovn-nbctl --wait=sb sync
> > >  AT_CHECK([ovn-sbctl --columns=tags list logical_flow | grep lsp0 -c],
> [0], [dnl
> > > -9
> > > +10
> > >  ])
> > >
> > >  AT_CLEANUP
> > >
> >
> _______________________________________________
> dev mailing list
> dev@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>
Han Zhou Oct. 4, 2021, 8:10 p.m. UTC | #4
On Mon, Oct 4, 2021 at 11:30 AM Numan Siddique <numans@ovn.org> wrote:
>
> On Mon, Oct 4, 2021 at 1:37 PM Han Zhou <hzhou@ovn.org> wrote:
> >
> > On Mon, Oct 4, 2021 at 3:10 AM Mark Gray <mark.d.gray@redhat.com> wrote:
> > >
> > > On 02/10/2021 08:11, Han Zhou wrote:
> > > > The current default behavior is that ARP responder flows for VIFs
are
> > > > added by northd after the port-binding state is UP, which creates
more
> > > > trouble than benefit in most use cases. To make the default behavior
> > > > desirable for majority of the use cases, set the option
ignore_lsp_down
> > > > to true by default. This would help saving the control plane cost in
> > > > large scale environment, reduce the e2e latency for all flows to be
> > > > installed for a VIF, and making the VIF readiness checking more
> > convenient
> > > > in test cases and likely in CMS as well. User can still set it to
false
> > > > in circumstances (if any) when this behavior is not desired.
> > > >
> > > > Signed-off-by: Han Zhou <hzhou@ovn.org>
> > >
> > > Thanks Han. This seems to be good to me but I think someone else who
is
> > > more familiar with some of the original problems should give it a look
> > > over in case I am missing something.
> > >
> > > Acked-by: Mark D. Gray <mark.d.gray@redhat.com>
> >
> > Thanks Mark. I couldn't find the source of the original problem that
> > required checking LSP state before adding the ARP responder flows. But
let
> > me add Numan who was the author for this lsp status check back in 2015,
and
> > could have better judgement regarding this default behavior change.
> > (b891c4c6a ovn-northd: Only add ARP reply flows for logical ports that
are
> > up.)
> >
>
> I think I submitted the patch because, when a logical port P1 sends
> arp for an IP of lport P2,
> then it would get the arp response even if P2 is down  which at that
> time I thought is not
> reflecting the real case scenario (with the physical deployments).
>
> But I'm fine with this patch.
>
> Acked-by: Numan Siddique <numans@ovn.org>

Thanks Numan and Mark. I noticed that I put the news update in the wrong
section. So I just did the minor change and applied the patch.

However, I also noticed that in the NEWS file the release date of 21.09 is
not updated:
OVN v21.09.0 - xx xxx xxxx

I didn't correct that because it is better to be a separate patch. cc @Mark
Michelson <mmichels@redhat.com>

>
> Thanks
> Numan
>
>
> > Han
> > >
> > > > ---
> > > >  NEWS                 | 4 ++++
> > > >  northd/northd.c      | 2 +-
> > > >  northd/ovn_northd.dl | 2 +-
> > > >  ovn-nb.xml           | 2 +-
> > > >  tests/ovn-northd.at  | 3 ++-
> > > >  5 files changed, 9 insertions(+), 4 deletions(-)
> > > >
> > > > diff --git a/NEWS b/NEWS
> > > > index 8a21c029e..348f3f548 100644
> > > > --- a/NEWS
> > > > +++ b/NEWS
> > > > @@ -12,6 +12,10 @@ OVN v21.09.0 - xx xxx xxxx
> > > >    - Allow static routes without nexthops.
> > > >    - Enabled logical dp groups as a default.  CMS should disable it
if
> > not
> > > >      desired.
> > > > +  - Set ignore_lsp_down to true as default, so that ARP responder
> > flows are
> > > > +    installed together with other flows when a logical switch port
is
> > created,
> > > > +    without having to wait for the port to be UP.  CMS should set
it
> > to false
> > > > +    if not desired.
> > > >
> > > >  OVN v21.06.0 - 18 Jun 2021
> > > >  -------------------------
> > > > diff --git a/northd/northd.c b/northd/northd.c
> > > > index cf2467fe1..5ffd6b8eb 100644
> > > > --- a/northd/northd.c
> > > > +++ b/northd/northd.c
> > > > @@ -14304,7 +14304,7 @@ ovnnb_db_run(struct northd_context *ctx,
> > > >      controller_event_en = smap_get_bool(&nb->options,
> > > >                                          "controller_event", false);
> > > >      check_lsp_is_up = !smap_get_bool(&nb->options,
> > > > -                                     "ignore_lsp_down", false);
> > > > +                                     "ignore_lsp_down", true);
> > > >
> > > >      build_datapaths(ctx, datapaths, lr_list);
> > > >      build_ovn_lbs(ctx, datapaths, &lbs);
> > > > diff --git a/northd/ovn_northd.dl b/northd/ovn_northd.dl
> > > > index 22913f05a..7154fed13 100644
> > > > --- a/northd/ovn_northd.dl
> > > > +++ b/northd/ovn_northd.dl
> > > > @@ -753,7 +753,7 @@ Northd_Probe_Interval[interval] :-
> > > >  relation CheckLspIsUp[bool]
> > > >  CheckLspIsUp[check_lsp_is_up] :-
> > > >      nb in nb::NB_Global(),
> > > > -    var check_lsp_is_up = not
> > nb.options.get_bool_def(i"ignore_lsp_down", false).
> > > > +    var check_lsp_is_up = not
> > nb.options.get_bool_def(i"ignore_lsp_down", true).
> > > >  CheckLspIsUp[true] :-
> > > >      Unit(),
> > > >      not nb in nb::NB_Global().
> > > > diff --git a/ovn-nb.xml b/ovn-nb.xml
> > > > index 390cc5a44..d8266ed4d 100644
> > > > --- a/ovn-nb.xml
> > > > +++ b/ovn-nb.xml
> > > > @@ -236,7 +236,7 @@
> > > >            resolved even before the relevant VM/container is
running.
> > For
> > > >            environments where this is not an issue, setting it to
> > > >            <code>true</code> can reduce the load and latency of the
> > control
> > > > -          plane. The default value is <code>false</code>.
> > > > +          plane. The default value is <code>true</code>.
> > > >          </p>
> > > >        </column>
> > > >
> > > > diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
> > > > index c5400d806..3eebb55b6 100644
> > > > --- a/tests/ovn-northd.at
> > > > +++ b/tests/ovn-northd.at
> > > > @@ -1918,6 +1918,7 @@ OVN_FOR_EACH_NORTHD([
> > > >  AT_SETUP([ignore_lsp_down])
> > > >  ovn_start
> > > >
> > > > +ovn-nbctl set NB_Global . options:ignore_lsp_down=false
> > > >  ovn-nbctl ls-add sw0
> > > >  ovn-nbctl lsp-add sw0 sw0-p1 -- lsp-set-addresses sw0-p1
> > "aa:aa:aa:aa:aa:aa 10.0.0.1"
> > > >
> > > > @@ -4922,7 +4923,7 @@ check ovn-nbctl lsp-add sw0 lsp0 \
> > > >
> > > >  check ovn-nbctl --wait=sb sync
> > > >  AT_CHECK([ovn-sbctl --columns=tags list logical_flow | grep lsp0
-c],
> > [0], [dnl
> > > > -9
> > > > +10
> > > >  ])
> > > >
> > > >  AT_CLEANUP
> > > >
> > >
> > _______________________________________________
> > dev mailing list
> > dev@openvswitch.org
> > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
> >
Mark Michelson Oct. 4, 2021, 8:56 p.m. UTC | #5
On 10/4/21 4:10 PM, Han Zhou wrote:
> 
> 
> 
> On Mon, Oct 4, 2021 at 11:30 AM Numan Siddique <numans@ovn.org 
> <mailto:numans@ovn.org>> wrote:
>  >
>  > On Mon, Oct 4, 2021 at 1:37 PM Han Zhou <hzhou@ovn.org 
> <mailto:hzhou@ovn.org>> wrote:
>  > >
>  > > On Mon, Oct 4, 2021 at 3:10 AM Mark Gray <mark.d.gray@redhat.com 
> <mailto:mark.d.gray@redhat.com>> wrote:
>  > > >
>  > > > On 02/10/2021 08:11, Han Zhou wrote:
>  > > > > The current default behavior is that ARP responder flows for 
> VIFs are
>  > > > > added by northd after the port-binding state is UP, which 
> creates more
>  > > > > trouble than benefit in most use cases. To make the default 
> behavior
>  > > > > desirable for majority of the use cases, set the option 
> ignore_lsp_down
>  > > > > to true by default. This would help saving the control plane 
> cost in
>  > > > > large scale environment, reduce the e2e latency for all flows to be
>  > > > > installed for a VIF, and making the VIF readiness checking more
>  > > convenient
>  > > > > in test cases and likely in CMS as well. User can still set it 
> to false
>  > > > > in circumstances (if any) when this behavior is not desired.
>  > > > >
>  > > > > Signed-off-by: Han Zhou <hzhou@ovn.org <mailto:hzhou@ovn.org>>
>  > > >
>  > > > Thanks Han. This seems to be good to me but I think someone else 
> who is
>  > > > more familiar with some of the original problems should give it a 
> look
>  > > > over in case I am missing something.
>  > > >
>  > > > Acked-by: Mark D. Gray <mark.d.gray@redhat.com 
> <mailto:mark.d.gray@redhat.com>>
>  > >
>  > > Thanks Mark. I couldn't find the source of the original problem that
>  > > required checking LSP state before adding the ARP responder flows. 
> But let
>  > > me add Numan who was the author for this lsp status check back in 
> 2015, and
>  > > could have better judgement regarding this default behavior change.
>  > > (b891c4c6a ovn-northd: Only add ARP reply flows for logical ports 
> that are
>  > > up.)
>  > >
>  >
>  > I think I submitted the patch because, when a logical port P1 sends
>  > arp for an IP of lport P2,
>  > then it would get the arp response even if P2 is down  which at that
>  > time I thought is not
>  > reflecting the real case scenario (with the physical deployments).
>  >
>  > But I'm fine with this patch.
>  >
>  > Acked-by: Numan Siddique <numans@ovn.org <mailto:numans@ovn.org>>
> 
> Thanks Numan and Mark. I noticed that I put the news update in the wrong 
> section. So I just did the minor change and applied the patch.
> 
> However, I also noticed that in the NEWS file the release date of 21.09 
> is not updated:
> OVN v21.09.0 - xx xxx xxxx
> 
> I didn't correct that because it is better to be a separate patch. cc 
> @Mark Michelson <mailto:mmichels@redhat.com>

One of these days I'll actually do this correctly :)

The release date is correct in branch-21.09, but I did not update the 
release date in the main branch. I'll submit a patch that corrects this. 
Thanks for bringing it to my attention.

> 
>  >
>  > Thanks
>  > Numan
>  >
>  >
>  > > Han
>  > > >
>  > > > > ---
>  > > > >  NEWS                 | 4 ++++
>  > > > >  northd/northd.c      | 2 +-
>  > > > >  northd/ovn_northd.dl | 2 +-
>  > > > >  ovn-nb.xml           | 2 +-
>  > > > >  tests/ovn-northd.at <http://ovn-northd.at>  | 3 ++-
>  > > > >  5 files changed, 9 insertions(+), 4 deletions(-)
>  > > > >
>  > > > > diff --git a/NEWS b/NEWS
>  > > > > index 8a21c029e..348f3f548 100644
>  > > > > --- a/NEWS
>  > > > > +++ b/NEWS
>  > > > > @@ -12,6 +12,10 @@ OVN v21.09.0 - xx xxx xxxx
>  > > > >    - Allow static routes without nexthops.
>  > > > >    - Enabled logical dp groups as a default.  CMS should 
> disable it if
>  > > not
>  > > > >      desired.
>  > > > > +  - Set ignore_lsp_down to true as default, so that ARP responder
>  > > flows are
>  > > > > +    installed together with other flows when a logical switch 
> port is
>  > > created,
>  > > > > +    without having to wait for the port to be UP.  CMS should 
> set it
>  > > to false
>  > > > > +    if not desired.
>  > > > >
>  > > > >  OVN v21.06.0 - 18 Jun 2021
>  > > > >  -------------------------
>  > > > > diff --git a/northd/northd.c b/northd/northd.c
>  > > > > index cf2467fe1..5ffd6b8eb 100644
>  > > > > --- a/northd/northd.c
>  > > > > +++ b/northd/northd.c
>  > > > > @@ -14304,7 +14304,7 @@ ovnnb_db_run(struct northd_context *ctx,
>  > > > >      controller_event_en = smap_get_bool(&nb->options,
>  > > > >                                          "controller_event", 
> false);
>  > > > >      check_lsp_is_up = !smap_get_bool(&nb->options,
>  > > > > -                                     "ignore_lsp_down", false);
>  > > > > +                                     "ignore_lsp_down", true);
>  > > > >
>  > > > >      build_datapaths(ctx, datapaths, lr_list);
>  > > > >      build_ovn_lbs(ctx, datapaths, &lbs);
>  > > > > diff --git a/northd/ovn_northd.dl b/northd/ovn_northd.dl
>  > > > > index 22913f05a..7154fed13 100644
>  > > > > --- a/northd/ovn_northd.dl
>  > > > > +++ b/northd/ovn_northd.dl
>  > > > > @@ -753,7 +753,7 @@ Northd_Probe_Interval[interval] :-
>  > > > >  relation CheckLspIsUp[bool]
>  > > > >  CheckLspIsUp[check_lsp_is_up] :-
>  > > > >      nb in nb::NB_Global(),
>  > > > > -    var check_lsp_is_up = not
>  > > nb.options.get_bool_def(i"ignore_lsp_down", false).
>  > > > > +    var check_lsp_is_up = not
>  > > nb.options.get_bool_def(i"ignore_lsp_down", true).
>  > > > >  CheckLspIsUp[true] :-
>  > > > >      Unit(),
>  > > > >      not nb in nb::NB_Global().
>  > > > > diff --git a/ovn-nb.xml b/ovn-nb.xml
>  > > > > index 390cc5a44..d8266ed4d 100644
>  > > > > --- a/ovn-nb.xml
>  > > > > +++ b/ovn-nb.xml
>  > > > > @@ -236,7 +236,7 @@
>  > > > >            resolved even before the relevant VM/container is 
> running.
>  > > For
>  > > > >            environments where this is not an issue, setting it to
>  > > > >            <code>true</code> can reduce the load and latency of the
>  > > control
>  > > > > -          plane. The default value is <code>false</code>.
>  > > > > +          plane. The default value is <code>true</code>.
>  > > > >          </p>
>  > > > >        </column>
>  > > > >
>  > > > > diff --git a/tests/ovn-northd.at <http://ovn-northd.at> 
> b/tests/ovn-northd.at <http://ovn-northd.at>
>  > > > > index c5400d806..3eebb55b6 100644
>  > > > > --- a/tests/ovn-northd.at <http://ovn-northd.at>
>  > > > > +++ b/tests/ovn-northd.at <http://ovn-northd.at>
>  > > > > @@ -1918,6 +1918,7 @@ OVN_FOR_EACH_NORTHD([
>  > > > >  AT_SETUP([ignore_lsp_down])
>  > > > >  ovn_start
>  > > > >
>  > > > > +ovn-nbctl set NB_Global . options:ignore_lsp_down=false
>  > > > >  ovn-nbctl ls-add sw0
>  > > > >  ovn-nbctl lsp-add sw0 sw0-p1 -- lsp-set-addresses sw0-p1
>  > > "aa:aa:aa:aa:aa:aa 10.0.0.1"
>  > > > >
>  > > > > @@ -4922,7 +4923,7 @@ check ovn-nbctl lsp-add sw0 lsp0 \
>  > > > >
>  > > > >  check ovn-nbctl --wait=sb sync
>  > > > >  AT_CHECK([ovn-sbctl --columns=tags list logical_flow | grep 
> lsp0 -c],
>  > > [0], [dnl
>  > > > > -9
>  > > > > +10
>  > > > >  ])
>  > > > >
>  > > > >  AT_CLEANUP
>  > > > >
>  > > >
>  > > _______________________________________________
>  > > dev mailing list
>  > > dev@openvswitch.org <mailto:dev@openvswitch.org>
>  > > https://mail.openvswitch.org/mailman/listinfo/ovs-dev 
> <https://mail.openvswitch.org/mailman/listinfo/ovs-dev>
>  > >
Numan Siddique Oct. 4, 2021, 10:34 p.m. UTC | #6
On Mon, Oct 4, 2021 at 4:56 PM Mark Michelson <mmichels@redhat.com> wrote:
>
> On 10/4/21 4:10 PM, Han Zhou wrote:
> >
> >
> >
> > On Mon, Oct 4, 2021 at 11:30 AM Numan Siddique <numans@ovn.org
> > <mailto:numans@ovn.org>> wrote:
> >  >
> >  > On Mon, Oct 4, 2021 at 1:37 PM Han Zhou <hzhou@ovn.org
> > <mailto:hzhou@ovn.org>> wrote:
> >  > >
> >  > > On Mon, Oct 4, 2021 at 3:10 AM Mark Gray <mark.d.gray@redhat.com
> > <mailto:mark.d.gray@redhat.com>> wrote:
> >  > > >
> >  > > > On 02/10/2021 08:11, Han Zhou wrote:
> >  > > > > The current default behavior is that ARP responder flows for
> > VIFs are
> >  > > > > added by northd after the port-binding state is UP, which
> > creates more
> >  > > > > trouble than benefit in most use cases. To make the default
> > behavior
> >  > > > > desirable for majority of the use cases, set the option
> > ignore_lsp_down
> >  > > > > to true by default. This would help saving the control plane
> > cost in
> >  > > > > large scale environment, reduce the e2e latency for all flows to be
> >  > > > > installed for a VIF, and making the VIF readiness checking more
> >  > > convenient
> >  > > > > in test cases and likely in CMS as well. User can still set it
> > to false
> >  > > > > in circumstances (if any) when this behavior is not desired.
> >  > > > >
> >  > > > > Signed-off-by: Han Zhou <hzhou@ovn.org <mailto:hzhou@ovn.org>>
> >  > > >
> >  > > > Thanks Han. This seems to be good to me but I think someone else
> > who is
> >  > > > more familiar with some of the original problems should give it a
> > look
> >  > > > over in case I am missing something.
> >  > > >
> >  > > > Acked-by: Mark D. Gray <mark.d.gray@redhat.com
> > <mailto:mark.d.gray@redhat.com>>
> >  > >
> >  > > Thanks Mark. I couldn't find the source of the original problem that
> >  > > required checking LSP state before adding the ARP responder flows.
> > But let
> >  > > me add Numan who was the author for this lsp status check back in
> > 2015, and
> >  > > could have better judgement regarding this default behavior change.
> >  > > (b891c4c6a ovn-northd: Only add ARP reply flows for logical ports
> > that are
> >  > > up.)
> >  > >
> >  >
> >  > I think I submitted the patch because, when a logical port P1 sends
> >  > arp for an IP of lport P2,
> >  > then it would get the arp response even if P2 is down  which at that
> >  > time I thought is not
> >  > reflecting the real case scenario (with the physical deployments).
> >  >
> >  > But I'm fine with this patch.
> >  >
> >  > Acked-by: Numan Siddique <numans@ovn.org <mailto:numans@ovn.org>>
> >
> > Thanks Numan and Mark. I noticed that I put the news update in the wrong
> > section. So I just did the minor change and applied the patch.
> >
> > However, I also noticed that in the NEWS file the release date of 21.09
> > is not updated:
> > OVN v21.09.0 - xx xxx xxxx
> >
> > I didn't correct that because it is better to be a separate patch. cc
> > @Mark Michelson <mailto:mmichels@redhat.com>
>
> One of these days I'll actually do this correctly :)
>
> The release date is correct in branch-21.09, but I did not update the
> release date in the main branch. I'll submit a patch that corrects this.
> Thanks for bringing it to my attention.

I think you just need to apply the patch 1 (this one -
https://github.com/ovn-org/ovn/commit/332946db2f579f4999ad9708cdc94ebcd78ff296)
to the main branch too.

Numan

>
> >
> >  >
> >  > Thanks
> >  > Numan
> >  >
> >  >
> >  > > Han
> >  > > >
> >  > > > > ---
> >  > > > >  NEWS                 | 4 ++++
> >  > > > >  northd/northd.c      | 2 +-
> >  > > > >  northd/ovn_northd.dl | 2 +-
> >  > > > >  ovn-nb.xml           | 2 +-
> >  > > > >  tests/ovn-northd.at <http://ovn-northd.at>  | 3 ++-
> >  > > > >  5 files changed, 9 insertions(+), 4 deletions(-)
> >  > > > >
> >  > > > > diff --git a/NEWS b/NEWS
> >  > > > > index 8a21c029e..348f3f548 100644
> >  > > > > --- a/NEWS
> >  > > > > +++ b/NEWS
> >  > > > > @@ -12,6 +12,10 @@ OVN v21.09.0 - xx xxx xxxx
> >  > > > >    - Allow static routes without nexthops.
> >  > > > >    - Enabled logical dp groups as a default.  CMS should
> > disable it if
> >  > > not
> >  > > > >      desired.
> >  > > > > +  - Set ignore_lsp_down to true as default, so that ARP responder
> >  > > flows are
> >  > > > > +    installed together with other flows when a logical switch
> > port is
> >  > > created,
> >  > > > > +    without having to wait for the port to be UP.  CMS should
> > set it
> >  > > to false
> >  > > > > +    if not desired.
> >  > > > >
> >  > > > >  OVN v21.06.0 - 18 Jun 2021
> >  > > > >  -------------------------
> >  > > > > diff --git a/northd/northd.c b/northd/northd.c
> >  > > > > index cf2467fe1..5ffd6b8eb 100644
> >  > > > > --- a/northd/northd.c
> >  > > > > +++ b/northd/northd.c
> >  > > > > @@ -14304,7 +14304,7 @@ ovnnb_db_run(struct northd_context *ctx,
> >  > > > >      controller_event_en = smap_get_bool(&nb->options,
> >  > > > >                                          "controller_event",
> > false);
> >  > > > >      check_lsp_is_up = !smap_get_bool(&nb->options,
> >  > > > > -                                     "ignore_lsp_down", false);
> >  > > > > +                                     "ignore_lsp_down", true);
> >  > > > >
> >  > > > >      build_datapaths(ctx, datapaths, lr_list);
> >  > > > >      build_ovn_lbs(ctx, datapaths, &lbs);
> >  > > > > diff --git a/northd/ovn_northd.dl b/northd/ovn_northd.dl
> >  > > > > index 22913f05a..7154fed13 100644
> >  > > > > --- a/northd/ovn_northd.dl
> >  > > > > +++ b/northd/ovn_northd.dl
> >  > > > > @@ -753,7 +753,7 @@ Northd_Probe_Interval[interval] :-
> >  > > > >  relation CheckLspIsUp[bool]
> >  > > > >  CheckLspIsUp[check_lsp_is_up] :-
> >  > > > >      nb in nb::NB_Global(),
> >  > > > > -    var check_lsp_is_up = not
> >  > > nb.options.get_bool_def(i"ignore_lsp_down", false).
> >  > > > > +    var check_lsp_is_up = not
> >  > > nb.options.get_bool_def(i"ignore_lsp_down", true).
> >  > > > >  CheckLspIsUp[true] :-
> >  > > > >      Unit(),
> >  > > > >      not nb in nb::NB_Global().
> >  > > > > diff --git a/ovn-nb.xml b/ovn-nb.xml
> >  > > > > index 390cc5a44..d8266ed4d 100644
> >  > > > > --- a/ovn-nb.xml
> >  > > > > +++ b/ovn-nb.xml
> >  > > > > @@ -236,7 +236,7 @@
> >  > > > >            resolved even before the relevant VM/container is
> > running.
> >  > > For
> >  > > > >            environments where this is not an issue, setting it to
> >  > > > >            <code>true</code> can reduce the load and latency of the
> >  > > control
> >  > > > > -          plane. The default value is <code>false</code>.
> >  > > > > +          plane. The default value is <code>true</code>.
> >  > > > >          </p>
> >  > > > >        </column>
> >  > > > >
> >  > > > > diff --git a/tests/ovn-northd.at <http://ovn-northd.at>
> > b/tests/ovn-northd.at <http://ovn-northd.at>
> >  > > > > index c5400d806..3eebb55b6 100644
> >  > > > > --- a/tests/ovn-northd.at <http://ovn-northd.at>
> >  > > > > +++ b/tests/ovn-northd.at <http://ovn-northd.at>
> >  > > > > @@ -1918,6 +1918,7 @@ OVN_FOR_EACH_NORTHD([
> >  > > > >  AT_SETUP([ignore_lsp_down])
> >  > > > >  ovn_start
> >  > > > >
> >  > > > > +ovn-nbctl set NB_Global . options:ignore_lsp_down=false
> >  > > > >  ovn-nbctl ls-add sw0
> >  > > > >  ovn-nbctl lsp-add sw0 sw0-p1 -- lsp-set-addresses sw0-p1
> >  > > "aa:aa:aa:aa:aa:aa 10.0.0.1"
> >  > > > >
> >  > > > > @@ -4922,7 +4923,7 @@ check ovn-nbctl lsp-add sw0 lsp0 \
> >  > > > >
> >  > > > >  check ovn-nbctl --wait=sb sync
> >  > > > >  AT_CHECK([ovn-sbctl --columns=tags list logical_flow | grep
> > lsp0 -c],
> >  > > [0], [dnl
> >  > > > > -9
> >  > > > > +10
> >  > > > >  ])
> >  > > > >
> >  > > > >  AT_CLEANUP
> >  > > > >
> >  > > >
> >  > > _______________________________________________
> >  > > dev mailing list
> >  > > dev@openvswitch.org <mailto:dev@openvswitch.org>
> >  > > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
> > <https://mail.openvswitch.org/mailman/listinfo/ovs-dev>
> >  > >
>
> _______________________________________________
> dev mailing list
> dev@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Jakub Libosvar Aug. 24, 2023, 12:34 p.m. UTC | #7
Hello,

we use OpenStack (Neutron) as CMS and we have some customers reporting problems with their use case because of this change. I was wondering about the benefits described below in the commit message, are there any public data that back up the statements about the control plane cost at scale and latency? It would be great to have those to help with the decision whether we want to enable or disable this knob.

Thank you very much.

Jakub

> On Oct 2, 2021, at 3:11 AM, Han Zhou <hzhou@ovn.org> wrote:
> 
> The current default behavior is that ARP responder flows for VIFs are
> added by northd after the port-binding state is UP, which creates more
> trouble than benefit in most use cases. To make the default behavior
> desirable for majority of the use cases, set the option ignore_lsp_down
> to true by default. This would help saving the control plane cost in
> large scale environment, reduce the e2e latency for all flows to be
> installed for a VIF, and making the VIF readiness checking more convenient
> in test cases and likely in CMS as well. User can still set it to false
> in circumstances (if any) when this behavior is not desired.
> 
> Signed-off-by: Han Zhou <hzhou@ovn.org>
> ---
> NEWS                 | 4 ++++
> northd/northd.c      | 2 +-
> northd/ovn_northd.dl | 2 +-
> ovn-nb.xml           | 2 +-
> tests/ovn-northd.at  | 3 ++-
> 5 files changed, 9 insertions(+), 4 deletions(-)
> 
> diff --git a/NEWS b/NEWS
> index 8a21c029e..348f3f548 100644
> --- a/NEWS
> +++ b/NEWS
> @@ -12,6 +12,10 @@ OVN v21.09.0 - xx xxx xxxx
>   - Allow static routes without nexthops.
>   - Enabled logical dp groups as a default.  CMS should disable it if not
>     desired.
> +  - Set ignore_lsp_down to true as default, so that ARP responder flows are
> +    installed together with other flows when a logical switch port is created,
> +    without having to wait for the port to be UP.  CMS should set it to false
> +    if not desired.
> 
> OVN v21.06.0 - 18 Jun 2021
> -------------------------
> diff --git a/northd/northd.c b/northd/northd.c
> index cf2467fe1..5ffd6b8eb 100644
> --- a/northd/northd.c
> +++ b/northd/northd.c
> @@ -14304,7 +14304,7 @@ ovnnb_db_run(struct northd_context *ctx,
>     controller_event_en = smap_get_bool(&nb->options,
>                                         "controller_event", false);
>     check_lsp_is_up = !smap_get_bool(&nb->options,
> -                                     "ignore_lsp_down", false);
> +                                     "ignore_lsp_down", true);
> 
>     build_datapaths(ctx, datapaths, lr_list);
>     build_ovn_lbs(ctx, datapaths, &lbs);
> diff --git a/northd/ovn_northd.dl b/northd/ovn_northd.dl
> index 22913f05a..7154fed13 100644
> --- a/northd/ovn_northd.dl
> +++ b/northd/ovn_northd.dl
> @@ -753,7 +753,7 @@ Northd_Probe_Interval[interval] :-
> relation CheckLspIsUp[bool]
> CheckLspIsUp[check_lsp_is_up] :-
>     nb in nb::NB_Global(),
> -    var check_lsp_is_up = not nb.options.get_bool_def(i"ignore_lsp_down", false).
> +    var check_lsp_is_up = not nb.options.get_bool_def(i"ignore_lsp_down", true).
> CheckLspIsUp[true] :-
>     Unit(),
>     not nb in nb::NB_Global().
> diff --git a/ovn-nb.xml b/ovn-nb.xml
> index 390cc5a44..d8266ed4d 100644
> --- a/ovn-nb.xml
> +++ b/ovn-nb.xml
> @@ -236,7 +236,7 @@
>           resolved even before the relevant VM/container is running. For
>           environments where this is not an issue, setting it to
>           <code>true</code> can reduce the load and latency of the control
> -          plane. The default value is <code>false</code>.
> +          plane. The default value is <code>true</code>.
>         </p>
>       </column>
> 
> diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
> index c5400d806..3eebb55b6 100644
> --- a/tests/ovn-northd.at
> +++ b/tests/ovn-northd.at
> @@ -1918,6 +1918,7 @@ OVN_FOR_EACH_NORTHD([
> AT_SETUP([ignore_lsp_down])
> ovn_start
> 
> +ovn-nbctl set NB_Global . options:ignore_lsp_down=false
> ovn-nbctl ls-add sw0
> ovn-nbctl lsp-add sw0 sw0-p1 -- lsp-set-addresses sw0-p1 "aa:aa:aa:aa:aa:aa 10.0.0.1"
> 
> @@ -4922,7 +4923,7 @@ check ovn-nbctl lsp-add sw0 lsp0 \
> 
> check ovn-nbctl --wait=sb sync
> AT_CHECK([ovn-sbctl --columns=tags list logical_flow | grep lsp0 -c], [0], [dnl
> -9
> +10
> ])
> 
> AT_CLEANUP
> -- 
> 2.30.2
> 
> _______________________________________________
> dev mailing list
> dev@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>
Han Zhou Aug. 24, 2023, 4:18 p.m. UTC | #8
On Thu, Aug 24, 2023 at 5:34 AM Jakub Libosvar <jlibosva@redhat.com> wrote:
>
> Hello,
>
> we use OpenStack (Neutron) as CMS and we have some customers reporting
problems with their use case because of this change. I was wondering about
the benefits described below in the commit message, are there any public
data that back up the statements about the control plane cost at scale and
latency? It would be great to have those to help with the decision whether
we want to enable or disable this knob.

Hi Jakub,

Thanks for reporting this. I am curious about the use case that is impacted
by this behavior. Could you share more details?
With regard to the control plane cost, I don't have data at hand, but it
should be easy to test either with a scale test environment or with an
actual deployment by enabling and disabling this setting and run the same
test to compare. You should see obvious northd CPU time increase because
for each LSP creation and binding there will be one more control plane
round trip and recompute in northd, and if you check the ARP responder flow
installation on the chassis, you should see its installation latency is
much bigger. Of course, this depends on the scale under test which
determines how long it takes for ovn-northd to recompute.

Thanks,
Han

>
> Thank you very much.
>
> Jakub
>
> > On Oct 2, 2021, at 3:11 AM, Han Zhou <hzhou@ovn.org> wrote:
> >
> > The current default behavior is that ARP responder flows for VIFs are
> > added by northd after the port-binding state is UP, which creates more
> > trouble than benefit in most use cases. To make the default behavior
> > desirable for majority of the use cases, set the option ignore_lsp_down
> > to true by default. This would help saving the control plane cost in
> > large scale environment, reduce the e2e latency for all flows to be
> > installed for a VIF, and making the VIF readiness checking more
convenient
> > in test cases and likely in CMS as well. User can still set it to false
> > in circumstances (if any) when this behavior is not desired.
> >
> > Signed-off-by: Han Zhou <hzhou@ovn.org>
> > ---
> > NEWS                 | 4 ++++
> > northd/northd.c      | 2 +-
> > northd/ovn_northd.dl | 2 +-
> > ovn-nb.xml           | 2 +-
> > tests/ovn-northd.at  | 3 ++-
> > 5 files changed, 9 insertions(+), 4 deletions(-)
> >
> > diff --git a/NEWS b/NEWS
> > index 8a21c029e..348f3f548 100644
> > --- a/NEWS
> > +++ b/NEWS
> > @@ -12,6 +12,10 @@ OVN v21.09.0 - xx xxx xxxx
> >   - Allow static routes without nexthops.
> >   - Enabled logical dp groups as a default.  CMS should disable it if
not
> >     desired.
> > +  - Set ignore_lsp_down to true as default, so that ARP responder
flows are
> > +    installed together with other flows when a logical switch port is
created,
> > +    without having to wait for the port to be UP.  CMS should set it
to false
> > +    if not desired.
> >
> > OVN v21.06.0 - 18 Jun 2021
> > -------------------------
> > diff --git a/northd/northd.c b/northd/northd.c
> > index cf2467fe1..5ffd6b8eb 100644
> > --- a/northd/northd.c
> > +++ b/northd/northd.c
> > @@ -14304,7 +14304,7 @@ ovnnb_db_run(struct northd_context *ctx,
> >     controller_event_en = smap_get_bool(&nb->options,
> >                                         "controller_event", false);
> >     check_lsp_is_up = !smap_get_bool(&nb->options,
> > -                                     "ignore_lsp_down", false);
> > +                                     "ignore_lsp_down", true);
> >
> >     build_datapaths(ctx, datapaths, lr_list);
> >     build_ovn_lbs(ctx, datapaths, &lbs);
> > diff --git a/northd/ovn_northd.dl b/northd/ovn_northd.dl
> > index 22913f05a..7154fed13 100644
> > --- a/northd/ovn_northd.dl
> > +++ b/northd/ovn_northd.dl
> > @@ -753,7 +753,7 @@ Northd_Probe_Interval[interval] :-
> > relation CheckLspIsUp[bool]
> > CheckLspIsUp[check_lsp_is_up] :-
> >     nb in nb::NB_Global(),
> > -    var check_lsp_is_up = not
nb.options.get_bool_def(i"ignore_lsp_down", false).
> > +    var check_lsp_is_up = not
nb.options.get_bool_def(i"ignore_lsp_down", true).
> > CheckLspIsUp[true] :-
> >     Unit(),
> >     not nb in nb::NB_Global().
> > diff --git a/ovn-nb.xml b/ovn-nb.xml
> > index 390cc5a44..d8266ed4d 100644
> > --- a/ovn-nb.xml
> > +++ b/ovn-nb.xml
> > @@ -236,7 +236,7 @@
> >           resolved even before the relevant VM/container is running. For
> >           environments where this is not an issue, setting it to
> >           <code>true</code> can reduce the load and latency of the
control
> > -          plane. The default value is <code>false</code>.
> > +          plane. The default value is <code>true</code>.
> >         </p>
> >       </column>
> >
> > diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
> > index c5400d806..3eebb55b6 100644
> > --- a/tests/ovn-northd.at
> > +++ b/tests/ovn-northd.at
> > @@ -1918,6 +1918,7 @@ OVN_FOR_EACH_NORTHD([
> > AT_SETUP([ignore_lsp_down])
> > ovn_start
> >
> > +ovn-nbctl set NB_Global . options:ignore_lsp_down=false
> > ovn-nbctl ls-add sw0
> > ovn-nbctl lsp-add sw0 sw0-p1 -- lsp-set-addresses sw0-p1
"aa:aa:aa:aa:aa:aa 10.0.0.1"
> >
> > @@ -4922,7 +4923,7 @@ check ovn-nbctl lsp-add sw0 lsp0 \
> >
> > check ovn-nbctl --wait=sb sync
> > AT_CHECK([ovn-sbctl --columns=tags list logical_flow | grep lsp0 -c],
[0], [dnl
> > -9
> > +10
> > ])
> >
> > AT_CLEANUP
> > --
> > 2.30.2
> >
> > _______________________________________________
> > dev mailing list
> > dev@openvswitch.org
> > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
> >
>
Jakub Libosvar Aug. 24, 2023, 7:31 p.m. UTC | #9
> On Aug 24, 2023, at 12:18 PM, Han Zhou <hzhou@ovn.org> wrote:
> 
> 
> 
> On Thu, Aug 24, 2023 at 5:34 AM Jakub Libosvar <jlibosva@redhat.com> wrote:
> >
> > Hello,
> >
> > we use OpenStack (Neutron) as CMS and we have some customers reporting problems with their use case because of this change. I was wondering about the benefits described below in the commit message, are there any public data that back up the statements about the control plane cost at scale and latency? It would be great to have those to help with the decision whether we want to enable or disable this knob.
> 
> Hi Jakub,
> 
> Thanks for reporting this. I am curious about the use case that is impacted by this behavior. Could you share more details?
> With regard to the control plane cost, I don't have data at hand, but it should be easy to test either with a scale test environment or with an actual deployment by enabling and disabling this setting and run the same test to compare. You should see obvious northd CPU time increase because for each LSP creation and binding there will be one more control plane round trip and recompute in northd, and if you check the ARP responder flow installation on the chassis, you should see its installation latency is much bigger. Of course, this depends on the scale under test which determines how long it takes for ovn-northd to recompute.
> 
> Thanks,
> Han
> 

Hi Han,

thanks for your reply. I don’t know much of details, the best of information I have is that they use OpenStack Manila with NetApp cluster. They’ve been using Neutron for IP allocation and the same IP is then configured on the NetApp storage as a share export. With the OVN option they get ARP replies for the unbound port instead of the real storage export, which uses the port's IP. That’s the best of my understanding. They fixed their environment by setting ignore_lsp_down to false, thus I wanted to know the benefits.

Thank you again.

Jakub

> >
> > Thank you very much.
> >
> > Jakub
> >
> > > On Oct 2, 2021, at 3:11 AM, Han Zhou <hzhou@ovn.org> wrote:
> > >
> > > The current default behavior is that ARP responder flows for VIFs are
> > > added by northd after the port-binding state is UP, which creates more
> > > trouble than benefit in most use cases. To make the default behavior
> > > desirable for majority of the use cases, set the option ignore_lsp_down
> > > to true by default. This would help saving the control plane cost in
> > > large scale environment, reduce the e2e latency for all flows to be
> > > installed for a VIF, and making the VIF readiness checking more convenient
> > > in test cases and likely in CMS as well. User can still set it to false
> > > in circumstances (if any) when this behavior is not desired.
> > >
> > > Signed-off-by: Han Zhou <hzhou@ovn.org>
> > > ---
> > > NEWS                 | 4 ++++
> > > northd/northd.c      | 2 +-
> > > northd/ovn_northd.dl | 2 +-
> > > ovn-nb.xml           | 2 +-
> > > tests/ovn-northd.at  | 3 ++-
> > > 5 files changed, 9 insertions(+), 4 deletions(-)
> > >
> > > diff --git a/NEWS b/NEWS
> > > index 8a21c029e..348f3f548 100644
> > > --- a/NEWS
> > > +++ b/NEWS
> > > @@ -12,6 +12,10 @@ OVN v21.09.0 - xx xxx xxxx
> > >   - Allow static routes without nexthops.
> > >   - Enabled logical dp groups as a default.  CMS should disable it if not
> > >     desired.
> > > +  - Set ignore_lsp_down to true as default, so that ARP responder flows are
> > > +    installed together with other flows when a logical switch port is created,
> > > +    without having to wait for the port to be UP.  CMS should set it to false
> > > +    if not desired.
> > >
> > > OVN v21.06.0 - 18 Jun 2021
> > > -------------------------
> > > diff --git a/northd/northd.c b/northd/northd.c
> > > index cf2467fe1..5ffd6b8eb 100644
> > > --- a/northd/northd.c
> > > +++ b/northd/northd.c
> > > @@ -14304,7 +14304,7 @@ ovnnb_db_run(struct northd_context *ctx,
> > >     controller_event_en = smap_get_bool(&nb->options,
> > >                                         "controller_event", false);
> > >     check_lsp_is_up = !smap_get_bool(&nb->options,
> > > -                                     "ignore_lsp_down", false);
> > > +                                     "ignore_lsp_down", true);
> > >
> > >     build_datapaths(ctx, datapaths, lr_list);
> > >     build_ovn_lbs(ctx, datapaths, &lbs);
> > > diff --git a/northd/ovn_northd.dl b/northd/ovn_northd.dl
> > > index 22913f05a..7154fed13 100644
> > > --- a/northd/ovn_northd.dl
> > > +++ b/northd/ovn_northd.dl
> > > @@ -753,7 +753,7 @@ Northd_Probe_Interval[interval] :-
> > > relation CheckLspIsUp[bool]
> > > CheckLspIsUp[check_lsp_is_up] :-
> > >     nb in nb::NB_Global(),
> > > -    var check_lsp_is_up = not nb.options.get_bool_def(i"ignore_lsp_down", false).
> > > +    var check_lsp_is_up = not nb.options.get_bool_def(i"ignore_lsp_down", true).
> > > CheckLspIsUp[true] :-
> > >     Unit(),
> > >     not nb in nb::NB_Global().
> > > diff --git a/ovn-nb.xml b/ovn-nb.xml
> > > index 390cc5a44..d8266ed4d 100644
> > > --- a/ovn-nb.xml
> > > +++ b/ovn-nb.xml
> > > @@ -236,7 +236,7 @@
> > >           resolved even before the relevant VM/container is running. For
> > >           environments where this is not an issue, setting it to
> > >           <code>true</code> can reduce the load and latency of the control
> > > -          plane. The default value is <code>false</code>.
> > > +          plane. The default value is <code>true</code>.
> > >         </p>
> > >       </column>
> > >
> > > diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
> > > index c5400d806..3eebb55b6 100644
> > > --- a/tests/ovn-northd.at
> > > +++ b/tests/ovn-northd.at
> > > @@ -1918,6 +1918,7 @@ OVN_FOR_EACH_NORTHD([
> > > AT_SETUP([ignore_lsp_down])
> > > ovn_start
> > >
> > > +ovn-nbctl set NB_Global . options:ignore_lsp_down=false
> > > ovn-nbctl ls-add sw0
> > > ovn-nbctl lsp-add sw0 sw0-p1 -- lsp-set-addresses sw0-p1 "aa:aa:aa:aa:aa:aa 10.0.0.1"
> > >
> > > @@ -4922,7 +4923,7 @@ check ovn-nbctl lsp-add sw0 lsp0 \
> > >
> > > check ovn-nbctl --wait=sb sync
> > > AT_CHECK([ovn-sbctl --columns=tags list logical_flow | grep lsp0 -c], [0], [dnl
> > > -9
> > > +10
> > > ])
> > >
> > > AT_CLEANUP
> > > --
> > > 2.30.2
> > >
> > > _______________________________________________
> > > dev mailing list
> > > dev@openvswitch.org
> > > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
> > >
> >
diff mbox series

Patch

diff --git a/NEWS b/NEWS
index 8a21c029e..348f3f548 100644
--- a/NEWS
+++ b/NEWS
@@ -12,6 +12,10 @@  OVN v21.09.0 - xx xxx xxxx
   - Allow static routes without nexthops.
   - Enabled logical dp groups as a default.  CMS should disable it if not
     desired.
+  - Set ignore_lsp_down to true as default, so that ARP responder flows are
+    installed together with other flows when a logical switch port is created,
+    without having to wait for the port to be UP.  CMS should set it to false
+    if not desired.
 
 OVN v21.06.0 - 18 Jun 2021
 -------------------------
diff --git a/northd/northd.c b/northd/northd.c
index cf2467fe1..5ffd6b8eb 100644
--- a/northd/northd.c
+++ b/northd/northd.c
@@ -14304,7 +14304,7 @@  ovnnb_db_run(struct northd_context *ctx,
     controller_event_en = smap_get_bool(&nb->options,
                                         "controller_event", false);
     check_lsp_is_up = !smap_get_bool(&nb->options,
-                                     "ignore_lsp_down", false);
+                                     "ignore_lsp_down", true);
 
     build_datapaths(ctx, datapaths, lr_list);
     build_ovn_lbs(ctx, datapaths, &lbs);
diff --git a/northd/ovn_northd.dl b/northd/ovn_northd.dl
index 22913f05a..7154fed13 100644
--- a/northd/ovn_northd.dl
+++ b/northd/ovn_northd.dl
@@ -753,7 +753,7 @@  Northd_Probe_Interval[interval] :-
 relation CheckLspIsUp[bool]
 CheckLspIsUp[check_lsp_is_up] :-
     nb in nb::NB_Global(),
-    var check_lsp_is_up = not nb.options.get_bool_def(i"ignore_lsp_down", false).
+    var check_lsp_is_up = not nb.options.get_bool_def(i"ignore_lsp_down", true).
 CheckLspIsUp[true] :-
     Unit(),
     not nb in nb::NB_Global().
diff --git a/ovn-nb.xml b/ovn-nb.xml
index 390cc5a44..d8266ed4d 100644
--- a/ovn-nb.xml
+++ b/ovn-nb.xml
@@ -236,7 +236,7 @@ 
           resolved even before the relevant VM/container is running. For
           environments where this is not an issue, setting it to
           <code>true</code> can reduce the load and latency of the control
-          plane. The default value is <code>false</code>.
+          plane. The default value is <code>true</code>.
         </p>
       </column>
 
diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
index c5400d806..3eebb55b6 100644
--- a/tests/ovn-northd.at
+++ b/tests/ovn-northd.at
@@ -1918,6 +1918,7 @@  OVN_FOR_EACH_NORTHD([
 AT_SETUP([ignore_lsp_down])
 ovn_start
 
+ovn-nbctl set NB_Global . options:ignore_lsp_down=false
 ovn-nbctl ls-add sw0
 ovn-nbctl lsp-add sw0 sw0-p1 -- lsp-set-addresses sw0-p1 "aa:aa:aa:aa:aa:aa 10.0.0.1"
 
@@ -4922,7 +4923,7 @@  check ovn-nbctl lsp-add sw0 lsp0 \
 
 check ovn-nbctl --wait=sb sync
 AT_CHECK([ovn-sbctl --columns=tags list logical_flow | grep lsp0 -c], [0], [dnl
-9
+10
 ])
 
 AT_CLEANUP