diff mbox series

[ovs-dev,ovn] ovn-controller: Run I-P engine even when no SB txn is available.

Message ID 1575639517-2266-1-git-send-email-dceara@redhat.com
State Superseded
Headers show
Series [ovs-dev,ovn] ovn-controller: Run I-P engine even when no SB txn is available. | expand

Commit Message

Dumitru Ceara Dec. 6, 2019, 1:38 p.m. UTC
If the last ovn-controller main loop run started a transaction to the
Southbound DB and the transaction is still in progress, ovn-controller
will not call engine_run(). In case there were changes to the DB,
engine_need_run() would return true which would trigger an immediate
forced recompute in the next processing loop iteration.

However, there are scenarios when updates can be processed incrementally
even if no Southbound DB transaction is available. One example, often
seen in the field, is when the MAC_Binding table is updated. Currently
such an update received while a transaction is still committing would
trigger a forced recompute.

To minimize the number of forced recomputes, we now call
engine_run(false), i.e., try to process updates incrementally without
allowing recompute, also when ovnsb_idl_txn == NULL. This is safe
because ovnsb_idl_txn is not used by change_handlers and run handlers
are not allowed to execute when calling engine_run(false).

To make sure that other users of ovnsb_idl_txn, like pinctrl_run(),
get a chance to run as soon as the transaction is completed, if the
engine has successfully run and ovnsb_idl_txn is NULL we trigger an
immediate wake and a new iteration of the processing loop.

CC: Han Zhou <hzhou@ovn.org>
CC: Numan Siddique <numans@ovn.org>
Signed-off-by: Dumitru Ceara <dceara@redhat.com>
---
 controller/ovn-controller.c | 13 +++++++++++++
 lib/inc-proc-eng.h          |  8 +++++++-
 2 files changed, 20 insertions(+), 1 deletion(-)

Comments

Han Zhou Dec. 7, 2019, 12:36 a.m. UTC | #1
Thanks Dumitru for the patch. Please see my comments below.

On Fri, Dec 6, 2019 at 5:38 AM Dumitru Ceara <dceara@redhat.com> wrote:
>
> If the last ovn-controller main loop run started a transaction to the
> Southbound DB and the transaction is still in progress, ovn-controller
> will not call engine_run(). In case there were changes to the DB,
> engine_need_run() would return true which would trigger an immediate
> forced recompute in the next processing loop iteration.
>
> However, there are scenarios when updates can be processed incrementally
> even if no Southbound DB transaction is available. One example, often
> seen in the field, is when the MAC_Binding table is updated. Currently
> such an update received while a transaction is still committing would
> trigger a forced recompute.
>
> To minimize the number of forced recomputes, we now call
> engine_run(false), i.e., try to process updates incrementally without
> allowing recompute, also when ovnsb_idl_txn == NULL. This is safe
> because ovnsb_idl_txn is not used by change_handlers and run handlers
> are not allowed to execute when calling engine_run(false).
>
This patch assumes we will never need to do SB updates when handling
changes incrementally. There are two problems:
1. I don't see a reason why a change_handler (for doing incremental
compute) can't update SB DB. For example Numan mentioned about working on
incremental processing for port-bindings on local chassis. I believe those
change handling would trigger SB updates, e.g. claiming/releasing a port.
2. Even if we can keep this assumption and never do incremental processing
for changes that require updating SB DB, how do we ensure this constraint
is enforced in coding?

For 2), it may be easy. We could hide the interface get_context(), and
always pass the context to xxx_run() interface, so that ovnsb_idl_txn can
be accessed by recompute but never by change_handlers.
What I am really concerned is 1). I think the long term solution is to make
the OVSDB change tracking available across iterations. Maybe it is not a
trivial work.

If we believe there is a big performance gain here and we are sure we don't
need to update SB DB in change handlers (in short term), we may proceed
with this approach as long as 2) is addressed, and revert this whenever the
long term solution (OVSDB change tracking over iterations) is ready.

> To make sure that other users of ovnsb_idl_txn, like pinctrl_run(),
> get a chance to run as soon as the transaction is completed, if the
> engine has successfully run and ovnsb_idl_txn is NULL we trigger an
> immediate wake and a new iteration of the processing loop.

I think this will cause busy loop until the transaction completes. And I
think it is not necessary because when transaction completes there will be
a jsonrpc message coming and trigger the main loop, and pinctrl will get a
chance to run.

>
> CC: Han Zhou <hzhou@ovn.org>
> CC: Numan Siddique <numans@ovn.org>
> Signed-off-by: Dumitru Ceara <dceara@redhat.com>
> ---
>  controller/ovn-controller.c | 13 +++++++++++++
>  lib/inc-proc-eng.h          |  8 +++++++-
>  2 files changed, 20 insertions(+), 1 deletion(-)
>
> diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c
> index 5874776..4a27d5f 100644
> --- a/controller/ovn-controller.c
> +++ b/controller/ovn-controller.c
> @@ -2100,6 +2100,14 @@ main(int argc, char *argv[])
>                          } else {
>                              engine_run(true);
>                          }
> +                    } else {
> +                        /* Even if there's no SB DB transaction
available,
> +                         * try to run the engine so that we can handle
any
> +                         * incremental changes that don't require a
recompute.
> +                         * If a recompute is required, the engine will
abort,
> +                         * triggerring a full run in the next iteration.
> +                         */
> +                        engine_run(false);
>                      }
>                      stopwatch_stop(CONTROLLER_LOOP_STOPWATCH_NAME,
>                                     time_msec());
> @@ -2168,6 +2176,11 @@ main(int argc, char *argv[])
>                           "br_int %p, chassis %p", br_int, chassis);
>                  engine_set_force_recompute(true);
>                  poll_immediate_wake();
> +            } else if (!ovnsb_idl_txn) {
> +                VLOG_DBG("engine ran, no SB DB transaction available, "
> +                         "trigger an immediate loop run: "
> +                         "br_int %p, chassis %p", br_int, chassis);
> +                poll_immediate_wake();
>              } else {
>                  engine_set_force_recompute(false);
>              }
> diff --git a/lib/inc-proc-eng.h b/lib/inc-proc-eng.h
> index 5b92971..2f90b0a 100644
> --- a/lib/inc-proc-eng.h
> +++ b/lib/inc-proc-eng.h
> @@ -189,7 +189,13 @@ void engine_add_input(struct engine_node *node,
struct engine_node *input,
>   * iteration, and the change can't be tracked across iterations */
>  void engine_set_force_recompute(bool val);
>
> -const struct engine_context * engine_get_context(void);
> +/* Return the current engine_context. The values in the context can be
NULL
> + * if the engine is run with allow_recompute == false in the current
> + * iteration.
> + * Therefore, it is the responsibility of the caller to check the context
> + * values when called from change_handlers.
> + */
> +const struct engine_context *engine_get_context(void);
>
>  void engine_set_context(const struct engine_context *);
>
> --
> 1.8.3.1
>
Dumitru Ceara Dec. 7, 2019, 11:21 a.m. UTC | #2
On Sat, Dec 7, 2019 at 1:36 AM Han Zhou <hzhou@ovn.org> wrote:
>
> Thanks Dumitru for the patch. Please see my comments below.
>

Hi Han,

Thanks for the review.

> On Fri, Dec 6, 2019 at 5:38 AM Dumitru Ceara <dceara@redhat.com> wrote:
> >
> > If the last ovn-controller main loop run started a transaction to the
> > Southbound DB and the transaction is still in progress, ovn-controller
> > will not call engine_run(). In case there were changes to the DB,
> > engine_need_run() would return true which would trigger an immediate
> > forced recompute in the next processing loop iteration.
> >
> > However, there are scenarios when updates can be processed incrementally
> > even if no Southbound DB transaction is available. One example, often
> > seen in the field, is when the MAC_Binding table is updated. Currently
> > such an update received while a transaction is still committing would
> > trigger a forced recompute.
> >
> > To minimize the number of forced recomputes, we now call
> > engine_run(false), i.e., try to process updates incrementally without
> > allowing recompute, also when ovnsb_idl_txn == NULL. This is safe
> > because ovnsb_idl_txn is not used by change_handlers and run handlers
> > are not allowed to execute when calling engine_run(false).
> >
> This patch assumes we will never need to do SB updates when handling changes incrementally. There are two problems:
> 1. I don't see a reason why a change_handler (for doing incremental compute) can't update SB DB. For example Numan mentioned about working on incremental processing for port-bindings on local chassis. I believe those change handling would trigger SB updates, e.g. claiming/releasing a port.
> 2. Even if we can keep this assumption and never do incremental processing for changes that require updating SB DB, how do we ensure this constraint is enforced in coding?
>

I think my commit message is not clear enough. I didn't mean that
change handlers will never perform SB updates. What I meant was that
if a change handler needs to perform SB updates it needs to validate
that ovnsb_idl_txn != NULL and if it's NULL return "false" to trigger
a recompute. At this moment we have no change handler that updates the
SB database but as you pointed out we might have them in the future.
Then, the new change handlers will have to make sure that
ovnsb_idl_txn is not NULL.

> For 2), it may be easy. We could hide the interface get_context(), and always pass the context to xxx_run() interface, so that ovnsb_idl_txn can be accessed by recompute but never by change_handlers.
> What I am really concerned is 1). I think the long term solution is to make the OVSDB change tracking available across iterations. Maybe it is not a trivial work.
>

As mentioned above, I don't think we need this as long as the contract
of get_context() is clear and the users know to check for NULL txn
pointers in change handlers.

> If we believe there is a big performance gain here and we are sure we don't need to update SB DB in change handlers (in short term), we may proceed with this approach as long as 2) is addressed, and revert this whenever the long term solution (OVSDB change tracking over iterations) is ready.

There's no real reason to not try to process changes incrementally
even when a SB txn is in progress (ovnsb_idl_txn == NULL). We use the
same reasoning for !ofctrl_can_put() in order to avoid forced
recomputes. We saw it on customer deployments that MAC_Binding updates
received while a SB transaction was still in progress were triggering
unnecessary forced full recomputes that last in the order of tens of
seconds.

What do you think?

>
> > To make sure that other users of ovnsb_idl_txn, like pinctrl_run(),
> > get a chance to run as soon as the transaction is completed, if the
> > engine has successfully run and ovnsb_idl_txn is NULL we trigger an
> > immediate wake and a new iteration of the processing loop.
>
> I think this will cause busy loop until the transaction completes. And I think it is not necessary because when transaction completes there will be a jsonrpc message coming and trigger the main loop, and pinctrl will get a chance to run.

Ah, right, I missed that case. I can just remove this check.

Thanks,
Dumitru

>
> >
> > CC: Han Zhou <hzhou@ovn.org>
> > CC: Numan Siddique <numans@ovn.org>
> > Signed-off-by: Dumitru Ceara <dceara@redhat.com>
> > ---
> >  controller/ovn-controller.c | 13 +++++++++++++
> >  lib/inc-proc-eng.h          |  8 +++++++-
> >  2 files changed, 20 insertions(+), 1 deletion(-)
> >
> > diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c
> > index 5874776..4a27d5f 100644
> > --- a/controller/ovn-controller.c
> > +++ b/controller/ovn-controller.c
> > @@ -2100,6 +2100,14 @@ main(int argc, char *argv[])
> >                          } else {
> >                              engine_run(true);
> >                          }
> > +                    } else {
> > +                        /* Even if there's no SB DB transaction available,
> > +                         * try to run the engine so that we can handle any
> > +                         * incremental changes that don't require a recompute.
> > +                         * If a recompute is required, the engine will abort,
> > +                         * triggerring a full run in the next iteration.
> > +                         */
> > +                        engine_run(false);
> >                      }
> >                      stopwatch_stop(CONTROLLER_LOOP_STOPWATCH_NAME,
> >                                     time_msec());
> > @@ -2168,6 +2176,11 @@ main(int argc, char *argv[])
> >                           "br_int %p, chassis %p", br_int, chassis);
> >                  engine_set_force_recompute(true);
> >                  poll_immediate_wake();
> > +            } else if (!ovnsb_idl_txn) {
> > +                VLOG_DBG("engine ran, no SB DB transaction available, "
> > +                         "trigger an immediate loop run: "
> > +                         "br_int %p, chassis %p", br_int, chassis);
> > +                poll_immediate_wake();
> >              } else {
> >                  engine_set_force_recompute(false);
> >              }
> > diff --git a/lib/inc-proc-eng.h b/lib/inc-proc-eng.h
> > index 5b92971..2f90b0a 100644
> > --- a/lib/inc-proc-eng.h
> > +++ b/lib/inc-proc-eng.h
> > @@ -189,7 +189,13 @@ void engine_add_input(struct engine_node *node, struct engine_node *input,
> >   * iteration, and the change can't be tracked across iterations */
> >  void engine_set_force_recompute(bool val);
> >
> > -const struct engine_context * engine_get_context(void);
> > +/* Return the current engine_context. The values in the context can be NULL
> > + * if the engine is run with allow_recompute == false in the current
> > + * iteration.
> > + * Therefore, it is the responsibility of the caller to check the context
> > + * values when called from change_handlers.
> > + */
> > +const struct engine_context *engine_get_context(void);
> >
> >  void engine_set_context(const struct engine_context *);
> >
> > --
> > 1.8.3.1
> >
Han Zhou Dec. 7, 2019, 3:53 p.m. UTC | #3
On Sat, Dec 7, 2019 at 3:21 AM Dumitru Ceara <dceara@redhat.com> wrote:
>
> On Sat, Dec 7, 2019 at 1:36 AM Han Zhou <hzhou@ovn.org> wrote:
> >
> > Thanks Dumitru for the patch. Please see my comments below.
> >
>
> Hi Han,
>
> Thanks for the review.
>
> > On Fri, Dec 6, 2019 at 5:38 AM Dumitru Ceara <dceara@redhat.com> wrote:
> > >
> > > If the last ovn-controller main loop run started a transaction to the
> > > Southbound DB and the transaction is still in progress, ovn-controller
> > > will not call engine_run(). In case there were changes to the DB,
> > > engine_need_run() would return true which would trigger an immediate
> > > forced recompute in the next processing loop iteration.
> > >
> > > However, there are scenarios when updates can be processed
incrementally
> > > even if no Southbound DB transaction is available. One example, often
> > > seen in the field, is when the MAC_Binding table is updated. Currently
> > > such an update received while a transaction is still committing would
> > > trigger a forced recompute.
> > >
> > > To minimize the number of forced recomputes, we now call
> > > engine_run(false), i.e., try to process updates incrementally without
> > > allowing recompute, also when ovnsb_idl_txn == NULL. This is safe
> > > because ovnsb_idl_txn is not used by change_handlers and run handlers
> > > are not allowed to execute when calling engine_run(false).
> > >
> > This patch assumes we will never need to do SB updates when handling
changes incrementally. There are two problems:
> > 1. I don't see a reason why a change_handler (for doing incremental
compute) can't update SB DB. For example Numan mentioned about working on
incremental processing for port-bindings on local chassis. I believe those
change handling would trigger SB updates, e.g. claiming/releasing a port.
> > 2. Even if we can keep this assumption and never do incremental
processing for changes that require updating SB DB, how do we ensure this
constraint is enforced in coding?
> >
>
> I think my commit message is not clear enough. I didn't mean that
> change handlers will never perform SB updates. What I meant was that
> if a change handler needs to perform SB updates it needs to validate
> that ovnsb_idl_txn != NULL and if it's NULL return "false" to trigger
> a recompute. At this moment we have no change handler that updates the
> SB database but as you pointed out we might have them in the future.
> Then, the new change handlers will have to make sure that
> ovnsb_idl_txn is not NULL.
>

OK, maybe I missed the point. Now let me rephrase your idea.
This patch is to do only incremental processing when txn is NULL, which
means executing change handlers instead of run() methods. Assume some
change handlers need to update SB DB, some don't.
If the incoming change (when txn is NULL) doesn't trigger SB update, then
change handlers can complete without aborting. - this seems to happen in
most cases.
If the incoming change happen to require another SB update, since txn is
NULL, the change handlers that are supposed to do the update will need to
return false and result in aborting. - this may not happen very often, so
the approach of this patch should be effective.

> > For 2), it may be easy. We could hide the interface get_context(), and
always pass the context to xxx_run() interface, so that ovnsb_idl_txn can
be accessed by recompute but never by change_handlers.
> > What I am really concerned is 1). I think the long term solution is to
make the OVSDB change tracking available across iterations. Maybe it is not
a trivial work.
> >
>
> As mentioned above, I don't think we need this as long as the contract
> of get_context() is clear and the users know to check for NULL txn
> pointers in change handlers.
>
Agree, but I'd suggest to document the patter more clearly, that a run()
handler never expects NULL txn, and a change handler need to check if txn
is NULL before using it. If it is NULL, it should return false (can't
process the change).

> > If we believe there is a big performance gain here and we are sure we
don't need to update SB DB in change handlers (in short term), we may
proceed with this approach as long as 2) is addressed, and revert this
whenever the long term solution (OVSDB change tracking over iterations) is
ready.
>
> There's no real reason to not try to process changes incrementally
> even when a SB txn is in progress (ovnsb_idl_txn == NULL). We use the
> same reasoning for !ofctrl_can_put() in order to avoid forced
> recomputes. We saw it on customer deployments that MAC_Binding updates
> received while a SB transaction was still in progress were triggering
> unnecessary forced full recomputes that last in the order of tens of
> seconds.
>
> What do you think?
>
Ack.

> >
> > > To make sure that other users of ovnsb_idl_txn, like pinctrl_run(),
> > > get a chance to run as soon as the transaction is completed, if the
> > > engine has successfully run and ovnsb_idl_txn is NULL we trigger an
> > > immediate wake and a new iteration of the processing loop.
> >
> > I think this will cause busy loop until the transaction completes. And
I think it is not necessary because when transaction completes there will
be a jsonrpc message coming and trigger the main loop, and pinctrl will get
a chance to run.
>
> Ah, right, I missed that case. I can just remove this check.
>
> Thanks,
> Dumitru
>
> >
> > >
> > > CC: Han Zhou <hzhou@ovn.org>
> > > CC: Numan Siddique <numans@ovn.org>
> > > Signed-off-by: Dumitru Ceara <dceara@redhat.com>
> > > ---
> > >  controller/ovn-controller.c | 13 +++++++++++++
> > >  lib/inc-proc-eng.h          |  8 +++++++-
> > >  2 files changed, 20 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c
> > > index 5874776..4a27d5f 100644
> > > --- a/controller/ovn-controller.c
> > > +++ b/controller/ovn-controller.c
> > > @@ -2100,6 +2100,14 @@ main(int argc, char *argv[])
> > >                          } else {
> > >                              engine_run(true);
> > >                          }
> > > +                    } else {
> > > +                        /* Even if there's no SB DB transaction
available,
> > > +                         * try to run the engine so that we can
handle any
> > > +                         * incremental changes that don't require a
recompute.
> > > +                         * If a recompute is required, the engine
will abort,
> > > +                         * triggerring a full run in the next
iteration.
> > > +                         */
> > > +                        engine_run(false);
> > >                      }
> > >                      stopwatch_stop(CONTROLLER_LOOP_STOPWATCH_NAME,
> > >                                     time_msec());
> > > @@ -2168,6 +2176,11 @@ main(int argc, char *argv[])
> > >                           "br_int %p, chassis %p", br_int, chassis);
> > >                  engine_set_force_recompute(true);
> > >                  poll_immediate_wake();
> > > +            } else if (!ovnsb_idl_txn) {
> > > +                VLOG_DBG("engine ran, no SB DB transaction
available, "
> > > +                         "trigger an immediate loop run: "
> > > +                         "br_int %p, chassis %p", br_int, chassis);
> > > +                poll_immediate_wake();
> > >              } else {
> > >                  engine_set_force_recompute(false);
> > >              }
> > > diff --git a/lib/inc-proc-eng.h b/lib/inc-proc-eng.h
> > > index 5b92971..2f90b0a 100644
> > > --- a/lib/inc-proc-eng.h
> > > +++ b/lib/inc-proc-eng.h
> > > @@ -189,7 +189,13 @@ void engine_add_input(struct engine_node *node,
struct engine_node *input,
> > >   * iteration, and the change can't be tracked across iterations */
> > >  void engine_set_force_recompute(bool val);
> > >
> > > -const struct engine_context * engine_get_context(void);
> > > +/* Return the current engine_context. The values in the context can
be NULL
> > > + * if the engine is run with allow_recompute == false in the current
> > > + * iteration.
> > > + * Therefore, it is the responsibility of the caller to check the
context
> > > + * values when called from change_handlers.
> > > + */
> > > +const struct engine_context *engine_get_context(void);
> > >
> > >  void engine_set_context(const struct engine_context *);
> > >
> > > --
> > > 1.8.3.1
> > >
>
Dumitru Ceara Dec. 8, 2019, 12:55 p.m. UTC | #4
On Sat, Dec 7, 2019 at 4:53 PM Han Zhou <hzhou@ovn.org> wrote:
>
>
>
> On Sat, Dec 7, 2019 at 3:21 AM Dumitru Ceara <dceara@redhat.com> wrote:
> >
> > On Sat, Dec 7, 2019 at 1:36 AM Han Zhou <hzhou@ovn.org> wrote:
> > >
> > > Thanks Dumitru for the patch. Please see my comments below.
> > >
> >
> > Hi Han,
> >
> > Thanks for the review.
> >
> > > On Fri, Dec 6, 2019 at 5:38 AM Dumitru Ceara <dceara@redhat.com> wrote:
> > > >
> > > > If the last ovn-controller main loop run started a transaction to the
> > > > Southbound DB and the transaction is still in progress, ovn-controller
> > > > will not call engine_run(). In case there were changes to the DB,
> > > > engine_need_run() would return true which would trigger an immediate
> > > > forced recompute in the next processing loop iteration.
> > > >
> > > > However, there are scenarios when updates can be processed incrementally
> > > > even if no Southbound DB transaction is available. One example, often
> > > > seen in the field, is when the MAC_Binding table is updated. Currently
> > > > such an update received while a transaction is still committing would
> > > > trigger a forced recompute.
> > > >
> > > > To minimize the number of forced recomputes, we now call
> > > > engine_run(false), i.e., try to process updates incrementally without
> > > > allowing recompute, also when ovnsb_idl_txn == NULL. This is safe
> > > > because ovnsb_idl_txn is not used by change_handlers and run handlers
> > > > are not allowed to execute when calling engine_run(false).
> > > >
> > > This patch assumes we will never need to do SB updates when handling changes incrementally. There are two problems:
> > > 1. I don't see a reason why a change_handler (for doing incremental compute) can't update SB DB. For example Numan mentioned about working on incremental processing for port-bindings on local chassis. I believe those change handling would trigger SB updates, e.g. claiming/releasing a port.
> > > 2. Even if we can keep this assumption and never do incremental processing for changes that require updating SB DB, how do we ensure this constraint is enforced in coding?
> > >
> >
> > I think my commit message is not clear enough. I didn't mean that
> > change handlers will never perform SB updates. What I meant was that
> > if a change handler needs to perform SB updates it needs to validate
> > that ovnsb_idl_txn != NULL and if it's NULL return "false" to trigger
> > a recompute. At this moment we have no change handler that updates the
> > SB database but as you pointed out we might have them in the future.
> > Then, the new change handlers will have to make sure that
> > ovnsb_idl_txn is not NULL.
> >
>
> OK, maybe I missed the point. Now let me rephrase your idea.
> This patch is to do only incremental processing when txn is NULL, which means executing change handlers instead of run() methods. Assume some change handlers need to update SB DB, some don't.
> If the incoming change (when txn is NULL) doesn't trigger SB update, then change handlers can complete without aborting. - this seems to happen in most cases.
> If the incoming change happen to require another SB update, since txn is NULL, the change handlers that are supposed to do the update will need to return false and result in aborting. - this may not happen very often, so the approach of this patch should be effective.

Exactly.

>
> > > For 2), it may be easy. We could hide the interface get_context(), and always pass the context to xxx_run() interface, so that ovnsb_idl_txn can be accessed by recompute but never by change_handlers.
> > > What I am really concerned is 1). I think the long term solution is to make the OVSDB change tracking available across iterations. Maybe it is not a trivial work.
> > >
> >
> > As mentioned above, I don't think we need this as long as the contract
> > of get_context() is clear and the users know to check for NULL txn
> > pointers in change handlers.
> >
> Agree, but I'd suggest to document the patter more clearly, that a run() handler never expects NULL txn, and a change handler need to check if txn is NULL before using it. If it is NULL, it should return false (can't process the change).

I'll send a v2 soon and document this requirement better.

Thanks,
Dumitru

>
> > > If we believe there is a big performance gain here and we are sure we don't need to update SB DB in change handlers (in short term), we may proceed with this approach as long as 2) is addressed, and revert this whenever the long term solution (OVSDB change tracking over iterations) is ready.
> >
> > There's no real reason to not try to process changes incrementally
> > even when a SB txn is in progress (ovnsb_idl_txn == NULL). We use the
> > same reasoning for !ofctrl_can_put() in order to avoid forced
> > recomputes. We saw it on customer deployments that MAC_Binding updates
> > received while a SB transaction was still in progress were triggering
> > unnecessary forced full recomputes that last in the order of tens of
> > seconds.
> >
> > What do you think?
> >
> Ack.
>
> > >
> > > > To make sure that other users of ovnsb_idl_txn, like pinctrl_run(),
> > > > get a chance to run as soon as the transaction is completed, if the
> > > > engine has successfully run and ovnsb_idl_txn is NULL we trigger an
> > > > immediate wake and a new iteration of the processing loop.
> > >
> > > I think this will cause busy loop until the transaction completes. And I think it is not necessary because when transaction completes there will be a jsonrpc message coming and trigger the main loop, and pinctrl will get a chance to run.
> >
> > Ah, right, I missed that case. I can just remove this check.
> >
> > Thanks,
> > Dumitru
> >
> > >
> > > >
> > > > CC: Han Zhou <hzhou@ovn.org>
> > > > CC: Numan Siddique <numans@ovn.org>
> > > > Signed-off-by: Dumitru Ceara <dceara@redhat.com>
> > > > ---
> > > >  controller/ovn-controller.c | 13 +++++++++++++
> > > >  lib/inc-proc-eng.h          |  8 +++++++-
> > > >  2 files changed, 20 insertions(+), 1 deletion(-)
> > > >
> > > > diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c
> > > > index 5874776..4a27d5f 100644
> > > > --- a/controller/ovn-controller.c
> > > > +++ b/controller/ovn-controller.c
> > > > @@ -2100,6 +2100,14 @@ main(int argc, char *argv[])
> > > >                          } else {
> > > >                              engine_run(true);
> > > >                          }
> > > > +                    } else {
> > > > +                        /* Even if there's no SB DB transaction available,
> > > > +                         * try to run the engine so that we can handle any
> > > > +                         * incremental changes that don't require a recompute.
> > > > +                         * If a recompute is required, the engine will abort,
> > > > +                         * triggerring a full run in the next iteration.
> > > > +                         */
> > > > +                        engine_run(false);
> > > >                      }
> > > >                      stopwatch_stop(CONTROLLER_LOOP_STOPWATCH_NAME,
> > > >                                     time_msec());
> > > > @@ -2168,6 +2176,11 @@ main(int argc, char *argv[])
> > > >                           "br_int %p, chassis %p", br_int, chassis);
> > > >                  engine_set_force_recompute(true);
> > > >                  poll_immediate_wake();
> > > > +            } else if (!ovnsb_idl_txn) {
> > > > +                VLOG_DBG("engine ran, no SB DB transaction available, "
> > > > +                         "trigger an immediate loop run: "
> > > > +                         "br_int %p, chassis %p", br_int, chassis);
> > > > +                poll_immediate_wake();
> > > >              } else {
> > > >                  engine_set_force_recompute(false);
> > > >              }
> > > > diff --git a/lib/inc-proc-eng.h b/lib/inc-proc-eng.h
> > > > index 5b92971..2f90b0a 100644
> > > > --- a/lib/inc-proc-eng.h
> > > > +++ b/lib/inc-proc-eng.h
> > > > @@ -189,7 +189,13 @@ void engine_add_input(struct engine_node *node, struct engine_node *input,
> > > >   * iteration, and the change can't be tracked across iterations */
> > > >  void engine_set_force_recompute(bool val);
> > > >
> > > > -const struct engine_context * engine_get_context(void);
> > > > +/* Return the current engine_context. The values in the context can be NULL
> > > > + * if the engine is run with allow_recompute == false in the current
> > > > + * iteration.
> > > > + * Therefore, it is the responsibility of the caller to check the context
> > > > + * values when called from change_handlers.
> > > > + */
> > > > +const struct engine_context *engine_get_context(void);
> > > >
> > > >  void engine_set_context(const struct engine_context *);
> > > >
> > > > --
> > > > 1.8.3.1
> > > >
> >
diff mbox series

Patch

diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c
index 5874776..4a27d5f 100644
--- a/controller/ovn-controller.c
+++ b/controller/ovn-controller.c
@@ -2100,6 +2100,14 @@  main(int argc, char *argv[])
                         } else {
                             engine_run(true);
                         }
+                    } else {
+                        /* Even if there's no SB DB transaction available,
+                         * try to run the engine so that we can handle any
+                         * incremental changes that don't require a recompute.
+                         * If a recompute is required, the engine will abort,
+                         * triggerring a full run in the next iteration.
+                         */
+                        engine_run(false);
                     }
                     stopwatch_stop(CONTROLLER_LOOP_STOPWATCH_NAME,
                                    time_msec());
@@ -2168,6 +2176,11 @@  main(int argc, char *argv[])
                          "br_int %p, chassis %p", br_int, chassis);
                 engine_set_force_recompute(true);
                 poll_immediate_wake();
+            } else if (!ovnsb_idl_txn) {
+                VLOG_DBG("engine ran, no SB DB transaction available, "
+                         "trigger an immediate loop run: "
+                         "br_int %p, chassis %p", br_int, chassis);
+                poll_immediate_wake();
             } else {
                 engine_set_force_recompute(false);
             }
diff --git a/lib/inc-proc-eng.h b/lib/inc-proc-eng.h
index 5b92971..2f90b0a 100644
--- a/lib/inc-proc-eng.h
+++ b/lib/inc-proc-eng.h
@@ -189,7 +189,13 @@  void engine_add_input(struct engine_node *node, struct engine_node *input,
  * iteration, and the change can't be tracked across iterations */
 void engine_set_force_recompute(bool val);
 
-const struct engine_context * engine_get_context(void);
+/* Return the current engine_context. The values in the context can be NULL
+ * if the engine is run with allow_recompute == false in the current
+ * iteration.
+ * Therefore, it is the responsibility of the caller to check the context
+ * values when called from change_handlers.
+ */
+const struct engine_context *engine_get_context(void);
 
 void engine_set_context(const struct engine_context *);