Message ID | 1563936755-71403-1-git-send-email-hzhou8@ebay.com |
---|---|
State | Superseded |
Headers | show |
Series | [ovs-dev] ovn-controller: Fix flow installation latency caused by recompute. | expand |
On Wed, Jul 24, 2019 at 8:23 AM Han Zhou <zhouhan@gmail.com> wrote: > From: Han Zhou <hzhou8@ebay.com> > > When there are in-flight flow-installations pending to ovs-vswitchd, > current incremental processing logic prioritizes new change handling. > However, in scenarios where frequent recomputes are triggered, the > later recompute would block the flow-installation for previously > computed flows because recompute usually takes long time, especially > when there are large number of flows. This results in worse latency > than the version without incremental processing in specific scale > test scenarios. > > While we can simply fix the problem by prioritizing flow installation > rather than new change handling, it can cause the incremental > processing to degrade to always recompute in certain scenarios when > there are some changes triggering recomputes, followed by a lot of > continously coming changes that can be handled incrementally. Because > OVSDB change tracker cannot preserve changes across iterations, once > the recompute is triggered and resulted in a lot of pending messages > to ovs-vswitchd, and if we choose to skip the engine_run() > in the next iteration when a incrementally processible change comes, > we miss the opportunity to handle that tracked change and will have > to trigger recompute again in the next next iteration, and so on, if > such changes come continously. > > This patch solves the problem by introducing engine_set_abort_recompute(), > so that we can prioritize new change handling if the change can be > incrementally processed, but if the change triggers recompute, we > abort there without spending CPU on the recompute to avoid blocking > the previous computed flow installation. > > Reported-by: Daniel Alvarez Sanchez <dalvarez@redhat.com> > Reported-by: Numan Siddique <nusiddiq@redhat.com> > Reported-at: > https://mail.openvswitch.org/pipermail/ovs-discuss/2019-June/048822.html > Tested-by: Numan Siddique <nusiddiq@redhat.com> Signed-off-by: Han Zhou <hzhou8@ebay.com> > Acked-by: Numan Siddique <nusiddiq@redhat.com> Thanks Numan > --- > ovn/controller/ofctrl.c | 2 +- > ovn/controller/ofctrl.h | 1 + > ovn/controller/ovn-controller.c | 30 +++++++++++++++++++++++++++--- > ovn/lib/inc-proc-eng.c | 26 ++++++++++++++++++++++---- > ovn/lib/inc-proc-eng.h | 9 +++++++-- > 5 files changed, 58 insertions(+), 10 deletions(-) > > diff --git a/ovn/controller/ofctrl.c b/ovn/controller/ofctrl.c > index 043abd6..0fcaa72 100644 > --- a/ovn/controller/ofctrl.c > +++ b/ovn/controller/ofctrl.c > @@ -985,7 +985,7 @@ add_meter(struct ovn_extend_table_info *m_desired, > * in the correct state and not backlogged with existing flow_mods. (Our > * criteria for being backlogged appear very conservative, but the socket > * between ovn-controller and OVS provides some buffering.) */ > -static bool > +bool > ofctrl_can_put(void) > { > if (state != S_UPDATE_FLOWS > diff --git a/ovn/controller/ofctrl.h b/ovn/controller/ofctrl.h > index ed8918a..2b21c11 100644 > --- a/ovn/controller/ofctrl.h > +++ b/ovn/controller/ofctrl.h > @@ -51,6 +51,7 @@ void ofctrl_put(struct ovn_desired_flow_table *, > const struct sbrec_meter_table *, > int64_t nb_cfg, > bool flow_changed); > +bool ofctrl_can_put(void); > void ofctrl_wait(void); > void ofctrl_destroy(void); > int64_t ofctrl_get_cur_cfg(void); > diff --git a/ovn/controller/ovn-controller.c > b/ovn/controller/ovn-controller.c > index c4883aa..866cc1c 100644 > --- a/ovn/controller/ovn-controller.c > +++ b/ovn/controller/ovn-controller.c > @@ -1869,6 +1869,7 @@ main(int argc, char *argv[]) > > uint64_t engine_run_id = 0; > uint64_t old_engine_run_id = 0; > + bool engine_aborted = false; > > unsigned int ovs_cond_seqno = UINT_MAX; > unsigned int ovnsb_cond_seqno = UINT_MAX; > @@ -1955,7 +1956,30 @@ main(int argc, char *argv[]) > stopwatch_start(CONTROLLER_LOOP_STOPWATCH_NAME, > time_msec()); > if (ovnsb_idl_txn) { > - engine_run(&en_flow_output, ++engine_run_id); > + if (!ofctrl_can_put()) { > + /* When there are in-flight messages pending > to > + * ovs-vswitchd, we should hold on > recomputing so > + * that the previous flow installations won't > be > + * delayed. However, we still want to try if > + * recompute is not needed and we can quickly > + * incrementally process the new changes, to > avoid > + * unnecessarily forced recomputes later on. > This > + * is because the OVSDB change tracker cannot > + * preserve tracked changes across > iterations. If > + * change tracking is improved, we can simply > skip > + * this round of engine_run and continue > processing > + * acculated changes incrementally later when > + * ofctrl_can_put() returns true. */ > + if (!engine_aborted) { > + engine_set_abort_recompute(true); > + engine_aborted = > engine_run(&en_flow_output, > + > ++engine_run_id); > + } > + } else { > + engine_set_abort_recompute(false); > + engine_aborted = false; > + engine_run(&en_flow_output, ++engine_run_id); > + } > } > stopwatch_stop(CONTROLLER_LOOP_STOPWATCH_NAME, > time_msec()); > @@ -1993,8 +2017,8 @@ main(int argc, char *argv[]) > } > > } > - if (old_engine_run_id == engine_run_id) { > - if (engine_need_run(&en_flow_output)) { > + if (old_engine_run_id == engine_run_id || engine_aborted) { > + if (engine_aborted || engine_need_run(&en_flow_output)) { > VLOG_DBG("engine did not run, force recompute next > time: " > "br_int %p, chassis %p", br_int, chassis); > engine_set_force_recompute(true); > diff --git a/ovn/lib/inc-proc-eng.c b/ovn/lib/inc-proc-eng.c > index 1ddea1a..1064a08 100644 > --- a/ovn/lib/inc-proc-eng.c > +++ b/ovn/lib/inc-proc-eng.c > @@ -31,6 +31,7 @@ > VLOG_DEFINE_THIS_MODULE(inc_proc_eng); > > static bool engine_force_recompute = false; > +static bool engine_abort_recompute = false; > static const struct engine_context *engine_context; > > void > @@ -39,6 +40,12 @@ engine_set_force_recompute(bool val) > engine_force_recompute = val; > } > > +void > +engine_set_abort_recompute(bool val) > +{ > + engine_abort_recompute = val; > +} > + > const struct engine_context * > engine_get_context(void) > { > @@ -121,11 +128,11 @@ engine_ovsdb_node_add_index(struct engine_node > *node, const char *name, > ed->n_indexes ++; > } > > -void > +bool > engine_run(struct engine_node *node, uint64_t run_id) > { > if (node->run_id == run_id) { > - return; > + return true; > } > node->run_id = run_id; > > @@ -133,11 +140,13 @@ engine_run(struct engine_node *node, uint64_t run_id) > if (!node->n_inputs) { > node->run(node); > VLOG_DBG("node: %s, changed: %d", node->name, node->changed); > - return; > + return true; > } > > for (size_t i = 0; i < node->n_inputs; i++) { > - engine_run(node->inputs[i].node, run_id); > + if (!engine_run(node->inputs[i].node, run_id)) { > + return false; > + } > } > > bool need_compute = false; > @@ -160,6 +169,10 @@ engine_run(struct engine_node *node, uint64_t run_id) > if (need_recompute) { > VLOG_DBG("node: %s, recompute (%s)", node->name, > engine_force_recompute ? "forced" : "triggered"); > + if (engine_abort_recompute) { > + VLOG_DBG("node: %s, recompute aborted", node->name); > + return false; > + } > node->run(node); > } else if (need_compute) { > for (size_t i = 0; i < node->n_inputs; i++) { > @@ -170,6 +183,10 @@ engine_run(struct engine_node *node, uint64_t run_id) > VLOG_DBG("node: %s, can't handle change for input %s, > " > "fall back to recompute", > node->name, node->inputs[i].node->name); > + if (engine_abort_recompute) { > + VLOG_DBG("node: %s, recompute aborted", > node->name); > + return false; > + } > node->run(node); > break; > } > @@ -178,6 +195,7 @@ engine_run(struct engine_node *node, uint64_t run_id) > } > > VLOG_DBG("node: %s, changed: %d", node->name, node->changed); > + return true; > } > > bool > diff --git a/ovn/lib/inc-proc-eng.h b/ovn/lib/inc-proc-eng.h > index aab899e..c3d7b5e 100644 > --- a/ovn/lib/inc-proc-eng.h > +++ b/ovn/lib/inc-proc-eng.h > @@ -121,8 +121,9 @@ struct engine_node { > void engine_init(struct engine_node *); > > /* Execute the processing recursively, which should be called in the main > - * loop. */ > -void engine_run(struct engine_node *, uint64_t run_id); > + * loop. Returns true if the execution is compelte, false if it is > aborted, > + * which could happen when engine_abort_recompute is set. */ > +bool engine_run(struct engine_node *, uint64_t run_id); > > /* Clean up the data for the engine nodes recursively. It calls each > node's > * cleanup() method if not NULL. It should be called before the program > @@ -150,6 +151,10 @@ void engine_add_input(struct engine_node *node, > struct engine_node *input, > * iteration, and the change can't be tracked across iterations */ > void engine_set_force_recompute(bool val); > > +/* Set the flag to cause engine execution to be aborted when there > + * is any recompute to be triggered in any node. */ > +void engine_set_abort_recompute(bool val); > + > const struct engine_context * engine_get_context(void); > > void engine_set_context(const struct engine_context *); > -- > 2.1.0 > > _______________________________________________ > dev mailing list > dev@openvswitch.org > https://mail.openvswitch.org/mailman/listinfo/ovs-dev >
Acked-by: Mark Michelson Thanks for the informative commit message! On 7/23/19 10:52 PM, Han Zhou wrote: > From: Han Zhou <hzhou8@ebay.com> > > When there are in-flight flow-installations pending to ovs-vswitchd, > current incremental processing logic prioritizes new change handling. > However, in scenarios where frequent recomputes are triggered, the > later recompute would block the flow-installation for previously > computed flows because recompute usually takes long time, especially > when there are large number of flows. This results in worse latency > than the version without incremental processing in specific scale > test scenarios. > > While we can simply fix the problem by prioritizing flow installation > rather than new change handling, it can cause the incremental > processing to degrade to always recompute in certain scenarios when > there are some changes triggering recomputes, followed by a lot of > continously coming changes that can be handled incrementally. Because > OVSDB change tracker cannot preserve changes across iterations, once > the recompute is triggered and resulted in a lot of pending messages > to ovs-vswitchd, and if we choose to skip the engine_run() > in the next iteration when a incrementally processible change comes, > we miss the opportunity to handle that tracked change and will have > to trigger recompute again in the next next iteration, and so on, if > such changes come continously. > > This patch solves the problem by introducing engine_set_abort_recompute(), > so that we can prioritize new change handling if the change can be > incrementally processed, but if the change triggers recompute, we > abort there without spending CPU on the recompute to avoid blocking > the previous computed flow installation. > > Reported-by: Daniel Alvarez Sanchez <dalvarez@redhat.com> > Reported-by: Numan Siddique <nusiddiq@redhat.com> > Reported-at: https://mail.openvswitch.org/pipermail/ovs-discuss/2019-June/048822.html > Tested-by: Numan Siddique <nusiddiq@redhat.com> > Signed-off-by: Han Zhou <hzhou8@ebay.com> > --- > ovn/controller/ofctrl.c | 2 +- > ovn/controller/ofctrl.h | 1 + > ovn/controller/ovn-controller.c | 30 +++++++++++++++++++++++++++--- > ovn/lib/inc-proc-eng.c | 26 ++++++++++++++++++++++---- > ovn/lib/inc-proc-eng.h | 9 +++++++-- > 5 files changed, 58 insertions(+), 10 deletions(-) > > diff --git a/ovn/controller/ofctrl.c b/ovn/controller/ofctrl.c > index 043abd6..0fcaa72 100644 > --- a/ovn/controller/ofctrl.c > +++ b/ovn/controller/ofctrl.c > @@ -985,7 +985,7 @@ add_meter(struct ovn_extend_table_info *m_desired, > * in the correct state and not backlogged with existing flow_mods. (Our > * criteria for being backlogged appear very conservative, but the socket > * between ovn-controller and OVS provides some buffering.) */ > -static bool > +bool > ofctrl_can_put(void) > { > if (state != S_UPDATE_FLOWS > diff --git a/ovn/controller/ofctrl.h b/ovn/controller/ofctrl.h > index ed8918a..2b21c11 100644 > --- a/ovn/controller/ofctrl.h > +++ b/ovn/controller/ofctrl.h > @@ -51,6 +51,7 @@ void ofctrl_put(struct ovn_desired_flow_table *, > const struct sbrec_meter_table *, > int64_t nb_cfg, > bool flow_changed); > +bool ofctrl_can_put(void); > void ofctrl_wait(void); > void ofctrl_destroy(void); > int64_t ofctrl_get_cur_cfg(void); > diff --git a/ovn/controller/ovn-controller.c b/ovn/controller/ovn-controller.c > index c4883aa..866cc1c 100644 > --- a/ovn/controller/ovn-controller.c > +++ b/ovn/controller/ovn-controller.c > @@ -1869,6 +1869,7 @@ main(int argc, char *argv[]) > > uint64_t engine_run_id = 0; > uint64_t old_engine_run_id = 0; > + bool engine_aborted = false; > > unsigned int ovs_cond_seqno = UINT_MAX; > unsigned int ovnsb_cond_seqno = UINT_MAX; > @@ -1955,7 +1956,30 @@ main(int argc, char *argv[]) > stopwatch_start(CONTROLLER_LOOP_STOPWATCH_NAME, > time_msec()); > if (ovnsb_idl_txn) { > - engine_run(&en_flow_output, ++engine_run_id); > + if (!ofctrl_can_put()) { > + /* When there are in-flight messages pending to > + * ovs-vswitchd, we should hold on recomputing so > + * that the previous flow installations won't be > + * delayed. However, we still want to try if > + * recompute is not needed and we can quickly > + * incrementally process the new changes, to avoid > + * unnecessarily forced recomputes later on. This > + * is because the OVSDB change tracker cannot > + * preserve tracked changes across iterations. If > + * change tracking is improved, we can simply skip > + * this round of engine_run and continue processing > + * acculated changes incrementally later when > + * ofctrl_can_put() returns true. */ > + if (!engine_aborted) { > + engine_set_abort_recompute(true); > + engine_aborted = engine_run(&en_flow_output, > + ++engine_run_id); > + } > + } else { > + engine_set_abort_recompute(false); > + engine_aborted = false; > + engine_run(&en_flow_output, ++engine_run_id); > + } > } > stopwatch_stop(CONTROLLER_LOOP_STOPWATCH_NAME, > time_msec()); > @@ -1993,8 +2017,8 @@ main(int argc, char *argv[]) > } > > } > - if (old_engine_run_id == engine_run_id) { > - if (engine_need_run(&en_flow_output)) { > + if (old_engine_run_id == engine_run_id || engine_aborted) { > + if (engine_aborted || engine_need_run(&en_flow_output)) { > VLOG_DBG("engine did not run, force recompute next time: " > "br_int %p, chassis %p", br_int, chassis); > engine_set_force_recompute(true); > diff --git a/ovn/lib/inc-proc-eng.c b/ovn/lib/inc-proc-eng.c > index 1ddea1a..1064a08 100644 > --- a/ovn/lib/inc-proc-eng.c > +++ b/ovn/lib/inc-proc-eng.c > @@ -31,6 +31,7 @@ > VLOG_DEFINE_THIS_MODULE(inc_proc_eng); > > static bool engine_force_recompute = false; > +static bool engine_abort_recompute = false; > static const struct engine_context *engine_context; > > void > @@ -39,6 +40,12 @@ engine_set_force_recompute(bool val) > engine_force_recompute = val; > } > > +void > +engine_set_abort_recompute(bool val) > +{ > + engine_abort_recompute = val; > +} > + > const struct engine_context * > engine_get_context(void) > { > @@ -121,11 +128,11 @@ engine_ovsdb_node_add_index(struct engine_node *node, const char *name, > ed->n_indexes ++; > } > > -void > +bool > engine_run(struct engine_node *node, uint64_t run_id) > { > if (node->run_id == run_id) { > - return; > + return true; > } > node->run_id = run_id; > > @@ -133,11 +140,13 @@ engine_run(struct engine_node *node, uint64_t run_id) > if (!node->n_inputs) { > node->run(node); > VLOG_DBG("node: %s, changed: %d", node->name, node->changed); > - return; > + return true; > } > > for (size_t i = 0; i < node->n_inputs; i++) { > - engine_run(node->inputs[i].node, run_id); > + if (!engine_run(node->inputs[i].node, run_id)) { > + return false; > + } > } > > bool need_compute = false; > @@ -160,6 +169,10 @@ engine_run(struct engine_node *node, uint64_t run_id) > if (need_recompute) { > VLOG_DBG("node: %s, recompute (%s)", node->name, > engine_force_recompute ? "forced" : "triggered"); > + if (engine_abort_recompute) { > + VLOG_DBG("node: %s, recompute aborted", node->name); > + return false; > + } > node->run(node); > } else if (need_compute) { > for (size_t i = 0; i < node->n_inputs; i++) { > @@ -170,6 +183,10 @@ engine_run(struct engine_node *node, uint64_t run_id) > VLOG_DBG("node: %s, can't handle change for input %s, " > "fall back to recompute", > node->name, node->inputs[i].node->name); > + if (engine_abort_recompute) { > + VLOG_DBG("node: %s, recompute aborted", node->name); > + return false; > + } > node->run(node); > break; > } > @@ -178,6 +195,7 @@ engine_run(struct engine_node *node, uint64_t run_id) > } > > VLOG_DBG("node: %s, changed: %d", node->name, node->changed); > + return true; > } > > bool > diff --git a/ovn/lib/inc-proc-eng.h b/ovn/lib/inc-proc-eng.h > index aab899e..c3d7b5e 100644 > --- a/ovn/lib/inc-proc-eng.h > +++ b/ovn/lib/inc-proc-eng.h > @@ -121,8 +121,9 @@ struct engine_node { > void engine_init(struct engine_node *); > > /* Execute the processing recursively, which should be called in the main > - * loop. */ > -void engine_run(struct engine_node *, uint64_t run_id); > + * loop. Returns true if the execution is compelte, false if it is aborted, > + * which could happen when engine_abort_recompute is set. */ > +bool engine_run(struct engine_node *, uint64_t run_id); > > /* Clean up the data for the engine nodes recursively. It calls each node's > * cleanup() method if not NULL. It should be called before the program > @@ -150,6 +151,10 @@ void engine_add_input(struct engine_node *node, struct engine_node *input, > * iteration, and the change can't be tracked across iterations */ > void engine_set_force_recompute(bool val); > > +/* Set the flag to cause engine execution to be aborted when there > + * is any recompute to be triggered in any node. */ > +void engine_set_abort_recompute(bool val); > + > const struct engine_context * engine_get_context(void); > > void engine_set_context(const struct engine_context *); >
Forgot the e-mail address after my name :) Acked-by: Mark Michelson <mmichels@redhat.com> On 7/25/19 4:16 PM, Mark Michelson wrote: > Acked-by: Mark Michelson > > Thanks for the informative commit message! > > On 7/23/19 10:52 PM, Han Zhou wrote: >> From: Han Zhou <hzhou8@ebay.com> >> >> When there are in-flight flow-installations pending to ovs-vswitchd, >> current incremental processing logic prioritizes new change handling. >> However, in scenarios where frequent recomputes are triggered, the >> later recompute would block the flow-installation for previously >> computed flows because recompute usually takes long time, especially >> when there are large number of flows. This results in worse latency >> than the version without incremental processing in specific scale >> test scenarios. >> >> While we can simply fix the problem by prioritizing flow installation >> rather than new change handling, it can cause the incremental >> processing to degrade to always recompute in certain scenarios when >> there are some changes triggering recomputes, followed by a lot of >> continously coming changes that can be handled incrementally. Because >> OVSDB change tracker cannot preserve changes across iterations, once >> the recompute is triggered and resulted in a lot of pending messages >> to ovs-vswitchd, and if we choose to skip the engine_run() >> in the next iteration when a incrementally processible change comes, >> we miss the opportunity to handle that tracked change and will have >> to trigger recompute again in the next next iteration, and so on, if >> such changes come continously. >> >> This patch solves the problem by introducing >> engine_set_abort_recompute(), >> so that we can prioritize new change handling if the change can be >> incrementally processed, but if the change triggers recompute, we >> abort there without spending CPU on the recompute to avoid blocking >> the previous computed flow installation. >> >> Reported-by: Daniel Alvarez Sanchez <dalvarez@redhat.com> >> Reported-by: Numan Siddique <nusiddiq@redhat.com> >> Reported-at: >> https://mail.openvswitch.org/pipermail/ovs-discuss/2019-June/048822.html >> Tested-by: Numan Siddique <nusiddiq@redhat.com> >> Signed-off-by: Han Zhou <hzhou8@ebay.com> >> --- >> ovn/controller/ofctrl.c | 2 +- >> ovn/controller/ofctrl.h | 1 + >> ovn/controller/ovn-controller.c | 30 +++++++++++++++++++++++++++--- >> ovn/lib/inc-proc-eng.c | 26 ++++++++++++++++++++++---- >> ovn/lib/inc-proc-eng.h | 9 +++++++-- >> 5 files changed, 58 insertions(+), 10 deletions(-) >> >> diff --git a/ovn/controller/ofctrl.c b/ovn/controller/ofctrl.c >> index 043abd6..0fcaa72 100644 >> --- a/ovn/controller/ofctrl.c >> +++ b/ovn/controller/ofctrl.c >> @@ -985,7 +985,7 @@ add_meter(struct ovn_extend_table_info *m_desired, >> * in the correct state and not backlogged with existing flow_mods. >> (Our >> * criteria for being backlogged appear very conservative, but the >> socket >> * between ovn-controller and OVS provides some buffering.) */ >> -static bool >> +bool >> ofctrl_can_put(void) >> { >> if (state != S_UPDATE_FLOWS >> diff --git a/ovn/controller/ofctrl.h b/ovn/controller/ofctrl.h >> index ed8918a..2b21c11 100644 >> --- a/ovn/controller/ofctrl.h >> +++ b/ovn/controller/ofctrl.h >> @@ -51,6 +51,7 @@ void ofctrl_put(struct ovn_desired_flow_table *, >> const struct sbrec_meter_table *, >> int64_t nb_cfg, >> bool flow_changed); >> +bool ofctrl_can_put(void); >> void ofctrl_wait(void); >> void ofctrl_destroy(void); >> int64_t ofctrl_get_cur_cfg(void); >> diff --git a/ovn/controller/ovn-controller.c >> b/ovn/controller/ovn-controller.c >> index c4883aa..866cc1c 100644 >> --- a/ovn/controller/ovn-controller.c >> +++ b/ovn/controller/ovn-controller.c >> @@ -1869,6 +1869,7 @@ main(int argc, char *argv[]) >> uint64_t engine_run_id = 0; >> uint64_t old_engine_run_id = 0; >> + bool engine_aborted = false; >> unsigned int ovs_cond_seqno = UINT_MAX; >> unsigned int ovnsb_cond_seqno = UINT_MAX; >> @@ -1955,7 +1956,30 @@ main(int argc, char *argv[]) >> stopwatch_start(CONTROLLER_LOOP_STOPWATCH_NAME, >> time_msec()); >> if (ovnsb_idl_txn) { >> - engine_run(&en_flow_output, ++engine_run_id); >> + if (!ofctrl_can_put()) { >> + /* When there are in-flight messages >> pending to >> + * ovs-vswitchd, we should hold on >> recomputing so >> + * that the previous flow installations >> won't be >> + * delayed. However, we still want to >> try if >> + * recompute is not needed and we can >> quickly >> + * incrementally process the new changes, >> to avoid >> + * unnecessarily forced recomputes later >> on. This >> + * is because the OVSDB change tracker >> cannot >> + * preserve tracked changes across >> iterations. If >> + * change tracking is improved, we can >> simply skip >> + * this round of engine_run and continue >> processing >> + * acculated changes incrementally later >> when >> + * ofctrl_can_put() returns true. */ >> + if (!engine_aborted) { >> + engine_set_abort_recompute(true); >> + engine_aborted = >> engine_run(&en_flow_output, >> + >> ++engine_run_id); >> + } >> + } else { >> + engine_set_abort_recompute(false); >> + engine_aborted = false; >> + engine_run(&en_flow_output, >> ++engine_run_id); >> + } >> } >> stopwatch_stop(CONTROLLER_LOOP_STOPWATCH_NAME, >> time_msec()); >> @@ -1993,8 +2017,8 @@ main(int argc, char *argv[]) >> } >> } >> - if (old_engine_run_id == engine_run_id) { >> - if (engine_need_run(&en_flow_output)) { >> + if (old_engine_run_id == engine_run_id || engine_aborted) { >> + if (engine_aborted || >> engine_need_run(&en_flow_output)) { >> VLOG_DBG("engine did not run, force recompute >> next time: " >> "br_int %p, chassis %p", br_int, chassis); >> engine_set_force_recompute(true); >> diff --git a/ovn/lib/inc-proc-eng.c b/ovn/lib/inc-proc-eng.c >> index 1ddea1a..1064a08 100644 >> --- a/ovn/lib/inc-proc-eng.c >> +++ b/ovn/lib/inc-proc-eng.c >> @@ -31,6 +31,7 @@ >> VLOG_DEFINE_THIS_MODULE(inc_proc_eng); >> static bool engine_force_recompute = false; >> +static bool engine_abort_recompute = false; >> static const struct engine_context *engine_context; >> void >> @@ -39,6 +40,12 @@ engine_set_force_recompute(bool val) >> engine_force_recompute = val; >> } >> +void >> +engine_set_abort_recompute(bool val) >> +{ >> + engine_abort_recompute = val; >> +} >> + >> const struct engine_context * >> engine_get_context(void) >> { >> @@ -121,11 +128,11 @@ engine_ovsdb_node_add_index(struct engine_node >> *node, const char *name, >> ed->n_indexes ++; >> } >> -void >> +bool >> engine_run(struct engine_node *node, uint64_t run_id) >> { >> if (node->run_id == run_id) { >> - return; >> + return true; >> } >> node->run_id = run_id; >> @@ -133,11 +140,13 @@ engine_run(struct engine_node *node, uint64_t >> run_id) >> if (!node->n_inputs) { >> node->run(node); >> VLOG_DBG("node: %s, changed: %d", node->name, node->changed); >> - return; >> + return true; >> } >> for (size_t i = 0; i < node->n_inputs; i++) { >> - engine_run(node->inputs[i].node, run_id); >> + if (!engine_run(node->inputs[i].node, run_id)) { >> + return false; >> + } >> } >> bool need_compute = false; >> @@ -160,6 +169,10 @@ engine_run(struct engine_node *node, uint64_t >> run_id) >> if (need_recompute) { >> VLOG_DBG("node: %s, recompute (%s)", node->name, >> engine_force_recompute ? "forced" : "triggered"); >> + if (engine_abort_recompute) { >> + VLOG_DBG("node: %s, recompute aborted", node->name); >> + return false; >> + } >> node->run(node); >> } else if (need_compute) { >> for (size_t i = 0; i < node->n_inputs; i++) { >> @@ -170,6 +183,10 @@ engine_run(struct engine_node *node, uint64_t >> run_id) >> VLOG_DBG("node: %s, can't handle change for >> input %s, " >> "fall back to recompute", >> node->name, node->inputs[i].node->name); >> + if (engine_abort_recompute) { >> + VLOG_DBG("node: %s, recompute aborted", >> node->name); >> + return false; >> + } >> node->run(node); >> break; >> } >> @@ -178,6 +195,7 @@ engine_run(struct engine_node *node, uint64_t run_id) >> } >> VLOG_DBG("node: %s, changed: %d", node->name, node->changed); >> + return true; >> } >> bool >> diff --git a/ovn/lib/inc-proc-eng.h b/ovn/lib/inc-proc-eng.h >> index aab899e..c3d7b5e 100644 >> --- a/ovn/lib/inc-proc-eng.h >> +++ b/ovn/lib/inc-proc-eng.h >> @@ -121,8 +121,9 @@ struct engine_node { >> void engine_init(struct engine_node *); >> /* Execute the processing recursively, which should be called in the >> main >> - * loop. */ >> -void engine_run(struct engine_node *, uint64_t run_id); >> + * loop. Returns true if the execution is compelte, false if it is >> aborted, >> + * which could happen when engine_abort_recompute is set. */ >> +bool engine_run(struct engine_node *, uint64_t run_id); >> /* Clean up the data for the engine nodes recursively. It calls each >> node's >> * cleanup() method if not NULL. It should be called before the program >> @@ -150,6 +151,10 @@ void engine_add_input(struct engine_node *node, >> struct engine_node *input, >> * iteration, and the change can't be tracked across iterations */ >> void engine_set_force_recompute(bool val); >> +/* Set the flag to cause engine execution to be aborted when there >> + * is any recompute to be triggered in any node. */ >> +void engine_set_abort_recompute(bool val); >> + >> const struct engine_context * engine_get_context(void); >> void engine_set_context(const struct engine_context *); >> >
Thanks Mark and Numan for the review. Ben, now this is merged to OVN repo. Could you help backporting it to 2.12? Thanks, Han On Thu, Jul 25, 2019 at 1:23 PM Mark Michelson <mmichels@redhat.com> wrote: > > Forgot the e-mail address after my name :) > > Acked-by: Mark Michelson <mmichels@redhat.com> > > On 7/25/19 4:16 PM, Mark Michelson wrote: > > Acked-by: Mark Michelson > > > > Thanks for the informative commit message! > > > > On 7/23/19 10:52 PM, Han Zhou wrote: > >> From: Han Zhou <hzhou8@ebay.com> > >> > >> When there are in-flight flow-installations pending to ovs-vswitchd, > >> current incremental processing logic prioritizes new change handling. > >> However, in scenarios where frequent recomputes are triggered, the > >> later recompute would block the flow-installation for previously > >> computed flows because recompute usually takes long time, especially > >> when there are large number of flows. This results in worse latency > >> than the version without incremental processing in specific scale > >> test scenarios. > >> > >> While we can simply fix the problem by prioritizing flow installation > >> rather than new change handling, it can cause the incremental > >> processing to degrade to always recompute in certain scenarios when > >> there are some changes triggering recomputes, followed by a lot of > >> continously coming changes that can be handled incrementally. Because > >> OVSDB change tracker cannot preserve changes across iterations, once > >> the recompute is triggered and resulted in a lot of pending messages > >> to ovs-vswitchd, and if we choose to skip the engine_run() > >> in the next iteration when a incrementally processible change comes, > >> we miss the opportunity to handle that tracked change and will have > >> to trigger recompute again in the next next iteration, and so on, if > >> such changes come continously. > >> > >> This patch solves the problem by introducing > >> engine_set_abort_recompute(), > >> so that we can prioritize new change handling if the change can be > >> incrementally processed, but if the change triggers recompute, we > >> abort there without spending CPU on the recompute to avoid blocking > >> the previous computed flow installation. > >> > >> Reported-by: Daniel Alvarez Sanchez <dalvarez@redhat.com> > >> Reported-by: Numan Siddique <nusiddiq@redhat.com> > >> Reported-at: > >> https://mail.openvswitch.org/pipermail/ovs-discuss/2019-June/048822.html > >> Tested-by: Numan Siddique <nusiddiq@redhat.com> > >> Signed-off-by: Han Zhou <hzhou8@ebay.com> > >> --- > >> ovn/controller/ofctrl.c | 2 +- > >> ovn/controller/ofctrl.h | 1 + > >> ovn/controller/ovn-controller.c | 30 +++++++++++++++++++++++++++--- > >> ovn/lib/inc-proc-eng.c | 26 ++++++++++++++++++++++---- > >> ovn/lib/inc-proc-eng.h | 9 +++++++-- > >> 5 files changed, 58 insertions(+), 10 deletions(-) > >> > >> diff --git a/ovn/controller/ofctrl.c b/ovn/controller/ofctrl.c > >> index 043abd6..0fcaa72 100644 > >> --- a/ovn/controller/ofctrl.c > >> +++ b/ovn/controller/ofctrl.c > >> @@ -985,7 +985,7 @@ add_meter(struct ovn_extend_table_info *m_desired, > >> * in the correct state and not backlogged with existing flow_mods. > >> (Our > >> * criteria for being backlogged appear very conservative, but the > >> socket > >> * between ovn-controller and OVS provides some buffering.) */ > >> -static bool > >> +bool > >> ofctrl_can_put(void) > >> { > >> if (state != S_UPDATE_FLOWS > >> diff --git a/ovn/controller/ofctrl.h b/ovn/controller/ofctrl.h > >> index ed8918a..2b21c11 100644 > >> --- a/ovn/controller/ofctrl.h > >> +++ b/ovn/controller/ofctrl.h > >> @@ -51,6 +51,7 @@ void ofctrl_put(struct ovn_desired_flow_table *, > >> const struct sbrec_meter_table *, > >> int64_t nb_cfg, > >> bool flow_changed); > >> +bool ofctrl_can_put(void); > >> void ofctrl_wait(void); > >> void ofctrl_destroy(void); > >> int64_t ofctrl_get_cur_cfg(void); > >> diff --git a/ovn/controller/ovn-controller.c > >> b/ovn/controller/ovn-controller.c > >> index c4883aa..866cc1c 100644 > >> --- a/ovn/controller/ovn-controller.c > >> +++ b/ovn/controller/ovn-controller.c > >> @@ -1869,6 +1869,7 @@ main(int argc, char *argv[]) > >> uint64_t engine_run_id = 0; > >> uint64_t old_engine_run_id = 0; > >> + bool engine_aborted = false; > >> unsigned int ovs_cond_seqno = UINT_MAX; > >> unsigned int ovnsb_cond_seqno = UINT_MAX; > >> @@ -1955,7 +1956,30 @@ main(int argc, char *argv[]) > >> stopwatch_start(CONTROLLER_LOOP_STOPWATCH_NAME, > >> time_msec()); > >> if (ovnsb_idl_txn) { > >> - engine_run(&en_flow_output, ++engine_run_id); > >> + if (!ofctrl_can_put()) { > >> + /* When there are in-flight messages > >> pending to > >> + * ovs-vswitchd, we should hold on > >> recomputing so > >> + * that the previous flow installations > >> won't be > >> + * delayed. However, we still want to > >> try if > >> + * recompute is not needed and we can > >> quickly > >> + * incrementally process the new changes, > >> to avoid > >> + * unnecessarily forced recomputes later > >> on. This > >> + * is because the OVSDB change tracker > >> cannot > >> + * preserve tracked changes across > >> iterations. If > >> + * change tracking is improved, we can > >> simply skip > >> + * this round of engine_run and continue > >> processing > >> + * acculated changes incrementally later > >> when > >> + * ofctrl_can_put() returns true. */ > >> + if (!engine_aborted) { > >> + engine_set_abort_recompute(true); > >> + engine_aborted = > >> engine_run(&en_flow_output, > >> + > >> ++engine_run_id); > >> + } > >> + } else { > >> + engine_set_abort_recompute(false); > >> + engine_aborted = false; > >> + engine_run(&en_flow_output, > >> ++engine_run_id); > >> + } > >> } > >> stopwatch_stop(CONTROLLER_LOOP_STOPWATCH_NAME, > >> time_msec()); > >> @@ -1993,8 +2017,8 @@ main(int argc, char *argv[]) > >> } > >> } > >> - if (old_engine_run_id == engine_run_id) { > >> - if (engine_need_run(&en_flow_output)) { > >> + if (old_engine_run_id == engine_run_id || engine_aborted) { > >> + if (engine_aborted || > >> engine_need_run(&en_flow_output)) { > >> VLOG_DBG("engine did not run, force recompute > >> next time: " > >> "br_int %p, chassis %p", br_int, chassis); > >> engine_set_force_recompute(true); > >> diff --git a/ovn/lib/inc-proc-eng.c b/ovn/lib/inc-proc-eng.c > >> index 1ddea1a..1064a08 100644 > >> --- a/ovn/lib/inc-proc-eng.c > >> +++ b/ovn/lib/inc-proc-eng.c > >> @@ -31,6 +31,7 @@ > >> VLOG_DEFINE_THIS_MODULE(inc_proc_eng); > >> static bool engine_force_recompute = false; > >> +static bool engine_abort_recompute = false; > >> static const struct engine_context *engine_context; > >> void > >> @@ -39,6 +40,12 @@ engine_set_force_recompute(bool val) > >> engine_force_recompute = val; > >> } > >> +void > >> +engine_set_abort_recompute(bool val) > >> +{ > >> + engine_abort_recompute = val; > >> +} > >> + > >> const struct engine_context * > >> engine_get_context(void) > >> { > >> @@ -121,11 +128,11 @@ engine_ovsdb_node_add_index(struct engine_node > >> *node, const char *name, > >> ed->n_indexes ++; > >> } > >> -void > >> +bool > >> engine_run(struct engine_node *node, uint64_t run_id) > >> { > >> if (node->run_id == run_id) { > >> - return; > >> + return true; > >> } > >> node->run_id = run_id; > >> @@ -133,11 +140,13 @@ engine_run(struct engine_node *node, uint64_t > >> run_id) > >> if (!node->n_inputs) { > >> node->run(node); > >> VLOG_DBG("node: %s, changed: %d", node->name, node->changed); > >> - return; > >> + return true; > >> } > >> for (size_t i = 0; i < node->n_inputs; i++) { > >> - engine_run(node->inputs[i].node, run_id); > >> + if (!engine_run(node->inputs[i].node, run_id)) { > >> + return false; > >> + } > >> } > >> bool need_compute = false; > >> @@ -160,6 +169,10 @@ engine_run(struct engine_node *node, uint64_t > >> run_id) > >> if (need_recompute) { > >> VLOG_DBG("node: %s, recompute (%s)", node->name, > >> engine_force_recompute ? "forced" : "triggered"); > >> + if (engine_abort_recompute) { > >> + VLOG_DBG("node: %s, recompute aborted", node->name); > >> + return false; > >> + } > >> node->run(node); > >> } else if (need_compute) { > >> for (size_t i = 0; i < node->n_inputs; i++) { > >> @@ -170,6 +183,10 @@ engine_run(struct engine_node *node, uint64_t > >> run_id) > >> VLOG_DBG("node: %s, can't handle change for > >> input %s, " > >> "fall back to recompute", > >> node->name, node->inputs[i].node->name); > >> + if (engine_abort_recompute) { > >> + VLOG_DBG("node: %s, recompute aborted", > >> node->name); > >> + return false; > >> + } > >> node->run(node); > >> break; > >> } > >> @@ -178,6 +195,7 @@ engine_run(struct engine_node *node, uint64_t run_id) > >> } > >> VLOG_DBG("node: %s, changed: %d", node->name, node->changed); > >> + return true; > >> } > >> bool > >> diff --git a/ovn/lib/inc-proc-eng.h b/ovn/lib/inc-proc-eng.h > >> index aab899e..c3d7b5e 100644 > >> --- a/ovn/lib/inc-proc-eng.h > >> +++ b/ovn/lib/inc-proc-eng.h > >> @@ -121,8 +121,9 @@ struct engine_node { > >> void engine_init(struct engine_node *); > >> /* Execute the processing recursively, which should be called in the > >> main > >> - * loop. */ > >> -void engine_run(struct engine_node *, uint64_t run_id); > >> + * loop. Returns true if the execution is compelte, false if it is > >> aborted, > >> + * which could happen when engine_abort_recompute is set. */ > >> +bool engine_run(struct engine_node *, uint64_t run_id); > >> /* Clean up the data for the engine nodes recursively. It calls each > >> node's > >> * cleanup() method if not NULL. It should be called before the program > >> @@ -150,6 +151,10 @@ void engine_add_input(struct engine_node *node, > >> struct engine_node *input, > >> * iteration, and the change can't be tracked across iterations */ > >> void engine_set_force_recompute(bool val); > >> +/* Set the flag to cause engine execution to be aborted when there > >> + * is any recompute to be triggered in any node. */ > >> +void engine_set_abort_recompute(bool val); > >> + > >> const struct engine_context * engine_get_context(void); > >> void engine_set_context(const struct engine_context *); > >> > > >
I resent it for backporting to 2.12 as a series followed by a fix from Dumitru, which has been merged in OVN repo, too. https://patchwork.ozlabs.org/project/openvswitch/list/?series=123382 Thanks Dumitru again for the fix. On Tue, Jul 30, 2019 at 12:16 AM Han Zhou <zhouhan@gmail.com> wrote: > Thanks Mark and Numan for the review. > Ben, now this is merged to OVN repo. Could you help backporting it to 2.12? > > Thanks, > Han > > > On Thu, Jul 25, 2019 at 1:23 PM Mark Michelson <mmichels@redhat.com> > wrote: > > > > Forgot the e-mail address after my name :) > > > > Acked-by: Mark Michelson <mmichels@redhat.com> > > > > On 7/25/19 4:16 PM, Mark Michelson wrote: > > > Acked-by: Mark Michelson > > > > > > Thanks for the informative commit message! > > > > > > On 7/23/19 10:52 PM, Han Zhou wrote: > > >> From: Han Zhou <hzhou8@ebay.com> > > >> > > >> When there are in-flight flow-installations pending to ovs-vswitchd, > > >> current incremental processing logic prioritizes new change handling. > > >> However, in scenarios where frequent recomputes are triggered, the > > >> later recompute would block the flow-installation for previously > > >> computed flows because recompute usually takes long time, especially > > >> when there are large number of flows. This results in worse latency > > >> than the version without incremental processing in specific scale > > >> test scenarios. > > >> > > >> While we can simply fix the problem by prioritizing flow installation > > >> rather than new change handling, it can cause the incremental > > >> processing to degrade to always recompute in certain scenarios when > > >> there are some changes triggering recomputes, followed by a lot of > > >> continously coming changes that can be handled incrementally. Because > > >> OVSDB change tracker cannot preserve changes across iterations, once > > >> the recompute is triggered and resulted in a lot of pending messages > > >> to ovs-vswitchd, and if we choose to skip the engine_run() > > >> in the next iteration when a incrementally processible change comes, > > >> we miss the opportunity to handle that tracked change and will have > > >> to trigger recompute again in the next next iteration, and so on, if > > >> such changes come continously. > > >> > > >> This patch solves the problem by introducing > > >> engine_set_abort_recompute(), > > >> so that we can prioritize new change handling if the change can be > > >> incrementally processed, but if the change triggers recompute, we > > >> abort there without spending CPU on the recompute to avoid blocking > > >> the previous computed flow installation. > > >> > > >> Reported-by: Daniel Alvarez Sanchez <dalvarez@redhat.com> > > >> Reported-by: Numan Siddique <nusiddiq@redhat.com> > > >> Reported-at: > > >> > https://mail.openvswitch.org/pipermail/ovs-discuss/2019-June/048822.html > > >> Tested-by: Numan Siddique <nusiddiq@redhat.com> > > >> Signed-off-by: Han Zhou <hzhou8@ebay.com> > > >> --- > > >> ovn/controller/ofctrl.c | 2 +- > > >> ovn/controller/ofctrl.h | 1 + > > >> ovn/controller/ovn-controller.c | 30 +++++++++++++++++++++++++++--- > > >> ovn/lib/inc-proc-eng.c | 26 ++++++++++++++++++++++---- > > >> ovn/lib/inc-proc-eng.h | 9 +++++++-- > > >> 5 files changed, 58 insertions(+), 10 deletions(-) > > >> > > >> diff --git a/ovn/controller/ofctrl.c b/ovn/controller/ofctrl.c > > >> index 043abd6..0fcaa72 100644 > > >> --- a/ovn/controller/ofctrl.c > > >> +++ b/ovn/controller/ofctrl.c > > >> @@ -985,7 +985,7 @@ add_meter(struct ovn_extend_table_info *m_desired, > > >> * in the correct state and not backlogged with existing flow_mods. > > >> (Our > > >> * criteria for being backlogged appear very conservative, but the > > >> socket > > >> * between ovn-controller and OVS provides some buffering.) */ > > >> -static bool > > >> +bool > > >> ofctrl_can_put(void) > > >> { > > >> if (state != S_UPDATE_FLOWS > > >> diff --git a/ovn/controller/ofctrl.h b/ovn/controller/ofctrl.h > > >> index ed8918a..2b21c11 100644 > > >> --- a/ovn/controller/ofctrl.h > > >> +++ b/ovn/controller/ofctrl.h > > >> @@ -51,6 +51,7 @@ void ofctrl_put(struct ovn_desired_flow_table *, > > >> const struct sbrec_meter_table *, > > >> int64_t nb_cfg, > > >> bool flow_changed); > > >> +bool ofctrl_can_put(void); > > >> void ofctrl_wait(void); > > >> void ofctrl_destroy(void); > > >> int64_t ofctrl_get_cur_cfg(void); > > >> diff --git a/ovn/controller/ovn-controller.c > > >> b/ovn/controller/ovn-controller.c > > >> index c4883aa..866cc1c 100644 > > >> --- a/ovn/controller/ovn-controller.c > > >> +++ b/ovn/controller/ovn-controller.c > > >> @@ -1869,6 +1869,7 @@ main(int argc, char *argv[]) > > >> uint64_t engine_run_id = 0; > > >> uint64_t old_engine_run_id = 0; > > >> + bool engine_aborted = false; > > >> unsigned int ovs_cond_seqno = UINT_MAX; > > >> unsigned int ovnsb_cond_seqno = UINT_MAX; > > >> @@ -1955,7 +1956,30 @@ main(int argc, char *argv[]) > > >> stopwatch_start(CONTROLLER_LOOP_STOPWATCH_NAME, > > >> time_msec()); > > >> if (ovnsb_idl_txn) { > > >> - engine_run(&en_flow_output, ++engine_run_id); > > >> + if (!ofctrl_can_put()) { > > >> + /* When there are in-flight messages > > >> pending to > > >> + * ovs-vswitchd, we should hold on > > >> recomputing so > > >> + * that the previous flow installations > > >> won't be > > >> + * delayed. However, we still want to > > >> try if > > >> + * recompute is not needed and we can > > >> quickly > > >> + * incrementally process the new changes, > > >> to avoid > > >> + * unnecessarily forced recomputes later > > >> on. This > > >> + * is because the OVSDB change tracker > > >> cannot > > >> + * preserve tracked changes across > > >> iterations. If > > >> + * change tracking is improved, we can > > >> simply skip > > >> + * this round of engine_run and continue > > >> processing > > >> + * acculated changes incrementally later > > >> when > > >> + * ofctrl_can_put() returns true. */ > > >> + if (!engine_aborted) { > > >> + engine_set_abort_recompute(true); > > >> + engine_aborted = > > >> engine_run(&en_flow_output, > > >> + > > >> ++engine_run_id); > > >> + } > > >> + } else { > > >> + engine_set_abort_recompute(false); > > >> + engine_aborted = false; > > >> + engine_run(&en_flow_output, > > >> ++engine_run_id); > > >> + } > > >> } > > >> stopwatch_stop(CONTROLLER_LOOP_STOPWATCH_NAME, > > >> time_msec()); > > >> @@ -1993,8 +2017,8 @@ main(int argc, char *argv[]) > > >> } > > >> } > > >> - if (old_engine_run_id == engine_run_id) { > > >> - if (engine_need_run(&en_flow_output)) { > > >> + if (old_engine_run_id == engine_run_id || > engine_aborted) { > > >> + if (engine_aborted || > > >> engine_need_run(&en_flow_output)) { > > >> VLOG_DBG("engine did not run, force recompute > > >> next time: " > > >> "br_int %p, chassis %p", br_int, > chassis); > > >> engine_set_force_recompute(true); > > >> diff --git a/ovn/lib/inc-proc-eng.c b/ovn/lib/inc-proc-eng.c > > >> index 1ddea1a..1064a08 100644 > > >> --- a/ovn/lib/inc-proc-eng.c > > >> +++ b/ovn/lib/inc-proc-eng.c > > >> @@ -31,6 +31,7 @@ > > >> VLOG_DEFINE_THIS_MODULE(inc_proc_eng); > > >> static bool engine_force_recompute = false; > > >> +static bool engine_abort_recompute = false; > > >> static const struct engine_context *engine_context; > > >> void > > >> @@ -39,6 +40,12 @@ engine_set_force_recompute(bool val) > > >> engine_force_recompute = val; > > >> } > > >> +void > > >> +engine_set_abort_recompute(bool val) > > >> +{ > > >> + engine_abort_recompute = val; > > >> +} > > >> + > > >> const struct engine_context * > > >> engine_get_context(void) > > >> { > > >> @@ -121,11 +128,11 @@ engine_ovsdb_node_add_index(struct engine_node > > >> *node, const char *name, > > >> ed->n_indexes ++; > > >> } > > >> -void > > >> +bool > > >> engine_run(struct engine_node *node, uint64_t run_id) > > >> { > > >> if (node->run_id == run_id) { > > >> - return; > > >> + return true; > > >> } > > >> node->run_id = run_id; > > >> @@ -133,11 +140,13 @@ engine_run(struct engine_node *node, uint64_t > > >> run_id) > > >> if (!node->n_inputs) { > > >> node->run(node); > > >> VLOG_DBG("node: %s, changed: %d", node->name, > node->changed); > > >> - return; > > >> + return true; > > >> } > > >> for (size_t i = 0; i < node->n_inputs; i++) { > > >> - engine_run(node->inputs[i].node, run_id); > > >> + if (!engine_run(node->inputs[i].node, run_id)) { > > >> + return false; > > >> + } > > >> } > > >> bool need_compute = false; > > >> @@ -160,6 +169,10 @@ engine_run(struct engine_node *node, uint64_t > > >> run_id) > > >> if (need_recompute) { > > >> VLOG_DBG("node: %s, recompute (%s)", node->name, > > >> engine_force_recompute ? "forced" : "triggered"); > > >> + if (engine_abort_recompute) { > > >> + VLOG_DBG("node: %s, recompute aborted", node->name); > > >> + return false; > > >> + } > > >> node->run(node); > > >> } else if (need_compute) { > > >> for (size_t i = 0; i < node->n_inputs; i++) { > > >> @@ -170,6 +183,10 @@ engine_run(struct engine_node *node, uint64_t > > >> run_id) > > >> VLOG_DBG("node: %s, can't handle change for > > >> input %s, " > > >> "fall back to recompute", > > >> node->name, > node->inputs[i].node->name); > > >> + if (engine_abort_recompute) { > > >> + VLOG_DBG("node: %s, recompute aborted", > > >> node->name); > > >> + return false; > > >> + } > > >> node->run(node); > > >> break; > > >> } > > >> @@ -178,6 +195,7 @@ engine_run(struct engine_node *node, uint64_t > run_id) > > >> } > > >> VLOG_DBG("node: %s, changed: %d", node->name, node->changed); > > >> + return true; > > >> } > > >> bool > > >> diff --git a/ovn/lib/inc-proc-eng.h b/ovn/lib/inc-proc-eng.h > > >> index aab899e..c3d7b5e 100644 > > >> --- a/ovn/lib/inc-proc-eng.h > > >> +++ b/ovn/lib/inc-proc-eng.h > > >> @@ -121,8 +121,9 @@ struct engine_node { > > >> void engine_init(struct engine_node *); > > >> /* Execute the processing recursively, which should be called in the > > >> main > > >> - * loop. */ > > >> -void engine_run(struct engine_node *, uint64_t run_id); > > >> + * loop. Returns true if the execution is compelte, false if it is > > >> aborted, > > >> + * which could happen when engine_abort_recompute is set. */ > > >> +bool engine_run(struct engine_node *, uint64_t run_id); > > >> /* Clean up the data for the engine nodes recursively. It calls each > > >> node's > > >> * cleanup() method if not NULL. It should be called before the > program > > >> @@ -150,6 +151,10 @@ void engine_add_input(struct engine_node *node, > > >> struct engine_node *input, > > >> * iteration, and the change can't be tracked across iterations */ > > >> void engine_set_force_recompute(bool val); > > >> +/* Set the flag to cause engine execution to be aborted when there > > >> + * is any recompute to be triggered in any node. */ > > >> +void engine_set_abort_recompute(bool val); > > >> + > > >> const struct engine_context * engine_get_context(void); > > >> void engine_set_context(const struct engine_context *); > > >> > > > > > >
diff --git a/ovn/controller/ofctrl.c b/ovn/controller/ofctrl.c index 043abd6..0fcaa72 100644 --- a/ovn/controller/ofctrl.c +++ b/ovn/controller/ofctrl.c @@ -985,7 +985,7 @@ add_meter(struct ovn_extend_table_info *m_desired, * in the correct state and not backlogged with existing flow_mods. (Our * criteria for being backlogged appear very conservative, but the socket * between ovn-controller and OVS provides some buffering.) */ -static bool +bool ofctrl_can_put(void) { if (state != S_UPDATE_FLOWS diff --git a/ovn/controller/ofctrl.h b/ovn/controller/ofctrl.h index ed8918a..2b21c11 100644 --- a/ovn/controller/ofctrl.h +++ b/ovn/controller/ofctrl.h @@ -51,6 +51,7 @@ void ofctrl_put(struct ovn_desired_flow_table *, const struct sbrec_meter_table *, int64_t nb_cfg, bool flow_changed); +bool ofctrl_can_put(void); void ofctrl_wait(void); void ofctrl_destroy(void); int64_t ofctrl_get_cur_cfg(void); diff --git a/ovn/controller/ovn-controller.c b/ovn/controller/ovn-controller.c index c4883aa..866cc1c 100644 --- a/ovn/controller/ovn-controller.c +++ b/ovn/controller/ovn-controller.c @@ -1869,6 +1869,7 @@ main(int argc, char *argv[]) uint64_t engine_run_id = 0; uint64_t old_engine_run_id = 0; + bool engine_aborted = false; unsigned int ovs_cond_seqno = UINT_MAX; unsigned int ovnsb_cond_seqno = UINT_MAX; @@ -1955,7 +1956,30 @@ main(int argc, char *argv[]) stopwatch_start(CONTROLLER_LOOP_STOPWATCH_NAME, time_msec()); if (ovnsb_idl_txn) { - engine_run(&en_flow_output, ++engine_run_id); + if (!ofctrl_can_put()) { + /* When there are in-flight messages pending to + * ovs-vswitchd, we should hold on recomputing so + * that the previous flow installations won't be + * delayed. However, we still want to try if + * recompute is not needed and we can quickly + * incrementally process the new changes, to avoid + * unnecessarily forced recomputes later on. This + * is because the OVSDB change tracker cannot + * preserve tracked changes across iterations. If + * change tracking is improved, we can simply skip + * this round of engine_run and continue processing + * acculated changes incrementally later when + * ofctrl_can_put() returns true. */ + if (!engine_aborted) { + engine_set_abort_recompute(true); + engine_aborted = engine_run(&en_flow_output, + ++engine_run_id); + } + } else { + engine_set_abort_recompute(false); + engine_aborted = false; + engine_run(&en_flow_output, ++engine_run_id); + } } stopwatch_stop(CONTROLLER_LOOP_STOPWATCH_NAME, time_msec()); @@ -1993,8 +2017,8 @@ main(int argc, char *argv[]) } } - if (old_engine_run_id == engine_run_id) { - if (engine_need_run(&en_flow_output)) { + if (old_engine_run_id == engine_run_id || engine_aborted) { + if (engine_aborted || engine_need_run(&en_flow_output)) { VLOG_DBG("engine did not run, force recompute next time: " "br_int %p, chassis %p", br_int, chassis); engine_set_force_recompute(true); diff --git a/ovn/lib/inc-proc-eng.c b/ovn/lib/inc-proc-eng.c index 1ddea1a..1064a08 100644 --- a/ovn/lib/inc-proc-eng.c +++ b/ovn/lib/inc-proc-eng.c @@ -31,6 +31,7 @@ VLOG_DEFINE_THIS_MODULE(inc_proc_eng); static bool engine_force_recompute = false; +static bool engine_abort_recompute = false; static const struct engine_context *engine_context; void @@ -39,6 +40,12 @@ engine_set_force_recompute(bool val) engine_force_recompute = val; } +void +engine_set_abort_recompute(bool val) +{ + engine_abort_recompute = val; +} + const struct engine_context * engine_get_context(void) { @@ -121,11 +128,11 @@ engine_ovsdb_node_add_index(struct engine_node *node, const char *name, ed->n_indexes ++; } -void +bool engine_run(struct engine_node *node, uint64_t run_id) { if (node->run_id == run_id) { - return; + return true; } node->run_id = run_id; @@ -133,11 +140,13 @@ engine_run(struct engine_node *node, uint64_t run_id) if (!node->n_inputs) { node->run(node); VLOG_DBG("node: %s, changed: %d", node->name, node->changed); - return; + return true; } for (size_t i = 0; i < node->n_inputs; i++) { - engine_run(node->inputs[i].node, run_id); + if (!engine_run(node->inputs[i].node, run_id)) { + return false; + } } bool need_compute = false; @@ -160,6 +169,10 @@ engine_run(struct engine_node *node, uint64_t run_id) if (need_recompute) { VLOG_DBG("node: %s, recompute (%s)", node->name, engine_force_recompute ? "forced" : "triggered"); + if (engine_abort_recompute) { + VLOG_DBG("node: %s, recompute aborted", node->name); + return false; + } node->run(node); } else if (need_compute) { for (size_t i = 0; i < node->n_inputs; i++) { @@ -170,6 +183,10 @@ engine_run(struct engine_node *node, uint64_t run_id) VLOG_DBG("node: %s, can't handle change for input %s, " "fall back to recompute", node->name, node->inputs[i].node->name); + if (engine_abort_recompute) { + VLOG_DBG("node: %s, recompute aborted", node->name); + return false; + } node->run(node); break; } @@ -178,6 +195,7 @@ engine_run(struct engine_node *node, uint64_t run_id) } VLOG_DBG("node: %s, changed: %d", node->name, node->changed); + return true; } bool diff --git a/ovn/lib/inc-proc-eng.h b/ovn/lib/inc-proc-eng.h index aab899e..c3d7b5e 100644 --- a/ovn/lib/inc-proc-eng.h +++ b/ovn/lib/inc-proc-eng.h @@ -121,8 +121,9 @@ struct engine_node { void engine_init(struct engine_node *); /* Execute the processing recursively, which should be called in the main - * loop. */ -void engine_run(struct engine_node *, uint64_t run_id); + * loop. Returns true if the execution is compelte, false if it is aborted, + * which could happen when engine_abort_recompute is set. */ +bool engine_run(struct engine_node *, uint64_t run_id); /* Clean up the data for the engine nodes recursively. It calls each node's * cleanup() method if not NULL. It should be called before the program @@ -150,6 +151,10 @@ void engine_add_input(struct engine_node *node, struct engine_node *input, * iteration, and the change can't be tracked across iterations */ void engine_set_force_recompute(bool val); +/* Set the flag to cause engine execution to be aborted when there + * is any recompute to be triggered in any node. */ +void engine_set_abort_recompute(bool val); + const struct engine_context * engine_get_context(void); void engine_set_context(const struct engine_context *);