Message ID | 20230621094753.150072-1-dceara@redhat.com |
---|---|
State | Accepted |
Headers | show |
Series | [ovs-dev,v2] ovn-controller: Reduce size of the SB monitor condition. | expand |
Context | Check | Description |
---|---|---|
ovsrobot/apply-robot | success | apply and check: success |
ovsrobot/github-robot-_Build_and_Test | success | github build: passed |
ovsrobot/github-robot-_ovn-kubernetes | fail | github build: failed |
On Wed, Jun 21, 2023 at 2:48 AM Dumitru Ceara <dceara@redhat.com> wrote: > > We don't need to explicitly add port bindings that were already bound > locally. We implicitly get those because we monitor the datapaths > they're attached to. > > When performing an ovn-heater 500-node density-heavy scale test [0], with > conditional monitoring enabled, the unreasonably long poll intervals on > the Southbound database (the ones that took more than one second) are > measured as: > > --------------------------------------------------------------------- > Count Min Max Median Mean 95 percentile > --------------------------------------------------------------------- > 56.0 1010.0 2664.0 1455.5 1544.9 2163.0 > 77.0 1015.0 3460.0 1896.0 1858.2 2907.8 > 69.0 1010.0 3118.0 1618.0 1688.1 2582.4 > --------------------------------------------------------------------- > 202.0 1010.0 3460.0 1610.0 1713.3 2711.5 > > Compared to the baseline results (also with conditional monitoring > enabled): > --------------------------------------------------------------------- > Count Min Max Median Mean 95 percentile > --------------------------------------------------------------------- > 141.0 1003.0 18338.0 1721.0 2625.4 7626.0 > 151.0 1019.0 80297.0 1808.0 3410.7 9089.0 > 165.0 1007.0 50736.0 2354.0 3067.7 7309.8 > --------------------------------------------------------------------- > 457.0 1003.0 80297.0 2131.0 3044.6 7799.6 > > We see significant improvement on the server side. This is explained > by the fact that now the Southbound database server doesn't need to > apply as many conditions as before when filtering individual monitor > contents. > > Note: Sub-ports - OVN switch ports with parent_port set - have to be > monitored unconditionally as we cannot efficiently determine their local > datapaths without including all local OVS interfaces in the monitor. > > [0] https://github.com/ovn-org/ovn-heater/blob/main/test-scenarios/ocp-500-density-heavy.yml > > Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=2139194 > Reported-by: Ilya Maximets <i.maximets@ovn.org> > Signed-off-by: Dumitru Ceara <dceara@redhat.com> > --- > V2: > - Addressed Han's comments: > - changed commit log: removed assumption about sub-ports. > - added ovn-nb documentation note about sub-ports monitoring. > - added ovn-controller documentation note about sub-ports monitoring. > - added TODO item. > - Added NEWS item. > --- > NEWS | 3 +++ > TODO.rst | 6 ++++++ > controller/binding.c | 2 +- > controller/binding.h | 2 +- > controller/ovn-controller.8.xml | 6 ++++++ > controller/ovn-controller.c | 19 ++++++++++++++++--- > ovn-nb.xml | 6 +++++- > 7 files changed, 38 insertions(+), 6 deletions(-) > > diff --git a/NEWS b/NEWS > index 482970eeeb..8275877f99 100644 > --- a/NEWS > +++ b/NEWS > @@ -7,6 +7,9 @@ Post v23.06.0 > DHCPv4 "hostname" (12) option. > - Support to create/update MAC_Binding when GARP received from VTEP (RAMP) > switch on l3dgw port. > + - To allow optimizing ovn-controller's monitor conditions for the regular > + VIF case, ovn-controller now unconditionally monitors all sub-ports > + (ports with parent_port set). > > OVN v23.06.0 - 01 Jun 2023 > -------------------------- > diff --git a/TODO.rst b/TODO.rst > index dff163e70b..b790a9fadf 100644 > --- a/TODO.rst > +++ b/TODO.rst > @@ -124,3 +124,9 @@ OVN To-do List > * Load Balancer templates > > * Support combining the VIP IP and port into a single template variable. > + > +* ovn-controller conditional monitoring > + > + * Improve sub-ports (with parent_port set) conditional monitoring; these > + are currently unconditionally monitored, even if ovn-monitor-all is > + set to false. > diff --git a/controller/binding.c b/controller/binding.c > index 486095ca79..359ad6d660 100644 > --- a/controller/binding.c > +++ b/controller/binding.c > @@ -785,7 +785,7 @@ local_binding_data_destroy(struct local_binding_data *lbinding_data) > } > > const struct sbrec_port_binding * > -local_binding_get_primary_pb(struct shash *local_bindings, > +local_binding_get_primary_pb(const struct shash *local_bindings, > const char *pb_name) > { > struct local_binding *lbinding = > diff --git a/controller/binding.h b/controller/binding.h > index 0e57f02ee5..319cbd263c 100644 > --- a/controller/binding.h > +++ b/controller/binding.h > @@ -150,7 +150,7 @@ void local_binding_data_init(struct local_binding_data *); > void local_binding_data_destroy(struct local_binding_data *); > > const struct sbrec_port_binding *local_binding_get_primary_pb( > - struct shash *local_bindings, const char *pb_name); > + const struct shash *local_bindings, const char *pb_name); > ofp_port_t local_binding_get_lport_ofport(const struct shash *local_bindings, > const char *pb_name); > > diff --git a/controller/ovn-controller.8.xml b/controller/ovn-controller.8.xml > index f61f430084..0883d8da9b 100644 > --- a/controller/ovn-controller.8.xml > +++ b/controller/ovn-controller.8.xml > @@ -128,6 +128,12 @@ > to <code>true</code> for environments that all workloads need to be > reachable from each other. > </p> > + <p> > + NOTE: for efficiency and scalability in common scenarios > + <code>ovn-controller</code> unconditionally monitors all sub-ports > + (ports with <code>parent_port</code> set) regardless of the > + <code>ovn-monitor-all</code> value. > + </p> > <p> > Default value is <var>false</var>. > </p> > diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c > index a474069798..96d01219e1 100644 > --- a/controller/ovn-controller.c > +++ b/controller/ovn-controller.c > @@ -199,6 +199,7 @@ static unsigned int > update_sb_monitors(struct ovsdb_idl *ovnsb_idl, > const struct sbrec_chassis *chassis, > const struct sset *local_ifaces, > + const struct shash *local_bindings, > struct hmap *local_datapaths, > bool monitor_all) > { > @@ -297,10 +298,21 @@ update_sb_monitors(struct ovsdb_idl *ovnsb_idl, > > if (local_ifaces) { > const char *name; > + > + ovs_assert(local_bindings); > SSET_FOR_EACH (name, local_ifaces) { > + /* Skip the VIFs we bound already, we should have a local datapath > + * for those. */ > + const struct sbrec_port_binding *local_pb > + = local_binding_get_primary_pb(local_bindings, name); > + if (local_pb && get_lport_type(local_pb) == LP_VIF) { > + continue; > + } > sbrec_port_binding_add_clause_logical_port(&pb, OVSDB_F_EQ, name); > - sbrec_port_binding_add_clause_parent_port(&pb, OVSDB_F_EQ, name); > } > + /* Monitor all sub-ports unconditionally; we don't expect a lot of > + * them in the SB database. */ > + sbrec_port_binding_add_clause_parent_port(&pb, OVSDB_F_NE, NULL); > } > if (local_datapaths) { > const struct local_datapath *ld; > @@ -609,7 +621,7 @@ update_sb_db(struct ovsdb_idl *ovs_idl, struct ovsdb_idl *ovnsb_idl, > * extra cost. Instead, it is called after the engine execution only > * when it is necessary. */ > unsigned int next_cond_seqno = > - update_sb_monitors(ovnsb_idl, NULL, NULL, NULL, true); > + update_sb_monitors(ovnsb_idl, NULL, NULL, NULL, NULL, true); > if (sb_cond_seqno) { > *sb_cond_seqno = next_cond_seqno; > } > @@ -4712,7 +4724,7 @@ main(int argc, char *argv[]) > ovsdb_idl_omit(ovnsb_idl_loop.idl, > &sbrec_chassis_private_col_external_ids); > > - update_sb_monitors(ovnsb_idl_loop.idl, NULL, NULL, NULL, false); > + update_sb_monitors(ovnsb_idl_loop.idl, NULL, NULL, NULL, NULL, false); > > stopwatch_create(CONTROLLER_LOOP_STOPWATCH_NAME, SW_MS); > stopwatch_create(OFCTRL_PUT_STOPWATCH_NAME, SW_MS); > @@ -5355,6 +5367,7 @@ main(int argc, char *argv[]) > update_sb_monitors( > ovnsb_idl_loop.idl, chassis, > &runtime_data->local_lports, > + &runtime_data->lbinding_data.bindings, > &runtime_data->local_datapaths, > sb_monitor_all); > } > diff --git a/ovn-nb.xml b/ovn-nb.xml > index cf9c92009d..1fb9fa10ee 100644 > --- a/ovn-nb.xml > +++ b/ovn-nb.xml > @@ -1302,7 +1302,11 @@ > <column name="parent_name"> > The VM interface through which the nested container sends its network > traffic. This must match the <ref column="name"/> column for some > - other <ref table="Logical_Switch_Port"/>. > + other <ref table="Logical_Switch_Port"/>. Note: for performance > + reasons, unlike for regular VIFs, <code>ovn-controller</code> will It may be better to say "for performance of OVN Southbound database conditional monitoring ...", otherwise the user might be confused. > + register to get updates about all OVN Southbound database > + <ref db="OVN_Southbound" table="Port_Binding"/> table records > + that correspond to nested container ports. Better to add: "..., even if 'ovn-monitor-all' is set to false. See ovn-controlller (8)." or something similar. Please feel free to adjust the words and merge. Acked-by: Han Zhou <hzhou@ovn.org> > </column> > > <column name="tag_request"> > -- > 2.31.1 >
On 6/21/23 16:50, Han Zhou wrote: > On Wed, Jun 21, 2023 at 2:48 AM Dumitru Ceara <dceara@redhat.com> wrote: >> >> We don't need to explicitly add port bindings that were already bound >> locally. We implicitly get those because we monitor the datapaths >> they're attached to. >> >> When performing an ovn-heater 500-node density-heavy scale test [0], with >> conditional monitoring enabled, the unreasonably long poll intervals on >> the Southbound database (the ones that took more than one second) are >> measured as: >> >> --------------------------------------------------------------------- >> Count Min Max Median Mean 95 percentile >> --------------------------------------------------------------------- >> 56.0 1010.0 2664.0 1455.5 1544.9 2163.0 >> 77.0 1015.0 3460.0 1896.0 1858.2 2907.8 >> 69.0 1010.0 3118.0 1618.0 1688.1 2582.4 >> --------------------------------------------------------------------- >> 202.0 1010.0 3460.0 1610.0 1713.3 2711.5 >> >> Compared to the baseline results (also with conditional monitoring >> enabled): >> --------------------------------------------------------------------- >> Count Min Max Median Mean 95 percentile >> --------------------------------------------------------------------- >> 141.0 1003.0 18338.0 1721.0 2625.4 7626.0 >> 151.0 1019.0 80297.0 1808.0 3410.7 9089.0 >> 165.0 1007.0 50736.0 2354.0 3067.7 7309.8 >> --------------------------------------------------------------------- >> 457.0 1003.0 80297.0 2131.0 3044.6 7799.6 >> >> We see significant improvement on the server side. This is explained >> by the fact that now the Southbound database server doesn't need to >> apply as many conditions as before when filtering individual monitor >> contents. >> >> Note: Sub-ports - OVN switch ports with parent_port set - have to be >> monitored unconditionally as we cannot efficiently determine their local >> datapaths without including all local OVS interfaces in the monitor. >> >> [0] > https://github.com/ovn-org/ovn-heater/blob/main/test-scenarios/ocp-500-density-heavy.yml >> >> Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=2139194 >> Reported-by: Ilya Maximets <i.maximets@ovn.org> >> Signed-off-by: Dumitru Ceara <dceara@redhat.com> >> --- >> V2: >> - Addressed Han's comments: >> - changed commit log: removed assumption about sub-ports. >> - added ovn-nb documentation note about sub-ports monitoring. >> - added ovn-controller documentation note about sub-ports monitoring. >> - added TODO item. >> - Added NEWS item. >> --- >> NEWS | 3 +++ >> TODO.rst | 6 ++++++ >> controller/binding.c | 2 +- >> controller/binding.h | 2 +- >> controller/ovn-controller.8.xml | 6 ++++++ >> controller/ovn-controller.c | 19 ++++++++++++++++--- >> ovn-nb.xml | 6 +++++- >> 7 files changed, 38 insertions(+), 6 deletions(-) >> >> diff --git a/NEWS b/NEWS >> index 482970eeeb..8275877f99 100644 >> --- a/NEWS >> +++ b/NEWS >> @@ -7,6 +7,9 @@ Post v23.06.0 >> DHCPv4 "hostname" (12) option. >> - Support to create/update MAC_Binding when GARP received from VTEP > (RAMP) >> switch on l3dgw port. >> + - To allow optimizing ovn-controller's monitor conditions for the > regular >> + VIF case, ovn-controller now unconditionally monitors all sub-ports >> + (ports with parent_port set). >> >> OVN v23.06.0 - 01 Jun 2023 >> -------------------------- >> diff --git a/TODO.rst b/TODO.rst >> index dff163e70b..b790a9fadf 100644 >> --- a/TODO.rst >> +++ b/TODO.rst >> @@ -124,3 +124,9 @@ OVN To-do List >> * Load Balancer templates >> >> * Support combining the VIP IP and port into a single template > variable. >> + >> +* ovn-controller conditional monitoring >> + >> + * Improve sub-ports (with parent_port set) conditional monitoring; > these >> + are currently unconditionally monitored, even if ovn-monitor-all is >> + set to false. >> diff --git a/controller/binding.c b/controller/binding.c >> index 486095ca79..359ad6d660 100644 >> --- a/controller/binding.c >> +++ b/controller/binding.c >> @@ -785,7 +785,7 @@ local_binding_data_destroy(struct local_binding_data > *lbinding_data) >> } >> >> const struct sbrec_port_binding * >> -local_binding_get_primary_pb(struct shash *local_bindings, >> +local_binding_get_primary_pb(const struct shash *local_bindings, >> const char *pb_name) >> { >> struct local_binding *lbinding = >> diff --git a/controller/binding.h b/controller/binding.h >> index 0e57f02ee5..319cbd263c 100644 >> --- a/controller/binding.h >> +++ b/controller/binding.h >> @@ -150,7 +150,7 @@ void local_binding_data_init(struct > local_binding_data *); >> void local_binding_data_destroy(struct local_binding_data *); >> >> const struct sbrec_port_binding *local_binding_get_primary_pb( >> - struct shash *local_bindings, const char *pb_name); >> + const struct shash *local_bindings, const char *pb_name); >> ofp_port_t local_binding_get_lport_ofport(const struct shash > *local_bindings, >> const char *pb_name); >> >> diff --git a/controller/ovn-controller.8.xml > b/controller/ovn-controller.8.xml >> index f61f430084..0883d8da9b 100644 >> --- a/controller/ovn-controller.8.xml >> +++ b/controller/ovn-controller.8.xml >> @@ -128,6 +128,12 @@ >> to <code>true</code> for environments that all workloads need > to be >> reachable from each other. >> </p> >> + <p> >> + NOTE: for efficiency and scalability in common scenarios >> + <code>ovn-controller</code> unconditionally monitors all > sub-ports >> + (ports with <code>parent_port</code> set) regardless of the >> + <code>ovn-monitor-all</code> value. >> + </p> >> <p> >> Default value is <var>false</var>. >> </p> >> diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c >> index a474069798..96d01219e1 100644 >> --- a/controller/ovn-controller.c >> +++ b/controller/ovn-controller.c >> @@ -199,6 +199,7 @@ static unsigned int >> update_sb_monitors(struct ovsdb_idl *ovnsb_idl, >> const struct sbrec_chassis *chassis, >> const struct sset *local_ifaces, >> + const struct shash *local_bindings, >> struct hmap *local_datapaths, >> bool monitor_all) >> { >> @@ -297,10 +298,21 @@ update_sb_monitors(struct ovsdb_idl *ovnsb_idl, >> >> if (local_ifaces) { >> const char *name; >> + >> + ovs_assert(local_bindings); >> SSET_FOR_EACH (name, local_ifaces) { >> + /* Skip the VIFs we bound already, we should have a local > datapath >> + * for those. */ >> + const struct sbrec_port_binding *local_pb >> + = local_binding_get_primary_pb(local_bindings, name); >> + if (local_pb && get_lport_type(local_pb) == LP_VIF) { >> + continue; >> + } >> sbrec_port_binding_add_clause_logical_port(&pb, OVSDB_F_EQ, > name); >> - sbrec_port_binding_add_clause_parent_port(&pb, OVSDB_F_EQ, > name); >> } >> + /* Monitor all sub-ports unconditionally; we don't expect a lot > of >> + * them in the SB database. */ >> + sbrec_port_binding_add_clause_parent_port(&pb, OVSDB_F_NE, NULL); >> } >> if (local_datapaths) { >> const struct local_datapath *ld; >> @@ -609,7 +621,7 @@ update_sb_db(struct ovsdb_idl *ovs_idl, struct > ovsdb_idl *ovnsb_idl, >> * extra cost. Instead, it is called after the engine execution > only >> * when it is necessary. */ >> unsigned int next_cond_seqno = >> - update_sb_monitors(ovnsb_idl, NULL, NULL, NULL, true); >> + update_sb_monitors(ovnsb_idl, NULL, NULL, NULL, NULL, true); >> if (sb_cond_seqno) { >> *sb_cond_seqno = next_cond_seqno; >> } >> @@ -4712,7 +4724,7 @@ main(int argc, char *argv[]) >> ovsdb_idl_omit(ovnsb_idl_loop.idl, >> &sbrec_chassis_private_col_external_ids); >> >> - update_sb_monitors(ovnsb_idl_loop.idl, NULL, NULL, NULL, false); >> + update_sb_monitors(ovnsb_idl_loop.idl, NULL, NULL, NULL, NULL, > false); >> >> stopwatch_create(CONTROLLER_LOOP_STOPWATCH_NAME, SW_MS); >> stopwatch_create(OFCTRL_PUT_STOPWATCH_NAME, SW_MS); >> @@ -5355,6 +5367,7 @@ main(int argc, char *argv[]) >> update_sb_monitors( >> ovnsb_idl_loop.idl, chassis, >> &runtime_data->local_lports, >> + > &runtime_data->lbinding_data.bindings, >> &runtime_data->local_datapaths, >> sb_monitor_all); >> } >> diff --git a/ovn-nb.xml b/ovn-nb.xml >> index cf9c92009d..1fb9fa10ee 100644 >> --- a/ovn-nb.xml >> +++ b/ovn-nb.xml >> @@ -1302,7 +1302,11 @@ >> <column name="parent_name"> >> The VM interface through which the nested container sends its > network >> traffic. This must match the <ref column="name"/> column for > some >> - other <ref table="Logical_Switch_Port"/>. >> + other <ref table="Logical_Switch_Port"/>. Note: for performance >> + reasons, unlike for regular VIFs, <code>ovn-controller</code> > will > > It may be better to say "for performance of OVN Southbound database > conditional monitoring ...", otherwise the user might be confused. > >> + register to get updates about all OVN Southbound database >> + <ref db="OVN_Southbound" table="Port_Binding"/> table records >> + that correspond to nested container ports. > > Better to add: "..., even if 'ovn-monitor-all' is set to false. See > ovn-controlller (8)." or something similar. > > Please feel free to adjust the words and merge. > > Acked-by: Han Zhou <hzhou@ovn.org> > Done, thanks again for the review! Regards, Dumitru
On Thu, Jun 22, 2023 at 1:12 AM Dumitru Ceara <dceara@redhat.com> wrote: > > On 6/21/23 16:50, Han Zhou wrote: > > On Wed, Jun 21, 2023 at 2:48 AM Dumitru Ceara <dceara@redhat.com> wrote: > >> > >> We don't need to explicitly add port bindings that were already bound > >> locally. We implicitly get those because we monitor the datapaths > >> they're attached to. > >> > >> When performing an ovn-heater 500-node density-heavy scale test [0], with > >> conditional monitoring enabled, the unreasonably long poll intervals on > >> the Southbound database (the ones that took more than one second) are > >> measured as: > >> > >> --------------------------------------------------------------------- > >> Count Min Max Median Mean 95 percentile > >> --------------------------------------------------------------------- > >> 56.0 1010.0 2664.0 1455.5 1544.9 2163.0 > >> 77.0 1015.0 3460.0 1896.0 1858.2 2907.8 > >> 69.0 1010.0 3118.0 1618.0 1688.1 2582.4 > >> --------------------------------------------------------------------- > >> 202.0 1010.0 3460.0 1610.0 1713.3 2711.5 > >> > >> Compared to the baseline results (also with conditional monitoring > >> enabled): > >> --------------------------------------------------------------------- > >> Count Min Max Median Mean 95 percentile > >> --------------------------------------------------------------------- > >> 141.0 1003.0 18338.0 1721.0 2625.4 7626.0 > >> 151.0 1019.0 80297.0 1808.0 3410.7 9089.0 > >> 165.0 1007.0 50736.0 2354.0 3067.7 7309.8 > >> --------------------------------------------------------------------- > >> 457.0 1003.0 80297.0 2131.0 3044.6 7799.6 > >> > >> We see significant improvement on the server side. This is explained > >> by the fact that now the Southbound database server doesn't need to > >> apply as many conditions as before when filtering individual monitor > >> contents. > >> > >> Note: Sub-ports - OVN switch ports with parent_port set - have to be > >> monitored unconditionally as we cannot efficiently determine their local > >> datapaths without including all local OVS interfaces in the monitor. > >> > >> [0] > > https://github.com/ovn-org/ovn-heater/blob/main/test-scenarios/ocp-500-density-heavy.yml > >> > >> Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=2139194 > >> Reported-by: Ilya Maximets <i.maximets@ovn.org> > >> Signed-off-by: Dumitru Ceara <dceara@redhat.com> > >> --- > >> V2: > >> - Addressed Han's comments: > >> - changed commit log: removed assumption about sub-ports. > >> - added ovn-nb documentation note about sub-ports monitoring. > >> - added ovn-controller documentation note about sub-ports monitoring. > >> - added TODO item. > >> - Added NEWS item. > >> --- > >> NEWS | 3 +++ > >> TODO.rst | 6 ++++++ > >> controller/binding.c | 2 +- > >> controller/binding.h | 2 +- > >> controller/ovn-controller.8.xml | 6 ++++++ > >> controller/ovn-controller.c | 19 ++++++++++++++++--- > >> ovn-nb.xml | 6 +++++- > >> 7 files changed, 38 insertions(+), 6 deletions(-) > >> > >> diff --git a/NEWS b/NEWS > >> index 482970eeeb..8275877f99 100644 > >> --- a/NEWS > >> +++ b/NEWS > >> @@ -7,6 +7,9 @@ Post v23.06.0 > >> DHCPv4 "hostname" (12) option. > >> - Support to create/update MAC_Binding when GARP received from VTEP > > (RAMP) > >> switch on l3dgw port. > >> + - To allow optimizing ovn-controller's monitor conditions for the > > regular > >> + VIF case, ovn-controller now unconditionally monitors all sub-ports > >> + (ports with parent_port set). > >> > >> OVN v23.06.0 - 01 Jun 2023 > >> -------------------------- > >> diff --git a/TODO.rst b/TODO.rst > >> index dff163e70b..b790a9fadf 100644 > >> --- a/TODO.rst > >> +++ b/TODO.rst > >> @@ -124,3 +124,9 @@ OVN To-do List > >> * Load Balancer templates > >> > >> * Support combining the VIP IP and port into a single template > > variable. > >> + > >> +* ovn-controller conditional monitoring > >> + > >> + * Improve sub-ports (with parent_port set) conditional monitoring; > > these > >> + are currently unconditionally monitored, even if ovn-monitor-all is > >> + set to false. > >> diff --git a/controller/binding.c b/controller/binding.c > >> index 486095ca79..359ad6d660 100644 > >> --- a/controller/binding.c > >> +++ b/controller/binding.c > >> @@ -785,7 +785,7 @@ local_binding_data_destroy(struct local_binding_data > > *lbinding_data) > >> } > >> > >> const struct sbrec_port_binding * > >> -local_binding_get_primary_pb(struct shash *local_bindings, > >> +local_binding_get_primary_pb(const struct shash *local_bindings, > >> const char *pb_name) > >> { > >> struct local_binding *lbinding = > >> diff --git a/controller/binding.h b/controller/binding.h > >> index 0e57f02ee5..319cbd263c 100644 > >> --- a/controller/binding.h > >> +++ b/controller/binding.h > >> @@ -150,7 +150,7 @@ void local_binding_data_init(struct > > local_binding_data *); > >> void local_binding_data_destroy(struct local_binding_data *); > >> > >> const struct sbrec_port_binding *local_binding_get_primary_pb( > >> - struct shash *local_bindings, const char *pb_name); > >> + const struct shash *local_bindings, const char *pb_name); > >> ofp_port_t local_binding_get_lport_ofport(const struct shash > > *local_bindings, > >> const char *pb_name); > >> > >> diff --git a/controller/ovn-controller.8.xml > > b/controller/ovn-controller.8.xml > >> index f61f430084..0883d8da9b 100644 > >> --- a/controller/ovn-controller.8.xml > >> +++ b/controller/ovn-controller.8.xml > >> @@ -128,6 +128,12 @@ > >> to <code>true</code> for environments that all workloads need > > to be > >> reachable from each other. > >> </p> > >> + <p> > >> + NOTE: for efficiency and scalability in common scenarios > >> + <code>ovn-controller</code> unconditionally monitors all > > sub-ports > >> + (ports with <code>parent_port</code> set) regardless of the > >> + <code>ovn-monitor-all</code> value. > >> + </p> > >> <p> > >> Default value is <var>false</var>. > >> </p> > >> diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c > >> index a474069798..96d01219e1 100644 > >> --- a/controller/ovn-controller.c > >> +++ b/controller/ovn-controller.c > >> @@ -199,6 +199,7 @@ static unsigned int > >> update_sb_monitors(struct ovsdb_idl *ovnsb_idl, > >> const struct sbrec_chassis *chassis, > >> const struct sset *local_ifaces, > >> + const struct shash *local_bindings, > >> struct hmap *local_datapaths, > >> bool monitor_all) > >> { > >> @@ -297,10 +298,21 @@ update_sb_monitors(struct ovsdb_idl *ovnsb_idl, > >> > >> if (local_ifaces) { > >> const char *name; > >> + > >> + ovs_assert(local_bindings); > >> SSET_FOR_EACH (name, local_ifaces) { > >> + /* Skip the VIFs we bound already, we should have a local > > datapath > >> + * for those. */ > >> + const struct sbrec_port_binding *local_pb > >> + = local_binding_get_primary_pb(local_bindings, name); > >> + if (local_pb && get_lport_type(local_pb) == LP_VIF) { > >> + continue; > >> + } > >> sbrec_port_binding_add_clause_logical_port(&pb, OVSDB_F_EQ, > > name); > >> - sbrec_port_binding_add_clause_parent_port(&pb, OVSDB_F_EQ, > > name); > >> } > >> + /* Monitor all sub-ports unconditionally; we don't expect a lot > > of > >> + * them in the SB database. */ > >> + sbrec_port_binding_add_clause_parent_port(&pb, OVSDB_F_NE, NULL); > >> } > >> if (local_datapaths) { > >> const struct local_datapath *ld; > >> @@ -609,7 +621,7 @@ update_sb_db(struct ovsdb_idl *ovs_idl, struct > > ovsdb_idl *ovnsb_idl, > >> * extra cost. Instead, it is called after the engine execution > > only > >> * when it is necessary. */ > >> unsigned int next_cond_seqno = > >> - update_sb_monitors(ovnsb_idl, NULL, NULL, NULL, true); > >> + update_sb_monitors(ovnsb_idl, NULL, NULL, NULL, NULL, true); > >> if (sb_cond_seqno) { > >> *sb_cond_seqno = next_cond_seqno; > >> } > >> @@ -4712,7 +4724,7 @@ main(int argc, char *argv[]) > >> ovsdb_idl_omit(ovnsb_idl_loop.idl, > >> &sbrec_chassis_private_col_external_ids); > >> > >> - update_sb_monitors(ovnsb_idl_loop.idl, NULL, NULL, NULL, false); > >> + update_sb_monitors(ovnsb_idl_loop.idl, NULL, NULL, NULL, NULL, > > false); > >> > >> stopwatch_create(CONTROLLER_LOOP_STOPWATCH_NAME, SW_MS); > >> stopwatch_create(OFCTRL_PUT_STOPWATCH_NAME, SW_MS); > >> @@ -5355,6 +5367,7 @@ main(int argc, char *argv[]) > >> update_sb_monitors( > >> ovnsb_idl_loop.idl, chassis, > >> &runtime_data->local_lports, > >> + > > &runtime_data->lbinding_data.bindings, > >> &runtime_data->local_datapaths, > >> sb_monitor_all); > >> } > >> diff --git a/ovn-nb.xml b/ovn-nb.xml > >> index cf9c92009d..1fb9fa10ee 100644 > >> --- a/ovn-nb.xml > >> +++ b/ovn-nb.xml > >> @@ -1302,7 +1302,11 @@ > >> <column name="parent_name"> > >> The VM interface through which the nested container sends its > > network > >> traffic. This must match the <ref column="name"/> column for > > some > >> - other <ref table="Logical_Switch_Port"/>. > >> + other <ref table="Logical_Switch_Port"/>. Note: for performance > >> + reasons, unlike for regular VIFs, <code>ovn-controller</code> > > will > > > > It may be better to say "for performance of OVN Southbound database > > conditional monitoring ...", otherwise the user might be confused. > > > >> + register to get updates about all OVN Southbound database > >> + <ref db="OVN_Southbound" table="Port_Binding"/> table records > >> + that correspond to nested container ports. > > > > Better to add: "..., even if 'ovn-monitor-all' is set to false. See > > ovn-controlller (8)." or something similar. > > > > Please feel free to adjust the words and merge. > > > > Acked-by: Han Zhou <hzhou@ovn.org> > > > > Done, thanks again for the review! > > Regards, > Dumitru > Hi Dumitru, I noticed that with this patch, the ovn-controller port-binding latency increased a lot. With the simulated k8s scale test [0] of 500 chassis x 50 LSPs, the latency of a single LSP add/delete is almost tripled: Command: ovn-nbctl --wait=hv --print-wait-time lsp-add ls_1_0001 lsp_1_0001_01 -- lsp-set-addresses lsp_1_0001_01 "ff:f1:bb:00:01:01 1.0.1.101" -- lsp-set-port-security lsp_1_0001_01 "ff:f1:bb:00:01:01 1.0.1.101" Before: Time spent on processing nb_cfg 3: ovn-northd delay before processing: 2ms ovn-northd completion: 42ms ovn-controller(s) completion: 68ms (68 - 42 = 26ms) After: Time spent on processing nb_cfg 3: ovn-northd delay before processing: 1ms ovn-northd completion: 38ms ovn-controller(s) completion: 112m (112 - 38 = 74ms) I think it may be because each port bind/unbind now introduces a SB DB monitor condition change, which may have caused some delay. But I didn't expect the impact to be that high because this test has only one single ovn-controller (all the other 499 chassis are fake). I haven't spent time to dig deeper yet. Even though, it is still <100 ms, so I am not too worried about it. But just to share the observation and raise the awareness. [0] https://github.com/hzhou8/ovn-test-script Thanks, Han
On 6/28/23 04:54, Han Zhou wrote: > On Thu, Jun 22, 2023 at 1:12 AM Dumitru Ceara <dceara@redhat.com> wrote: >> >> On 6/21/23 16:50, Han Zhou wrote: >>> On Wed, Jun 21, 2023 at 2:48 AM Dumitru Ceara <dceara@redhat.com> wrote: >>>> >>>> We don't need to explicitly add port bindings that were already bound >>>> locally. We implicitly get those because we monitor the datapaths >>>> they're attached to. >>>> >>>> When performing an ovn-heater 500-node density-heavy scale test [0], > with >>>> conditional monitoring enabled, the unreasonably long poll intervals on >>>> the Southbound database (the ones that took more than one second) are >>>> measured as: >>>> >>>> --------------------------------------------------------------------- >>>> Count Min Max Median Mean 95 percentile >>>> --------------------------------------------------------------------- >>>> 56.0 1010.0 2664.0 1455.5 1544.9 2163.0 >>>> 77.0 1015.0 3460.0 1896.0 1858.2 2907.8 >>>> 69.0 1010.0 3118.0 1618.0 1688.1 2582.4 >>>> --------------------------------------------------------------------- >>>> 202.0 1010.0 3460.0 1610.0 1713.3 2711.5 >>>> >>>> Compared to the baseline results (also with conditional monitoring >>>> enabled): >>>> --------------------------------------------------------------------- >>>> Count Min Max Median Mean 95 percentile >>>> --------------------------------------------------------------------- >>>> 141.0 1003.0 18338.0 1721.0 2625.4 7626.0 >>>> 151.0 1019.0 80297.0 1808.0 3410.7 9089.0 >>>> 165.0 1007.0 50736.0 2354.0 3067.7 7309.8 >>>> --------------------------------------------------------------------- >>>> 457.0 1003.0 80297.0 2131.0 3044.6 7799.6 >>>> >>>> We see significant improvement on the server side. This is explained >>>> by the fact that now the Southbound database server doesn't need to >>>> apply as many conditions as before when filtering individual monitor >>>> contents. >>>> >>>> Note: Sub-ports - OVN switch ports with parent_port set - have to be >>>> monitored unconditionally as we cannot efficiently determine their > local >>>> datapaths without including all local OVS interfaces in the monitor. >>>> >>>> [0] >>> > https://github.com/ovn-org/ovn-heater/blob/main/test-scenarios/ocp-500-density-heavy.yml >>>> >>>> Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=2139194 >>>> Reported-by: Ilya Maximets <i.maximets@ovn.org> >>>> Signed-off-by: Dumitru Ceara <dceara@redhat.com> >>>> --- >>>> V2: >>>> - Addressed Han's comments: >>>> - changed commit log: removed assumption about sub-ports. >>>> - added ovn-nb documentation note about sub-ports monitoring. >>>> - added ovn-controller documentation note about sub-ports monitoring. >>>> - added TODO item. >>>> - Added NEWS item. >>>> --- >>>> NEWS | 3 +++ >>>> TODO.rst | 6 ++++++ >>>> controller/binding.c | 2 +- >>>> controller/binding.h | 2 +- >>>> controller/ovn-controller.8.xml | 6 ++++++ >>>> controller/ovn-controller.c | 19 ++++++++++++++++--- >>>> ovn-nb.xml | 6 +++++- >>>> 7 files changed, 38 insertions(+), 6 deletions(-) >>>> >>>> diff --git a/NEWS b/NEWS >>>> index 482970eeeb..8275877f99 100644 >>>> --- a/NEWS >>>> +++ b/NEWS >>>> @@ -7,6 +7,9 @@ Post v23.06.0 >>>> DHCPv4 "hostname" (12) option. >>>> - Support to create/update MAC_Binding when GARP received from VTEP >>> (RAMP) >>>> switch on l3dgw port. >>>> + - To allow optimizing ovn-controller's monitor conditions for the >>> regular >>>> + VIF case, ovn-controller now unconditionally monitors all > sub-ports >>>> + (ports with parent_port set). >>>> >>>> OVN v23.06.0 - 01 Jun 2023 >>>> -------------------------- >>>> diff --git a/TODO.rst b/TODO.rst >>>> index dff163e70b..b790a9fadf 100644 >>>> --- a/TODO.rst >>>> +++ b/TODO.rst >>>> @@ -124,3 +124,9 @@ OVN To-do List >>>> * Load Balancer templates >>>> >>>> * Support combining the VIP IP and port into a single template >>> variable. >>>> + >>>> +* ovn-controller conditional monitoring >>>> + >>>> + * Improve sub-ports (with parent_port set) conditional monitoring; >>> these >>>> + are currently unconditionally monitored, even if ovn-monitor-all > is >>>> + set to false. >>>> diff --git a/controller/binding.c b/controller/binding.c >>>> index 486095ca79..359ad6d660 100644 >>>> --- a/controller/binding.c >>>> +++ b/controller/binding.c >>>> @@ -785,7 +785,7 @@ local_binding_data_destroy(struct > local_binding_data >>> *lbinding_data) >>>> } >>>> >>>> const struct sbrec_port_binding * >>>> -local_binding_get_primary_pb(struct shash *local_bindings, >>>> +local_binding_get_primary_pb(const struct shash *local_bindings, >>>> const char *pb_name) >>>> { >>>> struct local_binding *lbinding = >>>> diff --git a/controller/binding.h b/controller/binding.h >>>> index 0e57f02ee5..319cbd263c 100644 >>>> --- a/controller/binding.h >>>> +++ b/controller/binding.h >>>> @@ -150,7 +150,7 @@ void local_binding_data_init(struct >>> local_binding_data *); >>>> void local_binding_data_destroy(struct local_binding_data *); >>>> >>>> const struct sbrec_port_binding *local_binding_get_primary_pb( >>>> - struct shash *local_bindings, const char *pb_name); >>>> + const struct shash *local_bindings, const char *pb_name); >>>> ofp_port_t local_binding_get_lport_ofport(const struct shash >>> *local_bindings, >>>> const char *pb_name); >>>> >>>> diff --git a/controller/ovn-controller.8.xml >>> b/controller/ovn-controller.8.xml >>>> index f61f430084..0883d8da9b 100644 >>>> --- a/controller/ovn-controller.8.xml >>>> +++ b/controller/ovn-controller.8.xml >>>> @@ -128,6 +128,12 @@ >>>> to <code>true</code> for environments that all workloads > need >>> to be >>>> reachable from each other. >>>> </p> >>>> + <p> >>>> + NOTE: for efficiency and scalability in common scenarios >>>> + <code>ovn-controller</code> unconditionally monitors all >>> sub-ports >>>> + (ports with <code>parent_port</code> set) regardless of the >>>> + <code>ovn-monitor-all</code> value. >>>> + </p> >>>> <p> >>>> Default value is <var>false</var>. >>>> </p> >>>> diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c >>>> index a474069798..96d01219e1 100644 >>>> --- a/controller/ovn-controller.c >>>> +++ b/controller/ovn-controller.c >>>> @@ -199,6 +199,7 @@ static unsigned int >>>> update_sb_monitors(struct ovsdb_idl *ovnsb_idl, >>>> const struct sbrec_chassis *chassis, >>>> const struct sset *local_ifaces, >>>> + const struct shash *local_bindings, >>>> struct hmap *local_datapaths, >>>> bool monitor_all) >>>> { >>>> @@ -297,10 +298,21 @@ update_sb_monitors(struct ovsdb_idl *ovnsb_idl, >>>> >>>> if (local_ifaces) { >>>> const char *name; >>>> + >>>> + ovs_assert(local_bindings); >>>> SSET_FOR_EACH (name, local_ifaces) { >>>> + /* Skip the VIFs we bound already, we should have a local >>> datapath >>>> + * for those. */ >>>> + const struct sbrec_port_binding *local_pb >>>> + = local_binding_get_primary_pb(local_bindings, name); >>>> + if (local_pb && get_lport_type(local_pb) == LP_VIF) { >>>> + continue; >>>> + } >>>> sbrec_port_binding_add_clause_logical_port(&pb, > OVSDB_F_EQ, >>> name); >>>> - sbrec_port_binding_add_clause_parent_port(&pb, OVSDB_F_EQ, >>> name); >>>> } >>>> + /* Monitor all sub-ports unconditionally; we don't expect a > lot >>> of >>>> + * them in the SB database. */ >>>> + sbrec_port_binding_add_clause_parent_port(&pb, OVSDB_F_NE, > NULL); >>>> } >>>> if (local_datapaths) { >>>> const struct local_datapath *ld; >>>> @@ -609,7 +621,7 @@ update_sb_db(struct ovsdb_idl *ovs_idl, struct >>> ovsdb_idl *ovnsb_idl, >>>> * extra cost. Instead, it is called after the engine > execution >>> only >>>> * when it is necessary. */ >>>> unsigned int next_cond_seqno = >>>> - update_sb_monitors(ovnsb_idl, NULL, NULL, NULL, true); >>>> + update_sb_monitors(ovnsb_idl, NULL, NULL, NULL, NULL, > true); >>>> if (sb_cond_seqno) { >>>> *sb_cond_seqno = next_cond_seqno; >>>> } >>>> @@ -4712,7 +4724,7 @@ main(int argc, char *argv[]) >>>> ovsdb_idl_omit(ovnsb_idl_loop.idl, >>>> &sbrec_chassis_private_col_external_ids); >>>> >>>> - update_sb_monitors(ovnsb_idl_loop.idl, NULL, NULL, NULL, false); >>>> + update_sb_monitors(ovnsb_idl_loop.idl, NULL, NULL, NULL, NULL, >>> false); >>>> >>>> stopwatch_create(CONTROLLER_LOOP_STOPWATCH_NAME, SW_MS); >>>> stopwatch_create(OFCTRL_PUT_STOPWATCH_NAME, SW_MS); >>>> @@ -5355,6 +5367,7 @@ main(int argc, char *argv[]) >>>> update_sb_monitors( >>>> ovnsb_idl_loop.idl, chassis, >>>> &runtime_data->local_lports, >>>> + >>> &runtime_data->lbinding_data.bindings, >>>> &runtime_data->local_datapaths, >>>> sb_monitor_all); >>>> } >>>> diff --git a/ovn-nb.xml b/ovn-nb.xml >>>> index cf9c92009d..1fb9fa10ee 100644 >>>> --- a/ovn-nb.xml >>>> +++ b/ovn-nb.xml >>>> @@ -1302,7 +1302,11 @@ >>>> <column name="parent_name"> >>>> The VM interface through which the nested container sends its >>> network >>>> traffic. This must match the <ref column="name"/> column for >>> some >>>> - other <ref table="Logical_Switch_Port"/>. >>>> + other <ref table="Logical_Switch_Port"/>. Note: for > performance >>>> + reasons, unlike for regular VIFs, <code>ovn-controller</code> >>> will >>> >>> It may be better to say "for performance of OVN Southbound database >>> conditional monitoring ...", otherwise the user might be confused. >>> >>>> + register to get updates about all OVN Southbound database >>>> + <ref db="OVN_Southbound" table="Port_Binding"/> table records >>>> + that correspond to nested container ports. >>> >>> Better to add: "..., even if 'ovn-monitor-all' is set to false. See >>> ovn-controlller (8)." or something similar. >>> >>> Please feel free to adjust the words and merge. >>> >>> Acked-by: Han Zhou <hzhou@ovn.org> >>> >> >> Done, thanks again for the review! >> >> Regards, >> Dumitru >> > > Hi Dumitru, > Hi Han, > I noticed that with this patch, the ovn-controller port-binding latency > increased a lot. With the simulated k8s scale test [0] of 500 chassis x 50 > LSPs, the latency of a single LSP add/delete is almost tripled: > > Command: > ovn-nbctl --wait=hv --print-wait-time lsp-add ls_1_0001 lsp_1_0001_01 -- > lsp-set-addresses lsp_1_0001_01 "ff:f1:bb:00:01:01 1.0.1.101" -- > lsp-set-port-security lsp_1_0001_01 "ff:f1:bb:00:01:01 1.0.1.101" > > Before: > Time spent on processing nb_cfg 3: > > ovn-northd delay before processing: 2ms > > ovn-northd completion: 42ms > > > ovn-controller(s) completion: 68ms > > (68 - 42 = 26ms) > > After: > Time spent on processing nb_cfg 3: > > > > > ovn-northd delay before processing: 1ms > > > ovn-northd completion: 38ms > > > > > ovn-controller(s) completion: 112m > > (112 - 38 = 74ms) > > I think it may be because each port bind/unbind now introduces a SB DB > monitor condition change, which may have caused some delay. But I didn't > expect the impact to be that high because this test has only one single > ovn-controller (all the other 499 chassis are fake). I haven't spent time > to dig deeper yet. The monitor condition change should happen only for the first port of a new logical switch. From that point on the datapath is "local" to ovn-controller and all new port additions to that logical switch should not cause monitor conditions to change. If there are monitor condition changes for *every* port bind/unbind (if the set of local datapaths doesn't change) that means we have a bug. > Even though, it is still <100 ms, so I am not too worried about it. But > just to share the observation and raise the awareness. > > [0] https://github.com/hzhou8/ovn-test-script > > Thanks, > Han > Regards, Dumitru
On 6/28/23 10:30, Dumitru Ceara wrote: > On 6/28/23 04:54, Han Zhou wrote: >> On Thu, Jun 22, 2023 at 1:12 AM Dumitru Ceara <dceara@redhat.com> wrote: >>> >>> On 6/21/23 16:50, Han Zhou wrote: >>>> On Wed, Jun 21, 2023 at 2:48 AM Dumitru Ceara <dceara@redhat.com> wrote: >>>>> >>>>> We don't need to explicitly add port bindings that were already bound >>>>> locally. We implicitly get those because we monitor the datapaths >>>>> they're attached to. >>>>> >>>>> When performing an ovn-heater 500-node density-heavy scale test [0], >> with >>>>> conditional monitoring enabled, the unreasonably long poll intervals on >>>>> the Southbound database (the ones that took more than one second) are >>>>> measured as: >>>>> >>>>> --------------------------------------------------------------------- >>>>> Count Min Max Median Mean 95 percentile >>>>> --------------------------------------------------------------------- >>>>> 56.0 1010.0 2664.0 1455.5 1544.9 2163.0 >>>>> 77.0 1015.0 3460.0 1896.0 1858.2 2907.8 >>>>> 69.0 1010.0 3118.0 1618.0 1688.1 2582.4 >>>>> --------------------------------------------------------------------- >>>>> 202.0 1010.0 3460.0 1610.0 1713.3 2711.5 >>>>> >>>>> Compared to the baseline results (also with conditional monitoring >>>>> enabled): >>>>> --------------------------------------------------------------------- >>>>> Count Min Max Median Mean 95 percentile >>>>> --------------------------------------------------------------------- >>>>> 141.0 1003.0 18338.0 1721.0 2625.4 7626.0 >>>>> 151.0 1019.0 80297.0 1808.0 3410.7 9089.0 >>>>> 165.0 1007.0 50736.0 2354.0 3067.7 7309.8 >>>>> --------------------------------------------------------------------- >>>>> 457.0 1003.0 80297.0 2131.0 3044.6 7799.6 >>>>> >>>>> We see significant improvement on the server side. This is explained >>>>> by the fact that now the Southbound database server doesn't need to >>>>> apply as many conditions as before when filtering individual monitor >>>>> contents. >>>>> >>>>> Note: Sub-ports - OVN switch ports with parent_port set - have to be >>>>> monitored unconditionally as we cannot efficiently determine their >> local >>>>> datapaths without including all local OVS interfaces in the monitor. >>>>> >>>>> [0] >>>> >> https://github.com/ovn-org/ovn-heater/blob/main/test-scenarios/ocp-500-density-heavy.yml >>>>> >>>>> Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=2139194 >>>>> Reported-by: Ilya Maximets <i.maximets@ovn.org> >>>>> Signed-off-by: Dumitru Ceara <dceara@redhat.com> >>>>> --- >>>>> V2: >>>>> - Addressed Han's comments: >>>>> - changed commit log: removed assumption about sub-ports. >>>>> - added ovn-nb documentation note about sub-ports monitoring. >>>>> - added ovn-controller documentation note about sub-ports monitoring. >>>>> - added TODO item. >>>>> - Added NEWS item. >>>>> --- >>>>> NEWS | 3 +++ >>>>> TODO.rst | 6 ++++++ >>>>> controller/binding.c | 2 +- >>>>> controller/binding.h | 2 +- >>>>> controller/ovn-controller.8.xml | 6 ++++++ >>>>> controller/ovn-controller.c | 19 ++++++++++++++++--- >>>>> ovn-nb.xml | 6 +++++- >>>>> 7 files changed, 38 insertions(+), 6 deletions(-) >>>>> >>>>> diff --git a/NEWS b/NEWS >>>>> index 482970eeeb..8275877f99 100644 >>>>> --- a/NEWS >>>>> +++ b/NEWS >>>>> @@ -7,6 +7,9 @@ Post v23.06.0 >>>>> DHCPv4 "hostname" (12) option. >>>>> - Support to create/update MAC_Binding when GARP received from VTEP >>>> (RAMP) >>>>> switch on l3dgw port. >>>>> + - To allow optimizing ovn-controller's monitor conditions for the >>>> regular >>>>> + VIF case, ovn-controller now unconditionally monitors all >> sub-ports >>>>> + (ports with parent_port set). >>>>> >>>>> OVN v23.06.0 - 01 Jun 2023 >>>>> -------------------------- >>>>> diff --git a/TODO.rst b/TODO.rst >>>>> index dff163e70b..b790a9fadf 100644 >>>>> --- a/TODO.rst >>>>> +++ b/TODO.rst >>>>> @@ -124,3 +124,9 @@ OVN To-do List >>>>> * Load Balancer templates >>>>> >>>>> * Support combining the VIP IP and port into a single template >>>> variable. >>>>> + >>>>> +* ovn-controller conditional monitoring >>>>> + >>>>> + * Improve sub-ports (with parent_port set) conditional monitoring; >>>> these >>>>> + are currently unconditionally monitored, even if ovn-monitor-all >> is >>>>> + set to false. >>>>> diff --git a/controller/binding.c b/controller/binding.c >>>>> index 486095ca79..359ad6d660 100644 >>>>> --- a/controller/binding.c >>>>> +++ b/controller/binding.c >>>>> @@ -785,7 +785,7 @@ local_binding_data_destroy(struct >> local_binding_data >>>> *lbinding_data) >>>>> } >>>>> >>>>> const struct sbrec_port_binding * >>>>> -local_binding_get_primary_pb(struct shash *local_bindings, >>>>> +local_binding_get_primary_pb(const struct shash *local_bindings, >>>>> const char *pb_name) >>>>> { >>>>> struct local_binding *lbinding = >>>>> diff --git a/controller/binding.h b/controller/binding.h >>>>> index 0e57f02ee5..319cbd263c 100644 >>>>> --- a/controller/binding.h >>>>> +++ b/controller/binding.h >>>>> @@ -150,7 +150,7 @@ void local_binding_data_init(struct >>>> local_binding_data *); >>>>> void local_binding_data_destroy(struct local_binding_data *); >>>>> >>>>> const struct sbrec_port_binding *local_binding_get_primary_pb( >>>>> - struct shash *local_bindings, const char *pb_name); >>>>> + const struct shash *local_bindings, const char *pb_name); >>>>> ofp_port_t local_binding_get_lport_ofport(const struct shash >>>> *local_bindings, >>>>> const char *pb_name); >>>>> >>>>> diff --git a/controller/ovn-controller.8.xml >>>> b/controller/ovn-controller.8.xml >>>>> index f61f430084..0883d8da9b 100644 >>>>> --- a/controller/ovn-controller.8.xml >>>>> +++ b/controller/ovn-controller.8.xml >>>>> @@ -128,6 +128,12 @@ >>>>> to <code>true</code> for environments that all workloads >> need >>>> to be >>>>> reachable from each other. >>>>> </p> >>>>> + <p> >>>>> + NOTE: for efficiency and scalability in common scenarios >>>>> + <code>ovn-controller</code> unconditionally monitors all >>>> sub-ports >>>>> + (ports with <code>parent_port</code> set) regardless of the >>>>> + <code>ovn-monitor-all</code> value. >>>>> + </p> >>>>> <p> >>>>> Default value is <var>false</var>. >>>>> </p> >>>>> diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c >>>>> index a474069798..96d01219e1 100644 >>>>> --- a/controller/ovn-controller.c >>>>> +++ b/controller/ovn-controller.c >>>>> @@ -199,6 +199,7 @@ static unsigned int >>>>> update_sb_monitors(struct ovsdb_idl *ovnsb_idl, >>>>> const struct sbrec_chassis *chassis, >>>>> const struct sset *local_ifaces, >>>>> + const struct shash *local_bindings, >>>>> struct hmap *local_datapaths, >>>>> bool monitor_all) >>>>> { >>>>> @@ -297,10 +298,21 @@ update_sb_monitors(struct ovsdb_idl *ovnsb_idl, >>>>> >>>>> if (local_ifaces) { >>>>> const char *name; >>>>> + >>>>> + ovs_assert(local_bindings); >>>>> SSET_FOR_EACH (name, local_ifaces) { >>>>> + /* Skip the VIFs we bound already, we should have a local >>>> datapath >>>>> + * for those. */ >>>>> + const struct sbrec_port_binding *local_pb >>>>> + = local_binding_get_primary_pb(local_bindings, name); >>>>> + if (local_pb && get_lport_type(local_pb) == LP_VIF) { >>>>> + continue; >>>>> + } >>>>> sbrec_port_binding_add_clause_logical_port(&pb, >> OVSDB_F_EQ, >>>> name); >>>>> - sbrec_port_binding_add_clause_parent_port(&pb, OVSDB_F_EQ, >>>> name); >>>>> } >>>>> + /* Monitor all sub-ports unconditionally; we don't expect a >> lot >>>> of >>>>> + * them in the SB database. */ >>>>> + sbrec_port_binding_add_clause_parent_port(&pb, OVSDB_F_NE, >> NULL); >>>>> } >>>>> if (local_datapaths) { >>>>> const struct local_datapath *ld; >>>>> @@ -609,7 +621,7 @@ update_sb_db(struct ovsdb_idl *ovs_idl, struct >>>> ovsdb_idl *ovnsb_idl, >>>>> * extra cost. Instead, it is called after the engine >> execution >>>> only >>>>> * when it is necessary. */ >>>>> unsigned int next_cond_seqno = >>>>> - update_sb_monitors(ovnsb_idl, NULL, NULL, NULL, true); >>>>> + update_sb_monitors(ovnsb_idl, NULL, NULL, NULL, NULL, >> true); >>>>> if (sb_cond_seqno) { >>>>> *sb_cond_seqno = next_cond_seqno; >>>>> } >>>>> @@ -4712,7 +4724,7 @@ main(int argc, char *argv[]) >>>>> ovsdb_idl_omit(ovnsb_idl_loop.idl, >>>>> &sbrec_chassis_private_col_external_ids); >>>>> >>>>> - update_sb_monitors(ovnsb_idl_loop.idl, NULL, NULL, NULL, false); >>>>> + update_sb_monitors(ovnsb_idl_loop.idl, NULL, NULL, NULL, NULL, >>>> false); >>>>> >>>>> stopwatch_create(CONTROLLER_LOOP_STOPWATCH_NAME, SW_MS); >>>>> stopwatch_create(OFCTRL_PUT_STOPWATCH_NAME, SW_MS); >>>>> @@ -5355,6 +5367,7 @@ main(int argc, char *argv[]) >>>>> update_sb_monitors( >>>>> ovnsb_idl_loop.idl, chassis, >>>>> &runtime_data->local_lports, >>>>> + >>>> &runtime_data->lbinding_data.bindings, >>>>> &runtime_data->local_datapaths, >>>>> sb_monitor_all); >>>>> } >>>>> diff --git a/ovn-nb.xml b/ovn-nb.xml >>>>> index cf9c92009d..1fb9fa10ee 100644 >>>>> --- a/ovn-nb.xml >>>>> +++ b/ovn-nb.xml >>>>> @@ -1302,7 +1302,11 @@ >>>>> <column name="parent_name"> >>>>> The VM interface through which the nested container sends its >>>> network >>>>> traffic. This must match the <ref column="name"/> column for >>>> some >>>>> - other <ref table="Logical_Switch_Port"/>. >>>>> + other <ref table="Logical_Switch_Port"/>. Note: for >> performance >>>>> + reasons, unlike for regular VIFs, <code>ovn-controller</code> >>>> will >>>> >>>> It may be better to say "for performance of OVN Southbound database >>>> conditional monitoring ...", otherwise the user might be confused. >>>> >>>>> + register to get updates about all OVN Southbound database >>>>> + <ref db="OVN_Southbound" table="Port_Binding"/> table records >>>>> + that correspond to nested container ports. >>>> >>>> Better to add: "..., even if 'ovn-monitor-all' is set to false. See >>>> ovn-controlller (8)." or something similar. >>>> >>>> Please feel free to adjust the words and merge. >>>> >>>> Acked-by: Han Zhou <hzhou@ovn.org> >>>> >>> >>> Done, thanks again for the review! >>> >>> Regards, >>> Dumitru >>> >> >> Hi Dumitru, >> > > Hi Han, > >> I noticed that with this patch, the ovn-controller port-binding latency >> increased a lot. With the simulated k8s scale test [0] of 500 chassis x 50 >> LSPs, the latency of a single LSP add/delete is almost tripled: >> >> Command: >> ovn-nbctl --wait=hv --print-wait-time lsp-add ls_1_0001 lsp_1_0001_01 -- >> lsp-set-addresses lsp_1_0001_01 "ff:f1:bb:00:01:01 1.0.1.101" -- >> lsp-set-port-security lsp_1_0001_01 "ff:f1:bb:00:01:01 1.0.1.101" >> >> Before: >> Time spent on processing nb_cfg 3: >> >> ovn-northd delay before processing: 2ms >> >> ovn-northd completion: 42ms >> >> >> ovn-controller(s) completion: 68ms >> >> (68 - 42 = 26ms) >> >> After: >> Time spent on processing nb_cfg 3: >> >> >> >> >> ovn-northd delay before processing: 1ms >> >> >> ovn-northd completion: 38ms >> >> >> >> >> ovn-controller(s) completion: 112m >> >> (112 - 38 = 74ms) >> >> I think it may be because each port bind/unbind now introduces a SB DB >> monitor condition change, which may have caused some delay. But I didn't >> expect the impact to be that high because this test has only one single >> ovn-controller (all the other 499 chassis are fake). I haven't spent time >> to dig deeper yet. > > The monitor condition change should happen only for the first port of a > new logical switch. From that point on the datapath is "local" to > ovn-controller and all new port additions to that logical switch should > not cause monitor conditions to change. > > If there are monitor condition changes for *every* port bind/unbind (if > the set of local datapaths doesn't change) that means we have a bug. But if you add OVS port first, ovn-controller will add it to the condition. And it will remove it once Sb port binding is created for it. If you create Nb --> Sb records first, then add OVS port, if this case there should be no condition updates. Or am I missing something? How do real CMSes work? Do they add OVS ports before or after creating Nb records? There might be a race anyway since we do not control when the Sb port binding is created, unless we --wait=sb. Best regards, Ilya Maximets. > >> Even though, it is still <100 ms, so I am not too worried about it. But >> just to share the observation and raise the awareness. >> >> [0] https://github.com/hzhou8/ovn-test-script >> >> Thanks, >> Han >> > > Regards, > Dumitru >
On 6/28/23 12:20, Ilya Maximets wrote: > On 6/28/23 10:30, Dumitru Ceara wrote: >> On 6/28/23 04:54, Han Zhou wrote: >>> On Thu, Jun 22, 2023 at 1:12 AM Dumitru Ceara <dceara@redhat.com> wrote: >>>> >>>> On 6/21/23 16:50, Han Zhou wrote: >>>>> On Wed, Jun 21, 2023 at 2:48 AM Dumitru Ceara <dceara@redhat.com> wrote: >>>>>> >>>>>> We don't need to explicitly add port bindings that were already bound >>>>>> locally. We implicitly get those because we monitor the datapaths >>>>>> they're attached to. >>>>>> >>>>>> When performing an ovn-heater 500-node density-heavy scale test [0], >>> with >>>>>> conditional monitoring enabled, the unreasonably long poll intervals on >>>>>> the Southbound database (the ones that took more than one second) are >>>>>> measured as: >>>>>> >>>>>> --------------------------------------------------------------------- >>>>>> Count Min Max Median Mean 95 percentile >>>>>> --------------------------------------------------------------------- >>>>>> 56.0 1010.0 2664.0 1455.5 1544.9 2163.0 >>>>>> 77.0 1015.0 3460.0 1896.0 1858.2 2907.8 >>>>>> 69.0 1010.0 3118.0 1618.0 1688.1 2582.4 >>>>>> --------------------------------------------------------------------- >>>>>> 202.0 1010.0 3460.0 1610.0 1713.3 2711.5 >>>>>> >>>>>> Compared to the baseline results (also with conditional monitoring >>>>>> enabled): >>>>>> --------------------------------------------------------------------- >>>>>> Count Min Max Median Mean 95 percentile >>>>>> --------------------------------------------------------------------- >>>>>> 141.0 1003.0 18338.0 1721.0 2625.4 7626.0 >>>>>> 151.0 1019.0 80297.0 1808.0 3410.7 9089.0 >>>>>> 165.0 1007.0 50736.0 2354.0 3067.7 7309.8 >>>>>> --------------------------------------------------------------------- >>>>>> 457.0 1003.0 80297.0 2131.0 3044.6 7799.6 >>>>>> >>>>>> We see significant improvement on the server side. This is explained >>>>>> by the fact that now the Southbound database server doesn't need to >>>>>> apply as many conditions as before when filtering individual monitor >>>>>> contents. >>>>>> >>>>>> Note: Sub-ports - OVN switch ports with parent_port set - have to be >>>>>> monitored unconditionally as we cannot efficiently determine their >>> local >>>>>> datapaths without including all local OVS interfaces in the monitor. >>>>>> >>>>>> [0] >>>>> >>> https://github.com/ovn-org/ovn-heater/blob/main/test-scenarios/ocp-500-density-heavy.yml >>>>>> >>>>>> Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=2139194 >>>>>> Reported-by: Ilya Maximets <i.maximets@ovn.org> >>>>>> Signed-off-by: Dumitru Ceara <dceara@redhat.com> >>>>>> --- >>>>>> V2: >>>>>> - Addressed Han's comments: >>>>>> - changed commit log: removed assumption about sub-ports. >>>>>> - added ovn-nb documentation note about sub-ports monitoring. >>>>>> - added ovn-controller documentation note about sub-ports monitoring. >>>>>> - added TODO item. >>>>>> - Added NEWS item. >>>>>> --- >>>>>> NEWS | 3 +++ >>>>>> TODO.rst | 6 ++++++ >>>>>> controller/binding.c | 2 +- >>>>>> controller/binding.h | 2 +- >>>>>> controller/ovn-controller.8.xml | 6 ++++++ >>>>>> controller/ovn-controller.c | 19 ++++++++++++++++--- >>>>>> ovn-nb.xml | 6 +++++- >>>>>> 7 files changed, 38 insertions(+), 6 deletions(-) >>>>>> >>>>>> diff --git a/NEWS b/NEWS >>>>>> index 482970eeeb..8275877f99 100644 >>>>>> --- a/NEWS >>>>>> +++ b/NEWS >>>>>> @@ -7,6 +7,9 @@ Post v23.06.0 >>>>>> DHCPv4 "hostname" (12) option. >>>>>> - Support to create/update MAC_Binding when GARP received from VTEP >>>>> (RAMP) >>>>>> switch on l3dgw port. >>>>>> + - To allow optimizing ovn-controller's monitor conditions for the >>>>> regular >>>>>> + VIF case, ovn-controller now unconditionally monitors all >>> sub-ports >>>>>> + (ports with parent_port set). >>>>>> >>>>>> OVN v23.06.0 - 01 Jun 2023 >>>>>> -------------------------- >>>>>> diff --git a/TODO.rst b/TODO.rst >>>>>> index dff163e70b..b790a9fadf 100644 >>>>>> --- a/TODO.rst >>>>>> +++ b/TODO.rst >>>>>> @@ -124,3 +124,9 @@ OVN To-do List >>>>>> * Load Balancer templates >>>>>> >>>>>> * Support combining the VIP IP and port into a single template >>>>> variable. >>>>>> + >>>>>> +* ovn-controller conditional monitoring >>>>>> + >>>>>> + * Improve sub-ports (with parent_port set) conditional monitoring; >>>>> these >>>>>> + are currently unconditionally monitored, even if ovn-monitor-all >>> is >>>>>> + set to false. >>>>>> diff --git a/controller/binding.c b/controller/binding.c >>>>>> index 486095ca79..359ad6d660 100644 >>>>>> --- a/controller/binding.c >>>>>> +++ b/controller/binding.c >>>>>> @@ -785,7 +785,7 @@ local_binding_data_destroy(struct >>> local_binding_data >>>>> *lbinding_data) >>>>>> } >>>>>> >>>>>> const struct sbrec_port_binding * >>>>>> -local_binding_get_primary_pb(struct shash *local_bindings, >>>>>> +local_binding_get_primary_pb(const struct shash *local_bindings, >>>>>> const char *pb_name) >>>>>> { >>>>>> struct local_binding *lbinding = >>>>>> diff --git a/controller/binding.h b/controller/binding.h >>>>>> index 0e57f02ee5..319cbd263c 100644 >>>>>> --- a/controller/binding.h >>>>>> +++ b/controller/binding.h >>>>>> @@ -150,7 +150,7 @@ void local_binding_data_init(struct >>>>> local_binding_data *); >>>>>> void local_binding_data_destroy(struct local_binding_data *); >>>>>> >>>>>> const struct sbrec_port_binding *local_binding_get_primary_pb( >>>>>> - struct shash *local_bindings, const char *pb_name); >>>>>> + const struct shash *local_bindings, const char *pb_name); >>>>>> ofp_port_t local_binding_get_lport_ofport(const struct shash >>>>> *local_bindings, >>>>>> const char *pb_name); >>>>>> >>>>>> diff --git a/controller/ovn-controller.8.xml >>>>> b/controller/ovn-controller.8.xml >>>>>> index f61f430084..0883d8da9b 100644 >>>>>> --- a/controller/ovn-controller.8.xml >>>>>> +++ b/controller/ovn-controller.8.xml >>>>>> @@ -128,6 +128,12 @@ >>>>>> to <code>true</code> for environments that all workloads >>> need >>>>> to be >>>>>> reachable from each other. >>>>>> </p> >>>>>> + <p> >>>>>> + NOTE: for efficiency and scalability in common scenarios >>>>>> + <code>ovn-controller</code> unconditionally monitors all >>>>> sub-ports >>>>>> + (ports with <code>parent_port</code> set) regardless of the >>>>>> + <code>ovn-monitor-all</code> value. >>>>>> + </p> >>>>>> <p> >>>>>> Default value is <var>false</var>. >>>>>> </p> >>>>>> diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c >>>>>> index a474069798..96d01219e1 100644 >>>>>> --- a/controller/ovn-controller.c >>>>>> +++ b/controller/ovn-controller.c >>>>>> @@ -199,6 +199,7 @@ static unsigned int >>>>>> update_sb_monitors(struct ovsdb_idl *ovnsb_idl, >>>>>> const struct sbrec_chassis *chassis, >>>>>> const struct sset *local_ifaces, >>>>>> + const struct shash *local_bindings, >>>>>> struct hmap *local_datapaths, >>>>>> bool monitor_all) >>>>>> { >>>>>> @@ -297,10 +298,21 @@ update_sb_monitors(struct ovsdb_idl *ovnsb_idl, >>>>>> >>>>>> if (local_ifaces) { >>>>>> const char *name; >>>>>> + >>>>>> + ovs_assert(local_bindings); >>>>>> SSET_FOR_EACH (name, local_ifaces) { >>>>>> + /* Skip the VIFs we bound already, we should have a local >>>>> datapath >>>>>> + * for those. */ >>>>>> + const struct sbrec_port_binding *local_pb >>>>>> + = local_binding_get_primary_pb(local_bindings, name); >>>>>> + if (local_pb && get_lport_type(local_pb) == LP_VIF) { >>>>>> + continue; >>>>>> + } >>>>>> sbrec_port_binding_add_clause_logical_port(&pb, >>> OVSDB_F_EQ, >>>>> name); >>>>>> - sbrec_port_binding_add_clause_parent_port(&pb, OVSDB_F_EQ, >>>>> name); >>>>>> } >>>>>> + /* Monitor all sub-ports unconditionally; we don't expect a >>> lot >>>>> of >>>>>> + * them in the SB database. */ >>>>>> + sbrec_port_binding_add_clause_parent_port(&pb, OVSDB_F_NE, >>> NULL); >>>>>> } >>>>>> if (local_datapaths) { >>>>>> const struct local_datapath *ld; >>>>>> @@ -609,7 +621,7 @@ update_sb_db(struct ovsdb_idl *ovs_idl, struct >>>>> ovsdb_idl *ovnsb_idl, >>>>>> * extra cost. Instead, it is called after the engine >>> execution >>>>> only >>>>>> * when it is necessary. */ >>>>>> unsigned int next_cond_seqno = >>>>>> - update_sb_monitors(ovnsb_idl, NULL, NULL, NULL, true); >>>>>> + update_sb_monitors(ovnsb_idl, NULL, NULL, NULL, NULL, >>> true); >>>>>> if (sb_cond_seqno) { >>>>>> *sb_cond_seqno = next_cond_seqno; >>>>>> } >>>>>> @@ -4712,7 +4724,7 @@ main(int argc, char *argv[]) >>>>>> ovsdb_idl_omit(ovnsb_idl_loop.idl, >>>>>> &sbrec_chassis_private_col_external_ids); >>>>>> >>>>>> - update_sb_monitors(ovnsb_idl_loop.idl, NULL, NULL, NULL, false); >>>>>> + update_sb_monitors(ovnsb_idl_loop.idl, NULL, NULL, NULL, NULL, >>>>> false); >>>>>> >>>>>> stopwatch_create(CONTROLLER_LOOP_STOPWATCH_NAME, SW_MS); >>>>>> stopwatch_create(OFCTRL_PUT_STOPWATCH_NAME, SW_MS); >>>>>> @@ -5355,6 +5367,7 @@ main(int argc, char *argv[]) >>>>>> update_sb_monitors( >>>>>> ovnsb_idl_loop.idl, chassis, >>>>>> &runtime_data->local_lports, >>>>>> + >>>>> &runtime_data->lbinding_data.bindings, >>>>>> &runtime_data->local_datapaths, >>>>>> sb_monitor_all); >>>>>> } >>>>>> diff --git a/ovn-nb.xml b/ovn-nb.xml >>>>>> index cf9c92009d..1fb9fa10ee 100644 >>>>>> --- a/ovn-nb.xml >>>>>> +++ b/ovn-nb.xml >>>>>> @@ -1302,7 +1302,11 @@ >>>>>> <column name="parent_name"> >>>>>> The VM interface through which the nested container sends its >>>>> network >>>>>> traffic. This must match the <ref column="name"/> column for >>>>> some >>>>>> - other <ref table="Logical_Switch_Port"/>. >>>>>> + other <ref table="Logical_Switch_Port"/>. Note: for >>> performance >>>>>> + reasons, unlike for regular VIFs, <code>ovn-controller</code> >>>>> will >>>>> >>>>> It may be better to say "for performance of OVN Southbound database >>>>> conditional monitoring ...", otherwise the user might be confused. >>>>> >>>>>> + register to get updates about all OVN Southbound database >>>>>> + <ref db="OVN_Southbound" table="Port_Binding"/> table records >>>>>> + that correspond to nested container ports. >>>>> >>>>> Better to add: "..., even if 'ovn-monitor-all' is set to false. See >>>>> ovn-controlller (8)." or something similar. >>>>> >>>>> Please feel free to adjust the words and merge. >>>>> >>>>> Acked-by: Han Zhou <hzhou@ovn.org> >>>>> >>>> >>>> Done, thanks again for the review! >>>> >>>> Regards, >>>> Dumitru >>>> >>> >>> Hi Dumitru, >>> >> >> Hi Han, >> >>> I noticed that with this patch, the ovn-controller port-binding latency >>> increased a lot. With the simulated k8s scale test [0] of 500 chassis x 50 >>> LSPs, the latency of a single LSP add/delete is almost tripled: >>> >>> Command: >>> ovn-nbctl --wait=hv --print-wait-time lsp-add ls_1_0001 lsp_1_0001_01 -- >>> lsp-set-addresses lsp_1_0001_01 "ff:f1:bb:00:01:01 1.0.1.101" -- >>> lsp-set-port-security lsp_1_0001_01 "ff:f1:bb:00:01:01 1.0.1.101" >>> >>> Before: >>> Time spent on processing nb_cfg 3: >>> >>> ovn-northd delay before processing: 2ms >>> >>> ovn-northd completion: 42ms >>> >>> >>> ovn-controller(s) completion: 68ms >>> >>> (68 - 42 = 26ms) >>> >>> After: >>> Time spent on processing nb_cfg 3: >>> >>> >>> >>> >>> ovn-northd delay before processing: 1ms >>> >>> >>> ovn-northd completion: 38ms >>> >>> >>> >>> >>> ovn-controller(s) completion: 112m >>> >>> (112 - 38 = 74ms) >>> >>> I think it may be because each port bind/unbind now introduces a SB DB >>> monitor condition change, which may have caused some delay. But I didn't >>> expect the impact to be that high because this test has only one single >>> ovn-controller (all the other 499 chassis are fake). I haven't spent time >>> to dig deeper yet. >> >> The monitor condition change should happen only for the first port of a >> new logical switch. From that point on the datapath is "local" to >> ovn-controller and all new port additions to that logical switch should >> not cause monitor conditions to change. >> >> If there are monitor condition changes for *every* port bind/unbind (if >> the set of local datapaths doesn't change) that means we have a bug. > > But if you add OVS port first, ovn-controller will add it to the condition. > And it will remove it once Sb port binding is created for it. > > If you create Nb --> Sb records first, then add OVS port, if this case > there should be no condition updates. > > Or am I missing something? > You're right, that's what's going on here too. I forgot about this case, thanks for following up! > How do real CMSes work? Do they add OVS ports before or after creating > Nb records? There might be a race anyway since we do not control when > the Sb port binding is created, unless we --wait=sb. > Yes, there's a race anyway. I don't think we can do much about that unless we add extra hints for ovn-controller, e.g., only monitor ports that are tagged in a specific way. Thanks, Dumitru > Best regards, Ilya Maximets. > >> >>> Even though, it is still <100 ms, so I am not too worried about it. But >>> just to share the observation and raise the awareness. >>> >>> [0] https://github.com/hzhou8/ovn-test-script >>> >>> Thanks, >>> Han >>> >> >> Regards, >> Dumitru >> >
diff --git a/NEWS b/NEWS index 482970eeeb..8275877f99 100644 --- a/NEWS +++ b/NEWS @@ -7,6 +7,9 @@ Post v23.06.0 DHCPv4 "hostname" (12) option. - Support to create/update MAC_Binding when GARP received from VTEP (RAMP) switch on l3dgw port. + - To allow optimizing ovn-controller's monitor conditions for the regular + VIF case, ovn-controller now unconditionally monitors all sub-ports + (ports with parent_port set). OVN v23.06.0 - 01 Jun 2023 -------------------------- diff --git a/TODO.rst b/TODO.rst index dff163e70b..b790a9fadf 100644 --- a/TODO.rst +++ b/TODO.rst @@ -124,3 +124,9 @@ OVN To-do List * Load Balancer templates * Support combining the VIP IP and port into a single template variable. + +* ovn-controller conditional monitoring + + * Improve sub-ports (with parent_port set) conditional monitoring; these + are currently unconditionally monitored, even if ovn-monitor-all is + set to false. diff --git a/controller/binding.c b/controller/binding.c index 486095ca79..359ad6d660 100644 --- a/controller/binding.c +++ b/controller/binding.c @@ -785,7 +785,7 @@ local_binding_data_destroy(struct local_binding_data *lbinding_data) } const struct sbrec_port_binding * -local_binding_get_primary_pb(struct shash *local_bindings, +local_binding_get_primary_pb(const struct shash *local_bindings, const char *pb_name) { struct local_binding *lbinding = diff --git a/controller/binding.h b/controller/binding.h index 0e57f02ee5..319cbd263c 100644 --- a/controller/binding.h +++ b/controller/binding.h @@ -150,7 +150,7 @@ void local_binding_data_init(struct local_binding_data *); void local_binding_data_destroy(struct local_binding_data *); const struct sbrec_port_binding *local_binding_get_primary_pb( - struct shash *local_bindings, const char *pb_name); + const struct shash *local_bindings, const char *pb_name); ofp_port_t local_binding_get_lport_ofport(const struct shash *local_bindings, const char *pb_name); diff --git a/controller/ovn-controller.8.xml b/controller/ovn-controller.8.xml index f61f430084..0883d8da9b 100644 --- a/controller/ovn-controller.8.xml +++ b/controller/ovn-controller.8.xml @@ -128,6 +128,12 @@ to <code>true</code> for environments that all workloads need to be reachable from each other. </p> + <p> + NOTE: for efficiency and scalability in common scenarios + <code>ovn-controller</code> unconditionally monitors all sub-ports + (ports with <code>parent_port</code> set) regardless of the + <code>ovn-monitor-all</code> value. + </p> <p> Default value is <var>false</var>. </p> diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c index a474069798..96d01219e1 100644 --- a/controller/ovn-controller.c +++ b/controller/ovn-controller.c @@ -199,6 +199,7 @@ static unsigned int update_sb_monitors(struct ovsdb_idl *ovnsb_idl, const struct sbrec_chassis *chassis, const struct sset *local_ifaces, + const struct shash *local_bindings, struct hmap *local_datapaths, bool monitor_all) { @@ -297,10 +298,21 @@ update_sb_monitors(struct ovsdb_idl *ovnsb_idl, if (local_ifaces) { const char *name; + + ovs_assert(local_bindings); SSET_FOR_EACH (name, local_ifaces) { + /* Skip the VIFs we bound already, we should have a local datapath + * for those. */ + const struct sbrec_port_binding *local_pb + = local_binding_get_primary_pb(local_bindings, name); + if (local_pb && get_lport_type(local_pb) == LP_VIF) { + continue; + } sbrec_port_binding_add_clause_logical_port(&pb, OVSDB_F_EQ, name); - sbrec_port_binding_add_clause_parent_port(&pb, OVSDB_F_EQ, name); } + /* Monitor all sub-ports unconditionally; we don't expect a lot of + * them in the SB database. */ + sbrec_port_binding_add_clause_parent_port(&pb, OVSDB_F_NE, NULL); } if (local_datapaths) { const struct local_datapath *ld; @@ -609,7 +621,7 @@ update_sb_db(struct ovsdb_idl *ovs_idl, struct ovsdb_idl *ovnsb_idl, * extra cost. Instead, it is called after the engine execution only * when it is necessary. */ unsigned int next_cond_seqno = - update_sb_monitors(ovnsb_idl, NULL, NULL, NULL, true); + update_sb_monitors(ovnsb_idl, NULL, NULL, NULL, NULL, true); if (sb_cond_seqno) { *sb_cond_seqno = next_cond_seqno; } @@ -4712,7 +4724,7 @@ main(int argc, char *argv[]) ovsdb_idl_omit(ovnsb_idl_loop.idl, &sbrec_chassis_private_col_external_ids); - update_sb_monitors(ovnsb_idl_loop.idl, NULL, NULL, NULL, false); + update_sb_monitors(ovnsb_idl_loop.idl, NULL, NULL, NULL, NULL, false); stopwatch_create(CONTROLLER_LOOP_STOPWATCH_NAME, SW_MS); stopwatch_create(OFCTRL_PUT_STOPWATCH_NAME, SW_MS); @@ -5355,6 +5367,7 @@ main(int argc, char *argv[]) update_sb_monitors( ovnsb_idl_loop.idl, chassis, &runtime_data->local_lports, + &runtime_data->lbinding_data.bindings, &runtime_data->local_datapaths, sb_monitor_all); } diff --git a/ovn-nb.xml b/ovn-nb.xml index cf9c92009d..1fb9fa10ee 100644 --- a/ovn-nb.xml +++ b/ovn-nb.xml @@ -1302,7 +1302,11 @@ <column name="parent_name"> The VM interface through which the nested container sends its network traffic. This must match the <ref column="name"/> column for some - other <ref table="Logical_Switch_Port"/>. + other <ref table="Logical_Switch_Port"/>. Note: for performance + reasons, unlike for regular VIFs, <code>ovn-controller</code> will + register to get updates about all OVN Southbound database + <ref db="OVN_Southbound" table="Port_Binding"/> table records + that correspond to nested container ports. </column> <column name="tag_request">
We don't need to explicitly add port bindings that were already bound locally. We implicitly get those because we monitor the datapaths they're attached to. When performing an ovn-heater 500-node density-heavy scale test [0], with conditional monitoring enabled, the unreasonably long poll intervals on the Southbound database (the ones that took more than one second) are measured as: --------------------------------------------------------------------- Count Min Max Median Mean 95 percentile --------------------------------------------------------------------- 56.0 1010.0 2664.0 1455.5 1544.9 2163.0 77.0 1015.0 3460.0 1896.0 1858.2 2907.8 69.0 1010.0 3118.0 1618.0 1688.1 2582.4 --------------------------------------------------------------------- 202.0 1010.0 3460.0 1610.0 1713.3 2711.5 Compared to the baseline results (also with conditional monitoring enabled): --------------------------------------------------------------------- Count Min Max Median Mean 95 percentile --------------------------------------------------------------------- 141.0 1003.0 18338.0 1721.0 2625.4 7626.0 151.0 1019.0 80297.0 1808.0 3410.7 9089.0 165.0 1007.0 50736.0 2354.0 3067.7 7309.8 --------------------------------------------------------------------- 457.0 1003.0 80297.0 2131.0 3044.6 7799.6 We see significant improvement on the server side. This is explained by the fact that now the Southbound database server doesn't need to apply as many conditions as before when filtering individual monitor contents. Note: Sub-ports - OVN switch ports with parent_port set - have to be monitored unconditionally as we cannot efficiently determine their local datapaths without including all local OVS interfaces in the monitor. [0] https://github.com/ovn-org/ovn-heater/blob/main/test-scenarios/ocp-500-density-heavy.yml Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=2139194 Reported-by: Ilya Maximets <i.maximets@ovn.org> Signed-off-by: Dumitru Ceara <dceara@redhat.com> --- V2: - Addressed Han's comments: - changed commit log: removed assumption about sub-ports. - added ovn-nb documentation note about sub-ports monitoring. - added ovn-controller documentation note about sub-ports monitoring. - added TODO item. - Added NEWS item. --- NEWS | 3 +++ TODO.rst | 6 ++++++ controller/binding.c | 2 +- controller/binding.h | 2 +- controller/ovn-controller.8.xml | 6 ++++++ controller/ovn-controller.c | 19 ++++++++++++++++--- ovn-nb.xml | 6 +++++- 7 files changed, 38 insertions(+), 6 deletions(-)