Message ID: 20161218081834.18541-17-blp@ovn.org
State: Accepted
Ben Pfaff <blp@ovn.org> wrote on 18/12/2016 10:18:34 AM:

> Until now, ovn-controller has replicated all of the southbound database
> (through the IDL).  This is inefficient, especially in a large OVN setup
> where many logical networks are not present on an individual hypervisor.
> This commit improves on the situation somewhat, by making ovn-controller
> replicate (almost) only the port bindings, logical flows, and multicast
> groups that are actually relevant to the particular hypervisor on which
> ovn-controller is running.  This is easily possible by replicating the
> patch ports from the Port_Binding table and using these relationships to
> determine connections between datapaths.
>
> This patch is strongly influenced by earlier work from the CCed developers.
> I am grateful for their assistance.
>
> CC: Darrell Ball <dlu998@gmail.com>
> CC: Liran Schour <LIRANS@il.ibm.com>
> Signed-off-by: Ben Pfaff <blp@ovn.org>
> ---

I did some performance evaluation of this patch, compared it to the
performance of the original patch series, and found that they both show a
very similar performance benefit.

Test scenario: ovs-sim simulating a data center with 50 hosts and 167
logical networks; each logical network is shared across 6 hosts (12%
overlap), and each host has 20 ports from 20 different logical networks.

Results:

Current master branch:
----------------------
Host 1 # flows 1504
Host 2 # flows 1626
Host 3 # flows 1443
...
Logical flows = 8016
Elapsed time 208,180ms, total of 1002 ports (208ms per port)
Ports created and bound in 8,146,466,629,254 cycles:
  0% south       71,391,573,236
  1% north       91,125,817,279
 98% clients  7,983,949,238,739

This patch:
-----------
Host 1 # flows 1358
Host 2 # flows 1482
Host 3 # flows 1296
...
Logical flows = 8016
Elapsed time 208,163ms, total of 1002 ports (208ms per port)
Ports created and bound in 1,391,851,557,883 cycles:
  2% south       33,923,722,225
  3% north       46,647,673,929
 94% clients  1,311,280,161,729

Origin patch:
-------------
Host 1 # flows 1358
Host 2 # flows 1482
Host 3 # flows 1296
...
Logical flows = 8016
Elapsed time 208,173ms, total of 1002 ports (208ms per port)
Ports created and bound in 1,431,070,248,059 cycles:
  2% south       33,509,227,946
  3% north       46,578,991,190
 94% clients  1,350,982,028,923

Results summary:
----------------
In this specific scenario, this patch reduces the computation overhead of
ovn-controller by about 80% and the overhead of the SB ovsdb-server by
about 50%. (These results are very similar to those of the origin patch.)

Acked-by: Liran Schour <lirans@il.ibm.com>
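The percentages in the results summary can be cross-checked against the raw
cycle counts above. The small Python sketch below is illustrative only; it
is not part of the patch or the thread:

```python
# Cross-check the summary figures from the raw "Ports created and bound"
# cycle counts quoted above (master vs. this patch).

def reduction(before, after):
    """Percent reduction in cycles going from 'before' to 'after'."""
    return 100.0 * (before - after) / before

# ovn-controller ("clients") cycles: 7,983,949,238,739 -> 1,311,280,161,729
clients = reduction(7983949238739, 1311280161729)

# SB ovsdb-server ("south") cycles: 71,391,573,236 -> 33,923,722,225
south = reduction(71391573236, 33923722225)

print(f"clients: {clients:.0f}% reduction")  # consistent with "about 80%"
print(f"south:   {south:.0f}% reduction")    # consistent with "about 50%"
```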
On Sun, Dec 18, 2016 at 12:18 AM, Ben Pfaff <blp@ovn.org> wrote:
> Until now, ovn-controller has replicated all of the southbound database
> (through the IDL).  This is inefficient, especially in a large OVN setup
> where many logical networks are not present on an individual hypervisor.
> This commit improves on the situation somewhat, by making ovn-controller
> replicate (almost) only the port bindings, logical flows, and multicast
> groups that are actually relevant to the particular hypervisor on which
> ovn-controller is running.  This is easily possible by replicating the
> patch ports from the Port_Binding table and using these relationships to
> determine connections between datapaths.

This approach is certainly a lot simpler than datapaths of interest.
Thinking about this direct approach using patch ports versus the
related_datapaths concept from datapaths of interest:

- The number of related_datapaths and the number of patch ports and
  l3gateway ports should be the same, so the complexity of running the
  algorithm should be similar. The direct approach has some advantage in
  not running the recursion over l3gateway ports.

- One difference in ovn-controller is getting a full port_binding for each,
  versus just a related_datapath. Probably not that significant.

- ovn-controller has a few SBREC_PORT_BINDING_FOR_EACH loops, in lport.c,
  binding.c, patch.c, and physical.c. These do not seem to involve much
  processing for ports that are not local.

Given that none of this seems significant, and that Liran's performance
results are good, it looks like the direct approach is the way to go.

> This patch is strongly influenced by earlier work from the CCed developers.
> I am grateful for their assistance.
>
> CC: Darrell Ball <dlu998@gmail.com>
> CC: Liran Schour <LIRANS@il.ibm.com>
> Signed-off-by: Ben Pfaff <blp@ovn.org>

Acked-by: Mickey Spiegel <mickeys.dev@gmail.com>

Three comments below.
> ---
>  ovn/controller/binding.c        |  2 ++
>  ovn/controller/ovn-controller.c | 52 +++++++++++++++++++++++++++++++++++++++++
>  tests/ovn.at                    |  4 ++--
>  3 files changed, 56 insertions(+), 2 deletions(-)
>
> diff --git a/ovn/controller/binding.c b/ovn/controller/binding.c
> index fbdcf67..9543bb2 100644
> --- a/ovn/controller/binding.c
> +++ b/ovn/controller/binding.c
> @@ -393,6 +393,7 @@ consider_local_datapath(struct controller_ctx *ctx,
>              add_local_datapath(ldatapaths, lports, binding_rec->datapath,
>                                 true, local_datapaths);
>          }
> +        sset_add(local_lports, binding_rec->logical_port);

I don't understand why this is necessary. There is already a clause for
port_bindings of type "l3gateway" in update_sb_monitors below. I ran the
OVN automated tests, including the kernel tests, without this line. They
all passed.

>      } else if (chassis_rec && binding_rec->chassis == chassis_rec) {
>          if (ctx->ovnsb_idl_txn) {
>              VLOG_INFO("Releasing lport %s from this chassis.",
> @@ -402,6 +403,7 @@ consider_local_datapath(struct controller_ctx *ctx,
>              }
>
>              sbrec_port_binding_set_chassis(binding_rec, NULL);
> +
>              sset_find_and_delete(local_lports, binding_rec->logical_port);
>          }
>      } else if (!binding_rec->chassis
> diff --git a/ovn/controller/ovn-controller.c b/ovn/controller/ovn-controller.c
> index 7a30c2a..37b7756 100644
> --- a/ovn/controller/ovn-controller.c
> +++ b/ovn/controller/ovn-controller.c
> @@ -118,6 +118,55 @@ get_bridge(struct ovsdb_idl *ovs_idl, const char *br_name)
>      return NULL;
>  }
>
> +static void
> +update_sb_monitors(struct ovsdb_idl *ovnsb_idl,
> +                   const struct sset *local_ifaces,
> +                   struct hmap *local_datapaths)
> +{
> +    /* Monitor Port_Bindings rows for local interfaces and local datapaths.
> +     *
> +     * Monitor Logical_Flow, MAC_Binding, and Multicast_Group tables for
> +     * local datapaths.
> +     *
> +     * We always monitor patch ports because they allow us to see the linkages
> +     * between related logical datapaths.  That way, when we know that we have
> +     * a VIF on a particular logical switch, we immediately know to monitor all
> +     * the connected logical routers and logical switches. */
> +    struct ovsdb_idl_condition pb = OVSDB_IDL_CONDITION_INIT;
> +    struct ovsdb_idl_condition lf = OVSDB_IDL_CONDITION_INIT;
> +    struct ovsdb_idl_condition mb = OVSDB_IDL_CONDITION_INIT;
> +    struct ovsdb_idl_condition mg = OVSDB_IDL_CONDITION_INIT;
> +    sbrec_port_binding_add_clause_type(&pb, OVSDB_F_EQ, "patch");
> +    sbrec_port_binding_add_clause_type(&pb, OVSDB_F_EQ, "l2gateway");
> +    sbrec_port_binding_add_clause_type(&pb, OVSDB_F_EQ, "l3gateway");
> +    sbrec_port_binding_add_clause_type(&pb, OVSDB_F_EQ, "localnet");

Why do you need a clause for "localnet"? IMO it should be good enough to
receive "localnet" port_bindings based on the port_binding datapath clause.
I ran the OVN automated tests, including the kernel tests, without this
line. They all passed.

> +    if (local_ifaces) {
> +        const char *name;
> +        SSET_FOR_EACH (name, local_ifaces) {
> +            sbrec_port_binding_add_clause_logical_port(&pb, OVSDB_F_EQ, name);
> +        }
> +    }
> +    if (local_datapaths) {
> +        const struct local_datapath *ld;
> +        HMAP_FOR_EACH (ld, hmap_node, local_datapaths) {
> +            const struct uuid *uuid = &ld->datapath->header_.uuid;
> +            sbrec_port_binding_add_clause_datapath(&pb, OVSDB_F_EQ, uuid);
> +            sbrec_logical_flow_add_clause_logical_datapath(&lf, OVSDB_F_EQ,
> +                                                           uuid);
> +            sbrec_mac_binding_add_clause_datapath(&mb, OVSDB_F_EQ, uuid);
> +            sbrec_multicast_group_add_clause_datapath(&mg, OVSDB_F_EQ, uuid);
> +        }
> +    }
> +    sbrec_port_binding_set_condition(ovnsb_idl, &pb);
> +    sbrec_logical_flow_set_condition(ovnsb_idl, &lf);
> +    sbrec_mac_binding_set_condition(ovnsb_idl, &mb);
> +    sbrec_multicast_group_set_condition(ovnsb_idl, &mg);
> +    ovsdb_idl_condition_destroy(&pb);
> +    ovsdb_idl_condition_destroy(&lf);
> +    ovsdb_idl_condition_destroy(&mb);
> +    ovsdb_idl_condition_destroy(&mg);
> +}
> +
>  static const struct ovsrec_bridge *
>  create_br_int(struct controller_ctx *ctx,
>                const struct ovsrec_open_vswitch *cfg,
> @@ -451,6 +500,7 @@ main(int argc, char *argv[])
>      struct ovsdb_idl_loop ovnsb_idl_loop = OVSDB_IDL_LOOP_INITIALIZER(
>          ovsdb_idl_create(ovnsb_remote, &sbrec_idl_class, true, true));
>      ovsdb_idl_omit_alert(ovnsb_idl_loop.idl, &sbrec_chassis_col_nb_cfg);
> +    update_sb_monitors(ovnsb_idl_loop.idl, NULL, NULL);
>
>      /* Track the southbound idl. */
>      ovsdb_idl_get_initial_snapshot(ovnsb_idl_loop.idl);
> @@ -545,6 +595,8 @@ main(int argc, char *argv[])
>                  }
>              }
>          }
> +
> +        update_sb_monitors(ctx.ovnsb_idl, &local_lports, &local_datapaths);
>      }
>
>      mcgroup_index_destroy(&mcgroups);
> diff --git a/tests/ovn.at b/tests/ovn.at
> index 628d3c8..3779741 100644
> --- a/tests/ovn.at
> +++ b/tests/ovn.at
> @@ -5403,9 +5403,9 @@ AT_CHECK([ovs-vsctl add-port br-int localvif1 -- set Interface localvif1 externa
>  # On hv1, check that there are no flows outputting bcast to tunnel
>  OVS_WAIT_UNTIL([test `ovs-ofctl dump-flows br-int table=32 | ofctl_strip | grep output | wc -l` -eq 0])
>
> -# On hv2, check that there is 1 flow outputting bcast to tunnel to hv1.
> +# On hv2, check that no flow outputs bcast to tunnel to hv1.
>  as hv2
> -OVS_WAIT_UNTIL([test `ovs-ofctl dump-flows br-int table=32 | ofctl_strip | grep output | wc -l` -eq 1])
> +OVS_WAIT_UNTIL([test `ovs-ofctl dump-flows br-int table=32 | ofctl_strip | grep output | wc -l` -eq 0])
>
>  # Now bind vif2 on hv2.
>  AT_CHECK([ovs-vsctl add-port br-int localvif2 -- set Interface localvif2 external_ids:iface-id=localvif2])

If you take my suggestion for patch 10 to filter to only local_datapaths in
consider_mc_group in physical.c, then these changes to ovn.at would move to
patch 10.

Mickey
On Mon, Dec 19, 2016 at 10:51:36AM +0200, Liran Schour wrote:
> Ben Pfaff <blp@ovn.org> wrote on 18/12/2016 10:18:34 AM:
>
> > Until now, ovn-controller has replicated all of the southbound database
> > (through the IDL).  This is inefficient, especially in a large OVN setup
> > where many logical networks are not present on an individual hypervisor.
> > This commit improves on the situation somewhat, by making ovn-controller
> > replicate (almost) only the port bindings, logical flows, and multicast
> > groups that are actually relevant to the particular hypervisor on which
> > ovn-controller is running.  This is easily possible by replicating the
> > patch ports from the Port_Binding table and using these relationships to
> > determine connections between datapaths.
> >
> > This patch is strongly influenced by earlier work from the CCed developers.
> > I am grateful for their assistance.
> >
> > CC: Darrell Ball <dlu998@gmail.com>
> > CC: Liran Schour <LIRANS@il.ibm.com>
> > Signed-off-by: Ben Pfaff <blp@ovn.org>
> > ---
>
> I did some performance evaluation of this patch and compared it to the
> performance of the original patch series and found that they both show
> very similar performance benefit.

Thank you very much for the review and especially for the performance
testing.
diff --git a/ovn/controller/binding.c b/ovn/controller/binding.c
index fbdcf67..9543bb2 100644
--- a/ovn/controller/binding.c
+++ b/ovn/controller/binding.c
@@ -393,6 +393,7 @@ consider_local_datapath(struct controller_ctx *ctx,
             add_local_datapath(ldatapaths, lports, binding_rec->datapath,
                                true, local_datapaths);
         }
+        sset_add(local_lports, binding_rec->logical_port);
     } else if (chassis_rec && binding_rec->chassis == chassis_rec) {
         if (ctx->ovnsb_idl_txn) {
             VLOG_INFO("Releasing lport %s from this chassis.",
@@ -402,6 +403,7 @@ consider_local_datapath(struct controller_ctx *ctx,
             }

             sbrec_port_binding_set_chassis(binding_rec, NULL);
+
             sset_find_and_delete(local_lports, binding_rec->logical_port);
         }
     } else if (!binding_rec->chassis
diff --git a/ovn/controller/ovn-controller.c b/ovn/controller/ovn-controller.c
index 7a30c2a..37b7756 100644
--- a/ovn/controller/ovn-controller.c
+++ b/ovn/controller/ovn-controller.c
@@ -118,6 +118,55 @@ get_bridge(struct ovsdb_idl *ovs_idl, const char *br_name)
     return NULL;
 }

+static void
+update_sb_monitors(struct ovsdb_idl *ovnsb_idl,
+                   const struct sset *local_ifaces,
+                   struct hmap *local_datapaths)
+{
+    /* Monitor Port_Bindings rows for local interfaces and local datapaths.
+     *
+     * Monitor Logical_Flow, MAC_Binding, and Multicast_Group tables for
+     * local datapaths.
+     *
+     * We always monitor patch ports because they allow us to see the linkages
+     * between related logical datapaths.  That way, when we know that we have
+     * a VIF on a particular logical switch, we immediately know to monitor all
+     * the connected logical routers and logical switches. */
+    struct ovsdb_idl_condition pb = OVSDB_IDL_CONDITION_INIT;
+    struct ovsdb_idl_condition lf = OVSDB_IDL_CONDITION_INIT;
+    struct ovsdb_idl_condition mb = OVSDB_IDL_CONDITION_INIT;
+    struct ovsdb_idl_condition mg = OVSDB_IDL_CONDITION_INIT;
+    sbrec_port_binding_add_clause_type(&pb, OVSDB_F_EQ, "patch");
+    sbrec_port_binding_add_clause_type(&pb, OVSDB_F_EQ, "l2gateway");
+    sbrec_port_binding_add_clause_type(&pb, OVSDB_F_EQ, "l3gateway");
+    sbrec_port_binding_add_clause_type(&pb, OVSDB_F_EQ, "localnet");
+    if (local_ifaces) {
+        const char *name;
+        SSET_FOR_EACH (name, local_ifaces) {
+            sbrec_port_binding_add_clause_logical_port(&pb, OVSDB_F_EQ, name);
+        }
+    }
+    if (local_datapaths) {
+        const struct local_datapath *ld;
+        HMAP_FOR_EACH (ld, hmap_node, local_datapaths) {
+            const struct uuid *uuid = &ld->datapath->header_.uuid;
+            sbrec_port_binding_add_clause_datapath(&pb, OVSDB_F_EQ, uuid);
+            sbrec_logical_flow_add_clause_logical_datapath(&lf, OVSDB_F_EQ,
+                                                           uuid);
+            sbrec_mac_binding_add_clause_datapath(&mb, OVSDB_F_EQ, uuid);
+            sbrec_multicast_group_add_clause_datapath(&mg, OVSDB_F_EQ, uuid);
+        }
+    }
+    sbrec_port_binding_set_condition(ovnsb_idl, &pb);
+    sbrec_logical_flow_set_condition(ovnsb_idl, &lf);
+    sbrec_mac_binding_set_condition(ovnsb_idl, &mb);
+    sbrec_multicast_group_set_condition(ovnsb_idl, &mg);
+    ovsdb_idl_condition_destroy(&pb);
+    ovsdb_idl_condition_destroy(&lf);
+    ovsdb_idl_condition_destroy(&mb);
+    ovsdb_idl_condition_destroy(&mg);
+}
+
 static const struct ovsrec_bridge *
 create_br_int(struct controller_ctx *ctx,
               const struct ovsrec_open_vswitch *cfg,
@@ -451,6 +500,7 @@ main(int argc, char *argv[])
     struct ovsdb_idl_loop ovnsb_idl_loop = OVSDB_IDL_LOOP_INITIALIZER(
         ovsdb_idl_create(ovnsb_remote, &sbrec_idl_class, true, true));
     ovsdb_idl_omit_alert(ovnsb_idl_loop.idl, &sbrec_chassis_col_nb_cfg);
+    update_sb_monitors(ovnsb_idl_loop.idl, NULL, NULL);

     /* Track the southbound idl. */
     ovsdb_idl_get_initial_snapshot(ovnsb_idl_loop.idl);
@@ -545,6 +595,8 @@ main(int argc, char *argv[])
                 }
             }
         }
+
+        update_sb_monitors(ctx.ovnsb_idl, &local_lports, &local_datapaths);
     }

     mcgroup_index_destroy(&mcgroups);
diff --git a/tests/ovn.at b/tests/ovn.at
index 628d3c8..3779741 100644
--- a/tests/ovn.at
+++ b/tests/ovn.at
@@ -5403,9 +5403,9 @@ AT_CHECK([ovs-vsctl add-port br-int localvif1 -- set Interface localvif1 externa
 # On hv1, check that there are no flows outputting bcast to tunnel
 OVS_WAIT_UNTIL([test `ovs-ofctl dump-flows br-int table=32 | ofctl_strip | grep output | wc -l` -eq 0])

-# On hv2, check that there is 1 flow outputting bcast to tunnel to hv1.
+# On hv2, check that no flow outputs bcast to tunnel to hv1.
 as hv2
-OVS_WAIT_UNTIL([test `ovs-ofctl dump-flows br-int table=32 | ofctl_strip | grep output | wc -l` -eq 1])
+OVS_WAIT_UNTIL([test `ovs-ofctl dump-flows br-int table=32 | ofctl_strip | grep output | wc -l` -eq 0])

 # Now bind vif2 on hv2.
 AT_CHECK([ovs-vsctl add-port br-int localvif2 -- set Interface localvif2 external_ids:iface-id=localvif2])
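One detail worth keeping in mind when reading update_sb_monitors: the
clauses added to a single ovsdb_idl_condition are disjunctive, so the
southbound server replicates a row if it matches any one clause. The Python
model below is purely conceptual; the dict-based rows and the function name
are illustrative, not OVS APIs:

```python
# Conceptual model of the Port_Binding condition built by
# update_sb_monitors: a row is sent if it matches ANY clause.

ALWAYS_MONITORED_TYPES = {"patch", "l2gateway", "l3gateway", "localnet"}

def port_binding_matches(row, local_ifaces, local_datapaths):
    """True if the southbound server would replicate this Port_Binding row."""
    return (row["type"] in ALWAYS_MONITORED_TYPES
            or row["logical_port"] in local_ifaces
            or row["datapath"] in local_datapaths)

# A patch port is replicated even when its datapath is not (yet) local:
patch = {"type": "patch", "logical_port": "lr0-to-ls1", "datapath": "dp-ls1"}
print(port_binding_matches(patch, local_ifaces=set(), local_datapaths=set()))
# -> True

# An ordinary VIF on a remote datapath is filtered out:
vif = {"type": "", "logical_port": "vm7", "datapath": "dp-ls9"}
print(port_binding_matches(vif, {"vm1"}, {"dp-ls1"}))
# -> False
```

This is also why the initial update_sb_monitors(ovnsb_idl_loop.idl, NULL,
NULL) call is useful: with no local interfaces or datapaths yet, only the
always-monitored port types are replicated.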
Until now, ovn-controller has replicated all of the southbound database
(through the IDL).  This is inefficient, especially in a large OVN setup
where many logical networks are not present on an individual hypervisor.
This commit improves on the situation somewhat, by making ovn-controller
replicate (almost) only the port bindings, logical flows, and multicast
groups that are actually relevant to the particular hypervisor on which
ovn-controller is running.  This is easily possible by replicating the
patch ports from the Port_Binding table and using these relationships to
determine connections between datapaths.

This patch is strongly influenced by earlier work from the CCed developers.
I am grateful for their assistance.

CC: Darrell Ball <dlu998@gmail.com>
CC: Liran Schour <LIRANS@il.ibm.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
---
 ovn/controller/binding.c        |  2 ++
 ovn/controller/ovn-controller.c | 52 +++++++++++++++++++++++++++++++++++++++++
 tests/ovn.at                    |  4 ++--
 3 files changed, 56 insertions(+), 2 deletions(-)
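The traversal implied by the commit message, growing the set of monitored
datapaths from the locally bound ones by following always-replicated patch
ports, can be sketched roughly as follows. This is an illustrative Python
sketch with made-up names; the real logic is ovn-controller's C local
datapath tracking:

```python
# Sketch: each replicated patch-port Port_Binding row links its own
# datapath to its peer's datapath, so the datapaths relevant to this
# hypervisor are those with local VIFs plus everything reachable from
# them through patch-port links.

def relevant_datapaths(local_dps, patch_links):
    """local_dps: datapaths with locally bound ports.
    patch_links: (datapath, peer_datapath) pairs from patch ports.
    Returns local datapaths plus all transitively connected ones."""
    # Build an undirected adjacency map of datapath connections.
    adj = {}
    for a, b in patch_links:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    # Breadth-first search outward from the locally bound datapaths.
    seen = set(local_dps)
    queue = list(local_dps)
    while queue:
        dp = queue.pop()
        for peer in adj.get(dp, ()):
            if peer not in seen:
                seen.add(peer)
                queue.append(peer)
    return seen

# ls1 has a local VIF; ls1, lr0, and ls2 are joined by patch ports,
# while ls3 hangs off an unrelated router and is never monitored here.
links = [("ls1", "lr0"), ("lr0", "ls2"), ("ls3", "lr9")]
print(sorted(relevant_datapaths({"ls1"}, links)))
# -> ['lr0', 'ls1', 'ls2']
```

Each main-loop call to update_sb_monitors then narrows the monitor
conditions to exactly this growing set, which is why the replication stays
proportional to what the hypervisor actually hosts.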