Message ID | 1492826128-6491-1-git-send-email-zhouhan@gmail.com |
---|---|
State | Accepted |
Headers | show |
On Fri, Apr 21, 2017 at 06:55:27PM -0700, Han Zhou wrote: > When there are in-flight msgs being sent to OVS, ofctrl_put will > skip, which makes all the flows computed in that main loop > iteration useless. To avoid the wasted CPU cycles, a check is added > before lflow/physical flow run in each iteration. > > This has huge performance improvement in below testing: > - 1 lswitch with 10 lports bound locally > - Each lport has an ingress ACL, referencing the same address-set > - The address-set has 10,000 IPv4 addresses > > For each IP address in the address-set, there will be 3 > OpenFlow rules generated for each ACL. So the total number > of rules is 300k+. > > Without the patch, it takes 50+ minutes to install all the > rules to ovs-vswitchd. > > With the patch, it takes 16 seconds to install all the rules > to ovs-vswitchd. > > The reason is that the large number of rules are sent to > ovs-vswitchd gradually in many iterations of ovn-controller > main loop. Without the patch, cpu cycles are wasted in > lflow_run to re-processing the large address set in every > main loop iteration. With the patch, this re-processing is > avoided in iterations when there are pending rules sending. > > Signed-off-by: Han Zhou <zhouhan@gmail.com> Wow, that's a huge performance improvement, really amazing. Thank you for implementing this! I applied it to master.
On Mon, May 1, 2017 at 5:42 PM, Ben Pfaff <blp@ovn.org> wrote: > On Fri, Apr 21, 2017 at 06:55:27PM -0700, Han Zhou wrote: >> When there are in-flight msgs being sent to OVS, ofctrl_put will >> skip, which makes all the flows computed in that main loop >> iteration useless. To avoid the wasted CPU cycles, a check is added >> before lflow/physical flow run in each iteration. >> >> This has huge performance improvement in below testing: >> - 1 lswitch with 10 lports bound locally >> - Each lport has an ingress ACL, referencing the same address-set >> - The address-set has 10,000 IPv4 addresses >> >> For each IP address in the address-set, there will be 3 >> OpenFlow rules generated for each ACL. So the total number >> of rules is 300k+. >> >> Without the patch, it takes 50+ minutes to install all the >> rules to ovs-vswitchd. >> >> With the patch, it takes 16 seconds to install all the rules >> to ovs-vswitchd. >> >> The reason is that the large number of rules are sent to >> ovs-vswitchd gradually in many iterations of ovn-controller >> main loop. Without the patch, cpu cycles are wasted in >> lflow_run to re-processing the large address set in every >> main loop iteration. With the patch, this re-processing is >> avoided in iterations when there are pending rules sending. >> >> Signed-off-by: Han Zhou <zhouhan@gmail.com> > > Wow, that's a huge performance improvement, really amazing. Thank you > for implementing this! I applied it to master. This patch made it into OVS 2.8. It makes such a big difference to performance that I wonder if it's worth backporting to OVS 2.7. Does anyone have a strong opinion about that? If there's no objections, I'd like to backport it.
On Thu, Sep 14, 2017 at 03:06:46PM -0400, Russell Bryant wrote: > On Mon, May 1, 2017 at 5:42 PM, Ben Pfaff <blp@ovn.org> wrote: > > On Fri, Apr 21, 2017 at 06:55:27PM -0700, Han Zhou wrote: > >> When there are in-flight msgs being sent to OVS, ofctrl_put will > >> skip, which makes all the flows computed in that main loop > >> iteration useless. To avoid the wasted CPU cycles, a check is added > >> before lflow/physical flow run in each iteration. > >> > >> This has huge performance improvement in below testing: > >> - 1 lswitch with 10 lports bound locally > >> - Each lport has an ingress ACL, referencing the same address-set > >> - The address-set has 10,000 IPv4 addresses > >> > >> For each IP address in the address-set, there will be 3 > >> OpenFlow rules generated for each ACL. So the total number > >> of rules is 300k+. > >> > >> Without the patch, it takes 50+ minutes to install all the > >> rules to ovs-vswitchd. > >> > >> With the patch, it takes 16 seconds to install all the rules > >> to ovs-vswitchd. > >> > >> The reason is that the large number of rules are sent to > >> ovs-vswitchd gradually in many iterations of ovn-controller > >> main loop. Without the patch, cpu cycles are wasted in > >> lflow_run to re-processing the large address set in every > >> main loop iteration. With the patch, this re-processing is > >> avoided in iterations when there are pending rules sending. > >> > >> Signed-off-by: Han Zhou <zhouhan@gmail.com> > > > > Wow, that's a huge performance improvement, really amazing. Thank you > > for implementing this! I applied it to master. > > This patch made it into OVS 2.8. It makes such a big difference to > performance that I wonder if it's worth backporting to OVS 2.7. Does > anyone have a strong opinion about that? If there's no objections, > I'd like to backport it. It seems pretty low risk to me. I won't oppose it.
On Thu, Sep 14, 2017 at 3:45 PM, Ben Pfaff <blp@ovn.org> wrote: > On Thu, Sep 14, 2017 at 03:06:46PM -0400, Russell Bryant wrote: >> On Mon, May 1, 2017 at 5:42 PM, Ben Pfaff <blp@ovn.org> wrote: >> > On Fri, Apr 21, 2017 at 06:55:27PM -0700, Han Zhou wrote: >> >> When there are in-flight msgs being sent to OVS, ofctrl_put will >> >> skip, which makes all the flows computed in that main loop >> >> iteration useless. To avoid the wasted CPU cycles, a check is added >> >> before lflow/physical flow run in each iteration. >> >> >> >> This has huge performance improvement in below testing: >> >> - 1 lswitch with 10 lports bound locally >> >> - Each lport has an ingress ACL, referencing the same address-set >> >> - The address-set has 10,000 IPv4 addresses >> >> >> >> For each IP address in the address-set, there will be 3 >> >> OpenFlow rules generated for each ACL. So the total number >> >> of rules is 300k+. >> >> >> >> Without the patch, it takes 50+ minutes to install all the >> >> rules to ovs-vswitchd. >> >> >> >> With the patch, it takes 16 seconds to install all the rules >> >> to ovs-vswitchd. >> >> >> >> The reason is that the large number of rules are sent to >> >> ovs-vswitchd gradually in many iterations of ovn-controller >> >> main loop. Without the patch, cpu cycles are wasted in >> >> lflow_run to re-processing the large address set in every >> >> main loop iteration. With the patch, this re-processing is >> >> avoided in iterations when there are pending rules sending. >> >> >> >> Signed-off-by: Han Zhou <zhouhan@gmail.com> >> > >> > Wow, that's a huge performance improvement, really amazing. Thank you >> > for implementing this! I applied it to master. >> >> This patch made it into OVS 2.8. It makes such a big difference to >> performance that I wonder if it's worth backporting to OVS 2.7. Does >> anyone have a strong opinion about that? If there's no objections, >> I'd like to backport it. > > It seems pretty low risk to me. I won't oppose it. OK, I backported this to branch-2.7.
diff --git a/ovn/controller/ofctrl.c b/ovn/controller/ofctrl.c index 10c8105..417fdc9 100644 --- a/ovn/controller/ofctrl.c +++ b/ovn/controller/ofctrl.c @@ -813,6 +813,20 @@ add_ct_flush_zone(uint16_t zone_id, struct ovs_list *msgs) ovs_list_push_back(msgs, &msg->list_node); } +/* The flow table can be updated if the connection to the switch is up and + * in the correct state and not backlogged with existing flow_mods. (Our + * criteria for being backlogged appear very conservative, but the socket + * between ovn-controller and OVS provides some buffering.) */ +bool +ofctrl_can_put(void) +{ + if (state != S_UPDATE_FLOWS + || rconn_packet_counter_n_packets(tx_counter)) { + return false; + } + return true; +} + /* Replaces the flow table on the switch, if possible, by the flows added * with ofctrl_add_flow(). * @@ -831,12 +845,7 @@ void ofctrl_put(struct hmap *flow_table, struct shash *pending_ct_zones, int64_t nb_cfg) { - /* The flow table can be updated if the connection to the switch is up and - * in the correct state and not backlogged with existing flow_mods. (Our - * criteria for being backlogged appear very conservative, but the socket - * between ovn-controller and OVS provides some buffering.) */ - if (state != S_UPDATE_FLOWS - || rconn_packet_counter_n_packets(tx_counter)) { + if (!ofctrl_can_put()) { ovn_flow_table_clear(flow_table); ovn_group_table_clear(groups, false); return; diff --git a/ovn/controller/ofctrl.h b/ovn/controller/ofctrl.h index bf5ba01..d83f6ae 100644 --- a/ovn/controller/ofctrl.h +++ b/ovn/controller/ofctrl.h @@ -34,6 +34,7 @@ struct shash; void ofctrl_init(struct group_table *group_table); enum mf_field_id ofctrl_run(const struct ovsrec_bridge *br_int, struct shash *pending_ct_zones); +bool ofctrl_can_put(void); void ofctrl_put(struct hmap *flow_table, struct shash *pending_ct_zones, int64_t nb_cfg); void ofctrl_wait(void); diff --git a/ovn/controller/ovn-controller.c b/ovn/controller/ovn-controller.c index e00f57a..480d06d 100644 --- a/ovn/controller/ovn-controller.c +++ b/ovn/controller/ovn-controller.c @@ -639,23 +639,23 @@ main(int argc, char *argv[]) update_ct_zones(&local_lports, &local_datapaths, &ct_zones, ct_zone_bitmap, &pending_ct_zones); if (ctx.ovs_idl_txn) { + if (ofctrl_can_put()) { + commit_ct_zones(br_int, &pending_ct_zones); - commit_ct_zones(br_int, &pending_ct_zones); + struct hmap flow_table = HMAP_INITIALIZER(&flow_table); + lflow_run(&ctx, chassis, &lports, &mcgroups, + &local_datapaths, &group_table, &ct_zones, + &addr_sets, &flow_table); - struct hmap flow_table = HMAP_INITIALIZER(&flow_table); - lflow_run(&ctx, chassis, &lports, &mcgroups, - &local_datapaths, &group_table, &ct_zones, - &addr_sets, &flow_table); + physical_run(&ctx, mff_ovn_geneve, + br_int, chassis, &ct_zones, &lports, + &flow_table, &local_datapaths); - physical_run(&ctx, mff_ovn_geneve, - br_int, chassis, &ct_zones, &lports, - &flow_table, &local_datapaths); - - ofctrl_put(&flow_table, &pending_ct_zones, - get_nb_cfg(ctx.ovnsb_idl)); - - hmap_destroy(&flow_table); + ofctrl_put(&flow_table, &pending_ct_zones, + get_nb_cfg(ctx.ovnsb_idl)); + hmap_destroy(&flow_table); + } if (ctx.ovnsb_idl_txn) { int64_t cur_cfg = ofctrl_get_cur_cfg(); if (cur_cfg && cur_cfg != chassis->nb_cfg) {
When there are in-flight msgs being sent to OVS, ofctrl_put will skip, which makes all the flows computed in that main loop iteration useless. To avoid the wasted CPU cycles, a check is added before lflow/physical flow run in each iteration. This has huge performance improvement in below testing: - 1 lswitch with 10 lports bound locally - Each lport has an ingress ACL, referencing the same address-set - The address-set has 10,000 IPv4 addresses For each IP address in the address-set, there will be 3 OpenFlow rules generated for each ACL. So the total number of rules is 300k+. Without the patch, it takes 50+ minutes to install all the rules to ovs-vswitchd. With the patch, it takes 16 seconds to install all the rules to ovs-vswitchd. The reason is that the large number of rules are sent to ovs-vswitchd gradually in many iterations of ovn-controller main loop. Without the patch, cpu cycles are wasted in lflow_run to re-processing the large address set in every main loop iteration. With the patch, this re-processing is avoided in iterations when there are pending rules sending. Signed-off-by: Han Zhou <zhouhan@gmail.com> --- ovn/controller/ofctrl.c | 21 +++++++++++++++------ ovn/controller/ofctrl.h | 1 + ovn/controller/ovn-controller.c | 26 +++++++++++++------------- 3 files changed, 29 insertions(+), 19 deletions(-)