diff mbox

[ovs-dev,1/2] ovn-controller: Avoid recomputing when there are in-flight msgs.

Message ID 1492826128-6491-1-git-send-email-zhouhan@gmail.com
State Accepted
Headers show

Commit Message

Han Zhou April 22, 2017, 1:55 a.m. UTC
When there are in-flight msgs being sent to OVS, ofctrl_put will
skip, which makes all the flows computed in that main loop
iteration useless. To avoid the wasted CPU cycles, a check is added
before lflow/physical flow run in each iteration.

This has huge performance improvement in below testing:
- 1 lswitch with 10 lports bound locally
- Each lport has an ingress ACL, referencing the same address-set
- The address-set has 10,000 IPv4 addresses

For each IP address in the address-set, there will be 3
OpenFlow rules generated for each ACL. So the total number
of rules is 300k+.

Without the patch, it takes 50+ minutes to install all the
rules to ovs-vswitchd.

With the patch, it takes 16 seconds to install all the rules
to ovs-vswitchd.

The reason is that the large number of rules are sent to
ovs-vswitchd gradually in many iterations of ovn-controller
main loop. Without the patch, cpu cycles are wasted in
lflow_run to re-processing the large address set in every
main loop iteration. With the patch, this re-processing is
avoided in iterations when there are pending rules sending.

Signed-off-by: Han Zhou <zhouhan@gmail.com>
---
 ovn/controller/ofctrl.c         | 21 +++++++++++++++------
 ovn/controller/ofctrl.h         |  1 +
 ovn/controller/ovn-controller.c | 26 +++++++++++++-------------
 3 files changed, 29 insertions(+), 19 deletions(-)

Comments

Ben Pfaff May 1, 2017, 9:42 p.m. UTC | #1
On Fri, Apr 21, 2017 at 06:55:27PM -0700, Han Zhou wrote:
> When there are in-flight msgs being sent to OVS, ofctrl_put will
> skip, which makes all the flows computed in that main loop
> iteration useless. To avoid the wasted CPU cycles, a check is added
> before lflow/physical flow run in each iteration.
> 
> This has huge performance improvement in below testing:
> - 1 lswitch with 10 lports bound locally
> - Each lport has an ingress ACL, referencing the same address-set
> - The address-set has 10,000 IPv4 addresses
> 
> For each IP address in the address-set, there will be 3
> OpenFlow rules generated for each ACL. So the total number
> of rules is 300k+.
> 
> Without the patch, it takes 50+ minutes to install all the
> rules to ovs-vswitchd.
> 
> With the patch, it takes 16 seconds to install all the rules
> to ovs-vswitchd.
> 
> The reason is that the large number of rules are sent to
> ovs-vswitchd gradually in many iterations of ovn-controller
> main loop. Without the patch, cpu cycles are wasted in
> lflow_run to re-processing the large address set in every
> main loop iteration. With the patch, this re-processing is
> avoided in iterations when there are pending rules sending.
> 
> Signed-off-by: Han Zhou <zhouhan@gmail.com>

Wow, that's a huge performance improvement, really amazing.  Thank you
for implementing this!  I applied it to master.
Russell Bryant Sept. 14, 2017, 7:06 p.m. UTC | #2
On Mon, May 1, 2017 at 5:42 PM, Ben Pfaff <blp@ovn.org> wrote:
> On Fri, Apr 21, 2017 at 06:55:27PM -0700, Han Zhou wrote:
>> When there are in-flight msgs being sent to OVS, ofctrl_put will
>> skip, which makes all the flows computed in that main loop
>> iteration useless. To avoid the wasted CPU cycles, a check is added
>> before lflow/physical flow run in each iteration.
>>
>> This has huge performance improvement in below testing:
>> - 1 lswitch with 10 lports bound locally
>> - Each lport has an ingress ACL, referencing the same address-set
>> - The address-set has 10,000 IPv4 addresses
>>
>> For each IP address in the address-set, there will be 3
>> OpenFlow rules generated for each ACL. So the total number
>> of rules is 300k+.
>>
>> Without the patch, it takes 50+ minutes to install all the
>> rules to ovs-vswitchd.
>>
>> With the patch, it takes 16 seconds to install all the rules
>> to ovs-vswitchd.
>>
>> The reason is that the large number of rules are sent to
>> ovs-vswitchd gradually in many iterations of ovn-controller
>> main loop. Without the patch, cpu cycles are wasted in
>> lflow_run to re-processing the large address set in every
>> main loop iteration. With the patch, this re-processing is
>> avoided in iterations when there are pending rules sending.
>>
>> Signed-off-by: Han Zhou <zhouhan@gmail.com>
>
> Wow, that's a huge performance improvement, really amazing.  Thank you
> for implementing this!  I applied it to master.

This patch made it into OVS 2.8.  It makes such a big difference to
performance that I wonder if it's worth backporting to OVS 2.7.  Does
anyone have a strong opinion about that?  If there's no objections,
I'd like to backport it.
Ben Pfaff Sept. 14, 2017, 7:45 p.m. UTC | #3
On Thu, Sep 14, 2017 at 03:06:46PM -0400, Russell Bryant wrote:
> On Mon, May 1, 2017 at 5:42 PM, Ben Pfaff <blp@ovn.org> wrote:
> > On Fri, Apr 21, 2017 at 06:55:27PM -0700, Han Zhou wrote:
> >> When there are in-flight msgs being sent to OVS, ofctrl_put will
> >> skip, which makes all the flows computed in that main loop
> >> iteration useless. To avoid the wasted CPU cycles, a check is added
> >> before lflow/physical flow run in each iteration.
> >>
> >> This has huge performance improvement in below testing:
> >> - 1 lswitch with 10 lports bound locally
> >> - Each lport has an ingress ACL, referencing the same address-set
> >> - The address-set has 10,000 IPv4 addresses
> >>
> >> For each IP address in the address-set, there will be 3
> >> OpenFlow rules generated for each ACL. So the total number
> >> of rules is 300k+.
> >>
> >> Without the patch, it takes 50+ minutes to install all the
> >> rules to ovs-vswitchd.
> >>
> >> With the patch, it takes 16 seconds to install all the rules
> >> to ovs-vswitchd.
> >>
> >> The reason is that the large number of rules are sent to
> >> ovs-vswitchd gradually in many iterations of ovn-controller
> >> main loop. Without the patch, cpu cycles are wasted in
> >> lflow_run to re-processing the large address set in every
> >> main loop iteration. With the patch, this re-processing is
> >> avoided in iterations when there are pending rules sending.
> >>
> >> Signed-off-by: Han Zhou <zhouhan@gmail.com>
> >
> > Wow, that's a huge performance improvement, really amazing.  Thank you
> > for implementing this!  I applied it to master.
> 
> This patch made it into OVS 2.8.  It makes such a big difference to
> performance that I wonder if it's worth backporting to OVS 2.7.  Does
> anyone have a strong opinion about that?  If there's no objections,
> I'd like to backport it.

It seems pretty low risk to me.  I won't oppose it.
Russell Bryant Sept. 14, 2017, 9:42 p.m. UTC | #4
On Thu, Sep 14, 2017 at 3:45 PM, Ben Pfaff <blp@ovn.org> wrote:
> On Thu, Sep 14, 2017 at 03:06:46PM -0400, Russell Bryant wrote:
>> On Mon, May 1, 2017 at 5:42 PM, Ben Pfaff <blp@ovn.org> wrote:
>> > On Fri, Apr 21, 2017 at 06:55:27PM -0700, Han Zhou wrote:
>> >> When there are in-flight msgs being sent to OVS, ofctrl_put will
>> >> skip, which makes all the flows computed in that main loop
>> >> iteration useless. To avoid the wasted CPU cycles, a check is added
>> >> before lflow/physical flow run in each iteration.
>> >>
>> >> This has huge performance improvement in below testing:
>> >> - 1 lswitch with 10 lports bound locally
>> >> - Each lport has an ingress ACL, referencing the same address-set
>> >> - The address-set has 10,000 IPv4 addresses
>> >>
>> >> For each IP address in the address-set, there will be 3
>> >> OpenFlow rules generated for each ACL. So the total number
>> >> of rules is 300k+.
>> >>
>> >> Without the patch, it takes 50+ minutes to install all the
>> >> rules to ovs-vswitchd.
>> >>
>> >> With the patch, it takes 16 seconds to install all the rules
>> >> to ovs-vswitchd.
>> >>
>> >> The reason is that the large number of rules are sent to
>> >> ovs-vswitchd gradually in many iterations of ovn-controller
>> >> main loop. Without the patch, cpu cycles are wasted in
>> >> lflow_run to re-processing the large address set in every
>> >> main loop iteration. With the patch, this re-processing is
>> >> avoided in iterations when there are pending rules sending.
>> >>
>> >> Signed-off-by: Han Zhou <zhouhan@gmail.com>
>> >
>> > Wow, that's a huge performance improvement, really amazing.  Thank you
>> > for implementing this!  I applied it to master.
>>
>> This patch made it into OVS 2.8.  It makes such a big difference to
>> performance that I wonder if it's worth backporting to OVS 2.7.  Does
>> anyone have a strong opinion about that?  If there's no objections,
>> I'd like to backport it.
>
> It seems pretty low risk to me.  I won't oppose it.

OK, I backported this to branch-2.7.
diff mbox

Patch

diff --git a/ovn/controller/ofctrl.c b/ovn/controller/ofctrl.c
index 10c8105..417fdc9 100644
--- a/ovn/controller/ofctrl.c
+++ b/ovn/controller/ofctrl.c
@@ -813,6 +813,20 @@  add_ct_flush_zone(uint16_t zone_id, struct ovs_list *msgs)
     ovs_list_push_back(msgs, &msg->list_node);
 }
 
+/* The flow table can be updated if the connection to the switch is up and
+ * in the correct state and not backlogged with existing flow_mods.  (Our
+ * criteria for being backlogged appear very conservative, but the socket
+ * between ovn-controller and OVS provides some buffering.) */
+bool
+ofctrl_can_put(void)
+{
+    if (state != S_UPDATE_FLOWS
+        || rconn_packet_counter_n_packets(tx_counter)) {
+        return false;
+    }
+    return true;
+}
+
 /* Replaces the flow table on the switch, if possible, by the flows added
  * with ofctrl_add_flow().
  *
@@ -831,12 +845,7 @@  void
 ofctrl_put(struct hmap *flow_table, struct shash *pending_ct_zones,
            int64_t nb_cfg)
 {
-    /* The flow table can be updated if the connection to the switch is up and
-     * in the correct state and not backlogged with existing flow_mods.  (Our
-     * criteria for being backlogged appear very conservative, but the socket
-     * between ovn-controller and OVS provides some buffering.) */
-    if (state != S_UPDATE_FLOWS
-        || rconn_packet_counter_n_packets(tx_counter)) {
+    if (!ofctrl_can_put()) {
         ovn_flow_table_clear(flow_table);
         ovn_group_table_clear(groups, false);
         return;
diff --git a/ovn/controller/ofctrl.h b/ovn/controller/ofctrl.h
index bf5ba01..d83f6ae 100644
--- a/ovn/controller/ofctrl.h
+++ b/ovn/controller/ofctrl.h
@@ -34,6 +34,7 @@  struct shash;
 void ofctrl_init(struct group_table *group_table);
 enum mf_field_id ofctrl_run(const struct ovsrec_bridge *br_int,
                             struct shash *pending_ct_zones);
+bool ofctrl_can_put(void);
 void ofctrl_put(struct hmap *flow_table, struct shash *pending_ct_zones,
                 int64_t nb_cfg);
 void ofctrl_wait(void);
diff --git a/ovn/controller/ovn-controller.c b/ovn/controller/ovn-controller.c
index e00f57a..480d06d 100644
--- a/ovn/controller/ovn-controller.c
+++ b/ovn/controller/ovn-controller.c
@@ -639,23 +639,23 @@  main(int argc, char *argv[])
             update_ct_zones(&local_lports, &local_datapaths, &ct_zones,
                             ct_zone_bitmap, &pending_ct_zones);
             if (ctx.ovs_idl_txn) {
+                if (ofctrl_can_put()) {
+                    commit_ct_zones(br_int, &pending_ct_zones);
 
-                commit_ct_zones(br_int, &pending_ct_zones);
+                    struct hmap flow_table = HMAP_INITIALIZER(&flow_table);
+                    lflow_run(&ctx, chassis, &lports, &mcgroups,
+                              &local_datapaths, &group_table, &ct_zones,
+                              &addr_sets, &flow_table);
 
-                struct hmap flow_table = HMAP_INITIALIZER(&flow_table);
-                lflow_run(&ctx, chassis, &lports, &mcgroups,
-                          &local_datapaths, &group_table, &ct_zones,
-                          &addr_sets, &flow_table);
+                    physical_run(&ctx, mff_ovn_geneve,
+                                 br_int, chassis, &ct_zones, &lports,
+                                 &flow_table, &local_datapaths);
 
-                physical_run(&ctx, mff_ovn_geneve,
-                             br_int, chassis, &ct_zones, &lports,
-                             &flow_table, &local_datapaths);
-
-                ofctrl_put(&flow_table, &pending_ct_zones,
-                           get_nb_cfg(ctx.ovnsb_idl));
-
-                hmap_destroy(&flow_table);
+                    ofctrl_put(&flow_table, &pending_ct_zones,
+                               get_nb_cfg(ctx.ovnsb_idl));
 
+                    hmap_destroy(&flow_table);
+                }
                 if (ctx.ovnsb_idl_txn) {
                     int64_t cur_cfg = ofctrl_get_cur_cfg();
                     if (cur_cfg && cur_cfg != chassis->nb_cfg) {