[ovs-dev,v5,RFC] ovn-controller: add quiet mode

Submitted by Han Zhou on April 20, 2017, 5:10 p.m.

Details

Message ID 1492708201-53895-1-git-send-email-zhouhan@gmail.com
State New
Headers show

Commit Message

Han Zhou April 20, 2017, 5:10 p.m.
As discussed in [1], what the incremental processing code
actually accomplished was that the ovn-controller would
be "quiet" and not burn CPU when things weren't changing.
This patch set recreates this state by calculating whether
changes have occured that would require a full calculation
to be performed.  It does this by persisting a copy of
the localvif_to_ofport and tunnel information in the
controller module, rather than in the physical.c module
as was the case with previous commits.

Performance improved extremely in below test scenario:

- 1 lswitch with 10 lports bound locally
- Each lport has an ingress ACL, referencing the same address-set
- The address-set has 10,000 IPv4 addresses

For each IP address in the address-set, there will be 3
OpenFlow rules generated for each ACL. So the total number
of rules is 300k+.

Without the patch, it takes 50+ minutes to install all the
rules to ovs-vswitchd.

With the patch, it takes 20 seconds to install all the rules
to ovs-vswitchd.

The reason is that the large number of rules are sent to
ovs-vswitchd gradually in many iterations of ovn-controller
main loop. Without the patch, cpu cycles are wasted in
lflow_run to re-processing the large address set in every
main loop iteration. With the patch, lflow_run is not
executed in most iterations because there is no change of
input.

[1] http://openvswitch.org/pipermail/dev/2016-August/078272.html

Signed-off-by: Ryan Moats <rmoats@us.ibm.com>
Signed-off-by: Han Zhou <zhouhan@gmail.com>
---

Notes:
    v4->v5: Based on the old patch from Ryan Moats and rebased to current
    master. Current version is RFC only because a problem is found in
    testing. Sometimes latest SB DB change is not seen by ovn-controller:
    cur_seqno in ovn-controller stays behind the real seqno seen by
    ovsdb-tool, but ovn-controller does not wakeup until a new change
    happens in SB DB, which can be triggered by, e.g.:
    ovn-nbctl --wait=hv sync.

 ovn/controller/ofctrl.c         |   2 +
 ovn/controller/ovn-controller.c |  59 ++++++++++++++++-------
 ovn/controller/ovn-controller.h |   1 +
 ovn/controller/patch.c          |   2 +
 ovn/controller/physical.c       | 102 +++++++++++++++++++++++++---------------
 ovn/controller/physical.h       |   5 ++
 6 files changed, 116 insertions(+), 55 deletions(-)

Comments

Han Zhou April 20, 2017, 10:14 p.m.
>
> [1] http://openvswitch.org/pipermail/dev/2016-August/078272.html

This link is old, should be:
https://mail.openvswitch.org/pipermail/ovs-dev/2016-August/078272.html


> Notes:
>     v4->v5: Based on the old patch from Ryan Moats and rebased to current
>     master. Current version is RFC only because a problem is found in
>     testing. Sometimes latest SB DB change is not seen by ovn-controller:
>     cur_seqno in ovn-controller stays behind the real seqno seen by
>     ovsdb-tool,

Please ignore this ovsdb-tool part, which is misleading. ovsdb-tool would
never see seqno of an idl :o)
The problem seems to be related to ctx.ovs_idl_txn. I am still debugging.

>                       but ovn-controller does not wakeup until a new
change
>     happens in SB DB, which can be triggered by, e.g.:
>     ovn-nbctl --wait=hv sync.

Thanks,
Han
Han Zhou April 21, 2017, 6:07 a.m.
On Thu, Apr 20, 2017 at 3:14 PM, Han Zhou <zhouhan@gmail.com> wrote:
>
>
>
>
> >
> > [1] http://openvswitch.org/pipermail/dev/2016-August/078272.html
>
> This link is old, should be:
https://mail.openvswitch.org/pipermail/ovs-dev/2016-August/078272.html
>
>
> > Notes:
> >     v4->v5: Based on the old patch from Ryan Moats and rebased to
current
> >     master. Current version is RFC only because a problem is found in
> >     testing. Sometimes latest SB DB change is not seen by
ovn-controller:
> >     cur_seqno in ovn-controller stays behind the real seqno seen by
> >     ovsdb-tool,
>
> Please ignore this ovsdb-tool part, which is misleading. ovsdb-tool would
never see seqno of an idl :o)
> The problem seems to be related to ctx.ovs_idl_txn. I am still debugging.
>
The root cause of the problem is found and fixed, so I submitted formal v5:

https://mail.openvswitch.org/pipermail/ovs-dev/2017-April/331226.html

The root cause is mentioned in the notes.

Thanks,
Han

Patch hide | download patch | download mbox

diff --git a/ovn/controller/ofctrl.c b/ovn/controller/ofctrl.c
index 10c8105..20e4ab6 100644
--- a/ovn/controller/ofctrl.c
+++ b/ovn/controller/ofctrl.c
@@ -383,6 +383,8 @@  run_S_CLEAR_FLOWS(void)
     queue_msg(encode_group_mod(&gm));
     ofputil_uninit_group_mod(&gm);
 
+    force_full_process();
+
     /* Clear existing groups, to match the state of the switch. */
     if (groups) {
         ovn_group_table_clear(groups, true);
diff --git a/ovn/controller/ovn-controller.c b/ovn/controller/ovn-controller.c
index e00f57a..2822eb7 100644
--- a/ovn/controller/ovn-controller.c
+++ b/ovn/controller/ovn-controller.c
@@ -484,6 +484,23 @@  get_nb_cfg(struct ovsdb_idl *idl)
     return sb ? sb->nb_cfg : 0;
 }
 
+/* Contains mapping of localvifs to OF ports for detecting if the physical
+ * configuration of the switch has changed. */
+static struct simap localvif_to_ofport =
+    SIMAP_INITIALIZER(&localvif_to_ofport);
+
+/* Contains the list of known tunnels. */
+static struct hmap tunnels = HMAP_INITIALIZER(&tunnels);
+
+/* The last sequence number seen from the southbound IDL. */
+static unsigned int seqno = 0;
+
+void
+force_full_process(void)
+{
+    seqno = 0;
+}
+
 int
 main(int argc, char *argv[])
 {
@@ -613,10 +630,8 @@  main(int argc, char *argv[])
 
         struct ldatapath_index ldatapaths;
         struct lport_index lports;
-        struct mcgroup_index mcgroups;
         ldatapath_index_init(&ldatapaths, ctx.ovnsb_idl);
         lport_index_init(&lports, ctx.ovnsb_idl);
-        mcgroup_index_init(&mcgroups, ctx.ovnsb_idl);
 
         const struct sbrec_chassis *chassis = NULL;
         if (chassis_id) {
@@ -626,36 +641,47 @@  main(int argc, char *argv[])
                         &local_datapaths, &local_lports);
         }
 
+        bool physical_change = true;
         if (br_int && chassis) {
+            enum mf_field_id mff_ovn_geneve = ofctrl_run(br_int,
+                                                         &pending_ct_zones);
+            physical_change = detect_and_save_physical_changes(
+                &localvif_to_ofport, &tunnels, mff_ovn_geneve, br_int,
+                chassis);
+            unsigned int cur_seqno = ovsdb_idl_get_seqno(ovnsb_idl_loop.idl);
+
             struct shash addr_sets = SHASH_INITIALIZER(&addr_sets);
             addr_sets_init(&ctx, &addr_sets);
 
             patch_run(&ctx, br_int, chassis, &local_datapaths);
 
-            enum mf_field_id mff_ovn_geneve = ofctrl_run(br_int,
-                                                         &pending_ct_zones);
-
             pinctrl_run(&ctx, &lports, br_int, chassis, &local_datapaths);
             update_ct_zones(&local_lports, &local_datapaths, &ct_zones,
                             ct_zone_bitmap, &pending_ct_zones);
             if (ctx.ovs_idl_txn) {
+                if (physical_change || seqno < cur_seqno) {
+                    seqno = cur_seqno;
 
-                commit_ct_zones(br_int, &pending_ct_zones);
+                    commit_ct_zones(br_int, &pending_ct_zones);
 
-                struct hmap flow_table = HMAP_INITIALIZER(&flow_table);
-                lflow_run(&ctx, chassis, &lports, &mcgroups,
-                          &local_datapaths, &group_table, &ct_zones,
-                          &addr_sets, &flow_table);
+                    struct mcgroup_index mcgroups;
+                    mcgroup_index_init(&mcgroups, ctx.ovnsb_idl);
 
-                physical_run(&ctx, mff_ovn_geneve,
-                             br_int, chassis, &ct_zones, &lports,
-                             &flow_table, &local_datapaths);
+                    struct hmap flow_table = HMAP_INITIALIZER(&flow_table);
+                    lflow_run(&ctx, chassis, &lports, &mcgroups,
+                              &local_datapaths, &group_table, &ct_zones,
+                              &addr_sets, &flow_table);
 
-                ofctrl_put(&flow_table, &pending_ct_zones,
-                           get_nb_cfg(ctx.ovnsb_idl));
+                    physical_run(&ctx, mff_ovn_geneve,
+                                 br_int, chassis, &ct_zones, &lports,
+                                 &flow_table, &local_datapaths);
 
-                hmap_destroy(&flow_table);
+                    ofctrl_put(&flow_table, &pending_ct_zones,
+                               get_nb_cfg(ctx.ovnsb_idl));
 
+                    hmap_destroy(&flow_table);
+                    mcgroup_index_destroy(&mcgroups);
+                }
                 if (ctx.ovnsb_idl_txn) {
                     int64_t cur_cfg = ofctrl_get_cur_cfg();
                     if (cur_cfg && cur_cfg != chassis->nb_cfg) {
@@ -693,7 +719,6 @@  main(int argc, char *argv[])
             free(pending_pkt.flow_s);
         }
 
-        mcgroup_index_destroy(&mcgroups);
         lport_index_destroy(&lports);
         ldatapath_index_destroy(&ldatapaths);
 
diff --git a/ovn/controller/ovn-controller.h b/ovn/controller/ovn-controller.h
index 4bc0467..9a782d1 100644
--- a/ovn/controller/ovn-controller.h
+++ b/ovn/controller/ovn-controller.h
@@ -87,5 +87,6 @@  enum chassis_tunnel_type {
 
 uint32_t get_tunnel_type(const char *name);
 
+void force_full_process(void);
 
 #endif /* ovn/ovn-controller.h */
diff --git a/ovn/controller/patch.c b/ovn/controller/patch.c
index 158413e..5e383a5 100644
--- a/ovn/controller/patch.c
+++ b/ovn/controller/patch.c
@@ -96,6 +96,7 @@  create_patch_port(struct controller_ctx *ctx,
     ovsrec_bridge_set_ports(src, ports, src->n_ports + 1);
 
     free(ports);
+    force_full_process();
 }
 
 static void
@@ -124,6 +125,7 @@  remove_port(struct controller_ctx *ctx,
             ovsrec_bridge_set_ports(bridge, new_ports, bridge->n_ports - 1);
             free(new_ports);
             ovsrec_port_delete(port);
+            force_full_process();
             return;
         }
     }
diff --git a/ovn/controller/physical.c b/ovn/controller/physical.c
index 0f1aa63..084eea2 100644
--- a/ovn/controller/physical.c
+++ b/ovn/controller/physical.c
@@ -56,10 +56,6 @@  physical_register_ovs_idl(struct ovsdb_idl *ovs_idl)
     ovsdb_idl_add_column(ovs_idl, &ovsrec_interface_col_external_ids);
 }
 
-static struct simap localvif_to_ofport =
-    SIMAP_INITIALIZER(&localvif_to_ofport);
-static struct hmap tunnels = HMAP_INITIALIZER(&tunnels);
-
 /* Maps from a chassis to the OpenFlow port number of the tunnel that can be
  * used to reach that chassis. */
 struct chassis_tunnel {
@@ -70,11 +66,11 @@  struct chassis_tunnel {
 };
 
 static struct chassis_tunnel *
-chassis_tunnel_find(const char *chassis_id)
+chassis_tunnel_find(struct hmap *tunnels_p, const char *chassis_id)
 {
     struct chassis_tunnel *tun;
     HMAP_FOR_EACH_WITH_HASH (tun, hmap_node, hash_string(chassis_id, 0),
-                             &tunnels) {
+                             tunnels_p) {
         if (!strcmp(tun->chassis_id, chassis_id)) {
             return tun;
         }
@@ -293,7 +289,9 @@  consider_port_binding(enum mf_field_id mff_ovn_geneve,
                       const struct sbrec_port_binding *binding,
                       const struct sbrec_chassis *chassis,
                       struct ofpbuf *ofpacts_p,
-                      struct hmap *flow_table)
+                      struct hmap *flow_table,
+                      struct simap *localvif_to_ofport_p,
+                      struct hmap *tunnels_p)
 {
     uint32_t dp_key = binding->datapath->tunnel_key;
     uint32_t port_key = binding->tunnel_key;
@@ -444,14 +442,14 @@  consider_port_binding(enum mf_field_id mff_ovn_geneve,
         if (!binding->tag) {
             return;
         }
-        ofport = u16_to_ofp(simap_get(&localvif_to_ofport,
+        ofport = u16_to_ofp(simap_get(localvif_to_ofport_p,
                                       binding->parent_port));
         if (ofport) {
             tag = *binding->tag;
             nested_container = true;
         }
     } else {
-        ofport = u16_to_ofp(simap_get(&localvif_to_ofport,
+        ofport = u16_to_ofp(simap_get(localvif_to_ofport_p,
                                       binding->logical_port));
         if ((!strcmp(binding->type, "localnet")
             || !strcmp(binding->type, "l2gateway"))
@@ -470,13 +468,13 @@  consider_port_binding(enum mf_field_id mff_ovn_geneve,
             return;
         }
         if (localnet_port) {
-            ofport = u16_to_ofp(simap_get(&localvif_to_ofport,
+            ofport = u16_to_ofp(simap_get(localvif_to_ofport_p,
                                           localnet_port->logical_port));
             if (!ofport) {
                 return;
             }
         } else {
-            tun = chassis_tunnel_find(binding->chassis->name);
+            tun = chassis_tunnel_find(tunnels_p, binding->chassis->name);
             if (!tun) {
                 return;
             }
@@ -633,7 +631,9 @@  consider_mc_group(enum mf_field_id mff_ovn_geneve,
                   const struct sbrec_multicast_group *mc,
                   struct ofpbuf *ofpacts_p,
                   struct ofpbuf *remote_ofpacts_p,
-                  struct hmap *flow_table)
+                  struct hmap *flow_table,
+                  struct simap *localvif_to_ofport_p,
+                  struct hmap *tunnels_p)
 {
     uint32_t dp_key = mc->datapath->tunnel_key;
     if (!get_local_datapath(local_datapaths, dp_key)) {
@@ -681,7 +681,7 @@  consider_mc_group(enum mf_field_id mff_ovn_geneve,
             put_load(port->tunnel_key, MFF_LOG_OUTPORT, 0, 32,
                      remote_ofpacts_p);
             put_resubmit(OFTABLE_CHECK_LOOPBACK, remote_ofpacts_p);
-        } else if (simap_contains(&localvif_to_ofport,
+        } else if (simap_contains(localvif_to_ofport_p,
                            (port->parent_port && *port->parent_port)
                            ? port->parent_port : port->logical_port)
                    || (!strcmp(port->type, "l3gateway")
@@ -729,7 +729,7 @@  consider_mc_group(enum mf_field_id mff_ovn_geneve,
         const struct chassis_tunnel *prev = NULL;
         SSET_FOR_EACH (chassis, &remote_chassis) {
             const struct chassis_tunnel *tun
-                = chassis_tunnel_find(chassis);
+                = chassis_tunnel_find(tunnels_p, chassis);
             if (!tun) {
                 continue;
             }
@@ -753,12 +753,15 @@  consider_mc_group(enum mf_field_id mff_ovn_geneve,
     sset_destroy(&remote_chassis);
 }
 
-void
-physical_run(struct controller_ctx *ctx, enum mf_field_id mff_ovn_geneve,
-             const struct ovsrec_bridge *br_int,
-             const struct sbrec_chassis *chassis,
-             const struct simap *ct_zones, struct lport_index *lports,
-             struct hmap *flow_table, struct hmap *local_datapaths)
+/* Scans the bridge 'br_int' and compares it to the supplied expected state.
+ * Returns true and updates the expected state if changes are detected.
+ * Returns false if no changes are found. */
+bool
+detect_and_save_physical_changes(struct simap *localvif_to_ofport_p,
+                                 struct hmap *tunnels_p,
+                                 enum mf_field_id mff_ovn_geneve,
+                                 const struct ovsrec_bridge *br_int,
+                                 const struct sbrec_chassis *chassis)
 {
 
     /* This bool tracks physical mapping changes. */
@@ -824,7 +827,8 @@  physical_run(struct controller_ctx *ctx, enum mf_field_id mff_ovn_geneve,
                 }
 
                 simap_put(&new_tunnel_to_ofport, chassis_id, ofport);
-                struct chassis_tunnel *tun = chassis_tunnel_find(chassis_id);
+                struct chassis_tunnel *tun = chassis_tunnel_find(tunnels_p,
+                                                                 chassis_id);
                 if (tun) {
                     /* If the tunnel's ofport has changed, update. */
                     if (tun->ofport != u16_to_ofp(ofport) ||
@@ -835,7 +839,7 @@  physical_run(struct controller_ctx *ctx, enum mf_field_id mff_ovn_geneve,
                     }
                 } else {
                     tun = xmalloc(sizeof *tun);
-                    hmap_insert(&tunnels, &tun->hmap_node,
+                    hmap_insert(tunnels_p, &tun->hmap_node,
                                 hash_string(chassis_id, 0));
                     tun->chassis_id = chassis_id;
                     tun->ofport = u16_to_ofp(ofport);
@@ -855,9 +859,9 @@  physical_run(struct controller_ctx *ctx, enum mf_field_id mff_ovn_geneve,
 
     /* Remove tunnels that are no longer here. */
     struct chassis_tunnel *tun, *tun_next;
-    HMAP_FOR_EACH_SAFE (tun, tun_next, hmap_node, &tunnels) {
+    HMAP_FOR_EACH_SAFE (tun, tun_next, hmap_node, tunnels_p) {
         if (!simap_find(&new_tunnel_to_ofport, tun->chassis_id)) {
-            hmap_remove(&tunnels, &tun->hmap_node);
+            hmap_remove(tunnels_p, &tun->hmap_node);
             physical_map_changed = true;
             free(tun);
         }
@@ -865,29 +869,45 @@  physical_run(struct controller_ctx *ctx, enum mf_field_id mff_ovn_geneve,
 
     /* Capture changed or removed openflow ports. */
     struct simap_node *vif_name, *vif_name_next;
-    SIMAP_FOR_EACH_SAFE (vif_name, vif_name_next, &localvif_to_ofport) {
+    SIMAP_FOR_EACH_SAFE (vif_name, vif_name_next, localvif_to_ofport_p) {
         int newport;
         if ((newport = simap_get(&new_localvif_to_ofport, vif_name->name))) {
-            if (newport != simap_get(&localvif_to_ofport, vif_name->name)) {
-                simap_put(&localvif_to_ofport, vif_name->name, newport);
+            if (newport != simap_get(localvif_to_ofport_p, vif_name->name)) {
+                simap_put(localvif_to_ofport_p, vif_name->name, newport);
                 physical_map_changed = true;
             }
         } else {
-            simap_find_and_delete(&localvif_to_ofport, vif_name->name);
+            simap_find_and_delete(localvif_to_ofport_p, vif_name->name);
             physical_map_changed = true;
         }
     }
     SIMAP_FOR_EACH (vif_name, &new_localvif_to_ofport) {
-        if (!simap_get(&localvif_to_ofport, vif_name->name)) {
-            simap_put(&localvif_to_ofport, vif_name->name,
+        if (!simap_get(localvif_to_ofport_p, vif_name->name)) {
+            simap_put(localvif_to_ofport_p, vif_name->name,
                       simap_get(&new_localvif_to_ofport, vif_name->name));
             physical_map_changed = true;
         }
     }
-    if (physical_map_changed) {
-        /* Reprocess logical flow table immediately. */
-        poll_immediate_wake();
-    }
+
+
+    simap_destroy(&new_localvif_to_ofport);
+    simap_destroy(&new_tunnel_to_ofport);
+
+    return physical_map_changed;
+}
+
+void
+physical_run(struct controller_ctx *ctx, enum mf_field_id mff_ovn_geneve,
+             const struct ovsrec_bridge *br_int,
+             const struct sbrec_chassis *chassis,
+             const struct simap *ct_zones, struct lport_index *lports,
+             struct hmap *flow_table, struct hmap *local_datapaths)
+{
+    struct simap localvif_to_ofport = SIMAP_INITIALIZER(&localvif_to_ofport);
+    struct hmap tunnels = HMAP_INITIALIZER(&tunnels);
+
+    detect_and_save_physical_changes(&localvif_to_ofport, &tunnels,
+                                     mff_ovn_geneve, br_int, chassis);
 
     struct ofpbuf ofpacts;
     ofpbuf_init(&ofpacts, 0);
@@ -898,7 +918,8 @@  physical_run(struct controller_ctx *ctx, enum mf_field_id mff_ovn_geneve,
     SBREC_PORT_BINDING_FOR_EACH (binding, ctx->ovnsb_idl) {
         consider_port_binding(mff_ovn_geneve, ct_zones, lports,
                               local_datapaths, binding, chassis,
-                              &ofpacts, flow_table);
+                              &ofpacts, flow_table,
+                              &localvif_to_ofport, &tunnels);
     }
 
     /* Handle output to multicast groups, in tables 32 and 33. */
@@ -907,7 +928,8 @@  physical_run(struct controller_ctx *ctx, enum mf_field_id mff_ovn_geneve,
     ofpbuf_init(&remote_ofpacts, 0);
     SBREC_MULTICAST_GROUP_FOR_EACH (mc, ctx->ovnsb_idl) {
         consider_mc_group(mff_ovn_geneve, ct_zones, local_datapaths, chassis,
-                          mc, &ofpacts, &remote_ofpacts, flow_table);
+                          mc, &ofpacts, &remote_ofpacts, flow_table,
+                          &localvif_to_ofport, &tunnels);
     }
 
     ofpbuf_uninit(&remote_ofpacts);
@@ -923,6 +945,7 @@  physical_run(struct controller_ctx *ctx, enum mf_field_id mff_ovn_geneve,
      * ports.  We set MFF_LOG_DATAPATH, MFF_LOG_INPORT, and
      * MFF_LOG_OUTPORT from the tunnel key data, then resubmit to table
      * 33 to handle packets to the local hypervisor. */
+    struct chassis_tunnel *tun;
     HMAP_FOR_EACH (tun, hmap_node, &tunnels) {
         struct match match = MATCH_CATCHALL_INITIALIZER;
         match_set_in_port(&match, tun->ofport);
@@ -1041,6 +1064,9 @@  physical_run(struct controller_ctx *ctx, enum mf_field_id mff_ovn_geneve,
 
     ofpbuf_uninit(&ofpacts);
 
-    simap_destroy(&new_localvif_to_ofport);
-    simap_destroy(&new_tunnel_to_ofport);
+    simap_destroy(&localvif_to_ofport);
+    HMAP_FOR_EACH_POP (tun, hmap_node, &tunnels) {
+        free(tun);
+    }
+    hmap_destroy(&tunnels);
 }
diff --git a/ovn/controller/physical.h b/ovn/controller/physical.h
index e2debed..fc3d4c7 100644
--- a/ovn/controller/physical.h
+++ b/ovn/controller/physical.h
@@ -41,6 +41,11 @@  struct simap;
 #define OVN_GENEVE_LEN 4
 
 void physical_register_ovs_idl(struct ovsdb_idl *);
+bool detect_and_save_physical_changes(struct simap *localvif_to_ofport_p,
+                                      struct hmap *tunnels_p,
+                                      enum mf_field_id mff_ovn_geneve,
+                                      const struct ovsrec_bridge *br_int,
+                                      const struct sbrec_chassis *chassis);
 void physical_run(struct controller_ctx *, enum mf_field_id mff_ovn_geneve,
                   const struct ovsrec_bridge *br_int,
                   const struct sbrec_chassis *chassis,