[ovs-dev,2/2] ovn-controller: Change strategy for gateway conntrack zone allocation.
diff mbox

Message ID 1467963486-32094-1-git-send-email-guru@ovn.org
State Changes Requested
Headers show

Commit Message

Guru Shetty July 8, 2016, 7:38 a.m. UTC
Commit 263064aeaa31e7 (Convert binding_run to incremental processing.)
changed the way patched_datapaths were handled. Previously we would
destroy the datastructure in every run and re-create it fresh. The new
way causes problems with the way conntrack zones are allocated as now
we can have stale port_binding entries causing segmentation faults.

With this commit, we simply don't depend on port_binding records in
conntrack zone allocation and instead store the UUID as a string in
the patch_datapath datastructure.

Fixes: 263064aeaa31e7 ("Convert binding_run to incremental processing.")
Signed-off-by: Gurucharan Shetty <guru@ovn.org>
---
This applies on top of https://patchwork.ozlabs.org/patch/646268/.
I will rebase this if the dependent patch changes.
---
 ovn/controller/ovn-controller.c | 4 ++--
 ovn/controller/ovn-controller.h | 2 +-
 ovn/controller/patch.c          | 4 +++-
 ovn/controller/physical.c       | 8 ++++++--
 ovn/lib/ovn-util.c              | 8 +++-----
 ovn/lib/ovn-util.h              | 3 +--
 6 files changed, 16 insertions(+), 13 deletions(-)

Comments

Ryan Moats July 8, 2016, 5:51 p.m. UTC | #1
"dev" <dev-bounces@openvswitch.org> wrote on 07/08/2016 02:38:06 AM:

> From: Gurucharan Shetty <guru@ovn.org>
> To: dev@openvswitch.org
> Date: 07/08/2016 12:36 PM
> Subject: [ovs-dev] [PATCH 2/2] ovn-controller: Change strategy for
> gateway conntrack zone allocation.
> Sent by: "dev" <dev-bounces@openvswitch.org>
>
> Commit 263064aeaa31e7 (Convert binding_run to incremental processing.)
> changed the way patched_datapaths were handled. Previously we would
> destroy the datastructure in every run and re-create it fresh. The new
> way causes problems with the way conntrack zones are allocated as now
> we can have stale port_binding entries causing segmentation faults.
>
> With this commit, we simply don't depend on port_binding records in
> conntrack zone allocation and instead store the UUID as a string in
> the patch_datapath datastructure.
>
> Fixes: 263064aeaa31e7 ("Convert binding_run to incremental processing.")
> Signed-off-by: Gurucharan Shetty <guru@ovn.org>
> ---

I like what this is doing, but I'd like it more if it had a test case
that would fail without this patch, but pass with it, so that we don't
regress...

Ryan
Guru Shetty July 8, 2016, 5:55 p.m. UTC | #2
On 8 July 2016 at 10:51, Ryan Moats <rmoats@us.ibm.com> wrote:

> "dev" <dev-bounces@openvswitch.org> wrote on 07/08/2016 02:38:06 AM:
>
> > From: Gurucharan Shetty <guru@ovn.org>
> > To: dev@openvswitch.org
> > Date: 07/08/2016 12:36 PM
> > Subject: [ovs-dev] [PATCH 2/2] ovn-controller: Change strategy for
> > gateway conntrack zone allocation.
> > Sent by: "dev" <dev-bounces@openvswitch.org>
> >
> > Commit 263064aeaa31e7 (Convert binding_run to incremental processing.)
> > changed the way patched_datapaths were handled. Previously we would
> > destroy the datastructure in every run and re-create it fresh. The new
> > way causes problems with the way conntrack zones are allocated as now
> > we can have stale port_binding entries causing segmentation faults.
> >
> > With this commit, we simply don't depend on port_binding records in
> > conntrack zone allocation and instead store the UUID as a string in
> > the patch_datapath datastructure.
> >
> > Fixes: 263064aeaa31e7 ("Convert binding_run to incremental processing.")
> > Signed-off-by: Gurucharan Shetty <guru@ovn.org>
> > ---
>
> I like what this is doing, but I'd like it more if it had a test case
> that would fail without this patch, but pass with it, so that we don't
> regress...
>
Ryan,
 With the above referenced patch that this issue fixes, stale patched
datapaths were simply not removed. And none of the OVN tests caught it. We
can't catch this unless we start adding negative tests (for e.g: delete
records and look to see if ovn-controller deleted patch ports in br-int).
Or do you have something else in mind?



>
> Ryan
>
> _______________________________________________
> dev mailing list
> dev@openvswitch.org
> http://openvswitch.org/mailman/listinfo/dev
>
Ryan Moats July 8, 2016, 6:24 p.m. UTC | #3
Guru Shetty <guru@ovn.org> wrote on 07/08/2016 12:55:06 PM:

> From: Guru Shetty <guru@ovn.org>
> To: Ryan Moats/Omaha/IBM@IBMUS
> Cc: ovs dev <dev@openvswitch.org>
> Date: 07/08/2016 12:55 PM
> Subject: Re: [ovs-dev] [PATCH 2/2] ovn-controller: Change strategy
> for gateway conntrack zone allocation.
>
> On 8 July 2016 at 10:51, Ryan Moats <rmoats@us.ibm.com> wrote:
> "dev" <dev-bounces@openvswitch.org> wrote on 07/08/2016 02:38:06 AM:
>
> > From: Gurucharan Shetty <guru@ovn.org>
> > To: dev@openvswitch.org
> > Date: 07/08/2016 12:36 PM
> > Subject: [ovs-dev] [PATCH 2/2] ovn-controller: Change strategy for
> > gateway conntrack zone allocation.
> > Sent by: "dev" <dev-bounces@openvswitch.org>
> >
> > Commit 263064aeaa31e7 (Convert binding_run to incremental processing.)
> > changed the way patched_datapaths were handled. Previously we would
> > destroy the datastructure in every run and re-create it fresh. The new
> > way causes problems with the way conntrack zones are allocated as now
> > we can have stale port_binding entries causing segmentation faults.
> >
> > With this commit, we simply don't depend on port_binding records in
> > conntrack zone allocation and instead store the UUID as a string in
> > the patch_datapath datastructure.
> >
> > Fixes: 263064aeaa31e7 ("Convert binding_run to incremental
processing.")
> > Signed-off-by: Gurucharan Shetty <guru@ovn.org>
> > ---
>
> I like what this is doing, but I'd like it more if it had a test case
> that would fail without this patch, but pass with it, so that we don't
> regress...
> Ryan,
>  With the above referenced patch that this issue fixes, stale
> patched datapaths were simply not removed. And none of the OVN tests
> caught it. We can't catch this unless we start adding negative tests
> (for e.g: delete records and look to see if ovn-controller deleted
> patch ports in br-int). Or do you have something else in mind?

If this is going to cause a core dump, how about a test that runs
through the steps that cause a core dump and then check to see if
ovn-controller is still running?  That's what was done for the patch
that fixed the double destroy bug from the original address set patch
series.

Ryan

Patch
diff mbox

diff --git a/ovn/controller/ovn-controller.c b/ovn/controller/ovn-controller.c
index 8471f64..28ee13e 100644
--- a/ovn/controller/ovn-controller.c
+++ b/ovn/controller/ovn-controller.c
@@ -251,8 +251,8 @@  update_ct_zones(struct sset *lports, struct hmap *patched_datapaths,
             continue;
         }
 
-        char *dnat = alloc_nat_zone_key(pd->port_binding, "dnat");
-        char *snat = alloc_nat_zone_key(pd->port_binding, "snat");
+        char *dnat = alloc_nat_zone_key(pd->key, "dnat");
+        char *snat = alloc_nat_zone_key(pd->key, "snat");
         sset_add(&all_users, dnat);
         sset_add(&all_users, snat);
         free(dnat);
diff --git a/ovn/controller/ovn-controller.h b/ovn/controller/ovn-controller.h
index f3edc43..470b1f5 100644
--- a/ovn/controller/ovn-controller.h
+++ b/ovn/controller/ovn-controller.h
@@ -51,7 +51,7 @@  struct local_datapath *get_local_datapath(const struct hmap *,
  * with at least one logical patch port binding. */
 struct patched_datapath {
     struct hmap_node hmap_node;
-    const struct sbrec_port_binding *port_binding;
+    char *key;  /* Holds the uuid of the corresponding datapath. */
     bool local; /* 'True' if the datapath is for gateway router. */
     bool stale; /* 'True' if the datapath is not referenced by any patch
                  * port. */
diff --git a/ovn/controller/patch.c b/ovn/controller/patch.c
index 3825f31..9a5c96f 100644
--- a/ovn/controller/patch.c
+++ b/ovn/controller/patch.c
@@ -263,7 +263,8 @@  add_patched_datapath(struct hmap *patched_datapaths,
 
     pd = xzalloc(sizeof *pd);
     pd->local = local;
-    pd->port_binding = binding_rec;
+    pd->key = xasprintf(UUID_FMT,
+                        UUID_ARGS(&binding_rec->datapath->header_.uuid));
     /* stale is set to false. */
     hmap_insert(patched_datapaths, &pd->hmap_node,
                 binding_rec->datapath->tunnel_key);
@@ -291,6 +292,7 @@  add_logical_patch_ports_postprocess(struct hmap *patched_datapaths)
                         patched_datapaths) {
         if (pd_cur_node->stale == true) {
             hmap_remove(patched_datapaths, &pd_cur_node->hmap_node);
+            free(pd_cur_node->key);
             free(pd_cur_node);
         }
     }
diff --git a/ovn/controller/physical.c b/ovn/controller/physical.c
index f7389ea..d1b40c2 100644
--- a/ovn/controller/physical.c
+++ b/ovn/controller/physical.c
@@ -297,8 +297,12 @@  consider_port_binding(struct hmap *flow_table,
         }
 
         int zone_id_dnat, zone_id_snat;
-        char *dnat = alloc_nat_zone_key(binding, "dnat");
-        char *snat = alloc_nat_zone_key(binding, "snat");
+        char *key = xasprintf(UUID_FMT,
+                              UUID_ARGS(&binding->datapath->header_.uuid));
+        char *dnat = alloc_nat_zone_key(key, "dnat");
+        char *snat = alloc_nat_zone_key(key, "snat");
+        free(key);
+
         zone_id_dnat = simap_get(ct_zones, dnat);
         if (zone_id_dnat) {
             put_load(zone_id_dnat, MFF_LOG_DNAT_ZONE, 0, 32, ofpacts_p);
diff --git a/ovn/lib/ovn-util.c b/ovn/lib/ovn-util.c
index 6113087..6e94083 100644
--- a/ovn/lib/ovn-util.c
+++ b/ovn/lib/ovn-util.c
@@ -105,13 +105,11 @@  extract_lsp_addresses(char *address, struct lport_addresses *laddrs,
 }
 
 /* Allocates a key for NAT conntrack zone allocation for a provided
- * 'port_binding' record and a 'type'.
+ * 'key' record and a 'type'.
  *
  * It is the caller's responsibility to free the allocated memory. */
 char *
-alloc_nat_zone_key(const struct sbrec_port_binding *port_binding,
-                   const char *type)
+alloc_nat_zone_key(const char *key, const char *type)
 {
-    return xasprintf(UUID_FMT"_%s",
-                     UUID_ARGS(&port_binding->datapath->header_.uuid), type);
+    return xasprintf("%s_%s", key, type);
 }
diff --git a/ovn/lib/ovn-util.h b/ovn/lib/ovn-util.h
index d23f4af..98b1426 100644
--- a/ovn/lib/ovn-util.h
+++ b/ovn/lib/ovn-util.h
@@ -43,6 +43,5 @@  extract_lsp_addresses(char *address, struct lport_addresses *laddrs,
                       bool store_ipv6);
 
 char *
-alloc_nat_zone_key(const struct sbrec_port_binding *port_binding,
-                   const char *type);
+alloc_nat_zone_key(const char *key, const char *type);
 #endif