diff mbox series

[ovs-dev,v1] ovn-controller: Fix the CT zone assignment logic for logical routers

Message ID 1596513334-11673-1-git-send-email-svc.mail.git@nutanix.com
State Changes Requested
Headers show
Series [ovs-dev,v1] ovn-controller: Fix the CT zone assignment logic for logical routers | expand

Commit Message

Ankur Sharma Aug. 4, 2020, 3:55 a.m. UTC
From: Ankur Sharma <ankur.sharma@nutanix.com>

BACKGROUND:
a. ovn-controller assigns CT ZONES for local ports and datapaths.
b. If a local port/datapath is cleaned up from a chassis, then
   corresponding CT ZONE is "unassigned"/"freed" up.

ISSUE:
Above logic and implementations leaves stale CT entries in the
datapath, which may get reused unexpectedly, thereby causing
issues like, packets going through ct_nat(SNAT_IP_NEW) and getting
a stale IP as SNAT IP etc.

a. As a part of CT Zone unassign, implementation should FLUSH the
   corresponding CT entries, i.e it should do FLUSH by ZONE.
   As os now, implementation avoids the flushing, thereby leaving
   stale CT entries.

b. Similarly, since the implementation relies on datapath existence
   for assign/unassign of CT ZONEs. Hence, simple operations like
   moving the logical router from one external logical switch to
   another, may not cause any CT ZONE reassignment and thereby
   stale CT entries might get consumed, when they should not have
   been.

c. a. and b. combined causes following:
   i. Start a to be SNATed traffic from internal endpoint to an
      external endpoint. Let us say internal endpoint IP is
      50.0.0.10 and external endpoint ip is 8.8.8.8 and
      logical router port ip (and hence SNAT ip) is 100.0.0.10.

  ii. Detach the logical router from old external logical switch
      and attach to new external logical switch. As a result
      of this operation, new router port ip becomes 200.0.0.10
      , which also becomes the new SNAT ip.

 iii. The observation has been that traffic initiated in i. above
      still ends up using OLD SNAT IP, i.e 100.0.0.10, rather than
      200.0.0.10

 iv. iii. above happened, because although from OVS DP, the IP for
     NAT action is 200.0.0.10, however, since its an ongoing traffic,
     hence the CT entries come in use and end up NATing to old SNAT
     ip 100.0.0.10. For example:

     OVS DP STATE
     recirc_id(0),in_port(16),....ct(commit,zone=1,nat(src=200.0.0.10))

     CT STATE
     icmp,orig=(src=50.0.0.10,dst=8.8.8.8,id=2288,type=8,code=0),
     reply=(src=8.8.8.8,dst=100.0.0.10,id=2288,type=0,code=0),zone=1

FIX:
This patch improves the overall CT ZONE management by doing following:
a. Do a FLUSH by CT ZONE, once we identify that a zone has to be freed up.
b. From datapath perspective, restrict the CT ZONE assignment ONLY
   to logical routers that has NAT rules enabled.
c. Instead of using logical router uuid as ct zone key, use crossproduct
   of logical router and logical router port that connects to external
   logical switch.

Signed-off-by: Ankur Sharma <ankur.sharma@nutanix.com>
---
 controller/ovn-controller.c | 37 +++++++++++++++++++++++++++----------
 controller/physical.c       | 18 ++++++++++++------
 lib/ovn-util.c              | 10 ++++++----
 lib/ovn-util.h              |  3 ++-
 4 files changed, 47 insertions(+), 21 deletions(-)

Comments

0-day Robot Aug. 4, 2020, 4:56 a.m. UTC | #1
Bleep bloop.  Greetings Ankur Sharma, I am a robot and I have tried out your patch.
Thanks for your contribution.

I encountered some error that I wasn't expecting.  See the details below.


git-am:
error: sha1 information is lacking or useless (controller/ovn-controller.c).
error: could not build fake ancestor
hint: Use 'git am --show-current-patch' to see the failed patch
Patch failed at 0001 ovn-controller: Fix the CT zone assignment logic for logical routers
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".


Please check this out.  If you feel there has been an error, please email aconole@redhat.com

Thanks,
0-day Robot
Dumitru Ceara Aug. 7, 2020, 8:58 a.m. UTC | #2
On 8/4/20 5:55 AM, Ankur Sharma wrote:
> From: Ankur Sharma <ankur.sharma@nutanix.com>
> 
> BACKGROUND:
> a. ovn-controller assigns CT ZONES for local ports and datapaths.
> b. If a local port/datapath is cleaned up from a chassis, then
>    corresponding CT ZONE is "unassigned"/"freed" up.
> 
> ISSUE:
> Above logic and implementations leaves stale CT entries in the
> datapath, which may get reused unexpectedly, thereby causing
> issues like, packets going through ct_nat(SNAT_IP_NEW) and getting
> a stale IP as SNAT IP etc.
> 
> a. As a part of CT Zone unassign, implementation should FLUSH the
>    corresponding CT entries, i.e it should do FLUSH by ZONE.
>    As os now, implementation avoids the flushing, thereby leaving
>    stale CT entries.
> 
> b. Similarly, since the implementation relies on datapath existence
>    for assign/unassign of CT ZONEs. Hence, simple operations like
>    moving the logical router from one external logical switch to
>    another, may not cause any CT ZONE reassignment and thereby
>    stale CT entries might get consumed, when they should not have
>    been.
> 
> c. a. and b. combined causes following:
>    i. Start a to be SNATed traffic from internal endpoint to an
>       external endpoint. Let us say internal endpoint IP is
>       50.0.0.10 and external endpoint ip is 8.8.8.8 and
>       logical router port ip (and hence SNAT ip) is 100.0.0.10.
> 
>   ii. Detach the logical router from old external logical switch
>       and attach to new external logical switch. As a result
>       of this operation, new router port ip becomes 200.0.0.10
>       , which also becomes the new SNAT ip.
> 
>  iii. The observation has been that traffic initiated in i. above
>       still ends up using OLD SNAT IP, i.e 100.0.0.10, rather than
>       200.0.0.10
> 
>  iv. iii. above happened, because although from OVS DP, the IP for
>      NAT action is 200.0.0.10, however, since its an ongoing traffic,
>      hence the CT entries come in use and end up NATing to old SNAT
>      ip 100.0.0.10. For example:
> 
>      OVS DP STATE
>      recirc_id(0),in_port(16),....ct(commit,zone=1,nat(src=200.0.0.10))
> 
>      CT STATE
>      icmp,orig=(src=50.0.0.10,dst=8.8.8.8,id=2288,type=8,code=0),
>      reply=(src=8.8.8.8,dst=100.0.0.10,id=2288,type=0,code=0),zone=1
> 
> FIX:
> This patch improves the overall CT ZONE management by doing following:
> a. Do a FLUSH by CT ZONE, once we identify that a zone has to be freed up.
> b. From datapath perspective, restrict the CT ZONE assignment ONLY
>    to logical routers that has NAT rules enabled.

Hi Ankur,

Maybe I'm missing something but doesn't this cause stateful ACLs, load
balancers applied to logical_switches and load balancers applied to
logical routers that don't have nat configured to use the default
conntrack zone (0)? I don't think that's ok.

As a matter of fact, most of the system-ovn.at tests related to
conntrack fail because conntrack entries are created in the default
zone, e.g.:

$ make check-kernel
1: ovn -- 2 LRs connected via LS, gateway router, SNAT and DNAT FAILED
(system-ovn.at:116)

+++ /root/ovn/tests/system-kmod-testsuite.dir/at-groups/1/stdout
2020-08-07 04:50:11.167836393 -0400
@@ -1,2 +1,2 @@
-icmp,orig=(src=172.16.1.2,dst=30.0.0.2,id=<cleared>,type=8,code=0),reply=(src=192.168.1.2,dst=172.16.1.2,id=<cleared>,type=0,code=0),zone=<cleared>
+icmp,orig=(src=172.16.1.2,dst=30.0.0.2,id=<cleared>,type=8,code=0),reply=(src=192.168.1.2,dst=172.16.1.2,id=<cleared>,type=0,code=0)

  2: ovn -- 2 LRs connected via LS, gateway router, SNAT and DNAT - IPv6
FAILED (system-ovn.at:296)
  3: ovn -- 2 LRs connected via LS, gateway router, easy SNAT FAILED
(system-ovn.at:448)
  4: ovn -- 2 LRs connected via LS, gateway router, easy SNAT - IPv6
FAILED (system-ovn.at:560)
  5: ovn -- multiple gateway routers, SNAT and DNAT  FAILED
(system-ovn.at:730)
  6: ovn -- multiple gateway routers, SNAT and DNAT - IPv6 FAILED
(system-ovn.at:958)
  7: ovn -- load-balancing                           FAILED
(system-ovn.at:1153)
  8: ovn -- load-balancing - IPv6                    ok
  9: ovn -- load-balancing - same subnet.            ok
 10: ovn -- load-balancing - same subnet. - IPv6     ok
 11: ovn -- load balancing in gateway router         FAILED
(system-ovn.at:1829)

./system-ovn.at:1829: ovs-appctl dpctl/dump-conntrack | grep
"dst=30.0.0.1" | sed -e 's/port=[0-9]*/port=<cleared>/g' -e
's/id=[0-9]*/id=<cleared>/g' -e 's/state=[0-9_A-Z]*/state=<cleared>/g' |
sort | uniq |
sed -e 's/zone=[0-9]*/zone=<cleared>/'
--- -   2020-08-07 04:46:41.414957151 -0400
+++ /root/ovn/tests/system-kmod-testsuite.dir/at-groups/11/stdout
2020-08-07 04:46:41.412798625 -0400
@@ -1,3 +1,3 @@
-tcp,orig=(src=172.16.1.2,dst=30.0.0.1,sport=<cleared>,dport=<cleared>),reply=(src=192.168.1.2,dst=172.16.1.2,sport=<cleared>,dport=<cleared>),zone=<cleared>,protoinfo=(state=<cleared>)
-tcp,orig=(src=172.16.1.2,dst=30.0.0.1,sport=<cleared>,dport=<cleared>),reply=(src=192.168.2.2,dst=172.16.1.2,sport=<cleared>,dport=<cleared>),zone=<cleared>,protoinfo=(state=<cleared>)
+tcp,orig=(src=172.16.1.2,dst=30.0.0.1,sport=<cleared>,dport=<cleared>),reply=(src=192.168.1.2,dst=172.16.1.2,sport=<cleared>,dport=<cleared>),protoinfo=(state=<cleared>)
+tcp,orig=(src=172.16.1.2,dst=30.0.0.1,sport=<cleared>,dport=<cleared>),reply=(src=192.168.2.2,dst=172.16.1.2,sport=<cleared>,dport=<cleared>),protoinfo=(state=<cleared>)


 12: ovn -- load balancing in gateway router - IPv6  FAILED
(system-ovn.at:2033)
 13: ovn -- multiple gateway routers, load-balancing FAILED
(system-ovn.at:2209)
 14: ovn -- multiple gateway routers, load-balancing - IPv6 FAILED
(system-ovn.at:2376)

[...]

 30: ovn -- ECMP symmetric reply                     FAILED
(system-ovn.at:4684)

./system-ovn.at:4684: ovs-appctl dpctl/dump-conntrack | grep
"dst=172.16.0.1" | sed -e 's/port=[0-9]*/port=<cleared>/g' -e
's/id=[0-9]*/id=<cleared>/g' -e 's/state=[0-9_A-Z]*/state=<cleared>/g' |
sort | uniq | \
sed -e 's/zone=[0-9]*/zone=<cleared>/' |
sed -e
's/labels=0x[0-9a-f]*00000401020400000000/labels=0x00000401020400000000/'
--- -   2020-08-07 04:56:22.403499981 -0400
+++ /root/ovn/tests/system-kmod-testsuite.dir/at-groups/30/stdout
2020-08-07 04:56:22.401442917 -0400
@@ -1,2 +1,2 @@
-icmp,orig=(src=172.16.0.1,dst=10.0.0.2,id=<cleared>,type=8,code=0),reply=(src=10.0.0.2,dst=172.16.0.1,id=<cleared>,type=0,code=0),zone=<cleared>,labels=0x00000401020400000000
+icmp,orig=(src=172.16.0.1,dst=10.0.0.2,id=<cleared>,type=8,code=0),reply=(src=10.0.0.2,dst=172.16.0.1,id=<cleared>,type=0,code=0),labels=0x00000401020400000000

> c. Instead of using logical router uuid as ct zone key, use crossproduct
>    of logical router and logical router port that connects to external
>    logical switch.
> 
> Signed-off-by: Ankur Sharma <ankur.sharma@nutanix.com>
> ---
>  controller/ovn-controller.c | 37 +++++++++++++++++++++++++++----------
>  controller/physical.c       | 18 ++++++++++++------
>  lib/ovn-util.c              | 10 ++++++----
>  lib/ovn-util.h              |  3 ++-
>  4 files changed, 47 insertions(+), 21 deletions(-)
> 
> diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c
> index 5ca32ac..9a6746e 100644
> --- a/controller/ovn-controller.c
> +++ b/controller/ovn-controller.c
> @@ -521,17 +521,34 @@ update_ct_zones(const struct sset *lports, const struct hmap *local_datapaths,
>          sset_add(&all_users, user);
>      }
>  
> -    /* Local patched datapath (gateway routers) need zones assigned. */
> +    /* Local patched datapath (gateway routers) need zones assigned.
> +     * Only local logical routers with atleast one NAT rule are considered for
> +     * CT zone assignment.*/
>      const struct local_datapath *ld;
>      HMAP_FOR_EACH (ld, hmap_node, local_datapaths) {
> -        /* XXX Add method to limit zone assignment to logical router
> -         * datapaths with NAT */
> -        char *dnat = alloc_nat_zone_key(&ld->datapath->header_.uuid, "dnat");
> -        char *snat = alloc_nat_zone_key(&ld->datapath->header_.uuid, "snat");
> -        sset_add(&all_users, dnat);
> -        sset_add(&all_users, snat);
> -        free(dnat);
> -        free(snat);
> +       const char *dp_nblr = smap_get(&ld->datapath->external_ids,
> +                                      "logical-router");
> +       if (dp_nblr) {
> +          for (size_t iter = 0; iter < ld->n_peer_ports; iter++) {
> +             const struct sbrec_port_binding *peer_binding =
> +                                             ld->peer_ports[iter].remote;
> +             const struct sbrec_port_binding *local_binding =
> +                                             ld->peer_ports[iter].local;
> +
> +             if (peer_binding->nat_addresses) {
> +                char *dnat = alloc_nat_zone_key(&ld->datapath->header_.uuid,
> +                                                &local_binding->header_.uuid,
> +                                                "dnat");
> +                char *snat = alloc_nat_zone_key(&ld->datapath->header_.uuid,
> +                                                &local_binding->header_.uuid,
> +                                                "snat");
> +                sset_add(&all_users, dnat);
> +                sset_add(&all_users, snat);
> +                free(dnat);
> +                free(snat);
> +             }
> +          }
> +       }
>      }
>  
>      /* Delete zones that do not exist in above sset. */
> @@ -541,7 +558,7 @@ update_ct_zones(const struct sset *lports, const struct hmap *local_datapaths,
>                       ct_zone->data, ct_zone->name);
>  
>              struct ct_zone_pending_entry *pending = xmalloc(sizeof *pending);
> -            pending->state = CT_ZONE_DB_QUEUED; /* Skip flushing zone. */
> +            pending->state = CT_ZONE_OF_QUEUED;
>              pending->zone = ct_zone->data;
>              pending->add = false;
>              shash_add(pending_ct_zones, ct_zone->name, pending);
> diff --git a/controller/physical.c b/controller/physical.c
> index 535c777..cc497e0 100644
> --- a/controller/physical.c
> +++ b/controller/physical.c
> @@ -218,18 +218,24 @@ static struct zone_ids
>  get_zone_ids(const struct sbrec_port_binding *binding,
>               const struct simap *ct_zones)
>  {
> -    struct zone_ids zone_ids;
> +    struct zone_ids zone_ids = {0};
>  
>      zone_ids.ct = simap_get(ct_zones, binding->logical_port);
>  
> -    const struct uuid *key = &binding->datapath->header_.uuid;
> +    const struct uuid *key1 = &binding->datapath->header_.uuid;
> +    const struct uuid *key2 = &binding->header_.uuid;
>  
> -    char *dnat = alloc_nat_zone_key(key, "dnat");
> -    zone_ids.dnat = simap_get(ct_zones, dnat);
> +    char *dnat = alloc_nat_zone_key(key1, key2, "dnat");
> +
> +    if (simap_contains(ct_zones, dnat)) {
> +       zone_ids.dnat = simap_get(ct_zones, dnat);
> +    }

Both simap_get() and simap_contains() call simap_find() internally.  If
simap_get() can't find the key it will return a default value of 0.
There's no need for the "if (simap_contains(ct_zones, dnat)) {". We can
just directly:

zone_ids.dnat = simap_get(ct_zones, dnat);

>      free(dnat);
>  
> -    char *snat = alloc_nat_zone_key(key, "snat");
> -    zone_ids.snat = simap_get(ct_zones, snat);
> +    char *snat = alloc_nat_zone_key(key1, key2, "snat");
> +    if (simap_contains(ct_zones, snat)) {
> +       zone_ids.snat = simap_get(ct_zones, snat);
> +    }

Same here.

Thanks,
Dumitru

>      free(snat);
>  
>      return zone_ids;
> diff --git a/lib/ovn-util.c b/lib/ovn-util.c
> index cdb5e18..cba7355 100644
> --- a/lib/ovn-util.c
> +++ b/lib/ovn-util.c
> @@ -327,14 +327,16 @@ destroy_lport_addresses(struct lport_addresses *laddrs)
>      free(laddrs->ipv6_addrs);
>  }
>  
> -/* Allocates a key for NAT conntrack zone allocation for a provided
> - * 'key' record and a 'type'.
> +/* Allocates a key for NAT conntrack zone allocation for provided
> + * 'keys' and a 'type'.
>   *
>   * It is the caller's responsibility to free the allocated memory. */
>  char *
> -alloc_nat_zone_key(const struct uuid *key, const char *type)
> +alloc_nat_zone_key(const struct uuid *key1, const struct uuid *key2,
> +                   const char *type)
>  {
> -    return xasprintf(UUID_FMT"_%s", UUID_ARGS(key), type);
> +    return xasprintf(UUID_FMT"_"UUID_FMT"_%s", UUID_ARGS(key1),
> +                     UUID_ARGS(key2), type);
>  }
>  
>  const char *
> diff --git a/lib/ovn-util.h b/lib/ovn-util.h
> index 0f7b501..fe86bf8 100644
> --- a/lib/ovn-util.h
> +++ b/lib/ovn-util.h
> @@ -77,7 +77,8 @@ bool extract_sbrec_binding_first_mac(const struct sbrec_port_binding *binding,
>  
>  void destroy_lport_addresses(struct lport_addresses *);
>  
> -char *alloc_nat_zone_key(const struct uuid *key, const char *type);
> +char *alloc_nat_zone_key(const struct uuid *key1, const struct uuid *key2,
> +                         const char *type);
>  
>  const char *default_nb_db(void);
>  const char *default_sb_db(void);
>
Ankur Sharma Aug. 13, 2020, 1:08 a.m. UTC | #3
Hi Dumitru,

Thanks a lot for looking at the change.
It my mistake, LB use case totally slipped through my mind.

Cross product based approach is not generic enough to address LB as well (and future use cases for CT ZONE).

Please ignore this patch, i will submit a new one with a more generic approach soon.

There is one change in this patch that can be taken, i.e replacing CT_ZONE_DB_QUEUED with CT_ZONE_OF_QUEUED,
but i will submit a separate patch for it as well.


Thanks

Regards,
Ankur




From: Dumitru Ceara <dceara@redhat.com>
Sent: Friday, August 7, 2020 1:58 AM
To: svc.mail.git <svc.mail.git@nutanix.com>; ovs-dev@openvswitch.org <ovs-dev@openvswitch.org>; Ankur Sharma <ankur.sharma@nutanix.com>
Subject: Re: [ovs-dev] [PATCH v1] ovn-controller: Fix the CT zone assignment logic for logical routers 
 
On 8/4/20 5:55 AM, Ankur Sharma wrote:
> From: Ankur Sharma <ankur.sharma@nutanix.com>
> 
> BACKGROUND:
> a. ovn-controller assigns CT ZONES for local ports and datapaths.
> b. If a local port/datapath is cleaned up from a chassis, then
>    corresponding CT ZONE is "unassigned"/"freed" up.
> 
> ISSUE:
> Above logic and implementations leaves stale CT entries in the
> datapath, which may get reused unexpectedly, thereby causing
> issues like, packets going through ct_nat(SNAT_IP_NEW) and getting
> a stale IP as SNAT IP etc.
> 
> a. As a part of CT Zone unassign, implementation should FLUSH the
>    corresponding CT entries, i.e it should do FLUSH by ZONE.
>    As os now, implementation avoids the flushing, thereby leaving
>    stale CT entries.
> 
> b. Similarly, since the implementation relies on datapath existence
>    for assign/unassign of CT ZONEs. Hence, simple operations like
>    moving the logical router from one external logical switch to
>    another, may not cause any CT ZONE reassignment and thereby
>    stale CT entries might get consumed, when they should not have
>    been.
> 
> c. a. and b. combined causes following:
>    i. Start a to be SNATed traffic from internal endpoint to an
>       external endpoint. Let us say internal endpoint IP is
>       50.0.0.10 and external endpoint ip is 8.8.8.8 and
>       logical router port ip (and hence SNAT ip) is 100.0.0.10.
> 
>   ii. Detach the logical router from old external logical switch
>       and attach to new external logical switch. As a result
>       of this operation, new router port ip becomes 200.0.0.10
>       , which also becomes the new SNAT ip.
> 
>  iii. The observation has been that traffic initiated in i. above
>       still ends up using OLD SNAT IP, i.e 100.0.0.10, rather than
>       200.0.0.10
> 
>  iv. iii. above happened, because although from OVS DP, the IP for
>      NAT action is 200.0.0.10, however, since its an ongoing traffic,
>      hence the CT entries come in use and end up NATing to old SNAT
>      ip 100.0.0.10. For example:
> 
>      OVS DP STATE
>      recirc_id(0),in_port(16),....ct(commit,zone=1,nat(src=200.0.0.10))
> 
>      CT STATE
>      icmp,orig=(src=50.0.0.10,dst=8.8.8.8,id=2288,type=8,code=0),
>      reply=(src=8.8.8.8,dst=100.0.0.10,id=2288,type=0,code=0),zone=1
> 
> FIX:
> This patch improves the overall CT ZONE management by doing following:
> a. Do a FLUSH by CT ZONE, once we identify that a zone has to be freed up.
> b. From datapath perspective, restrict the CT ZONE assignment ONLY
>    to logical routers that has NAT rules enabled.

Hi Ankur,

Maybe I'm missing something but doesn't this cause stateful ACLs, load
balancers applied to logical_switches and load balancers applied to
logical routers that don't have nat configured to use the default
conntrack zone (0)? I don't think that's ok.

As a matter of fact, most of the system-ovn.at tests related to
conntrack fail because conntrack entries are created in the default
zone, e.g.:

$ make check-kernel
1: ovn -- 2 LRs connected via LS, gateway router, SNAT and DNAT FAILED
(system-ovn.at:116)

+++ /root/ovn/tests/system-kmod-testsuite.dir/at-groups/1/stdout
2020-08-07 04:50:11.167836393 -0400
@@ -1,2 +1,2 @@
-icmp,orig=(src=172.16.1.2,dst=30.0.0.2,id=<cleared>,type=8,code=0),reply=(src=192.168.1.2,dst=172.16.1.2,id=<cleared>,type=0,code=0),zone=<cleared>
+icmp,orig=(src=172.16.1.2,dst=30.0.0.2,id=<cleared>,type=8,code=0),reply=(src=192.168.1.2,dst=172.16.1.2,id=<cleared>,type=0,code=0)

  2: ovn -- 2 LRs connected via LS, gateway router, SNAT and DNAT - IPv6
FAILED (system-ovn.at:296)
  3: ovn -- 2 LRs connected via LS, gateway router, easy SNAT FAILED
(system-ovn.at:448)
  4: ovn -- 2 LRs connected via LS, gateway router, easy SNAT - IPv6
FAILED (system-ovn.at:560)
  5: ovn -- multiple gateway routers, SNAT and DNAT  FAILED
(system-ovn.at:730)
  6: ovn -- multiple gateway routers, SNAT and DNAT - IPv6 FAILED
(system-ovn.at:958)
  7: ovn -- load-balancing                           FAILED
(system-ovn.at:1153)
  8: ovn -- load-balancing - IPv6                    ok
  9: ovn -- load-balancing - same subnet.            ok
 10: ovn -- load-balancing - same subnet. - IPv6     ok
 11: ovn -- load balancing in gateway router         FAILED
(system-ovn.at:1829)

./system-ovn.at:1829: ovs-appctl dpctl/dump-conntrack | grep
"dst=30.0.0.1" | sed -e 's/port=[0-9]*/port=<cleared>/g' -e
's/id=[0-9]*/id=<cleared>/g' -e 's/state=[0-9_A-Z]*/state=<cleared>/g' |
sort | uniq |
sed -e 's/zone=[0-9]*/zone=<cleared>/'
--- -   2020-08-07 04:46:41.414957151 -0400
+++ /root/ovn/tests/system-kmod-testsuite.dir/at-groups/11/stdout
2020-08-07 04:46:41.412798625 -0400
@@ -1,3 +1,3 @@
-tcp,orig=(src=172.16.1.2,dst=30.0.0.1,sport=<cleared>,dport=<cleared>),reply=(src=192.168.1.2,dst=172.16.1.2,sport=<cleared>,dport=<cleared>),zone=<cleared>,protoinfo=(state=<cleared>)
-tcp,orig=(src=172.16.1.2,dst=30.0.0.1,sport=<cleared>,dport=<cleared>),reply=(src=192.168.2.2,dst=172.16.1.2,sport=<cleared>,dport=<cleared>),zone=<cleared>,protoinfo=(state=<cleared>)
+tcp,orig=(src=172.16.1.2,dst=30.0.0.1,sport=<cleared>,dport=<cleared>),reply=(src=192.168.1.2,dst=172.16.1.2,sport=<cleared>,dport=<cleared>),protoinfo=(state=<cleared>)
+tcp,orig=(src=172.16.1.2,dst=30.0.0.1,sport=<cleared>,dport=<cleared>),reply=(src=192.168.2.2,dst=172.16.1.2,sport=<cleared>,dport=<cleared>),protoinfo=(state=<cleared>)


 12: ovn -- load balancing in gateway router - IPv6  FAILED
(system-ovn.at:2033)
 13: ovn -- multiple gateway routers, load-balancing FAILED
(system-ovn.at:2209)
 14: ovn -- multiple gateway routers, load-balancing - IPv6 FAILED
(system-ovn.at:2376)

[...]

 30: ovn -- ECMP symmetric reply                     FAILED
(system-ovn.at:4684)

./system-ovn.at:4684: ovs-appctl dpctl/dump-conntrack | grep
"dst=172.16.0.1" | sed -e 's/port=[0-9]*/port=<cleared>/g' -e
's/id=[0-9]*/id=<cleared>/g' -e 's/state=[0-9_A-Z]*/state=<cleared>/g' |
sort | uniq | \
sed -e 's/zone=[0-9]*/zone=<cleared>/' |
sed -e
's/labels=0x[0-9a-f]*00000401020400000000/labels=0x00000401020400000000/'
--- -   2020-08-07 04:56:22.403499981 -0400
+++ /root/ovn/tests/system-kmod-testsuite.dir/at-groups/30/stdout
2020-08-07 04:56:22.401442917 -0400
@@ -1,2 +1,2 @@
-icmp,orig=(src=172.16.0.1,dst=10.0.0.2,id=<cleared>,type=8,code=0),reply=(src=10.0.0.2,dst=172.16.0.1,id=<cleared>,type=0,code=0),zone=<cleared>,labels=0x00000401020400000000
+icmp,orig=(src=172.16.0.1,dst=10.0.0.2,id=<cleared>,type=8,code=0),reply=(src=10.0.0.2,dst=172.16.0.1,id=<cleared>,type=0,code=0),labels=0x00000401020400000000

> c. Instead of using logical router uuid as ct zone key, use crossproduct
>    of logical router and logical router port that connects to external
>    logical switch.
> 
> Signed-off-by: Ankur Sharma <ankur.sharma@nutanix.com>
> ---
>  controller/ovn-controller.c | 37 +++++++++++++++++++++++++++----------
>  controller/physical.c       | 18 ++++++++++++------
>  lib/ovn-util.c              | 10 ++++++----
>  lib/ovn-util.h              |  3 ++-
>  4 files changed, 47 insertions(+), 21 deletions(-)
> 
> diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c
> index 5ca32ac..9a6746e 100644
> --- a/controller/ovn-controller.c
> +++ b/controller/ovn-controller.c
> @@ -521,17 +521,34 @@ update_ct_zones(const struct sset *lports, const struct hmap *local_datapaths,
>          sset_add(&all_users, user);
>      }

> -    /* Local patched datapath (gateway routers) need zones assigned. */
> +    /* Local patched datapath (gateway routers) need zones assigned.
> +     * Only local logical routers with atleast one NAT rule are considered for
> +     * CT zone assignment.*/
>      const struct local_datapath *ld;
>      HMAP_FOR_EACH (ld, hmap_node, local_datapaths) {
> -        /* XXX Add method to limit zone assignment to logical router
> -         * datapaths with NAT */
> -        char *dnat = alloc_nat_zone_key(&ld->datapath->header_.uuid, "dnat");
> -        char *snat = alloc_nat_zone_key(&ld->datapath->header_.uuid, "snat");
> -        sset_add(&all_users, dnat);
> -        sset_add(&all_users, snat);
> -        free(dnat);
> -        free(snat);
> +       const char *dp_nblr = smap_get(&ld->datapath->external_ids,
> +                                      "logical-router");
> +       if (dp_nblr) {
> +          for (size_t iter = 0; iter < ld->n_peer_ports; iter++) {
> +             const struct sbrec_port_binding *peer_binding =
> +                                             ld->peer_ports[iter].remote;
> +             const struct sbrec_port_binding *local_binding =
> +                                             ld->peer_ports[iter].local;
> +
> +             if (peer_binding->nat_addresses) {
> +                char *dnat = alloc_nat_zone_key(&ld->datapath->header_.uuid,
> +                                                &local_binding->header_.uuid,
> +                                                "dnat");
> +                char *snat = alloc_nat_zone_key(&ld->datapath->header_.uuid,
> +                                                &local_binding->header_.uuid,
> +                                                "snat");
> +                sset_add(&all_users, dnat);
> +                sset_add(&all_users, snat);
> +                free(dnat);
> +                free(snat);
> +             }
> +          }
> +       }
>      }

>      /* Delete zones that do not exist in above sset. */
> @@ -541,7 +558,7 @@ update_ct_zones(const struct sset *lports, const struct hmap *local_datapaths,
>                       ct_zone->data, ct_zone->name);

>              struct ct_zone_pending_entry *pending = xmalloc(sizeof *pending);
> -            pending->state = CT_ZONE_DB_QUEUED; /* Skip flushing zone. */
> +            pending->state = CT_ZONE_OF_QUEUED;
>              pending->zone = ct_zone->data;
>              pending->add = false;
>              shash_add(pending_ct_zones, ct_zone->name, pending);
> diff --git a/controller/physical.c b/controller/physical.c
> index 535c777..cc497e0 100644
> --- a/controller/physical.c
> +++ b/controller/physical.c
> @@ -218,18 +218,24 @@ static struct zone_ids
>  get_zone_ids(const struct sbrec_port_binding *binding,
>               const struct simap *ct_zones)
>  {
> -    struct zone_ids zone_ids;
> +    struct zone_ids zone_ids = {0};

>      zone_ids.ct = simap_get(ct_zones, binding->logical_port);

> -    const struct uuid *key = &binding->datapath->header_.uuid;
> +    const struct uuid *key1 = &binding->datapath->header_.uuid;
> +    const struct uuid *key2 = &binding->header_.uuid;

> -    char *dnat = alloc_nat_zone_key(key, "dnat");
> -    zone_ids.dnat = simap_get(ct_zones, dnat);
> +    char *dnat = alloc_nat_zone_key(key1, key2, "dnat");
> +
> +    if (simap_contains(ct_zones, dnat)) {
> +       zone_ids.dnat = simap_get(ct_zones, dnat);
> +    }

Both simap_get() and simap_contains() call simap_find() internally.  If
simap_get() can't find the key it will return a default value of 0.
There's no need for the "if (simap_contains(ct_zones, dnat)) {". We can
just directly:

zone_ids.dnat = simap_get(ct_zones, dnat);

>      free(dnat);

> -    char *snat = alloc_nat_zone_key(key, "snat");
> -    zone_ids.snat = simap_get(ct_zones, snat);
> +    char *snat = alloc_nat_zone_key(key1, key2, "snat");
> +    if (simap_contains(ct_zones, snat)) {
> +       zone_ids.snat = simap_get(ct_zones, snat);
> +    }

Same here.

Thanks,
Dumitru

>      free(snat);

>      return zone_ids;
> diff --git a/lib/ovn-util.c b/lib/ovn-util.c
> index cdb5e18..cba7355 100644
> --- a/lib/ovn-util.c
> +++ b/lib/ovn-util.c
> @@ -327,14 +327,16 @@ destroy_lport_addresses(struct lport_addresses *laddrs)
>      free(laddrs->ipv6_addrs);
>  }

> -/* Allocates a key for NAT conntrack zone allocation for a provided
> - * 'key' record and a 'type'.
> +/* Allocates a key for NAT conntrack zone allocation for provided
> + * 'keys' and a 'type'.
>   *
>   * It is the caller's responsibility to free the allocated memory. */
>  char *
> -alloc_nat_zone_key(const struct uuid *key, const char *type)
> +alloc_nat_zone_key(const struct uuid *key1, const struct uuid *key2,
> +                   const char *type)
>  {
> -    return xasprintf(UUID_FMT"_%s", UUID_ARGS(key), type);
> +    return xasprintf(UUID_FMT"_"UUID_FMT"_%s", UUID_ARGS(key1),
> +                     UUID_ARGS(key2), type);
>  }

>  const char *
> diff --git a/lib/ovn-util.h b/lib/ovn-util.h
> index 0f7b501..fe86bf8 100644
> --- a/lib/ovn-util.h
> +++ b/lib/ovn-util.h
> @@ -77,7 +77,8 @@ bool extract_sbrec_binding_first_mac(const struct sbrec_port_binding *binding,

>  void destroy_lport_addresses(struct lport_addresses *);

> -char *alloc_nat_zone_key(const struct uuid *key, const char *type);
> +char *alloc_nat_zone_key(const struct uuid *key1, const struct uuid *key2,
> +                         const char *type);

>  const char *default_nb_db(void);
>  const char *default_sb_db(void);
>
Dumitru Ceara Aug. 13, 2020, 7:39 a.m. UTC | #4
On 8/13/20 3:08 AM, Ankur Sharma wrote:
> Hi Dumitru,
> 
> Thanks a lot for looking at the change.
> It my mistake, LB use case totally slipped through my mind.
> 
> Cross product based approach is not generic enough to address LB as well (and future use cases for CT ZONE).
> 
> Please ignore this patch, i will submit a new one with a more generic approach soon.
> 

Looking forward to it!

> There is one change in this patch that can be taken, i.e replacing CT_ZONE_DB_QUEUED with CT_ZONE_OF_QUEUED,
> but i will submit a separate patch for it as well.
> 

Sounds good, this is something we should definitely address and it can
be indeed a separate patch.

Thanks,
Dumitru

> 
> Thanks
> 
> Regards,
> Ankur
> 
> 
> 
> 
> From: Dumitru Ceara <dceara@redhat.com>
> Sent: Friday, August 7, 2020 1:58 AM
> To: svc.mail.git <svc.mail.git@nutanix.com>; ovs-dev@openvswitch.org <ovs-dev@openvswitch.org>; Ankur Sharma <ankur.sharma@nutanix.com>
> Subject: Re: [ovs-dev] [PATCH v1] ovn-controller: Fix the CT zone assignment logic for logical routers 
>  
> On 8/4/20 5:55 AM, Ankur Sharma wrote:
>> From: Ankur Sharma <ankur.sharma@nutanix.com>
>>
>> BACKGROUND:
>> a. ovn-controller assigns CT ZONES for local ports and datapaths.
>> b. If a local port/datapath is cleaned up from a chassis, then
>>     corresponding CT ZONE is "unassigned"/"freed" up.
>>
>> ISSUE:
>> Above logic and implementations leaves stale CT entries in the
>> datapath, which may get reused unexpectedly, thereby causing
>> issues like, packets going through ct_nat(SNAT_IP_NEW) and getting
>> a stale IP as SNAT IP etc.
>>
>> a. As a part of CT Zone unassign, implementation should FLUSH the
>>     corresponding CT entries, i.e it should do FLUSH by ZONE.
>>     As os now, implementation avoids the flushing, thereby leaving
>>     stale CT entries.
>>
>> b. Similarly, since the implementation relies on datapath existence
>>     for assign/unassign of CT ZONEs. Hence, simple operations like
>>     moving the logical router from one external logical switch to
>>     another, may not cause any CT ZONE reassignment and thereby
>>     stale CT entries might get consumed, when they should not have
>>     been.
>>
>> c. a. and b. combined causes following:
>>     i. Start a to be SNATed traffic from internal endpoint to an
>>        external endpoint. Let us say internal endpoint IP is
>>        50.0.0.10 and external endpoint ip is 8.8.8.8 and
>>        logical router port ip (and hence SNAT ip) is 100.0.0.10.
>>
>>    ii. Detach the logical router from old external logical switch
>>        and attach to new external logical switch. As a result
>>        of this operation, new router port ip becomes 200.0.0.10
>>        , which also becomes the new SNAT ip.
>>
>>   iii. The observation has been that traffic initiated in i. above
>>        still ends up using OLD SNAT IP, i.e 100.0.0.10, rather than
>>        200.0.0.10
>>
>>   iv. iii. above happened, because although from OVS DP, the IP for
>>       NAT action is 200.0.0.10, however, since its an ongoing traffic,
>>       hence the CT entries come in use and end up NATing to old SNAT
>>       ip 100.0.0.10. For example:
>>
>>       OVS DP STATE
>>       recirc_id(0),in_port(16),....ct(commit,zone=1,nat(src=200.0.0.10))
>>
>>       CT STATE
>>       icmp,orig=(src=50.0.0.10,dst=8.8.8.8,id=2288,type=8,code=0),
>>       reply=(src=8.8.8.8,dst=100.0.0.10,id=2288,type=0,code=0),zone=1
>>
>> FIX:
>> This patch improves the overall CT ZONE management by doing following:
>> a. Do a FLUSH by CT ZONE, once we identify that a zone has to be freed up.
>> b. From datapath perspective, restrict the CT ZONE assignment ONLY
>>     to logical routers that has NAT rules enabled.
> 
> Hi Ankur,
> 
> Maybe I'm missing something but doesn't this cause stateful ACLs, load
> balancers applied to logical_switches and load balancers applied to
> logical routers that don't have nat configured to use the default
> conntrack zone (0)? I don't think that's ok.
> 
> As a matter of fact, most of the system-ovn.at tests related to
> conntrack fail because conntrack entries are created in the default
> zone, e.g.:
> 
> $ make check-kernel
> 1: ovn -- 2 LRs connected via LS, gateway router, SNAT and DNAT FAILED
> (system-ovn.at:116)
> 
> +++ /root/ovn/tests/system-kmod-testsuite.dir/at-groups/1/stdout
> 2020-08-07 04:50:11.167836393 -0400
> @@ -1,2 +1,2 @@
> -icmp,orig=(src=172.16.1.2,dst=30.0.0.2,id=<cleared>,type=8,code=0),reply=(src=192.168.1.2,dst=172.16.1.2,id=<cleared>,type=0,code=0),zone=<cleared>
> +icmp,orig=(src=172.16.1.2,dst=30.0.0.2,id=<cleared>,type=8,code=0),reply=(src=192.168.1.2,dst=172.16.1.2,id=<cleared>,type=0,code=0)
> 
>   2: ovn -- 2 LRs connected via LS, gateway router, SNAT and DNAT - IPv6
> FAILED (system-ovn.at:296)
>   3: ovn -- 2 LRs connected via LS, gateway router, easy SNAT FAILED
> (system-ovn.at:448)
>   4: ovn -- 2 LRs connected via LS, gateway router, easy SNAT - IPv6
> FAILED (system-ovn.at:560)
>   5: ovn -- multiple gateway routers, SNAT and DNAT  FAILED
> (system-ovn.at:730)
>   6: ovn -- multiple gateway routers, SNAT and DNAT - IPv6 FAILED
> (system-ovn.at:958)
>   7: ovn -- load-balancing                           FAILED
> (system-ovn.at:1153)
>   8: ovn -- load-balancing - IPv6                    ok
>   9: ovn -- load-balancing - same subnet.            ok
>  10: ovn -- load-balancing - same subnet. - IPv6     ok
>  11: ovn -- load balancing in gateway router         FAILED
> (system-ovn.at:1829)
> 
> ./system-ovn.at:1829: ovs-appctl dpctl/dump-conntrack | grep
> "dst=30.0.0.1" | sed -e 's/port=[0-9]*/port=<cleared>/g' -e
> 's/id=[0-9]*/id=<cleared>/g' -e 's/state=[0-9_A-Z]*/state=<cleared>/g' |
> sort | uniq |
> sed -e 's/zone=[0-9]*/zone=<cleared>/'
> --- -   2020-08-07 04:46:41.414957151 -0400
> +++ /root/ovn/tests/system-kmod-testsuite.dir/at-groups/11/stdout
> 2020-08-07 04:46:41.412798625 -0400
> @@ -1,3 +1,3 @@
> -tcp,orig=(src=172.16.1.2,dst=30.0.0.1,sport=<cleared>,dport=<cleared>),reply=(src=192.168.1.2,dst=172.16.1.2,sport=<cleared>,dport=<cleared>),zone=<cleared>,protoinfo=(state=<cleared>)
> -tcp,orig=(src=172.16.1.2,dst=30.0.0.1,sport=<cleared>,dport=<cleared>),reply=(src=192.168.2.2,dst=172.16.1.2,sport=<cleared>,dport=<cleared>),zone=<cleared>,protoinfo=(state=<cleared>)
> +tcp,orig=(src=172.16.1.2,dst=30.0.0.1,sport=<cleared>,dport=<cleared>),reply=(src=192.168.1.2,dst=172.16.1.2,sport=<cleared>,dport=<cleared>),protoinfo=(state=<cleared>)
> +tcp,orig=(src=172.16.1.2,dst=30.0.0.1,sport=<cleared>,dport=<cleared>),reply=(src=192.168.2.2,dst=172.16.1.2,sport=<cleared>,dport=<cleared>),protoinfo=(state=<cleared>)
> 
> 
>  12: ovn -- load balancing in gateway router - IPv6  FAILED
> (system-ovn.at:2033)
>  13: ovn -- multiple gateway routers, load-balancing FAILED
> (system-ovn.at:2209)
>  14: ovn -- multiple gateway routers, load-balancing - IPv6 FAILED
> (system-ovn.at:2376)
> 
> [...]
> 
>  30: ovn -- ECMP symmetric reply                     FAILED
> (system-ovn.at:4684)
> 
> ./system-ovn.at:4684: ovs-appctl dpctl/dump-conntrack | grep
> "dst=172.16.0.1" | sed -e 's/port=[0-9]*/port=<cleared>/g' -e
> 's/id=[0-9]*/id=<cleared>/g' -e 's/state=[0-9_A-Z]*/state=<cleared>/g' |
> sort | uniq | \
> sed -e 's/zone=[0-9]*/zone=<cleared>/' |
> sed -e
> 's/labels=0x[0-9a-f]*00000401020400000000/labels=0x00000401020400000000/'
> --- -   2020-08-07 04:56:22.403499981 -0400
> +++ /root/ovn/tests/system-kmod-testsuite.dir/at-groups/30/stdout
> 2020-08-07 04:56:22.401442917 -0400
> @@ -1,2 +1,2 @@
> -icmp,orig=(src=172.16.0.1,dst=10.0.0.2,id=<cleared>,type=8,code=0),reply=(src=10.0.0.2,dst=172.16.0.1,id=<cleared>,type=0,code=0),zone=<cleared>,labels=0x00000401020400000000
> +icmp,orig=(src=172.16.0.1,dst=10.0.0.2,id=<cleared>,type=8,code=0),reply=(src=10.0.0.2,dst=172.16.0.1,id=<cleared>,type=0,code=0),labels=0x00000401020400000000
> 
>> c. Instead of using logical router uuid as ct zone key, use crossproduct
>>     of logical router and logical router port that connects to external
>>     logical switch.
>>
>> Signed-off-by: Ankur Sharma <ankur.sharma@nutanix.com>
>> ---
>>   controller/ovn-controller.c | 37 +++++++++++++++++++++++++++----------
>>   controller/physical.c       | 18 ++++++++++++------
>>   lib/ovn-util.c              | 10 ++++++----
>>   lib/ovn-util.h              |  3 ++-
>>   4 files changed, 47 insertions(+), 21 deletions(-)
>>
>> diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c
>> index 5ca32ac..9a6746e 100644
>> --- a/controller/ovn-controller.c
>> +++ b/controller/ovn-controller.c
>> @@ -521,17 +521,34 @@ update_ct_zones(const struct sset *lports, const struct hmap *local_datapaths,
>>           sset_add(&all_users, user);
>>       }
>>   
>> -    /* Local patched datapath (gateway routers) need zones assigned. */
>> +    /* Local patched datapath (gateway routers) need zones assigned.
>> +     * Only local logical routers with atleast one NAT rule are considered for
>> +     * CT zone assignment.*/
>>       const struct local_datapath *ld;
>>       HMAP_FOR_EACH (ld, hmap_node, local_datapaths) {
>> -        /* XXX Add method to limit zone assignment to logical router
>> -         * datapaths with NAT */
>> -        char *dnat = alloc_nat_zone_key(&ld->datapath->header_.uuid, "dnat");
>> -        char *snat = alloc_nat_zone_key(&ld->datapath->header_.uuid, "snat");
>> -        sset_add(&all_users, dnat);
>> -        sset_add(&all_users, snat);
>> -        free(dnat);
>> -        free(snat);
>> +       const char *dp_nblr = smap_get(&ld->datapath->external_ids,
>> +                                      "logical-router");
>> +       if (dp_nblr) {
>> +          for (size_t iter = 0; iter < ld->n_peer_ports; iter++) {
>> +             const struct sbrec_port_binding *peer_binding =
>> +                                             ld->peer_ports[iter].remote;
>> +             const struct sbrec_port_binding *local_binding =
>> +                                             ld->peer_ports[iter].local;
>> +
>> +             if (peer_binding->nat_addresses) {
>> +                char *dnat = alloc_nat_zone_key(&ld->datapath->header_.uuid,
>> +                                                &local_binding->header_.uuid,
>> +                                                "dnat");
>> +                char *snat = alloc_nat_zone_key(&ld->datapath->header_.uuid,
>> +                                                &local_binding->header_.uuid,
>> +                                                "snat");
>> +                sset_add(&all_users, dnat);
>> +                sset_add(&all_users, snat);
>> +                free(dnat);
>> +                free(snat);
>> +             }
>> +          }
>> +       }
>>       }
>>   
>>       /* Delete zones that do not exist in above sset. */
>> @@ -541,7 +558,7 @@ update_ct_zones(const struct sset *lports, const struct hmap *local_datapaths,
>>                        ct_zone->data, ct_zone->name);
>>   
>>               struct ct_zone_pending_entry *pending = xmalloc(sizeof *pending);
>> -            pending->state = CT_ZONE_DB_QUEUED; /* Skip flushing zone. */
>> +            pending->state = CT_ZONE_OF_QUEUED;
>>               pending->zone = ct_zone->data;
>>               pending->add = false;
>>               shash_add(pending_ct_zones, ct_zone->name, pending);
>> diff --git a/controller/physical.c b/controller/physical.c
>> index 535c777..cc497e0 100644
>> --- a/controller/physical.c
>> +++ b/controller/physical.c
>> @@ -218,18 +218,24 @@ static struct zone_ids
>>   get_zone_ids(const struct sbrec_port_binding *binding,
>>                const struct simap *ct_zones)
>>   {
>> -    struct zone_ids zone_ids;
>> +    struct zone_ids zone_ids = {0};
>>   
>>       zone_ids.ct = simap_get(ct_zones, binding->logical_port);
>>   
>> -    const struct uuid *key = &binding->datapath->header_.uuid;
>> +    const struct uuid *key1 = &binding->datapath->header_.uuid;
>> +    const struct uuid *key2 = &binding->header_.uuid;
>>   
>> -    char *dnat = alloc_nat_zone_key(key, "dnat");
>> -    zone_ids.dnat = simap_get(ct_zones, dnat);
>> +    char *dnat = alloc_nat_zone_key(key1, key2, "dnat");
>> +
>> +    if (simap_contains(ct_zones, dnat)) {
>> +       zone_ids.dnat = simap_get(ct_zones, dnat);
>> +    }
> 
> Both simap_get() and simap_contains() call simap_find() internally.  If
> simap_get() can't find the key it will return a default value of 0.
> There's no need for the "if (simap_contains(ct_zones, dnat)) {". We can
> just directly:
> 
> zone_ids.dnat = simap_get(ct_zones, dnat);
> 
>>       free(dnat);
>>   
>> -    char *snat = alloc_nat_zone_key(key, "snat");
>> -    zone_ids.snat = simap_get(ct_zones, snat);
>> +    char *snat = alloc_nat_zone_key(key1, key2, "snat");
>> +    if (simap_contains(ct_zones, snat)) {
>> +       zone_ids.snat = simap_get(ct_zones, snat);
>> +    }
> 
> Same here.
> 
> Thanks,
> Dumitru
> 
>>       free(snat);
>>   
>>       return zone_ids;
>> diff --git a/lib/ovn-util.c b/lib/ovn-util.c
>> index cdb5e18..cba7355 100644
>> --- a/lib/ovn-util.c
>> +++ b/lib/ovn-util.c
>> @@ -327,14 +327,16 @@ destroy_lport_addresses(struct lport_addresses *laddrs)
>>       free(laddrs->ipv6_addrs);
>>   }
>>   
>> -/* Allocates a key for NAT conntrack zone allocation for a provided
>> - * 'key' record and a 'type'.
>> +/* Allocates a key for NAT conntrack zone allocation for provided
>> + * 'keys' and a 'type'.
>>    *
>>    * It is the caller's responsibility to free the allocated memory. */
>>   char *
>> -alloc_nat_zone_key(const struct uuid *key, const char *type)
>> +alloc_nat_zone_key(const struct uuid *key1, const struct uuid *key2,
>> +                   const char *type)
>>   {
>> -    return xasprintf(UUID_FMT"_%s", UUID_ARGS(key), type);
>> +    return xasprintf(UUID_FMT"_"UUID_FMT"_%s", UUID_ARGS(key1),
>> +                     UUID_ARGS(key2), type);
>>   }
>>   
>>   const char *
>> diff --git a/lib/ovn-util.h b/lib/ovn-util.h
>> index 0f7b501..fe86bf8 100644
>> --- a/lib/ovn-util.h
>> +++ b/lib/ovn-util.h
>> @@ -77,7 +77,8 @@ bool extract_sbrec_binding_first_mac(const struct sbrec_port_binding *binding,
>>   
>>   void destroy_lport_addresses(struct lport_addresses *);
>>   
>> -char *alloc_nat_zone_key(const struct uuid *key, const char *type);
>> +char *alloc_nat_zone_key(const struct uuid *key1, const struct uuid *key2,
>> +                         const char *type);
>>   
>>   const char *default_nb_db(void);
>>   const char *default_sb_db(void);
>>
> 
>
diff mbox series

Patch

diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c
index 5ca32ac..9a6746e 100644
--- a/controller/ovn-controller.c
+++ b/controller/ovn-controller.c
@@ -521,17 +521,34 @@  update_ct_zones(const struct sset *lports, const struct hmap *local_datapaths,
         sset_add(&all_users, user);
     }
 
-    /* Local patched datapath (gateway routers) need zones assigned. */
+    /* Local patched datapath (gateway routers) need zones assigned.
+     * Only local logical routers with atleast one NAT rule are considered for
+     * CT zone assignment.*/
     const struct local_datapath *ld;
     HMAP_FOR_EACH (ld, hmap_node, local_datapaths) {
-        /* XXX Add method to limit zone assignment to logical router
-         * datapaths with NAT */
-        char *dnat = alloc_nat_zone_key(&ld->datapath->header_.uuid, "dnat");
-        char *snat = alloc_nat_zone_key(&ld->datapath->header_.uuid, "snat");
-        sset_add(&all_users, dnat);
-        sset_add(&all_users, snat);
-        free(dnat);
-        free(snat);
+       const char *dp_nblr = smap_get(&ld->datapath->external_ids,
+                                      "logical-router");
+       if (dp_nblr) {
+          for (size_t iter = 0; iter < ld->n_peer_ports; iter++) {
+             const struct sbrec_port_binding *peer_binding =
+                                             ld->peer_ports[iter].remote;
+             const struct sbrec_port_binding *local_binding =
+                                             ld->peer_ports[iter].local;
+
+             if (peer_binding->nat_addresses) {
+                char *dnat = alloc_nat_zone_key(&ld->datapath->header_.uuid,
+                                                &local_binding->header_.uuid,
+                                                "dnat");
+                char *snat = alloc_nat_zone_key(&ld->datapath->header_.uuid,
+                                                &local_binding->header_.uuid,
+                                                "snat");
+                sset_add(&all_users, dnat);
+                sset_add(&all_users, snat);
+                free(dnat);
+                free(snat);
+             }
+          }
+       }
     }
 
     /* Delete zones that do not exist in above sset. */
@@ -541,7 +558,7 @@  update_ct_zones(const struct sset *lports, const struct hmap *local_datapaths,
                      ct_zone->data, ct_zone->name);
 
             struct ct_zone_pending_entry *pending = xmalloc(sizeof *pending);
-            pending->state = CT_ZONE_DB_QUEUED; /* Skip flushing zone. */
+            pending->state = CT_ZONE_OF_QUEUED;
             pending->zone = ct_zone->data;
             pending->add = false;
             shash_add(pending_ct_zones, ct_zone->name, pending);
diff --git a/controller/physical.c b/controller/physical.c
index 535c777..cc497e0 100644
--- a/controller/physical.c
+++ b/controller/physical.c
@@ -218,18 +218,24 @@  static struct zone_ids
 get_zone_ids(const struct sbrec_port_binding *binding,
              const struct simap *ct_zones)
 {
-    struct zone_ids zone_ids;
+    struct zone_ids zone_ids = {0};
 
     zone_ids.ct = simap_get(ct_zones, binding->logical_port);
 
-    const struct uuid *key = &binding->datapath->header_.uuid;
+    const struct uuid *key1 = &binding->datapath->header_.uuid;
+    const struct uuid *key2 = &binding->header_.uuid;
 
-    char *dnat = alloc_nat_zone_key(key, "dnat");
-    zone_ids.dnat = simap_get(ct_zones, dnat);
+    char *dnat = alloc_nat_zone_key(key1, key2, "dnat");
+
+    if (simap_contains(ct_zones, dnat)) {
+       zone_ids.dnat = simap_get(ct_zones, dnat);
+    }
     free(dnat);
 
-    char *snat = alloc_nat_zone_key(key, "snat");
-    zone_ids.snat = simap_get(ct_zones, snat);
+    char *snat = alloc_nat_zone_key(key1, key2, "snat");
+    if (simap_contains(ct_zones, snat)) {
+       zone_ids.snat = simap_get(ct_zones, snat);
+    }
     free(snat);
 
     return zone_ids;
diff --git a/lib/ovn-util.c b/lib/ovn-util.c
index cdb5e18..cba7355 100644
--- a/lib/ovn-util.c
+++ b/lib/ovn-util.c
@@ -327,14 +327,16 @@  destroy_lport_addresses(struct lport_addresses *laddrs)
     free(laddrs->ipv6_addrs);
 }
 
-/* Allocates a key for NAT conntrack zone allocation for a provided
- * 'key' record and a 'type'.
+/* Allocates a key for NAT conntrack zone allocation for provided
+ * 'keys' and a 'type'.
  *
  * It is the caller's responsibility to free the allocated memory. */
 char *
-alloc_nat_zone_key(const struct uuid *key, const char *type)
+alloc_nat_zone_key(const struct uuid *key1, const struct uuid *key2,
+                   const char *type)
 {
-    return xasprintf(UUID_FMT"_%s", UUID_ARGS(key), type);
+    return xasprintf(UUID_FMT"_"UUID_FMT"_%s", UUID_ARGS(key1),
+                     UUID_ARGS(key2), type);
 }
 
 const char *
diff --git a/lib/ovn-util.h b/lib/ovn-util.h
index 0f7b501..fe86bf8 100644
--- a/lib/ovn-util.h
+++ b/lib/ovn-util.h
@@ -77,7 +77,8 @@  bool extract_sbrec_binding_first_mac(const struct sbrec_port_binding *binding,
 
 void destroy_lport_addresses(struct lport_addresses *);
 
-char *alloc_nat_zone_key(const struct uuid *key, const char *type);
+char *alloc_nat_zone_key(const struct uuid *key1, const struct uuid *key2,
+                         const char *type);
 
 const char *default_nb_db(void);
 const char *default_sb_db(void);