
[ovs-dev,v2] controller: Fix hairpin SNAT flow explosion if hairpin_snat_ip is set.

Message ID: 20230301223600.1607778-1-i.maximets@ovn.org
State: Accepted
Series: [ovs-dev,v2] controller: Fix hairpin SNAT flow explosion if hairpin_snat_ip is set.

Checks

Context Check Description
ovsrobot/apply-robot success apply and check: success
ovsrobot/github-robot-_Build_and_Test success github build: passed
ovsrobot/github-robot-_ovn-kubernetes fail github build: failed

Commit Message

Ilya Maximets March 1, 2023, 10:36 p.m. UTC
It's common for 'hairpin_snat_ip' to be configured for each and
every Load Balancer in a setup.  For example, ovn-kubernetes does that,
setting the '169.254.169.5 fd69::5' value unconditionally for every LB:
  https://github.com/ovn-org/ovn-kubernetes/blob/cd78ae1af4657d38bdc41003a8737aa958d62b9d/go-controller/pkg/ovn/controller/services/loadbalancer.go#L314

The current implementation of ovn-controller creates an OpenFlow rule
for every datapath for each VIP when 'hairpin_snat_ip' is set, because
we need to handle the case where LBs with the same VIP are applied to
different datapaths and have different SNAT IPs.
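
For context, this is the shape of the code being removed: one dimension
of a conjunctive match is built per datapath of each such LB, whether or
not that datapath is local to the chassis.  A condensed sketch of
add_lb_ct_snat_hairpin_dp_flows() from the diff below (datapath-group
handling and the conjunction/SNAT action setup are omitted):

    /* One conjunction clause per datapath the LB is attached to. */
    for (size_t i = 0; i < lb->slb->n_datapaths; i++) {
        add_lb_ct_snat_hairpin_for_dp(lb, lb->slb->datapaths[i],
                                      &dp_match, &dp_acts, flow_table);
    }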

In large scale setups with tens of thousands of LBs and hundreds of
nodes, this generates millions of OpenFlow rules, making ovn-controller
choke and fall into an unrecoverable disconnection loop with poll
intervals of up to 200 seconds in ovn-heater's density-heavy scenario
with just 120 nodes.  It also generates rules with thousands of
conjunctions that do not fit into a single FLOW_MOD, breaking the
OpenFlow management communication.

Fix that by partially reverting the cited commit for Load Balancers
with 'hairpin_snat_ip' specified.  This allows us to avoid constructing
OpenFlow rules that do not fit into a single OpenFlow message.  To keep
the number of flows we have to generate under control, flows are
generated for local datapaths only.  In modern ovn-kubernetes setups
there should, in general, be only one local datapath.
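
In code terms, the per-VIP handling after this patch looks roughly like
the following.  This is condensed from the reworked
add_lb_ct_snat_hairpin_vip_flow() in the diff below; the datapath-group
loop and the ct/nat action construction are omitted:

    if (!use_hairpin_snat_ip) {
        /* No 'hairpin_snat_ip': a single wide flow per VIP that is not
         * tied to any particular datapath. */
        add_lb_ct_snat_hairpin_for_dp(lb, NULL, NULL,
                                      &match, &ofpacts, flow_table);
    } else {
        /* 'hairpin_snat_ip': one flow per VIP per datapath, but only
         * for datapaths that are local to this chassis. */
        for (size_t i = 0; i < lb->slb->n_datapaths; i++) {
            add_lb_ct_snat_hairpin_for_dp(lb, lb->slb->datapaths[i],
                                          local_datapaths,
                                          &match, &ofpacts, flow_table);
        }
    }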

The optimization for Load Balancers without 'hairpin_snat_ip' specified
is still preserved by keeping these wider flows at a lower priority,
the same way as we do now.  That limits the blast radius of this
change, e.g. OpenStack and older ovn-kubernetes deployments will not
be affected.
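
The priority split itself is handled in add_lb_ct_snat_hairpin_for_dp():
flows for the 'hairpin_snat_ip' case additionally match on the datapath
metadata and therefore take the higher priority, while the wide per-VIP
flows keep working as before at the lower one.  Excerpt from the diff
below:

    if (datapath) {
        /* 'hairpin_snat_ip' case: skip non-local datapaths and match
         * on the datapath metadata. */
        if (!get_local_datapath(local_datapaths, datapath->tunnel_key)) {
            return;
        }
        match_set_metadata(dp_match, htonll(datapath->tunnel_key));
    }

    ofctrl_add_or_append_flow(flow_table, OFTABLE_CT_SNAT_HAIRPIN,
                              datapath ? 200 : 100,
                              lb->slb->header_.uuid.parts[0],
                              dp_match, dp_acts, &lb->slb->header_.uuid,
                              NX_CTLR_NO_METER, NULL);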

Fixes: 07467cfac499 ("lflow: Refactor OpenFlow snat hairpin flows")
Reported-at: https://bugzilla.redhat.com/2171423
Suggested-by: Dumitru Ceara <dceara@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
---

Version 2:

  - This is less of a v2 than an alternative solution: a modified
    version of what Dumitru suggested during review of v1.  The
    difference is that it's a partial revert of the cited commit, not
    a full one, i.e. the optimization is preserved for the case
    without hairpin_snat_ip.

  - Posting now due to pressing time constraints for a release.
    I haven't finished high-scale performance tests yet; I'll post
    results for those tomorrow.  A preliminary ovn-heater run with
    120 nodes density-heavy worked fine without issues.  500-node
    tests are queued.

  - Unfortunately, ovn-kubernetes tests are failing in CI at the ovn-k
    container build stage, so I didn't run that check.

 controller/lflow.c          | 304 +++++++++++-------------------------
 controller/lflow.h          |   2 -
 controller/ovn-controller.c |  14 --
 3 files changed, 90 insertions(+), 230 deletions(-)

Comments

Ilya Maximets March 2, 2023, 10:15 a.m. UTC | #1
On 3/1/23 23:36, Ilya Maximets wrote:
> It's common for 'hairpin_snat_ip' to be configured for each and
> every Load Balancer in a setup.  For example, ovn-kubernetes does that,
> setting the '169.254.169.5 fd69::5' value unconditionally for every LB:
>   https://github.com/ovn-org/ovn-kubernetes/blob/cd78ae1af4657d38bdc41003a8737aa958d62b9d/go-controller/pkg/ovn/controller/services/loadbalancer.go#L314
> 
> The current implementation of ovn-controller creates an OpenFlow rule
> for every datapath for each VIP when 'hairpin_snat_ip' is set, because
> we need to handle the case where LBs with the same VIP are applied to
> different datapaths and have different SNAT IPs.
> 
> In large scale setups with tens of thousands of LBs and hundreds of
> nodes, this generates millions of OpenFlow rules, making ovn-controller
> choke and fall into an unrecoverable disconnection loop with poll
> intervals of up to 200 seconds in ovn-heater's density-heavy scenario
> with just 120 nodes.  It also generates rules with thousands of
> conjunctions that do not fit into a single FLOW_MOD, breaking the
> OpenFlow management communication.
> 
> Fix that by partially reverting the cited commit for Load Balancers
> with 'hairpin_snat_ip' specified.  This allows us to avoid constructing
> OpenFlow rules that do not fit into a single OpenFlow message.  To keep
> the number of flows we have to generate under control, flows are
> generated for local datapaths only.  In modern ovn-kubernetes setups
> there should, in general, be only one local datapath.
> 
> The optimization for Load Balancers without 'hairpin_snat_ip' specified
> is still preserved by keeping these wider flows at a lower priority,
> the same way as we do now.  That limits the blast radius of this
> change, e.g. OpenStack and older ovn-kubernetes deployments will not
> be affected.
> 
> Fixes: 07467cfac499 ("lflow: Refactor OpenFlow snat hairpin flows")
> Reported-at: https://bugzilla.redhat.com/2171423
> Suggested-by: Dumitru Ceara <dceara@redhat.com>
> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
> ---
> 
> Version 2:
> 
>   - This is less of a v2 than an alternative solution: a modified
>     version of what Dumitru suggested during review of v1.  The
>     difference is that it's a partial revert of the cited commit, not
>     a full one, i.e. the optimization is preserved for the case
>     without hairpin_snat_ip.
> 
>   - Posting now due to pressing time constraints for a release.
>     I haven't finished high-scale performance tests yet; I'll post
>     results for those tomorrow.  A preliminary ovn-heater run with
>     120 nodes density-heavy worked fine without issues.  500-node
>     tests are queued.

500-node tests finished.  I see almost no performance difference
between the cases with and without hairpin_snat_ip and the current
main branch without hairpin_snat_ip.  All 3 setups have the same
number of OF rules in ovn-k-ish setups with ovn-heater.  CPU usage
is approximately at the same level as well.

Best regards, Ilya Maximets.

> 
>   - Unfortunately, ovn-kubernetes tests are failing in CI at the ovn-k
>     container build stage, so I didn't run that check.
> 
>  controller/lflow.c          | 304 +++++++++++-------------------------
>  controller/lflow.h          |   2 -
>  controller/ovn-controller.c |  14 --
>  3 files changed, 90 insertions(+), 230 deletions(-)
Dumitru Ceara March 2, 2023, 10:24 a.m. UTC | #2
On 3/2/23 11:15, Ilya Maximets wrote:
> On 3/1/23 23:36, Ilya Maximets wrote:
>> It's common for 'hairpin_snat_ip' to be configured for each and
>> every Load Balancer in a setup.  For example, ovn-kubernetes does that,
>> setting the '169.254.169.5 fd69::5' value unconditionally for every LB:
>>   https://github.com/ovn-org/ovn-kubernetes/blob/cd78ae1af4657d38bdc41003a8737aa958d62b9d/go-controller/pkg/ovn/controller/services/loadbalancer.go#L314
>>
>> The current implementation of ovn-controller creates an OpenFlow rule
>> for every datapath for each VIP when 'hairpin_snat_ip' is set, because
>> we need to handle the case where LBs with the same VIP are applied to
>> different datapaths and have different SNAT IPs.
>>
>> In large scale setups with tens of thousands of LBs and hundreds of
>> nodes, this generates millions of OpenFlow rules, making ovn-controller
>> choke and fall into an unrecoverable disconnection loop with poll
>> intervals of up to 200 seconds in ovn-heater's density-heavy scenario
>> with just 120 nodes.  It also generates rules with thousands of
>> conjunctions that do not fit into a single FLOW_MOD, breaking the
>> OpenFlow management communication.
>>
>> Fix that by partially reverting the cited commit for Load Balancers
>> with 'hairpin_snat_ip' specified.  This allows us to avoid constructing
>> OpenFlow rules that do not fit into a single OpenFlow message.  To keep
>> the number of flows we have to generate under control, flows are
>> generated for local datapaths only.  In modern ovn-kubernetes setups
>> there should, in general, be only one local datapath.
>>
>> The optimization for Load Balancers without 'hairpin_snat_ip' specified
>> is still preserved by keeping these wider flows at a lower priority,
>> the same way as we do now.  That limits the blast radius of this
>> change, e.g. OpenStack and older ovn-kubernetes deployments will not
>> be affected.
>>
>> Fixes: 07467cfac499 ("lflow: Refactor OpenFlow snat hairpin flows")
>> Reported-at: https://bugzilla.redhat.com/2171423
>> Suggested-by: Dumitru Ceara <dceara@redhat.com>
>> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
>> ---
>>
>> Version 2:
>>
>>   - This is less of a v2 than an alternative solution: a modified
>>     version of what Dumitru suggested during review of v1.  The
>>     difference is that it's a partial revert of the cited commit, not
>>     a full one, i.e. the optimization is preserved for the case
>>     without hairpin_snat_ip.
>>
>>   - Posting now due to pressing time constraints for a release.
>>     I haven't finished high-scale performance tests yet; I'll post
>>     results for those tomorrow.  A preliminary ovn-heater run with
>>     120 nodes density-heavy worked fine without issues.  500-node
>>     tests are queued.
> 
> 500-node tests finished.  I see almost no performance difference
> between the cases with and without hairpin_snat_ip and the current
> main branch without hairpin_snat_ip.  All 3 setups have the same
> number of OF rules in ovn-k-ish setups with ovn-heater.  CPU usage
> is approximately at the same level as well.
> 

Nice!  I'm also running this through the ovn-kubernetes CI but I expect
it to work fine:

https://github.com/dceara/ovn/actions/runs/4312305009

Acked-by: Dumitru Ceara <dceara@redhat.com>

Thanks a lot for fixing this!

Regards,
Dumitru
Mark Michelson March 2, 2023, 8:08 p.m. UTC | #3
On 3/2/23 05:24, Dumitru Ceara wrote:
> On 3/2/23 11:15, Ilya Maximets wrote:
>> On 3/1/23 23:36, Ilya Maximets wrote:
>>> It's common for 'hairpin_snat_ip' to be configured for each and
>>> every Load Balancer in a setup.  For example, ovn-kubernetes does that,
>>> setting the '169.254.169.5 fd69::5' value unconditionally for every LB:
>>>    https://github.com/ovn-org/ovn-kubernetes/blob/cd78ae1af4657d38bdc41003a8737aa958d62b9d/go-controller/pkg/ovn/controller/services/loadbalancer.go#L314
>>>
>>> The current implementation of ovn-controller creates an OpenFlow rule
>>> for every datapath for each VIP when 'hairpin_snat_ip' is set, because
>>> we need to handle the case where LBs with the same VIP are applied to
>>> different datapaths and have different SNAT IPs.
>>>
>>> In large scale setups with tens of thousands of LBs and hundreds of
>>> nodes, this generates millions of OpenFlow rules, making ovn-controller
>>> choke and fall into an unrecoverable disconnection loop with poll
>>> intervals of up to 200 seconds in ovn-heater's density-heavy scenario
>>> with just 120 nodes.  It also generates rules with thousands of
>>> conjunctions that do not fit into a single FLOW_MOD, breaking the
>>> OpenFlow management communication.
>>>
>>> Fix that by partially reverting the cited commit for Load Balancers
>>> with 'hairpin_snat_ip' specified.  This allows us to avoid constructing
>>> OpenFlow rules that do not fit into a single OpenFlow message.  To keep
>>> the number of flows we have to generate under control, flows are
>>> generated for local datapaths only.  In modern ovn-kubernetes setups
>>> there should, in general, be only one local datapath.
>>>
>>> The optimization for Load Balancers without 'hairpin_snat_ip' specified
>>> is still preserved by keeping these wider flows at a lower priority,
>>> the same way as we do now.  That limits the blast radius of this
>>> change, e.g. OpenStack and older ovn-kubernetes deployments will not
>>> be affected.
>>>
>>> Fixes: 07467cfac499 ("lflow: Refactor OpenFlow snat hairpin flows")
>>> Reported-at: https://bugzilla.redhat.com/2171423
>>> Suggested-by: Dumitru Ceara <dceara@redhat.com>
>>> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
>>> ---
>>>
>>> Version 2:
>>>
>>>    - This is less of a v2 than an alternative solution: a modified
>>>      version of what Dumitru suggested during review of v1.  The
>>>      difference is that it's a partial revert of the cited commit, not
>>>      a full one, i.e. the optimization is preserved for the case
>>>      without hairpin_snat_ip.
>>>
>>>    - Posting now due to pressing time constraints for a release.
>>>      I haven't finished high-scale performance tests yet; I'll post
>>>      results for those tomorrow.  A preliminary ovn-heater run with
>>>      120 nodes density-heavy worked fine without issues.  500-node
>>>      tests are queued.
>>
>> 500-node tests finished.  I see almost no performance difference
>> between the cases with and without hairpin_snat_ip and the current
>> main branch without hairpin_snat_ip.  All 3 setups have the same
>> number of OF rules in ovn-k-ish setups with ovn-heater.  CPU usage
>> is approximately at the same level as well.
>>
> 
> Nice!  I'm also running this through the ovn-kubernetes CI but I expect
> it to work fine:
> 
> https://github.com/dceara/ovn/actions/runs/4312305009
> 
> Acked-by: Dumitru Ceara <dceara@redhat.com>
> 
> Thanks a lot for fixing this!
> 
> Regards,
> Dumitru

I applied this to main and branch-23.03. I attempted to apply it to 
branch-22.12 but it did not apply cleanly. If this needs to be 
backported to other branches, then we will need patches for the affected 
branches.

Thanks Ilya and Dumitru!


Patch

diff --git a/controller/lflow.c b/controller/lflow.c
index a0a26460c..6a98b19e1 100644
--- a/controller/lflow.c
+++ b/controller/lflow.c
@@ -18,7 +18,6 @@ 
 #include "lflow.h"
 #include "coverage.h"
 #include "ha-chassis.h"
-#include "lib/id-pool.h"
 #include "lflow-cache.h"
 #include "local_data.h"
 #include "lport.h"
@@ -99,9 +98,9 @@  consider_logical_flow(const struct sbrec_logical_flow *lflow,
 
 static void
 consider_lb_hairpin_flows(const struct ovn_controller_lb *lb,
+                          const struct hmap *local_datapaths,
                           bool use_ct_mark,
-                          struct ovn_desired_flow_table *flow_table,
-                          struct simap *ids);
+                          struct ovn_desired_flow_table *flow_table);
 
 static void add_port_sec_flows(const struct shash *binding_lports,
                                const struct sbrec_chassis *,
@@ -1731,114 +1730,38 @@  add_lb_vip_hairpin_flows(const struct ovn_controller_lb *lb,
 static void
 add_lb_ct_snat_hairpin_for_dp(const struct ovn_controller_lb *lb,
                               const struct sbrec_datapath_binding *datapath,
+                              const struct hmap *local_datapaths,
                               struct match *dp_match,
                               struct ofpbuf *dp_acts,
                               struct ovn_desired_flow_table *flow_table)
 {
-    match_set_metadata(dp_match, htonll(datapath->tunnel_key));
-    ofctrl_add_or_append_flow(flow_table, OFTABLE_CT_SNAT_HAIRPIN, 200,
-                              lb->slb->header_.uuid.parts[0],
-                              dp_match, dp_acts, &lb->slb->header_.uuid,
-                              NX_CTLR_NO_METER, NULL);
-}
-
-static void
-add_lb_ct_snat_hairpin_dp_flows(const struct ovn_controller_lb *lb,
-                                uint32_t id,
-                                struct ovn_desired_flow_table *flow_table)
-{
-    /* If "hairpin_snat_ip" is not specified on this LB, we do not need
-       to add these flows because no conjunctive flows have been added
-       by add_lb_ct_snat_hairpin_vip_flow() for this LB. */
-    if (!lb->hairpin_snat_ips.n_ipv4_addrs &&
-        !lb->hairpin_snat_ips.n_ipv6_addrs) {
-        return;
-    }
-
-    uint64_t stub[1024 / 8];
-    struct ofpbuf dp_acts = OFPBUF_STUB_INITIALIZER(stub);
-    struct ofpact_conjunction *conj;
-
-    conj = ofpact_put_CONJUNCTION(&dp_acts);
-    conj->id = id;
-    conj->n_clauses = 2;
-    conj->clause = 0;
-
-    struct match dp_match = MATCH_CATCHALL_INITIALIZER;
-
-    for (size_t i = 0; i < lb->slb->n_datapaths; i++) {
-        add_lb_ct_snat_hairpin_for_dp(lb, lb->slb->datapaths[i],
-                                      &dp_match, &dp_acts, flow_table);
-    }
-    if (lb->slb->datapath_group) {
-        for (size_t i = 0; i < lb->slb->datapath_group->n_datapaths; i++) {
-            add_lb_ct_snat_hairpin_for_dp(
-                lb, lb->slb->datapath_group->datapaths[i],
-                &dp_match, &dp_acts, flow_table);
+    if (datapath) {
+        if (!get_local_datapath(local_datapaths, datapath->tunnel_key)) {
+            return;
         }
+        match_set_metadata(dp_match, htonll(datapath->tunnel_key));
     }
 
-    ofpbuf_uninit(&dp_acts);
-
-    struct ofpbuf snat_acts = OFPBUF_STUB_INITIALIZER(stub);
-
-    struct ofpact_conntrack *ct = ofpact_put_CT(&snat_acts);
-    ct->recirc_table = NX_CT_RECIRC_NONE;
-    ct->zone_src.field = mf_from_id(MFF_LOG_SNAT_ZONE);
-    ct->zone_src.ofs = 0;
-    ct->zone_src.n_bits = 16;
-    ct->flags = NX_CT_F_COMMIT;
-    ct->alg = 0;
-
-    size_t nat_offset;
-    nat_offset = snat_acts.size;
-    ofpbuf_pull(&snat_acts, nat_offset);
-
-    struct ofpact_nat *nat = ofpact_put_NAT(&snat_acts);
-    nat->flags = NX_NAT_F_SRC;
-
-    snat_acts.header = ofpbuf_push_uninit(&snat_acts, nat_offset);
-    ofpact_finish(&snat_acts, &ct->ofpact);
-
-    struct match snat_match = MATCH_CATCHALL_INITIALIZER;
-
-    match_set_conj_id(&snat_match, id);
-
-    if (lb->hairpin_snat_ips.n_ipv4_addrs) {
-        nat->range_af = AF_INET;
-        nat->range.addr.ipv4.min = lb->hairpin_snat_ips.ipv4_addrs[0].addr;
-        match_set_dl_type(&snat_match, htons(ETH_TYPE_IP));
-
-        ofctrl_add_flow(flow_table, OFTABLE_CT_SNAT_HAIRPIN, 200,
-                    lb->slb->header_.uuid.parts[0],
-                    &snat_match, &snat_acts, &lb->slb->header_.uuid);
-    }
-
-    if (lb->hairpin_snat_ips.n_ipv6_addrs) {
-        nat->range_af = AF_INET6;
-        nat->range.addr.ipv6.min = lb->hairpin_snat_ips.ipv6_addrs[0].addr;
-        match_set_dl_type(&snat_match, htons(ETH_TYPE_IPV6));
-
-        ofctrl_add_flow(flow_table, OFTABLE_CT_SNAT_HAIRPIN, 200,
-                    lb->slb->header_.uuid.parts[0],
-                    &snat_match, &snat_acts, &lb->slb->header_.uuid);
-    }
-
-    ofpbuf_uninit(&snat_acts);
+    /* A flow added for the "hairpin_snat_ip" case will have an extra
+     * datapath match, but it will also match on the less restrictive
+     * general case.  Therefore, we set the priority in the
+     * "hairpin_snat_ip" case to be higher than the general case. */
+    ofctrl_add_or_append_flow(flow_table, OFTABLE_CT_SNAT_HAIRPIN,
+                              datapath ? 200 : 100,
+                              lb->slb->header_.uuid.parts[0],
+                              dp_match, dp_acts, &lb->slb->header_.uuid,
+                              NX_CTLR_NO_METER, NULL);
 }
 
-
-/* Add a ct_snat flow for each VIP of the LB. If this LB does not use
+/* Add a ct_snat flow for each VIP of the LB.  If this LB does not use
  * "hairpin_snat_ip", we can SNAT using the VIP.
  *
- * If this LB uses "hairpin_snat_ip", we add a flow to one dimension of a
- * conjunctive flow 'id'. The other dimension consists of the datapaths
- * that this LB belongs to. These flows (and the actual SNAT flow) get added
- * by add_lb_ct_snat_hairpin_dp_flows(). */
+ * If this LB uses "hairpin_snat_ip", we can SNAT using that address, but
+ * we have to add a separate flow per datapath. */
 static void
 add_lb_ct_snat_hairpin_vip_flow(const struct ovn_controller_lb *lb,
-                                uint32_t id,
-                                struct ovn_lb_vip *lb_vip,
+                                const struct ovn_lb_vip *lb_vip,
+                                const struct hmap *local_datapaths,
                                 struct ovn_desired_flow_table *flow_table)
 {
     uint64_t stub[1024 / 8];
@@ -1851,51 +1774,33 @@  add_lb_ct_snat_hairpin_vip_flow(const struct ovn_controller_lb *lb,
         address_family = AF_INET6;
     }
 
-    bool use_hairpin_snat_ip = false;
-    uint16_t priority = 100;
-    if ((address_family == AF_INET && lb->hairpin_snat_ips.n_ipv4_addrs) ||
-        (address_family == AF_INET6 && lb->hairpin_snat_ips.n_ipv6_addrs)) {
-        use_hairpin_snat_ip = true;
+    struct ofpact_conntrack *ct = ofpact_put_CT(&ofpacts);
+    ct->recirc_table = NX_CT_RECIRC_NONE;
+    ct->zone_src.field = mf_from_id(MFF_LOG_SNAT_ZONE);
+    ct->zone_src.ofs = 0;
+    ct->zone_src.n_bits = 16;
+    ct->flags = NX_CT_F_COMMIT;
+    ct->alg = 0;
 
-        /* A flow added for the "hairpin_snat_ip" case will also match on the
-           less restrictive general case. This can be seen as the match in both
-           cases is the same (the second dimension of the conjunction makes it
-           more restrictive). Therefore, we set the priority in the
-           "hairpin_snat_ip" case to be higher than the general case. */
-        priority = 200;
-    }
+    size_t nat_offset;
+    nat_offset = ofpacts.size;
+    ofpbuf_pull(&ofpacts, nat_offset);
+
+    struct ofpact_nat *nat = ofpact_put_NAT(&ofpacts);
+    nat->flags = NX_NAT_F_SRC;
+    nat->range_af = address_family;
 
-    if (use_hairpin_snat_ip) {
-        struct ofpact_conjunction *conj;
-        conj = ofpact_put_CONJUNCTION(&ofpacts);
-        conj->id = id;
-        conj->n_clauses = 2;
-        conj->clause = 1;
+    if (nat->range_af == AF_INET) {
+        nat->range.addr.ipv4.min = lb->hairpin_snat_ips.n_ipv4_addrs
+                                   ? lb->hairpin_snat_ips.ipv4_addrs[0].addr
+                                   : in6_addr_get_mapped_ipv4(&lb_vip->vip);
     } else {
-        struct ofpact_conntrack *ct = ofpact_put_CT(&ofpacts);
-        ct->recirc_table = NX_CT_RECIRC_NONE;
-        ct->zone_src.field = mf_from_id(MFF_LOG_SNAT_ZONE);
-        ct->zone_src.ofs = 0;
-        ct->zone_src.n_bits = 16;
-        ct->flags = NX_CT_F_COMMIT;
-        ct->alg = 0;
-
-        size_t nat_offset;
-        nat_offset = ofpacts.size;
-        ofpbuf_pull(&ofpacts, nat_offset);
-
-        struct ofpact_nat *nat = ofpact_put_NAT(&ofpacts);
-        nat->flags = NX_NAT_F_SRC;
-        nat->range_af = address_family;
-
-        if (nat->range_af == AF_INET) {
-            nat->range.addr.ipv4.min = in6_addr_get_mapped_ipv4(&lb_vip->vip);
-        } else {
-            nat->range.addr.ipv6.min = lb_vip->vip;
-        }
-        ofpacts.header = ofpbuf_push_uninit(&ofpacts, nat_offset);
-        ofpact_finish(&ofpacts, &ct->ofpact);
+        nat->range.addr.ipv6.min = lb->hairpin_snat_ips.n_ipv6_addrs
+                                   ? lb->hairpin_snat_ips.ipv6_addrs[0].addr
+                                   : lb_vip->vip;
     }
+    ofpacts.header = ofpbuf_push_uninit(&ofpacts, nat_offset);
+    ofpact_finish(&ofpacts, &ct->ofpact);
 
     struct match match = MATCH_CATCHALL_INITIALIZER;
 
@@ -1942,15 +1847,31 @@  add_lb_ct_snat_hairpin_vip_flow(const struct ovn_controller_lb *lb,
         }
     }
 
-    /* We need to "add_or_append" flows because this match may form part
-     * of flows if the same "hairpin_snat_ip" address is present on mutiple
-     * LBs */
-    ofctrl_add_or_append_flow(flow_table, OFTABLE_CT_SNAT_HAIRPIN, priority,
-                              lb->slb->header_.uuid.parts[0],
-                              &match, &ofpacts, &lb->slb->header_.uuid,
-                              NX_CTLR_NO_METER, NULL);
-    ofpbuf_uninit(&ofpacts);
+    bool use_hairpin_snat_ip = false;
+    if ((address_family == AF_INET && lb->hairpin_snat_ips.n_ipv4_addrs) ||
+        (address_family == AF_INET6 && lb->hairpin_snat_ips.n_ipv6_addrs)) {
+        use_hairpin_snat_ip = true;
+    }
 
+    if (!use_hairpin_snat_ip) {
+        add_lb_ct_snat_hairpin_for_dp(lb, NULL, NULL,
+                                      &match, &ofpacts, flow_table);
+    } else {
+        for (size_t i = 0; i < lb->slb->n_datapaths; i++) {
+            add_lb_ct_snat_hairpin_for_dp(lb, lb->slb->datapaths[i],
+                                          local_datapaths,
+                                          &match, &ofpacts, flow_table);
+        }
+        if (lb->slb->datapath_group) {
+            for (size_t i = 0; i < lb->slb->datapath_group->n_datapaths; i++) {
+                add_lb_ct_snat_hairpin_for_dp(
+                    lb, lb->slb->datapath_group->datapaths[i],
+                    local_datapaths, &match, &ofpacts, flow_table);
+            }
+        }
+    }
+
+    ofpbuf_uninit(&ofpacts);
 }
 
 /* When a packet is sent to a LB VIP from a backend and the LB selects that
@@ -1960,13 +1881,10 @@  add_lb_ct_snat_hairpin_vip_flow(const struct ovn_controller_lb *lb,
  * the LB entry in the NBDB.
  *
  * add_lb_ct_snat_hairpin_flows() adds OpenFlow flows for each LB in order to
- * achieve this behaviour.
- *
- * Note: 'conjunctive_id' must be a unique identifier for each LB as it is used
- * as a conjunctive flow id. */
+ * achieve this behaviour. */
 static void
 add_lb_ct_snat_hairpin_flows(const struct ovn_controller_lb *lb,
-                             uint32_t conjunctive_id,
+                             const struct hmap *local_datapaths,
                              struct ovn_desired_flow_table *flow_table)
 {
     /* We must add a flow for each LB VIP. In the general case, this flow
@@ -1988,10 +1906,9 @@  add_lb_ct_snat_hairpin_flows(const struct ovn_controller_lb *lb,
        above we do not need to add an OpenFlow flow for each datapath. However,
        if one LB has specified "hairpin_snat_ip", then we need to SNAT that LB
        using the "hairpin_snat_ip" address rather than the VIP. In order to
-       achieve that, we can use a conjunctive flow that matches on any VIPs
-       from the "hairpin_snat_ip" LB and any datapath on which this LB is
-       added. This conjuctive flow can then SNAT using the "hairpin_snat_ip" IP
-       address rather than the LB VIP.
+       achieve that, we need to add a datapath metadata match.  These flows
+       will match on a subset of fields of more general flows, generated for a
+       case without "hairpin_snat_ip", so they need to have a higher priority.
 
        There is another potential exception. Consider the case in which we have
        two LBs which both have "hairpin_snat_ip" set. If these LBs have
@@ -2001,23 +1918,17 @@  add_lb_ct_snat_hairpin_flows(const struct ovn_controller_lb *lb,
        same VIP should not be added to the same datapath. */
 
     for (int i = 0; i < lb->n_vips; i++) {
-        struct ovn_lb_vip *lb_vip = &lb->vips[i];
-        add_lb_ct_snat_hairpin_vip_flow(lb, conjunctive_id,
-                                        lb_vip, flow_table);
+        add_lb_ct_snat_hairpin_vip_flow(lb, &lb->vips[i], local_datapaths,
+                                        flow_table);
     }
-
-    add_lb_ct_snat_hairpin_dp_flows(lb, conjunctive_id, flow_table);
 }
 
 static void
 consider_lb_hairpin_flows(const struct ovn_controller_lb *lb,
+                          const struct hmap *local_datapaths,
                           bool use_ct_mark,
-                          struct ovn_desired_flow_table *flow_table,
-                          struct simap *ids)
+                          struct ovn_desired_flow_table *flow_table)
 {
-    int id = simap_get(ids, lb->slb->name);
-    VLOG_DBG("Load Balancer %s has conjunctive flow id %u", lb->slb->name, id);
-
     for (size_t i = 0; i < lb->n_vips; i++) {
         struct ovn_lb_vip *lb_vip = &lb->vips[i];
 
@@ -2029,37 +1940,21 @@  consider_lb_hairpin_flows(const struct ovn_controller_lb *lb,
         }
     }
 
-    add_lb_ct_snat_hairpin_flows(lb, id, flow_table);
+    add_lb_ct_snat_hairpin_flows(lb, local_datapaths, flow_table);
 }
 
 /* Adds OpenFlow flows to flow tables for each Load balancer VIPs and
  * backends to handle the load balanced hairpin traffic. */
 static void
 add_lb_hairpin_flows(const struct hmap *local_lbs,
+                     const struct hmap *local_datapaths,
                      bool use_ct_mark,
-                     struct ovn_desired_flow_table *flow_table,
-                     struct simap *ids,
-                     struct id_pool *pool)
+                     struct ovn_desired_flow_table *flow_table)
 {
-    uint32_t id;
     const struct ovn_controller_lb *lb;
     HMAP_FOR_EACH (lb, hmap_node, local_lbs) {
-        /* Allocate a unique 32-bit integer to this load-balancer. This will
-         * be used as a conjunctive flow id in the OFTABLE_CT_SNAT_HAIRPIN
-         * table.
-         *
-         * If we are unable to allocate a unique ID then we have run out of
-         * ids. As this is unrecoverable then we abort. However, this is
-         * unlikely to happen as it would be mean that we have created
-         * "UINT32_MAX" load-balancers.
-         */
-
-        id = simap_get(ids, lb->slb->name);
-        if (!id) {
-            ovs_assert(id_pool_alloc_id(pool, &id));
-            simap_put(ids, lb->slb->name, id);
-        }
-        consider_lb_hairpin_flows(lb, use_ct_mark, flow_table, ids);
+        consider_lb_hairpin_flows(lb, local_datapaths,
+                                  use_ct_mark, flow_table);
     }
 }
 
@@ -2196,10 +2091,9 @@  lflow_run(struct lflow_ctx_in *l_ctx_in, struct lflow_ctx_out *l_ctx_out)
                        l_ctx_in->local_datapaths,
                        l_ctx_out->flow_table);
     add_lb_hairpin_flows(l_ctx_in->local_lbs,
+                         l_ctx_in->local_datapaths,
                          l_ctx_in->lb_hairpin_use_ct_mark,
-                         l_ctx_out->flow_table,
-                         l_ctx_out->hairpin_lb_ids,
-                         l_ctx_out->hairpin_id_pool);
+                         l_ctx_out->flow_table);
     add_fdb_flows(l_ctx_in->fdb_table, l_ctx_in->local_datapaths,
                   l_ctx_out->flow_table);
     add_port_sec_flows(l_ctx_in->binding_lports, l_ctx_in->chassis,
@@ -2423,9 +2317,6 @@  lflow_handle_changed_lbs(struct lflow_ctx_in *l_ctx_in,
 {
     const struct ovn_controller_lb *lb;
 
-    struct id_pool *pool = l_ctx_out->hairpin_id_pool;
-    struct simap *ids = l_ctx_out->hairpin_lb_ids;
-
     struct uuidset_node *uuid_node;
     UUIDSET_FOR_EACH (uuid_node, deleted_lbs) {
         lb = ovn_controller_lb_find(old_lbs, &uuid_node->uuid);
@@ -2433,8 +2324,6 @@  lflow_handle_changed_lbs(struct lflow_ctx_in *l_ctx_in,
         VLOG_DBG("Remove hairpin flows for deleted load balancer "UUID_FMT,
                  UUID_ARGS(&uuid_node->uuid));
         ofctrl_remove_flows(l_ctx_out->flow_table, &uuid_node->uuid);
-        id_pool_free_id(pool, simap_get(ids, lb->slb->name));
-        simap_find_and_delete(ids, lb->slb->name);
     }
 
     UUIDSET_FOR_EACH (uuid_node, updated_lbs) {
@@ -2443,32 +2332,19 @@  lflow_handle_changed_lbs(struct lflow_ctx_in *l_ctx_in,
         VLOG_DBG("Remove and add hairpin flows for updated load balancer "
                   UUID_FMT, UUID_ARGS(&uuid_node->uuid));
         ofctrl_remove_flows(l_ctx_out->flow_table, &uuid_node->uuid);
-        consider_lb_hairpin_flows(lb, l_ctx_in->lb_hairpin_use_ct_mark,
-                                  l_ctx_out->flow_table,
-                                  l_ctx_out->hairpin_lb_ids);
+        consider_lb_hairpin_flows(lb, l_ctx_in->local_datapaths,
+                                  l_ctx_in->lb_hairpin_use_ct_mark,
+                                  l_ctx_out->flow_table);
     }
 
     UUIDSET_FOR_EACH (uuid_node, new_lbs) {
         lb = ovn_controller_lb_find(l_ctx_in->local_lbs, &uuid_node->uuid);
 
-        /* Allocate a unique 32-bit integer to this load-balancer. This
-         * will be used as a conjunctive flow id in the
-         * OFTABLE_CT_SNAT_HAIRPIN table.
-         *
-         * If we are unable to allocate a unique ID then we have run out of
-         * ids. As this is unrecoverable then we abort. However, this is
-         * unlikely to happen as it would be mean that we have created
-         * "UINT32_MAX" load-balancers.
-         */
-        uint32_t id;
-        ovs_assert(id_pool_alloc_id(pool, &id));
-        simap_put(ids, lb->slb->name, id);
-
         VLOG_DBG("Add load balancer hairpin flows for "UUID_FMT,
                  UUID_ARGS(&uuid_node->uuid));
-        consider_lb_hairpin_flows(lb, l_ctx_in->lb_hairpin_use_ct_mark,
-                                  l_ctx_out->flow_table,
-                                  l_ctx_out->hairpin_lb_ids);
+        consider_lb_hairpin_flows(lb, l_ctx_in->local_datapaths,
+                                  l_ctx_in->lb_hairpin_use_ct_mark,
+                                  l_ctx_out->flow_table);
     }
 
     return true;
diff --git a/controller/lflow.h b/controller/lflow.h
index 44e534696..dd742257b 100644
--- a/controller/lflow.h
+++ b/controller/lflow.h
@@ -128,8 +128,6 @@  struct lflow_ctx_out {
     struct lflow_cache *lflow_cache;
     struct conj_ids *conj_ids;
     struct uuidset *objs_processed;
-    struct simap *hairpin_lb_ids;
-    struct id_pool *hairpin_id_pool;
 };
 
 void lflow_init(void);
diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c
index 292ca6b10..2d18bbfca 100644
--- a/controller/ovn-controller.c
+++ b/controller/ovn-controller.c
@@ -3264,11 +3264,6 @@  struct lflow_output_persistent_data {
     struct lflow_cache *lflow_cache;
 };
 
-struct lflow_output_hairpin_data {
-    struct id_pool *pool;
-    struct simap   ids;
-};
-
 struct ed_type_lflow_output {
     /* Logical flow table */
     struct ovn_desired_flow_table flow_table;
@@ -3290,9 +3285,6 @@  struct ed_type_lflow_output {
      * full recompute. */
     struct lflow_output_persistent_data pd;
 
-    /* Data for managing hairpin flow conjunctive flow ids. */
-    struct lflow_output_hairpin_data hd;
-
     /* Fixed neighbor discovery supported options. */
     struct hmap nd_ra_opts;
 
@@ -3448,8 +3440,6 @@  init_lflow_ctx(struct engine_node *node,
     l_ctx_out->conj_ids = &fo->conj_ids;
     l_ctx_out->objs_processed = &fo->objs_processed;
     l_ctx_out->lflow_cache = fo->pd.lflow_cache;
-    l_ctx_out->hairpin_id_pool = fo->hd.pool;
-    l_ctx_out->hairpin_lb_ids = &fo->hd.ids;
 }
 
 static void *
@@ -3463,8 +3453,6 @@  en_lflow_output_init(struct engine_node *node OVS_UNUSED,
     objdep_mgr_init(&data->lflow_deps_mgr);
     lflow_conj_ids_init(&data->conj_ids);
     uuidset_init(&data->objs_processed);
-    simap_init(&data->hd.ids);
-    data->hd.pool = id_pool_create(1, UINT32_MAX - 1);
     nd_ra_opts_init(&data->nd_ra_opts);
     controller_event_opts_init(&data->controller_event_opts);
     flow_collector_ids_init(&data->collector_ids);
@@ -3489,8 +3477,6 @@  en_lflow_output_cleanup(void *data)
     lflow_conj_ids_destroy(&flow_output_data->conj_ids);
     uuidset_destroy(&flow_output_data->objs_processed);
     lflow_cache_destroy(flow_output_data->pd.lflow_cache);
-    simap_destroy(&flow_output_data->hd.ids);
-    id_pool_destroy(flow_output_data->hd.pool);
     nd_ra_opts_destroy(&flow_output_data->nd_ra_opts);
     controller_event_opts_destroy(&flow_output_data->controller_event_opts);
     flow_collector_ids_destroy(&flow_output_data->collector_ids);