[ovs-dev] northd: Fix logical router load-balancer nat rules when using DGP.

Message ID 20240219213321.56103-1-roberto.acosta@luizalabs.com
State New
Series [ovs-dev] northd: Fix logical router load-balancer nat rules when using DGP.

Checks

Context                                  Check    Description
ovsrobot/apply-robot                     success  apply and check: success
ovsrobot/github-robot-_Build_and_Test    success  github build: passed
ovsrobot/github-robot-_ovn-kubernetes    success  github build: passed

Commit Message

Roberto Bartzen Acosta Feb. 19, 2024, 9:33 p.m. UTC
This commit fixes the build_distr_lrouter_nat_flows_for_lb function to
include one NAT flow entry for each DGP in use. Since we have added support
for creating multiple gateway ports per logical router, it's necessary to
include a specific entry in the LR NAT rules pipeline for each attached
DGP. Otherwise, the ingress traffic is only redirected when the incoming
LRP matches the chassis_resident field.

Considering that DNAT rules for DGPs were implemented with the need to
configure the DGP-related gateway_port column, the load-balancer NAT rule
configuration can use a similar idea. In this case, we don't know which LRP
is responsible for the incoming traffic, and therefore we must apply the
NAT rule automatically created by the load balancer on all DGPs to allow
the incoming traffic.

Reported-at: https://bugs.launchpad.net/ubuntu/+source/ovn/+bug/2054322
Fixes: 15348b7b806f ("ovn-northd: Multiple distributed gateway port support.")
Signed-off-by: Roberto Bartzen Acosta <roberto.acosta@luizalabs.com>
---
 northd/en-lr-stateful.c | 12 ------
 northd/northd.c         | 14 ++++---
 tests/ovn-northd.at     | 92 +++++++++++++++++++++++++++++++++++++++++
 3 files changed, 100 insertions(+), 18 deletions(-)

Comments

Mark Michelson March 18, 2024, 2:53 p.m. UTC | #1
Hi Roberto,

I have some concerns about this patch. Let's use the test case you added 
as an example network. Let's bind the VMs and DGPs to hypervisors:

* vm1 and lr1-ts1 are bound to hypervisor hv1
* vm2 and lr1-ts2 are bound to hypervisor hv2

Now imagine a packet arrives on lr1-ts1. The packet gets load balanced 
and sent to vm2. In this case, since lr1-ts1 is on hv1, the ct_lb_mark() 
action runs there, creating a conntrack entry on hv1. However, the 
packet is eventually tunneled to hv2 so that it can be output to vm2.

Now vm2 replies to the packet. Is there anything that ensures that the 
reply from vm2 gets sent to hv1 for the egress pipeline of lr1? If so, 
then this should work fine since the packet will be unDNATted as 
expected on hv1. However, if the packet runs all of lr1's pipelines on 
hv2, then there is no conntrack entry present, and the attempt to unDNAT 
will not succeed. The packet will either be dropped because of invalid 
CT, or the packet will have an incorrect source IP and port. Either way, 
this isn't what is desired.
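
For what it's worth, one way to make this visible in a sandbox (a sketch
using the testsuite's "as" helper; the hv1/hv2 names follow the binding
above and are illustrative) is to pin the ports and compare conntrack state
on the two hypervisors after sending one connection to the VIP through
lr1-ts1:

  # Pin backends and gateway ports per the binding above.
  ovn-nbctl lsp-set-options vm1 requested-chassis=hv1
  ovn-nbctl lsp-set-options vm2 requested-chassis=hv2
  ovn-nbctl lrp-set-gateway-chassis lr1-ts1 hv1
  ovn-nbctl lrp-set-gateway-chassis lr1-ts2 hv2

  # The DNAT conntrack entry for the VIP should then exist only on hv1:
  as hv1 ovs-appctl dpctl/dump-conntrack | grep 30.0.0.1
  as hv2 ovs-appctl dpctl/dump-conntrack | grep 30.0.0.1

If the hv2 dump shows no matching entry, a reply from vm2 that runs lr1's
egress pipeline on hv2 will hit exactly the unDNAT miss described above.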

On 2/19/24 16:33, Roberto Bartzen Acosta wrote:
> This commit fixes the build_distr_lrouter_nat_flows_for_lb function to
> include one NAT flow entry for each DGP in use. Since we have added support
> for creating multiple gateway ports per logical router, it's necessary to
> include a specific entry in the LR NAT rules pipeline for each attached
> DGP. Otherwise, the ingress traffic is only redirected when the incoming
> LRP matches the chassis_resident field.
> 
> Considering that DNAT rules for DGPs were implemented with the need to
> configure the DGP-related gateway_port column, the load-balancer NAT rule
> configuration can use a similar idea. In this case, we don't know which LRP
> is responsible for the incoming traffic, and therefore we must apply the
> NAT rule automatically created by the load balancer on all DGPs to allow
> the incoming traffic.
> 
> Reported-at: https://bugs.launchpad.net/ubuntu/+source/ovn/+bug/2054322
> Fixes: 15348b7b806f ("ovn-northd: Multiple distributed gateway port support.")
> Signed-off-by: Roberto Bartzen Acosta <roberto.acosta@luizalabs.com>
Roberto Bartzen Acosta March 22, 2024, 11:50 a.m. UTC | #2
Hi Mark,

Thanks for your feedback.

On Mon, Mar 18, 2024 at 11:53 AM, Mark Michelson <mmichels@redhat.com>
wrote:

> Hi Roberto,
>
> I have some concerns about this patch. Let's use the test case you added
> as an example network. Let's bind the vms and DGPs to hypervisors:
>
> * vm1 and lr1-ts1 are bound to hypervisor hv1
> * vm2 and lr1-ts2 are bound to hypervisor hv2
>
> Now imagine a packet arrives on lr1-ts1. The packet gets load balanced
> and sent to vm2. In this case, since lr1-ts1 is on hv1, the ct_lb_mark()
> action runs there, creating a conntrack entry on hv1. However, the
> packet eventually is tunneled to hv2 so that it can be output to vm2.
>
> Now vm2 replies to the packet. Is there anything that ensures that the
> reply from vm2 gets sent to hv1 for the egress pipeline of lr1? If so,
> then this should work fine since the packet will be unDNATted as
> expected on hv1. However, if the packet runs all of lr1's pipelines on
> hv2, then there is no conntrack entry present, and the attempt to unDNAT
> will not succeed. The packet will either be dropped because of invalid
> CT, or the packet will have an incorrect source IP and port. Either way,
> this isn't what is desired.
>

Yep, you're right! That makes sense.
If the packet comes back through a chassis that does not have the LB
conntrack entry corresponding to the initial request, this will trigger a
miss in the pipeline.

I tried to disable connection tracking for the LB by configuring the
parameter on the router, but I still wasn't successful. E.g.:
options:lb_force_snat_ip="172.16.200.201"
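
For reference, that was set with the usual ovn-nbctl column syntax (lr1
here stands in for the actual router name):

  ovn-nbctl set Logical_Router lr1 options:lb_force_snat_ip="172.16.200.201"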

Even after changing lb_force_snat_ip on the router, or creating a
"stateless" SNAT rule, I still see this conntrack-based action in the
egress pipeline (lr_out_undnat, table=1) at the highest priority:

  table=1 (lr_out_undnat      ), priority=120  , match=(ip4 && ((ip4.src == 10.195.185.2 && tcp.src == 8000) || (ip4.src == 10.195.185.3 && tcp.src == 8000) || (ip4.src == 10.195.185.4 && tcp.src == 8000) || (ip4.src == 10.195.185.5 && tcp.src == 8000) || (ip4.src == 10.195.185.6 && tcp.src == 8000)) && outport == "incus-net40-lr-lrp-01" && is_chassis_resident("cr-incus-net40-lr-lrp-01")), action=(ct_dnat_in_czone;)
  table=1 (lr_out_undnat      ), priority=120  , match=(ip4 && ((ip4.src == 10.195.185.2 && tcp.src == 8000) || (ip4.src == 10.195.185.3 && tcp.src == 8000) || (ip4.src == 10.195.185.4 && tcp.src == 8000) || (ip4.src == 10.195.185.5 && tcp.src == 8000) || (ip4.src == 10.195.185.6 && tcp.src == 8000)) && outport == "incus-net40-lr-lrp-02" && is_chassis_resident("cr-incus-net40-lr-lrp-02")), action=(ct_dnat_in_czone;)
  table=1 (lr_out_undnat      ), priority=120  , match=(ip4 && ((ip4.src == 10.195.185.2 && tcp.src == 8000) || (ip4.src == 10.195.185.3 && tcp.src == 8000) || (ip4.src == 10.195.185.4 && tcp.src == 8000) || (ip4.src == 10.195.185.5 && tcp.src == 8000) || (ip4.src == 10.195.185.6 && tcp.src == 8000)) && outport == "incus-net40-lr-lrp-03" && is_chassis_resident("cr-incus-net40-lr-lrp-03")), action=(ct_dnat_in_czone;)
  table=1 (lr_out_undnat      ), priority=120  , match=(ip4 && ((ip4.src == 10.195.185.2 && tcp.src == 8000) || (ip4.src == 10.195.185.3 && tcp.src == 8000) || (ip4.src == 10.195.185.4 && tcp.src == 8000) || (ip4.src == 10.195.185.5 && tcp.src == 8000) || (ip4.src == 10.195.185.6 && tcp.src == 8000)) && outport == "incus-net40-lr-lrp-04" && is_chassis_resident("cr-incus-net40-lr-lrp-04")), action=(ct_dnat_in_czone;)
  table=1 (lr_out_undnat      ), priority=120  , match=(ip4 && ((ip4.src == 10.195.185.2 && tcp.src == 8000) || (ip4.src == 10.195.185.3 && tcp.src == 8000) || (ip4.src == 10.195.185.4 && tcp.src == 8000) || (ip4.src == 10.195.185.5 && tcp.src == 8000) || (ip4.src == 10.195.185.6 && tcp.src == 8000)) && outport == "incus-net40-lr-lrp-ext" && is_chassis_resident("cr-incus-net40-lr-lrp-ext")), action=(ct_dnat_in_czone;)


Considering that the return traffic when using multiple DGPs is probably
spread by ECMP, do you have any idea how to change the rules so that the LB
egress rules are 'stateless' (not conntrack dependent) in this case with
multiple DGPs? I imagine this would solve the problem and guarantee a
working return path through any chassis.
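
Just to illustrate what I mean by 'stateless' (a hand-written sketch, not
something northd generates for load balancers today; the VIP
203.0.113.10:80 below is made up): instead of ct_dnat_in_czone, the egress
flow would rewrite the headers directly, the way stateless dnat_and_snat
flows do, e.g.:

  table=1 (lr_out_undnat      ), priority=120  , match=(ip4 && ip4.src == 10.195.185.2 && tcp.src == 8000 && outport == "incus-net40-lr-lrp-01"), action=(ip4.src = 203.0.113.10; tcp.src = 80; next;)

That would make the reply direction independent of which chassis holds the
conntrack entry.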

Thanks,
Roberto

PS: I have a complex load-balancer scenario for the use case with multiple
DGPs and 'internal' VIP addresses + OVN interconnect. I can explain the
general context if you are interested :)


Patch

diff --git a/northd/en-lr-stateful.c b/northd/en-lr-stateful.c
index 6d0192487..7ffa4a690 100644
--- a/northd/en-lr-stateful.c
+++ b/northd/en-lr-stateful.c
@@ -537,18 +537,6 @@  lr_stateful_record_create(struct lr_stateful_table *table,
 
     table->array[od->index] = lr_stateful_rec;
 
-    /* Load balancers are not supported (yet) if a logical router has multiple
-     * distributed gateway port.  Log a warning. */
-    if (lr_stateful_rec->has_lb_vip && lr_has_multiple_gw_ports(od)) {
-        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 1);
-        VLOG_WARN_RL(&rl, "Load-balancers are configured on logical "
-                     "router %s, which has %"PRIuSIZE" distributed "
-                     "gateway ports. Load-balancer is not supported "
-                     "yet when there is more than one distributed "
-                     "gateway port on the router.",
-                     od->nbr->name, od->n_l3dgw_ports);
-    }
-
     return lr_stateful_rec;
 }
 
diff --git a/northd/northd.c b/northd/northd.c
index 2c3560ce2..7eb943d2f 100644
--- a/northd/northd.c
+++ b/northd/northd.c
@@ -10919,10 +10919,9 @@  static void
 build_distr_lrouter_nat_flows_for_lb(struct lrouter_nat_lb_flows_ctx *ctx,
                                      enum lrouter_nat_lb_flow_type type,
                                      struct ovn_datapath *od,
-                                     struct lflow_ref *lflow_ref)
+                                     struct lflow_ref *lflow_ref,
+                                     struct ovn_port *dgp)
 {
-    struct ovn_port *dgp = od->l3dgw_ports[0];
-
     const char *undnat_action;
 
     switch (type) {
@@ -10953,7 +10952,7 @@  build_distr_lrouter_nat_flows_for_lb(struct lrouter_nat_lb_flows_ctx *ctx,
 
     if (ctx->lb_vip->n_backends || !ctx->lb_vip->empty_backend_rej) {
         ds_put_format(ctx->new_match, " && is_chassis_resident(%s)",
-                      od->l3dgw_ports[0]->cr_port->json_key);
+                      dgp->cr_port->json_key);
     }
 
     ovn_lflow_add_with_hint__(ctx->lflows, od, S_ROUTER_IN_DNAT, ctx->prio,
@@ -11164,8 +11163,11 @@  build_lrouter_nat_flows_for_lb(
         if (!od->n_l3dgw_ports) {
             bitmap_set1(gw_dp_bitmap[type], index);
         } else {
-            build_distr_lrouter_nat_flows_for_lb(&ctx, type, od,
-                                                 lb_dps->lflow_ref);
+            for (size_t i = 0; i < od->n_l3dgw_ports; i++) {
+                struct ovn_port *dgp = od->l3dgw_ports[i];
+                build_distr_lrouter_nat_flows_for_lb(&ctx, type, od,
+                                                     lb_dps->lflow_ref, dgp);
+            }
         }
 
         if (lb->affinity_timeout) {
diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
index 6fdd761da..fa24935e1 100644
--- a/tests/ovn-northd.at
+++ b/tests/ovn-northd.at
@@ -12313,3 +12313,95 @@  check_engine_stats northd recompute nocompute
 check_engine_stats lflow recompute nocompute
 
 AT_CLEANUP
+
+OVN_FOR_EACH_NORTHD_NO_HV([
+AT_SETUP([Load balancer with Distributed Gateway Ports (DGP)])
+ovn_start
+
+check ovn-nbctl ls-add public
+check ovn-nbctl lr-add lr1
+
+# lr1 DGP ts1
+check ovn-nbctl ls-add ts1
+check ovn-nbctl lrp-add lr1 lr1-ts1 00:00:01:02:03:04 172.16.10.1/24
+check ovn-nbctl lrp-set-gateway-chassis lr1-ts1 chassis-2
+
+# lr1 DGP ts2
+check ovn-nbctl ls-add ts2
+check ovn-nbctl lrp-add lr1 lr1-ts2 00:00:01:02:03:05 172.16.20.1/24
+check ovn-nbctl lrp-set-gateway-chassis lr1-ts2 chassis-3
+
+# lr1 DGP public
+check ovn-nbctl lrp-add lr1 lr1_public 00:de:ad:ff:00:01 173.16.0.1/16
+check ovn-nbctl lrp-add lr1 lr1_s1 00:de:ad:fe:00:02 172.16.0.1/24
+check ovn-nbctl lrp-set-gateway-chassis lr1_public chassis-1
+
+check ovn-nbctl ls-add s1
+# s1 - lr1
+check ovn-nbctl lsp-add s1 s1_lr1
+check ovn-nbctl lsp-set-type s1_lr1 router
+check ovn-nbctl lsp-set-addresses s1_lr1 "00:de:ad:fe:00:02 172.16.0.1"
+check ovn-nbctl lsp-set-options s1_lr1 router-port=lr1_s1
+
+# s1 - backend vm1
+check ovn-nbctl lsp-add s1 vm1
+check ovn-nbctl lsp-set-addresses vm1 "00:de:ad:01:00:01 172.16.0.101"
+
+# s1 - backend vm2
+check ovn-nbctl lsp-add s1 vm2
+check ovn-nbctl lsp-set-addresses vm2 "00:de:ad:01:00:02 172.16.0.102"
+
+# s1 - backend vm3
+check ovn-nbctl lsp-add s1 vm3
+check ovn-nbctl lsp-set-addresses vm3 "00:de:ad:01:00:03 172.16.0.103"
+
+# Add the lr1 DGP ts1 to the public switch
+check ovn-nbctl lsp-add public public_lr1_ts1
+check ovn-nbctl lsp-set-type public_lr1_ts1 router
+check ovn-nbctl lsp-set-addresses public_lr1_ts1 router
+check ovn-nbctl lsp-set-options public_lr1_ts1 router-port=lr1-ts1 nat-addresses=router
+
+# Add the lr1 DGP ts2 to the public switch
+check ovn-nbctl lsp-add public public_lr1_ts2
+check ovn-nbctl lsp-set-type public_lr1_ts2 router
+check ovn-nbctl lsp-set-addresses public_lr1_ts2 router
+check ovn-nbctl lsp-set-options public_lr1_ts2 router-port=lr1-ts2 nat-addresses=router
+
+# Add the lr1 DGP public to the public switch
+check ovn-nbctl lsp-add public public_lr1
+check ovn-nbctl lsp-set-type public_lr1 router
+check ovn-nbctl lsp-set-addresses public_lr1 router
+check ovn-nbctl lsp-set-options public_lr1 router-port=lr1_public nat-addresses=router
+
+# Create the Load Balancer lb1
+check ovn-nbctl --wait=sb lb-add lb1 "30.0.0.1" "172.16.0.103,172.16.0.102,172.16.0.101"
+
+# Associate load balancer to s1
+check ovn-nbctl ls-lb-add s1 lb1
+check ovn-nbctl --wait=sb sync
+
+ovn-sbctl dump-flows s1 > s1flows
+AT_CAPTURE_FILE([s1flows])
+
+AT_CHECK([grep "ls_in_pre_stateful" s1flows | ovn_strip_lflows | grep "30.0.0.1"], [0], [dnl
+  table=??(ls_in_pre_stateful ), priority=120  , match=(reg0[[2]] == 1 && ip4.dst == 30.0.0.1), action=(reg1 = 30.0.0.1; ct_lb_mark;)
+])
+AT_CHECK([grep "ls_in_lb" s1flows | ovn_strip_lflows | grep "30.0.0.1"], [0], [dnl
+  table=??(ls_in_lb           ), priority=110  , match=(ct.new && ip4.dst == 30.0.0.1), action=(reg0[[1]] = 0; ct_lb_mark(backends=172.16.0.103,172.16.0.102,172.16.0.101);)
+])
+
+# Associate load balancer to lr1 with DGP
+check ovn-nbctl lr-lb-add lr1 lb1
+check ovn-nbctl --wait=sb sync
+
+ovn-sbctl dump-flows lr1 > lr1flows
+AT_CAPTURE_FILE([lr1flows])
+
+AT_CHECK([grep "lr_in_dnat" lr1flows | ovn_strip_lflows | grep "30.0.0.1"], [0], [dnl
+  table=??(lr_in_dnat         ), priority=110  , match=(ct.new && !ct.rel && ip4 && ip4.dst == 30.0.0.1 && is_chassis_resident("cr-lr1-ts1")), action=(ct_lb_mark(backends=172.16.0.103,172.16.0.102,172.16.0.101);)
+  table=??(lr_in_dnat         ), priority=110  , match=(ct.new && !ct.rel && ip4 && ip4.dst == 30.0.0.1 && is_chassis_resident("cr-lr1-ts2")), action=(ct_lb_mark(backends=172.16.0.103,172.16.0.102,172.16.0.101);)
+  table=??(lr_in_dnat         ), priority=110  , match=(ct.new && !ct.rel && ip4 && ip4.dst == 30.0.0.1 && is_chassis_resident("cr-lr1_public")), action=(ct_lb_mark(backends=172.16.0.103,172.16.0.102,172.16.0.101);)
+])
+
+AT_CLEANUP
+])
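
A quick way to exercise just the new test case from a built tree (standard
autotest keyword selection; adjust the keyword if the AT_SETUP title
changes):

  make check TESTSUITEFLAGS="-k 'Load balancer with Distributed Gateway Ports'"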