diff mbox series

[ovs-dev,v9,4/4] northd: Flood ARPs to routers for "unreachable" addresses.

Message ID 20210630235632.3605211-5-mmichels@redhat.com
State Superseded
Headers show
Series ARP and Floating IP Fixes | expand

Checks

Context Check Description
ovsrobot/apply-robot success apply and check: success
ovsrobot/github-robot success github build: passed

Commit Message

Mark Michelson June 30, 2021, 11:56 p.m. UTC
Previously, ARP TPAs were filtered down only to "reachable" addresses.
Reachable addresses are all router interface addresses, as well as NAT
external addresses and load balancer VIPs that are within the subnet
handled by a router's port.

However, it is possible that in some configurations, CMSes purposely
configure NAT or load balancer addresses on a router that are outside
the router's subnets, and they expect the router to respond to ARPs for
those addresses.

This commit adds a higher priority flow to logical switches that makes
it so ARPs targeted at "unreachable" addresses are flooded to all ports.
This way, the ARPs can reach the router appropriately and receive a
response.

Reported at: https://bugzilla.redhat.com/show_bug.cgi?id=1929901

Signed-off-by: Mark Michelson <mmichels@redhat.com>
---
 northd/ovn-northd.8.xml |   8 ++
 northd/ovn-northd.c     | 162 +++++++++++++++++++++++++++-------------
 northd/ovn_northd.dl    |  91 ++++++++++++++++++----
 tests/ovn-northd.at     |  99 ++++++++++++++++++++++++
 tests/system-ovn.at     | 102 +++++++++++++++++++++++++
 5 files changed, 395 insertions(+), 67 deletions(-)

Comments

Numan Siddique July 7, 2021, 5:41 p.m. UTC | #1
On Wed, Jun 30, 2021 at 7:57 PM Mark Michelson <mmichels@redhat.com> wrote:
>
> Previously, ARP TPAs were filtered down only to "reachable" addresses.
> Reachable addresses are all router interface addresses, as well as NAT
> external addresses and load balancer VIPs that are within the subnet
> handled by a router's port.
>
> However, it is possible that in some configurations, CMSes purposely
> configure NAT or load balancer addresses on a router that are outside
> the router's subnets, and they expect the router to respond to ARPs for
> those addresses.
>
> This commit adds a higher priority flow to logical switches that makes
> it so ARPs targeted at "unreachable" addresses are flooded to all ports.
> This way, the ARPs can reach the router appropriately and receive a
> response.
>
> Reported at: https://bugzilla.redhat.com/show_bug.cgi?id=1929901
>
> Signed-off-by: Mark Michelson <mmichels@redhat.com>

Acked-by: Numan Siddique <numans@ovn.org>

I've one comment which probably can be addressed.

If a configured NAT entry on a logical router is unreachable within
that router, this patch floods
the packet for the ARP destined to that NAT IP in the logical switch pipeline.

Before adding the flow to flood, can't we check if the NAT IP is
reachable from other router ports
connected to this logical switch.  If so, we can do "outport ==
<REACHABLE_LRP_PORT>" instead
of flooding.

I think this is possible given that the logical flow is added in the
switch pipeline.  We just need to loop through
all the router ports of the logical switch.  The question is - is this
efficient  and takes up some time on a scaled environment ?

What do you think ?  If this seems fine,  it can be a follow up patch too.

Numan

> ---
>  northd/ovn-northd.8.xml |   8 ++
>  northd/ovn-northd.c     | 162 +++++++++++++++++++++++++++-------------
>  northd/ovn_northd.dl    |  91 ++++++++++++++++++----
>  tests/ovn-northd.at     |  99 ++++++++++++++++++++++++
>  tests/system-ovn.at     | 102 +++++++++++++++++++++++++
>  5 files changed, 395 insertions(+), 67 deletions(-)
>
> diff --git a/northd/ovn-northd.8.xml b/northd/ovn-northd.8.xml
> index beaf5a183..5aedd6619 100644
> --- a/northd/ovn-northd.8.xml
> +++ b/northd/ovn-northd.8.xml
> @@ -1587,6 +1587,14 @@ output;
>          logical ports.
>        </li>
>
> +      <li>
> +        Priority-90 flows for each IP address/VIP/NAT address configured
> +        outside its owning router port's subnet. These flows match ARP
> +        requests and ND packets for the specific IP addresses.  Matched packets
> +        are forwarded to the <code>MC_FLOOD</code> multicast group which
> +        contains all connected logical ports.
> +      </li>
> +
>        <li>
>          Priority-75 flows for each port connected to a logical router
>          matching self originated ARP request/ND packets.  These packets
> diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
> index f6fad281b..d0b325748 100644
> --- a/northd/ovn-northd.c
> +++ b/northd/ovn-northd.c
> @@ -6555,38 +6555,41 @@ build_lswitch_rport_arp_req_self_orig_flow(struct ovn_port *op,
>      ds_destroy(&match);
>  }
>
> -/*
> - * Ingress table 19: Flows that forward ARP/ND requests only to the routers
> - * that own the addresses. Other ARP/ND packets are still flooded in the
> - * switching domain as regular broadcast.
> - */
>  static void
> -build_lswitch_rport_arp_req_flow_for_ip(struct ds *ip_match,
> -                                        int addr_family,
> -                                        struct ovn_port *patch_op,
> -                                        struct ovn_datapath *od,
> -                                        uint32_t priority,
> -                                        struct hmap *lflows,
> -                                        const struct ovsdb_idl_row *stage_hint)
> +arp_nd_ns_match(struct ds *ips, int addr_family, struct ds *match)
>  {
> -    struct ds match   = DS_EMPTY_INITIALIZER;
> -    struct ds actions = DS_EMPTY_INITIALIZER;
> -
>      /* Packets received from VXLAN tunnels have already been through the
>       * router pipeline so we should skip them. Normally this is done by the
>       * multicast_group implementation (VXLAN packets skip table 32 which
>       * delivers to patch ports) but we're bypassing multicast_groups.
>       */
> -    ds_put_cstr(&match, FLAGBIT_NOT_VXLAN " && ");
> +    ds_put_cstr(match, FLAGBIT_NOT_VXLAN " && ");
>
>      if (addr_family == AF_INET) {
> -        ds_put_cstr(&match, "arp.op == 1 && arp.tpa == { ");
> +        ds_put_cstr(match, "arp.op == 1 && arp.tpa == {");
>      } else {
> -        ds_put_cstr(&match, "nd_ns && nd.target == { ");
> +        ds_put_cstr(match, "nd_ns && nd.target == {");
>      }
>
> -    ds_put_cstr(&match, ds_cstr_ro(ip_match));
> -    ds_put_cstr(&match, "}");
> +    ds_put_cstr(match, ds_cstr_ro(ips));
> +    ds_put_cstr(match, "}");
> +}
> +
> +/*
> + * Ingress table 19: Flows that forward ARP/ND requests only to the routers
> + * that own the addresses. Other ARP/ND packets are still flooded in the
> + * switching domain as regular broadcast.
> + */
> +static void
> +build_lswitch_rport_arp_req_flow_for_reachable_ip(struct ds *ips,
> +    int addr_family, struct ovn_port *patch_op, struct ovn_datapath *od,
> +    uint32_t priority, struct hmap *lflows,
> +    const struct ovsdb_idl_row *stage_hint)
> +{
> +    struct ds match   = DS_EMPTY_INITIALIZER;
> +    struct ds actions = DS_EMPTY_INITIALIZER;
> +
> +    arp_nd_ns_match(ips, addr_family, &match);
>
>      /* Send a the packet to the router pipeline.  If the switch has non-router
>       * ports then flood it there as well.
> @@ -6609,6 +6612,30 @@ build_lswitch_rport_arp_req_flow_for_ip(struct ds *ip_match,
>      ds_destroy(&actions);
>  }
>
> +/*
> + * Ingress table 19: Flows that forward ARP/ND requests for "unreachable" IPs
> + * (NAT or load balancer IPs configured on a router that are outside the
> + * router's configured subnets).
> + * These ARP/ND packets are flooded in the switching domain as regular
> + * broadcast.
> + */
> +static void
> +build_lswitch_rport_arp_req_flow_for_unreachable_ip(struct ds *ips,
> +    int addr_family, struct ovn_datapath *od, uint32_t priority,
> +    struct hmap *lflows, const struct ovsdb_idl_row *stage_hint)
> +{
> +    struct ds match = DS_EMPTY_INITIALIZER;
> +
> +    arp_nd_ns_match(ips, addr_family, &match);
> +
> +    ovn_lflow_add_with_hint(lflows, od, S_SWITCH_IN_L2_LKUP,
> +                            priority, ds_cstr(&match),
> +                            "outport = \""MC_FLOOD"\"; output;",
> +                            stage_hint);
> +
> +    ds_destroy(&match);
> +}
> +
>  /*
>   * Ingress table 19: Flows that forward ARP/ND requests only to the routers
>   * that own the addresses.
> @@ -6635,8 +6662,10 @@ build_lswitch_rport_arp_req_flows(struct ovn_port *op,
>       * router port.
>       * Priority: 80.
>       */
> -    struct ds ips_v4_match = DS_EMPTY_INITIALIZER;
> -    struct ds ips_v6_match = DS_EMPTY_INITIALIZER;
> +    struct ds ips_v4_match_reachable = DS_EMPTY_INITIALIZER;
> +    struct ds ips_v6_match_reachable = DS_EMPTY_INITIALIZER;
> +    struct ds ips_v4_match_unreachable = DS_EMPTY_INITIALIZER;
> +    struct ds ips_v6_match_unreachable = DS_EMPTY_INITIALIZER;
>
>      const char *ip_addr;
>      SSET_FOR_EACH (ip_addr, &op->od->lb_ips_v4) {
> @@ -6645,9 +6674,12 @@ build_lswitch_rport_arp_req_flows(struct ovn_port *op,
>          /* Check if the ovn port has a network configured on which we could
>           * expect ARP requests for the LB VIP.
>           */
> -        if (ip_parse(ip_addr, &ipv4_addr) &&
> -                lrouter_port_ipv4_reachable(op, ipv4_addr)) {
> -            ds_put_format(&ips_v4_match, "%s, ", ip_addr);
> +        if (ip_parse(ip_addr, &ipv4_addr)) {
> +            if (lrouter_port_ipv4_reachable(op, ipv4_addr)) {
> +                ds_put_format(&ips_v4_match_reachable, "%s, ", ip_addr);
> +            } else {
> +                ds_put_format(&ips_v4_match_unreachable, "%s, ", ip_addr);
> +            }
>          }
>      }
>      SSET_FOR_EACH (ip_addr, &op->od->lb_ips_v6) {
> @@ -6656,9 +6688,12 @@ build_lswitch_rport_arp_req_flows(struct ovn_port *op,
>          /* Check if the ovn port has a network configured on which we could
>           * expect NS requests for the LB VIP.
>           */
> -        if (ipv6_parse(ip_addr, &ipv6_addr) &&
> -                lrouter_port_ipv6_reachable(op, &ipv6_addr)) {
> -            ds_put_format(&ips_v6_match, "%s, ", ip_addr);
> +        if (ipv6_parse(ip_addr, &ipv6_addr)) {
> +            if (lrouter_port_ipv6_reachable(op, &ipv6_addr)) {
> +                ds_put_format(&ips_v6_match_reachable, "%s, ", ip_addr);
> +            } else {
> +                ds_put_format(&ips_v6_match_unreachable, "%s, ", ip_addr);
> +            }
>          }
>      }
>
> @@ -6680,44 +6715,67 @@ build_lswitch_rport_arp_req_flows(struct ovn_port *op,
>          if (nat_entry_is_v6(nat_entry)) {
>              struct in6_addr *addr = &nat_entry->ext_addrs.ipv6_addrs[0].addr;
>
> -            if (lrouter_port_ipv6_reachable(op, addr) &&
> -                    !sset_contains(&op->od->lb_ips_v6, nat->external_ip)) {
> -                ds_put_format(&ips_v6_match, "%s, ", nat->external_ip);
> +            if (!sset_contains(&op->od->lb_ips_v6, nat->external_ip)) {
> +                if (lrouter_port_ipv6_reachable(op, addr)) {
> +                    ds_put_format(&ips_v6_match_reachable, "%s, ",
> +                                  nat->external_ip);
> +                } else {
> +                    ds_put_format(&ips_v6_match_unreachable, "%s, ",
> +                                  nat->external_ip);
> +                }
>              }
>          } else {
>              ovs_be32 addr = nat_entry->ext_addrs.ipv4_addrs[0].addr;
> -
> -            if (lrouter_port_ipv4_reachable(op, addr) &&
> -                    !sset_contains(&op->od->lb_ips_v4, nat->external_ip)) {
> -                ds_put_format(&ips_v4_match, "%s, ", nat->external_ip);
> +            if (!sset_contains(&op->od->lb_ips_v4, nat->external_ip)) {
> +                if (lrouter_port_ipv4_reachable(op, addr)) {
> +                    ds_put_format(&ips_v4_match_reachable, "%s, ",
> +                                  nat->external_ip);
> +                } else {
> +                    ds_put_format(&ips_v4_match_unreachable, "%s, ",
> +                                  nat->external_ip);
> +                }
>              }
>          }
>      }
>
>      for (size_t i = 0; i < op->lrp_networks.n_ipv4_addrs; i++) {
> -        ds_put_format(&ips_v4_match, "%s, ",
> +        ds_put_format(&ips_v4_match_reachable, "%s, ",
>                        op->lrp_networks.ipv4_addrs[i].addr_s);
>      }
>      for (size_t i = 0; i < op->lrp_networks.n_ipv6_addrs; i++) {
> -        ds_put_format(&ips_v6_match, "%s, ",
> +        ds_put_format(&ips_v6_match_reachable, "%s, ",
>                        op->lrp_networks.ipv6_addrs[i].addr_s);
>      }
>
> -    if (ds_last(&ips_v4_match) != EOF) {
> -        ds_chomp(&ips_v4_match, ' ');
> -        ds_chomp(&ips_v4_match, ',');
> -        build_lswitch_rport_arp_req_flow_for_ip(&ips_v4_match, AF_INET, sw_op,
> -                                                sw_od, 80, lflows,
> -                                                stage_hint);
> +    if (ds_last(&ips_v4_match_reachable) != EOF) {
> +        ds_chomp(&ips_v4_match_reachable, ' ');
> +        ds_chomp(&ips_v4_match_reachable, ',');
> +        build_lswitch_rport_arp_req_flow_for_reachable_ip(
> +            &ips_v4_match_reachable, AF_INET, sw_op, sw_od, 80, lflows,
> +            stage_hint);
>      }
> -    if (ds_last(&ips_v6_match) != EOF) {
> -        ds_chomp(&ips_v6_match, ' ');
> -        ds_chomp(&ips_v6_match, ',');
> -        build_lswitch_rport_arp_req_flow_for_ip(&ips_v6_match, AF_INET6, sw_op,
> -                                                sw_od, 80, lflows,
> -                                                stage_hint);
> +    if (ds_last(&ips_v6_match_reachable) != EOF) {
> +        ds_chomp(&ips_v6_match_reachable, ' ');
> +        ds_chomp(&ips_v6_match_reachable, ',');
> +        build_lswitch_rport_arp_req_flow_for_reachable_ip(
> +            &ips_v6_match_reachable, AF_INET6, sw_op, sw_od, 80, lflows,
> +            stage_hint);
>      }
>
> +    if (ds_last(&ips_v4_match_unreachable) != EOF) {
> +        ds_chomp(&ips_v4_match_unreachable, ' ');
> +        ds_chomp(&ips_v4_match_unreachable, ',');
> +        build_lswitch_rport_arp_req_flow_for_unreachable_ip(
> +            &ips_v4_match_unreachable, AF_INET, sw_od, 90, lflows,
> +            stage_hint);
> +    }
> +    if (ds_last(&ips_v6_match_unreachable) != EOF) {
> +        ds_chomp(&ips_v6_match_unreachable, ' ');
> +        ds_chomp(&ips_v6_match_unreachable, ',');
> +        build_lswitch_rport_arp_req_flow_for_unreachable_ip(
> +            &ips_v6_match_unreachable, AF_INET6, sw_od, 90, lflows,
> +            stage_hint);
> +    }
>      /* Self originated ARP requests/ND need to be flooded as usual.
>       *
>       * However, if the switch doesn't have any non-router ports we shouldn't
> @@ -6728,8 +6786,10 @@ build_lswitch_rport_arp_req_flows(struct ovn_port *op,
>      if (sw_od->n_router_ports != sw_od->nbs->n_ports) {
>          build_lswitch_rport_arp_req_self_orig_flow(op, 75, sw_od, lflows);
>      }
> -    ds_destroy(&ips_v4_match);
> -    ds_destroy(&ips_v6_match);
> +    ds_destroy(&ips_v4_match_reachable);
> +    ds_destroy(&ips_v6_match_reachable);
> +    ds_destroy(&ips_v4_match_unreachable);
> +    ds_destroy(&ips_v6_match_unreachable);
>  }
>
>  static void
> diff --git a/northd/ovn_northd.dl b/northd/ovn_northd.dl
> index c9ee742d1..dd6430849 100644
> --- a/northd/ovn_northd.dl
> +++ b/northd/ovn_northd.dl
> @@ -4109,9 +4109,13 @@ Flow(.logical_datapath = sw._uuid,
>   * router port.
>   * Priority: 80.
>   */
> -function get_arp_forward_ips(rp: Intern<RouterPort>): (Set<string>, Set<string>) = {
> -    var all_ips_v4 = set_empty();
> -    var all_ips_v6 = set_empty();
> +function get_arp_forward_ips(rp: Intern<RouterPort>):
> +    (Set<string>, Set<string>, Set<string>, Set<string>) =
> +{
> +    var reachable_ips_v4 = set_empty();
> +    var reachable_ips_v6 = set_empty();
> +    var unreachable_ips_v4 = set_empty();
> +    var unreachable_ips_v6 = set_empty();
>
>      (var lb_ips_v4, var lb_ips_v6)
>          = get_router_load_balancer_ips(rp.router, false);
> @@ -4121,7 +4125,9 @@ function get_arp_forward_ips(rp: Intern<RouterPort>): (Set<string>, Set<string>)
>           */
>          match (ip_parse(a)) {
>              Some{ipv4} -> if (lrouter_port_ip_reachable(rp, IPv4{ipv4})) {
> -                all_ips_v4.insert(a)
> +                reachable_ips_v4.insert(a)
> +            } else {
> +                unreachable_ips_v4.insert(a)
>              },
>              _ -> ()
>          }
> @@ -4132,7 +4138,9 @@ function get_arp_forward_ips(rp: Intern<RouterPort>): (Set<string>, Set<string>)
>           */
>          match (ipv6_parse(a)) {
>              Some{ipv6} -> if (lrouter_port_ip_reachable(rp, IPv6{ipv6})) {
> -                all_ips_v6.insert(a)
> +                reachable_ips_v6.insert(a)
> +            } else {
> +                unreachable_ips_v6.insert(a)
>              },
>              _ -> ()
>          }
> @@ -4145,22 +4153,45 @@ function get_arp_forward_ips(rp: Intern<RouterPort>): (Set<string>, Set<string>)
>               */
>              if (lrouter_port_ip_reachable(rp, nat.external_ip)) {
>                  match (nat.external_ip) {
> -                    IPv4{_} -> all_ips_v4.insert(nat.nat.external_ip),
> -                    IPv6{_} -> all_ips_v6.insert(nat.nat.external_ip)
> +                    IPv4{_} -> reachable_ips_v4.insert(nat.nat.external_ip),
> +                    IPv6{_} -> reachable_ips_v6.insert(nat.nat.external_ip)
> +                }
> +            } else {
> +                match (nat.external_ip) {
> +                    IPv4{_} -> unreachable_ips_v4.insert(nat.nat.external_ip),
> +                    IPv6{_} -> unreachable_ips_v6.insert(nat.nat.external_ip),
>                  }
>              }
>          }
>      };
>
>      for (a in rp.networks.ipv4_addrs) {
> -        all_ips_v4.insert("${a.addr}")
> +        reachable_ips_v4.insert("${a.addr}")
>      };
>      for (a in rp.networks.ipv6_addrs) {
> -        all_ips_v6.insert("${a.addr}")
> +        reachable_ips_v6.insert("${a.addr}")
>      };
>
> -    (all_ips_v4, all_ips_v6)
> +    (reachable_ips_v4, reachable_ips_v6, unreachable_ips_v4, unreachable_ips_v6)
>  }
> +
> +relation &SwitchPortARPForwards(
> +    port: Intern<SwitchPort>,
> +    reachable_ips_v4: Set<string>,
> +    reachable_ips_v6: Set<string>,
> +    unreachable_ips_v4: Set<string>,
> +    unreachable_ips_v6: Set<string>
> +)
> +
> +&SwitchPortARPForwards(.port = port,
> +                       .reachable_ips_v4 = reachable_ips_v4,
> +                       .reachable_ips_v6 = reachable_ips_v6,
> +                       .unreachable_ips_v4 = unreachable_ips_v4,
> +                       .unreachable_ips_v6 = unreachable_ips_v6) :-
> +    port in &SwitchPort(.peer = Some{rp}),
> +    rp.is_enabled(),
> +    (var reachable_ips_v4, var reachable_ips_v6, var unreachable_ips_v4, var unreachable_ips_v6) = get_arp_forward_ips(rp).
> +
>  /* Packets received from VXLAN tunnels have already been through the
>   * router pipeline so we should skip them. Normally this is done by the
>   * multicast_group implementation (VXLAN packets skip table 32 which
> @@ -4172,7 +4203,7 @@ Flow(.logical_datapath = sw._uuid,
>       .priority         = 80,
>       .__match          = fLAGBIT_NOT_VXLAN() ++
>                           " && arp.op == 1 && arp.tpa == { " ++
> -                         all_ips_v4.to_vec().join(", ") ++ "}",
> +                         ipv4.to_vec().join(", ") ++ "}",
>       .actions          = if (sw.has_non_router_port) {
>                               "clone {outport = ${sp.json_name}; output; }; "
>                               "outport = ${mc_flood_l2}; output;"
> @@ -4182,15 +4213,15 @@ Flow(.logical_datapath = sw._uuid,
>       .external_ids     = stage_hint(sp.lsp._uuid)) :-
>      sp in &SwitchPort(.sw = sw, .peer = Some{rp}),
>      rp.is_enabled(),
> -    (var all_ips_v4, _) = get_arp_forward_ips(rp),
> -    not all_ips_v4.is_empty(),
> +    &SwitchPortARPForwards(.port = sp, .reachable_ips_v4 = ipv4),
> +    not ipv4.is_empty(),
>      var mc_flood_l2 = json_string_escape(mC_FLOOD_L2().0).
>  Flow(.logical_datapath = sw._uuid,
>       .stage            = s_SWITCH_IN_L2_LKUP(),
>       .priority         = 80,
>       .__match          = fLAGBIT_NOT_VXLAN() ++
>                           " && nd_ns && nd.target == { " ++
> -                         all_ips_v6.to_vec().join(", ") ++ "}",
> +                         ipv6.to_vec().join(", ") ++ "}",
>       .actions          = if (sw.has_non_router_port) {
>                               "clone {outport = ${sp.json_name}; output; }; "
>                               "outport = ${mc_flood_l2}; output;"
> @@ -4200,10 +4231,38 @@ Flow(.logical_datapath = sw._uuid,
>       .external_ids     = stage_hint(sp.lsp._uuid)) :-
>      sp in &SwitchPort(.sw = sw, .peer = Some{rp}),
>      rp.is_enabled(),
> -    (_, var all_ips_v6) = get_arp_forward_ips(rp),
> -    not all_ips_v6.is_empty(),
> +    &SwitchPortARPForwards(.port = sp, .reachable_ips_v6 = ipv6),
> +    not ipv6.is_empty(),
>      var mc_flood_l2 = json_string_escape(mC_FLOOD_L2().0).
>
> +Flow(.logical_datapath = sw._uuid,
> +     .stage            = s_SWITCH_IN_L2_LKUP(),
> +     .priority         = 90,
> +     .__match          = fLAGBIT_NOT_VXLAN() ++
> +                        " && arp.op == 1 && arp.tpa == {" ++
> +                        ipv4.to_vec().join(", ") ++ "}",
> +     .actions          = "outport = ${flood}; output;",
> +     .external_ids     = stage_hint(sp.lsp._uuid)) :-
> +    sp in &SwitchPort(.sw = sw, .peer = Some{rp}),
> +    rp.is_enabled(),
> +    &SwitchPortARPForwards(.port = sp, .unreachable_ips_v4 = ipv4),
> +    not ipv4.is_empty(),
> +    var flood = json_string_escape(mC_FLOOD().0).
> +
> +Flow(.logical_datapath = sw._uuid,
> +     .stage            = s_SWITCH_IN_L2_LKUP(),
> +     .priority         = 90,
> +     .__match          = fLAGBIT_NOT_VXLAN() ++
> +                         " && nd_ns && nd.target == {" ++
> +                         ipv6.to_vec().join(", ") ++ "}",
> +     .actions          = "outport = ${flood}; output;",
> +     .external_ids     = stage_hint(sp.lsp._uuid)) :-
> +    sp in &SwitchPort(.sw = sw, .peer = Some{rp}),
> +    rp.is_enabled(),
> +    &SwitchPortARPForwards(.port = sp, .unreachable_ips_v6 = ipv6),
> +    not ipv6.is_empty(),
> +    var flood = json_string_escape(mC_FLOOD().0).
> +
>  for (SwitchPortNewDynamicAddress(.port = &SwitchPort{.lsp = lsp, .json_name = json_name, .sw = sw},
>                                   .address = Some{addrs})
>       if lsp.__type != "external") {
> diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
> index 0c8a09ced..f591537e1 100644
> --- a/tests/ovn-northd.at
> +++ b/tests/ovn-northd.at
> @@ -4033,3 +4033,102 @@ check ovn-nbctl --wait=sb clear logical_router_port ro2-sw ha_chassis_group
>  check_lflows 0
>
>  AT_CLEANUP
> +
> +OVN_FOR_EACH_NORTHD([
> +AT_SETUP([ovn -- ARP flood for unreachable addresses])
> +ovn_start
> +
> +AS_BOX([Setting up the logical network])
> +
> +# This network is the same as the one from "Router Address Propagation"
> +check ovn-nbctl ls-add sw
> +
> +check ovn-nbctl lr-add ro1
> +check ovn-nbctl lrp-add ro1 ro1-sw 00:00:00:00:00:01 10.0.0.1/24
> +check ovn-nbctl lsp-add sw sw-ro1
> +check ovn-nbctl lsp-set-type sw-ro1 router
> +check ovn-nbctl lsp-set-addresses sw-ro1 router
> +check ovn-nbctl lsp-set-options sw-ro1 router-port=ro1-sw
> +
> +check ovn-nbctl lr-add ro2
> +check ovn-nbctl lrp-add ro2 ro2-sw 00:00:00:00:00:02 20.0.0.1/24
> +check ovn-nbctl lsp-add sw sw-ro2
> +check ovn-nbctl lsp-set-type sw-ro2 router
> +check ovn-nbctl lsp-set-addresses sw-ro2 router
> +check ovn-nbctl --wait=sb lsp-set-options sw-ro2 router-port=ro2-sw
> +
> +check ovn-nbctl ls-add ls1
> +check ovn-nbctl lsp-add ls1 vm1
> +check ovn-nbctl lsp-set-addresses vm1 "00:00:00:00:01:02 192.168.1.2"
> +check ovn-nbctl lrp-add ro1 ro1-ls1 00:00:00:00:01:01 192.168.1.1/24
> +check ovn-nbctl lsp-add ls1 ls1-ro1
> +check ovn-nbctl lsp-set-type ls1-ro1 router
> +check ovn-nbctl lsp-set-addresses ls1-ro1 router
> +check ovn-nbctl lsp-set-options ls1-ro1 router-port=ro1-ls1
> +
> +check ovn-nbctl ls-add ls2
> +check ovn-nbctl lsp-add ls2 vm2
> +check ovn-nbctl lsp-set-addresses vm2 "00:00:00:00:02:02 192.168.2.2"
> +check ovn-nbctl lrp-add ro2 ro2-ls2 00:00:00:00:02:01 192.168.2.1/24
> +check ovn-nbctl lsp-add ls2 ls2-ro2
> +check ovn-nbctl lsp-set-type ls2-ro2 router
> +check ovn-nbctl lsp-set-addresses ls2-ro2 router
> +check ovn-nbctl lsp-set-options ls2-ro2 router-port=ro2-ls2
> +
> +AS_BOX([Ensure that unreachable flood flows are not installed, since no addresses are unreachable])
> +
> +AT_CHECK([ovn-sbctl lflow-list sw | grep "ls_in_l2_lkup" | grep "priority=90" -c], [1], [dnl
> +0
> +])
> +
> +AS_BOX([Adding some reachable NAT addresses])
> +
> +check ovn-nbctl lr-nat-add ro1 dnat 10.0.0.100 192.168.1.100
> +check ovn-nbctl lr-nat-add ro1 snat 10.0.0.200 192.168.1.200/30
> +
> +check ovn-nbctl lr-nat-add ro2 dnat 20.0.0.100 192.168.2.100
> +check ovn-nbctl --wait=sb lr-nat-add ro2 snat 20.0.0.200 192.168.2.200/30
> +
> +AS_BOX([Ensure that unreachable flood flows are not installed, since all addresses are reachable])
> +
> +AT_CHECK([ovn-sbctl lflow-list sw | grep "ls_in_l2_lkup" | grep "priority=90" -c], [1], [dnl
> +0
> +])
> +
> +AS_BOX([Adding some unreachable NAT addresses])
> +
> +check ovn-nbctl lr-nat-add ro1 dnat 30.0.0.100 192.168.1.130
> +check ovn-nbctl lr-nat-add ro1 snat 30.0.0.200 192.168.1.148/30
> +
> +check ovn-nbctl lr-nat-add ro2 dnat 40.0.0.100 192.168.2.130
> +check ovn-nbctl --wait=sb lr-nat-add ro2 snat 40.0.0.200 192.168.2.148/30
> +
> +AS_BOX([Ensure that unreachable flood flows are installed, since there are unreachable addresses])
> +
> +ovn-sbctl lflow-list
> +
> +# We expect two flows to be installed, one per connected router port on sw
> +AT_CHECK([ovn-sbctl lflow-list sw | grep ls_in_l2_lkup | grep priority=90 -c], [0], [dnl
> +2
> +])
> +
> +# We expect that the installed flows will match the unreachable DNAT addresses only.
> +AT_CHECK([ovn-sbctl lflow-list sw | grep ls_in_l2_lkup | grep priority=90 | grep "arp.tpa == {30.0.0.100}" -c], [0], [dnl
> +1
> +])
> +
> +AT_CHECK([ovn-sbctl lflow-list sw | grep ls_in_l2_lkup | grep priority=90 | grep "arp.tpa == {40.0.0.100}" -c], [0], [dnl
> +1
> +])
> +
> +# Ensure that we do not create flows for SNAT addresses
> +AT_CHECK([ovn-sbctl lflow-list sw | grep ls_in_l2_lkup | grep priority=90 | grep "arp.tpa == {30.0.0.200}" -c], [1], [dnl
> +0
> +])
> +
> +AT_CHECK([ovn-sbctl lflow-list sw | grep ls_in_l2_lkup | grep priority=90 | grep "arp.tpa == {40.0.0.200}" -c], [1], [dnl
> +0
> +])
> +
> +AT_CLEANUP
> +])
> diff --git a/tests/system-ovn.at b/tests/system-ovn.at
> index a5f10ce38..bdd13b009 100644
> --- a/tests/system-ovn.at
> +++ b/tests/system-ovn.at
> @@ -6482,3 +6482,105 @@ OVS_TRAFFIC_VSWITCHD_STOP(["/.*error receiving.*/d
>
>  AT_CLEANUP
>  ])
> +
> +OVN_FOR_EACH_NORTHD([
> +AT_SETUP([ovn -- Floating IP outside router subnet IPv4])
> +AT_KEYWORDS(NAT)
> +
> +ovn_start
> +
> +OVS_TRAFFIC_VSWITCHD_START()
> +ADD_BR([br-int])
> +
> +# Set external-ids in br-int needed for ovn-controller
> +ovs-vsctl \
> +        -- set Open_vSwitch . external-ids:system-id=hv1 \
> +        -- set Open_vSwitch . external-ids:ovn-remote=unix:$ovs_base/ovn-sb/ovn-sb.sock \
> +        -- set Open_vSwitch . external-ids:ovn-encap-type=geneve \
> +        -- set Open_vSwitch . external-ids:ovn-encap-ip=169.0.0.1 \
> +        -- set bridge br-int fail-mode=secure other-config:disable-in-band=true
> +
> +start_daemon ovn-controller
> +
> +# Logical network:
> +# Two VMs
> +#   * VM1 with IP address 192.168.100.5
> +#   * VM2 with IP address 192.168.200.5
> +#
> +# VM1 connects to logical switch ls1. ls1 connects to logical router lr1.
> +# VM2 connects to logical switch ls2. ls2 connects to logical router lr2.
> +# lr1 and lr2 both connect to logical switch ls-pub.
> +# * lr1's interface that connects to ls-pub has IP address 172.18.2.110/24
> +# * lr2's interface that connects to ls-pub has IP address 172.18.1.173/24
> +#
> +# lr1 has the following attributes:
> +#   * It has a DNAT rule that translates 172.18.2.11 to 192.168.100.5 (VM1)
> +#
> +# lr2 has the following attributes:
> +#   * It has a DNAT rule that translates 172.18.2.12 to 192.168.200.5 (VM2)
> +#
> +# In this test, we want to ensure that a ping from VM1 to IP address 172.18.2.12 reaches VM2.
> +# When the NAT rules are set up, there should be MAC_Bindings created that allow for traffic
> +# to exit lr1, go through ls-pub, and reach the NAT external IP configured on lr2.
> +
> +check ovn-nbctl ls-add ls1
> +check ovn-nbctl lsp-add ls1 vm1 -- lsp-set-addresses vm1 "00:00:00:00:01:05 192.168.100.5"
> +
> +check ovn-nbctl ls-add ls2
> +check ovn-nbctl lsp-add ls2 vm2 -- lsp-set-addresses vm2 "00:00:00:00:02:05 192.168.200.5"
> +
> +check ovn-nbctl ls-add ls-pub
> +
> +check ovn-nbctl lr-add lr1
> +check ovn-nbctl lrp-add lr1 lr1-ls1 00:00:00:00:01:01 192.168.100.1/24
> +check ovn-nbctl lsp-add ls1 ls1-lr1                      \
> +    -- lsp-set-type ls1-lr1 router                 \
> +    -- lsp-set-addresses ls1-lr1 router            \
> +    -- lsp-set-options ls1-lr1 router-port=lr1-ls1
> +
> +check ovn-nbctl lr-add lr2
> +check ovn-nbctl lrp-add lr2 lr2-ls2 00:00:00:00:02:01 192.168.200.1/24
> +check ovn-nbctl lsp-add ls2 ls2-lr2                      \
> +    -- lsp-set-type ls2-lr2 router                 \
> +    -- lsp-set-addresses ls2-lr2 router            \
> +    -- lsp-set-options ls2-lr2 router-port=lr2-ls2
> +
> +check ovn-nbctl lrp-add lr1 lr1-ls-pub 00:00:00:00:03:01 172.18.2.110/24
> +check ovn-nbctl lrp-set-gateway-chassis lr1-ls-pub hv1
> +check ovn-nbctl lsp-add ls-pub ls-pub-lr1                      \
> +    -- lsp-set-type ls-pub-lr1 router                    \
> +    -- lsp-set-addresses ls-pub-lr1 router               \
> +    -- lsp-set-options ls-pub-lr1 router-port=lr1-ls-pub
> +
> +check ovn-nbctl lrp-add lr2 lr2-ls-pub 00:00:00:00:03:02 172.18.1.173/24
> +check ovn-nbctl lrp-set-gateway-chassis lr2-ls-pub hv1
> +check ovn-nbctl lsp-add ls-pub ls-pub-lr2                      \
> +    -- lsp-set-type ls-pub-lr2 router                    \
> +    -- lsp-set-addresses ls-pub-lr2 router               \
> +    -- lsp-set-options ls-pub-lr2 router-port=lr2-ls-pub
> +
> +# Putting --add-route on these NAT rules means there is no need to
> +# add any static routes.
> +check ovn-nbctl --add-route lr-nat-add lr1 dnat_and_snat 172.18.2.11 192.168.100.5 vm1 00:00:00:00:03:01
> +check ovn-nbctl --add-route lr-nat-add lr2 dnat_and_snat 172.18.2.12 192.168.200.5 vm2 00:00:00:00:03:02
> +
> +ADD_NAMESPACES(vm1)
> +ADD_VETH(vm1, vm1, br-int, "192.168.100.5/24", "00:00:00:00:01:05", \
> +         "192.168.100.1")
> +
> +ADD_NAMESPACES(vm2)
> +ADD_VETH(vm2, vm2, br-int, "192.168.200.5/24", "00:00:00:00:02:05", \
> +         "192.168.200.1")
> +
> +OVN_POPULATE_ARP
> +check ovn-nbctl --wait=hv sync
> +
> +AS_BOX([Testing a ping])
> +
> +NS_CHECK_EXEC([vm1], [ping -q -c 3 -i 0.3 -w 2 172.18.2.12 | FORMAT_PING], \
> +[0], [dnl
> +3 packets transmitted, 3 received, 0% packet loss, time 0ms
> +])
> +
> +AT_CLEANUP
> +])
> --
> 2.31.1
>
> _______________________________________________
> dev mailing list
> dev@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>
Mark Michelson July 7, 2021, 6:50 p.m. UTC | #2
On 7/7/21 1:41 PM, Numan Siddique wrote:
> On Wed, Jun 30, 2021 at 7:57 PM Mark Michelson <mmichels@redhat.com> wrote:
>>
>> Previously, ARP TPAs were filtered down only to "reachable" addresses.
>> Reachable addresses are all router interface addresses, as well as NAT
>> external addresses and load balancer VIPs that are within the subnet
>> handled by a router's port.
>>
>> However, it is possible that in some configurations, CMSes purposely
>> configure NAT or load balancer addresses on a router that are outside
>> the router's subnets, and they expect the router to respond to ARPs for
>> those addresses.
>>
>> This commit adds a higher priority flow to logical switches that makes
>> it so ARPs targeted at "unreachable" addresses are flooded to all ports.
>> This way, the ARPs can reach the router appropriately and receive a
>> response.
>>
>> Reported at: https://bugzilla.redhat.com/show_bug.cgi?id=1929901
>>
>> Signed-off-by: Mark Michelson <mmichels@redhat.com>
> 
> Acked-by: Numan Siddique <numans@ovn.org>
> 
> I've one comment which probably can be addressed.
> 
> If a configured NAT entry on a logical router is unreachable within
> that router, this patch floods
> the packet for the ARP destined to that NAT IP in the logical switch pipeline.
> 
> Before adding the flow to flood, can't we check if the NAT IP is
> reachable from other router ports
> connected to this logical switch.  If so, we can do "outport ==
> <REACHABLE_LRP_PORT>" instead
> of flooding.
> 
> I think this is possible given that the logical flow is added in the
> switch pipeline.  We just need to loop through
> all the router ports of the logical switch.  The question is - is this
> efficient  and takes up some time on a scaled environment ?
> 
> What do you think ?  If this seems fine,  it can be a follow up patch too.

I don't think your suggestion would add any appreciable time compared to 
what we're already doing in ovn-northd. I will attempt this approach and 
let you know how it goes.

> 
> Numan
> 
>> ---
>>   northd/ovn-northd.8.xml |   8 ++
>>   northd/ovn-northd.c     | 162 +++++++++++++++++++++++++++-------------
>>   northd/ovn_northd.dl    |  91 ++++++++++++++++++----
>>   tests/ovn-northd.at     |  99 ++++++++++++++++++++++++
>>   tests/system-ovn.at     | 102 +++++++++++++++++++++++++
>>   5 files changed, 395 insertions(+), 67 deletions(-)
>>
>> diff --git a/northd/ovn-northd.8.xml b/northd/ovn-northd.8.xml
>> index beaf5a183..5aedd6619 100644
>> --- a/northd/ovn-northd.8.xml
>> +++ b/northd/ovn-northd.8.xml
>> @@ -1587,6 +1587,14 @@ output;
>>           logical ports.
>>         </li>
>>
>> +      <li>
>> +        Priority-90 flows for each IP address/VIP/NAT address configured
>> +        outside its owning router port's subnet. These flows match ARP
>> +        requests and ND packets for the specific IP addresses.  Matched packets
>> +        are forwarded to the <code>MC_FLOOD</code> multicast group which
>> +        contains all connected logical ports.
>> +      </li>
>> +
>>         <li>
>>           Priority-75 flows for each port connected to a logical router
>>           matching self originated ARP request/ND packets.  These packets
>> diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
>> index f6fad281b..d0b325748 100644
>> --- a/northd/ovn-northd.c
>> +++ b/northd/ovn-northd.c
>> @@ -6555,38 +6555,41 @@ build_lswitch_rport_arp_req_self_orig_flow(struct ovn_port *op,
>>       ds_destroy(&match);
>>   }
>>
>> -/*
>> - * Ingress table 19: Flows that forward ARP/ND requests only to the routers
>> - * that own the addresses. Other ARP/ND packets are still flooded in the
>> - * switching domain as regular broadcast.
>> - */
>>   static void
>> -build_lswitch_rport_arp_req_flow_for_ip(struct ds *ip_match,
>> -                                        int addr_family,
>> -                                        struct ovn_port *patch_op,
>> -                                        struct ovn_datapath *od,
>> -                                        uint32_t priority,
>> -                                        struct hmap *lflows,
>> -                                        const struct ovsdb_idl_row *stage_hint)
>> +arp_nd_ns_match(struct ds *ips, int addr_family, struct ds *match)
>>   {
>> -    struct ds match   = DS_EMPTY_INITIALIZER;
>> -    struct ds actions = DS_EMPTY_INITIALIZER;
>> -
>>       /* Packets received from VXLAN tunnels have already been through the
>>        * router pipeline so we should skip them. Normally this is done by the
>>        * multicast_group implementation (VXLAN packets skip table 32 which
>>        * delivers to patch ports) but we're bypassing multicast_groups.
>>        */
>> -    ds_put_cstr(&match, FLAGBIT_NOT_VXLAN " && ");
>> +    ds_put_cstr(match, FLAGBIT_NOT_VXLAN " && ");
>>
>>       if (addr_family == AF_INET) {
>> -        ds_put_cstr(&match, "arp.op == 1 && arp.tpa == { ");
>> +        ds_put_cstr(match, "arp.op == 1 && arp.tpa == {");
>>       } else {
>> -        ds_put_cstr(&match, "nd_ns && nd.target == { ");
>> +        ds_put_cstr(match, "nd_ns && nd.target == {");
>>       }
>>
>> -    ds_put_cstr(&match, ds_cstr_ro(ip_match));
>> -    ds_put_cstr(&match, "}");
>> +    ds_put_cstr(match, ds_cstr_ro(ips));
>> +    ds_put_cstr(match, "}");
>> +}
>> +
>> +/*
>> + * Ingress table 19: Flows that forward ARP/ND requests only to the routers
>> + * that own the addresses. Other ARP/ND packets are still flooded in the
>> + * switching domain as regular broadcast.
>> + */
>> +static void
>> +build_lswitch_rport_arp_req_flow_for_reachable_ip(struct ds *ips,
>> +    int addr_family, struct ovn_port *patch_op, struct ovn_datapath *od,
>> +    uint32_t priority, struct hmap *lflows,
>> +    const struct ovsdb_idl_row *stage_hint)
>> +{
>> +    struct ds match   = DS_EMPTY_INITIALIZER;
>> +    struct ds actions = DS_EMPTY_INITIALIZER;
>> +
>> +    arp_nd_ns_match(ips, addr_family, &match);
>>
>>       /* Send a the packet to the router pipeline.  If the switch has non-router
>>        * ports then flood it there as well.
>> @@ -6609,6 +6612,30 @@ build_lswitch_rport_arp_req_flow_for_ip(struct ds *ip_match,
>>       ds_destroy(&actions);
>>   }
>>
>> +/*
>> + * Ingress table 19: Flows that forward ARP/ND requests for "unreachable" IPs
>> + * (NAT or load balancer IPs configured on a router that are outside the
>> + * router's configured subnets).
>> + * These ARP/ND packets are flooded in the switching domain as regular
>> + * broadcast.
>> + */
>> +static void
>> +build_lswitch_rport_arp_req_flow_for_unreachable_ip(struct ds *ips,
>> +    int addr_family, struct ovn_datapath *od, uint32_t priority,
>> +    struct hmap *lflows, const struct ovsdb_idl_row *stage_hint)
>> +{
>> +    struct ds match = DS_EMPTY_INITIALIZER;
>> +
>> +    arp_nd_ns_match(ips, addr_family, &match);
>> +
>> +    ovn_lflow_add_with_hint(lflows, od, S_SWITCH_IN_L2_LKUP,
>> +                            priority, ds_cstr(&match),
>> +                            "outport = \""MC_FLOOD"\"; output;",
>> +                            stage_hint);
>> +
>> +    ds_destroy(&match);
>> +}
>> +
>>   /*
>>    * Ingress table 19: Flows that forward ARP/ND requests only to the routers
>>    * that own the addresses.
>> @@ -6635,8 +6662,10 @@ build_lswitch_rport_arp_req_flows(struct ovn_port *op,
>>        * router port.
>>        * Priority: 80.
>>        */
>> -    struct ds ips_v4_match = DS_EMPTY_INITIALIZER;
>> -    struct ds ips_v6_match = DS_EMPTY_INITIALIZER;
>> +    struct ds ips_v4_match_reachable = DS_EMPTY_INITIALIZER;
>> +    struct ds ips_v6_match_reachable = DS_EMPTY_INITIALIZER;
>> +    struct ds ips_v4_match_unreachable = DS_EMPTY_INITIALIZER;
>> +    struct ds ips_v6_match_unreachable = DS_EMPTY_INITIALIZER;
>>
>>       const char *ip_addr;
>>       SSET_FOR_EACH (ip_addr, &op->od->lb_ips_v4) {
>> @@ -6645,9 +6674,12 @@ build_lswitch_rport_arp_req_flows(struct ovn_port *op,
>>           /* Check if the ovn port has a network configured on which we could
>>            * expect ARP requests for the LB VIP.
>>            */
>> -        if (ip_parse(ip_addr, &ipv4_addr) &&
>> -                lrouter_port_ipv4_reachable(op, ipv4_addr)) {
>> -            ds_put_format(&ips_v4_match, "%s, ", ip_addr);
>> +        if (ip_parse(ip_addr, &ipv4_addr)) {
>> +            if (lrouter_port_ipv4_reachable(op, ipv4_addr)) {
>> +                ds_put_format(&ips_v4_match_reachable, "%s, ", ip_addr);
>> +            } else {
>> +                ds_put_format(&ips_v4_match_unreachable, "%s, ", ip_addr);
>> +            }
>>           }
>>       }
>>       SSET_FOR_EACH (ip_addr, &op->od->lb_ips_v6) {
>> @@ -6656,9 +6688,12 @@ build_lswitch_rport_arp_req_flows(struct ovn_port *op,
>>           /* Check if the ovn port has a network configured on which we could
>>            * expect NS requests for the LB VIP.
>>            */
>> -        if (ipv6_parse(ip_addr, &ipv6_addr) &&
>> -                lrouter_port_ipv6_reachable(op, &ipv6_addr)) {
>> -            ds_put_format(&ips_v6_match, "%s, ", ip_addr);
>> +        if (ipv6_parse(ip_addr, &ipv6_addr)) {
>> +            if (lrouter_port_ipv6_reachable(op, &ipv6_addr)) {
>> +                ds_put_format(&ips_v6_match_reachable, "%s, ", ip_addr);
>> +            } else {
>> +                ds_put_format(&ips_v6_match_unreachable, "%s, ", ip_addr);
>> +            }
>>           }
>>       }
>>
>> @@ -6680,44 +6715,67 @@ build_lswitch_rport_arp_req_flows(struct ovn_port *op,
>>           if (nat_entry_is_v6(nat_entry)) {
>>               struct in6_addr *addr = &nat_entry->ext_addrs.ipv6_addrs[0].addr;
>>
>> -            if (lrouter_port_ipv6_reachable(op, addr) &&
>> -                    !sset_contains(&op->od->lb_ips_v6, nat->external_ip)) {
>> -                ds_put_format(&ips_v6_match, "%s, ", nat->external_ip);
>> +            if (!sset_contains(&op->od->lb_ips_v6, nat->external_ip)) {
>> +                if (lrouter_port_ipv6_reachable(op, addr)) {
>> +                    ds_put_format(&ips_v6_match_reachable, "%s, ",
>> +                                  nat->external_ip);
>> +                } else {
>> +                    ds_put_format(&ips_v6_match_unreachable, "%s, ",
>> +                                  nat->external_ip);
>> +                }
>>               }
>>           } else {
>>               ovs_be32 addr = nat_entry->ext_addrs.ipv4_addrs[0].addr;
>> -
>> -            if (lrouter_port_ipv4_reachable(op, addr) &&
>> -                    !sset_contains(&op->od->lb_ips_v4, nat->external_ip)) {
>> -                ds_put_format(&ips_v4_match, "%s, ", nat->external_ip);
>> +            if (!sset_contains(&op->od->lb_ips_v4, nat->external_ip)) {
>> +                if (lrouter_port_ipv4_reachable(op, addr)) {
>> +                    ds_put_format(&ips_v4_match_reachable, "%s, ",
>> +                                  nat->external_ip);
>> +                } else {
>> +                    ds_put_format(&ips_v4_match_unreachable, "%s, ",
>> +                                  nat->external_ip);
>> +                }
>>               }
>>           }
>>       }
>>
>>       for (size_t i = 0; i < op->lrp_networks.n_ipv4_addrs; i++) {
>> -        ds_put_format(&ips_v4_match, "%s, ",
>> +        ds_put_format(&ips_v4_match_reachable, "%s, ",
>>                         op->lrp_networks.ipv4_addrs[i].addr_s);
>>       }
>>       for (size_t i = 0; i < op->lrp_networks.n_ipv6_addrs; i++) {
>> -        ds_put_format(&ips_v6_match, "%s, ",
>> +        ds_put_format(&ips_v6_match_reachable, "%s, ",
>>                         op->lrp_networks.ipv6_addrs[i].addr_s);
>>       }
>>
>> -    if (ds_last(&ips_v4_match) != EOF) {
>> -        ds_chomp(&ips_v4_match, ' ');
>> -        ds_chomp(&ips_v4_match, ',');
>> -        build_lswitch_rport_arp_req_flow_for_ip(&ips_v4_match, AF_INET, sw_op,
>> -                                                sw_od, 80, lflows,
>> -                                                stage_hint);
>> +    if (ds_last(&ips_v4_match_reachable) != EOF) {
>> +        ds_chomp(&ips_v4_match_reachable, ' ');
>> +        ds_chomp(&ips_v4_match_reachable, ',');
>> +        build_lswitch_rport_arp_req_flow_for_reachable_ip(
>> +            &ips_v4_match_reachable, AF_INET, sw_op, sw_od, 80, lflows,
>> +            stage_hint);
>>       }
>> -    if (ds_last(&ips_v6_match) != EOF) {
>> -        ds_chomp(&ips_v6_match, ' ');
>> -        ds_chomp(&ips_v6_match, ',');
>> -        build_lswitch_rport_arp_req_flow_for_ip(&ips_v6_match, AF_INET6, sw_op,
>> -                                                sw_od, 80, lflows,
>> -                                                stage_hint);
>> +    if (ds_last(&ips_v6_match_reachable) != EOF) {
>> +        ds_chomp(&ips_v6_match_reachable, ' ');
>> +        ds_chomp(&ips_v6_match_reachable, ',');
>> +        build_lswitch_rport_arp_req_flow_for_reachable_ip(
>> +            &ips_v6_match_reachable, AF_INET6, sw_op, sw_od, 80, lflows,
>> +            stage_hint);
>>       }
>>
>> +    if (ds_last(&ips_v4_match_unreachable) != EOF) {
>> +        ds_chomp(&ips_v4_match_unreachable, ' ');
>> +        ds_chomp(&ips_v4_match_unreachable, ',');
>> +        build_lswitch_rport_arp_req_flow_for_unreachable_ip(
>> +            &ips_v4_match_unreachable, AF_INET, sw_od, 90, lflows,
>> +            stage_hint);
>> +    }
>> +    if (ds_last(&ips_v6_match_unreachable) != EOF) {
>> +        ds_chomp(&ips_v6_match_unreachable, ' ');
>> +        ds_chomp(&ips_v6_match_unreachable, ',');
>> +        build_lswitch_rport_arp_req_flow_for_unreachable_ip(
>> +            &ips_v6_match_unreachable, AF_INET6, sw_od, 90, lflows,
>> +            stage_hint);
>> +    }
>>       /* Self originated ARP requests/ND need to be flooded as usual.
>>        *
>>        * However, if the switch doesn't have any non-router ports we shouldn't
>> @@ -6728,8 +6786,10 @@ build_lswitch_rport_arp_req_flows(struct ovn_port *op,
>>       if (sw_od->n_router_ports != sw_od->nbs->n_ports) {
>>           build_lswitch_rport_arp_req_self_orig_flow(op, 75, sw_od, lflows);
>>       }
>> -    ds_destroy(&ips_v4_match);
>> -    ds_destroy(&ips_v6_match);
>> +    ds_destroy(&ips_v4_match_reachable);
>> +    ds_destroy(&ips_v6_match_reachable);
>> +    ds_destroy(&ips_v4_match_unreachable);
>> +    ds_destroy(&ips_v6_match_unreachable);
>>   }
>>
>>   static void
>> diff --git a/northd/ovn_northd.dl b/northd/ovn_northd.dl
>> index c9ee742d1..dd6430849 100644
>> --- a/northd/ovn_northd.dl
>> +++ b/northd/ovn_northd.dl
>> @@ -4109,9 +4109,13 @@ Flow(.logical_datapath = sw._uuid,
>>    * router port.
>>    * Priority: 80.
>>    */
>> -function get_arp_forward_ips(rp: Intern<RouterPort>): (Set<string>, Set<string>) = {
>> -    var all_ips_v4 = set_empty();
>> -    var all_ips_v6 = set_empty();
>> +function get_arp_forward_ips(rp: Intern<RouterPort>):
>> +    (Set<string>, Set<string>, Set<string>, Set<string>) =
>> +{
>> +    var reachable_ips_v4 = set_empty();
>> +    var reachable_ips_v6 = set_empty();
>> +    var unreachable_ips_v4 = set_empty();
>> +    var unreachable_ips_v6 = set_empty();
>>
>>       (var lb_ips_v4, var lb_ips_v6)
>>           = get_router_load_balancer_ips(rp.router, false);
>> @@ -4121,7 +4125,9 @@ function get_arp_forward_ips(rp: Intern<RouterPort>): (Set<string>, Set<string>)
>>            */
>>           match (ip_parse(a)) {
>>               Some{ipv4} -> if (lrouter_port_ip_reachable(rp, IPv4{ipv4})) {
>> -                all_ips_v4.insert(a)
>> +                reachable_ips_v4.insert(a)
>> +            } else {
>> +                unreachable_ips_v4.insert(a)
>>               },
>>               _ -> ()
>>           }
>> @@ -4132,7 +4138,9 @@ function get_arp_forward_ips(rp: Intern<RouterPort>): (Set<string>, Set<string>)
>>            */
>>           match (ipv6_parse(a)) {
>>               Some{ipv6} -> if (lrouter_port_ip_reachable(rp, IPv6{ipv6})) {
>> -                all_ips_v6.insert(a)
>> +                reachable_ips_v6.insert(a)
>> +            } else {
>> +                unreachable_ips_v6.insert(a)
>>               },
>>               _ -> ()
>>           }
>> @@ -4145,22 +4153,45 @@ function get_arp_forward_ips(rp: Intern<RouterPort>): (Set<string>, Set<string>)
>>                */
>>               if (lrouter_port_ip_reachable(rp, nat.external_ip)) {
>>                   match (nat.external_ip) {
>> -                    IPv4{_} -> all_ips_v4.insert(nat.nat.external_ip),
>> -                    IPv6{_} -> all_ips_v6.insert(nat.nat.external_ip)
>> +                    IPv4{_} -> reachable_ips_v4.insert(nat.nat.external_ip),
>> +                    IPv6{_} -> reachable_ips_v6.insert(nat.nat.external_ip)
>> +                }
>> +            } else {
>> +                match (nat.external_ip) {
>> +                    IPv4{_} -> unreachable_ips_v4.insert(nat.nat.external_ip),
>> +                    IPv6{_} -> unreachable_ips_v6.insert(nat.nat.external_ip),
>>                   }
>>               }
>>           }
>>       };
>>
>>       for (a in rp.networks.ipv4_addrs) {
>> -        all_ips_v4.insert("${a.addr}")
>> +        reachable_ips_v4.insert("${a.addr}")
>>       };
>>       for (a in rp.networks.ipv6_addrs) {
>> -        all_ips_v6.insert("${a.addr}")
>> +        reachable_ips_v6.insert("${a.addr}")
>>       };
>>
>> -    (all_ips_v4, all_ips_v6)
>> +    (reachable_ips_v4, reachable_ips_v6, unreachable_ips_v4, unreachable_ips_v6)
>>   }
>> +
>> +relation &SwitchPortARPForwards(
>> +    port: Intern<SwitchPort>,
>> +    reachable_ips_v4: Set<string>,
>> +    reachable_ips_v6: Set<string>,
>> +    unreachable_ips_v4: Set<string>,
>> +    unreachable_ips_v6: Set<string>
>> +)
>> +
>> +&SwitchPortARPForwards(.port = port,
>> +                       .reachable_ips_v4 = reachable_ips_v4,
>> +                       .reachable_ips_v6 = reachable_ips_v6,
>> +                       .unreachable_ips_v4 = unreachable_ips_v4,
>> +                       .unreachable_ips_v6 = unreachable_ips_v6) :-
>> +    port in &SwitchPort(.peer = Some{rp}),
>> +    rp.is_enabled(),
>> +    (var reachable_ips_v4, var reachable_ips_v6, var unreachable_ips_v4, var unreachable_ips_v6) = get_arp_forward_ips(rp).
>> +
>>   /* Packets received from VXLAN tunnels have already been through the
>>    * router pipeline so we should skip them. Normally this is done by the
>>    * multicast_group implementation (VXLAN packets skip table 32 which
>> @@ -4172,7 +4203,7 @@ Flow(.logical_datapath = sw._uuid,
>>        .priority         = 80,
>>        .__match          = fLAGBIT_NOT_VXLAN() ++
>>                            " && arp.op == 1 && arp.tpa == { " ++
>> -                         all_ips_v4.to_vec().join(", ") ++ "}",
>> +                         ipv4.to_vec().join(", ") ++ "}",
>>        .actions          = if (sw.has_non_router_port) {
>>                                "clone {outport = ${sp.json_name}; output; }; "
>>                                "outport = ${mc_flood_l2}; output;"
>> @@ -4182,15 +4213,15 @@ Flow(.logical_datapath = sw._uuid,
>>        .external_ids     = stage_hint(sp.lsp._uuid)) :-
>>       sp in &SwitchPort(.sw = sw, .peer = Some{rp}),
>>       rp.is_enabled(),
>> -    (var all_ips_v4, _) = get_arp_forward_ips(rp),
>> -    not all_ips_v4.is_empty(),
>> +    &SwitchPortARPForwards(.port = sp, .reachable_ips_v4 = ipv4),
>> +    not ipv4.is_empty(),
>>       var mc_flood_l2 = json_string_escape(mC_FLOOD_L2().0).
>>   Flow(.logical_datapath = sw._uuid,
>>        .stage            = s_SWITCH_IN_L2_LKUP(),
>>        .priority         = 80,
>>        .__match          = fLAGBIT_NOT_VXLAN() ++
>>                            " && nd_ns && nd.target == { " ++
>> -                         all_ips_v6.to_vec().join(", ") ++ "}",
>> +                         ipv6.to_vec().join(", ") ++ "}",
>>        .actions          = if (sw.has_non_router_port) {
>>                                "clone {outport = ${sp.json_name}; output; }; "
>>                                "outport = ${mc_flood_l2}; output;"
>> @@ -4200,10 +4231,38 @@ Flow(.logical_datapath = sw._uuid,
>>        .external_ids     = stage_hint(sp.lsp._uuid)) :-
>>       sp in &SwitchPort(.sw = sw, .peer = Some{rp}),
>>       rp.is_enabled(),
>> -    (_, var all_ips_v6) = get_arp_forward_ips(rp),
>> -    not all_ips_v6.is_empty(),
>> +    &SwitchPortARPForwards(.port = sp, .reachable_ips_v6 = ipv6),
>> +    not ipv6.is_empty(),
>>       var mc_flood_l2 = json_string_escape(mC_FLOOD_L2().0).
>>
>> +Flow(.logical_datapath = sw._uuid,
>> +     .stage            = s_SWITCH_IN_L2_LKUP(),
>> +     .priority         = 90,
>> +     .__match          = fLAGBIT_NOT_VXLAN() ++
>> +                        " && arp.op == 1 && arp.tpa == {" ++
>> +                        ipv4.to_vec().join(", ") ++ "}",
>> +     .actions          = "outport = ${flood}; output;",
>> +     .external_ids     = stage_hint(sp.lsp._uuid)) :-
>> +    sp in &SwitchPort(.sw = sw, .peer = Some{rp}),
>> +    rp.is_enabled(),
>> +    &SwitchPortARPForwards(.port = sp, .unreachable_ips_v4 = ipv4),
>> +    not ipv4.is_empty(),
>> +    var flood = json_string_escape(mC_FLOOD().0).
>> +
>> +Flow(.logical_datapath = sw._uuid,
>> +     .stage            = s_SWITCH_IN_L2_LKUP(),
>> +     .priority         = 90,
>> +     .__match          = fLAGBIT_NOT_VXLAN() ++
>> +                         " && nd_ns && nd.target == {" ++
>> +                         ipv6.to_vec().join(", ") ++ "}",
>> +     .actions          = "outport = ${flood}; output;",
>> +     .external_ids     = stage_hint(sp.lsp._uuid)) :-
>> +    sp in &SwitchPort(.sw = sw, .peer = Some{rp}),
>> +    rp.is_enabled(),
>> +    &SwitchPortARPForwards(.port = sp, .unreachable_ips_v6 = ipv6),
>> +    not ipv6.is_empty(),
>> +    var flood = json_string_escape(mC_FLOOD().0).
>> +
>>   for (SwitchPortNewDynamicAddress(.port = &SwitchPort{.lsp = lsp, .json_name = json_name, .sw = sw},
>>                                    .address = Some{addrs})
>>        if lsp.__type != "external") {
>> diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
>> index 0c8a09ced..f591537e1 100644
>> --- a/tests/ovn-northd.at
>> +++ b/tests/ovn-northd.at
>> @@ -4033,3 +4033,102 @@ check ovn-nbctl --wait=sb clear logical_router_port ro2-sw ha_chassis_group
>>   check_lflows 0
>>
>>   AT_CLEANUP
>> +
>> +OVN_FOR_EACH_NORTHD([
>> +AT_SETUP([ovn -- ARP flood for unreachable addresses])
>> +ovn_start
>> +
>> +AS_BOX([Setting up the logical network])
>> +
>> +# This network is the same as the one from "Router Address Propagation"
>> +check ovn-nbctl ls-add sw
>> +
>> +check ovn-nbctl lr-add ro1
>> +check ovn-nbctl lrp-add ro1 ro1-sw 00:00:00:00:00:01 10.0.0.1/24
>> +check ovn-nbctl lsp-add sw sw-ro1
>> +check ovn-nbctl lsp-set-type sw-ro1 router
>> +check ovn-nbctl lsp-set-addresses sw-ro1 router
>> +check ovn-nbctl lsp-set-options sw-ro1 router-port=ro1-sw
>> +
>> +check ovn-nbctl lr-add ro2
>> +check ovn-nbctl lrp-add ro2 ro2-sw 00:00:00:00:00:02 20.0.0.1/24
>> +check ovn-nbctl lsp-add sw sw-ro2
>> +check ovn-nbctl lsp-set-type sw-ro2 router
>> +check ovn-nbctl lsp-set-addresses sw-ro2 router
>> +check ovn-nbctl --wait=sb lsp-set-options sw-ro2 router-port=ro2-sw
>> +
>> +check ovn-nbctl ls-add ls1
>> +check ovn-nbctl lsp-add ls1 vm1
>> +check ovn-nbctl lsp-set-addresses vm1 "00:00:00:00:01:02 192.168.1.2"
>> +check ovn-nbctl lrp-add ro1 ro1-ls1 00:00:00:00:01:01 192.168.1.1/24
>> +check ovn-nbctl lsp-add ls1 ls1-ro1
>> +check ovn-nbctl lsp-set-type ls1-ro1 router
>> +check ovn-nbctl lsp-set-addresses ls1-ro1 router
>> +check ovn-nbctl lsp-set-options ls1-ro1 router-port=ro1-ls1
>> +
>> +check ovn-nbctl ls-add ls2
>> +check ovn-nbctl lsp-add ls2 vm2
>> +check ovn-nbctl lsp-set-addresses vm2 "00:00:00:00:02:02 192.168.2.2"
>> +check ovn-nbctl lrp-add ro2 ro2-ls2 00:00:00:00:02:01 192.168.2.1/24
>> +check ovn-nbctl lsp-add ls2 ls2-ro2
>> +check ovn-nbctl lsp-set-type ls2-ro2 router
>> +check ovn-nbctl lsp-set-addresses ls2-ro2 router
>> +check ovn-nbctl lsp-set-options ls2-ro2 router-port=ro2-ls2
>> +
>> +AS_BOX([Ensure that unreachable flood flows are not installed, since no addresses are unreachable])
>> +
>> +AT_CHECK([ovn-sbctl lflow-list sw | grep "ls_in_l2_lkup" | grep "priority=90" -c], [1], [dnl
>> +0
>> +])
>> +
>> +AS_BOX([Adding some reachable NAT addresses])
>> +
>> +check ovn-nbctl lr-nat-add ro1 dnat 10.0.0.100 192.168.1.100
>> +check ovn-nbctl lr-nat-add ro1 snat 10.0.0.200 192.168.1.200/30
>> +
>> +check ovn-nbctl lr-nat-add ro2 dnat 20.0.0.100 192.168.2.100
>> +check ovn-nbctl --wait=sb lr-nat-add ro2 snat 20.0.0.200 192.168.2.200/30
>> +
>> +AS_BOX([Ensure that unreachable flood flows are not installed, since all addresses are reachable])
>> +
>> +AT_CHECK([ovn-sbctl lflow-list sw | grep "ls_in_l2_lkup" | grep "priority=90" -c], [1], [dnl
>> +0
>> +])
>> +
>> +AS_BOX([Adding some unreachable NAT addresses])
>> +
>> +check ovn-nbctl lr-nat-add ro1 dnat 30.0.0.100 192.168.1.130
>> +check ovn-nbctl lr-nat-add ro1 snat 30.0.0.200 192.168.1.148/30
>> +
>> +check ovn-nbctl lr-nat-add ro2 dnat 40.0.0.100 192.168.2.130
>> +check ovn-nbctl --wait=sb lr-nat-add ro2 snat 40.0.0.200 192.168.2.148/30
>> +
>> +AS_BOX([Ensure that unreachable flood flows are installed, since there are unreachable addresses])
>> +
>> +ovn-sbctl lflow-list
>> +
>> +# We expect two flows to be installed, one per connected router port on sw
>> +AT_CHECK([ovn-sbctl lflow-list sw | grep ls_in_l2_lkup | grep priority=90 -c], [0], [dnl
>> +2
>> +])
>> +
>> +# We expect that the installed flows will match the unreachable DNAT addresses only.
>> +AT_CHECK([ovn-sbctl lflow-list sw | grep ls_in_l2_lkup | grep priority=90 | grep "arp.tpa == {30.0.0.100}" -c], [0], [dnl
>> +1
>> +])
>> +
>> +AT_CHECK([ovn-sbctl lflow-list sw | grep ls_in_l2_lkup | grep priority=90 | grep "arp.tpa == {40.0.0.100}" -c], [0], [dnl
>> +1
>> +])
>> +
>> +# Ensure that we do not create flows for SNAT addresses
>> +AT_CHECK([ovn-sbctl lflow-list sw | grep ls_in_l2_lkup | grep priority=90 | grep "arp.tpa == {30.0.0.200}" -c], [1], [dnl
>> +0
>> +])
>> +
>> +AT_CHECK([ovn-sbctl lflow-list sw | grep ls_in_l2_lkup | grep priority=90 | grep "arp.tpa == {40.0.0.200}" -c], [1], [dnl
>> +0
>> +])
>> +
>> +AT_CLEANUP
>> +])
>> diff --git a/tests/system-ovn.at b/tests/system-ovn.at
>> index a5f10ce38..bdd13b009 100644
>> --- a/tests/system-ovn.at
>> +++ b/tests/system-ovn.at
>> @@ -6482,3 +6482,105 @@ OVS_TRAFFIC_VSWITCHD_STOP(["/.*error receiving.*/d
>>
>>   AT_CLEANUP
>>   ])
>> +
>> +OVN_FOR_EACH_NORTHD([
>> +AT_SETUP([ovn -- Floating IP outside router subnet IPv4])
>> +AT_KEYWORDS(NAT)
>> +
>> +ovn_start
>> +
>> +OVS_TRAFFIC_VSWITCHD_START()
>> +ADD_BR([br-int])
>> +
>> +# Set external-ids in br-int needed for ovn-controller
>> +ovs-vsctl \
>> +        -- set Open_vSwitch . external-ids:system-id=hv1 \
>> +        -- set Open_vSwitch . external-ids:ovn-remote=unix:$ovs_base/ovn-sb/ovn-sb.sock \
>> +        -- set Open_vSwitch . external-ids:ovn-encap-type=geneve \
>> +        -- set Open_vSwitch . external-ids:ovn-encap-ip=169.0.0.1 \
>> +        -- set bridge br-int fail-mode=secure other-config:disable-in-band=true
>> +
>> +start_daemon ovn-controller
>> +
>> +# Logical network:
>> +# Two VMs
>> +#   * VM1 with IP address 192.168.100.5
>> +#   * VM2 with IP address 192.168.200.5
>> +#
>> +# VM1 connects to logical switch ls1. ls1 connects to logical router lr1.
>> +# VM2 connects to logical switch ls2. ls2 connects to logical router lr2.
>> +# lr1 and lr2 both connect to logical switch ls-pub.
>> +# * lr1's interface that connects to ls-pub has IP address 172.18.2.110/24
>> +# * lr2's interface that connects to ls-pub has IP address 172.18.1.173/24
>> +#
>> +# lr1 has the following attributes:
>> +#   * It has a DNAT rule that translates 172.18.2.11 to 192.168.100.5 (VM1)
>> +#
>> +# lr2 has the following attributes:
>> +#   * It has a DNAT rule that translates 172.18.2.12 to 192.168.200.5 (VM2)
>> +#
>> +# In this test, we want to ensure that a ping from VM1 to IP address 172.18.2.12 reaches VM2.
>> +# When the NAT rules are set up, there should be MAC_Bindings created that allow for traffic
>> +# to exit lr1, go through ls-pub, and reach the NAT external IP configured on lr2.
>> +
>> +check ovn-nbctl ls-add ls1
>> +check ovn-nbctl lsp-add ls1 vm1 -- lsp-set-addresses vm1 "00:00:00:00:01:05 192.168.100.5"
>> +
>> +check ovn-nbctl ls-add ls2
>> +check ovn-nbctl lsp-add ls2 vm2 -- lsp-set-addresses vm2 "00:00:00:00:02:05 192.168.200.5"
>> +
>> +check ovn-nbctl ls-add ls-pub
>> +
>> +check ovn-nbctl lr-add lr1
>> +check ovn-nbctl lrp-add lr1 lr1-ls1 00:00:00:00:01:01 192.168.100.1/24
>> +check ovn-nbctl lsp-add ls1 ls1-lr1                      \
>> +    -- lsp-set-type ls1-lr1 router                 \
>> +    -- lsp-set-addresses ls1-lr1 router            \
>> +    -- lsp-set-options ls1-lr1 router-port=lr1-ls1
>> +
>> +check ovn-nbctl lr-add lr2
>> +check ovn-nbctl lrp-add lr2 lr2-ls2 00:00:00:00:02:01 192.168.200.1/24
>> +check ovn-nbctl lsp-add ls2 ls2-lr2                      \
>> +    -- lsp-set-type ls2-lr2 router                 \
>> +    -- lsp-set-addresses ls2-lr2 router            \
>> +    -- lsp-set-options ls2-lr2 router-port=lr2-ls2
>> +
>> +check ovn-nbctl lrp-add lr1 lr1-ls-pub 00:00:00:00:03:01 172.18.2.110/24
>> +check ovn-nbctl lrp-set-gateway-chassis lr1-ls-pub hv1
>> +check ovn-nbctl lsp-add ls-pub ls-pub-lr1                      \
>> +    -- lsp-set-type ls-pub-lr1 router                    \
>> +    -- lsp-set-addresses ls-pub-lr1 router               \
>> +    -- lsp-set-options ls-pub-lr1 router-port=lr1-ls-pub
>> +
>> +check ovn-nbctl lrp-add lr2 lr2-ls-pub 00:00:00:00:03:02 172.18.1.173/24
>> +check ovn-nbctl lrp-set-gateway-chassis lr2-ls-pub hv1
>> +check ovn-nbctl lsp-add ls-pub ls-pub-lr2                      \
>> +    -- lsp-set-type ls-pub-lr2 router                    \
>> +    -- lsp-set-addresses ls-pub-lr2 router               \
>> +    -- lsp-set-options ls-pub-lr2 router-port=lr2-ls-pub
>> +
>> +# Putting --add-route on these NAT rules means there is no need to
>> +# add any static routes.
>> +check ovn-nbctl --add-route lr-nat-add lr1 dnat_and_snat 172.18.2.11 192.168.100.5 vm1 00:00:00:00:03:01
>> +check ovn-nbctl --add-route lr-nat-add lr2 dnat_and_snat 172.18.2.12 192.168.200.5 vm2 00:00:00:00:03:02
>> +
>> +ADD_NAMESPACES(vm1)
>> +ADD_VETH(vm1, vm1, br-int, "192.168.100.5/24", "00:00:00:00:01:05", \
>> +         "192.168.100.1")
>> +
>> +ADD_NAMESPACES(vm2)
>> +ADD_VETH(vm2, vm2, br-int, "192.168.200.5/24", "00:00:00:00:02:05", \
>> +         "192.168.200.1")
>> +
>> +OVN_POPULATE_ARP
>> +check ovn-nbctl --wait=hv sync
>> +
>> +AS_BOX([Testing a ping])
>> +
>> +NS_CHECK_EXEC([vm1], [ping -q -c 3 -i 0.3 -w 2 172.18.2.12 | FORMAT_PING], \
>> +[0], [dnl
>> +3 packets transmitted, 3 received, 0% packet loss, time 0ms
>> +])
>> +
>> +AT_CLEANUP
>> +])
>> --
>> 2.31.1
>>
>> _______________________________________________
>> dev mailing list
>> dev@openvswitch.org
>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>>
>
Mark Michelson July 7, 2021, 9:03 p.m. UTC | #3
On 7/7/21 2:50 PM, Mark Michelson wrote:
> On 7/7/21 1:41 PM, Numan Siddique wrote:
>> On Wed, Jun 30, 2021 at 7:57 PM Mark Michelson <mmichels@redhat.com> 
>> wrote:
>>>
>>> Previously, ARP TPAs were filtered down only to "reachable" addresses.
>>> Reachable addresses are all router interface addresses, as well as NAT
>>> external addresses and load balancer VIPs that are within the subnet
>>> handled by a router's port.
>>>
>>> However, it is possible that in some configurations, CMSes purposely
>>> configure NAT or load balancer addresses on a router that are outside
>>> the router's subnets, and they expect the router to respond to ARPs for
>>> those addresses.
>>>
>>> This commit adds a higher priority flow to logical switches that makes
>>> it so ARPs targeted at "unreachable" addresses are flooded to all ports.
>>> This way, the ARPs can reach the router appropriately and receive a
>>> response.
>>>
>>> Reported at: https://bugzilla.redhat.com/show_bug.cgi?id=1929901
>>>
>>> Signed-off-by: Mark Michelson <mmichels@redhat.com>
>>
>> Acked-by: Numan Siddique <numans@ovn.org>
>>
>> I've one comment which probably can be addressed.
>>
>> If a configured NAT entry on a logical router is unreachable within
>> that router, this patch floods
>> the packet for the ARP destined to that NAT IP in the logical switch 
>> pipeline.
>>
>> Before adding the flow to flood, can't we check if the NAT IP is
>> reachable from other router ports
>> connected to this logical switch.  If so, we can do "outport ==
>> <REACHABLE_LRP_PORT>" instead
>> of flooding.
>>
>> I think this is possible given that the logical flow is added in the
>> switch pipeline.  We just need to loop through
>> all the router ports of the logical switch.  The question is - is this
>> efficient  and takes up some time on a scaled environment ?
>>
>> What do you think ?  If this seems fine,  it can be a follow up patch 
>> too.
> 
> I don't think your suggestion would add any appreciable time compared to 
> what we're already doing in ovn-northd. I will attempt this approach and 
> let you know how it goes.
> 

On second thought, I think this should be a follow-up patch as you 
suggested. This particular work has been going on for too long to delay 
for this purpose, IMO.

>>
>> Numan
>>
>>> ---
>>>   northd/ovn-northd.8.xml |   8 ++
>>>   northd/ovn-northd.c     | 162 +++++++++++++++++++++++++++-------------
>>>   northd/ovn_northd.dl    |  91 ++++++++++++++++++----
>>>   tests/ovn-northd.at     |  99 ++++++++++++++++++++++++
>>>   tests/system-ovn.at     | 102 +++++++++++++++++++++++++
>>>   5 files changed, 395 insertions(+), 67 deletions(-)
>>>
>>> diff --git a/northd/ovn-northd.8.xml b/northd/ovn-northd.8.xml
>>> index beaf5a183..5aedd6619 100644
>>> --- a/northd/ovn-northd.8.xml
>>> +++ b/northd/ovn-northd.8.xml
>>> @@ -1587,6 +1587,14 @@ output;
>>>           logical ports.
>>>         </li>
>>>
>>> +      <li>
>>> +        Priority-90 flows for each IP address/VIP/NAT address 
>>> configured
>>> +        outside its owning router port's subnet. These flows match ARP
>>> +        requests and ND packets for the specific IP addresses.  
>>> Matched packets
>>> +        are forwarded to the <code>MC_FLOOD</code> multicast group 
>>> which
>>> +        contains all connected logical ports.
>>> +      </li>
>>> +
>>>         <li>
>>>           Priority-75 flows for each port connected to a logical router
>>>           matching self originated ARP request/ND packets.  These 
>>> packets
>>> diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
>>> index f6fad281b..d0b325748 100644
>>> --- a/northd/ovn-northd.c
>>> +++ b/northd/ovn-northd.c
>>> @@ -6555,38 +6555,41 @@ 
>>> build_lswitch_rport_arp_req_self_orig_flow(struct ovn_port *op,
>>>       ds_destroy(&match);
>>>   }
>>>
>>> -/*
>>> - * Ingress table 19: Flows that forward ARP/ND requests only to the 
>>> routers
>>> - * that own the addresses. Other ARP/ND packets are still flooded in 
>>> the
>>> - * switching domain as regular broadcast.
>>> - */
>>>   static void
>>> -build_lswitch_rport_arp_req_flow_for_ip(struct ds *ip_match,
>>> -                                        int addr_family,
>>> -                                        struct ovn_port *patch_op,
>>> -                                        struct ovn_datapath *od,
>>> -                                        uint32_t priority,
>>> -                                        struct hmap *lflows,
>>> -                                        const struct ovsdb_idl_row 
>>> *stage_hint)
>>> +arp_nd_ns_match(struct ds *ips, int addr_family, struct ds *match)
>>>   {
>>> -    struct ds match   = DS_EMPTY_INITIALIZER;
>>> -    struct ds actions = DS_EMPTY_INITIALIZER;
>>> -
>>>       /* Packets received from VXLAN tunnels have already been 
>>> through the
>>>        * router pipeline so we should skip them. Normally this is 
>>> done by the
>>>        * multicast_group implementation (VXLAN packets skip table 32 
>>> which
>>>        * delivers to patch ports) but we're bypassing multicast_groups.
>>>        */
>>> -    ds_put_cstr(&match, FLAGBIT_NOT_VXLAN " && ");
>>> +    ds_put_cstr(match, FLAGBIT_NOT_VXLAN " && ");
>>>
>>>       if (addr_family == AF_INET) {
>>> -        ds_put_cstr(&match, "arp.op == 1 && arp.tpa == { ");
>>> +        ds_put_cstr(match, "arp.op == 1 && arp.tpa == {");
>>>       } else {
>>> -        ds_put_cstr(&match, "nd_ns && nd.target == { ");
>>> +        ds_put_cstr(match, "nd_ns && nd.target == {");
>>>       }
>>>
>>> -    ds_put_cstr(&match, ds_cstr_ro(ip_match));
>>> -    ds_put_cstr(&match, "}");
>>> +    ds_put_cstr(match, ds_cstr_ro(ips));
>>> +    ds_put_cstr(match, "}");
>>> +}
>>> +
>>> +/*
>>> + * Ingress table 19: Flows that forward ARP/ND requests only to the 
>>> routers
>>> + * that own the addresses. Other ARP/ND packets are still flooded in 
>>> the
>>> + * switching domain as regular broadcast.
>>> + */
>>> +static void
>>> +build_lswitch_rport_arp_req_flow_for_reachable_ip(struct ds *ips,
>>> +    int addr_family, struct ovn_port *patch_op, struct ovn_datapath 
>>> *od,
>>> +    uint32_t priority, struct hmap *lflows,
>>> +    const struct ovsdb_idl_row *stage_hint)
>>> +{
>>> +    struct ds match   = DS_EMPTY_INITIALIZER;
>>> +    struct ds actions = DS_EMPTY_INITIALIZER;
>>> +
>>> +    arp_nd_ns_match(ips, addr_family, &match);
>>>
>>>       /* Send a the packet to the router pipeline.  If the switch has 
>>> non-router
>>>        * ports then flood it there as well.
>>> @@ -6609,6 +6612,30 @@ build_lswitch_rport_arp_req_flow_for_ip(struct 
>>> ds *ip_match,
>>>       ds_destroy(&actions);
>>>   }
>>>
>>> +/*
>>> + * Ingress table 19: Flows that forward ARP/ND requests for 
>>> "unreachable" IPs
>>> + * (NAT or load balancer IPs configured on a router that are outside 
>>> the
>>> + * router's configured subnets).
>>> + * These ARP/ND packets are flooded in the switching domain as regular
>>> + * broadcast.
>>> + */
>>> +static void
>>> +build_lswitch_rport_arp_req_flow_for_unreachable_ip(struct ds *ips,
>>> +    int addr_family, struct ovn_datapath *od, uint32_t priority,
>>> +    struct hmap *lflows, const struct ovsdb_idl_row *stage_hint)
>>> +{
>>> +    struct ds match = DS_EMPTY_INITIALIZER;
>>> +
>>> +    arp_nd_ns_match(ips, addr_family, &match);
>>> +
>>> +    ovn_lflow_add_with_hint(lflows, od, S_SWITCH_IN_L2_LKUP,
>>> +                            priority, ds_cstr(&match),
>>> +                            "outport = \""MC_FLOOD"\"; output;",
>>> +                            stage_hint);
>>> +
>>> +    ds_destroy(&match);
>>> +}
>>> +
>>>   /*
>>>    * Ingress table 19: Flows that forward ARP/ND requests only to the 
>>> routers
>>>    * that own the addresses.
>>> @@ -6635,8 +6662,10 @@ build_lswitch_rport_arp_req_flows(struct 
>>> ovn_port *op,
>>>        * router port.
>>>        * Priority: 80.
>>>        */
>>> -    struct ds ips_v4_match = DS_EMPTY_INITIALIZER;
>>> -    struct ds ips_v6_match = DS_EMPTY_INITIALIZER;
>>> +    struct ds ips_v4_match_reachable = DS_EMPTY_INITIALIZER;
>>> +    struct ds ips_v6_match_reachable = DS_EMPTY_INITIALIZER;
>>> +    struct ds ips_v4_match_unreachable = DS_EMPTY_INITIALIZER;
>>> +    struct ds ips_v6_match_unreachable = DS_EMPTY_INITIALIZER;
>>>
>>>       const char *ip_addr;
>>>       SSET_FOR_EACH (ip_addr, &op->od->lb_ips_v4) {
>>> @@ -6645,9 +6674,12 @@ build_lswitch_rport_arp_req_flows(struct 
>>> ovn_port *op,
>>>           /* Check if the ovn port has a network configured on which 
>>> we could
>>>            * expect ARP requests for the LB VIP.
>>>            */
>>> -        if (ip_parse(ip_addr, &ipv4_addr) &&
>>> -                lrouter_port_ipv4_reachable(op, ipv4_addr)) {
>>> -            ds_put_format(&ips_v4_match, "%s, ", ip_addr);
>>> +        if (ip_parse(ip_addr, &ipv4_addr)) {
>>> +            if (lrouter_port_ipv4_reachable(op, ipv4_addr)) {
>>> +                ds_put_format(&ips_v4_match_reachable, "%s, ", 
>>> ip_addr);
>>> +            } else {
>>> +                ds_put_format(&ips_v4_match_unreachable, "%s, ", 
>>> ip_addr);
>>> +            }
>>>           }
>>>       }
>>>       SSET_FOR_EACH (ip_addr, &op->od->lb_ips_v6) {
>>> @@ -6656,9 +6688,12 @@ build_lswitch_rport_arp_req_flows(struct 
>>> ovn_port *op,
>>>           /* Check if the ovn port has a network configured on which 
>>> we could
>>>            * expect NS requests for the LB VIP.
>>>            */
>>> -        if (ipv6_parse(ip_addr, &ipv6_addr) &&
>>> -                lrouter_port_ipv6_reachable(op, &ipv6_addr)) {
>>> -            ds_put_format(&ips_v6_match, "%s, ", ip_addr);
>>> +        if (ipv6_parse(ip_addr, &ipv6_addr)) {
>>> +            if (lrouter_port_ipv6_reachable(op, &ipv6_addr)) {
>>> +                ds_put_format(&ips_v6_match_reachable, "%s, ", 
>>> ip_addr);
>>> +            } else {
>>> +                ds_put_format(&ips_v6_match_unreachable, "%s, ", 
>>> ip_addr);
>>> +            }
>>>           }
>>>       }
>>>
>>> @@ -6680,44 +6715,67 @@ build_lswitch_rport_arp_req_flows(struct 
>>> ovn_port *op,
>>>           if (nat_entry_is_v6(nat_entry)) {
>>>               struct in6_addr *addr = 
>>> &nat_entry->ext_addrs.ipv6_addrs[0].addr;
>>>
>>> -            if (lrouter_port_ipv6_reachable(op, addr) &&
>>> -                    !sset_contains(&op->od->lb_ips_v6, 
>>> nat->external_ip)) {
>>> -                ds_put_format(&ips_v6_match, "%s, ", nat->external_ip);
>>> +            if (!sset_contains(&op->od->lb_ips_v6, nat->external_ip)) {
>>> +                if (lrouter_port_ipv6_reachable(op, addr)) {
>>> +                    ds_put_format(&ips_v6_match_reachable, "%s, ",
>>> +                                  nat->external_ip);
>>> +                } else {
>>> +                    ds_put_format(&ips_v6_match_unreachable, "%s, ",
>>> +                                  nat->external_ip);
>>> +                }
>>>               }
>>>           } else {
>>>               ovs_be32 addr = nat_entry->ext_addrs.ipv4_addrs[0].addr;
>>> -
>>> -            if (lrouter_port_ipv4_reachable(op, addr) &&
>>> -                    !sset_contains(&op->od->lb_ips_v4, 
>>> nat->external_ip)) {
>>> -                ds_put_format(&ips_v4_match, "%s, ", nat->external_ip);
>>> +            if (!sset_contains(&op->od->lb_ips_v4, nat->external_ip)) {
>>> +                if (lrouter_port_ipv4_reachable(op, addr)) {
>>> +                    ds_put_format(&ips_v4_match_reachable, "%s, ",
>>> +                                  nat->external_ip);
>>> +                } else {
>>> +                    ds_put_format(&ips_v4_match_unreachable, "%s, ",
>>> +                                  nat->external_ip);
>>> +                }
>>>               }
>>>           }
>>>       }
>>>
>>>       for (size_t i = 0; i < op->lrp_networks.n_ipv4_addrs; i++) {
>>> -        ds_put_format(&ips_v4_match, "%s, ",
>>> +        ds_put_format(&ips_v4_match_reachable, "%s, ",
>>>                         op->lrp_networks.ipv4_addrs[i].addr_s);
>>>       }
>>>       for (size_t i = 0; i < op->lrp_networks.n_ipv6_addrs; i++) {
>>> -        ds_put_format(&ips_v6_match, "%s, ",
>>> +        ds_put_format(&ips_v6_match_reachable, "%s, ",
>>>                         op->lrp_networks.ipv6_addrs[i].addr_s);
>>>       }
>>>
>>> -    if (ds_last(&ips_v4_match) != EOF) {
>>> -        ds_chomp(&ips_v4_match, ' ');
>>> -        ds_chomp(&ips_v4_match, ',');
>>> -        build_lswitch_rport_arp_req_flow_for_ip(&ips_v4_match, 
>>> AF_INET, sw_op,
>>> -                                                sw_od, 80, lflows,
>>> -                                                stage_hint);
>>> +    if (ds_last(&ips_v4_match_reachable) != EOF) {
>>> +        ds_chomp(&ips_v4_match_reachable, ' ');
>>> +        ds_chomp(&ips_v4_match_reachable, ',');
>>> +        build_lswitch_rport_arp_req_flow_for_reachable_ip(
>>> +            &ips_v4_match_reachable, AF_INET, sw_op, sw_od, 80, lflows,
>>> +            stage_hint);
>>>       }
>>> -    if (ds_last(&ips_v6_match) != EOF) {
>>> -        ds_chomp(&ips_v6_match, ' ');
>>> -        ds_chomp(&ips_v6_match, ',');
>>> -        build_lswitch_rport_arp_req_flow_for_ip(&ips_v6_match, 
>>> AF_INET6, sw_op,
>>> -                                                sw_od, 80, lflows,
>>> -                                                stage_hint);
>>> +    if (ds_last(&ips_v6_match_reachable) != EOF) {
>>> +        ds_chomp(&ips_v6_match_reachable, ' ');
>>> +        ds_chomp(&ips_v6_match_reachable, ',');
>>> +        build_lswitch_rport_arp_req_flow_for_reachable_ip(
>>> +            &ips_v6_match_reachable, AF_INET6, sw_op, sw_od, 80, 
>>> lflows,
>>> +            stage_hint);
>>>       }
>>>
>>> +    if (ds_last(&ips_v4_match_unreachable) != EOF) {
>>> +        ds_chomp(&ips_v4_match_unreachable, ' ');
>>> +        ds_chomp(&ips_v4_match_unreachable, ',');
>>> +        build_lswitch_rport_arp_req_flow_for_unreachable_ip(
>>> +            &ips_v4_match_unreachable, AF_INET, sw_od, 90, lflows,
>>> +            stage_hint);
>>> +    }
>>> +    if (ds_last(&ips_v6_match_unreachable) != EOF) {
>>> +        ds_chomp(&ips_v6_match_unreachable, ' ');
>>> +        ds_chomp(&ips_v6_match_unreachable, ',');
>>> +        build_lswitch_rport_arp_req_flow_for_unreachable_ip(
>>> +            &ips_v6_match_unreachable, AF_INET6, sw_od, 90, lflows,
>>> +            stage_hint);
>>> +    }
>>>       /* Self originated ARP requests/ND need to be flooded as usual.
>>>        *
>>>        * However, if the switch doesn't have any non-router ports we 
>>> shouldn't
>>> @@ -6728,8 +6786,10 @@ build_lswitch_rport_arp_req_flows(struct 
>>> ovn_port *op,
>>>       if (sw_od->n_router_ports != sw_od->nbs->n_ports) {
>>>           build_lswitch_rport_arp_req_self_orig_flow(op, 75, sw_od, 
>>> lflows);
>>>       }
>>> -    ds_destroy(&ips_v4_match);
>>> -    ds_destroy(&ips_v6_match);
>>> +    ds_destroy(&ips_v4_match_reachable);
>>> +    ds_destroy(&ips_v6_match_reachable);
>>> +    ds_destroy(&ips_v4_match_unreachable);
>>> +    ds_destroy(&ips_v6_match_unreachable);
>>>   }
>>>
>>>   static void
>>> diff --git a/northd/ovn_northd.dl b/northd/ovn_northd.dl
>>> index c9ee742d1..dd6430849 100644
>>> --- a/northd/ovn_northd.dl
>>> +++ b/northd/ovn_northd.dl
>>> @@ -4109,9 +4109,13 @@ Flow(.logical_datapath = sw._uuid,
>>>    * router port.
>>>    * Priority: 80.
>>>    */
>>> -function get_arp_forward_ips(rp: Intern<RouterPort>): (Set<string>, 
>>> Set<string>) = {
>>> -    var all_ips_v4 = set_empty();
>>> -    var all_ips_v6 = set_empty();
>>> +function get_arp_forward_ips(rp: Intern<RouterPort>):
>>> +    (Set<string>, Set<string>, Set<string>, Set<string>) =
>>> +{
>>> +    var reachable_ips_v4 = set_empty();
>>> +    var reachable_ips_v6 = set_empty();
>>> +    var unreachable_ips_v4 = set_empty();
>>> +    var unreachable_ips_v6 = set_empty();
>>>
>>>       (var lb_ips_v4, var lb_ips_v6)
>>>           = get_router_load_balancer_ips(rp.router, false);
>>> @@ -4121,7 +4125,9 @@ function get_arp_forward_ips(rp: 
>>> Intern<RouterPort>): (Set<string>, Set<string>)
>>>            */
>>>           match (ip_parse(a)) {
>>>               Some{ipv4} -> if (lrouter_port_ip_reachable(rp, 
>>> IPv4{ipv4})) {
>>> -                all_ips_v4.insert(a)
>>> +                reachable_ips_v4.insert(a)
>>> +            } else {
>>> +                unreachable_ips_v4.insert(a)
>>>               },
>>>               _ -> ()
>>>           }
>>> @@ -4132,7 +4138,9 @@ function get_arp_forward_ips(rp: 
>>> Intern<RouterPort>): (Set<string>, Set<string>)
>>>            */
>>>           match (ipv6_parse(a)) {
>>>               Some{ipv6} -> if (lrouter_port_ip_reachable(rp, 
>>> IPv6{ipv6})) {
>>> -                all_ips_v6.insert(a)
>>> +                reachable_ips_v6.insert(a)
>>> +            } else {
>>> +                unreachable_ips_v6.insert(a)
>>>               },
>>>               _ -> ()
>>>           }
>>> @@ -4145,22 +4153,45 @@ function get_arp_forward_ips(rp: 
>>> Intern<RouterPort>): (Set<string>, Set<string>)
>>>                */
>>>               if (lrouter_port_ip_reachable(rp, nat.external_ip)) {
>>>                   match (nat.external_ip) {
>>> -                    IPv4{_} -> all_ips_v4.insert(nat.nat.external_ip),
>>> -                    IPv6{_} -> all_ips_v6.insert(nat.nat.external_ip)
>>> +                    IPv4{_} -> 
>>> reachable_ips_v4.insert(nat.nat.external_ip),
>>> +                    IPv6{_} -> 
>>> reachable_ips_v6.insert(nat.nat.external_ip)
>>> +                }
>>> +            } else {
>>> +                match (nat.external_ip) {
>>> +                    IPv4{_} -> 
>>> unreachable_ips_v4.insert(nat.nat.external_ip),
>>> +                    IPv6{_} -> 
>>> unreachable_ips_v6.insert(nat.nat.external_ip),
>>>                   }
>>>               }
>>>           }
>>>       };
>>>
>>>       for (a in rp.networks.ipv4_addrs) {
>>> -        all_ips_v4.insert("${a.addr}")
>>> +        reachable_ips_v4.insert("${a.addr}")
>>>       };
>>>       for (a in rp.networks.ipv6_addrs) {
>>> -        all_ips_v6.insert("${a.addr}")
>>> +        reachable_ips_v6.insert("${a.addr}")
>>>       };
>>>
>>> -    (all_ips_v4, all_ips_v6)
>>> +    (reachable_ips_v4, reachable_ips_v6, unreachable_ips_v4, 
>>> unreachable_ips_v6)
>>>   }
>>> +
>>> +relation &SwitchPortARPForwards(
>>> +    port: Intern<SwitchPort>,
>>> +    reachable_ips_v4: Set<string>,
>>> +    reachable_ips_v6: Set<string>,
>>> +    unreachable_ips_v4: Set<string>,
>>> +    unreachable_ips_v6: Set<string>
>>> +)
>>> +
>>> +&SwitchPortARPForwards(.port = port,
>>> +                       .reachable_ips_v4 = reachable_ips_v4,
>>> +                       .reachable_ips_v6 = reachable_ips_v6,
>>> +                       .unreachable_ips_v4 = unreachable_ips_v4,
>>> +                       .unreachable_ips_v6 = unreachable_ips_v6) :-
>>> +    port in &SwitchPort(.peer = Some{rp}),
>>> +    rp.is_enabled(),
>>> +    (var reachable_ips_v4, var reachable_ips_v6, var 
>>> unreachable_ips_v4, var unreachable_ips_v6) = get_arp_forward_ips(rp).
>>> +
>>>   /* Packets received from VXLAN tunnels have already been through the
>>>    * router pipeline so we should skip them. Normally this is done by 
>>> the
>>>    * multicast_group implementation (VXLAN packets skip table 32 which
>>> @@ -4172,7 +4203,7 @@ Flow(.logical_datapath = sw._uuid,
>>>        .priority         = 80,
>>>        .__match          = fLAGBIT_NOT_VXLAN() ++
>>>                            " && arp.op == 1 && arp.tpa == { " ++
>>> -                         all_ips_v4.to_vec().join(", ") ++ "}",
>>> +                         ipv4.to_vec().join(", ") ++ "}",
>>>        .actions          = if (sw.has_non_router_port) {
>>>                                "clone {outport = ${sp.json_name}; 
>>> output; }; "
>>>                                "outport = ${mc_flood_l2}; output;"
>>> @@ -4182,15 +4213,15 @@ Flow(.logical_datapath = sw._uuid,
>>>        .external_ids     = stage_hint(sp.lsp._uuid)) :-
>>>       sp in &SwitchPort(.sw = sw, .peer = Some{rp}),
>>>       rp.is_enabled(),
>>> -    (var all_ips_v4, _) = get_arp_forward_ips(rp),
>>> -    not all_ips_v4.is_empty(),
>>> +    &SwitchPortARPForwards(.port = sp, .reachable_ips_v4 = ipv4),
>>> +    not ipv4.is_empty(),
>>>       var mc_flood_l2 = json_string_escape(mC_FLOOD_L2().0).
>>>   Flow(.logical_datapath = sw._uuid,
>>>        .stage            = s_SWITCH_IN_L2_LKUP(),
>>>        .priority         = 80,
>>>        .__match          = fLAGBIT_NOT_VXLAN() ++
>>>                            " && nd_ns && nd.target == { " ++
>>> -                         all_ips_v6.to_vec().join(", ") ++ "}",
>>> +                         ipv6.to_vec().join(", ") ++ "}",
>>>        .actions          = if (sw.has_non_router_port) {
>>>                                "clone {outport = ${sp.json_name}; 
>>> output; }; "
>>>                                "outport = ${mc_flood_l2}; output;"
>>> @@ -4200,10 +4231,38 @@ Flow(.logical_datapath = sw._uuid,
>>>        .external_ids     = stage_hint(sp.lsp._uuid)) :-
>>>       sp in &SwitchPort(.sw = sw, .peer = Some{rp}),
>>>       rp.is_enabled(),
>>> -    (_, var all_ips_v6) = get_arp_forward_ips(rp),
>>> -    not all_ips_v6.is_empty(),
>>> +    &SwitchPortARPForwards(.port = sp, .reachable_ips_v6 = ipv6),
>>> +    not ipv6.is_empty(),
>>>       var mc_flood_l2 = json_string_escape(mC_FLOOD_L2().0).
>>>
>>> +Flow(.logical_datapath = sw._uuid,
>>> +     .stage            = s_SWITCH_IN_L2_LKUP(),
>>> +     .priority         = 90,
>>> +     .__match          = fLAGBIT_NOT_VXLAN() ++
>>> +                        " && arp.op == 1 && arp.tpa == {" ++
>>> +                        ipv4.to_vec().join(", ") ++ "}",
>>> +     .actions          = "outport = ${flood}; output;",
>>> +     .external_ids     = stage_hint(sp.lsp._uuid)) :-
>>> +    sp in &SwitchPort(.sw = sw, .peer = Some{rp}),
>>> +    rp.is_enabled(),
>>> +    &SwitchPortARPForwards(.port = sp, .unreachable_ips_v4 = ipv4),
>>> +    not ipv4.is_empty(),
>>> +    var flood = json_string_escape(mC_FLOOD().0).
>>> +
>>> +Flow(.logical_datapath = sw._uuid,
>>> +     .stage            = s_SWITCH_IN_L2_LKUP(),
>>> +     .priority         = 90,
>>> +     .__match          = fLAGBIT_NOT_VXLAN() ++
>>> +                         " && nd_ns && nd.target == {" ++
>>> +                         ipv6.to_vec().join(", ") ++ "}",
>>> +     .actions          = "outport = ${flood}; output;",
>>> +     .external_ids     = stage_hint(sp.lsp._uuid)) :-
>>> +    sp in &SwitchPort(.sw = sw, .peer = Some{rp}),
>>> +    rp.is_enabled(),
>>> +    &SwitchPortARPForwards(.port = sp, .unreachable_ips_v6 = ipv6),
>>> +    not ipv6.is_empty(),
>>> +    var flood = json_string_escape(mC_FLOOD().0).
>>> +
>>>   for (SwitchPortNewDynamicAddress(.port = &SwitchPort{.lsp = lsp, 
>>> .json_name = json_name, .sw = sw},
>>>                                    .address = Some{addrs})
>>>        if lsp.__type != "external") {
>>> diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
>>> index 0c8a09ced..f591537e1 100644
>>> --- a/tests/ovn-northd.at
>>> +++ b/tests/ovn-northd.at
>>> @@ -4033,3 +4033,102 @@ check ovn-nbctl --wait=sb clear 
>>> logical_router_port ro2-sw ha_chassis_group
>>>   check_lflows 0
>>>
>>>   AT_CLEANUP
>>> +
>>> +OVN_FOR_EACH_NORTHD([
>>> +AT_SETUP([ovn -- ARP flood for unreachable addresses])
>>> +ovn_start
>>> +
>>> +AS_BOX([Setting up the logical network])
>>> +
>>> +# This network is the same as the one from "Router Address Propagation"
>>> +check ovn-nbctl ls-add sw
>>> +
>>> +check ovn-nbctl lr-add ro1
>>> +check ovn-nbctl lrp-add ro1 ro1-sw 00:00:00:00:00:01 10.0.0.1/24
>>> +check ovn-nbctl lsp-add sw sw-ro1
>>> +check ovn-nbctl lsp-set-type sw-ro1 router
>>> +check ovn-nbctl lsp-set-addresses sw-ro1 router
>>> +check ovn-nbctl lsp-set-options sw-ro1 router-port=ro1-sw
>>> +
>>> +check ovn-nbctl lr-add ro2
>>> +check ovn-nbctl lrp-add ro2 ro2-sw 00:00:00:00:00:02 20.0.0.1/24
>>> +check ovn-nbctl lsp-add sw sw-ro2
>>> +check ovn-nbctl lsp-set-type sw-ro2 router
>>> +check ovn-nbctl lsp-set-addresses sw-ro2 router
>>> +check ovn-nbctl --wait=sb lsp-set-options sw-ro2 router-port=ro2-sw
>>> +
>>> +check ovn-nbctl ls-add ls1
>>> +check ovn-nbctl lsp-add ls1 vm1
>>> +check ovn-nbctl lsp-set-addresses vm1 "00:00:00:00:01:02 192.168.1.2"
>>> +check ovn-nbctl lrp-add ro1 ro1-ls1 00:00:00:00:01:01 192.168.1.1/24
>>> +check ovn-nbctl lsp-add ls1 ls1-ro1
>>> +check ovn-nbctl lsp-set-type ls1-ro1 router
>>> +check ovn-nbctl lsp-set-addresses ls1-ro1 router
>>> +check ovn-nbctl lsp-set-options ls1-ro1 router-port=ro1-ls1
>>> +
>>> +check ovn-nbctl ls-add ls2
>>> +check ovn-nbctl lsp-add ls2 vm2
>>> +check ovn-nbctl lsp-set-addresses vm2 "00:00:00:00:02:02 192.168.2.2"
>>> +check ovn-nbctl lrp-add ro2 ro2-ls2 00:00:00:00:02:01 192.168.2.1/24
>>> +check ovn-nbctl lsp-add ls2 ls2-ro2
>>> +check ovn-nbctl lsp-set-type ls2-ro2 router
>>> +check ovn-nbctl lsp-set-addresses ls2-ro2 router
>>> +check ovn-nbctl lsp-set-options ls2-ro2 router-port=ro2-ls2
>>> +
>>> +AS_BOX([Ensure that unreachable flood flows are not installed, since 
>>> no addresses are unreachable])
>>> +
>>> +AT_CHECK([ovn-sbctl lflow-list sw | grep "ls_in_l2_lkup" | grep 
>>> "priority=90" -c], [1], [dnl
>>> +0
>>> +])
>>> +
>>> +AS_BOX([Adding some reachable NAT addresses])
>>> +
>>> +check ovn-nbctl lr-nat-add ro1 dnat 10.0.0.100 192.168.1.100
>>> +check ovn-nbctl lr-nat-add ro1 snat 10.0.0.200 192.168.1.200/30
>>> +
>>> +check ovn-nbctl lr-nat-add ro2 dnat 20.0.0.100 192.168.2.100
>>> +check ovn-nbctl --wait=sb lr-nat-add ro2 snat 20.0.0.200 
>>> 192.168.2.200/30
>>> +
>>> +AS_BOX([Ensure that unreachable flood flows are not installed, since 
>>> all addresses are reachable])
>>> +
>>> +AT_CHECK([ovn-sbctl lflow-list sw | grep "ls_in_l2_lkup" | grep 
>>> "priority=90" -c], [1], [dnl
>>> +0
>>> +])
>>> +
>>> +AS_BOX([Adding some unreachable NAT addresses])
>>> +
>>> +check ovn-nbctl lr-nat-add ro1 dnat 30.0.0.100 192.168.1.130
>>> +check ovn-nbctl lr-nat-add ro1 snat 30.0.0.200 192.168.1.148/30
>>> +
>>> +check ovn-nbctl lr-nat-add ro2 dnat 40.0.0.100 192.168.2.130
>>> +check ovn-nbctl --wait=sb lr-nat-add ro2 snat 40.0.0.200 
>>> 192.168.2.148/30
>>> +
>>> +AS_BOX([Ensure that unreachable flood flows are installed, since 
>>> there are unreachable addresses])
>>> +
>>> +ovn-sbctl lflow-list
>>> +
>>> +# We expect two flows to be installed, one per connected router port 
>>> on sw
>>> +AT_CHECK([ovn-sbctl lflow-list sw | grep ls_in_l2_lkup | grep 
>>> priority=90 -c], [0], [dnl
>>> +2
>>> +])
>>> +
>>> +# We expect that the installed flows will match the unreachable DNAT 
>>> addresses only.
>>> +AT_CHECK([ovn-sbctl lflow-list sw | grep ls_in_l2_lkup | grep 
>>> priority=90 | grep "arp.tpa == {30.0.0.100}" -c], [0], [dnl
>>> +1
>>> +])
>>> +
>>> +AT_CHECK([ovn-sbctl lflow-list sw | grep ls_in_l2_lkup | grep 
>>> priority=90 | grep "arp.tpa == {40.0.0.100}" -c], [0], [dnl
>>> +1
>>> +])
>>> +
>>> +# Ensure that we do not create flows for SNAT addresses
>>> +AT_CHECK([ovn-sbctl lflow-list sw | grep ls_in_l2_lkup | grep 
>>> priority=90 | grep "arp.tpa == {30.0.0.200}" -c], [1], [dnl
>>> +0
>>> +])
>>> +
>>> +AT_CHECK([ovn-sbctl lflow-list sw | grep ls_in_l2_lkup | grep 
>>> priority=90 | grep "arp.tpa == {40.0.0.200}" -c], [1], [dnl
>>> +0
>>> +])
>>> +
>>> +AT_CLEANUP
>>> +])
>>> diff --git a/tests/system-ovn.at b/tests/system-ovn.at
>>> index a5f10ce38..bdd13b009 100644
>>> --- a/tests/system-ovn.at
>>> +++ b/tests/system-ovn.at
>>> @@ -6482,3 +6482,105 @@ OVS_TRAFFIC_VSWITCHD_STOP(["/.*error 
>>> receiving.*/d
>>>
>>>   AT_CLEANUP
>>>   ])
>>> +
>>> +OVN_FOR_EACH_NORTHD([
>>> +AT_SETUP([ovn -- Floating IP outside router subnet IPv4])
>>> +AT_KEYWORDS(NAT)
>>> +
>>> +ovn_start
>>> +
>>> +OVS_TRAFFIC_VSWITCHD_START()
>>> +ADD_BR([br-int])
>>> +
>>> +# Set external-ids in br-int needed for ovn-controller
>>> +ovs-vsctl \
>>> +        -- set Open_vSwitch . external-ids:system-id=hv1 \
>>> +        -- set Open_vSwitch . 
>>> external-ids:ovn-remote=unix:$ovs_base/ovn-sb/ovn-sb.sock \
>>> +        -- set Open_vSwitch . external-ids:ovn-encap-type=geneve \
>>> +        -- set Open_vSwitch . external-ids:ovn-encap-ip=169.0.0.1 \
>>> +        -- set bridge br-int fail-mode=secure 
>>> other-config:disable-in-band=true
>>> +
>>> +start_daemon ovn-controller
>>> +
>>> +# Logical network:
>>> +# Two VMs
>>> +#   * VM1 with IP address 192.168.100.5
>>> +#   * VM2 with IP address 192.168.200.5
>>> +#
>>> +# VM1 connects to logical switch ls1. ls1 connects to logical router 
>>> lr1.
>>> +# VM2 connects to logical switch ls2. ls2 connects to logical router 
>>> lr2.
>>> +# lr1 and lr2 both connect to logical switch ls-pub.
>>> +# * lr1's interface that connects to ls-pub has IP address 
>>> 172.18.2.110/24
>>> +# * lr2's interface that connects to ls-pub has IP address 
>>> 172.18.1.173/24
>>> +#
>>> +# lr1 has the following attributes:
>>> +#   * It has a DNAT rule that translates 172.18.2.11 to 
>>> 192.168.100.5 (VM1)
>>> +#
>>> +# lr2 has the following attributes:
>>> +#   * It has a DNAT rule that translates 172.18.2.12 to 
>>> 192.168.200.5 (VM2)
>>> +#
>>> +# In this test, we want to ensure that a ping from VM1 to IP address 
>>> 172.18.2.12 reaches VM2.
>>> +# When the NAT rules are set up, there should be MAC_Bindings 
>>> created that allow for traffic
>>> +# to exit lr1, go through ls-pub, and reach the NAT external IP 
>>> configured on lr2.
>>> +
>>> +check ovn-nbctl ls-add ls1
>>> +check ovn-nbctl lsp-add ls1 vm1 -- lsp-set-addresses vm1 
>>> "00:00:00:00:01:05 192.168.100.5"
>>> +
>>> +check ovn-nbctl ls-add ls2
>>> +check ovn-nbctl lsp-add ls2 vm2 -- lsp-set-addresses vm2 
>>> "00:00:00:00:02:05 192.168.200.5"
>>> +
>>> +check ovn-nbctl ls-add ls-pub
>>> +
>>> +check ovn-nbctl lr-add lr1
>>> +check ovn-nbctl lrp-add lr1 lr1-ls1 00:00:00:00:01:01 192.168.100.1/24
>>> +check ovn-nbctl lsp-add ls1 ls1-lr1                      \
>>> +    -- lsp-set-type ls1-lr1 router                 \
>>> +    -- lsp-set-addresses ls1-lr1 router            \
>>> +    -- lsp-set-options ls1-lr1 router-port=lr1-ls1
>>> +
>>> +check ovn-nbctl lr-add lr2
>>> +check ovn-nbctl lrp-add lr2 lr2-ls2 00:00:00:00:02:01 192.168.200.1/24
>>> +check ovn-nbctl lsp-add ls2 ls2-lr2                      \
>>> +    -- lsp-set-type ls2-lr2 router                 \
>>> +    -- lsp-set-addresses ls2-lr2 router            \
>>> +    -- lsp-set-options ls2-lr2 router-port=lr2-ls2
>>> +
>>> +check ovn-nbctl lrp-add lr1 lr1-ls-pub 00:00:00:00:03:01 
>>> 172.18.2.110/24
>>> +check ovn-nbctl lrp-set-gateway-chassis lr1-ls-pub hv1
>>> +check ovn-nbctl lsp-add ls-pub ls-pub-lr1                      \
>>> +    -- lsp-set-type ls-pub-lr1 router                    \
>>> +    -- lsp-set-addresses ls-pub-lr1 router               \
>>> +    -- lsp-set-options ls-pub-lr1 router-port=lr1-ls-pub
>>> +
>>> +check ovn-nbctl lrp-add lr2 lr2-ls-pub 00:00:00:00:03:02 
>>> 172.18.1.173/24
>>> +check ovn-nbctl lrp-set-gateway-chassis lr2-ls-pub hv1
>>> +check ovn-nbctl lsp-add ls-pub ls-pub-lr2                      \
>>> +    -- lsp-set-type ls-pub-lr2 router                    \
>>> +    -- lsp-set-addresses ls-pub-lr2 router               \
>>> +    -- lsp-set-options ls-pub-lr2 router-port=lr2-ls-pub
>>> +
>>> +# Putting --add-route on these NAT rules means there is no need to
>>> +# add any static routes.
>>> +check ovn-nbctl --add-route lr-nat-add lr1 dnat_and_snat 172.18.2.11 
>>> 192.168.100.5 vm1 00:00:00:00:03:01
>>> +check ovn-nbctl --add-route lr-nat-add lr2 dnat_and_snat 172.18.2.12 
>>> 192.168.200.5 vm2 00:00:00:00:03:02
>>> +
>>> +ADD_NAMESPACES(vm1)
>>> +ADD_VETH(vm1, vm1, br-int, "192.168.100.5/24", "00:00:00:00:01:05", \
>>> +         "192.168.100.1")
>>> +
>>> +ADD_NAMESPACES(vm2)
>>> +ADD_VETH(vm2, vm2, br-int, "192.168.200.5/24", "00:00:00:00:02:05", \
>>> +         "192.168.200.1")
>>> +
>>> +OVN_POPULATE_ARP
>>> +check ovn-nbctl --wait=hv sync
>>> +
>>> +AS_BOX([Testing a ping])
>>> +
>>> +NS_CHECK_EXEC([vm1], [ping -q -c 3 -i 0.3 -w 2 172.18.2.12 | 
>>> FORMAT_PING], \
>>> +[0], [dnl
>>> +3 packets transmitted, 3 received, 0% packet loss, time 0ms
>>> +])
>>> +
>>> +AT_CLEANUP
>>> +])
>>> -- 
>>> 2.31.1
>>>
>>> _______________________________________________
>>> dev mailing list
>>> dev@openvswitch.org
>>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>>>
>>
>
Numan Siddique July 7, 2021, 9:09 p.m. UTC | #4
On Wed, Jul 7, 2021 at 5:03 PM Mark Michelson <mmichels@redhat.com> wrote:
>
> On 7/7/21 2:50 PM, Mark Michelson wrote:
> > On 7/7/21 1:41 PM, Numan Siddique wrote:
> >> On Wed, Jun 30, 2021 at 7:57 PM Mark Michelson <mmichels@redhat.com>
> >> wrote:
> >>>
> >>> Previously, ARP TPAs were filtered down only to "reachable" addresses.
> >>> Reachable addresses are all router interface addresses, as well as NAT
> >>> external addresses and load balancer VIPs that are within the subnet
> >>> handled by a router's port.
> >>>
> >>> However, it is possible that in some configurations, CMSes purposely
> >>> configure NAT or load balancer addresses on a router that are outside
> >>> the router's subnets, and they expect the router to respond to ARPs for
> >>> those addresses.
> >>>
> >>> This commit adds a higher priority flow to logical switches that makes
> >>> it so ARPs targeted at "unreachable" addresses are flooded to all ports.
> >>> This way, the ARPs can reach the router appropriately and receive a
> >>> response.
> >>>
> >>> Reported at: https://bugzilla.redhat.com/show_bug.cgi?id=1929901
> >>>
> >>> Signed-off-by: Mark Michelson <mmichels@redhat.com>
> >>
> >> Acked-by: Numan Siddique <numans@ovn.org>
> >>
> >> I've one comment which probably can be addressed.
> >>
> >> If a configured NAT entry on a logical router is unreachable within
> >> that router, this patch floods
> >> the packet for the ARP destined to that NAT IP in the logical switch
> >> pipeline.
> >>
> >> Before adding the flow to flood, can't we check if the NAT IP is
> >> reachable from other router ports
> >> connected to this logical switch.  If so, we can do "outport ==
> >> <REACHABLE_LRP_PORT>" instead
> >> of flooding.
> >>
> >> I think this is possible given that the logical flow is added in the
> >> switch pipeline.  We just need to loop through
> >> all the router ports of the logical switch.  The question is - is this
> >> efficient  and takes up some time on a scaled environment ?
> >>
> >> What do you think ?  If this seems fine,  it can be a follow up patch
> >> too.
> >
> > I don't think your suggestion would add any appreciable time compared to
> > what we're already doing in ovn-northd. I will attempt this approach and
> > let you know how it goes.
> >
>
> On second thought, I think this should be a follow-up patch as you
> suggested. This particular work has been going on for too long to delay
> for this purpose, IMO.

Sounds good to me.

Numan

>
> >>
> >> Numan
> >>
> >>> ---
> >>>   northd/ovn-northd.8.xml |   8 ++
> >>>   northd/ovn-northd.c     | 162 +++++++++++++++++++++++++++-------------
> >>>   northd/ovn_northd.dl    |  91 ++++++++++++++++++----
> >>>   tests/ovn-northd.at     |  99 ++++++++++++++++++++++++
> >>>   tests/system-ovn.at     | 102 +++++++++++++++++++++++++
> >>>   5 files changed, 395 insertions(+), 67 deletions(-)
> >>>
> >>> diff --git a/northd/ovn-northd.8.xml b/northd/ovn-northd.8.xml
> >>> index beaf5a183..5aedd6619 100644
> >>> --- a/northd/ovn-northd.8.xml
> >>> +++ b/northd/ovn-northd.8.xml
> >>> @@ -1587,6 +1587,14 @@ output;
> >>>           logical ports.
> >>>         </li>
> >>>
> >>> +      <li>
> >>> +        Priority-90 flows for each IP address/VIP/NAT address
> >>> configured
> >>> +        outside its owning router port's subnet. These flows match ARP
> >>> +        requests and ND packets for the specific IP addresses.
> >>> Matched packets
> >>> +        are forwarded to the <code>MC_FLOOD</code> multicast group
> >>> which
> >>> +        contains all connected logical ports.
> >>> +      </li>
> >>> +
> >>>         <li>
> >>>           Priority-75 flows for each port connected to a logical router
> >>>           matching self originated ARP request/ND packets.  These
> >>> packets
> >>> diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
> >>> index f6fad281b..d0b325748 100644
> >>> --- a/northd/ovn-northd.c
> >>> +++ b/northd/ovn-northd.c
> >>> @@ -6555,38 +6555,41 @@
> >>> build_lswitch_rport_arp_req_self_orig_flow(struct ovn_port *op,
> >>>       ds_destroy(&match);
> >>>   }
> >>>
> >>> -/*
> >>> - * Ingress table 19: Flows that forward ARP/ND requests only to the
> >>> routers
> >>> - * that own the addresses. Other ARP/ND packets are still flooded in
> >>> the
> >>> - * switching domain as regular broadcast.
> >>> - */
> >>>   static void
> >>> -build_lswitch_rport_arp_req_flow_for_ip(struct ds *ip_match,
> >>> -                                        int addr_family,
> >>> -                                        struct ovn_port *patch_op,
> >>> -                                        struct ovn_datapath *od,
> >>> -                                        uint32_t priority,
> >>> -                                        struct hmap *lflows,
> >>> -                                        const struct ovsdb_idl_row
> >>> *stage_hint)
> >>> +arp_nd_ns_match(struct ds *ips, int addr_family, struct ds *match)
> >>>   {
> >>> -    struct ds match   = DS_EMPTY_INITIALIZER;
> >>> -    struct ds actions = DS_EMPTY_INITIALIZER;
> >>> -
> >>>       /* Packets received from VXLAN tunnels have already been
> >>> through the
> >>>        * router pipeline so we should skip them. Normally this is
> >>> done by the
> >>>        * multicast_group implementation (VXLAN packets skip table 32
> >>> which
> >>>        * delivers to patch ports) but we're bypassing multicast_groups.
> >>>        */
> >>> -    ds_put_cstr(&match, FLAGBIT_NOT_VXLAN " && ");
> >>> +    ds_put_cstr(match, FLAGBIT_NOT_VXLAN " && ");
> >>>
> >>>       if (addr_family == AF_INET) {
> >>> -        ds_put_cstr(&match, "arp.op == 1 && arp.tpa == { ");
> >>> +        ds_put_cstr(match, "arp.op == 1 && arp.tpa == {");
> >>>       } else {
> >>> -        ds_put_cstr(&match, "nd_ns && nd.target == { ");
> >>> +        ds_put_cstr(match, "nd_ns && nd.target == {");
> >>>       }
> >>>
> >>> -    ds_put_cstr(&match, ds_cstr_ro(ip_match));
> >>> -    ds_put_cstr(&match, "}");
> >>> +    ds_put_cstr(match, ds_cstr_ro(ips));
> >>> +    ds_put_cstr(match, "}");
> >>> +}
> >>> +
> >>> +/*
> >>> + * Ingress table 19: Flows that forward ARP/ND requests only to the
> >>> routers
> >>> + * that own the addresses. Other ARP/ND packets are still flooded in
> >>> the
> >>> + * switching domain as regular broadcast.
> >>> + */
> >>> +static void
> >>> +build_lswitch_rport_arp_req_flow_for_reachable_ip(struct ds *ips,
> >>> +    int addr_family, struct ovn_port *patch_op, struct ovn_datapath
> >>> *od,
> >>> +    uint32_t priority, struct hmap *lflows,
> >>> +    const struct ovsdb_idl_row *stage_hint)
> >>> +{
> >>> +    struct ds match   = DS_EMPTY_INITIALIZER;
> >>> +    struct ds actions = DS_EMPTY_INITIALIZER;
> >>> +
> >>> +    arp_nd_ns_match(ips, addr_family, &match);
> >>>
> >>>       /* Send a the packet to the router pipeline.  If the switch has
> >>> non-router
> >>>        * ports then flood it there as well.
> >>> @@ -6609,6 +6612,30 @@ build_lswitch_rport_arp_req_flow_for_ip(struct
> >>> ds *ip_match,
> >>>       ds_destroy(&actions);
> >>>   }
> >>>
> >>> +/*
> >>> + * Ingress table 19: Flows that forward ARP/ND requests for
> >>> "unreachable" IPs
> >>> + * (NAT or load balancer IPs configured on a router that are outside
> >>> the
> >>> + * router's configured subnets).
> >>> + * These ARP/ND packets are flooded in the switching domain as regular
> >>> + * broadcast.
> >>> + */
> >>> +static void
> >>> +build_lswitch_rport_arp_req_flow_for_unreachable_ip(struct ds *ips,
> >>> +    int addr_family, struct ovn_datapath *od, uint32_t priority,
> >>> +    struct hmap *lflows, const struct ovsdb_idl_row *stage_hint)
> >>> +{
> >>> +    struct ds match = DS_EMPTY_INITIALIZER;
> >>> +
> >>> +    arp_nd_ns_match(ips, addr_family, &match);
> >>> +
> >>> +    ovn_lflow_add_with_hint(lflows, od, S_SWITCH_IN_L2_LKUP,
> >>> +                            priority, ds_cstr(&match),
> >>> +                            "outport = \""MC_FLOOD"\"; output;",
> >>> +                            stage_hint);
> >>> +
> >>> +    ds_destroy(&match);
> >>> +}
> >>> +
> >>>   /*
> >>>    * Ingress table 19: Flows that forward ARP/ND requests only to the
> >>> routers
> >>>    * that own the addresses.
> >>> @@ -6635,8 +6662,10 @@ build_lswitch_rport_arp_req_flows(struct
> >>> ovn_port *op,
> >>>        * router port.
> >>>        * Priority: 80.
> >>>        */
> >>> -    struct ds ips_v4_match = DS_EMPTY_INITIALIZER;
> >>> -    struct ds ips_v6_match = DS_EMPTY_INITIALIZER;
> >>> +    struct ds ips_v4_match_reachable = DS_EMPTY_INITIALIZER;
> >>> +    struct ds ips_v6_match_reachable = DS_EMPTY_INITIALIZER;
> >>> +    struct ds ips_v4_match_unreachable = DS_EMPTY_INITIALIZER;
> >>> +    struct ds ips_v6_match_unreachable = DS_EMPTY_INITIALIZER;
> >>>
> >>>       const char *ip_addr;
> >>>       SSET_FOR_EACH (ip_addr, &op->od->lb_ips_v4) {
> >>> @@ -6645,9 +6674,12 @@ build_lswitch_rport_arp_req_flows(struct
> >>> ovn_port *op,
> >>>           /* Check if the ovn port has a network configured on which
> >>> we could
> >>>            * expect ARP requests for the LB VIP.
> >>>            */
> >>> -        if (ip_parse(ip_addr, &ipv4_addr) &&
> >>> -                lrouter_port_ipv4_reachable(op, ipv4_addr)) {
> >>> -            ds_put_format(&ips_v4_match, "%s, ", ip_addr);
> >>> +        if (ip_parse(ip_addr, &ipv4_addr)) {
> >>> +            if (lrouter_port_ipv4_reachable(op, ipv4_addr)) {
> >>> +                ds_put_format(&ips_v4_match_reachable, "%s, ",
> >>> ip_addr);
> >>> +            } else {
> >>> +                ds_put_format(&ips_v4_match_unreachable, "%s, ",
> >>> ip_addr);
> >>> +            }
> >>>           }
> >>>       }
> >>>       SSET_FOR_EACH (ip_addr, &op->od->lb_ips_v6) {
> >>> @@ -6656,9 +6688,12 @@ build_lswitch_rport_arp_req_flows(struct
> >>> ovn_port *op,
> >>>           /* Check if the ovn port has a network configured on which
> >>> we could
> >>>            * expect NS requests for the LB VIP.
> >>>            */
> >>> -        if (ipv6_parse(ip_addr, &ipv6_addr) &&
> >>> -                lrouter_port_ipv6_reachable(op, &ipv6_addr)) {
> >>> -            ds_put_format(&ips_v6_match, "%s, ", ip_addr);
> >>> +        if (ipv6_parse(ip_addr, &ipv6_addr)) {
> >>> +            if (lrouter_port_ipv6_reachable(op, &ipv6_addr)) {
> >>> +                ds_put_format(&ips_v6_match_reachable, "%s, ",
> >>> ip_addr);
> >>> +            } else {
> >>> +                ds_put_format(&ips_v6_match_unreachable, "%s, ",
> >>> ip_addr);
> >>> +            }
> >>>           }
> >>>       }
> >>>
> >>> @@ -6680,44 +6715,67 @@ build_lswitch_rport_arp_req_flows(struct
> >>> ovn_port *op,
> >>>           if (nat_entry_is_v6(nat_entry)) {
> >>>               struct in6_addr *addr =
> >>> &nat_entry->ext_addrs.ipv6_addrs[0].addr;
> >>>
> >>> -            if (lrouter_port_ipv6_reachable(op, addr) &&
> >>> -                    !sset_contains(&op->od->lb_ips_v6,
> >>> nat->external_ip)) {
> >>> -                ds_put_format(&ips_v6_match, "%s, ", nat->external_ip);
> >>> +            if (!sset_contains(&op->od->lb_ips_v6, nat->external_ip)) {
> >>> +                if (lrouter_port_ipv6_reachable(op, addr)) {
> >>> +                    ds_put_format(&ips_v6_match_reachable, "%s, ",
> >>> +                                  nat->external_ip);
> >>> +                } else {
> >>> +                    ds_put_format(&ips_v6_match_unreachable, "%s, ",
> >>> +                                  nat->external_ip);
> >>> +                }
> >>>               }
> >>>           } else {
> >>>               ovs_be32 addr = nat_entry->ext_addrs.ipv4_addrs[0].addr;
> >>> -
> >>> -            if (lrouter_port_ipv4_reachable(op, addr) &&
> >>> -                    !sset_contains(&op->od->lb_ips_v4,
> >>> nat->external_ip)) {
> >>> -                ds_put_format(&ips_v4_match, "%s, ", nat->external_ip);
> >>> +            if (!sset_contains(&op->od->lb_ips_v4, nat->external_ip)) {
> >>> +                if (lrouter_port_ipv4_reachable(op, addr)) {
> >>> +                    ds_put_format(&ips_v4_match_reachable, "%s, ",
> >>> +                                  nat->external_ip);
> >>> +                } else {
> >>> +                    ds_put_format(&ips_v4_match_unreachable, "%s, ",
> >>> +                                  nat->external_ip);
> >>> +                }
> >>>               }
> >>>           }
> >>>       }
> >>>
> >>>       for (size_t i = 0; i < op->lrp_networks.n_ipv4_addrs; i++) {
> >>> -        ds_put_format(&ips_v4_match, "%s, ",
> >>> +        ds_put_format(&ips_v4_match_reachable, "%s, ",
> >>>                         op->lrp_networks.ipv4_addrs[i].addr_s);
> >>>       }
> >>>       for (size_t i = 0; i < op->lrp_networks.n_ipv6_addrs; i++) {
> >>> -        ds_put_format(&ips_v6_match, "%s, ",
> >>> +        ds_put_format(&ips_v6_match_reachable, "%s, ",
> >>>                         op->lrp_networks.ipv6_addrs[i].addr_s);
> >>>       }
> >>>
> >>> -    if (ds_last(&ips_v4_match) != EOF) {
> >>> -        ds_chomp(&ips_v4_match, ' ');
> >>> -        ds_chomp(&ips_v4_match, ',');
> >>> -        build_lswitch_rport_arp_req_flow_for_ip(&ips_v4_match,
> >>> AF_INET, sw_op,
> >>> -                                                sw_od, 80, lflows,
> >>> -                                                stage_hint);
> >>> +    if (ds_last(&ips_v4_match_reachable) != EOF) {
> >>> +        ds_chomp(&ips_v4_match_reachable, ' ');
> >>> +        ds_chomp(&ips_v4_match_reachable, ',');
> >>> +        build_lswitch_rport_arp_req_flow_for_reachable_ip(
> >>> +            &ips_v4_match_reachable, AF_INET, sw_op, sw_od, 80, lflows,
> >>> +            stage_hint);
> >>>       }
> >>> -    if (ds_last(&ips_v6_match) != EOF) {
> >>> -        ds_chomp(&ips_v6_match, ' ');
> >>> -        ds_chomp(&ips_v6_match, ',');
> >>> -        build_lswitch_rport_arp_req_flow_for_ip(&ips_v6_match,
> >>> AF_INET6, sw_op,
> >>> -                                                sw_od, 80, lflows,
> >>> -                                                stage_hint);
> >>> +    if (ds_last(&ips_v6_match_reachable) != EOF) {
> >>> +        ds_chomp(&ips_v6_match_reachable, ' ');
> >>> +        ds_chomp(&ips_v6_match_reachable, ',');
> >>> +        build_lswitch_rport_arp_req_flow_for_reachable_ip(
> >>> +            &ips_v6_match_reachable, AF_INET6, sw_op, sw_od, 80,
> >>> lflows,
> >>> +            stage_hint);
> >>>       }
> >>>
> >>> +    if (ds_last(&ips_v4_match_unreachable) != EOF) {
> >>> +        ds_chomp(&ips_v4_match_unreachable, ' ');
> >>> +        ds_chomp(&ips_v4_match_unreachable, ',');
> >>> +        build_lswitch_rport_arp_req_flow_for_unreachable_ip(
> >>> +            &ips_v4_match_unreachable, AF_INET, sw_od, 90, lflows,
> >>> +            stage_hint);
> >>> +    }
> >>> +    if (ds_last(&ips_v6_match_unreachable) != EOF) {
> >>> +        ds_chomp(&ips_v6_match_unreachable, ' ');
> >>> +        ds_chomp(&ips_v6_match_unreachable, ',');
> >>> +        build_lswitch_rport_arp_req_flow_for_unreachable_ip(
> >>> +            &ips_v6_match_unreachable, AF_INET6, sw_od, 90, lflows,
> >>> +            stage_hint);
> >>> +    }
> >>>       /* Self originated ARP requests/ND need to be flooded as usual.
> >>>        *
> >>>        * However, if the switch doesn't have any non-router ports we
> >>> shouldn't
> >>> @@ -6728,8 +6786,10 @@ build_lswitch_rport_arp_req_flows(struct
> >>> ovn_port *op,
> >>>       if (sw_od->n_router_ports != sw_od->nbs->n_ports) {
> >>>           build_lswitch_rport_arp_req_self_orig_flow(op, 75, sw_od,
> >>> lflows);
> >>>       }
> >>> -    ds_destroy(&ips_v4_match);
> >>> -    ds_destroy(&ips_v6_match);
> >>> +    ds_destroy(&ips_v4_match_reachable);
> >>> +    ds_destroy(&ips_v6_match_reachable);
> >>> +    ds_destroy(&ips_v4_match_unreachable);
> >>> +    ds_destroy(&ips_v6_match_unreachable);
> >>>   }
> >>>
> >>>   static void
> >>> diff --git a/northd/ovn_northd.dl b/northd/ovn_northd.dl
> >>> index c9ee742d1..dd6430849 100644
> >>> --- a/northd/ovn_northd.dl
> >>> +++ b/northd/ovn_northd.dl
> >>> @@ -4109,9 +4109,13 @@ Flow(.logical_datapath = sw._uuid,
> >>>    * router port.
> >>>    * Priority: 80.
> >>>    */
> >>> -function get_arp_forward_ips(rp: Intern<RouterPort>): (Set<string>,
> >>> Set<string>) = {
> >>> -    var all_ips_v4 = set_empty();
> >>> -    var all_ips_v6 = set_empty();
> >>> +function get_arp_forward_ips(rp: Intern<RouterPort>):
> >>> +    (Set<string>, Set<string>, Set<string>, Set<string>) =
> >>> +{
> >>> +    var reachable_ips_v4 = set_empty();
> >>> +    var reachable_ips_v6 = set_empty();
> >>> +    var unreachable_ips_v4 = set_empty();
> >>> +    var unreachable_ips_v6 = set_empty();
> >>>
> >>>       (var lb_ips_v4, var lb_ips_v6)
> >>>           = get_router_load_balancer_ips(rp.router, false);
> >>> @@ -4121,7 +4125,9 @@ function get_arp_forward_ips(rp:
> >>> Intern<RouterPort>): (Set<string>, Set<string>)
> >>>            */
> >>>           match (ip_parse(a)) {
> >>>               Some{ipv4} -> if (lrouter_port_ip_reachable(rp,
> >>> IPv4{ipv4})) {
> >>> -                all_ips_v4.insert(a)
> >>> +                reachable_ips_v4.insert(a)
> >>> +            } else {
> >>> +                unreachable_ips_v4.insert(a)
> >>>               },
> >>>               _ -> ()
> >>>           }
> >>> @@ -4132,7 +4138,9 @@ function get_arp_forward_ips(rp:
> >>> Intern<RouterPort>): (Set<string>, Set<string>)
> >>>            */
> >>>           match (ipv6_parse(a)) {
> >>>               Some{ipv6} -> if (lrouter_port_ip_reachable(rp,
> >>> IPv6{ipv6})) {
> >>> -                all_ips_v6.insert(a)
> >>> +                reachable_ips_v6.insert(a)
> >>> +            } else {
> >>> +                unreachable_ips_v6.insert(a)
> >>>               },
> >>>               _ -> ()
> >>>           }
> >>> @@ -4145,22 +4153,45 @@ function get_arp_forward_ips(rp:
> >>> Intern<RouterPort>): (Set<string>, Set<string>)
> >>>                */
> >>>               if (lrouter_port_ip_reachable(rp, nat.external_ip)) {
> >>>                   match (nat.external_ip) {
> >>> -                    IPv4{_} -> all_ips_v4.insert(nat.nat.external_ip),
> >>> -                    IPv6{_} -> all_ips_v6.insert(nat.nat.external_ip)
> >>> +                    IPv4{_} ->
> >>> reachable_ips_v4.insert(nat.nat.external_ip),
> >>> +                    IPv6{_} ->
> >>> reachable_ips_v6.insert(nat.nat.external_ip)
> >>> +                }
> >>> +            } else {
> >>> +                match (nat.external_ip) {
> >>> +                    IPv4{_} ->
> >>> unreachable_ips_v4.insert(nat.nat.external_ip),
> >>> +                    IPv6{_} ->
> >>> unreachable_ips_v6.insert(nat.nat.external_ip),
> >>>                   }
> >>>               }
> >>>           }
> >>>       };
> >>>
> >>>       for (a in rp.networks.ipv4_addrs) {
> >>> -        all_ips_v4.insert("${a.addr}")
> >>> +        reachable_ips_v4.insert("${a.addr}")
> >>>       };
> >>>       for (a in rp.networks.ipv6_addrs) {
> >>> -        all_ips_v6.insert("${a.addr}")
> >>> +        reachable_ips_v6.insert("${a.addr}")
> >>>       };
> >>>
> >>> -    (all_ips_v4, all_ips_v6)
> >>> +    (reachable_ips_v4, reachable_ips_v6, unreachable_ips_v4,
> >>> unreachable_ips_v6)
> >>>   }
> >>> +
> >>> +relation &SwitchPortARPForwards(
> >>> +    port: Intern<SwitchPort>,
> >>> +    reachable_ips_v4: Set<string>,
> >>> +    reachable_ips_v6: Set<string>,
> >>> +    unreachable_ips_v4: Set<string>,
> >>> +    unreachable_ips_v6: Set<string>
> >>> +)
> >>> +
> >>> +&SwitchPortARPForwards(.port = port,
> >>> +                       .reachable_ips_v4 = reachable_ips_v4,
> >>> +                       .reachable_ips_v6 = reachable_ips_v6,
> >>> +                       .unreachable_ips_v4 = unreachable_ips_v4,
> >>> +                       .unreachable_ips_v6 = unreachable_ips_v6) :-
> >>> +    port in &SwitchPort(.peer = Some{rp}),
> >>> +    rp.is_enabled(),
> >>> +    (var reachable_ips_v4, var reachable_ips_v6, var
> >>> unreachable_ips_v4, var unreachable_ips_v6) = get_arp_forward_ips(rp).
> >>> +
> >>>   /* Packets received from VXLAN tunnels have already been through the
> >>>    * router pipeline so we should skip them. Normally this is done by
> >>> the
> >>>    * multicast_group implementation (VXLAN packets skip table 32 which
> >>> @@ -4172,7 +4203,7 @@ Flow(.logical_datapath = sw._uuid,
> >>>        .priority         = 80,
> >>>        .__match          = fLAGBIT_NOT_VXLAN() ++
> >>>                            " && arp.op == 1 && arp.tpa == { " ++
> >>> -                         all_ips_v4.to_vec().join(", ") ++ "}",
> >>> +                         ipv4.to_vec().join(", ") ++ "}",
> >>>        .actions          = if (sw.has_non_router_port) {
> >>>                                "clone {outport = ${sp.json_name};
> >>> output; }; "
> >>>                                "outport = ${mc_flood_l2}; output;"
> >>> @@ -4182,15 +4213,15 @@ Flow(.logical_datapath = sw._uuid,
> >>>        .external_ids     = stage_hint(sp.lsp._uuid)) :-
> >>>       sp in &SwitchPort(.sw = sw, .peer = Some{rp}),
> >>>       rp.is_enabled(),
> >>> -    (var all_ips_v4, _) = get_arp_forward_ips(rp),
> >>> -    not all_ips_v4.is_empty(),
> >>> +    &SwitchPortARPForwards(.port = sp, .reachable_ips_v4 = ipv4),
> >>> +    not ipv4.is_empty(),
> >>>       var mc_flood_l2 = json_string_escape(mC_FLOOD_L2().0).
> >>>   Flow(.logical_datapath = sw._uuid,
> >>>        .stage            = s_SWITCH_IN_L2_LKUP(),
> >>>        .priority         = 80,
> >>>        .__match          = fLAGBIT_NOT_VXLAN() ++
> >>>                            " && nd_ns && nd.target == { " ++
> >>> -                         all_ips_v6.to_vec().join(", ") ++ "}",
> >>> +                         ipv6.to_vec().join(", ") ++ "}",
> >>>        .actions          = if (sw.has_non_router_port) {
> >>>                                "clone {outport = ${sp.json_name};
> >>> output; }; "
> >>>                                "outport = ${mc_flood_l2}; output;"
> >>> @@ -4200,10 +4231,38 @@ Flow(.logical_datapath = sw._uuid,
> >>>        .external_ids     = stage_hint(sp.lsp._uuid)) :-
> >>>       sp in &SwitchPort(.sw = sw, .peer = Some{rp}),
> >>>       rp.is_enabled(),
> >>> -    (_, var all_ips_v6) = get_arp_forward_ips(rp),
> >>> -    not all_ips_v6.is_empty(),
> >>> +    &SwitchPortARPForwards(.port = sp, .reachable_ips_v6 = ipv6),
> >>> +    not ipv6.is_empty(),
> >>>       var mc_flood_l2 = json_string_escape(mC_FLOOD_L2().0).
> >>>
> >>> +Flow(.logical_datapath = sw._uuid,
> >>> +     .stage            = s_SWITCH_IN_L2_LKUP(),
> >>> +     .priority         = 90,
> >>> +     .__match          = fLAGBIT_NOT_VXLAN() ++
> >>> +                        " && arp.op == 1 && arp.tpa == {" ++
> >>> +                        ipv4.to_vec().join(", ") ++ "}",
> >>> +     .actions          = "outport = ${flood}; output;",
> >>> +     .external_ids     = stage_hint(sp.lsp._uuid)) :-
> >>> +    sp in &SwitchPort(.sw = sw, .peer = Some{rp}),
> >>> +    rp.is_enabled(),
> >>> +    &SwitchPortARPForwards(.port = sp, .unreachable_ips_v4 = ipv4),
> >>> +    not ipv4.is_empty(),
> >>> +    var flood = json_string_escape(mC_FLOOD().0).
> >>> +
> >>> +Flow(.logical_datapath = sw._uuid,
> >>> +     .stage            = s_SWITCH_IN_L2_LKUP(),
> >>> +     .priority         = 90,
> >>> +     .__match          = fLAGBIT_NOT_VXLAN() ++
> >>> +                         " && nd_ns && nd.target == {" ++
> >>> +                         ipv6.to_vec().join(", ") ++ "}",
> >>> +     .actions          = "outport = ${flood}; output;",
> >>> +     .external_ids     = stage_hint(sp.lsp._uuid)) :-
> >>> +    sp in &SwitchPort(.sw = sw, .peer = Some{rp}),
> >>> +    rp.is_enabled(),
> >>> +    &SwitchPortARPForwards(.port = sp, .unreachable_ips_v6 = ipv6),
> >>> +    not ipv6.is_empty(),
> >>> +    var flood = json_string_escape(mC_FLOOD().0).
> >>> +
> >>>   for (SwitchPortNewDynamicAddress(.port = &SwitchPort{.lsp = lsp,
> >>> .json_name = json_name, .sw = sw},
> >>>                                    .address = Some{addrs})
> >>>        if lsp.__type != "external") {
> >>> diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
> >>> index 0c8a09ced..f591537e1 100644
> >>> --- a/tests/ovn-northd.at
> >>> +++ b/tests/ovn-northd.at
> >>> @@ -4033,3 +4033,102 @@ check ovn-nbctl --wait=sb clear
> >>> logical_router_port ro2-sw ha_chassis_group
> >>>   check_lflows 0
> >>>
> >>>   AT_CLEANUP
> >>> +
> >>> +OVN_FOR_EACH_NORTHD([
> >>> +AT_SETUP([ovn -- ARP flood for unreachable addresses])
> >>> +ovn_start
> >>> +
> >>> +AS_BOX([Setting up the logical network])
> >>> +
> >>> +# This network is the same as the one from "Router Address Propagation"
> >>> +check ovn-nbctl ls-add sw
> >>> +
> >>> +check ovn-nbctl lr-add ro1
> >>> +check ovn-nbctl lrp-add ro1 ro1-sw 00:00:00:00:00:01 10.0.0.1/24
> >>> +check ovn-nbctl lsp-add sw sw-ro1
> >>> +check ovn-nbctl lsp-set-type sw-ro1 router
> >>> +check ovn-nbctl lsp-set-addresses sw-ro1 router
> >>> +check ovn-nbctl lsp-set-options sw-ro1 router-port=ro1-sw
> >>> +
> >>> +check ovn-nbctl lr-add ro2
> >>> +check ovn-nbctl lrp-add ro2 ro2-sw 00:00:00:00:00:02 20.0.0.1/24
> >>> +check ovn-nbctl lsp-add sw sw-ro2
> >>> +check ovn-nbctl lsp-set-type sw-ro2 router
> >>> +check ovn-nbctl lsp-set-addresses sw-ro2 router
> >>> +check ovn-nbctl --wait=sb lsp-set-options sw-ro2 router-port=ro2-sw
> >>> +
> >>> +check ovn-nbctl ls-add ls1
> >>> +check ovn-nbctl lsp-add ls1 vm1
> >>> +check ovn-nbctl lsp-set-addresses vm1 "00:00:00:00:01:02 192.168.1.2"
> >>> +check ovn-nbctl lrp-add ro1 ro1-ls1 00:00:00:00:01:01 192.168.1.1/24
> >>> +check ovn-nbctl lsp-add ls1 ls1-ro1
> >>> +check ovn-nbctl lsp-set-type ls1-ro1 router
> >>> +check ovn-nbctl lsp-set-addresses ls1-ro1 router
> >>> +check ovn-nbctl lsp-set-options ls1-ro1 router-port=ro1-ls1
> >>> +
> >>> +check ovn-nbctl ls-add ls2
> >>> +check ovn-nbctl lsp-add ls2 vm2
> >>> +check ovn-nbctl lsp-set-addresses vm2 "00:00:00:00:02:02 192.168.2.2"
> >>> +check ovn-nbctl lrp-add ro2 ro2-ls2 00:00:00:00:02:01 192.168.2.1/24
> >>> +check ovn-nbctl lsp-add ls2 ls2-ro2
> >>> +check ovn-nbctl lsp-set-type ls2-ro2 router
> >>> +check ovn-nbctl lsp-set-addresses ls2-ro2 router
> >>> +check ovn-nbctl lsp-set-options ls2-ro2 router-port=ro2-ls2
> >>> +
> >>> +AS_BOX([Ensure that unreachable flood flows are not installed, since
> >>> no addresses are unreachable])
> >>> +
> >>> +AT_CHECK([ovn-sbctl lflow-list sw | grep "ls_in_l2_lkup" | grep
> >>> "priority=90" -c], [1], [dnl
> >>> +0
> >>> +])
> >>> +
> >>> +AS_BOX([Adding some reachable NAT addresses])
> >>> +
> >>> +check ovn-nbctl lr-nat-add ro1 dnat 10.0.0.100 192.168.1.100
> >>> +check ovn-nbctl lr-nat-add ro1 snat 10.0.0.200 192.168.1.200/30
> >>> +
> >>> +check ovn-nbctl lr-nat-add ro2 dnat 20.0.0.100 192.168.2.100
> >>> +check ovn-nbctl --wait=sb lr-nat-add ro2 snat 20.0.0.200
> >>> 192.168.2.200/30
> >>> +
> >>> +AS_BOX([Ensure that unreachable flood flows are not installed, since
> >>> all addresses are reachable])
> >>> +
> >>> +AT_CHECK([ovn-sbctl lflow-list sw | grep "ls_in_l2_lkup" | grep
> >>> "priority=90" -c], [1], [dnl
> >>> +0
> >>> +])
> >>> +
> >>> +AS_BOX([Adding some unreachable NAT addresses])
> >>> +
> >>> +check ovn-nbctl lr-nat-add ro1 dnat 30.0.0.100 192.168.1.130
> >>> +check ovn-nbctl lr-nat-add ro1 snat 30.0.0.200 192.168.1.148/30
> >>> +
> >>> +check ovn-nbctl lr-nat-add ro2 dnat 40.0.0.100 192.168.2.130
> >>> +check ovn-nbctl --wait=sb lr-nat-add ro2 snat 40.0.0.200
> >>> 192.168.2.148/30
> >>> +
> >>> +AS_BOX([Ensure that unreachable flood flows are installed, since
> >>> there are unreachable addresses])
> >>> +
> >>> +ovn-sbctl lflow-list
> >>> +
> >>> +# We expect two flows to be installed, one per connected router port
> >>> on sw
> >>> +AT_CHECK([ovn-sbctl lflow-list sw | grep ls_in_l2_lkup | grep
> >>> priority=90 -c], [0], [dnl
> >>> +2
> >>> +])
> >>> +
> >>> +# We expect that the installed flows will match the unreachable DNAT
> >>> addresses only.
> >>> +AT_CHECK([ovn-sbctl lflow-list sw | grep ls_in_l2_lkup | grep
> >>> priority=90 | grep "arp.tpa == {30.0.0.100}" -c], [0], [dnl
> >>> +1
> >>> +])
> >>> +
> >>> +AT_CHECK([ovn-sbctl lflow-list sw | grep ls_in_l2_lkup | grep
> >>> priority=90 | grep "arp.tpa == {40.0.0.100}" -c], [0], [dnl
> >>> +1
> >>> +])
> >>> +
> >>> +# Ensure that we do not create flows for SNAT addresses
> >>> +AT_CHECK([ovn-sbctl lflow-list sw | grep ls_in_l2_lkup | grep
> >>> priority=90 | grep "arp.tpa == {30.0.0.200}" -c], [1], [dnl
> >>> +0
> >>> +])
> >>> +
> >>> +AT_CHECK([ovn-sbctl lflow-list sw | grep ls_in_l2_lkup | grep
> >>> priority=90 | grep "arp.tpa == {40.0.0.200}" -c], [1], [dnl
> >>> +0
> >>> +])
> >>> +
> >>> +AT_CLEANUP
> >>> +])
> >>> diff --git a/tests/system-ovn.at b/tests/system-ovn.at
> >>> index a5f10ce38..bdd13b009 100644
> >>> --- a/tests/system-ovn.at
> >>> +++ b/tests/system-ovn.at
> >>> @@ -6482,3 +6482,105 @@ OVS_TRAFFIC_VSWITCHD_STOP(["/.*error
> >>> receiving.*/d
> >>>
> >>>   AT_CLEANUP
> >>>   ])
> >>> +
> >>> +OVN_FOR_EACH_NORTHD([
> >>> +AT_SETUP([ovn -- Floating IP outside router subnet IPv4])
> >>> +AT_KEYWORDS(NAT)
> >>> +
> >>> +ovn_start
> >>> +
> >>> +OVS_TRAFFIC_VSWITCHD_START()
> >>> +ADD_BR([br-int])
> >>> +
> >>> +# Set external-ids in br-int needed for ovn-controller
> >>> +ovs-vsctl \
> >>> +        -- set Open_vSwitch . external-ids:system-id=hv1 \
> >>> +        -- set Open_vSwitch .
> >>> external-ids:ovn-remote=unix:$ovs_base/ovn-sb/ovn-sb.sock \
> >>> +        -- set Open_vSwitch . external-ids:ovn-encap-type=geneve \
> >>> +        -- set Open_vSwitch . external-ids:ovn-encap-ip=169.0.0.1 \
> >>> +        -- set bridge br-int fail-mode=secure
> >>> other-config:disable-in-band=true
> >>> +
> >>> +start_daemon ovn-controller
> >>> +
> >>> +# Logical network:
> >>> +# Two VMs
> >>> +#   * VM1 with IP address 192.168.100.5
> >>> +#   * VM2 with IP address 192.168.200.5
> >>> +#
> >>> +# VM1 connects to logical switch ls1. ls1 connects to logical router
> >>> lr1.
> >>> +# VM2 connects to logical switch ls2. ls2 connects to logical router
> >>> lr2.
> >>> +# lr1 and lr2 both connect to logical switch ls-pub.
> >>> +# * lr1's interface that connects to ls-pub has IP address
> >>> 172.18.2.110/24
> >>> +# * lr2's interface that connects to ls-pub has IP address
> >>> 172.18.1.173/24
> >>> +#
> >>> +# lr1 has the following attributes:
> >>> +#   * It has a DNAT rule that translates 172.18.2.11 to
> >>> 192.168.100.5 (VM1)
> >>> +#
> >>> +# lr2 has the following attributes:
> >>> +#   * It has a DNAT rule that translates 172.18.2.12 to
> >>> 192.168.200.5 (VM2)
> >>> +#
> >>> +# In this test, we want to ensure that a ping from VM1 to IP address
> >>> 172.18.2.12 reaches VM2.
> >>> +# When the NAT rules are set up, there should be MAC_Bindings
> >>> created that allow for traffic
> >>> +# to exit lr1, go through ls-pub, and reach the NAT external IP
> >>> configured on lr2.
> >>> +
> >>> +check ovn-nbctl ls-add ls1
> >>> +check ovn-nbctl lsp-add ls1 vm1 -- lsp-set-addresses vm1
> >>> "00:00:00:00:01:05 192.168.100.5"
> >>> +
> >>> +check ovn-nbctl ls-add ls2
> >>> +check ovn-nbctl lsp-add ls2 vm2 -- lsp-set-addresses vm2
> >>> "00:00:00:00:02:05 192.168.200.5"
> >>> +
> >>> +check ovn-nbctl ls-add ls-pub
> >>> +
> >>> +check ovn-nbctl lr-add lr1
> >>> +check ovn-nbctl lrp-add lr1 lr1-ls1 00:00:00:00:01:01 192.168.100.1/24
> >>> +check ovn-nbctl lsp-add ls1 ls1-lr1                      \
> >>> +    -- lsp-set-type ls1-lr1 router                 \
> >>> +    -- lsp-set-addresses ls1-lr1 router            \
> >>> +    -- lsp-set-options ls1-lr1 router-port=lr1-ls1
> >>> +
> >>> +check ovn-nbctl lr-add lr2
> >>> +check ovn-nbctl lrp-add lr2 lr2-ls2 00:00:00:00:02:01 192.168.200.1/24
> >>> +check ovn-nbctl lsp-add ls2 ls2-lr2                      \
> >>> +    -- lsp-set-type ls2-lr2 router                 \
> >>> +    -- lsp-set-addresses ls2-lr2 router            \
> >>> +    -- lsp-set-options ls2-lr2 router-port=lr2-ls2
> >>> +
> >>> +check ovn-nbctl lrp-add lr1 lr1-ls-pub 00:00:00:00:03:01
> >>> 172.18.2.110/24
> >>> +check ovn-nbctl lrp-set-gateway-chassis lr1-ls-pub hv1
> >>> +check ovn-nbctl lsp-add ls-pub ls-pub-lr1                      \
> >>> +    -- lsp-set-type ls-pub-lr1 router                    \
> >>> +    -- lsp-set-addresses ls-pub-lr1 router               \
> >>> +    -- lsp-set-options ls-pub-lr1 router-port=lr1-ls-pub
> >>> +
> >>> +check ovn-nbctl lrp-add lr2 lr2-ls-pub 00:00:00:00:03:02
> >>> 172.18.1.173/24
> >>> +check ovn-nbctl lrp-set-gateway-chassis lr2-ls-pub hv1
> >>> +check ovn-nbctl lsp-add ls-pub ls-pub-lr2                      \
> >>> +    -- lsp-set-type ls-pub-lr2 router                    \
> >>> +    -- lsp-set-addresses ls-pub-lr2 router               \
> >>> +    -- lsp-set-options ls-pub-lr2 router-port=lr2-ls-pub
> >>> +
> >>> +# Putting --add-route on these NAT rules means there is no need to
> >>> +# add any static routes.
> >>> +check ovn-nbctl --add-route lr-nat-add lr1 dnat_and_snat 172.18.2.11
> >>> 192.168.100.5 vm1 00:00:00:00:03:01
> >>> +check ovn-nbctl --add-route lr-nat-add lr2 dnat_and_snat 172.18.2.12
> >>> 192.168.200.5 vm2 00:00:00:00:03:02
> >>> +
> >>> +ADD_NAMESPACES(vm1)
> >>> +ADD_VETH(vm1, vm1, br-int, "192.168.100.5/24", "00:00:00:00:01:05", \
> >>> +         "192.168.100.1")
> >>> +
> >>> +ADD_NAMESPACES(vm2)
> >>> +ADD_VETH(vm2, vm2, br-int, "192.168.200.5/24", "00:00:00:00:02:05", \
> >>> +         "192.168.200.1")
> >>> +
> >>> +OVN_POPULATE_ARP
> >>> +check ovn-nbctl --wait=hv sync
> >>> +
> >>> +AS_BOX([Testing a ping])
> >>> +
> >>> +NS_CHECK_EXEC([vm1], [ping -q -c 3 -i 0.3 -w 2 172.18.2.12 |
> >>> FORMAT_PING], \
> >>> +[0], [dnl
> >>> +3 packets transmitted, 3 received, 0% packet loss, time 0ms
> >>> +])
> >>> +
> >>> +AT_CLEANUP
> >>> +])
> >>> --
> >>> 2.31.1
> >>>
> >>> _______________________________________________
> >>> dev mailing list
> >>> dev@openvswitch.org
> >>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
> >>>
> >>
> >
>
>
> _______________________________________________
> dev mailing list
> dev@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
diff mbox series

Patch

diff --git a/northd/ovn-northd.8.xml b/northd/ovn-northd.8.xml
index beaf5a183..5aedd6619 100644
--- a/northd/ovn-northd.8.xml
+++ b/northd/ovn-northd.8.xml
@@ -1587,6 +1587,14 @@  output;
         logical ports.
       </li>
 
+      <li>
+        Priority-90 flows for each IP address/VIP/NAT address configured
+        outside its owning router port's subnet. These flows match ARP
+        requests and ND packets for the specific IP addresses.  Matched packets
+        are forwarded to the <code>MC_FLOOD</code> multicast group which
+        contains all connected logical ports.
+      </li>
+
       <li>
         Priority-75 flows for each port connected to a logical router
         matching self originated ARP request/ND packets.  These packets
diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
index f6fad281b..d0b325748 100644
--- a/northd/ovn-northd.c
+++ b/northd/ovn-northd.c
@@ -6555,38 +6555,41 @@  build_lswitch_rport_arp_req_self_orig_flow(struct ovn_port *op,
     ds_destroy(&match);
 }
 
-/*
- * Ingress table 19: Flows that forward ARP/ND requests only to the routers
- * that own the addresses. Other ARP/ND packets are still flooded in the
- * switching domain as regular broadcast.
- */
 static void
-build_lswitch_rport_arp_req_flow_for_ip(struct ds *ip_match,
-                                        int addr_family,
-                                        struct ovn_port *patch_op,
-                                        struct ovn_datapath *od,
-                                        uint32_t priority,
-                                        struct hmap *lflows,
-                                        const struct ovsdb_idl_row *stage_hint)
+arp_nd_ns_match(struct ds *ips, int addr_family, struct ds *match)
 {
-    struct ds match   = DS_EMPTY_INITIALIZER;
-    struct ds actions = DS_EMPTY_INITIALIZER;
-
     /* Packets received from VXLAN tunnels have already been through the
      * router pipeline so we should skip them. Normally this is done by the
      * multicast_group implementation (VXLAN packets skip table 32 which
      * delivers to patch ports) but we're bypassing multicast_groups.
      */
-    ds_put_cstr(&match, FLAGBIT_NOT_VXLAN " && ");
+    ds_put_cstr(match, FLAGBIT_NOT_VXLAN " && ");
 
     if (addr_family == AF_INET) {
-        ds_put_cstr(&match, "arp.op == 1 && arp.tpa == { ");
+        ds_put_cstr(match, "arp.op == 1 && arp.tpa == {");
     } else {
-        ds_put_cstr(&match, "nd_ns && nd.target == { ");
+        ds_put_cstr(match, "nd_ns && nd.target == {");
     }
 
-    ds_put_cstr(&match, ds_cstr_ro(ip_match));
-    ds_put_cstr(&match, "}");
+    ds_put_cstr(match, ds_cstr_ro(ips));
+    ds_put_cstr(match, "}");
+}
+
+/*
+ * Ingress table 19: Flows that forward ARP/ND requests only to the routers
+ * that own the addresses. Other ARP/ND packets are still flooded in the
+ * switching domain as regular broadcast.
+ */
+static void
+build_lswitch_rport_arp_req_flow_for_reachable_ip(struct ds *ips,
+    int addr_family, struct ovn_port *patch_op, struct ovn_datapath *od,
+    uint32_t priority, struct hmap *lflows,
+    const struct ovsdb_idl_row *stage_hint)
+{
+    struct ds match   = DS_EMPTY_INITIALIZER;
+    struct ds actions = DS_EMPTY_INITIALIZER;
+
+    arp_nd_ns_match(ips, addr_family, &match);
 
     /* Send a the packet to the router pipeline.  If the switch has non-router
      * ports then flood it there as well.
@@ -6609,6 +6612,30 @@  build_lswitch_rport_arp_req_flow_for_ip(struct ds *ip_match,
     ds_destroy(&actions);
 }
 
+/*
+ * Ingress table 19: Flows that forward ARP/ND requests for "unreachable" IPs
+ * (NAT or load balancer IPs configured on a router that are outside the
+ * router's configured subnets).
+ * These ARP/ND packets are flooded in the switching domain as regular
+ * broadcast.
+ */
+static void
+build_lswitch_rport_arp_req_flow_for_unreachable_ip(struct ds *ips,
+    int addr_family, struct ovn_datapath *od, uint32_t priority,
+    struct hmap *lflows, const struct ovsdb_idl_row *stage_hint)
+{
+    struct ds match = DS_EMPTY_INITIALIZER;
+
+    arp_nd_ns_match(ips, addr_family, &match);
+
+    ovn_lflow_add_with_hint(lflows, od, S_SWITCH_IN_L2_LKUP,
+                            priority, ds_cstr(&match),
+                            "outport = \""MC_FLOOD"\"; output;",
+                            stage_hint);
+
+    ds_destroy(&match);
+}
+
 /*
  * Ingress table 19: Flows that forward ARP/ND requests only to the routers
  * that own the addresses.
@@ -6635,8 +6662,10 @@  build_lswitch_rport_arp_req_flows(struct ovn_port *op,
      * router port.
      * Priority: 80.
      */
-    struct ds ips_v4_match = DS_EMPTY_INITIALIZER;
-    struct ds ips_v6_match = DS_EMPTY_INITIALIZER;
+    struct ds ips_v4_match_reachable = DS_EMPTY_INITIALIZER;
+    struct ds ips_v6_match_reachable = DS_EMPTY_INITIALIZER;
+    struct ds ips_v4_match_unreachable = DS_EMPTY_INITIALIZER;
+    struct ds ips_v6_match_unreachable = DS_EMPTY_INITIALIZER;
 
     const char *ip_addr;
     SSET_FOR_EACH (ip_addr, &op->od->lb_ips_v4) {
@@ -6645,9 +6674,12 @@  build_lswitch_rport_arp_req_flows(struct ovn_port *op,
         /* Check if the ovn port has a network configured on which we could
          * expect ARP requests for the LB VIP.
          */
-        if (ip_parse(ip_addr, &ipv4_addr) &&
-                lrouter_port_ipv4_reachable(op, ipv4_addr)) {
-            ds_put_format(&ips_v4_match, "%s, ", ip_addr);
+        if (ip_parse(ip_addr, &ipv4_addr)) {
+            if (lrouter_port_ipv4_reachable(op, ipv4_addr)) {
+                ds_put_format(&ips_v4_match_reachable, "%s, ", ip_addr);
+            } else {
+                ds_put_format(&ips_v4_match_unreachable, "%s, ", ip_addr);
+            }
         }
     }
     SSET_FOR_EACH (ip_addr, &op->od->lb_ips_v6) {
@@ -6656,9 +6688,12 @@  build_lswitch_rport_arp_req_flows(struct ovn_port *op,
         /* Check if the ovn port has a network configured on which we could
          * expect NS requests for the LB VIP.
          */
-        if (ipv6_parse(ip_addr, &ipv6_addr) &&
-                lrouter_port_ipv6_reachable(op, &ipv6_addr)) {
-            ds_put_format(&ips_v6_match, "%s, ", ip_addr);
+        if (ipv6_parse(ip_addr, &ipv6_addr)) {
+            if (lrouter_port_ipv6_reachable(op, &ipv6_addr)) {
+                ds_put_format(&ips_v6_match_reachable, "%s, ", ip_addr);
+            } else {
+                ds_put_format(&ips_v6_match_unreachable, "%s, ", ip_addr);
+            }
         }
     }
 
@@ -6680,44 +6715,67 @@  build_lswitch_rport_arp_req_flows(struct ovn_port *op,
         if (nat_entry_is_v6(nat_entry)) {
             struct in6_addr *addr = &nat_entry->ext_addrs.ipv6_addrs[0].addr;
 
-            if (lrouter_port_ipv6_reachable(op, addr) &&
-                    !sset_contains(&op->od->lb_ips_v6, nat->external_ip)) {
-                ds_put_format(&ips_v6_match, "%s, ", nat->external_ip);
+            if (!sset_contains(&op->od->lb_ips_v6, nat->external_ip)) {
+                if (lrouter_port_ipv6_reachable(op, addr)) {
+                    ds_put_format(&ips_v6_match_reachable, "%s, ",
+                                  nat->external_ip);
+                } else {
+                    ds_put_format(&ips_v6_match_unreachable, "%s, ",
+                                  nat->external_ip);
+                }
             }
         } else {
             ovs_be32 addr = nat_entry->ext_addrs.ipv4_addrs[0].addr;
-
-            if (lrouter_port_ipv4_reachable(op, addr) &&
-                    !sset_contains(&op->od->lb_ips_v4, nat->external_ip)) {
-                ds_put_format(&ips_v4_match, "%s, ", nat->external_ip);
+            if (!sset_contains(&op->od->lb_ips_v4, nat->external_ip)) {
+                if (lrouter_port_ipv4_reachable(op, addr)) {
+                    ds_put_format(&ips_v4_match_reachable, "%s, ",
+                                  nat->external_ip);
+                } else {
+                    ds_put_format(&ips_v4_match_unreachable, "%s, ",
+                                  nat->external_ip);
+                }
             }
         }
     }
 
     for (size_t i = 0; i < op->lrp_networks.n_ipv4_addrs; i++) {
-        ds_put_format(&ips_v4_match, "%s, ",
+        ds_put_format(&ips_v4_match_reachable, "%s, ",
                       op->lrp_networks.ipv4_addrs[i].addr_s);
     }
     for (size_t i = 0; i < op->lrp_networks.n_ipv6_addrs; i++) {
-        ds_put_format(&ips_v6_match, "%s, ",
+        ds_put_format(&ips_v6_match_reachable, "%s, ",
                       op->lrp_networks.ipv6_addrs[i].addr_s);
     }
 
-    if (ds_last(&ips_v4_match) != EOF) {
-        ds_chomp(&ips_v4_match, ' ');
-        ds_chomp(&ips_v4_match, ',');
-        build_lswitch_rport_arp_req_flow_for_ip(&ips_v4_match, AF_INET, sw_op,
-                                                sw_od, 80, lflows,
-                                                stage_hint);
+    if (ds_last(&ips_v4_match_reachable) != EOF) {
+        ds_chomp(&ips_v4_match_reachable, ' ');
+        ds_chomp(&ips_v4_match_reachable, ',');
+        build_lswitch_rport_arp_req_flow_for_reachable_ip(
+            &ips_v4_match_reachable, AF_INET, sw_op, sw_od, 80, lflows,
+            stage_hint);
     }
-    if (ds_last(&ips_v6_match) != EOF) {
-        ds_chomp(&ips_v6_match, ' ');
-        ds_chomp(&ips_v6_match, ',');
-        build_lswitch_rport_arp_req_flow_for_ip(&ips_v6_match, AF_INET6, sw_op,
-                                                sw_od, 80, lflows,
-                                                stage_hint);
+    if (ds_last(&ips_v6_match_reachable) != EOF) {
+        ds_chomp(&ips_v6_match_reachable, ' ');
+        ds_chomp(&ips_v6_match_reachable, ',');
+        build_lswitch_rport_arp_req_flow_for_reachable_ip(
+            &ips_v6_match_reachable, AF_INET6, sw_op, sw_od, 80, lflows,
+            stage_hint);
     }
 
+    if (ds_last(&ips_v4_match_unreachable) != EOF) {
+        ds_chomp(&ips_v4_match_unreachable, ' ');
+        ds_chomp(&ips_v4_match_unreachable, ',');
+        build_lswitch_rport_arp_req_flow_for_unreachable_ip(
+            &ips_v4_match_unreachable, AF_INET, sw_od, 90, lflows,
+            stage_hint);
+    }
+    if (ds_last(&ips_v6_match_unreachable) != EOF) {
+        ds_chomp(&ips_v6_match_unreachable, ' ');
+        ds_chomp(&ips_v6_match_unreachable, ',');
+        build_lswitch_rport_arp_req_flow_for_unreachable_ip(
+            &ips_v6_match_unreachable, AF_INET6, sw_od, 90, lflows,
+            stage_hint);
+    }
     /* Self originated ARP requests/ND need to be flooded as usual.
      *
      * However, if the switch doesn't have any non-router ports we shouldn't
@@ -6728,8 +6786,10 @@  build_lswitch_rport_arp_req_flows(struct ovn_port *op,
     if (sw_od->n_router_ports != sw_od->nbs->n_ports) {
         build_lswitch_rport_arp_req_self_orig_flow(op, 75, sw_od, lflows);
     }
-    ds_destroy(&ips_v4_match);
-    ds_destroy(&ips_v6_match);
+    ds_destroy(&ips_v4_match_reachable);
+    ds_destroy(&ips_v6_match_reachable);
+    ds_destroy(&ips_v4_match_unreachable);
+    ds_destroy(&ips_v6_match_unreachable);
 }
 
 static void
diff --git a/northd/ovn_northd.dl b/northd/ovn_northd.dl
index c9ee742d1..dd6430849 100644
--- a/northd/ovn_northd.dl
+++ b/northd/ovn_northd.dl
@@ -4109,9 +4109,13 @@  Flow(.logical_datapath = sw._uuid,
  * router port.
  * Priority: 80.
  */
-function get_arp_forward_ips(rp: Intern<RouterPort>): (Set<string>, Set<string>) = {
-    var all_ips_v4 = set_empty();
-    var all_ips_v6 = set_empty();
+function get_arp_forward_ips(rp: Intern<RouterPort>):
+    (Set<string>, Set<string>, Set<string>, Set<string>) =
+{
+    var reachable_ips_v4 = set_empty();
+    var reachable_ips_v6 = set_empty();
+    var unreachable_ips_v4 = set_empty();
+    var unreachable_ips_v6 = set_empty();
 
     (var lb_ips_v4, var lb_ips_v6)
         = get_router_load_balancer_ips(rp.router, false);
@@ -4121,7 +4125,9 @@  function get_arp_forward_ips(rp: Intern<RouterPort>): (Set<string>, Set<string>)
          */
         match (ip_parse(a)) {
             Some{ipv4} -> if (lrouter_port_ip_reachable(rp, IPv4{ipv4})) {
-                all_ips_v4.insert(a)
+                reachable_ips_v4.insert(a)
+            } else {
+                unreachable_ips_v4.insert(a)
             },
             _ -> ()
         }
@@ -4132,7 +4138,9 @@  function get_arp_forward_ips(rp: Intern<RouterPort>): (Set<string>, Set<string>)
          */
         match (ipv6_parse(a)) {
             Some{ipv6} -> if (lrouter_port_ip_reachable(rp, IPv6{ipv6})) {
-                all_ips_v6.insert(a)
+                reachable_ips_v6.insert(a)
+            } else {
+                unreachable_ips_v6.insert(a)
             },
             _ -> ()
         }
@@ -4145,22 +4153,45 @@  function get_arp_forward_ips(rp: Intern<RouterPort>): (Set<string>, Set<string>)
              */
             if (lrouter_port_ip_reachable(rp, nat.external_ip)) {
                 match (nat.external_ip) {
-                    IPv4{_} -> all_ips_v4.insert(nat.nat.external_ip),
-                    IPv6{_} -> all_ips_v6.insert(nat.nat.external_ip)
+                    IPv4{_} -> reachable_ips_v4.insert(nat.nat.external_ip),
+                    IPv6{_} -> reachable_ips_v6.insert(nat.nat.external_ip)
+                }
+            } else {
+                match (nat.external_ip) {
+                    IPv4{_} -> unreachable_ips_v4.insert(nat.nat.external_ip),
+                    IPv6{_} -> unreachable_ips_v6.insert(nat.nat.external_ip),
                 }
             }
         }
     };
 
     for (a in rp.networks.ipv4_addrs) {
-        all_ips_v4.insert("${a.addr}")
+        reachable_ips_v4.insert("${a.addr}")
     };
     for (a in rp.networks.ipv6_addrs) {
-        all_ips_v6.insert("${a.addr}")
+        reachable_ips_v6.insert("${a.addr}")
     };
 
-    (all_ips_v4, all_ips_v6)
+    (reachable_ips_v4, reachable_ips_v6, unreachable_ips_v4, unreachable_ips_v6)
 }
+
+relation &SwitchPortARPForwards(
+    port: Intern<SwitchPort>,
+    reachable_ips_v4: Set<string>,
+    reachable_ips_v6: Set<string>,
+    unreachable_ips_v4: Set<string>,
+    unreachable_ips_v6: Set<string>
+)
+
+&SwitchPortARPForwards(.port = port,
+                       .reachable_ips_v4 = reachable_ips_v4,
+                       .reachable_ips_v6 = reachable_ips_v6,
+                       .unreachable_ips_v4 = unreachable_ips_v4,
+                       .unreachable_ips_v6 = unreachable_ips_v6) :-
+    port in &SwitchPort(.peer = Some{rp}),
+    rp.is_enabled(),
+    (var reachable_ips_v4, var reachable_ips_v6, var unreachable_ips_v4, var unreachable_ips_v6) = get_arp_forward_ips(rp).
+
 /* Packets received from VXLAN tunnels have already been through the
  * router pipeline so we should skip them. Normally this is done by the
  * multicast_group implementation (VXLAN packets skip table 32 which
@@ -4172,7 +4203,7 @@  Flow(.logical_datapath = sw._uuid,
      .priority         = 80,
      .__match          = fLAGBIT_NOT_VXLAN() ++
                          " && arp.op == 1 && arp.tpa == { " ++
-                         all_ips_v4.to_vec().join(", ") ++ "}",
+                         ipv4.to_vec().join(", ") ++ "}",
      .actions          = if (sw.has_non_router_port) {
                              "clone {outport = ${sp.json_name}; output; }; "
                              "outport = ${mc_flood_l2}; output;"
@@ -4182,15 +4213,15 @@  Flow(.logical_datapath = sw._uuid,
      .external_ids     = stage_hint(sp.lsp._uuid)) :-
     sp in &SwitchPort(.sw = sw, .peer = Some{rp}),
     rp.is_enabled(),
-    (var all_ips_v4, _) = get_arp_forward_ips(rp),
-    not all_ips_v4.is_empty(),
+    &SwitchPortARPForwards(.port = sp, .reachable_ips_v4 = ipv4),
+    not ipv4.is_empty(),
     var mc_flood_l2 = json_string_escape(mC_FLOOD_L2().0).
 Flow(.logical_datapath = sw._uuid,
      .stage            = s_SWITCH_IN_L2_LKUP(),
      .priority         = 80,
      .__match          = fLAGBIT_NOT_VXLAN() ++
                          " && nd_ns && nd.target == { " ++
-                         all_ips_v6.to_vec().join(", ") ++ "}",
+                         ipv6.to_vec().join(", ") ++ "}",
      .actions          = if (sw.has_non_router_port) {
                              "clone {outport = ${sp.json_name}; output; }; "
                              "outport = ${mc_flood_l2}; output;"
@@ -4200,10 +4231,38 @@  Flow(.logical_datapath = sw._uuid,
      .external_ids     = stage_hint(sp.lsp._uuid)) :-
     sp in &SwitchPort(.sw = sw, .peer = Some{rp}),
     rp.is_enabled(),
-    (_, var all_ips_v6) = get_arp_forward_ips(rp),
-    not all_ips_v6.is_empty(),
+    &SwitchPortARPForwards(.port = sp, .reachable_ips_v6 = ipv6),
+    not ipv6.is_empty(),
     var mc_flood_l2 = json_string_escape(mC_FLOOD_L2().0).
 
+Flow(.logical_datapath = sw._uuid,
+     .stage            = s_SWITCH_IN_L2_LKUP(),
+     .priority         = 90,
+     .__match          = fLAGBIT_NOT_VXLAN() ++
+                        " && arp.op == 1 && arp.tpa == {" ++
+                        ipv4.to_vec().join(", ") ++ "}",
+     .actions          = "outport = ${flood}; output;",
+     .external_ids     = stage_hint(sp.lsp._uuid)) :-
+    sp in &SwitchPort(.sw = sw, .peer = Some{rp}),
+    rp.is_enabled(),
+    &SwitchPortARPForwards(.port = sp, .unreachable_ips_v4 = ipv4),
+    not ipv4.is_empty(),
+    var flood = json_string_escape(mC_FLOOD().0).
+
+Flow(.logical_datapath = sw._uuid,
+     .stage            = s_SWITCH_IN_L2_LKUP(),
+     .priority         = 90,
+     .__match          = fLAGBIT_NOT_VXLAN() ++
+                         " && nd_ns && nd.target == {" ++
+                         ipv6.to_vec().join(", ") ++ "}",
+     .actions          = "outport = ${flood}; output;",
+     .external_ids     = stage_hint(sp.lsp._uuid)) :-
+    sp in &SwitchPort(.sw = sw, .peer = Some{rp}),
+    rp.is_enabled(),
+    &SwitchPortARPForwards(.port = sp, .unreachable_ips_v6 = ipv6),
+    not ipv6.is_empty(),
+    var flood = json_string_escape(mC_FLOOD().0).
+
 for (SwitchPortNewDynamicAddress(.port = &SwitchPort{.lsp = lsp, .json_name = json_name, .sw = sw},
                                  .address = Some{addrs})
      if lsp.__type != "external") {
diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
index 0c8a09ced..f591537e1 100644
--- a/tests/ovn-northd.at
+++ b/tests/ovn-northd.at
@@ -4033,3 +4033,102 @@  check ovn-nbctl --wait=sb clear logical_router_port ro2-sw ha_chassis_group
 check_lflows 0
 
 AT_CLEANUP
+
+OVN_FOR_EACH_NORTHD([
+AT_SETUP([ovn -- ARP flood for unreachable addresses])
+ovn_start
+
+AS_BOX([Setting up the logical network])
+
+# This network is the same as the one from "Router Address Propagation"
+check ovn-nbctl ls-add sw
+
+check ovn-nbctl lr-add ro1
+check ovn-nbctl lrp-add ro1 ro1-sw 00:00:00:00:00:01 10.0.0.1/24
+check ovn-nbctl lsp-add sw sw-ro1
+check ovn-nbctl lsp-set-type sw-ro1 router
+check ovn-nbctl lsp-set-addresses sw-ro1 router
+check ovn-nbctl lsp-set-options sw-ro1 router-port=ro1-sw
+
+check ovn-nbctl lr-add ro2
+check ovn-nbctl lrp-add ro2 ro2-sw 00:00:00:00:00:02 20.0.0.1/24
+check ovn-nbctl lsp-add sw sw-ro2
+check ovn-nbctl lsp-set-type sw-ro2 router
+check ovn-nbctl lsp-set-addresses sw-ro2 router
+check ovn-nbctl --wait=sb lsp-set-options sw-ro2 router-port=ro2-sw
+
+check ovn-nbctl ls-add ls1
+check ovn-nbctl lsp-add ls1 vm1
+check ovn-nbctl lsp-set-addresses vm1 "00:00:00:00:01:02 192.168.1.2"
+check ovn-nbctl lrp-add ro1 ro1-ls1 00:00:00:00:01:01 192.168.1.1/24
+check ovn-nbctl lsp-add ls1 ls1-ro1
+check ovn-nbctl lsp-set-type ls1-ro1 router
+check ovn-nbctl lsp-set-addresses ls1-ro1 router
+check ovn-nbctl lsp-set-options ls1-ro1 router-port=ro1-ls1
+
+check ovn-nbctl ls-add ls2
+check ovn-nbctl lsp-add ls2 vm2
+check ovn-nbctl lsp-set-addresses vm2 "00:00:00:00:02:02 192.168.2.2"
+check ovn-nbctl lrp-add ro2 ro2-ls2 00:00:00:00:02:01 192.168.2.1/24
+check ovn-nbctl lsp-add ls2 ls2-ro2
+check ovn-nbctl lsp-set-type ls2-ro2 router
+check ovn-nbctl lsp-set-addresses ls2-ro2 router
+check ovn-nbctl lsp-set-options ls2-ro2 router-port=ro2-ls2
+
+AS_BOX([Ensure that unreachable flood flows are not installed, since no addresses are unreachable])
+
+AT_CHECK([ovn-sbctl lflow-list sw | grep "ls_in_l2_lkup" | grep "priority=90" -c], [1], [dnl
+0
+])
+
+AS_BOX([Adding some reachable NAT addresses])
+
+check ovn-nbctl lr-nat-add ro1 dnat 10.0.0.100 192.168.1.100
+check ovn-nbctl lr-nat-add ro1 snat 10.0.0.200 192.168.1.200/30
+
+check ovn-nbctl lr-nat-add ro2 dnat 20.0.0.100 192.168.2.100
+check ovn-nbctl --wait=sb lr-nat-add ro2 snat 20.0.0.200 192.168.2.200/30
+
+AS_BOX([Ensure that unreachable flood flows are not installed, since all addresses are reachable])
+
+AT_CHECK([ovn-sbctl lflow-list sw | grep "ls_in_l2_lkup" | grep "priority=90" -c], [1], [dnl
+0
+])
+
+AS_BOX([Adding some unreachable NAT addresses])
+
+check ovn-nbctl lr-nat-add ro1 dnat 30.0.0.100 192.168.1.130
+check ovn-nbctl lr-nat-add ro1 snat 30.0.0.200 192.168.1.148/30
+
+check ovn-nbctl lr-nat-add ro2 dnat 40.0.0.100 192.168.2.130
+check ovn-nbctl --wait=sb lr-nat-add ro2 snat 40.0.0.200 192.168.2.148/30
+
+AS_BOX([Ensure that unreachable flood flows are installed, since there are unreachable addresses])
+
+ovn-sbctl lflow-list
+
+# We expect two flows to be installed, one per connected router port on sw
+AT_CHECK([ovn-sbctl lflow-list sw | grep ls_in_l2_lkup | grep priority=90 -c], [0], [dnl
+2
+])
+
+# We expect that the installed flows will match the unreachable DNAT addresses only.
+AT_CHECK([ovn-sbctl lflow-list sw | grep ls_in_l2_lkup | grep priority=90 | grep "arp.tpa == {30.0.0.100}" -c], [0], [dnl
+1
+])
+
+AT_CHECK([ovn-sbctl lflow-list sw | grep ls_in_l2_lkup | grep priority=90 | grep "arp.tpa == {40.0.0.100}" -c], [0], [dnl
+1
+])
+
+# Ensure that we do not create flows for SNAT addresses
+AT_CHECK([ovn-sbctl lflow-list sw | grep ls_in_l2_lkup | grep priority=90 | grep "arp.tpa == {30.0.0.200}" -c], [1], [dnl
+0
+])
+
+AT_CHECK([ovn-sbctl lflow-list sw | grep ls_in_l2_lkup | grep priority=90 | grep "arp.tpa == {40.0.0.200}" -c], [1], [dnl
+0
+])
+
+AT_CLEANUP
+])
diff --git a/tests/system-ovn.at b/tests/system-ovn.at
index a5f10ce38..bdd13b009 100644
--- a/tests/system-ovn.at
+++ b/tests/system-ovn.at
@@ -6482,3 +6482,105 @@  OVS_TRAFFIC_VSWITCHD_STOP(["/.*error receiving.*/d
 
 AT_CLEANUP
 ])
+
+OVN_FOR_EACH_NORTHD([
+AT_SETUP([ovn -- Floating IP outside router subnet IPv4])
+AT_KEYWORDS(NAT)
+
+ovn_start
+
+OVS_TRAFFIC_VSWITCHD_START()
+ADD_BR([br-int])
+
+# Set external-ids in br-int needed for ovn-controller
+ovs-vsctl \
+        -- set Open_vSwitch . external-ids:system-id=hv1 \
+        -- set Open_vSwitch . external-ids:ovn-remote=unix:$ovs_base/ovn-sb/ovn-sb.sock \
+        -- set Open_vSwitch . external-ids:ovn-encap-type=geneve \
+        -- set Open_vSwitch . external-ids:ovn-encap-ip=169.0.0.1 \
+        -- set bridge br-int fail-mode=secure other-config:disable-in-band=true
+
+start_daemon ovn-controller
+
+# Logical network:
+# Two VMs
+#   * VM1 with IP address 192.168.100.5
+#   * VM2 with IP address 192.168.200.5
+#
+# VM1 connects to logical switch ls1. ls1 connects to logical router lr1.
+# VM2 connects to logical switch ls2. ls2 connects to logical router lr2.
+# lr1 and lr2 both connect to logical switch ls-pub.
+# * lr1's interface that connects to ls-pub has IP address 172.18.2.110/24
+# * lr2's interface that connects to ls-pub has IP address 172.18.1.173/24
+#
+# lr1 has the following attributes:
+#   * It has a DNAT rule that translates 172.18.2.11 to 192.168.100.5 (VM1)
+#
+# lr2 has the following attributes:
+#   * It has a DNAT rule that translates 172.18.2.12 to 192.168.200.5 (VM2)
+#
+# In this test, we want to ensure that a ping from VM1 to IP address 172.18.2.12 reaches VM2.
+# When the NAT rules are set up, there should be MAC_Bindings created that allow for traffic
+# to exit lr1, go through ls-pub, and reach the NAT external IP configured on lr2.
+
+check ovn-nbctl ls-add ls1
+check ovn-nbctl lsp-add ls1 vm1 -- lsp-set-addresses vm1 "00:00:00:00:01:05 192.168.100.5"
+
+check ovn-nbctl ls-add ls2
+check ovn-nbctl lsp-add ls2 vm2 -- lsp-set-addresses vm2 "00:00:00:00:02:05 192.168.200.5"
+
+check ovn-nbctl ls-add ls-pub
+
+check ovn-nbctl lr-add lr1
+check ovn-nbctl lrp-add lr1 lr1-ls1 00:00:00:00:01:01 192.168.100.1/24
+check ovn-nbctl lsp-add ls1 ls1-lr1                      \
+    -- lsp-set-type ls1-lr1 router                 \
+    -- lsp-set-addresses ls1-lr1 router            \
+    -- lsp-set-options ls1-lr1 router-port=lr1-ls1
+
+check ovn-nbctl lr-add lr2
+check ovn-nbctl lrp-add lr2 lr2-ls2 00:00:00:00:02:01 192.168.200.1/24
+check ovn-nbctl lsp-add ls2 ls2-lr2                      \
+    -- lsp-set-type ls2-lr2 router                 \
+    -- lsp-set-addresses ls2-lr2 router            \
+    -- lsp-set-options ls2-lr2 router-port=lr2-ls2
+
+check ovn-nbctl lrp-add lr1 lr1-ls-pub 00:00:00:00:03:01 172.18.2.110/24
+check ovn-nbctl lrp-set-gateway-chassis lr1-ls-pub hv1
+check ovn-nbctl lsp-add ls-pub ls-pub-lr1                      \
+    -- lsp-set-type ls-pub-lr1 router                    \
+    -- lsp-set-addresses ls-pub-lr1 router               \
+    -- lsp-set-options ls-pub-lr1 router-port=lr1-ls-pub
+
+check ovn-nbctl lrp-add lr2 lr2-ls-pub 00:00:00:00:03:02 172.18.1.173/24
+check ovn-nbctl lrp-set-gateway-chassis lr2-ls-pub hv1
+check ovn-nbctl lsp-add ls-pub ls-pub-lr2                      \
+    -- lsp-set-type ls-pub-lr2 router                    \
+    -- lsp-set-addresses ls-pub-lr2 router               \
+    -- lsp-set-options ls-pub-lr2 router-port=lr2-ls-pub
+
+# Putting --add-route on these NAT rules means there is no need to
+# add any static routes.
+check ovn-nbctl --add-route lr-nat-add lr1 dnat_and_snat 172.18.2.11 192.168.100.5 vm1 00:00:00:00:03:01
+check ovn-nbctl --add-route lr-nat-add lr2 dnat_and_snat 172.18.2.12 192.168.200.5 vm2 00:00:00:00:03:02
+
+ADD_NAMESPACES(vm1)
+ADD_VETH(vm1, vm1, br-int, "192.168.100.5/24", "00:00:00:00:01:05", \
+         "192.168.100.1")
+
+ADD_NAMESPACES(vm2)
+ADD_VETH(vm2, vm2, br-int, "192.168.200.5/24", "00:00:00:00:02:05", \
+         "192.168.200.1")
+
+OVN_POPULATE_ARP
+check ovn-nbctl --wait=hv sync
+
+AS_BOX([Testing a ping])
+
+NS_CHECK_EXEC([vm1], [ping -q -c 3 -i 0.3 -w 2 172.18.2.12 | FORMAT_PING], \
+[0], [dnl
+3 packets transmitted, 3 received, 0% packet loss, time 0ms
+])
+
+AT_CLEANUP
+])