[ovs-dev,v9,2/2] OVN: Enable N-S Traffic, Vlan backed DVR

Message ID 1559175728-127062-3-git-send-email-ankur.sharma@nutanix.com
State Superseded
Series
  • OVN: Distributed Virtual Router for Vlan Backed Networks

Commit Message

Ankur Sharma May 30, 2019, 12:20 a.m. UTC
Background:
[1] https://mail.openvswitch.org/pipermail/ovs-dev/2018-October/353066.html
[2] https://docs.google.com/document/d/1uoQH478wM1OZ16HrxzbOUvk5LvFnfNEWbkPT6Zmm9OU/edit?usp=sharing

This Series:
Layer 2, Layer 3 E-W and Layer 3 N-S (NO NAT) changes for vlan
backed distributed logical router.

This patch:
For North-South traffic, we need a chassis that will respond to
ARP requests for the router port coming from outside. For this purpose,
we rely on the gateway-chassis construct in OVN: on a logical
router port, we associate one or more chassis as gateway chassis.

One of these chassis is active at any given time and becomes the
entry point for traffic that comes from the outside network and is
bound for endpoints behind the logical router (North to South).

This patch makes some enhancements to the gateway chassis implementation
to handle the above use case.

A.
Do not replace router port mac with chassis mac on gateway
chassis.
This is done because:
    i. Chassisredirect port is NOT a distributed port, hence
       we need not replace its mac address
       (which is the same as the router port mac).

   ii. ARP cache will be consistent everywhere, i.e. just as
       endpoints on OVN chassis see the configured router port
       mac as the resolved mac for the router port ip, outside
       endpoints will see it as well.

  iii. Network Address Translation. NAT is not part of this
       series, but a follow-up series will add it, and that
       approach relies on sending packets to the redirect
       chassis using the chassisredirect router port mac as
       the destination mac.

B.
Advertise router port GARP on gateway chassis.
This is needed especially when a failover happens and the
chassisredirect port moves to a new gateway chassis.
Otherwise, packets would be dropped until the outside
router ARPs for the router port ip again.

The intention of this GARP is to update the top-of-rack (TOR) switch
so that it directs the router port mac to the new hypervisor.

We could have done the same using RARP, but because
ovn-controller already has an implementation for GARP,
it did not seem worthwhile to add a RARP implementation
just for this.
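
The repeating GARP schedule can be illustrated with a small standalone
sketch. This is not code from the patch: struct garp_sched and
garp_next_announce() are hypothetical names, and the 16-second backoff cap
is an assumption, since the condition guarding the doubling is not visible
in the hunk. Only the doubling step and the is_repeat/repeat_interval
handling mirror the send_garp() change.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Assumption: the backoff doubles until it reaches this cap (seconds).
 * The actual cap is not shown in the patch hunk. */
#define GARP_MAX_BACKOFF_SECS 16

/* Hypothetical, trimmed-down version of the scheduling fields. */
struct garp_sched {
    int backoff;                   /* Current backoff, in seconds. */
    bool is_repeat;                /* Start a new burst after the last one. */
    long long int repeat_interval; /* Pause between bursts, in ms. */
};

/* Returns the time (in ms) of the next announcement, given that one was
 * just sent at 'now'. LLONG_MAX means "never again". */
static long long int
garp_next_announce(struct garp_sched *g, long long int now)
{
    assert(g->backoff > 0);

    if (g->backoff < GARP_MAX_BACKOFF_SECS) {
        /* Still inside a burst: double the backoff. */
        g->backoff *= 2;
        return now + g->backoff * 1000;
    }
    if (g->is_repeat) {
        /* Burst finished: reset and schedule the next burst. */
        g->backoff = 1;
        return now + g->repeat_interval;
    }
    return LLONG_MAX;
}
```

With is_repeat set, a finished burst is followed by a new one after
repeat_interval; without it, the entry goes quiet, as VIF GARPs do today.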

C.
For South to North traffic, we need not pass through the gateway
chassis if no address translation is needed.

For overlay networks, NAT is required to talk to outside networks.
However, for vlan backed networks, NAT is not required, and hence
in the absence of NAT configuration we need not redirect the packet
to the gateway chassis.
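
As a rough sketch of the rule in C, under simplifying assumptions: enum
nw_type and needs_gateway_redirect() are hypothetical names, and the sketch
ignores the distributed NAT and unresolved-destination cases that the real
logical flows also handle.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical network types; not the enum used by ovn-northd. */
enum nw_type {
    NW_OVERLAY,
    NW_BRIDGED
};

/* Decides whether South to North traffic has to be redirected to the
 * gateway chassis before leaving the logical router. */
static bool
needs_gateway_redirect(enum nw_type type, bool nat_configured)
{
    assert(type == NW_OVERLAY || type == NW_BRIDGED);

    if (type == NW_OVERLAY) {
        /* Overlay networks always go through the gateway chassis. */
        return true;
    }
    /* VLAN backed networks only need the gateway chassis for NAT. */
    return nat_configured;
}
```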

Signed-off-by: Ankur Sharma <ankur.sharma@nutanix.com>
---
 ovn/controller/physical.c  |  24 +-
 ovn/controller/pinctrl.c   | 205 +++++++++++--
 ovn/controller/pinctrl.h   |   6 +
 ovn/lib/ovn-util.c         |  31 ++
 ovn/lib/ovn-util.h         |   6 +
 ovn/northd/ovn-northd.c    |  43 ++-
 ovn/ovn-architecture.7.xml |  87 +++++-
 tests/ovn.at               | 732 ++++++++++++++++++++++++++++++++++++++++++++-
 8 files changed, 1090 insertions(+), 44 deletions(-)

Comments

Numan Siddique June 3, 2019, 10:05 a.m. UTC | #1
Hi Ankur,

Please see some comments inline. Please note that I haven't had the chance
to look into the code in detail. I am first trying to test out the patches.
(I am on PTO. Expect some delay in my replies.)



On Thu, May 30, 2019 at 5:58 AM Ankur Sharma <ankur.sharma@nutanix.com>
wrote:

> diff --git a/ovn/controller/physical.c b/ovn/controller/physical.c
> index af587a5..1ab5968 100644
> --- a/ovn/controller/physical.c
> +++ b/ovn/controller/physical.c
> @@ -21,6 +21,7 @@
>  #include "lflow.h"
>  #include "lport.h"
>  #include "chassis.h"
> +#include "pinctrl.h"
>  #include "lib/bundle.h"
>  #include "openvswitch/poll-loop.h"
>  #include "lib/uuid.h"
> @@ -238,9 +239,12 @@ get_zone_ids(const struct sbrec_port_binding *binding,
>  }
>
>  static void
> -put_replace_router_port_mac_flows(const struct
> +put_replace_router_port_mac_flows(struct ovsdb_idl_index
> +                                  *sbrec_port_binding_by_name,
> +                                  const struct
>                                    sbrec_port_binding *localnet_port,
>                                    const struct sbrec_chassis *chassis,
> +                                  const struct sset *active_tunnels,
>                                    const struct hmap *local_datapaths,
>                                    struct ofpbuf *ofpacts_p,
>                                    ofp_port_t ofport,
> @@ -281,8 +285,21 @@ put_replace_router_port_mac_flows(const struct
>          char *err_str = NULL;
>          struct match match;
>          struct ofpact_mac *replace_mac;
> +        char *cr_peer_name = xasprintf("cr-%s", rport_binding->logical_port);
>
> -        /* Table 65, priority 150.
> +
> +        if (pinctrl_is_chassis_resident(sbrec_port_binding_by_name,
> +                                        chassis, active_tunnels,
> +                                        cr_peer_name)) {
> +            /* If a router port's chassisredirect port is
> +             * resident on this chassis, then we need not do mac replace. */
> +            free(cr_peer_name);
> +            continue;
> +        }
> +
> +        free(cr_peer_name);
> +
> +       /* Table 65, priority 150.
>           * =======================
>           *
>           * Implements output to localnet port.
> @@ -797,7 +814,8 @@ consider_port_binding(struct ovsdb_idl_index *sbrec_port_binding_by_name,
>                          &match, ofpacts_p, &binding->header_.uuid);
>
>          if (!strcmp(binding->type, "localnet")) {
> -            put_replace_router_port_mac_flows(binding, chassis,
> +            put_replace_router_port_mac_flows(sbrec_port_binding_by_name,
> +                                              binding, chassis, active_tunnels,
>                                                local_datapaths, ofpacts_p,
>                                                ofport, flow_table);
>          }
> diff --git a/ovn/controller/pinctrl.c b/ovn/controller/pinctrl.c
> index b7bb4c9..a145867 100644
> --- a/ovn/controller/pinctrl.c
> +++ b/ovn/controller/pinctrl.c
> @@ -226,6 +226,8 @@ static bool may_inject_pkts(void);
>  COVERAGE_DEFINE(pinctrl_drop_put_mac_binding);
>  COVERAGE_DEFINE(pinctrl_drop_buffered_packets_map);
>
> +#define GARP_DEF_REPEAT_INTERVAL_MS   (3 * 60 * 1000) /* 3 minutes */
> +
>  void
>  pinctrl_init(void)
>  {
> @@ -242,6 +244,25 @@ pinctrl_init(void)
>                                                  &pinctrl);
>  }
>
> +bool
> +pinctrl_is_chassis_resident(struct ovsdb_idl_index *sbrec_port_binding_by_name,
> +                            const struct sbrec_chassis *chassis,
> +                            const struct sset *active_tunnels,
> +                            const char *port_name)
> +{
> +    const struct sbrec_port_binding *pb
> +        = lport_lookup_by_name(sbrec_port_binding_by_name, port_name);
> +    if (!pb || !pb->chassis) {
> +        return false;
> +    }
> +    if (strcmp(pb->type, "chassisredirect")) {
> +        return pb->chassis == chassis;
> +    } else {
> +        return ha_chassis_group_is_active(pb->ha_chassis_group,
> +                                          active_tunnels, chassis);
> +    }
> +}
> +
>  static ovs_be32
>  queue_msg(struct rconn *swconn, struct ofpbuf *msg)
>  {
> @@ -2548,6 +2569,8 @@ struct garp_data {
>      int backoff;                 /* Backoff for the next announcement. */
>      uint32_t dp_key;             /* Datapath used to output this GARP. */
>      uint32_t port_key;           /* Port to inject the GARP into. */
> +    bool is_repeat;              /* Send GARPs continously */
> +    long long int repeat_interval; /* Interval between GARP bursts in ms */
>  };
>
>  /* Contains GARPs to be sent. Protected by pinctrl_mutex*/
> @@ -2568,7 +2591,8 @@ destroy_send_garps(void)
>  /* Runs with in the main ovn-controller thread context. */
>  static void
>  add_garp(const char *name, const struct eth_addr ea, ovs_be32 ip,
> -         uint32_t dp_key, uint32_t port_key)
> +         uint32_t dp_key, uint32_t port_key, bool is_repeat,
> +         long long int repeat_interval)
>  {
>      struct garp_data *garp = xmalloc(sizeof *garp);
>      garp->ea = ea;
> @@ -2577,6 +2601,8 @@ add_garp(const char *name, const struct eth_addr ea, ovs_be32 ip,
>      garp->backoff = 1;
>      garp->dp_key = dp_key;
>      garp->port_key = port_key;
> +    garp->is_repeat = is_repeat;
> +    garp->repeat_interval = repeat_interval;
>      shash_add(&send_garp_data, name, garp);
>
>      /* Notify pinctrl_handler so that it can wakeup and process
> @@ -2586,7 +2612,8 @@ add_garp(const char *name, const struct eth_addr ea, ovs_be32 ip,
>
>  /* Add or update a vif for which GARPs need to be announced. */
>  static void
> -send_garp_update(const struct sbrec_port_binding *binding_rec,
> +send_garp_update(struct ovsdb_idl_index *sbrec_port_binding_by_name,
> +                 const struct sbrec_port_binding *binding_rec,
>                   struct shash *nat_addresses)
>  {
>      volatile struct garp_data *garp = NULL;
> @@ -2611,7 +2638,7 @@ send_garp_update(const struct sbrec_port_binding *binding_rec,
>                      add_garp(name, laddrs->ea,
>                               laddrs->ipv4_addrs[i].addr,
>                               binding_rec->datapath->tunnel_key,
> -                             binding_rec->tunnel_key);
> +                             binding_rec->tunnel_key, false, 0);
>                  }
>                  free(name);
>              }
> @@ -2621,6 +2648,64 @@ send_garp_update(const struct sbrec_port_binding *binding_rec,
>          return;
>      }
>
> +    /* Update GARPs for local chassisredirect port, if the peer
> +     * layer 2 switch is of type vlan.
> +     */
> +    if (!strcmp(binding_rec->type, "chassisredirect")) {
> +        struct eth_addr mac;
> +        ovs_be32 ip, mask;
> +        uint32_t dp_key = 0;
> +        uint32_t port_key = 0;
> +        const struct sbrec_port_binding *peer_port = NULL;
> +        const struct sbrec_port_binding *distributed_port = NULL;
> +
> +        if (!ovn_sbrec_get_port_binding_ip_mac(binding_rec, &mac,
> +                                               &ip, &mask)) {
> +            /* Router Port binding without ip and mac configured. */
> +            static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 1);
> +            VLOG_WARN_RL(&rl, "cannot send garp, router port binding: %s, "
> +                         "does not have proper ip,mac values: %s",
> +                         binding_rec->logical_port, *binding_rec->mac);
> +            return;
> +        }
> +
> +        const char *lrp_name = smap_get(&binding_rec->options,
> +                                        "distributed-port");
> +        ovs_assert(lrp_name);
> +
> +        distributed_port = lport_lookup_by_name(sbrec_port_binding_by_name,
> +                                                lrp_name);
> +        ovs_assert(distributed_port);
> +
> +        const char *peer_name = smap_get(&distributed_port->options, "peer");
> +        ovs_assert(peer_name);
> +
> +        peer_port = lport_lookup_by_name(sbrec_port_binding_by_name,
> +                                         peer_name);
> +        ovs_assert(peer_port);
> +
> +        const char *network_type = smap_get(&peer_port->datapath->external_ids,
> +                                            "network-type");
> +
> +        /* Advertise GARP only of logical switch is of type bridged. */
> +        if (!network_type || strcmp(network_type, "bridged")) {
> +            return;
> +        }
> +
> +        dp_key = peer_port->datapath->tunnel_key;
> +        port_key = peer_port->tunnel_key;
> +
> +        garp = shash_find_data(&send_garp_data, binding_rec->logical_port);
> +        if (garp) {
> +            garp->dp_key = dp_key;
> +            garp->port_key = port_key;
> +        } else {
> +            add_garp(binding_rec->logical_port, mac, ip,
> +                     dp_key, port_key, true, GARP_DEF_REPEAT_INTERVAL_MS);
> +        }
> +        return;
> +    }
> +
>      /* Update GARP for vif if it exists. */
>      garp = shash_find_data(&send_garp_data, binding_rec->logical_port);
>      if (garp) {
> @@ -2640,7 +2725,8 @@ send_garp_update(const struct sbrec_port_binding *binding_rec,
>
>          add_garp(binding_rec->logical_port,
>                   laddrs.ea, laddrs.ipv4_addrs[0].addr,
> -                 binding_rec->datapath->tunnel_key, binding_rec->tunnel_key);
> +                 binding_rec->datapath->tunnel_key, binding_rec->tunnel_key,
> +                 false, 0);
>
>          destroy_lport_addresses(&laddrs);
>          break;
> @@ -2702,7 +2788,12 @@ send_garp(struct rconn *swconn, struct garp_data *garp,
>          garp->backoff *= 2;
>          garp->announce_time = current_time + garp->backoff * 1000;
>      } else {
> -        garp->announce_time = LLONG_MAX;
> +        if (garp->is_repeat) {
> +            garp->backoff = 1;
> +            garp->announce_time = current_time + garp->repeat_interval;
> +        } else {
> +            garp->announce_time = LLONG_MAX;
> +        }
>      }
>      return garp->announce_time;
>  }
> @@ -2786,25 +2877,6 @@ get_localnet_vifs_l3gwports(
>      sbrec_port_binding_index_destroy_row(target);
>  }
>
> -static bool
> -pinctrl_is_chassis_resident(struct ovsdb_idl_index *sbrec_port_binding_by_name,
> -                            const struct sbrec_chassis *chassis,
> -                            const struct sset *active_tunnels,
> -                            const char *port_name)
> -{
> -    const struct sbrec_port_binding *pb
> -        = lport_lookup_by_name(sbrec_port_binding_by_name, port_name);
> -    if (!pb || !pb->chassis) {
> -        return false;
> -    }
> -    if (strcmp(pb->type, "chassisredirect")) {
> -        return pb->chassis == chassis;
> -    } else {
> -        return ha_chassis_group_is_active(pb->ha_chassis_group,
> -                                          active_tunnels, chassis);
> -    }
> -}
> -
>  /* Extracts the mac, IPv4 and IPv6 addresses, and logical port from
>   * 'addresses' which should be of the format 'MAC [IP1 IP2 ..]
>   * [is_chassis_resident("LPORT_NAME")]', where IPn should be a valid IPv4
> @@ -2946,6 +3018,67 @@ get_nat_addresses_and_keys(struct ovsdb_idl_index *sbrec_port_binding_by_name,
>  }
>
>  static void
> +get_local_cr_ports(struct ovsdb_idl_index *sbrec_port_binding_by_name,
> +                   struct sset *local_cr_ports,
> +                   struct sset *local_l3gw_ports,
> +                   const struct sbrec_chassis *chassis,
> +                   const struct sset *active_tunnels)
> +{
> +    const char *gw_port;
> +    SSET_FOR_EACH (gw_port, local_l3gw_ports) {
> +        const struct sbrec_port_binding *binding_rec;
> +
> +        binding_rec = lport_lookup_by_name(sbrec_port_binding_by_name,
> +                                           gw_port);
> +        if (!binding_rec) {
> +            continue;
> +        }
> +
> +        /* For the patch port we will add send garp for peer's ip and mac. */
> +        if (!strcmp(binding_rec->type, "patch")) {
> +            const struct sbrec_port_binding *cr_port = NULL;
> +
> +            bool is_cr_resident;
> +            struct eth_addr mac;
> +            ovs_be32 ip, mask;
> +
> +            const char *peer_name = smap_get(&binding_rec->options, "peer");
> +            ovs_assert(peer_name);
> +
> +            char *cr_peer_name = xasprintf("cr-%s", peer_name);
> +            cr_port = lport_lookup_by_name(sbrec_port_binding_by_name,
> +                                           cr_peer_name);
> +            free(cr_peer_name);
> +
> +            if (!cr_port) {
> +                continue;
> +            }
> +
> +            is_cr_resident = pinctrl_is_chassis_resident
> +                                (sbrec_port_binding_by_name,
> +                                 chassis,
> +                                 active_tunnels,
> +                                 cr_port->logical_port);
> +            if (!is_cr_resident) {
> +                continue;
> +            }
> +
> +            if (!ovn_sbrec_get_port_binding_ip_mac(cr_port, &mac, &ip,
> +                                                   &mask)) {
> +                /* Router Port binding without ip and mac configured. */
> +                static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 1);
> +                VLOG_WARN_RL(&rl, "cannot send garp, router port binding: %s, "
> +                             "does not have proper ip,mac values: %s",
> +                              cr_port->logical_port, *cr_port->mac);
> +                return;
> +            }
> +
> +            sset_add(local_cr_ports, cr_port->logical_port);
> +        }
> +    }
> +}
> +
> +static void
>  send_garp_wait(long long int send_garp_time)
>  {
>      /* Set the poll timer for next garp only if there is garp data to
> @@ -2990,6 +3123,8 @@ send_garp_prepare(struct ovsdb_idl_index *sbrec_port_binding_by_datapath,
>  {
>      struct sset localnet_vifs = SSET_INITIALIZER(&localnet_vifs);
>      struct sset local_l3gw_ports = SSET_INITIALIZER(&local_l3gw_ports);
> +    struct sset local_cr_ports = SSET_INITIALIZER(&local_cr_ports);
> +
>      struct sset nat_ip_keys = SSET_INITIALIZER(&nat_ip_keys);
>      struct shash nat_addresses;
>
> @@ -3004,11 +3139,17 @@ send_garp_prepare(struct ovsdb_idl_index *sbrec_port_binding_by_datapath,
>                                 &nat_ip_keys, &local_l3gw_ports,
>                                 chassis, active_tunnels,
>                                 &nat_addresses);
> +
> +    get_local_cr_ports(sbrec_port_binding_by_name,
> +                       &local_cr_ports, &local_l3gw_ports,
> +                       chassis, active_tunnels);
> +
>      /* For deleted ports and deleted nat ips, remove from send_garp_data. */
>      struct shash_node *iter, *next;
>      SHASH_FOR_EACH_SAFE (iter, next, &send_garp_data) {
>          if (!sset_contains(&localnet_vifs, iter->name) &&
> -            !sset_contains(&nat_ip_keys, iter->name)) {
> +            !sset_contains(&nat_ip_keys, iter->name) &&
> +            !sset_contains(&local_cr_ports, iter->name)) {
>              send_garp_delete(iter->name);
>          }
>      }
> @@ -3019,7 +3160,7 @@ send_garp_prepare(struct ovsdb_idl_index *sbrec_port_binding_by_datapath,
>          const struct sbrec_port_binding *pb = lport_lookup_by_name(
>              sbrec_port_binding_by_name, iface_id);
>          if (pb) {
> -            send_garp_update(pb, &nat_addresses);
> +            send_garp_update(sbrec_port_binding_by_name, pb, &nat_addresses);
>          }
>      }
>
> @@ -3029,7 +3170,17 @@ send_garp_prepare(struct ovsdb_idl_index *sbrec_port_binding_by_datapath,
>          const struct sbrec_port_binding *pb
>              = lport_lookup_by_name(sbrec_port_binding_by_name, gw_port);
>          if (pb) {
> -            send_garp_update(pb, &nat_addresses);
> +            send_garp_update(sbrec_port_binding_by_name, pb, &nat_addresses);
> +        }
> +    }
> +
> +    /* Update send_garp_data for chassisredirect router ports. */
> +    const char *cr_port;
> +    SSET_FOR_EACH (cr_port, &local_cr_ports) {
> +        const struct sbrec_port_binding *pb
> +            = lport_lookup_by_name(sbrec_port_binding_by_name, cr_port);
> +        if (pb) {
> +            send_garp_update(sbrec_port_binding_by_name, pb, &nat_addresses);
>          }
>      }
>
> diff --git a/ovn/controller/pinctrl.h b/ovn/controller/pinctrl.h
> index f61d705..92f704e 100644
> --- a/ovn/controller/pinctrl.h
> +++ b/ovn/controller/pinctrl.h
> @@ -44,4 +44,10 @@ void pinctrl_run(struct ovsdb_idl_txn *ovnsb_idl_txn,
>  void pinctrl_wait(struct ovsdb_idl_txn *ovnsb_idl_txn);
>  void pinctrl_destroy(void);
>
> +bool
> +pinctrl_is_chassis_resident(struct ovsdb_idl_index *sbrec_port_binding_by_name,
> +                            const struct sbrec_chassis *chassis,
> +                            const struct sset *active_tunnels,
> +                            const char *port_name);
> +
>  #endif /* ovn/pinctrl.h */
> diff --git a/ovn/lib/ovn-util.c b/ovn/lib/ovn-util.c
> index 0f07d80..3d0ad8e 100644
> --- a/ovn/lib/ovn-util.c
> +++ b/ovn/lib/ovn-util.c
> @@ -16,6 +16,7 @@
>  #include "ovn-util.h"
>  #include "dirs.h"
>  #include "openvswitch/vlog.h"
> +#include "openvswitch/ofp-parse.h"
>  #include "ovn/lib/ovn-nb-idl.h"
>  #include "ovn/lib/ovn-sb-idl.h"
>
> @@ -371,3 +372,33 @@ ovn_logical_flow_hash(const struct uuid *logical_datapath,
>      hash = hash_string(match, hash);
>      return hash_string(actions, hash);
>  }
> +
> +/*  Extracts the mac, ip and mask for a sbrec_port_binding.
> + *
> + *  Expects following format:
> + *  "MAC_ADDRESS IP/MASK"
> + *
> + *  Return true if MAC, IP and MASK are found, false otherwise.
> + */
> +bool
> +ovn_sbrec_get_port_binding_ip_mac(const struct sbrec_port_binding *binding,
> +                                  struct eth_addr *mac,
> +                                  ovs_be32 *ip, ovs_be32 *mask)
> +{
> +    char *err_str = NULL;
> +
> +    err_str = str_to_mac(binding->mac[0], mac);
> +    if (err_str) {
> +        free(err_str);
> +        return false;
> +    }
> +
> +    err_str = ip_parse_masked(binding->mac[0] + ETH_ADDR_STRLEN + 1,
> +                              ip, mask);
> +    if (err_str) {
> +        free(err_str);
> +        return false;
> +    }
> +
> +    return true;
> +}
> diff --git a/ovn/lib/ovn-util.h b/ovn/lib/ovn-util.h
> index 6d5e1df..c01595a 100644
> --- a/ovn/lib/ovn-util.h
> +++ b/ovn/lib/ovn-util.h
> @@ -19,6 +19,7 @@
>  #include "lib/packets.h"
>
>  struct nbrec_logical_router_port;
> +struct sbrec_port_binding;
>  struct sbrec_logical_flow;
>  struct uuid;
>
> @@ -81,4 +82,9 @@ uint32_t ovn_logical_flow_hash(const struct uuid *logical_datapath,
>                                 uint16_t priority,
>                                 const char *match, const char *actions);
>
> +bool
> +ovn_sbrec_get_port_binding_ip_mac(const struct sbrec_port_binding *binding,
> +                                  struct eth_addr *mac, ovs_be32 *ip,
> +                                  ovs_be32 *mask);
> +
>  #endif
> diff --git a/ovn/northd/ovn-northd.c b/ovn/northd/ovn-northd.c
> index 74d3692..6835910 100644
> --- a/ovn/northd/ovn-northd.c
> +++ b/ovn/northd/ovn-northd.c
> @@ -5914,6 +5914,20 @@ build_lrouter_flows(struct hmap *datapaths, struct hmap *ports,
>                      ds_put_format(&match, " && is_chassis_resident(%s)",
>                                    op->od->l3redirect_port->json_key);
>                  }
> +            } else if (op->peer &&
> +                       op->peer->od->network_type == DP_NETWORK_BRIDGED) {
> +                /* For a router port connected to bridged logical switch,
> +                 * we will always have the is_chassis_resident check.
> +                 * This is because there could be vm/server on vlan network,
> +                 * but not on OVN chassis and could end up arping for router
> +                 * port ip.
> +                 *
> +                 * This check works on the assumption that for OVN chassis,
> +                 * VMs logical switch ARP responder will respond to ARP
> +                 * requests for router port IP.
> +                 */
> +                ds_put_format(&match, " && is_chassis_resident(\"cr-%s\")",
> +                              op->key);
>              }
>
>              ds_clear(&actions);
> @@ -7365,18 +7379,23 @@ build_lrouter_flows(struct hmap *datapaths, struct hmap *ports,
>              ovn_lflow_add(lflows, od, S_ROUTER_IN_GW_REDIRECT, 300,
>                            REGBIT_DISTRIBUTED_NAT" == 1", "next;");
>
> -            /* For traffic with outport == l3dgw_port, if the
> -             * packet did not match any higher priority redirect
> -             * rule, then the traffic is redirected to the central
> -             * instance of the l3dgw_port. */
> -            ds_clear(&match);
> -            ds_put_format(&match, "outport == %s",
> -                          od->l3dgw_port->json_key);
> -            ds_clear(&actions);
> -            ds_put_format(&actions, "outport = %s; next;",
> -                          od->l3redirect_port->json_key);
> -            ovn_lflow_add(lflows, od, S_ROUTER_IN_GW_REDIRECT, 50,
> -                          ds_cstr(&match), ds_cstr(&actions));
> +            /* For VLAN backed networks, default match will not redirect to
> +             * chassis redirect port. */
> +            if (od->l3dgw_port->peer &&
> +                od->l3dgw_port->peer->od->network_type == DP_NETWORK_OVERLAY) {
> +                /* For traffic with outport == l3dgw_port, if the
> +                 * packet did not match any higher priority redirect
> +                 * rule, then the traffic is redirected to the central
> +                 * instance of the l3dgw_port. */
> +                ds_clear(&match);
> +                ds_put_format(&match, "outport == %s",
> +                              od->l3dgw_port->json_key);
> +                ds_clear(&actions);
> +                ds_put_format(&actions, "outport = %s; next;",
> +                              od->l3redirect_port->json_key);
> +                ovn_lflow_add(lflows, od, S_ROUTER_IN_GW_REDIRECT, 50,
> +                              ds_cstr(&match), ds_cstr(&actions));
> +            }
>
>
It looks like this code has some side effects.


Point 1.
======
For my public switch, if I don't set the network_type to "bridged",
I see the logical flows below, which I think is as expected. I think
that is also why the packets were tunneled to the gw chassis in my v7
tests (as you mentioned in the reply).

****
  table=12(lr_in_gw_redirect  ), priority=300  , match=(reg9[2] == 1), action=(next;)
  table=12(lr_in_gw_redirect  ), priority=200  , match=(reg9[0] == 1), action=(outport = "cr-lr0-public"; next;)
  table=12(lr_in_gw_redirect  ), priority=150  , match=(outport == "lr0-public" && eth.dst == 00:00:00:00:00:00), action=(outport = "cr-lr0-public"; next;)
  table=12(lr_in_gw_redirect  ), priority=50   , match=(outport == "lr0-public"), action=(outport = "cr-lr0-public"; next;)
  table=12(lr_in_gw_redirect  ), priority=0    , match=(1), action=(next;)
****

If I set the type to "bridged", I see the flows below:

****
  table=12(lr_in_gw_redirect  ), priority=300  , match=(reg9[2] == 1), action=(next;)
  table=12(lr_in_gw_redirect  ), priority=200  , match=(reg9[0] == 1), action=(outport = "cr-lr0-public"; next;)
  table=12(lr_in_gw_redirect  ), priority=150  , match=(outport == "lr0-sw1" && reg0 == 20.0.0.3 && eth.dst == 00:00:00:00:00:00), action=(eth.dst = 40:54:00:00:00:03; next;)
  table=12(lr_in_gw_redirect  ), priority=0    , match=(1), action=(next;)
****

I don't understand the 3rd flow with the match -- "outport == "lr0-sw1"...

It looks like the "match" and "actions" variables hold stale data. When the
new "if" is not taken, neither buffer is cleared, so the priority-150 flow
is built from whatever an earlier flow left in them. Please look into the
code again.

After the "if" condition you added in this patch at line 7384, the code
below is still there, and it no longer makes sense:

******
            /* For VLAN backed networks, default match will not redirect to
             * chassis redirect port. */
            if (od->l3dgw_port->peer &&
                od->l3dgw_port->peer->od->network_type == DP_NETWORK_OVERLAY) {
                /* For traffic with outport == l3dgw_port, if the
                 * packet did not match any higher priority redirect
                 * rule, then the traffic is redirected to the central
                 * instance of the l3dgw_port. */
                ds_clear(&match);
                ds_put_format(&match, "outport == %s",
                              od->l3dgw_port->json_key);
                ds_clear(&actions);
                ds_put_format(&actions, "outport = %s; next;",
                              od->l3redirect_port->json_key);
                ovn_lflow_add(lflows, od, S_ROUTER_IN_GW_REDIRECT, 50,
                              ds_cstr(&match), ds_cstr(&actions));
            }

            /* If the Ethernet destination has not been resolved,
             * redirect to the central instance of the l3dgw_port.
             * Such traffic will be replaced by an ARP request or ND
             * Neighbor Solicitation in the ARP request ingress
             * table, before being redirected to the central instance.
             */
            ds_put_format(&match, " && eth.dst == 00:00:00:00:00:00");   ====> THIS ONE
            ovn_lflow_add(lflows, od, S_ROUTER_IN_GW_REDIRECT, 150,      ====> AND THIS ONE
                          ds_cstr(&match), ds_cstr(&actions));
        }
********
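
To make the concern concrete, here is a minimal, self-contained sketch of
the same control flow. It does not use the real OVS "struct ds" API: the
"struct buf" type, its helpers, and the "lr0-public" port name are
hypothetical stand-ins chosen only for illustration. When the "if" branch
is skipped, the unconditional append lands on whatever the buffer held
before, which would explain a priority-150 flow matching on
outport == "lr0-sw1".

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for a dynamic string; fixed-size for brevity. */
struct buf {
    char s[128];
};

static void buf_clear(struct buf *b)
{
    b->s[0] = '\0';
}

static void buf_append(struct buf *b, const char *text)
{
    /* Guard against overflowing the fixed-size buffer. */
    assert(strlen(b->s) + strlen(text) < sizeof b->s);
    strcat(b->s, text);
}

/* Mirrors the quoted control flow: the match is rebuilt only when
 * is_overlay is true, but the eth.dst suffix is appended either way. */
static void build_unresolved_match(struct buf *match, int is_overlay)
{
    if (is_overlay) {
        buf_clear(match);
        buf_append(match, "outport == \"lr0-public\"");
    }
    buf_append(match, " && eth.dst == 00:00:00:00:00:00");
}

/* One possible remedy (an assumption, not the patch author's fix):
 * always rebuild the match before appending the suffix. */
static void build_unresolved_match_fixed(struct buf *match)
{
    buf_clear(match);
    buf_append(match, "outport == \"lr0-public\"");
    buf_append(match, " && eth.dst == 00:00:00:00:00:00");
}
```

Whether the priority-150 flow should be installed at all for a bridged
peer is a separate question; the sketch only shows why the buffers need to
be rebuilt, or the flow moved inside the "if", before they are reused.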

Point 2
=====

This patch breaks the S/N traffic if we have a logical switch (sw0) of type
overlay connected to a router, and the router also has a gw port connected
to a logical switch (public) of type bridged (i.e. a provider network).
This public switch has a localnet port.

Something like this: http://paste.openstack.org/show/752427/

It works fine if I change the type of the logical switch "public" to
overlay. But this doesn't make sense, since the logical switch "public" is
a provider (or bridged) network and the CMS can set the type as bridged.

I still think it's better not to have a "network_type" column in
logical_switch. We can always treat a logical switch with a localnet port
as type "bridged" and one without a localnet port as type "overlay".

This patch series sets network_type=bridged in the external_ids of the
datapath_binding row in the SB DB.

Please see my comments in v4 of the patch 1 where I suggested something
like below

****
enum ovn_datapath_nw_type {
    DP_NETWORK_OVERLAY,
    DP_NETWORK_PROVIDER
};

static void
ovn_datapath_update_nw_type(struct ovn_datapath *od)
{
    if (!od->nbs) {
        return;
    }

    if (!od->localnet_port) {
        od->network_type = DP_NETWORK_OVERLAY;
    } else {
        od->network_type = DP_NETWORK_PROVIDER;
    }
}
******

I think you can still set the external_ids of the datapath_binding row with
"network_type=bridged" if od->network_type is the bridged (provider) type,
so that ovn-controller can distinguish whether it's a bridged or an overlay
datapath.
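
As a rough sketch of how the two pieces of this suggestion could fit
together: the type is derived from the presence of a localnet port, and
the value for the "network_type" external_ids key is derived from the
type. The "struct datapath" below is a hypothetical, trimmed-down stand-in
for "struct ovn_datapath", and datapath_nw_type_external_id() is an
illustrative helper name, not an existing ovn-northd function.

```c
#include <stdbool.h>
#include <stddef.h>

enum dp_network_type {
    DP_NETWORK_OVERLAY,
    DP_NETWORK_PROVIDER
};

/* Hypothetical, trimmed-down datapath record. */
struct datapath {
    bool is_switch;             /* False for logical routers. */
    const char *localnet_port;  /* NULL when there is no localnet port. */
    enum dp_network_type network_type;
};

/* Derive the network type from the localnet port, so the CMS never has
 * to set it explicitly.  Routers are left untouched. */
static void datapath_update_nw_type(struct datapath *dp)
{
    if (!dp->is_switch) {
        return;
    }
    dp->network_type = dp->localnet_port ? DP_NETWORK_PROVIDER
                                         : DP_NETWORK_OVERLAY;
}

/* Value to store under the "network_type" external_ids key, or NULL
 * when no key should be written (overlay datapaths). */
static const char *datapath_nw_type_external_id(const struct datapath *dp)
{
    return dp->network_type == DP_NETWORK_PROVIDER ? "bridged" : NULL;
}
```

With this shape, existing deployments that already have localnet ports
would be classified as bridged without any CMS change.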


I am mainly thinking from an upgrade perspective for existing deployments
once this patch series is applied.
Until the CMS changes the network_type to "bridged" for all the logical
switches with localnet ports in existing deployments, "ovn-nbctl show"
will show these logical switches as "overlay", which is weird.
And later we may encounter other issues when enhancing OVN with new
features.

I think that, instead of adding code in ovn-northd to skip the redirection
to the gateway chassis if it's a bridged network, it's better to handle it
in table 32. Since the mac replacement is handled in table 65, it probably
makes more sense this way.

Thanks
Numan



>              /* If the Ethernet destination has not been resolved,
>               * redirect to the central instance of the l3dgw_port.
> diff --git a/ovn/ovn-architecture.7.xml b/ovn/ovn-architecture.7.xml
> index 6275db1..6df711e 100644
> --- a/ovn/ovn-architecture.7.xml
> +++ b/ovn/ovn-architecture.7.xml
> @@ -1441,7 +1441,7 @@
>      </li>
>    </ol>
>
> -  <h3>External traffic</h3>
> +  <h3>External traffic (NAT)</h3>
>
>    <p>
>      The following happens when a VM sends an external traffic (which
> requires
> @@ -1607,6 +1607,91 @@
>      </li>
>    </ol>
>
> +  <h3>External traffic (NO NAT)</h3>
> +  <p>
> +    The following happens when a VM sends an external traffic (i.e. to non
> +    logical router connected network), but there is no need for NATing.
> +  </p>
> +
> +  <p>
> +    Since, there is no NATing required, hence we need not redirect the
> packet
> +    to a gateway chassis. As a result, this packet flow is same as
> East-West.
> +    In order to ensure that OVN will not redirect the packet over a tunnel
> +    to gateway-chassis, "network_type" of destination localnet logical
> switch,
> +    should be set as "bridged". A "bridged" logical switch ensures that
> there
> +    is no tunnel encapsulation done while forwarding the packet on it.
> +    Please refer to <code>ovn-nb</code>(5) for more details.
> +  </p>
> +
> +  <ol>
> +    <li>
> +      It first enters the ingress pipeline, and then egress pipeline of
> the
> +      source localnet logical switch datapath. It then enters the ingress
> +      pipeline of the logical router datapath via the logical router port
> in
> +      the source chassis.
> +    </li>
> +
> +    <li>
> +      Routing decision is taken. Since, destination network is NOT directly
> +      connected to logical router, hence a static route is expected, which will
> +      provide next hop ip.
> +    </li>
> +
> +    <li>
> +      From the router datapath, packet enters the ingress pipeline and
> then
> +      egress pipeline of the destination localnet logical switch datapath
> +      (it is of type "bridged" and this is where the next hop is present)
> +      and goes out of the integration bridge to the provider bridge (
> +      belonging to the destination logical switch) via the localnet port.
> +      Same as East-West, source mac will be replaced with chassis mac.
> +    </li>
> +  </ol>
> +
> +  <p>
> +    The following happens for the reverse external traffic.
> +  </p>
> +
> +  <ol>
> +    <li>
> +      The gateway chassis receives the packet from the localnet port of
> +      the logical switch (bridged type) which provides external
> connectivity.
> +      The packet then enters the ingress pipeline and then egress
> pipeline of
> +      the localnet logical switch (which provides external connectivity).
> +      The packet then enters the ingress pipeline of the logical router
> +      datapath.
> +    </li>
> +
> +    <li>
> +      Routing decision is taken and logical switch of destination VM is
> +      identified.
> +    </li>
> +
> +    <li>
> +      The packet then enters the ingress pipeline and then egress
> +      pipeline of VM's localnet logical switch. Since the source VM
> +      doesn't reside in the gateway chassis, the packet is sent out via
> the
> +      localnet port of the VM's logical switch. Source mac of this packet
> +      will be replaced with chassis unique mac.
> +    </li>
> +
> +    <li>
> +      VM's chassis receives the packet via the localnet port and
> +      sends it to the integration bridge. The packet enters the
> +      ingress pipeline and then egress pipeline of the localnet
> +      logical switch and finally gets delivered to the VM port.
> +    </li>
> +  </ol>
> +
> +  <p>
> +    One thing to note here is that, while VM to External traffic did not
> +    require redirection to gateway chassis, the reverse traffic is through
> +    gateway chassis only. This is because, for external router, OVN
> logical
> +    router port IP will be the next hop to reach the endpoints behind it.
> +    As a result, we need a centralized chassis, which will respond to ARP
> +    requests coming from external network. This centralized chassis, is
> the
> +    gateway chassis which is attached to corresponding router port.
> +  </p>
> +
>    <h2>Life Cycle of a VTEP gateway</h2>
>
>    <p>
> diff --git a/tests/ovn.at b/tests/ovn.at
> index e5108a7..8a03393 100644
> --- a/tests/ovn.at
> +++ b/tests/ovn.at
> @@ -29,6 +29,12 @@ m4_define([OVN_CHECK_PACKETS],
>    [ovn_check_packets__ "$1" "$2"
>     AT_CHECK([sort $rcv_text], [0], [expout])])
>
> +m4_define([OVN_CHECK_PACKETS_REMOVE_BROADCAST],
> +  [ovn_check_packets__ "$1" "$2"
> +   echo "received_text=$rcv_text"
> +   sed -i '/ffffffffffff/d' $rcv_text
> +   AT_CHECK([sort $rcv_text], [0], [expout])])
> +
>  AT_BANNER([OVN components])
>
>  AT_SETUP([ovn -- lexer])
> @@ -14018,7 +14024,7 @@ ovn-hv4-0
>  OVN_CLEANUP([hv1], [hv2], [hv3])
>  AT_CLEANUP
>
> -AT_SETUP([ovn -- 2 HVs, 2 lports/HV, localnet ports, DVR chassis mac])
> +AT_SETUP([ovn -- 2 HVs, 2 lports/HV, localnet ports, DVR E-W chassis mac])
>  ovn_start
>
>
> @@ -14028,6 +14034,8 @@ ovn_start
>  # of VIF port name indicates the hypervisor it is bound to, e.g.
>  # lp23 means VIF 3 on hv2.
>  #
> +# Both the switches are connected to a logical router "router".
> +#
>  # Each switch's VLAN tag and their logical switch ports are:
>  #   - ls1:
>  #       - tagged with VLAN 101
> @@ -14185,6 +14193,7 @@ test_ip() {
>  echo "------ OVN dump ------"
>  ovn-nbctl show
>  ovn-sbctl show
> +ovn-sbctl list port_binding
>
>  echo "------ hv1 dump ------"
>  as hv1 ovs-vsctl show
> @@ -14211,6 +14220,727 @@ as hv2 ovs-appctl fdb/show br-phys
>
>  OVN_CHECK_PACKETS([hv2/vif22-tx.pcap], [vif22.expected])
>
> +
> +# Associate a chassis as gateway chassis and validate garp.
> +
> +OVN_CLEANUP([hv1],[hv2])
> +
> +AT_CLEANUP
> +
> +
> +AT_SETUP([ovn -- 2 HVs, 2 lports/HV, localnet ports, DVR N-S GARP])
> +ovn_start
> +
> +
> +# In this test case we create 2 switches, all connected to same
> +# physical network (through br-phys on each HV). Each switch has
> +# 1 VIF. Each HV has 1 VIF port. The first digit
> +# of VIF port name indicates the hypervisor it is bound to, e.g.
> +# lp23 means VIF 3 on hv2.
> +#
> +# Both the switches are connected to a logical router "router".
> +#
> +# Additionally, we create a logical switch (ls-underlay) for N-S traffic.
> +#
> +# Each switch's VLAN tag and their logical switch ports are:
> +#   - ls1:
> +#       - tagged with VLAN 101
> +#       - ports: lp11
> +#   - ls2:
> +#       - tagged with VLAN 201
> +#       - ports: lp22
> +#   - ls-underlay:
> +#       - tagged with VLAN 1000
> +#
> +# Note: a localnet port is created for each switch to connect to
> +# physical network.
> +# lsp_to_ls LSP
> +#
> +# Prints the name of the logical switch that contains LSP.
> +
> +net_add n1
> +for i in 1 2; do
> +    sim_add hv$i
> +    as hv$i
> +    ovs-vsctl add-br br-phys
> +    ovs-vsctl set open . external-ids:ovn-bridge-mappings=phys:br-phys
> +    ovs-vsctl set open .
> external-ids:ovn-chassis-mac-mappings="phys:aa:bb:cc:dd:ee:$i$i"
> +    ovs-vsctl set open . external-ids:system-id="HV$i"
> +    ovn_attach n1 br-phys 192.168.0.$i
> +    ovs-vsctl set-controller br-int ptcp:
> +    AT_CHECK([ovs-vsctl add-port br-phys snoopvif -- set Interface
> snoopvif options:tx_pcap=hv$i/snoopvif-tx.pcap
> options:rxq_pcap=hv$i/snoopvif-rx.pcap])
> +done
> +
> +ovn-nbctl ls-add ls-underlay bridged
> +ovn-nbctl lsp-add ls-underlay ln3 "" 1000
> +ovn-nbctl lsp-set-addresses ln3 unknown
> +ovn-nbctl lsp-set-type ln3 localnet
> +ovn-nbctl lsp-set-options ln3 network_name=phys
> +
> +ovn-nbctl lr-add router
> +ovn-nbctl lrp-add router router-to-underlay 00:00:01:01:02:07
> 172.31.0.1/24
> +
> +ovn-nbctl lsp-add ls-underlay underlay-to-router -- set
> Logical_Switch_Port \
> +                              underlay-to-router type=router \
> +                              options:router-port=router-to-underlay \
> +                              -- lsp-set-addresses underlay-to-router
> router
> +
> +ovn-nbctl --wait=sb sync
> +
> +# Associate hv2 as gateway chassis
> +ovn-nbctl lrp-set-gateway-chassis router-to-underlay hv2
> +
> +ovn-nbctl show
> +ovn-sbctl show
> +
> +# Dump a bunch of info helpful for debugging if there's a failure.
> +
> +echo "------ OVN dump ------"
> +ovn-nbctl show
> +ovn-sbctl show
> +
> +echo "------ hv1 dump ------"
> +as hv1 ovs-vsctl show
> +as hv1 ovs-vsctl list Open_Vswitch
> +
> +echo "------ hv2 dump ------"
> +as hv2 ovs-vsctl show
> +as hv2 ovs-vsctl list Open_Vswitch
> +
> +sleep 1
> +
> +echo "----------- Post Traffic hv1 dump -----------"
> +as hv1 ovs-ofctl -O OpenFlow13 dump-flows br-int
> +as hv1 ovs-appctl fdb/show br-phys
> +
> +echo "----------- Post Traffic hv2 dump -----------"
> +as hv2 ovs-ofctl -O OpenFlow13 dump-flows br-int
> +as hv2 ovs-appctl fdb/show br-phys
> +
> +AT_CHECK([as hv2 ovs-appctl fdb/show br-phys | grep 00:00:01:01:02:07 |
> grep 1000 | wc -l], [0], [[1
> +]])
> +
> +echo
> "ffffffffffff000001010207810003e808060001080006040001000001010207ac1f0001000000000000ac1f0001"
> > expected
> +OVN_CHECK_PACKETS([hv2/snoopvif-tx.pcap], [expected])
> +
>  OVN_CLEANUP([hv1],[hv2])
>
>  AT_CLEANUP
> +
> +
> +AT_SETUP([ovn -- 2 HVs, 2 lports/HV, localnet ports, DVR N-S Ping])
> +ovn_start
> +
> +# In this test case we create 3 switches, all connected to same
> +# physical network (through br-phys on each HV). LS1 and LS2 have
> +# 1 VIF each. Each HV has 1 VIF port. The first digit
> +# of VIF port name indicates the hypervisor it is bound to, e.g.
> +# lp23 means VIF 3 on hv2.
> +#
> +# All the switches are connected to a logical router "router".
> +#
> +# Each switch's VLAN tag and their logical switch ports are:
> +#   - ls1:
> +#       - tagged with VLAN 101
> +#       - ports: lp11
> +#   - ls2:
> +#       - tagged with VLAN 201
> +#       - ports: lp22
> +#   - ls-underlay:
> +#       - tagged with VLAN 1000
> +# Note: a localnet port is created for each switch to connect to
> +# physical network.
> +
> +for i in 1 2; do
> +    ls_name=ls$i
> +    ovn-nbctl ls-add $ls_name bridged
> +    ln_port_name=ln$i
> +    if test $i -eq 1; then
> +        ovn-nbctl lsp-add $ls_name $ln_port_name "" 101
> +    elif test $i -eq 2; then
> +        ovn-nbctl lsp-add $ls_name $ln_port_name "" 201
> +    fi
> +    ovn-nbctl lsp-set-addresses $ln_port_name unknown
> +    ovn-nbctl lsp-set-type $ln_port_name localnet
> +    ovn-nbctl lsp-set-options $ln_port_name network_name=phys
> +done
> +
> +# lsp_to_ls LSP
> +#
> +# Prints the name of the logical switch that contains LSP.
> +lsp_to_ls () {
> +    case $1 in dnl (
> +        lp?[[11]]) echo ls1 ;; dnl (
> +        lp?[[12]]) echo ls2 ;; dnl (
> +        *) AT_FAIL_IF([:]) ;;
> +    esac
> +}
> +
> +vif_to_hv () {
> +    case $1 in dnl (
> +        vif[[1]]?) echo hv1 ;; dnl (
> +        vif[[2]]?) echo hv2 ;; dnl (
> +        vif?[[north]]?) echo hv4 ;; dnl (
> +        *) AT_FAIL_IF([:]) ;;
> +    esac
> +}
> +
> +ip_to_hex() {
> +       printf "%02x%02x%02x%02x" "$@"
> +}
> +
> +net_add n1
> +for i in 1 2; do
> +    sim_add hv$i
> +    as hv$i
> +    ovs-vsctl add-br br-phys
> +    ovs-vsctl set open . external-ids:ovn-bridge-mappings=phys:br-phys
> +    ovs-vsctl set open .
> external-ids:ovn-chassis-mac-mappings="phys:aa:bb:cc:dd:ee:$i$i"
> +    ovn_attach n1 br-phys 192.168.0.$i
> +
> +    ovs-vsctl add-port br-int vif$i$i -- \
> +        set Interface vif$i$i external-ids:iface-id=lp$i$i \
> +                              options:tx_pcap=hv$i/vif$i$i-tx.pcap \
> +                              options:rxq_pcap=hv$i/vif$i$i-rx.pcap \
> +                              ofport-request=$i$i
> +
> +    lsp_name=lp$i$i
> +    ls_name=$(lsp_to_ls $lsp_name)
> +
> +    ovn-nbctl lsp-add $ls_name $lsp_name
> +    ovn-nbctl lsp-set-addresses $lsp_name "f0:00:00:00:00:$i$i
> 192.168.$i.$i"
> +    ovn-nbctl lsp-set-port-security $lsp_name f0:00:00:00:00:$i$i
> +
> +    OVS_WAIT_UNTIL([test x`ovn-nbctl lsp-get-up $lsp_name` = xup])
> +
> +done
> +
> +ovn-nbctl ls-add ls-underlay bridged
> +ovn-nbctl lsp-add ls-underlay ln3 "" 1000
> +ovn-nbctl lsp-set-addresses ln3 unknown
> +ovn-nbctl lsp-set-type ln3 localnet
> +ovn-nbctl lsp-set-options ln3 network_name=phys
> +
> +ovn-nbctl ls-add ls-north bridged
> +ovn-nbctl lsp-add ls-north ln4 "" 1000
> +ovn-nbctl lsp-set-addresses ln4 unknown
> +ovn-nbctl lsp-set-type ln4 localnet
> +ovn-nbctl lsp-set-options ln4 network_name=phys
> +
> +# Add a VM on ls-north
> +ovn-nbctl lsp-add ls-north lp-north
> +ovn-nbctl lsp-set-addresses lp-north "f0:f0:00:00:00:11 172.31.0.10"
> +ovn-nbctl lsp-set-port-security lp-north f0:f0:00:00:00:11
> +
> +# Add 3rd hypervisor
> +sim_add hv3
> +as hv3 ovs-vsctl add-br br-phys
> +as hv3 ovs-vsctl set open . external-ids:ovn-bridge-mappings=phys:br-phys
> +as hv3 ovs-vsctl set open .
> external-ids:ovn-chassis-mac-mappings="phys:aa:bb:cc:dd:ee:33"
> +as hv3 ovn_attach n1 br-phys 192.168.0.3
> +
> +# Add 4th hypervisor
> +sim_add hv4
> +as hv4 ovs-vsctl add-br br-phys
> +as hv4 ovs-vsctl set open . external-ids:ovn-bridge-mappings=phys:br-phys
> +as hv4 ovs-vsctl set open .
> external-ids:ovn-chassis-mac-mappings="phys:aa:bb:cc:dd:ee:44"
> +as hv4 ovn_attach n1 br-phys 192.168.0.4
> +
> +as hv4 ovs-vsctl add-port br-int vif-north -- \
> +        set Interface vif-north external-ids:iface-id=lp-north \
> +                              options:tx_pcap=hv4/vif-north-tx.pcap \
> +                              options:rxq_pcap=hv4/vif-north-rx.pcap \
> +                              ofport-request=44
> +
> +ovn-nbctl lr-add router
> +ovn-nbctl lrp-add router router-to-ls1 00:00:01:01:02:03 192.168.1.3/24
> +ovn-nbctl lrp-add router router-to-ls2 00:00:01:01:02:05 192.168.2.3/24
> +ovn-nbctl lrp-add router router-to-underlay 00:00:01:01:02:07 172.31.0.1/24
> +
> +ovn-nbctl lsp-add ls1 ls1-to-router -- set Logical_Switch_Port
> ls1-to-router type=router \
> +          options:router-port=router-to-ls1 -- lsp-set-addresses
> ls1-to-router router
> +ovn-nbctl lsp-add ls2 ls2-to-router -- set Logical_Switch_Port
> ls2-to-router type=router \
> +          options:router-port=router-to-ls2 -- lsp-set-addresses
> ls2-to-router router
> +ovn-nbctl lsp-add ls-underlay underlay-to-router -- set
> Logical_Switch_Port \
> +                              underlay-to-router type=router \
> +                              options:router-port=router-to-underlay \
> +                              -- lsp-set-addresses underlay-to-router
> router
> +
> +ovn-nbctl lrp-set-gateway-chassis router-to-underlay hv3
> +
> +ovn-nbctl --wait=sb sync
> +
> +sleep 2
> +
> +OVN_POPULATE_ARP
> +
> ++# lsp_to_ls LSP
> ++#
> ++# Prints the name of the logical switch that contains LSP.
> +lsp_to_ls () {
> +    case $1 in dnl (
> +        lp?[[11]]) echo ls1 ;; dnl (
> +        lp?[[12]]) echo ls2 ;; dnl (
> +        *) AT_FAIL_IF([:]) ;;
> +    esac
> +}
> +
> +vif_to_ls () {
> +    case $1 in dnl (
> +        vif?[[11]]) echo ls1 ;; dnl (
> +        vif?[[12]]) echo ls2 ;; dnl (
> +        vif-north) echo ls-north ;; dnl (
> +        *) AT_FAIL_IF([:]) ;;
> +    esac
> +}
> +
> +hv_to_num () {
> +    case $1 in dnl (
> +        hv1) echo 1 ;; dnl (
> +        hv2) echo 2 ;; dnl (
> +        hv3) echo 3 ;; dnl (
> +        hv4) echo 4 ;; dnl (
> +        *) AT_FAIL_IF([:]) ;;
> +    esac
> +}
> +
> +vif_to_num () {
> +    case $1 in dnl (
> +        vif22) echo 22 ;; dnl (
> +        vif21) echo 21 ;; dnl (
> +        vif11) echo 11 ;; dnl (
> +        *) AT_FAIL_IF([:]) ;;
> +    esac
> +}
> +
> +vif_to_hv () {
> +    case $1 in dnl (
> +        vif[[1]]?) echo hv1 ;; dnl (
> +        vif[[2]]?) echo hv2 ;; dnl (
> +        vif-north) echo hv4 ;; dnl (
> +        *) AT_FAIL_IF([:]) ;;
> +    esac
> +}
> +
> +vif_to_lrp () {
> +    echo router-to-`vif_to_ls $1`
> +}
> +
> +ip_to_hex() {
> +       printf "%02x%02x%02x%02x" "$@"
> +}
> +
> +
> +test_ip() {
> +        # This packet has bad checksums but logical L3 routing doesn't
> check.
> +        local inport=$1 src_mac=$2 dst_mac=$3 src_ip=$4 dst_ip=$5
> outport=$6
> +        local
> packet=${dst_mac}${src_mac}08004500001c0000000040110000${src_ip}${dst_ip}0035111100080000
> +        shift; shift; shift; shift; shift
> +        hv=`vif_to_hv $inport`
> +        as $hv ovs-appctl netdev-dummy/receive $inport $packet
> +        in_ls=`vif_to_ls $inport`
> +        for outport; do
> +            out_ls=`vif_to_ls $outport`
> +            if test $in_ls = $out_ls; then
> +                # Ports on the same logical switch receive exactly the
> same packet.
> +                echo $packet
> +            else
> +                # Routing decrements TTL and updates source and dest MAC
> +                # (and checksum).
> +                out_lrp=`vif_to_lrp $outport`
> +                # For North-South, packet will come via gateway chassis,
> i.e hv3
> +                if test $inport = vif-north; then
> +                    echo
> f00000000011aabbccddee3308004500001c000000003f110100${src_ip}${dst_ip}0035111100080000
> >> $outport.expected
> +                fi
> +                if test $outport = vif-north; then
> +                    echo
> f0f000000011aabbccddee1108004500001c000000003f110100${src_ip}${dst_ip}0035111100080000
> >> $outport.expected
> +                fi
> +            fi >> $outport.expected
> +        done
> +}
> +
> +# Dump a bunch of info helpful for debugging if there's a failure.
> +
> +echo "------ OVN dump ------"
> +ovn-nbctl show
> +ovn-sbctl show
> +ovn-sbctl list port_binding
> +ovn-sbctl list mac_binding
> +
> +echo "------ hv1 dump ------"
> +as hv1 ovs-vsctl show
> +as hv1 ovs-vsctl list Open_Vswitch
> +
> +echo "------ hv2 dump ------"
> +as hv2 ovs-vsctl show
> +as hv2 ovs-vsctl list Open_Vswitch
> +
> +echo "------ hv3 dump ------"
> +as hv3 ovs-vsctl show
> +as hv3 ovs-vsctl list Open_Vswitch
> +
> +echo "------ hv4 dump ------"
> +as hv4 ovs-vsctl show
> +as hv4 ovs-vsctl list Open_Vswitch
> +
> +echo "Send traffic North to South"
> +
> +sip=`ip_to_hex 172 31 0 10`
> +dip=`ip_to_hex 192 168 1 1`
> +test_ip vif-north f0f000000011 000001010207 $sip $dip vif11
> +
> +sleep 1
> +
> +# Confirm that North to south traffic works fine and went through gateway
> chassis, i.e HV3
> +OVN_CHECK_PACKETS([hv1/vif11-tx.pcap], [vif11.expected])
> +
> +echo "Send traffic South to North"
> +sip=`ip_to_hex 192 168 1 1`
> +dip=`ip_to_hex 172 31 0 10`
> +test_ip vif11 f00000000011 000001010203 $sip $dip vif-north
> +
> +sleep 1
> +
> +# Confirm that South to North traffic works fine.
> +OVN_CHECK_PACKETS_REMOVE_BROADCAST([hv4/vif-north-tx.pcap],
> [vif-north.expected])
> +
> +# Confirm that packets did not go out via tunnel port.
> +AT_CHECK([as hv1 ovs-ofctl dump-flows br-int | grep table=32 | grep
> NXM_NX_TUN_METADATA0 | grep n_packets=0 | wc -l], [0], [[1
> +]])
> +
> +# Confirm that HV1 chassis mac is never seen on Gateway chassis, i.e HV3
> +AT_CHECK([as hv3 ovs-appctl fdb/show br-phys | grep aa:bb:cc:dd:ee:11 |
> wc -l], [0], [[0
> +]])
> +
> +echo "----------- Post Traffic hv1 dump -----------"
> +as hv1 ovs-ofctl -O OpenFlow13 dump-flows br-int
> +as hv1 ovs-appctl fdb/show br-phys
> +
> +echo "----------- Post Traffic hv2 dump -----------"
> +as hv2 ovs-ofctl -O OpenFlow13 dump-flows br-int
> +as hv2 ovs-appctl fdb/show br-phys
> +
> +echo "----------- Post Traffic hv3 dump -----------"
> +as hv3 ovs-ofctl -O OpenFlow13 dump-flows br-int
> +as hv3 ovs-appctl fdb/show br-phys
> +
> +echo "----------- Post Traffic hv4 dump -----------"
> +as hv4 ovs-ofctl -O OpenFlow13 dump-flows br-int
> +as hv4 ovs-appctl fdb/show br-phys
> +
> +OVN_CLEANUP([hv1],[hv2],[hv3],[hv4])
> +
> +AT_CLEANUP
> +
> +
> +AT_SETUP([ovn -- 2 HVs, 2 lports/HV, localnet ports, DVR N-S ARP
> handling])
> +ovn_start
> +
> +# In this test case we create 3 switches, all connected to same
> +# physical network (through br-phys on each HV). LS1 and LS2 have
> +# 1 VIF each. Each HV has 1 VIF port. The first digit
> +# of VIF port name indicates the hypervisor it is bound to, e.g.
> +# lp23 means VIF 3 on hv2.
> +#
> +# All the switches are connected to a logical router "router".
> +#
> +# Each switch's VLAN tag and their logical switch ports are:
> +#   - ls1:
> +#       - tagged with VLAN 101
> +#       - ports: lp11
> +#   - ls2:
> +#       - tagged with VLAN 201
> +#       - ports: lp22
> +#   - ls-underlay:
> +#       - tagged with VLAN 1000
> +# Note: a localnet port is created for each switch to connect to
> +# physical network.
> +
> +for i in 1 2; do
> +    ls_name=ls$i
> +    ovn-nbctl ls-add $ls_name bridged
> +    ln_port_name=ln$i
> +    if test $i -eq 1; then
> +        ovn-nbctl lsp-add $ls_name $ln_port_name "" 101
> +    elif test $i -eq 2; then
> +        ovn-nbctl lsp-add $ls_name $ln_port_name "" 201
> +    fi
> +    ovn-nbctl lsp-set-addresses $ln_port_name unknown
> +    ovn-nbctl lsp-set-type $ln_port_name localnet
> +    ovn-nbctl lsp-set-options $ln_port_name network_name=phys
> +done
> +
> +# lsp_to_ls LSP
> +#
> +# Prints the name of the logical switch that contains LSP.
> +lsp_to_ls () {
> +    case $1 in dnl (
> +        lp?[[11]]) echo ls1 ;; dnl (
> +        lp?[[12]]) echo ls2 ;; dnl (
> +        *) AT_FAIL_IF([:]) ;;
> +    esac
> +}
> +
> +vif_to_hv () {
> +    case $1 in dnl (
> +        vif[[1]]?) echo hv1 ;; dnl (
> +        vif[[2]]?) echo hv2 ;; dnl (
> +        vif?[[north]]?) echo hv4 ;; dnl (
> +        *) AT_FAIL_IF([:]) ;;
> +    esac
> +}
> +
> +ip_to_hex() {
> +       printf "%02x%02x%02x%02x" "$@"
> +}
> +
> +net_add n1
> +for i in 1 2; do
> +    sim_add hv$i
> +    as hv$i
> +    ovs-vsctl add-br br-phys
> +    ovs-vsctl set open . external-ids:ovn-bridge-mappings=phys:br-phys
> +    ovs-vsctl set open .
> external-ids:ovn-chassis-mac-mappings="phys:aa:bb:cc:dd:ee:$i$i"
> +    ovn_attach n1 br-phys 192.168.0.$i
> +
> +    ovs-vsctl add-port br-int vif$i$i -- \
> +        set Interface vif$i$i external-ids:iface-id=lp$i$i \
> +                              options:tx_pcap=hv$i/vif$i$i-tx.pcap \
> +                              options:rxq_pcap=hv$i/vif$i$i-rx.pcap \
> +                              ofport-request=$i$i
> +
> +    lsp_name=lp$i$i
> +    ls_name=$(lsp_to_ls $lsp_name)
> +
> +    ovn-nbctl lsp-add $ls_name $lsp_name
> +    ovn-nbctl lsp-set-addresses $lsp_name "f0:00:00:00:00:$i$i
> 192.168.$i.$i"
> +    ovn-nbctl lsp-set-port-security $lsp_name f0:00:00:00:00:$i$i
> +
> +    OVS_WAIT_UNTIL([test x`ovn-nbctl lsp-get-up $lsp_name` = xup])
> +
> +done
> +
> +ovn-nbctl ls-add ls-underlay bridged
> +ovn-nbctl lsp-add ls-underlay ln3 "" 1000
> +ovn-nbctl lsp-set-addresses ln3 unknown
> +ovn-nbctl lsp-set-type ln3 localnet
> +ovn-nbctl lsp-set-options ln3 network_name=phys
> +
> +ovn-nbctl ls-add ls-north bridged
> +ovn-nbctl lsp-add ls-north ln4 "" 1000
> +ovn-nbctl lsp-set-addresses ln4 unknown
> +ovn-nbctl lsp-set-type ln4 localnet
> +ovn-nbctl lsp-set-options ln4 network_name=phys
> +
> +# Add a VM on ls-north
> +ovn-nbctl lsp-add ls-north lp-north
> +ovn-nbctl lsp-set-addresses lp-north "f0:f0:00:00:00:11 172.31.0.10"
> +ovn-nbctl lsp-set-port-security lp-north f0:f0:00:00:00:11
> +
> +# Add 3rd hypervisor
> +sim_add hv3
> +as hv3 ovs-vsctl add-br br-phys
> +as hv3 ovs-vsctl set open . external-ids:ovn-bridge-mappings=phys:br-phys
> +as hv3 ovs-vsctl set open .
> external-ids:ovn-chassis-mac-mappings="phys:aa:bb:cc:dd:ee:33"
> +as hv3 ovn_attach n1 br-phys 192.168.0.3
> +
> +# Add 4th hypervisor
> +sim_add hv4
> +as hv4 ovs-vsctl add-br br-phys
> +as hv4 ovs-vsctl set open . external-ids:ovn-bridge-mappings=phys:br-phys
> +as hv4 ovs-vsctl set open .
> external-ids:ovn-chassis-mac-mappings="phys:aa:bb:cc:dd:ee:44"
> +as hv4 ovn_attach n1 br-phys 192.168.0.4
> +
> +as hv4 ovs-vsctl add-port br-int vif-north -- \
> +        set Interface vif-north external-ids:iface-id=lp-north \
> +                              options:tx_pcap=hv4/vif-north-tx.pcap \
> +                              options:rxq_pcap=hv4/vif-north-rx.pcap \
> +                              ofport-request=44
> +
> +ovn-nbctl lr-add router
> +ovn-nbctl lrp-add router router-to-ls1 00:00:01:01:02:03 192.168.1.3/24
> +ovn-nbctl lrp-add router router-to-ls2 00:00:01:01:02:05 192.168.2.3/24
> +ovn-nbctl lrp-add router router-to-underlay 00:00:01:01:02:07 172.31.0.1/24
> +
> +ovn-nbctl lsp-add ls1 ls1-to-router -- set Logical_Switch_Port
> ls1-to-router type=router \
> +          options:router-port=router-to-ls1 -- lsp-set-addresses
> ls1-to-router router
> +ovn-nbctl lsp-add ls2 ls2-to-router -- set Logical_Switch_Port
> ls2-to-router type=router \
> +          options:router-port=router-to-ls2 -- lsp-set-addresses
> ls2-to-router router
> +ovn-nbctl lsp-add ls-underlay underlay-to-router -- set
> Logical_Switch_Port \
> +                              underlay-to-router type=router \
> +                              options:router-port=router-to-underlay \
> +                              -- lsp-set-addresses underlay-to-router
> router
> +
> +
> +OVN_POPULATE_ARP
> +
> ++# lsp_to_ls LSP
> ++#
> ++# Prints the name of the logical switch that contains LSP.
> +lsp_to_ls () {
> +    case $1 in dnl (
> +        lp?[[11]]) echo ls1 ;; dnl (
> +        lp?[[12]]) echo ls2 ;; dnl (
> +        *) AT_FAIL_IF([:]) ;;
> +    esac
> +}
> +
> +vif_to_ls () {
> +    case $1 in dnl (
> +        vif?[[11]]) echo ls1 ;; dnl (
> +        vif?[[12]]) echo ls2 ;; dnl (
> +        vif-north) echo ls-north ;; dnl (
> +        *) AT_FAIL_IF([:]) ;;
> +    esac
> +}
> +
> +hv_to_num () {
> +    case $1 in dnl (
> +        hv1) echo 1 ;; dnl (
> +        hv2) echo 2 ;; dnl (
> +        hv3) echo 3 ;; dnl (
> +        hv4) echo 4 ;; dnl (
> +        *) AT_FAIL_IF([:]) ;;
> +    esac
> +}
> +
> +vif_to_num () {
> +    case $1 in dnl (
> +        vif22) echo 22 ;; dnl (
> +        vif21) echo 21 ;; dnl (
> +        vif11) echo 11 ;; dnl (
> +        *) AT_FAIL_IF([:]) ;;
> +    esac
> +}
> +
> +vif_to_hv () {
> +    case $1 in dnl (
> +        vif[[1]]?) echo hv1 ;; dnl (
> +        vif[[2]]?) echo hv2 ;; dnl (
> +        vif-north) echo hv4 ;; dnl (
> +        *) AT_FAIL_IF([:]) ;;
> +    esac
> +}
> +
> +vif_to_lrp () {
> +    echo router-to-`vif_to_ls $1`
> +}
> +
> +ip_to_hex() {
> +       printf "%02x%02x%02x%02x" "$@"
> +}
> +
> +# Dump a bunch of info helpful for debugging if there's a failure.
> +
> +echo "------ OVN dump ------"
> +ovn-nbctl show
> +ovn-sbctl show
> +ovn-sbctl list port_binding
> +ovn-sbctl list mac_binding
> +
> +echo "------ hv1 dump ------"
> +as hv1 ovs-vsctl show
> +as hv1 ovs-vsctl list Open_Vswitch
> +
> +echo "------ hv2 dump ------"
> +as hv2 ovs-vsctl show
> +as hv2 ovs-vsctl list Open_Vswitch
> +
> +echo "------ hv3 dump ------"
> +as hv3 ovs-vsctl show
> +as hv3 ovs-vsctl list Open_Vswitch
> +
> +echo "------ hv4 dump ------"
> +as hv4 ovs-vsctl show
> +as hv4 ovs-vsctl list Open_Vswitch
> +
> +# test_arp INPORT SHA SPA TPA [REPLY_HA]
> +#
> +# Causes a packet to be received on INPORT.  The packet is an ARP
> +# request with SHA, SPA, and TPA as specified.  If REPLY_HA is provided,
> then
> +# it should be the hardware address of the target to expect to receive in
> an
> +# ARP reply; otherwise no reply is expected.
> +#
> +# INPORT is an logical switch port number, e.g. 11 for vif11.
> +# SHA and REPLY_HA are each 12 hex digits.
> +# SPA and TPA are each 8 hex digits.
> +test_arp() {
> +    local inport=$1 sha=$2 spa=$3 tpa=$4 reply_ha=$5
> +    local
> request=ffffffffffff${sha}08060001080006040001${sha}${spa}ffffffffffff${tpa}
> +    hv=`vif_to_hv $inport`
> +    as $hv ovs-appctl netdev-dummy/receive $inport $request
> +
> +    if test X$reply_ha = X; then
> +        # Expect to receive the broadcast ARP on the other logical switch
> ports
> +        # if no reply is expected.
> +        local i j
> +        for i in 1 2 3; do
> +            for j in 1 2 3; do
> +                if test $i$j != $inport; then
> +                    echo $request >> $i$j.expected
> +                fi
> +            done
> +        done
> +    else
> +        # Expect to receive the reply, if any.
> +        local
> reply=${sha}${reply_ha}08060001080006040002${reply_ha}${tpa}${sha}${spa}
> +        local
> reply_vid=${sha}${reply_ha}810003e808060001080006040002${reply_ha}${tpa}${sha}${spa}
> +        echo $reply_vid >> ${inport}_vid.expected
> +        echo $reply >> $inport.expected
> +    fi
> +}
> +
> +sip=`ip_to_hex 172 31 0 10`
> +tip=`ip_to_hex 172 31 0 1`
> +
> +test_arp vif-north f0f000000011 $sip $tip
> +# Confirm that vif-north does not get ARP reply
> +AT_CHECK([wc -l hv4/vif-north-tx.pcap | awk '{print $1}'], [0], [[0
> +]])
> +
> +# Set a hypervisor as gateway chassis, for router port 172.31.0.1
> +ovn-nbctl lrp-set-gateway-chassis router-to-underlay hv3
> +ovn-nbctl --wait=sb sync
> +sleep 2
> +
> +test_arp vif-north f0f000000011 $sip $tip 000001010207
> +
> +sleep 1
> +
> +# Confirm that vif-north gets a single ARP reply this time
> +OVN_CHECK_PACKETS_REMOVE_BROADCAST([hv4/vif-north-tx.pcap],
> [vif-north.expected])
> +
> +# Confirm that only redirect chassis allowed arp resolution.
> +OVN_CHECK_PACKETS_REMOVE_BROADCAST([hv3/br-phys_n1-tx.pcap],
> [vif-north_vid.expected])
> +sed -i '/ffffffffffff/d' hv3/br-phys_n1-tx.packets
> +AT_CHECK([grep 000001010207 hv3/br-phys_n1-tx.packets | wc -l], [0], [[1
> +]])
> +
> +# Confirm that other OVN chassis did not generate ARP reply.
> +$PYTHON "$top_srcdir/utilities/ovs-pcap.in" hv1/br-phys_n1-tx.pcap > hv1/br-phys_n1-tx.packets
> +$PYTHON "$top_srcdir/utilities/ovs-pcap.in" hv2/br-phys_n1-tx.pcap > hv2/br-phys_n1-tx.packets
> +
> +AT_CHECK([grep 000001010207 hv1/br-phys_n1-tx.packets | wc -l], [0], [[0
> +]])
> +AT_CHECK([grep 000001010207 hv2/br-phys_n1-tx.packets | wc -l], [0], [[0
> +]])
> +
> +echo "----------- Post Traffic hv1 dump -----------"
> +as hv1 ovs-ofctl -O OpenFlow13 dump-flows br-int
> +as hv1 ovs-appctl fdb/show br-phys
> +
> +echo "----------- Post Traffic hv2 dump -----------"
> +as hv2 ovs-ofctl -O OpenFlow13 dump-flows br-int
> +as hv2 ovs-appctl fdb/show br-phys
> +
> +echo "----------- Post Traffic hv3 dump -----------"
> +as hv3 ovs-ofctl -O OpenFlow13 dump-flows br-int
> +as hv3 ovs-appctl fdb/show br-phys
> +
> +echo "----------- Post Traffic hv4 dump -----------"
> +as hv4 ovs-ofctl -O OpenFlow13 dump-flows br-int
> +as hv4 ovs-appctl fdb/show br-phys
> +
> +OVN_CLEANUP([hv1],[hv2],[hv3],[hv4])
> +
> +AT_CLEANUP
> --
> 1.8.3.1
Ankur Sharma June 6, 2019, 11:44 p.m. UTC | #2
Hi Numan,

Thanks for trying out the patch and providing feedback.
I am planning to change this series to cover only the E-W changes, and will send out separate patches for the N-S improvements.

Following is the reasoning:
======================
a. I agree that the network_type construct is adding to the confusion, and maybe we should rely on an optional/external-id based key-value config.
b. The N-S patch has 3 changes (at a high level) which are distinct from each other; having them in a single patch is causing confusion and is holding up the rest of the reviewed changes.
c. Last but not least, there were some gaps in the patch as well.

Here is what I am planning to do:
==========================
a. Keep this series for E-W only and remove all the network_type related changes from it (including showing the type as bridged/vlan).

b. For N-S, this series has the following changes. I will send them out in separate patches, especially the ones which are more of a bug fix.
    i. Do not allow ARP resolution from the physical network unless a gateway chassis is configured ==> More of a bug fix, will be sent as a separate standalone patch.
   ii. GARP advertisement during failover in the absence of NAT configuration ==> More of a bug fix, will be sent as a separate standalone patch.
  iii. Periodic GARP advertisement with/without NAT configuration ==> New feature, will be added along with the SNAT changes.
   iv. Avoid redirection ==> New feature, will come as a separate patchset. We will make it an optional feature, i.e. by default even non-NATed traffic will go via the gateway chassis, but a config knob can override it.
    v. No chassis mac replace on gateway chassis ==> More of an addendum to E-W. I am thinking about clubbing it with some of the N-S changes, as this is where it will be relevant.

c. Will send out a separate patch for showing the network type as overlay or bridged (based on the localnet port’s presence); I believe it is good to have 😊.
    i.e. we will not have any new column in the logical switch table, but the output of the relevant ovn-nbctl show command will show the type as “overlay” or “bridged”.
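
To make item c concrete, here is a minimal standalone sketch in C of deriving the displayed type from the ports alone. It is not OVN code: `struct example_port` and `example_switch_network_type()` are hypothetical names used only for illustration, and the only assumption taken from OVN is that a localnet port carries the type string "localnet".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a logical switch port; only the port type
 * string matters for this sketch. */
struct example_port {
    const char *name;
    const char *type;   /* "" for a regular VIF, "localnet", "router", ... */
};

/* Returns "bridged" if any port on the switch is a localnet port,
 * "overlay" otherwise.  No extra column on the switch is consulted. */
static const char *
example_switch_network_type(const struct example_port *ports, size_t n_ports)
{
    assert(ports || !n_ports);
    for (size_t i = 0; i < n_ports; i++) {
        if (ports[i].type && !strcmp(ports[i].type, "localnet")) {
            return "bridged";
        }
    }
    return "overlay";
}
```

With this approach the type is always consistent with the topology, since there is no separately configured value that could disagree with the presence or absence of a localnet port.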


The above will allow us to make progress on the changes we are in agreement on, while having a thorough discussion on the remaining ones.
Let me know if you are fine with the plan. I should be able to send the E-W only changes in a couple of days, and should be able to send the individual bug fixes soon after as well.

For the rest of the comments, please find my replies inline.

Appreciate your feedback.

Regards,
Ankur



From: Numan Siddique <nusiddiq@redhat.com>
Sent: Monday, June 3, 2019 3:06 AM
To: Ankur Sharma <ankur.sharma@nutanix.com>
Cc: ovs-dev@openvswitch.org
Subject: Re: [ovs-dev] [PATCH v9 2/2] OVN: Enable N-S Traffic, Vlan backed DVR


Hi Ankur,

Please see some comments inline. Please note that I haven't had the chance to look into the code
in detail. I am first trying to test out the patches. (I am on PTO, so expect some delay in my replies.)



On Thu, May 30, 2019 at 5:58 AM Ankur Sharma <ankur.sharma@nutanix.com> wrote:
Background:
[1] https://mail.openvswitch.org/pipermail/ovs-dev/2018-October/353066.html
[2] https://docs.google.com/document/d/1uoQH478wM1OZ16HrxzbOUvk5LvFnfNEWbkPT6Zmm9OU/edit?usp=sharing

This Series:
Layer 2, Layer 3 E-W and Layer 3 N-S (NO NAT) changes for vlan
backed distributed logical router.

This patch:
For North-South traffic, we need a chassis which will respond to
ARP requests for router port coming from outside. For this purpose,
we will rely upon gateway-chassis construct in OVN, on a logical
router port, we will associate one or more chassis as gateway chassis.

One of these chassis would be active at a point and will become
entry point to traffic, bound for end points behind logical router
coming from outside network (North to South).

This patch makes some enhancements to gateway chassis implementation
to manage the above use case.

A.
Do not replace router port mac with chassis mac on gateway
chassis.
This is done, because:
    i. Chassisredirect port is NOT a distributed port, hence
       we need not replace its mac address
      (which is the same as router port mac).

   ii. ARP cache will be consistent everywhere, i.e just like
       endpoints on OVN chassis will see configured router port
       mac as resolved mac for router port ip, outside endpoints
       will see that as well.

  iii. For implementing Network Address Translation. Although
       not a part of this series. But, follow up series would
       be having this feature and approach would rely upon
       sending packets to redirect chassis using chassis redirect
       router port mac as dest mac.

B.
Advertise router port GARP on gateway chassis.
This is needed, especially if a failover happens and
chassisredirect port moves to a new gateway chassis.
Otherwise, there would be packet drops till outside
router ARPs for router port ip again.

Intention of this GARP is to update top of the rack (TOR)
to direct router port mac to new hypervisor.

Hence, we could have done the same using RARP as well, but
because ovn-controller already has an implementation for GARP,
it did not seem worthwhile to add a RARP implementation
just for this.
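
For illustration, the kind of gratuitous ARP described in B can be sketched as a 42-byte Ethernet frame whose sender and target protocol addresses are both the router port IP. This is a standalone sketch, not the ovn-controller implementation: `example_build_garp()` and `EXAMPLE_GARP_LEN` are hypothetical names, and the request-style form (opcode 1, zero target hardware address) is an assumption, since a GARP can also be sent as a reply.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define EXAMPLE_GARP_LEN 42   /* 14-byte Ethernet header + 28-byte ARP body */

/* Fills 'frame' with a request-style gratuitous ARP announcing that
 * 'ip' (4 bytes, network order) is reachable at 'mac' (6 bytes). */
static void
example_build_garp(uint8_t frame[EXAMPLE_GARP_LEN],
                   const uint8_t mac[6], const uint8_t ip[4])
{
    static const uint8_t arp_fixed[] = {
        0x08, 0x06,             /* Ethertype: ARP */
        0x00, 0x01,             /* Hardware type: Ethernet */
        0x08, 0x00,             /* Protocol type: IPv4 */
        0x06, 0x04,             /* Hardware / protocol address lengths */
        0x00, 0x01,             /* Opcode: request (assumed form) */
    };

    assert(frame && mac && ip);
    memset(frame, 0xff, 6);                 /* Ethernet dst: broadcast */
    memcpy(frame + 6, mac, 6);              /* Ethernet src */
    memcpy(frame + 12, arp_fixed, sizeof arp_fixed);
    memcpy(frame + 22, mac, 6);             /* Sender hardware address */
    memcpy(frame + 28, ip, 4);              /* Sender protocol address */
    memset(frame + 32, 0x00, 6);            /* Target hardware address */
    memcpy(frame + 38, ip, 4);              /* Target protocol address == SPA */
}
```

Because the frame is a broadcast sourced from the router port mac, any switch that sees it, including the TOR, learns that the mac now sits behind the sending hypervisor's port, which is the effect B relies on after a failover.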

C.
For South to North traffic, we need not pass through gateway
chassis, if there is no address translation needed.

For overlay networks, NATing is a must to talk to outside networks.
However, for vlan backed networks, NATing is not a must, and hence
in the absence of NATing configuration we need not redirect the packet
to gateway chassis.

Signed-off-by: Ankur Sharma <ankur.sharma@nutanix.com>
---
 ovn/controller/physical.c  |  24 +-
 ovn/controller/pinctrl.c   | 205 +++++++++++--
 ovn/controller/pinctrl.h   |   6 +
 ovn/lib/ovn-util.c         |  31 ++
 ovn/lib/ovn-util.h         |   6 +
 ovn/northd/ovn-northd.c    |  43 ++-
 ovn/ovn-architecture.7.xml |  87 +++++-
 tests/ovn.at               | 732 ++++++++++++++++++++++++++++++++++++++++++++-
 8 files changed, 1090 insertions(+), 44 deletions(-)

diff --git a/ovn/controller/physical.c b/ovn/controller/physical.c
index af587a5..1ab5968 100644
--- a/ovn/controller/physical.c
+++ b/ovn/controller/physical.c
@@ -21,6 +21,7 @@
 #include "lflow.h"
 #include "lport.h"
 #include "chassis.h"
+#include "pinctrl.h"
 #include "lib/bundle.h"
 #include "openvswitch/poll-loop.h"
 #include "lib/uuid.h"
@@ -238,9 +239,12 @@ get_zone_ids(const struct sbrec_port_binding *binding,
 }

 static void
-put_replace_router_port_mac_flows(const struct
+put_replace_router_port_mac_flows(struct ovsdb_idl_index
+                                  *sbrec_port_binding_by_name,
+                                  const struct
                                   sbrec_port_binding *localnet_port,
                                   const struct sbrec_chassis *chassis,
+                                  const struct sset *active_tunnels,
                                   const struct hmap *local_datapaths,
                                   struct ofpbuf *ofpacts_p,
                                   ofp_port_t ofport,
@@ -281,8 +285,21 @@ put_replace_router_port_mac_flows(const struct
         char *err_str = NULL;
         struct match match;
         struct ofpact_mac *replace_mac;
+        char *cr_peer_name = xasprintf("cr-%s", rport_binding->logical_port);

-        /* Table 65, priority 150.
+
+        if (pinctrl_is_chassis_resident(sbrec_port_binding_by_name,
+                                        chassis, active_tunnels,
+                                        cr_peer_name)) {
+            /* If a router port's chassisredirect port is
+             * resident on this chassis, then we need not do mac replace. */
+            free(cr_peer_name);
+            continue;
+        }
+
+        free(cr_peer_name);
+
+       /* Table 65, priority 150.
          * =======================
          *
          * Implements output to localnet port.
@@ -797,7 +814,8 @@ consider_port_binding(struct ovsdb_idl_index *sbrec_port_binding_by_name,
                         &match, ofpacts_p, &binding->header_.uuid);

         if (!strcmp(binding->type, "localnet")) {
-            put_replace_router_port_mac_flows(binding, chassis,
+            put_replace_router_port_mac_flows(sbrec_port_binding_by_name,
+                                              binding, chassis, active_tunnels,
                                               local_datapaths, ofpacts_p,
                                               ofport, flow_table);
         }
diff --git a/ovn/controller/pinctrl.c b/ovn/controller/pinctrl.c
index b7bb4c9..a145867 100644
--- a/ovn/controller/pinctrl.c
+++ b/ovn/controller/pinctrl.c
@@ -226,6 +226,8 @@ static bool may_inject_pkts(void);
 COVERAGE_DEFINE(pinctrl_drop_put_mac_binding);
 COVERAGE_DEFINE(pinctrl_drop_buffered_packets_map);

+#define GARP_DEF_REPEAT_INTERVAL_MS   (3 * 60 * 1000) /* 3 minutes */
+
 void
 pinctrl_init(void)
 {
@@ -242,6 +244,25 @@ pinctrl_init(void)
                                                 &pinctrl);
 }

+bool
+pinctrl_is_chassis_resident(struct ovsdb_idl_index *sbrec_port_binding_by_name,
+                            const struct sbrec_chassis *chassis,
+                            const struct sset *active_tunnels,
+                            const char *port_name)
+{
+    const struct sbrec_port_binding *pb
+        = lport_lookup_by_name(sbrec_port_binding_by_name, port_name);
+    if (!pb || !pb->chassis) {
+        return false;
+    }
+    if (strcmp(pb->type, "chassisredirect")) {
+        return pb->chassis == chassis;
+    } else {
+        return ha_chassis_group_is_active(pb->ha_chassis_group,
+                                          active_tunnels, chassis);
+    }
+}
+
 static ovs_be32
 queue_msg(struct rconn *swconn, struct ofpbuf *msg)
 {
@@ -2548,6 +2569,8 @@ struct garp_data {
     int backoff;                 /* Backoff for the next announcement. */
     uint32_t dp_key;             /* Datapath used to output this GARP. */
     uint32_t port_key;           /* Port to inject the GARP into. */
+    bool is_repeat;              /* Send GARPs continously */
+    long long int repeat_interval; /* Interval between GARP bursts in ms */
 };

 /* Contains GARPs to be sent. Protected by pinctrl_mutex*/
@@ -2568,7 +2591,8 @@ destroy_send_garps(void)
 /* Runs with in the main ovn-controller thread context. */
 static void
 add_garp(const char *name, const struct eth_addr ea, ovs_be32 ip,
-         uint32_t dp_key, uint32_t port_key)
+         uint32_t dp_key, uint32_t port_key, bool is_repeat,
+         long long int repeat_interval)
 {
     struct garp_data *garp = xmalloc(sizeof *garp);
     garp->ea = ea;
@@ -2577,6 +2601,8 @@ add_garp(const char *name, const struct eth_addr ea, ovs_be32 ip,
     garp->backoff = 1;
     garp->dp_key = dp_key;
     garp->port_key = port_key;
+    garp->is_repeat = is_repeat;
+    garp->repeat_interval = repeat_interval;
     shash_add(&send_garp_data, name, garp);

     /* Notify pinctrl_handler so that it can wakeup and process
@@ -2586,7 +2612,8 @@ add_garp(const char *name, const struct eth_addr ea, ovs_be32 ip,

 /* Add or update a vif for which GARPs need to be announced. */
 static void
-send_garp_update(const struct sbrec_port_binding *binding_rec,
+send_garp_update(struct ovsdb_idl_index *sbrec_port_binding_by_name,
+                 const struct sbrec_port_binding *binding_rec,
                  struct shash *nat_addresses)
 {
     volatile struct garp_data *garp = NULL;
@@ -2611,7 +2638,7 @@ send_garp_update(const struct sbrec_port_binding *binding_rec,
                     add_garp(name, laddrs->ea,
                              laddrs->ipv4_addrs[i].addr,
                              binding_rec->datapath->tunnel_key,
-                             binding_rec->tunnel_key);
+                             binding_rec->tunnel_key, false, 0);
                 }
                 free(name);
             }
@@ -2621,6 +2648,64 @@ send_garp_update(const struct sbrec_port_binding *binding_rec,
         return;
     }

+    /* Update GARPs for local chassisredirect port, if the peer
+     * layer 2 switch is of type vlan.
+     */
+    if (!strcmp(binding_rec->type, "chassisredirect")) {
+        struct eth_addr mac;
+        ovs_be32 ip, mask;
+        uint32_t dp_key = 0;
+        uint32_t port_key = 0;
+        const struct sbrec_port_binding *peer_port = NULL;
+        const struct sbrec_port_binding *distributed_port = NULL;
+
+        if (!ovn_sbrec_get_port_binding_ip_mac(binding_rec, &mac,
+                                               &ip, &mask)) {
+            /* Router Port binding without ip and mac configured. */
+            static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 1);
+            VLOG_WARN_RL(&rl, "cannot send garp, router port binding: %s, "
+                         "does not have proper ip,mac values: %s",
+                         binding_rec->logical_port, *binding_rec->mac);
+            return;
+        }
+
+        const char *lrp_name = smap_get(&binding_rec->options,
+                                        "distributed-port");
+        ovs_assert(lrp_name);
+
+        distributed_port = lport_lookup_by_name(sbrec_port_binding_by_name,
+                                                lrp_name);
+        ovs_assert(distributed_port);
+
+        const char *peer_name = smap_get(&distributed_port->options, "peer");
+        ovs_assert(peer_name);
+
+        peer_port = lport_lookup_by_name(sbrec_port_binding_by_name,
+                                         peer_name);
+        ovs_assert(peer_port);
+
+        const char *network_type = smap_get(&peer_port->datapath->external_ids,
+                                            "network-type");
+
+        /* Advertise GARP only of logical switch is of type bridged. */
+        if (!network_type || strcmp(network_type, "bridged")) {
+            return;
+        }
+
+        dp_key = peer_port->datapath->tunnel_key;
+        port_key = peer_port->tunnel_key;
+
+        garp = shash_find_data(&send_garp_data, binding_rec->logical_port);
+        if (garp) {
+            garp->dp_key = dp_key;
+            garp->port_key = port_key;
+        } else {
+            add_garp(binding_rec->logical_port, mac, ip,
+                     dp_key, port_key, true, GARP_DEF_REPEAT_INTERVAL_MS);
+        }
+        return;
+    }
+
     /* Update GARP for vif if it exists. */
     garp = shash_find_data(&send_garp_data, binding_rec->logical_port);
     if (garp) {
@@ -2640,7 +2725,8 @@ send_garp_update(const struct sbrec_port_binding *binding_rec,

         add_garp(binding_rec->logical_port,
                  laddrs.ea, laddrs.ipv4_addrs[0].addr,
-                 binding_rec->datapath->tunnel_key, binding_rec->tunnel_key);
+                 binding_rec->datapath->tunnel_key, binding_rec->tunnel_key,
+                 false, 0);

         destroy_lport_addresses(&laddrs);
         break;
@@ -2702,7 +2788,12 @@ send_garp(struct rconn *swconn, struct garp_data *garp,
         garp->backoff *= 2;
         garp->announce_time = current_time + garp->backoff * 1000;
     } else {
-        garp->announce_time = LLONG_MAX;
+        if (garp->is_repeat) {
+            garp->backoff = 1;
+            garp->announce_time = current_time + garp->repeat_interval;
+        } else {
+            garp->announce_time = LLONG_MAX;
+        }
     }
     return garp->announce_time;
 }
@@ -2786,25 +2877,6 @@ get_localnet_vifs_l3gwports(
     sbrec_port_binding_index_destroy_row(target);
 }

-static bool
-pinctrl_is_chassis_resident(struct ovsdb_idl_index *sbrec_port_binding_by_name,
-                            const struct sbrec_chassis *chassis,
-                            const struct sset *active_tunnels,
-                            const char *port_name)
-{
-    const struct sbrec_port_binding *pb
-        = lport_lookup_by_name(sbrec_port_binding_by_name, port_name);
-    if (!pb || !pb->chassis) {
-        return false;
-    }
-    if (strcmp(pb->type, "chassisredirect")) {
-        return pb->chassis == chassis;
-    } else {
-        return ha_chassis_group_is_active(pb->ha_chassis_group,
-                                          active_tunnels, chassis);
-    }
-}
-
 /* Extracts the mac, IPv4 and IPv6 addresses, and logical port from
  * 'addresses' which should be of the format 'MAC [IP1 IP2 ..]
  * [is_chassis_resident("LPORT_NAME")]', where IPn should be a valid IPv4
@@ -2946,6 +3018,67 @@ get_nat_addresses_and_keys(struct ovsdb_idl_index *sbrec_port_binding_by_name,
 }

 static void
+get_local_cr_ports(struct ovsdb_idl_index *sbrec_port_binding_by_name,
+                   struct sset *local_cr_ports,
+                   struct sset *local_l3gw_ports,
+                   const struct sbrec_chassis *chassis,
+                   const struct sset *active_tunnels)
+{
+    const char *gw_port;
+    SSET_FOR_EACH (gw_port, local_l3gw_ports) {
+        const struct sbrec_port_binding *binding_rec;
+
+        binding_rec = lport_lookup_by_name(sbrec_port_binding_by_name,
+                                           gw_port);
+        if (!binding_rec) {
+            continue;
+        }
+
+        /* For the patch port we will add send garp for peer's ip and mac. */
+        if (!strcmp(binding_rec->type, "patch")) {
+            const struct sbrec_port_binding *cr_port = NULL;
+
+            bool is_cr_resident;
+            struct eth_addr mac;
+            ovs_be32 ip, mask;
+
+            const char *peer_name = smap_get(&binding_rec->options, "peer");
+            ovs_assert(peer_name);
+
+            char *cr_peer_name = xasprintf("cr-%s", peer_name);
+            cr_port = lport_lookup_by_name(sbrec_port_binding_by_name,
+                                           cr_peer_name);
+            free(cr_peer_name);
+
+            if (!cr_port) {
+                continue;
+            }
+
+            is_cr_resident = pinctrl_is_chassis_resident
+                                (sbrec_port_binding_by_name,
+                                 chassis,
+                                 active_tunnels,
+                                 cr_port->logical_port);
+            if (!is_cr_resident) {
+                continue;
+            }
+
+            if (!ovn_sbrec_get_port_binding_ip_mac(cr_port, &mac, &ip,
+                                                   &mask)) {
+                /* Router Port binding without ip and mac configured. */
+                static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 1);
+                VLOG_WARN_RL(&rl, "cannot send garp, router port binding: %s, "
+                             "does not have proper ip,mac values: %s",
+                              cr_port->logical_port, *cr_port->mac);
+                return;
+            }
+
+            sset_add(local_cr_ports, cr_port->logical_port);
+        }
+    }
+}
+
+static void
 send_garp_wait(long long int send_garp_time)
 {
     /* Set the poll timer for next garp only if there is garp data to
@@ -2990,6 +3123,8 @@ send_garp_prepare(struct ovsdb_idl_index *sbrec_port_binding_by_datapath,
 {
     struct sset localnet_vifs = SSET_INITIALIZER(&localnet_vifs);
     struct sset local_l3gw_ports = SSET_INITIALIZER(&local_l3gw_ports);
+    struct sset local_cr_ports = SSET_INITIALIZER(&local_cr_ports);
+
     struct sset nat_ip_keys = SSET_INITIALIZER(&nat_ip_keys);
     struct shash nat_addresses;

@@ -3004,11 +3139,17 @@ send_garp_prepare(struct ovsdb_idl_index *sbrec_port_binding_by_datapath,
                                &nat_ip_keys, &local_l3gw_ports,
                                chassis, active_tunnels,
                                &nat_addresses);
+
+    get_local_cr_ports(sbrec_port_binding_by_name,
+                       &local_cr_ports, &local_l3gw_ports,
+                       chassis, active_tunnels);
+
     /* For deleted ports and deleted nat ips, remove from send_garp_data. */
     struct shash_node *iter, *next;
     SHASH_FOR_EACH_SAFE (iter, next, &send_garp_data) {
         if (!sset_contains(&localnet_vifs, iter->name) &&
-            !sset_contains(&nat_ip_keys, iter->name)) {
+            !sset_contains(&nat_ip_keys, iter->name) &&
+            !sset_contains(&local_cr_ports, iter->name)) {
             send_garp_delete(iter->name);
         }
     }
@@ -3019,7 +3160,7 @@ send_garp_prepare(struct ovsdb_idl_index *sbrec_port_binding_by_datapath,
         const struct sbrec_port_binding *pb = lport_lookup_by_name(
             sbrec_port_binding_by_name, iface_id);
         if (pb) {
-            send_garp_update(pb, &nat_addresses);
+            send_garp_update(sbrec_port_binding_by_name, pb, &nat_addresses);
         }
     }

@@ -3029,7 +3170,17 @@ send_garp_prepare(struct ovsdb_idl_index *sbrec_port_binding_by_datapath,
         const struct sbrec_port_binding *pb
             = lport_lookup_by_name(sbrec_port_binding_by_name, gw_port);
         if (pb) {
-            send_garp_update(pb, &nat_addresses);
+            send_garp_update(sbrec_port_binding_by_name, pb, &nat_addresses);
+        }
+    }
+
+    /* Update send_garp_data for chassisredirect router ports. */
+    const char *cr_port;
+    SSET_FOR_EACH (cr_port, &local_cr_ports) {
+        const struct sbrec_port_binding *pb
+            = lport_lookup_by_name(sbrec_port_binding_by_name, cr_port);
+        if (pb) {
+            send_garp_update(sbrec_port_binding_by_name, pb, &nat_addresses);
         }
     }

diff --git a/ovn/controller/pinctrl.h b/ovn/controller/pinctrl.h
index f61d705..92f704e 100644
--- a/ovn/controller/pinctrl.h
+++ b/ovn/controller/pinctrl.h
@@ -44,4 +44,10 @@ void pinctrl_run(struct ovsdb_idl_txn *ovnsb_idl_txn,
 void pinctrl_wait(struct ovsdb_idl_txn *ovnsb_idl_txn);
 void pinctrl_destroy(void);

+bool
+pinctrl_is_chassis_resident(struct ovsdb_idl_index *sbrec_port_binding_by_name,
+                            const struct sbrec_chassis *chassis,
+                            const struct sset *active_tunnels,
+                            const char *port_name);
+
 #endif /* ovn/pinctrl.h */
diff --git a/ovn/lib/ovn-util.c b/ovn/lib/ovn-util.c
index 0f07d80..3d0ad8e 100644
--- a/ovn/lib/ovn-util.c
+++ b/ovn/lib/ovn-util.c
@@ -16,6 +16,7 @@
 #include "ovn-util.h"
 #include "dirs.h"
 #include "openvswitch/vlog.h"
+#include "openvswitch/ofp-parse.h"
 #include "ovn/lib/ovn-nb-idl.h"
 #include "ovn/lib/ovn-sb-idl.h"

@@ -371,3 +372,33 @@ ovn_logical_flow_hash(const struct uuid *logical_datapath,
     hash = hash_string(match, hash);
     return hash_string(actions, hash);
 }
+
+/*  Extracts the mac, ip and mask for a sbrec_port_binding.
+ *
+ *  Expects following format:
+ *  "MAC_ADDRESS IP/MASK"
+ *
+ *  Return true if MAC, IP and MASK are found, false otherwise.
+ */
+bool
+ovn_sbrec_get_port_binding_ip_mac(const struct sbrec_port_binding *binding,
+                                  struct eth_addr *mac,
+                                  ovs_be32 *ip, ovs_be32 *mask)
+{
+    char *err_str = NULL;
+
+    err_str = str_to_mac(binding->mac[0], mac);
+    if (err_str) {
+        free(err_str);
+        return false;
+    }
+
+    err_str = ip_parse_masked(binding->mac[0] + ETH_ADDR_STRLEN + 1,
+                              ip, mask);
+    if (err_str) {
+        free(err_str);
+        return false;
+    }
+
+    return true;
+}
diff --git a/ovn/lib/ovn-util.h b/ovn/lib/ovn-util.h
index 6d5e1df..c01595a 100644
--- a/ovn/lib/ovn-util.h
+++ b/ovn/lib/ovn-util.h
@@ -19,6 +19,7 @@
 #include "lib/packets.h"

 struct nbrec_logical_router_port;
+struct sbrec_port_binding;
 struct sbrec_logical_flow;
 struct uuid;

@@ -81,4 +82,9 @@ uint32_t ovn_logical_flow_hash(const struct uuid *logical_datapath,
                                uint16_t priority,
                                const char *match, const char *actions);

+bool
+ovn_sbrec_get_port_binding_ip_mac(const struct sbrec_port_binding *binding,
+                                  struct eth_addr *mac, ovs_be32 *ip,
+                                  ovs_be32 *mask);
+
 #endif
diff --git a/ovn/northd/ovn-northd.c b/ovn/northd/ovn-northd.c
index 74d3692..6835910 100644
--- a/ovn/northd/ovn-northd.c
+++ b/ovn/northd/ovn-northd.c
@@ -5914,6 +5914,20 @@ build_lrouter_flows(struct hmap *datapaths, struct hmap *ports,
                     ds_put_format(&match, " && is_chassis_resident(%s)",
                                   op->od->l3redirect_port->json_key);
                 }
+            } else if (op->peer &&
+                       op->peer->od->network_type == DP_NETWORK_BRIDGED) {
+                /* For a router port connected to bridged logical switch,
+                 * we will always have the is_chassis_resident check.
+                 * This is because there could be vm/server on vlan network,
+                 * but not on OVN chassis and could end up arping for router
+                 * port ip.
+                 *
+                 * This check works on the assumption that for OVN chassis,
+                 * VMs logical switch ARP responder will respond to ARP
+                 * requests for router port IP.
+                 */
+                ds_put_format(&match, " && is_chassis_resident(\"cr-%s\")",
+                              op->key);
             }

             ds_clear(&actions);
@@ -7365,18 +7379,23 @@ build_lrouter_flows(struct hmap *datapaths, struct hmap *ports,
             ovn_lflow_add(lflows, od, S_ROUTER_IN_GW_REDIRECT, 300,
                           REGBIT_DISTRIBUTED_NAT" == 1", "next;");

-            /* For traffic with outport == l3dgw_port, if the
-             * packet did not match any higher priority redirect
-             * rule, then the traffic is redirected to the central
-             * instance of the l3dgw_port. */
-            ds_clear(&match);
-            ds_put_format(&match, "outport == %s",
-                          od->l3dgw_port->json_key);
-            ds_clear(&actions);
-            ds_put_format(&actions, "outport = %s; next;",
-                          od->l3redirect_port->json_key);
-            ovn_lflow_add(lflows, od, S_ROUTER_IN_GW_REDIRECT, 50,
-                          ds_cstr(&match), ds_cstr(&actions));
+            /* For VLAN backed networks, default match will not redirect to
+             * chassis redirect port. */
+            if (od->l3dgw_port->peer &&
+                od->l3dgw_port->peer->od->network_type == DP_NETWORK_OVERLAY) {
+                /* For traffic with outport == l3dgw_port, if the
+                 * packet did not match any higher priority redirect
+                 * rule, then the traffic is redirected to the central
+                 * instance of the l3dgw_port. */
+                ds_clear(&match);
+                ds_put_format(&match, "outport == %s",
+                              od->l3dgw_port->json_key);
+                ds_clear(&actions);
+                ds_put_format(&actions, "outport = %s; next;",
+                              od->l3redirect_port->json_key);
+                ovn_lflow_add(lflows, od, S_ROUTER_IN_GW_REDIRECT, 50,
+                              ds_cstr(&match), ds_cstr(&actions));
+            }

Looks like this code is having some side effects.


Point 1.
======
For my public switch if I don't set the network_type as "bridged",
then I see the below logical flows and think this is as expected. And I think
that's why in my v7 tests the packets were tunneled to the gw chassis (as
you mentioned in the reply).

****
table=12(lr_in_gw_redirect  ), priority=300  , match=(reg9[2] == 1), action=(next;)
  table=12(lr_in_gw_redirect  ), priority=200  , match=(reg9[0] == 1), action=(outport = "cr-lr0-public"; next;)
  table=12(lr_in_gw_redirect  ), priority=150  , match=(outport == "lr0-public" && eth.dst == 00:00:00:00:00:00), action=(outport = "cr-lr0-public"; next;)
  table=12(lr_in_gw_redirect  ), priority=50   , match=(outport == "lr0-public"), action=(outport = "cr-lr0-public"; next;)
  table=12(lr_in_gw_redirect  ), priority=0    , match=(1), action=(next;)
****

If I set the type as "bridged", I see the below flows

****
 table=12(lr_in_gw_redirect  ), priority=300  , match=(reg9[2] == 1), action=(next;)
  table=12(lr_in_gw_redirect  ), priority=200  , match=(reg9[0] == 1), action=(outport = "cr-lr0-public"; next;)
  table=12(lr_in_gw_redirect  ), priority=150  , match=(outport == "lr0-sw1" && reg0 == 20.0.0.3 && eth.dst == 00:00:00:00:00:00), action=(eth.dst = 40:54:00:00:00:03; next;)
  table=12(lr_in_gw_redirect  ), priority=0    , match=(1), action=(next;)
****

I don't understand the 3rd flow with the match -- "outport == "lr0-sw1"...

Looks like the "match" and "action" variables have some old data. Please look into the code again.
[ANKUR]:
Yup, missed out on clearing match and actions, thanks for calling it out.
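
The stale-data effect discussed here can be reproduced with a small standalone sketch. The buffer type below is a hypothetical stand-in for the append-only dynamic strings used in ovn-northd, not the OVS implementation; `example_buf`, `example_buf_clear()` and `example_buf_put()` are names invented for this illustration.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical append-only string buffer.  Not the OVS implementation. */
struct example_buf {
    char data[256];
    size_t len;
};

/* Resets the buffer to the empty string. */
static void
example_buf_clear(struct example_buf *b)
{
    b->len = 0;
    b->data[0] = '\0';
}

/* Appends 's' to whatever the buffer already holds. */
static void
example_buf_put(struct example_buf *b, const char *s)
{
    size_t n = strlen(s);

    assert(b->len + n < sizeof b->data);
    memcpy(b->data + b->len, s, n + 1);
    b->len += n;
}
```

If a buffer still holds `outport == "lr0-sw1" && reg0 == 20.0.0.3` from earlier code and the next step only appends ` && eth.dst == 00:00:00:00:00:00`, the result is the unexpected third flow shown above. Clearing the buffer and writing the intended `outport` match before the append gives the expected match string.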

After the "if" condition you added in this patch at line 7384, the below code is still there and it doesn't make sense

******
             /* For VLAN backed networks, default match will not redirect to
             * chassis redirect port. */
            if (od->l3dgw_port->peer &&
                od->l3dgw_port->peer->od->network_type == DP_NETWORK_OVERLAY) {
                /* For traffic with outport == l3dgw_port, if the
                 * packet did not match any higher priority redirect
                 * rule, then the traffic is redirected to the central
                 * instance of the l3dgw_port. */
                ds_clear(&match);
                ds_put_format(&match, "outport == %s",
                              od->l3dgw_port->json_key);
                ds_clear(&actions);
                ds_put_format(&actions, "outport = %s; next;",
                              od->l3redirect_port->json_key);
                ovn_lflow_add(lflows, od, S_ROUTER_IN_GW_REDIRECT, 50,
                              ds_cstr(&match), ds_cstr(&actions));
            }

            /* If the Ethernet destination has not been resolved,
             * redirect to the central instance of the l3dgw_port.
             * Such traffic will be replaced by an ARP request or ND
             * Neighbor Solicitation in the ARP request ingress
             * table, before being redirected to the central instance.
             */
            ds_put_format(&match, " && eth.dst == 00:00:00:00:00:00");      ====> THIS ONE
            ovn_lflow_add(lflows, od, S_ROUTER_IN_GW_REDIRECT, 150,   ====> AND THIS ONE
                          ds_cstr(&match), ds_cstr(&actions));
        }

[ANKUR]:
The intention here was only to remove the flow that redirected everything destined for the router port to the chassis-redirect router port.

********

Point 2
=====

This patch breaks the S/N traffic if we have a logical switch (sw0) of type overlay connected
to a router and the router also has a gw port connected to a logical switch (public) of type bridged (i.e. provider network).
This public switch has a localnet port.

Something like this - http://paste.openstack.org/show/752427/

It works fine if I change the type of the logical switch - public to overlay. But this doesn't make sense, since
the logical switch - public is a provider (or bridged) network and CMS can set the type as bridged.
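
The topology described in Point 2 could be sketched roughly as follows. This is not the content of the paste. The names sw0, public, lr0 and lr0-public come from this thread, while the port names, MAC/IP addresses, the network_name "phys" and the chassis name gw1 are assumptions made up for illustration. The trailing "bridged" argument to ls-add is the syntax used by this patch's tests. The sketch prints the commands by default rather than running them.

```shell
# Dry-run by default: print the commands instead of executing them.
# Set NBCTL=ovn-nbctl to run against a real northbound database.
NBCTL=${NBCTL:-echo ovn-nbctl}

point2_topology() {
    # Overlay switch for the VMs (no localnet port).
    $NBCTL ls-add sw0

    # Provider switch; the trailing "bridged" is the type argument used by
    # this patch's tests.
    $NBCTL ls-add public bridged
    $NBCTL lsp-add public ln-public
    $NBCTL lsp-set-addresses ln-public unknown
    $NBCTL lsp-set-type ln-public localnet
    $NBCTL lsp-set-options ln-public network_name=phys

    # Router with one port per switch (MACs and IPs are made up).
    $NBCTL lr-add lr0
    $NBCTL lrp-add lr0 lr0-sw0 00:00:00:00:ff:01 10.0.0.1/24
    $NBCTL lrp-add lr0 lr0-public 00:00:00:00:ff:02 172.16.0.1/24
    $NBCTL lsp-add sw0 sw0-lr0 -- set Logical_Switch_Port sw0-lr0 \
        type=router options:router-port=lr0-sw0 \
        -- lsp-set-addresses sw0-lr0 router
    $NBCTL lsp-add public public-lr0 -- set Logical_Switch_Port public-lr0 \
        type=router options:router-port=lr0-public \
        -- lsp-set-addresses public-lr0 router

    # Making lr0-public a gateway port is what creates cr-lr0-public.
    $NBCTL lrp-set-gateway-chassis lr0-public gw1
}

point2_topology
```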

I still think it's better not to have a "network_type" column in logical_switch. We can always treat a logical
switch with a localnet port as type "bridged" and one without a localnet port as type "overlay".

[ANKUR]:
Agreed, the nomenclature and usage of the type field are confusing. It will be difficult to convey / expect that the CMS
will NOT end up using it even when it was not supposed to. I mentioned in the email reply that I will be removing it
from the current patch series and that we will have separate config knobs for the use cases this field was added for.

This patch series sets the network_type=bridged in the external_ids of the datapath_binding row in SB DB.

Please see my comments in v4 of the patch 1 where I suggested something like below

****
enum ovn_datapath_nw_type {
    DP_NETWORK_OVERLAY,
    DP_NETWORK_PROVIDER
};

static void
ovn_datapath_update_nw_type(struct ovn_datapath *od)
{
    if (!od->nbs) {
        return;
    }

    if (!od->localnet_port) {
        od->network_type = DP_NETWORK_OVERLAY;
    } else {
        od->network_type = DP_NETWORK_PROVIDER;
    }
}
******

I think you can still set the external_ids of the datapath_binding row with "network_type=bridged"
if od->network_type is BRIDGED so that ovn-controller can distinguish whether it's a bridged or overlay datapath.
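
The rule in the suggested ovn_datapath_update_nw_type() is small enough to show as a shell sketch. dp_network_type below is a hypothetical helper that only mirrors that rule (any localnet port makes the switch "bridged"); it is not part of OVN and takes the port types as plain arguments.

```shell
# dp_network_type PORT_TYPE...
#
# Prints "bridged" if any of the given logical switch port types is
# "localnet", and "overlay" otherwise.  Hypothetical helper that mirrors the
# suggested ovn_datapath_update_nw_type(); it is not part of OVN.
dp_network_type() {
    for port_type in "$@"; do
        if [ "$port_type" = localnet ]; then
            echo bridged
            return 0
        fi
    done
    echo overlay
}

dp_network_type "" router localnet   # e.g. public: VIF, router port, localnet
dp_network_type "" router            # e.g. sw0: VIF and router port only
```

Deriving the value this way means existing deployments with localnet ports need no CMS change after an upgrade.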


I am mainly thinking from an upgrade perspective for the existing deployments once this patch series is applied.
Until the CMS changes the network_type to "bridged" for all the logical switches with localnet ports in the
existing deployments, "ovn-nbctl show" will show these logical switches as "overlay", which is weird.
And later we may encounter other issues when enhancing OVN with new features.
[ANKUR]:
Yes, the value is quite easy to get confused with, I will be removing it in v10.

I think instead of adding the code to skip the redirection to the gateway chassis in ovn-northd if it's a bridged network,
it's better to handle it in table 32. Since the MAC replacement is handled in table 65, it probably makes more sense this way.
[ANKUR]:
IMO, table 32 should only decide whether redirection is done over the overlay or the VLAN, while the logical flow should decide
whether redirection is needed at all.

Thanks
Numan


             /* If the Ethernet destination has not been resolved,
              * redirect to the central instance of the l3dgw_port.
diff --git a/ovn/ovn-architecture.7.xml b/ovn/ovn-architecture.7.xml
index 6275db1..6df711e 100644
--- a/ovn/ovn-architecture.7.xml
+++ b/ovn/ovn-architecture.7.xml
@@ -1441,7 +1441,7 @@
     </li>
   </ol>

-  <h3>External traffic</h3>
+  <h3>External traffic (NAT)</h3>

   <p>
     The following happens when a VM sends an external traffic (which requires
@@ -1607,6 +1607,91 @@
     </li>
   </ol>

+  <h3>External traffic (NO NAT)</h3>
+  <p>
+    The following happens when a VM sends external traffic (i.e. to a network
+    not connected to the logical router) and there is no need for NAT.
+  </p>
+
+  <p>
+    Since no NAT is required, the packet need not be redirected to a
+    gateway chassis. As a result, this packet flow is the same as East-West.
+    To ensure that OVN does not redirect the packet over a tunnel to the
+    gateway chassis, the "network_type" of the destination localnet logical
+    switch should be set to "bridged". A "bridged" logical switch ensures that
+    no tunnel encapsulation is done while forwarding the packet on it.
+    Please refer to <code>ovn-nb</code>(5) for more details.
+  </p>
+
+  <ol>
+    <li>
+      It first enters the ingress pipeline, and then egress pipeline of the
+      source localnet logical switch datapath. It then enters the ingress
+      pipeline of the logical router datapath via the logical router port in
+      the source chassis.
+    </li>
+
+    <li>
+      A routing decision is taken. Since the destination network is NOT
+      directly connected to the logical router, a static route is expected,
+      which will provide the next hop IP.
+    </li>
+
+    <li>
+      From the router datapath, packet enters the ingress pipeline and then
+      egress pipeline of the destination localnet logical switch datapath
+      (it is of type "bridged" and this is where the next hop is present)
+      and goes out of the integration bridge to the provider bridge (
+      belonging to the destination logical switch) via the localnet port.
+      Same as East-West, source mac will be replaced with chassis mac.
+    </li>
+  </ol>
+
+  <p>
+    The following happens for the reverse external traffic.
+  </p>
+
+  <ol>
+    <li>
+      The gateway chassis receives the packet from the localnet port of
+      the logical switch (bridged type) which provides external connectivity.
+      The packet then enters the ingress pipeline and then egress pipeline of
+      the localnet logical switch (which provides external connectivity).
+      The packet then enters the ingress pipeline of the logical router
+      datapath.
+    </li>
+
+    <li>
+      Routing decision is taken and logical switch of destination VM is
+      identified.
+    </li>
+
+    <li>
+      The packet then enters the ingress pipeline and then egress
+      pipeline of VM's localnet logical switch. Since the source VM
+      doesn't reside in the gateway chassis, the packet is sent out via the
+      localnet port of the VM's logical switch. Source mac of this packet
+      will be replaced with chassis unique mac.
+    </li>
+
+    <li>
+      VM's chassis receives the packet via the localnet port and
+      sends it to the integration bridge. The packet enters the
+      ingress pipeline and then egress pipeline of the localnet
+      logical switch and finally gets delivered to the VM port.
+    </li>
+  </ol>
+
+  <p>
+    Note that while VM-to-external traffic does not require redirection to
+    the gateway chassis, the reverse traffic always goes through the gateway
+    chassis. This is because, for an external router, the OVN logical router
+    port IP is the next hop to reach the endpoints behind it. As a result,
+    we need a centralized chassis that responds to ARP requests coming from
+    the external network. This centralized chassis is the gateway chassis
+    attached to the corresponding router port.
+  </p>
+
   <h2>Life Cycle of a VTEP gateway</h2>

   <p>
diff --git a/tests/ovn.at b/tests/ovn.at
index e5108a7..8a03393 100644
--- a/tests/ovn.at
+++ b/tests/ovn.at
@@ -29,6 +29,12 @@ m4_define([OVN_CHECK_PACKETS],
   [ovn_check_packets__ "$1" "$2"
    AT_CHECK([sort $rcv_text], [0], [expout])])

+m4_define([OVN_CHECK_PACKETS_REMOVE_BROADCAST],
+  [ovn_check_packets__ "$1" "$2"
+   echo "received_text=$rcv_text"
+   sed -i '/ffffffffffff/d' $rcv_text
+   AT_CHECK([sort $rcv_text], [0], [expout])])
+
 AT_BANNER([OVN components])

 AT_SETUP([ovn -- lexer])
@@ -14018,7 +14024,7 @@ ovn-hv4-0
 OVN_CLEANUP([hv1], [hv2], [hv3])
 AT_CLEANUP

-AT_SETUP([ovn -- 2 HVs, 2 lports/HV, localnet ports, DVR chassis mac])
+AT_SETUP([ovn -- 2 HVs, 2 lports/HV, localnet ports, DVR E-W chassis mac])
 ovn_start


@@ -14028,6 +14034,8 @@ ovn_start
 # of VIF port name indicates the hypervisor it is bound to, e.g.
 # lp23 means VIF 3 on hv2.
 #
+# Both the switches are connected to a logical router "router".
+#
 # Each switch's VLAN tag and their logical switch ports are:
 #   - ls1:
 #       - tagged with VLAN 101
@@ -14185,6 +14193,7 @@ test_ip() {
 echo "------ OVN dump ------"
 ovn-nbctl show
 ovn-sbctl show
+ovn-sbctl list port_binding

 echo "------ hv1 dump ------"
 as hv1 ovs-vsctl show
@@ -14211,6 +14220,727 @@ as hv2 ovs-appctl fdb/show br-phys

 OVN_CHECK_PACKETS([hv2/vif22-tx.pcap], [vif22.expected])

+
+# Associate a chassis as gateway chassis and validate garp.
+
+OVN_CLEANUP([hv1],[hv2])
+
+AT_CLEANUP
+
+
+AT_SETUP([ovn -- 2 HVs, 2 lports/HV, localnet ports, DVR N-S GARP])
+ovn_start
+
+
+# In this test case we create 2 switches, all connected to the same
+# physical network (through br-phys on each HV). Each switch has
+# 1 VIF. Each HV has 1 VIF port. The first digit
+# of VIF port name indicates the hypervisor it is bound to, e.g.
+# lp23 means VIF 3 on hv2.
+#
+# Both the switches are connected to a logical router "router".
+#
+# Additionally, we create a logical switch (ls-underlay) for N-S traffic.
+#
+# Each switch's VLAN tag and their logical switch ports are:
+#   - ls1:
+#       - tagged with VLAN 101
+#       - ports: lp11
+#   - ls2:
+#       - tagged with VLAN 201
+#       - ports: lp22
+#   - ls-underlay:
+#       - tagged with VLAN 1000
+#
+# Note: a localnet port is created for each switch to connect to
+# physical network.
+# lsp_to_ls LSP
+#
+# Prints the name of the logical switch that contains LSP.
+
+net_add n1
+for i in 1 2; do
+    sim_add hv$i
+    as hv$i
+    ovs-vsctl add-br br-phys
+    ovs-vsctl set open . external-ids:ovn-bridge-mappings=phys:br-phys
+    ovs-vsctl set open . external-ids:ovn-chassis-mac-mappings="phys:aa:bb:cc:dd:ee:$i$i"
+    ovs-vsctl set open . external-ids:system-id="HV$i"
+    ovn_attach n1 br-phys 192.168.0.$i
+    ovs-vsctl set-controller br-int ptcp:
+    AT_CHECK([ovs-vsctl add-port br-phys snoopvif -- set Interface snoopvif options:tx_pcap=hv$i/snoopvif-tx.pcap options:rxq_pcap=hv$i/snoopvif-rx.pcap])
+done
+
+ovn-nbctl ls-add ls-underlay bridged
+ovn-nbctl lsp-add ls-underlay ln3 "" 1000
+ovn-nbctl lsp-set-addresses ln3 unknown
+ovn-nbctl lsp-set-type ln3 localnet
+ovn-nbctl lsp-set-options ln3 network_name=phys
+
+ovn-nbctl lr-add router
+ovn-nbctl lrp-add router router-to-underlay 00:00:01:01:02:07 172.31.0.1/24
+
+ovn-nbctl lsp-add ls-underlay underlay-to-router -- set Logical_Switch_Port \
+                              underlay-to-router type=router \
+                              options:router-port=router-to-underlay \
+                              -- lsp-set-addresses underlay-to-router router
+
+ovn-nbctl --wait=sb sync
+
+# Associate hv2 as gateway chassis
+ovn-nbctl lrp-set-gateway-chassis router-to-underlay hv2
+
+ovn-nbctl show
+ovn-sbctl show
+
+# Dump a bunch of info helpful for debugging if there's a failure.
+
+echo "------ OVN dump ------"
+ovn-nbctl show
+ovn-sbctl show
+
+echo "------ hv1 dump ------"
+as hv1 ovs-vsctl show
+as hv1 ovs-vsctl list Open_Vswitch
+
+echo "------ hv2 dump ------"
+as hv2 ovs-vsctl show
+as hv2 ovs-vsctl list Open_Vswitch
+
+sleep 1
+
+echo "----------- Post Traffic hv1 dump -----------"
+as hv1 ovs-ofctl -O OpenFlow13 dump-flows br-int
+as hv1 ovs-appctl fdb/show br-phys
+
+echo "----------- Post Traffic hv2 dump -----------"
+as hv2 ovs-ofctl -O OpenFlow13 dump-flows br-int
+as hv2 ovs-appctl fdb/show br-phys
+
+AT_CHECK([as hv2 ovs-appctl fdb/show br-phys | grep 00:00:01:01:02:07 | grep 1000 | wc -l], [0], [[1
+]])
+
+echo "ffffffffffff000001010207810003e808060001080006040001000001010207ac1f0001000000000000ac1f0001" > expected
+OVN_CHECK_PACKETS([hv2/snoopvif-tx.pcap], [expected])
+
 OVN_CLEANUP([hv1],[hv2])

 AT_CLEANUP
+
+
+AT_SETUP([ovn -- 2 HVs, 2 lports/HV, localnet ports, DVR N-S Ping])
+ovn_start
+
+# In this test case we create 3 switches, all connected to the same
+# physical network (through br-phys on each HV). LS1 and LS2 have
+# 1 VIF each. Each HV has 1 VIF port. The first digit
+# of VIF port name indicates the hypervisor it is bound to, e.g.
+# lp23 means VIF 3 on hv2.
+#
+# All the switches are connected to a logical router "router".
+#
+# Each switch's VLAN tag and their logical switch ports are:
+#   - ls1:
+#       - tagged with VLAN 101
+#       - ports: lp11
+#   - ls2:
+#       - tagged with VLAN 201
+#       - ports: lp22
+#   - ls-underlay:
+#       - tagged with VLAN 1000
+# Note: a localnet port is created for each switch to connect to
+# physical network.
+
+for i in 1 2; do
+    ls_name=ls$i
+    ovn-nbctl ls-add $ls_name bridged
+    ln_port_name=ln$i
+    if test $i -eq 1; then
+        ovn-nbctl lsp-add $ls_name $ln_port_name "" 101
+    elif test $i -eq 2; then
+        ovn-nbctl lsp-add $ls_name $ln_port_name "" 201
+    fi
+    ovn-nbctl lsp-set-addresses $ln_port_name unknown
+    ovn-nbctl lsp-set-type $ln_port_name localnet
+    ovn-nbctl lsp-set-options $ln_port_name network_name=phys
+done
+
+# lsp_to_ls LSP
+#
+# Prints the name of the logical switch that contains LSP.
+lsp_to_ls () {
+    case $1 in dnl (
+        lp?[[11]]) echo ls1 ;; dnl (
+        lp?[[12]]) echo ls2 ;; dnl (
+        *) AT_FAIL_IF([:]) ;;
+    esac
+}
+
+vif_to_hv () {
+    case $1 in dnl (
+        vif[[1]]?) echo hv1 ;; dnl (
+        vif[[2]]?) echo hv2 ;; dnl (
+        vif?[[north]]?) echo hv4 ;; dnl (
+        *) AT_FAIL_IF([:]) ;;
+    esac
+}
+
+ip_to_hex() {
+       printf "%02x%02x%02x%02x" "$@"
+}
+
+net_add n1
+for i in 1 2; do
+    sim_add hv$i
+    as hv$i
+    ovs-vsctl add-br br-phys
+    ovs-vsctl set open . external-ids:ovn-bridge-mappings=phys:br-phys
+    ovs-vsctl set open . external-ids:ovn-chassis-mac-mappings="phys:aa:bb:cc:dd:ee:$i$i"
+    ovn_attach n1 br-phys 192.168.0.$i
+
+    ovs-vsctl add-port br-int vif$i$i -- \
+        set Interface vif$i$i external-ids:iface-id=lp$i$i \
+                              options:tx_pcap=hv$i/vif$i$i-tx.pcap \
+                              options:rxq_pcap=hv$i/vif$i$i-rx.pcap \
+                              ofport-request=$i$i
+
+    lsp_name=lp$i$i
+    ls_name=$(lsp_to_ls $lsp_name)
+
+    ovn-nbctl lsp-add $ls_name $lsp_name
+    ovn-nbctl lsp-set-addresses $lsp_name "f0:00:00:00:00:$i$i 192.168.$i.$i"
+    ovn-nbctl lsp-set-port-security $lsp_name f0:00:00:00:00:$i$i
+
+    OVS_WAIT_UNTIL([test x`ovn-nbctl lsp-get-up $lsp_name` = xup])
+
+done
+
+ovn-nbctl ls-add ls-underlay bridged
+ovn-nbctl lsp-add ls-underlay ln3 "" 1000
+ovn-nbctl lsp-set-addresses ln3 unknown
+ovn-nbctl lsp-set-type ln3 localnet
+ovn-nbctl lsp-set-options ln3 network_name=phys
+
+ovn-nbctl ls-add ls-north bridged
+ovn-nbctl lsp-add ls-north ln4 "" 1000
+ovn-nbctl lsp-set-addresses ln4 unknown
+ovn-nbctl lsp-set-type ln4 localnet
+ovn-nbctl lsp-set-options ln4 network_name=phys
+
+# Add a VM on ls-north
+ovn-nbctl lsp-add ls-north lp-north
+ovn-nbctl lsp-set-addresses lp-north "f0:f0:00:00:00:11 172.31.0.10"
+ovn-nbctl lsp-set-port-security lp-north f0:f0:00:00:00:11
+
+# Add 3rd hypervisor
+sim_add hv3
+as hv3 ovs-vsctl add-br br-phys
+as hv3 ovs-vsctl set open . external-ids:ovn-bridge-mappings=phys:br-phys
+as hv3 ovs-vsctl set open . external-ids:ovn-chassis-mac-mappings="phys:aa:bb:cc:dd:ee:33"
+as hv3 ovn_attach n1 br-phys 192.168.0.3
+
+# Add 4th hypervisor
+sim_add hv4
+as hv4 ovs-vsctl add-br br-phys
+as hv4 ovs-vsctl set open . external-ids:ovn-bridge-mappings=phys:br-phys
+as hv4 ovs-vsctl set open . external-ids:ovn-chassis-mac-mappings="phys:aa:bb:cc:dd:ee:44"
+as hv4 ovn_attach n1 br-phys 192.168.0.4
+
+as hv4 ovs-vsctl add-port br-int vif-north -- \
+        set Interface vif-north external-ids:iface-id=lp-north \
+                              options:tx_pcap=hv4/vif-north-tx.pcap \
+                              options:rxq_pcap=hv4/vif-north-rx.pcap \
+                              ofport-request=44
+
+ovn-nbctl lr-add router
+ovn-nbctl lrp-add router router-to-ls1 00:00:01:01:02:03 192.168.1.3/24
+ovn-nbctl lrp-add router router-to-ls2 00:00:01:01:02:05 192.168.2.3/24
+ovn-nbctl lrp-add router router-to-underlay 00:00:01:01:02:07 172.31.0.1/24
+
+ovn-nbctl lsp-add ls1 ls1-to-router -- set Logical_Switch_Port ls1-to-router type=router \
+          options:router-port=router-to-ls1 -- lsp-set-addresses ls1-to-router router
+ovn-nbctl lsp-add ls2 ls2-to-router -- set Logical_Switch_Port ls2-to-router type=router \
+          options:router-port=router-to-ls2 -- lsp-set-addresses ls2-to-router router
+ovn-nbctl lsp-add ls-underlay underlay-to-router -- set Logical_Switch_Port \
+                              underlay-to-router type=router \
+                              options:router-port=router-to-underlay \
+                              -- lsp-set-addresses underlay-to-router router
+
+ovn-nbctl lrp-set-gateway-chassis router-to-underlay hv3
+
+ovn-nbctl --wait=sb sync
+
+sleep 2
+
+OVN_POPULATE_ARP
+
+# lsp_to_ls LSP
+#
+# Prints the name of the logical switch that contains LSP.
+lsp_to_ls () {
+    case $1 in dnl (
+        lp?[[11]]) echo ls1 ;; dnl (
+        lp?[[12]]) echo ls2 ;; dnl (
+        *) AT_FAIL_IF([:]) ;;
+    esac
+}
+
+vif_to_ls () {
+    case $1 in dnl (
+        vif?[[11]]) echo ls1 ;; dnl (
+        vif?[[12]]) echo ls2 ;; dnl (
+        vif-north) echo ls-north ;; dnl (
+        *) AT_FAIL_IF([:]) ;;
+    esac
+}
+
+hv_to_num () {
+    case $1 in dnl (
+        hv1) echo 1 ;; dnl (
+        hv2) echo 2 ;; dnl (
+        hv3) echo 3 ;; dnl (
+        hv4) echo 4 ;; dnl (
+        *) AT_FAIL_IF([:]) ;;
+    esac
+}
+
+vif_to_num () {
+    case $1 in dnl (
+        vif22) echo 22 ;; dnl (
+        vif21) echo 21 ;; dnl (
+        vif11) echo 11 ;; dnl (
+        *) AT_FAIL_IF([:]) ;;
+    esac
+}
+
+vif_to_hv () {
+    case $1 in dnl (
+        vif[[1]]?) echo hv1 ;; dnl (
+        vif[[2]]?) echo hv2 ;; dnl (
+        vif-north) echo hv4 ;; dnl (
+        *) AT_FAIL_IF([:]) ;;
+    esac
+}
+
+vif_to_lrp () {
+    echo router-to-`vif_to_ls $1`
+}
+
+ip_to_hex() {
+       printf "%02x%02x%02x%02x" "$@"
+}
+
+
+test_ip() {
+        # This packet has bad checksums but logical L3 routing doesn't check.
+        local inport=$1 src_mac=$2 dst_mac=$3 src_ip=$4 dst_ip=$5 outport=$6
+        local packet=${dst_mac}${src_mac}08004500001c0000000040110000${src_ip}${dst_ip}0035111100080000
+        shift; shift; shift; shift; shift
+        hv=`vif_to_hv $inport`
+        as $hv ovs-appctl netdev-dummy/receive $inport $packet
+        in_ls=`vif_to_ls $inport`
+        for outport; do
+            out_ls=`vif_to_ls $outport`
+            if test $in_ls = $out_ls; then
+                # Ports on the same logical switch receive exactly the same packet.
+                echo $packet
+            else
+                # Routing decrements TTL and updates source and dest MAC
+                # (and checksum).
+                out_lrp=`vif_to_lrp $outport`
+                # For North-South, packet will come via gateway chassis, i.e hv3
+                if test $inport = vif-north; then
+                    echo f00000000011aabbccddee3308004500001c000000003f110100${src_ip}${dst_ip}0035111100080000 >> $outport.expected
+                fi
+                if test $outport = vif-north; then
+                    echo f0f000000011aabbccddee1108004500001c000000003f110100${src_ip}${dst_ip}0035111100080000 >> $outport.expected
+                fi
+            fi >> $outport.expected
+        done
+}
+
+# Dump a bunch of info helpful for debugging if there's a failure.
+
+echo "------ OVN dump ------"
+ovn-nbctl show
+ovn-sbctl show
+ovn-sbctl list port_binding
+ovn-sbctl list mac_binding
+
+echo "------ hv1 dump ------"
+as hv1 ovs-vsctl show
+as hv1 ovs-vsctl list Open_Vswitch
+
+echo "------ hv2 dump ------"
+as hv2 ovs-vsctl show
+as hv2 ovs-vsctl list Open_Vswitch
+
+echo "------ hv3 dump ------"
+as hv3 ovs-vsctl show
+as hv3 ovs-vsctl list Open_Vswitch
+
+echo "------ hv4 dump ------"
+as hv4 ovs-vsctl show
+as hv4 ovs-vsctl list Open_Vswitch
+
+echo "Send traffic North to South"
+
+sip=`ip_to_hex 172 31 0 10`
+dip=`ip_to_hex 192 168 1 1`
+test_ip vif-north f0f000000011 000001010207 $sip $dip vif11
+
+sleep 1
+
+# Confirm that North to south traffic works fine and went through gateway chassis, i.e HV3
+OVN_CHECK_PACKETS([hv1/vif11-tx.pcap], [vif11.expected])
+
+echo "Send traffic South to North"
+sip=`ip_to_hex 192 168 1 1`
+dip=`ip_to_hex 172 31 0 10`
+test_ip vif11 f00000000011 000001010203 $sip $dip vif-north
+
+sleep 1
+
+# Confirm that South to North traffic works fine.
+OVN_CHECK_PACKETS_REMOVE_BROADCAST([hv4/vif-north-tx.pcap], [vif-north.expected])
+
+# Confirm that packets did not go out via tunnel port.
+AT_CHECK([as hv1 ovs-ofctl dump-flows br-int | grep table=32 | grep NXM_NX_TUN_METADATA0 | grep n_packets=0 | wc -l], [0], [[1
+]])
+
+# Confirm that HV1 chassis mac is never seen on Gateway chassis, i.e HV3
+AT_CHECK([as hv3 ovs-appctl fdb/show br-phys | grep aa:bb:cc:dd:ee:11 | wc -l], [0], [[0
+]])
+
+echo "----------- Post Traffic hv1 dump -----------"
+as hv1 ovs-ofctl -O OpenFlow13 dump-flows br-int
+as hv1 ovs-appctl fdb/show br-phys
+
+echo "----------- Post Traffic hv2 dump -----------"
+as hv2 ovs-ofctl -O OpenFlow13 dump-flows br-int
+as hv2 ovs-appctl fdb/show br-phys
+
+echo "----------- Post Traffic hv3 dump -----------"
+as hv3 ovs-ofctl -O OpenFlow13 dump-flows br-int
+as hv3 ovs-appctl fdb/show br-phys
+
+echo "----------- Post Traffic hv4 dump -----------"
+as hv4 ovs-ofctl -O OpenFlow13 dump-flows br-int
+as hv4 ovs-appctl fdb/show br-phys
+
+OVN_CLEANUP([hv1],[hv2],[hv3],[hv4])
+
+AT_CLEANUP
+
+
+AT_SETUP([ovn -- 2 HVs, 2 lports/HV, localnet ports, DVR N-S ARP handling])
+ovn_start
+
+# In this test case we create 3 switches, all connected to the same
+# physical network (through br-phys on each HV). LS1 and LS2 have
+# 1 VIF each. Each HV has 1 VIF port. The first digit
+# of VIF port name indicates the hypervisor it is bound to, e.g.
+# lp23 means VIF 3 on hv2.
+#
+# All the switches are connected to a logical router "router".
+#
+# Each switch's VLAN tag and their logical switch ports are:
+#   - ls1:
+#       - tagged with VLAN 101
+#       - ports: lp11
+#   - ls2:
+#       - tagged with VLAN 201
+#       - ports: lp22
+#   - ls-underlay:
+#       - tagged with VLAN 1000
+# Note: a localnet port is created for each switch to connect to
+# physical network.
+
+for i in 1 2; do
+    ls_name=ls$i
+    ovn-nbctl ls-add $ls_name bridged
+    ln_port_name=ln$i
+    if test $i -eq 1; then
+        ovn-nbctl lsp-add $ls_name $ln_port_name "" 101
+    elif test $i -eq 2; then
+        ovn-nbctl lsp-add $ls_name $ln_port_name "" 201
+    fi
+    ovn-nbctl lsp-set-addresses $ln_port_name unknown
+    ovn-nbctl lsp-set-type $ln_port_name localnet
+    ovn-nbctl lsp-set-options $ln_port_name network_name=phys
+done
+
+# lsp_to_ls LSP
+#
+# Prints the name of the logical switch that contains LSP.
+lsp_to_ls () {
+    case $1 in dnl (
+        lp?[[11]]) echo ls1 ;; dnl (
+        lp?[[12]]) echo ls2 ;; dnl (
+        *) AT_FAIL_IF([:]) ;;
+    esac
+}
+
+vif_to_hv () {
+    case $1 in dnl (
+        vif[[1]]?) echo hv1 ;; dnl (
+        vif[[2]]?) echo hv2 ;; dnl (
+        vif?[[north]]?) echo hv4 ;; dnl (
+        *) AT_FAIL_IF([:]) ;;
+    esac
+}
+
+ip_to_hex() {
+       printf "%02x%02x%02x%02x" "$@"
+}
+
+net_add n1
+for i in 1 2; do
+    sim_add hv$i
+    as hv$i
+    ovs-vsctl add-br br-phys
+    ovs-vsctl set open . external-ids:ovn-bridge-mappings=phys:br-phys
+    ovs-vsctl set open . external-ids:ovn-chassis-mac-mappings="phys:aa:bb:cc:dd:ee:$i$i"
+    ovn_attach n1 br-phys 192.168.0.$i
+
+    ovs-vsctl add-port br-int vif$i$i -- \
+        set Interface vif$i$i external-ids:iface-id=lp$i$i \
+                              options:tx_pcap=hv$i/vif$i$i-tx.pcap \
+                              options:rxq_pcap=hv$i/vif$i$i-rx.pcap \
+                              ofport-request=$i$i
+
+    lsp_name=lp$i$i
+    ls_name=$(lsp_to_ls $lsp_name)
+
+    ovn-nbctl lsp-add $ls_name $lsp_name
+    ovn-nbctl lsp-set-addresses $lsp_name "f0:00:00:00:00:$i$i 192.168.$i.$i"
+    ovn-nbctl lsp-set-port-security $lsp_name f0:00:00:00:00:$i$i
+
+    OVS_WAIT_UNTIL([test x`ovn-nbctl lsp-get-up $lsp_name` = xup])
+
+done
+
+ovn-nbctl ls-add ls-underlay bridged
+ovn-nbctl lsp-add ls-underlay ln3 "" 1000
+ovn-nbctl lsp-set-addresses ln3 unknown
+ovn-nbctl lsp-set-type ln3 localnet
+ovn-nbctl lsp-set-options ln3 network_name=phys
+
+ovn-nbctl ls-add ls-north bridged
+ovn-nbctl lsp-add ls-north ln4 "" 1000
+ovn-nbctl lsp-set-addresses ln4 unknown
+ovn-nbctl lsp-set-type ln4 localnet
+ovn-nbctl lsp-set-options ln4 network_name=phys
+
+# Add a VM on ls-north
+ovn-nbctl lsp-add ls-north lp-north
+ovn-nbctl lsp-set-addresses lp-north "f0:f0:00:00:00:11 172.31.0.10"
+ovn-nbctl lsp-set-port-security lp-north f0:f0:00:00:00:11
+
+# Add 3rd hypervisor
+sim_add hv3
+as hv3 ovs-vsctl add-br br-phys
+as hv3 ovs-vsctl set open . external-ids:ovn-bridge-mappings=phys:br-phys
+as hv3 ovs-vsctl set open . external-ids:ovn-chassis-mac-mappings="phys:aa:bb:cc:dd:ee:33"
+as hv3 ovn_attach n1 br-phys 192.168.0.3
+
+# Add 4th hypervisor
+sim_add hv4
+as hv4 ovs-vsctl add-br br-phys
+as hv4 ovs-vsctl set open . external-ids:ovn-bridge-mappings=phys:br-phys
+as hv4 ovs-vsctl set open . external-ids:ovn-chassis-mac-mappings="phys:aa:bb:cc:dd:ee:44"
+as hv4 ovn_attach n1 br-phys 192.168.0.4
+
+as hv4 ovs-vsctl add-port br-int vif-north -- \
+        set Interface vif-north external-ids:iface-id=lp-north \
+                              options:tx_pcap=hv4/vif-north-tx.pcap \
+                              options:rxq_pcap=hv4/vif-north-rx.pcap \
+                              ofport-request=44
+
+ovn-nbctl lr-add router
+ovn-nbctl lrp-add router router-to-ls1 00:00:01:01:02:03 192.168.1.3/24
+ovn-nbctl lrp-add router router-to-ls2 00:00:01:01:02:05 192.168.2.3/24
+ovn-nbctl lrp-add router router-to-underlay 00:00:01:01:02:07 172.31.0.1/24
+
+ovn-nbctl lsp-add ls1 ls1-to-router -- set Logical_Switch_Port ls1-to-router type=router \
+          options:router-port=router-to-ls1 -- lsp-set-addresses ls1-to-router router
+ovn-nbctl lsp-add ls2 ls2-to-router -- set Logical_Switch_Port ls2-to-router type=router \
+          options:router-port=router-to-ls2 -- lsp-set-addresses ls2-to-router router
+ovn-nbctl lsp-add ls-underlay underlay-to-router -- set Logical_Switch_Port \
+                              underlay-to-router type=router \
+                              options:router-port=router-to-underlay \
+                              -- lsp-set-addresses underlay-to-router router
+
+
+OVN_POPULATE_ARP
+
+# lsp_to_ls LSP
+#
+# Prints the name of the logical switch that contains LSP.
+lsp_to_ls () {
+    case $1 in dnl (
+        lp?[[11]]) echo ls1 ;; dnl (
+        lp?[[12]]) echo ls2 ;; dnl (
+        *) AT_FAIL_IF([:]) ;;
+    esac
+}
+
+vif_to_ls () {
+    case $1 in dnl (
+        vif?[[11]]) echo ls1 ;; dnl (
+        vif?[[12]]) echo ls2 ;; dnl (
+        vif-north) echo ls-north ;; dnl (
+        *) AT_FAIL_IF([:]) ;;
+    esac
+}
+
+hv_to_num () {
+    case $1 in dnl (
+        hv1) echo 1 ;; dnl (
+        hv2) echo 2 ;; dnl (
+        hv3) echo 3 ;; dnl (
+        hv4) echo 4 ;; dnl (
+        *) AT_FAIL_IF([:]) ;;
+    esac
+}
+
+vif_to_num () {
+    case $1 in dnl (
+        vif22) echo 22 ;; dnl (
+        vif21) echo 21 ;; dnl (
+        vif11) echo 11 ;; dnl (
+        *) AT_FAIL_IF([:]) ;;
+    esac
+}
+
+vif_to_hv () {
+    case $1 in dnl (
+        vif[[1]]?) echo hv1 ;; dnl (
+        vif[[2]]?) echo hv2 ;; dnl (
+        vif-north) echo hv4 ;; dnl (
+        *) AT_FAIL_IF([:]) ;;
+    esac
+}
+
+vif_to_lrp () {
+    echo router-to-`vif_to_ls $1`
+}
+
+ip_to_hex() {
+       printf "%02x%02x%02x%02x" "$@"
+}
+
+# Dump a bunch of info helpful for debugging if there's a failure.
+
+echo "------ OVN dump ------"
+ovn-nbctl show
+ovn-sbctl show
+ovn-sbctl list port_binding
+ovn-sbctl list mac_binding
+
+echo "------ hv1 dump ------"
+as hv1 ovs-vsctl show
+as hv1 ovs-vsctl list Open_Vswitch
+
+echo "------ hv2 dump ------"
+as hv2 ovs-vsctl show
+as hv2 ovs-vsctl list Open_Vswitch
+
+echo "------ hv3 dump ------"
+as hv3 ovs-vsctl show
+as hv3 ovs-vsctl list Open_Vswitch
+
+echo "------ hv4 dump ------"
+as hv4 ovs-vsctl show
+as hv4 ovs-vsctl list Open_Vswitch
+
+# test_arp INPORT SHA SPA TPA [REPLY_HA]
+#
+# Causes a packet to be received on INPORT.  The packet is an ARP
+# request with SHA, SPA, and TPA as specified.  If REPLY_HA is provided, then
+# it should be the hardware address of the target to expect to receive in an
+# ARP reply; otherwise no reply is expected.
+#
+# INPORT is a logical switch port number, e.g. 11 for vif11.
+# SHA and REPLY_HA are each 12 hex digits.
+# SPA and TPA are each 8 hex digits.
+test_arp() {
+    local inport=$1 sha=$2 spa=$3 tpa=$4 reply_ha=$5
+    local request=ffffffffffff${sha}08060001080006040001${sha}${spa}ffffffffffff${tpa}
+    hv=`vif_to_hv $inport`
+    as $hv ovs-appctl netdev-dummy/receive $inport $request
+
+    if test X$reply_ha = X; then
+        # Expect to receive the broadcast ARP on the other logical switch ports
+        # if no reply is expected.
+        local i j
+        for i in 1 2 3; do
+            for j in 1 2 3; do
+                if test $i$j != $inport; then
+                    echo $request >> $i$j.expected
+                fi
+            done
+        done
+    else
+        # Expect to receive the reply, if any.
+        local reply=${sha}${reply_ha}08060001080006040002${reply_ha}${tpa}${sha}${spa}
+        local reply_vid=${sha}${reply_ha}810003e808060001080006040002${reply_ha}${tpa}${sha}${spa}
+        echo $reply_vid >> ${inport}_vid.expected
+        echo $reply >> $inport.expected
+    fi
+}
+
+sip=`ip_to_hex 172 31 0 10`
+tip=`ip_to_hex 172 31 0 1`
+
+test_arp vif-north f0f000000011 $sip $tip
+# Confirm that vif-north does not get ARP reply
+AT_CHECK([wc -l hv4/vif-north-tx.pcap | awk '{print $1}'], [0], [[0
+]])
+
+# Set a hypervisor as gateway chassis, for router port 172.31.0.1
+ovn-nbctl lrp-set-gateway-chassis router-to-underlay hv3
+ovn-nbctl --wait=sb sync
+sleep 2
+
+test_arp vif-north f0f000000011 $sip $tip 000001010207
+
+sleep 1
+
+# Confirm that vif-north gets a single ARP reply this time
+OVN_CHECK_PACKETS_REMOVE_BROADCAST([hv4/vif-north-tx.pcap], [vif-north.expected])
+
+# Confirm that only the redirect chassis answered the ARP request.
+OVN_CHECK_PACKETS_REMOVE_BROADCAST([hv3/br-phys_n1-tx.pcap], [vif-north_vid.expected])
+sed -i '/ffffffffffff/d' hv3/br-phys_n1-tx.packets
+AT_CHECK([grep 000001010207 hv3/br-phys_n1-tx.packets | wc -l], [0], [[1
+]])
+
+# Confirm that other OVN chassis did not generate ARP reply.
+$PYTHON "$top_srcdir/utilities/ovs-pcap.in" hv1/br-phys_n1-tx.pcap > hv1/br-phys_n1-tx.packets
+$PYTHON "$top_srcdir/utilities/ovs-pcap.in" hv2/br-phys_n1-tx.pcap > hv2/br-phys_n1-tx.packets
+
+AT_CHECK([grep 000001010207 hv1/br-phys_n1-tx.packets | wc -l], [0], [[0
+]])
+AT_CHECK([grep 000001010207 hv2/br-phys_n1-tx.packets | wc -l], [0], [[0
+]])
+
+echo "----------- Post Traffic hv1 dump -----------"
+as hv1 ovs-ofctl -O OpenFlow13 dump-flows br-int
+as hv1 ovs-appctl fdb/show br-phys
+
+echo "----------- Post Traffic hv2 dump -----------"
+as hv2 ovs-ofctl -O OpenFlow13 dump-flows br-int
+as hv2 ovs-appctl fdb/show br-phys
+
+echo "----------- Post Traffic hv3 dump -----------"
+as hv3 ovs-ofctl -O OpenFlow13 dump-flows br-int
+as hv3 ovs-appctl fdb/show br-phys
+
+echo "----------- Post Traffic hv4 dump -----------"
+as hv4 ovs-ofctl -O OpenFlow13 dump-flows br-int
+as hv4 ovs-appctl fdb/show br-phys
+
+OVN_CLEANUP([hv1],[hv2],[hv3],[hv4])
+
+AT_CLEANUP
--
1.8.3.1
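
For readers decoding the hex strings used by test_arp above, here is a minimal C sketch of how that ARP request is laid out. It is an illustration only and not part of the patch: ip_to_hex mirrors the shell helper of the same name, while arp_request_hex is a hypothetical name introduced here. As in the test, the target hardware address field is filled with ffffffffffff.

```c
#include <stdio.h>
#include <stddef.h>

/* Format four IPv4 octets as 8 hex digits, like the shell ip_to_hex helper.
 * Each octet is expected to be in the range 0..255. */
static void
ip_to_hex(char out[9], unsigned a, unsigned b, unsigned c, unsigned d)
{
    snprintf(out, 9, "%02x%02x%02x%02x", a, b, c, d);
}

/* Hypothetical helper: build the hex dump of a broadcast ARP request.
 * 'sha' is 12 hex digits; 'spa' and 'tpa' are 8 hex digits each.
 * Returns the number of hex characters written (84 on success). */
static int
arp_request_hex(char *out, size_t size,
                const char *sha, const char *spa, const char *tpa)
{
    return snprintf(out, size,
                    "ffffffffffff"  /* Ethernet destination: broadcast. */
                    "%s"            /* Ethernet source: SHA. */
                    "0806"          /* EtherType: ARP. */
                    "0001"          /* Hardware type: Ethernet. */
                    "0800"          /* Protocol type: IPv4. */
                    "06" "04"       /* Hardware and protocol address sizes. */
                    "0001"          /* Opcode: request. */
                    "%s%s"          /* Sender hardware and protocol address. */
                    "ffffffffffff"  /* Target hardware address, as in the test. */
                    "%s",           /* Target protocol address. */
                    sha, sha, spa, tpa);
}
```

The expected reply in the test uses the same layout with opcode 0002 and the sender and target fields swapped; the _vid variant additionally carries the 802.1Q tag 810003e8 (VLAN 1000) after the source MAC.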
Numan Siddique June 10, 2019, 5:18 p.m. UTC | #3
On Fri, Jun 7, 2019 at 5:15 AM Ankur Sharma <ankur.sharma@nutanix.com>
wrote:

> Hi Numan,
>
> Thanks for trying out the patch and providing feedback.
> I am planning to change this series to reflect only the E-W and would send
> out separate patches for N-S improvements.
>
> Following is the reasoning:
>
> ======================
> a. I agree that the network_type construct is adding to the confusion and
> maybe we should rely on optional/external-id based key-value config.
> b. The N-S patch has 3 changes (on a high level) which are distinct from each
> other; having them in a single patch is causing confusion and is
> holding up the rest of the reviewed changes.
> c. Last but not least, there were some gaps in the patch as well.
>
> Here is what I am planning to do:
>
> ==========================
> a. Keep this series for E-W only and remove all the network_type related
> changes from here (including showing type as bridged/vlan).
>


Hi Ankur.

I agree with the approach you are planning to take.



>
> b. For N-S Changes, this series has following changes. I will send them
> out in separate patches, especially the ones which are more of a bug fix.
>     i. Do not allow ARP resolution from physical network unless gateway
> chassis is configured -> More of a bug fix, will be sent as a separate
> standalone patch.
>
>    ii. GARP advertisement during failover in the absence of NAT
> configuration -> More of a bug fix, will be sent as a separate standalone
> patch.
>
>   iii. Periodic GARP advertisement with/without NAT configuration -> New
> feature, will be added along with SNAT changes.
>

This periodic GARP advertisement will be required irrespective of the
network_type of the logical switches connected to a router. So I would
request you to handle this as well when you submit the patch.
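
For reference, the repeat scheduling that the quoted send_garp() hunk adds can be sketched in isolation as below. This is a simplified illustration, not the ovn-controller code: struct garp_sched and garp_next_announce are hypothetical names, and the backoff limit of 16 is an assumption, since the condition guarding the backoff branch is not visible in the quoted hunk.

```c
#include <limits.h>
#include <stdbool.h>

/* Assumed limit on the exponential backoff (seconds); not shown in the diff. */
#define GARP_MAX_BACKOFF 16

/* Hypothetical, trimmed-down stand-in for the scheduling fields of
 * struct garp_data. */
struct garp_sched {
    int backoff;                /* Backoff for the next announcement, in s. */
    long long announce_time;    /* Next announcement time, in ms. */
    bool is_repeat;             /* Restart the burst after it completes. */
    long long repeat_interval;  /* Interval between GARP bursts, in ms. */
};

/* Computes when the next GARP should be sent after one was sent at 'now'.
 * Within a burst the backoff doubles; once the burst is over, a repeating
 * entry restarts after 'repeat_interval', otherwise it never fires again. */
static long long
garp_next_announce(struct garp_sched *g, long long now)
{
    if (g->backoff < GARP_MAX_BACKOFF) {
        g->backoff *= 2;
        g->announce_time = now + g->backoff * 1000LL;
    } else if (g->is_repeat) {
        g->backoff = 1;
        g->announce_time = now + g->repeat_interval;
    } else {
        g->announce_time = LLONG_MAX;
    }
    return g->announce_time;
}
```

With the 3 minute default from the patch, a repeating entry sends a short burst with growing gaps and then starts a new burst 3 minutes after the last announcement of the previous one.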


>   iv. Avoid redirection -> New feature, will come as a separate patchset.
> We will make it an optional feature, i.e. by default even non-NATed traffic
> will go via the gateway chassis, but a config knob can override it.
>

Agreed. This makes sense and is easier.


>    v. No chassis mac replace on gateway chassis -> More of an addendum to
> E-W; I am thinking about clubbing it with some of the N-S changes, as this
> is where it will be relevant.
>
>
>
> c. Will send out separate patch for showing network type as overlay or
> bridged (based on localnet port’s presence), I believe it is good to have
> 😊.
>     i.e we will not have any new column in logical switch table, but the
> output of relevant ovn-nbctl show command will show type as “overlay” or
> “bridged”.
>
>
> Above will allow us to make progress on the changes we are in agreement
> on, while having thorough discussion on the remaining.
>
> Let me know if you are fine with the plan. I should be able to send the E-W
> only changes in a couple of days and should be able to send the individual
> bug fixes soon after as well.
>
>
> For rest of the comments, please find my replies inline.
>
> Appreciate your feedback.
>
> Regards,
> Ankur
>
>
>
>
> *From:* Numan Siddique <nusiddiq@redhat.com>
> *Sent:* Monday, June 3, 2019 3:06 AM
> *To:* Ankur Sharma <ankur.sharma@nutanix.com>
> *Cc:* ovs-dev@openvswitch.org
> *Subject:* Re: [ovs-dev] [PATCH v9 2/2] OVN: Enable N-S Traffic, Vlan
> backed DVR
>
>
>
>
>
> Hi Ankur,
>
>
>
> Please see some comments inline. Please note that I haven't had the chance
> to look into the code in detail. I am first trying to test out the patches.
> (I am on PTO. Expect some delay in my replies.)
>
>
>
>
>
>
>
> On Thu, May 30, 2019 at 5:58 AM Ankur Sharma <ankur.sharma@nutanix.com>
> wrote:
>
> Background:
> [1] https://mail.openvswitch.org/pipermail/ovs-dev/2018-October/353066.html
> [2] https://docs.google.com/document/d/1uoQH478wM1OZ16HrxzbOUvk5LvFnfNEWbkPT6Zmm9OU/edit?usp=sharing
>
> This Series:
> Layer 2, Layer 3 E-W and Layer 3 N-S (NO NAT) changes for vlan
> backed distributed logical router.
>
> This patch:
> For North-South traffic, we need a chassis which will respond to
> ARP requests for the router port coming from outside. For this purpose,
> we will rely upon the gateway-chassis construct in OVN: on a logical
> router port, we will associate one or more chassis as gateway chassis.
>
> One of these chassis will be active at any given time and will become
> the entry point for traffic coming from the outside network and bound
> for endpoints behind the logical router (North to South).
>
> This patch makes some enhancements to the gateway chassis implementation
> to handle the above use case.
>
> A.
> Do not replace router port mac with chassis mac on gateway
> chassis.
> This is done, because:
>     i. Chassisredirect port is NOT a distributed port, hence
>        we need not replace its mac address
>        (which is the same as the router port mac).
>
>    ii. ARP cache will be consistent everywhere, i.e just like
>        endpoints on OVN chassis will see configured router port
>        mac as resolved mac for router port ip, outside endpoints
>        will see that as well.
>
>   iii. For implementing Network Address Translation. Although it is
>        not a part of this series, a follow-up series will add
>        this feature, and the approach will rely upon
>        sending packets to the redirect chassis using the chassisredirect
>        router port mac as the destination mac.
>
> B.
> Advertise router port GARP on gateway chassis.
> This is needed, especially if a failover happens and
> chassisredirect port moves to a new gateway chassis.
> Otherwise, there would be packet drops till outside
> router ARPs for router port ip again.
>
> The intention of this GARP is to update the top of rack (TOR) switch
> so that it directs the router port mac to the new hypervisor.
>
> We could have done the same using RARP as well, but because
> ovn-controller already has an implementation for GARP, it did not
> seem worthwhile to add a RARP implementation just for this.
>
> C.
> For South to North traffic, we need not pass through the gateway
> chassis if no address translation is needed.
>
> For overlay networks, NATing is a must to talk to outside networks.
> However, for vlan backed networks, NATing is not a must, and hence
> in the absence of NATing configuration we need not redirect the packet
> to the gateway chassis.
>
> Signed-off-by: Ankur Sharma <ankur.sharma@nutanix.com>
> ---
>  ovn/controller/physical.c  |  24 +-
>  ovn/controller/pinctrl.c   | 205 +++++++++++--
>  ovn/controller/pinctrl.h   |   6 +
>  ovn/lib/ovn-util.c         |  31 ++
>  ovn/lib/ovn-util.h         |   6 +
>  ovn/northd/ovn-northd.c    |  43 ++-
>  ovn/ovn-architecture.7.xml |  87 +++++-
>  tests/ovn.at               | 732 ++++++++++++++++++++++++++++++++++++++++++++-
>  8 files changed, 1090 insertions(+), 44 deletions(-)
>
> diff --git a/ovn/controller/physical.c b/ovn/controller/physical.c
> index af587a5..1ab5968 100644
> --- a/ovn/controller/physical.c
> +++ b/ovn/controller/physical.c
> @@ -21,6 +21,7 @@
>  #include "lflow.h"
>  #include "lport.h"
>  #include "chassis.h"
> +#include "pinctrl.h"
>  #include "lib/bundle.h"
>  #include "openvswitch/poll-loop.h"
>  #include "lib/uuid.h"
> @@ -238,9 +239,12 @@ get_zone_ids(const struct sbrec_port_binding *binding,
>  }
>
>  static void
> -put_replace_router_port_mac_flows(const struct
> +put_replace_router_port_mac_flows(struct ovsdb_idl_index
> +                                  *sbrec_port_binding_by_name,
> +                                  const struct
>                                    sbrec_port_binding *localnet_port,
>                                    const struct sbrec_chassis *chassis,
> +                                  const struct sset *active_tunnels,
>                                    const struct hmap *local_datapaths,
>                                    struct ofpbuf *ofpacts_p,
>                                    ofp_port_t ofport,
> @@ -281,8 +285,21 @@ put_replace_router_port_mac_flows(const struct
>          char *err_str = NULL;
>          struct match match;
>          struct ofpact_mac *replace_mac;
> +        char *cr_peer_name = xasprintf("cr-%s", rport_binding->logical_port);
>
> -        /* Table 65, priority 150.
> +
> +        if (pinctrl_is_chassis_resident(sbrec_port_binding_by_name,
> +                                        chassis, active_tunnels,
> +                                        cr_peer_name)) {
> +            /* If a router port's chassisredirect port is
> +             * resident on this chassis, then we need not do mac replace. */
> +            free(cr_peer_name);
> +            continue;
> +        }
> +
> +        free(cr_peer_name);
> +
> +       /* Table 65, priority 150.
>           * =======================
>           *
>           * Implements output to localnet port.
> @@ -797,7 +814,8 @@ consider_port_binding(struct ovsdb_idl_index *sbrec_port_binding_by_name,
>                          &match, ofpacts_p, &binding->header_.uuid);
>
>          if (!strcmp(binding->type, "localnet")) {
> -            put_replace_router_port_mac_flows(binding, chassis,
> +            put_replace_router_port_mac_flows(sbrec_port_binding_by_name,
> +                                              binding, chassis, active_tunnels,
>                                                local_datapaths, ofpacts_p,
>                                                ofport, flow_table);
>          }
> diff --git a/ovn/controller/pinctrl.c b/ovn/controller/pinctrl.c
> index b7bb4c9..a145867 100644
> --- a/ovn/controller/pinctrl.c
> +++ b/ovn/controller/pinctrl.c
> @@ -226,6 +226,8 @@ static bool may_inject_pkts(void);
>  COVERAGE_DEFINE(pinctrl_drop_put_mac_binding);
>  COVERAGE_DEFINE(pinctrl_drop_buffered_packets_map);
>
> +#define GARP_DEF_REPEAT_INTERVAL_MS   (3 * 60 * 1000) /* 3 minutes */
> +
>  void
>  pinctrl_init(void)
>  {
> @@ -242,6 +244,25 @@ pinctrl_init(void)
>                                                  &pinctrl);
>  }
>
> +bool
> +pinctrl_is_chassis_resident(struct ovsdb_idl_index *sbrec_port_binding_by_name,
> +                            const struct sbrec_chassis *chassis,
> +                            const struct sset *active_tunnels,
> +                            const char *port_name)
> +{
> +    const struct sbrec_port_binding *pb
> +        = lport_lookup_by_name(sbrec_port_binding_by_name, port_name);
> +    if (!pb || !pb->chassis) {
> +        return false;
> +    }
> +    if (strcmp(pb->type, "chassisredirect")) {
> +        return pb->chassis == chassis;
> +    } else {
> +        return ha_chassis_group_is_active(pb->ha_chassis_group,
> +                                          active_tunnels, chassis);
> +    }
> +}
> +
>  static ovs_be32
>  queue_msg(struct rconn *swconn, struct ofpbuf *msg)
>  {
> @@ -2548,6 +2569,8 @@ struct garp_data {
>      int backoff;                 /* Backoff for the next announcement. */
>      uint32_t dp_key;             /* Datapath used to output this GARP. */
>      uint32_t port_key;           /* Port to inject the GARP into. */
> +    bool is_repeat;              /* Send GARPs continuously */
> +    long long int repeat_interval; /* Interval between GARP bursts in ms */
>  };
>
>  /* Contains GARPs to be sent. Protected by pinctrl_mutex*/
> @@ -2568,7 +2591,8 @@ destroy_send_garps(void)
>  /* Runs with in the main ovn-controller thread context. */
>  static void
>  add_garp(const char *name, const struct eth_addr ea, ovs_be32 ip,
> -         uint32_t dp_key, uint32_t port_key)
> +         uint32_t dp_key, uint32_t port_key, bool is_repeat,
> +         long long int repeat_interval)
>  {
>      struct garp_data *garp = xmalloc(sizeof *garp);
>      garp->ea = ea;
> @@ -2577,6 +2601,8 @@ add_garp(const char *name, const struct eth_addr ea, ovs_be32 ip,
>      garp->backoff = 1;
>      garp->dp_key = dp_key;
>      garp->port_key = port_key;
> +    garp->is_repeat = is_repeat;
> +    garp->repeat_interval = repeat_interval;
>      shash_add(&send_garp_data, name, garp);
>
>      /* Notify pinctrl_handler so that it can wakeup and process
> @@ -2586,7 +2612,8 @@ add_garp(const char *name, const struct eth_addr ea, ovs_be32 ip,
>
>  /* Add or update a vif for which GARPs need to be announced. */
>  static void
> -send_garp_update(const struct sbrec_port_binding *binding_rec,
> +send_garp_update(struct ovsdb_idl_index *sbrec_port_binding_by_name,
> +                 const struct sbrec_port_binding *binding_rec,
>                   struct shash *nat_addresses)
>  {
>      volatile struct garp_data *garp = NULL;
> @@ -2611,7 +2638,7 @@ send_garp_update(const struct sbrec_port_binding *binding_rec,
>                      add_garp(name, laddrs->ea,
>                               laddrs->ipv4_addrs[i].addr,
>                               binding_rec->datapath->tunnel_key,
> -                             binding_rec->tunnel_key);
> +                             binding_rec->tunnel_key, false, 0);
>                  }
>                  free(name);
>              }
> @@ -2621,6 +2648,64 @@ send_garp_update(const struct sbrec_port_binding *binding_rec,
>          return;
>      }
>
> +    /* Update GARPs for local chassisredirect port, if the peer
> +     * layer 2 switch is of type vlan.
> +     */
> +    if (!strcmp(binding_rec->type, "chassisredirect")) {
> +        struct eth_addr mac;
> +        ovs_be32 ip, mask;
> +        uint32_t dp_key = 0;
> +        uint32_t port_key = 0;
> +        const struct sbrec_port_binding *peer_port = NULL;
> +        const struct sbrec_port_binding *distributed_port = NULL;
> +
> +        if (!ovn_sbrec_get_port_binding_ip_mac(binding_rec, &mac,
> +                                               &ip, &mask)) {
> +            /* Router Port binding without ip and mac configured. */
> +            static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 1);
> +            VLOG_WARN_RL(&rl, "cannot send garp, router port binding: %s, "
> +                         "does not have proper ip,mac values: %s",
> +                         binding_rec->logical_port, *binding_rec->mac);
> +            return;
> +        }
> +
> +        const char *lrp_name = smap_get(&binding_rec->options,
> +                                        "distributed-port");
> +        ovs_assert(lrp_name);
> +
> +        distributed_port = lport_lookup_by_name(sbrec_port_binding_by_name,
> +                                                lrp_name);
> +        ovs_assert(distributed_port);
> +
> +        const char *peer_name = smap_get(&distributed_port->options, "peer");
> +        ovs_assert(peer_name);
> +
> +        peer_port = lport_lookup_by_name(sbrec_port_binding_by_name,
> +                                         peer_name);
> +        ovs_assert(peer_port);
> +
> +        const char *network_type = smap_get(&peer_port->datapath->external_ids,
> +                                            "network-type");
> +
> +        /* Advertise GARP only if the logical switch is of type bridged. */
> +        if (!network_type || strcmp(network_type, "bridged")) {
> +            return;
> +        }
> +
> +        dp_key = peer_port->datapath->tunnel_key;
> +        port_key = peer_port->tunnel_key;
> +
> +        garp = shash_find_data(&send_garp_data, binding_rec->logical_port);
> +        if (garp) {
> +            garp->dp_key = dp_key;
> +            garp->port_key = port_key;
> +        } else {
> +            add_garp(binding_rec->logical_port, mac, ip,
> +                     dp_key, port_key, true, GARP_DEF_REPEAT_INTERVAL_MS);
> +        }
> +        return;
> +    }
> +
>      /* Update GARP for vif if it exists. */
>      garp = shash_find_data(&send_garp_data, binding_rec->logical_port);
>      if (garp) {
> @@ -2640,7 +2725,8 @@ send_garp_update(const struct sbrec_port_binding *binding_rec,
>
>          add_garp(binding_rec->logical_port,
>                   laddrs.ea, laddrs.ipv4_addrs[0].addr,
> -                 binding_rec->datapath->tunnel_key, binding_rec->tunnel_key);
> +                 binding_rec->datapath->tunnel_key, binding_rec->tunnel_key,
> +                 false, 0);
>
>          destroy_lport_addresses(&laddrs);
>          break;
> @@ -2702,7 +2788,12 @@ send_garp(struct rconn *swconn, struct garp_data *garp,
>          garp->backoff *= 2;
>          garp->announce_time = current_time + garp->backoff * 1000;
>      } else {
> -        garp->announce_time = LLONG_MAX;
> +        if (garp->is_repeat) {
> +            garp->backoff = 1;
> +            garp->announce_time = current_time + garp->repeat_interval;
> +        } else {
> +            garp->announce_time = LLONG_MAX;
> +        }
>      }
>      return garp->announce_time;
>  }
> @@ -2786,25 +2877,6 @@ get_localnet_vifs_l3gwports(
>      sbrec_port_binding_index_destroy_row(target);
>  }
>
> -static bool
> -pinctrl_is_chassis_resident(struct ovsdb_idl_index *sbrec_port_binding_by_name,
> -                            const struct sbrec_chassis *chassis,
> -                            const struct sset *active_tunnels,
> -                            const char *port_name)
> -{
> -    const struct sbrec_port_binding *pb
> -        = lport_lookup_by_name(sbrec_port_binding_by_name, port_name);
> -    if (!pb || !pb->chassis) {
> -        return false;
> -    }
> -    if (strcmp(pb->type, "chassisredirect")) {
> -        return pb->chassis == chassis;
> -    } else {
> -        return ha_chassis_group_is_active(pb->ha_chassis_group,
> -                                          active_tunnels, chassis);
> -    }
> -}
> -
>  /* Extracts the mac, IPv4 and IPv6 addresses, and logical port from
>   * 'addresses' which should be of the format 'MAC [IP1 IP2 ..]
>   * [is_chassis_resident("LPORT_NAME")]', where IPn should be a valid IPv4
> @@ -2946,6 +3018,67 @@ get_nat_addresses_and_keys(struct ovsdb_idl_index *sbrec_port_binding_by_name,
>  }
>
>  static void
> +get_local_cr_ports(struct ovsdb_idl_index *sbrec_port_binding_by_name,
> +                   struct sset *local_cr_ports,
> +                   struct sset *local_l3gw_ports,
> +                   const struct sbrec_chassis *chassis,
> +                   const struct sset *active_tunnels)
> +{
> +    const char *gw_port;
> +    SSET_FOR_EACH (gw_port, local_l3gw_ports) {
> +        const struct sbrec_port_binding *binding_rec;
> +
> +        binding_rec = lport_lookup_by_name(sbrec_port_binding_by_name,
> +                                           gw_port);
> +        if (!binding_rec) {
> +            continue;
> +        }
> +
> +        /* For the patch port we will send garp for peer's ip and mac. */
> +        if (!strcmp(binding_rec->type, "patch")) {
> +            const struct sbrec_port_binding *cr_port = NULL;
> +
> +            bool is_cr_resident;
> +            struct eth_addr mac;
> +            ovs_be32 ip, mask;
> +
> +            const char *peer_name = smap_get(&binding_rec->options, "peer");
> +            ovs_assert(peer_name);
> +
> +            char *cr_peer_name = xasprintf("cr-%s", peer_name);
> +            cr_port = lport_lookup_by_name(sbrec_port_binding_by_name,
> +                                           cr_peer_name);
> +            free(cr_peer_name);
> +
> +            if (!cr_port) {
> +                continue;
> +            }
> +
> +            is_cr_resident = pinctrl_is_chassis_resident
> +                                (sbrec_port_binding_by_name,
> +                                 chassis,
> +                                 active_tunnels,
> +                                 cr_port->logical_port);
> +            if (!is_cr_resident) {
> +                continue;
> +            }
> +
> +            if (!ovn_sbrec_get_port_binding_ip_mac(cr_port, &mac, &ip,
> +                                                   &mask)) {
> +                /* Router Port binding without ip and mac configured. */
> +                static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 1);
> +                VLOG_WARN_RL(&rl, "cannot send garp, router port binding: %s, "
> +                             "does not have proper ip,mac values: %s",
> +                              cr_port->logical_port, *cr_port->mac);
> +                return;
> +            }
> +
> +            sset_add(local_cr_ports, cr_port->logical_port);
> +        }
> +    }
> +}
> +
> +static void
>  send_garp_wait(long long int send_garp_time)
>  {
>      /* Set the poll timer for next garp only if there is garp data to
> @@ -2990,6 +3123,8 @@ send_garp_prepare(struct ovsdb_idl_index *sbrec_port_binding_by_datapath,
>  {
>      struct sset localnet_vifs = SSET_INITIALIZER(&localnet_vifs);
>      struct sset local_l3gw_ports = SSET_INITIALIZER(&local_l3gw_ports);
> +    struct sset local_cr_ports = SSET_INITIALIZER(&local_cr_ports);
> +
>      struct sset nat_ip_keys = SSET_INITIALIZER(&nat_ip_keys);
>      struct shash nat_addresses;
>
> @@ -3004,11 +3139,17 @@ send_garp_prepare(struct ovsdb_idl_index *sbrec_port_binding_by_datapath,
>                                 &nat_ip_keys, &local_l3gw_ports,
>                                 chassis, active_tunnels,
>                                 &nat_addresses);
> +
> +    get_local_cr_ports(sbrec_port_binding_by_name,
> +                       &local_cr_ports, &local_l3gw_ports,
> +                       chassis, active_tunnels);
> +
>      /* For deleted ports and deleted nat ips, remove from send_garp_data. */
>      struct shash_node *iter, *next;
>      SHASH_FOR_EACH_SAFE (iter, next, &send_garp_data) {
>          if (!sset_contains(&localnet_vifs, iter->name) &&
> -            !sset_contains(&nat_ip_keys, iter->name)) {
> +            !sset_contains(&nat_ip_keys, iter->name) &&
> +            !sset_contains(&local_cr_ports, iter->name)) {
>              send_garp_delete(iter->name);
>          }
>      }
> @@ -3019,7 +3160,7 @@ send_garp_prepare(struct ovsdb_idl_index *sbrec_port_binding_by_datapath,
>          const struct sbrec_port_binding *pb = lport_lookup_by_name(
>              sbrec_port_binding_by_name, iface_id);
>          if (pb) {
> -            send_garp_update(pb, &nat_addresses);
> +            send_garp_update(sbrec_port_binding_by_name, pb, &nat_addresses);
>          }
>      }
>
> @@ -3029,7 +3170,17 @@ send_garp_prepare(struct ovsdb_idl_index *sbrec_port_binding_by_datapath,
>          const struct sbrec_port_binding *pb
>              = lport_lookup_by_name(sbrec_port_binding_by_name, gw_port);
>          if (pb) {
> -            send_garp_update(pb, &nat_addresses);
> +            send_garp_update(sbrec_port_binding_by_name, pb, &nat_addresses);
> +        }
> +    }
> +
> +    /* Update send_garp_data for chassisredirect router ports. */
> +    const char *cr_port;
> +    SSET_FOR_EACH (cr_port, &local_cr_ports) {
> +        const struct sbrec_port_binding *pb
> +            = lport_lookup_by_name(sbrec_port_binding_by_name, cr_port);
> +        if (pb) {
> +            send_garp_update(sbrec_port_binding_by_name, pb, &nat_addresses);
>          }
>      }
>
> diff --git a/ovn/controller/pinctrl.h b/ovn/controller/pinctrl.h
> index f61d705..92f704e 100644
> --- a/ovn/controller/pinctrl.h
> +++ b/ovn/controller/pinctrl.h
> @@ -44,4 +44,10 @@ void pinctrl_run(struct ovsdb_idl_txn *ovnsb_idl_txn,
>  void pinctrl_wait(struct ovsdb_idl_txn *ovnsb_idl_txn);
>  void pinctrl_destroy(void);
>
> +bool
> +pinctrl_is_chassis_resident(struct ovsdb_idl_index *sbrec_port_binding_by_name,
> +                            const struct sbrec_chassis *chassis,
> +                            const struct sset *active_tunnels,
> +                            const char *port_name);
> +
>  #endif /* ovn/pinctrl.h */
> diff --git a/ovn/lib/ovn-util.c b/ovn/lib/ovn-util.c
> index 0f07d80..3d0ad8e 100644
> --- a/ovn/lib/ovn-util.c
> +++ b/ovn/lib/ovn-util.c
> @@ -16,6 +16,7 @@
>  #include "ovn-util.h"
>  #include "dirs.h"
>  #include "openvswitch/vlog.h"
> +#include "openvswitch/ofp-parse.h"
>  #include "ovn/lib/ovn-nb-idl.h"
>  #include "ovn/lib/ovn-sb-idl.h"
>
> @@ -371,3 +372,33 @@ ovn_logical_flow_hash(const struct uuid *logical_datapath,
>      hash = hash_string(match, hash);
>      return hash_string(actions, hash);
>  }
> +
> +/*  Extracts the mac, ip and mask for a sbrec_port_binding.
> + *
> + *  Expects following format:
> + *  "MAC_ADDRESS IP/MASK"
> + *
> + *  Return true if MAC, IP and MASK are found, false otherwise.
> + */
> +bool
> +ovn_sbrec_get_port_binding_ip_mac(const struct sbrec_port_binding *binding,
> +                                  struct eth_addr *mac,
> +                                  ovs_be32 *ip, ovs_be32 *mask)
> +{
> +    char *err_str = NULL;
> +
> +    err_str = str_to_mac(binding->mac[0], mac);
> +    if (err_str) {
> +        free(err_str);
> +        return false;
> +    }
> +
> +    err_str = ip_parse_masked(binding->mac[0] + ETH_ADDR_STRLEN + 1,
> +                              ip, mask);
> +    if (err_str) {
> +        free(err_str);
> +        return false;
> +    }
> +
> +    return true;
> +}
> diff --git a/ovn/lib/ovn-util.h b/ovn/lib/ovn-util.h
> index 6d5e1df..c01595a 100644
> --- a/ovn/lib/ovn-util.h
> +++ b/ovn/lib/ovn-util.h
> @@ -19,6 +19,7 @@
>  #include "lib/packets.h"
>
>  struct nbrec_logical_router_port;
> +struct sbrec_port_binding;
>  struct sbrec_logical_flow;
>  struct uuid;
>
> @@ -81,4 +82,9 @@ uint32_t ovn_logical_flow_hash(const struct uuid *logical_datapath,
>                                 uint16_t priority,
>                                 const char *match, const char *actions);
>
> +bool
> +ovn_sbrec_get_port_binding_ip_mac(const struct sbrec_port_binding *binding,
> +                                  struct eth_addr *mac, ovs_be32 *ip,
> +                                  ovs_be32 *mask);
> +
>  #endif
> diff --git a/ovn/northd/ovn-northd.c b/ovn/northd/ovn-northd.c
> index 74d3692..6835910 100644
> --- a/ovn/northd/ovn-northd.c
> +++ b/ovn/northd/ovn-northd.c
> @@ -5914,6 +5914,20 @@ build_lrouter_flows(struct hmap *datapaths, struct hmap *ports,
>                      ds_put_format(&match, " && is_chassis_resident(%s)",
>                                    op->od->l3redirect_port->json_key);
>                  }
> +            } else if (op->peer &&
> +                       op->peer->od->network_type == DP_NETWORK_BRIDGED) {
> +                /* For a router port connected to bridged logical switch,
> +                 * we will always have the is_chassis_resident check.
> +                 * This is because there could be vm/server on vlan network,
> +                 * but not on OVN chassis and could end up arping for router
> +                 * port ip.
> +                 *
> +                 * This check works on the assumption that for OVN chassis,
> +                 * VMs logical switch ARP responder will respond to ARP
> +                 * requests for router port IP.
> +                 */
> +                ds_put_format(&match, " && is_chassis_resident(\"cr-%s\")",
> +                              op->key);
>              }
>
>              ds_clear(&actions);
> @@ -7365,18 +7379,23 @@ build_lrouter_flows(struct hmap *datapaths, struct
> hmap *ports,
>              ovn_lflow_add(lflows, od, S_ROUTER_IN_GW_REDIRECT, 300,
>                            REGBIT_DISTRIBUTED_NAT" == 1", "next;");
>
> -            /* For traffic with outport == l3dgw_port, if the
> -             * packet did not match any higher priority redirect
> -             * rule, then the traffic is redirected to the central
> -             * instance of the l3dgw_port. */
> -            ds_clear(&match);
> -            ds_put_format(&match, "outport == %s",
> -                          od->l3dgw_port->json_key);
> -            ds_clear(&actions);
> -            ds_put_format(&actions, "outport = %s; next;",
> -                          od->l3redirect_port->json_key);
> -            ovn_lflow_add(lflows, od, S_ROUTER_IN_GW_REDIRECT, 50,
> -                          ds_cstr(&match), ds_cstr(&actions));
> +            /* For VLAN backed networks, default match will not redirect
> to
> +             * chassis redirect port. */
> +            if (od->l3dgw_port->peer &&
> +                od->l3dgw_port->peer->od->network_type ==
> DP_NETWORK_OVERLAY) {
> +                /* For traffic with outport == l3dgw_port, if the
> +                 * packet did not match any higher priority redirect
> +                 * rule, then the traffic is redirected to the central
> +                 * instance of the l3dgw_port. */
> +                ds_clear(&match);
> +                ds_put_format(&match, "outport == %s",
> +                              od->l3dgw_port->json_key);
> +                ds_clear(&actions);
> +                ds_put_format(&actions, "outport = %s; next;",
> +                              od->l3redirect_port->json_key);
> +                ovn_lflow_add(lflows, od, S_ROUTER_IN_GW_REDIRECT, 50,
> +                              ds_cstr(&match), ds_cstr(&actions));
> +            }
>
>
>
> Looks like this code is having some side effects.
>
>
>
>
>
> Point 1.
>
> ======
>
> For my public switch if I don't set the network_type as "bridged",
>
> then I see the below logical flows and think this is as expected. And I
> think
>
> that's why in my v7 tests the packets were tunneled to the gw chassis (as
>
> you mentioned in the reply).
>
>
>
> ****
>
> table=12(lr_in_gw_redirect  ), priority=300  , match=(reg9[2] == 1),
> action=(next;)
>   table=12(lr_in_gw_redirect  ), priority=200  , match=(reg9[0] == 1),
> action=(outport = "cr-lr0-public"; next;)
>   table=12(lr_in_gw_redirect  ), priority=150  , match=(outport ==
> "lr0-public" && eth.dst == 00:00:00:00:00:00), action=(outport =
> "cr-lr0-public"; next;)
>   table=12(lr_in_gw_redirect  ), priority=50   , match=(outport ==
> "lr0-public"), action=(outport = "cr-lr0-public"; next;)
>   table=12(lr_in_gw_redirect  ), priority=0    , match=(1), action=(next;)
>
> ****
>
>
>
> If I set the type as "bridged", I see the below flows
>
>
>
> ****
>
>  table=12(lr_in_gw_redirect  ), priority=300  , match=(reg9[2] == 1),
> action=(next;)
>   table=12(lr_in_gw_redirect  ), priority=200  , match=(reg9[0] == 1),
> action=(outport = "cr-lr0-public"; next;)
>   table=12(lr_in_gw_redirect  ), priority=150  , match=(outport ==
> "lr0-sw1" && reg0 == 20.0.0.3 && eth.dst == 00:00:00:00:00:00),
> action=(eth.dst = 40:54:00:00:00:03; next;)
>   table=12(lr_in_gw_redirect  ), priority=0    , match=(1), action=(next;)
>
> ****
>
>
>
> I don't understand the 3rd flow with the match -- "outport == "lr0-sw1"...
>
>
>
> Looks like the "match" and "action" variables have some old data. Please
> look into the code again.
> [ANKUR]:
> Yup, missed out on clearing match and actions, thanks for calling it out.
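
The stale-data symptom described above can be reproduced in isolation. The following is a minimal sketch, not the ovn-northd code: `struct buf`, `buf_clear` and `buf_put` are hypothetical stand-ins for OVS's `struct ds`, `ds_clear` and `ds_put_format`, and `fixed_match` shows only one possible way to avoid the problem (always rebuilding the match before appending to it).

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for OVS's "struct ds"; fixed-size for brevity. */
struct buf {
    char s[256];
    size_t len;
};

static void
buf_clear(struct buf *b)
{
    assert(b != NULL);
    b->len = 0;
    b->s[0] = '\0';
}

/* Appends formatted text, truncating silently if the buffer is full. */
static void
buf_put(struct buf *b, const char *fmt, ...)
{
    size_t room = sizeof b->s - b->len;   /* Includes the NUL byte. */
    va_list ap;

    va_start(ap, fmt);
    int n = vsnprintf(b->s + b->len, room, fmt, ap);
    va_end(ap);
    if (n > 0) {
        b->len += (size_t) n < room ? (size_t) n : room - 1;
    }
}

/* Shape of the reviewed hunk: "match" is rebuilt only for overlay
 * networks, but the eth.dst clause is appended unconditionally. */
static void
buggy_match(struct buf *match, const char *outport, int is_overlay)
{
    if (is_overlay) {
        buf_clear(match);
        buf_put(match, "outport == \"%s\"", outport);
    }
    buf_put(match, " && eth.dst == 00:00:00:00:00:00");
}

/* One possible fix (an assumption, not the v10 change): always rebuild
 * the match before appending the eth.dst clause. */
static void
fixed_match(struct buf *match, const char *outport)
{
    buf_clear(match);
    buf_put(match, "outport == \"%s\"", outport);
    buf_put(match, " && eth.dst == 00:00:00:00:00:00");
}
```

If `match` still holds `outport == "lr0-sw1" && reg0 == 20.0.0.3` from an earlier flow, calling `buggy_match(&m, "lr0-public", 0)` produces a match that still names lr0-sw1, which has the same shape as the unexpected priority-150 flow in the "bridged" dump.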
>
>
>
> After the "if" condition you added in this patch at line 7384, the below
> code is still there and it doesn't make sense
>
>
>
> ******
>
>              /* For VLAN backed networks, default match will not redirect
> to
>              * chassis redirect port. */
>             if (od->l3dgw_port->peer &&
>                 od->l3dgw_port->peer->od->network_type ==
> DP_NETWORK_OVERLAY) {
>                 /* For traffic with outport == l3dgw_port, if the
>                  * packet did not match any higher priority redirect
>                  * rule, then the traffic is redirected to the central
>                  * instance of the l3dgw_port. */
>                 ds_clear(&match);
>                 ds_put_format(&match, "outport == %s",
>                               od->l3dgw_port->json_key);
>                 ds_clear(&actions);
>                 ds_put_format(&actions, "outport = %s; next;",
>                               od->l3redirect_port->json_key);
>                 ovn_lflow_add(lflows, od, S_ROUTER_IN_GW_REDIRECT, 50,
>                               ds_cstr(&match), ds_cstr(&actions));
>             }
>
>             /* If the Ethernet destination has not been resolved,
>              * redirect to the central instance of the l3dgw_port.
>              * Such traffic will be replaced by an ARP request or ND
>              * Neighbor Solicitation in the ARP request ingress
>              * table, before being redirected to the central instance.
>              */
>             ds_put_format(&match, " && eth.dst == 00:00:00:00:00:00");
>   ====> THIS ONE
>             ovn_lflow_add(lflows, od, S_ROUTER_IN_GW_REDIRECT, 150,
>  ====> AND THIS ONE
>                           ds_cstr(&match), ds_cstr(&actions));
>         }
>
> [ANKUR]:
> Intention here was to just remove the flow which was sending out anything
> directed to router port to chassis redirect router port.
>
>
>
> ********
>
>
>
> Point 2
>
> =====
>
>
>
> This patch breaks the S/N traffic if we have a logical switch (sw0) of
> type overlay connected
>
> to a router and the router also has a gw port connected to a logical switch
> (public) of type bridged (i.e. a provider network).
>
> This public switch has a localnet port.
>
>
>
> Something like this - http://paste.openstack.org/show/752427/
>
>
>
> It works fine if I change the type of the logical switch - public to
> overlay. But this doesn't make sense, since
>
> the logical switch - public is a provider (or bridged) network and CMS can
> set the type as bridged.
>
>
>
> I still think it's better not to have a "network_type" column in
> logical_switch. We can always consider a logical
>
> switch that has a localnet port to be of type "bridged" and one without a
> localnet port to be of type "overlay".
>
> [ANKUR]:
> Agreed, the nomenclature and usage of the type field are confusing. It will be
> difficult to convey to, or expect of, the CMS that it
>
> will NOT end up using the field even where it was not supposed to. As I
> mentioned in the email reply, I will be removing it
>
> from the current patch series, and we will have separate config knobs for the
> use cases this field was added for.
>
>
>
> This patch series sets the network_type=bridged in the external_ids of the
> datapath_binding row in SB DB.
>
>
>
> Please see my comments in v4 of the patch 1 where I suggested something
> like below
>
>
>
> ****
>
> enum ovn_datapath_nw_type {
>
>     DP_NETWORK_OVERLAY,
>
>     DP_NETWORK_PROVIDER
>
> };
>
>
>
> static void
>
> ovn_datapath_update_nw_type(struct ovn_datapath *od)
>
> {
>
>     if (!od->nbs) {
>
>         return;
>
>     }
>
>
>
>     if (!od->localnet_port) {
>
>         od->network_type = DP_NETWORK_OVERLAY;
>
>     } else {
>
>         od->network_type = DP_NETWORK_PROVIDER;
>
>     }
>
> }
>
> ******
>
>
>
> I think you can still set the external_ids of the datapath_binding row
> with "network_type=bridged"
>
> if od->network_type is BRIDGED, so that ovn-controller can distinguish
> whether it's a bridged or an overlay datapath.
>
>
>
>
>
> I am mainly thinking from an upgrade perspective for the existing
> deployments once this patch series is applied.
>
> Until CMS changes the network_type to "bridged" for all the logical
> switches with localnet ports in the
>
> existing deployments, "ovn-nbctl show" will show these logical switches as
> "overlay" which is weird.
>
> And later we may encounter other issues when enhancing OVN with new
> features.
>
> [ANKUR]:
> Yes, the value is quite easy to get confused with, I will be removing it
> in v10.
>
>
>
> I think instead of adding the code to skip the redirection to the gateway
> chassis in ovn-northd if it's a bridged network,
>
> it's better to handle it in table 32, and since the mac replacement is
> handled in table 65, it probably makes more sense this way.
> [ANKUR]:
> IMO, Table 32 should only decide if redirection has to be done on overlay
> or vlan. While, logical flow should decide if redirection
>
> is needed or not.
>
>
>
> Thanks
>
> Numan
>
>
>
>
>
>              /* If the Ethernet destination has not been resolved,
>               * redirect to the central instance of the l3dgw_port.
> diff --git a/ovn/ovn-architecture.7.xml b/ovn/ovn-architecture.7.xml
> index 6275db1..6df711e 100644
> --- a/ovn/ovn-architecture.7.xml
> +++ b/ovn/ovn-architecture.7.xml
> @@ -1441,7 +1441,7 @@
>      </li>
>    </ol>
>
> -  <h3>External traffic</h3>
> +  <h3>External traffic (NAT)</h3>
>
>    <p>
>      The following happens when a VM sends an external traffic (which
> requires
> @@ -1607,6 +1607,91 @@
>      </li>
>    </ol>
>
> +  <h3>External traffic (NO NAT)</h3>
> +  <p>
> +    The following happens when a VM sends an external traffic (i.e to non
> +    logical router connected network), but there is no need for NATing.
> +  </p>
> +
> +  <p>
> +    Since, there is no NATing required, hence we need not redirect the
> packet
> +    to a gateway chassis. As a result, this packet flow is same as
> East-West.
> +    In order to ensure that OVN will not redirect the packet over a tunnel
> +    to gateway-chassis, "network_type" of destination localnet logical
> switch,
> +    should be set as "bridged". A "bridged" logical switch ensures that
> there
> +    is no tunnel encapsulation done while forwarding the packet on it.
> +    Please refer to <code>ovn-nb</code>(5) for more details.
> +  </p>
> +
> +  <ol>
> +    <li>
> +      It first enters the ingress pipeline, and then egress pipeline of
> the
> +      source localnet logical switch datapath. It then enters the ingress
> +      pipeline of the logical router datapath via the logical router port
> in
> +      the source chassis.
> +    </li>
> +
> +    <li>
> +      Routing decision is taken. Since, destination network is NOT
> directly
> +      connected to logical router, hence a static route is expected, which
> will
> +      provide next hop ip.
> +    </li>
> +
> +    <li>
> +      From the router datapath, packet enters the ingress pipeline and
> then
> +      egress pipeline of the destination localnet logical switch datapath
> +      (it is of type "bridged" and this is where the next hop is present)
> +      and goes out of the integration bridge to the provider bridge (
> +      belonging to the destination logical switch) via the localnet port.
> +      Same as East-West, source mac will be replaced with chassis mac.
> +    </li>
> +  </ol>
> +
> +  <p>
> +    The following happens for the reverse external traffic.
> +  </p>
> +
> +  <ol>
> +    <li>
> +      The gateway chassis receives the packet from the localnet port of
> +      the logical switch (bridged type) which provides external
> connectivity.
> +      The packet then enters the ingress pipeline and then egress
> pipeline of
> +      the localnet logical switch (which provides external connectivity).
> +      The packet then enters the ingress pipeline of the logical router
> +      datapath.
> +    </li>
> +
> +    <li>
> +      Routing decision is taken and logical switch of destination VM is
> +      identified.
> +    </li>
> +
> +    <li>
> +      The packet then enters the ingress pipeline and then egress
> +      pipeline of VM's localnet logical switch. Since the source VM
> +      doesn't reside in the gateway chassis, the packet is sent out via
> the
> +      localnet port of the VM's logical switch. Source mac of this packet
> +      will be replaced with chassis unique mac.
> +    </li>
> +
> +    <li>
> +      VM's chassis receives the packet via the localnet port and
> +      sends it to the integration bridge. The packet enters the
> +      ingress pipeline and then egress pipeline of the localnet
> +      logical switch and finally gets delivered to the VM port.
> +    </li>
> +  </ol>
> +
> +  <p>
> +    One thing to note here is that, while VM to External traffic did not
> +    require redirection to gateway chassis, the reverse traffic is through
> +    gateway chassis only. This is because, for external router, OVN
> logical
> +    router port IP will be the next hop to reach the endpoints behind it.
> +    As a result, we need a centralized chassis, which will respond to ARP
> +    requests coming from external network. This centralized chassis, is
> the
> +    gateway chassis which is attached to corresponding router port.
> +  </p>
> +
>    <h2>Life Cycle of a VTEP gateway</h2>
>
>    <p>
> diff --git a/tests/ovn.at b/tests/ovn.at
> index e5108a7..8a03393 100644
> --- a/tests/ovn.at
> +++ b/tests/ovn.at
> @@ -29,6 +29,12 @@ m4_define([OVN_CHECK_PACKETS],
>    [ovn_check_packets__ "$1" "$2"
>     AT_CHECK([sort $rcv_text], [0], [expout])])
>
> +m4_define([OVN_CHECK_PACKETS_REMOVE_BROADCAST],
> +  [ovn_check_packets__ "$1" "$2"
> +   echo "received_text=$rcv_text"
> +   sed -i '/ffffffffffff/d' $rcv_text
> +   AT_CHECK([sort $rcv_text], [0], [expout])])
> +
>  AT_BANNER([OVN components])
>
>  AT_SETUP([ovn -- lexer])
> @@ -14018,7 +14024,7 @@ ovn-hv4-0
>  OVN_CLEANUP([hv1], [hv2], [hv3])
>  AT_CLEANUP
>
> -AT_SETUP([ovn -- 2 HVs, 2 lports/HV, localnet ports, DVR chassis mac])
> +AT_SETUP([ovn -- 2 HVs, 2 lports/HV, localnet ports, DVR E-W chassis mac])
>  ovn_start
>
>
> @@ -14028,6 +14034,8 @@ ovn_start
>  # of VIF port name indicates the hypervisor it is bound to, e.g.
>  # lp23 means VIF 3 on hv2.
>  #
> +# Both the switches are connected to a logical router "router".
> +#
>  # Each switch's VLAN tag and their logical switch ports are:
>  #   - ls1:
>  #       - tagged with VLAN 101
> @@ -14185,6 +14193,7 @@ test_ip() {
>  echo "------ OVN dump ------"
>  ovn-nbctl show
>  ovn-sbctl show
> +ovn-sbctl list port_binding
>
>  echo "------ hv1 dump ------"
>  as hv1 ovs-vsctl show
> @@ -14211,6 +14220,727 @@ as hv2 ovs-appctl fdb/show br-phys
>
>  OVN_CHECK_PACKETS([hv2/vif22-tx.pcap], [vif22.expected])
>
> +
> +# Associate a chassis as gateway chassis and validate garp.
> +
> +OVN_CLEANUP([hv1],[hv2])
> +
> +AT_CLEANUP
> +
> +
> +AT_SETUP([ovn -- 2 HVs, 2 lports/HV, localnet ports, DVR N-S GARP])
> +ovn_start
> +
> +
> +# In this test cases we create 2 switches, all connected to same
> +# physical network (through br-phys on each HV). Each switch has
> +# 1 VIF. Each HV has 1 VIF port. The first digit
> +# of VIF port name indicates the hypervisor it is bound to, e.g.
> +# lp23 means VIF 3 on hv2.
> +#
> +# Both the switches are connected to a logical router "router".
> +#
> +# Additionally, we create a logical switch (ls-underlay) for N-S traffic.
> +#
> +# Each switch's VLAN tag and their logical switch ports are:
> +#   - ls1:
> +#       - tagged with VLAN 101
> +#       - ports: lp11
> +#   - ls2:
> +#       - tagged with VLAN 201
> +#       - ports: lp22
> +#   - ls-underlay:
> +#       - tagged with VLAN 1000
> +#
> +# Note: a localnet port is created for each switch to connect to
> +# physical network.
> +# lsp_to_ls LSP
> +#
> +# Prints the name of the logical switch that contains LSP.
> +
> +net_add n1
> +for i in 1 2; do
> +    sim_add hv$i
> +    as hv$i
> +    ovs-vsctl add-br br-phys
> +    ovs-vsctl set open . external-ids:ovn-bridge-mappings=phys:br-phys
> +    ovs-vsctl set open .
> external-ids:ovn-chassis-mac-mappings="phys:aa:bb:cc:dd:ee:$i$i"
> +    ovs-vsctl set open . external-ids:system-id="HV$i"
> +    ovn_attach n1 br-phys 192.168.0.$i
> +    ovs-vsctl set-controller br-int ptcp:
> +    AT_CHECK([ovs-vsctl add-port br-phys snoopvif -- set Interface
> snoopvif options:tx_pcap=hv$i/snoopvif-tx.pcap
> options:rxq_pcap=hv$i/snoopvif-rx.pcap])
> +done
> +
> +ovn-nbctl ls-add ls-underlay bridged
> +ovn-nbctl lsp-add ls-underlay ln3 "" 1000
> +ovn-nbctl lsp-set-addresses ln3 unknown
> +ovn-nbctl lsp-set-type ln3 localnet
> +ovn-nbctl lsp-set-options ln3 network_name=phys
> +
> +ovn-nbctl lr-add router
> +ovn-nbctl lrp-add router router-to-underlay 00:00:01:01:02:07 172.31.0.1/24
> +
> +ovn-nbctl lsp-add ls-underlay underlay-to-router -- set
> Logical_Switch_Port \
> +                              underlay-to-router type=router \
> +                              options:router-port=router-to-underlay \
> +                              -- lsp-set-addresses underlay-to-router
> router
> +
> +ovn-nbctl --wait=sb sync
> +
> +# Associate hv2 as gateway chassis
> +ovn-nbctl lrp-set-gateway-chassis router-to-underlay hv2
> +
> +ovn-nbctl show
> +ovn-sbctl show
> +
> +# Dump a bunch of info helpful for debugging if there's a failure.
> +
> +echo "------ OVN dump ------"
> +ovn-nbctl show
> +ovn-sbctl show
> +
> +echo "------ hv1 dump ------"
> +as hv1 ovs-vsctl show
> +as hv1 ovs-vsctl list Open_Vswitch
> +
> +echo "------ hv2 dump ------"
> +as hv2 ovs-vsctl show
> +as hv2 ovs-vsctl list Open_Vswitch
> +
> +sleep 1
> +
> +echo "----------- Post Traffic hv1 dump -----------"
> +as hv1 ovs-ofctl -O OpenFlow13 dump-flows br-int
> +as hv1 ovs-appctl fdb/show br-phys
> +
> +echo "----------- Post Traffic hv2 dump -----------"
> +as hv2 ovs-ofctl -O OpenFlow13 dump-flows br-int
> +as hv2 ovs-appctl fdb/show br-phys
> +
> +AT_CHECK([as hv2 ovs-appctl fdb/show br-phys | grep 00:00:01:01:02:07 |
> grep 1000 | wc -l], [0], [[1
> +]])
> +
> +echo
> "ffffffffffff000001010207810003e808060001080006040001000001010207ac1f0001000000000000ac1f0001"
> > expected
> +OVN_CHECK_PACKETS([hv2/snoopvif-tx.pcap], [expected])
> +
>  OVN_CLEANUP([hv1],[hv2])
>
>  AT_CLEANUP
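
The expected hex string in the GARP test above can be read field by field. The sketch below rebuilds it from the router port MAC, IP and VLAN tag used in the test; `garp_hex` is a hypothetical helper written for illustration and is not part of OVS or OVN.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Formats a VLAN-tagged gratuitous ARP request as lowercase hex into
 * "out".  Returns the number of characters written (excluding the NUL),
 * or -1 if "out" is too small.  Hypothetical helper for illustration. */
static int
garp_hex(char *out, size_t size, const uint8_t mac[6], uint32_t ip,
         uint16_t vlan)
{
    int n = snprintf(out, size,
                     "ffffffffffff"              /* Ethernet dst: broadcast. */
                     "%02x%02x%02x%02x%02x%02x"  /* Ethernet src: port MAC. */
                     "8100%04x"                  /* 802.1Q TPID + VLAN ID. */
                     "0806"                      /* EtherType: ARP. */
                     "0001" "0800" "06" "04"     /* htype, ptype, hlen, plen. */
                     "0001"                      /* Opcode: request. */
                     "%02x%02x%02x%02x%02x%02x"  /* Sender MAC. */
                     "%08x"                      /* Sender IP. */
                     "000000000000"              /* Target MAC: unset. */
                     "%08x",                     /* Target IP == sender IP. */
                     mac[0], mac[1], mac[2], mac[3], mac[4], mac[5],
                     (unsigned int) (vlan & 0x0fff),
                     mac[0], mac[1], mac[2], mac[3], mac[4], mac[5],
                     (unsigned int) ip, (unsigned int) ip);

    return (n < 0 || (size_t) n >= size) ? -1 : n;
}
```

Calling it with MAC 00:00:01:01:02:07, IP 172.31.0.1 (0xac1f0001) and VLAN 1000 yields the same 92-character string that the test writes to `expected`.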
> +
> +
> +AT_SETUP([ovn -- 2 HVs, 2 lports/HV, localnet ports, DVR N-S Ping])
> +ovn_start
> +
> +# In this test cases we create 3 switches, all connected to same
> +# physical network (through br-phys on each HV). LS1 and LS2 have
> +# 1 VIF each. Each HV has 1 VIF port. The first digit
> +# of VIF port name indicates the hypervisor it is bound to, e.g.
> +# lp23 means VIF 3 on hv2.
> +#
> +# All the switches are connected to a logical router "router".
> +#
> +# Each switch's VLAN tag and their logical switch ports are:
> +#   - ls1:
> +#       - tagged with VLAN 101
> +#       - ports: lp11
> +#   - ls2:
> +#       - tagged with VLAN 201
> +#       - ports: lp22
> +#   - ls-underlay:
> +#       - tagged with VLAN 1000
> +# Note: a localnet port is created for each switch to connect to
> +# physical network.
> +
> +for i in 1 2; do
> +    ls_name=ls$i
> +    ovn-nbctl ls-add $ls_name bridged
> +    ln_port_name=ln$i
> +    if test $i -eq 1; then
> +        ovn-nbctl lsp-add $ls_name $ln_port_name "" 101
> +    elif test $i -eq 2; then
> +        ovn-nbctl lsp-add $ls_name $ln_port_name "" 201
> +    fi
> +    ovn-nbctl lsp-set-addresses $ln_port_name unknown
> +    ovn-nbctl lsp-set-type $ln_port_name localnet
> +    ovn-nbctl lsp-set-options $ln_port_name network_name=phys
> +done
> +
> +# lsp_to_ls LSP
> +#
> +# Prints the name of the logical switch that contains LSP.
> +lsp_to_ls () {
> +    case $1 in dnl (
> +        lp?[[11]]) echo ls1 ;; dnl (
> +        lp?[[12]]) echo ls2 ;; dnl (
> +        *) AT_FAIL_IF([:]) ;;
> +    esac
> +}
> +
> +vif_to_hv () {
> +    case $1 in dnl (
> +        vif[[1]]?) echo hv1 ;; dnl (
> +        vif[[2]]?) echo hv2 ;; dnl (
> +        vif?[[north]]?) echo hv4 ;; dnl (
> +        *) AT_FAIL_IF([:]) ;;
> +    esac
> +}
> +
> +ip_to_hex() {
> +       printf "%02x%02x%02x%02x" "$@"
> +}
> +
> +net_add n1
> +for i in 1 2; do
> +    sim_add hv$i
> +    as hv$i
> +    ovs-vsctl add-br br-phys
> +    ovs-vsctl set open . external-ids:ovn-bridge-mappings=phys:br-phys
> +    ovs-vsctl set open .
> external-ids:ovn-chassis-mac-mappings="phys:aa:bb:cc:dd:ee:$i$i"
> +    ovn_attach n1 br-phys 192.168.0.$i
> +
> +    ovs-vsctl add-port br-int vif$i$i -- \
> +        set Interface vif$i$i external-ids:iface-id=lp$i$i \
> +                              options:tx_pcap=hv$i/vif$i$i-tx.pcap \
> +                              options:rxq_pcap=hv$i/vif$i$i-rx.pcap \
> +                              ofport-request=$i$i
> +
> +    lsp_name=lp$i$i
> +    ls_name=$(lsp_to_ls $lsp_name)
> +
> +    ovn-nbctl lsp-add $ls_name $lsp_name
> +    ovn-nbctl lsp-set-addresses $lsp_name "f0:00:00:00:00:$i$i
> 192.168.$i.$i"
> +    ovn-nbctl lsp-set-port-security $lsp_name f0:00:00:00:00:$i$i
> +
> +    OVS_WAIT_UNTIL([test x`ovn-nbctl lsp-get-up $lsp_name` = xup])
> +
> +done
> +
> +ovn-nbctl ls-add ls-underlay bridged
> +ovn-nbctl lsp-add ls-underlay ln3 "" 1000
> +ovn-nbctl lsp-set-addresses ln3 unknown
> +ovn-nbctl lsp-set-type ln3 localnet
> +ovn-nbctl lsp-set-options ln3 network_name=phys
> +
> +ovn-nbctl ls-add ls-north bridged
> +ovn-nbctl lsp-add ls-north ln4 "" 1000
> +ovn-nbctl lsp-set-addresses ln4 unknown
> +ovn-nbctl lsp-set-type ln4 localnet
> +ovn-nbctl lsp-set-options ln4 network_name=phys
> +
> +# Add a VM on ls-north
> +ovn-nbctl lsp-add ls-north lp-north
> +ovn-nbctl lsp-set-addresses lp-north "f0:f0:00:00:00:11 172.31.0.10"
> +ovn-nbctl lsp-set-port-security lp-north f0:f0:00:00:00:11
> +
> +# Add 3rd hypervisor
> +sim_add hv3
> +as hv3 ovs-vsctl add-br br-phys
> +as hv3 ovs-vsctl set open . external-ids:ovn-bridge-mappings=phys:br-phys
> +as hv3 ovs-vsctl set open .
> external-ids:ovn-chassis-mac-mappings="phys:aa:bb:cc:dd:ee:33"
> +as hv3 ovn_attach n1 br-phys 192.168.0.3
> +
> +# Add 4th hypervisor
> +sim_add hv4
> +as hv4 ovs-vsctl add-br br-phys
> +as hv4 ovs-vsctl set open . external-ids:ovn-bridge-mappings=phys:br-phys
> +as hv4 ovs-vsctl set open .
> external-ids:ovn-chassis-mac-mappings="phys:aa:bb:cc:dd:ee:44"
> +as hv4 ovn_attach n1 br-phys 192.168.0.4
> +
> +as hv4 ovs-vsctl add-port br-int vif-north -- \
> +        set Interface vif-north external-ids:iface-id=lp-north \
> +                              options:tx_pcap=hv4/vif-north-tx.pcap \
> +                              options:rxq_pcap=hv4/vif-north-rx.pcap \
> +                              ofport-request=44
> +
> +ovn-nbctl lr-add router
> +ovn-nbctl lrp-add router router-to-ls1 00:00:01:01:02:03 192.168.1.3/24
> +ovn-nbctl lrp-add router router-to-ls2 00:00:01:01:02:05 192.168.2.3/24
> +ovn-nbctl lrp-add router router-to-underlay 00:00:01:01:02:07 172.31.0.1/24
> +
> +ovn-nbctl lsp-add ls1 ls1-to-router -- set Logical_Switch_Port
> ls1-to-router type=router \
> +          options:router-port=router-to-ls1 -- lsp-set-addresses
> ls1-to-router router
> +ovn-nbctl lsp-add ls2 ls2-to-router -- set Logical_Switch_Port
> ls2-to-router type=router \
> +          options:router-port=router-to-ls2 -- lsp-set-addresses
> ls2-to-router router
> +ovn-nbctl lsp-add ls-underlay underlay-to-router -- set
> Logical_Switch_Port \
> +                              underlay-to-router type=router \
> +                              options:router-port=router-to-underlay \
> +                              -- lsp-set-addresses underlay-to-router
> router
> +
> +ovn-nbctl lrp-set-gateway-chassis router-to-underlay hv3
> +
> +ovn-nbctl --wait=sb sync
> +
> +sleep 2
> +
> +OVN_POPULATE_ARP
> +
> ++# lsp_to_ls LSP
> ++#
> ++# Prints the name of the logical switch that contains LSP.
> +lsp_to_ls () {
> +    case $1 in dnl (
> +        lp?[[11]]) echo ls1 ;; dnl (
> +        lp?[[12]]) echo ls2 ;; dnl (
> +        *) AT_FAIL_IF([:]) ;;
> +    esac
> +}
> +
> +vif_to_ls () {
> +    case $1 in dnl (
> +        vif?[[11]]) echo ls1 ;; dnl (
> +        vif?[[12]]) echo ls2 ;; dnl (
> +        vif-north) echo ls-north ;; dnl (
> +        *) AT_FAIL_IF([:]) ;;
> +    esac
> +}
> +
> +hv_to_num () {
> +    case $1 in dnl (
> +        hv1) echo 1 ;; dnl (
> +        hv2) echo 2 ;; dnl (
> +        hv3) echo 3 ;; dnl (
> +        hv4) echo 4 ;; dnl (
> +        *) AT_FAIL_IF([:]) ;;
> +    esac
> +}
> +
> +vif_to_num () {
> +    case $1 in dnl (
> +        vif22) echo 22 ;; dnl (
> +        vif21) echo 21 ;; dnl (
> +        vif11) echo 11 ;; dnl (
> +        *) AT_FAIL_IF([:]) ;;
> +    esac
> +}
> +
> +vif_to_hv () {
> +    case $1 in dnl (
> +        vif[[1]]?) echo hv1 ;; dnl (
> +        vif[[2]]?) echo hv2 ;; dnl (
> +        vif-north) echo hv4 ;; dnl (
> +        *) AT_FAIL_IF([:]) ;;
> +    esac
> +}
> +
> +vif_to_lrp () {
> +    echo router-to-`vif_to_ls $1`
> +}
> +
> +ip_to_hex() {
> +       printf "%02x%02x%02x%02x" "$@"
> +}
> +
> +
> +test_ip() {
> +        # This packet has bad checksums but logical L3 routing doesn't
> check.
> +        local inport=$1 src_mac=$2 dst_mac=$3 src_ip=$4 dst_ip=$5
> outport=$6
> +        local
> packet=${dst_mac}${src_mac}08004500001c0000000040110000${src_ip}${dst_ip}0035111100080000
> +        shift; shift; shift; shift; shift
> +        hv=`vif_to_hv $inport`
> +        as $hv ovs-appctl netdev-dummy/receive $inport $packet
> +        in_ls=`vif_to_ls $inport`
> +        for outport; do
> +            out_ls=`vif_to_ls $outport`
> +            if test $in_ls = $out_ls; then
> +                # Ports on the same logical switch receive exactly the
> same packet.
> +                echo $packet
> +            else
> +                # Routing decrements TTL and updates source and dest MAC
> +                # (and checksum).
> +                out_lrp=`vif_to_lrp $outport`
> +                # For North-South, packet will come via gateway chassis,
> i.e hv3
> +                if test $inport = vif-north; then
> +                    echo
> f00000000011aabbccddee3308004500001c000000003f110100${src_ip}${dst_ip}0035111100080000
> >> $outport.expected
> +                fi
> +                if test $outport = vif-north; then
> +                    echo
> f0f000000011aabbccddee1108004500001c000000003f110100${src_ip}${dst_ip}0035111100080000
> >> $outport.expected
> +                fi
> +            fi >> $outport.expected
> +        done
> +}
> +
> +# Dump a bunch of info helpful for debugging if there's a failure.
> +
> +echo "------ OVN dump ------"
> +ovn-nbctl show
> +ovn-sbctl show
> +ovn-sbctl list port_binding
> +ovn-sbctl list mac_binding
> +
> +echo "------ hv1 dump ------"
> +as hv1 ovs-vsctl show
> +as hv1 ovs-vsctl list Open_Vswitch
> +
> +echo "------ hv2 dump ------"
> +as hv2 ovs-vsctl show
> +as hv2 ovs-vsctl list Open_Vswitch
> +
> +echo "------ hv3 dump ------"
> +as hv3 ovs-vsctl show
> +as hv3 ovs-vsctl list Open_Vswitch
> +
> +echo "------ hv4 dump ------"
> +as hv4 ovs-vsctl show
> +as hv4 ovs-vsctl list Open_Vswitch
> +
> +echo "Send traffic North to South"
> +
> +sip=`ip_to_hex 172 31 0 10`
> +dip=`ip_to_hex 192 168 1 1`
> +test_ip vif-north f0f000000011 000001010207 $sip $dip vif11
> +
> +sleep 1
> +
> +# Confirm that North to south traffic works fine and went through gateway
> chassis, i.e HV3
> +OVN_CHECK_PACKETS([hv1/vif11-tx.pcap], [vif11.expected])
> +
> +echo "Send traffic South to North"
> +sip=`ip_to_hex 192 168 1 1`
> +dip=`ip_to_hex 172 31 0 10`
> +test_ip vif11 f00000000011 000001010203 $sip $dip vif-north
> +
> +sleep 1
> +
> +# Confirm that South to North traffic works fine.
> +OVN_CHECK_PACKETS_REMOVE_BROADCAST([hv4/vif-north-tx.pcap],
> [vif-north.expected])
> +
> +# Confirm that packets did not go out via tunnel port.
> +AT_CHECK([as hv1 ovs-ofctl dump-flows br-int | grep table=32 | grep
> NXM_NX_TUN_METADATA0 | grep n_packets=0 | wc -l], [0], [[1
> +]])
> +
> +# Confirm that HV1 chassis mac is never seen on Gateway chassis, i.e HV3
> +AT_CHECK([as hv3 ovs-appctl fdb/show br-phys | grep aa:bb:cc:dd:ee:11 |
> wc -l], [0], [[0
> +]])
> +
> +echo "----------- Post Traffic hv1 dump -----------"
> +as hv1 ovs-ofctl -O OpenFlow13 dump-flows br-int
> +as hv1 ovs-appctl fdb/show br-phys
> +
> +echo "----------- Post Traffic hv2 dump -----------"
> +as hv2 ovs-ofctl -O OpenFlow13 dump-flows br-int
> +as hv2 ovs-appctl fdb/show br-phys
> +
> +echo "----------- Post Traffic hv3 dump -----------"
> +as hv3 ovs-ofctl -O OpenFlow13 dump-flows br-int
> +as hv3 ovs-appctl fdb/show br-phys
> +
> +echo "----------- Post Traffic hv4 dump -----------"
> +as hv4 ovs-ofctl -O OpenFlow13 dump-flows br-int
> +as hv4 ovs-appctl fdb/show br-phys
> +
> +OVN_CLEANUP([hv1],[hv2],[hv3],[hv4])
> +
> +AT_CLEANUP
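
The expected packets built by test_ip above differ from the injected packet in four places: destination MAC, source MAC (the chassis MAC), TTL (0x40 becomes 0x3f) and IP header checksum (0x0000 becomes 0x0100). The sketch below reproduces those values; `route_rewrite` is a hypothetical helper, and the use of an RFC 1624 style incremental checksum update is an assumption that happens to match the expected bytes, not a statement about how OVS computes them.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum {
    ETH_HDR_LEN = 14,
    IP_TTL_OFF = ETH_HDR_LEN + 8,    /* TTL, followed by the protocol byte. */
    IP_CSUM_OFF = ETH_HDR_LEN + 10   /* IPv4 header checksum. */
};

/* One's-complement 16-bit addition with end-around carry. */
static uint16_t
ones_add(uint16_t a, uint16_t b)
{
    uint32_t sum = (uint32_t) a + b;

    return (uint16_t) ((sum & 0xffff) + (sum >> 16));
}

/* Rewrites an untagged Ethernet + IPv4 frame in place the way the test
 * expects a routed packet to look: new MACs, TTL - 1, and a checksum
 * updated incrementally as HC' = ~(~HC + ~m + m') (RFC 1624). */
static void
route_rewrite(uint8_t *frame, const uint8_t dst[6], const uint8_t src[6])
{
    memcpy(frame, dst, 6);
    memcpy(frame + 6, src, 6);

    uint16_t old_word = (uint16_t) (frame[IP_TTL_OFF] << 8
                                    | frame[IP_TTL_OFF + 1]);
    frame[IP_TTL_OFF]--;
    uint16_t new_word = (uint16_t) (frame[IP_TTL_OFF] << 8
                                    | frame[IP_TTL_OFF + 1]);

    uint16_t csum = (uint16_t) (frame[IP_CSUM_OFF] << 8
                                | frame[IP_CSUM_OFF + 1]);
    csum = (uint16_t) ~ones_add(ones_add((uint16_t) ~csum,
                                         (uint16_t) ~old_word), new_word);
    frame[IP_CSUM_OFF] = (uint8_t) (csum >> 8);
    frame[IP_CSUM_OFF + 1] = (uint8_t) (csum & 0xff);
}
```

For the North-to-South packet, passing f0:00:00:00:00:11 as the new destination and aa:bb:cc:dd:ee:33 (the gateway chassis hv3) as the new source gives the header bytes that the test appends to vif11.expected.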
> +
> +
> +AT_SETUP([ovn -- 2 HVs, 2 lports/HV, localnet ports, DVR N-S ARP
> handling])
> +ovn_start
> +
> +# In this test cases we create 3 switches, all connected to same
> +# physical network (through br-phys on each HV). LS1 and LS2 have
> +# 1 VIF each. Each HV has 1 VIF port. The first digit
> +# of VIF port name indicates the hypervisor it is bound to, e.g.
> +# lp23 means VIF 3 on hv2.
> +#
> +# All the switches are connected to a logical router "router".
> +#
> +# Each switch's VLAN tag and their logical switch ports are:
> +#   - ls1:
> +#       - tagged with VLAN 101
> +#       - ports: lp11
> +#   - ls2:
> +#       - tagged with VLAN 201
> +#       - ports: lp22
> +#   - ls-underlay:
> +#       - tagged with VLAN 1000
> +# Note: a localnet port is created for each switch to connect to
> +# physical network.
> +
> +for i in 1 2; do
> +    ls_name=ls$i
> +    ovn-nbctl ls-add $ls_name bridged
> +    ln_port_name=ln$i
> +    if test $i -eq 1; then
> +        ovn-nbctl lsp-add $ls_name $ln_port_name "" 101
> +    elif test $i -eq 2; then
> +        ovn-nbctl lsp-add $ls_name $ln_port_name "" 201
> +    fi
> +    ovn-nbctl lsp-set-addresses $ln_port_name unknown
> +    ovn-nbctl lsp-set-type $ln_port_name localnet
> +    ovn-nbctl lsp-set-options $ln_port_name network_name=phys
> +done
> +
> +# lsp_to_ls LSP
> +#
> +# Prints the name of the logical switch that contains LSP.
> +lsp_to_ls () {
> +    case $1 in dnl (
> +        lp?[[11]]) echo ls1 ;; dnl (
> +        lp?[[12]]) echo ls2 ;; dnl (
> +        *) AT_FAIL_IF([:]) ;;
> +    esac
> +}
> +
> +vif_to_hv () {
> +    case $1 in dnl (
> +        vif[[1]]?) echo hv1 ;; dnl (
> +        vif[[2]]?) echo hv2 ;; dnl (
> +        vif?[[north]]?) echo hv4 ;; dnl (
> +        *) AT_FAIL_IF([:]) ;;
> +    esac
> +}
> +
> +ip_to_hex() {
> +       printf "%02x%02x%02x%02x" "$@"
> +}
> +
> +net_add n1
> +for i in 1 2; do
> +    sim_add hv$i
> +    as hv$i
> +    ovs-vsctl add-br br-phys
> +    ovs-vsctl set open . external-ids:ovn-bridge-mappings=phys:br-phys
> +    ovs-vsctl set open . external-ids:ovn-chassis-mac-mappings="phys:aa:bb:cc:dd:ee:$i$i"
> +    ovn_attach n1 br-phys 192.168.0.$i
> +
> +    ovs-vsctl add-port br-int vif$i$i -- \
> +        set Interface vif$i$i external-ids:iface-id=lp$i$i \
> +                              options:tx_pcap=hv$i/vif$i$i-tx.pcap \
> +                              options:rxq_pcap=hv$i/vif$i$i-rx.pcap \
> +                              ofport-request=$i$i
> +
> +    lsp_name=lp$i$i
> +    ls_name=$(lsp_to_ls $lsp_name)
> +
> +    ovn-nbctl lsp-add $ls_name $lsp_name
> +    ovn-nbctl lsp-set-addresses $lsp_name "f0:00:00:00:00:$i$i 192.168.$i.$i"
> +    ovn-nbctl lsp-set-port-security $lsp_name f0:00:00:00:00:$i$i
> +
> +    OVS_WAIT_UNTIL([test x`ovn-nbctl lsp-get-up $lsp_name` = xup])
> +
> +done
> +
> +ovn-nbctl ls-add ls-underlay bridged
> +ovn-nbctl lsp-add ls-underlay ln3 "" 1000
> +ovn-nbctl lsp-set-addresses ln3 unknown
> +ovn-nbctl lsp-set-type ln3 localnet
> +ovn-nbctl lsp-set-options ln3 network_name=phys
> +
> +ovn-nbctl ls-add ls-north bridged
> +ovn-nbctl lsp-add ls-north ln4 "" 1000
> +ovn-nbctl lsp-set-addresses ln4 unknown
> +ovn-nbctl lsp-set-type ln4 localnet
> +ovn-nbctl lsp-set-options ln4 network_name=phys
> +
> +# Add a VM on ls-north
> +ovn-nbctl lsp-add ls-north lp-north
> +ovn-nbctl lsp-set-addresses lp-north "f0:f0:00:00:00:11 172.31.0.10"
> +ovn-nbctl lsp-set-port-security lp-north f0:f0:00:00:00:11
> +
> +# Add 3rd hypervisor
> +sim_add hv3
> +as hv3 ovs-vsctl add-br br-phys
> +as hv3 ovs-vsctl set open . external-ids:ovn-bridge-mappings=phys:br-phys
> +as hv3 ovs-vsctl set open . external-ids:ovn-chassis-mac-mappings="phys:aa:bb:cc:dd:ee:33"
> +as hv3 ovn_attach n1 br-phys 192.168.0.3
> +
> +# Add 4th hypervisor
> +sim_add hv4
> +as hv4 ovs-vsctl add-br br-phys
> +as hv4 ovs-vsctl set open . external-ids:ovn-bridge-mappings=phys:br-phys
> +as hv4 ovs-vsctl set open . external-ids:ovn-chassis-mac-mappings="phys:aa:bb:cc:dd:ee:44"
> +as hv4 ovn_attach n1 br-phys 192.168.0.4
> +
> +as hv4 ovs-vsctl add-port br-int vif-north -- \
> +        set Interface vif-north external-ids:iface-id=lp-north \
> +                              options:tx_pcap=hv4/vif-north-tx.pcap \
> +                              options:rxq_pcap=hv4/vif-north-rx.pcap \
> +                              ofport-request=44
> +
> +ovn-nbctl lr-add router
> +ovn-nbctl lrp-add router router-to-ls1 00:00:01:01:02:03 192.168.1.3/24
> +ovn-nbctl lrp-add router router-to-ls2 00:00:01:01:02:05 192.168.2.3/24
> +ovn-nbctl lrp-add router router-to-underlay 00:00:01:01:02:07 172.31.0.1/24
> +
> +ovn-nbctl lsp-add ls1 ls1-to-router -- set Logical_Switch_Port ls1-to-router type=router \
> +          options:router-port=router-to-ls1 -- lsp-set-addresses ls1-to-router router
> +ovn-nbctl lsp-add ls2 ls2-to-router -- set Logical_Switch_Port ls2-to-router type=router \
> +          options:router-port=router-to-ls2 -- lsp-set-addresses ls2-to-router router
> +ovn-nbctl lsp-add ls-underlay underlay-to-router -- set Logical_Switch_Port \
> +                              underlay-to-router type=router \
> +                              options:router-port=router-to-underlay \
> +                              -- lsp-set-addresses underlay-to-router router
> +
> +
> +OVN_POPULATE_ARP
> +
> ++# lsp_to_ls LSP
> ++#
> ++# Prints the name of the logical switch that contains LSP.
> +lsp_to_ls () {
> +    case $1 in dnl (
> +        lp?[[11]]) echo ls1 ;; dnl (
> +        lp?[[12]]) echo ls2 ;; dnl (
> +        *) AT_FAIL_IF([:]) ;;
> +    esac
> +}
> +
> +vif_to_ls () {
> +    case $1 in dnl (
> +        vif?[[11]]) echo ls1 ;; dnl (
> +        vif?[[12]]) echo ls2 ;; dnl (
> +        vif-north) echo ls-north ;; dnl (
> +        *) AT_FAIL_IF([:]) ;;
> +    esac
> +}
> +
> +hv_to_num () {
> +    case $1 in dnl (
> +        hv1) echo 1 ;; dnl (
> +        hv2) echo 2 ;; dnl (
> +        hv3) echo 3 ;; dnl (
> +        hv4) echo 4 ;; dnl (
> +        *) AT_FAIL_IF([:]) ;;
> +    esac
> +}
> +
> +vif_to_num () {
> +    case $1 in dnl (
> +        vif22) echo 22 ;; dnl (
> +        vif21) echo 21 ;; dnl (
> +        vif11) echo 11 ;; dnl (
> +        *) AT_FAIL_IF([:]) ;;
> +    esac
> +}
> +
> +vif_to_hv () {
> +    case $1 in dnl (
> +        vif[[1]]?) echo hv1 ;; dnl (
> +        vif[[2]]?) echo hv2 ;; dnl (
> +        vif-north) echo hv4 ;; dnl (
> +        *) AT_FAIL_IF([:]) ;;
> +    esac
> +}
> +
> +vif_to_lrp () {
> +    echo router-to-`vif_to_ls $1`
> +}
> +
> +ip_to_hex() {
> +       printf "%02x%02x%02x%02x" "$@"
> +}
> +
> +# Dump a bunch of info helpful for debugging if there's a failure.
> +
> +echo "------ OVN dump ------"
> +ovn-nbctl show
> +ovn-sbctl show
> +ovn-sbctl list port_binding
> +ovn-sbctl list mac_binding
> +
> +echo "------ hv1 dump ------"
> +as hv1 ovs-vsctl show
> +as hv1 ovs-vsctl list Open_Vswitch
> +
> +echo "------ hv2 dump ------"
> +as hv2 ovs-vsctl show
> +as hv2 ovs-vsctl list Open_Vswitch
> +
> +echo "------ hv3 dump ------"
> +as hv3 ovs-vsctl show
> +as hv3 ovs-vsctl list Open_Vswitch
> +
> +echo "------ hv4 dump ------"
> +as hv4 ovs-vsctl show
> +as hv4 ovs-vsctl list Open_Vswitch
> +
> +# test_arp INPORT SHA SPA TPA [REPLY_HA]
> +#
> +# Causes a packet to be received on INPORT.  The packet is an ARP
> +# request with SHA, SPA, and TPA as specified.  If REPLY_HA is provided, then
> +# it should be the hardware address of the target to expect to receive in an
> +# ARP reply; otherwise no reply is expected.
> +#
> +# INPORT is an logical switch port number, e.g. 11 for vif11.
> +# SHA and REPLY_HA are each 12 hex digits.
> +# SPA and TPA are each 8 hex digits.
> +test_arp() {
> +    local inport=$1 sha=$2 spa=$3 tpa=$4 reply_ha=$5
> +    local request=ffffffffffff${sha}08060001080006040001${sha}${spa}ffffffffffff${tpa}
> +    hv=`vif_to_hv $inport`
> +    as $hv ovs-appctl netdev-dummy/receive $inport $request
> +
> +    if test X$reply_ha = X; then
> +        # Expect to receive the broadcast ARP on the other logical switch ports
> +        # if no reply is expected.
> +        local i j
> +        for i in 1 2 3; do
> +            for j in 1 2 3; do
> +                if test $i$j != $inport; then
> +                    echo $request >> $i$j.expected
> +                fi
> +            done
> +        done
> +    else
> +        # Expect to receive the reply, if any.
> +        local reply=${sha}${reply_ha}08060001080006040002${reply_ha}${tpa}${sha}${spa}
> +        local reply_vid=${sha}${reply_ha}810003e808060001080006040002${reply_ha}${tpa}${sha}${spa}
> +        echo $reply_vid >> ${inport}_vid.expected
> +        echo $reply >> $inport.expected
> +    fi
> +}
> +
> +sip=`ip_to_hex 172 31 0 10`
> +tip=`ip_to_hex 172 31 0 1`
> +
> +test_arp vif-north f0f000000011 $sip $tip
> +# Confirm that vif-north does not get ARP reply
> +AT_CHECK([wc -l hv4/vif-north-tx.pcap | awk '{print $1}'], [0], [[0
> +]])
> +
> +# Set a hypervisor as gateway chassis, for router port 172.31.0.1
> +ovn-nbctl lrp-set-gateway-chassis router-to-underlay hv3
> +ovn-nbctl --wait=sb sync
> +sleep 2
> +
> +test_arp vif-north f0f000000011 $sip $tip 000001010207
> +
> +sleep 1
> +
> +# Confirm that vif-north gets a single ARP reply this time
> +OVN_CHECK_PACKETS_REMOVE_BROADCAST([hv4/vif-north-tx.pcap], [vif-north.expected])
> +
> +# Confirm that only redirect chassis allowed arp resolution.
> +OVN_CHECK_PACKETS_REMOVE_BROADCAST([hv3/br-phys_n1-tx.pcap], [vif-north_vid.expected])
> +sed -i '/ffffffffffff/d' hv3/br-phys_n1-tx.packets
> +AT_CHECK([grep 000001010207 hv3/br-phys_n1-tx.packets | wc -l], [0], [[1
> +]])
> +
> +# Confirm that other OVN chassis did not generate ARP reply.
> +$PYTHON "$top_srcdir/utilities/ovs-pcap.in" hv1/br-phys_n1-tx.pcap > hv1/br-phys_n1-tx.packets
> +$PYTHON "$top_srcdir/utilities/ovs-pcap.in" hv2/br-phys_n1-tx.pcap > hv2/br-phys_n1-tx.packets
> +
> +AT_CHECK([grep 000001010207 hv1/br-phys_n1-tx.packets | wc -l], [0], [[0
> +]])
> +AT_CHECK([grep 000001010207 hv2/br-phys_n1-tx.packets | wc -l], [0], [[0
> +]])
> +
> +echo "----------- Post Traffic hv1 dump -----------"
> +as hv1 ovs-ofctl -O OpenFlow13 dump-flows br-int
> +as hv1 ovs-appctl fdb/show br-phys
> +
> +echo "----------- Post Traffic hv2 dump -----------"
> +as hv2 ovs-ofctl -O OpenFlow13 dump-flows br-int
> +as hv2 ovs-appctl fdb/show br-phys
> +
> +echo "----------- Post Traffic hv3 dump -----------"
> +as hv3 ovs-ofctl -O OpenFlow13 dump-flows br-int
> +as hv3 ovs-appctl fdb/show br-phys
> +
> +echo "----------- Post Traffic hv4 dump -----------"
> +as hv4 ovs-ofctl -O OpenFlow13 dump-flows br-int
> +as hv4 ovs-appctl fdb/show br-phys
> +
> +OVN_CLEANUP([hv1],[hv2],[hv3],[hv4])
> +
> +AT_CLEANUP
> --
> 1.8.3.1
>
> _______________________________________________
> dev mailing list
> dev@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>
>
Ankur Sharma June 11, 2019, 11:17 p.m. UTC | #4
Hi Numan,

Thanks for going through the commit plan.
I submitted a V10 now, which has just the E-W changes.
It has all the review comments handled.

You wanted the periodic GARP advertisement patch included as well. However, as per our
discussion on another email thread, we want your GARP advertisement patch to go in first,
so I could not accommodate those changes in this patch.

Please take a look; I look forward to your feedback.

Thanks

Regards,
Ankur

From: Numan Siddique <nusiddiq@redhat.com>
Sent: Monday, June 10, 2019 10:18 AM
To: Ankur Sharma <ankur.sharma@nutanix.com>
Cc: ovs-dev@openvswitch.org
Subject: Re: [ovs-dev] [PATCH v9 2/2] OVN: Enable N-S Traffic, Vlan backed DVR



On Fri, Jun 7, 2019 at 5:15 AM Ankur Sharma <ankur.sharma@nutanix.com> wrote:
Hi Numan,

Thanks for trying out the patch and providing feedback.
I am planning to change this series to reflect only the E-W and would send out separate patches for N-S improvements.

Following is the reasoning:
======================
a. I agree that the network_type construct is adding to the confusion, and maybe we should rely on optional/external-id based key-value config.
b. The N-S patch has 3 changes (at a high level) that are distinct from each other; having them in a single patch is causing confusion and is holding up the rest of the reviewed changes.
c. Last but not least, there were some gaps in the patch as well.

Here is what I am planning to do:
==========================
a. Keep this series for E-W only and remove all the network_type related changes from here (including showing type as bridged/vlan).


Hi Ankur.

I agree with the approach you are planning to take.



b. For N-S changes, this series has the following changes. I will send them out in separate patches, especially the ones which are more of a bug fix.
    i. Do not allow ARP resolution from physical network unless gateway chassis is configured ==> More of a bug fix, will be sent as a separate standalone patch.
   ii. GARP advertisement during failover in the absence of NAT configuration ==> More of a bug fix, will be sent as a separate standalone patch.
  iii. Periodic GARP advertisement with/without NAT configuration ==> New feature, will be added along with SNAT changes.

This periodic GARP adv will be required irrespective of network_type of the logical switches connected to a router. So I would request you to handle this as well
when you submit the patch.

  iv. Avoid redirection ==> New feature, will come as a separate patchset. We will make it an optional feature, i.e. by default even non-NATed traffic will go via the gateway chassis, but a config knob can override it.

Agreed. This makes sense and is easier.

   v. No chassis mac replace on gateway chassis ==> More of an addendum to E-W; I am thinking about clubbing it with some of the N-S changes, as this is where it will be relevant.

c. Will send out separate patch for showing network type as overlay or bridged (based on localnet port’s presence), I believe it is good to have 😊.
    i.e we will not have any new column in logical switch table, but the output of relevant ovn-nbctl show command will show type as “overlay” or “bridged”.
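
The rule in (c), deriving the displayed type from the presence of a localnet port rather than from a new column, can be sketched as below. This is only an illustration under stated assumptions: `struct demo_lsp` and `demo_ls_network_type()` are hypothetical names, not part of ovn-nbctl or the northbound schema.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for a logical switch port.  Only the
 * port type matters for this sketch. */
struct demo_lsp {
    const char *name;
    const char *type;   /* "" for a VIF, "localnet", "router", ... */
};

/* Returns "bridged" if any port on the switch is a localnet port,
 * otherwise "overlay".  No new column is consulted. */
static const char *
demo_ls_network_type(const struct demo_lsp *ports, size_t n_ports)
{
    for (size_t i = 0; i < n_ports; i++) {
        if (ports[i].type && !strcmp(ports[i].type, "localnet")) {
            return "bridged";
        }
    }
    return "overlay";
}
```

With this rule a switch holding lp11 and ln1 (localnet) would be shown as "bridged", while a switch holding only VIF and router ports would be shown as "overlay".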

Above will allow us to make progress on the changes we are in agreement on, while having thorough discussion on the remaining.
Let me know if you are fine with the plan. I should be able to send the E-W only changes in a couple of days and should be able to send individual bug fixes soon after as well.

For rest of the comments, please find my replies inline.

Appreciate your feedback.

Regards,
Ankur


From: Numan Siddique <nusiddiq@redhat.com>
Sent: Monday, June 3, 2019 3:06 AM
To: Ankur Sharma <ankur.sharma@nutanix.com>
Cc: ovs-dev@openvswitch.org
Subject: Re: [ovs-dev] [PATCH v9 2/2] OVN: Enable N-S Traffic, Vlan backed DVR


Hi Ankur,

Please see some comments inline. Please note that I haven't had the chance to look into the code
in detail. I am first trying to test out the patches. (I am on PTO. Expect some delay in my replies).



On Thu, May 30, 2019 at 5:58 AM Ankur Sharma <ankur.sharma@nutanix.com> wrote:
Background:
[1] https://mail.openvswitch.org/pipermail/ovs-dev/2018-October/353066.html
[2] https://docs.google.com/document/d/1uoQH478wM1OZ16HrxzbOUvk5LvFnfNEWbkPT6Zmm9OU/edit?usp=sharing

This Series:
Layer 2, Layer 3 E-W and Layer 3 N-S (NO NAT) changes for vlan
backed distributed logical router.

This patch:
For North-South traffic, we need a chassis which will respond to
ARP requests for the router port coming from outside. For this purpose,
we will rely upon the gateway-chassis construct in OVN: on a logical
router port, we will associate one or more chassis as gateway chassis.

One of these chassis will be active at any given time and will become
the entry point for traffic bound for endpoints behind the logical router
coming from the outside network (North to South).

This patch makes some enhancements to the gateway chassis implementation
to handle the above use case.

A.
Do not replace router port mac with chassis mac on gateway
chassis.
This is done, because:
    i. Chassisredirect port is NOT a distributed port, hence
       we need not replace its mac address
       (which is the same as the router port mac).

   ii. ARP cache will be consistent everywhere, i.e. just like
       endpoints on OVN chassis will see the configured router port
       mac as the resolved mac for the router port ip, outside endpoints
       will see that as well.

  iii. For implementing Network Address Translation. Although it is
       not a part of this series, a follow-up series will
       add this feature, and the approach will rely upon
       sending packets to the redirect chassis using the chassisredirect
       router port mac as dest mac.
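
The decision in A can be sketched as follows. The "cr-" prefix for the chassisredirect port name is taken from the diff in this patch; everything else here (`demo_cr_port_name()`, `demo_should_replace_router_mac()`) is a hypothetical, simplified stand-in and not the ovn-controller code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Builds the chassisredirect port name for a logical router port by
 * prefixing "cr-", as the diff does with xasprintf("cr-%s", ...).
 * Returns false if 'buf' is too small to hold the full name. */
static bool
demo_cr_port_name(const char *lrp_name, char *buf, size_t buf_len)
{
    int n = snprintf(buf, buf_len, "cr-%s", lrp_name);
    return n >= 0 && (size_t) n < buf_len;
}

/* Hypothetical predicate: the router port mac is replaced with the
 * chassis mac only when the chassisredirect port is NOT resident on
 * this chassis. */
static bool
demo_should_replace_router_mac(bool cr_port_resident_here)
{
    return !cr_port_resident_here;
}
```

On the active gateway chassis the predicate is false, so outside endpoints keep seeing the configured router port mac.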

B.
Advertise router port GARP on gateway chassis.
This is needed, especially if a failover happens and
chassisredirect port moves to a new gateway chassis.
Otherwise, there would be packet drops till outside
router ARPs for router port ip again.

The intention of this GARP is to update the top of rack (TOR) switch
to direct the router port mac to the new hypervisor.

We could have done the same using RARP as well, but
because ovn-controller already has an implementation for GARP,
it did not seem worthwhile to add a RARP implementation
just for this.
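
The timing of these announcements, as changed by the send_garp() hunk in this patch, can be sketched as below: exponential backoff first, then either stop or re-arm after the repeat interval. This is a standalone sketch, not the ovn-controller code; `struct demo_garp` is a trimmed-down hypothetical, and the backoff ceiling of 16 seconds is an assumption because the hunk does not show that condition.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical, trimmed-down version of the per-port GARP state. */
struct demo_garp {
    long long announce_time;    /* Next announcement time, in ms. */
    int backoff;                /* Backoff for the next announcement, in s. */
    bool is_repeat;             /* Keep sending bursts periodically. */
    long long repeat_interval;  /* Interval between bursts, in ms. */
};

#define DEMO_MAX_BACKOFF 16     /* Assumed ceiling, not shown in the hunk. */

/* Called right after a GARP has been sent at time 'now' (ms).  Returns
 * the time of the next announcement, or LLONG_MAX if there is none. */
static long long
demo_garp_reschedule(struct demo_garp *g, long long now)
{
    if (g->backoff < DEMO_MAX_BACKOFF) {
        g->backoff *= 2;
        g->announce_time = now + g->backoff * 1000LL;
    } else if (g->is_repeat) {
        /* Burst finished: restart the backoff and wait one interval. */
        g->backoff = 1;
        g->announce_time = now + g->repeat_interval;
    } else {
        g->announce_time = LLONG_MAX;
    }
    return g->announce_time;
}
```

Under these assumptions, a repeating entry with a 3 minute interval announces at increasing gaps of 2, 4, 8 and 16 seconds and then sleeps for 3 minutes before the next burst, while a non-repeating entry stops after its burst.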

C.
For South to North traffic, we need not pass through the gateway
chassis if no address translation is needed.

For overlay networks, NATing is a must to talk to outside networks.
However, for vlan backed networks, NATing is not a must, and hence
in the absence of NATing configuration we need not redirect the packet
to the gateway chassis.
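
The rule in C, that South to North traffic bypasses the gateway chassis when no address translation is needed, can be sketched as a single predicate. The enum and function names are hypothetical and only illustrate the rule as stated in this commit message; they are not how ovn-northd expresses it.

```c
#include <stdbool.h>

/* Hypothetical classification of the logical switch behind the router. */
enum demo_net_type {
    DEMO_OVERLAY,       /* Tunnel-backed: NAT is required to go outside. */
    DEMO_VLAN_BACKED    /* Bridged through a localnet port. */
};

/* Returns true if South to North traffic has to be redirected to the
 * gateway chassis, false if it can leave from the local chassis. */
static bool
demo_needs_gateway_redirect(enum demo_net_type type, bool nat_configured)
{
    if (type == DEMO_OVERLAY) {
        return true;            /* NATing is a must for overlay networks. */
    }
    return nat_configured;      /* Vlan backed: redirect only for NAT. */
}
```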

Signed-off-by: Ankur Sharma <ankur.sharma@nutanix.com>
---
 ovn/controller/physical.c  |  24 +-
 ovn/controller/pinctrl.c   | 205 +++++++++++--
 ovn/controller/pinctrl.h   |   6 +
 ovn/lib/ovn-util.c         |  31 ++
 ovn/lib/ovn-util.h         |   6 +
 ovn/northd/ovn-northd.c    |  43 ++-
 ovn/ovn-architecture.7.xml |  87 +++++-
 tests/ovn.at               | 732 ++++++++++++++++++++++++++++++++++++++++++++-
 8 files changed, 1090 insertions(+), 44 deletions(-)

diff --git a/ovn/controller/physical.c b/ovn/controller/physical.c
index af587a5..1ab5968 100644
--- a/ovn/controller/physical.c
+++ b/ovn/controller/physical.c
@@ -21,6 +21,7 @@
 #include "lflow.h"
 #include "lport.h"
 #include "chassis.h"
+#include "pinctrl.h"
 #include "lib/bundle.h"
 #include "openvswitch/poll-loop.h"
 #include "lib/uuid.h"
@@ -238,9 +239,12 @@ get_zone_ids(const struct sbrec_port_binding *binding,
 }

 static void
-put_replace_router_port_mac_flows(const struct
+put_replace_router_port_mac_flows(struct ovsdb_idl_index
+                                  *sbrec_port_binding_by_name,
+                                  const struct
                                   sbrec_port_binding *localnet_port,
                                   const struct sbrec_chassis *chassis,
+                                  const struct sset *active_tunnels,
                                   const struct hmap *local_datapaths,
                                   struct ofpbuf *ofpacts_p,
                                   ofp_port_t ofport,
@@ -281,8 +285,21 @@ put_replace_router_port_mac_flows(const struct
         char *err_str = NULL;
         struct match match;
         struct ofpact_mac *replace_mac;
+        char *cr_peer_name = xasprintf("cr-%s", rport_binding->logical_port);

-        /* Table 65, priority 150.
+
+        if (pinctrl_is_chassis_resident(sbrec_port_binding_by_name,
+                                        chassis, active_tunnels,
+                                        cr_peer_name)) {
+            /* If a router port's chassisredirect port is
+             * resident on this chassis, then we need not do mac replace. */
+            free(cr_peer_name);
+            continue;
+        }
+
+        free(cr_peer_name);
+
+       /* Table 65, priority 150.
          * =======================
          *
          * Implements output to localnet port.
@@ -797,7 +814,8 @@ consider_port_binding(struct ovsdb_idl_index *sbrec_port_binding_by_name,
                         &match, ofpacts_p, &binding->header_.uuid);

         if (!strcmp(binding->type, "localnet")) {
-            put_replace_router_port_mac_flows(binding, chassis,
+            put_replace_router_port_mac_flows(sbrec_port_binding_by_name,
+                                              binding, chassis, active_tunnels,
                                               local_datapaths, ofpacts_p,
                                               ofport, flow_table);
         }
diff --git a/ovn/controller/pinctrl.c b/ovn/controller/pinctrl.c
index b7bb4c9..a145867 100644
--- a/ovn/controller/pinctrl.c
+++ b/ovn/controller/pinctrl.c
@@ -226,6 +226,8 @@ static bool may_inject_pkts(void);
 COVERAGE_DEFINE(pinctrl_drop_put_mac_binding);
 COVERAGE_DEFINE(pinctrl_drop_buffered_packets_map);

+#define GARP_DEF_REPEAT_INTERVAL_MS   (3 * 60 * 1000) /* 3 minutes */
+
 void
 pinctrl_init(void)
 {
@@ -242,6 +244,25 @@ pinctrl_init(void)
                                                 &pinctrl);
 }

+bool
+pinctrl_is_chassis_resident(struct ovsdb_idl_index *sbrec_port_binding_by_name,
+                            const struct sbrec_chassis *chassis,
+                            const struct sset *active_tunnels,
+                            const char *port_name)
+{
+    const struct sbrec_port_binding *pb
+        = lport_lookup_by_name(sbrec_port_binding_by_name, port_name);
+    if (!pb || !pb->chassis) {
+        return false;
+    }
+    if (strcmp(pb->type, "chassisredirect")) {
+        return pb->chassis == chassis;
+    } else {
+        return ha_chassis_group_is_active(pb->ha_chassis_group,
+                                          active_tunnels, chassis);
+    }
+}
+
 static ovs_be32
 queue_msg(struct rconn *swconn, struct ofpbuf *msg)
 {
@@ -2548,6 +2569,8 @@ struct garp_data {
     int backoff;                 /* Backoff for the next announcement. */
     uint32_t dp_key;             /* Datapath used to output this GARP. */
     uint32_t port_key;           /* Port to inject the GARP into. */
+    bool is_repeat;              /* Send GARPs continuously. */
+    long long int repeat_interval; /* Interval between GARP bursts in ms */
 };

 /* Contains GARPs to be sent. Protected by pinctrl_mutex*/
@@ -2568,7 +2591,8 @@ destroy_send_garps(void)
 /* Runs with in the main ovn-controller thread context. */
 static void
 add_garp(const char *name, const struct eth_addr ea, ovs_be32 ip,
-         uint32_t dp_key, uint32_t port_key)
+         uint32_t dp_key, uint32_t port_key, bool is_repeat,
+         long long int repeat_interval)
 {
     struct garp_data *garp = xmalloc(sizeof *garp);
     garp->ea = ea;
@@ -2577,6 +2601,8 @@ add_garp(const char *name, const struct eth_addr ea, ovs_be32 ip,
     garp->backoff = 1;
     garp->dp_key = dp_key;
     garp->port_key = port_key;
+    garp->is_repeat = is_repeat;
+    garp->repeat_interval = repeat_interval;
     shash_add(&send_garp_data, name, garp);

     /* Notify pinctrl_handler so that it can wakeup and process
@@ -2586,7 +2612,8 @@ add_garp(const char *name, const struct eth_addr ea, ovs_be32 ip,

 /* Add or update a vif for which GARPs need to be announced. */
 static void
-send_garp_update(const struct sbrec_port_binding *binding_rec,
+send_garp_update(struct ovsdb_idl_index *sbrec_port_binding_by_name,
+                 const struct sbrec_port_binding *binding_rec,
                  struct shash *nat_addresses)
 {
     volatile struct garp_data *garp = NULL;
@@ -2611,7 +2638,7 @@ send_garp_update(const struct sbrec_port_binding *binding_rec,
                     add_garp(name, laddrs->ea,
                              laddrs->ipv4_addrs[i].addr,
                              binding_rec->datapath->tunnel_key,
-                             binding_rec->tunnel_key);
+                             binding_rec->tunnel_key, false, 0);
                 }
                 free(name);
             }
@@ -2621,6 +2648,64 @@ send_garp_update(const struct sbrec_port_binding *binding_rec,
         return;
     }

+    /* Update GARPs for local chassisredirect port, if the peer
+     * layer 2 switch is of type vlan.
+     */
+    if (!strcmp(binding_rec->type, "chassisredirect")) {
+        struct eth_addr mac;
+        ovs_be32 ip, mask;
+        uint32_t dp_key = 0;
+        uint32_t port_key = 0;
+        const struct sbrec_port_binding *peer_port = NULL;
+        const struct sbrec_port_binding *distributed_port = NULL;
+
+        if (!ovn_sbrec_get_port_binding_ip_mac(binding_rec, &mac,
+                                               &ip, &mask)) {
+            /* Router Port binding without ip and mac configured. */
+            static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 1);
+            VLOG_WARN_RL(&rl, "cannot send garp, router port binding: %s, "
+                         "does not have proper ip,mac values: %s",
+                         binding_rec->logical_port, *binding_rec->mac);
+            return;
+        }
+
+        const char *lrp_name = smap_get(&binding_rec->options,
+                                        "distributed-port");
+        ovs_assert(lrp_name);
+
+        distributed_port = lport_lookup_by_name(sbrec_port_binding_by_name,
+                                                lrp_name);
+        ovs_assert(distributed_port);
+
+        const char *peer_name = smap_get(&distributed_port->options, "peer");
+        ovs_assert(peer_name);
+
+        peer_port = lport_lookup_by_name(sbrec_port_binding_by_name,
+                                         peer_name);
+        ovs_assert(peer_port);
+
+        const char *network_type = smap_get(&peer_port->datapath->external_ids,
+                                            "network-type");
+
+        /* Advertise GARP only if logical switch is of type bridged. */
+        if (!network_type || strcmp(network_type, "bridged")) {
+            return;
+        }
+
+        dp_key = peer_port->datapath->tunnel_key;
+        port_key = peer_port->tunnel_key;
+
+        garp = shash_find_data(&send_garp_data, binding_rec->logical_port);
+        if (garp) {
+            garp->dp_key = dp_key;
+            garp->port_key = port_key;
+        } else {
+            add_garp(binding_rec->logical_port, mac, ip,
+                     dp_key, port_key, true, GARP_DEF_REPEAT_INTERVAL_MS);
+        }
+        return;
+    }
+
     /* Update GARP for vif if it exists. */
     garp = shash_find_data(&send_garp_data, binding_rec->logical_port);
     if (garp) {
@@ -2640,7 +2725,8 @@ send_garp_update(const struct sbrec_port_binding *binding_rec,

         add_garp(binding_rec->logical_port,
                  laddrs.ea, laddrs.ipv4_addrs[0].addr,
-                 binding_rec->datapath->tunnel_key, binding_rec->tunnel_key);
+                 binding_rec->datapath->tunnel_key, binding_rec->tunnel_key,
+                 false, 0);

         destroy_lport_addresses(&laddrs);
         break;
@@ -2702,7 +2788,12 @@ send_garp(struct rconn *swconn, struct garp_data *garp,
         garp->backoff *= 2;
         garp->announce_time = current_time + garp->backoff * 1000;
     } else {
-        garp->announce_time = LLONG_MAX;
+        if (garp->is_repeat) {
+            garp->backoff = 1;
+            garp->announce_time = current_time + garp->repeat_interval;
+        } else {
+            garp->announce_time = LLONG_MAX;
+        }
     }
     return garp->announce_time;
 }
@@ -2786,25 +2877,6 @@ get_localnet_vifs_l3gwports(
     sbrec_port_binding_index_destroy_row(target);
 }

-static bool
-pinctrl_is_chassis_resident(struct ovsdb_idl_index *sbrec_port_binding_by_name,
-                            const struct sbrec_chassis *chassis,
-                            const struct sset *active_tunnels,
-                            const char *port_name)
-{
-    const struct sbrec_port_binding *pb
-        = lport_lookup_by_name(sbrec_port_binding_by_name, port_name);
-    if (!pb || !pb->chassis) {
-        return false;
-    }
-    if (strcmp(pb->type, "chassisredirect")) {
-        return pb->chassis == chassis;
-    } else {
-        return ha_chassis_group_is_active(pb->ha_chassis_group,
-                                          active_tunnels, chassis);
-    }
-}
-
 /* Extracts the mac, IPv4 and IPv6 addresses, and logical port from
  * 'addresses' which should be of the format 'MAC [IP1 IP2 ..]
  * [is_chassis_resident("LPORT_NAME")]', where IPn should be a valid IPv4
@@ -2946,6 +3018,67 @@ get_nat_addresses_and_keys(struct ovsdb_idl_index *sbrec_port_binding_by_name,
 }

 static void
+get_local_cr_ports(struct ovsdb_idl_index *sbrec_port_binding_by_name,
+                   struct sset *local_cr_ports,
+                   struct sset *local_l3gw_ports,
+                   const struct sbrec_chassis *chassis,
+                   const struct sset *active_tunnels)
+{
+    const char *gw_port;
+    SSET_FOR_EACH (gw_port, local_l3gw_ports) {
+        const struct sbrec_port_binding *binding_rec;
+
+        binding_rec = lport_lookup_by_name(sbrec_port_binding_by_name,
+                                           gw_port);
+        if (!binding_rec) {
+            continue;
+        }
+
+        /* For a patch port, we will send GARPs for the peer's ip and mac. */
+        if (!strcmp(binding_rec->type, "patch")) {
+            const struct sbrec_port_binding *cr_port = NULL;
+
+            bool is_cr_resident;
+            struct eth_addr mac;
+            ovs_be32 ip, mask;
+
+            const char *peer_name = smap_get(&binding_rec->options, "peer");
+            ovs_assert(peer_name);
+
+            char *cr_peer_name = xasprintf("cr-%s", peer_name);
+            cr_port = lport_lookup_by_name(sbrec_port_binding_by_name,
+                                           cr_peer_name);
+            free(cr_peer_name);
+
+            if (!cr_port) {
+                continue;
+            }
+
+            is_cr_resident = pinctrl_is_chassis_resident
+                                (sbrec_port_binding_by_name,
+                                 chassis,
+                                 active_tunnels,
+                                 cr_port->logical_port);
+            if (!is_cr_resident) {
+                continue;
+            }
+
+            if (!ovn_sbrec_get_port_binding_ip_mac(cr_port, &mac, &ip,
+                                                   &mask)) {
+                /* Router Port binding without ip and mac configured. */
+                static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 1);
+                VLOG_WARN_RL(&rl, "cannot send garp, router port binding: %s, "
+                             "does not have proper ip,mac values: %s",
+                              cr_port->logical_port, *cr_port->mac);
+                return;
+            }
+
+            sset_add(local_cr_ports, cr_port->logical_port);
+        }
+    }
+}
+
+static void
 send_garp_wait(long long int send_garp_time)
 {
     /* Set the poll timer for next garp only if there is garp data to
@@ -2990,6 +3123,8 @@ send_garp_prepare(struct ovsdb_idl_index *sbrec_port_binding_by_datapath,
 {
     struct sset localnet_vifs = SSET_INITIALIZER(&localnet_vifs);
     struct sset local_l3gw_ports = SSET_INITIALIZER(&local_l3gw_ports);
+    struct sset local_cr_ports = SSET_INITIALIZER(&local_cr_ports);
+
     struct sset nat_ip_keys = SSET_INITIALIZER(&nat_ip_keys);
     struct shash nat_addresses;

@@ -3004,11 +3139,17 @@ send_garp_prepare(struct ovsdb_idl_index *sbrec_port_binding_by_datapath,
                                &nat_ip_keys, &local_l3gw_ports,
                                chassis, active_tunnels,
                                &nat_addresses);
+
+    get_local_cr_ports(sbrec_port_binding_by_name,
+                       &local_cr_ports, &local_l3gw_ports,
+                       chassis, active_tunnels);
+
     /* For deleted ports and deleted nat ips, remove from send_garp_data. */
     struct shash_node *iter, *next;
     SHASH_FOR_EACH_SAFE (iter, next, &send_garp_data) {
         if (!sset_contains(&localnet_vifs, iter->name) &&
-            !sset_contains(&nat_ip_keys, iter->name)) {
+            !sset_contains(&nat_ip_keys, iter->name) &&
+            !sset_contains(&local_cr_ports, iter->name)) {
             send_garp_delete(iter->name);
         }
     }
@@ -3019,7 +3160,7 @@ send_garp_prepare(struct ovsdb_idl_index *sbrec_port_binding_by_datapath,
         const struct sbrec_port_binding *pb = lport_lookup_by_name(
             sbrec_port_binding_by_name, iface_id);
         if (pb) {
-            send_garp_update(pb, &nat_addresses);
+            send_garp_update(sbrec_port_binding_by_name, pb, &nat_addresses);
         }
     }

@@ -3029,7 +3170,17 @@ send_garp_prepare(struct ovsdb_idl_index *sbrec_port_binding_by_datapath,
         const struct sbrec_port_binding *pb
             = lport_lookup_by_name(sbrec_port_binding_by_name, gw_port);
         if (pb) {
-            send_garp_update(pb, &nat_addresses);
+            send_garp_update(sbrec_port_binding_by_name, pb, &nat_addresses);
+        }
+    }
+
+    /* Update send_garp_data for chassisredirect router ports. */
+    const char *cr_port;
+    SSET_FOR_EACH (cr_port, &local_cr_ports) {
+        const struct sbrec_port_binding *pb
+            = lport_lookup_by_name(sbrec_port_binding_by_name, cr_port);
+        if (pb) {
+            send_garp_update(sbrec_port_binding_by_name, pb, &nat_addresses);
         }
     }

diff --git a/ovn/controller/pinctrl.h b/ovn/controller/pinctrl.h
index f61d705..92f704e 100644
--- a/ovn/controller/pinctrl.h
+++ b/ovn/controller/pinctrl.h
@@ -44,4 +44,10 @@ void pinctrl_run(struct ovsdb_idl_txn *ovnsb_idl_txn,
 void pinctrl_wait(struct ovsdb_idl_txn *ovnsb_idl_txn);
 void pinctrl_destroy(void);

+bool
+pinctrl_is_chassis_resident(struct ovsdb_idl_index *sbrec_port_binding_by_name,
+                            const struct sbrec_chassis *chassis,
+                            const struct sset *active_tunnels,
+                            const char *port_name);
+
 #endif /* ovn/pinctrl.h */
diff --git a/ovn/lib/ovn-util.c b/ovn/lib/ovn-util.c
index 0f07d80..3d0ad8e 100644
--- a/ovn/lib/ovn-util.c
+++ b/ovn/lib/ovn-util.c
@@ -16,6 +16,7 @@
 #include "ovn-util.h"
 #include "dirs.h"
 #include "openvswitch/vlog.h"
+#include "openvswitch/ofp-parse.h"
 #include "ovn/lib/ovn-nb-idl.h"
 #include "ovn/lib/ovn-sb-idl.h"

@@ -371,3 +372,33 @@ ovn_logical_flow_hash(const struct uuid *logical_datapath,
     hash = hash_string(match, hash);
     return hash_string(actions, hash);
 }
+
+/*  Extracts the mac, ip and mask for a sbrec_port_binding.
+ *
+ *  Expects following format:
+ *  "MAC_ADDRESS IP/MASK"
+ *
+ *  Return true if MAC, IP and MASK are found, false otherwise.
+ */
+bool
+ovn_sbrec_get_port_binding_ip_mac(const struct sbrec_port_binding *binding,
+                                  struct eth_addr *mac,
+                                  ovs_be32 *ip, ovs_be32 *mask)
+{
+    char *err_str = NULL;
+
+    err_str = str_to_mac(binding->mac[0], mac);
+    if (err_str) {
+        free(err_str);
+        return false;
+    }
+
+    err_str = ip_parse_masked(binding->mac[0] + ETH_ADDR_STRLEN + 1,
+                              ip, mask);
+    if (err_str) {
+        free(err_str);
+        return false;
+    }
+
+    return true;
+}
diff --git a/ovn/lib/ovn-util.h b/ovn/lib/ovn-util.h
index 6d5e1df..c01595a 100644
--- a/ovn/lib/ovn-util.h
+++ b/ovn/lib/ovn-util.h
@@ -19,6 +19,7 @@
 #include "lib/packets.h"

 struct nbrec_logical_router_port;
+struct sbrec_port_binding;
 struct sbrec_logical_flow;
 struct uuid;

@@ -81,4 +82,9 @@ uint32_t ovn_logical_flow_hash(const struct uuid *logical_datapath,
                                uint16_t priority,
                                const char *match, const char *actions);

+bool
+ovn_sbrec_get_port_binding_ip_mac(const struct sbrec_port_binding *binding,
+                                  struct eth_addr *mac, ovs_be32 *ip,
+                                  ovs_be32 *mask);
+
 #endif
diff --git a/ovn/northd/ovn-northd.c b/ovn/northd/ovn-northd.c
index 74d3692..6835910 100644
--- a/ovn/northd/ovn-northd.c
+++ b/ovn/northd/ovn-northd.c
@@ -5914,6 +5914,20 @@ build_lrouter_flows(struct hmap *datapaths, struct hmap *ports,
                     ds_put_format(&match, " && is_chassis_resident(%s)",
                                   op->od->l3redirect_port->json_key);
                 }
+            } else if (op->peer &&
+                       op->peer->od->network_type == DP_NETWORK_BRIDGED) {
+                /* For a router port connected to bridged logical switch,
+                 * we will always have the is_chassis_resident check.
+                 * This is because there could be vm/server on vlan network,
+                 * but not on OVN chassis and could end up arping for router
+                 * port ip.
+                 *
+                 * This check works on the assumption that for OVN chassis,
+                 * VMs logical switch ARP responder will respond to ARP
+                 * requests for router port IP.
+                 */
+                ds_put_format(&match, " && is_chassis_resident(\"cr-%s\")",
+                              op->key);
             }

             ds_clear(&actions);
@@ -7365,18 +7379,23 @@ build_lrouter_flows(struct hmap *datapaths, struct hmap *ports,
             ovn_lflow_add(lflows, od, S_ROUTER_IN_GW_REDIRECT, 300,
                           REGBIT_DISTRIBUTED_NAT" == 1", "next;");

-            /* For traffic with outport == l3dgw_port, if the
-             * packet did not match any higher priority redirect
-             * rule, then the traffic is redirected to the central
-             * instance of the l3dgw_port. */
-            ds_clear(&match);
-            ds_put_format(&match, "outport == %s",
-                          od->l3dgw_port->json_key);
-            ds_clear(&actions);
-            ds_put_format(&actions, "outport = %s; next;",
-                          od->l3redirect_port->json_key);
-            ovn_lflow_add(lflows, od, S_ROUTER_IN_GW_REDIRECT, 50,
-                          ds_cstr(&match), ds_cstr(&actions));
+            /* For VLAN backed networks, default match will not redirect to
+             * chassis redirect port. */
+            if (od->l3dgw_port->peer &&
+                od->l3dgw_port->peer->od->network_type == DP_NETWORK_OVERLAY) {
+                /* For traffic with outport == l3dgw_port, if the
+                 * packet did not match any higher priority redirect
+                 * rule, then the traffic is redirected to the central
+                 * instance of the l3dgw_port. */
+                ds_clear(&match);
+                ds_put_format(&match, "outport == %s",
+                              od->l3dgw_port->json_key);
+                ds_clear(&actions);
+                ds_put_format(&actions, "outport = %s; next;",
+                              od->l3redirect_port->json_key);
+                ovn_lflow_add(lflows, od, S_ROUTER_IN_GW_REDIRECT, 50,
+                              ds_cstr(&match), ds_cstr(&actions));
+            }

Looks like this code is having some side effects.


Point 1.
======
For my public switch if I don't set the network_type as "bridged",
then I see the below logical flows and think this is as expected. And I think
that's why in my v7 tests the packets were tunneled to the gw chassis (as
you mentioned in the reply).

****
table=12(lr_in_gw_redirect  ), priority=300  , match=(reg9[2] == 1), action=(next;)
  table=12(lr_in_gw_redirect  ), priority=200  , match=(reg9[0] == 1), action=(outport = "cr-lr0-public"; next;)
  table=12(lr_in_gw_redirect  ), priority=150  , match=(outport == "lr0-public" && eth.dst == 00:00:00:00:00:00), action=(outport = "cr-lr0-public"; next;)
  table=12(lr_in_gw_redirect  ), priority=50   , match=(outport == "lr0-public"), action=(outport = "cr-lr0-public"; next;)
  table=12(lr_in_gw_redirect  ), priority=0    , match=(1), action=(next;)
****

If I set the type as "bridged", I see the below flows

****
 table=12(lr_in_gw_redirect  ), priority=300  , match=(reg9[2] == 1), action=(next;)
  table=12(lr_in_gw_redirect  ), priority=200  , match=(reg9[0] == 1), action=(outport = "cr-lr0-public"; next;)
  table=12(lr_in_gw_redirect  ), priority=150  , match=(outport == "lr0-sw1" && reg0 == 20.0.0.3 && eth.dst == 00:00:00:00:00:00), action=(eth.dst = 40:54:00:00:00:03; next;)
  table=12(lr_in_gw_redirect  ), priority=0    , match=(1), action=(next;)
****

I don't understand the 3rd flow with the match -- "outport == "lr0-sw1"...

Looks like the "match" and "action" variables have some old data. Please look into the code again.
[ANKUR]:
Yup, missed out on clearing match and actions, thanks for calling it out.

After the "if" condition you added in this patch at line 7384, the below code is still there and it doesn't make sense

******
             /* For VLAN backed networks, default match will not redirect to
             * chassis redirect port. */
            if (od->l3dgw_port->peer &&
                od->l3dgw_port->peer->od->network_type == DP_NETWORK_OVERLAY) {
                /* For traffic with outport == l3dgw_port, if the
                 * packet did not match any higher priority redirect
                 * rule, then the traffic is redirected to the central
                 * instance of the l3dgw_port. */
                ds_clear(&match);
                ds_put_format(&match, "outport == %s",
                              od->l3dgw_port->json_key);
                ds_clear(&actions);
                ds_put_format(&actions, "outport = %s; next;",
                              od->l3redirect_port->json_key);
                ovn_lflow_add(lflows, od, S_ROUTER_IN_GW_REDIRECT, 50,
                              ds_cstr(&match), ds_cstr(&actions));
            }

            /* If the Ethernet destination has not been resolved,
             * redirect to the central instance of the l3dgw_port.
             * Such traffic will be replaced by an ARP request or ND
             * Neighbor Solicitation in the ARP request ingress
             * table, before being redirected to the central instance.
             */
            ds_put_format(&match, " && eth.dst == 00:00:00:00:00:00");      ====> THIS ONE
            ovn_lflow_add(lflows, od, S_ROUTER_IN_GW_REDIRECT, 150,   ====> AND THIS ONE
                          ds_cstr(&match), ds_cstr(&actions));
        }

[ANKUR]:
The intention here was only to remove the flow that redirected everything destined to the router port to the chassis redirect router port.
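
For illustration, the stale-buffer effect discussed above can be reproduced outside ovn-northd. This is only a sketch: "struct buf", buf_clear(), buf_append() and build_unresolved_match() are hypothetical stand-ins (the real code uses the OVS dynamic string helpers), and the "rebuild" flag is one possible fix, not necessarily what v10 will do. It shows that when the block that resets the buffer is skipped, the priority-150 match is appended to whatever the previous flow left behind.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for the OVS dynamic string; fixed size for brevity. */
struct buf {
    char s[256];
};

static void
buf_clear(struct buf *b)
{
    b->s[0] = '\0';
}

static void
buf_append(struct buf *b, const char *text)
{
    assert(strlen(b->s) + strlen(text) < sizeof b->s);
    strcat(b->s, text);
}

/* Builds the match of the priority-150 "unresolved eth.dst" flow.
 *
 * 'match' arrives holding whatever the previous flow left in it.
 * 'overlay' says whether the priority-50 block runs; in v9 that block is
 * the only place where the buffer is reset.  'rebuild' applies the fix
 * sketched here: reset and refill the buffer regardless of 'overlay'. */
static void
build_unresolved_match(struct buf *match, const char *gw_port,
                       int overlay, int rebuild)
{
    if (overlay || rebuild) {
        buf_clear(match);
        buf_append(match, "outport == ");
        buf_append(match, gw_port);
    }
    buf_append(match, " && eth.dst == 00:00:00:00:00:00");
}
```

With the buffer pre-filled with a match from an earlier flow and overlay == 0, rebuild == 0, the result keeps the old "lr0-sw1" prefix, which is the odd third flow shown earlier. With rebuild == 1 the match starts from the gateway port again. The same reasoning applies to the actions buffer.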

********

Point 2
=====

This patch breaks the S/N traffic if we have a logical switch (sw0) of type overlay connected
to a router and the router also has a gw port connected to a logical switch (public) of type bridged (i.e. provider network).
This public switch has a localnet port.

Something like this - http://paste.openstack.org/show/752427/

It works fine if I change the type of the logical switch - public to overlay. But this doesn't make sense, since
the logical switch - public is a provider (or bridged) network and CMS can set the type as bridged.

I still think it's better not to have a "network_type" column in logical_switch. We can always treat a logical
switch that has a localnet port as type "bridged" and one without a localnet port as type "overlay".

[ANKUR]:
Agreed, the nomenclature and usage of the type field are confusing. It will be difficult to convey to, or expect from, the CMS
that it will NOT end up using the field even where it was not supposed to. I mentioned in the email reply that I will be removing it
from the current patch series, and we will have separate config knobs for the use cases this field was added for.

This patch series sets the network_type=bridged in the external_ids of the datapath_binding row in SB DB.

Please see my comments in v4 of the patch 1 where I suggested something like below

****
enum ovn_datapath_nw_type {
    DP_NETWORK_OVERLAY,
    DP_NETWORK_PROVIDER
};

static void
ovn_datapath_update_nw_type(struct ovn_datapath *od)
{
    if (!od->nbs) {
        return;
    }

    if (!od->localnet_port) {
        od->network_type = DP_NETWORK_OVERLAY;
    } else {
        od->network_type = DP_NETWORK_PROVIDER;
    }
}
******

I think you can still set the external_ids of the datapath_binding row with "network_type=bridged"
if od->network_type is BRIDGED, so that ovn-controller can distinguish whether it is a bridged or an overlay datapath.
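
A self-contained sketch of that idea follows. Everything in it is hypothetical and only for illustration: "struct datapath_sketch" is a trimmed-down stand-in for the northd datapath, and derive_nw_type() and nw_type_external_id() are made-up helper names. Unlike the snippet above, it reports overlay for non-switch datapaths instead of leaving the field untouched.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum nw_type {
    NW_OVERLAY,
    NW_PROVIDER
};

/* Hypothetical, trimmed-down view of a datapath. */
struct datapath_sketch {
    bool is_switch;          /* Stands in for od->nbs != NULL. */
    bool has_localnet_port;  /* Stands in for od->localnet_port != NULL. */
};

/* Derives the network type from the topology instead of from a
 * CMS-provided column: a switch with a localnet port is a provider
 * (bridged) network, anything else is treated as overlay. */
static enum nw_type
derive_nw_type(const struct datapath_sketch *od)
{
    assert(od != NULL);
    return od->is_switch && od->has_localnet_port ? NW_PROVIDER : NW_OVERLAY;
}

/* Returns the value to store under the "network_type" key in the
 * datapath binding's external_ids, or NULL if the key should be omitted. */
static const char *
nw_type_external_id(enum nw_type type)
{
    return type == NW_PROVIDER ? "bridged" : NULL;
}
```

With this shape, existing deployments that already have localnet ports would be reported as bridged without any CMS change, which is the upgrade concern raised below.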


I am mainly thinking from an upgrade perspective for existing deployments once this patch series is applied.
Until the CMS changes the network_type to "bridged" for all the logical switches with localnet ports in the
existing deployments, "ovn-nbctl show" will show these logical switches as "overlay", which is weird.
And later we may encounter other issues when enhancing OVN with new features.
[ANKUR]:
Yes, the value is quite easy to get confused by; I will be removing it in v10.

I think instead of adding the code to skip the redirection to the gateway chassis in ovn-northd if it's a bridged network,
it's better to handle it in table 32, and since the mac replacement is handled in table 65, it probably makes more sense this way.
[ANKUR]:
IMO, Table 32 should only decide whether redirection has to be done over overlay or vlan, while the logical flow should decide whether redirection
is needed or not.

Thanks
Numan


             /* If the Ethernet destination has not been resolved,
              * redirect to the central instance of the l3dgw_port.
diff --git a/ovn/ovn-architecture.7.xml b/ovn/ovn-architecture.7.xml
index 6275db1..6df711e 100644
--- a/ovn/ovn-architecture.7.xml
+++ b/ovn/ovn-architecture.7.xml
@@ -1441,7 +1441,7 @@
     </li>
   </ol>

-  <h3>External traffic</h3>
+  <h3>External traffic (NAT)</h3>

   <p>
     The following happens when a VM sends an external traffic (which requires
@@ -1607,6 +1607,91 @@
     </li>
   </ol>

+  <h3>External traffic (NO NAT)</h3>
+  <p>
+    The following happens when a VM sends an external traffic (i.e. to a
+    network not connected to the logical router), but there is no need for NATing.
+  </p>
+
+  <p>
+    Since, there is no NATing required, hence we need not redirect the packet
+    to a gateway chassis. As a result, this packet flow is same as East-West.
+    In order to ensure that OVN will not redirect the packet over a tunnel
+    to gateway-chassis, "network_type" of destination localnet logical switch,
+    should be set as "bridged". A "bridged" logical switch ensures that there
+    is no tunnel encapsulation done while forwarding the packet on it.
+    Please refer to <code>ovn-nb</code>(5) for more details.
+  </p>
+
+  <ol>
+    <li>
+      It first enters the ingress pipeline, and then egress pipeline of the
+      source localnet logical switch datapath. It then enters the ingress
+      pipeline of the logical router datapath via the logical router port in
+      the source chassis.
+    </li>
+
+    <li>
+      Routing decision is taken. Since the destination network is NOT directly
+      connected to the logical router, a static route is expected, which will
+      provide the next hop ip.
+    </li>
+
+    <li>
+      From the router datapath, packet enters the ingress pipeline and then
+      egress pipeline of the destination localnet logical switch datapath
+      (it is of type "bridged" and this is where the next hop is present)
+      and goes out of the integration bridge to the provider bridge (
+      belonging to the destination logical switch) via the localnet port.
+      Same as East-West, source mac will be replaced with chassis mac.
+    </li>
+  </ol>
+
+  <p>
+    The following happens for the reverse external traffic.
+  </p>
+
+  <ol>
+    <li>
+      The gateway chassis receives the packet from the localnet port of
+      the logical switch (bridged type) which provides external connectivity.
+      The packet then enters the ingress pipeline and then egress pipeline of
+      the localnet logical switch (which provides external connectivity).
+      The packet then enters the ingress pipeline of the logical router
+      datapath.
+    </li>
+
+    <li>
+      Routing decision is taken and logical switch of destination VM is
+      identified.
+    </li>
+
+    <li>
+      The packet then enters the ingress pipeline and then egress
+      pipeline of VM's localnet logical switch. Since the destination VM
+      doesn't reside in the gateway chassis, the packet is sent out via the
+      localnet port of the VM's logical switch. Source mac of this packet
+      will be replaced with chassis unique mac.
+    </li>
+
+    <li>
+      VM's chassis receives the packet via the localnet port and
+      sends it to the integration bridge. The packet enters the
+      ingress pipeline and then egress pipeline of the localnet
+      logical switch and finally gets delivered to the VM port.
+    </li>
+  </ol>
+
+  <p>
+    One thing to note here is that, while VM to External traffic did not
+    require redirection to gateway chassis, the reverse traffic is through
+    gateway chassis only. This is because, for external router, OVN logical
+    router port IP will be the next hop to reach the endpoints behind it.
+    As a result, we need a centralized chassis, which will respond to ARP
+    requests coming from external network. This centralized chassis, is the
+    gateway chassis which is attached to corresponding router port.
+  </p>
+
   <h2>Life Cycle of a VTEP gateway</h2>

   <p>
diff --git a/tests/ovn.at b/tests/ovn.at
index e5108a7..8a03393 100644
--- a/tests/ovn.at
+++ b/tests/ovn.at
@@ -29,6 +29,12 @@ m4_define([OVN_CHECK_PACKETS],
   [ovn_check_packets__ "$1" "$2"
    AT_CHECK([sort $rcv_text], [0], [expout])])

+m4_define([OVN_CHECK_PACKETS_REMOVE_BROADCAST],
+  [ovn_check_packets__ "$1" "$2"
+   echo "received_text=$rcv_text"
+   sed -i '/ffffffffffff/d' $rcv_text
+   AT_CHECK([sort $rcv_text], [0], [expout])])
+
 AT_BANNER([OVN components])

 AT_SETUP([ovn -- lexer])
@@ -14018,7 +14024,7 @@ ovn-hv4-0
 OVN_CLEANUP([hv1], [hv2], [hv3])
 AT_CLEANUP

-AT_SETUP([ovn -- 2 HVs, 2 lports/HV, localnet ports, DVR chassis mac])
+AT_SETUP([ovn -- 2 HVs, 2 lports/HV, localnet ports, DVR E-W chassis mac])
 ovn_start


@@ -14028,6 +14034,8 @@ ovn_start
 # of VIF port name indicates the hypervisor it is bound to, e.g.
 # lp23 means VIF 3 on hv2.
 #
+# Both the switches are connected to a logical router "router".
+#
 # Each switch's VLAN tag and their logical switch ports are:
 #   - ls1:
 #       - tagged with VLAN 101
@@ -14185,6 +14193,7 @@ test_ip() {
 echo "------ OVN dump ------"
 ovn-nbctl show
 ovn-sbctl show
+ovn-sbctl list port_binding

 echo "------ hv1 dump ------"
 as hv1 ovs-vsctl show
@@ -14211,6 +14220,727 @@ as hv2 ovs-appctl fdb/show br-phys

 OVN_CHECK_PACKETS([hv2/vif22-tx.pcap], [vif22.expected])

+
+# Associate a chassis as gateway chassis and validate garp.
+
+OVN_CLEANUP([hv1],[hv2])
+
+AT_CLEANUP
+
+
+AT_SETUP([ovn -- 2 HVs, 2 lports/HV, localnet ports, DVR N-S GARP])
+ovn_start
+
+
+# In this test cases we create 2 switches, all connected to same
+# physical network (through br-phys on each HV). Each switch has
+# 1 VIF. Each HV has 1 VIF port. The first digit
+# of VIF port name indicates the hypervisor it is bound to, e.g.
+# lp23 means VIF 3 on hv2.
+#
+# Both the switches are connected to a logical router "router".
+#
+# Additionally, we create a logical switch (ls-underlay) for N-S traffic.
+#
+# Each switch's VLAN tag and their logical switch ports are:
+#   - ls1:
+#       - tagged with VLAN 101
+#       - ports: lp11
+#   - ls2:
+#       - tagged with VLAN 201
+#       - ports: lp22
+#   - ls-underlay:
+#       - tagged with VLAN 1000
+#
+# Note: a localnet port is created for each switch to connect to
+# physical network.
+# lsp_to_ls LSP
+#
+# Prints the name of the logical switch that contains LSP.
+
+net_add n1
+for i in 1 2; do
+    sim_add hv$i
+    as hv$i
+    ovs-vsctl add-br br-phys
+    ovs-vsctl set open . external-ids:ovn-bridge-mappings=phys:br-phys
+    ovs-vsctl set open . external-ids:ovn-chassis-mac-mappings="phys:aa:bb:cc:dd:ee:$i$i"
+    ovs-vsctl set open . external-ids:system-id="HV$i"
+    ovn_attach n1 br-phys 192.168.0.$i
+    ovs-vsctl set-controller br-int ptcp:
+    AT_CHECK([ovs-vsctl add-port br-phys snoopvif -- set Interface snoopvif options:tx_pcap=hv$i/snoopvif-tx.pcap options:rxq_pcap=hv$i/snoopvif-rx.pcap])
+done
+
+ovn-nbctl ls-add ls-underlay bridged
+ovn-nbctl lsp-add ls-underlay ln3 "" 1000
+ovn-nbctl lsp-set-addresses ln3 unknown
+ovn-nbctl lsp-set-type ln3 localnet
+ovn-nbctl lsp-set-options ln3 network_name=phys
+
+ovn-nbctl lr-add router
+ovn-nbctl lrp-add router router-to-underlay 00:00:01:01:02:07 172.31.0.1/24
+
+ovn-nbctl lsp-add ls-underlay underlay-to-router -- set Logical_Switch_Port \
+                              underlay-to-router type=router \
+                              options:router-port=router-to-underlay \
+                              -- lsp-set-addresses underlay-to-router router
+
+ovn-nbctl --wait=sb sync
+
+# Associate hv2 as gateway chassis
+ovn-nbctl lrp-set-gateway-chassis router-to-underlay hv2
+
+ovn-nbctl show
+ovn-sbctl show
+
+# Dump a bunch of info helpful for debugging if there's a failure.
+
+echo "------ OVN dump ------"
+ovn-nbctl show
+ovn-sbctl show
+
+echo "------ hv1 dump ------"
+as hv1 ovs-vsctl show
+as hv1 ovs-vsctl list Open_Vswitch
+
+echo "------ hv2 dump ------"
+as hv2 ovs-vsctl show
+as hv2 ovs-vsctl list Open_Vswitch
+
+sleep 1
+
+echo "----------- Post Traffic hv1 dump -----------"
+as hv1 ovs-ofctl -O OpenFlow13 dump-flows br-int
+as hv1 ovs-appctl fdb/show br-phys
+
+echo "----------- Post Traffic hv2 dump -----------"
+as hv2 ovs-ofctl -O OpenFlow13 dump-flows br-int
+as hv2 ovs-appctl fdb/show br-phys
+
+AT_CHECK([as hv2 ovs-appctl fdb/show br-phys | grep 00:00:01:01:02:07 | grep 1000 | wc -l], [0], [[1
+]])
+
+echo "ffffffffffff000001010207810003e808060001080006040001000001010207ac1f0001000000000000ac1f0001" > expected
+OVN_CHECK_PACKETS([hv2/snoopvif-tx.pcap], [expected])
+
 OVN_CLEANUP([hv1],[hv2])

 AT_CLEANUP
+
+
+AT_SETUP([ovn -- 2 HVs, 2 lports/HV, localnet ports, DVR N-S Ping])
+ovn_start
+
+# In this test cases we create 3 switches, all connected to same
+# physical network (through br-phys on each HV). LS1 and LS2 have
+# 1 VIF each. Each HV has 1 VIF port. The first digit
+# of VIF port name indicates the hypervisor it is bound to, e.g.
+# lp23 means VIF 3 on hv2.
+#
+# All the switches are connected to a logical router "router".
+#
+# Each switch's VLAN tag and their logical switch ports are:
+#   - ls1:
+#       - tagged with VLAN 101
+#       - ports: lp11
+#   - ls2:
+#       - tagged with VLAN 201
+#       - ports: lp22
+#   - ls-underlay:
+#       - tagged with VLAN 1000
+# Note: a localnet port is created for each switch to connect to
+# physical network.
+
+for i in 1 2; do
+    ls_name=ls$i
+    ovn-nbctl ls-add $ls_name bridged
+    ln_port_name=ln$i
+    if test $i -eq 1; then
+        ovn-nbctl lsp-add $ls_name $ln_port_name "" 101
+    elif test $i -eq 2; then
+        ovn-nbctl lsp-add $ls_name $ln_port_name "" 201
+    fi
+    ovn-nbctl lsp-set-addresses $ln_port_name unknown
+    ovn-nbctl lsp-set-type $ln_port_name localnet
+    ovn-nbctl lsp-set-options $ln_port_name network_name=phys
+done
+
+# lsp_to_ls LSP
+#
+# Prints the name of the logical switch that contains LSP.
+lsp_to_ls () {
+    case $1 in dnl (
+        lp?[[11]]) echo ls1 ;; dnl (
+        lp?[[12]]) echo ls2 ;; dnl (
+        *) AT_FAIL_IF([:]) ;;
+    esac
+}
+
+vif_to_hv () {
+    case $1 in dnl (
+        vif[[1]]?) echo hv1 ;; dnl (
+        vif[[2]]?) echo hv2 ;; dnl (
+        vif?[[north]]?) echo hv4 ;; dnl (
+        *) AT_FAIL_IF([:]) ;;
+    esac
+}
+
+ip_to_hex() {
+       printf "%02x%02x%02x%02x" "$@"
+}
+
+net_add n1
+for i in 1 2; do
+    sim_add hv$i
+    as hv$i
+    ovs-vsctl add-br br-phys
+    ovs-vsctl set open . external-ids:ovn-bridge-mappings=phys:br-phys
+    ovs-vsctl set open . external-ids:ovn-chassis-mac-mappings="phys:aa:bb:cc:dd:ee:$i$i"
+    ovn_attach n1 br-phys 192.168.0.$i
+
+    ovs-vsctl add-port br-int vif$i$i -- \
+        set Interface vif$i$i external-ids:iface-id=lp$i$i \
+                              options:tx_pcap=hv$i/vif$i$i-tx.pcap \
+                              options:rxq_pcap=hv$i/vif$i$i-rx.pcap \
+                              ofport-request=$i$i
+
+    lsp_name=lp$i$i
+    ls_name=$(lsp_to_ls $lsp_name)
+
+    ovn-nbctl lsp-add $ls_name $lsp_name
+    ovn-nbctl lsp-set-addresses $lsp_name "f0:00:00:00:00:$i$i 192.168.$i.$i"
+    ovn-nbctl lsp-set-port-security $lsp_name f0:00:00:00:00:$i$i
+
+    OVS_WAIT_UNTIL([test x`ovn-nbctl lsp-get-up $lsp_name` = xup])
+
+done
+
+ovn-nbctl ls-add ls-underlay bridged
+ovn-nbctl lsp-add ls-underlay ln3 "" 1000
+ovn-nbctl lsp-set-addresses ln3 unknown
+ovn-nbctl lsp-set-type ln3 localnet
+ovn-nbctl lsp-set-options ln3 network_name=phys
+
+ovn-nbctl ls-add ls-north bridged
+ovn-nbctl lsp-add ls-north ln4 "" 1000
+ovn-nbctl lsp-set-addresses ln4 unknown
+ovn-nbctl lsp-set-type ln4 localnet
+ovn-nbctl lsp-set-options ln4 network_name=phys
+
+# Add a VM on ls-north
+ovn-nbctl lsp-add ls-north lp-north
+ovn-nbctl lsp-set-addresses lp-north "f0:f0:00:00:00:11 172.31.0.10"
+ovn-nbctl lsp-set-port-security lp-north f0:f0:00:00:00:11
+
+# Add 3rd hypervisor
+sim_add hv3
+as hv3 ovs-vsctl add-br br-phys
+as hv3 ovs-vsctl set open . external-ids:ovn-bridge-mappings=phys:br-phys
+as hv3 ovs-vsctl set open . external-ids:ovn-chassis-mac-mappings="phys:aa:bb:cc:dd:ee:33"
+as hv3 ovn_attach n1 br-phys 192.168.0.3
+
+# Add 4th hypervisor
+sim_add hv4
+as hv4 ovs-vsctl add-br br-phys
+as hv4 ovs-vsctl set open . external-ids:ovn-bridge-mappings=phys:br-phys
+as hv4 ovs-vsctl set open . external-ids:ovn-chassis-mac-mappings="phys:aa:bb:cc:dd:ee:44"
+as hv4 ovn_attach n1 br-phys 192.168.0.4
+
+as hv4 ovs-vsctl add-port br-int vif-north -- \
+        set Interface vif-north external-ids:iface-id=lp-north \
+                              options:tx_pcap=hv4/vif-north-tx.pcap \
+                              options:rxq_pcap=hv4/vif-north-rx.pcap \
+                              ofport-request=44
+
+ovn-nbctl lr-add router
+ovn-nbctl lrp-add router router-to-ls1 00:00:01:01:02:03 192.168.1.3/24
+ovn-nbctl lrp-add router router-to-ls2 00:00:01:01:02:05 192.168.2.3/24
+ovn-nbctl lrp-add router router-to-underlay 00:00:01:01:02:07 172.31.0.1/24
+
+ovn-nbctl lsp-add ls1 ls1-to-router -- set Logical_Switch_Port ls1-to-router type=router \
+          options:router-port=router-to-ls1 -- lsp-set-addresses ls1-to-router router
+ovn-nbctl lsp-add ls2 ls2-to-router -- set Logical_Switch_Port ls2-to-router type=router \
+          options:router-port=router-to-ls2 -- lsp-set-addresses ls2-to-router router
+ovn-nbctl lsp-add ls-underlay underlay-to-router -- set Logical_Switch_Port \
+                              underlay-to-router type=router \
+                              options:router-port=router-to-underlay \
+                              -- lsp-set-addresses underlay-to-router router
+
+ovn-nbctl lrp-set-gateway-chassis router-to-underlay hv3
+
+ovn-nbctl --wait=sb sync
+
+sleep 2
+
+OVN_POPULATE_ARP
+
++# lsp_to_ls LSP
++#
++# Prints the name of the logical switch that contains LSP.
+lsp_to_ls () {
+    case $1 in dnl (
+        lp?[[11]]) echo ls1 ;; dnl (
+        lp?[[12]]) echo ls2 ;; dnl (
+        *) AT_FAIL_IF([:]) ;;
+    esac
+}
+
+vif_to_ls () {
+    case $1 in dnl (
+        vif?[[11]]) echo ls1 ;; dnl (
+        vif?[[12]]) echo ls2 ;; dnl (
+        vif-north) echo ls-north ;; dnl (
+        *) AT_FAIL_IF([:]) ;;
+    esac
+}
+
+hv_to_num () {
+    case $1 in dnl (
+        hv1) echo 1 ;; dnl (
+        hv2) echo 2 ;; dnl (
+        hv3) echo 3 ;; dnl (
+        hv4) echo 4 ;; dnl (
+        *) AT_FAIL_IF([:]) ;;
+    esac
+}
+
+vif_to_num () {
+    case $1 in dnl (
+        vif22) echo 22 ;; dnl (
+        vif21) echo 21 ;; dnl (
+        vif11) echo 11 ;; dnl (
+        *) AT_FAIL_IF([:]) ;;
+    esac
+}
+
+vif_to_hv () {
+    case $1 in dnl (
+        vif[[1]]?) echo hv1 ;; dnl (
+        vif[[2]]?) echo hv2 ;; dnl (
+        vif-north) echo hv4 ;; dnl (
+        *) AT_FAIL_IF([:]) ;;
+    esac
+}
+
+vif_to_lrp () {
+    echo router-to-`vif_to_ls $1`
+}
+
+ip_to_hex() {
+       printf "%02x%02x%02x%02x" "$@"
+}
+
+
+test_ip() {
+        # This packet has bad checksums but logical L3 routing doesn't check.
+        local inport=$1 src_mac=$2 dst_mac=$3 src_ip=$4 dst_ip=$5 outport=$6
+        local packet=${dst_mac}${src_mac}08004500001c0000000040110000${src_ip}${dst_ip}0035111100080000
+        shift; shift; shift; shift; shift
+        hv=`vif_to_hv $inport`
+        as $hv ovs-appctl netdev-dummy/receive $inport $packet
+        in_ls=`vif_to_ls $inport`
+        for outport; do
+            out_ls=`vif_to_ls $outport`
+            if test $in_ls = $out_ls; then
+                # Ports on the same logical switch receive exactly the same packet.
+                echo $packet
+            else
+                # Routing decrements TTL and updates source and dest MAC
+                # (and checksum).
+                out_lrp=`vif_to_lrp $outport`
+                # For North-South, the packet comes via the gateway chassis, i.e. hv3.
+                if test $inport = vif-north; then
+                    echo f00000000011aabbccddee3308004500001c000000003f110100${src_ip}${dst_ip}0035111100080000 >> $outport.expected
+                fi
+                if test $outport = vif-north; then
+                    echo f0f000000011aabbccddee1108004500001c000000003f110100${src_ip}${dst_ip}0035111100080000 >> $outport.expected
+                fi
+            fi >> $outport.expected
+        done
+}
+
+# Dump a bunch of info helpful for debugging if there's a failure.
+
+echo "------ OVN dump ------"
+ovn-nbctl show
+ovn-sbctl show
+ovn-sbctl list port_binding
+ovn-sbctl list mac_binding
+
+echo "------ hv1 dump ------"
+as hv1 ovs-vsctl show
+as hv1 ovs-vsctl list Open_Vswitch
+
+echo "------ hv2 dump ------"
+as hv2 ovs-vsctl show
+as hv2 ovs-vsctl list Open_Vswitch
+
+echo "------ hv3 dump ------"
+as hv3 ovs-vsctl show
+as hv3 ovs-vsctl list Open_Vswitch
+
+echo "------ hv4 dump ------"
+as hv4 ovs-vsctl show
+as hv4 ovs-vsctl list Open_Vswitch
+
+echo "Send traffic North to South"
+
+sip=`ip_to_hex 172 31 0 10`
+dip=`ip_to_hex 192 168 1 1`
+test_ip vif-north f0f000000011 000001010207 $sip $dip vif11
+
+sleep 1
+
+# Confirm that North to South traffic works and went through the gateway chassis, i.e. hv3.
+OVN_CHECK_PACKETS([hv1/vif11-tx.pcap], [vif11.expected])
+
+echo "Send traffic South to North"
+sip=`ip_to_hex 192 168 1 1`
+dip=`ip_to_hex 172 31 0 10`
+test_ip vif11 f00000000011 000001010203 $sip $dip vif-north
+
+sleep 1
+
+# Confirm that South to North traffic works fine.
+OVN_CHECK_PACKETS_REMOVE_BROADCAST([hv4/vif-north-tx.pcap], [vif-north.expected])
+
+# Confirm that packets did not go out via tunnel port.
+AT_CHECK([as hv1 ovs-ofctl dump-flows br-int | grep table=32 | grep NXM_NX_TUN_METADATA0 | grep n_packets=0 | wc -l], [0], [[1
+]])
+
+# Confirm that HV1 chassis mac is never seen on Gateway chassis, i.e HV3
+AT_CHECK([as hv3 ovs-appctl fdb/show br-phys | grep aa:bb:cc:dd:ee:11 | wc -l], [0], [[0
+]])
+
+echo "----------- Post Traffic hv1 dump -----------"
+as hv1 ovs-ofctl -O OpenFlow13 dump-flows br-int
+as hv1 ovs-appctl fdb/show br-phys
+
+echo "----------- Post Traffic hv2 dump -----------"
+as hv2 ovs-ofctl -O OpenFlow13 dump-flows br-int
+as hv2 ovs-appctl fdb/show br-phys
+
+echo "----------- Post Traffic hv3 dump -----------"
+as hv3 ovs-ofctl -O OpenFlow13 dump-flows br-int
+as hv3 ovs-appctl fdb/show br-phys
+
+echo "----------- Post Traffic hv4 dump -----------"
+as hv4 ovs-ofctl -O OpenFlow13 dump-flows br-int
+as hv4 ovs-appctl fdb/show br-phys
+
+OVN_CLEANUP([hv1],[hv2],[hv3],[hv4])
+
+AT_CLEANUP
+
+
+AT_SETUP([ovn -- 2 HVs, 2 lports/HV, localnet ports, DVR N-S ARP handling])
+ovn_start
+
+# In this test case we create 3 switches, all connected to the same
+# physical network (through br-phys on each HV). LS1 and LS2 have
+# 1 VIF each. Each HV has 1 VIF port. The first digit
+# of VIF port name indicates the hypervisor it is bound to, e.g.
+# lp23 means VIF 3 on hv2.
+#
+# All the switches are connected to a logical router "router".
+#
+# Each switch's VLAN tag and their logical switch ports are:
+#   - ls1:
+#       - tagged with VLAN 101
+#       - ports: lp11
+#   - ls2:
+#       - tagged with VLAN 201
+#       - ports: lp22
+#   - ls-underlay:
+#       - tagged with VLAN 1000
+# Note: a localnet port is created for each switch to connect to
+# physical network.
+
+for i in 1 2; do
+    ls_name=ls$i
+    ovn-nbctl ls-add $ls_name bridged
+    ln_port_name=ln$i
+    if test $i -eq 1; then
+        ovn-nbctl lsp-add $ls_name $ln_port_name "" 101
+    elif test $i -eq 2; then
+        ovn-nbctl lsp-add $ls_name $ln_port_name "" 201
+    fi
+    ovn-nbctl lsp-set-addresses $ln_port_name unknown
+    ovn-nbctl lsp-set-type $ln_port_name localnet
+    ovn-nbctl lsp-set-options $ln_port_name network_name=phys
+done
+
+# lsp_to_ls LSP
+#
+# Prints the name of the logical switch that contains LSP.
+lsp_to_ls () {
+    case $1 in dnl (
+        lp?[[11]]) echo ls1 ;; dnl (
+        lp?[[12]]) echo ls2 ;; dnl (
+        *) AT_FAIL_IF([:]) ;;
+    esac
+}
+
+vif_to_hv () {
+    case $1 in dnl (
+        vif[[1]]?) echo hv1 ;; dnl (
+        vif[[2]]?) echo hv2 ;; dnl (
+        vif-north) echo hv4 ;; dnl (
+        *) AT_FAIL_IF([:]) ;;
+    esac
+}
+
+ip_to_hex() {
+       printf "%02x%02x%02x%02x" "$@"
+}
+
+net_add n1
+for i in 1 2; do
+    sim_add hv$i
+    as hv$i
+    ovs-vsctl add-br br-phys
+    ovs-vsctl set open . external-ids:ovn-bridge-mappings=phys:br-phys
+    ovs-vsctl set open . external-ids:ovn-chassis-mac-mappings="phys:aa:bb:cc:dd:ee:$i$i"
+    ovn_attach n1 br-phys 192.168.0.$i
+
+    ovs-vsctl add-port br-int vif$i$i -- \
+        set Interface vif$i$i external-ids:iface-id=lp$i$i \
+                              options:tx_pcap=hv$i/vif$i$i-tx.pcap \
+                              options:rxq_pcap=hv$i/vif$i$i-rx.pcap \
+                              ofport-request=$i$i
+
+    lsp_name=lp$i$i
+    ls_name=$(lsp_to_ls $lsp_name)
+
+    ovn-nbctl lsp-add $ls_name $lsp_name
+    ovn-nbctl lsp-set-addresses $lsp_name "f0:00:00:00:00:$i$i 192.168.$i.$i"
+    ovn-nbctl lsp-set-port-security $lsp_name f0:00:00:00:00:$i$i
+
+    OVS_WAIT_UNTIL([test x`ovn-nbctl lsp-get-up $lsp_name` = xup])
+
+done
+
+ovn-nbctl ls-add ls-underlay bridged
+ovn-nbctl lsp-add ls-underlay ln3 "" 1000
+ovn-nbctl lsp-set-addresses ln3 unknown
+ovn-nbctl lsp-set-type ln3 localnet
+ovn-nbctl lsp-set-options ln3 network_name=phys
+
+ovn-nbctl ls-add ls-north bridged
+ovn-nbctl lsp-add ls-north ln4 "" 1000
+ovn-nbctl lsp-set-addresses ln4 unknown
+ovn-nbctl lsp-set-type ln4 localnet
+ovn-nbctl lsp-set-options ln4 network_name=phys
+
+# Add a VM on ls-north
+ovn-nbctl lsp-add ls-north lp-north
+ovn-nbctl lsp-set-addresses lp-north "f0:f0:00:00:00:11 172.31.0.10"
+ovn-nbctl lsp-set-port-security lp-north f0:f0:00:00:00:11
+
+# Add 3rd hypervisor
+sim_add hv3
+as hv3 ovs-vsctl add-br br-phys
+as hv3 ovs-vsctl set open . external-ids:ovn-bridge-mappings=phys:br-phys
+as hv3 ovs-vsctl set open . external-ids:ovn-chassis-mac-mappings="phys:aa:bb:cc:dd:ee:33"
+as hv3 ovn_attach n1 br-phys 192.168.0.3
+
+# Add 4th hypervisor
+sim_add hv4
+as hv4 ovs-vsctl add-br br-phys
+as hv4 ovs-vsctl set open . external-ids:ovn-bridge-mappings=phys:br-phys
+as hv4 ovs-vsctl set open . external-ids:ovn-chassis-mac-mappings="phys:aa:bb:cc:dd:ee:44"
+as hv4 ovn_attach n1 br-phys 192.168.0.4
+
+as hv4 ovs-vsctl add-port br-int vif-north -- \
+        set Interface vif-north external-ids:iface-id=lp-north \
+                              options:tx_pcap=hv4/vif-north-tx.pcap \
+                              options:rxq_pcap=hv4/vif-north-rx.pcap \
+                              ofport-request=44
+
+ovn-nbctl lr-add router
+ovn-nbctl lrp-add router router-to-ls1 00:00:01:01:02:03 192.168.1.3/24
+ovn-nbctl lrp-add router router-to-ls2 00:00:01:01:02:05 192.168.2.3/24
+ovn-nbctl lrp-add router router-to-underlay 00:00:01:01:02:07 172.31.0.1/24
+
+ovn-nbctl lsp-add ls1 ls1-to-router -- set Logical_Switch_Port ls1-to-router type=router \
+          options:router-port=router-to-ls1 -- lsp-set-addresses ls1-to-router router
+ovn-nbctl lsp-add ls2 ls2-to-router -- set Logical_Switch_Port ls2-to-router type=router \
+          options:router-port=router-to-ls2 -- lsp-set-addresses ls2-to-router router
+ovn-nbctl lsp-add ls-underlay underlay-to-router -- set Logical_Switch_Port \
+                              underlay-to-router type=router \
+                              options:router-port=router-to-underlay \
+                              -- lsp-set-addresses underlay-to-router router
+
+
+OVN_POPULATE_ARP
+
+# lsp_to_ls LSP
+#
+# Prints the name of the logical switch that contains LSP.
+lsp_to_ls () {
+    case $1 in dnl (
+        lp?[[11]]) echo ls1 ;; dnl (
+        lp?[[12]]) echo ls2 ;; dnl (
+        *) AT_FAIL_IF([:]) ;;
+    esac
+}
+
+vif_to_ls () {
+    case $1 in dnl (
+        vif?[[11]]) echo ls1 ;; dnl (
+        vif?[[12]]) echo ls2 ;; dnl (
+        vif-north) echo ls-north ;; dnl (
+        *) AT_FAIL_IF([:]) ;;
+    esac
+}
+
+hv_to_num () {
+    case $1 in dnl (
+        hv1) echo 1 ;; dnl (
+        hv2) echo 2 ;; dnl (
+        hv3) echo 3 ;; dnl (
+        hv4) echo 4 ;; dnl (
+        *) AT_FAIL_IF([:]) ;;
+    esac
+}
+
+vif_to_num () {
+    case $1 in dnl (
+        vif22) echo 22 ;; dnl (
+        vif21) echo 21 ;; dnl (
+        vif11) echo 11 ;; dnl (
+        *) AT_FAIL_IF([:]) ;;
+    esac
+}
+
+vif_to_hv () {
+    case $1 in dnl (
+        vif[[1]]?) echo hv1 ;; dnl (
+        vif[[2]]?) echo hv2 ;; dnl (
+        vif-north) echo hv4 ;; dnl (
+        *) AT_FAIL_IF([:]) ;;
+    esac
+}
+
+vif_to_lrp () {
+    echo router-to-`vif_to_ls $1`
+}
+
+ip_to_hex() {
+       printf "%02x%02x%02x%02x" "$@"
+}
+
+# Dump a bunch of info helpful for debugging if there's a failure.
+
+echo "------ OVN dump ------"
+ovn-nbctl show
+ovn-sbctl show
+ovn-sbctl list port_binding
+ovn-sbctl list mac_binding
+
+echo "------ hv1 dump ------"
+as hv1 ovs-vsctl show
+as hv1 ovs-vsctl list Open_Vswitch
+
+echo "------ hv2 dump ------"
+as hv2 ovs-vsctl show
+as hv2 ovs-vsctl list Open_Vswitch
+
+echo "------ hv3 dump ------"
+as hv3 ovs-vsctl show
+as hv3 ovs-vsctl list Open_Vswitch
+
+echo "------ hv4 dump ------"
+as hv4 ovs-vsctl show
+as hv4 ovs-vsctl list Open_Vswitch
+
+# test_arp INPORT SHA SPA TPA [REPLY_HA]
+#
+# Causes a packet to be received on INPORT.  The packet is an ARP
+# request with SHA, SPA, and TPA as specified.  If REPLY_HA is provided, then
+# it should be the hardware address of the target to expect to receive in an
+# ARP reply; otherwise no reply is expected.
+#
+# INPORT is a logical switch port number, e.g. 11 for vif11.
+# SHA and REPLY_HA are each 12 hex digits.
+# SPA and TPA are each 8 hex digits.
+test_arp() {
+    local inport=$1 sha=$2 spa=$3 tpa=$4 reply_ha=$5
+    local request=ffffffffffff${sha}08060001080006040001${sha}${spa}ffffffffffff${tpa}
+    hv=`vif_to_hv $inport`
+    as $hv ovs-appctl netdev-dummy/receive $inport $request
+
+    if test X$reply_ha = X; then
+        # Expect to receive the broadcast ARP on the other logical switch ports
+        # if no reply is expected.
+        local i j
+        for i in 1 2 3; do
+            for j in 1 2 3; do
+                if test $i$j != $inport; then
+                    echo $request >> $i$j.expected
+                fi
+            done
+        done
+    else
+        # Expect to receive the reply, if any.
+        local reply=${sha}${reply_ha}08060001080006040002${reply_ha}${tpa}${sha}${spa}
+        local reply_vid=${sha}${reply_ha}810003e808060001080006040002${reply_ha}${tpa}${sha}${spa}
+        echo $reply_vid >> ${inport}_vid.expected
+        echo $reply >> $inport.expected
+    fi
+}
+
+sip=`ip_to_hex 172 31 0 10`
+tip=`ip_to_hex 172 31 0 1`
+
+test_arp vif-north f0f000000011 $sip $tip
+# Confirm that vif-north does not get ARP reply
+AT_CHECK([wc -l hv4/vif-north-tx.pcap | awk '{print $1}'], [0], [[0
+]])
+
+# Set a hypervisor as gateway chassis, for router port 172.31.0.1
+ovn-nbctl lrp-set-gateway-chassis router-to-underlay hv3
+ovn-nbctl --wait=sb sync
+sleep 2
+
+test_arp vif-north f0f000000011 $sip $tip 000001010207
+
+sleep 1
+
+# Confirm that vif-north gets a single ARP reply this time
+OVN_CHECK_PACKETS_REMOVE_BROADCAST([hv4/vif-north-tx.pcap], [vif-north.expected])
+
+# Confirm that only the redirect chassis answered the ARP request.
+OVN_CHECK_PACKETS_REMOVE_BROADCAST([hv3/br-phys_n1-tx.pcap], [vif-north_vid.expected])
+sed -i '/ffffffffffff/d' hv3/br-phys_n1-tx.packets
+AT_CHECK([grep 000001010207 hv3/br-phys_n1-tx.packets | wc -l], [0], [[1
+]])
+
+# Confirm that other OVN chassis did not generate ARP reply.
+$PYTHON "$top_srcdir/utilities/ovs-pcap.in" hv1/br-phys_n1-tx.pcap > hv1/br-phys_n1-tx.packets
+$PYTHON "$top_srcdir/utilities/ovs-pcap.in" hv2/br-phys_n1-tx.pcap > hv2/br-phys_n1-tx.packets
+
+AT_CHECK([grep 000001010207 hv1/br-phys_n1-tx.packets | wc -l], [0], [[0
+]])
+AT_CHECK([grep 000001010207 hv2/br-phys_n1-tx.packets | wc -l], [0], [[0
+]])
+
+echo "----------- Post Traffic hv1 dump -----------"
+as hv1 ovs-ofctl -O OpenFlow13 dump-flows br-int
+as hv1 ovs-appctl fdb/show br-phys
+
+echo "----------- Post Traffic hv2 dump -----------"
+as hv2 ovs-ofctl -O OpenFlow13 dump-flows br-int
+as hv2 ovs-appctl fdb/show br-phys
+
+echo "----------- Post Traffic hv3 dump -----------"
+as hv3 ovs-ofctl -O OpenFlow13 dump-flows br-int
+as hv3 ovs-appctl fdb/show br-phys
+
+echo "----------- Post Traffic hv4 dump -----------"
+as hv4 ovs-ofctl -O OpenFlow13 dump-flows br-int
+as hv4 ovs-appctl fdb/show br-phys
+
+OVN_CLEANUP([hv1],[hv2],[hv3],[hv4])
+
+AT_CLEANUP
--
1.8.3.1
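
A note on the reply_vid expectation in test_arp above: the 810003e8 in the
middle of that string is an 802.1Q tag, i.e. TPID 0x8100 followed by a TCI
carrying VLAN ID 1000 (0x3e8), which matches the tag given to ln3 and ln4.
The following is only a minimal sketch of how that field is formed, assuming
the priority and DEI bits are zero as they are in the test. vlan_tag() is a
hypothetical helper written for illustration and is not an OVS function.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not part of OVS): returns the 32-bit 802.1Q tag,
 * TPID 0x8100 in the upper 16 bits and the TCI in the lower 16 bits.
 * The DEI bit is left at zero. */
static uint32_t
vlan_tag(uint16_t vid, uint8_t pcp)
{
    assert(vid < 4096 && pcp < 8);

    uint32_t tci = ((uint32_t) pcp << 13) | vid;
    return 0x81000000u | tci;
}
```

Printed with "%08x", vlan_tag(1000, 0) gives the same eight hex digits that
the test splices between the Ethernet addresses and the ARP ethertype.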

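For readers following the GARP change in pinctrl.c: the send_garp() hunk
makes a repeating GARP restart its backoff and fire again after
repeat_interval instead of parking at LLONG_MAX. Below is a minimal,
standalone sketch of that scheduling, not the pinctrl code itself. The
struct garp_sched type, the next_announce() function and the MAX_BACKOFF_S
cap are assumptions made for illustration; the hunk does not show the
condition that ends a burst, so the cap of 16 seconds is a guess.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Assumed cap on the backoff, in seconds. The patch hunk does not show
 * the real condition, so treat this value as hypothetical. */
#define MAX_BACKOFF_S 16

/* Hypothetical, trimmed-down stand-in for struct garp_data. */
struct garp_sched {
    int backoff;                   /* Backoff for the next GARP, in s. */
    long long announce_time;       /* Next announcement time, in ms. */
    bool is_repeat;                /* Start a new burst when one ends? */
    long long repeat_interval;     /* Gap between bursts, in ms. */
};

/* Computes the time of the next announcement after one was sent at 'now'.
 * Within a burst the backoff doubles. Once the cap is reached, a repeating
 * entry resets its backoff and waits repeat_interval, and a one-shot entry
 * is parked at LLONG_MAX. */
static long long
next_announce(struct garp_sched *g, long long now)
{
    assert(g->backoff > 0);

    if (g->backoff < MAX_BACKOFF_S) {
        g->backoff *= 2;
        g->announce_time = now + g->backoff * 1000LL;
    } else if (g->is_repeat) {
        g->backoff = 1;
        g->announce_time = now + g->repeat_interval;
    } else {
        g->announce_time = LLONG_MAX;
    }
    return g->announce_time;
}
```

With the 3 minute GARP_DEF_REPEAT_INTERVAL_MS from the patch as
repeat_interval, a chassisredirect entry under these assumptions sends a
short burst with doubling gaps, stays quiet for 3 minutes, and then starts
the next burst. VIF and NAT entries (is_repeat false) keep the old one-shot
behaviour.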
Patch

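The diff below adds ovn_sbrec_get_port_binding_ip_mac(), which expects the
first mac entry of a port binding to look like "MAC_ADDRESS IP/MASK". As a
rough, standalone illustration of that format (not the patch's
implementation, which uses str_to_mac() and ip_parse_masked()), the sketch
below parses such a string with sscanf(). parse_mac_ip() is a hypothetical
name, it accepts only the prefix-length form of the mask, and it returns the
address and mask in host byte order, unlike the ovs_be32 values in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical illustration only: parses "xx:xx:xx:xx:xx:xx a.b.c.d/plen".
 * On success stores the MAC in 'mac' and the IPv4 address and netmask in
 * host byte order, then returns true. Returns false on any mismatch. */
static bool
parse_mac_ip(const char *s, uint8_t mac[6], uint32_t *ip, uint32_t *mask)
{
    unsigned int m[6], a[4], plen;
    int end = 0;

    assert(mac && ip && mask);

    if (!s || sscanf(s, "%2x:%2x:%2x:%2x:%2x:%2x %3u.%3u.%3u.%3u/%2u%n",
                     &m[0], &m[1], &m[2], &m[3], &m[4], &m[5],
                     &a[0], &a[1], &a[2], &a[3], &plen, &end) != 11
        || s[end] != '\0') {
        return false;
    }
    if (a[0] > 255 || a[1] > 255 || a[2] > 255 || a[3] > 255 || plen > 32) {
        return false;
    }

    for (int i = 0; i < 6; i++) {
        mac[i] = (uint8_t) m[i];
    }
    *ip = ((uint32_t) a[0] << 24) | ((uint32_t) a[1] << 16)
          | ((uint32_t) a[2] << 8) | (uint32_t) a[3];
    /* A shift by 32 is undefined in C, so /0 is handled separately. */
    *mask = plen ? 0xffffffffu << (32 - plen) : 0;
    return true;
}
```

For example, the router-to-underlay port in the tests would be described by
the string "00:00:01:01:02:07 172.31.0.1/24". A binding that carries only a
MAC address makes the sketch return false, which is the case the patch logs
as "does not have proper ip,mac values".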
diff --git a/ovn/controller/physical.c b/ovn/controller/physical.c
index af587a5..1ab5968 100644
--- a/ovn/controller/physical.c
+++ b/ovn/controller/physical.c
@@ -21,6 +21,7 @@ 
 #include "lflow.h"
 #include "lport.h"
 #include "chassis.h"
+#include "pinctrl.h"
 #include "lib/bundle.h"
 #include "openvswitch/poll-loop.h"
 #include "lib/uuid.h"
@@ -238,9 +239,12 @@  get_zone_ids(const struct sbrec_port_binding *binding,
 }
 
 static void
-put_replace_router_port_mac_flows(const struct
+put_replace_router_port_mac_flows(struct ovsdb_idl_index
+                                  *sbrec_port_binding_by_name,
+                                  const struct
                                   sbrec_port_binding *localnet_port,
                                   const struct sbrec_chassis *chassis,
+                                  const struct sset *active_tunnels,
                                   const struct hmap *local_datapaths,
                                   struct ofpbuf *ofpacts_p,
                                   ofp_port_t ofport,
@@ -281,8 +285,21 @@  put_replace_router_port_mac_flows(const struct
         char *err_str = NULL;
         struct match match;
         struct ofpact_mac *replace_mac;
+        char *cr_peer_name = xasprintf("cr-%s", rport_binding->logical_port);
 
-        /* Table 65, priority 150.
+
+        if (pinctrl_is_chassis_resident(sbrec_port_binding_by_name,
+                                        chassis, active_tunnels,
+                                        cr_peer_name)) {
+            /* If a router port's chassisredirect port is
+             * resident on this chassis, then we need not do mac replace. */
+            free(cr_peer_name);
+            continue;
+        }
+
+        free(cr_peer_name);
+
+        /* Table 65, priority 150.
          * =======================
          *
          * Implements output to localnet port.
@@ -797,7 +814,8 @@  consider_port_binding(struct ovsdb_idl_index *sbrec_port_binding_by_name,
                         &match, ofpacts_p, &binding->header_.uuid);
 
         if (!strcmp(binding->type, "localnet")) {
-            put_replace_router_port_mac_flows(binding, chassis,
+            put_replace_router_port_mac_flows(sbrec_port_binding_by_name,
+                                              binding, chassis, active_tunnels,
                                               local_datapaths, ofpacts_p,
                                               ofport, flow_table);
         }
diff --git a/ovn/controller/pinctrl.c b/ovn/controller/pinctrl.c
index b7bb4c9..a145867 100644
--- a/ovn/controller/pinctrl.c
+++ b/ovn/controller/pinctrl.c
@@ -226,6 +226,8 @@  static bool may_inject_pkts(void);
 COVERAGE_DEFINE(pinctrl_drop_put_mac_binding);
 COVERAGE_DEFINE(pinctrl_drop_buffered_packets_map);
 
+#define GARP_DEF_REPEAT_INTERVAL_MS   (3 * 60 * 1000) /* 3 minutes */
+
 void
 pinctrl_init(void)
 {
@@ -242,6 +244,25 @@  pinctrl_init(void)
                                                 &pinctrl);
 }
 
+bool
+pinctrl_is_chassis_resident(struct ovsdb_idl_index *sbrec_port_binding_by_name,
+                            const struct sbrec_chassis *chassis,
+                            const struct sset *active_tunnels,
+                            const char *port_name)
+{
+    const struct sbrec_port_binding *pb
+        = lport_lookup_by_name(sbrec_port_binding_by_name, port_name);
+    if (!pb || !pb->chassis) {
+        return false;
+    }
+    if (strcmp(pb->type, "chassisredirect")) {
+        return pb->chassis == chassis;
+    } else {
+        return ha_chassis_group_is_active(pb->ha_chassis_group,
+                                          active_tunnels, chassis);
+    }
+}
+
 static ovs_be32
 queue_msg(struct rconn *swconn, struct ofpbuf *msg)
 {
@@ -2548,6 +2569,8 @@  struct garp_data {
     int backoff;                 /* Backoff for the next announcement. */
     uint32_t dp_key;             /* Datapath used to output this GARP. */
     uint32_t port_key;           /* Port to inject the GARP into. */
+    bool is_repeat;              /* Send GARPs continuously. */
+    long long int repeat_interval; /* Interval between GARP bursts in ms */
 };
 
 /* Contains GARPs to be sent. Protected by pinctrl_mutex*/
@@ -2568,7 +2591,8 @@  destroy_send_garps(void)
 /* Runs with in the main ovn-controller thread context. */
 static void
 add_garp(const char *name, const struct eth_addr ea, ovs_be32 ip,
-         uint32_t dp_key, uint32_t port_key)
+         uint32_t dp_key, uint32_t port_key, bool is_repeat,
+         long long int repeat_interval)
 {
     struct garp_data *garp = xmalloc(sizeof *garp);
     garp->ea = ea;
@@ -2577,6 +2601,8 @@  add_garp(const char *name, const struct eth_addr ea, ovs_be32 ip,
     garp->backoff = 1;
     garp->dp_key = dp_key;
     garp->port_key = port_key;
+    garp->is_repeat = is_repeat;
+    garp->repeat_interval = repeat_interval;
     shash_add(&send_garp_data, name, garp);
 
     /* Notify pinctrl_handler so that it can wakeup and process
@@ -2586,7 +2612,8 @@  add_garp(const char *name, const struct eth_addr ea, ovs_be32 ip,
 
 /* Add or update a vif for which GARPs need to be announced. */
 static void
-send_garp_update(const struct sbrec_port_binding *binding_rec,
+send_garp_update(struct ovsdb_idl_index *sbrec_port_binding_by_name,
+                 const struct sbrec_port_binding *binding_rec,
                  struct shash *nat_addresses)
 {
     volatile struct garp_data *garp = NULL;
@@ -2611,7 +2638,7 @@  send_garp_update(const struct sbrec_port_binding *binding_rec,
                     add_garp(name, laddrs->ea,
                              laddrs->ipv4_addrs[i].addr,
                              binding_rec->datapath->tunnel_key,
-                             binding_rec->tunnel_key);
+                             binding_rec->tunnel_key, false, 0);
                 }
                 free(name);
             }
@@ -2621,6 +2648,64 @@  send_garp_update(const struct sbrec_port_binding *binding_rec,
         return;
     }
 
+    /* Update GARPs for local chassisredirect port, if the peer
+     * layer 2 switch is of type vlan.
+     */
+    if (!strcmp(binding_rec->type, "chassisredirect")) {
+        struct eth_addr mac;
+        ovs_be32 ip, mask;
+        uint32_t dp_key = 0;
+        uint32_t port_key = 0;
+        const struct sbrec_port_binding *peer_port = NULL;
+        const struct sbrec_port_binding *distributed_port = NULL;
+
+        if (!ovn_sbrec_get_port_binding_ip_mac(binding_rec, &mac,
+                                               &ip, &mask)) {
+            /* Router Port binding without ip and mac configured. */
+            static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 1);
+            VLOG_WARN_RL(&rl, "cannot send garp, router port binding: %s, "
+                         "does not have proper ip,mac values: %s",
+                         binding_rec->logical_port, *binding_rec->mac);
+            return;
+        }
+
+        const char *lrp_name = smap_get(&binding_rec->options,
+                                        "distributed-port");
+        ovs_assert(lrp_name);
+
+        distributed_port = lport_lookup_by_name(sbrec_port_binding_by_name,
+                                                lrp_name);
+        ovs_assert(distributed_port);
+
+        const char *peer_name = smap_get(&distributed_port->options, "peer");
+        ovs_assert(peer_name);
+
+        peer_port = lport_lookup_by_name(sbrec_port_binding_by_name,
+                                         peer_name);
+        ovs_assert(peer_port);
+
+        const char *network_type = smap_get(&peer_port->datapath->external_ids,
+                                            "network-type");
+
+        /* Advertise GARP only if the logical switch is of type bridged. */
+        if (!network_type || strcmp(network_type, "bridged")) {
+            return;
+        }
+
+        dp_key = peer_port->datapath->tunnel_key;
+        port_key = peer_port->tunnel_key;
+
+        garp = shash_find_data(&send_garp_data, binding_rec->logical_port);
+        if (garp) {
+            garp->dp_key = dp_key;
+            garp->port_key = port_key;
+        } else {
+            add_garp(binding_rec->logical_port, mac, ip,
+                     dp_key, port_key, true, GARP_DEF_REPEAT_INTERVAL_MS);
+        }
+        return;
+    }
+
     /* Update GARP for vif if it exists. */
     garp = shash_find_data(&send_garp_data, binding_rec->logical_port);
     if (garp) {
@@ -2640,7 +2725,8 @@  send_garp_update(const struct sbrec_port_binding *binding_rec,
 
         add_garp(binding_rec->logical_port,
                  laddrs.ea, laddrs.ipv4_addrs[0].addr,
-                 binding_rec->datapath->tunnel_key, binding_rec->tunnel_key);
+                 binding_rec->datapath->tunnel_key, binding_rec->tunnel_key,
+                 false, 0);
 
         destroy_lport_addresses(&laddrs);
         break;
@@ -2702,7 +2788,12 @@  send_garp(struct rconn *swconn, struct garp_data *garp,
         garp->backoff *= 2;
         garp->announce_time = current_time + garp->backoff * 1000;
     } else {
-        garp->announce_time = LLONG_MAX;
+        if (garp->is_repeat) {
+            garp->backoff = 1;
+            garp->announce_time = current_time + garp->repeat_interval;
+        } else {
+            garp->announce_time = LLONG_MAX;
+        }
     }
     return garp->announce_time;
 }
@@ -2786,25 +2877,6 @@  get_localnet_vifs_l3gwports(
     sbrec_port_binding_index_destroy_row(target);
 }
 
-static bool
-pinctrl_is_chassis_resident(struct ovsdb_idl_index *sbrec_port_binding_by_name,
-                            const struct sbrec_chassis *chassis,
-                            const struct sset *active_tunnels,
-                            const char *port_name)
-{
-    const struct sbrec_port_binding *pb
-        = lport_lookup_by_name(sbrec_port_binding_by_name, port_name);
-    if (!pb || !pb->chassis) {
-        return false;
-    }
-    if (strcmp(pb->type, "chassisredirect")) {
-        return pb->chassis == chassis;
-    } else {
-        return ha_chassis_group_is_active(pb->ha_chassis_group,
-                                          active_tunnels, chassis);
-    }
-}
-
 /* Extracts the mac, IPv4 and IPv6 addresses, and logical port from
  * 'addresses' which should be of the format 'MAC [IP1 IP2 ..]
  * [is_chassis_resident("LPORT_NAME")]', where IPn should be a valid IPv4
@@ -2946,6 +3018,67 @@  get_nat_addresses_and_keys(struct ovsdb_idl_index *sbrec_port_binding_by_name,
 }
 
 static void
+get_local_cr_ports(struct ovsdb_idl_index *sbrec_port_binding_by_name,
+                   struct sset *local_cr_ports,
+                   struct sset *local_l3gw_ports,
+                   const struct sbrec_chassis *chassis,
+                   const struct sset *active_tunnels)
+{
+    const char *gw_port;
+    SSET_FOR_EACH (gw_port, local_l3gw_ports) {
+        const struct sbrec_port_binding *binding_rec;
+
+        binding_rec = lport_lookup_by_name(sbrec_port_binding_by_name,
+                                           gw_port);
+        if (!binding_rec) {
+            continue;
+        }
+
+        /* For a patch port, we send GARPs for the peer's ip and mac. */
+        if (!strcmp(binding_rec->type, "patch")) {
+            const struct sbrec_port_binding *cr_port = NULL;
+
+            bool is_cr_resident;
+            struct eth_addr mac;
+            ovs_be32 ip, mask;
+
+            const char *peer_name = smap_get(&binding_rec->options, "peer");
+            ovs_assert(peer_name);
+
+            char *cr_peer_name = xasprintf("cr-%s", peer_name);
+            cr_port = lport_lookup_by_name(sbrec_port_binding_by_name,
+                                           cr_peer_name);
+            free(cr_peer_name);
+
+            if (!cr_port) {
+                continue;
+            }
+
+            is_cr_resident = pinctrl_is_chassis_resident
+                                (sbrec_port_binding_by_name,
+                                 chassis,
+                                 active_tunnels,
+                                 cr_port->logical_port);
+            if (!is_cr_resident) {
+                continue;
+            }
+
+            if (!ovn_sbrec_get_port_binding_ip_mac(cr_port, &mac, &ip,
+                                                   &mask)) {
+                /* Router Port binding without ip and mac configured. */
+                static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 1);
+                VLOG_WARN_RL(&rl, "cannot send garp, router port binding: %s, "
+                             "does not have proper ip,mac values: %s",
+                              cr_port->logical_port, *cr_port->mac);
+                continue;
+            }
+
+            sset_add(local_cr_ports, cr_port->logical_port);
+        }
+    }
+}
+
+static void
 send_garp_wait(long long int send_garp_time)
 {
     /* Set the poll timer for next garp only if there is garp data to
@@ -2990,6 +3123,8 @@  send_garp_prepare(struct ovsdb_idl_index *sbrec_port_binding_by_datapath,
 {
     struct sset localnet_vifs = SSET_INITIALIZER(&localnet_vifs);
     struct sset local_l3gw_ports = SSET_INITIALIZER(&local_l3gw_ports);
+    struct sset local_cr_ports = SSET_INITIALIZER(&local_cr_ports);
+
     struct sset nat_ip_keys = SSET_INITIALIZER(&nat_ip_keys);
     struct shash nat_addresses;
 
@@ -3004,11 +3139,17 @@  send_garp_prepare(struct ovsdb_idl_index *sbrec_port_binding_by_datapath,
                                &nat_ip_keys, &local_l3gw_ports,
                                chassis, active_tunnels,
                                &nat_addresses);
+
+    get_local_cr_ports(sbrec_port_binding_by_name,
+                       &local_cr_ports, &local_l3gw_ports,
+                       chassis, active_tunnels);
+
     /* For deleted ports and deleted nat ips, remove from send_garp_data. */
     struct shash_node *iter, *next;
     SHASH_FOR_EACH_SAFE (iter, next, &send_garp_data) {
         if (!sset_contains(&localnet_vifs, iter->name) &&
-            !sset_contains(&nat_ip_keys, iter->name)) {
+            !sset_contains(&nat_ip_keys, iter->name) &&
+            !sset_contains(&local_cr_ports, iter->name)) {
             send_garp_delete(iter->name);
         }
     }
@@ -3019,7 +3160,7 @@  send_garp_prepare(struct ovsdb_idl_index *sbrec_port_binding_by_datapath,
         const struct sbrec_port_binding *pb = lport_lookup_by_name(
             sbrec_port_binding_by_name, iface_id);
         if (pb) {
-            send_garp_update(pb, &nat_addresses);
+            send_garp_update(sbrec_port_binding_by_name, pb, &nat_addresses);
         }
     }
 
@@ -3029,7 +3170,17 @@  send_garp_prepare(struct ovsdb_idl_index *sbrec_port_binding_by_datapath,
         const struct sbrec_port_binding *pb
             = lport_lookup_by_name(sbrec_port_binding_by_name, gw_port);
         if (pb) {
-            send_garp_update(pb, &nat_addresses);
+            send_garp_update(sbrec_port_binding_by_name, pb, &nat_addresses);
+        }
+    }
+
+    /* Update send_garp_data for chassisredirect router ports. */
+    const char *cr_port;
+    SSET_FOR_EACH (cr_port, &local_cr_ports) {
+        const struct sbrec_port_binding *pb
+            = lport_lookup_by_name(sbrec_port_binding_by_name, cr_port);
+        if (pb) {
+            send_garp_update(sbrec_port_binding_by_name, pb, &nat_addresses);
         }
     }
 
diff --git a/ovn/controller/pinctrl.h b/ovn/controller/pinctrl.h
index f61d705..92f704e 100644
--- a/ovn/controller/pinctrl.h
+++ b/ovn/controller/pinctrl.h
@@ -44,4 +44,10 @@  void pinctrl_run(struct ovsdb_idl_txn *ovnsb_idl_txn,
 void pinctrl_wait(struct ovsdb_idl_txn *ovnsb_idl_txn);
 void pinctrl_destroy(void);
 
+bool
+pinctrl_is_chassis_resident(struct ovsdb_idl_index *sbrec_port_binding_by_name,
+                            const struct sbrec_chassis *chassis,
+                            const struct sset *active_tunnels,
+                            const char *port_name);
+
 #endif /* ovn/pinctrl.h */
diff --git a/ovn/lib/ovn-util.c b/ovn/lib/ovn-util.c
index 0f07d80..3d0ad8e 100644
--- a/ovn/lib/ovn-util.c
+++ b/ovn/lib/ovn-util.c
@@ -16,6 +16,7 @@ 
 #include "ovn-util.h"
 #include "dirs.h"
 #include "openvswitch/vlog.h"
+#include "openvswitch/ofp-parse.h"
 #include "ovn/lib/ovn-nb-idl.h"
 #include "ovn/lib/ovn-sb-idl.h"
 
@@ -371,3 +372,33 @@  ovn_logical_flow_hash(const struct uuid *logical_datapath,
     hash = hash_string(match, hash);
     return hash_string(actions, hash);
 }
+
+/* Extracts the MAC, IP and mask from the first address entry of a
+ * sbrec_port_binding ('binding->mac[0]').
+ *
+ * Expects the following format:
+ *  "MAC_ADDRESS IP/MASK"
+ *
+ * Returns true if MAC, IP and MASK are found, false otherwise. */
+bool
+ovn_sbrec_get_port_binding_ip_mac(const struct sbrec_port_binding *binding,
+                                  struct eth_addr *mac,
+                                  ovs_be32 *ip, ovs_be32 *mask)
+{
+    char *err_str = NULL;
+
+    err_str = str_to_mac(binding->mac[0], mac);
+    if (err_str) {
+        free(err_str);
+        return false;
+    }
+
+    err_str = ip_parse_masked(binding->mac[0] + ETH_ADDR_STRLEN + 1,
+                              ip, mask);
+    if (err_str) {
+        free(err_str);
+        return false;
+    }
+
+    return true;
+}
diff --git a/ovn/lib/ovn-util.h b/ovn/lib/ovn-util.h
index 6d5e1df..c01595a 100644
--- a/ovn/lib/ovn-util.h
+++ b/ovn/lib/ovn-util.h
@@ -19,6 +19,7 @@ 
 #include "lib/packets.h"
 
 struct nbrec_logical_router_port;
+struct sbrec_port_binding;
 struct sbrec_logical_flow;
 struct uuid;
 
@@ -81,4 +82,9 @@  uint32_t ovn_logical_flow_hash(const struct uuid *logical_datapath,
                                uint16_t priority,
                                const char *match, const char *actions);
 
+bool
+ovn_sbrec_get_port_binding_ip_mac(const struct sbrec_port_binding *binding,
+                                  struct eth_addr *mac, ovs_be32 *ip,
+                                  ovs_be32 *mask);
+
 #endif
diff --git a/ovn/northd/ovn-northd.c b/ovn/northd/ovn-northd.c
index 74d3692..6835910 100644
--- a/ovn/northd/ovn-northd.c
+++ b/ovn/northd/ovn-northd.c
@@ -5914,6 +5914,20 @@  build_lrouter_flows(struct hmap *datapaths, struct hmap *ports,
                     ds_put_format(&match, " && is_chassis_resident(%s)",
                                   op->od->l3redirect_port->json_key);
                 }
+            } else if (op->peer &&
+                       op->peer->od->network_type == DP_NETWORK_BRIDGED) {
+                /* For a router port connected to a bridged logical switch,
+                 * always add the is_chassis_resident check.  A VM or
+                 * server may be on the VLAN network but not on an OVN
+                 * chassis, and it could end up ARPing for the router
+                 * port IP.
+                 *
+                 * This check assumes that, for VMs on OVN chassis, the
+                 * logical switch ARP responder will respond to ARP
+                 * requests for the router port IP.
+                 */
+                ds_put_format(&match, " && is_chassis_resident(\"cr-%s\")",
+                              op->key);
             }
 
             ds_clear(&actions);
@@ -7365,18 +7379,23 @@  build_lrouter_flows(struct hmap *datapaths, struct hmap *ports,
             ovn_lflow_add(lflows, od, S_ROUTER_IN_GW_REDIRECT, 300,
                           REGBIT_DISTRIBUTED_NAT" == 1", "next;");
 
-            /* For traffic with outport == l3dgw_port, if the
-             * packet did not match any higher priority redirect
-             * rule, then the traffic is redirected to the central
-             * instance of the l3dgw_port. */
-            ds_clear(&match);
-            ds_put_format(&match, "outport == %s",
-                          od->l3dgw_port->json_key);
-            ds_clear(&actions);
-            ds_put_format(&actions, "outport = %s; next;",
-                          od->l3redirect_port->json_key);
-            ovn_lflow_add(lflows, od, S_ROUTER_IN_GW_REDIRECT, 50,
-                          ds_cstr(&match), ds_cstr(&actions));
+            /* For VLAN backed networks, default match will not redirect to
+             * chassis redirect port. */
+            if (od->l3dgw_port->peer &&
+                od->l3dgw_port->peer->od->network_type == DP_NETWORK_OVERLAY) {
+                /* For traffic with outport == l3dgw_port, if the
+                 * packet did not match any higher priority redirect
+                 * rule, then the traffic is redirected to the central
+                 * instance of the l3dgw_port. */
+                ds_clear(&match);
+                ds_put_format(&match, "outport == %s",
+                              od->l3dgw_port->json_key);
+                ds_clear(&actions);
+                ds_put_format(&actions, "outport = %s; next;",
+                              od->l3redirect_port->json_key);
+                ovn_lflow_add(lflows, od, S_ROUTER_IN_GW_REDIRECT, 50,
+                              ds_cstr(&match), ds_cstr(&actions));
+            }
 
             /* If the Ethernet destination has not been resolved,
              * redirect to the central instance of the l3dgw_port.
diff --git a/ovn/ovn-architecture.7.xml b/ovn/ovn-architecture.7.xml
index 6275db1..6df711e 100644
--- a/ovn/ovn-architecture.7.xml
+++ b/ovn/ovn-architecture.7.xml
@@ -1441,7 +1441,7 @@ 
     </li>
   </ol>
 
-  <h3>External traffic</h3>
+  <h3>External traffic (NAT)</h3>
 
   <p>
     The following happens when a VM sends an external traffic (which requires
@@ -1607,6 +1607,91 @@ 
     </li>
   </ol>
 
+  <h3>External traffic (NO NAT)</h3>
+  <p>
+    The following happens when a VM sends external traffic (i.e., to a
+    network not connected to the logical router) that does not require NAT.
+  </p>
+
+  <p>
+    Since no NAT is required, the packet need not be redirected to a
+    gateway chassis, so this packet flow is the same as East-West. To ensure
+    that OVN does not redirect the packet over a tunnel to the gateway
+    chassis, the "network_type" of the destination localnet logical switch
+    should be set to "bridged". A "bridged" logical switch ensures that no
+    tunnel encapsulation is done while forwarding the packet on it.
+    Please refer to <code>ovn-nb</code>(5) for more details.
+  </p>
+
+  <ol>
+    <li>
+      It first enters the ingress pipeline, and then egress pipeline of the
+      source localnet logical switch datapath. It then enters the ingress
+      pipeline of the logical router datapath via the logical router port in
+      the source chassis.
+    </li>
+
+    <li>
+      A routing decision is taken. Since the destination network is NOT
+      directly connected to the logical router, a static route is expected,
+      which will provide the next hop IP.
+    </li>
+
+    <li>
+      From the router datapath, the packet enters the ingress pipeline and
+      then the egress pipeline of the destination localnet logical switch
+      datapath (it is of type "bridged" and this is where the next hop is
+      present) and goes out of the integration bridge to the provider bridge
+      (belonging to the destination logical switch) via the localnet port.
+      Same as East-West, the source MAC will be replaced with the chassis MAC.
+    </li>
+  </ol>
+
+  <p>
+    The following happens for the reverse external traffic.
+  </p>
+
+  <ol>
+    <li>
+      The gateway chassis receives the packet from the localnet port of
+      the logical switch (bridged type) which provides external connectivity.
+      The packet then enters the ingress pipeline and then egress pipeline of
+      the localnet logical switch (which provides external connectivity).
+      The packet then enters the ingress pipeline of the logical router
+      datapath.
+    </li>
+
+    <li>
+      A routing decision is taken and the logical switch of the destination
+      VM is identified.
+    </li>
+
+    <li>
+      The packet then enters the ingress pipeline and then the egress
+      pipeline of the VM's localnet logical switch. Since the destination VM
+      doesn't reside on the gateway chassis, the packet is sent out via the
+      localnet port of the VM's logical switch. The source MAC of this packet
+      will be replaced with the chassis unique MAC.
+    </li>
+
+    <li>
+      VM's chassis receives the packet via the localnet port and
+      sends it to the integration bridge. The packet enters the
+      ingress pipeline and then egress pipeline of the localnet
+      logical switch and finally gets delivered to the VM port.
+    </li>
+  </ol>
+
+  <p>
+    Note that while VM to external traffic did not require redirection to
+    the gateway chassis, the reverse traffic goes through the gateway chassis
+    only. This is because, for an external router, the OVN logical router
+    port IP is the next hop to reach the endpoints behind it. As a result,
+    we need a centralized chassis which will respond to ARP requests coming
+    from the external network. This centralized chassis is the gateway
+    chassis attached to the corresponding router port.
+  </p>
+
   <h2>Life Cycle of a VTEP gateway</h2>
 
   <p>
diff --git a/tests/ovn.at b/tests/ovn.at
index e5108a7..8a03393 100644
--- a/tests/ovn.at
+++ b/tests/ovn.at
@@ -29,6 +29,12 @@  m4_define([OVN_CHECK_PACKETS],
   [ovn_check_packets__ "$1" "$2"
    AT_CHECK([sort $rcv_text], [0], [expout])])
 
+m4_define([OVN_CHECK_PACKETS_REMOVE_BROADCAST],
+  [ovn_check_packets__ "$1" "$2"
+   echo "received_text=$rcv_text"
+   sed -i '/ffffffffffff/d' $rcv_text
+   AT_CHECK([sort $rcv_text], [0], [expout])])
+
 AT_BANNER([OVN components])
 
 AT_SETUP([ovn -- lexer])
@@ -14018,7 +14024,7 @@  ovn-hv4-0
 OVN_CLEANUP([hv1], [hv2], [hv3])
 AT_CLEANUP
 
-AT_SETUP([ovn -- 2 HVs, 2 lports/HV, localnet ports, DVR chassis mac])
+AT_SETUP([ovn -- 2 HVs, 2 lports/HV, localnet ports, DVR E-W chassis mac])
 ovn_start
 
 
@@ -14028,6 +14034,8 @@  ovn_start
 # of VIF port name indicates the hypervisor it is bound to, e.g.
 # lp23 means VIF 3 on hv2.
 #
+# Both the switches are connected to a logical router "router".
+#
 # Each switch's VLAN tag and their logical switch ports are:
 #   - ls1:
 #       - tagged with VLAN 101
@@ -14185,6 +14193,7 @@  test_ip() {
 echo "------ OVN dump ------"
 ovn-nbctl show
 ovn-sbctl show
+ovn-sbctl list port_binding
 
 echo "------ hv1 dump ------"
 as hv1 ovs-vsctl show
@@ -14211,6 +14220,727 @@  as hv2 ovs-appctl fdb/show br-phys
 
 OVN_CHECK_PACKETS([hv2/vif22-tx.pcap], [vif22.expected])
 
+
+# Associate a chassis as gateway chassis and validate garp.
+
+OVN_CLEANUP([hv1],[hv2])
+
+AT_CLEANUP
+
+
+AT_SETUP([ovn -- 2 HVs, 2 lports/HV, localnet ports, DVR N-S GARP])
+ovn_start
+
+
+# In this test case we create 2 switches, all connected to the same
+# physical network (through br-phys on each HV). Each switch has
+# 1 VIF. Each HV has 1 VIF port. The first digit
+# of VIF port name indicates the hypervisor it is bound to, e.g.
+# lp23 means VIF 3 on hv2.
+#
+# Both the switches are connected to a logical router "router".
+#
+# Additionally, we create a logical switch (ls-underlay) for N-S traffic.
+#
+# Each switch's VLAN tag and their logical switch ports are:
+#   - ls1:
+#       - tagged with VLAN 101
+#       - ports: lp11
+#   - ls2:
+#       - tagged with VLAN 201
+#       - ports: lp22
+#   - ls-underlay:
+#       - tagged with VLAN 1000
+#
+# Note: a localnet port is created for each switch to connect to
+# physical network.
+# lsp_to_ls LSP
+#
+# Prints the name of the logical switch that contains LSP.
+
+net_add n1
+for i in 1 2; do
+    sim_add hv$i
+    as hv$i
+    ovs-vsctl add-br br-phys
+    ovs-vsctl set open . external-ids:ovn-bridge-mappings=phys:br-phys
+    ovs-vsctl set open . external-ids:ovn-chassis-mac-mappings="phys:aa:bb:cc:dd:ee:$i$i"
+    ovs-vsctl set open . external-ids:system-id="HV$i"
+    ovn_attach n1 br-phys 192.168.0.$i
+    ovs-vsctl set-controller br-int ptcp:
+    AT_CHECK([ovs-vsctl add-port br-phys snoopvif -- set Interface snoopvif options:tx_pcap=hv$i/snoopvif-tx.pcap options:rxq_pcap=hv$i/snoopvif-rx.pcap])
+done
+
+ovn-nbctl ls-add ls-underlay bridged
+ovn-nbctl lsp-add ls-underlay ln3 "" 1000
+ovn-nbctl lsp-set-addresses ln3 unknown
+ovn-nbctl lsp-set-type ln3 localnet
+ovn-nbctl lsp-set-options ln3 network_name=phys
+
+ovn-nbctl lr-add router
+ovn-nbctl lrp-add router router-to-underlay 00:00:01:01:02:07 172.31.0.1/24
+
+ovn-nbctl lsp-add ls-underlay underlay-to-router -- set Logical_Switch_Port \
+                              underlay-to-router type=router \
+                              options:router-port=router-to-underlay \
+                              -- lsp-set-addresses underlay-to-router router
+
+ovn-nbctl --wait=sb sync
+
+# Associate hv2 as gateway chassis
+ovn-nbctl lrp-set-gateway-chassis router-to-underlay hv2
+
+ovn-nbctl show
+ovn-sbctl show
+
+# Dump a bunch of info helpful for debugging if there's a failure.
+
+echo "------ OVN dump ------"
+ovn-nbctl show
+ovn-sbctl show
+
+echo "------ hv1 dump ------"
+as hv1 ovs-vsctl show
+as hv1 ovs-vsctl list Open_Vswitch
+
+echo "------ hv2 dump ------"
+as hv2 ovs-vsctl show
+as hv2 ovs-vsctl list Open_Vswitch
+
+sleep 1
+
+echo "----------- Post Traffic hv1 dump -----------"
+as hv1 ovs-ofctl -O OpenFlow13 dump-flows br-int
+as hv1 ovs-appctl fdb/show br-phys
+
+echo "----------- Post Traffic hv2 dump -----------"
+as hv2 ovs-ofctl -O OpenFlow13 dump-flows br-int
+as hv2 ovs-appctl fdb/show br-phys
+
+AT_CHECK([as hv2 ovs-appctl fdb/show br-phys | grep 00:00:01:01:02:07 | grep 1000 | wc -l], [0], [[1
+]])
+
+echo "ffffffffffff000001010207810003e808060001080006040001000001010207ac1f0001000000000000ac1f0001" > expected
+OVN_CHECK_PACKETS([hv2/snoopvif-tx.pcap], [expected])
+
 OVN_CLEANUP([hv1],[hv2])
 
 AT_CLEANUP
+
+
+AT_SETUP([ovn -- 2 HVs, 2 lports/HV, localnet ports, DVR N-S Ping])
+ovn_start
+
+# In this test case we create 3 switches, all connected to the same
+# physical network (through br-phys on each HV). LS1 and LS2 have
+# 1 VIF each. Each HV has 1 VIF port. The first digit
+# of VIF port name indicates the hypervisor it is bound to, e.g.
+# lp23 means VIF 3 on hv2.
+#
+# All the switches are connected to a logical router "router".
+#
+# Each switch's VLAN tag and their logical switch ports are:
+#   - ls1:
+#       - tagged with VLAN 101
+#       - ports: lp11
+#   - ls2:
+#       - tagged with VLAN 201
+#       - ports: lp22
+#   - ls-underlay:
+#       - tagged with VLAN 1000
+# Note: a localnet port is created for each switch to connect to
+# physical network.
+
+for i in 1 2; do
+    ls_name=ls$i
+    ovn-nbctl ls-add $ls_name bridged
+    ln_port_name=ln$i
+    if test $i -eq 1; then
+        ovn-nbctl lsp-add $ls_name $ln_port_name "" 101
+    elif test $i -eq 2; then
+        ovn-nbctl lsp-add $ls_name $ln_port_name "" 201
+    fi
+    ovn-nbctl lsp-set-addresses $ln_port_name unknown
+    ovn-nbctl lsp-set-type $ln_port_name localnet
+    ovn-nbctl lsp-set-options $ln_port_name network_name=phys
+done
+
+# lsp_to_ls LSP
+#
+# Prints the name of the logical switch that contains LSP.
+lsp_to_ls () {
+    case $1 in dnl (
+        lp?[[11]]) echo ls1 ;; dnl (
+        lp?[[12]]) echo ls2 ;; dnl (
+        *) AT_FAIL_IF([:]) ;;
+    esac
+}
+
+vif_to_hv () {
+    case $1 in dnl (
+        vif[[1]]?) echo hv1 ;; dnl (
+        vif[[2]]?) echo hv2 ;; dnl (
+        vif-north) echo hv4 ;; dnl (
+        *) AT_FAIL_IF([:]) ;;
+    esac
+}
+
+ip_to_hex() {
+       printf "%02x%02x%02x%02x" "$@"
+}
+
+net_add n1
+for i in 1 2; do
+    sim_add hv$i
+    as hv$i
+    ovs-vsctl add-br br-phys
+    ovs-vsctl set open . external-ids:ovn-bridge-mappings=phys:br-phys
+    ovs-vsctl set open . external-ids:ovn-chassis-mac-mappings="phys:aa:bb:cc:dd:ee:$i$i"
+    ovn_attach n1 br-phys 192.168.0.$i
+
+    ovs-vsctl add-port br-int vif$i$i -- \
+        set Interface vif$i$i external-ids:iface-id=lp$i$i \
+                              options:tx_pcap=hv$i/vif$i$i-tx.pcap \
+                              options:rxq_pcap=hv$i/vif$i$i-rx.pcap \
+                              ofport-request=$i$i
+
+    lsp_name=lp$i$i
+    ls_name=$(lsp_to_ls $lsp_name)
+
+    ovn-nbctl lsp-add $ls_name $lsp_name
+    ovn-nbctl lsp-set-addresses $lsp_name "f0:00:00:00:00:$i$i 192.168.$i.$i"
+    ovn-nbctl lsp-set-port-security $lsp_name f0:00:00:00:00:$i$i
+
+    OVS_WAIT_UNTIL([test x`ovn-nbctl lsp-get-up $lsp_name` = xup])
+
+done
+
+ovn-nbctl ls-add ls-underlay bridged
+ovn-nbctl lsp-add ls-underlay ln3 "" 1000
+ovn-nbctl lsp-set-addresses ln3 unknown
+ovn-nbctl lsp-set-type ln3 localnet
+ovn-nbctl lsp-set-options ln3 network_name=phys
+
+ovn-nbctl ls-add ls-north bridged
+ovn-nbctl lsp-add ls-north ln4 "" 1000
+ovn-nbctl lsp-set-addresses ln4 unknown
+ovn-nbctl lsp-set-type ln4 localnet
+ovn-nbctl lsp-set-options ln4 network_name=phys
+
+# Add a VM on ls-north
+ovn-nbctl lsp-add ls-north lp-north
+ovn-nbctl lsp-set-addresses lp-north "f0:f0:00:00:00:11 172.31.0.10"
+ovn-nbctl lsp-set-port-security lp-north f0:f0:00:00:00:11
+
+# Add 3rd hypervisor
+sim_add hv3
+as hv3 ovs-vsctl add-br br-phys
+as hv3 ovs-vsctl set open . external-ids:ovn-bridge-mappings=phys:br-phys
+as hv3 ovs-vsctl set open . external-ids:ovn-chassis-mac-mappings="phys:aa:bb:cc:dd:ee:33"
+as hv3 ovn_attach n1 br-phys 192.168.0.3
+
+# Add 4th hypervisor
+sim_add hv4
+as hv4 ovs-vsctl add-br br-phys
+as hv4 ovs-vsctl set open . external-ids:ovn-bridge-mappings=phys:br-phys
+as hv4 ovs-vsctl set open . external-ids:ovn-chassis-mac-mappings="phys:aa:bb:cc:dd:ee:44"
+as hv4 ovn_attach n1 br-phys 192.168.0.4
+
+as hv4 ovs-vsctl add-port br-int vif-north -- \
+        set Interface vif-north external-ids:iface-id=lp-north \
+                              options:tx_pcap=hv4/vif-north-tx.pcap \
+                              options:rxq_pcap=hv4/vif-north-rx.pcap \
+                              ofport-request=44
+
+ovn-nbctl lr-add router
+ovn-nbctl lrp-add router router-to-ls1 00:00:01:01:02:03 192.168.1.3/24
+ovn-nbctl lrp-add router router-to-ls2 00:00:01:01:02:05 192.168.2.3/24
+ovn-nbctl lrp-add router router-to-underlay 00:00:01:01:02:07 172.31.0.1/24
+
+ovn-nbctl lsp-add ls1 ls1-to-router -- set Logical_Switch_Port ls1-to-router type=router \
+          options:router-port=router-to-ls1 -- lsp-set-addresses ls1-to-router router
+ovn-nbctl lsp-add ls2 ls2-to-router -- set Logical_Switch_Port ls2-to-router type=router \
+          options:router-port=router-to-ls2 -- lsp-set-addresses ls2-to-router router
+ovn-nbctl lsp-add ls-underlay underlay-to-router -- set Logical_Switch_Port \
+                              underlay-to-router type=router \
+                              options:router-port=router-to-underlay \
+                              -- lsp-set-addresses underlay-to-router router
+
+ovn-nbctl lrp-set-gateway-chassis router-to-underlay hv3
+
+ovn-nbctl --wait=sb sync
+
+sleep 2
+
+OVN_POPULATE_ARP
+
+# lsp_to_ls LSP
+#
+# Prints the name of the logical switch that contains LSP.
+lsp_to_ls () {
+    case $1 in dnl (
+        lp?[[11]]) echo ls1 ;; dnl (
+        lp?[[12]]) echo ls2 ;; dnl (
+        *) AT_FAIL_IF([:]) ;;
+    esac
+}
+
+vif_to_ls () {
+    case $1 in dnl (
+        vif?[[11]]) echo ls1 ;; dnl (
+        vif?[[12]]) echo ls2 ;; dnl (
+        vif-north) echo ls-north ;; dnl (
+        *) AT_FAIL_IF([:]) ;;
+    esac
+}
+
+hv_to_num () {
+    case $1 in dnl (
+        hv1) echo 1 ;; dnl (
+        hv2) echo 2 ;; dnl (
+        hv3) echo 3 ;; dnl (
+        hv4) echo 4 ;; dnl (
+        *) AT_FAIL_IF([:]) ;;
+    esac
+}
+
+vif_to_num () {
+    case $1 in dnl (
+        vif22) echo 22 ;; dnl (
+        vif21) echo 21 ;; dnl (
+        vif11) echo 11 ;; dnl (
+        *) AT_FAIL_IF([:]) ;;
+    esac
+}
+
+vif_to_hv () {
+    case $1 in dnl (
+        vif[[1]]?) echo hv1 ;; dnl (
+        vif[[2]]?) echo hv2 ;; dnl (
+        vif-north) echo hv4 ;; dnl (
+        *) AT_FAIL_IF([:]) ;;
+    esac
+}
+
+vif_to_lrp () {
+    echo router-to-`vif_to_ls $1`
+}
+
+ip_to_hex() {
+       printf "%02x%02x%02x%02x" "$@"
+}
+
+
+test_ip() {
+        # This packet has bad checksums but logical L3 routing doesn't check.
+        local inport=$1 src_mac=$2 dst_mac=$3 src_ip=$4 dst_ip=$5 outport=$6
+        local packet=${dst_mac}${src_mac}08004500001c0000000040110000${src_ip}${dst_ip}0035111100080000
+        shift; shift; shift; shift; shift
+        hv=`vif_to_hv $inport`
+        as $hv ovs-appctl netdev-dummy/receive $inport $packet
+        in_ls=`vif_to_ls $inport`
+        for outport; do
+            out_ls=`vif_to_ls $outport`
+            if test $in_ls = $out_ls; then
+                # Ports on the same logical switch receive exactly the same packet.
+                echo $packet
+            else
+                # Routing decrements TTL and updates source and dest MAC
+                # (and checksum).
+                out_lrp=`vif_to_lrp $outport`
+                # For North-South, packet will come via gateway chassis, i.e hv3
+                if test $inport = vif-north; then
+                    echo f00000000011aabbccddee3308004500001c000000003f110100${src_ip}${dst_ip}0035111100080000 >> $outport.expected
+                fi
+                if test $outport = vif-north; then
+                    echo f0f000000011aabbccddee1108004500001c000000003f110100${src_ip}${dst_ip}0035111100080000 >> $outport.expected
+                fi
+            fi >> $outport.expected
+        done
+}
+
+# Dump a bunch of info helpful for debugging if there's a failure.
+
+echo "------ OVN dump ------"
+ovn-nbctl show
+ovn-sbctl show
+ovn-sbctl list port_binding
+ovn-sbctl list mac_binding
+
+echo "------ hv1 dump ------"
+as hv1 ovs-vsctl show
+as hv1 ovs-vsctl list Open_Vswitch
+
+echo "------ hv2 dump ------"
+as hv2 ovs-vsctl show
+as hv2 ovs-vsctl list Open_Vswitch
+
+echo "------ hv3 dump ------"
+as hv3 ovs-vsctl show
+as hv3 ovs-vsctl list Open_Vswitch
+
+echo "------ hv4 dump ------"
+as hv4 ovs-vsctl show
+as hv4 ovs-vsctl list Open_Vswitch
+
+echo "Send traffic North to South"
+
+sip=`ip_to_hex 172 31 0 10`
+dip=`ip_to_hex 192 168 1 1`
+test_ip vif-north f0f000000011 000001010207 $sip $dip vif11
+
+sleep 1
+
+# Confirm that North to south traffic works fine and went through gateway chassis, i.e HV3
+OVN_CHECK_PACKETS([hv1/vif11-tx.pcap], [vif11.expected])
+
+echo "Send traffic South to North"
+sip=`ip_to_hex 192 168 1 1`
+dip=`ip_to_hex 172 31 0 10`
+test_ip vif11 f00000000011 000001010203 $sip $dip vif-north
+
+sleep 1
+
+# Confirm that South to North traffic works fine.
+OVN_CHECK_PACKETS_REMOVE_BROADCAST([hv4/vif-north-tx.pcap], [vif-north.expected])
+
+# Confirm that packets did not go out via tunnel port.
+AT_CHECK([as hv1 ovs-ofctl dump-flows br-int | grep table=32 | grep NXM_NX_TUN_METADATA0 | grep n_packets=0 | wc -l], [0], [[1
+]])
+
+# Confirm that HV1 chassis mac is never seen on Gateway chassis, i.e HV3
+AT_CHECK([as hv3 ovs-appctl fdb/show br-phys | grep aa:bb:cc:dd:ee:11 | wc -l], [0], [[0
+]])
+
+echo "----------- Post Traffic hv1 dump -----------"
+as hv1 ovs-ofctl -O OpenFlow13 dump-flows br-int
+as hv1 ovs-appctl fdb/show br-phys
+
+echo "----------- Post Traffic hv2 dump -----------"
+as hv2 ovs-ofctl -O OpenFlow13 dump-flows br-int
+as hv2 ovs-appctl fdb/show br-phys
+
+echo "----------- Post Traffic hv3 dump -----------"
+as hv3 ovs-ofctl -O OpenFlow13 dump-flows br-int
+as hv3 ovs-appctl fdb/show br-phys
+
+echo "----------- Post Traffic hv4 dump -----------"
+as hv4 ovs-ofctl -O OpenFlow13 dump-flows br-int
+as hv4 ovs-appctl fdb/show br-phys
+
+OVN_CLEANUP([hv1],[hv2],[hv3],[hv4])
+
+AT_CLEANUP
+
+
+AT_SETUP([ovn -- 2 HVs, 2 lports/HV, localnet ports, DVR N-S ARP handling])
+ovn_start
+
+# In this test case we create 3 switches, all connected to the same
+# physical network (through br-phys on each HV). LS1 and LS2 have
+# 1 VIF each. Each HV has 1 VIF port. The first digit
+# of VIF port name indicates the hypervisor it is bound to, e.g.
+# lp23 means VIF 3 on hv2.
+#
+# All the switches are connected to a logical router "router".
+#
+# Each switch's VLAN tag and their logical switch ports are:
+#   - ls1:
+#       - tagged with VLAN 101
+#       - ports: lp11
+#   - ls2:
+#       - tagged with VLAN 201
+#       - ports: lp22
+#   - ls-underlay:
+#       - tagged with VLAN 1000
+# Note: a localnet port is created for each switch to connect to
+# physical network.
+
+for i in 1 2; do
+    ls_name=ls$i
+    ovn-nbctl ls-add $ls_name bridged
+    ln_port_name=ln$i
+    if test $i -eq 1; then
+        ovn-nbctl lsp-add $ls_name $ln_port_name "" 101
+    elif test $i -eq 2; then
+        ovn-nbctl lsp-add $ls_name $ln_port_name "" 201
+    fi
+    ovn-nbctl lsp-set-addresses $ln_port_name unknown
+    ovn-nbctl lsp-set-type $ln_port_name localnet
+    ovn-nbctl lsp-set-options $ln_port_name network_name=phys
+done
+
+# lsp_to_ls LSP
+#
+# Prints the name of the logical switch that contains LSP.
+lsp_to_ls () {
+    case $1 in dnl (
+        lp?[[11]]) echo ls1 ;; dnl (
+        lp?[[12]]) echo ls2 ;; dnl (
+        *) AT_FAIL_IF([:]) ;;
+    esac
+}
+
+vif_to_hv () {
+    case $1 in dnl (
+        vif[[1]]?) echo hv1 ;; dnl (
+        vif[[2]]?) echo hv2 ;; dnl (
+        vif-north) echo hv4 ;; dnl (
+        *) AT_FAIL_IF([:]) ;;
+    esac
+}
+
+ip_to_hex() {
+       printf "%02x%02x%02x%02x" "$@"
+}
+
+net_add n1
+for i in 1 2; do
+    sim_add hv$i
+    as hv$i
+    ovs-vsctl add-br br-phys
+    ovs-vsctl set open . external-ids:ovn-bridge-mappings=phys:br-phys
+    ovs-vsctl set open . external-ids:ovn-chassis-mac-mappings="phys:aa:bb:cc:dd:ee:$i$i"
+    ovn_attach n1 br-phys 192.168.0.$i
+
+    ovs-vsctl add-port br-int vif$i$i -- \
+        set Interface vif$i$i external-ids:iface-id=lp$i$i \
+                              options:tx_pcap=hv$i/vif$i$i-tx.pcap \
+                              options:rxq_pcap=hv$i/vif$i$i-rx.pcap \
+                              ofport-request=$i$i
+
+    lsp_name=lp$i$i
+    ls_name=$(lsp_to_ls $lsp_name)
+
+    ovn-nbctl lsp-add $ls_name $lsp_name
+    ovn-nbctl lsp-set-addresses $lsp_name "f0:00:00:00:00:$i$i 192.168.$i.$i"
+    ovn-nbctl lsp-set-port-security $lsp_name f0:00:00:00:00:$i$i
+
+    OVS_WAIT_UNTIL([test x`ovn-nbctl lsp-get-up $lsp_name` = xup])
+
+done
+
+ovn-nbctl ls-add ls-underlay bridged
+ovn-nbctl lsp-add ls-underlay ln3 "" 1000
+ovn-nbctl lsp-set-addresses ln3 unknown
+ovn-nbctl lsp-set-type ln3 localnet
+ovn-nbctl lsp-set-options ln3 network_name=phys
+
+ovn-nbctl ls-add ls-north bridged
+ovn-nbctl lsp-add ls-north ln4 "" 1000
+ovn-nbctl lsp-set-addresses ln4 unknown
+ovn-nbctl lsp-set-type ln4 localnet
+ovn-nbctl lsp-set-options ln4 network_name=phys
+
+# Add a VM on ls-north
+ovn-nbctl lsp-add ls-north lp-north
+ovn-nbctl lsp-set-addresses lp-north "f0:f0:00:00:00:11 172.31.0.10"
+ovn-nbctl lsp-set-port-security lp-north f0:f0:00:00:00:11
+
+# Add 3rd hypervisor
+sim_add hv3
+as hv3 ovs-vsctl add-br br-phys
+as hv3 ovs-vsctl set open . external-ids:ovn-bridge-mappings=phys:br-phys
+as hv3 ovs-vsctl set open . external-ids:ovn-chassis-mac-mappings="phys:aa:bb:cc:dd:ee:33"
+as hv3 ovn_attach n1 br-phys 192.168.0.3
+
+# Add 4th hypervisor
+sim_add hv4
+as hv4 ovs-vsctl add-br br-phys
+as hv4 ovs-vsctl set open . external-ids:ovn-bridge-mappings=phys:br-phys
+as hv4 ovs-vsctl set open . external-ids:ovn-chassis-mac-mappings="phys:aa:bb:cc:dd:ee:44"
+as hv4 ovn_attach n1 br-phys 192.168.0.4
+
+as hv4 ovs-vsctl add-port br-int vif-north -- \
+        set Interface vif-north external-ids:iface-id=lp-north \
+                              options:tx_pcap=hv4/vif-north-tx.pcap \
+                              options:rxq_pcap=hv4/vif-north-rx.pcap \
+                              ofport-request=44
+
+ovn-nbctl lr-add router
+ovn-nbctl lrp-add router router-to-ls1 00:00:01:01:02:03 192.168.1.3/24
+ovn-nbctl lrp-add router router-to-ls2 00:00:01:01:02:05 192.168.2.3/24
+ovn-nbctl lrp-add router router-to-underlay 00:00:01:01:02:07 172.31.0.1/24
+
+ovn-nbctl lsp-add ls1 ls1-to-router -- set Logical_Switch_Port ls1-to-router type=router \
+          options:router-port=router-to-ls1 -- lsp-set-addresses ls1-to-router router
+ovn-nbctl lsp-add ls2 ls2-to-router -- set Logical_Switch_Port ls2-to-router type=router \
+          options:router-port=router-to-ls2 -- lsp-set-addresses ls2-to-router router
+ovn-nbctl lsp-add ls-underlay underlay-to-router -- set Logical_Switch_Port \
+                              underlay-to-router type=router \
+                              options:router-port=router-to-underlay \
+                              -- lsp-set-addresses underlay-to-router router
+
+
+OVN_POPULATE_ARP
+
+# lsp_to_ls LSP
+#
+# Prints the name of the logical switch that contains LSP.
+lsp_to_ls () {
+    case $1 in dnl (
+        lp?[[11]]) echo ls1 ;; dnl (
+        lp?[[12]]) echo ls2 ;; dnl (
+        *) AT_FAIL_IF([:]) ;;
+    esac
+}
+
+vif_to_ls () {
+    case $1 in dnl (
+        vif?[[11]]) echo ls1 ;; dnl (
+        vif?[[12]]) echo ls2 ;; dnl (
+        vif-north) echo ls-north ;; dnl (
+        *) AT_FAIL_IF([:]) ;;
+    esac
+}
+
+hv_to_num () {
+    case $1 in dnl (
+        hv1) echo 1 ;; dnl (
+        hv2) echo 2 ;; dnl (
+        hv3) echo 3 ;; dnl (
+        hv4) echo 4 ;; dnl (
+        *) AT_FAIL_IF([:]) ;;
+    esac
+}
+
+vif_to_num () {
+    case $1 in dnl (
+        vif22) echo 22 ;; dnl (
+        vif21) echo 21 ;; dnl (
+        vif11) echo 11 ;; dnl (
+        *) AT_FAIL_IF([:]) ;;
+    esac
+}
+
+vif_to_hv () {
+    case $1 in dnl (
+        vif[[1]]?) echo hv1 ;; dnl (
+        vif[[2]]?) echo hv2 ;; dnl (
+        vif-north) echo hv4 ;; dnl (
+        *) AT_FAIL_IF([:]) ;;
+    esac
+}
+
+vif_to_lrp () {
+    echo router-to-`vif_to_ls $1`
+}
+
+ip_to_hex() {
+       printf "%02x%02x%02x%02x" "$@"
+}
+
+# Dump a bunch of info helpful for debugging if there's a failure.
+
+echo "------ OVN dump ------"
+ovn-nbctl show
+ovn-sbctl show
+ovn-sbctl list port_binding
+ovn-sbctl list mac_binding
+
+echo "------ hv1 dump ------"
+as hv1 ovs-vsctl show
+as hv1 ovs-vsctl list Open_Vswitch
+
+echo "------ hv2 dump ------"
+as hv2 ovs-vsctl show
+as hv2 ovs-vsctl list Open_Vswitch
+
+echo "------ hv3 dump ------"
+as hv3 ovs-vsctl show
+as hv3 ovs-vsctl list Open_Vswitch
+
+echo "------ hv4 dump ------"
+as hv4 ovs-vsctl show
+as hv4 ovs-vsctl list Open_Vswitch
+
+# test_arp INPORT SHA SPA TPA [REPLY_HA]
+#
+# Causes a packet to be received on INPORT.  The packet is an ARP
+# request with SHA, SPA, and TPA as specified.  If REPLY_HA is provided, then
+# it should be the hardware address of the target to expect to receive in an
+# ARP reply; otherwise no reply is expected.
+#
+# INPORT is a logical switch port name, e.g. vif11 or vif-north.
+# SHA and REPLY_HA are each 12 hex digits.
+# SPA and TPA are each 8 hex digits.
+test_arp() {
+    local inport=$1 sha=$2 spa=$3 tpa=$4 reply_ha=$5
+    local request=ffffffffffff${sha}08060001080006040001${sha}${spa}ffffffffffff${tpa}
+    hv=`vif_to_hv $inport`
+    as $hv ovs-appctl netdev-dummy/receive $inport $request
+
+    if test X$reply_ha = X; then
+        # Expect to receive the broadcast ARP on the other logical switch ports
+        # if no reply is expected.
+        local i j
+        for i in 1 2 3; do
+            for j in 1 2 3; do
+                if test $i$j != $inport; then
+                    echo $request >> $i$j.expected
+                fi
+            done
+        done
+    else
+        # Expect to receive the reply, if any.
+        local reply=${sha}${reply_ha}08060001080006040002${reply_ha}${tpa}${sha}${spa}
+        local reply_vid=${sha}${reply_ha}810003e808060001080006040002${reply_ha}${tpa}${sha}${spa}
+        echo $reply_vid >> ${inport}_vid.expected
+        echo $reply >> $inport.expected
+    fi
+}
+
+sip=`ip_to_hex 172 31 0 10`
+tip=`ip_to_hex 172 31 0 1`
+
+test_arp vif-north f0f000000011 $sip $tip
+# Confirm that vif-north does not get ARP reply
+AT_CHECK([wc -l hv4/vif-north-tx.pcap | awk '{print $1}'], [0], [[0
+]])
+
+# Set a hypervisor as gateway chassis, for router port 172.31.0.1
+ovn-nbctl lrp-set-gateway-chassis router-to-underlay hv3
+ovn-nbctl --wait=sb sync
+sleep 2
+
+test_arp vif-north f0f000000011 $sip $tip 000001010207
+
+sleep 1
+
+# Confirm that vif-north gets a single ARP reply this time
+OVN_CHECK_PACKETS_REMOVE_BROADCAST([hv4/vif-north-tx.pcap], [vif-north.expected])
+
+# Confirm that only the redirect chassis answered the ARP request.
+OVN_CHECK_PACKETS_REMOVE_BROADCAST([hv3/br-phys_n1-tx.pcap], [vif-north_vid.expected])
+sed -i '/ffffffffffff/d' hv3/br-phys_n1-tx.packets
+AT_CHECK([grep 000001010207 hv3/br-phys_n1-tx.packets | wc -l], [0], [[1
+]])
+
+# Confirm that other OVN chassis did not generate ARP reply.
+$PYTHON "$top_srcdir/utilities/ovs-pcap.in" hv1/br-phys_n1-tx.pcap > hv1/br-phys_n1-tx.packets
+$PYTHON "$top_srcdir/utilities/ovs-pcap.in" hv2/br-phys_n1-tx.pcap > hv2/br-phys_n1-tx.packets
+
+AT_CHECK([grep 000001010207 hv1/br-phys_n1-tx.packets | wc -l], [0], [[0
+]])
+AT_CHECK([grep 000001010207 hv2/br-phys_n1-tx.packets | wc -l], [0], [[0
+]])
+
+echo "----------- Post Traffic hv1 dump -----------"
+as hv1 ovs-ofctl -O OpenFlow13 dump-flows br-int
+as hv1 ovs-appctl fdb/show br-phys
+
+echo "----------- Post Traffic hv2 dump -----------"
+as hv2 ovs-ofctl -O OpenFlow13 dump-flows br-int
+as hv2 ovs-appctl fdb/show br-phys
+
+echo "----------- Post Traffic hv3 dump -----------"
+as hv3 ovs-ofctl -O OpenFlow13 dump-flows br-int
+as hv3 ovs-appctl fdb/show br-phys
+
+echo "----------- Post Traffic hv4 dump -----------"
+as hv4 ovs-ofctl -O OpenFlow13 dump-flows br-int
+as hv4 ovs-appctl fdb/show br-phys
+
+OVN_CLEANUP([hv1],[hv2],[hv3],[hv4])
+
+AT_CLEANUP
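
Note on the expected GARP frame in the "DVR N-S GARP" test: the hard-coded hex string is easier to review when split into its fields. The sketch below is not part of the patch; `garp_frame` is a hypothetical helper introduced only for illustration, and it assumes a VLAN priority of 0 so that the 802.1Q tag control field equals the VLAN ID. `ip_to_hex` mirrors the helper of the same name defined in the tests above.

```shell
#!/bin/sh
# Illustrative only: rebuild the GARP frame the test expects on
# hv2/snoopvif-tx.pcap from its individual fields.

ip_to_hex() {
    printf "%02x%02x%02x%02x" "$@"
}

# garp_frame MAC VLAN_ID IP_OCTET...   (hypothetical helper)
garp_frame() {
    mac=$1
    vlan=$2
    shift; shift
    ip=$(ip_to_hex "$@")
    # Assumes PCP/DEI are 0, so the 16-bit TCI is just the VLAN ID.
    tci=$(printf "%04x" "$vlan")
    # Ethernet: broadcast dst, router port MAC as src,
    # 802.1Q tag (TPID 0x8100 + TCI), ethertype ARP (0x0806).
    eth="ffffffffffff${mac}8100${tci}0806"
    # ARP: htype=1, ptype=0x0800, hlen=6, plen=4, op=1 (request),
    # sender MAC/IP, zero target MAC, target IP == sender IP (gratuitous).
    arp="0001080006040001${mac}${ip}000000000000${ip}"
    echo "${eth}${arp}"
}

# Router port router-to-underlay: 00:00:01:01:02:07, 172.31.0.1, VLAN 1000.
garp_frame 000001010207 1000 172 31 0 1
```

With the router port MAC, VLAN 1000 and 172.31.0.1 as inputs, the output should match the string the test writes to `expected`, which also shows that the advertised MAC is the configured router port MAC rather than the chassis MAC.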