
[ovs-dev,v10] OVN: Enable E-W Traffic, Vlan backed DVR

Message ID 1560295002-28128-1-git-send-email-ankur.sharma@nutanix.com
State Superseded

Commit Message

Ankur Sharma June 11, 2019, 11:14 p.m. UTC
Background:
[1] https://mail.openvswitch.org/pipermail/ovs-dev/2018-October/353066.html
[2] https://docs.google.com/document/d/1uoQH478wM1OZ16HrxzbOUvk5LvFnfNEWbkPT6Zmm9OU/edit?usp=sharing

The key difference between an overlay logical switch and a
vlan backed logical switch is that packets on a vlan backed
logical switch are not encapsulated.

Hence, if a distributed router port is connected to a vlan backed
logical switch, the router port mac can appear as the source mac
from multiple hypervisors. From the perspective of a top of the rack
switch (TOR), the same <mac,vlan> pair arriving on multiple ports
could be treated as a security threat: the TOR could raise alarms,
drop the packets, or block the ports.

This patch addresses this by introducing the concept of a chassis mac.
A chassis mac is a CMS provisioned mac that is unique per chassis. For any
routed packet (i.e. one whose source mac is a router port mac) going on the
wire on a vlan type logical switch, we replace its source mac with the
chassis mac.
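
For illustration only, here is a minimal standalone sketch of how a
"network:MAC,network:MAC" value such as the one stored in
external_ids:ovn-chassis-mac-mappings could be looked up. It is not the
code in this patch: struct mac_addr, parse_mac() and lookup_chassis_mac()
are hypothetical names, and the sketch avoids OVS helpers such as
str_to_mac() so that it compiles on its own. Like the patch, it skips an
entry whose mac does not parse and keeps looking.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for OVS's struct eth_addr. */
struct mac_addr {
    uint8_t ea[6];
};

/* Parses "xx:xx:xx:xx:xx:xx" into 'mac'.  Returns true on success.
 * This is a lenient sscanf-based parser, good enough for a sketch. */
static bool
parse_mac(const char *s, struct mac_addr *mac)
{
    unsigned int b[6];
    int n = 0;

    if (sscanf(s, "%2x:%2x:%2x:%2x:%2x:%2x%n",
               &b[0], &b[1], &b[2], &b[3], &b[4], &b[5], &n) != 6
        || s[n] != '\0') {
        return false;
    }
    for (int i = 0; i < 6; i++) {
        mac->ea[i] = (uint8_t) b[i];
    }
    return true;
}

/* Looks up 'network' in 'mappings', a comma-separated list of
 * "network:MAC" pairs.  Stores the first valid mac for 'network' in
 * '*mac' and returns true, or returns false if there is none. */
static bool
lookup_chassis_mac(const char *mappings, const char *network,
                   struct mac_addr *mac)
{
    size_t network_len = strlen(network);
    const char *p = mappings;

    while (*p) {
        const char *end = strchr(p, ',');
        size_t pair_len = end ? (size_t) (end - p) : strlen(p);
        /* The first ':' separates the network name from the mac. */
        const char *colon = memchr(p, ':', pair_len);

        if (colon) {
            size_t name_len = (size_t) (colon - p);
            size_t mac_len = pair_len - name_len - 1;
            char mac_str[18];

            if (name_len == network_len && !memcmp(p, network, name_len)
                && mac_len < sizeof mac_str) {
                memcpy(mac_str, colon + 1, mac_len);
                mac_str[mac_len] = '\0';
                if (parse_mac(mac_str, mac)) {
                    return true;
                }
            }
        }
        p = end ? end + 1 : p + pair_len;
    }
    return false;
}
```

An entry without a ':' separator or with an unparsable mac is ignored
rather than dereferenced, so a malformed value simply yields "not found".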

The source mac is replaced in table=65 of the logical switch datapath.
A flow is added at priority 150 that matches on a router port mac as
the source mac and replaces it with the chassis mac.

Example flow:
cookie=0x0, duration=67765.830s, table=65, n_packets=0, n_bytes=0,
idle_age=65534, hard_age=65534, priority=150,reg15=0x1,metadata=0x4,
dl_src=00:00:01:01:02:03 actions=mod_dl_src:aa:bb:cc:dd:ee:ff,
mod_vlan_vid:1000,output:16

Here, 00:00:01:01:02:03 is the router port mac and aa:bb:cc:dd:ee:ff
is the chassis mac.
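
As a rough model of what the example flow does to a packet, the sketch
below applies the same rule to a simplified frame in plain C. It is an
assumption-laden illustration, not OVS code: struct frame, MAC_LEN and
rewrite_routed_frame() are hypothetical names, and only the priority-150
case is modelled (a frame whose source mac is not a router port mac is
left untouched here).

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAC_LEN 6

/* Hypothetical, simplified view of a frame leaving via the localnet port. */
struct frame {
    uint8_t dst[MAC_LEN];
    uint8_t src[MAC_LEN];
    uint16_t vlan_vid;          /* 0 means untagged. */
};

/* If the source mac of 'f' is one of the 'n_router_macs' router port macs,
 * replaces it with 'chassis_mac' and, when 'localnet_tag' is nonzero, sets
 * the vlan id to 'localnet_tag'.  Returns true if the frame was rewritten,
 * false if it did not match and was left unchanged. */
static bool
rewrite_routed_frame(struct frame *f,
                     const uint8_t router_macs[][MAC_LEN],
                     size_t n_router_macs,
                     const uint8_t chassis_mac[MAC_LEN],
                     uint16_t localnet_tag)
{
    for (size_t i = 0; i < n_router_macs; i++) {
        if (!memcmp(f->src, router_macs[i], MAC_LEN)) {
            memcpy(f->src, chassis_mac, MAC_LEN);   /* mod_dl_src */
            if (localnet_tag) {
                f->vlan_vid = localnet_tag;         /* mod_vlan_vid */
            }
            return true;
        }
    }
    return false;
}
```

With router port mac 00:00:01:01:02:03, chassis mac aa:bb:cc:dd:ee:ff and
tag 1000, a routed frame would leave with source mac aa:bb:cc:dd:ee:ff on
vlan 1000, which corresponds to the actions in the example flow.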

Signed-off-by: Ankur Sharma <ankur.sharma@nutanix.com>
---
 ovn/controller/binding.c            |  12 +--
 ovn/controller/chassis.c            |  64 +++++++++++-
 ovn/controller/chassis.h            |   4 +
 ovn/controller/ovn-controller.8.xml |  10 ++
 ovn/controller/ovn-controller.c     |   4 +-
 ovn/controller/ovn-controller.h     |   5 +-
 ovn/controller/physical.c           |  95 +++++++++++++++++
 ovn/ovn-architecture.7.xml          |  24 +++++
 ovn/ovn-sb.xml                      |   8 ++
 tests/ovn.at                        | 197 ++++++++++++++++++++++++++++++++++++
 10 files changed, 411 insertions(+), 12 deletions(-)

Comments

Numan Siddique June 17, 2019, 10:52 a.m. UTC | #1
On Wed, Jun 12, 2019 at 4:47 AM Ankur Sharma <ankur.sharma@nutanix.com>
wrote:

> Background:
> [1]
> https://mail.openvswitch.org/pipermail/ovs-dev/2018-October/353066.html
> [2]
> https://docs.google.com/document/d/1uoQH478wM1OZ16HrxzbOUvk5LvFnfNEWbkPT6Zmm9OU/edit?usp=sharing
>
> Key difference between an overlay logical switch and
> vlan backed logical switch is that for vlan logical switches
> packets are not encapsulated.
>
> Hence, if a distributed router port is connected to vlan backed
> logical switch, then router port mac as source mac could be
> seen from multiple hypervisors. Same <mac,vlan> pairs coming
> from multiple ports from a top of the rack switch (TOR) perspective
> could be seen as a security threat and it could send alarms, drop
> the packets or block the ports etc.
>
> This patch addresses the same by introducing the concept of chassis mac.
> A chassis mac is CMS provisioned unique mac per chassis. For any routed
> packet
> (i.e source mac is router port mac) going on the wire on a vlan type
> logical switch, we will replace its source mac with chassis mac.
>
> This replacing of source mac with chassis mac will happen in table=65
> of the logical switch datapath. A flow is added at priority 150, which
> matches the source mac and replaces it with chassis mac if the value
> is a router port mac.
>
> Example flow:
> cookie=0x0, duration=67765.830s, table=65, n_packets=0, n_bytes=0,
> idle_age=65534, hard_age=65534, priority=150,reg15=0x1,metadata=0x4,
> dl_src=00:00:01:01:02:03 actions=mod_dl_src:aa:bb:cc:dd:ee:ff,
> mod_vlan_vid:1000,output:16
>
> Here, 00:00:01:01:02:03 is router port mac and aa:bb:cc:dd:ee:ff
> is chassis mac.
>
> Signed-off-by: Ankur Sharma <ankur.sharma@nutanix.com>
>

Thanks Ankur for the patch.

Acked-by: Numan Siddique <nusiddiq@redhat.com>

There is just one minor comment. It would be nice if you could address it.

Thanks
Numan



> ---
>  ovn/controller/binding.c            |  12 +--
>  ovn/controller/chassis.c            |  64 +++++++++++-
>  ovn/controller/chassis.h            |   4 +
>  ovn/controller/ovn-controller.8.xml |  10 ++
>  ovn/controller/ovn-controller.c     |   4 +-
>  ovn/controller/ovn-controller.h     |   5 +-
>  ovn/controller/physical.c           |  95 +++++++++++++++++
>  ovn/ovn-architecture.7.xml          |  24 +++++
>  ovn/ovn-sb.xml                      |   8 ++
>  tests/ovn.at                        | 197 ++++++++++++++++++++++++++++++++++++
>  10 files changed, 411 insertions(+), 12 deletions(-)
>
> diff --git a/ovn/controller/binding.c b/ovn/controller/binding.c
> index b62b3da..c73d1aa 100644
> --- a/ovn/controller/binding.c
> +++ b/ovn/controller/binding.c
> @@ -159,13 +159,11 @@ add_local_datapath__(struct ovsdb_idl_index *sbrec_datapath_binding_by_key,
>                                           sbrec_port_binding_by_name,
>                                           peer->datapath, false,
>                                           depth + 1, local_datapaths);
> -                    ld->n_peer_dps++;
> -                    ld->peer_dps = xrealloc(
> -                            ld->peer_dps,
> -                            ld->n_peer_dps * sizeof *ld->peer_dps);
> -                    ld->peer_dps[ld->n_peer_dps - 1] = datapath_lookup_by_key(
> -                        sbrec_datapath_binding_by_key,
> -                        peer->datapath->tunnel_key);
> +                    ld->n_peer_ports++;
> +                    ld->peer_ports = xrealloc(ld->peer_ports,
> +                                              ld->n_peer_ports *
> +                                              sizeof *ld->peer_ports);
> +                    ld->peer_ports[ld->n_peer_ports - 1] = peer;
>                  }
>              }
>          }
> diff --git a/ovn/controller/chassis.c b/ovn/controller/chassis.c
> index 0f537f1..8403212 100644
> --- a/ovn/controller/chassis.c
> +++ b/ovn/controller/chassis.c
> @@ -23,6 +23,7 @@
>  #include "lib/vswitch-idl.h"
>  #include "openvswitch/dynamic-string.h"
>  #include "openvswitch/vlog.h"
> +#include "openvswitch/ofp-parse.h"
>  #include "ovn/lib/chassis-index.h"
>  #include "ovn/lib/ovn-sb-idl.h"
>  #include "ovn-controller.h"
> @@ -69,6 +70,12 @@ get_bridge_mappings(const struct smap *ext_ids)
>  }
>
>  static const char *
> +get_chassis_mac_mappings(const struct smap *ext_ids)
> +{
> +    return smap_get_def(ext_ids, "ovn-chassis-mac-mappings", "");
> +}
> +
> +static const char *
>  get_cms_options(const struct smap *ext_ids)
>  {
>      return smap_get_def(ext_ids, "ovn-cms-options", "");
> @@ -162,6 +169,7 @@ chassis_run(struct ovsdb_idl_txn *ovnsb_idl_txn,
>      const char *datapath_type =
>          br_int && br_int->datapath_type ? br_int->datapath_type : "";
>      const char *cms_options = get_cms_options(&cfg->external_ids);
> +    const char *chassis_macs = get_chassis_mac_mappings(&cfg->external_ids);
>
>      struct ds iface_types = DS_EMPTY_INITIALIZER;
>      ds_put_cstr(&iface_types, "");
> @@ -190,18 +198,22 @@ chassis_run(struct ovsdb_idl_txn *ovnsb_idl_txn,
>              = smap_get_def(&chassis_rec->external_ids, "iface-types", "");
>          const char *chassis_cms_options
>              = get_cms_options(&chassis_rec->external_ids);
> +        const char *chassis_mac_mappings
> +            = get_chassis_mac_mappings(&chassis_rec->external_ids);
>
>          /* If any of the external-ids should change, update them. */
>          if (strcmp(bridge_mappings, chassis_bridge_mappings) ||
>              strcmp(datapath_type, chassis_datapath_type) ||
>              strcmp(iface_types_str, chassis_iface_types) ||
> -            strcmp(cms_options, chassis_cms_options)) {
> +            strcmp(cms_options, chassis_cms_options) ||
> +            strcmp(chassis_macs, chassis_mac_mappings)) {
>              struct smap new_ids;
>              smap_clone(&new_ids, &chassis_rec->external_ids);
>              smap_replace(&new_ids, "ovn-bridge-mappings", bridge_mappings);
>              smap_replace(&new_ids, "datapath-type", datapath_type);
>              smap_replace(&new_ids, "iface-types", iface_types_str);
>              smap_replace(&new_ids, "ovn-cms-options", cms_options);
> +            smap_replace(&new_ids, "ovn-chassis-mac-mappings", chassis_macs);
>              sbrec_chassis_verify_external_ids(chassis_rec);
>              sbrec_chassis_set_external_ids(chassis_rec, &new_ids);
>              smap_destroy(&new_ids);
> @@ -319,6 +331,56 @@ chassis_run(struct ovsdb_idl_txn *ovnsb_idl_txn,
>      return chassis_rec;
>  }
>
> +bool
> +chassis_get_mac(const struct sbrec_chassis *chassis_rec,
> +                const char *bridge_mapping,
> +                struct eth_addr *chassis_mac)
> +{
> +    const char *tokens
> +        = get_chassis_mac_mappings(&chassis_rec->external_ids);
> +
> +    if (!strlen(tokens)) {
> +        return false;
> +    }
> +
> +    char *save_ptr = NULL;
> +    char *token;
> +    bool ret = false;
> +    char *tokstr = xstrdup(tokens);
> +
> +    /* Format for a chassis mac configuration is:
> +     * ovn-chassis-mac-mappings="bridge-name1:MAC1,bridge-name2:MAC2"
> +     */
> +    for (token = strtok_r(tokstr, ",", &save_ptr);
> +         token != NULL;
> +         token = strtok_r(NULL, ",", &save_ptr)) {
> +        char *save_ptr2 = NULL;
> +        char *chassis_mac_bridge = strtok_r(token, ":", &save_ptr2);
> +        char *chassis_mac_str = strtok_r(NULL, "", &save_ptr2);
> +
> +        if (!strcmp(chassis_mac_bridge, bridge_mapping)) {
> +            struct eth_addr temp_mac;
> +            char *err_str = NULL;
> +
> +            ret = true;
> +
> +            /* Return the first chassis mac. */
> +            if ((err_str = str_to_mac(chassis_mac_str, &temp_mac))) {
> +                free(err_str);
> +                ret = false;
> +                continue;
> +            }
> +
> +            *chassis_mac = temp_mac;
> +            break;
> +        }
> +    }
> +
> +    free(tokstr);
> +
> +    return ret;
> +}
> +
>  /* Returns true if the database is all cleaned up, false if more work is
>   * required. */
>  bool
> diff --git a/ovn/controller/chassis.h b/ovn/controller/chassis.h
> index 9847e19..e3fbc31 100644
> --- a/ovn/controller/chassis.h
> +++ b/ovn/controller/chassis.h
> @@ -26,6 +26,7 @@ struct ovsrec_open_vswitch_table;
>  struct sbrec_chassis;
>  struct sbrec_chassis_table;
>  struct sset;
> +struct eth_addr;
>
>  void chassis_register_ovs_idl(struct ovsdb_idl *);
>  const struct sbrec_chassis *chassis_run(
> @@ -36,5 +37,8 @@ const struct sbrec_chassis *chassis_run(
>      const struct sset *transport_zones);
>  bool chassis_cleanup(struct ovsdb_idl_txn *ovnsb_idl_txn,
>                       const struct sbrec_chassis *);
> +bool chassis_get_mac(const struct sbrec_chassis *chassis,
> +                     const char *bridge_mapping,
> +                     struct eth_addr *chassis_mac);
>
>  #endif /* ovn/chassis.h */
> diff --git a/ovn/controller/ovn-controller.8.xml b/ovn/controller/ovn-controller.8.xml
> index 9721d9a..18f66fe 100644
> --- a/ovn/controller/ovn-controller.8.xml
> +++ b/ovn/controller/ovn-controller.8.xml
> @@ -182,6 +182,16 @@
>            transport zone.
>          </p>
>        </dd>
> +      <dt><code>external_ids:ovn-chassis-mac-mappings</code></dt>
> +      <dd>
> +        A list of key-value pairs that map a chassis specific mac to
> +        a physical network name. An example
> +        value mapping two chassis macs to two physical network names would be:
> +        <code>physnet1:aa:bb:cc:dd:ee:ff,physnet2:a1:b2:c3:d4:e5:f6</code>.
> +        These are the macs that ovn-controller will replace a router port
> +        mac with, if packet is going from a distributed router port on
> +        vlan type logical switch.
> +      </dd>
>      </dl>
>
>      <p>
> diff --git a/ovn/controller/ovn-controller.c b/ovn/controller/ovn-controller.c
> index 6019016..315a88b 100644
> --- a/ovn/controller/ovn-controller.c
> +++ b/ovn/controller/ovn-controller.c
> @@ -899,7 +899,7 @@ en_runtime_data_cleanup(struct engine_node *node)
>      struct local_datapath *cur_node, *next_node;
>      HMAP_FOR_EACH_SAFE (cur_node, next_node, hmap_node,
>                          &data->local_datapaths) {
> -        free(cur_node->peer_dps);
> +        free(cur_node->peer_ports);
>          hmap_remove(&data->local_datapaths, &cur_node->hmap_node);
>          free(cur_node);
>      }
> @@ -929,7 +929,7 @@ en_runtime_data_run(struct engine_node *node)
>      } else {
>          struct local_datapath *cur_node, *next_node;
>          HMAP_FOR_EACH_SAFE (cur_node, next_node, hmap_node, local_datapaths) {
> -            free(cur_node->peer_dps);
> +            free(cur_node->peer_ports);
>              hmap_remove(local_datapaths, &cur_node->hmap_node);
>              free(cur_node);
>          }
> diff --git a/ovn/controller/ovn-controller.h b/ovn/controller/ovn-controller.h
> index 6afd727..a4c1309 100644
> --- a/ovn/controller/ovn-controller.h
> +++ b/ovn/controller/ovn-controller.h
> @@ -59,8 +59,9 @@ struct local_datapath {
>      /* True if this datapath contains an l3gateway port located on this
>       * hypervisor. */
>      bool has_local_l3gateway;
> -    const struct sbrec_datapath_binding **peer_dps;
> -    size_t n_peer_dps;
> +
> +    const struct sbrec_port_binding **peer_ports;
> +    size_t n_peer_ports;
>  };
>
>  struct local_datapath *get_local_datapath(const struct hmap *,
> diff --git a/ovn/controller/physical.c b/ovn/controller/physical.c
> index c8dc282..af587a5 100644
> --- a/ovn/controller/physical.c
> +++ b/ovn/controller/physical.c
> @@ -20,6 +20,7 @@
>  #include "ha-chassis.h"
>  #include "lflow.h"
>  #include "lport.h"
> +#include "chassis.h"
>  #include "lib/bundle.h"
>  #include "openvswitch/poll-loop.h"
>  #include "lib/uuid.h"
> @@ -30,6 +31,7 @@
>  #include "openvswitch/ofp-actions.h"
>  #include "openvswitch/ofpbuf.h"
>  #include "openvswitch/vlog.h"
> +#include "openvswitch/ofp-parse.h"
>  #include "ovn-controller.h"
>  #include "ovn/lib/chassis-index.h"
>  #include "ovn/lib/ovn-sb-idl.h"
> @@ -236,6 +238,92 @@ get_zone_ids(const struct sbrec_port_binding *binding,
>  }
>
>  static void
> +put_replace_router_port_mac_flows(const struct
> +                                  sbrec_port_binding *localnet_port,
> +                                  const struct sbrec_chassis *chassis,
> +                                  const struct hmap *local_datapaths,
> +                                  struct ofpbuf *ofpacts_p,
> +                                  ofp_port_t ofport,
> +                                  struct ovn_desired_flow_table *flow_table)
> +{
> +    struct local_datapath *ld = get_local_datapath(local_datapaths,
> +                                                   localnet_port->datapath->
> +                                                   tunnel_key);
> +    ovs_assert(ld);
> +
> +    uint32_t dp_key = localnet_port->datapath->tunnel_key;
> +    uint32_t port_key = localnet_port->tunnel_key;
> +    int tag = localnet_port->tag ? *localnet_port->tag : 0;
> +    const char *network = smap_get(&localnet_port->options, "network_name");
> +    struct eth_addr chassis_mac;
> +
> +    if (!network) {
> +        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 1);
> +        VLOG_WARN_RL(&rl, "Physical network not configured for datapath: %ld "
> +                     "with localnet port",
> +                     localnet_port->datapath->tunnel_key);
> +        return;
> +    }
> +
> +    /* Get chassis mac */
> +    if (!chassis_get_mac(chassis, network, &chassis_mac)) {
> +        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 1);
> +        /* Keeping the log level low for backward compatibility.
> +         * Chassis mac is a new configuration.
> +         */
> +        VLOG_DBG_RL(&rl, "Could not get chassis mac for network: %s", network);
> +        return;
> +    }
> +
> +    for (int i = 0; i < ld->n_peer_ports; i++) {
> +        const struct sbrec_port_binding *rport_binding = ld->peer_ports[i];
> +        struct eth_addr router_port_mac;
> +        char *err_str = NULL;
> +        struct match match;
> +        struct ofpact_mac *replace_mac;
> +
> +        /* Table 65, priority 150.
> +         * =======================
> +         *
> +         * Implements output to localnet port.
> +         * a. Flow replaces ingress router port mac with a chassis mac.
> +         * b. Flow appends the vlan id localnet port is configured with.
> +         */
> +        match_init_catchall(&match);
> +        ofpbuf_clear(ofpacts_p);
> +
> +        ovs_assert(rport_binding->n_mac == 1);
> +        if ((err_str = str_to_mac(rport_binding->mac[0], &router_port_mac))) {
> +            /* Parsing of mac failed. */
> +            VLOG_WARN("Parsing of router port mac failed for router port: %s, "
> +                      "with error: %s", rport_binding->logical_port, err_str);
> +            free(err_str);
> +            return;
> +        }
> +
> +        /* Replace Router mac flow */
> +        match_set_metadata(&match, htonll(dp_key));
> +        match_set_reg(&match, MFF_LOG_OUTPORT - MFF_REG0, port_key);
> +        match_set_dl_src(&match, router_port_mac);
> +
> +        replace_mac = ofpact_put_SET_ETH_SRC(ofpacts_p);
> +        replace_mac->mac = chassis_mac;
> +
> +        if (tag) {
> +            struct ofpact_vlan_vid *vlan_vid;
> +            vlan_vid = ofpact_put_SET_VLAN_VID(ofpacts_p);
> +            vlan_vid->vlan_vid = tag;
> +            vlan_vid->push_vlan_if_needed = true;
> +        }
> +
> +        ofpact_put_OUTPUT(ofpacts_p)->port = ofport;
> +
> +        ofctrl_add_flow(flow_table, OFTABLE_LOG_TO_PHY, 150, 0,
> +                        &match, ofpacts_p, &localnet_port->header_.uuid);
> +    }
> +}
> +
> +static void
>  put_local_common_flows(uint32_t dp_key, uint32_t port_key,
>                         uint32_t parent_port_key,
>                         const struct zone_ids *zone_ids,
> @@ -707,6 +795,13 @@ consider_port_binding(struct ovsdb_idl_index *sbrec_port_binding_by_name,
>          }
>          ofctrl_add_flow(flow_table, OFTABLE_LOG_TO_PHY, 100, 0,
>                          &match, ofpacts_p, &binding->header_.uuid);
> +
> +        if (!strcmp(binding->type, "localnet")) {
> +            put_replace_router_port_mac_flows(binding, chassis,
> +                                              local_datapaths, ofpacts_p,
> +                                              ofport, flow_table);
> +        }
> +
>      } else if (!tun && !is_ha_remote) {
>          /* Remote port connected by localnet port */
>          /* Table 33, priority 100.
> diff --git a/ovn/ovn-architecture.7.xml b/ovn/ovn-architecture.7.xml
> index 8c9e106..6275db1 100644
> --- a/ovn/ovn-architecture.7.xml
> +++ b/ovn/ovn-architecture.7.xml
> @@ -1407,6 +1407,30 @@
>        egress pipeline of the destination localnet logical switch datapath
>        and goes out of the integration bridge to the provider bridge (
>        belonging to the destination logical switch) via the localnet port.
> +      While sending the packet to provider bridge, we also replace router
> +      port mac as source mac with a chassis unique mac.
> +
> +      This chassis unique mac is configured as global ovs config on each
> +      chassis (eg. via "<code>ovs-vsctl set open . external-ids:
> +      ovn-chassis-mac-mappings="phys:aa:bb:cc:dd:ee:$i$i"</code>"). More
> +      details on this config are present in <code>ovn-controller</code>(8).
> +
> +      If the above is not configured, then source mac would be the router
> +      port mac. This could create a problem if we have more than one chassis:
> +      since the router port is distributed, the same mac,vlan tuple
> +      will be seen by the physical network from other chassis as well.
> +      This could cause some/all of these issues:
> +      <ul>
> +        <li>
> +          Continuous mac moves in top of the rack switch (TOR).
> +        </li>
> +        <li>
> +          TOR dropping the traffic, which is causing continuous mac moves.
> +        </li>
> +        <li>
> +          TOR blocking the ports from which mac moves are happening.
> +        </li>
> +      </ul>
>      </li>
>
>      <li>
> diff --git a/ovn/ovn-sb.xml b/ovn/ovn-sb.xml
> index 1a2bc1d..89e88c4 100644
> --- a/ovn/ovn-sb.xml
> +++ b/ovn/ovn-sb.xml
> @@ -301,6 +301,14 @@
>        See <code>ovn-controller</code>(8) for more information.
>      </column>
>
> +    <column name="external_ids" key="ovn-chassis-mac-mappings">
> +      <code>ovn-controller</code> populates this key with the set of options
> +      configured in the <ref table="Open_vSwitch"
> +      column="external_ids:ovn-chassis-mac-mappings"/> column of the
> +      Open_vSwitch database's <ref table="Open_vSwitch" db="Open_vSwitch"/>
> +      table. See <code>ovn-controller</code>(8) for more information.
> +    </column>
> +
>      <group title="Common Columns">
>        The overall purpose of these columns is described under <code>Common
>        Columns</code> at the beginning of this document.
> diff --git a/tests/ovn.at b/tests/ovn.at
> index daf85a5..d6cbb7b 100644
> --- a/tests/ovn.at
> +++ b/tests/ovn.at
> @@ -14017,3 +14017,200 @@ ovn-hv4-0
>
>  OVN_CLEANUP([hv1], [hv2], [hv3])
>  AT_CLEANUP
> +
> +AT_SETUP([ovn -- 2 HVs, 2 lports/HV, localnet ports, DVR chassis mac])
> +ovn_start
> +
> +
> +# In this test case we create 2 switches, all connected to same
> +# physical network (through br-phys on each HV). Each switch has
> +# 1 VIF. Each HV has 1 VIF port. The first digit
> +# of VIF port name indicates the hypervisor it is bound to, e.g.
> +# lp23 means VIF 3 on hv2.
> +#
> +# Each switch's VLAN tag and their logical switch ports are:
> +#   - ls1:
> +#       - tagged with VLAN 101
> +#       - ports: lp11
> +#   - ls2:
> +#       - tagged with VLAN 201
> +#       - ports: lp22
> +#
> +# Note: a localnet port is created for each switch to connect to
> +# physical network.
> +
> +for i in 1 2; do
> +    ls_name=ls$i
> +    ovn-nbctl ls-add $ls_name
> +    ln_port_name=ln$i
> +    if test $i -eq 1; then
> +        ovn-nbctl lsp-add $ls_name $ln_port_name "" 101
> +    elif test $i -eq 2; then
> +        ovn-nbctl lsp-add $ls_name $ln_port_name "" 201
> +    fi
> +    ovn-nbctl lsp-set-addresses $ln_port_name unknown
> +    ovn-nbctl lsp-set-type $ln_port_name localnet
> +    ovn-nbctl lsp-set-options $ln_port_name network_name=phys
> +done
> +
> +# lsp_to_ls LSP
> +#
> +# Prints the name of the logical switch that contains LSP.
> +lsp_to_ls () {
> +    case $1 in dnl (
> +        lp?[[11]]) echo ls1 ;; dnl (
> +        lp?[[12]]) echo ls2 ;; dnl (
> +        *) AT_FAIL_IF([:]) ;;
> +    esac
> +}
> +
> +vif_to_ls () {
> +    case $1 in dnl (
> +        vif?[[11]]) echo ls1 ;; dnl (
> +        vif?[[12]]) echo ls2 ;; dnl (
> +        *) AT_FAIL_IF([:]) ;;
> +    esac
> +}
> +
> +hv_to_num () {
> +    case $1 in dnl (
> +        hv1) echo 1 ;; dnl (
> +        hv2) echo 2 ;; dnl (
> +        *) AT_FAIL_IF([:]) ;;
> +    esac
> +}
> +
> +vif_to_num () {
> +    case $1 in dnl (
> +        vif22) echo 22 ;; dnl (
> +        vif21) echo 21 ;; dnl (
> +        *) AT_FAIL_IF([:]) ;;
> +    esac
> +}
> +
> +vif_to_hv () {
> +    case $1 in dnl (
> +        vif[[1]]?) echo hv1 ;; dnl (
> +        vif[[2]]?) echo hv2 ;; dnl (
> +        *) AT_FAIL_IF([:]) ;;
> +    esac
> +}
> +
> +vif_to_lrp () {
> +    echo router-to-`vif_to_ls $1`
> +}
> +
> +hv_to_chassis_mac () {
> +     case $1 in dnl (
> +        hv[[1]]) echo aa:bb:cc:dd:ee:11 ;; dnl (
> +        hv[[2]]) echo aa:bb:cc:dd:ee:22 ;; dnl (
> +        *) AT_FAIL_IF([:]) ;;
> +    esac
> +}
> +
> +ip_to_hex() {
> +       printf "%02x%02x%02x%02x" "$@"
> +}
> +
> +net_add n1
> +for i in 1 2; do
> +    sim_add hv$i
> +    as hv$i
> +    ovs-vsctl add-br br-phys
> +    ovs-vsctl set open . external-ids:ovn-bridge-mappings=phys:br-phys
> +    ovs-vsctl set open . external-ids:ovn-chassis-mac-mappings="phys:aa:bb:cc:dd:ee:$i$i"
> +    ovn_attach n1 br-phys 192.168.0.$i
> +
> +    ovs-vsctl add-port br-int vif$i$i -- \
> +        set Interface vif$i$i external-ids:iface-id=lp$i$i \
> +                              options:tx_pcap=hv$i/vif$i$i-tx.pcap \
> +                              options:rxq_pcap=hv$i/vif$i$i-rx.pcap \
> +                              ofport-request=$i$i
> +
> +    lsp_name=lp$i$i
> +    ls_name=$(lsp_to_ls $lsp_name)
> +
> +    ovn-nbctl lsp-add $ls_name $lsp_name
> +    ovn-nbctl lsp-set-addresses $lsp_name "f0:00:00:00:00:$i$i 192.168.$i.$i"
> +    ovn-nbctl lsp-set-port-security $lsp_name f0:00:00:00:00:$i$i
> +
> +    OVS_WAIT_UNTIL([test x`ovn-nbctl lsp-get-up $lsp_name` = xup])
> +
> +done
> +
> +ovn-nbctl lr-add router
> +ovn-nbctl lrp-add router router-to-ls1 00:00:01:01:02:03 192.168.1.3/24
> +ovn-nbctl lrp-add router router-to-ls2 00:00:01:01:02:05 192.168.2.3/24
> +
> +ovn-nbctl lsp-add ls1 ls1-to-router -- set Logical_Switch_Port ls1-to-router type=router options:router-port=router-to-ls1 -- lsp-set-addresses ls1-to-router router
> +ovn-nbctl lsp-add ls2 ls2-to-router -- set Logical_Switch_Port ls2-to-router type=router options:router-port=router-to-ls2 -- lsp-set-addresses ls2-to-router router
> +
> +ovn-nbctl --wait=sb sync
> +#ovn-sbctl dump-flows
> +
> +ovn-nbctl show
> +ovn-sbctl show
> +
> +OVN_POPULATE_ARP
> +
> +test_ip() {
> +    # This packet has bad checksums but logical L3 routing doesn't check.
> +    local inport=$1 src_mac=$2 dst_mac=$3 src_ip=$4 dst_ip=$5
> +    local packet=${dst_mac}${src_mac}08004500001c0000000040110000${src_ip}${dst_ip}0035111100080000
> +    shift; shift; shift; shift; shift
> +    hv=`vif_to_hv $inport`
> +    hv_num=`hv_to_num $hv`
> +    chassis_mac=`hv_to_chassis_mac $hv`
> +    as $hv ovs-appctl netdev-dummy/receive $inport $packet
> +    #as $hv ovs-appctl ofproto/trace br-int in_port=$inport $packet
> +    in_ls=`vif_to_ls $inport`
> +    in_lrp=`vif_to_lrp $inport`
> +    for outport; do
> +        out_ls=`vif_to_ls $outport`
> +        if test $in_ls = $out_ls; then
> +            # Ports on the same logical switch receive exactly the same packet.
> +            echo $packet
> +        else
> +            # Routing decrements TTL and updates source and dest MAC
> +            # (and checksum).
> +            outport_num=`vif_to_num $outport`
> +            out_lrp=`vif_to_lrp $outport`
> +            echo f000000000${outport_num}aabbccddee${hv_num}${hv_num}08004500001c00000000"3f1101"00${src_ip}${dst_ip}0035111100080000
> +        fi >> $outport.expected
> +    done
> +}
> +
> +# Dump a bunch of info helpful for debugging if there's a failure.
> +
> +echo "------ OVN dump ------"
> +ovn-nbctl show
> +ovn-sbctl show
> +
> +echo "------ hv1 dump ------"
> +as hv1 ovs-vsctl show
> +as hv1 ovs-vsctl list Open_Vswitch
> +
> +echo "------ hv2 dump ------"
> +as hv2 ovs-vsctl show
> +as hv2 ovs-vsctl list Open_Vswitch
> +
> +echo "Send traffic"
> +sip=`ip_to_hex 192 168 1 1`
> +dip=`ip_to_hex 192 168 2 2`
> +test_ip vif11 f00000000011  000001010203 $sip $dip vif22
> +
> +sleep 1
>

I think you can delete this sleep. It adds no value.



> +
> +echo "----------- Post Traffic hv1 dump -----------"
> +as hv1 ovs-ofctl -O OpenFlow13 dump-flows br-int
> +as hv1 ovs-appctl fdb/show br-phys
> +
> +echo "----------- Post Traffic hv2 dump -----------"
> +as hv2 ovs-ofctl -O OpenFlow13 dump-flows br-int
> +as hv2 ovs-appctl fdb/show br-phys
> +
> +OVN_CHECK_PACKETS([hv2/vif22-tx.pcap], [vif22.expected])
> +
> +OVN_CLEANUP([hv1],[hv2])
> +
> +AT_CLEANUP
> --
> 1.8.3.1
Ankur Sharma June 17, 2019, 8:38 p.m. UTC | #2
Hi Numan,

Thanks for the Ack.
Sent out v11, addressing your comment.

Thanks again.

Regards,
Ankur

From: Numan Siddique <nusiddiq@redhat.com>
Sent: Monday, June 17, 2019 3:53 AM
To: Ankur Sharma <ankur.sharma@nutanix.com>
Cc: ovs-dev@openvswitch.org
Subject: Re: [ovs-dev] [PATCH v10] OVN: Enable E-W Traffic, Vlan backed DVR



On Wed, Jun 12, 2019 at 4:47 AM Ankur Sharma <ankur.sharma@nutanix.com> wrote:
Background:
[1] https://mail.openvswitch.org/pipermail/ovs-dev/2018-October/353066.html
[2] https://docs.google.com/document/d/1uoQH478wM1OZ16HrxzbOUvk5LvFnfNEWbkPT6Zmm9OU/edit?usp=sharing

Key difference between an overlay logical switch and
vlan backed logical switch is that for vlan logical switches
packets are not encapsulated.

Hence, if a distributed router port is connected to vlan backed
logical switch, then router port mac as source mac could be
seen from multiple hypervisors. Same <mac,vlan> pairs coming
from multiple ports from a top of the rack switch (TOR) perspective
could be seen as a security threat and it could send alarms, drop
the packets or block the ports etc.

This patch addresses the same by introducing the concept of chassis mac.
A chassis mac is CMS provisioned unique mac per chassis. For any routed packet
(i.e source mac is router port mac) going on the wire on a vlan type
logical switch, we will replace its source mac with chassis mac.

This replacing of source mac with chassis mac will happen in table=65
of the logical switch datapath. A flow is added at priority 150, which
matches the source mac and replaces it with chassis mac if the value
is a router port mac.

Example flow:
cookie=0x0, duration=67765.830s, table=65, n_packets=0, n_bytes=0,
idle_age=65534, hard_age=65534, priority=150,reg15=0x1,metadata=0x4,
dl_src=00:00:01:01:02:03 actions=mod_dl_src:aa:bb:cc:dd:ee:ff,
mod_vlan_vid:1000,output:16

Here, 00:00:01:01:02:03 is router port mac and aa:bb:cc:dd:ee:ff
is chassis mac.

Signed-off-by: Ankur Sharma <ankur.sharma@nutanix.com>

Thanks Ankur for the patch.

Acked-by: Numan Siddique <nusiddiq@redhat.com>

There is just one small minor comment. It would be nice if you can address it,

Thanks
Numan


---
 ovn/controller/binding.c            |  12 +--
 ovn/controller/chassis.c            |  64 +++++++++++-
 ovn/controller/chassis.h            |   4 +
 ovn/controller/ovn-controller.8.xml |  10 ++
 ovn/controller/ovn-controller.c     |   4 +-
 ovn/controller/ovn-controller.h     |   5 +-
 ovn/controller/physical.c           |  95 +++++++++++++++++
 ovn/ovn-architecture.7.xml          |  24 +++++
 ovn/ovn-sb.xml                      |   8 ++
 tests/ovn.at                        | 197 ++++++++++++++++++++++++++++++++++++
 10 files changed, 411 insertions(+), 12 deletions(-)

diff --git a/ovn/controller/binding.c b/ovn/controller/binding.c
index b62b3da..c73d1aa 100644
--- a/ovn/controller/binding.c
+++ b/ovn/controller/binding.c
@@ -159,13 +159,11 @@ add_local_datapath__(struct ovsdb_idl_index *sbrec_datapath_binding_by_key,
                                          sbrec_port_binding_by_name,
                                          peer->datapath, false,
                                          depth + 1, local_datapaths);
-                    ld->n_peer_dps++;
-                    ld->peer_dps = xrealloc(
-                            ld->peer_dps,
-                            ld->n_peer_dps * sizeof *ld->peer_dps);
-                    ld->peer_dps[ld->n_peer_dps - 1] = datapath_lookup_by_key(
-                        sbrec_datapath_binding_by_key,
-                        peer->datapath->tunnel_key);
+                    ld->n_peer_ports++;
+                    ld->peer_ports = xrealloc(ld->peer_ports,
+                                              ld->n_peer_ports *
+                                              sizeof *ld->peer_ports);
+                    ld->peer_ports[ld->n_peer_ports - 1] = peer;
                 }
             }
         }
diff --git a/ovn/controller/chassis.c b/ovn/controller/chassis.c
index 0f537f1..8403212 100644
--- a/ovn/controller/chassis.c
+++ b/ovn/controller/chassis.c
@@ -23,6 +23,7 @@
 #include "lib/vswitch-idl.h"
 #include "openvswitch/dynamic-string.h"
 #include "openvswitch/vlog.h"
+#include "openvswitch/ofp-parse.h"
 #include "ovn/lib/chassis-index.h"
 #include "ovn/lib/ovn-sb-idl.h"
 #include "ovn-controller.h"
@@ -69,6 +70,12 @@ get_bridge_mappings(const struct smap *ext_ids)
 }

 static const char *
+get_chassis_mac_mappings(const struct smap *ext_ids)
+{
+    return smap_get_def(ext_ids, "ovn-chassis-mac-mappings", "");
+}
+
+static const char *
 get_cms_options(const struct smap *ext_ids)
 {
     return smap_get_def(ext_ids, "ovn-cms-options", "");
@@ -162,6 +169,7 @@ chassis_run(struct ovsdb_idl_txn *ovnsb_idl_txn,
     const char *datapath_type =
         br_int && br_int->datapath_type ? br_int->datapath_type : "";
     const char *cms_options = get_cms_options(&cfg->external_ids);
+    const char *chassis_macs = get_chassis_mac_mappings(&cfg->external_ids);

     struct ds iface_types = DS_EMPTY_INITIALIZER;
     ds_put_cstr(&iface_types, "");
@@ -190,18 +198,22 @@ chassis_run(struct ovsdb_idl_txn *ovnsb_idl_txn,
             = smap_get_def(&chassis_rec->external_ids, "iface-types", "");
         const char *chassis_cms_options
             = get_cms_options(&chassis_rec->external_ids);
+        const char *chassis_mac_mappings
+            = get_chassis_mac_mappings(&chassis_rec->external_ids);

         /* If any of the external-ids should change, update them. */
         if (strcmp(bridge_mappings, chassis_bridge_mappings) ||
             strcmp(datapath_type, chassis_datapath_type) ||
             strcmp(iface_types_str, chassis_iface_types) ||
-            strcmp(cms_options, chassis_cms_options)) {
+            strcmp(cms_options, chassis_cms_options) ||
+            strcmp(chassis_macs, chassis_mac_mappings)) {
             struct smap new_ids;
             smap_clone(&new_ids, &chassis_rec->external_ids);
             smap_replace(&new_ids, "ovn-bridge-mappings", bridge_mappings);
             smap_replace(&new_ids, "datapath-type", datapath_type);
             smap_replace(&new_ids, "iface-types", iface_types_str);
             smap_replace(&new_ids, "ovn-cms-options", cms_options);
+            smap_replace(&new_ids, "ovn-chassis-mac-mappings", chassis_macs);
             sbrec_chassis_verify_external_ids(chassis_rec);
             sbrec_chassis_set_external_ids(chassis_rec, &new_ids);
             smap_destroy(&new_ids);
@@ -319,6 +331,56 @@ chassis_run(struct ovsdb_idl_txn *ovnsb_idl_txn,
     return chassis_rec;
 }

+bool
+chassis_get_mac(const struct sbrec_chassis *chassis_rec,
+                const char *bridge_mapping,
+                struct eth_addr *chassis_mac)
+{
+    const char *tokens
+        = get_chassis_mac_mappings(&chassis_rec->external_ids);
+
+    if (!strlen(tokens)) {
+       return false;
+    }
+
+    char *save_ptr = NULL;
+    char *token;
+    bool ret = false;
+    char *tokstr = xstrdup(tokens);
+
+    /* Format for a chassis mac configuration is:
+     * ovn-chassis-mac-mappings="bridge-name1:MAC1,bridge-name2:MAC2"
+     */
+    for (token = strtok_r(tokstr, ",", &save_ptr);
+         token != NULL;
+         token = strtok_r(NULL, ",", &save_ptr)) {
+        char *save_ptr2 = NULL;
+        char *chassis_mac_bridge = strtok_r(token, ":", &save_ptr2);
+        char *chassis_mac_str = strtok_r(NULL, "", &save_ptr2);
+
+        if (!strcmp(chassis_mac_bridge, bridge_mapping)) {
+            struct eth_addr temp_mac;
+            char *err_str = NULL;
+
+            ret = true;
+
+            /* Return the first chassis mac. */
+            if ((err_str = str_to_mac(chassis_mac_str, &temp_mac))) {
+                free(err_str);
+                ret = false;
+                continue;
+            }
+
+            *chassis_mac = temp_mac;
+            break;
+        }
+    }
+
+    free(tokstr);
+
+    return ret;
+}
+
 /* Returns true if the database is all cleaned up, false if more work is
  * required. */
 bool
diff --git a/ovn/controller/chassis.h b/ovn/controller/chassis.h
index 9847e19..e3fbc31 100644
--- a/ovn/controller/chassis.h
+++ b/ovn/controller/chassis.h
@@ -26,6 +26,7 @@ struct ovsrec_open_vswitch_table;
 struct sbrec_chassis;
 struct sbrec_chassis_table;
 struct sset;
+struct eth_addr;

 void chassis_register_ovs_idl(struct ovsdb_idl *);
 const struct sbrec_chassis *chassis_run(
@@ -36,5 +37,8 @@ const struct sbrec_chassis *chassis_run(
     const struct sset *transport_zones);
 bool chassis_cleanup(struct ovsdb_idl_txn *ovnsb_idl_txn,
                      const struct sbrec_chassis *);
+bool chassis_get_mac(const struct sbrec_chassis *chassis,
+                     const char *bridge_mapping,
+                     struct eth_addr *chassis_mac);

 #endif /* ovn/chassis.h */
diff --git a/ovn/controller/ovn-controller.8.xml b/ovn/controller/ovn-controller.8.xml
index 9721d9a..18f66fe 100644
--- a/ovn/controller/ovn-controller.8.xml
+++ b/ovn/controller/ovn-controller.8.xml
@@ -182,6 +182,16 @@
           transport zone.
         </p>
       </dd>
+      <dt><code>external_ids:ovn-chassis-mac-mappings</code></dt>
+      <dd>
+        A list of key-value pairs that map a chassis specific mac to
+        a physical network name. An example
+        value mapping two chassis macs to two physical network names would be:
+        <code>physnet1:aa:bb:cc:dd:ee:ff,physnet2:a1:b2:c3:d4:e5:f6</code>.
+        These are the macs that ovn-controller will replace a router port
+        mac with, if packet is going from a distributed router port on
+        vlan type logical switch.
+      </dd>
     </dl>

     <p>
diff --git a/ovn/controller/ovn-controller.c b/ovn/controller/ovn-controller.c
index 6019016..315a88b 100644
--- a/ovn/controller/ovn-controller.c
+++ b/ovn/controller/ovn-controller.c
@@ -899,7 +899,7 @@ en_runtime_data_cleanup(struct engine_node *node)
     struct local_datapath *cur_node, *next_node;
     HMAP_FOR_EACH_SAFE (cur_node, next_node, hmap_node,
                         &data->local_datapaths) {
-        free(cur_node->peer_dps);
+        free(cur_node->peer_ports);
         hmap_remove(&data->local_datapaths, &cur_node->hmap_node);
         free(cur_node);
     }
@@ -929,7 +929,7 @@ en_runtime_data_run(struct engine_node *node)
     } else {
         struct local_datapath *cur_node, *next_node;
         HMAP_FOR_EACH_SAFE (cur_node, next_node, hmap_node, local_datapaths) {
-            free(cur_node->peer_dps);
+            free(cur_node->peer_ports);
             hmap_remove(local_datapaths, &cur_node->hmap_node);
             free(cur_node);
         }
diff --git a/ovn/controller/ovn-controller.h b/ovn/controller/ovn-controller.h
index 6afd727..a4c1309 100644
--- a/ovn/controller/ovn-controller.h
+++ b/ovn/controller/ovn-controller.h
@@ -59,8 +59,9 @@ struct local_datapath {
     /* True if this datapath contains an l3gateway port located on this
      * hypervisor. */
     bool has_local_l3gateway;
-    const struct sbrec_datapath_binding **peer_dps;
-    size_t n_peer_dps;
+
+    const struct sbrec_port_binding **peer_ports;
+    size_t n_peer_ports;
 };

 struct local_datapath *get_local_datapath(const struct hmap *,
diff --git a/ovn/controller/physical.c b/ovn/controller/physical.c
index c8dc282..af587a5 100644
--- a/ovn/controller/physical.c
+++ b/ovn/controller/physical.c
@@ -20,6 +20,7 @@
 #include "ha-chassis.h"
 #include "lflow.h"
 #include "lport.h"
+#include "chassis.h"
 #include "lib/bundle.h"
 #include "openvswitch/poll-loop.h"
 #include "lib/uuid.h"
@@ -30,6 +31,7 @@
 #include "openvswitch/ofp-actions.h"
 #include "openvswitch/ofpbuf.h"
 #include "openvswitch/vlog.h"
+#include "openvswitch/ofp-parse.h"
 #include "ovn-controller.h"
 #include "ovn/lib/chassis-index.h"
 #include "ovn/lib/ovn-sb-idl.h"
@@ -236,6 +238,92 @@ get_zone_ids(const struct sbrec_port_binding *binding,
 }

 static void
+put_replace_router_port_mac_flows(const struct
+                                  sbrec_port_binding *localnet_port,
+                                  const struct sbrec_chassis *chassis,
+                                  const struct hmap *local_datapaths,
+                                  struct ofpbuf *ofpacts_p,
+                                  ofp_port_t ofport,
+                                  struct ovn_desired_flow_table *flow_table)
+{
+    struct local_datapath *ld = get_local_datapath(local_datapaths,
+                                                   localnet_port->datapath->
+                                                   tunnel_key);
+    ovs_assert(ld);
+
+    uint32_t dp_key = localnet_port->datapath->tunnel_key;
+    uint32_t port_key = localnet_port->tunnel_key;
+    int tag = localnet_port->tag ? *localnet_port->tag : 0;
+    const char *network = smap_get(&localnet_port->options, "network_name");
+    struct eth_addr chassis_mac;
+
+    if (!network) {
+        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 1);
+        VLOG_WARN_RL(&rl, "Physical network not configured for datapath: %ld "
+                     "with localnet port",
+                     localnet_port->datapath->tunnel_key);
+        return;
+    }
+
+    /* Get chassis mac */
+    if (!chassis_get_mac(chassis, network, &chassis_mac)) {
+        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 1);
+        /* Keeping the log level low for backward compatibility.
+         * Chassis mac is a new configuration.
+         */
+        VLOG_DBG_RL(&rl, "Could not get chassis mac for network: %s", network);
+        return;
+    }
+
+    for (int i = 0; i < ld->n_peer_ports; i++) {
+        const struct sbrec_port_binding *rport_binding = ld->peer_ports[i];
+        struct eth_addr router_port_mac;
+        char *err_str = NULL;
+        struct match match;
+        struct ofpact_mac *replace_mac;
+
+        /* Table 65, priority 150.
+         * =======================
+         *
+         * Implements output to localnet port.
+         * a. Flow replaces ingress router port mac with a chassis mac.
+         * b. Flow appends the vlan id localnet port is configured with.
+         */
+        match_init_catchall(&match);
+        ofpbuf_clear(ofpacts_p);
+
+        ovs_assert(rport_binding->n_mac == 1);
+        if ((err_str = str_to_mac(rport_binding->mac[0], &router_port_mac))) {
+            /* Parsing of mac failed. */
+            VLOG_WARN("Parsing or router port mac failed for router port: %s, "
+                      "with error: %s", rport_binding->logical_port, err_str);
+            free(err_str);
+            return;
+        }
+
+        /* Replace Router mac flow */
+        match_set_metadata(&match, htonll(dp_key));
+        match_set_reg(&match, MFF_LOG_OUTPORT - MFF_REG0, port_key);
+        match_set_dl_src(&match, router_port_mac);
+
+        replace_mac = ofpact_put_SET_ETH_SRC(ofpacts_p);
+        replace_mac->mac = chassis_mac;
+
+        if (tag) {
+            struct ofpact_vlan_vid *vlan_vid;
+            vlan_vid = ofpact_put_SET_VLAN_VID(ofpacts_p);
+            vlan_vid->vlan_vid = tag;
+            vlan_vid->push_vlan_if_needed = true;
+        }
+
+        ofpact_put_OUTPUT(ofpacts_p)->port = ofport;
+
+        ofctrl_add_flow(flow_table, OFTABLE_LOG_TO_PHY, 150, 0,
+                        &match, ofpacts_p, &localnet_port->header_.uuid);
+    }
+}
+
+static void
 put_local_common_flows(uint32_t dp_key, uint32_t port_key,
                        uint32_t parent_port_key,
                        const struct zone_ids *zone_ids,
@@ -707,6 +795,13 @@ consider_port_binding(struct ovsdb_idl_index *sbrec_port_binding_by_name,
         }
         ofctrl_add_flow(flow_table, OFTABLE_LOG_TO_PHY, 100, 0,
                         &match, ofpacts_p, &binding->header_.uuid);
+
+        if (!strcmp(binding->type, "localnet")) {
+            put_replace_router_port_mac_flows(binding, chassis,
+                                              local_datapaths, ofpacts_p,
+                                              ofport, flow_table);
+        }
+
     } else if (!tun && !is_ha_remote) {
         /* Remote port connected by localnet port */
         /* Table 33, priority 100.
diff --git a/ovn/ovn-architecture.7.xml b/ovn/ovn-architecture.7.xml
index 8c9e106..6275db1 100644
--- a/ovn/ovn-architecture.7.xml
+++ b/ovn/ovn-architecture.7.xml
@@ -1407,6 +1407,30 @@
       egress pipeline of the destination localnet logical switch datapath
       and goes out of the integration bridge to the provider bridge (
       belonging to the destination logical switch) via the localnet port.
+      While sending the packet to provider bridge, we also replace router
+      port mac as source mac with a chassis unique mac.
+
+      This chassis unique mac is configured as global ovs config on each
+      chassis (eg. via "<code>ovs-vsctl set open . external-ids:
+      ovn-chassis-mac-mappings="phys:aa:bb:cc:dd:ee:$i$i"</code>").More
+      details on this config are present in <code>ovn-controller</code>(8).
+
+      If the above is not configured, then source mac would be the router
+      port mac. This could create problem if we have more than one chassis.
+      This is because, since the router port is distributed, hence same
+      mac,vlan tuple will seen by physical network from other chassis
+      as well. This could cause some/all of these issues:
+      <ul>
+        <li>
+          Continous mac moves in top of the rack switch (TOR).
+        </li>
+        <li>
+          TOR dropping the traffic, which is causing continous mac moves.
+        </li>
+        <li>
+          TOR blocking the ports from which mac moves are happening.
+        </li>
+      </ul>
     </li>

     <li>
diff --git a/ovn/ovn-sb.xml b/ovn/ovn-sb.xml
index 1a2bc1d..89e88c4 100644
--- a/ovn/ovn-sb.xml
+++ b/ovn/ovn-sb.xml
@@ -301,6 +301,14 @@
       See <code>ovn-controller</code>(8) for more information.
     </column>

+    <column name="external_ids" key="ovn-chassis-mac-mappings">
+      <code>ovn-controller</code> populates this key with the set of options
+      configured in the <ref table="Open_vSwitch"
+      column="external_ids:ovn-chassis-mac-mappings"/> column of the
+      Open_vSwitch database's <ref table="Open_vSwitch" db="Open_vSwitch"/>
+      table. See <code>ovn-controller</code>(8) for more information.
+    </column>
+
     <group title="Common Columns">
       The overall purpose of these columns is described under <code>Common
       Columns</code> at the beginning of this document.
diff --git a/tests/ovn.at b/tests/ovn.at
index daf85a5..d6cbb7b 100644
--- a/tests/ovn.at
+++ b/tests/ovn.at
@@ -14017,3 +14017,200 @@ ovn-hv4-0

 OVN_CLEANUP([hv1], [hv2], [hv3])
 AT_CLEANUP
+
+AT_SETUP([ovn -- 2 HVs, 2 lports/HV, localnet ports, DVR chassis mac])
+ovn_start
+
+
+# In this test cases we create 2 switches, all connected to same
+# physical network (through br-phys on each HV). Each switch has
+# 1 VIF. Each HV has 1 VIF port. The first digit
+# of VIF port name indicates the hypervisor it is bound to, e.g.
+# lp23 means VIF 3 on hv2.
+#
+# Each switch's VLAN tag and their logical switch ports are:
+#   - ls1:
+#       - tagged with VLAN 101
+#       - ports: lp11
+#   - ls2:
+#       - tagged with VLAN 201
+#       - ports: lp22
+#
+# Note: a localnet port is created for each switch to connect to
+# physical network.
+
+for i in 1 2; do
+    ls_name=ls$i
+    ovn-nbctl ls-add $ls_name
+    ln_port_name=ln$i
+    if test $i -eq 1; then
+        ovn-nbctl lsp-add $ls_name $ln_port_name "" 101
+    elif test $i -eq 2; then
+        ovn-nbctl lsp-add $ls_name $ln_port_name "" 201
+    fi
+    ovn-nbctl lsp-set-addresses $ln_port_name unknown
+    ovn-nbctl lsp-set-type $ln_port_name localnet
+    ovn-nbctl lsp-set-options $ln_port_name network_name=phys
+done
+
+# lsp_to_ls LSP
+#
+# Prints the name of the logical switch that contains LSP.
+lsp_to_ls () {
+    case $1 in dnl (
+        lp?[[11]]) echo ls1 ;; dnl (
+        lp?[[12]]) echo ls2 ;; dnl (
+        *) AT_FAIL_IF([:]) ;;
+    esac
+}
+
+vif_to_ls () {
+    case $1 in dnl (
+        vif?[[11]]) echo ls1 ;; dnl (
+        vif?[[12]]) echo ls2 ;; dnl (
+        *) AT_FAIL_IF([:]) ;;
+    esac
+}
+
+hv_to_num () {
+    case $1 in dnl (
+        hv1) echo 1 ;; dnl (
+        hv2) echo 2 ;; dnl (
+        *) AT_FAIL_IF([:]) ;;
+    esac
+}
+
+vif_to_num () {
+    case $1 in dnl (
+        vif22) echo 22 ;; dnl (
+        vif21) echo 21 ;; dnl (
+        *) AT_FAIL_IF([:]) ;;
+    esac
+}
+
+vif_to_hv () {
+    case $1 in dnl (
+        vif[[1]]?) echo hv1 ;; dnl (
+        vif[[2]]?) echo hv2 ;; dnl (
+        *) AT_FAIL_IF([:]) ;;
+    esac
+}
+
+vif_to_lrp () {
+    echo router-to-`vif_to_ls $1`
+}
+
+hv_to_chassis_mac () {
+     case $1 in dnl (
+        hv[[1]]) echo aa:bb:cc:dd:ee:11 ;; dnl (
+        hv[[2]]) echo aa:bb:cc:dd:ee:22 ;; dnl (
+        *) AT_FAIL_IF([:]) ;;
+    esac
+}
+
+ip_to_hex() {
+       printf "%02x%02x%02x%02x" "$@"
+}
+
+net_add n1
+for i in 1 2; do
+    sim_add hv$i
+    as hv$i
+    ovs-vsctl add-br br-phys
+    ovs-vsctl set open . external-ids:ovn-bridge-mappings=phys:br-phys
+    ovs-vsctl set open . external-ids:ovn-chassis-mac-mappings="phys:aa:bb:cc:dd:ee:$i$i"
+    ovn_attach n1 br-phys 192.168.0.$i
+
+    ovs-vsctl add-port br-int vif$i$i -- \
+        set Interface vif$i$i external-ids:iface-id=lp$i$i \
+                              options:tx_pcap=hv$i/vif$i$i-tx.pcap \
+                              options:rxq_pcap=hv$i/vif$i$i-rx.pcap \
+                              ofport-request=$i$i
+
+    lsp_name=lp$i$i
+    ls_name=$(lsp_to_ls $lsp_name)
+
+    ovn-nbctl lsp-add $ls_name $lsp_name
+    ovn-nbctl lsp-set-addresses $lsp_name "f0:00:00:00:00:$i$i 192.168.$i.$i"
+    ovn-nbctl lsp-set-port-security $lsp_name f0:00:00:00:00:$i$i
+
+    OVS_WAIT_UNTIL([test x`ovn-nbctl lsp-get-up $lsp_name` = xup])
+
+done
+
+ovn-nbctl lr-add router
+ovn-nbctl lrp-add router router-to-ls1 00:00:01:01:02:03 192.168.1.3/24
+ovn-nbctl lrp-add router router-to-ls2 00:00:01:01:02:05 192.168.2.3/24
+
+ovn-nbctl lsp-add ls1 ls1-to-router -- set Logical_Switch_Port ls1-to-router type=router options:router-port=router-to-ls1 -- lsp-set-addresses ls1-to-router router
+ovn-nbctl lsp-add ls2 ls2-to-router -- set Logical_Switch_Port ls2-to-router type=router options:router-port=router-to-ls2 -- lsp-set-addresses ls2-to-router router
+
+ovn-nbctl --wait=sb sync
+#ovn-sbctl dump-flows
+
+ovn-nbctl show
+ovn-sbctl show
+
+OVN_POPULATE_ARP
+
+test_ip() {
+    # This packet has bad checksums but logical L3 routing doesn't check.
+    local inport=$1 src_mac=$2 dst_mac=$3 src_ip=$4 dst_ip=$5
+    local packet=${dst_mac}${src_mac}08004500001c0000000040110000${src_ip}${dst_ip}0035111100080000
+    shift; shift; shift; shift; shift
+    hv=`vif_to_hv $inport`
+    hv_num=`hv_to_num $hv`
+    chassis_mac=`hv_to_chassis_mac $hv`
+    as $hv ovs-appctl netdev-dummy/receive $inport $packet
+    #as $hv ovs-appctl ofproto/trace br-int in_port=$inport $packet
+    in_ls=`vif_to_ls $inport`
+    in_lrp=`vif_to_lrp $inport`
+    for outport; do
+        out_ls=`vif_to_ls $outport`
+        if test $in_ls = $out_ls; then
+            # Ports on the same logical switch receive exactly the same packet.
+            echo $packet
+        else
+            # Routing decrements TTL and updates source and dest MAC
+            # (and checksum).
+            outport_num=`vif_to_num $outport`
+            out_lrp=`vif_to_lrp $outport`
+            echo f000000000${outport_num}aabbccddee${hv_num}${hv_num}08004500001c00000000"3f1101"00${src_ip}${dst_ip}0035111100080000
+        fi >> $outport.expected
+    done
+}
+
+# Dump a bunch of info helpful for debugging if there's a failure.
+
+echo "------ OVN dump ------"
+ovn-nbctl show
+ovn-sbctl show
+
+echo "------ hv1 dump ------"
+as hv1 ovs-vsctl show
+as hv1 ovs-vsctl list Open_Vswitch
+
+echo "------ hv2 dump ------"
+as hv2 ovs-vsctl show
+as hv2 ovs-vsctl list Open_Vswitch
+
+echo "Send traffic"
+sip=`ip_to_hex 192 168 1 1`
+dip=`ip_to_hex 192 168 2 2`
+test_ip vif11 f00000000011  000001010203 $sip $dip vif22
+
+sleep 1

I think you can delete this sleep. It adds no value.


+
+echo "----------- Post Traffic hv1 dump -----------"
+as hv1 ovs-ofctl -O OpenFlow13 dump-flows br-int
+as hv1 ovs-appctl fdb/show br-phys
+
+echo "----------- Post Traffic hv2 dump -----------"
+as hv2 ovs-ofctl -O OpenFlow13 dump-flows br-int
+as hv2 ovs-appctl fdb/show br-phys
+
+OVN_CHECK_PACKETS([hv2/vif22-tx.pcap], [vif22.expected])
+
+OVN_CLEANUP([hv1],[hv2])
+
+AT_CLEANUP
--
1.8.3.1

Patch

diff --git a/ovn/controller/binding.c b/ovn/controller/binding.c
index b62b3da..c73d1aa 100644
--- a/ovn/controller/binding.c
+++ b/ovn/controller/binding.c
@@ -159,13 +159,11 @@  add_local_datapath__(struct ovsdb_idl_index *sbrec_datapath_binding_by_key,
                                          sbrec_port_binding_by_name,
                                          peer->datapath, false,
                                          depth + 1, local_datapaths);
-                    ld->n_peer_dps++;
-                    ld->peer_dps = xrealloc(
-                            ld->peer_dps,
-                            ld->n_peer_dps * sizeof *ld->peer_dps);
-                    ld->peer_dps[ld->n_peer_dps - 1] = datapath_lookup_by_key(
-                        sbrec_datapath_binding_by_key,
-                        peer->datapath->tunnel_key);
+                    ld->n_peer_ports++;
+                    ld->peer_ports = xrealloc(ld->peer_ports,
+                                              ld->n_peer_ports *
+                                              sizeof *ld->peer_ports);
+                    ld->peer_ports[ld->n_peer_ports - 1] = peer;
                 }
             }
         }
diff --git a/ovn/controller/chassis.c b/ovn/controller/chassis.c
index 0f537f1..8403212 100644
--- a/ovn/controller/chassis.c
+++ b/ovn/controller/chassis.c
@@ -23,6 +23,7 @@ 
 #include "lib/vswitch-idl.h"
 #include "openvswitch/dynamic-string.h"
 #include "openvswitch/vlog.h"
+#include "openvswitch/ofp-parse.h"
 #include "ovn/lib/chassis-index.h"
 #include "ovn/lib/ovn-sb-idl.h"
 #include "ovn-controller.h"
@@ -69,6 +70,12 @@  get_bridge_mappings(const struct smap *ext_ids)
 }
 
 static const char *
+get_chassis_mac_mappings(const struct smap *ext_ids)
+{
+    return smap_get_def(ext_ids, "ovn-chassis-mac-mappings", "");
+}
+
+static const char *
 get_cms_options(const struct smap *ext_ids)
 {
     return smap_get_def(ext_ids, "ovn-cms-options", "");
@@ -162,6 +169,7 @@  chassis_run(struct ovsdb_idl_txn *ovnsb_idl_txn,
     const char *datapath_type =
         br_int && br_int->datapath_type ? br_int->datapath_type : "";
     const char *cms_options = get_cms_options(&cfg->external_ids);
+    const char *chassis_macs = get_chassis_mac_mappings(&cfg->external_ids);
 
     struct ds iface_types = DS_EMPTY_INITIALIZER;
     ds_put_cstr(&iface_types, "");
@@ -190,18 +198,22 @@  chassis_run(struct ovsdb_idl_txn *ovnsb_idl_txn,
             = smap_get_def(&chassis_rec->external_ids, "iface-types", "");
         const char *chassis_cms_options
             = get_cms_options(&chassis_rec->external_ids);
+        const char *chassis_mac_mappings
+            = get_chassis_mac_mappings(&chassis_rec->external_ids);
 
         /* If any of the external-ids should change, update them. */
         if (strcmp(bridge_mappings, chassis_bridge_mappings) ||
             strcmp(datapath_type, chassis_datapath_type) ||
             strcmp(iface_types_str, chassis_iface_types) ||
-            strcmp(cms_options, chassis_cms_options)) {
+            strcmp(cms_options, chassis_cms_options) ||
+            strcmp(chassis_macs, chassis_mac_mappings)) {
             struct smap new_ids;
             smap_clone(&new_ids, &chassis_rec->external_ids);
             smap_replace(&new_ids, "ovn-bridge-mappings", bridge_mappings);
             smap_replace(&new_ids, "datapath-type", datapath_type);
             smap_replace(&new_ids, "iface-types", iface_types_str);
             smap_replace(&new_ids, "ovn-cms-options", cms_options);
+            smap_replace(&new_ids, "ovn-chassis-mac-mappings", chassis_macs);
             sbrec_chassis_verify_external_ids(chassis_rec);
             sbrec_chassis_set_external_ids(chassis_rec, &new_ids);
             smap_destroy(&new_ids);
@@ -319,6 +331,56 @@  chassis_run(struct ovsdb_idl_txn *ovnsb_idl_txn,
     return chassis_rec;
 }
 
+bool
+chassis_get_mac(const struct sbrec_chassis *chassis_rec,
+                const char *bridge_mapping,
+                struct eth_addr *chassis_mac)
+{
+    const char *tokens
+        = get_chassis_mac_mappings(&chassis_rec->external_ids);
+
+    if (!strlen(tokens)) {
+       return false;
+    }
+
+    char *save_ptr = NULL;
+    char *token;
+    bool ret = false;
+    char *tokstr = xstrdup(tokens);
+
+    /* Format for a chassis mac configuration is:
+     * ovn-chassis-mac-mappings="bridge-name1:MAC1,bridge-name2:MAC2"
+     */
+    for (token = strtok_r(tokstr, ",", &save_ptr);
+         token != NULL;
+         token = strtok_r(NULL, ",", &save_ptr)) {
+        char *save_ptr2 = NULL;
+        char *chassis_mac_bridge = strtok_r(token, ":", &save_ptr2);
+        char *chassis_mac_str = strtok_r(NULL, "", &save_ptr2);
+
+        if (!strcmp(chassis_mac_bridge, bridge_mapping)) {
+            struct eth_addr temp_mac;
+            char *err_str = NULL;
+
+            ret = true;
+
+            /* Return the first chassis mac. */
+            if ((err_str = str_to_mac(chassis_mac_str, &temp_mac))) {
+                free(err_str);
+                ret = false;
+                continue;
+            }
+
+            *chassis_mac = temp_mac;
+            break;
+        }
+    }
+
+    free(tokstr);
+
+    return ret;
+}
+
 /* Returns true if the database is all cleaned up, false if more work is
  * required. */
 bool
diff --git a/ovn/controller/chassis.h b/ovn/controller/chassis.h
index 9847e19..e3fbc31 100644
--- a/ovn/controller/chassis.h
+++ b/ovn/controller/chassis.h
@@ -26,6 +26,7 @@  struct ovsrec_open_vswitch_table;
 struct sbrec_chassis;
 struct sbrec_chassis_table;
 struct sset;
+struct eth_addr;
 
 void chassis_register_ovs_idl(struct ovsdb_idl *);
 const struct sbrec_chassis *chassis_run(
@@ -36,5 +37,8 @@  const struct sbrec_chassis *chassis_run(
     const struct sset *transport_zones);
 bool chassis_cleanup(struct ovsdb_idl_txn *ovnsb_idl_txn,
                      const struct sbrec_chassis *);
+bool chassis_get_mac(const struct sbrec_chassis *chassis,
+                     const char *bridge_mapping,
+                     struct eth_addr *chassis_mac);
 
 #endif /* ovn/chassis.h */
diff --git a/ovn/controller/ovn-controller.8.xml b/ovn/controller/ovn-controller.8.xml
index 9721d9a..18f66fe 100644
--- a/ovn/controller/ovn-controller.8.xml
+++ b/ovn/controller/ovn-controller.8.xml
@@ -182,6 +182,16 @@ 
           transport zone.
         </p>
       </dd>
+      <dt><code>external_ids:ovn-chassis-mac-mappings</code></dt>
+      <dd>
+        A list of key-value pairs that map a chassis specific mac to
+        a physical network name. An example
+        value mapping two chassis macs to two physical network names would be:
+        <code>physnet1:aa:bb:cc:dd:ee:ff,physnet2:a1:b2:c3:d4:e5:f6</code>.
+        These are the macs that ovn-controller will replace a router port
+        mac with, if packet is going from a distributed router port on
+        vlan type logical switch.
+      </dd>
     </dl>
 
     <p>
diff --git a/ovn/controller/ovn-controller.c b/ovn/controller/ovn-controller.c
index 6019016..315a88b 100644
--- a/ovn/controller/ovn-controller.c
+++ b/ovn/controller/ovn-controller.c
@@ -899,7 +899,7 @@  en_runtime_data_cleanup(struct engine_node *node)
     struct local_datapath *cur_node, *next_node;
     HMAP_FOR_EACH_SAFE (cur_node, next_node, hmap_node,
                         &data->local_datapaths) {
-        free(cur_node->peer_dps);
+        free(cur_node->peer_ports);
         hmap_remove(&data->local_datapaths, &cur_node->hmap_node);
         free(cur_node);
     }
@@ -929,7 +929,7 @@  en_runtime_data_run(struct engine_node *node)
     } else {
         struct local_datapath *cur_node, *next_node;
         HMAP_FOR_EACH_SAFE (cur_node, next_node, hmap_node, local_datapaths) {
-            free(cur_node->peer_dps);
+            free(cur_node->peer_ports);
             hmap_remove(local_datapaths, &cur_node->hmap_node);
             free(cur_node);
         }
diff --git a/ovn/controller/ovn-controller.h b/ovn/controller/ovn-controller.h
index 6afd727..a4c1309 100644
--- a/ovn/controller/ovn-controller.h
+++ b/ovn/controller/ovn-controller.h
@@ -59,8 +59,9 @@  struct local_datapath {
     /* True if this datapath contains an l3gateway port located on this
      * hypervisor. */
     bool has_local_l3gateway;
-    const struct sbrec_datapath_binding **peer_dps;
-    size_t n_peer_dps;
+
+    const struct sbrec_port_binding **peer_ports;
+    size_t n_peer_ports;
 };
 
 struct local_datapath *get_local_datapath(const struct hmap *,
diff --git a/ovn/controller/physical.c b/ovn/controller/physical.c
index c8dc282..af587a5 100644
--- a/ovn/controller/physical.c
+++ b/ovn/controller/physical.c
@@ -20,6 +20,7 @@ 
 #include "ha-chassis.h"
 #include "lflow.h"
 #include "lport.h"
+#include "chassis.h"
 #include "lib/bundle.h"
 #include "openvswitch/poll-loop.h"
 #include "lib/uuid.h"
@@ -30,6 +31,7 @@ 
 #include "openvswitch/ofp-actions.h"
 #include "openvswitch/ofpbuf.h"
 #include "openvswitch/vlog.h"
+#include "openvswitch/ofp-parse.h"
 #include "ovn-controller.h"
 #include "ovn/lib/chassis-index.h"
 #include "ovn/lib/ovn-sb-idl.h"
@@ -236,6 +238,92 @@  get_zone_ids(const struct sbrec_port_binding *binding,
 }
 
 static void
+put_replace_router_port_mac_flows(const struct
+                                  sbrec_port_binding *localnet_port,
+                                  const struct sbrec_chassis *chassis,
+                                  const struct hmap *local_datapaths,
+                                  struct ofpbuf *ofpacts_p,
+                                  ofp_port_t ofport,
+                                  struct ovn_desired_flow_table *flow_table)
+{
+    struct local_datapath *ld = get_local_datapath(local_datapaths,
+                                                   localnet_port->datapath->
+                                                   tunnel_key);
+    ovs_assert(ld);
+
+    uint32_t dp_key = localnet_port->datapath->tunnel_key;
+    uint32_t port_key = localnet_port->tunnel_key;
+    int tag = localnet_port->tag ? *localnet_port->tag : 0;
+    const char *network = smap_get(&localnet_port->options, "network_name");
+    struct eth_addr chassis_mac;
+
+    if (!network) {
+        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 1);
+        VLOG_WARN_RL(&rl, "Physical network not configured for datapath: "
+                     "%"PRId64" with localnet port",
+                     localnet_port->datapath->tunnel_key);
+        return;
+    }
+
+    /* Get chassis mac */
+    if (!chassis_get_mac(chassis, network, &chassis_mac)) {
+        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 1);
+        /* Keeping the log level low for backward compatibility.
+         * Chassis mac is a new configuration.
+         */
+        VLOG_DBG_RL(&rl, "Could not get chassis mac for network: %s", network);
+        return;
+    }
+
+    for (size_t i = 0; i < ld->n_peer_ports; i++) {
+        const struct sbrec_port_binding *rport_binding = ld->peer_ports[i];
+        struct eth_addr router_port_mac;
+        char *err_str = NULL;
+        struct match match;
+        struct ofpact_mac *replace_mac;
+
+        /* Table 65, priority 150.
+         * =======================
+         *
+         * Implements output to localnet port.
+         * a. Flow replaces ingress router port mac with a chassis mac.
+         * b. Flow appends the vlan id localnet port is configured with.
+         */
+        match_init_catchall(&match);
+        ofpbuf_clear(ofpacts_p);
+
+        ovs_assert(rport_binding->n_mac == 1);
+        if ((err_str = str_to_mac(rport_binding->mac[0], &router_port_mac))) {
+            /* Parsing of mac failed. */
+            VLOG_WARN("Parsing of router port mac failed for router port: %s, "
+                      "with error: %s", rport_binding->logical_port, err_str);
+            free(err_str);
+            return;
+        }
+
+        /* Replace Router mac flow */
+        match_set_metadata(&match, htonll(dp_key));
+        match_set_reg(&match, MFF_LOG_OUTPORT - MFF_REG0, port_key);
+        match_set_dl_src(&match, router_port_mac);
+
+        replace_mac = ofpact_put_SET_ETH_SRC(ofpacts_p);
+        replace_mac->mac = chassis_mac;
+
+        if (tag) {
+            struct ofpact_vlan_vid *vlan_vid;
+            vlan_vid = ofpact_put_SET_VLAN_VID(ofpacts_p);
+            vlan_vid->vlan_vid = tag;
+            vlan_vid->push_vlan_if_needed = true;
+        }
+
+        ofpact_put_OUTPUT(ofpacts_p)->port = ofport;
+
+        ofctrl_add_flow(flow_table, OFTABLE_LOG_TO_PHY, 150, 0,
+                        &match, ofpacts_p, &localnet_port->header_.uuid);
+    }
+}
+
+static void
 put_local_common_flows(uint32_t dp_key, uint32_t port_key,
                        uint32_t parent_port_key,
                        const struct zone_ids *zone_ids,
@@ -707,6 +795,13 @@  consider_port_binding(struct ovsdb_idl_index *sbrec_port_binding_by_name,
         }
         ofctrl_add_flow(flow_table, OFTABLE_LOG_TO_PHY, 100, 0,
                         &match, ofpacts_p, &binding->header_.uuid);
+
+        if (!strcmp(binding->type, "localnet")) {
+            put_replace_router_port_mac_flows(binding, chassis,
+                                              local_datapaths, ofpacts_p,
+                                              ofport, flow_table);
+        }
+
     } else if (!tun && !is_ha_remote) {
         /* Remote port connected by localnet port */
         /* Table 33, priority 100.
diff --git a/ovn/ovn-architecture.7.xml b/ovn/ovn-architecture.7.xml
index 8c9e106..6275db1 100644
--- a/ovn/ovn-architecture.7.xml
+++ b/ovn/ovn-architecture.7.xml
@@ -1407,6 +1407,30 @@ 
       egress pipeline of the destination localnet logical switch datapath
       and goes out of the integration bridge to the provider bridge (
       belonging to the destination logical switch) via the localnet port.
+      While sending the packet to the provider bridge, we also replace the
+      router port mac, used as the source mac, with a chassis unique mac.
+
+      This chassis unique mac is configured as global ovs config on each
+      chassis (e.g. via <code>ovs-vsctl set open .
+      external-ids:ovn-chassis-mac-mappings="phys:aa:bb:cc:dd:ee:$i$i"</code>).
+      More details on this config are in <code>ovn-controller</code>(8).
+
+      If the above is not configured, then the source mac will be the router
+      port mac. This could create problems if we have more than one chassis:
+      since the router port is distributed, the same mac,vlan tuple will be
+      seen by the physical network from other chassis as well. This could
+      cause some or all of these issues:
+      <ul>
+        <li>
+          Continuous mac moves in the top of the rack switch (TOR).
+        </li>
+        <li>
+          TOR dropping the traffic that is causing continuous mac moves.
+        </li>
+        <li>
+          TOR blocking the ports from which mac moves are happening.
+        </li>
+      </ul>
     </li>
 
     <li>
diff --git a/ovn/ovn-sb.xml b/ovn/ovn-sb.xml
index 1a2bc1d..89e88c4 100644
--- a/ovn/ovn-sb.xml
+++ b/ovn/ovn-sb.xml
@@ -301,6 +301,14 @@ 
       See <code>ovn-controller</code>(8) for more information.
     </column>
 
+    <column name="external_ids" key="ovn-chassis-mac-mappings">
+      <code>ovn-controller</code> populates this key with the set of options
+      configured in the <ref table="Open_vSwitch"
+      column="external_ids:ovn-chassis-mac-mappings"/> column of the
+      Open_vSwitch database's <ref table="Open_vSwitch" db="Open_vSwitch"/>
+      table. See <code>ovn-controller</code>(8) for more information.
+    </column>
+
     <group title="Common Columns">
       The overall purpose of these columns is described under <code>Common
       Columns</code> at the beginning of this document.
diff --git a/tests/ovn.at b/tests/ovn.at
index daf85a5..d6cbb7b 100644
--- a/tests/ovn.at
+++ b/tests/ovn.at
@@ -14017,3 +14017,200 @@  ovn-hv4-0
 
 OVN_CLEANUP([hv1], [hv2], [hv3])
 AT_CLEANUP
+
+AT_SETUP([ovn -- 2 HVs, 2 lports/HV, localnet ports, DVR chassis mac])
+ovn_start
+
+
+# In this test case we create 2 switches, both connected to the same
+# physical network (through br-phys on each HV). Each switch has
+# 1 VIF. Each HV has 1 VIF port. The first digit
+# of the VIF port name indicates the hypervisor it is bound to, e.g.
+# lp23 means VIF 3 on hv2.
+#
+# Each switch's VLAN tag and their logical switch ports are:
+#   - ls1:
+#       - tagged with VLAN 101
+#       - ports: lp11
+#   - ls2:
+#       - tagged with VLAN 201
+#       - ports: lp22
+#
+# Note: a localnet port is created for each switch to connect to
+# physical network.
+
+for i in 1 2; do
+    ls_name=ls$i
+    ovn-nbctl ls-add $ls_name
+    ln_port_name=ln$i
+    if test $i -eq 1; then
+        ovn-nbctl lsp-add $ls_name $ln_port_name "" 101
+    elif test $i -eq 2; then
+        ovn-nbctl lsp-add $ls_name $ln_port_name "" 201
+    fi
+    ovn-nbctl lsp-set-addresses $ln_port_name unknown
+    ovn-nbctl lsp-set-type $ln_port_name localnet
+    ovn-nbctl lsp-set-options $ln_port_name network_name=phys
+done
+
+# lsp_to_ls LSP
+#
+# Prints the name of the logical switch that contains LSP.
+lsp_to_ls () {
+    case $1 in dnl (
+        lp?[[11]]) echo ls1 ;; dnl (
+        lp?[[12]]) echo ls2 ;; dnl (
+        *) AT_FAIL_IF([:]) ;;
+    esac
+}
+
+vif_to_ls () {
+    case $1 in dnl (
+        vif?[[11]]) echo ls1 ;; dnl (
+        vif?[[12]]) echo ls2 ;; dnl (
+        *) AT_FAIL_IF([:]) ;;
+    esac
+}
+
+hv_to_num () {
+    case $1 in dnl (
+        hv1) echo 1 ;; dnl (
+        hv2) echo 2 ;; dnl (
+        *) AT_FAIL_IF([:]) ;;
+    esac
+}
+
+vif_to_num () {
+    case $1 in dnl (
+        vif22) echo 22 ;; dnl (
+        vif21) echo 21 ;; dnl (
+        *) AT_FAIL_IF([:]) ;;
+    esac
+}
+
+vif_to_hv () {
+    case $1 in dnl (
+        vif[[1]]?) echo hv1 ;; dnl (
+        vif[[2]]?) echo hv2 ;; dnl (
+        *) AT_FAIL_IF([:]) ;;
+    esac
+}
+
+vif_to_lrp () {
+    echo router-to-`vif_to_ls $1`
+}
+
+hv_to_chassis_mac () {
+    case $1 in dnl (
+        hv[[1]]) echo aa:bb:cc:dd:ee:11 ;; dnl (
+        hv[[2]]) echo aa:bb:cc:dd:ee:22 ;; dnl (
+        *) AT_FAIL_IF([:]) ;;
+    esac
+}
+
+ip_to_hex() {
+    printf "%02x%02x%02x%02x" "$@"
+}
+
+net_add n1
+for i in 1 2; do
+    sim_add hv$i
+    as hv$i
+    ovs-vsctl add-br br-phys
+    ovs-vsctl set open . external-ids:ovn-bridge-mappings=phys:br-phys
+    ovs-vsctl set open . external-ids:ovn-chassis-mac-mappings="phys:aa:bb:cc:dd:ee:$i$i"
+    ovn_attach n1 br-phys 192.168.0.$i
+
+    ovs-vsctl add-port br-int vif$i$i -- \
+        set Interface vif$i$i external-ids:iface-id=lp$i$i \
+                              options:tx_pcap=hv$i/vif$i$i-tx.pcap \
+                              options:rxq_pcap=hv$i/vif$i$i-rx.pcap \
+                              ofport-request=$i$i
+
+    lsp_name=lp$i$i
+    ls_name=$(lsp_to_ls $lsp_name)
+
+    ovn-nbctl lsp-add $ls_name $lsp_name
+    ovn-nbctl lsp-set-addresses $lsp_name "f0:00:00:00:00:$i$i 192.168.$i.$i"
+    ovn-nbctl lsp-set-port-security $lsp_name f0:00:00:00:00:$i$i
+
+    OVS_WAIT_UNTIL([test x`ovn-nbctl lsp-get-up $lsp_name` = xup])
+
+done
+
+ovn-nbctl lr-add router
+ovn-nbctl lrp-add router router-to-ls1 00:00:01:01:02:03 192.168.1.3/24
+ovn-nbctl lrp-add router router-to-ls2 00:00:01:01:02:05 192.168.2.3/24
+
+ovn-nbctl lsp-add ls1 ls1-to-router -- set Logical_Switch_Port ls1-to-router type=router options:router-port=router-to-ls1 -- lsp-set-addresses ls1-to-router router
+ovn-nbctl lsp-add ls2 ls2-to-router -- set Logical_Switch_Port ls2-to-router type=router options:router-port=router-to-ls2 -- lsp-set-addresses ls2-to-router router
+
+ovn-nbctl --wait=sb sync
+#ovn-sbctl dump-flows
+
+ovn-nbctl show
+ovn-sbctl show
+
+OVN_POPULATE_ARP
+
+test_ip() {
+    # This packet has bad checksums but logical L3 routing doesn't check.
+    local inport=$1 src_mac=$2 dst_mac=$3 src_ip=$4 dst_ip=$5
+    local packet=${dst_mac}${src_mac}08004500001c0000000040110000${src_ip}${dst_ip}0035111100080000
+    shift; shift; shift; shift; shift
+    hv=`vif_to_hv $inport`
+    hv_num=`hv_to_num $hv`
+    chassis_mac=`hv_to_chassis_mac $hv`
+    as $hv ovs-appctl netdev-dummy/receive $inport $packet
+    #as $hv ovs-appctl ofproto/trace br-int in_port=$inport $packet
+    in_ls=`vif_to_ls $inport`
+    in_lrp=`vif_to_lrp $inport`
+    for outport; do
+        out_ls=`vif_to_ls $outport`
+        if test $in_ls = $out_ls; then
+            # Ports on the same logical switch receive exactly the same packet.
+            echo $packet
+        else
+            # Routing decrements TTL and updates dest MAC (and checksum);
+            # source MAC is replaced with the chassis mac of the sending HV.
+            outport_num=`vif_to_num $outport`
+            out_lrp=`vif_to_lrp $outport`
+            echo f000000000${outport_num}aabbccddee${hv_num}${hv_num}08004500001c00000000"3f1101"00${src_ip}${dst_ip}0035111100080000
+        fi >> $outport.expected
+    done
+}
+
+# Dump a bunch of info helpful for debugging if there's a failure.
+
+echo "------ OVN dump ------"
+ovn-nbctl show
+ovn-sbctl show
+
+echo "------ hv1 dump ------"
+as hv1 ovs-vsctl show
+as hv1 ovs-vsctl list Open_Vswitch
+
+echo "------ hv2 dump ------"
+as hv2 ovs-vsctl show
+as hv2 ovs-vsctl list Open_Vswitch
+
+echo "Send traffic"
+sip=`ip_to_hex 192 168 1 1`
+dip=`ip_to_hex 192 168 2 2`
+test_ip vif11 f00000000011  000001010203 $sip $dip vif22
+
+sleep 1
+
+echo "----------- Post Traffic hv1 dump -----------"
+as hv1 ovs-ofctl -O OpenFlow13 dump-flows br-int
+as hv1 ovs-appctl fdb/show br-phys
+
+echo "----------- Post Traffic hv2 dump -----------"
+as hv2 ovs-ofctl -O OpenFlow13 dump-flows br-int
+as hv2 ovs-appctl fdb/show br-phys
+
+OVN_CHECK_PACKETS([hv2/vif22-tx.pcap], [vif22.expected])
+
+OVN_CLEANUP([hv1],[hv2])
+
+AT_CLEANUP