
[ovs-dev,v1] ovn: Cache mac bindings from broadcasted dynamic ARP responses

Message ID 1482912328-12984-1-git-send-email-bschanmu@redhat.com
State Changes Requested

Commit Message

Babu Shanmugam Dec. 28, 2016, 8:05 a.m. UTC
This patch attempts to avoid using the MAC_Binding table.

Dynamic ARP responses originate from logical ports with an "unknown"
address. When ARP resolution is requested via a logical
router datapath, the ARP response is delivered to the logical router
port of the switch. Since the logical router is distributed, the
response never reaches the other chassis from which the ARP resolution
was requested.

This is done using an OVN logical action, bcast2lr(), which adds an
action to send the packet to a specific multicast group for the logical
router datapath; the packet is then broadcast to all the other chassis.

Imagine two hypervisors, hv0 and hv1, and two switches, ls0 and ls1,
connected to a logical router lr. ls1 has a logical port lspu with an
"unknown" address, bound to hv1. When a dynamic address is requested to
be resolved from ls0 in hv0, the following happens:

1. The ARP request is routed to ls1 and will be sent from hv0 to hv1
   with the source MAC address being that of ls1's endpoint to lr.
2. In hv1, the packet will be received by lspu and the ARP response
   will be sent to the source of the ARP request.
3. The ARP response will be processed by the bcast2lr logical action,
   which in hv1 will be converted to a flow similar to

   priority=100,arp,reg14=0x1,metadata=0x2,dl_dst=f0:00:00:00:01:00,arp_op=2
   actions=push:OXM_OF_METADATA[],push:NXM_NX_REG15[],push:NXM_NX_REG14[],
   load:0x3->OXM_OF_METADATA[],load:0x1->NXM_NX_REG14[],load:0xfffd->NXM_NX_REG15[],
   resubmit(,32),pop:NXM_NX_REG14[],pop:NXM_NX_REG15[],pop:OXM_OF_METADATA[],
   resubmit(,30)

   The bcast2lr action modifies MFF_LOG_INPORT (REG14) to the port id of
   ls1's endpoint to lr and MFF_LOG_DATAPATH (METADATA) to lr's datapath
   id, then resubmits to table 32, which handles the multicast group for
   lr's datapath. It then restores the MFF_LOG_INPORT and MFF_LOG_DATAPATH
   values and resubmits to table 30, which enables the lr in hv1 to receive
   this ARP response.

4. The above flow will only be present on a chassis that hosts a logical
   port with an "unknown" address. In hv0, bcast2lr() will have no actions,
   so it will look like

   priority=100,arp,reg14=0x1,metadata=0x2,dl_dst=f0:00:00:00:01:00,arp_op=2
   actions=resubmit(,30)

5. In hv1, the packet is sent to multicast group 0xfffd, which takes care
   of broadcasting it to all the chassis. Each chassis that is supposed to
   receive the broadcast ARP response from multicast group 0xfffd (hv0 here)
   will have one additional flow in table 0, which looks like

   priority=150,tun_id=0x3,in_port=14
   actions=move:NXM_NX_TUN_ID[0..23]->OXM_OF_METADATA[0..23],
   move:NXM_NX_TUN_METADATA0[16..30]->NXM_NX_REG14[0..14],
   move:NXM_NX_TUN_METADATA0[0..15]->NXM_NX_REG15[0..15],resubmit(,16)

   The actions are all similar to those of the other flows in table 0. This
   flow is at a higher priority and checks that the tunnel id matches
   lr's datapath id.

   This flow will not be present in hv1, which therefore will not accept any
   messages from this multicast group for lr's datapath.

6. Now that the packet has reached hv0, it will be processed by the put_arp()
   logical action, which caches the MAC binding in memory and updates the
   OpenFlow table used for dynamic MAC resolution.

Signed-off-by: Babu Shanmugam <bschanmu@redhat.com>
---
 include/ovn/actions.h           |  23 ++++++-
 ovn/controller/lflow.c          | 118 +++++++++++++++++++++++++-----------
 ovn/controller/lflow.h          |   3 +-
 ovn/controller/ovn-controller.c |   8 +--
 ovn/controller/physical.c       |  74 +++++++++++++++++++++--
 ovn/controller/pinctrl.c        |  94 +++++------------------------
 ovn/controller/pinctrl.h        |  13 +++-
 ovn/lib/actions.c               | 130 ++++++++++++++++++++++++++++++++++++++++
 ovn/northd/ovn-northd.c         |  86 ++++++++++++++++----------
 ovn/ovn-architecture.7.xml      |   6 +-
 ovn/ovn-sb.ovsschema            |  11 +---
 ovn/ovn-sb.xml                  | 128 ++++++++++-----------------------------
 ovn/utilities/ovn-sbctl.c       |   4 --
 ovn/utilities/ovn-trace.c       |  45 +-------------
 tests/ovn.at                    | 103 ++++++++++++++++++++-----------
 15 files changed, 494 insertions(+), 352 deletions(-)

Comments

Ben Pfaff Jan. 5, 2017, 5:32 a.m. UTC | #1
On Wed, Dec 28, 2016 at 01:35:28PM +0530, Babu Shanmugam wrote:
> This patch attempts to avoid the usage of MAC_Binding table.
> 
> Dynamic ARP response originates from the logical ports with "unknown"
> address. When the ARP resolution is requested via a logical
> router datapath, ARP response will be delivered to the logical router
> port of the switch. Since the logical router is distributed, the
> response will never reach the other chassis from where the ARP resolution
> is requested.
> 
> This is done using a OVN logical action bcast2lr() which adds an action
> to send the packet to a specific multicast group for the logical router
> datapath which will ultimately be broadcasted to all the other chassis.

Thanks for the revised patch, and especially for the expanded
explanation.

I think that I understand now how this is supposed to work.  I think the
idea is that ARP responses from an LSP with "unknown" MAC will be sent
from the LSP's LS to the LR on all the other chassis, as well as being
processed locally.  I think that this basic idea is sound.

I still do not understand the detailed design.

First, the code goes to some trouble to ensure that this special
processing for ARP responses from an LSP with "unknown" MAC only happens
on the chassis where the LSP resides.  It passes the name of this
chassis as an argument to the bcast2lr action, which violates the OVN
principle that the logical flow table's contents are independent of the
physical layout of the system.  It also requires the logical flow table
to be updated whenever the physical layout of the system changes, which
is unprecedented as far as I can tell.  Such a change would, therefore,
need to be for an important reason, but I don't know what it's good for,
because I don't know why hypervisor hv0 would process a packet through
an LS ingress pipeline for LSP lsp0, if lsp0 didn't reside on hv0.

Second, I'm suspicious of the idea of a special-purpose action dedicated
to this purpose.  When we can, it's better to make use of general
primitives than to invent new special-purpose ones.  In this case, the
new action combines a few goals that can be broken down into primitives
that might individually be useful in other contexts:

    1. Do nothing, if the packet originated on any chassis but one
       designated one.  As I said, I don't yet see why this is needed,
       but if it is then it seems like this could be made generic.  For
       example, the originating chassis could be part of the match
       condition, or we could introduce a new action that executes
       actions provided as arguments only on chassis specified in a
       parameter.

    2. Send a packet to a different logical datapath.  Logical patch
       ports can do this already.

    3. Replicate a packet to multiple hypervisors, with the same logical
       output port on all of them.  The patch does this with a logical
       multicast group, but it has to special case it.  I'm not sure
       that's a good way to do it.  It's nice to have uniform
       abstractions, which multicast groups have been until now.  An
       alternative that comes to mind is to introduce a new kind of
       logical port that replicates packets to every chassis that hosts
       the logical datapath.  One reason that this seems better is
       because logical ports are already nonuniform and adding another
       type simply doesn't make it much worse.  Another is that,
       eventually, I guess we'll want to have more configurability here
       and logical ports already have options for configuration.

Details
=======

Here are some detailed comments.  Some of these only make sense if we
stick with the overall design, which, as I said above, I'd prefer to
change.

This patch adds a new logical action, so it should add new tests under
"ovn -- action parsing" in ovn.at.

parse_BCAST2LR() silently does nothing if a parameter is missing.  Why
doesn't it report an error?

parse_BCAST2LR() leaks memory if it finds a syntax error or if a
parameter is missing.

parse_BCAST2LR() doesn't really need to build a structure and then copy
it, it can just initialize it in-place.

I don't understand why ovnact_bcast2lr's "char *"s are const.  They
don't seem any more "const" than other ovnacts' members.

encode_BCAST2LR() should use init_stack(), like encode_setup_args() and
encode_restore_args(), instead of open-coding it.

ovn-northd.8.xml needs an update to explain the new flows.

Please do not introduce pinctrl_put_mac_binding_for_each().  Callback
interfaces are awkward and should only be used when really necessary.
Instead, I'd move all the related code from lflow.c into a new
pinctrl_*() function that can be called directly from ovn-controller.c
somewhere around the call to lflow_run().

The code in pinctrl makes only limited sense now.  Until now, it was
organized as a kind of a buffer or queue that was flushed to the
database when it could be.  Now, it's just a cache, but it's still
organized the same way.  This has some nasty side effects.  I don't see
anything that ever flushes an entry, for example.  Combined with a fixed
limit of 1000 entries, this means that after 1000 MACs have been
learned, no more can ever be learned until ovn-controller restarts.

Thanks,

Ben.
Russell Bryant Jan. 11, 2017, 3:44 a.m. UTC | #2
On Thu, Jan 5, 2017 at 12:32 AM, Ben Pfaff <blp@ovn.org> wrote:

> On Wed, Dec 28, 2016 at 01:35:28PM +0530, Babu Shanmugam wrote:
> > This patch attempts to avoid the usage of MAC_Binding table.
> >
> > Dynamic ARP response originates from the logical ports with "unknown"
> > address. When the ARP resolution is requested via a logical
> > router datapath, ARP response will be delivered to the logical router
> > port of the switch. Since the logical router is distributed, the
> > response will never reach the other chassis from where the ARP resolution
> > is requested.
> >
> > This is done using a OVN logical action bcast2lr() which adds an action
> > to send the packet to a specific multicast group for the logical router
> > datapath which will ultimately be broadcasted to all the other chassis.
>
> Thanks for the revised patch, and especially for the expanded
> explanation.
>

Thanks for the detailed review, Ben.

There will be some delay in getting back to revising this.  Babu is no
longer working with us, so someone else will be picking this up.

Patch

diff --git a/include/ovn/actions.h b/include/ovn/actions.h
index 0bf6145..9a35e90 100644
--- a/include/ovn/actions.h
+++ b/include/ovn/actions.h
@@ -68,7 +68,8 @@  struct simap;
     OVNACT(PUT_ND,        ovnact_put_mac_bind)      \
     OVNACT(PUT_DHCPV4_OPTS, ovnact_put_dhcp_opts)   \
     OVNACT(PUT_DHCPV6_OPTS, ovnact_put_dhcp_opts)   \
-    OVNACT(SET_QUEUE,       ovnact_set_queue)
+    OVNACT(SET_QUEUE,       ovnact_set_queue)       \
+    OVNACT(BCAST2LR,        ovnact_bcast2lr)
 
 /* enum ovnact_type, with a member OVNACT_<ENUM> for each action. */
 enum OVS_PACKED_ENUM ovnact_type {
@@ -234,6 +235,18 @@  struct ovnact_set_queue {
     uint16_t queue_id;
 };
 
+#define MC_LROUTER     "_MC_lrouter"
+#define MC_LROUTER_KEY 65533
+
+/* OVNACT_BCAST2LR */
+struct ovnact_bcast2lr {
+    struct ovnact ovnact;
+    const char *datapath;
+    const char *port;
+    const char *chassis;
+    bool is_this_chassis;
+};
+
 /* Internal use by the helpers below. */
 void ovnact_init(struct ovnact *, enum ovnact_type, size_t len);
 void *ovnact_put(struct ofpbuf *, enum ovnact_type, size_t len);
@@ -381,6 +394,8 @@  struct ovnact_parse_params {
     /* hmap of 'struct dhcp_opts_map'  to support 'put_dhcpv6_opts' action */
     const struct hmap *dhcpv6_opts;
 
+    /* The current chassis id */
+    const char *chassis_id;
     /* OVN maps each logical flow table (ltable), one-to-one, onto a physical
      * OpenFlow flow table (ptable).  A number of parameters describe this
      * mapping and data related to flow tables:
@@ -412,6 +427,12 @@  struct ovnact_encode_params {
      * '*portp' and returns true; otherwise, returns false. */
     bool (*lookup_port)(const void *aux, const char *port_name,
                         unsigned int *portp);
+    /* Looks up path keys of a port in datapath dp_uuid. If found, stores its
+     * port number in "*portp" and datapath number in "*datapathp". */
+    bool (*lookup_path)(const void *aux,
+                        const char *dp_uuid,
+                        const char *port_name,
+                        unsigned int *portp, unsigned int *datapathp);
     const void *aux;
 
     /* 'true' if the flow is for a switch. */
diff --git a/ovn/controller/lflow.c b/ovn/controller/lflow.c
index d913998..8edadb0 100644
--- a/ovn/controller/lflow.c
+++ b/ovn/controller/lflow.c
@@ -27,6 +27,7 @@ 
 #include "ovn/lib/ovn-dhcp.h"
 #include "ovn/lib/ovn-sb-idl.h"
 #include "packets.h"
+#include "pinctrl.h"
 #include "physical.h"
 #include "simap.h"
 #include "sset.h"
@@ -64,7 +65,8 @@  struct lookup_port_aux {
     const struct sbrec_datapath_binding *dp;
 };
 
-static void consider_logical_flow(const struct lport_index *lports,
+static void consider_logical_flow(const char *chassis_id,
+                                  const struct lport_index *lports,
                                   const struct mcgroup_index *mcgroups,
                                   const struct sbrec_logical_flow *lflow,
                                   const struct hmap *local_datapaths,
@@ -99,6 +101,31 @@  lookup_port_cb(const void *aux_, const char *port_name, unsigned int *portp)
 }
 
 static bool
+lookup_path_cb(const void *aux_,
+               const char *dp_uuid,
+               const char *port_name,
+               unsigned int *datapathp,
+               unsigned int *portp)
+{
+    const struct lookup_port_aux *aux = aux_;
+    struct uuid dp;
+
+    if (!uuid_from_string(&dp, dp_uuid)) {
+        return false;
+    }
+
+    const struct sbrec_port_binding *pb
+        = lport_lookup_by_name(aux->lports, port_name);
+    if (pb && uuid_equals(&pb->datapath->header_.uuid, &dp)) {
+        *portp = pb->tunnel_key;
+        *datapathp = pb->datapath->tunnel_key;
+        return true;
+    }
+
+    return false;
+}
+
+static bool
 is_switch(const struct sbrec_datapath_binding *ldp)
 {
     return smap_get(&ldp->external_ids, "logical-switch") != NULL;
@@ -107,7 +134,9 @@  is_switch(const struct sbrec_datapath_binding *ldp)
 
 /* Adds the logical flows from the Logical_Flow table to flow tables. */
 static void
-add_logical_flows(struct controller_ctx *ctx, const struct lport_index *lports,
+add_logical_flows(struct controller_ctx *ctx,
+                  const char *chassis_id,
+                  const struct lport_index *lports,
                   const struct mcgroup_index *mcgroups,
                   const struct hmap *local_datapaths,
                   struct group_table *group_table,
@@ -134,8 +163,8 @@  add_logical_flows(struct controller_ctx *ctx, const struct lport_index *lports,
     }
 
     SBREC_LOGICAL_FLOW_FOR_EACH (lflow, ctx->ovnsb_idl) {
-        consider_logical_flow(lports, mcgroups, lflow, local_datapaths,
-                              group_table, ct_zones,
+        consider_logical_flow(chassis_id, lports, mcgroups, lflow,
+                              local_datapaths, group_table, ct_zones,
                               &dhcp_opts, &dhcpv6_opts, &conj_id_ofs,
                               flow_table, expr_address_sets_p);
     }
@@ -145,7 +174,8 @@  add_logical_flows(struct controller_ctx *ctx, const struct lport_index *lports,
 }
 
 static void
-consider_logical_flow(const struct lport_index *lports,
+consider_logical_flow(const char *chassis_id,
+                      const struct lport_index *lports,
                       const struct mcgroup_index *mcgroups,
                       const struct sbrec_logical_flow *lflow,
                       const struct hmap *local_datapaths,
@@ -186,6 +216,7 @@  consider_logical_flow(const struct lport_index *lports,
         .symtab = &symtab,
         .dhcp_opts = dhcp_opts_p,
         .dhcpv6_opts = dhcpv6_opts_p,
+        .chassis_id = chassis_id,
 
         .n_tables = LOG_PIPELINE_LEN,
         .cur_ltable = lflow->table_id,
@@ -214,6 +245,7 @@  consider_logical_flow(const struct lport_index *lports,
     };
     struct ovnact_encode_params ep = {
         .lookup_port = lookup_port_cb,
+        .lookup_path = lookup_path_cb,
         .aux = &aux,
         .is_switch = is_switch(ldp),
         .ct_zones = ct_zones,
@@ -305,38 +337,31 @@  put_load(const uint8_t *data, size_t len,
     bitwise_one(ofpact_set_field_mask(sf), sf->field->n_bytes, ofs, n_bits);
 }
 
+struct put_mac_binding_context {
+    const struct lport_index *lports;
+    struct hmap *flow_table;
+};
+
 static void
-consider_neighbor_flow(const struct lport_index *lports,
-                       const struct sbrec_mac_binding *b,
+consider_neighbor_flow(const struct sbrec_port_binding *pb,
+                       const char *ip_s,
+                       const struct eth_addr *mac,
                        struct hmap *flow_table)
 {
-    const struct sbrec_port_binding *pb
-        = lport_lookup_by_name(lports, b->logical_port);
-    if (!pb) {
-        return;
-    }
-
-    struct eth_addr mac;
-    if (!eth_addr_from_string(b->mac, &mac)) {
-        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 1);
-        VLOG_WARN_RL(&rl, "bad 'mac' %s", b->mac);
-        return;
-    }
-
     struct match match = MATCH_CATCHALL_INITIALIZER;
-    if (strchr(b->ip, '.')) {
+    if (strchr(ip_s, '.')) {
         ovs_be32 ip;
-        if (!ip_parse(b->ip, &ip)) {
+        if (!ip_parse(ip_s, &ip)) {
             static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 1);
-            VLOG_WARN_RL(&rl, "bad 'ip' %s", b->ip);
+            VLOG_WARN_RL(&rl, "bad 'ip' %s", ip_s);
             return;
         }
         match_set_reg(&match, 0, ntohl(ip));
     } else {
         struct in6_addr ip6;
-        if (!ipv6_parse(b->ip, &ip6)) {
+        if (!ipv6_parse(ip_s, &ip6)) {
             static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 1);
-            VLOG_WARN_RL(&rl, "bad 'ip' %s", b->ip);
+            VLOG_WARN_RL(&rl, "bad 'ip' %s", ip_s);
             return;
         }
         ovs_be128 value;
@@ -349,41 +374,62 @@  consider_neighbor_flow(const struct lport_index *lports,
 
     uint64_t stub[1024 / 8];
     struct ofpbuf ofpacts = OFPBUF_STUB_INITIALIZER(stub);
-    put_load(mac.ea, sizeof mac.ea, MFF_ETH_DST, 0, 48, &ofpacts);
+    put_load(mac->ea, sizeof mac->ea, MFF_ETH_DST, 0, 48, &ofpacts);
     ofctrl_add_flow(flow_table, OFTABLE_MAC_BINDING, 100, &match, &ofpacts);
     ofpbuf_uninit(&ofpacts);
 }
 
+static void
+mac_binding_iteration_cb(void *private_data,
+                         uint32_t port_key,
+                         uint32_t dp_key,
+                         const char *ip_s,
+                         const struct eth_addr *mac) {
+    struct put_mac_binding_context *mb_ctx = private_data;
+    const struct sbrec_port_binding *pb
+        = lport_lookup_by_key(mb_ctx->lports, dp_key, port_key);
+    if (!pb) {
+        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
+
+        VLOG_WARN_RL(&rl, "unknown logical port with datapath %"PRIu32" "
+                     "and port %"PRIu32, dp_key, port_key);
+        return;
+    }
+
+    consider_neighbor_flow(pb, ip_s, mac, mb_ctx->flow_table);
+}
+
 /* Adds an OpenFlow flow to flow tables for each MAC binding in the OVN
  * southbound database, using 'lports' to resolve logical port names to
  * numbers. */
 static void
-add_neighbor_flows(struct controller_ctx *ctx,
-                   const struct lport_index *lports,
+add_neighbor_flows(const struct lport_index *lports,
                    struct hmap *flow_table)
 {
-    const struct sbrec_mac_binding *b;
-    SBREC_MAC_BINDING_FOR_EACH (b, ctx->ovnsb_idl) {
-        consider_neighbor_flow(lports, b, flow_table);
-    }
+    struct put_mac_binding_context pmb_ctx = {.lports = lports,
+        .flow_table = flow_table};
+
+    pinctrl_put_mac_binding_for_each(mac_binding_iteration_cb, &pmb_ctx);
 }
 
 /* Translates logical flows in the Logical_Flow table in the OVN_SB database
  * into OpenFlow flows.  See ovn-architecture(7) for more information. */
 void
-lflow_run(struct controller_ctx *ctx, const struct lport_index *lports,
+lflow_run(struct controller_ctx *ctx,
+          const struct lport_index *lports,
           const struct mcgroup_index *mcgroups,
           const struct hmap *local_datapaths,
           struct group_table *group_table,
           const struct simap *ct_zones,
-          struct hmap *flow_table)
+          struct hmap *flow_table,
+          const char *chassis_id)
 {
     struct shash expr_address_sets = SHASH_INITIALIZER(&expr_address_sets);
 
     update_address_sets(ctx, &expr_address_sets);
-    add_logical_flows(ctx, lports, mcgroups, local_datapaths,
+    add_logical_flows(ctx, chassis_id, lports, mcgroups, local_datapaths,
                       group_table, ct_zones, flow_table, &expr_address_sets);
-    add_neighbor_flows(ctx, lports, flow_table);
+    add_neighbor_flows(lports, flow_table);
 
     expr_macros_destroy(&expr_address_sets);
     shash_destroy(&expr_address_sets);
diff --git a/ovn/controller/lflow.h b/ovn/controller/lflow.h
index 6305ce0..fd7e514 100644
--- a/ovn/controller/lflow.h
+++ b/ovn/controller/lflow.h
@@ -66,7 +66,8 @@  void lflow_run(struct controller_ctx *, const struct lport_index *,
                const struct hmap *local_datapaths,
                struct group_table *group_table,
                const struct simap *ct_zones,
-               struct hmap *flow_table);
+               struct hmap *flow_table,
+               const char *chassis_id);
 void lflow_destroy(void);
 
 #endif /* ovn/lflow.h */
diff --git a/ovn/controller/ovn-controller.c b/ovn/controller/ovn-controller.c
index 368ccc9..c3be3eb 100644
--- a/ovn/controller/ovn-controller.c
+++ b/ovn/controller/ovn-controller.c
@@ -169,13 +169,11 @@  update_sb_monitors(struct ovsdb_idl *ovnsb_idl,
             sbrec_port_binding_add_clause_datapath(&pb, OVSDB_F_EQ, uuid);
             sbrec_logical_flow_add_clause_logical_datapath(&lf, OVSDB_F_EQ,
                                                            uuid);
-            sbrec_mac_binding_add_clause_datapath(&mb, OVSDB_F_EQ, uuid);
             sbrec_multicast_group_add_clause_datapath(&mg, OVSDB_F_EQ, uuid);
         }
     }
     sbrec_port_binding_set_condition(ovnsb_idl, &pb);
     sbrec_logical_flow_set_condition(ovnsb_idl, &lf);
-    sbrec_mac_binding_set_condition(ovnsb_idl, &mb);
     sbrec_multicast_group_set_condition(ovnsb_idl, &mg);
     ovsdb_idl_condition_destroy(&pb);
     ovsdb_idl_condition_destroy(&lf);
@@ -585,7 +583,7 @@  main(int argc, char *argv[])
             enum mf_field_id mff_ovn_geneve = ofctrl_run(br_int,
                                                          &pending_ct_zones);
 
-            pinctrl_run(&ctx, &lports, br_int, chassis, &local_datapaths);
+            pinctrl_run(&lports, br_int, chassis, &local_datapaths);
             update_ct_zones(&local_lports, &local_datapaths, &ct_zones,
                             ct_zone_bitmap, &pending_ct_zones);
             if (ctx.ovs_idl_txn) {
@@ -593,7 +591,7 @@  main(int argc, char *argv[])
 
                 struct hmap flow_table = HMAP_INITIALIZER(&flow_table);
                 lflow_run(&ctx, &lports, &mcgroups, &local_datapaths,
-                          &group_table, &ct_zones, &flow_table);
+                          &group_table, &ct_zones, &flow_table, chassis_id);
 
                 physical_run(&ctx, mff_ovn_geneve,
                              br_int, chassis, &ct_zones, &lports,
@@ -636,7 +634,7 @@  main(int argc, char *argv[])
 
         if (br_int) {
             ofctrl_wait();
-            pinctrl_wait(&ctx);
+            pinctrl_wait();
         }
         ovsdb_idl_loop_commit_and_wait(&ovnsb_idl_loop);
 
diff --git a/ovn/controller/physical.c b/ovn/controller/physical.c
index 9d37410..d8b2899 100644
--- a/ovn/controller/physical.c
+++ b/ovn/controller/physical.c
@@ -29,6 +29,7 @@ 
 #include "ovn-controller.h"
 #include "ovn/lib/ovn-sb-idl.h"
 #include "ovn/lib/ovn-util.h"
+#include "ovn/actions.h"
 #include "physical.h"
 #include "openvswitch/shash.h"
 #include "simap.h"
@@ -60,6 +61,11 @@  static struct simap localvif_to_ofport =
     SIMAP_INITIALIZER(&localvif_to_ofport);
 static struct hmap tunnels = HMAP_INITIALIZER(&tunnels);
 
+struct dp_key {
+    uint64_t tunnel_key;
+    struct ovs_list list;
+};
+
 /* Maps from a chassis to the OpenFlow port number of the tunnel that can be
  * used to reach that chassis. */
 struct chassis_tunnel {
@@ -568,7 +574,8 @@  consider_mc_group(enum mf_field_id mff_ovn_geneve,
                   const struct sbrec_multicast_group *mc,
                   struct ofpbuf *ofpacts_p,
                   struct ofpbuf *remote_ofpacts_p,
-                  struct hmap *flow_table)
+                  struct hmap *flow_table,
+                  struct sset *bcast_chassis)
 {
     uint32_t dp_key = mc->datapath->tunnel_key;
     if (!get_local_datapath(local_datapaths, dp_key)) {
@@ -576,6 +583,7 @@  consider_mc_group(enum mf_field_id mff_ovn_geneve,
     }
 
     struct sset remote_chassis = SSET_INITIALIZER(&remote_chassis);
+    bool is_this_chassis_bcasting = false;
     struct match match;
 
     match_init_catchall(&match);
@@ -612,10 +620,21 @@  consider_mc_group(enum mf_field_id mff_ovn_geneve,
             put_load(zone_id, MFF_LOG_CT_ZONE, 0, 32, ofpacts_p);
         }
 
-        if (!strcmp(port->type, "patch")) {
+        if (!strcmp(mc->name, MC_LROUTER)) {
+            const char *arp_bcast_chassis = smap_get(&port->options,
+                                                     "arp-bcast-chassis");
+            if (arp_bcast_chassis) {
+                if (!strcmp(arp_bcast_chassis, chassis->name)) {
+                    is_this_chassis_bcasting = true;
+                } else {
+                    sset_add(bcast_chassis, arp_bcast_chassis);
+                }
+            }
+        } else if (!strcmp(port->type, "patch")) {
             put_load(port->tunnel_key, MFF_LOG_OUTPORT, 0, 32,
                      remote_ofpacts_p);
             put_resubmit(OFTABLE_CHECK_LOOPBACK, remote_ofpacts_p);
+
         } else if (simap_contains(&localvif_to_ofport,
                            (port->parent_port && *port->parent_port)
                            ? port->parent_port : port->logical_port)
@@ -632,6 +651,17 @@  consider_mc_group(enum mf_field_id mff_ovn_geneve,
         }
     }
 
+    if (is_this_chassis_bcasting) {
+        /* Send to all the chassis except this */
+        struct chassis_tunnel *tun;
+        HMAP_FOR_EACH (tun, hmap_node, &tunnels) {
+            if (!strcmp(tun->chassis_id, chassis->name)) {
+                continue;
+            }
+            sset_add(&remote_chassis, tun->chassis_id);
+        }
+    }
+
     /* Table 33, priority 100.
      * =======================
      *
@@ -839,10 +869,28 @@  physical_run(struct controller_ctx *ctx, enum mf_field_id mff_ovn_geneve,
     /* Handle output to multicast groups, in tables 32 and 33. */
     const struct sbrec_multicast_group *mc;
     struct ofpbuf remote_ofpacts;
+    struct shash bcast_chassis_dps = SHASH_INITIALIZER(&bcast_chassis_dps);
+
     ofpbuf_init(&remote_ofpacts, 0);
     SBREC_MULTICAST_GROUP_FOR_EACH (mc, ctx->ovnsb_idl) {
+        struct sset bcast_chassis = SSET_INITIALIZER(&bcast_chassis);
         consider_mc_group(mff_ovn_geneve, ct_zones, local_datapaths, chassis,
-                          mc, &ofpacts, &remote_ofpacts, flow_table);
+                          mc, &ofpacts, &remote_ofpacts, flow_table,
+                          &bcast_chassis);
+
+        const char *chassis_name;
+        SSET_FOR_EACH(chassis_name, &bcast_chassis) {
+            struct shash_node *this_node = shash_find(&bcast_chassis_dps, chassis_name);
+            struct dp_key *dp_key = xmalloc(sizeof(*dp_key));
+            dp_key->tunnel_key = mc->datapath->tunnel_key;
+            if (!this_node) {
+                struct ovs_list *list = xmalloc(sizeof(*list));
+                ovs_list_init(list);
+                this_node = shash_add(&bcast_chassis_dps, chassis_name, list);
+            }
+            ovs_list_push_back((struct ovs_list *)this_node->data, &dp_key->list);
+        }
+        sset_destroy(&bcast_chassis);
     }
 
     ofpbuf_uninit(&remote_ofpacts);
@@ -880,11 +928,29 @@  physical_run(struct controller_ctx *ctx, enum mf_field_id mff_ovn_geneve,
             OVS_NOT_REACHED();
         }
 
-        put_resubmit(OFTABLE_LOCAL_OUTPUT, &ofpacts);
+        /* If this tunnel is found in the chassis list, add an additiona flow*/
+        struct shash_node *this_node = shash_find(&bcast_chassis_dps, tun->chassis_id);
+        if (this_node) {
+            struct ofpbuf *bcast2lr_ofpacts = ofpbuf_clone(&ofpacts);
+            struct match  tun_match = MATCH_CATCHALL_INITIALIZER;
+            struct dp_key *dp_key;
 
+            match_set_in_port(&tun_match, tun->ofport);
+            put_resubmit(OFTABLE_LOG_INGRESS_PIPELINE, bcast2lr_ofpacts);
+
+            LIST_FOR_EACH_POP(dp_key, list, (struct ovs_list *)this_node->data) {
+                match_set_tun_id(&tun_match, htonll(dp_key->tunnel_key));
+            }
+            ofctrl_add_flow(flow_table, OFTABLE_PHY_TO_LOG, 150,
+                            &tun_match, bcast2lr_ofpacts);
+            ofpbuf_delete(bcast2lr_ofpacts);
+            shash_delete(&bcast_chassis_dps, this_node);
+        }
+        put_resubmit(OFTABLE_LOCAL_OUTPUT, &ofpacts);
         ofctrl_add_flow(flow_table, OFTABLE_PHY_TO_LOG, 100, &match, &ofpacts);
     }
 
+    shash_destroy(&bcast_chassis_dps);
     /* Add flows for VXLAN encapsulations.  Due to the limited amount of
      * metadata, we only support VXLAN for connections to gateways.  The
      * VNI is used to populate MFF_LOG_DATAPATH.  The gateway's logical
diff --git a/ovn/controller/pinctrl.c b/ovn/controller/pinctrl.c
index 7320a84..d31f4a9 100644
--- a/ovn/controller/pinctrl.c
+++ b/ovn/controller/pinctrl.c
@@ -59,9 +59,6 @@  static void pinctrl_handle_put_mac_binding(const struct flow *md,
                                            bool is_arp);
 static void init_put_mac_bindings(void);
 static void destroy_put_mac_bindings(void);
-static void run_put_mac_bindings(struct controller_ctx *,
-                                 const struct lport_index *lports);
-static void wait_put_mac_bindings(struct controller_ctx *);
 static void flush_put_mac_bindings(void);
 
 static void init_send_garps(void);
@@ -750,7 +747,7 @@  pinctrl_recv(const struct ofp_header *oh, enum ofptype type)
 }
 
 void
-pinctrl_run(struct controller_ctx *ctx, const struct lport_index *lports,
+pinctrl_run(const struct lport_index *lports,
             const struct ovsrec_bridge *br_int,
             const struct sbrec_chassis *chassis,
             struct hmap *local_datapaths)
@@ -768,7 +765,6 @@  pinctrl_run(struct controller_ctx *ctx, const struct lport_index *lports,
         if (conn_seq_no != rconn_get_connection_seqno(swconn)) {
             pinctrl_setup(swconn);
             conn_seq_no = rconn_get_connection_seqno(swconn);
-            flush_put_mac_bindings();
         }
 
         /* Process a limited number of messages per call. */
@@ -787,14 +783,12 @@  pinctrl_run(struct controller_ctx *ctx, const struct lport_index *lports,
         }
     }
 
-    run_put_mac_bindings(ctx, lports);
     send_garp_run(br_int, chassis, lports, local_datapaths);
 }
 
 void
-pinctrl_wait(struct controller_ctx *ctx)
+pinctrl_wait(void)
 {
-    wait_put_mac_bindings(ctx);
     rconn_run_wait(swconn);
     rconn_recv_wait(swconn);
     send_garp_wait();
@@ -811,13 +805,9 @@  pinctrl_destroy(void)
 /* Implementation of the "put_arp" and "put_nd" OVN actions.  These
  * actions send a packet to ovn-controller, using the flow as an API
  * (see actions.h for details).  This code implements the actions by
- * updating the MAC_Binding table in the southbound database.
+ * updating the local in-memory cache of MAC bindings.
  *
- * This code could be a lot simpler if the database could always be updated,
- * but in fact we can only update it when ctx->ovnsb_idl_txn is nonnull.  Thus,
- * we buffer up a few put_mac_bindings (but we don't keep them longer
- * than 1 second) and apply them whenever a database transaction is
- * available. */
+ */
 
 /* Buffered "put_mac_binding" operation. */
 struct put_mac_binding {
@@ -899,74 +889,18 @@  pinctrl_handle_put_mac_binding(const struct flow *md,
     pmb->mac = headers->dl_src;
 }
 
-static void
-run_put_mac_binding(struct controller_ctx *ctx,
-                    const struct lport_index *lports,
-                    const struct put_mac_binding *pmb)
-{
-    if (time_msec() > pmb->timestamp + 1000) {
-        return;
-    }
-
-    /* Convert logical datapath and logical port key into lport. */
-    const struct sbrec_port_binding *pb
-        = lport_lookup_by_key(lports, pmb->dp_key, pmb->port_key);
-    if (!pb) {
-        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
-
-        VLOG_WARN_RL(&rl, "unknown logical port with datapath %"PRIu32" "
-                     "and port %"PRIu32, pmb->dp_key, pmb->port_key);
-        return;
-    }
-
-    /* Convert ethernet argument to string form for database. */
-    char mac_string[ETH_ADDR_STRLEN + 1];
-    snprintf(mac_string, sizeof mac_string,
-             ETH_ADDR_FMT, ETH_ADDR_ARGS(pmb->mac));
-
-    /* Check for an update an existing IP-MAC binding for this logical
-     * port.
-     *
-     * XXX This is not very efficient. */
-    const struct sbrec_mac_binding *b;
-    SBREC_MAC_BINDING_FOR_EACH (b, ctx->ovnsb_idl) {
-        if (!strcmp(b->logical_port, pb->logical_port)
-            && !strcmp(b->ip, pmb->ip_s)) {
-            if (strcmp(b->mac, mac_string)) {
-                sbrec_mac_binding_set_mac(b, mac_string);
-            }
-            return;
-        }
-    }
-
-    /* Add new IP-MAC binding for this logical port. */
-    b = sbrec_mac_binding_insert(ctx->ovnsb_idl_txn);
-    sbrec_mac_binding_set_logical_port(b, pb->logical_port);
-    sbrec_mac_binding_set_ip(b, pmb->ip_s);
-    sbrec_mac_binding_set_mac(b, mac_string);
-    sbrec_mac_binding_set_datapath(b, pb->datapath);
-}
-
-static void
-run_put_mac_bindings(struct controller_ctx *ctx,
-                     const struct lport_index *lports)
-{
-    if (!ctx->ovnsb_idl_txn) {
-        return;
-    }
-
+void
+pinctrl_put_mac_binding_for_each(void (*iteration_cb)(void *private_data,
+                                                      uint32_t port_key,
+                                                      uint32_t dp_key,
+                                                      const char *ip_s,
+                                                      const struct eth_addr *mac),
+                                 void *private_data) {
     const struct put_mac_binding *pmb;
     HMAP_FOR_EACH (pmb, hmap_node, &put_mac_bindings) {
-        run_put_mac_binding(ctx, lports, pmb);
-    }
-    flush_put_mac_bindings();
-}
-
-static void
-wait_put_mac_bindings(struct controller_ctx *ctx)
-{
-    if (ctx->ovnsb_idl_txn && !hmap_is_empty(&put_mac_bindings)) {
-        poll_immediate_wake();
+        iteration_cb(private_data, pmb->port_key,
+                     pmb->dp_key, pmb->ip_s,
+                     &pmb->mac);
     }
 }
 
diff --git a/ovn/controller/pinctrl.h b/ovn/controller/pinctrl.h
index af3c4b0..39b8a6b 100644
--- a/ovn/controller/pinctrl.h
+++ b/ovn/controller/pinctrl.h
@@ -28,10 +28,17 @@  struct controller_ctx;
 struct sbrec_chassis;
 
 void pinctrl_init(void);
-void pinctrl_run(struct controller_ctx *, const struct lport_index *,
-                 const struct ovsrec_bridge *, const struct sbrec_chassis *,
+void pinctrl_run(const struct lport_index *,
+                 const struct ovsrec_bridge *,
+                 const struct sbrec_chassis *,
                  struct hmap *local_datapaths);
-void pinctrl_wait(struct controller_ctx *);
+void pinctrl_wait(void);
 void pinctrl_destroy(void);
+void pinctrl_put_mac_binding_for_each(void (*iteration_cb)(void *private_data,
+                                                           uint32_t port_key,
+                                                           uint32_t dp_key,
+                                                           const char *ip_s,
+                                                           const struct eth_addr *mac),
+                                      void *private_data);
 
 #endif /* ovn/pinctrl.h */
diff --git a/ovn/lib/actions.c b/ovn/lib/actions.c
index 686ecc5..35abf91 100644
--- a/ovn/lib/actions.c
+++ b/ovn/lib/actions.c
@@ -1660,6 +1660,133 @@  static void
 free_SET_QUEUE(struct ovnact_set_queue *a OVS_UNUSED)
 {
 }
+
+static void
+parse_BCAST2LR(struct action_context *ctx)
+{
+    struct ovnact_bcast2lr bc2lr = {
+        .datapath = NULL, .port = NULL, .chassis = NULL
+    };
+
+    if (lexer_match(ctx->lexer, LEX_T_LPAREN)) {
+        while (!lexer_match(ctx->lexer, LEX_T_RPAREN)) {
+            const char **keyp = NULL;
+
+            if (lexer_match_id(ctx->lexer, "datapath")) {
+                keyp = &bc2lr.datapath;
+            } else if (lexer_match_id(ctx->lexer, "port")) {
+                keyp = &bc2lr.port;
+            } else if (lexer_match_id(ctx->lexer, "chassis")) {
+                keyp = &bc2lr.chassis;
+            } else {
+                return;
+            }
+
+            if (!lexer_force_match(ctx->lexer, LEX_T_EQUALS)
+                || ctx->lexer->token.type != LEX_T_STRING) {
+                return;
+            }
+            *keyp = xstrdup(ctx->lexer->token.s);
+            lexer_get(ctx->lexer);
+
+            lexer_match(ctx->lexer, LEX_T_COMMA);
+        }
+    }
+
+    if (!bc2lr.datapath || !bc2lr.port || !bc2lr.chassis) {
+        return;
+    }
+
+    bc2lr.is_this_chassis = !strcmp(ctx->pp->chassis_id,
+                                    bc2lr.chassis);
+    struct ovnact_bcast2lr *b = ovnact_put_BCAST2LR(ctx->ovnacts);
+    b->chassis = bc2lr.chassis;
+    b->port = bc2lr.port;
+    b->datapath = bc2lr.datapath;
+    b->is_this_chassis = bc2lr.is_this_chassis;
+}
+
+static void
+format_BCAST2LR(const struct ovnact_bcast2lr *bc2lr, struct ds *s)
+{
+    ds_put_format(s, "bcast2lr(datapath=\"%s\", port=\"%s\", chassis=\"%s\");",
+                  bc2lr->datapath, bc2lr->port, bc2lr->chassis);
+}
+
+static void
+encode_BCAST2LR(const struct ovnact_bcast2lr *bc2lr,
+                const struct ovnact_encode_params *ep,
+                struct ofpbuf *ofpacts)
+{
+    if (!bc2lr->is_this_chassis) {
+        /* Nothing to broadcast from this chassis. */
+        return;
+    }
+
+    unsigned int dp_key, port_key;
+    if (!ep->lookup_path(ep->aux, bc2lr->datapath, bc2lr->port,
+                         &dp_key, &port_key)) {
+        return;
+    }
+
+    ovs_be64 dp_key_be = htonll(dp_key);
+    port_key = htonl(port_key);
+
+    const ovs_be32 out_value = htonl(MC_LROUTER_KEY);
+    const struct mf_field *mf_dp = mf_from_id(MFF_LOG_DATAPATH);
+    const struct mf_field *mf_op = mf_from_id(MFF_LOG_OUTPORT);
+    const struct mf_field *mf_ip = mf_from_id(MFF_LOG_INPORT);
+    struct mf_subfield dp_field = {
+        .field = mf_dp,
+        .ofs = 0,
+        .n_bits = mf_dp->n_bits
+    };
+    struct mf_subfield op_field = {
+        .field = mf_op,
+        .ofs = 0,
+        .n_bits = mf_op->n_bits
+    };
+    struct mf_subfield ip_field = {
+        .field = mf_ip,
+        .ofs = 0,
+        .n_bits = mf_ip->n_bits
+    };
+
+    ofpact_put_STACK_PUSH(ofpacts)->subfield = dp_field;
+    ofpact_put_STACK_PUSH(ofpacts)->subfield = op_field;
+    ofpact_put_STACK_PUSH(ofpacts)->subfield = ip_field;
+
+    struct ofpact_set_field *sf_dp = ofpact_put_set_field(ofpacts,
+                                                          mf_dp,
+                                                          NULL,
+                                                          NULL);
+    bitwise_copy(&dp_key_be, 8, 0, sf_dp->value,
+                 sf_dp->field->n_bytes, 0, mf_dp->n_bits);
+    bitwise_one(ofpact_set_field_mask(sf_dp), sf_dp->field->n_bytes, 0,
+                mf_dp->n_bits);
+
+    ofpact_put_set_field(ofpacts,
+                         mf_ip, &port_key,
+                         NULL);
+
+    ofpact_put_set_field(ofpacts,
+                         mf_op, &out_value,
+                         NULL);
+    emit_resubmit(ofpacts, ep->output_ptable);
+
+    ofpact_put_STACK_POP(ofpacts)->subfield = ip_field;
+    ofpact_put_STACK_POP(ofpacts)->subfield = op_field;
+    ofpact_put_STACK_POP(ofpacts)->subfield = dp_field;
+}
+
+static void
+free_BCAST2LR(struct ovnact_bcast2lr *bc2lr)
+{
+    free((void *)bc2lr->chassis);
+    free((void *)bc2lr->port);
+    free((void *)bc2lr->datapath);
+}
+
 
 /* Parses an assignment or exchange or put_dhcp_opts action. */
 static void
@@ -1733,6 +1860,8 @@  parse_action(struct action_context *ctx)
         parse_put_mac_bind(ctx, 128, ovnact_put_PUT_ND(ctx->ovnacts));
     } else if (lexer_match_id(ctx->lexer, "set_queue")) {
         parse_SET_QUEUE(ctx);
+    } else if (lexer_match_id(ctx->lexer, "bcast2lr")) {
+        parse_BCAST2LR(ctx);
     } else {
         lexer_syntax_error(ctx->lexer, "expecting action");
     }
diff --git a/ovn/northd/ovn-northd.c b/ovn/northd/ovn-northd.c
index a28327b..be02ccb 100644
--- a/ovn/northd/ovn-northd.c
+++ b/ovn/northd/ovn-northd.c
@@ -111,7 +111,8 @@  enum ovn_stage {
     PIPELINE_STAGE(SWITCH, IN,  ARP_ND_RSP,    10, "ls_in_arp_rsp")       \
     PIPELINE_STAGE(SWITCH, IN,  DHCP_OPTIONS,  11, "ls_in_dhcp_options")  \
     PIPELINE_STAGE(SWITCH, IN,  DHCP_RESPONSE, 12, "ls_in_dhcp_response") \
-    PIPELINE_STAGE(SWITCH, IN,  L2_LKUP,       13, "ls_in_l2_lkup")       \
+    PIPELINE_STAGE(SWITCH, IN,  DYN_ARP_BCAST, 13, "ls_in_dynamic_arp_bcast") \
+    PIPELINE_STAGE(SWITCH, IN,  L2_LKUP,       14, "ls_in_l2_lkup")      \
                                                                       \
     /* Logical switch egress stages. */                               \
     PIPELINE_STAGE(SWITCH, OUT, PRE_LB,       0, "ls_out_pre_lb")     \
@@ -378,7 +379,7 @@  struct ovn_datapath {
     struct hmap port_tnlids;
     uint32_t port_key_hint;
 
-    bool has_unknown;
+    struct ovn_port *unknown_port;
 
     /* IPAM data. */
     struct hmap ipam;
@@ -1443,19 +1444,6 @@  ovn_port_update_sbrec(const struct ovn_port *op,
     }
 }
 
-/* Remove mac_binding entries that refer to logical_ports which are
- * deleted. */
-static void
-cleanup_mac_bindings(struct northd_context *ctx, struct hmap *ports)
-{
-    const struct sbrec_mac_binding *b, *n;
-    SBREC_MAC_BINDING_FOR_EACH_SAFE (b, n, ctx->ovnsb_idl) {
-        if (!ovn_port_find(ports, b->logical_port)) {
-            sbrec_mac_binding_delete(b);
-        }
-    }
-}
-
 /* Updates the southbound Port_Binding table so that it contains the logical
  * switch ports specified by the northbound database.
  *
@@ -1504,20 +1492,12 @@  build_ports(struct northd_context *ctx, struct hmap *datapaths,
         sbrec_port_binding_set_tunnel_key(op->sb, tunnel_key);
     }
 
-    bool remove_mac_bindings = false;
-    if (!ovs_list_is_empty(&sb_only)) {
-        remove_mac_bindings = true;
-    }
-
     /* Delete southbound records without northbound matches. */
     LIST_FOR_EACH_SAFE(op, next, list, &sb_only) {
         ovs_list_remove(&op->list);
         sbrec_port_binding_delete(op->sb);
         ovn_port_destroy(ports, op);
     }
-    if (remove_mac_bindings) {
-        cleanup_mac_bindings(ctx, ports);
-    }
 
     tag_alloc_destroy(&tag_alloc_table);
     destroy_chassis_queues(&chassis_qdisc_queues);
@@ -1537,6 +1517,8 @@  static const struct multicast_group mc_flood = { MC_FLOOD, 65535 };
 #define MC_UNKNOWN "_MC_unknown"
 static const struct multicast_group mc_unknown = { MC_UNKNOWN, 65534 };
 
+static const struct multicast_group mc_lrouter = { MC_LROUTER, MC_LROUTER_KEY };
+
 static bool
 multicast_group_equal(const struct multicast_group *a,
                       const struct multicast_group *b)
@@ -3061,9 +3043,10 @@  build_lswitch_flows(struct hmap *datapaths, struct hmap *ports,
 
         ovn_lflow_add(lflows, od, S_SWITCH_IN_DHCP_OPTIONS, 0, "1", "next;");
         ovn_lflow_add(lflows, od, S_SWITCH_IN_DHCP_RESPONSE, 0, "1", "next;");
+        ovn_lflow_add(lflows, od, S_SWITCH_IN_DYN_ARP_BCAST, 0, "1", "next;");
     }
 
-    /* Ingress table 13: Destination lookup, broadcast and multicast handling
+    /* Ingress table 14: Destination lookup, broadcast and multicast handling
      * (priority 100). */
     HMAP_FOR_EACH (op, key_node, ports) {
         if (!op->nbsp) {
@@ -3107,7 +3090,7 @@  build_lswitch_flows(struct hmap *datapaths, struct hmap *ports,
             } else if (!strcmp(op->nbsp->addresses[i], "unknown")) {
                 if (lsp_is_enabled(op->nbsp)) {
                     ovn_multicast_add(mcgroups, &mc_unknown, op);
-                    op->od->has_unknown = true;
+                    op->od->unknown_port = op;
                 }
             } else if (is_dynamic_lsp_address(op->nbsp->addresses[i])) {
                 if (!op->nbsp->dynamic_addresses
@@ -3139,7 +3122,48 @@  build_lswitch_flows(struct hmap *datapaths, struct hmap *ports,
             continue;
         }
 
-        if (od->has_unknown) {
+        if (od->unknown_port) {
+            if (od->unknown_port->sb->chassis) {
+                for (int i = 0; i < od->n_router_ports; i++) {
+                    struct ovn_port *peer = od->router_ports[i]->peer;
+                    if (!peer) {
+                        continue;
+                    }
+
+                    ds_clear(&match);
+                    ds_clear(&actions);
+                    ds_put_format(
+                        &match,
+                        "inport == \"%s\" && arp.op == 2 /* ARP reply */"
+                        " && eth.dst == %s",
+                        od->unknown_port->key,
+                        peer->lrp_networks.ea_s);
+
+                    ds_put_format(
+                        &actions,
+                        "bcast2lr(datapath=\""UUID_FMT"\", port=\"%s\", "
+                        "chassis=\"%s\"); next;",
+                        UUID_ARGS(&peer->od->sb->header_.uuid),
+                        peer->sb->logical_port,
+                        od->unknown_port->sb->chassis->name);
+
+                    struct smap opts;
+                    smap_clone(&opts, &peer->sb->options);
+                    smap_add(&opts,
+                             "arp-bcast-chassis",
+                             od->unknown_port->sb->chassis->name);
+                    sbrec_port_binding_set_options(
+                        peer->sb,
+                        &opts);
+                    smap_destroy(&opts);
+                    /* Broadcast ARP responses to the router datapath. */
+                    ovn_lflow_add(lflows, od, S_SWITCH_IN_DYN_ARP_BCAST, 100,
+                                  ds_cstr(&match),
+                                  ds_cstr(&actions));
+                    ovn_multicast_add(mcgroups, &mc_lrouter,
+                                      peer);
+                }
+            }
             ovn_lflow_add(lflows, od, S_SWITCH_IN_L2_LKUP, 0, "1",
                           "outport = \""MC_UNKNOWN"\"; output;");
         }
@@ -3317,7 +3341,7 @@  build_static_route_flow(struct hmap *lflows, struct ovn_datapath *od,
                         const struct nbrec_logical_router_static_route *route)
 {
     ovs_be32 nexthop;
-    const char *lrp_addr_s;
+    const char *lrp_addr_s = NULL;
     unsigned int plen;
     bool is_ipv4;
 
@@ -4977,12 +5001,6 @@  main(int argc, char *argv[])
     add_column_noalert(ovnsb_idl_loop.idl, &sbrec_port_binding_col_options);
     add_column_noalert(ovnsb_idl_loop.idl, &sbrec_port_binding_col_mac);
     ovsdb_idl_add_column(ovnsb_idl_loop.idl, &sbrec_port_binding_col_chassis);
-    ovsdb_idl_add_table(ovnsb_idl_loop.idl, &sbrec_table_mac_binding);
-    add_column_noalert(ovnsb_idl_loop.idl, &sbrec_mac_binding_col_datapath);
-    add_column_noalert(ovnsb_idl_loop.idl, &sbrec_mac_binding_col_ip);
-    add_column_noalert(ovnsb_idl_loop.idl, &sbrec_mac_binding_col_mac);
-    add_column_noalert(ovnsb_idl_loop.idl,
-                       &sbrec_mac_binding_col_logical_port);
     ovsdb_idl_add_table(ovnsb_idl_loop.idl, &sbrec_table_dhcp_options);
     add_column_noalert(ovnsb_idl_loop.idl, &sbrec_dhcp_options_col_code);
     add_column_noalert(ovnsb_idl_loop.idl, &sbrec_dhcp_options_col_type);
@@ -4997,6 +5015,7 @@  main(int argc, char *argv[])
 
     ovsdb_idl_add_table(ovnsb_idl_loop.idl, &sbrec_table_chassis);
     ovsdb_idl_add_column(ovnsb_idl_loop.idl, &sbrec_chassis_col_nb_cfg);
+    ovsdb_idl_add_column(ovnsb_idl_loop.idl, &sbrec_chassis_col_name);
 
     /* Main loop. */
     exiting = false;
diff --git a/ovn/ovn-architecture.7.xml b/ovn/ovn-architecture.7.xml
index d96e4b1..f8d12c5 100644
--- a/ovn/ovn-architecture.7.xml
+++ b/ovn/ovn-architecture.7.xml
@@ -871,8 +871,8 @@ 
           <p>
             Implemented by storing arguments into OpenFlow fields, then
             resubmitting to table 66, which <code>ovn-controller</code>
-            populates with flows generated from the <code>MAC_Binding</code>
-            table in the OVN Southbound database.  If there is a match in table
+            populates with flows from its local cache of MAC bindings
+            learned via broadcast ARP responses.  If there is a match in table
             66, then its actions store the bound MAC in the Ethernet
             destination address field.
           </p>
@@ -890,7 +890,7 @@ 
           <p>
             Implemented by storing the arguments into OpenFlow fields, then
             outputting a packet to <code>ovn-controller</code>, which updates
-            the <code>MAC_Binding</code> table.
+            a local in-memory cache for MAC bindings.
           </p>
 
           <p>
diff --git a/ovn/ovn-sb.ovsschema b/ovn/ovn-sb.ovsschema
index 0212a5e..a59ef12 100644
--- a/ovn/ovn-sb.ovsschema
+++ b/ovn/ovn-sb.ovsschema
@@ -1,7 +1,7 @@ 
 {
     "name": "OVN_Southbound",
     "version": "1.9.0",
-    "cksum": "2240045372 9719",
+    "cksum": "3527166512 9310",
     "tables": {
         "SB_Global": {
             "columns": {
@@ -130,15 +130,6 @@ 
                                  "max": "unlimited"}}},
             "indexes": [["datapath", "tunnel_key"], ["logical_port"]],
             "isRoot": true},
-        "MAC_Binding": {
-            "columns": {
-                "logical_port": {"type": "string"},
-                "ip": {"type": "string"},
-                "mac": {"type": "string"},
-                "datapath": {"type": {"key": {"type": "uuid",
-                                              "refTable": "Datapath_Binding"}}}},
-            "indexes": [["logical_port", "ip"]],
-            "isRoot": true},
         "DHCP_Options": {
             "columns": {
                 "name": {"type": "string"},
diff --git a/ovn/ovn-sb.xml b/ovn/ovn-sb.xml
index 2f35079..0abd42d 100644
--- a/ovn/ovn-sb.xml
+++ b/ovn/ovn-sb.xml
@@ -103,17 +103,6 @@ 
     contain binding data.
   </p>
 
-  <h3>MAC bindings</h3>
-
-  <p>
-    The <ref table="MAC_Binding"/> table tracks the bindings from IP addresses
-    to Ethernet addresses that are dynamically discovered using ARP (for IPv4)
-    and neighbor discovery (for IPv6).  Usually, IP-to-MAC bindings for virtual
-    machines are statically populated into the <ref table="Port_Binding"/>
-    table, so <ref table="MAC_Binding"/> is primarily used to discover bindings
-    on physical networks.
-  </p>
-
   <h2>Common Columns</h2>
 
   <p>
@@ -1410,6 +1399,39 @@ 
             packet in that connection.
           </p>
         </dd>
+
+        <dt><code>bcast2lr(datapath=<var>router_datapath_uuid</var>, port=<var>router_port_name</var>, chassis=<var>broadcasting_chassis_name</var>);</code></dt>
+        <dd>
+          <p>
+            <b>Parameters</b>:
+              <p>
+                <code>datapath</code> - <code>uuid</code> of the logical router
+                datapath to which the packet has to be broadcasted.
+              </p>
+              <p>
+                <code>port</code> - name of the logical router port which is
+                supposed to receive the packet in the <code>datapath</code>.
+              </p>
+              <p>
+                <code>chassis</code> - The name of the chassis from which this
+                packet should be broadcasted.
+              </p>
+          </p>
+
+          <p>
+            <code>bcast2lr</code> broadcasts a packet to the
+            <code>_MC_LRouter</code> multicast group.  This action is used to
+            broadcast dynamic ARP responses to a logical router to which the
+            switch is connected.
+          </p>
+
+          <p>
+            Since dynamic ARP responses originate from the chassis to which
+            a port with an <code>"unknown"</code> address is bound, the
+            broadcast happens only from that chassis.  On all other chassis
+            this action is a no-op.
+          </p>
+        </dd>
       </dl>
 
       <p>
@@ -1963,90 +1985,6 @@  tcp.flags = RST;
     </group>
   </table>
 
-  <table name="MAC_Binding" title="IP to MAC bindings">
-    <p>
-      Each row in this table specifies a binding from an IP address to an
-      Ethernet address that has been discovered through ARP (for IPv4) or
-      neighbor discovery (for IPv6).  This table is primarily used to discover
-      bindings on physical networks, because IP-to-MAC bindings for virtual
-      machines are usually populated statically into the <ref
-      table="Port_Binding"/> table.
-    </p>
-
-    <p>
-      This table expresses a functional relationship: <ref
-      table="MAC_Binding"/>(<ref column="logical_port"/>, <ref column="ip"/>) =
-      <ref column="mac"/>.
-    </p>
-
-    <p>
-      In outline, the lifetime of a logical router's MAC binding looks like
-      this:
-    </p>
-
-    <ol>
-      <li>
-        On hypervisor 1, a logical router determines that a packet should be
-        forwarded to IP address <var>A</var> on one of its router ports.  It
-        uses its logical flow table to determine that <var>A</var> lacks a
-        static IP-to-MAC binding and the <code>get_arp</code> action to
-        determine that it lacks a dynamic IP-to-MAC binding.
-      </li>
-
-      <li>
-        Using an OVN logical <code>arp</code> action, the logical router
-        generates and sends a broadcast ARP request to the router port.  It
-        drops the IP packet.
-      </li>
-
-      <li>
-        The logical switch attached to the router port delivers the ARP request
-        to all of its ports.  (It might make sense to deliver it only to ports
-        that have no static IP-to-MAC bindings, but this could also be
-        surprising behavior.)
-      </li>
-
-      <li>
-        A host or VM on hypervisor 2 (which might be the same as hypervisor 1)
-        attached to the logical switch owns the IP address in question.  It
-        composes an ARP reply and unicasts it to the logical router port's
-        Ethernet address.
-      </li>
-
-      <li>
-        The logical switch delivers the ARP reply to the logical router port.
-      </li>
-
-      <li>
-        The logical router flow table executes a <code>put_arp</code> action.
-        To record the IP-to-MAC binding, <code>ovn-controller</code> adds a row
-        to the <ref table="MAC_Binding"/> table.
-      </li>
-
-      <li>
-        On hypervisor 1, <code>ovn-controller</code> receives the updated <ref
-        table="MAC_Binding"/> table from the OVN southbound database.  The next
-        packet destined to <var>A</var> through the logical router is sent
-        directly to the bound Ethernet address.
-      </li>
-    </ol>
-
-    <column name="logical_port">
-      The logical port on which the binding was discovered.
-    </column>
-
-    <column name="ip">
-      The bound IP address.
-    </column>
-
-    <column name="mac">
-      The Ethernet address to which the IP is bound.
-    </column>
-    <column name="datapath">
-      The logical datapath to which the logical port belongs.
-    </column>
-  </table>
-
   <table name="DHCP_Options" title="DHCP Options supported by native OVN DHCP">
     <p>
       Each row in this table stores the DHCP Options supported by native OVN
diff --git a/ovn/utilities/ovn-sbctl.c b/ovn/utilities/ovn-sbctl.c
index 92ae3e5..e3119cc 100644
--- a/ovn/utilities/ovn-sbctl.c
+++ b/ovn/utilities/ovn-sbctl.c
@@ -972,10 +972,6 @@  static const struct ctl_table_class tables[] = {
      {{&sbrec_table_port_binding, &sbrec_port_binding_col_logical_port, NULL},
       {NULL, NULL, NULL}}},
 
-    {&sbrec_table_mac_binding,
-     {{&sbrec_table_mac_binding, &sbrec_mac_binding_col_logical_port, NULL},
-      {NULL, NULL, NULL}}},
-
     {&sbrec_table_address_set,
      {{&sbrec_table_address_set, &sbrec_address_set_col_name, NULL},
       {NULL, NULL, NULL}}},
diff --git a/ovn/utilities/ovn-trace.c b/ovn/utilities/ovn-trace.c
index 6d8e514..c74a0f8 100644
--- a/ovn/utilities/ovn-trace.c
+++ b/ovn/utilities/ovn-trace.c
@@ -689,47 +689,6 @@  read_dhcp_opts(void)
 }
 
 static void
-read_mac_bindings(void)
-{
-    const struct sbrec_mac_binding *sbmb;
-    SBREC_MAC_BINDING_FOR_EACH (sbmb, ovnsb_idl) {
-        const struct ovntrace_port *port = shash_find_data(
-            &ports, sbmb->logical_port);
-        if (!port) {
-            VLOG_WARN("missing port %s", sbmb->logical_port);
-            continue;
-        }
-
-        if (!uuid_equals(&port->dp->sb_uuid, &sbmb->datapath->header_.uuid)) {
-            VLOG_WARN("port %s is in wrong datapath", sbmb->logical_port);
-            continue;
-        }
-
-        struct in6_addr ip6;
-        ovs_be32 ip4;
-        if (ip_parse(sbmb->ip, &ip4)) {
-            ip6 = in6_addr_mapped_ipv4(ip4);
-        } else if (!ipv6_parse(sbmb->ip, &ip6)) {
-            VLOG_WARN("%s: bad IP address", sbmb->ip);
-            continue;
-        }
-
-        struct eth_addr mac;
-        if (!eth_addr_from_string(sbmb->mac, &mac)) {
-            VLOG_WARN("%s: bad Ethernet address", sbmb->mac);
-            continue;
-        }
-
-        struct ovntrace_mac_binding *binding = xmalloc(sizeof *binding);
-        binding->port_key = port->tunnel_key;
-        binding->ip = ip6;
-        binding->mac = mac;
-        hmap_insert(&port->dp->mac_bindings, &binding->node,
-                    hash_mac_binding(binding->port_key, &ip6));
-    }
-}
-
-static void
 read_db(void)
 {
     read_datapaths();
@@ -738,7 +697,6 @@  read_db(void)
     read_address_sets();
     read_dhcp_opts();
     read_flows();
-    read_mac_bindings();
 }
 
 static bool
@@ -1377,6 +1335,9 @@  trace_actions(const struct ovnact *ovnacts, size_t ovnacts_len,
              * though, it would be easy enough to track the queue information
              * by adjusting uflow->skb_priority. */
             break;
+        case OVNACT_BCAST2LR:
+            /* XXX Tracing for bcast2lr is not yet implemented. */
+            break;
         }
 
     }
diff --git a/tests/ovn.at b/tests/ovn.at
index 557b2ca..7bb7812 100644
--- a/tests/ovn.at
+++ b/tests/ovn.at
@@ -2375,7 +2375,7 @@  for is in 1 2 3; do
           fi
 
           kd=1
-          d=$id$jd$kd
+          d=${id}1${kd}
 
           o4=`expr $is '*' 9 + $js '*' 3 + $ks + 10`
           host_ip=`ip_to_hex 192 168 $id$jd $o4`
@@ -2405,7 +2405,7 @@  done
 
 # Allow some time for packet forwarding.
 # XXX This can be improved.
-sleep 1
+sleep 2
 
 # 9. Send an IP packet from every logical port to every other subnet.  These
 #    are the same packets already sent as #3, but now the destinations' IP-MAC
@@ -2446,8 +2446,72 @@  for is in 1 2 3; do
   done
 done
 
-ovn-sbctl -f csv -d bare --no-heading \
-    -- --columns=logical_port,ip,mac list mac_binding > mac_bindings
+hex2ip() {
+    local ip dec=$((16#$1))
+        for e in {3..0}
+    do
+        ((octet = dec / (256 ** e) ))
+        ((dec -= octet * 256 ** e))
+        ip+=$delim$octet
+        delim=.
+    done
+    printf '%s\n' "$ip"
+}
+
+_parse_flow_rule() {
+    echo "$@" | \
+    sed -e 's/.*reg0=0x\([[a-zA-Z0-9]]*\).*,reg15=0x\([[0-9]]*\).*metadata=0x\([[0-9]]*\).*actions=mod_dl_dst:\([[:0-9A-Za-z]]*\)/\1,\2,\3,\4/'
+}
+
+_get_hash_value() {
+    local key=$1
+    shift
+    local hash="$@"
+    local ret=
+
+    for h in $hash; do
+        case "$h" in
+            $key=*)
+                ret=`echo $h | sed -e 's/.*=//'`
+                ;;
+        esac
+    done
+    echo $ret
+}
+
+update_mac_binding() {
+    local dp_map=""
+    ovn-sbctl -d bare -f csv --no-headings -- --columns=_uuid,tunnel_key list Datapath_Binding > sbctl.dp.log
+    while IFS=, read -r -a line; do
+        dp_map="$dp_map ${line[[0]]}=${line[[1]]}"
+    done < sbctl.dp.log
+    rm sbctl.dp.log
+
+    pb_map=
+    ovn-sbctl -d bare -f csv --no-headings -- --columns=tunnel_key,logical_port,datapath list Port_Binding > sbctl.pb.log
+    while IFS=, read -r -a line; do
+        dp_key=$(_get_hash_value ${line[[2]]} "$dp_map")
+        pb_map="$pb_map ${line[[0]]},${dp_key}=${line[[1]]}"
+    done < sbctl.pb.log
+    rm sbctl.pb.log
+
+    ovs-ofctl dump-flows br-int table=66 | \
+    while read line; do
+        _parse_flow_rule $line | while IFS=, read -r -a mb; do
+            if test "${#mb[@]}" -ne 4; then
+                continue
+            fi
+            local _ip=$(hex2ip ${mb[[0]]})
+            local _port_key=${mb[[1]]}
+            local _dp_key=${mb[[2]]}
+            local _mac=${mb[[3]]}
+            local _port_name=$(_get_hash_value "$_port_key,$_dp_key" "$pb_map")
+            echo "$_port_name,$_ip,$_mac" >> mac_bindings
+        done
+    done
+}
+
+as hv2 update_mac_binding
 
 # Now check the packets actually received against the ones expected.
 for i in 1 2 3; do
@@ -5170,37 +5234,6 @@  OVN_CLEANUP([hv1])
 
 AT_CLEANUP
 
-AT_SETUP([ovn -- delete mac bindings])
-ovn_start
-net_add n1
-sim_add hv1
-as hv1
-ovs-vsctl -- add-br br-phys
-ovn_attach n1 br-phys 192.168.0.1
-# Create logical switch ls0
-ovn-nbctl ls-add ls0
-# Create ports lp0, lp1 in ls0
-ovn-nbctl lsp-add ls0 lp0
-ovn-nbctl lsp-add ls0 lp1
-ovn-nbctl lsp-set-addresses lp0 "f0:00:00:00:00:01 192.168.0.1"
-ovn-nbctl lsp-set-addresses lp1 "f0:00:00:00:00:02 192.168.0.2"
-dp_uuid=`ovn-sbctl find datapath | grep uuid | cut -f2 -d ":" | cut -f2 -d " "`
-ovn-sbctl create MAC_Binding ip=10.0.0.1 datapath=$dp_uuid logical_port=lp0 mac="mac1"
-ovn-sbctl create MAC_Binding ip=10.0.0.1 datapath=$dp_uuid logical_port=lp1 mac="mac2"
-ovn-sbctl find MAC_Binding
-# Delete port lp0 and check that its MAC_Binding is deleted.
-ovn-nbctl lsp-del lp0
-ovn-sbctl find MAC_Binding
-OVS_WAIT_UNTIL([test `ovn-sbctl find MAC_Binding logical_port=lp0 | wc -l` = 0])
-# Delete logical switch ls0 and check that its MAC_Binding is deleted.
-ovn-nbctl ls-del ls0
-ovn-sbctl find MAC_Binding
-OVS_WAIT_UNTIL([test `ovn-sbctl find MAC_Binding | wc -l` = 0])
-
-OVN_CLEANUP([hv1])
-
-AT_CLEANUP
-
 AT_SETUP([ovn -- conntrack zone allocation])
 AT_SKIP_IF([test $HAVE_PYTHON = no])
 ovn_start