diff mbox series

[ovs-dev,ovn,v2,3/3] ovn-northd.c: Support optionally disabling neighbor learning from ARP request/NS.

Message ID 1596611114-118204-4-git-send-email-hzhou@ovn.org
State Accepted
Headers show
Series Avoid ARP flow explosion. | expand

Commit Message

Han Zhou Aug. 5, 2020, 7:05 a.m. UTC
Support a new logical router option "always_learn_from_arp_request" that controls
behavior when handling ARP requests or IPv4 ND-NS packets.

"true" - Always learn the MAC/IP binding and add a new MAC_Binding entry
(default behavior)

"false" - If there is a MAC_binding for that IP and the MAC is different, or,
if TPA of ARP request belongs to any router port on this router, then
update/add that MAC/IP binding. Otherwise, don't update/add entries.

Reported-by: Girish Moodalbail <gmoodalbail@nvidia.com>
Reported-at: https://mail.openvswitch.org/pipermail/ovs-discuss/2020-May/049995.html
Signed-off-by: Han Zhou <hzhou@ovn.org>
---
 northd/ovn-northd.8.xml | 109 ++++++++++++++++++++++++++++++++++++++++--------
 northd/ovn-northd.c     |  96 +++++++++++++++++++++++++++++++++++-------
 ovn-nb.xml              |  27 ++++++++++++
 tests/ovn.at            |  18 ++++++++
 4 files changed, 217 insertions(+), 33 deletions(-)

Comments

0-day Robot Aug. 5, 2020, 8:02 a.m. UTC | #1
Bleep bloop.  Greetings Han Zhou, I am a robot and I have tried out your patch.
Thanks for your contribution.

I encountered some error that I wasn't expecting.  See the details below.


checkpatch:
WARNING: Line is 92 characters long (recommended limit is 79)
#370 FILE: ovn-nb.xml:1862:
      <column name="options" key="always_learn_from_arp_request" type='{"type": "boolean"}'>

Lines checked: 438, Warnings: 1, Errors: 0


Please check this out.  If you feel there has been an error, please email aconole@redhat.com

Thanks,
0-day Robot
Numan Siddique Aug. 6, 2020, 6:07 p.m. UTC | #2
On Wed, Aug 5, 2020 at 12:36 PM Han Zhou <hzhou@ovn.org> wrote:
>
> Support a new logical router option "always_learn_from_arp_request" that controls
> behavior when handling ARP requests or IPv4 ND-NS packets.
>
> "true" - Always learn the MAC/IP binding and add a new MAC_Binding entry
> (default behavior)
>
> "false" - If there is a MAC_binding for that IP and the MAC is different, or,
> if TPA of ARP request belongs to any router port on this router, then
> update/add that MAC/IP binding. Otherwise, don't update/add entries.
>
> Reported-by: Girish Moodalbail <gmoodalbail@nvidia.com>
> Reported-at: https://mail.openvswitch.org/pipermail/ovs-discuss/2020-May/049995.html
> Signed-off-by: Han Zhou <hzhou@ovn.org>

Hi Han,

I have few small nits. Please see below. With those addressed

Acked-by: Numan Siddique <numans@ovn.org> for all the 3 patches in the series.


> ---
>  northd/ovn-northd.8.xml | 109 ++++++++++++++++++++++++++++++++++++++++--------
>  northd/ovn-northd.c     |  96 +++++++++++++++++++++++++++++++++++-------
>  ovn-nb.xml              |  27 ++++++++++++
>  tests/ovn.at            |  18 ++++++++
>  4 files changed, 217 insertions(+), 33 deletions(-)
>
> diff --git a/northd/ovn-northd.8.xml b/northd/ovn-northd.8.xml
> index 508123e..b08293f 100644
> --- a/northd/ovn-northd.8.xml
> +++ b/northd/ovn-northd.8.xml
> @@ -1537,58 +1537,126 @@ output;
>          <p>
>            For each router port <var>P</var> that owns IP address <var>A</var>,
>            which belongs to subnet <var>S</var> with prefix length <var>L</var>,
> -          a priority-100 flow is added which matches
> -          <code>inport == <var>P</var> &amp;&amp;
> -          arp.spa == <var>S</var>/<var>L</var> &amp;&amp; arp.op == 1</code>
> -          (ARP request) with the
> -          following actions:
> +          if the option <code>always_learn_from_arp_request</code> is
> +          <code>true</code> for this router, a priority-100 flow is added which
> +          matches <code>inport == <var>P</var> &amp;&amp; arp.spa ==
> +          <var>S</var>/<var>L</var> &amp;&amp; arp.op == 1</code> (ARP request)
> +          with the following actions:
> +        </p>
> +
> +        <pre>
> +reg9[2] = lookup_arp(inport, arp.spa, arp.sha);
> +next;
> +        </pre>
> +
> +        <p>
> +          If the option <code>always_learn_from_arp_request</code> is
> +          <code>false</code>, the following two flows are added.
> +        </p>
> +
> +        <p>
> +          A priority-110 flow is added which matches <code>inport ==
> +          <var>P</var> &amp;&amp; arp.spa == <var>S</var>/<var>L</var>
> +          &amp;&amp; arp.tpa == <var>A</var> &amp;&amp; arp.op == 1</code>
> +          (ARP request) with the following actions:
>          </p>
>
>          <pre>
>  reg9[2] = lookup_arp(inport, arp.spa, arp.sha);
> +reg9[3] = 1;
> +next;
> +        </pre>
> +
> +        <p>
> +          A priority-100 flow is added which matches <code>inport ==
> +          <var>P</var> &amp;&amp; arp.spa == <var>S</var>/<var>L</var>
> +          &amp;&amp; arp.op == 1</code> (ARP request) with the following
> +          actions:
> +        </p>
> +
> +        <pre>
> +reg9[2] = lookup_arp(inport, arp.spa, arp.sha);
> +reg9[3] = lookup_arp_ip(inport, arp.spa);
>  next;
>          </pre>
>
>          <p>
>            If the logical router port <var>P</var> is a distributed gateway
>            router port, additional match
> -          <code>is_chassis_resident(cr-<var>P</var>)</code> is added so that
> -          the resident gateway chassis handles the neighbor lookup.
> +          <code>is_chassis_resident(cr-<var>P</var>)</code> is added for all
> +          these flows.
>          </p>
>        </li>
>
>        <li>
>          <p>
>            A priority-100 flow which matches on ARP reply packets and applies
> -          the actions:
> +          the actions if the option <code>always_learn_from_arp_request</code>
> +          is <code>true</code>:
>          </p>
>
>          <pre>
>  reg9[2] = lookup_arp(inport, arp.spa, arp.sha);
>  next;
>          </pre>
> +
> +        <p>
> +          If the option <code>always_learn_from_arp_request</code>
> +          is <code>false</code>, the above actions will be:
> +        </p>
> +
> +        <pre>
> +reg9[2] = lookup_arp(inport, arp.spa, arp.sha);
> +reg9[3] = 1;
> +next;
> +        </pre>
> +
>        </li>
>
>        <li>
>          <p>
>            A priority-100 flow which matches on IPv6 Neighbor Discovery
> -          advertisement packet and applies the actions:
> +          advertisement packet and applies the actions if the option
> +          <code>always_learn_from_arp_request</code> is <code>true</code>:
>          </p>
>
>          <pre>
>  reg9[2] = lookup_nd(inport, nd.target, nd.tll);
>  next;
>          </pre>
> +
> +        <p>
> +          If the option <code>always_learn_from_arp_request</code>
> +          is <code>false</code>, the above actions will be:
> +        </p>
> +
> +        <pre>
> +reg9[2] = lookup_nd(inport, nd.target, nd.tll);
> +reg9[3] = 1;
> +next;
> +        </pre>
>        </li>
>
>        <li>
>          <p>
>            A priority-100 flow which matches on IPv6 Neighbor Discovery
> -          solicitation packet and applies the actions:
> +          solicitation packet and applies the actions if the option
> +          <code>always_learn_from_arp_request</code> is <code>true</code>:
> +        </p>
> +
> +        <pre>
> +reg9[2] = lookup_nd(inport, ip6.src, nd.sll);
> +next;
> +        </pre>
> +
> +        <p>
> +          If the option <code>always_learn_from_arp_request</code>
> +          is <code>false</code>, the above actions will be:
>          </p>
>
>          <pre>
>  reg9[2] = lookup_nd(inport, ip6.src, nd.sll);
> +reg9[3] = lookup_nd_ip(inport, ip6.src);
>  next;
>          </pre>
>        </li>
> @@ -1604,21 +1672,28 @@ next;
>
>      <p>
>        This table adds flows to learn the mac bindings from the ARP and
> -      IPv6 Neighbor Solicitation/Advertisement packets if ARP/ND lookup
> -      failed in the previous table.
> +      IPv6 Neighbor Solicitation/Advertisement packets if it is needed
> +      according to the lookup results from the previous stage.
>      </p>
>
>      <p>
>        reg9[2] will be <code>1</code> if the <code>lookup_arp/lookup_nd</code>
> -      in the previous table was successful, or if there was no need to do the
> -      lookup.
> +      in the previous table was successful or skipped, meaning no need
> +      to learn mac binding from the packet.
> +    </p>
> +
> +    <p>
> +      reg9[3] will be <code>1</code> if the
> +      <code>lookup_arp_ip/lookup_nd_ip</code> in the previous table was
> +      successful or skipped, meaning it is ok to learn mac binding from
> +      the packet (if reg9[2] is 0).
>      </p>
>
>      <ul>
>        <li>
> -        A priority-100 flow with the match
> -        <code>reg9[2] == 1</code> and advances the packet
> -        to the next table as there is no need to learn the neighbor.
> +        A priority-100 flow with the match <code>reg9[2] == 1 || reg9[3] ==
> +        0</code> and advances the packet to the next table as there is no need
> +        to learn the neighbor.
>        </li>
>
>        <li>
> diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
> index 5dd49f9..b7102b2 100644
> --- a/northd/ovn-northd.c
> +++ b/northd/ovn-northd.c
> @@ -222,6 +222,7 @@ enum ovn_stage {
>  /* Register to store the result of check_pkt_larger action. */
>  #define REGBIT_PKT_LARGER        "reg9[1]"
>  #define REGBIT_LOOKUP_NEIGHBOR_RESULT "reg9[2]"
> +#define REGBIT_LOOKUP_NEIGHBOR_IP_RESULT "reg9[3]"
>
>  /* Register to store the eth address associated to a router port for packets
>   * received in S_ROUTER_IN_ADMISSION.
> @@ -8442,36 +8443,62 @@ build_lrouter_flows(struct hmap *datapaths, struct hmap *ports,
>           * For ARP packets, table LOOKUP_NEIGHBOR does a lookup for the
>           * (arp.spa, arp.sha) in the mac binding table using the 'lookup_arp'
>           * action and stores the result in REGBIT_LOOKUP_NEIGHBOR_RESULT bit.
> +         * If "always_learn_from_arp_request" is set to false, it will also
> +         * lookup for the (arp.spa) in the mac binding table using the
> +         * "lookup_arp_ip" action for ARP request packets, and stores the
> +         * result in REGBIT_LOOKUP_NEIGHBOR_IP_RESULT bit; or set that bit
> +         * to "1" directly for ARP response packets.
>           *
>           * For IPv6 ND NA packets, table LOOKUP_NEIGHBOR does a lookup
>           * for the (nd.target, nd.tll) in the mac binding table using the
>           * 'lookup_nd' action and stores the result in
> -         * REGBIT_LOOKUP_NEIGHBOR_RESULT bit.
> +         * REGBIT_LOOKUP_NEIGHBOR_RESULT bit. If
> +         * "always_learn_from_arp_request" is set to false,
> +         * REGBIT_LOOKUP_NEIGHBOR_IP_RESULT bit is set.
>           *
>           * For IPv6 ND NS packets, table LOOKUP_NEIGHBOR does a lookup
>           * for the (ip6.src, nd.sll) in the mac binding table using the
>           * 'lookup_nd' action and stores the result in
> -         * REGBIT_LOOKUP_NEIGHBOR_RESULT bit.
> +         * REGBIT_LOOKUP_NEIGHBOR_RESULT bit. If
> +         * "always_learn_from_arp_request" is set to false, it will also lookup
> +         * for the (ip6.src) in the mac binding table using the "lookup_nd_ip"
> +         * action and stores the result in REGBIT_LOOKUP_NEIGHBOR_IP_RESULT
> +         * bit.
>           *
>           * Table LEARN_NEIGHBOR learns the mac-binding using the action
> -         * - 'put_arp/put_nd' only if REGBIT_LOOKUP_NEIGHBOR_RESULT bit
> -         * is not set.
> +         * - 'put_arp/put_nd'. Learning mac-binding is skipped if
> +         *   REGBIT_LOOKUP_NEIGHBOR_RESULT bit is set or
> +         *   REGBIT_LOOKUP_NEIGHBOR_IP_RESULT is not set.
>           *
>           * */
>
>          /* Flows for LOOKUP_NEIGHBOR. */
> +        bool learn_from_arp_request = smap_get_bool(&od->nbr->options,
> +            "always_learn_from_arp_request", true);
> +        ds_clear(&actions);
> +        ds_put_format(&actions, REGBIT_LOOKUP_NEIGHBOR_RESULT
> +                      " = lookup_arp(inport, arp.spa, arp.sha); %snext;",
> +                      learn_from_arp_request ? "" :
> +                      REGBIT_LOOKUP_NEIGHBOR_IP_RESULT" = 1; ");
>          ovn_lflow_add(lflows, od, S_ROUTER_IN_LOOKUP_NEIGHBOR, 100,
> -                      "arp.op == 2",
> -                      REGBIT_LOOKUP_NEIGHBOR_RESULT" = "
> -                      "lookup_arp(inport, arp.spa, arp.sha); next;");
> +                      "arp.op == 2", ds_cstr(&actions));
>
> +        ds_clear(&actions);
> +        ds_put_format(&actions, REGBIT_LOOKUP_NEIGHBOR_RESULT
> +                      " = lookup_nd(inport, nd.target, nd.tll); %snext;",
> +                      learn_from_arp_request ? "" :
> +                      REGBIT_LOOKUP_NEIGHBOR_IP_RESULT" = 1; ");
>          ovn_lflow_add(lflows, od, S_ROUTER_IN_LOOKUP_NEIGHBOR, 100, "nd_na",
> -                      REGBIT_LOOKUP_NEIGHBOR_RESULT" = "
> -                      "lookup_nd(inport, nd.target, nd.tll); next;");
> +                      ds_cstr(&actions));
>
> +        ds_clear(&actions);
> +        ds_put_format(&actions, REGBIT_LOOKUP_NEIGHBOR_RESULT
> +                      " = lookup_nd(inport, ip6.src, nd.sll); %snext;",
> +                      learn_from_arp_request ? "" :
> +                      REGBIT_LOOKUP_NEIGHBOR_IP_RESULT
> +                      " = lookup_nd_ip(inport, ip6.src); ");
>          ovn_lflow_add(lflows, od, S_ROUTER_IN_LOOKUP_NEIGHBOR, 100, "nd_ns",
> -                      REGBIT_LOOKUP_NEIGHBOR_RESULT" = "
> -                      "lookup_nd(inport, ip6.src, nd.sll); next;");
> +                      ds_cstr(&actions));
>
>          /* For other packet types, we can skip neighbor learning.
>           * So set REGBIT_LOOKUP_NEIGHBOR_RESULT to 1. */
> @@ -8480,8 +8507,12 @@ build_lrouter_flows(struct hmap *datapaths, struct hmap *ports,
>
>          /* Flows for LEARN_NEIGHBOR. */
>          /* Skip Neighbor learning if not required. */
> +        ds_clear(&match);
> +        ds_put_format(&match, REGBIT_LOOKUP_NEIGHBOR_RESULT" == 1%s",
> +                      learn_from_arp_request ? "" :
> +                      " || "REGBIT_LOOKUP_NEIGHBOR_IP_RESULT" == 0");
>          ovn_lflow_add(lflows, od, S_ROUTER_IN_LEARN_NEIGHBOR, 100,
> -                      REGBIT_LOOKUP_NEIGHBOR_RESULT" == 1", "next;");
> +                      ds_cstr(&match), "next;");
>
>          ovn_lflow_add(lflows, od, S_ROUTER_IN_LEARN_NEIGHBOR, 90,
>                        "arp", "put_arp(inport, arp.spa, arp.sha); next;");
> @@ -8498,8 +8529,37 @@ build_lrouter_flows(struct hmap *datapaths, struct hmap *ports,
>              continue;
>          }
>
> +        bool learn_from_arp_request = smap_get_bool(&op->od->nbr->options,
> +            "always_learn_from_arp_request", true);
> +
>          /* Check if we need to learn mac-binding from ARP requests. */
>          for (int i = 0; i < op->lrp_networks.n_ipv4_addrs; i++) {
> +            if (!learn_from_arp_request) {
> +                /* ARP request to this address should always get learned,
> +                 * so add a priority-110 flow to set
> +                 * REGBIT_LOOKUP_NEIGHBOR_IP_RESULT to 1. */
> +                ds_clear(&match);
> +                ds_put_format(&match,
> +                              "inport == %s && arp.spa == %s/%u && "
> +                              "arp.tpa == %s && arp.op == 1",
> +                              op->json_key,
> +                              op->lrp_networks.ipv4_addrs[i].network_s,
> +                              op->lrp_networks.ipv4_addrs[i].plen,
> +                              op->lrp_networks.ipv4_addrs[i].addr_s);
> +                if (op->od->l3dgw_port && op == op->od->l3dgw_port
> +                    && op->od->l3redirect_port) {
> +                    ds_put_format(&match, " && is_chassis_resident(%s)",
> +                                  op->od->l3redirect_port->json_key);
> +                }
> +                const char *actions_s = REGBIT_LOOKUP_NEIGHBOR_RESULT
> +                                  " = lookup_arp(inport, arp.spa, arp.sha); "
> +                                  REGBIT_LOOKUP_NEIGHBOR_IP_RESULT" = 1;"
> +                                  " next;";
> +                ovn_lflow_add_with_hint(lflows, op->od,
> +                                        S_ROUTER_IN_LOOKUP_NEIGHBOR, 110,
> +                                        ds_cstr(&match), actions_s,
> +                                        &op->nbrp->header_);
> +            }
>              ds_clear(&match);
>              ds_put_format(&match,
>                            "inport == %s && arp.spa == %s/%u && arp.op == 1",
> @@ -8511,12 +8571,16 @@ build_lrouter_flows(struct hmap *datapaths, struct hmap *ports,
>                  ds_put_format(&match, " && is_chassis_resident(%s)",
>                                op->od->l3redirect_port->json_key);
>              }
> +            ds_clear(&actions);
> +            ds_put_format(&actions, REGBIT_LOOKUP_NEIGHBOR_RESULT
> +                          " = lookup_arp(inport, arp.spa, arp.sha); %snext;",
> +                          learn_from_arp_request ? "" :
> +                          REGBIT_LOOKUP_NEIGHBOR_IP_RESULT
> +                          " = lookup_arp_ip(inport, arp.spa); ");
>              ovn_lflow_add_with_hint(lflows, op->od,
>                                      S_ROUTER_IN_LOOKUP_NEIGHBOR, 100,
> -                                    ds_cstr(&match),
> -                                    REGBIT_LOOKUP_NEIGHBOR_RESULT" = "
> -                                    "lookup_arp(inport, arp.spa, arp.sha); "
> -                                    "next;", &op->nbrp->header_);
> +                                    ds_cstr(&match), ds_cstr(&actions),
> +                                    &op->nbrp->header_);
>          }
>      }
>
> diff --git a/ovn-nb.xml b/ovn-nb.xml
> index 4c59338..d0317db 100644
> --- a/ovn-nb.xml
> +++ b/ovn-nb.xml
> @@ -1859,6 +1859,33 @@
>            send traffic between each other.
>          </p>
>        </column>
> +      <column name="options" key="always_learn_from_arp_request" type='{"type": "boolean"}'>
> +        <p>
> +          This option controls the behavior when handling IPv4 ARP requests or
> +          IPv6 ND-NS packets - whether a dynamic neighbor (MAC binding) entry
> +          is added/updated.
> +        </p>
> +
> +        <p>
> +          <code>true</code> - Always learn the MAC-IP binding, and add/update
> +          the MAC binding entry.
> +        </p>
> +
> +        <p>
> +          <code>false</code> - If there is a MAC binding for that IP and the
> +          MAC is different, or, if TPA of ARP request belongs to any router
> +          port on this router, then update/add that MAC-IP binding. Otherwise,
> +          don't update/add entries.
> +        </p>
> +
> +        <p>
> +          It is <code>true</code> by default.  It is recommended to set to
> +          <code>false</code> when a large number of logical routers are
> +          connected to the same logical switch but most of them never need to
> +          send traffic between each other, to reduce the size of the MAC
> +          binding table.
> +        </p>
> +      </column>
>      </group>
>
>      <group title="Common Columns">
> diff --git a/tests/ovn.at b/tests/ovn.at
> index 33fa6f2..ef5ef12 100644
> --- a/tests/ovn.at
> +++ b/tests/ovn.at
> @@ -4042,6 +4042,18 @@ ip_to_hex() {
>  sha=f00000000011
>  spa=`ip_to_hex 192 168 1 100`
>  tpa=$spa
> +
> +# When always_learn_from_arp_request=false, the new mac-binding will not be learned
> +# through GARP request.
> +ovn-nbctl --wait=hv set logical_router lr0 options:always_learn_from_arp_request=false
> +
> +test_arp 11 $sha $spa $tpa
> +sleep 1

Is it possible to use ovn-nbctl --wait=hv sync instead of sleep 1 ?

> +AT_CHECK([ovn-sbctl find mac_binding ip="192.168.1.100"], [0], [])
> +
> +# When always_learn_from_arp_request=true, the new mac-binding will be learned.
> +ovn-nbctl --wait=hv set logical_router lr0 options:always_learn_from_arp_request=true
> +
>  test_arp 11 $sha $spa $tpa
>  OVS_WAIT_UNTIL([test `ovn-sbctl find mac_binding ip="192.168.1.100" | wc -l` -gt 0])
>  ovn-nbctl --wait=hv sync
> @@ -4056,6 +4068,12 @@ test_ip 21 $smac $dmac $sip $dip 11
>
>  # lp12 send GARP request to announce ownership of 192.168.1.100.
>
> +# Even when always_learn_from_arp_request=false, the existing mac-binding should be
> +# updated through GARP request.
> +ovn-nbctl --wait=hv set logical_router lr0 options:always_learn_from_arp_request=false
> +ovn-sbctl lflow-list
> +as hv2 ovs-ofctl dump-flows br-int

Are these above 2 leftover after debugging ? I think you can delete them.


> +
>  sha=f00000000012
>  test_arp 12 $sha $spa $tpa
>  OVS_WAIT_UNTIL([ovn-sbctl find mac_binding ip="192.168.1.100" | grep f0:00:00:00:00:12])
> --
> 2.1.0
>
> _______________________________________________
> dev mailing list
> dev@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>
Han Zhou Aug. 6, 2020, 6:15 p.m. UTC | #3
On Thu, Aug 6, 2020 at 11:07 AM Numan Siddique <numans@ovn.org> wrote:
>
> On Wed, Aug 5, 2020 at 12:36 PM Han Zhou <hzhou@ovn.org> wrote:
> >
> > Support a new logical router option "always_learn_from_arp_request"
that controls
> > behavior when handling ARP requests or IPv4 ND-NS packets.
> >
> > "true" - Always learn the MAC/IP binding and add a new MAC_Binding entry
> > (default behavior)
> >
> > "false" - If there is a MAC_binding for that IP and the MAC is
different, or,
> > if TPA of ARP request belongs to any router port on this router, then
> > update/add that MAC/IP binding. Otherwise, don't update/add entries.
> >
> > Reported-by: Girish Moodalbail <gmoodalbail@nvidia.com>
> > Reported-at:
https://mail.openvswitch.org/pipermail/ovs-discuss/2020-May/049995.html
> > Signed-off-by: Han Zhou <hzhou@ovn.org>
>
> Hi Han,
>
> I have few small nits. Please see below. With those addressed
>
> Acked-by: Numan Siddique <numans@ovn.org> for all the 3 patches in the
series.
>
>
> > ---
> >  northd/ovn-northd.8.xml | 109
++++++++++++++++++++++++++++++++++++++++--------
> >  northd/ovn-northd.c     |  96
+++++++++++++++++++++++++++++++++++-------
> >  ovn-nb.xml              |  27 ++++++++++++
> >  tests/ovn.at            |  18 ++++++++
> >  4 files changed, 217 insertions(+), 33 deletions(-)
> >
> > diff --git a/northd/ovn-northd.8.xml b/northd/ovn-northd.8.xml
> > index 508123e..b08293f 100644
> > --- a/northd/ovn-northd.8.xml
> > +++ b/northd/ovn-northd.8.xml
> > @@ -1537,58 +1537,126 @@ output;
> >          <p>
> >            For each router port <var>P</var> that owns IP address
<var>A</var>,
> >            which belongs to subnet <var>S</var> with prefix length
<var>L</var>,
> > -          a priority-100 flow is added which matches
> > -          <code>inport == <var>P</var> &amp;&amp;
> > -          arp.spa == <var>S</var>/<var>L</var> &amp;&amp; arp.op ==
1</code>
> > -          (ARP request) with the
> > -          following actions:
> > +          if the option <code>always_learn_from_arp_request</code> is
> > +          <code>true</code> for this router, a priority-100 flow is
added which
> > +          matches <code>inport == <var>P</var> &amp;&amp; arp.spa ==
> > +          <var>S</var>/<var>L</var> &amp;&amp; arp.op == 1</code> (ARP
request)
> > +          with the following actions:
> > +        </p>
> > +
> > +        <pre>
> > +reg9[2] = lookup_arp(inport, arp.spa, arp.sha);
> > +next;
> > +        </pre>
> > +
> > +        <p>
> > +          If the option <code>always_learn_from_arp_request</code> is
> > +          <code>false</code>, the following two flows are added.
> > +        </p>
> > +
> > +        <p>
> > +          A priority-110 flow is added which matches <code>inport ==
> > +          <var>P</var> &amp;&amp; arp.spa == <var>S</var>/<var>L</var>
> > +          &amp;&amp; arp.tpa == <var>A</var> &amp;&amp; arp.op ==
1</code>
> > +          (ARP request) with the following actions:
> >          </p>
> >
> >          <pre>
> >  reg9[2] = lookup_arp(inport, arp.spa, arp.sha);
> > +reg9[3] = 1;
> > +next;
> > +        </pre>
> > +
> > +        <p>
> > +          A priority-100 flow is added which matches <code>inport ==
> > +          <var>P</var> &amp;&amp; arp.spa == <var>S</var>/<var>L</var>
> > +          &amp;&amp; arp.op == 1</code> (ARP request) with the
following
> > +          actions:
> > +        </p>
> > +
> > +        <pre>
> > +reg9[2] = lookup_arp(inport, arp.spa, arp.sha);
> > +reg9[3] = lookup_arp_ip(inport, arp.spa);
> >  next;
> >          </pre>
> >
> >          <p>
> >            If the logical router port <var>P</var> is a distributed
gateway
> >            router port, additional match
> > -          <code>is_chassis_resident(cr-<var>P</var>)</code> is added
so that
> > -          the resident gateway chassis handles the neighbor lookup.
> > +          <code>is_chassis_resident(cr-<var>P</var>)</code> is added
for all
> > +          these flows.
> >          </p>
> >        </li>
> >
> >        <li>
> >          <p>
> >            A priority-100 flow which matches on ARP reply packets and
applies
> > -          the actions:
> > +          the actions if the option
<code>always_learn_from_arp_request</code>
> > +          is <code>true</code>:
> >          </p>
> >
> >          <pre>
> >  reg9[2] = lookup_arp(inport, arp.spa, arp.sha);
> >  next;
> >          </pre>
> > +
> > +        <p>
> > +          If the option <code>always_learn_from_arp_request</code>
> > +          is <code>false</code>, the above actions will be:
> > +        </p>
> > +
> > +        <pre>
> > +reg9[2] = lookup_arp(inport, arp.spa, arp.sha);
> > +reg9[3] = 1;
> > +next;
> > +        </pre>
> > +
> >        </li>
> >
> >        <li>
> >          <p>
> >            A priority-100 flow which matches on IPv6 Neighbor Discovery
> > -          advertisement packet and applies the actions:
> > +          advertisement packet and applies the actions if the option
> > +          <code>always_learn_from_arp_request</code> is
<code>true</code>:
> >          </p>
> >
> >          <pre>
> >  reg9[2] = lookup_nd(inport, nd.target, nd.tll);
> >  next;
> >          </pre>
> > +
> > +        <p>
> > +          If the option <code>always_learn_from_arp_request</code>
> > +          is <code>false</code>, the above actions will be:
> > +        </p>
> > +
> > +        <pre>
> > +reg9[2] = lookup_nd(inport, nd.target, nd.tll);
> > +reg9[3] = 1;
> > +next;
> > +        </pre>
> >        </li>
> >
> >        <li>
> >          <p>
> >            A priority-100 flow which matches on IPv6 Neighbor Discovery
> > -          solicitation packet and applies the actions:
> > +          solicitation packet and applies the actions if the option
> > +          <code>always_learn_from_arp_request</code> is
<code>true</code>:
> > +        </p>
> > +
> > +        <pre>
> > +reg9[2] = lookup_nd(inport, ip6.src, nd.sll);
> > +next;
> > +        </pre>
> > +
> > +        <p>
> > +          If the option <code>always_learn_from_arp_request</code>
> > +          is <code>false</code>, the above actions will be:
> >          </p>
> >
> >          <pre>
> >  reg9[2] = lookup_nd(inport, ip6.src, nd.sll);
> > +reg9[3] = lookup_nd_ip(inport, ip6.src);
> >  next;
> >          </pre>
> >        </li>
> > @@ -1604,21 +1672,28 @@ next;
> >
> >      <p>
> >        This table adds flows to learn the mac bindings from the ARP and
> > -      IPv6 Neighbor Solicitation/Advertisement packets if ARP/ND lookup
> > -      failed in the previous table.
> > +      IPv6 Neighbor Solicitation/Advertisement packets if it is needed
> > +      according to the lookup results from the previous stage.
> >      </p>
> >
> >      <p>
> >        reg9[2] will be <code>1</code> if the
<code>lookup_arp/lookup_nd</code>
> > -      in the previous table was successful, or if there was no need to
do the
> > -      lookup.
> > +      in the previous table was successful or skipped, meaning no need
> > +      to learn mac binding from the packet.
> > +    </p>
> > +
> > +    <p>
> > +      reg9[3] will be <code>1</code> if the
> > +      <code>lookup_arp_ip/lookup_nd_ip</code> in the previous table was
> > +      successful or skipped, meaning it is ok to learn mac binding from
> > +      the packet (if reg9[2] is 0).
> >      </p>
> >
> >      <ul>
> >        <li>
> > -        A priority-100 flow with the match
> > -        <code>reg9[2] == 1</code> and advances the packet
> > -        to the next table as there is no need to learn the neighbor.
> > +        A priority-100 flow with the match <code>reg9[2] == 1 ||
reg9[3] ==
> > +        0</code> and advances the packet to the next table as there is
no need
> > +        to learn the neighbor.
> >        </li>
> >
> >        <li>
> > diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
> > index 5dd49f9..b7102b2 100644
> > --- a/northd/ovn-northd.c
> > +++ b/northd/ovn-northd.c
> > @@ -222,6 +222,7 @@ enum ovn_stage {
> >  /* Register to store the result of check_pkt_larger action. */
> >  #define REGBIT_PKT_LARGER        "reg9[1]"
> >  #define REGBIT_LOOKUP_NEIGHBOR_RESULT "reg9[2]"
> > +#define REGBIT_LOOKUP_NEIGHBOR_IP_RESULT "reg9[3]"
> >
> >  /* Register to store the eth address associated to a router port for
packets
> >   * received in S_ROUTER_IN_ADMISSION.
> > @@ -8442,36 +8443,62 @@ build_lrouter_flows(struct hmap *datapaths,
struct hmap *ports,
> >           * For ARP packets, table LOOKUP_NEIGHBOR does a lookup for the
> >           * (arp.spa, arp.sha) in the mac binding table using the
'lookup_arp'
> >           * action and stores the result in
REGBIT_LOOKUP_NEIGHBOR_RESULT bit.
> > +         * If "always_learn_from_arp_request" is set to false, it will
also
> > +         * lookup for the (arp.spa) in the mac binding table using the
> > +         * "lookup_arp_ip" action for ARP request packets, and stores
the
> > +         * result in REGBIT_LOOKUP_NEIGHBOR_IP_RESULT bit; or set that
bit
> > +         * to "1" directly for ARP response packets.
> >           *
> >           * For IPv6 ND NA packets, table LOOKUP_NEIGHBOR does a lookup
> >           * for the (nd.target, nd.tll) in the mac binding table using
the
> >           * 'lookup_nd' action and stores the result in
> > -         * REGBIT_LOOKUP_NEIGHBOR_RESULT bit.
> > +         * REGBIT_LOOKUP_NEIGHBOR_RESULT bit. If
> > +         * "always_learn_from_arp_request" is set to false,
> > +         * REGBIT_LOOKUP_NEIGHBOR_IP_RESULT bit is set.
> >           *
> >           * For IPv6 ND NS packets, table LOOKUP_NEIGHBOR does a lookup
> >           * for the (ip6.src, nd.sll) in the mac binding table using the
> >           * 'lookup_nd' action and stores the result in
> > -         * REGBIT_LOOKUP_NEIGHBOR_RESULT bit.
> > +         * REGBIT_LOOKUP_NEIGHBOR_RESULT bit. If
> > +         * "always_learn_from_arp_request" is set to false, it will
also lookup
> > +         * for the (ip6.src) in the mac binding table using the
"lookup_nd_ip"
> > +         * action and stores the result in
REGBIT_LOOKUP_NEIGHBOR_IP_RESULT
> > +         * bit.
> >           *
> >           * Table LEARN_NEIGHBOR learns the mac-binding using the action
> > -         * - 'put_arp/put_nd' only if REGBIT_LOOKUP_NEIGHBOR_RESULT bit
> > -         * is not set.
> > +         * - 'put_arp/put_nd'. Learning mac-binding is skipped if
> > +         *   REGBIT_LOOKUP_NEIGHBOR_RESULT bit is set or
> > +         *   REGBIT_LOOKUP_NEIGHBOR_IP_RESULT is not set.
> >           *
> >           * */
> >
> >          /* Flows for LOOKUP_NEIGHBOR. */
> > +        bool learn_from_arp_request = smap_get_bool(&od->nbr->options,
> > +            "always_learn_from_arp_request", true);
> > +        ds_clear(&actions);
> > +        ds_put_format(&actions, REGBIT_LOOKUP_NEIGHBOR_RESULT
> > +                      " = lookup_arp(inport, arp.spa, arp.sha);
%snext;",
> > +                      learn_from_arp_request ? "" :
> > +                      REGBIT_LOOKUP_NEIGHBOR_IP_RESULT" = 1; ");
> >          ovn_lflow_add(lflows, od, S_ROUTER_IN_LOOKUP_NEIGHBOR, 100,
> > -                      "arp.op == 2",
> > -                      REGBIT_LOOKUP_NEIGHBOR_RESULT" = "
> > -                      "lookup_arp(inport, arp.spa, arp.sha); next;");
> > +                      "arp.op == 2", ds_cstr(&actions));
> >
> > +        ds_clear(&actions);
> > +        ds_put_format(&actions, REGBIT_LOOKUP_NEIGHBOR_RESULT
> > +                      " = lookup_nd(inport, nd.target, nd.tll);
%snext;",
> > +                      learn_from_arp_request ? "" :
> > +                      REGBIT_LOOKUP_NEIGHBOR_IP_RESULT" = 1; ");
> >          ovn_lflow_add(lflows, od, S_ROUTER_IN_LOOKUP_NEIGHBOR, 100,
"nd_na",
> > -                      REGBIT_LOOKUP_NEIGHBOR_RESULT" = "
> > -                      "lookup_nd(inport, nd.target, nd.tll); next;");
> > +                      ds_cstr(&actions));
> >
> > +        ds_clear(&actions);
> > +        ds_put_format(&actions, REGBIT_LOOKUP_NEIGHBOR_RESULT
> > +                      " = lookup_nd(inport, ip6.src, nd.sll); %snext;",
> > +                      learn_from_arp_request ? "" :
> > +                      REGBIT_LOOKUP_NEIGHBOR_IP_RESULT
> > +                      " = lookup_nd_ip(inport, ip6.src); ");
> >          ovn_lflow_add(lflows, od, S_ROUTER_IN_LOOKUP_NEIGHBOR, 100,
"nd_ns",
> > -                      REGBIT_LOOKUP_NEIGHBOR_RESULT" = "
> > -                      "lookup_nd(inport, ip6.src, nd.sll); next;");
> > +                      ds_cstr(&actions));
> >
> >          /* For other packet types, we can skip neighbor learning.
> >           * So set REGBIT_LOOKUP_NEIGHBOR_RESULT to 1. */
> > @@ -8480,8 +8507,12 @@ build_lrouter_flows(struct hmap *datapaths,
struct hmap *ports,
> >
> >          /* Flows for LEARN_NEIGHBOR. */
> >          /* Skip Neighbor learning if not required. */
> > +        ds_clear(&match);
> > +        ds_put_format(&match, REGBIT_LOOKUP_NEIGHBOR_RESULT" == 1%s",
> > +                      learn_from_arp_request ? "" :
> > +                      " || "REGBIT_LOOKUP_NEIGHBOR_IP_RESULT" == 0");
> >          ovn_lflow_add(lflows, od, S_ROUTER_IN_LEARN_NEIGHBOR, 100,
> > -                      REGBIT_LOOKUP_NEIGHBOR_RESULT" == 1", "next;");
> > +                      ds_cstr(&match), "next;");
> >
> >          ovn_lflow_add(lflows, od, S_ROUTER_IN_LEARN_NEIGHBOR, 90,
> >                        "arp", "put_arp(inport, arp.spa, arp.sha);
next;");
> > @@ -8498,8 +8529,37 @@ build_lrouter_flows(struct hmap *datapaths,
struct hmap *ports,
> >              continue;
> >          }
> >
> > +        bool learn_from_arp_request =
smap_get_bool(&op->od->nbr->options,
> > +            "always_learn_from_arp_request", true);
> > +
> >          /* Check if we need to learn mac-binding from ARP requests. */
> >          for (int i = 0; i < op->lrp_networks.n_ipv4_addrs; i++) {
> > +            if (!learn_from_arp_request) {
> > +                /* ARP request to this address should always get
learned,
> > +                 * so add a priority-110 flow to set
> > +                 * REGBIT_LOOKUP_NEIGHBOR_IP_RESULT to 1. */
> > +                ds_clear(&match);
> > +                ds_put_format(&match,
> > +                              "inport == %s && arp.spa == %s/%u && "
> > +                              "arp.tpa == %s && arp.op == 1",
> > +                              op->json_key,
> > +                              op->lrp_networks.ipv4_addrs[i].network_s,
> > +                              op->lrp_networks.ipv4_addrs[i].plen,
> > +                              op->lrp_networks.ipv4_addrs[i].addr_s);
> > +                if (op->od->l3dgw_port && op == op->od->l3dgw_port
> > +                    && op->od->l3redirect_port) {
> > +                    ds_put_format(&match, " &&
is_chassis_resident(%s)",
> > +                                  op->od->l3redirect_port->json_key);
> > +                }
> > +                const char *actions_s = REGBIT_LOOKUP_NEIGHBOR_RESULT
> > +                                  " = lookup_arp(inport, arp.spa,
arp.sha); "
> > +                                  REGBIT_LOOKUP_NEIGHBOR_IP_RESULT" =
1;"
> > +                                  " next;";
> > +                ovn_lflow_add_with_hint(lflows, op->od,
> > +                                        S_ROUTER_IN_LOOKUP_NEIGHBOR,
110,
> > +                                        ds_cstr(&match), actions_s,
> > +                                        &op->nbrp->header_);
> > +            }
> >              ds_clear(&match);
> >              ds_put_format(&match,
> >                            "inport == %s && arp.spa == %s/%u && arp.op
== 1",
> > @@ -8511,12 +8571,16 @@ build_lrouter_flows(struct hmap *datapaths,
struct hmap *ports,
> >                  ds_put_format(&match, " && is_chassis_resident(%s)",
> >                                op->od->l3redirect_port->json_key);
> >              }
> > +            ds_clear(&actions);
> > +            ds_put_format(&actions, REGBIT_LOOKUP_NEIGHBOR_RESULT
> > +                          " = lookup_arp(inport, arp.spa, arp.sha);
%snext;",
> > +                          learn_from_arp_request ? "" :
> > +                          REGBIT_LOOKUP_NEIGHBOR_IP_RESULT
> > +                          " = lookup_arp_ip(inport, arp.spa); ");
> >              ovn_lflow_add_with_hint(lflows, op->od,
> >                                      S_ROUTER_IN_LOOKUP_NEIGHBOR, 100,
> > -                                    ds_cstr(&match),
> > -                                    REGBIT_LOOKUP_NEIGHBOR_RESULT" = "
> > -                                    "lookup_arp(inport, arp.spa,
arp.sha); "
> > -                                    "next;", &op->nbrp->header_);
> > +                                    ds_cstr(&match), ds_cstr(&actions),
> > +                                    &op->nbrp->header_);
> >          }
> >      }
> >
> > diff --git a/ovn-nb.xml b/ovn-nb.xml
> > index 4c59338..d0317db 100644
> > --- a/ovn-nb.xml
> > +++ b/ovn-nb.xml
> > @@ -1859,6 +1859,33 @@
> >            send traffic between each other.
> >          </p>
> >        </column>
> > +      <column name="options" key="always_learn_from_arp_request"
type='{"type": "boolean"}'>
> > +        <p>
> > +          This option controls the behavior when handling IPv4 ARP
requests or
> > +          IPv6 ND-NS packets - whether a dynamic neighbor (MAC
binding) entry
> > +          is added/updated.
> > +        </p>
> > +
> > +        <p>
> > +          <code>true</code> - Always learn the MAC-IP binding, and
add/update
> > +          the MAC binding entry.
> > +        </p>
> > +
> > +        <p>
> > +          <code>false</code> - If there is a MAC binding for that IP
and the
> > +          MAC is different, or, if TPA of ARP request belongs to any
router
> > +          port on this router, then update/add that MAC-IP binding.
Otherwise,
> > +          don't update/add entries.
> > +        </p>
> > +
> > +        <p>
> > +          It is <code>true</code> by default.  It is recommended to
set to
> > +          <code>false</code> when a large number of logical routers are
> > +          connected to the same logical switch but most of them never
need to
> > +          send traffic between each other, to reduce the size of the
MAC
> > +          binding table.
> > +        </p>
> > +      </column>
> >      </group>
> >
> >      <group title="Common Columns">
> > diff --git a/tests/ovn.at b/tests/ovn.at
> > index 33fa6f2..ef5ef12 100644
> > --- a/tests/ovn.at
> > +++ b/tests/ovn.at
> > @@ -4042,6 +4042,18 @@ ip_to_hex() {
> >  sha=f00000000011
> >  spa=`ip_to_hex 192 168 1 100`
> >  tpa=$spa
> > +
> > +# When always_learn_from_arp_request=false, the new mac-binding will
not be learned
> > +# through GARP request.
> > +ovn-nbctl --wait=hv set logical_router lr0
options:always_learn_from_arp_request=false
> > +
> > +test_arp 11 $sha $spa $tpa
> > +sleep 1
>
> Is it possible to use ovn-nbctl --wait=hv sync instead of sleep 1 ?

Hi Numan, I thought about this, but since mac_binding learning is triggered
by the data plane ARP packets instead of only DB changes, "ovn-nbctl
--wait=hv sync" may still not help in race conditions. So sleep is the best
way I can think of for now. (I replied to this same comment from your
earlier email, but maybe you didn't notice it)

>
>
> > +AT_CHECK([ovn-sbctl find mac_binding ip="192.168.1.100"], [0], [])
> > +
> > +# When always_learn_from_arp_request=true, the new mac-binding will be
learned.
> > +ovn-nbctl --wait=hv set logical_router lr0
options:always_learn_from_arp_request=true
> > +
> >  test_arp 11 $sha $spa $tpa
> >  OVS_WAIT_UNTIL([test `ovn-sbctl find mac_binding ip="192.168.1.100" |
wc -l` -gt 0])
> >  ovn-nbctl --wait=hv sync
> > @@ -4056,6 +4068,12 @@ test_ip 21 $smac $dmac $sip $dip 11
> >
> >  # lp12 send GARP request to announce ownership of 192.168.1.100.
> >
> > +# Even when always_learn_from_arp_request=false, the existing
mac-binding should be
> > +# updated through GARP request.
> > +ovn-nbctl --wait=hv set logical_router lr0
options:always_learn_from_arp_request=false
> > +ovn-sbctl lflow-list
> > +as hv2 ovs-ofctl dump-flows br-int
>
> Are these above 2 leftover after debugging ? I think you can delete them.
>
Ack.

Thanks,
Han

>
> > +
> >  sha=f00000000012
> >  test_arp 12 $sha $spa $tpa
> >  OVS_WAIT_UNTIL([ovn-sbctl find mac_binding ip="192.168.1.100" | grep
f0:00:00:00:00:12])
> > --
> > 2.1.0
> >
> > _______________________________________________
> > dev mailing list
> > dev@openvswitch.org
> > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
> >
Numan Siddique Aug. 6, 2020, 6:22 p.m. UTC | #4
On Thu, Aug 6, 2020 at 11:45 PM Han Zhou <hzhou@ovn.org> wrote:
>
> On Thu, Aug 6, 2020 at 11:07 AM Numan Siddique <numans@ovn.org> wrote:
> >
> > On Wed, Aug 5, 2020 at 12:36 PM Han Zhou <hzhou@ovn.org> wrote:
> > >
> > > Support a new logical router option "always_learn_from_arp_request"
> that controls
> > > behavior when handling ARP requests or IPv4 ND-NS packets.
> > >
> > > "true" - Always learn the MAC/IP binding and add a new MAC_Binding entry
> > > (default behavior)
> > >
> > > "false" - If there is a MAC_binding for that IP and the MAC is
> different, or,
> > > if TPA of ARP request belongs to any router port on this router, then
> > > update/add that MAC/IP binding. Otherwise, don't update/add entries.
> > >
> > > Reported-by: Girish Moodalbail <gmoodalbail@nvidia.com>
> > > Reported-at:
> https://mail.openvswitch.org/pipermail/ovs-discuss/2020-May/049995.html
> > > Signed-off-by: Han Zhou <hzhou@ovn.org>
> >
> > Hi Han,
> >
> > I have few small nits. Please see below. With those addressed
> >
> > Acked-by: Numan Siddique <numans@ovn.org> for all the 3 patches in the
> series.
> >
> >
> > > ---
> > >  northd/ovn-northd.8.xml | 109
> ++++++++++++++++++++++++++++++++++++++++--------
> > >  northd/ovn-northd.c     |  96
> +++++++++++++++++++++++++++++++++++-------
> > >  ovn-nb.xml              |  27 ++++++++++++
> > >  tests/ovn.at            |  18 ++++++++
> > >  4 files changed, 217 insertions(+), 33 deletions(-)
> > >
> > > diff --git a/northd/ovn-northd.8.xml b/northd/ovn-northd.8.xml
> > > index 508123e..b08293f 100644
> > > --- a/northd/ovn-northd.8.xml
> > > +++ b/northd/ovn-northd.8.xml
> > > @@ -1537,58 +1537,126 @@ output;
> > >          <p>
> > >            For each router port <var>P</var> that owns IP address
> <var>A</var>,
> > >            which belongs to subnet <var>S</var> with prefix length
> <var>L</var>,
> > > -          a priority-100 flow is added which matches
> > > -          <code>inport == <var>P</var> &amp;&amp;
> > > -          arp.spa == <var>S</var>/<var>L</var> &amp;&amp; arp.op ==
> 1</code>
> > > -          (ARP request) with the
> > > -          following actions:
> > > +          if the option <code>always_learn_from_arp_request</code> is
> > > +          <code>true</code> for this router, a priority-100 flow is
> added which
> > > +          matches <code>inport == <var>P</var> &amp;&amp; arp.spa ==
> > > +          <var>S</var>/<var>L</var> &amp;&amp; arp.op == 1</code> (ARP
> request)
> > > +          with the following actions:
> > > +        </p>
> > > +
> > > +        <pre>
> > > +reg9[2] = lookup_arp(inport, arp.spa, arp.sha);
> > > +next;
> > > +        </pre>
> > > +
> > > +        <p>
> > > +          If the option <code>always_learn_from_arp_request</code> is
> > > +          <code>false</code>, the following two flows are added.
> > > +        </p>
> > > +
> > > +        <p>
> > > +          A priority-110 flow is added which matches <code>inport ==
> > > +          <var>P</var> &amp;&amp; arp.spa == <var>S</var>/<var>L</var>
> > > +          &amp;&amp; arp.tpa == <var>A</var> &amp;&amp; arp.op ==
> 1</code>
> > > +          (ARP request) with the following actions:
> > >          </p>
> > >
> > >          <pre>
> > >  reg9[2] = lookup_arp(inport, arp.spa, arp.sha);
> > > +reg9[3] = 1;
> > > +next;
> > > +        </pre>
> > > +
> > > +        <p>
> > > +          A priority-100 flow is added which matches <code>inport ==
> > > +          <var>P</var> &amp;&amp; arp.spa == <var>S</var>/<var>L</var>
> > > +          &amp;&amp; arp.op == 1</code> (ARP request) with the
> following
> > > +          actions:
> > > +        </p>
> > > +
> > > +        <pre>
> > > +reg9[2] = lookup_arp(inport, arp.spa, arp.sha);
> > > +reg9[3] = lookup_arp_ip(inport, arp.spa);
> > >  next;
> > >          </pre>
> > >
> > >          <p>
> > >            If the logical router port <var>P</var> is a distributed
> gateway
> > >            router port, additional match
> > > -          <code>is_chassis_resident(cr-<var>P</var>)</code> is added
> so that
> > > -          the resident gateway chassis handles the neighbor lookup.
> > > +          <code>is_chassis_resident(cr-<var>P</var>)</code> is added
> for all
> > > +          these flows.
> > >          </p>
> > >        </li>
> > >
> > >        <li>
> > >          <p>
> > >            A priority-100 flow which matches on ARP reply packets and
> applies
> > > -          the actions:
> > > +          the actions if the option
> <code>always_learn_from_arp_request</code>
> > > +          is <code>true</code>:
> > >          </p>
> > >
> > >          <pre>
> > >  reg9[2] = lookup_arp(inport, arp.spa, arp.sha);
> > >  next;
> > >          </pre>
> > > +
> > > +        <p>
> > > +          If the option <code>always_learn_from_arp_request</code>
> > > +          is <code>false</code>, the above actions will be:
> > > +        </p>
> > > +
> > > +        <pre>
> > > +reg9[2] = lookup_arp(inport, arp.spa, arp.sha);
> > > +reg9[3] = 1;
> > > +next;
> > > +        </pre>
> > > +
> > >        </li>
> > >
> > >        <li>
> > >          <p>
> > >            A priority-100 flow which matches on IPv6 Neighbor Discovery
> > > -          advertisement packet and applies the actions:
> > > +          advertisement packet and applies the actions if the option
> > > +          <code>always_learn_from_arp_request</code> is
> <code>true</code>:
> > >          </p>
> > >
> > >          <pre>
> > >  reg9[2] = lookup_nd(inport, nd.target, nd.tll);
> > >  next;
> > >          </pre>
> > > +
> > > +        <p>
> > > +          If the option <code>always_learn_from_arp_request</code>
> > > +          is <code>false</code>, the above actions will be:
> > > +        </p>
> > > +
> > > +        <pre>
> > > +reg9[2] = lookup_nd(inport, nd.target, nd.tll);
> > > +reg9[3] = 1;
> > > +next;
> > > +        </pre>
> > >        </li>
> > >
> > >        <li>
> > >          <p>
> > >            A priority-100 flow which matches on IPv6 Neighbor Discovery
> > > -          solicitation packet and applies the actions:
> > > +          solicitation packet and applies the actions if the option
> > > +          <code>always_learn_from_arp_request</code> is
> <code>true</code>:
> > > +        </p>
> > > +
> > > +        <pre>
> > > +reg9[2] = lookup_nd(inport, ip6.src, nd.sll);
> > > +next;
> > > +        </pre>
> > > +
> > > +        <p>
> > > +          If the option <code>always_learn_from_arp_request</code>
> > > +          is <code>false</code>, the above actions will be:
> > >          </p>
> > >
> > >          <pre>
> > >  reg9[2] = lookup_nd(inport, ip6.src, nd.sll);
> > > +reg9[3] = lookup_nd_ip(inport, ip6.src);
> > >  next;
> > >          </pre>
> > >        </li>
> > > @@ -1604,21 +1672,28 @@ next;
> > >
> > >      <p>
> > >        This table adds flows to learn the mac bindings from the ARP and
> > > -      IPv6 Neighbor Solicitation/Advertisement packets if ARP/ND lookup
> > > -      failed in the previous table.
> > > +      IPv6 Neighbor Solicitation/Advertisement packets if it is needed
> > > +      according to the lookup results from the previous stage.
> > >      </p>
> > >
> > >      <p>
> > >        reg9[2] will be <code>1</code> if the
> <code>lookup_arp/lookup_nd</code>
> > > -      in the previous table was successful, or if there was no need to
> do the
> > > -      lookup.
> > > +      in the previous table was successful or skipped, meaning no need
> > > +      to learn mac binding from the packet.
> > > +    </p>
> > > +
> > > +    <p>
> > > +      reg9[3] will be <code>1</code> if the
> > > +      <code>lookup_arp_ip/lookup_nd_ip</code> in the previous table was
> > > +      successful or skipped, meaning it is ok to learn mac binding from
> > > +      the packet (if reg9[2] is 0).
> > >      </p>
> > >
> > >      <ul>
> > >        <li>
> > > -        A priority-100 flow with the match
> > > -        <code>reg9[2] == 1</code> and advances the packet
> > > -        to the next table as there is no need to learn the neighbor.
> > > +        A priority-100 flow with the match <code>reg9[2] == 1 ||
> reg9[3] ==
> > > +        0</code> and advances the packet to the next table as there is
> no need
> > > +        to learn the neighbor.
> > >        </li>
> > >
> > >        <li>
> > > diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
> > > index 5dd49f9..b7102b2 100644
> > > --- a/northd/ovn-northd.c
> > > +++ b/northd/ovn-northd.c
> > > @@ -222,6 +222,7 @@ enum ovn_stage {
> > >  /* Register to store the result of check_pkt_larger action. */
> > >  #define REGBIT_PKT_LARGER        "reg9[1]"
> > >  #define REGBIT_LOOKUP_NEIGHBOR_RESULT "reg9[2]"
> > > +#define REGBIT_LOOKUP_NEIGHBOR_IP_RESULT "reg9[3]"
> > >
> > >  /* Register to store the eth address associated to a router port for
> packets
> > >   * received in S_ROUTER_IN_ADMISSION.
> > > @@ -8442,36 +8443,62 @@ build_lrouter_flows(struct hmap *datapaths,
> struct hmap *ports,
> > >           * For ARP packets, table LOOKUP_NEIGHBOR does a lookup for the
> > >           * (arp.spa, arp.sha) in the mac binding table using the
> 'lookup_arp'
> > >           * action and stores the result in
> REGBIT_LOOKUP_NEIGHBOR_RESULT bit.
> > > +         * If "always_learn_from_arp_request" is set to false, it will
> also
> > > +         * lookup for the (arp.spa) in the mac binding table using the
> > > +         * "lookup_arp_ip" action for ARP request packets, and stores
> the
> > > +         * result in REGBIT_LOOKUP_NEIGHBOR_IP_RESULT bit; or set that
> bit
> > > +         * to "1" directly for ARP response packets.
> > >           *
> > >           * For IPv6 ND NA packets, table LOOKUP_NEIGHBOR does a lookup
> > >           * for the (nd.target, nd.tll) in the mac binding table using
> the
> > >           * 'lookup_nd' action and stores the result in
> > > -         * REGBIT_LOOKUP_NEIGHBOR_RESULT bit.
> > > +         * REGBIT_LOOKUP_NEIGHBOR_RESULT bit. If
> > > +         * "always_learn_from_arp_request" is set to false,
> > > +         * REGBIT_LOOKUP_NEIGHBOR_IP_RESULT bit is set.
> > >           *
> > >           * For IPv6 ND NS packets, table LOOKUP_NEIGHBOR does a lookup
> > >           * for the (ip6.src, nd.sll) in the mac binding table using the
> > >           * 'lookup_nd' action and stores the result in
> > > -         * REGBIT_LOOKUP_NEIGHBOR_RESULT bit.
> > > +         * REGBIT_LOOKUP_NEIGHBOR_RESULT bit. If
> > > +         * "always_learn_from_arp_request" is set to false, it will
> also lookup
> > > +         * for the (ip6.src) in the mac binding table using the
> "lookup_nd_ip"
> > > +         * action and stores the result in
> REGBIT_LOOKUP_NEIGHBOR_IP_RESULT
> > > +         * bit.
> > >           *
> > >           * Table LEARN_NEIGHBOR learns the mac-binding using the action
> > > -         * - 'put_arp/put_nd' only if REGBIT_LOOKUP_NEIGHBOR_RESULT bit
> > > -         * is not set.
> > > +         * - 'put_arp/put_nd'. Learning mac-binding is skipped if
> > > +         *   REGBIT_LOOKUP_NEIGHBOR_RESULT bit is set or
> > > +         *   REGBIT_LOOKUP_NEIGHBOR_IP_RESULT is not set.
> > >           *
> > >           * */
> > >
> > >          /* Flows for LOOKUP_NEIGHBOR. */
> > > +        bool learn_from_arp_request = smap_get_bool(&od->nbr->options,
> > > +            "always_learn_from_arp_request", true);
> > > +        ds_clear(&actions);
> > > +        ds_put_format(&actions, REGBIT_LOOKUP_NEIGHBOR_RESULT
> > > +                      " = lookup_arp(inport, arp.spa, arp.sha);
> %snext;",
> > > +                      learn_from_arp_request ? "" :
> > > +                      REGBIT_LOOKUP_NEIGHBOR_IP_RESULT" = 1; ");
> > >          ovn_lflow_add(lflows, od, S_ROUTER_IN_LOOKUP_NEIGHBOR, 100,
> > > -                      "arp.op == 2",
> > > -                      REGBIT_LOOKUP_NEIGHBOR_RESULT" = "
> > > -                      "lookup_arp(inport, arp.spa, arp.sha); next;");
> > > +                      "arp.op == 2", ds_cstr(&actions));
> > >
> > > +        ds_clear(&actions);
> > > +        ds_put_format(&actions, REGBIT_LOOKUP_NEIGHBOR_RESULT
> > > +                      " = lookup_nd(inport, nd.target, nd.tll);
> %snext;",
> > > +                      learn_from_arp_request ? "" :
> > > +                      REGBIT_LOOKUP_NEIGHBOR_IP_RESULT" = 1; ");
> > >          ovn_lflow_add(lflows, od, S_ROUTER_IN_LOOKUP_NEIGHBOR, 100,
> "nd_na",
> > > -                      REGBIT_LOOKUP_NEIGHBOR_RESULT" = "
> > > -                      "lookup_nd(inport, nd.target, nd.tll); next;");
> > > +                      ds_cstr(&actions));
> > >
> > > +        ds_clear(&actions);
> > > +        ds_put_format(&actions, REGBIT_LOOKUP_NEIGHBOR_RESULT
> > > +                      " = lookup_nd(inport, ip6.src, nd.sll); %snext;",
> > > +                      learn_from_arp_request ? "" :
> > > +                      REGBIT_LOOKUP_NEIGHBOR_IP_RESULT
> > > +                      " = lookup_nd_ip(inport, ip6.src); ");
> > >          ovn_lflow_add(lflows, od, S_ROUTER_IN_LOOKUP_NEIGHBOR, 100,
> "nd_ns",
> > > -                      REGBIT_LOOKUP_NEIGHBOR_RESULT" = "
> > > -                      "lookup_nd(inport, ip6.src, nd.sll); next;");
> > > +                      ds_cstr(&actions));
> > >
> > >          /* For other packet types, we can skip neighbor learning.
> > >           * So set REGBIT_LOOKUP_NEIGHBOR_RESULT to 1. */
> > > @@ -8480,8 +8507,12 @@ build_lrouter_flows(struct hmap *datapaths,
> struct hmap *ports,
> > >
> > >          /* Flows for LEARN_NEIGHBOR. */
> > >          /* Skip Neighbor learning if not required. */
> > > +        ds_clear(&match);
> > > +        ds_put_format(&match, REGBIT_LOOKUP_NEIGHBOR_RESULT" == 1%s",
> > > +                      learn_from_arp_request ? "" :
> > > +                      " || "REGBIT_LOOKUP_NEIGHBOR_IP_RESULT" == 0");
> > >          ovn_lflow_add(lflows, od, S_ROUTER_IN_LEARN_NEIGHBOR, 100,
> > > -                      REGBIT_LOOKUP_NEIGHBOR_RESULT" == 1", "next;");
> > > +                      ds_cstr(&match), "next;");
> > >
> > >          ovn_lflow_add(lflows, od, S_ROUTER_IN_LEARN_NEIGHBOR, 90,
> > >                        "arp", "put_arp(inport, arp.spa, arp.sha);
> next;");
> > > @@ -8498,8 +8529,37 @@ build_lrouter_flows(struct hmap *datapaths,
> struct hmap *ports,
> > >              continue;
> > >          }
> > >
> > > +        bool learn_from_arp_request =
> smap_get_bool(&op->od->nbr->options,
> > > +            "always_learn_from_arp_request", true);
> > > +
> > >          /* Check if we need to learn mac-binding from ARP requests. */
> > >          for (int i = 0; i < op->lrp_networks.n_ipv4_addrs; i++) {
> > > +            if (!learn_from_arp_request) {
> > > +                /* ARP request to this address should always get
> learned,
> > > +                 * so add a priority-110 flow to set
> > > +                 * REGBIT_LOOKUP_NEIGHBOR_IP_RESULT to 1. */
> > > +                ds_clear(&match);
> > > +                ds_put_format(&match,
> > > +                              "inport == %s && arp.spa == %s/%u && "
> > > +                              "arp.tpa == %s && arp.op == 1",
> > > +                              op->json_key,
> > > +                              op->lrp_networks.ipv4_addrs[i].network_s,
> > > +                              op->lrp_networks.ipv4_addrs[i].plen,
> > > +                              op->lrp_networks.ipv4_addrs[i].addr_s);
> > > +                if (op->od->l3dgw_port && op == op->od->l3dgw_port
> > > +                    && op->od->l3redirect_port) {
> > > +                    ds_put_format(&match, " &&
> is_chassis_resident(%s)",
> > > +                                  op->od->l3redirect_port->json_key);
> > > +                }
> > > +                const char *actions_s = REGBIT_LOOKUP_NEIGHBOR_RESULT
> > > +                                  " = lookup_arp(inport, arp.spa,
> arp.sha); "
> > > +                                  REGBIT_LOOKUP_NEIGHBOR_IP_RESULT" =
> 1;"
> > > +                                  " next;";
> > > +                ovn_lflow_add_with_hint(lflows, op->od,
> > > +                                        S_ROUTER_IN_LOOKUP_NEIGHBOR,
> 110,
> > > +                                        ds_cstr(&match), actions_s,
> > > +                                        &op->nbrp->header_);
> > > +            }
> > >              ds_clear(&match);
> > >              ds_put_format(&match,
> > >                            "inport == %s && arp.spa == %s/%u && arp.op
> == 1",
> > > @@ -8511,12 +8571,16 @@ build_lrouter_flows(struct hmap *datapaths,
> struct hmap *ports,
> > >                  ds_put_format(&match, " && is_chassis_resident(%s)",
> > >                                op->od->l3redirect_port->json_key);
> > >              }
> > > +            ds_clear(&actions);
> > > +            ds_put_format(&actions, REGBIT_LOOKUP_NEIGHBOR_RESULT
> > > +                          " = lookup_arp(inport, arp.spa, arp.sha);
> %snext;",
> > > +                          learn_from_arp_request ? "" :
> > > +                          REGBIT_LOOKUP_NEIGHBOR_IP_RESULT
> > > +                          " = lookup_arp_ip(inport, arp.spa); ");
> > >              ovn_lflow_add_with_hint(lflows, op->od,
> > >                                      S_ROUTER_IN_LOOKUP_NEIGHBOR, 100,
> > > -                                    ds_cstr(&match),
> > > -                                    REGBIT_LOOKUP_NEIGHBOR_RESULT" = "
> > > -                                    "lookup_arp(inport, arp.spa,
> arp.sha); "
> > > -                                    "next;", &op->nbrp->header_);
> > > +                                    ds_cstr(&match), ds_cstr(&actions),
> > > +                                    &op->nbrp->header_);
> > >          }
> > >      }
> > >
> > > diff --git a/ovn-nb.xml b/ovn-nb.xml
> > > index 4c59338..d0317db 100644
> > > --- a/ovn-nb.xml
> > > +++ b/ovn-nb.xml
> > > @@ -1859,6 +1859,33 @@
> > >            send traffic between each other.
> > >          </p>
> > >        </column>
> > > +      <column name="options" key="always_learn_from_arp_request"
> type='{"type": "boolean"}'>
> > > +        <p>
> > > +          This option controls the behavior when handling IPv4 ARP
> requests or
> > > +          IPv6 ND-NS packets - whether a dynamic neighbor (MAC
> binding) entry
> > > +          is added/updated.
> > > +        </p>
> > > +
> > > +        <p>
> > > +          <code>true</code> - Always learn the MAC-IP binding, and
> add/update
> > > +          the MAC binding entry.
> > > +        </p>
> > > +
> > > +        <p>
> > > +          <code>false</code> - If there is a MAC binding for that IP
> and the
> > > +          MAC is different, or, if TPA of ARP request belongs to any
> router
> > > +          port on this router, then update/add that MAC-IP binding.
> Otherwise,
> > > +          don't update/add entries.
> > > +        </p>
> > > +
> > > +        <p>
> > > +          It is <code>true</code> by default.  It is recommended to
> set to
> > > +          <code>false</code> when a large number of logical routers are
> > > +          connected to the same logical switch but most of them never
> need to
> > > +          send traffic between each other, to reduce the size of the
> MAC
> > > +          binding table.
> > > +        </p>
> > > +      </column>
> > >      </group>
> > >
> > >      <group title="Common Columns">
> > > diff --git a/tests/ovn.at b/tests/ovn.at
> > > index 33fa6f2..ef5ef12 100644
> > > --- a/tests/ovn.at
> > > +++ b/tests/ovn.at
> > > @@ -4042,6 +4042,18 @@ ip_to_hex() {
> > >  sha=f00000000011
> > >  spa=`ip_to_hex 192 168 1 100`
> > >  tpa=$spa
> > > +
> > > +# When always_learn_from_arp_request=false, the new mac-binding will
> not be learned
> > > +# through GARP request.
> > > +ovn-nbctl --wait=hv set logical_router lr0
> options:always_learn_from_arp_request=false
> > > +
> > > +test_arp 11 $sha $spa $tpa
> > > +sleep 1
> >
> > Is it possible to use ovn-nbctl --wait=hv sync instead of sleep 1 ?
>
> Hi Numan, I thought about this, but since mac_binding learning is triggered
> by the data plane ARP packets instead of only DB changes, "ovn-nbctl
> --wait=hv sync" may still not help in race conditions. So sleep is the best
> way I can think of for now. (I replied to this same comment from your
> earlier email, but maybe you didn't notice it)


I think I missed it. Agree with you.

Thanks
Numan

>
> >
> >
> > > +AT_CHECK([ovn-sbctl find mac_binding ip="192.168.1.100"], [0], [])
> > > +
> > > +# When always_learn_from_arp_request=true, the new mac-binding will be
> learned.
> > > +ovn-nbctl --wait=hv set logical_router lr0
> options:always_learn_from_arp_request=true
> > > +
> > >  test_arp 11 $sha $spa $tpa
> > >  OVS_WAIT_UNTIL([test `ovn-sbctl find mac_binding ip="192.168.1.100" |
> wc -l` -gt 0])
> > >  ovn-nbctl --wait=hv sync
> > > @@ -4056,6 +4068,12 @@ test_ip 21 $smac $dmac $sip $dip 11
> > >
> > >  # lp12 send GARP request to announce ownership of 192.168.1.100.
> > >
> > > +# Even when always_learn_from_arp_request=false, the existing
> mac-binding should be
> > > +# updated through GARP request.
> > > +ovn-nbctl --wait=hv set logical_router lr0
> options:always_learn_from_arp_request=false
> > > +ovn-sbctl lflow-list
> > > +as hv2 ovs-ofctl dump-flows br-int
> >
> > Are these above 2 leftover after debugging ? I think you can delete them.
> >
> Ack.
>
> Thanks,
> Han
>
> >
> > > +
> > >  sha=f00000000012
> > >  test_arp 12 $sha $spa $tpa
> > >  OVS_WAIT_UNTIL([ovn-sbctl find mac_binding ip="192.168.1.100" | grep
> f0:00:00:00:00:12])
> > > --
> > > 2.1.0
> > >
> > > _______________________________________________
> > > dev mailing list
> > > dev@openvswitch.org
> > > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
> > >
> _______________________________________________
> dev mailing list
> dev@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>
Han Zhou Aug. 6, 2020, 6:46 p.m. UTC | #5
On Thu, Aug 6, 2020 at 11:22 AM Numan Siddique <numans@ovn.org> wrote:
>
> On Thu, Aug 6, 2020 at 11:45 PM Han Zhou <hzhou@ovn.org> wrote:
> >
> > On Thu, Aug 6, 2020 at 11:07 AM Numan Siddique <numans@ovn.org> wrote:
> > >
> > > On Wed, Aug 5, 2020 at 12:36 PM Han Zhou <hzhou@ovn.org> wrote:
> > > >
> > > > Support a new logical router option "always_learn_from_arp_request"
> > that controls
> > > > behavior when handling ARP requests or IPv4 ND-NS packets.
> > > >
> > > > "true" - Always learn the MAC/IP binding and add a new MAC_Binding
entry
> > > > (default behavior)
> > > >
> > > > "false" - If there is a MAC_binding for that IP and the MAC is
> > different, or,
> > > > if TPA of ARP request belongs to any router port on this router,
then
> > > > update/add that MAC/IP binding. Otherwise, don't update/add entries.
> > > >
> > > > Reported-by: Girish Moodalbail <gmoodalbail@nvidia.com>
> > > > Reported-at:
> > https://mail.openvswitch.org/pipermail/ovs-discuss/2020-May/049995.html
> > > > Signed-off-by: Han Zhou <hzhou@ovn.org>
> > >
> > > Hi Han,
> > >
> > > I have few small nits. Please see below. With those addressed
> > >
> > > Acked-by: Numan Siddique <numans@ovn.org> for all the 3 patches in the
> > series.
> > >
> > >
> > > > ---
> > > >  northd/ovn-northd.8.xml | 109
> > ++++++++++++++++++++++++++++++++++++++++--------
> > > >  northd/ovn-northd.c     |  96
> > +++++++++++++++++++++++++++++++++++-------
> > > >  ovn-nb.xml              |  27 ++++++++++++
> > > >  tests/ovn.at            |  18 ++++++++
> > > >  4 files changed, 217 insertions(+), 33 deletions(-)
> > > >
> > > > diff --git a/northd/ovn-northd.8.xml b/northd/ovn-northd.8.xml
> > > > index 508123e..b08293f 100644
> > > > --- a/northd/ovn-northd.8.xml
> > > > +++ b/northd/ovn-northd.8.xml
> > > > @@ -1537,58 +1537,126 @@ output;
> > > >          <p>
> > > >            For each router port <var>P</var> that owns IP address
> > <var>A</var>,
> > > >            which belongs to subnet <var>S</var> with prefix length
> > <var>L</var>,
> > > > -          a priority-100 flow is added which matches
> > > > -          <code>inport == <var>P</var> &amp;&amp;
> > > > -          arp.spa == <var>S</var>/<var>L</var> &amp;&amp; arp.op ==
> > 1</code>
> > > > -          (ARP request) with the
> > > > -          following actions:
> > > > +          if the option <code>always_learn_from_arp_request</code>
is
> > > > +          <code>true</code> for this router, a priority-100 flow is
> > added which
> > > > +          matches <code>inport == <var>P</var> &amp;&amp; arp.spa
==
> > > > +          <var>S</var>/<var>L</var> &amp;&amp; arp.op == 1</code>
(ARP
> > request)
> > > > +          with the following actions:
> > > > +        </p>
> > > > +
> > > > +        <pre>
> > > > +reg9[2] = lookup_arp(inport, arp.spa, arp.sha);
> > > > +next;
> > > > +        </pre>
> > > > +
> > > > +        <p>
> > > > +          If the option <code>always_learn_from_arp_request</code>
is
> > > > +          <code>false</code>, the following two flows are added.
> > > > +        </p>
> > > > +
> > > > +        <p>
> > > > +          A priority-110 flow is added which matches <code>inport
==
> > > > +          <var>P</var> &amp;&amp; arp.spa ==
<var>S</var>/<var>L</var>
> > > > +          &amp;&amp; arp.tpa == <var>A</var> &amp;&amp; arp.op ==
> > 1</code>
> > > > +          (ARP request) with the following actions:
> > > >          </p>
> > > >
> > > >          <pre>
> > > >  reg9[2] = lookup_arp(inport, arp.spa, arp.sha);
> > > > +reg9[3] = 1;
> > > > +next;
> > > > +        </pre>
> > > > +
> > > > +        <p>
> > > > +          A priority-100 flow is added which matches <code>inport
==
> > > > +          <var>P</var> &amp;&amp; arp.spa ==
<var>S</var>/<var>L</var>
> > > > +          &amp;&amp; arp.op == 1</code> (ARP request) with the
> > following
> > > > +          actions:
> > > > +        </p>
> > > > +
> > > > +        <pre>
> > > > +reg9[2] = lookup_arp(inport, arp.spa, arp.sha);
> > > > +reg9[3] = lookup_arp_ip(inport, arp.spa);
> > > >  next;
> > > >          </pre>
> > > >
> > > >          <p>
> > > >            If the logical router port <var>P</var> is a distributed
> > gateway
> > > >            router port, additional match
> > > > -          <code>is_chassis_resident(cr-<var>P</var>)</code> is
added
> > so that
> > > > -          the resident gateway chassis handles the neighbor lookup.
> > > > +          <code>is_chassis_resident(cr-<var>P</var>)</code> is
added
> > for all
> > > > +          these flows.
> > > >          </p>
> > > >        </li>
> > > >
> > > >        <li>
> > > >          <p>
> > > >            A priority-100 flow which matches on ARP reply packets
and
> > applies
> > > > -          the actions:
> > > > +          the actions if the option
> > <code>always_learn_from_arp_request</code>
> > > > +          is <code>true</code>:
> > > >          </p>
> > > >
> > > >          <pre>
> > > >  reg9[2] = lookup_arp(inport, arp.spa, arp.sha);
> > > >  next;
> > > >          </pre>
> > > > +
> > > > +        <p>
> > > > +          If the option <code>always_learn_from_arp_request</code>
> > > > +          is <code>false</code>, the above actions will be:
> > > > +        </p>
> > > > +
> > > > +        <pre>
> > > > +reg9[2] = lookup_arp(inport, arp.spa, arp.sha);
> > > > +reg9[3] = 1;
> > > > +next;
> > > > +        </pre>
> > > > +
> > > >        </li>
> > > >
> > > >        <li>
> > > >          <p>
> > > >            A priority-100 flow which matches on IPv6 Neighbor
Discovery
> > > > -          advertisement packet and applies the actions:
> > > > +          advertisement packet and applies the actions if the
option
> > > > +          <code>always_learn_from_arp_request</code> is
> > <code>true</code>:
> > > >          </p>
> > > >
> > > >          <pre>
> > > >  reg9[2] = lookup_nd(inport, nd.target, nd.tll);
> > > >  next;
> > > >          </pre>
> > > > +
> > > > +        <p>
> > > > +          If the option <code>always_learn_from_arp_request</code>
> > > > +          is <code>false</code>, the above actions will be:
> > > > +        </p>
> > > > +
> > > > +        <pre>
> > > > +reg9[2] = lookup_nd(inport, nd.target, nd.tll);
> > > > +reg9[3] = 1;
> > > > +next;
> > > > +        </pre>
> > > >        </li>
> > > >
> > > >        <li>
> > > >          <p>
> > > >            A priority-100 flow which matches on IPv6 Neighbor
Discovery
> > > > -          solicitation packet and applies the actions:
> > > > +          solicitation packet and applies the actions if the option
> > > > +          <code>always_learn_from_arp_request</code> is
> > <code>true</code>:
> > > > +        </p>
> > > > +
> > > > +        <pre>
> > > > +reg9[2] = lookup_nd(inport, ip6.src, nd.sll);
> > > > +next;
> > > > +        </pre>
> > > > +
> > > > +        <p>
> > > > +          If the option <code>always_learn_from_arp_request</code>
> > > > +          is <code>false</code>, the above actions will be:
> > > >          </p>
> > > >
> > > >          <pre>
> > > >  reg9[2] = lookup_nd(inport, ip6.src, nd.sll);
> > > > +reg9[3] = lookup_nd_ip(inport, ip6.src);
> > > >  next;
> > > >          </pre>
> > > >        </li>
> > > > @@ -1604,21 +1672,28 @@ next;
> > > >
> > > >      <p>
> > > >        This table adds flows to learn the mac bindings from the ARP
and
> > > > -      IPv6 Neighbor Solicitation/Advertisement packets if ARP/ND
lookup
> > > > -      failed in the previous table.
> > > > +      IPv6 Neighbor Solicitation/Advertisement packets if it is
needed
> > > > +      according to the lookup results from the previous stage.
> > > >      </p>
> > > >
> > > >      <p>
> > > >        reg9[2] will be <code>1</code> if the
> > <code>lookup_arp/lookup_nd</code>
> > > > -      in the previous table was successful, or if there was no
need to
> > do the
> > > > -      lookup.
> > > > +      in the previous table was successful or skipped, meaning no
need
> > > > +      to learn mac binding from the packet.
> > > > +    </p>
> > > > +
> > > > +    <p>
> > > > +      reg9[3] will be <code>1</code> if the
> > > > +      <code>lookup_arp_ip/lookup_nd_ip</code> in the previous
table was
> > > > +      successful or skipped, meaning it is ok to learn mac binding
from
> > > > +      the packet (if reg9[2] is 0).
> > > >      </p>
> > > >
> > > >      <ul>
> > > >        <li>
> > > > -        A priority-100 flow with the match
> > > > -        <code>reg9[2] == 1</code> and advances the packet
> > > > -        to the next table as there is no need to learn the
neighbor.
> > > > +        A priority-100 flow with the match <code>reg9[2] == 1 ||
> > reg9[3] ==
> > > > +        0</code> and advances the packet to the next table as
there is
> > no need
> > > > +        to learn the neighbor.
> > > >        </li>
> > > >
> > > >        <li>
> > > > diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
> > > > index 5dd49f9..b7102b2 100644
> > > > --- a/northd/ovn-northd.c
> > > > +++ b/northd/ovn-northd.c
> > > > @@ -222,6 +222,7 @@ enum ovn_stage {
> > > >  /* Register to store the result of check_pkt_larger action. */
> > > >  #define REGBIT_PKT_LARGER        "reg9[1]"
> > > >  #define REGBIT_LOOKUP_NEIGHBOR_RESULT "reg9[2]"
> > > > +#define REGBIT_LOOKUP_NEIGHBOR_IP_RESULT "reg9[3]"
> > > >
> > > >  /* Register to store the eth address associated to a router port
for
> > packets
> > > >   * received in S_ROUTER_IN_ADMISSION.
> > > > @@ -8442,36 +8443,62 @@ build_lrouter_flows(struct hmap *datapaths,
> > struct hmap *ports,
> > > >           * For ARP packets, table LOOKUP_NEIGHBOR does a lookup
for the
> > > >           * (arp.spa, arp.sha) in the mac binding table using the
> > 'lookup_arp'
> > > >           * action and stores the result in
> > REGBIT_LOOKUP_NEIGHBOR_RESULT bit.
> > > > +         * If "always_learn_from_arp_request" is set to false, it
will
> > also
> > > > +         * lookup for the (arp.spa) in the mac binding table using
the
> > > > +         * "lookup_arp_ip" action for ARP request packets, and
stores
> > the
> > > > +         * result in REGBIT_LOOKUP_NEIGHBOR_IP_RESULT bit; or set
that
> > bit
> > > > +         * to "1" directly for ARP response packets.
> > > >           *
> > > >           * For IPv6 ND NA packets, table LOOKUP_NEIGHBOR does a
lookup
> > > >           * for the (nd.target, nd.tll) in the mac binding table
using
> > the
> > > >           * 'lookup_nd' action and stores the result in
> > > > -         * REGBIT_LOOKUP_NEIGHBOR_RESULT bit.
> > > > +         * REGBIT_LOOKUP_NEIGHBOR_RESULT bit. If
> > > > +         * "always_learn_from_arp_request" is set to false,
> > > > +         * REGBIT_LOOKUP_NEIGHBOR_IP_RESULT bit is set.
> > > >           *
> > > >           * For IPv6 ND NS packets, table LOOKUP_NEIGHBOR does a
lookup
> > > >           * for the (ip6.src, nd.sll) in the mac binding table
using the
> > > >           * 'lookup_nd' action and stores the result in
> > > > -         * REGBIT_LOOKUP_NEIGHBOR_RESULT bit.
> > > > +         * REGBIT_LOOKUP_NEIGHBOR_RESULT bit. If
> > > > +         * "always_learn_from_arp_request" is set to false, it will
> > also lookup
> > > > +         * for the (ip6.src) in the mac binding table using the
> > "lookup_nd_ip"
> > > > +         * action and stores the result in
> > REGBIT_LOOKUP_NEIGHBOR_IP_RESULT
> > > > +         * bit.
> > > >           *
> > > >           * Table LEARN_NEIGHBOR learns the mac-binding using the
action
> > > > -         * - 'put_arp/put_nd' only if
REGBIT_LOOKUP_NEIGHBOR_RESULT bit
> > > > -         * is not set.
> > > > +         * - 'put_arp/put_nd'. Learning mac-binding is skipped if
> > > > +         *   REGBIT_LOOKUP_NEIGHBOR_RESULT bit is set or
> > > > +         *   REGBIT_LOOKUP_NEIGHBOR_IP_RESULT is not set.
> > > >           *
> > > >           * */
> > > >
> > > >          /* Flows for LOOKUP_NEIGHBOR. */
> > > > +        bool learn_from_arp_request =
smap_get_bool(&od->nbr->options,
> > > > +            "always_learn_from_arp_request", true);
> > > > +        ds_clear(&actions);
> > > > +        ds_put_format(&actions, REGBIT_LOOKUP_NEIGHBOR_RESULT
> > > > +                      " = lookup_arp(inport, arp.spa, arp.sha);
> > %snext;",
> > > > +                      learn_from_arp_request ? "" :
> > > > +                      REGBIT_LOOKUP_NEIGHBOR_IP_RESULT" = 1; ");
> > > >          ovn_lflow_add(lflows, od, S_ROUTER_IN_LOOKUP_NEIGHBOR, 100,
> > > > -                      "arp.op == 2",
> > > > -                      REGBIT_LOOKUP_NEIGHBOR_RESULT" = "
> > > > -                      "lookup_arp(inport, arp.spa, arp.sha);
next;");
> > > > +                      "arp.op == 2", ds_cstr(&actions));
> > > >
> > > > +        ds_clear(&actions);
> > > > +        ds_put_format(&actions, REGBIT_LOOKUP_NEIGHBOR_RESULT
> > > > +                      " = lookup_nd(inport, nd.target, nd.tll);
> > %snext;",
> > > > +                      learn_from_arp_request ? "" :
> > > > +                      REGBIT_LOOKUP_NEIGHBOR_IP_RESULT" = 1; ");
> > > >          ovn_lflow_add(lflows, od, S_ROUTER_IN_LOOKUP_NEIGHBOR, 100,
> > "nd_na",
> > > > -                      REGBIT_LOOKUP_NEIGHBOR_RESULT" = "
> > > > -                      "lookup_nd(inport, nd.target, nd.tll);
next;");
> > > > +                      ds_cstr(&actions));
> > > >
> > > > +        ds_clear(&actions);
> > > > +        ds_put_format(&actions, REGBIT_LOOKUP_NEIGHBOR_RESULT
> > > > +                      " = lookup_nd(inport, ip6.src, nd.sll);
%snext;",
> > > > +                      learn_from_arp_request ? "" :
> > > > +                      REGBIT_LOOKUP_NEIGHBOR_IP_RESULT
> > > > +                      " = lookup_nd_ip(inport, ip6.src); ");
> > > >          ovn_lflow_add(lflows, od, S_ROUTER_IN_LOOKUP_NEIGHBOR, 100,
> > "nd_ns",
> > > > -                      REGBIT_LOOKUP_NEIGHBOR_RESULT" = "
> > > > -                      "lookup_nd(inport, ip6.src, nd.sll); next;");
> > > > +                      ds_cstr(&actions));
> > > >
> > > >          /* For other packet types, we can skip neighbor learning.
> > > >           * So set REGBIT_LOOKUP_NEIGHBOR_RESULT to 1. */
> > > > @@ -8480,8 +8507,12 @@ build_lrouter_flows(struct hmap *datapaths,
> > struct hmap *ports,
> > > >
> > > >          /* Flows for LEARN_NEIGHBOR. */
> > > >          /* Skip Neighbor learning if not required. */
> > > > +        ds_clear(&match);
> > > > +        ds_put_format(&match, REGBIT_LOOKUP_NEIGHBOR_RESULT" ==
1%s",
> > > > +                      learn_from_arp_request ? "" :
> > > > +                      " || "REGBIT_LOOKUP_NEIGHBOR_IP_RESULT" ==
0");
> > > >          ovn_lflow_add(lflows, od, S_ROUTER_IN_LEARN_NEIGHBOR, 100,
> > > > -                      REGBIT_LOOKUP_NEIGHBOR_RESULT" == 1",
"next;");
> > > > +                      ds_cstr(&match), "next;");
> > > >
> > > >          ovn_lflow_add(lflows, od, S_ROUTER_IN_LEARN_NEIGHBOR, 90,
> > > >                        "arp", "put_arp(inport, arp.spa, arp.sha);
> > next;");
> > > > @@ -8498,8 +8529,37 @@ build_lrouter_flows(struct hmap *datapaths,
> > struct hmap *ports,
> > > >              continue;
> > > >          }
> > > >
> > > > +        bool learn_from_arp_request =
> > smap_get_bool(&op->od->nbr->options,
> > > > +            "always_learn_from_arp_request", true);
> > > > +
> > > >          /* Check if we need to learn mac-binding from ARP
requests. */
> > > >          for (int i = 0; i < op->lrp_networks.n_ipv4_addrs; i++) {
> > > > +            if (!learn_from_arp_request) {
> > > > +                /* ARP request to this address should always get
> > learned,
> > > > +                 * so add a priority-110 flow to set
> > > > +                 * REGBIT_LOOKUP_NEIGHBOR_IP_RESULT to 1. */
> > > > +                ds_clear(&match);
> > > > +                ds_put_format(&match,
> > > > +                              "inport == %s && arp.spa == %s/%u &&
"
> > > > +                              "arp.tpa == %s && arp.op == 1",
> > > > +                              op->json_key,
> > > > +
 op->lrp_networks.ipv4_addrs[i].network_s,
> > > > +                              op->lrp_networks.ipv4_addrs[i].plen,
> > > > +
 op->lrp_networks.ipv4_addrs[i].addr_s);
> > > > +                if (op->od->l3dgw_port && op == op->od->l3dgw_port
> > > > +                    && op->od->l3redirect_port) {
> > > > +                    ds_put_format(&match, " &&
> > is_chassis_resident(%s)",
> > > > +
 op->od->l3redirect_port->json_key);
> > > > +                }
> > > > +                const char *actions_s =
REGBIT_LOOKUP_NEIGHBOR_RESULT
> > > > +                                  " = lookup_arp(inport, arp.spa,
> > arp.sha); "
> > > > +
 REGBIT_LOOKUP_NEIGHBOR_IP_RESULT" =
> > 1;"
> > > > +                                  " next;";
> > > > +                ovn_lflow_add_with_hint(lflows, op->od,
> > > > +
 S_ROUTER_IN_LOOKUP_NEIGHBOR,
> > 110,
> > > > +                                        ds_cstr(&match), actions_s,
> > > > +                                        &op->nbrp->header_);
> > > > +            }
> > > >              ds_clear(&match);
> > > >              ds_put_format(&match,
> > > >                            "inport == %s && arp.spa == %s/%u &&
arp.op
> > == 1",
> > > > @@ -8511,12 +8571,16 @@ build_lrouter_flows(struct hmap *datapaths,
> > struct hmap *ports,
> > > >                  ds_put_format(&match, " &&
is_chassis_resident(%s)",
> > > >                                op->od->l3redirect_port->json_key);
> > > >              }
> > > > +            ds_clear(&actions);
> > > > +            ds_put_format(&actions, REGBIT_LOOKUP_NEIGHBOR_RESULT
> > > > +                          " = lookup_arp(inport, arp.spa, arp.sha);
> > %snext;",
> > > > +                          learn_from_arp_request ? "" :
> > > > +                          REGBIT_LOOKUP_NEIGHBOR_IP_RESULT
> > > > +                          " = lookup_arp_ip(inport, arp.spa); ");
> > > >              ovn_lflow_add_with_hint(lflows, op->od,
> > > >                                      S_ROUTER_IN_LOOKUP_NEIGHBOR,
100,
> > > > -                                    ds_cstr(&match),
> > > > -                                    REGBIT_LOOKUP_NEIGHBOR_RESULT"
= "
> > > > -                                    "lookup_arp(inport, arp.spa,
> > arp.sha); "
> > > > -                                    "next;", &op->nbrp->header_);
> > > > +                                    ds_cstr(&match),
ds_cstr(&actions),
> > > > +                                    &op->nbrp->header_);
> > > >          }
> > > >      }
> > > >
> > > > diff --git a/ovn-nb.xml b/ovn-nb.xml
> > > > index 4c59338..d0317db 100644
> > > > --- a/ovn-nb.xml
> > > > +++ b/ovn-nb.xml
> > > > @@ -1859,6 +1859,33 @@
> > > >            send traffic between each other.
> > > >          </p>
> > > >        </column>
> > > > +      <column name="options" key="always_learn_from_arp_request"
> > type='{"type": "boolean"}'>
> > > > +        <p>
> > > > +          This option controls the behavior when handling IPv4 ARP
> > requests or
> > > > +          IPv6 ND-NS packets - whether a dynamic neighbor (MAC
> > binding) entry
> > > > +          is added/updated.
> > > > +        </p>
> > > > +
> > > > +        <p>
> > > > +          <code>true</code> - Always learn the MAC-IP binding, and
> > add/update
> > > > +          the MAC binding entry.
> > > > +        </p>
> > > > +
> > > > +        <p>
> > > > +          <code>false</code> - If there is a MAC binding for that
IP
> > and the
> > > > +          MAC is different, or, if TPA of ARP request belongs to
any
> > router
> > > > +          port on this router, then update/add that MAC-IP binding.
> > Otherwise,
> > > > +          don't update/add entries.
> > > > +        </p>
> > > > +
> > > > +        <p>
> > > > +          It is <code>true</code> by default.  It is recommended to
> > set to
> > > > +          <code>false</code> when a large number of logical
routers are
> > > > +          connected to the same logical switch but most of them
never
> > need to
> > > > +          send traffic between each other, to reduce the size of
the
> > MAC
> > > > +          binding table.
> > > > +        </p>
> > > > +      </column>
> > > >      </group>
> > > >
> > > >      <group title="Common Columns">
> > > > diff --git a/tests/ovn.at b/tests/ovn.at
> > > > index 33fa6f2..ef5ef12 100644
> > > > --- a/tests/ovn.at
> > > > +++ b/tests/ovn.at
> > > > @@ -4042,6 +4042,18 @@ ip_to_hex() {
> > > >  sha=f00000000011
> > > >  spa=`ip_to_hex 192 168 1 100`
> > > >  tpa=$spa
> > > > +
> > > > +# When always_learn_from_arp_request=false, the new mac-binding
will
> > not be learned
> > > > +# through GARP request.
> > > > +ovn-nbctl --wait=hv set logical_router lr0
> > options:always_learn_from_arp_request=false
> > > > +
> > > > +test_arp 11 $sha $spa $tpa
> > > > +sleep 1
> > >
> > > Is it possible to use ovn-nbctl --wait=hv sync instead of sleep 1 ?
> >
> > Hi Numan, I thought about this, but since mac_binding learning is
triggered
> > by the data plane ARP packets instead of only DB changes, "ovn-nbctl
> > --wait=hv sync" may still not help in race conditions. So sleep is the
best
> > way I can think of for now. (I replied to this same comment from your
> > earlier email, but maybe you didn't notice it)
>
>
> I think I missed it. Agree with you.
>
> Thanks
> Numan
>
> >
> > >
> > >
> > > > +AT_CHECK([ovn-sbctl find mac_binding ip="192.168.1.100"], [0], [])
> > > > +
> > > > +# When always_learn_from_arp_request=true, the new mac-binding
will be
> > learned.
> > > > +ovn-nbctl --wait=hv set logical_router lr0
> > options:always_learn_from_arp_request=true
> > > > +
> > > >  test_arp 11 $sha $spa $tpa
> > > >  OVS_WAIT_UNTIL([test `ovn-sbctl find mac_binding
ip="192.168.1.100" |
> > wc -l` -gt 0])
> > > >  ovn-nbctl --wait=hv sync
> > > > @@ -4056,6 +4068,12 @@ test_ip 21 $smac $dmac $sip $dip 11
> > > >
> > > >  # lp12 send GARP request to announce ownership of 192.168.1.100.
> > > >
> > > > +# Even when always_learn_from_arp_request=false, the existing
> > mac-binding should be
> > > > +# updated through GARP request.
> > > > +ovn-nbctl --wait=hv set logical_router lr0
> > options:always_learn_from_arp_request=false
> > > > +ovn-sbctl lflow-list
> > > > +as hv2 ovs-ofctl dump-flows br-int
> > >
> > > Are these above 2 leftover after debugging ? I think you can delete
them.
> > >
> > Ack.
> >
> > Thanks,
> > Han
> >
> > >
> > > > +
> > > >  sha=f00000000012
> > > >  test_arp 12 $sha $spa $tpa
> > > >  OVS_WAIT_UNTIL([ovn-sbctl find mac_binding ip="192.168.1.100" |
grep
> > f0:00:00:00:00:12])
> > > > --
> > > > 2.1.0
> > > >
> > > > _______________________________________________
> > > > dev mailing list
> > > > dev@openvswitch.org
> > > > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
> > > >
> > _______________________________________________
> > dev mailing list
> > dev@openvswitch.org
> > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
> >

Thanks Numan. I applied this to master (with the two lines of debug print
in the test case removed).
diff mbox series

Patch

diff --git a/northd/ovn-northd.8.xml b/northd/ovn-northd.8.xml
index 508123e..b08293f 100644
--- a/northd/ovn-northd.8.xml
+++ b/northd/ovn-northd.8.xml
@@ -1537,58 +1537,126 @@  output;
         <p>
           For each router port <var>P</var> that owns IP address <var>A</var>,
           which belongs to subnet <var>S</var> with prefix length <var>L</var>,
-          a priority-100 flow is added which matches
-          <code>inport == <var>P</var> &amp;&amp;
-          arp.spa == <var>S</var>/<var>L</var> &amp;&amp; arp.op == 1</code>
-          (ARP request) with the
-          following actions:
+          if the option <code>always_learn_from_arp_request</code> is
+          <code>true</code> for this router, a priority-100 flow is added which
+          matches <code>inport == <var>P</var> &amp;&amp; arp.spa ==
+          <var>S</var>/<var>L</var> &amp;&amp; arp.op == 1</code> (ARP request)
+          with the following actions:
+        </p>
+
+        <pre>
+reg9[2] = lookup_arp(inport, arp.spa, arp.sha);
+next;
+        </pre>
+
+        <p>
+          If the option <code>always_learn_from_arp_request</code> is
+          <code>false</code>, the following two flows are added.
+        </p>
+
+        <p>
+          A priority-110 flow is added which matches <code>inport ==
+          <var>P</var> &amp;&amp; arp.spa == <var>S</var>/<var>L</var>
+          &amp;&amp; arp.tpa == <var>A</var> &amp;&amp; arp.op == 1</code>
+          (ARP request) with the following actions:
         </p>
 
         <pre>
 reg9[2] = lookup_arp(inport, arp.spa, arp.sha);
+reg9[3] = 1;
+next;
+        </pre>
+
+        <p>
+          A priority-100 flow is added which matches <code>inport ==
+          <var>P</var> &amp;&amp; arp.spa == <var>S</var>/<var>L</var>
+          &amp;&amp; arp.op == 1</code> (ARP request) with the following
+          actions:
+        </p>
+
+        <pre>
+reg9[2] = lookup_arp(inport, arp.spa, arp.sha);
+reg9[3] = lookup_arp_ip(inport, arp.spa);
 next;
         </pre>
 
         <p>
           If the logical router port <var>P</var> is a distributed gateway
           router port, additional match
-          <code>is_chassis_resident(cr-<var>P</var>)</code> is added so that
-          the resident gateway chassis handles the neighbor lookup.
+          <code>is_chassis_resident(cr-<var>P</var>)</code> is added for all
+          these flows.
         </p>
       </li>
 
       <li>
         <p>
           A priority-100 flow which matches on ARP reply packets and applies
-          the actions:
+          the actions if the option <code>always_learn_from_arp_request</code>
+          is <code>true</code>:
         </p>
 
         <pre>
 reg9[2] = lookup_arp(inport, arp.spa, arp.sha);
 next;
         </pre>
+
+        <p>
+          If the option <code>always_learn_from_arp_request</code>
+          is <code>false</code>, the above actions will be:
+        </p>
+
+        <pre>
+reg9[2] = lookup_arp(inport, arp.spa, arp.sha);
+reg9[3] = 1;
+next;
+        </pre>
+
       </li>
 
       <li>
         <p>
           A priority-100 flow which matches on IPv6 Neighbor Discovery
-          advertisement packet and applies the actions:
+          advertisement packet and applies the actions if the option
+          <code>always_learn_from_arp_request</code> is <code>true</code>:
         </p>
 
         <pre>
 reg9[2] = lookup_nd(inport, nd.target, nd.tll);
 next;
         </pre>
+
+        <p>
+          If the option <code>always_learn_from_arp_request</code>
+          is <code>false</code>, the above actions will be:
+        </p>
+
+        <pre>
+reg9[2] = lookup_nd(inport, nd.target, nd.tll);
+reg9[3] = 1;
+next;
+        </pre>
       </li>
 
       <li>
         <p>
           A priority-100 flow which matches on IPv6 Neighbor Discovery
-          solicitation packet and applies the actions:
+          solicitation packet and applies the actions if the option
+          <code>always_learn_from_arp_request</code> is <code>true</code>:
+        </p>
+
+        <pre>
+reg9[2] = lookup_nd(inport, ip6.src, nd.sll);
+next;
+        </pre>
+
+        <p>
+          If the option <code>always_learn_from_arp_request</code>
+          is <code>false</code>, the above actions will be:
         </p>
 
         <pre>
 reg9[2] = lookup_nd(inport, ip6.src, nd.sll);
+reg9[3] = lookup_nd_ip(inport, ip6.src);
 next;
         </pre>
       </li>
@@ -1604,21 +1672,28 @@  next;
 
     <p>
       This table adds flows to learn the mac bindings from the ARP and
-      IPv6 Neighbor Solicitation/Advertisement packets if ARP/ND lookup
-      failed in the previous table.
+      IPv6 Neighbor Solicitation/Advertisement packets if it is needed
+      according to the lookup results from the previous stage.
     </p>
 
     <p>
       reg9[2] will be <code>1</code> if the <code>lookup_arp/lookup_nd</code>
-      in the previous table was successful, or if there was no need to do the
-      lookup.
+      in the previous table was successful or skipped, meaning no need
+      to learn mac binding from the packet.
+    </p>
+
+    <p>
+      reg9[3] will be <code>1</code> if the
+      <code>lookup_arp_ip/lookup_nd_ip</code> in the previous table was
+      successful or skipped, meaning it is ok to learn mac binding from
+      the packet (if reg9[2] is 0).
     </p>
 
     <ul>
       <li>
-        A priority-100 flow with the match
-        <code>reg9[2] == 1</code> and advances the packet
-        to the next table as there is no need to learn the neighbor.
+        A priority-100 flow with the match <code>reg9[2] == 1 || reg9[3] ==
+        0</code> and advances the packet to the next table as there is no need
+        to learn the neighbor.
       </li>
 
       <li>
diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
index 5dd49f9..b7102b2 100644
--- a/northd/ovn-northd.c
+++ b/northd/ovn-northd.c
@@ -222,6 +222,7 @@  enum ovn_stage {
 /* Register to store the result of check_pkt_larger action. */
 #define REGBIT_PKT_LARGER        "reg9[1]"
 #define REGBIT_LOOKUP_NEIGHBOR_RESULT "reg9[2]"
+#define REGBIT_LOOKUP_NEIGHBOR_IP_RESULT "reg9[3]"
 
 /* Register to store the eth address associated to a router port for packets
  * received in S_ROUTER_IN_ADMISSION.
@@ -8442,36 +8443,62 @@  build_lrouter_flows(struct hmap *datapaths, struct hmap *ports,
          * For ARP packets, table LOOKUP_NEIGHBOR does a lookup for the
          * (arp.spa, arp.sha) in the mac binding table using the 'lookup_arp'
          * action and stores the result in REGBIT_LOOKUP_NEIGHBOR_RESULT bit.
+         * If "always_learn_from_arp_request" is set to false, it will also
+         * lookup for the (arp.spa) in the mac binding table using the
+         * "lookup_arp_ip" action for ARP request packets, and stores the
+         * result in REGBIT_LOOKUP_NEIGHBOR_IP_RESULT bit; or set that bit
+         * to "1" directly for ARP response packets.
          *
          * For IPv6 ND NA packets, table LOOKUP_NEIGHBOR does a lookup
          * for the (nd.target, nd.tll) in the mac binding table using the
          * 'lookup_nd' action and stores the result in
-         * REGBIT_LOOKUP_NEIGHBOR_RESULT bit.
+         * REGBIT_LOOKUP_NEIGHBOR_RESULT bit. If
+         * "always_learn_from_arp_request" is set to false,
+         * REGBIT_LOOKUP_NEIGHBOR_IP_RESULT bit is set.
          *
          * For IPv6 ND NS packets, table LOOKUP_NEIGHBOR does a lookup
          * for the (ip6.src, nd.sll) in the mac binding table using the
          * 'lookup_nd' action and stores the result in
-         * REGBIT_LOOKUP_NEIGHBOR_RESULT bit.
+         * REGBIT_LOOKUP_NEIGHBOR_RESULT bit. If
+         * "always_learn_from_arp_request" is set to false, it will also lookup
+         * for the (ip6.src) in the mac binding table using the "lookup_nd_ip"
+         * action and stores the result in REGBIT_LOOKUP_NEIGHBOR_IP_RESULT
+         * bit.
          *
          * Table LEARN_NEIGHBOR learns the mac-binding using the action
-         * - 'put_arp/put_nd' only if REGBIT_LOOKUP_NEIGHBOR_RESULT bit
-         * is not set.
+         * - 'put_arp/put_nd'. Learning mac-binding is skipped if
+         *   REGBIT_LOOKUP_NEIGHBOR_RESULT bit is set or
+         *   REGBIT_LOOKUP_NEIGHBOR_IP_RESULT is not set.
          *
          * */
 
         /* Flows for LOOKUP_NEIGHBOR. */
+        bool learn_from_arp_request = smap_get_bool(&od->nbr->options,
+            "always_learn_from_arp_request", true);
+        ds_clear(&actions);
+        ds_put_format(&actions, REGBIT_LOOKUP_NEIGHBOR_RESULT
+                      " = lookup_arp(inport, arp.spa, arp.sha); %snext;",
+                      learn_from_arp_request ? "" :
+                      REGBIT_LOOKUP_NEIGHBOR_IP_RESULT" = 1; ");
         ovn_lflow_add(lflows, od, S_ROUTER_IN_LOOKUP_NEIGHBOR, 100,
-                      "arp.op == 2",
-                      REGBIT_LOOKUP_NEIGHBOR_RESULT" = "
-                      "lookup_arp(inport, arp.spa, arp.sha); next;");
+                      "arp.op == 2", ds_cstr(&actions));
 
+        ds_clear(&actions);
+        ds_put_format(&actions, REGBIT_LOOKUP_NEIGHBOR_RESULT
+                      " = lookup_nd(inport, nd.target, nd.tll); %snext;",
+                      learn_from_arp_request ? "" :
+                      REGBIT_LOOKUP_NEIGHBOR_IP_RESULT" = 1; ");
         ovn_lflow_add(lflows, od, S_ROUTER_IN_LOOKUP_NEIGHBOR, 100, "nd_na",
-                      REGBIT_LOOKUP_NEIGHBOR_RESULT" = "
-                      "lookup_nd(inport, nd.target, nd.tll); next;");
+                      ds_cstr(&actions));
 
+        ds_clear(&actions);
+        ds_put_format(&actions, REGBIT_LOOKUP_NEIGHBOR_RESULT
+                      " = lookup_nd(inport, ip6.src, nd.sll); %snext;",
+                      learn_from_arp_request ? "" :
+                      REGBIT_LOOKUP_NEIGHBOR_IP_RESULT
+                      " = lookup_nd_ip(inport, ip6.src); ");
         ovn_lflow_add(lflows, od, S_ROUTER_IN_LOOKUP_NEIGHBOR, 100, "nd_ns",
-                      REGBIT_LOOKUP_NEIGHBOR_RESULT" = "
-                      "lookup_nd(inport, ip6.src, nd.sll); next;");
+                      ds_cstr(&actions));
 
         /* For other packet types, we can skip neighbor learning.
          * So set REGBIT_LOOKUP_NEIGHBOR_RESULT to 1. */
@@ -8480,8 +8507,12 @@  build_lrouter_flows(struct hmap *datapaths, struct hmap *ports,
 
         /* Flows for LEARN_NEIGHBOR. */
         /* Skip Neighbor learning if not required. */
+        ds_clear(&match);
+        ds_put_format(&match, REGBIT_LOOKUP_NEIGHBOR_RESULT" == 1%s",
+                      learn_from_arp_request ? "" :
+                      " || "REGBIT_LOOKUP_NEIGHBOR_IP_RESULT" == 0");
         ovn_lflow_add(lflows, od, S_ROUTER_IN_LEARN_NEIGHBOR, 100,
-                      REGBIT_LOOKUP_NEIGHBOR_RESULT" == 1", "next;");
+                      ds_cstr(&match), "next;");
 
         ovn_lflow_add(lflows, od, S_ROUTER_IN_LEARN_NEIGHBOR, 90,
                       "arp", "put_arp(inport, arp.spa, arp.sha); next;");
@@ -8498,8 +8529,37 @@  build_lrouter_flows(struct hmap *datapaths, struct hmap *ports,
             continue;
         }
 
+        bool learn_from_arp_request = smap_get_bool(&op->od->nbr->options,
+            "always_learn_from_arp_request", true);
+
         /* Check if we need to learn mac-binding from ARP requests. */
         for (int i = 0; i < op->lrp_networks.n_ipv4_addrs; i++) {
+            if (!learn_from_arp_request) {
+                /* ARP request to this address should always get learned,
+                 * so add a priority-110 flow to set
+                 * REGBIT_LOOKUP_NEIGHBOR_IP_RESULT to 1. */
+                ds_clear(&match);
+                ds_put_format(&match,
+                              "inport == %s && arp.spa == %s/%u && "
+                              "arp.tpa == %s && arp.op == 1",
+                              op->json_key,
+                              op->lrp_networks.ipv4_addrs[i].network_s,
+                              op->lrp_networks.ipv4_addrs[i].plen,
+                              op->lrp_networks.ipv4_addrs[i].addr_s);
+                if (op->od->l3dgw_port && op == op->od->l3dgw_port
+                    && op->od->l3redirect_port) {
+                    ds_put_format(&match, " && is_chassis_resident(%s)",
+                                  op->od->l3redirect_port->json_key);
+                }
+                const char *actions_s = REGBIT_LOOKUP_NEIGHBOR_RESULT
+                                  " = lookup_arp(inport, arp.spa, arp.sha); "
+                                  REGBIT_LOOKUP_NEIGHBOR_IP_RESULT" = 1;"
+                                  " next;";
+                ovn_lflow_add_with_hint(lflows, op->od,
+                                        S_ROUTER_IN_LOOKUP_NEIGHBOR, 110,
+                                        ds_cstr(&match), actions_s,
+                                        &op->nbrp->header_);
+            }
             ds_clear(&match);
             ds_put_format(&match,
                           "inport == %s && arp.spa == %s/%u && arp.op == 1",
@@ -8511,12 +8571,16 @@  build_lrouter_flows(struct hmap *datapaths, struct hmap *ports,
                 ds_put_format(&match, " && is_chassis_resident(%s)",
                               op->od->l3redirect_port->json_key);
             }
+            ds_clear(&actions);
+            ds_put_format(&actions, REGBIT_LOOKUP_NEIGHBOR_RESULT
+                          " = lookup_arp(inport, arp.spa, arp.sha); %snext;",
+                          learn_from_arp_request ? "" :
+                          REGBIT_LOOKUP_NEIGHBOR_IP_RESULT
+                          " = lookup_arp_ip(inport, arp.spa); ");
             ovn_lflow_add_with_hint(lflows, op->od,
                                     S_ROUTER_IN_LOOKUP_NEIGHBOR, 100,
-                                    ds_cstr(&match),
-                                    REGBIT_LOOKUP_NEIGHBOR_RESULT" = "
-                                    "lookup_arp(inport, arp.spa, arp.sha); "
-                                    "next;", &op->nbrp->header_);
+                                    ds_cstr(&match), ds_cstr(&actions),
+                                    &op->nbrp->header_);
         }
     }
 
diff --git a/ovn-nb.xml b/ovn-nb.xml
index 4c59338..d0317db 100644
--- a/ovn-nb.xml
+++ b/ovn-nb.xml
@@ -1859,6 +1859,33 @@ 
           send traffic between each other.
         </p>
       </column>
+      <column name="options" key="always_learn_from_arp_request" type='{"type": "boolean"}'>
+        <p>
+          This option controls the behavior when handling IPv4 ARP requests or
+          IPv6 ND-NS packets - whether a dynamic neighbor (MAC binding) entry
+          is added/updated.
+        </p>
+
+        <p>
+          <code>true</code> - Always learn the MAC-IP binding, and add/update
+          the MAC binding entry.
+        </p>
+
+        <p>
+          <code>false</code> - If there is a MAC binding for that IP and the
+          MAC is different, or, if TPA of ARP request belongs to any router
+          port on this router, then update/add that MAC-IP binding. Otherwise,
+          don't update/add entries.
+        </p>
+
+        <p>
+          It is <code>true</code> by default.  It is recommended to set to
+          <code>false</code> when a large number of logical routers are
+          connected to the same logical switch but most of them never need to
+          send traffic between each other, to reduce the size of the MAC
+          binding table.
+        </p>
+      </column>
     </group>
 
     <group title="Common Columns">
diff --git a/tests/ovn.at b/tests/ovn.at
index 33fa6f2..ef5ef12 100644
--- a/tests/ovn.at
+++ b/tests/ovn.at
@@ -4042,6 +4042,18 @@  ip_to_hex() {
 sha=f00000000011
 spa=`ip_to_hex 192 168 1 100`
 tpa=$spa
+
+# When always_learn_from_arp_request=false, the new mac-binding will not be learned
+# through GARP request.
+ovn-nbctl --wait=hv set logical_router lr0 options:always_learn_from_arp_request=false
+
+test_arp 11 $sha $spa $tpa
+sleep 1
+AT_CHECK([ovn-sbctl find mac_binding ip="192.168.1.100"], [0], [])
+
+# When always_learn_from_arp_request=true, the new mac-binding will be learned.
+ovn-nbctl --wait=hv set logical_router lr0 options:always_learn_from_arp_request=true
+
 test_arp 11 $sha $spa $tpa
 OVS_WAIT_UNTIL([test `ovn-sbctl find mac_binding ip="192.168.1.100" | wc -l` -gt 0])
 ovn-nbctl --wait=hv sync
@@ -4056,6 +4068,12 @@  test_ip 21 $smac $dmac $sip $dip 11
 
 # lp12 send GARP request to announce ownership of 192.168.1.100.
 
+# Even when always_learn_from_arp_request=false, the existing mac-binding should be
+# updated through GARP request.
+ovn-nbctl --wait=hv set logical_router lr0 options:always_learn_from_arp_request=false
+ovn-sbctl lflow-list
+as hv2 ovs-ofctl dump-flows br-int
+
 sha=f00000000012
 test_arp 12 $sha $spa $tpa
 OVS_WAIT_UNTIL([ovn-sbctl find mac_binding ip="192.168.1.100" | grep f0:00:00:00:00:12])