
[ovs-dev,patch_v4] ovn: Add additional comments regarding arp responders.

Message ID 1475775274-129940-1-git-send-email-dlu998@gmail.com
State Superseded

Commit Message

Darrell Ball Oct. 6, 2016, 5:34 p.m. UTC
There has been enough confusion regarding logical switch datapath
arp responders in ovn to warrant some additional comments;
hence add a general description regarding why they exist and
document the special cases.

Signed-off-by: Darrell Ball <dlu998@gmail.com>
---

Note this patch is meant to be merged with the code change for vtep
inport handling here:
https://patchwork.ozlabs.org/patch/675796/

v3->v4: Capitalization fixes.
        Reinstate comment regarding L2 learning confusion.

v2->v3: Reword and further elaborate.

v1->v2: Dropped RFC code change for logical switch router
        type ports

 ovn/northd/ovn-northd.8.xml | 52 +++++++++++++++++++++++++++++++++++++++------
 1 file changed, 46 insertions(+), 6 deletions(-)

Comments

Han Zhou Oct. 8, 2016, 12:25 a.m. UTC | #1
Overall it looks good to me. Just suggestions for rewording.

On Thu, Oct 6, 2016 at 10:34 AM, Darrell Ball <dlu998@gmail.com> wrote:
>
> There has been enough confusion regarding logical switch datapath
> arp responders in ovn to warrant some additional comments;
> hence add a general description regarding why they exist and
> document the special cases.
>
> Signed-off-by: Darrell Ball <dlu998@gmail.com>
> ---
>
> Note this patch is meant to be merge with the code change for vtep
> inport handling here.
> https://patchwork.ozlabs.org/patch/675796/
>
> v3->v4: Capitalization fixes.
>         Reinstate comment regarding L2 learning confusion.
>
> v2->v3: Reword and further elaborate.
>
> v1->v2: Dropped RFC code change for logical switch router
>         type ports
>
>  ovn/northd/ovn-northd.8.xml | 52 +++++++++++++++++++++++++++++++++++++++------
>  1 file changed, 46 insertions(+), 6 deletions(-)
>
> diff --git a/ovn/northd/ovn-northd.8.xml b/ovn/northd/ovn-northd.8.xml
> index 77eb3d1..5ac351d 100644
> --- a/ovn/northd/ovn-northd.8.xml
> +++ b/ovn/northd/ovn-northd.8.xml
> @@ -415,20 +415,60 @@
>      <h3>Ingress Table 9: ARP/ND responder</h3>
>
>      <p>
> -      This table implements ARP/ND responder for known IPs.

To avoid confusion with the lrouter ARP responder, it is better to be: this
table implements ARP/ND responder in *logical switch*, for known IPs.
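
(For orientation, the priority-50 logical switch ARP responder flows under
discussion take roughly this form, paraphrased from the flow description
later in ovn-northd.8.xml, where A is a known IP address and E its
corresponding Ethernet address:)

```
match:   arp.tpa == A && arp.op == 1        /* ARP request for A */
actions: eth.dst = eth.src;  eth.src = E;
         arp.op = 2;                        /* convert to ARP reply */
         arp.tha = arp.sha;  arp.sha = E;
         arp.tpa = arp.spa;  arp.spa = A;
         outport = inport;
         flags.loopback = 1;
         output;
```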

> +      ARP requests received by multiple hypervisors, as in the case of
> +      <code>localnet</code> and <code>vtep</code> logical inports need
> +      to skip these logical switch ARP responders;  the reason being
> +      that northd downloads the same mac binding rules to all hypervisors
> +      and all hypervisors will receive the ARP request from the external
> +      network and respond.  This will confuse L2 learning on the source
> +      of the ARP requests.  These skip rules are mentioned under
> +      priority-100 flows.

Suggest avoiding this detail here, and just describing it in the
priority-100 flow section. Too much detail here adds redundancy and is not
helpful to overall understanding. For example, there is also another case
to skip, which is when the IP is owned by the inport, and it is described
separately already.
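
(As a sketch of what those skip rules look like: for each localnet or vtep
port, northd installs a priority-100 flow that bypasses the responder
stage; the port name here is a placeholder:)

```
priority 100:
  match:   inport == "<localnet-or-vtep-port>"
  actions: next;    /* skip the ARP/ND responder stage */
```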

> +      ARP requests arrive from VMs with a logical
> +      switch inport type of type empty, which is the default.  For this
> +      case,

This part of the text is correct, but I feel it doesn't help understanding
and instead adds confusion, so I suggest removing it.

> +      If this logical switch router type port does not have an
> +      IP address configured, ARP requests will hit another ARP responder
> +      on the logical router datapath itself, which is most commonly a
> +      distributed logical router.  The advantage of using the logical
> +      switch proxy ARP rule for logical router ports is that this rule
> +      is hit before the logical switch L2 broadcast rule.  This means
> +      the ARP request is not broadcast on this logical switch.

How about:

If this logical switch router type port does not have an IP address
configured, although the request will still be responded by the ARP
responder on the logical router datapath, the ARP request will be broadcast
on the logical switch.

> +      Note that ARP requests received from
> +      <code>localnet</code> or <code>vtep</code> logical inports can
> +      either go directly to VMs, in which case the VM responds or can
> +      hit an ARP responder for a logical router port if the packet is
> +      used to resolve a logical router port next hop address.

This part of the text is somewhat redundant with the above. If this is to
be emphasised, we may put it under the section for the router ARP responder
flows description, to explain why we still need an ARP responder on the
router datapath.

> +        The inport being of type
> +        <code>router</code> has no known use case for these ARP
> +        responders.  However, no skip flows are installed for these
> +        packets, as there would be some additional flow cost for this
> +        and the value appears limited.

Maybe we don't even need to mention this if we don't want to skip this kind
of packet. If there is no such case, it means we don't need to add flows to
skip it; if there is such a case, but we want the ARP responder to work for
that case, we don't need to add flows to skip it either.


Acked-by: Han Zhou <zhouhan@gmail.com>
Mickey Spiegel Oct. 10, 2016, 7:27 a.m. UTC | #2
This is getting close. Some rewording suggestions below.

On Thu, Oct 6, 2016 at 10:34 AM, Darrell Ball <dlu998@gmail.com> wrote:

> There has been enough confusion regarding logical switch datapath
> arp responders in ovn to warrant some additional comments;
> hence add a general description regarding why they exist and
> document the special cases.
>
> Signed-off-by: Darrell Ball <dlu998@gmail.com>
> ---
>
> Note this patch is meant to be merge with the code change for vtep
> inport handling here.
> https://patchwork.ozlabs.org/patch/675796/
>
> v3->v4: Capitalization fixes.
>         Reinstate comment regarding L2 learning confusion.
>
> v2->v3: Reword and further elaborate.
>
> v1->v2: Dropped RFC code change for logical switch router
>         type ports
>
>  ovn/northd/ovn-northd.8.xml | 52 +++++++++++++++++++++++++++++++++++++++------
>  1 file changed, 46 insertions(+), 6 deletions(-)
>
> diff --git a/ovn/northd/ovn-northd.8.xml b/ovn/northd/ovn-northd.8.xml
> index 77eb3d1..5ac351d 100644
> --- a/ovn/northd/ovn-northd.8.xml
> +++ b/ovn/northd/ovn-northd.8.xml
> @@ -415,20 +415,60 @@
>      <h3>Ingress Table 9: ARP/ND responder</h3>
>
>      <p>
> -      This table implements ARP/ND responder for known IPs.  It contains
> these
> -      logical flows:
> +      This table implements ARP/ND responder for known IPs.


I agree with Han Zhou that mentioning logical switch explicitly helps
clarify things.

> +      The advantage
> +      of the ARP responder flow is to limit ARP broadcasts by locally
> +      responding to ARP requests without the need to send to other
> +      hypervisors.  One common case is when the inport is a logical
> +      port associated with a VIF and the broadcast is responded to on the
> +      local hypervisor rather than broadcast across the whole network and
> +      responded to by the destination VM.  This behavior is proxy ARP.
>

Agree up to this point. Wondering if there should be multiple paragraphs,
with the text above being the first paragraph.


> +      ARP requests received by multiple hypervisors, as in the case of
> +      <code>localnet</code> and <code>vtep</code> logical inports need
> +      to skip these logical switch ARP responders;  the reason being
> +      that northd downloads the same mac binding rules to all hypervisors
> +      and all hypervisors will receive the ARP request from the external
> +      network and respond.  This will confuse L2 learning on the source
> +      of the ARP requests.  These skip rules are mentioned under
> +      priority-100 flows.


I am OK with Han Zhou's suggestion to move this description to the
priority-100 flows themselves. If you do that, then the l2 gateway text
below should also move to the priority-100 flows.


> +      ARP requests arrive from VMs with a logical
> +      switch inport type of type empty, which is the default.  For this
> +      case, the logical switch proxy ARP rules can be for other VMs or
> +      a logical router port.


Suggest to replace the above two lines with something generic like:

      Logical switch proxy ARP rules may be programmed both for IP
      addresses on logical switch VIF ports (of type empty, which is the
      default, representing connectivity to VMs or containers), and for IP
      addresses on logical switch router ports.  Note that it is common
      for ARP requests to be received on one type of port (e.g. of type
      empty, from a VM) for an IP address that resides on a different
      type of port (e.g. of type router).

> +      In order to support proxy ARP for logical
> +      router ports, an IP address must be configured on the logical
> +      switch router type port, with the same value as the peer of the
> +      logical router port.  The configured MAC addresses must match as
> +      well.


Agree with the text above.
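
(A minimal configuration sketch of the requirement described above, using
hypothetical names; the router port's MAC and IP are mirrored onto the
logical switch router type port via its addresses field:)

```
# Logical router port with MAC 00:00:00:00:01:01 and IP 192.168.1.1/24.
ovn-nbctl lrp-add lr0 lrp0 00:00:00:00:01:01 192.168.1.1/24

# Peer logical switch port of type router, carrying the same MAC and IP
# so that the logical switch proxy ARP flows can be generated for it.
ovn-nbctl lsp-add ls0 lsp-lrp0
ovn-nbctl lsp-set-type lsp-lrp0 router
ovn-nbctl lsp-set-options lsp-lrp0 router-port=lrp0
ovn-nbctl lsp-set-addresses lsp-lrp0 "00:00:00:00:01:01 192.168.1.1"
```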

+      If this logical switch router type port does not have an
> +      IP address configured, ARP requests will hit another ARP responder
> +      on the logical router datapath itself, which is most commonly a
> +      distributed logical router.  The advantage of using the logical
> +      switch proxy ARP rule for logical router ports is that this rule
> +      is hit before the logical switch L2 broadcast rule.  This means
> +      the ARP request is not broadcast on this logical switch.


Han Zhou suggested:

If this logical switch router type port does not have an IP address
configured, although the request will still be responded by the ARP
responder on the logical router datapath, the ARP request will be broadcast
on the logical switch.

My proposal, building on top of the original text and Han's proposal:

If this logical switch router type port does not have an IP address
configured, the ARP request will be broadcast on the logical switch.
One of the copies of the ARP request will go through the logical
switch router type port to the logical router datapath, where the
logical router ARP responder will generate a reply.

> +      Logical
> +      switch ARP responder proxy ARP rules can also be hit when
> +      receiving ARP requests externally on a L2 gateway port.  In this
> +      case, the hypervisor acting as an L2 gateway, responds to the ARP
> +      request on behalf of a VM.  Note that ARP requests received from
> +      <code>localnet</code> or <code>vtep</code> logical inports can
> +      either go directly to VMs, in which case the VM responds or can
> +      hit an ARP responder for a logical router port if the packet is
> +      used to resolve a logical router port next hop address.
>

If the priority-100 text near the top moves to the priority-100 bullet
below, then the above text should also move to the priority-100 bullet.


> +      It contains these logical flows:
>      </p>
>
>      <ul>
>        <li>
> -        Priority-100 flows to skip ARP responder if inport is of type
> -        <code>localnet</code>, and advances directly to the next table.
> +        Priority-100 flows to skip the ARP responder if inport is
> +        of type <code>localnet</code> or <code>vtep</code> and
> +        advances directly to the next table.


If text is moved from the top section above, it would go here.

> +        The inport being of type
> +        <code>router</code> has no known use case for these ARP
> +        responders.  However, no skip flows are installed for these
> +        packets, as there would be some additional flow cost for this
> +        and the value appears limited.
>

This would be easier to follow with more explanation.  How about:

        ARP requests received on an inport of type <code>router</code>
        are not expected to hit any logical switch ARP responder flows,
        since the logical router ARP resolve stage already includes all
        corresponding IP address to MAC address flows.  However, no
        skip flows are installed for these packets, as there would be
        some additional flow cost for this and the value appears limited.

Mickey

       </li>
>
>        <li>
>          <p>
>            Priority-50 flows that match ARP requests to each known IP
> address
> -          <var>A</var> of every logical router port, and respond with ARP
> +          <var>A</var> of every logical switch port, and respond with ARP
>            replies directly with corresponding Ethernet address
> <var>E</var>:
>          </p>
>
> @@ -455,7 +495,7 @@ output;
>          <p>
>            Priority-50 flows that match IPv6 ND neighbor solicitations to
>            each known IP address <var>A</var> (and <var>A</var>'s
> -          solicited node address) of every logical router port, and
> +          solicited node address) of every logical switch port, and
>            respond with neighbor advertisements directly with
>            corresponding Ethernet address <var>E</var>:
>          </p>
> --
> 1.9.1
>
> _______________________________________________
> dev mailing list
> dev@openvswitch.org
> http://openvswitch.org/mailman/listinfo/dev
>
Darrell Ball Oct. 10, 2016, 10:40 p.m. UTC | #3
On Fri, Oct 7, 2016 at 5:25 PM, Han Zhou <zhouhan@gmail.com> wrote:

> Overall it looks good to me. Just suggestions for rewording.
>
> On Thu, Oct 6, 2016 at 10:34 AM, Darrell Ball <dlu998@gmail.com> wrote:
> >
> > There has been enough confusion regarding logical switch datapath
> > arp responders in ovn to warrant some additional comments;
> > hence add a general description regarding why they exist and
> > document the special cases.
> >
> > Signed-off-by: Darrell Ball <dlu998@gmail.com>
> > ---
> >
> > Note this patch is meant to be merge with the code change for vtep
> > inport handling here.
> > https://patchwork.ozlabs.org/patch/675796/
> >
> > v3->v4: Capitalization fixes.
> >         Reinstate comment regarding L2 learning confusion.
> >
> > v2->v3: Reword and further elaborate.
> >
> > v1->v2: Dropped RFC code change for logical switch router
> >         type ports
> >
> >  ovn/northd/ovn-northd.8.xml | 52 +++++++++++++++++++++++++++++++++++++++------
> >  1 file changed, 46 insertions(+), 6 deletions(-)
> >
> > diff --git a/ovn/northd/ovn-northd.8.xml b/ovn/northd/ovn-northd.8.xml
> > index 77eb3d1..5ac351d 100644
> > --- a/ovn/northd/ovn-northd.8.xml
> > +++ b/ovn/northd/ovn-northd.8.xml
> > @@ -415,20 +415,60 @@
> >      <h3>Ingress Table 9: ARP/ND responder</h3>
> >
> >      <p>
> > -      This table implements ARP/ND responder for known IPs.
>
> To avoid confusing with lrouter ARP responder, it is better to be: this
> table implements ARP/ND responder in *logical switch*, for known IPs.
>



This section is under
"Logical Switch Datapaths",
so the context should be clear that it refers to "Logical Switch",
but if you are confused, I can add a redundant reference to Logical Switches
- it is just a few words.



>
> > +      ARP requests received by multiple hypervisors, as in the case of
> > +      <code>localnet</code> and <code>vtep</code> logical inports need
> > +      to skip these logical switch ARP responders;  the reason being
> > +      that northd downloads the same mac binding rules to all hypervisors
> > +      and all hypervisors will receive the ARP request from the external
> > +      network and respond.  This will confuse L2 learning on the source
> > +      of the ARP requests.  These skip rules are mentioned under
> > +      priority-100 flows.
>
> Suggest to avoid this detail here, but just describe in the priority-100
> flow section. Too much details here adds redundancy and not helpful to
> overall understanding. For example, there is also another case to skip
> which is when IP is owned by the inport, and it is described separately
> already.
>

I am fine with moving it back to the priority-100 flow sub-section.



>
> > +      ARP requests arrive from VMs with a logical
> > +      switch inport type of type empty, which is the default.  For this
> > +      case,
>
> This part of text is correct, but I feel it doesn't help understanding but
> add confusion, so suggest to remove.
>

I would like to keep it as I found it useful; I will remove the "empty" and
keep default for type.



>
> > +      If this logical switch router type port does not have an
> > +      IP address configured, ARP requests will hit another ARP responder
> > +      on the logical router datapath itself, which is most commonly a
> > +      distributed logical router.  The advantage of using the logical
> > +      switch proxy ARP rule for logical router ports is that this rule
> > +      is hit before the logical switch L2 broadcast rule.  This means
> > +      the ARP request is not broadcast on this logical switch.
>
> How about:
>
> If this logical switch router type port does not have an IP address
> configured, although the request will still be responded by the ARP
> responder on the logical router datapath, the ARP request will be broadcast
> on the logical switch.
>

I can reword the concept folding in part of your shorter wording.


>
> > +      Note that ARP requests received from
> > +      <code>localnet</code> or <code>vtep</code> logical inports can
> > +      either go directly to VMs, in which case the VM responds or can
> > +      hit an ARP responder for a logical router port if the packet is
> > +      used to resolve a logical router port next hop address.
>
> This part of text is somehow redundant with the above. If this is to be
> emphasised, we may put it under the section for router ARP responder flows
> description, to explain why we still need ARP responder for on router
> datapath.
>

I am ok to omit this part for now. I thought it was obvious - I only had
added it
to clarify a misleading feedback I had previously received to set that
straight.



>
> > +        The inport being of type
> > +        <code>router</code> has no known use case for these ARP
> > +        responders.  However, no skip flows are installed for these
> > +        packets, as there would be some additional flow cost for this
> > +        and the value appears limited.
>
> Maybe we don't even need to mention this if we don't want to skip this
> kind of packet. If there is no such case, it means we don't need to add
> flows to skip it; if there is such case, but we want the ARP responder to
> work for that case, we don't need to a flows to skip either.
>


Since I plan to revisit the logical switch router type arp responders in
general
I think we can skip this for now.



>
>
> Acked-by: Han Zhou <zhouhan@gmail.com>
>
Darrell Ball Oct. 11, 2016, 2:21 a.m. UTC | #4
On Mon, Oct 10, 2016 at 12:27 AM, Mickey Spiegel <mickeys.dev@gmail.com>
wrote:

> This is getting close. Some rewording suggestions below.
>
> On Thu, Oct 6, 2016 at 10:34 AM, Darrell Ball <dlu998@gmail.com> wrote:
>
>> There has been enough confusion regarding logical switch datapath
>> arp responders in ovn to warrant some additional comments;
>> hence add a general description regarding why they exist and
>> document the special cases.
>>
>> Signed-off-by: Darrell Ball <dlu998@gmail.com>
>> ---
>>
>> Note this patch is meant to be merge with the code change for vtep
>> inport handling here.
>> https://patchwork.ozlabs.org/patch/675796/
>>
>> v3->v4: Capitalization fixes.
>>         Reinstate comment regarding L2 learning confusion.
>>
>> v2->v3: Reword and further elaborate.
>>
>> v1->v2: Dropped RFC code change for logical switch router
>>         type ports
>>
>>  ovn/northd/ovn-northd.8.xml | 52 +++++++++++++++++++++++++++++++++++++++------
>>  1 file changed, 46 insertions(+), 6 deletions(-)
>>
>> diff --git a/ovn/northd/ovn-northd.8.xml b/ovn/northd/ovn-northd.8.xml
>> index 77eb3d1..5ac351d 100644
>> --- a/ovn/northd/ovn-northd.8.xml
>> +++ b/ovn/northd/ovn-northd.8.xml
>> @@ -415,20 +415,60 @@
>>      <h3>Ingress Table 9: ARP/ND responder</h3>
>>
>>      <p>
>> -      This table implements ARP/ND responder for known IPs.  It contains
>> these
>> -      logical flows:
>> +      This table implements ARP/ND responder for known IPs.
>
>
> I agree with Han Zhou that mentioning logical switch explicitly helps
> clarify things.
>


We three are fine with this.



>
>> +      The advantage
>> +      of the ARP responder flow is to limit ARP broadcasts by locally
>> +      responding to ARP requests without the need to send to other
>> +      hypervisors.  One common case is when the inport is a logical
>> +      port associated with a VIF and the broadcast is responded to on the
>> +      local hypervisor rather than broadcast across the whole network and
>> +      responded to by the destination VM.  This behavior is proxy ARP.
>>
>
> Agree up to this point. Wondering if there should be multiple paragraphs,
> with the text above being the first paragraph.
>

I was thinking same - done.



>
>
>> +      ARP requests received by multiple hypervisors, as in the case of
>> +      <code>localnet</code> and <code>vtep</code> logical inports need
>> +      to skip these logical switch ARP responders;  the reason being
>> +      that northd downloads the same mac binding rules to all hypervisors
>> +      and all hypervisors will receive the ARP request from the external
>> +      network and respond.  This will confuse L2 learning on the source
>> +      of the ARP requests.  These skip rules are mentioned under
>> +      priority-100 flows.
>
>
> I am OK with Han Zhou's suggestion to move this description to the
> priority-100 flows themselves. If you do that, then the l2 gateway text
> below should also move to the priority-100 flows.
>

priority-100 flows are skip flows; we are not skipping l2gateway
inport types, so I think we can leave the l2gateway references where they
are.



>
>
>> +      ARP requests arrive from VMs with a logical
>> +      switch inport type of type empty, which is the default.  For this
>> +      case, the logical switch proxy ARP rules can be for other VMs or
>> +      a logical router port.
>
>
> Suggest to replace the above two lines with something generic like:
>
>       Logical switch proxy ARP rules may be programmed both for IP
>       addresses on logical switch VIF ports (of type empty, which is the
>       default, representing connectivity to VMs or containers), and for IP
>       addresses on logical switch router ports.
>

I see you added IP address - I would have hoped that was obvious context,
but I can add
it. We have slightly different wording and Han does not even like the
paragraph.
Let us compromise on:

Logical switch proxy ARP rules may be programmed both for binding IP
addresses on other logical switch VIF ports (which are of the default
logical switch port type, representing connectivity to VMs or containers),
and for binding IP addresses on logical switch router type ports,
representing their logical router port peers.



> Note that it is common
>       for ARP requests to be received on one type of port (e.g. of type
>       empty, from a VM) for an IP address that resides on a different
>       type of port (e.g. of type router).
>

I am going to skip this part - I think it is hard to understand.
By the way, logical switch router type arp responders are on my
hit list - I have checked with some NSX folks and they agree with me that
there is no real optimization here for the reasons I mentioned before.
I intend to follow up separately regarding this.



>
>> +      In order to support proxy ARP for logical
>> +      router ports, an IP address must be configured on the logical
>> +      switch router type port, with the same value as the peer of the
>> +      logical router port.  The configured MAC addresses must match as
>> +      well.
>
>
> Agree with the text above.
>
>> +      If this logical switch router type port does not have an
>> +      IP address configured, ARP requests will hit another ARP responder
>> +      on the logical router datapath itself, which is most commonly a
>> +      distributed logical router.  The advantage of using the logical
>> +      switch proxy ARP rule for logical router ports is that this rule
>> +      is hit before the logical switch L2 broadcast rule.  This means
>> +      the ARP request is not broadcast on this logical switch.
>
>
> Han Zhou suggested:
>
> If this logical switch router type port does not have an IP address
> configured, although the request will still be responded by the ARP
> responder on the logical router datapath, the ARP request will be broadcast
> on the logical switch.
>
> My proposal, building on top of the original text and Han's proposal:
>
> If this logical switch router type port does not have an IP address
> configured, the ARP request will be broadcast on the logical switch.
> One of the copies of the ARP request will go through the logical
> switch router type port to the logical router datapath, where the
> logical router ARP responder will generate a reply.
>

I am fine with this - more wordy but even first year art students
will understand it.



>
>> +      Logical
>> +      switch ARP responder proxy ARP rules can also be hit when
>> +      receiving ARP requests externally on a L2 gateway port.  In this
>> +      case, the hypervisor acting as an L2 gateway, responds to the ARP
>> +      request on behalf of a VM.  Note that ARP requests received from
>> +      <code>localnet</code> or <code>vtep</code> logical inports can
>> +      either go directly to VMs, in which case the VM responds or can
>> +      hit an ARP responder for a logical router port if the packet is
>> +      used to resolve a logical router port next hop address.
>>
>
> If the priority-100 text near the top moves to the priority-100 bullet
> below, then the above text should also move to the priority-100 bullet.
>

I can move the skip flow (below) part back to priority-100; I will leave
the l2gateway
blurb where it is. I think Han will need to live with the below sentence,
since we both
are leaning in favor of it.

+     Note that ARP requests received from
+      <code>localnet</code> or <code>vtep</code> logical inports can
+      either go directly to VMs, in which case the VM responds or can
+      hit an ARP responder for a logical router port if the packet is
+      used to resolve a logical router port next hop address.



>
>> +      It contains these logical flows:
>>      </p>
>>
>>      <ul>
>>        <li>
>> -        Priority-100 flows to skip ARP responder if inport is of type
>> -        <code>localnet</code>, and advances directly to the next table.
>> +        Priority-100 flows to skip the ARP responder if inport is
>> +        of type <code>localnet</code> or <code>vtep</code> and
>> +        advances directly to the next table.
>
>
> If text is moved from the top section above, it would go here.
>

Good idea



>
>> +        The inport being of type
>> +        <code>router</code> has no known use case for these ARP
>> +        responders.  However, no skip flows are installed for these
>> +        packets, as there would be some additional flow cost for this
>> +        and the value appears limited.
>>
>
> This would be easier to follow with more explanation.  How about:
>
>         ARP requests received on an inport of type <code>router</code>
>         are not expected to hit any logical switch ARP responder flows,
>         since the logical router ARP resolve stage already includes all
>         corresponding IP address to MAC address flows.  However, no
>         skip flows are installed for these packets, as there would be
>         some additional flow cost for this and the value appears limited.
>

I will omit this sentence fragment:

 "since the logical router ARP resolve stage already includes all
        corresponding IP address to MAC address flows".

because the reason is related to the OVN pipeline (as it exists today); it
is not an intentional flow optimization.

Let me go with the resulting merged version:

ARP requests received on an inport of type <code>router</code>
are not expected to hit any logical switch ARP responder flows.
However, no skip flows are installed for these packets, as there would be
some additional flow cost for this and the value appears limited.




>
> Mickey
>
>        </li>
>>
>>        <li>
>>          <p>
>>            Priority-50 flows that match ARP requests to each known IP
>> address
>> -          <var>A</var> of every logical router port, and respond with ARP
>> +          <var>A</var> of every logical switch port, and respond with ARP
>>            replies directly with corresponding Ethernet address
>> <var>E</var>:
>>          </p>
>>
>> @@ -455,7 +495,7 @@ output;
>>          <p>
>>            Priority-50 flows that match IPv6 ND neighbor solicitations to
>>            each known IP address <var>A</var> (and <var>A</var>'s
>> -          solicited node address) of every logical router port, and
>> +          solicited node address) of every logical switch port, and
>>            respond with neighbor advertisements directly with
>>            corresponding Ethernet address <var>E</var>:
>>          </p>
>> --
>> 1.9.1
>>
>> _______________________________________________
>> dev mailing list
>> dev@openvswitch.org
>> http://openvswitch.org/mailman/listinfo/dev
>>
>
>

Patch

diff --git a/ovn/northd/ovn-northd.8.xml b/ovn/northd/ovn-northd.8.xml
index 77eb3d1..5ac351d 100644
--- a/ovn/northd/ovn-northd.8.xml
+++ b/ovn/northd/ovn-northd.8.xml
@@ -415,20 +415,60 @@ 
     <h3>Ingress Table 9: ARP/ND responder</h3>
 
     <p>
-      This table implements ARP/ND responder for known IPs.  It contains these
-      logical flows:
+      This table implements ARP/ND responder for known IPs.  The advantage
+      of the ARP responder flow is to limit ARP broadcasts by locally
+      responding to ARP requests without the need to send to other
+      hypervisors.  One common case is when the inport is a logical
+      port associated with a VIF and the broadcast is responded to on the
+      local hypervisor rather than broadcast across the whole network and
+      responded to by the destination VM.  This behavior is proxy ARP.
+      ARP requests received by multiple hypervisors, as in the case of
+      <code>localnet</code> and <code>vtep</code> logical inports need
+      to skip these logical switch ARP responders;  the reason being
+      that northd downloads the same mac binding rules to all hypervisors
+      and all hypervisors will receive the ARP request from the external
+      network and respond.  This will confuse L2 learning on the source
+      of the ARP requests.  These skip rules are mentioned under
+      priority-100 flows.  ARP requests arrive from VMs with a logical
+      switch inport type of type empty, which is the default.  For this
+      case, the logical switch proxy ARP rules can be for other VMs or
+      a logical router port.  In order to support proxy ARP for logical
+      router ports, an IP address must be configured on the logical
+      switch router type port, with the same value as the peer of the
+      logical router port.  The configured MAC addresses must match as
+      well.  If this logical switch router type port does not have an
+      IP address configured, ARP requests will hit another ARP responder
+      on the logical router datapath itself, which is most commonly a
+      distributed logical router.  The advantage of using the logical
+      switch proxy ARP rule for logical router ports is that this rule
+      is hit before the logical switch L2 broadcast rule.  This means
+      the ARP request is not broadcast on this logical switch.  Logical
+      switch ARP responder proxy ARP rules can also be hit when
+      receiving ARP requests externally on a L2 gateway port.  In this
+      case, the hypervisor acting as an L2 gateway, responds to the ARP
+      request on behalf of a VM.  Note that ARP requests received from
+      <code>localnet</code> or <code>vtep</code> logical inports can
+      either go directly to VMs, in which case the VM responds or can
+      hit an ARP responder for a logical router port if the packet is
+      used to resolve a logical router port next hop address.
+      It contains these logical flows:
     </p>
 
     <ul>
       <li>
-        Priority-100 flows to skip ARP responder if inport is of type
-        <code>localnet</code>, and advances directly to the next table.
+        Priority-100 flows to skip the ARP responder if inport is
+        of type <code>localnet</code> or <code>vtep</code> and
+        advances directly to the next table.  The inport being of type
+        <code>router</code> has no known use case for these ARP
+        responders.  However, no skip flows are installed for these
+        packets, as there would be some additional flow cost for this
+        and the value appears limited.
       </li>
 
       <li>
         <p>
           Priority-50 flows that match ARP requests to each known IP address
-          <var>A</var> of every logical router port, and respond with ARP
+          <var>A</var> of every logical switch port, and respond with ARP
           replies directly with corresponding Ethernet address <var>E</var>:
         </p>
 
@@ -455,7 +495,7 @@  output;
         <p>
           Priority-50 flows that match IPv6 ND neighbor solicitations to
           each known IP address <var>A</var> (and <var>A</var>'s
-          solicited node address) of every logical router port, and
+          solicited node address) of every logical switch port, and
           respond with neighbor advertisements directly with
           corresponding Ethernet address <var>E</var>:
         </p>