diff mbox

[ovs-dev,v2] ovn-vtep: fix arping from vtep-gw physical port

Message ID 1474574825-27993-1-git-send-email-ramu.ramamurthy@us.ibm.com
State Superseded
Headers show

Commit Message

Ramu Ramamurthy Sept. 22, 2016, 8:07 p.m. UTC
Currently, arping from a vtep-gw physical-switch port to
a VIF IP address does not work.

When a physical-switch-port arps for an IP address
of a VIF, that arp packet comes into the VIF hypervisor via a
vxlan tunnel. That arp packet must not be responded-to by the
arp responder table because, potentially, multiple hypervisors
could independently respond.

Signed-off-by: Ramu Ramamurthy <ramu.ramamurthy@us.ibm.com>
---
 ovn/northd/ovn-northd.8.xml | 4 ++--
 ovn/northd/ovn-northd.c     | 4 +++-
 2 files changed, 5 insertions(+), 3 deletions(-)

Comments

Darrell Ball Sept. 23, 2016, 7:34 p.m. UTC | #1
On Thu, Sep 22, 2016 at 1:07 PM, Ramu Ramamurthy <ramu.ramamurthy@gmail.com>
wrote:

> Currently, arping from a vtep-gw physical-switch port to
> a VIF IP address does not work.
>
> When a physical-switch-port arps for an IP address
> of a VIF, that arp packet comes into the VIF hypervisor via a
> vxlan tunnel. That arp packet must not be responded-to by the
> arp responder table because, potentially, multiple hypervisors
> could independently respond.
>
> Signed-off-by: Ramu Ramamurthy <ramu.ramamurthy@us.ibm.com>
> ---
>  ovn/northd/ovn-northd.8.xml | 4 ++--
>  ovn/northd/ovn-northd.c     | 4 +++-
>  2 files changed, 5 insertions(+), 3 deletions(-)
>
> diff --git a/ovn/northd/ovn-northd.8.xml b/ovn/northd/ovn-northd.8.xml
> index 1da633a..3b885f9 100644
> --- a/ovn/northd/ovn-northd.8.xml
> +++ b/ovn/northd/ovn-northd.8.xml
> @@ -421,8 +421,8 @@
>
>      <ul>
>        <li>
> -        Priority-100 flows to skip ARP responder if inport is of type
> -        <code>localnet</code>, and advances directly to the next table.
> +        Priority-100 flows to skip ARP responder if inport is not
> associated
> +        with a local VIF, and advances directly to the next table.
>        </li>
>
>        <li>
> diff --git a/ovn/northd/ovn-northd.c b/ovn/northd/ovn-northd.c
> index 2cbbb47..d752589 100644
> --- a/ovn/northd/ovn-northd.c
> +++ b/ovn/northd/ovn-northd.c
> @@ -2683,7 +2683,9 @@ build_lswitch_flows(struct hmap *datapaths, struct
> hmap *ports,
>              continue;
>          }
>
> -        if (!strcmp(op->nbsp->type, "localnet")) {
> +        /* Skip installing arp responder if the logical switch inport
>

Pls remove the word "installing" as here; we are just skipping looking at
an arp responder flow. I had removed the word "installing" in a later
suggestion.



> +         * is not associated with a local VIF. */
> +        if (strcmp(op->nbsp->type, "")) {
>

I debated whether it made sense to keep the l2gateway case for arp
responder for north->south traffic. The use case is limited, however,
it seems like it is worth keeping for now, although I did not verify it.
Did you check that case ?

If we keep the l2gateway case, it becomes


-        if (!strcmp(op->nbsp->type, "localnet")) {
+        /* Skip arp responder if the logical switch inport is not
+         * associated with a local VIF or a l2gateway port */
+        if ((strcmp(op->nbsp->type, "")) &&
+            (strcmp(op->nbsp->type, "l2gateway"))) {
             ds_clear(&match);
             ds_put_format(&match, "inport == %s", op->json_key);
             ovn_lflow_add(lflows, op->od, S_SWITCH_IN_ARP_ND_RSP, 100,

plus corresponding xml file changes


             ds_clear(&match);
>              ds_put_format(&match, "inport == %s", op->json_key);
>              ovn_lflow_add(lflows, op->od, S_SWITCH_IN_ARP_ND_RSP, 100,
> --
> 1.9.1
>
>
Ramu Ramamurthy Sept. 26, 2016, 6:24 p.m. UTC | #2
-        if (!strcmp(op->nbsp->type, "localnet")) {
+        /* Skip arp responder if the logical switch inport is not
+         * associated with a local VIF or a l2gateway port */
+        if ((strcmp(op->nbsp->type, "")) &&
+            (strcmp(op->nbsp->type, "l2gateway"))) {
             ds_clear(&match);
             ds_put_format(&match, "inport == %s", op->json_key);
             ovn_lflow_add(lflows, op->od, S_SWITCH_IN_ARP_ND_RSP, 100,

The logical switch port of type "router" needs to have an
arp-responder entry for VIFs to get the
mac of the gateway port, but the above skips it and would introduce a
regression.

I would suggest a) that the code is simpler if we explicitly list the
two known port-types for which we want to
skip the arp-responder b) keep every other port-type as-is with
respect to the arp-responder (and not make
code changes for them in this commit). Please let me know what you
think of this..

-        if (!strcmp(op->nbsp->type, "localnet")) {
+        if (!strcmp(op->nbsp->type, "localnet")
+            || !strcmp(op->nbsp->type, "vtep")) {
             ds_clear(&match);
             ds_put_format(&match, "inport == %s", op->json_key);
             ovn_lflow_add(lflows, op->od, S_SWITCH_IN_ARP_ND_RSP, 100,
Darrell Ball Sept. 26, 2016, 9 p.m. UTC | #3
On Mon, Sep 26, 2016 at 11:24 AM, Ramu Ramamurthy <ramu.ramamurthy@gmail.com
> wrote:

> -        if (!strcmp(op->nbsp->type, "localnet")) {
> +        /* Skip arp responder if the logical switch inport is not
> +         * associated with a local VIF or a l2gateway port */
> +        if ((strcmp(op->nbsp->type, "")) &&
> +            (strcmp(op->nbsp->type, "l2gateway"))) {
>              ds_clear(&match);
>              ds_put_format(&match, "inport == %s", op->json_key);
>              ovn_lflow_add(lflows, op->od, S_SWITCH_IN_ARP_ND_RSP, 100,
>
> The logical switch port of type "router" needs to have an
> arp-responder entry for VIFs to get the
> mac of the gateway port, but the above skips it and would introduce a
> regression.
>

Thanks for fully testing these cases.

The OVN system tests, including NAT that I ran all pass when skipping
the arp responder for "LS router type" inports. I see we don't emulate
everything, however, so that is expected.

After a second set of discussions :-), it is agreed that the L3 gateway
support
using transit logical switch datapath in between distributed logical router
and gateway router would hit this case for south -> north traffic.
If that is what you see ? then we agree.

Please add some comments regarding arp responders to the xml file and code
to document what we discussed and found by testing, especially since there
is some special code paths involved.



>
> I would suggest a) that the code is simpler if we explicitly list the
> two known port-types for which we want to
> skip the arp-responder b) keep every other port-type as-is with
> respect to the arp-responder (and not make
> code changes for them in this commit). Please let me know what you
> think of this..
>
> -        if (!strcmp(op->nbsp->type, "localnet")) {
> +        if (!strcmp(op->nbsp->type, "localnet")
> +            || !strcmp(op->nbsp->type, "vtep")) {
>              ds_clear(&match);
>              ds_put_format(&match, "inport == %s", op->json_key);
>              ovn_lflow_add(lflows, op->od, S_SWITCH_IN_ARP_ND_RSP, 100,
>
Darrell Ball Sept. 30, 2016, 2:19 a.m. UTC | #4
On Mon, Sep 26, 2016 at 2:00 PM, Darrell Ball <dlu998@gmail.com> wrote:

>
>
> On Mon, Sep 26, 2016 at 11:24 AM, Ramu Ramamurthy <
> ramu.ramamurthy@gmail.com> wrote:
>
>> -        if (!strcmp(op->nbsp->type, "localnet")) {
>> +        /* Skip arp responder if the logical switch inport is not
>> +         * associated with a local VIF or a l2gateway port */
>> +        if ((strcmp(op->nbsp->type, "")) &&
>> +            (strcmp(op->nbsp->type, "l2gateway"))) {
>>              ds_clear(&match);
>>              ds_put_format(&match, "inport == %s", op->json_key);
>>              ovn_lflow_add(lflows, op->od, S_SWITCH_IN_ARP_ND_RSP, 100,
>>
>> The logical switch port of type "router" needs to have an
>> arp-responder entry for VIFs to get the
>> mac of the gateway port, but the above skips it and would introduce a
>> regression.
>>
>

Hello Ramu

Can you describe your configuration for this test failure when logical
switch
arp responders are skipped for logical switch "router type" ports  ?
I know the existing OVN tests (both system and non-system) pass either way.

Thanks Darrell







>
> Thanks for fully testing these cases.
>
> The OVN system tests, including NAT that I ran all pass when skipping
> the arp responder for "LS router type" inports. I see we don't emulate
> everything, however, so that is expected.
>
> After a second set of discussions :-), it is agreed that the L3 gateway
> support
> using transit logical switch datapath in between distributed logical router
> and gateway router would hit this case for south -> north traffic.
> If that is what you see ? then we agree.
>
> Please add some comments regarding arp responders to the xml file and code
> to document what we discussed and found by testing, especially since there
> is some special code paths involved.
>
>
>
>>
>> I would suggest a) that the code is simpler if we explicitly list the
>> two known port-types for which we want to
>> skip the arp-responder b) keep every other port-type as-is with
>> respect to the arp-responder (and not make
>> code changes for them in this commit). Please let me know what you
>> think of this..
>>
>> -        if (!strcmp(op->nbsp->type, "localnet")) {
>> +        if (!strcmp(op->nbsp->type, "localnet")
>> +            || !strcmp(op->nbsp->type, "vtep")) {
>>              ds_clear(&match);
>>              ds_put_format(&match, "inport == %s", op->json_key);
>>              ovn_lflow_add(lflows, op->od, S_SWITCH_IN_ARP_ND_RSP, 100,
>>
>
>
Ramu Ramamurthy Sept. 30, 2016, 2:31 a.m. UTC | #5
> Hello Ramu
>
> Can you describe your configuration for this test failure when logical
> switch
> arp responders are skipped for logical switch "router type" ports  ?
> I know the existing OVN tests (both system and non-system) pass either way.
>
> Thanks Darrell
>

There is no test that failed. But, I found that when a VM boots, and
gets dhcp-ip,
it pings the default-gw IP address, and for this it arps the
gateway-ip. That arp
hits the arp-responder flows for the router-port.
Darrell Ball Sept. 30, 2016, 3:26 a.m. UTC | #6
On Thu, Sep 29, 2016 at 7:31 PM, Ramu Ramamurthy <ramu.ramamurthy@gmail.com>
wrote:

> > Hello Ramu
> >
> > Can you describe your configuration for this test failure when logical
> > switch
> > arp responders are skipped for logical switch "router type" ports  ?
> > I know the existing OVN tests (both system and non-system) pass either
> way.
> >
> > Thanks Darrell
> >
>
> There is no test that failed. But, I found that when a VM boots, and
> gets dhcp-ip,
> it pings the default-gw IP address, and for this it arps the
> gateway-ip. That arp
> hits the arp-responder flows for the router-port.
>


There is a separate arp responder for logical-router-ports in the logical
router pipeline

        /* ARP reply.  These flows reply to ARP requests for the router's
own
         * IP address. */
        for (int i = 0; i < op->lrp_networks.n_ipv4_addrs; i++) {
            ds_clear(&match);
            ds_put_format(&match,
                          "inport == %s && arp.tpa == %s && arp.op == 1",
                          op->json_key,
op->lrp_networks.ipv4_addrs[i].addr_s);

            ds_clear(&actions);
            ds_put_format(&actions,
                "eth.dst = eth.src; "
                "eth.src = %s; "
                "arp.op = 2; /* ARP reply */ "
                "arp.tha = arp.sha; "
                "arp.sha = %s; "
                "arp.tpa = arp.spa; "
                "arp.spa = %s; "
                "outport = %s; "
                "flags.loopback = 1; "
                "output;",
                op->lrp_networks.ea_s,
                op->lrp_networks.ea_s,
                op->lrp_networks.ipv4_addrs[i].addr_s,
                op->json_key);
            ovn_lflow_add(lflows, op->od, S_ROUTER_IN_IP_INPUT, 90,
                          ds_cstr(&match), ds_cstr(&actions));
        }

could it be that you are hitting this type of arp responder rather than the
one for the logical switch
datapath "router type" arp responder ?

    /* Ingress table 9: ARP/ND responder, reply for known IPs.
     * (priority 50). */
    HMAP_FOR_EACH (op, key_node, ports) {
        if (!op->nbsp) {
            continue;
        }

        /*
         * Add ARP/ND reply flows if either the
         *  - port is up or
         *  - port type is router
         */
        if (!lsp_is_up(op->nbsp) && strcmp(op->nbsp->type, "router")) {
            continue;
        }

        for (size_t i = 0; i < op->n_lsp_addrs; i++) {
            for (size_t j = 0; j < op->lsp_addrs[i].n_ipv4_addrs; j++) {
                ds_clear(&match);
                ds_put_format(&match, "arp.tpa == %s && arp.op == 1",
                              op->lsp_addrs[i].ipv4_addrs[j].addr_s);
                ds_clear(&actions);
                ds_put_format(&actions,
                    "eth.dst = eth.src; "
                    "eth.src = %s; "
                    "arp.op = 2; /* ARP reply */ "
                    "arp.tha = arp.sha; "
                    "arp.sha = %s; "
                    "arp.tpa = arp.spa; "
                    "arp.spa = %s; "
                    "outport = inport; "
                    "flags.loopback = 1; "
                    "output;",
                    op->lsp_addrs[i].ea_s, op->lsp_addrs[i].ea_s,
                    op->lsp_addrs[i].ipv4_addrs[j].addr_s);
                ovn_lflow_add(lflows, op->od, S_SWITCH_IN_ARP_ND_RSP, 50,
                              ds_cstr(&match), ds_cstr(&actions));
            }
diff mbox

Patch

diff --git a/ovn/northd/ovn-northd.8.xml b/ovn/northd/ovn-northd.8.xml
index 1da633a..3b885f9 100644
--- a/ovn/northd/ovn-northd.8.xml
+++ b/ovn/northd/ovn-northd.8.xml
@@ -421,8 +421,8 @@ 
 
     <ul>
       <li>
-        Priority-100 flows to skip ARP responder if inport is of type
-        <code>localnet</code>, and advances directly to the next table.
+        Priority-100 flows to skip ARP responder if inport is not associated
+        with a local VIF, and advances directly to the next table.
       </li>
 
       <li>
diff --git a/ovn/northd/ovn-northd.c b/ovn/northd/ovn-northd.c
index 2cbbb47..d752589 100644
--- a/ovn/northd/ovn-northd.c
+++ b/ovn/northd/ovn-northd.c
@@ -2683,7 +2683,9 @@  build_lswitch_flows(struct hmap *datapaths, struct hmap *ports,
             continue;
         }
 
-        if (!strcmp(op->nbsp->type, "localnet")) {
+        /* Skip installing arp responder if the logical switch inport
+         * is not associated with a local VIF. */
+        if (strcmp(op->nbsp->type, "")) {
             ds_clear(&match);
             ds_put_format(&match, "inport == %s", op->json_key);
             ovn_lflow_add(lflows, op->od, S_SWITCH_IN_ARP_ND_RSP, 100,