diff mbox

[ovs-dev,patch_v7] ovn: Fix receive from vxlan in ovn-controller.

Message ID 1469833718-70978-1-git-send-email-dlu998@gmail.com
State Superseded
Headers show

Commit Message

Darrell Ball July 29, 2016, 11:08 p.m. UTC
The changes enable source node replication in OVN for receive from Vxlan
tunnels.  OVN only supports source node replication mode.  This is needed
for OVN to interoperate with hardware switches.

OVN only supports source_node replication and previously vtep interaction,
which used service node replication by default for
multicast/broadcast/unknown unicast traffic "worked" by happenstance.
Because of limited vxlan encapsulation metadata, received packets were
resubmitted to find the egress port(s). This is not correct for multicast,
broadcast and unknown unicast traffic as traffic will get resent on the tunnel
mesh. ovn-controller is changed not to send traffic received from vxlan
tunnels out the tunnel mesh again.  Traffic received from vxlan tunnels is
now only sent locally as intended with obvious benefits.  This behavior is
newly documented in ovn-architecture.7.xml.

To support keeping state for receipt from a vxlan tunnel, a MFF logical
register is allocated for general scratchpad purposes and one bit is used for
receipt from vxlan.  The new register usage is documented in
ovn-architecture.7.xml.

As part of this change ovn-controller-vtep is hard-coded to set the
replication mode of each logical switch to source node as OVN will only
support source node replication.

Signed-off-by: Darrell Ball <dlu998@gmail.com>
---
 ovn/controller-vtep/vtep.c |  4 ++++
 ovn/controller/physical.c  | 25 ++++++++++++++++++++-----
 ovn/lib/logical-fields.h   | 15 +++++++++++++++
 ovn/ovn-architecture.7.xml | 26 +++++++++++++++++++++++++-
 tests/ovn.at               |  3 +++
 5 files changed, 67 insertions(+), 6 deletions(-)

Comments

Ryan Moats Aug. 6, 2016, 7:38 p.m. UTC | #1
"dev" <dev-bounces@openvswitch.org> wrote on 07/29/2016 06:08:38 PM:

> From: Darrell Ball <dlu998@gmail.com>
> To: dlu998@gmail.com, dev@openvswitch.com, blp@ovn.org
> Date: 07/29/2016 06:09 PM
> Subject: [ovs-dev] [patch_v7] ovn: Fix receive from vxlan in
ovn-controller.
> Sent by: "dev" <dev-bounces@openvswitch.org>
>
> The changes enable source node replication in OVN for receive from Vxlan
> tunnels.  OVN only supports source node replication mode.  This is needed
> for OVN to interoperate with hardware switches.
>
> OVN only supports source_node replication and previously vtep
interaction,
> which used service node replication by default for
> multicast/broadcast/unknown unicast traffic "worked" by happenstance.
> Because of limited vxlan encapsulation metadata, received packets were
> resubmitted to find the egress port(s). This is not correct for
multicast,
> broadcast and unknown unicast traffic as traffic will get resent on the
tunnel
> mesh. ovn-controller is changed not to send traffic received from vxlan
> tunnels out the tunnel mesh again.  Traffic received from vxlan tunnels
is
> now only sent locally as intended with obvious benefits.  This behavior
is
> newly documented in ovn-architecture.7.xml.
>
> To support keeping state for receipt from a vxlan tunnel, a MFF logical
> register is allocated for general scratchpad purposes and one bit is used
for
> receipt from vxlan.  The new register usage is documented in
> ovn-architecture.7.xml.
>
> As part of this change ovn-controller-vtep is hard-coded to set the
> replication mode of each logical switch to source node as OVN will only
> support source node replication.
>
> Signed-off-by: Darrell Ball <dlu998@gmail.com>
> ---

I was going to try this out, but the patches to ovn/lib/logical-fields.h
didn't apply cleanly and it looks like some of the changes are already
there, so rather than guess what goes in and doesn't, can we get a rebase?
Darrell Ball Aug. 8, 2016, 7:57 p.m. UTC | #2
On Sat, Aug 6, 2016 at 12:38 PM, Ryan Moats <rmoats@us.ibm.com> wrote:

> "dev" <dev-bounces@openvswitch.org> wrote on 07/29/2016 06:08:38 PM:
>
> > From: Darrell Ball <dlu998@gmail.com>
> > To: dlu998@gmail.com, dev@openvswitch.com, blp@ovn.org
> > Date: 07/29/2016 06:09 PM
> > Subject: [ovs-dev] [patch_v7] ovn: Fix receive from vxlan in
> ovn-controller.
> > Sent by: "dev" <dev-bounces@openvswitch.org>
> >
> > The changes enable source node replication in OVN for receive from Vxlan
> > tunnels.  OVN only supports source node replication mode.  This is needed
> > for OVN to interoperate with hardware switches.
> >
> > OVN only supports source_node replication and previously vtep
> interaction,
> > which used service node replication by default for
> > multicast/broadcast/unknown unicast traffic "worked" by happenstance.
> > Because of limited vxlan encapsulation metadata, received packets were
> > resubmitted to find the egress port(s). This is not correct for
> multicast,
> > broadcast and unknown unicast traffic as traffic will get resent on the
> tunnel
> > mesh. ovn-controller is changed not to send traffic received from vxlan
> > tunnels out the tunnel mesh again.  Traffic received from vxlan tunnels
> is
> > now only sent locally as intended with obvious benefits.  This behavior
> is
> > newly documented in ovn-architecture.7.xml.
> >
> > To support keeping state for receipt from a vxlan tunnel, a MFF logical
> > register is allocated for general scratchpad purposes and one bit is
> used for
> > receipt from vxlan.  The new register usage is documented in
> > ovn-architecture.7.xml.
> >
> > As part of this change ovn-controller-vtep is hard-coded to set the
> > replication mode of each logical switch to source node as OVN will only
> > support source node replication.
> >
> > Signed-off-by: Darrell Ball <dlu998@gmail.com>
> > ---
>
> I was going to try this out, but the patches to ovn/lib/logical-fields.h
> didn't apply cleanly and it looks like some of the changes are already
> there, so rather than guess what goes in and doesn't, can we get a rebase?
>
>
>
hmm, it looks like a logical flags register was added by Justin recently
hence
there is no need for me to add that.

Since this patch or patch series has been out for a about 2 months and
there
has been some disagreement regarding document
Darrell Ball Aug. 8, 2016, 8:01 p.m. UTC | #3
hmm, it looks like a logical flags register was added by Justin recently
hence
there is no need for me to add that.

Since this patch or patch series has been out for about 2 months and there
has been some disagreement regarding documentation refactoring, it might
be best to coordinate this closely with some folks.

On Mon, Aug 8, 2016 at 12:57 PM, Darrell Ball <dlu998@gmail.com> wrote:

>
>
> On Sat, Aug 6, 2016 at 12:38 PM, Ryan Moats <rmoats@us.ibm.com> wrote:
>
>> "dev" <dev-bounces@openvswitch.org> wrote on 07/29/2016 06:08:38 PM:
>>
>> > From: Darrell Ball <dlu998@gmail.com>
>> > To: dlu998@gmail.com, dev@openvswitch.com, blp@ovn.org
>> > Date: 07/29/2016 06:09 PM
>> > Subject: [ovs-dev] [patch_v7] ovn: Fix receive from vxlan in
>> ovn-controller.
>> > Sent by: "dev" <dev-bounces@openvswitch.org>
>> >
>> > The changes enable source node replication in OVN for receive from Vxlan
>> > tunnels.  OVN only supports source node replication mode.  This is
>> needed
>> > for OVN to interoperate with hardware switches.
>> >
>> > OVN only supports source_node replication and previously vtep
>> interaction,
>> > which used service node replication by default for
>> > multicast/broadcast/unknown unicast traffic "worked" by happenstance.
>> > Because of limited vxlan encapsulation metadata, received packets were
>> > resubmitted to find the egress port(s). This is not correct for
>> multicast,
>> > broadcast and unknown unicast traffic as traffic will get resent on the
>> tunnel
>> > mesh. ovn-controller is changed not to send traffic received from vxlan
>> > tunnels out the tunnel mesh again.  Traffic received from vxlan tunnels
>> is
>> > now only sent locally as intended with obvious benefits.  This behavior
>> is
>> > newly documented in ovn-architecture.7.xml.
>> >
>> > To support keeping state for receipt from a vxlan tunnel, a MFF logical
>> > register is allocated for general scratchpad purposes and one bit is
>> used for
>> > receipt from vxlan.  The new register usage is documented in
>> > ovn-architecture.7.xml.
>> >
>> > As part of this change ovn-controller-vtep is hard-coded to set the
>> > replication mode of each logical switch to source node as OVN will only
>> > support source node replication.
>> >
>> > Signed-off-by: Darrell Ball <dlu998@gmail.com>
>> > ---
>>
>> I was going to try this out, but the patches to ovn/lib/logical-fields.h
>> didn't apply cleanly and it looks like some of the changes are already
>> there, so rather than guess what goes in and doesn't, can we get a rebase?
>>
>>
>>
> hmm, it looks like a logical flags register was added by Justin recently
> hence
> there is no need for me to add that.
>
> Since this patch or patch series has been out for a about 2 months and
> there
> has been some disagreement regarding document
>
>
>
>

On Mon, Aug 8, 2016 at 12:57 PM, Darrell Ball <dlu998@gmail.com> wrote:

>
>
> On Sat, Aug 6, 2016 at 12:38 PM, Ryan Moats <rmoats@us.ibm.com> wrote:
>
>> "dev" <dev-bounces@openvswitch.org> wrote on 07/29/2016 06:08:38 PM:
>>
>> > From: Darrell Ball <dlu998@gmail.com>
>> > To: dlu998@gmail.com, dev@openvswitch.com, blp@ovn.org
>> > Date: 07/29/2016 06:09 PM
>> > Subject: [ovs-dev] [patch_v7] ovn: Fix receive from vxlan in
>> ovn-controller.
>> > Sent by: "dev" <dev-bounces@openvswitch.org>
>> >
>> > The changes enable source node replication in OVN for receive from Vxlan
>> > tunnels.  OVN only supports source node replication mode.  This is
>> needed
>> > for OVN to interoperate with hardware switches.
>> >
>> > OVN only supports source_node replication and previously vtep
>> interaction,
>> > which used service node replication by default for
>> > multicast/broadcast/unknown unicast traffic "worked" by happenstance.
>> > Because of limited vxlan encapsulation metadata, received packets were
>> > resubmitted to find the egress port(s). This is not correct for
>> multicast,
>> > broadcast and unknown unicast traffic as traffic will get resent on the
>> tunnel
>> > mesh. ovn-controller is changed not to send traffic received from vxlan
>> > tunnels out the tunnel mesh again.  Traffic received from vxlan tunnels
>> is
>> > now only sent locally as intended with obvious benefits.  This behavior
>> is
>> > newly documented in ovn-architecture.7.xml.
>> >
>> > To support keeping state for receipt from a vxlan tunnel, a MFF logical
>> > register is allocated for general scratchpad purposes and one bit is
>> used for
>> > receipt from vxlan.  The new register usage is documented in
>> > ovn-architecture.7.xml.
>> >
>> > As part of this change ovn-controller-vtep is hard-coded to set the
>> > replication mode of each logical switch to source node as OVN will only
>> > support source node replication.
>> >
>> > Signed-off-by: Darrell Ball <dlu998@gmail.com>
>> > ---
>>
>> I was going to try this out, but the patches to ovn/lib/logical-fields.h
>> didn't apply cleanly and it looks like some of the changes are already
>> there, so rather than guess what goes in and doesn't, can we get a rebase?
>>
>>
>>
> hmm, it looks like a logical flags register was added by Justin recently
> hence
> there is no need for me to add that.
>
> Since this patch or patch series has been out for a about 2 months and
> there
> has been some disagreement regarding document
>
>
>
>
diff mbox

Patch

diff --git a/ovn/controller-vtep/vtep.c b/ovn/controller-vtep/vtep.c
index b529519..ba78a39 100644
--- a/ovn/controller-vtep/vtep.c
+++ b/ovn/controller-vtep/vtep.c
@@ -233,6 +233,10 @@  vtep_lswitch_run(struct shash *vtep_pbs, struct sset *vtep_pswitches,
                          vtep_ls->tunnel_key[0], tnl_key);
             }
             vteprec_logical_switch_set_tunnel_key(vtep_ls, &tnl_key, 1);
+            /* OVN is expected to always use source node replication mode,
+             * hence the replication mode is hard-coded for each logical
+             * switch in the context of ovn-controller-vtep. */
+            vteprec_logical_switch_set_replication_mode(vtep_ls, "source_node");
             sset_add(&used_ls, lswitch_name);
         }
     }
diff --git a/ovn/controller/physical.c b/ovn/controller/physical.c
index ee1da4f..90237f6 100644
--- a/ovn/controller/physical.c
+++ b/ovn/controller/physical.c
@@ -472,6 +472,21 @@  consider_port_binding(enum mf_field_id mff_ovn_geneve,
         ofpact_put_OUTPUT(ofpacts_p)->port = ofport;
         ofctrl_add_flow(OFTABLE_REMOTE_OUTPUT, 100,
                         &match, ofpacts_p, &binding->header_.uuid);
+
+        /* For packets received from a Vxlan tunnel which get
+         * resubmitted to OFTABLE_LOG_INGRESS_PIPELINE due to lack of
+         * needed metadata in Vxlan, explicitly skip sending back out
+         * any tunnels and resubmit to table 33 for local delivery. */
+         match_init_catchall(&match);
+         ofpbuf_clear(ofpacts_p);
+         match_set_reg_masked(&match, MFF_LOG_FLAGS - MFF_REG0,
+                              MFF_LOG_FLAGS_RCV_FROM_VXLAN,
+                              MFF_LOG_FLAGS_RCV_FROM_VXLAN);
+         /* Resubmit to table 33. */
+         put_resubmit(OFTABLE_LOCAL_OUTPUT, ofpacts_p);
+         ofctrl_add_flow(OFTABLE_REMOTE_OUTPUT, 101, &match, ofpacts_p,
+                         &binding->header_.uuid);
+
     }
 }
 
@@ -835,11 +850,7 @@  physical_run(struct controller_ctx *ctx, enum mf_field_id mff_ovn_geneve,
      * metadata, we only support VXLAN for connections to gateways.  The
      * VNI is used to populate MFF_LOG_DATAPATH.  The gateway's logical
      * port is set to MFF_LOG_INPORT.  Then the packet is resubmitted to
-     * table 16 to determine the logical egress port.
-     *
-     * xxx Due to resubmitting to table 16, broadcasts will be re-sent to
-     * xxx all logical ports, including non-local ones which could cause
-     * xxx duplicate packets to be received by multiply-connected gateways. */
+     * table 16 to determine the logical egress port. */
     HMAP_FOR_EACH (tun, hmap_node, &tunnels) {
         if (tun->type != VXLAN) {
             continue;
@@ -859,6 +870,10 @@  physical_run(struct controller_ctx *ctx, enum mf_field_id mff_ovn_geneve,
             ofpbuf_clear(&ofpacts);
             put_move(MFF_TUN_ID, 0,  MFF_LOG_DATAPATH, 0, 24, &ofpacts);
             put_load(binding->tunnel_key, MFF_LOG_INPORT, 0, 15, &ofpacts);
+            /* For packets received from a vxlan tunnel, set a flag to that
+             * effect. */
+            put_load(MFF_LOG_FLAGS_RCV_FROM_VXLAN, MFF_LOG_FLAGS,
+                     0, 1, &ofpacts);
             put_resubmit(OFTABLE_LOG_INGRESS_PIPELINE, &ofpacts);
 
             ofctrl_add_flow(OFTABLE_PHY_TO_LOG, 100, &match, &ofpacts, hc_uuid);
diff --git a/ovn/lib/logical-fields.h b/ovn/lib/logical-fields.h
index 24c15c7..022fa18 100644
--- a/ovn/lib/logical-fields.h
+++ b/ovn/lib/logical-fields.h
@@ -23,6 +23,7 @@ 
  * These values are documented in ovn-architecture(7), please update the
  * documentation if you change any of them. */
 #define MFF_LOG_DATAPATH MFF_METADATA /* Logical datapath (64 bits). */
+#define MFF_LOG_FLAGS      MFF_REG10  /* Logical flags (32 bits). */
 #define MFF_LOG_DNAT_ZONE  MFF_REG11  /* conntrack dnat zone for gateway router
                                        * (32 bits). */
 #define MFF_LOG_SNAT_ZONE  MFF_REG12  /* conntrack snat zone for gateway router
@@ -47,4 +48,18 @@ 
     MFF_LOG_REG(MFF_REG8) \
     MFF_LOG_REG(MFF_REG9)
 
+/* MFF_LOG_FLAGS_REG bit assignments */
+enum mff_log_flags_bits {
+    MFF_LOG_FLAGS_RCV_FROM_VXLAN_BIT =         0,
+};
+
+/* MFF_LOG_FLAGS_REG flag assignments */
+enum mff_log_flags {
+    /* This flag is used to indicate that a packet was received from a vxlan
+     * tunnel to compensate for the lack of egress port information available
+     * in Vxlan encapsulation.  Egress port information is available for Geneve
+     * and STT tunnel types. */
+    MFF_LOG_FLAGS_RCV_FROM_VXLAN = (1 << MFF_LOG_FLAGS_RCV_FROM_VXLAN_BIT),
+};
+
 #endif /* ovn/lib/logical-fields.h */
diff --git a/ovn/ovn-architecture.7.xml b/ovn/ovn-architecture.7.xml
index fe20a14..79967d2 100644
--- a/ovn/ovn-architecture.7.xml
+++ b/ovn/ovn-architecture.7.xml
@@ -742,6 +742,11 @@ 
       <p>
         Geneve and STT tunnels pass this field as part of the tunnel key.
         VXLAN tunnels do not transmit the logical output port field.
+        Since VXLAN tunnels do not carry a logical output port field in
+        the tunnel key, when a packet is received from VXLAN tunnel by
+        an OVN hypervisor, the packet is resubmitted to table 16 to
+        determine the output port(s);  when the packet reaches table 32,
+        these packets are resubmitted to table 33 for local delivery.
       </p>
     </dd>
 
@@ -768,6 +773,20 @@ 
       extension register number 12.
     </dd>
 
+    <dt>logical flags</dt>
+    <dd>
+      The logical flags are intended to handle keeping context between
+      tables in order to decide which rules in subsequent tables are
+      matched. One bit is presently claimed and is used to note that a
+      packet was received from a VXLAN tunnel (see <code>logical output
+      port field</code>, above and discussion of table 32 below, for more
+      information).  These values only have local significance and are
+      not meaningful between chassis.  OVN stores the logical flags in
+        <!-- Keep the following in sync with MFF_LOG_FLAGS in
+        ovn/lib/logical-fields.h. -->
+      Nicira extension register number 10.
+    </dd>
+
     <dt>VLAN ID</dt>
     <dd>
       The VLAN ID is used as an interface between OVN and containers nested
@@ -917,7 +936,12 @@ 
         multicast group includes a logical port or ports on the local
         hypervisor, then its actions also resubmit to table 33.  Table 32 also
         includes a fallback flow that resubmits to table 33 if there is no
-        other match.
+        other match.  Table 32 also contains a higher priority rule to match
+        packets received from VXLAN tunnels and resubmit these packets to table
+        33 for local delivery. Packets received from VXLAN tunnels reach here
+        because of a lack of logical output port field in the tunnel key and
+        thus these packets needed to be submitted to table 16 to determine the
+        output port.
       </p>
 
       <p>
diff --git a/tests/ovn.at b/tests/ovn.at
index 051b222..8f8219b 100644
--- a/tests/ovn.at
+++ b/tests/ovn.at
@@ -1212,6 +1212,9 @@  sleep 1
 
 vtep-ctl bind-ls br-vtep br-vtep_n2 0 lsw0
 
+OVS_WAIT_UNTIL([test -n "`as vtep vtep-ctl get-replication-mode lsw0 |
+               grep -- source`"])
+# It takes more time for the update to be processed by ovs-vtep.
 sleep 1
 
 # Add hv3 on the other side of the vtep