[ovs-dev,v8,2/2] ovn: Add "localnet" logical port type.

Message ID 1441298701-32163-3-git-send-email-rbryant@redhat.com
State Accepted

Commit Message

Russell Bryant Sept. 3, 2015, 4:45 p.m. UTC
Introduce a new logical port type called "localnet".  A logical port
with this type also has an option called "network_name".  A "localnet"
logical port represents a connection to a network that is locally
accessible from each chassis running ovn-controller.  ovn-controller
will use the ovn-bridge-mappings configuration to figure out which
patch port on br-int should be used for this port.

OpenStack Neutron has an API extension called "provider networks" which
allows an administrator to specify that it would like ports directly
attached to some pre-existing network in their environment.  There was a
previous thread where we got into the details of this here:

  http://openvswitch.org/pipermail/dev/2015-June/056765.html

The case where this would be used is an environment that isn't actually
interested in virtual networks and just wants all of their compute
resources connected up to externally managed networks.  Even in this
environment, OVN still has a lot of value to add.  OVN implements port
security and ACLs for all ports connected to these networks.  OVN also
provides the configuration interface and control plane to manage this
across many hypervisors.

As a specific example, consider an environment with two hypervisors
(A and B) with two VMs on each hypervisor (A1, A2, B1, B2).  Now imagine
that the desired setup from an OpenStack perspective is to have all of
these VMs attached to the same provider network, which is a physical
network we'll refer to as "physnet1".

The first step here is to configure each hypervisor with bridge mappings
that tell ovn-controller that a local bridge called "br-eth1" is used to
reach the network called "physnet1".  We can simulate the initial setup
of this environment in ovs-sandbox with the following commands:

  # Set up the local hypervisor (A)
  ovs-vsctl add-br br-eth1
  ovs-vsctl set open . external-ids:ovn-bridge-mappings=physnet1:br-eth1

  # Create a fake remote hypervisor (B)
  ovn-sbctl chassis-add fakechassis geneve 127.0.0.1
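
The mapping value is a comma-separated list of network:bridge pairs.  As
a rough illustration of the lookup that has to happen for a given
"network_name", here is a plain shell sketch.  lookup_bridge is a
hypothetical helper written for this message, not part of OVS or OVN,
and it is not how ovn-controller itself is implemented:

```shell
#!/bin/sh
# Hypothetical helper (not an OVS/OVN command): print the bridge mapped to
# a network name, given an ovn-bridge-mappings style string such as
# "physnet1:br-eth0,physnet2:br-eth1".  Returns non-zero if the network
# has no mapping.
lookup_bridge() {
    network=$1
    rest=$2
    while [ -n "$rest" ]; do
        # Take the first "network:bridge" pair off the list.
        pair=${rest%%,*}
        case $pair in
            "$network":*)
                # Print everything after the first colon.
                printf '%s\n' "${pair#*:}"
                return 0
                ;;
        esac
        # Drop the pair just examined; stop when none are left.
        case $rest in
            *,*) rest=${rest#*,} ;;
            *) rest= ;;
        esac
    done
    return 1
}

# Example with the mapping configured above:
#   lookup_bridge physnet1 "physnet1:br-eth1"    # prints br-eth1
```

A network with no entry in the list makes the helper return non-zero,
which mirrors the case where a chassis has no bridge mapping for that
"localnet" port.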

To get the behavior we want, we model every Neutron port connected to a
Neutron provider network as an OVN logical switch with 2 ports.  The
first port is a normal logical port to be used by the VM.  The second
logical port is a special port with its type set to "localnet".

To simulate the creation of the OVN logical switches and OVN logical
ports for A1, A2, B1, and B2, you can run the following commands:

  # Create 4 OVN logical switches.  Each logical switch has 2 ports,
  # port1 for a VM and physnet1 for the existing network we are
  # connecting to.
  for n in 1 2 3 4; do
      ovn-nbctl lswitch-add provnet1-$n

      ovn-nbctl lport-add provnet1-$n provnet1-$n-port1
      ovn-nbctl lport-set-macs provnet1-$n-port1 00:00:00:00:00:0$n
      ovn-nbctl lport-set-port-security provnet1-$n-port1 00:00:00:00:00:0$n

      ovn-nbctl lport-add provnet1-$n provnet1-$n-physnet1
      ovn-nbctl lport-set-macs provnet1-$n-physnet1 unknown
      ovn-nbctl lport-set-type provnet1-$n-physnet1 localnet
      ovn-nbctl lport-set-options provnet1-$n-physnet1 network_name=physnet1
  done

  # Bind lport1 (A1) and lport2 (A2) to the local hypervisor.
  ovs-vsctl add-port br-int lport1 -- set Interface lport1 external_ids:iface-id=provnet1-1-port1
  ovs-vsctl add-port br-int lport2 -- set Interface lport2 external_ids:iface-id=provnet1-2-port1

  # Bind the other 2 ports to the fake remote hypervisor.
  ovn-sbctl lport-bind provnet1-3-port1 fakechassis
  ovn-sbctl lport-bind provnet1-4-port1 fakechassis

After running these commands, we have the following logical
configuration:

  $ ovn-nbctl show
    lswitch 035645fc-b2ff-4e26-b953-69addba80a9a (provnet1-4)
        lport provnet1-4-physnet1
            macs: unknown
        lport provnet1-4-port1
            macs: 00:00:00:00:00:04
    lswitch 66212a85-b3b6-4688-bcf6-8062941a2d96 (provnet1-2)
        lport provnet1-2-physnet1
            macs: unknown
        lport provnet1-2-port1
            macs: 00:00:00:00:00:02
    lswitch fc5b1141-0216-4fa7-86f3-461811c1fc9b (provnet1-3)
        lport provnet1-3-physnet1
            macs: unknown
        lport provnet1-3-port1
            macs: 00:00:00:00:00:03
    lswitch 9b1d2636-e654-4d43-84e8-a921af611b33 (provnet1-1)
        lport provnet1-1-physnet1
            macs: unknown
        lport provnet1-1-port1
            macs: 00:00:00:00:00:01

We can also look at OVN_Southbound to see that 2 logical ports are bound
to each hypervisor:

  $ ovn-sbctl show
  Chassis "56b18105-5706-46ef-80c4-ff20979ab068"
      Encap geneve
          ip: "127.0.0.1"
      Port_Binding "provnet1-1-port1"
      Port_Binding "provnet1-2-port1"
  Chassis fakechassis
      Encap geneve
          ip: "127.0.0.1"
      Port_Binding "provnet1-3-port1"
      Port_Binding "provnet1-4-port1"

Now we can generate several packets to test how a packet would be
processed on hypervisor A.  The OpenFlow port numbers in this demo are:

  1 - patch port to br-eth1 (physnet1)
  2 - tunnel to fakechassis
  3 - lport1 (A1)
  4 - lport2 (A2)

Packet test #1: A1 to A2 - This will be output to ofport 1.  Despite
both VMs being local to this hypervisor, all packets between the VMs go
through physnet1.  In practice, this will get optimized at br-eth1.

  ovs-appctl ofproto/trace br-int \
    in_port=3,dl_src=00:00:00:00:00:01,dl_dst=00:00:00:00:00:02 -generate

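Packet test #2: physnet1 to A2 - Consider this a continuation of test
#1.  The packet arrives back from physnet1 on ofport 1, and every
logical switch that a "localnet" port for physnet1 is attached to will
be considered.  The end result should be that the only output is to
ofport 4 (A2).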

  ovs-appctl ofproto/trace br-int \
    in_port=1,dl_src=00:00:00:00:00:01,dl_dst=00:00:00:00:00:02 -generate
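
For a packet arriving from physnet1, the patch installs one table 0 flow
per localnet OpenFlow port and resubmits into the ingress pipeline once
for each logical datapath that also has a local port binding.  The
selection can be modelled with the following shell sketch.
localnet_resubmits is a hypothetical name, the output text is purely
illustrative, and the real flow carries register loads rather than
strings:

```shell
#!/bin/sh
# Hypothetical model of which datapaths receive a resubmit from the
# table 0 localnet flow.
#   $1: space-separated datapaths with at least one local port binding
#   $2: space-separated datapaths with a localnet port on this network
localnet_resubmits() {
    local_dps=$1
    for dp in $2; do
        # Only datapaths that also have a local binding are kept; for
        # the others the packet would just be dropped.
        case " $local_dps " in
            *" $dp "*)
                printf 'resubmit to ingress pipeline for %s\n' "$dp"
                ;;
        esac
    done
}

# On hypervisor A in this demo, only provnet1-1 and provnet1-2 have local
# ports, so only those two datapaths are selected:
#   localnet_resubmits "provnet1-1 provnet1-2" \
#       "provnet1-1 provnet1-2 provnet1-3 provnet1-4"
```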

Packet test #3: A1 to B1 - This will be output to ofport 1, as physnet1
is to be used to reach any other port.  When it arrives at hypervisor B,
processing would look just like test #2.

  ovs-appctl ofproto/trace br-int \
    in_port=3,dl_src=00:00:00:00:00:01,dl_dst=00:00:00:00:00:03 -generate

Packet test #4: A1 broadcast - Again, the packet will only be sent to
physnet1 (ofport 1).

  ovs-appctl ofproto/trace br-int \
    in_port=3,dl_src=00:00:00:00:00:01,dl_dst=ff:ff:ff:ff:ff:ff -generate

Packet test #5: B1 broadcast arriving at hypervisor A.  This is somewhat
a continuation of test #4.  When a broadcast packet arrives from
physnet1 on hypervisor A, we should see it output to both A1 and A2
(ofports 3 and 4).

  ovs-appctl ofproto/trace br-int \
    in_port=1,dl_src=00:00:00:00:00:03,dl_dst=ff:ff:ff:ff:ff:ff -generate
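
To rerun all five traces in one go, the commands above can be generated
from a small table.  This is only a convenience sketch: trace_cmd and
trace_all are hypothetical helper names, and the port numbers assume the
OpenFlow port assignment listed earlier in this message.

```shell
#!/bin/sh
# Hypothetical helper: print the ofproto/trace command for one test packet.
# Arguments: OpenFlow in_port, source MAC, destination MAC.
trace_cmd() {
    printf 'ovs-appctl ofproto/trace br-int in_port=%s,dl_src=%s,dl_dst=%s -generate\n' \
        "$1" "$2" "$3"
}

# Print the commands for packet tests #1 through #5, one per line.
# Each input row is "in_port src dst".
trace_all() {
    while read -r in_port src dst; do
        trace_cmd "$in_port" "$src" "$dst"
    done <<EOF
3 00:00:00:00:00:01 00:00:00:00:00:02
1 00:00:00:00:00:01 00:00:00:00:00:02
3 00:00:00:00:00:01 00:00:00:00:00:03
3 00:00:00:00:00:01 ff:ff:ff:ff:ff:ff
1 00:00:00:00:00:03 ff:ff:ff:ff:ff:ff
EOF
}
```

Running "trace_all" only prints the commands; piping its output to sh
inside the sandbox would execute them.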

Signed-off-by: Russell Bryant <rbryant@redhat.com>
---
 ovn/controller/physical.c  | 178 +++++++++++++++++++++++++++++++++++++++------
 ovn/ovn-architecture.7.xml |   7 ++
 ovn/ovn-nb.xml             |  18 ++++-
 ovn/ovn-sb.xml             |  33 ++++++++-
 4 files changed, 208 insertions(+), 28 deletions(-)

Patch

diff --git a/ovn/controller/physical.c b/ovn/controller/physical.c
index 2ec0ba9..ba2cddf 100644
--- a/ovn/controller/physical.c
+++ b/ovn/controller/physical.c
@@ -23,7 +23,9 @@ 
 #include "ovn-controller.h"
 #include "ovn/lib/ovn-sb-idl.h"
 #include "openvswitch/vlog.h"
+#include "shash.h"
 #include "simap.h"
+#include "smap.h"
 #include "sset.h"
 #include "vswitch-idl.h"
 
@@ -138,6 +140,8 @@  physical_run(struct controller_ctx *ctx, enum mf_field_id mff_ovn_geneve,
 {
     struct simap lport_to_ofport = SIMAP_INITIALIZER(&lport_to_ofport);
     struct hmap tunnels = HMAP_INITIALIZER(&tunnels);
+    struct simap localnet_to_ofport = SIMAP_INITIALIZER(&localnet_to_ofport);
+
     for (int i = 0; i < br_int->n_ports; i++) {
         const struct ovsrec_port *port_rec = br_int->ports[i];
         if (!strcmp(port_rec->name, br_int->name)) {
@@ -150,6 +154,9 @@  physical_run(struct controller_ctx *ctx, enum mf_field_id mff_ovn_geneve,
             continue;
         }
 
+        const char *localnet = smap_get(&port_rec->external_ids,
+                                        "ovn-patch-port");
+
         for (int j = 0; j < port_rec->n_interfaces; j++) {
             const struct ovsrec_interface *iface_rec = port_rec->interfaces[j];
 
@@ -162,8 +169,11 @@  physical_run(struct controller_ctx *ctx, enum mf_field_id mff_ovn_geneve,
                 continue;
             }
 
-            /* Record as chassis or local logical port. */
-            if (chassis_id) {
+            /* Record as patch to local net, chassis, or local logical port. */
+            if (!strcmp(iface_rec->type, "patch") && localnet) {
+                simap_put(&localnet_to_ofport, localnet, ofport);
+                break;
+            } else if (chassis_id) {
                 enum chassis_tunnel_type tunnel_type;
                 if (!strcmp(iface_rec->type, "geneve")) {
                     tunnel_type = GENEVE;
@@ -196,6 +206,20 @@  physical_run(struct controller_ctx *ctx, enum mf_field_id mff_ovn_geneve,
     struct ofpbuf ofpacts;
     ofpbuf_init(&ofpacts, 0);
 
+    struct binding_elem {
+        struct ovs_list list_elem;
+        const struct sbrec_port_binding *binding;
+    };
+    struct localnet_bindings {
+        ofp_port_t ofport;
+        struct ovs_list bindings;
+    };
+    /* A list of localnet port bindings hashed by network name */
+    struct shash localnet_inputs = SHASH_INITIALIZER(&localnet_inputs);
+
+    /* Datapaths with at least one local port binding */
+    struct hmap local_datapaths = HMAP_INITIALIZER(&local_datapaths);
+
     /* Set up flows in table 0 for physical-to-logical translation and in table
      * 64 for logical-to-physical translation. */
     const struct sbrec_port_binding *binding;
@@ -210,7 +234,13 @@  physical_run(struct controller_ctx *ctx, enum mf_field_id mff_ovn_geneve,
 
         int tag = 0;
         ofp_port_t ofport;
-        if (binding->parent_port) {
+        if (!strcmp(binding->type, "localnet")) {
+            const char *network = smap_get(&binding->options, "network_name");
+            if (!network) {
+                continue;
+            }
+            ofport = u16_to_ofp(simap_get(&localnet_to_ofport, network));
+        } else if (binding->parent_port) {
             ofport = u16_to_ofp(simap_get(&lport_to_ofport,
                                           binding->parent_port));
             if (ofport && binding->tag) {
@@ -245,33 +275,67 @@  physical_run(struct controller_ctx *ctx, enum mf_field_id mff_ovn_geneve,
              *
              * Priority 150 is for traffic belonging to containers. For such
              * traffic, match on the tags and then strip the tag.
-             * Priority 100 is for traffic belonging to VMs.
+             * Priority 100 is for traffic belonging to VMs or locally connected
+             * networks.
              *
              * For both types of traffic: set MFF_LOG_INPORT to the logical
              * input port, MFF_LOG_DATAPATH to the logical datapath, and
              * resubmit into the logical ingress pipeline starting at table
              * 16. */
-            match_init_catchall(&match);
-            ofpbuf_clear(&ofpacts);
-            match_set_in_port(&match, ofport);
-            if (tag) {
-                match_set_dl_vlan(&match, htons(tag));
-            }
+            if (!strcmp(binding->type, "localnet")) {
+                /* The same OpenFlow port may correspond to localnet ports
+                 * attached to more than one logical datapath, so keep track of
+                 * all associated bindings and add a flow at the end. */
+
+                const char *network = smap_get(&binding->options, "network_name");
+                struct shash_node *node;
+                struct localnet_bindings *ln_bindings;
+
+                node = shash_find(&localnet_inputs, network);
+                if (!node) {
+                    ln_bindings = xmalloc(sizeof *ln_bindings);
+                    ln_bindings->ofport = ofport;
+                    list_init(&ln_bindings->bindings);
+                    node = shash_add(&localnet_inputs, network, ln_bindings);
+                }
+                ln_bindings = node->data;
 
-            /* Set MFF_LOG_DATAPATH and MFF_LOG_INPORT. */
-            put_load(binding->datapath->tunnel_key, MFF_LOG_DATAPATH, 0, 64,
-                     &ofpacts);
-            put_load(binding->tunnel_key, MFF_LOG_INPORT, 0, 32, &ofpacts);
+                struct binding_elem *b = xmalloc(sizeof *b);
+                b->binding = binding;
+                list_insert(&ln_bindings->bindings, &b->list_elem);
+            } else {
+                struct hmap_node *ld;
+                ld = hmap_first_with_hash(&local_datapaths,
+                                          binding->datapath->tunnel_key);
+                if (!ld) {
+                    ld = xmalloc(sizeof *ld);
+                    hmap_insert(&local_datapaths, ld,
+                                binding->datapath->tunnel_key);
+                }
 
-            /* Strip vlans. */
-            if (tag) {
-                ofpact_put_STRIP_VLAN(&ofpacts);
-            }
+                ofpbuf_clear(&ofpacts);
+                match_init_catchall(&match);
+                match_set_in_port(&match, ofport);
+                if (tag) {
+                    match_set_dl_vlan(&match, htons(tag));
+                }
 
-            /* Resubmit to first logical ingress pipeline table. */
-            put_resubmit(OFTABLE_LOG_INGRESS_PIPELINE, &ofpacts);
-            ofctrl_add_flow(flow_table, OFTABLE_PHY_TO_LOG, tag ? 150 : 100,
-                            &match, &ofpacts);
+                /* Set MFF_LOG_DATAPATH and MFF_LOG_INPORT. */
+                put_load(binding->datapath->tunnel_key, MFF_LOG_DATAPATH, 0, 64,
+                         &ofpacts);
+                put_load(binding->tunnel_key, MFF_LOG_INPORT, 0, 32,
+                         &ofpacts);
+
+                /* Strip vlans. */
+                if (tag) {
+                    ofpact_put_STRIP_VLAN(&ofpacts);
+                }
+
+                /* Resubmit to first logical ingress pipeline table. */
+                put_resubmit(OFTABLE_LOG_INGRESS_PIPELINE, &ofpacts);
+                ofctrl_add_flow(flow_table, OFTABLE_PHY_TO_LOG,
+                                tag ? 150 : 100, &match, &ofpacts);
+            }
 
             /* Table 33, priority 100.
              * =======================
@@ -401,6 +465,16 @@  physical_run(struct controller_ctx *ctx, enum mf_field_id mff_ovn_geneve,
                 put_resubmit(OFTABLE_DROP_LOOPBACK, &ofpacts);
             } else if (port->chassis) {
                 sset_add(&remote_chassis, port->chassis->name);
+            } else if (!strcmp(port->type, "localnet")) {
+                const char *network = smap_get(&port->options, "network_name");
+                if (!network) {
+                    continue;
+                }
+                if (!simap_contains(&localnet_to_ofport, network)) {
+                    continue;
+                }
+                put_load(port->tunnel_key, MFF_LOG_OUTPORT, 0, 32, &ofpacts);
+                put_resubmit(OFTABLE_DROP_LOOPBACK, &ofpacts);
             }
         }
 
@@ -516,4 +590,64 @@  physical_run(struct controller_ctx *ctx, enum mf_field_id mff_ovn_geneve,
         free(tun);
     }
     hmap_destroy(&tunnels);
+
+    /* Table 0, Priority 100
+     * =====================
+     *
+     * We have now determined the full set of port bindings associated with each
+     * "localnet" network.  Only create flows for datapaths that have another
+     * local binding.  Otherwise, we know it would just be dropped.
+     */
+    struct shash_node *ln_bindings_node, *ln_bindings_node_next;
+    struct localnet_bindings *ln_bindings;
+    SHASH_FOR_EACH_SAFE (ln_bindings_node, ln_bindings_node_next, &localnet_inputs) {
+        ln_bindings = ln_bindings_node->data;
+
+        struct match match;
+        match_init_catchall(&match);
+        match_set_in_port(&match, ln_bindings->ofport);
+
+        struct ofpbuf ofpacts;
+        ofpbuf_init(&ofpacts, 0);
+
+        bool found_local = false;
+        struct binding_elem *b;
+        LIST_FOR_EACH_POP (b, list_elem, &ln_bindings->bindings) {
+            struct hmap_node *ld;
+            ld = hmap_first_with_hash(&local_datapaths,
+                                      b->binding->datapath->tunnel_key);
+            if (ld) {
+                /* This datapath has local port bindings. */
+                found_local = true;
+
+                /* Set MFF_LOG_DATAPATH and MFF_LOG_INPORT. */
+                put_load(b->binding->datapath->tunnel_key, MFF_LOG_DATAPATH,
+                         0, 64, &ofpacts);
+                put_load(b->binding->tunnel_key, MFF_LOG_INPORT, 0, 32,
+                         &ofpacts);
+                put_resubmit(OFTABLE_LOG_INGRESS_PIPELINE, &ofpacts);
+            }
+
+            free(b);
+        }
+
+        if (found_local) {
+            ofctrl_add_flow(flow_table, 0, 100, &match, &ofpacts);
+        }
+
+        ofpbuf_uninit(&ofpacts);
+
+        shash_delete(&localnet_inputs, ln_bindings_node);
+        free(ln_bindings);
+    }
+    shash_destroy(&localnet_inputs);
+
+    struct hmap_node *node;
+    while ((node = hmap_first(&local_datapaths))) {
+        hmap_remove(&local_datapaths, node);
+        free(node);
+    }
+    hmap_destroy(&local_datapaths);
+
+    simap_destroy(&localnet_to_ofport);
 }
diff --git a/ovn/ovn-architecture.7.xml b/ovn/ovn-architecture.7.xml
index 31488bd..b4e6d10 100644
--- a/ovn/ovn-architecture.7.xml
+++ b/ovn/ovn-architecture.7.xml
@@ -679,6 +679,13 @@ 
       </p>
 
       <p>
+        It's possible that a single ingress physical port maps to multiple
+        logical ports with a type of <code>localnet</code>. The logical datapath
+        and logical input port fields will be reset and the packet will be
+        resubmitted to table 16 multiple times.
+      </p>
+
+      <p>
         Packets that originate from a container nested within a VM are treated
         in a slightly different way.  The originating container can be
         distinguished based on the VIF-specific VLAN ID, so the
diff --git a/ovn/ovn-nb.xml b/ovn/ovn-nb.xml
index ade8164..0dfdab5 100644
--- a/ovn/ovn-nb.xml
+++ b/ovn/ovn-nb.xml
@@ -116,13 +116,25 @@ 
       </p>
 
       <p>
-      There are no other logical port types implemented yet.
+      When this column is set to <em>localnet</em>, this logical port represents a
+      connection to a locally accessible network from each ovn-controller instance.
+      A logical switch can only have a single <em>localnet</em> port attached
+      and at most one regular logical port.  This is used to model direct
+      connectivity to an existing network.
       </p>
     </column>
 
     <column name="options">
-        This column provides key/value settings specific to the logical port
-        <ref column="type"/>.
+      <p>
+      This column provides key/value settings specific to the logical port
+      <ref column="type"/>.
+      </p>
+
+      <p>
+      When <ref column="type"/> is set to <em>localnet</em>, you must set the option
+      <em>network_name</em>.  ovn-controller uses local configuration to determine
+      exactly how to connect to this locally accessible network.
+      </p>
     </column>
 
     <column name="parent_name">
diff --git a/ovn/ovn-sb.xml b/ovn/ovn-sb.xml
index 8102eb3..844150c 100644
--- a/ovn/ovn-sb.xml
+++ b/ovn/ovn-sb.xml
@@ -943,13 +943,40 @@ 
       </p>
 
       <p>
-      There are no other logical port types implemented yet.
+      When this column is set to <em>localnet</em>, this logical port represents a
+      connection to a locally accessible network from each ovn-controller instance.
+      A logical switch can only have a single <em>localnet</em> port attached
+      and at most one regular logical port.  This is used to model direct
+      connectivity to an existing network.
       </p>
     </column>
 
     <column name="options">
-        This column provides key/value settings specific to the logical port
-        <ref column="type"/>.
+      <p>
+      This column provides key/value settings specific to the logical port
+      <ref column="type"/>.
+      </p>
+
+      <p>
+      When <ref column="type"/> is set to <em>localnet</em>, you must set the option
+      <em>network_name</em>.  ovn-controller uses the configuration entry
+      <em>ovn-bridge-mappings</em> to determine how to connect to this network.
+      <em>ovn-bridge-mappings</em> is a list of network names mapped to a local
+      OVS bridge that provides access to that network.  An example of configuring
+      <em>ovn-bridge-mappings</em> would be:
+      </p>
+
+      <p>
+      <em>$ ovs-vsctl set open . external-ids:ovn-bridge-mappings=physnet1:br-eth0,physnet2:br-eth1</em>
+      </p>
+
+      <p>
+      Also note that when a logical switch has a <em>localnet</em> port attached,
+      every chassis that may have a local vif attached to that logical switch
+      must have a bridge mapping configured to reach that <em>localnet</em>.
+      Traffic that arrives on a <em>localnet</em> port is never forwarded over a tunnel
+      to another chassis.
+      </p>
     </column>
 
     <column name="tunnel_key">