From patchwork Thu Sep  3 16:45:01 2015
From: Russell Bryant <rbryant@redhat.com>
To: dev@openvswitch.org
Date: Thu, 3 Sep 2015 12:45:01 -0400
Message-Id: <1441298701-32163-3-git-send-email-rbryant@redhat.com>
In-Reply-To: <1441298701-32163-1-git-send-email-rbryant@redhat.com>
References: <1440601674-27185-1-git-send-email-rbryant@redhat.com>
 <1441298701-32163-1-git-send-email-rbryant@redhat.com>
Subject: [ovs-dev] [PATCH v8 2/2] ovn: Add "localnet" logical port type.
Introduce a new logical port type called "localnet".  A logical port with
this type also has an option called "network_name".  A "localnet" logical
port represents a connection to a network that is locally accessible from
each chassis running ovn-controller.  ovn-controller will use the
ovn-bridge-mappings configuration to figure out which patch port on
br-int should be used for this port.

OpenStack Neutron has an API extension called "provider networks", which
allows an administrator to request that ports be attached directly to some
pre-existing network in their environment.  There was a previous thread
where we got into the details of this here:

  http://openvswitch.org/pipermail/dev/2015-June/056765.html

The case where this would be used is an environment that isn't actually
interested in virtual networks and just wants all of its compute resources
connected up to externally managed networks.  Even in this environment,
OVN still has a lot of value to add.  OVN implements port security and
ACLs for all ports connected to these networks.  OVN also provides the
configuration interface and control plane to manage this across many
hypervisors.

As a specific example, consider an environment with two hypervisors
(A and B) with two VMs on each hypervisor (A1, A2, B1, B2).  Now imagine
that the desired setup from an OpenStack perspective is to have all of
these VMs attached to the same provider network, which is a physical
network we'll refer to as "physnet1".

The first step here is to configure each hypervisor with bridge mappings
that tell ovn-controller that a local bridge called "br-eth1" is used to
reach the network called "physnet1".
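As background for the bridge-mappings configuration used below: the
ovn-bridge-mappings value is a comma-separated list of
network-name:bridge pairs.  The following is only an illustrative sketch
of how such a value decomposes into pairs; the "mappings" variable is a
hypothetical example, not a name used by OVN itself:

```shell
# Hypothetical sketch: split an example ovn-bridge-mappings value into
# its network-name:bridge pairs.  "mappings" is an illustration only.
mappings="physnet1:br-eth1,physnet2:br-eth2"
IFS=','
for pair in $mappings; do
    network=${pair%%:*}   # text before the first ':' is the network name
    bridge=${pair#*:}     # text after the first ':' is the local bridge
    echo "$network -> $bridge"
done
```

Each network name on the left names a locally accessible physical
network; each bridge on the right is the local OVS bridge that provides
connectivity to it.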
We can simulate the initial setup of this environment in ovs-sandbox with
the following commands:

  # Set up the local hypervisor (A).
  ovs-vsctl add-br br-eth1
  ovs-vsctl set open . external-ids:ovn-bridge-mappings=physnet1:br-eth1

  # Create a fake remote hypervisor (B).
  ovn-sbctl chassis-add fakechassis geneve 127.0.0.1

To get the behavior we want, we model every Neutron port connected to a
Neutron provider network as an OVN logical switch with 2 ports.  The
first port is a normal logical port to be used by the VM.  The second
logical port is a special port with its type set to "localnet".

To simulate the creation of the OVN logical switches and OVN logical
ports for A1, A2, B1, and B2, you can run the following commands:

  # Create 4 OVN logical switches.  Each logical switch has 2 ports,
  # port1 for a VM and physnet1 for the existing network we are
  # connecting to.
  for n in 1 2 3 4; do
      ovn-nbctl lswitch-add provnet1-$n

      ovn-nbctl lport-add provnet1-$n provnet1-$n-port1
      ovn-nbctl lport-set-macs provnet1-$n-port1 00:00:00:00:00:0$n
      ovn-nbctl lport-set-port-security provnet1-$n-port1 00:00:00:00:00:0$n

      ovn-nbctl lport-add provnet1-$n provnet1-$n-physnet1
      ovn-nbctl lport-set-macs provnet1-$n-physnet1 unknown
      ovn-nbctl lport-set-type provnet1-$n-physnet1 localnet
      ovn-nbctl lport-set-options provnet1-$n-physnet1 network_name=physnet1
  done

  # Bind lport1 (A1) and lport2 (A2) to the local hypervisor.
  ovs-vsctl add-port br-int lport1 -- set Interface lport1 \
      external_ids:iface-id=provnet1-1-port1
  ovs-vsctl add-port br-int lport2 -- set Interface lport2 \
      external_ids:iface-id=provnet1-2-port1

  # Bind the other 2 ports to the fake remote hypervisor.
  ovn-sbctl lport-bind provnet1-3-port1 fakechassis
  ovn-sbctl lport-bind provnet1-4-port1 fakechassis

After running these commands, we have the following logical
configuration:

  $ ovn-nbctl show
  lswitch 035645fc-b2ff-4e26-b953-69addba80a9a (provnet1-4)
      lport provnet1-4-physnet1
          macs: unknown
      lport provnet1-4-port1
          macs: 00:00:00:00:00:04
  lswitch 66212a85-b3b6-4688-bcf6-8062941a2d96 (provnet1-2)
      lport provnet1-2-physnet1
          macs: unknown
      lport provnet1-2-port1
          macs: 00:00:00:00:00:02
  lswitch fc5b1141-0216-4fa7-86f3-461811c1fc9b (provnet1-3)
      lport provnet1-3-physnet1
          macs: unknown
      lport provnet1-3-port1
          macs: 00:00:00:00:00:03
  lswitch 9b1d2636-e654-4d43-84e8-a921af611b33 (provnet1-1)
      lport provnet1-1-physnet1
          macs: unknown
      lport provnet1-1-port1
          macs: 00:00:00:00:00:01

We can also look at OVN_Southbound to see that 2 logical ports are bound
to each hypervisor:

  $ ovn-sbctl show
  Chassis "56b18105-5706-46ef-80c4-ff20979ab068"
      Encap geneve
          ip: "127.0.0.1"
      Port_Binding "provnet1-1-port1"
      Port_Binding "provnet1-2-port1"
  Chassis fakechassis
      Encap geneve
          ip: "127.0.0.1"
      Port_Binding "provnet1-3-port1"
      Port_Binding "provnet1-4-port1"

Now we can generate several packets to test how a packet would be
processed on hypervisor A.  The OpenFlow port numbers in this demo are:

  1 - patch port to br-eth1 (physnet1)
  2 - tunnel to fakechassis
  3 - lport1 (A1)
  4 - lport2 (A2)

Packet test #1: A1 to A2 - This will be output to ofport 1.  Despite both
VMs being local to this hypervisor, all packets between the VMs go
through physnet1.  In practice, this will get optimized at br-eth1.

  ovs-appctl ofproto/trace br-int \
      in_port=3,dl_src=00:00:00:00:00:01,dl_dst=00:00:00:00:00:02 -generate

Packet test #2: physnet1 to A2 - Consider this a continuation of test #1:
the same packet now arrives back on the patch port from physnet1.  Only
the logical switch that A2 is attached to will be considered.  The end
result should be that the only output is to ofport 4 (A2).
  ovs-appctl ofproto/trace br-int \
      in_port=1,dl_src=00:00:00:00:00:01,dl_dst=00:00:00:00:00:02 -generate

Packet test #3: A1 to B1 - This will be output to ofport 1, as physnet1
is to be used to reach any other port.  When it arrives at hypervisor B,
processing would look just like test #2.

  ovs-appctl ofproto/trace br-int \
      in_port=3,dl_src=00:00:00:00:00:01,dl_dst=00:00:00:00:00:03 -generate

Packet test #4: A1 broadcast - Again, the packet will only be sent to
physnet1.

  ovs-appctl ofproto/trace br-int \
      in_port=3,dl_src=00:00:00:00:00:01,dl_dst=ff:ff:ff:ff:ff:ff -generate

Packet test #5: B1 broadcast arriving at hypervisor A - This is somewhat
a continuation of test #4.  When a broadcast packet arrives from physnet1
on hypervisor A, we should see it output to both A1 and A2 (ofports 3
and 4).

  ovs-appctl ofproto/trace br-int \
      in_port=1,dl_src=00:00:00:00:00:03,dl_dst=ff:ff:ff:ff:ff:ff -generate

Signed-off-by: Russell Bryant <rbryant@redhat.com>
---
 ovn/controller/physical.c  | 178 +++++++++++++++++++++++++++++++++++++++------
 ovn/ovn-architecture.7.xml |   7 ++
 ovn/ovn-nb.xml             |  18 ++++-
 ovn/ovn-sb.xml             |  33 ++++++++-
 4 files changed, 208 insertions(+), 28 deletions(-)

diff --git a/ovn/controller/physical.c b/ovn/controller/physical.c
index 2ec0ba9..ba2cddf 100644
--- a/ovn/controller/physical.c
+++ b/ovn/controller/physical.c
@@ -23,7 +23,9 @@
 #include "ovn-controller.h"
 #include "ovn/lib/ovn-sb-idl.h"
 #include "openvswitch/vlog.h"
+#include "shash.h"
 #include "simap.h"
+#include "smap.h"
 #include "sset.h"
 #include "vswitch-idl.h"
 
@@ -138,6 +140,8 @@ physical_run(struct controller_ctx *ctx, enum mf_field_id mff_ovn_geneve,
 {
     struct simap lport_to_ofport = SIMAP_INITIALIZER(&lport_to_ofport);
     struct hmap tunnels = HMAP_INITIALIZER(&tunnels);
+    struct simap localnet_to_ofport = SIMAP_INITIALIZER(&localnet_to_ofport);
+
     for (int i = 0; i < br_int->n_ports; i++) {
         const struct ovsrec_port *port_rec = br_int->ports[i];
         if (!strcmp(port_rec->name, br_int->name)) {
@@ -150,6 +154,9 @@
 physical_run(struct controller_ctx *ctx, enum mf_field_id mff_ovn_geneve,
             continue;
         }
 
+        const char *localnet = smap_get(&port_rec->external_ids,
+                                        "ovn-patch-port");
+
         for (int j = 0; j < port_rec->n_interfaces; j++) {
             const struct ovsrec_interface *iface_rec = port_rec->interfaces[j];
 
@@ -162,8 +169,11 @@ physical_run(struct controller_ctx *ctx, enum mf_field_id mff_ovn_geneve,
                 continue;
             }
 
-            /* Record as chassis or local logical port. */
-            if (chassis_id) {
+            /* Record as patch to local net, chassis, or local logical port. */
+            if (!strcmp(iface_rec->type, "patch") && localnet) {
+                simap_put(&localnet_to_ofport, localnet, ofport);
+                break;
+            } else if (chassis_id) {
                 enum chassis_tunnel_type tunnel_type;
                 if (!strcmp(iface_rec->type, "geneve")) {
                     tunnel_type = GENEVE;
@@ -196,6 +206,20 @@ physical_run(struct controller_ctx *ctx, enum mf_field_id mff_ovn_geneve,
     struct ofpbuf ofpacts;
     ofpbuf_init(&ofpacts, 0);
 
+    struct binding_elem {
+        struct ovs_list list_elem;
+        const struct sbrec_port_binding *binding;
+    };
+    struct localnet_bindings {
+        ofp_port_t ofport;
+        struct ovs_list bindings;
+    };
+    /* A list of localnet port bindings hashed by network name. */
+    struct shash localnet_inputs = SHASH_INITIALIZER(&localnet_inputs);
+
+    /* Datapaths with at least one local port binding. */
+    struct hmap local_datapaths = HMAP_INITIALIZER(&local_datapaths);
+
     /* Set up flows in table 0 for physical-to-logical translation and in table
      * 64 for logical-to-physical translation.
      */
     const struct sbrec_port_binding *binding;
@@ -210,7 +234,13 @@ physical_run(struct controller_ctx *ctx, enum mf_field_id mff_ovn_geneve,
         int tag = 0;
         ofp_port_t ofport;
-        if (binding->parent_port) {
+        if (!strcmp(binding->type, "localnet")) {
+            const char *network = smap_get(&binding->options, "network_name");
+            if (!network) {
+                continue;
+            }
+            ofport = u16_to_ofp(simap_get(&localnet_to_ofport, network));
+        } else if (binding->parent_port) {
             ofport = u16_to_ofp(simap_get(&lport_to_ofport,
                                           binding->parent_port));
             if (ofport && binding->tag) {
@@ -245,33 +275,67 @@ physical_run(struct controller_ctx *ctx, enum mf_field_id mff_ovn_geneve,
          *
          * Priority 150 is for traffic belonging to containers.  For such
          * traffic, match on the tags and then strip the tag.
-         * Priority 100 is for traffic belonging to VMs.
+         * Priority 100 is for traffic belonging to VMs or locally connected
+         * networks.
          *
          * For both types of traffic: set MFF_LOG_INPORT to the logical
          * input port, MFF_LOG_DATAPATH to the logical datapath, and
          * resubmit into the logical ingress pipeline starting at table
          * 16. */
-        match_init_catchall(&match);
-        ofpbuf_clear(&ofpacts);
-        match_set_in_port(&match, ofport);
-        if (tag) {
-            match_set_dl_vlan(&match, htons(tag));
-        }
+        if (!strcmp(binding->type, "localnet")) {
+            /* The same OpenFlow port may correspond to localnet ports
+             * attached to more than one logical datapath, so keep track of
+             * all associated bindings and add a flow at the end. */
+
+            const char *network = smap_get(&binding->options, "network_name");
+            struct shash_node *node;
+            struct localnet_bindings *ln_bindings;
+
+            node = shash_find(&localnet_inputs, network);
+            if (!node) {
+                ln_bindings = xmalloc(sizeof *ln_bindings);
+                ln_bindings->ofport = ofport;
+                list_init(&ln_bindings->bindings);
+                node = shash_add(&localnet_inputs, network, ln_bindings);
+            }
+            ln_bindings = node->data;
 
-        /* Set MFF_LOG_DATAPATH and MFF_LOG_INPORT. */
-        put_load(binding->datapath->tunnel_key, MFF_LOG_DATAPATH, 0, 64,
-                 &ofpacts);
-        put_load(binding->tunnel_key, MFF_LOG_INPORT, 0, 32, &ofpacts);
+            struct binding_elem *b = xmalloc(sizeof *b);
+            b->binding = binding;
+            list_insert(&ln_bindings->bindings, &b->list_elem);
+        } else {
+            struct hmap_node *ld;
+            ld = hmap_first_with_hash(&local_datapaths,
+                                      binding->datapath->tunnel_key);
+            if (!ld) {
+                ld = xmalloc(sizeof *ld);
+                hmap_insert(&local_datapaths, ld,
+                            binding->datapath->tunnel_key);
+            }
 
-        /* Strip vlans. */
-        if (tag) {
-            ofpact_put_STRIP_VLAN(&ofpacts);
-        }
+            ofpbuf_clear(&ofpacts);
+            match_init_catchall(&match);
+            match_set_in_port(&match, ofport);
+            if (tag) {
+                match_set_dl_vlan(&match, htons(tag));
+            }
 
-        /* Resubmit to first logical ingress pipeline table. */
-        put_resubmit(OFTABLE_LOG_INGRESS_PIPELINE, &ofpacts);
-        ofctrl_add_flow(flow_table, OFTABLE_PHY_TO_LOG, tag ? 150 : 100,
-                        &match, &ofpacts);
+            /* Set MFF_LOG_DATAPATH and MFF_LOG_INPORT. */
+            put_load(binding->datapath->tunnel_key, MFF_LOG_DATAPATH, 0, 64,
+                     &ofpacts);
+            put_load(binding->tunnel_key, MFF_LOG_INPORT, 0, 32,
+                     &ofpacts);
+
+            /* Strip vlans. */
+            if (tag) {
+                ofpact_put_STRIP_VLAN(&ofpacts);
+            }
+
+            /* Resubmit to first logical ingress pipeline table. */
+            put_resubmit(OFTABLE_LOG_INGRESS_PIPELINE, &ofpacts);
+            ofctrl_add_flow(flow_table, OFTABLE_PHY_TO_LOG,
+                            tag ? 150 : 100, &match, &ofpacts);
+        }
 
         /* Table 33, priority 100.
          * =======================
@@ -401,6 +465,16 @@ physical_run(struct controller_ctx *ctx, enum mf_field_id mff_ovn_geneve,
             put_resubmit(OFTABLE_DROP_LOOPBACK, &ofpacts);
         } else if (port->chassis) {
             sset_add(&remote_chassis, port->chassis->name);
+        } else if (!strcmp(port->type, "localnet")) {
+            const char *network = smap_get(&port->options, "network_name");
+            if (!network) {
+                continue;
+            }
+            if (!simap_contains(&localnet_to_ofport, network)) {
+                continue;
+            }
+            put_load(port->tunnel_key, MFF_LOG_OUTPORT, 0, 32, &ofpacts);
+            put_resubmit(OFTABLE_DROP_LOOPBACK, &ofpacts);
         }
     }
 
@@ -516,4 +590,64 @@ physical_run(struct controller_ctx *ctx, enum mf_field_id mff_ovn_geneve,
         free(tun);
     }
     hmap_destroy(&tunnels);
+
+    /* Table 0, Priority 100
+     * =====================
+     *
+     * We have now determined the full set of port bindings associated with
+     * each "localnet" network.  Only create flows for datapaths that have
+     * another local binding.  Otherwise, we know it would just be
+     * dropped. */
+    struct shash_node *ln_bindings_node, *ln_bindings_node_next;
+    struct localnet_bindings *ln_bindings;
+    SHASH_FOR_EACH_SAFE (ln_bindings_node, ln_bindings_node_next,
+                         &localnet_inputs) {
+        ln_bindings = ln_bindings_node->data;
+
+        struct match match;
+        match_init_catchall(&match);
+        match_set_in_port(&match, ln_bindings->ofport);
+
+        struct ofpbuf ofpacts;
+        ofpbuf_init(&ofpacts, 0);
+
+        bool found_local = false;
+        struct binding_elem *b;
+        LIST_FOR_EACH_POP (b, list_elem, &ln_bindings->bindings) {
+            struct hmap_node *ld;
+            ld = hmap_first_with_hash(&local_datapaths,
+                                      b->binding->datapath->tunnel_key);
+            if (ld) {
+                /* This datapath has local port bindings. */
+                found_local = true;
+
+                /* Set MFF_LOG_DATAPATH and MFF_LOG_INPORT. */
+                put_load(b->binding->datapath->tunnel_key, MFF_LOG_DATAPATH,
+                         0, 64, &ofpacts);
+                put_load(b->binding->tunnel_key, MFF_LOG_INPORT, 0, 32,
+                         &ofpacts);
+                put_resubmit(OFTABLE_LOG_INGRESS_PIPELINE, &ofpacts);
+            }
+
+            free(b);
+        }
+
+        if (found_local) {
+            ofctrl_add_flow(flow_table, 0, 100, &match, &ofpacts);
+        }
+
+        ofpbuf_uninit(&ofpacts);
+
+        shash_delete(&localnet_inputs, ln_bindings_node);
+        free(ln_bindings);
+    }
+    shash_destroy(&localnet_inputs);
+
+    struct hmap_node *node;
+    while ((node = hmap_first(&local_datapaths))) {
+        hmap_remove(&local_datapaths, node);
+        free(node);
+    }
+    hmap_destroy(&local_datapaths);
+
+    simap_destroy(&localnet_to_ofport);
 }

diff --git a/ovn/ovn-architecture.7.xml b/ovn/ovn-architecture.7.xml
index 31488bd..b4e6d10 100644
--- a/ovn/ovn-architecture.7.xml
+++ b/ovn/ovn-architecture.7.xml
@@ -679,6 +679,13 @@

+
+    It's possible that a single ingress physical port maps to multiple
+    logical ports with a type of localnet.  The logical datapath
+    and logical input port fields will be reset and the packet will be
+    resubmitted to table 16 multiple times.
+
 
   Packets that originate from a container nested within a VM are treated
   in a slightly different way.  The originating container can be
   distinguished based on the VIF-specific VLAN ID, so the

diff --git a/ovn/ovn-nb.xml b/ovn/ovn-nb.xml
index ade8164..0dfdab5 100644
--- a/ovn/ovn-nb.xml
+++ b/ovn/ovn-nb.xml
@@ -116,13 +116,25 @@

-      There are no other logical port types implemented yet.
+      When this column is set to localnet, this logical port represents
+      a connection to a locally accessible network from each
+      ovn-controller instance.  A logical switch can only have a single
+      localnet port attached and at most one regular logical port.  This
+      is used to model direct connectivity to an existing network.
 
-      This column provides key/value settings specific to the logical port
-      type.
+
+      This column provides key/value settings specific to the logical port
+      type.
+
+
+      When type is set to localnet, you must set the option
+      network_name.  ovn-controller uses local configuration to determine
+      exactly how to connect to this locally accessible network.
+

diff --git a/ovn/ovn-sb.xml b/ovn/ovn-sb.xml
index 8102eb3..844150c 100644
--- a/ovn/ovn-sb.xml
+++ b/ovn/ovn-sb.xml
@@ -943,13 +943,40 @@
 
-      There are no other logical port types implemented yet.
+      When this column is set to localnet, this logical port represents
+      a connection to a locally accessible network from each
+      ovn-controller instance.  A logical switch can only have a single
+      localnet port attached and at most one regular logical port.  This
+      is used to model direct connectivity to an existing network.
 
-      This column provides key/value settings specific to the logical port
-      type.
+
+      This column provides key/value settings specific to the logical port
+      type.
+
+
+      When type is set to localnet, you must set the option
+      network_name.  ovn-controller uses the configuration entry
+      ovn-bridge-mappings to determine how to connect to this network.
+      ovn-bridge-mappings is a list of network names mapped to a local
+      OVS bridge that provides access to that network.  An example of
+      configuring ovn-bridge-mappings would be:
+
+      $ ovs-vsctl set open . external-ids:ovn-bridge-mappings=physnet1:br-eth0,physnet2:br-eth1
+
+      Also note that when a logical switch has a localnet port attached,
+      every chassis that may have a local vif attached to that logical
+      switch must have a bridge mapping configured to reach that
+      localnet.  Traffic that arrives on a localnet port is never
+      forwarded over a tunnel to another chassis.