From patchwork Wed Oct 5 10:17:44 2016
X-Patchwork-Submitter: Gurucharan Shetty
X-Patchwork-Id: 678595
From: Gurucharan Shetty
To: dev@openvswitch.org
Cc: Dustin Spinhirne
Date: Wed, 5 Oct 2016 03:17:44 -0700
Message-Id: <1475662664-11609-1-git-send-email-guru@ovn.org>
X-Mailer: git-send-email 1.9.1
Subject: [ovs-dev] [PATCH v2] ovn-controller: Container can have connection to a hosting VM.
A container running inside a VM can have a connection to the hosting
VM (parent port) in the logical topology, e.g., via a router. So we
should be able to loop back into the same VM, even if the final
packet delivered does not have any VLAN tags in it.

Reported-by: Dustin Spinhirne
Signed-off-by: Gurucharan Shetty
Acked-by: Ben Pfaff
---
 AUTHORS                   |   1 +
 ovn/controller/physical.c |  27 ++++---
 tests/ovn.at              | 195 ++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 211 insertions(+), 12 deletions(-)

diff --git a/AUTHORS b/AUTHORS
index a30a5d8..2525265 100644
--- a/AUTHORS
+++ b/AUTHORS
@@ -324,6 +324,7 @@ Derek Cormier           derek.cormier@lab.ntt.co.jp
 Dhaval Badiani          dbadiani@vmware.com
 DK Moon                 dkmoon@nicira.com
 Ding Zhi                zhi.ding@6wind.com
+Dustin Spinhirne        dspinhirne@vmware.com
 Edwin Chiu              echiu@vmware.com
 Eivind Bulie Haanaes
 Enas Ahmad              enas.ahmad@kaust.edu.sa
diff --git a/ovn/controller/physical.c b/ovn/controller/physical.c
index 1a91531..c5866b4 100644
--- a/ovn/controller/physical.c
+++ b/ovn/controller/physical.c
@@ -211,6 +211,7 @@ consider_port_binding(enum mf_field_id mff_ovn_geneve,
      */
     int tag = 0;
+    bool nested_container = false;
     ofp_port_t ofport;
     bool is_remote = false;
     if (binding->parent_port && *binding->parent_port) {
@@ -221,6 +222,7 @@ consider_port_binding(enum mf_field_id mff_ovn_geneve,
                                     binding->parent_port));
         if (ofport) {
             tag = *binding->tag;
+            nested_container = true;
         }
     } else {
         ofport = u16_to_ofp(simap_get(&localvif_to_ofport,
@@ -288,6 +290,12 @@ consider_port_binding(enum mf_field_id mff_ovn_geneve,
     if (tag || !strcmp(binding->type, "localnet")
         || !strcmp(binding->type, "l2gateway")) {
         match_set_dl_vlan(&match, htons(tag));
+        if (nested_container) {
+            /* When a packet comes from a container sitting behind a
+             * parent_port, we should let it loop back to other containers
+             * or the parent_port itself. */
+            put_load(MLF_ALLOW_LOOPBACK, MFF_LOG_FLAGS, 0, 1, ofpacts_p);
+        }
         ofpact_put_STRIP_VLAN(ofpacts_p);
     }
@@ -385,15 +393,18 @@ consider_port_binding(enum mf_field_id mff_ovn_geneve,
      * =======================
      *
      * If the packet is supposed to hair-pin because the "loopback"
-     * flag is set, temporarily set the in_port to zero, resubmit to
+     * flag is set (or if the destination is a nested container),
+     * temporarily set the in_port to zero, resubmit to
      * table 65 for logical-to-physical translation, then restore
      * the port number. */
     match_init_catchall(&match);
     ofpbuf_clear(ofpacts_p);
     match_set_metadata(&match, htonll(dp_key));
-    match_set_reg_masked(&match, MFF_LOG_FLAGS - MFF_REG0,
-                         MLF_ALLOW_LOOPBACK, MLF_ALLOW_LOOPBACK);
     match_set_reg(&match, MFF_LOG_OUTPORT - MFF_REG0, port_key);
+    if (!nested_container) {
+        match_set_reg_masked(&match, MFF_LOG_FLAGS - MFF_REG0,
+                             MLF_ALLOW_LOOPBACK, MLF_ALLOW_LOOPBACK);
+    }

     put_stack(MFF_IN_PORT, ofpact_put_STACK_PUSH(ofpacts_p));
     put_load(0, MFF_IN_PORT, 0, 16, ofpacts_p);
@@ -417,22 +428,14 @@ consider_port_binding(enum mf_field_id mff_ovn_geneve,
             vlan_vid = ofpact_put_SET_VLAN_VID(ofpacts_p);
             vlan_vid->vlan_vid = tag;
             vlan_vid->push_vlan_if_needed = true;
-
-            /* A packet might need to hair-pin back into its ingress
-             * OpenFlow port (to a different logical port, which we already
-             * checked back in table 34), so set the in_port to zero. */
-            put_stack(MFF_IN_PORT, ofpact_put_STACK_PUSH(ofpacts_p));
-            put_load(0, MFF_IN_PORT, 0, 16, ofpacts_p);
         }
         ofpact_put_OUTPUT(ofpacts_p)->port = ofport;
         if (tag) {
             /* Revert the tag added to the packets headed to containers
              * in the previous step. If we don't do this, the packets
              * that are to be broadcasted to a VM in the same logical
-             * switch will also contain the tag. Also revert the zero'd
-             * in_port. */
+             * switch will also contain the tag. */
             ofpact_put_STRIP_VLAN(ofpacts_p);
-            put_stack(MFF_IN_PORT, ofpact_put_STACK_POP(ofpacts_p));
         }
         ofctrl_add_flow(flow_table, OFTABLE_LOG_TO_PHY, 100,
                         &match, ofpacts_p);
diff --git a/tests/ovn.at b/tests/ovn.at
index 3b27581..1d2e4e5 100644
--- a/tests/ovn.at
+++ b/tests/ovn.at
@@ -5362,3 +5362,198 @@ OVS_WAIT_UNTIL([test `ovs-vsctl show | grep "Port patch-br-int-to-ln_port" | wc
 OVN_CLEANUP([hv1],[hv2])
 AT_CLEANUP
+
+AT_SETUP([ovn -- nested containers])
+AT_KEYWORDS([ovncontainers])
+ovn_start
+
+# Physical network:
+# 2 HVs. HV1 has 2 VMs - "VM1" and "bar3". HV2 has 1 VM - "VM2"
+
+# Logical network:
+# 3 Logical switches - "mgmt" (172.16.1.0/24), "foo" (192.168.1.0/24)
+# and "bar" (192.168.2.0/24). They are all connected to router R1.
+
+ovn-nbctl lr-add R1
+ovn-nbctl ls-add mgmt
+ovn-nbctl ls-add foo
+ovn-nbctl ls-add bar
+
+# Connect mgmt to R1
+ovn-nbctl lrp-add R1 mgmt 00:00:00:01:02:02 172.16.1.1/24
+ovn-nbctl lsp-add mgmt rp-mgmt -- set Logical_Switch_Port rp-mgmt type=router \
+    options:router-port=mgmt addresses=\"00:00:00:01:02:02\"
+
+# Connect foo to R1
+ovn-nbctl lrp-add R1 foo 00:00:00:01:02:03 192.168.1.1/24
+ovn-nbctl lsp-add foo rp-foo -- set Logical_Switch_Port rp-foo type=router \
+    options:router-port=foo addresses=\"00:00:00:01:02:03\"
+
+# Connect bar to R1
+ovn-nbctl lrp-add R1 bar 00:00:00:01:02:04 192.168.2.1/24
+ovn-nbctl lsp-add bar rp-bar -- set Logical_Switch_Port rp-bar type=router \
+    options:router-port=bar addresses=\"00:00:00:01:02:04\"
+
+# "mgmt" has VM1 and VM2 connected
+ovn-nbctl lsp-add mgmt vm1 \
+-- lsp-set-addresses vm1 "f0:00:00:01:02:03 172.16.1.2"
+
+ovn-nbctl lsp-add mgmt vm2 \
+-- lsp-set-addresses vm2 "f0:00:00:01:02:04 172.16.1.3"
+
+# "foo1" and "foo2" are containers belonging to switch "foo"
+# "foo1" has "VM1" as parent_port and "foo2" has "VM2" as parent_port.
+ovn-nbctl lsp-add foo foo1 vm1 1 \
+-- lsp-set-addresses foo1 "f0:00:00:01:02:05 192.168.1.2"
+
+ovn-nbctl lsp-add foo foo2 vm2 2 \
+-- lsp-set-addresses foo2 "f0:00:00:01:02:06 192.168.1.3"
+
+# "bar1" and "bar2" are containers belonging to switch "bar"
+# "bar1" has "VM1" as parent_port and "bar2" has "VM2" as parent_port.
+ovn-nbctl lsp-add bar bar1 vm1 2 \
+-- lsp-set-addresses bar1 "f0:00:00:01:02:07 192.168.2.2"
+
+ovn-nbctl lsp-add bar bar2 vm2 1 \
+-- lsp-set-addresses bar2 "f0:00:00:01:02:08 192.168.2.3"
+
+# bar3 is a standalone VM belonging to switch "bar"
+ovn-nbctl lsp-add bar bar3 \
+-- lsp-set-addresses bar3 "f0:00:00:01:02:09 192.168.2.4"
+
+# Create two hypervisors and create OVS ports corresponding to logical ports.
+net_add n1
+
+sim_add hv1
+as hv1
+ovs-vsctl add-br br-phys
+ovn_attach n1 br-phys 192.168.0.1
+ovs-vsctl -- add-port br-int vm1 -- \
+    set interface vm1 external-ids:iface-id=vm1 \
+    options:tx_pcap=hv1/vm1-tx.pcap \
+    options:rxq_pcap=hv1/vm1-rx.pcap \
+    ofport-request=1
+
+ovs-vsctl -- add-port br-int bar3 -- \
+    set interface bar3 external-ids:iface-id=bar3 \
+    options:tx_pcap=hv1/bar3-tx.pcap \
+    options:rxq_pcap=hv1/bar3-rx.pcap \
+    ofport-request=2
+
+sim_add hv2
+as hv2
+ovs-vsctl add-br br-phys
+ovn_attach n1 br-phys 192.168.0.2
+ovs-vsctl -- add-port br-int vm2 -- \
+    set interface vm2 external-ids:iface-id=vm2 \
+    options:tx_pcap=hv2/vm2-tx.pcap \
+    options:rxq_pcap=hv2/vm2-rx.pcap \
+    ofport-request=1
+
+# Pre-populate the hypervisors' ARP tables so that we don't lose any
+# packets for ARP resolution (native tunneling doesn't queue packets
+# for ARP resolution).
+ovn_populate_arp
+
+# Allow some time for ovn-northd and ovn-controller to catch up.
+# XXX This should be more systematic.
+sleep 1
+
+ip_to_hex() {
+    printf "%02x%02x%02x%02x" "$@"
+}
+
+# Send ip packets between foo1 and foo2 (same switch, different HVs and
+# different VLAN tags).
+src_mac="f00000010205"
+dst_mac="f00000010206"
+src_ip=`ip_to_hex 192 168 1 2`
+dst_ip=`ip_to_hex 192 168 1 3`
+packet=${dst_mac}${src_mac}8100000108004500001c0000000040110000${src_ip}${dst_ip}0035111100080000
+as hv1 ovs-appctl netdev-dummy/receive vm1 $packet
+
+# expected packet at foo2
+packet=${dst_mac}${src_mac}8100000208004500001c0000000040110000${src_ip}${dst_ip}0035111100080000
+echo $packet > expected
+OVN_CHECK_PACKETS([hv2/vm2-tx.pcap], [expected])
+
+# Send ip packets between foo1 and bar2 (different switch, different HV)
+src_mac="f00000010205"
+dst_mac="000000010203"
+src_ip=`ip_to_hex 192 168 1 2`
+dst_ip=`ip_to_hex 192 168 2 3`
+packet=${dst_mac}${src_mac}8100000108004500001c0000000040110000${src_ip}${dst_ip}0035111100080000
+as hv1 ovs-appctl netdev-dummy/receive vm1 $packet
+
+# expected packet at bar2
+src_mac="000000010204"
+dst_mac="f00000010208"
+packet=${dst_mac}${src_mac}8100000108004500001c000000003f110100${src_ip}${dst_ip}0035111100080000
+echo $packet >> expected
+OVN_CHECK_PACKETS([hv2/vm2-tx.pcap], [expected])
+
+# Send ip packets between foo1 and bar1
+# (different switch, loopback to same vm but different tag)
+src_mac="f00000010205"
+dst_mac="000000010203"
+src_ip=`ip_to_hex 192 168 1 2`
+dst_ip=`ip_to_hex 192 168 2 2`
+packet=${dst_mac}${src_mac}8100000108004500001c0000000040110000${src_ip}${dst_ip}0035111100080000
+as hv1 ovs-appctl netdev-dummy/receive vm1 $packet
+
+# expected packet at bar1
+src_mac="000000010204"
+dst_mac="f00000010207"
+packet=${dst_mac}${src_mac}8100000208004500001c000000003f110100${src_ip}${dst_ip}0035111100080000
+echo $packet > expected1
+OVN_CHECK_PACKETS([hv1/vm1-tx.pcap], [expected1])
+
+# Send ip packets between bar1 and bar3
+# (same switch, but one is a container and the other is a standalone VM)
+src_mac="f00000010207"
+dst_mac="f00000010209"
+src_ip=`ip_to_hex 192 168 2 2`
+dst_ip=`ip_to_hex 192 168 2 3`
+packet=${dst_mac}${src_mac}8100000208004500001c0000000040110000${src_ip}${dst_ip}0035111100080000
+as hv1 ovs-appctl netdev-dummy/receive vm1 $packet
+
+# expected packet at bar3
+packet=${dst_mac}${src_mac}08004500001c0000000040110000${src_ip}${dst_ip}0035111100080000
+echo $packet > expected
+OVN_CHECK_PACKETS([hv1/bar3-tx.pcap], [expected])
+
+# Send ip packets between foo1 and vm1.
+# (different switch, container to the VM hosting it.)
+src_mac="f00000010205"
+dst_mac="000000010203"
+src_ip=`ip_to_hex 192 168 1 2`
+dst_ip=`ip_to_hex 172 16 1 2`
+packet=${dst_mac}${src_mac}8100000108004500001c0000000040110000${src_ip}${dst_ip}0035111100080000
+as hv1 ovs-appctl netdev-dummy/receive vm1 $packet
+
+# expected packet at vm1
+src_mac="000000010202"
+dst_mac="f00000010203"
+packet=${dst_mac}${src_mac}08004500001c000000003f110100${src_ip}${dst_ip}0035111100080000
+echo $packet >> expected1
+OVN_CHECK_PACKETS([hv1/vm1-tx.pcap], [expected1])
+
+# Send packets from vm1 to bar1.
+# (different switch, a hosting VM to a container inside it)
+src_mac="f00000010203"
+dst_mac="000000010202"
+src_ip=`ip_to_hex 172 16 1 2`
+dst_ip=`ip_to_hex 192 168 2 2`
+packet=${dst_mac}${src_mac}08004500001c0000000040110000${src_ip}${dst_ip}0035111100080000
+as hv1 ovs-appctl netdev-dummy/receive vm1 $packet
+
+# expected packet at vm1
+src_mac="000000010204"
+dst_mac="f00000010207"
+packet=${dst_mac}${src_mac}8100000208004500001c000000003f110100${src_ip}${dst_ip}0035111100080000
+echo $packet >> expected1
+OVN_CHECK_PACKETS([hv1/vm1-tx.pcap], [expected1])
+
+OVN_CLEANUP([hv1],[hv2])
+
+AT_CLEANUP
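A note for reviewers who want to experiment with the packet construction outside the testsuite: the hex strings above are just Ethernet + 802.1Q + IPv4 + UDP laid out by hand. Below is a standalone sketch of that construction (assumptions: plain POSIX shell, and the MACs/IPs are the foo1-to-foo2 case from the test; this is not part of the patch itself):

```shell
# Convert a dotted-quad IP to 8 hex digits, as the test helper does.
ip_to_hex() {
    printf "%02x%02x%02x%02x" "$@"
}

src_mac=f00000010205              # foo1
dst_mac=f00000010206              # foo2
src_ip=$(ip_to_hex 192 168 1 2)   # 192.168.1.2 -> c0a80102
dst_ip=$(ip_to_hex 192 168 1 3)   # 192.168.1.3 -> c0a80103

# Layout: dst MAC | src MAC | 802.1Q TPID 0x8100 + VLAN 1 | EtherType 0x0800
# | 20-byte IPv4 header (proto 17, checksum left zero) | 8-byte UDP header.
packet=${dst_mac}${src_mac}8100000108004500001c0000000040110000${src_ip}${dst_ip}0035111100080000
echo "$packet"
```

The `8100 0001` tag is what distinguishes container traffic on the parent port: the patch's `nested_container` path in physical.c sets `MLF_ALLOW_LOOPBACK` for exactly these tagged packets so they may hairpin back to the same VM with a different tag.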