From patchwork Fri Oct 5 17:14:52 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Numan Siddique
X-Patchwork-Id: 979666
From: nusiddiq@redhat.com
To: dev@openvswitch.org
Date: Fri, 5 Oct 2018 22:44:52 +0530
Message-Id: <20181005171452.31582-1-nusiddiq@redhat.com>
In-Reply-To: <20181005171425.31441-1-nusiddiq@redhat.com>
References: <20181005171425.31441-1-nusiddiq@redhat.com>
Subject: [ovs-dev] [PATCH 1/2] ovn: Avoid tunneling for VLAN packets redirected to a gateway chassis

From: Numan Siddique

An OVN deployment can have multiple logical switches, each with a
localnet port, connected to a distributed logical router, with one
logical router port providing external connectivity (provider network)
and the others used as tenant networks with VLAN tagging.

As reported in [1], external traffic from these VLAN tenant networks is
tunnelled to the gateway chassis (the chassis hosting a distributed
gateway port, which applies NAT rules). As part of the discussion in
[1], a few possible solutions were proposed by Russell [2]. This patch
implements the first option in [2].

With this patch, a new option 'reside-on-redirect-chassis' is added to
the 'options' column of the Logical_Router_Port table. If this option is
set to 'true' and the logical router also has a distributed gateway
port, then routing for this logical router port is centralized on the
chassis hosting the distributed gateway port.

If a logical switch 'sw0' is connected to a router 'lr0' through the
router port 'lr0-sw0' with the address "00:00:00:00:af:12 192.168.1.1",
and the router has a distributed gateway port 'lr0-public', then the
logical flow below is added to the logical switch pipeline of 'sw0' when
the 'reside-on-redirect-chassis' option is set on 'lr0-sw0':

  table=16(ls_in_l2_lkup), priority=50, match=(eth.dst == 00:00:00:00:af:12 && is_chassis_resident("cr-lr0-public")), action=(outport = "sw0-lr0"; output;)

With the above flow, the packet doesn't enter the router pipeline in the
source chassis. Instead, the packet is sent out via the localnet port of
'sw0'. The gateway chassis, upon receiving this packet, runs the logical
router pipeline, applies the NAT rules and sends the traffic out via the
localnet port of the provider network. The gateway chassis will also
reply to the ARP requests for the router port IPs.

With this approach, we avoid redirecting the external traffic to the
gateway chassis via the tunnel port. There are a couple of drawbacks
with this approach:

  - East-West routing is no longer distributed for the VLAN tenant
    networks if the 'reside-on-redirect-chassis' option is set.

  - 'dnat_and_snat' NAT rules with the 'logical_mac' and 'logical_port'
    columns defined will not work for the VLAN tenant networks.

This approach is taken for now as it is simple. If there is a
requirement to support distributed routing for these VLAN tenant
networks, we can explore other possible solutions.

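For illustration only (this is not part of the patch): a minimal, self-contained C sketch of the condition under which the eth.dst lookup flow on the logical switch gets the is_chassis_resident() match, as the description above and the ovn-northd.c hunk in this patch suggest. The struct and its field names are hypothetical stand-ins for the real ovn-northd data structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, flattened view of one logical switch port of type "router"
 * and its peer logical router port.  These fields are illustrative
 * stand-ins, not the real ovn-northd structures. */
struct router_peer_view {
    bool router_has_gateway_port;    /* Router has a distributed gateway
                                      * port with a redirect chassis. */
    bool peer_is_gateway_port;       /* Peer is that gateway port itself. */
    bool switch_has_localnet_port;   /* Logical switch has a localnet port. */
    bool reside_on_redirect_chassis; /* options:reside-on-redirect-chassis. */
};

/* Returns true when the eth.dst lookup flow should also match on
 * is_chassis_resident() for the chassis-redirect port. */
bool
needs_chassis_resident_check(const struct router_peer_view *v)
{
    assert(v != NULL);

    if (!v->router_has_gateway_port || !v->switch_has_localnet_port) {
        return false;
    }
    if (v->peer_is_gateway_port) {
        /* The gateway port's MAC is only reachable on the gateway chassis. */
        return true;
    }
    /* Tenant VLAN router port: centralized only when the option is set. */
    return v->reside_on_redirect_chassis;
}
```

With 'reside-on-redirect-chassis' unset on a tenant router port the function returns false, which corresponds to the existing distributed behaviour.
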
[1] - https://mail.openvswitch.org/pipermail/ovs-discuss/2018-April/046543.html
[2] - https://mail.openvswitch.org/pipermail/ovs-discuss/2018-April/046557.html

Reported-at: https://mail.openvswitch.org/pipermail/ovs-discuss/2018-April/046543.html
Reported-by: venkata anil
Co-authored-by: venkata anil
Signed-off-by: Numan Siddique
Signed-off-by: venkata anil
Acked-by: Gurucharan Shetty
---
 ovn/northd/ovn-northd.8.xml |  30 ++++
 ovn/northd/ovn-northd.c     |  71 +++++++---
 ovn/ovn-architecture.7.xml  | 160 +++++++++++++++++++++
 ovn/ovn-nb.xml              |  43 ++++++
 tests/ovn.at                | 273 ++++++++++++++++++++++++++++++++++++
 5 files changed, 561 insertions(+), 16 deletions(-)

diff --git a/ovn/northd/ovn-northd.8.xml b/ovn/northd/ovn-northd.8.xml
index 7352c6764..f52699bd3 100644
--- a/ovn/northd/ovn-northd.8.xml
+++ b/ovn/northd/ovn-northd.8.xml
@@ -874,6 +874,25 @@ output; resident. + +

+ For the Ethernet address on a logical switch port of type + router, when that logical switch port's + column is set to router and + the connected logical router port specifies a + reside-on-redirect-chassis and the logical router + to which the connected logical router port belongs has a + redirect-chassis distributed gateway logical router + port: +

+ +
    +
  • + The flow for the connected logical router port's Ethernet + address is only programmed on the redirect-chassis. +
  • +
  • @@ -1179,6 +1198,17 @@ output; upstream MAC learning to point to the redirect-chassis.

    + +

    + For the logical router port with the option + reside-on-redirect-chassis set (which is centralized), + the above flows are only programmed on the gateway port instance on + the redirect-chassis (if the logical router has a + distributed gateway port). This behavior avoids generation + of multiple ARP responses from different chassis, and allows + upstream MAC learning to point to the + redirect-chassis. +

  •
diff --git a/ovn/northd/ovn-northd.c b/ovn/northd/ovn-northd.c
index 31ea5f410..3998a898c 100644
--- a/ovn/northd/ovn-northd.c
+++ b/ovn/northd/ovn-northd.c
@@ -4426,13 +4426,32 @@ build_lswitch_flows(struct hmap *datapaths, struct hmap *ports,
             ds_put_format(&match, "eth.dst == "ETH_ADDR_FMT,
                           ETH_ADDR_ARGS(mac));
             if (op->peer->od->l3dgw_port
-                && op->peer == op->peer->od->l3dgw_port
-                && op->peer->od->l3redirect_port) {
-                /* The destination lookup flow for the router's
-                 * distributed gateway port MAC address should only be
-                 * programmed on the "redirect-chassis". */
-                ds_put_format(&match, " && is_chassis_resident(%s)",
-                              op->peer->od->l3redirect_port->json_key);
+                && op->peer->od->l3redirect_port
+                && op->od->localnet_port) {
+                bool add_chassis_resident_check = false;
+                if (op->peer == op->peer->od->l3dgw_port) {
+                    /* The peer of this port represents a distributed
+                     * gateway port. The destination lookup flow for the
+                     * router's distributed gateway port MAC address should
+                     * only be programmed on the "redirect-chassis". */
+                    add_chassis_resident_check = true;
+                } else {
+                    /* Check if the option 'reside-on-redirect-chassis'
+                     * is set to true on the peer port. If set to true
+                     * and if the logical switch has a localnet port, it
+                     * means the router pipeline for the packets from
+                     * this logical switch should be run on the chassis
+                     * hosting the gateway port.
+                     */
+                    add_chassis_resident_check = smap_get_bool(
+                        &op->peer->nbrp->options,
+                        "reside-on-redirect-chassis", false);
+                }
+
+                if (add_chassis_resident_check) {
+                    ds_put_format(&match, " && is_chassis_resident(%s)",
+                                  op->peer->od->l3redirect_port->json_key);
+                }
             }
 
             ds_clear(&actions);
@@ -5197,15 +5216,35 @@ build_lrouter_flows(struct hmap *datapaths, struct hmap *ports,
                           op->lrp_networks.ipv4_addrs[i].network_s,
                           op->lrp_networks.ipv4_addrs[i].plen,
                           op->lrp_networks.ipv4_addrs[i].addr_s);
-            if (op->od->l3dgw_port && op == op->od->l3dgw_port
-                && op->od->l3redirect_port) {
-                /* Traffic with eth.src = l3dgw_port->lrp_networks.ea_s
-                 * should only be sent from the "redirect-chassis", so that
-                 * upstream MAC learning points to the "redirect-chassis".
-                 * Also need to avoid generation of multiple ARP responses
-                 * from different chassis. */
-                ds_put_format(&match, " && is_chassis_resident(%s)",
-                              op->od->l3redirect_port->json_key);
+
+            if (op->od->l3dgw_port && op->od->l3redirect_port && op->peer
+                && op->peer->od->localnet_port) {
+                bool add_chassis_resident_check = false;
+                if (op == op->od->l3dgw_port) {
+                    /* Traffic with eth.src = l3dgw_port->lrp_networks.ea_s
+                     * should only be sent from the "redirect-chassis", so that
+                     * upstream MAC learning points to the "redirect-chassis".
+                     * Also need to avoid generation of multiple ARP responses
+                     * from different chassis. */
+                    add_chassis_resident_check = true;
+                } else {
+                    /* Check if the option 'reside-on-redirect-chassis'
+                     * is set to true on the router port. If set to true
+                     * and if peer's logical switch has a localnet port, it
+                     * means the router pipeline for the packets from
+                     * peer's logical switch is to be run on the chassis
+                     * hosting the gateway port and it should reply to the
+                     * ARP requests for the router port IPs.
+                     */
+                    add_chassis_resident_check = smap_get_bool(
+                        &op->nbrp->options,
+                        "reside-on-redirect-chassis", false);
+                }
+
+                if (add_chassis_resident_check) {
+                    ds_put_format(&match, " && is_chassis_resident(%s)",
+                                  op->od->l3redirect_port->json_key);
+                }
             }
 
             ds_clear(&actions);
diff --git a/ovn/ovn-architecture.7.xml b/ovn/ovn-architecture.7.xml
index 6ed2cf132..998470c34 100644
--- a/ovn/ovn-architecture.7.xml
+++ b/ovn/ovn-architecture.7.xml
@@ -1372,6 +1372,166 @@ http://docs.openvswitch.org/en/latest/topics/high-availability.

    +

    Tenant VLAN networks connected to a Logical Router

    + +

    + It is possible to have multiple logical switches each with a localnet port + (representing physical networks) connected to a logical router in which one + may provide the external connectivity via a distributed gateway port and + the rest of them are used internally (with VLAN tagging). It is expected + that ovn-bridge-mappings is configured appropriately on the + chassis. +

    + +

    East West routing

    +

    + East-West routing between these tenant VLAN logical switches works almost + the same way as normal logical switches. When the VM sends such a packet, + then: +

    +
      +
    1. + The packet enters the ingress pipeline of the logical router datapath + via the logical router port in the source chassis. +
    2. + +
    3. + Routing decision is taken. +
    4. + +
    5. + The packet goes out of the integration bridge to the provider bridge ( + belonging to the destination logical switch) via the localnet port. +
    6. + +
    7. + The destination chassis receives the packet via the localnet port + and delivers to the destination VM. +
    8. +
    + +

    External traffic

    + +

    + The following happens when a VM sends external traffic (which requires + NATting) and the chassis hosting the VM doesn't have a distributed gateway + port. +

    + +
      +
    1. + The packet enters the ingress pipeline of the logical router datapath + via the logical router port in the source chassis. +
    2. + +
    3. + Routing decision is taken. Since the gateway router or the distributed + gateway port doesn't reside in the source chassis, the traffic is + redirected to the gateway chassis via the tunnel port. +
    4. + +
    5. + The gateway chassis receives the packet, applies the NAT rules and + forwards it via the localnet port. +
    6. +
    + +

    + Although this works, the VM traffic is tunnelled. In order for it to + work properly, the MTU of the VLAN tenant networks must be lowered to + account for the tunnel encapsulation. +

    + +

    Centralized routing for VLAN tenant networks

    + +

    + To overcome the tunnel encapsulation problem described in the previous + section, OVN supports the option of enabling centralized + routing for VLAN tenant networks. CMS can configure the option + to true for each + which connects to the + logical switch of the VLAN tenant network. This causes the gateway + chassis (hosting the distributed gateway port) to handle all the + routing for these networks, making it centralized. It will reply to + the ARP requests for the logical router port IPs. +

    + +

    + If the logical router doesn't have a distributed gateway port connecting + to the provider network, then this option is ignored by OVN. +

    + +

    + The following happens when a VM sends east-west traffic which needs to + be routed: +

    + +
      +
    1. + The packet from the VM enters the logical datapath pipeline of the source + VLAN network in the source chassis and is sent out via the localnet port + (instead of sending it to router pipeline). +
    2. + +
    3. + The packet enters the logical datapath pipeline of the source VLAN + network in the gateway chassis and is sent to the logical datapath + pipeline belonging to the logical router. +
    4. + +
    5. + Routing decision is taken. +
    6. + +
    7. + The packet enters the logical datapath pipeline of the destination + VLAN network. The packet is delivered to the destination VM if it resides + in the same chassis. Otherwise the packet is sent out via the localnet + port of the destination VLAN network. +
    8. + +
    9. + The destination chassis receives the packet via the localnet port + and delivers to the destination VM. +
    10. +
    + +

    + The following happens when a VM sends external traffic which requires + NATting: +

    + +
      +
    1. + The packet from the VM enters the logical datapath pipeline of the source + VLAN network in the source chassis and is sent out via the localnet port + (instead of sending it to router pipeline). +
    2. + +
    3. + The packet enters the logical datapath pipeline of the source VLAN + network in the gateway chassis and is sent to the logical datapath + pipeline belonging to the logical router. +
    4. + +
    5. + Routing decision is taken and NAT rules are applied. +
    6. + +
    7. + The packet enters the logical datapath pipeline of the provider network + and is sent out via the localnet port of the provider network. +
    8. +
    + +

    + For the reverse external traffic, the gateway chassis applies the unNATting + rules and sends the packet via the localnet port of the VLAN tenant + network and the destination chassis receives the packet and delivers to + the VM. +

    +

    Life Cycle of a VTEP gateway

diff --git a/ovn/ovn-nb.xml b/ovn/ovn-nb.xml
index 8564ed39c..13ae56e13 100644
--- a/ovn/ovn-nb.xml
+++ b/ovn/ovn-nb.xml
@@ -1635,6 +1635,49 @@ chassis to enable high availability.

    + + +

    + Generally routing is distributed in OVN. The packet + from a logical port which needs to be routed hits the router pipeline + in the source chassis. For the East-West traffic, the packet is + sent directly to the destination chassis. For the outside traffic + the packet is sent to the gateway chassis. +

    + +

    + When this option is set, OVN considers this only if +

    + +
      +
    • + The logical router to which this logical router port belongs + has a distributed gateway port. +
    • + +
    • + The peer's logical switch has a localnet port (representing + a tenant VLAN network) +
    • +
    + +

    + When this option is set to true, then the packet + which needs to be routed hits the router pipeline in the chassis + hosting the distributed gateway router port. The source chassis + pushes out this traffic via the localnet port. With this the + East-West traffic is no longer distributed and will always go through + the gateway chassis. +

    + +

    + Without this option set, for any traffic destined to outside from a + logical port which belongs to a logical switch with localnet port, + the source chassis will send the traffic to the gateway chassis via + the tunnel port instead of the localnet port and this could cause MTU + issues. +

    +
diff --git a/tests/ovn.at b/tests/ovn.at
index 769e09f81..504ba228d 100644
--- a/tests/ovn.at
+++ b/tests/ovn.at
@@ -8537,6 +8537,279 @@
 OVN_CLEANUP([hv1],[hv2],[hv3])
 AT_CLEANUP
 
+# VLAN traffic for external network redirected through distributed router
+# gateway port should use vlans(i.e input network vlan tag) across hypervisors
+# instead of tunneling.
+AT_SETUP([ovn -- vlan traffic for external network with distributed router gateway port])
+AT_SKIP_IF([test $HAVE_PYTHON = no])
+ovn_start
+
+# Logical network:
+# # One LR R1 that has switches foo (192.168.1.0/24) and
+# # alice (172.16.1.0/24) connected to it. The logical port
+# # between R1 and alice has a "redirect-chassis" specified,
+# # i.e. it is the distributed router gateway port(172.16.1.6).
+# # Switch alice also has a localnet port defined.
+# # An additional switch outside has the same subnet as alice
+# # (172.16.1.0/24), a localnet port and nexthop port(172.16.1.1)
+# # which will receive the packet destined for external network
+# # (i.e 8.8.8.8 as destination ip).
+
+# Physical network:
+# # Three hypervisors hv[123].
+# # hv1 hosts vif foo1.
+# # hv2 is the "redirect-chassis" that hosts the distributed router gateway port.
+# # hv3 hosts nexthop port vif outside1.
+# # All other tests connect hypervisors to network n1 through br-phys for tunneling.
+# # But in this test, hv1 won't connect to n1(and no br-phys in hv1), and
+# # in order to show vlans(instead of tunneling) used between hv1 and hv2,
+# # a new network n2 created and hv1 and hv2 connected to this network through br-ex.
+# # hv2 and hv3 are still connected to n1 network through br-phys.
+net_add n1
+
+# We are not calling ovn_attach for hv1, to avoid adding br-phys.
+# Tunneling won't work in hv1 as ovn-encap-ip is not added to any bridge in hv1
+sim_add hv1
+as hv1
+ovs-vsctl \
+    -- set Open_vSwitch . external-ids:system-id=hv1 \
+    -- set Open_vSwitch . external-ids:ovn-remote=unix:$ovs_base/ovn-sb/ovn-sb.sock \
+    -- set Open_vSwitch . external-ids:ovn-encap-type=geneve,vxlan \
+    -- set Open_vSwitch . external-ids:ovn-encap-ip=192.168.0.1 \
+    -- add-br br-int \
+    -- set bridge br-int fail-mode=secure other-config:disable-in-band=true \
+    -- set Open_vSwitch . external-ids:ovn-bridge-mappings=public:br-ex
+
+start_daemon ovn-controller
+ovs-vsctl -- add-port br-int hv1-vif1 -- \
+    set interface hv1-vif1 external-ids:iface-id=foo1 \
+    ofport-request=1
+
+sim_add hv2
+as hv2
+ovs-vsctl add-br br-phys
+ovn_attach n1 br-phys 192.168.0.2
+ovs-vsctl set Open_vSwitch . external-ids:ovn-bridge-mappings="public:br-ex,phys:br-phys"
+
+sim_add hv3
+as hv3
+ovs-vsctl add-br br-phys
+ovn_attach n1 br-phys 192.168.0.3
+ovs-vsctl -- add-port br-int hv3-vif1 -- \
+    set interface hv3-vif1 external-ids:iface-id=outside1 \
+    options:tx_pcap=hv3/vif1-tx.pcap \
+    options:rxq_pcap=hv3/vif1-rx.pcap \
+    ofport-request=1
+ovs-vsctl set Open_vSwitch . external-ids:ovn-bridge-mappings="phys:br-phys"
+
+# Create network n2 for vlan connectivity between hv1 and hv2
+net_add n2
+
+as hv1
+ovs-vsctl add-br br-ex
+net_attach n2 br-ex
+
+as hv2
+ovs-vsctl add-br br-ex
+net_attach n2 br-ex
+
+OVN_POPULATE_ARP
+
+ovn-nbctl create Logical_Router name=R1
+
+ovn-nbctl ls-add foo
+ovn-nbctl ls-add alice
+ovn-nbctl ls-add outside
+
+# Connect foo to R1
+ovn-nbctl lrp-add R1 foo 00:00:01:01:02:03 192.168.1.1/24
+ovn-nbctl lsp-add foo rp-foo -- set Logical_Switch_Port rp-foo \
+    type=router options:router-port=foo \
+    -- lsp-set-addresses rp-foo router
+
+# Connect alice to R1 as distributed router gateway port (172.16.1.6) on hv2
+ovn-nbctl lrp-add R1 alice 00:00:02:01:02:03 172.16.1.6/24 \
+    -- set Logical_Router_Port alice options:redirect-chassis="hv2"
+ovn-nbctl lsp-add alice rp-alice -- set Logical_Switch_Port rp-alice \
+    type=router options:router-port=alice \
+    -- lsp-set-addresses rp-alice router \
+
+# Create logical port foo1 in foo
+ovn-nbctl lsp-add foo foo1 \
+-- lsp-set-addresses foo1 "f0:00:00:01:02:03 192.168.1.2"
+
+# Create logical port outside1 in outside, which is a nexthop address
+# for 172.16.1.0/24
+ovn-nbctl lsp-add outside outside1 \
+-- lsp-set-addresses outside1 "f0:00:00:01:02:04 172.16.1.1"
+
+# Set default gateway (nexthop) to 172.16.1.1
+ovn-nbctl lr-route-add R1 "0.0.0.0/0" 172.16.1.1 alice
+AT_CHECK([ovn-nbctl lr-nat-add R1 snat 172.16.1.6 192.168.1.1/24])
+ovn-nbctl set Logical_Switch_Port rp-alice options:nat-addresses=router
+
+ovn-nbctl lsp-add foo ln-foo
+ovn-nbctl lsp-set-addresses ln-foo unknown
+ovn-nbctl lsp-set-options ln-foo network_name=public
+ovn-nbctl lsp-set-type ln-foo localnet
+AT_CHECK([ovn-nbctl set Logical_Switch_Port ln-foo tag=2])
+
+# Create localnet port in alice
+ovn-nbctl lsp-add alice ln-alice
+ovn-nbctl lsp-set-addresses ln-alice unknown
+ovn-nbctl lsp-set-type ln-alice localnet
+ovn-nbctl lsp-set-options ln-alice network_name=phys
+
+# Create localnet port in outside
+ovn-nbctl lsp-add outside ln-outside
+ovn-nbctl lsp-set-addresses ln-outside unknown
+ovn-nbctl lsp-set-type ln-outside localnet
+ovn-nbctl lsp-set-options ln-outside network_name=phys
+
+# Allow some time for ovn-northd and ovn-controller to catch up.
+# XXX This should be more systematic.
+ovn-nbctl --wait=hv --timeout=3 sync
+
+# Check that there is a logical flow in logical switch foo's pipeline
+# to set the outport to rp-foo (which is expected).
+OVS_WAIT_UNTIL([test 1 = `ovn-sbctl dump-flows foo | grep ls_in_l2_lkup | \
+grep rp-foo | grep -v is_chassis_resident | wc -l`])
+
+# Set the option 'reside-on-redirect-chassis' for foo
+ovn-nbctl set logical_router_port foo options:reside-on-redirect-chassis=true
+# Check that there is a logical flow in logical switch foo's pipeline
+# to set the outport to rp-foo with the condition is_chassis_resident.
+ovn-sbctl dump-flows foo
+OVS_WAIT_UNTIL([test 1 = `ovn-sbctl dump-flows foo | grep ls_in_l2_lkup | \
+grep rp-foo | grep is_chassis_resident | wc -l`])
+
+echo "---------NB dump-----"
+ovn-nbctl show
+echo "---------------------"
+ovn-nbctl list logical_router
+echo "---------------------"
+ovn-nbctl list nat
+echo "---------------------"
+ovn-nbctl list logical_router_port
+echo "---------------------"
+
+echo "---------SB dump-----"
+ovn-sbctl list datapath_binding
+echo "---------------------"
+ovn-sbctl list port_binding
+echo "---------------------"
+ovn-sbctl dump-flows
+echo "---------------------"
+ovn-sbctl list chassis
+echo "---------------------"
+
+for chassis in hv1 hv2 hv3; do
+    as $chassis
+    echo "------ $chassis dump ----------"
+    ovs-vsctl show br-int
+    ovs-ofctl show br-int
+    ovs-ofctl dump-flows br-int
+    echo "--------------------------"
+done
+
+ip_to_hex() {
+    printf "%02x%02x%02x%02x" "$@"
+}
+
+foo1_ip=$(ip_to_hex 192 168 1 2)
+gw_ip=$(ip_to_hex 172 16 1 6)
+dst_ip=$(ip_to_hex 8 8 8 8)
+nexthop_ip=$(ip_to_hex 172 16 1 1)
+
+foo1_mac="f00000010203"
+foo_mac="000001010203"
+gw_mac="000002010203"
+nexthop_mac="f00000010204"
+
+# Send ip packet from foo1 to 8.8.8.8
+src_mac="f00000010203"
+dst_mac="000001010203"
+packet=${foo_mac}${foo1_mac}08004500001c0000000040110000${foo1_ip}${dst_ip}0035111100080000
+
+as hv1 ovs-appctl netdev-dummy/receive hv1-vif1 $packet
+sleep 2
+
+# ARP request packet for nexthop_ip to expect at outside1
+arp_request=ffffffffffff${gw_mac}08060001080006040001${gw_mac}${gw_ip}000000000000${nexthop_ip}
+echo $arp_request >> hv3-vif1.expected
+cat hv3-vif1.expected > expout
+$PYTHON "$top_srcdir/utilities/ovs-pcap.in" hv3/vif1-tx.pcap | grep ${nexthop_ip} | uniq > hv3-vif1
+AT_CHECK([sort hv3-vif1], [0], [expout])
+
+# Send ARP reply from outside1 back to the router
+reply_mac="f00000010204"
+arp_reply=${gw_mac}${nexthop_mac}08060001080006040002${nexthop_mac}${nexthop_ip}${gw_mac}${gw_ip}
+
+as hv3 ovs-appctl netdev-dummy/receive hv3-vif1 $arp_reply
+OVS_WAIT_UNTIL([
+    test `as hv2 ovs-ofctl dump-flows br-int | grep table=66 | \
+grep actions=mod_dl_dst:f0:00:00:01:02:04 | wc -l` -eq 1
+    ])
+
+# VLAN tagged packet with router port(192.168.1.1) MAC as destination MAC
+# is expected on bridge connecting hv1 and hv2
+expected=${foo_mac}${foo1_mac}8100000208004500001c0000000040110000${foo1_ip}${dst_ip}0035111100080000
+echo $expected > hv1-br-ex_n2.expected
+
+# Packet to Expect at outside1 i.e nexthop(172.16.1.1) port.
+# As connection tracking not enabled for this test, snat can't be done on the packet.
+# We still see foo1 as the source ip address. But source mac(gateway MAC) and
+# dest mac(nexthop mac) are properly configured.
+expected=${nexthop_mac}${gw_mac}08004500001c000000003f110100${foo1_ip}${dst_ip}0035111100080000
+echo $expected > hv3-vif1.expected
+
+reset_pcap_file() {
+    local iface=$1
+    local pcap_file=$2
+    ovs-vsctl -- set Interface $iface options:tx_pcap=dummy-tx.pcap \
+options:rxq_pcap=dummy-rx.pcap
+    rm -f ${pcap_file}*.pcap
+    ovs-vsctl -- set Interface $iface options:tx_pcap=${pcap_file}-tx.pcap \
+options:rxq_pcap=${pcap_file}-rx.pcap
+}
+
+as hv1 reset_pcap_file br-ex_n2 hv1/br-ex_n2
+as hv3 reset_pcap_file hv3-vif1 hv3/vif1
+sleep 2
+as hv1 ovs-appctl netdev-dummy/receive hv1-vif1 $packet
+sleep 2
+
+# On hv1, the packet should not go from vlan switch pipeline to router
+# pipeline
+as hv1 ovs-ofctl dump-flows br-int
+
+AT_CHECK([as hv1 ovs-ofctl dump-flows br-int table=65 | grep "priority=100,reg15=0x1,metadata=0x2" \
+| grep actions=clone | grep -v n_packets=0 | wc -l], [0], [[0
+]])
+
+# On hv1, table 32 check that no packet goes via the tunnel port
+AT_CHECK([as hv1 ovs-ofctl dump-flows br-int table=32 \
+| grep "NXM_NX_TUN_ID" | grep -v n_packets=0 | wc -l], [0], [[0
+]])
+
+ip_packet() {
+    grep "1010203f00000010203"
+}
+
+# Check vlan tagged packet on the bridge connecting hv1 and hv2 with the
+# foo1's mac.
+$PYTHON "$top_srcdir/utilities/ovs-pcap.in" hv1/br-ex_n2-tx.pcap | ip_packet | uniq > hv1-br-ex_n2
+cat hv1-br-ex_n2.expected > expout
+AT_CHECK([sort hv1-br-ex_n2], [0], [expout])
+
+# Check expected packet on nexthop interface
+$PYTHON "$top_srcdir/utilities/ovs-pcap.in" hv3/vif1-tx.pcap | grep ${foo1_ip}${dst_ip} | uniq > hv3-vif1
+cat hv3-vif1.expected > expout
+AT_CHECK([sort hv3-vif1], [0], [expout])
+
+OVN_CLEANUP([hv1],[hv2],[hv3])
+AT_CLEANUP
+
 AT_SETUP([ovn -- IPv6 ND Router Solicitation responder])
 AT_KEYWORDS([ovn-nd_ra])
 AT_SKIP_IF([test $HAVE_PYTHON = no])

From patchwork Fri Oct 5 17:15:10 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Numan Siddique
X-Patchwork-Id: 979668
From: nusiddiq@redhat.com
To: dev@openvswitch.org
Date: Fri, 5 Oct 2018 22:45:10 +0530
Message-Id: <20181005171510.31709-1-nusiddiq@redhat.com>
In-Reply-To: <20181005171425.31441-1-nusiddiq@redhat.com>
References: <20181005171425.31441-1-nusiddiq@redhat.com>
Subject: [ovs-dev] [PATCH 2/2] ovn: Support a new Logical_Switch_Port.type - 'external'

From: Numan Siddique

In the case of OpenStack + OVN, when the VMs are booted on hypervisors
supporting SR-IOV NICs, there are no OVS ports for these VMs. When these
VMs send DHCPv4, DHCPv6 or IPv6 Router Solicitation requests, the local
ovn-controller cannot reply to these packets. The OpenStack Neutron DHCP
agent service needs to be run to serve these requests.

With the new logical port type - 'external', OVN itself can handle these
requests, avoiding the need to deploy any external services like the
Neutron DHCP agent.

To make use of this feature, CMS has to
 - create a logical port for such VMs
 - set the type to 'external'
 - set requested-chassis="" in the options column.
 - create a localnet port for the logical switch
 - configure the ovn-bridge-mappings option in the OVS db.

When the ovn-controller running in that 'chassis' detects the
Port_Binding row, it adds the necessary DHCPv4/v6 OF flows. Since the
packet enters the logical switch pipeline via the localnet port, the
inport register (reg14) is set to the tunnel key of the localnet port in
the match conditions.

In case the chassis goes down for some reason, it is the responsibility
of CMS to change the 'requested-chassis' option to some other active
chassis, so that it can serve these requests.

When the VM with the external port sends an ARP request for the router
IPs, only the chassis which has claimed the port will reply to the ARP
requests. The rest of the chassis, on receiving these packets, drop them
in the ingress switch datapath stage S_SWITCH_IN_EXTERNAL_PORT, which is
just before S_SWITCH_IN_L2_LKUP. This guarantees that only the chassis
which has claimed the external ports will run the router datapath
pipeline.

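For illustration only (this is not part of the patch): a minimal, self-contained C sketch of how a chassis could decide whether it is the one that should serve an 'external' port, modelled on the 'requested-chassis' comparison in the binding.c hunk of this patch. The function name and its plain-string parameters are hypothetical; the real code reads these values from the Port_Binding and Chassis records.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Returns true when the 'requested-chassis' option of an external port
 * names this chassis, either by chassis name or by hostname.
 * 'requested_chassis' may be NULL when the option is not set. */
bool
external_port_is_ours(const char *requested_chassis,
                      const char *chassis_name,
                      const char *chassis_hostname)
{
    assert(chassis_name != NULL && chassis_hostname != NULL);

    return requested_chassis != NULL
           && (!strcmp(requested_chassis, chassis_name)
               || !strcmp(requested_chassis, chassis_hostname));
}
```

A chassis for which this returns false would not claim the port, which matches the description above: only the claiming chassis replies to ARP requests and runs the router pipeline.
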
Signed-off-by: Numan Siddique --- ovn/controller/binding.c | 15 +- ovn/controller/lflow.c | 41 ++- ovn/controller/lflow.h | 2 + ovn/controller/lport.c | 26 ++ ovn/controller/lport.h | 5 + ovn/controller/ovn-controller.c | 6 + ovn/lib/ovn-util.c | 1 + ovn/northd/ovn-northd.8.xml | 52 +++- ovn/northd/ovn-northd.c | 123 ++++++-- ovn/ovn-architecture.7.xml | 66 ++++ ovn/ovn-nb.xml | 33 ++ tests/ovn.at | 530 +++++++++++++++++++++++++++++++- 12 files changed, 868 insertions(+), 32 deletions(-) diff --git a/ovn/controller/binding.c b/ovn/controller/binding.c index 021ecddcf..ee396c93d 100644 --- a/ovn/controller/binding.c +++ b/ovn/controller/binding.c @@ -471,13 +471,26 @@ consider_local_datapath(struct ovsdb_idl_txn *ovnsb_idl_txn, * for them. */ sset_add(local_lports, binding_rec->logical_port); our_chassis = false; + } else if (!strcmp(binding_rec->type, "external")) { + const char *chassis_id = smap_get(&binding_rec->options, + "requested-chassis"); + our_chassis = chassis_id && ( + !strcmp(chassis_id, chassis_rec->name) || + !strcmp(chassis_id, chassis_rec->hostname)); + if (our_chassis) { + add_local_datapath(sbrec_datapath_binding_by_key, + sbrec_port_binding_by_datapath, + sbrec_port_binding_by_name, + binding_rec->datapath, true, local_datapaths); + } } if (our_chassis || !strcmp(binding_rec->type, "patch") || !strcmp(binding_rec->type, "localport") || !strcmp(binding_rec->type, "vtep") - || !strcmp(binding_rec->type, "localnet")) { + || !strcmp(binding_rec->type, "localnet") + || !strcmp(binding_rec->type, "external")) { update_local_lport_ids(local_lport_ids, binding_rec); } diff --git a/ovn/controller/lflow.c b/ovn/controller/lflow.c index 8db81927e..98e8ed3b9 100644 --- a/ovn/controller/lflow.c +++ b/ovn/controller/lflow.c @@ -52,7 +52,10 @@ lflow_init(void) struct lookup_port_aux { struct ovsdb_idl_index *sbrec_multicast_group_by_name_datapath; struct ovsdb_idl_index *sbrec_port_binding_by_name; + struct ovsdb_idl_index *sbrec_port_binding_by_type; + struct 
ovsdb_idl_index *sbrec_datapath_binding_by_key; const struct sbrec_datapath_binding *dp; + const struct sbrec_chassis *chassis; }; struct condition_aux { @@ -66,6 +69,8 @@ static void consider_logical_flow( struct ovsdb_idl_index *sbrec_chassis_by_name, struct ovsdb_idl_index *sbrec_multicast_group_by_name_datapath, struct ovsdb_idl_index *sbrec_port_binding_by_name, + struct ovsdb_idl_index *sbrec_port_binding_by_type, + struct ovsdb_idl_index *sbrec_datapath_binding_by_key, const struct sbrec_logical_flow *, const struct hmap *local_datapaths, const struct sbrec_chassis *, @@ -89,8 +94,24 @@ lookup_port_cb(const void *aux_, const char *port_name, unsigned int *portp) const struct sbrec_port_binding *pb = lport_lookup_by_name(aux->sbrec_port_binding_by_name, port_name); if (pb && pb->datapath == aux->dp) { - *portp = pb->tunnel_key; - return true; + if (strcmp(pb->type, "external")) { + *portp = pb->tunnel_key; + return true; + } + const char *chassis_id = smap_get(&pb->options, + "requested-chassis"); + if (chassis_id && (!strcmp(chassis_id, aux->chassis->name) || + !strcmp(chassis_id, aux->chassis->hostname))) { + const struct sbrec_port_binding *localnet_pb + = lport_lookup_by_type(aux->sbrec_datapath_binding_by_key, + aux->sbrec_port_binding_by_type, + aux->dp->tunnel_key, "localnet"); + if (localnet_pb) { + *portp = localnet_pb->tunnel_key; + return true; + } + } + return false; } const struct sbrec_multicast_group *mg = mcgroup_lookup_by_dp_name( @@ -144,6 +165,8 @@ add_logical_flows( struct ovsdb_idl_index *sbrec_chassis_by_name, struct ovsdb_idl_index *sbrec_multicast_group_by_name_datapath, struct ovsdb_idl_index *sbrec_port_binding_by_name, + struct ovsdb_idl_index *sbrec_port_binding_by_type, + struct ovsdb_idl_index *sbrec_datapath_binding_by_key, const struct sbrec_dhcp_options_table *dhcp_options_table, const struct sbrec_dhcpv6_options_table *dhcpv6_options_table, const struct sbrec_logical_flow_table *logical_flow_table, @@ -183,6 +206,8 @@ 
add_logical_flows( consider_logical_flow(sbrec_chassis_by_name, sbrec_multicast_group_by_name_datapath, sbrec_port_binding_by_name, + sbrec_port_binding_by_type, + sbrec_datapath_binding_by_key, lflow, local_datapaths, chassis, &dhcp_opts, &dhcpv6_opts, &nd_ra_opts, addr_sets, port_groups, active_tunnels, @@ -200,6 +225,8 @@ consider_logical_flow( struct ovsdb_idl_index *sbrec_chassis_by_name, struct ovsdb_idl_index *sbrec_multicast_group_by_name_datapath, struct ovsdb_idl_index *sbrec_port_binding_by_name, + struct ovsdb_idl_index *sbrec_port_binding_by_type, + struct ovsdb_idl_index *sbrec_datapath_binding_by_key, const struct sbrec_logical_flow *lflow, const struct hmap *local_datapaths, const struct sbrec_chassis *chassis, @@ -292,7 +319,10 @@ consider_logical_flow( .sbrec_multicast_group_by_name_datapath = sbrec_multicast_group_by_name_datapath, .sbrec_port_binding_by_name = sbrec_port_binding_by_name, - .dp = lflow->logical_datapath + .sbrec_port_binding_by_type = sbrec_port_binding_by_type, + .sbrec_datapath_binding_by_key = sbrec_datapath_binding_by_key, + .dp = lflow->logical_datapath, + .chassis = chassis }; struct condition_aux cond_aux = { .sbrec_chassis_by_name = sbrec_chassis_by_name, @@ -463,6 +493,8 @@ void lflow_run(struct ovsdb_idl_index *sbrec_chassis_by_name, struct ovsdb_idl_index *sbrec_multicast_group_by_name_datapath, struct ovsdb_idl_index *sbrec_port_binding_by_name, + struct ovsdb_idl_index *sbrec_port_binding_by_type, + struct ovsdb_idl_index *sbrec_datapath_binding_by_key, const struct sbrec_dhcp_options_table *dhcp_options_table, const struct sbrec_dhcpv6_options_table *dhcpv6_options_table, const struct sbrec_logical_flow_table *logical_flow_table, @@ -481,7 +513,8 @@ lflow_run(struct ovsdb_idl_index *sbrec_chassis_by_name, add_logical_flows(sbrec_chassis_by_name, sbrec_multicast_group_by_name_datapath, - sbrec_port_binding_by_name, dhcp_options_table, + sbrec_port_binding_by_name, sbrec_port_binding_by_type, + 
sbrec_datapath_binding_by_key, dhcp_options_table, dhcpv6_options_table, logical_flow_table, local_datapaths, chassis, addr_sets, port_groups, active_tunnels, local_lport_ids, flow_table, group_table, diff --git a/ovn/controller/lflow.h b/ovn/controller/lflow.h index d19338140..b2911e0eb 100644 --- a/ovn/controller/lflow.h +++ b/ovn/controller/lflow.h @@ -68,6 +68,8 @@ void lflow_init(void); void lflow_run(struct ovsdb_idl_index *sbrec_chassis_by_name, struct ovsdb_idl_index *sbrec_multicast_group_by_name_datapath, struct ovsdb_idl_index *sbrec_port_binding_by_name, + struct ovsdb_idl_index *sbrec_port_binding_by_type, + struct ovsdb_idl_index *sbrec_datapath_binding_by_key, const struct sbrec_dhcp_options_table *, const struct sbrec_dhcpv6_options_table *, const struct sbrec_logical_flow_table *, diff --git a/ovn/controller/lport.c b/ovn/controller/lport.c index cc5c5fbb2..9c827d9b0 100644 --- a/ovn/controller/lport.c +++ b/ovn/controller/lport.c @@ -64,6 +64,32 @@ lport_lookup_by_key(struct ovsdb_idl_index *sbrec_datapath_binding_by_key, return retval; } +const struct sbrec_port_binding * +lport_lookup_by_type(struct ovsdb_idl_index *sbrec_datapath_binding_by_key, + struct ovsdb_idl_index *sbrec_port_binding_by_type, + uint64_t dp_key, const char *port_type) +{ + /* Lookup datapath corresponding to dp_key. */ + const struct sbrec_datapath_binding *db = datapath_lookup_by_key( + sbrec_datapath_binding_by_key, dp_key); + if (!db) { + return NULL; + } + + /* Build key for an indexed lookup. 
*/ + struct sbrec_port_binding *pb = sbrec_port_binding_index_init_row( + sbrec_port_binding_by_type); + sbrec_port_binding_index_set_datapath(pb, db); + sbrec_port_binding_index_set_type(pb, port_type); + + const struct sbrec_port_binding *retval = sbrec_port_binding_index_find( + sbrec_port_binding_by_type, pb); + + sbrec_port_binding_index_destroy_row(pb); + + return retval; +} + const struct sbrec_datapath_binding * datapath_lookup_by_key(struct ovsdb_idl_index *sbrec_datapath_binding_by_key, uint64_t dp_key) diff --git a/ovn/controller/lport.h b/ovn/controller/lport.h index 7dcd5bee0..2d49792f6 100644 --- a/ovn/controller/lport.h +++ b/ovn/controller/lport.h @@ -42,6 +42,11 @@ const struct sbrec_port_binding *lport_lookup_by_key( struct ovsdb_idl_index *sbrec_port_binding_by_key, uint64_t dp_key, uint64_t port_key); +const struct sbrec_port_binding *lport_lookup_by_type( + struct ovsdb_idl_index *sbrec_datapath_binding_by_key, + struct ovsdb_idl_index *sbrec_port_binding_by_type, + uint64_t dp_key, const char *port_type); + const struct sbrec_datapath_binding *datapath_lookup_by_key( struct ovsdb_idl_index *sbrec_datapath_binding_by_key, uint64_t dp_key); diff --git a/ovn/controller/ovn-controller.c b/ovn/controller/ovn-controller.c index f46156021..60f305cd6 100644 --- a/ovn/controller/ovn-controller.c +++ b/ovn/controller/ovn-controller.c @@ -145,6 +145,7 @@ update_sb_monitors(struct ovsdb_idl *ovnsb_idl, * ports that have a Gateway_Chassis that point's to our own * chassis */ sbrec_port_binding_add_clause_type(&pb, OVSDB_F_EQ, "chassisredirect"); + sbrec_port_binding_add_clause_type(&pb, OVSDB_F_EQ, "external"); if (chassis) { /* This should be mostly redundant with the other clauses for port * bindings, but it allows us to catch any ports that are assigned to @@ -616,6 +617,9 @@ main(int argc, char *argv[]) struct ovsdb_idl_index *sbrec_port_binding_by_datapath = ovsdb_idl_index_create1(ovnsb_idl_loop.idl, &sbrec_port_binding_col_datapath); + struct 
ovsdb_idl_index *sbrec_port_binding_by_type + = ovsdb_idl_index_create1(ovnsb_idl_loop.idl, + &sbrec_port_binding_col_type); struct ovsdb_idl_index *sbrec_datapath_binding_by_key = ovsdb_idl_index_create1(ovnsb_idl_loop.idl, &sbrec_datapath_binding_col_tunnel_key); @@ -742,6 +746,8 @@ main(int argc, char *argv[]) sbrec_chassis_by_name, sbrec_multicast_group_by_name_datapath, sbrec_port_binding_by_name, + sbrec_port_binding_by_type, + sbrec_datapath_binding_by_key, sbrec_dhcp_options_table_get(ovnsb_idl_loop.idl), sbrec_dhcpv6_options_table_get(ovnsb_idl_loop.idl), sbrec_logical_flow_table_get(ovnsb_idl_loop.idl), diff --git a/ovn/lib/ovn-util.c b/ovn/lib/ovn-util.c index e9464e926..0e4439c5d 100644 --- a/ovn/lib/ovn-util.c +++ b/ovn/lib/ovn-util.c @@ -311,6 +311,7 @@ static const char *OVN_NB_LSP_TYPES[] = { "localport", "router", "vtep", + "external", }; bool diff --git a/ovn/northd/ovn-northd.8.xml b/ovn/northd/ovn-northd.8.xml index f52699bd3..c133969ed 100644 --- a/ovn/northd/ovn-northd.8.xml +++ b/ovn/northd/ovn-northd.8.xml @@ -113,6 +113,26 @@ logical ports on which port security is not enabled, these advance all packets that match the inport.
  • + +
  • +

    + For each logical port of type external with port + security enabled whose logical switch has a + localnet port, a priority 90 flow is added which matches on the + inport of the localnet port and the valid + eth.src address(es) of the external + logical port. The flow sets the REGBIT_EXT_PORT_NOT_RESIDENT + flag if the logical port doesn't reside on a chassis and advances the + packet to the next flow table. +

    + +

    + The chassis where the external port resides doesn't + add the above flow. The priority 50 flow with the match on the + inport of the localnet port takes care of forwarding + the packet to the next flow table. +

    +
  • @@ -626,7 +646,8 @@ nd_na_router {

    This table adds the DHCPv4 options to a DHCPv4 packet from the logical ports configured with IPv4 address(es) and DHCPv4 options, - and similarly for DHCPv6 options. + and similarly for DHCPv6 options. This table also adds flows for the + logical ports of type external.

      @@ -827,7 +848,34 @@ output;
    -

    Ingress Table 16 Destination Lookup

    +

    Ingress table 16 External ports

    + +

    + Traffic from the external logical ports enters the ingress + datapath pipeline via the localnet port. This table adds the + logical flows below to handle the traffic from these ports. +

    + +
      +
    • +

      + For each router port IP address A which belongs to the + logical switch, a priority-100 flow is added which matches + REGBIT_EXT_PORT_NOT_RESIDENT && arp.tpa == A + && arp.op == 1 (ARP request to the router + IP) with the action to drop the packet. +

      + +

      + These flows guarantee that an ARP/NS request to the router IP + address from the external ports is answered only by the chassis + which has claimed these external ports. All the other chassis + drop these packets. +

      +
    • +
    + +

    Ingress Table 17 Destination Lookup

    This table implements switching behavior. It contains these logical diff --git a/ovn/northd/ovn-northd.c b/ovn/northd/ovn-northd.c index 3998a898c..42832dbc7 100644 --- a/ovn/northd/ovn-northd.c +++ b/ovn/northd/ovn-northd.c @@ -118,7 +118,8 @@ enum ovn_stage { PIPELINE_STAGE(SWITCH, IN, DHCP_RESPONSE, 13, "ls_in_dhcp_response") \ PIPELINE_STAGE(SWITCH, IN, DNS_LOOKUP, 14, "ls_in_dns_lookup") \ PIPELINE_STAGE(SWITCH, IN, DNS_RESPONSE, 15, "ls_in_dns_response") \ - PIPELINE_STAGE(SWITCH, IN, L2_LKUP, 16, "ls_in_l2_lkup") \ + PIPELINE_STAGE(SWITCH, IN, EXTERNAL_PORT, 16, "ls_in_external_port") \ + PIPELINE_STAGE(SWITCH, IN, L2_LKUP, 17, "ls_in_l2_lkup") \ \ /* Logical switch egress stages. */ \ PIPELINE_STAGE(SWITCH, OUT, PRE_LB, 0, "ls_out_pre_lb") \ @@ -165,12 +166,13 @@ enum ovn_stage { #define OVN_ACL_PRI_OFFSET 1000 /* Register definitions specific to switches. */ -#define REGBIT_CONNTRACK_DEFRAG "reg0[0]" -#define REGBIT_CONNTRACK_COMMIT "reg0[1]" -#define REGBIT_CONNTRACK_NAT "reg0[2]" -#define REGBIT_DHCP_OPTS_RESULT "reg0[3]" -#define REGBIT_DNS_LOOKUP_RESULT "reg0[4]" -#define REGBIT_ND_RA_OPTS_RESULT "reg0[5]" +#define REGBIT_CONNTRACK_DEFRAG "reg0[0]" +#define REGBIT_CONNTRACK_COMMIT "reg0[1]" +#define REGBIT_CONNTRACK_NAT "reg0[2]" +#define REGBIT_DHCP_OPTS_RESULT "reg0[3]" +#define REGBIT_DNS_LOOKUP_RESULT "reg0[4]" +#define REGBIT_ND_RA_OPTS_RESULT "reg0[5]" +#define REGBIT_EXT_PORT_NOT_RESIDENT "reg0[6]" /* Register definitions for switches and routers. */ #define REGBIT_NAT_REDIRECT "reg9[0]" @@ -442,6 +444,8 @@ struct ovn_datapath { /* Port groups related to the datapath, used only when nbs is NOT NULL. 
*/ struct hmap nb_pgs; + + bool has_external_ports; }; struct macam_node { @@ -1581,6 +1585,10 @@ join_logical_ports(struct northd_context *ctx, od->localnet_port = op; } + if (!strcmp(nbsp->type, "external")) { + od->has_external_ports = true; + } + op->lsp_addrs = xmalloc(sizeof *op->lsp_addrs * nbsp->n_addresses); for (size_t j = 0; j < nbsp->n_addresses; j++) { @@ -2870,6 +2878,12 @@ lsp_is_up(const struct nbrec_logical_switch_port *lsp) return !lsp->up || *lsp->up; } +static bool +lsp_is_external(const struct nbrec_logical_switch_port *nbsp) +{ + return !strcmp(nbsp->type, "external"); +} + static bool build_dhcpv4_action(struct ovn_port *op, ovs_be32 offer_ip, struct ds *options_action, struct ds *response_action, @@ -4051,9 +4065,24 @@ build_lswitch_flows(struct hmap *datapaths, struct hmap *ports, continue; } + bool is_external = lsp_is_external(op->nbsp); + if (is_external && (!op->od->localnet_port || !op->n_ps_addrs)) { + /* If the lsp is external with no port securty addresses then, + * we don't need to add any port security rules. + * The packets from external ports is received on localnet port + * and we allow the traffic from localnet ports. + * + * We also need to ignore these ports if the logical switch + * doesn't have a localnet port. + */ + continue; + } + ds_clear(&match); ds_clear(&actions); - ds_put_format(&match, "inport == %s", op->json_key); + ds_put_format( + &match, "inport == %s", + is_external ? op->od->localnet_port->json_key : op->json_key); build_port_security_l2("eth.src", op->ps_addrs, op->n_ps_addrs, &match); @@ -4061,11 +4090,21 @@ build_lswitch_flows(struct hmap *datapaths, struct hmap *ports, if (queue_id) { ds_put_format(&actions, "set_queue(%s); ", queue_id); } + + if (is_external) { + /* Set REGBIT_EXT_PORT_NOT_RESIDENT bit if the external port + * doesn't reside on a chassis. 
*/ + ds_put_format(&match, " && !is_chassis_resident(%s)", + op->json_key); + ds_put_cstr(&actions, REGBIT_EXT_PORT_NOT_RESIDENT" = 1; "); + } + ds_put_cstr(&actions, "next;"); - ovn_lflow_add(lflows, op->od, S_SWITCH_IN_PORT_SEC_L2, 50, + ovn_lflow_add(lflows, op->od, S_SWITCH_IN_PORT_SEC_L2, + is_external ? 90 : 50, ds_cstr(&match), ds_cstr(&actions)); - if (op->nbsp->n_port_security) { + if (op->nbsp->n_port_security && !is_external) { build_port_security_ip(P_IN, op, lflows); build_port_security_nd(op, lflows); } @@ -4113,7 +4152,7 @@ build_lswitch_flows(struct hmap *datapaths, struct hmap *ports, * - port type is localport */ if (!lsp_is_up(op->nbsp) && strcmp(op->nbsp->type, "router") && - strcmp(op->nbsp->type, "localport")) { + strcmp(op->nbsp->type, "localport") && lsp_is_external(op->nbsp)) { continue; } @@ -4225,6 +4264,13 @@ build_lswitch_flows(struct hmap *datapaths, struct hmap *ports, continue; } + bool is_external = lsp_is_external(op->nbsp); + if (is_external && !op->od->localnet_port) { + /* If it's an external port and there is no localnet port + * ignore it. */ + continue; + } + for (size_t i = 0; i < op->n_lsp_addrs; i++) { for (size_t j = 0; j < op->lsp_addrs[i].n_ipv4_addrs; j++) { struct ds options_action = DS_EMPTY_INITIALIZER; @@ -4237,8 +4283,8 @@ build_lswitch_flows(struct hmap *datapaths, struct hmap *ports, ds_put_format( &match, "inport == %s && eth.src == %s && " "ip4.src == 0.0.0.0 && ip4.dst == 255.255.255.255 && " - "udp.src == 68 && udp.dst == 67", op->json_key, - op->lsp_addrs[i].ea_s); + "udp.src == 68 && udp.dst == 67", + op->json_key, op->lsp_addrs[i].ea_s); ovn_lflow_add(lflows, op->od, S_SWITCH_IN_DHCP_OPTIONS, 100, ds_cstr(&match), @@ -4343,7 +4389,8 @@ build_lswitch_flows(struct hmap *datapaths, struct hmap *ports, /* Ingress table 12 and 13: DHCP options and response, by default goto * next. (priority 0). * Ingress table 14 and 15: DNS lookup and response, by default goto next. - * (priority 0).*/ + * (priority 0). 
+ * Ingress table 16 - External port handling */ HMAP_FOR_EACH (od, key_node, datapaths) { if (!od->nbs) { @@ -4354,9 +4401,47 @@ build_lswitch_flows(struct hmap *datapaths, struct hmap *ports, ovn_lflow_add(lflows, od, S_SWITCH_IN_DHCP_RESPONSE, 0, "1", "next;"); ovn_lflow_add(lflows, od, S_SWITCH_IN_DNS_LOOKUP, 0, "1", "next;"); ovn_lflow_add(lflows, od, S_SWITCH_IN_DNS_RESPONSE, 0, "1", "next;"); + ovn_lflow_add(lflows, od, S_SWITCH_IN_EXTERNAL_PORT, 0, "1", "next;"); + + if (od->has_external_ports) { + /* Table 16: External port. Drop ARP request for router ips from + * external ports if REGBIT_EXT_PORT_NOT_RESIDENT is set. + * This makes the router pipeline to be run only the chassis + * binding the external ports. */ + for (size_t i = 0; i < od->n_router_ports; i++) { + struct ovn_port *rp = od->router_ports[i]; + for (size_t j = 0; j < rp->n_lsp_addrs; j++) { + for (size_t k = 0; k < rp->lsp_addrs[j].n_ipv4_addrs; + k++) { + ds_clear(&match); + ds_put_format( + &match, REGBIT_EXT_PORT_NOT_RESIDENT"" + " && arp.tpa == %s && arp.op == 1", + rp->lsp_addrs[j].ipv4_addrs[k].addr_s); + ds_clear(&actions); + ovn_lflow_add(lflows, od, S_SWITCH_IN_EXTERNAL_PORT, + 100, ds_cstr(&match), "drop;"); + } + for (size_t k = 0; k < rp->lsp_addrs[j].n_ipv6_addrs; + k++) { + ds_clear(&match); + ds_put_format( + &match, REGBIT_EXT_PORT_NOT_RESIDENT"" + " && nd_ns && ip6.dst == {%s, %s} && " + "nd.target == %s", + rp->lsp_addrs[j].ipv6_addrs[k].addr_s, + rp->lsp_addrs[j].ipv6_addrs[k].sn_addr_s, + rp->lsp_addrs[j].ipv6_addrs[k].addr_s); + ds_clear(&actions); + ovn_lflow_add(lflows, od, S_SWITCH_IN_EXTERNAL_PORT, + 100, ds_cstr(&match), "drop;"); + } + } + } + } } - /* Ingress table 16: Destination lookup, broadcast and multicast handling + /* Ingress table 17: Destination lookup, broadcast and multicast handling * (priority 100). 
*/ HMAP_FOR_EACH (op, key_node, ports) { if (!op->nbsp) { @@ -4376,9 +4461,9 @@ build_lswitch_flows(struct hmap *datapaths, struct hmap *ports, "outport = \""MC_FLOOD"\"; output;"); } - /* Ingress table 16: Destination lookup, unicast handling (priority 50), */ + /* Ingress table 17: Destination lookup, unicast handling (priority 50), */ HMAP_FOR_EACH (op, key_node, ports) { - if (!op->nbsp) { + if (!op->nbsp || lsp_is_external(op->nbsp)) { continue; } @@ -4495,7 +4580,7 @@ build_lswitch_flows(struct hmap *datapaths, struct hmap *ports, } } - /* Ingress table 16: Destination lookup for unknown MACs (priority 0). */ + /* Ingress table 17: Destination lookup for unknown MACs (priority 0). */ HMAP_FOR_EACH (od, key_node, datapaths) { if (!od->nbs) { continue; @@ -4530,7 +4615,7 @@ build_lswitch_flows(struct hmap *datapaths, struct hmap *ports, * Priority 150 rules drop packets to disabled logical ports, so that they * don't even receive multicast or broadcast packets. */ HMAP_FOR_EACH (op, key_node, ports) { - if (!op->nbsp) { + if (!op->nbsp || lsp_is_external(op->nbsp)) { continue; } diff --git a/ovn/ovn-architecture.7.xml b/ovn/ovn-architecture.7.xml index 998470c34..05af5f3f0 100644 --- a/ovn/ovn-architecture.7.xml +++ b/ovn/ovn-architecture.7.xml @@ -1627,6 +1627,72 @@ +

    Native OVN services for external logical ports

    + +

    + To support OVN native services (like DHCP/IPv6 RA/DNS lookup) to the + cloud resources which are external, OVN supports external + logical ports. +

    + +

    + Below are some of the use cases where external ports can be + used. +

    + +
      +
    • + VMs connected to SR-IOV NICs - Traffic from these VMs bypasses the + kernel stack and the local ovn-controller does not bind these + ports and cannot serve the native services. +
    • +
    • + When CMS supports provisioning baremetal servers. +
    • +
    + +

    + OVN will provide the native services if CMS has done the below + configuration in the OVN Northbound Database. +

    + +
      +
    • + A row is created in Logical_Switch_Port, configuring the + column + and setting the to external. +
    • + +
    • + column is configured to a desired chassis. +
    • + +
    • + The chassis on which this logical port is requested has the + ovn-bridge-mappings configured and has proper L2 + connectivity so that it can receive the DHCP and other related request + packets from these external resources. +
    • + +
    • + The Logical_Switch of this port has a localnet port. +
    • + +
    • + Native OVN services are enabled by configuring the DHCP and other + options like the way it is done for the normal logical ports. +
    • +
    + +

    + OVN doesn't support HA for these external ports. In case + the ovn-controller running on the requested chassis goes down, + it is the responsibility of the CMS to reschedule these external + ports to another active chassis. +

    +

    Security

    Role-Based Access Controls for the Soutbound DB

    diff --git a/ovn/ovn-nb.xml b/ovn/ovn-nb.xml index 13ae56e13..251287702 100644 --- a/ovn/ovn-nb.xml +++ b/ovn/ovn-nb.xml @@ -302,6 +302,39 @@
    A port to a logical switch on a VTEP gateway.
    + +
    external
    +
    +

    + Represents a logical port which is external and does not have + an OVS port in the integration bridge. + OVN will never receive any traffic from this port or + send any traffic to this port. OVN can support + native services like DHCPv4/DHCPv6/DNS for this port. + If is defined, + ovn-controller running in that chassis will bind + this port to provide these native services. It is expected that + this port belongs to a bridged logical switch + (with a localnet port). +

    + +

    + Below are some of the use cases where external + ports can be used. +

    + +
      +
    • + VMs connected to SR-IOV NICs - Traffic from these VMs bypasses + the kernel stack and the local ovn-controller does not + bind these ports and cannot serve the native services. +
    • + +
    • + When CMS supports provisioning baremetal servers. +
    • +
    +
    diff --git a/tests/ovn.at b/tests/ovn.at index 504ba228d..fa5ac108e 100644 --- a/tests/ovn.at +++ b/tests/ovn.at @@ -9428,9 +9428,9 @@ AT_CHECK([as hv2 ovs-ofctl dump-flows br-int table=32 | grep active_backup | gre sleep 3 # let BFD sessions settle so we get the right flows on the right chassis # make sure that flows for handling the outside router port reside on gw1 -AT_CHECK([as gw1 ovs-ofctl dump-flows br-int table=24 | grep 00:00:02:01:02:04 | wc -l], [0], [[1 +AT_CHECK([as gw1 ovs-ofctl dump-flows br-int table=25 | grep 00:00:02:01:02:04 | wc -l], [0], [[1 ]]) -AT_CHECK([as gw2 ovs-ofctl dump-flows br-int table=24 | grep 00:00:02:01:02:04 | wc -l], [0], [[0 +AT_CHECK([as gw2 ovs-ofctl dump-flows br-int table=25 | grep 00:00:02:01:02:04 | wc -l], [0], [[0 ]]) # make sure ARP responder flows for outside router port reside on gw1 too @@ -9520,9 +9520,9 @@ AT_CHECK([ovs-vsctl --bare --columns bfd find Interface name=ovn-hv1-0],[0], sleep 3 # let BFD sessions settle so we get the right flows on the right chassis # make sure that flows for handling the outside router port reside on gw2 now -AT_CHECK([as gw2 ovs-ofctl dump-flows br-int table=24 | grep 00:00:02:01:02:04 | wc -l], [0], [[1 +AT_CHECK([as gw2 ovs-ofctl dump-flows br-int table=25 | grep 00:00:02:01:02:04 | wc -l], [0], [[1 ]]) -AT_CHECK([as gw1 ovs-ofctl dump-flows br-int table=24 | grep 00:00:02:01:02:04 | wc -l], [0], [[0 +AT_CHECK([as gw1 ovs-ofctl dump-flows br-int table=25 | grep 00:00:02:01:02:04 | wc -l], [0], [[0 ]]) # disconnect GW2 from the network, GW1 should take over @@ -9534,9 +9534,9 @@ sleep 4 bfd_dump # make sure that flows for handling the outside router port reside on gw2 now -AT_CHECK([as gw1 ovs-ofctl dump-flows br-int table=24 | grep 00:00:02:01:02:04 | wc -l], [0], [[1 +AT_CHECK([as gw1 ovs-ofctl dump-flows br-int table=25 | grep 00:00:02:01:02:04 | wc -l], [0], [[1 ]]) -AT_CHECK([as gw2 ovs-ofctl dump-flows br-int table=24 | grep 00:00:02:01:02:04 | wc -l], [0], [[0 
+AT_CHECK([as gw2 ovs-ofctl dump-flows br-int table=25 | grep 00:00:02:01:02:04 | wc -l], [0], [[0 ]]) # check that the chassis redirect port has been reclaimed by the gw1 chassis @@ -11376,6 +11376,524 @@ as hv2 start_daemon ovn-controller OVN_CLEANUP([hv1],[hv2]) AT_CLEANUP +AT_SETUP([ovn -- external logical port]) +AT_SKIP_IF([test $HAVE_PYTHON = no]) +ovn_start + +net_add n1 +sim_add hv1 +sim_add hv2 + +ovn-nbctl ls-add ls1 +ovn-nbctl lsp-add ls1 ls1-lp1 \ +-- lsp-set-addresses ls1-lp1 "f0:00:00:00:00:01 10.0.0.4 ae70::4" + +# Add a couple of external logical port +ovn-nbctl lsp-add ls1 ls1-lp_ext1 \ +-- lsp-set-addresses ls1-lp_ext1 "f0:00:00:00:00:03 10.0.0.6 ae70::6" +ovn-nbctl lsp-set-port-security ls1-lp_ext1 \ +"f0:00:00:00:00:03 10.0.0.6 ae70::6" +ovn-nbctl lsp-set-type ls1-lp_ext1 external + +ovn-nbctl lsp-add ls1 ls1-lp_ext2 \ +-- lsp-set-addresses ls1-lp_ext2 "f0:00:00:00:00:04 10.0.0.7 ae70::7" +ovn-nbctl lsp-set-port-security ls1-lp_ext2 \ +"f0:00:00:00:00:04 10.0.0.7 ae70::8" +ovn-nbctl lsp-set-type ls1-lp_ext2 external + +d1="$(ovn-nbctl create DHCP_Options cidr=10.0.0.0/24 \ +options="\"server_id\"=\"10.0.0.1\" \"server_mac\"=\"ff:10:00:00:00:01\" \ +\"lease_time\"=\"3600\" \"router\"=\"10.0.0.1\"")" + +d2="$(ovn-nbctl create DHCP_Options cidr="ae70\:\:/64" \ +options="\"server_id\"=\"00:00:00:10:00:01\"")" + +ovn-nbctl lsp-set-dhcpv4-options ls1-lp1 ${d1} +ovn-nbctl lsp-set-dhcpv4-options ls1-lp_ext1 ${d1} +ovn-nbctl lsp-set-dhcpv4-options ls1-lp_ext2 ${d1} + +ovn-nbctl lsp-set-dhcpv6-options ls1-lp1 ${d2} +ovn-nbctl lsp-set-dhcpv6-options ls1-lp_ext1 ${d2} +ovn-nbctl lsp-set-dhcpv6-options ls1-lp_ext2 ${d2} + +# Create a logical router and connect it to ls1 +ovn-nbctl lr-add lr0 +ovn-nbctl lrp-add lr0 lr0-ls1 a0:10:00:00:00:01 10.0.0.1/24 +ovn-nbctl lsp-add ls1 ls1-lr0 +ovn-nbctl set Logical_Switch_Port ls1-lr0 type=router \ + options:router-port=lr0-ls1 addresses=router + +as hv1 +ovs-vsctl add-br br-phys +ovn_attach n1 br-phys 192.168.0.1 
+ovs-vsctl -- add-port br-phys hv1-ext1 -- \ + set interface hv1-ext1 options:tx_pcap=hv1/ext1-tx.pcap \ + options:rxq_pcap=hv1/ext1-rx.pcap \ + ofport-request=2 +ovs-vsctl set open . external-ids:ovn-bridge-mappings=phys:br-phys + +as hv2 +ovs-vsctl add-br br-phys +ovn_attach n1 br-phys 192.168.0.2 +ovs-vsctl -- add-port br-phys hv2-ext2 -- \ + set interface hv2-ext2 options:tx_pcap=hv2/ext2-tx.pcap \ + options:rxq_pcap=hv2/ext2-rx.pcap \ + ofport-request=2 +ovs-vsctl set open . external-ids:ovn-bridge-mappings=phys:br-phys + +ovn-sbctl dump-flows > lflows_n.txt + +# No DHCPv4/v6 flows for the external port - ls1-lp_ext1 - 10.0.0.6 in hv1 and +# hv2 as requested-chassis option is not set and no localnet port added to ls1. +AT_CHECK([ovn-sbctl dump-flows ls1 | grep "offerip = 10.0.0.6" | \ +wc -l], [0], [0 +]) +AT_CHECK([as hv1 ovs-ofctl dump-flows br-int | grep table=20 | \ +grep controller | grep "0a.00.00.06" | wc -l], [0], [0 +]) +AT_CHECK([as hv2 ovs-ofctl dump-flows br-int | grep table=20 | \ +grep controller | grep "0a.00.00.06" | wc -l], [0], [0 +]) +AT_CHECK([as hv1 ovs-ofctl dump-flows br-int | grep table=20 | \ +grep controller | grep tp_src=546 | grep \ +"ae.70.00.00.00.00.00.00.00.00.00.00.00.00.00.06" | wc -l], [0], [0 +]) +AT_CHECK([as hv2 ovs-ofctl dump-flows br-int | grep table=20 | \ +grep controller | grep tp_src=546 | grep \ +"ae.70.00.00.00.00.00.00.00.00.00.00.00.00.00.06" | wc -l], [0], [0 +]) + +hv1_uuid=$(ovn-sbctl list chassis hv1 | grep uuid | awk '{print $3}') + +# The port_binding row for ls1-lp_ext1 should have empty chassis +chassis=$(ovn-sbctl list port_binding ls1-lp_ext1 | grep -v gateway | \ +grep -v requested | grep chassis | awk '{print $3}') + +AT_CHECK([test $chassis == "[[]]"], [0], []) + +# Set the requested-chassis option for ls1-lp_ext1 +ovn-nbctl --wait=hv lsp-set-options ls1-lp_ext1 requested-chassis=hv1 + +# No DHCPv4/v6 flows for the external port - ls1-lp_ext1 - 10.0.0.6 in hv1 and hv2 +# as no localnet port added to 
ls1 yet. +AT_CHECK([as hv1 ovs-ofctl dump-flows br-int | grep table=20 | \ +grep controller | grep "0a.00.00.06" | wc -l], [0], [0 +]) +AT_CHECK([as hv2 ovs-ofctl dump-flows br-int | grep table=20 | \ +grep controller | grep "0a.00.00.06" | wc -l], [0], [0 +]) +AT_CHECK([as hv1 ovs-ofctl dump-flows br-int | grep table=20 | \ +grep controller | grep tp_src=546 | grep \ +"ae.70.00.00.00.00.00.00.00.00.00.00.00.00.00.06" | wc -l], [0], [0 +]) +AT_CHECK([as hv2 ovs-ofctl dump-flows br-int | grep table=20 | \ +grep controller | grep tp_src=546 | grep \ +"ae.70.00.00.00.00.00.00.00.00.00.00.00.00.00.06" | wc -l], [0], [0 +]) + +# Add the localnet port to the logical switch ls1 +ovn-nbctl lsp-add ls1 ln-public +ovn-nbctl lsp-set-addresses ln-public unknown +ovn-nbctl lsp-set-type ln-public localnet +ovn-nbctl --wait=hv lsp-set-options ln-public network_name=phys + +ln_public_key=$(ovn-sbctl list port_binding ln-public | grep tunnel_key | \ +awk '{print $3}') + +# The ls1-lp_ext1 should be bound to hv1 +chassis=$(ovn-sbctl list port_binding ls1-lp_ext1 | grep -v gateway | \ +grep -v requested | grep chassis | awk '{print $3}') +AT_CHECK([test $chassis == "$hv1_uuid"], [0], []) + +# There should be DHCPv4/v6 OF flows for the ls1-lp_ext1 port in hv1 +AT_CHECK([as hv1 ovs-ofctl dump-flows br-int | grep table=20 | \ +grep controller | grep "0a.00.00.06" | grep reg14=0x$ln_public_key | \ +wc -l], [0], [3 +]) +AT_CHECK([as hv1 ovs-ofctl dump-flows br-int | grep table=20 | \ +grep controller | grep tp_src=546 | grep \ +"ae.70.00.00.00.00.00.00.00.00.00.00.00.00.00.06" | \ +grep reg14=0x$ln_public_key | wc -l], [0], [1 +]) + +# There should ne no DHCPv4/v6 flows for ls1-lp_ext1 on hv2 +AT_CHECK([as hv2 ovs-ofctl dump-flows br-int | grep table=20 | \ +grep controller | grep "0a.00.00.06" | wc -l], [0], [0 +]) +AT_CHECK([as hv2 ovs-ofctl dump-flows br-int | grep table=20 | \ +grep controller | grep tp_src=546 | grep \ +"ae.70.00.00.00.00.00.00.00.00.00.00.00.00.00.06" | wc -l], [0], 
[0 +]) + +# No DHCPv4/v6 flows for the external port - ls1-lp_ext2 - 10.0.0.7 in hv1 and +# hv2 as requested-chassis option is not set. +AT_CHECK([as hv1 ovs-ofctl dump-flows br-int | grep table=20 | \ +grep controller | grep "0a.00.00.07" | wc -l], [0], [0 +]) +AT_CHECK([as hv2 ovs-ofctl dump-flows br-int | grep table=20 | \ +grep controller | grep "0a.00.00.07" | wc -l], [0], [0 +]) +AT_CHECK([as hv1 ovs-ofctl dump-flows br-int | grep table=20 | \ +grep controller | grep tp_src=546 | grep \ +"ae.70.00.00.00.00.00.00.00.00.00.00.00.00.00.07" | wc -l], [0], [0 +]) +AT_CHECK([as hv2 ovs-ofctl dump-flows br-int | grep table=20 | \ +grep controller | grep tp_src=546 | grep \ +"ae.70.00.00.00.00.00.00.00.00.00.00.00.00.00.07" | wc -l], [0], [0 +]) + +as hv1 +ovs-vsctl show + +# This shell function sends a DHCP request packet +# test_dhcp INPORT SRC_MAC DHCP_TYPE OFFER_IP ... +test_dhcp() { + local inport=$1 src_mac=$2 dhcp_type=$3 offer_ip=$4 use_ip=$5 + shift; shift; shift; shift; shift; + if test $use_ip != 0; then + src_ip=$1 + dst_ip=$2 + shift; shift; + else + src_ip=`ip_to_hex 0 0 0 0` + dst_ip=`ip_to_hex 255 255 255 255` + fi + local request=ffffffffffff${src_mac}0800451001100000000080110000${src_ip}${dst_ip} + # udp header and dhcp header + request=${request}0044004300fc0000 + request=${request}010106006359aa760000000000000000000000000000000000000000${src_mac} + # client hardware padding + request=${request}00000000000000000000 + # server hostname + request=${request}0000000000000000000000000000000000000000000000000000000000000000 + request=${request}0000000000000000000000000000000000000000000000000000000000000000 + # boot file name + request=${request}0000000000000000000000000000000000000000000000000000000000000000 + request=${request}0000000000000000000000000000000000000000000000000000000000000000 + request=${request}0000000000000000000000000000000000000000000000000000000000000000 + 
    request=${request}0000000000000000000000000000000000000000000000000000000000000000
+    # dhcp magic cookie
+    request=${request}63825363
+    # dhcp message type
+    request=${request}3501${dhcp_type}ff
+
+    local srv_mac=$1 srv_ip=$2 expected_dhcp_opts=$3
+    # total IP length will be the IP length of the request packet
+    # (which is 272 in our case) + 8 (padding bytes) + (expected_dhcp_opts / 2)
+    ip_len=`expr 280 + ${#expected_dhcp_opts} / 2`
+    udp_len=`expr $ip_len - 20`
+    ip_len=$(printf "%x" $ip_len)
+    udp_len=$(printf "%x" $udp_len)
+    # $ip_len var will be in 3 digits, i.e. 134. So adding a '0' before $ip_len
+    local reply=${src_mac}${srv_mac}080045100${ip_len}000000008011XXXX${srv_ip}${offer_ip}
+    # udp header and dhcp header.
+    # $udp_len var will be in 3 digits. So adding a '0' before $udp_len
+    reply=${reply}004300440${udp_len}0000020106006359aa760000000000000000
+    # your ip address
+    reply=${reply}${offer_ip}
+    # next server ip address, relay agent ip address, client mac address
+    reply=${reply}0000000000000000${src_mac}
+    # client hardware padding
+    reply=${reply}00000000000000000000
+    # server hostname
+    reply=${reply}0000000000000000000000000000000000000000000000000000000000000000
+    reply=${reply}0000000000000000000000000000000000000000000000000000000000000000
+    # boot file name
+    reply=${reply}0000000000000000000000000000000000000000000000000000000000000000
+    reply=${reply}0000000000000000000000000000000000000000000000000000000000000000
+    reply=${reply}0000000000000000000000000000000000000000000000000000000000000000
+    reply=${reply}0000000000000000000000000000000000000000000000000000000000000000
+    # dhcp magic cookie
+    reply=${reply}63825363
+    # dhcp message type
+    local dhcp_reply_type=02
+    if test $dhcp_type = 03; then
+        dhcp_reply_type=05
+    fi
+    reply=${reply}3501${dhcp_reply_type}${expected_dhcp_opts}00000000ff00000000
+    echo $reply >> ext1_v4.expected
+
+    as hv1 ovs-appctl netdev-dummy/receive hv${inport}-ext${inport} $request
+}
+
+
+trim_zeros() {
+    sed
's/\(00\)\{1,\}$//'
+}
+
+# This shell function sends a DHCPv6 request packet
+# test_dhcpv6 INPORT SRC_MAC SRC_LLA DHCPv6_MSG_TYPE OFFER_IP OUTPORT...
+# The OUTPORTs (zero or more) list the VIFs on which the original DHCPv6
+# packet should be received twice (one from ovn-controller and the other
+# from the "ovs-ofctl monitor br-int resume")
+test_dhcpv6() {
+    local inport=$1 src_mac=$2 src_lla=$3 msg_code=$4 offer_ip=$5
+    local req_pkt_in_expected=$6
+    local request=ffffffffffff${src_mac}86dd00000000002a1101${src_lla}
+    # dst ip ff02::1:2
+    request=${request}ff020000000000000000000000010002
+    # udp header and dhcpv6 header
+    request=${request}02220223002affff${msg_code}010203
+    # Client identifier
+    request=${request}0001000a00030001${src_mac}
+    # IA-NA (Identity Association for Non Temporary Address)
+    request=${request}0003000c0102030400000e1000001518
+    shift; shift; shift; shift; shift;
+
+    local server_mac=000000100001
+    local server_lla=fe80000000000000020000fffe100001
+    local reply_code=07
+    if test $msg_code = 01; then
+        reply_code=02
+    fi
+    local msg_len=54
+    if test $offer_ip = 1; then
+        msg_len=28
+    fi
+    local reply=${src_mac}${server_mac}86dd0000000000${msg_len}1101
+    reply=${reply}${server_lla}${src_lla}
+
+    # udp header and dhcpv6 header
+    reply=${reply}0223022200${msg_len}ffff${reply_code}010203
+    # Client identifier
+    reply=${reply}0001000a00030001${src_mac}
+    # IA-NA
+    if test $offer_ip != 1; then
+        reply=${reply}0003002801020304ffffffffffffffff00050018${offer_ip}
+        reply=${reply}ffffffffffffffff
+    fi
+    # Server identifier
+    reply=${reply}0002000a00030001${server_mac}
+
+    echo $reply | trim_zeros >> ext${inport}_v6.expected
+    # The inport also receives the request packet since it is connected
+    # to the br-phys.
+    #echo $request >> ext${inport}_v6.expected
+
+    as hv1 ovs-appctl netdev-dummy/receive hv${inport}-ext${inport} $request
+}
+
+reset_pcap_file() {
+    local iface=$1
+    local pcap_file=$2
+    ovs-vsctl -- set Interface $iface options:tx_pcap=dummy-tx.pcap \
+options:rxq_pcap=dummy-rx.pcap
+    rm -f ${pcap_file}*.pcap
+    ovs-vsctl -- set Interface $iface options:tx_pcap=${pcap_file}-tx.pcap \
+options:rxq_pcap=${pcap_file}-rx.pcap
+}
+
+ip_to_hex() {
+    printf "%02x%02x%02x%02x" "$@"
+}
+
+AT_CAPTURE_FILE([ofctl_monitor0_hv1.log])
+as hv1 ovs-ofctl monitor br-int resume --detach --no-chdir \
+--pidfile=ovs-ofctl0.pid 2> ofctl_monitor0_hv1.log
+
+AT_CAPTURE_FILE([ofctl_monitor0_hv2.log])
+as hv2 ovs-ofctl monitor br-int resume --detach --no-chdir \
+--pidfile=ovs-ofctl0.pid 2> ofctl_monitor0_hv2.log
+
+# Send DHCPDISCOVER.
+offer_ip=`ip_to_hex 10 0 0 6`
+server_ip=`ip_to_hex 10 0 0 1`
+server_mac=ff1000000001
+expected_dhcp_opts=330400000e100104ffffff0003040a00000136040a000001
+test_dhcp 1 f00000000003 01 $offer_ip 0 $server_mac $server_ip \
+$expected_dhcp_opts
+
+# NXT_RESUMEs should be 1 in hv1.
+OVS_WAIT_UNTIL([test 1 = `cat ofctl_monitor0_hv1.log | grep -c NXT_RESUME`])
+
+# NXT_RESUMEs should be 0 in hv2.
+OVS_WAIT_UNTIL([test 0 = `cat ofctl_monitor0_hv2.log | grep -c NXT_RESUME`])
+
+$PYTHON "$top_srcdir/utilities/ovs-pcap.in" hv1/ext1-tx.pcap > ext1_v4.packets
+cat ext1_v4.expected | cut -c -48 > expout
+AT_CHECK([cat ext1_v4.packets | cut -c -48], [0], [expout])
+# Skipping the IPv4 checksum.
+cat ext1_v4.expected | cut -c 53- > expout
+AT_CHECK([cat ext1_v4.packets | cut -c 53-], [0], [expout])
+
+# ovs-ofctl also resumes the packets and this causes other ports to receive
+# the DHCP request packet. So reset the pcap files so that it's easier to test.
+reset_pcap_file hv1-ext1 hv1/ext1
+rm -f ext1_v4.expected
+rm -f ext1_v4.packets
+
+# Send DHCPv6 request
+src_mac=f00000000003
+src_lla=fe80000000000000f20000fffe000003
+offer_ip=ae700000000000000000000000000006
+test_dhcpv6 1 $src_mac $src_lla 01 $offer_ip
+
+# NXT_RESUMEs should be 2 in hv1.
+OVS_WAIT_UNTIL([test 2 = `cat ofctl_monitor0_hv1.log | grep -c NXT_RESUME`])
+
+# NXT_RESUMEs should be 0 in hv2.
+OVS_WAIT_UNTIL([test 0 = `cat ofctl_monitor0_hv2.log | grep -c NXT_RESUME`])
+
+$PYTHON "$top_srcdir/utilities/ovs-pcap.in" hv1/ext1-tx.pcap | \
+sort > ext1_v6.packets
+cat ext1_v6.expected | cut -c -120 > expout
+AT_CHECK([cat ext1_v6.packets | cut -c -120], [0], [expout])
+# Skipping the UDP checksum
+cat ext1_v6.expected | cut -c 125- > expout
+AT_CHECK([cat ext1_v6.packets | cut -c 125-], [0], [expout])
+
+rm -f ext1_v6.expected
+rm -f ext1_v6.packets
+reset_pcap_file hv1-ext1 hv1/ext1
+
+# Change the requested-chassis option for ls1-lp_ext1 from hv1 to hv2
+ovn-nbctl --wait=hv lsp-set-options ls1-lp_ext1 requested-chassis=hv2
+
+hv2_uuid=$(ovn-sbctl list chassis hv2 | grep uuid | awk '{print $3}')
+
+# The ls1-lp_ext1 should be bound to hv2
+chassis=$(ovn-sbctl list port_binding ls1-lp_ext1 | grep -v gateway | \
+grep -v requested | grep chassis | awk '{print $3}')
+AT_CHECK([test $chassis = "$hv2_uuid"], [0], [])
+
+# There should be OF flows for DHCPv4/v6 for the ls1-lp_ext1 port in hv2
+AT_CHECK([as hv2 ovs-ofctl dump-flows br-int | grep table=20 | \
+grep controller | grep "0a.00.00.06" | grep reg14=0x$ln_public_key | \
+wc -l], [0], [3
+])
+AT_CHECK([as hv2 ovs-ofctl dump-flows br-int | grep table=20 | \
+grep controller | grep tp_src=546 | grep \
+"ae.70.00.00.00.00.00.00.00.00.00.00.00.00.00.06" | \
+grep reg14=0x$ln_public_key | wc -l], [0], [1
+])
+
+# There should be no DHCPv4/v6 flows for ls1-lp_ext1 on hv1
+AT_CHECK([as hv1 ovs-ofctl dump-flows br-int | grep table=20 | \
+grep controller | grep "0a.00.00.06" | wc -l], [0], [0
+])
+AT_CHECK([as hv1 ovs-ofctl dump-flows br-int | grep table=20 | \
+grep controller | grep tp_src=546 | grep \
+"ae.70.00.00.00.00.00.00.00.00.00.00.00.00.00.06" | \
+grep reg14=0x$ln_public_key | wc -l], [0], [0
+])
+
+# Send DHCPDISCOVER again for hv1/ext1. The DHCP response should come from
+# hv2 ovn-controller. Due to the test setup, the port hv1/ext1 is also
+# receiving the expected packet.
+offer_ip=`ip_to_hex 10 0 0 6`
+server_ip=`ip_to_hex 10 0 0 1`
+server_mac=ff1000000001
+expected_dhcp_opts=330400000e100104ffffff0003040a00000136040a000001
+test_dhcp 1 f00000000003 01 $offer_ip 0 $server_mac $server_ip \
+$expected_dhcp_opts
+
+# NXT_RESUMEs should be 2 in hv1.
+OVS_WAIT_UNTIL([test 2 = `cat ofctl_monitor0_hv1.log | grep -c NXT_RESUME`])
+
+# NXT_RESUMEs should be 1 in hv2.
+OVS_WAIT_UNTIL([test 1 = `cat ofctl_monitor0_hv2.log | grep -c NXT_RESUME`])
+
+$PYTHON "$top_srcdir/utilities/ovs-pcap.in" hv1/ext1-tx.pcap > ext1_v4.packets
+cat ext1_v4.expected | cut -c -48 > expout
+AT_CHECK([cat ext1_v4.packets | cut -c -48], [0], [expout])
+# Skipping the IPv4 checksum.
+cat ext1_v4.expected | cut -c 53- > expout
+AT_CHECK([cat ext1_v4.packets | cut -c 53-], [0], [expout])
+
+# ovs-ofctl also resumes the packets and this causes other ports to receive
+# the DHCP request packet. So reset the pcap files so that it's easier to test.
+reset_pcap_file hv1-ext1 hv1/ext1
+rm -f ext1_v4.expected
+
+# Send DHCPv6 request again
+src_mac=f00000000003
+src_lla=fe80000000000000f20000fffe000003
+offer_ip=ae700000000000000000000000000006
+test_dhcpv6 1 $src_mac $src_lla 01 $offer_ip 1
+
+# NXT_RESUMEs should be 2 in hv1.
+OVS_WAIT_UNTIL([test 2 = `cat ofctl_monitor0_hv1.log | grep -c NXT_RESUME`])
+
+# NXT_RESUMEs should be 2 in hv2.
+OVS_WAIT_UNTIL([test 2 = `cat ofctl_monitor0_hv2.log | grep -c NXT_RESUME`])
+
+as hv1
+ovs-vsctl show
+ovs-ofctl dump-flows br-int
+
+as hv2
+ovs-vsctl show
+ovs-ofctl dump-flows br-int
+
+$PYTHON "$top_srcdir/utilities/ovs-pcap.in" hv1/ext1-tx.pcap | \
+sort > ext1_v6.packets
+cat ext1_v6.expected | cut -c -120 > expout
+AT_CHECK([cat ext1_v6.packets | cut -c -120], [0], [expout])
+# Skipping the UDP checksum
+cat ext1_v6.expected | cut -c 125- > expout
+AT_CHECK([cat ext1_v6.packets | cut -c 125-], [0], [expout])
+
+rm -f ext1_v6.expected
+rm -f ext1_v6.packets
+
+as hv1
+ovs-vsctl show
+reset_pcap_file hv1-ext1 hv1/ext1
+reset_pcap_file br-phys_n1 hv1/br-phys_n1
+reset_pcap_file br-phys hv1/br-phys
+
+as hv2
+ovs-vsctl show
+reset_pcap_file hv2-ext2 hv2/ext2
+reset_pcap_file br-phys_n1 hv2/br-phys_n1
+reset_pcap_file br-phys hv2/br-phys
+
+# From ls1-lp_ext1, send ARP request for the router ip. The ARP
+# response should come from the router pipeline of hv2.
+ext1_mac=f00000000003
+router_mac=a01000000001
+ext1_ip=`ip_to_hex 10 0 0 6`
+router_ip=`ip_to_hex 10 0 0 1`
+arp_request=ffffffffffff${ext1_mac}08060001080006040001${ext1_mac}${ext1_ip}000000000000${router_ip}
+
+as hv1 ovs-appctl netdev-dummy/receive hv1-ext1 $arp_request
+expected_response=${ext1_mac}${router_mac}08060001080006040002${router_mac}${router_ip}${ext1_mac}${ext1_ip}
+echo $expected_response > expout
+$PYTHON "$top_srcdir/utilities/ovs-pcap.in" hv1/ext1-tx.pcap > ext1_arp_resp
+AT_CHECK([cat ext1_arp_resp], [0], [expout])
+
+# Verify that the response came from hv2
+$PYTHON "$top_srcdir/utilities/ovs-pcap.in" hv2/br-phys_n1-tx.pcap > ext1_arp_resp
+AT_CHECK([cat ext1_arp_resp], [0], [expout])
+
+
+# Change the requested-chassis option for ls1-lp_ext1 from hv2 to hv1
+ovn-nbctl --wait=hv lsp-set-options ls1-lp_ext1 requested-chassis=hv1
+
+as hv1
+ovs-vsctl show
+reset_pcap_file hv1-ext1 hv1/ext1
+reset_pcap_file br-phys_n1 hv1/br-phys_n1
+reset_pcap_file br-phys hv1/br-phys
+
+as hv2
+ovs-vsctl show
+reset_pcap_file hv2-ext2 hv2/ext2
+reset_pcap_file br-phys_n1 hv2/br-phys_n1
+reset_pcap_file br-phys hv2/br-phys
+
+as hv1 ovs-appctl netdev-dummy/receive hv1-ext1 $arp_request
+
+$PYTHON "$top_srcdir/utilities/ovs-pcap.in" hv1/ext1-tx.pcap > ext1_arp_resp
+AT_CHECK([cat ext1_arp_resp], [0], [expout])
+
+# Verify that the response didn't come from hv2
+$PYTHON "$top_srcdir/utilities/ovs-pcap.in" hv2/br-phys_n1-tx.pcap > ext1_arp_resp
+AT_CHECK([cat ext1_arp_resp], [0], [])
+
+OVN_CLEANUP([hv1],[hv2])
+AT_CLEANUP
+
 AT_SETUP([ovn -- ovn-controller restart])
 AT_SKIP_IF([test $HAVE_PYTHON = no])
 ovn_start