From patchwork Wed Aug 1 12:16:32 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Venkata Anil X-Patchwork-Id: 952119 X-Patchwork-Delegate: jpettit@nicira.com Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=openvswitch.org (client-ip=140.211.169.12; helo=mail.linuxfoundation.org; envelope-from=ovs-dev-bounces@openvswitch.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=redhat.com Received: from mail.linuxfoundation.org (mail.linuxfoundation.org [140.211.169.12]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 41gXPF5sXZz9s4Z for ; Wed, 1 Aug 2018 22:17:21 +1000 (AEST) Received: from mail.linux-foundation.org (localhost [127.0.0.1]) by mail.linuxfoundation.org (Postfix) with ESMTP id 1E7C8BB3; Wed, 1 Aug 2018 12:16:52 +0000 (UTC) X-Original-To: dev@openvswitch.org Delivered-To: ovs-dev@mail.linuxfoundation.org Received: from smtp1.linuxfoundation.org (smtp1.linux-foundation.org [172.17.192.35]) by mail.linuxfoundation.org (Postfix) with ESMTPS id 90791DB3 for ; Wed, 1 Aug 2018 12:16:50 +0000 (UTC) X-Greylist: domain auto-whitelisted by SQLgrey-1.7.6 Received: from mx1.redhat.com (mx3-rdu2.redhat.com [66.187.233.73]) by smtp1.linuxfoundation.org (Postfix) with ESMTPS id 89184713 for ; Wed, 1 Aug 2018 12:16:48 +0000 (UTC) Received: from smtp.corp.redhat.com (int-mx05.intmail.prod.int.rdu2.redhat.com [10.11.54.5]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id AF0E78197001 for ; Wed, 1 Aug 2018 12:16:47 +0000 (UTC) Received: from vkommadi.lab.eng.blr.redhat.com (dhcp35-207.lab.eng.blr.redhat.com [10.70.35.207]) by smtp.corp.redhat.com (Postfix) 
        with ESMTPS id 00CC91C5BB; Wed, 1 Aug 2018 12:16:45 +0000 (UTC)
From: vkommadi@redhat.com
To: dev@openvswitch.org
Date: Wed, 1 Aug 2018 17:46:32 +0530
Message-Id: <20180801121635.14509-2-vkommadi@redhat.com>
In-Reply-To: <20180801121635.14509-1-vkommadi@redhat.com>
References: <20180801121635.14509-1-vkommadi@redhat.com>
X-Scanned-By: MIMEDefang 2.79 on 10.11.54.5
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.11.55.8]); Wed, 01 Aug 2018 12:16:47 +0000 (UTC)
X-Greylist: inspected by milter-greylist-4.5.16 (mx1.redhat.com [10.11.55.8]); Wed, 01 Aug 2018 12:16:47 +0000 (UTC) for IP:'10.11.54.5' DOMAIN:'int-mx05.intmail.prod.int.rdu2.redhat.com' HELO:'smtp.corp.redhat.com' FROM:'vkommadi@redhat.com' RCPT:''
X-Spam-Status: No, score=-4.2 required=5.0 tests=BAYES_00, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1
X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on smtp1.linux-foundation.org
Subject: [ovs-dev] [PATCH v7 1/4] Avoid tunneling for VLAN packets redirected to a gateway chassis
X-BeenThere: ovs-dev@openvswitch.org
X-Mailman-Version: 2.1.12
Precedence: list
Sender: ovs-dev-bounces@openvswitch.org
Errors-To: ovs-dev-bounces@openvswitch.org

From: venkata anil

When a VM on a VLAN tenant network sends traffic to an external network,
the traffic is tunneled from the host chassis to the gateway chassis. In
the earlier discussion [1], Russell (also in his doc [2]) suggested
finding a way for OVN to do this redirect to the gateway host over a
VLAN network. This patch implements his suggestion: it redirects the
packet to the gateway chassis using the incoming tenant VLAN network.
Gateway chassis are expected to be configured with the tenant VLAN
networks. In this approach, new logical and physical flows are
introduced for packet processing in both the host and gateway chassis.

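The flows described in this message all key off a single logical flag,
MLF_RCV_FROM_VLAN. As a minimal sketch only (the helper name flag_mask is
hypothetical and not part of OVN), assuming the bit position 5 that this
patch assigns to MLF_RCV_FROM_VLAN_BIT and assuming the logical flags are
carried in reg10 as the flow dumps in this message show, the value that
appears in the OpenFlow matches can be derived like this:

```shell
#!/bin/sh
# Hypothetical helper, for illustration only: print the hex mask
# that corresponds to a single flag bit position.
flag_mask() {
    printf '0x%x\n' "$((1 << $1))"
}

# Bit position taken from MLF_RCV_FROM_VLAN_BIT in this patch.
MLF_RCV_FROM_VLAN_BIT=5

# A masked match on one bit uses the same value for both value and mask.
mask=$(flag_mask "$MLF_RCV_FROM_VLAN_BIT")
echo "reg10=${mask}/${mask}"
```

Under those assumptions the script prints reg10=0x20/0x20, which is the
form of the match used by the table 32 flow in this message.
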
Packet processing in the host chassis:

1) A new OVS flow is added in physical table 65, which sets the
   MLF_RCV_FROM_VLAN flag for packets from a VLAN network entering
   the router pipeline.

2) A new flow is added in lr_in_ip_routing. For packets that are output
   through the distributed gateway port and match the MLF_RCV_FROM_VLAN
   flag, it sets REGBIT_NAT_REDIRECT, i.e.

   table=7 (lr_in_ip_routing), priority=2, match=(
   ip4.dst == 0.0.0.0/0 && flags.rcv_from_vlan == 1 &&
   !is_chassis_resident("cr-alice")), action=(reg9[0] = 1; next;)

   This flow is installed only on chassis not hosting the
   chassisredirect port, i.e. compute nodes.

   When REGBIT_NAT_REDIRECT is set,
   a) lr_in_arp_resolve sets the packet's eth.dst to the distributed
      gateway port MAC
   b) lr_in_gw_redirect sets the chassisredirect port as the outport

3) A new OVS flow is added in physical table 32. It uses the source
   VLAN tenant network tag as the VLAN ID when sending the packet to
   the gateway chassis. Because the destination MAC of this VLAN packet
   is the distributed gateway port MAC, the packet only reaches the
   gateway chassis.

   table=32,priority=150,reg10=0x20/0x20,reg14=0x3,reg15=0x6,metadata=0x4
   actions=mod_vlan_vid:2010,output:25,strip_vlan

   This flow is installed only on chassis not hosting the
   chassisredirect port, i.e. compute nodes.

Packet processing in the gateway chassis:

1) A new OVS flow is added in physical table 0 for VLAN traffic that
   arrives on the localnet port with the router distributed gateway
   port MAC as destination MAC address. It resubmits the packet to the
   connected router ingress pipeline (i.e. the router attached to the
   VLAN tenant network).

   table=0,priority=200,in_port=67,dl_vlan=2010,dl_dst=00:00:02:01:02:03
   actions=strip_vlan,load:0x4->OXM_OF_METADATA[],load:0x3->NXM_NX_REG14[],load:0x1->NXM_NX_REG10[5],resubmit(,8)

   This flow is installed only on the chassis hosting the
   chassisredirect port, i.e. the gateway node.

2) A new flow is added in lr_in_admission, which checks
   MLF_RCV_FROM_VLAN and allows the packet. This flow is installed only
   on the chassis hosting the chassisredirect port, i.e. the gateway
   node.

   table=0 (lr_in_admission), priority=100, match=(
   flags.rcv_from_vlan == 1 &&
   inport == "lrp-44383893-613a-4bfe-b483-e7d0dc3055cd" &&
   is_chassis_resident("cr-lrp-a6e3d2ab-313a-4ea3-8ec4-c3c774a11f49")),
   action=(next;)

The packet then passes through the router ingress and egress pipelines
and on to the external switch pipeline.

[1] https://mail.openvswitch.org/pipermail/ovs-discuss/2018-April/046557.html
[2] Point 3 in section 3.3.1 - Future Enhancements
    https://docs.google.com/document/d/1JecGIXPH0RAqfGvD0nmtBdEU1zflHACp8WSRnKCFSgg/edit#

Reported-at: https://mail.openvswitch.org/pipermail/ovs-discuss/2018-April/046543.html

Signed-off-by: Venkata Anil
---
v6->v7:
* Rebased
* Addressed review comments

v5->v6:
* Rebased

v4->v5:
* No changes in this patch

v3->v4:
* Previous v3 patch became this patch of v4
* Updated the newly added flow in physical table 0 on gateway chassis
  to check for distributed gateway port MAC and then resubmit to
  router ingress pipeline
* Improved the test
* Added more comments

 ovn/controller/bfd.c            |   3 +-
 ovn/controller/binding.c        |  10 +-
 ovn/controller/ovn-controller.c |   3 +
 ovn/controller/ovn-controller.h |  17 ++-
 ovn/controller/physical.c       | 120 ++++++++++++++++-
 ovn/lib/logical-fields.c        |   4 +
 ovn/lib/logical-fields.h        |   2 +
 ovn/northd/ovn-northd.c         |  35 +++++
 tests/ovn.at                    | 278 ++++++++++++++++++++++++++++++++++++++++
 9 files changed, 464 insertions(+), 8 deletions(-)

diff --git a/ovn/controller/bfd.c b/ovn/controller/bfd.c
index 051781f..c696741 100644
--- a/ovn/controller/bfd.c
+++ b/ovn/controller/bfd.c
@@ -139,8 +139,9 @@ bfd_travel_gw_related_chassis(
     LIST_FOR_EACH_POP (dp_binding, node, &dp_list) {
         dp = dp_binding->dp;
         free(dp_binding);
+        const struct sbrec_datapath_binding *pdp;
         for (size_t i = 0; i < dp->n_peer_dps; i++) {
-            const struct sbrec_datapath_binding *pdp = dp->peer_dps[i];
+            pdp = dp->peer_dps[i]->peer_dp;
             if (!pdp) {
                 continue;
             }
diff --git a/ovn/controller/binding.c b/ovn/controller/binding.c
index 021ecdd..168b78d 100644
---
a/ovn/controller/binding.c +++ b/ovn/controller/binding.c @@ -145,10 +145,14 @@ add_local_datapath__(struct ovsdb_idl_index *sbrec_datapath_binding_by_key, const struct sbrec_port_binding *pb; SBREC_PORT_BINDING_FOR_EACH_EQUAL (pb, target, sbrec_port_binding_by_datapath) { + if (!strcmp(pb->type, "chassisredirect")) { + ld->chassisredirect_port = pb; + } if (!strcmp(pb->type, "patch")) { const char *peer_name = smap_get(&pb->options, "peer"); if (peer_name) { const struct sbrec_port_binding *peer; + struct peer_datapath *pdp; peer = lport_lookup_by_name(sbrec_port_binding_by_name, peer_name); @@ -163,9 +167,13 @@ add_local_datapath__(struct ovsdb_idl_index *sbrec_datapath_binding_by_key, ld->peer_dps = xrealloc( ld->peer_dps, ld->n_peer_dps * sizeof *ld->peer_dps); - ld->peer_dps[ld->n_peer_dps - 1] = datapath_lookup_by_key( + pdp = xmalloc(sizeof(struct peer_datapath)); + pdp->peer_dp = datapath_lookup_by_key( sbrec_datapath_binding_by_key, peer->datapath->tunnel_key); + pdp->patch = pb; + pdp->peer = peer; + ld->peer_dps[ld->n_peer_dps - 1] = pdp; } } } diff --git a/ovn/controller/ovn-controller.c b/ovn/controller/ovn-controller.c index 81d3306..d338ab1 100644 --- a/ovn/controller/ovn-controller.c +++ b/ovn/controller/ovn-controller.c @@ -823,6 +823,9 @@ main(int argc, char *argv[]) struct local_datapath *cur_node, *next_node; HMAP_FOR_EACH_SAFE (cur_node, next_node, hmap_node, &local_datapaths) { + for (int i = 0; i < cur_node->n_peer_dps; i++) { + free(cur_node->peer_dps[i]); + } free(cur_node->peer_dps); hmap_remove(&local_datapaths, &cur_node->hmap_node); free(cur_node); diff --git a/ovn/controller/ovn-controller.h b/ovn/controller/ovn-controller.h index b13b371..596cb3b 100644 --- a/ovn/controller/ovn-controller.h +++ b/ovn/controller/ovn-controller.h @@ -40,6 +40,17 @@ struct ct_zone_pending_entry { enum ct_zone_pending_state state; }; +/* Represents a peer datapath connected to a given datapath */ +struct peer_datapath { + const struct 
sbrec_datapath_binding *peer_dp; + + /* Patch port connected to local datapath */ + const struct sbrec_port_binding *patch; + + /* Peer patch port connected to peer datapath */ + const struct sbrec_port_binding *peer; +}; + /* A logical datapath that has some relevance to this hypervisor. A logical * datapath D is relevant to hypervisor H if: * @@ -56,10 +67,14 @@ struct local_datapath { /* The localnet port in this datapath, if any (at most one is allowed). */ const struct sbrec_port_binding *localnet_port; + /* The chassisredirect port in this datapath, if any + * (at most one is allowed). */ + const struct sbrec_port_binding *chassisredirect_port; + /* True if this datapath contains an l3gateway port located on this * hypervisor. */ bool has_local_l3gateway; - const struct sbrec_datapath_binding **peer_dps; + struct peer_datapath **peer_dps; size_t n_peer_dps; }; diff --git a/ovn/controller/physical.c b/ovn/controller/physical.c index c38d7b0..f269a1d 100644 --- a/ovn/controller/physical.c +++ b/ovn/controller/physical.c @@ -304,7 +304,8 @@ consider_port_binding(struct ovsdb_idl_index *sbrec_chassis_by_name, { uint32_t dp_key = binding->datapath->tunnel_key; uint32_t port_key = binding->tunnel_key; - if (!get_local_datapath(local_datapaths, dp_key)) { + struct local_datapath *ld = get_local_datapath(local_datapaths, dp_key); + if (!ld) { return; } @@ -350,6 +351,12 @@ consider_port_binding(struct ovsdb_idl_index *sbrec_chassis_by_name, put_load(0, MFF_LOG_REG0 + i, 0, 32, ofpacts_p); } put_load(0, MFF_IN_PORT, 0, 16, ofpacts_p); + + /* Set MLF_RCV_FROM_VLAN flag for vlan network */ + if (ld->localnet_port && ld->localnet_port->n_tag && + *ld->localnet_port->tag) { + put_load(1, MFF_LOG_FLAGS, MLF_RCV_FROM_VLAN_BIT, 1, ofpacts_p); + } put_resubmit(OFTABLE_LOG_INGRESS_PIPELINE, ofpacts_p); clone = ofpbuf_at_assert(ofpacts_p, clone_ofs, sizeof *clone); ofpacts_p->header = clone; @@ -526,9 +533,15 @@ consider_port_binding(struct ovsdb_idl_index 
*sbrec_chassis_by_name, put_local_common_flows(dp_key, port_key, nested_container, &zone_ids, ofpacts_p, flow_table); - /* Table 0, Priority 150 and 100. + /* Table 0, Priority 200, 150 and 100. * ============================== * + * Priority 200 is for vlan traffic with distributed gateway port MAC + * as destination MAC address. For such traffic, set MLF_RCV_FROM_VLAN + * flag, MFF_LOG_DATAPATH to the router metadata and MFF_LOG_INPORT to + * the patch port connecting router and vlan network and resubmit into + * the logical router ingress pipeline. + * * Priority 150 is for tagged traffic. This may be containers in a * VM or a VLAN on a local network. For such traffic, match on the * tags and then strip the tag. @@ -540,6 +553,57 @@ consider_port_binding(struct ovsdb_idl_index *sbrec_chassis_by_name, * input port, MFF_LOG_DATAPATH to the logical datapath, and * resubmit into the logical ingress pipeline starting at table * 16. */ + + /* For packet from vlan network with distributed gateway port MAC as + * destination MAC address, submit it to router ingress pipeline */ + int vlan_tag = binding->n_tag ? 
*binding->tag : 0; + if (!strcmp(binding->type, "localnet") && vlan_tag) { + for (int i = 0; i < ld->n_peer_dps; i++) { + struct local_datapath *peer_ldp = get_local_datapath( + local_datapaths, ld->peer_dps[i]->peer_dp->tunnel_key); + ovs_assert(peer_ldp); + const struct sbrec_port_binding *crp; + crp = peer_ldp->chassisredirect_port; + if (crp && crp->chassis && + !strcmp(crp->chassis->name, chassis->name)) { + const char *gwp = smap_get(&crp->options, + "distributed-port"); + if (strcmp(gwp, ld->peer_dps[i]->peer->logical_port)) { + bool mac_found = 0; + ofpbuf_clear(ofpacts_p); + match_init_catchall(&match); + + match_set_in_port(&match, ofport); + match_set_dl_vlan(&match, htons(vlan_tag), 0); + for (int j = 0; j < crp->n_mac; j++) { + struct lport_addresses laddrs; + if (!extract_lsp_addresses(crp->mac[j], &laddrs)) { + continue; + } + match_set_dl_dst(&match, laddrs.ea); + destroy_lport_addresses(&laddrs); + mac_found = 1; + break; + } + if (!mac_found) { + continue; + } + ofpact_put_STRIP_VLAN(ofpacts_p); + put_load(peer_ldp->datapath->tunnel_key, + MFF_LOG_DATAPATH, 0, 64, ofpacts_p); + put_load(ld->peer_dps[i]->peer->tunnel_key, + MFF_LOG_INPORT, 0, 32, ofpacts_p); + put_load(1, MFF_LOG_FLAGS, + MLF_RCV_FROM_VLAN_BIT, 1, ofpacts_p); + put_resubmit(OFTABLE_LOG_INGRESS_PIPELINE, ofpacts_p); + + ofctrl_add_flow(flow_table, OFTABLE_PHY_TO_LOG, + 200, 0, &match, ofpacts_p); + } + } + } + } + ofpbuf_clear(ofpacts_p); match_init_catchall(&match); match_set_in_port(&match, ofport); @@ -633,13 +697,59 @@ consider_port_binding(struct ovsdb_idl_index *sbrec_chassis_by_name, } else { /* Remote port connected by tunnel */ - /* Table 32, priority 100. - * ======================= + /* Table 32, priority 150 and 100. + * ============================== * * Handles traffic that needs to be sent to a remote hypervisor. Each * flow matches an output port that includes a logical port on a remote - * hypervisor, and tunnels the packet to that hypervisor. 
+ * hypervisor, and tunnels the packet or send through vlan network to + * that hypervisor. */ + + /* For each vlan network connected to the router, add that network's + * vlan tag to the packet and output it through localnet port */ + struct local_datapath *ldp = get_local_datapath(local_datapaths, + dp_key); + for (int i = 0; i < ldp->n_peer_dps; i++) { + struct ofpact_vlan_vid *vlan_vid; + ofp_port_t port_ofport = 0; + struct peer_datapath *pdp = ldp->peer_dps[i]; + struct local_datapath *peer_ldp = get_local_datapath( + local_datapaths, pdp->peer_dp->tunnel_key); + ovs_assert(peer_ldp); + if (peer_ldp->localnet_port && pdp->patch->tunnel_key) { + int64_t vlan_tag = (peer_ldp->localnet_port->n_tag ? + *peer_ldp->localnet_port->tag : 0); + if (!vlan_tag) { + continue; + } + port_ofport = u16_to_ofp(simap_get(&localvif_to_ofport, + peer_ldp->localnet_port->logical_port)); + if (!port_ofport) { + continue; + } + + match_init_catchall(&match); + ofpbuf_clear(ofpacts_p); + + match_set_metadata(&match, htonll(dp_key)); + match_set_reg_masked(&match, MFF_LOG_FLAGS - MFF_REG0, + MLF_RCV_FROM_VLAN, MLF_RCV_FROM_VLAN); + match_set_reg(&match, MFF_LOG_INPORT - MFF_REG0, + pdp->patch->tunnel_key); + match_set_reg(&match, MFF_LOG_OUTPORT - MFF_REG0, port_key); + + vlan_vid = ofpact_put_SET_VLAN_VID(ofpacts_p); + vlan_vid->vlan_vid = vlan_tag; + vlan_vid->push_vlan_if_needed = true; + ofpact_put_OUTPUT(ofpacts_p)->port = port_ofport; + ofpact_put_STRIP_VLAN(ofpacts_p); + + ofctrl_add_flow(flow_table, OFTABLE_REMOTE_OUTPUT, 150, 0, + &match, ofpacts_p); + } + } + match_init_catchall(&match); ofpbuf_clear(ofpacts_p); diff --git a/ovn/lib/logical-fields.c b/ovn/lib/logical-fields.c index a8b5e3c..b9efa02 100644 --- a/ovn/lib/logical-fields.c +++ b/ovn/lib/logical-fields.c @@ -105,6 +105,10 @@ ovn_init_symtab(struct shash *symtab) MLF_FORCE_SNAT_FOR_LB_BIT); expr_symtab_add_subfield(symtab, "flags.force_snat_for_lb", NULL, flags_str); + snprintf(flags_str, sizeof flags_str, 
"flags[%d]", + MLF_RCV_FROM_VLAN_BIT); + expr_symtab_add_subfield(symtab, "flags.rcv_from_vlan", NULL, + flags_str); /* Connection tracking state. */ expr_symtab_add_field(symtab, "ct_mark", MFF_CT_MARK, NULL, false); diff --git a/ovn/lib/logical-fields.h b/ovn/lib/logical-fields.h index b1dbb03..96250fd 100644 --- a/ovn/lib/logical-fields.h +++ b/ovn/lib/logical-fields.h @@ -50,6 +50,7 @@ enum mff_log_flags_bits { MLF_FORCE_SNAT_FOR_DNAT_BIT = 2, MLF_FORCE_SNAT_FOR_LB_BIT = 3, MLF_LOCAL_ONLY_BIT = 4, + MLF_RCV_FROM_VLAN_BIT = 5, }; /* MFF_LOG_FLAGS_REG flag assignments */ @@ -75,6 +76,7 @@ enum mff_log_flags { * hypervisors should instead only be output to local targets */ MLF_LOCAL_ONLY = (1 << MLF_LOCAL_ONLY_BIT), + MLF_RCV_FROM_VLAN = (1 << MLF_RCV_FROM_VLAN_BIT), }; #endif /* ovn/lib/logical-fields.h */ diff --git a/ovn/northd/ovn-northd.c b/ovn/northd/ovn-northd.c index 35baabc..2497a5b 100644 --- a/ovn/northd/ovn-northd.c +++ b/ovn/northd/ovn-northd.c @@ -4434,6 +4434,28 @@ add_route(struct hmap *lflows, const struct ovn_port *op, * routing. */ ovn_lflow_add(lflows, op->od, S_ROUTER_IN_IP_ROUTING, priority, ds_cstr(&match), ds_cstr(&actions)); + + /* When output port is distributed gateway port, check if the router + * input port is a patch port connected to vlan network. + * Traffic from VLAN network to external network should be redirected + * to "redirect-chassis" by setting REGBIT_NAT_REDIRECT flag. + * Later physical table 32 will output this traffic to gateway + * chassis using input network vlan tag */ + if ((op == op->od->l3dgw_port) && op->od->l3redirect_port) { + ds_clear(&match); + ds_clear(&actions); + + ds_put_format(&match, "ip%s.%s == %s/%d", is_ipv4 ? 
"4" : "6", + dir, network_s, plen); + ds_put_format(&match, " && flags.rcv_from_vlan == 1"); + ds_put_format(&match, " && !is_chassis_resident(%s)", + op->od->l3redirect_port->json_key); + + ovn_lflow_add(lflows, op->od, S_ROUTER_IN_IP_ROUTING, + priority + 1, ds_cstr(&match), + REGBIT_NAT_REDIRECT" = 1; next;"); + } + ds_destroy(&match); ds_destroy(&actions); } @@ -4852,6 +4874,19 @@ build_lrouter_flows(struct hmap *datapaths, struct hmap *ports, } ovn_lflow_add(lflows, op->od, S_ROUTER_IN_ADMISSION, 50, ds_cstr(&match), "next;"); + + /* VLAN traffic from localnet port should be allowed for + * router processing on the "redirect-chassis". */ + if (op->od->l3dgw_port && op->od->l3redirect_port && op->peer && + op->peer->od->localnet_port && (op != op->od->l3dgw_port)) { + ds_clear(&match); + ds_put_format(&match, "flags.rcv_from_vlan == 1"); + ds_put_format(&match, " && inport == %s", op->json_key); + ds_put_format(&match, " && is_chassis_resident(%s)", + op->od->l3redirect_port->json_key); + ovn_lflow_add(lflows, op->od, S_ROUTER_IN_ADMISSION, 100, + ds_cstr(&match), "next;"); + } } /* Logical router ingress table 1: IP Input. */ diff --git a/tests/ovn.at b/tests/ovn.at index 17c740a..fb9b516 100644 --- a/tests/ovn.at +++ b/tests/ovn.at @@ -7805,6 +7805,284 @@ test_ip_packet gw2 gw1 OVN_CLEANUP([hv1],[gw1],[gw2],[ext1]) AT_CLEANUP +# VLAN traffic for external network redirected through distributed router gateway port +# should use vlans(i.e input network vlan tag) across hypervisors instead of tunneling. +AT_SETUP([ovn -- vlan traffic for external network with distributed router gateway port]) +AT_SKIP_IF([test $HAVE_PYTHON = no]) +ovn_start + +# Logical network: +# # One LR R1 that has switches foo (192.168.1.0/24) and +# # alice (172.16.1.0/24) connected to it. The logical port +# # between R1 and alice has a "redirect-chassis" specified, +# # i.e. it is the distributed router gateway port(172.16.1.6). +# # Switch alice also has a localnet port defined. 
+# # An additional switch outside has the same subnet as alice +# # (172.16.1.0/24), a localnet port and nexthop port(172.16.1.1) +# # which will receive the packet destined for external network +# # (i.e 8.8.8.8 as destination ip). + +# Physical network: +# # Three hypervisors hv[123]. +# # hv1 hosts vif foo1. +# # hv2 is the "redirect-chassis" that hosts the distributed router gateway port. +# # hv3 hosts nexthop port vif outside1. +# # All other tests connect hypervisors to network n1 through br-phys for tunneling. +# # But in this test, hv1 won't connect to n1(and no br-phys in hv1), and +# # in order to show vlans(instead of tunneling) used between hv1 and hv2, +# # a new network n2 created and hv1 and hv2 connected to this network through br-ex. +# # hv2 and hv3 are still connected to n1 network through br-phys. +net_add n1 + +# We are not calling ovn_attach for hv1, to avoid adding br-phys. +# Tunneling won't work in hv1 as ovn-encap-ip is not added to any bridge in hv1 +sim_add hv1 +as hv1 +ovs-vsctl \ + -- set Open_vSwitch . external-ids:system-id=hv1 \ + -- set Open_vSwitch . external-ids:ovn-remote=unix:$ovs_base/ovn-sb/ovn-sb.sock \ + -- set Open_vSwitch . external-ids:ovn-encap-type=geneve,vxlan \ + -- set Open_vSwitch . external-ids:ovn-encap-ip=192.168.0.1 \ + -- add-br br-int \ + -- set bridge br-int fail-mode=secure other-config:disable-in-band=true \ + -- set Open_vSwitch . external-ids:ovn-bridge-mappings=public:br-ex + +start_daemon ovn-controller +ovs-vsctl -- add-port br-int hv1-vif1 -- \ + set interface hv1-vif1 external-ids:iface-id=foo1 \ + ofport-request=1 + +sim_add hv2 +as hv2 +ovs-vsctl add-br br-phys +ovn_attach n1 br-phys 192.168.0.2 +ovs-vsctl set Open_vSwitch . 
external-ids:ovn-bridge-mappings="public:br-ex,phys:br-phys" + +sim_add hv3 +as hv3 +ovs-vsctl add-br br-phys +ovn_attach n1 br-phys 192.168.0.3 +ovs-vsctl -- add-port br-int hv3-vif1 -- \ + set interface hv3-vif1 external-ids:iface-id=outside1 \ + options:tx_pcap=hv3/vif1-tx.pcap \ + options:rxq_pcap=hv3/vif1-rx.pcap \ + ofport-request=1 +ovs-vsctl set Open_vSwitch . external-ids:ovn-bridge-mappings="phys:br-phys" + +# Create network n2 for vlan connectivity between hv1 and hv2 +net_add n2 + +as hv1 +ovs-vsctl add-br br-ex +net_attach n2 br-ex + +as hv2 +ovs-vsctl add-br br-ex +net_attach n2 br-ex + +OVN_POPULATE_ARP + +ovn-nbctl create Logical_Router name=R1 + +ovn-nbctl ls-add foo +ovn-nbctl ls-add alice +ovn-nbctl ls-add outside + +# Connect foo to R1 +ovn-nbctl lrp-add R1 foo 00:00:01:01:02:03 192.168.1.1/24 +ovn-nbctl lsp-add foo rp-foo -- set Logical_Switch_Port rp-foo \ + type=router options:router-port=foo \ + -- lsp-set-addresses rp-foo router + +# Connect alice to R1 as distributed router gateway port (172.16.1.6) on hv2 +ovn-nbctl lrp-add R1 alice 00:00:02:01:02:03 172.16.1.6/24 \ + -- set Logical_Router_Port alice options:redirect-chassis="hv2" +ovn-nbctl lsp-add alice rp-alice -- set Logical_Switch_Port rp-alice \ + type=router options:router-port=alice \ + -- lsp-set-addresses rp-alice router \ + + +# Create logical port foo1 in foo +ovn-nbctl lsp-add foo foo1 \ +-- lsp-set-addresses foo1 "f0:00:00:01:02:03 192.168.1.2" + +# Create logical port outside1 in outside, which is a nexthop address +# for 172.16.1.0/24 +ovn-nbctl lsp-add outside outside1 \ +-- lsp-set-addresses outside1 "f0:00:00:01:02:04 172.16.1.1" + +# Set default gateway (nexthop) to 172.16.1.1 +ovn-nbctl lr-route-add R1 "0.0.0.0/0" 172.16.1.1 alice +AT_CHECK([ovn-nbctl lr-nat-add R1 snat 172.16.1.6 192.168.1.1/24]) +ovn-nbctl set Logical_Switch_Port rp-alice options:nat-addresses=router + +ovn-nbctl lsp-add foo ln-foo +ovn-nbctl lsp-set-addresses ln-foo unknown +ovn-nbctl 
lsp-set-options ln-foo network_name=public +ovn-nbctl lsp-set-type ln-foo localnet +AT_CHECK([ovn-nbctl set Logical_Switch_Port ln-foo tag=2]) + +# Create localnet port in alice +ovn-nbctl lsp-add alice ln-alice +ovn-nbctl lsp-set-addresses ln-alice unknown +ovn-nbctl lsp-set-type ln-alice localnet +ovn-nbctl lsp-set-options ln-alice network_name=phys + +# Create localnet port in outside +ovn-nbctl lsp-add outside ln-outside +ovn-nbctl lsp-set-addresses ln-outside unknown +ovn-nbctl lsp-set-type ln-outside localnet +ovn-nbctl lsp-set-options ln-outside network_name=phys + +# Allow some time for ovn-northd and ovn-controller to catch up. +# XXX This should be more systematic. +ovn-nbctl --wait=hv --timeout=3 sync + +echo "---------NB dump-----" +ovn-nbctl show +echo "---------------------" +ovn-nbctl list logical_router +echo "---------------------" +ovn-nbctl list nat +echo "---------------------" +ovn-nbctl list logical_router_port +echo "---------------------" + +echo "---------SB dump-----" +ovn-sbctl list datapath_binding +echo "---------------------" +ovn-sbctl list port_binding +echo "---------------------" +ovn-sbctl dump-flows +echo "---------------------" +ovn-sbctl list chassis +echo "---------------------" + +for chassis in hv1 hv2 hv3; do + as $chassis + echo "------ $chassis dump ----------" + ovs-vsctl show br-int + ovs-ofctl show br-int + ovs-ofctl dump-flows br-int + echo "--------------------------" +done + +ip_to_hex() { + printf "%02x%02x%02x%02x" "$@" +} + +foo1_ip=$(ip_to_hex 192 168 1 2) +gw_ip=$(ip_to_hex 172 16 1 6) +dst_ip=$(ip_to_hex 8 8 8 8) +nexthop_ip=$(ip_to_hex 172 16 1 1) + +foo1_mac="f00000010203" +foo_mac="000001010203" +gw_mac="000002010203" +nexthop_mac="f00000010204" + +# Send ip packet from foo1 to 8.8.8.8 +src_mac="f00000010203" +dst_mac="000001010203" +packet=${foo_mac}${foo1_mac}08004500001c0000000040110000${foo1_ip}${dst_ip}0035111100080000 + +as hv1 ovs-appctl netdev-dummy/receive hv1-vif1 $packet +sleep 2 + +# ARP request 
packet for nexthop_ip to expect at outside1 +arp_request=ffffffffffff${gw_mac}08060001080006040001${gw_mac}${gw_ip}000000000000${nexthop_ip} +echo $arp_request >> hv3-vif1.expected +cat hv3-vif1.expected > expout +$PYTHON "$top_srcdir/utilities/ovs-pcap.in" hv3/vif1-tx.pcap | grep ${nexthop_ip} | uniq > hv3-vif1 +AT_CHECK([sort hv3-vif1], [0], [expout]) + +# Send ARP reply from outside1 back to the router +reply_mac="f00000010204" +arp_reply=${gw_mac}${nexthop_mac}08060001080006040002${nexthop_mac}${nexthop_ip}${gw_mac}${gw_ip} + +as hv3 ovs-appctl netdev-dummy/receive hv3-vif1 $arp_reply +OVS_WAIT_UNTIL([ + test `as hv2 ovs-ofctl dump-flows br-int | grep table=66 | \ +grep actions=mod_dl_dst:f0:00:00:01:02:04 | wc -l` -eq 1 + ]) + +echo "ovn-sbctl list MAC_Binding" +ovn-sbctl list MAC_Binding +echo "============================" + +# VLAN tagged packet with distributed gateway port(172.16.1.6) MAC as destination MAC +# is expected on bridge connecting hv1 and hv2 +expected=${gw_mac}${foo1_mac}8100000208004500001c0000000040110000${foo1_ip}${dst_ip}0035111100080000 +echo $expected > hv1-br-ex_n2.expected + +# Packet to Expect at outside1 i.e nexthop(172.16.1.1) port. +# As connection tracking not enabled for this test, snat can't be done on the packet. +# We still see foo1 as the source ip address. But source mac(gateway MAC) and +# dest mac(nexthop mac) are properly configured. 
+expected=${nexthop_mac}${gw_mac}08004500001c000000003f110100${foo1_ip}${dst_ip}0035111100080000 +echo $expected > hv3-vif1.expected + +reset_pcap_file() { + local iface=$1 + local pcap_file=$2 + ovs-vsctl -- set Interface $iface options:tx_pcap=dummy-tx.pcap \ +options:rxq_pcap=dummy-rx.pcap + rm -f ${pcap_file}*.pcap + ovs-vsctl -- set Interface $iface options:tx_pcap=${pcap_file}-tx.pcap \ +options:rxq_pcap=${pcap_file}-rx.pcap +} + +as hv1 reset_pcap_file br-ex_n2 hv1/br-ex_n2 +as hv3 reset_pcap_file hv3-vif1 hv3/vif1 +sleep 2 +as hv1 ovs-appctl netdev-dummy/receive hv1-vif1 $packet +sleep 2 + +# On hv1, table 65 for packets going from vlan switch pipleline to router pipleine +# set MLF_RCV_FROM_VLAN flag +AT_CHECK([as hv1 ovs-ofctl dump-flows br-int table=65 | grep "priority=100,reg15=0x1,metadata=0x2" \ +| grep actions=clone | grep "load:0x1->NXM_NX_REG10" | wc -l], [0], [[1 +]]) +# On hv1, because of snat rule in table 15, a higher priority(i.e 2) flow +# added for packets with MLF_RCV_FROM_VLAN flag with output as distributed +# gateway port, which sets REGBIT_NAT_REDIRECT flag +AT_CHECK([as hv1 ovs-ofctl dump-flows br-int table=15 | grep "priority=2,ip,reg10=0x20/0x20,metadata=0x1" \ +| grep "actions=load:0x1->OXM_OF_PKT_REG4" | wc -l], [0], [[1 +]]) + +# On hv1, table 32 flow which tags packet with source network vlan tag and sends it to hv2 +# through br-ex +AT_CHECK([as hv1 ovs-ofctl dump-flows br-int table=32 | grep "priority=150,reg10=0x20/0x20,reg14=0x1,reg15=0x3,metadata=0x1" \ +| grep "actions=mod_vlan_vid:2" | grep "n_packets=2," | wc -l], [0], [[1 +]]) + +# On hv2 table 0, vlan tagged packet is sent through router pipeline +# by setting MLF_RCV_FROM_VLAN flag (REG10) +AT_CHECK([as hv2 ovs-ofctl dump-flows br-int | grep "table=0," | grep "priority=200" | grep "dl_vlan=2" | \ +grep "dl_dst=00:00:02:01:02:03" | grep "actions=strip_vlan,load:0x1->OXM_OF_METADATA" | grep "load:0x1->NXM_NX_REG14" | \ +grep "load:0x1->NXM_NX_REG10" | wc -l], [0], [[1 
+]]) +# on hv2 table 8, allow packets with router metadata and with MLF_RCV_FROM_VLAN flag +AT_CHECK([as hv2 ovs-ofctl dump-flows br-int table=8 | grep "priority=100,reg10=0x20/0x20,reg14=0x1,metadata=0x1" | wc -l], [0], [[1 +]]) + +ip_packet() { + grep "2010203f00000010203" +} + +# Check vlan tagged packet on the bridge connecting hv1 and hv2 +$PYTHON "$top_srcdir/utilities/ovs-pcap.in" hv1/br-ex_n2-tx.pcap | ip_packet | uniq > hv1-br-ex_n2 +cat hv1-br-ex_n2.expected > expout +AT_CHECK([sort hv1-br-ex_n2], [0], [expout]) + +# Check expected packet on nexthop interface +$PYTHON "$top_srcdir/utilities/ovs-pcap.in" hv3/vif1-tx.pcap | grep ${foo1_ip}${dst_ip} | uniq > hv3-vif1 +cat hv3-vif1.expected > expout +AT_CHECK([sort hv3-vif1], [0], [expout]) + +OVN_CLEANUP([hv1],[hv2],[hv3]) +AT_CLEANUP + AT_SETUP([ovn -- 1 LR with distributed router gateway port]) AT_SKIP_IF([test $HAVE_PYTHON = no]) ovn_start From patchwork Wed Aug 1 12:16:33 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Venkata Anil X-Patchwork-Id: 952120 X-Patchwork-Delegate: jpettit@nicira.com Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=openvswitch.org (client-ip=140.211.169.12; helo=mail.linuxfoundation.org; envelope-from=ovs-dev-bounces@openvswitch.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=redhat.com Received: from mail.linuxfoundation.org (mail.linuxfoundation.org [140.211.169.12]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 41gXQ04hkYz9s3Z for ; Wed, 1 Aug 2018 22:18:00 +1000 (AEST) Received: from mail.linux-foundation.org (localhost [127.0.0.1]) by mail.linuxfoundation.org (Postfix) with ESMTP id 1C558DD5; Wed, 1 Aug 2018 12:16:54 +0000 (UTC) 
From: vkommadi@redhat.com
To: dev@openvswitch.org
Date: Wed, 1 Aug 2018 17:46:33 +0530
Message-Id: <20180801121635.14509-3-vkommadi@redhat.com>
In-Reply-To: <20180801121635.14509-1-vkommadi@redhat.com>
References: <20180801121635.14509-1-vkommadi@redhat.com>
Subject: [ovs-dev] [PATCH v7 2/4] Send gateway port ARP through router internal ports
X-BeenThere: ovs-dev@openvswitch.org
X-Mailman-Version: 2.1.12 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , MIME-Version: 1.0 Sender: ovs-dev-bounces@openvswitch.org Errors-To: ovs-dev-bounces@openvswitch.org From: venkata anil External switches should learn the distributed gateway port MAC address as they have to forward the packet tagged with tenant vlan network but with this MAC as destination MAC address. So router has to send ARP reply and gARP for this MAC address through router internal patch ports connecting tenant vlan networks. Signed-off-by: Venkata Anil --- v6->v7: * Rebased * Addressed review comments v5->v6: * Rebased v4->v5: * No changes in this patch ovn/controller/pinctrl.c | 57 +++++++++++++++++++++++++++++++++++++++++++----- ovn/northd/ovn-northd.c | 57 ++++++++++++++++++++++++++++++++++++++++++++++++ tests/ovn.at | 6 +++++ 3 files changed, 114 insertions(+), 6 deletions(-) diff --git a/ovn/controller/pinctrl.c b/ovn/controller/pinctrl.c index a0bf602..d01e57f 100644 --- a/ovn/controller/pinctrl.c +++ b/ovn/controller/pinctrl.c @@ -2185,8 +2185,47 @@ get_nat_addresses_and_keys(struct ovsdb_idl_index *sbrec_chassis_by_name, struct sset *local_l3gw_ports, const struct sbrec_chassis *chassis, const struct sset *active_tunnels, - struct shash *nat_addresses) + struct shash *nat_addresses, + const struct hmap *local_datapaths) { + /* When a router has tenant vlan networks, gARP for distributed gateway + * router port has to be sent through internal tenant vlan network's + * localnet port, so that external switches can learn this MAC and forward + * tenant vlan network traffic with distributed gateway router port MAC + * as destination MAC address */ + + struct local_datapath *ldp; + struct shash router_vlan_ports; + + shash_init(&router_vlan_ports); + HMAP_FOR_EACH (ldp, hmap_node, local_datapaths) { + const struct sbrec_port_binding *crp; + crp = ldp->chassisredirect_port; + /* check if it a router with chassis redirect port, + * 
get corresponding distributed port */ + if (crp && crp->chassis && + !strcmp(crp->chassis->name, chassis->name)) { + const struct sbrec_port_binding *dp = NULL; + for (int i = 0; i < ldp->n_peer_dps; i++) { + if (!strcmp(ldp->peer_dps[i]->patch->logical_port, + smap_get(&crp->options, "distributed-port"))) { + dp = ldp->peer_dps[i]->peer; + break; + } + } + + /* Save router internal port (patch port on tenant vlan network) + * along with distributed port. */ + for (int i = 0; i < ldp->n_peer_dps; i++) { + if (strcmp(ldp->peer_dps[i]->patch->logical_port, + smap_get(&crp->options, "distributed-port"))) { + shash_add(&router_vlan_ports, + ldp->peer_dps[i]->peer->logical_port, dp); + } + } + } + } + const char *gw_port; SSET_FOR_EACH(gw_port, local_l3gw_ports) { const struct sbrec_port_binding *pb; @@ -2196,11 +2235,16 @@ get_nat_addresses_and_keys(struct ovsdb_idl_index *sbrec_chassis_by_name, continue; } - if (pb->n_nat_addresses) { - for (int i = 0; i < pb->n_nat_addresses; i++) { + /* Router internal ports should send gARP for distributed port + * NAT addresses */ + const struct sbrec_port_binding *dp; + dp = shash_find_data(&router_vlan_ports, pb->logical_port); + const struct sbrec_port_binding *nat_port = dp ? dp : pb; + if (nat_port->n_nat_addresses) { + for (int i = 0; i < nat_port->n_nat_addresses; i++) { consider_nat_address(sbrec_chassis_by_name, sbrec_port_binding_by_name, - pb->nat_addresses[i], pb, + nat_port->nat_addresses[i], pb, nat_address_keys, chassis, active_tunnels, nat_addresses); @@ -2208,7 +2252,7 @@ get_nat_addresses_and_keys(struct ovsdb_idl_index *sbrec_chassis_by_name, } else { /* Continue to support options:nat-addresses for version * upgrade. 
*/ - const char *nat_addresses_options = smap_get(&pb->options, + const char *nat_addresses_options = smap_get(&nat_port->options, "nat-addresses"); if (nat_addresses_options) { consider_nat_address(sbrec_chassis_by_name, @@ -2220,6 +2264,7 @@ get_nat_addresses_and_keys(struct ovsdb_idl_index *sbrec_chassis_by_name, } } } + shash_destroy(&router_vlan_ports); } static void @@ -2255,7 +2300,7 @@ send_garp_run(struct ovsdb_idl_index *sbrec_chassis_by_name, sbrec_port_binding_by_name, &nat_ip_keys, &local_l3gw_ports, chassis, active_tunnels, - &nat_addresses); + &nat_addresses, local_datapaths); /* For deleted ports and deleted nat ips, remove from send_garp_data. */ struct shash_node *iter, *next; SHASH_FOR_EACH_SAFE (iter, next, &send_garp_data) { diff --git a/ovn/northd/ovn-northd.c b/ovn/northd/ovn-northd.c index 2497a5b..bcf0b66 100644 --- a/ovn/northd/ovn-northd.c +++ b/ovn/northd/ovn-northd.c @@ -5043,6 +5043,63 @@ build_lrouter_flows(struct hmap *datapaths, struct hmap *ports, ds_cstr(&match), ds_cstr(&actions)); } + /* ARP requests for distributed port IP address but coming from router + * internal network vlan, should be replied through router internal + * network vlan ports */ + if (op->od->l3dgw_port && (op == op->od->l3dgw_port) + && op->od->l3redirect_port) { + for (int i = 0; i < op->lrp_networks.n_ipv4_addrs; i++) { + ds_clear(&match); + ds_put_format(&match, + "flags.rcv_from_vlan == 1 && " + "arp.tpa == %s && arp.op == 1 && " + "is_chassis_resident(%s)", + op->lrp_networks.ipv4_addrs[i].addr_s, + op->od->l3redirect_port->json_key); + + ds_clear(&actions); + ds_put_format(&actions, + "eth.dst = eth.src; " + "eth.src = %s; " + "arp.op = 2; /* ARP reply */ " + "arp.tha = arp.sha; " + "arp.sha = %s; " + "arp.tpa = arp.spa; " + "arp.spa = %s; " + "flags.loopback = 1; ", + op->lrp_networks.ea_s, + op->lrp_networks.ea_s, + op->lrp_networks.ipv4_addrs[i].addr_s); + + /* Add internal vlan network ports as output ports */ + bool router_ports_exist = false; + 
struct ovn_datapath *dp; + HMAP_FOR_EACH (dp, key_node, datapaths) { + if (!dp->nbs) { + continue; + } + if (!dp->localnet_port) { + continue; + } + for (size_t j = 0; j < dp->n_router_ports; j++) { + struct ovn_port *rp = dp->router_ports[j]; + if (rp->peer && rp->peer->od == op->od && + rp->peer != op) { + router_ports_exist = true; + ds_put_format(&actions, + "outport = %s; " + "output;", + rp->peer->json_key); + } + } + } + if (router_ports_exist) { + ovn_lflow_add(lflows, op->od, S_ROUTER_IN_IP_INPUT, 90, + ds_cstr(&match), ds_cstr(&actions)); + } + } + } + /* A set to hold all load-balancer vips that need ARP responses. */ struct sset all_ips = SSET_INITIALIZER(&all_ips); int addr_family; diff --git a/tests/ovn.at b/tests/ovn.at index fb9b516..314f0ad 100644 --- a/tests/ovn.at +++ b/tests/ovn.at @@ -7981,6 +7981,12 @@ foo_mac="000001010203" gw_mac="000002010203" nexthop_mac="f00000010204" +# gARP for distributed port has to be sent through router internal patch port +# which is connected to vlan network +garp=ffffffffffff${gw_mac}8100000208060001080006040001${gw_mac}${gw_ip}000000000000${gw_ip} +echo $garp >> hv1-br-ex_n2-rx.expected +OVN_CHECK_PACKETS([hv1/br-ex_n2-rx.pcap], [hv1-br-ex_n2-rx.expected]) + # Send ip packet from foo1 to 8.8.8.8 src_mac="f00000010203" dst_mac="000001010203" From patchwork Wed Aug 1 12:16:34 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Venkata Anil X-Patchwork-Id: 952121 X-Patchwork-Delegate: jpettit@nicira.com Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=openvswitch.org (client-ip=140.211.169.12; helo=mail.linuxfoundation.org; envelope-from=ovs-dev-bounces@openvswitch.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=redhat.com Received: from mail.linuxfoundation.org 
(mail.linuxfoundation.org [140.211.169.12]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 41gXQR6wXqz9s3Z for ; Wed, 1 Aug 2018 22:18:23 +1000 (AEST) Received: from mail.linux-foundation.org (localhost [127.0.0.1]) by mail.linuxfoundation.org (Postfix) with ESMTP id DAB14DF2; Wed, 1 Aug 2018 12:16:57 +0000 (UTC) X-Original-To: dev@openvswitch.org Delivered-To: ovs-dev@mail.linuxfoundation.org Received: from smtp1.linuxfoundation.org (smtp1.linux-foundation.org [172.17.192.35]) by mail.linuxfoundation.org (Postfix) with ESMTPS id 1A019DDD for ; Wed, 1 Aug 2018 12:16:57 +0000 (UTC) X-Greylist: domain auto-whitelisted by SQLgrey-1.7.6 Received: from mx1.redhat.com (mx3-rdu2.redhat.com [66.187.233.73]) by smtp1.linuxfoundation.org (Postfix) with ESMTPS id 6D099701 for ; Wed, 1 Aug 2018 12:16:56 +0000 (UTC) Received: from smtp.corp.redhat.com (int-mx05.intmail.prod.int.rdu2.redhat.com [10.11.54.5]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 9512940201BE for ; Wed, 1 Aug 2018 12:16:55 +0000 (UTC) Received: from vkommadi.lab.eng.blr.redhat.com (dhcp35-207.lab.eng.blr.redhat.com [10.70.35.207]) by smtp.corp.redhat.com (Postfix) with ESMTPS id 718391C5BB; Wed, 1 Aug 2018 12:16:54 +0000 (UTC) From: vkommadi@redhat.com To: dev@openvswitch.org Date: Wed, 1 Aug 2018 17:46:34 +0530 Message-Id: <20180801121635.14509-4-vkommadi@redhat.com> In-Reply-To: <20180801121635.14509-1-vkommadi@redhat.com> References: <20180801121635.14509-1-vkommadi@redhat.com> X-Scanned-By: MIMEDefang 2.79 on 10.11.54.5 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.11.55.6]); Wed, 01 Aug 2018 12:16:55 +0000 (UTC) X-Greylist: inspected by milter-greylist-4.5.16 (mx1.redhat.com [10.11.55.6]); Wed, 01 Aug 2018 12:16:55 +0000 (UTC) for IP:'10.11.54.5' 
DOMAIN:'int-mx05.intmail.prod.int.rdu2.redhat.com' HELO:'smtp.corp.redhat.com' FROM:'vkommadi@redhat.com' RCPT:'' X-Spam-Status: No, score=-4.2 required=5.0 tests=BAYES_00, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on smtp1.linux-foundation.org Subject: [ovs-dev] [PATCH v7 3/4] Document the flows for redirecting VLAN packets X-BeenThere: ovs-dev@openvswitch.org X-Mailman-Version: 2.1.12 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , MIME-Version: 1.0 Sender: ovs-dev-bounces@openvswitch.org Errors-To: ovs-dev-bounces@openvswitch.org From: venkata anil We have added new flows for using vlans instead of tunnels for redirecting VLAN packets to a gateway chassis. This patch documents these flows in ovn-northd.8.xml and ovn-architecture.7.xml. Signed-off-by: Venkata Anil --- v6->v7: * Rebased v5->v6: * Rebased v4->v5: * This patch is added to document the logical and physical flows ovn/northd/ovn-northd.8.xml | 46 +++++++++++++++++++++++++++++++++++++++++++++ ovn/ovn-architecture.7.xml | 26 ++++++++++++++++++++++++- 2 files changed, 71 insertions(+), 1 deletion(-) diff --git a/ovn/northd/ovn-northd.8.xml b/ovn/northd/ovn-northd.8.xml index f1771c6..8fa5272 100644 --- a/ovn/northd/ovn-northd.8.xml +++ b/ovn/northd/ovn-northd.8.xml @@ -995,6 +995,23 @@ output;
  • + For each enabled router port P which is connected to + a VLAN network, a priority-100 flow that matches inport == + P && flags.rcv_from_vlan == 1, + with action next;. +

    + +

    + For the gateway port on a distributed logical router (where + one of the logical router ports specifies a + redirect-chassis), the above flow is only + programmed on the gateway port instance on the + redirect-chassis. +

    +
  • + +
  • +

    For each enabled router port P with Ethernet address E, a priority-50 flow that matches inport == P && (eth.mcast || eth.dst == @@ -1146,6 +1163,18 @@ output;

    For the gateway port on a distributed logical router (where one of the logical router ports specifies a + redirect-chassis), when the ARP request comes + from router internal ports connected to a VLAN network (i.e. + flags.rcv_from_vlan == 1), a priority-90 flow that matches + flags.rcv_from_vlan == 1 && arp.op == 1 + && arp.tpa == A has the + above actions, but with outport set to each router internal port + that is connected to a VLAN network. + 

    + +

    + For the gateway port on a distributed logical router (where + one of the logical router ports specifies a redirect-chassis), the above flows are only programmed on the gateway port instance on the redirect-chassis. This behavior avoids generation @@ -1839,6 +1868,23 @@ next; If the address A is in the link-local scope, the route will be limited to sending on the ingress port.

    + +

    + If the route's outport is a gateway port on a + distributed logical router (where one of the logical router ports + specifies a redirect-chassis), then for packets matching the + MLF_RCV_FROM_VLAN flag along with ip4.dst == + N/M or ip6.dst == + N/M, add a flow whose priority is the + number of 1-bits in M plus 1, with the action + REGBIT_NAT_REDIRECT = 1; next;. + Setting the REGBIT_NAT_REDIRECT flag triggers, in the ingress + table Gateway Redirect, a redirect to + the instance of the gateway port on the redirect-chassis. + This flow is programmed on gateway port instances on chassis other than the + redirect-chassis. This flow is also added if the route + comes from a configured static route. + 
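The priority rule described above (number of 1-bits in M, plus 1) can be illustrated with a small sketch. This is not code from the patch: the helper names `count_mask_bits` and `vlan_redirect_route_priority` are hypothetical, and the sketch assumes an IPv4 mask held in host byte order.

```c
#include <stdint.h>

/* Count the 1-bits in an IPv4 netmask (host byte order). */
static unsigned int
count_mask_bits(uint32_t mask)
{
    unsigned int n = 0;

    while (mask) {
        mask &= mask - 1;   /* Clear the lowest set bit. */
        n++;
    }
    return n;
}

/* Hypothetical helper: priority of the VLAN redirect route flow,
 * i.e. the number of 1-bits in the mask plus 1, as the documentation
 * text above describes it. */
static unsigned int
vlan_redirect_route_priority(uint32_t mask)
{
    return count_mask_bits(mask) + 1;
}
```

Under these assumptions a /24 route (mask 0xffffff00) would get priority 25, one above a plain priority-24 route flow for the same prefix, so the redirect flow would take precedence for packets carrying the flag.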

  • diff --git a/ovn/ovn-architecture.7.xml b/ovn/ovn-architecture.7.xml index ae5ca8e..ad2101c 100644 --- a/ovn/ovn-architecture.7.xml +++ b/ovn/ovn-architecture.7.xml @@ -874,6 +874,19 @@ Encapsulations for encoding details). Then the actions resubmit to table 33 to enter the logical egress pipeline.

    + +

    + For VLAN packets arriving through a localnet port from a remote chassis, + table 0 sets the logical datapath and logical ingress port based on + the localnet port. If these VLAN packets have the distributed gateway port MAC + (the gateway port on a distributed logical router where one of the logical + router ports specifies a redirect-chassis) as their destination MAC address, + a priority-200 flow sets the logical datapath to + the router metadata and the logical ingress port to the patch port connecting + the router and the VLAN network, then resubmits to the logical router ingress + pipeline, i.e. table 8. This flow is only programmed on the gateway + port instance on the redirect-chassis. + 

  • @@ -1020,6 +1033,16 @@ determine the output port.
  • + A higher-priority rule to match packets received from router ports + connected to VLAN networks, based on the flag + MLF_RCV_FROM_VLAN, where the logical output port is a gateway port on + a distributed logical router (where one of the logical router ports + specifies a redirect-chassis) that resides on a remote hypervisor. The actions + tag the packet with the input network's VLAN ID and output the packet + through the input VLAN switch's localnet port. This flow is programmed + on chassis other than the redirect-chassis. + 
  • +
  • A higher-priority rule to match packets received from ports of type localport, based on the logical input port, and resubmit these packets to table 33 for local delivery. Ports of type @@ -1153,7 +1176,8 @@ directly to the first OpenFlow flow table in the ingress pipeline, setting the logical ingress port to the peer logical patch port, and using the peer logical patch port's logical datapath (that - represents the logical router). + represents the logical router). If the logical datapath is a VLAN + network, set the MLF_RCV_FROM_VLAN flag.
  • From patchwork Wed Aug 1 12:16:35 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Venkata Anil X-Patchwork-Id: 952122 X-Patchwork-Delegate: jpettit@nicira.com Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=openvswitch.org (client-ip=140.211.169.12; helo=mail.linuxfoundation.org; envelope-from=ovs-dev-bounces@openvswitch.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=redhat.com Received: from mail.linuxfoundation.org (mail.linuxfoundation.org [140.211.169.12]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 41gXQs55DRz9s3Z for ; Wed, 1 Aug 2018 22:18:45 +1000 (AEST) Received: from mail.linux-foundation.org (localhost [127.0.0.1]) by mail.linuxfoundation.org (Postfix) with ESMTP id 951F2E17; Wed, 1 Aug 2018 12:17:01 +0000 (UTC) X-Original-To: dev@openvswitch.org Delivered-To: ovs-dev@mail.linuxfoundation.org Received: from smtp1.linuxfoundation.org (smtp1.linux-foundation.org [172.17.192.35]) by mail.linuxfoundation.org (Postfix) with ESMTPS id 91D76E04 for ; Wed, 1 Aug 2018 12:17:00 +0000 (UTC) X-Greylist: domain auto-whitelisted by SQLgrey-1.7.6 Received: from mx1.redhat.com (mx3-rdu2.redhat.com [66.187.233.73]) by smtp1.linuxfoundation.org (Postfix) with ESMTPS id BD80ECF for ; Wed, 1 Aug 2018 12:16:59 +0000 (UTC) Received: from smtp.corp.redhat.com (int-mx05.intmail.prod.int.rdu2.redhat.com [10.11.54.5]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id D01AD87A76 for ; Wed, 1 Aug 2018 12:16:58 +0000 (UTC) Received: from vkommadi.lab.eng.blr.redhat.com (dhcp35-207.lab.eng.blr.redhat.com [10.70.35.207]) by smtp.corp.redhat.com (Postfix) 
with ESMTPS id ABB921C5BB; Wed, 1 Aug 2018 12:16:57 +0000 (UTC) From: vkommadi@redhat.com To: dev@openvswitch.org Date: Wed, 1 Aug 2018 17:46:35 +0530 Message-Id: <20180801121635.14509-5-vkommadi@redhat.com> In-Reply-To: <20180801121635.14509-1-vkommadi@redhat.com> References: <20180801121635.14509-1-vkommadi@redhat.com> X-Scanned-By: MIMEDefang 2.79 on 10.11.54.5 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.11.55.1]); Wed, 01 Aug 2018 12:16:58 +0000 (UTC) X-Greylist: inspected by milter-greylist-4.5.16 (mx1.redhat.com [10.11.55.1]); Wed, 01 Aug 2018 12:16:58 +0000 (UTC) for IP:'10.11.54.5' DOMAIN:'int-mx05.intmail.prod.int.rdu2.redhat.com' HELO:'smtp.corp.redhat.com' FROM:'vkommadi@redhat.com' RCPT:'' X-Spam-Status: No, score=-4.2 required=5.0 tests=BAYES_00, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on smtp1.linux-foundation.org Subject: [ovs-dev] [PATCH v7 4/4] Replace router internal MAC with gateway MAC for reply packets X-BeenThere: ovs-dev@openvswitch.org X-Mailman-Version: 2.1.12 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , MIME-Version: 1.0 Sender: ovs-dev-bounces@openvswitch.org Errors-To: ovs-dev-bounces@openvswitch.org From: venkata anil The previous patches in this series do not address issue 1 explained in [1], i.e. 1) the router gateway port MAC address is removed from external switches once the aging time expires, and 2) the external switches are then unable to learn the gateway MAC, because reply packets carry the router internal port MAC address as source. To fix this, the router on the gateway node uses the router gateway MAC address instead of the router internal port MAC address as the source for reply packets, so that external switches can learn the gateway MAC address. This is done only for reply packets from the router gateway to tenant VLAN switch ports. 
Later before delivering the packet to the port, ovn-controller will replace the gateway MAC with router internal port MAC in table 33. [1] //mail.openvswitch.org/pipermail/ovs-dev/2018-July/349803.html Reported-by: Miguel Angel Ajo Reported-at: https://mail.openvswitch.org/pipermail/ovs-dev/2018-July/349803.html Signed-off-by: Venkata Anil Tested-By: Miguel Angel Ajo --- v6->v7: * Added this patch ovn/controller/physical.c | 60 ++++++++++++++++++++++++++++++++++++++++++--- ovn/northd/ovn-northd.8.xml | 10 ++++++++ ovn/northd/ovn-northd.c | 29 ++++++++++++++++++++++ ovn/ovn-architecture.7.xml | 4 ++- 4 files changed, 99 insertions(+), 4 deletions(-) diff --git a/ovn/controller/physical.c b/ovn/controller/physical.c index f269a1d..1f41f59 100644 --- a/ovn/controller/physical.c +++ b/ovn/controller/physical.c @@ -190,7 +190,9 @@ get_zone_ids(const struct sbrec_port_binding *binding, static void put_local_common_flows(uint32_t dp_key, uint32_t port_key, bool nested_container, const struct zone_ids *zone_ids, - struct ofpbuf *ofpacts_p, struct hmap *flow_table) + struct ofpbuf *ofpacts_p, struct hmap *flow_table, + struct local_datapath *ld, + const struct hmap *local_datapaths) { struct match match; @@ -221,11 +223,63 @@ put_local_common_flows(uint32_t dp_key, uint32_t port_key, } } + struct ofpbuf *clone = NULL; + clone = ofpbuf_clone(ofpacts_p); + /* Resubmit to table 34. */ put_resubmit(OFTABLE_CHECK_LOOPBACK, ofpacts_p); ofctrl_add_flow(flow_table, OFTABLE_LOCAL_OUTPUT, 100, 0, &match, ofpacts_p); + /* For a reply packet from gateway with VLAN switch port as destination + * (excluding localnet_port and external VLAN networks), gateway router + * will use gateway MAC address as source MAC instead of router internal + * port MAC, so that external switches can learn gateway MAC address. + * Here (before packet is given to the port) we replace router gateway + * MAC address with router internal port MAC. 
*/ + if (ld->localnet_port && (port_key != ld->localnet_port->tunnel_key)) { + for (int i = 0; i < ld->n_peer_dps; i++) { + struct local_datapath *peer_ldp = get_local_datapath( + local_datapaths, ld->peer_dps[i]->peer_dp->tunnel_key); + const struct sbrec_port_binding *crp; + crp = peer_ldp->chassisredirect_port; + if (!crp) { + continue; + } + + if (strcmp(smap_get(&crp->options, "distributed-port"), + ld->peer_dps[i]->peer->logical_port) && + (port_key != ld->peer_dps[i]->patch->tunnel_key)) { + for (int j = 0; j < crp->n_mac; j++) { + struct lport_addresses laddrs; + if (!extract_lsp_addresses(crp->mac[j], &laddrs)) { + continue; + } + match_set_dl_src(&match, laddrs.ea); + destroy_lport_addresses(&laddrs); + break; + } + for (int j = 0; j < ld->peer_dps[i]->peer->n_mac; j++) { + struct lport_addresses laddrs; + uint64_t mac64; + if (!extract_lsp_addresses( + ld->peer_dps[i]->peer->mac[j], &laddrs)) { + continue; + } + mac64 = eth_addr_to_uint64(laddrs.ea); + put_load(mac64, + MFF_ETH_SRC, 0, 48, clone); + destroy_lport_addresses(&laddrs); + break; + } + put_resubmit(OFTABLE_CHECK_LOOPBACK, clone); + ofctrl_add_flow(flow_table, OFTABLE_LOCAL_OUTPUT, 150, 0, + &match, clone); + } + } + } + ofpbuf_delete(clone); + /* Table 34, Priority 100. * ======================= * @@ -330,7 +384,7 @@ consider_port_binding(struct ovsdb_idl_index *sbrec_chassis_by_name, struct zone_ids binding_zones = get_zone_ids(binding, ct_zones); put_local_common_flows(dp_key, port_key, false, &binding_zones, - ofpacts_p, flow_table); + ofpacts_p, flow_table, ld, local_datapaths); match_init_catchall(&match); ofpbuf_clear(ofpacts_p); @@ -531,7 +585,7 @@ consider_port_binding(struct ovsdb_idl_index *sbrec_chassis_by_name, struct zone_ids zone_ids = get_zone_ids(binding, ct_zones); put_local_common_flows(dp_key, port_key, nested_container, &zone_ids, - ofpacts_p, flow_table); + ofpacts_p, flow_table, ld, local_datapaths); /* Table 0, Priority 200, 150 and 100. 
* ============================== diff --git a/ovn/northd/ovn-northd.8.xml b/ovn/northd/ovn-northd.8.xml index 8fa5272..876c121 100644 --- a/ovn/northd/ovn-northd.8.xml +++ b/ovn/northd/ovn-northd.8.xml @@ -2013,6 +2013,16 @@ next;
  • + A priority-100 logical flow with match + inport == GW && + flags.rcv_from_vlan == 1 has actions + eth.src = E; next;, where + GW is the logical router distributed gateway + port and E is the MAC address of the router + distributed gateway port. + 
  • + +
  • For each NAT rule in the OVN Northbound database that can be handled in a distributed manner, a priority-100 logical flow with match ip4.src == B && diff --git a/ovn/northd/ovn-northd.c b/ovn/northd/ovn-northd.c index bcf0b66..d012bb8 100644 --- a/ovn/northd/ovn-northd.c +++ b/ovn/northd/ovn-northd.c @@ -4419,6 +4419,15 @@ add_route(struct hmap *lflows, const struct ovn_port *op, } else { ds_put_format(&actions, "ip%s.dst", is_ipv4 ? "4" : "6"); } + + if (op->peer && op->peer->od->localnet_port && + op->od->l3dgw_port && op->od->l3redirect_port && + (op != op->od->l3redirect_port) && + (op != op->od->l3dgw_port)) { + ds_put_format(&match, " && is_chassis_resident(%s)", + op->od->l3redirect_port->json_key); + ds_put_format(&actions, "; flags.rcv_from_vlan = 1"); + } ds_put_format(&actions, "; " "%sreg1 = %s; " "eth.src = %s; " @@ -6131,6 +6140,26 @@ build_lrouter_flows(struct hmap *datapaths, struct hmap *ports, op->lrp_networks.ipv6_addrs[i].network_s, op->lrp_networks.ipv6_addrs[i].plen, NULL, NULL); } + + /* For a reply packet from gateway with VLAN switch port as + * destination, replace router internal port MAC with router gateway + * MAC address, so that external switches can learn gateway MAC + * address. Later before delivering the packet to the port, + * controller will replace the gateway MAC with router internal port + * MAC in table 33. */ + if (op->od->l3dgw_port && (op == op->od->l3dgw_port) && + op->od->l3redirect_port) { + ds_clear(&actions); + ds_clear(&match); + ds_put_format(&match, "inport == %s", op->json_key); + ds_put_format(&match, " && flags.rcv_from_vlan == 1"); + ds_put_format(&match, " && is_chassis_resident(%s)", + op->od->l3redirect_port->json_key); + ds_put_format(&actions, + "eth.src = %s; next;", op->lrp_networks.ea_s); + ovn_lflow_add(lflows, op->od, S_ROUTER_IN_GW_REDIRECT, 100, + ds_cstr(&match), ds_cstr(&actions)); + } } /* Convert the static routes to flows. 
*/ diff --git a/ovn/ovn-architecture.7.xml b/ovn/ovn-architecture.7.xml index ad2101c..0de41d2 100644 --- a/ovn/ovn-architecture.7.xml +++ b/ovn/ovn-architecture.7.xml @@ -1067,7 +1067,9 @@

    Flows in table 33 resemble those in table 32 but for logical ports that - reside locally rather than remotely. For unicast logical output ports + reside locally rather than remotely. If these are VLAN ports and + the packet has the router gateway port MAC address as its source, the actions replace it with + the router internal port MAC address. For unicast logical output ports on the local hypervisor, the actions just resubmit to table 34. For multicast output ports that include one or more logical ports on the local hypervisor, for each such logical port P, the actions