From patchwork Wed Nov 11 21:37:40 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Mark Michelson
X-Patchwork-Id: 1398518
From: Mark Michelson
To: dev@openvswitch.org
Date: 
Wed, 11 Nov 2020 16:37:40 -0500
Message-Id: <20201111213740.4180761-1-mmichels@redhat.com>
Subject: [ovs-dev] [PATCH ovn] Allow explicit setting of the SNAT zone on a gateway router.

In certain situations, OVN may coexist with other applications on a
host. Traffic from OVN and the other applications may then leave through
a shared gateway. If OVN traffic and the other applications' traffic use
different conntrack zones for SNAT, the shared gateway can assign
conflicting source IP:port combinations. If both use the same conntrack
zone, there are no conflicting assignments.

In this commit, we introduce options:snat-ct-zone for northbound logical
routers. By setting this option, users can explicitly set the conntrack
zone for the logical router so that it matches the zone used by non-OVN
traffic on the host.

The biggest side effects of this patch are:
1) Southbound datapath changes now result in recalculating CT zones in
   ovn-controller. This can result in recomputing physical flows in
   more situations than previously.
2) The table 65 flow to transition between datapaths is no longer
   associated with a port binding. This is because the flow refers to
   the peer datapath's CT zones, which can now be updated due to changes
   on that datapath. The flow therefore may need to be updated either
   due to the port binding being changed or the peer datapath being
   changed.

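The effect of an explicitly requested SNAT zone on zone allocation can be
sketched roughly as follows. This is a simplified, self-contained model and
not the code in this patch: the zone_table type, its helper functions, and
the small MAX_ZONES/MAX_USERS limits are hypothetical stand-ins for the
simap and bitmap that ovn-controller uses, and unlike the patch the sketch
always evicts the current holder of a requested zone.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_ZONES 16   /* Hypothetical small limit for the sketch. */
#define MAX_USERS 8    /* Hypothetical number of zone users. */

/* Hypothetical stand-in for the ct_zones map plus the zone bitmap. */
struct zone_table {
    const char *user[MAX_USERS];  /* Owner name, NULL if the slot is unused. */
    int zone[MAX_USERS];          /* Zone assigned to user[i]. */
    bool in_use[MAX_ZONES];       /* in_use[z] is true when zone z is taken. */
};

/* Returns the slot index of 'name', or -1 if it has no zone. */
static int
zone_table_find(const struct zone_table *t, const char *name)
{
    for (int i = 0; i < MAX_USERS; i++) {
        if (t->user[i] && !strcmp(t->user[i], name)) {
            return i;
        }
    }
    return -1;
}

/* Returns the zone assigned to 'name', or -1 if it has none. */
static int
zone_table_get(const struct zone_table *t, const char *name)
{
    int i = zone_table_find(t, name);
    return i < 0 ? -1 : t->zone[i];
}

/* Records that 'name' owns 'zone', reusing its slot if it has one. */
static void
zone_table_put(struct zone_table *t, const char *name, int zone)
{
    assert(zone >= 0 && zone < MAX_ZONES);
    int i = zone_table_find(t, name);
    if (i < 0) {
        for (i = 0; i < MAX_USERS && t->user[i]; i++) {
            continue;
        }
        assert(i < MAX_USERS);
        t->user[i] = name;
    }
    t->zone[i] = zone;
    t->in_use[zone] = true;
}

/* Removes the entry in slot 'i' and marks its zone as free. */
static void
zone_table_drop(struct zone_table *t, int i)
{
    t->in_use[t->zone[i]] = false;
    t->user[i] = NULL;
}

/* Gives 'name' the explicitly requested 'zone'.  Any other owner of that
 * zone loses it and must be re-assigned afterwards. */
static void
zone_table_reserve(struct zone_table *t, const char *name, int zone)
{
    int i = zone_table_find(t, name);
    if (i >= 0) {
        if (t->zone[i] == zone) {
            return;                 /* Already holds the requested zone. */
        }
        zone_table_drop(t, i);      /* The request changed; drop the old zone. */
    }
    for (int j = 0; j < MAX_USERS; j++) {
        if (t->user[j] && t->zone[j] == zone) {
            zone_table_drop(t, j);  /* Evict the current holder. */
            break;
        }
    }
    zone_table_put(t, name, zone);
}

/* Assigns the lowest free zone, starting at 1, to each user without one. */
static void
zone_table_assign_rest(struct zone_table *t, const char *const users[],
                       size_t n)
{
    for (size_t u = 0; u < n; u++) {
        if (zone_table_find(t, users[u]) >= 0) {
            continue;
        }
        for (int z = 1; z < MAX_ZONES; z++) {
            if (!t->in_use[z]) {
                zone_table_put(t, users[u], z);
                break;
            }
        }
    }
}
```

For example, if "lport1" holds zone 1 and "gw_snat" holds zone 2, then
zone_table_reserve(&t, "gw_snat", 1) moves gw_snat to zone 1 and leaves
lport1 without a zone until zone_table_assign_rest() runs, which hands it
the lowest free zone (2 in this case).
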
Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=1892311
Signed-off-by: Mark Michelson
---
 controller/ovn-controller.c | 89 +++++++++++++++++++++++++++++++-----
 controller/physical.c       |  2 +-
 lib/ovn-util.c              |  7 +++
 lib/ovn-util.h              |  1 +
 northd/ovn-northd.c         | 10 ++++
 ovn-nb.xml                  |  7 +++
 tests/ovn.at                | 91 +++++++++++++++++++++++++++++++++++++
 7 files changed, 194 insertions(+), 13 deletions(-)

diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c
index a06cae3cc..8691c3076 100644
--- a/controller/ovn-controller.c
+++ b/controller/ovn-controller.c
@@ -531,6 +531,21 @@ update_sb_db(struct ovsdb_idl *ovs_idl, struct ovsdb_idl *ovnsb_idl,
     }
 }
 
+static void
+add_pending_ct_zone_entry(struct shash *pending_ct_zones,
+                          enum ct_zone_pending_state state,
+                          int zone, bool add, const char *name)
+{
+    VLOG_DBG("%s ct zone %"PRId32" for '%s'",
+             add ? "assigning" : "removing", zone, name);
+
+    struct ct_zone_pending_entry *pending = xmalloc(sizeof *pending);
+    pending->state = state;
+    pending->zone = zone;
+    pending->add = add;
+    shash_add(pending_ct_zones, name, pending);
+}
+
 static void
 update_ct_zones(const struct sset *lports, const struct hmap *local_datapaths,
                 struct simap *ct_zones, unsigned long *ct_zone_bitmap,
@@ -540,6 +555,7 @@ update_ct_zones(const struct sset *lports, const struct hmap *local_datapaths,
     int scan_start = 1;
     const char *user;
     struct sset all_users = SSET_INITIALIZER(&all_users);
+    struct simap req_snat_zones = SIMAP_INITIALIZER(&req_snat_zones);
 
     SSET_FOR_EACH(user, lports) {
         sset_add(&all_users, user);
@@ -554,6 +570,25 @@ update_ct_zones(const struct sset *lports, const struct hmap *local_datapaths,
         char *snat = alloc_nat_zone_key(&ld->datapath->header_.uuid, "snat");
         sset_add(&all_users, dnat);
         sset_add(&all_users, snat);
+
+        int req_snat_zone = datapath_snat_ct_zone(ld->datapath);
+        if (req_snat_zone >= 0) {
+            struct simap_node *node;
+            bool collision = false;
+            SIMAP_FOR_EACH (node, &req_snat_zones) {
+                if (node->data == req_snat_zone) {
+                    VLOG_WARN("Datapaths %.*s and " UUID_FMT " request SNAT "
+                              "CT zone %d", UUID_LEN, node->name,
+                              UUID_ARGS(&ld->datapath->header_.uuid),
+                              req_snat_zone);
+                    collision = true;
+                    break;
+                }
+            }
+            if (!collision) {
+                simap_put(&req_snat_zones, snat, req_snat_zone);
+            }
+        }
         free(dnat);
         free(snat);
     }
@@ -564,17 +599,51 @@ update_ct_zones(const struct sset *lports, const struct hmap *local_datapaths,
             VLOG_DBG("removing ct zone %"PRId32" for '%s'",
                      ct_zone->data, ct_zone->name);
 
-            struct ct_zone_pending_entry *pending = xmalloc(sizeof *pending);
-            pending->state = CT_ZONE_DB_QUEUED; /* Skip flushing zone. */
-            pending->zone = ct_zone->data;
-            pending->add = false;
-            shash_add(pending_ct_zones, ct_zone->name, pending);
+            add_pending_ct_zone_entry(pending_ct_zones, CT_ZONE_DB_QUEUED,
+                                      ct_zone->data, false, ct_zone->name);
 
             bitmap_set0(ct_zone_bitmap, ct_zone->data);
             simap_delete(ct_zones, ct_zone);
         }
     }
 
+    /* Prioritize requested CT zones */
+    struct simap_node *snat_req_node;
+    SIMAP_FOR_EACH (snat_req_node, &req_snat_zones) {
+        struct simap_node *node = simap_find(ct_zones, snat_req_node->name);
+        if (node) {
+            if (node->data == snat_req_node->data) {
+                /* Already have this zone reserved */
+                continue;
+            } else {
+                /* Zone has changed for this node. delete old entry */
+                bitmap_set0(ct_zone_bitmap, node->data);
+                simap_delete(ct_zones, node);
+            }
+        } else if (snat_req_node->data > 0 &&
+                   bitmap_is_set(ct_zone_bitmap, snat_req_node->data)) {
+            /* Uh oh. Someone else already has this zone assigned.
+             * We need to find who and remove them from ct_zones so
+             * that they get re-assigned a new zone below
+             */
+            struct simap_node *next;
+            SIMAP_FOR_EACH_SAFE(node, next, ct_zones) {
+                if (node->data == snat_req_node->data) {
+                    simap_delete(ct_zones, node);
+                    break;
+                }
+            }
+        }
+
+        add_pending_ct_zone_entry(pending_ct_zones, CT_ZONE_OF_QUEUED,
+                                  snat_req_node->data, true,
+                                  snat_req_node->name);
+
+        bitmap_set1(ct_zone_bitmap, snat_req_node->data);
+        simap_put(ct_zones, snat_req_node->name, snat_req_node->data);
+    }
+    simap_destroy(&req_snat_zones);
+
     /* xxx This is wasteful to assign a zone to each port--even if no
      * xxx security policy is applied. */
 
@@ -596,13 +665,8 @@ update_ct_zones(const struct sset *lports, const struct hmap *local_datapaths,
         }
         scan_start = zone + 1;
 
-        VLOG_DBG("assigning ct zone %"PRId32" to '%s'", zone, user);
-
-        struct ct_zone_pending_entry *pending = xmalloc(sizeof *pending);
-        pending->state = CT_ZONE_OF_QUEUED;
-        pending->zone = zone;
-        pending->add = true;
-        shash_add(pending_ct_zones, user, pending);
+        add_pending_ct_zone_entry(pending_ct_zones, CT_ZONE_OF_QUEUED,
+                                  zone, true, user);
 
         bitmap_set1(ct_zone_bitmap, zone);
         simap_put(ct_zones, user, zone);
@@ -2330,6 +2394,7 @@ main(int argc, char *argv[])
 
     engine_add_input(&en_ct_zones, &en_ovs_open_vswitch, NULL);
     engine_add_input(&en_ct_zones, &en_ovs_bridge, NULL);
+    engine_add_input(&en_ct_zones, &en_sb_datapath_binding, NULL);
     engine_add_input(&en_ct_zones, &en_runtime_data, NULL);
 
     engine_add_input(&en_runtime_data, &en_ofctrl_is_connected, NULL);
diff --git a/controller/physical.c b/controller/physical.c
index 1bc2c389b..00c4ca4fd 100644
--- a/controller/physical.c
+++ b/controller/physical.c
@@ -926,7 +926,7 @@ consider_port_binding(struct ovsdb_idl_index *sbrec_port_binding_by_name,
 
         ofctrl_add_flow(flow_table, OFTABLE_LOG_TO_PHY, 100,
                         binding->header_.uuid.parts[0],
-                        &match, ofpacts_p, &binding->header_.uuid);
+                        &match, ofpacts_p, hc_uuid);
         return;
     }
 
diff --git a/lib/ovn-util.c b/lib/ovn-util.c
index abe6b04a7..0a6758ab1 100644
--- a/lib/ovn-util.c
+++ b/lib/ovn-util.c
@@ -532,6 +532,13 @@ datapath_is_switch(const struct sbrec_datapath_binding *ldp)
 {
     return smap_get(&ldp->external_ids, "logical-switch") != NULL;
 }
+
+int
+datapath_snat_ct_zone(const struct sbrec_datapath_binding *dp)
+{
+    return smap_get_int(&dp->external_ids, "snat-ct-zone", -1);
+}
+
 
 struct tnlid_node {
     struct hmap_node hmap_node;
diff --git a/lib/ovn-util.h b/lib/ovn-util.h
index a39cbef5a..a035c86e1 100644
--- a/lib/ovn-util.h
+++ b/lib/ovn-util.h
@@ -107,6 +107,7 @@ uint32_t ovn_logical_flow_hash(const struct uuid *logical_datapath,
                                uint16_t priority,
                                const char *match, const char *actions);
 bool datapath_is_switch(const struct sbrec_datapath_binding *);
+int datapath_snat_ct_zone(const struct sbrec_datapath_binding *ldp);
 void ovn_conn_show(struct unixctl_conn *conn, int argc OVS_UNUSED,
                    const char *argv[] OVS_UNUSED, void *idl_);
 
diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
index 684c2bd47..7b22f2c3e 100644
--- a/northd/ovn-northd.c
+++ b/northd/ovn-northd.c
@@ -1179,6 +1179,16 @@ ovn_datapath_update_external_ids(struct ovn_datapath *od)
             smap_add(&ids, "interconn-ts", ts);
         }
     }
+
+    /* Set snat-ct-zone */
+    if (od->nbr) {
+        int nat_default_ct = smap_get_int(&od->nbr->options,
+                                           "snat-ct-zone", -1);
+        if (nat_default_ct >= 0) {
+            smap_add_format(&ids, "snat-ct-zone", "%d", nat_default_ct);
+        }
+    }
+
     sbrec_datapath_binding_set_external_ids(od->sb, &ids);
     smap_destroy(&ids);
 }
diff --git a/ovn-nb.xml b/ovn-nb.xml
index 5e8635992..1be1ca6c0 100644
--- a/ovn-nb.xml
+++ b/ovn-nb.xml
@@ -1941,6 +1941,13 @@
         unique key for each datapath by itself.  However, if it is configured,
         ovn-northd honors the configured value.
+
+        Use the requested conntrack zone for SNAT with this router. This can be
+        useful if egress traffic from the host running OVN comes from both OVN and
+        other sources. This way, OVN and the other sources can make use of the same
+        conntrack zone.
+
+
diff --git a/tests/ovn.at b/tests/ovn.at
index f154e3d77..0e902b49e 100644
--- a/tests/ovn.at
+++ b/tests/ovn.at
@@ -22578,3 +22578,94 @@ AT_CHECK([test "$encap_rec_mvtep" == "$encap_rec_mvtep1"], [0], [])
 
 OVN_CLEANUP([hv1])
 AT_CLEANUP
+
+AT_SETUP([ovn -- snat default ct zone])
+ovn_start
+
+net_add n1
+sim_add hv1
+ovs-vsctl add-br br-phys
+as hv1
+ovn_attach n1 br-phys 192.168.0.10
+
+ovn-nbctl ls-add sw0
+ovn-nbctl lsp-add sw0 sw0-p1
+ovn-nbctl lsp-set-addresses sw0-p1 "00:00:00:00:00:02 10.0.0.2"
+
+ovn-nbctl lr-add gw_router
+ovn-nbctl set Logical_Router gw_router options:chassis="hv1"
+
+ovn-nbctl lrp-add gw_router gw_router-sw0 00:00:00:00:00:01 10.0.0.1/24
+ovn-nbctl lsp-add sw0 sw0-gw_router
+ovn-nbctl lsp-set-addresses sw0-gw_router router
+ovn-nbctl set Logical_Switch_Port sw0-gw_router type=router \
+    options:router-port=gw_router-sw0
+
+ovn-nbctl lr-nat-add gw_router snat 192.168.0.1 10.0.0.0/24
+
+as hv1 ovs-vsctl -- add-port br-int hv1-vif1 -- \
+    set interface hv1-vif1 external-ids:iface-id=sw0-p1
+
+ovn-nbctl --wait=hv sync
+
+ro_nb_uuid=$(ovn-nbctl get Logical_Router gw_router _uuid)
+sw_nb_uuid=$(ovn-nbctl get Logical_Switch sw0 _uuid)
+ro_sb_uuid=$(ovn-sbctl --bare --columns=_uuid find Datapath_Binding external-ids:logical-router=${ro_nb_uuid})
+sw_sb_uuid=$(ovn-sbctl --bare --columns=_uuid find Datapath_Binding external-ids:logical-switch=${sw_nb_uuid})
+ro_meta=$(ovn-sbctl get Datapath_Binding ${ro_sb_uuid} tunnel_key)
+ro_meta=$(printf %#x ${ro_meta})
+sw_meta=$(ovn-sbctl get Datapath_Binding ${sw_sb_uuid} tunnel_key)
+sw_meta=$(printf %#x ${sw_meta})
+
+echo "ro_nb_uuid: ${ro_nb_uuid}"
+echo "sw_nb_uuid: ${sw_nb_uuid}"
+echo "ro_sb_uuid: ${ro_sb_uuid}"
+echo "sw_sb_uuid: ${sw_sb_uuid}"
+echo "ro_meta: ${ro_meta}"
+echo "sw_meta: ${sw_meta}"
+
+as hv1
+ovs-vsctl list bridge br-int
+snat_zone=$(printf %#x $(ovs-vsctl get bridge br-int external-ids:ct-zone-${ro_sb_uuid}_snat | tr -d \"))
+
+echo "snat_zone: ${snat_zone}"
+
+as hv1 ovs-ofctl dump-flows br-int > offlows_pre
+AT_CAPTURE_FILE([offlows_pre])
+# We should have a flow in table 33 that transitions from the ingress pipeline
+# to the egress pipeline of gw_router.
+AT_CHECK_UNQUOTED([grep -c "table=33.*metadata=${ro_meta}.*load:${snat_zone}->NXM_NX_REG12[]" offlows_pre], [0], [dnl
+1
+])
+
+# We should have a flow in table 65 that transitions from the egress pipeline
+# of sw0 to the ingress pipeline of gw_router.
+AT_CHECK_UNQUOTED([grep -c "table=65.*metadata=${sw_meta}.*load:${snat_zone}->NXM_NX_REG12[]" offlows_pre], [0], [dnl
+1
+])
+
+ovn-nbctl --wait=hv set Logical_Router gw_router options:snat-ct-zone=666
+
+as hv1
+snat_zone=$(ovs-vsctl get bridge br-int external-ids:ct-zone-${ro_sb_uuid}_snat | tr -d \")
+
+echo "snat_zone: ${snat_zone}"
+
+AT_CHECK([test "${snat_zone}" = "666"], [0], [])
+
+as hv1 ovs-ofctl dump-flows br-int > offlows_post
+AT_CAPTURE_FILE([offlows_post])
+# We should have a flow in table 33 that transitions from the ingress pipeline
+# to the egress pipeline of gw_router.
+AT_CHECK_UNQUOTED([grep -c "table=33.*metadata=${ro_meta}.*load:0x29a->NXM_NX_REG12[]" offlows_post], [0], [dnl
+1
+])
+
+# We should have a flow in table 65 that transitions from the egress pipeline
+# of sw0 to the ingress pipeline of gw_router.
+AT_CHECK_UNQUOTED([grep -c "table=65.*metadata=${sw_meta}.*load:0x29a->NXM_NX_REG12[]" offlows_post], [0], [dnl
+1
+])
+
+OVN_CLEANUP([hv1])
+AT_CLEANUP
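
The test above sets options:snat-ct-zone=666 and then greps the flow dump
for load:0x29a, because the flow dump prints the zone in hexadecimal. A
minimal sketch of that relationship is below; the helper name
flow_loads_snat_zone() is hypothetical and is not part of OVN, and the flow
strings used with it are illustrative rather than real ovs-ofctl output.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: returns true if the textual flow 'flow' contains an
 * action that loads 'zone' into REG12, formatted the same way as the shell
 * "printf %#x" used in the test (666 becomes 0x29a). */
static bool
flow_loads_snat_zone(const char *flow, int zone)
{
    char needle[64];

    assert(zone >= 0);
    snprintf(needle, sizeof needle, "load:%#x->NXM_NX_REG12",
             (unsigned int) zone);
    return strstr(flow, needle) != NULL;
}
```

With this helper, a flow string containing "load:0x29a->NXM_NX_REG12[]"
matches zone 666, while one containing "load:0x3->NXM_NX_REG12[]" does not.
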