From patchwork Thu May 23 13:57:56 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ales Musil X-Patchwork-Id: 1938359 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=fail reason="signature verification failed" (1024-bit key; unprotected) header.d=redhat.com header.i=@redhat.com header.a=rsa-sha256 header.s=mimecast20190719 header.b=hNALz0/Q; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=openvswitch.org (client-ip=2605:bc80:3010::137; helo=smtp4.osuosl.org; envelope-from=ovs-dev-bounces@openvswitch.org; receiver=patchwork.ozlabs.org) Received: from smtp4.osuosl.org (smtp4.osuosl.org [IPv6:2605:bc80:3010::137]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4VlVBV6J4Tz20KL for ; Thu, 23 May 2024 23:58:26 +1000 (AEST) Received: from localhost (localhost [127.0.0.1]) by smtp4.osuosl.org (Postfix) with ESMTP id D69664082E; Thu, 23 May 2024 13:58:24 +0000 (UTC) X-Virus-Scanned: amavis at osuosl.org Received: from smtp4.osuosl.org ([127.0.0.1]) by localhost (smtp4.osuosl.org [127.0.0.1]) (amavis, port 10024) with ESMTP id 6DpEFK-74W5T; Thu, 23 May 2024 13:58:19 +0000 (UTC) X-Comment: SPF check N/A for local connections - client-ip=2605:bc80:3010:104::8cd3:938; helo=lists.linuxfoundation.org; envelope-from=ovs-dev-bounces@openvswitch.org; receiver= DKIM-Filter: OpenDKIM Filter v2.11.0 smtp4.osuosl.org 844D940800 Authentication-Results: smtp4.osuosl.org; dkim=fail reason="signature verification failed" (1024-bit key) header.d=redhat.com header.i=@redhat.com header.a=rsa-sha256 header.s=mimecast20190719 header.b=hNALz0/Q Received: from lists.linuxfoundation.org (lf-lists.osuosl.org [IPv6:2605:bc80:3010:104::8cd3:938]) by smtp4.osuosl.org (Postfix) with ESMTPS id 844D940800; Thu, 23 May 2024 13:58:19 +0000 (UTC) Received: from lf-lists.osuosl.org (localhost [127.0.0.1]) by lists.linuxfoundation.org (Postfix) with ESMTP id 24A7AC0DD8; Thu, 23 May 2024 13:58:19 +0000 (UTC) X-Original-To: dev@openvswitch.org Delivered-To: ovs-dev@lists.linuxfoundation.org Received: from smtp4.osuosl.org (smtp4.osuosl.org [140.211.166.137]) by lists.linuxfoundation.org (Postfix) with ESMTP id A4D2DC0DCE for ; Thu, 23 May 2024 13:58:09 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by smtp4.osuosl.org (Postfix) with ESMTP id 63538407D0 for ; Thu, 23 May 2024 13:58:09 +0000 (UTC) X-Virus-Scanned: amavis at osuosl.org Received: from smtp4.osuosl.org ([127.0.0.1]) by localhost (smtp4.osuosl.org [127.0.0.1]) (amavis, port 10024) with ESMTP id qBUIFtxJeV5i for ; Thu, 23 May 2024 13:58:06 +0000 (UTC) Received-SPF: Pass (mailfrom) identity=mailfrom; client-ip=170.10.129.124; helo=us-smtp-delivery-124.mimecast.com; envelope-from=amusil@redhat.com; receiver= DMARC-Filter: OpenDMARC Filter v1.4.2 smtp4.osuosl.org 3B84040685 Authentication-Results: smtp4.osuosl.org; dmarc=pass (p=none dis=none) header.from=redhat.com DKIM-Filter: OpenDKIM Filter v2.11.0 smtp4.osuosl.org 3B84040685 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by smtp4.osuosl.org (Postfix) with ESMTPS id 3B84040685 for ; Thu, 23 May 2024 13:58:05 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1716472684; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=dTu6o/GyXVhIiV6dXseNC1jDJQekv+0iC+Fa26j9yVM=; b=hNALz0/QUaFV8GLEnl/NMny5Jq0tdzSymBdl+gIirbHM+rHLaxpSFJdIadI10IPKf/5Pqf s+4nF+ydn0g16uSww7slFuQ2XnbelX5jzU7ADrQhBVnES4Q+uwNGcg5ewaqBEaRAXQWYwW HEvhgNJJmubt+P+jWLwEuYO+A+wTaT4= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-139-95g_AhdtP_m7pJ9CmQmZLQ-1; Thu, 23 May 2024 09:58:03 -0400 X-MC-Unique: 95g_AhdtP_m7pJ9CmQmZLQ-1 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.rdu2.redhat.com [10.11.54.4]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 0A44E185A780 for ; Thu, 23 May 2024 13:58:03 +0000 (UTC) Received: from amusil.redhat.com (unknown [10.45.226.38]) by smtp.corp.redhat.com (Postfix) with ESMTP id 0356A2026D68; Thu, 23 May 2024 13:58:01 +0000 (UTC) From: Ales Musil To: dev@openvswitch.org Date: Thu, 23 May 2024 15:57:56 +0200 Message-ID: <20240523135759.1352700-2-amusil@redhat.com> In-Reply-To: <20240523135759.1352700-1-amusil@redhat.com> References: <20240523135759.1352700-1-amusil@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.4.1 on 10.11.54.4 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Subject: [ovs-dev] [PATCH ovn 1/4] controller: Move CT zone handling into separate module. X-BeenThere: ovs-dev@openvswitch.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: ovs-dev-bounces@openvswitch.org Sender: "dev" Move the CT zone handling specific bits into it's own module. This allows for easier changes done within the module and separates the logic that is unrelated from ovn-controller. Signed-off-by: Ales Musil --- controller/automake.mk | 4 +- controller/ct-zone.c | 377 ++++++++++++++++++++++++++++++++++ controller/ct-zone.h | 74 +++++++ controller/ofctrl.c | 3 +- controller/ovn-controller.c | 392 +++--------------------------------- controller/ovn-controller.h | 21 +- controller/pinctrl.c | 2 +- tests/ovn.at | 4 +- 8 files changed, 485 insertions(+), 392 deletions(-) create mode 100644 controller/ct-zone.c create mode 100644 controller/ct-zone.h diff --git a/controller/automake.mk b/controller/automake.mk index 1b1b3aeb1..ed93cfb3c 100644 --- a/controller/automake.mk +++ b/controller/automake.mk @@ -47,7 +47,9 @@ controller_ovn_controller_SOURCES = \ controller/mac-cache.h \ controller/mac-cache.c \ controller/statctrl.h \ - controller/statctrl.c + controller/statctrl.c \ + controller/ct-zone.h \ + controller/ct-zone.c controller_ovn_controller_LDADD = lib/libovn.la $(OVS_LIBDIR)/libopenvswitch.la man_MANS += controller/ovn-controller.8 diff --git a/controller/ct-zone.c b/controller/ct-zone.c new file mode 100644 index 000000000..96084fd9e --- /dev/null +++ b/controller/ct-zone.c @@ -0,0 +1,377 @@ +/* Copyright (c) 2024, Red Hat, Inc. + * + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at: + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +#include + +#include "ct-zone.h" +#include "local_data.h" +#include "openvswitch/vlog.h" + +VLOG_DEFINE_THIS_MODULE(ct_zone); + +static void +ct_zone_restore(const struct sbrec_datapath_binding_table *dp_table, + struct ct_zone_ctx *ctx, const char *name, int zone); +static void ct_zone_add_pending(struct shash *pending_ct_zones, + enum ct_zone_pending_state state, + int zone, bool add, const char *name); + +void +ct_zones_restore(struct ct_zone_ctx *ctx, + const struct ovsrec_open_vswitch_table *ovs_table, + const struct sbrec_datapath_binding_table *dp_table, + const struct ovsrec_bridge *br_int) +{ + memset(ctx->bitmap, 0, sizeof ctx->bitmap); + bitmap_set1(ctx->bitmap, 0); /* Zone 0 is reserved. */ + + struct shash_node *pending_node; + SHASH_FOR_EACH (pending_node, &ctx->pending) { + struct ct_zone_pending_entry *ctpe = pending_node->data; + + if (ctpe->add) { + ct_zone_restore(dp_table, ctx, pending_node->name, ctpe->zone); + } + } + + const struct ovsrec_open_vswitch *cfg; + cfg = ovsrec_open_vswitch_table_first(ovs_table); + if (!cfg) { + return; + } + + if (!br_int) { + /* If the integration bridge hasn't been defined, assume that + * any existing ct-zone definitions aren't valid. */ + return; + } + + struct smap_node *node; + SMAP_FOR_EACH (node, &br_int->external_ids) { + if (strncmp(node->key, "ct-zone-", 8)) { + continue; + } + + const char *user = node->key + 8; + if (!user[0]) { + continue; + } + + if (shash_find(&ctx->pending, user)) { + continue; + } + + unsigned int zone; + if (!str_to_uint(node->value, 10, &zone)) { + continue; + } + + ct_zone_restore(dp_table, ctx, user, zone); + } +} + +bool +ct_zone_assign_unused(struct ct_zone_ctx *ctx, const char *zone_name, + int *scan_start) +{ + /* We assume that there are 64K zones and that we own them all. */ + int zone = bitmap_scan(ctx->bitmap, 0, *scan_start, MAX_CT_ZONES + 1); + if (zone == MAX_CT_ZONES + 1) { + static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 1); + VLOG_WARN_RL(&rl, "exhausted all ct zones"); + return false; + } + + *scan_start = zone + 1; + + ct_zone_add_pending(&ctx->pending, CT_ZONE_OF_QUEUED, + zone, true, zone_name); + + bitmap_set1(ctx->bitmap, zone); + simap_put(&ctx->current, zone_name, zone); + return true; +} + +bool +ct_zone_remove(struct ct_zone_ctx *ctx, const char *name) +{ + struct simap_node *ct_zone = simap_find(&ctx->current, name); + if (!ct_zone) { + return false; + } + + VLOG_DBG("removing ct zone %"PRId32" for '%s'", ct_zone->data, + ct_zone->name); + + ct_zone_add_pending(&ctx->pending, CT_ZONE_OF_QUEUED, + ct_zone->data, false, ct_zone->name); + bitmap_set0(ctx->bitmap, ct_zone->data); + simap_delete(&ctx->current, ct_zone); + + return true; +} + +void +ct_zones_update(const struct sset *local_lports, + const struct hmap *local_datapaths, struct ct_zone_ctx *ctx) +{ + struct simap_node *ct_zone; + int scan_start = 1; + const char *user; + struct sset all_users = SSET_INITIALIZER(&all_users); + struct simap req_snat_zones = SIMAP_INITIALIZER(&req_snat_zones); + unsigned long unreq_snat_zones_map[BITMAP_N_LONGS(MAX_CT_ZONES)]; + struct simap unreq_snat_zones = SIMAP_INITIALIZER(&unreq_snat_zones); + + const char *local_lport; + SSET_FOR_EACH (local_lport, local_lports) { + sset_add(&all_users, local_lport); + } + + /* Local patched datapath (gateway routers) need zones assigned. */ + const struct local_datapath *ld; + HMAP_FOR_EACH (ld, hmap_node, local_datapaths) { + /* XXX Add method to limit zone assignment to logical router + * datapaths with NAT */ + const char *name = smap_get(&ld->datapath->external_ids, "name"); + if (!name) { + static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 1); + VLOG_ERR_RL(&rl, "Missing name for datapath '"UUID_FMT"' " + "skipping zone assignment.", + UUID_ARGS(&ld->datapath->header_.uuid)); + continue; + } + + char *dnat = alloc_nat_zone_key(name, "dnat"); + char *snat = alloc_nat_zone_key(name, "snat"); + sset_add(&all_users, dnat); + sset_add(&all_users, snat); + + int req_snat_zone = ct_zone_get_snat(ld->datapath); + if (req_snat_zone >= 0) { + simap_put(&req_snat_zones, snat, req_snat_zone); + } + free(dnat); + free(snat); + } + + /* Delete zones that do not exist in above sset. */ + SIMAP_FOR_EACH_SAFE (ct_zone, &ctx->current) { + if (!sset_contains(&all_users, ct_zone->name)) { + ct_zone_remove(ctx, ct_zone->name); + } else if (!simap_find(&req_snat_zones, ct_zone->name)) { + bitmap_set1(unreq_snat_zones_map, ct_zone->data); + simap_put(&unreq_snat_zones, ct_zone->name, ct_zone->data); + } + } + + /* Prioritize requested CT zones */ + struct simap_node *snat_req_node; + SIMAP_FOR_EACH (snat_req_node, &req_snat_zones) { + /* Determine if someone already had this zone auto-assigned. + * If so, then they need to give up their assignment since + * that zone is being explicitly requested now. + */ + if (bitmap_is_set(unreq_snat_zones_map, snat_req_node->data)) { + struct simap_node *unreq_node; + SIMAP_FOR_EACH_SAFE (unreq_node, &unreq_snat_zones) { + if (unreq_node->data == snat_req_node->data) { + simap_find_and_delete(&ctx->current, unreq_node->name); + simap_delete(&unreq_snat_zones, unreq_node); + } + } + + /* Set this bit to 0 so that if multiple datapaths have requested + * this zone, we don't needlessly double-detect this condition. + */ + bitmap_set0(unreq_snat_zones_map, snat_req_node->data); + } + + struct simap_node *node = simap_find(&ctx->current, + snat_req_node->name); + if (node) { + if (node->data != snat_req_node->data) { + /* Zone request has changed for this node. delete old entry and + * create new one*/ + ct_zone_add_pending(&ctx->pending, CT_ZONE_OF_QUEUED, + snat_req_node->data, true, + snat_req_node->name); + bitmap_set0(ctx->bitmap, node->data); + } + bitmap_set1(ctx->bitmap, snat_req_node->data); + node->data = snat_req_node->data; + } else { + ct_zone_add_pending(&ctx->pending, CT_ZONE_OF_QUEUED, + snat_req_node->data, true, + snat_req_node->name); + bitmap_set1(ctx->bitmap, snat_req_node->data); + simap_put(&ctx->current, snat_req_node->name, snat_req_node->data); + } + } + + /* xxx This is wasteful to assign a zone to each port--even if no + * xxx security policy is applied. */ + + /* Assign a unique zone id for each logical port and two zones + * to a gateway router. */ + SSET_FOR_EACH (user, &all_users) { + if (simap_contains(&ctx->current, user)) { + continue; + } + + ct_zone_assign_unused(ctx, user, &scan_start); + } + + simap_destroy(&req_snat_zones); + simap_destroy(&unreq_snat_zones); + sset_destroy(&all_users); +} + +void +ct_zones_commit(const struct ovsrec_bridge *br_int, + struct shash *pending_ct_zones) +{ + struct shash_node *iter; + SHASH_FOR_EACH (iter, pending_ct_zones) { + struct ct_zone_pending_entry *ctzpe = iter->data; + + /* The transaction is open, so any pending entries in the + * CT_ZONE_DB_QUEUED must be sent and any in CT_ZONE_DB_QUEUED + * need to be retried. */ + if (ctzpe->state != CT_ZONE_DB_QUEUED + && ctzpe->state != CT_ZONE_DB_SENT) { + continue; + } + + char *user_str = xasprintf("ct-zone-%s", iter->name); + if (ctzpe->add) { + char *zone_str = xasprintf("%"PRId32, ctzpe->zone); + struct smap_node *node = + smap_get_node(&br_int->external_ids, user_str); + if (!node || strcmp(node->value, zone_str)) { + ovsrec_bridge_update_external_ids_setkey(br_int, + user_str, zone_str); + } + free(zone_str); + } else { + if (smap_get(&br_int->external_ids, user_str)) { + ovsrec_bridge_update_external_ids_delkey(br_int, user_str); + } + } + free(user_str); + + ctzpe->state = CT_ZONE_DB_SENT; + } +} + +int +ct_zone_get_snat(const struct sbrec_datapath_binding *dp) +{ + return smap_get_int(&dp->external_ids, "snat-ct-zone", -1); +} + +void +ct_zones_pending_clear_commited(struct shash *pending) +{ + struct shash_node *iter; + SHASH_FOR_EACH_SAFE (iter, pending) { + struct ct_zone_pending_entry *ctzpe = iter->data; + if (ctzpe->state == CT_ZONE_DB_SENT) { + shash_delete(pending, iter); + free(ctzpe); + } + } +} + +static void +ct_zone_add_pending(struct shash *pending_ct_zones, + enum ct_zone_pending_state state, + int zone, bool add, const char *name) +{ + VLOG_DBG("%s ct zone %"PRId32" for '%s'", + add ? "assigning" : "removing", zone, name); + + struct ct_zone_pending_entry *pending = xmalloc(sizeof *pending); + *pending = (struct ct_zone_pending_entry) { + .state = state, + .zone = zone, + .add = add, + }; + + /* Its important that we add only one entry for the key 'name'. + * Replace 'pending' with 'existing' and free up 'existing'. + * Otherwise, we may end up in a continuous loop of adding + * and deleting the zone entry in the 'external_ids' of + * integration bridge. + */ + struct ct_zone_pending_entry *existing = + shash_replace(pending_ct_zones, name, pending); + free(existing); +} + +/* Replaces a UUID prefix from 'uuid_zone' (if any) with the + * corresponding Datapath_Binding.external_ids.name. Returns it + * as a new string that will be owned by the caller. */ +static char * +ct_zone_name_from_uuid(const struct sbrec_datapath_binding_table *dp_table, + const char *uuid_zone) +{ + struct uuid uuid; + if (!uuid_from_string_prefix(&uuid, uuid_zone)) { + return NULL; + } + + const struct sbrec_datapath_binding *dp = + sbrec_datapath_binding_table_get_for_uuid(dp_table, &uuid); + if (!dp) { + return NULL; + } + + const char *entity_name = smap_get(&dp->external_ids, "name"); + if (!entity_name) { + return NULL; + } + + return xasprintf("%s%s", entity_name, uuid_zone + UUID_LEN); +} + +static void +ct_zone_restore(const struct sbrec_datapath_binding_table *dp_table, + struct ct_zone_ctx *ctx, const char *name, int zone) +{ + VLOG_DBG("restoring ct zone %"PRId32" for '%s'", zone, name); + + char *new_name = ct_zone_name_from_uuid(dp_table, name); + const char *current_name = name; + if (new_name) { + VLOG_DBG("ct zone %"PRId32" replace uuid name '%s' with '%s'", + zone, name, new_name); + + /* Make sure we remove the uuid one in the next OvS DB commit without + * flush. */ + ct_zone_add_pending(&ctx->pending, CT_ZONE_DB_QUEUED, + zone, false, name); + /* Store the zone in OvS DB with name instead of uuid without flush. + * */ + ct_zone_add_pending(&ctx->pending, CT_ZONE_DB_QUEUED, + zone, true, new_name); + current_name = new_name; + } + + simap_put(&ctx->current, current_name, zone); + bitmap_set1(ctx->bitmap, zone); + + free(new_name); +} diff --git a/controller/ct-zone.h b/controller/ct-zone.h new file mode 100644 index 000000000..6b14de935 --- /dev/null +++ b/controller/ct-zone.h @@ -0,0 +1,74 @@ +/* Copyright (c) 2024, Red Hat, Inc. + * + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at: + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +#ifndef OVN_CT_ZONE_H +#define OVN_CT_ZONE_H + +#include + +#include "bitmap.h" +#include "openvswitch/hmap.h" +#include "openvswitch/shash.h" +#include "openvswitch/types.h" +#include "ovn-sb-idl.h" +#include "simap.h" +#include "vswitch-idl.h" + +/* Linux supports a maximum of 64K zones, which seems like a fine default. */ +#define MAX_CT_ZONES 65535 +#define BITMAP_SIZE BITMAP_N_LONGS(MAX_CT_ZONES) + +/* Context of CT zone assignment. */ +struct ct_zone_ctx { + unsigned long bitmap[BITMAP_SIZE]; /* Bitmap indication of allocated + * zones. */ + struct shash pending; /* Pending entries, + * 'struct ct_zone_pending_entry' + * by name. */ + struct simap current; /* Current CT zones mapping + * (zone id by name). */ +}; + +/* States to move through when a new conntrack zone has been allocated. */ +enum ct_zone_pending_state { + CT_ZONE_OF_QUEUED, /* Waiting to send conntrack flush command. */ + CT_ZONE_OF_SENT, /* Sent and waiting for confirmation on flush. */ + CT_ZONE_DB_QUEUED, /* Waiting for DB transaction to open. */ + CT_ZONE_DB_SENT, /* Sent and waiting for confirmation from DB. */ +}; + +struct ct_zone_pending_entry { + int zone; + bool add; /* Is the entry being added? */ + ovs_be32 of_xid; /* Transaction id for barrier. */ + enum ct_zone_pending_state state; +}; + +void ct_zones_restore(struct ct_zone_ctx *ctx, + const struct ovsrec_open_vswitch_table *ovs_table, + const struct sbrec_datapath_binding_table *dp_table, + const struct ovsrec_bridge *br_int); +bool ct_zone_assign_unused(struct ct_zone_ctx *ctx, const char *zone_name, + int *scan_start); +bool ct_zone_remove(struct ct_zone_ctx *ctx, const char *name); +void ct_zones_update(const struct sset *local_lports, + const struct hmap *local_datapaths, + struct ct_zone_ctx *ctx); +void ct_zones_commit(const struct ovsrec_bridge *br_int, + struct shash *pending_ct_zones); +int ct_zone_get_snat(const struct sbrec_datapath_binding *dp); +void ct_zones_pending_clear_commited(struct shash *pending); + +#endif /* controller/ct-zone.h */ diff --git a/controller/ofctrl.c b/controller/ofctrl.c index 6a2564604..5e54a2e3f 100644 --- a/controller/ofctrl.c +++ b/controller/ofctrl.c @@ -42,7 +42,6 @@ #include "openvswitch/ofp-util.h" #include "openvswitch/ofpbuf.h" #include "openvswitch/vlog.h" -#include "ovn-controller.h" #include "ovn/actions.h" #include "lib/extend-table.h" #include "lib/lb.h" @@ -53,6 +52,8 @@ #include "timeval.h" #include "util.h" #include "vswitch-idl.h" +#include "ovn-sb-idl.h" +#include "ct-zone.h" VLOG_DEFINE_THIS_MODULE(ofctrl); diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c index 6b38f113d..fc72e5e2c 100644 --- a/controller/ovn-controller.c +++ b/controller/ovn-controller.c @@ -86,6 +86,7 @@ #include "mac-cache.h" #include "statctrl.h" #include "lib/dns-resolve.h" +#include "ct-zone.h" VLOG_DEFINE_THIS_MODULE(main); @@ -668,342 +669,14 @@ update_sb_db(struct ovsdb_idl *ovs_idl, struct ovsdb_idl *ovnsb_idl, } } -static void -add_pending_ct_zone_entry(struct shash *pending_ct_zones, - enum ct_zone_pending_state state, - int zone, bool add, const char *name) -{ - VLOG_DBG("%s ct zone %"PRId32" for '%s'", - add ? "assigning" : "removing", zone, name); - - struct ct_zone_pending_entry *pending = xmalloc(sizeof *pending); - pending->state = state; /* Skip flushing zone. */ - pending->zone = zone; - pending->add = add; - - /* Its important that we add only one entry for the key 'name'. - * Replace 'pending' with 'existing' and free up 'existing'. - * Otherwise, we may end up in a continuous loop of adding - * and deleting the zone entry in the 'external_ids' of - * integration bridge. - */ - struct ct_zone_pending_entry *existing = - shash_replace(pending_ct_zones, name, pending); - free(existing); -} - -static bool -alloc_id_to_ct_zone(const char *zone_name, struct simap *ct_zones, - unsigned long *ct_zone_bitmap, int *scan_start, - struct shash *pending_ct_zones) -{ - /* We assume that there are 64K zones and that we own them all. */ - int zone = bitmap_scan(ct_zone_bitmap, 0, *scan_start, MAX_CT_ZONES + 1); - if (zone == MAX_CT_ZONES + 1) { - static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 1); - VLOG_WARN_RL(&rl, "exhausted all ct zones"); - return false; - } - - *scan_start = zone + 1; - - add_pending_ct_zone_entry(pending_ct_zones, CT_ZONE_OF_QUEUED, - zone, true, zone_name); - - bitmap_set1(ct_zone_bitmap, zone); - simap_put(ct_zones, zone_name, zone); - return true; -} - -static int -get_snat_ct_zone(const struct sbrec_datapath_binding *dp) -{ - return smap_get_int(&dp->external_ids, "snat-ct-zone", -1); -} - -static void -update_ct_zones(const struct sset *local_lports, - const struct hmap *local_datapaths, - struct simap *ct_zones, unsigned long *ct_zone_bitmap, - struct shash *pending_ct_zones) -{ - struct simap_node *ct_zone; - int scan_start = 1; - const char *user; - struct sset all_users = SSET_INITIALIZER(&all_users); - struct simap req_snat_zones = SIMAP_INITIALIZER(&req_snat_zones); - unsigned long unreq_snat_zones_map[BITMAP_N_LONGS(MAX_CT_ZONES)]; - struct simap unreq_snat_zones = SIMAP_INITIALIZER(&unreq_snat_zones); - - const char *local_lport; - SSET_FOR_EACH (local_lport, local_lports) { - sset_add(&all_users, local_lport); - } - - /* Local patched datapath (gateway routers) need zones assigned. */ - const struct local_datapath *ld; - HMAP_FOR_EACH (ld, hmap_node, local_datapaths) { - /* XXX Add method to limit zone assignment to logical router - * datapaths with NAT */ - const char *name = smap_get(&ld->datapath->external_ids, "name"); - if (!name) { - static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 1); - VLOG_ERR_RL(&rl, "Missing name for datapath '"UUID_FMT"' " - "skipping zone assignment.", - UUID_ARGS(&ld->datapath->header_.uuid)); - continue; - } - - char *dnat = alloc_nat_zone_key(name, "dnat"); - char *snat = alloc_nat_zone_key(name, "snat"); - sset_add(&all_users, dnat); - sset_add(&all_users, snat); - - int req_snat_zone = get_snat_ct_zone(ld->datapath); - if (req_snat_zone >= 0) { - simap_put(&req_snat_zones, snat, req_snat_zone); - } - free(dnat); - free(snat); - } - - /* Delete zones that do not exist in above sset. */ - SIMAP_FOR_EACH_SAFE (ct_zone, ct_zones) { - if (!sset_contains(&all_users, ct_zone->name)) { - VLOG_DBG("removing ct zone %"PRId32" for '%s'", - ct_zone->data, ct_zone->name); - - add_pending_ct_zone_entry(pending_ct_zones, CT_ZONE_OF_QUEUED, - ct_zone->data, false, ct_zone->name); - - bitmap_set0(ct_zone_bitmap, ct_zone->data); - simap_delete(ct_zones, ct_zone); - } else if (!simap_find(&req_snat_zones, ct_zone->name)) { - bitmap_set1(unreq_snat_zones_map, ct_zone->data); - simap_put(&unreq_snat_zones, ct_zone->name, ct_zone->data); - } - } - - /* Prioritize requested CT zones */ - struct simap_node *snat_req_node; - SIMAP_FOR_EACH (snat_req_node, &req_snat_zones) { - /* Determine if someone already had this zone auto-assigned. - * If so, then they need to give up their assignment since - * that zone is being explicitly requested now. - */ - if (bitmap_is_set(unreq_snat_zones_map, snat_req_node->data)) { - struct simap_node *unreq_node; - SIMAP_FOR_EACH_SAFE (unreq_node, &unreq_snat_zones) { - if (unreq_node->data == snat_req_node->data) { - simap_find_and_delete(ct_zones, unreq_node->name); - simap_delete(&unreq_snat_zones, unreq_node); - } - } - - /* Set this bit to 0 so that if multiple datapaths have requested - * this zone, we don't needlessly double-detect this condition. - */ - bitmap_set0(unreq_snat_zones_map, snat_req_node->data); - } - - struct simap_node *node = simap_find(ct_zones, snat_req_node->name); - if (node) { - if (node->data != snat_req_node->data) { - /* Zone request has changed for this node. delete old entry and - * create new one*/ - add_pending_ct_zone_entry(pending_ct_zones, CT_ZONE_OF_QUEUED, - snat_req_node->data, true, - snat_req_node->name); - bitmap_set0(ct_zone_bitmap, node->data); - } - bitmap_set1(ct_zone_bitmap, snat_req_node->data); - node->data = snat_req_node->data; - } else { - add_pending_ct_zone_entry(pending_ct_zones, CT_ZONE_OF_QUEUED, - snat_req_node->data, true, snat_req_node->name); - bitmap_set1(ct_zone_bitmap, snat_req_node->data); - simap_put(ct_zones, snat_req_node->name, snat_req_node->data); - } - } - - /* xxx This is wasteful to assign a zone to each port--even if no - * xxx security policy is applied. */ - - /* Assign a unique zone id for each logical port and two zones - * to a gateway router. */ - SSET_FOR_EACH(user, &all_users) { - if (simap_contains(ct_zones, user)) { - continue; - } - - alloc_id_to_ct_zone(user, ct_zones, ct_zone_bitmap, &scan_start, - pending_ct_zones); - } - - simap_destroy(&req_snat_zones); - simap_destroy(&unreq_snat_zones); - sset_destroy(&all_users); -} - -static void -commit_ct_zones(const struct ovsrec_bridge *br_int, - struct shash *pending_ct_zones) -{ - struct shash_node *iter; - SHASH_FOR_EACH(iter, pending_ct_zones) { - struct ct_zone_pending_entry *ctzpe = iter->data; - - /* The transaction is open, so any pending entries in the - * CT_ZONE_DB_QUEUED must be sent and any in CT_ZONE_DB_QUEUED - * need to be retried. */ - if (ctzpe->state != CT_ZONE_DB_QUEUED - && ctzpe->state != CT_ZONE_DB_SENT) { - continue; - } - - char *user_str = xasprintf("ct-zone-%s", iter->name); - if (ctzpe->add) { - char *zone_str = xasprintf("%"PRId32, ctzpe->zone); - struct smap_node *node = - smap_get_node(&br_int->external_ids, user_str); - if (!node || strcmp(node->value, zone_str)) { - ovsrec_bridge_update_external_ids_setkey(br_int, - user_str, zone_str); - } - free(zone_str); - } else { - if (smap_get(&br_int->external_ids, user_str)) { - ovsrec_bridge_update_external_ids_delkey(br_int, user_str); - } - } - free(user_str); - - ctzpe->state = CT_ZONE_DB_SENT; - } -} - /* Connection tracking zones. */ struct ed_type_ct_zones { - unsigned long bitmap[BITMAP_N_LONGS(MAX_CT_ZONES)]; - struct shash pending; - struct simap current; + struct ct_zone_ctx ctx; /* Tracked data. */ bool recomputed; }; -/* Replaces a UUID prefix from 'uuid_zone' (if any) with the - * corresponding Datapath_Binding.external_ids.name. Returns it - * as a new string that will be owned by the caller. */ -static char * -ct_zone_name_from_uuid(const struct sbrec_datapath_binding_table *dp_table, - const char *uuid_zone) -{ - struct uuid uuid; - if (!uuid_from_string_prefix(&uuid, uuid_zone)) { - return NULL; - } - - const struct sbrec_datapath_binding *dp = - sbrec_datapath_binding_table_get_for_uuid(dp_table, &uuid); - if (!dp) { - return NULL; - } - - const char *entity_name = smap_get(&dp->external_ids, "name"); - if (!entity_name) { - return NULL; - } - - return xasprintf("%s%s", entity_name, uuid_zone + UUID_LEN); -} - -static void -ct_zone_restore(const struct sbrec_datapath_binding_table *dp_table, - struct ed_type_ct_zones *ct_zones_data, const char *name, - int zone) -{ - VLOG_DBG("restoring ct zone %"PRId32" for '%s'", zone, name); - - char *new_name = ct_zone_name_from_uuid(dp_table, name); - const char *current_name = name; - if (new_name) { - VLOG_DBG("ct zone %"PRId32" replace uuid name '%s' with '%s'", - zone, name, new_name); - - /* Make sure we remove the uuid one in the next OvS DB commit without - * flush. */ - add_pending_ct_zone_entry(&ct_zones_data->pending, CT_ZONE_DB_QUEUED, - zone, false, name); - /* Store the zone in OvS DB with name instead of uuid without flush. - * */ - add_pending_ct_zone_entry(&ct_zones_data->pending, CT_ZONE_DB_QUEUED, - zone, true, new_name); - current_name = new_name; - } - - simap_put(&ct_zones_data->current, current_name, zone); - bitmap_set1(ct_zones_data->bitmap, zone); - - free(new_name); -} - -static void -restore_ct_zones(const struct ovsrec_bridge_table *bridge_table, - const struct ovsrec_open_vswitch_table *ovs_table, - const struct sbrec_datapath_binding_table *dp_table, - struct ed_type_ct_zones *ct_zones_data) -{ - memset(ct_zones_data->bitmap, 0, sizeof ct_zones_data->bitmap); - bitmap_set1(ct_zones_data->bitmap, 0); /* Zone 0 is reserved. */ - - struct shash_node *pending_node; - SHASH_FOR_EACH (pending_node, &ct_zones_data->pending) { - struct ct_zone_pending_entry *ctpe = pending_node->data; - - if (ctpe->add) { - ct_zone_restore(dp_table, ct_zones_data, - pending_node->name, ctpe->zone); - } - } - - const struct ovsrec_open_vswitch *cfg; - cfg = ovsrec_open_vswitch_table_first(ovs_table); - if (!cfg) { - return; - } - - const struct ovsrec_bridge *br_int; - br_int = get_bridge(bridge_table, br_int_name(ovs_table)); - if (!br_int) { - /* If the integration bridge hasn't been defined, assume that - * any existing ct-zone definitions aren't valid. */ - return; - } - - struct smap_node *node; - SMAP_FOR_EACH(node, &br_int->external_ids) { - if (strncmp(node->key, "ct-zone-", 8)) { - continue; - } - - const char *user = node->key + 8; - if (!user[0]) { - continue; - } - - if (shash_find(&ct_zones_data->pending, user)) { - continue; - } - - unsigned int zone; - if (!str_to_uint(node->value, 10, &zone)) { - continue; - } - - ct_zone_restore(dp_table, ct_zones_data, user, zone); - } -} static uint64_t get_nb_cfg(const struct sbrec_sb_global_table *sb_global_table, @@ -2512,8 +2185,8 @@ en_ct_zones_init(struct engine_node *node OVS_UNUSED, { struct ed_type_ct_zones *data = xzalloc(sizeof *data); - shash_init(&data->pending); - simap_init(&data->current); + shash_init(&data->ctx.pending); + simap_init(&data->ctx.current); return data; } @@ -2530,8 +2203,8 @@ en_ct_zones_cleanup(void *data) { struct ed_type_ct_zones *ct_zones_data = data; - simap_destroy(&ct_zones_data->current); - shash_destroy_free_data(&ct_zones_data->pending); + simap_destroy(&ct_zones_data->ctx.current); + shash_destroy_free_data(&ct_zones_data->ctx.pending); } static void @@ -2548,11 +2221,11 @@ en_ct_zones_run(struct engine_node *node, void *data) const struct sbrec_datapath_binding_table *dp_table = EN_OVSDB_GET(engine_get_input("SB_datapath_binding", node)); - restore_ct_zones(bridge_table, ovs_table, dp_table, ct_zones_data); - update_ct_zones(&rt_data->local_lports, &rt_data->local_datapaths, - &ct_zones_data->current, ct_zones_data->bitmap, - &ct_zones_data->pending); + const struct ovsrec_bridge *br_int = get_br_int(bridge_table, ovs_table); + ct_zones_restore(&ct_zones_data->ctx, ovs_table, dp_table, br_int); + ct_zones_update(&rt_data->local_lports, &rt_data->local_datapaths, + &ct_zones_data->ctx); ct_zones_data->recomputed = true; engine_set_node_state(node, EN_UPDATED); @@ -2583,7 +2256,7 @@ ct_zones_datapath_binding_handler(struct engine_node *node, void *data) return false; } - int req_snat_zone = get_snat_ct_zone(dp); + int req_snat_zone = ct_zone_get_snat(dp); if (req_snat_zone == -1) { /* datapath snat ct zone is not set. This condition will also hit * when CMS clears the snat-ct-zone for the logical router. @@ -2605,7 +2278,7 @@ ct_zones_datapath_binding_handler(struct engine_node *node, void *data) } char *snat_dp_zone_key = alloc_nat_zone_key(name, "snat"); - struct simap_node *simap_node = simap_find(&ct_zones_data->current, + struct simap_node *simap_node = simap_find(&ct_zones_data->ctx.current, snat_dp_zone_key); free(snat_dp_zone_key); if (!simap_node || simap_node->data != req_snat_zone) { @@ -2657,25 +2330,16 @@ ct_zones_runtime_data_handler(struct engine_node *node, void *data) if (t_lport->tracked_type == TRACKED_RESOURCE_NEW || t_lport->tracked_type == TRACKED_RESOURCE_UPDATED) { - if (!simap_contains(&ct_zones_data->current, + if (!simap_contains(&ct_zones_data->ctx.current, t_lport->pb->logical_port)) { - alloc_id_to_ct_zone(t_lport->pb->logical_port, - &ct_zones_data->current, - ct_zones_data->bitmap, &scan_start, - &ct_zones_data->pending); + ct_zone_assign_unused(&ct_zones_data->ctx, + t_lport->pb->logical_port, + &scan_start); updated = true; } } else if (t_lport->tracked_type == TRACKED_RESOURCE_REMOVED) { - struct simap_node *ct_zone = - simap_find(&ct_zones_data->current, - t_lport->pb->logical_port); - if (ct_zone) { - add_pending_ct_zone_entry( - &ct_zones_data->pending, CT_ZONE_OF_QUEUED, - ct_zone->data, false, ct_zone->name); - - bitmap_set0(ct_zones_data->bitmap, ct_zone->data); - simap_delete(&ct_zones_data->current, ct_zone); + if (ct_zone_remove(&ct_zones_data->ctx, + t_lport->pb->logical_port)) { updated = true; } } @@ -4689,7 +4353,7 @@ static void init_physical_ctx(struct engine_node *node, struct ed_type_ct_zones *ct_zones_data = engine_get_input_data("ct_zones", node); - struct simap *ct_zones = &ct_zones_data->current; + struct simap *ct_zones = &ct_zones_data->ctx.current; parse_encap_ips(ovs_table, &p_ctx->n_encap_ips, &p_ctx->encap_ips); p_ctx->sbrec_port_binding_by_name = sbrec_port_binding_by_name; @@ -5591,7 +5255,7 @@ main(int argc, char *argv[]) unixctl_command_register("ct-zone-list", "", 0, 0, ct_zone_list, - &ct_zones_data->current); + &ct_zones_data->ctx.current); struct pending_pkt pending_pkt = { .conn = NULL }; unixctl_command_register("inject-pkt", "MICROFLOW", 1, 1, inject_pkt, @@ -5812,7 +5476,7 @@ main(int argc, char *argv[]) if (br_int) { ct_zones_data = engine_get_data(&en_ct_zones); if (ofctrl_run(br_int, ovs_table, - ct_zones_data ? &ct_zones_data->pending + ct_zones_data ? &ct_zones_data->ctx.pending : NULL)) { static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 1); @@ -5872,7 +5536,8 @@ main(int argc, char *argv[]) if (ct_zones_data) { stopwatch_start(CT_ZONE_COMMIT_STOPWATCH_NAME, time_msec()); - commit_ct_zones(br_int, &ct_zones_data->pending); + ct_zones_commit(br_int, + &ct_zones_data->ctx.pending); stopwatch_stop(CT_ZONE_COMMIT_STOPWATCH_NAME, time_msec()); } @@ -6016,7 +5681,7 @@ main(int argc, char *argv[]) time_msec()); ofctrl_put(&lflow_output_data->flow_table, &pflow_output_data->flow_table, - &ct_zones_data->pending, + &ct_zones_data->ctx.pending, &lb_data->removed_tuples, sbrec_meter_by_name, ofctrl_seqno_get_req_cfg(), @@ -6134,14 +5799,7 @@ main(int argc, char *argv[]) * (or it did not change anything in the database). */ ct_zones_data = engine_get_data(&en_ct_zones); if (ct_zones_data) { - struct shash_node *iter; - SHASH_FOR_EACH_SAFE (iter, &ct_zones_data->pending) { - struct ct_zone_pending_entry *ctzpe = iter->data; - if (ctzpe->state == CT_ZONE_DB_SENT) { - shash_delete(&ct_zones_data->pending, iter); - free(ctzpe); - } - } + ct_zones_pending_clear_commited(&ct_zones_data->ctx.pending); } vif_plug_finish_deleted( diff --git a/controller/ovn-controller.h b/controller/ovn-controller.h index 3a0e95377..fafd704df 100644 --- a/controller/ovn-controller.h +++ b/controller/ovn-controller.h @@ -17,29 +17,10 @@ #ifndef OVN_CONTROLLER_H #define OVN_CONTROLLER_H 1 -#include "simap.h" -#include "lib/ovn-sb-idl.h" +#include struct ovsrec_bridge_table; -/* Linux supports a maximum of 64K zones, which seems like a fine default. */ -#define MAX_CT_ZONES 65535 - -/* States to move through when a new conntrack zone has been allocated. */ -enum ct_zone_pending_state { - CT_ZONE_OF_QUEUED, /* Waiting to send conntrack flush command. */ - CT_ZONE_OF_SENT, /* Sent and waiting for confirmation on flush. */ - CT_ZONE_DB_QUEUED, /* Waiting for DB transaction to open. */ - CT_ZONE_DB_SENT, /* Sent and waiting for confirmation from DB. */ -}; - -struct ct_zone_pending_entry { - int zone; - bool add; /* Is the entry being added? */ - ovs_be32 of_xid; /* Transaction id for barrier. */ - enum ct_zone_pending_state state; -}; - const struct ovsrec_bridge *get_bridge(const struct ovsrec_bridge_table *, const char *br_name); diff --git a/controller/pinctrl.c b/controller/pinctrl.c index 6a2c3dc68..e754d61a8 100644 --- a/controller/pinctrl.c +++ b/controller/pinctrl.c @@ -45,7 +45,6 @@ #include "lib/crc32c.h" #include "lib/dhcp.h" -#include "ovn-controller.h" #include "ovn/actions.h" #include "ovn/lex.h" #include "lib/acl-log.h" @@ -62,6 +61,7 @@ #include "vswitch-idl.h" #include "lflow.h" #include "ip-mcast.h" +#include "ovn-sb-idl.h" VLOG_DEFINE_THIS_MODULE(pinctrl); diff --git a/tests/ovn.at b/tests/ovn.at index 96e43d80c..359a31204 100644 --- a/tests/ovn.at +++ b/tests/ovn.at @@ -36941,7 +36941,7 @@ check ovn-nbctl lsp-set-type sw0-lr0 router check ovn-nbctl lsp-set-addresses sw0-lr0 00:00:00:00:ff:01 check ovn-nbctl lsp-set-options sw0-lr0 router-port=lr0-sw0 -check ovn-appctl -t ovn-controller vlog/set dbg:main +check ovn-appctl -t ovn-controller vlog/set dbg:ct_zone wait_for_ports_up ovn-nbctl --wait=hv sync @@ -37049,7 +37049,7 @@ OVS_WAIT_UNTIL([test "$(ovn-appctl -t ovn-controller debug/status)" != "running" replace_with_uuid lr0 replace_with_uuid sw0 -start_daemon ovn-controller --verbose="main:dbg" +start_daemon ovn-controller --verbose="ct_zone:dbg" OVS_WAIT_UNTIL([test "$(grep -c 'ct zone .* replace uuid name' hv1/ovn-controller.log)" = "8"])