From patchwork Thu Feb 8 21:49:20 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Numan Siddique X-Patchwork-Id: 1896771 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=openvswitch.org (client-ip=140.211.166.133; helo=smtp2.osuosl.org; envelope-from=ovs-dev-bounces@openvswitch.org; receiver=patchwork.ozlabs.org) Received: from smtp2.osuosl.org (smtp2.osuosl.org [140.211.166.133]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4TW9cb5BWxz23hb for ; Fri, 9 Feb 2024 08:49:35 +1100 (AEDT) Received: from localhost (localhost [127.0.0.1]) by smtp2.osuosl.org (Postfix) with ESMTP id 8CAAF41C6F; Thu, 8 Feb 2024 21:49:33 +0000 (UTC) X-Virus-Scanned: amavisd-new at osuosl.org Received: from smtp2.osuosl.org ([127.0.0.1]) by localhost (smtp2.osuosl.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id ELFoejcs0c4m; Thu, 8 Feb 2024 21:49:32 +0000 (UTC) X-Comment: SPF check N/A for local connections - client-ip=2605:bc80:3010:104::8cd3:938; helo=lists.linuxfoundation.org; envelope-from=ovs-dev-bounces@openvswitch.org; receiver= DKIM-Filter: OpenDKIM Filter v2.11.0 smtp2.osuosl.org 5018141BDC Received: from lists.linuxfoundation.org (lf-lists.osuosl.org [IPv6:2605:bc80:3010:104::8cd3:938]) by smtp2.osuosl.org (Postfix) with ESMTPS id 5018141BDC; Thu, 8 Feb 2024 21:49:32 +0000 (UTC) Received: from lf-lists.osuosl.org (localhost [127.0.0.1]) by lists.linuxfoundation.org (Postfix) with ESMTP id 2227DC0072; Thu, 8 Feb 2024 21:49:32 +0000 (UTC) X-Original-To: dev@openvswitch.org Delivered-To: ovs-dev@lists.linuxfoundation.org Received: from smtp2.osuosl.org (smtp2.osuosl.org [IPv6:2605:bc80:3010::133]) by lists.linuxfoundation.org (Postfix) with ESMTP id 4C370C0072 for ; Thu, 8 Feb 2024 21:49:30 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by smtp2.osuosl.org (Postfix) with ESMTP id 2E40F41BDC for ; Thu, 8 Feb 2024 21:49:30 +0000 (UTC) X-Virus-Scanned: amavisd-new at osuosl.org Received: from smtp2.osuosl.org ([127.0.0.1]) by localhost (smtp2.osuosl.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id yIB4n1ohVx6M for ; Thu, 8 Feb 2024 21:49:29 +0000 (UTC) Received-SPF: Pass (mailfrom) identity=mailfrom; client-ip=217.70.183.196; helo=relay4-d.mail.gandi.net; envelope-from=numans@ovn.org; receiver= DMARC-Filter: OpenDMARC Filter v1.4.2 smtp2.osuosl.org 0DDF841BC3 Authentication-Results: smtp2.osuosl.org; dmarc=none (p=none dis=none) header.from=ovn.org DKIM-Filter: OpenDKIM Filter v2.11.0 smtp2.osuosl.org 0DDF841BC3 Received: from relay4-d.mail.gandi.net (relay4-d.mail.gandi.net [217.70.183.196]) by smtp2.osuosl.org (Postfix) with ESMTPS id 0DDF841BC3 for ; Thu, 8 Feb 2024 21:49:28 +0000 (UTC) Received: by mail.gandi.net (Postfix) with ESMTPSA id 2776BE0002; Thu, 8 Feb 2024 21:49:24 +0000 (UTC) From: numans@ovn.org To: dev@openvswitch.org Date: Thu, 8 Feb 2024 16:49:20 -0500 Message-ID: <20240208214920.12747-1-numans@ovn.org> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20240208214904.12696-1-numans@ovn.org> References: <20240208214904.12696-1-numans@ovn.org> MIME-Version: 1.0 X-GND-Sasl: numans@ovn.org Subject: [ovs-dev] [PATCH ovn v1 1/4] northd: Don't add lr_out_delivery default drop flow for each lrp. X-BeenThere: ovs-dev@openvswitch.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: ovs-dev-bounces@openvswitch.org Sender: "dev" From: Numan Siddique The default drop flow in lr_out_delivery stage is generated for every router port of a logical router. This results in the lflow_table_add_lflow() to be called multiple times for the same match and actions and the ovn_lflow to have multiple dp_refcnts. Fix this by generating this lflow only once for each router. Fixes: 27a92cc272aa ("northd: make default drops explicit") Signed-off-by: Numan Siddique Acked-by: Han Zhou Acked-by: Dumitru Ceara --- northd/northd.c | 15 +++++++++------ 1 file changed, 9 insertions(+), 6 deletions(-) diff --git a/northd/northd.c b/northd/northd.c index a174a4dcd1..a5d5e67117 100644 --- a/northd/northd.c +++ b/northd/northd.c @@ -13470,9 +13470,6 @@ build_egress_delivery_flows_for_lrouter_port( ds_put_format(match, "outport == %s", op->json_key); ovn_lflow_add(lflows, op->od, S_ROUTER_OUT_DELIVERY, 100, ds_cstr(match), "output;", lflow_ref); - - ovn_lflow_add_default_drop(lflows, op->od, S_ROUTER_OUT_DELIVERY, - lflow_ref); } static void @@ -14838,9 +14835,9 @@ lrouter_check_nat_entry(const struct ovn_datapath *od, } /* NAT, Defrag and load balancing. */ -static void build_lr_nat_defrag_and_lb_default_flows(struct ovn_datapath *od, - struct lflow_table *lflows, - struct lflow_ref *lflow_ref) +static void build_lr_nat_defrag_and_lb_default_flows( + struct ovn_datapath *od, struct lflow_table *lflows, + struct lflow_ref *lflow_ref) { ovs_assert(od->nbr); @@ -14866,6 +14863,12 @@ static void build_lr_nat_defrag_and_lb_default_flows(struct ovn_datapath *od, * packet would go through conntrack - which is not required. */ ovn_lflow_add(lflows, od, S_ROUTER_OUT_SNAT, 120, "nd_ns", "next;", lflow_ref); + + /* Default drop rule in lr_out_delivery stage. See + * build_egress_delivery_flows_for_lrouter_port() which adds a rule + * for each router port. */ + ovn_lflow_add_default_drop(lflows, od, S_ROUTER_OUT_DELIVERY, + lflow_ref); } static void From patchwork Thu Feb 8 21:49:26 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Numan Siddique X-Patchwork-Id: 1896772 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=openvswitch.org (client-ip=2605:bc80:3010::137; helo=smtp4.osuosl.org; envelope-from=ovs-dev-bounces@openvswitch.org; receiver=patchwork.ozlabs.org) Received: from smtp4.osuosl.org (smtp4.osuosl.org [IPv6:2605:bc80:3010::137]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4TW9cl0QsWz23hb for ; Fri, 9 Feb 2024 08:49:43 +1100 (AEDT) Received: from localhost (localhost [127.0.0.1]) by smtp4.osuosl.org (Postfix) with ESMTP id 2C74F4ECC9; Thu, 8 Feb 2024 21:49:41 +0000 (UTC) X-Virus-Scanned: amavisd-new at osuosl.org Received: from smtp4.osuosl.org ([127.0.0.1]) by localhost (smtp4.osuosl.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 1yf3mghJoZc1; Thu, 8 Feb 2024 21:49:40 +0000 (UTC) X-Comment: SPF check N/A for local connections - client-ip=2605:bc80:3010:104::8cd3:938; helo=lists.linuxfoundation.org; envelope-from=ovs-dev-bounces@openvswitch.org; receiver= DKIM-Filter: OpenDKIM Filter v2.11.0 smtp4.osuosl.org 19A8B4ED22 Received: from lists.linuxfoundation.org (lf-lists.osuosl.org [IPv6:2605:bc80:3010:104::8cd3:938]) by smtp4.osuosl.org (Postfix) with ESMTPS id 19A8B4ED22; Thu, 8 Feb 2024 21:49:40 +0000 (UTC) Received: from lf-lists.osuosl.org (localhost [127.0.0.1]) by lists.linuxfoundation.org (Postfix) with ESMTP id DB5DAC0072; Thu, 8 Feb 2024 21:49:39 +0000 (UTC) X-Original-To: dev@openvswitch.org Delivered-To: ovs-dev@lists.linuxfoundation.org Received: from smtp4.osuosl.org (smtp4.osuosl.org [IPv6:2605:bc80:3010::137]) by lists.linuxfoundation.org (Postfix) with ESMTP id 3EAEFC0037 for ; Thu, 8 Feb 2024 21:49:39 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by smtp4.osuosl.org (Postfix) with ESMTP id C2F264EC85 for ; Thu, 8 Feb 2024 21:49:36 +0000 (UTC) X-Virus-Scanned: amavisd-new at osuosl.org Received: from smtp4.osuosl.org ([127.0.0.1]) by localhost (smtp4.osuosl.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id LK8t6XGACJR0 for ; Thu, 8 Feb 2024 21:49:36 +0000 (UTC) Received-SPF: Pass (mailfrom) identity=mailfrom; client-ip=2001:4b98:dc4:8::223; helo=relay3-d.mail.gandi.net; envelope-from=numans@ovn.org; receiver= DMARC-Filter: OpenDMARC Filter v1.4.2 smtp4.osuosl.org 696CE4ECCC Authentication-Results: smtp4.osuosl.org; dmarc=none (p=none dis=none) header.from=ovn.org DKIM-Filter: OpenDKIM Filter v2.11.0 smtp4.osuosl.org 696CE4ECCC Received: from relay3-d.mail.gandi.net (relay3-d.mail.gandi.net [IPv6:2001:4b98:dc4:8::223]) by smtp4.osuosl.org (Postfix) with ESMTPS id 696CE4ECCC for ; Thu, 8 Feb 2024 21:49:34 +0000 (UTC) Received: by mail.gandi.net (Postfix) with ESMTPSA id 87D5860002; Thu, 8 Feb 2024 21:49:31 +0000 (UTC) From: numans@ovn.org To: dev@openvswitch.org Date: Thu, 8 Feb 2024 16:49:26 -0500 Message-ID: <20240208214926.12763-1-numans@ovn.org> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20240208214904.12696-1-numans@ovn.org> References: <20240208214904.12696-1-numans@ovn.org> MIME-Version: 1.0 X-GND-Sasl: numans@ovn.org Subject: [ovs-dev] [PATCH ovn v1 2/4] northd: Don't add ARP request responder flows for NAT multiple times. X-BeenThere: ovs-dev@openvswitch.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: ovs-dev-bounces@openvswitch.org Sender: "dev" From: Numan Siddique If an SNAT external_ip belongs to the router IP, then there is no need to generate ARP request responder flows in build_lswitch_rport_arp_req_flows_for_lbnats() as these flows for router ips are generated by build_lswitch_rport_arp_req_flows(). Otherwise this results in the lflow_table_add_lflow() to be called multiple times for the same match and actions and the ovn_lflow to have multiple dp_refcnts. Fixes: fe1c5df98b6f ("northd: forward arp request to lrp snat on.") Signed-off-by: Numan Siddique Acked-by: Han Zhou Acked-by: Dumitru Ceara --- northd/en-lr-nat.c | 6 ++++++ northd/en-lr-nat.h | 2 ++ northd/northd.c | 28 ++++++++++++++++++++++++---- northd/northd.h | 1 + 4 files changed, 33 insertions(+), 4 deletions(-) diff --git a/northd/en-lr-nat.c b/northd/en-lr-nat.c index 7664ec0ca9..ad11025c69 100644 --- a/northd/en-lr-nat.c +++ b/northd/en-lr-nat.c @@ -288,6 +288,8 @@ lr_nat_record_init(struct lr_nat_record *lrnat_rec, struct ovn_nat *nat_entry = &lrnat_rec->nat_entries[i]; nat_entry->nb = nat; + nat_entry->is_router_ip = false; + if (!extract_ip_addresses(nat->external_ip, &nat_entry->ext_addrs) || !nat_entry_is_valid(nat_entry)) { @@ -302,6 +304,10 @@ lr_nat_record_init(struct lr_nat_record *lrnat_rec, /* If this is a SNAT rule add the IP to the set of unique SNAT IPs. */ if (!strcmp(nat->type, "snat")) { + if (sset_contains(&od->router_ips, nat->external_ip)) { + nat_entry->is_router_ip = true; + } + if (!nat_entry_is_v6(nat_entry)) { snat_ip_add(lrnat_rec, nat_entry->ext_addrs.ipv4_addrs[0].addr_s, diff --git a/northd/en-lr-nat.h b/northd/en-lr-nat.h index 16b166ee05..6d3b2b6d65 100644 --- a/northd/en-lr-nat.h +++ b/northd/en-lr-nat.h @@ -37,6 +37,8 @@ struct ovn_nat { * list of nat entries. Currently * only used for SNAT. */ + bool is_router_ip; /* Indicates if the NAT external_ip is also one of + * router's lrp ip. Initialized only for SNAT. */ }; /* Stores the list of SNAT entries referencing a unique SNAT IP address. diff --git a/northd/northd.c b/northd/northd.c index a5d5e67117..c59aa8d304 100644 --- a/northd/northd.c +++ b/northd/northd.c @@ -431,6 +431,7 @@ ovn_datapath_create(struct hmap *datapaths, const struct uuid *key, hmap_insert(datapaths, &od->key_node, uuid_hash(&od->key)); od->lr_group = NULL; hmap_init(&od->ports); + sset_init(&od->router_ips); return od; } @@ -459,6 +460,7 @@ ovn_datapath_destroy(struct hmap *datapaths, struct ovn_datapath *od) free(od->l3dgw_ports); destroy_mcast_info_for_datapath(od); destroy_ports_for_datapath(od); + sset_destroy(&od->router_ips); free(od); } } @@ -2190,6 +2192,19 @@ join_logical_ports(const struct sbrec_port_binding_table *sbrec_pb_table, op->lrp_networks = lrp_networks; op->od = od; + + for (size_t j = 0; j < op->lrp_networks.n_ipv4_addrs; j++) { + sset_add(&op->od->router_ips, + op->lrp_networks.ipv4_addrs[j].addr_s); + } + for (size_t j = 0; j < op->lrp_networks.n_ipv6_addrs; j++) { + /* Exclude the LLA. */ + if (!in6_is_lla(&op->lrp_networks.ipv6_addrs[j].addr)) { + sset_add(&op->od->router_ips, + op->lrp_networks.ipv6_addrs[j].addr_s); + } + } + hmap_insert(&od->ports, &op->dp_node, hmap_node_hash(&op->key_node)); @@ -8302,22 +8317,27 @@ build_lswitch_rport_arp_req_flows_for_lbnats( struct ovn_nat *nat_entry = CONTAINER_OF(ovs_list_front(&snat_ip->snat_entries), struct ovn_nat, ext_addr_list_node); + if (nat_entry->is_router_ip) { + /* If its a router ip, then there is no need to add the ARP + * request forwarder flows as it will be added by + * build_lswitch_rport_arp_req_flows(). */ + continue; + } + const struct nbrec_nat *nat = nat_entry->nb; /* Check if the ovn port has a network configured on which we could * expect ARP requests/NS for the SNAT external_ip. */ if (nat_entry_is_v6(nat_entry)) { - if (!lr_stateful_rec || - !sset_contains(&lr_stateful_rec->lb_ips->ips_v6, + if (!sset_contains(&lr_stateful_rec->lb_ips->ips_v6, nat->external_ip)) { build_lswitch_rport_arp_req_flow( nat->external_ip, AF_INET6, sw_op, sw_od, 80, lflows, stage_hint, lflow_ref); } } else { - if (!lr_stateful_rec || - !sset_contains(&lr_stateful_rec->lb_ips->ips_v4, + if (!sset_contains(&lr_stateful_rec->lb_ips->ips_v4, nat->external_ip)) { build_lswitch_rport_arp_req_flow( nat->external_ip, AF_INET, sw_op, sw_od, 80, lflows, diff --git a/northd/northd.h b/northd/northd.h index b5c175929e..3f1cd83413 100644 --- a/northd/northd.h +++ b/northd/northd.h @@ -293,6 +293,7 @@ struct ovn_datapath { struct ovn_datapath **ls_peers; size_t n_ls_peers; size_t n_allocated_ls_peers; + struct sset router_ips; /* Router port IPs except the IPv6 LLAs. */ /* Logical switch data. */ struct ovn_port **router_ports; From patchwork Thu Feb 8 21:49:32 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Numan Siddique X-Patchwork-Id: 1896773 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=openvswitch.org (client-ip=140.211.166.136; helo=smtp3.osuosl.org; envelope-from=ovs-dev-bounces@openvswitch.org; receiver=patchwork.ozlabs.org) Received: from smtp3.osuosl.org (smtp3.osuosl.org [140.211.166.136]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4TW9d2151Xz23hb for ; Fri, 9 Feb 2024 08:49:58 +1100 (AEDT) Received: from localhost (localhost [127.0.0.1]) by smtp3.osuosl.org (Postfix) with ESMTP id 2445C7041C; Thu, 8 Feb 2024 21:49:56 +0000 (UTC) X-Virus-Scanned: amavisd-new at osuosl.org Received: from smtp3.osuosl.org ([127.0.0.1]) by localhost (smtp3.osuosl.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 6bq2CwJXL9T3; Thu, 8 Feb 2024 21:49:55 +0000 (UTC) X-Comment: SPF check N/A for local connections - client-ip=140.211.9.56; helo=lists.linuxfoundation.org; envelope-from=ovs-dev-bounces@openvswitch.org; receiver= DKIM-Filter: OpenDKIM Filter v2.11.0 smtp3.osuosl.org 328567036B Received: from lists.linuxfoundation.org (lf-lists.osuosl.org [140.211.9.56]) by smtp3.osuosl.org (Postfix) with ESMTPS id 328567036B; Thu, 8 Feb 2024 21:49:55 +0000 (UTC) Received: from lf-lists.osuosl.org (localhost [127.0.0.1]) by lists.linuxfoundation.org (Postfix) with ESMTP id EB0BDC0072; Thu, 8 Feb 2024 21:49:54 +0000 (UTC) X-Original-To: dev@openvswitch.org Delivered-To: ovs-dev@lists.linuxfoundation.org Received: from smtp1.osuosl.org (smtp1.osuosl.org [140.211.166.138]) by lists.linuxfoundation.org (Postfix) with ESMTP id 11DAEC0072 for ; Thu, 8 Feb 2024 21:49:54 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by smtp1.osuosl.org (Postfix) with ESMTP id 6B89F8492B for ; Thu, 8 Feb 2024 21:49:46 +0000 (UTC) X-Virus-Scanned: amavisd-new at osuosl.org Received: from smtp1.osuosl.org ([127.0.0.1]) by localhost (smtp1.osuosl.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id XErXDgKfBekF for ; Thu, 8 Feb 2024 21:49:42 +0000 (UTC) Received-SPF: Pass (mailfrom) identity=mailfrom; client-ip=2001:4b98:dc4:8::226; helo=relay6-d.mail.gandi.net; envelope-from=numans@ovn.org; receiver= DMARC-Filter: OpenDMARC Filter v1.4.2 smtp1.osuosl.org 53E9984D07 Authentication-Results: smtp1.osuosl.org; dmarc=none (p=none dis=none) header.from=ovn.org DKIM-Filter: OpenDKIM Filter v2.11.0 smtp1.osuosl.org 53E9984D07 Received: from relay6-d.mail.gandi.net (relay6-d.mail.gandi.net [IPv6:2001:4b98:dc4:8::226]) by smtp1.osuosl.org (Postfix) with ESMTPS id 53E9984D07 for ; Thu, 8 Feb 2024 21:49:41 +0000 (UTC) Received: by mail.gandi.net (Postfix) with ESMTPSA id 7A863C0003; Thu, 8 Feb 2024 21:49:38 +0000 (UTC) From: numans@ovn.org To: dev@openvswitch.org Date: Thu, 8 Feb 2024 16:49:32 -0500 Message-ID: <20240208214932.12783-1-numans@ovn.org> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20240208214904.12696-1-numans@ovn.org> References: <20240208214904.12696-1-numans@ovn.org> MIME-Version: 1.0 X-GND-Sasl: numans@ovn.org Subject: [ovs-dev] [PATCH ovn v1 3/4] northd: Fix lflow ref node's reference counting. X-BeenThere: ovs-dev@openvswitch.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: ovs-dev-bounces@openvswitch.org Sender: "dev" From: Numan Siddique When the lflows in an lflow_ref are unlinked by calling lflow_ref_unlink_lflows(lflow_ref), the dp_ref counter for each lflow in the lflow_ref is decremented (by calling dp_refcnt_release()), but it is not incremented later when the same lflow is linked back to the lflow_ref. This patch fixes it. Fixes: a623606052ea ("northd: Refactor lflow management into a separate module.") Signed-off-by: Numan Siddique Acked-by: Han Zhou Acked-by: Dumitru Ceara --- northd/lflow-mgr.c | 13 +++++++++---- 1 file changed, 9 insertions(+), 4 deletions(-) diff --git a/northd/lflow-mgr.c b/northd/lflow-mgr.c index 61729e9039..df62cd6ab4 100644 --- a/northd/lflow-mgr.c +++ b/northd/lflow-mgr.c @@ -690,19 +690,24 @@ lflow_table_add_lflow(struct lflow_table *lflow_table, if (lrn->dpgrp_lflow) { lrn->dpgrp_bitmap = bitmap_clone(dp_bitmap, dp_bitmap_len); lrn->dpgrp_bitmap_len = dp_bitmap_len; + } else { + lrn->dp_index = od->index; + } + ovs_list_insert(&lflow->referenced_by, &lrn->ref_list_node); + hmap_insert(&lflow_ref->lflow_ref_nodes, &lrn->ref_node, hash); + } + if (!lrn->linked) { + if (lrn->dpgrp_lflow) { + ovs_assert(lrn->dpgrp_bitmap_len == dp_bitmap_len); size_t index; BITMAP_FOR_EACH_1 (index, dp_bitmap_len, dp_bitmap) { dp_refcnt_use(&lflow->dp_refcnts_map, index); } } else { - lrn->dp_index = od->index; dp_refcnt_use(&lflow->dp_refcnts_map, lrn->dp_index); } - ovs_list_insert(&lflow->referenced_by, &lrn->ref_list_node); - hmap_insert(&lflow_ref->lflow_ref_nodes, &lrn->ref_node, hash); } - lrn->linked = true; } From patchwork Thu Feb 8 21:49:39 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Numan Siddique X-Patchwork-Id: 1896774 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=openvswitch.org (client-ip=140.211.166.137; helo=smtp4.osuosl.org; envelope-from=ovs-dev-bounces@openvswitch.org; receiver=patchwork.ozlabs.org) Received: from smtp4.osuosl.org (smtp4.osuosl.org [140.211.166.137]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4TW9dJ3WVJz23h4 for ; Fri, 9 Feb 2024 08:50:12 +1100 (AEDT) Received: from localhost (localhost [127.0.0.1]) by smtp4.osuosl.org (Postfix) with ESMTP id 9FDBF4ED5F; Thu, 8 Feb 2024 21:50:10 +0000 (UTC) X-Virus-Scanned: amavisd-new at osuosl.org Received: from smtp4.osuosl.org ([127.0.0.1]) by localhost (smtp4.osuosl.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 0wBhqRwcE3lC; Thu, 8 Feb 2024 21:50:09 +0000 (UTC) X-Comment: SPF check N/A for local connections - client-ip=2605:bc80:3010:104::8cd3:938; helo=lists.linuxfoundation.org; envelope-from=ovs-dev-bounces@openvswitch.org; receiver= DKIM-Filter: OpenDKIM Filter v2.11.0 smtp4.osuosl.org 565A94ED3E Received: from lists.linuxfoundation.org (lf-lists.osuosl.org [IPv6:2605:bc80:3010:104::8cd3:938]) by smtp4.osuosl.org (Postfix) with ESMTPS id 565A94ED3E; Thu, 8 Feb 2024 21:50:09 +0000 (UTC) Received: from lf-lists.osuosl.org (localhost [127.0.0.1]) by lists.linuxfoundation.org (Postfix) with ESMTP id 2D577C007C; Thu, 8 Feb 2024 21:50:09 +0000 (UTC) X-Original-To: dev@openvswitch.org Delivered-To: ovs-dev@lists.linuxfoundation.org Received: from smtp4.osuosl.org (smtp4.osuosl.org [IPv6:2605:bc80:3010::137]) by lists.linuxfoundation.org (Postfix) with ESMTP id 1164DC0072 for ; Thu, 8 Feb 2024 21:50:08 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by smtp4.osuosl.org (Postfix) with ESMTP id 335AD4ED3E for ; Thu, 8 Feb 2024 21:49:50 +0000 (UTC) X-Virus-Scanned: amavisd-new at osuosl.org Received: from smtp4.osuosl.org ([127.0.0.1]) by localhost (smtp4.osuosl.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id EItTqH_iaB9X for ; Thu, 8 Feb 2024 21:49:49 +0000 (UTC) Received-SPF: Pass (mailfrom) identity=mailfrom; client-ip=217.70.183.197; helo=relay5-d.mail.gandi.net; envelope-from=numans@ovn.org; receiver= DMARC-Filter: OpenDMARC Filter v1.4.2 smtp4.osuosl.org DC6CC4ED27 Authentication-Results: smtp4.osuosl.org; dmarc=none (p=none dis=none) header.from=ovn.org DKIM-Filter: OpenDKIM Filter v2.11.0 smtp4.osuosl.org DC6CC4ED27 Received: from relay5-d.mail.gandi.net (relay5-d.mail.gandi.net [217.70.183.197]) by smtp4.osuosl.org (Postfix) with ESMTPS id DC6CC4ED27 for ; Thu, 8 Feb 2024 21:49:48 +0000 (UTC) Received: by mail.gandi.net (Postfix) with ESMTPSA id 86DF71C0003; Thu, 8 Feb 2024 21:49:46 +0000 (UTC) From: numans@ovn.org To: dev@openvswitch.org Date: Thu, 8 Feb 2024 16:49:39 -0500 Message-ID: <20240208214939.12799-1-numans@ovn.org> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20240208214904.12696-1-numans@ovn.org> References: <20240208214904.12696-1-numans@ovn.org> MIME-Version: 1.0 X-GND-Sasl: numans@ovn.org Cc: Ilya Maximets Subject: [ovs-dev] [PATCH ovn v1 4/4] northd: lflow-mgr: Allocate DP reference counters on a second use. X-BeenThere: ovs-dev@openvswitch.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: ovs-dev-bounces@openvswitch.org Sender: "dev" From: Ilya Maximets Currently, whenever a new logical flow is created, northd allocates a reference counter for every datapath in the datapath group for that logical flow. Those reference counters are necessary in order to not remove the datapath from a logical flow during incremental processing if it was created from two different objects for the same datapath and now one of them is removed. However, that creates a serious scalability problem. In a large scale setups with tens of thousands of logical flows applied to large datapath groups we may have hundreds of millions of these reference counters allocated, which is many GB of RSS just for that purpose. For example, in ovn-heater's cluster-density 500 node test, ovn-northd started to consume up to 8.5 GB or RAM. In the same test before the reference counting, ovn-northd consumed only 2.5 GB. All those memory allocation also increased CPU usage. Re-compute time went up from 1.5 seconds to 6 seconds in the same test. In the end we have about 4x increase on both CPU and memory usage. Running under valgrind --tool=massif shows: ------------------------------------------------------------- total(B) useful-heap(B) extra-heap(B) stacks(B) ------------------------------------------------------------- 9,416,438,200 7,401,971,556 2,014,466,644 0 78.61% (7,401 MB) (heap allocation functions) malloc ->45.78% (4,311 MB) xcalloc__ (util.c:124) | ->45.68% (4,301 MB) xzalloc__ (util.c:134) | | ->45.68% (4,301 MB) xzalloc (util.c:168) | | ->40.97% (3,857 MB) dp_refcnt_use (lflow-mgr.c:1348) | | | ->40.89% (3,850 MB) lflow_table_add_lflow (lflow-mgr.c:696) | | | | ->12.27% (1,155 MB) build_lb_rules_pre_stateful (northd.c:7180) | | | | ->12.27% (1,155 MB) build_lb_rules (northd.c:7658) | | | | ->12.27% (1,155,MB) build_gw_lrouter_nat_flows_for_lb (northd.c:11054) ->28.62% (2,694 MB) xmalloc__ (util.c:140) | ->28.62% (2,694 MB) xmalloc (util.c:175) | ->06.71% (631 MB) resize (hmap.c:100) | | ->06.01% (565 MB) hmap_expand_at (hmap.c:175) | | | ->05.24% (492 MB) hmap_insert_at (hmap.h:309) | | | | ->05.24% (492 MB) dp_refcnt_use (lflow-mgr.c:1351) 45% of all the memory is allocated for reference counters themselves and another 7% is taken by hash maps to hold them. Also, there is more than 20% of a total memory allocation overhead (extra-heap) since all these allocated objects are very small (32B). This test allocates 120 M reference counters total. However, the vast majority of all the reference counters always has a value of 1, i.e. these datapaths are not used more than once. Defer allocation of reference counters until the datapath is used for the same logical flow for the second time. We can do that by checking the current datapath group bitmap. With this change, the amount of allocated reference counters goes down from 120 M to just 12 M. Memory consumption reduced from 8.5 GB to 2.67 GB and the northd recompute time reduced from 6 to 2.1 seconds. It is still a little higher than resource usage before introduction of incremental processing for logical flows, but it is fairly manageable. Also, the resource usage and memory consumption may be further improved by reducing the number of cases where northd attempts to create the logical flows for the same datapaths multiple times. Note: the cluster-density test in ovn-heater creates new port groups on every iteration and ovn-northd doesn't handle this incrementally, so it always re-computes. That's why there is no benefit from northd I-P for CPU usage in this scenario. Fixes: a623606052ea ("northd: Refactor lflow management into a separate module.") Signed-off-by: Ilya Maximets Acked-by: Han Zhou Acked-by: Dumitru Ceara --- northd/lflow-mgr.c | 39 ++++++++++++++++++++------------------- 1 file changed, 20 insertions(+), 19 deletions(-) diff --git a/northd/lflow-mgr.c b/northd/lflow-mgr.c index df62cd6ab4..f70896cce7 100644 --- a/northd/lflow-mgr.c +++ b/northd/lflow-mgr.c @@ -47,8 +47,7 @@ static void ovn_lflow_destroy(struct lflow_table *lflow_table, static char *ovn_lflow_hint(const struct ovsdb_idl_row *row); static struct ovn_lflow *do_ovn_lflow_add( - struct lflow_table *, const struct ovn_datapath *, - const unsigned long *dp_bitmap, size_t dp_bitmap_len, uint32_t hash, + struct lflow_table *, size_t dp_bitmap_len, uint32_t hash, enum ovn_stage stage, uint16_t priority, const char *match, const char *actions, const char *io_port, const char *ctrl_meter, @@ -674,9 +673,9 @@ lflow_table_add_lflow(struct lflow_table *lflow_table, hash_lock = lflow_hash_lock(&lflow_table->entries, hash); struct ovn_lflow *lflow = - do_ovn_lflow_add(lflow_table, od, dp_bitmap, - dp_bitmap_len, hash, stage, - priority, match, actions, + do_ovn_lflow_add(lflow_table, + od ? ods_size(od->datapaths) : dp_bitmap_len, + hash, stage, priority, match, actions, io_port, ctrl_meter, stage_hint, where); if (lflow_ref) { @@ -702,17 +701,24 @@ lflow_table_add_lflow(struct lflow_table *lflow_table, ovs_assert(lrn->dpgrp_bitmap_len == dp_bitmap_len); size_t index; BITMAP_FOR_EACH_1 (index, dp_bitmap_len, dp_bitmap) { - dp_refcnt_use(&lflow->dp_refcnts_map, index); + /* Allocate a reference counter only if already used. */ + if (bitmap_is_set(lflow->dpg_bitmap, index)) { + dp_refcnt_use(&lflow->dp_refcnts_map, index); + } } } else { - dp_refcnt_use(&lflow->dp_refcnts_map, lrn->dp_index); + /* Allocate a reference counter only if already used. */ + if (bitmap_is_set(lflow->dpg_bitmap, lrn->dp_index)) { + dp_refcnt_use(&lflow->dp_refcnts_map, lrn->dp_index); + } } } lrn->linked = true; } - lflow_hash_unlock(hash_lock); + ovn_dp_group_add_with_reference(lflow, od, dp_bitmap, dp_bitmap_len); + lflow_hash_unlock(hash_lock); } void @@ -946,9 +952,7 @@ ovn_lflow_destroy(struct lflow_table *lflow_table, struct ovn_lflow *lflow) } static struct ovn_lflow * -do_ovn_lflow_add(struct lflow_table *lflow_table, - const struct ovn_datapath *od, - const unsigned long *dp_bitmap, size_t dp_bitmap_len, +do_ovn_lflow_add(struct lflow_table *lflow_table, size_t dp_bitmap_len, uint32_t hash, enum ovn_stage stage, uint16_t priority, const char *match, const char *actions, const char *io_port, const char *ctrl_meter, @@ -959,14 +963,11 @@ do_ovn_lflow_add(struct lflow_table *lflow_table, struct ovn_lflow *old_lflow; struct ovn_lflow *lflow; - size_t bitmap_len = od ? ods_size(od->datapaths) : dp_bitmap_len; - ovs_assert(bitmap_len); + ovs_assert(dp_bitmap_len); old_lflow = ovn_lflow_find(&lflow_table->entries, stage, priority, match, actions, ctrl_meter, hash); if (old_lflow) { - ovn_dp_group_add_with_reference(old_lflow, od, dp_bitmap, - bitmap_len); return old_lflow; } @@ -974,14 +975,12 @@ do_ovn_lflow_add(struct lflow_table *lflow_table, /* While adding new logical flows we're not setting single datapath, but * collecting a group. 'od' will be updated later for all flows with only * one datapath in a group, so it could be hashed correctly. */ - ovn_lflow_init(lflow, NULL, bitmap_len, stage, priority, + ovn_lflow_init(lflow, NULL, dp_bitmap_len, stage, priority, xstrdup(match), xstrdup(actions), io_port ? xstrdup(io_port) : NULL, nullable_xstrdup(ctrl_meter), ovn_lflow_hint(stage_hint), where); - ovn_dp_group_add_with_reference(lflow, od, dp_bitmap, bitmap_len); - if (parallelization_state != STATE_USE_PARALLELIZATION) { hmap_insert(&lflow_table->entries, &lflow->hmap_node, hash); } else { @@ -1350,8 +1349,10 @@ dp_refcnt_use(struct hmap *dp_refcnts_map, size_t dp_index) struct dp_refcnt *dp_refcnt = dp_refcnt_find(dp_refcnts_map, dp_index); if (!dp_refcnt) { - dp_refcnt = xzalloc(sizeof *dp_refcnt); + dp_refcnt = xmalloc(sizeof *dp_refcnt); dp_refcnt->dp_index = dp_index; + /* Allocation is happening on the second (!) use. */ + dp_refcnt->refcnt = 1; hmap_insert(dp_refcnts_map, &dp_refcnt->key_node, dp_index); }