From patchwork Wed Jun 1 16:38:35 2022
X-Patchwork-Submitter: Kevin Traynor
X-Patchwork-Id: 1638039
From: Kevin Traynor
To: dev@openvswitch.org
Date: Wed, 1 Jun 2022 17:38:35 +0100
Message-Id: <20220601163837.206937-2-ktraynor@redhat.com>
In-Reply-To: <20220601163837.206937-1-ktraynor@redhat.com>
References: <20220601163837.206937-1-ktraynor@redhat.com>
Cc: david.marchand@redhat.com
Subject: [ovs-dev] [PATCH v4 1/3] netdev-dpdk: Add shared mempool config.

Mempools may currently be shared between DPDK ports based on port MTU and
NUMA. With a hint from the user we can increase the sharing across MTUs
and hence reduce memory consumption in many cases.

For example, a port with MTU 9000 uses a mempool with an mbuf size based
on 9000 MTU. A port with MTU 1500 uses a different mempool with an mbuf
size based on 1500 MTU.

In this case, assuming the same NUMA, both these ports could share the
9000 MTU mempool.

The user must give a hint because the order of creation of ports and the
setting of MTUs may vary, and we need to ensure that upgrades from older
OVS versions do not require more memory.

This scheme can also prevent multiple mempools being created for cases
where a port is added picking up a default MTU and an appropriate mempool,
but later has its MTU changed to a different value requiring a different
mempool.

Example usage:

 $ ovs-vsctl --no-wait set Open_vSwitch . \
   other_config:shared-mempool-config=9000,1500:1,6000:1

Port added on NUMA 0:
* MTU 1500, use mempool based on 9000 MTU
* MTU 5000, use mempool based on 9000 MTU
* MTU 9000, use mempool based on 9000 MTU
* MTU 9300, use mempool based on 9300 MTU (existing behaviour)

Port added on NUMA 1:
* MTU 1500, use mempool based on 1500 MTU
* MTU 5000, use mempool based on 6000 MTU
* MTU 9000, use mempool based on 9000 MTU
* MTU 9300, use mempool based on 9300 MTU (existing behaviour)

Default behaviour is unchanged and mempools are still only created when
needed.

Signed-off-by: Kevin Traynor
Reviewed-by: David Marchand
---
 Documentation/topics/dpdk/memory.rst |  44 +++++++++++
 lib/dpdk.c                           |   2 +-
 lib/netdev-dpdk.c                    | 109 ++++++++++++++++++++++++++-
 lib/netdev-dpdk.h                    |   5 +-
 vswitchd/vswitch.xml                 |  37 +++++++++
 5 files changed, 192 insertions(+), 5 deletions(-)

diff --git a/Documentation/topics/dpdk/memory.rst b/Documentation/topics/dpdk/memory.rst
index 8b7758e6e..9714d79d4 100644
--- a/Documentation/topics/dpdk/memory.rst
+++ b/Documentation/topics/dpdk/memory.rst
@@ -214,2 +214,46 @@ Example 3: (2 rxq, 2 PMD, 9000 MTU)
  Mbuf size = 10176 Bytes
  Memory required = 26656 * 10176 = 271 MB
+
+Shared Mempool Configuration
+----------------------------
+
+In order to increase sharing of mempools, a user can configure the MTUs which
+mempools are based on by using ``shared-mempool-config``.
+
+An MTU configured by the user is adjusted to an mbuf size used for mempool
+creation and stored. If a port is subsequently added that has an MTU which can
+be accommodated by this mbuf size, it will be used for mempool creation/reuse.
+
+This can increase sharing by consolidating mempools for ports with different
+MTUs which would otherwise use separate mempools. It can also help to remove
+the need for mempools being created after a port is added but before its MTU
+is changed to a different value.
+
+For example, on a 2 NUMA system::
+
+    $ ovs-vsctl --no-wait set Open_vSwitch . \
+        other_config:shared-mempool-config=9000,1500:1,6000:1
+
+
+In this case, OVS stores the mbuf sizes based on the following MTUs.
+
+* NUMA 0: 9000
+* NUMA 1: 1500, 6000, 9000
+
+Ports added will use mempools with the mbuf sizes based on the above MTUs where
+possible. If there is more than one suitable, the one closest to the MTU will
+be selected.
+
+Port added on NUMA 0:
+
+* MTU 1500, use mempool based on 9000 MTU
+* MTU 6000, use mempool based on 9000 MTU
+* MTU 9000, use mempool based on 9000 MTU
+* MTU 9300, use mempool based on 9300 MTU (existing behaviour)
+
+Port added on NUMA 1:
+
+* MTU 1500, use mempool based on 1500 MTU
+* MTU 6000, use mempool based on 6000 MTU
+* MTU 9000, use mempool based on 9000 MTU
+* MTU 9300, use mempool based on 9300 MTU (existing behaviour)
diff --git a/lib/dpdk.c b/lib/dpdk.c
index 6886fbd9d..d909974f9 100644
--- a/lib/dpdk.c
+++ b/lib/dpdk.c
@@ -519,5 +519,5 @@ dpdk_init__(const struct smap *ovs_other_config)
 
     /* Finally, register the dpdk classes */
-    netdev_dpdk_register();
+    netdev_dpdk_register(ovs_other_config);
     netdev_register_flow_api_provider(&netdev_offload_dpdk);
     return true;
diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index f9535bfb4..a4b62ec6d 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -54,4 +54,5 @@
 #include "openvswitch/list.h"
 #include "openvswitch/match.h"
+#include "openvswitch/ofp-parse.h"
 #include "openvswitch/ofp-print.h"
 #include "openvswitch/shash.h"
@@ -371,5 +372,13 @@ struct dpdk_mp {
      int refcount;
      struct ovs_list list_node OVS_GUARDED_BY(dpdk_mp_mutex);
- };
+};
+
+struct user_mempool_config {
+    int adj_mtu;
+    int socket_id;
+};
+
+static struct user_mempool_config *user_mempools = NULL;
+static int n_user_mempools;
 
 /* There should be one 'struct dpdk_tx_queue' created for
@@ -573,4 +582,42 @@ dpdk_buf_size(int mtu)
 }
 
+static int
+dpdk_get_user_adjusted_mtu(int port_adj_mtu, int port_mtu, int port_socket_id)
+{
+    int best_adj_user_mtu = INT_MAX;
+
+    for (unsigned i = 0; i < n_user_mempools; i++) {
+        int user_adj_mtu, user_socket_id;
+
+        user_adj_mtu = user_mempools[i].adj_mtu;
+        user_socket_id = user_mempools[i].socket_id;
+        if (port_adj_mtu > user_adj_mtu
+            || (user_socket_id != INT_MAX
+                && user_socket_id != port_socket_id)) {
+            continue;
+        }
+        if (user_adj_mtu < best_adj_user_mtu) {
+            /* This is the lowest valid user MTU. */
+            best_adj_user_mtu = user_adj_mtu;
+            if (best_adj_user_mtu == port_adj_mtu) {
+                /* Found an exact fit, no need to keep searching. */
+                break;
+            }
+        }
+    }
+    if (best_adj_user_mtu == INT_MAX) {
+        VLOG_DBG("No user configured shared mempool mbuf sizes found "
+                 "suitable for port with MTU %d, NUMA %d.", port_mtu,
+                 port_socket_id);
+        best_adj_user_mtu = port_adj_mtu;
+    } else {
+        VLOG_DBG("Found user configured shared mempool with mbufs "
+                 "of size %d, suitable for port with MTU %d, NUMA %d.",
+                 MTU_TO_FRAME_LEN(best_adj_user_mtu), port_mtu,
+                 port_socket_id);
+    }
+    return best_adj_user_mtu;
+}
+
 /* Allocates an area of 'sz' bytes from DPDK.  The memory is zero'ed.
  *
@@ -796,4 +843,8 @@ dpdk_mp_get(struct netdev_dpdk *dev, int mtu, bool per_port_mp)
      * to see if reuse is possible. */
     if (!per_port_mp) {
+        /* If user has provided defined mempools, check if one is suitable
+         * and get new buffer size. */
+        mtu = dpdk_get_user_adjusted_mtu(mtu, dev->requested_mtu,
+                                         dev->requested_socket_id);
         LIST_FOR_EACH (dmp, list_node, &dpdk_mp_list) {
             if (dmp->socket_id == dev->requested_socket_id
@@ -5335,4 +5386,54 @@ netdev_dpdk_rte_flow_tunnel_item_release(struct netdev *netdev,
 #endif /* ALLOW_EXPERIMENTAL_API */
 
+static void
+parse_user_mempools_list(const char *mtus)
+{
+    char *list, *copy, *key, *value;
+    int error = 0;
+
+    if (!mtus) {
+        return;
+    }
+
+    n_user_mempools = 0;
+    list = copy = xstrdup(mtus);
+
+    while (ofputil_parse_key_value(&list, &key, &value)) {
+        int socket_id, mtu, adj_mtu;
+
+        if (!str_to_int(key, 0, &mtu) || mtu < 0) {
+            error = EINVAL;
+            VLOG_WARN("Invalid user configured shared mempool MTU.");
+            break;
+        }
+
+        if (!str_to_int(value, 0, &socket_id)) {
+            /* No socket specified. It will apply for all numas. */
+            socket_id = INT_MAX;
+        } else if (socket_id < 0) {
+            error = EINVAL;
+            VLOG_WARN("Invalid user configured shared mempool NUMA.");
+            break;
+        }
+
+        user_mempools = xrealloc(user_mempools, (n_user_mempools + 1) *
+                                 sizeof(struct user_mempool_config));
+        adj_mtu = FRAME_LEN_TO_MTU(dpdk_buf_size(mtu));
+        user_mempools[n_user_mempools].adj_mtu = adj_mtu;
+        user_mempools[n_user_mempools].socket_id = socket_id;
+        n_user_mempools++;
+        VLOG_INFO("User configured shared mempool set for: MTU %d, NUMA %s.",
+                  mtu, socket_id == INT_MAX ? "ALL" : value);
+    }
+
+    if (error) {
+        VLOG_WARN("User configured shared mempools will not be used.");
+        n_user_mempools = 0;
+        free(user_mempools);
+        user_mempools = NULL;
+    }
+    free(copy);
+}
+
 #define NETDEV_DPDK_CLASS_COMMON \
     .is_pmd = true, \
@@ -5418,6 +5519,10 @@ static const struct netdev_class dpdk_vhost_client_class = {
 
 void
-netdev_dpdk_register(void)
+netdev_dpdk_register(const struct smap *ovs_other_config)
 {
+    const char *mempoolcfg = smap_get(ovs_other_config,
+                                      "shared-mempool-config");
+
+    parse_user_mempools_list(mempoolcfg);
     netdev_register_provider(&dpdk_class);
     netdev_register_provider(&dpdk_vhost_class);
diff --git a/lib/netdev-dpdk.h b/lib/netdev-dpdk.h
index 699be3fb4..7d2f64af2 100644
--- a/lib/netdev-dpdk.h
+++ b/lib/netdev-dpdk.h
@@ -21,4 +21,5 @@
 
 #include "openvswitch/compiler.h"
+#include "smap.h"
 
 struct dp_packet;
@@ -29,5 +30,5 @@ struct netdev;
 #include
 
-void netdev_dpdk_register(void);
+void netdev_dpdk_register(const struct smap *);
 void free_dpdk_buf(struct dp_packet *);
 
@@ -151,5 +152,5 @@ netdev_dpdk_rte_flow_tunnel_item_release(
 
 static inline void
-netdev_dpdk_register(void)
+netdev_dpdk_register(const struct smap *ovs_other_config OVS_UNUSED)
 {
     /* Nothing */
diff --git a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml
index cc1dd77ec..98486c009 100644
--- a/vswitchd/vswitch.xml
+++ b/vswitchd/vswitch.xml
@@ -491,4 +491,41 @@
+
+ Specifies dpdk shared mempool config.
+
+ Value should be set in the following form:
+
+   other_config:shared-mempool-config=<user-shared-mempool-mtu-list>
+
+ where
+
+   • <user-shared-mempool-mtu-list> ::= NULL | <non-empty-list>
+   • <non-empty-list> ::= <user-mtus> | <user-mtus> , <non-empty-list>
+   • <user-mtus> ::= <mtu-all-socket> | <mtu-socket-pair>
+   • <mtu-all-socket> ::= <mtu>
+   • <mtu-socket-pair> ::= <mtu> : <socket-id>
+
+ Changing this value requires restarting the daemon if dpdk-init has
+ already been set to true.
+
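
Reviewer note: the selection rule this patch introduces (choose the smallest user-configured MTU that fits the port and matches its NUMA node, otherwise fall back to the port's own MTU) can be sketched in isolation. The following is a minimal standalone sketch, not code from the patch. The names `sketch_mempool_cfg`, `sketch_cfg` and `sketch_pick_mtu` are hypothetical, and it compares raw MTUs, whereas the patch compares MTUs adjusted through `dpdk_buf_size()`.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical stand-in for the patch's 'struct user_mempool_config'.
 * This sketch stores raw MTUs; the patch stores adjusted MTUs. */
struct sketch_mempool_cfg {
    int mtu;
    int socket_id;      /* INT_MAX means "applies to all NUMA nodes". */
};

/* Mirrors shared-mempool-config=9000,1500:1,6000:1 from the example. */
static const struct sketch_mempool_cfg sketch_cfg[] = {
    { 9000, INT_MAX },
    { 1500, 1 },
    { 6000, 1 },
};

/* Returns the smallest configured MTU that can accommodate 'port_mtu' on
 * 'port_socket_id', or 'port_mtu' itself when no entry is suitable. */
static int
sketch_pick_mtu(const struct sketch_mempool_cfg *cfg, size_t n,
                int port_mtu, int port_socket_id)
{
    int best = INT_MAX;

    for (size_t i = 0; i < n; i++) {
        /* Skip entries that are too small or bound to another NUMA node. */
        if (port_mtu > cfg[i].mtu
            || (cfg[i].socket_id != INT_MAX
                && cfg[i].socket_id != port_socket_id)) {
            continue;
        }
        if (cfg[i].mtu < best) {
            best = cfg[i].mtu;
        }
    }
    return best == INT_MAX ? port_mtu : best;
}
```

With this table, `sketch_pick_mtu(sketch_cfg, 3, 5000, 0)` selects 9000 and `sketch_pick_mtu(sketch_cfg, 3, 5000, 1)` selects 6000, which matches the NUMA 0 and NUMA 1 lists in the commit message. A port with MTU 9300 falls back to 9300 on either node.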