From patchwork Mon Dec 2 16:03:30 2019
X-Patchwork-Submitter: David Marchand
X-Patchwork-Id: 1203215
From: David Marchand
To: dev@openvswitch.org
Date: Mon, 2 Dec 2019 17:03:30 +0100
Message-Id: <20191202160330.14413-1-david.marchand@redhat.com>
Cc: i.maximets@samsung.com
Subject: [ovs-dev] [PATCH] dpdk: Support running PMD threads on cores > RTE_MAX_LCORE.

Most DPDK components assume that rte_lcore_id() returns a valid lcore_id
in the [0..RTE_MAX_LCORE - 1] range (with the exception of the
LCORE_ID_ANY special value).

OVS does not currently check which value is set in
RTE_PER_LCORE(_lcore_id), which exposes us to potential crashes on the
DPDK side.

Introduce an lcore allocator in OVS for PMD threads and map them to
unused lcores from DPDK à la --lcores.

The physical cores on which the PMD threads run remain important
information when debugging, so keep them in the PMD thread names and
log the lcore-to-core mapping when starting the threads.

Synchronize DPDK internals on NUMA and cpuset for the PMD threads by
registering them via the rte_thread_set_affinity() helper.

Signed-off-by: David Marchand
Acked-by: Flavio Leitner
---
 lib/dpdk-stub.c   |  8 +++++-
 lib/dpdk.c        | 69 +++++++++++++++++++++++++++++++++++++++++++----
 lib/dpdk.h        |  3 ++-
 lib/dpif-netdev.c |  3 ++-
 4 files changed, 75 insertions(+), 8 deletions(-)

diff --git a/lib/dpdk-stub.c b/lib/dpdk-stub.c
index c332c217c..90473bc8e 100644
--- a/lib/dpdk-stub.c
+++ b/lib/dpdk-stub.c
@@ -39,7 +39,13 @@ dpdk_init(const struct smap *ovs_other_config)
 }
 
 void
-dpdk_set_lcore_id(unsigned cpu OVS_UNUSED)
+dpdk_init_thread_context(unsigned cpu OVS_UNUSED)
+{
+    /* Nothing */
+}
+
+void
+dpdk_uninit_thread_context(void)
 {
     /* Nothing */
 }
diff --git a/lib/dpdk.c b/lib/dpdk.c
index 21dd47e80..771baa413 100644
--- a/lib/dpdk.c
+++ b/lib/dpdk.c
@@ -33,6 +33,7 @@
 
 #include "dirs.h"
 #include "fatal-signal.h"
+#include "id-pool.h"
 #include "netdev-dpdk.h"
 #include "netdev-offload-provider.h"
 #include "openvswitch/dynamic-string.h"
@@ -55,6 +56,9 @@ static bool dpdk_initialized = false; /* Indicates successful initialization
                                        * of DPDK. */
 static bool per_port_memory = false; /* Status of per port memory support */
 
+static struct id_pool *lcore_id_pool;
+static struct ovs_mutex lcore_id_pool_mutex = OVS_MUTEX_INITIALIZER;
+
 static int
 process_vhost_flags(char *flag, const char *default_val, int size,
                     const struct smap *ovs_other_config,
@@ -346,7 +350,8 @@ dpdk_init__(const struct smap *ovs_other_config)
         }
     }
 
-    if (args_contains(&args, "-c") || args_contains(&args, "-l")) {
+    if (args_contains(&args, "-c") || args_contains(&args, "-l") ||
+        args_contains(&args, "--lcores")) {
         auto_determine = false;
     }
 
@@ -372,8 +377,8 @@ dpdk_init__(const struct smap *ovs_other_config)
              * thread affintity - default to core #0 */
             VLOG_ERR("Thread getaffinity failed. Using core #0");
         }
-        svec_add(&args, "-l");
-        svec_add_nocopy(&args, xasprintf("%d", cpu));
+        svec_add(&args, "--lcores");
+        svec_add_nocopy(&args, xasprintf("0@%d", cpu));
     }
 
     svec_terminate(&args);
@@ -429,6 +434,23 @@ dpdk_init__(const struct smap *ovs_other_config)
         }
     }
 
+    ovs_mutex_lock(&lcore_id_pool_mutex);
+    lcore_id_pool = id_pool_create(0, RTE_MAX_LCORE);
+    /* Empty the whole pool... */
+    for (uint32_t lcore = 0; lcore < RTE_MAX_LCORE; lcore++) {
+        uint32_t lcore_id;
+
+        id_pool_alloc_id(lcore_id_pool, &lcore_id);
+    }
+    /* ...and release the unused spots. */
+    for (uint32_t lcore = 0; lcore < RTE_MAX_LCORE; lcore++) {
+        if (rte_eal_lcore_role(lcore) != ROLE_OFF) {
+            continue;
+        }
+        id_pool_free_id(lcore_id_pool, lcore);
+    }
+    ovs_mutex_unlock(&lcore_id_pool_mutex);
+
     /* We are called from the main thread here */
     RTE_PER_LCORE(_lcore_id) = NON_PMD_CORE_ID;
 
@@ -522,11 +544,48 @@ dpdk_available(void)
 }
 
 void
-dpdk_set_lcore_id(unsigned cpu)
+dpdk_init_thread_context(unsigned cpu)
 {
+    cpu_set_t cpuset;
+    unsigned lcore;
+    int err;
+
     /* NON_PMD_CORE_ID is reserved for use by non pmd threads. */
     ovs_assert(cpu != NON_PMD_CORE_ID);
-    RTE_PER_LCORE(_lcore_id) = cpu;
+
+    ovs_mutex_lock(&lcore_id_pool_mutex);
+    if (lcore_id_pool == NULL || !id_pool_alloc_id(lcore_id_pool, &lcore)) {
+        lcore = NON_PMD_CORE_ID;
+    }
+    ovs_mutex_unlock(&lcore_id_pool_mutex);
+
+    RTE_PER_LCORE(_lcore_id) = lcore;
+
+    /* DPDK is not initialised, nothing more to do. */
+    if (lcore == NON_PMD_CORE_ID) {
+        return;
+    }
+
+    CPU_ZERO(&cpuset);
+    err = pthread_getaffinity_np(pthread_self(), sizeof cpuset, &cpuset);
+    if (err) {
+        VLOG_ABORT("Thread getaffinity error: %s", ovs_strerror(err));
+    }
+
+    rte_thread_set_affinity(&cpuset);
+    VLOG_INFO("Initialised lcore %u for core %u", lcore, cpu);
+}
+
+void
+dpdk_uninit_thread_context(void)
+{
+    if (RTE_PER_LCORE(_lcore_id) == NON_PMD_CORE_ID) {
+        return;
+    }
+
+    ovs_mutex_lock(&lcore_id_pool_mutex);
+    id_pool_free_id(lcore_id_pool, RTE_PER_LCORE(_lcore_id));
+    ovs_mutex_unlock(&lcore_id_pool_mutex);
 }
 
 void
diff --git a/lib/dpdk.h b/lib/dpdk.h
index 736a64279..404ac1a4b 100644
--- a/lib/dpdk.h
+++ b/lib/dpdk.h
@@ -36,7 +36,8 @@ struct smap;
 struct ovsrec_open_vswitch;
 
 void dpdk_init(const struct smap *ovs_other_config);
-void dpdk_set_lcore_id(unsigned cpu);
+void dpdk_init_thread_context(unsigned cpu);
+void dpdk_uninit_thread_context(void);
 const char *dpdk_get_vhost_sock_dir(void);
 bool dpdk_vhost_iommu_enabled(void);
 bool dpdk_vhost_postcopy_enabled(void);
diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index 5142bad1d..c40031a78 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -5472,7 +5472,7 @@ pmd_thread_main(void *f_)
     /* Stores the pmd thread's 'pmd' to 'per_pmd_key'. */
     ovsthread_setspecific(pmd->dp->per_pmd_key, pmd);
     ovs_numa_thread_setaffinity_core(pmd->core_id);
-    dpdk_set_lcore_id(pmd->core_id);
+    dpdk_init_thread_context(pmd->core_id);
     poll_cnt = pmd_load_queues_and_ports(pmd, &poll_list);
     dfc_cache_init(&pmd->flow_cache);
     pmd_alloc_static_tx_qid(pmd);
@@ -5592,6 +5592,7 @@ reload:
     dfc_cache_uninit(&pmd->flow_cache);
     free(poll_list);
     pmd_free_cached_ports(pmd);
+    dpdk_uninit_thread_context();
     return NULL;
 }
 
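
Note for reviewers: the pool handling in this patch (take every id, hand back only the ones EAL reports as ROLE_OFF, then allocate and free per PMD thread) can be tried out standalone with the rough sketch below. It is not the OVS code. The fixed-size array stands in for struct id_pool, the eal_owned[] table stands in for rte_eal_lcore_role(), and every SKETCH_* and lcore_pool_* name is made up for illustration. Locking is left out, and the sketch always returns the lowest free slot, which is a simplification and not something the real id_pool is assumed to guarantee.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins, not the OVS or DPDK definitions. */
#define SKETCH_MAX_LCORE 8u          /* plays the part of RTE_MAX_LCORE */
#define SKETCH_NO_LCORE UINT32_MAX   /* plays the part of NON_PMD_CORE_ID */

struct lcore_pool {
    bool in_use[SKETCH_MAX_LCORE];
};

/* Mark every slot as taken, then release only the slots EAL does not use.
 * 'eal_owned[i]' models "rte_eal_lcore_role(i) != ROLE_OFF". */
static void
lcore_pool_init(struct lcore_pool *pool,
                const bool eal_owned[SKETCH_MAX_LCORE])
{
    for (uint32_t i = 0; i < SKETCH_MAX_LCORE; i++) {
        pool->in_use[i] = true;
    }
    for (uint32_t i = 0; i < SKETCH_MAX_LCORE; i++) {
        if (!eal_owned[i]) {
            pool->in_use[i] = false;
        }
    }
}

/* Return the lowest free lcore, or SKETCH_NO_LCORE once the pool is
 * exhausted (the patch falls back to NON_PMD_CORE_ID in that case). */
static uint32_t
lcore_pool_alloc(struct lcore_pool *pool)
{
    for (uint32_t i = 0; i < SKETCH_MAX_LCORE; i++) {
        if (!pool->in_use[i]) {
            pool->in_use[i] = true;
            return i;
        }
    }
    return SKETCH_NO_LCORE;
}

/* Give an lcore back when its thread exits; freeing an lcore that is out
 * of range or not currently allocated is treated as a programming error. */
static void
lcore_pool_free(struct lcore_pool *pool, uint32_t lcore)
{
    assert(lcore < SKETCH_MAX_LCORE && pool->in_use[lcore]);
    pool->in_use[lcore] = false;
}
```

With slots 0 and 3 marked as EAL-owned, successive lcore_pool_alloc() calls in this sketch hand out 1, 2, 4 and so on, and a slot returned through lcore_pool_free() becomes available again. This mirrors the way a PMD thread releases its lcore in dpdk_uninit_thread_context().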