From patchwork Wed Nov 18 16:14:58 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Van Haaren, Harry" X-Patchwork-Id: 1402388 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=openvswitch.org (client-ip=140.211.166.136; helo=silver.osuosl.org; envelope-from=ovs-dev-bounces@openvswitch.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=intel.com Received: from silver.osuosl.org (smtp3.osuosl.org [140.211.166.136]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4CbnyN47vZz9sPB for ; Thu, 19 Nov 2020 03:17:16 +1100 (AEDT) Received: from localhost (localhost [127.0.0.1]) by silver.osuosl.org (Postfix) with ESMTP id DF3B1228E3; Wed, 18 Nov 2020 16:17:14 +0000 (UTC) X-Virus-Scanned: amavisd-new at osuosl.org Received: from silver.osuosl.org ([127.0.0.1]) by localhost (.osuosl.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id Lq-pysTjXpRY; Wed, 18 Nov 2020 16:16:53 +0000 (UTC) Received: from lists.linuxfoundation.org (lf-lists.osuosl.org [140.211.9.56]) by silver.osuosl.org (Postfix) with ESMTP id 38BC2228EA; Wed, 18 Nov 2020 16:15:33 +0000 (UTC) Received: from lf-lists.osuosl.org (localhost [127.0.0.1]) by lists.linuxfoundation.org (Postfix) with ESMTP id 2ADB7C1833; Wed, 18 Nov 2020 16:15:33 +0000 (UTC) X-Original-To: ovs-dev@openvswitch.org Delivered-To: ovs-dev@lists.linuxfoundation.org Received: from hemlock.osuosl.org (smtp2.osuosl.org [140.211.166.133]) by lists.linuxfoundation.org (Postfix) with ESMTP id 8F5B2C0800 for ; Wed, 18 Nov 2020 16:15:30 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by hemlock.osuosl.org (Postfix) with ESMTP id 7DBBA8722C for ; Wed, 18 Nov 2020 16:15:30 +0000 (UTC) X-Virus-Scanned: amavisd-new at osuosl.org Received: from hemlock.osuosl.org ([127.0.0.1]) by localhost (.osuosl.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id Ye4GBcvDWRPO for ; Wed, 18 Nov 2020 16:15:28 +0000 (UTC) X-Greylist: domain auto-whitelisted by SQLgrey-1.7.6 Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by hemlock.osuosl.org (Postfix) with ESMTPS id 0058187251 for ; Wed, 18 Nov 2020 16:15:27 +0000 (UTC) IronPort-SDR: ci1lmqz2oKl9KHr4gkKCr8/mmOrxTWhFvPkD9e9v0zkIkwaAusPcX+2z0yYVLP58cukMcdvXHg eSpMkJBrYa0Q== X-IronPort-AV: E=McAfee;i="6000,8403,9808"; a="255850817" X-IronPort-AV: E=Sophos;i="5.77,486,1596524400"; d="scan'208";a="255850817" X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga007.jf.intel.com ([10.7.209.58]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 Nov 2020 08:15:27 -0800 IronPort-SDR: yhVqbtPbkeqeahdtBfxJidV5wf/5ab2D3ZvgiNb67tNXctyqktLXdTtTJSx2eOtFjpF7jvCdUL 2BOllnw7H3JA== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.77,486,1596524400"; d="scan'208";a="368533533" Received: from silpixa00400633.ir.intel.com ([10.237.213.210]) by orsmga007.jf.intel.com with ESMTP; 18 Nov 2020 08:15:25 -0800 From: Harry van Haaren To: ovs-dev@openvswitch.org Date: Wed, 18 Nov 2020 16:14:58 +0000 Message-Id: <20201118161501.1710801-8-harry.van.haaren@intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20201118161501.1710801-1-harry.van.haaren@intel.com> References: <20201030190647.1839197-1-harry.van.haaren@intel.com> <20201118161501.1710801-1-harry.van.haaren@intel.com> MIME-Version: 1.0 Cc: i.maximets@ovn.org Subject: [ovs-dev] [PATCH v3 07/10] dpif-netdev: Add command to switch dpif implementation X-BeenThere: ovs-dev@openvswitch.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: ovs-dev-bounces@openvswitch.org Sender: "dev" This commit adds a new command to allow the user to switch the active DPIF implementation at runtime. A probe function is executed before switching the DPIF implementation, to ensure the CPU is capable of running the ISA required. For example, the below code will switch to the AVX512 enabled DPIF assuming that the runtime CPU is capable of running AVX512 instructions: $ ovs-appctl dpif-netdev/dpif-set dpif_avx512 A new configuration flag is added to allow selection of the default DPIF. This is useful for running the unit-tests against the available DPIF implementations, without modifying each unit test. The design of the testing & validation for ISA optimized DPIF implementations is based around the work already upstream for DPCLS. Note however that a DPCLS lookup has no state or side-effects, allowing the auto-validator implementation to perform multiple lookups and provide consistent statistic counters. The DPIF component does have state, so running two implementations in parallel and comparing output is not a valid testing method, as there are changes in DPIF statistic counters (side effects). As a result, the DPIF is tested directly against the unit-tests. Signed-off-by: Harry van Haaren Signed-off-by: Cian Ferriter --- v3: - Reformat code to match OVS style - Reformat comments to match OVS style - Improve return values for dp_netdev_input_outer_avx512_probe(). Use errno values and return 0 on success. - Remove "TODO: call the dpif selector impl here, int ret, out param for func ptr:" - Remove "TODO make this register/selectable just like DPCLS" --- acinclude.m4 | 15 +++++ configure.ac | 1 + lib/automake.mk | 1 + lib/dpif-netdev-avx512.c | 14 +++++ lib/dpif-netdev-private-dpif.c | 104 +++++++++++++++++++++++++++++++ lib/dpif-netdev-private-dpif.h | 31 ++++++++- lib/dpif-netdev-private-thread.h | 12 +--- lib/dpif-netdev.c | 84 +++++++++++++++++++++++-- 8 files changed, 247 insertions(+), 15 deletions(-) create mode 100644 lib/dpif-netdev-private-dpif.c diff --git a/acinclude.m4 b/acinclude.m4 index 1460289ca..9a7cb8df0 100644 --- a/acinclude.m4 +++ b/acinclude.m4 @@ -30,6 +30,21 @@ AC_DEFUN([OVS_CHECK_DPCLS_AUTOVALIDATOR], [ fi ]) +dnl Set OVS DPIF default implementation at configure time for running the unit +dnl tests on the whole codebase without modifying tests per DPIF impl +AC_DEFUN([OVS_CHECK_DPIF_AVX512_DEFAULT], [ + AC_ARG_ENABLE([dpif-default-avx512], + [AC_HELP_STRING([--enable-dpif-default-avx512], [Enable DPIF AVX512 implementation as default.])], + [dpifavx512=yes],[dpifavx512=no]) + AC_MSG_CHECKING([whether DPIF AVX512 is default implementation]) + if test "$dpifavx512" != yes; then + AC_MSG_RESULT([no]) + else + OVS_CFLAGS="$OVS_CFLAGS -DDPIF_AVX512_DEFAULT" + AC_MSG_RESULT([yes]) + fi +]) + dnl OVS_ENABLE_WERROR AC_DEFUN([OVS_ENABLE_WERROR], [AC_ARG_ENABLE( diff --git a/configure.ac b/configure.ac index 126a1d9d1..76b1e4fec 100644 --- a/configure.ac +++ b/configure.ac @@ -185,6 +185,7 @@ OVS_ENABLE_WERROR OVS_ENABLE_SPARSE OVS_CTAGS_IDENTIFIERS OVS_CHECK_DPCLS_AUTOVALIDATOR +OVS_CHECK_DPIF_AVX512_DEFAULT OVS_CHECK_BINUTILS_AVX512 AC_ARG_VAR(KARCH, [Kernel Architecture String]) diff --git a/lib/automake.mk b/lib/automake.mk index 650207940..2a41f7ab5 100644 --- a/lib/automake.mk +++ b/lib/automake.mk @@ -115,6 +115,7 @@ lib_libopenvswitch_la_SOURCES = \ lib/dpif-netdev.h \ lib/dpif-netdev-private-dfc.h \ lib/dpif-netdev-private-dpcls.h \ + lib/dpif-netdev-private-dpif.c \ lib/dpif-netdev-private-dpif.h \ lib/dpif-netdev-private-flow.h \ lib/dpif-netdev-private-hwol.h \ diff --git a/lib/dpif-netdev-avx512.c b/lib/dpif-netdev-avx512.c index ff756a203..2dee909a3 100644 --- a/lib/dpif-netdev-avx512.c +++ b/lib/dpif-netdev-avx512.c @@ -19,6 +19,7 @@ #if !defined(__CHECKER__) #include +#include #include "dpif-netdev.h" #include "dpif-netdev-perf.h" @@ -44,6 +45,19 @@ struct pkt_flow_meta { uint16_t tcp_flags; }; +int32_t +dp_netdev_input_outer_avx512_probe(void) +{ + int avx512f_available = dpdk_get_cpu_has_isa("x86_64", "avx512f"); + int bmi2_available = dpdk_get_cpu_has_isa("x86_64", "bmi2"); + + if (!avx512f_available || !bmi2_available) { + return -ENOTSUP; + } + + return 0; +} + int32_t dp_netdev_input_outer_avx512(struct dp_netdev_pmd_thread *pmd, struct dp_packet_batch *packets, diff --git a/lib/dpif-netdev-private-dpif.c b/lib/dpif-netdev-private-dpif.c new file mode 100644 index 000000000..a27a1d752 --- /dev/null +++ b/lib/dpif-netdev-private-dpif.c @@ -0,0 +1,104 @@ +/* + * Copyright (c) 2020 Intel Corporation. + * + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at: + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +#include +#include +#include + +#include "dpif-netdev-private-dpif.h" +#include "util.h" +#include "openvswitch/vlog.h" + +VLOG_DEFINE_THIS_MODULE(dpif_netdev_impl); + +/* Structure describing each available DPIF implmeentation. */ +struct dpif_netdev_impl_info_t { + /* Function pointer to execute to have this DPIF implementation run. */ + dp_netdev_input_func func; + /* Function pointer to execute to check the CPU ISA is available to run. + * May be NULL, which implies that it is always valid to use. + */ + dp_netdev_input_func_probe probe; + /* Name used to select this DPIF implementation. */ + const char *name; +}; + +/* Actual list of implementations goes here. */ +static struct dpif_netdev_impl_info_t dpif_impls[] = { + /* The default scalar C code implementation. */ + { .func = dp_netdev_input, + .probe = NULL, + .name = "dpif_scalar", }, + +#if (__x86_64__ && HAVE_AVX512F && HAVE_LD_AVX512_GOOD && __SSE4_2__) + /* Only available on x86_64 bit builds with SSE 4.2 used for OVS core. */ + { .func = dp_netdev_input_outer_avx512, + .probe = dp_netdev_input_outer_avx512_probe, + .name = "dpif_avx512", }, +#endif +}; + +dp_netdev_input_func +dp_netdev_impl_get_default(void) +{ + int dpif_idx = 0; + +/* Configure time overriding to run test suite on all implementations. */ +#if (__x86_64__ && HAVE_AVX512F && HAVE_LD_AVX512_GOOD && __SSE4_2__) +#ifdef DPIF_AVX512_DEFAULT + ovs_assert(dpif_impls[1].func == dp_netdev_input_outer_avx512); + if (!dp_netdev_input_outer_avx512_probe()) { + dpif_idx = 1; + }; +#endif +#endif + + VLOG_INFO("Default DPIF implementation is %s.\n", + dpif_impls[dpif_idx].name); + dp_netdev_input_func func = dpif_impls[dpif_idx].func; + + return func; +} + + +/* This function checks all available DPIF implementations, and selects the + * returns the function pointer to the one requested by "name". + */ +int32_t +dp_netdev_impl_get(const char *name, dp_netdev_input_func *out_func) +{ + ovs_assert(name); + ovs_assert(out_func); + + uint32_t i; + + for (i = 0; i < ARRAY_SIZE(dpif_impls); i++) { + if (strcmp(dpif_impls[i].name, name) == 0) { + /* Probe function is optional - so check it is set before exec. */ + if (dpif_impls[i].probe) { + int probe_ok = dpif_impls[i].probe(); + if (probe_ok) { + *out_func = NULL; + return probe_ok; + } + } + *out_func = dpif_impls[i].func; + return 0; + } + } + + return -EINVAL; +} diff --git a/lib/dpif-netdev-private-dpif.h b/lib/dpif-netdev-private-dpif.h index ae9068458..644ebfa40 100644 --- a/lib/dpif-netdev-private-dpif.h +++ b/lib/dpif-netdev-private-dpif.h @@ -23,7 +23,36 @@ struct dp_netdev_pmd_thread; struct dp_packet_batch; -/* Available implementations for dpif work */ +/* Typedef for DPIF functions. + * Returns a bitmask of packets to handle, possibly including upcall/misses. + */ +typedef int32_t (*dp_netdev_input_func)(struct dp_netdev_pmd_thread *pmd, + struct dp_packet_batch *packets, + odp_port_t port_no); + +/* Probe a DPIF implementation. This allows the implementation to validate CPU + * ISA availability. Returns 0 if not available, returns 1 is valid to use. + */ +typedef int32_t (*dp_netdev_input_func_probe)(void); + +/* This function checks all available DPIF implementations, and selects the + * returns the function pointer to the one requested by "name". + */ +int32_t +dp_netdev_impl_get(const char *name, dp_netdev_input_func *out_func); + +/* Returns the ./configure selected DPIF as default, used to initialize. */ +dp_netdev_input_func dp_netdev_impl_get_default(void); + +/* Available implementations of DPIF below */ +int32_t +dp_netdev_input(struct dp_netdev_pmd_thread *pmd, + struct dp_packet_batch *packets, + odp_port_t in_port); + +/* AVX512 enabled DPIF implementation and probe functions */ +int32_t +dp_netdev_input_outer_avx512_probe(void); int32_t dp_netdev_input_outer_avx512(struct dp_netdev_pmd_thread *pmd, struct dp_packet_batch *packets, diff --git a/lib/dpif-netdev-private-thread.h b/lib/dpif-netdev-private-thread.h index 18387b81d..d0b3ccb06 100644 --- a/lib/dpif-netdev-private-thread.h +++ b/lib/dpif-netdev-private-thread.h @@ -28,6 +28,8 @@ #include "dpif-netdev-perf.h" #include "openvswitch/thread.h" +#include "dpif-netdev-private-dpif.h" + #ifdef __cplusplus extern "C" { #endif @@ -47,16 +49,6 @@ struct dp_netdev_pmd_thread_ctx { uint32_t emc_insert_min; }; -/* Forward declaration for typedef */ -struct dp_netdev_pmd_thread; - -/* Typedef for DPIF functions. - * Returns a bitmask of packets to handle, possibly including upcall/misses. - */ -typedef int32_t (*dp_netdev_input_func)(struct dp_netdev_pmd_thread *pmd, - struct dp_packet_batch *packets, - odp_port_t port_no); - /* PMD: Poll modes drivers. PMD accesses devices via polling to eliminate * the performance overhead of interrupt processing. Therefore netdev can * not implement rx-wait for these devices. dpif-netdev needs to poll diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c index d03e00cc0..7fd61b89d 100644 --- a/lib/dpif-netdev.c +++ b/lib/dpif-netdev.c @@ -478,8 +478,8 @@ static void dp_netdev_execute_actions(struct dp_netdev_pmd_thread *pmd, const struct flow *flow, const struct nlattr *actions, size_t actions_len); -static int32_t dp_netdev_input(struct dp_netdev_pmd_thread *, - struct dp_packet_batch *, odp_port_t port_no); +int32_t dp_netdev_input(struct dp_netdev_pmd_thread *, + struct dp_packet_batch *, odp_port_t port_no); static void dp_netdev_recirculate(struct dp_netdev_pmd_thread *, struct dp_packet_batch *); @@ -990,6 +990,78 @@ dpif_netdev_subtable_lookup_set(struct unixctl_conn *conn, int argc, ds_destroy(&reply); } +static void +dpif_netdev_impl_set(struct unixctl_conn *conn, int argc, + const char *argv[], void *aux OVS_UNUSED) +{ + /* This function requires just one parameter, the DPIF name. + * A second optional parameter can identify the datapath instance. + */ + const char *dpif_name = argv[1]; + + static const char *error_description[2] = { + "Unknown DPIF implementation", + "CPU doesn't support the required instruction for", + }; + + dp_netdev_input_func new_func; + int32_t err = dp_netdev_impl_get(dpif_name, &new_func); + if (err) { + struct ds reply = DS_EMPTY_INITIALIZER; + ds_put_format(&reply, "DPIF implementation not available: %s %s.\n", + error_description[ (err == -ENOTSUP) ], dpif_name); + const char *reply_str = ds_cstr(&reply); + unixctl_command_reply(conn, reply_str); + VLOG_INFO("%s", reply_str); + ds_destroy(&reply); + return; + } + + /* argv[2] is optional datapath instance. If no datapath name is provided + * and only one datapath exists, the one existing datapath is reprobed. + */ + ovs_mutex_lock(&dp_netdev_mutex); + struct dp_netdev *dp = NULL; + + if (argc == 3) { + dp = shash_find_data(&dp_netdevs, argv[2]); + } else if (shash_count(&dp_netdevs) == 1) { + dp = shash_first(&dp_netdevs)->data; + } + + if (!dp) { + ovs_mutex_unlock(&dp_netdev_mutex); + unixctl_command_reply_error(conn, + "please specify an existing datapath"); + return; + } + + /* Get PMD threads list */ + size_t n; + struct dp_netdev_pmd_thread **pmd_list; + sorted_poll_thread_list(dp, &pmd_list, &n); + + for (size_t i = 0; i < n; i++) { + struct dp_netdev_pmd_thread *pmd = pmd_list[i]; + if (pmd->core_id == NON_PMD_CORE_ID) { + continue; + } + + /* Set PMD threads DPIF implementation to requested one */ + pmd->netdev_input_func = *new_func; + }; + + ovs_mutex_unlock(&dp_netdev_mutex); + + /* Reply with success to command */ + struct ds reply = DS_EMPTY_INITIALIZER; + ds_put_format(&reply, "DPIF implementation set to %s.\n", dpif_name); + const char *reply_str = ds_cstr(&reply); + unixctl_command_reply(conn, reply_str); + VLOG_INFO("%s", reply_str); + ds_destroy(&reply); +} + static void dpif_netdev_pmd_rebalance(struct unixctl_conn *conn, int argc, const char *argv[], void *aux OVS_UNUSED) @@ -1212,6 +1284,10 @@ dpif_netdev_init(void) unixctl_command_register("dpif-netdev/subtable-lookup-prio-get", "", 0, 0, dpif_netdev_subtable_lookup_get, NULL); + unixctl_command_register("dpif-netdev/dpif-set", + "[dpif implementation name] [dp]", + 1, 2, dpif_netdev_impl_set, + NULL); return 0; } @@ -6045,7 +6121,7 @@ dp_netdev_configure_pmd(struct dp_netdev_pmd_thread *pmd, struct dp_netdev *dp, cmap_init(&pmd->tx_bonds); /* Initialize the DPIF function pointer to the default scalar version */ - pmd->netdev_input_func = dp_netdev_input; + pmd->netdev_input_func = dp_netdev_impl_get_default(); /* init the 'flow_cache' since there is no * actual thread created for NON_PMD_CORE_ID. */ @@ -6955,7 +7031,7 @@ dp_netdev_input__(struct dp_netdev_pmd_thread *pmd, } } -static int32_t +int32_t dp_netdev_input(struct dp_netdev_pmd_thread *pmd, struct dp_packet_batch *packets, odp_port_t port_no)