From patchwork Tue Jul 6 13:11:40 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Ferriter, Cian" X-Patchwork-Id: 1501234 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=openvswitch.org (client-ip=140.211.166.137; helo=smtp4.osuosl.org; envelope-from=ovs-dev-bounces@openvswitch.org; receiver=) Received: from smtp4.osuosl.org (smtp4.osuosl.org [140.211.166.137]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4GK2yT6NQSz9sS8 for ; Tue, 6 Jul 2021 23:12:01 +1000 (AEST) Received: from localhost (localhost [127.0.0.1]) by smtp4.osuosl.org (Postfix) with ESMTP id 13CEF4058A; Tue, 6 Jul 2021 13:11:59 +0000 (UTC) X-Virus-Scanned: amavisd-new at osuosl.org Received: from smtp4.osuosl.org ([127.0.0.1]) by localhost (smtp4.osuosl.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 8PITtJegl3Q4; Tue, 6 Jul 2021 13:11:57 +0000 (UTC) Received: from lists.linuxfoundation.org (lf-lists.osuosl.org [140.211.9.56]) by smtp4.osuosl.org (Postfix) with ESMTPS id 3C29A40562; Tue, 6 Jul 2021 13:11:56 +0000 (UTC) Received: from lf-lists.osuosl.org (localhost [127.0.0.1]) by lists.linuxfoundation.org (Postfix) with ESMTP id 1452FC0010; Tue, 6 Jul 2021 13:11:56 +0000 (UTC) X-Original-To: ovs-dev@openvswitch.org Delivered-To: ovs-dev@lists.linuxfoundation.org Received: from smtp2.osuosl.org (smtp2.osuosl.org [IPv6:2605:bc80:3010::133]) by lists.linuxfoundation.org (Postfix) with ESMTP id D3ED6C0010 for ; Tue, 6 Jul 2021 13:11:54 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by smtp2.osuosl.org (Postfix) with ESMTP id C199D4024B for ; Tue, 6 Jul 2021 13:11:54 +0000 (UTC) X-Virus-Scanned: 
amavisd-new at osuosl.org Received: from smtp2.osuosl.org ([127.0.0.1]) by localhost (smtp2.osuosl.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id SW_BJooo0qHI for ; Tue, 6 Jul 2021 13:11:53 +0000 (UTC) X-Greylist: domain auto-whitelisted by SQLgrey-1.8.0 Received: from mga11.intel.com (mga11.intel.com [192.55.52.93]) by smtp2.osuosl.org (Postfix) with ESMTPS id 54887400CD for ; Tue, 6 Jul 2021 13:11:53 +0000 (UTC) X-IronPort-AV: E=McAfee;i="6200,9189,10036"; a="206101848" X-IronPort-AV: E=Sophos;i="5.83,328,1616482800"; d="scan'208";a="206101848" Received: from fmsmga003.fm.intel.com ([10.253.24.29]) by fmsmga102.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 06 Jul 2021 06:11:52 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.83,328,1616482800"; d="scan'208";a="486258013" Received: from silpixa00399779.ir.intel.com (HELO silpixa00399779.ger.corp.intel.com) ([10.237.222.105]) by FMSMGA003.fm.intel.com with ESMTP; 06 Jul 2021 06:11:50 -0700 From: Cian Ferriter To: ovs-dev@openvswitch.org Date: Tue, 6 Jul 2021 14:11:40 +0100 Message-Id: <20210706131150.45513-2-cian.ferriter@intel.com> X-Mailer: git-send-email 2.32.0 In-Reply-To: <20210706131150.45513-1-cian.ferriter@intel.com> References: <20210706131150.45513-1-cian.ferriter@intel.com> MIME-Version: 1.0 Cc: i.maximets@ovn.org, fbl@sysclose.org, kumar.amber@intel.com Subject: [ovs-dev] [v6 01/11] dpif-netdev: Add command line and function pointer for miniflow extract X-BeenThere: ovs-dev@openvswitch.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: ovs-dev-bounces@openvswitch.org Sender: "dev" From: Kumar Amber This patch introduces the mfex function pointers, which allow the user to switch between the different miniflow extract implementations that OVS provides, each optimized for a particular CPU ISA. 
The user can query the miniflow extract variants available on the running CPU with the following command: $ ovs-appctl dpif-netdev/miniflow-parser-get Similarly, a user can set the miniflow implementation with the following command: $ ovs-appctl dpif-netdev/miniflow-parser-set name This gives the user better performance and the flexibility to choose the miniflow implementation according to their needs. Signed-off-by: Kumar Amber Co-authored-by: Harry van Haaren Signed-off-by: Harry van Haaren --- v5: - fix review comments (Ian, Flavio, Eelco) - add enum to hold mfex indexes - add new get and set implementations - add Atomic set and get --- --- NEWS | 1 + lib/automake.mk | 2 + lib/dpif-netdev-avx512.c | 32 +++++- lib/dpif-netdev-private-extract.c | 159 ++++++++++++++++++++++++++++++ lib/dpif-netdev-private-extract.h | 105 ++++++++++++++++++++ lib/dpif-netdev-private-thread.h | 8 ++ lib/dpif-netdev.c | 127 +++++++++++++++++++++++- 7 files changed, 429 insertions(+), 5 deletions(-) create mode 100644 lib/dpif-netdev-private-extract.c create mode 100644 lib/dpif-netdev-private-extract.h diff --git a/NEWS b/NEWS index be96fc57f..60db823c4 100644 --- a/NEWS +++ b/NEWS @@ -22,6 +22,7 @@ Post-v2.15.0 * Enable the AVX512 DPCLS implementation to use VPOPCNT instruction if the CPU supports it. This enhances performance by using the native vpopcount instructions, instead of the emulated version of vpopcount. + * Add command line option to switch between mfex function pointers. - ovs-ctl: * New option '--no-record-hostname' to disable hostname configuration in ovsdb on startup. 
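A minimal, self-contained sketch of the mechanism described above: a table of named implementations, a per-thread function pointer read atomically on the fast path, and a scalar fallback for any packet the selected function did not handle. It uses plain C11 atomics rather than the OVS atomic macros, and every name in it (extract_impl, select_impl, process_batch, the toy "even_only" implementation) is hypothetical and not part of this patch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for miniflow_extract_func: returns a bitmask of the
 * packets it handled, or zero if it handled none. */
typedef uint32_t (*extract_func)(const int *pkts, int *keys, uint32_t n);

struct extract_impl {
    const char *name;
    extract_func func;   /* NULL means "always use the scalar path". */
    int available;       /* Cached probe result. */
};

/* Toy optimized implementation: handles only even-valued "packets". */
static uint32_t
extract_even_only(const int *pkts, int *keys, uint32_t n)
{
    uint32_t mask = 0;

    for (uint32_t i = 0; i < n && i < 32; i++) {
        if (pkts[i] % 2 == 0) {
            keys[i] = pkts[i] * 10;
            mask |= UINT32_C(1) << i;
        }
    }
    return mask;
}

static struct extract_impl impls[] = {
    { "scalar", NULL, 1 },
    { "even_only", extract_even_only, 1 },
    { "unavailable", extract_even_only, 0 },
};

/* Currently selected function; zero-initialized, i.e. scalar by default. */
static _Atomic(extract_func) current_func;

/* Select an implementation by name. Returns 0 on success, -1 if the name is
 * unknown or the implementation did not pass its probe. */
static int
select_impl(const char *name)
{
    for (size_t i = 0; i < sizeof impls / sizeof impls[0]; i++) {
        if (strcmp(impls[i].name, name) == 0) {
            if (!impls[i].available) {
                return -1;
            }
            atomic_store_explicit(&current_func, impls[i].func,
                                  memory_order_relaxed);
            return 0;
        }
    }
    return -1;
}

/* Scalar reference path for a single packet. */
static void
extract_scalar(int pkt, int *key)
{
    *key = pkt * 10;
}

/* Batch fast path: try the selected function first, then fall back to the
 * scalar path for every packet whose bit is not set in the returned mask. */
static void
process_batch(const int *pkts, int *keys, uint32_t n)
{
    uint32_t mask = 0;
    extract_func f = atomic_load_explicit(&current_func,
                                          memory_order_relaxed);

    assert(n <= 32);
    if (f) {
        mask = f(pkts, keys, n);
    }
    for (uint32_t i = 0; i < n; i++) {
        if (!(mask & (UINT32_C(1) << i))) {
            extract_scalar(pkts[i], &keys[i]);
        }
    }
}
```

In this sketch, calling select_impl("even_only") and then process_batch() fills the even slots through the function pointer and the odd slots through the scalar path, so the keys come out the same as with select_impl("scalar"); only the path taken differs.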
diff --git a/lib/automake.mk b/lib/automake.mk index 49f42c2a3..6657b9ae5 100644 --- a/lib/automake.mk +++ b/lib/automake.mk @@ -118,6 +118,8 @@ lib_libopenvswitch_la_SOURCES = \ lib/dpif-netdev-private-dpcls.h \ lib/dpif-netdev-private-dpif.c \ lib/dpif-netdev-private-dpif.h \ + lib/dpif-netdev-private-extract.c \ + lib/dpif-netdev-private-extract.h \ lib/dpif-netdev-private-flow.h \ lib/dpif-netdev-private-hwol.h \ lib/dpif-netdev-private-thread.h \ diff --git a/lib/dpif-netdev-avx512.c b/lib/dpif-netdev-avx512.c index 9a5189145..91fad92db 100644 --- a/lib/dpif-netdev-avx512.c +++ b/lib/dpif-netdev-avx512.c @@ -149,6 +149,16 @@ dp_netdev_input_outer_avx512(struct dp_netdev_pmd_thread *pmd, * // do all processing (HWOL->MFEX->EMC->SMC) * } */ + + /* Do a batch miniflow extract into keys. */ + uint32_t mf_mask = 0; + miniflow_extract_func mfex_func; + atomic_read_relaxed(&pmd->miniflow_extract_opt, &mfex_func); + if (mfex_func) { + mf_mask = mfex_func(packets, keys, batch_size, in_port, pmd); + } + + /* Perform first packet iteration. */ uint32_t lookup_pkts_bitmask = (1ULL << batch_size) - 1; uint32_t iter = lookup_pkts_bitmask; while (iter) { @@ -167,6 +177,13 @@ dp_netdev_input_outer_avx512(struct dp_netdev_pmd_thread *pmd, pkt_metadata_init(&packet->md, in_port); struct dp_netdev_flow *f = NULL; + struct netdev_flow_key *key = &keys[i]; + + /* Check the miniflow mask to see if the packet was correctly + * classified by vector mfex, else do a scalar miniflow extract + * for that packet. + */ + uint32_t mfex_hit = (mf_mask & (1 << i)); /* Check for a partial hardware offload match. */ if (hwol_enabled) { @@ -177,7 +194,13 @@ dp_netdev_input_outer_avx512(struct dp_netdev_pmd_thread *pmd, } if (f) { rules[i] = &f->cr; - pkt_meta[i].tcp_flags = parse_tcp_flags(packet); + /* If AVX512 MFEX already classified the packet, use it. 
*/ + if (mfex_hit) { + pkt_meta[i].tcp_flags = miniflow_get_tcp_flags(&key->mf); + } else { + pkt_meta[i].tcp_flags = parse_tcp_flags(packet); + } + pkt_meta[i].bytes = dp_packet_size(packet); phwol_hits++; hwol_emc_smc_hitmask |= (1 << i); @@ -185,9 +208,10 @@ dp_netdev_input_outer_avx512(struct dp_netdev_pmd_thread *pmd, } } - /* Do miniflow extract into keys. */ - struct netdev_flow_key *key = &keys[i]; - miniflow_extract(packet, &key->mf); + if (!mfex_hit) { + /* Do a scalar miniflow extract into keys. */ + miniflow_extract(packet, &key->mf); + } /* Cache TCP and byte values for all packets. */ pkt_meta[i].bytes = dp_packet_size(packet); diff --git a/lib/dpif-netdev-private-extract.c b/lib/dpif-netdev-private-extract.c new file mode 100644 index 000000000..f7ad2d5b5 --- /dev/null +++ b/lib/dpif-netdev-private-extract.c @@ -0,0 +1,159 @@ +/* + * Copyright (c) 2021 Intel. + * + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at: + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +#include +#include +#include +#include + +#include "dp-packet.h" +#include "dpif-netdev-private-dpcls.h" +#include "dpif-netdev-private-extract.h" +#include "dpif-netdev-private-thread.h" +#include "flow.h" +#include "openvswitch/vlog.h" +#include "ovs-thread.h" +#include "util.h" + +VLOG_DEFINE_THIS_MODULE(dpif_netdev_extract); + +/* Variable to hold the default mfex implementation. 
*/ +static miniflow_extract_func default_mfex_func = NULL; + +/* Implementations of the available extract options. The + * implementations are always listed in order of preference. + */ +static struct dpif_miniflow_extract_impl mfex_impls[] = { + + [MFEX_IMPL_SCALAR] = { + .probe = NULL, + .extract_func = NULL, + .name = "scalar", }, +}; + +BUILD_ASSERT_DECL(MFEX_IMPL_MAX >= ARRAY_SIZE(mfex_impls)); + +void +dpif_miniflow_extract_init(void) +{ + /* Call probe on each impl, and cache the result. */ + uint32_t i; + for (i = 0; i < ARRAY_SIZE(mfex_impls); i++) { + bool avail = true; + if (mfex_impls[i].probe) { + /* Zero means success, non-zero means error. */ + avail = (mfex_impls[i].probe() == 0); + } + VLOG_INFO("Miniflow Extract implementation %s (available: %s)\n", + mfex_impls[i].name, avail ? "True" : "False"); + mfex_impls[i].available = avail; + } +} + +miniflow_extract_func +dp_mfex_impl_get_default(void) +{ + /* For the first call, this will be NULL. Compute the compile time default. + */ + if (!default_mfex_func) { + + VLOG_INFO("Default MFEX implementation is %s.\n", + mfex_impls[MFEX_IMPL_SCALAR].name); + default_mfex_func = mfex_impls[MFEX_IMPL_SCALAR].extract_func; + } + + return default_mfex_func; +} + +int32_t +dp_mfex_impl_set_default_by_name(const char *name) +{ + miniflow_extract_func new_default; + + + int32_t err = dp_mfex_impl_get_by_name(name, &new_default); + + if (!err) { + default_mfex_func = new_default; + } + + return err; + +} + +uint32_t +dp_mfex_impl_get(struct ds *reply, struct dp_netdev_pmd_thread **pmd_list, + size_t n) +{ + /* Add all mfex functions to reply string. */ + ds_put_cstr(reply, "Available MFEX implementations:\n"); + + for (uint32_t i = 0; i < ARRAY_SIZE(mfex_impls); i++) { + + ds_put_format(reply, " %s (available: %s)(pmds: ", + mfex_impls[i].name, mfex_impls[i].available ? 
+ "True" : "False"); + + for (size_t j = 0; j < n; j++) { + struct dp_netdev_pmd_thread *pmd = pmd_list[j]; + if (pmd->core_id == NON_PMD_CORE_ID) { + continue; + } + + if (pmd->miniflow_extract_opt == mfex_impls[i].extract_func) { + ds_put_format(reply, "%u,", pmd->core_id); + } + } + + ds_chomp(reply, ','); + + if (ds_last(reply) == ' ') { + ds_put_cstr(reply, "none"); + } + + ds_put_cstr(reply, ")\n"); + } + + return ARRAY_SIZE(mfex_impls); +} + +/* This function checks all available MFEX implementations and returns + * the function pointer to the one requested by "name". + */ +int32_t +dp_mfex_impl_get_by_name(const char *name, miniflow_extract_func *out_func) +{ + if ((name == NULL) || (out_func == NULL)) { + return -EINVAL; + } + + uint32_t i; + + for (i = 0; i < ARRAY_SIZE(mfex_impls); i++) { + if (strcmp(mfex_impls[i].name, name) == 0) { + /* Reject implementations whose cached probe result is false. */ + if (!mfex_impls[i].available) { + *out_func = NULL; + return -EINVAL; + } + + *out_func = mfex_impls[i].extract_func; + return 0; + } + } + + return -EINVAL; +} diff --git a/lib/dpif-netdev-private-extract.h b/lib/dpif-netdev-private-extract.h new file mode 100644 index 000000000..074b3ee16 --- /dev/null +++ b/lib/dpif-netdev-private-extract.h @@ -0,0 +1,105 @@ +/* + * Copyright (c) 2021 Intel. + * + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at: + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +#ifndef MFEX_AVX512_EXTRACT +#define MFEX_AVX512_EXTRACT 1 + +#include + +/* Forward declarations. */ +struct dp_packet; +struct miniflow; +struct dp_netdev_pmd_thread; +struct dp_packet_batch; +struct netdev_flow_key; + +/* Function pointer prototype to be implemented in the optimized miniflow + * extract code. + * Returns the hitmask of the processed packets on success. + * Returns zero on failure. + */ +typedef uint32_t (*miniflow_extract_func)(struct dp_packet_batch *batch, + struct netdev_flow_key *keys, + uint32_t keys_size, + odp_port_t in_port, + struct dp_netdev_pmd_thread + *pmd_handle); + +/* Probe function is used to detect if this CPU has the ISA required + * to run the optimized miniflow implementation. + * Returns zero on a successful probe. + * Returns non-zero on failure. + */ +typedef int32_t (*miniflow_extract_probe)(void); + +/* Structure representing the attributes of an optimized implementation. */ +struct dpif_miniflow_extract_impl { + /* When true, this impl has passed the probe() checks. */ + bool available; + + /* Probe function is used to detect if this CPU has the ISA required + * to run the optimized miniflow implementation. + */ + miniflow_extract_probe probe; + + /* Optional function to call to extract miniflows for a burst of packets. + */ + miniflow_extract_func extract_func; + + /* Name of the optimized implementation. */ + char *name; +}; + + +/* Enum to hold implementation indexes. The list is traversed + * linearly because, from the ISA perspective, the VBMI version + * should always come before the generic AVX512-F version. + */ +enum dpif_miniflow_extract_impl_idx { + MFEX_IMPL_SCALAR, + MFEX_IMPL_MAX +}; + +/* This function returns all available implementations to the caller. The + * number of implementations is the return value. 
+ */ +uint32_t +dp_mfex_impl_get(struct ds *reply, struct dp_netdev_pmd_thread **pmd_list, + size_t n); + +/* This function checks all available MFEX implementations and returns + * the function pointer to the one requested by "name". + */ +int32_t +dp_mfex_impl_get_by_name(const char *name, miniflow_extract_func *out_func); + +/* Returns the default MFEX, which is first selected by ./configure but can + * be overridden at runtime. */ +miniflow_extract_func dp_mfex_impl_get_default(void); + +/* Overrides the default MFEX with the user-set MFEX. */ +int32_t dp_mfex_impl_set_default_by_name(const char *name); + + +/* Initializes the available miniflow extract implementations by probing for + * the CPU ISA requirements. As the runtime available CPU ISA does not change + * and the required ISA of the implementation also does not change, it is safe + * to cache the probe() results, and not call probe() at runtime. + */ +void +dpif_miniflow_extract_init(void); + +#endif /* MFEX_AVX512_EXTRACT */ diff --git a/lib/dpif-netdev-private-thread.h b/lib/dpif-netdev-private-thread.h index ba79c4a0a..a4c092b69 100644 --- a/lib/dpif-netdev-private-thread.h +++ b/lib/dpif-netdev-private-thread.h @@ -27,6 +27,11 @@ #include #include "cmap.h" + +#include "dpif-netdev-private-dfc.h" +#include "dpif-netdev-private-dpif.h" +#include "dpif-netdev-perf.h" +#include "dpif-netdev-private-extract.h" #include "openvswitch/thread.h" #ifdef __cplusplus @@ -110,6 +115,9 @@ struct dp_netdev_pmd_thread { /* Pointer for per-DPIF implementation scratch space. */ void *netdev_input_func_userdata; + /* Function pointer to call for miniflow_extract() functionality. 
*/ + ATOMIC(miniflow_extract_func) miniflow_extract_opt; + struct seq *reload_seq; uint64_t last_reload_seq; diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c index 6203cf656..2043c9ba2 100644 --- a/lib/dpif-netdev.c +++ b/lib/dpif-netdev.c @@ -46,6 +46,7 @@ #include "dpif.h" #include "dpif-netdev-lookup.h" #include "dpif-netdev-perf.h" +#include "dpif-netdev-private-extract.h" #include "dpif-provider.h" #include "dummy.h" #include "fat-rwlock.h" @@ -1069,6 +1070,97 @@ dpif_netdev_impl_set(struct unixctl_conn *conn, int argc OVS_UNUSED, ds_destroy(&reply); } +static void +dpif_miniflow_extract_impl_get(struct unixctl_conn *conn, int argc OVS_UNUSED, + const char *argv[] OVS_UNUSED, + void *aux OVS_UNUSED) +{ + struct ds reply = DS_EMPTY_INITIALIZER; + struct shash_node *node; + uint32_t count = 0; + + SHASH_FOR_EACH (node, &dp_netdevs) { + struct dp_netdev *dp = node->data; + + /* Get PMD threads list. */ + size_t n; + struct dp_netdev_pmd_thread **pmd_list; + sorted_poll_thread_list(dp, &pmd_list, &n); + count = dp_mfex_impl_get(&reply, pmd_list, n); + } + + if (count == 0) { + unixctl_command_reply_error(conn, "Error getting Mfex names."); + } else { + unixctl_command_reply(conn, ds_cstr(&reply)); + } + + ds_destroy(&reply); +} + +static void +dpif_miniflow_extract_impl_set(struct unixctl_conn *conn, int argc, + const char *argv[], void *aux OVS_UNUSED) +{ + /* This function requires just one parameter, the miniflow name. 
+ */ + const char *mfex_name = argv[1]; + struct shash_node *node; + + static const char *error_description[2] = { + "Unknown miniflow implementation", + "implementation doesn't exist", + }; + + ovs_mutex_lock(&dp_netdev_mutex); + int32_t err = dp_mfex_impl_set_default_by_name(mfex_name); + + if (err) { + struct ds reply = DS_EMPTY_INITIALIZER; + ds_put_format(&reply, + "Miniflow implementation not available: %s %s.\n", + error_description[ (err == -EINVAL) ], mfex_name); + const char *reply_str = ds_cstr(&reply); + unixctl_command_reply_error(conn, reply_str); + VLOG_INFO("%s", reply_str); + ds_destroy(&reply); + ovs_mutex_unlock(&dp_netdev_mutex); + return; + } + + SHASH_FOR_EACH (node, &dp_netdevs) { + struct dp_netdev *dp = node->data; + + /* Get PMD threads list. */ + size_t n; + struct dp_netdev_pmd_thread **pmd_list; + sorted_poll_thread_list(dp, &pmd_list, &n); + + for (size_t i = 0; i < n; i++) { + struct dp_netdev_pmd_thread *pmd = pmd_list[i]; + if (pmd->core_id == NON_PMD_CORE_ID) { + continue; + } + + /* Initialize MFEX function pointer to the newly configured + * default. */ + miniflow_extract_func default_func = dp_mfex_impl_get_default(); + atomic_uintptr_t *pmd_func = (void *) &pmd->miniflow_extract_opt; + atomic_store_relaxed(pmd_func, (uintptr_t) default_func); + } + } + + ovs_mutex_unlock(&dp_netdev_mutex); + + /* Reply with success to command. 
*/ + struct ds reply = DS_EMPTY_INITIALIZER; + ds_put_format(&reply, "Miniflow implementation set to %s.\n", mfex_name); + const char *reply_str = ds_cstr(&reply); + unixctl_command_reply(conn, reply_str); + VLOG_INFO("%s", reply_str); + ds_destroy(&reply); +} + static void dpif_netdev_pmd_rebalance(struct unixctl_conn *conn, int argc, const char *argv[], void *aux OVS_UNUSED) @@ -1298,6 +1390,13 @@ dpif_netdev_init(void) unixctl_command_register("dpif-netdev/dpif-impl-get", "", 0, 0, dpif_netdev_impl_get, NULL); + unixctl_command_register("dpif-netdev/miniflow-parser-set", + "miniflow implementation name", + 1, 1, dpif_miniflow_extract_impl_set, + NULL); + unixctl_command_register("dpif-netdev/miniflow-parser-get", "", + 0, 0, dpif_miniflow_extract_impl_get, + NULL); return 0; } @@ -1499,6 +1598,8 @@ create_dp_netdev(const char *name, const struct dpif_class *class, dp->conntrack = conntrack_init(); + dpif_miniflow_extract_init(); + atomic_init(&dp->emc_insert_min, DEFAULT_EM_FLOW_INSERT_MIN); atomic_init(&dp->tx_flush_interval, DEFAULT_TX_FLUSH_INTERVAL); @@ -6206,6 +6307,11 @@ dp_netdev_configure_pmd(struct dp_netdev_pmd_thread *pmd, struct dp_netdev *dp, atomic_uintptr_t *pmd_func = (void *) &pmd->netdev_input_func; atomic_init(pmd_func, (uintptr_t) default_func); + /* Init default miniflow_extract function */ + miniflow_extract_func mfex_func = dp_mfex_impl_get_default(); + atomic_uintptr_t *pmd_func_mfex = (void *)&pmd->miniflow_extract_opt; + atomic_store_relaxed(pmd_func_mfex, (uintptr_t) mfex_func); + /* init the 'flow_cache' since there is no * actual thread created for NON_PMD_CORE_ID. 
*/ if (core_id == NON_PMD_CORE_ID) { @@ -6795,6 +6901,7 @@ dfc_processing(struct dp_netdev_pmd_thread *pmd, size_t n_missed = 0, n_emc_hit = 0, n_phwol_hit = 0; struct dfc_cache *cache = &pmd->flow_cache; struct dp_packet *packet; + struct dp_packet_batch single_packet; const size_t cnt = dp_packet_batch_size(packets_); uint32_t cur_min = pmd->ctx.emc_insert_min; int i; @@ -6803,6 +6910,11 @@ dfc_processing(struct dp_netdev_pmd_thread *pmd, size_t map_cnt = 0; bool batch_enable = true; + single_packet.count = 1; + + miniflow_extract_func mfex_func; + atomic_read_relaxed(&pmd->miniflow_extract_opt, &mfex_func); + atomic_read_relaxed(&pmd->dp->smc_enable_db, &smc_enable_db); pmd_perf_update_counter(&pmd->perf_stats, md_is_valid ? PMD_STAT_RECIRC : PMD_STAT_RECV, @@ -6853,7 +6965,20 @@ dfc_processing(struct dp_netdev_pmd_thread *pmd, } } - miniflow_extract(packet, &key->mf); + /* Set the count and packet for miniflow_opt with batch_size 1. */ + if ((mfex_func) && (!md_is_valid)) { + single_packet.packets[0] = packet; + int mf_ret; + + mf_ret = mfex_func(&single_packet, key, 1, port_no, pmd); + /* Fallback to original miniflow_extract if there is a miss. */ + if (!mf_ret) { + miniflow_extract(packet, &key->mf); + } + } else { + miniflow_extract(packet, &key->mf); + } + key->len = 0; /* Not computed yet. 
*/ key->hash = (md_is_valid == false) From patchwork Tue Jul 6 13:11:41 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Ferriter, Cian" X-Patchwork-Id: 1501235 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=openvswitch.org (client-ip=140.211.166.136; helo=smtp3.osuosl.org; envelope-from=ovs-dev-bounces@openvswitch.org; receiver=) Received: from smtp3.osuosl.org (smtp3.osuosl.org [140.211.166.136]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4GK2yZ19TTz9sS8 for ; Tue, 6 Jul 2021 23:12:06 +1000 (AEST) Received: from localhost (localhost [127.0.0.1]) by smtp3.osuosl.org (Postfix) with ESMTP id CCF9C608A6; Tue, 6 Jul 2021 13:12:02 +0000 (UTC) X-Virus-Scanned: amavisd-new at osuosl.org Received: from smtp3.osuosl.org ([127.0.0.1]) by localhost (smtp3.osuosl.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id Okl-Sc1-fmTW; Tue, 6 Jul 2021 13:12:00 +0000 (UTC) Received: from lists.linuxfoundation.org (lf-lists.osuosl.org [IPv6:2605:bc80:3010:104::8cd3:938]) by smtp3.osuosl.org (Postfix) with ESMTPS id CA3C46085B; Tue, 6 Jul 2021 13:11:58 +0000 (UTC) Received: from lf-lists.osuosl.org (localhost [127.0.0.1]) by lists.linuxfoundation.org (Postfix) with ESMTP id C28D2C0021; Tue, 6 Jul 2021 13:11:57 +0000 (UTC) X-Original-To: ovs-dev@openvswitch.org Delivered-To: ovs-dev@lists.linuxfoundation.org Received: from smtp2.osuosl.org (smtp2.osuosl.org [140.211.166.133]) by lists.linuxfoundation.org (Postfix) with ESMTP id ABAF8C0020 for ; Tue, 6 Jul 2021 13:11:56 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by smtp2.osuosl.org (Postfix) with ESMTP id 93BCD40393 for ; Tue, 6 Jul 
2021 13:11:56 +0000 (UTC) X-Virus-Scanned: amavisd-new at osuosl.org Received: from smtp2.osuosl.org ([127.0.0.1]) by localhost (smtp2.osuosl.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id kpUTWPKmsrt2 for ; Tue, 6 Jul 2021 13:11:55 +0000 (UTC) X-Greylist: domain auto-whitelisted by SQLgrey-1.8.0 Received: from mga11.intel.com (mga11.intel.com [192.55.52.93]) by smtp2.osuosl.org (Postfix) with ESMTPS id 319CA4024B for ; Tue, 6 Jul 2021 13:11:54 +0000 (UTC) X-IronPort-AV: E=McAfee;i="6200,9189,10036"; a="206101853" X-IronPort-AV: E=Sophos;i="5.83,328,1616482800"; d="scan'208";a="206101853" Received: from fmsmga003.fm.intel.com ([10.253.24.29]) by fmsmga102.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 06 Jul 2021 06:11:54 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.83,328,1616482800"; d="scan'208";a="486258019" Received: from silpixa00399779.ir.intel.com (HELO silpixa00399779.ger.corp.intel.com) ([10.237.222.105]) by FMSMGA003.fm.intel.com with ESMTP; 06 Jul 2021 06:11:52 -0700 From: Cian Ferriter To: ovs-dev@openvswitch.org Date: Tue, 6 Jul 2021 14:11:41 +0100 Message-Id: <20210706131150.45513-3-cian.ferriter@intel.com> X-Mailer: git-send-email 2.32.0 In-Reply-To: <20210706131150.45513-1-cian.ferriter@intel.com> References: <20210706131150.45513-1-cian.ferriter@intel.com> MIME-Version: 1.0 Cc: i.maximets@ovn.org, fbl@sysclose.org, kumar.amber@intel.com Subject: [ovs-dev] [v6 02/11] dpif-netdev: Add auto validation function for miniflow extract X-BeenThere: ovs-dev@openvswitch.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: ovs-dev-bounces@openvswitch.org Sender: "dev" From: Kumar Amber This patch introduces the auto-validation function, which lets users compare the miniflows that different miniflow implementations extract from a batch of packets against the linear miniflow extract, and returns a hitmask. 
The autovalidator function can be triggered at runtime using the following command: $ ovs-appctl dpif-netdev/miniflow-parser-set autovalidator Signed-off-by: Kumar Amber Co-authored-by: Harry van Haaren Signed-off-by: Harry van Haaren --- v5: - fix review comments (Ian, Flavio, Eelco) - remove ovs assert and switch to default after a batch of packets is processed - Atomic set and get introduced - fix raw_ctz for windows build --- --- NEWS | 2 + lib/dpif-netdev-private-extract.c | 149 ++++++++++++++++++++++++++++++ lib/dpif-netdev-private-extract.h | 13 +++ lib/dpif-netdev.c | 2 +- 4 files changed, 165 insertions(+), 1 deletion(-) diff --git a/NEWS b/NEWS index 60db823c4..ccf9a0f1e 100644 --- a/NEWS +++ b/NEWS @@ -23,6 +23,8 @@ Post-v2.15.0 CPU supports it. This enhances performance by using the native vpopcount instructions, instead of the emulated version of vpopcount. * Add command line option to switch between mfex function pointers. + * Add miniflow extract auto-validator function to compare different + miniflow extract implementations against the default implementation. - ovs-ctl: * New option '--no-record-hostname' to disable hostname configuration in ovsdb on startup. 
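A minimal sketch of the auto-validation idea described above, under simplified assumptions: "packets" and "keys" are plain integers, the reference path stands in for the scalar miniflow_extract(), and a candidate implementation reports which packets it handled through a hitmask. All names here (validate, reference_extract, candidate_ok, candidate_buggy, ctz32) are hypothetical and are not the functions added by this patch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define BATCH_MAX 32

/* Hypothetical stand-in for miniflow_extract_func: returns a bitmask of the
 * packets the implementation handled. */
typedef uint32_t (*extract_func)(const int *pkts, int *keys, uint32_t n);

/* Portable count-trailing-zeros; x must be non-zero. */
static uint32_t
ctz32(uint32_t x)
{
    uint32_t n = 0;

    assert(x != 0);
    while (!(x & 1)) {
        x >>= 1;
        n++;
    }
    return n;
}

/* Reference ("known good") path, handles every packet. */
static void
reference_extract(const int *pkts, int *keys, uint32_t n)
{
    for (uint32_t i = 0; i < n; i++) {
        keys[i] = pkts[i] * 10;
    }
}

/* Candidate that handles only even packets, and handles them correctly. */
static uint32_t
candidate_ok(const int *pkts, int *keys, uint32_t n)
{
    uint32_t mask = 0;

    for (uint32_t i = 0; i < n; i++) {
        if (pkts[i] % 2 == 0) {
            keys[i] = pkts[i] * 10;
            mask |= UINT32_C(1) << i;
        }
    }
    return mask;
}

/* Candidate that claims every packet but produces a wrong key at index 2. */
static uint32_t
candidate_buggy(const int *pkts, int *keys, uint32_t n)
{
    uint32_t mask = 0;

    for (uint32_t i = 0; i < n; i++) {
        keys[i] = pkts[i] * 10 + (i == 2 ? 1 : 0);
        mask |= UINT32_C(1) << i;
    }
    return mask;
}

/* Compare a candidate against the reference for every packet the candidate
 * claims in its hitmask. Returns a bitmask of mismatching packets, or
 * UINT32_MAX if the batch is too large for the local buffers. */
static uint32_t
validate(extract_func candidate, const int *pkts, uint32_t n)
{
    int good[BATCH_MAX];
    int test[BATCH_MAX];
    uint32_t mismatches = 0;

    if (n > BATCH_MAX) {
        return UINT32_MAX;
    }

    reference_extract(pkts, good, n);
    memset(test, 0, sizeof test);

    uint32_t hit_mask = candidate(pkts, test, n);
    while (hit_mask) {
        uint32_t i = ctz32(hit_mask);   /* Index of the lowest set bit. */
        hit_mask &= hit_mask - 1;       /* Clear the lowest set bit. */

        /* A bit beyond the batch, or a differing key, is a failure. */
        if (i >= n || good[i] != test[i]) {
            mismatches |= UINT32_C(1) << i;
        }
    }
    return mismatches;
}
```

Only packets flagged in the candidate's hitmask are compared, since for the others the caller falls back to the reference path anyway. Unlike this sketch, the function in the patch always returns zero and disables itself on a mismatch instead of reporting a mask to the caller.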
diff --git a/lib/dpif-netdev-private-extract.c b/lib/dpif-netdev-private-extract.c index f7ad2d5b5..62170ff6c 100644 --- a/lib/dpif-netdev-private-extract.c +++ b/lib/dpif-netdev-private-extract.c @@ -38,6 +38,11 @@ static miniflow_extract_func default_mfex_func = NULL; */ static struct dpif_miniflow_extract_impl mfex_impls[] = { + [MFEX_IMPL_AUTOVALIDATOR] = { + .probe = NULL, + .extract_func = dpif_miniflow_extract_autovalidator, + .name = "autovalidator", }, + [MFEX_IMPL_SCALAR] = { .probe = NULL, .extract_func = NULL, @@ -157,3 +162,147 @@ dp_mfex_impl_get_by_name(const char *name, miniflow_extract_func *out_func) return -EINVAL; } + +uint32_t +dpif_miniflow_extract_autovalidator(struct dp_packet_batch *packets, + struct netdev_flow_key *keys, + uint32_t keys_size, odp_port_t in_port, + struct dp_netdev_pmd_thread *pmd_handle) +{ + const uint32_t cnt = dp_packet_batch_size(packets); + uint16_t good_l2_5_ofs[NETDEV_MAX_BURST]; + uint16_t good_l3_ofs[NETDEV_MAX_BURST]; + uint16_t good_l4_ofs[NETDEV_MAX_BURST]; + uint16_t good_l2_pad_size[NETDEV_MAX_BURST]; + struct dp_packet *packet; + struct dp_netdev_pmd_thread *pmd = pmd_handle; + struct netdev_flow_key test_keys[NETDEV_MAX_BURST]; + + if (keys_size < cnt) { + miniflow_extract_func default_func = NULL; + atomic_uintptr_t *pmd_func = (void *)&pmd->miniflow_extract_opt; + atomic_store_relaxed(pmd_func, (uintptr_t) default_func); + VLOG_ERR("Invalid key size supplied, Key_size: %d less than" + " batch_size: %d", keys_size, cnt); + return 0; + } + + /* Run scalar miniflow_extract to get default result. */ + DP_PACKET_BATCH_FOR_EACH (i, packet, packets) { + pkt_metadata_init(&packet->md, in_port); + miniflow_extract(packet, &keys[i].mf); + + /* Store known good metadata to compare with optimized metadata. 
*/ + good_l2_5_ofs[i] = packet->l2_5_ofs; + good_l3_ofs[i] = packet->l3_ofs; + good_l4_ofs[i] = packet->l4_ofs; + good_l2_pad_size[i] = packet->l2_pad_size; + } + + uint32_t batch_failed = 0; + /* Iterate through each version of miniflow implementations. */ + for (int j = MFEX_IMPL_MAX; j < MFEX_IMPL_MAX; j++) { + if ((j < MFEX_IMPL_MAX) || (!mfex_impls[j].available)) { + continue; + } + + /* Reset keys and offsets before each implementation. */ + memset(test_keys, 0, keys_size * sizeof(struct netdev_flow_key)); + DP_PACKET_BATCH_FOR_EACH (i, packet, packets) { + dp_packet_reset_offsets(packet); + } + /* Call optimized miniflow for each batch of packet. */ + uint32_t hit_mask = mfex_impls[j].extract_func(packets, test_keys, + keys_size, in_port, + pmd_handle); + + /* Do a miniflow compare for bits, blocks and offsets for all the + * classified packets in the hitmask marked by set bits. */ + while (hit_mask) { + /* Index for the set bit. */ + uint32_t i = raw_ctz(hit_mask); + /* Set the index in hitmask to Zero. */ + hit_mask &= (hit_mask - 1); + + uint32_t failed = 0; + + struct ds log_msg = DS_EMPTY_INITIALIZER; + ds_put_format(&log_msg, "mfex autovalidator pkt %d\n", i); + + /* Check miniflow bits are equal. 
*/ + if ((keys[i].mf.map.bits[0] != test_keys[i].mf.map.bits[0]) || + (keys[i].mf.map.bits[1] != test_keys[i].mf.map.bits[1])) { + ds_put_format(&log_msg, "Good 0x%llx 0x%llx\tTest 0x%llx" + " 0x%llx\n", keys[i].mf.map.bits[0], + keys[i].mf.map.bits[1], + test_keys[i].mf.map.bits[0], + test_keys[i].mf.map.bits[1]); + failed = 1; + } + + if (!miniflow_equal(&keys[i].mf, &test_keys[i].mf)) { + uint32_t block_cnt = miniflow_n_values(&keys[i].mf); + ds_put_format(&log_msg, "Autovalidation blocks failed for %s" + " pkt %d\nGood hex:\n", mfex_impls[j].name, i); + ds_put_hex_dump(&log_msg, &keys[i].buf, block_cnt * 8, 0, + false); + ds_put_format(&log_msg, "Test hex:\n"); + ds_put_hex_dump(&log_msg, &test_keys[i].buf, block_cnt * 8, 0, + false); + failed = 1; + } + + packet = packets->packets[i]; + if ((packet->l2_pad_size != good_l2_pad_size[i]) || + (packet->l2_5_ofs != good_l2_5_ofs[i]) || + (packet->l3_ofs != good_l3_ofs[i]) || + (packet->l4_ofs != good_l4_ofs[i])) { + ds_put_format(&log_msg, "Autovalidation packet offsets failed" + " for %s pkt %d\n", mfex_impls[j].name, i); + ds_put_format(&log_msg, "Good offsets: l2_pad_size %u," + " l2_5_ofs : %u l3_ofs %u, l4_ofs %u\n", + good_l2_pad_size[i], good_l2_5_ofs[i], + good_l3_ofs[i], good_l4_ofs[i]); + ds_put_format(&log_msg, " Test offsets: l2_pad_size %u," + " l2_5_ofs : %u l3_ofs %u, l4_ofs %u\n", + packet->l2_pad_size, packet->l2_5_ofs, + packet->l3_ofs, packet->l4_ofs); + failed = 1; + } + + if (failed) { + VLOG_ERR("Autovalidation for %s failed in pkt %d," + " disabling.", mfex_impls[j].name, i); + VLOG_ERR("Autovalidation failure details:\n%s", + ds_cstr(&log_msg)); + batch_failed = 1; + } + ds_destroy(&log_msg); + } + } + + /* Having dumped the debug info for the batch, disable autovalidator. 
*/ + if (batch_failed) { + miniflow_extract_func default_func = NULL; + atomic_uintptr_t *pmd_func = (void *)&pmd->miniflow_extract_opt; + atomic_store_relaxed(pmd_func, (uintptr_t) default_func); + } + + /* Preserve packet correctness by storing the good offsets back in + * the packets. */ + DP_PACKET_BATCH_FOR_EACH (i, packet, packets) { + packet->l2_5_ofs = good_l2_5_ofs[i]; + packet->l3_ofs = good_l3_ofs[i]; + packet->l4_ofs = good_l4_ofs[i]; + packet->l2_pad_size = good_l2_pad_size[i]; + } + + /* Returning zero implies no packets were hit by autovalidation. This + * simplifies unit-tests as changing --enable-mfex-default-autovalidator + * would pass/fail. By always returning zero, autovalidator is a little + * slower, but we gain consistency in testing. The auto-validator is only + * meant to test different implementations against a batch of packets + * without incrementing hit counters. + */ + return 0; +} diff --git a/lib/dpif-netdev-private-extract.h b/lib/dpif-netdev-private-extract.h index 074b3ee16..10525c378 100644 --- a/lib/dpif-netdev-private-extract.h +++ b/lib/dpif-netdev-private-extract.h @@ -69,6 +69,7 @@ struct dpif_miniflow_extract_impl { * should always come before the generic AVX512-F version. */ enum dpif_miniflow_extract_impl_idx { + MFEX_IMPL_AUTOVALIDATOR, MFEX_IMPL_SCALAR, MFEX_IMPL_MAX }; @@ -102,4 +103,16 @@ int32_t dp_mfex_impl_set_default_by_name(const char *name); void dpif_miniflow_extract_init(void); +/* Retrieve the hitmask of the batch of packets which is obtained by comparing + * different miniflow implementations with linear miniflow extract. + * keys_size needs to be at least the size of the batch. + * On error, returns a zero. + * On success, returns the number of packets in the batch compared.
+ */ +uint32_t +dpif_miniflow_extract_autovalidator(struct dp_packet_batch *batch, + struct netdev_flow_key *keys, + uint32_t keys_size, odp_port_t in_port, + struct dp_netdev_pmd_thread *pmd_handle); + #endif /* MFEX_AVX512_EXTRACT */ diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c index 2043c9ba2..175d8699f 100644 --- a/lib/dpif-netdev.c +++ b/lib/dpif-netdev.c @@ -1156,8 +1156,8 @@ dpif_miniflow_extract_impl_set(struct unixctl_conn *conn, int argc, struct ds reply = DS_EMPTY_INITIALIZER; ds_put_format(&reply, "Miniflow implementation set to %s.\n", mfex_name); const char *reply_str = ds_cstr(&reply); - unixctl_command_reply(conn, reply_str); VLOG_INFO("%s", reply_str); + unixctl_command_reply(conn, reply_str); ds_destroy(&reply); }
From patchwork Tue Jul 6 13:11:42 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Ferriter, Cian" X-Patchwork-Id: 1501236 From: Cian Ferriter To: ovs-dev@openvswitch.org Date: Tue, 6 Jul 2021 14:11:42 +0100 Message-Id: <20210706131150.45513-4-cian.ferriter@intel.com> X-Mailer: git-send-email 2.32.0 In-Reply-To: <20210706131150.45513-1-cian.ferriter@intel.com> References: <20210706131150.45513-1-cian.ferriter@intel.com> MIME-Version: 1.0 Cc: i.maximets@ovn.org, fbl@sysclose.org, kumar.amber@intel.com Subject: [ovs-dev] [v6 03/11] dpif-netdev: Add study function to select the best mfex function
From: Kumar Amber The study function runs all the available implementations of miniflow_extract and makes a choice whose hitmask has maximum hits and sets the mfex to that function. Study can be run at runtime using the following command: $ ovs-appctl dpif-netdev/miniflow-parser-set study Signed-off-by: Kumar Amber Co-authored-by: Harry van Haaren Signed-off-by: Harry van Haaren --- v5: - fix review comments(Ian, Flavio, Eelco) - add Atomic set in study --- --- NEWS | 3 + lib/automake.mk | 1 + lib/dpif-netdev-extract-study.c | 124 ++++++++++++++++++++++++++++++ lib/dpif-netdev-private-extract.c | 19 ++++- lib/dpif-netdev-private-extract.h | 20 +++++ 5 files changed, 165 insertions(+), 2 deletions(-) create mode 100644 lib/dpif-netdev-extract-study.c diff --git a/NEWS b/NEWS index ccf9a0f1e..275aa1868 100644 --- a/NEWS +++ b/NEWS @@ -25,6 +25,9 @@ Post-v2.15.0 * Add command line option to switch between mfex function pointers. * Add miniflow extract auto-validator function to compare different miniflow extract implementations against default implementation. + * Add study function to miniflow function table which studies packet + and automatically chooses the best miniflow implementation for that + traffic. - ovs-ctl: * New option '--no-record-hostname' to disable hostname configuration in ovsdb on startup.
diff --git a/lib/automake.mk b/lib/automake.mk index 6657b9ae5..5223d321b 100644 --- a/lib/automake.mk +++ b/lib/automake.mk @@ -107,6 +107,7 @@ lib_libopenvswitch_la_SOURCES = \ lib/dp-packet.h \ lib/dp-packet.c \ lib/dpdk.h \ + lib/dpif-netdev-extract-study.c \ lib/dpif-netdev-lookup.h \ lib/dpif-netdev-lookup.c \ lib/dpif-netdev-lookup-autovalidator.c \ diff --git a/lib/dpif-netdev-extract-study.c b/lib/dpif-netdev-extract-study.c new file mode 100644 index 000000000..32b76bd03 --- /dev/null +++ b/lib/dpif-netdev-extract-study.c @@ -0,0 +1,124 @@ +/* + * Copyright (c) 2021 Intel. + * + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at: + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +#include +#include +#include +#include + +#include "dpif-netdev-private-thread.h" +#include "openvswitch/vlog.h" +#include "ovs-thread.h" + +VLOG_DEFINE_THIS_MODULE(dpif_mfex_extract_study); + +/* Max count of packets to be compared. */ +#define MFEX_MAX_COUNT (128) + +/* Initialized to the max count so the 50% hit threshold is non-zero. */ +static uint32_t mfex_study_pkts_count = MFEX_MAX_COUNT; + +/* Struct to hold miniflow study stats. */ +struct study_stats { + uint32_t pkt_count; + uint32_t impl_hitcount[MFEX_IMPL_MAX]; +}; + +/* Define per thread data to hold the study stats. */ +DEFINE_PER_THREAD_MALLOCED_DATA(struct study_stats *, study_stats); + +/* Allocate per thread PMD pointer space for study_stats.
*/ +static inline struct study_stats * +mfex_study_get_study_stats_ptr(void) +{ + struct study_stats *stats = study_stats_get(); + if (OVS_UNLIKELY(!stats)) { + stats = xzalloc(sizeof *stats); + study_stats_set_unsafe(stats); + } + return stats; +} + +uint32_t +mfex_study_traffic(struct dp_packet_batch *packets, + struct netdev_flow_key *keys, + uint32_t keys_size, odp_port_t in_port, + struct dp_netdev_pmd_thread *pmd_handle) +{ + uint32_t hitmask = 0; + uint32_t mask = 0; + struct dp_netdev_pmd_thread *pmd = pmd_handle; + struct dpif_miniflow_extract_impl *miniflow_funcs; + uint32_t impl_count = dpif_mfex_impl_info_get(&miniflow_funcs); + struct study_stats *stats = mfex_study_get_study_stats_ptr(); + + /* Run traffic optimized miniflow_extract to collect the hitmask + * to be compared after certain packets have been hit to choose + * the best miniflow_extract version for that traffic. + */ + for (int i = MFEX_IMPL_MAX; i < impl_count; i++) { + if (miniflow_funcs[i].available) { + hitmask = miniflow_funcs[i].extract_func(packets, keys, keys_size, + in_port, pmd_handle); + stats->impl_hitcount[i] += count_1bits(hitmask); + + /* If traffic is not classified then we don't overwrite the keys + * array in miniflow implementations, so it's safe to create a + * mask for all those packets whose miniflow has been created. + */ + mask |= hitmask; + } + } + stats->pkt_count += dp_packet_batch_size(packets); + + /* Choose the best implementation after a minimum number of packets + * have been processed. + */ + if (stats->pkt_count >= MFEX_MAX_COUNT) { + uint32_t best_func_index = MFEX_IMPL_MAX; + uint32_t max_hits = 0; + for (int i = MFEX_IMPL_MAX; i < impl_count; i++) { + if (stats->impl_hitcount[i] > max_hits) { + max_hits = stats->impl_hitcount[i]; + best_func_index = i; + } + } + + /* If 50% of the packets hit, enable the function.
*/ + if (max_hits >= (mfex_study_pkts_count / 2)) { + miniflow_extract_func mf_func = + miniflow_funcs[best_func_index].extract_func; + atomic_uintptr_t *pmd_func = (void *)&pmd->miniflow_extract_opt; + atomic_store_relaxed(pmd_func, (uintptr_t) mf_func); + VLOG_INFO("MFEX study chose impl %s: (hits %d/%d pkts)", + miniflow_funcs[best_func_index].name, max_hits, + stats->pkt_count); + } else { + /* Set the implementation to null for default miniflow. */ + miniflow_extract_func mf_func = + miniflow_funcs[MFEX_IMPL_SCALAR].extract_func; + atomic_uintptr_t *pmd_func = (void *)&pmd->miniflow_extract_opt; + atomic_store_relaxed(pmd_func, (uintptr_t) mf_func); + VLOG_INFO("Not enough packets matched (%d/%d), disabling" + " optimized MFEX.", max_hits, stats->pkt_count); + } + /* Reset stats so that study function can be called again + * for next traffic type and optimal function ptr can be + * chosen. + */ + memset(stats, 0, sizeof(struct study_stats)); + } + return mask; +} diff --git a/lib/dpif-netdev-private-extract.c b/lib/dpif-netdev-private-extract.c index 62170ff6c..eaddeceaf 100644 --- a/lib/dpif-netdev-private-extract.c +++ b/lib/dpif-netdev-private-extract.c @@ -47,6 +47,11 @@ static struct dpif_miniflow_extract_impl mfex_impls[] = { .probe = NULL, .extract_func = NULL, .name = "scalar", }, + + [MFEX_IMPL_STUDY] = { + .probe = NULL, + .extract_func = mfex_study_traffic, + .name = "study", }, }; BUILD_ASSERT_DECL(MFEX_IMPL_MAX >= ARRAY_SIZE(mfex_impls)); @@ -88,7 +93,6 @@ dp_mfex_impl_set_default_by_name(const char *name) { miniflow_extract_func new_default; - int32_t err = dp_mfex_impl_get_by_name(name, &new_default); if (!err) { @@ -146,7 +150,6 @@ dp_mfex_impl_get_by_name(const char *name, miniflow_extract_func *out_func) } uint32_t i; - for (i = 0; i < ARRAY_SIZE(mfex_impls); i++) { if (strcmp(mfex_impls[i].name, name) == 0) { /* Probe function is optional - so check it is set before exec. 
*/ @@ -163,6 +166,18 @@ dp_mfex_impl_get_by_name(const char *name, miniflow_extract_func *out_func) return -EINVAL; } +int32_t +dpif_mfex_impl_info_get(struct dpif_miniflow_extract_impl **out_ptr) +{ + if (out_ptr == NULL) { + return -EINVAL; + } + + *out_ptr = mfex_impls; + + return ARRAY_SIZE(mfex_impls); +} + uint32_t dpif_miniflow_extract_autovalidator(struct dp_packet_batch *packets, struct netdev_flow_key *keys, diff --git a/lib/dpif-netdev-private-extract.h b/lib/dpif-netdev-private-extract.h index 10525c378..cd46c94dd 100644 --- a/lib/dpif-netdev-private-extract.h +++ b/lib/dpif-netdev-private-extract.h @@ -71,6 +71,7 @@ struct dpif_miniflow_extract_impl { enum dpif_miniflow_extract_impl_idx { MFEX_IMPL_AUTOVALIDATOR, MFEX_IMPL_SCALAR, + MFEX_IMPL_STUDY, MFEX_IMPL_MAX }; @@ -94,6 +95,13 @@ miniflow_extract_func dp_mfex_impl_get_default(void); /* Overrides the default MFEX with the user set MFEX. */ int32_t dp_mfex_impl_set_default_by_name(const char *name); +/* Retrieve the array of miniflow implementations for iteration. + * On error, returns a negative number. + * On success, returns the size of the arrays pointed to by the out parameter. + */ +int32_t +dpif_mfex_impl_info_get(struct dpif_miniflow_extract_impl **out_ptr); + /* Initializes the available miniflow extract implementations by probing for * the CPU ISA requirements. As the runtime available CPU ISA does not change @@ -115,4 +123,16 @@ dpif_miniflow_extract_autovalidator(struct dp_packet_batch *batch, uint32_t keys_size, odp_port_t in_port, struct dp_netdev_pmd_thread *pmd_handle); +/* Retrieve the number of packets by studying packets using different miniflow + * implementations to choose the best implementation using the maximum hitmask + * count. + * On error, returns a zero for no packets. + * On success, returns mask of the packets hit. 
+ */ +uint32_t +mfex_study_traffic(struct dp_packet_batch *packets, + struct netdev_flow_key *keys, + uint32_t keys_size, odp_port_t in_port, + struct dp_netdev_pmd_thread *pmd_handle); + #endif /* MFEX_AVX512_EXTRACT */
From patchwork Tue Jul 6 13:11:43 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Ferriter, Cian" X-Patchwork-Id: 1501237 From: Cian Ferriter To: ovs-dev@openvswitch.org Date: Tue, 6 Jul 2021 14:11:43 +0100 Message-Id: <20210706131150.45513-5-cian.ferriter@intel.com> X-Mailer: git-send-email 2.32.0 In-Reply-To: <20210706131150.45513-1-cian.ferriter@intel.com> References: <20210706131150.45513-1-cian.ferriter@intel.com> MIME-Version: 1.0 Cc: i.maximets@ovn.org, fbl@sysclose.org, kumar.amber@intel.com Subject: [ovs-dev] [v6 04/11] docs/dpdk/bridge: add miniflow extract section.
From: Kumar Amber This commit adds a section to the dpdk/bridge.rst netdev documentation, detailing the added miniflow functionality. The newly added commands are documented, and sample output is provided. The use of auto-validator and special study function is also described in detail as well as running fuzzy tests. Signed-off-by: Kumar Amber Co-authored-by: Cian Ferriter Signed-off-by: Cian Ferriter Co-authored-by: Harry van Haaren Signed-off-by: Harry van Haaren --- v5: - fix review comments(Ian, Flavio, Eelco) --- --- Documentation/topics/dpdk/bridge.rst | 49 ++++++++++++++++++++++++++++ 1 file changed, 49 insertions(+) diff --git a/Documentation/topics/dpdk/bridge.rst b/Documentation/topics/dpdk/bridge.rst index 2d0850836..2901e8096 100644 --- a/Documentation/topics/dpdk/bridge.rst +++ b/Documentation/topics/dpdk/bridge.rst @@ -256,3 +256,52 @@ The following line should be seen in the configure output when the above option is used :: checking whether DPIF AVX512 is default implementation... yes + +Miniflow Extract +---------------- + +Miniflow extract (MFEX) performs parsing of the raw packets and extracts the +important header information into a compressed miniflow. This miniflow is +composed of bits and blocks where the bits signify which blocks are set or +have values whereas the blocks hold the metadata, ip, udp, vlan, etc. These +values are used by the datapath for switching decisions later. + +Most modern CPUs have SIMD capabilities. These SIMD instructions are able +to process a vector rather than act on a single data element. OVS provides +multiple implementations of miniflow extract. This allows the user to take +advantage of SIMD instructions like AVX512 to gain additional performance.
+ +A list of implementations can be obtained by the following command. The +command also shows whether the CPU supports each implementation :: + + $ ovs-appctl dpif-netdev/miniflow-parser-get + Available Optimized Miniflow Extracts: + autovalidator (available: True)(pmds: none) + scalar (available: True)(pmds: 3) + study (available: True)(pmds: none) + +An implementation can be selected manually by the following command :: + + $ ovs-appctl dpif-netdev/miniflow-parser-set study + +Also user can select the study implementation which studies the traffic for +a specific number of packets by applying all available implementations of +miniflow extract and then chooses the one with most optimal result for that +traffic pattern. + +Miniflow Extract Validation +~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +As multiple versions of miniflow extract can co-exist, each with different +CPU ISA optimizations, it is important to validate that they all give the +exact same results. To easily test all miniflow implementations, an +``autovalidator`` implementation of the miniflow exists. This implementation +runs all other available miniflow extract implementations, and verifies that +the results are identical. + +Running the OVS unit tests with the autovalidator enabled ensures all +implementations provide the same results.
+ +To set the Miniflow autovalidator, use this command :: + + $ ovs-appctl dpif-netdev/miniflow-parser-set autovalidator
From patchwork Tue Jul 6 13:11:44 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Ferriter, Cian" X-Patchwork-Id: 1501238 From: Cian Ferriter To: ovs-dev@openvswitch.org Date: Tue, 6 Jul 2021 14:11:44 +0100 Message-Id: <20210706131150.45513-6-cian.ferriter@intel.com> X-Mailer: git-send-email 2.32.0 In-Reply-To: <20210706131150.45513-1-cian.ferriter@intel.com> References: <20210706131150.45513-1-cian.ferriter@intel.com> MIME-Version: 1.0 Cc: i.maximets@ovn.org, fbl@sysclose.org, kumar.amber@intel.com Subject: [ovs-dev] [v6 05/11] dpif-netdev: Add configure to enable autovalidator at build time.
From: Kumar Amber This commit adds a new configure option that lets the user enable the autovalidator by default at build time, so that the unit tests run against it by default.
$ ./configure --enable-mfex-default-autovalidator Signed-off-by: Kumar Amber Co-authored-by: Harry van Haaren Signed-off-by: Harry van Haaren --- v5: - fix review comments(Ian, Flavio, Eelco) --- --- Documentation/topics/dpdk/bridge.rst | 5 +++++ NEWS | 5 +++-- acinclude.m4 | 16 ++++++++++++++++ configure.ac | 1 + lib/dpif-netdev-private-extract.c | 8 +++++++- 5 files changed, 32 insertions(+), 3 deletions(-) diff --git a/Documentation/topics/dpdk/bridge.rst b/Documentation/topics/dpdk/bridge.rst index 2901e8096..c79e108b7 100644 --- a/Documentation/topics/dpdk/bridge.rst +++ b/Documentation/topics/dpdk/bridge.rst @@ -305,3 +305,8 @@ implementations provide the same results. To set the Miniflow autovalidator, use this command :: $ ovs-appctl dpif-netdev/miniflow-parser-set autovalidator + +A compile time option is available in order to test it with the OVS unit +test suite. Use the following configure option :: + + $ ./configure --enable-mfex-default-autovalidator diff --git a/NEWS b/NEWS index 275aa1868..608c5a32f 100644 --- a/NEWS +++ b/NEWS @@ -25,9 +25,11 @@ Post-v2.15.0 * Add command line option to switch between mfex function pointers. * Add miniflow extract auto-validator function to compare different miniflow extract implementations against default implementation. - * Add study function to miniflow function table which studies packet + * Add study function to miniflow function table which studies packet and automatically chooses the best miniflow implementation for that traffic. + * Add build time configure command to enable auto-validatior as default + miniflow implementation at build time. - ovs-ctl: * New option '--no-record-hostname' to disable hostname configuration in ovsdb on startup. @@ -44,7 +46,6 @@ Post-v2.15.0 * New option '--election-timer' to the 'create-cluster' command to set the leader election timer during cluster creation. 
- v2.15.0 - 15 Feb 2021 --------------------- - OVSDB: diff --git a/acinclude.m4 b/acinclude.m4 index 5fbcd9872..e2704cfda 100644 --- a/acinclude.m4 +++ b/acinclude.m4 @@ -14,6 +14,22 @@ # See the License for the specific language governing permissions and # limitations under the License. +dnl Set OVS MFEX Autovalidator as default miniflow extract at compile time? +dnl This enables automatically running all unit tests with all MFEX +dnl implementations. +AC_DEFUN([OVS_CHECK_MFEX_AUTOVALIDATOR], [ + AC_ARG_ENABLE([mfex-default-autovalidator], + [AC_HELP_STRING([--enable-mfex-default-autovalidator], [Enable MFEX autovalidator as default miniflow_extract implementation.])], + [autovalidator=yes],[autovalidator=no]) + AC_MSG_CHECKING([whether MFEX Autovalidator is default implementation]) + if test "$autovalidator" != yes; then + AC_MSG_RESULT([no]) + else + OVS_CFLAGS="$OVS_CFLAGS -DMFEX_AUTOVALIDATOR_DEFAULT" + AC_MSG_RESULT([yes]) + fi +]) + dnl Set OVS DPCLS Autovalidator as default subtable search at compile time? dnl This enables automatically running all unit tests with all DPCLS dnl implementations. 
diff --git a/configure.ac b/configure.ac index e45685a6c..46c402892 100644 --- a/configure.ac +++ b/configure.ac @@ -186,6 +186,7 @@ OVS_ENABLE_SPARSE OVS_CTAGS_IDENTIFIERS OVS_CHECK_DPCLS_AUTOVALIDATOR OVS_CHECK_DPIF_AVX512_DEFAULT +OVS_CHECK_MFEX_AUTOVALIDATOR OVS_CHECK_BINUTILS_AVX512 AC_ARG_VAR(KARCH, [Kernel Architecture String]) diff --git a/lib/dpif-netdev-private-extract.c b/lib/dpif-netdev-private-extract.c index eaddeceaf..6ae91a24d 100644 --- a/lib/dpif-netdev-private-extract.c +++ b/lib/dpif-netdev-private-extract.c @@ -76,6 +76,12 @@ dpif_miniflow_extract_init(void) miniflow_extract_func dp_mfex_impl_get_default(void) { + +#ifdef MFEX_AUTOVALIDATOR_DEFAULT + VLOG_INFO("Default miniflow Extract implementation %s", + mfex_impls[MFEX_IMPL_AUTOVALIDATOR].name); + default_mfex_func = mfex_impls[MFEX_IMPL_AUTOVALIDATOR].extract_func; +#else /* For the first call, this will be NULL. Compute the compile time default. */ if (!default_mfex_func) { @@ -84,7 +90,7 @@ dp_mfex_impl_get_default(void) mfex_impls[MFEX_IMPL_SCALAR].name); default_mfex_func = mfex_impls[MFEX_IMPL_SCALAR].extract_func; } - +#endif return default_mfex_func; } From patchwork Tue Jul 6 13:11:45 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Ferriter, Cian" X-Patchwork-Id: 1501239 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=openvswitch.org (client-ip=140.211.166.137; helo=smtp4.osuosl.org; envelope-from=ovs-dev-bounces@openvswitch.org; receiver=) Received: from smtp4.osuosl.org (smtp4.osuosl.org [140.211.166.137]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4GK2yv0prvz9sS8 for ; Tue, 6 Jul 2021 23:12:23 
+1000 (AEST) Received: from localhost (localhost [127.0.0.1]) by smtp4.osuosl.org (Postfix) with ESMTP id 397AE40634; Tue, 6 Jul 2021 13:12:20 +0000 (UTC) X-Virus-Scanned: amavisd-new at osuosl.org Received: from smtp4.osuosl.org ([127.0.0.1]) by localhost (smtp4.osuosl.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id cUP7N8ykD2Cb; Tue, 6 Jul 2021 13:12:17 +0000 (UTC) Received: from lists.linuxfoundation.org (lf-lists.osuosl.org [140.211.9.56]) by smtp4.osuosl.org (Postfix) with ESMTPS id 3074D405C6; Tue, 6 Jul 2021 13:12:13 +0000 (UTC) Received: from lf-lists.osuosl.org (localhost [127.0.0.1]) by lists.linuxfoundation.org (Postfix) with ESMTP id D71E3C0010; Tue, 6 Jul 2021 13:12:12 +0000 (UTC) X-Original-To: ovs-dev@openvswitch.org Delivered-To: ovs-dev@lists.linuxfoundation.org Received: from smtp2.osuosl.org (smtp2.osuosl.org [IPv6:2605:bc80:3010::133]) by lists.linuxfoundation.org (Postfix) with ESMTP id 1A7BAC001D for ; Tue, 6 Jul 2021 13:12:12 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by smtp2.osuosl.org (Postfix) with ESMTP id EF6814040E for ; Tue, 6 Jul 2021 13:12:06 +0000 (UTC) X-Virus-Scanned: amavisd-new at osuosl.org Received: from smtp2.osuosl.org ([127.0.0.1]) by localhost (smtp2.osuosl.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id akdwZxPL_2Am for ; Tue, 6 Jul 2021 13:12:06 +0000 (UTC) X-Greylist: domain auto-whitelisted by SQLgrey-1.8.0 Received: from mga11.intel.com (mga11.intel.com [192.55.52.93]) by smtp2.osuosl.org (Postfix) with ESMTPS id 887DE4040D for ; Tue, 6 Jul 2021 13:12:03 +0000 (UTC) X-IronPort-AV: E=McAfee;i="6200,9189,10036"; a="206101873" X-IronPort-AV: E=Sophos;i="5.83,328,1616482800"; d="scan'208";a="206101873" Received: from fmsmga003.fm.intel.com ([10.253.24.29]) by fmsmga102.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 06 Jul 2021 06:12:03 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.83,328,1616482800"; d="scan'208";a="486258067" Received: from silpixa00399779.ir.intel.com 
(HELO silpixa00399779.ger.corp.intel.com) ([10.237.222.105]) by FMSMGA003.fm.intel.com with ESMTP; 06 Jul 2021 06:12:01 -0700 From: Cian Ferriter To: ovs-dev@openvswitch.org Date: Tue, 6 Jul 2021 14:11:45 +0100 Message-Id: <20210706131150.45513-7-cian.ferriter@intel.com> X-Mailer: git-send-email 2.32.0 In-Reply-To: <20210706131150.45513-1-cian.ferriter@intel.com> References: <20210706131150.45513-1-cian.ferriter@intel.com> MIME-Version: 1.0 Cc: i.maximets@ovn.org, fbl@sysclose.org, kumar.amber@intel.com Subject: [ovs-dev] [v6 06/11] dpif-netdev: Add packet count and core id paramters for study X-BeenThere: ovs-dev@openvswitch.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: ovs-dev-bounces@openvswitch.org Sender: "dev" From: Kumar Amber This commit introduces an additional command line parameter for the mfex study function. If the user provides a packet count, study uses it as the minimum number of packets that must be processed before an implementation is chosen; otherwise a default value is used. It also introduces a third parameter for choosing a particular PMD core.
$ ovs-appctl dpif-netdev/miniflow-parser-set study 500 3 Signed-off-by: Kumar Amber --- v5: - fix review comments(Ian, Flavio, Eelco) - introduce pmd core id parameter --- --- Documentation/topics/dpdk/bridge.rst | 35 +++++++++++- lib/dpif-netdev-extract-study.c | 23 +++++++- lib/dpif-netdev-private-extract.c | 2 +- lib/dpif-netdev-private-extract.h | 9 ++++ lib/dpif-netdev.c | 79 ++++++++++++++++++++++++++-- 5 files changed, 139 insertions(+), 9 deletions(-) diff --git a/Documentation/topics/dpdk/bridge.rst b/Documentation/topics/dpdk/bridge.rst index c79e108b7..8495687e8 100644 --- a/Documentation/topics/dpdk/bridge.rst +++ b/Documentation/topics/dpdk/bridge.rst @@ -282,12 +282,43 @@ command also shows whether the CPU supports each implementation :: An implementation can be selected manually by the following command :: - $ ovs-appctl dpif-netdev/miniflow-parser-set study + $ ovs-appctl dpif-netdev/miniflow-parser-set name [study_cnt] [core_id] + +The above command has two optional parameters, study_cnt and core_id. +The second parameter, study_cnt, is specific to study and sets how many +packets must be studied before the best implementation is chosen. The +third parameter, core_id, sets the miniflow extract function only on the +PMD thread running on that core. For any implementation other than study, +the second parameter study_cnt can be given as any arbitrary number and is +ignored. Also user can select the study implementation which studies the traffic for a specific number of packets by applying all available implementaions of miniflow extract and than chooses the one with most optimal result for that -traffic pattern. +traffic pattern. A user can also provide an additional packet count parameter +which is the minimum number of packets that OVS must study before choosing an +optimal implementation. If no packet count is provided then the default value +128 is chosen.
Also, as there is no synchronization point between threads, one +PMD thread might still be running a previous round, and may then decide on +earlier data. + +Study can be selected with packet count by the following command :: + + $ ovs-appctl dpif-netdev/miniflow-parser-set study 1024 + +Study can be selected with packet count and explicit PMD selection +by the following command :: + + $ ovs-appctl dpif-netdev/miniflow-parser-set study 1024 3 + +In the above command the last parameter is the CORE ID of the PMD +thread and this can also be used to explicitly set the miniflow +extraction function pointer on different PMD threads. + +Scalar can be selected on core 3 by the following command where +study count can be put as any arbitrary number :: + + $ ovs-appctl dpif-netdev/miniflow-parser-set scalar 0 3 Miniflow Extract Validation ~~~~~~~~~~~~~~~~~~~~~~~~~~~ diff --git a/lib/dpif-netdev-extract-study.c b/lib/dpif-netdev-extract-study.c index 32b76bd03..9b36d1974 100644 --- a/lib/dpif-netdev-extract-study.c +++ b/lib/dpif-netdev-extract-study.c @@ -51,6 +51,27 @@ mfex_study_get_study_stats_ptr(void) return stats; } +uint32_t mfex_set_study_pkt_cnt(uint32_t pkt_cmp_count, + const char *name) +{ + struct dpif_miniflow_extract_impl *miniflow_funcs; + dpif_mfex_impl_info_get(&miniflow_funcs); + + /* If a packet count is given and the implementation is study, use the + * requested count; otherwise fall back to the default count and report + * -EINVAL. + */ + if ((strcmp(miniflow_funcs[MFEX_IMPL_STUDY].name, name) == 0) && + (pkt_cmp_count != 0)) { + + mfex_study_pkts_count = pkt_cmp_count; + return 0; + } + + mfex_study_pkts_count = MFEX_MAX_COUNT; + return -EINVAL; +} + uint32_t mfex_study_traffic(struct dp_packet_batch *packets, struct netdev_flow_key *keys, @@ -86,7 +107,7 @@ mfex_study_traffic(struct dp_packet_batch *packets, /* Choose the best implementation after a minimum packets have been * processed.
 */ - if (stats->pkt_count >= MFEX_MAX_COUNT) { + if (stats->pkt_count >= mfex_study_pkts_count) { uint32_t best_func_index = MFEX_IMPL_MAX; uint32_t max_hits = 0; for (int i = MFEX_IMPL_MAX; i < impl_count; i++) { diff --git a/lib/dpif-netdev-private-extract.c b/lib/dpif-netdev-private-extract.c index 6ae91a24d..c1239b319 100644 --- a/lib/dpif-netdev-private-extract.c +++ b/lib/dpif-netdev-private-extract.c @@ -118,7 +118,7 @@ dp_mfex_impl_get(struct ds *reply, struct dp_netdev_pmd_thread **pmd_list, for (uint32_t i = 0; i < ARRAY_SIZE(mfex_impls); i++) { - ds_put_format(reply, " %s (available: %s)(pmds: ", + ds_put_format(reply, " %s (available: %s, pmds: ", mfex_impls[i].name, mfex_impls[i].available ? "True" : "False"); diff --git a/lib/dpif-netdev-private-extract.h b/lib/dpif-netdev-private-extract.h index cd46c94dd..a1f48d870 100644 --- a/lib/dpif-netdev-private-extract.h +++ b/lib/dpif-netdev-private-extract.h @@ -135,4 +135,13 @@ mfex_study_traffic(struct dp_packet_batch *packets, uint32_t keys_size, odp_port_t in_port, struct dp_netdev_pmd_thread *pmd_handle); +/* Sets the minimum number of packets that the study function must + * process before it chooses the optimal implementation. + * On success, returns 0. + * Otherwise, resets the count to the default and returns -EINVAL + * (converted to uint32_t). + */ +uint32_t mfex_set_study_pkt_cnt(uint32_t pkt_cmp_count, + const char *name); + #endif /* MFEX_AVX512_EXTRACT */ diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c index 175d8699f..6bcb24a73 100644 --- a/lib/dpif-netdev.c +++ b/lib/dpif-netdev.c @@ -1103,9 +1103,13 @@ dpif_miniflow_extract_impl_set(struct unixctl_conn *conn, int argc, const char *argv[], void *aux OVS_UNUSED) { /* This function requires just one parameter, the miniflow name. + * A second optional parameter can set the packet count to match in study. + * A third optional parameter, the PMD core ID, can also be given; it + * allows users to set the miniflow implementation on a particular PMD.
*/ const char *mfex_name = argv[1]; struct shash_node *node; + struct ds reply = DS_EMPTY_INITIALIZER; static const char *error_description[2] = { "Unknown miniflow implementation", @@ -1116,7 +1120,6 @@ dpif_miniflow_extract_impl_set(struct unixctl_conn *conn, int argc, int32_t err = dp_mfex_impl_set_default_by_name(mfex_name); if (err) { - struct ds reply = DS_EMPTY_INITIALIZER; ds_put_format(&reply, "Miniflow implementation not available: %s %s.\n", error_description[ (err == EINVAL) ], mfex_name); @@ -1128,6 +1131,44 @@ dpif_miniflow_extract_impl_set(struct unixctl_conn *conn, int argc, return; } + /* argv[2] is optional packet count, which user can provide along with + * study function to set the minimum packet that must be matched in order + * to choose the optimal function. */ + uint32_t pkt_cmp_count = 0; + uint32_t study_ret = 0; + + if ((argc == 3) || (argc == 4)) { + if (str_to_uint(argv[2], 10, &pkt_cmp_count)) { + study_ret = mfex_set_study_pkt_cnt(pkt_cmp_count, mfex_name); + } else { + study_ret = -EINVAL; + } + } else { + /* Default packet compare count when packets count not provided. */ + study_ret = mfex_set_study_pkt_cnt(0, mfex_name); + } + + uint32_t pmd_thread_specified = 0; + uint32_t pmd_thread_to_change = 0; + uint32_t pmd_thread_update_ok = 0; + if (argc == 4) { + if (str_to_uint(argv[3], 10, &pmd_thread_to_change)) { + pmd_thread_specified = 1; + } else { + /* argv[3] isn't even a uint. return without changing anything. */ + ovs_mutex_unlock(&dp_netdev_mutex); + ds_put_format(&reply, + "Error: Miniflow parser not changed, PMD thread argument" + " passed is not valid: '%s'. 
Pass a valid pmd thread ID.\n", + argv[3]); + const char *reply_str = ds_cstr(&reply); + VLOG_INFO("%s", reply_str); + unixctl_command_reply_error(conn, reply_str); + ds_destroy(&reply); + return; + } + } + SHASH_FOR_EACH (node, &dp_netdevs) { struct dp_netdev *dp = node->data; @@ -1142,8 +1183,14 @@ dpif_miniflow_extract_impl_set(struct unixctl_conn *conn, int argc, continue; } + if ((pmd_thread_specified) && + (pmd->core_id != pmd_thread_to_change)) { + continue; + } + /* Initialize MFEX function pointer to the newly configured * default. */ + pmd_thread_update_ok = 1; miniflow_extract_func default_func = dp_mfex_impl_get_default(); atomic_uintptr_t *pmd_func = (void *) &pmd->miniflow_extract_opt; atomic_store_relaxed(pmd_func, (uintptr_t) default_func); @@ -1152,9 +1199,30 @@ dpif_miniflow_extract_impl_set(struct unixctl_conn *conn, int argc, ovs_mutex_unlock(&dp_netdev_mutex); + /* If PMD thread was specified, but it wasn't found, return error. */ + if (pmd_thread_specified && !pmd_thread_update_ok) { + ds_put_format(&reply, + "Error: Miniflow parser not changed, PMD thread %d not in use," + " pass a valid pmd thread ID.\n", + pmd_thread_to_change); + const char *reply_str = ds_cstr(&reply); + VLOG_INFO("%s", reply_str); + unixctl_command_reply_error(conn, reply_str); + ds_destroy(&reply); + return; + } + /* Reply with success to command. 
 */ - struct ds reply = DS_EMPTY_INITIALIZER; - ds_put_format(&reply, "Miniflow implementation set to %s.\n", mfex_name); + ds_put_format(&reply, "Miniflow implementation set to %s", mfex_name); + if (pmd_thread_specified) { + ds_put_format(&reply, ", on pmd thread %"PRIu32, pmd_thread_to_change); + } + if (study_ret == 0) { + ds_put_format(&reply, ", studying %"PRIu32" packets", pkt_cmp_count); + } + + ds_put_format(&reply, "\n"); + const char *reply_str = ds_cstr(&reply); VLOG_INFO("%s", reply_str); unixctl_command_reply(conn, reply_str); @@ -1391,8 +1459,9 @@ dpif_netdev_init(void) 0, 0, dpif_netdev_impl_get, NULL); unixctl_command_register("dpif-netdev/miniflow-parser-set", - "miniflow implementation name", - 1, 1, dpif_miniflow_extract_impl_set, + "miniflow_implementation_name [study_pkt_cnt]" + " [pmd_core]", + 1, 3, dpif_miniflow_extract_impl_set, NULL); unixctl_command_register("dpif-netdev/miniflow-parser-get", "", 0, 0, dpif_miniflow_extract_impl_get, From patchwork Tue Jul 6 13:11:46 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Ferriter, Cian" X-Patchwork-Id: 1501241 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=openvswitch.org (client-ip=2605:bc80:3010::136; helo=smtp3.osuosl.org; envelope-from=ovs-dev-bounces@openvswitch.org; receiver=) Received: from smtp3.osuosl.org (smtp3.osuosl.org [IPv6:2605:bc80:3010::136]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4GK2zG4gz0z9sWq for ; Tue, 6 Jul 2021 23:12:42 +1000 (AEST) Received: from localhost (localhost [127.0.0.1]) by smtp3.osuosl.org (Postfix) with ESMTP id 3C72160A6D; Tue, 6 Jul 2021 13:12:40 +0000 (UTC) X-Virus-Scanned: 
amavisd-new at osuosl.org Received: from smtp3.osuosl.org ([127.0.0.1]) by localhost (smtp3.osuosl.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id umDGka4wAIIl; Tue, 6 Jul 2021 13:12:38 +0000 (UTC) Received: from lists.linuxfoundation.org (lf-lists.osuosl.org [140.211.9.56]) by smtp3.osuosl.org (Postfix) with ESMTPS id 3DDFF608F8; Tue, 6 Jul 2021 13:12:35 +0000 (UTC) Received: from lf-lists.osuosl.org (localhost [127.0.0.1]) by lists.linuxfoundation.org (Postfix) with ESMTP id 0CA24C001F; Tue, 6 Jul 2021 13:12:35 +0000 (UTC) X-Original-To: ovs-dev@openvswitch.org Delivered-To: ovs-dev@lists.linuxfoundation.org Received: from smtp2.osuosl.org (smtp2.osuosl.org [IPv6:2605:bc80:3010::133]) by lists.linuxfoundation.org (Postfix) with ESMTP id 1B711C0020 for ; Tue, 6 Jul 2021 13:12:33 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by smtp2.osuosl.org (Postfix) with ESMTP id 74A5B4041D for ; Tue, 6 Jul 2021 13:12:13 +0000 (UTC) X-Virus-Scanned: amavisd-new at osuosl.org Received: from smtp2.osuosl.org ([127.0.0.1]) by localhost (smtp2.osuosl.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id Pxb54iBNH1zF for ; Tue, 6 Jul 2021 13:12:09 +0000 (UTC) X-Greylist: domain auto-whitelisted by SQLgrey-1.8.0 Received: from mga11.intel.com (mga11.intel.com [192.55.52.93]) by smtp2.osuosl.org (Postfix) with ESMTPS id 8711C40412 for ; Tue, 6 Jul 2021 13:12:05 +0000 (UTC) X-IronPort-AV: E=McAfee;i="6200,9189,10036"; a="206101879" X-IronPort-AV: E=Sophos;i="5.83,328,1616482800"; d="scan'208";a="206101879" Received: from fmsmga003.fm.intel.com ([10.253.24.29]) by fmsmga102.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 06 Jul 2021 06:12:05 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.83,328,1616482800"; d="scan'208";a="486258077" Received: from silpixa00399779.ir.intel.com (HELO silpixa00399779.ger.corp.intel.com) ([10.237.222.105]) by FMSMGA003.fm.intel.com with ESMTP; 06 Jul 2021 06:12:03 -0700 From: Cian Ferriter To: ovs-dev@openvswitch.org 
Date: Tue, 6 Jul 2021 14:11:46 +0100 Message-Id: <20210706131150.45513-8-cian.ferriter@intel.com> X-Mailer: git-send-email 2.32.0 In-Reply-To: <20210706131150.45513-1-cian.ferriter@intel.com> References: <20210706131150.45513-1-cian.ferriter@intel.com> MIME-Version: 1.0 Cc: i.maximets@ovn.org, fbl@sysclose.org, kumar.amber@intel.com Subject: [ovs-dev] [v6 07/11] test/sytem-dpdk: Add unit test for mfex autovalidator X-BeenThere: ovs-dev@openvswitch.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: ovs-dev-bounces@openvswitch.org Sender: "dev" From: Kumar Amber Tests: 6: OVS-DPDK - MFEX Autovalidator 7: OVS-DPDK - MFEX Autovalidator Fuzzy Added a new directory to store the PCAP file used in the tests and a script to generate the fuzzy traffic type pcap to be used in fuzzy unit test. Signed-off-by: Kumar Amber --- v5: - fix review comments(Ian, Flavio, Eelco) - remove sleep from first test and added minor 5 sec sleep to fuzzy --- --- Documentation/topics/dpdk/bridge.rst | 55 +++++++++++++++++++++++++++ tests/automake.mk | 5 +++ tests/mfex_fuzzy.py | 32 ++++++++++++++++ tests/pcap/mfex_test | Bin 0 -> 416 bytes tests/system-dpdk.at | 46 ++++++++++++++++++++++ 5 files changed, 138 insertions(+) create mode 100755 tests/mfex_fuzzy.py create mode 100644 tests/pcap/mfex_test diff --git a/Documentation/topics/dpdk/bridge.rst b/Documentation/topics/dpdk/bridge.rst index 8495687e8..8a8ef3782 100644 --- a/Documentation/topics/dpdk/bridge.rst +++ b/Documentation/topics/dpdk/bridge.rst @@ -341,3 +341,58 @@ A compile time option is available in order to test it with the OVS unit test suite. 
Use the following configure option :: $ ./configure --enable-mfex-default-autovalidator + +Unit Test Miniflow Extract +++++++++++++++++++++++++++ + +A unit test can also be used to exercise the workflow mentioned above by +running the following test-case in tests/system-dpdk.at :: + + make check-dpdk TESTSUITEFLAGS='-k MFEX' + OVS-DPDK - MFEX Autovalidator + +The unit test uses multiple traffic types to test the correctness of the +implementations. + +Running Fuzzy test with Autovalidator ++++++++++++++++++++++++++++++++++++++ + +Fuzzy tests can also be done on miniflow extract with the help of +the autovalidator and Scapy. The steps below describe how to reproduce +the setup, with IP being fuzzed to generate packets. + +Scapy is used to create fuzzy IP packets and save them into a PCAP :: + + pkt = fuzz(Ether()/IP()/TCP()) + +Set the miniflow extract to autovalidator using :: + + $ ovs-appctl dpif-netdev/miniflow-parser-set autovalidator + +OVS is configured to receive the generated packets :: + + $ ovs-vsctl add-port br0 pcap0 -- \ + set Interface pcap0 type=dpdk \ + options:dpdk-devargs=net_pcap0,rx_pcap=fuzzy.pcap + +With this workflow, the autovalidator will ensure that all MFEX +implementations are classifying each packet in exactly the same way. +If an optimized MFEX implementation causes a different miniflow to be +generated, the autovalidator has ovs_assert and logging statements that +will report the issue. + +Unit Fuzzy test with Autovalidator ++++++++++++++++++++++++++++++++++++++ + +Before running the unit test, run the provided script :: + + tests/mfex_fuzzy.py + +This script generates a pcap with multiple types of fuzzed packets to be used +in the unit test-case below.
+ +A unit test can also be used to exercise the workflow mentioned above by +running the following test-case in tests/system-dpdk.at :: + + make check-dpdk TESTSUITEFLAGS='-k MFEX' + OVS-DPDK - MFEX Autovalidator Fuzzy diff --git a/tests/automake.mk b/tests/automake.mk index f45f8d76c..e94ccd27c 100644 --- a/tests/automake.mk +++ b/tests/automake.mk @@ -143,6 +143,11 @@ $(srcdir)/tests/fuzz-regression-list.at: tests/automake.mk echo "TEST_FUZZ_REGRESSION([$$basename])"; \ done > $@.tmp && mv $@.tmp $@ +EXTRA_DIST += $(MFEX_AUTOVALIDATOR_TESTS) +MFEX_AUTOVALIDATOR_TESTS = \ + tests/pcap/mfex_test \ + tests/mfex_fuzzy.py + OVSDB_CLUSTER_TESTSUITE_AT = \ tests/ovsdb-cluster-testsuite.at \ tests/ovsdb-execution.at \ diff --git a/tests/mfex_fuzzy.py b/tests/mfex_fuzzy.py new file mode 100755 index 000000000..a8051ba2b --- /dev/null +++ b/tests/mfex_fuzzy.py @@ -0,0 +1,32 @@ +#!/usr/bin/python3 +try: + from scapy.all import * +except ModuleNotFoundError as err: + print(str(err) + ": Scapy") +import sys +import os + +path = os.environ['OVS_DIR'] + "/tests/pcap/fuzzy" +pktdump = PcapWriter(path, append=False, sync=True) + +for i in range(0, 2000): + + # Generate random protocol bases, use a fuzz() over the combined packet for full fuzzing.
+ eth = Ether(src=RandMAC(), dst=RandMAC()) + vlan = Dot1Q() + ipv4 = IP(src=RandIP(), dst=RandIP()) + ipv6 = IPv6(src=RandIP6(), dst=RandIP6()) + udp = UDP() + tcp = TCP() + + # IPv4 packets with fuzzing + pktdump.write(fuzz(eth/ipv4/udp)) + pktdump.write(fuzz(eth/ipv4/tcp)) + pktdump.write(fuzz(eth/vlan/ipv4/udp)) + pktdump.write(fuzz(eth/vlan/ipv4/tcp)) + + # IPv6 packets with fuzzing + pktdump.write(fuzz(eth/ipv6/udp)) + pktdump.write(fuzz(eth/ipv6/tcp)) + pktdump.write(fuzz(eth/vlan/ipv6/udp)) + pktdump.write(fuzz(eth/vlan/ipv6/tcp)) \ No newline at end of file diff --git a/tests/pcap/mfex_test b/tests/pcap/mfex_test new file mode 100644 index 0000000000000000000000000000000000000000..1aac67b8d643ecb016c758cba4cc32212a80f52a GIT binary patch literal 416 zcmca|c+)~A1{MYw`2U}Qff2}QK`M68ITRa|G@yFii5$Gfk6YL%z>@uY&}o| z2s4N<1VH2&7y^V87$)XGOtD~MV$cFgfG~zBGGJ2#YtF$KST_NTIwYriok6N4Vm)gX-Q@c^{cp<7_5LgK^UuU{2>VS0RZ!RQ+EIW literal 0 HcmV?d00001 diff --git a/tests/system-dpdk.at b/tests/system-dpdk.at index 802895488..fcab92729 100644 --- a/tests/system-dpdk.at +++ b/tests/system-dpdk.at @@ -232,3 +232,49 @@ OVS_VSWITCHD_STOP(["\@does not exist. 
The Open vSwitch kernel module is probably \@EAL: No free hugepages reported in hugepages-1048576kB@d"]) AT_CLEANUP dnl -------------------------------------------------------------------------- + +dnl -------------------------------------------------------------------------- +dnl Add standard DPDK PHY port +AT_SETUP([OVS-DPDK - MFEX Autovalidator]) +AT_KEYWORDS([dpdk]) + +OVS_DPDK_START() + +dnl Add userspace bridge and attach it to OVS +AT_CHECK([ovs-vsctl add-br br0 -- set bridge br0 datapath_type=netdev]) +AT_CHECK([ovs-vsctl add-port br0 p1 -- set Interface p1 type=dpdk options:dpdk-devargs=net_pcap1,rx_pcap=$srcdir/pcap/mfex_test,infinite_rx=1], [], [stdout], [stderr]) +AT_CHECK([ovs-vsctl show], [], [stdout]) + + +AT_CHECK([ovs-appctl dpif-netdev/miniflow-parser-set autovalidator], [0], [dnl +Miniflow implementation set to autovalidator +]) + +dnl Clean up +AT_CHECK([ovs-vsctl del-port br0 p1], [], [stdout], [stderr]) +AT_CLEANUP +dnl -------------------------------------------------------------------------- + +dnl -------------------------------------------------------------------------- +dnl Add standard DPDK PHY port +AT_SETUP([OVS-DPDK - MFEX Autovalidator Fuzzy]) +AT_KEYWORDS([dpdk]) +AT_SKIP_IF([! 
pip3 list | grep scapy], [], []) +AT_CHECK([$PYTHON3 $srcdir/mfex_fuzzy.py], [], [stdout]) +OVS_DPDK_START() + +dnl Add userspace bridge and attach it to OVS +AT_CHECK([ovs-vsctl add-br br0 -- set bridge br0 datapath_type=netdev]) +AT_CHECK([ovs-vsctl add-port br0 p1 -- set Interface p1 type=dpdk options:dpdk-devargs=net_pcap1,rx_pcap=$srcdir/pcap/fuzzy,infinite_rx=1], [], [stdout], [stderr]) +AT_CHECK([ovs-vsctl show], [], [stdout]) + + +AT_CHECK([ovs-appctl dpif-netdev/miniflow-parser-set autovalidator], [0], [dnl +Miniflow implementation set to autovalidator +]) +sleep 5 + +dnl Clean up +AT_CHECK([ovs-vsctl del-port br0 p1], [], [stdout], [stderr]) +AT_CLEANUP +dnl -------------------------------------------------------------------------- From patchwork Tue Jul 6 13:11:47 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Ferriter, Cian" X-Patchwork-Id: 1501242 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=openvswitch.org (client-ip=140.211.166.138; helo=smtp1.osuosl.org; envelope-from=ovs-dev-bounces@openvswitch.org; receiver=) Received: from smtp1.osuosl.org (smtp1.osuosl.org [140.211.166.138]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4GK2zJ2pF4z9sWS for ; Tue, 6 Jul 2021 23:12:44 +1000 (AEST) Received: from localhost (localhost [127.0.0.1]) by smtp1.osuosl.org (Postfix) with ESMTP id 97EEF8397F; Tue, 6 Jul 2021 13:12:42 +0000 (UTC) X-Virus-Scanned: amavisd-new at osuosl.org Received: from smtp1.osuosl.org ([127.0.0.1]) by localhost (smtp1.osuosl.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 27iQ_E6nrChI; Tue, 6 Jul 2021 13:12:39 +0000 (UTC) Received: from 
lists.linuxfoundation.org (lf-lists.osuosl.org [140.211.9.56]) by smtp1.osuosl.org (Postfix) with ESMTPS id 1B757831F5; Tue, 6 Jul 2021 13:12:38 +0000 (UTC) Received: from lf-lists.osuosl.org (localhost [127.0.0.1]) by lists.linuxfoundation.org (Postfix) with ESMTP id F1CB2C001F; Tue, 6 Jul 2021 13:12:37 +0000 (UTC) X-Original-To: ovs-dev@openvswitch.org Delivered-To: ovs-dev@lists.linuxfoundation.org Received: from smtp2.osuosl.org (smtp2.osuosl.org [IPv6:2605:bc80:3010::133]) by lists.linuxfoundation.org (Postfix) with ESMTP id 1E652C0025 for ; Tue, 6 Jul 2021 13:12:36 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by smtp2.osuosl.org (Postfix) with ESMTP id 0EDBB403F6 for ; Tue, 6 Jul 2021 13:12:14 +0000 (UTC) X-Virus-Scanned: amavisd-new at osuosl.org Received: from smtp2.osuosl.org ([127.0.0.1]) by localhost (smtp2.osuosl.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id l0M9FcyhTXY8 for ; Tue, 6 Jul 2021 13:12:10 +0000 (UTC) X-Greylist: domain auto-whitelisted by SQLgrey-1.8.0 Received: from mga11.intel.com (mga11.intel.com [192.55.52.93]) by smtp2.osuosl.org (Postfix) with ESMTPS id 777DC4040D for ; Tue, 6 Jul 2021 13:12:07 +0000 (UTC) X-IronPort-AV: E=McAfee;i="6200,9189,10036"; a="206101885" X-IronPort-AV: E=Sophos;i="5.83,328,1616482800"; d="scan'208";a="206101885" Received: from fmsmga003.fm.intel.com ([10.253.24.29]) by fmsmga102.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 06 Jul 2021 06:12:07 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.83,328,1616482800"; d="scan'208";a="486258087" Received: from silpixa00399779.ir.intel.com (HELO silpixa00399779.ger.corp.intel.com) ([10.237.222.105]) by FMSMGA003.fm.intel.com with ESMTP; 06 Jul 2021 06:12:05 -0700 From: Cian Ferriter To: ovs-dev@openvswitch.org Date: Tue, 6 Jul 2021 14:11:47 +0100 Message-Id: <20210706131150.45513-9-cian.ferriter@intel.com> X-Mailer: git-send-email 2.32.0 In-Reply-To: <20210706131150.45513-1-cian.ferriter@intel.com> References: 
<20210706131150.45513-1-cian.ferriter@intel.com> MIME-Version: 1.0 Cc: i.maximets@ovn.org, fbl@sysclose.org, kumar.amber@intel.com Subject: [ovs-dev] [v6 08/11] dpif/stats: add miniflow extract opt hits counter X-BeenThere: ovs-dev@openvswitch.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: ovs-dev-bounces@openvswitch.org Sender: "dev" From: Harry van Haaren This commit adds a new counter to be displayed to the user when requesting datapath packet statistics. It counts the number of packets that are parsed and a miniflow built up from it by the optimized miniflow extract parsers. The ovs-appctl command "dpif-netdev/pmd-perf-show" now has an extra entry indicating if the optimized MFEX was hit: - MFEX Opt hits: 6786432 (100.0 %) Signed-off-by: Harry van Haaren --- v5: - fix review comments(Ian, Flavio, Eelco) --- --- lib/dpif-netdev-avx512.c | 2 ++ lib/dpif-netdev-perf.c | 3 +++ lib/dpif-netdev-perf.h | 1 + lib/dpif-netdev-unixctl.man | 1 + lib/dpif-netdev.c | 16 ++++++++++------ tests/pmd.at | 6 ++++-- 6 files changed, 21 insertions(+), 8 deletions(-) diff --git a/lib/dpif-netdev-avx512.c b/lib/dpif-netdev-avx512.c index 91fad92db..645b4c9b4 100644 --- a/lib/dpif-netdev-avx512.c +++ b/lib/dpif-netdev-avx512.c @@ -311,8 +311,10 @@ dp_netdev_input_outer_avx512(struct dp_netdev_pmd_thread *pmd, } /* At this point we don't return error anymore, so commit stats here. 
*/ + uint32_t mfex_hit = __builtin_popcountll(mf_mask); pmd_perf_update_counter(&pmd->perf_stats, PMD_STAT_RECV, batch_size); pmd_perf_update_counter(&pmd->perf_stats, PMD_STAT_PHWOL_HIT, phwol_hits); + pmd_perf_update_counter(&pmd->perf_stats, PMD_STAT_MFEX_OPT_HIT, mfex_hit); pmd_perf_update_counter(&pmd->perf_stats, PMD_STAT_EXACT_HIT, emc_hits); pmd_perf_update_counter(&pmd->perf_stats, PMD_STAT_SMC_HIT, smc_hits); pmd_perf_update_counter(&pmd->perf_stats, PMD_STAT_MASKED_HIT, diff --git a/lib/dpif-netdev-perf.c b/lib/dpif-netdev-perf.c index 7103a2d4d..d7676ea2b 100644 --- a/lib/dpif-netdev-perf.c +++ b/lib/dpif-netdev-perf.c @@ -247,6 +247,7 @@ pmd_perf_format_overall_stats(struct ds *str, struct pmd_perf_stats *s, " Rx packets: %12"PRIu64" (%.0f Kpps, %.0f cycles/pkt)\n" " Datapath passes: %12"PRIu64" (%.2f passes/pkt)\n" " - PHWOL hits: %12"PRIu64" (%5.1f %%)\n" + " - MFEX Opt hits: %12"PRIu64" (%5.1f %%)\n" " - EMC hits: %12"PRIu64" (%5.1f %%)\n" " - SMC hits: %12"PRIu64" (%5.1f %%)\n" " - Megaflow hits: %12"PRIu64" (%5.1f %%, %.2f " @@ -258,6 +259,8 @@ pmd_perf_format_overall_stats(struct ds *str, struct pmd_perf_stats *s, passes, rx_packets ? 1.0 * passes / rx_packets : 0, stats[PMD_STAT_PHWOL_HIT], 100.0 * stats[PMD_STAT_PHWOL_HIT] / passes, + stats[PMD_STAT_MFEX_OPT_HIT], + 100.0 * stats[PMD_STAT_MFEX_OPT_HIT] / passes, stats[PMD_STAT_EXACT_HIT], 100.0 * stats[PMD_STAT_EXACT_HIT] / passes, stats[PMD_STAT_SMC_HIT], diff --git a/lib/dpif-netdev-perf.h b/lib/dpif-netdev-perf.h index 8b1a52387..834c26260 100644 --- a/lib/dpif-netdev-perf.h +++ b/lib/dpif-netdev-perf.h @@ -57,6 +57,7 @@ extern "C" { enum pmd_stat_type { PMD_STAT_PHWOL_HIT, /* Packets that had a partial HWOL hit (phwol). */ + PMD_STAT_MFEX_OPT_HIT, /* Packets that had miniflow optimized match. */ PMD_STAT_EXACT_HIT, /* Packets that had an exact match (emc). */ PMD_STAT_SMC_HIT, /* Packets that had a sig match hit (SMC). */ PMD_STAT_MASKED_HIT, /* Packets that matched in the flow table. 
*/ diff --git a/lib/dpif-netdev-unixctl.man b/lib/dpif-netdev-unixctl.man index 83ce4f1c5..f2e536c15 100644 --- a/lib/dpif-netdev-unixctl.man +++ b/lib/dpif-netdev-unixctl.man @@ -136,6 +136,7 @@ pmd thread numa_id 0 core_id 1: Rx packets: 2399607 (2381 Kpps, 848 cycles/pkt) Datapath passes: 3599415 (1.50 passes/pkt) - PHWOL hits: 0 ( 0.0 %) + - MFEX Opt hits: 4570133 ( 99.5 %) - EMC hits: 336472 ( 9.3 %) - SMC hits: 0 ( 0.0 %) - Megaflow hits: 3262943 ( 90.7 %, 1.00 subtbl lookups/hit) diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c index 6bcb24a73..08ce06e3f 100644 --- a/lib/dpif-netdev.c +++ b/lib/dpif-netdev.c @@ -649,6 +649,7 @@ pmd_info_show_stats(struct ds *reply, " packet recirculations: %"PRIu64"\n" " avg. datapath passes per packet: %.02f\n" " phwol hits: %"PRIu64"\n" + " mfex opt hits: %"PRIu64"\n" " emc hits: %"PRIu64"\n" " smc hits: %"PRIu64"\n" " megaflow hits: %"PRIu64"\n" @@ -658,10 +659,9 @@ pmd_info_show_stats(struct ds *reply, " avg. packets per output batch: %.02f\n", total_packets, stats[PMD_STAT_RECIRC], passes_per_pkt, stats[PMD_STAT_PHWOL_HIT], - stats[PMD_STAT_EXACT_HIT], - stats[PMD_STAT_SMC_HIT], - stats[PMD_STAT_MASKED_HIT], lookups_per_hit, - stats[PMD_STAT_MISS], stats[PMD_STAT_LOST], + stats[PMD_STAT_MFEX_OPT_HIT], stats[PMD_STAT_EXACT_HIT], + stats[PMD_STAT_SMC_HIT], stats[PMD_STAT_MASKED_HIT], + lookups_per_hit, stats[PMD_STAT_MISS], stats[PMD_STAT_LOST], packets_per_batch); if (total_cycles == 0) { @@ -6967,7 +6967,7 @@ dfc_processing(struct dp_netdev_pmd_thread *pmd, bool md_is_valid, odp_port_t port_no) { struct netdev_flow_key *key = &keys[0]; - size_t n_missed = 0, n_emc_hit = 0, n_phwol_hit = 0; + size_t n_missed = 0, n_emc_hit = 0, n_phwol_hit = 0, n_mfex_opt_hit = 0; struct dfc_cache *cache = &pmd->flow_cache; struct dp_packet *packet; struct dp_packet_batch single_packet; @@ -7041,7 +7041,9 @@ dfc_processing(struct dp_netdev_pmd_thread *pmd, mf_ret = mfex_func(&single_packet, key, 1, port_no, pmd); /* Fallback to 
original miniflow_extract if there is a miss. */ - if (!mf_ret) { + if (mf_ret) { + n_mfex_opt_hit++; + } else { miniflow_extract(packet, &key->mf); } } else { @@ -7095,6 +7097,8 @@ dfc_processing(struct dp_netdev_pmd_thread *pmd, *n_flows = map_cnt; pmd_perf_update_counter(&pmd->perf_stats, PMD_STAT_PHWOL_HIT, n_phwol_hit); + pmd_perf_update_counter(&pmd->perf_stats, PMD_STAT_MFEX_OPT_HIT, + n_mfex_opt_hit); pmd_perf_update_counter(&pmd->perf_stats, PMD_STAT_EXACT_HIT, n_emc_hit); if (!smc_enable_db) { diff --git a/tests/pmd.at b/tests/pmd.at index 61fc6257c..d3de86f09 100644 --- a/tests/pmd.at +++ b/tests/pmd.at @@ -202,12 +202,13 @@ dummy@ovs-dummy: hit:0 missed:0 p0 7/1: (dummy-pmd: configured_rx_queues=4, configured_tx_queues=, requested_rx_queues=4, requested_tx_queues=) ]) -AT_CHECK([ovs-appctl dpif-netdev/pmd-stats-show | sed SED_NUMA_CORE_PATTERN | sed '/cycles/d' | grep pmd -A 10], [0], [dnl +AT_CHECK([ovs-appctl dpif-netdev/pmd-stats-show | sed SED_NUMA_CORE_PATTERN | sed '/cycles/d' | grep pmd -A 11], [0], [dnl pmd thread numa_id core_id : packets received: 0 packet recirculations: 0 avg. datapath passes per packet: 0.00 phwol hits: 0 + mfex opt hits: 0 emc hits: 0 smc hits: 0 megaflow hits: 0 @@ -234,12 +235,13 @@ AT_CHECK([cat ovs-vswitchd.log | filter_flow_install | strip_xout], [0], [dnl recirc_id(0),in_port(1),packet_type(ns=0,id=0),eth(src=50:54:00:00:00:77,dst=50:54:00:00:01:78),eth_type(0x0800),ipv4(frag=no), actions: ]) -AT_CHECK([ovs-appctl dpif-netdev/pmd-stats-show | sed SED_NUMA_CORE_PATTERN | sed '/cycles/d' | grep pmd -A 10], [0], [dnl +AT_CHECK([ovs-appctl dpif-netdev/pmd-stats-show | sed SED_NUMA_CORE_PATTERN | sed '/cycles/d' | grep pmd -A 11], [0], [dnl pmd thread numa_id core_id : packets received: 20 packet recirculations: 0 avg. 
datapath passes per packet: 1.00 phwol hits: 0 + mfex opt hits: 0 emc hits: 19 smc hits: 0 megaflow hits: 0 From patchwork Tue Jul 6 13:11:48 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Ferriter, Cian" X-Patchwork-Id: 1501240 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=openvswitch.org (client-ip=2605:bc80:3010::137; helo=smtp4.osuosl.org; envelope-from=ovs-dev-bounces@openvswitch.org; receiver=) Received: from smtp4.osuosl.org (smtp4.osuosl.org [IPv6:2605:bc80:3010::137]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4GK2zC5G9Rz9sWq for ; Tue, 6 Jul 2021 23:12:39 +1000 (AEST) Received: from localhost (localhost [127.0.0.1]) by smtp4.osuosl.org (Postfix) with ESMTP id EC01D40562; Tue, 6 Jul 2021 13:12:36 +0000 (UTC) X-Virus-Scanned: amavisd-new at osuosl.org Received: from smtp4.osuosl.org ([127.0.0.1]) by localhost (smtp4.osuosl.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 4kAYws53QQyT; Tue, 6 Jul 2021 13:12:33 +0000 (UTC) Received: from lists.linuxfoundation.org (lf-lists.osuosl.org [IPv6:2605:bc80:3010:104::8cd3:938]) by smtp4.osuosl.org (Postfix) with ESMTPS id 444CB40583; Tue, 6 Jul 2021 13:12:31 +0000 (UTC) Received: from lf-lists.osuosl.org (localhost [127.0.0.1]) by lists.linuxfoundation.org (Postfix) with ESMTP id 13C27C0010; Tue, 6 Jul 2021 13:12:31 +0000 (UTC) X-Original-To: ovs-dev@openvswitch.org Delivered-To: ovs-dev@lists.linuxfoundation.org Received: from smtp2.osuosl.org (smtp2.osuosl.org [IPv6:2605:bc80:3010::133]) by lists.linuxfoundation.org (Postfix) with ESMTP id 1BE2CC0010 for ; Tue, 6 Jul 2021 13:12:30 +0000 (UTC) Received: from localhost 
(localhost [127.0.0.1]) by smtp2.osuosl.org (Postfix) with ESMTP id B5F0640022 for ; Tue, 6 Jul 2021 13:12:12 +0000 (UTC) X-Virus-Scanned: amavisd-new at osuosl.org Received: from smtp2.osuosl.org ([127.0.0.1]) by localhost (smtp2.osuosl.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id jYVa6kmXqu1W for ; Tue, 6 Jul 2021 13:12:11 +0000 (UTC) X-Greylist: domain auto-whitelisted by SQLgrey-1.8.0 Received: from mga11.intel.com (mga11.intel.com [192.55.52.93]) by smtp2.osuosl.org (Postfix) with ESMTPS id 5133C4041D for ; Tue, 6 Jul 2021 13:12:09 +0000 (UTC) X-IronPort-AV: E=McAfee;i="6200,9189,10036"; a="206101889" X-IronPort-AV: E=Sophos;i="5.83,328,1616482800"; d="scan'208";a="206101889" Received: from fmsmga003.fm.intel.com ([10.253.24.29]) by fmsmga102.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 06 Jul 2021 06:12:09 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.83,328,1616482800"; d="scan'208";a="486258100" Received: from silpixa00399779.ir.intel.com (HELO silpixa00399779.ger.corp.intel.com) ([10.237.222.105]) by FMSMGA003.fm.intel.com with ESMTP; 06 Jul 2021 06:12:07 -0700 From: Cian Ferriter To: ovs-dev@openvswitch.org Date: Tue, 6 Jul 2021 14:11:48 +0100 Message-Id: <20210706131150.45513-10-cian.ferriter@intel.com> X-Mailer: git-send-email 2.32.0 In-Reply-To: <20210706131150.45513-1-cian.ferriter@intel.com> References: <20210706131150.45513-1-cian.ferriter@intel.com> MIME-Version: 1.0 Cc: i.maximets@ovn.org, fbl@sysclose.org, kumar.amber@intel.com Subject: [ovs-dev] [v6 09/11] dpdk: add additional CPU ISA detection strings X-BeenThere: ovs-dev@openvswitch.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: ovs-dev-bounces@openvswitch.org Sender: "dev" From: Harry van Haaren This commit enables OVS to check at runtime for more detailed AVX512 capabilities, specifically the Byte and Word (BW) extensions and the Vector Bit Manipulation Instructions (VBMI).
These instructions will be used in the CPU ISA optimized implementations of traffic profile aware miniflow extract. Signed-off-by: Harry van Haaren Acked-by: Eelco Chaudron Acked-by: Flavio Leitner --- NEWS | 1 + lib/dpdk.c | 2 ++ 2 files changed, 3 insertions(+) diff --git a/NEWS b/NEWS index 608c5a32f..502b41e3c 100644 --- a/NEWS +++ b/NEWS @@ -30,6 +30,7 @@ Post-v2.15.0 traffic. * Add build time configure command to enable auto-validatior as default miniflow implementation at build time. + * Cache results for CPU ISA checks, reduces overhead on repeated lookups. - ovs-ctl: * New option '--no-record-hostname' to disable hostname configuration in ovsdb on startup. diff --git a/lib/dpdk.c b/lib/dpdk.c index 567fd28d4..bcdd575e9 100644 --- a/lib/dpdk.c +++ b/lib/dpdk.c @@ -665,6 +665,8 @@ dpdk_get_cpu_has_isa(const char *arch, const char *feature) #if __x86_64__ /* CPU flags only defined for the architecture that support it. */ CHECK_CPU_FEATURE(feature, "avx512f", RTE_CPUFLAG_AVX512F); + CHECK_CPU_FEATURE(feature, "avx512bw", RTE_CPUFLAG_AVX512BW); + CHECK_CPU_FEATURE(feature, "avx512vbmi", RTE_CPUFLAG_AVX512VBMI); CHECK_CPU_FEATURE(feature, "avx512vpopcntdq", RTE_CPUFLAG_AVX512VPOPCNTDQ); CHECK_CPU_FEATURE(feature, "bmi2", RTE_CPUFLAG_BMI2); #endif From patchwork Tue Jul 6 13:11:49 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Ferriter, Cian" X-Patchwork-Id: 1501243 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=openvswitch.org (client-ip=2605:bc80:3010::138; helo=smtp1.osuosl.org; envelope-from=ovs-dev-bounces@openvswitch.org; receiver=) Received: from smtp1.osuosl.org (smtp1.osuosl.org [IPv6:2605:bc80:3010::138]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest 
SHA256) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4GK2zg6fqlz9sWS for ; Tue, 6 Jul 2021 23:13:02 +1000 (AEST) Received: from localhost (localhost [127.0.0.1]) by smtp1.osuosl.org (Postfix) with ESMTP id 9541083BC1; Tue, 6 Jul 2021 13:12:59 +0000 (UTC) X-Virus-Scanned: amavisd-new at osuosl.org Received: from smtp1.osuosl.org ([127.0.0.1]) by localhost (smtp1.osuosl.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 0ak4T2wPQi_n; Tue, 6 Jul 2021 13:12:55 +0000 (UTC) Received: from lists.linuxfoundation.org (lf-lists.osuosl.org [140.211.9.56]) by smtp1.osuosl.org (Postfix) with ESMTPS id 29D3983B5C; Tue, 6 Jul 2021 13:12:54 +0000 (UTC) Received: from lf-lists.osuosl.org (localhost [127.0.0.1]) by lists.linuxfoundation.org (Postfix) with ESMTP id D5D7BC0010; Tue, 6 Jul 2021 13:12:53 +0000 (UTC) X-Original-To: ovs-dev@openvswitch.org Delivered-To: ovs-dev@lists.linuxfoundation.org Received: from smtp2.osuosl.org (smtp2.osuosl.org [IPv6:2605:bc80:3010::133]) by lists.linuxfoundation.org (Postfix) with ESMTP id 27B41C000E for ; Tue, 6 Jul 2021 13:12:53 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by smtp2.osuosl.org (Postfix) with ESMTP id 337E0404D5 for ; Tue, 6 Jul 2021 13:12:18 +0000 (UTC) X-Virus-Scanned: amavisd-new at osuosl.org Received: from smtp2.osuosl.org ([127.0.0.1]) by localhost (smtp2.osuosl.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id PDRsgik8NuUb for ; Tue, 6 Jul 2021 13:12:14 +0000 (UTC) X-Greylist: domain auto-whitelisted by SQLgrey-1.8.0 Received: from mga11.intel.com (mga11.intel.com [192.55.52.93]) by smtp2.osuosl.org (Postfix) with ESMTPS id 8045C4044A for ; Tue, 6 Jul 2021 13:12:11 +0000 (UTC) X-IronPort-AV: E=McAfee;i="6200,9189,10036"; a="206101894" X-IronPort-AV: E=Sophos;i="5.83,328,1616482800"; d="scan'208";a="206101894" Received: from fmsmga003.fm.intel.com ([10.253.24.29]) by fmsmga102.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 06 Jul 2021 06:12:11 -0700 
X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.83,328,1616482800"; d="scan'208";a="486258126" Received: from silpixa00399779.ir.intel.com (HELO silpixa00399779.ger.corp.intel.com) ([10.237.222.105]) by FMSMGA003.fm.intel.com with ESMTP; 06 Jul 2021 06:12:09 -0700 From: Cian Ferriter To: ovs-dev@openvswitch.org Date: Tue, 6 Jul 2021 14:11:49 +0100 Message-Id: <20210706131150.45513-11-cian.ferriter@intel.com> X-Mailer: git-send-email 2.32.0 In-Reply-To: <20210706131150.45513-1-cian.ferriter@intel.com> References: <20210706131150.45513-1-cian.ferriter@intel.com> MIME-Version: 1.0 Cc: i.maximets@ovn.org, fbl@sysclose.org, kumar.amber@intel.com Subject: [ovs-dev] [v6 10/11] dpif-netdev/mfex: Add AVX512 based optimized miniflow extract X-BeenThere: ovs-dev@openvswitch.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: ovs-dev-bounces@openvswitch.org Sender: "dev" From: Harry van Haaren This commit adds AVX512 implementations of miniflow extract. By using the 64 bytes available in an AVX512 register, it is possible to convert a packet to a miniflow data-structure in a small number of instructions. The implementation here probes for Ether()/IP()/UDP() traffic, and builds the appropriate miniflow data-structure for packets that match the probe. The implementation here is checked against the scalar implementation by the miniflow extract autovalidator, so its correctness can be easily tested and verified. Note that this commit is designed to allow new traffic profiles to be added in a scalable way, without code duplication for each traffic profile.
Signed-off-by: Harry van Haaren Signed-off-by: Harry van Haaren --- v5: - fix review comments (Ian, Flavio, Eelco) - include assert for flow abi change - include assert for offset changes --- --- lib/automake.mk | 1 + lib/dpif-netdev-extract-avx512.c | 446 ++++++++++++++++++++++++++++++ lib/dpif-netdev-extract-study.c | 6 +- lib/dpif-netdev-private-extract.c | 16 +- lib/dpif-netdev-private-extract.h | 22 ++ 5 files changed, 487 insertions(+), 4 deletions(-) create mode 100644 lib/dpif-netdev-extract-avx512.c diff --git a/lib/automake.mk b/lib/automake.mk index 5223d321b..978ab36c1 100644 --- a/lib/automake.mk +++ b/lib/automake.mk @@ -39,6 +39,7 @@ lib_libopenvswitchavx512_la_CFLAGS = \ $(AM_CFLAGS) lib_libopenvswitchavx512_la_SOURCES = \ lib/dpif-netdev-lookup-avx512-gather.c \ + lib/dpif-netdev-extract-avx512.c \ lib/dpif-netdev-avx512.c lib_libopenvswitchavx512_la_LDFLAGS = \ -static diff --git a/lib/dpif-netdev-extract-avx512.c b/lib/dpif-netdev-extract-avx512.c new file mode 100644 index 000000000..887caa6f2 --- /dev/null +++ b/lib/dpif-netdev-extract-avx512.c @@ -0,0 +1,446 @@ +/* + * Copyright (c) 2021 Intel. + * + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at: + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +/* + * AVX512 Miniflow Extract. + * + * This file contains optimized implementations of miniflow_extract() + * for specific common traffic patterns.
The optimizations allow for + * quick probing of a specific packet type, and if a match with a specific + * type is found, a shuffle like procedure builds up the required miniflow. + * + * Process + * --------- + * + * The procedure is to classify the packet based on the traffic type + * using predefined bit-masks and arrange the packet header data using shuffle + * instructions to a pre-defined place as required by the miniflow. + * This eliminates the if-else ladder to identify the packet data and add data + * as per protocol which is present. + */ + +#ifdef __x86_64__ +/* Sparse cannot handle the AVX512 instructions. */ +#if !defined(__CHECKER__) + +#include +#include +#include +#include +#include + +#include "flow.h" +#include "dpdk.h" + +#include "dpif-netdev-private-dpcls.h" +#include "dpif-netdev-private-extract.h" +#include "dpif-netdev-private-flow.h" + +/* AVX512-BW level permutex2var_epi8 emulation. */ +static inline __m512i +__attribute__((target("avx512bw"))) +_mm512_maskz_permutex2var_epi8_skx(__mmask64 k_mask, + __m512i v_data_0, + __m512i v_shuf_idxs, + __m512i v_data_1) +{ + /* Manipulate shuffle indexes for u16 size. */ + __mmask64 k_mask_odd_lanes = 0xAAAAAAAAAAAAAAAA; + /* Clear away ODD lane bytes. Cannot be done above due to no u8 shift. */ + __m512i v_shuf_idx_evn = _mm512_mask_blend_epi8(k_mask_odd_lanes, + v_shuf_idxs, + _mm512_setzero_si512()); + v_shuf_idx_evn = _mm512_srli_epi16(v_shuf_idx_evn, 1); + + __m512i v_shuf_idx_odd = _mm512_srli_epi16(v_shuf_idxs, 9); + + /* Shuffle each half at 16-bit width. */ + __m512i v_shuf1 = _mm512_permutex2var_epi16(v_data_0, v_shuf_idx_evn, + v_data_1); + __m512i v_shuf2 = _mm512_permutex2var_epi16(v_data_0, v_shuf_idx_odd, + v_data_1); + + /* Find if the shuffle index was odd, via mask and compare. */ + uint16_t index_odd_mask = 0x1; + const __m512i v_index_mask_u16 = _mm512_set1_epi16(index_odd_mask); + + /* EVEN lanes, find if u8 index was odd, result as u16 bitmask.
*/ + __m512i v_idx_even_masked = _mm512_and_si512(v_shuf_idxs, + v_index_mask_u16); + __mmask32 evn_rotate_mask = _mm512_cmpeq_epi16_mask(v_idx_even_masked, + v_index_mask_u16); + + /* ODD lanes, find if u8 index was odd, result as u16 bitmask. */ + __m512i v_shuf_idx_srli8 = _mm512_srli_epi16(v_shuf_idxs, 8); + __m512i v_idx_odd_masked = _mm512_and_si512(v_shuf_idx_srli8, + v_index_mask_u16); + __mmask32 odd_rotate_mask = _mm512_cmpeq_epi16_mask(v_idx_odd_masked, + v_index_mask_u16); + odd_rotate_mask = ~odd_rotate_mask; + + /* Rotate and blend results from each index. */ + __m512i v_shuf_res_evn = _mm512_mask_srli_epi16(v_shuf1, evn_rotate_mask, + v_shuf1, 8); + __m512i v_shuf_res_odd = _mm512_mask_slli_epi16(v_shuf2, odd_rotate_mask, + v_shuf2, 8); + + /* If shuffle index was odd, blend shifted version. */ + __m512i v_shuf_result = _mm512_mask_blend_epi8(k_mask_odd_lanes, + v_shuf_res_evn, v_shuf_res_odd); + + __m512i v_zeros = _mm512_setzero_si512(); + __m512i v_result_kmskd = _mm512_mask_blend_epi8(k_mask, v_zeros, + v_shuf_result); + + return v_result_kmskd; +} + +/* Wrapper function required to enable ISA. */ +static inline __m512i +__attribute__((__target__("avx512vbmi"))) +_mm512_maskz_permutexvar_epi8_wrap(__mmask64 kmask, __m512i idx, __m512i a) +{ + return _mm512_maskz_permutexvar_epi8(kmask, idx, a); +} + + +/* This file contains optimized implementations of miniflow_extract() + * for specific common traffic patterns. The optimizations allow for + * quick probing of a specific packet type, and if a match with a specific + * type is found, a shuffle like procedure builds up the required miniflow. + * + * The functionality here can be easily auto-validated and tested against the + * scalar miniflow_extract() function. As such, manual review of the code by + * the community (although welcome) is not required. Confidence in the + * correctness of the code can be had from the autovalidation. + */ + +/* Generator for EtherType masks and values. 
*/ +#define PATTERN_ETHERTYPE_GEN(type_b0, type_b1) \ + 0, 0, 0, 0, 0, 0, /* Ether MAC DST */ \ + 0, 0, 0, 0, 0, 0, /* Ether MAC SRC */ \ + type_b0, type_b1, /* EtherType */ + +#define PATTERN_ETHERTYPE_MASK PATTERN_ETHERTYPE_GEN(0xFF, 0xFF) +#define PATTERN_ETHERTYPE_IPV4 PATTERN_ETHERTYPE_GEN(0x08, 0x00) + +/* Generator for checking IPv4 ver, ihl, and proto */ +#define PATTERN_IPV4_GEN(VER_IHL, FLAG_OFF_B0, FLAG_OFF_B1, PROTO) \ + VER_IHL, /* Version and IHL */ \ + 0, 0, 0, /* DSCP, ECN, Total Length */ \ + 0, 0, /* Identification */ \ + /* Flags/Fragment offset: don't match MoreFrag (MF) or FragOffset */ \ + FLAG_OFF_B0, FLAG_OFF_B1, \ + 0, /* TTL */ \ + PROTO, /* Protocol */ \ + 0, 0, /* Header checksum */ \ + 0, 0, 0, 0, /* Src IP */ \ + 0, 0, 0, 0, /* Dst IP */ + +#define PATTERN_IPV4_MASK PATTERN_IPV4_GEN(0xFF, 0xFE, 0xFF, 0xFF) +#define PATTERN_IPV4_UDP PATTERN_IPV4_GEN(0x45, 0, 0, 0x11) +#define PATTERN_IPV4_TCP PATTERN_IPV4_GEN(0x45, 0, 0, 0x06) + +#define NU 0 +#define PATTERN_IPV4_UDP_SHUFFLE \ + 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, NU, NU, /* Ether */ \ + 26, 27, 28, 29, 30, 31, 32, 33, NU, NU, NU, NU, 20, 15, 22, 23, /* IPv4 */ \ + 34, 35, 36, 37, NU, NU, NU, NU, NU, NU, NU, NU, NU, NU, NU, NU, /* UDP */ \ + NU, NU, NU, NU, NU, NU, NU, NU, NU, NU, NU, NU, NU, NU, NU, NU, /* Unused. */ + + +/* Generation of K-mask bitmask values, to zero out data in result. Note that + * these correspond 1:1 to the above "*_SHUFFLE" values, and bit used must be + * set in this K-mask, and "NU" values must be zero in the k-mask. Each mask + * defined here represents 2 blocks, so 16 bytes, so 4 characters (eg. 0xFFFF). + * + * Note the ULL suffix allows shifting by 32 or more without integer overflow. 
+ */ +#define KMASK_ETHER 0x1FFFULL +#define KMASK_IPV4 0xF0FFULL +#define KMASK_UDP 0x000FULL + +#define PATTERN_IPV4_UDP_KMASK \ + (KMASK_ETHER | (KMASK_IPV4 << 16) | (KMASK_UDP << 32)) + + +/* This union allows initializing static data as u8, but easily loading it + * into AVX512 registers too. The union ensures proper alignment for the zmm. + */ +union mfex_data { + uint8_t u8_data[64]; + __m512i zmm; +}; + +/* This structure represents a single traffic pattern. The AVX512 code to + * enable the specifics for each pattern is largely the same, so it is + * specialized to use the common profile data from here. + * + * Due to the nature of e.g. TCP flag handling, or VLAN CFI bit setting, + * some profiles require additional processing. This is handled by having + * all implementations call a post-process function, and specializing away + * the big switch() that handles all traffic types. + * + * This approach reduces AVX512 code-duplication for each traffic type. + */ +struct mfex_profile { + /* Required for probing a packet with the mfex pattern. */ + union mfex_data probe_mask; + union mfex_data probe_data; + + /* Required for reshaping packet into miniflow. */ + union mfex_data store_shuf; + __mmask64 store_kmsk; + + /* Constant data to set in mf.bits and dp_packet data on hit. */ + uint64_t mf_bits[2]; + uint16_t dp_pkt_offs[4]; + uint16_t dp_pkt_min_size; +}; + +BUILD_ASSERT_DECL((OFFSETOFEND(struct dp_packet, l4_ofs) + - offsetof(struct dp_packet, l2_pad_size)) == + MEMBER_SIZEOF(struct mfex_profile, dp_pkt_offs)); + +/* Any change in flow abi should be included here otherwise build should + * fail. + */ +BUILD_ASSERT_DECL(FLOW_WC_SEQ == 42); + +enum MFEX_PROFILES { + PROFILE_ETH_IPV4_UDP, + PROFILE_COUNT, +}; + +/* Static const instances of profiles. These are compile-time constants, + * and are specialized into individual miniflow-extract functions.
+ */ +static const struct mfex_profile mfex_profiles[PROFILE_COUNT] = +{ + [PROFILE_ETH_IPV4_UDP] = { + .probe_mask.u8_data = { PATTERN_ETHERTYPE_MASK PATTERN_IPV4_MASK }, + .probe_data.u8_data = { PATTERN_ETHERTYPE_IPV4 PATTERN_IPV4_UDP}, + + .store_shuf.u8_data = { PATTERN_IPV4_UDP_SHUFFLE }, + .store_kmsk = PATTERN_IPV4_UDP_KMASK, + + .mf_bits = { 0x18a0000000000000, 0x0000000000040401}, + .dp_pkt_offs = { + 0, UINT16_MAX, 14, 34, + }, + .dp_pkt_min_size = 42, + }, +}; + + +/* Protocol specific helper functions, for calculating offsets/lengths. */ +static int32_t +mfex_ipv4_set_l2_pad_size(struct dp_packet *pkt, struct ip_header *nh, + uint32_t len_from_ipv4) +{ + /* Handle dynamic l2_pad_size. */ + uint16_t tot_len = ntohs(nh->ip_tot_len); + if (OVS_UNLIKELY(tot_len > len_from_ipv4 || + (len_from_ipv4 - tot_len) > UINT16_MAX)) { + return -1; + } + dp_packet_set_l2_pad_size(pkt, len_from_ipv4 - tot_len); + return 0; +} + +/* Generic loop to process any mfex profile. This code is specialized into + * multiple actual MFEX implementation functions. It's marked ALWAYS_INLINE + * to ensure the compiler specializes each instance. The code is marked "hot" + * to inform the compiler this is a hotspot in the program, encouraging + * inlining of callee functions such as the permute calls. + */ +static inline uint32_t ALWAYS_INLINE +__attribute__ ((hot)) +mfex_avx512_process(struct dp_packet_batch *packets, + struct netdev_flow_key *keys, + uint32_t keys_size OVS_UNUSED, + odp_port_t in_port, + void *pmd_handle OVS_UNUSED, + const enum MFEX_PROFILES profile_id, + const uint32_t use_vbmi) +{ + uint32_t hitmask = 0; + struct dp_packet *packet; + + /* Here the profile to use is chosen by the variable used to specialize + * the function. This causes different MFEX traffic to be handled. + */ + const struct mfex_profile *profile = &mfex_profiles[profile_id]; + + /* Load profile constant data.
*/ + __m512i v_vals = _mm512_loadu_si512(&profile->probe_data); + __m512i v_mask = _mm512_loadu_si512(&profile->probe_mask); + __m512i v_shuf = _mm512_loadu_si512(&profile->store_shuf); + + __mmask64 k_shuf = profile->store_kmsk; + __m128i v_bits = _mm_loadu_si128((void *) &profile->mf_bits); + uint16_t dp_pkt_min_size = profile->dp_pkt_min_size; + + __m128i v_zeros = _mm_setzero_si128(); + __m128i v_blocks01 = _mm_insert_epi32(v_zeros, odp_to_u32(in_port), 1); + + DP_PACKET_BATCH_FOR_EACH (i, packet, packets) { + /* If the packet is smaller than the probe size, skip it. */ + const uint32_t size = dp_packet_size(packet); + if (size < dp_pkt_min_size) { + continue; + } + + /* Load packet data and probe with AVX512 mask & compare. */ + const uint8_t *pkt = dp_packet_data(packet); + __m512i v_pkt0 = _mm512_loadu_si512(pkt); + __m512i v_pkt0_masked = _mm512_and_si512(v_pkt0, v_mask); + __mmask64 k_cmp = _mm512_cmpeq_epi8_mask(v_pkt0_masked, v_vals); + if (k_cmp != UINT64_MAX) { + continue; + } + + /* Copy known dp packet offsets to the dp_packet instance. */ + memcpy(&packet->l2_pad_size, &profile->dp_pkt_offs, + sizeof(uint16_t) * 4); + + /* Store known miniflow bits and first two blocks. */ + struct miniflow *mf = &keys[i].mf; + uint64_t *bits = (void *) &mf->map.bits[0]; + uint64_t *blocks = miniflow_values(mf); + _mm_storeu_si128((void *) bits, v_bits); + _mm_storeu_si128((void *) blocks, v_blocks01); + + /* Permute the packet layout into miniflow blocks shape. + * As different AVX512 ISA levels have different implementations, + * this specializes on the "use_vbmi" attribute passed in. 
+ */ + __m512i v512_zeros = _mm512_setzero_si512(); + __m512i v_blk0 = v512_zeros; + if (__builtin_constant_p(use_vbmi) && use_vbmi) { + v_blk0 = _mm512_maskz_permutexvar_epi8_wrap(k_shuf, v_shuf, + v_pkt0); + } else { + v_blk0 = _mm512_maskz_permutex2var_epi8_skx(k_shuf, v_pkt0, + v_shuf, v512_zeros); + } + _mm512_storeu_si512(&blocks[2], v_blk0); + + + /* Perform "post-processing" per profile, handling details not easily + * handled in the above generic AVX512 code. Examples include TCP flag + * parsing, adding the VLAN CFI bit, and handling IPv4 fragments. + */ + switch (profile_id) { + case PROFILE_COUNT: + ovs_assert(0); /* avoid compiler warning on missing ENUM */ + break; + + case PROFILE_ETH_IPV4_UDP: { + /* Handle dynamic l2_pad_size. */ + uint32_t size_from_ipv4 = size - sizeof(struct eth_header); + struct ip_header *nh = (void *)&pkt[sizeof(struct eth_header)]; + if (mfex_ipv4_set_l2_pad_size(packet, nh, size_from_ipv4)) { + continue; + } + + } break; + default: + break; + }; + + /* This packet has its miniflow created, add to hitmask. 
*/ + hitmask |= 1 << i; + } + + return hitmask; +} + + +#define DECLARE_MFEX_FUNC(name, profile) \ +uint32_t \ +__attribute__((__target__("avx512f"))) \ +__attribute__((__target__("avx512vl"))) \ +__attribute__((__target__("avx512vbmi"))) \ +mfex_avx512_vbmi_##name(struct dp_packet_batch *packets, \ + struct netdev_flow_key *keys, uint32_t keys_size,\ + odp_port_t in_port, struct dp_netdev_pmd_thread \ + *pmd_handle) \ +{ \ + return mfex_avx512_process(packets, keys, keys_size, in_port, \ + pmd_handle, profile, 1); \ +} \ + \ +uint32_t \ +__attribute__((__target__("avx512f"))) \ +__attribute__((__target__("avx512vl"))) \ +mfex_avx512_##name(struct dp_packet_batch *packets, \ + struct netdev_flow_key *keys, uint32_t keys_size, \ + odp_port_t in_port, struct dp_netdev_pmd_thread \ + *pmd_handle) \ +{ \ + return mfex_avx512_process(packets, keys, keys_size, in_port, \ + pmd_handle, profile, 0); \ +} + +/* Each profile gets a single declare here, which specializes the function + * as required. + */ +DECLARE_MFEX_FUNC(ip_udp, PROFILE_ETH_IPV4_UDP) + + +static int32_t +avx512_isa_probe(uint32_t needs_vbmi) +{ + static const char *isa_required[] = { + "avx512f", + "avx512bw", + "bmi2", + }; + + int32_t ret = 0; + for (uint32_t i = 0; i < ARRAY_SIZE(isa_required); i++) { + if (!dpdk_get_cpu_has_isa("x86_64", isa_required[i])) { + ret = -ENOTSUP; + } + } + + if (needs_vbmi) { + if (!dpdk_get_cpu_has_isa("x86_64", "avx512vbmi")) { + ret = -ENOTSUP; + } + } + + return ret; +} + +/* Probe functions to check ISA requirements. 
*/ +int32_t +mfex_avx512_probe(void) +{ + const uint32_t needs_vbmi = 0; + return avx512_isa_probe(needs_vbmi); +} + +int32_t +mfex_avx512_vbmi_probe(void) +{ + const uint32_t needs_vbmi = 1; + return avx512_isa_probe(needs_vbmi); +} + +#endif /* __CHECKER__ */ +#endif /* __x86_64__ */ diff --git a/lib/dpif-netdev-extract-study.c b/lib/dpif-netdev-extract-study.c index 9b36d1974..6de633214 100644 --- a/lib/dpif-netdev-extract-study.c +++ b/lib/dpif-netdev-extract-study.c @@ -89,7 +89,7 @@ mfex_study_traffic(struct dp_packet_batch *packets, * to be compared after certain packets have been hit to choose * the best miniflow_extract version for that traffic. */ - for (int i = MFEX_IMPL_MAX; i < impl_count; i++) { + for (int i = MFEX_IMPL_VMBI_IPv4_UDP; i < impl_count; i++) { if (miniflow_funcs[i].available) { hitmask = miniflow_funcs[i].extract_func(packets, keys, keys_size, in_port, pmd_handle); @@ -108,9 +108,9 @@ mfex_study_traffic(struct dp_packet_batch *packets, * processed. */ if (stats->pkt_count >= mfex_study_pkts_count) { - uint32_t best_func_index = MFEX_IMPL_MAX; + uint32_t best_func_index = MFEX_IMPL_VMBI_IPv4_UDP; uint32_t max_hits = 0; - for (int i = MFEX_IMPL_MAX; i < impl_count; i++) { + for (int i = MFEX_IMPL_VMBI_IPv4_UDP; i < impl_count; i++) { if (stats->impl_hitcount[i] > max_hits) { max_hits = stats->impl_hitcount[i]; best_func_index = i; diff --git a/lib/dpif-netdev-private-extract.c b/lib/dpif-netdev-private-extract.c index c1239b319..5929e1493 100644 --- a/lib/dpif-netdev-private-extract.c +++ b/lib/dpif-netdev-private-extract.c @@ -52,6 +52,19 @@ static struct dpif_miniflow_extract_impl mfex_impls[] = { .probe = NULL, .extract_func = mfex_study_traffic, .name = "study", }, + +/* Compile in implementations only if the compiler ISA checks pass. 
*/ +#if (__x86_64__ && HAVE_AVX512F && HAVE_LD_AVX512_GOOD && __SSE4_2__) + [MFEX_IMPL_VMBI_IPv4_UDP] = { + .probe = mfex_avx512_vbmi_probe, + .extract_func = mfex_avx512_vbmi_ip_udp, + .name = "avx512_vbmi_ipv4_udp", }, + + [MFEX_IMPL_IPv4_UDP] = { + .probe = mfex_avx512_probe, + .extract_func = mfex_avx512_ip_udp, + .name = "avx512_ipv4_udp", }, +#endif }; BUILD_ASSERT_DECL(MFEX_IMPL_MAX >= ARRAY_SIZE(mfex_impls)); @@ -222,7 +235,8 @@ dpif_miniflow_extract_autovalidator(struct dp_packet_batch *packets, uint32_t batch_failed = 0; /* Iterate through each version of miniflow implementations. */ - for (int j = MFEX_IMPL_MAX; j < MFEX_IMPL_MAX; j++) { + for (int j = MFEX_IMPL_VMBI_IPv4_UDP; j < MFEX_IMPL_MAX; j++) { + if ((j < MFEX_IMPL_MAX) || (!mfex_impls[j].available)) { continue; } diff --git a/lib/dpif-netdev-private-extract.h b/lib/dpif-netdev-private-extract.h index a1f48d870..d1875815a 100644 --- a/lib/dpif-netdev-private-extract.h +++ b/lib/dpif-netdev-private-extract.h @@ -72,6 +72,8 @@ enum dpif_miniflow_extract_impl_idx { MFEX_IMPL_AUTOVALIDATOR, MFEX_IMPL_SCALAR, MFEX_IMPL_STUDY, + MFEX_IMPL_VMBI_IPv4_UDP, + MFEX_IMPL_IPv4_UDP, MFEX_IMPL_MAX }; @@ -144,4 +146,24 @@ mfex_study_traffic(struct dp_packet_batch *packets, uint32_t mfex_set_study_pkt_cnt(uint32_t pkt_cmp_count, const char *name); +/* AVX512 MFEX Probe and Implementations functions. 
*/
+#ifdef __x86_64__
+int32_t mfex_avx512_probe(void);
+int32_t mfex_avx512_vbmi_probe(void);
+
+#define DECLARE_AVX512_MFEX_PROTOTYPE(name) \
+    uint32_t \
+    mfex_avx512_vbmi_##name(struct dp_packet_batch *packets, \
+                            struct netdev_flow_key *keys, uint32_t keys_size,\
+                            odp_port_t in_port, struct dp_netdev_pmd_thread \
+                            *pmd_handle); \
+    uint32_t \
+    mfex_avx512_##name(struct dp_packet_batch *packets, \
+                       struct netdev_flow_key *keys, uint32_t keys_size, \
+                       odp_port_t in_port, struct dp_netdev_pmd_thread \
+                       *pmd_handle); \
+
+DECLARE_AVX512_MFEX_PROTOTYPE(ip_udp);
+#endif /* __x86_64__ */
+
 #endif /* MFEX_AVX512_EXTRACT */

From patchwork Tue Jul 6 13:11:50 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Ferriter, Cian"
X-Patchwork-Id: 1501244
Return-Path:
X-Original-To: incoming@patchwork.ozlabs.org
Delivered-To: patchwork-incoming@bilbo.ozlabs.org
Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=openvswitch.org (client-ip=140.211.166.133; helo=smtp2.osuosl.org; envelope-from=ovs-dev-bounces@openvswitch.org; receiver=)
Received: from smtp2.osuosl.org (smtp2.osuosl.org [140.211.166.133]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4GK2zh13cjz9sWq for ; Tue, 6 Jul 2021 23:13:04 +1000 (AEST)
Received: from localhost (localhost [127.0.0.1]) by smtp2.osuosl.org (Postfix) with ESMTP id 957CF40410; Tue, 6 Jul 2021 13:13:00 +0000 (UTC)
X-Virus-Scanned: amavisd-new at osuosl.org
Received: from smtp2.osuosl.org ([127.0.0.1]) by localhost (smtp2.osuosl.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id cUrgPtv0GP_d; Tue, 6 Jul 2021 13:12:59 +0000 (UTC)
Received: from lists.linuxfoundation.org (lf-lists.osuosl.org [140.211.9.56]) by smtp2.osuosl.org (Postfix) with ESMTPS id 6425A4044A; Tue, 6 Jul 2021 13:12:58 +0000 (UTC)
Received: from lf-lists.osuosl.org (localhost [127.0.0.1]) by lists.linuxfoundation.org (Postfix) with ESMTP id 43E6AC0010; Tue, 6 Jul 2021 13:12:58 +0000 (UTC)
X-Original-To: ovs-dev@openvswitch.org
Delivered-To: ovs-dev@lists.linuxfoundation.org
Received: from smtp2.osuosl.org (smtp2.osuosl.org [IPv6:2605:bc80:3010::133]) by lists.linuxfoundation.org (Postfix) with ESMTP id 2F6D9C0010 for ; Tue, 6 Jul 2021 13:12:57 +0000 (UTC)
Received: from localhost (localhost [127.0.0.1]) by smtp2.osuosl.org (Postfix) with ESMTP id 1CCFD404CF for ; Tue, 6 Jul 2021 13:12:19 +0000 (UTC)
X-Virus-Scanned: amavisd-new at osuosl.org
Received: from smtp2.osuosl.org ([127.0.0.1]) by localhost (smtp2.osuosl.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id BV12iBZ4PAiz for ; Tue, 6 Jul 2021 13:12:16 +0000 (UTC)
X-Greylist: domain auto-whitelisted by SQLgrey-1.8.0
Received: from mga11.intel.com (mga11.intel.com [192.55.52.93]) by smtp2.osuosl.org (Postfix) with ESMTPS id C0A1240412 for ; Tue, 6 Jul 2021 13:12:13 +0000 (UTC)
X-IronPort-AV: E=McAfee;i="6200,9189,10036"; a="206101898"
X-IronPort-AV: E=Sophos;i="5.83,328,1616482800"; d="scan'208";a="206101898"
Received: from fmsmga003.fm.intel.com ([10.253.24.29]) by fmsmga102.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 06 Jul 2021 06:12:13 -0700
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.83,328,1616482800"; d="scan'208";a="486258147"
Received: from silpixa00399779.ir.intel.com (HELO silpixa00399779.ger.corp.intel.com) ([10.237.222.105]) by FMSMGA003.fm.intel.com with ESMTP; 06 Jul 2021 06:12:11 -0700
From: Cian Ferriter
To: ovs-dev@openvswitch.org
Date: Tue, 6 Jul 2021 14:11:50 +0100
Message-Id: <20210706131150.45513-12-cian.ferriter@intel.com>
X-Mailer: git-send-email 2.32.0
In-Reply-To: <20210706131150.45513-1-cian.ferriter@intel.com>
References: <20210706131150.45513-1-cian.ferriter@intel.com>
MIME-Version: 1.0
Cc: i.maximets@ovn.org, fbl@sysclose.org, kumar.amber@intel.com
Subject: [ovs-dev] [v6 11/11] dpif-netdev/mfex: add more AVX512 traffic profiles
X-BeenThere: ovs-dev@openvswitch.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: ovs-dev-bounces@openvswitch.org
Sender: "dev"

From: Harry van Haaren

This commit adds 3 new traffic profile implementations to the existing
avx512 miniflow extract infrastructure. The profiles added are:
- Ether()/IP()/TCP()
- Ether()/Dot1Q()/IP()/UDP()
- Ether()/Dot1Q()/IP()/TCP()

The design of the avx512 code here is for scalability to add more
traffic profiles, as well as enabling new CPU ISA. Note that an
implementation is primarily adding static const data, which the compiler
then specializes away when the profile-specific function is declared
below.

As a result, the code is relatively maintainable, and scalable for new
traffic profiles as well as new ISA, and does not lower performance
compared with manually written code for each profile/ISA.

Note that confidence in the correctness of each implementation is
achieved through autovalidation, unit tests with known packets, and
fuzz-tested packets.

Signed-off-by: Harry van Haaren
Acked-by: Eelco Chaudron
---

Hi Readers,

If you have a traffic profile you'd like to see accelerated using
avx512 code, please send me an email and we can collaborate on adding
support for it!

Regards, -Harry

---

v5:
- fix review comments (Ian, Flavio, Eelco)

---
---
 NEWS                              |   2 +
 lib/dpif-netdev-extract-avx512.c  | 152 ++++++++++++++++++++++++++++++
 lib/dpif-netdev-private-extract.c |  30 ++++++
 lib/dpif-netdev-private-extract.h |  10 ++
 4 files changed, 194 insertions(+)

diff --git a/NEWS b/NEWS
index 502b41e3c..ec4c61466 100644
--- a/NEWS
+++ b/NEWS
@@ -31,6 +31,8 @@ Post-v2.15.0
    * Add build time configure command to enable auto-validatior as default
      miniflow implementation at build time.
    * Cache results for CPU ISA checks, reduces overhead on repeated lookups.
+   * Add AVX512-based optimized miniflow extract functions for traffic types
+     IPv4/UDP, IPv4/TCP, VLAN/IPv4/UDP and VLAN/IPv4/TCP.
    - ovs-ctl:
      * New option '--no-record-hostname' to disable hostname configuration
        in ovsdb on startup.
diff --git a/lib/dpif-netdev-extract-avx512.c b/lib/dpif-netdev-extract-avx512.c
index 887caa6f2..ed0df0181 100644
--- a/lib/dpif-netdev-extract-avx512.c
+++ b/lib/dpif-netdev-extract-avx512.c
@@ -136,6 +136,13 @@ _mm512_maskz_permutexvar_epi8_wrap(__mmask64 kmask, __m512i idx, __m512i a)
 #define PATTERN_ETHERTYPE_MASK PATTERN_ETHERTYPE_GEN(0xFF, 0xFF)
 #define PATTERN_ETHERTYPE_IPV4 PATTERN_ETHERTYPE_GEN(0x08, 0x00)
+#define PATTERN_ETHERTYPE_DT1Q PATTERN_ETHERTYPE_GEN(0x81, 0x00)
+
+/* VLAN (Dot1Q) patterns and masks. */
+#define PATTERN_DT1Q_MASK \
+  0x00, 0x00, 0xFF, 0xFF,
+#define PATTERN_DT1Q_IPV4 \
+  0x00, 0x00, 0x08, 0x00,
 /* Generator for checking IPv4 ver, ihl, and proto */
 #define PATTERN_IPV4_GEN(VER_IHL, FLAG_OFF_B0, FLAG_OFF_B1, PROTO) \
@@ -161,6 +168,29 @@ _mm512_maskz_permutexvar_epi8_wrap(__mmask64 kmask, __m512i idx, __m512i a)
   34, 35, 36, 37, NU, NU, NU, NU, NU, NU, NU, NU, NU, NU, NU, NU, /* UDP */ \
   NU, NU, NU, NU, NU, NU, NU, NU, NU, NU, NU, NU, NU, NU, NU, NU, /* Unused. */
+/* TCP shuffle: tcp_ctl bits require mask/processing, not included here. */
+#define PATTERN_IPV4_TCP_SHUFFLE \
+   0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, NU, NU, /* Ether */ \
+  26, 27, 28, 29, 30, 31, 32, 33, NU, NU, NU, NU, 20, 15, 22, 23, /* IPv4 */ \
+  NU, NU, NU, NU, NU, NU, NU, NU, 34, 35, 36, 37, NU, NU, NU, NU, /* TCP */ \
+  NU, NU, NU, NU, NU, NU, NU, NU, NU, NU, NU, NU, NU, NU, NU, NU, /* Unused. */
+
+#define PATTERN_DT1Q_IPV4_UDP_SHUFFLE \
+  /* Ether (2 blocks): Note that *VLAN* type is written here. */ \
+  0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 16, 17, 0, 0, \
+  /* VLAN (1 block): Note that the *EtherHdr->Type* is written here. */ \
+  12, 13, 14, 15, 0, 0, 0, 0, \
+  30, 31, 32, 33, 34, 35, 36, 37, 0, 0, 0, 0, 24, 19, 26, 27, /* IPv4 */ \
+  38, 39, 40, 41, NU, NU, NU, NU, /* UDP */
+
+#define PATTERN_DT1Q_IPV4_TCP_SHUFFLE \
+  /* Ether (2 blocks): Note that *VLAN* type is written here. */ \
+  0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 16, 17, 0, 0, \
+  /* VLAN (1 block): Note that the *EtherHdr->Type* is written here. */ \
+  12, 13, 14, 15, 0, 0, 0, 0, \
+  30, 31, 32, 33, 34, 35, 36, 37, 0, 0, 0, 0, 24, 19, 26, 27, /* IPv4 */ \
+  NU, NU, NU, NU, NU, NU, NU, NU, 38, 39, 40, 41, NU, NU, NU, NU, /* TCP */ \
+  NU, NU, NU, NU, NU, NU, NU, NU, /* Unused. */
 /* Generation of K-mask bitmask values, to zero out data in result. Note that
  * these correspond 1:1 to the above "*_SHUFFLE" values, and bit used must be
@@ -170,12 +200,22 @@ _mm512_maskz_permutexvar_epi8_wrap(__mmask64 kmask, __m512i idx, __m512i a)
  * Note the ULL suffix allows shifting by 32 or more without integer overflow.
  */
 #define KMASK_ETHER 0x1FFFULL
+#define KMASK_DT1Q 0x0FULL
 #define KMASK_IPV4 0xF0FFULL
 #define KMASK_UDP 0x000FULL
+#define KMASK_TCP 0x0F00ULL
 #define PATTERN_IPV4_UDP_KMASK \
     (KMASK_ETHER | (KMASK_IPV4 << 16) | (KMASK_UDP << 32))
+#define PATTERN_IPV4_TCP_KMASK \
+    (KMASK_ETHER | (KMASK_IPV4 << 16) | (KMASK_TCP << 32))
+
+#define PATTERN_DT1Q_IPV4_UDP_KMASK \
+    (KMASK_ETHER | (KMASK_DT1Q << 16) | (KMASK_IPV4 << 24) | (KMASK_UDP << 40))
+
+#define PATTERN_DT1Q_IPV4_TCP_KMASK \
+    (KMASK_ETHER | (KMASK_DT1Q << 16) | (KMASK_IPV4 << 24) | (KMASK_TCP << 40))
 /* This union allows initializing static data as u8, but easily loading it
  * into AVX512 registers too. The union ensures proper alignment for the zmm.
@@ -222,6 +262,9 @@ BUILD_ASSERT_DECL(FLOW_WC_SEQ == 42);
 enum MFEX_PROFILES {
     PROFILE_ETH_IPV4_UDP,
+    PROFILE_ETH_IPV4_TCP,
+    PROFILE_ETH_VLAN_IPV4_UDP,
+    PROFILE_ETH_VLAN_IPV4_TCP,
     PROFILE_COUNT,
 };
@@ -243,6 +286,56 @@ static const struct mfex_profile mfex_profiles[PROFILE_COUNT] =
         },
         .dp_pkt_min_size = 42,
     },
+
+    [PROFILE_ETH_IPV4_TCP] = {
+        .probe_mask.u8_data = { PATTERN_ETHERTYPE_MASK PATTERN_IPV4_MASK },
+        .probe_data.u8_data = { PATTERN_ETHERTYPE_IPV4 PATTERN_IPV4_TCP},
+
+        .store_shuf.u8_data = { PATTERN_IPV4_TCP_SHUFFLE },
+        .store_kmsk = PATTERN_IPV4_TCP_KMASK,
+
+        .mf_bits = { 0x18a0000000000000, 0x0000000000044401},
+        .dp_pkt_offs = {
+            0, UINT16_MAX, 14, 34,
+        },
+        .dp_pkt_min_size = 54,
+    },
+
+    [PROFILE_ETH_VLAN_IPV4_UDP] = {
+        .probe_mask.u8_data = {
+            PATTERN_ETHERTYPE_MASK PATTERN_DT1Q_MASK PATTERN_IPV4_MASK
+        },
+        .probe_data.u8_data = {
+            PATTERN_ETHERTYPE_DT1Q PATTERN_DT1Q_IPV4 PATTERN_IPV4_UDP
+        },
+
+        .store_shuf.u8_data = { PATTERN_DT1Q_IPV4_UDP_SHUFFLE },
+        .store_kmsk = PATTERN_DT1Q_IPV4_UDP_KMASK,
+
+        .mf_bits = { 0x38a0000000000000, 0x0000000000040401},
+        .dp_pkt_offs = {
+            14, UINT16_MAX, 18, 38,
+        },
+        .dp_pkt_min_size = 46,
+    },
+
+    [PROFILE_ETH_VLAN_IPV4_TCP] = {
+        .probe_mask.u8_data = {
+            PATTERN_ETHERTYPE_MASK PATTERN_DT1Q_MASK PATTERN_IPV4_MASK
+        },
+        .probe_data.u8_data = {
+            PATTERN_ETHERTYPE_DT1Q PATTERN_DT1Q_IPV4 PATTERN_IPV4_TCP
+        },
+
+        .store_shuf.u8_data = { PATTERN_DT1Q_IPV4_TCP_SHUFFLE },
+        .store_kmsk = PATTERN_DT1Q_IPV4_TCP_KMASK,
+
+        .mf_bits = { 0x38a0000000000000, 0x0000000000044401},
+        .dp_pkt_offs = {
+            14, UINT16_MAX, 18, 38,
+        },
+        .dp_pkt_min_size = 58,
+    },
 };
@@ -261,6 +354,25 @@ mfex_ipv4_set_l2_pad_size(struct dp_packet *pkt, struct ip_header *nh,
     return 0;
 }
+/* Fixup the VLAN CFI and PCP, reading the PCP from the input to this function,
+ * and storing the output CFI bit bitwise-OR-ed with the PCP to miniflow.
+ */
+static void
+mfex_vlan_pcp(const uint8_t vlan_pcp, uint64_t *block)
+{
+    /* Bitwise-OR in the CFI flag, keeping other data the same. */
+    uint8_t *cfi_byte = (uint8_t *) block;
+    cfi_byte[2] = 0x10 | vlan_pcp;
+}
+
+static void
+mfex_handle_tcp_flags(const struct tcp_header *tcp, uint64_t *block)
+{
+    uint16_t ctl = (OVS_FORCE uint16_t) TCP_FLAGS_BE16(tcp->tcp_ctl);
+    uint64_t ctl_u64 = ctl;
+    *block = ctl_u64 << 32;
+}
+
 /* Generic loop to process any mfex profile. This code is specialized into
  * multiple actual MFEX implementation functions. Its marked ALWAYS_INLINE
  * to ensure the compiler specializes each instance. The code is marked "hot"
@@ -349,6 +461,43 @@ mfex_avx512_process(struct dp_packet_batch *packets,
             ovs_assert(0); /* avoid compiler warning on missing ENUM */
             break;
+        case PROFILE_ETH_VLAN_IPV4_TCP: {
+            mfex_vlan_pcp(pkt[14], &keys[i].buf[4]);
+
+            uint32_t size_from_ipv4 = size - VLAN_ETH_HEADER_LEN;
+            struct ip_header *nh = (void *)&pkt[VLAN_ETH_HEADER_LEN];
+            if (mfex_ipv4_set_l2_pad_size(packet, nh, size_from_ipv4)) {
+                continue;
+            }
+
+            /* Process TCP flags, and store to blocks. */
+            const struct tcp_header *tcp = (void *)&pkt[38];
+            mfex_handle_tcp_flags(tcp, &blocks[7]);
+        } break;
+
+        case PROFILE_ETH_VLAN_IPV4_UDP: {
+            mfex_vlan_pcp(pkt[14], &keys[i].buf[4]);
+
+            uint32_t size_from_ipv4 = size - VLAN_ETH_HEADER_LEN;
+            struct ip_header *nh = (void *)&pkt[VLAN_ETH_HEADER_LEN];
+            if (mfex_ipv4_set_l2_pad_size(packet, nh, size_from_ipv4)) {
+                continue;
+            }
+        } break;
+
+        case PROFILE_ETH_IPV4_TCP: {
+            /* Process TCP flags, and store to blocks. */
+            const struct tcp_header *tcp = (void *)&pkt[34];
+            mfex_handle_tcp_flags(tcp, &blocks[6]);
+
+            /* Handle dynamic l2_pad_size. */
+            uint32_t size_from_ipv4 = size - sizeof(struct eth_header);
+            struct ip_header *nh = (void *)&pkt[sizeof(struct eth_header)];
+            if (mfex_ipv4_set_l2_pad_size(packet, nh, size_from_ipv4)) {
+                continue;
+            }
+        } break;
+
         case PROFILE_ETH_IPV4_UDP: {
             /* Handle dynamic l2_pad_size. */
             uint32_t size_from_ipv4 = size - sizeof(struct eth_header);
@@ -400,6 +549,9 @@ mfex_avx512_##name(struct dp_packet_batch *packets, \
  * as required.
  */
 DECLARE_MFEX_FUNC(ip_udp, PROFILE_ETH_IPV4_UDP)
+DECLARE_MFEX_FUNC(ip_tcp, PROFILE_ETH_IPV4_TCP)
+DECLARE_MFEX_FUNC(dot1q_ip_udp, PROFILE_ETH_VLAN_IPV4_UDP)
+DECLARE_MFEX_FUNC(dot1q_ip_tcp, PROFILE_ETH_VLAN_IPV4_TCP)
 static int32_t
diff --git a/lib/dpif-netdev-private-extract.c b/lib/dpif-netdev-private-extract.c
index 5929e1493..4987d628a 100644
--- a/lib/dpif-netdev-private-extract.c
+++ b/lib/dpif-netdev-private-extract.c
@@ -64,6 +64,36 @@ static struct dpif_miniflow_extract_impl mfex_impls[] = {
         .probe = mfex_avx512_probe,
         .extract_func = mfex_avx512_ip_udp,
         .name = "avx512_ipv4_udp", },
+
+    [MFEX_IMPL_VMBI_IPv4_TCP] = {
+        .probe = mfex_avx512_vbmi_probe,
+        .extract_func = mfex_avx512_vbmi_ip_tcp,
+        .name = "avx512_vbmi_ipv4_tcp", },
+
+    [MFEX_IMPL_IPv4_TCP] = {
+        .probe = mfex_avx512_probe,
+        .extract_func = mfex_avx512_ip_tcp,
+        .name = "avx512_ipv4_tcp", },
+
+    [MFEX_IMPL_VMBI_DOT1Q_IPv4_UDP] = {
+        .probe = mfex_avx512_vbmi_probe,
+        .extract_func = mfex_avx512_vbmi_dot1q_ip_udp,
+        .name = "avx512_vbmi_dot1q_ipv4_udp", },
+
+    [MFEX_IMPL_DOT1Q_IPv4_UDP] = {
+        .probe = mfex_avx512_probe,
+        .extract_func = mfex_avx512_dot1q_ip_udp,
+        .name = "avx512_dot1q_ipv4_udp", },
+
+    [MFEX_IMPL_VMBI_DOT1Q_IPv4_TCP] = {
+        .probe = mfex_avx512_vbmi_probe,
+        .extract_func = mfex_avx512_vbmi_dot1q_ip_tcp,
+        .name = "avx512_vbmi_dot1q_ipv4_tcp", },
+
+    [MFEX_IMPL_DOT1Q_IPv4_TCP] = {
+        .probe = mfex_avx512_probe,
+        .extract_func = mfex_avx512_dot1q_ip_tcp,
+        .name = "avx512_dot1q_ipv4_tcp", },
 #endif
 };
diff --git a/lib/dpif-netdev-private-extract.h b/lib/dpif-netdev-private-extract.h
index d1875815a..40e12fac8 100644
--- a/lib/dpif-netdev-private-extract.h
+++ b/lib/dpif-netdev-private-extract.h
@@ -74,6 +74,12 @@ enum dpif_miniflow_extract_impl_idx {
     MFEX_IMPL_STUDY,
     MFEX_IMPL_VMBI_IPv4_UDP,
     MFEX_IMPL_IPv4_UDP,
+    MFEX_IMPL_VMBI_IPv4_TCP,
+    MFEX_IMPL_IPv4_TCP,
+    MFEX_IMPL_VMBI_DOT1Q_IPv4_UDP,
+    MFEX_IMPL_DOT1Q_IPv4_UDP,
+    MFEX_IMPL_VMBI_DOT1Q_IPv4_TCP,
+    MFEX_IMPL_DOT1Q_IPv4_TCP,
     MFEX_IMPL_MAX
 };
@@ -164,6 +170,10 @@ int32_t mfex_avx512_vbmi_probe(void);
                        *pmd_handle); \
 DECLARE_AVX512_MFEX_PROTOTYPE(ip_udp);
+DECLARE_AVX512_MFEX_PROTOTYPE(ip_tcp);
+DECLARE_AVX512_MFEX_PROTOTYPE(dot1q_ip_udp);
+DECLARE_AVX512_MFEX_PROTOTYPE(dot1q_ip_tcp);
+
 #endif /* __x86_64__ */
 #endif /* MFEX_AVX512_EXTRACT */
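
A note on the K-mask composition used by the new profiles, as a minimal
standalone sketch rather than code from this series. The KMASK_* values are
copied from the patch; the example_kmask_*() helpers are hypothetical names
introduced only for illustration. The sketch assumes the usual meaning of the
mask passed to the masked byte permute: bit i set means output byte i is kept,
otherwise it is zeroed. Each per-header mask is shifted to the byte offset
where that header's region begins in the 64-byte result, so inserting the
8-byte VLAN block moves IPv4 from byte 16 to byte 24 and L4 from byte 32 to
byte 40.

```c
#include <assert.h>
#include <stdint.h>

/* Per-header byte-select masks, copied from the patch. Within a header's
 * region, bit i set means "keep byte i of that region". */
#define KMASK_ETHER 0x1FFFULL
#define KMASK_DT1Q  0x0FULL
#define KMASK_IPV4  0xF0FFULL
#define KMASK_UDP   0x000FULL
#define KMASK_TCP   0x0F00ULL

/* Hypothetical helper (not part of OVS): untagged Ether/IPv4/L4 layout.
 * Ether region starts at byte 0, IPv4 at byte 16, L4 at byte 32. */
static inline uint64_t
example_kmask_ipv4(uint64_t l4_mask)
{
    return KMASK_ETHER | (KMASK_IPV4 << 16) | (l4_mask << 32);
}

/* Hypothetical helper (not part of OVS): Ether/Dot1Q/IPv4/L4 layout.
 * The 8-byte VLAN region at byte 16 pushes IPv4 to byte 24 and L4 to 40. */
static inline uint64_t
example_kmask_dot1q_ipv4(uint64_t l4_mask)
{
    return KMASK_ETHER | (KMASK_DT1Q << 16) | (KMASK_IPV4 << 24)
           | (l4_mask << 40);
}

/* Checks that the helpers reproduce the four PATTERN_*_KMASK expressions
 * written out in the patch. */
static inline void
example_kmask_self_check(void)
{
    assert(example_kmask_ipv4(KMASK_UDP)
           == (KMASK_ETHER | (KMASK_IPV4 << 16) | (KMASK_UDP << 32)));
    assert(example_kmask_ipv4(KMASK_TCP)
           == (KMASK_ETHER | (KMASK_IPV4 << 16) | (KMASK_TCP << 32)));
    assert(example_kmask_dot1q_ipv4(KMASK_UDP)
           == (KMASK_ETHER | (KMASK_DT1Q << 16) | (KMASK_IPV4 << 24)
               | (KMASK_UDP << 40)));
    assert(example_kmask_dot1q_ipv4(KMASK_TCP)
           == (KMASK_ETHER | (KMASK_DT1Q << 16) | (KMASK_IPV4 << 24)
               | (KMASK_TCP << 40)));
}
```

Calling example_kmask_self_check() from a test main() is enough to exercise
it. The ULL suffixes matter here for the same reason the patch's comment
gives: the shifts by 32 and 40 would overflow a 32-bit int.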