From patchwork Mon Jul 15 16:36:32 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Van Haaren, Harry" X-Patchwork-Id: 1132099 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=openvswitch.org (client-ip=140.211.169.12; helo=mail.linuxfoundation.org; envelope-from=ovs-dev-bounces@openvswitch.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=intel.com Received: from mail.linuxfoundation.org (mail.linuxfoundation.org [140.211.169.12]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 45nTgd6Bdjz9sDQ for ; Tue, 16 Jul 2019 02:36:29 +1000 (AEST) Received: from mail.linux-foundation.org (localhost [127.0.0.1]) by mail.linuxfoundation.org (Postfix) with ESMTP id 81B02E35; Mon, 15 Jul 2019 16:35:23 +0000 (UTC) X-Original-To: dev@openvswitch.org Delivered-To: ovs-dev@mail.linuxfoundation.org Received: from smtp1.linuxfoundation.org (smtp1.linux-foundation.org [172.17.192.35]) by mail.linuxfoundation.org (Postfix) with ESMTPS id C1D4FE30 for ; Mon, 15 Jul 2019 16:35:21 +0000 (UTC) X-Greylist: domain auto-whitelisted by SQLgrey-1.7.6 Received: from mga17.intel.com (mga17.intel.com [192.55.52.151]) by smtp1.linuxfoundation.org (Postfix) with ESMTPS id E7469876 for ; Mon, 15 Jul 2019 16:35:20 +0000 (UTC) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga007.fm.intel.com ([10.253.24.52]) by fmsmga107.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 15 Jul 2019 09:35:20 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.63,493,1557212400"; d="scan'208";a="168993665" Received: from silpixa00399779.ir.intel.com (HELO silpixa00399779.ger.corp.intel.com) ([10.237.222.100]) by fmsmga007.fm.intel.com with 
ESMTP; 15 Jul 2019 09:35:19 -0700 From: Harry van Haaren To: dev@openvswitch.org Date: Mon, 15 Jul 2019 17:36:32 +0100 Message-Id: <20190715163636.51572-2-harry.van.haaren@intel.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20190715163636.51572-1-harry.van.haaren@intel.com> References: <20190709123440.45519-1-harry.van.haaren@intel.com> <20190715163636.51572-1-harry.van.haaren@intel.com> X-Spam-Status: No, score=-4.2 required=5.0 tests=BAYES_00, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on smtp1.linux-foundation.org Cc: i.maximets@samsung.com Subject: [ovs-dev] [PATCH v11 1/5] dpif-netdev: Implement function pointers/subtable X-BeenThere: ovs-dev@openvswitch.org X-Mailman-Version: 2.1.12 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , MIME-Version: 1.0 Sender: ovs-dev-bounces@openvswitch.org Errors-To: ovs-dev-bounces@openvswitch.org This allows plugging-in of different subtable hash-lookup-verify routines, and allows special casing of those functions based on known context (eg: # of bits set) of the specific subtable. Signed-off-by: Harry van Haaren Tested-by: Malvika Gupta --- v11: - Rebased to latest master - Added space to ULLONG_FOR_EACH_1 (Ilya) - Use capital letter in commit message (Ilya) v10: - Fix capitalization of comments, and punctuation. 
(Ian) - Variable declarations up top before use (Ian) - Fix alignment of function parameters, had to newline after typedef (Ian) - Some mailing-list questions replied to on-list (Ian) v9: - Use count_1bits in favour of __builtin_popcount (Ilya) v6: - Implement subtable effort per packet "lookups_match" counter (Ilya) - Remove double newline (Eelco) - Remove double * before comments (Eelco) - Reword comments in dpcls_lookup() for clarity (Harry) --- lib/dpif-netdev.c | 138 ++++++++++++++++++++++++++++++++-------------- 1 file changed, 96 insertions(+), 42 deletions(-) diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c index 6b99a3c44..123f04577 100644 --- a/lib/dpif-netdev.c +++ b/lib/dpif-netdev.c @@ -7683,6 +7683,28 @@ dpif_dummy_register(enum dummy_level level) /* Datapath Classifier. */ +/* Forward declaration for lookup_func typedef. */ +struct dpcls_subtable; + +/* Lookup function for a subtable in the dpcls. This function is called + * by each subtable with an array of packets, and a bitmask of packets to + * perform the lookup on. Using a function pointer gives flexibility to + * optimize the lookup function based on subtable properties and the + * CPU instruction set available at runtime. + */ +typedef +uint32_t (*dpcls_subtable_lookup_func)(struct dpcls_subtable *subtable, + uint32_t keys_map, + const struct netdev_flow_key *keys[], + struct dpcls_rule **rules); + +/* Prototype for generic lookup func, using same code path as before. */ +uint32_t +dpcls_subtable_lookup_generic(struct dpcls_subtable *subtable, + uint32_t keys_map, + const struct netdev_flow_key *keys[], + struct dpcls_rule **rules); + /* A set of rules that all have the same fields wildcarded. */ struct dpcls_subtable { /* The fields are only used by writers. */ @@ -7692,6 +7714,13 @@ struct dpcls_subtable { struct cmap rules; /* Contains "struct dpcls_rule"s. */ uint32_t hit_cnt; /* Number of match hits in subtable in current optimization interval. 
*/ + + /* The lookup function to use for this subtable. If there is a known + * property of the subtable (eg: only 3 bits of miniflow metadata is + * used for the lookup) then this can point at an optimized version of + * the lookup function for this particular subtable. */ + dpcls_subtable_lookup_func lookup_func; + struct netdev_flow_key mask; /* Wildcards for fields (const). */ /* 'mask' must be the last field, additional space is allocated here. */ }; @@ -7751,6 +7780,10 @@ dpcls_create_subtable(struct dpcls *cls, const struct netdev_flow_key *mask) cmap_init(&subtable->rules); subtable->hit_cnt = 0; netdev_flow_key_clone(&subtable->mask, mask); + + /* Decide which hash/lookup/verify function to use. */ + subtable->lookup_func = dpcls_subtable_lookup_generic; + cmap_insert(&cls->subtables_map, &subtable->cmap_node, mask->hash); /* Add the new subtable at the end of the pvector (with no hits yet) */ pvector_insert(&cls->subtables, subtable, 0); @@ -7911,6 +7944,55 @@ dpcls_rule_matches_key(const struct dpcls_rule *rule, return true; } +uint32_t +dpcls_subtable_lookup_generic(struct dpcls_subtable *subtable, + uint32_t keys_map, + const struct netdev_flow_key *keys[], + struct dpcls_rule **rules) +{ + int i; + uint32_t found_map; + + /* Compute hashes for the remaining keys. Each search-key is + * masked with the subtable's mask to avoid hashing the wildcarded + * bits. */ + uint32_t hashes[NETDEV_MAX_BURST]; + ULLONG_FOR_EACH_1 (i, keys_map) { + hashes[i] = netdev_flow_key_hash_in_mask(keys[i], + &subtable->mask); + } + + /* Lookup. */ + const struct cmap_node *nodes[NETDEV_MAX_BURST]; + found_map = cmap_find_batch(&subtable->rules, keys_map, hashes, nodes); + + /* Check results. When the i-th bit of found_map is set, it means + * that a set of nodes with a matching hash value was found for the + * i-th search-key. Due to possible hash collisions we need to check + * which of the found rules, if any, really matches our masked + * search-key. 
*/ + ULLONG_FOR_EACH_1 (i, found_map) { + struct dpcls_rule *rule; + + CMAP_NODE_FOR_EACH (rule, cmap_node, nodes[i]) { + if (OVS_LIKELY(dpcls_rule_matches_key(rule, keys[i]))) { + rules[i] = rule; + /* Even at 20 Mpps the 32-bit hit_cnt cannot wrap + * within one second optimization interval. */ + subtable->hit_cnt++; + goto next; + } + } + /* None of the found rules was a match. Reset the i-th bit to + * keep searching this key in the next subtable. */ + ULLONG_SET0(found_map, i); /* Did not match. */ + next: + ; /* Keep Sparse happy. */ + } + + return found_map; +} + /* For each miniflow in 'keys' performs a classifier lookup writing the result * into the corresponding slot in 'rules'. If a particular entry in 'keys' is * NULL it is skipped. @@ -7929,16 +8011,12 @@ dpcls_lookup(struct dpcls *cls, const struct netdev_flow_key *keys[], /* The received 'cnt' miniflows are the search-keys that will be processed * to find a matching entry into the available subtables. * The number of bits in map_type is equal to NETDEV_MAX_BURST. */ - typedef uint32_t map_type; -#define MAP_BITS (sizeof(map_type) * CHAR_BIT) +#define MAP_BITS (sizeof(uint32_t) * CHAR_BIT) BUILD_ASSERT_DECL(MAP_BITS >= NETDEV_MAX_BURST); struct dpcls_subtable *subtable; - map_type keys_map = TYPE_MAXIMUM(map_type); /* Set all bits. */ - map_type found_map; - uint32_t hashes[MAP_BITS]; - const struct cmap_node *nodes[MAP_BITS]; + uint32_t keys_map = TYPE_MAXIMUM(uint32_t); /* Set all bits. */ if (cnt != MAP_BITS) { keys_map >>= MAP_BITS - cnt; /* Clear extra bits. */ @@ -7946,6 +8024,7 @@ dpcls_lookup(struct dpcls *cls, const struct netdev_flow_key *keys[], memset(rules, 0, cnt * sizeof *rules); int lookups_match = 0, subtable_pos = 1; + uint32_t found_map; /* The Datapath classifier - aka dpcls - is composed of subtables. * Subtables are dynamically created as needed when new rules are inserted. 
@@ -7955,52 +8034,27 @@ dpcls_lookup(struct dpcls *cls, const struct netdev_flow_key *keys[], * search-key, the search for that key can stop because the rules are * non-overlapping. */ PVECTOR_FOR_EACH (subtable, &cls->subtables) { - int i; + /* Call the subtable specific lookup function. */ + found_map = subtable->lookup_func(subtable, keys_map, keys, rules); - /* Compute hashes for the remaining keys. Each search-key is - * masked with the subtable's mask to avoid hashing the wildcarded - * bits. */ - ULLONG_FOR_EACH_1(i, keys_map) { - hashes[i] = netdev_flow_key_hash_in_mask(keys[i], - &subtable->mask); - } - /* Lookup. */ - found_map = cmap_find_batch(&subtable->rules, keys_map, hashes, nodes); - /* Check results. When the i-th bit of found_map is set, it means - * that a set of nodes with a matching hash value was found for the - * i-th search-key. Due to possible hash collisions we need to check - * which of the found rules, if any, really matches our masked - * search-key. */ - ULLONG_FOR_EACH_1(i, found_map) { - struct dpcls_rule *rule; + /* Count the number of subtables searched for this packet match. This + * estimates the "spread" of subtables looked at per matched packet. */ + uint32_t pkts_matched = count_1bits(found_map); + lookups_match += pkts_matched * subtable_pos; - CMAP_NODE_FOR_EACH (rule, cmap_node, nodes[i]) { - if (OVS_LIKELY(dpcls_rule_matches_key(rule, keys[i]))) { - rules[i] = rule; - /* Even at 20 Mpps the 32-bit hit_cnt cannot wrap - * within one second optimization interval. */ - subtable->hit_cnt++; - lookups_match += subtable_pos; - goto next; - } - } - /* None of the found rules was a match. Reset the i-th bit to - * keep searching this key in the next subtable. */ - ULLONG_SET0(found_map, i); /* Did not match. */ - next: - ; /* Keep Sparse happy. */ - } - keys_map &= ~found_map; /* Clear the found rules. */ + /* Clear the found rules, and return early if all packets are found. 
*/ + keys_map &= ~found_map; if (!keys_map) { if (num_lookups_p) { *num_lookups_p = lookups_match; } - return true; /* All found. */ + return true; } subtable_pos++; } + if (num_lookups_p) { *num_lookups_p = lookups_match; } - return false; /* Some misses. */ + return false; }
From patchwork Mon Jul 15 16:36:33 2019 X-Patchwork-Submitter: "Van Haaren, Harry" X-Patchwork-Id: 1132101 From: Harry van Haaren To: dev@openvswitch.org Date: Mon, 15 Jul 2019 17:36:33 +0100 Message-Id: <20190715163636.51572-3-harry.van.haaren@intel.com> In-Reply-To: <20190715163636.51572-1-harry.van.haaren@intel.com> References: <20190709123440.45519-1-harry.van.haaren@intel.com> <20190715163636.51572-1-harry.van.haaren@intel.com> Cc: i.maximets@samsung.com Subject: [ovs-dev] [PATCH v11 2/5] dpif-netdev: Move dpcls lookup structures to .h This commit moves some data-structures to be available in the dpif-netdev-private.h header. This allows specific implementations of the subtable lookup function to include just that header file, and not require that the code exists in dpif-netdev.c Signed-off-by: Harry van Haaren Tested-by: Malvika Gupta --- v11: - Rebase to latest master - Split netdev private header to own file (Ilya) v10: - Rebase updates from previous patch in code that moved. 
- Move cmap.h include into alphabetical order (Ian) - Fix comment and typo (Ian) - Restructure function typedef to fit in 80 chars v6: - Fix double * in code comments (Eelco) - Reword comment on lookup_func for clarity (Harry) - Rebase fixups --- lib/automake.mk | 1 + lib/dpif-netdev-private.h | 119 ++++++++++++++++++++++++++++++++++++++ lib/dpif-netdev.c | 68 +--------------------- 3 files changed, 123 insertions(+), 65 deletions(-) create mode 100644 lib/dpif-netdev-private.h diff --git a/lib/automake.mk b/lib/automake.mk index 1b89cac8c..6f216efe0 100644 --- a/lib/automake.mk +++ b/lib/automake.mk @@ -80,6 +80,7 @@ lib_libopenvswitch_la_SOURCES = \ lib/dpdk.h \ lib/dpif-netdev.c \ lib/dpif-netdev.h \ + lib/dpif-netdev-private.h \ lib/dpif-netdev-perf.c \ lib/dpif-netdev-perf.h \ lib/dpif-provider.h \ diff --git a/lib/dpif-netdev-private.h b/lib/dpif-netdev-private.h new file mode 100644 index 000000000..b235e23c8 --- /dev/null +++ b/lib/dpif-netdev-private.h @@ -0,0 +1,119 @@ +/* + * Copyright (c) 2008, 2009, 2010, 2011, 2012, 2013, 2015 Nicira, Inc. + * Copyright (c) 2019 Intel Corporation. + * + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at: + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +#ifndef DPIF_NETDEV_PRIVATE_H +#define DPIF_NETDEV_PRIVATE_H 1 + +#include +#include + +#include "dpif.h" +#include "cmap.h" + +#ifdef __cplusplus +extern "C" { +#endif + +/* Forward declaration for lookup_func typedef. 
*/ +struct dpcls_subtable; +struct dpcls_rule; + +/* Must be public as it is instantiated in subtable struct below. */ +struct netdev_flow_key { + uint32_t hash; /* Hash function differs for different users. */ + uint32_t len; /* Length of the following miniflow (incl. map). */ + struct miniflow mf; + uint64_t buf[FLOW_MAX_PACKET_U64S]; +}; + +/* A rule to be inserted to the classifier. */ +struct dpcls_rule { + struct cmap_node cmap_node; /* Within struct dpcls_subtable 'rules'. */ + struct netdev_flow_key *mask; /* Subtable's mask. */ + struct netdev_flow_key flow; /* Matching key. */ + /* 'flow' must be the last field, additional space is allocated here. */ +}; + +/* Lookup function for a subtable in the dpcls. This function is called + * by each subtable with an array of packets, and a bitmask of packets to + * perform the lookup on. Using a function pointer gives flexibility to + * optimize the lookup function based on subtable properties and the + * CPU instruction set available at runtime. + */ +typedef +uint32_t (*dpcls_subtable_lookup_func)(struct dpcls_subtable *subtable, + uint32_t keys_map, + const struct netdev_flow_key *keys[], + struct dpcls_rule **rules); + +/* Prototype for generic lookup func, using same code path as before. */ +uint32_t +dpcls_subtable_lookup_generic(struct dpcls_subtable *subtable, + uint32_t keys_map, + const struct netdev_flow_key *keys[], + struct dpcls_rule **rules); + +/* A set of rules that all have the same fields wildcarded. */ +struct dpcls_subtable { + /* The fields are only used by writers. */ + struct cmap_node cmap_node OVS_GUARDED; /* Within dpcls 'subtables_map'. */ + + /* These fields are accessed by readers. */ + struct cmap rules; /* Contains "struct dpcls_rule"s. */ + uint32_t hit_cnt; /* Number of match hits in subtable in current + optimization interval. */ + + /* Miniflow fingerprint that the subtable matches on. 
The miniflow "bits" + * are used to select the actual dpcls lookup implementation at subtable + * creation time. + */ + uint8_t mf_bits_set_unit0; + uint8_t mf_bits_set_unit1; + + /* the lookup function to use for this subtable. If there is a known + * property of the subtable (eg: only 3 bits of miniflow metadata is + * used for the lookup) then this can point at an optimized version of + * the lookup function for this particular subtable. */ + dpcls_subtable_lookup_func lookup_func; + + /* caches the masks to match a packet to, reducing runtime calculations */ + uint64_t *mf_masks; + + struct netdev_flow_key mask; /* Wildcards for fields (const). */ + /* 'mask' must be the last field, additional space is allocated here. */ +}; + +/* Iterate through netdev_flow_key TNL u64 values specified by 'FLOWMAP'. */ +#define NETDEV_FLOW_KEY_FOR_EACH_IN_FLOWMAP(VALUE, KEY, FLOWMAP) \ + MINIFLOW_FOR_EACH_IN_FLOWMAP (VALUE, &(KEY)->mf, FLOWMAP) + +/* Generates a mask for each bit set in the subtable's miniflow. */ +void +netdev_flow_key_gen_masks(const struct netdev_flow_key *tbl, + uint64_t *mf_masks, + const uint32_t mf_bits_u0, + const uint32_t mf_bits_u1); + +/* Matches a dpcls rule against the incoming packet in 'target' */ +bool dpcls_rule_matches_key(const struct dpcls_rule *rule, + const struct netdev_flow_key *target); + +#ifdef __cplusplus +} +#endif + +#endif /* netdev-private.h */ diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c index 123f04577..749a478a8 100644 --- a/lib/dpif-netdev.c +++ b/lib/dpif-netdev.c @@ -16,6 +16,7 @@ #include #include "dpif-netdev.h" +#include "dpif-netdev-private.h" #include #include @@ -128,15 +129,6 @@ static struct odp_support dp_netdev_support = { .ct_orig_tuple6 = true, }; -/* Stores a miniflow with inline values */ - -struct netdev_flow_key { - uint32_t hash; /* Hash function differs for different users. */ - uint32_t len; /* Length of the following miniflow (incl. map). 
*/ - struct miniflow mf; - uint64_t buf[FLOW_MAX_PACKET_U64S]; -}; - /* EMC cache and SMC cache compose the datapath flow cache (DFC) * * Exact match cache for frequently used flows @@ -243,14 +235,6 @@ struct dpcls { struct pvector subtables; }; -/* A rule to be inserted to the classifier. */ -struct dpcls_rule { - struct cmap_node cmap_node; /* Within struct dpcls_subtable 'rules'. */ - struct netdev_flow_key *mask; /* Subtable's mask. */ - struct netdev_flow_key flow; /* Matching key. */ - /* 'flow' must be the last field, additional space is allocated here. */ -}; - /* Data structure to keep packet order till fastpath processing. */ struct dp_packet_flow_map { struct dp_packet *packet; @@ -268,7 +252,7 @@ static bool dpcls_lookup(struct dpcls *cls, const struct netdev_flow_key *keys[], struct dpcls_rule **rules, size_t cnt, int *num_lookups_p); -static bool dpcls_rule_matches_key(const struct dpcls_rule *rule, +bool dpcls_rule_matches_key(const struct dpcls_rule *rule, const struct netdev_flow_key *target); /* Set of supported meter flags */ #define DP_SUPPORTED_METER_FLAGS_MASK \ @@ -2784,10 +2768,6 @@ netdev_flow_key_init_masked(struct netdev_flow_key *dst, (dst_u64 - miniflow_get_values(&dst->mf)) * 8); } -/* Iterate through netdev_flow_key TNL u64 values specified by 'FLOWMAP'. */ -#define NETDEV_FLOW_KEY_FOR_EACH_IN_FLOWMAP(VALUE, KEY, FLOWMAP) \ - MINIFLOW_FOR_EACH_IN_FLOWMAP(VALUE, &(KEY)->mf, FLOWMAP) - /* Returns a hash value for the bits of 'key' where there are 1-bits in * 'mask'. */ static inline uint32_t @@ -7683,48 +7663,6 @@ dpif_dummy_register(enum dummy_level level) /* Datapath Classifier. */ -/* Forward declaration for lookup_func typedef. */ -struct dpcls_subtable; - -/* Lookup function for a subtable in the dpcls. This function is called - * by each subtable with an array of packets, and a bitmask of packets to - * perform the lookup on. 
Using a function pointer gives flexibility to - * optimize the lookup function based on subtable properties and the - * CPU instruction set available at runtime. - */ -typedef -uint32_t (*dpcls_subtable_lookup_func)(struct dpcls_subtable *subtable, - uint32_t keys_map, - const struct netdev_flow_key *keys[], - struct dpcls_rule **rules); - -/* Prototype for generic lookup func, using same code path as before. */ -uint32_t -dpcls_subtable_lookup_generic(struct dpcls_subtable *subtable, - uint32_t keys_map, - const struct netdev_flow_key *keys[], - struct dpcls_rule **rules); - -/* A set of rules that all have the same fields wildcarded. */ -struct dpcls_subtable { - /* The fields are only used by writers. */ - struct cmap_node cmap_node OVS_GUARDED; /* Within dpcls 'subtables_map'. */ - - /* These fields are accessed by readers. */ - struct cmap rules; /* Contains "struct dpcls_rule"s. */ - uint32_t hit_cnt; /* Number of match hits in subtable in current - optimization interval. */ - - /* The lookup function to use for this subtable. If there is a known - * property of the subtable (eg: only 3 bits of miniflow metadata is - * used for the lookup) then this can point at an optimized version of - * the lookup function for this particular subtable. */ - dpcls_subtable_lookup_func lookup_func; - - struct netdev_flow_key mask; /* Wildcards for fields (const). */ - /* 'mask' must be the last field, additional space is allocated here. */ -}; - static void dpcls_subtable_destroy_cb(struct dpcls_subtable *subtable) { @@ -7928,7 +7866,7 @@ dpcls_remove(struct dpcls *cls, struct dpcls_rule *rule) /* Returns true if 'target' satisfies 'key' in 'mask', that is, if each 1-bit * in 'mask' the values in 'key' and 'target' are the same. 
*/ -static bool +bool dpcls_rule_matches_key(const struct dpcls_rule *rule, const struct netdev_flow_key *target) {
From patchwork Mon Jul 15 16:36:34 2019 X-Patchwork-Submitter: "Van Haaren, Harry" X-Patchwork-Id: 1132102 From: Harry van Haaren To: dev@openvswitch.org Date: Mon, 15 Jul 2019 17:36:34 +0100 Message-Id: <20190715163636.51572-4-harry.van.haaren@intel.com> In-Reply-To: <20190715163636.51572-1-harry.van.haaren@intel.com> References: <20190709123440.45519-1-harry.van.haaren@intel.com> <20190715163636.51572-1-harry.van.haaren@intel.com> Cc: i.maximets@samsung.com Subject: [ovs-dev] [PATCH v11 3/5] dpif-netdev: Split out generic lookup function This commit splits the generic hash-lookup-verify function into its own file, for cleaner separation between optimized versions. Signed-off-by: Harry van Haaren Tested-by: Malvika Gupta --- v11: - Rebase fixups from previous patches - Spacing around ULLONG_FOR_EACH_1 (Ilya) v10: - Rebase fixups from previous patch changes v6: - Fixup some checkpatch warnings on whitespace with MACROs (Ilya) - Other MACROs function incorrectly when checkpatch is happy, so using the functional version without space after the ( character. This prints a checkpatch warning, but I see no way to fix it. 
--- lib/automake.mk | 1 + lib/dpif-netdev-lookup-generic.c | 97 ++++++++++++++++++++++++++++++++ lib/dpif-netdev.c | 69 +---------------------- 3 files changed, 99 insertions(+), 68 deletions(-) create mode 100644 lib/dpif-netdev-lookup-generic.c diff --git a/lib/automake.mk b/lib/automake.mk index 6f216efe0..29d3458da 100644 --- a/lib/automake.mk +++ b/lib/automake.mk @@ -78,6 +78,7 @@ lib_libopenvswitch_la_SOURCES = \ lib/dp-packet.h \ lib/dp-packet.c \ lib/dpdk.h \ + lib/dpif-netdev-lookup-generic.c \ lib/dpif-netdev.c \ lib/dpif-netdev.h \ lib/dpif-netdev-private.h \ diff --git a/lib/dpif-netdev-lookup-generic.c b/lib/dpif-netdev-lookup-generic.c new file mode 100644 index 000000000..833abf54f --- /dev/null +++ b/lib/dpif-netdev-lookup-generic.c @@ -0,0 +1,97 @@ +/* + * Copyright (c) 2009, 2010, 2011, 2012, 2013, 2014, 2016, 2017 Nicira, Inc. + * Copyright (c) 2019 Intel Corporation. + * + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at: + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +#include +#include "dpif-netdev.h" +#include "dpif-netdev-private.h" + +#include "bitmap.h" +#include "cmap.h" + +#include "dp-packet.h" +#include "dpif.h" +#include "dpif-netdev-perf.h" +#include "dpif-provider.h" +#include "flow.h" +#include "packets.h" +#include "pvector.h" + +/* Returns a hash value for the bits of 'key' where there are 1-bits in + * 'mask'. 
*/
+static inline uint32_t
+netdev_flow_key_hash_in_mask(const struct netdev_flow_key *key,
+                             const struct netdev_flow_key *mask)
+{
+    const uint64_t *p = miniflow_get_values(&mask->mf);
+    uint32_t hash = 0;
+    uint64_t value;
+
+    NETDEV_FLOW_KEY_FOR_EACH_IN_FLOWMAP (value, key, mask->mf.map) {
+        hash = hash_add64(hash, value & *p);
+        p++;
+    }
+
+    return hash_finish(hash, (p - miniflow_get_values(&mask->mf)) * 8);
+}
+
+uint32_t
+dpcls_subtable_lookup_generic(struct dpcls_subtable *subtable,
+                              uint32_t keys_map,
+                              const struct netdev_flow_key *keys[],
+                              struct dpcls_rule **rules)
+{
+    int i;
+    uint32_t found_map;
+
+    /* Compute hashes for the remaining keys. Each search-key is
+     * masked with the subtable's mask to avoid hashing the wildcarded
+     * bits. */
+    uint32_t hashes[NETDEV_MAX_BURST];
+    ULLONG_FOR_EACH_1 (i, keys_map) {
+        hashes[i] = netdev_flow_key_hash_in_mask(keys[i], &subtable->mask);
+    }
+
+    /* Lookup. */
+    const struct cmap_node *nodes[NETDEV_MAX_BURST];
+    found_map = cmap_find_batch(&subtable->rules, keys_map, hashes, nodes);
+
+    /* Check results. When the i-th bit of found_map is set, it means
+     * that a set of nodes with a matching hash value was found for the
+     * i-th search-key. Due to possible hash collisions we need to check
+     * which of the found rules, if any, really matches our masked
+     * search-key. */
+    ULLONG_FOR_EACH_1 (i, found_map) {
+        struct dpcls_rule *rule;
+
+        CMAP_NODE_FOR_EACH (rule, cmap_node, nodes[i]) {
+            if (OVS_LIKELY(dpcls_rule_matches_key(rule, keys[i]))) {
+                rules[i] = rule;
+                /* Even at 20 Mpps the 32-bit hit_cnt cannot wrap
+                 * within one second optimization interval. */
+                subtable->hit_cnt++;
+                goto next;
+            }
+        }
+        /* None of the found rules was a match. Reset the i-th bit to
+         * keep searching this key in the next subtable. */
+        ULLONG_SET0(found_map, i);  /* Did not match. */
+    next:
+        ;                     /* Keep Sparse happy. */
+    }
+
+    return found_map;
+}
diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index 749a478a8..b42ca35e3 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -252,8 +252,7 @@ static bool dpcls_lookup(struct dpcls *cls,
                          const struct netdev_flow_key *keys[],
                          struct dpcls_rule **rules, size_t cnt,
                          int *num_lookups_p);
-bool dpcls_rule_matches_key(const struct dpcls_rule *rule,
-                            const struct netdev_flow_key *target);
+
 /* Set of supported meter flags */
 #define DP_SUPPORTED_METER_FLAGS_MASK \
     (OFPMF13_STATS | OFPMF13_PKTPS | OFPMF13_KBPS | OFPMF13_BURST)
@@ -2768,23 +2767,6 @@ netdev_flow_key_init_masked(struct netdev_flow_key *dst,
                             (dst_u64 - miniflow_get_values(&dst->mf)) * 8);
 }
 
-/* Returns a hash value for the bits of 'key' where there are 1-bits in
- * 'mask'. */
-static inline uint32_t
-netdev_flow_key_hash_in_mask(const struct netdev_flow_key *key,
-                             const struct netdev_flow_key *mask)
-{
-    const uint64_t *p = miniflow_get_values(&mask->mf);
-    uint32_t hash = 0;
-    uint64_t value;
-
-    NETDEV_FLOW_KEY_FOR_EACH_IN_FLOWMAP(value, key, mask->mf.map) {
-        hash = hash_add64(hash, value & *p++);
-    }
-
-    return hash_finish(hash, (p - miniflow_get_values(&mask->mf)) * 8);
-}
-
 static inline bool
 emc_entry_alive(struct emc_entry *ce)
 {
@@ -7882,55 +7864,6 @@ dpcls_rule_matches_key(const struct dpcls_rule *rule,
     return true;
 }
 
-uint32_t
-dpcls_subtable_lookup_generic(struct dpcls_subtable *subtable,
-                              uint32_t keys_map,
-                              const struct netdev_flow_key *keys[],
-                              struct dpcls_rule **rules)
-{
-    int i;
-    uint32_t found_map;
-
-    /* Compute hashes for the remaining keys. Each search-key is
-     * masked with the subtable's mask to avoid hashing the wildcarded
-     * bits. */
-    uint32_t hashes[NETDEV_MAX_BURST];
-    ULLONG_FOR_EACH_1 (i, keys_map) {
-        hashes[i] = netdev_flow_key_hash_in_mask(keys[i],
-                                                 &subtable->mask);
-    }
-
-    /* Lookup. */
-    const struct cmap_node *nodes[NETDEV_MAX_BURST];
-    found_map = cmap_find_batch(&subtable->rules, keys_map, hashes, nodes);
-
-    /* Check results. When the i-th bit of found_map is set, it means
-     * that a set of nodes with a matching hash value was found for the
-     * i-th search-key. Due to possible hash collisions we need to check
-     * which of the found rules, if any, really matches our masked
-     * search-key. */
-    ULLONG_FOR_EACH_1 (i, found_map) {
-        struct dpcls_rule *rule;
-
-        CMAP_NODE_FOR_EACH (rule, cmap_node, nodes[i]) {
-            if (OVS_LIKELY(dpcls_rule_matches_key(rule, keys[i]))) {
-                rules[i] = rule;
-                /* Even at 20 Mpps the 32-bit hit_cnt cannot wrap
-                 * within one second optimization interval. */
-                subtable->hit_cnt++;
-                goto next;
-            }
-        }
-        /* None of the found rules was a match. Reset the i-th bit to
-         * keep searching this key in the next subtable. */
-        ULLONG_SET0(found_map, i);  /* Did not match. */
-    next:
-        ;                     /* Keep Sparse happy. */
-    }
-
-    return found_map;
-}
-
 /* For each miniflow in 'keys' performs a classifier lookup writing the result
  * into the corresponding slot in 'rules'. If a particular entry in 'keys' is
  * NULL it is skipped.
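
Reader's note on the "Check results" step in the lookup function moved by this patch: a hash hit only proves that some rule shares the hash, so each hit is verified against the masked key, and the bit is cleared when no rule really matches. The following is a minimal standalone sketch of that idea, not the OVS code. Every name prefixed with demo_ is hypothetical, a plain linked list stands in for the cmap node chain, and a single 64-bit value stands in for the miniflow.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical rule type: 'masked_key' is the key value after masking. */
struct demo_rule {
    uint64_t masked_key;
    const struct demo_rule *next;   /* Next rule that shares the hash. */
};

/* For every bit i set in 'found_map', buckets[i] is the chain of rules whose
 * hash equals the hash of keys[i]. Walk the chain, compare the masked key,
 * and clear bit i when nothing really matches. Returns the verified map. */
static uint32_t
demo_check_results(uint32_t found_map, uint64_t mask,
                   const uint64_t keys[],
                   const struct demo_rule *const buckets[],
                   const struct demo_rule *rules[])
{
    assert(keys && buckets && rules);

    for (int i = 0; i < 32; i++) {
        const struct demo_rule *r;

        if (!(found_map & (UINT32_C(1) << i))) {
            continue;               /* No hash hit for key i. */
        }
        for (r = buckets[i]; r; r = r->next) {
            if ((keys[i] & mask) == r->masked_key) {
                break;              /* Real match, not just a hash match. */
            }
        }
        if (r) {
            rules[i] = r;
        } else {
            found_map &= ~(UINT32_C(1) << i);   /* Hash collision only. */
        }
    }
    return found_map;
}
```

The caller would retry the keys whose bits were cleared against the next subtable, which is the role found_map plays in dpcls_lookup().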

From patchwork Mon Jul 15 16:36:35 2019
X-Patchwork-Submitter: "Van Haaren, Harry"
X-Patchwork-Id: 1132104
From: Harry van Haaren
To: dev@openvswitch.org
Date: Mon, 15 Jul 2019 17:36:35 +0100
Message-Id: <20190715163636.51572-5-harry.van.haaren@intel.com>
In-Reply-To: <20190715163636.51572-1-harry.van.haaren@intel.com>
References: <20190709123440.45519-1-harry.van.haaren@intel.com>
 <20190715163636.51572-1-harry.van.haaren@intel.com>
Cc: i.maximets@samsung.com
Subject: [ovs-dev] [PATCH v11 4/5] dpif-netdev: Refactor generic implementation

This commit refactors the generic implementation. The
goal of this refactor is to simplify the code to enable
"specialization" of the functions at compile time.

Given compile-time optimizations, the compiler is able
to unroll loops and create optimized code sequences, due
to compile-time knowledge of loop-trip counts.

In order to enable these compiler optimizations, we must
refactor the code to pass the loop-trip counts to functions
as compile-time constants.

This patch allows the number of miniflow-bits set per "unit"
in the miniflow to be passed around as a function argument.

Note that this patch does NOT yet take advantage of doing so;
it is only a refactor to enable it in the next patches.

Signed-off-by: Harry van Haaren
Tested-by: Malvika Gupta
---
v11:
- Rebased to previous changes
- Fix typo in commit message (Ian)
- Fix variable declaration spacing (Ian)
- Remove function names from comments (Ian)
- Replace magic 8 with sizeof(uint64_t) (Ian)
- Capitalize and end comments with a stop.
  (Ian/Ilya)
- Add build time assert to validate FLOWMAP_UNITS (Ilya)
- Add note on ALWAYS_INLINE operation
- Add space after ULLONG_FOR_EACH_1 (Ilya)
- Use hash_add_words64() instead of rolling own loop (Ilya)
  Note that hash_words64_inline() calls hash_finish() with a fixed value,
  so it was not the right hash function for this usage. Used
  hash_add_words64() and manual hash_finish() to re-use as much of the
  hashing code as we can.

v10:
- Rebase updates from previous patches
- Fix whitespace indentation of func params
- Removed restrict keyword, Windows CI failing when it is used (Ian)
- Fix integer 0 used to set NULL pointer (Ilya)
- Postpone free() call on cls->blocks_scratch (Ilya)
- Fix indentation of a function

v9:
- Use count_1bits in favour of __builtin_popcount (Ilya)
- Use ALWAYS_INLINE instead of __attribute__ syntax (Ilya)

v8:
- Rework block_cache and mf_masks to avoid variable-length array due to
  compiler issues. Provisioning for worst case is not a good solution due
  to the magnitude of over-provisioning required.
- Rework netdev_flatten function removing unused parameter
---
 lib/dpif-netdev-lookup-generic.c | 214 +++++++++++++++++++++++++------
 lib/dpif-netdev-private.h        |   8 +-
 lib/dpif-netdev.c                |  77 ++++++++++-
 3 files changed, 258 insertions(+), 41 deletions(-)

diff --git a/lib/dpif-netdev-lookup-generic.c b/lib/dpif-netdev-lookup-generic.c
index 833abf54f..abd166fc3 100644
--- a/lib/dpif-netdev-lookup-generic.c
+++ b/lib/dpif-netdev-lookup-generic.c
@@ -30,68 +30,210 @@
 #include "packets.h"
 #include "pvector.h"
 
-/* Returns a hash value for the bits of 'key' where there are 1-bits in
- * 'mask'. */
-static inline uint32_t
-netdev_flow_key_hash_in_mask(const struct netdev_flow_key *key,
-                             const struct netdev_flow_key *mask)
+VLOG_DEFINE_THIS_MODULE(dpif_lookup_generic);
+
+/* Lookup functions below depend on the internal structure of flowmap. */
+BUILD_ASSERT_DECL(FLOWMAP_UNITS == 2);
+
+/* Given a packet, table and mf_masks, this function iterates over each bit
+ * set in the subtable, and calculates the appropriate metadata to store in the
+ * blocks_scratch[].
+ *
+ * The results of the blocks_scratch[] can be used for hashing, and later for
+ * verification of whether a rule matches the given packet.
+ */
+static inline void
+netdev_flow_key_flatten_unit(const uint64_t *pkt_blocks,
+                             const uint64_t *tbl_blocks,
+                             const uint64_t *mf_masks,
+                             uint64_t *blocks_scratch,
+                             const uint64_t pkt_mf_bits,
+                             const uint32_t count)
 {
-    const uint64_t *p = miniflow_get_values(&mask->mf);
-    uint32_t hash = 0;
-    uint64_t value;
+    uint32_t i;
+
+    for (i = 0; i < count; i++) {
+        uint64_t mf_mask = mf_masks[i];
+        /* Calculate the block index for the packet metadata. */
+        uint64_t idx_bits = mf_mask & pkt_mf_bits;
+        const uint32_t pkt_idx = count_1bits(idx_bits);
+
+        /* Check if the packet has the subtable miniflow bit set. If yes, the
+         * block at the above pkt_idx will be stored, otherwise it is masked
+         * out to be zero.
+         */
+        uint64_t pkt_has_mf_bit = (mf_mask + 1) & pkt_mf_bits;
+        uint64_t no_bit = ((!pkt_has_mf_bit) > 0) - 1;
 
-    NETDEV_FLOW_KEY_FOR_EACH_IN_FLOWMAP (value, key, mask->mf.map) {
-        hash = hash_add64(hash, value & *p);
-        p++;
+        /* Mask packet block by table block, and mask to zero if packet
+         * doesn't actually contain this block of metadata.
+         */
+        blocks_scratch[i] = pkt_blocks[pkt_idx] & tbl_blocks[i] & no_bit;
     }
+}
+
+/* This function takes a packet, and subtable and writes an array of uint64_t
+ * blocks. The blocks contain the metadata that the subtable matches on, in
+ * the same order as the subtable, allowing linear iteration over the blocks.
+ *
+ * To calculate the blocks contents, the netdev_flow_key_flatten_unit function
+ * is called twice, once for each "unit" of the miniflow. This call can be
+ * inlined by the compiler for performance.
+ *
+ * Note that the u0_count and u1_count variables can be compile-time constants,
+ * allowing the loop in the inlined flatten_unit() function to be compile-time
+ * unrolled, or possibly removed totally by unrolling by the loop iterations.
+ * The compile time optimizations enabled by this design improve performance.
+ */
+static inline void
+netdev_flow_key_flatten(const struct netdev_flow_key *key,
+                        const struct netdev_flow_key *mask,
+                        const uint64_t *mf_masks,
+                        uint64_t *blocks_scratch,
+                        const uint32_t u0_count,
+                        const uint32_t u1_count)
+{
+    /* Load mask from subtable, mask with packet mf, popcount to get idx. */
+    const uint64_t *pkt_blocks = miniflow_get_values(&key->mf);
+    const uint64_t *tbl_blocks = miniflow_get_values(&mask->mf);
 
-    return hash_finish(hash, (p - miniflow_get_values(&mask->mf)) * 8);
+    /* Packet miniflow bits to be masked by pre-calculated mf_masks. */
+    const uint64_t pkt_bits_u0 = key->mf.map.bits[0];
+    const uint32_t pkt_bits_u0_pop = count_1bits(pkt_bits_u0);
+    const uint64_t pkt_bits_u1 = key->mf.map.bits[1];
+
+    /* Unit 0 flattening */
+    netdev_flow_key_flatten_unit(&pkt_blocks[0],
+                                 &tbl_blocks[0],
+                                 &mf_masks[0],
+                                 &blocks_scratch[0],
+                                 pkt_bits_u0,
+                                 u0_count);
+
+    /* Unit 1 flattening:
+     * Move the pointers forward in the arrays based on u0 offsets, NOTE:
+     * 1) pkt blocks indexed by actual popcount of u0, which is NOT always
+     *    the same as the amount of bits set in the subtable.
+     * 2) mf_masks, tbl_block and blocks_scratch are all "flat" arrays, so
+     *    the index is always u0_count.
+     */
+    netdev_flow_key_flatten_unit(&pkt_blocks[pkt_bits_u0_pop],
+                                 &tbl_blocks[u0_count],
+                                 &mf_masks[u0_count],
+                                 &blocks_scratch[u0_count],
+                                 pkt_bits_u1,
+                                 u1_count);
 }
 
-uint32_t
-dpcls_subtable_lookup_generic(struct dpcls_subtable *subtable,
-                              uint32_t keys_map,
-                              const struct netdev_flow_key *keys[],
-                              struct dpcls_rule **rules)
+/* Compares a rule and the blocks representing a key, returns 1 on a match. */
+static inline uint64_t
+netdev_rule_matches_key(const struct dpcls_rule *rule,
+                        const uint32_t mf_bits_total,
+                        const uint64_t *blocks_scratch)
 {
-    int i;
-    uint32_t found_map;
+    const uint64_t *keyp = miniflow_get_values(&rule->flow.mf);
+    const uint64_t *maskp = miniflow_get_values(&rule->mask->mf);
+    uint64_t not_match = 0;
+
+    for (int i = 0; i < mf_bits_total; i++) {
+        not_match |= (blocks_scratch[i] & maskp[i]) != keyp[i];
+    }
 
-    /* Compute hashes for the remaining keys. Each search-key is
-     * masked with the subtable's mask to avoid hashing the wildcarded
-     * bits. */
+    /* invert result to show match as 1 */
+    return !not_match;
+}
+
+/* Const prop version of the function: note that mf bits total and u0 are
+ * explicitly passed in here, while they're also available at runtime from the
+ * subtable pointer. By making them compile time, we enable the compiler to
+ * unroll loops and flatten out code-sequences based on the knowledge of the
+ * mf_bits_* compile time values. This results in improved performance.
+ *
+ * Note: this function is marked with ALWAYS_INLINE to ensure the compiler
+ * inlines the below code, and then uses the compile time constants to make
+ * specialized versions of the runtime code. Without ALWAYS_INLINE, the
+ * compiler might decide to not inline, and performance will suffer.
+ */
+static inline uint32_t ALWAYS_INLINE
+lookup_generic_impl(struct dpcls_subtable *subtable,
+                    uint64_t *blocks_scratch,
+                    uint32_t keys_map,
+                    const struct netdev_flow_key *keys[],
+                    struct dpcls_rule **rules,
+                    const uint32_t bit_count_u0,
+                    const uint32_t bit_count_u1)
+{
+    const uint32_t n_pkts = count_1bits(keys_map);
+    ovs_assert(NETDEV_MAX_BURST >= n_pkts);
     uint32_t hashes[NETDEV_MAX_BURST];
+
+    const uint32_t bit_count_total = bit_count_u0 + bit_count_u1;
+    uint64_t *mf_masks = subtable->mf_masks;
+    int i;
+
+    /* Flatten the packet metadata into the blocks_scratch[] using subtable. */
+    ULLONG_FOR_EACH_1 (i, keys_map) {
+        netdev_flow_key_flatten(keys[i],
+                                &subtable->mask,
+                                mf_masks,
+                                &blocks_scratch[i * bit_count_total],
+                                bit_count_u0,
+                                bit_count_u1);
+    }
+
+    /* Hash the now linearized blocks of packet metadata. */
     ULLONG_FOR_EACH_1 (i, keys_map) {
-        hashes[i] = netdev_flow_key_hash_in_mask(keys[i], &subtable->mask);
+        uint64_t *block_ptr = &blocks_scratch[i * bit_count_total];
+        uint32_t hash = hash_add_words64(0, block_ptr, bit_count_total);
+        hashes[i] = hash_finish(hash, bit_count_total * 8);
     }
 
-    /* Lookup. */
+    /* Lookup: this returns a bitmask of packets where the hash table had
+     * an entry for the given hash key. Presence of a hash key does not
+     * guarantee matching the key, as there can be hash collisions.
+     */
+    uint32_t found_map;
     const struct cmap_node *nodes[NETDEV_MAX_BURST];
+
     found_map = cmap_find_batch(&subtable->rules, keys_map, hashes, nodes);
 
-    /* Check results. When the i-th bit of found_map is set, it means
-     * that a set of nodes with a matching hash value was found for the
-     * i-th search-key. Due to possible hash collisions we need to check
-     * which of the found rules, if any, really matches our masked
-     * search-key. */
+    /* Verify that packet actually matched rule. If not found, a hash
+     * collision has taken place, so continue searching with the next node.
+     */
     ULLONG_FOR_EACH_1 (i, found_map) {
         struct dpcls_rule *rule;
 
         CMAP_NODE_FOR_EACH (rule, cmap_node, nodes[i]) {
-            if (OVS_LIKELY(dpcls_rule_matches_key(rule, keys[i]))) {
+            const uint32_t cidx = i * bit_count_total;
+            uint32_t match = netdev_rule_matches_key(rule, bit_count_total,
+                                                     &blocks_scratch[cidx]);
+
+            if (OVS_LIKELY(match)) {
                 rules[i] = rule;
-                /* Even at 20 Mpps the 32-bit hit_cnt cannot wrap
-                 * within one second optimization interval. */
                 subtable->hit_cnt++;
                 goto next;
             }
         }
-        /* None of the found rules was a match. Reset the i-th bit to
-         * keep searching this key in the next subtable. */
-        ULLONG_SET0(found_map, i);  /* Did not match. */
+
+        /* None of the found rules was a match. Clear the i-th bit to
+         * search for this key in the next subtable. */
+        ULLONG_SET0(found_map, i);
     next:
-        ;                     /* Keep Sparse happy. */
+        ; /* Keep Sparse happy. */
     }
 
     return found_map;
 }
+
+/* Generic lookup function that uses runtime provided mf bits for iterating. */
+uint32_t
+dpcls_subtable_lookup_generic(struct dpcls_subtable *subtable,
+                              uint64_t *blocks_scratch,
+                              uint32_t keys_map,
+                              const struct netdev_flow_key *keys[],
+                              struct dpcls_rule **rules)
+{
+    return lookup_generic_impl(subtable, blocks_scratch, keys_map, keys, rules,
+                               subtable->mf_bits_set_unit0,
+                               subtable->mf_bits_set_unit1);
+}
diff --git a/lib/dpif-netdev-private.h b/lib/dpif-netdev-private.h
index b235e23c8..e1ceaa641 100644
--- a/lib/dpif-netdev-private.h
+++ b/lib/dpif-netdev-private.h
@@ -56,13 +56,15 @@ struct dpcls_rule {
  */
 typedef
 uint32_t (*dpcls_subtable_lookup_func)(struct dpcls_subtable *subtable,
+                                       uint64_t *blocks_scratch,
                                        uint32_t keys_map,
                                        const struct netdev_flow_key *keys[],
                                        struct dpcls_rule **rules);
 
-/* Prototype for generic lookup func, using same code path as before. */
+/* Prototype for generic lookup func, using generic scalar code path. */
 uint32_t
 dpcls_subtable_lookup_generic(struct dpcls_subtable *subtable,
+                              uint64_t *blocks_scratch,
                               uint32_t keys_map,
                               const struct netdev_flow_key *keys[],
                               struct dpcls_rule **rules);
@@ -84,13 +86,13 @@ struct dpcls_subtable {
     uint8_t mf_bits_set_unit0;
     uint8_t mf_bits_set_unit1;
 
-    /* the lookup function to use for this subtable. If there is a known
+    /* The lookup function to use for this subtable. If there is a known
      * property of the subtable (eg: only 3 bits of miniflow metadata is
      * used for the lookup) then this can point at an optimized version of
      * the lookup function for this particular subtable. */
     dpcls_subtable_lookup_func lookup_func;
 
-    /* caches the masks to match a packet to, reducing runtime calculations */
+    /* Caches the masks to match a packet to, reducing runtime calculations. */
     uint64_t *mf_masks;
 
     struct netdev_flow_key mask; /* Wildcards for fields (const). */
diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index b42ca35e3..8acc1445a 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -233,6 +233,15 @@ struct dpcls {
     odp_port_t in_port;
     struct cmap subtables_map;
     struct pvector subtables;
+
+    /* Region of memory for this DPCLS instance to use as scratch.
+     * Size is guaranteed to be large enough to hold all blocks required for
+     * the subtables to match on. This allows each dpcls lookup to flatten
+     * the packet miniflows into this blocks_scratch area, without using
+     * variable length arrays. This region is allocated on subtable create, and
+     * will be resized as required if a larger subtable is added. */
+    uint64_t *blocks_scratch;
+    uint32_t blocks_scratch_size;
 };
 
 /* Data structure to keep packet order till fastpath processing. */
@@ -7649,6 +7658,7 @@ static void
 dpcls_subtable_destroy_cb(struct dpcls_subtable *subtable)
 {
     cmap_destroy(&subtable->rules);
+    ovsrcu_postpone(free, subtable->mf_masks);
     ovsrcu_postpone(free, subtable);
 }
 
@@ -7659,6 +7669,8 @@ dpcls_init(struct dpcls *cls)
 {
     cmap_init(&cls->subtables_map);
     pvector_init(&cls->subtables);
+    cls->blocks_scratch = NULL;
+    cls->blocks_scratch_size = 0;
 }
 
 static void
@@ -7686,6 +7698,7 @@ dpcls_destroy(struct dpcls *cls)
         }
         cmap_destroy(&cls->subtables_map);
         pvector_destroy(&cls->subtables);
+        ovsrcu_postpone(free, cls->blocks_scratch);
     }
 }
 
@@ -7701,7 +7714,28 @@ dpcls_create_subtable(struct dpcls *cls, const struct netdev_flow_key *mask)
     subtable->hit_cnt = 0;
     netdev_flow_key_clone(&subtable->mask, mask);
 
-    /* Decide which hash/lookup/verify function to use. */
+    /* The count of bits in the mask defines the space required for masks.
+     * Then call gen_masks() to create the appropriate masks, avoiding the cost
+     * of doing runtime calculations. */
+    uint32_t unit0 = count_1bits(mask->mf.map.bits[0]);
+    uint32_t unit1 = count_1bits(mask->mf.map.bits[1]);
+    subtable->mf_bits_set_unit0 = unit0;
+    subtable->mf_bits_set_unit1 = unit1;
+
+    subtable->mf_masks = xmalloc(sizeof(uint64_t) * (unit0 + unit1));
+    netdev_flow_key_gen_masks(mask, subtable->mf_masks, unit0, unit1);
+
+    /* Allocate blocks scratch space only if subtable requires more size than
+     * is currently allocated. */
+    const uint32_t blocks_required_per_pkt = unit0 + unit1;
+    if (cls->blocks_scratch_size < blocks_required_per_pkt) {
+        free(cls->blocks_scratch);
+        cls->blocks_scratch = xmalloc(sizeof(uint64_t) * NETDEV_MAX_BURST *
+                                      blocks_required_per_pkt);
+        cls->blocks_scratch_size = blocks_required_per_pkt;
+    }
+
+    /* Assign the generic lookup - this works with any miniflow fingerprint. */
     subtable->lookup_func = dpcls_subtable_lookup_generic;
 
     cmap_insert(&cls->subtables_map, &subtable->cmap_node, mask->hash);
@@ -7846,6 +7880,43 @@ dpcls_remove(struct dpcls *cls, struct dpcls_rule *rule)
     }
 }
 
+/* Inner loop for mask generation of a unit, see netdev_flow_key_gen_masks. */
+static inline void
+netdev_flow_key_gen_mask_unit(uint64_t iter,
+                              const uint64_t count,
+                              uint64_t *mf_masks)
+{
+    int i;
+    for (i = 0; i < count; i++) {
+        uint64_t lowest_bit = (iter & -iter);
+        iter &= ~lowest_bit;
+        mf_masks[i] = (lowest_bit - 1);
+    }
+    /* Checks that count has covered all bits in the iter bitmap. */
+    ovs_assert(iter == 0);
+}
+
+/* Generate a mask for each block in the miniflow, based on the bits set. This
+ * allows easily masking packets with the generated array here, without
+ * calculations. This replaces runtime-calculating the masks.
+ * @param tbl The table to generate the mf_masks for
+ * @param mf_masks Pointer to a u64 array of mf_bits_u0 + mf_bits_u1 entries
+ * @param mf_bits_u0 Number of bits set in unit0 of the miniflow
+ * @param mf_bits_u1 Number of bits set in unit1 of the miniflow
+ */
+void
+netdev_flow_key_gen_masks(const struct netdev_flow_key *tbl,
+                          uint64_t *mf_masks,
+                          const uint32_t mf_bits_u0,
+                          const uint32_t mf_bits_u1)
+{
+    uint64_t iter_u0 = tbl->mf.map.bits[0];
+    uint64_t iter_u1 = tbl->mf.map.bits[1];
+
+    netdev_flow_key_gen_mask_unit(iter_u0, mf_bits_u0, &mf_masks[0]);
+    netdev_flow_key_gen_mask_unit(iter_u1, mf_bits_u1, &mf_masks[mf_bits_u0]);
+}
+
 /* Returns true if 'target' satisfies 'key' in 'mask', that is, if each 1-bit
  * in 'mask' the values in 'key' and 'target' are the same. */
 bool
@@ -7886,6 +7957,7 @@ dpcls_lookup(struct dpcls *cls, const struct netdev_flow_key *keys[],
     BUILD_ASSERT_DECL(MAP_BITS >= NETDEV_MAX_BURST);
 
     struct dpcls_subtable *subtable;
+    uint64_t *blocks_scratch = cls->blocks_scratch;
 
     uint32_t keys_map = TYPE_MAXIMUM(uint32_t); /* Set all bits. */
 
@@ -7906,7 +7978,8 @@ dpcls_lookup(struct dpcls *cls, const struct netdev_flow_key *keys[],
      * non-overlapping. */
     PVECTOR_FOR_EACH (subtable, &cls->subtables) {
         /* Call the subtable specific lookup function. */
-        found_map = subtable->lookup_func(subtable, keys_map, keys, rules);
+        found_map = subtable->lookup_func(subtable, blocks_scratch, keys_map,
+                                          keys, rules);
 
         /* Count the number of subtables searched for this packet match. This
          * estimates the "spread" of subtables looked at per matched packet. */

From patchwork Mon Jul 15 16:36:36 2019
X-Patchwork-Submitter: "Van Haaren, Harry"
X-Patchwork-Id: 1132106
From: Harry van Haaren
To: dev@openvswitch.org
Date: Mon, 15 Jul 2019 17:36:36 +0100
Message-Id: <20190715163636.51572-6-harry.van.haaren@intel.com>
In-Reply-To: <20190715163636.51572-1-harry.van.haaren@intel.com>
References: <20190709123440.45519-1-harry.van.haaren@intel.com>
 <20190715163636.51572-1-harry.van.haaren@intel.com>
Cc: i.maximets@samsung.com
Subject: [ovs-dev] [PATCH v11 5/5] dpif-netdev: Add specialized generic scalar functions

This commit adds a number of specialized functions that handle
common miniflow fingerprints. This enables compiler optimization,
resulting in higher performance. Below is a quick description of
how this optimization actually works.

"Specialized functions" are "instances" of the generic implementation,
but the compiler is given extra context when compiling. In the case of
iterating miniflow data structures, the most interesting value to enable
compile-time optimizations is the loop trip count per unit.

In order to create a specialized function, there is a generic
implementation, which uses a for() loop without the compiler knowing
the loop trip count at compile time. The loop trip count is passed in
as an argument to the function:

uint32_t miniflow_impl_generic(struct miniflow *mf, uint32_t loop_count)
{
    for(uint32_t i = 0; i < loop_count; i++)
        // do work
}

In order to "specialize" the function, we call the generic
implementation with hard-coded numbers - these are compile time
constants!

uint32_t miniflow_impl_loop5(struct miniflow *mf, uint32_t loop_count)
{
    // use hard coded constant for compile-time constant-propagation
    return miniflow_impl_generic(mf, 5);
}

Given the compiler is aware of the loop trip count at compile time,
it can perform an optimization known as "constant propagation".
Combined with inlining of the miniflow_impl_generic() function, the
compiler can now unroll the loop 5x at *compile time*, and produce
"flat" code.

The last step to using the specialized functions is to utilize a
function-pointer to choose the specialized (or generic) implementation.
The selection of the function pointer is performed at subtable creation
time, when the miniflow fingerprint of the subtable is known. This
technique is known as "multiple dispatch" in some literature, as it
uses multiple items of information (miniflow bit counts) to select the
dispatch function.

By pointing the function pointer at the optimized implementation,
OvS benefits from the compile time optimizations at runtime.

Signed-off-by: Harry van Haaren
Tested-by: Malvika Gupta
---
v11:
- Use MACROs to declare and check optimized functions (Ilya)
- Use capital letter for commit message (Ilya)
- Rebase onto latest patchset changes
- Added NEWS entry for data-path subtable specialization (Ian/Harry)
- Checkpatch notes an "incorrect bracketing" in the MACROs, however I
  didn't find a solution that it does like.

v10:
- Rebase changes from previous patches.
- Remove "restrict" keyword as windows CI failed, see here for details:
  https://ci.appveyor.com/project/istokes/ovs-q8fvv/builds/24398228

v8:
- Rework to use blocks_cache from the dpcls instance, to avoid variable
  length arrays in the data-path.
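
For readers who want to try the constant-propagation idea from the commit message in isolation, here is a minimal standalone sketch. It is not the OVS code: struct demo_miniflow and the demo_* functions are hypothetical stand-ins, and the "work" is just a sum over the blocks. The point is only that the specialized wrapper passes a literal trip count into an inlinable body, which gives the compiler the chance to unroll the loop; whether it does so depends on the compiler and optimization level.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_MAX_BLOCKS 8

/* Hypothetical stand-in for a miniflow: just an array of 64-bit blocks. */
struct demo_miniflow {
    uint64_t blocks[DEMO_MAX_BLOCKS];
};

/* Generic body. 'loop_count' is an ordinary argument, so on its own this
 * function must be compiled as a normal loop. */
static inline uint32_t
demo_impl_generic(const struct demo_miniflow *mf, const uint32_t loop_count)
{
    uint32_t sum = 0;

    assert(loop_count <= DEMO_MAX_BLOCKS);
    for (uint32_t i = 0; i < loop_count; i++) {
        sum += (uint32_t) mf->blocks[i];
    }
    return sum;
}

/* Runtime variant: the trip count is only known when the call happens. */
static uint32_t
demo_impl_runtime(const struct demo_miniflow *mf, uint32_t loop_count)
{
    return demo_impl_generic(mf, loop_count);
}

/* Specialized variant: the literal 5 is visible to the compiler, so after
 * inlining the loop bound is a compile-time constant. */
static uint32_t
demo_impl_loop5(const struct demo_miniflow *mf)
{
    return demo_impl_generic(mf, 5);
}
```

Both variants return the same result for a trip count of 5; only the generated code differs. Comparing the two at -O2 (for example with objdump) is one way to see the effect on a given toolchain.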
---
 NEWS                             |  4 +++
 lib/dpif-netdev-lookup-generic.c | 51 ++++++++++++++++++++++++++++++++
 lib/dpif-netdev-private.h        |  8 +++++
 lib/dpif-netdev.c                |  9 ++++--
 4 files changed, 70 insertions(+), 2 deletions(-)

diff --git a/NEWS b/NEWS
index 81130e667..4cfffb1bc 100644
--- a/NEWS
+++ b/NEWS
@@ -34,6 +34,10 @@ Post-v2.11.0
      * 'ovs-appctl exit' now implies cleanup of non-internal ports in
        userspace datapath regardless of '--cleanup' option. Use '--cleanup'
        to remove internal ports too.
+     * Datapath classifier code refactored to enable function pointers to
+       select the lookup implementation at runtime. This enables
+       specialization of specific subtables based on the miniflow attributes,
+       enhancing the performance of the subtable search.
    - OVSDB:
      * OVSDB clients can now resynchronize with clustered servers much more
        quickly after a brief disconnection, saving bandwidth and CPU time.
diff --git a/lib/dpif-netdev-lookup-generic.c b/lib/dpif-netdev-lookup-generic.c
index abd166fc3..259c36645 100644
--- a/lib/dpif-netdev-lookup-generic.c
+++ b/lib/dpif-netdev-lookup-generic.c
@@ -233,7 +233,58 @@ dpcls_subtable_lookup_generic(struct dpcls_subtable *subtable,
                               const struct netdev_flow_key *keys[],
                               struct dpcls_rule **rules)
 {
+    /* Here the runtime subtable->mf_bits counts are used, which forces the
+     * compiler to iterate normal for() loops. Due to this limitation in the
+     * compiler's available optimizations, this function has lower performance
+     * than the below specialized functions.
+     */
     return lookup_generic_impl(subtable, blocks_scratch, keys_map, keys, rules,
                                subtable->mf_bits_set_unit0,
                                subtable->mf_bits_set_unit1);
 }
+
+/* Expand out specialized functions with U0 and U1 bit attributes */
+#define DECLARE_OPTIMIZED_LOOKUP_FUNCTION(U0, U1)                             \
+    static uint32_t                                                           \
+    dpcls_subtable_lookup_mf_u0w##U0##_u1w##U1(                               \
+                                         struct dpcls_subtable *subtable,     \
+                                         uint64_t *blocks_scratch,            \
+                                         uint32_t keys_map,                   \
+                                         const struct netdev_flow_key *keys[],\
+                                         struct dpcls_rule **rules)           \
+    {                                                                         \
+        return lookup_generic_impl(subtable, blocks_scratch, keys_map,        \
+                                   keys, rules, U0, U1);                      \
+    }                                                                         \
+
+DECLARE_OPTIMIZED_LOOKUP_FUNCTION(5,1)
+DECLARE_OPTIMIZED_LOOKUP_FUNCTION(4,1)
+DECLARE_OPTIMIZED_LOOKUP_FUNCTION(4,0)
+
+/* Check if a specialized function is valid for the required subtable. */
+#define CHECK_LOOKUP_FUNCTION(U0,U1)                                          \
+    if (!f && u0_bits == U0 && u1_bits == U1) {                               \
+        f = dpcls_subtable_lookup_mf_u0w##U0##_u1w##U1;                       \
+    }
+
+/* Probe function to lookup an available specialized function.
+ * If capable to run the requested miniflow fingerprint, this function returns
+ * the most optimal implementation for that miniflow fingerprint.
+ * @retval FunctionAddress A valid function to handle the miniflow bit pattern
+ * @retval NULL The requested miniflow fingerprint is not supported here
+ */
+dpcls_subtable_lookup_func
+dpcls_subtable_generic_probe(uint32_t u0_bits, uint32_t u1_bits)
+{
+    dpcls_subtable_lookup_func f = NULL;
+
+    CHECK_LOOKUP_FUNCTION(5, 1);
+    CHECK_LOOKUP_FUNCTION(4, 1);
+    CHECK_LOOKUP_FUNCTION(4, 0);
+
+    if (f) {
+        VLOG_INFO("Subtable using Generic Optimized for u0 %d, u1 %d\n",
+                  u0_bits, u1_bits);
+    }
+    return f;
+}
diff --git a/lib/dpif-netdev-private.h b/lib/dpif-netdev-private.h
index e1ceaa641..f541343f4 100644
--- a/lib/dpif-netdev-private.h
+++ b/lib/dpif-netdev-private.h
@@ -69,6 +69,14 @@ dpcls_subtable_lookup_generic(struct dpcls_subtable *subtable,
                               const struct netdev_flow_key *keys[],
                               struct dpcls_rule **rules);
 
+/* Probe function to select a specialized version of the generic lookup
+ * implementation. This provides performance benefit due to compile-time
+ * optimizations such as loop-unrolling. These are enabled by the compile-time
+ * constants in the specific function implementations.
+ */
+dpcls_subtable_lookup_func
+dpcls_subtable_generic_probe(uint32_t u0_bit_count, uint32_t u1_bit_count);
+
 /* A set of rules that all have the same fields wildcarded. */
 struct dpcls_subtable {
     /* The fields are only used by writers. */
diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index 8acc1445a..996dec35e 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -7735,8 +7735,13 @@ dpcls_create_subtable(struct dpcls *cls, const struct netdev_flow_key *mask)
         cls->blocks_scratch_size = blocks_required_per_pkt;
     }
 
-    /* Assign the generic lookup - this works with any miniflow fingerprint. */
-    subtable->lookup_func = dpcls_subtable_lookup_generic;
+    /* Probe for a specialized generic lookup function. */
+    subtable->lookup_func = dpcls_subtable_generic_probe(unit0, unit1);
+
+    /* If not set, assign generic lookup. Generic works for any miniflow. */
+    if (!subtable->lookup_func) {
+        subtable->lookup_func = dpcls_subtable_lookup_generic;
+    }
 
     cmap_insert(&cls->subtables_map, &subtable->cmap_node, mask->hash);
     /* Add the new subtable at the end of the pvector (with no hits yet) */
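
The macro-plus-probe pattern in this patch can be tried outside OVS with the reduced sketch below. It only assumes the shape of the pattern shown in the diff above: a token-pasting macro stamps out one wrapper per (U0, U1) pair, a probe returns the matching wrapper or NULL, and the caller falls back to the runtime-count version. All demo_* and DEMO_* names are hypothetical, and counting non-zero blocks stands in for the real lookup work.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct demo_subtable;
typedef uint32_t (*demo_lookup_func)(const struct demo_subtable *st);

/* Hypothetical, much reduced subtable. */
struct demo_subtable {
    uint32_t u0_bits;
    uint32_t u1_bits;
    const uint64_t *blocks;         /* u0_bits + u1_bits entries. */
    demo_lookup_func lookup_func;
};

/* Shared body; u0 and u1 are literals in the specialized wrappers. */
static inline uint32_t
demo_lookup_impl(const struct demo_subtable *st,
                 const uint32_t u0, const uint32_t u1)
{
    uint32_t nonzero = 0;

    for (uint32_t i = 0; i < u0 + u1; i++) {
        nonzero += st->blocks[i] != 0;
    }
    return nonzero;
}

/* Fallback that reads the counts at runtime; valid for any subtable. */
static uint32_t
demo_lookup_generic(const struct demo_subtable *st)
{
    return demo_lookup_impl(st, st->u0_bits, st->u1_bits);
}

/* Stamp out one specialized wrapper per (U0, U1) pair. */
#define DEMO_DECLARE_LOOKUP(U0, U1)                                 \
    static uint32_t                                                 \
    demo_lookup_u0w##U0##_u1w##U1(const struct demo_subtable *st)   \
    {                                                               \
        return demo_lookup_impl(st, U0, U1);                        \
    }

DEMO_DECLARE_LOOKUP(5, 1)
DEMO_DECLARE_LOOKUP(4, 0)

#define DEMO_CHECK_LOOKUP(U0, U1)                                   \
    if (!f && u0 == U0 && u1 == U1) {                               \
        f = demo_lookup_u0w##U0##_u1w##U1;                          \
    }

/* Returns a specialized wrapper for the fingerprint, or NULL if none. */
static demo_lookup_func
demo_probe(uint32_t u0, uint32_t u1)
{
    demo_lookup_func f = NULL;

    DEMO_CHECK_LOOKUP(5, 1);
    DEMO_CHECK_LOOKUP(4, 0);
    return f;
}

/* Selection happens once, when the subtable's fingerprint is known. */
static void
demo_subtable_init(struct demo_subtable *st, const uint64_t *blocks,
                   uint32_t u0, uint32_t u1)
{
    st->u0_bits = u0;
    st->u1_bits = u1;
    st->blocks = blocks;
    st->lookup_func = demo_probe(u0, u1);
    if (!st->lookup_func) {
        st->lookup_func = demo_lookup_generic;
    }
    assert(st->lookup_func != NULL);
}
```

Adding support for another fingerprint in this sketch takes one DEMO_DECLARE_LOOKUP line and one DEMO_CHECK_LOOKUP line, which mirrors how the patch keeps the list of specialized fingerprints short and explicit.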