From patchwork Wed Jul 17 18:21:43 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Van Haaren, Harry" X-Patchwork-Id: 1133383 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=openvswitch.org (client-ip=140.211.169.12; helo=mail.linuxfoundation.org; envelope-from=ovs-dev-bounces@openvswitch.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=intel.com Received: from mail.linuxfoundation.org (mail.linuxfoundation.org [140.211.169.12]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 45plvN1mV9z9s3l for ; Thu, 18 Jul 2019 04:21:04 +1000 (AEST) Received: from mail.linux-foundation.org (localhost [127.0.0.1]) by mail.linuxfoundation.org (Postfix) with ESMTP id 5F3A3F40; Wed, 17 Jul 2019 18:20:31 +0000 (UTC) X-Original-To: dev@openvswitch.org Delivered-To: ovs-dev@mail.linuxfoundation.org Received: from smtp1.linuxfoundation.org (smtp1.linux-foundation.org [172.17.192.35]) by mail.linuxfoundation.org (Postfix) with ESMTPS id 20707F3C for ; Wed, 17 Jul 2019 18:20:30 +0000 (UTC) X-Greylist: domain auto-whitelisted by SQLgrey-1.7.6 Received: from mga01.intel.com (mga01.intel.com [192.55.52.88]) by smtp1.linuxfoundation.org (Postfix) with ESMTPS id 4F534887 for ; Wed, 17 Jul 2019 18:20:29 +0000 (UTC) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga005.jf.intel.com ([10.7.209.41]) by fmsmga101.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 17 Jul 2019 11:20:29 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.64,275,1559545200"; d="scan'208";a="343114077" Received: from silpixa00399779.ir.intel.com (HELO silpixa00399779.ger.corp.intel.com) ([10.237.222.100]) by orsmga005.jf.intel.com with 
ESMTP; 17 Jul 2019 11:20:27 -0700 From: Harry van Haaren To: dev@openvswitch.org Date: Wed, 17 Jul 2019 19:21:43 +0100 Message-Id: <20190717182147.5042-2-harry.van.haaren@intel.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20190717182147.5042-1-harry.van.haaren@intel.com> References: <20190717130033.25114-1-harry.van.haaren@intel.com> <20190717182147.5042-1-harry.van.haaren@intel.com> X-Spam-Status: No, score=-6.9 required=5.0 tests=BAYES_00,RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on smtp1.linux-foundation.org Cc: i.maximets@samsung.com Subject: [ovs-dev] [PATCH v13 1/5] dpif-netdev: Implement function pointers/subtable X-BeenThere: ovs-dev@openvswitch.org X-Mailman-Version: 2.1.12 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , MIME-Version: 1.0 Sender: ovs-dev-bounces@openvswitch.org Errors-To: ovs-dev-bounces@openvswitch.org This allows plugging-in of different subtable hash-lookup-verify routines, and allows special casing of those functions based on known context (eg: # of bits set) of the specific subtable. Signed-off-by: Harry van Haaren Tested-by: Malvika Gupta --- v11: - Rebased to latest master - Added space to ULLONG_FOR_EACH_1 (Ilya) - Use capital letter in commit message (Ilya) v10: - Fix capitalization of comments, and punctuation. 
(Ian) - Variable declarations up top before use (Ian) - Fix alignment of function parameters, had to newline after typedef (Ian) - Some mailing-list questions replied to on-list (Ian) v9: - Use count_1bits in favour of __builtin_popcount (Ilya) v6: - Implement subtable effort per packet "lookups_match" counter (Ilya) - Remove double newline (Eelco) - Remove double * before comments (Eelco) - Reword comments in dpcls_lookup() for clarity (Harry) --- lib/dpif-netdev.c | 138 ++++++++++++++++++++++++++++++++-------------- 1 file changed, 96 insertions(+), 42 deletions(-) diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c index 6b99a3c44..123f04577 100644 --- a/lib/dpif-netdev.c +++ b/lib/dpif-netdev.c @@ -7683,6 +7683,28 @@ dpif_dummy_register(enum dummy_level level) /* Datapath Classifier. */ +/* Forward declaration for lookup_func typedef. */ +struct dpcls_subtable; + +/* Lookup function for a subtable in the dpcls. This function is called + * by each subtable with an array of packets, and a bitmask of packets to + * perform the lookup on. Using a function pointer gives flexibility to + * optimize the lookup function based on subtable properties and the + * CPU instruction set available at runtime. + */ +typedef +uint32_t (*dpcls_subtable_lookup_func)(struct dpcls_subtable *subtable, + uint32_t keys_map, + const struct netdev_flow_key *keys[], + struct dpcls_rule **rules); + +/* Prototype for generic lookup func, using same code path as before. */ +uint32_t +dpcls_subtable_lookup_generic(struct dpcls_subtable *subtable, + uint32_t keys_map, + const struct netdev_flow_key *keys[], + struct dpcls_rule **rules); + /* A set of rules that all have the same fields wildcarded. */ struct dpcls_subtable { /* The fields are only used by writers. */ @@ -7692,6 +7714,13 @@ struct dpcls_subtable { struct cmap rules; /* Contains "struct dpcls_rule"s. */ uint32_t hit_cnt; /* Number of match hits in subtable in current optimization interval. 
*/ + + /* The lookup function to use for this subtable. If there is a known + * property of the subtable (eg: only 3 bits of miniflow metadata is + * used for the lookup) then this can point at an optimized version of + * the lookup function for this particular subtable. */ + dpcls_subtable_lookup_func lookup_func; + struct netdev_flow_key mask; /* Wildcards for fields (const). */ /* 'mask' must be the last field, additional space is allocated here. */ }; @@ -7751,6 +7780,10 @@ dpcls_create_subtable(struct dpcls *cls, const struct netdev_flow_key *mask) cmap_init(&subtable->rules); subtable->hit_cnt = 0; netdev_flow_key_clone(&subtable->mask, mask); + + /* Decide which hash/lookup/verify function to use. */ + subtable->lookup_func = dpcls_subtable_lookup_generic; + cmap_insert(&cls->subtables_map, &subtable->cmap_node, mask->hash); /* Add the new subtable at the end of the pvector (with no hits yet) */ pvector_insert(&cls->subtables, subtable, 0); @@ -7911,6 +7944,55 @@ dpcls_rule_matches_key(const struct dpcls_rule *rule, return true; } +uint32_t +dpcls_subtable_lookup_generic(struct dpcls_subtable *subtable, + uint32_t keys_map, + const struct netdev_flow_key *keys[], + struct dpcls_rule **rules) +{ + int i; + uint32_t found_map; + + /* Compute hashes for the remaining keys. Each search-key is + * masked with the subtable's mask to avoid hashing the wildcarded + * bits. */ + uint32_t hashes[NETDEV_MAX_BURST]; + ULLONG_FOR_EACH_1 (i, keys_map) { + hashes[i] = netdev_flow_key_hash_in_mask(keys[i], + &subtable->mask); + } + + /* Lookup. */ + const struct cmap_node *nodes[NETDEV_MAX_BURST]; + found_map = cmap_find_batch(&subtable->rules, keys_map, hashes, nodes); + + /* Check results. When the i-th bit of found_map is set, it means + * that a set of nodes with a matching hash value was found for the + * i-th search-key. Due to possible hash collisions we need to check + * which of the found rules, if any, really matches our masked + * search-key. 
*/ + ULLONG_FOR_EACH_1 (i, found_map) { + struct dpcls_rule *rule; + + CMAP_NODE_FOR_EACH (rule, cmap_node, nodes[i]) { + if (OVS_LIKELY(dpcls_rule_matches_key(rule, keys[i]))) { + rules[i] = rule; + /* Even at 20 Mpps the 32-bit hit_cnt cannot wrap + * within one second optimization interval. */ + subtable->hit_cnt++; + goto next; + } + } + /* None of the found rules was a match. Reset the i-th bit to + * keep searching this key in the next subtable. */ + ULLONG_SET0(found_map, i); /* Did not match. */ + next: + ; /* Keep Sparse happy. */ + } + + return found_map; +} + /* For each miniflow in 'keys' performs a classifier lookup writing the result * into the corresponding slot in 'rules'. If a particular entry in 'keys' is * NULL it is skipped. @@ -7929,16 +8011,12 @@ dpcls_lookup(struct dpcls *cls, const struct netdev_flow_key *keys[], /* The received 'cnt' miniflows are the search-keys that will be processed * to find a matching entry into the available subtables. * The number of bits in map_type is equal to NETDEV_MAX_BURST. */ - typedef uint32_t map_type; -#define MAP_BITS (sizeof(map_type) * CHAR_BIT) +#define MAP_BITS (sizeof(uint32_t) * CHAR_BIT) BUILD_ASSERT_DECL(MAP_BITS >= NETDEV_MAX_BURST); struct dpcls_subtable *subtable; - map_type keys_map = TYPE_MAXIMUM(map_type); /* Set all bits. */ - map_type found_map; - uint32_t hashes[MAP_BITS]; - const struct cmap_node *nodes[MAP_BITS]; + uint32_t keys_map = TYPE_MAXIMUM(uint32_t); /* Set all bits. */ if (cnt != MAP_BITS) { keys_map >>= MAP_BITS - cnt; /* Clear extra bits. */ @@ -7946,6 +8024,7 @@ dpcls_lookup(struct dpcls *cls, const struct netdev_flow_key *keys[], memset(rules, 0, cnt * sizeof *rules); int lookups_match = 0, subtable_pos = 1; + uint32_t found_map; /* The Datapath classifier - aka dpcls - is composed of subtables. * Subtables are dynamically created as needed when new rules are inserted. 
@@ -7955,52 +8034,27 @@ dpcls_lookup(struct dpcls *cls, const struct netdev_flow_key *keys[], * search-key, the search for that key can stop because the rules are * non-overlapping. */ PVECTOR_FOR_EACH (subtable, &cls->subtables) { - int i; + /* Call the subtable specific lookup function. */ + found_map = subtable->lookup_func(subtable, keys_map, keys, rules); - /* Compute hashes for the remaining keys. Each search-key is - * masked with the subtable's mask to avoid hashing the wildcarded - * bits. */ - ULLONG_FOR_EACH_1(i, keys_map) { - hashes[i] = netdev_flow_key_hash_in_mask(keys[i], - &subtable->mask); - } - /* Lookup. */ - found_map = cmap_find_batch(&subtable->rules, keys_map, hashes, nodes); - /* Check results. When the i-th bit of found_map is set, it means - * that a set of nodes with a matching hash value was found for the - * i-th search-key. Due to possible hash collisions we need to check - * which of the found rules, if any, really matches our masked - * search-key. */ - ULLONG_FOR_EACH_1(i, found_map) { - struct dpcls_rule *rule; + /* Count the number of subtables searched for this packet match. This + * estimates the "spread" of subtables looked at per matched packet. */ + uint32_t pkts_matched = count_1bits(found_map); + lookups_match += pkts_matched * subtable_pos; - CMAP_NODE_FOR_EACH (rule, cmap_node, nodes[i]) { - if (OVS_LIKELY(dpcls_rule_matches_key(rule, keys[i]))) { - rules[i] = rule; - /* Even at 20 Mpps the 32-bit hit_cnt cannot wrap - * within one second optimization interval. */ - subtable->hit_cnt++; - lookups_match += subtable_pos; - goto next; - } - } - /* None of the found rules was a match. Reset the i-th bit to - * keep searching this key in the next subtable. */ - ULLONG_SET0(found_map, i); /* Did not match. */ - next: - ; /* Keep Sparse happy. */ - } - keys_map &= ~found_map; /* Clear the found rules. */ + /* Clear the found rules, and return early if all packets are found. 
*/ + keys_map &= ~found_map; if (!keys_map) { if (num_lookups_p) { *num_lookups_p = lookups_match; } - return true; /* All found. */ + return true; } subtable_pos++; } + if (num_lookups_p) { *num_lookups_p = lookups_match; } - return false; /* Some misses. */ + return false; }
From patchwork Wed Jul 17 18:21:44 2019 X-Patchwork-Submitter: "Van Haaren, Harry" X-Patchwork-Id: 1133384 From: Harry van Haaren To: dev@openvswitch.org Date: Wed, 17 Jul 2019 19:21:44 +0100 Message-Id: <20190717182147.5042-3-harry.van.haaren@intel.com> In-Reply-To: <20190717182147.5042-1-harry.van.haaren@intel.com> References: <20190717130033.25114-1-harry.van.haaren@intel.com> <20190717182147.5042-1-harry.van.haaren@intel.com> Cc: i.maximets@samsung.com Subject: [ovs-dev] [PATCH v13 2/5] dpif-netdev: Move dpcls lookup structures to .h This commit moves some data-structures to be available in the dpif-netdev-private.h header. This allows specific implementations of the subtable lookup function to include just that header file, and not require that the code exists in dpif-netdev.c Signed-off-by: Harry van Haaren Tested-by: Malvika Gupta --- v12: - Fixed Caps (Ilya) - Moved declaration of mf_bits and mf_masks* to future patch (Ilya) v11: - Rebase to latest master - Split netdev private header to own file (Ilya) v10: - Rebase updates from previous patch in code that moved. 
- Move cmap.h include into alphabetical order (Ian) - Fix comment and typo (Ian) - Restructure function typedef to fit in 80 chars v6: - Fix double * in code comments (Eelco) - Reword comment on lookup_func for clarity (Harry) - Rebase fixups --- lib/automake.mk | 1 + lib/dpif-netdev-private.h | 109 ++++++++++++++++++++++++++++++++++++++ lib/dpif-netdev.c | 68 ++---------------------- 3 files changed, 113 insertions(+), 65 deletions(-) create mode 100644 lib/dpif-netdev-private.h diff --git a/lib/automake.mk b/lib/automake.mk index 1b89cac8c..6f216efe0 100644 --- a/lib/automake.mk +++ b/lib/automake.mk @@ -80,6 +80,7 @@ lib_libopenvswitch_la_SOURCES = \ lib/dpdk.h \ lib/dpif-netdev.c \ lib/dpif-netdev.h \ + lib/dpif-netdev-private.h \ lib/dpif-netdev-perf.c \ lib/dpif-netdev-perf.h \ lib/dpif-provider.h \ diff --git a/lib/dpif-netdev-private.h b/lib/dpif-netdev-private.h new file mode 100644 index 000000000..555856482 --- /dev/null +++ b/lib/dpif-netdev-private.h @@ -0,0 +1,109 @@ +/* + * Copyright (c) 2008, 2009, 2010, 2011, 2012, 2013, 2015 Nicira, Inc. + * Copyright (c) 2019 Intel Corporation. + * + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at: + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +#ifndef DPIF_NETDEV_PRIVATE_H +#define DPIF_NETDEV_PRIVATE_H 1 + +#include +#include + +#include "dpif.h" +#include "cmap.h" + +#ifdef __cplusplus +extern "C" { +#endif + +/* Forward declaration for lookup_func typedef. 
*/ +struct dpcls_subtable; +struct dpcls_rule; + +/* Must be public as it is instantiated in subtable struct below. */ +struct netdev_flow_key { + uint32_t hash; /* Hash function differs for different users. */ + uint32_t len; /* Length of the following miniflow (incl. map). */ + struct miniflow mf; + uint64_t buf[FLOW_MAX_PACKET_U64S]; +}; + +/* A rule to be inserted to the classifier. */ +struct dpcls_rule { + struct cmap_node cmap_node; /* Within struct dpcls_subtable 'rules'. */ + struct netdev_flow_key *mask; /* Subtable's mask. */ + struct netdev_flow_key flow; /* Matching key. */ + /* 'flow' must be the last field, additional space is allocated here. */ +}; + +/* Lookup function for a subtable in the dpcls. This function is called + * by each subtable with an array of packets, and a bitmask of packets to + * perform the lookup on. Using a function pointer gives flexibility to + * optimize the lookup function based on subtable properties and the + * CPU instruction set available at runtime. + */ +typedef +uint32_t (*dpcls_subtable_lookup_func)(struct dpcls_subtable *subtable, + uint32_t keys_map, + const struct netdev_flow_key *keys[], + struct dpcls_rule **rules); + +/* Prototype for generic lookup func, using same code path as before. */ +uint32_t +dpcls_subtable_lookup_generic(struct dpcls_subtable *subtable, + uint32_t keys_map, + const struct netdev_flow_key *keys[], + struct dpcls_rule **rules); + +/* A set of rules that all have the same fields wildcarded. */ +struct dpcls_subtable { + /* The fields are only used by writers. */ + struct cmap_node cmap_node OVS_GUARDED; /* Within dpcls 'subtables_map'. */ + + /* These fields are accessed by readers. */ + struct cmap rules; /* Contains "struct dpcls_rule"s. */ + uint32_t hit_cnt; /* Number of match hits in subtable in current + optimization interval. */ + + /* The lookup function to use for this subtable. 
If there is a known + * property of the subtable (eg: only 3 bits of miniflow metadata is + * used for the lookup) then this can point at an optimized version of + * the lookup function for this particular subtable. */ + dpcls_subtable_lookup_func lookup_func; + + struct netdev_flow_key mask; /* Wildcards for fields (const). */ + /* 'mask' must be the last field, additional space is allocated here. */ +}; + +/* Iterate through netdev_flow_key TNL u64 values specified by 'FLOWMAP'. */ +#define NETDEV_FLOW_KEY_FOR_EACH_IN_FLOWMAP(VALUE, KEY, FLOWMAP) \ + MINIFLOW_FOR_EACH_IN_FLOWMAP (VALUE, &(KEY)->mf, FLOWMAP) + +/* Generates a mask for each bit set in the subtable's miniflow. */ +void +netdev_flow_key_gen_masks(const struct netdev_flow_key *tbl, + uint64_t *mf_masks, + const uint32_t mf_bits_u0, + const uint32_t mf_bits_u1); + +/* Matches a dpcls rule against the incoming packet in 'target' */ +bool dpcls_rule_matches_key(const struct dpcls_rule *rule, + const struct netdev_flow_key *target); + +#ifdef __cplusplus +} +#endif + +#endif /* netdev-private.h */ diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c index 123f04577..749a478a8 100644 --- a/lib/dpif-netdev.c +++ b/lib/dpif-netdev.c @@ -16,6 +16,7 @@ #include #include "dpif-netdev.h" +#include "dpif-netdev-private.h" #include #include @@ -128,15 +129,6 @@ static struct odp_support dp_netdev_support = { .ct_orig_tuple6 = true, }; -/* Stores a miniflow with inline values */ - -struct netdev_flow_key { - uint32_t hash; /* Hash function differs for different users. */ - uint32_t len; /* Length of the following miniflow (incl. map). */ - struct miniflow mf; - uint64_t buf[FLOW_MAX_PACKET_U64S]; -}; - /* EMC cache and SMC cache compose the datapath flow cache (DFC) * * Exact match cache for frequently used flows @@ -243,14 +235,6 @@ struct dpcls { struct pvector subtables; }; -/* A rule to be inserted to the classifier. */ -struct dpcls_rule { - struct cmap_node cmap_node; /* Within struct dpcls_subtable 'rules'. 
*/ - struct netdev_flow_key *mask; /* Subtable's mask. */ - struct netdev_flow_key flow; /* Matching key. */ - /* 'flow' must be the last field, additional space is allocated here. */ -}; - /* Data structure to keep packet order till fastpath processing. */ struct dp_packet_flow_map { struct dp_packet *packet; @@ -268,7 +252,7 @@ static bool dpcls_lookup(struct dpcls *cls, const struct netdev_flow_key *keys[], struct dpcls_rule **rules, size_t cnt, int *num_lookups_p); -static bool dpcls_rule_matches_key(const struct dpcls_rule *rule, +bool dpcls_rule_matches_key(const struct dpcls_rule *rule, const struct netdev_flow_key *target); /* Set of supported meter flags */ #define DP_SUPPORTED_METER_FLAGS_MASK \ @@ -2784,10 +2768,6 @@ netdev_flow_key_init_masked(struct netdev_flow_key *dst, (dst_u64 - miniflow_get_values(&dst->mf)) * 8); } -/* Iterate through netdev_flow_key TNL u64 values specified by 'FLOWMAP'. */ -#define NETDEV_FLOW_KEY_FOR_EACH_IN_FLOWMAP(VALUE, KEY, FLOWMAP) \ - MINIFLOW_FOR_EACH_IN_FLOWMAP(VALUE, &(KEY)->mf, FLOWMAP) - /* Returns a hash value for the bits of 'key' where there are 1-bits in * 'mask'. */ static inline uint32_t @@ -7683,48 +7663,6 @@ dpif_dummy_register(enum dummy_level level) /* Datapath Classifier. */ -/* Forward declaration for lookup_func typedef. */ -struct dpcls_subtable; - -/* Lookup function for a subtable in the dpcls. This function is called - * by each subtable with an array of packets, and a bitmask of packets to - * perform the lookup on. Using a function pointer gives flexibility to - * optimize the lookup function based on subtable properties and the - * CPU instruction set available at runtime. - */ -typedef -uint32_t (*dpcls_subtable_lookup_func)(struct dpcls_subtable *subtable, - uint32_t keys_map, - const struct netdev_flow_key *keys[], - struct dpcls_rule **rules); - -/* Prototype for generic lookup func, using same code path as before. 
*/ -uint32_t -dpcls_subtable_lookup_generic(struct dpcls_subtable *subtable, - uint32_t keys_map, - const struct netdev_flow_key *keys[], - struct dpcls_rule **rules); - -/* A set of rules that all have the same fields wildcarded. */ -struct dpcls_subtable { - /* The fields are only used by writers. */ - struct cmap_node cmap_node OVS_GUARDED; /* Within dpcls 'subtables_map'. */ - - /* These fields are accessed by readers. */ - struct cmap rules; /* Contains "struct dpcls_rule"s. */ - uint32_t hit_cnt; /* Number of match hits in subtable in current - optimization interval. */ - - /* The lookup function to use for this subtable. If there is a known - * property of the subtable (eg: only 3 bits of miniflow metadata is - * used for the lookup) then this can point at an optimized version of - * the lookup function for this particular subtable. */ - dpcls_subtable_lookup_func lookup_func; - - struct netdev_flow_key mask; /* Wildcards for fields (const). */ - /* 'mask' must be the last field, additional space is allocated here. */ -}; - static void dpcls_subtable_destroy_cb(struct dpcls_subtable *subtable) { @@ -7928,7 +7866,7 @@ dpcls_remove(struct dpcls *cls, struct dpcls_rule *rule) /* Returns true if 'target' satisfies 'key' in 'mask', that is, if each 1-bit * in 'mask' the values in 'key' and 'target' are the same. 
*/ -static bool +bool dpcls_rule_matches_key(const struct dpcls_rule *rule, const struct netdev_flow_key *target) {
From patchwork Wed Jul 17 18:21:45 2019 X-Patchwork-Submitter: "Van Haaren, Harry" X-Patchwork-Id: 1133385 From: Harry van Haaren To: dev@openvswitch.org Date: Wed, 17 Jul 2019 19:21:45 +0100 Message-Id: <20190717182147.5042-4-harry.van.haaren@intel.com> In-Reply-To: <20190717182147.5042-1-harry.van.haaren@intel.com> References: <20190717130033.25114-1-harry.van.haaren@intel.com> <20190717182147.5042-1-harry.van.haaren@intel.com> Cc: i.maximets@samsung.com Subject: [ovs-dev] [PATCH v13 3/5] dpif-netdev: Split out generic lookup function This commit splits the generic hash-lookup-verify function into its own file, for cleaner separation between optimized versions. Signed-off-by: Harry van Haaren Tested-by: Malvika Gupta --- v12: - Fix indentation around "keep sparse happy" comment (Ilya) v11: - Rebase fixups from previous patches - Spacing around ULLONG_FOR_EACH_1 (Ilya) v10: - Rebase fixups from previous patch changes v6: - Fixup some checkpatch warnings on whitespace with MACROs (Ilya) - Other MACROs function incorrectly when checkpatch is happy, so using the functional version without space after the ( character. This prints a checkpatch warning, but I see no way to fix it. 
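Note for readers: the "hash-lookup-verify" steps named in the commit message can be sketched in standalone C as below. This is a minimal illustration under stated assumptions, not the code in this patch. The names sketch_mix(), sketch_hash_in_mask() and sketch_matches() are hypothetical, keys are plain uint64_t arrays rather than miniflows, and the mixing step is a toy stand-in for the real hash helpers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy 64-bit mixing step (hypothetical).  It only stands in for the
 * real hash helpers and makes no claim about their output. */
static uint64_t
sketch_mix(uint64_t hash, uint64_t value)
{
    hash ^= value;
    hash *= UINT64_C(0x100000001b3);
    return hash;
}

/* Hash step: only bits of 'key' selected by 'mask' reach the hash, so
 * wildcarded bits cannot change the result. */
uint32_t
sketch_hash_in_mask(const uint64_t *key, const uint64_t *mask, size_t n)
{
    uint64_t hash = UINT64_C(0xcbf29ce484222325);
    size_t i;

    assert(key && mask);
    for (i = 0; i < n; i++) {
        hash = sketch_mix(hash, key[i] & mask[i]);
    }
    /* Fold the 64-bit state down to the 32-bit hash width. */
    return (uint32_t) (hash ^ (hash >> 32));
}

/* Verify step: returns 1 when 'key' and 'rule' agree on every bit that
 * is set in 'mask', 0 otherwise.  This is what rejects hash collisions. */
int
sketch_matches(const uint64_t *rule, const uint64_t *mask,
               const uint64_t *key, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if ((key[i] ^ rule[i]) & mask[i]) {
            return 0;
        }
    }
    return 1;
}
```

Two keys that differ only in wildcarded bits produce the same hash here, so the lookup step lands on the same bucket for both. The verify step is still required because unrelated keys may collide on the 32-bit hash.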
--- lib/automake.mk | 1 + lib/dpif-netdev-lookup-generic.c | 98 ++++++++++++++++++++++++++++++++ lib/dpif-netdev.c | 69 +--------------------- 3 files changed, 100 insertions(+), 68 deletions(-) create mode 100644 lib/dpif-netdev-lookup-generic.c diff --git a/lib/automake.mk b/lib/automake.mk index 6f216efe0..29d3458da 100644 --- a/lib/automake.mk +++ b/lib/automake.mk @@ -78,6 +78,7 @@ lib_libopenvswitch_la_SOURCES = \ lib/dp-packet.h \ lib/dp-packet.c \ lib/dpdk.h \ + lib/dpif-netdev-lookup-generic.c \ lib/dpif-netdev.c \ lib/dpif-netdev.h \ lib/dpif-netdev-private.h \ diff --git a/lib/dpif-netdev-lookup-generic.c b/lib/dpif-netdev-lookup-generic.c new file mode 100644 index 000000000..8064911b3 --- /dev/null +++ b/lib/dpif-netdev-lookup-generic.c @@ -0,0 +1,98 @@ +/* + * Copyright (c) 2009, 2010, 2011, 2012, 2013, 2014, 2016, 2017 Nicira, Inc. + * Copyright (c) 2019 Intel Corporation. + * + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at: + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +#include +#include "dpif-netdev.h" +#include "dpif-netdev-private.h" + +#include "bitmap.h" +#include "cmap.h" + +#include "dp-packet.h" +#include "dpif.h" +#include "dpif-netdev-perf.h" +#include "dpif-provider.h" +#include "flow.h" +#include "packets.h" +#include "pvector.h" + +/* Returns a hash value for the bits of 'key' where there are 1-bits in + * 'mask'. 
*/ +static inline uint32_t +netdev_flow_key_hash_in_mask(const struct netdev_flow_key *key, + const struct netdev_flow_key *mask) +{ + const uint64_t *p = miniflow_get_values(&mask->mf); + uint32_t hash = 0; + uint64_t value; + + NETDEV_FLOW_KEY_FOR_EACH_IN_FLOWMAP (value, key, mask->mf.map) { + hash = hash_add64(hash, value & *p); + p++; + } + + return hash_finish(hash, (p - miniflow_get_values(&mask->mf)) * 8); +} + +uint32_t +dpcls_subtable_lookup_generic(struct dpcls_subtable *subtable, + uint32_t keys_map, + const struct netdev_flow_key *keys[], + struct dpcls_rule **rules) +{ + int i; + uint32_t found_map; + + /* Compute hashes for the remaining keys. Each search-key is + * masked with the subtable's mask to avoid hashing the wildcarded + * bits. */ + uint32_t hashes[NETDEV_MAX_BURST]; + ULLONG_FOR_EACH_1 (i, keys_map) { + hashes[i] = netdev_flow_key_hash_in_mask(keys[i], &subtable->mask); + } + + /* Lookup. */ + const struct cmap_node *nodes[NETDEV_MAX_BURST]; + found_map = cmap_find_batch(&subtable->rules, keys_map, hashes, nodes); + + /* Check results. When the i-th bit of found_map is set, it means + * that a set of nodes with a matching hash value was found for the + * i-th search-key. Due to possible hash collisions we need to check + * which of the found rules, if any, really matches our masked + * search-key. */ + ULLONG_FOR_EACH_1 (i, found_map) { + struct dpcls_rule *rule; + + CMAP_NODE_FOR_EACH (rule, cmap_node, nodes[i]) { + if (OVS_LIKELY(dpcls_rule_matches_key(rule, keys[i]))) { + rules[i] = rule; + /* Even at 20 Mpps the 32-bit hit_cnt cannot wrap + * within one second optimization interval. */ + subtable->hit_cnt++; + goto next; + } + } + + /* None of the found rules was a match. Reset the i-th bit to + * keep searching this key in the next subtable. */ + ULLONG_SET0(found_map, i); /* Did not match. */ + next: + ; /* Keep Sparse happy. 
*/ + } + + return found_map; +} diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c index 749a478a8..b42ca35e3 100644 --- a/lib/dpif-netdev.c +++ b/lib/dpif-netdev.c @@ -252,8 +252,7 @@ static bool dpcls_lookup(struct dpcls *cls, const struct netdev_flow_key *keys[], struct dpcls_rule **rules, size_t cnt, int *num_lookups_p); -bool dpcls_rule_matches_key(const struct dpcls_rule *rule, - const struct netdev_flow_key *target); + /* Set of supported meter flags */ #define DP_SUPPORTED_METER_FLAGS_MASK \ (OFPMF13_STATS | OFPMF13_PKTPS | OFPMF13_KBPS | OFPMF13_BURST) @@ -2768,23 +2767,6 @@ netdev_flow_key_init_masked(struct netdev_flow_key *dst, (dst_u64 - miniflow_get_values(&dst->mf)) * 8); } -/* Returns a hash value for the bits of 'key' where there are 1-bits in - * 'mask'. */ -static inline uint32_t -netdev_flow_key_hash_in_mask(const struct netdev_flow_key *key, - const struct netdev_flow_key *mask) -{ - const uint64_t *p = miniflow_get_values(&mask->mf); - uint32_t hash = 0; - uint64_t value; - - NETDEV_FLOW_KEY_FOR_EACH_IN_FLOWMAP(value, key, mask->mf.map) { - hash = hash_add64(hash, value & *p++); - } - - return hash_finish(hash, (p - miniflow_get_values(&mask->mf)) * 8); -} - static inline bool emc_entry_alive(struct emc_entry *ce) { @@ -7882,55 +7864,6 @@ dpcls_rule_matches_key(const struct dpcls_rule *rule, return true; } -uint32_t -dpcls_subtable_lookup_generic(struct dpcls_subtable *subtable, - uint32_t keys_map, - const struct netdev_flow_key *keys[], - struct dpcls_rule **rules) -{ - int i; - uint32_t found_map; - - /* Compute hashes for the remaining keys. Each search-key is - * masked with the subtable's mask to avoid hashing the wildcarded - * bits. */ - uint32_t hashes[NETDEV_MAX_BURST]; - ULLONG_FOR_EACH_1 (i, keys_map) { - hashes[i] = netdev_flow_key_hash_in_mask(keys[i], - &subtable->mask); - } - - /* Lookup. 
*/ - const struct cmap_node *nodes[NETDEV_MAX_BURST]; - found_map = cmap_find_batch(&subtable->rules, keys_map, hashes, nodes); - - /* Check results. When the i-th bit of found_map is set, it means - * that a set of nodes with a matching hash value was found for the - * i-th search-key. Due to possible hash collisions we need to check - * which of the found rules, if any, really matches our masked - * search-key. */ - ULLONG_FOR_EACH_1 (i, found_map) { - struct dpcls_rule *rule; - - CMAP_NODE_FOR_EACH (rule, cmap_node, nodes[i]) { - if (OVS_LIKELY(dpcls_rule_matches_key(rule, keys[i]))) { - rules[i] = rule; - /* Even at 20 Mpps the 32-bit hit_cnt cannot wrap - * within one second optimization interval. */ - subtable->hit_cnt++; - goto next; - } - } - /* None of the found rules was a match. Reset the i-th bit to - * keep searching this key in the next subtable. */ - ULLONG_SET0(found_map, i); /* Did not match. */ - next: - ; /* Keep Sparse happy. */ - } - - return found_map; -} - /* For each miniflow in 'keys' performs a classifier lookup writing the result * into the corresponding slot in 'rules'. If a particular entry in 'keys' is * NULL it is skipped. 
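Note on the verification step in the generic lookup moved by this patch: a set bit in found_map only says that some rule with the same hash exists, so each candidate still has to be compared against the masked key, and the bit is cleared when nothing matches. The following is a minimal standalone sketch of that pattern, not the OVS code. The names toy_rule, toy_verify_batch and toy_verify_demo are hypothetical, and a plain linked list stands in for the cmap node chain.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a classifier rule; not an OVS type. */
struct toy_rule {
    uint64_t key;                /* Masked key stored with the rule. */
    const struct toy_rule *next; /* Next candidate sharing the same hash. */
};

/* For every set bit i in 'found_map', scan the candidate chain 'chains[i]'
 * for a rule whose key equals 'masked_keys[i]'.  A hit is written to
 * 'rules[i]'; a miss clears bit i so the caller can retry that key in
 * another table.  Returns the updated map. */
uint32_t
toy_verify_batch(uint32_t found_map, const uint64_t masked_keys[],
                 const struct toy_rule *const chains[],
                 const struct toy_rule *rules[])
{
    for (int i = 0; i < 32; i++) {
        uint32_t bit = UINT32_C(1) << i;
        const struct toy_rule *r;

        if (!(found_map & bit)) {
            continue;
        }
        for (r = chains[i]; r; r = r->next) {
            if (r->key == masked_keys[i]) {
                break;
            }
        }
        if (r) {
            rules[i] = r;
        } else {
            found_map &= ~bit;   /* Hash matched, key did not. */
        }
    }
    return found_map;
}

/* Small fixed scenario: keys 0 and 2 have candidates, only key 0 matches. */
uint32_t
toy_verify_demo(void)
{
    static const struct toy_rule r_b = { 0xbeef, NULL };
    static const struct toy_rule r_a = { 0xaaaa, &r_b };
    static const struct toy_rule r_c = { 0x1234, NULL };
    const uint64_t masked_keys[3] = { 0xbeef, 0, 0x9999 };
    const struct toy_rule *const chains[3] = { &r_a, NULL, &r_c };
    const struct toy_rule *rules[3] = { NULL, NULL, NULL };
    uint32_t map = toy_verify_batch(0x5, masked_keys, chains, rules);

    assert(rules[0] == &r_b);    /* Second candidate in the chain matched. */
    return map;
}
```

In the sketch, key 2 has a candidate with an equal hash but a different key, so its bit is cleared and only bit 0 survives. That is the case the "Did not match" branch in the patch handles.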
From patchwork Wed Jul 17 18:21:46 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Van Haaren, Harry" X-Patchwork-Id: 1133386 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=openvswitch.org (client-ip=140.211.169.12; helo=mail.linuxfoundation.org; envelope-from=ovs-dev-bounces@openvswitch.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=intel.com Received: from mail.linuxfoundation.org (mail.linuxfoundation.org [140.211.169.12]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 45plxG5Fbrz9s3l for ; Thu, 18 Jul 2019 04:22:42 +1000 (AEST) Received: from mail.linux-foundation.org (localhost [127.0.0.1]) by mail.linuxfoundation.org (Postfix) with ESMTP id 8EA35F7B; Wed, 17 Jul 2019 18:20:38 +0000 (UTC) X-Original-To: dev@openvswitch.org Delivered-To: ovs-dev@mail.linuxfoundation.org Received: from smtp1.linuxfoundation.org (smtp1.linux-foundation.org [172.17.192.35]) by mail.linuxfoundation.org (Postfix) with ESMTPS id 018B5F46 for ; Wed, 17 Jul 2019 18:20:37 +0000 (UTC) X-Greylist: domain auto-whitelisted by SQLgrey-1.7.6 Received: from mga01.intel.com (mga01.intel.com [192.55.52.88]) by smtp1.linuxfoundation.org (Postfix) with ESMTPS id A2B9D756 for ; Wed, 17 Jul 2019 18:20:35 +0000 (UTC) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga005.jf.intel.com ([10.7.209.41]) by fmsmga101.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 17 Jul 2019 11:20:35 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.64,275,1559545200"; d="scan'208";a="343114096" Received: from silpixa00399779.ir.intel.com (HELO silpixa00399779.ger.corp.intel.com) ([10.237.222.100]) by orsmga005.jf.intel.com with 
ESMTP; 17 Jul 2019 11:20:33 -0700 From: Harry van Haaren To: dev@openvswitch.org Date: Wed, 17 Jul 2019 19:21:46 +0100 Message-Id: <20190717182147.5042-5-harry.van.haaren@intel.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20190717182147.5042-1-harry.van.haaren@intel.com> References: <20190717130033.25114-1-harry.van.haaren@intel.com> <20190717182147.5042-1-harry.van.haaren@intel.com> X-Spam-Status: No, score=-6.9 required=5.0 tests=BAYES_00,RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on smtp1.linux-foundation.org Cc: i.maximets@samsung.com Subject: [ovs-dev] [PATCH v13 4/5] dpif-netdev: Refactor generic implementation X-BeenThere: ovs-dev@openvswitch.org X-Mailman-Version: 2.1.12 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , MIME-Version: 1.0 Sender: ovs-dev-bounces@openvswitch.org Errors-To: ovs-dev-bounces@openvswitch.org This commit refactors the generic implementation. The goal of this refactor is to simplify the code to enable "specialization" of the functions at compile time. Given compile-time optimizations, the compiler is able to unroll loops, and create optimized code sequences due to compile time knowledge of loop-trip counts. In order to enable these compiler optimizations, we must refactor the code to pass the loop-trip counts to functions as compile time constants. This patch allows the number of miniflow-bits set per "unit" in the miniflow to be passed around as a function argument. Note that this patch does NOT yet take advantage of doing so, this is only a refactor to enable it in the next patches. Signed-off-by: Harry van Haaren Tested-by: Malvika Gupta --- v13: - Moved blocks scratch array to thread local storage. This encapsulates the blocks array better inside the implementation where it is required, and avoids bleeding the details to the dpcls or PMD level. Thanks Ilya for suggesting the DEFINE_STATIC_PER_THREAD_DATA method. 
- Removed blocks scratch array from struct dpcls
- Removed blocks_scratch parameter from lookup_func prototype

v12:
- Fix Caps and . (Ilya)
- Fixed typos (Ilya)
- Added mf_bits and mf_masks in this patch (Ilya)
- Fixed rebase conflicts

v11:
- Rebased to previous changes
- Fix typo in commit message (Ian)
- Fix variable declaration spacing (Ian)
- Remove function names from comments (Ian)
- Replace magic 8 with sizeof(uint64_t) (Ian)
- Capitalize and end comments with a stop. (Ian/Ilya)
- Add build time assert to validate FLOWMAP_UNITS (Ilya)
- Add note on ALWAYS_INLINE operation
- Add space after ULLONG_FOR_EACH_1 (Ilya)
- Use hash_add_words64() instead of rolling own loop (Ilya)
  Note that hash_words64_inline() calls hash_finish() with a fixed value,
  so it was not the right hash function for this usage. Used
  hash_add_words64() and manual hash_finish() to re-use as much of hashing
  code as we can.

v10:
- Rebase updates from previous patches
- Fix whitespace indentation of func params
- Removed restrict keyword, Windows CI failing when it is used (Ian)
- Fix integer 0 used to set NULL pointer (Ilya)
- Postpone free() call on cls->blocks_scratch (Ilya)
- Fix indentation of a function

v9:
- Use count_1bits in favour of __builtin_popcount (Ilya)
- Use ALWAYS_INLINE instead of __attribute__ syntax (Ilya)

v8:
- Rework block_cache and mf_masks to avoid variable-length array due to
  compiler issues. Provisioning for worst case is not a good solution due
  to magnitude of over-provisioning required.
- Rework netdev_flatten function removing unused parameter --- lib/dpif-netdev-lookup-generic.c | 241 +++++++++++++++++++++++++++---- lib/dpif-netdev-private.h | 12 +- lib/dpif-netdev.c | 51 ++++++- 3 files changed, 269 insertions(+), 35 deletions(-) diff --git a/lib/dpif-netdev-lookup-generic.c b/lib/dpif-netdev-lookup-generic.c index 8064911b3..de45099f8 100644 --- a/lib/dpif-netdev-lookup-generic.c +++ b/lib/dpif-netdev-lookup-generic.c @@ -27,61 +27,226 @@ #include "dpif-netdev-perf.h" #include "dpif-provider.h" #include "flow.h" +#include "ovs-thread.h" #include "packets.h" #include "pvector.h" -/* Returns a hash value for the bits of 'key' where there are 1-bits in - * 'mask'. */ -static inline uint32_t -netdev_flow_key_hash_in_mask(const struct netdev_flow_key *key, - const struct netdev_flow_key *mask) +VLOG_DEFINE_THIS_MODULE(dpif_lookup_generic); + +/* Lookup functions below depends on the internal structure of flowmap. */ +BUILD_ASSERT_DECL(FLOWMAP_UNITS == 2); + +/* Per thread data to store the blocks cache. The 'blocks_cache_count' variable + * stores the size of the allocated space in uint64_t blocks (so * 8 to get the + * size in bytes). + */ +DEFINE_STATIC_PER_THREAD_DATA(uint64_t *, blocks_scratch_ptr, 0); +DEFINE_STATIC_PER_THREAD_DATA(uint32_t, blocks_scratch_count_ptr, 0); + +/* Given a packet, table and mf_masks, this function iterates over each bit + * set in the subtable, and calculates the appropriate metadata to store in the + * blocks_scratch[]. + * + * The results of the blocks_scratch[] can be used for hashing, and later for + * verification of if a rule matches the given packet. 
+ */ +static inline void +netdev_flow_key_flatten_unit(const uint64_t *pkt_blocks, + const uint64_t *tbl_blocks, + const uint64_t *mf_masks, + uint64_t *blocks_scratch, + const uint64_t pkt_mf_bits, + const uint32_t count) { - const uint64_t *p = miniflow_get_values(&mask->mf); - uint32_t hash = 0; - uint64_t value; + uint32_t i; + + for (i = 0; i < count; i++) { + uint64_t mf_mask = mf_masks[i]; + /* Calculate the block index for the packet metadata. */ + uint64_t idx_bits = mf_mask & pkt_mf_bits; + const uint32_t pkt_idx = count_1bits(idx_bits); - NETDEV_FLOW_KEY_FOR_EACH_IN_FLOWMAP (value, key, mask->mf.map) { - hash = hash_add64(hash, value & *p); - p++; + /* Check if the packet has the subtable miniflow bit set. If yes, the + * block at the above pkt_idx will be stored, otherwise it is masked + * out to be zero. + */ + uint64_t pkt_has_mf_bit = (mf_mask + 1) & pkt_mf_bits; + uint64_t no_bit = ((!pkt_has_mf_bit) > 0) - 1; + + /* Mask packet block by table block, and mask to zero if packet + * doesn't actually contain this block of metadata. + */ + blocks_scratch[i] = pkt_blocks[pkt_idx] & tbl_blocks[i] & no_bit; } +} + +/* This function takes a packet, and subtable and writes an array of uint64_t + * blocks. The blocks contain the metadata that the subtable matches on, in + * the same order as the subtable, allowing linear iteration over the blocks. + * + * To calculate the blocks contents, the netdev_flow_key_flatten_unit function + * is called twice, once for each "unit" of the miniflow. This call can be + * inlined by the compiler for performance. + * + * Note that the u0_count and u1_count variables can be compile-time constants, + * allowing the loop in the inlined flatten_unit() function to be compile-time + * unrolled, or possibly removed totally by unrolling by the loop iterations. + * The compile time optimizations enabled by this design improves performance. 
+ */ +static inline void +netdev_flow_key_flatten(const struct netdev_flow_key *key, + const struct netdev_flow_key *mask, + const uint64_t *mf_masks, + uint64_t *blocks_scratch, + const uint32_t u0_count, + const uint32_t u1_count) +{ + /* Load mask from subtable, mask with packet mf, popcount to get idx. */ + const uint64_t *pkt_blocks = miniflow_get_values(&key->mf); + const uint64_t *tbl_blocks = miniflow_get_values(&mask->mf); - return hash_finish(hash, (p - miniflow_get_values(&mask->mf)) * 8); + /* Packet miniflow bits to be masked by pre-calculated mf_masks. */ + const uint64_t pkt_bits_u0 = key->mf.map.bits[0]; + const uint32_t pkt_bits_u0_pop = count_1bits(pkt_bits_u0); + const uint64_t pkt_bits_u1 = key->mf.map.bits[1]; + + /* Unit 0 flattening */ + netdev_flow_key_flatten_unit(&pkt_blocks[0], + &tbl_blocks[0], + &mf_masks[0], + &blocks_scratch[0], + pkt_bits_u0, + u0_count); + + /* Unit 1 flattening: + * Move the pointers forward in the arrays based on u0 offsets, NOTE: + * 1) pkt blocks indexed by actual popcount of u0, which is NOT always + * the same as the amount of bits set in the subtable. + * 2) mf_masks, tbl_block and blocks_scratch are all "flat" arrays, so + * the index is always u0_count. + */ + netdev_flow_key_flatten_unit(&pkt_blocks[pkt_bits_u0_pop], + &tbl_blocks[u0_count], + &mf_masks[u0_count], + &blocks_scratch[u0_count], + pkt_bits_u1, + u1_count); } -uint32_t -dpcls_subtable_lookup_generic(struct dpcls_subtable *subtable, - uint32_t keys_map, - const struct netdev_flow_key *keys[], - struct dpcls_rule **rules) +/* Compares a rule and the blocks representing a key, returns 1 on a match. 
*/ +static inline uint64_t +netdev_rule_matches_key(const struct dpcls_rule *rule, + const uint32_t mf_bits_total, + const uint64_t *blocks_scratch) { - int i; - uint32_t found_map; + const uint64_t *keyp = miniflow_get_values(&rule->flow.mf); + const uint64_t *maskp = miniflow_get_values(&rule->mask->mf); + uint64_t not_match = 0; + + for (int i = 0; i < mf_bits_total; i++) { + not_match |= (blocks_scratch[i] & maskp[i]) != keyp[i]; + } + + /* Invert result to show match as 1. */ + return !not_match; +} - /* Compute hashes for the remaining keys. Each search-key is - * masked with the subtable's mask to avoid hashing the wildcarded - * bits. */ +/* Const prop version of the function: note that mf bits total and u0 are + * explicitly passed in here, while they're also available at runtime from the + * subtable pointer. By making them compile time, we enable the compiler to + * unroll loops and flatten out code-sequences based on the knowledge of the + * mf_bits_* compile time values. This results in improved performance. + * + * Note: this function is marked with ALWAYS_INLINE to ensure the compiler + * inlines the below code, and then uses the compile time constants to make + * specialized versions of the runtime code. Without ALWAYS_INLINE, the + * compiler might decide to not inline, and performance will suffer. 
+ */ +static inline uint32_t ALWAYS_INLINE +lookup_generic_impl(struct dpcls_subtable *subtable, + uint32_t keys_map, + const struct netdev_flow_key *keys[], + struct dpcls_rule **rules, + const uint32_t bit_count_u0, + const uint32_t bit_count_u1) +{ + const uint32_t n_pkts = count_1bits(keys_map); + ovs_assert(NETDEV_MAX_BURST >= n_pkts); uint32_t hashes[NETDEV_MAX_BURST]; + + const uint32_t bit_count_total = bit_count_u0 + bit_count_u1; + const uint32_t block_count_required = bit_count_total * NETDEV_MAX_BURST; + uint64_t *mf_masks = subtable->mf_masks; + int i; + + /* Blocks scratch is an optimization to re-use the same packet miniflow + * block data when doing rule-verify. This reduces work done during lookup + * and hence improves performance. The blocks_scratch array is stored as a + * thread local variable, as each thread requires its own blocks memory. + */ + uint64_t **blocks_scratch_ptr = blocks_scratch_ptr_get(); + uint32_t *blocks_scratch_count_ptr = blocks_scratch_count_ptr_get(); + uint32_t blocks_scratch_count = *blocks_scratch_count_ptr; + + /* Check if this thread already has a large enough blocks_scratch array + * allocated. This is a predictable UNLIKLEY branch as it will only occur + * once at startup, or if a subtable with higher blocks count is added. + */ + if (OVS_UNLIKELY(blocks_scratch_count < block_count_required || + !*blocks_scratch_ptr)) { + /* Free old scratch memory if it was allocated. */ + if (*blocks_scratch_ptr) { + free(*blocks_scratch_ptr); + } + + /* Allocate new memory for blocks_scratch, and store new size */ + uint64_t *new_ptr = xmalloc(sizeof(uint64_t) * block_count_required); + *blocks_scratch_ptr = new_ptr; + *blocks_scratch_count_ptr = block_count_required; + VLOG_DBG("block scratch array resized to %d\n", block_count_required); + } + + uint64_t *blocks_scratch = *blocks_scratch_ptr; + + /* Flatten the packet metadata into the blocks_scratch[] using subtable. 
*/ + ULLONG_FOR_EACH_1 (i, keys_map) { + netdev_flow_key_flatten(keys[i], + &subtable->mask, + mf_masks, + &blocks_scratch[i * bit_count_total], + bit_count_u0, + bit_count_u1); + } + + /* Hash the now linearized blocks of packet metadata. */ ULLONG_FOR_EACH_1 (i, keys_map) { - hashes[i] = netdev_flow_key_hash_in_mask(keys[i], &subtable->mask); + uint64_t *block_ptr = &blocks_scratch[i * bit_count_total]; + uint32_t hash = hash_add_words64(0, block_ptr, bit_count_total); + hashes[i] = hash_finish(hash, bit_count_total * 8); } - /* Lookup. */ + /* Lookup: this returns a bitmask of packets where the hash table had + * an entry for the given hash key. Presence of a hash key does not + * guarantee matching the key, as there can be hash collisions. + */ + uint32_t found_map; const struct cmap_node *nodes[NETDEV_MAX_BURST]; + found_map = cmap_find_batch(&subtable->rules, keys_map, hashes, nodes); - /* Check results. When the i-th bit of found_map is set, it means - * that a set of nodes with a matching hash value was found for the - * i-th search-key. Due to possible hash collisions we need to check - * which of the found rules, if any, really matches our masked - * search-key. */ + /* Verify that packet actually matched rule. If not found, a hash + * collision has taken place, so continue searching with the next node. + */ ULLONG_FOR_EACH_1 (i, found_map) { struct dpcls_rule *rule; CMAP_NODE_FOR_EACH (rule, cmap_node, nodes[i]) { - if (OVS_LIKELY(dpcls_rule_matches_key(rule, keys[i]))) { + const uint32_t cidx = i * bit_count_total; + uint32_t match = netdev_rule_matches_key(rule, bit_count_total, + &blocks_scratch[cidx]); + + if (OVS_LIKELY(match)) { rules[i] = rule; - /* Even at 20 Mpps the 32-bit hit_cnt cannot wrap - * within one second optimization interval. 
*/ subtable->hit_cnt++; goto next; } @@ -96,3 +261,15 @@ dpcls_subtable_lookup_generic(struct dpcls_subtable *subtable, return found_map; } + +/* Generic lookup function that uses runtime provided mf bits for iterating. */ +uint32_t +dpcls_subtable_lookup_generic(struct dpcls_subtable *subtable, + uint32_t keys_map, + const struct netdev_flow_key *keys[], + struct dpcls_rule **rules) +{ + return lookup_generic_impl(subtable, keys_map, keys, rules, + subtable->mf_bits_set_unit0, + subtable->mf_bits_set_unit1); +} diff --git a/lib/dpif-netdev-private.h b/lib/dpif-netdev-private.h index 555856482..610851a10 100644 --- a/lib/dpif-netdev-private.h +++ b/lib/dpif-netdev-private.h @@ -60,7 +60,7 @@ uint32_t (*dpcls_subtable_lookup_func)(struct dpcls_subtable *subtable, const struct netdev_flow_key *keys[], struct dpcls_rule **rules); -/* Prototype for generic lookup func, using same code path as before. */ +/* Prototype for generic lookup func, using generic scalar code path. */ uint32_t dpcls_subtable_lookup_generic(struct dpcls_subtable *subtable, uint32_t keys_map, @@ -77,12 +77,22 @@ struct dpcls_subtable { uint32_t hit_cnt; /* Number of match hits in subtable in current optimization interval. */ + /* Miniflow fingerprint that the subtable matches on. The miniflow "bits" + * are used to select the actual dpcls lookup implementation at subtable + * creation time. + */ + uint8_t mf_bits_set_unit0; + uint8_t mf_bits_set_unit1; + /* The lookup function to use for this subtable. If there is a known * property of the subtable (eg: only 3 bits of miniflow metadata is * used for the lookup) then this can point at an optimized version of * the lookup function for this particular subtable. */ dpcls_subtable_lookup_func lookup_func; + /* Caches the masks to match a packet to, reducing runtime calculations. */ + uint64_t *mf_masks; + struct netdev_flow_key mask; /* Wildcards for fields (const). */ /* 'mask' must be the last field, additional space is allocated here. 
*/ }; diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c index b42ca35e3..702170698 100644 --- a/lib/dpif-netdev.c +++ b/lib/dpif-netdev.c @@ -7649,6 +7649,7 @@ static void dpcls_subtable_destroy_cb(struct dpcls_subtable *subtable) { cmap_destroy(&subtable->rules); + ovsrcu_postpone(free, subtable->mf_masks); ovsrcu_postpone(free, subtable); } @@ -7701,7 +7702,17 @@ dpcls_create_subtable(struct dpcls *cls, const struct netdev_flow_key *mask) subtable->hit_cnt = 0; netdev_flow_key_clone(&subtable->mask, mask); - /* Decide which hash/lookup/verify function to use. */ + /* The count of bits in the mask defines the space required for masks. + * Then call gen_masks() to create the appropriate masks, avoiding the cost + * of doing runtime calculations. */ + uint32_t unit0 = count_1bits(mask->mf.map.bits[0]); + uint32_t unit1 = count_1bits(mask->mf.map.bits[1]); + subtable->mf_bits_set_unit0 = unit0; + subtable->mf_bits_set_unit1 = unit1; + subtable->mf_masks = xmalloc(sizeof(uint64_t) * (unit0 + unit1)); + netdev_flow_key_gen_masks(mask, subtable->mf_masks, unit0, unit1); + + /* Assign the generic lookup - this works with any miniflow fingerprint. */ subtable->lookup_func = dpcls_subtable_lookup_generic; cmap_insert(&cls->subtables_map, &subtable->cmap_node, mask->hash); @@ -7846,6 +7857,43 @@ dpcls_remove(struct dpcls *cls, struct dpcls_rule *rule) } } +/* Inner loop for mask generation of a unit, see netdev_flow_key_gen_masks. */ +static inline void +netdev_flow_key_gen_mask_unit(uint64_t iter, + const uint64_t count, + uint64_t *mf_masks) +{ + int i; + for (i = 0; i < count; i++) { + uint64_t lowest_bit = (iter & -iter); + iter &= ~lowest_bit; + mf_masks[i] = (lowest_bit - 1); + } + /* Checks that count has covered all bits in the iter bitmap. */ + ovs_assert(iter == 0); +} + +/* Generate a mask for each block in the miniflow, based on the bits set. This + * allows easily masking packets with the generated array here, without + * calculations. 
This replaces runtime-calculating the masks. + * @param tbl The table to generate the mf_masks for + * @param mf_masks Pointer to a u64 array of at least (mf_bits_u0 + mf_bits_u1) entries + * @param mf_bits_u0 Number of bits set in unit0 of the miniflow + * @param mf_bits_u1 Number of bits set in unit1 of the miniflow + */ +void +netdev_flow_key_gen_masks(const struct netdev_flow_key *tbl, + uint64_t *mf_masks, + const uint32_t mf_bits_u0, + const uint32_t mf_bits_u1) +{ + uint64_t iter_u0 = tbl->mf.map.bits[0]; + uint64_t iter_u1 = tbl->mf.map.bits[1]; + + netdev_flow_key_gen_mask_unit(iter_u0, mf_bits_u0, &mf_masks[0]); + netdev_flow_key_gen_mask_unit(iter_u1, mf_bits_u1, &mf_masks[mf_bits_u0]); +} + /* Returns true if 'target' satisfies 'key' in 'mask', that is, if each 1-bit * in 'mask' the values in 'key' and 'target' are the same. */ bool @@ -7886,7 +7934,6 @@ dpcls_lookup(struct dpcls *cls, const struct netdev_flow_key *keys[], BUILD_ASSERT_DECL(MAP_BITS >= NETDEV_MAX_BURST); struct dpcls_subtable *subtable; - uint32_t keys_map = TYPE_MAXIMUM(uint32_t); /* Set all bits. 
*/ if (cnt != MAP_BITS) { From patchwork Wed Jul 17 18:21:47 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Van Haaren, Harry" X-Patchwork-Id: 1133387 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=openvswitch.org (client-ip=140.211.169.12; helo=mail.linuxfoundation.org; envelope-from=ovs-dev-bounces@openvswitch.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=intel.com Received: from mail.linuxfoundation.org (mail.linuxfoundation.org [140.211.169.12]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 45ply45KPYz9s3l for ; Thu, 18 Jul 2019 04:23:24 +1000 (AEST) Received: from mail.linux-foundation.org (localhost [127.0.0.1]) by mail.linuxfoundation.org (Postfix) with ESMTP id 6D181F8C; Wed, 17 Jul 2019 18:20:40 +0000 (UTC) X-Original-To: dev@openvswitch.org Delivered-To: ovs-dev@mail.linuxfoundation.org Received: from smtp1.linuxfoundation.org (smtp1.linux-foundation.org [172.17.192.35]) by mail.linuxfoundation.org (Postfix) with ESMTPS id 408A3F54 for ; Wed, 17 Jul 2019 18:20:38 +0000 (UTC) X-Greylist: domain auto-whitelisted by SQLgrey-1.7.6 Received: from mga01.intel.com (mga01.intel.com [192.55.52.88]) by smtp1.linuxfoundation.org (Postfix) with ESMTPS id 74A36887 for ; Wed, 17 Jul 2019 18:20:37 +0000 (UTC) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga005.jf.intel.com ([10.7.209.41]) by fmsmga101.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 17 Jul 2019 11:20:37 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.64,275,1559545200"; d="scan'208";a="343114103" Received: from silpixa00399779.ir.intel.com (HELO silpixa00399779.ger.corp.intel.com) ([10.237.222.100]) by 
orsmga005.jf.intel.com with ESMTP; 17 Jul 2019 11:20:35 -0700 From: Harry van Haaren To: dev@openvswitch.org Date: Wed, 17 Jul 2019 19:21:47 +0100 Message-Id: <20190717182147.5042-6-harry.van.haaren@intel.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20190717182147.5042-1-harry.van.haaren@intel.com> References: <20190717130033.25114-1-harry.van.haaren@intel.com> <20190717182147.5042-1-harry.van.haaren@intel.com> X-Spam-Status: No, score=-6.9 required=5.0 tests=BAYES_00,RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on smtp1.linux-foundation.org Cc: i.maximets@samsung.com Subject: [ovs-dev] [PATCH v13 5/5] dpif-netdev: Add specialized generic scalar functions X-BeenThere: ovs-dev@openvswitch.org X-Mailman-Version: 2.1.12 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , MIME-Version: 1.0 Sender: ovs-dev-bounces@openvswitch.org Errors-To: ovs-dev-bounces@openvswitch.org This commit adds a number of specialized functions, that handle common miniflow fingerprints. This enables compiler optimization, resulting in higher performance. Below a quick description of how this optimization actually works; "Specialized functions" are "instances" of the generic implementation, but the compiler is given extra context when compiling. In the case of iterating miniflow datastructures, the most interesting value to enable compile time optimizations is the loop trip count per unit. In order to create a specialized function, there is a generic implementation, which uses a for() loop without the compiler knowing the loop trip count at compile time. The loop trip count is passed in as an argument to the function: uint32_t miniflow_impl_generic(struct miniflow *mf, uint32_t loop_count) { for(uint32_t i = 0; i < loop_count; i++) // do work } In order to "specialize" the function, we call the generic implementation with hard-coded numbers - these are compile time constants! 
    uint32_t miniflow_impl_loop5(struct miniflow *mf, uint32_t loop_count)
    {
        // use hard coded constant for compile-time constant-propagation
        return miniflow_impl_generic(mf, 5);
    }

Given the compiler is aware of the loop trip count at compile time, it can
perform an optimization known as "constant propagation". Combined with
inlining of the miniflow_impl_generic() function, the compiler is now
enabled to *compile time* unroll the loop 5x, and produce "flat" code.

The last step to using the specialized functions is to utilize a
function-pointer to choose the specialized (or generic) implementation. The
selection of the function pointer is performed at subtable creation time,
when the miniflow fingerprint of the subtable is known. This technique is
known as "multiple dispatch" in some literature, as it uses multiple items
of information (miniflow bit counts) to select the dispatch function.

By pointing the function pointer at the optimized implementation, OvS
benefits from the compile time optimizations at runtime.

Signed-off-by: Harry van Haaren
Tested-by: Malvika Gupta
---

v13:
- Update macros to new lookup function prototype without blocks scratch

v12:
- Fix typo (Ian)
- Fix missing . after comments (Ian)
- Improve return value comments for probe function (Ian)
- Spaces after number in optimized lookup declarations (Ilya)
- Make VLOG level debug instead of info (Ilya)

v11:
- Use MACROs to declare and check optimized functions (Ilya)
- Use capital letter for commit message (Ilya)
- Rebase onto latest patchset changes
- Added NEWS entry for data-path subtable specialization (Ian/Harry)
- Checkpatch notes an "incorrect bracketing" in the MACROs, however I
  didn't find a solution that it does like.

v10:
- Rebase changes from previous patches.
- Remove "restrict" keyword as windows CI failed, see here for details:
  https://ci.appveyor.com/project/istokes/ovs-q8fvv/builds/24398228

v8:
- Rework to use blocks_cache from the dpcls instance, to avoid variable
  length arrays in the data-path.
---
 NEWS | 4 +++ lib/dpif-netdev-lookup-generic.c | 49 ++++++++++++++++++++++++++++++++ lib/dpif-netdev-private.h | 8 ++++++ lib/dpif-netdev.c | 9 ++++-- 4 files changed, 68 insertions(+), 2 deletions(-) diff --git a/NEWS b/NEWS index 81130e667..4cfffb1bc 100644 --- a/NEWS +++ b/NEWS @@ -34,6 +34,10 @@ Post-v2.11.0 * 'ovs-appctl exit' now implies cleanup of non-internal ports in userspace datapath regardless of '--cleanup' option. Use '--cleanup' to remove internal ports too. + * Datapath classifier code refactored to enable function pointers to select + the lookup implementation at runtime. This enables specialization of + specific subtables based on the miniflow attributes, enhancing the + performance of the subtable search. - OVSDB: * OVSDB clients can now resynchronize with clustered servers much more quickly after a brief disconnection, saving bandwidth and CPU time. diff --git a/lib/dpif-netdev-lookup-generic.c b/lib/dpif-netdev-lookup-generic.c index de45099f8..50edc483c 100644 --- a/lib/dpif-netdev-lookup-generic.c +++ b/lib/dpif-netdev-lookup-generic.c @@ -269,7 +269,56 @@ dpcls_subtable_lookup_generic(struct dpcls_subtable *subtable, const struct netdev_flow_key *keys[], struct dpcls_rule **rules) { + /* Here the runtime subtable->mf_bits counts are used, which forces the + * compiler to iterate normal for() loops. Due to this limitation in the + * compiler's available optimizations, this function has lower performance + * than the below specialized functions. + */ return lookup_generic_impl(subtable, keys_map, keys, rules, subtable->mf_bits_set_unit0, subtable->mf_bits_set_unit1); } + +/* Expand out specialized functions with U0 and U1 bit attributes. 
*/ +#define DECLARE_OPTIMIZED_LOOKUP_FUNCTION(U0, U1) \ + static uint32_t \ + dpcls_subtable_lookup_mf_u0w##U0##_u1w##U1( \ + struct dpcls_subtable *subtable, \ + uint32_t keys_map, \ + const struct netdev_flow_key *keys[],\ + struct dpcls_rule **rules) \ + { \ + return lookup_generic_impl(subtable, keys_map, keys, rules, U0, U1); \ + } \ + +DECLARE_OPTIMIZED_LOOKUP_FUNCTION(5, 1) +DECLARE_OPTIMIZED_LOOKUP_FUNCTION(4, 1) +DECLARE_OPTIMIZED_LOOKUP_FUNCTION(4, 0) + +/* Check if a specialized function is valid for the required subtable. */ +#define CHECK_LOOKUP_FUNCTION(U0,U1) \ + if (!f && u0_bits == U0 && u1_bits == U1) { \ + f = dpcls_subtable_lookup_mf_u0w##U0##_u1w##U1; \ + } + +/* Probe function to lookup an available specialized function. + * If capable to run the requested miniflow fingerprint, this function returns + * the most optimal implementation for that miniflow fingerprint. + * @retval Non-NULL A valid function to handle the miniflow bit pattern + * @retval NULL The requested miniflow is not supported by this implementation. + */ +dpcls_subtable_lookup_func +dpcls_subtable_generic_probe(uint32_t u0_bits, uint32_t u1_bits) +{ + dpcls_subtable_lookup_func f = NULL; + + CHECK_LOOKUP_FUNCTION(5, 1); + CHECK_LOOKUP_FUNCTION(4, 1); + CHECK_LOOKUP_FUNCTION(4, 0); + + if (f) { + VLOG_DBG("Subtable using Generic Optimized for u0 %d, u1 %d\n", + u0_bits, u1_bits); + } + return f; +} diff --git a/lib/dpif-netdev-private.h b/lib/dpif-netdev-private.h index 610851a10..68c33a0f9 100644 --- a/lib/dpif-netdev-private.h +++ b/lib/dpif-netdev-private.h @@ -67,6 +67,14 @@ dpcls_subtable_lookup_generic(struct dpcls_subtable *subtable, const struct netdev_flow_key *keys[], struct dpcls_rule **rules); +/* Probe function to select a specialized version of the generic lookup + * implementation. This provides performance benefit due to compile-time + * optimizations such as loop-unrolling. 
These are enabled by the compile-time + * constants in the specific function implementations. + */ +dpcls_subtable_lookup_func +dpcls_subtable_generic_probe(uint32_t u0_bit_count, uint32_t u1_bit_count); + /* A set of rules that all have the same fields wildcarded. */ struct dpcls_subtable { /* The fields are only used by writers. */ diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c index 702170698..d0a1c58ad 100644 --- a/lib/dpif-netdev.c +++ b/lib/dpif-netdev.c @@ -7712,8 +7712,13 @@ dpcls_create_subtable(struct dpcls *cls, const struct netdev_flow_key *mask) subtable->mf_masks = xmalloc(sizeof(uint64_t) * (unit0 + unit1)); netdev_flow_key_gen_masks(mask, subtable->mf_masks, unit0, unit1); - /* Assign the generic lookup - this works with any miniflow fingerprint. */ - subtable->lookup_func = dpcls_subtable_lookup_generic; + /* Probe for a specialized generic lookup function. */ + subtable->lookup_func = dpcls_subtable_generic_probe(unit0, unit1); + + /* If not set, assign generic lookup. Generic works for any miniflow. */ + if (!subtable->lookup_func) { + subtable->lookup_func = dpcls_subtable_lookup_generic; + } cmap_insert(&cls->subtables_map, &subtable->cmap_node, mask->hash); /* Add the new subtable at the end of the pvector (with no hits yet) */
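Note on the specialization scheme as a whole: the pieces in this patch (an always-inlined implementation taking the trip counts as arguments, macro-generated wrappers that pass literals, a probe that returns a wrapper or NULL, and a fallback to the generic wrapper) can be tried outside OVS with a toy kernel. The sketch below is only an illustration under stated assumptions. Every toy_* and TOY_* name is hypothetical, the XOR loop stands in for the real flatten/hash/verify work, and whether the loop actually gets unrolled depends on the compiler and optimization level.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__GNUC__)
#define TOY_ALWAYS_INLINE __attribute__((always_inline))
#else
#define TOY_ALWAYS_INLINE
#endif

struct toy_table;
typedef uint64_t (*toy_lookup_func)(const struct toy_table *t,
                                    const uint64_t *blocks);

/* Hypothetical subtable: per-unit bit counts plus the chosen lookup. */
struct toy_table {
    uint32_t u0_bits;
    uint32_t u1_bits;
    toy_lookup_func lookup;
};

/* Shared body.  When 'u0' and 'u1' are literals at an inlined call site, the
 * trip count is a compile-time constant and the loop may be unrolled. */
static inline uint64_t TOY_ALWAYS_INLINE
toy_lookup_impl(const uint64_t *blocks, const uint32_t u0, const uint32_t u1)
{
    uint64_t acc = 0;

    for (uint32_t i = 0; i < u0 + u1; i++) {
        acc ^= blocks[i];
    }
    return acc;
}

/* Generic wrapper: trip counts are only known at runtime. */
uint64_t
toy_lookup_generic(const struct toy_table *t, const uint64_t *blocks)
{
    return toy_lookup_impl(blocks, t->u0_bits, t->u1_bits);
}

/* Specialized wrappers: same prototype, literal trip counts. */
#define TOY_DECLARE_LOOKUP(U0, U1)                                  \
    static uint64_t                                                 \
    toy_lookup_u0w##U0##_u1w##U1(const struct toy_table *t,         \
                                 const uint64_t *blocks)            \
    {                                                               \
        (void) t;                                                   \
        return toy_lookup_impl(blocks, U0, U1);                     \
    }

TOY_DECLARE_LOOKUP(5, 1)
TOY_DECLARE_LOOKUP(4, 0)

#define TOY_CHECK_LOOKUP(U0, U1)                                    \
    if (!f && u0 == U0 && u1 == U1) {                               \
        f = toy_lookup_u0w##U0##_u1w##U1;                           \
    }

/* Returns a specialized wrapper for the fingerprint, or NULL if none. */
toy_lookup_func
toy_probe(uint32_t u0, uint32_t u1)
{
    toy_lookup_func f = NULL;

    TOY_CHECK_LOOKUP(5, 1);
    TOY_CHECK_LOOKUP(4, 0);
    return f;
}

/* Selection happens once, at table creation, not per packet. */
void
toy_table_init(struct toy_table *t, uint32_t u0, uint32_t u1)
{
    t->u0_bits = u0;
    t->u1_bits = u1;
    t->lookup = toy_probe(u0, u1);
    if (!t->lookup) {
        t->lookup = toy_lookup_generic;
    }
    assert(t->lookup != NULL);
}
```

A caller would run toy_table_init(&t, 5, 1) once and then call t.lookup(&t, blocks) per batch. A fingerprint with no specialized wrapper, for example (3, 3), silently uses the generic path and returns the same result.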