From patchwork Fri Oct 14 14:37:04 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Bodireddy, Bhanuprakash"
X-Patchwork-Id: 682304
X-Patchwork-Delegate: diproiettod@vmware.com
From: Bhanuprakash Bodireddy
To: dev@openvswitch.org
Date: Fri, 14 Oct 2016 15:37:04 +0100
Message-Id: <1476455835-77641-2-git-send-email-bhanuprakash.bodireddy@intel.com>
In-Reply-To: <1476455835-77641-1-git-send-email-bhanuprakash.bodireddy@intel.com>
References: <1476455835-77641-1-git-send-email-bhanuprakash.bodireddy@intel.com>
Subject: [ovs-dev] [PATCH v3 01/12] dpcls: Use 32 packet batches for lookups.

This patch increases the number of packets processed in a batch during a
lookup from 16 to 32.
Processing batches of 32 packets improves performance and also one of the
internal loops can be avoided here.

Signed-off-by: Antonio Fischetti
Co-authored-by: Bhanuprakash Bodireddy
Signed-off-by: Bhanuprakash Bodireddy
Acked-by: Jarno Rajahalme
---
 lib/dpif-netdev.c | 110 ++++++++++++++++++++++--------------------------------
 1 file changed, 45 insertions(+), 65 deletions(-)

diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index eb9f764..0a4f338 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -4985,23 +4985,21 @@ dpcls_lookup(struct dpcls *cls, const struct netdev_flow_key keys[],
              int *num_lookups_p)
 {
     /* The received 'cnt' miniflows are the search-keys that will be processed
-     * in batches of 16 elements. N_MAPS will contain the number of these
-     * 16-elements batches. i.e. for 'cnt' = 32, N_MAPS will be 2. The batch
-     * size 16 was experimentally found faster than 8 or 32. */
-    typedef uint16_t map_type;
+     * to find a matching entry into the available subtables.
+     * The number of bits in map_type is equal to NETDEV_MAX_BURST. */
+    typedef uint32_t map_type;
 #define MAP_BITS (sizeof(map_type) * CHAR_BIT)
+    BUILD_ASSERT_DECL(MAP_BITS >= NETDEV_MAX_BURST);
 
-#if !defined(__CHECKER__) && !defined(_WIN32)
-    const int N_MAPS = DIV_ROUND_UP(cnt, MAP_BITS);
-#else
-    enum { N_MAPS = DIV_ROUND_UP(NETDEV_MAX_BURST, MAP_BITS) };
-#endif
-    map_type maps[N_MAPS];
     struct dpcls_subtable *subtable;
 
-    memset(maps, 0xff, sizeof maps);
-    if (cnt % MAP_BITS) {
-        maps[N_MAPS - 1] >>= MAP_BITS - cnt % MAP_BITS; /* Clear extra bits. */
+    map_type keys_map = TYPE_MAXIMUM(map_type);
+    map_type found_map;
+    uint32_t hashes[MAP_BITS];
+    const struct cmap_node *nodes[MAP_BITS];
+
+    if (cnt != NETDEV_MAX_BURST) {
+        keys_map >>= NETDEV_MAX_BURST - cnt; /* Clear extra bits. */
     }
     memset(rules, 0, cnt * sizeof *rules);
 
@@ -5015,61 +5013,43 @@ dpcls_lookup(struct dpcls *cls, const struct netdev_flow_key keys[],
      * search-key, the search for that key can stop because the rules are
      * non-overlapping. */
     PVECTOR_FOR_EACH (subtable, &cls->subtables) {
-        const struct netdev_flow_key *mkeys = keys;
-        struct dpcls_rule **mrules = rules;
-        map_type remains = 0;
-        int m;
-
-        BUILD_ASSERT_DECL(sizeof remains == sizeof *maps);
-
-        /* Loops on each batch of 16 search-keys. */
-        for (m = 0; m < N_MAPS; m++, mkeys += MAP_BITS, mrules += MAP_BITS) {
-            uint32_t hashes[MAP_BITS];
-            const struct cmap_node *nodes[MAP_BITS];
-            unsigned long map = maps[m];
-            int i;
-
-            if (!map) {
-                continue; /* Skip empty maps. */
-            }
-
-            /* Compute hashes for the remaining keys. Each search-key is
-             * masked with the subtable's mask to avoid hashing the wildcarded
-             * bits. */
-            ULLONG_FOR_EACH_1(i, map) {
-                hashes[i] = netdev_flow_key_hash_in_mask(&mkeys[i],
-                                                         &subtable->mask);
-            }
-            /* Lookup. */
-            map = cmap_find_batch(&subtable->rules, map, hashes, nodes);
-            /* Check results. When the i-th bit of map is set, it means that a
-             * set of nodes with a matching hash value was found for the i-th
-             * search-key. Due to possible hash collisions we need to check
-             * which of the found rules, if any, really matches our masked
-             * search-key. */
-            ULLONG_FOR_EACH_1(i, map) {
-                struct dpcls_rule *rule;
-
-                CMAP_NODE_FOR_EACH (rule, cmap_node, nodes[i]) {
-                    if (OVS_LIKELY(dpcls_rule_matches_key(rule, &mkeys[i]))) {
-                        mrules[i] = rule;
-                        /* Even at 20 Mpps the 32-bit hit_cnt cannot wrap
-                         * within one second optimization interval */
-                        subtable->hit_cnt++;
-                        lookups_match += subtable_pos;
-                        goto next;
-                    }
+        int i;
+
+        /* Compute hashes for the remaining keys. Each search-key is
+         * masked with the subtable's mask to avoid hashing the wildcarded
+         * bits. */
+        ULLONG_FOR_EACH_1(i, keys_map) {
+            hashes[i] = netdev_flow_key_hash_in_mask(&keys[i],
+                                                     &subtable->mask);
+        }
+        /* Lookup. */
+        found_map = cmap_find_batch(&subtable->rules, keys_map, hashes, nodes);
+        /* Check results. When the i-th bit of found_map is set, it means
+         * that a set of nodes with a matching hash value was found for the
+         * i-th search-key. Due to possible hash collisions we need to check
+         * which of the found rules, if any, really matches our masked
+         * search-key. */
+        ULLONG_FOR_EACH_1(i, found_map) {
+            struct dpcls_rule *rule;
+
+            CMAP_NODE_FOR_EACH (rule, cmap_node, nodes[i]) {
+                if (OVS_LIKELY(dpcls_rule_matches_key(rule, &keys[i]))) {
+                    rules[i] = rule;
+                    /* Even at 20 Mpps the 32-bit hit_cnt cannot wrap
+                     * within one second optimization interval. */
+                    subtable->hit_cnt++;
+                    lookups_match += subtable_pos;
+                    goto next;
                 }
-                /* None of the found rules was a match. Reset the i-th bit to
-                 * keep searching in the next subtable. */
-                ULLONG_SET0(map, i); /* Did not match. */
-            next:
-                ; /* Keep Sparse happy. */
             }
-            maps[m] &= ~map; /* Clear the found rules. */
-            remains |= maps[m];
+            /* None of the found rules was a match. Reset the i-th bit to
+             * keep searching this key in the next subtable. */
+            ULLONG_SET0(found_map, i); /* Did not match. */
+        next:
+            ; /* Keep Sparse happy. */
         }
-        if (!remains) {
+        keys_map &= ~found_map; /* Clear the found rules. */
+        if (!keys_map) {
             if (num_lookups_p) {
                 *num_lookups_p = lookups_match;
             }
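
For readers who want to see the single-bitmap bookkeeping in isolation, the following is a minimal standalone sketch, not OVS code. It assumes a burst limit of 32 (TOY_MAX_BURST stands in for NETDEV_MAX_BURST), and struct toy_subtable and toy_batch_lookup() are hypothetical names invented for illustration. A plain mask/value comparison replaces the hash, cmap lookup, and rule match of the real dpcls_lookup(). The point is only to show how one uint32_t can carry one pending bit per key across all subtables, with no outer loop over per-batch maps.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: stand-in for NETDEV_MAX_BURST, taken to be 32 here. */
#define TOY_MAX_BURST 32

typedef uint32_t map_type;
#define MAP_BITS (sizeof(map_type) * CHAR_BIT)

/* Hypothetical toy subtable: a key matches when (key & mask) == value. */
struct toy_subtable {
    uint32_t mask;
    uint32_t value;
};

/* Looks up 'cnt' keys (1..TOY_MAX_BURST) in 'tables', in order. Stores in
 * results[i] the index of the first matching table, or -1 if none matched.
 * Returns the number of keys that found a match. */
static size_t
toy_batch_lookup(const struct toy_subtable *tables, size_t n_tables,
                 const uint32_t *keys, size_t cnt, int *results)
{
    map_type keys_map = UINT32_MAX;     /* One pending bit per key. */
    size_t n_found = 0;
    size_t t, i;

    assert(MAP_BITS >= TOY_MAX_BURST);
    assert(cnt >= 1 && cnt <= TOY_MAX_BURST);

    if (cnt != TOY_MAX_BURST) {
        keys_map >>= TOY_MAX_BURST - cnt;   /* Clear extra bits. */
    }
    for (i = 0; i < cnt; i++) {
        results[i] = -1;
    }

    for (t = 0; t < n_tables; t++) {
        map_type found_map = 0;

        /* Visit only the keys that are still pending. */
        for (i = 0; i < cnt; i++) {
            map_type bit = (map_type) 1 << i;

            if ((keys_map & bit)
                && (keys[i] & tables[t].mask) == tables[t].value) {
                results[i] = (int) t;
                found_map |= bit;
                n_found++;
            }
        }
        keys_map &= ~found_map;     /* Clear the found keys. */
        if (!keys_map) {
            break;                  /* Every key matched; stop early. */
        }
    }
    return n_found;
}
```

With two toy subtables {0xff00, 0x1200} and {0x00ff, 0x0034} and the keys 0x1234, 0x5634, 0x9999, the first key resolves in table 0, the second in table 1, and the third stays unmatched. The sketch rejects cnt == 0 up front because shifting a 32-bit value by 32 is undefined behaviour in C.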