From patchwork Tue Oct 31 23:39:34 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Wang, Yipeng1" X-Patchwork-Id: 832785 X-Patchwork-Delegate: ian.stokes@intel.com Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=openvswitch.org (client-ip=140.211.169.12; helo=mail.linuxfoundation.org; envelope-from=ovs-dev-bounces@openvswitch.org; receiver=) Received: from mail.linuxfoundation.org (mail.linuxfoundation.org [140.211.169.12]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 3yRSdd08Djz9sNc for ; Wed, 1 Nov 2017 10:45:24 +1100 (AEDT) Received: from mail.linux-foundation.org (localhost [127.0.0.1]) by mail.linuxfoundation.org (Postfix) with ESMTP id 540C7D61; Tue, 31 Oct 2017 23:45:23 +0000 (UTC) X-Original-To: dev@openvswitch.org Delivered-To: ovs-dev@mail.linuxfoundation.org Received: from smtp1.linuxfoundation.org (smtp1.linux-foundation.org [172.17.192.35]) by mail.linuxfoundation.org (Postfix) with ESMTPS id 28585BE0 for ; Tue, 31 Oct 2017 23:45:22 +0000 (UTC) X-Greylist: domain auto-whitelisted by SQLgrey-1.7.6 Received: from mga09.intel.com (mga09.intel.com [134.134.136.24]) by smtp1.linuxfoundation.org (Postfix) with ESMTPS id 1C0924FE for ; Tue, 31 Oct 2017 23:45:20 +0000 (UTC) Received: from orsmga003.jf.intel.com ([10.7.209.27]) by orsmga102.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 31 Oct 2017 16:45:20 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos; i="5.44,326,1505804400"; d="scan'208"; a="1031859824" Received: from bdw-yipeng.jf.intel.com ([10.54.81.30]) by orsmga003.jf.intel.com with ESMTP; 31 Oct 2017 16:45:05 -0700 From: Yipeng Wang To: dev@openvswitch.org Date: Tue, 31 Oct 2017 16:39:34 -0700 Message-Id: <1509493177-28988-3-git-send-email-yipeng1.wang@intel.com> X-Mailer: git-send-email 2.7.4 In-Reply-To: <1509493177-28988-1-git-send-email-yipeng1.wang@intel.com> References: <1509493177-28988-1-git-send-email-yipeng1.wang@intel.com> X-Spam-Status: No, score=-5.0 required=5.0 tests=RCVD_IN_DNSWL_HI, RP_MATCHES_RCVD autolearn=disabled version=3.3.1 X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on smtp1.linux-foundation.org Subject: [ovs-dev] [PATCH v2 2/5] dpif-netdev: Add AVX2 implementation for CD lookup. X-BeenThere: ovs-dev@openvswitch.org X-Mailman-Version: 2.1.12 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , MIME-Version: 1.0 Sender: ovs-dev-bounces@openvswitch.org Errors-To: ovs-dev-bounces@openvswitch.org This patch adds the AVX2 implementation during CD lookup. 16 entries of a bucket will be compared together with the lookup key. This patch depends on the first patch. CC: Darrell Ball CC: Jan Scheurich Signed-off-by: Yipeng Wang Signed-off-by: Antonio Fischetti Co-authored-by: Antonio Fischetti --- evaluation: We setup the testing enviornment same to the previous patch. The AVX2 CD implementation's results are shown below. AVX2 data: 1M flows: no.subtable: 10 20 30 cd-ovs 3895961 3170530 2968555 orig-ovs 2683455 1646227 1240501 speedup 1.45x 1.92x 2.39x --- lib/dpif-netdev.c | 67 ++++++++++++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 66 insertions(+), 1 deletion(-) diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c index ea1d625..78219ba 100644 --- a/lib/dpif-netdev.c +++ b/lib/dpif-netdev.c @@ -30,6 +30,9 @@ #include #include #include +#if defined(__AVX2__) +#include +#endif #ifdef DPDK_NETDEV #include @@ -2378,7 +2381,37 @@ cd_lookup_bulk_pipe(struct dpcls *cls, const struct netdev_flow_key keys[], OVS_PREFETCH(prim_bkt1); OVS_PREFETCH(sec_bkt1); +#ifdef __AVX2__ + prim_hitmask = _mm256_movemask_epi8((__m256i)_mm256_cmpeq_epi16( + _mm256_load_si256((__m256i const *)prim_bkt0->sig), + _mm256_set1_epi16(temp_sig0))); + + + sec_hitmask = _mm256_movemask_epi8((__m256i)_mm256_cmpeq_epi16( + _mm256_load_si256((__m256i const *)sec_bkt0->sig), + _mm256_set1_epi16(temp_sig0))); + if (prim_hitmask) { + loc = raw_ctz(prim_hitmask) >> 1; + data[i-1] = + prim_bkt0->table_index[loc]; + if (data[i-1] != 0 && cls->sub_ptrs[data[i-1]] != 0) { + hits |= 1 << (i - 1); + prim_bkt0 = prim_bkt1; + sec_bkt0 = sec_bkt1; + temp_sig0 = temp_sig1; + continue; + } + } + + if (sec_hitmask) { + loc = raw_ctz(sec_hitmask) >> 1; + data[i-1] = sec_bkt0->table_index[loc]; + if (data[i-1] != 0 && cls->sub_ptrs[data[i-1]] != 0) { + hits |= 1 << (i - 1); + } + } +#else unsigned int j; prim_hitmask = 0; sec_hitmask = 0; @@ -2407,12 +2440,42 @@ cd_lookup_bulk_pipe(struct dpcls *cls, const struct netdev_flow_key keys[], hits |= 1 << (i - 1); } } - +#endif prim_bkt0 = prim_bkt1; sec_bkt0 = sec_bkt1; temp_sig0 = temp_sig1; } +#ifdef __AVX2__ + prim_hitmask = _mm256_movemask_epi8((__m256i)_mm256_cmpeq_epi16( + _mm256_load_si256((__m256i const *)prim_bkt0->sig), + _mm256_set1_epi16(temp_sig0))); + + + sec_hitmask = _mm256_movemask_epi8((__m256i)_mm256_cmpeq_epi16( + _mm256_load_si256((__m256i const *)sec_bkt0->sig), + _mm256_set1_epi16(temp_sig0))); + + if (prim_hitmask) { + loc = raw_ctz(prim_hitmask) >> 1; + data[i-1] = prim_bkt0->table_index[loc]; + if (data[i-1] != 0 && cls->sub_ptrs[data[i-1]] != 0) { + hits |= 1 << (i - 1); + if (hit_mask != NULL) { + *hit_mask = hits; + } + return; + } + } + + if (sec_hitmask) { + loc = raw_ctz(sec_hitmask) >> 1; + data[i-1] = sec_bkt0->table_index[loc]; + if (data[i-1] != 0 && cls->sub_ptrs[data[i-1]] != 0) { + hits |= 1 << (i - 1); + } + } +#else unsigned int j; prim_hitmask = 0; sec_hitmask = 0; @@ -2442,9 +2505,11 @@ cd_lookup_bulk_pipe(struct dpcls *cls, const struct netdev_flow_key keys[], } } +#endif if (hit_mask != NULL) { *hit_mask = hits; } + } static int