From patchwork Mon Dec 4 20:16:46 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Bodireddy, Bhanuprakash"
X-Patchwork-Id: 844400
From: Bhanuprakash Bodireddy
To: dev@openvswitch.org
Date: Mon, 4 Dec 2017 20:16:46 +0000
Message-Id: <1512418610-84032-1-git-send-email-bhanuprakash.bodireddy@intel.com>
Subject: [ovs-dev] [PATCH RFC 1/5] compiler: Introduce OVS_PREFETCH variants.

This commit introduces prefetch variants built on the GCC built-in
prefetch function.

The prefetch variants give the user finer control over the data caching
strategy, in order to increase cache efficiency and minimize cache
pollution. Data reference patterns here can be classified into

   - Non-temporal (NT) - Data that is referenced once and not reused in
                         the immediate future.
   - Temporal          - Data that will be used again soon.

The macro variants can be used where there are

   - Predictable memory access patterns.
   - Execution pipeline stalls when data isn't available.
   - Time-consuming loops.

For example:

   OVS_PREFETCH_CACHE(addr, OPCH_LTR)
      - OPCH_LTR : OVS PREFETCH CACHE HINT - LOW TEMPORAL READ.
      - __builtin_prefetch(addr, 0, 1)
      - Prefetch data into the L3 cache for read-only use.

   OVS_PREFETCH_CACHE(addr, OPCH_HTW)
      - OPCH_HTW : OVS PREFETCH CACHE HINT - HIGH TEMPORAL WRITE.
      - __builtin_prefetch(addr, 1, 3)
      - Prefetch data into all caches in anticipation of a write. In
        doing so it invalidates other cached copies so as to gain
        'exclusive' access.

   OVS_PREFETCH(addr)
      - OPCH_HTR : OVS PREFETCH CACHE HINT - HIGH TEMPORAL READ.
      - __builtin_prefetch(addr, 0, 3)
      - Prefetch data into all caches in anticipation of a read, on the
        expectation that the data will be used again soon.
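
To make the hints above concrete, here is a minimal sketch (not part of
this patch) of a loop with a predictable, read-once access pattern. The
names sum_with_prefetch and PREFETCH_DISTANCE, and the distance of 8, are
assumptions made for illustration. The literal arguments (0, 0) are what
OPCH_NTR maps to; they are written out because __builtin_prefetch expects
compile-time constants for its second and third arguments.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical tuning knob: how many elements ahead to prefetch. */
#define PREFETCH_DISTANCE 8

/* Sums 'count' values while prefetching ahead for reading.  Each element
 * is touched exactly once, so the non-temporal read hint (rw = 0,
 * locality = 0, i.e. what OPCH_NTR maps to) is the natural choice. */
static uint64_t
sum_with_prefetch(const uint32_t *values, size_t count)
{
    uint64_t total = 0;

    assert(values != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        if (i + PREFETCH_DISTANCE < count) {
#if __GNUC__
            /* Only a hint: it never changes the computed result. */
            __builtin_prefetch(&values[i + PREFETCH_DISTANCE], 0, 0);
#endif
        }
        total += values[i];
    }
    return total;
}
```

Whether such a hint helps at all depends on the processor and on the
chosen distance, so it would need to be measured rather than assumed.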
Signed-off-by: Bhanuprakash Bodireddy
---
 include/openvswitch/compiler.h | 89 +++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 87 insertions(+), 2 deletions(-)

diff --git a/include/openvswitch/compiler.h b/include/openvswitch/compiler.h
index c7cb930..5d5553a 100644
--- a/include/openvswitch/compiler.h
+++ b/include/openvswitch/compiler.h
@@ -229,11 +229,96 @@
  * instruction as OVS_PREFETCH(), or bring the data into the cache in an
  * exclusive state. */
 #if __GNUC__
-#define OVS_PREFETCH(addr) __builtin_prefetch((addr))
-#define OVS_PREFETCH_WRITE(addr) __builtin_prefetch((addr), 1)
+enum cache_locality {
+    NON_TEMPORAL_LOCALITY,
+    LOW_TEMPORAL_LOCALITY,
+    MODERATE_TEMPORAL_LOCALITY,
+    HIGH_TEMPORAL_LOCALITY
+};
+
+enum cache_rw {
+    PREFETCH_READ,
+    PREFETCH_WRITE
+};
+
+/* Implementation details of prefetch hint instructions may vary across
+ * different processors and microarchitectures.
+ *
+ * OPCH_NTW, OPCH_LTW and OPCH_MTW use the prefetchwt1 instruction and
+ * OPCH_HTW uses the prefetchw instruction when available.
+ */
+#define OVS_PREFETCH_CACHE_HINT \
+    OPCH(OPCH_NTR, PREFETCH_READ, NON_TEMPORAL_LOCALITY, \
+         "Fetch data to non-temporal cache to minimize cache pollution") \
+    OPCH(OPCH_LTR, PREFETCH_READ, LOW_TEMPORAL_LOCALITY, \
+         "Fetch data to L2 and L3 caches") \
+    OPCH(OPCH_MTR, PREFETCH_READ, MODERATE_TEMPORAL_LOCALITY, \
+         "Fetch data to L2 and L3 caches, same as LTR on " \
+         "Nehalem, Westmere, Sandy Bridge and newer microarchitectures") \
+    OPCH(OPCH_HTR, PREFETCH_READ, HIGH_TEMPORAL_LOCALITY, \
+         "Fetch data in to all cache levels L1, L2 and L3") \
+    OPCH(OPCH_NTW, PREFETCH_WRITE, NON_TEMPORAL_LOCALITY, \
+         "Fetch data to L2 and L3 caches in exclusive state " \
+         "in anticipation of write") \
+    OPCH(OPCH_LTW, PREFETCH_WRITE, LOW_TEMPORAL_LOCALITY, \
+         "Fetch data to L2 and L3 caches in exclusive state") \
+    OPCH(OPCH_MTW, PREFETCH_WRITE, MODERATE_TEMPORAL_LOCALITY, \
+         "Fetch data in to L2 and L3 caches in exclusive state") \
+    OPCH(OPCH_HTW, PREFETCH_WRITE, HIGH_TEMPORAL_LOCALITY, \
+         "Fetch data in to all cache levels in exclusive state")
+
+/* Indexes for cache prefetch types. */
+enum {
+#define OPCH(ENUM, RW, LOCALITY, EXPLANATION) ENUM##_INDEX,
+    OVS_PREFETCH_CACHE_HINT
+#undef OPCH
+};
+
+/* Cache prefetch types. */
+enum ovs_prefetch_type {
+#define OPCH(ENUM, RW, LOCALITY, EXPLANATION) ENUM = 1 << ENUM##_INDEX,
+    OVS_PREFETCH_CACHE_HINT
+#undef OPCH
+};
+
+#define OVS_PREFETCH_CACHE(addr, TYPE) switch (TYPE) \
+{ \
+    case OPCH_NTR: \
+        __builtin_prefetch((addr), PREFETCH_READ, NON_TEMPORAL_LOCALITY); \
+        break; \
+    case OPCH_LTR: \
+        __builtin_prefetch((addr), PREFETCH_READ, LOW_TEMPORAL_LOCALITY); \
+        break; \
+    case OPCH_MTR: \
+        __builtin_prefetch((addr), PREFETCH_READ, \
+                           MODERATE_TEMPORAL_LOCALITY); \
+        break; \
+    case OPCH_HTR: \
+        __builtin_prefetch((addr), PREFETCH_READ, HIGH_TEMPORAL_LOCALITY); \
+        break; \
+    case OPCH_NTW: \
+        __builtin_prefetch((addr), PREFETCH_WRITE, NON_TEMPORAL_LOCALITY); \
+        break; \
+    case OPCH_LTW: \
+        __builtin_prefetch((addr), PREFETCH_WRITE, LOW_TEMPORAL_LOCALITY); \
+        break; \
+    case OPCH_MTW: \
+        __builtin_prefetch((addr), PREFETCH_WRITE, \
+                           MODERATE_TEMPORAL_LOCALITY); \
+        break; \
+    case OPCH_HTW: \
+        __builtin_prefetch((addr), PREFETCH_WRITE, HIGH_TEMPORAL_LOCALITY); \
+        break; \
+ \
+}
+
+/* Retain this for compatibility. */
+#define OVS_PREFETCH(addr) OVS_PREFETCH_CACHE(addr, OPCH_HTR)
+#define OVS_PREFETCH_WRITE(addr) OVS_PREFETCH_CACHE(addr, OPCH_HTW)
 #else
 #define OVS_PREFETCH(addr)
 #define OVS_PREFETCH_WRITE(addr)
+#define OVS_PREFETCH_CACHE(addr, OP)
 #endif
 
 /* Build assertions.
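
OVS_PREFETCH_CACHE_HINT is expanded twice with different OPCH definitions
(the X-macro technique), and OVS_PREFETCH_CACHE expands to a bare switch
statement. The sketch below replays both on a hypothetical two-entry
table (all DEMO_* names are made up) and wraps the switch in
do { } while (0), one possible way to keep the macro usable as a single
statement in front of an else. It is an illustration under those
assumptions, not a proposed change to compiler.h.

```c
#include <assert.h>

/* Hypothetical two-entry stand-in for OVS_PREFETCH_CACHE_HINT. */
#define DEMO_HINTS \
    DEMO_HINT(DEMO_NTR) \
    DEMO_HINT(DEMO_HTW)

/* First expansion: consecutive indexes (0, 1). */
enum {
#define DEMO_HINT(ENUM) ENUM##_INDEX,
    DEMO_HINTS
#undef DEMO_HINT
};

/* Second expansion: one bit per hint (1, 2). */
enum demo_hint {
#define DEMO_HINT(ENUM) ENUM = 1 << ENUM##_INDEX,
    DEMO_HINTS
#undef DEMO_HINT
};

#if __GNUC__
/* The do { } while (0) wrapper turns the switch into one statement that
 * still requires the caller's trailing semicolon. */
#define DEMO_PREFETCH(addr, TYPE) \
    do { \
        switch (TYPE) { \
        case DEMO_NTR: \
            __builtin_prefetch((addr), 0, 0); \
            break; \
        case DEMO_HTW: \
            __builtin_prefetch((addr), 1, 3); \
            break; \
        } \
    } while (0)
#else
#define DEMO_PREFETCH(addr, TYPE) do { (void) (addr); } while (0)
#endif

/* Returns 1 after prefetching and incrementing '*slot', 0 otherwise.
 * With a bare switch macro, the ';' before 'else' would end the 'if'
 * early and the 'else' would fail to compile. */
static int
demo_touch(int *slot, int want_write)
{
    if (want_write)
        DEMO_PREFETCH(slot, DEMO_HTW);
    else
        return 0;
    *slot += 1;
    return 1;
}
```

With the full eight-entry table the same expansion yields OPCH_NTR = 1
through OPCH_HTW = 128, following the order of the entries.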