From patchwork Wed Nov 12 03:51:18 2008
X-Patchwork-Submitter: Nick Piggin
X-Patchwork-Id: 8303
X-Patchwork-Delegate: paulus@samba.org
Date: Wed, 12 Nov 2008 04:51:18 +0100
From: Nick Piggin
To: Paul Mackerras, linuxppc-dev@ozlabs.org
Subject: [patch 2/3] powerpc: optimise smp_rmb
Message-ID: <20081112035118.GG26053@wotan.suse.de>
References: <20081112035048.GF26053@wotan.suse.de>
In-Reply-To: <20081112035048.GF26053@wotan.suse.de>
List-Id: Linux on PowerPC Developers Mail List

After commit 598056d5af8fef1dbe8f96f5c2b641a528184e5a, rmb() becomes a sync
instruction, which is needed to order cacheable vs noncacheable loads.
However smp_rmb() is #defined to rmb(), and smp_rmb() can be an lwsync.

Restore smp_rmb() performance by using lwsync there. Update comments.

Signed-off-by: Nick Piggin

Index: linux-2.6/arch/powerpc/include/asm/system.h
===================================================================
--- linux-2.6.orig/arch/powerpc/include/asm/system.h	2008-11-12 12:28:57.000000000 +1100
+++ linux-2.6/arch/powerpc/include/asm/system.h	2008-11-12 12:35:12.000000000 +1100
@@ -23,15 +23,17 @@
  * read_barrier_depends() prevents data-dependent loads being reordered
  * across this point (nop on PPC).
  *
- * We have to use the sync instructions for mb(), since lwsync doesn't
- * order loads with respect to previous stores. Lwsync is fine for
- * rmb(), though. Note that rmb() actually uses a sync on 32-bit
- * architectures.
+ * *mb() variants without smp_ prefix must order all types of memory
+ * operations with one another. sync is the only instruction sufficient
+ * to do this.
  *
- * For wmb(), we use sync since wmb is used in drivers to order
- * stores to system memory with respect to writes to the device.
- * However, smp_wmb() can be a lighter-weight lwsync or eieio barrier
- * on SMP since it is only used to order updates to system memory.
+ * For the smp_ barriers, ordering is for cacheable memory operations
+ * only. We have to use the sync instruction for smp_mb(), since lwsync
+ * doesn't order loads with respect to previous stores. Lwsync can be
+ * used for smp_rmb() and smp_wmb().
+ *
+ * However, on CPUs that don't support lwsync, lwsync actually maps to a
+ * heavy-weight sync, so smp_wmb() can be a lighter-weight eieio.
  */
 #define mb()   __asm__ __volatile__ ("sync" : : : "memory")
 #define rmb()  __asm__ __volatile__ ("sync" : : : "memory")
@@ -51,7 +53,7 @@
 #endif
 
 #define smp_mb()	mb()
-#define smp_rmb()	rmb()
+#define smp_rmb()	__asm__ __volatile__ (stringify_in_c(LWSYNC) : : :"memory")
 #define smp_wmb()	__asm__ __volatile__ (stringify_in_c(SMPWMB) : : :"memory")
 #define smp_read_barrier_depends()	read_barrier_depends()
 #else
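To illustrate what the smp_ barriers cover, here is a minimal sketch of the
classic producer/consumer pattern on ordinary cacheable memory. It is not
part of the patch; the names data, flag, publish and consume are invented
for the example.

/*
 * Illustrative sketch only, not from the patch.  Both "data" and "flag"
 * are ordinary cacheable kernel memory, so the lwsync-based smp_ barriers
 * are sufficient to order the accesses on SMP powerpc.
 */
static int data;
static int flag;

void publish(void)			/* runs on CPU 0 */
{
	data = 42;
	smp_wmb();			/* order the store to data before the store to flag */
	flag = 1;
}

int consume(void)			/* runs on CPU 1 */
{
	if (flag) {
		smp_rmb();		/* order the load of flag before the load of data */
		return data;		/* sees 42 if flag was observed as 1 */
	}
	return -1;
}

Because both locations are cacheable system memory, the smp_wmb()/smp_rmb()
pair is enough here; ordering a cacheable load against a non-cacheable (MMIO)
load still needs the full sync that rmb() retains after this patch.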