From patchwork Sun Oct 12 05:47:32 2008 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Nick Piggin X-Patchwork-Id: 4048 X-Patchwork-Delegate: paulus@samba.org Return-Path: X-Original-To: patchwork-incoming@ozlabs.org Delivered-To: patchwork-incoming@ozlabs.org Received: from ozlabs.org (localhost [127.0.0.1]) by ozlabs.org (Postfix) with ESMTP id 67C6DDDEFA for ; Sun, 12 Oct 2008 17:01:22 +1100 (EST) X-Original-To: linuxppc-dev@ozlabs.org Delivered-To: linuxppc-dev@ozlabs.org X-Greylist: delayed 787 seconds by postgrey-1.31 at ozlabs; Sun, 12 Oct 2008 17:01:07 EST Received: from mx2.suse.de (ns2.suse.de [195.135.220.15]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client CN "mx2.suse.de", Issuer "CAcert Class 3 Root" (verified OK)) by ozlabs.org (Postfix) with ESMTPS id 92877DDDFD for ; Sun, 12 Oct 2008 17:01:07 +1100 (EST) Received: from Relay2.suse.de (relay-ext.suse.de [195.135.221.8]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mx2.suse.de (Postfix) with ESMTP id 6FAB24643A; Sun, 12 Oct 2008 07:48:25 +0200 (CEST) Date: Sun, 12 Oct 2008 07:47:32 +0200 From: Nick Piggin To: Ingo Molnar , linux-arch@vger.kernel.org, linuxppc-dev@ozlabs.org, paulus@samba.org, benh@kernel.crashing.org Subject: [patch] powerpc: implement optimised mutex fastpaths Message-ID: <20081012054732.GB12535@wotan.suse.de> References: <20081012054634.GA12535@wotan.suse.de> Mime-Version: 1.0 Content-Disposition: inline In-Reply-To: <20081012054634.GA12535@wotan.suse.de> User-Agent: Mutt/1.5.9i X-BeenThere: linuxppc-dev@ozlabs.org X-Mailman-Version: 2.1.11 Precedence: list List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: linuxppc-dev-bounces+patchwork-incoming=ozlabs.org@ozlabs.org Errors-To: linuxppc-dev-bounces+patchwork-incoming=ozlabs.org@ozlabs.org Implement a more optimal mutex fastpath for powerpc, making use of acquire and release barrier semantics. This takes the mutex lock+unlock benchmark from 203 to 173 cycles on a G5. Signed-off-by: Nick Piggin Index: linux-2.6/arch/powerpc/include/asm/mutex.h =================================================================== --- linux-2.6.orig/arch/powerpc/include/asm/mutex.h +++ linux-2.6/arch/powerpc/include/asm/mutex.h @@ -1,9 +1,136 @@ /* - * Pull in the generic implementation for the mutex fastpath. + * Optimised mutex implementation of include/asm-generic/mutex-dec.h algorithm + */ +#ifndef _ASM_POWERPC_MUTEX_H +#define _ASM_POWERPC_MUTEX_H + +static inline int __mutex_cmpxchg_lock(atomic_t *v, int old, int new) +{ + int t; + + __asm__ __volatile__ ( +"1: lwarx %0,0,%1 # mutex trylock\n\ + cmpw 0,%0,%2\n\ + bne- 2f\n" + PPC405_ERR77(0,%1) +" stwcx. %3,0,%1\n\ + bne- 1b" + ISYNC_ON_SMP + "\n\ +2:" + : "=&r" (t) + : "r" (&v->counter), "r" (old), "r" (new) + : "cc", "memory"); + + return t; +} + +static inline int __mutex_dec_return_lock(atomic_t *v) +{ + int t; + + __asm__ __volatile__( +"1: lwarx %0,0,%1 # mutex lock\n\ + addic %0,%0,-1\n" + PPC405_ERR77(0,%1) +" stwcx. %0,0,%1\n\ + bne- 1b" + ISYNC_ON_SMP + : "=&r" (t) + : "r" (&v->counter) + : "cc", "memory"); + + return t; +} + +static inline int __mutex_inc_return_unlock(atomic_t *v) +{ + int t; + + __asm__ __volatile__( + LWSYNC_ON_SMP +"1: lwarx %0,0,%1 # mutex unlock\n\ + addic %0,%0,1\n" + PPC405_ERR77(0,%1) +" stwcx. %0,0,%1 \n\ + bne- 1b" + : "=&r" (t) + : "r" (&v->counter) + : "cc", "memory"); + + return t; +} + +/** + * __mutex_fastpath_lock - try to take the lock by moving the count + * from 1 to a 0 value + * @count: pointer of type atomic_t + * @fail_fn: function to call if the original value was not 1 + * + * Change the count from 1 to a value lower than 1, and call if + * it wasn't 1 originally. This function MUST leave the value lower than + * 1 even when the "1" assertion wasn't true. + */ +static inline void +__mutex_fastpath_lock(atomic_t *count, void (*fail_fn)(atomic_t *)) +{ + if (unlikely(__mutex_dec_return_lock(count) < 0)) + fail_fn(count); +} + +/** + * __mutex_fastpath_lock_retval - try to take the lock by moving the count + * from 1 to a 0 value + * @count: pointer of type atomic_t + * @fail_fn: function to call if the original value was not 1 + * + * Change the count from 1 to a value lower than 1, and call if + * it wasn't 1 originally. This function returns 0 if the fastpath succeeds, + * or anything the slow path function returns. + */ +static inline int +__mutex_fastpath_lock_retval(atomic_t *count, int (*fail_fn)(atomic_t *)) +{ + if (unlikely(__mutex_dec_return_lock(count) < 0)) + return fail_fn(count); + return 0; +} + +/** + * __mutex_fastpath_unlock - try to promote the count from 0 to 1 + * @count: pointer of type atomic_t + * @fail_fn: function to call if the original value was not 0 + * + * Try to promote the count from 0 to 1. If it wasn't 0, call . + * In the failure case, this function is allowed to either set the value to + * 1, or to set it to a value lower than 1. + */ +static inline void +__mutex_fastpath_unlock(atomic_t *count, void (*fail_fn)(atomic_t *)) +{ + if (unlikely(__mutex_inc_return_unlock(count) <= 0)) + fail_fn(count); +} + +#define __mutex_slowpath_needs_to_unlock() 1 + +/** + * __mutex_fastpath_trylock - try to acquire the mutex, without waiting + * + * @count: pointer of type atomic_t + * @fail_fn: fallback function * - * TODO: implement optimized primitives instead, or leave the generic - * implementation in place, or pick the atomic_xchg() based generic - * implementation. (see asm-generic/mutex-xchg.h for details) + * Change the count from 1 to a value lower than 1, and return 0 (failure) + * if it wasn't 1 originally, or return 1 (success) otherwise. This function + * MUST leave the value lower than 1 even when the "1" assertion wasn't true. + * Additionally, if the value was < 0 originally, this function must not leave + * it to 0 on failure. */ +static inline int +__mutex_fastpath_trylock(atomic_t *count, int (*fail_fn)(atomic_t *)) +{ + if (likely(__mutex_cmpxchg_lock(count, 1, 0) == 1)) + return 1; +} -#include +#endif