
powerpc: Remove static branch prediction in atomic{, 64}_add_unless

Message ID 1475474583-3085-1-git-send-email-anton@ozlabs.org (mailing list archive)
State Accepted

Commit Message

Anton Blanchard Oct. 3, 2016, 6:03 a.m. UTC
From: Anton Blanchard <anton@samba.org>

I see quite a lot of static branch mispredictions on a simple
web serving workload. The issue is in __atomic_add_unless(), called
from _atomic_dec_and_lock(). There is no obvious common case, so it
is better to let the hardware predict the branch.

Signed-off-by: Anton Blanchard <anton@samba.org>
---
 arch/powerpc/include/asm/atomic.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

Comments

Nicholas Piggin Oct. 4, 2016, 3:18 a.m. UTC | #1
On Mon,  3 Oct 2016 17:03:03 +1100
Anton Blanchard <anton@ozlabs.org> wrote:

> From: Anton Blanchard <anton@samba.org>
> 
> I see quite a lot of static branch mispredictions on a simple
> web serving workload. The issue is in __atomic_add_unless(), called
> from _atomic_dec_and_lock(). There is no obvious common case, so it
> is better to let the hardware predict the branch.

Seems reasonable. How problematic is an unmatched lwarx for performance?
It seems that it will serialize a subsequent lwarx until it is next to
complete (in the case of atomic_dec_and_lock, the spin_lock will
immediately do another lwarx). Maybe that's not a big problem.

Putting a regular load and test here before the lwarx would be
disappointing, because for many users success is the common case.


> Signed-off-by: Anton Blanchard <anton@samba.org>
> ---
>  arch/powerpc/include/asm/atomic.h | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/atomic.h b/arch/powerpc/include/asm/atomic.h
> index f08d567..2b90335 100644
> --- a/arch/powerpc/include/asm/atomic.h
> +++ b/arch/powerpc/include/asm/atomic.h
> @@ -233,7 +233,7 @@ static __inline__ int __atomic_add_unless(atomic_t *v, int a, int u)
>  	PPC_ATOMIC_ENTRY_BARRIER
>  "1:	lwarx	%0,0,%1		# __atomic_add_unless\n\
>  	cmpw	0,%0,%3 \n\
> -	beq-	2f \n\
> +	beq	2f \n\
>  	add	%0,%2,%0 \n"
>  	PPC405_ERR77(0,%2)
>  "	stwcx.	%0,0,%1 \n\
> @@ -539,7 +539,7 @@ static __inline__ int atomic64_add_unless(atomic64_t *v, long a, long u)
>  	PPC_ATOMIC_ENTRY_BARRIER
>  "1:	ldarx	%0,0,%1		# __atomic_add_unless\n\
>  	cmpd	0,%0,%3 \n\
> -	beq-	2f \n\
> +	beq	2f \n\
>  	add	%0,%2,%0 \n"
>  "	stdcx.	%0,0,%1 \n\
>  	bne-	1b \n"
Michael Ellerman Oct. 5, 2016, 2:36 a.m. UTC | #2
On Mon, 2016-03-10 at 06:03:03 UTC, Anton Blanchard wrote:
> From: Anton Blanchard <anton@samba.org>
> 
> I see quite a lot of static branch mispredictions on a simple
> web serving workload. The issue is in __atomic_add_unless(), called
> from _atomic_dec_and_lock(). There is no obvious common case, so it
> is better to let the hardware predict the branch.
> 
> Signed-off-by: Anton Blanchard <anton@samba.org>

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/61e98ebff3ba3d3b17e999dc483c26

cheers

Patch

diff --git a/arch/powerpc/include/asm/atomic.h b/arch/powerpc/include/asm/atomic.h
index f08d567..2b90335 100644
--- a/arch/powerpc/include/asm/atomic.h
+++ b/arch/powerpc/include/asm/atomic.h
@@ -233,7 +233,7 @@  static __inline__ int __atomic_add_unless(atomic_t *v, int a, int u)
 	PPC_ATOMIC_ENTRY_BARRIER
 "1:	lwarx	%0,0,%1		# __atomic_add_unless\n\
 	cmpw	0,%0,%3 \n\
-	beq-	2f \n\
+	beq	2f \n\
 	add	%0,%2,%0 \n"
 	PPC405_ERR77(0,%2)
 "	stwcx.	%0,0,%1 \n\
@@ -539,7 +539,7 @@  static __inline__ int atomic64_add_unless(atomic64_t *v, long a, long u)
 	PPC_ATOMIC_ENTRY_BARRIER
 "1:	ldarx	%0,0,%1		# __atomic_add_unless\n\
 	cmpd	0,%0,%3 \n\
-	beq-	2f \n\
+	beq	2f \n\
 	add	%0,%2,%0 \n"
 "	stdcx.	%0,0,%1 \n\
 	bne-	1b \n"