diff mbox series

powerpc: optimise WARN_ON()

Message ID 20190817090442.C5FEF106613@localhost.localdomain (mailing list archive)
State Superseded
Headers show
Series powerpc: optimise WARN_ON() | expand

Checks

Context Check Description
snowpatch_ozlabs/apply_patch success Successfully applied on branch next (c9633332103e55bc73d80d07ead28b95a22a85a3)
snowpatch_ozlabs/build-ppc64le success Build succeeded
snowpatch_ozlabs/build-ppc64be success Build succeeded
snowpatch_ozlabs/build-ppc64e success Build succeeded
snowpatch_ozlabs/build-pmac32 success Build succeeded
snowpatch_ozlabs/checkpatch fail total: 1 errors, 0 warnings, 0 checks, 8 lines checked

Commit Message

Christophe Leroy Aug. 17, 2019, 9:04 a.m. UTC
Unlike BUG_ON(x), WARN_ON(x) uses !!(x) as the trigger
of the t(d/w)nei instruction instead of using directly the
value of x.

This leads to GCC adding unnecessary pair of addic/subfe. This was
revealed after adding a WARN_ON() on top of clear_page() in order
to detect misaligned destination:

@@ -49,6 +51,8 @@ static inline void clear_page(void *addr)
 {
 	unsigned int i;

+	WARN_ON((unsigned long)addr & (L1_CACHE_BYTES - 1));
+
 	for (i = 0; i < PAGE_SIZE / L1_CACHE_BYTES; i++, addr += L1_CACHE_BYTES)
 		dcbz(addr);
 }

This resulted on:

0000019c <clear_user_page>:
 19c:	54 68 06 fe 	clrlwi  r8,r3,27
 1a0:	31 48 ff ff 	addic   r10,r8,-1
 1a4:	7d 4a 41 10 	subfe   r10,r10,r8
 1a8:	0f 0a 00 00 	twnei   r10,0
 1ac:	39 20 00 80 	li      r9,128
 1b0:	7d 29 03 a6 	mtctr   r9
 1b4:	7c 00 1f ec 	dcbz    0,r3
 1b8:	38 63 00 20 	addi    r3,r3,32
 1bc:	42 00 ff f8 	bdnz    1b4 <clear_user_page+0x18>
 1c0:	7c a3 2b 78 	mr      r3,r5
 1c4:	48 00 00 00 	b       1c4 <clear_user_page+0x28>
			1c4: R_PPC_REL24	flush_dcache_page

By using (x) instead of !!(x) like BUG_ON() does, the additional
instructions go away:

0000019c <clear_user_page>:
 19c:	54 6a 06 fe 	clrlwi  r10,r3,27
 1a0:	0f 0a 00 00 	twnei   r10,0
 1a4:	39 20 00 80 	li      r9,128
 1a8:	7d 29 03 a6 	mtctr   r9
 1ac:	7c 00 1f ec 	dcbz    0,r3
 1b0:	38 63 00 20 	addi    r3,r3,32
 1b4:	42 00 ff f8 	bdnz    1ac <clear_user_page+0x10>
 1b8:	7c a3 2b 78 	mr      r3,r5
 1bc:	48 00 00 00 	b       1bc <clear_user_page+0x20>
			1bc: R_PPC_REL24	flush_dcache_page

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
---
 arch/powerpc/include/asm/bug.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Segher Boessenkool Aug. 18, 2019, 12:01 p.m. UTC | #1
On Sat, Aug 17, 2019 at 09:04:42AM +0000, Christophe Leroy wrote:
> Unlike BUG_ON(x), WARN_ON(x) uses !!(x) as the trigger
> of the t(d/w)nei instruction instead of using directly the
> value of x.
> 
> This leads to GCC adding unnecessary pair of addic/subfe.

And it has to, it is passed as an "r" to an asm, GCC has to put the "!!"
value into a register.

> By using (x) instead of !!(x) like BUG_ON() does, the additional
> instructions go away:

But is it correct?  What happens if you pass an int to WARN_ON, on a
64-bit kernel?

(You might want to have 64-bit generate either tw or td.  But, with
your __builtin_trap patch, all that will be automatic).


Segher
Christophe Leroy Aug. 19, 2019, 5:40 a.m. UTC | #2
Le 18/08/2019 à 14:01, Segher Boessenkool a écrit :
> On Sat, Aug 17, 2019 at 09:04:42AM +0000, Christophe Leroy wrote:
>> Unlike BUG_ON(x), WARN_ON(x) uses !!(x) as the trigger
>> of the t(d/w)nei instruction instead of using directly the
>> value of x.
>>
>> This leads to GCC adding unnecessary pair of addic/subfe.
> 
> And it has to, it is passed as an "r" to an asm, GCC has to put the "!!"
> value into a register.
> 
>> By using (x) instead of !!(x) like BUG_ON() does, the additional
>> instructions go away:
> 
> But is it correct?  What happens if you pass an int to WARN_ON, on a
> 64-bit kernel?

On a 64-bit kernel, an int is still in a 64-bit register, so there would 
be no problem with tdnei, would it ? an int 0 is the same as an long 0, 
right ?

It is on 32-bit kernel that I see a problem, if one passes a long long 
to WARN_ON(), the forced cast to long will just drop the upper size of 
it. So as of today, BUG_ON() is buggy for that.

> 
> (You might want to have 64-bit generate either tw or td.  But, with
> your __builtin_trap patch, all that will be automatic).
> 

Yes I'll discard this patch and focus on the __builtin_trap() one which 
should solve most issues.

Christophe
Segher Boessenkool Aug. 19, 2019, 7:57 a.m. UTC | #3
On Mon, Aug 19, 2019 at 07:40:42AM +0200, Christophe Leroy wrote:
> Le 18/08/2019 à 14:01, Segher Boessenkool a écrit :
> >On Sat, Aug 17, 2019 at 09:04:42AM +0000, Christophe Leroy wrote:
> >>Unlike BUG_ON(x), WARN_ON(x) uses !!(x) as the trigger
> >>of the t(d/w)nei instruction instead of using directly the
> >>value of x.
> >>
> >>This leads to GCC adding unnecessary pair of addic/subfe.
> >
> >And it has to, it is passed as an "r" to an asm, GCC has to put the "!!"
> >value into a register.
> >
> >>By using (x) instead of !!(x) like BUG_ON() does, the additional
> >>instructions go away:
> >
> >But is it correct?  What happens if you pass an int to WARN_ON, on a
> >64-bit kernel?
> 
> On a 64-bit kernel, an int is still in a 64-bit register, so there would 
> be no problem with tdnei, would it ? an int 0 is the same as an long 0, 
> right ?

The top 32 bits of a 64-bit register holding an int are undefined.  Take
as example

  int x = 42;
  x = ~x;

which may put ffff_ffff_ffff_ffd5 into the reg, not 0000_0000_ffff_ffd5
as you might expect or want.  For tw instructions this makes no difference
(they only look at the low 32 bits anyway); for td insns, it does.

> It is on 32-bit kernel that I see a problem, if one passes a long long 
> to WARN_ON(), the forced cast to long will just drop the upper size of 
> it. So as of today, BUG_ON() is buggy for that.

Sure, it isn't defined what types you can pass to that macro.  Another
thing that makes inline functions much saner to use.

> >(You might want to have 64-bit generate either tw or td.  But, with
> >your __builtin_trap patch, all that will be automatic).
> 
> Yes I'll discard this patch and focus on the __builtin_trap() one which 
> should solve most issues.

But see my comment there about the compiler knowing all code after an
unconditional trap is dead.


Segher
diff mbox series

Patch

diff --git a/arch/powerpc/include/asm/bug.h b/arch/powerpc/include/asm/bug.h
index fed7e6241349..77074582fe65 100644
--- a/arch/powerpc/include/asm/bug.h
+++ b/arch/powerpc/include/asm/bug.h
@@ -107,7 +107,7 @@ 
 		: : "i" (__FILE__), "i" (__LINE__),		\
 		  "i" (BUGFLAG_WARNING|BUGFLAG_TAINT(TAINT_WARN)),\
 		  "i" (sizeof(struct bug_entry)),		\
-		  "r" (__ret_warn_on));				\
+		  "r" ((__force long)(x)));			\
 	}							\
 	unlikely(__ret_warn_on);				\
 })