Message ID | 20190817090442.C5FEF106613@localhost.localdomain (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | powerpc: optimise WARN_ON() | expand |
Context | Check | Description |
---|---|---|
snowpatch_ozlabs/apply_patch | success | Successfully applied on branch next (c9633332103e55bc73d80d07ead28b95a22a85a3) |
snowpatch_ozlabs/build-ppc64le | success | Build succeeded |
snowpatch_ozlabs/build-ppc64be | success | Build succeeded |
snowpatch_ozlabs/build-ppc64e | success | Build succeeded |
snowpatch_ozlabs/build-pmac32 | success | Build succeeded |
snowpatch_ozlabs/checkpatch | fail | total: 1 errors, 0 warnings, 0 checks, 8 lines checked |
On Sat, Aug 17, 2019 at 09:04:42AM +0000, Christophe Leroy wrote: > Unlike BUG_ON(x), WARN_ON(x) uses !!(x) as the trigger > of the t(d/w)nei instruction instead of using directly the > value of x. > > This leads to GCC adding unnecessary pair of addic/subfe. And it has to, it is passed as an "r" to an asm, GCC has to put the "!!" value into a register. > By using (x) instead of !!(x) like BUG_ON() does, the additional > instructions go away: But is it correct? What happens if you pass an int to WARN_ON, on a 64-bit kernel? (You might want to have 64-bit generate either tw or td. But, with your __builtin_trap patch, all that will be automatic). Segher
Le 18/08/2019 à 14:01, Segher Boessenkool a écrit : > On Sat, Aug 17, 2019 at 09:04:42AM +0000, Christophe Leroy wrote: >> Unlike BUG_ON(x), WARN_ON(x) uses !!(x) as the trigger >> of the t(d/w)nei instruction instead of using directly the >> value of x. >> >> This leads to GCC adding unnecessary pair of addic/subfe. > > And it has to, it is passed as an "r" to an asm, GCC has to put the "!!" > value into a register. > >> By using (x) instead of !!(x) like BUG_ON() does, the additional >> instructions go away: > > But is it correct? What happens if you pass an int to WARN_ON, on a > 64-bit kernel? On a 64-bit kernel, an int is still in a 64-bit register, so there would be no problem with tdnei, would it ? an int 0 is the same as an long 0, right ? It is on 32-bit kernel that I see a problem, if one passes a long long to WARN_ON(), the forced cast to long will just drop the upper size of it. So as of today, BUG_ON() is buggy for that. > > (You might want to have 64-bit generate either tw or td. But, with > your __builtin_trap patch, all that will be automatic). > Yes I'll discard this patch and focus on the __builtin_trap() one which should solve most issues. Christophe
On Mon, Aug 19, 2019 at 07:40:42AM +0200, Christophe Leroy wrote: > Le 18/08/2019 à 14:01, Segher Boessenkool a écrit : > >On Sat, Aug 17, 2019 at 09:04:42AM +0000, Christophe Leroy wrote: > >>Unlike BUG_ON(x), WARN_ON(x) uses !!(x) as the trigger > >>of the t(d/w)nei instruction instead of using directly the > >>value of x. > >> > >>This leads to GCC adding unnecessary pair of addic/subfe. > > > >And it has to, it is passed as an "r" to an asm, GCC has to put the "!!" > >value into a register. > > > >>By using (x) instead of !!(x) like BUG_ON() does, the additional > >>instructions go away: > > > >But is it correct? What happens if you pass an int to WARN_ON, on a > >64-bit kernel? > > On a 64-bit kernel, an int is still in a 64-bit register, so there would > be no problem with tdnei, would it ? an int 0 is the same as an long 0, > right ? The top 32 bits of a 64-bit register holding an int are undefined. Take as example int x = 42; x = ~x; which may put ffff_ffff_ffff_ffd5 into the reg, not 0000_0000_ffff_ffd5 as you might expect or want. For tw instructions this makes no difference (they only look at the low 32 bits anyway); for td insns, it does. > It is on 32-bit kernel that I see a problem, if one passes a long long > to WARN_ON(), the forced cast to long will just drop the upper size of > it. So as of today, BUG_ON() is buggy for that. Sure, it isn't defined what types you can pass to that macro. Another thing that makes inline functions much saner to use. > >(You might want to have 64-bit generate either tw or td. But, with > >your __builtin_trap patch, all that will be automatic). > > Yes I'll discard this patch and focus on the __builtin_trap() one which > should solve most issues. But see my comment there about the compiler knowing all code after an unconditional trap is dead. Segher
diff --git a/arch/powerpc/include/asm/bug.h b/arch/powerpc/include/asm/bug.h index fed7e6241349..77074582fe65 100644 --- a/arch/powerpc/include/asm/bug.h +++ b/arch/powerpc/include/asm/bug.h @@ -107,7 +107,7 @@ : : "i" (__FILE__), "i" (__LINE__), \ "i" (BUGFLAG_WARNING|BUGFLAG_TAINT(TAINT_WARN)),\ "i" (sizeof(struct bug_entry)), \ - "r" (__ret_warn_on)); \ + "r" ((__force long)(x))); \ } \ unlikely(__ret_warn_on); \ })
Unlike BUG_ON(x), WARN_ON(x) uses !!(x) as the trigger of the t(d/w)nei instruction instead of using directly the value of x. This leads to GCC adding unnecessary pair of addic/subfe. This was revealed after adding a WARN_ON() on top of clear_page() in order to detect misaligned destination: @@ -49,6 +51,8 @@ static inline void clear_page(void *addr) { unsigned int i; + WARN_ON((unsigned long)addr & (L1_CACHE_BYTES - 1)); + for (i = 0; i < PAGE_SIZE / L1_CACHE_BYTES; i++, addr += L1_CACHE_BYTES) dcbz(addr); } This resulted on: 0000019c <clear_user_page>: 19c: 54 68 06 fe clrlwi r8,r3,27 1a0: 31 48 ff ff addic r10,r8,-1 1a4: 7d 4a 41 10 subfe r10,r10,r8 1a8: 0f 0a 00 00 twnei r10,0 1ac: 39 20 00 80 li r9,128 1b0: 7d 29 03 a6 mtctr r9 1b4: 7c 00 1f ec dcbz 0,r3 1b8: 38 63 00 20 addi r3,r3,32 1bc: 42 00 ff f8 bdnz 1b4 <clear_user_page+0x18> 1c0: 7c a3 2b 78 mr r3,r5 1c4: 48 00 00 00 b 1c4 <clear_user_page+0x28> 1c4: R_PPC_REL24 flush_dcache_page By using (x) instead of !!(x) like BUG_ON() does, the additional instructions go away: 0000019c <clear_user_page>: 19c: 54 6a 06 fe clrlwi r10,r3,27 1a0: 0f 0a 00 00 twnei r10,0 1a4: 39 20 00 80 li r9,128 1a8: 7d 29 03 a6 mtctr r9 1ac: 7c 00 1f ec dcbz 0,r3 1b0: 38 63 00 20 addi r3,r3,32 1b4: 42 00 ff f8 bdnz 1ac <clear_user_page+0x10> 1b8: 7c a3 2b 78 mr r3,r5 1bc: 48 00 00 00 b 1bc <clear_user_page+0x20> 1bc: R_PPC_REL24 flush_dcache_page Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> --- arch/powerpc/include/asm/bug.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)