diff mbox series

powerpc: slightly improve cache helpers

Message ID 0b460a85319fb89dab2c5d1200ac69a3e1b7c1ef.1557235807.git.christophe.leroy@c-s.fr (mailing list archive)
State Superseded
Headers show
Series powerpc: slightly improve cache helpers | expand

Commit Message

Christophe Leroy May 7, 2019, 1:31 p.m. UTC
Cache instructions (dcbz, dcbi, dcbf and dcbst) take two registers
that are summed to obtain the target address. Using '%y0' argument
gives GCC the opportunity to use both registers instead of only one
with the second being forced to 0.

Suggested-by: Segher Boessenkool <segher@kernel.crashing.org>
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
---
 arch/powerpc/include/asm/cache.h | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

Comments

Segher Boessenkool May 7, 2019, 3:10 p.m. UTC | #1
Hi Christophe,

On Tue, May 07, 2019 at 01:31:39PM +0000, Christophe Leroy wrote:
> Cache instructions (dcbz, dcbi, dcbf and dcbst) take two registers
> that are summed to obtain the target address. Using '%y0' argument
> gives GCC the opportunity to use both registers instead of only one
> with the second being forced to 0.

That's not quite right.  Sorry if I didn't explain it properly.

"m" allows all memory.  But this instruction only allows reg,reg and
0,reg addressing.  For that you need to use constraint "Z".

The output modifier "%y0" just makes [reg] (i.e. simple indirect addressing)
print as "0,reg" instead of "0(reg)" as it would by default (for just "%0").


Segher
Christophe Leroy May 7, 2019, 4:53 p.m. UTC | #2
Le 07/05/2019 à 17:10, Segher Boessenkool a écrit :
> Hi Christophe,
> 
> On Tue, May 07, 2019 at 01:31:39PM +0000, Christophe Leroy wrote:
>> Cache instructions (dcbz, dcbi, dcbf and dcbst) take two registers
>> that are summed to obtain the target address. Using '%y0' argument
>> gives GCC the opportunity to use both registers instead of only one
>> with the second being forced to 0.
> 
> That's not quite right.  Sorry if I didn't explain it properly.
> 
> "m" allows all memory.  But this instruction only allows reg,reg and
> 0,reg addressing.  For that you need to use constraint "Z".

But gcc help 
(https://gcc.gnu.org/onlinedocs/gcc/Machine-Constraints.html#Machine-Constraints) 
says it is better to use 'm':

Z

     Memory operand that is an indexed or indirect from a register (it 
is usually better to use ‘m’ or ‘es’ in asm statements)

That's the reason why I used 'm', I thought it was equivalent.

Christophe

> 
> The output modifier "%y0" just makes [reg] (i.e. simple indirect addressing)
> print as "0,reg" instead of "0(reg)" as it would by default (for just "%0").
> 
> 
> Segher
>
Segher Boessenkool May 8, 2019, 2:40 p.m. UTC | #3
On Tue, May 07, 2019 at 06:53:30PM +0200, Christophe Leroy wrote:
> Le 07/05/2019 à 17:10, Segher Boessenkool a écrit :
> >On Tue, May 07, 2019 at 01:31:39PM +0000, Christophe Leroy wrote:
> >>Cache instructions (dcbz, dcbi, dcbf and dcbst) take two registers
> >>that are summed to obtain the target address. Using '%y0' argument
> >>gives GCC the opportunity to use both registers instead of only one
> >>with the second being forced to 0.
> >
> >That's not quite right.  Sorry if I didn't explain it properly.
> >
> >"m" allows all memory.  But this instruction only allows reg,reg and
> >0,reg addressing.  For that you need to use constraint "Z".
> 
> But gcc help 
> (https://gcc.gnu.org/onlinedocs/gcc/Machine-Constraints.html#Machine-Constraints) 
> says it is better to use 'm':

It says it *usually* is better to use "m".  What it really should say is
it is better to use "m" _when that is valid_.  It is not valid for the
cache block instructions.

I'll fix up the comment...  "es" is ancient, too, nowadays it is
equivalent to just "m" (and you need "m<>" to allow pre-modify addressing).

> Z
> 
>     Memory operand that is an indexed or indirect from a register (it 
> is usually better to use ‘m’ or ‘es’ in asm statements)
> 
> That's the reason why I used 'm', I thought it was equivalent.

Yeah, the manual text could be clearer.


Segher
diff mbox series

Patch

diff --git a/arch/powerpc/include/asm/cache.h b/arch/powerpc/include/asm/cache.h
index 40ea5b3781c6..5a22a869a20b 100644
--- a/arch/powerpc/include/asm/cache.h
+++ b/arch/powerpc/include/asm/cache.h
@@ -85,22 +85,22 @@  extern void _set_L3CR(unsigned long);
 
 static inline void dcbz(void *addr)
 {
-	__asm__ __volatile__ ("dcbz 0, %0" : : "r"(addr) : "memory");
+	__asm__ __volatile__ ("dcbz %y0" : : "m"(*(u8 *)addr) : "memory");
 }
 
 static inline void dcbi(void *addr)
 {
-	__asm__ __volatile__ ("dcbi 0, %0" : : "r"(addr) : "memory");
+	__asm__ __volatile__ ("dcbi %y0" : : "m"(*(u8 *)addr) : "memory");
 }
 
 static inline void dcbf(void *addr)
 {
-	__asm__ __volatile__ ("dcbf 0, %0" : : "r"(addr) : "memory");
+	__asm__ __volatile__ ("dcbf %y0" : : "m"(*(u8 *)addr) : "memory");
 }
 
 static inline void dcbst(void *addr)
 {
-	__asm__ __volatile__ ("dcbst 0, %0" : : "r"(addr) : "memory");
+	__asm__ __volatile__ ("dcbst %y0" : : "m"(*(u8 *)addr) : "memory");
 }
 #endif /* !__ASSEMBLY__ */
 #endif /* __KERNEL__ */