Message ID | 0b460a85319fb89dab2c5d1200ac69a3e1b7c1ef.1557235807.git.christophe.leroy@c-s.fr (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | powerpc: slightly improve cache helpers | expand |
Hi Christophe, On Tue, May 07, 2019 at 01:31:39PM +0000, Christophe Leroy wrote: > Cache instructions (dcbz, dcbi, dcbf and dcbst) take two registers > that are summed to obtain the target address. Using '%y0' argument > gives GCC the opportunity to use both registers instead of only one > with the second being forced to 0. That's not quite right. Sorry if I didn't explain it properly. "m" allows all memory. But this instruction only allows reg,reg and 0,reg addressing. For that you need to use constraint "Z". The output modifier "%y0" just makes [reg] (i.e. simple indirect addressing) print as "0,reg" instead of "0(reg)" as it would by default (for just "%0"). Segher
Le 07/05/2019 à 17:10, Segher Boessenkool a écrit : > Hi Christophe, > > On Tue, May 07, 2019 at 01:31:39PM +0000, Christophe Leroy wrote: >> Cache instructions (dcbz, dcbi, dcbf and dcbst) take two registers >> that are summed to obtain the target address. Using '%y0' argument >> gives GCC the opportunity to use both registers instead of only one >> with the second being forced to 0. > > That's not quite right. Sorry if I didn't explain it properly. > > "m" allows all memory. But this instruction only allows reg,reg and > 0,reg addressing. For that you need to use constraint "Z". But gcc help (https://gcc.gnu.org/onlinedocs/gcc/Machine-Constraints.html#Machine-Constraints) says it is better to use 'm': Z Memory operand that is an indexed or indirect from a register (it is usually better to use ‘m’ or ‘es’ in asm statements) That's the reason why I used 'm', I thought it was equivalent. Christophe > > The output modifier "%y0" just makes [reg] (i.e. simple indirect addressing) > print as "0,reg" instead of "0(reg)" as it would by default (for just "%0"). > > > Segher >
On Tue, May 07, 2019 at 06:53:30PM +0200, Christophe Leroy wrote: > Le 07/05/2019 à 17:10, Segher Boessenkool a écrit : > >On Tue, May 07, 2019 at 01:31:39PM +0000, Christophe Leroy wrote: > >>Cache instructions (dcbz, dcbi, dcbf and dcbst) take two registers > >>that are summed to obtain the target address. Using '%y0' argument > >>gives GCC the opportunity to use both registers instead of only one > >>with the second being forced to 0. > > > >That's not quite right. Sorry if I didn't explain it properly. > > > >"m" allows all memory. But this instruction only allows reg,reg and > >0,reg addressing. For that you need to use constraint "Z". > > But gcc help > (https://gcc.gnu.org/onlinedocs/gcc/Machine-Constraints.html#Machine-Constraints) > says it is better to use 'm': It says it *usually* is better to use "m". What it really should say is it is better to use "m" _when that is valid_. It is not valid for the cache block instructions. I'll fix up the comment... "es" is ancient, too, nowadays it is equivalent to just "m" (and you need "m<>" to allow pre-modify addressing). > Z > > Memory operand that is an indexed or indirect from a register (it > is usually better to use ‘m’ or ‘es’ in asm statements) > > That's the reason why I used 'm', I thought it was equivalent. Yeah, the manual text could be clearer. Segher
diff --git a/arch/powerpc/include/asm/cache.h b/arch/powerpc/include/asm/cache.h index 40ea5b3781c6..5a22a869a20b 100644 --- a/arch/powerpc/include/asm/cache.h +++ b/arch/powerpc/include/asm/cache.h @@ -85,22 +85,22 @@ extern void _set_L3CR(unsigned long); static inline void dcbz(void *addr) { - __asm__ __volatile__ ("dcbz 0, %0" : : "r"(addr) : "memory"); + __asm__ __volatile__ ("dcbz %y0" : : "m"(*(u8 *)addr) : "memory"); } static inline void dcbi(void *addr) { - __asm__ __volatile__ ("dcbi 0, %0" : : "r"(addr) : "memory"); + __asm__ __volatile__ ("dcbi %y0" : : "m"(*(u8 *)addr) : "memory"); } static inline void dcbf(void *addr) { - __asm__ __volatile__ ("dcbf 0, %0" : : "r"(addr) : "memory"); + __asm__ __volatile__ ("dcbf %y0" : : "m"(*(u8 *)addr) : "memory"); } static inline void dcbst(void *addr) { - __asm__ __volatile__ ("dcbst 0, %0" : : "r"(addr) : "memory"); + __asm__ __volatile__ ("dcbst %y0" : : "m"(*(u8 *)addr) : "memory"); } #endif /* !__ASSEMBLY__ */ #endif /* __KERNEL__ */
Cache instructions (dcbz, dcbi, dcbf and dcbst) take two registers that are summed to obtain the target address. Using '%y0' argument gives GCC the opportunity to use both registers instead of only one with the second being forced to 0. Suggested-by: Segher Boessenkool <segher@kernel.crashing.org> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> --- arch/powerpc/include/asm/cache.h | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)