Message ID | 20180824131734.18557-1-rka@sysgo.com |
---|---|
State | New |
Series | tcg/i386: fix vector operations on 32-bit hosts |
ping http://patchwork.ozlabs.org/patch/961849/

On 08/24/2018 03:17 PM, Roman Kapl wrote:
> The TCG backend uses LOWREGMASK to get the low 7 bits of register numbers. This
> was defined as no-op for 32-bit x86, with the assumption that we have eight
> registers anyway. This assumption is not true once we have xmm regs.
>
> Since LOWREGMASK was a no-op, xmm register indidices were wrong in opcodes and
> have overflown into other opcode fields, wreaking havoc.
>
> To trigger these problems, you can try running the "movi d8, #0x0" AArch64
> instruction on 32-bit x86. "vpxor %xmm0, %xmm0, %xmm0" should be generated,
> but instead TCG generated "vpxor %xmm0, %xmm0, %xmm2".
>
> Fixes: 770c2fc7bb ("Add vector operations")
> Signed-off-by: Roman Kapl <rka@sysgo.com>
> ---
>
> Note: It could also be possible to add a dedicated VEC_LOWREGMASK, but I don't
> think it is better or signigicantly faster.
>
>  tcg/i386/tcg-target.inc.c | 4 ----
>  1 file changed, 4 deletions(-)
>
> diff --git a/tcg/i386/tcg-target.inc.c b/tcg/i386/tcg-target.inc.c
> index a91e4f1313..436195894b 100644
> --- a/tcg/i386/tcg-target.inc.c
> +++ b/tcg/i386/tcg-target.inc.c
> @@ -302,11 +302,7 @@ static inline int tcg_target_const_match(tcg_target_long val, TCGType type,
>      return 0;
>  }
>
> -#if TCG_TARGET_REG_BITS == 64
>  # define LOWREGMASK(x) ((x) & 7)
> -#else
> -# define LOWREGMASK(x) (x)
> -#endif
>
>  #define P_EXT   0x100  /* 0x0f opcode prefix */
>  #define P_EXT38 0x200  /* 0x0f 0x38 opcode prefix */
On 8/24/18 3:17 PM, Roman Kapl wrote:
> The TCG backend uses LOWREGMASK to get the low 7 bits of register numbers. This

7 = 0b111: the low 3 bits?

> was defined as no-op for 32-bit x86, with the assumption that we have eight
> registers anyway. This assumption is not true once we have xmm regs.
>
> Since LOWREGMASK was a no-op, xmm register indidices were wrong in opcodes and
> have overflown into other opcode fields, wreaking havoc.
>
> To trigger these problems, you can try running the "movi d8, #0x0" AArch64
> instruction on 32-bit x86. "vpxor %xmm0, %xmm0, %xmm0" should be generated,
> but instead TCG generated "vpxor %xmm0, %xmm0, %xmm2".
>
> Fixes: 770c2fc7bb ("Add vector operations")
> Signed-off-by: Roman Kapl <rka@sysgo.com>
> ---
>
> Note: It could also be possible to add a dedicated VEC_LOWREGMASK, but I don't
> think it is better or signigicantly faster.
>
>  tcg/i386/tcg-target.inc.c | 4 ----
>  1 file changed, 4 deletions(-)
>
> diff --git a/tcg/i386/tcg-target.inc.c b/tcg/i386/tcg-target.inc.c
> index a91e4f1313..436195894b 100644
> --- a/tcg/i386/tcg-target.inc.c
> +++ b/tcg/i386/tcg-target.inc.c
> @@ -302,11 +302,7 @@ static inline int tcg_target_const_match(tcg_target_long val, TCGType type,
>      return 0;
>  }
>
> -#if TCG_TARGET_REG_BITS == 64
>  # define LOWREGMASK(x) ((x) & 7)
> -#else
> -# define LOWREGMASK(x) (x)
> -#endif
>
>  #define P_EXT   0x100  /* 0x0f opcode prefix */
>  #define P_EXT38 0x200  /* 0x0f 0x38 opcode prefix */
Hi,

On 09/20/2018 02:19 PM, Philippe Mathieu-Daudé wrote:
> On 8/24/18 3:17 PM, Roman Kapl wrote:
>> The TCG backend uses LOWREGMASK to get the low 7 bits of register numbers. This
>
> 7 = 0b111: the low 3 bits?

Yes, low 3 bits, 8 registers, the commit message is wrong.

Thanks, Roman Kapl
On 8/24/18 6:17 AM, Roman Kapl wrote:
> The TCG backend uses LOWREGMASK to get the low 7 bits of register numbers. This
> was defined as no-op for 32-bit x86, with the assumption that we have eight
> registers anyway. This assumption is not true once we have xmm regs.
>
> Since LOWREGMASK was a no-op, xmm register indidices were wrong in opcodes and
> have overflown into other opcode fields, wreaking havoc.
>
> To trigger these problems, you can try running the "movi d8, #0x0" AArch64
> instruction on 32-bit x86. "vpxor %xmm0, %xmm0, %xmm0" should be generated,
> but instead TCG generated "vpxor %xmm0, %xmm0, %xmm2".
>
> Fixes: 770c2fc7bb ("Add vector operations")
> Signed-off-by: Roman Kapl <rka@sysgo.com>
> ---
>
> Note: It could also be possible to add a dedicated VEC_LOWREGMASK, but I don't
> think it is better or signigicantly faster.

Agreed, this does seem to be the clearest solution. I've queued the patch with the "7" typo in the commit message fixed.

r~
diff --git a/tcg/i386/tcg-target.inc.c b/tcg/i386/tcg-target.inc.c
index a91e4f1313..436195894b 100644
--- a/tcg/i386/tcg-target.inc.c
+++ b/tcg/i386/tcg-target.inc.c
@@ -302,11 +302,7 @@ static inline int tcg_target_const_match(tcg_target_long val, TCGType type,
     return 0;
 }
 
-#if TCG_TARGET_REG_BITS == 64
 # define LOWREGMASK(x) ((x) & 7)
-#else
-# define LOWREGMASK(x) (x)
-#endif
 
 #define P_EXT   0x100  /* 0x0f opcode prefix */
 #define P_EXT38 0x200  /* 0x0f 0x38 opcode prefix */
The TCG backend uses LOWREGMASK to get the low 7 bits of register numbers. This
was defined as no-op for 32-bit x86, with the assumption that we have eight
registers anyway. This assumption is not true once we have xmm regs.

Since LOWREGMASK was a no-op, xmm register indices were wrong in opcodes and
have overflown into other opcode fields, wreaking havoc.

To trigger these problems, you can try running the "movi d8, #0x0" AArch64
instruction on 32-bit x86. "vpxor %xmm0, %xmm0, %xmm0" should be generated,
but instead TCG generated "vpxor %xmm0, %xmm0, %xmm2".

Fixes: 770c2fc7bb ("Add vector operations")
Signed-off-by: Roman Kapl <rka@sysgo.com>
---

Note: It could also be possible to add a dedicated VEC_LOWREGMASK, but I don't
think it is better or significantly faster.

 tcg/i386/tcg-target.inc.c | 4 ----
 1 file changed, 4 deletions(-)