Message ID | 20220829142326.39562-2-ricky@rzhou.org
---|---
State | New
Series | target/i386: Raise #GP on unaligned m128 accesses when required.
On 8/29/22 07:23, Ricky Zhou wrote:
> Many instructions which load/store 128-bit values are supposed to
> raise #GP when the memory operand isn't 16-byte aligned. This includes:
> - Instructions explicitly requiring memory alignment (Exceptions Type 1
>   in the "AVX and SSE Instruction Exception Specification" section of
>   the SDM)
> - Legacy SSE instructions that load/store 128-bit values (Exceptions
>   Types 2 and 4).
>
> This change adds a raise_gp_if_unaligned helper which raises #GP if an
> address is not properly aligned. This helper is called before 128-bit
> loads/stores where appropriate.
>
> Resolves: https://gitlab.com/qemu-project/qemu/-/issues/217
> Signed-off-by: Ricky Zhou <ricky@rzhou.org>
> ---
>  target/i386/helper.h         |  1 +
>  target/i386/tcg/mem_helper.c |  8 ++++++++
>  target/i386/tcg/translate.c  | 38 +++++++++++++++++++++++++++++++++---
>  3 files changed, 44 insertions(+), 3 deletions(-)

This trap should be raised via the memory operation:

-static inline void gen_ldo_env_A0(DisasContext *s, int offset)
+static inline void gen_ldo_env_A0(DisasContext *s, int offset, bool aligned)
 {
     int mem_index = s->mem_index;
-    tcg_gen_qemu_ld_i64(s->tmp1_i64, s->A0, mem_index, MO_LEUQ);
+    tcg_gen_qemu_ld_i64(s->tmp1_i64, s->A0, mem_index,
+                        MO_LEUQ | (aligned ? MO_ALIGN_16 : 0));
     tcg_gen_st_i64(s->tmp1_i64, cpu_env, offset + offsetof(ZMMReg, ZMM_Q(0)));
     tcg_gen_addi_tl(s->tmp0, s->A0, 8);
     tcg_gen_qemu_ld_i64(s->tmp1_i64, s->tmp0, mem_index, MO_LEUQ);
     tcg_gen_st_i64(s->tmp1_i64, cpu_env, offset + offsetof(ZMMReg, ZMM_Q(1)));
 }

Only the first of the two loads/stores must be aligned, as the other is
known to be +8.

You then must fill in the x86_tcg_ops.do_unaligned_access hook to raise #GP.

r~
On Mon, Aug 29, 2022 at 9:45 AM Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> On 8/29/22 07:23, Ricky Zhou wrote:
> This trap should be raised via the memory operation:
> ...
> Only the first of the two loads/stores must be aligned, as the other is
> known to be +8.
> You then must fill in the x86_tcg_ops.do_unaligned_access hook to raise #GP.

Thanks for taking a look at this - did you see the bit in the cover
letter where I discuss doing this via alignment requirements on the
memory operation? My logic was that the memop alignment checks seem to
be more oriented towards triggering #AC exceptions (even though this is
not currently implemented), since qemu-user's unaligned access handlers
(helper_unaligned_{ld,st}) already trigger SIGBUS as opposed to SIGSEGV.
I was concerned that implementing this via MO_ALIGN_16 would get in the
way of a hypothetical future implementation of the AC flag, since
do_unaligned_access would need to raise #AC instead of #GP for that.

One slightly more involved way to use alignment on the MemOp could be to
arrange to pass the problematic MemOp to do_unaligned_access and
helper_unaligned_{ld,st}. Then we could allow CPUs to handle
misalignment of different MemOps differently (e.g. raise #GP/SIGSEGV for
certain ops and #AC/SIGBUS for others). For this change to x86, we could
maybe get away with making MO_ALIGN_16 and above trigger #GP/SIGSEGV and
everything else trigger #AC/SIGBUS. If that's a little hacky, we could
instead add some dedicated bits to MemOp that distinguish different
types of unaligned accesses.

What do you think? Happy to implement whichever approach is preferred!

Thanks,
Ricky
On 8/29/22 13:46, Ricky Zhou wrote:
> Thanks for taking a look at this - did you see the bit in the cover
> letter where I discuss doing this via alignment requirements on the
> memory operation? My logic was that the memop alignment checks seem to
> be more oriented towards triggering #AC exceptions (even though this is
> not currently implemented),

I missed that in the cover. However... implementing #AC is pretty
hypothetical. It's not something that I've ever seen used, and not
something that anyone has asked for.

> One slightly more involved way to use alignment on the MemOp could be to
> arrange to pass the problematic MemOp to do_unaligned_access and
> helper_unaligned_{ld,st}. Then we could allow CPUs to handle
> misalignment of different MemOps differently (e.g. raise #GP/SIGSEGV for
> certain ops and #AC/SIGBUS for others). For this change to x86, we could
> maybe get away with making MO_ALIGN_16 and above trigger #GP/SIGSEGV and
> everything else trigger #AC/SIGBUS. If that's a little hacky, we could
> instead add some dedicated bits to MemOp that distinguish different
> types of unaligned accesses.

There's another related problem that actually has gotten a bug report in
the past: when the form of the address should raise #SS instead of #GP
in system mode. My initial thought was to record information about "the"
memory access in the per-insn unwind info, until I realized that there
are insns with multiple memory operations requiring different treatment,
e.g. "push (%rax)", where the read might raise #GP and the write might
raise #SS. So I think we'd need to encode #GP vs #SS into the mmu_idx
used (e.g. in the lsb).

However, I don't think there are any similar situations of multiple
memory types affecting SSE, so #AC vs #GP could in fact be encoded into
the per-insn unwind info.

As for SIGBUS vs SIGSEGV for SSE and user-only, you need only implement
the x86_cpu_ops.record_sigbus hook. Cf. the s390x version, which raises
PGM_SPECIFICATION -> SIGILL for unaligned atomic operations.

r~
diff --git a/target/i386/helper.h b/target/i386/helper.h
index ac3b4d1ee3..17d78f2b0d 100644
--- a/target/i386/helper.h
+++ b/target/i386/helper.h
@@ -213,6 +213,7 @@ DEF_HELPER_1(update_mxcsr, void, env)
 DEF_HELPER_1(enter_mmx, void, env)
 DEF_HELPER_1(emms, void, env)
 DEF_HELPER_3(movq, void, env, ptr, ptr)
+DEF_HELPER_3(raise_gp_if_unaligned, void, env, tl, tl)
 
 #define SHIFT 0
 #include "ops_sse_header.h"
diff --git a/target/i386/tcg/mem_helper.c b/target/i386/tcg/mem_helper.c
index e3cdafd2d4..79259abef3 100644
--- a/target/i386/tcg/mem_helper.c
+++ b/target/i386/tcg/mem_helper.c
@@ -181,3 +181,11 @@ void helper_boundl(CPUX86State *env, target_ulong a0, int v)
         raise_exception_ra(env, EXCP05_BOUND, GETPC());
     }
 }
+
+void helper_raise_gp_if_unaligned(CPUX86State *env, target_ulong addr,
+                                  target_ulong align_mask)
+{
+    if (unlikely((addr & align_mask) != 0)) {
+        raise_exception_ra(env, EXCP0D_GPF, GETPC());
+    }
+}
diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index b7972f0ff5..de13f483b6 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -3054,7 +3054,7 @@ static const struct SSEOpHelper_epp sse_op_table6[256] = {
     [0x25] = SSE41_OP(pmovsxdq),
     [0x28] = SSE41_OP(pmuldq),
     [0x29] = SSE41_OP(pcmpeqq),
-    [0x2a] = SSE41_SPECIAL, /* movntqda */
+    [0x2a] = SSE41_SPECIAL, /* movntdqa */
     [0x2b] = SSE41_OP(packusdw),
     [0x30] = SSE41_OP(pmovzxbw),
     [0x31] = SSE41_OP(pmovzxbd),
@@ -3194,10 +3194,11 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
         break;
     case 0x1e7: /* movntdq */
     case 0x02b: /* movntps */
-    case 0x12b: /* movntpd */
+    case 0x12b: /* movntpd */
         if (mod == 3)
             goto illegal_op;
         gen_lea_modrm(env, s, modrm);
+        gen_helper_raise_gp_if_unaligned(cpu_env, s->A0, tcg_const_tl(0xf));
         gen_sto_env_A0(s, offsetof(CPUX86State, xmm_regs[reg]));
         break;
     case 0x3f0: /* lddqu */
@@ -3273,6 +3274,11 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
     case 0x26f: /* movdqu xmm, ea */
         if (mod != 3) {
             gen_lea_modrm(env, s, modrm);
+            /* movaps, movapd, movdqa */
+            if (b == 0x028 || b == 0x128 || b == 0x16f) {
+                gen_helper_raise_gp_if_unaligned(cpu_env, s->A0,
+                                                 tcg_const_tl(0xf));
+            }
             gen_ldo_env_A0(s, offsetof(CPUX86State, xmm_regs[reg]));
         } else {
             rm = (modrm & 7) | REX_B(s);
@@ -3331,6 +3337,10 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
     case 0x212: /* movsldup */
         if (mod != 3) {
             gen_lea_modrm(env, s, modrm);
+            if (!(s->prefix & PREFIX_VEX)) {
+                gen_helper_raise_gp_if_unaligned(cpu_env, s->A0,
+                                                 tcg_const_tl(0xf));
+            }
             gen_ldo_env_A0(s, offsetof(CPUX86State, xmm_regs[reg]));
         } else {
             rm = (modrm & 7) | REX_B(s);
@@ -3373,6 +3383,10 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
     case 0x216: /* movshdup */
         if (mod != 3) {
             gen_lea_modrm(env, s, modrm);
+            if (!(s->prefix & PREFIX_VEX)) {
+                gen_helper_raise_gp_if_unaligned(cpu_env, s->A0,
+                                                 tcg_const_tl(0xf));
+            }
             gen_ldo_env_A0(s, offsetof(CPUX86State, xmm_regs[reg]));
         } else {
             rm = (modrm & 7) | REX_B(s);
@@ -3465,6 +3479,10 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
     case 0x27f: /* movdqu ea, xmm */
         if (mod != 3) {
             gen_lea_modrm(env, s, modrm);
+            if (b == 0x029 || b == 0x129 || b == 0x17f) {
+                gen_helper_raise_gp_if_unaligned(cpu_env, s->A0,
+                                                 tcg_const_tl(0xf));
+            }
             gen_sto_env_A0(s, offsetof(CPUX86State, xmm_regs[reg]));
         } else {
             rm = (modrm & 7) | REX_B(s);
@@ -3806,10 +3824,16 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                 tcg_gen_st16_tl(s->tmp0, cpu_env, op2_offset +
                                 offsetof(ZMMReg, ZMM_W(0)));
                 break;
-            case 0x2a: /* movntqda */
+            case 0x2a: /* movntdqa */
+                gen_helper_raise_gp_if_unaligned(cpu_env, s->A0,
+                                                 tcg_const_tl(0xf));
                 gen_ldo_env_A0(s, op1_offset);
                 return;
             default:
+                if (!(s->prefix & PREFIX_VEX)) {
+                    gen_helper_raise_gp_if_unaligned(cpu_env, s->A0,
+                                                     tcg_const_tl(0xf));
+                }
                 gen_ldo_env_A0(s, op2_offset);
             }
         }
@@ -4351,6 +4375,10 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
             } else {
                 op2_offset = offsetof(CPUX86State, xmm_t0);
                 gen_lea_modrm(env, s, modrm);
+                if (!(s->prefix & PREFIX_VEX)) {
+                    gen_helper_raise_gp_if_unaligned(cpu_env, s->A0,
+                                                     tcg_const_tl(0xf));
+                }
                 gen_ldo_env_A0(s, op2_offset);
             }
         } else {
@@ -4469,6 +4497,10 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
             break;
         default:
             /* 128 bit access */
+            if (!(s->prefix & PREFIX_VEX)) {
+                gen_helper_raise_gp_if_unaligned(cpu_env, s->A0,
+                                                 tcg_const_tl(0xf));
+            }
             gen_ldo_env_A0(s, op2_offset);
             break;
         }
Many instructions which load/store 128-bit values are supposed to
raise #GP when the memory operand isn't 16-byte aligned. This includes:
- Instructions explicitly requiring memory alignment (Exceptions Type 1
  in the "AVX and SSE Instruction Exception Specification" section of
  the SDM)
- Legacy SSE instructions that load/store 128-bit values (Exceptions
  Types 2 and 4).

This change adds a raise_gp_if_unaligned helper which raises #GP if an
address is not properly aligned. This helper is called before 128-bit
loads/stores where appropriate.

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/217
Signed-off-by: Ricky Zhou <ricky@rzhou.org>
---
 target/i386/helper.h         |  1 +
 target/i386/tcg/mem_helper.c |  8 ++++++++
 target/i386/tcg/translate.c  | 38 +++++++++++++++++++++++++++++++++---
 3 files changed, 44 insertions(+), 3 deletions(-)