
[1/1] target/i386: Raise #GP on unaligned m128 accesses when required.

Message ID 20220829142326.39562-2-ricky@rzhou.org
State New
Series target/i386: Raise #GP on unaligned m128 accesses when required.

Commit Message

Ricky Zhou Aug. 29, 2022, 2:23 p.m. UTC
Many instructions which load/store 128-bit values are supposed to
raise #GP when the memory operand isn't 16-byte aligned. This includes:
 - Instructions explicitly requiring memory alignment (Exceptions Type 1
   in the "AVX and SSE Instruction Exception Specification" section of
   the SDM)
 - Legacy SSE instructions that load/store 128-bit values (Exceptions
   Types 2 and 4).

This change adds a raise_gp_if_unaligned helper which raises #GP if an
address is not properly aligned. This helper is called before 128-bit
loads/stores where appropriate.

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/217
Signed-off-by: Ricky Zhou <ricky@rzhou.org>
---
 target/i386/helper.h         |  1 +
 target/i386/tcg/mem_helper.c |  8 ++++++++
 target/i386/tcg/translate.c  | 38 +++++++++++++++++++++++++++++++++---
 3 files changed, 44 insertions(+), 3 deletions(-)

Comments

Richard Henderson Aug. 29, 2022, 4:45 p.m. UTC | #1
On 8/29/22 07:23, Ricky Zhou wrote:
> Many instructions which load/store 128-bit values are supposed to
> raise #GP when the memory operand isn't 16-byte aligned. This includes:
>   - Instructions explicitly requiring memory alignment (Exceptions Type 1
>     in the "AVX and SSE Instruction Exception Specification" section of
>     the SDM)
>   - Legacy SSE instructions that load/store 128-bit values (Exceptions
>     Types 2 and 4).
> 
> This change adds a raise_gp_if_unaligned helper which raises #GP if an
> address is not properly aligned. This helper is called before 128-bit
> loads/stores where appropriate.
> 
> Resolves: https://gitlab.com/qemu-project/qemu/-/issues/217
> Signed-off-by: Ricky Zhou <ricky@rzhou.org>
> ---
>   target/i386/helper.h         |  1 +
>   target/i386/tcg/mem_helper.c |  8 ++++++++
>   target/i386/tcg/translate.c  | 38 +++++++++++++++++++++++++++++++++---
>   3 files changed, 44 insertions(+), 3 deletions(-)


This trap should be raised via the memory operation:

- static inline void gen_ldo_env_A0(DisasContext *s, int offset)
+ static inline void gen_ldo_env_A0(DisasContext *s, int offset, bool aligned)
  {
      int mem_index = s->mem_index;
-     tcg_gen_qemu_ld_i64(s->tmp1_i64, s->A0, mem_index, MO_LEUQ);
+     tcg_gen_qemu_ld_i64(s->tmp1_i64, s->A0, mem_index,
+                         MO_LEUQ | (aligned ? MO_ALIGN_16 : 0));
      tcg_gen_st_i64(s->tmp1_i64, cpu_env, offset + offsetof(ZMMReg, ZMM_Q(0)));
      tcg_gen_addi_tl(s->tmp0, s->A0, 8);
      tcg_gen_qemu_ld_i64(s->tmp1_i64, s->tmp0, mem_index, MO_LEUQ);
      tcg_gen_st_i64(s->tmp1_i64, cpu_env, offset + offsetof(ZMMReg, ZMM_Q(1)));
  }


Only the first of the two loads/stores must be aligned, as the other is known to be +8. 
You then must fill in the x86_tcg_ops.do_unaligned_access hook to raise #GP.
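
For reference, a minimal sketch of such a hook, assuming the current
TCGCPUOps::do_unaligned_access signature, the existing raise_exception_err_ra
helper and the usual target/i386 headers (illustrative only, not part of this
patch):

static void x86_cpu_do_unaligned_access(CPUState *cs, vaddr addr,
                                        MMUAccessType access_type,
                                        int mmu_idx, uintptr_t retaddr)
{
    X86CPU *cpu = X86_CPU(cs);

    /* SSE alignment faults deliver #GP with an error code of 0. */
    raise_exception_err_ra(&cpu->env, EXCP0D_GPF, 0, retaddr);
}

The function would then be installed as the .do_unaligned_access member of
x86_tcg_ops.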


r~
Ricky Zhou Aug. 29, 2022, 8:46 p.m. UTC | #2
On Mon, Aug 29, 2022 at 9:45 AM Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> On 8/29/22 07:23, Ricky Zhou wrote:
> This trap should be raised via the memory operation:
> ...
> Only the first of the two loads/stores must be aligned, as the other is known to be +8.
> You then must fill in the x86_tcg_ops.do_unaligned_access hook to raise #GP.
Thanks for taking a look at this - did you see the bit in the cover
letter where I discuss doing this via alignment requirements on the
memory operation? My logic was that the memop alignment checks seem to
be more oriented towards triggering #AC exceptions (even though this is
not currently implemented), since qemu-user's unaligned access handlers
(helper_unaligned_{ld,st}) already trigger SIGBUS as opposed to SIGSEGV.
I was concerned that implementing this via MO_ALIGN_16 would get in the
way of a hypothetical future implementation of the AC flag, since
do_unaligned_access would need to raise #AC instead of #GP for that.

One slightly more involved way to use alignment on the MemOp could be to
arrange to pass the problematic MemOp to do_unaligned_access and
helper_unaligned_{ld,st}. Then we could allow CPUs to handle
misalignment of different MemOps differently (e.g. raise #GP/SIGSEGV for
certain ops and #AC/SIGBUS for others). For this change to x86, we could
maybe get away with making MO_ALIGN_16 and above trigger #GP/SIGSEGV and
everything else trigger #AC/SIGBUS. If that's a little hacky, we could
instead add some dedicated bits to MemOp that distinguish different
types of unaligned accesses.
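
As a sketch of that idea (the extra MemOp parameter is hypothetical and does
not exist in the current hook signature; MO_AMASK, MO_ALIGN_16 and EXCP11_ALGN
are existing definitions), the x86 hook could dispatch on the requested
alignment:

static void x86_cpu_do_unaligned_access(CPUState *cs, vaddr addr,
                                        MMUAccessType access_type,
                                        int mmu_idx, MemOp mop,
                                        uintptr_t retaddr)
{
    X86CPU *cpu = X86_CPU(cs);

    if ((mop & MO_AMASK) >= MO_ALIGN_16) {
        /* 16-byte or stricter alignment requirement: raise #GP(0). */
        raise_exception_err_ra(&cpu->env, EXCP0D_GPF, 0, retaddr);
    } else {
        /* Ordinary misalignment: #AC(0), if the AC flag were implemented. */
        raise_exception_err_ra(&cpu->env, EXCP11_ALGN, 0, retaddr);
    }
}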

What do you think? Happy to implement whichever approach is preferred!

Thanks,
Ricky
Richard Henderson Aug. 29, 2022, 10:54 p.m. UTC | #3
On 8/29/22 13:46, Ricky Zhou wrote:
> Thanks for taking a look at this - did you see the bit in the cover
> letter where I discuss doing this via alignment requirements on the
> memory operation? My logic was that the memop alignment checks seem to
> be more oriented towards triggering #AC exceptions (even though this is
> not currently implemented),

I missed that in the cover.  However... implementing #AC is pretty hypothetical.  It's not 
something that I've ever seen used, and not something that anyone has asked for.

> One slightly more involved way to use alignment on the MemOp could be to
> arrange to pass the problematic MemOp to do_unaligned_access and
> helper_unaligned_{ld,st}. Then we could allow CPUs to handle
> misalignment of different MemOps differently (e.g. raise #GP/SIGSEGV for
> certain ops and #AC/SIGBUS for others). For this change to x86, we could
> maybe get away with making MO_ALIGN_16 and above trigger #GP/SIGSEGV and
> everything else trigger #AC/SIGBUS. If that's a little hacky, we could
> instead add some dedicated bits to MemOp that distinguish different
> types of unaligned accesses.

There's another related problem that actually has gotten a bug report in the past: when 
the form of the address should raise #SS instead of #GP in system mode.

My initial thought was to record information about "the" memory access in the per-insn 
unwind info, until I realized that there are insns with  multiple memory operations 
requiring different treatment.  E.g. "push (%rax)", where the read might raise #GP and the 
write might raise #SS.  So I think we'd need to encode #GP vs #SS into the mmu_idx used 
(e.g. in the lsb).
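
Purely to illustrate that encoding (these helpers are hypothetical and not in
the tree; a real patch would have to fit the extra bit into the existing x86
MMU-index layout):

/* Hypothetical: carry the fault kind in the low bit of the mmu_idx handed
 * to the memory operation, and strip it again before the index is used. */
static inline int mmu_idx_with_fault_kind(int mmu_idx, bool stack_fault)
{
    return (mmu_idx << 1) | (stack_fault ? 1 : 0);
}

static inline bool mmu_idx_fault_is_ss(int encoded)
{
    return encoded & 1;    /* lsb set: raise #SS instead of #GP */
}

static inline int mmu_idx_strip_fault_kind(int encoded)
{
    return encoded >> 1;
}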

However, I don't think there are any similar situations of multiple memory types affecting 
SSE, so #AC vs #GP could in fact be encoded into the per-insn unwind info.

As for SIGBUS vs SIGSEGV for SSE and user-only, you need only implement the
x86_tcg_ops.record_sigbus hook.  Cf. the s390x version, which raises PGM_SPECIFICATION ->
SIGILL for unaligned atomic operations.
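
A user-only sketch of that hook, modelled loosely on the s390x version
mentioned above (illustrative only; it assumes raise_exception_err_ra and the
usual target/i386 headers):

static void x86_cpu_record_sigbus(CPUState *cs, vaddr addr,
                                  MMUAccessType access_type, uintptr_t ra)
{
    X86CPU *cpu = X86_CPU(cs);

    /* Convert the would-be SIGBUS into a guest #GP(0), which the
     * user-mode cpu loop delivers as SIGSEGV. */
    raise_exception_err_ra(&cpu->env, EXCP0D_GPF, 0, ra);
}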


r~

Patch

diff --git a/target/i386/helper.h b/target/i386/helper.h
index ac3b4d1ee3..17d78f2b0d 100644
--- a/target/i386/helper.h
+++ b/target/i386/helper.h
@@ -213,6 +213,7 @@  DEF_HELPER_1(update_mxcsr, void, env)
 DEF_HELPER_1(enter_mmx, void, env)
 DEF_HELPER_1(emms, void, env)
 DEF_HELPER_3(movq, void, env, ptr, ptr)
+DEF_HELPER_3(raise_gp_if_unaligned, void, env, tl, tl)
 
 #define SHIFT 0
 #include "ops_sse_header.h"
diff --git a/target/i386/tcg/mem_helper.c b/target/i386/tcg/mem_helper.c
index e3cdafd2d4..79259abef3 100644
--- a/target/i386/tcg/mem_helper.c
+++ b/target/i386/tcg/mem_helper.c
@@ -181,3 +181,11 @@  void helper_boundl(CPUX86State *env, target_ulong a0, int v)
         raise_exception_ra(env, EXCP05_BOUND, GETPC());
     }
 }
+
+void helper_raise_gp_if_unaligned(CPUX86State *env, target_ulong addr,
+                                  target_ulong align_mask)
+{
+    if (unlikely((addr & align_mask) != 0)) {
+        raise_exception_ra(env, EXCP0D_GPF, GETPC());
+    }
+}
diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index b7972f0ff5..de13f483b6 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -3054,7 +3054,7 @@  static const struct SSEOpHelper_epp sse_op_table6[256] = {
     [0x25] = SSE41_OP(pmovsxdq),
     [0x28] = SSE41_OP(pmuldq),
     [0x29] = SSE41_OP(pcmpeqq),
-    [0x2a] = SSE41_SPECIAL, /* movntqda */
+    [0x2a] = SSE41_SPECIAL, /* movntdqa */
     [0x2b] = SSE41_OP(packusdw),
     [0x30] = SSE41_OP(pmovzxbw),
     [0x31] = SSE41_OP(pmovzxbd),
@@ -3194,10 +3194,11 @@  static void gen_sse(CPUX86State *env, DisasContext *s, int b,
             break;
         case 0x1e7: /* movntdq */
         case 0x02b: /* movntps */
-        case 0x12b: /* movntps */
+        case 0x12b: /* movntpd */
             if (mod == 3)
                 goto illegal_op;
             gen_lea_modrm(env, s, modrm);
+            gen_helper_raise_gp_if_unaligned(cpu_env, s->A0, tcg_const_tl(0xf));
             gen_sto_env_A0(s, offsetof(CPUX86State, xmm_regs[reg]));
             break;
         case 0x3f0: /* lddqu */
@@ -3273,6 +3274,11 @@  static void gen_sse(CPUX86State *env, DisasContext *s, int b,
         case 0x26f: /* movdqu xmm, ea */
             if (mod != 3) {
                 gen_lea_modrm(env, s, modrm);
+                /* movaps, movapd, movdqa */
+                if (b == 0x028 || b == 0x128 || b == 0x16f) {
+                    gen_helper_raise_gp_if_unaligned(cpu_env, s->A0,
+                                                     tcg_const_tl(0xf));
+                }
                 gen_ldo_env_A0(s, offsetof(CPUX86State, xmm_regs[reg]));
             } else {
                 rm = (modrm & 7) | REX_B(s);
@@ -3331,6 +3337,10 @@  static void gen_sse(CPUX86State *env, DisasContext *s, int b,
         case 0x212: /* movsldup */
             if (mod != 3) {
                 gen_lea_modrm(env, s, modrm);
+                if (!(s->prefix & PREFIX_VEX)) {
+                    gen_helper_raise_gp_if_unaligned(cpu_env, s->A0,
+                                                     tcg_const_tl(0xf));
+                }
                 gen_ldo_env_A0(s, offsetof(CPUX86State, xmm_regs[reg]));
             } else {
                 rm = (modrm & 7) | REX_B(s);
@@ -3373,6 +3383,10 @@  static void gen_sse(CPUX86State *env, DisasContext *s, int b,
         case 0x216: /* movshdup */
             if (mod != 3) {
                 gen_lea_modrm(env, s, modrm);
+                if (!(s->prefix & PREFIX_VEX)) {
+                    gen_helper_raise_gp_if_unaligned(cpu_env, s->A0,
+                                                     tcg_const_tl(0xf));
+                }
                 gen_ldo_env_A0(s, offsetof(CPUX86State, xmm_regs[reg]));
             } else {
                 rm = (modrm & 7) | REX_B(s);
@@ -3465,6 +3479,10 @@  static void gen_sse(CPUX86State *env, DisasContext *s, int b,
         case 0x27f: /* movdqu ea, xmm */
             if (mod != 3) {
                 gen_lea_modrm(env, s, modrm);
+                if (b == 0x029 || b == 0x129 || b == 0x17f) {
+                    gen_helper_raise_gp_if_unaligned(cpu_env, s->A0,
+                                                     tcg_const_tl(0xf));
+                }
                 gen_sto_env_A0(s, offsetof(CPUX86State, xmm_regs[reg]));
             } else {
                 rm = (modrm & 7) | REX_B(s);
@@ -3806,10 +3824,16 @@  static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                         tcg_gen_st16_tl(s->tmp0, cpu_env, op2_offset +
                                         offsetof(ZMMReg, ZMM_W(0)));
                         break;
-                    case 0x2a:            /* movntqda */
+                    case 0x2a:            /* movntdqa */
+                        gen_helper_raise_gp_if_unaligned(cpu_env, s->A0,
+                                                         tcg_const_tl(0xf));
                         gen_ldo_env_A0(s, op1_offset);
                         return;
                     default:
+                        if (!(s->prefix & PREFIX_VEX)) {
+                            gen_helper_raise_gp_if_unaligned(cpu_env, s->A0,
+                                                             tcg_const_tl(0xf));
+                        }
                         gen_ldo_env_A0(s, op2_offset);
                     }
                 }
@@ -4351,6 +4375,10 @@  static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                 } else {
                     op2_offset = offsetof(CPUX86State,xmm_t0);
                     gen_lea_modrm(env, s, modrm);
+                    if (!(s->prefix & PREFIX_VEX)) {
+                        gen_helper_raise_gp_if_unaligned(cpu_env, s->A0,
+                                                         tcg_const_tl(0xf));
+                    }
                     gen_ldo_env_A0(s, op2_offset);
                 }
             } else {
@@ -4469,6 +4497,10 @@  static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                     break;
                 default:
                     /* 128 bit access */
+                    if (!(s->prefix & PREFIX_VEX)) {
+                        gen_helper_raise_gp_if_unaligned(cpu_env, s->A0,
+                                                         tcg_const_tl(0xf));
+                    }
                     gen_ldo_env_A0(s, op2_offset);
                     break;
                 }