[RFC,v2,4/9] target/ppc: delay writeback of avr{l,h} during lvx instruction

Message ID 20181217122405.18732-5-mark.cave-ayland@ilande.co.uk
State New
Series target/ppc: convert VMX instructions to use TCG vector operations

Commit Message

Mark Cave-Ayland Dec. 17, 2018, 12:24 p.m. UTC
During review of the previous patch, Richard pointed out an existing bug: the
writeback to the avr{l,h} registers should be delayed until after any
exceptions have been raised.

Perform both 64-bit loads into separate temporaries and then write them into
the avr{l,h} registers together to ensure that this is always the case.
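
To illustrate the intent, here is a minimal standalone C sketch of the two
orderings (illustrative names only, not the actual TCG API; load64() is a
hypothetical stand-in for a guest memory access that may raise an exception
and unwind):

    #include <stdint.h>

    typedef struct { uint64_t lo, hi; } AVReg;

    /* Hypothetical stand-in for a 64-bit guest load that may fault. */
    extern uint64_t load64(uint64_t ea);

    /* Eager writeback: if the second load faults, reg->lo has already
     * been overwritten and the register is left half-updated. */
    static void lvx_eager(AVReg *reg, uint64_t ea)
    {
        reg->lo = load64(ea);
        reg->hi = load64(ea + 8);
    }

    /* Delayed writeback: both loads complete (or fault) before any
     * architectural state is modified. */
    static void lvx_delayed(AVReg *reg, uint64_t ea)
    {
        uint64_t t0 = load64(ea);
        uint64_t t1 = load64(ea + 8);
        reg->lo = t0;
        reg->hi = t1;
    }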

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
---
 target/ppc/translate/vmx-impl.inc.c | 24 +++++++++++++-----------
 1 file changed, 13 insertions(+), 11 deletions(-)

Comments

Richard Henderson Dec. 17, 2018, 4:46 p.m. UTC | #1
On 12/17/18 4:24 AM, Mark Cave-Ayland wrote:
> During review of the previous patch, Richard pointed out an existing bug: the
> writeback to the avr{l,h} registers should be delayed until after any
> exceptions have been raised.
> 
> Perform both 64-bit loads into separate temporaries and then write them into
> the avr{l,h} registers together to ensure that this is always the case.
> 
> Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
> ---
>  target/ppc/translate/vmx-impl.inc.c | 24 +++++++++++++-----------
>  1 file changed, 13 insertions(+), 11 deletions(-)

I feel a bit silly.  There was no bug, since the address is forced to be
aligned on a 16-byte boundary.  The second memory access will be to the same
page and cannot trap.
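
To spell out the argument: tcg_gen_andi_tl(EA, EA, ~0xf) forces EA onto a
16-byte boundary, and since the smallest target page size is far larger than
16 bytes, EA and EA + 8 always land on the same page. A standalone sketch of
the invariant (assuming a 4 KiB page size purely for illustration):

    #include <assert.h>
    #include <stdint.h>

    #define PAGE_SIZE 4096u  /* assumed minimum page size, illustrative */

    int main(void)
    {
        for (uint64_t addr = 0; addr < 4 * PAGE_SIZE; addr++) {
            uint64_t ea = addr & ~(uint64_t)0xf;  /* as lvx does */
            /* Both 64-bit halves of the 16-byte access lie on the same
             * page, so the second load cannot fault if the first one
             * did not. */
            assert(ea / PAGE_SIZE == (ea + 8) / PAGE_SIZE);
        }
        return 0;
    }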

That said, I think the cleanup looks good.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>


r~
Mark Cave-Ayland Dec. 17, 2018, 5:23 p.m. UTC | #2
On 17/12/2018 16:46, Richard Henderson wrote:

> On 12/17/18 4:24 AM, Mark Cave-Ayland wrote:
>> During review of the previous patch, Richard pointed out an existing bug: the
>> writeback to the avr{l,h} registers should be delayed until after any
>> exceptions have been raised.
>>
>> Perform both 64-bit loads into separate temporaries and then write them into
>> the avr{l,h} registers together to ensure that this is always the case.
>>
>> Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
>> ---
>>  target/ppc/translate/vmx-impl.inc.c | 24 +++++++++++++-----------
>>  1 file changed, 13 insertions(+), 11 deletions(-)
> 
> I feel a bit silly.  There was no bug, since the address is forced to be
> aligned on a 16-byte boundary.  The second memory access will be to the same
> page and cannot trap.
> 
> That said, I think the cleanup looks good.
> 
> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>

Ah okay. I'm inclined to drop this patch for now, since it feels strange introducing
a cleanup to just one of the load functions...


ATB,

Mark.

Patch

diff --git a/target/ppc/translate/vmx-impl.inc.c b/target/ppc/translate/vmx-impl.inc.c
index 30046c6e31..cd7d12265c 100644
--- a/target/ppc/translate/vmx-impl.inc.c
+++ b/target/ppc/translate/vmx-impl.inc.c
@@ -18,33 +18,35 @@  static inline TCGv_ptr gen_avr_ptr(int reg)
 static void glue(gen_, name)(DisasContext *ctx)                                       \
 {                                                                             \
     TCGv EA;                                                                  \
-    TCGv_i64 avr;                                                             \
+    TCGv_i64 avr1, avr2;                                                      \
     if (unlikely(!ctx->altivec_enabled)) {                                    \
         gen_exception(ctx, POWERPC_EXCP_VPU);                                 \
         return;                                                               \
     }                                                                         \
     gen_set_access_type(ctx, ACCESS_INT);                                     \
-    avr = tcg_temp_new_i64();                                                 \
+    avr1 = tcg_temp_new_i64();                                                \
+    avr2 = tcg_temp_new_i64();                                                \
     EA = tcg_temp_new();                                                      \
     gen_addr_reg_index(ctx, EA);                                              \
     tcg_gen_andi_tl(EA, EA, ~0xf);                                            \
     /* We only need to swap high and low halves. gen_qemu_ld64_i64 does       \
        necessary 64-bit byteswap already. */                                  \
     if (ctx->le_mode) {                                                       \
-        gen_qemu_ld64_i64(ctx, avr, EA);                                      \
-        set_avr64(rD(ctx->opcode), avr, false);                               \
+        gen_qemu_ld64_i64(ctx, avr1, EA);                                     \
         tcg_gen_addi_tl(EA, EA, 8);                                           \
-        gen_qemu_ld64_i64(ctx, avr, EA);                                      \
-        set_avr64(rD(ctx->opcode), avr, true);                                \
+        gen_qemu_ld64_i64(ctx, avr2, EA);                                     \
+        set_avr64(rD(ctx->opcode), avr1, false);                              \
+        set_avr64(rD(ctx->opcode), avr2, true);                               \
     } else {                                                                  \
-        gen_qemu_ld64_i64(ctx, avr, EA);                                      \
-        set_avr64(rD(ctx->opcode), avr, true);                                \
+        gen_qemu_ld64_i64(ctx, avr1, EA);                                     \
         tcg_gen_addi_tl(EA, EA, 8);                                           \
-        gen_qemu_ld64_i64(ctx, avr, EA);                                      \
-        set_avr64(rD(ctx->opcode), avr, false);                               \
+        gen_qemu_ld64_i64(ctx, avr2, EA);                                     \
+        set_avr64(rD(ctx->opcode), avr1, true);                               \
+        set_avr64(rD(ctx->opcode), avr2, false);                              \
     }                                                                         \
     tcg_temp_free(EA);                                                        \
-    tcg_temp_free_i64(avr);                                                   \
+    tcg_temp_free_i64(avr1);                                                  \
+    tcg_temp_free_i64(avr2);                                                  \
 }
 
 #define GEN_VR_STX(name, opc2, opc3)                                          \