Patchwork target-arm: implement ARMv8 VSEL instruction

login
register
mail settings
Submitter Måns Rullgård
Date June 18, 2013, 2:30 p.m.
Message ID <1371565848-9297-1-git-send-email-mans@mansr.com>
Download mbox | patch
Permalink /patch/252369/
State New
Headers show

Comments

Måns Rullgård - June 18, 2013, 2:30 p.m.
This adds support for the VSEL instruction introduced in ARMv8.
It resides along with other new VFP instructions under the CDP2
encoding which was previously unused.

Signed-off-by: Mans Rullgard <mans@mansr.com>
---
Basic testing performed but not every combination.
---
 target-arm/translate.c | 51 ++++++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 49 insertions(+), 2 deletions(-)
Peter Maydell - June 20, 2013, 7:23 p.m.
On 18 June 2013 15:30, Mans Rullgard <mans@mansr.com> wrote:
> This adds support for the VSEL instruction introduced in ARMv8.
> It resides along with other new VFP instructions under the CDP2
> encoding which was previously unused.
>
> Signed-off-by: Mans Rullgard <mans@mansr.com>

So I found this pretty confusing, which I think is an indication
that we need to start by cleaning up the existing v7 VFP/Neon
decode.

Specifically, currently we handle all Neon decode by just calling
the neon decode functions directly from the disas_arm_insn
and disas_thumb2_insn functions. We should move VFP to work
in the same way (ie take it out of disas_coproc_insn()).
Basically, the architecture manual treats them as part of the
core instruction set, and we should make our decoder do the same.

The (existing) coproc decode is also confusing, and would benefit
a lot from a comment at the top of disas_coproc_insn specifying
the opcode patterns that can reach it.

> +    if (((insn >> 23) & 1) == 0) {
> +        /* vsel */
> +        int cc = (insn >> 20) & 3;
> +        int cond = (cc << 2) | (((cc << 1) ^ cc) & 2);
> +        int pass_label = gen_new_label();
> +
> +        gen_mov_F0_vreg(dp, rn);
> +        gen_mov_vreg_F0(dp, rd);
> +        gen_test_cc(cond, pass_label);
> +        gen_mov_F0_vreg(dp, rm);
> +        gen_mov_vreg_F0(dp, rd);
> +        gen_set_label(pass_label);

You can generate better code with the TCG movcond op.
Luckily you don't actually have to duplicate the whole of
gen_test_cc only doing movconds, because there are only actually
4 encodable conditions here (3 of which turn into a single
movcond; the fourth requires two consecutive movcond ops).

Also I don't think we should introduce any new uses of F0/F1.
You can just load a VFP register into a TCG temp like this:

    ftmp = tcg_temp_new_i32();
    tcg_gen_ld_f32(ftmp, cpu_env, vfp_reg_offset(0, rd));

operate on it as usual, and store:
    tcg_gen_st_f32(ftmp, cpu_env, vfp_reg_offset(0, rd));
    tcg_temp_free_i32(ftmp);

(similarly for double).

> @@ -6699,6 +6742,12 @@ static void disas_arm_insn(CPUARMState * env, DisasContext *s)
>              }
>              return; /* v7MP: Unallocated memory hint: must NOP */
>          }
> +        if ((insn & 0x0f000010) == 0x0e000000) {
> +            /* cdp2 */
> +            if (disas_coproc_insn(env, s, insn))
> +                goto illegal_op;
> +            return;
> +        }

This hunk is oddly placed, because it's neither next to the neon
decode (which is further up) nor the mrc2/mcr2 decode (which is
further down).

thanks
-- PMM
Måns Rullgård - June 20, 2013, 11:25 p.m.
Peter Maydell <peter.maydell@linaro.org> writes:

> On 18 June 2013 15:30, Mans Rullgard <mans@mansr.com> wrote:
>> This adds support for the VSEL instruction introduced in ARMv8.
>> It resides along with other new VFP instructions under the CDP2
>> encoding which was previously unused.
>>
>> Signed-off-by: Mans Rullgard <mans@mansr.com>
>
> So I found this pretty confusing,

That makes two of us.

> which I think is an indication that we need to start by cleaning up
> the existing v7 VFP/Neon decode.
>
> Specifically, currently we handle all Neon decode by just calling
> the neon decode functions directly from the disas_arm_insn
> and disas_thumb2_insn functions. We should move VFP to work
> in the same way (ie take it out of disas_coproc_insn()).
> Basically, the architecture manual treats them as part of the
> core instruction set, and we should make our decoder do the same.
>
> The (existing) coproc decode is also confusing, and would benefit
> a lot from a comment at the top of disas_coproc_insn specifying
> the opcode patterns that can reach it.
>
>> +    if (((insn >> 23) & 1) == 0) {
>> +        /* vsel */
>> +        int cc = (insn >> 20) & 3;
>> +        int cond = (cc << 2) | (((cc << 1) ^ cc) & 2);
>> +        int pass_label = gen_new_label();
>> +
>> +        gen_mov_F0_vreg(dp, rn);
>> +        gen_mov_vreg_F0(dp, rd);
>> +        gen_test_cc(cond, pass_label);
>> +        gen_mov_F0_vreg(dp, rm);
>> +        gen_mov_vreg_F0(dp, rd);
>> +        gen_set_label(pass_label);
>
> You can generate better code with the TCG movcond op.
> Luckily you don't actually have to duplicate the whole of
> gen_test_cc only doing movconds, because there are only actually
> 4 encodable conditions here (3 of which turn into a single
> movcond; the fourth requires two consecutive movcond ops).

Thanks, that sounds better.

> Also I don't think we should introduce any new uses of F0/F1.
> You can just load a VFP register into a TCG temp like this:

Great, more obsolete stuff.

>     ftmp = tcg_temp_new_i32();
>     tcg_gen_ld_f32(ftmp, cpu_env, vfp_reg_offset(0, rd));
>
> operate on it as usual, and store:
>     tcg_gen_st_f32(ftmp, cpu_env, vfp_reg_offset(0, rd));
>     tcg_temp_free_i32(ftmp);
>
> (similarly for double).
>
>> @@ -6699,6 +6742,12 @@ static void disas_arm_insn(CPUARMState * env, DisasContext *s)
>>              }
>>              return; /* v7MP: Unallocated memory hint: must NOP */
>>          }
>> +        if ((insn & 0x0f000010) == 0x0e000000) {
>> +            /* cdp2 */
>> +            if (disas_coproc_insn(env, s, insn))
>> +                goto illegal_op;
>> +            return;
>> +        }
>
> This hunk is oddly placed, because it's neither next to the neon
> decode (which is further up) nor the mrc2/mcr2 decode (which is
> further down).

That's because it is neither.  It is CDP2, previously not decoded at all.
This seemed as logical a place as any to me.  If you disagree, please
say where you'd prefer that it go.
Peter Maydell - July 2, 2013, 12:56 p.m.
On 21 June 2013 00:25, Måns Rullgård <mans@mansr.com> wrote:
> Peter Maydell <peter.maydell@linaro.org> writes:
>> On 18 June 2013 15:30, Mans Rullgard <mans@mansr.com> wrote:
>>> @@ -6699,6 +6742,12 @@ static void disas_arm_insn(CPUARMState * env, DisasContext *s)
>>>              }
>>>              return; /* v7MP: Unallocated memory hint: must NOP */
>>>          }
>>> +        if ((insn & 0x0f000010) == 0x0e000000) {
>>> +            /* cdp2 */
>>> +            if (disas_coproc_insn(env, s, insn))
>>> +                goto illegal_op;
>>> +            return;
>>> +        }
>>
>> This hunk is oddly placed, because it's neither next to the neon
>> decode (which is further up) nor the mrc2/mcr2 decode (which is
>> further down).
>
> That's because it is neither.  It is CDP2, previously not decoded at all.
> This seemed as logical a place as any to me.  If you disagree, please
> say where you'd prefer that it go.

Well, if we're treating it in the same way as Neon (ie pulling
the VFP decode out of the decode_coproc() function as I suggest),
then the logical place to put VFP decode is next to the Neon
decode.

At that point the need to actually decode CDP2 into a call
to disas_coproc_insn() goes away, but if we did want to include
it then the logical place to put that decode is next to the
MCR2/MRC2 decode, because both CDP2 and MCR2/MRC2 deal with
coprocessor instructions and are in the same table in the ARM ARM.

-- PMM

Patch

diff --git a/target-arm/translate.c b/target-arm/translate.c
index f7089b2..eb9423b 100644
--- a/target-arm/translate.c
+++ b/target-arm/translate.c
@@ -2546,6 +2546,40 @@  static TCGv_i32 gen_load_and_replicate(DisasContext *s, TCGv_i32 addr, int size)
     return tmp;
 }
 
+static int disas_v8vfp_insn(CPUARMState *env, DisasContext *s, uint32_t insn)
+{
+    uint32_t rd, rn, rm;
+    int dp = (insn >> 8) & 1;
+
+    if (dp) {
+        VFP_DREG_D(rd, insn);
+        VFP_DREG_N(rn, insn);
+        VFP_DREG_M(rm, insn);
+    } else {
+        rd = VFP_SREG_D(insn);
+        rn = VFP_SREG_N(insn);
+        rm = VFP_SREG_M(insn);
+    }
+
+    if (((insn >> 23) & 1) == 0) {
+        /* vsel */
+        int cc = (insn >> 20) & 3;
+        int cond = (cc << 2) | (((cc << 1) ^ cc) & 2);
+        int pass_label = gen_new_label();
+
+        gen_mov_F0_vreg(dp, rn);
+        gen_mov_vreg_F0(dp, rd);
+        gen_test_cc(cond, pass_label);
+        gen_mov_F0_vreg(dp, rm);
+        gen_mov_vreg_F0(dp, rd);
+        gen_set_label(pass_label);
+
+        return 0;
+    }
+
+    return 1;
+}
+
 /* Disassemble a VFP instruction.  Returns nonzero if an error occurred
    (ie. an undefined instruction).  */
 static int disas_vfp_insn(CPUARMState * env, DisasContext *s, uint32_t insn)
@@ -2568,6 +2602,11 @@  static int disas_vfp_insn(CPUARMState * env, DisasContext *s, uint32_t insn)
             && rn != ARM_VFP_MVFR1 && rn != ARM_VFP_MVFR0)
             return 1;
     }
+    if (((insn >> 28) & 0xf) == 0xf) {
+        if (!arm_feature(env, ARM_FEATURE_V8))
+            return 1;
+        return disas_v8vfp_insn(env, s, insn);
+    }
     dp = ((insn & 0xf00) == 0xb00);
     switch ((insn >> 24) & 0xf) {
     case 0xe:
@@ -6241,6 +6280,10 @@  static int disas_coproc_insn(CPUARMState * env, DisasContext *s, uint32_t insn)
         /* cdp */
         return 1;
     }
+    if (((insn >> 28) & 0xf) == 0xf) {
+        /* cdp2 */
+        return 1;
+    }
 
     crm = insn & 0xf;
     if (is64) {
@@ -6699,6 +6742,12 @@  static void disas_arm_insn(CPUARMState * env, DisasContext *s)
             }
             return; /* v7MP: Unallocated memory hint: must NOP */
         }
+        if ((insn & 0x0f000010) == 0x0e000000) {
+            /* cdp2 */
+            if (disas_coproc_insn(env, s, insn))
+                goto illegal_op;
+            return;
+        }
 
         if ((insn & 0x0ffffdff) == 0x01010000) {
             ARCH(6);
@@ -8684,8 +8733,6 @@  static int disas_thumb2_insn(CPUARMState *env, DisasContext *s, uint16_t insn_hw
             if (disas_neon_data_insn(env, s, insn))
                 goto illegal_op;
         } else {
-            if (insn & (1 << 28))
-                goto illegal_op;
             if (disas_coproc_insn (env, s, insn))
                 goto illegal_op;
         }