diff mbox

[v2,06/15] target-tricore: Add instructions of SRC opcode format

Message ID 1405359671-25985-7-git-send-email-kbastian@mail.uni-paderborn.de
State New
Headers show

Commit Message

Bastian Koppelmann July 14, 2014, 5:41 p.m. UTC
Add instructions of SRC opcode format.
Add helper for sh arithmetic carry.
Add micro-op generator functions for conditional add/sub and shi.

Signed-off-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
---
v1 -> v2:
    - helper_shac uses sextract32 for the constant and add len parameter.
    - Replace else case with signed right shift in helper_shac.
    - Remove sign_extend function and use sextract32 instead.
    - Replace branches in OP_COND makro with movcond.
    - Remove gen_cond_mov and use tcg_gen_movcond_tl instead.
    - Remove gen_sh and and change gen_shi to a special case.
    - Moved all SRC instructions to one decode function.

 target-tricore/helper.h    |  19 ++++++
 target-tricore/op_helper.c |  32 +++++++++
 target-tricore/translate.c | 162 +++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 213 insertions(+)

--
2.0.1

Comments

Richard Henderson July 14, 2014, 9:05 p.m. UTC | #1
On 07/14/2014 10:41 AM, Bastian Koppelmann wrote:
> +target_ulong helper_shac(CPUTRICOREState *env, target_ulong r1,
> +                         target_ulong r2, target_ulong len)
> +{
> +    target_ulong carry_out, msk, msk_start, msk_len, ret;
> +    int32_t shift_count;
> +    int const6;
> +    const6 = sextract32(r2, 0, len);
> +
> +    if (const6 >= 0) {
> +        if (const6 != 0) {
> +            msk_start = 32 - const6;
> +            msk_len = 31-msk_start;
> +            msk = ((1 << msk_len) - 1) << msk_start;
> +            carry_out = ((r1 & msk) != 0);
> +        } else {
> +            carry_out = 0;
> +        }
> +        ret = r1 << const6;
> +    } else {
> +
> +        shift_count = 0 - const6;
> +        ret = (int32_t)r1 >> shift_count;
> +        msk = (1 << (shift_count - 1)) - 1;
> +        carry_out = ((r1 & msk) != 0);
> +    }
> +    if (carry_out) {
> +        /* TODO: carry out */
> +    }
> +    return ret;
> +}

Why a helper for SHA?  It's not any more difficult than SH.

> +static void gen_shi(TCGv ret, TCGv r1, int32_t r2, int len)
> +{
> +/* shift_count = sign_ext(const4[3:0]);
> +   D[a] = (shift_count >= 0) ? D[a] << shift_count : D[a] >> (-shift_count); */
> +    int shift_count = sextract32(r2, 0, len);

Careful with your documentation: you're adding the 16-bit documentation as
opposed to the more generic 32-bit documentation.

I do not think you should have a "len" parameter at all.  We've already
sign-extended const4 in decode_src_opc, so there's no need to do it again.

> +static void gen_shaci(TCGv ret, TCGv r1, int32_t con, int len)
> +{
> +    TCGv temp = tcg_const_i32(con);
> +
> +    gen_shac(ret, r1, temp, len);
> +
> +    tcg_temp_free(temp);
> +}

In particular, SHACI with an immediate is pretty much exactly SHI except with
an arithmetic right shift instead of a logical right shift.

(Yes, there's some carry and overflow bit computation to do, but it's not like
you've implemented any of that in your current implementation either.)

> +
> +/*
> + * Functions for decoding instructions
> + */
> +
> +static void decode_src_opc(DisasContext *ctx, int op1)
> +{
> +    int r1;
> +    int32_t const4;
> +    TCGv temp, temp2;
> +
> +    r1 = MASK_OP_SRC_S1D(ctx->opcode);
> +    const4 = MASK_OP_SRC_CONST4_SEXT(ctx->opcode);
> +
> +    switch (op1) {
> +
> +    case OPC1_16_SRC_ADD:

Watch the silly blank likes, above the case.  And the end-of-file blank lines
in some of the other patches.

> +        tcg_gen_addi_tl(cpu_gpr_d[r1], cpu_gpr_d[r1], const4);

Are you planning to come back to implement V and AV bits later?

> +    case OPC1_16_SRC_MOV_A:
> +        /* load const4 again unsigned */
> +        const4 = MASK_OP_SRC_CONST4(ctx->opcode);
> +        tcg_gen_movi_tl(cpu_gpr_a[r1], const4);

Err.. I don't think this is right.  I see "signed" on page 3-224.

> +    case OPC1_16_SRC_SHA:
> +        /* FIXME: const too long */
> +        gen_shaci(cpu_gpr_d[r1], cpu_gpr_d[r1], const4, 4);
> +        break;

Huh?  Why the fixme?


r~
Bastian Koppelmann July 15, 2014, 4:29 a.m. UTC | #2
>> +        tcg_gen_addi_tl(cpu_gpr_d[r1], cpu_gpr_d[r1], const4);
> Are you planning to come back to implement V and AV bits later?
Yes. I will do that in the next version of this patchset.
>
>> +    case OPC1_16_SRC_MOV_A:
>> +        /* load const4 again unsigned */
>> +        const4 = MASK_OP_SRC_CONST4(ctx->opcode);
>> +        tcg_gen_movi_tl(cpu_gpr_a[r1], const4);
> Err.. I don't think this is right.  I see "signed" on page 3-224.
Well this seemed to have changed over the instructionset version. If you 
look at the implemented CPU TC1796 it uses version 1.3 and you refer to 
version 1.6. However there should be a mechanism like arm_feature to 
handle this.
>
>> +    case OPC1_16_SRC_SHA:
>> +        /* FIXME: const too long */
>> +        gen_shaci(cpu_gpr_d[r1], cpu_gpr_d[r1], const4, 4);
>> +        break;
> Huh?  Why the fixme?
>
>
> r~
Bastian Koppelmann July 15, 2014, 1:19 p.m. UTC | #3
>> +        tcg_gen_addi_tl(cpu_gpr_d[r1], cpu_gpr_d[r1], const4);
> Are you planning to come back to implement V and AV bits later?
Would you recommend implementing this as a helper? It seems rather 
complex. Especially with half-word and byte arithmetic. On the other 
hand the instructions using this are common and would benefit from 
open-coding it in TCG.

Thanks,

Bastian
Richard Henderson July 15, 2014, 3:42 p.m. UTC | #4
On 07/15/2014 06:19 AM, Bastian Koppelmann wrote:
> 
>>> +        tcg_gen_addi_tl(cpu_gpr_d[r1], cpu_gpr_d[r1], const4);
>> Are you planning to come back to implement V and AV bits later?
> Would you recommend implementing this as a helper? It seems rather complex.
> Especially with half-word and byte arithmetic. On the other hand the
> instructions using this are common and would benefit from open-coding it in TCG.

The halfword and byte insns probably require a helper.  It's certainly going to
be worth while open coding the full word operations.

Ideally we'd be able to use vector instructions on the host, but we'd need
serious enhancements to TCG in order to implement that.

When the implementation of an insn requires more than, say, a dozen insns, or
requires any branching at all, then I believe we're better off sharing the code
with a helper.  This is because the real compiler is significantly better than
TCG at generating code across basic blocks, and the reduction in icache usage.


r~
diff mbox

Patch

diff --git a/target-tricore/helper.h b/target-tricore/helper.h
index e69de29..87c423c 100644
--- a/target-tricore/helper.h
+++ b/target-tricore/helper.h
@@ -0,0 +1,19 @@ 
+/*
+ *  Copyright (c) 2012-2014 Bastian Koppelmann C-Lab/University Paderborn
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+/* Arithmetic */
+DEF_HELPER_4(shac, i32, env, i32, i32, i32)
diff --git a/target-tricore/op_helper.c b/target-tricore/op_helper.c
index 2e5981f..1d1c9d8 100644
--- a/target-tricore/op_helper.c
+++ b/target-tricore/op_helper.c
@@ -20,6 +20,38 @@ 
 #include "exec/helper-proto.h"
 #include "exec/cpu_ldst.h"

+
+target_ulong helper_shac(CPUTRICOREState *env, target_ulong r1,
+                         target_ulong r2, target_ulong len)
+{
+    target_ulong carry_out, msk, msk_start, msk_len, ret;
+    int32_t shift_count;
+    int const6;
+    const6 = sextract32(r2, 0, len);
+
+    if (const6 >= 0) {
+        if (const6 != 0) {
+            msk_start = 32 - const6;
+            msk_len = 31-msk_start;
+            msk = ((1 << msk_len) - 1) << msk_start;
+            carry_out = ((r1 & msk) != 0);
+        } else {
+            carry_out = 0;
+        }
+        ret = r1 << const6;
+    } else {
+
+        shift_count = 0 - const6;
+        ret = (int32_t)r1 >> shift_count;
+        msk = (1 << (shift_count - 1)) - 1;
+        carry_out = ((r1 & msk) != 0);
+    }
+    if (carry_out) {
+        /* TODO: carry out */
+    }
+    return ret;
+}
+
 static inline void QEMU_NORETURN do_raise_exception_err(CPUTRICOREState *env,
                                                         uint32_t exception,
                                                         int error_code,
diff --git a/target-tricore/translate.c b/target-tricore/translate.c
index 19250e1..3d59709 100644
--- a/target-tricore/translate.c
+++ b/target-tricore/translate.c
@@ -27,6 +27,7 @@ 
 #include "exec/helper-gen.h"

 #include "tricore-opcodes.h"
+
 /*
  * TCG registers
  */
@@ -94,8 +95,169 @@  void tricore_cpu_dump_state(CPUState *cs, FILE *f,

 }

+/*
+ * Functions to generate micro-ops
+ */
+/* Functions for arithmetic instructions  */
+
+#define OP_COND(insn)\
+static inline void gen_cond_##insn(int cond, TCGv r1, TCGv r2, TCGv r3, \
+                                   TCGv r4)                             \
+{                                                                       \
+    TCGv temp = tcg_temp_new();                                         \
+    TCGv temp2 = tcg_const_tl(0);                                       \
+                                                                        \
+    tcg_gen_##insn ## _tl(temp, r1, r2);                                \
+    tcg_gen_movcond_tl(cond, r3, r4, temp2, temp, r3);                  \
+                                                                        \
+    tcg_temp_free(temp);                                                \
+    tcg_temp_free(temp2);                                               \
+}                                                                       \
+                                                                        \
+static inline void gen_condi_##insn(int cond, TCGv r1, int32_t r2,      \
+                                    TCGv r3, TCGv r4)                   \
+{                                                                       \
+    TCGv temp = tcg_const_i32(r2);                                      \
+    gen_cond_##insn(cond, r1, temp, r3, r4);                            \
+    tcg_temp_free(temp);                                                \
+}
+
+OP_COND(add)
+OP_COND(sub)
+
+static void gen_shi(TCGv ret, TCGv r1, int32_t r2, int len)
+{
+/* shift_count = sign_ext(const4[3:0]);
+   D[a] = (shift_count >= 0) ? D[a] << shift_count : D[a] >> (-shift_count); */
+    int shift_count = sextract32(r2, 0, len);
+
+    if (shift_count == -32) {
+        tcg_gen_movi_tl(r1, 0);
+    } else if (shift_count >= 0) {
+        tcg_gen_shli_tl(r1, r1, shift_count);
+    } else {
+        tcg_gen_shri_tl(r1, r1, (-shift_count));
+    }
+}
+
+static void gen_shac(TCGv ret, TCGv r1, TCGv r2, int len)
+{
+    TCGv temp = tcg_const_tl(len);
+    gen_helper_shac(ret, cpu_env, r1, r2, temp);
+    tcg_temp_free(temp);
+}
+
+static void gen_shaci(TCGv ret, TCGv r1, int32_t con, int len)
+{
+    TCGv temp = tcg_const_i32(con);
+
+    gen_shac(ret, r1, temp, len);
+
+    tcg_temp_free(temp);
+}
+
+/*
+ * Functions for decoding instructions
+ */
+
+static void decode_src_opc(DisasContext *ctx, int op1)
+{
+    int r1;
+    int32_t const4;
+    TCGv temp, temp2;
+
+    r1 = MASK_OP_SRC_S1D(ctx->opcode);
+    const4 = MASK_OP_SRC_CONST4_SEXT(ctx->opcode);
+
+    switch (op1) {
+
+    case OPC1_16_SRC_ADD:
+        tcg_gen_addi_tl(cpu_gpr_d[r1], cpu_gpr_d[r1], const4);
+        break;
+    case OPC1_16_SRC_ADD_A15:
+        tcg_gen_addi_tl(cpu_gpr_d[15], cpu_gpr_d[r1], const4);
+        break;
+    case OPC1_16_SRC_ADD_15A:
+        tcg_gen_addi_tl(cpu_gpr_d[r1], cpu_gpr_d[15], const4);
+        break;
+    case OPC1_16_SRC_ADD_A:
+        tcg_gen_addi_tl(cpu_gpr_a[r1], cpu_gpr_a[r1], const4);
+        break;
+    case OPC1_16_SRC_CADD:
+        gen_condi_add(TCG_COND_NE, cpu_gpr_d[r1], const4, cpu_gpr_d[r1],
+                      cpu_gpr_d[15]);
+        break;
+    case OPC1_16_SRC_CADDN:
+        gen_condi_add(TCG_COND_EQ, cpu_gpr_d[r1], const4, cpu_gpr_d[r1],
+                      cpu_gpr_d[15]);
+        break;
+    case OPC1_16_SRC_CMOV:
+        temp = tcg_const_tl(0);
+        temp2 = tcg_const_tl(const4);
+        tcg_gen_movcond_tl(TCG_COND_EQ, cpu_gpr_d[r1], cpu_gpr_d[15], temp,
+                           temp2, cpu_gpr_d[r1]);
+        tcg_temp_free(temp);
+        tcg_temp_free(temp2);
+        break;
+    case OPC1_16_SRC_CMOVN:
+        temp = tcg_const_tl(0);
+        temp2 = tcg_const_tl(const4);
+        tcg_gen_movcond_tl(TCG_COND_NE, cpu_gpr_d[r1], cpu_gpr_d[15], temp,
+                           temp2, cpu_gpr_d[r1]);
+        tcg_temp_free(temp);
+        tcg_temp_free(temp2);
+        break;
+    case OPC1_16_SRC_EQ:
+        tcg_gen_setcondi_tl(TCG_COND_EQ, cpu_gpr_d[15], cpu_gpr_d[r1],
+                            const4);
+        break;
+    case OPC1_16_SRC_LT:
+        tcg_gen_setcondi_tl(TCG_COND_LT, cpu_gpr_d[15], cpu_gpr_d[r1],
+                            const4);
+        break;
+    case OPC1_16_SRC_MOV:
+        tcg_gen_movi_tl(cpu_gpr_d[r1], const4);
+        break;
+    case OPC1_16_SRC_MOV_A:
+        /* load const4 again unsigned */
+        const4 = MASK_OP_SRC_CONST4(ctx->opcode);
+        tcg_gen_movi_tl(cpu_gpr_a[r1], const4);
+        break;
+    case OPC1_16_SRC_SH:
+        gen_shi(cpu_gpr_d[r1], cpu_gpr_d[r1], const4, 4);
+        break;
+    case OPC1_16_SRC_SHA:
+        /* FIXME: const too long */
+        gen_shaci(cpu_gpr_d[r1], cpu_gpr_d[r1], const4, 4);
+        break;
+    }
+}
+
 static void decode_16Bit_opc(CPUTRICOREState *env, DisasContext *ctx)
 {
+    int op1;
+
+    op1 = MASK_OP_MAJOR(ctx->opcode);
+
+    switch (op1) {
+
+    case OPC1_16_SRC_ADD:
+    case OPC1_16_SRC_ADD_A15:
+    case OPC1_16_SRC_ADD_15A:
+    case OPC1_16_SRC_ADD_A:
+    case OPC1_16_SRC_CADD:
+    case OPC1_16_SRC_CADDN:
+    case OPC1_16_SRC_CMOV:
+    case OPC1_16_SRC_CMOVN:
+    case OPC1_16_SRC_EQ:
+    case OPC1_16_SRC_LT:
+    case OPC1_16_SRC_MOV:
+    case OPC1_16_SRC_MOV_A:
+    case OPC1_16_SRC_SH:
+    case OPC1_16_SRC_SHA:
+        decode_src_opc(ctx, op1);
+        break;
+    }
 }

 static void decode_32Bit_opc(CPUTRICOREState *env, DisasContext *ctx)