Patchwork Fix extlh instruction on Alpha

Submitter Vince Weaver
Date Sept. 17, 2009, 4:07 p.m.
Message ID <20090917120406.J54069@stanley.csl.cornell.edu>
Permalink /patch/33792/
State Superseded

Comments

Vince Weaver - Sept. 17, 2009, 4:07 p.m.
On Wed, 16 Sep 2009, Aurelien Jarno wrote:

> In case tmp1 = 0, it becomes 64, and then 0 again after the and, so
> rc=ra<<0.

Ah, I see.  I completely missed that optimization.

How does this updated patch look?  I removed one of the TCGv variables 
too.  Does that help performance?  What would be nice is a tcg 
subtract-from instruction, which I know some architectures have.  Maybe 
tcg does have it and I should look harder.
Laurent Desnogues - Sept. 17, 2009, 4:25 p.m.
On Thu, Sep 17, 2009 at 6:07 PM, Vince Weaver <vince@csl.cornell.edu> wrote:
>
> What would be nice is a tcg
> subtract-from instruction, which I know some architectures have.  Maybe
> tcg does have it and I should look harder.

There is tcg_gen_subfi.


Laurent
Andreas Schwab - Sept. 17, 2009, 4:35 p.m.
Vince Weaver <vince@csl.cornell.edu> writes:

>              tcg_gen_andi_i64(tmp1, cpu_ir[rb], 7);
>              tcg_gen_shli_i64(tmp1, tmp1, 3);
> -            tmp2 = tcg_const_i64(64);
> -            tcg_gen_sub_i64(tmp1, tmp2, tmp1);
> -            tcg_temp_free(tmp2);
> +            tcg_gen_andi_i64(tmp1, tmp1, 0x3f);
> +            tcg_gen_neg_i64(tmp1, tmp1);
> +            tcg_gen_addi_i64(tmp1, tmp1, 64);

This wastes an operation.  If you switch andi and neg you don't need to
add 64.
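
A minimal sketch of why the swap works, in plain C rather than TCG. The
helper names below are hypothetical and only model the arithmetic: for a
shift count x that is a multiple of 8 in the range 0..56, (-x) & 0x3f
equals (64 - x) & 0x3f, so negating first and masking second makes the
addition of 64 unnecessary.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of "subtract from 64, then mask to 6 bits". */
static uint64_t shift_sub_and(uint64_t rb)
{
    uint64_t x = (rb & 7) << 3;      /* byte offset * 8: 0, 8, ..., 56 */
    return (64 - x) & 0x3f;
}

/* Hypothetical model of "negate, then mask to 6 bits" (no add of 64). */
static uint64_t shift_neg_and(uint64_t rb)
{
    uint64_t x = (rb & 7) << 3;
    return (0 - x) & 0x3f;           /* unsigned negation wraps modulo 2^64 */
}

/* Checks that both models agree for every possible byte offset. */
static void check_shift_models(void)
{
    for (uint64_t rb = 0; rb < 8; rb++) {
        assert(shift_sub_and(rb) == shift_neg_and(rb));
    }
}
```

Both forms agree because 64 is a multiple of the mask width, so adding it
cannot change the low six bits.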

Andreas.
Aurelien Jarno - Sept. 17, 2009, 5:19 p.m.
On Thu, Sep 17, 2009 at 12:07:23PM -0400, Vince Weaver wrote:
> On Wed, 16 Sep 2009, Aurelien Jarno wrote:
> 
> > In case tmp1 = 0, it becomes 64, and then 0 again after the and, so
> > rc=ra<<0.
> 
> Ah, I see.  I completely missed that optimization.
> 
> How does this updated patch look?  I removed one of the TCGv variables 
> too.  Does that help performance?  What would be nice is a tcg 

Yes, it looks OK. Removing one normal TCGv variable doesn't really help.
What helps is removing a tcg temp_local variable or a branch.

> subtract-from instruction, which I know some architectures have.  Maybe 
> tcg does have it and I should look harder.

You can use tcg_gen_subfi_i64 (tcg result, immediate, tcg arg).
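
A sketch in plain C of what the subtract-from form computes, assuming the
op yields immediate minus argument as described above. model_subfi and
exth_shift are hypothetical helper names used only for illustration, not
TCG functions. It also shows the zero case discussed earlier: a byte
offset of 0 gives 64, which the mask turns back into 0.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a subtract-from-immediate op: imm - arg. */
static uint64_t model_subfi(uint64_t imm, uint64_t arg)
{
    return imm - arg;
}

/* Models the shift amount: 64 - (rb & 7) * 8, masked to 6 bits. */
static uint64_t exth_shift(uint64_t rb)
{
    uint64_t tmp = (rb & 7) << 3;
    tmp = model_subfi(64, tmp);   /* 64 when the byte offset is 0 */
    return tmp & 0x3f;            /* 64 wraps back to 0, so rc = ra << 0 */
}

/* Spot checks for the zero case and one non-zero byte offset. */
static void check_exth_shift(void)
{
    assert(exth_shift(0) == 0);
    assert(exth_shift(1) == 56);
}
```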
 
> diff --git a/target-alpha/translate.c b/target-alpha/translate.c
> index 9d2bc45..af2a43c 100644
> --- a/target-alpha/translate.c
> +++ b/target-alpha/translate.c
> @@ -524,14 +524,16 @@ static inline void gen_ext_h(void(*tcg_gen_ext_i64)(TCGv t0, TCGv t1),
>              else
>                  tcg_gen_mov_i64(cpu_ir[rc], cpu_ir[ra]);
>          } else {
> -            TCGv tmp1, tmp2;
> +            TCGv tmp1;
>              tmp1 = tcg_temp_new();
> +
>              tcg_gen_andi_i64(tmp1, cpu_ir[rb], 7);
>              tcg_gen_shli_i64(tmp1, tmp1, 3);
> -            tmp2 = tcg_const_i64(64);
> -            tcg_gen_sub_i64(tmp1, tmp2, tmp1);
> -            tcg_temp_free(tmp2);
> +            tcg_gen_andi_i64(tmp1, tmp1, 0x3f);
> +            tcg_gen_neg_i64(tmp1, tmp1);
> +            tcg_gen_addi_i64(tmp1, tmp1, 64);
>              tcg_gen_shl_i64(cpu_ir[rc], cpu_ir[ra], tmp1);
> +
>              tcg_temp_free(tmp1);
>          }
>          if (tcg_gen_ext_i64)
> @@ -1316,7 +1318,7 @@ static inline int translate_one(DisasContext *ctx, uint32_t insn)
>              break;
>          case 0x6A:
>              /* EXTLH */
> -            gen_ext_h(&tcg_gen_ext16u_i64, ra, rb, rc, islit, lit);
> +            gen_ext_h(&tcg_gen_ext32u_i64, ra, rb, rc, islit, lit);
>              break;
>          case 0x72:
>              /* MSKQH */
>

Patch

diff --git a/target-alpha/translate.c b/target-alpha/translate.c
index 9d2bc45..af2a43c 100644
--- a/target-alpha/translate.c
+++ b/target-alpha/translate.c
@@ -524,14 +524,16 @@  static inline void gen_ext_h(void(*tcg_gen_ext_i64)(TCGv t0, TCGv t1),
             else
                 tcg_gen_mov_i64(cpu_ir[rc], cpu_ir[ra]);
         } else {
-            TCGv tmp1, tmp2;
+            TCGv tmp1;
             tmp1 = tcg_temp_new();
+
             tcg_gen_andi_i64(tmp1, cpu_ir[rb], 7);
             tcg_gen_shli_i64(tmp1, tmp1, 3);
-            tmp2 = tcg_const_i64(64);
-            tcg_gen_sub_i64(tmp1, tmp2, tmp1);
-            tcg_temp_free(tmp2);
+            tcg_gen_andi_i64(tmp1, tmp1, 0x3f);
+            tcg_gen_neg_i64(tmp1, tmp1);
+            tcg_gen_addi_i64(tmp1, tmp1, 64);
             tcg_gen_shl_i64(cpu_ir[rc], cpu_ir[ra], tmp1);
+
             tcg_temp_free(tmp1);
         }
         if (tcg_gen_ext_i64)
@@ -1316,7 +1318,7 @@  static inline int translate_one(DisasContext *ctx, uint32_t insn)
             break;
         case 0x6A:
             /* EXTLH */
-            gen_ext_h(&tcg_gen_ext16u_i64, ra, rb, rc, islit, lit);
+            gen_ext_h(&tcg_gen_ext32u_i64, ra, rb, rc, islit, lit);
             break;
         case 0x72:
             /* MSKQH */