
[v4,20/47] target/ppc: implement vslq

Message ID 20220222143646.1268606-21-matheus.ferst@eldorado.org.br
State New
Series target/ppc: PowerISA Vector/VSX instruction batch

Commit Message

Matheus K. Ferst Feb. 22, 2022, 2:36 p.m. UTC
From: Matheus Ferst <matheus.ferst@eldorado.org.br>

Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
---
v4:
 -  New in v4.
---
 target/ppc/insn32.decode            |  1 +
 target/ppc/translate/vmx-impl.c.inc | 40 +++++++++++++++++++++++++++++
 2 files changed, 41 insertions(+)

Comments

Richard Henderson Feb. 22, 2022, 10:14 p.m. UTC | #1
On 2/22/22 04:36, matheus.ferst@eldorado.org.br wrote:
> +static bool trans_VSLQ(DisasContext *ctx, arg_VX *a)
> +{
> +    TCGv_i64 hi, lo, tmp, n, sf = tcg_constant_i64(64);
> +
> +    REQUIRE_INSNS_FLAGS2(ctx, ISA310);
> +    REQUIRE_VECTOR(ctx);
> +
> +    n = tcg_temp_new_i64();
> +    hi = tcg_temp_new_i64();
> +    lo = tcg_temp_new_i64();
> +    tmp = tcg_const_i64(0);
> +
> +    get_avr64(lo, a->vra, false);
> +    get_avr64(hi, a->vra, true);
> +
> +    get_avr64(n, a->vrb, true);
> +    tcg_gen_andi_i64(n, n, 0x7F);
> +
> +    tcg_gen_movcond_i64(TCG_COND_GE, hi, n, sf, lo, hi);
> +    tcg_gen_movcond_i64(TCG_COND_GE, lo, n, sf, tmp, lo);

Since you have to mask twice anyway, better use (n & 64) != 0.

Otherwise,
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>


r~
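
A note on the suggestion above: because the shift count is masked to 7 bits, comparing it against 64 and testing bit 6 pick out exactly the same cases. The plain-C sketch below (not TCG code; the function names are made up for illustration) checks that exhaustively for every 7-bit count.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Condition as posted in the patch: mask to 7 bits, then compare with 64. */
static bool swap_by_compare(uint64_t raw_count)
{
    uint64_t n = raw_count & 0x7F;
    return n >= 64;
}

/* Condition suggested in review: test bit 6 of the count directly. */
static bool swap_by_bit_test(uint64_t raw_count)
{
    return (raw_count & 64) != 0;
}

/* Confirm that both conditions agree for every possible 7-bit count. */
static void check_conditions_agree(void)
{
    for (uint64_t n = 0; n < 128; n++) {
        assert(swap_by_compare(n) == swap_by_bit_test(n));
    }
}
```

With the bit test, the 0x7F mask is no longer needed for the comparison itself; the count only has to be reduced to its low 6 bits before the 64-bit shifts.
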
Matheus K. Ferst Feb. 23, 2022, 9:53 p.m. UTC | #2
On 22/02/2022 19:14, Richard Henderson wrote:
> On 2/22/22 04:36, matheus.ferst@eldorado.org.br wrote:
>> +    tcg_gen_movcond_i64(TCG_COND_GE, hi, n, sf, lo, hi);
>> +    tcg_gen_movcond_i64(TCG_COND_GE, lo, n, sf, tmp, lo);
> 
> Since you have to mask twice anyway, better use (n & 64) != 0.
> 

Hmm, I'm not sure if I understood. To check != 0 we'll need a temp to 
hold n&64. We could use tmp here, but we'll need another one in patch 
22. Is that right?

Thanks,
Matheus K. Ferst
Instituto de Pesquisas ELDORADO <http://www.eldorado.org.br/>
Software Analyst
Aviso Legal - Disclaimer <https://www.eldorado.org.br/disclaimer.html>
Richard Henderson Feb. 23, 2022, 10:12 p.m. UTC | #3
On 2/23/22 11:53, Matheus K. Ferst wrote:
> On 22/02/2022 19:14, Richard Henderson wrote:
>> On 2/22/22 04:36, matheus.ferst@eldorado.org.br wrote:
>>> +    tcg_gen_movcond_i64(TCG_COND_GE, hi, n, sf, lo, hi);
>>> +    tcg_gen_movcond_i64(TCG_COND_GE, lo, n, sf, tmp, lo);
>>
>> Since you have to mask twice anyway, better use (n & 64) != 0.
>>
> 
> Hmm, I'm not sure if I understood. To check != 0 we'll need a temp to hold n&64. We could 
> use tmp here, but we'll need another one in patch 22. Is that right?

Yes.

r~
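
For readers who want to check the arithmetic, the shift sequence in trans_VSLQ can be modeled in plain C roughly as follows. This is a sketch under assumptions, not QEMU code: the u128_model type and the function names are hypothetical, and the half swap uses the bit test suggested in review rather than the TCG_COND_GE comparison from the posted patch. The two-step right shift mirrors the xori/shr/shri sequence, which avoids a shift by 64 when the count is zero.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a 128-bit vector register split in two halves. */
typedef struct {
    uint64_t hi;
    uint64_t lo;
} u128_model;

/* Model of a 128-bit left shift by (count & 0x7F) using 64-bit operations. */
static u128_model vslq_model(u128_model a, uint64_t count)
{
    uint64_t n = count & 0x7F;
    uint64_t hi = a.hi;
    uint64_t lo = a.lo;
    u128_model r;

    /* Counts of 64 or more: the low half moves up and the low half clears. */
    if (n & 64) {
        hi = lo;
        lo = 0;
    }
    n &= 63;

    r.lo = lo << n;
    /*
     * For n in 0..63, n ^ 63 equals 63 - n, so the two shifts together
     * compute lo >> (64 - n) for n != 0 and yield 0 for n == 0, without
     * ever shifting a 64-bit value by 64.
     */
    r.hi = (hi << n) | ((lo >> (n ^ 63)) >> 1);
    return r;
}

/* Small usage example: shifting the value 1 by 0 and by 64. */
static void vslq_model_examples(void)
{
    u128_model one = { .hi = 0, .lo = 1 };
    u128_model r = vslq_model(one, 0);
    assert(r.hi == 0 && r.lo == 1);
    r = vslq_model(one, 64);
    assert(r.hi == 1 && r.lo == 0);
}
```
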

Patch

diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index 88baebe35e..3799065508 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
@@ -473,6 +473,7 @@  VSLB            000100 ..... ..... ..... 00100000100    @VX
 VSLH            000100 ..... ..... ..... 00101000100    @VX
 VSLW            000100 ..... ..... ..... 00110000100    @VX
 VSLD            000100 ..... ..... ..... 10111000100    @VX
+VSLQ            000100 ..... ..... ..... 00100000101    @VX
 
 VSRB            000100 ..... ..... ..... 01000000100    @VX
 VSRH            000100 ..... ..... ..... 01001000100    @VX
diff --git a/target/ppc/translate/vmx-impl.c.inc b/target/ppc/translate/vmx-impl.c.inc
index ec4f0e7654..ca98a545ef 100644
--- a/target/ppc/translate/vmx-impl.c.inc
+++ b/target/ppc/translate/vmx-impl.c.inc
@@ -834,6 +834,46 @@  TRANS_FLAGS(ALTIVEC, VSRAH, do_vector_gvec3_VX, MO_16, tcg_gen_gvec_sarv);
 TRANS_FLAGS(ALTIVEC, VSRAW, do_vector_gvec3_VX, MO_32, tcg_gen_gvec_sarv);
 TRANS_FLAGS2(ALTIVEC_207, VSRAD, do_vector_gvec3_VX, MO_64, tcg_gen_gvec_sarv);
 
+static bool trans_VSLQ(DisasContext *ctx, arg_VX *a)
+{
+    TCGv_i64 hi, lo, tmp, n, sf = tcg_constant_i64(64);
+
+    REQUIRE_INSNS_FLAGS2(ctx, ISA310);
+    REQUIRE_VECTOR(ctx);
+
+    n = tcg_temp_new_i64();
+    hi = tcg_temp_new_i64();
+    lo = tcg_temp_new_i64();
+    tmp = tcg_const_i64(0);
+
+    get_avr64(lo, a->vra, false);
+    get_avr64(hi, a->vra, true);
+
+    get_avr64(n, a->vrb, true);
+    tcg_gen_andi_i64(n, n, 0x7F);
+
+    tcg_gen_movcond_i64(TCG_COND_GE, hi, n, sf, lo, hi);
+    tcg_gen_movcond_i64(TCG_COND_GE, lo, n, sf, tmp, lo);
+    tcg_gen_andi_i64(n, n, ~64ULL);
+
+    tcg_gen_shl_i64(tmp, lo, n);
+    set_avr64(a->vrt, tmp, false);
+
+    tcg_gen_shl_i64(hi, hi, n);
+    tcg_gen_xori_i64(n, n, 63);
+    tcg_gen_shr_i64(lo, lo, n);
+    tcg_gen_shri_i64(lo, lo, 1);
+    tcg_gen_or_i64(hi, hi, lo);
+    set_avr64(a->vrt, hi, true);
+
+    tcg_temp_free_i64(hi);
+    tcg_temp_free_i64(lo);
+    tcg_temp_free_i64(tmp);
+    tcg_temp_free_i64(n);
+
+    return true;
+}
+
 #define GEN_VXFORM_SAT(NAME, VECE, NORM, SAT, OPC2, OPC3)               \
 static void glue(glue(gen_, NAME), _vec)(unsigned vece, TCGv_vec t,     \
                                          TCGv_vec sat, TCGv_vec a,      \