diff mbox series

[RFC,05/65] target/riscv: remove vsll.vi, vsrl.vi, vsra.vi insns from using gvec

Message ID 20200710104920.13550-6-frank.chang@sifive.com
State New
Headers show
Series target/riscv: support vector extension v0.9 | expand

Commit Message

Frank Chang July 10, 2020, 10:48 a.m. UTC
From: Frank Chang <frank.chang@sifive.com>

vsll.vi, vsrl.vi, vsra.vi cannot use shli gvec as it requires the
shift immediate value to be within the range: [0.. SEW bits].
Otherwise, it will hit the assertion:
tcg_debug_assert(shift >= 0 && shift < (8 << vece));

However, RVV spec does not have such constraint, therefore we have to
use helper functions instead.

Signed-off-by: Frank Chang <frank.chang@sifive.com>
---
 target/riscv/insn_trans/trans_rvv.inc.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

Comments

Richard Henderson July 10, 2020, 4:27 p.m. UTC | #1
On 7/10/20 3:48 AM, frank.chang@sifive.com wrote:
> From: Frank Chang <frank.chang@sifive.com>
> 
> vsll.vi, vsrl.vi, vsra.vi cannot use shli gvec as it requires the
> shift immediate value to be within the range: [0.. SEW bits].
> Otherwise, it will hit the assertion:
> tcg_debug_assert(shift >= 0 && shift < (8 << vece));
> 
> However, RVV spec does not have such constraint, therefore we have to
> use helper functions instead.

Why do you say that?  It does have such a constraint:

# Only the low lg2(SEW) bits are read to obtain the shift amount from a
register value.

While that only talks about the register value, I sincerely doubt that the same
truncation does not actually apply to immediates.

And if the entire immediate value does apply, the manual should certainly
specify what should happen in that case.  And at present it doesn't.

It seems to me the bug is the bare use of GEN_OPIVI_GVEC_TRANS and thence
do_opivi_gvec.  The ZX parameter should be extended to more than just "zero vs
sign-extend", it should have an option for truncating the immediate to s->sew.


r~
Frank Chang July 14, 2020, 2:59 a.m. UTC | #2
On Sat, Jul 11, 2020 at 12:27 AM Richard Henderson <
richard.henderson@linaro.org> wrote:

> On 7/10/20 3:48 AM, frank.chang@sifive.com wrote:
> > From: Frank Chang <frank.chang@sifive.com>
> >
> > vsll.vi, vsrl.vi, vsra.vi cannot use shli gvec as it requires the
> > shift immediate value to be within the range: [0.. SEW bits].
> > Otherwise, it will hit the assertion:
> > tcg_debug_assert(shift >= 0 && shift < (8 << vece));
> >
> > However, RVV spec does not have such constraint, therefore we have to
> > use helper functions instead.
>
> Why do you say that?  It does have such a constraint:
>
> # Only the low lg2(SEW) bits are read to obtain the shift amount from a
> register value.
>
> While that only talks about the register value, I sincerely doubt that the
> same
> truncation does not actually apply to immediates.
>
> And if the entire immediate value does apply, the manual should certainly
> specify what should happen in that case.  And at present it doesn't.
>
> It seems to me the bug is the bare use of GEN_OPIVI_GVEC_TRANS and thence
> do_opivi_gvec.  The ZX parameter should be extended to more than just
> "zero vs
> sign-extend", it should have an option for truncating the immediate to
> s->sew.
>
>
> r~
>

The latest spec specified:

Only the low *lg2(SEW) bits* are read to obtain the shift amount from
a *register
value*.
The *immediate* is treated as an *unsigned shift amount*, with a *maximum
shift amount of 31*.

Looks like the shift amount in the immediate value is not relevant with SEW
setting.
If so, is it better to just use do_opivi_gvec() and implement the logic by
our own rather than using gvec IR?

Frank Chang
LIU Zhiwei July 14, 2020, 3:35 a.m. UTC | #3
On 2020/7/14 10:59, Frank Chang wrote:
> On Sat, Jul 11, 2020 at 12:27 AM Richard Henderson 
> <richard.henderson@linaro.org <mailto:richard.henderson@linaro.org>> 
> wrote:
>
>     On 7/10/20 3:48 AM, frank.chang@sifive.com
>     <mailto:frank.chang@sifive.com> wrote:
>     > From: Frank Chang <frank.chang@sifive.com
>     <mailto:frank.chang@sifive.com>>
>     >
>     > vsll.vi <http://vsll.vi>, vsrl.vi <http://vsrl.vi>, vsra.vi
>     <http://vsra.vi> cannot use shli gvec as it requires the
>     > shift immediate value to be within the range: [0.. SEW bits].
>     > Otherwise, it will hit the assertion:
>     > tcg_debug_assert(shift >= 0 && shift < (8 << vece));
>     >
>     > However, RVV spec does not have such constraint, therefore we
>     have to
>     > use helper functions instead.
>
>     Why do you say that?  It does have such a constraint:
>
>     # Only the low lg2(SEW) bits are read to obtain the shift amount
>     from a
>     register value.
>
>     While that only talks about the register value, I sincerely doubt
>     that the same
>     truncation does not actually apply to immediates.
>
>     And if the entire immediate value does apply, the manual should
>     certainly
>     specify what should happen in that case.  And at present it doesn't.
>
>     It seems to me the bug is the bare use of GEN_OPIVI_GVEC_TRANS and
>     thence
>     do_opivi_gvec.  The ZX parameter should be extended to more than
>     just "zero vs
>     sign-extend", it should have an option for truncating the
>     immediate to s->sew.
>
>
>     r~
>
>
> The latest spec specified:
>
> Only the low *lg2(SEW) bits* are read to obtain the shift amount from 
> a *register value*.
> The *immediate* is treated as an *unsigned shift amount*, with a 
> *maximum shift amount of 31*.
>
> Looks like the shift amount in the immediate value is not relevant 
> with SEW setting.
> If so, is it better to just use do_opivi_gvec() and implement the 
> logic by our own rather than using gvec IR?

In my opinion, it doesn't matter to truncate the immediate to s->sew 
before calling the gvec IR,
whether the constraint of immediate exits or not.

Zhiwei
>
> Frank Chang
Frank Chang July 14, 2020, 4:39 a.m. UTC | #4
On Tue, Jul 14, 2020 at 11:35 AM LIU Zhiwei <zhiwei_liu@c-sky.com> wrote:

>
>
> On 2020/7/14 10:59, Frank Chang wrote:
>
> On Sat, Jul 11, 2020 at 12:27 AM Richard Henderson <
> richard.henderson@linaro.org> wrote:
>
>> On 7/10/20 3:48 AM, frank.chang@sifive.com wrote:
>> > From: Frank Chang <frank.chang@sifive.com>
>> >
>> > vsll.vi, vsrl.vi, vsra.vi cannot use shli gvec as it requires the
>> > shift immediate value to be within the range: [0.. SEW bits].
>> > Otherwise, it will hit the assertion:
>> > tcg_debug_assert(shift >= 0 && shift < (8 << vece));
>> >
>> > However, RVV spec does not have such constraint, therefore we have to
>> > use helper functions instead.
>>
>> Why do you say that?  It does have such a constraint:
>>
>> # Only the low lg2(SEW) bits are read to obtain the shift amount from a
>> register value.
>>
>> While that only talks about the register value, I sincerely doubt that
>> the same
>> truncation does not actually apply to immediates.
>>
>> And if the entire immediate value does apply, the manual should certainly
>> specify what should happen in that case.  And at present it doesn't.
>>
>> It seems to me the bug is the bare use of GEN_OPIVI_GVEC_TRANS and thence
>> do_opivi_gvec.  The ZX parameter should be extended to more than just
>> "zero vs
>> sign-extend", it should have an option for truncating the immediate to
>> s->sew.
>>
>>
>> r~
>>
>
> The latest spec specified:
>
> Only the low *lg2(SEW) bits* are read to obtain the shift amount from a *register
> value*.
> The *immediate* is treated as an *unsigned shift amount*, with a *maximum
> shift amount of 31*.
>
> Looks like the shift amount in the immediate value is not relevant with
> SEW setting.
> If so, is it better to just use do_opivi_gvec() and implement the logic by
> our own rather than using gvec IR?
>
>
> In my opinion, it doesn't matter to truncate the immediate to s->sew
> before calling the gvec IR,
> whether the constraint of immediate exits or not.
>
> Zhiwei
>
>
> Frank Chang
>
>
>
The current issue I've encountered is the test like:

*vsetvli t0,t0,e8,m1,tu,mu,d1*
*vsll.vi <http://vsll.vi> v30, v30, 27*
where the SEW is 8 (i.e. vece = 0), but the immediate value is: 27.
The instruction doesn't violate the requirement specified in spec as its
value is less then 31.
However, it can't pass *tcg_debug_assert(shift >= 0 && shift < (8 <<
vece));* assertion if tcg debug option is enabled.

Frank Chang
Richard Henderson July 14, 2020, 1:21 p.m. UTC | #5
On 7/13/20 7:59 PM, Frank Chang wrote:
> The latest spec specified:
> 
> Only the low *lg2(SEW) bits* are read to obtain the shift amount from a
> *register value*.
> The *immediate* is treated as an *unsigned shift amount*, with a *maximum shift
> amount of 31*.

Which, I hope you will agree is underspecified, and should be reported as a bug
in the manual.

> Looks like the shift amount in the immediate value is not relevant with SEW
> setting.

How can it not be?  It is when the value comes from a register...

> If so, is it better to just use do_opivi_gvec() and implement the logic by our
> own rather than using gvec IR?

No, it is not.  What is the logic you would apply on your own?  There should be
a right answer.

If the answer is that out-of-range shift produces zero, which some
architectures use, then you can look at the immediate value, see that you must
supply zero, and then fill the vector with zeros from translate.  You need not
call a helper to perform N shifts when you know the result a-priori.

If the answer is that shift values are truncated, which riscv uses *everywhere
else*, then you should truncate the immediate value during translate.


r~
Frank Chang July 14, 2020, 1:59 p.m. UTC | #6
On Tue, Jul 14, 2020 at 9:21 PM Richard Henderson <
richard.henderson@linaro.org> wrote:

> On 7/13/20 7:59 PM, Frank Chang wrote:
> > The latest spec specified:
> >
> > Only the low *lg2(SEW) bits* are read to obtain the shift amount from a
> > *register value*.
> > The *immediate* is treated as an *unsigned shift amount*, with a
> *maximum shift
> > amount of 31*.
>
> Which, I hope you will agree is underspecified, and should be reported as
> a bug
> in the manual.
>

Yes, you're correct.
I found out I missed the MASK part in GEN_VEXT_SHIFT_VV() macro,
which this macro is shared between OPIVV and OPIVI format of instructions.
So the same masking methodology should be applied to shift amounts coming
from both register and immediate value.
Spike also does something like:
*vs2 >> (zimm5 & (sew - 1) & 0x1f);* for SEW = 8.

I think spec is kind of misleading to the reader by the way it expresses.


>
> > Looks like the shift amount in the immediate value is not relevant with
> SEW
> > setting.
>
> How can it not be?  It is when the value comes from a register...
>
> > If so, is it better to just use do_opivi_gvec() and implement the logic
> by our
> > own rather than using gvec IR?
>
> No, it is not.  What is the logic you would apply on your own?  There
> should be
> a right answer.
>

I was talking about just calling GEN_OPIVI_TRANS() to generate the helper
functions
defined in vector_helper.c as what I'm doing in the original patch.
But as long as the immediate value should also apply *lg2(SEW) bits*
truncating,
I think I should update GEN_OPIVI_GVEC_TRANS() to utilize gvec IR.


>
> If the answer is that out-of-range shift produces zero, which some
> architectures use, then you can look at the immediate value, see that you
> must
> supply zero, and then fill the vector with zeros from translate.  You need
> not
> call a helper to perform N shifts when you know the result a-priori.
>
> If the answer is that shift values are truncated, which riscv uses
> *everywhere
> else*, then you should truncate the immediate value during translate.
>

I think vsll.vi, vsrl.vi and vsra.vi truncate the out-of-range shift as
other riscv instructions.
I will double confirm that, thanks for the advice.

Frank Chang


>
>
> r~
>
LIU Zhiwei July 15, 2020, 2:52 a.m. UTC | #7
On 2020/7/14 21:59, Frank Chang wrote:
> On Tue, Jul 14, 2020 at 9:21 PM Richard Henderson 
> <richard.henderson@linaro.org <mailto:richard.henderson@linaro.org>> 
> wrote:
>
>     On 7/13/20 7:59 PM, Frank Chang wrote:
>     > The latest spec specified:
>     >
>     > Only the low *lg2(SEW) bits* are read to obtain the shift amount
>     from a
>     > *register value*.
>     > The *immediate* is treated as an *unsigned shift amount*, with a
>     *maximum shift
>     > amount of 31*.
>
>     Which, I hope you will agree is underspecified, and should be
>     reported as a bug
>     in the manual.
>
>
> Yes, you're correct.
> I found out I missed the MASK part in GEN_VEXT_SHIFT_VV() macro,
> which this macro is shared between OPIVV and OPIVI format of instructions.
> So the same masking methodology should be applied to shift amounts 
> coming from both register and immediate value.
> Spike also does something like:
> /vs2 >> (zimm5 & (sew - 1) & 0x1f);/ for SEW = 8.
>
> I think spec is kind of misleading to the reader by the way it expresses.
>
>
>     > Looks like the shift amount in the immediate value is not
>     relevant with SEW
>     > setting.
>
>     How can it not be?  It is when the value comes from a register...
>
>     > If so, is it better to just use do_opivi_gvec() and implement
>     the logic by our
>     > own rather than using gvec IR?
>
>     No, it is not.  What is the logic you would apply on your own? 
>     There should be
>     a right answer.
>
>
> I was talking about just calling GEN_OPIVI_TRANS() to generate the 
> helper functions
> defined in vector_helper.c as what I'm doing in the original patch.
> But as long as the immediate value should also apply *lg2(SEW) bits* 
> truncating,
> I think I should update GEN_OPIVI_GVEC_TRANS() to utilize gvec IR.
>
>
>     If the answer is that out-of-range shift produces zero, which some
>     architectures use, then you can look at the immediate value, see
>     that you must
>     supply zero, and then fill the vector with zeros from translate. 
>     You need not
>     call a helper to perform N shifts when you know the result a-priori.
>
>     If the answer is that shift values are truncated, which riscv uses
>     *everywhere
>     else*, then you should truncate the immediate value during translate.
>
>
> I think vsll.vi <http://vsll.vi>, vsrl.vi <http://vsrl.vi> and vsra.vi 
> <http://vsra.vi> truncate the out-of-range shift as other riscv 
> instructions.
> I will double confirm that, thanks for the advice.
>
Perhaps the reason is that the assembler can't identify if an imm is 
legal according to log(SEW),
because the assembler can't get the SEW.

To make more compliance with assembler directly users'  intuition, just 
let the imm encoding to insn as itself(truncate to 31)
and work as itself.

Zhiwei
> Frank Chang
>
>
>
>     r~
>
diff mbox series

Patch

diff --git a/target/riscv/insn_trans/trans_rvv.inc.c b/target/riscv/insn_trans/trans_rvv.inc.c
index c0b7375927..70d31a5525 100644
--- a/target/riscv/insn_trans/trans_rvv.inc.c
+++ b/target/riscv/insn_trans/trans_rvv.inc.c
@@ -1427,9 +1427,9 @@  GEN_OPIVX_GVEC_SHIFT_TRANS(vsll_vx,  shls)
 GEN_OPIVX_GVEC_SHIFT_TRANS(vsrl_vx,  shrs)
 GEN_OPIVX_GVEC_SHIFT_TRANS(vsra_vx,  sars)
 
-GEN_OPIVI_GVEC_TRANS(vsll_vi, 1, vsll_vx,  shli)
-GEN_OPIVI_GVEC_TRANS(vsrl_vi, 1, vsrl_vx,  shri)
-GEN_OPIVI_GVEC_TRANS(vsra_vi, 1, vsra_vx,  sari)
+GEN_OPIVI_TRANS(vsll_vi, 1, vsll_vx, opivx_check)
+GEN_OPIVI_TRANS(vsrl_vi, 1, vsrl_vx, opivx_check)
+GEN_OPIVI_TRANS(vsra_vi, 1, vsra_vx, opivx_check)
 
 /* Vector Narrowing Integer Right Shift Instructions */
 static bool opivv_narrow_check(DisasContext *s, arg_rmrr *a)