diff mbox series

[RFC] vect: disable multiple calls of poly simdclones

Message ID 91d3f8ee-8b2c-4866-a3ed-beb2953b5438@arm.com
State New
Headers show
Series [RFC] vect: disable multiple calls of poly simdclones | expand

Commit Message

Andre Vieira (lists) Nov. 3, 2023, 7:08 p.m. UTC
Hi,

The current codegen code to support VF's that are multiples of a 
simdclone simdlen rely on BIT_FIELD_REF to create multiple input 
vectors.  This does not work for non-constant simdclones, so we should 
disable using such clones when
the VF is a multiple of the non-constant simdlen until we change the 
codegen to support those.

Enabling SVE simdclone support will cause ICEs if the vectorizer decides 
to use a SVE simdclone with a VF that is larger than the simdlen. I'll 
be away for the next two weeks, so cant' really discuss this further.
I initially tried to solve the problem, but the way 
vectorizable_simd_clone_call is structured doesn't make it easy to 
replace BIT_FIELD_REF with the poly-suitable solution right now of using 
unpack_{hi,lo}. Unfortunately I only found this now as I was adding 
further tests for SVE :(

gcc/ChangeLog:

	* tree-vect-stmts.cc (vectorizable_simd_clone_call): Reject simdclones
	with non-constant simdlen when VF is not exactly the same.

Comments

Richard Biener Nov. 6, 2023, 7:52 a.m. UTC | #1
On Fri, 3 Nov 2023, Andre Vieira (lists) wrote:

> Hi,
> 
> The current codegen code to support VF's that are multiples of a simdclone
> simdlen rely on BIT_FIELD_REF to create multiple input vectors.  This does not
> work for non-constant simdclones, so we should disable using such clones when
> the VF is a multiple of the non-constant simdlen until we change the codegen
> to support those.
> 
> Enabling SVE simdclone support will cause ICEs if the vectorizer decides to
> use a SVE simdclone with a VF that is larger than the simdlen. I'll be away
> for the next two weeks, so cant' really discuss this further.
> I initially tried to solve the problem, but the way
> vectorizable_simd_clone_call is structured doesn't make it easy to replace
> BIT_FIELD_REF with the poly-suitable solution right now of using
> unpack_{hi,lo}.

I think it should be straight-forward to use unpack_{even,odd} (it's
even/odd for VLA, right?  If lo/hi would be possible then doing
BIT_FIELD_REF would be, too?  Also you need to have multiple stages
of unpack/pack when the factor is more than 2).

There's plenty of time even during stage3 to address this.

At least your patch should have come with a testcase (or two).

Is there a bugreport tracking this issue?  It should affect GCN as well
I guess.

Richard.


> Unfortunately I only found this now as I was adding further
> tests for SVE :(
> 
> gcc/ChangeLog:
> 
> 	* tree-vect-stmts.cc (vectorizable_simd_clone_call): Reject simdclones
> 	with non-constant simdlen when VF is not exactly the same.
Andrew Stubbs Nov. 6, 2023, 9:42 a.m. UTC | #2
On 06/11/2023 07:52, Richard Biener wrote:
> On Fri, 3 Nov 2023, Andre Vieira (lists) wrote:
> 
>> Hi,
>>
>> The current codegen code to support VF's that are multiples of a simdclone
>> simdlen rely on BIT_FIELD_REF to create multiple input vectors.  This does not
>> work for non-constant simdclones, so we should disable using such clones when
>> the VF is a multiple of the non-constant simdlen until we change the codegen
>> to support those.
>>
>> Enabling SVE simdclone support will cause ICEs if the vectorizer decides to
>> use a SVE simdclone with a VF that is larger than the simdlen. I'll be away
>> for the next two weeks, so cant' really discuss this further.
>> I initially tried to solve the problem, but the way
>> vectorizable_simd_clone_call is structured doesn't make it easy to replace
>> BIT_FIELD_REF with the poly-suitable solution right now of using
>> unpack_{hi,lo}.
> 
> I think it should be straight-forward to use unpack_{even,odd} (it's
> even/odd for VLA, right?  If lo/hi would be possible then doing
> BIT_FIELD_REF would be, too?  Also you need to have multiple stages
> of unpack/pack when the factor is more than 2).
> 
> There's plenty of time even during stage3 to address this.
> 
> At least your patch should have come with a testcase (or two).
> 
> Is there a bugreport tracking this issue?  It should affect GCN as well
> I guess.

What does "non-constant simdclones" mean? I'm not sure if this is a 
thing that can happen on GCN, or not?

Andrew
Richard Biener Nov. 6, 2023, 9:50 a.m. UTC | #3
On Mon, 6 Nov 2023, Andrew Stubbs wrote:

> 
> 
> On 06/11/2023 07:52, Richard Biener wrote:
> > On Fri, 3 Nov 2023, Andre Vieira (lists) wrote:
> > 
> >> Hi,
> >>
> >> The current codegen code to support VF's that are multiples of a simdclone
> >> simdlen rely on BIT_FIELD_REF to create multiple input vectors.  This does
> >> not
> >> work for non-constant simdclones, so we should disable using such clones
> >> when
> >> the VF is a multiple of the non-constant simdlen until we change the
> >> codegen
> >> to support those.
> >>
> >> Enabling SVE simdclone support will cause ICEs if the vectorizer decides to
> >> use a SVE simdclone with a VF that is larger than the simdlen. I'll be away
> >> for the next two weeks, so cant' really discuss this further.
> >> I initially tried to solve the problem, but the way
> >> vectorizable_simd_clone_call is structured doesn't make it easy to replace
> >> BIT_FIELD_REF with the poly-suitable solution right now of using
> >> unpack_{hi,lo}.
> > 
> > I think it should be straight-forward to use unpack_{even,odd} (it's
> > even/odd for VLA, right?  If lo/hi would be possible then doing
> > BIT_FIELD_REF would be, too?  Also you need to have multiple stages
> > of unpack/pack when the factor is more than 2).
> > 
> > There's plenty of time even during stage3 to address this.
> > 
> > At least your patch should have come with a testcase (or two).
> > 
> > Is there a bugreport tracking this issue?  It should affect GCN as well
> > I guess.
> 
> What does "non-constant simdclones" mean? I'm not sure if this is a thing that
> can happen on GCN, or not?

simdclone with a variable (POLY_INT) vector size.

Richard.
Andre Vieira (lists) Nov. 27, 2023, 4:17 p.m. UTC | #4
On 06/11/2023 07:52, Richard Biener wrote:
> On Fri, 3 Nov 2023, Andre Vieira (lists) wrote:
> 
>> Hi,
>>
>> The current codegen code to support VF's that are multiples of a simdclone
>> simdlen rely on BIT_FIELD_REF to create multiple input vectors.  This does not
>> work for non-constant simdclones, so we should disable using such clones when
>> the VF is a multiple of the non-constant simdlen until we change the codegen
>> to support those.
>>
>> Enabling SVE simdclone support will cause ICEs if the vectorizer decides to
>> use a SVE simdclone with a VF that is larger than the simdlen. I'll be away
>> for the next two weeks, so cant' really discuss this further.
>> I initially tried to solve the problem, but the way
>> vectorizable_simd_clone_call is structured doesn't make it easy to replace
>> BIT_FIELD_REF with the poly-suitable solution right now of using
>> unpack_{hi,lo}.
> 
> I think it should be straight-forward to use unpack_{even,odd} (it's
> even/odd for VLA, right?  If lo/hi would be possible then doing
> BIT_FIELD_REF would be, too?  Also you need to have multiple stages
> of unpack/pack when the factor is more than 2).
> 
> There's plenty of time even during stage3 to address this.
> 
> At least your patch should have come with a testcase (or two).

Yeah I didn't add one as it didn't trigger on AArch64 without my two 
outstanding aarch64 simdclone patches.
> 
> Is there a bugreport tracking this issue?  It should affect GCN as well
> I guess.

No, since I can't trigger them yet on trunk until the reviews on my 
target specific patches are done and they are committed.

I don't have a GCN backend lying around but I suspect GCN doesn't use 
poly simdlen simdclones yet either... I haven't checked. The issue 
triggers for aarch64 when trying to generate SVE simdclones for 
functions with mixed types.  I'll give the unpack thing a go locally.
diff mbox series

Patch

diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index 5f262cae2aae784e3ef4fd07455b7aa742797b51..dc3e0716161838aef66cf37342499006673336d6 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -4165,7 +4165,10 @@  vectorizable_simd_clone_call (vec_info *vinfo, stmt_vec_info stmt_info,
 	if (!constant_multiple_p (vf * group_size, n->simdclone->simdlen,
 				  &num_calls)
 	    || (!n->simdclone->inbranch && (masked_call_offset > 0))
-	    || (nargs != simd_nargs))
+	    || (nargs != simd_nargs)
+	    /* Currently we do not support multiple calls of non-constant
+	       simdlen as poly vectors can not be accessed by BIT_FIELD_REF.  */
+	    || (!n->simdclone->simdlen.is_constant () && num_calls != 1))
 	  continue;
 	if (num_calls != 1)
 	  this_badness += exact_log2 (num_calls) * 4096;