diff mbox series

Fix mixed input kind permute optimization

Message ID 20240521172620.284CB13685@imap1.dmz-prg2.suse.org
State New
Headers show
Series Fix mixed input kind permute optimization | expand

Commit Message

Richard Biener May 21, 2024, 5:26 p.m. UTC
When change_vec_perm_layout runs into a permute combining two
nodes where one is invariant and one internal the partition of
one input can be -1 but the other might not be.  The following
supports this case by simply ignoring inputs with input partiton -1.

I'm not sure this is correct but it avoids ICEing when accessing
that partitions layout for gcc.target/i386/pr98928.c with the
change to avoid splitting store dataref groups during SLP discovery.

Bootstrap and regtest running on x86_64-unknown-linux-gnu (ontop of
the SLP series).  The change can't break anything that's already
broken but I'm not sure this does the right thing - the testcase
has an uniform constant.  I'll try to come up with a better runtime
testcase tomorrow.  Hints as to where to correctly fix such case
appreciated.

	* tree-vect-slp.cc (change_vec_perm_layout): Ignore an
	input partition of -1.
---
 gcc/tree-vect-slp.cc | 2 ++
 1 file changed, 2 insertions(+)

Comments

Richard Sandiford May 21, 2024, 6:42 p.m. UTC | #1
Richard Biener <rguenther@suse.de> writes:
> When change_vec_perm_layout runs into a permute combining two
> nodes where one is invariant and one internal the partition of
> one input can be -1 but the other might not be.  The following
> supports this case by simply ignoring inputs with input partiton -1.
>
> I'm not sure this is correct but it avoids ICEing when accessing
> that partitions layout for gcc.target/i386/pr98928.c with the
> change to avoid splitting store dataref groups during SLP discovery.
>
> Bootstrap and regtest running on x86_64-unknown-linux-gnu (ontop of
> the SLP series).  The change can't break anything that's already
> broken but I'm not sure this does the right thing - the testcase
> has an uniform constant.  I'll try to come up with a better runtime
> testcase tomorrow.  Hints as to where to correctly fix such case
> appreciated.

Famous last words, but yeah, it looks correct to me.  I think the
routine in principle should have a free choice of which layout to
choose for invariants (as long as it's consistent for all queries
about the same node).  So it should just be a question of whether
keeping the original layout is more likely to give a valid
permutation, or whether going with out_layout_i would be better.
I don't have a strong intuition either way.

Thanks,
Richard

>
> 	* tree-vect-slp.cc (change_vec_perm_layout): Ignore an
> 	input partition of -1.
> ---
>  gcc/tree-vect-slp.cc | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/gcc/tree-vect-slp.cc b/gcc/tree-vect-slp.cc
> index 873748b0a72..f6ec1a81c96 100644
> --- a/gcc/tree-vect-slp.cc
> +++ b/gcc/tree-vect-slp.cc
> @@ -4828,6 +4828,8 @@ change_vec_perm_layout (slp_tree node, lane_permutation_t &perm,
>  	{
>  	  slp_tree in_node = SLP_TREE_CHILDREN (node)[entry.first];
>  	  unsigned int in_partition_i = m_vertices[in_node->vertex].partition;
> +	  if (in_partition_i == -1u)
> +	    continue;
>  	  this_in_layout_i = m_partitions[in_partition_i].layout;
>  	}
>        if (this_in_layout_i > 0)
Richard Sandiford May 22, 2024, 9:28 a.m. UTC | #2
Richard Sandiford <richard.sandiford@arm.com> writes:
> Richard Biener <rguenther@suse.de> writes:
>> When change_vec_perm_layout runs into a permute combining two
>> nodes where one is invariant and one internal the partition of
>> one input can be -1 but the other might not be.  The following
>> supports this case by simply ignoring inputs with input partiton -1.
>>
>> I'm not sure this is correct but it avoids ICEing when accessing
>> that partitions layout for gcc.target/i386/pr98928.c with the
>> change to avoid splitting store dataref groups during SLP discovery.
>>
>> Bootstrap and regtest running on x86_64-unknown-linux-gnu (ontop of
>> the SLP series).  The change can't break anything that's already
>> broken but I'm not sure this does the right thing - the testcase
>> has an uniform constant.  I'll try to come up with a better runtime
>> testcase tomorrow.  Hints as to where to correctly fix such case
>> appreciated.
>
> Famous last words, but yeah, it looks correct to me.  I think the
> routine in principle should have a free choice of which layout to
> choose for invariants (as long as it's consistent for all queries
> about the same node).  So it should just be a question of whether
> keeping the original layout is more likely to give a valid
> permutation, or whether going with out_layout_i would be better.
> I don't have a strong intuition either way.

BTW, I should have said that using a different layout from 0
would require compensating code in the materialize function.
So this is definitely the simplest and most direct fix.

Thanks,
Richard
Richard Biener May 22, 2024, 9:38 a.m. UTC | #3
On Wed, 22 May 2024, Richard Sandiford wrote:

> Richard Sandiford <richard.sandiford@arm.com> writes:
> > Richard Biener <rguenther@suse.de> writes:
> >> When change_vec_perm_layout runs into a permute combining two
> >> nodes where one is invariant and one internal the partition of
> >> one input can be -1 but the other might not be.  The following
> >> supports this case by simply ignoring inputs with input partiton -1.
> >>
> >> I'm not sure this is correct but it avoids ICEing when accessing
> >> that partitions layout for gcc.target/i386/pr98928.c with the
> >> change to avoid splitting store dataref groups during SLP discovery.
> >>
> >> Bootstrap and regtest running on x86_64-unknown-linux-gnu (ontop of
> >> the SLP series).  The change can't break anything that's already
> >> broken but I'm not sure this does the right thing - the testcase
> >> has an uniform constant.  I'll try to come up with a better runtime
> >> testcase tomorrow.  Hints as to where to correctly fix such case
> >> appreciated.
> >
> > Famous last words, but yeah, it looks correct to me.  I think the
> > routine in principle should have a free choice of which layout to
> > choose for invariants (as long as it's consistent for all queries
> > about the same node).  So it should just be a question of whether
> > keeping the original layout is more likely to give a valid
> > permutation, or whether going with out_layout_i would be better.
> > I don't have a strong intuition either way.
> 
> BTW, I should have said that using a different layout from 0
> would require compensating code in the materialize function.
> So this is definitely the simplest and most direct fix.

Yeah, I guess we can improve on that later.  I'm going to push the
change after lunch together with the other two fixes - the ARM CI
discovered its share of testsuite fallout for the actual change
I'm going to look at.

Richard.
diff mbox series

Patch

diff --git a/gcc/tree-vect-slp.cc b/gcc/tree-vect-slp.cc
index 873748b0a72..f6ec1a81c96 100644
--- a/gcc/tree-vect-slp.cc
+++ b/gcc/tree-vect-slp.cc
@@ -4828,6 +4828,8 @@  change_vec_perm_layout (slp_tree node, lane_permutation_t &perm,
 	{
 	  slp_tree in_node = SLP_TREE_CHILDREN (node)[entry.first];
 	  unsigned int in_partition_i = m_vertices[in_node->vertex].partition;
+	  if (in_partition_i == -1u)
+	    continue;
 	  this_in_layout_i = m_partitions[in_partition_i].layout;
 	}
       if (this_in_layout_i > 0)