Message ID | 20240521172620.284CB13685@imap1.dmz-prg2.suse.org |
---|---|
State | New |
Headers | show |
Series | Fix mixed input kind permute optimization | expand |
Richard Biener <rguenther@suse.de> writes: > When change_vec_perm_layout runs into a permute combining two > nodes where one is invariant and one internal the partition of > one input can be -1 but the other might not be. The following > supports this case by simply ignoring inputs with input partiton -1. > > I'm not sure this is correct but it avoids ICEing when accessing > that partitions layout for gcc.target/i386/pr98928.c with the > change to avoid splitting store dataref groups during SLP discovery. > > Bootstrap and regtest running on x86_64-unknown-linux-gnu (ontop of > the SLP series). The change can't break anything that's already > broken but I'm not sure this does the right thing - the testcase > has an uniform constant. I'll try to come up with a better runtime > testcase tomorrow. Hints as to where to correctly fix such case > appreciated. Famous last words, but yeah, it looks correct to me. I think the routine in principle should have a free choice of which layout to choose for invariants (as long as it's consistent for all queries about the same node). So it should just be a question of whether keeping the original layout is more likely to give a valid permutation, or whether going with out_layout_i would be better. I don't have a strong intuition either way. Thanks, Richard > > * tree-vect-slp.cc (change_vec_perm_layout): Ignore an > input partition of -1. > --- > gcc/tree-vect-slp.cc | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/gcc/tree-vect-slp.cc b/gcc/tree-vect-slp.cc > index 873748b0a72..f6ec1a81c96 100644 > --- a/gcc/tree-vect-slp.cc > +++ b/gcc/tree-vect-slp.cc > @@ -4828,6 +4828,8 @@ change_vec_perm_layout (slp_tree node, lane_permutation_t &perm, > { > slp_tree in_node = SLP_TREE_CHILDREN (node)[entry.first]; > unsigned int in_partition_i = m_vertices[in_node->vertex].partition; > + if (in_partition_i == -1u) > + continue; > this_in_layout_i = m_partitions[in_partition_i].layout; > } > if (this_in_layout_i > 0)
Richard Sandiford <richard.sandiford@arm.com> writes: > Richard Biener <rguenther@suse.de> writes: >> When change_vec_perm_layout runs into a permute combining two >> nodes where one is invariant and one internal the partition of >> one input can be -1 but the other might not be. The following >> supports this case by simply ignoring inputs with input partiton -1. >> >> I'm not sure this is correct but it avoids ICEing when accessing >> that partitions layout for gcc.target/i386/pr98928.c with the >> change to avoid splitting store dataref groups during SLP discovery. >> >> Bootstrap and regtest running on x86_64-unknown-linux-gnu (ontop of >> the SLP series). The change can't break anything that's already >> broken but I'm not sure this does the right thing - the testcase >> has an uniform constant. I'll try to come up with a better runtime >> testcase tomorrow. Hints as to where to correctly fix such case >> appreciated. > > Famous last words, but yeah, it looks correct to me. I think the > routine in principle should have a free choice of which layout to > choose for invariants (as long as it's consistent for all queries > about the same node). So it should just be a question of whether > keeping the original layout is more likely to give a valid > permutation, or whether going with out_layout_i would be better. > I don't have a strong intuition either way. BTW, I should have said that using a different layout from 0 would require compensating code in the materialize function. So this is definitely the simplest and most direct fix. Thanks, Richard
On Wed, 22 May 2024, Richard Sandiford wrote: > Richard Sandiford <richard.sandiford@arm.com> writes: > > Richard Biener <rguenther@suse.de> writes: > >> When change_vec_perm_layout runs into a permute combining two > >> nodes where one is invariant and one internal the partition of > >> one input can be -1 but the other might not be. The following > >> supports this case by simply ignoring inputs with input partiton -1. > >> > >> I'm not sure this is correct but it avoids ICEing when accessing > >> that partitions layout for gcc.target/i386/pr98928.c with the > >> change to avoid splitting store dataref groups during SLP discovery. > >> > >> Bootstrap and regtest running on x86_64-unknown-linux-gnu (ontop of > >> the SLP series). The change can't break anything that's already > >> broken but I'm not sure this does the right thing - the testcase > >> has an uniform constant. I'll try to come up with a better runtime > >> testcase tomorrow. Hints as to where to correctly fix such case > >> appreciated. > > > > Famous last words, but yeah, it looks correct to me. I think the > > routine in principle should have a free choice of which layout to > > choose for invariants (as long as it's consistent for all queries > > about the same node). So it should just be a question of whether > > keeping the original layout is more likely to give a valid > > permutation, or whether going with out_layout_i would be better. > > I don't have a strong intuition either way. > > BTW, I should have said that using a different layout from 0 > would require compensating code in the materialize function. > So this is definitely the simplest and most direct fix. Yeah, I guess we can improve on that later. I'm going to push the change after lunch together with the other two fixes - the ARM CI discovered its share of testsuite fallout for the actual change I'm going to look at. Richard.
diff --git a/gcc/tree-vect-slp.cc b/gcc/tree-vect-slp.cc index 873748b0a72..f6ec1a81c96 100644 --- a/gcc/tree-vect-slp.cc +++ b/gcc/tree-vect-slp.cc @@ -4828,6 +4828,8 @@ change_vec_perm_layout (slp_tree node, lane_permutation_t &perm, { slp_tree in_node = SLP_TREE_CHILDREN (node)[entry.first]; unsigned int in_partition_i = m_vertices[in_node->vertex].partition; + if (in_partition_i == -1u) + continue; this_in_layout_i = m_partitions[in_partition_i].layout; } if (this_in_layout_i > 0)