Message ID | r8po9665-r9r7-3r5-n086-297r819155nr@fhfr.qr |
---|---|
State | New |
Headers | show |
Series | tree-optimization/102139 - fix SLP DR base alignment | expand |
On Tue, Aug 31, 2021 at 11:26 AM Richard Biener via Gcc-patches <gcc-patches@gcc.gnu.org> wrote: > > When doing whole-function SLP we have to make sure the recorded > base alignments we compute as the maximum alignment seen for a > base anywhere in the function is actually valid at the point > we want to make use of it. > > To make this work we now record the stmt the alignment was derived > from in addition to the DRs innermost behavior and we use a > dominance check to verify the recorded info is valid when doing > BB vectorization. > > Note this leaves a small(?) hole for the case where we have sth > like > > unaligned DR > call (); // does not return > aligned DR > > since we'll derive an aligned access for the earlier DR but the > later DR is never actually reached since the call does not > return. To plug this hole one option (for the easy backporting) > would be to simply not use the base-alignment recording at all. > Alternatively we'd have to store the dataref grouping 'id' somewhere > in the DR itself and use that to handle this particular case. It turns out this isn't too difficult so the following is a patch adjusted to cover that case together with a testcase. Bootstrapped and tested on x86_64-unknown-linux-gnu. OK? Thanks, Richard. 2021-08-31 Richard Biener <rguenther@suse.de> PR tree-optimization/102139 * tree-vectorizer.h (vec_base_alignments): Adjust hash-map type to record a std::pair of the stmt-info and the innermost loop behavior. (dr_vec_info::group): New member. * tree-vect-data-refs.c (vect_record_base_alignment): Adjust. (vect_compute_data_ref_alignment): Verify the recorded base alignment can be used. (data_ref_pair): Remove. (dr_group_sort_cmp): Adjust. (vect_analyze_data_ref_accesses): Store the group-ID in the dr_vec_info and operate on a vector of dr_vec_infos. * gcc.dg/torture/pr102139.c: New testcase.
Richard Biener <richard.guenther@gmail.com> writes: > On Tue, Aug 31, 2021 at 11:26 AM Richard Biener via Gcc-patches > <gcc-patches@gcc.gnu.org> wrote: >> >> When doing whole-function SLP we have to make sure the recorded >> base alignments we compute as the maximum alignment seen for a >> base anywhere in the function is actually valid at the point >> we want to make use of it. Ah, yeah, the danger of optimisations that silently rely on the then-current restrictions :-( >> To make this work we now record the stmt the alignment was derived >> from in addition to the DRs innermost behavior and we use a >> dominance check to verify the recorded info is valid when doing >> BB vectorization. >> >> Note this leaves a small(?) hole for the case where we have sth >> like >> >> unaligned DR >> call (); // does not return >> aligned DR >> >> since we'll derive an aligned access for the earlier DR but the >> later DR is never actually reached since the call does not >> return. To plug this hole one option (for the easy backporting) >> would be to simply not use the base-alignment recording at all. >> Alternatively we'd have to store the dataref grouping 'id' somewhere >> in the DR itself and use that to handle this particular case. > > It turns out this isn't too difficult so the following is a patch adjusted > to cover that case together with a testcase. > > Bootstrapped and tested on x86_64-unknown-linux-gnu. > > OK? TBH I know nothing about this group id scheme, so I'm not really in a position to comment. But it LGTM from the (few) bits I do understand. I guess we're leaving the same easter egg for loop optimisation if we ever support early exits, but I'm not sure what to do about that. Thanks, Richard > > Thanks, > Richard. > > 2021-08-31 Richard Biener <rguenther@suse.de> > > PR tree-optimization/102139 > * tree-vectorizer.h (vec_base_alignments): Adjust hash-map > type to record a std::pair of the stmt-info and the innermost > loop behavior. > (dr_vec_info::group): New member. > * tree-vect-data-refs.c (vect_record_base_alignment): Adjust. > (vect_compute_data_ref_alignment): Verify the recorded > base alignment can be used. > (data_ref_pair): Remove. > (dr_group_sort_cmp): Adjust. > (vect_analyze_data_ref_accesses): Store the group-ID in the > dr_vec_info and operate on a vector of dr_vec_infos. > > * gcc.dg/torture/pr102139.c: New testcase.
On Wed, 1 Sep 2021, Richard Sandiford wrote: > Richard Biener <richard.guenther@gmail.com> writes: > > On Tue, Aug 31, 2021 at 11:26 AM Richard Biener via Gcc-patches > > <gcc-patches@gcc.gnu.org> wrote: > >> > >> When doing whole-function SLP we have to make sure the recorded > >> base alignments we compute as the maximum alignment seen for a > >> base anywhere in the function is actually valid at the point > >> we want to make use of it. > > Ah, yeah, the danger of optimisations that silently rely on the > then-current restrictions :-( Yeah. > >> To make this work we now record the stmt the alignment was derived > >> from in addition to the DRs innermost behavior and we use a > >> dominance check to verify the recorded info is valid when doing > >> BB vectorization. > >> > >> Note this leaves a small(?) hole for the case where we have sth > >> like > >> > >> unaligned DR > >> call (); // does not return > >> aligned DR > >> > >> since we'll derive an aligned access for the earlier DR but the > >> later DR is never actually reached since the call does not > >> return. To plug this hole one option (for the easy backporting) > >> would be to simply not use the base-alignment recording at all. > >> Alternatively we'd have to store the dataref grouping 'id' somewhere > >> in the DR itself and use that to handle this particular case. > > > > It turns out this isn't too difficult so the following is a patch adjusted > > to cover that case together with a testcase. > > > > Bootstrapped and tested on x86_64-unknown-linux-gnu. > > > > OK? > > TBH I know nothing about this group id scheme, so I'm not really > in a position to comment. But it LGTM from the (few) bits I do understand. > > I guess we're leaving the same easter egg for loop optimisation if > we ever support early exits, but I'm not sure what to do about that. We're currently not recording alignment from masked DRs (aka DR_IS_CONDITIONAL_IN_STMT), I suppose we'd need to mark all stmts after early exits in this way then since in the end they will have to be masked on the early exit. So we _might_ be fine there ... Pushed. Thanks, Richard. > Thanks, > Richard > > > > > Thanks, > > Richard. > > > > 2021-08-31 Richard Biener <rguenther@suse.de> > > > > PR tree-optimization/102139 > > * tree-vectorizer.h (vec_base_alignments): Adjust hash-map > > type to record a std::pair of the stmt-info and the innermost > > loop behavior. > > (dr_vec_info::group): New member. > > * tree-vect-data-refs.c (vect_record_base_alignment): Adjust. > > (vect_compute_data_ref_alignment): Verify the recorded > > base alignment can be used. > > (data_ref_pair): Remove. > > (dr_group_sort_cmp): Adjust. > > (vect_analyze_data_ref_accesses): Store the group-ID in the > > dr_vec_info and operate on a vector of dr_vec_infos. > > > > * gcc.dg/torture/pr102139.c: New testcase. >
Richard Biener <rguenther@suse.de> writes: > On Wed, 1 Sep 2021, Richard Sandiford wrote: > >> Richard Biener <richard.guenther@gmail.com> writes: >> > On Tue, Aug 31, 2021 at 11:26 AM Richard Biener via Gcc-patches >> > <gcc-patches@gcc.gnu.org> wrote: >> >> >> >> When doing whole-function SLP we have to make sure the recorded >> >> base alignments we compute as the maximum alignment seen for a >> >> base anywhere in the function is actually valid at the point >> >> we want to make use of it. >> >> Ah, yeah, the danger of optimisations that silently rely on the >> then-current restrictions :-( > > Yeah. > >> >> To make this work we now record the stmt the alignment was derived >> >> from in addition to the DRs innermost behavior and we use a >> >> dominance check to verify the recorded info is valid when doing >> >> BB vectorization. >> >> >> >> Note this leaves a small(?) hole for the case where we have sth >> >> like >> >> >> >> unaligned DR >> >> call (); // does not return >> >> aligned DR >> >> >> >> since we'll derive an aligned access for the earlier DR but the >> >> later DR is never actually reached since the call does not >> >> return. To plug this hole one option (for the easy backporting) >> >> would be to simply not use the base-alignment recording at all. >> >> Alternatively we'd have to store the dataref grouping 'id' somewhere >> >> in the DR itself and use that to handle this particular case. >> > >> > It turns out this isn't too difficult so the following is a patch adjusted >> > to cover that case together with a testcase. >> > >> > Bootstrapped and tested on x86_64-unknown-linux-gnu. >> > >> > OK? >> >> TBH I know nothing about this group id scheme, so I'm not really >> in a position to comment. But it LGTM from the (few) bits I do understand. >> >> I guess we're leaving the same easter egg for loop optimisation if >> we ever support early exits, but I'm not sure what to do about that. > > We're currently not recording alignment from masked DRs > (aka DR_IS_CONDITIONAL_IN_STMT), I suppose we'd need to mark > all stmts after early exits in this way then since in the end > they will have to be masked on the early exit. > > So we _might_ be fine there ... Yeah, for a pure SVE-like implementation that's probably true. But we also have the option of vectorising an early exit by branching if the condition is true for *any* element, then handling the remaining elements with an epilogue. It would be quite a different approach from what we're doing now, and wouldn't necessarily require up-front if-conversion. But the point is that it's theoretically possible, just as whole-function SLP was theoretically possible (but seemingly some way off) when this code was written :-) Thanks, Richard
diff --git a/gcc/tree-vect-data-refs.c b/gcc/tree-vect-data-refs.c index 37f46d1aaa3..e2549811961 100644 --- a/gcc/tree-vect-data-refs.c +++ b/gcc/tree-vect-data-refs.c @@ -895,11 +895,11 @@ vect_record_base_alignment (vec_info *vinfo, stmt_vec_info stmt_info, innermost_loop_behavior *drb) { bool existed; - innermost_loop_behavior *&entry + std::pair<gimple *, innermost_loop_behavior *> &entry = vinfo->base_alignments.get_or_insert (drb->base_address, &existed); - if (!existed || entry->base_alignment < drb->base_alignment) + if (!existed || entry.second->base_alignment < drb->base_alignment) { - entry = drb; + entry = std::make_pair (stmt_info->stmt, drb); if (dump_enabled_p ()) dump_printf_loc (MSG_NOTE, vect_location, "recording new base alignment for %T\n" @@ -1060,11 +1060,16 @@ vect_compute_data_ref_alignment (vec_info *vinfo, dr_vec_info *dr_info) /* Calculate the maximum of the pooled base address alignment and the alignment that we can compute for DR itself. */ - innermost_loop_behavior **entry = base_alignments->get (drb->base_address); - if (entry && base_alignment < (*entry)->base_alignment) + std::pair<gimple *, innermost_loop_behavior *> *entry + = base_alignments->get (drb->base_address); + if (entry + && base_alignment < (*entry).second->base_alignment + && (loop_vinfo + || dominated_by_p (CDI_DOMINATORS, gimple_bb (stmt_info->stmt), + gimple_bb (entry->first)))) { - base_alignment = (*entry)->base_alignment; - base_misalignment = (*entry)->base_misalignment; + base_alignment = entry->second->base_alignment; + base_misalignment = entry->second->base_misalignment; } if (drb->offset_alignment < vect_align_c diff --git a/gcc/tree-vectorizer.h b/gcc/tree-vectorizer.h index 72e018e8eac..8db642c7dc3 100644 --- a/gcc/tree-vectorizer.h +++ b/gcc/tree-vectorizer.h @@ -106,10 +106,11 @@ struct stmt_info_for_cost { typedef vec<stmt_info_for_cost> stmt_vector_for_cost; -/* Maps base addresses to an innermost_loop_behavior that gives the maximum - known alignment for that base. */ +/* Maps base addresses to an innermost_loop_behavior and the stmt it was + derived from that gives the maximum known alignment for that base. */ typedef hash_map<tree_operand_hash, - innermost_loop_behavior *> vec_base_alignments; + std::pair<gimple *, innermost_loop_behavior *> > + vec_base_alignments; /************************************************************************ SLP