Message ID | nycvar.YFH.7.76.1911200953150.5566@zhemvz.fhfr.qr |
---|---|
State | New |
Headers | show |
Series | Fix PR92537 | expand |
Richard Biener <rguenther@suse.de> writes: > There's a bit left in this PR which the following fixes. The root > only became external late and the check more naturally belongs to > the new place. > > Boostrap and regtest running on x86_64-unknown-linux-gnu. FWIW, I was just about to post the patch below before seeing your message. :-) Thought I might as well post it anyway just in case. I guess the slight advantage to keeping the vect_analyze_slp_instance test is that we can still free child nodes at that point, so it might be more efficient to catch it "early". It probably doesn't make much differnce in practice though. Tested on aarch64-linux-gnu. Thanks, Richard 2019-11-20 Richard Sandiford <richard.sandiford@arm.com> gcc/ PR tree-optimization/92537 * tree-vect-slp.c (vect_slp_analyze_node_operations): Don't allow the root node to be external. gcc/testsuite/ PR tree-optimization/92537 * gcc.dg/vect/pr92537.c: New test. * gfortran.dg/vect/pr92537.f90: Likewise. Index: gcc/tree-vect-slp.c =================================================================== --- gcc/tree-vect-slp.c 2019-11-18 15:21:12.919993766 +0000 +++ gcc/tree-vect-slp.c 2019-11-20 09:48:57.145101295 +0000 @@ -2848,7 +2848,7 @@ vect_slp_analyze_node_operations (vec_in slp_tree child; if (SLP_TREE_DEF_TYPE (node) != vect_internal_def) - return true; + return node != SLP_INSTANCE_TREE (node_instance); /* If we already analyzed the exact same set of scalar stmts we're done. We share the generated vector stmts for those. Index: gcc/testsuite/gcc.dg/vect/pr92537.c =================================================================== --- /dev/null 2019-09-17 11:41:18.176664108 +0100 +++ gcc/testsuite/gcc.dg/vect/pr92537.c 2019-11-20 09:48:57.145101295 +0000 @@ -0,0 +1,16 @@ +/* { dg-do compile } */ +/* { dg-additional-options "-O3" } */ + +int a[64]; +int b, c, e; +short d; +short f[64]; +void g() { + b = 0; + c = d >> 3; + for (; b < 64 - 1; b++) { + e = 0; + e -= a[b] * c; + f[b] = e; + } +} Index: gcc/testsuite/gfortran.dg/vect/pr92537.f90 =================================================================== --- /dev/null 2019-09-17 11:41:18.176664108 +0100 +++ gcc/testsuite/gfortran.dg/vect/pr92537.f90 2019-11-20 09:48:57.145101295 +0000 @@ -0,0 +1,32 @@ +! { dg-do compile } */ +! { dg-additional-options "-fno-inline -march=skylake" } + +MODULE pr93527 + implicit none + integer, parameter :: wp = kind (1.d0) + interface p_min + module procedure p_min_wp + end interface +contains + subroutine foo (pr) + real(wp), pointer :: pr(:) + integer :: nzd + real(wp) :: pmin + real(wp) :: pmin_diag + integer :: i + nzd = 15 + allocate (pr(nzd)) + pmin_diag = 4000._wp + pmin = p_min(pmin_diag) + pmin = min (pmin,pmin_diag) + pr(1) = log(pmin) + do i=1,nzd-1 + pr(i+1) = log(pmin) + i + end do + end subroutine foo + function p_min_wp (x) result (p_min) + real(wp), intent(in) :: x + real(wp) :: p_min + p_min = x + end function p_min_wp +end module pr93527
On Wed, 20 Nov 2019, Richard Sandiford wrote: > Richard Biener <rguenther@suse.de> writes: > > There's a bit left in this PR which the following fixes. The root > > only became external late and the check more naturally belongs to > > the new place. > > > > Boostrap and regtest running on x86_64-unknown-linux-gnu. > > FWIW, I was just about to post the patch below before seeing your > message. :-) Thought I might as well post it anyway just in case. I was thinking whether it makes sense to have external root nodes and they probably can only appear with the CTORs right now (otherwise they'll be stores). So possibly generalizing the check like you do makes sense, still the check belongs in vect_slp_analyze_operations ;) > I guess the slight advantage to keeping the vect_analyze_slp_instance > test is that we can still free child nodes at that point, so it might > be more efficient to catch it "early". It probably doesn't make much > differnce in practice though. But if the root is external there are no child nodes (well, besides that root). But yeah, having the check twice looked odd to me. > Tested on aarch64-linux-gnu. I've installed my variant before seeing you mail, so... I also have a prototype for finding "random" roots for SLP vectorization (just a bunch of same stmts) which runs into the same issue (but also very many others ;)) Thanks, Richard. > > Thanks, > Richard > > > 2019-11-20 Richard Sandiford <richard.sandiford@arm.com> > > gcc/ > PR tree-optimization/92537 > * tree-vect-slp.c (vect_slp_analyze_node_operations): Don't > allow the root node to be external. > > gcc/testsuite/ > PR tree-optimization/92537 > * gcc.dg/vect/pr92537.c: New test. > * gfortran.dg/vect/pr92537.f90: Likewise. > > Index: gcc/tree-vect-slp.c > =================================================================== > --- gcc/tree-vect-slp.c 2019-11-18 15:21:12.919993766 +0000 > +++ gcc/tree-vect-slp.c 2019-11-20 09:48:57.145101295 +0000 > @@ -2848,7 +2848,7 @@ vect_slp_analyze_node_operations (vec_in > slp_tree child; > > if (SLP_TREE_DEF_TYPE (node) != vect_internal_def) > - return true; > + return node != SLP_INSTANCE_TREE (node_instance); > > /* If we already analyzed the exact same set of scalar stmts we're done. > We share the generated vector stmts for those. > Index: gcc/testsuite/gcc.dg/vect/pr92537.c > =================================================================== > --- /dev/null 2019-09-17 11:41:18.176664108 +0100 > +++ gcc/testsuite/gcc.dg/vect/pr92537.c 2019-11-20 09:48:57.145101295 +0000 > @@ -0,0 +1,16 @@ > +/* { dg-do compile } */ > +/* { dg-additional-options "-O3" } */ > + > +int a[64]; > +int b, c, e; > +short d; > +short f[64]; > +void g() { > + b = 0; > + c = d >> 3; > + for (; b < 64 - 1; b++) { > + e = 0; > + e -= a[b] * c; > + f[b] = e; > + } > +} > Index: gcc/testsuite/gfortran.dg/vect/pr92537.f90 > =================================================================== > --- /dev/null 2019-09-17 11:41:18.176664108 +0100 > +++ gcc/testsuite/gfortran.dg/vect/pr92537.f90 2019-11-20 09:48:57.145101295 +0000 > @@ -0,0 +1,32 @@ > +! { dg-do compile } */ > +! { dg-additional-options "-fno-inline -march=skylake" } > + > +MODULE pr93527 > + implicit none > + integer, parameter :: wp = kind (1.d0) > + interface p_min > + module procedure p_min_wp > + end interface > +contains > + subroutine foo (pr) > + real(wp), pointer :: pr(:) > + integer :: nzd > + real(wp) :: pmin > + real(wp) :: pmin_diag > + integer :: i > + nzd = 15 > + allocate (pr(nzd)) > + pmin_diag = 4000._wp > + pmin = p_min(pmin_diag) > + pmin = min (pmin,pmin_diag) > + pr(1) = log(pmin) > + do i=1,nzd-1 > + pr(i+1) = log(pmin) + i > + end do > + end subroutine foo > + function p_min_wp (x) result (p_min) > + real(wp), intent(in) :: x > + real(wp) :: p_min > + p_min = x > + end function p_min_wp > +end module pr93527 >
Index: gcc/tree-vect-slp.c =================================================================== --- gcc/tree-vect-slp.c (revision 278477) +++ gcc/tree-vect-slp.c (working copy) @@ -2253,18 +2264,6 @@ vect_analyze_slp_instance (vec_info *vin matches[group_size / const_max_nunits * const_max_nunits] = false; vect_free_slp_tree (node, false); } - else if (constructor - && SLP_TREE_DEF_TYPE (node) != vect_internal_def) - { - /* CONSTRUCTOR vectorization relies on a vector stmt being - generated, that doesn't work for fully external ones. */ - if (dump_enabled_p ()) - dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location, - "Build SLP failed: CONSTRUCTOR of external " - "or constant elements\n"); - vect_free_slp_tree (node, false); - return false; - } else { /* Create a new SLP instance. */ @@ -2939,7 +2939,12 @@ vect_slp_analyze_operations (vec_info *v if (!vect_slp_analyze_node_operations (vinfo, SLP_INSTANCE_TREE (instance), instance, visited, lvisited, - &cost_vec)) + &cost_vec) + /* Instances with a root stmt require vectorized defs for the + SLP tree root. */ + || (SLP_INSTANCE_ROOT_STMT (instance) + && (SLP_TREE_DEF_TYPE (SLP_INSTANCE_TREE (instance)) + != vect_internal_def))) { slp_tree node = SLP_INSTANCE_TREE (instance); stmt_vec_info stmt_info = SLP_TREE_SCALAR_STMTS (node)[0]; Index: gcc/testsuite/gfortran.dg/pr92537.f90 =================================================================== --- gcc/testsuite/gfortran.dg/pr92537.f90 (nonexistent) +++ gcc/testsuite/gfortran.dg/pr92537.f90 (working copy) @@ -0,0 +1,32 @@ +! { dg-do compile } +! { dg-options "-O2 -ftree-vectorize -fno-inline" } +! { dg-additional-options "-march=skylake" { target x86_64-*-* i?86-*-* } } +MODULE pr93527 + implicit none + integer, parameter :: wp = kind (1.d0) + interface p_min + module procedure p_min_wp + end interface +contains + subroutine foo (pr) + real(wp), pointer :: pr(:) + integer :: nzd + real(wp) :: pmin + real(wp) :: pmin_diag + integer :: i + nzd = 15 + allocate (pr(nzd)) + pmin_diag = 4000._wp + pmin = p_min(pmin_diag) + pmin = min (pmin,pmin_diag) + pr(1) = log(pmin) + do i=1,nzd-1 + pr(i+1) = log(pmin) + i + end do + end subroutine foo + function p_min_wp (x) result (p_min) + real(wp), intent(in) :: x + real(wp) :: p_min + p_min = x + end function p_min_wp +end module pr93527