[3/4] Don't vectorise single-iteration epilogues

Message ID mptmudbvexc.fsf@arm.com
State New
Series Vector epilogues vs. mixed vector sizes

Commit Message

Richard Sandiford Nov. 4, 2019, 3:28 p.m. UTC
With a later patch I saw a case in which we peeled a single iteration
for gaps but didn't need to peel further iterations to make up a full
vector.  We then tried to vectorise the single-iteration epilogue.


2019-11-04  Richard Sandiford  <richard.sandiford@arm.com>

gcc/
	* tree-vect-loop.c (vect_analyze_loop): Only try to vectorize
	the epilogue if there are peeled iterations for it to handle.

Comments

Richard Biener Nov. 6, 2019, 11:56 a.m. UTC | #1
On Mon, Nov 4, 2019 at 4:30 PM Richard Sandiford
<richard.sandiford@arm.com> wrote:
>
> With a later patch I saw a case in which we peeled a single iteration
> for gaps but didn't need to peel further iterations to make up a full
> vector.  We then tried to vectorise the single-iteration epilogue.

But when peeling for gaps we peel off a full vector iteration and thus
have possibly VF-1 iterations in the epilogue, enough for vectorizing
with VF/2?

Richard Sandiford Nov. 6, 2019, 12:22 p.m. UTC | #2
Richard Biener <richard.guenther@gmail.com> writes:
> On Mon, Nov 4, 2019 at 4:30 PM Richard Sandiford
> <richard.sandiford@arm.com> wrote:
>>
>> With a later patch I saw a case in which we peeled a single iteration
>> for gaps but didn't need to peel further iterations to make up a full
>> vector.  We then tried to vectorise the single-iteration epilogue.
>
> But when peeling for gaps we peel off a full vector iteration and thus
> have possibly VF-1 iterations in the epilogue, enough for vectorizing
> with VF/2?

Peeling for gaps just means we need to peel off one final scalar
iteration.  Often that means we need to peel more to keep the vector
loop operating on a multiple of VF, but if so, that additional peeling
counts as LOOP_VINFO_PEELING_FOR_NITER.

If we have a VF of 32 and a known iteration count of 65, we can peel a
single iteration for gaps without having to peel any more.  (Obviously
we'd peel that iteration anyway if we didn't have to peel it for gaps.)
And when using fully-masked/predicated loops, peeling one iteration for
gaps doesn't force us to peel more, even if the iteration count isn't
known.

Thanks,
Richard
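
The arithmetic in the reply above can be sketched in C. This is a simplified, hypothetical model and not GCC's implementation: `struct peel_plan` and `plan_peeling` are illustrative names, and the model assumes a known iteration count of at least one and no peeling for alignment.

```c
#include <stdbool.h>

/* Hypothetical, simplified model of how a loop's iterations are split.
   These names are illustrative; they are not GCC data structures.  */
struct peel_plan {
  unsigned vector_iters;    /* iterations of the main vector loop */
  unsigned epilogue_iters;  /* scalar iterations left for the epilogue */
  bool peeling_for_niter;   /* peeling needed beyond the gap iteration */
};

/* Assumes niters >= 1, vf >= 1 and no peeling for alignment.  */
static struct peel_plan
plan_peeling (unsigned niters, unsigned vf, bool peel_for_gaps,
              bool fully_masked)
{
  struct peel_plan plan;
  /* Peeling for gaps reserves exactly one final scalar iteration.  */
  unsigned gap_iters = peel_for_gaps ? 1 : 0;
  unsigned rest = niters - gap_iters;

  if (fully_masked)
    {
      /* A masked final vector iteration absorbs any partial vector,
         so nothing more than the gap iteration is peeled.  */
      plan.vector_iters = (rest + vf - 1) / vf;
      plan.epilogue_iters = gap_iters;
      plan.peeling_for_niter = false;
    }
  else
    {
      /* Extra peeling is needed only when the remaining iterations
         are not a multiple of the vectorization factor.  */
      plan.vector_iters = rest / vf;
      plan.epilogue_iters = gap_iters + rest % vf;
      plan.peeling_for_niter = (rest % vf != 0);
    }
  return plan;
}
```

In this model, `plan_peeling (65, 32, true, false)` yields two vector iterations, a one-iteration epilogue and `peeling_for_niter == false`, which is the situation the patch stops trying to vectorise.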

Richard Biener Nov. 6, 2019, 1:16 p.m. UTC | #3
On Wed, Nov 6, 2019 at 1:22 PM Richard Sandiford
<richard.sandiford@arm.com> wrote:
>
> Richard Biener <richard.guenther@gmail.com> writes:
> > On Mon, Nov 4, 2019 at 4:30 PM Richard Sandiford
> > <richard.sandiford@arm.com> wrote:
> >>
> >> With a later patch I saw a case in which we peeled a single iteration
> >> for gaps but didn't need to peel further iterations to make up a full
> >> vector.  We then tried to vectorise the single-iteration epilogue.
> >
> > But when peeling for gaps we peel off a full vector iteration and thus
> > have possibly VF-1 iterations in the epilogue, enough for vectorizing
> > with VF/2?
>
> Peeling for gaps just means we need to peel off one final scalar
> iteration.  Often that means we need to peel more to keep the vector
> loop operating on a multiple of VF, but if so, that additional peeling
> counts as LOOP_VINFO_PEELING_FOR_NITER.
>
> If we have a VF of 32 and a known iteration count of 65, we can peel a
> single iteration for gaps without having to peel any more.  (Obviously
> we'd peel that iteration anyway if we didn't have to peel it for gaps.)
> And when using fully-masked/predicated loops, peeling one iteration for
> gaps doesn't force us to peel more, even if the iteration count isn't
> known.

For sure when we do not have any epilogue it's pointless to try to
vectorize it.  It seems LOOP_VINFO_PEELING_FOR_NITER is set in
"interesting" ways; deciphering it seems to show that when we have an
epilogue but not LOOP_VINFO_PEELING_FOR_NITER, that epilogue always
has only a single iteration.

So, OK ...

Richard.
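
The conclusion above (an epilogue without peeling for niters has exactly one iteration) can be checked by brute force under a simplified model. The sketch assumes an unmasked loop with a known iteration count and no alignment peeling. It treats "peeling for niters" as meaning that the iterations left after the optional gap iteration are not a multiple of VF. `count_counterexamples` is a hypothetical helper, not a GCC function.

```c
#include <stdbool.h>

/* Count the (niters, vf, gaps) combinations, under the simplified model
   stated in the lead-in, where an epilogue exists without peeling for
   niters yet would run more than one iteration.  */
unsigned
count_counterexamples (unsigned max_niters, unsigned max_vf)
{
  unsigned bad = 0;
  for (unsigned vf = 1; vf <= max_vf; vf++)
    for (unsigned niters = 1; niters <= max_niters; niters++)
      for (unsigned gaps = 0; gaps <= 1; gaps++)
        {
          /* Iterations left once the optional gap iteration is peeled.  */
          unsigned rest = niters - gaps;
          bool peeling_for_niter = (rest % vf != 0);
          unsigned epilogue_iters = gaps + rest % vf;

          if (epilogue_iters > 0 && !peeling_for_niter
              && epilogue_iters != 1)
            bad++;
        }
  return bad;
}
```

The count is zero by construction in this model: without peeling for niters the remainder is zero, so the epilogue consists only of the gap iteration, if there is one.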

Patch

Index: gcc/tree-vect-loop.c
===================================================================
--- gcc/tree-vect-loop.c	2019-11-04 15:18:26.684592505 +0000
+++ gcc/tree-vect-loop.c	2019-11-04 15:18:36.608524542 +0000
@@ -2462,6 +2462,7 @@  vect_analyze_loop (class loop *loop, loo
 	  vect_epilogues = (!loop->simdlen
 			    && loop->inner == NULL
 			    && PARAM_VALUE (PARAM_VECT_EPILOGUES_NOMASK)
+			    && LOOP_VINFO_PEELING_FOR_NITER (first_loop_vinfo)
 			    /* For now only allow one epilogue loop.  */
 			    && first_loop_vinfo->epilogue_vinfos.is_empty ());