
[2/4] Check the VF is small enough for an epilogue loop

Message ID mptr22nveyp.fsf@arm.com
State New
Series Vector epilogues vs. mixed vector sizes

Commit Message

Richard Sandiford Nov. 4, 2019, 3:27 p.m. UTC
The number of iterations of an epilogue loop is always smaller than the
VF of the main loop.  vect_analyze_loop_costing was taking this into
account when deciding whether the loop is cheap enough to vectorise,
but that has no effect with the unlimited cost model.  We need to use
a separate check for correctness as well.

An epilogue VF that is not smaller than the main loop's VF can arise if
the sizes returned by autovectorize_vector_sizes happen to be out of
order, e.g. because the target prefers smaller vectors.  It can also
arise with later patches if two vectorisation attempts happen to end up
with the same VF.


2019-11-04  Richard Sandiford  <richard.sandiford@arm.com>

gcc/
	* tree-vect-loop.c (vect_analyze_loop_2): When vectorizing an
	epilogue loop, make sure that the VF is small enough or that
	the epilogue loop can be fully-masked.
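
The arithmetic behind the first paragraph can be sketched with constant
VFs.  This is only an illustration: the helper names are hypothetical and
are not GCC functions, and it ignores peeling for alignment or gaps.  It
shows that the iterations left for the epilogue are always fewer than the
main loop's VF, so an unmasked epilogue whose VF is not smaller than the
main VF could never fill a whole vector.

```cpp
#include <cstdint>

// Hypothetical helper (not a GCC function): scalar iterations left over
// once a main loop with vectorization factor main_vf has consumed as many
// full groups of main_vf iterations as it can.
constexpr std::uint64_t
leftover_iterations (std::uint64_t niters, std::uint64_t main_vf)
{
  return niters % main_vf;
}

// Hypothetical helper: how many full vector iterations an unmasked
// epilogue with factor epilogue_vf could run on those leftovers.
constexpr std::uint64_t
epilogue_vector_iterations (std::uint64_t niters, std::uint64_t main_vf,
                            std::uint64_t epilogue_vf)
{
  return leftover_iterations (niters, main_vf) / epilogue_vf;
}

// 29 iterations with a main VF of 8 leave 5 for the epilogue.
static_assert (leftover_iterations (29, 8) == 5, "leftover < main VF");
// A smaller epilogue VF gets one full vector iteration out of them...
static_assert (epilogue_vector_iterations (29, 8, 4) == 1, "VF 4 runs once");
// ...but an epilogue VF equal to the main VF never fills a vector.
static_assert (epilogue_vector_iterations (29, 8, 8) == 0, "VF 8 never runs");
```

A fully-masked epilogue is exempt because it can handle a partial vector
by disabling the unused lanes.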

Comments

Richard Biener Nov. 6, 2019, 11:54 a.m. UTC | #1
On Mon, Nov 4, 2019 at 4:29 PM Richard Sandiford
<richard.sandiford@arm.com> wrote:
>
> The number of iterations of an epilogue loop is always smaller than the
> VF of the main loop.  vect_analyze_loop_costing was taking this into
> account when deciding whether the loop is cheap enough to vectorise,
> but that has no effect with the unlimited cost model.  We need to use
> a separate check for correctness as well.
>
> This can happen if the sizes returned by autovectorize_vector_sizes
> happen to be out of order, e.g. because the target prefers smaller
> vectors.  It can also happen with later patches if two vectorisation
> attempts happen to end up with the same VF.

OK.

>
> 2019-11-04  Richard Sandiford  <richard.sandiford@arm.com>
>
> gcc/
>         * tree-vect-loop.c (vect_analyze_loop_2): When vectorizing an
>         epilogue loop, make sure that the VF is small enough or that
>         the epilogue loop can be fully-masked.
>
> Index: gcc/tree-vect-loop.c
> ===================================================================
> --- gcc/tree-vect-loop.c        2019-11-04 15:17:35.924940111 +0000
> +++ gcc/tree-vect-loop.c        2019-11-04 15:17:50.736838681 +0000
> @@ -2142,6 +2142,16 @@ vect_analyze_loop_2 (loop_vec_info loop_
>                                        " support peeling for gaps.\n");
>      }
>
> +  /* If we're vectorizing an epilogue loop, we either need a fully-masked
> +     loop or a loop that has a lower VF than the main loop.  */
> +  if (LOOP_VINFO_EPILOGUE_P (loop_vinfo)
> +      && !LOOP_VINFO_FULLY_MASKED_P (loop_vinfo)
> +      && maybe_ge (LOOP_VINFO_VECT_FACTOR (loop_vinfo),
> +                  LOOP_VINFO_VECT_FACTOR (orig_loop_vinfo)))
> +    return opt_result::failure_at (vect_location,
> +                                  "Vectorization factor too high for"
> +                                  " epilogue loop.\n");
> +
>    /* Check the costings of the loop make vectorizing worthwhile.  */
>    res = vect_analyze_loop_costing (loop_vinfo);
>    if (res < 0)

Patch

Index: gcc/tree-vect-loop.c
===================================================================
--- gcc/tree-vect-loop.c	2019-11-04 15:17:35.924940111 +0000
+++ gcc/tree-vect-loop.c	2019-11-04 15:17:50.736838681 +0000
@@ -2142,6 +2142,16 @@  vect_analyze_loop_2 (loop_vec_info loop_
 				       " support peeling for gaps.\n");
     }
 
+  /* If we're vectorizing an epilogue loop, we either need a fully-masked
+     loop or a loop that has a lower VF than the main loop.  */
+  if (LOOP_VINFO_EPILOGUE_P (loop_vinfo)
+      && !LOOP_VINFO_FULLY_MASKED_P (loop_vinfo)
+      && maybe_ge (LOOP_VINFO_VECT_FACTOR (loop_vinfo),
+		   LOOP_VINFO_VECT_FACTOR (orig_loop_vinfo)))
+    return opt_result::failure_at (vect_location,
+				   "Vectorization factor too high for"
+				   " epilogue loop.\n");
+
   /* Check the costings of the loop make vectorizing worthwhile.  */
   res = vect_analyze_loop_costing (loop_vinfo);
   if (res < 0)
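
For readers unfamiliar with maybe_ge, the new condition can be modelled in
isolation.  The sketch below is a simplified stand-in, not GCC's poly_int
code: PolyVF, LoopInfo, maybe_ge_model and epilogue_vf_ok are hypothetical
names, and a VF is assumed to have the form c0 + c1 * x for a runtime
x >= 0 (as with variable-length vectors).  Under that assumption "maybe
greater or equal" means the comparison holds for at least one x, which is
why an epilogue is rejected unless its VF is known to be smaller.

```cpp
// Simplified stand-in for a two-coefficient polynomial VF:
// value = c0 + c1 * x, where x >= 0 is only known at run time.
struct PolyVF
{
  unsigned long c0;
  unsigned long c1;
};

// True if a >= b for at least one x >= 0: either at x == 0 (a.c0 >= b.c0)
// or eventually for large x (a.c1 > b.c1).
constexpr bool
maybe_ge_model (PolyVF a, PolyVF b)
{
  return a.c0 >= b.c0 || a.c1 > b.c1;
}

// Hypothetical summary of the three properties the check reads.
struct LoopInfo
{
  bool epilogue_p;
  bool fully_masked_p;
  PolyVF vf;
};

// Mirrors the shape of the added condition: an unmasked epilogue is only
// acceptable if its VF cannot reach the main loop's VF.
constexpr bool
epilogue_vf_ok (const LoopInfo &loop, const LoopInfo &main_loop)
{
  return !(loop.epilogue_p
           && !loop.fully_masked_p
           && maybe_ge_model (loop.vf, main_loop.vf));
}
```

With a main loop VF of 4 + 4x, an unmasked epilogue with a constant VF of
4 is rejected in this model (the two are equal when x is 0), while one
with VF 2 + 2x is accepted, as is any fully-masked epilogue.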