[068/nnn] poly_int: current_vector_size and TARGET_AUTOVECTORIZE_VECTOR_SIZES

Message ID 87po9dixjt.fsf@linaro.org
State New

Commit Message

Richard Sandiford Oct. 23, 2017, 5:28 p.m. UTC
This patch changes the type of current_vector_size to poly_uint64.
It also changes TARGET_AUTOVECTORIZE_VECTOR_SIZES so that it fills
in a vector of possible sizes (as poly_uint64s) instead of returning
a bitmask.  The documentation claimed that the hook didn't need to
include the default vector size (the size of the mode returned by
preferred_simd_mode), but that wasn't consistent with the omp-low.c
usage, so the hook is now documented to list that size first.
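
To illustrate the change of representation, here is a minimal standalone
sketch; it is not GCC code.  It assumes plain std::vector<uint64_t> in
place of vec<poly_uint64>, and the helper names sizes_from_mask and
sizes_to_try are hypothetical.  The first helper mimics how callers used
to peel sizes off the bitmask with "1 << floor_log2 (mask)".  The second
mimics the new traversal in vect_analyze_loop, which walks the list in
order and skips the entry equal to the autodetected size, under the
assumption that every analysis attempt fails non-fatally.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Old scheme (sketch): the hook returned an OR of power-of-two byte sizes.
// Callers repeatedly extracted the highest set bit, so the order was always
// "largest first" and only power-of-two constants could be expressed.
std::vector<uint64_t> sizes_from_mask(unsigned int mask) {
  std::vector<uint64_t> out;
  while (mask != 0) {
    unsigned int cur = 1;
    // Find the highest set bit of MASK, like 1 << floor_log2 (mask).
    while ((cur << 1) != 0 && (cur << 1) <= mask)
      cur <<= 1;
    out.push_back(cur);
    mask &= ~cur;
  }
  return out;
}

// New scheme (sketch): the hook fills in an ordered list.  The first attempt
// uses the size autodetected from the preferred mode (PREFERRED here, assumed
// nonzero); later attempts take the list entries in order, skipping the one
// that equals the autodetected size so that it is not tried twice.
std::vector<uint64_t> sizes_to_try(const std::vector<uint64_t> &sizes,
                                   uint64_t preferred) {
  assert(preferred != 0);
  std::vector<uint64_t> tried;
  uint64_t current = preferred;
  uint64_t autodetected = 0;
  std::size_t next = 0;
  for (;;) {
    tried.push_back(current);
    if (next == 0)
      autodetected = current;
    if (next < sizes.size() && sizes[next] == autodetected)
      ++next;
    if (next == sizes.size())
      return tried;
    current = sizes[next++];
  }
}
```

With an empty list (the default hook), only the preferred size is tried.
With a list such as {64, 32, 16} and a preferred size of 64, the sizes are
tried in exactly the order the target gave them.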


2017-10-23  Richard Sandiford  <richard.sandiford@linaro.org>
	    Alan Hayward  <alan.hayward@arm.com>
	    David Sherwood  <david.sherwood@arm.com>

gcc/
	* target.h (vector_sizes, auto_vector_sizes): New typedefs.
	* target.def (autovectorize_vector_sizes): Return the vector sizes
	by pointer, using vector_sizes rather than a bitmask.
	* targhooks.h (default_autovectorize_vector_sizes): Update accordingly.
	* targhooks.c (default_autovectorize_vector_sizes): Likewise.
	* config/aarch64/aarch64.c (aarch64_autovectorize_vector_sizes):
	Likewise.
	* config/arc/arc.c (arc_autovectorize_vector_sizes): Likewise.
	* config/arm/arm.c (arm_autovectorize_vector_sizes): Likewise.
	* config/i386/i386.c (ix86_autovectorize_vector_sizes): Likewise.
	* config/mips/mips.c (mips_autovectorize_vector_sizes): Likewise.
	* omp-general.c (omp_max_vf): Likewise.
	* omp-low.c (omp_clause_aligned_alignment): Likewise.
	* optabs-query.c (can_vec_mask_load_store_p): Likewise.
	* tree-vect-loop.c (vect_analyze_loop, vect_transform_loop): Likewise.
	* tree-vect-slp.c (vect_slp_bb): Likewise.
	* doc/tm.texi: Regenerate.
	* tree-vectorizer.h (current_vector_size): Change from an unsigned int
	to a poly_uint64.
	* tree-vect-stmts.c (get_vectype_for_scalar_type_and_size): Take
	the vector size as a poly_uint64 rather than an unsigned int.
	(current_vector_size): Change from an unsigned int to a poly_uint64.
	(get_vectype_for_scalar_type): Update accordingly.
	* tree.h (build_truth_vector_type): Take the size and number of
	units as poly_uint64s rather than unsigned ints.
	(build_vector_type): Add a temporary overload that takes
	the number of units as a poly_uint64 rather than an unsigned int.
	* tree.c (make_vector_type): Likewise.
	(build_truth_vector_type): Take the size and number of units as
	poly_uint64s rather than unsigned ints.
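
The patch replaces ordinary comparisons on current_vector_size with
predicates such as known_zero, maybe_nonzero, must_eq, may_lt, must_lt
and multiple_p.  The following is a simplified model of what those
predicates mean, assuming a value of the form c0 + c1 * N where N >= 0 is
not known at compile time.  The poly2 struct and these function bodies
are illustrative assumptions only; they are not the implementation in
GCC's poly-int.h, which is templated and handles more cases.

```cpp
#include <cassert>
#include <cstdint>

// Simplified stand-in for a two-coefficient poly_uint64:
// the value is c0 + c1 * N for some unknown runtime N >= 0.
struct poly2 {
  uint64_t c0;
  uint64_t c1;
};

// True if the value is zero for every N.
bool known_zero(poly2 a) { return a.c0 == 0 && a.c1 == 0; }

// True if the value is nonzero for at least one N.
bool maybe_nonzero(poly2 a) { return !known_zero(a); }

// True if A == B for every N.
bool must_eq(poly2 a, poly2 b) { return a.c0 == b.c0 && a.c1 == b.c1; }

// True if A < B for at least one N.
bool may_lt(poly2 a, poly2 b) { return a.c0 < b.c0 || a.c1 < b.c1; }

// True if A <= B for at least one N.
bool may_le(poly2 a, poly2 b) { return a.c1 < b.c1 || a.c0 <= b.c0; }

// True if A < B for every N, i.e. B <= A is impossible.
bool must_lt(poly2 a, poly2 b) { return !may_le(b, a); }

// True if A is a multiple of the nonzero constant B for every N;
// on success, store the quotient in *QUOTIENT.
bool multiple_p(poly2 a, uint64_t b, poly2 *quotient) {
  assert(b != 0);
  if (a.c0 % b != 0 || a.c1 % b != 0)
    return false;
  *quotient = poly2{a.c0 / b, a.c1 / b};
  return true;
}
```

For example, with a fixed size {16, 0} and a variable size {16, 16},
may_lt holds (take N = 1) but must_lt does not (the two are equal when
N = 0).  This is why the patch has to choose between the "may" and "must"
forms at each comparison instead of using "<" or "==" directly.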

Comments

Jeff Law Dec. 6, 2017, 1:52 a.m. UTC | #1

OK.

jeff
Patch

Index: gcc/target.h
===================================================================
--- gcc/target.h	2017-10-23 17:11:40.126719272 +0100
+++ gcc/target.h	2017-10-23 17:22:32.724227435 +0100
@@ -199,6 +199,13 @@  typedef vec<unsigned short> vec_perm_ind
    automatically freed.  */
 typedef auto_vec<unsigned short, 32> auto_vec_perm_indices;
 
+/* The type to use for lists of vector sizes.  */
+typedef vec<poly_uint64> vector_sizes;
+
+/* Same, but can be used to construct local lists that are
+   automatically freed.  */
+typedef auto_vec<poly_uint64, 8> auto_vector_sizes;
+
 /* The target structure.  This holds all the backend hooks.  */
 #define DEFHOOKPOD(NAME, DOC, TYPE, INIT) TYPE NAME;
 #define DEFHOOK(NAME, DOC, TYPE, PARAMS, INIT) TYPE (* NAME) PARAMS;
Index: gcc/target.def
===================================================================
--- gcc/target.def	2017-10-23 17:22:30.980383601 +0100
+++ gcc/target.def	2017-10-23 17:22:32.724227435 +0100
@@ -1880,12 +1880,16 @@  transformations even in absence of speci
    after processing the preferred one derived from preferred_simd_mode.  */
 DEFHOOK
 (autovectorize_vector_sizes,
- "This hook should return a mask of sizes that should be iterated over\n\
-after trying to autovectorize using the vector size derived from the\n\
-mode returned by @code{TARGET_VECTORIZE_PREFERRED_SIMD_MODE}.\n\
-The default is zero which means to not iterate over other vector sizes.",
- unsigned int,
- (void),
+ "If the mode returned by @code{TARGET_VECTORIZE_PREFERRED_SIMD_MODE} is not\n\
+the only one that is worth considering, this hook should add all suitable\n\
+vector sizes to @var{sizes}, in order of decreasing preference.  The first\n\
+one should be the size of @code{TARGET_VECTORIZE_PREFERRED_SIMD_MODE}.\n\
+\n\
+The hook does not need to do anything if the vector returned by\n\
+@code{TARGET_VECTORIZE_PREFERRED_SIMD_MODE} is the only one relevant\n\
+for autovectorization.  The default implementation does nothing.",
+ void,
+ (vector_sizes *sizes),
  default_autovectorize_vector_sizes)
 
 /* Function to get a target mode for a vector mask.  */
Index: gcc/targhooks.h
===================================================================
--- gcc/targhooks.h	2017-10-23 17:22:30.980383601 +0100
+++ gcc/targhooks.h	2017-10-23 17:22:32.725227332 +0100
@@ -106,7 +106,7 @@  default_builtin_support_vector_misalignm
 					     const_tree,
 					     int, bool);
 extern machine_mode default_preferred_simd_mode (scalar_mode mode);
-extern unsigned int default_autovectorize_vector_sizes (void);
+extern void default_autovectorize_vector_sizes (vector_sizes *);
 extern opt_machine_mode default_get_mask_mode (poly_uint64, poly_uint64);
 extern void *default_init_cost (struct loop *);
 extern unsigned default_add_stmt_cost (void *, int, enum vect_cost_for_stmt,
Index: gcc/targhooks.c
===================================================================
--- gcc/targhooks.c	2017-10-23 17:22:30.980383601 +0100
+++ gcc/targhooks.c	2017-10-23 17:22:32.725227332 +0100
@@ -1248,10 +1248,9 @@  default_preferred_simd_mode (scalar_mode
 /* By default only the size derived from the preferred vector mode
    is tried.  */
 
-unsigned int
-default_autovectorize_vector_sizes (void)
+void
+default_autovectorize_vector_sizes (vector_sizes *)
 {
-  return 0;
 }
 
 /* By default a vector of integers is used as a mask.  */
Index: gcc/config/aarch64/aarch64.c
===================================================================
--- gcc/config/aarch64/aarch64.c	2017-10-23 17:11:40.139744163 +0100
+++ gcc/config/aarch64/aarch64.c	2017-10-23 17:22:32.709228991 +0100
@@ -11310,12 +11310,13 @@  aarch64_preferred_simd_mode (scalar_mode
   return aarch64_simd_container_mode (mode, 128);
 }
 
-/* Return the bitmask of possible vector sizes for the vectorizer
+/* Return a list of possible vector sizes for the vectorizer
    to iterate over.  */
-static unsigned int
-aarch64_autovectorize_vector_sizes (void)
+static void
+aarch64_autovectorize_vector_sizes (vector_sizes *sizes)
 {
-  return (16 | 8);
+  sizes->safe_push (16);
+  sizes->safe_push (8);
 }
 
 /* Implement TARGET_MANGLE_TYPE.  */
Index: gcc/config/arc/arc.c
===================================================================
--- gcc/config/arc/arc.c	2017-10-23 17:11:40.141747992 +0100
+++ gcc/config/arc/arc.c	2017-10-23 17:22:32.710228887 +0100
@@ -404,10 +404,14 @@  arc_preferred_simd_mode (scalar_mode mod
 /* Implements target hook
    TARGET_VECTORIZE_AUTOVECTORIZE_VECTOR_SIZES.  */
 
-static unsigned int
-arc_autovectorize_vector_sizes (void)
+static void
+arc_autovectorize_vector_sizes (vector_sizes *sizes)
 {
-  return TARGET_PLUS_QMACW ? (8 | 4) : 0;
+  if (TARGET_PLUS_QMACW)
+    {
+      sizes->quick_push (8);
+      sizes->quick_push (4);
+    }
 }
 
 /* TARGET_PRESERVE_RELOAD_P is still awaiting patch re-evaluation / review.  */
Index: gcc/config/arm/arm.c
===================================================================
--- gcc/config/arm/arm.c	2017-10-23 17:19:01.398170131 +0100
+++ gcc/config/arm/arm.c	2017-10-23 17:22:32.713228576 +0100
@@ -283,7 +283,7 @@  static bool arm_builtin_support_vector_m
 static void arm_conditional_register_usage (void);
 static enum flt_eval_method arm_excess_precision (enum excess_precision_type);
 static reg_class_t arm_preferred_rename_class (reg_class_t rclass);
-static unsigned int arm_autovectorize_vector_sizes (void);
+static void arm_autovectorize_vector_sizes (vector_sizes *);
 static int arm_default_branch_cost (bool, bool);
 static int arm_cortex_a5_branch_cost (bool, bool);
 static int arm_cortex_m_branch_cost (bool, bool);
@@ -27947,10 +27947,14 @@  arm_vector_alignment (const_tree type)
   return align;
 }
 
-static unsigned int
-arm_autovectorize_vector_sizes (void)
+static void
+arm_autovectorize_vector_sizes (vector_sizes *sizes)
 {
-  return TARGET_NEON_VECTORIZE_DOUBLE ? 0 : (16 | 8);
+  if (!TARGET_NEON_VECTORIZE_DOUBLE)
+    {
+      sizes->safe_push (16);
+      sizes->safe_push (8);
+    }
 }
 
 static bool
Index: gcc/config/i386/i386.c
===================================================================
--- gcc/config/i386/i386.c	2017-10-23 17:22:30.978383200 +0100
+++ gcc/config/i386/i386.c	2017-10-23 17:22:32.719227954 +0100
@@ -48105,17 +48105,20 @@  ix86_preferred_simd_mode (scalar_mode mo
    vectors.  If AVX512F is enabled then try vectorizing with 512bit,
    256bit and 128bit vectors.  */
 
-static unsigned int
-ix86_autovectorize_vector_sizes (void)
+static void
+ix86_autovectorize_vector_sizes (vector_sizes *sizes)
 {
-  unsigned int bytesizes = 0;
-
   if (TARGET_AVX512F && !TARGET_PREFER_AVX256)
-    bytesizes |= (64 | 32 | 16);
+    {
+      sizes->safe_push (64);
+      sizes->safe_push (32);
+      sizes->safe_push (16);
+    }
   else if (TARGET_AVX && !TARGET_PREFER_AVX128)
-    bytesizes |= (32 | 16);
-
-  return bytesizes;
+    {
+      sizes->safe_push (32);
+      sizes->safe_push (16);
+    }
 }
 
 /* Implemenation of targetm.vectorize.get_mask_mode.  */
Index: gcc/config/mips/mips.c
===================================================================
--- gcc/config/mips/mips.c	2017-10-23 17:18:47.656057887 +0100
+++ gcc/config/mips/mips.c	2017-10-23 17:22:32.721227746 +0100
@@ -13401,10 +13401,11 @@  mips_preferred_simd_mode (scalar_mode mo
 
 /* Implement TARGET_VECTORIZE_AUTOVECTORIZE_VECTOR_SIZES.  */
 
-static unsigned int
-mips_autovectorize_vector_sizes (void)
+static void
+mips_autovectorize_vector_sizes (vector_sizes *sizes)
 {
-  return ISA_HAS_MSA ? 16 : 0;
+  if (ISA_HAS_MSA)
+    sizes->safe_push (16);
 }
 
 /* Implement TARGET_INIT_LIBFUNCS.  */
Index: gcc/omp-general.c
===================================================================
--- gcc/omp-general.c	2017-10-23 17:22:29.881163047 +0100
+++ gcc/omp-general.c	2017-10-23 17:22:32.722227643 +0100
@@ -433,17 +433,21 @@  omp_max_vf (void)
 	  && global_options_set.x_flag_tree_loop_vectorize))
     return 1;
 
-  int vf = 1;
-  int vs = targetm.vectorize.autovectorize_vector_sizes ();
-  if (vs)
-    vf = 1 << floor_log2 (vs);
-  else
+  auto_vector_sizes sizes;
+  targetm.vectorize.autovectorize_vector_sizes (&sizes);
+  if (!sizes.is_empty ())
     {
-      machine_mode vqimode = targetm.vectorize.preferred_simd_mode (QImode);
-      if (GET_MODE_CLASS (vqimode) == MODE_VECTOR_INT)
-	vf = GET_MODE_NUNITS (vqimode);
+      poly_uint64 vf = 0;
+      for (unsigned int i = 0; i < sizes.length (); ++i)
+	vf = ordered_max (vf, sizes[i]);
+      return vf;
     }
-  return vf;
+
+  machine_mode vqimode = targetm.vectorize.preferred_simd_mode (QImode);
+  if (GET_MODE_CLASS (vqimode) == MODE_VECTOR_INT)
+    return GET_MODE_NUNITS (vqimode);
+
+  return 1;
 }
 
 /* Return maximum SIMT width if offloading may target SIMT hardware.  */
Index: gcc/omp-low.c
===================================================================
--- gcc/omp-low.c	2017-10-23 17:22:29.882163248 +0100
+++ gcc/omp-low.c	2017-10-23 17:22:32.723227539 +0100
@@ -3451,9 +3451,11 @@  omp_clause_aligned_alignment (tree claus
   /* Otherwise return implementation defined alignment.  */
   unsigned int al = 1;
   opt_scalar_mode mode_iter;
-  int vs = targetm.vectorize.autovectorize_vector_sizes ();
-  if (vs)
-    vs = 1 << floor_log2 (vs);
+  auto_vector_sizes sizes;
+  targetm.vectorize.autovectorize_vector_sizes (&sizes);
+  poly_uint64 vs = 0;
+  for (unsigned int i = 0; i < sizes.length (); ++i)
+    vs = ordered_max (vs, sizes[i]);
   static enum mode_class classes[]
     = { MODE_INT, MODE_VECTOR_INT, MODE_FLOAT, MODE_VECTOR_FLOAT };
   for (int i = 0; i < 4; i += 2)
@@ -3464,16 +3466,16 @@  omp_clause_aligned_alignment (tree claus
 	machine_mode vmode = targetm.vectorize.preferred_simd_mode (mode);
 	if (GET_MODE_CLASS (vmode) != classes[i + 1])
 	  continue;
-	while (vs
-	       && GET_MODE_SIZE (vmode) < vs
+	while (maybe_nonzero (vs)
+	       && must_lt (GET_MODE_SIZE (vmode), vs)
 	       && GET_MODE_2XWIDER_MODE (vmode).exists ())
 	  vmode = GET_MODE_2XWIDER_MODE (vmode).require ();
 
 	tree type = lang_hooks.types.type_for_mode (mode, 1);
 	if (type == NULL_TREE || TYPE_MODE (type) != mode)
 	  continue;
-	type = build_vector_type (type, GET_MODE_SIZE (vmode)
-					/ GET_MODE_SIZE (mode));
+	unsigned int nelts = GET_MODE_SIZE (vmode) / GET_MODE_SIZE (mode);
+	type = build_vector_type (type, nelts);
 	if (TYPE_MODE (type) != vmode)
 	  continue;
 	if (TYPE_ALIGN_UNIT (type) > al)
Index: gcc/optabs-query.c
===================================================================
--- gcc/optabs-query.c	2017-10-23 17:11:39.995468444 +0100
+++ gcc/optabs-query.c	2017-10-23 17:22:32.723227539 +0100
@@ -489,7 +489,6 @@  can_vec_mask_load_store_p (machine_mode
 {
   optab op = is_load ? maskload_optab : maskstore_optab;
   machine_mode vmode;
-  unsigned int vector_sizes;
 
   /* If mode is vector mode, check it directly.  */
   if (VECTOR_MODE_P (mode))
@@ -513,14 +512,14 @@  can_vec_mask_load_store_p (machine_mode
       && convert_optab_handler (op, vmode, mask_mode) != CODE_FOR_nothing)
     return true;
 
-  vector_sizes = targetm.vectorize.autovectorize_vector_sizes ();
-  while (vector_sizes != 0)
+  auto_vector_sizes vector_sizes;
+  targetm.vectorize.autovectorize_vector_sizes (&vector_sizes);
+  for (unsigned int i = 0; i < vector_sizes.length (); ++i)
     {
-      unsigned int cur = 1 << floor_log2 (vector_sizes);
-      vector_sizes &= ~cur;
-      if (cur <= GET_MODE_SIZE (smode))
+      poly_uint64 cur = vector_sizes[i];
+      poly_uint64 nunits;
+      if (!multiple_p (cur, GET_MODE_SIZE (smode), &nunits))
 	continue;
-      unsigned int nunits = cur / GET_MODE_SIZE (smode);
       if (mode_for_vector (smode, nunits).exists (&vmode)
 	  && VECTOR_MODE_P (vmode)
 	  && targetm.vectorize.get_mask_mode (nunits, cur).exists (&mask_mode)
Index: gcc/tree-vect-loop.c
===================================================================
--- gcc/tree-vect-loop.c	2017-10-23 17:22:28.835953330 +0100
+++ gcc/tree-vect-loop.c	2017-10-23 17:22:32.727227124 +0100
@@ -2327,11 +2327,12 @@  vect_analyze_loop_2 (loop_vec_info loop_
 vect_analyze_loop (struct loop *loop, loop_vec_info orig_loop_vinfo)
 {
   loop_vec_info loop_vinfo;
-  unsigned int vector_sizes;
+  auto_vector_sizes vector_sizes;
 
   /* Autodetect first vector size we try.  */
   current_vector_size = 0;
-  vector_sizes = targetm.vectorize.autovectorize_vector_sizes ();
+  targetm.vectorize.autovectorize_vector_sizes (&vector_sizes);
+  unsigned int next_size = 0;
 
   if (dump_enabled_p ())
     dump_printf_loc (MSG_NOTE, vect_location,
@@ -2347,6 +2348,7 @@  vect_analyze_loop (struct loop *loop, lo
       return NULL;
     }
 
+  poly_uint64 autodetected_vector_size = 0;
   while (1)
     {
       /* Check the CFG characteristics of the loop (nesting, entry/exit).  */
@@ -2373,18 +2375,28 @@  vect_analyze_loop (struct loop *loop, lo
 
       delete loop_vinfo;
 
-      vector_sizes &= ~current_vector_size;
+      if (next_size == 0)
+	autodetected_vector_size = current_vector_size;
+
+      if (next_size < vector_sizes.length ()
+	  && must_eq (vector_sizes[next_size], autodetected_vector_size))
+	next_size += 1;
+
       if (fatal
-	  || vector_sizes == 0
-	  || current_vector_size == 0)
+	  || next_size == vector_sizes.length ()
+	  || known_zero (current_vector_size))
 	return NULL;
 
       /* Try the next biggest vector size.  */
-      current_vector_size = 1 << floor_log2 (vector_sizes);
+      current_vector_size = vector_sizes[next_size++];
       if (dump_enabled_p ())
-	dump_printf_loc (MSG_NOTE, vect_location,
-			 "***** Re-trying analysis with "
-			 "vector size %d\n", current_vector_size);
+	{
+	  dump_printf_loc (MSG_NOTE, vect_location,
+			   "***** Re-trying analysis with "
+			   "vector size ");
+	  dump_dec (MSG_NOTE, current_vector_size);
+	  dump_printf (MSG_NOTE, "\n");
+	}
     }
 }
 
@@ -7686,9 +7698,12 @@  vect_transform_loop (loop_vec_info loop_
 	  dump_printf (MSG_NOTE, "\n");
 	}
       else
-	dump_printf_loc (MSG_NOTE, vect_location,
-			 "LOOP EPILOGUE VECTORIZED (VS=%d)\n",
-			 current_vector_size);
+	{
+	  dump_printf_loc (MSG_NOTE, vect_location,
+			   "LOOP EPILOGUE VECTORIZED (VS=");
+	  dump_dec (MSG_NOTE, current_vector_size);
+	  dump_printf (MSG_NOTE, ")\n");
+	}
     }
 
   /* Free SLP instances here because otherwise stmt reference counting
@@ -7705,31 +7720,39 @@  vect_transform_loop (loop_vec_info loop_
   if (LOOP_VINFO_EPILOGUE_P (loop_vinfo))
     epilogue = NULL;
 
+  if (!PARAM_VALUE (PARAM_VECT_EPILOGUES_NOMASK))
+    epilogue = NULL;
+
   if (epilogue)
     {
-	unsigned int vector_sizes
-	  = targetm.vectorize.autovectorize_vector_sizes ();
-	vector_sizes &= current_vector_size - 1;
-
-	if (!PARAM_VALUE (PARAM_VECT_EPILOGUES_NOMASK))
-	  epilogue = NULL;
-	else if (!vector_sizes)
-	  epilogue = NULL;
-	else if (LOOP_VINFO_NITERS_KNOWN_P (loop_vinfo)
-		 && LOOP_VINFO_PEELING_FOR_ALIGNMENT (loop_vinfo) >= 0
-		 && must_eq (vf, lowest_vf))
-	  {
-	    int smallest_vec_size = 1 << ctz_hwi (vector_sizes);
-	    int ratio = current_vector_size / smallest_vec_size;
-	    unsigned HOST_WIDE_INT eiters = LOOP_VINFO_INT_NITERS (loop_vinfo)
-	      - LOOP_VINFO_PEELING_FOR_ALIGNMENT (loop_vinfo);
-	    eiters = eiters % lowest_vf;
-
-	    epilogue->nb_iterations_upper_bound = eiters - 1;
+      auto_vector_sizes vector_sizes;
+      targetm.vectorize.autovectorize_vector_sizes (&vector_sizes);
+      unsigned int next_size = 0;
+
+      if (LOOP_VINFO_NITERS_KNOWN_P (loop_vinfo)
+	  && LOOP_VINFO_PEELING_FOR_ALIGNMENT (loop_vinfo) >= 0
+	  && must_eq (vf, lowest_vf))
+	{
+	  unsigned int eiters
+	    = (LOOP_VINFO_INT_NITERS (loop_vinfo)
+	       - LOOP_VINFO_PEELING_FOR_ALIGNMENT (loop_vinfo));
+	  eiters = eiters % lowest_vf;
+	  epilogue->nb_iterations_upper_bound = eiters - 1;
+
+	  unsigned int ratio;
+	  while (next_size < vector_sizes.length ()
+		 && !(constant_multiple_p (current_vector_size,
+					   vector_sizes[next_size], &ratio)
+		      && eiters >= lowest_vf / ratio))
+	    next_size += 1;
+	}
+      else
+	while (next_size < vector_sizes.length ()
+	       && may_lt (current_vector_size, vector_sizes[next_size]))
+	  next_size += 1;
 
-	    if (eiters < lowest_vf / ratio)
-	      epilogue = NULL;
-	    }
+      if (next_size == vector_sizes.length ())
+	epilogue = NULL;
     }
 
   if (epilogue)
Index: gcc/tree-vect-slp.c
===================================================================
--- gcc/tree-vect-slp.c	2017-10-23 17:22:28.836953531 +0100
+++ gcc/tree-vect-slp.c	2017-10-23 17:22:32.728227020 +0100
@@ -3018,18 +3018,20 @@  vect_slp_bb (basic_block bb)
 {
   bb_vec_info bb_vinfo;
   gimple_stmt_iterator gsi;
-  unsigned int vector_sizes;
   bool any_vectorized = false;
+  auto_vector_sizes vector_sizes;
 
   if (dump_enabled_p ())
     dump_printf_loc (MSG_NOTE, vect_location, "===vect_slp_analyze_bb===\n");
 
   /* Autodetect first vector size we try.  */
   current_vector_size = 0;
-  vector_sizes = targetm.vectorize.autovectorize_vector_sizes ();
+  targetm.vectorize.autovectorize_vector_sizes (&vector_sizes);
+  unsigned int next_size = 0;
 
   gsi = gsi_start_bb (bb);
 
+  poly_uint64 autodetected_vector_size = 0;
   while (1)
     {
       if (gsi_end_p (gsi))
@@ -3084,10 +3086,16 @@  vect_slp_bb (basic_block bb)
 
       any_vectorized |= vectorized;
 
-      vector_sizes &= ~current_vector_size;
+      if (next_size == 0)
+	autodetected_vector_size = current_vector_size;
+
+      if (next_size < vector_sizes.length ()
+	  && must_eq (vector_sizes[next_size], autodetected_vector_size))
+	next_size += 1;
+
       if (vectorized
-	  || vector_sizes == 0
-	  || current_vector_size == 0
+	  || next_size == vector_sizes.length ()
+	  || known_zero (current_vector_size)
 	  /* If vect_slp_analyze_bb_1 signaled that analysis for all
 	     vector sizes will fail do not bother iterating.  */
 	  || fatal)
@@ -3100,16 +3108,20 @@  vect_slp_bb (basic_block bb)
 
 	  /* And reset vector sizes.  */
 	  current_vector_size = 0;
-	  vector_sizes = targetm.vectorize.autovectorize_vector_sizes ();
+	  next_size = 0;
 	}
       else
 	{
 	  /* Try the next biggest vector size.  */
-	  current_vector_size = 1 << floor_log2 (vector_sizes);
+	  current_vector_size = vector_sizes[next_size++];
 	  if (dump_enabled_p ())
-	    dump_printf_loc (MSG_NOTE, vect_location,
-			     "***** Re-trying analysis with "
-			     "vector size %d\n", current_vector_size);
+	    {
+	      dump_printf_loc (MSG_NOTE, vect_location,
+			       "***** Re-trying analysis with "
+			       "vector size ");
+	      dump_dec (MSG_NOTE, current_vector_size);
+	      dump_printf (MSG_NOTE, "\n");
+	    }
 
 	  /* Start over.  */
 	  gsi = region_begin;
Index: gcc/doc/tm.texi
===================================================================
--- gcc/doc/tm.texi	2017-10-23 17:22:30.979383401 +0100
+++ gcc/doc/tm.texi	2017-10-23 17:22:32.722227643 +0100
@@ -5839,11 +5839,15 @@  equal to @code{word_mode}, because the v
 transformations even in absence of specialized @acronym{SIMD} hardware.
 @end deftypefn
 
-@deftypefn {Target Hook} {unsigned int} TARGET_VECTORIZE_AUTOVECTORIZE_VECTOR_SIZES (void)
-This hook should return a mask of sizes that should be iterated over
-after trying to autovectorize using the vector size derived from the
-mode returned by @code{TARGET_VECTORIZE_PREFERRED_SIMD_MODE}.
-The default is zero which means to not iterate over other vector sizes.
+@deftypefn {Target Hook} void TARGET_VECTORIZE_AUTOVECTORIZE_VECTOR_SIZES (vector_sizes *@var{sizes})
+If the mode returned by @code{TARGET_VECTORIZE_PREFERRED_SIMD_MODE} is not
+the only one that is worth considering, this hook should add all suitable
+vector sizes to @var{sizes}, in order of decreasing preference.  The first
+one should be the size of @code{TARGET_VECTORIZE_PREFERRED_SIMD_MODE}.
+
+The hook does not need to do anything if the vector returned by
+@code{TARGET_VECTORIZE_PREFERRED_SIMD_MODE} is the only one relevant
+for autovectorization.  The default implementation does nothing.
 @end deftypefn
 
 @deftypefn {Target Hook} opt_machine_mode TARGET_VECTORIZE_GET_MASK_MODE (poly_uint64 @var{nunits}, poly_uint64 @var{length})
Index: gcc/tree-vectorizer.h
===================================================================
--- gcc/tree-vectorizer.h	2017-10-23 17:22:28.837953732 +0100
+++ gcc/tree-vectorizer.h	2017-10-23 17:22:32.731226709 +0100
@@ -1199,7 +1199,7 @@  extern source_location find_loop_locatio
 extern bool vect_can_advance_ivs_p (loop_vec_info);
 
 /* In tree-vect-stmts.c.  */
-extern unsigned int current_vector_size;
+extern poly_uint64 current_vector_size;
 extern tree get_vectype_for_scalar_type (tree);
 extern tree get_mask_type_for_scalar_type (tree);
 extern tree get_same_sized_vectype (tree, tree);
Index: gcc/tree-vect-stmts.c
===================================================================
--- gcc/tree-vect-stmts.c	2017-10-23 17:22:28.837953732 +0100
+++ gcc/tree-vect-stmts.c	2017-10-23 17:22:32.730226813 +0100
@@ -9084,12 +9084,12 @@  free_stmt_vec_info (gimple *stmt)
    by the target.  */
 
 static tree
-get_vectype_for_scalar_type_and_size (tree scalar_type, unsigned size)
+get_vectype_for_scalar_type_and_size (tree scalar_type, poly_uint64 size)
 {
   tree orig_scalar_type = scalar_type;
   scalar_mode inner_mode;
   machine_mode simd_mode;
-  int nunits;
+  poly_uint64 nunits;
   tree vectype;
 
   if (!is_int_mode (TYPE_MODE (scalar_type), &inner_mode)
@@ -9131,13 +9131,13 @@  get_vectype_for_scalar_type_and_size (tr
 
   /* If no size was supplied use the mode the target prefers.   Otherwise
      lookup a vector mode of the specified size.  */
-  if (size == 0)
+  if (known_zero (size))
     simd_mode = targetm.vectorize.preferred_simd_mode (inner_mode);
-  else if (!mode_for_vector (inner_mode, size / nbytes).exists (&simd_mode))
+  else if (!multiple_p (size, nbytes, &nunits)
+	   || !mode_for_vector (inner_mode, nunits).exists (&simd_mode))
     return NULL_TREE;
-  nunits = GET_MODE_SIZE (simd_mode) / nbytes;
   /* NOTE: nunits == 1 is allowed to support single element vector types.  */
-  if (nunits < 1)
+  if (!multiple_p (GET_MODE_SIZE (simd_mode), nbytes, &nunits))
     return NULL_TREE;
 
   vectype = build_vector_type (scalar_type, nunits);
@@ -9155,7 +9155,7 @@  get_vectype_for_scalar_type_and_size (tr
   return vectype;
 }
 
-unsigned int current_vector_size;
+poly_uint64 current_vector_size;
 
 /* Function get_vectype_for_scalar_type.
 
@@ -9169,7 +9169,7 @@  get_vectype_for_scalar_type (tree scalar
   vectype = get_vectype_for_scalar_type_and_size (scalar_type,
 						  current_vector_size);
   if (vectype
-      && current_vector_size == 0)
+      && known_zero (current_vector_size))
     current_vector_size = GET_MODE_SIZE (TYPE_MODE (vectype));
   return vectype;
 }
Index: gcc/tree.h
===================================================================
--- gcc/tree.h	2017-10-23 17:22:21.308442966 +0100
+++ gcc/tree.h	2017-10-23 17:22:32.736226191 +0100
@@ -4108,7 +4108,13 @@  extern tree build_reference_type_for_mod
 extern tree build_reference_type (tree);
 extern tree build_vector_type_for_mode (tree, machine_mode);
 extern tree build_vector_type (tree innertype, int nunits);
-extern tree build_truth_vector_type (unsigned, unsigned);
+/* Temporary.  */
+inline tree
+build_vector_type (tree innertype, poly_uint64 nunits)
+{
+  return build_vector_type (innertype, (int) nunits.to_constant ());
+}
+extern tree build_truth_vector_type (poly_uint64, poly_uint64);
 extern tree build_same_sized_truth_vector_type (tree vectype);
 extern tree build_opaque_vector_type (tree innertype, int nunits);
 extern tree build_index_type (tree);
Index: gcc/tree.c
===================================================================
--- gcc/tree.c	2017-10-23 17:22:21.307442765 +0100
+++ gcc/tree.c	2017-10-23 17:22:32.734226398 +0100
@@ -9662,6 +9662,13 @@  make_vector_type (tree innertype, int nu
   return t;
 }
 
+/* Temporary.  */
+static tree
+make_vector_type (tree innertype, poly_uint64 nunits, machine_mode mode)
+{
+  return make_vector_type (innertype, (int) nunits.to_constant (), mode);
+}
+
 static tree
 make_or_reuse_type (unsigned size, int unsignedp)
 {
@@ -10559,19 +10566,18 @@  build_vector_type (tree innertype, int n
 /* Build truth vector with specified length and number of units.  */
 
 tree
-build_truth_vector_type (unsigned nunits, unsigned vector_size)
+build_truth_vector_type (poly_uint64 nunits, poly_uint64 vector_size)
 {
   machine_mode mask_mode
     = targetm.vectorize.get_mask_mode (nunits, vector_size).else_blk ();
 
-  unsigned HOST_WIDE_INT vsize;
+  poly_uint64 vsize;
   if (mask_mode == BLKmode)
     vsize = vector_size * BITS_PER_UNIT;
   else
     vsize = GET_MODE_BITSIZE (mask_mode);
 
-  unsigned HOST_WIDE_INT esize = vsize / nunits;
-  gcc_assert (esize * nunits == vsize);
+  unsigned HOST_WIDE_INT esize = vector_element_size (vsize, nunits);
 
   tree bool_type = build_nonstandard_boolean_type (esize);