diff mbox series

[11/n] Support vectorisation with mixed vector sizes

Message ID mpty2x98071.fsf@arm.com
State New
Headers show
Series [11/n] Support vectorisation with mixed vector sizes | expand

Commit Message

Richard Sandiford Oct. 25, 2019, 12:43 p.m. UTC
After previous patches, it's now possible to make the vectoriser
support multiple vector sizes in the same vector region, using
related_vector_mode to pick the right vector mode for a given
element mode.  No port yet takes advantage of this, but I have
a follow-on patch for AArch64.

This patch also seemed like a good opportunity to add some more dump
messages: one to make it clear which vector size/mode was being used
when analysis passed or failed, and another to say when we've decided
to skip a redundant vector size/mode.


2019-10-24  Richard Sandiford  <richard.sandiford@arm.com>

gcc/
	* machmode.h (opt_machine_mode::operator==): New function.
	(opt_machine_mode::operator!=): Likewise.
	* tree-vectorizer.h (vec_info::vector_mode): Update comment.
	(get_related_vectype_for_scalar_type): Delete.
	(get_vectype_for_scalar_type_and_size): Declare.
	* tree-vect-slp.c (vect_slp_bb_region): Print dump messages to say
	whether analysis passed or failed, and with what vector modes.
	Use related_vector_mode to check whether trying a particular
	vector mode would be redundant with the autodetected mode,
	and print a dump message if we decide to skip it.
	* tree-vect-loop.c (vect_analyze_loop): Likewise.
	(vect_create_epilog_for_reduction): Use
	get_related_vectype_for_scalar_type instead of
	get_vectype_for_scalar_type_and_size.
	* tree-vect-stmts.c (get_vectype_for_scalar_type_and_size): Replace
	with...
	(get_related_vectype_for_scalar_type): ...this new function.
	Take a starting/"prevailing" vector mode rather than a vector size.
	Take an optional nunits argument, with the same meaning as for
	related_vector_mode.  Use related_vector_mode when not
	auto-detecting a mode, falling back to mode_for_vector if no
	target mode exists.
	(get_vectype_for_scalar_type): Update accordingly.
	(get_same_sized_vectype): Likewise.
	* tree-vectorizer.c (get_vec_alignment_for_array_type): Likewise.

Comments

Richard Biener Nov. 5, 2019, 12:56 p.m. UTC | #1
On Fri, Oct 25, 2019 at 2:43 PM Richard Sandiford
<richard.sandiford@arm.com> wrote:
>
> After previous patches, it's now possible to make the vectoriser
> support multiple vector sizes in the same vector region, using
> related_vector_mode to pick the right vector mode for a given
> element mode.  No port yet takes advantage of this, but I have
> a follow-on patch for AArch64.
>
> This patch also seemed like a good opportunity to add some more dump
> messages: one to make it clear which vector size/mode was being used
> when analysis passed or failed, and another to say when we've decided
> to skip a redundant vector size/mode.

OK.

I wonder if, when we requested a specific size previously, we now
have to verify we got that constraint satisfied after the change.
Esp. the epilogue vectorization cases want to get V2DI
from V4DI.

          sz /= 2;
-         vectype1 = get_vectype_for_scalar_type_and_size (scalar_type, sz);
+         vectype1 = get_related_vectype_for_scalar_type (TYPE_MODE (vectype),
+                                                         scalar_type,
+                                                         sz / scalar_bytes);

doesn't look like an improvement in readability to me there.  Maybe
re-formulating
the whole code in terms of lanes instead of size would make it easier to follow?

Thanks,
Richard.

>
> 2019-10-24  Richard Sandiford  <richard.sandiford@arm.com>
>
> gcc/
>         * machmode.h (opt_machine_mode::operator==): New function.
>         (opt_machine_mode::operator!=): Likewise.
>         * tree-vectorizer.h (vec_info::vector_mode): Update comment.
>         (get_related_vectype_for_scalar_type): Delete.
>         (get_vectype_for_scalar_type_and_size): Declare.
>         * tree-vect-slp.c (vect_slp_bb_region): Print dump messages to say
>         whether analysis passed or failed, and with what vector modes.
>         Use related_vector_mode to check whether trying a particular
>         vector mode would be redundant with the autodetected mode,
>         and print a dump message if we decide to skip it.
>         * tree-vect-loop.c (vect_analyze_loop): Likewise.
>         (vect_create_epilog_for_reduction): Use
>         get_related_vectype_for_scalar_type instead of
>         get_vectype_for_scalar_type_and_size.
>         * tree-vect-stmts.c (get_vectype_for_scalar_type_and_size): Replace
>         with...
>         (get_related_vectype_for_scalar_type): ...this new function.
>         Take a starting/"prevailing" vector mode rather than a vector size.
>         Take an optional nunits argument, with the same meaning as for
>         related_vector_mode.  Use related_vector_mode when not
>         auto-detecting a mode, falling back to mode_for_vector if no
>         target mode exists.
>         (get_vectype_for_scalar_type): Update accordingly.
>         (get_same_sized_vectype): Likewise.
>         * tree-vectorizer.c (get_vec_alignment_for_array_type): Likewise.
>
> Index: gcc/machmode.h
> ===================================================================
> --- gcc/machmode.h      2019-10-25 13:26:59.053879364 +0100
> +++ gcc/machmode.h      2019-10-25 13:27:26.201687539 +0100
> @@ -258,6 +258,9 @@ #define CLASS_HAS_WIDER_MODES_P(CLASS)
>    bool exists () const;
>    template<typename U> bool exists (U *) const;
>
> +  bool operator== (const T &m) const { return m_mode == m; }
> +  bool operator!= (const T &m) const { return m_mode != m; }
> +
>  private:
>    machine_mode m_mode;
>  };
> Index: gcc/tree-vectorizer.h
> ===================================================================
> --- gcc/tree-vectorizer.h       2019-10-25 13:27:19.317736181 +0100
> +++ gcc/tree-vectorizer.h       2019-10-25 13:27:26.209687483 +0100
> @@ -329,8 +329,9 @@ typedef std::pair<tree, tree> vec_object
>    /* Cost data used by the target cost model.  */
>    void *target_cost_data;
>
> -  /* If we've chosen a vector size for this vectorization region,
> -     this is one mode that has such a size, otherwise it is VOIDmode.  */
> +  /* The argument we should pass to related_vector_mode when looking up
> +     the vector mode for a scalar mode, or VOIDmode if we haven't yet
> +     made any decisions about which vector modes to use.  */
>    machine_mode vector_mode;
>
>  private:
> @@ -1595,8 +1596,9 @@ extern dump_user_location_t find_loop_lo
>  extern bool vect_can_advance_ivs_p (loop_vec_info);
>
>  /* In tree-vect-stmts.c.  */
> +extern tree get_related_vectype_for_scalar_type (machine_mode, tree,
> +                                                poly_uint64 = 0);
>  extern tree get_vectype_for_scalar_type (vec_info *, tree);
> -extern tree get_vectype_for_scalar_type_and_size (tree, poly_uint64);
>  extern tree get_mask_type_for_scalar_type (vec_info *, tree);
>  extern tree get_same_sized_vectype (tree, tree);
>  extern bool vect_get_loop_mask_type (loop_vec_info);
> Index: gcc/tree-vect-slp.c
> ===================================================================
> --- gcc/tree-vect-slp.c 2019-10-25 13:27:19.313736209 +0100
> +++ gcc/tree-vect-slp.c 2019-10-25 13:27:26.205687511 +0100
> @@ -3118,7 +3118,12 @@ vect_slp_bb_region (gimple_stmt_iterator
>           && dbg_cnt (vect_slp))
>         {
>           if (dump_enabled_p ())
> -           dump_printf_loc (MSG_NOTE, vect_location, "SLPing BB part\n");
> +           {
> +             dump_printf_loc (MSG_NOTE, vect_location,
> +                              "***** Analysis succeeded with vector mode"
> +                              " %s\n", GET_MODE_NAME (bb_vinfo->vector_mode));
> +             dump_printf_loc (MSG_NOTE, vect_location, "SLPing BB part\n");
> +           }
>
>           bb_vinfo->shared->check_datarefs ();
>           vect_schedule_slp (bb_vinfo);
> @@ -3138,6 +3143,13 @@ vect_slp_bb_region (gimple_stmt_iterator
>
>           vectorized = true;
>         }
> +      else
> +       {
> +         if (dump_enabled_p ())
> +           dump_printf_loc (MSG_NOTE, vect_location,
> +                            "***** Analysis failed with vector mode %s\n",
> +                            GET_MODE_NAME (bb_vinfo->vector_mode));
> +       }
>
>        if (mode_i == 0)
>         autodetected_vector_mode = bb_vinfo->vector_mode;
> @@ -3145,9 +3157,22 @@ vect_slp_bb_region (gimple_stmt_iterator
>        delete bb_vinfo;
>
>        if (mode_i < vector_modes.length ()
> -         && known_eq (GET_MODE_SIZE (vector_modes[mode_i]),
> -                      GET_MODE_SIZE (autodetected_vector_mode)))
> -       mode_i += 1;
> +         && VECTOR_MODE_P (autodetected_vector_mode)
> +         && (related_vector_mode (vector_modes[mode_i],
> +                                  GET_MODE_INNER (autodetected_vector_mode))
> +             == autodetected_vector_mode)
> +         && (related_vector_mode (autodetected_vector_mode,
> +                                  GET_MODE_INNER (vector_modes[mode_i]))
> +             == vector_modes[mode_i]))
> +       {
> +         if (dump_enabled_p ())
> +           dump_printf_loc (MSG_NOTE, vect_location,
> +                            "***** Skipping vector mode %s, which would"
> +                            " repeat the analysis for %s\n",
> +                            GET_MODE_NAME (vector_modes[mode_i]),
> +                            GET_MODE_NAME (autodetected_vector_mode));
> +         mode_i += 1;
> +       }
>
>        if (vectorized
>           || mode_i == vector_modes.length ()
> Index: gcc/tree-vect-loop.c
> ===================================================================
> --- gcc/tree-vect-loop.c        2019-10-25 13:27:19.309736237 +0100
> +++ gcc/tree-vect-loop.c        2019-10-25 13:27:26.201687539 +0100
> @@ -2367,6 +2367,17 @@ vect_analyze_loop (class loop *loop, loo
>        opt_result res = vect_analyze_loop_2 (loop_vinfo, fatal, &n_stmts);
>        if (mode_i == 0)
>         autodetected_vector_mode = loop_vinfo->vector_mode;
> +      if (dump_enabled_p ())
> +       {
> +         if (res)
> +           dump_printf_loc (MSG_NOTE, vect_location,
> +                            "***** Analysis succeeded with vector mode %s\n",
> +                            GET_MODE_NAME (loop_vinfo->vector_mode));
> +         else
> +           dump_printf_loc (MSG_NOTE, vect_location,
> +                            "***** Analysis failed with vector mode %s\n",
> +                            GET_MODE_NAME (loop_vinfo->vector_mode));
> +       }
>
>        if (res)
>         {
> @@ -2400,9 +2411,22 @@ vect_analyze_loop (class loop *loop, loo
>         }
>
>        if (mode_i < vector_modes.length ()
> -         && known_eq (GET_MODE_SIZE (vector_modes[mode_i]),
> -                      GET_MODE_SIZE (autodetected_vector_mode)))
> -       mode_i += 1;
> +         && VECTOR_MODE_P (autodetected_vector_mode)
> +         && (related_vector_mode (vector_modes[mode_i],
> +                                  GET_MODE_INNER (autodetected_vector_mode))
> +             == autodetected_vector_mode)
> +         && (related_vector_mode (autodetected_vector_mode,
> +                                  GET_MODE_INNER (vector_modes[mode_i]))
> +             == vector_modes[mode_i]))
> +       {
> +         if (dump_enabled_p ())
> +           dump_printf_loc (MSG_NOTE, vect_location,
> +                            "***** Skipping vector mode %s, which would"
> +                            " repeat the analysis for %s\n",
> +                            GET_MODE_NAME (vector_modes[mode_i]),
> +                            GET_MODE_NAME (autodetected_vector_mode));
> +         mode_i += 1;
> +       }
>
>        if (mode_i == vector_modes.length ()
>           || autodetected_vector_mode == VOIDmode)
> @@ -4763,7 +4787,10 @@ vect_create_epilog_for_reduction (stmt_v
>           && (mode1 = targetm.vectorize.split_reduction (mode)) != mode)
>         sz1 = GET_MODE_SIZE (mode1).to_constant ();
>
> -      tree vectype1 = get_vectype_for_scalar_type_and_size (scalar_type, sz1);
> +      unsigned int scalar_bytes = tree_to_uhwi (TYPE_SIZE_UNIT (scalar_type));
> +      tree vectype1 = get_related_vectype_for_scalar_type (TYPE_MODE (vectype),
> +                                                          scalar_type,
> +                                                          sz1 / scalar_bytes);
>        reduce_with_shift = have_whole_vector_shift (mode1);
>        if (!VECTOR_MODE_P (mode1))
>         reduce_with_shift = false;
> @@ -4781,7 +4808,9 @@ vect_create_epilog_for_reduction (stmt_v
>         {
>           gcc_assert (!slp_reduc);
>           sz /= 2;
> -         vectype1 = get_vectype_for_scalar_type_and_size (scalar_type, sz);
> +         vectype1 = get_related_vectype_for_scalar_type (TYPE_MODE (vectype),
> +                                                         scalar_type,
> +                                                         sz / scalar_bytes);
>
>           /* The target has to make sure we support lowpart/highpart
>              extraction, either via direct vector extract or through
> Index: gcc/tree-vect-stmts.c
> ===================================================================
> --- gcc/tree-vect-stmts.c       2019-10-25 13:27:22.985710263 +0100
> +++ gcc/tree-vect-stmts.c       2019-10-25 13:27:26.205687511 +0100
> @@ -11111,18 +11111,28 @@ vect_remove_stores (stmt_vec_info first_
>      }
>  }
>
> -/* Function get_vectype_for_scalar_type_and_size.
> -
> -   Returns the vector type corresponding to SCALAR_TYPE  and SIZE as supported
> -   by the target.  */
> +/* If NUNITS is nonzero, return a vector type that contains NUNITS
> +   elements of type SCALAR_TYPE, or null if the target doesn't support
> +   such a type.
> +
> +   If NUNITS is zero, return a vector type that contains elements of
> +   type SCALAR_TYPE, choosing whichever vector size the target prefers.
> +
> +   If PREVAILING_MODE is VOIDmode, we have not yet chosen a vector mode
> +   for this vectorization region and want to "autodetect" the best choice.
> +   Otherwise, PREVAILING_MODE is a previously-chosen vector TYPE_MODE
> +   and we want the new type to be interoperable with it.   PREVAILING_MODE
> +   in this case can be a scalar integer mode or a vector mode; when it
> +   is a vector mode, the function acts like a tree-level version of
> +   related_vector_mode.  */
>
>  tree
> -get_vectype_for_scalar_type_and_size (tree scalar_type, poly_uint64 size)
> +get_related_vectype_for_scalar_type (machine_mode prevailing_mode,
> +                                    tree scalar_type, poly_uint64 nunits)
>  {
>    tree orig_scalar_type = scalar_type;
>    scalar_mode inner_mode;
>    machine_mode simd_mode;
> -  poly_uint64 nunits;
>    tree vectype;
>
>    if (!is_int_mode (TYPE_MODE (scalar_type), &inner_mode)
> @@ -11162,10 +11172,11 @@ get_vectype_for_scalar_type_and_size (tr
>    if (scalar_type == NULL_TREE)
>      return NULL_TREE;
>
> -  /* If no size was supplied use the mode the target prefers.   Otherwise
> -     lookup a vector mode of the specified size.  */
> -  if (known_eq (size, 0U))
> +  /* If no prevailing mode was supplied, use the mode the target prefers.
> +     Otherwise lookup a vector mode based on the prevailing mode.  */
> +  if (prevailing_mode == VOIDmode)
>      {
> +      gcc_assert (known_eq (nunits, 0U));
>        simd_mode = targetm.vectorize.preferred_simd_mode (inner_mode);
>        if (SCALAR_INT_MODE_P (simd_mode))
>         {
> @@ -11181,9 +11192,19 @@ get_vectype_for_scalar_type_and_size (tr
>             return NULL_TREE;
>         }
>      }
> -  else if (!multiple_p (size, nbytes, &nunits)
> -          || !mode_for_vector (inner_mode, nunits).exists (&simd_mode))
> -    return NULL_TREE;
> +  else if (SCALAR_INT_MODE_P (prevailing_mode)
> +          || !related_vector_mode (prevailing_mode,
> +                                   inner_mode, nunits).exists (&simd_mode))
> +    {
> +      /* Fall back to using mode_for_vector, mostly in the hope of being
> +        able to use an integer mode.  */
> +      if (known_eq (nunits, 0U)
> +         && !multiple_p (GET_MODE_SIZE (prevailing_mode), nbytes, &nunits))
> +       return NULL_TREE;
> +
> +      if (!mode_for_vector (inner_mode, nunits).exists (&simd_mode))
> +       return NULL_TREE;
> +    }
>
>    vectype = build_vector_type_for_mode (scalar_type, simd_mode);
>
> @@ -11211,9 +11232,8 @@ get_vectype_for_scalar_type_and_size (tr
>  tree
>  get_vectype_for_scalar_type (vec_info *vinfo, tree scalar_type)
>  {
> -  tree vectype;
> -  poly_uint64 vector_size = GET_MODE_SIZE (vinfo->vector_mode);
> -  vectype = get_vectype_for_scalar_type_and_size (scalar_type, vector_size);
> +  tree vectype = get_related_vectype_for_scalar_type (vinfo->vector_mode,
> +                                                     scalar_type);
>    if (vectype && vinfo->vector_mode == VOIDmode)
>      vinfo->vector_mode = TYPE_MODE (vectype);
>    return vectype;
> @@ -11246,8 +11266,13 @@ get_same_sized_vectype (tree scalar_type
>    if (VECT_SCALAR_BOOLEAN_TYPE_P (scalar_type))
>      return truth_type_for (vector_type);
>
> -  return get_vectype_for_scalar_type_and_size
> -          (scalar_type, GET_MODE_SIZE (TYPE_MODE (vector_type)));
> +  poly_uint64 nunits;
> +  if (!multiple_p (GET_MODE_SIZE (TYPE_MODE (vector_type)),
> +                  GET_MODE_SIZE (TYPE_MODE (scalar_type)), &nunits))
> +    return NULL_TREE;
> +
> +  return get_related_vectype_for_scalar_type (TYPE_MODE (vector_type),
> +                                             scalar_type, nunits);
>  }
>
>  /* Function vect_is_simple_use.
> Index: gcc/tree-vectorizer.c
> ===================================================================
> --- gcc/tree-vectorizer.c       2019-10-25 13:27:19.317736181 +0100
> +++ gcc/tree-vectorizer.c       2019-10-25 13:27:26.209687483 +0100
> @@ -1348,7 +1348,7 @@ get_vec_alignment_for_array_type (tree t
>    poly_uint64 array_size, vector_size;
>
>    tree scalar_type = strip_array_types (type);
> -  tree vectype = get_vectype_for_scalar_type_and_size (scalar_type, 0);
> +  tree vectype = get_related_vectype_for_scalar_type (VOIDmode, scalar_type);
>    if (!vectype
>        || !poly_int_tree_p (TYPE_SIZE (type), &array_size)
>        || !poly_int_tree_p (TYPE_SIZE (vectype), &vector_size)
Richard Sandiford Nov. 6, 2019, 12:38 p.m. UTC | #2
Richard Biener <richard.guenther@gmail.com> writes:
> On Fri, Oct 25, 2019 at 2:43 PM Richard Sandiford
> <richard.sandiford@arm.com> wrote:
>>
>> After previous patches, it's now possible to make the vectoriser
>> support multiple vector sizes in the same vector region, using
>> related_vector_mode to pick the right vector mode for a given
>> element mode.  No port yet takes advantage of this, but I have
>> a follow-on patch for AArch64.
>>
>> This patch also seemed like a good opportunity to add some more dump
>> messages: one to make it clear which vector size/mode was being used
>> when analysis passed or failed, and another to say when we've decided
>> to skip a redundant vector size/mode.
>
> OK.
>
> I wonder if, when we requested a specific size previously, we now
> have to verify we got that constraint satisfied after the change.
> Esp. the epilogue vectorization cases want to get V2DI
> from V4DI.
>
>           sz /= 2;
> -         vectype1 = get_vectype_for_scalar_type_and_size (scalar_type, sz);
> +         vectype1 = get_related_vectype_for_scalar_type (TYPE_MODE (vectype),
> +                                                         scalar_type,
> +                                                         sz / scalar_bytes);
>
> doesn't look like an improvement in readability to me there.

Yeah, guess it isn't great.

> Maybe re-formulating the whole code in terms of lanes instead of size
> would make it easier to follow?

OK, how about this version?  It still won't win awards, but it's at
least a bit more readable.

Tested as before.

Richard


2019-11-06  Richard Sandiford  <richard.sandiford@arm.com>

gcc/
	* machmode.h (opt_machine_mode::operator==): New function.
	(opt_machine_mode::operator!=): Likewise.
	* tree-vectorizer.h (vec_info::vector_mode): Update comment.
	(get_related_vectype_for_scalar_type): Delete.
	(get_vectype_for_scalar_type_and_size): Declare.
	* tree-vect-slp.c (vect_slp_bb_region): Print dump messages to say
	whether analysis passed or failed, and with what vector modes.
	Use related_vector_mode to check whether trying a particular
	vector mode would be redundant with the autodetected mode,
	and print a dump message if we decide to skip it.
	* tree-vect-loop.c (vect_analyze_loop): Likewise.
	(vect_create_epilog_for_reduction): Use
	get_related_vectype_for_scalar_type instead of
	get_vectype_for_scalar_type_and_size.
	* tree-vect-stmts.c (get_vectype_for_scalar_type_and_size): Replace
	with...
	(get_related_vectype_for_scalar_type): ...this new function.
	Take a starting/"prevailing" vector mode rather than a vector size.
	Take an optional nunits argument, with the same meaning as for
	related_vector_mode.  Use related_vector_mode when not
	auto-detecting a mode, falling back to mode_for_vector if no
	target mode exists.
	(get_vectype_for_scalar_type): Update accordingly.
	(get_same_sized_vectype): Likewise.
	* tree-vectorizer.c (get_vec_alignment_for_array_type): Likewise.

Index: gcc/machmode.h
===================================================================
--- gcc/machmode.h	2019-11-06 12:35:12.460201615 +0000
+++ gcc/machmode.h	2019-11-06 12:35:27.972093472 +0000
@@ -258,6 +258,9 @@ #define CLASS_HAS_WIDER_MODES_P(CLASS)
   bool exists () const;
   template<typename U> bool exists (U *) const;
 
+  bool operator== (const T &m) const { return m_mode == m; }
+  bool operator!= (const T &m) const { return m_mode != m; }
+
 private:
   machine_mode m_mode;
 };
Index: gcc/tree-vectorizer.h
===================================================================
--- gcc/tree-vectorizer.h	2019-11-06 12:35:12.764199495 +0000
+++ gcc/tree-vectorizer.h	2019-11-06 12:35:27.976093444 +0000
@@ -335,8 +335,9 @@ typedef std::pair<tree, tree> vec_object
   /* Cost data used by the target cost model.  */
   void *target_cost_data;
 
-  /* If we've chosen a vector size for this vectorization region,
-     this is one mode that has such a size, otherwise it is VOIDmode.  */
+  /* The argument we should pass to related_vector_mode when looking up
+     the vector mode for a scalar mode, or VOIDmode if we haven't yet
+     made any decisions about which vector modes to use.  */
   machine_mode vector_mode;
 
 private:
@@ -1609,8 +1610,9 @@ extern bool vect_can_advance_ivs_p (loop
 extern void vect_update_inits_of_drs (loop_vec_info, tree, tree_code);
 
 /* In tree-vect-stmts.c.  */
+extern tree get_related_vectype_for_scalar_type (machine_mode, tree,
+						 poly_uint64 = 0);
 extern tree get_vectype_for_scalar_type (vec_info *, tree);
-extern tree get_vectype_for_scalar_type_and_size (tree, poly_uint64);
 extern tree get_mask_type_for_scalar_type (vec_info *, tree);
 extern tree get_same_sized_vectype (tree, tree);
 extern bool vect_get_loop_mask_type (loop_vec_info);
Index: gcc/tree-vect-slp.c
===================================================================
--- gcc/tree-vect-slp.c	2019-11-06 12:35:12.760199523 +0000
+++ gcc/tree-vect-slp.c	2019-11-06 12:35:27.972093472 +0000
@@ -3202,7 +3202,12 @@ vect_slp_bb_region (gimple_stmt_iterator
 	  && dbg_cnt (vect_slp))
 	{
 	  if (dump_enabled_p ())
-	    dump_printf_loc (MSG_NOTE, vect_location, "SLPing BB part\n");
+	    {
+	      dump_printf_loc (MSG_NOTE, vect_location,
+			       "***** Analysis succeeded with vector mode"
+			       " %s\n", GET_MODE_NAME (bb_vinfo->vector_mode));
+	      dump_printf_loc (MSG_NOTE, vect_location, "SLPing BB part\n");
+	    }
 
 	  bb_vinfo->shared->check_datarefs ();
 	  vect_schedule_slp (bb_vinfo);
@@ -3222,6 +3227,13 @@ vect_slp_bb_region (gimple_stmt_iterator
 
 	  vectorized = true;
 	}
+      else
+	{
+	  if (dump_enabled_p ())
+	    dump_printf_loc (MSG_NOTE, vect_location,
+			     "***** Analysis failed with vector mode %s\n",
+			     GET_MODE_NAME (bb_vinfo->vector_mode));
+	}
 
       if (mode_i == 0)
 	autodetected_vector_mode = bb_vinfo->vector_mode;
@@ -3229,9 +3241,22 @@ vect_slp_bb_region (gimple_stmt_iterator
       delete bb_vinfo;
 
       if (mode_i < vector_modes.length ()
-	  && known_eq (GET_MODE_SIZE (vector_modes[mode_i]),
-		       GET_MODE_SIZE (autodetected_vector_mode)))
-	mode_i += 1;
+	  && VECTOR_MODE_P (autodetected_vector_mode)
+	  && (related_vector_mode (vector_modes[mode_i],
+				   GET_MODE_INNER (autodetected_vector_mode))
+	      == autodetected_vector_mode)
+	  && (related_vector_mode (autodetected_vector_mode,
+				   GET_MODE_INNER (vector_modes[mode_i]))
+	      == vector_modes[mode_i]))
+	{
+	  if (dump_enabled_p ())
+	    dump_printf_loc (MSG_NOTE, vect_location,
+			     "***** Skipping vector mode %s, which would"
+			     " repeat the analysis for %s\n",
+			     GET_MODE_NAME (vector_modes[mode_i]),
+			     GET_MODE_NAME (autodetected_vector_mode));
+	  mode_i += 1;
+	}
 
       if (vectorized
 	  || mode_i == vector_modes.length ()
Index: gcc/tree-vect-loop.c
===================================================================
--- gcc/tree-vect-loop.c	2019-11-06 12:35:12.756199552 +0000
+++ gcc/tree-vect-loop.c	2019-11-06 12:35:27.972093472 +0000
@@ -2417,6 +2417,17 @@ vect_analyze_loop (class loop *loop, vec
       res = vect_analyze_loop_2 (loop_vinfo, fatal, &n_stmts);
       if (mode_i == 0)
 	autodetected_vector_mode = loop_vinfo->vector_mode;
+      if (dump_enabled_p ())
+	{
+	  if (res)
+	    dump_printf_loc (MSG_NOTE, vect_location,
+			     "***** Analysis succeeded with vector mode %s\n",
+			     GET_MODE_NAME (loop_vinfo->vector_mode));
+	  else
+	    dump_printf_loc (MSG_NOTE, vect_location,
+			     "***** Analysis failed with vector mode %s\n",
+			     GET_MODE_NAME (loop_vinfo->vector_mode));
+	}
 
       loop->aux = NULL;
       if (res)
@@ -2479,9 +2490,22 @@ vect_analyze_loop (class loop *loop, vec
 	}
 
       if (mode_i < vector_modes.length ()
-	  && known_eq (GET_MODE_SIZE (vector_modes[mode_i]),
-		       GET_MODE_SIZE (autodetected_vector_mode)))
-	mode_i += 1;
+	  && VECTOR_MODE_P (autodetected_vector_mode)
+	  && (related_vector_mode (vector_modes[mode_i],
+				   GET_MODE_INNER (autodetected_vector_mode))
+	      == autodetected_vector_mode)
+	  && (related_vector_mode (autodetected_vector_mode,
+				   GET_MODE_INNER (vector_modes[mode_i]))
+	      == vector_modes[mode_i]))
+	{
+	  if (dump_enabled_p ())
+	    dump_printf_loc (MSG_NOTE, vect_location,
+			     "***** Skipping vector mode %s, which would"
+			     " repeat the analysis for %s\n",
+			     GET_MODE_NAME (vector_modes[mode_i]),
+			     GET_MODE_NAME (autodetected_vector_mode));
+	  mode_i += 1;
+	}
 
       if (mode_i == vector_modes.length ()
 	  || autodetected_vector_mode == VOIDmode)
@@ -4870,13 +4894,15 @@ vect_create_epilog_for_reduction (stmt_v
 	 in a vector mode of smaller size and first reduce upper/lower
 	 halves against each other.  */
       enum machine_mode mode1 = mode;
-      unsigned sz = tree_to_uhwi (TYPE_SIZE_UNIT (vectype));
-      unsigned sz1 = sz;
+      unsigned nunits = TYPE_VECTOR_SUBPARTS (vectype).to_constant ();
+      unsigned nunits1 = nunits;
       if (!slp_reduc
 	  && (mode1 = targetm.vectorize.split_reduction (mode)) != mode)
-	sz1 = GET_MODE_SIZE (mode1).to_constant ();
+	nunits1 = GET_MODE_NUNITS (mode1).to_constant ();
 
-      tree vectype1 = get_vectype_for_scalar_type_and_size (scalar_type, sz1);
+      tree vectype1 = get_related_vectype_for_scalar_type (TYPE_MODE (vectype),
+							   scalar_type,
+							   nunits1);
       reduce_with_shift = have_whole_vector_shift (mode1);
       if (!VECTOR_MODE_P (mode1))
 	reduce_with_shift = false;
@@ -4890,11 +4916,13 @@ vect_create_epilog_for_reduction (stmt_v
       /* First reduce the vector to the desired vector size we should
 	 do shift reduction on by combining upper and lower halves.  */
       new_temp = new_phi_result;
-      while (sz > sz1)
+      while (nunits > nunits1)
 	{
 	  gcc_assert (!slp_reduc);
-	  sz /= 2;
-	  vectype1 = get_vectype_for_scalar_type_and_size (scalar_type, sz);
+	  nunits /= 2;
+	  vectype1 = get_related_vectype_for_scalar_type (TYPE_MODE (vectype),
+							  scalar_type, nunits);
+	  unsigned int bitsize = tree_to_uhwi (TYPE_SIZE (vectype1));
 
 	  /* The target has to make sure we support lowpart/highpart
 	     extraction, either via direct vector extract or through
@@ -4919,15 +4947,14 @@ vect_create_epilog_for_reduction (stmt_v
 		  = gimple_build_assign (dst2, BIT_FIELD_REF,
 					 build3 (BIT_FIELD_REF, vectype1,
 						 new_temp, TYPE_SIZE (vectype1),
-						 bitsize_int (sz * BITS_PER_UNIT)));
+						 bitsize_int (bitsize)));
 	      gsi_insert_before (&exit_gsi, epilog_stmt, GSI_SAME_STMT);
 	    }
 	  else
 	    {
 	      /* Extract via punning to appropriately sized integer mode
 		 vector.  */
-	      tree eltype = build_nonstandard_integer_type (sz * BITS_PER_UNIT,
-							    1);
+	      tree eltype = build_nonstandard_integer_type (bitsize, 1);
 	      tree etype = build_vector_type (eltype, 2);
 	      gcc_assert (convert_optab_handler (vec_extract_optab,
 						 TYPE_MODE (etype),
@@ -4956,7 +4983,7 @@ vect_create_epilog_for_reduction (stmt_v
 		  = gimple_build_assign (tem, BIT_FIELD_REF,
 					 build3 (BIT_FIELD_REF, eltype,
 						 new_temp, TYPE_SIZE (eltype),
-						 bitsize_int (sz * BITS_PER_UNIT)));
+						 bitsize_int (bitsize)));
 	      gsi_insert_before (&exit_gsi, epilog_stmt, GSI_SAME_STMT);
 	      dst2 =  make_ssa_name (vectype1);
 	      epilog_stmt = gimple_build_assign (dst2, VIEW_CONVERT_EXPR,
Index: gcc/tree-vect-stmts.c
===================================================================
--- gcc/tree-vect-stmts.c	2019-11-06 12:35:12.796199272 +0000
+++ gcc/tree-vect-stmts.c	2019-11-06 12:35:27.976093444 +0000
@@ -11097,18 +11097,28 @@ vect_remove_stores (stmt_vec_info first_
     }
 }
 
-/* Function get_vectype_for_scalar_type_and_size.
-
-   Returns the vector type corresponding to SCALAR_TYPE  and SIZE as supported
-   by the target.  */
+/* If NUNITS is nonzero, return a vector type that contains NUNITS
+   elements of type SCALAR_TYPE, or null if the target doesn't support
+   such a type.
+
+   If NUNITS is zero, return a vector type that contains elements of
+   type SCALAR_TYPE, choosing whichever vector size the target prefers.
+
+   If PREVAILING_MODE is VOIDmode, we have not yet chosen a vector mode
+   for this vectorization region and want to "autodetect" the best choice.
+   Otherwise, PREVAILING_MODE is a previously-chosen vector TYPE_MODE
+   and we want the new type to be interoperable with it.   PREVAILING_MODE
+   in this case can be a scalar integer mode or a vector mode; when it
+   is a vector mode, the function acts like a tree-level version of
+   related_vector_mode.  */
 
 tree
-get_vectype_for_scalar_type_and_size (tree scalar_type, poly_uint64 size)
+get_related_vectype_for_scalar_type (machine_mode prevailing_mode,
+				     tree scalar_type, poly_uint64 nunits)
 {
   tree orig_scalar_type = scalar_type;
   scalar_mode inner_mode;
   machine_mode simd_mode;
-  poly_uint64 nunits;
   tree vectype;
 
   if (!is_int_mode (TYPE_MODE (scalar_type), &inner_mode)
@@ -11148,10 +11158,11 @@ get_vectype_for_scalar_type_and_size (tr
   if (scalar_type == NULL_TREE)
     return NULL_TREE;
 
-  /* If no size was supplied use the mode the target prefers.   Otherwise
-     lookup a vector mode of the specified size.  */
-  if (known_eq (size, 0U))
+  /* If no prevailing mode was supplied, use the mode the target prefers.
+     Otherwise lookup a vector mode based on the prevailing mode.  */
+  if (prevailing_mode == VOIDmode)
     {
+      gcc_assert (known_eq (nunits, 0U));
       simd_mode = targetm.vectorize.preferred_simd_mode (inner_mode);
       if (SCALAR_INT_MODE_P (simd_mode))
 	{
@@ -11167,9 +11178,19 @@ get_vectype_for_scalar_type_and_size (tr
 	    return NULL_TREE;
 	}
     }
-  else if (!multiple_p (size, nbytes, &nunits)
-	   || !mode_for_vector (inner_mode, nunits).exists (&simd_mode))
-    return NULL_TREE;
+  else if (SCALAR_INT_MODE_P (prevailing_mode)
+	   || !related_vector_mode (prevailing_mode,
+				    inner_mode, nunits).exists (&simd_mode))
+    {
+      /* Fall back to using mode_for_vector, mostly in the hope of being
+	 able to use an integer mode.  */
+      if (known_eq (nunits, 0U)
+	  && !multiple_p (GET_MODE_SIZE (prevailing_mode), nbytes, &nunits))
+	return NULL_TREE;
+
+      if (!mode_for_vector (inner_mode, nunits).exists (&simd_mode))
+	return NULL_TREE;
+    }
 
   vectype = build_vector_type_for_mode (scalar_type, simd_mode);
 
@@ -11197,9 +11218,8 @@ get_vectype_for_scalar_type_and_size (tr
 tree
 get_vectype_for_scalar_type (vec_info *vinfo, tree scalar_type)
 {
-  tree vectype;
-  poly_uint64 vector_size = GET_MODE_SIZE (vinfo->vector_mode);
-  vectype = get_vectype_for_scalar_type_and_size (scalar_type, vector_size);
+  tree vectype = get_related_vectype_for_scalar_type (vinfo->vector_mode,
+						      scalar_type);
   if (vectype && vinfo->vector_mode == VOIDmode)
     vinfo->vector_mode = TYPE_MODE (vectype);
   return vectype;
@@ -11232,8 +11252,13 @@ get_same_sized_vectype (tree scalar_type
   if (VECT_SCALAR_BOOLEAN_TYPE_P (scalar_type))
     return truth_type_for (vector_type);
 
-  return get_vectype_for_scalar_type_and_size
-	   (scalar_type, GET_MODE_SIZE (TYPE_MODE (vector_type)));
+  poly_uint64 nunits;
+  if (!multiple_p (GET_MODE_SIZE (TYPE_MODE (vector_type)),
+		   GET_MODE_SIZE (TYPE_MODE (scalar_type)), &nunits))
+    return NULL_TREE;
+
+  return get_related_vectype_for_scalar_type (TYPE_MODE (vector_type),
+					      scalar_type, nunits);
 }
 
 /* Function vect_is_simple_use.
Index: gcc/tree-vectorizer.c
===================================================================
--- gcc/tree-vectorizer.c	2019-11-06 12:35:12.764199495 +0000
+++ gcc/tree-vectorizer.c	2019-11-06 12:35:27.976093444 +0000
@@ -1359,7 +1359,7 @@ get_vec_alignment_for_array_type (tree t
   poly_uint64 array_size, vector_size;
 
   tree scalar_type = strip_array_types (type);
-  tree vectype = get_vectype_for_scalar_type_and_size (scalar_type, 0);
+  tree vectype = get_related_vectype_for_scalar_type (VOIDmode, scalar_type);
   if (!vectype
       || !poly_int_tree_p (TYPE_SIZE (type), &array_size)
       || !poly_int_tree_p (TYPE_SIZE (vectype), &vector_size)
Richard Biener Nov. 12, 2019, 9:22 a.m. UTC | #3
On Wed, Nov 6, 2019 at 1:38 PM Richard Sandiford
<richard.sandiford@arm.com> wrote:
>
> Richard Biener <richard.guenther@gmail.com> writes:
> > On Fri, Oct 25, 2019 at 2:43 PM Richard Sandiford
> > <richard.sandiford@arm.com> wrote:
> >>
> >> After previous patches, it's now possible to make the vectoriser
> >> support multiple vector sizes in the same vector region, using
> >> related_vector_mode to pick the right vector mode for a given
> >> element mode.  No port yet takes advantage of this, but I have
> >> a follow-on patch for AArch64.
> >>
> >> This patch also seemed like a good opportunity to add some more dump
> >> messages: one to make it clear which vector size/mode was being used
> >> when analysis passed or failed, and another to say when we've decided
> >> to skip a redundant vector size/mode.
> >
> > OK.
> >
> > I wonder if, when we requested a specific size previously, we now
> > have to verify we got that constraint satisfied after the change.
> > Esp. the epilogue vectorization cases want to get V2DI
> > from V4DI.
> >
> >           sz /= 2;
> > -         vectype1 = get_vectype_for_scalar_type_and_size (scalar_type, sz);
> > +         vectype1 = get_related_vectype_for_scalar_type (TYPE_MODE (vectype),
> > +                                                         scalar_type,
> > +                                                         sz / scalar_bytes);
> >
> > doesn't look like an improvement in readability to me there.
>
> Yeah, guess it isn't great.
>
> > Maybe re-formulating the whole code in terms of lanes instead of size
> > would make it easier to follow?
>
> OK, how about this version?  It still won't win awards, but it's at
> least a bit more readable.
>
> Tested as before.

OK (and sorry for the delay, looking for leftovers of the series now).

Thanks,
Richard.

> Richard
>
>
> 2019-11-06  Richard Sandiford  <richard.sandiford@arm.com>
>
> gcc/
>         * machmode.h (opt_machine_mode::operator==): New function.
>         (opt_machine_mode::operator!=): Likewise.
>         * tree-vectorizer.h (vec_info::vector_mode): Update comment.
>         (get_related_vectype_for_scalar_type): Delete.
>         (get_vectype_for_scalar_type_and_size): Declare.
>         * tree-vect-slp.c (vect_slp_bb_region): Print dump messages to say
>         whether analysis passed or failed, and with what vector modes.
>         Use related_vector_mode to check whether trying a particular
>         vector mode would be redundant with the autodetected mode,
>         and print a dump message if we decide to skip it.
>         * tree-vect-loop.c (vect_analyze_loop): Likewise.
>         (vect_create_epilog_for_reduction): Use
>         get_related_vectype_for_scalar_type instead of
>         get_vectype_for_scalar_type_and_size.
>         * tree-vect-stmts.c (get_vectype_for_scalar_type_and_size): Replace
>         with...
>         (get_related_vectype_for_scalar_type): ...this new function.
>         Take a starting/"prevailing" vector mode rather than a vector size.
>         Take an optional nunits argument, with the same meaning as for
>         related_vector_mode.  Use related_vector_mode when not
>         auto-detecting a mode, falling back to mode_for_vector if no
>         target mode exists.
>         (get_vectype_for_scalar_type): Update accordingly.
>         (get_same_sized_vectype): Likewise.
>         * tree-vectorizer.c (get_vec_alignment_for_array_type): Likewise.
>
> Index: gcc/machmode.h
> ===================================================================
> --- gcc/machmode.h      2019-11-06 12:35:12.460201615 +0000
> +++ gcc/machmode.h      2019-11-06 12:35:27.972093472 +0000
> @@ -258,6 +258,9 @@ #define CLASS_HAS_WIDER_MODES_P(CLASS)
>    bool exists () const;
>    template<typename U> bool exists (U *) const;
>
> +  bool operator== (const T &m) const { return m_mode == m; }
> +  bool operator!= (const T &m) const { return m_mode != m; }
> +
>  private:
>    machine_mode m_mode;
>  };
> Index: gcc/tree-vectorizer.h
> ===================================================================
> --- gcc/tree-vectorizer.h       2019-11-06 12:35:12.764199495 +0000
> +++ gcc/tree-vectorizer.h       2019-11-06 12:35:27.976093444 +0000
> @@ -335,8 +335,9 @@ typedef std::pair<tree, tree> vec_object
>    /* Cost data used by the target cost model.  */
>    void *target_cost_data;
>
> -  /* If we've chosen a vector size for this vectorization region,
> -     this is one mode that has such a size, otherwise it is VOIDmode.  */
> +  /* The argument we should pass to related_vector_mode when looking up
> +     the vector mode for a scalar mode, or VOIDmode if we haven't yet
> +     made any decisions about which vector modes to use.  */
>    machine_mode vector_mode;
>
>  private:
> @@ -1609,8 +1610,9 @@ extern bool vect_can_advance_ivs_p (loop
>  extern void vect_update_inits_of_drs (loop_vec_info, tree, tree_code);
>
>  /* In tree-vect-stmts.c.  */
> +extern tree get_related_vectype_for_scalar_type (machine_mode, tree,
> +                                                poly_uint64 = 0);
>  extern tree get_vectype_for_scalar_type (vec_info *, tree);
> -extern tree get_vectype_for_scalar_type_and_size (tree, poly_uint64);
>  extern tree get_mask_type_for_scalar_type (vec_info *, tree);
>  extern tree get_same_sized_vectype (tree, tree);
>  extern bool vect_get_loop_mask_type (loop_vec_info);
> Index: gcc/tree-vect-slp.c
> ===================================================================
> --- gcc/tree-vect-slp.c 2019-11-06 12:35:12.760199523 +0000
> +++ gcc/tree-vect-slp.c 2019-11-06 12:35:27.972093472 +0000
> @@ -3202,7 +3202,12 @@ vect_slp_bb_region (gimple_stmt_iterator
>           && dbg_cnt (vect_slp))
>         {
>           if (dump_enabled_p ())
> -           dump_printf_loc (MSG_NOTE, vect_location, "SLPing BB part\n");
> +           {
> +             dump_printf_loc (MSG_NOTE, vect_location,
> +                              "***** Analysis succeeded with vector mode"
> +                              " %s\n", GET_MODE_NAME (bb_vinfo->vector_mode));
> +             dump_printf_loc (MSG_NOTE, vect_location, "SLPing BB part\n");
> +           }
>
>           bb_vinfo->shared->check_datarefs ();
>           vect_schedule_slp (bb_vinfo);
> @@ -3222,6 +3227,13 @@ vect_slp_bb_region (gimple_stmt_iterator
>
>           vectorized = true;
>         }
> +      else
> +       {
> +         if (dump_enabled_p ())
> +           dump_printf_loc (MSG_NOTE, vect_location,
> +                            "***** Analysis failed with vector mode %s\n",
> +                            GET_MODE_NAME (bb_vinfo->vector_mode));
> +       }
>
>        if (mode_i == 0)
>         autodetected_vector_mode = bb_vinfo->vector_mode;
> @@ -3229,9 +3241,22 @@ vect_slp_bb_region (gimple_stmt_iterator
>        delete bb_vinfo;
>
>        if (mode_i < vector_modes.length ()
> -         && known_eq (GET_MODE_SIZE (vector_modes[mode_i]),
> -                      GET_MODE_SIZE (autodetected_vector_mode)))
> -       mode_i += 1;
> +         && VECTOR_MODE_P (autodetected_vector_mode)
> +         && (related_vector_mode (vector_modes[mode_i],
> +                                  GET_MODE_INNER (autodetected_vector_mode))
> +             == autodetected_vector_mode)
> +         && (related_vector_mode (autodetected_vector_mode,
> +                                  GET_MODE_INNER (vector_modes[mode_i]))
> +             == vector_modes[mode_i]))
> +       {
> +         if (dump_enabled_p ())
> +           dump_printf_loc (MSG_NOTE, vect_location,
> +                            "***** Skipping vector mode %s, which would"
> +                            " repeat the analysis for %s\n",
> +                            GET_MODE_NAME (vector_modes[mode_i]),
> +                            GET_MODE_NAME (autodetected_vector_mode));
> +         mode_i += 1;
> +       }
>
>        if (vectorized
>           || mode_i == vector_modes.length ()
> Index: gcc/tree-vect-loop.c
> ===================================================================
> --- gcc/tree-vect-loop.c        2019-11-06 12:35:12.756199552 +0000
> +++ gcc/tree-vect-loop.c        2019-11-06 12:35:27.972093472 +0000
> @@ -2417,6 +2417,17 @@ vect_analyze_loop (class loop *loop, vec
>        res = vect_analyze_loop_2 (loop_vinfo, fatal, &n_stmts);
>        if (mode_i == 0)
>         autodetected_vector_mode = loop_vinfo->vector_mode;
> +      if (dump_enabled_p ())
> +       {
> +         if (res)
> +           dump_printf_loc (MSG_NOTE, vect_location,
> +                            "***** Analysis succeeded with vector mode %s\n",
> +                            GET_MODE_NAME (loop_vinfo->vector_mode));
> +         else
> +           dump_printf_loc (MSG_NOTE, vect_location,
> +                            "***** Analysis failed with vector mode %s\n",
> +                            GET_MODE_NAME (loop_vinfo->vector_mode));
> +       }
>
>        loop->aux = NULL;
>        if (res)
> @@ -2479,9 +2490,22 @@ vect_analyze_loop (class loop *loop, vec
>         }
>
>        if (mode_i < vector_modes.length ()
> -         && known_eq (GET_MODE_SIZE (vector_modes[mode_i]),
> -                      GET_MODE_SIZE (autodetected_vector_mode)))
> -       mode_i += 1;
> +         && VECTOR_MODE_P (autodetected_vector_mode)
> +         && (related_vector_mode (vector_modes[mode_i],
> +                                  GET_MODE_INNER (autodetected_vector_mode))
> +             == autodetected_vector_mode)
> +         && (related_vector_mode (autodetected_vector_mode,
> +                                  GET_MODE_INNER (vector_modes[mode_i]))
> +             == vector_modes[mode_i]))
> +       {
> +         if (dump_enabled_p ())
> +           dump_printf_loc (MSG_NOTE, vect_location,
> +                            "***** Skipping vector mode %s, which would"
> +                            " repeat the analysis for %s\n",
> +                            GET_MODE_NAME (vector_modes[mode_i]),
> +                            GET_MODE_NAME (autodetected_vector_mode));
> +         mode_i += 1;
> +       }
>
>        if (mode_i == vector_modes.length ()
>           || autodetected_vector_mode == VOIDmode)
> @@ -4870,13 +4894,15 @@ vect_create_epilog_for_reduction (stmt_v
>          in a vector mode of smaller size and first reduce upper/lower
>          halves against each other.  */
>        enum machine_mode mode1 = mode;
> -      unsigned sz = tree_to_uhwi (TYPE_SIZE_UNIT (vectype));
> -      unsigned sz1 = sz;
> +      unsigned nunits = TYPE_VECTOR_SUBPARTS (vectype).to_constant ();
> +      unsigned nunits1 = nunits;
>        if (!slp_reduc
>           && (mode1 = targetm.vectorize.split_reduction (mode)) != mode)
> -       sz1 = GET_MODE_SIZE (mode1).to_constant ();
> +       nunits1 = GET_MODE_NUNITS (mode1).to_constant ();
>
> -      tree vectype1 = get_vectype_for_scalar_type_and_size (scalar_type, sz1);
> +      tree vectype1 = get_related_vectype_for_scalar_type (TYPE_MODE (vectype),
> +                                                          scalar_type,
> +                                                          nunits1);
>        reduce_with_shift = have_whole_vector_shift (mode1);
>        if (!VECTOR_MODE_P (mode1))
>         reduce_with_shift = false;
> @@ -4890,11 +4916,13 @@ vect_create_epilog_for_reduction (stmt_v
>        /* First reduce the vector to the desired vector size we should
>          do shift reduction on by combining upper and lower halves.  */
>        new_temp = new_phi_result;
> -      while (sz > sz1)
> +      while (nunits > nunits1)
>         {
>           gcc_assert (!slp_reduc);
> -         sz /= 2;
> -         vectype1 = get_vectype_for_scalar_type_and_size (scalar_type, sz);
> +         nunits /= 2;
> +         vectype1 = get_related_vectype_for_scalar_type (TYPE_MODE (vectype),
> +                                                         scalar_type, nunits);
> +         unsigned int bitsize = tree_to_uhwi (TYPE_SIZE (vectype1));
>
>           /* The target has to make sure we support lowpart/highpart
>              extraction, either via direct vector extract or through
> @@ -4919,15 +4947,14 @@ vect_create_epilog_for_reduction (stmt_v
>                   = gimple_build_assign (dst2, BIT_FIELD_REF,
>                                          build3 (BIT_FIELD_REF, vectype1,
>                                                  new_temp, TYPE_SIZE (vectype1),
> -                                                bitsize_int (sz * BITS_PER_UNIT)));
> +                                                bitsize_int (bitsize)));
>               gsi_insert_before (&exit_gsi, epilog_stmt, GSI_SAME_STMT);
>             }
>           else
>             {
>               /* Extract via punning to appropriately sized integer mode
>                  vector.  */
> -             tree eltype = build_nonstandard_integer_type (sz * BITS_PER_UNIT,
> -                                                           1);
> +             tree eltype = build_nonstandard_integer_type (bitsize, 1);
>               tree etype = build_vector_type (eltype, 2);
>               gcc_assert (convert_optab_handler (vec_extract_optab,
>                                                  TYPE_MODE (etype),
> @@ -4956,7 +4983,7 @@ vect_create_epilog_for_reduction (stmt_v
>                   = gimple_build_assign (tem, BIT_FIELD_REF,
>                                          build3 (BIT_FIELD_REF, eltype,
>                                                  new_temp, TYPE_SIZE (eltype),
> -                                                bitsize_int (sz * BITS_PER_UNIT)));
> +                                                bitsize_int (bitsize)));
>               gsi_insert_before (&exit_gsi, epilog_stmt, GSI_SAME_STMT);
>               dst2 =  make_ssa_name (vectype1);
>               epilog_stmt = gimple_build_assign (dst2, VIEW_CONVERT_EXPR,
> Index: gcc/tree-vect-stmts.c
> ===================================================================
> --- gcc/tree-vect-stmts.c       2019-11-06 12:35:12.796199272 +0000
> +++ gcc/tree-vect-stmts.c       2019-11-06 12:35:27.976093444 +0000
> @@ -11097,18 +11097,28 @@ vect_remove_stores (stmt_vec_info first_
>      }
>  }
>
> -/* Function get_vectype_for_scalar_type_and_size.
> -
> -   Returns the vector type corresponding to SCALAR_TYPE  and SIZE as supported
> -   by the target.  */
> +/* If NUNITS is nonzero, return a vector type that contains NUNITS
> +   elements of type SCALAR_TYPE, or null if the target doesn't support
> +   such a type.
> +
> +   If NUNITS is zero, return a vector type that contains elements of
> +   type SCALAR_TYPE, choosing whichever vector size the target prefers.
> +
> +   If PREVAILING_MODE is VOIDmode, we have not yet chosen a vector mode
> +   for this vectorization region and want to "autodetect" the best choice.
> +   Otherwise, PREVAILING_MODE is a previously-chosen vector TYPE_MODE
> +   and we want the new type to be interoperable with it.   PREVAILING_MODE
> +   in this case can be a scalar integer mode or a vector mode; when it
> +   is a vector mode, the function acts like a tree-level version of
> +   related_vector_mode.  */
>
>  tree
> -get_vectype_for_scalar_type_and_size (tree scalar_type, poly_uint64 size)
> +get_related_vectype_for_scalar_type (machine_mode prevailing_mode,
> +                                    tree scalar_type, poly_uint64 nunits)
>  {
>    tree orig_scalar_type = scalar_type;
>    scalar_mode inner_mode;
>    machine_mode simd_mode;
> -  poly_uint64 nunits;
>    tree vectype;
>
>    if (!is_int_mode (TYPE_MODE (scalar_type), &inner_mode)
> @@ -11148,10 +11158,11 @@ get_vectype_for_scalar_type_and_size (tr
>    if (scalar_type == NULL_TREE)
>      return NULL_TREE;
>
> -  /* If no size was supplied use the mode the target prefers.   Otherwise
> -     lookup a vector mode of the specified size.  */
> -  if (known_eq (size, 0U))
> +  /* If no prevailing mode was supplied, use the mode the target prefers.
> +     Otherwise lookup a vector mode based on the prevailing mode.  */
> +  if (prevailing_mode == VOIDmode)
>      {
> +      gcc_assert (known_eq (nunits, 0U));
>        simd_mode = targetm.vectorize.preferred_simd_mode (inner_mode);
>        if (SCALAR_INT_MODE_P (simd_mode))
>         {
> @@ -11167,9 +11178,19 @@ get_vectype_for_scalar_type_and_size (tr
>             return NULL_TREE;
>         }
>      }
> -  else if (!multiple_p (size, nbytes, &nunits)
> -          || !mode_for_vector (inner_mode, nunits).exists (&simd_mode))
> -    return NULL_TREE;
> +  else if (SCALAR_INT_MODE_P (prevailing_mode)
> +          || !related_vector_mode (prevailing_mode,
> +                                   inner_mode, nunits).exists (&simd_mode))
> +    {
> +      /* Fall back to using mode_for_vector, mostly in the hope of being
> +        able to use an integer mode.  */
> +      if (known_eq (nunits, 0U)
> +         && !multiple_p (GET_MODE_SIZE (prevailing_mode), nbytes, &nunits))
> +       return NULL_TREE;
> +
> +      if (!mode_for_vector (inner_mode, nunits).exists (&simd_mode))
> +       return NULL_TREE;
> +    }
>
>    vectype = build_vector_type_for_mode (scalar_type, simd_mode);
>
> @@ -11197,9 +11218,8 @@ get_vectype_for_scalar_type_and_size (tr
>  tree
>  get_vectype_for_scalar_type (vec_info *vinfo, tree scalar_type)
>  {
> -  tree vectype;
> -  poly_uint64 vector_size = GET_MODE_SIZE (vinfo->vector_mode);
> -  vectype = get_vectype_for_scalar_type_and_size (scalar_type, vector_size);
> +  tree vectype = get_related_vectype_for_scalar_type (vinfo->vector_mode,
> +                                                     scalar_type);
>    if (vectype && vinfo->vector_mode == VOIDmode)
>      vinfo->vector_mode = TYPE_MODE (vectype);
>    return vectype;
> @@ -11232,8 +11252,13 @@ get_same_sized_vectype (tree scalar_type
>    if (VECT_SCALAR_BOOLEAN_TYPE_P (scalar_type))
>      return truth_type_for (vector_type);
>
> -  return get_vectype_for_scalar_type_and_size
> -          (scalar_type, GET_MODE_SIZE (TYPE_MODE (vector_type)));
> +  poly_uint64 nunits;
> +  if (!multiple_p (GET_MODE_SIZE (TYPE_MODE (vector_type)),
> +                  GET_MODE_SIZE (TYPE_MODE (scalar_type)), &nunits))
> +    return NULL_TREE;
> +
> +  return get_related_vectype_for_scalar_type (TYPE_MODE (vector_type),
> +                                             scalar_type, nunits);
>  }
>
>  /* Function vect_is_simple_use.
> Index: gcc/tree-vectorizer.c
> ===================================================================
> --- gcc/tree-vectorizer.c       2019-11-06 12:35:12.764199495 +0000
> +++ gcc/tree-vectorizer.c       2019-11-06 12:35:27.976093444 +0000
> @@ -1359,7 +1359,7 @@ get_vec_alignment_for_array_type (tree t
>    poly_uint64 array_size, vector_size;
>
>    tree scalar_type = strip_array_types (type);
> -  tree vectype = get_vectype_for_scalar_type_and_size (scalar_type, 0);
> +  tree vectype = get_related_vectype_for_scalar_type (VOIDmode, scalar_type);
>    if (!vectype
>        || !poly_int_tree_p (TYPE_SIZE (type), &array_size)
>        || !poly_int_tree_p (TYPE_SIZE (vectype), &vector_size)
diff mbox series

Patch

Index: gcc/machmode.h
===================================================================
--- gcc/machmode.h	2019-10-25 13:26:59.053879364 +0100
+++ gcc/machmode.h	2019-10-25 13:27:26.201687539 +0100
@@ -258,6 +258,9 @@  #define CLASS_HAS_WIDER_MODES_P(CLASS)
   bool exists () const;
   template<typename U> bool exists (U *) const;
 
+  bool operator== (const T &m) const { return m_mode == m; }
+  bool operator!= (const T &m) const { return m_mode != m; }
+
 private:
   machine_mode m_mode;
 };
Index: gcc/tree-vectorizer.h
===================================================================
--- gcc/tree-vectorizer.h	2019-10-25 13:27:19.317736181 +0100
+++ gcc/tree-vectorizer.h	2019-10-25 13:27:26.209687483 +0100
@@ -329,8 +329,9 @@  typedef std::pair<tree, tree> vec_object
   /* Cost data used by the target cost model.  */
   void *target_cost_data;
 
-  /* If we've chosen a vector size for this vectorization region,
-     this is one mode that has such a size, otherwise it is VOIDmode.  */
+  /* The argument we should pass to related_vector_mode when looking up
+     the vector mode for a scalar mode, or VOIDmode if we haven't yet
+     made any decisions about which vector modes to use.  */
   machine_mode vector_mode;
 
 private:
@@ -1595,8 +1596,9 @@  extern dump_user_location_t find_loop_lo
 extern bool vect_can_advance_ivs_p (loop_vec_info);
 
 /* In tree-vect-stmts.c.  */
+extern tree get_related_vectype_for_scalar_type (machine_mode, tree,
+						 poly_uint64 = 0);
 extern tree get_vectype_for_scalar_type (vec_info *, tree);
-extern tree get_vectype_for_scalar_type_and_size (tree, poly_uint64);
 extern tree get_mask_type_for_scalar_type (vec_info *, tree);
 extern tree get_same_sized_vectype (tree, tree);
 extern bool vect_get_loop_mask_type (loop_vec_info);
Index: gcc/tree-vect-slp.c
===================================================================
--- gcc/tree-vect-slp.c	2019-10-25 13:27:19.313736209 +0100
+++ gcc/tree-vect-slp.c	2019-10-25 13:27:26.205687511 +0100
@@ -3118,7 +3118,12 @@  vect_slp_bb_region (gimple_stmt_iterator
 	  && dbg_cnt (vect_slp))
 	{
 	  if (dump_enabled_p ())
-	    dump_printf_loc (MSG_NOTE, vect_location, "SLPing BB part\n");
+	    {
+	      dump_printf_loc (MSG_NOTE, vect_location,
+			       "***** Analysis succeeded with vector mode"
+			       " %s\n", GET_MODE_NAME (bb_vinfo->vector_mode));
+	      dump_printf_loc (MSG_NOTE, vect_location, "SLPing BB part\n");
+	    }
 
 	  bb_vinfo->shared->check_datarefs ();
 	  vect_schedule_slp (bb_vinfo);
@@ -3138,6 +3143,13 @@  vect_slp_bb_region (gimple_stmt_iterator
 
 	  vectorized = true;
 	}
+      else
+	{
+	  if (dump_enabled_p ())
+	    dump_printf_loc (MSG_NOTE, vect_location,
+			     "***** Analysis failed with vector mode %s\n",
+			     GET_MODE_NAME (bb_vinfo->vector_mode));
+	}
 
       if (mode_i == 0)
 	autodetected_vector_mode = bb_vinfo->vector_mode;
@@ -3145,9 +3157,22 @@  vect_slp_bb_region (gimple_stmt_iterator
       delete bb_vinfo;
 
       if (mode_i < vector_modes.length ()
-	  && known_eq (GET_MODE_SIZE (vector_modes[mode_i]),
-		       GET_MODE_SIZE (autodetected_vector_mode)))
-	mode_i += 1;
+	  && VECTOR_MODE_P (autodetected_vector_mode)
+	  && (related_vector_mode (vector_modes[mode_i],
+				   GET_MODE_INNER (autodetected_vector_mode))
+	      == autodetected_vector_mode)
+	  && (related_vector_mode (autodetected_vector_mode,
+				   GET_MODE_INNER (vector_modes[mode_i]))
+	      == vector_modes[mode_i]))
+	{
+	  if (dump_enabled_p ())
+	    dump_printf_loc (MSG_NOTE, vect_location,
+			     "***** Skipping vector mode %s, which would"
+			     " repeat the analysis for %s\n",
+			     GET_MODE_NAME (vector_modes[mode_i]),
+			     GET_MODE_NAME (autodetected_vector_mode));
+	  mode_i += 1;
+	}
 
       if (vectorized
 	  || mode_i == vector_modes.length ()
Index: gcc/tree-vect-loop.c
===================================================================
--- gcc/tree-vect-loop.c	2019-10-25 13:27:19.309736237 +0100
+++ gcc/tree-vect-loop.c	2019-10-25 13:27:26.201687539 +0100
@@ -2367,6 +2367,17 @@  vect_analyze_loop (class loop *loop, loo
       opt_result res = vect_analyze_loop_2 (loop_vinfo, fatal, &n_stmts);
       if (mode_i == 0)
 	autodetected_vector_mode = loop_vinfo->vector_mode;
+      if (dump_enabled_p ())
+	{
+	  if (res)
+	    dump_printf_loc (MSG_NOTE, vect_location,
+			     "***** Analysis succeeded with vector mode %s\n",
+			     GET_MODE_NAME (loop_vinfo->vector_mode));
+	  else
+	    dump_printf_loc (MSG_NOTE, vect_location,
+			     "***** Analysis failed with vector mode %s\n",
+			     GET_MODE_NAME (loop_vinfo->vector_mode));
+	}
 
       if (res)
 	{
@@ -2400,9 +2411,22 @@  vect_analyze_loop (class loop *loop, loo
 	}
 
       if (mode_i < vector_modes.length ()
-	  && known_eq (GET_MODE_SIZE (vector_modes[mode_i]),
-		       GET_MODE_SIZE (autodetected_vector_mode)))
-	mode_i += 1;
+	  && VECTOR_MODE_P (autodetected_vector_mode)
+	  && (related_vector_mode (vector_modes[mode_i],
+				   GET_MODE_INNER (autodetected_vector_mode))
+	      == autodetected_vector_mode)
+	  && (related_vector_mode (autodetected_vector_mode,
+				   GET_MODE_INNER (vector_modes[mode_i]))
+	      == vector_modes[mode_i]))
+	{
+	  if (dump_enabled_p ())
+	    dump_printf_loc (MSG_NOTE, vect_location,
+			     "***** Skipping vector mode %s, which would"
+			     " repeat the analysis for %s\n",
+			     GET_MODE_NAME (vector_modes[mode_i]),
+			     GET_MODE_NAME (autodetected_vector_mode));
+	  mode_i += 1;
+	}
 
       if (mode_i == vector_modes.length ()
 	  || autodetected_vector_mode == VOIDmode)
@@ -4763,7 +4787,10 @@  vect_create_epilog_for_reduction (stmt_v
 	  && (mode1 = targetm.vectorize.split_reduction (mode)) != mode)
 	sz1 = GET_MODE_SIZE (mode1).to_constant ();
 
-      tree vectype1 = get_vectype_for_scalar_type_and_size (scalar_type, sz1);
+      unsigned int scalar_bytes = tree_to_uhwi (TYPE_SIZE_UNIT (scalar_type));
+      tree vectype1 = get_related_vectype_for_scalar_type (TYPE_MODE (vectype),
+							   scalar_type,
+							   sz1 / scalar_bytes);
       reduce_with_shift = have_whole_vector_shift (mode1);
       if (!VECTOR_MODE_P (mode1))
 	reduce_with_shift = false;
@@ -4781,7 +4808,9 @@  vect_create_epilog_for_reduction (stmt_v
 	{
 	  gcc_assert (!slp_reduc);
 	  sz /= 2;
-	  vectype1 = get_vectype_for_scalar_type_and_size (scalar_type, sz);
+	  vectype1 = get_related_vectype_for_scalar_type (TYPE_MODE (vectype),
+							  scalar_type,
+							  sz / scalar_bytes);
 
 	  /* The target has to make sure we support lowpart/highpart
 	     extraction, either via direct vector extract or through
Index: gcc/tree-vect-stmts.c
===================================================================
--- gcc/tree-vect-stmts.c	2019-10-25 13:27:22.985710263 +0100
+++ gcc/tree-vect-stmts.c	2019-10-25 13:27:26.205687511 +0100
@@ -11111,18 +11111,28 @@  vect_remove_stores (stmt_vec_info first_
     }
 }
 
-/* Function get_vectype_for_scalar_type_and_size.
-
-   Returns the vector type corresponding to SCALAR_TYPE  and SIZE as supported
-   by the target.  */
+/* If NUNITS is nonzero, return a vector type that contains NUNITS
+   elements of type SCALAR_TYPE, or null if the target doesn't support
+   such a type.
+
+   If NUNITS is zero, return a vector type that contains elements of
+   type SCALAR_TYPE, choosing whichever vector size the target prefers.
+
+   If PREVAILING_MODE is VOIDmode, we have not yet chosen a vector mode
+   for this vectorization region and want to "autodetect" the best choice.
+   Otherwise, PREVAILING_MODE is a previously-chosen vector TYPE_MODE
+   and we want the new type to be interoperable with it.   PREVAILING_MODE
+   in this case can be a scalar integer mode or a vector mode; when it
+   is a vector mode, the function acts like a tree-level version of
+   related_vector_mode.  */
 
 tree
-get_vectype_for_scalar_type_and_size (tree scalar_type, poly_uint64 size)
+get_related_vectype_for_scalar_type (machine_mode prevailing_mode,
+				     tree scalar_type, poly_uint64 nunits)
 {
   tree orig_scalar_type = scalar_type;
   scalar_mode inner_mode;
   machine_mode simd_mode;
-  poly_uint64 nunits;
   tree vectype;
 
   if (!is_int_mode (TYPE_MODE (scalar_type), &inner_mode)
@@ -11162,10 +11172,11 @@  get_vectype_for_scalar_type_and_size (tr
   if (scalar_type == NULL_TREE)
     return NULL_TREE;
 
-  /* If no size was supplied use the mode the target prefers.   Otherwise
-     lookup a vector mode of the specified size.  */
-  if (known_eq (size, 0U))
+  /* If no prevailing mode was supplied, use the mode the target prefers.
+     Otherwise lookup a vector mode based on the prevailing mode.  */
+  if (prevailing_mode == VOIDmode)
     {
+      gcc_assert (known_eq (nunits, 0U));
       simd_mode = targetm.vectorize.preferred_simd_mode (inner_mode);
       if (SCALAR_INT_MODE_P (simd_mode))
 	{
@@ -11181,9 +11192,19 @@  get_vectype_for_scalar_type_and_size (tr
 	    return NULL_TREE;
 	}
     }
-  else if (!multiple_p (size, nbytes, &nunits)
-	   || !mode_for_vector (inner_mode, nunits).exists (&simd_mode))
-    return NULL_TREE;
+  else if (SCALAR_INT_MODE_P (prevailing_mode)
+	   || !related_vector_mode (prevailing_mode,
+				    inner_mode, nunits).exists (&simd_mode))
+    {
+      /* Fall back to using mode_for_vector, mostly in the hope of being
+	 able to use an integer mode.  */
+      if (known_eq (nunits, 0U)
+	  && !multiple_p (GET_MODE_SIZE (prevailing_mode), nbytes, &nunits))
+	return NULL_TREE;
+
+      if (!mode_for_vector (inner_mode, nunits).exists (&simd_mode))
+	return NULL_TREE;
+    }
 
   vectype = build_vector_type_for_mode (scalar_type, simd_mode);
 
@@ -11211,9 +11232,8 @@  get_vectype_for_scalar_type_and_size (tr
 tree
 get_vectype_for_scalar_type (vec_info *vinfo, tree scalar_type)
 {
-  tree vectype;
-  poly_uint64 vector_size = GET_MODE_SIZE (vinfo->vector_mode);
-  vectype = get_vectype_for_scalar_type_and_size (scalar_type, vector_size);
+  tree vectype = get_related_vectype_for_scalar_type (vinfo->vector_mode,
+						      scalar_type);
   if (vectype && vinfo->vector_mode == VOIDmode)
     vinfo->vector_mode = TYPE_MODE (vectype);
   return vectype;
@@ -11246,8 +11266,13 @@  get_same_sized_vectype (tree scalar_type
   if (VECT_SCALAR_BOOLEAN_TYPE_P (scalar_type))
     return truth_type_for (vector_type);
 
-  return get_vectype_for_scalar_type_and_size
-	   (scalar_type, GET_MODE_SIZE (TYPE_MODE (vector_type)));
+  poly_uint64 nunits;
+  if (!multiple_p (GET_MODE_SIZE (TYPE_MODE (vector_type)),
+		   GET_MODE_SIZE (TYPE_MODE (scalar_type)), &nunits))
+    return NULL_TREE;
+
+  return get_related_vectype_for_scalar_type (TYPE_MODE (vector_type),
+					      scalar_type, nunits);
 }
 
 /* Function vect_is_simple_use.
Index: gcc/tree-vectorizer.c
===================================================================
--- gcc/tree-vectorizer.c	2019-10-25 13:27:19.317736181 +0100
+++ gcc/tree-vectorizer.c	2019-10-25 13:27:26.209687483 +0100
@@ -1348,7 +1348,7 @@  get_vec_alignment_for_array_type (tree t
   poly_uint64 array_size, vector_size;
 
   tree scalar_type = strip_array_types (type);
-  tree vectype = get_vectype_for_scalar_type_and_size (scalar_type, 0);
+  tree vectype = get_related_vectype_for_scalar_type (VOIDmode, scalar_type);
   if (!vectype
       || !poly_int_tree_p (TYPE_SIZE (type), &array_size)
       || !poly_int_tree_p (TYPE_SIZE (vectype), &vector_size)