Message ID | mpty2x98071.fsf@arm.com |
---|---|
State | New |
Series | [11/n] Support vectorisation with mixed vector sizes |
On Fri, Oct 25, 2019 at 2:43 PM Richard Sandiford <richard.sandiford@arm.com> wrote:
>
> After previous patches, it's now possible to make the vectoriser
> support multiple vector sizes in the same vector region, using
> related_vector_mode to pick the right vector mode for a given
> element mode. No port yet takes advantage of this, but I have
> a follow-on patch for AArch64.
>
> This patch also seemed like a good opportunity to add some more dump
> messages: one to make it clear which vector size/mode was being used
> when analysis passed or failed, and another to say when we've decided
> to skip a redundant vector size/mode.

OK.

I wonder if, when we requested a specific size previously, we now
have to verify we got that constraint satisfied after the change.
Esp. the epilogue vectorization cases want to get V2DI from V4DI.

    sz /= 2;
    - vectype1 = get_vectype_for_scalar_type_and_size (scalar_type, sz);
    + vectype1 = get_related_vectype_for_scalar_type (TYPE_MODE (vectype),
    + scalar_type,
    + sz / scalar_bytes);

doesn't look like an improvement in readability to me there. Maybe
re-formulating the whole code in terms of lanes instead of size
would make it easier to follow?

Thanks,
Richard.

>
> 2019-10-24 Richard Sandiford <richard.sandiford@arm.com>
>
> gcc/
> * machmode.h (opt_machine_mode::operator==): New function.
> (opt_machine_mode::operator!=): Likewise.
> * tree-vectorizer.h (vec_info::vector_mode): Update comment.
> (get_vectype_for_scalar_type_and_size): Delete.
> (get_related_vectype_for_scalar_type): Declare.
> * tree-vect-slp.c (vect_slp_bb_region): Print dump messages to say
> whether analysis passed or failed, and with what vector modes.
> Use related_vector_mode to check whether trying a particular
> vector mode would be redundant with the autodetected mode,
> and print a dump message if we decide to skip it.
> * tree-vect-loop.c (vect_analyze_loop): Likewise.
> (vect_create_epilog_for_reduction): Use > get_related_vectype_for_scalar_type instead of > get_vectype_for_scalar_type_and_size. > * tree-vect-stmts.c (get_vectype_for_scalar_type_and_size): Replace > with... > (get_related_vectype_for_scalar_type): ...this new function. > Take a starting/"prevailing" vector mode rather than a vector size. > Take an optional nunits argument, with the same meaning as for > related_vector_mode. Use related_vector_mode when not > auto-detecting a mode, falling back to mode_for_vector if no > target mode exists. > (get_vectype_for_scalar_type): Update accordingly. > (get_same_sized_vectype): Likewise. > * tree-vectorizer.c (get_vec_alignment_for_array_type): Likewise. > > Index: gcc/machmode.h > =================================================================== > --- gcc/machmode.h 2019-10-25 13:26:59.053879364 +0100 > +++ gcc/machmode.h 2019-10-25 13:27:26.201687539 +0100 > @@ -258,6 +258,9 @@ #define CLASS_HAS_WIDER_MODES_P(CLASS) > bool exists () const; > template<typename U> bool exists (U *) const; > > + bool operator== (const T &m) const { return m_mode == m; } > + bool operator!= (const T &m) const { return m_mode != m; } > + > private: > machine_mode m_mode; > }; > Index: gcc/tree-vectorizer.h > =================================================================== > --- gcc/tree-vectorizer.h 2019-10-25 13:27:19.317736181 +0100 > +++ gcc/tree-vectorizer.h 2019-10-25 13:27:26.209687483 +0100 > @@ -329,8 +329,9 @@ typedef std::pair<tree, tree> vec_object > /* Cost data used by the target cost model. */ > void *target_cost_data; > > - /* If we've chosen a vector size for this vectorization region, > - this is one mode that has such a size, otherwise it is VOIDmode. */ > + /* The argument we should pass to related_vector_mode when looking up > + the vector mode for a scalar mode, or VOIDmode if we haven't yet > + made any decisions about which vector modes to use. 
*/ > machine_mode vector_mode; > > private: > @@ -1595,8 +1596,9 @@ extern dump_user_location_t find_loop_lo > extern bool vect_can_advance_ivs_p (loop_vec_info); > > /* In tree-vect-stmts.c. */ > +extern tree get_related_vectype_for_scalar_type (machine_mode, tree, > + poly_uint64 = 0); > extern tree get_vectype_for_scalar_type (vec_info *, tree); > -extern tree get_vectype_for_scalar_type_and_size (tree, poly_uint64); > extern tree get_mask_type_for_scalar_type (vec_info *, tree); > extern tree get_same_sized_vectype (tree, tree); > extern bool vect_get_loop_mask_type (loop_vec_info); > Index: gcc/tree-vect-slp.c > =================================================================== > --- gcc/tree-vect-slp.c 2019-10-25 13:27:19.313736209 +0100 > +++ gcc/tree-vect-slp.c 2019-10-25 13:27:26.205687511 +0100 > @@ -3118,7 +3118,12 @@ vect_slp_bb_region (gimple_stmt_iterator > && dbg_cnt (vect_slp)) > { > if (dump_enabled_p ()) > - dump_printf_loc (MSG_NOTE, vect_location, "SLPing BB part\n"); > + { > + dump_printf_loc (MSG_NOTE, vect_location, > + "***** Analysis succeeded with vector mode" > + " %s\n", GET_MODE_NAME (bb_vinfo->vector_mode)); > + dump_printf_loc (MSG_NOTE, vect_location, "SLPing BB part\n"); > + } > > bb_vinfo->shared->check_datarefs (); > vect_schedule_slp (bb_vinfo); > @@ -3138,6 +3143,13 @@ vect_slp_bb_region (gimple_stmt_iterator > > vectorized = true; > } > + else > + { > + if (dump_enabled_p ()) > + dump_printf_loc (MSG_NOTE, vect_location, > + "***** Analysis failed with vector mode %s\n", > + GET_MODE_NAME (bb_vinfo->vector_mode)); > + } > > if (mode_i == 0) > autodetected_vector_mode = bb_vinfo->vector_mode; > @@ -3145,9 +3157,22 @@ vect_slp_bb_region (gimple_stmt_iterator > delete bb_vinfo; > > if (mode_i < vector_modes.length () > - && known_eq (GET_MODE_SIZE (vector_modes[mode_i]), > - GET_MODE_SIZE (autodetected_vector_mode))) > - mode_i += 1; > + && VECTOR_MODE_P (autodetected_vector_mode) > + && (related_vector_mode 
(vector_modes[mode_i], > + GET_MODE_INNER (autodetected_vector_mode)) > + == autodetected_vector_mode) > + && (related_vector_mode (autodetected_vector_mode, > + GET_MODE_INNER (vector_modes[mode_i])) > + == vector_modes[mode_i])) > + { > + if (dump_enabled_p ()) > + dump_printf_loc (MSG_NOTE, vect_location, > + "***** Skipping vector mode %s, which would" > + " repeat the analysis for %s\n", > + GET_MODE_NAME (vector_modes[mode_i]), > + GET_MODE_NAME (autodetected_vector_mode)); > + mode_i += 1; > + } > > if (vectorized > || mode_i == vector_modes.length () > Index: gcc/tree-vect-loop.c > =================================================================== > --- gcc/tree-vect-loop.c 2019-10-25 13:27:19.309736237 +0100 > +++ gcc/tree-vect-loop.c 2019-10-25 13:27:26.201687539 +0100 > @@ -2367,6 +2367,17 @@ vect_analyze_loop (class loop *loop, loo > opt_result res = vect_analyze_loop_2 (loop_vinfo, fatal, &n_stmts); > if (mode_i == 0) > autodetected_vector_mode = loop_vinfo->vector_mode; > + if (dump_enabled_p ()) > + { > + if (res) > + dump_printf_loc (MSG_NOTE, vect_location, > + "***** Analysis succeeded with vector mode %s\n", > + GET_MODE_NAME (loop_vinfo->vector_mode)); > + else > + dump_printf_loc (MSG_NOTE, vect_location, > + "***** Analysis failed with vector mode %s\n", > + GET_MODE_NAME (loop_vinfo->vector_mode)); > + } > > if (res) > { > @@ -2400,9 +2411,22 @@ vect_analyze_loop (class loop *loop, loo > } > > if (mode_i < vector_modes.length () > - && known_eq (GET_MODE_SIZE (vector_modes[mode_i]), > - GET_MODE_SIZE (autodetected_vector_mode))) > - mode_i += 1; > + && VECTOR_MODE_P (autodetected_vector_mode) > + && (related_vector_mode (vector_modes[mode_i], > + GET_MODE_INNER (autodetected_vector_mode)) > + == autodetected_vector_mode) > + && (related_vector_mode (autodetected_vector_mode, > + GET_MODE_INNER (vector_modes[mode_i])) > + == vector_modes[mode_i])) > + { > + if (dump_enabled_p ()) > + dump_printf_loc (MSG_NOTE, vect_location, > + "***** 
Skipping vector mode %s, which would" > + " repeat the analysis for %s\n", > + GET_MODE_NAME (vector_modes[mode_i]), > + GET_MODE_NAME (autodetected_vector_mode)); > + mode_i += 1; > + } > > if (mode_i == vector_modes.length () > || autodetected_vector_mode == VOIDmode) > @@ -4763,7 +4787,10 @@ vect_create_epilog_for_reduction (stmt_v > && (mode1 = targetm.vectorize.split_reduction (mode)) != mode) > sz1 = GET_MODE_SIZE (mode1).to_constant (); > > - tree vectype1 = get_vectype_for_scalar_type_and_size (scalar_type, sz1); > + unsigned int scalar_bytes = tree_to_uhwi (TYPE_SIZE_UNIT (scalar_type)); > + tree vectype1 = get_related_vectype_for_scalar_type (TYPE_MODE (vectype), > + scalar_type, > + sz1 / scalar_bytes); > reduce_with_shift = have_whole_vector_shift (mode1); > if (!VECTOR_MODE_P (mode1)) > reduce_with_shift = false; > @@ -4781,7 +4808,9 @@ vect_create_epilog_for_reduction (stmt_v > { > gcc_assert (!slp_reduc); > sz /= 2; > - vectype1 = get_vectype_for_scalar_type_and_size (scalar_type, sz); > + vectype1 = get_related_vectype_for_scalar_type (TYPE_MODE (vectype), > + scalar_type, > + sz / scalar_bytes); > > /* The target has to make sure we support lowpart/highpart > extraction, either via direct vector extract or through > Index: gcc/tree-vect-stmts.c > =================================================================== > --- gcc/tree-vect-stmts.c 2019-10-25 13:27:22.985710263 +0100 > +++ gcc/tree-vect-stmts.c 2019-10-25 13:27:26.205687511 +0100 > @@ -11111,18 +11111,28 @@ vect_remove_stores (stmt_vec_info first_ > } > } > > -/* Function get_vectype_for_scalar_type_and_size. > - > - Returns the vector type corresponding to SCALAR_TYPE and SIZE as supported > - by the target. */ > +/* If NUNITS is nonzero, return a vector type that contains NUNITS > + elements of type SCALAR_TYPE, or null if the target doesn't support > + such a type. 
> + > + If NUNITS is zero, return a vector type that contains elements of > + type SCALAR_TYPE, choosing whichever vector size the target prefers. > + > + If PREVAILING_MODE is VOIDmode, we have not yet chosen a vector mode > + for this vectorization region and want to "autodetect" the best choice. > + Otherwise, PREVAILING_MODE is a previously-chosen vector TYPE_MODE > + and we want the new type to be interoperable with it. PREVAILING_MODE > + in this case can be a scalar integer mode or a vector mode; when it > + is a vector mode, the function acts like a tree-level version of > + related_vector_mode. */ > > tree > -get_vectype_for_scalar_type_and_size (tree scalar_type, poly_uint64 size) > +get_related_vectype_for_scalar_type (machine_mode prevailing_mode, > + tree scalar_type, poly_uint64 nunits) > { > tree orig_scalar_type = scalar_type; > scalar_mode inner_mode; > machine_mode simd_mode; > - poly_uint64 nunits; > tree vectype; > > if (!is_int_mode (TYPE_MODE (scalar_type), &inner_mode) > @@ -11162,10 +11172,11 @@ get_vectype_for_scalar_type_and_size (tr > if (scalar_type == NULL_TREE) > return NULL_TREE; > > - /* If no size was supplied use the mode the target prefers. Otherwise > - lookup a vector mode of the specified size. */ > - if (known_eq (size, 0U)) > + /* If no prevailing mode was supplied, use the mode the target prefers. > + Otherwise lookup a vector mode based on the prevailing mode. 
*/ > + if (prevailing_mode == VOIDmode) > { > + gcc_assert (known_eq (nunits, 0U)); > simd_mode = targetm.vectorize.preferred_simd_mode (inner_mode); > if (SCALAR_INT_MODE_P (simd_mode)) > { > @@ -11181,9 +11192,19 @@ get_vectype_for_scalar_type_and_size (tr > return NULL_TREE; > } > } > - else if (!multiple_p (size, nbytes, &nunits) > - || !mode_for_vector (inner_mode, nunits).exists (&simd_mode)) > - return NULL_TREE; > + else if (SCALAR_INT_MODE_P (prevailing_mode) > + || !related_vector_mode (prevailing_mode, > + inner_mode, nunits).exists (&simd_mode)) > + { > + /* Fall back to using mode_for_vector, mostly in the hope of being > + able to use an integer mode. */ > + if (known_eq (nunits, 0U) > + && !multiple_p (GET_MODE_SIZE (prevailing_mode), nbytes, &nunits)) > + return NULL_TREE; > + > + if (!mode_for_vector (inner_mode, nunits).exists (&simd_mode)) > + return NULL_TREE; > + } > > vectype = build_vector_type_for_mode (scalar_type, simd_mode); > > @@ -11211,9 +11232,8 @@ get_vectype_for_scalar_type_and_size (tr > tree > get_vectype_for_scalar_type (vec_info *vinfo, tree scalar_type) > { > - tree vectype; > - poly_uint64 vector_size = GET_MODE_SIZE (vinfo->vector_mode); > - vectype = get_vectype_for_scalar_type_and_size (scalar_type, vector_size); > + tree vectype = get_related_vectype_for_scalar_type (vinfo->vector_mode, > + scalar_type); > if (vectype && vinfo->vector_mode == VOIDmode) > vinfo->vector_mode = TYPE_MODE (vectype); > return vectype; > @@ -11246,8 +11266,13 @@ get_same_sized_vectype (tree scalar_type > if (VECT_SCALAR_BOOLEAN_TYPE_P (scalar_type)) > return truth_type_for (vector_type); > > - return get_vectype_for_scalar_type_and_size > - (scalar_type, GET_MODE_SIZE (TYPE_MODE (vector_type))); > + poly_uint64 nunits; > + if (!multiple_p (GET_MODE_SIZE (TYPE_MODE (vector_type)), > + GET_MODE_SIZE (TYPE_MODE (scalar_type)), &nunits)) > + return NULL_TREE; > + > + return get_related_vectype_for_scalar_type (TYPE_MODE (vector_type), > + 
scalar_type, nunits); > } > > /* Function vect_is_simple_use. > Index: gcc/tree-vectorizer.c > =================================================================== > --- gcc/tree-vectorizer.c 2019-10-25 13:27:19.317736181 +0100 > +++ gcc/tree-vectorizer.c 2019-10-25 13:27:26.209687483 +0100 > @@ -1348,7 +1348,7 @@ get_vec_alignment_for_array_type (tree t > poly_uint64 array_size, vector_size; > > tree scalar_type = strip_array_types (type); > - tree vectype = get_vectype_for_scalar_type_and_size (scalar_type, 0); > + tree vectype = get_related_vectype_for_scalar_type (VOIDmode, scalar_type); > if (!vectype > || !poly_int_tree_p (TYPE_SIZE (type), &array_size) > || !poly_int_tree_p (TYPE_SIZE (vectype), &vector_size)
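The patch above replaces the old "same size as the autodetected mode" test with a two-way related_vector_mode check before skipping a candidate mode. A toy model of that round-trip test may make the condition easier to read. It is only a sketch: `ToyMode`, `kSupported`, `toy_related_mode` and `toy_redundant_p` are made-up names, the "target" is invented, and the sketch assumes that a related-mode lookup without a lane count keeps the total vector size, which real targets are free to override.

```cpp
#include <cassert>

// Toy stand-in for a vector machine mode: element width and lane count.
// (Hypothetical type; GCC's machine_mode is generated per target.)
struct ToyMode {
  unsigned elt_bits;
  unsigned nunits;
  bool operator==(const ToyMode &o) const {
    return elt_bits == o.elt_bits && nunits == o.nunits;
  }
};

// Invented target: a handful of 128-bit and 64-bit vector modes.
static const ToyMode kSupported[] = {
  {8, 16}, {16, 8}, {32, 4}, {64, 2},  // like V16QI, V8HI, V4SI, V2DI
  {8, 8},  {16, 4}, {32, 2},           // like V8QI, V4HI, V2SI
};

// Toy analogue of related_vector_mode (mode, element mode): look for a
// supported mode with the same total size and the requested element width.
// Returns false if the toy target has no such mode.
bool toy_related_mode(ToyMode mode, unsigned elt_bits, ToyMode *out) {
  assert(elt_bits != 0);
  unsigned total_bits = mode.elt_bits * mode.nunits;
  if (total_bits % elt_bits != 0)
    return false;
  ToyMode want = {elt_bits, total_bits / elt_bits};
  for (const ToyMode &m : kSupported)
    if (m == want) {
      *out = want;
      return true;
    }
  return false;
}

// Mirrors the shape of the skip condition in the patch: CANDIDATE is
// redundant only if each mode maps to the other through the lookup.
bool toy_redundant_p(ToyMode candidate, ToyMode autodetected) {
  ToyMode a, b;
  return toy_related_mode(candidate, autodetected.elt_bits, &a)
         && a == autodetected
         && toy_related_mode(autodetected, candidate.elt_bits, &b)
         && b == candidate;
}
```

In this toy target a 16 x 8-bit candidate is redundant with an autodetected 4 x 32-bit mode, while an 8 x 8-bit candidate is not, because it maps to the 2 x 32-bit mode instead.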
Richard Biener <richard.guenther@gmail.com> writes:
> On Fri, Oct 25, 2019 at 2:43 PM Richard Sandiford
> <richard.sandiford@arm.com> wrote:
>>
>> After previous patches, it's now possible to make the vectoriser
>> support multiple vector sizes in the same vector region, using
>> related_vector_mode to pick the right vector mode for a given
>> element mode. No port yet takes advantage of this, but I have
>> a follow-on patch for AArch64.
>>
>> This patch also seemed like a good opportunity to add some more dump
>> messages: one to make it clear which vector size/mode was being used
>> when analysis passed or failed, and another to say when we've decided
>> to skip a redundant vector size/mode.
>
> OK.
>
> I wonder if, when we requested a specific size previously, we now
> have to verify we got that constraint satisfied after the change.
> Esp. the epilogue vectorization cases want to get V2DI
> from V4DI.
>
> sz /= 2;
> - vectype1 = get_vectype_for_scalar_type_and_size (scalar_type, sz);
> + vectype1 = get_related_vectype_for_scalar_type (TYPE_MODE (vectype),
> + scalar_type,
> + sz / scalar_bytes);
>
> doesn't look like an improvement in readability to me there.

Yeah, guess it isn't great.

> Maybe re-formulating the whole code in terms of lanes instead of size
> would make it easier to follow?

OK, how about this version? It still won't win awards, but it's at
least a bit more readable.

Tested as before.

Richard


2019-11-06 Richard Sandiford <richard.sandiford@arm.com>

gcc/
* machmode.h (opt_machine_mode::operator==): New function.
(opt_machine_mode::operator!=): Likewise.
* tree-vectorizer.h (vec_info::vector_mode): Update comment.
(get_vectype_for_scalar_type_and_size): Delete.
(get_related_vectype_for_scalar_type): Declare.
* tree-vect-slp.c (vect_slp_bb_region): Print dump messages to say
whether analysis passed or failed, and with what vector modes.
Use related_vector_mode to check whether trying a particular vector mode would be redundant with the autodetected mode, and print a dump message if we decide to skip it. * tree-vect-loop.c (vect_analyze_loop): Likewise. (vect_create_epilog_for_reduction): Use get_related_vectype_for_scalar_type instead of get_vectype_for_scalar_type_and_size. * tree-vect-stmts.c (get_vectype_for_scalar_type_and_size): Replace with... (get_related_vectype_for_scalar_type): ...this new function. Take a starting/"prevailing" vector mode rather than a vector size. Take an optional nunits argument, with the same meaning as for related_vector_mode. Use related_vector_mode when not auto-detecting a mode, falling back to mode_for_vector if no target mode exists. (get_vectype_for_scalar_type): Update accordingly. (get_same_sized_vectype): Likewise. * tree-vectorizer.c (get_vec_alignment_for_array_type): Likewise. Index: gcc/machmode.h =================================================================== --- gcc/machmode.h 2019-11-06 12:35:12.460201615 +0000 +++ gcc/machmode.h 2019-11-06 12:35:27.972093472 +0000 @@ -258,6 +258,9 @@ #define CLASS_HAS_WIDER_MODES_P(CLASS) bool exists () const; template<typename U> bool exists (U *) const; + bool operator== (const T &m) const { return m_mode == m; } + bool operator!= (const T &m) const { return m_mode != m; } + private: machine_mode m_mode; }; Index: gcc/tree-vectorizer.h =================================================================== --- gcc/tree-vectorizer.h 2019-11-06 12:35:12.764199495 +0000 +++ gcc/tree-vectorizer.h 2019-11-06 12:35:27.976093444 +0000 @@ -335,8 +335,9 @@ typedef std::pair<tree, tree> vec_object /* Cost data used by the target cost model. */ void *target_cost_data; - /* If we've chosen a vector size for this vectorization region, - this is one mode that has such a size, otherwise it is VOIDmode. 
*/ + /* The argument we should pass to related_vector_mode when looking up + the vector mode for a scalar mode, or VOIDmode if we haven't yet + made any decisions about which vector modes to use. */ machine_mode vector_mode; private: @@ -1609,8 +1610,9 @@ extern bool vect_can_advance_ivs_p (loop extern void vect_update_inits_of_drs (loop_vec_info, tree, tree_code); /* In tree-vect-stmts.c. */ +extern tree get_related_vectype_for_scalar_type (machine_mode, tree, + poly_uint64 = 0); extern tree get_vectype_for_scalar_type (vec_info *, tree); -extern tree get_vectype_for_scalar_type_and_size (tree, poly_uint64); extern tree get_mask_type_for_scalar_type (vec_info *, tree); extern tree get_same_sized_vectype (tree, tree); extern bool vect_get_loop_mask_type (loop_vec_info); Index: gcc/tree-vect-slp.c =================================================================== --- gcc/tree-vect-slp.c 2019-11-06 12:35:12.760199523 +0000 +++ gcc/tree-vect-slp.c 2019-11-06 12:35:27.972093472 +0000 @@ -3202,7 +3202,12 @@ vect_slp_bb_region (gimple_stmt_iterator && dbg_cnt (vect_slp)) { if (dump_enabled_p ()) - dump_printf_loc (MSG_NOTE, vect_location, "SLPing BB part\n"); + { + dump_printf_loc (MSG_NOTE, vect_location, + "***** Analysis succeeded with vector mode" + " %s\n", GET_MODE_NAME (bb_vinfo->vector_mode)); + dump_printf_loc (MSG_NOTE, vect_location, "SLPing BB part\n"); + } bb_vinfo->shared->check_datarefs (); vect_schedule_slp (bb_vinfo); @@ -3222,6 +3227,13 @@ vect_slp_bb_region (gimple_stmt_iterator vectorized = true; } + else + { + if (dump_enabled_p ()) + dump_printf_loc (MSG_NOTE, vect_location, + "***** Analysis failed with vector mode %s\n", + GET_MODE_NAME (bb_vinfo->vector_mode)); + } if (mode_i == 0) autodetected_vector_mode = bb_vinfo->vector_mode; @@ -3229,9 +3241,22 @@ vect_slp_bb_region (gimple_stmt_iterator delete bb_vinfo; if (mode_i < vector_modes.length () - && known_eq (GET_MODE_SIZE (vector_modes[mode_i]), - GET_MODE_SIZE (autodetected_vector_mode))) - 
mode_i += 1; + && VECTOR_MODE_P (autodetected_vector_mode) + && (related_vector_mode (vector_modes[mode_i], + GET_MODE_INNER (autodetected_vector_mode)) + == autodetected_vector_mode) + && (related_vector_mode (autodetected_vector_mode, + GET_MODE_INNER (vector_modes[mode_i])) + == vector_modes[mode_i])) + { + if (dump_enabled_p ()) + dump_printf_loc (MSG_NOTE, vect_location, + "***** Skipping vector mode %s, which would" + " repeat the analysis for %s\n", + GET_MODE_NAME (vector_modes[mode_i]), + GET_MODE_NAME (autodetected_vector_mode)); + mode_i += 1; + } if (vectorized || mode_i == vector_modes.length () Index: gcc/tree-vect-loop.c =================================================================== --- gcc/tree-vect-loop.c 2019-11-06 12:35:12.756199552 +0000 +++ gcc/tree-vect-loop.c 2019-11-06 12:35:27.972093472 +0000 @@ -2417,6 +2417,17 @@ vect_analyze_loop (class loop *loop, vec res = vect_analyze_loop_2 (loop_vinfo, fatal, &n_stmts); if (mode_i == 0) autodetected_vector_mode = loop_vinfo->vector_mode; + if (dump_enabled_p ()) + { + if (res) + dump_printf_loc (MSG_NOTE, vect_location, + "***** Analysis succeeded with vector mode %s\n", + GET_MODE_NAME (loop_vinfo->vector_mode)); + else + dump_printf_loc (MSG_NOTE, vect_location, + "***** Analysis failed with vector mode %s\n", + GET_MODE_NAME (loop_vinfo->vector_mode)); + } loop->aux = NULL; if (res) @@ -2479,9 +2490,22 @@ vect_analyze_loop (class loop *loop, vec } if (mode_i < vector_modes.length () - && known_eq (GET_MODE_SIZE (vector_modes[mode_i]), - GET_MODE_SIZE (autodetected_vector_mode))) - mode_i += 1; + && VECTOR_MODE_P (autodetected_vector_mode) + && (related_vector_mode (vector_modes[mode_i], + GET_MODE_INNER (autodetected_vector_mode)) + == autodetected_vector_mode) + && (related_vector_mode (autodetected_vector_mode, + GET_MODE_INNER (vector_modes[mode_i])) + == vector_modes[mode_i])) + { + if (dump_enabled_p ()) + dump_printf_loc (MSG_NOTE, vect_location, + "***** Skipping vector mode %s, which 
would" + " repeat the analysis for %s\n", + GET_MODE_NAME (vector_modes[mode_i]), + GET_MODE_NAME (autodetected_vector_mode)); + mode_i += 1; + } if (mode_i == vector_modes.length () || autodetected_vector_mode == VOIDmode) @@ -4870,13 +4894,15 @@ vect_create_epilog_for_reduction (stmt_v in a vector mode of smaller size and first reduce upper/lower halves against each other. */ enum machine_mode mode1 = mode; - unsigned sz = tree_to_uhwi (TYPE_SIZE_UNIT (vectype)); - unsigned sz1 = sz; + unsigned nunits = TYPE_VECTOR_SUBPARTS (vectype).to_constant (); + unsigned nunits1 = nunits; if (!slp_reduc && (mode1 = targetm.vectorize.split_reduction (mode)) != mode) - sz1 = GET_MODE_SIZE (mode1).to_constant (); + nunits1 = GET_MODE_NUNITS (mode1).to_constant (); - tree vectype1 = get_vectype_for_scalar_type_and_size (scalar_type, sz1); + tree vectype1 = get_related_vectype_for_scalar_type (TYPE_MODE (vectype), + scalar_type, + nunits1); reduce_with_shift = have_whole_vector_shift (mode1); if (!VECTOR_MODE_P (mode1)) reduce_with_shift = false; @@ -4890,11 +4916,13 @@ vect_create_epilog_for_reduction (stmt_v /* First reduce the vector to the desired vector size we should do shift reduction on by combining upper and lower halves. 
*/ new_temp = new_phi_result; - while (sz > sz1) + while (nunits > nunits1) { gcc_assert (!slp_reduc); - sz /= 2; - vectype1 = get_vectype_for_scalar_type_and_size (scalar_type, sz); + nunits /= 2; + vectype1 = get_related_vectype_for_scalar_type (TYPE_MODE (vectype), + scalar_type, nunits); + unsigned int bitsize = tree_to_uhwi (TYPE_SIZE (vectype1)); /* The target has to make sure we support lowpart/highpart extraction, either via direct vector extract or through @@ -4919,15 +4947,14 @@ vect_create_epilog_for_reduction (stmt_v = gimple_build_assign (dst2, BIT_FIELD_REF, build3 (BIT_FIELD_REF, vectype1, new_temp, TYPE_SIZE (vectype1), - bitsize_int (sz * BITS_PER_UNIT))); + bitsize_int (bitsize))); gsi_insert_before (&exit_gsi, epilog_stmt, GSI_SAME_STMT); } else { /* Extract via punning to appropriately sized integer mode vector. */ - tree eltype = build_nonstandard_integer_type (sz * BITS_PER_UNIT, - 1); + tree eltype = build_nonstandard_integer_type (bitsize, 1); tree etype = build_vector_type (eltype, 2); gcc_assert (convert_optab_handler (vec_extract_optab, TYPE_MODE (etype), @@ -4956,7 +4983,7 @@ vect_create_epilog_for_reduction (stmt_v = gimple_build_assign (tem, BIT_FIELD_REF, build3 (BIT_FIELD_REF, eltype, new_temp, TYPE_SIZE (eltype), - bitsize_int (sz * BITS_PER_UNIT))); + bitsize_int (bitsize))); gsi_insert_before (&exit_gsi, epilog_stmt, GSI_SAME_STMT); dst2 = make_ssa_name (vectype1); epilog_stmt = gimple_build_assign (dst2, VIEW_CONVERT_EXPR, Index: gcc/tree-vect-stmts.c =================================================================== --- gcc/tree-vect-stmts.c 2019-11-06 12:35:12.796199272 +0000 +++ gcc/tree-vect-stmts.c 2019-11-06 12:35:27.976093444 +0000 @@ -11097,18 +11097,28 @@ vect_remove_stores (stmt_vec_info first_ } } -/* Function get_vectype_for_scalar_type_and_size. - - Returns the vector type corresponding to SCALAR_TYPE and SIZE as supported - by the target. 
*/ +/* If NUNITS is nonzero, return a vector type that contains NUNITS + elements of type SCALAR_TYPE, or null if the target doesn't support + such a type. + + If NUNITS is zero, return a vector type that contains elements of + type SCALAR_TYPE, choosing whichever vector size the target prefers. + + If PREVAILING_MODE is VOIDmode, we have not yet chosen a vector mode + for this vectorization region and want to "autodetect" the best choice. + Otherwise, PREVAILING_MODE is a previously-chosen vector TYPE_MODE + and we want the new type to be interoperable with it. PREVAILING_MODE + in this case can be a scalar integer mode or a vector mode; when it + is a vector mode, the function acts like a tree-level version of + related_vector_mode. */ tree -get_vectype_for_scalar_type_and_size (tree scalar_type, poly_uint64 size) +get_related_vectype_for_scalar_type (machine_mode prevailing_mode, + tree scalar_type, poly_uint64 nunits) { tree orig_scalar_type = scalar_type; scalar_mode inner_mode; machine_mode simd_mode; - poly_uint64 nunits; tree vectype; if (!is_int_mode (TYPE_MODE (scalar_type), &inner_mode) @@ -11148,10 +11158,11 @@ get_vectype_for_scalar_type_and_size (tr if (scalar_type == NULL_TREE) return NULL_TREE; - /* If no size was supplied use the mode the target prefers. Otherwise - lookup a vector mode of the specified size. */ - if (known_eq (size, 0U)) + /* If no prevailing mode was supplied, use the mode the target prefers. + Otherwise lookup a vector mode based on the prevailing mode. 
*/ + if (prevailing_mode == VOIDmode) { + gcc_assert (known_eq (nunits, 0U)); simd_mode = targetm.vectorize.preferred_simd_mode (inner_mode); if (SCALAR_INT_MODE_P (simd_mode)) { @@ -11167,9 +11178,19 @@ get_vectype_for_scalar_type_and_size (tr return NULL_TREE; } } - else if (!multiple_p (size, nbytes, &nunits) - || !mode_for_vector (inner_mode, nunits).exists (&simd_mode)) - return NULL_TREE; + else if (SCALAR_INT_MODE_P (prevailing_mode) + || !related_vector_mode (prevailing_mode, + inner_mode, nunits).exists (&simd_mode)) + { + /* Fall back to using mode_for_vector, mostly in the hope of being + able to use an integer mode. */ + if (known_eq (nunits, 0U) + && !multiple_p (GET_MODE_SIZE (prevailing_mode), nbytes, &nunits)) + return NULL_TREE; + + if (!mode_for_vector (inner_mode, nunits).exists (&simd_mode)) + return NULL_TREE; + } vectype = build_vector_type_for_mode (scalar_type, simd_mode); @@ -11197,9 +11218,8 @@ get_vectype_for_scalar_type_and_size (tr tree get_vectype_for_scalar_type (vec_info *vinfo, tree scalar_type) { - tree vectype; - poly_uint64 vector_size = GET_MODE_SIZE (vinfo->vector_mode); - vectype = get_vectype_for_scalar_type_and_size (scalar_type, vector_size); + tree vectype = get_related_vectype_for_scalar_type (vinfo->vector_mode, + scalar_type); if (vectype && vinfo->vector_mode == VOIDmode) vinfo->vector_mode = TYPE_MODE (vectype); return vectype; @@ -11232,8 +11252,13 @@ get_same_sized_vectype (tree scalar_type if (VECT_SCALAR_BOOLEAN_TYPE_P (scalar_type)) return truth_type_for (vector_type); - return get_vectype_for_scalar_type_and_size - (scalar_type, GET_MODE_SIZE (TYPE_MODE (vector_type))); + poly_uint64 nunits; + if (!multiple_p (GET_MODE_SIZE (TYPE_MODE (vector_type)), + GET_MODE_SIZE (TYPE_MODE (scalar_type)), &nunits)) + return NULL_TREE; + + return get_related_vectype_for_scalar_type (TYPE_MODE (vector_type), + scalar_type, nunits); } /* Function vect_is_simple_use. 
Index: gcc/tree-vectorizer.c =================================================================== --- gcc/tree-vectorizer.c 2019-11-06 12:35:12.764199495 +0000 +++ gcc/tree-vectorizer.c 2019-11-06 12:35:27.976093444 +0000 @@ -1359,7 +1359,7 @@ get_vec_alignment_for_array_type (tree t poly_uint64 array_size, vector_size; tree scalar_type = strip_array_types (type); - tree vectype = get_vectype_for_scalar_type_and_size (scalar_type, 0); + tree vectype = get_related_vectype_for_scalar_type (VOIDmode, scalar_type); if (!vectype || !poly_int_tree_p (TYPE_SIZE (type), &array_size) || !poly_int_tree_p (TYPE_SIZE (vectype), &vector_size)
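The revised epilogue code above counts lanes (nunits) rather than bytes, and derives the bit size of each half-width vector type from the type itself. The arithmetic can be sketched on its own. This is a simplified illustration, not the GCC code: `HalfStep` and `halving_steps` are hypothetical names, and the sketch assumes constant lane counts whose ratio is a power of two.

```cpp
#include <cassert>
#include <vector>

// One iteration of the halving loop: the lane count of the half-width
// vector type and its total size in bits.  In the patch the latter is
// "bitsize", used both as the size of each half and as the bit offset
// of the upper half.
struct HalfStep {
  unsigned nunits;
  unsigned bitsize;
};

// Halve the lane count from NUNITS down to NUNITS1, recording each
// intermediate vector shape.  ELT_BITS is the element width in bits.
std::vector<HalfStep> halving_steps(unsigned nunits, unsigned nunits1,
                                    unsigned elt_bits) {
  assert(nunits1 != 0 && nunits % nunits1 == 0);
  unsigned ratio = nunits / nunits1;
  assert((ratio & (ratio - 1)) == 0);  // ratio must be a power of two
  std::vector<HalfStep> steps;
  while (nunits > nunits1) {
    nunits /= 2;
    steps.push_back({nunits, nunits * elt_bits});
  }
  return steps;
}
```

For the V4DI to V2DI case raised in the review, `halving_steps (4, 2, 64)` yields a single step with 2 lanes and a bitsize of 128, so the lane count passed to the type lookup is the same value that names the requested vector shape.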
On Wed, Nov 6, 2019 at 1:38 PM Richard Sandiford <richard.sandiford@arm.com> wrote:
>
> Richard Biener <richard.guenther@gmail.com> writes:
> > On Fri, Oct 25, 2019 at 2:43 PM Richard Sandiford
> > <richard.sandiford@arm.com> wrote:
> >>
> >> After previous patches, it's now possible to make the vectoriser
> >> support multiple vector sizes in the same vector region, using
> >> related_vector_mode to pick the right vector mode for a given
> >> element mode. No port yet takes advantage of this, but I have
> >> a follow-on patch for AArch64.
> >>
> >> This patch also seemed like a good opportunity to add some more dump
> >> messages: one to make it clear which vector size/mode was being used
> >> when analysis passed or failed, and another to say when we've decided
> >> to skip a redundant vector size/mode.
> >
> > OK.
> >
> > I wonder if, when we requested a specific size previously, we now
> > have to verify we got that constraint satisfied after the change.
> > Esp. the epilogue vectorization cases want to get V2DI
> > from V4DI.
> >
> > sz /= 2;
> > - vectype1 = get_vectype_for_scalar_type_and_size (scalar_type, sz);
> > + vectype1 = get_related_vectype_for_scalar_type (TYPE_MODE (vectype),
> > + scalar_type,
> > + sz / scalar_bytes);
> >
> > doesn't look like an improvement in readability to me there.
>
> Yeah, guess it isn't great.
>
> > Maybe re-formulating the whole code in terms of lanes instead of size
> > would make it easier to follow?
>
> OK, how about this version? It still won't win awards, but it's at
> least a bit more readable.
>
> Tested as before.

OK (and sorry for the delay, looking for leftovers of the series now).

Thanks,
Richard.

> Richard
>
>
> 2019-11-06 Richard Sandiford <richard.sandiford@arm.com>
>
> gcc/
> * machmode.h (opt_machine_mode::operator==): New function.
> (opt_machine_mode::operator!=): Likewise.
> * tree-vectorizer.h (vec_info::vector_mode): Update comment.
> (get_vectype_for_scalar_type_and_size): Delete.
> (get_related_vectype_for_scalar_type): Declare.
> * tree-vect-slp.c (vect_slp_bb_region): Print dump messages to say
> whether analysis passed or failed, and with what vector modes.
> Use related_vector_mode to check whether trying a particular
> vector mode would be redundant with the autodetected mode,
> and print a dump message if we decide to skip it.
> * tree-vect-loop.c (vect_analyze_loop): Likewise.
> (vect_create_epilog_for_reduction): Use
> get_related_vectype_for_scalar_type instead of
> get_vectype_for_scalar_type_and_size.
> * tree-vect-stmts.c (get_vectype_for_scalar_type_and_size): Replace
> with...
> (get_related_vectype_for_scalar_type): ...this new function.
> Take a starting/"prevailing" vector mode rather than a vector size.
> Take an optional nunits argument, with the same meaning as for
> related_vector_mode. Use related_vector_mode when not
> auto-detecting a mode, falling back to mode_for_vector if no
> target mode exists.
> (get_vectype_for_scalar_type): Update accordingly.
> (get_same_sized_vectype): Likewise.
> * tree-vectorizer.c (get_vec_alignment_for_array_type): Likewise.
>
> Index: gcc/machmode.h
> ===================================================================
> --- gcc/machmode.h 2019-11-06 12:35:12.460201615 +0000
> +++ gcc/machmode.h 2019-11-06 12:35:27.972093472 +0000
> @@ -258,6 +258,9 @@ #define CLASS_HAS_WIDER_MODES_P(CLASS)
> bool exists () const;
> template<typename U> bool exists (U *) const;
>
> + bool operator== (const T &m) const { return m_mode == m; }
> + bool operator!= (const T &m) const { return m_mode != m; }
> +
> private:
> machine_mode m_mode;
> };
> Index: gcc/tree-vectorizer.h
> ===================================================================
> --- gcc/tree-vectorizer.h 2019-11-06 12:35:12.764199495 +0000
> +++ gcc/tree-vectorizer.h 2019-11-06 12:35:27.976093444 +0000
> @@ -335,8 +335,9 @@ typedef std::pair<tree, tree> vec_object
> /* Cost data used by the target cost model.
*/ > void *target_cost_data; > > - /* If we've chosen a vector size for this vectorization region, > - this is one mode that has such a size, otherwise it is VOIDmode. */ > + /* The argument we should pass to related_vector_mode when looking up > + the vector mode for a scalar mode, or VOIDmode if we haven't yet > + made any decisions about which vector modes to use. */ > machine_mode vector_mode; > > private: > @@ -1609,8 +1610,9 @@ extern bool vect_can_advance_ivs_p (loop > extern void vect_update_inits_of_drs (loop_vec_info, tree, tree_code); > > /* In tree-vect-stmts.c. */ > +extern tree get_related_vectype_for_scalar_type (machine_mode, tree, > + poly_uint64 = 0); > extern tree get_vectype_for_scalar_type (vec_info *, tree); > -extern tree get_vectype_for_scalar_type_and_size (tree, poly_uint64); > extern tree get_mask_type_for_scalar_type (vec_info *, tree); > extern tree get_same_sized_vectype (tree, tree); > extern bool vect_get_loop_mask_type (loop_vec_info); > Index: gcc/tree-vect-slp.c > =================================================================== > --- gcc/tree-vect-slp.c 2019-11-06 12:35:12.760199523 +0000 > +++ gcc/tree-vect-slp.c 2019-11-06 12:35:27.972093472 +0000 > @@ -3202,7 +3202,12 @@ vect_slp_bb_region (gimple_stmt_iterator > && dbg_cnt (vect_slp)) > { > if (dump_enabled_p ()) > - dump_printf_loc (MSG_NOTE, vect_location, "SLPing BB part\n"); > + { > + dump_printf_loc (MSG_NOTE, vect_location, > + "***** Analysis succeeded with vector mode" > + " %s\n", GET_MODE_NAME (bb_vinfo->vector_mode)); > + dump_printf_loc (MSG_NOTE, vect_location, "SLPing BB part\n"); > + } > > bb_vinfo->shared->check_datarefs (); > vect_schedule_slp (bb_vinfo); > @@ -3222,6 +3227,13 @@ vect_slp_bb_region (gimple_stmt_iterator > > vectorized = true; > } > + else > + { > + if (dump_enabled_p ()) > + dump_printf_loc (MSG_NOTE, vect_location, > + "***** Analysis failed with vector mode %s\n", > + GET_MODE_NAME (bb_vinfo->vector_mode)); > + } > > if (mode_i == 0) > 
autodetected_vector_mode = bb_vinfo->vector_mode; > @@ -3229,9 +3241,22 @@ vect_slp_bb_region (gimple_stmt_iterator > delete bb_vinfo; > > if (mode_i < vector_modes.length () > - && known_eq (GET_MODE_SIZE (vector_modes[mode_i]), > - GET_MODE_SIZE (autodetected_vector_mode))) > - mode_i += 1; > + && VECTOR_MODE_P (autodetected_vector_mode) > + && (related_vector_mode (vector_modes[mode_i], > + GET_MODE_INNER (autodetected_vector_mode)) > + == autodetected_vector_mode) > + && (related_vector_mode (autodetected_vector_mode, > + GET_MODE_INNER (vector_modes[mode_i])) > + == vector_modes[mode_i])) > + { > + if (dump_enabled_p ()) > + dump_printf_loc (MSG_NOTE, vect_location, > + "***** Skipping vector mode %s, which would" > + " repeat the analysis for %s\n", > + GET_MODE_NAME (vector_modes[mode_i]), > + GET_MODE_NAME (autodetected_vector_mode)); > + mode_i += 1; > + } > > if (vectorized > || mode_i == vector_modes.length () > Index: gcc/tree-vect-loop.c > =================================================================== > --- gcc/tree-vect-loop.c 2019-11-06 12:35:12.756199552 +0000 > +++ gcc/tree-vect-loop.c 2019-11-06 12:35:27.972093472 +0000 > @@ -2417,6 +2417,17 @@ vect_analyze_loop (class loop *loop, vec > res = vect_analyze_loop_2 (loop_vinfo, fatal, &n_stmts); > if (mode_i == 0) > autodetected_vector_mode = loop_vinfo->vector_mode; > + if (dump_enabled_p ()) > + { > + if (res) > + dump_printf_loc (MSG_NOTE, vect_location, > + "***** Analysis succeeded with vector mode %s\n", > + GET_MODE_NAME (loop_vinfo->vector_mode)); > + else > + dump_printf_loc (MSG_NOTE, vect_location, > + "***** Analysis failed with vector mode %s\n", > + GET_MODE_NAME (loop_vinfo->vector_mode)); > + } > > loop->aux = NULL; > if (res) > @@ -2479,9 +2490,22 @@ vect_analyze_loop (class loop *loop, vec > } > > if (mode_i < vector_modes.length () > - && known_eq (GET_MODE_SIZE (vector_modes[mode_i]), > - GET_MODE_SIZE (autodetected_vector_mode))) > - mode_i += 1; > + && VECTOR_MODE_P 
(autodetected_vector_mode) > + && (related_vector_mode (vector_modes[mode_i], > + GET_MODE_INNER (autodetected_vector_mode)) > + == autodetected_vector_mode) > + && (related_vector_mode (autodetected_vector_mode, > + GET_MODE_INNER (vector_modes[mode_i])) > + == vector_modes[mode_i])) > + { > + if (dump_enabled_p ()) > + dump_printf_loc (MSG_NOTE, vect_location, > + "***** Skipping vector mode %s, which would" > + " repeat the analysis for %s\n", > + GET_MODE_NAME (vector_modes[mode_i]), > + GET_MODE_NAME (autodetected_vector_mode)); > + mode_i += 1; > + } > > if (mode_i == vector_modes.length () > || autodetected_vector_mode == VOIDmode) > @@ -4870,13 +4894,15 @@ vect_create_epilog_for_reduction (stmt_v > in a vector mode of smaller size and first reduce upper/lower > halves against each other. */ > enum machine_mode mode1 = mode; > - unsigned sz = tree_to_uhwi (TYPE_SIZE_UNIT (vectype)); > - unsigned sz1 = sz; > + unsigned nunits = TYPE_VECTOR_SUBPARTS (vectype).to_constant (); > + unsigned nunits1 = nunits; > if (!slp_reduc > && (mode1 = targetm.vectorize.split_reduction (mode)) != mode) > - sz1 = GET_MODE_SIZE (mode1).to_constant (); > + nunits1 = GET_MODE_NUNITS (mode1).to_constant (); > > - tree vectype1 = get_vectype_for_scalar_type_and_size (scalar_type, sz1); > + tree vectype1 = get_related_vectype_for_scalar_type (TYPE_MODE (vectype), > + scalar_type, > + nunits1); > reduce_with_shift = have_whole_vector_shift (mode1); > if (!VECTOR_MODE_P (mode1)) > reduce_with_shift = false; > @@ -4890,11 +4916,13 @@ vect_create_epilog_for_reduction (stmt_v > /* First reduce the vector to the desired vector size we should > do shift reduction on by combining upper and lower halves. 
*/ > new_temp = new_phi_result; > - while (sz > sz1) > + while (nunits > nunits1) > { > gcc_assert (!slp_reduc); > - sz /= 2; > - vectype1 = get_vectype_for_scalar_type_and_size (scalar_type, sz); > + nunits /= 2; > + vectype1 = get_related_vectype_for_scalar_type (TYPE_MODE (vectype), > + scalar_type, nunits); > + unsigned int bitsize = tree_to_uhwi (TYPE_SIZE (vectype1)); > > /* The target has to make sure we support lowpart/highpart > extraction, either via direct vector extract or through > @@ -4919,15 +4947,14 @@ vect_create_epilog_for_reduction (stmt_v > = gimple_build_assign (dst2, BIT_FIELD_REF, > build3 (BIT_FIELD_REF, vectype1, > new_temp, TYPE_SIZE (vectype1), > - bitsize_int (sz * BITS_PER_UNIT))); > + bitsize_int (bitsize))); > gsi_insert_before (&exit_gsi, epilog_stmt, GSI_SAME_STMT); > } > else > { > /* Extract via punning to appropriately sized integer mode > vector. */ > - tree eltype = build_nonstandard_integer_type (sz * BITS_PER_UNIT, > - 1); > + tree eltype = build_nonstandard_integer_type (bitsize, 1); > tree etype = build_vector_type (eltype, 2); > gcc_assert (convert_optab_handler (vec_extract_optab, > TYPE_MODE (etype), > @@ -4956,7 +4983,7 @@ vect_create_epilog_for_reduction (stmt_v > = gimple_build_assign (tem, BIT_FIELD_REF, > build3 (BIT_FIELD_REF, eltype, > new_temp, TYPE_SIZE (eltype), > - bitsize_int (sz * BITS_PER_UNIT))); > + bitsize_int (bitsize))); > gsi_insert_before (&exit_gsi, epilog_stmt, GSI_SAME_STMT); > dst2 = make_ssa_name (vectype1); > epilog_stmt = gimple_build_assign (dst2, VIEW_CONVERT_EXPR, > Index: gcc/tree-vect-stmts.c > =================================================================== > --- gcc/tree-vect-stmts.c 2019-11-06 12:35:12.796199272 +0000 > +++ gcc/tree-vect-stmts.c 2019-11-06 12:35:27.976093444 +0000 > @@ -11097,18 +11097,28 @@ vect_remove_stores (stmt_vec_info first_ > } > } > > -/* Function get_vectype_for_scalar_type_and_size. 
> - > - Returns the vector type corresponding to SCALAR_TYPE and SIZE as supported > - by the target. */ > +/* If NUNITS is nonzero, return a vector type that contains NUNITS > + elements of type SCALAR_TYPE, or null if the target doesn't support > + such a type. > + > + If NUNITS is zero, return a vector type that contains elements of > + type SCALAR_TYPE, choosing whichever vector size the target prefers. > + > + If PREVAILING_MODE is VOIDmode, we have not yet chosen a vector mode > + for this vectorization region and want to "autodetect" the best choice. > + Otherwise, PREVAILING_MODE is a previously-chosen vector TYPE_MODE > + and we want the new type to be interoperable with it. PREVAILING_MODE > + in this case can be a scalar integer mode or a vector mode; when it > + is a vector mode, the function acts like a tree-level version of > + related_vector_mode. */ > > tree > -get_vectype_for_scalar_type_and_size (tree scalar_type, poly_uint64 size) > +get_related_vectype_for_scalar_type (machine_mode prevailing_mode, > + tree scalar_type, poly_uint64 nunits) > { > tree orig_scalar_type = scalar_type; > scalar_mode inner_mode; > machine_mode simd_mode; > - poly_uint64 nunits; > tree vectype; > > if (!is_int_mode (TYPE_MODE (scalar_type), &inner_mode) > @@ -11148,10 +11158,11 @@ get_vectype_for_scalar_type_and_size (tr > if (scalar_type == NULL_TREE) > return NULL_TREE; > > - /* If no size was supplied use the mode the target prefers. Otherwise > - lookup a vector mode of the specified size. */ > - if (known_eq (size, 0U)) > + /* If no prevailing mode was supplied, use the mode the target prefers. > + Otherwise lookup a vector mode based on the prevailing mode. 
*/ > + if (prevailing_mode == VOIDmode) > { > + gcc_assert (known_eq (nunits, 0U)); > simd_mode = targetm.vectorize.preferred_simd_mode (inner_mode); > if (SCALAR_INT_MODE_P (simd_mode)) > { > @@ -11167,9 +11178,19 @@ get_vectype_for_scalar_type_and_size (tr > return NULL_TREE; > } > } > - else if (!multiple_p (size, nbytes, &nunits) > - || !mode_for_vector (inner_mode, nunits).exists (&simd_mode)) > - return NULL_TREE; > + else if (SCALAR_INT_MODE_P (prevailing_mode) > + || !related_vector_mode (prevailing_mode, > + inner_mode, nunits).exists (&simd_mode)) > + { > + /* Fall back to using mode_for_vector, mostly in the hope of being > + able to use an integer mode. */ > + if (known_eq (nunits, 0U) > + && !multiple_p (GET_MODE_SIZE (prevailing_mode), nbytes, &nunits)) > + return NULL_TREE; > + > + if (!mode_for_vector (inner_mode, nunits).exists (&simd_mode)) > + return NULL_TREE; > + } > > vectype = build_vector_type_for_mode (scalar_type, simd_mode); > > @@ -11197,9 +11218,8 @@ get_vectype_for_scalar_type_and_size (tr > tree > get_vectype_for_scalar_type (vec_info *vinfo, tree scalar_type) > { > - tree vectype; > - poly_uint64 vector_size = GET_MODE_SIZE (vinfo->vector_mode); > - vectype = get_vectype_for_scalar_type_and_size (scalar_type, vector_size); > + tree vectype = get_related_vectype_for_scalar_type (vinfo->vector_mode, > + scalar_type); > if (vectype && vinfo->vector_mode == VOIDmode) > vinfo->vector_mode = TYPE_MODE (vectype); > return vectype; > @@ -11232,8 +11252,13 @@ get_same_sized_vectype (tree scalar_type > if (VECT_SCALAR_BOOLEAN_TYPE_P (scalar_type)) > return truth_type_for (vector_type); > > - return get_vectype_for_scalar_type_and_size > - (scalar_type, GET_MODE_SIZE (TYPE_MODE (vector_type))); > + poly_uint64 nunits; > + if (!multiple_p (GET_MODE_SIZE (TYPE_MODE (vector_type)), > + GET_MODE_SIZE (TYPE_MODE (scalar_type)), &nunits)) > + return NULL_TREE; > + > + return get_related_vectype_for_scalar_type (TYPE_MODE (vector_type), > + 
scalar_type, nunits); > } > > /* Function vect_is_simple_use. > Index: gcc/tree-vectorizer.c > =================================================================== > --- gcc/tree-vectorizer.c 2019-11-06 12:35:12.764199495 +0000 > +++ gcc/tree-vectorizer.c 2019-11-06 12:35:27.976093444 +0000 > @@ -1359,7 +1359,7 @@ get_vec_alignment_for_array_type (tree t > poly_uint64 array_size, vector_size; > > tree scalar_type = strip_array_types (type); > - tree vectype = get_vectype_for_scalar_type_and_size (scalar_type, 0); > + tree vectype = get_related_vectype_for_scalar_type (VOIDmode, scalar_type); > if (!vectype > || !poly_int_tree_p (TYPE_SIZE (type), &array_size) > || !poly_int_tree_p (TYPE_SIZE (vectype), &vector_size)
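As an aside on the lanes-versus-bytes point raised in the review: below is a minimal standalone sketch, not GCC code, of a halving loop driven by a lane count rather than a byte size. The name `reduce_to_nunits` and its parameters are hypothetical. It assumes a power-of-two lane count and uses addition as the reduction operation, whereas the real epilogue code builds GIMPLE statements and asks the target for each narrower vector type.

```cpp
#include <cstddef>
#include <vector>

// Toy model only: "lanes" stands in for a vector register and
// "target_nunits" for the lane count that a split-reduction hook asked for.
// ASSUMPTION: lanes.size () is a power of two and target_nunits >= 1.
std::vector<long>
reduce_to_nunits (std::vector<long> lanes, std::size_t target_nunits)
{
  std::size_t nunits = lanes.size ();
  while (nunits > target_nunits)
    {
      // Halve the lane count, then fold the upper half onto the lower half.
      nunits /= 2;
      for (std::size_t i = 0; i < nunits; ++i)
        lanes[i] += lanes[i + nunits];
      lanes.resize (nunits);
    }
  return lanes;
}
```

Calling `reduce_to_nunits ({1, 2, 3, 4, 5, 6, 7, 8}, 2)` goes through one 8-to-4 step and one 4-to-2 step. Counting lanes means the loop condition and the type lookup use the same unit, so no division by the element size is needed at each step.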
Index: gcc/machmode.h =================================================================== --- gcc/machmode.h 2019-10-25 13:26:59.053879364 +0100 +++ gcc/machmode.h 2019-10-25 13:27:26.201687539 +0100 @@ -258,6 +258,9 @@ #define CLASS_HAS_WIDER_MODES_P(CLASS) bool exists () const; template<typename U> bool exists (U *) const; + bool operator== (const T &m) const { return m_mode == m; } + bool operator!= (const T &m) const { return m_mode != m; } + private: machine_mode m_mode; }; Index: gcc/tree-vectorizer.h =================================================================== --- gcc/tree-vectorizer.h 2019-10-25 13:27:19.317736181 +0100 +++ gcc/tree-vectorizer.h 2019-10-25 13:27:26.209687483 +0100 @@ -329,8 +329,9 @@ typedef std::pair<tree, tree> vec_object /* Cost data used by the target cost model. */ void *target_cost_data; - /* If we've chosen a vector size for this vectorization region, - this is one mode that has such a size, otherwise it is VOIDmode. */ + /* The argument we should pass to related_vector_mode when looking up + the vector mode for a scalar mode, or VOIDmode if we haven't yet + made any decisions about which vector modes to use. */ machine_mode vector_mode; private: @@ -1595,8 +1596,9 @@ extern dump_user_location_t find_loop_lo extern bool vect_can_advance_ivs_p (loop_vec_info); /* In tree-vect-stmts.c. 
*/ +extern tree get_related_vectype_for_scalar_type (machine_mode, tree, + poly_uint64 = 0); extern tree get_vectype_for_scalar_type (vec_info *, tree); -extern tree get_vectype_for_scalar_type_and_size (tree, poly_uint64); extern tree get_mask_type_for_scalar_type (vec_info *, tree); extern tree get_same_sized_vectype (tree, tree); extern bool vect_get_loop_mask_type (loop_vec_info); Index: gcc/tree-vect-slp.c =================================================================== --- gcc/tree-vect-slp.c 2019-10-25 13:27:19.313736209 +0100 +++ gcc/tree-vect-slp.c 2019-10-25 13:27:26.205687511 +0100 @@ -3118,7 +3118,12 @@ vect_slp_bb_region (gimple_stmt_iterator && dbg_cnt (vect_slp)) { if (dump_enabled_p ()) - dump_printf_loc (MSG_NOTE, vect_location, "SLPing BB part\n"); + { + dump_printf_loc (MSG_NOTE, vect_location, + "***** Analysis succeeded with vector mode" + " %s\n", GET_MODE_NAME (bb_vinfo->vector_mode)); + dump_printf_loc (MSG_NOTE, vect_location, "SLPing BB part\n"); + } bb_vinfo->shared->check_datarefs (); vect_schedule_slp (bb_vinfo); @@ -3138,6 +3143,13 @@ vect_slp_bb_region (gimple_stmt_iterator vectorized = true; } + else + { + if (dump_enabled_p ()) + dump_printf_loc (MSG_NOTE, vect_location, + "***** Analysis failed with vector mode %s\n", + GET_MODE_NAME (bb_vinfo->vector_mode)); + } if (mode_i == 0) autodetected_vector_mode = bb_vinfo->vector_mode; @@ -3145,9 +3157,22 @@ vect_slp_bb_region (gimple_stmt_iterator delete bb_vinfo; if (mode_i < vector_modes.length () - && known_eq (GET_MODE_SIZE (vector_modes[mode_i]), - GET_MODE_SIZE (autodetected_vector_mode))) - mode_i += 1; + && VECTOR_MODE_P (autodetected_vector_mode) + && (related_vector_mode (vector_modes[mode_i], + GET_MODE_INNER (autodetected_vector_mode)) + == autodetected_vector_mode) + && (related_vector_mode (autodetected_vector_mode, + GET_MODE_INNER (vector_modes[mode_i])) + == vector_modes[mode_i])) + { + if (dump_enabled_p ()) + dump_printf_loc (MSG_NOTE, vect_location, + "***** 
Skipping vector mode %s, which would" + " repeat the analysis for %s\n", + GET_MODE_NAME (vector_modes[mode_i]), + GET_MODE_NAME (autodetected_vector_mode)); + mode_i += 1; + } if (vectorized || mode_i == vector_modes.length () Index: gcc/tree-vect-loop.c =================================================================== --- gcc/tree-vect-loop.c 2019-10-25 13:27:19.309736237 +0100 +++ gcc/tree-vect-loop.c 2019-10-25 13:27:26.201687539 +0100 @@ -2367,6 +2367,17 @@ vect_analyze_loop (class loop *loop, loo opt_result res = vect_analyze_loop_2 (loop_vinfo, fatal, &n_stmts); if (mode_i == 0) autodetected_vector_mode = loop_vinfo->vector_mode; + if (dump_enabled_p ()) + { + if (res) + dump_printf_loc (MSG_NOTE, vect_location, + "***** Analysis succeeded with vector mode %s\n", + GET_MODE_NAME (loop_vinfo->vector_mode)); + else + dump_printf_loc (MSG_NOTE, vect_location, + "***** Analysis failed with vector mode %s\n", + GET_MODE_NAME (loop_vinfo->vector_mode)); + } if (res) { @@ -2400,9 +2411,22 @@ vect_analyze_loop (class loop *loop, loo } if (mode_i < vector_modes.length () - && known_eq (GET_MODE_SIZE (vector_modes[mode_i]), - GET_MODE_SIZE (autodetected_vector_mode))) - mode_i += 1; + && VECTOR_MODE_P (autodetected_vector_mode) + && (related_vector_mode (vector_modes[mode_i], + GET_MODE_INNER (autodetected_vector_mode)) + == autodetected_vector_mode) + && (related_vector_mode (autodetected_vector_mode, + GET_MODE_INNER (vector_modes[mode_i])) + == vector_modes[mode_i])) + { + if (dump_enabled_p ()) + dump_printf_loc (MSG_NOTE, vect_location, + "***** Skipping vector mode %s, which would" + " repeat the analysis for %s\n", + GET_MODE_NAME (vector_modes[mode_i]), + GET_MODE_NAME (autodetected_vector_mode)); + mode_i += 1; + } if (mode_i == vector_modes.length () || autodetected_vector_mode == VOIDmode) @@ -4763,7 +4787,10 @@ vect_create_epilog_for_reduction (stmt_v && (mode1 = targetm.vectorize.split_reduction (mode)) != mode) sz1 = GET_MODE_SIZE (mode1).to_constant 
(); - tree vectype1 = get_vectype_for_scalar_type_and_size (scalar_type, sz1); + unsigned int scalar_bytes = tree_to_uhwi (TYPE_SIZE_UNIT (scalar_type)); + tree vectype1 = get_related_vectype_for_scalar_type (TYPE_MODE (vectype), + scalar_type, + sz1 / scalar_bytes); reduce_with_shift = have_whole_vector_shift (mode1); if (!VECTOR_MODE_P (mode1)) reduce_with_shift = false; @@ -4781,7 +4808,9 @@ vect_create_epilog_for_reduction (stmt_v { gcc_assert (!slp_reduc); sz /= 2; - vectype1 = get_vectype_for_scalar_type_and_size (scalar_type, sz); + vectype1 = get_related_vectype_for_scalar_type (TYPE_MODE (vectype), + scalar_type, + sz / scalar_bytes); /* The target has to make sure we support lowpart/highpart extraction, either via direct vector extract or through Index: gcc/tree-vect-stmts.c =================================================================== --- gcc/tree-vect-stmts.c 2019-10-25 13:27:22.985710263 +0100 +++ gcc/tree-vect-stmts.c 2019-10-25 13:27:26.205687511 +0100 @@ -11111,18 +11111,28 @@ vect_remove_stores (stmt_vec_info first_ } } -/* Function get_vectype_for_scalar_type_and_size. - - Returns the vector type corresponding to SCALAR_TYPE and SIZE as supported - by the target. */ +/* If NUNITS is nonzero, return a vector type that contains NUNITS + elements of type SCALAR_TYPE, or null if the target doesn't support + such a type. + + If NUNITS is zero, return a vector type that contains elements of + type SCALAR_TYPE, choosing whichever vector size the target prefers. + + If PREVAILING_MODE is VOIDmode, we have not yet chosen a vector mode + for this vectorization region and want to "autodetect" the best choice. + Otherwise, PREVAILING_MODE is a previously-chosen vector TYPE_MODE + and we want the new type to be interoperable with it. PREVAILING_MODE + in this case can be a scalar integer mode or a vector mode; when it + is a vector mode, the function acts like a tree-level version of + related_vector_mode. 
*/ tree -get_vectype_for_scalar_type_and_size (tree scalar_type, poly_uint64 size) +get_related_vectype_for_scalar_type (machine_mode prevailing_mode, + tree scalar_type, poly_uint64 nunits) { tree orig_scalar_type = scalar_type; scalar_mode inner_mode; machine_mode simd_mode; - poly_uint64 nunits; tree vectype; if (!is_int_mode (TYPE_MODE (scalar_type), &inner_mode) @@ -11162,10 +11172,11 @@ get_vectype_for_scalar_type_and_size (tr if (scalar_type == NULL_TREE) return NULL_TREE; - /* If no size was supplied use the mode the target prefers. Otherwise - lookup a vector mode of the specified size. */ - if (known_eq (size, 0U)) + /* If no prevailing mode was supplied, use the mode the target prefers. + Otherwise lookup a vector mode based on the prevailing mode. */ + if (prevailing_mode == VOIDmode) { + gcc_assert (known_eq (nunits, 0U)); simd_mode = targetm.vectorize.preferred_simd_mode (inner_mode); if (SCALAR_INT_MODE_P (simd_mode)) { @@ -11181,9 +11192,19 @@ get_vectype_for_scalar_type_and_size (tr return NULL_TREE; } } - else if (!multiple_p (size, nbytes, &nunits) - || !mode_for_vector (inner_mode, nunits).exists (&simd_mode)) - return NULL_TREE; + else if (SCALAR_INT_MODE_P (prevailing_mode) + || !related_vector_mode (prevailing_mode, + inner_mode, nunits).exists (&simd_mode)) + { + /* Fall back to using mode_for_vector, mostly in the hope of being + able to use an integer mode. 
*/ + if (known_eq (nunits, 0U) + && !multiple_p (GET_MODE_SIZE (prevailing_mode), nbytes, &nunits)) + return NULL_TREE; + + if (!mode_for_vector (inner_mode, nunits).exists (&simd_mode)) + return NULL_TREE; + } vectype = build_vector_type_for_mode (scalar_type, simd_mode); @@ -11211,9 +11232,8 @@ get_vectype_for_scalar_type_and_size (tr tree get_vectype_for_scalar_type (vec_info *vinfo, tree scalar_type) { - tree vectype; - poly_uint64 vector_size = GET_MODE_SIZE (vinfo->vector_mode); - vectype = get_vectype_for_scalar_type_and_size (scalar_type, vector_size); + tree vectype = get_related_vectype_for_scalar_type (vinfo->vector_mode, + scalar_type); if (vectype && vinfo->vector_mode == VOIDmode) vinfo->vector_mode = TYPE_MODE (vectype); return vectype; @@ -11246,8 +11266,13 @@ get_same_sized_vectype (tree scalar_type if (VECT_SCALAR_BOOLEAN_TYPE_P (scalar_type)) return truth_type_for (vector_type); - return get_vectype_for_scalar_type_and_size - (scalar_type, GET_MODE_SIZE (TYPE_MODE (vector_type))); + poly_uint64 nunits; + if (!multiple_p (GET_MODE_SIZE (TYPE_MODE (vector_type)), + GET_MODE_SIZE (TYPE_MODE (scalar_type)), &nunits)) + return NULL_TREE; + + return get_related_vectype_for_scalar_type (TYPE_MODE (vector_type), + scalar_type, nunits); } /* Function vect_is_simple_use. Index: gcc/tree-vectorizer.c =================================================================== --- gcc/tree-vectorizer.c 2019-10-25 13:27:19.317736181 +0100 +++ gcc/tree-vectorizer.c 2019-10-25 13:27:26.209687483 +0100 @@ -1348,7 +1348,7 @@ get_vec_alignment_for_array_type (tree t poly_uint64 array_size, vector_size; tree scalar_type = strip_array_types (type); - tree vectype = get_vectype_for_scalar_type_and_size (scalar_type, 0); + tree vectype = get_related_vectype_for_scalar_type (VOIDmode, scalar_type); if (!vectype || !poly_int_tree_p (TYPE_SIZE (type), &array_size) || !poly_int_tree_p (TYPE_SIZE (vectype), &vector_size)
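To illustrate the "skip a redundant vector mode" test used in vect_slp_bb_region and vect_analyze_loop, here is a minimal standalone sketch, not GCC code. `ToyMode`, `ToyOptMode`, `toy_related_mode` and `toy_would_repeat_analysis` are hypothetical names. The sketch assumes a toy target whose related mode always has the same total size as the prevailing mode. The real related_vector_mode is controlled by the target and need not preserve the size, which is the point of the series. The `VECTOR_MODE_P` check from the patch is omitted because every `ToyMode` is a vector.

```cpp
// Toy stand-in for a vector machine_mode: element width and lane count.
struct ToyMode
{
  unsigned elem_bits;
  unsigned nunits;
  bool operator== (const ToyMode &o) const
  { return elem_bits == o.elem_bits && nunits == o.nunits; }
};

// Toy stand-in for an optional mode, loosely modelled on the comparison
// operators the patch adds.  In this toy an absent mode never compares
// equal to a ToyMode.
struct ToyOptMode
{
  bool present;
  ToyMode mode;
  bool operator== (const ToyMode &m) const { return present && mode == m; }
  bool operator!= (const ToyMode &m) const { return !(*this == m); }
};

// ASSUMPTION: the toy target only offers vectors with the same total size
// as PREVAILING; return "absent" if ELEM_BITS does not divide that size.
ToyOptMode
toy_related_mode (ToyMode prevailing, unsigned elem_bits)
{
  unsigned total_bits = prevailing.elem_bits * prevailing.nunits;
  if (elem_bits == 0 || total_bits % elem_bits != 0)
    return ToyOptMode { false, ToyMode { 0, 0 } };
  return ToyOptMode { true, ToyMode { elem_bits, total_bits / elem_bits } };
}

// CANDIDATE is redundant if each mode maps to the other: starting from
// either one leads to the same family of vector modes.
bool
toy_would_repeat_analysis (ToyMode candidate, ToyMode autodetected)
{
  return (toy_related_mode (candidate, autodetected.elem_bits) == autodetected
          && toy_related_mode (autodetected, candidate.elem_bits) == candidate);
}
```

With this toy target, a 16 x 8-bit mode and a 4 x 32-bit mode map to each other, so trying the second after the first would repeat the analysis. An 8 x 8-bit mode does not map to the 4 x 32-bit mode, so it would still be tried.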