Message ID | f613e6c0-df68-06ec-9a06-3def009a0dcd@suse.de |
---|---|
State | New |
Headers | show |
Series | [nvptx,committed] Force vl32 if calling vector-partitionable routines | expand |
Hi Tom! On 2019-01-07T20:11:59+0100, Tom de Vries <tdevries@suse.de> wrote: > [nvptx] Force vl32 if calling vector-partitionable routines > > With PTX_MAX_VECTOR_LENGTH set to larger than PTX_WARP_SIZE, routines can be > called from offloading regions with vector-size set to larger than warp size. > OTOH, vector-partitionable routines assume warp-sized vector length. > > Detect if we're calling a vector-partitionable routine from an offloading > region, and if so, fall back to warp-sized vector length in that region. > > 2018-12-17 Tom de Vries <tdevries@suse.de> > > PR target/85486 > * config/nvptx/nvptx.c (has_vector_partitionable_routine_calls_p): New > function. > (nvptx_goacc_validate_dims): Force vl32 if calling vector-partitionable > routines. > --- a/gcc/config/nvptx/nvptx.c > +++ b/gcc/config/nvptx/nvptx.c > +/* Return true if FNDECL contains calls to vector-partitionable routines. */ > + > +static bool > +has_vector_partitionable_routine_calls_p (tree fndecl) > +{ > + if (!fndecl) > + return false; > + > + basic_block bb; > + FOR_EACH_BB_FN (bb, DECL_STRUCT_FUNCTION (fndecl)) > + for (gimple_stmt_iterator i = gsi_start_bb (bb); !gsi_end_p (i); > + gsi_next_nondebug (&i)) > + { > + gimple *stmt = gsi_stmt (i); > + if (gimple_code (stmt) != GIMPLE_CALL) > + continue; (This might use '!is_gimple_call (stmt)'.) > + > + tree callee = gimple_call_fndecl (stmt); > + if (!callee) > + continue; Would there be any other case where this '!callee' conditional doesn't really mean 'gimple_call_internal_p (stmt)'? I thought about suggesting to use that instead, and then maybe 'gcc_assert (callee)' (... which doesn't trigger for any current testcases), but reviewing 'GIMPLE_CALL', I now see further 'is_gimple_call_addr' legitimate cases. What do these mean, here? And, should we add a comment why 'continue' is fine then, instead of fail-safe 'return true'? Couldn't an 'internal_fn' potentially also make use of OpenACC parallelism? > + > + tree attrs = oacc_get_fn_attrib (callee); > + if (attrs == NULL_TREE) > + return false; That's not correct, as far as I can tell: if the current callee doesn't have an 'oacc function' attribute, we *stop* here any further processing, and 'return false' indicating that there are no "calls to vector-partitionable routines". See bug fix and adjusted test case in attached patch "Force vl32 if calling vector-partitionable routines: fix case where callee doesn't have 'oacc function' attribute [PR85486]". OK to push? > + > + int partition_level = oacc_fn_attrib_level (attrs); > + bool seq_routine_p = partition_level == GOMP_DIM_MAX; > + if (!seq_routine_p) > + return true; > + } > + > + return false; > +} > @@ -5611,6 +5646,16 @@ nvptx_goacc_validate_dims_1 (tree decl, int dims[], int fn_level) > old_dims[i] = dims[i]; > > const char *vector_reason = NULL; > + if (offload_region_p && has_vector_partitionable_routine_calls_p (decl)) > + { > + if (dims[GOMP_DIM_VECTOR] > PTX_WARP_SIZE) > + { > + vector_reason = G_("using vector_length (%d) due to call to" > + " vector-partitionable routine, ignoring %d"); > + dims[GOMP_DIM_VECTOR] = PTX_WARP_SIZE; > + } > + } > + > if (dims[GOMP_DIM_VECTOR] == 0) > { > vector_reason = G_("using vector_length (%d), ignoring runtime setting"); Grüße Thomas ----------------- Mentor Graphics (Deutschland) GmbH, Arnulfstraße 201, 80634 München / Germany Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Alexander Walter
[nvptx] Force vl32 if calling vector-partitionable routines With PTX_MAX_VECTOR_LENGTH set to larger than PTX_WARP_SIZE, routines can be called from offloading regions with vector-size set to larger than warp size. OTOH, vector-partitionable routines assume warp-sized vector length. Detect if we're calling a vector-partitionable routine from an offloading region, and if so, fall back to warp-sized vector length in that region. 2018-12-17 Tom de Vries <tdevries@suse.de> PR target/85486 * config/nvptx/nvptx.c (has_vector_partitionable_routine_calls_p): New function. (nvptx_goacc_validate_dims): Force vl32 if calling vector-partitionable routines. --- gcc/config/nvptx/nvptx.c | 45 +++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 45 insertions(+) diff --git a/gcc/config/nvptx/nvptx.c b/gcc/config/nvptx/nvptx.c index 5a4b38de522..7fdc285b6f8 100644 --- a/gcc/config/nvptx/nvptx.c +++ b/gcc/config/nvptx/nvptx.c @@ -59,6 +59,7 @@ #include "builtins.h" #include "omp-general.h" #include "omp-low.h" +#include "omp-offload.h" #include "gomp-constants.h" #include "dumpfile.h" #include "internal-fn.h" @@ -5496,6 +5497,40 @@ nvptx_apply_dim_limits (int dims[]) dims[GOMP_DIM_VECTOR] = PTX_WARP_SIZE; } +/* Return true if FNDECL contains calls to vector-partitionable routines. */ + +static bool +has_vector_partitionable_routine_calls_p (tree fndecl) +{ + if (!fndecl) + return false; + + basic_block bb; + FOR_EACH_BB_FN (bb, DECL_STRUCT_FUNCTION (fndecl)) + for (gimple_stmt_iterator i = gsi_start_bb (bb); !gsi_end_p (i); + gsi_next_nondebug (&i)) + { + gimple *stmt = gsi_stmt (i); + if (gimple_code (stmt) != GIMPLE_CALL) + continue; + + tree callee = gimple_call_fndecl (stmt); + if (!callee) + continue; + + tree attrs = oacc_get_fn_attrib (callee); + if (attrs == NULL_TREE) + return false; + + int partition_level = oacc_fn_attrib_level (attrs); + bool seq_routine_p = partition_level == GOMP_DIM_MAX; + if (!seq_routine_p) + return true; + } + + return false; +} + /* As nvptx_goacc_validate_dims, but does not return bool to indicate whether DIMS has changed. */ @@ -5611,6 +5646,16 @@ nvptx_goacc_validate_dims_1 (tree decl, int dims[], int fn_level) old_dims[i] = dims[i]; const char *vector_reason = NULL; + if (offload_region_p && has_vector_partitionable_routine_calls_p (decl)) + { + if (dims[GOMP_DIM_VECTOR] > PTX_WARP_SIZE) + { + vector_reason = G_("using vector_length (%d) due to call to" + " vector-partitionable routine, ignoring %d"); + dims[GOMP_DIM_VECTOR] = PTX_WARP_SIZE; + } + } + if (dims[GOMP_DIM_VECTOR] == 0) { vector_reason = G_("using vector_length (%d), ignoring runtime setting");