[5/16] Add in_oacc_kernels_region in struct loop

Message ID 5649C02B.9030604@mentor.com
State New

Commit Message

Tom de Vries Nov. 16, 2015, 11:38 a.m. UTC
On 11/11/15 11:55, Richard Biener wrote:
> On Mon, 9 Nov 2015, Tom de Vries wrote:
>
>> On 09/11/15 16:35, Tom de Vries wrote:
>>> Hi,
>>>
>>> this patch series for stage1 trunk adds support to:
>>> - parallelize oacc kernels regions using parloops, and
>>> - map the loops onto the oacc gang dimension.
>>>
>>> The patch series contains these patches:
>>>
>>>        1    Insert new exit block only when needed in
>>>           transform_to_exit_first_loop_alt
>>>        2    Make create_parallel_loop return void
>>>        3    Ignore reduction clause on kernels directive
>>>        4    Implement -foffload-alias
>>>        5    Add in_oacc_kernels_region in struct loop
>>>        6    Add pass_oacc_kernels
>>>        7    Add pass_dominator_oacc_kernels
>>>        8    Add pass_ch_oacc_kernels
>>>        9    Add pass_parallelize_loops_oacc_kernels
>>>       10    Add pass_oacc_kernels pass group in passes.def
>>>       11    Update testcases after adding kernels pass group
>>>       12    Handle acc loop directive
>>>       13    Add c-c++-common/goacc/kernels-*.c
>>>       14    Add gfortran.dg/goacc/kernels-*.f95
>>>       15    Add libgomp.oacc-c-c++-common/kernels-*.c
>>>       16    Add libgomp.oacc-fortran/kernels-*.f95
>>>
>>> The first 9 patches are more or less independent, but patches 10-16 are
>>> intended to be committed at the same time.
>>>
>>> Bootstrapped and reg-tested on x86_64.
>>>
>>> Build and reg-tested with nvidia accelerator, in combination with a
>>> patch that enables accelerator testing (which is submitted at
>>> https://gcc.gnu.org/ml/gcc-patches/2015-10/msg01771.html ).
>>>
>>> I'll post the individual patches in reply to this message.
>>
>> this patch adds and initializes the in_oacc_kernels_region field in
>> struct loop.
>>
>> The field is used to signal to subsequent passes that we're dealing with a
>> loop in a kernels region that we're trying to parallelize.
>>
>> Note that we do not parallelize kernels regions with more than one loop nest.
>> [ In general, kernels regions with more than one loop nest should be split up
>> into separate kernels regions, but that's not supported atm. ]
>
> I think mark_loops_in_oacc_kernels_region can be greatly simplified.
>
> Both region entry and exit should have the same ->loop_father (a SESE
> region).  Then you can just walk that loop's inner (and their sibling)
> loops, checking their header's domination relation with the region entry
> and exit (only necessary for direct inner loops).

Updated patch to use the loops structure.  Atm I'm also skipping loops 
containing sibling loops, since I have no test-cases for that yet.

Thanks,
- Tom
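
As a rough illustration of the walk the updated patch performs, here is a toy model in plain C.  It is a sketch, not GCC code: struct toy_loop, toy_add_inner, toy_mark_loops and the in_region flag are hypothetical stand-ins.  In particular, in_region replaces the two dominated_by_p checks on the loop header, so the example runs without GCC's CFG or dominance information.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for GCC's struct loop: only the fields the walk needs.  */
struct toy_loop
{
  struct toy_loop *outer;	/* Enclosing loop.  */
  struct toy_loop *inner;	/* First directly nested loop.  */
  struct toy_loop *next;	/* Next sibling at the same depth.  */
  bool in_region;		/* Assumption: replaces the dominance checks.  */
  bool in_oacc_kernels_region;	/* The flag the patch adds.  */
};

/* Link CHILD as the first inner loop of PARENT.  */
void
toy_add_inner (struct toy_loop *parent, struct toy_loop *child)
{
  child->outer = parent;
  child->next = parent->inner;
  parent->inner = child;
}

/* Mark the single loop nest directly inside OUTER that lies in the region.
   Return true if a nest was marked, false if the region was skipped.  */
bool
toy_mark_loops (struct toy_loop *outer)
{
  unsigned int nr_outer_loops = 0;
  struct toy_loop *single_outer = NULL;

  /* Count the direct inner loops that belong to the region.  */
  for (struct toy_loop *loop = outer->inner; loop != NULL; loop = loop->next)
    {
      assert (loop->outer == outer);
      if (!loop->in_region)
	continue;
      nr_outer_loops++;
      single_outer = loop;
    }
  if (nr_outer_loops != 1)
    return false;

  /* Skip the nest if any loop nested in it has a sibling.  */
  for (struct toy_loop *loop = single_outer->inner; loop != NULL;
       loop = loop->inner)
    if (loop->next != NULL)
      return false;

  /* Mark the whole chain, outermost loop first.  */
  for (struct toy_loop *loop = single_outer; loop != NULL; loop = loop->inner)
    loop->in_oacc_kernels_region = true;
  return true;
}
```

With exactly one in-region nest under the root, every loop on that nest's chain is marked.  With two in-region nests, or a nested loop that has a sibling, nothing is marked.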

Comments

Richard Biener Nov. 16, 2015, 12:41 p.m. UTC | #1
On Mon, 16 Nov 2015, Tom de Vries wrote:

> On 11/11/15 11:55, Richard Biener wrote:
> > On Mon, 9 Nov 2015, Tom de Vries wrote:
> > 
> > > On 09/11/15 16:35, Tom de Vries wrote:
> > > > Hi,
> > > > 
> > > > this patch series for stage1 trunk adds support to:
> > > > - parallelize oacc kernels regions using parloops, and
> > > > - map the loops onto the oacc gang dimension.
> > > > 
> > > > The patch series contains these patches:
> > > > 
> > > >        1    Insert new exit block only when needed in
> > > >           transform_to_exit_first_loop_alt
> > > >        2    Make create_parallel_loop return void
> > > >        3    Ignore reduction clause on kernels directive
> > > >        4    Implement -foffload-alias
> > > >        5    Add in_oacc_kernels_region in struct loop
> > > >        6    Add pass_oacc_kernels
> > > >        7    Add pass_dominator_oacc_kernels
> > > >        8    Add pass_ch_oacc_kernels
> > > >        9    Add pass_parallelize_loops_oacc_kernels
> > > >       10    Add pass_oacc_kernels pass group in passes.def
> > > >       11    Update testcases after adding kernels pass group
> > > >       12    Handle acc loop directive
> > > >       13    Add c-c++-common/goacc/kernels-*.c
> > > >       14    Add gfortran.dg/goacc/kernels-*.f95
> > > >       15    Add libgomp.oacc-c-c++-common/kernels-*.c
> > > >       16    Add libgomp.oacc-fortran/kernels-*.f95
> > > > 
> > > > The first 9 patches are more or less independent, but patches 10-16 are
> > > > intended to be committed at the same time.
> > > > 
> > > > Bootstrapped and reg-tested on x86_64.
> > > > 
> > > > Build and reg-tested with nvidia accelerator, in combination with a
> > > > patch that enables accelerator testing (which is submitted at
> > > > https://gcc.gnu.org/ml/gcc-patches/2015-10/msg01771.html ).
> > > > 
> > > > I'll post the individual patches in reply to this message.
> > > 
> > > this patch adds and initializes the in_oacc_kernels_region field in
> > > struct loop.
> > > 
> > > The field is used to signal to subsequent passes that we're dealing with a
> > > loop in a kernels region that we're trying to parallelize.
> > > 
> > > Note that we do not parallelize kernels regions with more than one loop
> > > nest.
> > > [ In general, kernels regions with more than one loop nest should be split
> > > up into separate kernels regions, but that's not supported atm. ]
> > 
> > I think mark_loops_in_oacc_kernels_region can be greatly simplified.
> > 
> > Both region entry and exit should have the same ->loop_father (a SESE
> > region).  Then you can just walk that loop's inner (and their sibling)
> > loops, checking their header's domination relation with the region entry
> > and exit (only necessary for direct inner loops).
> 
> Updated patch to use the loops structure.  Atm I'm also skipping loops
> containing sibling loops, since I have no test-cases for that yet.

Looks ok to me now.  You want to update copy_loop_info btw.

Richard.
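
The copy_loop_info remark presumably concerns loops that are duplicated after the marking: a flag set once at expansion time is lost if the copy routine does not carry it over.  The following is a minimal sketch of that idea only.  struct toy_loop_info and toy_copy_loop_info are hypothetical names, and which fields GCC's real copy_loop_info handles is not shown in this thread; force_vectorize is used here simply because it is the neighbouring field in the cfgloop.h hunk.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy subset of per-loop flags; not GCC's struct loop.  */
struct toy_loop_info
{
  bool force_vectorize;
  bool in_oacc_kernels_region;
};

/* Copy the per-loop flags from LOOP to TARGET, as one would want when a
   loop is duplicated.  */
void
toy_copy_loop_info (const struct toy_loop_info *loop,
		    struct toy_loop_info *target)
{
  assert (loop != NULL && target != NULL);
  target->force_vectorize = loop->force_vectorize;
  /* Without this line the copy would silently lose the kernels marking.  */
  target->in_oacc_kernels_region = loop->in_oacc_kernels_region;
}
```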

> Thanks,
> - Tom
> 
>
Patch

Add in_oacc_kernels_region in struct loop

2015-11-09  Tom de Vries  <tom@codesourcery.com>

	* cfgloop.h (struct loop): Add in_oacc_kernels_region field.
	* omp-low.c (mark_loops_in_oacc_kernels_region): New function.
	(expand_omp_target): Call mark_loops_in_oacc_kernels_region.

---
 gcc/cfgloop.h |  3 +++
 gcc/omp-low.c | 43 +++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 46 insertions(+)

diff --git a/gcc/cfgloop.h b/gcc/cfgloop.h
index 6af6893..ee73bf9 100644
--- a/gcc/cfgloop.h
+++ b/gcc/cfgloop.h
@@ -191,6 +191,9 @@  struct GTY ((chain_next ("%h.next"))) loop {
   /* True if we should try harder to vectorize this loop.  */
   bool force_vectorize;
 
+  /* True if the loop is part of an oacc kernels region.  */
+  bool in_oacc_kernels_region;
+
   /* For SIMD loops, this is a unique identifier of the loop, referenced
      by IFN_GOMP_SIMD_VF, IFN_GOMP_SIMD_LANE and IFN_GOMP_SIMD_LAST_LANE
      builtins.  */
diff --git a/gcc/omp-low.c b/gcc/omp-low.c
index 5f76434..fba7bbd 100644
--- a/gcc/omp-low.c
+++ b/gcc/omp-low.c
@@ -12450,6 +12450,46 @@  get_oacc_ifn_dim_arg (const gimple *stmt)
   return (int) axis;
 }
 
+/* Mark the loops inside the kernels region starting at REGION_ENTRY and ending
+   at REGION_EXIT.  */
+
+static void
+mark_loops_in_oacc_kernels_region (basic_block region_entry,
+				   basic_block region_exit)
+{
+  struct loop *outer = region_entry->loop_father;
+  gcc_assert (region_exit == NULL || outer == region_exit->loop_father);
+
+  /* Don't parallelize the kernels region if it contains more than one outer
+     loop.  */
+  unsigned int nr_outer_loops = 0;
+  struct loop *single_outer = NULL;
+  for (struct loop *loop = outer->inner; loop != NULL; loop = loop->next)
+    {
+      gcc_assert (loop_outer (loop) == outer);
+
+      if (!dominated_by_p (CDI_DOMINATORS, loop->header, region_entry))
+	continue;
+
+      if (region_exit != NULL
+	  && dominated_by_p (CDI_DOMINATORS, loop->header, region_exit))
+	continue;
+
+      nr_outer_loops++;
+      single_outer = loop;
+    }
+  if (nr_outer_loops != 1)
+    return;
+
+  for (struct loop *loop = single_outer->inner; loop != NULL; loop = loop->inner)
+    if (loop->next)
+      return;
+
+  /* Mark the loops in the region.  */
+  for (struct loop *loop = single_outer; loop != NULL; loop = loop->inner)
+    loop->in_oacc_kernels_region = true;
+}
+
 /* Expand the GIMPLE_OMP_TARGET starting at REGION.  */
 
 static void
@@ -12505,6 +12545,9 @@  expand_omp_target (struct omp_region *region)
   entry_bb = region->entry;
   exit_bb = region->exit;
 
+  if (gimple_omp_target_kind (entry_stmt) == GF_OMP_TARGET_KIND_OACC_KERNELS)
+    mark_loops_in_oacc_kernels_region (region->entry, region->exit);
+
   if (offloaded)
     {
       unsigned srcidx, dstidx, num;
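
For context, a sketch of the kind of source this marking is aimed at: an OpenACC kernels region containing a single loop nest.  The function name vec_add and the size N are made up for illustration.  Without OpenACC support enabled the pragma is ignored and the loop simply runs sequentially on the host.

```c
#include <assert.h>

#define N 1024

/* One kernels region with exactly one loop nest: the case in which the
   loop would be marked with in_oacc_kernels_region.  */
void
vec_add (const int *a, const int *b, int *c)
{
#pragma acc kernels copyin (a[0:N], b[0:N]) copyout (c[0:N])
  {
    for (int i = 0; i < N; i++)
      c[i] = a[i] + b[i];
  }
}
```

A region holding two such loops one after the other would have two outer loop nests, so with this patch neither loop would be marked.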