diff mbox

[4/16] Implement -foffload-alias

Message ID 566EE3CA.2010804@mentor.com
State New
Headers show

Commit Message

Tom de Vries Dec. 14, 2015, 3:44 p.m. UTC
On 14/12/15 14:26, Richard Biener wrote:
> On Sun, 13 Dec 2015, Tom de Vries wrote:
>
>> On 11/12/15 14:00, Richard Biener wrote:
>>> On Fri, 11 Dec 2015, Tom de Vries wrote:
>>>
>>>> On 13/11/15 12:39, Jakub Jelinek wrote:
>>>>> We simply have some compiler internal interface between the caller and
>>>>> callee of the outlined regions, each interface in between those has
>>>>> its own structure type used to communicate the info;
>>>>> we can attach attributes on the fields, or some flags to indicate some
>>>>> properties interesting from aliasing POV.  We don't really need to
>>>>> perform
>>>>> full IPA-PTA, perhaps it would be enough to a) record somewhere in
>>>>> cgraph
>>>>> the relationship in between such callers and callees (for offloading
>>>>> regions
>>>>> we already have "omp target entrypoint" attribute on the callee and a
>>>>> singler caller), tell LTO if possible not to split those into different
>>>>> partitions if easily possible, and then just for these pairs perform
>>>>> aliasing/points-to analysis in the caller and the result record using
>>>>> cliques/special attributes/whatever to the callee side, so that the
>>>>> callee
>>>>> (outlined OpenMP/OpenACC/Cilk+ region) can then improve its alias
>>>>> analysis.
>>>>
>>>> Hi,
>>>>
>>>> This work-in-progress patch allows me to use IPA PTA information in the
>>>> kernels pass group.
>>>>
>>>> Since:
>>>> -  I'm running IPA PTA before ealias, and IPA PTA does not interpret
>>>>      restrict, and
>>>> - compute_may_alias doesn't run if IPA PTA information is present
>>>> I needed to convince ealias to do the restrict clique/base annotation.
>>>>
>>>> It would be more logical to fit IPA PTA after ealias, but one is an IPA
>>>> pass,
>>>> the other a regular one-function pass, so I would have to split the
>>>> containing
>>>> pass groups pass_all_early_optimizations and
>>>> pass_local_optimization_passes.
>>>> I'll give that a try now.
>>>>
>>
>> I've tried this approach, but realized that this changes the order in which
>> non-openacc functions are processed in the compiler, so I've abandoned this
>> idea.
>>
>>>> Any comments?
>>>
>>> I don't think you want to run IPA PTA before early
>>> optimizations, it (and ealias) rely on some initial cleanup to
>>> do anything meaningful with well-spent ressources.
>>>
>>> The local PTA "hack" also looks more like a waste of resources, but well
>>> ... teaching IPA PTA to honor restrict might be an impossible task
>>> though I didn't think much about it other than handling it only for
>>> nonlocal_p functions (for others we should see all incoming args
>>> if IPA PTA works optimally).  The restrict tags will leak all over
>>> the place of course and in the end no meaningful cliques may remain.
>>>
>>
>> This patch:
>> - moves the kernels pass group to the first position in the pass list
>>    after ealias where we're back in ipa mode
>> - inserts an new ipa pass to contain the gimple pass group called
>>    pass_oacc_ipa
>> - inserts a version of ipa-pta before the pass group.
>
> In principle I like this a lot, but
>
> +  NEXT_PASS (pass_ipa_pta_oacc_kernels);
> +  NEXT_PASS (pass_oacc_ipa);
> +  PUSH_INSERT_PASSES_WITHIN (pass_oacc_ipa)
>
> I think you can put pass_ipa_pta_oacc_kernels into the pass_oacc_ipa
> group and thus just "clone" ipa_pta?

Done. But using a clone means using the same gate function, and that 
means that this pass_ipa_pta instance no longer runs by default for 
openacc by default.

I've added enabling-by-default of fipa-pta for fopenacc in 
default_options_optimization to fix that.

> sub-passes of IPA passes can
> be both ipa passes and non-ipa passes.

Right. It does mean that I need yet another pass (pass_ipa_oacc_kernels) 
to do the IPA/non-IPA transition at pass/sub-pass boundary:
...
   NEXT_PASS (pass_ipa_oacc);
   PUSH_INSERT_PASSES_WITHIN (pass_ipa_oacc)
       NEXT_PASS (pass_ipa_pta);
       NEXT_PASS (pass_ipa_oacc_kernels);
       PUSH_INSERT_PASSES_WITHIN (pass_ipa_oacc_kernels)
          /* out-of-ipa */
          NEXT_PASS (pass_oacc_kernels);
          PUSH_INSERT_PASSES_WITHIN (pass_oacc_kernels)
...

OK for stage3 if bootstrap and reg-test succeeds?

Thanks,
- Tom

Comments

Richard Biener Dec. 16, 2015, 1:16 p.m. UTC | #1
On Mon, 14 Dec 2015, Tom de Vries wrote:

> On 14/12/15 14:26, Richard Biener wrote:
> > On Sun, 13 Dec 2015, Tom de Vries wrote:
> > 
> > > On 11/12/15 14:00, Richard Biener wrote:
> > > > On Fri, 11 Dec 2015, Tom de Vries wrote:
> > > > 
> > > > > On 13/11/15 12:39, Jakub Jelinek wrote:
> > > > > > We simply have some compiler internal interface between the caller
> > > > > > and
> > > > > > callee of the outlined regions, each interface in between those has
> > > > > > its own structure type used to communicate the info;
> > > > > > we can attach attributes on the fields, or some flags to indicate
> > > > > > some
> > > > > > properties interesting from aliasing POV.  We don't really need to
> > > > > > perform
> > > > > > full IPA-PTA, perhaps it would be enough to a) record somewhere in
> > > > > > cgraph
> > > > > > the relationship in between such callers and callees (for offloading
> > > > > > regions
> > > > > > we already have "omp target entrypoint" attribute on the callee and
> > > > > > a
> > > > > > singler caller), tell LTO if possible not to split those into
> > > > > > different
> > > > > > partitions if easily possible, and then just for these pairs perform
> > > > > > aliasing/points-to analysis in the caller and the result record
> > > > > > using
> > > > > > cliques/special attributes/whatever to the callee side, so that the
> > > > > > callee
> > > > > > (outlined OpenMP/OpenACC/Cilk+ region) can then improve its alias
> > > > > > analysis.
> > > > > 
> > > > > Hi,
> > > > > 
> > > > > This work-in-progress patch allows me to use IPA PTA information in
> > > > > the
> > > > > kernels pass group.
> > > > > 
> > > > > Since:
> > > > > -  I'm running IPA PTA before ealias, and IPA PTA does not interpret
> > > > >      restrict, and
> > > > > - compute_may_alias doesn't run if IPA PTA information is present
> > > > > I needed to convince ealias to do the restrict clique/base annotation.
> > > > > 
> > > > > It would be more logical to fit IPA PTA after ealias, but one is an
> > > > > IPA
> > > > > pass,
> > > > > the other a regular one-function pass, so I would have to split the
> > > > > containing
> > > > > pass groups pass_all_early_optimizations and
> > > > > pass_local_optimization_passes.
> > > > > I'll give that a try now.
> > > > > 
> > > 
> > > I've tried this approach, but realized that this changes the order in
> > > which
> > > non-openacc functions are processed in the compiler, so I've abandoned
> > > this
> > > idea.
> > > 
> > > > > Any comments?
> > > > 
> > > > I don't think you want to run IPA PTA before early
> > > > optimizations, it (and ealias) rely on some initial cleanup to
> > > > do anything meaningful with well-spent ressources.
> > > > 
> > > > The local PTA "hack" also looks more like a waste of resources, but well
> > > > ... teaching IPA PTA to honor restrict might be an impossible task
> > > > though I didn't think much about it other than handling it only for
> > > > nonlocal_p functions (for others we should see all incoming args
> > > > if IPA PTA works optimally).  The restrict tags will leak all over
> > > > the place of course and in the end no meaningful cliques may remain.
> > > > 
> > > 
> > > This patch:
> > > - moves the kernels pass group to the first position in the pass list
> > >    after ealias where we're back in ipa mode
> > > - inserts an new ipa pass to contain the gimple pass group called
> > >    pass_oacc_ipa
> > > - inserts a version of ipa-pta before the pass group.
> > 
> > In principle I like this a lot, but
> > 
> > +  NEXT_PASS (pass_ipa_pta_oacc_kernels);
> > +  NEXT_PASS (pass_oacc_ipa);
> > +  PUSH_INSERT_PASSES_WITHIN (pass_oacc_ipa)
> > 
> > I think you can put pass_ipa_pta_oacc_kernels into the pass_oacc_ipa
> > group and thus just "clone" ipa_pta?
> 
> Done. But using a clone means using the same gate function, and that means
> that this pass_ipa_pta instance no longer runs by default for openacc by
> default.
> 
> I've added enabling-by-default of fipa-pta for fopenacc in
> default_options_optimization to fix that.

Hmm, but that enables both IPA PTA passes then?  I suppose that's ok,
and if not enabling the "late" IPA PTA you'd want to re-set 
gimple_df->ipa_pta.

> > sub-passes of IPA passes can
> > be both ipa passes and non-ipa passes.
> 
> Right. It does mean that I need yet another pass (pass_ipa_oacc_kernels) to do
> the IPA/non-IPA transition at pass/sub-pass boundary:
> ...
>   NEXT_PASS (pass_ipa_oacc);
>   PUSH_INSERT_PASSES_WITHIN (pass_ipa_oacc)
>       NEXT_PASS (pass_ipa_pta);
>       NEXT_PASS (pass_ipa_oacc_kernels);
>       PUSH_INSERT_PASSES_WITHIN (pass_ipa_oacc_kernels)
>          /* out-of-ipa */
>          NEXT_PASS (pass_oacc_kernels);
>          PUSH_INSERT_PASSES_WITHIN (pass_oacc_kernels)
> ...
> 
> OK for stage3 if bootstrap and reg-test succeeds?

Ok.

Richard.
diff mbox

Patch

Add pass_oacc_ipa

2015-12-14  Tom de Vries  <tom@codesourcery.com>

	* opts.c (default_options_optimization): Set fipa-pta on by default for
	fopenacc.
	* passes.def: Move kernels pass group to pass_ipa_oacc.
	* tree-pass.h (make_pass_oacc_kernels2): Remove.
	(make_pass_ipa_oacc, make_pass_ipa_oacc_kernels): Declare.
	* tree-ssa-loop.c (pass_oacc_kernels2, make_pass_oacc_kernels2): Remove.
	(pass_ipa_oacc, pass_ipa_oacc_kernels): New pass.
	(make_pass_ipa_oacc, make_pass_ipa_oacc_kernels): New function.
	* tree-ssa-structalias.c (pass_ipa_pta::clone): New function.

	* g++.dg/ipa/devirt-37.C: Update for new fre2 pass.
	* g++.dg/ipa/devirt-40.C: Same.
	* g++.dg/tree-ssa/pr61034.C: Same.
	* gcc.dg/ipa/ipa-pta-13.c: Same.
	* gcc.dg/ipa/ipa-pta-3.c: Same.
	* gcc.dg/ipa/ipa-pta-4.c: Same.

---
 gcc/opts.c                              |  9 ++++
 gcc/passes.def                          | 41 ++++++++++--------
 gcc/testsuite/g++.dg/ipa/devirt-37.C    | 10 ++---
 gcc/testsuite/g++.dg/ipa/devirt-40.C    |  4 +-
 gcc/testsuite/g++.dg/tree-ssa/pr61034.C | 10 ++---
 gcc/testsuite/gcc.dg/ipa/ipa-pta-13.c   |  4 +-
 gcc/testsuite/gcc.dg/ipa/ipa-pta-3.c    |  4 +-
 gcc/testsuite/gcc.dg/ipa/ipa-pta-4.c    |  4 +-
 gcc/tree-pass.h                         |  3 +-
 gcc/tree-ssa-loop.c                     | 76 ++++++++++++++++++++++++---------
 gcc/tree-ssa-structalias.c              |  2 +
 11 files changed, 110 insertions(+), 57 deletions(-)

diff --git a/gcc/opts.c b/gcc/opts.c
index 3d25f98..42d5566 100644
--- a/gcc/opts.c
+++ b/gcc/opts.c
@@ -560,6 +560,7 @@  default_options_optimization (struct gcc_options *opts,
 {
   unsigned int i;
   int opt2;
+  bool openacc_mode = false;
 
   /* Scan to see what optimization level has been specified.  That will
      determine the default value of many flags.  */
@@ -619,6 +620,10 @@  default_options_optimization (struct gcc_options *opts,
 	  opts->x_optimize_debug = 1;
 	  break;
 
+	case OPT_fopenacc:
+	  openacc_mode = true;
+	  break;
+
 	default:
 	  /* Ignore other options in this prescan.  */
 	  break;
@@ -633,6 +638,10 @@  default_options_optimization (struct gcc_options *opts,
   /* -O2 param settings.  */
   opt2 = (opts->x_optimize >= 2);
 
+  if (openacc_mode
+      && !opts_set->x_flag_ipa_pta)
+    opts->x_flag_ipa_pta = true;
+
   /* Track fields in field-sensitive alias analysis.  */
   maybe_set_param_value
     (PARAM_MAX_FIELDS_FOR_FIELD_SENSITIVE,
diff --git a/gcc/passes.def b/gcc/passes.def
index 43ce3d5..96e18f1 100644
--- a/gcc/passes.def
+++ b/gcc/passes.def
@@ -88,24 +88,7 @@  along with GCC; see the file COPYING3.  If not see
 	  /* pass_build_ealias is a dummy pass that ensures that we
 	     execute TODO_rebuild_alias at this point.  */
 	  NEXT_PASS (pass_build_ealias);
-	  /* Pass group that runs when the function is an offloaded function
-	     containing oacc kernels loops.  Part 1.  */
-	  NEXT_PASS (pass_oacc_kernels);
-	  PUSH_INSERT_PASSES_WITHIN (pass_oacc_kernels)
-	      NEXT_PASS (pass_ch);
-	  POP_INSERT_PASSES ()
 	  NEXT_PASS (pass_fre);
-	  /* Pass group that runs when the function is an offloaded function
-	     containing oacc kernels loops.  Part 2.  */
-	  NEXT_PASS (pass_oacc_kernels2);
-	  PUSH_INSERT_PASSES_WITHIN (pass_oacc_kernels2)
-	      /* We use pass_lim to rewrite in-memory iteration and reduction
-		 variable accesses in loops into local variables accesses.  */
-	      NEXT_PASS (pass_lim);
-	      NEXT_PASS (pass_dominator, false /* may_peel_loop_headers_p */);
-	      NEXT_PASS (pass_dce);
-	      NEXT_PASS (pass_expand_omp_ssa);
-	  POP_INSERT_PASSES ()
 	  NEXT_PASS (pass_merge_phi);
           NEXT_PASS (pass_dse);
 	  NEXT_PASS (pass_cd_dce);
@@ -124,6 +107,30 @@  along with GCC; see the file COPYING3.  If not see
       NEXT_PASS (pass_rebuild_cgraph_edges);
       NEXT_PASS (pass_inline_parameters);
   POP_INSERT_PASSES ()
+
+  NEXT_PASS (pass_ipa_oacc);
+  PUSH_INSERT_PASSES_WITHIN (pass_ipa_oacc)
+      NEXT_PASS (pass_ipa_pta);
+      /* Pass group that runs when the function is an offloaded function
+	 containing oacc kernels loops.	 */
+      NEXT_PASS (pass_ipa_oacc_kernels);
+      PUSH_INSERT_PASSES_WITHIN (pass_ipa_oacc_kernels)
+	  NEXT_PASS (pass_oacc_kernels);
+	  PUSH_INSERT_PASSES_WITHIN (pass_oacc_kernels)
+	      NEXT_PASS (pass_ch);
+	      NEXT_PASS (pass_fre);
+	      /* We use pass_lim to rewrite in-memory iteration and reduction
+		 variable accesses in loops into local variables accesses.  */
+	      NEXT_PASS (pass_lim);
+	      NEXT_PASS (pass_dominator, false /* may_peel_loop_headers_p */);
+	      NEXT_PASS (pass_dce);
+	      /* pass_parallelize_loops_oacc_kernels */
+	      NEXT_PASS (pass_expand_omp_ssa);
+	      NEXT_PASS (pass_rebuild_cgraph_edges);
+	  POP_INSERT_PASSES ()
+      POP_INSERT_PASSES ()
+  POP_INSERT_PASSES ()
+
   NEXT_PASS (pass_ipa_chkp_produce_thunks);
   NEXT_PASS (pass_ipa_auto_profile);
   NEXT_PASS (pass_ipa_free_inline_summary);
diff --git a/gcc/testsuite/g++.dg/ipa/devirt-37.C b/gcc/testsuite/g++.dg/ipa/devirt-37.C
index 9c5287e..b7f52a0 100644
--- a/gcc/testsuite/g++.dg/ipa/devirt-37.C
+++ b/gcc/testsuite/g++.dg/ipa/devirt-37.C
@@ -1,4 +1,4 @@ 
-/* { dg-options "-fpermissive -O2 -fno-indirect-inlining -fno-devirtualize-speculatively -fdump-tree-fre2-details -fno-early-inlining"  } */
+/* { dg-options "-fpermissive -O2 -fno-indirect-inlining -fno-devirtualize-speculatively -fdump-tree-fre3-details -fno-early-inlining"  } */
 #include <stdlib.h>
 struct A {virtual void test() {abort ();}};
 struct B:A
@@ -30,7 +30,7 @@  t()
 /* After inlining the call within constructor needs to be checked to not go into a basetype.
    We should see the vtbl store and we should notice extcall as possibly clobbering the
    type but ignore it because b is in static storage.  */
-/* { dg-final { scan-tree-dump "No dynamic type change found."  "fre2"  } } */
-/* { dg-final { scan-tree-dump "Checking vtbl store:"  "fre2"  } } */
-/* { dg-final { scan-tree-dump "Function call may change dynamic type:extcall"  "fre2"  } } */
-/* { dg-final { scan-tree-dump "converting indirect call to function virtual void"  "fre2"  } } */
+/* { dg-final { scan-tree-dump "No dynamic type change found."  "fre3"  } } */
+/* { dg-final { scan-tree-dump "Checking vtbl store:"  "fre3"  } } */
+/* { dg-final { scan-tree-dump "Function call may change dynamic type:extcall"  "fre3"  } } */
+/* { dg-final { scan-tree-dump "converting indirect call to function virtual void"  "fre3"  } } */
diff --git a/gcc/testsuite/g++.dg/ipa/devirt-40.C b/gcc/testsuite/g++.dg/ipa/devirt-40.C
index 279a228..5107c29 100644
--- a/gcc/testsuite/g++.dg/ipa/devirt-40.C
+++ b/gcc/testsuite/g++.dg/ipa/devirt-40.C
@@ -1,4 +1,4 @@ 
-/* { dg-options "-O2 -fdump-tree-fre2-details"  } */
+/* { dg-options "-O2 -fdump-tree-fre3-details"  } */
 typedef enum
 {
 } UErrorCode;
@@ -19,4 +19,4 @@  A::m_fn1 (UnicodeString &, int &p2, UErrorCode &) const
   UnicodeString a[2];
 }
 
-/* { dg-final { scan-tree-dump-not "\\n  OBJ_TYPE_REF" "fre2"  } } */
+/* { dg-final { scan-tree-dump-not "\\n  OBJ_TYPE_REF" "fre3"  } } */
diff --git a/gcc/testsuite/g++.dg/tree-ssa/pr61034.C b/gcc/testsuite/g++.dg/tree-ssa/pr61034.C
index cd4ee05..c06c580 100644
--- a/gcc/testsuite/g++.dg/tree-ssa/pr61034.C
+++ b/gcc/testsuite/g++.dg/tree-ssa/pr61034.C
@@ -1,5 +1,5 @@ 
 // { dg-do compile }
-// { dg-options "-O2 -fdump-tree-fre2 -fdump-tree-optimized" }
+// { dg-options "-O2 -fdump-tree-fre3 -fdump-tree-optimized" }
 
 #define assume(x) if(!(x))__builtin_unreachable()
 
@@ -42,13 +42,13 @@  bool f(I a, I b, I c, I d) {
 // a bunch of conditional free()s and unreachable()s.
 // This works only if everything is inlined into 'f'.
 
-// { dg-final { scan-tree-dump-times ";; Function" 1 "fre2" } }
-// { dg-final { scan-tree-dump-times "unreachable" 11 "fre2" } }
+// { dg-final { scan-tree-dump-times ";; Function" 1 "fre3" } }
+// { dg-final { scan-tree-dump-times "unreachable" 11 "fre3" } }
 
 // Note that depending on PUSH_ARGS_REVERSED we are presented with
 // a different initial CFG and thus the final outcome is different
 
-// { dg-final { scan-tree-dump-times "free" 10 "fre2" { target x86_64-*-* i?86-*-* } } }
+// { dg-final { scan-tree-dump-times "free" 10 "fre3" { target x86_64-*-* i?86-*-* } } }
 // { dg-final { scan-tree-dump-times "free" 3 "optimized" { target x86_64-*-* i?86-*-* } } }
-// { dg-final { scan-tree-dump-times "free" 14 "fre2" { target aarch64-*-* ia64-*-* arm-*-* hppa*-*-* sparc*-*-* powerpc*-*-* alpha*-*-* } } }
+// { dg-final { scan-tree-dump-times "free" 14 "fre3" { target aarch64-*-* ia64-*-* arm-*-* hppa*-*-* sparc*-*-* powerpc*-*-* alpha*-*-* } } }
 // { dg-final { scan-tree-dump-times "free" 4 "optimized" { target aarch64-*-* ia64-*-* arm-*-* hppa*-*-* sparc*-*-* powerpc*-*-* alpha*-*-* } } }
diff --git a/gcc/testsuite/gcc.dg/ipa/ipa-pta-13.c b/gcc/testsuite/gcc.dg/ipa/ipa-pta-13.c
index f558df3..71b31c4 100644
--- a/gcc/testsuite/gcc.dg/ipa/ipa-pta-13.c
+++ b/gcc/testsuite/gcc.dg/ipa/ipa-pta-13.c
@@ -1,5 +1,5 @@ 
 /* { dg-do link } */
-/* { dg-options "-O2 -fipa-pta -fdump-ipa-pta-details -fdump-tree-fre2 -fno-ipa-icf" } */
+/* { dg-options "-O2 -fipa-pta -fdump-ipa-pta-details -fdump-tree-fre3 -fno-ipa-icf" } */
 
 static int x, y;
 
@@ -54,7 +54,7 @@  int main()
   local_address_taken (&y);
   /* As we are computing flow- and context-insensitive we may not
      CSE the load of x here.  */
-  /* { dg-final { scan-tree-dump " = x;" "fre2" } } */
+  /* { dg-final { scan-tree-dump " = x;" "fre3" } } */
   return x;
 }
 
diff --git a/gcc/testsuite/gcc.dg/ipa/ipa-pta-3.c b/gcc/testsuite/gcc.dg/ipa/ipa-pta-3.c
index ff6fa57..8655794 100644
--- a/gcc/testsuite/gcc.dg/ipa/ipa-pta-3.c
+++ b/gcc/testsuite/gcc.dg/ipa/ipa-pta-3.c
@@ -1,5 +1,5 @@ 
 /* { dg-do run } */
-/* { dg-options "-O2 -fipa-pta -fdump-ipa-pta-details -fdump-tree-fre2-details" } */
+/* { dg-options "-O2 -fipa-pta -fdump-ipa-pta-details -fdump-tree-fre3-details" } */
 
 static int __attribute__((noinline,noclone))
 foo (int *p, int *q)
@@ -23,4 +23,4 @@  int main()
 
 /* { dg-final { scan-ipa-dump "foo.arg0 = &a" "pta" } } */
 /* { dg-final { scan-ipa-dump "foo.arg1 = &b" "pta" } } */
-/* { dg-final { scan-tree-dump "Replaced \\\*p_2\\\(D\\\) with 1" "fre2" } } */
+/* { dg-final { scan-tree-dump "Replaced \\\*p_2\\\(D\\\) with 1" "fre3" } } */
diff --git a/gcc/testsuite/gcc.dg/ipa/ipa-pta-4.c b/gcc/testsuite/gcc.dg/ipa/ipa-pta-4.c
index 106e325..c42762a 100644
--- a/gcc/testsuite/gcc.dg/ipa/ipa-pta-4.c
+++ b/gcc/testsuite/gcc.dg/ipa/ipa-pta-4.c
@@ -1,5 +1,5 @@ 
 /* { dg-do run } */
-/* { dg-options "-O2 -fipa-pta -fdump-ipa-pta-details -fdump-tree-fre2-details" } */
+/* { dg-options "-O2 -fipa-pta -fdump-ipa-pta-details -fdump-tree-fre3-details" } */
 
 int a, b;
 
@@ -28,4 +28,4 @@  int main()
 
 /* { dg-final { scan-ipa-dump "foo.arg0 = &a" "pta" } } */
 /* { dg-final { scan-ipa-dump "foo.arg1 = &b" "pta" } } */
-/* { dg-final { scan-tree-dump "Replaced \\\*p_2\\\(D\\\) with 1" "fre2" } } */
+/* { dg-final { scan-tree-dump "Replaced \\\*p_2\\\(D\\\) with 1" "fre3" } } */
diff --git a/gcc/tree-pass.h b/gcc/tree-pass.h
index e1cbce9..dcdbdfd 100644
--- a/gcc/tree-pass.h
+++ b/gcc/tree-pass.h
@@ -468,7 +468,8 @@  extern gimple_opt_pass *make_pass_vtable_verify (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_ubsan (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_sanopt (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_oacc_kernels (gcc::context *ctxt);
-extern gimple_opt_pass *make_pass_oacc_kernels2 (gcc::context *ctxt);
+extern simple_ipa_opt_pass *make_pass_ipa_oacc (gcc::context *ctxt);
+extern simple_ipa_opt_pass *make_pass_ipa_oacc_kernels (gcc::context *ctxt);
 
 /* IPA Passes */
 extern simple_ipa_opt_pass *make_pass_ipa_lower_emutls (gcc::context *ctxt);
diff --git a/gcc/tree-ssa-loop.c b/gcc/tree-ssa-loop.c
index cf7d94e..1fe2716 100644
--- a/gcc/tree-ssa-loop.c
+++ b/gcc/tree-ssa-loop.c
@@ -36,6 +36,7 @@  along with GCC; see the file COPYING3.  If not see
 #include "tree-scalar-evolution.h"
 #include "tree-vectorizer.h"
 #include "omp-low.h"
+#include "diagnostic-core.h"
 
 
 /* A pass making sure loops are fixed up.  */
@@ -206,12 +207,14 @@  make_pass_oacc_kernels (gcc::context *ctxt)
   return new pass_oacc_kernels (ctxt);
 }
 
+/* The ipa oacc superpass.  */
+
 namespace {
 
-const pass_data pass_data_oacc_kernels2 =
+const pass_data pass_data_ipa_oacc =
 {
-  GIMPLE_PASS, /* type */
-  "oacc_kernels2", /* name */
+  SIMPLE_IPA_PASS, /* type */
+  "ipa_oacc", /* name */
   OPTGROUP_LOOP, /* optinfo_flags */
   TV_TREE_LOOP, /* tv_id */
   PROP_cfg, /* properties_required */
@@ -221,34 +224,65 @@  const pass_data pass_data_oacc_kernels2 =
   0, /* todo_flags_finish */
 };
 
-class pass_oacc_kernels2 : public gimple_opt_pass
+class pass_ipa_oacc : public simple_ipa_opt_pass
 {
 public:
-  pass_oacc_kernels2 (gcc::context *ctxt)
-    : gimple_opt_pass (pass_data_oacc_kernels2, ctxt)
+  pass_ipa_oacc (gcc::context *ctxt)
+    : simple_ipa_opt_pass (pass_data_ipa_oacc, ctxt)
   {}
 
   /* opt_pass methods: */
-  virtual bool gate (function *fn) { return gate_oacc_kernels (fn); }
-  virtual unsigned int execute (function *fn)
-    {
-      /* Rather than having a copy of the previous dump, get some use out of
-	 this dump, and try to minimize differences with the following pass
-	 (pass_lim), which will initizalize the loop optimizer with
-	 LOOPS_NORMAL.  */
-      loop_optimizer_init (LOOPS_NORMAL);
-      loop_optimizer_finalize (fn);
-      return 0;
-    }
+  virtual bool gate (function *)
+  {
+    return (optimize
+	    /* Don't bother doing anything if the program has errors.  */
+	    && !seen_error ()
+	    && flag_openacc
+	    && flag_tree_parallelize_loops > 1);
+  }
 
-}; // class pass_oacc_kernels2
+}; // class pass_ipa_oacc
 
 } // anon namespace
 
-gimple_opt_pass *
-make_pass_oacc_kernels2 (gcc::context *ctxt)
+simple_ipa_opt_pass *
+make_pass_ipa_oacc (gcc::context *ctxt)
+{
+  return new pass_ipa_oacc (ctxt);
+}
+
+/* The ipa oacc kernels pass.  */
+
+namespace {
+
+const pass_data pass_data_ipa_oacc_kernels =
+{
+  SIMPLE_IPA_PASS, /* type */
+  "ipa_oacc_kernels", /* name */
+  OPTGROUP_LOOP, /* optinfo_flags */
+  TV_TREE_LOOP, /* tv_id */
+  PROP_cfg, /* properties_required */
+  0, /* properties_provided */
+  0, /* properties_destroyed */
+  0, /* todo_flags_start */
+  0, /* todo_flags_finish */
+};
+
+class pass_ipa_oacc_kernels : public simple_ipa_opt_pass
+{
+public:
+  pass_ipa_oacc_kernels (gcc::context *ctxt)
+    : simple_ipa_opt_pass (pass_data_ipa_oacc_kernels, ctxt)
+  {}
+
+}; // class pass_ipa_oacc_kernels
+
+} // anon namespace
+
+simple_ipa_opt_pass *
+make_pass_ipa_oacc_kernels (gcc::context *ctxt)
 {
-  return new pass_oacc_kernels2 (ctxt);
+  return new pass_ipa_oacc_kernels (ctxt);
 }
 
 /* The no-loop superpass.  */
diff --git a/gcc/tree-ssa-structalias.c b/gcc/tree-ssa-structalias.c
index b34c955..5f8c0b6 100644
--- a/gcc/tree-ssa-structalias.c
+++ b/gcc/tree-ssa-structalias.c
@@ -7821,6 +7821,8 @@  public:
 	      && !seen_error ());
     }
 
+  opt_pass * clone () { return new pass_ipa_pta (m_ctxt); }
+
   virtual unsigned int execute (function *) { return ipa_pta_execute (); }
 
 }; // class pass_ipa_pta