[PR68373] Call scev_const_prop in pass_parallelize_loops::execute

Message ID 564BA844.1010103@mentor.com
State New

Commit Message

Tom de Vries Nov. 17, 2015, 10:20 p.m. UTC
[ was: Re: [PATCH, 10/16] Add pass_oacc_kernels pass group in passes.def ]

Hi,

Consider test-case test.c, with a use of the final value of the 
iteration variable (return i):
...
unsigned int
foo (int *a, unsigned int n)
{
   unsigned int i;
   for (i = 0; i < n; ++i)
     a[i] = 1;

   return i;
}
...

Compiled with:
...
$ gcc -S -O2 test.c -ftree-parallelize-loops=2 -fdump-tree-all-details
...

Before parloops, we have:
...
  <bb 4>:
   # i_12 = PHI <0(3), i_10(5)>
   _5 = (long unsigned int) i_12;
   _6 = _5 * 4;
   _8 = a_7(D) + _6;
   *_8 = 1;
   i_10 = i_12 + 1;
   if (n_4(D) > i_10)
     goto <bb 5>;
   else
     goto <bb 6>;

   <bb 5>:
   goto <bb 4>;

   <bb 6>:
   # i_14 = PHI <n_4(D)(4), 0(2)>
...
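
[ As a reading aid, the shape of that dump can be sketched at source 
level. This is a hand-written approximation, not compiler output: the 
name foo_rotated and the variable i_final are made up for illustration, 
and the n != 0 guard in bb 2 is an assumption inferred from the 0(2) 
phi argument, since bb 2 is not shown in the dump. ]

```c
/* Hypothetical source-level view of the CFG before parloops.
   The loop is in do-while form; the value of i after the loop is
   merged from two incoming edges, as in
   i_14 = PHI <n_4(D)(4), 0(2)>.  */
unsigned int
foo_rotated (int *a, unsigned int n)
{
  unsigned int i_final;

  if (n != 0)                 /* Assumed guard in bb 2.  */
    {
      unsigned int i = 0;
      do
        {
          a[i] = 1;           /* bb 4: *_8 = 1  */
          ++i;                /* bb 4: i_10 = i_12 + 1  */
        }
      while (n > i);          /* bb 4: if (n_4(D) > i_10)  */
      i_final = n;            /* phi argument n_4(D)(4)  */
    }
  else
    i_final = 0;              /* phi argument 0(2)  */

  return i_final;
}
```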

Parloops will fail because:
...
phi is n_2 = PHI <n_4(D)(4)>
arg of phi to exit:   value n_4(D) used outside loop
   checking if it a part of reduction pattern:
   FAILED: it is not a part of reduction....
...
[ Note that the phi looks slightly different from the one in the dump 
above: in gather_scalar_reductions -> vect_analyze_loop_form -> 
vect_analyze_loop_form_1 -> split_loop_exit_edge we split the edge from 
bb4 to bb6, which leaves a single-argument phi on the new exit block. ]

This patch calls scev_const_prop at the start of parloops. 
scev_const_prop also splits the exit edge first, and then replaces the 
phi with an assignment:
...
  final value replacement:
   n_2 = PHI <n_4(D)(4)>
   with
   n_2 = n_4(D);
...

This allows parloops to succeed.
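
[ To illustrate why the replacement is valid, here is a source-level 
sketch. It is only an illustration under the assumption that the loop 
body does not modify i or n; the function name foo_replaced is 
hypothetical and not part of the patch. ]

```c
/* Original test-case: the final value of i is used after the loop.  */
unsigned int
foo (int *a, unsigned int n)
{
  unsigned int i;
  for (i = 0; i < n; ++i)
    a[i] = 1;

  return i;
}

/* Hypothetical hand-written equivalent of the function after final
   value replacement: the loop exits exactly when i == n, so the use
   of i after the loop can be replaced with n.  No scalar computed in
   the loop is live after it any more.  */
unsigned int
foo_replaced (int *a, unsigned int n)
{
  unsigned int i;
  for (i = 0; i < n; ++i)
    a[i] = 1;

  return n;
}
```

Both functions return n for every n, including n == 0.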

The story is similar when we additionally compile with 
-fno-tree-scev-cprop.

Bootstrapped and reg-tested on x86_64.

OK for stage3/stage1?

Thanks,
- Tom

Patch

Call scev_const_prop in pass_parallelize_loops::execute

2015-11-17  Tom de Vries  <tom@codesourcery.com>

	PR tree-optimization/68373
	* tree-parloops.c (pass_parallelize_loops::execute): Call
	scev_const_prop.

	* gcc.dg/autopar/pr68373.c: New test.

---
 gcc/testsuite/gcc.dg/autopar/pr68373.c | 14 ++++++++++++++
 gcc/tree-parloops.c                    |  3 +++
 2 files changed, 17 insertions(+)

diff --git a/gcc/testsuite/gcc.dg/autopar/pr68373.c b/gcc/testsuite/gcc.dg/autopar/pr68373.c
new file mode 100644
index 0000000..8e0f8a5
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/autopar/pr68373.c
@@ -0,0 +1,14 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2 -ftree-parallelize-loops=2 -fdump-tree-parloops-details" } */
+
+unsigned int
+foo (int *a, unsigned int n)
+{
+  unsigned int i;
+  for (i = 0; i < n; ++i)
+    a[i] = 1;
+
+  return i;
+}
+
+/* { dg-final { scan-tree-dump-times "SUCCESS: may be parallelized" 1 "parloops" } } */
diff --git a/gcc/tree-parloops.c b/gcc/tree-parloops.c
index 17415a8..d944395 100644
--- a/gcc/tree-parloops.c
+++ b/gcc/tree-parloops.c
@@ -2787,6 +2787,9 @@  pass_parallelize_loops::execute (function *fun)
   if (number_of_loops (fun) <= 1)
     return 0;
 
+  unsigned int sccp_todo = scev_const_prop ();
+  gcc_assert (sccp_todo == 0);
+
   if (parallelize_loops ())
     {
       fun->curr_properties &= ~(PROP_gimple_eomp);