[Fortran] Use ANNOTATE_EXPR annot_expr_ivdep_kind for DO CONCURRENT

Message ID 525473D1.4010001@net-b.de
State New

Commit Message

Tobias Burnus Oct. 8, 2013, 9:06 p.m. UTC
This patch requires my pending ME and C FE patch: 
http://gcc.gnu.org/ml/gcc-patches/2013-10/msg00514.html

Using C/C++'s #pragma ivdep or, with the attached Fortran patch, "do 
concurrent", the loop condition is annotated such that the loop's 
vectorization safelen is later set to infinity (well, INT_MAX). The main 
purpose is to tell the compiler that the result is independent of the 
order in which the loop is walked. The typical case is pointer aliasing, 
where the compiler either doesn't vectorize or adds a run-time 
aliasing check (loop versioning). With the annotation, the compiler 
simply assumes that there is no aliasing and avoids the versioning.

Compared with C++, which does not even have the "restrict" qualifier 
(gcc and others do support __restrict), and with cases where C/C++'s 
__restrict qualifier isn't sufficient or applicable, the effect on 
typical Fortran code should be smaller, as most variables cannot alias. 
Still, in some cases it can help. (See test case for an example.)
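
As a rough illustration of the pointer-aliasing case described above, here is a minimal C sketch. The function name add_arrays is hypothetical, and the pragma spelling "#pragma GCC ivdep" assumes a GCC that includes the pending C FE patch; compilers that do not know the pragma simply ignore it.

```c
#include <stddef.h>

/* Hypothetical example: a, b and c are plain pointers, so the compiler
   cannot prove that the store to a[i] does not overlap b[] or c[].
   Without the pragma it may skip vectorization or emit a run-time
   alias check (loop versioning).  */
void
add_arrays (size_t n, float *a, const float *b, const float *c)
{
  /* Programmer's assertion: no loop-carried dependencies that would
     prevent SIMD execution.  The caller must pass arrays that do not
     overlap.  */
#pragma GCC ivdep
  for (size_t i = 0; i < n; i++)
    a[i] = b[i] + c[i];
}
```

The pragma is a promise by the programmer, not something the compiler verifies: calling add_arrays with overlapping arrays would make the assertion false.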

There is an alternative to ivdep that is lower level [1]: 
OpenMPv4's "omp simd" (with safelen=) for C/C++/Fortran and, for C/C++ 
only, Cilk Plus's #pragma simd.

Built and regtested on x86-64-gnu-linux.
OK for the trunk?

Tobias

PS: I think the same annotation could also be used with FORALL and 
implied loops with whole-array/array-section assignments, when the FE 
knows that there is no aliasing between the LHS and the RHS. (In some 
cases, the FE knows this while the ME doesn't.)

PPS: My personal motivation is my long-standing wish to pass this 
information to the middle end for DO CONCURRENT, and also to use the 
pragma in a specific C++ code.

[1] The OpenMPv4 support for C/C++ will be merged soon; for Fortran it 
will take a while (maybe still in 4.9, maybe only later). See 
http://gcc.gnu.org/ml/gcc-patches/2013-10/msg00502.html. The relevant 
Cilk Plus patch has been posted at 
http://gcc.gnu.org/ml/gcc-patches/2013-08/msg01626.html

Comments

Tobias Burnus Oct. 22, 2013, 6:30 p.m. UTC | #1
Two weeks ago I submitted the patch, available at: 
http://gcc.gnu.org/ml/fortran/2013-10/msg00022.html ; while the ME patch 
is not yet approved, the C FE was approved (latest C/ME patch: 
http://gcc.gnu.org/ml/gcc-patches/2013-10/msg01752.html).

Additionally, I'd like to send an early ping for: 
http://gcc.gnu.org/ml/fortran/2013-10/msg00068.html

Tobias

PS: Actually, safelen (i.e. "GCC ivdep", "omp simd", Cilk's "simd") does 
not require that the result is independent of the loop-walking order. 
Just (quoting from my C patch for GCC ivdep): "With this pragma, the 
programmer asserts that there are no loop-carried dependencies which 
would prevent that consecutive iterations of the following loop can be 
executed concurrently with SIMD (single instruction multiple data) 
instructions."
Fortran's do concurrent requires more: "The DO CONCURRENT construct 
provides a means for the program to specify that individual loop
iterations have no interdependencies." (F2008, Introduction). Still, in 
terms of optimizing "do concurrent", setting safelen gives the ME all 
required information.
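
To make the distinction above concrete, here is a small C sketch of a loop whose iterations are not independent but which can still be run with a bounded SIMD width. The function name chain_increment, the macro DIST and the values 8 and 4 are hypothetical choices for illustration; the pragma only takes effect when OpenMP SIMD support is enabled and is otherwise ignored.

```c
#include <stddef.h>

/* Hypothetical example: each element depends on the one DIST positions
   before it, so the iterations are NOT independent and the result does
   depend on the order in which the loop is walked.  */
#define DIST 8

void
chain_increment (size_t n, int *a)
{
  /* At most 4 consecutive iterations may run in SIMD lanes.  That is
     safe here because the loop-carried dependence distance (DIST = 8)
     is larger than the asserted safelen.  */
#pragma omp simd safelen(4)
  for (size_t i = DIST; i < n; i++)
    a[i] = a[i - DIST] + 1;
}
```

Such a loop could not be written as a "do concurrent" construct, and an unbounded safelen (as set by ivdep) would assert more than is true for it; a finite safelen is the weaker promise that still permits vectorization.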


Steve Kargl Oct. 22, 2013, 7:22 p.m. UTC | #2
On Tue, Oct 22, 2013 at 08:30:44PM +0200, Tobias Burnus wrote:
> Two weeks ago I submitted the patch, available at: 
> http://gcc.gnu.org/ml/fortran/2013-10/msg00022.html ; while the ME patch 
> is not yet approved, the C FE was approved (latest C/ME patch: 
> http://gcc.gnu.org/ml/gcc-patches/2013-10/msg01752.html).

The Fortran part is OK.
Patch

2013-10-08  Tobias Burnus  <burnus@net-b.de>

	PR fortran/44646
	* trans-stmt.c (struct forall_info): Add do_concurrent field.
	(gfc_trans_forall_1): Set it for do concurrent.
	(gfc_trans_forall_loop): Mark those as annot_expr_ivdep_kind.

2013-10-08  Tobias Burnus  <burnus@net-b.de>

	PR fortran/44646
	* gfortran.dg/vect/vect-do-concurrent-1.f90: New.

diff --git a/gcc/fortran/trans-stmt.c b/gcc/fortran/trans-stmt.c
index edd2dac..b44d2c1 100644
--- a/gcc/fortran/trans-stmt.c
+++ b/gcc/fortran/trans-stmt.c
@@ -53,6 +53,7 @@  typedef struct forall_info
   int nvar;
   tree size;
   struct forall_info  *prev_nest;
+  bool do_concurrent;
 }
 forall_info;
 
@@ -2759,6 +2760,12 @@  gfc_trans_forall_loop (forall_info *forall_tmp, tree body,
       /* The exit condition.  */
       cond = fold_build2_loc (input_location, LE_EXPR, boolean_type_node,
 			      count, build_int_cst (TREE_TYPE (count), 0));
+      if (forall_tmp->do_concurrent)
+	{
+	  cond = build1 (ANNOTATE_EXPR, TREE_TYPE (cond), cond);
+	  SET_ANNOTATE_EXPR_ID (cond, annot_expr_ivdep_kind);
+	}
+
       tmp = build1_v (GOTO_EXPR, exit_label);
       tmp = fold_build3_loc (input_location, COND_EXPR, void_type_node,
 			     cond, tmp, build_empty_stmt (input_location));
@@ -3842,6 +3849,7 @@  gfc_trans_forall_1 (gfc_code * code, forall_info * nested_forall_info)
 	}
 
       tmp = gfc_finish_block (&body);
+      nested_forall_info->do_concurrent = true;
       tmp = gfc_trans_nested_forall_loop (nested_forall_info, tmp, 1);
       gfc_add_expr_to_block (&block, tmp);
       goto done;
diff --git a/gcc/testsuite/gfortran.dg/vect/vect-do-concurrent-1.f90 b/gcc/testsuite/gfortran.dg/vect/vect-do-concurrent-1.f90
new file mode 100644
index 0000000..7d56241
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/vect/vect-do-concurrent-1.f90
@@ -0,0 +1,17 @@ 
+! { dg-do compile }
+! { dg-require-effective-target vect_float }
+! { dg-options "-O3 -fopt-info-vec-optimized" }
+
+subroutine test(n, a, b, c)
+  integer, value :: n
+  real, contiguous,  pointer :: a(:), b(:), c(:)
+  integer :: i
+  do concurrent (i = 1:n)
+    a(i) = b(i) + c(i)
+  end do
+end subroutine test
+
+! { dg-message "loop vectorized" "" { target *-*-* } 0 }
+! { dg-bogus "version" "" { target *-*-* } 0 }
+! { dg-bogus "alias" "" { target *-*-* } 0 }
+! { dg-final { cleanup-tree-dump "vect" } }