Patchwork [10/10] Remove the temporary array for reductions written to memory.

login
register
mail settings
Submitter Sebastian Pop
Date Jan. 19, 2011, 6:47 p.m.
Message ID <AANLkTinn9R3Dis4m7qk+94+O=heHi81bP668Zj49sR=t@mail.gmail.com>
Download mbox | patch
Permalink /patch/79501/
State New
Headers show

Comments

Sebastian Pop - Jan. 19, 2011, 6:47 p.m.
Hi,

On Sat, Jan 15, 2011 at 03:21, Sebastian Pop <sebpop@gmail.com> wrote:
> Albert, Konrad,
>
> this patch fixes (without data dep analysis changes)
> the motivating example of Figures 1, 2, 3,... of:
> http://gcc.gnu.org/wiki/summit2010?action=AttachFile&do=get&target=trifunovic.pdf
>
> Sebastian
>
> On Sat, Jan 15, 2011 at 03:05, Sebastian Pop <sebpop@gmail.com> wrote:
>> 2011-01-15  Sebastian Pop  <sebastian.pop@amd.com>
>>
>>        * graphite-sese-to-poly.c
>>        (translate_scalar_reduction_to_array_for_stmt): Call unshare_expr.
>>        (close_phi_written_to_memory): New.
>>        (translate_scalar_reduction_to_array): Call close_phi_written_to_memory
>>        and unshare_expr.

This patch introduced a regression that I discussed with Tobias today
on the Graphite phone call.

This patch avoids the creation of temporary arrays when they are not
necessary, i.e., when the result of the reduction is written back to
memory.  Here is the diff before "0" and after "1" the patch on the
gimple representation of interchange-12.c

But this is not enough, as rewrite_commutative_reductions_out_of_ssa
updates some parts of the polyhedral representation, and these should
now go away.  I am preparing a patch to fix all this.

Sebastian
Sebastian Pop - Jan. 20, 2011, 12:19 a.m.
Hi,

This patch set corrects the problem described earlier.  I am
committing this to the graphite branch for testing.

Sebastian Pop (3):
  Move rewrite_commutative_reductions_out_of_ssa before
    find_scop_parameters.
  Pass to dr_analyze_indices the analysis loop for subscripts.
  Only copy PBB_DOMAIN when it is initialized.

 gcc/ChangeLog.graphite                     |   32 ++++++++++++
 gcc/graphite-scop-detection.c              |    4 +-
 gcc/graphite-sese-to-poly.c                |   71 +++++++++++++++++++++++----
 gcc/testsuite/gfortran.dg/graphite/id-23.f |   13 +++++
 gcc/tree-data-ref.c                        |   31 +++++++------
 gcc/tree-data-ref.h                        |    4 +-
 gcc/tree-ssa-loop-prefetch.c               |    3 +-
 7 files changed, 129 insertions(+), 29 deletions(-)
 create mode 100644 gcc/testsuite/gfortran.dg/graphite/id-23.f

Patch

--- 0	2011-01-19 12:14:22.000000000 -0600
+++ 1	2011-01-19 12:15:52.000000000 -0600
@@ -81,17 +81,17 @@ 
         # .MEM_22 = VDEF <.MEM_35>
         A[i_31][j_32] = 0;
         # .MEM_7 = VDEF <.MEM_22>
-        Commutative_Associative_Reduction.5[0] = 0;
+        A[i_31][j_32] = 0;

       }
       bb_7 (preds = {bb_5 }, succs = {bb_17 })
       {
       <bb 7>:
         # VUSE <.MEM_38>
-        D.2738_23 = Commutative_Associative_Reduction.5[0];
-        A_I_I_lsm.4_44 = D.2738_23;
+        D.2737_23 = A[i_31][j_32];
+        A_I_I_lsm.4_44 = D.2737_23;
         # .MEM_54 = VDEF <.MEM_38>
-        Cross_BB_scalar_dependence.6[0] = A_I_I_lsm.4_44;
+        Cross_BB_scalar_dependence.5[0] = A_I_I_lsm.4_44;

       }
       bb_17 (preds = {bb_7 }, succs = {bb_14 })
@@ -103,7 +103,7 @@ 
       {
       <bb 14>:
         # VUSE <.MEM_54>
-        A_I_I_lsm.4_52 = Cross_BB_scalar_dependence.6[0];
+        A_I_I_lsm.4_52 = Cross_BB_scalar_dependence.5[0];
         # .MEM_43 = VDEF <.MEM_54>
         A[i_31][j_32] = A_I_I_lsm.4_52;
         j_13 = j_32 + 1;
@@ -121,7 +121,7 @@ 
           # k_33 = PHI <k_12(6), 0(4)>
           # .MEM_34 = PHI <.MEM_38(6), .MEM_7(4)>
           # VUSE <.MEM_34>
-          prephitmp.3_37 = Commutative_Associative_Reduction.5[0];
+          prephitmp.3_37 = A[i_31][j_32];
           # VUSE <.MEM_34>
           D.2722_8 = B[i_31][k_33];
           # VUSE <.MEM_34>
@@ -129,7 +129,7 @@ 
           D.2724_10 = D.2722_8 * D.2723_9;
           D.2725_11 = D.2724_10 + prephitmp.3_37;
           # .MEM_38 = VDEF <.MEM_34>
-          Commutative_Associative_Reduction.5[0] = D.2725_11;
+          A[i_31][j_32] = D.2725_11;
           k_12 = k_33 + 1;
           if (k_12 != 200)
             goto <bb 6>;

We now use the data reference A instead of creating the one-element
array Commutative_Associative_Reduction, and now we can interchange
the two loops as we know the access function for the data reference.

The problem occurs in 3 SPEC benchmarks reduced to:

      SUBROUTINE CAMB(RX2,RTX,NUM)
      DIMENSION RX2(NUM,NUM),RTX(NUM,NUM)
      DO I=1,NUM
        DO J=1,I
          DO M=1,NUM
            RX2(I,J)=RX2(I,J)+RTX(M,I)
          END DO
        END DO
      END DO
      IF (RX2(I,1).LE.EIGCT2) THEN
      RTX(I,1)=4.0D+00
      END IF
      END

For some reason (unimportant for now), the scop is bounded to the
innermost loop M.  RX2(I,J) is first scalar optimized such that the
reduction happens into a scalar:

        DO J=1,I
	  t = RX2(I,J)
          DO M=1,NUM
            t=t+RTX(M,I)
          END DO
	  RX2(I,J)=t
        END DO

and then when we rewrite the commutative associative reductions out of
SSA, we reintroduce the data reference RX2(I,J) in the loop nest.  The
access functions for RX2(I,J) are parameters of the scop, as they do
not variate in the scop (bounded to the innermost loop here).  As we
rewrite out-of-SSA quite late in the polyhedral translation, after the
parameters of the scop have already been determined, we fail in this
specific case because we introduce new parameters that are not
registered yet in the representation.  The first part of the patch to fix
this is to move the rewrite of reductions before parameters analysis:

diff --git a/gcc/graphite-sese-to-poly.c b/gcc/graphite-sese-to-poly.c
index 12b93b0..9184065 100644
--- a/gcc/graphite-sese-to-poly.c
+++ b/gcc/graphite-sese-to-poly.c
@@ -3159,6 +3159,9 @@  build_poly_scop (scop_p scop)
   if (!scop_ivs_can_be_represented (scop))
     return;

+  if (flag_associative_math)
+    rewrite_commutative_reductions_out_of_ssa (scop);
+
   build_sese_loop_nests (region);
   build_sese_conditions (region);
   find_scop_parameters (scop);
@@ -3175,8 +3178,6 @@  build_poly_scop (scop_p scop)
      representation to the polyhedral representation to avoid scev
      analysis failures.  That means that these functions will insert
      new data references that they create in the right place.  */
-  if (flag_associative_math)
-    rewrite_commutative_reductions_out_of_ssa (scop);
   rewrite_reductions_out_of_ssa (scop);
   rewrite_cross_bb_scalar_deps_out_of_ssa (scop);