
Catch more MEM_REFs sharing common addressing part in gimple strength reduction

Message ID 004901ceadf9$195f9700$4c1ec500$@arm.com
State New

Commit Message

Bin Cheng Sept. 10, 2013, 7:41 a.m. UTC
On Mon, Sep 9, 2013 at 11:35 PM, Bill Schmidt <wschmidt@linux.vnet.ibm.com> wrote:
>
>> > I rely on size_binop to convert T2 into sizetype, because T2' may be of some other type.  Otherwise there will be an ssa_verify error later.
>>
>> OK, I see now.  I had thought this was handled by fold_build2, but
>> apparently not.  I guess all T2's formerly handled were already sizetype
>> as expected.  Thanks for the explanation!
>
> So, wouldn't it suffice to change t2 to fold_convert (sizetype, t2) in
> the argument list to fold_build2?  It's picking nits, but that would be
> slightly more efficient.
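
(For reference, the change suggested above amounts to the following, which is also what the v2 patch below ends up doing; t2, c3 and poffset are the locals of restructure_reference, shown here only as a sketch, not a separate change.)

  /* Convert T2 to sizetype explicitly and pass the result to fold_build2,
     instead of relying on size_binop to do the conversion.  */
  *poffset = fold_build2 (MULT_EXPR, sizetype,
                          fold_convert (sizetype, t2),
                          double_int_to_tree (sizetype, c3));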

Hi Bill,

This is the 2nd version of the patch with your comments incorporated.
Bootstrapped and re-tested on x86; re-test on ARM is ongoing.  Is it OK if tests pass?

Thanks.
bin

Comments

Bill Schmidt Sept. 10, 2013, 1:30 p.m. UTC | #1
On Tue, 2013-09-10 at 15:41 +0800, bin.cheng wrote:
> On Mon, Sep 9, 2013 at 11:35 PM, Bill Schmidt <wschmidt@linux.vnet.ibm.com> wrote:
> >
> >> > I rely on size_binop to convert T2 into sizetype, because T2' may be of some other type.  Otherwise there will be an ssa_verify error later.
> >>
> >> OK, I see now.  I had thought this was handled by fold_build2, but
> >> apparently not.  I guess all T2's formerly handled were already sizetype
> >> as expected.  Thanks for the explanation!
> >
> > So, wouldn't it suffice to change t2 to fold_convert (sizetype, t2) in
> > the argument list to fold_build2?  It's picking nits, but that would be
> > slightly more efficient.
> 
> Hi Bill,
> 
> This is the 2nd version of the patch with your comments incorporated.
> Bootstrapped and re-tested on x86; re-test on ARM is ongoing.  Is it OK if tests pass?

Looks good to me!  Thanks, Bin.

Bill

> 
> Thanks.
> bin
Bin Cheng Sept. 12, 2013, 6:13 a.m. UTC | #2
On Tue, Sep 10, 2013 at 9:30 PM, Bill Schmidt <wschmidt@linux.vnet.ibm.com> wrote:
>
>
> On Tue, 2013-09-10 at 15:41 +0800, bin.cheng wrote:
>> On Mon, Sep 9, 2013 at 11:35 PM, Bill Schmidt <wschmidt@linux.vnet.ibm.com> wrote:
>> >
>> >> > I rely on size_binop to convert T2 into sizetype, because T2' may be of some other type.  Otherwise there will be an ssa_verify error later.
>> >>
>> >> OK, I see now.  I had thought this was handled by fold_build2, but
>> >> apparently not.  I guess all T2's formerly handled were already sizetype
>> >> as expected.  Thanks for the explanation!
>> >
>> > So, wouldn't it suffice to change t2 to fold_convert (sizetype, t2) in
>> > the argument list to fold_build2?  It's picking nits, but that would be
>> > slightly more efficient.
>>
>> Hi Bill,
>>
>> This is the 2nd version of the patch with your comments incorporated.
>> Bootstrapped and re-tested on x86; re-test on ARM is ongoing.  Is it OK if tests pass?
>
> Looks good to me!  Thanks, Bin.
>

Sorry, I have to put this patch on hold since it causes several test failures on ARM.  I will investigate and get back ASAP.

Thanks.
bin
Bin Cheng Sept. 17, 2013, 4:47 a.m. UTC | #3
> -----Original Message-----
> From: gcc-patches-owner@gcc.gnu.org [mailto:gcc-patches-
> owner@gcc.gnu.org] On Behalf Of bin.cheng
> Sent: Thursday, September 12, 2013 2:13 PM
> To: 'Bill Schmidt'; Yufeng Zhang; Yufeng Zhang
> Cc: Richard Biener; GCC Patches
> Subject: RE: [PATCH GCC]Catch more MEM_REFs sharing common addressing
> part in gimple strength reduction
> 
> 
> On Tue, Sep 10, 2013 at 9:30 PM, Bill Schmidt <wschmidt@linux.vnet.ibm.com>
> wrote:
> >
> >
> > On Tue, 2013-09-10 at 15:41 +0800, bin.cheng wrote:
> >> On Mon, Sep 9, 2013 at 11:35 PM, Bill Schmidt <wschmidt@linux.vnet.ibm.com> wrote:
> >> >
> >> >> > I rely on size_binop to convert T2 into sizetype, because T2' may be of some other type.  Otherwise there will be an ssa_verify error later.
> >> >>
> >> >> OK, I see now.  I had thought this was handled by fold_build2, but
> >> >> apparently not.  I guess all T2's formerly handled were already
> >> >> sizetype as expected.  Thanks for the explanation!
> >> >
> >> > So, wouldn't it suffice to change t2 to fold_convert (sizetype, t2)
> >> > in the argument list to fold_build2?  It's picking nits, but that
> >> > would be slightly more efficient.
> >>
> >> Hi Bill,
> >>
> >> This is the 2nd version of the patch with your comments incorporated.
> >> Bootstrapped and re-tested on x86; re-test on ARM is ongoing.  Is it OK if
> >> tests pass?
> >
> > Looks good to me!  Thanks, Bin.
> >
> 
> Sorry, I have to put this patch on hold since it causes several test failures on ARM.
> I will investigate and get back ASAP.
> 
The reported failure is a false alarm and happens on trunk too; I must have compared the wrong testing results.
Since there is no regression and the patch was already approved, I will apply it to trunk.

Thanks.
bin

Patch

Index: gcc/testsuite/gcc.dg/tree-ssa/slsr-39.c
===================================================================
--- gcc/testsuite/gcc.dg/tree-ssa/slsr-39.c	(revision 0)
+++ gcc/testsuite/gcc.dg/tree-ssa/slsr-39.c	(revision 0)
@@ -0,0 +1,26 @@ 
+/* Verify straight-line strength reduction for back-tracing
+   CAND_ADD for T2 in:
+
+    *PBASE:    T1
+    *POFFSET:  MULT_EXPR (T2, C3)
+    *PINDEX:   C1 + (C2 * C3) + C4  */
+
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-slsr" } */
+
+typedef int arr_2[50][50];
+
+void foo (arr_2 a2, int v1)
+{
+  int i, j;
+
+  i = v1 + 5;
+  j = i;
+  a2 [i] [j++] = i;
+  a2 [i] [j++] = i;
+  a2 [i] [i-1] += 1;
+  return;
+}
+
+/* { dg-final { scan-tree-dump-times "MEM" 4 "slsr" } } */
+/* { dg-final { cleanup-tree-dump "slsr" } } */
Index: gcc/gimple-ssa-strength-reduction.c
===================================================================
--- gcc/gimple-ssa-strength-reduction.c	(revision 202067)
+++ gcc/gimple-ssa-strength-reduction.c	(working copy)
@@ -750,6 +750,57 @@  slsr_process_phi (gimple phi, bool speed)
   add_cand_for_stmt (phi, c);
 }
 
+/* Given PBASE which is a pointer to tree, look up the defining
+   statement for it and check whether the candidate is in the
+   form of:
+
+     X = B + (1 * S), S is integer constant
+     X = B + (i * S), S is integer one
+
+   If so, set PBASE to the candidate's base_expr and return double
+   int (i * S).
+   Otherwise, just return double int zero.  */
+
+static double_int
+backtrace_base_for_ref (tree *pbase)
+{
+  tree base_in = *pbase;
+  slsr_cand_t base_cand;
+
+  STRIP_NOPS (base_in);
+  if (TREE_CODE (base_in) != SSA_NAME)
+    return tree_to_double_int (integer_zero_node);
+
+  base_cand = base_cand_from_table (base_in);
+
+  while (base_cand && base_cand->kind != CAND_PHI)
+    {
+      if (base_cand->kind == CAND_ADD
+	  && base_cand->index.is_one ()
+	  && TREE_CODE (base_cand->stride) == INTEGER_CST)
+	{
+	  /* X = B + (1 * S), S is integer constant.  */
+	  *pbase = base_cand->base_expr;
+	  return tree_to_double_int (base_cand->stride);
+	}
+      else if (base_cand->kind == CAND_ADD
+	       && TREE_CODE (base_cand->stride) == INTEGER_CST
+	       && integer_onep (base_cand->stride))
+        {
+	  /* X = B + (i * S), S is integer one.  */
+	  *pbase = base_cand->base_expr;
+	  return base_cand->index;
+	}
+
+      if (base_cand->next_interp)
+	base_cand = lookup_cand (base_cand->next_interp);
+      else
+	base_cand = NULL;
+    }
+
+  return tree_to_double_int (integer_zero_node);
+}
+
 /* Look for the following pattern:
 
     *PBASE:    MEM_REF (T1, C1)
@@ -767,8 +818,15 @@  slsr_process_phi (gimple phi, bool speed)
 
     *PBASE:    T1
     *POFFSET:  MULT_EXPR (T2, C3)
-    *PINDEX:   C1 + (C2 * C3) + C4  */
+    *PINDEX:   C1 + (C2 * C3) + C4
 
+   When T2 is recorded by a CAND_ADD in the form of (T2' + C5), it
+   will be further restructured to:
+
+    *PBASE:    T1
+    *POFFSET:  MULT_EXPR (T2', C3)
+    *PINDEX:   C1 + (C2 * C3) + C4 + (C5 * C3)  */
+
 static bool
 restructure_reference (tree *pbase, tree *poffset, double_int *pindex,
 		       tree *ptype)
@@ -777,7 +835,7 @@  restructure_reference (tree *pbase, tree *poffset,
   double_int index = *pindex;
   double_int bpu = double_int::from_uhwi (BITS_PER_UNIT);
   tree mult_op0, mult_op1, t1, t2, type;
-  double_int c1, c2, c3, c4;
+  double_int c1, c2, c3, c4, c5;
 
   if (!base
       || !offset
@@ -823,11 +881,12 @@  restructure_reference (tree *pbase, tree *poffset,
     }
 
   c4 = index.udiv (bpu, FLOOR_DIV_EXPR);
+  c5 = backtrace_base_for_ref (&t2);
 
   *pbase = t1;
-  *poffset = fold_build2 (MULT_EXPR, sizetype, t2,
+  *poffset = fold_build2 (MULT_EXPR, sizetype, fold_convert (sizetype, t2),
 			  double_int_to_tree (sizetype, c3));
-  *pindex = c1 + c2 * c3 + c4;
+  *pindex = c1 + c2 * c3 + c4 + c5 * c3;
   *ptype = type;
 
   return true;
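
The algebra behind the new C5 * C3 term in *PINDEX can be checked in isolation: when T2 is itself recorded as a CAND_ADD of the form T2' + C5, the offset T2 * C3 splits into a shared variable part T2' * C3 plus a constant C5 * C3 that folds into the index.  A small standalone C program (illustration only; the concrete values are made up and merely mirror the 4-byte int elements of the slsr-39.c testcase) makes the identity explicit:

  #include <assert.h>

  int
  main (void)
  {
    /* T2 = T2' + C5: the SSA name feeding the MULT_EXPR is itself a
       CAND_ADD with a constant addend.  Values are arbitrary examples;
       c3 mirrors the element stride of slsr-39.c's int array.  */
    long t2_prime = 7;   /* stands in for T2' */
    long c5 = 1;         /* constant added on top of T2' */
    long c3 = 4;         /* stride in bytes */
    long t2 = t2_prime + c5;

    /* Offset as computed before the patch, directly from T2 ...  */
    long offset_before = t2 * c3;

    /* ... equals the shared variable part T2' * C3 plus the constant
       C5 * C3 that the patch folds into *PINDEX.  */
    long offset_after = t2_prime * c3 + c5 * c3;

    assert (offset_before == offset_after);
    return 0;
  }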