Patchwork Fix for PR51879 - Missed tail merging with non-const/pure calls

login
register
mail settings
Submitter Tom de Vries
Date Jan. 23, 2012, 9:27 p.m.
Message ID <4F1DD0CF.8010808@mentor.com>
Download mbox | patch
Permalink /patch/137486/
State New
Headers show

Comments

Tom de Vries - Jan. 23, 2012, 9:27 p.m.
Richard,
Jakub,

the following patch fixes PR51879.

Consider the following test-case:
...
int bar (int);
void baz (int);

void
foo (int y)
{
  int a;
  if (y == 6)
    a = bar (7);
  else
    a = bar (7);
  baz (a);
}
...

after compiling at -02, the representation looks like this before tail-merging:
...
  # BLOCK 3 freq:1991
  # PRED: 2 [19.9%]  (true,exec)
  # .MEMD.1714_7 = VDEF <.MEMD.1714_6(D)>
  # USE = nonlocal
  # CLB = nonlocal
  aD.1709_3 = barD.1703 (7);
  goto <bb 5>;
  # SUCC: 5 [100.0%]  (fallthru,exec)

  # BLOCK 4 freq:8009
  # PRED: 2 [80.1%]  (false,exec)
  # .MEMD.1714_8 = VDEF <.MEMD.1714_6(D)>
  # USE = nonlocal
  # CLB = nonlocal
  aD.1709_4 = barD.1703 (7);
  # SUCC: 5 [100.0%]  (fallthru,exec)

  # BLOCK 5 freq:10000
  # PRED: 3 [100.0%]  (fallthru,exec) 4 [100.0%]  (fallthru,exec)
  # aD.1709_1 = PHI <aD.1709_3(3), aD.1709_4(4)>
  # .MEMD.1714_5 = PHI <.MEMD.1714_7(3), .MEMD.1714_8(4)>
  # .MEMD.1714_9 = VDEF <.MEMD.1714_5>
  # USE = nonlocal
  # CLB = nonlocal
  bazD.1705 (aD.1709_1);
  # VUSE <.MEMD.1714_9>
  return;
...

the patch allows aD.1709_4 to be value numbered to aD.1709_3, and .MEMD.1714_8
to .MEMD.1714_7, which enables tail-merging of blocks 4 and 5.

The patch makes sure non-const/pure call results (gimple_vdef and
gimple_call_lhs) are properly value numbered.

Bootstrapped and reg-tested on x86_64.

ok for stage1?

Thanks,
- Tom

2012-01-23  Tom de Vries  <tom@codesourcery.com>

	PR tree-optimization/51879
	tree-ssa-sccvn.c (visit_reference_op_call): Handle gimple_vdef.
	(visit_use): Handle non-pure/const calls using visit_reference_op_call.

	gcc.dg/pr51879.c: New test.
	gcc.dg/pr51879-2.c: Same.
	gcc.dg/pr51879-3.c: Same.
	gcc.dg/pr51879-4.c: Same.
Richard Guenther - Jan. 24, 2012, 10:40 a.m.
On Mon, Jan 23, 2012 at 10:27 PM, Tom de Vries <Tom_deVries@mentor.com> wrote:
> Richard,
> Jakub,
>
> the following patch fixes PR51879.
>
> Consider the following test-case:
> ...
> int bar (int);
> void baz (int);
>
> void
> foo (int y)
> {
>  int a;
>  if (y == 6)
>    a = bar (7);
>  else
>    a = bar (7);
>  baz (a);
> }
> ...
>
> after compiling at -02, the representation looks like this before tail-merging:
> ...
>  # BLOCK 3 freq:1991
>  # PRED: 2 [19.9%]  (true,exec)
>  # .MEMD.1714_7 = VDEF <.MEMD.1714_6(D)>
>  # USE = nonlocal
>  # CLB = nonlocal
>  aD.1709_3 = barD.1703 (7);
>  goto <bb 5>;
>  # SUCC: 5 [100.0%]  (fallthru,exec)
>
>  # BLOCK 4 freq:8009
>  # PRED: 2 [80.1%]  (false,exec)
>  # .MEMD.1714_8 = VDEF <.MEMD.1714_6(D)>
>  # USE = nonlocal
>  # CLB = nonlocal
>  aD.1709_4 = barD.1703 (7);
>  # SUCC: 5 [100.0%]  (fallthru,exec)
>
>  # BLOCK 5 freq:10000
>  # PRED: 3 [100.0%]  (fallthru,exec) 4 [100.0%]  (fallthru,exec)
>  # aD.1709_1 = PHI <aD.1709_3(3), aD.1709_4(4)>
>  # .MEMD.1714_5 = PHI <.MEMD.1714_7(3), .MEMD.1714_8(4)>
>  # .MEMD.1714_9 = VDEF <.MEMD.1714_5>
>  # USE = nonlocal
>  # CLB = nonlocal
>  bazD.1705 (aD.1709_1);
>  # VUSE <.MEMD.1714_9>
>  return;
> ...
>
> the patch allows aD.1709_4 to be value numbered to aD.1709_3, and .MEMD.1714_8
> to .MEMD.1714_7, which enables tail-merging of blocks 4 and 5.
>
> The patch makes sure non-const/pure call results (gimple_vdef and
> gimple_call_lhs) are properly value numbered.
>
> Bootstrapped and reg-tested on x86_64.
>
> ok for stage1?

The following cannot really work:

@@ -2600,7 +2601,11 @@ visit_reference_op_call (tree lhs, gimpl
   result = vn_reference_lookup_1 (&vr1, NULL);
   if (result)
     {
-      changed = set_ssa_val_to (lhs, result);
+      tree result_vdef = gimple_vdef (SSA_NAME_DEF_STMT (result));
+      if (vdef)
+       changed |= set_ssa_val_to (vdef, result_vdef);
+      changed |= set_ssa_val_to (lhs, result);

becase 'result' may be not an SSA name.  It might also not have
a proper definition statement (if VN_INFO (result)->needs_insertion
is true).  So you at least need to guard things properly.

(On a side-note - I _did_ want to remove value-numbering virtual operands
at some point ...)

@@ -3359,8 +3366,10 @@ visit_use (tree use)
          /* ???  We should handle stores from calls.  */
          else if (TREE_CODE (lhs) == SSA_NAME)
            {
+             tree vuse = gimple_vuse (stmt);
              if (!gimple_call_internal_p (stmt)
-                 && gimple_call_flags (stmt) & (ECF_PURE | ECF_CONST))
+                 && (gimple_call_flags (stmt) & (ECF_PURE | ECF_CONST)
+                     || (vuse && SSA_VAL (vuse) != VN_TOP)))
                changed = visit_reference_op_call (lhs, stmt);
              else
                changed = defs_to_varying (stmt);

... exactly because of the issue that a stmt has multiple defs.  Btw,
vuse should have been visited here or be part of our SCC, so, why do
you need this check?

Thanks,
Richard.

> Thanks,
> - Tom
>
> 2012-01-23  Tom de Vries  <tom@codesourcery.com>
>
>        PR tree-optimization/51879
>        tree-ssa-sccvn.c (visit_reference_op_call): Handle gimple_vdef.
>        (visit_use): Handle non-pure/const calls using visit_reference_op_call.
>
>        gcc.dg/pr51879.c: New test.
>        gcc.dg/pr51879-2.c: Same.
>        gcc.dg/pr51879-3.c: Same.
>        gcc.dg/pr51879-4.c: Same.

Patch

Index: gcc/tree-ssa-sccvn.c
===================================================================
--- gcc/tree-ssa-sccvn.c (revision 183325)
+++ gcc/tree-ssa-sccvn.c (working copy)
@@ -2591,6 +2591,7 @@  visit_reference_op_call (tree lhs, gimpl
   struct vn_reference_s vr1;
   tree result;
   tree vuse = gimple_vuse (stmt);
+  tree vdef = gimple_vdef (stmt);
 
   vr1.vuse = vuse ? SSA_VAL (vuse) : NULL_TREE;
   vr1.operands = valueize_shared_reference_ops_from_call (stmt);
@@ -2600,7 +2601,11 @@  visit_reference_op_call (tree lhs, gimpl
   result = vn_reference_lookup_1 (&vr1, NULL);
   if (result)
     {
-      changed = set_ssa_val_to (lhs, result);
+      tree result_vdef = gimple_vdef (SSA_NAME_DEF_STMT (result));
+      if (vdef)
+	changed |= set_ssa_val_to (vdef, result_vdef);
+      changed |= set_ssa_val_to (lhs, result);
+
       if (TREE_CODE (result) == SSA_NAME
 	  && VN_INFO (result)->has_constants)
 	VN_INFO (lhs)->has_constants = true;
@@ -2609,7 +2614,9 @@  visit_reference_op_call (tree lhs, gimpl
     {
       void **slot;
       vn_reference_t vr2;
-      changed = set_ssa_val_to (lhs, lhs);
+      if (vdef)
+	changed |= set_ssa_val_to (vdef, vdef);
+      changed |= set_ssa_val_to (lhs, lhs);
       vr2 = (vn_reference_t) pool_alloc (current_info->references_pool);
       vr2->vuse = vr1.vuse;
       vr2->operands = valueize_refs (create_reference_ops_from_call (stmt));
@@ -3359,8 +3366,10 @@  visit_use (tree use)
 	  /* ???  We should handle stores from calls.  */
 	  else if (TREE_CODE (lhs) == SSA_NAME)
 	    {
+	      tree vuse = gimple_vuse (stmt);
 	      if (!gimple_call_internal_p (stmt)
-		  && gimple_call_flags (stmt) & (ECF_PURE | ECF_CONST))
+		  && (gimple_call_flags (stmt) & (ECF_PURE | ECF_CONST)
+		      || (vuse && SSA_VAL (vuse) != VN_TOP)))
 		changed = visit_reference_op_call (lhs, stmt);
 	      else
 		changed = defs_to_varying (stmt);
Index: gcc/testsuite/gcc.dg/pr51879-2.c
===================================================================
--- /dev/null (new file)
+++ gcc/testsuite/gcc.dg/pr51879-2.c (revision 0)
@@ -0,0 +1,19 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-pre" } */
+
+int bar (int);
+void baz (int);
+
+void
+foo (int y)
+{
+  int a;
+  if (y)
+    baz (bar (7) + 6);
+  else
+    baz (bar (7) + 6);
+}
+
+/* { dg-final { scan-tree-dump-times "bar \\(" 1 "pre"} } */
+/* { dg-final { scan-tree-dump-times "baz \\(" 1 "pre"} } */
+/* { dg-final { cleanup-tree-dump "pre" } } */
Index: gcc/testsuite/gcc.dg/pr51879.c
===================================================================
--- /dev/null (new file)
+++ gcc/testsuite/gcc.dg/pr51879.c (revision 0)
@@ -0,0 +1,19 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-pre" } */
+
+int bar (int);
+void baz (int);
+
+void
+foo (int y)
+{
+  int a;
+  if (y == 6)
+    a = bar (7);
+  else
+    a = bar (7);
+  baz (a);
+}
+
+/* { dg-final { scan-tree-dump-times "bar \\(" 1 "pre"} } */
+/* { dg-final { cleanup-tree-dump "pre" } } */
Index: gcc/testsuite/gcc.dg/pr51879-3.c
===================================================================
--- /dev/null (new file)
+++ gcc/testsuite/gcc.dg/pr51879-3.c (revision 0)
@@ -0,0 +1,19 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-pre" } */
+
+int bar (int);
+void baz (int);
+
+void
+foo (int y)
+{
+  int a;
+  if (y == 6)
+    a = bar (7) + 6;
+  else
+    a = bar (7) + 6;
+  baz (a);
+}
+
+/* { dg-final { scan-tree-dump-times "bar \\(" 1 "pre"} } */
+/* { dg-final { cleanup-tree-dump "pre" } } */
Index: gcc/testsuite/gcc.dg/pr51879-4.c
===================================================================
--- /dev/null (new file)
+++ gcc/testsuite/gcc.dg/pr51879-4.c (revision 0)
@@ -0,0 +1,16 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-pre" } */
+
+int bar (int);
+void baz (int);
+
+int foo (int y)
+{
+  int a, b;
+  a = bar (7) + 6;
+  b = bar (7) + 6;
+  return a + b;
+}
+
+/* { dg-final { scan-tree-dump-times "bar \\(" 2 "pre"} } */
+/* { dg-final { cleanup-tree-dump "pre" } } */