Patchwork: Fix PR54489 - FRE needing AVAIL_OUT

Submitter Richard Guenther
Date Sept. 14, 2012, 9:43 a.m.
Message ID <alpine.LNX.2.00.1209141142360.28649@zhemvz.fhfr.qr>
Permalink /patch/183860/
State New

Comments

Richard Guenther - Sept. 14, 2012, 9:43 a.m.
On Fri, 14 Sep 2012, Richard Guenther wrote:

> On Thu, 13 Sep 2012, Steven Bosscher wrote:
> 
> > On Wed, Sep 12, 2012 at 4:52 PM, Steven Bosscher wrote:
> > > On Wed, Sep 12, 2012 at 4:02 PM, Richard Guenther wrote:
> > >> for a followup (and I bet something other than PRE blows up at -O2 as well).
> > >
> > > Actually, the only thing that really blows up is that enemy of scalability, VRP.
> > 
> > FWIW, this appears to be due to the well-known problem with the equiv
> > set, but also due to the liveness computations that tree-vrp performs,
> > since your commit in r139263.
> > 
> > Any reason why you didn't just re-use the tree-ssa-live machinery?
> 
> Probably I didn't know about it or didn't want to keep the full life
> problem alive (it tries to free things as soon as possible).
> 
> > And any reason why you don't let a DEF kill a live SSA name? AFAICT
> > you're exposing all SSA names up without ever killing a USE :-)
> 
> Eh ;)  We also should traverse blocks backward I suppose.  Also
> the RPO traversal suffers from the same issue I noticed in PRE
> and for which I invented my_rev_post_order_compute ...
> (pre_and_rev_post_order_compute doesn't compute an optimal
> reverse post order).
> 
> Patch fixing the liveness below, untested so far, apart from on
> tree-ssa.exp where it seems to regress gcc.dg/tree-ssa/pr21086.c :/

The following does not.  Queued for testing (though I doubt it will
help memory usage due to the use of sbitmaps).  I think
jump threading special-casing asserts and the equiv bitmaps are
the real problem of VRP.  What testcase did you notice the live
issue on?

Thanks,
Richard.

2012-09-14  Richard Guenther  <rguenther@suse.de>

	* tree-vrp.c (register_new_assert_for): Simplify for backward
	walk.
	(find_assert_locations_1): Walk the basic-block backwards,
	properly add/prune from live.  Use live for asserts derived
	from stmts.
Steven Bosscher - Sept. 14, 2012, 10:45 a.m.
On Fri, Sep 14, 2012 at 11:43 AM, Richard Guenther <rguenther@suse.de> wrote:
>> > Any reason why you didn't just re-use the tree-ssa-live machinery?
>>
>> Probably I didn't know about it or didn't want to keep the full life
>> problem alive (it tries to free things as soon as possible).

I think it'd be good to use the tree-ssa-live stuff instead of a local
liveness dataflow solver, even if that may allocate a bit of extra
memory. tree-ssa-live is very smart about how liveness is computed.
I'll experiment with using it in tree-vrp.


>> > And any reason why you don't let a DEF kill a live SSA name? AFAICT
>> > you're exposing all SSA names up without ever killing a USE :-)
>>
>> Eh ;)  We also should traverse blocks backward I suppose.  Also
>> the RPO traversal suffers from the same issue I noticed in PRE
>> and for which I invented my_rev_post_order_compute ...
>> (pre_and_rev_post_order_compute doesn't compute an optimal
>> reverse post order).

Eh, what do you mean by "optimal" here?

Yikes, I didn't know about my_rev_post_order_compute. How horrible!
That function doesn't compute reverse post-order of the CFG, but a
post-order of the reverse CFG!


> The following does not.  Queued for testing (though I doubt it will
> help memory usage due to the use of sbitmaps).  I think
> jump threading special-casing asserts and the equiv bitmaps are
> the real problem of VRP.  What testcase did you notice the live
> issue on?

I don't remember exactly what test case it was, but it had a large
switch in it. Maybe Brad Lucier's scheme interpreter, but I'm not
sure.

Ciao!
Steven
Richard Guenther - Sept. 14, 2012, 10:50 a.m.
On Fri, 14 Sep 2012, Steven Bosscher wrote:

> On Fri, Sep 14, 2012 at 11:43 AM, Richard Guenther <rguenther@suse.de> wrote:
> >> > Any reason why you didn't just re-use the tree-ssa-live machinery?
> >>
> >> Probably I didn't know about it or didn't want to keep the full life
> >> problem alive (it tries to free things as soon as possible).
> 
> I think it'd be good to use the tree-ssa-live stuff instead of a local
> liveness dataflow solver, even if that may allocate a bit of extra
> memory. tree-ssa-live is very smart about how liveness is computed.
> I'll experiment with using it in tree-vrp.

Ok.  Note that VRP likes to know liveness at a given stmt (but
actually doesn't use it - it would, with my patch) here:

              /* If OP is used only once, namely in this STMT, don't
                 bother creating an ASSERT_EXPR for it.  Such an
                 ASSERT_EXPR would do nothing but increase compile time.  */
              if (!has_single_use (op))

that really wants to know if anything would consume the assert, thus
if OP is live below stmt.

> >> > And any reason why you don't let a DEF kill a live SSA name? AFAICT
> >> > you're exposing all SSA names up without ever killing a USE :-)
> >>
> >> Eh ;)  We also should traverse blocks backward I suppose.  Also
> >> the RPO traversal suffers from the same issue I noticed in PRE
> >> and for which I invented my_rev_post_order_compute ...
> >> (pre_and_rev_post_order_compute doesn't compute an optimal
> >> reverse post order).
> 
> Eh, what do you mean by "optimal" here?
> 
> Yikes, I didn't know about my_rev_post_order_compute. How horrible!
> That function doesn't compute reverse post-order of the CFG, but a
> post-order of the reverse CFG!

Ok, well - then that's what we need for compute_antic to have a
minimal number of iterations, and it is what VRP needs.  Visit
all successors before BB if possible.

;)

> 
> > The following not.  Queued for testing (though I doubt it will
> > help memory usage due to the use of sbitmaps).  I think
> > jump threading special-casing asserts and the equiv bitmaps are
> > the real problem of VRP.  What testcase did you notice the live
> > issue on?
> 
> I don't remember exactly what test case it was, but it had a large
> switch in it. Maybe Brad Lucier's scheme interpreter, but I'm not
> sure.

I see.

Richard.
Steven Bosscher - Sept. 14, 2012, 12:20 p.m.
On Fri, Sep 14, 2012 at 12:50 PM, Richard Guenther wrote:
>> Yikes, I didn't know about my_rev_post_order_compute. How horrible!
>> That function doesn't compute reverse post-order of the CFG, but a
>> post-order of the reverse CFG!
>
> Ok, well - then that's what we need for compute_antic to have a
> minimal number of iterations, and it is what VRP needs.  Visit
> all successors before BB if possible.

Right, visit all successors of BB before BB itself, aka visiting in
topological order of the reverse CFG. But your
my_rev_post_order_compute doesn't actually compute a post-order of the
reverse CFG. The first block pushed into the array is EXIT_BLOCK, iff
include_entry_exit==true. Fortunately, the function is only ever
called with include_entry_exit==false.

Ciao!
Steven
Richard Guenther - Sept. 14, 2012, 12:26 p.m.
On Fri, 14 Sep 2012, Steven Bosscher wrote:

> On Fri, Sep 14, 2012 at 12:50 PM, Richard Guenther wrote:
> >> Yikes, I didn't know about my_rev_post_order_compute. How horrible!
> >> That function doesn't compute reverse post-order of the CFG, but a
> >> post-order of the reverse CFG!
> >
> > Ok, well - then that's what we need for compute_antic to have a
> > minimal number of iterations, and it is what VRP needs.  Visit
> > all successors before BB if possible.
> 
> Right, visit all successors of BB before BB itself, aka visiting in
> topological order of the reverse CFG. But your
> my_rev_post_order_compute doesn't actually compute a post-order of the
> reverse CFG. The first block pushed into the array is EXIT_BLOCK, iff
> include_entry_exit==true. Fortunately, the function is only ever
> called with include_entry_exit==false.

Oops.

If you can figure out a better name for the function we should
probably move it to cfganal.c

Richard.
Steven Bosscher - Oct. 5, 2012, 8:47 p.m.
On Fri, Sep 14, 2012 at 2:26 PM, Richard Guenther <rguenther@suse.de> wrote:
> If you can figure out a better name for the function we should
> probably move it to cfganal.c

It looks like my previous e-mail about this got lost somehow, so
retrying:

Your my_rev_post_order_compute is simply inverted_post_order_compute.
The only difference is that you'll want to ignore EXIT_BLOCK, which is
always added to the list by inverted_post_order_compute.

Ciao!
Steven
Richard Guenther - Oct. 8, 2012, 8:41 a.m.
On Fri, 5 Oct 2012, Steven Bosscher wrote:

> On Fri, Sep 14, 2012 at 2:26 PM, Richard Guenther <rguenther@suse.de> wrote:
> > If you can figure out a better name for the function we should
> > probably move it to cfganal.c
> 
> It looks like my previous e-mail about this got lost somehow, so
> retrying:
> 
> Your my_rev_post_order_compute is simply inverted_post_order_compute.
> The only difference is that you'll want to ignore EXIT_BLOCK, which is
> always added to the list by inverted_post_order_compute.

Indeed.  inverted_post_order_compute seems to handle CFGs where
infinite loops and noreturn calls are not connected to the exit block,
though.  Possibly that's why it can get away with not handling entry/exit.

I'm testing a patch to use inverted_post_order_compute from PRE.

Richard.

Patch

Index: gcc/tree-vrp.c
===================================================================
--- gcc/tree-vrp.c	(revision 191289)
+++ gcc/tree-vrp.c	(working copy)
@@ -4384,24 +4384,7 @@  register_new_assert_for (tree name, tree
 	  && (loc->expr == expr
 	      || operand_equal_p (loc->expr, expr, 0)))
 	{
-	  /* If the assertion NAME COMP_CODE VAL has already been
-	     registered at a basic block that dominates DEST_BB, then
-	     we don't need to insert the same assertion again.  Note
-	     that we don't check strict dominance here to avoid
-	     replicating the same assertion inside the same basic
-	     block more than once (e.g., when a pointer is
-	     dereferenced several times inside a block).
-
-	     An exception to this rule are edge insertions.  If the
-	     new assertion is to be inserted on edge E, then it will
-	     dominate all the other insertions that we may want to
-	     insert in DEST_BB.  So, if we are doing an edge
-	     insertion, don't do this dominance check.  */
-          if (e == NULL
-	      && dominated_by_p (CDI_DOMINATORS, dest_bb, loc->bb))
-	    return;
-
-	  /* Otherwise, if E is not a critical edge and DEST_BB
+	  /* If E is not a critical edge and DEST_BB
 	     dominates the existing location for the assertion, move
 	     the assertion up in the dominance tree by updating its
 	     location information.  */
@@ -5439,7 +5422,6 @@  find_assert_locations_1 (basic_block bb,
 {
   gimple_stmt_iterator si;
   gimple last;
-  gimple phi;
   bool need_assert;
 
   need_assert = false;
@@ -5462,7 +5444,7 @@  find_assert_locations_1 (basic_block bb,
 
   /* Traverse all the statements in BB marking used names and looking
      for statements that may infer assertions for their used operands.  */
-  for (si = gsi_start_bb (bb); !gsi_end_p (si); gsi_next (&si))
+  for (si = gsi_last_bb (bb); !gsi_end_p (si); gsi_prev (&si))
     {
       gimple stmt;
       tree op;
@@ -5479,8 +5461,10 @@  find_assert_locations_1 (basic_block bb,
 	  tree value;
 	  enum tree_code comp_code;
 
-	  /* Mark OP in our live bitmap.  */
-	  SET_BIT (live, SSA_NAME_VERSION (op));
+	  /* If op is not live beyond this stmt, do not bother to insert
+	     asserts for it.  */
+	  if (!TEST_BIT (live, SSA_NAME_VERSION (op)))
+	    continue;
 
 	  /* If OP is used in such a way that we can infer a value
 	     range for it, and we don't find a previous assertion for
@@ -5520,25 +5504,28 @@  find_assert_locations_1 (basic_block bb,
 		    }
 		}
 
-	      /* If OP is used only once, namely in this STMT, don't
-		 bother creating an ASSERT_EXPR for it.  Such an
-		 ASSERT_EXPR would do nothing but increase compile time.  */
-	      if (!has_single_use (op))
-		{
-		  register_new_assert_for (op, op, comp_code, value,
-					   bb, NULL, si);
-		  need_assert = true;
-		}
+	      register_new_assert_for (op, op, comp_code, value, bb, NULL, si);
+	      need_assert = true;
 	    }
 	}
+
+      /* Update live.  */
+      FOR_EACH_SSA_TREE_OPERAND (op, stmt, i, SSA_OP_USE)
+	SET_BIT (live, SSA_NAME_VERSION (op));
+      FOR_EACH_SSA_TREE_OPERAND (op, stmt, i, SSA_OP_DEF)
+	RESET_BIT (live, SSA_NAME_VERSION (op));
     }
 
-  /* Traverse all PHI nodes in BB marking used operands.  */
+  /* Traverse all PHI nodes in BB, updating live.  */
   for (si = gsi_start_phis (bb); !gsi_end_p(si); gsi_next (&si))
     {
       use_operand_p arg_p;
       ssa_op_iter i;
-      phi = gsi_stmt (si);
+      gimple phi = gsi_stmt (si);
+      tree res = gimple_phi_result (phi);
+
+      if (virtual_operand_p (res))
+	continue;
 
       FOR_EACH_PHI_ARG (arg_p, phi, i, SSA_OP_USE)
 	{
@@ -5546,6 +5533,8 @@  find_assert_locations_1 (basic_block bb,
 	  if (TREE_CODE (arg) == SSA_NAME)
 	    SET_BIT (live, SSA_NAME_VERSION (arg));
 	}
+
+      RESET_BIT (live, SSA_NAME_VERSION (res));
     }
 
   return need_assert;