Patchwork [PR54693] loss of debug info in jump threading and loop ivopts

login
register
mail settings
Submitter Alexandre Oliva
Date Nov. 3, 2012, 4:11 p.m.
Message ID <orehkais45.fsf@livre.localdomain>
Download mbox | patch
Permalink /patch/196869/
State New
Headers show

Comments

Alexandre Oliva - Nov. 3, 2012, 4:11 p.m.
On Nov  2, 2012, Alexandre Oliva <aoliva@redhat.com> wrote:

> On Nov  1, 2012, Jakub Jelinek <jakub@redhat.com> wrote:
>> Even for stmt frontiers it is IMHO undesirable to duplicate
>> (perhaps many times) the whole sequence, as one would then reply the var
>> changing sequence in the debugger perhaps once in the original bb, then
>> once or many times again in the successor bb.

> Quite the opposite: when we do jump threading, we're *bypassing* the
> entire block

What I wrote above was not entirely accurate.

We do bypass empty blocks, in thread_around_empty_block, but we
propagated debug stmts from their predecessors, that may actually be the
block duplicated for threading.  So, the debug stmts from the duplicated
block are entirely artificial, exclusively for purposes of SSA
resolution, while those from bypassed otherwise-empty blocks actually
tell the history of the computation.  All that said, even the latter are
useless right now, when their bindings happen to be immediately modified
by subsequent debug stmts.

Now, one of the problems with my patch was that it propagated debug
stmts over and over, when there was a long chain of empty blocks.  I'd
assumed they'd all be eliminated after the jump was redirected, but
they're only bypassed by the copy of the initial block.

Furthermore, when jump threading fails to determine the taken edge out
of e->dest, it proceeds to trying to threading through successors of
e->dest, and the current code would propagate debug stmts from e->dest
to single-pred successors at such attempt, even if no threading would
take place.

The patch below (based on a tree from before your first patch that's
already in) improves the situation: we only propagate debug stmts to the
final destination block, and only after we make sure we are performing
jump threading to that block.  Also, it refrains from propagating debug
bind stmts whose bindings are rendered inoperant by later bindings, in
the threaded blocks or at the top of the destination block.

I didn't try delaying the creation of the pointer set as you did, but
I'm now thinking using a linear on-stack vector for searches up to some
size might be even faster than creating the pointer set at the first
var.

Regstrapping on x86_64-linux-gnu and i686-linux-gnu.  Ok to install?
Jakub Jelinek - Nov. 3, 2012, 6:15 p.m.
On Sat, Nov 03, 2012 at 02:11:22PM -0200, Alexandre Oliva wrote:
> I didn't try delaying the creation of the pointer set as you did, but
> I'm now thinking using a linear on-stack vector for searches up to some
> size might be even faster than creating the pointer set at the first
> var.

THat would be a nice thing as a follow-up (say up to 8 or 16 decls).

> Propagate debug stmts only after decision to thread.
> 
> From: Alexandre Oliva <aoliva@redhat.com>
> 
> for  gcc/ChangeLog
> 
> 	PR debug/54693
> 	* tree-ssa-threadedge.c (propagate_threaded_block_debug_into):
> 	New, rewritten from debug stmt copying code...
> 	(thread_around_empty_block): ... removed from here.
> 	(thread_across_edge): Call propagate_threaded_block_debug_into.

Ok, thanks.

	Jakub
Jeff Law - Nov. 5, 2012, 4:57 p.m.
On 11/03/2012 10:11 AM, Alexandre Oliva wrote:
> On Nov  2, 2012, Alexandre Oliva <aoliva@redhat.com> wrote:
>
>> On Nov  1, 2012, Jakub Jelinek <jakub@redhat.com> wrote:
>>> Even for stmt frontiers it is IMHO undesirable to duplicate
>>> (perhaps many times) the whole sequence, as one would then reply the var
>>> changing sequence in the debugger perhaps once in the original bb, then
>>> once or many times again in the successor bb.
>
>> Quite the opposite: when we do jump threading, we're *bypassing* the
>> entire block
>
> What I wrote above was not entirely accurate.
>
> We do bypass empty blocks, in thread_around_empty_block, but we
> propagated debug stmts from their predecessors, that may actually be the
> block duplicated for threading.  So, the debug stmts from the duplicated
> block are entirely artificial, exclusively for purposes of SSA
> resolution, while those from bypassed otherwise-empty blocks actually
> tell the history of the computation.  All that said, even the latter are
> useless right now, when their bindings happen to be immediately modified
> by subsequent debug stmts.
>
> Now, one of the problems with my patch was that it propagated debug
> stmts over and over, when there was a long chain of empty blocks.  I'd
> assumed they'd all be eliminated after the jump was redirected, but
> they're only bypassed by the copy of the initial block.
>
> Furthermore, when jump threading fails to determine the taken edge out
> of e->dest, it proceeds to trying to threading through successors of
> e->dest, and the current code would propagate debug stmts from e->dest
> to single-pred successors at such attempt, even if no threading would
> take place.
>
> The patch below (based on a tree from before your first patch that's
> already in) improves the situation: we only propagate debug stmts to the
> final destination block, and only after we make sure we are performing
> jump threading to that block.  Also, it refrains from propagating debug
> bind stmts whose bindings are rendered inoperant by later bindings, in
> the threaded blocks or at the top of the destination block.
>
> I didn't try delaying the creation of the pointer set as you did, but
> I'm now thinking using a linear on-stack vector for searches up to some
> size might be even faster than creating the pointer set at the first
> var.
>
> Regstrapping on x86_64-linux-gnu and i686-linux-gnu.  Ok to install?
This is fine.  Thanks,
Jeff

Patch

Propagate debug stmts only after decision to thread.

From: Alexandre Oliva <aoliva@redhat.com>

for  gcc/ChangeLog

	PR debug/54693
	* tree-ssa-threadedge.c (propagate_threaded_block_debug_into):
	New, rewritten from debug stmt copying code...
	(thread_around_empty_block): ... removed from here.
	(thread_across_edge): Call propagate_threaded_block_debug_into.
---

 gcc/tree-ssa-threadedge.c |  106 +++++++++++++++++++++++++++++++++++++--------
 1 files changed, 87 insertions(+), 19 deletions(-)


diff --git a/gcc/tree-ssa-threadedge.c b/gcc/tree-ssa-threadedge.c
index f43a564..a9c671e 100644
--- a/gcc/tree-ssa-threadedge.c
+++ b/gcc/tree-ssa-threadedge.c
@@ -610,6 +610,85 @@  cond_arg_set_in_bb (edge e, basic_block bb)
   return false;
 }
 
+/* Copy debug stmts from DEST's chain of single predecessors up to
+   SRC, so that we don't lose the bindings as PHI nodes are introduced
+   when DEST gains new predecessors.  */
+static void
+propagate_threaded_block_debug_into (basic_block dest, basic_block src)
+{
+  if (!MAY_HAVE_DEBUG_STMTS)
+    return;
+
+  if (!single_pred_p (dest))
+    return;
+
+  gcc_checking_assert (dest != src);
+
+  gimple_stmt_iterator gsi = gsi_after_labels (dest);
+  pointer_set_t *vars = pointer_set_create ();
+
+  for (gimple_stmt_iterator si = gsi;
+       !gsi_end_p (si); gsi_next (&si))
+    {
+      gimple stmt = gsi_stmt (si);
+      if (!is_gimple_debug (stmt))
+	break;
+
+      tree var;
+
+      if (gimple_debug_bind_p (stmt))
+	var = gimple_debug_bind_get_var (stmt);
+      else if (gimple_debug_source_bind_p (stmt))
+	var = gimple_debug_source_bind_get_var (stmt);
+      else
+	gcc_unreachable ();
+
+      pointer_set_insert (vars, var);
+    }
+
+  basic_block bb = dest;
+
+  do
+    {
+      bb = single_pred (bb);
+      for (gimple_stmt_iterator si = gsi_last_bb (bb);
+	   !gsi_end_p (si); gsi_prev (&si))
+	{
+	  gimple stmt = gsi_stmt (si);
+	  if (!is_gimple_debug (stmt))
+	    continue;
+
+	  tree var;
+
+	  if (gimple_debug_bind_p (stmt))
+	    var = gimple_debug_bind_get_var (stmt);
+	  else if (gimple_debug_source_bind_p (stmt))
+	    var = gimple_debug_source_bind_get_var (stmt);
+	  else
+	    gcc_unreachable ();
+
+	  /* Discard debug bind overlaps.  ??? Unlike stmts from src,
+	     copied into a new block that will precede BB, debug bind
+	     stmts in bypassed BBs may actually be discarded if
+	     they're overwritten by subsequent debug bind stmts, which
+	     might be a problem once we introduce stmt frontier notes
+	     or somesuch.  Adding `&& bb == src' to the condition
+	     below will preserve all potentially relevant debug
+	     notes.  */
+	  if (pointer_set_insert (vars, var))
+	    continue;
+
+	  stmt = gimple_copy (stmt);
+	  /* ??? Should we drop the location of the copy to denote
+	     they're artificial bindings?  */
+	  gsi_insert_before (&gsi, stmt, GSI_NEW_STMT);
+	}
+    }
+  while (bb != src && single_pred_p (bb));
+
+  pointer_set_destroy (vars);
+}
+
 /* TAKEN_EDGE represents the an edge taken as a result of jump threading.
    See if we can thread around TAKEN_EDGE->dest as well.  If so, return
    the edge out of TAKEN_EDGE->dest that we can statically compute will be
@@ -637,24 +716,6 @@  thread_around_empty_block (edge taken_edge,
   if (!single_pred_p (bb))
     return NULL;
 
-  /* Before threading, copy DEBUG stmts from the predecessor, so that
-     we don't lose the bindings as we redirect the edges.  */
-  if (MAY_HAVE_DEBUG_STMTS)
-    {
-      gsi = gsi_after_labels (bb);
-      for (gimple_stmt_iterator si = gsi_last_bb (taken_edge->src);
-	   !gsi_end_p (si); gsi_prev (&si))
-	{
-	  stmt = gsi_stmt (si);
-	  if (!is_gimple_debug (stmt))
-	    continue;
-
-	  stmt = gimple_copy (stmt);
-	  /* ??? Should we drop the location of the copy?  */
-	  gsi_insert_before (&gsi, stmt, GSI_NEW_STMT);
-	}
-    }
-
   /* This block must have more than one successor.  */
   if (single_succ_p (bb))
     return NULL;
@@ -827,6 +888,9 @@  thread_across_edge (gimple dummy_cond,
 	    }
 
 	  remove_temporary_equivalences (stack);
+	  if (!taken_edge)
+	    return;
+	  propagate_threaded_block_debug_into (taken_edge->dest, e->dest);
 	  register_jump_thread (e, taken_edge, NULL);
 	  return;
 	}
@@ -892,7 +956,11 @@  thread_across_edge (gimple dummy_cond,
 	       same.  */
 	    tmp = find_edge (taken_edge->src, e3->dest);
 	    if (!tmp || phi_args_equal_on_edges (tmp, e3))
-	      register_jump_thread (e, taken_edge, e3);
+	      {
+		propagate_threaded_block_debug_into (e3->dest,
+						     taken_edge->dest);
+		register_jump_thread (e, taken_edge, e3);
+	      }
 	  }
 
       }