mbox series

[0/3] RFC: Let debug stmts influence codegen at -Og

Message ID mpt5zowjtpx.fsf@arm.com
Headers show
Series RFC: Let debug stmts influence codegen at -Og | expand

Message

Richard Sandiford June 23, 2019, 1:51 p.m. UTC
-Og is documented as:

  @option{-Og} should be the optimization
  level of choice for the standard edit-compile-debug cycle, offering
  a reasonable level of optimization while maintaining fast compilation
  and a good debugging experience.  It is a better choice than @option{-O0}
  for producing debuggable code because some compiler passes
  that collect debug information are disabled at @option{-O0}.

One of the things hampering that is that, as for the "normal" -O* flags,
the code produced by -Og -g must be the same as the code produced
without any debug IL at all.  There are many cases in which that makes
it impossible to stop useful values from being optimised out of the
debug info, either because the value simply isn't available at runtime
at the point that the debugger needs it, or because of limitations in
the debug representation.  (E.g. pointers to external data are dropped
from debug info because the relocations wouldn't be honoured.)

I think it would be better to flip things around so that the debug IL
is always present when optimising at -Og, and then allow the debug IL
to influence codegen at -Og.  This still honours the -fcompare-debug
principle, and the compile speed of -Og without -g doesn't seem very
important.

This series therefore adds a mode in which debug stmts and debug insns
are present even without -g and are explicitly allowed to affect codegen.
In particular, when this mode is active:

- uses in debug binds become first-class uses, acting like uses in
  executable code

- the use of DEBUG_EXPR_DECLs is banned.  If we want to refer to
  a temporary value in debug binds, we need to calculate the value
  with executable code instead

This needs a new term to distinguish stmts/insns that affect codegen
from those that don't.  I couldn't think of one that I was really
happy with, but possibilities included:

    tangible/shadow
    manifest/hidden
    foreground/background
    reactive/inert
    active/dormant   (but "active insn" already means something else)
    peppy/sullen

The series uses tangible/shadow.  There's a new global flag_tangible_debug
that controls whether debug insns are "tangible" insns (for the new mode)
or "shadow" insns (for normal optimisation).  -Og enables the new mode
while the other optimisation levels leave it off.  (Despite the name,
the new variable is just an internal flag, there's no -ftangible-debug
option.)

The first patch adds the infrastructure but doesn't improve the debug
experience much on its own.

As an example of one thing we can do with the new mode, the second patch
ensures that the gimple IL has debug info for each is_gimple_reg variable
throughout the variable's lifetime.  This fixes a couple of the PRs in
the -Og meta-bug and from spot-testing seems to ensure that far fewer
values are optimised out.

Also, the new mode is mostly orthogonal to the optimisation level
(although it would in effect disable optimisations like loop
vectorisation, until we have a way of representing debug info for
vectorised loops).  The third patch therefore adds an -O1g option
that optimises more heavily than -Og but provides a better debug
experience than -O1.

I think -O2g would make sense too, and would be a viable option
for people who want to deploy relatively heavily optimised binaries
without compromising the debug experience too much.

Other possible follow-ons for the new mode include:

- Make sure that tangible debug stmts never read memory or take
  an address.  (This is so that addressability and vops depend
  only on non-debug insns.)

- Fall back on expanding real code if expand_debug_expr fails.

- Force debug insns to be simple enough for dwarf2out (e.g. no external
  or TLS symbols).  This could be done by having a validation step for
  debug insns, like we already do for normal insns.

- Prevent the removal of dead stores if it would lead to wrong debug info.
  (Maybe under control of an option?)

To get an idea of the runtime cost, I tried compiling tree-into-ssa.ii
at -O2 -g with various --enable-checking=yes builds of cc1plus:

                            time taken
cc1plus compiled with -O0:     100.00%   (baseline)
cc1plus compiled with old -Og:  30.94%
cc1plus compiled with new -Og:  31.82%
cc1plus compiled with -O1g:     28.22%
cc1plus compiled with -O1:      26.72%
cc1plus compiled with -O2:      25.15%

So there is a noticeable but small performance cost to the new mode.

To get an idea of the compile-time impact, I tried compiling
tree-into-ssa.ii at various optimisation levels, all using the
same --enable-checking=release bootstrap build:

                              time taken
tree-into-ssa.ii with -O0 -g:     100.0%  (baseline)
tree-into-ssa.ii with old -Og -g: 180.6%
tree-into-ssa.ii with new -Og -g: 198.2%
tree-into-ssa.ii with -O1g -g:    237.1%
tree-into-ssa.ii with -O1 -g:     211.8%
tree-into-ssa.ii with -O2 -g:     331.5%

So there's definitely a bit of a compile-time hit.  I haven't yet looked
at how easy it would be to fix.

What do you think?  Is it worth pursuing this further?

Of course, even if we do do this, it's still important that the debug
info for things like -O2 -g is as good as it can be.  I just think some
of the open bugs against -Og fundamentally can't be fixed properly while
-Og remains a cut-down version of -O1.

Thanks,
Richard

Comments

Richard Sandiford June 23, 2019, 1:53 p.m. UTC | #1
This patch tries to make sure that, when optimising at -Og, the gimple
IL has appropriate debug information for the whole of a variable's scope.
It does the same for parameters in functions that always return (but still
misses cases in functions that don't return -- a TODO for later).

The idea is to emit a debug stmt that uses each gimple register "var"
when it goes out of scope.  With the new "tangible" debug stmts, this
forces SSA renaming of "var" for its entire lifetime, even after the
last real use.  This in turn means that we should enter rtl in a state
where, at any given point in a variable's scope, there is a single
dominating debug bind insn for it.

A side-effect of this is that a variable that is in-scope but
uninitialised and unused will give "0" rather than "<optimized out>"
(where the zero is guaranteed by init-regs).  E.g. for functions like:

static inline void
get_conflict_and_start_profitable_regs (ira_allocno_t a, bool retry_p,
					HARD_REG_SET *conflict_regs,
					HARD_REG_SET *start_profitable_regs)
{
  int i, nwords;
  ira_object_t obj;

  nwords = ALLOCNO_NUM_OBJECTS (a);
  for (i = 0; i < nwords; i++)
    {
      obj = ALLOCNO_OBJECT (a, i);
      COPY_HARD_REG_SET (conflict_regs[i],
			 OBJECT_TOTAL_CONFLICT_HARD_REGS (obj));
    }
  if (retry_p)
    {
      COPY_HARD_REG_SET (*start_profitable_regs,
			 reg_class_contents[ALLOCNO_CLASS (a)]);
      AND_COMPL_HARD_REG_SET (*start_profitable_regs,
			      ira_prohibited_class_mode_regs
			      [ALLOCNO_CLASS (a)][ALLOCNO_MODE (a)]);
    }
  else
    COPY_HARD_REG_SET (*start_profitable_regs,
		       ALLOCNO_COLOR_DATA (a)->profitable_hard_regs);
}

it is still possible to print "obj" after the loop.  If nwords is <= 0
(which it never is in practice), "obj" would print as zero.

IMO this isn't wrong debug info, because the value that we print for
"obj" is the value that a printf at that location would print, even in
the uninitialised/zero case.  And this kind of for loop is very common:
the program in practice guarantees that the loop executes at least once,
and thus guarantees that "obj" has a useful value, but the compiler
doesn't know that.

The patch uses debug insns of the form:

    # DEBUG NULL => var

to keep "var" live.  Binding to null seemed better than binding to "var",
which is about to go out of scope and doesn't actually change value here.
Binding to null also means that we can delete the stmt if no longer has
any SSA uses (e.g. due to fwprop).  Most of the patch is just adding
support for null debug bind vars.

Although this should ensure we have good-quality debug info for gimple
registers when entering rtl, it doesn't guarantee that the value bound
in a debug bind insn remains live for as long as the debug bind is in effect.
That's a separate patch.

It's also possible that rtl cfg manipulations could break the original
property of having one dominating debug bind at each point.  That too
is a separate patch.  It probably makes sense to fix both of these
problems immediately before IRA.

The patch fixes the referenced PRs.  In particular, the backtrace for
PR78685 is now:

  #0  call_debugger (x=x@entry=3) at pr78685.c:6
  #1  ... in apply_lambda (fun=fun@entry=1, args=args@entry=2, count=count@entry=3) at pr78685.c:14
  #2  ... in main (argc=1, argv=0x<addr>) at pr78685.c:22

instead of:

  #0  call_debugger (x=3) at pr78685.c:6
  #1  ... in apply_lambda (fun=1, args=2, count=<optimized out>) at pr78685.c:14
  #2  ... in main (argc=<optimized out>, argv=<optimized out>) at pr78685.c:22


2019-06-23  Richard Sandiford  <richard.sandiford@arm.com>

gcc/
	PR debug/78685
	PR debug/88730
	* gimplify.c (gimplify_keep_decl_live): New function.
	(gimplify_bind_expr, gimplify_return_expr): Use it.
	* function.c (gimplify_parameters): Add debug stmts for parameters
	if flag_tangible_debug.
	* var-tracking.c (use_type): Drop debug binds with null variables.
	(delete_vta_debug_insn): Likewise.
	* gimple-low.c (lower_stmt): Allow debug binds too.
	* cfgexpand.c (expand_debug_locations): Handle null debug bind vars.
	(expand_gimple_basic_block): Likewise.
	* cfgrtl.c (duplicate_insn_chain): Likewise.
	* cse.c (insn_live_p): Likewise.
	* omp-simd-clone.c (ipa_simd_modify_function_body): Likewise.
	* print-rtl.c (rtx_writer::print_rtx, print_insn): Likewise.
	* tree-cfg.c (make_blocks_1, stmt_starts_bb_p): Handle tangible
	debug stmts like normal stmts.
	(make_blocks): Check for marker stmts rather than asserting for them.
	(gimple_duplicate_bb): Handle null debug bind vars.
	* tree-inline.c (maybe_move_debug_stmts_to_successors): Likewise.
	(copy_debug_stmt): Likewise.
	* tree-parloops.c (separate_decls_in_region_debug): Likewise.
	* tree-ssa-threadedge.c (propagate_threaded_block_debug_into):
	Likewise.
	* tree-ssa-dce.c (eliminate_unnecessary_stmts): Likewise.
	(mark_stmt_if_obviously_necessary): Likewise.  Allow debug binds
	with null variables to be deleted if they no longer have any
	SSA uses.

gcc/testsuite/
	* c-c++-common/guality/Og-keep-alive-1.c: New test.
	* c-c++-common/guality/Og-pr88730.c: Likewise.

Index: gcc/gimplify.c
===================================================================
--- gcc/gimplify.c	2019-06-18 09:35:54.925869432 +0100
+++ gcc/gimplify.c	2019-06-23 14:48:55.226975633 +0100
@@ -1300,6 +1300,28 @@ asan_poison_variables (hash_set<tree> *v
     }
 }
 
+/* If DECL is a parameter or variable that we're going to rewrite into
+   SSA form, emit a "# DEBUG NULL => DECL" statement to make it live at
+   the end of SEQ, which is the point at which DECL dies.  This is only
+   valid for flag_tangible_debug.
+
+   Doing this ensures that we have an SSA name definition of DECL
+   throughout its lifetime, which as a side-effect ensures that a debug
+   binding for T is always available throughout its lifetime.  We can
+   then try to optimize away the SSA names, but only if doing so doesn't
+   compromise the debug bindings.  */
+
+static void
+gimplify_keep_decl_live (gimple_seq *seq, tree decl)
+{
+  gcc_assert (MAY_HAVE_DEBUG_BIND_STMTS && flag_tangible_debug);
+  if (is_gimple_reg (decl) && target_for_debug_bind (decl))
+    {
+      gdebug *debug = gimple_build_debug_bind (NULL_TREE, decl, NULL);
+      gimplify_seq_add_stmt (seq, debug);
+    }
+}
+
 /* Gimplify a BIND_EXPR.  Just voidify and recurse.  */
 
 static enum gimplify_status
@@ -1434,6 +1456,11 @@ gimplify_bind_expr (tree *expr_p, gimple
 	  && !is_global_var (t)
 	  && DECL_CONTEXT (t) == current_function_decl)
 	{
+	  if (MAY_HAVE_DEBUG_BIND_INSNS && flag_tangible_debug)
+	    /* Ensure that the variable's value is available for debug
+	       purposes throughout its lifetime.  */
+	    gimplify_keep_decl_live (&cleanup, t);
+
 	  if (!DECL_HARD_REGISTER (t)
 	      && !TREE_THIS_VOLATILE (t)
 	      && !DECL_HAS_VALUE_EXPR_P (t)
@@ -1629,6 +1656,14 @@ gimplify_return_expr (tree stmt, gimple_
 
   gimplify_and_add (TREE_OPERAND (stmt, 0), pre_p);
 
+  if (MAY_HAVE_DEBUG_BIND_STMTS && flag_tangible_debug)
+    /* If we're going to rewrite a parameter into SSA form, ensure that
+       the parameter's value is available for debug purposes throughout
+       its lifetime.  */
+    for (tree decl = DECL_ARGUMENTS (current_function_decl);
+	 decl; decl = DECL_CHAIN (decl))
+      gimplify_keep_decl_live (pre_p, decl);
+
   maybe_add_early_return_predict_stmt (pre_p);
   ret = gimple_build_return (result);
   gimple_set_no_warning (ret, TREE_NO_WARNING (stmt));
Index: gcc/function.c
===================================================================
--- gcc/function.c	2019-06-21 15:37:49.635878948 +0100
+++ gcc/function.c	2019-06-23 14:48:55.222975666 +0100
@@ -3906,6 +3906,17 @@ gimplify_parameters (gimple_seq *cleanup
 	      DECL_HAS_VALUE_EXPR_P (parm) = 1;
 	    }
 	}
+      if (flag_tangible_debug
+	  && is_gimple_reg (parm)
+	  && target_for_debug_bind (parm))
+	{
+	  /* This will be rewritten during SSA renaming to
+	     "# DEBUG parm => parm_N(D)".  In functions that return,
+	     the return statements have partnering "# DEBUG NULL => parm_M",
+	     statements.  */
+	  gdebug *debug = gimple_build_debug_bind (parm, parm, NULL);
+	  gimple_seq_add_stmt_without_update (&stmts, debug);
+	}
     }
 
   fnargs.release ();
Index: gcc/var-tracking.c
===================================================================
--- gcc/var-tracking.c	2019-06-07 08:39:43.686343024 +0100
+++ gcc/var-tracking.c	2019-06-23 14:48:55.234975566 +0100
@@ -5508,7 +5508,10 @@ use_type (rtx loc, struct count_use_info
     {
       if (GET_CODE (loc) == VAR_LOCATION)
 	{
-	  if (track_expr_p (PAT_VAR_LOCATION_DECL (loc), false))
+	  /* The point of debug binds with no variable is to keep values
+	     live until this pass runs.  We can drop them here.  */
+	  if (PAT_VAR_LOCATION_DECL (loc)
+	      && track_expr_p (PAT_VAR_LOCATION_DECL (loc), false))
 	    {
 	      rtx ploc = PAT_VAR_LOCATION_LOC (loc);
 	      if (! VAR_LOC_UNKNOWN_P (ploc))
@@ -10315,7 +10318,8 @@ delete_vta_debug_insn (rtx_insn *insn)
     }
 
   tree decl = INSN_VAR_LOCATION_DECL (insn);
-  if (TREE_CODE (decl) == LABEL_DECL
+  if (decl
+      && TREE_CODE (decl) == LABEL_DECL
       && DECL_NAME (decl)
       && !DECL_RTL_SET_P (decl))
     {
Index: gcc/gimple-low.c
===================================================================
--- gcc/gimple-low.c	2019-06-18 09:35:54.121876078 +0100
+++ gcc/gimple-low.c	2019-06-23 14:48:55.222975666 +0100
@@ -309,18 +309,25 @@ lower_stmt (gimple_stmt_iterator *gsi, s
       break;
 
     case GIMPLE_DEBUG:
-      gcc_checking_assert (cfun->debug_nonbind_markers);
-      /* We can't possibly have debug bind stmts before lowering, we
-	 first emit them when entering SSA.  */
-      gcc_checking_assert (gimple_debug_nonbind_marker_p (stmt));
-      /* Propagate fallthruness.  */
-      /* If the function (e.g. from PCH) had debug stmts, but they're
-	 disabled for this compilation, remove them.  */
-      if (!MAY_HAVE_DEBUG_MARKER_STMTS)
-	gsi_remove (gsi, true);
-      else
-	gsi_next (gsi);
-      return;
+      {
+	bool keep_p;
+	if (gimple_debug_nonbind_marker_p (stmt))
+	  {
+	    gcc_assert (cfun->debug_nonbind_markers);
+	    keep_p = MAY_HAVE_DEBUG_MARKER_STMTS;
+	  }
+	else if (gimple_debug_bind_p (stmt)
+		 || gimple_debug_source_bind_p (stmt))
+	  keep_p = MAY_HAVE_DEBUG_BIND_STMTS;
+	else
+	  gcc_unreachable ();
+
+	if (keep_p)
+	  gsi_next (gsi);
+	else
+	  gsi_remove (gsi, true);
+	return;
+      }
 
     case GIMPLE_NOP:
     case GIMPLE_ASM:
Index: gcc/cfgexpand.c
===================================================================
--- gcc/cfgexpand.c	2019-06-23 14:48:44.491065092 +0100
+++ gcc/cfgexpand.c	2019-06-23 14:48:55.222975666 +0100
@@ -5455,6 +5455,7 @@ expand_debug_locations (void)
 	       being defined in this DEBUG_INSN.  */
 	    else if (deep_ter_debug_map && TREE_CODE (value) == SSA_NAME)
 	      {
+		gcc_assert (INSN_VAR_LOCATION_DECL (insn));
 		tree *slot = deep_ter_debug_map->get (value);
 		if (slot)
 		  {
@@ -5796,19 +5797,22 @@ expand_gimple_basic_block (basic_block b
 		    {
 		      var = gimple_debug_bind_get_var (stmt);
 
-		      if (TREE_CODE (var) != DEBUG_EXPR_DECL
+		      if (var
+			  && TREE_CODE (var) != DEBUG_EXPR_DECL
 			  && TREE_CODE (var) != LABEL_DECL
 			  && !target_for_debug_bind (var))
 			goto delink_debug_stmt;
 
-		      if (DECL_P (var))
+		      if (gimple_debug_bind_has_value_p (stmt))
+			value = gimple_debug_bind_get_value (stmt);
+
+		      if (!var)
+			mode = TYPE_MODE (TREE_TYPE (value));
+		      else if (DECL_P (var))
 			mode = DECL_MODE (var);
 		      else
 			mode = TYPE_MODE (TREE_TYPE (var));
 
-		      if (gimple_debug_bind_has_value_p (stmt))
-			value = gimple_debug_bind_get_value (stmt);
-
 		      val = gen_rtx_VAR_LOCATION
 			(mode, var, (rtx)value, VAR_INIT_STATUS_INITIALIZED);
 		    }
Index: gcc/cfgrtl.c
===================================================================
--- gcc/cfgrtl.c	2019-06-07 08:39:43.690343015 +0100
+++ gcc/cfgrtl.c	2019-06-23 14:48:55.222975666 +0100
@@ -4185,6 +4185,7 @@ duplicate_insn_chain (rtx_insn *from, rt
 	case DEBUG_INSN:
 	  /* Don't duplicate label debug insns.  */
 	  if (DEBUG_BIND_INSN_P (insn)
+	      && INSN_VAR_LOCATION_DECL (insn)
 	      && TREE_CODE (INSN_VAR_LOCATION_DECL (insn)) == LABEL_DECL)
 	    break;
 	  /* FALLTHRU */
Index: gcc/cse.c
===================================================================
--- gcc/cse.c	2019-06-23 14:48:44.499065025 +0100
+++ gcc/cse.c	2019-06-23 14:48:55.222975666 +0100
@@ -7068,6 +7068,8 @@ insn_live_p (rtx_insn *insn, int *counts
 	   we want to keep the earlier debug insn.  */
 	else if (DEBUG_MARKER_INSN_P (next))
 	  return true;
+	else if (!INSN_VAR_LOCATION_DECL (insn))
+	  return true;
 	else if (INSN_VAR_LOCATION_DECL (insn) == INSN_VAR_LOCATION_DECL (next))
 	  return false;
 
Index: gcc/omp-simd-clone.c
===================================================================
--- gcc/omp-simd-clone.c	2019-03-18 12:24:58.391436179 +0000
+++ gcc/omp-simd-clone.c	2019-06-23 14:48:55.226975633 +0100
@@ -1073,6 +1073,7 @@ ipa_simd_modify_function_body (struct cg
 		 vectorized loops doesn't work too well, so don't bother for
 		 now.  */
 	      if ((gimple_debug_bind_p (stmt)
+		   && gimple_debug_bind_get_var (stmt)
 		   && !DECL_P (gimple_debug_bind_get_var (stmt)))
 		  || (gimple_debug_source_bind_p (stmt)
 		      && !DECL_P (gimple_debug_source_bind_get_var (stmt))))
Index: gcc/print-rtl.c
===================================================================
--- gcc/print-rtl.c	2019-06-07 08:39:59.000000000 +0100
+++ gcc/print-rtl.c	2019-06-23 14:48:55.226975633 +0100
@@ -861,7 +861,9 @@ rtx_writer::print_rtx (const_rtx in_rtx)
 #ifndef GENERATOR_FILE
       if (GET_CODE (in_rtx) == VAR_LOCATION)
 	{
-	  if (TREE_CODE (PAT_VAR_LOCATION_DECL (in_rtx)) == STRING_CST)
+	  if (!PAT_VAR_LOCATION_DECL (in_rtx))
+	    fputs (" <none>", m_outfile);
+	  else if (TREE_CODE (PAT_VAR_LOCATION_DECL (in_rtx)) == STRING_CST)
 	    fputs (" <debug string placeholder>", m_outfile);
 	  else
 	    print_mem_expr (m_outfile, PAT_VAR_LOCATION_DECL (in_rtx));
@@ -1914,7 +1916,8 @@ print_insn (pretty_printer *pp, const rt
 	const char *name = "?";
 	char idbuf[32];
 
-	if (DECL_P (INSN_VAR_LOCATION_DECL (x)))
+	if (INSN_VAR_LOCATION_DECL (x)
+	    && DECL_P (INSN_VAR_LOCATION_DECL (x)))
 	  {
 	    tree id = DECL_NAME (INSN_VAR_LOCATION_DECL (x));
 	    if (id)
Index: gcc/tree-cfg.c
===================================================================
--- gcc/tree-cfg.c	2019-06-23 14:48:44.511064925 +0100
+++ gcc/tree-cfg.c	2019-06-23 14:48:55.230975600 +0100
@@ -524,15 +524,15 @@ make_blocks_1 (gimple_seq seq, basic_blo
 
   while (!gsi_end_p (i))
     {
-      /* PREV_STMT should only be set to a debug stmt if the debug
-	 stmt is before nondebug stmts.  Once stmt reaches a nondebug
+      /* PREV_STMT should only be set to a shadow stmt if the shadow
+	 stmt is before tangible stmts.  Once stmt reaches a tangible
 	 nonlabel, prev_stmt will be set to it, so that
 	 stmt_starts_bb_p will know to start a new block if a label is
-	 found.  However, if stmt was a label after debug stmts only,
-	 keep the label in prev_stmt even if we find further debug
+	 found.  However, if stmt was a label after shadow stmts only,
+	 keep the label in prev_stmt even if we find further shadow
 	 stmts, for there may be other labels after them, and they
 	 should land in the same block.  */
-      if (!prev_stmt || !stmt || !is_gimple_debug (stmt))
+      if (!prev_stmt || !stmt || tangible_stmt_p (stmt))
 	prev_stmt = stmt;
       stmt = gsi_stmt (i);
 
@@ -633,9 +633,8 @@ make_blocks (gimple_seq seq)
 	    continue;
 
 	  /* Move the debug stmt at I after LABEL.  */
-	  if (is_gimple_debug (stmt))
+	  if (gimple_debug_nonbind_marker_p (stmt))
 	    {
-	      gcc_assert (gimple_debug_nonbind_marker_p (stmt));
 	      /* As STMT is removed, I advances to the stmt after
 		 STMT, so the gsi_prev in the for "increment"
 		 expression gets us to the stmt we're to visit after
@@ -2705,11 +2704,11 @@ stmt_starts_bb_p (gimple *stmt, gimple *
   if (stmt == NULL)
     return false;
 
-  /* PREV_STMT is only set to a debug stmt if the debug stmt is before
-     any nondebug stmts in the block.  We don't want to start another
-     block in this case: the debug stmt will already have started the
-     one STMT would start if we weren't outputting debug stmts.  */
-  if (prev_stmt && is_gimple_debug (prev_stmt))
+  /* PREV_STMT is only set to a shadow stmt if the shadow stmt is before
+     any tangible stmts in the block.  We don't want to start another
+     block in this case: the shadow stmt will already have started the
+     one STMT would start if we weren't outputting shadow stmts.  */
+  if (prev_stmt && !tangible_stmt_p (prev_stmt))
     return false;
 
   /* Labels start a new basic block only if the preceding statement
@@ -6196,6 +6195,7 @@ gimple_duplicate_bb (basic_block bb, cop
 
       /* Don't duplicate label debug stmts.  */
       if (gimple_debug_bind_p (stmt)
+	  && gimple_debug_bind_get_var (stmt)
 	  && TREE_CODE (gimple_debug_bind_get_var (stmt))
 	     == LABEL_DECL)
 	continue;
Index: gcc/tree-inline.c
===================================================================
--- gcc/tree-inline.c	2019-06-23 14:48:44.511064925 +0100
+++ gcc/tree-inline.c	2019-06-23 14:48:55.230975600 +0100
@@ -2760,8 +2760,10 @@ maybe_move_debug_stmts_to_successors (co
 		  value = unshare_expr (value);
 		  new_stmt = gimple_build_debug_bind (var, value, stmt);
 		}
-	      else
+	      else if (var)
 		new_stmt = gimple_build_debug_bind (var, NULL_TREE, NULL);
+	      else
+		continue;
 	    }
 	  else if (gimple_debug_source_bind_p (stmt))
 	    {
@@ -3060,16 +3062,19 @@ copy_debug_stmt (gdebug *stmt, copy_body
   else
     gcc_unreachable ();
 
-  if (TREE_CODE (t) == PARM_DECL && id->debug_map
-      && (n = id->debug_map->get (t)))
+  if (t)
     {
-      gcc_assert (VAR_P (*n));
-      t = *n;
+      if (TREE_CODE (t) == PARM_DECL && id->debug_map
+	  && (n = id->debug_map->get (t)))
+	{
+	  gcc_assert (VAR_P (*n));
+	  t = *n;
+	}
+      else if (VAR_P (t) && !is_global_var (t) && !id->decl_map->get (t))
+	/* T is a non-localized variable.  */;
+      else
+	walk_tree (&t, remap_gimple_op_r, &wi, NULL);
     }
-  else if (VAR_P (t) && !is_global_var (t) && !id->decl_map->get (t))
-    /* T is a non-localized variable.  */;
-  else
-    walk_tree (&t, remap_gimple_op_r, &wi, NULL);
 
   if (gimple_debug_bind_p (stmt))
     {
Index: gcc/tree-parloops.c
===================================================================
--- gcc/tree-parloops.c	2019-06-18 09:35:54.197875448 +0100
+++ gcc/tree-parloops.c	2019-06-23 14:48:55.230975600 +0100
@@ -950,23 +950,27 @@ separate_decls_in_region_debug (gimple *
   name_to_copy_elt **slot;
   int_tree_map *dslot;
 
+  /* FIXME */
   if (gimple_debug_bind_p (stmt))
     var = gimple_debug_bind_get_var (stmt);
   else if (gimple_debug_source_bind_p (stmt))
     var = gimple_debug_source_bind_get_var (stmt);
   else
     return true;
-  if (TREE_CODE (var) == DEBUG_EXPR_DECL || TREE_CODE (var) == LABEL_DECL)
-    return true;
-  gcc_assert (DECL_P (var) && SSA_VAR_P (var));
-  ielt.uid = DECL_UID (var);
-  dslot = decl_copies->find_slot_with_hash (ielt, ielt.uid, NO_INSERT);
-  if (!dslot)
-    return true;
-  if (gimple_debug_bind_p (stmt))
-    gimple_debug_bind_set_var (stmt, dslot->to);
-  else if (gimple_debug_source_bind_p (stmt))
-    gimple_debug_source_bind_set_var (stmt, dslot->to);
+  if (var)
+    {
+      if (TREE_CODE (var) == DEBUG_EXPR_DECL || TREE_CODE (var) == LABEL_DECL)
+	return true;
+      gcc_assert (DECL_P (var) && SSA_VAR_P (var));
+      ielt.uid = DECL_UID (var);
+      dslot = decl_copies->find_slot_with_hash (ielt, ielt.uid, NO_INSERT);
+      if (!dslot)
+	return true;
+      if (gimple_debug_bind_p (stmt))
+	gimple_debug_bind_set_var (stmt, dslot->to);
+      else if (gimple_debug_source_bind_p (stmt))
+	gimple_debug_source_bind_set_var (stmt, dslot->to);
+    }
 
   FOR_EACH_PHI_OR_STMT_USE (use, stmt, oi, SSA_OP_USE)
   {
Index: gcc/tree-ssa-threadedge.c
===================================================================
--- gcc/tree-ssa-threadedge.c	2019-06-18 09:35:55.505864640 +0100
+++ gcc/tree-ssa-threadedge.c	2019-06-23 14:48:55.234975566 +0100
@@ -803,10 +803,13 @@ propagate_threaded_block_debug_into (bas
       else
 	gcc_unreachable ();
 
-      if (vars)
-	vars->add (var);
-      else
-	fewvars.quick_push (var);
+      if (var)
+	{
+	  if (vars)
+	    vars->add (var);
+	  else
+	    fewvars.quick_push (var);
+	}
     }
 
   basic_block bb = dest;
@@ -844,9 +847,9 @@ propagate_threaded_block_debug_into (bas
 	     not actually running the code that performed the binds at
 	     this point, we're just adding binds so that they survive
 	     the new confluence, so markers should not be copied.  */
-	  if (vars && vars->add (var))
+	  if (var && vars && vars->add (var))
 	    continue;
-	  else if (!vars)
+	  if (var && !vars)
 	    {
 	      int i = fewvars.length ();
 	      while (i--)
Index: gcc/tree-ssa-dce.c
===================================================================
--- gcc/tree-ssa-dce.c	2019-06-23 14:48:44.515064892 +0100
+++ gcc/tree-ssa-dce.c	2019-06-23 14:48:55.230975600 +0100
@@ -282,15 +282,26 @@ mark_stmt_if_obviously_necessary (gimple
       }
 
     case GIMPLE_DEBUG:
-      /* Debug temps without a value are not useful.  ??? If we could
-	 easily locate the debug temp bind stmt for a use thereof,
-	 would could refrain from marking all debug temps here, and
-	 mark them only if they're used.  */
-      if (gimple_debug_nonbind_marker_p (stmt)
-	  || !gimple_debug_bind_p (stmt)
-	  || gimple_debug_bind_has_value_p (stmt)
-	  || TREE_CODE (gimple_debug_bind_get_var (stmt)) != DEBUG_EXPR_DECL)
-	mark_stmt_necessary (stmt, tangible_stmt_p (stmt));
+      if (gimple_debug_bind_p (stmt))
+	{
+	  tree var = gimple_debug_bind_get_var (stmt);
+
+	  /* Debug temps without a value are not useful.  ??? If we could
+	     easily locate the debug temp bind stmt for a use thereof,
+	     would could refrain from marking all debug temps here, and
+	     mark them only if they're used.  */
+	  if (var
+	      && TREE_CODE (var) == DEBUG_EXPR_DECL
+	      && !gimple_debug_bind_has_value_p (stmt))
+	    return;
+
+	  /* "# DEBUG NULL => foo" only exists to keep foo live till this
+	     point (and initially to force SSA renaming).  Drop it if it
+	     no longer has any effect.  */
+	  if (!var && !gimple_use_ops (stmt))
+	    return;
+	}
+      mark_stmt_necessary (stmt, tangible_stmt_p (stmt));
       return;
 
     case GIMPLE_GOTO:
@@ -1399,7 +1410,8 @@ eliminate_unnecessary_stmts (void)
 	         non-DEBUG_EXPR_DECL variable in a series of
 		 debug-bind stmts.  */
 	      tree var = gimple_debug_bind_get_var (stmt);
-	      if (TREE_CODE (var) != DEBUG_EXPR_DECL
+	      if (var
+		  && TREE_CODE (var) != DEBUG_EXPR_DECL
 		  && !bitmap_set_bit (debug_seen, DECL_UID (var)))
 		remove_dead_stmt (&gsi, bb, to_remove_edges);
 	      continue;
Index: gcc/testsuite/c-c++-common/guality/Og-keep-alive-1.c
===================================================================
--- /dev/null	2019-06-14 15:59:19.298479944 +0100
+++ gcc/testsuite/c-c++-common/guality/Og-keep-alive-1.c	2019-06-23 14:48:55.226975633 +0100
@@ -0,0 +1,30 @@
+/* { dg-do run } */
+/* { dg-options "-g" } */
+
+#include "../../gcc.dg/nop.h"
+
+int __attribute__ ((noipa))
+f (int a)
+{
+  int b, c;
+  if (a < 10)
+    {
+      b = 3;
+      c = b + a;
+    }
+  else
+    {
+      b = 2;
+      c = b - a;
+    }
+  asm volatile (NOP);	/* { dg-final { gdb-test . "a" "10" } } */
+			/* { dg-final { gdb-test .-1 "b" "2" } } */
+			/* { dg-final { gdb-test .-2 "c" "-8" } } */
+  return c;
+}
+
+int
+main (void)
+{
+  f (10);
+}
Index: gcc/testsuite/c-c++-common/guality/Og-pr88730.c
===================================================================
--- /dev/null	2019-06-14 15:59:19.298479944 +0100
+++ gcc/testsuite/c-c++-common/guality/Og-pr88730.c	2019-06-23 14:48:55.226975633 +0100
@@ -0,0 +1,16 @@
+/* { dg-do run } */
+/* { dg-options "-g" } */
+
+#include "../../gcc.dg/nop.h"
+
+int a;
+int main() {
+  int b, j;
+  b = 0;
+  for (; b < 1; b++) {
+    j = 0;
+    for (; j < 5; j++)
+      ;
+  }
+  asm volatile (NOP); /* { dg-final { gdb-test . "j" "5" } } */
+}
Richard Biener June 24, 2019, 12:23 p.m. UTC | #2
On Sun, Jun 23, 2019 at 3:51 PM Richard Sandiford
<richard.sandiford@arm.com> wrote:
>
> -Og is documented as:
>
>   @option{-Og} should be the optimization
>   level of choice for the standard edit-compile-debug cycle, offering
>   a reasonable level of optimization while maintaining fast compilation
>   and a good debugging experience.  It is a better choice than @option{-O0}
>   for producing debuggable code because some compiler passes
>   that collect debug information are disabled at @option{-O0}.
>
> One of the things hampering that is that, as for the "normal" -O* flags,
> the code produced by -Og -g must be the same as the code produced
> without any debug IL at all.  There are many cases in which that makes
> it impossible to stop useful values from being optimised out of the
> debug info, either because the value simply isn't available at runtime
> at the point that the debugger needs it, or because of limitations in
> the debug representation.  (E.g. pointers to external data are dropped
> from debug info because the relocations wouldn't be honoured.)
>
> I think it would be better to flip things around so that the debug IL
> is always present when optimising at -Og, and then allow the debug IL
> to influence codegen at -Og.  This still honours the -fcompare-debug
> principle, and the compile speed of -Og without -g doesn't seem very
> important.

Just to note it's also a common user misconception that -Og enables
debug-info ...

> This series therefore adds a mode in which debug stmts and debug insns
> are present even without -g and are explicitly allowed to affect codegen.
> In particular, when this mode is active:
>
> - uses in debug binds become first-class uses, acting like uses in
>   executable code
>
> - the use of DEBUG_EXPR_DECLs is banned.  If we want to refer to
>   a temporary value in debug binds, we need to calculate the value
>   with executable code instead
>
> This needs a new term to distinguish stmts/insns that affect codegen
> from those that don't.  I couldn't think of one that I was really
> happy with, but possibilities included:
>
>     tangible/shadow
>     manifest/hidden
>     foreground/background
>     reactive/inert
>     active/dormant   (but "active insn" already means something else)
>     peppy/sullen
>
> The series uses tangible/shadow.  There's a new global flag_tangible_debug
> that controls whether debug insns are "tangible" insns (for the new mode)
> or "shadow" insns (for normal optimisation).  -Og enables the new mode
> while the other optimisation levels leave it off.  (Despite the name,
> the new variable is just an internal flag, there's no -ftangible-debug
> option.)
>
> The first patch adds the infrastructure but doesn't improve the debug
> experience much on its own.
>
> As an example of one thing we can do with the new mode, the second patch
> ensures that the gimple IL has debug info for each is_gimple_reg variable
> throughout the variable's lifetime.  This fixes a couple of the PRs in
> the -Og meta-bug and from spot-testing seems to ensure that far fewer
> values are optimised out.
>
> Also, the new mode is mostly orthogonal to the optimisation level
> (although it would in effect disable optimisations like loop
> vectorisation, until we have a way of representing debug info for
> vectorised loops).  The third patch therefore adds an -O1g option
> that optimises more heavily than -Og but provides a better debug
> experience than -O1.
>
> I think -O2g would make sense too, and would be a viable option
> for people who want to deploy relatively heavily optimised binaries
> without compromising the debug experience too much.
>
> Other possible follow-ons for the new mode include:
>
> - Make sure that tangible debug stmts never read memory or take
>   an address.  (This is so that addressability and vops depend
>   only on non-debug insns.)
>
> - Fall back on expanding real code if expand_debug_expr fails.
>
> - Force debug insns to be simple enough for dwarf2out (e.g. no external
>   or TLS symbols).  This could be done by having a validation step for
>   debug insns, like we already do for normal insns.
>
> - Prevent the removal of dead stores if it would lead to wrong debug info.
>   (Maybe under control of an option?)
>
> To get an idea of the runtime cost, I tried compiling tree-into-ssa.ii
> at -O2 -g with various --enable-checking=yes builds of cc1plus:
>
>                             time taken
> cc1plus compiled with -O0:     100.00%   (baseline)
> cc1plus compiled with old -Og:  30.94%

is this -Og -g or just -Og?  I suppose all numbers are with -g enabled?

> cc1plus compiled with new -Og:  31.82%
> cc1plus compiled with -O1g:     28.22%
> cc1plus compiled with -O1:      26.72%
> cc1plus compiled with -O2:      25.15%
>
> So there is a noticeable but small performance cost to the new mode.
>
> To get an idea of the compile-time impact, I tried compiling
> tree-into-ssa.ii at various optimisation levels, all using the
> same --enable-checking=release bootstrap build:
>
>                               time taken
> tree-into-ssa.ii with -O0 -g:     100.0%  (baseline)
> tree-into-ssa.ii with old -Og -g: 180.6%
> tree-into-ssa.ii with new -Og -g: 198.2%
> tree-into-ssa.ii with -O1g -g:    237.1%
> tree-into-ssa.ii with -O1 -g:     211.8%
> tree-into-ssa.ii with -O2 -g:     331.5%
>
> So there's definitely a bit of a compile-time hit.  I haven't yet looked
> at how easy it would be to fix.
>
> What do you think?  Is it worth pursuing this further?
>
> Of course, even if we do do this, it's still important that the debug
> info for things like -O2 -g is as good as it can be.  I just think some
> of the open bugs against -Og fundamentally can't be fixed properly while
> -Og remains a cut-down version of -O1.

Thanks for doing this experiment.  I'm not sure extra complication
is welcome (but didn't look into the patch(es) yet...).

My original motivation for -Og -g was to provide -O0 -g compile-time
at -O1 runtime with better debuggability than -O0 -g (mainly because
that doesn't enable var-tracking).  Of course that failed completely ;)

I somewhat like the idea that -ONg forces debug info generation
(but not necessarily output without -g) and thus we can take debug-stmts
into account.  I suppose you're simply keying on optimize_debug here.

Richard.

> Thanks,
> Richard
Richard Sandiford June 24, 2019, 12:50 p.m. UTC | #3
Richard Biener <richard.guenther@gmail.com> writes:
> On Sun, Jun 23, 2019 at 3:51 PM Richard Sandiford
> <richard.sandiford@arm.com> wrote:
>> To get an idea of the runtime cost, I tried compiling tree-into-ssa.ii
>> at -O2 -g with various --enable-checking=yes builds of cc1plus:
>>
>>                             time taken
>> cc1plus compiled with -O0:     100.00%   (baseline)
>> cc1plus compiled with old -Og:  30.94%
>
> is this -Og -g or just -Og?  I suppose all numbers are with -g enabled?

Yeah, all with -g enabled.  But these were the options used to build
cc1plus, whereas the timings are for running it.  So -g shouldn't make
a difference.

>> cc1plus compiled with new -Og:  31.82%
>> cc1plus compiled with -O1g:     28.22%
>> cc1plus compiled with -O1:      26.72%
>> cc1plus compiled with -O2:      25.15%
>>
>> So there is a noticeable but small performance cost to the new mode.
>>
>> To get an idea of the compile-time impact, I tried compiling
>> tree-into-ssa.ii at various optimisation levels, all using the
>> same --enable-checking=release bootstrap build:
>>
>>                               time taken
>> tree-into-ssa.ii with -O0 -g:     100.0%  (baseline)
>> tree-into-ssa.ii with old -Og -g: 180.6%
>> tree-into-ssa.ii with new -Og -g: 198.2%
>> tree-into-ssa.ii with -O1g -g:    237.1%
>> tree-into-ssa.ii with -O1 -g:     211.8%
>> tree-into-ssa.ii with -O2 -g:     331.5%
>>
>> So there's definitely a bit of a compile-time hit.  I haven't yet looked
>> at how easy it would be to fix.
>>
>> What do you think?  Is it worth pursuing this further?
>>
>> Of course, even if we do do this, it's still important that the debug
>> info for things like -O2 -g is as good as it can be.  I just think some
>> of the open bugs against -Og fundamentally can't be fixed properly while
>> -Og remains a cut-down version of -O1.
>
> Thanks for doing this experiment.  I'm not sure extra complication
> is welcome (but didn't look into the patch(es) yet...).

OK.  The complication is in having (to think about) a new pair of
conditions that might be relevant: tangible/shadow instead of
nondebug/debug.  But almost all the changes involve replacing the
latter with the former rather than adding new checks.  Hopefully once
the whole sourcebase has been converted, the check would become second
nature.  (But patch 1 doesn't convert the whole sourcebase.)

> My original motivation for -Og -g was to provide -O0 -g compile-time
> at -O1 runtime with better debuggability than -O0 -g (mainly because
> that doesn't enable var-tracking).  Of course that failed completely ;)

We fall short on the compile-time side, but being able to get a reasonable
debugging experience with code that runs 3 times faster than -O0 is still
a pretty nice feature. :-)

> I somewhat like the idea that -ONg forces debug info generation
> (but not necessarily output without -g) and thus we can take debug-stmts
> into account.  I suppose you're simply keying on optimize_debug here.

I wondered about doing it that way, but in the end added a new internal
variable instead (flag_tangible_debug, but not backed by a real -f option).
optimize_debug mostly disables optimisations, and whether we want to do
that for -ONg felt like a separate question from whether we should take
debug stmts into account.

Thanks,
Richard
Segher Boessenkool June 24, 2019, 2:14 p.m. UTC | #4
Hi!

What does -O1g do with OPT_LEVELS_1_PLUS_NOT_DEBUG, is it enabled or
not there?  Maybe that name needs to change, with your patches?  It is
currently documented as

    /* -O1 (and not -Og) optimizations.  */


Segher
Richard Sandiford June 24, 2019, 2:22 p.m. UTC | #5
Segher Boessenkool <segher@kernel.crashing.org> writes:
> Hi!
>
> What does -O1g do with OPT_LEVELS_1_PLUS_NOT_DEBUG, is it enabled or
> not there?  Maybe that name needs to change, with your patches?  It is
> currently documented as
>
>     /* -O1 (and not -Og) optimizations.  */

Yeah, comment should change to be:

    /* -O1 and -O1g (but not -Og) optimizations.  */

Richard
Jakub Jelinek June 24, 2019, 2:28 p.m. UTC | #6
On Sun, Jun 23, 2019 at 02:51:06PM +0100, Richard Sandiford wrote:
> What do you think?  Is it worth pursuing this further?

Wouldn't it be more useful to just force all automatic variables to be
used at the end of their corresponding scope?  That is IMHO the main issue
with -Og debugging, VTA is a best effort, if we can express a variable with
some expression, nice, but if there is no expression nor memory nor register
that holds the value, we are out of luck.  Could be some magic stmt like
gimple_clobber or ifn or something similar, which would make sure that at
least until expansion to RTL we force those vars to be live in either a
register or memory.
I'm afraid having different modes, one in which debug stmts can't and one
where they can affect code generation might be a maintainance nightmare.

	Jakub
Segher Boessenkool June 24, 2019, 2:36 p.m. UTC | #7
On Mon, Jun 24, 2019 at 04:28:34PM +0200, Jakub Jelinek wrote:
> On Sun, Jun 23, 2019 at 02:51:06PM +0100, Richard Sandiford wrote:
> > What do you think?  Is it worth pursuing this further?
> 
> Wouldn't it be more useful to just force all automatic variables to be
> used at the end of their corresponding scope?  That is IMHO the main issue
> with -Og debugging, VTA is a best effort, if we can express a variable with
> some expression, nice, but if there is no expression nor memory nor register
> that holds the value, we are out of luck.  Could be some magic stmt like
> gimple_clobber or ifn or something similar, which would make sure that at
> least until expansion to RTL we force those vars to be live in either a
> register or memory.
> I'm afraid having different modes, one in which debug stmts can't and one
> where they can affect code generation might be a maintainance nightmare.

This is pretty much exactly what USE in RTL is?  Maybe use a similar name
in Gimple?


Segher
Richard Sandiford June 24, 2019, 4:10 p.m. UTC | #8
Segher Boessenkool <segher@kernel.crashing.org> writes:
> On Mon, Jun 24, 2019 at 04:28:34PM +0200, Jakub Jelinek wrote:
>> On Sun, Jun 23, 2019 at 02:51:06PM +0100, Richard Sandiford wrote:
>> > What do you think?  Is it worth pursuing this further?
>> 
>> Wouldn't it be more useful to just force all automatic variables to be
>> used at the end of their corresponding scope?

This is what patch 2 does FWIW.

>> That is IMHO the main issue
>> with -Og debugging, VTA is a best effort, if we can express a variable with
>> some expression, nice, but if there is no expression nor memory nor register
>> that holds the value, we are out of luck.  Could be some magic stmt like
>> gimple_clobber or ifn or something similar, which would make sure that at
>> least until expansion to RTL we force those vars to be live in either a
>> register or memory.
>> I'm afraid having different modes, one in which debug stmts can't and one
>> where they can affect code generation might be a maintainance nightmare.

Very few places need to check for the mode explicitly.  Most code just
uses a new test instead of (rather than on top of) is_gimple_debug/
NONDEBUG_INSN_P.

The vast majority of the existing tests for is_gimple_debug aren't
interested in the debug contents of a debug stmt.  They're just
interested in whether the stmt is allowed to affect codegen.  With this
series that becomes a new predicate in its own right.  We then only use
is_gimple_debug if we want to process the contents of a debug insn in a
particular way, just like we only test for gimple calls or assignments
when we're interested in their contents.

Old habits die hard of course, so there'd definitely be a period while
people instinctively use is_gimple_debug/NONDEBUG_INSN_P instead of
the new test.

> This is pretty much exactly what USE in RTL is?  Maybe use a similar name
> in Gimple?

Yeah, I wondered about doing it that way, but one problem with USE is
that it isn't clear why the USE is there or what the ordering constraints
on it are.  Here we want the same rules as for normal debug stmts
(as modified by patch 1).

Also, if the final value ends up being constant, it's fine to propagate
that value into the debug stmt that marks the end of the scope.
I'm not sure it's valid to propagate constants into a use.

But the rtl side of patch 2 (using debug insns instead of USEs)
is only a small change.  The main part of it is on the gimple side.
Unless we prevent all user values from becoming gimple registers
and being rewritten into SSA, we'd still need something like the gimple
side of patch 2 to ensure that there's sufficient tracking of the variable.

Thanks,
Richard
Jonathan Wakely June 26, 2019, 8:38 a.m. UTC | #9
On 23/06/19 14:51 +0100, Richard Sandiford wrote:
>Also, the new mode is mostly orthogonal to the optimisation level
>(although it would in effect disable optimisations like loop
>vectorisation, until we have a way of representing debug info for
>vectorised loops).  The third patch therefore adds an -O1g option
>that optimises more heavily than -Og but provides a better debug
>experience than -O1.

I think it would be confusing to have -O and -O1 mean the same, but
-Og and -O1g mean something different.

Maybe another name could avoid that, e.g. appending +g to signify the
new modes, so -O+g and -O1+g would mean the same thing.

>I think -O2g would make sense too, and would be a viable option
>for people who want to deploy relatively heavily optimised binaries
>without compromising the debug experience too much.

Which would be -O2+g using the naming scheme above.

If the mode is orthogonal to optimisation level I think this is
clearer, because you can have +g appended to any level, -Os+g, maybe
even -Og+g ;-)