[debug-early] C++ clones and limbo DIEs

Message ID 54D4EEDC.1070307@redhat.com
State New

Commit Message

Aldy Hernandez Feb. 6, 2015, 4:42 p.m. UTC
>> +    && DECL_CONTEXT (snode->decl)
>> +    && TREE_CODE (DECL_CONTEXT (snode->decl)) != FUNCTION_DECL)
>
> I think this should be !decl_function_context (snode->decl), in case
> there's a class or BLOCK between the symbol and its enclosing function.

Done, also for the iteration through reachable functions.
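
To illustrate why the one-level DECL_CONTEXT test can miss cases, here is a minimal sketch in plain C++. It uses a toy Node type with a context pointer; Kind, Node, parent_is_function, enclosing_function and check_local_class_member are hypothetical names for this sketch, not GCC's tree API, and enclosing_function only approximates what decl_function_context does.

```cpp
#include <cassert>

// Toy stand-ins for GCC tree nodes; none of these types exist in GCC.
enum class Kind { TranslationUnit, Function, Class, Block, Variable };

struct Node {
  Kind kind;
  const Node *context;  // plays the role of DECL_CONTEXT / TYPE_CONTEXT
};

// One-level check, like testing TREE_CODE (DECL_CONTEXT (decl)):
// it only sees the immediate parent.
bool parent_is_function(const Node &n) {
  return n.context && n.context->kind == Kind::Function;
}

// Walks the whole context chain, in the spirit of decl_function_context():
// returns the innermost enclosing function, or nullptr at file scope.
const Node *enclosing_function(const Node &n) {
  for (const Node *c = n.context; c; c = c->context)
    if (c->kind == Kind::Function)
      return c;
  return nullptr;
}

// A member of a class that is itself local to a function: the one-level
// check misses the function, the chain walk finds it.
void check_local_class_member() {
  Node tu{Kind::TranslationUnit, nullptr};
  Node fn{Kind::Function, &tu};
  Node local_class{Kind::Class, &fn};
  Node member{Kind::Variable, &local_class};
  assert(!parent_is_function(member));
  assert(enclosing_function(member) == &fn);
}
```

In this toy model, a symbol counts as "global" for early debug purposes only when enclosing_function returns nullptr, which is the condition the revised patch expresses as !decl_function_context (snode->decl).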

>
>>  dwarf2out_type_decl (tree decl, int local)
>> +  /* ?? Technically, we shouldn't need this hook at all, as all
>> +     symbols (and by consequence their types) will be outputed from
>> +     finalize_compilation_unit.
>
> Note that we also want to emit debug info about some types that are not
> referenced by symbols, such as when a type is used in a cast.

Fair enough.  I've removed the comment.

>> +/* Perform any cleanups needed after the early debug generation pass
>> +   has run.  */
>> +
>> +static void
>> +dwarf2out_early_finish (void)
>
> Since this is also called from dwarf2out_finish, let's call it something
> more descriptive, say, flush_limbo_dies?

I was actually thinking of using dwarf2out_early_finish() to mop things 
up as we generate early (or stream out) other auxiliary tables 
(pubname_table, pubtype_table, file_table, etc).  More details on that 
later.  If so, can I leave it as is?

How is this version?  No regressions on guality.  Target libraries build 
fine.

Aldy

Comments

Jason Merrill Feb. 6, 2015, 5:40 p.m. UTC | #1
On 02/06/2015 11:42 AM, Aldy Hernandez wrote:
> I was actually thinking of using dwarf2out_early_finish() to mop things up as we generate early (or stream out) other auxiliary tables (pubname_table, pubtype_table, file_table, etc).  More details on that later.  If so, can I leave it as is?

OK.

> +  /* No one should depend on this, as it is a temporary debugging aid
> +     to indicate the DECL for which this DIE was created for.  */
> +  tree tmp_created_for;

Maybe add a FIXME comment to remove/#if this out at merge time?  I don't 
want to add an unnecessary pointer to every DIE in the released compiler.

OK with that change.

Jason
Aldy Hernandez Feb. 6, 2015, 5:47 p.m. UTC | #2
> OK with that change.

Sweet!  Thanks for everything.

Though he's been silent, I bet Richi is secretly dancing :-).
Richard Biener Feb. 10, 2015, 10:52 a.m. UTC | #3
On Fri, Feb 6, 2015 at 5:42 PM, Aldy Hernandez <aldyh@redhat.com> wrote:
>
>>> +    && DECL_CONTEXT (snode->decl)
>>> +    && TREE_CODE (DECL_CONTEXT (snode->decl)) != FUNCTION_DECL)
>>
>>
>> I think this should be !decl_function_context (snode->decl), in case
>> there's a class or BLOCK between the symbol and its enclosing function.
>
>
> Done, also for the iteration through reachable functions.
>
>>
>>>  dwarf2out_type_decl (tree decl, int local)
>>> +  /* ?? Technically, we shouldn't need this hook at all, as all
>>> +     symbols (and by consequence their types) will be outputed from
>>> +     finalize_compilation_unit.
>>
>>
>> Note that we also want to emit debug info about some types that are not
>> referenced by symbols, such as when a type is used in a cast.
>
>
> Fair enough.  I've removed the comment.
>
>>> +/* Perform any cleanups needed after the early debug generation pass
>>> +   has run.  */
>>> +
>>> +static void
>>> +dwarf2out_early_finish (void)
>>
>>
>> Since this is also called from dwarf2out_finish, let's call it something
>> more descriptive, say, flush_limbo_dies?
>
>
> I was actually thinking of using dwarf2out_early_finish() to mop things up
> as we generate early (or stream out) other auxiliary tables (pubname_table,
> pubtype_table, file_table, etc).  More details on that later.  If so, can I
> leave it as is?
>
> How is this version?  No regressions on guality.  Target libraries build
> fine.

Finally having a look at the patch.  And indeed - this is how I thought
it should work.

Of course I wonder why you need to separate handling of functions and
variables.  What breaks if you emit debug info for functions before
the first analyze_functions () call?

I also wonder why you restrict it to functions with a GIMPLE body.

I'd have expected for a TU like

void foo (int, int);
int main()
{
  return 0;
}

to be able to do

(gdb) start
(gdb) ptype foo

and get a prototype for foo?  (ok, that may be -g3 stuff)

Likewise for

struct foo { int i; };
int main ()
{
  return 0;
}

and I realize that this needs frontend support - the middle-end really
only gets "reachable" stuff reliably.

Thus for -g3 we may end up retaining (or adding) some FE calls
to early_global{_decl,_type}?

(not that I care about the above cases in practice, but in theory?)

Thanks for doing all this work!

Richard.
Jason Merrill Feb. 10, 2015, 5:58 p.m. UTC | #4
On 02/10/2015 05:52 AM, Richard Biener wrote:
> I also wonder why you restrict it to functions with a GIMPLE body.
>
> Likewise for
>
> struct foo { int i; };

I guess these should depend on -feliminate-unused-debug-symbols and 
-feliminate-unused-debug-types.

Jason
Aldy Hernandez Feb. 12, 2015, 6:04 p.m. UTC | #5
On 02/10/2015 02:52 AM, Richard Biener wrote:
> On Fri, Feb 6, 2015 at 5:42 PM, Aldy Hernandez <aldyh@redhat.com> wrote:

> Of course I wonder why you need to separate handling of functions and variables

The variables need to be handled earlier, else the call to 
analyze_functions() will remove some optimized-away global variables, 
and we'll never see them.  I believe that Jason said they were needed 
up-thread.

> variables.  What breaks if you emit debug info for functions before
> the first analyze_functions () call?
>
> I also wonder why you restrict it to functions with a GIMPLE body.

The functions, on the other hand, need to be handled after the second 
call to analyze_functions() (and with a GIMPLE body), else we get far more 
function DIEs than mainline currently does, especially wrt C++ clones. 
Otherwise, we get DIEs for base constructors, complete constructors, and 
what-have-yous.  Jason wanted fewer DIEs, more in tune with what mainline 
is currently doing.

>
> I'd have expected for a TU like
>
> void foo (int, int);
> int main()
> {
>    return 0;
> }
>
> to be able to do
>
> (gdb) start
> (gdb) ptype foo
>
> and get a prototype for foo?  (ok, that may be -g3 stuff)

This may need frontend support.  I don't think we get any of these 
prototype symbols in cgraphunit, at least not in the symbol table 
(FOR_EACH_SYMBOL).  Perhaps a call from the front-ends, similar to what 
we do with types with debug_hooks->type_decl().  But... do we really want 
that?  Is there a practical use?

>
> Likewise for
>
> struct foo { int i; };
> int main ()
> {
>    return 0;
> }

This is already working if compiled with 
-fno-eliminate-unused-debug-types, probably by virtue of types being 
passed from the front-ends to debug_hooks->type_decl().  We could 
certainly enable -fno-eliminate-unused-debug-{symbols,types} for -g3 if 
desirable.

> Thanks for doing all this work!

Thanks for volunteering to do the LTO bits ;-).

Aldy
Jason Merrill Feb. 12, 2015, 7:27 p.m. UTC | #6
On 02/12/2015 01:04 PM, Aldy Hernandez wrote:
> On 02/10/2015 02:52 AM, Richard Biener wrote:
>> On Fri, Feb 6, 2015 at 5:42 PM, Aldy Hernandez <aldyh@redhat.com> wrote:
>
>> Of course I wonder why you need to separate handling of functions and
>> variables
>
> The variables need to be handled earlier, else the call to
> analyze_functions() will remove some optimized-away global variables,
> and we'll never see them.  I believe that Jason said they were needed
> up-thread.
>
>> variables.  What breaks if you emit debug info for functions before
>> the first analyze_functions () call?
>  >
>  > I also wonder why you restrict it to functions with a GIMPLE body.
>
> The functions, on the other hand, need to be handled after the second
> call to analyze_functions() (and with a GIMPLE body), else we get far more
> function DIEs than mainline currently does, especially wrt C++ clones.
> Otherwise, we get DIEs for base constructors, complete constructors, and
> what-have-yous.  Jason wanted fewer DIEs, more in tune with what mainline
> is currently doing.

I think it makes sense to generate DIEs for everything defined in the TU 
if we don't have -feliminate-unused-debug-symbols.  But since clones are 
artificial, emit them only if they're used.

>> void foo (int, int);
>> and get a prototype for foo?  (ok, that may be -g3 stuff)
>
> This may need frontend support.  I don't think we get any of these
> prototype symbols in cgraphunit, at least not in the symbol table
> (FOR_EACH_SYMBOL).  Perhaps a call from the front-ends, similar to what
> we do with types with debug_hooks->type_decl().  But... do we really want
> that?  Is there a practical use?

We haven't done that, historically.  But it would be useful for ABI 
verification, by comparing the debug info for the declarations to the 
debug info for the definition in the shared library.

Jason
Aldy Hernandez Feb. 16, 2015, 8:46 p.m. UTC | #7
On 02/12/2015 11:27 AM, Jason Merrill wrote:
> On 02/12/2015 01:04 PM, Aldy Hernandez wrote:
>> On 02/10/2015 02:52 AM, Richard Biener wrote:
>>> On Fri, Feb 6, 2015 at 5:42 PM, Aldy Hernandez <aldyh@redhat.com> wrote:
>>
>>> Of course I wonder why you need to separate handling of functions and
>>> variables
>>
>> The variables need to be handled earlier, else the call to
>> analyze_functions() will remove some optimized-away global variables,
>> and we'll never see them.  I believe that Jason said they were needed
>> up-thread.
>>
>>> variables.  What breaks if you emit debug info for functions before
>>> the first analyze_functions () call?
>>  >
>>  > I also wonder why you restrict it to functions with a GIMPLE body.
>>
>> The functions, on the other hand, need to be handled after the second
>> call to analyze_functions() (and with a GIMPLE body), else we get far more
>> function DIEs than mainline currently does, especially wrt C++ clones.
>> Otherwise, we get DIEs for base constructors, complete constructors, and
>> what-have-yous.  Jason wanted fewer DIEs, more in tune with what mainline
>> is currently doing.
>
> I think it makes sense to generate DIEs for everything defined in the TU
> if we don't have -feliminate-unused-debug-symbols.  But since clones are
> artificial, emit them only if they're used.

Ok, just so we're on the same page.  I'm thinking that for 
-fNO-eliminate-unused-debug-symbols, we can iterate through 
FOR_EACH_DEFINED_FUNCTION before unreachable functions have been 
removed.  There we can output all non-clones.

Then for the -feliminate-unused-debug-symbols case, we can output 
reachable functions after the unreachable ones have been removed.  Here 
we can also dump the clones we ignored for 
-fNO-eliminate-unused-debug-symbols above, since we only want to emit 
them if they're reachable (regardless of -feliminate-unused-debug-symbols).

In either case, we always ignore those without a gimple body, otherwise 
we end up generating DIEs for the _ZN1AC2Ei constructor in the attached 
function unnecessarily.  See how the bits end up in the attached testcase:

(Oh, and we determine clonehood with DECL_ABSTRACT_ORIGIN)

Before any calls to analyze_functions()
---------------------------------------
Function: 'int main()' (Mangled: main) gimple_body=1 DECL_ABSTRACT_ORIGIN=0
Function: 'A::A(int)' (Mangled: _ZN1AC1Ei) gimple_body=0 DECL_ABSTRACT_ORIGIN=1
Function: 'A::A(int)' (Mangled: _ZN1AC2Ei) gimple_body=1 DECL_ABSTRACT_ORIGIN=1
Function: 'void foo(int)' (Mangled: _Z3fooi) gimple_body=1 DECL_ABSTRACT_ORIGIN=0
Function: 'int bar()' (Mangled: _Z3barv) gimple_body=1 DECL_ABSTRACT_ORIGIN=0
Function: 'void unreachable_func()' (Mangled: _ZL16unreachable_funcv) gimple_body=1 DECL_ABSTRACT_ORIGIN=0

After reachability analysis
(after first call to analyze_functions())
-----------------------------------------
Function: 'int main()' (Mangled: main) gimple_body=1 DECL_ABSTRACT_ORIGIN=0
Function: 'A::A(int)' (Mangled: _ZN1AC1Ei) gimple_body=0 DECL_ABSTRACT_ORIGIN=1
Function: 'A::A(int)' (Mangled: _ZN1AC2Ei) gimple_body=1 DECL_ABSTRACT_ORIGIN=1
Function: 'void foo(int)' (Mangled: _Z3fooi) gimple_body=1 DECL_ABSTRACT_ORIGIN=0
Function: 'int bar()' (Mangled: _Z3barv) gimple_body=1 DECL_ABSTRACT_ORIGIN=0

Is this what you had in mind?  I can provide a patch to make things clearer.

Aldy
extern "C" void abort ();
struct A { A (int); int a; };

int i;

static void unreachable_func()
{
  i = 5;
}


__attribute__((noinline, noclone)) int
bar (void)
{
  return 40;
}

__attribute__((noinline, noclone)) void
foo (int x)
{
  __asm volatile ("" : : "r" (x) : "memory");
}

A::A (int x)
{
  static int p = bar ();
  foo (p);
  a = ++p;
}

int
main ()
{
  A a (42);
  if (a.a != 41)
    abort ();
}
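
The selection rule proposed in comment #7 can be modeled outside the compiler. The following is a minimal sketch, not code from the patch: Func, testcase_functions, early_debug_functions and check_thread_dump are hypothetical names, the three flags are copied by hand from the dumps above, and the single loop stands in for what the proposal describes as two separate walks (one before and one after unreachable functions are removed), so the output order is only illustrative.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy model of a function symbol; field names are illustrative, not GCC's.
struct Func {
  std::string mangled;
  bool gimple_body;  // gimple_body=1 in the dump
  bool is_clone;     // DECL_ABSTRACT_ORIGIN=1 in the dump
  bool reachable;    // still listed after the first analyze_functions() call
};

// Flags transcribed from the two dumps in comment #7.
std::vector<Func> testcase_functions() {
  return {
    {"main", true, false, true},
    {"_ZN1AC1Ei", false, true, true},
    {"_ZN1AC2Ei", true, true, true},
    {"_Z3fooi", true, false, true},
    {"_Z3barv", true, false, true},
    {"_ZL16unreachable_funcv", true, false, false},
  };
}

// Returns the mangled names that would get an early function DIE.
std::vector<std::string>
early_debug_functions(const std::vector<Func> &fns, bool eliminate_unused) {
  std::vector<std::string> out;
  for (const Func &f : fns) {
    if (!f.gimple_body)
      continue;  // always ignored, whatever the flag says
    if (f.is_clone) {
      // Clones are artificial: emit them only when reachable.
      if (f.reachable)
        out.push_back(f.mangled);
    } else if (!eliminate_unused || f.reachable) {
      // Non-clones: everything defined, unless unused symbols are eliminated.
      out.push_back(f.mangled);
    }
  }
  return out;
}

// Sanity check against the dump: unreachable_func is the only difference
// between the two modes, and the body-less _ZN1AC1Ei is never emitted.
void check_thread_dump() {
  const std::vector<Func> fns = testcase_functions();
  assert(early_debug_functions(fns, true).size() == 4);
  assert(early_debug_functions(fns, false).size() == 5);
}
```

Under these assumptions, passing eliminate_unused = false corresponds to -fno-eliminate-unused-debug-symbols and additionally keeps _ZL16unreachable_funcv, while the clone with gimple_body=0 is skipped in both modes.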
Patch

diff --git a/gcc/c/c-decl.c b/gcc/c/c-decl.c
index d1e1f74..7e1305c 100644
--- a/gcc/c/c-decl.c
+++ b/gcc/c/c-decl.c
@@ -10657,10 +10657,7 @@  c_write_global_declarations_1 (tree globals)
      here, and start the TV_PHASE_DBGINFO timer.  Is it worth it, or
      would it convolute things?  */
   for (decl = globals; decl; decl = DECL_CHAIN (decl))
-    {
-      check_global_declaration_1 (decl);
-      debug_hooks->early_global_decl (decl);
-    }
+    check_global_declaration_1 (decl);
   /* ?? Similarly here. Stop TV_PHASE_DBGINFO and start
      TV_PHASE_DEFERRED again.  */
 }
diff --git a/gcc/cgraphunit.c b/gcc/cgraphunit.c
index fed1a3e..3b57dfe 100644
--- a/gcc/cgraphunit.c
+++ b/gcc/cgraphunit.c
@@ -2326,6 +2326,16 @@  symbol_table::finalize_compilation_unit (void)
   if (flag_dump_passes)
     dump_passes ();
 
+  /* Generate early debug for global symbols.  Any local symbols will
+     be handled by either handling reachable functions further down
+     (and by consequence, locally scoped symbols), or by generating
+     DIEs for types.  */
+  symtab_node *snode;
+  FOR_EACH_SYMBOL (snode)
+    if (TREE_CODE (snode->decl) != FUNCTION_DECL
+	&& !decl_function_context (snode->decl))
+      (*debug_hooks->early_global_decl) (snode->decl);
+
   /* Gimplify and lower all functions, compute reachability and
      remove unreachable nodes.  */
   analyze_functions ();
@@ -2336,6 +2346,17 @@  symbol_table::finalize_compilation_unit (void)
   /* Gimplify and lower thunks.  */
   analyze_functions ();
 
+  /* Emit early debug for reachable functions, and by consequence,
+     locally scoped symbols.  */
+  struct cgraph_node *cnode;
+  FOR_EACH_FUNCTION_WITH_GIMPLE_BODY (cnode)
+    if (!decl_function_context (cnode->decl))
+      (*debug_hooks->early_global_decl) (cnode->decl);
+
+  /* Clean up anything that needs cleaning up after initial debug
+     generation.  */
+  (*debug_hooks->early_finish) ();
+
   /* Finally drive the pass manager.  */
   compile ();
 
diff --git a/gcc/cp/decl2.c b/gcc/cp/decl2.c
index 691688b..96740e8e 100644
--- a/gcc/cp/decl2.c
+++ b/gcc/cp/decl2.c
@@ -4333,11 +4333,10 @@  dump_tu (void)
     }
 }
 
-/* Issue warnings for globals in NAME_SPACE (unused statics, etc) and
-   generate debug information for said globals.  */
+/* Issue warnings for globals in NAME_SPACE (unused statics, etc).  */
 
 static int
-emit_debug_for_namespace (tree name_space, void* data ATTRIBUTE_UNUSED)
+check_statics_for_namespace (tree name_space, void* data ATTRIBUTE_UNUSED)
 {
   cp_binding_level *level = NAMESPACE_LEVEL (name_space);
   vec<tree, va_gc> *statics = level->static_decls;
@@ -4346,9 +4345,6 @@  emit_debug_for_namespace (tree name_space, void* data ATTRIBUTE_UNUSED)
 
   check_global_declarations (vec, len);
 
-  for (tree t = level->names; t; t = TREE_CHAIN(t))
-    debug_hooks->early_global_decl (t);
-
   return 0;
 }
 
@@ -4737,15 +4733,10 @@  c_parse_final_cleanups (void)
      generate initial debug information.  */
   timevar_stop (TV_PHASE_PARSING);
   timevar_start (TV_PHASE_DBGINFO);
-  walk_namespaces (emit_debug_for_namespace, 0);
+  walk_namespaces (check_statics_for_namespace, 0);
   if (vec_safe_length (pending_statics) != 0)
-    {
-      check_global_declarations (pending_statics->address (),
-				 pending_statics->length ());
-      emit_debug_global_declarations (pending_statics->address (),
-				      pending_statics->length (),
-				      EMIT_DEBUG_EARLY);
-    }
+    check_global_declarations (pending_statics->address (),
+			       pending_statics->length ());
 
   perform_deferred_noexcept_checks ();
 
diff --git a/gcc/dbxout.c b/gcc/dbxout.c
index 430a2eb..202ef8a 100644
--- a/gcc/dbxout.c
+++ b/gcc/dbxout.c
@@ -359,6 +359,7 @@  const struct gcc_debug_hooks dbx_debug_hooks =
   dbxout_init,
   dbxout_finish,
   debug_nothing_void,
+  debug_nothing_void,
   debug_nothing_int_charstar,
   debug_nothing_int_charstar,
   dbxout_start_source_file,
@@ -400,6 +401,7 @@  const struct gcc_debug_hooks xcoff_debug_hooks =
   dbxout_init,
   dbxout_finish,
   debug_nothing_void,
+  debug_nothing_void,
   debug_nothing_int_charstar,
   debug_nothing_int_charstar,
   dbxout_start_source_file,
diff --git a/gcc/debug.c b/gcc/debug.c
index 449d3a1..d0e00c0 100644
--- a/gcc/debug.c
+++ b/gcc/debug.c
@@ -27,6 +27,7 @@  const struct gcc_debug_hooks do_nothing_debug_hooks =
 {
   debug_nothing_charstar,
   debug_nothing_charstar,
+  debug_nothing_void,			/* early_finish */
   debug_nothing_void,
   debug_nothing_int_charstar,
   debug_nothing_int_charstar,
diff --git a/gcc/debug.h b/gcc/debug.h
index f9485bc..a8d3f23 100644
--- a/gcc/debug.h
+++ b/gcc/debug.h
@@ -30,6 +30,9 @@  struct gcc_debug_hooks
   /* Output debug symbols.  */
   void (* finish) (const char *main_filename);
 
+  /* Run cleanups necessary after early debug generation.  */
+  void (* early_finish) (void);
+
   /* Called from cgraph_optimize before starting to assemble
      functions/variables/toplevel asms.  */
   void (* assembly_start) (void);
diff --git a/gcc/dwarf2out.c b/gcc/dwarf2out.c
index 33738d9..036194d 100644
--- a/gcc/dwarf2out.c
+++ b/gcc/dwarf2out.c
@@ -106,6 +106,7 @@  along with GCC; see the file COPYING3.  If not see
 #include "tree-dfa.h"
 #include "gdb/gdb-index.h"
 #include "rtl-iter.h"
+#include "print-tree.h"
 
 static void dwarf2out_source_line (unsigned int, const char *, int, bool);
 static rtx_insn *last_var_location_insn;
@@ -2424,6 +2425,7 @@  build_cfa_aligned_loc (dw_cfa_location *cfa,
 
 static void dwarf2out_init (const char *);
 static void dwarf2out_finish (const char *);
+static void dwarf2out_early_finish (void);
 static void dwarf2out_assembly_start (void);
 static void dwarf2out_define (unsigned int, const char *);
 static void dwarf2out_undef (unsigned int, const char *);
@@ -2451,6 +2453,7 @@  const struct gcc_debug_hooks dwarf2_debug_hooks =
 {
   dwarf2out_init,
   dwarf2out_finish,
+  dwarf2out_early_finish,
   dwarf2out_assembly_start,
   dwarf2out_define,
   dwarf2out_undef,
@@ -2611,6 +2614,9 @@  typedef struct GTY((chain_circular ("%h.die_sib"), for_user)) die_struct {
   int die_mark;
   unsigned int decl_id;
   enum dwarf_tag die_tag;
+  /* No one should depend on this, as it is a temporary debugging aid
+     to indicate the DECL for which this DIE was created for.  */
+  tree tmp_created_for;
   /* Die is used and must not be pruned as unused.  */
   BOOL_BITFIELD die_perennial_p : 1;
   BOOL_BITFIELD comdat_type_p : 1; /* DIE has a type signature */
@@ -4890,6 +4896,7 @@  new_die (enum dwarf_tag tag_value, dw_die_ref parent_die, tree t)
   dw_die_ref die = ggc_cleared_alloc<die_node> ();
 
   die->die_tag = tag_value;
+  die->tmp_created_for = t;
 
   if (early_dwarf_dumping)
     die->dumped_early = true;
@@ -4900,6 +4907,30 @@  new_die (enum dwarf_tag tag_value, dw_die_ref parent_die, tree t)
     {
       limbo_die_node *limbo_node;
 
+      /* No DIEs created after early dwarf should end up in limbo,
+	 because the limbo list should not persist past LTO
+	 streaming.  */
+      if (tag_value != DW_TAG_compile_unit
+	  && !early_dwarf_dumping
+	  /* Allow nested functions to live in limbo because they will
+	     only temporarily live there, as decls_for_scope will fix
+	     them up.  */
+	  && (TREE_CODE (t) != FUNCTION_DECL
+	      || !decl_function_context (t))
+	  /* FIXME: Allow types for now.  We are getting some internal
+	     template types from inlining (building libstdc++).
+	     Templates need to be looked at.  */
+	  && !TYPE_P (t)
+	  /* FIXME: Allow late limbo DIE creation for LTO, especially
+	     in the ltrans stage, but once we implement LTO dwarf
+	     streaming, we should remove this exception.  */
+	  && !in_lto_p)
+	{
+	  fprintf (stderr, "symbol ended up in limbo too late:");
+	  debug_generic_stmt (t);
+	  gcc_unreachable ();
+	}
+
       limbo_node = ggc_cleared_alloc<limbo_die_node> ();
       limbo_node->die = die;
       limbo_node->created_for = t;
@@ -5399,6 +5430,13 @@  print_die (dw_die_ref die, FILE *outfile)
 	fprintf (outfile, ": %s", name);
       fputc (')', outfile);
     }
+  if (die->tmp_created_for
+      && DECL_P (die->tmp_created_for)
+      && CODE_CONTAINS_STRUCT
+           (TREE_CODE (die->tmp_created_for), TS_DECL_WITH_VIS)
+      && DECL_ASSEMBLER_NAME_SET_P (die->tmp_created_for))
+    fprintf (outfile, "(mangle: %s)",
+	     IDENTIFIER_POINTER (DECL_ASSEMBLER_NAME (die->tmp_created_for)));
   fputc ('\n', outfile);
   print_spaces (outfile);
   fprintf (outfile, "  abbrev id: %lu", die->die_abbrev);
@@ -18274,7 +18312,8 @@  dwarf2out_abstract_function (tree decl)
   current_function_decl = decl;
 
   was_abstract = DECL_ABSTRACT_P (decl);
-  set_decl_abstract_flags (decl, 1);
+  if (!was_abstract)
+    set_decl_abstract_flags (decl, 1);
   dwarf2out_decl (decl);
   if (! was_abstract)
     set_decl_abstract_flags (decl, 0);
@@ -18403,24 +18442,93 @@  gen_subprogram_die (tree decl, dw_die_ref context_die)
   tree origin = decl_ultimate_origin (decl);
   dw_die_ref subr_die;
   dw_die_ref old_die = lookup_decl_die (decl);
+
+  /* This function gets called multiple times for different stages of
+     the debug process.  For example, for func() in this code:
+
+	namespace S
+	{
+	  void func() { ... }
+	}
+
+     ...we get called 4 times.  Twice in early debug and twice in
+     late debug:
+
+     Early debug
+     -----------
+
+       1. Once while generating func() within the namespace.  This is
+          the declaration.  The declaration bit below is set, as the
+          context is the namespace.
+
+	  A new DIE will be generated with DW_AT_declaration set.
+
+       2. Once for func() itself.  This is the specification.  The
+          declaration bit below is clear as the context is the CU.
+
+	  We will use the cached DIE from (1) to create a new DIE with
+	  DW_AT_specification pointing to the declaration in (1).
+
+     Late debug via rest_of_handle_final()
+     -------------------------------------
+
+       3. Once generating func() within the namespace.  This is also the
+          declaration, as in (1), but this time we will early exit below
+          as we have a cached DIE and a declaration needs no additional
+          annotations (no locations), as the source declaration line
+          info is enough.
+
+       4. Once for func() itself.  As in (2), this is the specification,
+          but this time we will re-use the cached DIE, and just annotate
+          it with the location information that should now be available.
+
+     For something without namespaces, but with abstract instances, we
+     are also called multiple times:
+
+        class Base
+	{
+	public:
+	  Base ();	  // constructor declaration (1)
+	};
+
+	Base::Base () { } // constructor specification (2)
+
+    Early debug
+    -----------
+
+       1. Once for the Base() constructor by virtue of it being a
+          member of the Base class.  This is done via
+          rest_of_type_compilation.
+
+	  This is a declaration, so a new DIE will be created with
+	  DW_AT_declaration.
+
+       2. Once for the Base() constructor definition, but this time
+          while generating the abstract instance of the base
+          constructor (__base_ctor) which is being generated via early
+          debug of reachable functions.
+
+	  Even though we have a cached version of the declaration (1),
+	  we will create a DW_AT_specification of the declaration DIE
+	  in (1).
+
+       3. Once for the __base_ctor itself, but this time, we generate
+          a DW_AT_abstract_origin version of the DW_AT_specification in
+	  (2).
+
+    Late debug via rest_of_handle_final
+    -----------------------------------
+
+       4. One final time for the __base_ctor (which will have a cached
+          DIE with DW_AT_abstract_origin created in (3)).  This time,
+          we will just annotate the location information now
+          available.
+  */
   int declaration = (current_function_decl != decl
 		     || class_or_namespace_scope_p (context_die));
 
   premark_used_types (DECL_STRUCT_FUNCTION (decl));
 
-  /* It is possible to have both DECL_ABSTRACT_P and DECLARATION be true if we
-     started to generate the abstract instance of an inline, decided to output
-     its containing class, and proceeded to emit the declaration of the inline
-     from the member list for the class.  If so, DECLARATION takes priority;
-     we'll get back to the abstract instance when done with the class.  */
-
-  /* The class-scope declaration DIE must be the primary DIE.  */
-  if (origin && declaration && class_or_namespace_scope_p (context_die))
-    {
-      origin = NULL;
-      gcc_assert (!old_die);
-    }
-
   /* Now that the C++ front end lazily declares artificial member fns, we
      might need to retrofit the declaration into its class.  */
   if (!declaration && !origin && !old_die
@@ -18440,14 +18548,23 @@  gen_subprogram_die (tree decl, dw_die_ref context_die)
       if (old_die && old_die->die_parent == NULL)
 	add_child_die (context_die, old_die);
 
-      subr_die = new_die (DW_TAG_subprogram, context_die, decl);
-      add_abstract_origin_attribute (subr_die, origin);
-      /*  This is where the actual code for a cloned function is.
-	  Let's emit linkage name attribute for it.  This helps
-	  debuggers to e.g, set breakpoints into
-	  constructors/destructors when the user asks "break
-	  K::K".  */
-      add_linkage_name (subr_die, decl);
+      if (old_die && get_AT_ref (old_die, DW_AT_abstract_origin))
+	{
+	  /* If we have a DW_AT_abstract_origin we have a working
+	     cached version.  */
+	  subr_die = old_die;
+	}
+      else
+	{
+	  subr_die = new_die (DW_TAG_subprogram, context_die, decl);
+	  add_abstract_origin_attribute (subr_die, origin);
+	  /*  This is where the actual code for a cloned function is.
+	      Let's emit linkage name attribute for it.  This helps
+	      debuggers to e.g, set breakpoints into
+	      constructors/destructors when the user asks "break
+	      K::K".  */
+	  add_linkage_name (subr_die, decl);
+	}
     }
   /* A cached copy, possibly from early dwarf generation.  Reuse as
      much as possible.  */
@@ -18496,24 +18613,13 @@  gen_subprogram_die (tree decl, dw_die_ref context_die)
 	{
 	  subr_die = old_die;
 
-	  /* ??? Hmmm, early dwarf generation happened earlier, so no
-	     sense in removing the parameters.  Let's keep them and
-	     augment them with location information later.  */
-#if 0
-	  /* Clear out the declaration attribute and the formal parameters.
-	     Do not remove all children, because it is possible that this
-	     declaration die was forced using force_decl_die(). In such
-	     cases die that forced declaration die (e.g. TAG_imported_module)
-	     is one of the children that we do not want to remove.  */
-	  remove_AT (subr_die, DW_AT_declaration);
-	  remove_AT (subr_die, DW_AT_object_pointer);
-	  remove_child_TAG (subr_die, DW_TAG_formal_parameter);
-#else
-	  /* We don't need the DW_AT_declaration the second or third
-	     time around anyhow.  */
+	  /* Clear out the declaration attribute, but leave the
+	     parameters so they can be augmented with location
+	     information later.  */
 	  remove_AT (subr_die, DW_AT_declaration);
-#endif
 	}
+      /* Make a specification pointing to the previously built
+	 declaration.  */
       else
 	{
 	  subr_die = new_die (DW_TAG_subprogram, context_die, decl);
@@ -21319,7 +21425,12 @@  static void
 dwarf2out_type_decl (tree decl, int local)
 {
   if (!local)
-    dwarf2out_decl (decl);
+    {
+      bool t = early_dwarf_dumping;
+      early_dwarf_dumping = true;
+      dwarf2out_decl (decl);
+      early_dwarf_dumping = t;
+    }
 }
 
 /* Output debug information for imported module or decl DECL.
@@ -21636,7 +21747,10 @@  dwarf2out_decl (tree decl)
   /* If we early created a DIE, make sure it didn't get re-created by
      mistake.  */
   if (early_die && early_die->dumped_early)
-    gcc_assert (early_die == die);
+    gcc_assert (early_die == die
+		/* We can have a differing DIE if and only if, the
+		   new one is a specification of the old one.  */
+		|| get_AT_ref (die, DW_AT_specification) == early_die);
 #endif
   return die;
 }
@@ -21733,6 +21847,9 @@  lookup_filename (const char *file_name)
 {
   struct dwarf_file_data * created;
 
+  if (!file_name)
+    return NULL;
+
   dwarf_file_data **slot
     = file_table->find_slot_with_hash (file_name, htab_hash_string (file_name),
 				       INSERT);
@@ -24726,10 +24843,20 @@  optimize_location_lists (dw_die_ref die)
 static void
 dwarf2out_finish (const char *filename)
 {
-  limbo_die_node *node, *next_node;
   comdat_type_node *ctnode;
   dw_die_ref main_comp_unit_die;
 
+  /* If the limbo list has anything, it should be things that were
+     created after the compilation proper.  Anything from the early
+     dwarf pass, should have parents and should never be in the limbo
+     list this late.  */
+  for (limbo_die_node *node = limbo_die_list; node; node = node->next)
+    gcc_assert (node->die->die_tag == DW_TAG_compile_unit
+		|| !node->die->dumped_early);
+
+  /* Flush out any latecomers to the limbo party.  */
+  dwarf2out_early_finish();
+
   /* PCH might result in DW_AT_producer string being restored from the
      header compilation, so always fill it with empty string initially
      and overwrite only here.  */
@@ -24754,55 +24881,6 @@  dwarf2out_finish (const char *filename)
 	add_comp_dir_attribute (comp_unit_die ());
     }
 
-  /* Traverse the limbo die list, and add parent/child links.  The only
-     dies without parents that should be here are concrete instances of
-     inline functions, and the comp_unit_die.  We can ignore the comp_unit_die.
-     For concrete instances, we can get the parent die from the abstract
-     instance.  */
-  for (node = limbo_die_list; node; node = next_node)
-    {
-      dw_die_ref die = node->die;
-      next_node = node->next;
-
-      if (die->die_parent == NULL)
-	{
-	  dw_die_ref origin = get_AT_ref (die, DW_AT_abstract_origin);
-
-	  if (origin && origin->die_parent)
-	    add_child_die (origin->die_parent, die);
-	  else if (is_cu_die (die))
-	    ;
-	  else if (seen_error ())
-	    /* It's OK to be confused by errors in the input.  */
-	    add_child_die (comp_unit_die (), die);
-	  else
-	    {
-	      /* In certain situations, the lexical block containing a
-		 nested function can be optimized away, which results
-		 in the nested function die being orphaned.  Likewise
-		 with the return type of that nested function.  Force
-		 this to be a child of the containing function.
-
-		 It may happen that even the containing function got fully
-		 inlined and optimized out.  In that case we are lost and
-		 assign the empty child.  This should not be big issue as
-		 the function is likely unreachable too.  */
-	      gcc_assert (node->created_for);
-
-	      if (DECL_P (node->created_for))
-		origin = get_context_die (DECL_CONTEXT (node->created_for));
-	      else if (TYPE_P (node->created_for))
-		origin = scope_die_for (node->created_for, comp_unit_die ());
-	      else
-		origin = comp_unit_die ();
-
-	      add_child_die (origin, die);
-	    }
-	}
-    }
-
-  limbo_die_list = NULL;
-
 #if ENABLE_ASSERT_CHECKING
   {
     dw_die_ref die = comp_unit_die (), c;
@@ -24850,6 +24928,7 @@  dwarf2out_finish (const char *filename)
   /* Traverse the DIE's and add add sibling attributes to those DIE's
      that have children.  */
   add_sibling_attributes (comp_unit_die ());
+  limbo_die_node *node;
   for (node = limbo_die_list; node; node = node->next)
     add_sibling_attributes (node->die);
   for (ctnode = comdat_type_list; ctnode != NULL; ctnode = ctnode->next)
@@ -25111,6 +25190,66 @@  dwarf2out_finish (const char *filename)
     output_indirect_strings ();
 }
 
+/* Perform any cleanups needed after the early debug generation pass
+   has run.  */
+
+static void
+dwarf2out_early_finish (void)
+{
+  /* Traverse the limbo die list, and add parent/child links.  The only
+     dies without parents that should be here are concrete instances of
+     inline functions, and the comp_unit_die.  We can ignore the comp_unit_die.
+     For concrete instances, we can get the parent die from the abstract
+     instance.
+
+     The point here is to flush out the limbo list so that it is empty
+     and we don't need to stream it for LTO.  */
+  limbo_die_node *node, *next_node;
+  for (node = limbo_die_list; node; node = next_node)
+    {
+      dw_die_ref die = node->die;
+      next_node = node->next;
+
+      if (die->die_parent == NULL)
+	{
+	  dw_die_ref origin = get_AT_ref (die, DW_AT_abstract_origin);
+
+	  if (origin && origin->die_parent)
+	    add_child_die (origin->die_parent, die);
+	  else if (is_cu_die (die))
+	    ;
+	  else if (seen_error ())
+	    /* It's OK to be confused by errors in the input.  */
+	    add_child_die (comp_unit_die (), die);
+	  else
+	    {
+	      /* In certain situations, the lexical block containing a
+		 nested function can be optimized away, which results
+		 in the nested function die being orphaned.  Likewise
+		 with the return type of that nested function.  Force
+		 this to be a child of the containing function.
+
+		 It may happen that even the containing function got fully
+		 inlined and optimized out.  In that case we are lost and
+		 assign the empty child.  This should not be a big issue as
+		 the function is likely unreachable too.  */
+	      gcc_assert (node->created_for);
+
+	      if (DECL_P (node->created_for))
+		origin = get_context_die (DECL_CONTEXT (node->created_for));
+	      else if (TYPE_P (node->created_for))
+		origin = scope_die_for (node->created_for, comp_unit_die ());
+	      else
+		origin = comp_unit_die ();
+
+	      add_child_die (origin, die);
+	    }
+	}
+    }
+
+  limbo_die_list = NULL;
+}
+
 /* Reset all state within dwarf2out.c so that we can rerun the compiler
    within the same process.  For use by toplev::finalize.  */
 
diff --git a/gcc/fortran/f95-lang.c b/gcc/fortran/f95-lang.c
index 3d217d4..4d41b4e 100644
--- a/gcc/fortran/f95-lang.c
+++ b/gcc/fortran/f95-lang.c
@@ -242,8 +242,7 @@  gfc_be_parse_file (void)
      diagnostics before gfc_finish().  */
   gfc_diagnostics_finish ();
 
-  /* Do the debug dance.  */
-  global_decl_processing_and_early_debug ();
+  global_decl_processing ();
 }
 
 
diff --git a/gcc/fortran/trans-decl.c b/gcc/fortran/trans-decl.c
index 1fa3060..efd4cd4 100644
--- a/gcc/fortran/trans-decl.c
+++ b/gcc/fortran/trans-decl.c
@@ -4709,7 +4709,6 @@  gfc_emit_parameter_debug_info (gfc_symbol *sym)
 					      TREE_TYPE (decl),
 					      sym->attr.dimension,
 					      false, false);
-  debug_hooks->early_global_decl (decl);
 }
 
 
diff --git a/gcc/go/go-gcc.cc b/gcc/go/go-gcc.cc
index 215c4e5..c9c50dd 100644
--- a/gcc/go/go-gcc.cc
+++ b/gcc/go/go-gcc.cc
@@ -3000,8 +3000,6 @@  Gcc_backend::write_global_definitions(
 
   wrapup_global_declarations(defs, i);
 
-  emit_debug_global_declarations (defs, i, EMIT_DEBUG_EARLY);
-
   /* ?? Can we leave this call here, thus getting called before
      finalize_compilation_unit?
 
diff --git a/gcc/java/class.c b/gcc/java/class.c
index bea7720..3d84a57 100644
--- a/gcc/java/class.c
+++ b/gcc/java/class.c
@@ -116,10 +116,6 @@  static GTY(()) vec<tree, va_gc> *registered_class;
    currently being compiled.  */
 static GTY(()) tree this_classdollar;
 
-/* A list of static class fields.  This is to emit proper debug
-   info for them.  */
-vec<tree, va_gc> *pending_static_fields;
-
 /* Return the node that most closely represents the class whose name
    is IDENT.  Start the search from NODE (followed by its siblings).
    Return NULL if an appropriate node does not exist.  */
@@ -888,8 +884,6 @@  add_field (tree klass, tree name, tree field_type, int flags)
       /* Considered external unless we are compiling it into this
 	 object file.  */
       DECL_EXTERNAL (field) = (is_compiled_class (klass) != 2);
-      if (!DECL_EXTERNAL (field))
-	vec_safe_push (pending_static_fields, field);
     }
 
   return field;
diff --git a/gcc/java/decl.c b/gcc/java/decl.c
index fbd09a3..7f02b4e 100644
--- a/gcc/java/decl.c
+++ b/gcc/java/decl.c
@@ -1964,11 +1964,7 @@  java_mark_class_local (tree klass)
 
   for (t = TYPE_FIELDS (klass); t ; t = DECL_CHAIN (t))
     if (FIELD_STATIC (t))
-      {
-	if (DECL_EXTERNAL (t))
-	  vec_safe_push (pending_static_fields, t);
-	java_mark_decl_local (t);
-      }
+      java_mark_decl_local (t);
 
   for (t = TYPE_METHODS (klass); t ; t = DECL_CHAIN (t))
     if (!METHOD_ABSTRACT (t))
diff --git a/gcc/java/java-tree.h b/gcc/java/java-tree.h
index 4ea8feb..142661f 100644
--- a/gcc/java/java-tree.h
+++ b/gcc/java/java-tree.h
@@ -1194,8 +1194,6 @@  extern void rewrite_reflection_indexes (void *);
 
 int cxx_keyword_p (const char *name, int length);
 
-extern GTY(()) vec<tree, va_gc> *pending_static_fields;
-
 #define DECL_FINAL(DECL) DECL_LANG_FLAG_3 (DECL)
 
 /* Access flags etc for a method (a FUNCTION_DECL): */
diff --git a/gcc/java/jcf-parse.c b/gcc/java/jcf-parse.c
index e163d03..a65c5c7 100644
--- a/gcc/java/jcf-parse.c
+++ b/gcc/java/jcf-parse.c
@@ -1993,12 +1993,8 @@  java_parse_file (void)
   java_emit_static_constructor ();
   gcc_assert (global_bindings_p ());
 
-  /* Do final processing on globals and emit early debug information.  */
-  tree *vec = vec_safe_address (pending_static_fields);
-  int len = vec_safe_length (pending_static_fields);
-  global_decl_processing_and_early_debug ();
-  emit_debug_global_declarations (vec, len, EMIT_DEBUG_EARLY);
-  vec_free (pending_static_fields);
+  /* Do final processing on globals.  */
+  global_decl_processing ();
 }
 
 
diff --git a/gcc/langhooks.c b/gcc/langhooks.c
index 1c0edc1..c035490 100644
--- a/gcc/langhooks.c
+++ b/gcc/langhooks.c
@@ -306,12 +306,12 @@  lhd_decl_ok_for_sibcall (const_tree decl ATTRIBUTE_UNUSED)
   return true;
 }
 
-/* Generic global declaration processing and early debug generation.
-   This is meant to be called by the front-ends at the end of parsing.
-   C/C++ do their own thing, but other front-ends may call this.  */
+/* Generic global declaration processing.  This is meant to be called
+   by the front-ends at the end of parsing.  C/C++ do their own thing,
+   but other front-ends may call this.  */
 
 void
-global_decl_processing_and_early_debug (void)
+global_decl_processing (void)
 {
   tree globals, decl, *vec;
   int len, i;
@@ -336,10 +336,6 @@  global_decl_processing_and_early_debug (void)
   check_global_declarations (vec, len);
   timevar_stop (TV_PHASE_DEFERRED);
 
-  timevar_start (TV_PHASE_DBGINFO);
-  emit_debug_global_declarations (vec, len, EMIT_DEBUG_EARLY);
-  timevar_stop (TV_PHASE_DBGINFO);
-
   timevar_start (TV_PHASE_PARSING);
   free (vec);
 }
diff --git a/gcc/sdbout.c b/gcc/sdbout.c
index d7b2d6b..43b8cf2 100644
--- a/gcc/sdbout.c
+++ b/gcc/sdbout.c
@@ -279,6 +279,7 @@  const struct gcc_debug_hooks sdb_debug_hooks =
 {
   sdbout_init,			         /* init */
   sdbout_finish,		         /* finish */
+  debug_nothing_void,			 /* early_finish */
   debug_nothing_void,			 /* assembly_start */
   debug_nothing_int_charstar,	         /* define */
   debug_nothing_int_charstar,	         /* undef */
diff --git a/gcc/toplev.c b/gcc/toplev.c
index 42a2cdc..4897d57 100644
--- a/gcc/toplev.c
+++ b/gcc/toplev.c
@@ -538,13 +538,11 @@  check_global_declarations (tree *v, int len)
     check_global_declaration_1 (v[i]);
 }
 
-/* Emit debugging information for all global declarations in VEC.
-   WHEN is either EMIT_DEBUG_EARLY or EMIT_DEBUG_LATE depending on if
-   we are generating early debug (at the end of parsing), or the late
-   (post compilation) version.  */
+/* Emit late debugging information (post compilation) for all global
+   declarations in VEC.  */
 
 void
-emit_debug_global_declarations (tree *vec, int len, enum emit_debug when)
+emit_debug_global_declarations (tree *vec, int len)
 {
   int i;
 
@@ -554,14 +552,7 @@  emit_debug_global_declarations (tree *vec, int len, enum emit_debug when)
 
   timevar_push (TV_SYMOUT);
   for (i = 0; i < len; i++)
-    {
-      if (when == EMIT_DEBUG_EARLY)
-	debug_hooks->early_global_decl (vec[i]);
-      else if (when == EMIT_DEBUG_LATE)
-	debug_hooks->late_global_decl (vec[i]);
-      else
-	gcc_unreachable ();
-    }
+    debug_hooks->late_global_decl (vec[i]);
   timevar_pop (TV_SYMOUT);
 }
 
@@ -2136,6 +2127,21 @@  toplev::main (int argc, char **argv)
   if (version_flag)
     print_version (stderr, "");
 
+  /* FIXME: Temporary debugging aid to know which LTO phase we are in
+     without having to pass -v to the driver and all its verbosity.  */
+  if (0)
+    {
+      fprintf (stderr, "MAIN: cc1*\n");
+      if (flag_lto)
+	fprintf (stderr, "\tMAIN: flag_lto\n");
+      if (in_lto_p)
+	fprintf (stderr, "\tMAIN: in_lto_p\n");
+      if (flag_wpa)
+	fprintf (stderr, "\tMAIN: flag_wpa\n");
+      if (flag_ltrans)
+	fprintf (stderr, "\tMAIN: flag_ltrans\n");
+    }
+
   if (help_flag)
     print_plugins_help (stderr, "");
 
diff --git a/gcc/toplev.h b/gcc/toplev.h
index ce83539..5d40a4a 100644
--- a/gcc/toplev.h
+++ b/gcc/toplev.h
@@ -60,13 +60,9 @@  extern bool wrapup_global_declarations (tree *, int);
 extern void check_global_declaration_1 (tree);
 extern void check_global_declarations (tree *, int);
 
-enum emit_debug {
-  EMIT_DEBUG_EARLY,
-  EMIT_DEBUG_LATE
-};
-extern void emit_debug_global_declarations (tree *, int, enum emit_debug);
+extern void emit_debug_global_declarations (tree *, int);
 
-extern void global_decl_processing_and_early_debug (void);
+extern void global_decl_processing (void);
 
 extern void dump_memory_report (bool);
 extern void dump_profile_report (void);
diff --git a/gcc/vmsdbgout.c b/gcc/vmsdbgout.c
index 5cb66bc..6da48eb 100644
--- a/gcc/vmsdbgout.c
+++ b/gcc/vmsdbgout.c
@@ -179,6 +179,7 @@  static void vmsdbgout_abstract_function (tree);
 const struct gcc_debug_hooks vmsdbg_debug_hooks
 = {vmsdbgout_init,
    vmsdbgout_finish,
+   debug_nothing_void,
    vmsdbgout_assembly_start,
    vmsdbgout_define,
    vmsdbgout_undef,