LTO early debug

Message ID alpine.LSU.2.11.1504161411540.6786@zhemvz.fhfr.qr
State New

Commit Message

Richard Biener April 16, 2015, 12:18 p.m. UTC
The LTO early debug info prototype project has been completed.

The provided patch against the debug-early branch shows that the 
general idea is sound and works.  Now on to the details.


What works?
-----------

Simple C and C++ testcases work (I didn't test more); in particular the
libstdc++ pretty-printers finally work!  For example

#include <string>
#include <iostream>
int main()
{
  std::string s = "Hello";
  std::cout << s << std::endl;
}

compiled with GCC 5 and -flto -g produces

(gdb) start
Temporary breakpoint 1, main () at t2.C:5
5         std::string s = "Hello";
(gdb) info locals
s = <incomplete type>
(gdb) ptype s
type = struct basic_string {
    <incomplete type>
}

while the patched compiler produces

(gdb) start
Temporary breakpoint 1, main () at t2.C:5
5         std::string s = "Hello";
(gdb) info locals
s = ""
(gdb) n
6         std::cout << s << std::endl;
(gdb) info locals
s = "Hello"
(gdb) ptype s
type = std::string

exactly the same as with non-LTO operation.


What does not work?
-------------------

Currently the simple "tooling" (driver support for linking in the
.debug_info sections produced early at compile-time into the final
executable) does not work for fat LTO objects (thus also when a linker
plugin is not available).  The tooling might also break automatic
LTO linker plugin loading as the slim LTO files are not marked slim.

You can run into new ICEs in dwarf2out.c (also for non-LTO operation).
The libstdc++ testsuite doesn't get very far with -flto -g (we
ICE building testsuite_abi.cc).

VLAs don't work (see below).

Somehow constructor invocations show duplicate parameters and no
locations (but DWARF looks sane).


Future work
-----------

Make the WPA stage side more efficient (do not use DIEs to store/retrieve
the tree <-> DIE symbol + offset info).
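
One possible shape for such a cheaper WPA-side map is sketched below in
C++.  It is only an illustration: early_die_ref and early_die_map are
made-up names that do not appear in the patch, and the key is assumed to
be a decl UID in the style of DECL_UID (BLOCKs would need some other key).
The idea is simply to keep symbol + offset in a plain hash map instead of
wrapping them in fake DIEs.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for the (symbol, offset) pair that
// lookup_die_ref_for_decl hands back; not a GCC type.
struct early_die_ref
{
  std::string symbol;    // label at the start of the early debug info
  std::uint64_t offset;  // offset of the early DIE from that label
};

// Hypothetical side table keyed by a decl UID.
class early_die_map
{
public:
  // Record where the early DIE for UID lives.  Returns false if UID
  // was already registered (the patch asserts in that situation).
  bool record (unsigned uid, const std::string &sym, std::uint64_t off)
  {
    return map_.emplace (uid, early_die_ref{sym, off}).second;
  }

  // Query the reference for UID.  Returns true and fills SYM and OFF
  // on success, mirroring the out-parameter style used in the patch.
  bool lookup (unsigned uid, std::string *sym, std::uint64_t *off) const
  {
    auto it = map_.find (uid);
    if (it == map_.end ())
      return false;
    *sym = it->second.symbol;
    *off = it->second.offset;
    return true;
  }

private:
  std::unordered_map<unsigned, early_die_ref> map_;
};
```

With something like this, the flag_wpa path would call lookup () directly
rather than chasing DW_AT_abstract_origin on a stub DIE.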

Fix the tooling to support fat LTO objects (and make slim objects slim
again).  The only reasonable way to do this is to emit the early
debug info into .gnu.lto_. prefixed sections (??? can/need we redirect
relocations into separate LTO sections as well?), then at WPA (or
unpartitioned LTO) time move all such sections from the LTO inputs
to a (single?) temporary file which serves as additional output the
linker can consume.  We'd need to re-name the sections back at that
time (does that work for a single file and the relocations?  Otherwise
we need unique section names or simply N files).  In theory this
should work using simple-object and thus not be ELF specific(?).
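
The "re-name the sections back" step can be pictured with the small C++
sketch below.  It is an assumption-laden illustration, not part of the
patch: strip_lto_prefix is a hypothetical helper, and the prefix string
is taken from the ".gnu.lto_." wording above, so that a section called
".gnu.lto_.debug_info" would become ".debug_info" again.  It says nothing
about how relocations would be handled.

```cpp
#include <cassert>
#include <string>

// Assumed prefix for the early debug sections, per the text above.
static const std::string lto_prefix = ".gnu.lto_";

// Map e.g. ".gnu.lto_.debug_info" back to ".debug_info".
// Names that do not carry the prefix are returned unchanged.
std::string strip_lto_prefix (const std::string &name)
{
  if (name.compare (0, lto_prefix.size (), lto_prefix) == 0)
    return name.substr (lto_prefix.size ());
  return name;
}
```

If all inputs end up in a single temporary file, the renamed sections
would collide, which is why the text above considers unique section
names or N separate files.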

Fix all the ICEs.

VLAs and stuff don't work (but gcc.dg/guality/vla-1.c also fails quite a
bit on trunk).  I've fixed it up somewhat but the issue is that the
gimplified type size decls are not getting ignored and that they
get location info.  The DWARF looks good but gdb apparently doesn't
see it (yes, there seem to be some gdb issues as well).
-- Looks like we have to pull a type chain up to the point that
can refer to the upper bound DIE for the artificial decl (also makes
it unnecessary to output this early).  Will be somewhat fun but
not too difficult.

We should re-think how we handle abstract origins for functions
we currently output for inlines.  The early debug can serve as
abstract origin for example.  Currently with LTO we get quite
easily confused by having two DIEs for the same decl (also LTO
streaming still drops abstract origins from early inlining, something
no longer necessary with early debug).

Late dwarf generation should be refactored out of early dwarf generation.
Late dwarf for global variables should be created from where we output
the variable (similar to functions), not from a loop in toplev.c.

And of course the attached patch needs to be split up, given a
changelog, formally tested and submitted (after fixing all of the
above ;))


Now looking forward to getting debug-early merged to trunk (I'll do
some LTO testing on the bare branch - I suspect some issues I ran
into are pre-existing on the branch).

Thanks,
Richard.

Patch

diff --git a/gcc/dwarf2asm.c b/gcc/dwarf2asm.c
index b817aaf..78e594e 100644
--- a/gcc/dwarf2asm.c
+++ b/gcc/dwarf2asm.c
@@ -44,6 +44,8 @@  along with GCC; see the file COPYING3.  If not see
 #include "hash-map.h"
 #include "ggc.h"
 #include "tm_p.h"
+#include "function.h"
+#include "emit-rtl.h"
 
 
 /* Output an unaligned integer with the given value and size.  Prefer not
@@ -216,6 +218,34 @@  dw2_asm_output_offset (int size, const char *label,
   va_end (ap);
 }
 
+void
+dw2_asm_output_offset_offset (int size, const char *label, HOST_WIDE_INT offset,
+			      section *base ATTRIBUTE_UNUSED,
+			      const char *comment, ...)
+{
+  va_list ap;
+
+  va_start (ap, comment);
+
+#ifdef ASM_OUTPUT_DWARF_OFFSET
+  FIXME
+  ASM_OUTPUT_DWARF_OFFSET (asm_out_file, size, label, base);
+#else
+  dw2_assemble_integer (size, gen_rtx_PLUS (Pmode,
+					    gen_rtx_SYMBOL_REF (Pmode, label),
+					    gen_int_mode (offset, Pmode)));
+#endif
+
+  if (flag_debug_asm && comment)
+    {
+      fprintf (asm_out_file, "\t%s ", ASM_COMMENT_START);
+      vfprintf (asm_out_file, comment, ap);
+    }
+  fputc ('\n', asm_out_file);
+
+  va_end (ap);
+}
+
 #if 0
 
 /* Output a self-relative reference to a label, possibly in a
diff --git a/gcc/dwarf2asm.h b/gcc/dwarf2asm.h
index d4a5706..c7d49df 100644
--- a/gcc/dwarf2asm.h
+++ b/gcc/dwarf2asm.h
@@ -40,6 +40,10 @@  extern void dw2_asm_output_offset (int, const char *, section *,
 				   const char *, ...)
      ATTRIBUTE_NULL_PRINTF_4;
 
+extern void dw2_asm_output_offset_offset (int, const char *, HOST_WIDE_INT,
+					  section *, const char *, ...)
+     ATTRIBUTE_NULL_PRINTF_5;
+
 extern void dw2_asm_output_addr (int, const char *, const char *, ...)
      ATTRIBUTE_NULL_PRINTF_3;
 
diff --git a/gcc/dwarf2out.c b/gcc/dwarf2out.c
index 0976415..10d3b49 100644
--- a/gcc/dwarf2out.c
+++ b/gcc/dwarf2out.c
@@ -2637,6 +2637,12 @@  typedef struct GTY((chain_circular ("%h.die_sib"), for_user)) die_struct {
   BOOL_BITFIELD comdat_type_p : 1; /* DIE has a type signature */
   /* Die was generated early via dwarf2out_early_global_decl.  */
   BOOL_BITFIELD dumped_early : 1;
+  /* For an external ref to die_symbol if die_offset contains an extra
+     offset to that symbol.  */
+  BOOL_BITFIELD with_offset : 1;
+  /* For a DIE removed from the DIE tree.  TYPE_SYMTAB_DIE is supposed
+     to be ignored (and cleared lazily).  */
+  BOOL_BITFIELD removed : 1;
   /* Lots of spare bits.  */
 }
 die_node;
@@ -4977,7 +4983,10 @@  new_die (enum dwarf_tag tag_value, dw_die_ref parent_die, tree t)
 static inline dw_die_ref
 lookup_type_die (tree type)
 {
-  return TYPE_SYMTAB_DIE (type);
+  dw_die_ref die = TYPE_SYMTAB_DIE (type);
+  if (die && die->removed)
+    TYPE_SYMTAB_DIE (type) = die = NULL;
+  return die;
 }
 
 /* Given a TYPE_DIE representing the type TYPE, if TYPE is an
@@ -5042,7 +5051,166 @@  decl_die_hasher::equal (die_node *x, tree y)
 static inline dw_die_ref
 lookup_decl_die (tree decl)
 {
-  return decl_die_table->find_with_hash (decl, DECL_UID (decl));
+  dw_die_ref die = decl_die_table->find_with_hash (decl, DECL_UID (decl));
+  if (die && die->removed)
+    {
+      decl_die_table->remove_elt_with_hash (decl, DECL_UID (decl));
+      die = NULL;
+    }
+  return die;
+}
+
+
+/* For DECL which might have early dwarf output query a SYMBOL + OFFSET
+   style reference.  Return true if we found one referring to a DIE for
+   DECL, otherwise return false.  */
+
+bool
+lookup_die_ref_for_decl (tree decl, const char **sym,
+			 unsigned HOST_WIDE_INT *off)
+{
+  dw_die_ref die;
+
+  if (flag_fat_lto_objects)
+    return false;
+
+  if (TREE_CODE (decl) == BLOCK)
+    die = BLOCK_DIE (decl);
+  else
+    die = lookup_decl_die (decl);
+  if (!die)
+    return false;
+
+  /* During WPA stage we currently use DIEs to store the
+     decl <-> label + offset map.  That's quite inefficient but it
+     works for now.  */
+  if (flag_wpa)
+    {
+      dw_die_ref ref = get_AT_ref (die, DW_AT_abstract_origin);
+      if (!ref)
+	{
+	  gcc_assert (die == comp_unit_die ());
+	  return false;
+	}
+      *off = ref->die_offset;
+      *sym = ref->die_id.die_symbol;
+      return true;
+    }
+
+  /* Similar to get_ref_die_offset_label, but using the "correct"
+     label.  */
+  *off = die->die_offset;
+  while (die->die_parent)
+    die = die->die_parent;
+  /* For the containing CU DIE we compute a die_symbol in 
+     compute_section_prefix.  */
+  gcc_assert (die->die_tag == DW_TAG_compile_unit
+	      && die->die_id.die_symbol != NULL);
+  *sym = die->die_id.die_symbol;
+  return true;
+}
+
+/* Add a reference of kind ATTR_KIND to a DIE at SYMBOL + OFFSET to DIE.  */
+
+static void
+add_AT_external_die_ref (dw_die_ref die, enum dwarf_attribute attr_kind,
+			 const char *symbol, HOST_WIDE_INT offset)
+{
+  /* Create a fake DIE that contains the reference.  Don't use
+     new_die because we don't want to end up in the limbo list.  */
+  dw_die_ref ref = ggc_cleared_alloc<die_node> ();
+  ref->die_tag = die->die_tag;
+  ref->die_id.die_symbol = IDENTIFIER_POINTER (get_identifier (symbol));
+  ref->die_offset = offset;
+  ref->with_offset = 1;
+  add_AT_die_ref (die, attr_kind, ref);
+}
+
+/* Create a DIE for DECL if required and add a reference to a DIE
+   at SYMBOL + OFFSET which contains attributes dumped early.  */
+
+void
+register_die_ref_for_decl (tree decl, const char *sym,
+			   unsigned HOST_WIDE_INT off)
+{
+  if (flag_wpa && !decl_die_table)
+    decl_die_table = hash_table<decl_die_hasher>::create_ggc (1000);
+
+  dw_die_ref die
+    = TREE_CODE (decl) == BLOCK ? BLOCK_DIE (decl) : lookup_decl_die (decl);
+  gcc_assert (!die);
+  if (!die)
+    {
+      tree ctx;
+      dw_die_ref parent = NULL;
+      /* Need to lookup a DIE for the decls context - the containing
+         function or translation unit.  */
+      if (TREE_CODE (decl) == BLOCK)
+	ctx = BLOCK_SUPERCONTEXT (decl);
+      else
+	ctx = DECL_CONTEXT (decl);
+      while (ctx && TYPE_P (ctx))
+	ctx = TYPE_CONTEXT (ctx);
+      if (ctx)
+	{
+	  if (TREE_CODE (ctx) == BLOCK)
+	    parent = BLOCK_DIE (ctx);
+	  else if (TREE_CODE (ctx) == TRANSLATION_UNIT_DECL
+		   /* Keep the 1:1 association during WPA.  */
+		   && !flag_wpa)
+	    /* Otherwise all late annotations go to the main CU which
+	       imports the original CUs.  */
+	    parent = comp_unit_die ();
+	  else
+	    parent = lookup_decl_die (ctx);
+	}
+      /* Create a DIE "stub".  */
+      switch (TREE_CODE (decl))
+	{
+	case TRANSLATION_UNIT_DECL:
+	  if (flag_wpa)
+	    {
+	      /* Keep the 1:1 association during WPA.  */
+	      die = new_die (DW_TAG_compile_unit, NULL, decl);
+	      break;
+	    }
+	  {
+	    die = comp_unit_die ();
+	    dw_die_ref import = new_die (DW_TAG_imported_unit, die, NULL_TREE);
+	    add_AT_external_die_ref (import, DW_AT_import, sym, off);
+	    /* We re-target all CU decls to the LTRANS CU DIE, so no need
+	       to create a DIE for the original CUs.  */
+	    return;
+	  }
+	case NAMESPACE_DECL:
+	  /* ???  LANG issue - DW_TAG_module for fortran.  */
+	  die = new_die (DW_TAG_namespace, parent, decl);
+	  break;
+	case FUNCTION_DECL:
+	  die = new_die (DW_TAG_subprogram, parent, decl);
+	  break;
+	case VAR_DECL:
+	case RESULT_DECL:
+	  die = new_die (DW_TAG_variable, parent, decl);
+	  break;
+	case PARM_DECL:
+	  die = new_die (DW_TAG_formal_parameter, parent, decl);
+	  break;
+	case BLOCK:
+	  die = new_die (DW_TAG_lexical_block, parent, decl);
+	  break;
+	default:
+	  gcc_unreachable ();
+	}
+      die->dumped_early = true;
+      if (TREE_CODE (decl) == BLOCK)
+	BLOCK_DIE (decl) = die;
+      else
+	equate_decl_number_to_die (decl, die);
+    }
+
+  /* Add a reference to the DIE providing early debug at $sym + off.  */
+  add_AT_external_die_ref (die, DW_AT_abstract_origin, sym, off);
 }
 
 /* Returns a hash value for X (which really is a var_loc_list).  */
@@ -5521,7 +5689,11 @@  print_dw_val (dw_val_node *val, bool recurse, FILE *outfile)
 			       die->die_id.die_type_node->signature);
 	    }
 	  else if (die->die_id.die_symbol)
-	    fprintf (outfile, "die -> label: %s", die->die_id.die_symbol);
+	    {
+	      fprintf (outfile, "die -> label: %s", die->die_id.die_symbol);
+	      if (die->with_offset)
+		fprintf (outfile, " + %ld", die->die_offset);
+	    }
 	  else
 	    fprintf (outfile, "die -> %ld", die->die_offset);
 	  fprintf (outfile, " (%p)", (void *) die);
@@ -6784,7 +6956,7 @@  static unsigned int comdat_symbol_number;
    children, and set comdat_symbol_id accordingly.  */
 
 static void
-compute_section_prefix (dw_die_ref unit_die)
+compute_section_prefix_1 (dw_die_ref unit_die, bool comdat_p)
 {
   const char *die_name = get_AT_string (unit_die, DW_AT_name);
   const char *base = die_name ? lbasename (die_name) : "anonymous";
@@ -6813,7 +6985,15 @@  compute_section_prefix (dw_die_ref unit_die)
       p += 2;
     }
 
-  comdat_symbol_id = unit_die->die_id.die_symbol = xstrdup (name);
+  unit_die->die_id.die_symbol = xstrdup (name);
+  unit_die->comdat_type_p = comdat_p;
+}
+
+static void
+compute_section_prefix (dw_die_ref unit_die)
+{
+  compute_section_prefix_1 (unit_die, true);
+  comdat_symbol_id = unit_die->die_id.die_symbol;
   comdat_symbol_number = 0;
 }
 
@@ -8935,7 +9115,11 @@  output_die (dw_die_ref die)
 
   /* If someone in another CU might refer to us, set up a symbol for
      them to point to.  */
-  if (! die->comdat_type_p && die->die_id.die_symbol)
+  if (! die->comdat_type_p && die->die_id.die_symbol
+      /* Don't output the symbol twice.  For LTO we want the label
+         on the section beginning, not on the actual DIE.  */
+      && (!flag_generate_lto || flag_fat_lto_objects
+	  || die->die_tag != DW_TAG_compile_unit))
     output_die_symbol (die);
 
   dw2_asm_output_data_uleb128 (die->die_abbrev, "(DIE (%#lx) %s)",
@@ -9114,8 +9298,13 @@  output_die (dw_die_ref die)
 		    size = DWARF2_ADDR_SIZE;
 		  else
 		    size = DWARF_OFFSET_SIZE;
-		  dw2_asm_output_offset (size, sym, debug_info_section, "%s",
-					 name);
+		  if (AT_ref (a)->with_offset)
+		    dw2_asm_output_offset_offset
+			(size, sym, AT_ref (a)->die_offset,
+			 debug_info_section, "%s", name);
+		  else
+		    dw2_asm_output_offset (size, sym, debug_info_section, "%s",
+					   name);
 		}
 	    }
 	  else
@@ -9266,7 +9455,7 @@  output_comp_unit (dw_die_ref die, int output_if_empty)
   calc_die_sizes (die);
 
   oldsym = die->die_id.die_symbol;
-  if (oldsym)
+  if (oldsym && die->comdat_type_p)
     {
       tmp = XALLOCAVEC (char, strlen (oldsym) + 24);
 
@@ -9282,6 +9471,26 @@  output_comp_unit (dw_die_ref die, int output_if_empty)
       info_section_emitted = true;
     }
 
+  /* For LTO cross unit DIE refs we want a symbol on the start of the
+     debuginfo section, not on the CU DIE.
+     ???  We could simply use the symbol output by output_die and account
+     for the extra offset produced by the CU header (which has fixed size?).  */
+  if (flag_generate_lto && !flag_fat_lto_objects)
+    {
+      gcc_assert (oldsym);
+      /* ???  No way to get visibility assembled without a decl.  */
+      tree decl = build_decl (UNKNOWN_LOCATION, VAR_DECL,
+			      get_identifier (oldsym), char_type_node);
+      TREE_PUBLIC (decl) = true;
+      TREE_STATIC (decl) = true;
+      DECL_ARTIFICIAL (decl) = true;
+      DECL_VISIBILITY (decl) = VISIBILITY_HIDDEN;
+      DECL_VISIBILITY_SPECIFIED (decl) = true;
+      targetm.asm_out.assemble_visibility (decl, VISIBILITY_HIDDEN);
+      targetm.asm_out.globalize_label (asm_out_file, oldsym);
+      ASM_OUTPUT_LABEL (asm_out_file, oldsym);
+    }
+
   /* Output debugging information.  */
   output_compilation_unit_header ();
   output_die (die);
@@ -16785,7 +16994,10 @@  add_scalar_info (dw_die_ref die, enum dwarf_attribute attr, tree value,
 
       if (decl != NULL_TREE)
 	{
-	  dw_die_ref decl_die = lookup_decl_die (decl);
+	  /* ???  Force a DIE for the decl - this is probably
+	     artificially generated by gimplifying type sizes and thus
+	     early debug wouldn't have created a DIE for it because.  */
+	  dw_die_ref decl_die = force_decl_die (decl);
 
 	  /* ??? Can this happen, or should the variable have been bound
 	     first?  Probably it can, since I imagine that we try to create
@@ -18790,6 +19002,8 @@  gen_subprogram_die (tree decl, dw_die_ref context_die)
 	      };  */
 	   || !is_cu_die (context_die))
 	   && (DECL_ARTIFICIAL (decl)
+	       // Unless we do this we can't prune file/line in prune_dies
+	       || get_AT_ref (old_die, DW_AT_abstract_origin)
 	       || (get_AT_file (old_die, DW_AT_decl_file) == file_index
 		   && (get_AT_unsigned (old_die, DW_AT_decl_line)
 		       == (unsigned) s.line))))
@@ -19596,7 +19810,6 @@  gen_variable_die (tree decl, tree origin, dw_die_ref context_die)
 
 		 See dwarf2out_decl and its use of
 		 local_function_static to see how this happened.  */
-	      gcc_assert (local_function_static (decl));
 	      var_die = old_die;
 	      goto gen_variable_die_location;
 	    }
@@ -19946,6 +20159,20 @@  gen_lexical_block_die (tree stmt, dw_die_ref context_die)
     {
       /* If this is an inlined instance, create a new lexical die for
 	 anything below to attach DW_AT_abstract_origin to.  */
+      /* ???  Between early and late dwarf we get confused here
+         by set_block_origin_self () called by dwarf2out_function_decl
+	 via gen_decl_die
+
+	   If we're emitting an out-of-line copy of an inline function,
+	   emit info for the abstract instance and set up to refer to it.
+       
+	 which ends up calling dwarf2out_abstract_function (late)
+	 and then set_decl_origin_self.  This should have been discovered
+	 early already.  Or rather we do have the abstract instance
+	 output early already, no need to do that again.  So
+	 dwarf2out_abstract_function should take the early output
+	 DIEs and create a duplicate that just refers back to the
+	 "abstract" early instance?  */
       stmt_die = new_die (DW_TAG_lexical_block, context_die, stmt);
     }
   else
@@ -21036,6 +21263,31 @@  gen_block_die (tree stmt, dw_die_ref context_die)
     decls_for_scope (stmt, context_die);
 }
 
+static tree
+process_vla_type (tree *tp, int *walk_subtrees, void *ctx)
+{
+  /* ???  walk_type_fields doesn't walk TYPE_SIZE and friends and
+     while it walks TYPE_DOMAIN for arrays it doesn't walk
+     TYPE_MIN/MAX_VALUE.  Just special-case the ARRAY_TYPE domain
+     type case here for now.  */
+  if (TREE_CODE (*tp) == INTEGER_TYPE)
+    {
+      if (TREE_CODE (TYPE_MIN_VALUE (*tp)) == VAR_DECL
+	  && DECL_ARTIFICIAL (TYPE_MIN_VALUE (*tp))
+	  && !DECL_IGNORED_P (TYPE_MIN_VALUE (*tp)))
+	gen_decl_die (TYPE_MIN_VALUE (*tp), NULL_TREE, (dw_die_ref) ctx);
+      if (TREE_CODE (TYPE_MAX_VALUE (*tp)) == VAR_DECL
+	  && DECL_ARTIFICIAL (TYPE_MAX_VALUE (*tp))
+	  && !DECL_IGNORED_P (TYPE_MAX_VALUE (*tp)))
+	gen_decl_die (TYPE_MAX_VALUE (*tp), NULL_TREE, (dw_die_ref) ctx);
+    }
+
+  if (!TYPE_P (*tp))
+    *walk_subtrees = 0;
+
+  return NULL_TREE;
+}
+
 /* Process variable DECL (or variable with origin ORIGIN) within
    block STMT and add it to CONTEXT_DIE.  */
 static void
@@ -21061,7 +21313,44 @@  process_scope_var (tree stmt, tree decl, tree origin, dw_die_ref context_die)
 					     stmt, context_die);
     }
   else
-    gen_decl_die (decl, origin, context_die);
+    {
+      if (decl && DECL_P (decl))
+	die = lookup_decl_die (decl);
+
+      if (in_lto_p
+	  && die && die->die_parent != context_die)
+	{
+	  /* ???  For non-LTO operation we do not want to get here via
+	     dwarf2out_abstract_function / set_decl_origin_self which
+	     ends up modifying the tree rep in some odd way instead
+	     of just playing with the DIEs.  */
+	  /* We associate vars with their DECL_CONTEXT first which misses
+	     their BLOCK association.  Move them.  */
+	  gcc_assert (die->die_parent != NULL);
+	  /* ???  Moving is expensive.  Better fix DECL_CONTEXT?  */
+	  dw_die_ref prev = die->die_parent->die_child;
+	  while (prev->die_sib != die)
+	    prev = prev->die_sib;
+	  remove_child_with_prev (die, prev);
+	  add_child_die (context_die, die);
+	}
+
+      if (in_lto_p
+	  && TREE_CODE (decl) == VAR_DECL
+	  && variably_modified_type_p (TREE_TYPE (decl), cfun->decl))
+	{
+	  /* We need to add location attributes to decls referred to
+	     from the decls type but we don't have DIEs for the type
+	     itself materialized.  The decls are also not part of the
+	     functions BLOCK tree (because they are artificial).  */
+	  walk_tree (&TREE_TYPE (decl), process_vla_type, NULL, NULL);
+	}
+
+      /* ???  The following gets stray type DIEs created even for decls
+	 that were created early.  */
+
+      gen_decl_die (decl, origin, context_die);
+    }
 }
 
 /* Generate all of the decls declared within a given scope and (recursively)
@@ -21442,6 +21731,9 @@  gen_decl_die (tree decl, tree origin, dw_die_ref context_die)
 
       /* If we're emitting an out-of-line copy of an inline function,
 	 emit info for the abstract instance and set up to refer to it.  */
+      /* ???  We have output an abstract instance early already and
+         could just re-use that.  This is how LTO treats all functions
+	 for example.  */
       else if (cgraph_function_possibly_inlined_p (decl)
 	       && ! DECL_ABSTRACT_P (decl)
 	       && ! class_or_namespace_scope_p (context_die)
@@ -21455,7 +21747,11 @@  gen_decl_die (tree decl, tree origin, dw_die_ref context_die)
 	}
 
       /* Otherwise we're emitting the primary DIE for this decl.  */
-      else if (debug_info_level > DINFO_LEVEL_TERSE)
+      else if (debug_info_level > DINFO_LEVEL_TERSE
+	       /* Do not generate stray type DIEs in late LTO dumping.  */
+	       && !(decl
+		    && lookup_decl_die (decl)
+		    && lookup_decl_die (decl)->dumped_early))
 	{
 	  /* Before we describe the FUNCTION_DECL itself, make sure that we
 	     have its containing type.  */
@@ -21522,6 +21818,12 @@  gen_decl_die (tree decl, tree origin, dw_die_ref context_die)
       if (debug_info_level <= DINFO_LEVEL_TERSE)
 	break;
 
+	{
+      /* ???  Avoid generating stray type DIEs during late LTO dwarf dumping.
+         All types have been dumped early.  */
+      dw_die_ref die = decl ? lookup_decl_die (decl) : NULL;
+      if (!die || !die->dumped_early)
+	{
       /* Output any DIEs that are needed to specify the type of this data
 	 object.  */
       if (decl_by_reference_p (decl_or_origin))
@@ -21533,6 +21835,8 @@  gen_decl_die (tree decl, tree origin, dw_die_ref context_die)
       class_origin = decl_class_context (decl_or_origin);
       if (class_origin != NULL_TREE)
 	gen_type_die_for_member (class_origin, decl_or_origin, context_die);
+	}
+	}
 
       /* And its containing namespace.  */
       context_die = declare_in_namespace (decl_or_origin, context_die);
@@ -24516,8 +24820,15 @@  resolve_addr (dw_die_ref die)
 		&& DECL_EXTERNAL (tdecl)
 		&& DECL_ABSTRACT_ORIGIN (tdecl) == NULL_TREE)
 	      {
-		force_decl_die (tdecl);
-		tdie = lookup_decl_die (tdecl);
+		/* Creating a full DIE for tdecl is overly expensive and
+		   at this point (with early debug active) even wrong
+		   as it can end up generating new type DIEs we didn't
+		   output.  */
+		tdie = new_die (DW_TAG_subprogram, comp_unit_die (), NULL_TREE);
+		add_AT_flag (tdie, DW_AT_external, 1);
+		add_AT_flag (tdie, DW_AT_declaration, 1);
+		add_linkage_attr (tdie, tdecl);
+		equate_decl_number_to_die (tdecl, tdie);
 	      }
 	    if (tdie)
 	      {
@@ -25087,6 +25398,82 @@  dwarf2out_finish (const char *filename)
   comdat_type_node *ctnode;
   dw_die_ref main_comp_unit_die;
 
+  /* Create parent - child relationship if we streamed in DIE reference
+     stubs.  */
+  if (in_lto_p)
+    {
+      bool changed;
+      do
+	{
+	  changed = false;
+	  for (limbo_die_node **node = &limbo_die_list;
+	       *node; )
+	    {
+	      if ((*node)->die->die_tag == DW_TAG_compile_unit)
+		{
+		  if ((*node)->die == comp_unit_die ())
+		    *node = (*node)->next;
+		  else
+		    node = &(*node)->next;
+		  continue;
+		}
+
+	      tree t = (*node)->created_for;
+	      tree context = NULL_TREE;
+	      if (DECL_P (t))
+		context = DECL_CONTEXT (t);
+	      else if (TREE_CODE (t) == BLOCK)
+		context = BLOCK_SUPERCONTEXT (t);
+	      /* ???  We don't output DIEs for wrapping scopes.  Skip
+		 as many DIEs as needed.  */
+	      if (context)
+		while (TREE_CODE (context) == BLOCK
+		       && !BLOCK_DIE (context))
+		  context = BLOCK_SUPERCONTEXT (context);
+	      /* ???  We fail to re-parent BLOCK scope variables as
+	         dependent on frontend their DECL_CONTEXT is not the
+		 containing scope but the containing function.  */
+	      if (context && DECL_P (context))
+		{
+		  dw_die_ref ctx = lookup_decl_die (context);
+		  if (ctx)
+		    {
+		      add_child_die (ctx, (*node)->die);
+		      *node = (*node)->next;
+		      changed = true;
+		      continue;
+		    }
+		}
+	      else if (context && TREE_CODE (context) == BLOCK)
+		{
+		  dw_die_ref ctx = BLOCK_DIE (context);
+		  if (ctx)
+		    {
+		      add_child_die (ctx, (*node)->die);
+		      *node = (*node)->next;
+		      changed = true;
+		      continue;
+		    }
+		}
+	      else if (!context)
+		{
+		  /* ???  In some cases the C++ FE (at least) fails to
+		     set DECL_CONTEXT properly.  Simply globalize stuff
+		     in this case.  For example
+		     __dso_handle created via iostream line 74 col 25.  */
+		  gcc_assert ((*node)->die->dumped_early);
+		  add_child_die (comp_unit_die (), (*node)->die);
+		  *node = (*node)->next;
+		  changed = true;
+		  continue;
+		}
+
+	      node = &(*node)->next;
+	    }
+	}
+      while (changed);
+    }
+
   /* If the limbo list has anything, it should be things that were
      created after the compilation proper.  Anything from the early
      dwarf pass, should have parents and should never be in the limbo
@@ -25096,36 +25483,7 @@  dwarf2out_finish (const char *filename)
 		|| !node->die->dumped_early);
 
   /* Flush out any latecomers to the limbo party.  */
-  dwarf2out_early_finish();
-
-  /* PCH might result in DW_AT_producer string being restored from the
-     header compilation, so always fill it with empty string initially
-     and overwrite only here.  */
-  dw_attr_ref producer = get_AT (comp_unit_die (), DW_AT_producer);
-  producer_string = gen_producer_string ();
-  producer->dw_attr_val.v.val_str->refcount--;
-  producer->dw_attr_val.v.val_str = find_AT_string (producer_string);
-
-  gen_scheduled_generic_parms_dies ();
-  gen_remaining_tmpl_value_param_die_attribute ();
-
-  /* Add the name for the main input file now.  We delayed this from
-     dwarf2out_init to avoid complications with PCH.
-     For LTO produced units use a fixed artificial name to avoid
-     leaking tempfile names into the dwarf.  */
-  if (!in_lto_p)
-    add_name_attribute (comp_unit_die (), remap_debug_filename (filename));
-  else
-    add_name_attribute (comp_unit_die (), "<artificial>");
-  if (!IS_ABSOLUTE_PATH (filename) || targetm.force_at_comp_dir)
-    add_comp_dir_attribute (comp_unit_die ());
-  else if (get_AT (comp_unit_die (), DW_AT_comp_dir) == NULL)
-    {
-      bool p = false;
-      file_table->traverse<bool *, file_table_relative_p> (&p);
-      if (p)
-	add_comp_dir_attribute (comp_unit_die ());
-    }
+  //dwarf2out_early_finish();
 
 #if ENABLE_ASSERT_CHECKING
   {
@@ -25134,44 +25492,6 @@  dwarf2out_finish (const char *filename)
   }
 #endif
   resolve_addr (comp_unit_die ());
-  move_marked_base_types ();
-
-  /* Walk through the list of incomplete types again, trying once more to
-     emit full debugging info for them.  */
-  retry_incomplete_types ();
-
-  if (flag_eliminate_unused_debug_types)
-    prune_unused_types ();
-
-  /* FIXME debug-early: Prune DIEs for unused decls.  */
-
-  /* Generate separate COMDAT sections for type DIEs. */
-  if (use_debug_types)
-    {
-      break_out_comdat_types (comp_unit_die ());
-
-      /* Each new type_unit DIE was added to the limbo die list when created.
-         Since these have all been added to comdat_type_list, clear the
-         limbo die list.  */
-      limbo_die_list = NULL;
-
-      /* For each new comdat type unit, copy declarations for incomplete
-         types to make the new unit self-contained (i.e., no direct
-         references to the main compile unit).  */
-      for (ctnode = comdat_type_list; ctnode != NULL; ctnode = ctnode->next)
-        copy_decls_for_unworthy_types (ctnode->root_die);
-      copy_decls_for_unworthy_types (comp_unit_die ());
-
-      /* In the process of copying declarations from one unit to another,
-         we may have left some declarations behind that are no longer
-         referenced.  Prune them.  */
-      prune_unused_types ();
-    }
-
-  /* Generate separate CUs for each of the include files we've seen.
-     They will go into limbo_die_list.  */
-  if (flag_eliminate_dwarf2_dups)
-    break_out_includes (comp_unit_die ());
 
   /* Traverse the DIE's and add add sibling attributes to those DIE's
      that have children.  */
@@ -25438,12 +25758,101 @@  dwarf2out_finish (const char *filename)
     output_indirect_strings ();
 }
 
+/* Reset DIEs so we can output them again, pruning those we do not
+   need late.  */
+
+static void
+prune_dies (dw_die_ref die, const char *sym)
+{
+  dw_die_ref c, prev;
+
+  /* Add an abstract origin to debug_info_section_label + offset.  */
+  if (die->die_tag == DW_TAG_variable
+      || die->die_tag == DW_TAG_formal_parameter
+      || die->die_tag == DW_TAG_subprogram
+      || die->die_tag == DW_TAG_lexical_block)
+      //|| die->die_tag == DW_TAG_compile_unit)  // gdb doesn't like this
+    {
+      /* Remove stuff we refer to via the abstract origin.  */
+#if 1 // we should be able to remove everything, but ...
+      vec_safe_truncate (die->die_attr, 0);
+#else
+      // at least remove references to types, names and locations
+      remove_AT (die, DW_AT_decl_file);
+      remove_AT (die, DW_AT_decl_line);
+      remove_AT (die, DW_AT_type);
+      remove_AT (die, DW_AT_name);
+#endif
+
+      /* Add a reference to the early output DIE.  */
+      add_AT_external_die_ref (die, DW_AT_abstract_origin,
+			       sym ? sym
+			       : IDENTIFIER_POINTER
+			          (get_identifier (debug_info_section_label)),
+			       die->die_offset);
+    }
+
+  /* Remove stuff we re-generate.  */
+  die->die_mark = 0;
+  die->die_offset = 0;
+  die->die_abbrev = 0;
+  remove_AT (die, DW_AT_sibling);
+
+  prev = die->die_child;
+  if (prev)
+    do {
+      c = prev->die_sib;
+      /* Remove non-decl DIEs.  */
+      if (c->die_tag != DW_TAG_variable
+	  && c->die_tag != DW_TAG_formal_parameter
+	  && c->die_tag != DW_TAG_subprogram
+	  && c->die_tag != DW_TAG_compile_unit
+	  && c->die_tag != DW_TAG_lexical_block)
+	{
+	  /* Remove DIE.  */
+	  bool was_last = (c == die->die_child);
+	  remove_child_with_prev (c, prev);
+	  /* TYPE_SYMTAB_DIE (and lookup_type_die) will still find
+	     the removed DIE.  Same for lookup_decl_die.  */
+	  c->removed = 1;
+	  /* Re-parent children so we'll visit them next.  */
+	  if (c->die_child)
+	    {
+	      dw_die_ref cc;
+	      FOR_EACH_CHILD (c, cc, cc->die_parent = c->die_parent);
+	      c->die_child->die_sib = prev->die_sib;
+	      prev->die_sib = c->die_child;
+	      continue;
+	    }
+	  if (was_last)
+	    break;
+	  continue;
+	}
+      else
+	{
+	  prune_dies (c, sym);
+	  prev = c;
+	}
+    } while (c != die->die_child);
+}
+
 /* Perform any cleanups needed after the early debug generation pass
    has run.  */
 
 static void
 dwarf2out_early_finish (void)
 {
+  comdat_type_node *ctnode;
+
+  /* Pick the first TRANSLATION_UNIT_DECL we didn't create a DIE for
+     and equate it with our default CU DIE.  LTO output needs to be
+     able to lookup DIEs for translation unit decls.  */
+  unsigned i;
+  tree decl;
+  FOR_EACH_VEC_SAFE_ELT (all_translation_units, i, decl)
+    if (!lookup_decl_die (decl))
+      equate_decl_number_to_die (decl, comp_unit_die ());
+
   /* Traverse the limbo die list, and add parent/child links.  The only
      dies without parents that should be here are concrete instances of
      inline functions, and the comp_unit_die.  We can ignore the comp_unit_die.
@@ -25496,6 +25905,185 @@  dwarf2out_early_finish (void)
     }
 
   limbo_die_list = NULL;
+
+  /* PCH might result in the DW_AT_producer string being restored from
+     the header compilation, so always fill it with an empty string
+     initially and overwrite it only here.  */
+  dw_attr_ref producer = get_AT (comp_unit_die (), DW_AT_producer);
+  producer_string = gen_producer_string ();
+  producer->dw_attr_val.v.val_str->refcount--;
+  producer->dw_attr_val.v.val_str = find_AT_string (producer_string);
+
+  gen_scheduled_generic_parms_dies ();
+  gen_remaining_tmpl_value_param_die_attribute ();
+
+  /* Add the name for the main input file now.  We delayed this from
+     dwarf2out_init to avoid complications with PCH.
+     For LTO produced units use a fixed artificial name to avoid
+     leaking tempfile names into the dwarf.  */
+  if (!in_lto_p)
+    add_name_attribute (comp_unit_die (), remap_debug_filename (main_input_filename));
+  else
+    add_name_attribute (comp_unit_die (), "<artificial>");
+  if (!IS_ABSOLUTE_PATH (main_input_filename) || targetm.force_at_comp_dir)
+    add_comp_dir_attribute (comp_unit_die ());
+  else if (get_AT (comp_unit_die (), DW_AT_comp_dir) == NULL)
+    {
+      bool p = false;
+      file_table->traverse<bool *, file_table_relative_p> (&p);
+      if (p)
+	add_comp_dir_attribute (comp_unit_die ());
+    }
+
+  move_marked_base_types ();
+
+  /* Walk through the list of incomplete types again, trying once more to
+     emit full debugging info for them.  */
+  retry_incomplete_types ();
+
+  if (flag_eliminate_unused_debug_types)
+    prune_unused_types ();
+
+  /* FIXME debug-early: Prune DIEs for unused decls.  */
+
+  /* Generate separate COMDAT sections for type DIEs.  */
+  if (use_debug_types)
+    {
+      break_out_comdat_types (comp_unit_die ());
+
+      /* Each new type_unit DIE was added to the limbo die list when created.
+         Since these have all been added to comdat_type_list, clear the
+         limbo die list.  */
+      limbo_die_list = NULL;
+
+      /* For each new comdat type unit, copy declarations for incomplete
+         types to make the new unit self-contained (i.e., no direct
+         references to the main compile unit).  */
+      for (ctnode = comdat_type_list; ctnode != NULL; ctnode = ctnode->next)
+        copy_decls_for_unworthy_types (ctnode->root_die);
+      copy_decls_for_unworthy_types (comp_unit_die ());
+
+      /* In the process of copying declarations from one unit to another,
+         we may have left some declarations behind that are no longer
+         referenced.  Prune them.  */
+      prune_unused_types ();
+    }
+
+  /* Generate separate CUs for each of the include files we've seen.
+     They will go into limbo_die_list.  */
+  if (flag_eliminate_dwarf2_dups)
+    break_out_includes (comp_unit_die ());
+
+
+  /* Do not generate DWARF assembler now when not producing LTO bytecode.
+     ???  The following code in principle handles split DWARF producing but
+     it has some issues (and it's likely not desired).  */
+  if (!flag_generate_lto)
+    return;
+
+
+  /* ???  Duplicated from dwarf2out_finish.  */
+
+  /* Traverse the DIEs and add sibling attributes to those DIEs
+     that have children.  */
+  add_sibling_attributes (comp_unit_die ());
+  for (node = limbo_die_list; node; node = node->next)
+    add_sibling_attributes (node->die);
+  for (ctnode = comdat_type_list; ctnode != NULL; ctnode = ctnode->next)
+    add_sibling_attributes (ctnode->root_die);
+
+  if (have_macinfo)
+    add_AT_macptr (comp_unit_die (),
+		   dwarf_strict ? DW_AT_macro_info : DW_AT_GNU_macros,
+		   macinfo_section_label);
+
+  save_macinfo_strings ();
+
+  /* Output all of the compilation units.  We put the main one last so that
+     the offsets are available to output_pubnames.  */
+  for (node = limbo_die_list; node; node = node->next)
+    output_comp_unit (node->die, 0);
+
+  hash_table<comdat_type_hasher> comdat_type_table (100);
+  for (ctnode = comdat_type_list; ctnode != NULL; ctnode = ctnode->next)
+    {
+      comdat_type_node **slot = comdat_type_table.find_slot (ctnode, INSERT);
+
+      /* Don't output duplicate types.  */
+      if (*slot != HTAB_EMPTY_ENTRY)
+        continue;
+
+      /* Add a pointer to the line table for the main compilation unit
+         so that the debugger can make sense of DW_AT_decl_file
+         attributes.  */
+      if (debug_info_level >= DINFO_LEVEL_TERSE)
+        add_AT_lineptr (ctnode->root_die, DW_AT_stmt_list,
+                        (!dwarf_split_debug_info
+                         ? debug_line_section_label
+                         : debug_skeleton_line_section_label));
+
+      output_comdat_type_unit (ctnode);
+      *slot = ctnode;
+    }
+
+  /* The AT_pubnames attribute needs to go in all skeleton dies, including
+     both the main_cu and all skeleton TUs.  Making this call unconditional
+     would end up either adding a second copy of the AT_pubnames attribute, or
+     requiring a special case in add_top_level_skeleton_die_attrs.  */
+  if (!dwarf_split_debug_info)
+    add_AT_pubnames (comp_unit_die ());
+
+  /* For LTO, attach a unique symbol to the main debuginfo section.  */
+  if (flag_generate_lto && !flag_fat_lto_objects)
+    compute_section_prefix_1 (comp_unit_die (), false);
+
+  /* Output the main compilation unit if non-empty or if .debug_macinfo
+     or .debug_macro will be emitted.  */
+  output_comp_unit (comp_unit_die (), have_macinfo);
+
+  /* Output the abbreviation table.  */
+  if (abbrev_die_table_in_use != 1)
+    {
+      switch_to_section (debug_abbrev_section);
+      ASM_OUTPUT_LABEL (asm_out_file, abbrev_section_label);
+      output_abbrev_section ();
+    }
+
+  /* Have to end the macro section.  */
+  if (have_macinfo)
+    {
+      switch_to_section (debug_macinfo_section);
+      ASM_OUTPUT_LABEL (asm_out_file, macinfo_section_label);
+      output_macinfo ();
+      dw2_asm_output_data (1, 0, "End compilation unit");
+    }
+
+
+  /* If we emitted any indirect strings, output the string table too.  */
+  if (debug_str_hash || skeleton_debug_str_hash)
+    output_indirect_strings ();
+
+
+  /* ???  Prune stuff so that dwarf2out_finish runs successfully.  */
+  /* ???  If generating LTO don't do this, we need the output state
+     to be preserved until LTO out.  This of course will break
+     fat LTO objects - no idea what to do for them but to emit
+     "split" info as well (and thus have another after_lto_emit hook
+     or simply do the pruning at regular dwarf2out_finish...).  */
+  if (!flag_generate_lto || flag_fat_lto_objects)
+    {
+      prune_dies (comp_unit_die (), comp_unit_die ()->die_id.die_symbol);
+      for (node = limbo_die_list; node; node = node->next)
+	prune_dies (node->die, node->die->die_id.die_symbol);
+    }
+
+  debug_str_hash = NULL;  /* Contents are GCed.  */
+
+  /* ???  Need new labels if we output some stuff early.  */
+  ASM_GENERATE_INTERNAL_LABEL (abbrev_section_label,
+			       DEBUG_ABBREV_SECTION_LABEL, 1);
+  ASM_GENERATE_INTERNAL_LABEL (debug_info_section_label,
+			       DEBUG_INFO_SECTION_LABEL, 1);
 }
 
 /* Reset all state within dwarf2out.c so that we can rerun the compiler
diff --git a/gcc/gimplify.c b/gcc/gimplify.c
index d822913..10a9318 100644
--- a/gcc/gimplify.c
+++ b/gcc/gimplify.c
@@ -8994,6 +8994,12 @@  gimplify_one_sizepos (tree *expr_p, gimple_seq *stmt_p)
   *expr_p = unshare_expr (expr);
 
   gimplify_expr (expr_p, stmt_p, NULL, is_gimple_val, fb_rvalue);
+
+  /* The possibly generated temporary is interesting for debug information
+     to complete the VLA type sizes and bounds.  Clear DECL_IGNORED_P.  */
+  if (TREE_CODE (*expr_p) == VAR_DECL
+      && DECL_ARTIFICIAL (*expr_p))
+    DECL_IGNORED_P (*expr_p) = false;
 }
 
 /* Gimplify the body of statements of FNDECL and return a GIMPLE_BIND node
diff --git a/gcc/lto-streamer-in.c b/gcc/lto-streamer-in.c
index a045b97..eee9e4f 100644
--- a/gcc/lto-streamer-in.c
+++ b/gcc/lto-streamer-in.c
@@ -1181,6 +1181,8 @@  lto_input_variable_constructor (struct lto_file_decl_data *file_data,
 }
 
 
+vec<dref_entry> dref_queue;
+
 /* Read the physical representation of a tree node EXPR from
    input block IB using the per-file context in DATA_IN.  */
 
@@ -1201,6 +1203,23 @@  lto_read_tree_1 (struct lto_input_block *ib, struct data_in *data_in, tree expr)
       && TREE_CODE (expr) != TRANSLATION_UNIT_DECL)
     DECL_INITIAL (expr) = stream_read_tree (ib, data_in);
 
+  /* Stream references to early generated DIEs.  Keep in sync with the
+     trees handled in dwarf2out.c:register_die_ref_for_decl.  */
+  if ((DECL_P (expr)
+       && TREE_CODE (expr) != FIELD_DECL
+       && TREE_CODE (expr) != DEBUG_EXPR_DECL
+       && TREE_CODE (expr) != TYPE_DECL)
+      || TREE_CODE (expr) == BLOCK)
+    {
+      const char *str = streamer_read_string (data_in, ib);
+      if (str)
+	{
+	  unsigned HOST_WIDE_INT off = streamer_read_uhwi (ib);
+	  dref_entry e = { expr, str, off };
+	  dref_queue.safe_push (e);
+	}
+    }
+
   /* We should never try to instantiate an MD or NORMAL builtin here.  */
   if (TREE_CODE (expr) == FUNCTION_DECL)
     gcc_assert (!streamer_handle_as_builtin_p (expr));
@@ -1362,6 +1381,15 @@  lto_input_tree (struct lto_input_block *ib, struct data_in *data_in)
     {
       unsigned len, entry_len;
       lto_input_scc (ib, data_in, &len, &entry_len);
+
+      /* Register DECLs with the debuginfo machinery.  */
+      while (!dref_queue.is_empty ())
+	{
+	  extern void register_die_ref_for_decl (tree, const char *,
+						 unsigned HOST_WIDE_INT);
+	  dref_entry e = dref_queue.pop ();
+	  register_die_ref_for_decl (e.decl, e.sym, e.off);
+	}
     }
   return lto_input_tree_1 (ib, data_in, tag, 0);
 }
diff --git a/gcc/lto-streamer-out.c b/gcc/lto-streamer-out.c
index 671bac3..fa36363 100644
--- a/gcc/lto-streamer-out.c
+++ b/gcc/lto-streamer-out.c
@@ -400,6 +400,28 @@  lto_write_tree_1 (struct output_block *ob, tree expr, bool ref_p)
 			 (ob->decl_state->symtab_node_encoder, expr);
       stream_write_tree (ob, initial, ref_p);
     }
+
+  /* Stream references to early generated DIEs.  Keep in sync with the
+     trees handled in dwarf2out.c:register_die_ref_for_decl.  */
+  if ((DECL_P (expr)
+       && TREE_CODE (expr) != FIELD_DECL
+       && TREE_CODE (expr) != DEBUG_EXPR_DECL
+       && TREE_CODE (expr) != TYPE_DECL)
+      || TREE_CODE (expr) == BLOCK)
+    {
+      extern bool lookup_die_ref_for_decl (tree, const char **,
+					   unsigned HOST_WIDE_INT *);
+      const char *sym;
+      unsigned HOST_WIDE_INT off;
+      if (debug_info_level > DINFO_LEVEL_NONE
+	  && lookup_die_ref_for_decl (expr, &sym, &off))
+	{
+	  streamer_write_string (ob, ob->main_stream, sym, true);
+	  streamer_write_uhwi (ob, off);
+	}
+      else
+	streamer_write_string (ob, ob->main_stream, NULL, true);
+    }
 }
 
 /* Write a physical representation of tree node EXPR to output block
diff --git a/gcc/lto-streamer.h b/gcc/lto-streamer.h
index c8862a2..7f7fd09 100644
--- a/gcc/lto-streamer.h
+++ b/gcc/lto-streamer.h
@@ -1154,4 +1154,13 @@  DEFINE_DECL_STREAM_FUNCS (TYPE_DECL, type_decl)
 DEFINE_DECL_STREAM_FUNCS (NAMESPACE_DECL, namespace_decl)
 DEFINE_DECL_STREAM_FUNCS (LABEL_DECL, label_decl)
 
+struct dref_entry {
+    tree decl;
+    const char *sym;
+    unsigned HOST_WIDE_INT off;
+};
+
+extern vec<dref_entry> dref_queue;
+
+
 #endif /* GCC_LTO_STREAMER_H  */
diff --git a/gcc/lto-wrapper.c b/gcc/lto-wrapper.c
index 404cb68..30fd736 100644
--- a/gcc/lto-wrapper.c
+++ b/gcc/lto-wrapper.c
@@ -892,8 +892,8 @@  run_gcc (unsigned argc, char *argv[])
   int new_head_argc;
   bool have_lto = false;
   bool have_offload = false;
-  unsigned lto_argc = 0, offload_argc = 0;
-  char **lto_argv, **offload_argv;
+  unsigned lto_argc = 0, ltoobj_argc = 0, offload_argc = 0;
+  char **lto_argv, **ltoobj_argv, **offload_argv;
 
   /* Get the driver and options.  */
   collect_gcc = getenv ("COLLECT_GCC");
@@ -912,6 +912,7 @@  run_gcc (unsigned argc, char *argv[])
   /* Allocate arrays for input object files with LTO or offload IL,
      and for possible preceding arguments.  */
   lto_argv = XNEWVEC (char *, argc);
+  ltoobj_argv = XNEWVEC (char *, argc);
   offload_argv = XNEWVEC (char *, argc);
 
   /* Look at saved options in the IL files.  */
@@ -946,7 +947,7 @@  run_gcc (unsigned argc, char *argv[])
 				  collect_gcc))
 	{
 	  have_lto = true;
-	  lto_argv[lto_argc++] = argv[i];
+	  ltoobj_argv[ltoobj_argc++] = argv[i];
 	}
 
       if (find_and_merge_options (fd, file_offset, OFFLOAD_SECTION_NAME_PREFIX,
@@ -1147,9 +1148,12 @@  run_gcc (unsigned argc, char *argv[])
         obstack_ptr_grow (&argv_obstack, "-fwpa");
     }
 
-  /* Append the input objects and possible preceding arguments.  */
+  /* Append input arguments.  */
   for (i = 0; i < lto_argc; ++i)
     obstack_ptr_grow (&argv_obstack, lto_argv[i]);
+  /* Append the input objects.  */
+  for (i = 0; i < ltoobj_argc; ++i)
+    obstack_ptr_grow (&argv_obstack, ltoobj_argv[i]);
   obstack_ptr_grow (&argv_obstack, NULL);
 
   new_argv = XOBFINISH (&argv_obstack, const char **);
@@ -1158,6 +1162,9 @@  run_gcc (unsigned argc, char *argv[])
 
   if (lto_mode == LTO_MODE_LTO)
     {
+      /* Re-link the LTO input objects for their early debuginfo section.  */
+      for (i = 0; i < ltoobj_argc; ++i)
+	printf ("%s\n", ltoobj_argv[i]);
       printf ("%s\n", flto_out);
       free (flto_out);
       flto_out = NULL;
@@ -1307,10 +1314,12 @@  cont:
 	  for (i = 0; i < nr; ++i)
 	    maybe_unlink (input_names[i]);
 	}
+      /* Re-link the LTO input objects for their early debuginfo section.  */
+      for (i = 0; i < ltoobj_argc; ++i)
+	printf ("%s\n", ltoobj_argv[i]);
       for (i = 0; i < nr; ++i)
 	{
-	  fputs (output_names[i], stdout);
-	  putc ('\n', stdout);
+	  printf ("%s\n", output_names[i]);
 	  free (input_names[i]);
 	}
       nr = 0;
diff --git a/gcc/lto/lto.c b/gcc/lto/lto.c
index 760975f..b3d71fe 100644
--- a/gcc/lto/lto.c
+++ b/gcc/lto/lto.c
@@ -1838,6 +1838,9 @@  unify_scc (struct streamer_tree_cache_d *cache, unsigned from,
 	      ggc_free (scc->entries[i]);
 	    }
 
+	  /* Drop DIE references.  */
+	  dref_queue.truncate (0);
+
 	  break;
 	}
 
@@ -1914,7 +1917,6 @@  lto_read_decls (struct lto_file_decl_data *decl_data, const void *data,
 	  if (len == 1
 	      && (TREE_CODE (first) == IDENTIFIER_NODE
 		  || TREE_CODE (first) == INTEGER_CST
-		  || TREE_CODE (first) == TRANSLATION_UNIT_DECL
 		  || streamer_handle_as_builtin_p (first)))
 	    continue;
 
@@ -1950,10 +1952,6 @@  lto_read_decls (struct lto_file_decl_data *decl_data, const void *data,
 	      if (TREE_CODE (t) == INTEGER_CST
 		  && !TREE_OVERFLOW (t))
 		cache_integer_cst (t);
-	      /* Register TYPE_DECLs with the debuginfo machinery.  */
-	      if (!flag_wpa
-		  && TREE_CODE (t) == TYPE_DECL)
-		debug_hooks->type_decl (t, !DECL_FILE_SCOPE_P (t));
 	      if (!flag_ltrans)
 		{
 		  /* Register variables and functions with the
@@ -1969,6 +1967,16 @@  lto_read_decls (struct lto_file_decl_data *decl_data, const void *data,
 		    vec_safe_push (tree_with_vars, t);
 		}
 	    }
+
+	  /* Register DECLs with the debuginfo machinery.  */
+	  while (!dref_queue.is_empty ())
+	    {
+	      extern void register_die_ref_for_decl (tree, const char *,
+						     unsigned HOST_WIDE_INT);
+	      dref_entry e = dref_queue.pop ();
+	      register_die_ref_for_decl (e.decl, e.sym, e.off);
+	    }
+
 	  if (seen_type)
 	    num_type_scc_trees += len;
 	}
diff --git a/gcc/passes.c b/gcc/passes.c
index f122ef0..50d1595 100644
--- a/gcc/passes.c
+++ b/gcc/passes.c
@@ -297,7 +297,7 @@  rest_of_decl_compilation (tree decl,
      be handled by either handling reachable functions from
      finalize_compilation_unit (and by consequence, locally scoped
      symbols), or by rest_of_type_compilation below.  */
-  if (!flag_wpa
+  if (!in_lto_p
 	&& TREE_CODE (decl) != FUNCTION_DECL
       && !decl_function_context (decl)
       && !current_function_decl
diff --git a/gcc/testsuite/lib/gcc-dg.exp b/gcc/testsuite/lib/gcc-dg.exp
index 4fa433d..84c122b 100644
--- a/gcc/testsuite/lib/gcc-dg.exp
+++ b/gcc/testsuite/lib/gcc-dg.exp
@@ -93,7 +93,6 @@  if [check_effective_target_lto] {
     # path.
     if [check_linker_plugin_available] {
       set LTO_TORTURE_OPTIONS [list \
-	  { -O2 -flto -fno-use-linker-plugin -flto-partition=none } \
 	  { -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects }
       ]
       set gcc_force_conventional_output "-ffat-lto-objects"
diff --git a/gcc/toplev.c b/gcc/toplev.c
index 5e3bd08..28be59a 100644
--- a/gcc/toplev.c
+++ b/gcc/toplev.c
@@ -622,24 +622,21 @@  compile_file (void)
   if (seen_error ())
     return;
 
-  /* After the parser has generated debugging information, augment
-     this information with any new location/etc information that may
-     have become available after the compilation proper.  */
-  timevar_start (TV_PHASE_DBGINFO);
-  symtab_node *node;
-  FOR_EACH_DEFINED_SYMBOL (node)
-    debug_hooks->late_global_decl (node->decl);
-  timevar_stop (TV_PHASE_DBGINFO);
-
-  if (!in_lto_p && flag_dump_early_debug_stats)
-    dwarf2out_dump_early_debug_stats ();
-
-  timevar_start (TV_PHASE_LATE_ASM);
-
   /* Compilation unit is finalized.  When producing non-fat LTO object, we are
      basically finished.  */
   if (in_lto_p || !flag_lto || flag_fat_lto_objects)
     {
+      /* After the parser has generated debugging information, augment
+	 this information with any new location/etc information that may
+	 have become available after the compilation proper.  */
+      timevar_start (TV_PHASE_DBGINFO);
+      symtab_node *node;
+      FOR_EACH_DEFINED_SYMBOL (node)
+	debug_hooks->late_global_decl (node->decl);
+      timevar_stop (TV_PHASE_DBGINFO);
+
+      timevar_start (TV_PHASE_LATE_ASM);
+
       /* File-scope initialization for AddressSanitizer.  */
       if (flag_sanitize & SANITIZE_ADDRESS)
         asan_finish_file ();
@@ -681,6 +678,11 @@  compile_file (void)
       /* Flush any pending external directives.  */
       process_pending_assemble_externals ();
    }
+  else
+    timevar_start (TV_PHASE_LATE_ASM);
+
+  if (!in_lto_p && flag_dump_early_debug_stats)
+    dwarf2out_dump_early_debug_stats ();
 
   /* Emit LTO marker if LTO info has been previously emitted.  This is
      used by collect2 to determine whether an object file contains IL.
@@ -705,7 +707,10 @@  compile_file (void)
 
   /* Let linker plugin know that this is a slim object and must be LTOed
      even when user did not ask for it.  */
-  if (flag_generate_lto && !flag_fat_lto_objects)
+  /* ???  As we now put early debug info into even slim objects they
+     are kind-of "fat" again.  Make sure the linker doesn't refuse
+     to operate in non-LTO mode on them.  */
+  if (flag_generate_lto && !flag_fat_lto_objects && 0)
     {
 #if defined ASM_OUTPUT_ALIGNED_DECL_COMMON
       ASM_OUTPUT_ALIGNED_DECL_COMMON (asm_out_file, NULL_TREE, "__gnu_lto_slim",