Random cleanups [4/4]: Streamlining streamer

Message ID Pine.LNX.4.64.1103310332450.19760@wotan.suse.de
State New
Commit Message

Michael Matz March 31, 2011, 2:07 a.m. UTC
Hi,

I fear I wasn't as thorough in also splitting this one into several 
patches, but the different cleanups are at least mostly in different 
files.  They are:

* lto-lang remembers all builtin decls in a local list, to be returned
  by the getdecls langhook.  But as we have our own write_globals langhook
  this isn't actually called (except by dbxout.c), so there's no point in 
  remembering.

* lto.c:lto_materialize_function has code to read in the function body 
  sections, do something with them in non-wpa mode, and discard them then.
  There's no point in even reading them in in non-wpa mode (except for a 
  dubious error message that rather is worth an assert).

* gimple.c:gimple_type_leader_entry is a fixed length cache for speeding 
  up our type merging machinery.  It can hold references to many trees 
  that have meanwhile been merged, interfering with the wish to free up 
  much memory with a ggc_collect with early-merging LTO.  We can simply 
  make it deletable.

* ipa-inline.c: some tidying in not calling a macro with function call 
  arguments, and calling a costly function only after early-outs.

* lto-streamer-out.c: it writes out and compares strings character by 
  character.  memcmp and lto_output_data_stream work just as well.

* lto-streamer: output_unreferenced_globals writes out all global varpool 
  decls.  The reading side simply reads over all of them, and ignores 
  them.  This was supposed to help symbol resolution, and it probably once 
  did.  But for some time now we have properly emitted varpool and cgraph 
  nodes, references between them, and a proper symtab.  There's no need 
  for emitting these trees again.

* lto-streamer: the following changes the bytecode:
  1: all indices into the cache are unsigned, hence we should say 
     so, instead of casting back and forth
  2: trees are only appended to the cache, when writing out.  When reading
     in we read in all trees in the stream one after the other, also 
     appending to the cache.  References to existing trees _always_ are to 
     - well - existing trees, hence to those already emitted earlier in 
     the stream, i.e. with a smaller offset, and more importantly with a 
     known index even at reader side.

     So the offset is never used; remove it and all associated 
     tracking and params.
  3: for the same reason we also don't need to stream the index that new
     trees get in the cache.  They will get exactly the ones they also had 
     when writing out.  We could use it as consistency check, but we 
     stream the expected tree-node for back-references for that already.

     Obviously we do need to stream the index in back references (aka 
     pickled references).

     (the index could change if there's a different set of nodes preloaded 
     into the cache between writing out and reading in.  But that would 
     have much worse problems already, silently overwriting slots with 
     trees from the stream; we should do away with the preloaded nodes,
     and instead rely on type merging to get canonical versions of the 
     builtin trees)
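
The argument in points 2 and 3 can be illustrated with a toy model.  This
is only a sketch under stated assumptions: the toy_* names, the integer
"nodes" and the word-based stream are made up for illustration and are not
the GCC streamer API.  It shows that when both sides only ever append to
their cache, new nodes need no index in the stream, and a back reference
always names a slot the reader has already filled.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for the streamer; none of these names exist in GCC.  */
enum { TAG_NEW = 1, TAG_REF = 2, MAX_ITEMS = 64 };

struct toy_stream { unsigned words[2 * MAX_ITEMS]; size_t len; };
struct toy_cache  { unsigned nodes[MAX_ITEMS]; unsigned len; };

/* Append NODE to the cache and return the slot it landed in.  */
static unsigned
toy_cache_append (struct toy_cache *c, unsigned node)
{
  assert (c->len < MAX_ITEMS);
  c->nodes[c->len] = node;
  return c->len++;
}

/* Writer: emit NODE in full the first time, a back reference afterwards.
   A linear search keeps the sketch short.  */
static void
toy_write (struct toy_stream *s, struct toy_cache *c, unsigned node)
{
  unsigned ix;

  assert (s->len + 2 <= 2 * MAX_ITEMS);
  for (ix = 0; ix < c->len; ix++)
    if (c->nodes[ix] == node)
      {
	s->words[s->len++] = TAG_REF;
	s->words[s->len++] = ix;	/* Only back references carry an index.  */
	return;
      }

  toy_cache_append (c, node);
  s->words[s->len++] = TAG_NEW;
  s->words[s->len++] = node;		/* Payload only: no index, no offset.  */
}

/* Reader: consume one record at *POS, appending new nodes in stream order.  */
static unsigned
toy_read (const struct toy_stream *s, size_t *pos, struct toy_cache *c)
{
  unsigned tag, val;

  assert (*pos + 2 <= s->len);
  tag = s->words[(*pos)++];
  val = s->words[(*pos)++];
  if (tag == TAG_NEW)
    {
      toy_cache_append (c, val);
      return val;
    }

  /* A back reference must name a slot that was already filled.  */
  assert (tag == TAG_REF && val < c->len);
  return c->nodes[val];
}
```

Writing the sequence 10, 20, 10, 30, 20 through toy_write produces three
new-node records and two back references, and replaying the stream with
toy_read leaves the reader cache with the same three slots in the same
order as the writer cache.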

Not streaming offset and index for most trees obviously shortens the 
bytecode somewhat but I don't have statistics on how much.  Not much would 
be my guess.
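
For a rough feeling of the per-field cost: the patch drops one
output_sleb128 of the index from every newly written tree and one
output_uleb128 of the offset from every back reference.  Assuming those
helpers use the usual LEB128 variable-length encoding (the names suggest
so, but the sketch below is a generic encoder with a made-up name, not
GCC's implementation), each dropped field is at least one byte, and grows
by one byte for every further 7 bits of the value.

```c
#include <assert.h>
#include <stddef.h>

/* Encode VALUE as unsigned LEB128 into BUF and return the number of
   bytes written.  BUF needs room for 10 bytes to hold any 64-bit value.
   toy_uleb128 is a hypothetical helper, not a GCC function.  */
static size_t
toy_uleb128 (unsigned long long value, unsigned char *buf)
{
  size_t n = 0;

  do
    {
      unsigned char byte = value & 0x7f;	/* Low seven bits.  */
      value >>= 7;
      if (value != 0)
	byte |= 0x80;				/* More bytes follow.  */
      buf[n++] = byte;
    }
  while (value != 0);

  return n;
}
```

Under that assumption a stream offset below 16384 costs at most two bytes
per back reference, which is in line with the guess that the savings are
modest.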

Regstrapped on x86_64-linux with the other three cleanups.  Okay for 
trunk?


Ciao,
Michael.

Comments

Richard Biener March 31, 2011, 9:07 a.m. UTC | #1

I don't see a need (in this patch) to move lto_input_chain earlier, but
I suppose it doesn't matter and is needed by a followup.

Ok.

Thanks,
Richard.
Patch

Index: lto-streamer-out.c
===================================================================
--- lto-streamer-out.c.orig	2011-03-29 17:08:14.000000000 +0200
+++ lto-streamer-out.c	2011-03-29 17:12:07.000000000 +0200
@@ -70,13 +70,7 @@  eq_string_slot_node (const void *p1, con
   const struct string_slot *ds2 = (const struct string_slot *) p2;
 
   if (ds1->len == ds2->len)
-    {
-      int i;
-      for (i = 0; i < ds1->len; i++)
-	if (ds1->s[i] != ds2->s[i])
-	  return 0;
-      return 1;
-    }
+    return memcmp (ds1->s, ds2->s, ds1->len) == 0;
 
   return 0;
 }
@@ -181,7 +175,6 @@  lto_output_string_with_length (struct ou
       unsigned int start = string_stream->total_size;
       struct string_slot *new_slot
 	= (struct string_slot *) xmalloc (sizeof (struct string_slot));
-      unsigned int i;
 
       new_slot->s = string;
       new_slot->len = len;
@@ -189,12 +182,11 @@  lto_output_string_with_length (struct ou
       *slot = new_slot;
       lto_output_uleb128_stream (index_stream, start);
       lto_output_uleb128_stream (string_stream, len);
-      for (i = 0; i < len; i++)
-	lto_output_1_stream (string_stream, string[i]);
+      lto_output_data_stream (string_stream, string, len);
     }
   else
     {
-      struct string_slot *old_slot = (struct string_slot *)*slot;
+      struct string_slot *old_slot = *slot;
       lto_output_uleb128_stream (index_stream, old_slot->slot_num);
       free (string);
     }
@@ -1247,7 +1239,7 @@  lto_output_tree_pointers (struct output_
    where EXPR is stored.  REF_P is as in lto_output_tree.  */
 
 static void
-lto_output_tree_header (struct output_block *ob, tree expr, int ix)
+lto_output_tree_header (struct output_block *ob, tree expr)
 {
   enum LTO_tags tag;
   enum tree_code code;
@@ -1264,7 +1256,6 @@  lto_output_tree_header (struct output_bl
      variable sized nodes).  */
   tag = lto_tree_code_to_tag (code);
   output_record_start (ob, tag);
-  output_sleb128 (ob, ix);
 
   /* The following will cause bootstrap miscomparisons.  Enable with care.  */
 #ifdef LTO_STREAMER_DEBUG
@@ -1293,7 +1284,7 @@  lto_output_tree_header (struct output_bl
    the index into the streamer cache where EXPR is stored.*/
 
 static void
-lto_output_builtin_tree (struct output_block *ob, tree expr, int ix)
+lto_output_builtin_tree (struct output_block *ob, tree expr)
 {
   gcc_assert (lto_stream_as_builtin_p (expr));
 
@@ -1305,7 +1296,6 @@  lto_output_builtin_tree (struct output_b
   output_record_start (ob, LTO_builtin_decl);
   output_uleb128 (ob, DECL_BUILT_IN_CLASS (expr));
   output_uleb128 (ob, DECL_FUNCTION_CODE (expr));
-  output_sleb128 (ob, ix);
 
   if (DECL_ASSEMBLER_NAME_SET_P (expr))
     {
@@ -1330,13 +1320,13 @@  lto_output_builtin_tree (struct output_b
    where EXPR is stored.  */
 
 static void
-lto_write_tree (struct output_block *ob, tree expr, bool ref_p, int ix)
+lto_write_tree (struct output_block *ob, tree expr, bool ref_p)
 {
   struct bitpack_d bp;
 
   /* Write the header, containing everything needed to materialize
      EXPR on the reading side.  */
-  lto_output_tree_header (ob, expr, ix);
+  lto_output_tree_header (ob, expr);
 
   /* Pack all the non-pointer fields in EXPR into a bitpack and write
      the resulting bitpack.  */
@@ -1373,9 +1363,8 @@  lto_output_integer_cst (struct output_bl
 void
 lto_output_tree (struct output_block *ob, tree expr, bool ref_p)
 {
-  int ix;
+  unsigned ix;
   bool existed_p;
-  unsigned offset;
 
   if (expr == NULL_TREE)
     {
@@ -1391,22 +1380,15 @@  lto_output_tree (struct output_block *ob
       return;
     }
 
-  /* Determine the offset in the stream where EXPR will be written.
-     This is used when emitting pickle references so the reader knows
-     where to reconstruct the pickled object from.  This allows
-     circular and forward references within the same stream.  */
-  offset = ob->main_stream->total_size;
-
-  existed_p = lto_streamer_cache_insert (ob->writer_cache, expr, &ix, &offset);
+  existed_p = lto_streamer_cache_insert (ob->writer_cache, expr, &ix);
   if (existed_p)
     {
       /* If a node has already been streamed out, make sure that
 	 we don't write it more than once.  Otherwise, the reader
 	 will instantiate two different nodes for the same object.  */
       output_record_start (ob, LTO_tree_pickle_reference);
-      output_sleb128 (ob, ix);
+      output_uleb128 (ob, ix);
       output_uleb128 (ob, lto_tree_code_to_tag (TREE_CODE (expr)));
-      output_uleb128 (ob, offset);
     }
   else if (lto_stream_as_builtin_p (expr))
     {
@@ -1415,13 +1397,13 @@  lto_output_tree (struct output_block *ob
 	 compiler on startup.  The only builtins that need to
 	 be written out are BUILT_IN_FRONTEND.  For all other
 	 builtins, we simply write the class and code.  */
-      lto_output_builtin_tree (ob, expr, ix);
+      lto_output_builtin_tree (ob, expr);
     }
   else
     {
       /* This is the first time we see EXPR, write its fields
 	 to OB.  */
-      lto_write_tree (ob, expr, ref_p, ix);
+      lto_write_tree (ob, expr, ref_p);
     }
 }
 
@@ -2074,7 +2056,6 @@  output_unreferenced_globals (cgraph_node
   struct output_block *ob;
   alias_pair *p;
   unsigned i;
-  struct varpool_node *vnode;
   symbol_alias_set_t *defined;
   struct sets setdata;
 
@@ -2089,30 +2070,6 @@  output_unreferenced_globals (cgraph_node
   /* Make string 0 be a NULL string.  */
   lto_output_1_stream (ob->string_stream, 0);
 
-  /* Emit references for all the global symbols.  If a global symbol
-     was never referenced in any of the functions of this file, it
-     would not be emitted otherwise.  This will result in unreferenced
-     symbols at link time if a file defines a global symbol but
-     never references it.  */
-  FOR_EACH_STATIC_VARIABLE (vnode)
-   if (vnode->needed && varpool_node_in_set_p (vnode, vset))
-      {
-	tree var = vnode->decl;
-
-	if (TREE_CODE (var) == VAR_DECL)
-	  {
-	    /* Output the object in order to output references used in the
-	       initialization. */
-	    lto_output_tree (ob, var, true);
-
-	    /* If it is public we also need a reference to the object itself. */
-	    if (TREE_PUBLIC (var))
-	      lto_output_tree_ref (ob, var);
-	  }
-      }
-
-  output_zero (ob);
-
   /* We really need to propagate in both directoins:
      for normal aliases we propagate from first defined alias to
      all aliases defined based on it.  For weakrefs we propagate in
@@ -2316,19 +2273,19 @@  write_global_references (struct output_b
  			 struct lto_tree_ref_encoder *encoder)
 {
   tree t;
-  int32_t index;
-  const int32_t size = lto_tree_ref_encoder_size (encoder);
+  uint32_t index;
+  const uint32_t size = lto_tree_ref_encoder_size (encoder);
 
   /* Write size as 32-bit unsigned. */
   lto_output_data_stream (ref_stream, &size, sizeof (int32_t));
 
   for (index = 0; index < size; index++)
     {
-      int32_t slot_num;
+      uint32_t slot_num;
 
       t = lto_tree_ref_encoder_get_tree (encoder, index);
       lto_streamer_cache_lookup (ob->writer_cache, t, &slot_num);
-      gcc_assert (slot_num >= 0);
+      gcc_assert (slot_num != (unsigned)-1);
       lto_output_data_stream (ref_stream, &slot_num, sizeof slot_num);
     }
 }
@@ -2357,15 +2314,15 @@  lto_output_decl_state_refs (struct outpu
 			    struct lto_out_decl_state *state)
 {
   unsigned i;
-  int32_t ref;
+  uint32_t ref;
   tree decl;
 
   /* Write reference to FUNCTION_DECL.  If there is not function,
      write reference to void_type_node. */
   decl = (state->fn_decl) ? state->fn_decl : void_type_node;
   lto_streamer_cache_lookup (ob->writer_cache, decl, &ref);
-  gcc_assert (ref >= 0);
-  lto_output_data_stream (out_stream, &ref, sizeof (int32_t));
+  gcc_assert (ref != (unsigned)-1);
+  lto_output_data_stream (out_stream, &ref, sizeof (uint32_t));
 
   for (i = 0;  i < LTO_N_DECL_STREAMS; i++)
     write_global_references (ob, out_stream, &state->streams[i]);
@@ -2402,7 +2359,7 @@  write_symbol (struct lto_streamer_cache_
   const char *name;
   enum gcc_plugin_symbol_kind kind;
   enum gcc_plugin_symbol_visibility visibility;
-  int slot_num;
+  unsigned slot_num;
   uint64_t size;
   const char *comdat;
   unsigned char c;
@@ -2429,7 +2386,7 @@  write_symbol (struct lto_streamer_cache_
   pointer_set_insert (seen, name);
 
   lto_streamer_cache_lookup (cache, t, &slot_num);
-  gcc_assert (slot_num >= 0);
+  gcc_assert (slot_num != (unsigned)-1);
 
   if (DECL_EXTERNAL (t))
     {
@@ -2540,7 +2497,7 @@  produce_symtab (struct output_block *ob,
   memset (&stream, 0, sizeof (stream));
 
   /* Write all functions. 
-     First write all defined functions and the write all used functions.
+     First write all defined functions and then write all used functions.
      This is done so only to handle duplicated symbols in cgraph.  */
   for (i = 0; i < lto_cgraph_encoder_size (encoder); i++)
     {
Index: ipa-inline.c
===================================================================
--- ipa-inline.c.orig	2011-03-29 17:08:14.000000000 +0200
+++ ipa-inline.c	2011-03-29 17:12:07.000000000 +0200
@@ -519,13 +519,15 @@  static int
 cgraph_edge_badness (struct cgraph_edge *edge, bool dump)
 {
   gcov_type badness;
-  int growth =
-    (cgraph_estimate_size_after_inlining (edge->caller, edge->callee)
-     - edge->caller->global.size);
+  int growth;
 
   if (edge->callee->local.disregard_inline_limits)
     return INT_MIN;
 
+  growth =
+    (cgraph_estimate_size_after_inlining (edge->caller, edge->callee)
+     - edge->caller->global.size);
+
   if (dump)
     {
       fprintf (dump_file, "    Badness calculation for %s -> %s\n",
@@ -584,11 +586,11 @@  cgraph_edge_badness (struct cgraph_edge
       int growth_for_all;
       badness = growth * 10000;
       benefitperc =
-	MIN (100 * inline_summary (edge->callee)->time_inlining_benefit /
-	     (edge->callee->global.time + 1) +1, 100);
+	100 * inline_summary (edge->callee)->time_inlining_benefit
+	    / (edge->callee->global.time + 1) +1;
+      benefitperc = MIN (benefitperc, 100);
       div *= benefitperc;
 
-
       /* Decrease badness if call is nested.  */
       /* Compress the range so we don't overflow.  */
       if (div > 10000)
Index: lto-streamer-in.c
===================================================================
--- lto-streamer-in.c.orig	2011-03-29 17:08:14.000000000 +0200
+++ lto-streamer-in.c	2011-03-29 17:12:07.000000000 +0200
@@ -387,6 +387,33 @@  lto_input_tree_ref (struct lto_input_blo
 }
 
 
+/* Read a chain of tree nodes from input block IB. DATA_IN contains
+   tables and descriptors for the file being read.  */
+
+static tree
+lto_input_chain (struct lto_input_block *ib, struct data_in *data_in)
+{
+  int i, count;
+  tree first, prev, curr;
+
+  first = prev = NULL_TREE;
+  count = lto_input_sleb128 (ib);
+  for (i = 0; i < count; i++)
+    {
+      curr = lto_input_tree (ib, data_in);
+      if (prev)
+	TREE_CHAIN (prev) = curr;
+      else
+	first = curr;
+
+      TREE_CHAIN (curr) = NULL_TREE;
+      prev = curr;
+    }
+
+  return first;
+}
+
+
 /* Read and return a double-linked list of catch handlers from input
    block IB, using descriptors in DATA_IN.  */
 
@@ -1255,7 +1282,7 @@  input_function (tree fn_decl, struct dat
        oarg && narg;
        oarg = TREE_CHAIN (oarg), narg = TREE_CHAIN (narg))
     {
-      int ix;
+      unsigned ix;
       bool res;
       res = lto_streamer_cache_lookup (data_in->reader_cache, oarg, &ix);
       gcc_assert (res);
@@ -1343,11 +1370,6 @@  input_alias_pairs (struct lto_input_bloc
 
   clear_line_info (data_in);
 
-  /* Skip over all the unreferenced globals.  */
-  do
-    var = lto_input_tree (ib, data_in);
-  while (var);
-
   var = lto_input_tree (ib, data_in);
   while (var)
     {
@@ -1819,7 +1841,7 @@  unpack_value_fields (struct bitpack_d *b
 
 static tree
 lto_materialize_tree (struct lto_input_block *ib, struct data_in *data_in,
-		      enum LTO_tags tag, int *ix_p)
+		      enum LTO_tags tag)
 {
   struct bitpack_d bp;
   enum tree_code code;
@@ -1827,15 +1849,9 @@  lto_materialize_tree (struct lto_input_b
 #ifdef LTO_STREAMER_DEBUG
   HOST_WIDEST_INT orig_address_in_writer;
 #endif
-  HOST_WIDE_INT ix;
 
   result = NULL_TREE;
 
-  /* Read the header of the node we are about to create.  */
-  ix = lto_input_sleb128 (ib);
-  gcc_assert ((int) ix == ix);
-  *ix_p = (int) ix;
-
 #ifdef LTO_STREAMER_DEBUG
   /* Read the word representing the memory address for the tree
      as it was written by the writer.  This is useful when
@@ -1867,8 +1883,7 @@  lto_materialize_tree (struct lto_input_b
     }
   else
     {
-      /* All other nodes can be materialized with a raw make_node
-	 call.  */
+      /* All other nodes can be materialized with a raw make_node call.  */
       result = make_node (code);
     }
 
@@ -1895,39 +1910,12 @@  lto_materialize_tree (struct lto_input_b
   /* Enter RESULT in the reader cache.  This will make RESULT
      available so that circular references in the rest of the tree
      structure can be resolved in subsequent calls to lto_input_tree.  */
-  lto_streamer_cache_insert_at (data_in->reader_cache, result, ix);
+  lto_streamer_cache_append (data_in->reader_cache, result);
 
   return result;
 }
 
 
-/* Read a chain of tree nodes from input block IB. DATA_IN contains
-   tables and descriptors for the file being read.  */
-
-static tree
-lto_input_chain (struct lto_input_block *ib, struct data_in *data_in)
-{
-  int i, count;
-  tree first, prev, curr;
-
-  first = prev = NULL_TREE;
-  count = lto_input_sleb128 (ib);
-  for (i = 0; i < count; i++)
-    {
-      curr = lto_input_tree (ib, data_in);
-      if (prev)
-	TREE_CHAIN (prev) = curr;
-      else
-	first = curr;
-
-      TREE_CHAIN (curr) = NULL_TREE;
-      prev = curr;
-    }
-
-  return first;
-}
-
-
 /* Read all pointer fields in the TS_COMMON structure of EXPR from input
    block IB.  DATA_IN contains tables and descriptors for the
    file being read.  */
@@ -2454,7 +2442,7 @@  lto_register_var_decl_in_symtab (struct
      declaration for merging.  */
   if (TREE_PUBLIC (decl))
     {
-      int ix;
+      unsigned ix;
       if (!lto_streamer_cache_lookup (data_in->reader_cache, decl, &ix))
 	gcc_unreachable ();
       lto_symtab_register_decl (decl, get_resolution (data_in, ix),
@@ -2521,7 +2509,7 @@  lto_register_function_decl_in_symtab (st
      declaration for merging.  */
   if (TREE_PUBLIC (decl) && !DECL_ABSTRACT (decl))
     {
-      int ix;
+      unsigned ix;
       if (!lto_streamer_cache_lookup (data_in->reader_cache, decl, &ix))
 	gcc_unreachable ();
       lto_symtab_register_decl (decl, get_resolution (data_in, ix),
@@ -2536,35 +2524,14 @@  lto_register_function_decl_in_symtab (st
 static tree
 lto_get_pickled_tree (struct lto_input_block *ib, struct data_in *data_in)
 {
-  HOST_WIDE_INT ix;
+  unsigned HOST_WIDE_INT ix;
   tree result;
   enum LTO_tags expected_tag;
-  unsigned HOST_WIDE_INT orig_offset;
 
-  ix = lto_input_sleb128 (ib);
+  ix = lto_input_uleb128 (ib);
   expected_tag = (enum LTO_tags) lto_input_uleb128 (ib);
 
-  orig_offset = lto_input_uleb128 (ib);
-  gcc_assert (orig_offset == (unsigned) orig_offset);
-
   result = lto_streamer_cache_get (data_in->reader_cache, ix);
-  if (result == NULL_TREE)
-    {
-      /* We have not yet read the cache slot IX.  Go to the offset
-	 in the stream where the physical tree node is, and materialize
-	 it from there.  */
-      struct lto_input_block fwd_ib;
-
-      /* If we are trying to go back in the stream, something is wrong.
-	 We should've read the node at the earlier position already.  */
-      if (ib->p >= orig_offset)
-	internal_error ("bytecode stream: tried to jump backwards in the "
-		        "stream");
-
-      LTO_INIT_INPUT_BLOCK (fwd_ib, ib->data, orig_offset, ib->len);
-      result = lto_input_tree (&fwd_ib, data_in);
-    }
-
   gcc_assert (result
               && TREE_CODE (result) == lto_tag_to_tree_code (expected_tag));
 
@@ -2582,16 +2549,12 @@  lto_get_builtin_tree (struct lto_input_b
   enum built_in_function fcode;
   const char *asmname;
   tree result;
-  int ix;
 
   fclass = (enum built_in_class) lto_input_uleb128 (ib);
   gcc_assert (fclass == BUILT_IN_NORMAL || fclass == BUILT_IN_MD);
 
   fcode = (enum built_in_function) lto_input_uleb128 (ib);
 
-  ix = lto_input_sleb128 (ib);
-  gcc_assert (ix == (int) ix);
-
   if (fclass == BUILT_IN_NORMAL)
     {
       gcc_assert (fcode < END_BUILTINS);
@@ -2611,7 +2574,7 @@  lto_get_builtin_tree (struct lto_input_b
   if (asmname)
     set_builtin_user_assembler_name (result, asmname);
 
-  lto_streamer_cache_insert_at (data_in->reader_cache, result, ix);
+  lto_streamer_cache_append (data_in->reader_cache, result);
 
   return result;
 }
@@ -2625,9 +2588,8 @@  lto_read_tree (struct lto_input_block *i
 	       enum LTO_tags tag)
 {
   tree result;
-  int ix;
 
-  result = lto_materialize_tree (ib, data_in, tag, &ix);
+  result = lto_materialize_tree (ib, data_in, tag);
 
   /* Read all the pointer fields in RESULT.  */
   lto_input_tree_pointers (ib, data_in, result);
Index: varasm.c
===================================================================
--- varasm.c.orig	2011-03-29 17:08:14.000000000 +0200
+++ varasm.c	2011-03-29 17:12:07.000000000 +0200
@@ -6798,7 +6798,7 @@  default_binds_local_p_1 (const_tree exp,
    current module (shared library or executable), that is to binds_local_p.
    We use this fact to avoid need for another target hook and implement
    the logic using binds_local_p and just special cases where
-   decl_binds_to_current_def_p is stronger than binds local_p.  In particular
+   decl_binds_to_current_def_p is stronger than binds_local_p.  In particular
    the weak definitions (that can be overwritten at linktime by other
    definition from different object file) and when resolution info is available
    we simply use the knowledge passed to us by linker plugin.  */
@@ -6811,7 +6811,7 @@  decl_binds_to_current_def_p (tree decl)
   if (!targetm.binds_local_p (decl))
     return false;
   /* When resolution is available, just use it.  */
-  if (TREE_CODE (decl) == VAR_DECL && TREE_PUBLIC (decl)
+  if (TREE_CODE (decl) == VAR_DECL
       && (TREE_STATIC (decl) || DECL_EXTERNAL (decl)))
     {
       struct varpool_node *vnode = varpool_get_node (decl);
@@ -6819,7 +6819,7 @@  decl_binds_to_current_def_p (tree decl)
 	  && vnode->resolution != LDPR_UNKNOWN)
 	return resolution_to_local_definition_p (vnode->resolution);
     }
-  else if (TREE_CODE (decl) == FUNCTION_DECL && TREE_PUBLIC (decl))
+  else if (TREE_CODE (decl) == FUNCTION_DECL)
     {
       struct cgraph_node *node = cgraph_get_node_or_alias (decl);
       if (node
Index: gimple.c
===================================================================
--- gimple.c.orig	2011-03-29 17:12:04.000000000 +0200
+++ gimple.c	2011-03-29 17:12:07.000000000 +0200
@@ -3242,7 +3242,7 @@  typedef struct GTY(()) gimple_type_leade
 } gimple_type_leader_entry;
 
 #define GIMPLE_TYPE_LEADER_SIZE 16381
-static GTY((length("GIMPLE_TYPE_LEADER_SIZE"))) gimple_type_leader_entry
+static GTY((deletable, length("GIMPLE_TYPE_LEADER_SIZE"))) gimple_type_leader_entry
   *gimple_type_leader;
 
 /* Lookup an existing leader for T and return it or NULL_TREE, if
Index: lto-streamer.c
===================================================================
--- lto-streamer.c.orig	2011-03-29 17:08:14.000000000 +0200
+++ lto-streamer.c	2011-03-29 17:12:07.000000000 +0200
@@ -314,29 +314,25 @@  check_handled_ts_structures (void)
 
 
 /* Helper for lto_streamer_cache_insert_1.  Add T to CACHE->NODES at
-   slot IX.  Add OFFSET to CACHE->OFFSETS at slot IX.  */
+   slot IX.  */
 
 static void
 lto_streamer_cache_add_to_node_array (struct lto_streamer_cache_d *cache,
-				      int ix, tree t, unsigned offset)
+				      unsigned ix, tree t)
 {
-  gcc_assert (ix >= 0);
+  /* Make sure we're either replacing an old element or
+     appending consecutively.  */
+  gcc_assert (ix <= VEC_length (tree, cache->nodes));
 
-  /* Grow the array of nodes and offsets to accomodate T at IX.  */
-  if (ix >= (int) VEC_length (tree, cache->nodes))
-    {
-      size_t sz = ix + (20 + ix) / 4;
-      VEC_safe_grow_cleared (tree, heap, cache->nodes, sz);
-      VEC_safe_grow_cleared (unsigned, heap, cache->offsets, sz);
-    }
-
-  VEC_replace (tree, cache->nodes, ix, t);
-  VEC_replace (unsigned, cache->offsets, ix, offset);
+  if (ix == VEC_length (tree, cache->nodes))
+    VEC_safe_push (tree, heap, cache->nodes, t);
+  else
+    VEC_replace (tree, cache->nodes, ix, t);
 }
 
 
 /* Helper for lto_streamer_cache_insert and lto_streamer_cache_insert_at.
-   CACHE, T, IX_P and OFFSET_P are as in lto_streamer_cache_insert.
+   CACHE, T, and IX_P are as in lto_streamer_cache_insert.
 
    If INSERT_AT_NEXT_SLOT_P is true, T is inserted at the next available
    slot in the cache.  Otherwise, T is inserted at the position indicated
@@ -347,13 +343,12 @@  lto_streamer_cache_add_to_node_array (st
 
 static bool
 lto_streamer_cache_insert_1 (struct lto_streamer_cache_d *cache,
-			     tree t, int *ix_p, unsigned *offset_p,
+			     tree t, unsigned *ix_p,
 			     bool insert_at_next_slot_p)
 {
   void **slot;
   struct tree_int_map d_entry, *entry;
-  int ix;
-  unsigned offset;
+  unsigned ix;
   bool existed_p;
 
   gcc_assert (t);
@@ -364,19 +359,16 @@  lto_streamer_cache_insert_1 (struct lto_
     {
       /* Determine the next slot to use in the cache.  */
       if (insert_at_next_slot_p)
-	ix = cache->next_slot++;
+	ix = VEC_length (tree, cache->nodes);
       else
 	ix = *ix_p;
 
       entry = (struct tree_int_map *)pool_alloc (cache->node_map_entries);
       entry->base.from = t;
-      entry->to = (unsigned) ix;
+      entry->to = ix;
       *slot = entry;
 
-      /* If no offset was given, store the invalid offset -1.  */
-      offset = (offset_p) ? *offset_p : (unsigned) -1;
-
-      lto_streamer_cache_add_to_node_array (cache, ix, t, offset);
+      lto_streamer_cache_add_to_node_array (cache, ix, t);
 
       /* Indicate that the item was not present in the cache.  */
       existed_p = false;
@@ -384,8 +376,7 @@  lto_streamer_cache_insert_1 (struct lto_
   else
     {
       entry = (struct tree_int_map *) *slot;
-      ix = (int) entry->to;
-      offset = VEC_index (unsigned, cache->offsets, ix);
+      ix = entry->to;
 
       if (!insert_at_next_slot_p && ix != *ix_p)
 	{
@@ -404,10 +395,7 @@  lto_streamer_cache_insert_1 (struct lto_
 	  gcc_assert (lto_stream_as_builtin_p (t));
 	  ix = *ix_p;
 
-	  /* Since we are storing a builtin, the offset into the
-	     stream is not necessary as we will not need to read
-	     forward in the stream.  */
-	  lto_streamer_cache_add_to_node_array (cache, ix, t, -1);
+	  lto_streamer_cache_add_to_node_array (cache, ix, t);
 	}
 
       /* Indicate that T was already in the cache.  */
@@ -417,9 +405,6 @@  lto_streamer_cache_insert_1 (struct lto_
   if (ix_p)
     *ix_p = ix;
 
-  if (offset_p)
-    *offset_p = offset;
-
   return existed_p;
 }
 
@@ -428,21 +413,13 @@  lto_streamer_cache_insert_1 (struct lto_
    return true.  Otherwise, return false.
 
    If IX_P is non-null, update it with the index into the cache where
-   T has been stored.
-
-   *OFFSET_P represents the offset in the stream where T is physically
-   written out.  The first time T is added to the cache, *OFFSET_P is
-   recorded in the cache together with T.  But if T already existed
-   in the cache, *OFFSET_P is updated with the value that was recorded
-   the first time T was added to the cache.
-
-   If OFFSET_P is NULL, it is ignored.  */
+   T has been stored.  */
 
 bool
 lto_streamer_cache_insert (struct lto_streamer_cache_d *cache, tree t,
-			   int *ix_p, unsigned *offset_p)
+			   unsigned *ix_p)
 {
-  return lto_streamer_cache_insert_1 (cache, t, ix_p, offset_p, true);
+  return lto_streamer_cache_insert_1 (cache, t, ix_p, true);
 }
 
 
@@ -451,24 +428,33 @@  lto_streamer_cache_insert (struct lto_st
 
 bool
 lto_streamer_cache_insert_at (struct lto_streamer_cache_d *cache,
-			      tree t, int ix)
+			      tree t, unsigned ix)
 {
-  return lto_streamer_cache_insert_1 (cache, t, &ix, NULL, false);
+  return lto_streamer_cache_insert_1 (cache, t, &ix, false);
 }
 
 
-/* Return true if tree node T exists in CACHE.  If IX_P is
+/* Appends tree node T to CACHE, even if T already existed in it.  */
+
+void
+lto_streamer_cache_append (struct lto_streamer_cache_d *cache, tree t)
+{
+  unsigned ix = VEC_length (tree, cache->nodes);
+  lto_streamer_cache_insert_1 (cache, t, &ix, false);
+}
+
+/* Return true if tree node T exists in CACHE, otherwise false.  If IX_P is
    not NULL, write to *IX_P the index into the cache where T is stored
-   (-1 if T is not found).  */
+   ((unsigned)-1 if T is not found).  */
 
 bool
 lto_streamer_cache_lookup (struct lto_streamer_cache_d *cache, tree t,
-			   int *ix_p)
+			   unsigned *ix_p)
 {
   void **slot;
   struct tree_int_map d_slot;
   bool retval;
-  int ix;
+  unsigned ix;
 
   gcc_assert (t);
 
@@ -482,7 +468,7 @@  lto_streamer_cache_lookup (struct lto_st
   else
     {
       retval = true;
-      ix = (int) ((struct tree_int_map *) *slot)->to;
+      ix = ((struct tree_int_map *) *slot)->to;
     }
 
   if (ix_p)
@@ -495,17 +481,14 @@  lto_streamer_cache_lookup (struct lto_st
 /* Return the tree node at slot IX in CACHE.  */
 
 tree
-lto_streamer_cache_get (struct lto_streamer_cache_d *cache, int ix)
+lto_streamer_cache_get (struct lto_streamer_cache_d *cache, unsigned ix)
 {
   gcc_assert (cache);
 
-  /* If the reader is requesting an index beyond the length of the
-     cache, it will need to read ahead.  Return NULL_TREE to indicate
-     that.  */
-  if ((unsigned) ix >= VEC_length (tree, cache->nodes))
-    return NULL_TREE;
+  /* Make sure we're not requesting something we don't have.  */
+  gcc_assert (ix < VEC_length (tree, cache->nodes));
 
-  return VEC_index (tree, cache->nodes, (unsigned) ix);
+  return VEC_index (tree, cache->nodes, ix);
 }
 
 
@@ -538,13 +521,10 @@  lto_record_common_node (tree *nodep, VEC
 
   VEC_safe_push (tree, heap, *common_nodes, node);
 
-  if (tree_node_can_be_shared (node))
-    {
-      if (POINTER_TYPE_P (node)
-	  || TREE_CODE (node) == COMPLEX_TYPE
-	  || TREE_CODE (node) == ARRAY_TYPE)
-	lto_record_common_node (&TREE_TYPE (node), common_nodes, seen_nodes);
-    }
+  if (POINTER_TYPE_P (node)
+      || TREE_CODE (node) == COMPLEX_TYPE
+      || TREE_CODE (node) == ARRAY_TYPE)
+    lto_record_common_node (&TREE_TYPE (node), common_nodes, seen_nodes);
 }
 
 
@@ -607,7 +587,7 @@  preload_common_node (struct lto_streamer
 {
   gcc_assert (t);
 
-  lto_streamer_cache_insert (cache, t, NULL, NULL);
+  lto_streamer_cache_insert (cache, t, NULL);
 
  /* The FIELD_DECLs of structures should be shared, so that every
     COMPONENT_REF uses the same tree node when referencing a field.
@@ -667,7 +647,6 @@  lto_streamer_cache_delete (struct lto_st
   htab_delete (c->node_map);
   free_alloc_pool (c->node_map_entries);
   VEC_free (tree, heap, c->nodes);
-  VEC_free (unsigned, heap, c->offsets);
   free (c);
 }
 
Index: lto-streamer.h
===================================================================
--- lto-streamer.h.orig	2011-03-29 17:08:14.000000000 +0200
+++ lto-streamer.h	2011-03-29 17:12:07.000000000 +0200
@@ -350,14 +350,8 @@  struct lto_streamer_cache_d
   /* Node map to store entries into.  */
   alloc_pool node_map_entries;
 
-  /* Next available slot in the nodes and offsets arrays.  */
-  unsigned next_slot;
-
   /* The nodes pickled so far.  */
   VEC(tree,heap) *nodes;
-
-  /* Offset into the stream where the nodes have been written.  */
-  VEC(unsigned,heap) *offsets;
 };
 
 
@@ -831,12 +825,13 @@  extern void lto_bitmap_free (bitmap);
 extern char *lto_get_section_name (int, const char *, struct lto_file_decl_data *);
 extern void print_lto_report (void);
 extern bool lto_streamer_cache_insert (struct lto_streamer_cache_d *, tree,
-				       int *, unsigned *);
+				       unsigned *);
 extern bool lto_streamer_cache_insert_at (struct lto_streamer_cache_d *, tree,
-					  int);
+					  unsigned);
+extern void lto_streamer_cache_append (struct lto_streamer_cache_d *, tree);
 extern bool lto_streamer_cache_lookup (struct lto_streamer_cache_d *, tree,
-				       int *);
-extern tree lto_streamer_cache_get (struct lto_streamer_cache_d *, int);
+				       unsigned *);
+extern tree lto_streamer_cache_get (struct lto_streamer_cache_d *, unsigned);
 extern struct lto_streamer_cache_d *lto_streamer_cache_create (void);
 extern void lto_streamer_cache_delete (struct lto_streamer_cache_d *);
 extern void lto_streamer_init (void);
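
Illustration only, not part of the patch: the cache interface after this
change deals purely in unsigned indices, reports a failed lookup as
(unsigned) -1, and asserts on out-of-range reads instead of returning
NULL_TREE.  A minimal self-contained sketch of those semantics follows.
All toy_* names are hypothetical, nodes are plain ints rather than trees,
and the lookup is a linear scan where the real cache uses a hash table.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for GCC's `tree'; not the real type.  */
typedef int toy_node;

/* Toy cache: a growable array of nodes, indexed by unsigned.  */
struct toy_cache
{
  toy_node *nodes;
  unsigned length;
  unsigned capacity;
};

/* Append N to CACHE unconditionally, even if an equal node is already
   present, and return the index it was stored at.  */
static unsigned
toy_cache_append (struct toy_cache *cache, toy_node n)
{
  if (cache->length == cache->capacity)
    {
      unsigned cap = cache->capacity ? cache->capacity * 2 : 4;
      toy_node *p = realloc (cache->nodes, cap * sizeof (toy_node));
      assert (p);
      cache->nodes = p;
      cache->capacity = cap;
    }
  cache->nodes[cache->length] = n;
  return cache->length++;
}

/* Return true if N is in CACHE.  If IX_P is non-null, store the index
   of the first match there, or (unsigned) -1 if N is not found.  */
static bool
toy_cache_lookup (const struct toy_cache *cache, toy_node n, unsigned *ix_p)
{
  unsigned ix = (unsigned) -1;
  bool found = false;

  for (unsigned i = 0; i < cache->length; i++)
    if (cache->nodes[i] == n)
      {
	ix = i;
	found = true;
	break;
      }

  if (ix_p)
    *ix_p = ix;
  return found;
}

/* Return the node at slot IX.  Asking for a slot that does not exist
   is treated as a bug, not as a request to read ahead.  */
static toy_node
toy_cache_get (const struct toy_cache *cache, unsigned ix)
{
  assert (ix < cache->length);
  return cache->nodes[ix];
}
```

With a zero-initialized struct toy_cache, two appends land at indices 0
and 1, and a lookup of a value that was never added reports
(unsigned) -1 through the index pointer.
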
Index: lto/lto.c
===================================================================
--- lto/lto.c.orig	2011-03-29 17:08:14.000000000 +0200
+++ lto/lto.c	2011-03-29 17:12:08.000000000 +0200
@@ -149,37 +149,36 @@  lto_materialize_function (struct cgraph_
       /* Clones don't need to be read.  */
       if (node->clone_of)
 	return;
-      file_data = node->local.lto_file_data;
-      name = IDENTIFIER_POINTER (DECL_ASSEMBLER_NAME (decl)); 
-
-      /* We may have renamed the declaration, e.g., a static function.  */
-      name = lto_get_decl_name_mapping (file_data, name);
-
-      data = lto_get_section_data (file_data, LTO_section_function_body,
-				   name, &len);
-      if (!data)
-	fatal_error ("%s: section %s is missing",
-		     file_data->file_name,
-		     name);
-
-      gcc_assert (DECL_STRUCT_FUNCTION (decl) == NULL);
 
       /* Load the function body only if not operating in WPA mode.  In
 	 WPA mode, the body of the function is not needed.  */
       if (!flag_wpa)
 	{
+	  file_data = node->local.lto_file_data;
+	  name = IDENTIFIER_POINTER (DECL_ASSEMBLER_NAME (decl));
+
+	  /* We may have renamed the declaration, e.g., a static function.  */
+	  name = lto_get_decl_name_mapping (file_data, name);
+
+	  data = lto_get_section_data (file_data, LTO_section_function_body,
+				       name, &len);
+	  if (!data)
+	    fatal_error ("%s: section %s is missing",
+			 file_data->file_name,
+			 name);
+
+	  gcc_assert (DECL_STRUCT_FUNCTION (decl) == NULL);
+
 	  allocate_struct_function (decl, false);
 	  announce_function (decl);
 	  lto_input_function_body (file_data, decl, data);
 	  if (DECL_FUNCTION_PERSONALITY (decl) && !first_personality_decl)
 	    first_personality_decl = DECL_FUNCTION_PERSONALITY (decl);
 	  lto_stats.num_function_bodies++;
+	  lto_free_section_data (file_data, LTO_section_function_body, name,
+				 data, len);
+	  ggc_collect ();
 	}
-
-      lto_free_section_data (file_data, LTO_section_function_body, name,
-			     data, len);
-      if (!flag_wpa)
-	ggc_collect ();
     }
 
   /* Let the middle end know about the function.  */
@@ -200,7 +199,7 @@  lto_read_in_decl_state (struct data_in *
   uint32_t i, j;
   
   ix = *data++;
-  decl = lto_streamer_cache_get (data_in->reader_cache, (int) ix);
+  decl = lto_streamer_cache_get (data_in->reader_cache, ix);
   if (TREE_CODE (decl) != FUNCTION_DECL)
     {
       gcc_assert (decl == void_type_node);
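
Illustration only, not part of the patch: the lto_materialize_function
hunk above moves the section read, the body input and the section free
into the single !flag_wpa branch, so WPA mode never touches the section
at all.  A minimal self-contained sketch of that control flow follows.
The toy_* functions and counters are hypothetical and merely stand in
for lto_get_section_data and lto_free_section_data.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical counters so the effect of each mode can be observed.  */
static unsigned toy_sections_loaded;
static unsigned toy_sections_freed;

/* Stand-in for reading a function body section: returns a heap copy
   of NAME as the "section data".  */
static char *
toy_get_section (const char *name)
{
  char *data = malloc (strlen (name) + 1);
  assert (data);
  strcpy (data, name);
  toy_sections_loaded++;
  return data;
}

/* Stand-in for releasing the section data again.  */
static void
toy_free_section (char *data)
{
  free (data);
  toy_sections_freed++;
}

/* Materialize the function NAME.  In WPA mode the body is not needed,
   so the section is neither read nor freed.  Returns true if a body
   was read.  */
static bool
toy_materialize (const char *name, bool wpa_mode)
{
  if (wpa_mode)
    return false;

  char *data = toy_get_section (name);
  /* The real code would parse the function body from DATA here.  */
  toy_free_section (data);
  return true;
}
```

Calling toy_materialize with wpa_mode set leaves both counters
untouched; calling it without increments each exactly once.
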
Index: lto/lto-lang.c
===================================================================
--- lto/lto-lang.c.orig	2011-03-29 17:08:14.000000000 +0200
+++ lto/lto-lang.c	2011-03-30 21:05:34.000000000 +0200
@@ -615,11 +615,6 @@  lto_define_builtins (tree va_list_ref_ty
 
 static GTY(()) tree registered_builtin_types;
 
-/* A chain of builtin functions that we need to recognize.  We will
-   assume that all other function names we see will be defined by the
-   user's program.  */
-static GTY(()) tree registered_builtin_fndecls;
-
 /* Language hooks.  */
 
 static unsigned int
@@ -994,7 +989,10 @@  lto_pushdecl (tree t ATTRIBUTE_UNUSED)
 static tree
 lto_getdecls (void)
 {
-  return registered_builtin_fndecls;
+  /* We have our own write_globals langhook, hence the getdecls
+     langhook shouldn't be used, except by dbxout.c, so we can't
+     just abort here.  */
+  return NULL_TREE;
 }
 
 static void
@@ -1010,10 +1008,6 @@  lto_write_globals (void)
 static tree
 lto_builtin_function (tree decl)
 {
-  /* Record it.  */
-  TREE_CHAIN (decl) = registered_builtin_fndecls;
-  registered_builtin_fndecls = decl;
-
   return decl;
 }