Patchwork Merge constructors and destructors

login
register
mail settings
Submitter Jan Hubicka
Date Aug. 21, 2010, 12:11 p.m.
Message ID <20100821121110.GQ630@kam.mff.cuni.cz>
Download mbox | patch
Permalink /patch/62345/
State New
Headers show

Comments

Jan Hubicka - Aug. 21, 2010, 12:11 p.m.
Hi,
this patch implements merging of multiple constructors and destructors in LTO
units into single function.  I believe this is actually needed for correctness
when linking with libraries since constructors of libraries are required to be
executed first, while we currently execute them in random order depending on
layout of the final LTO file.

The main motivation is however to improve code locality by inlining all the
functions into single.  Mozilla has about 400 constructors, chrome has couple
thousdand and they are causing problems because at startup they bring into
memory random parts of the executable.
See also
http://blog.mozilla.com/tglek/2010/05/27/startup-backward-constructors/

I've taken Richard Henderson's code to produce ctors/dtors for collect2
and extended it to work on targets with ctors and dtors too.  I also put
it into separate micro-IPA pass rather then having it part of cgraphunit
driver (things become more twisted there with LTO otherwise).

Bootstrapped/regtested x86_64-linux, will commit it today after further testing
at Mozilla if there are no complains.

Honza

	* tree-pass.h (pass_ipa_cdtor_merge): New function.
	* cgraphunit.c (static_ctors, static_dtors): Move to ipa.c; make
	heap allocated.
	(record_cdtor_fn): Move to ipa.c; do not test for
	have_ctors_dtors.
	(build_cdtor): Move to ipa.c; add code avoiding construction
	when target have ctors/dtors and there is only one ctor/dtor at given
	priority.
	(compare_ctor, compare_dtor): Move to ipa.c; use DECL_UID to stabilize sort;
	reverse order of constructors.
	(cgraph_build_cdtor_fns):Move to ipa.c; rename to build_cdtor_fns.
	(cgraph_finalize_function): Do not call record_cdtor_fn.
	(cgraph_finalize_compilation_unit): Do not call cgraph_build_cdtor_fns.
	(cgraph_build_static_cdtor): Move to ipa.c.
	* ipa.c: Include target.h and tree-iterator.h.
	(cgraph_build_static_cdtor, static_ctors, static_dtors,
	record_cdtor_fn, build_cdtor, compare_ctor, compare_dtor,
	build_cdtor_fns, ipa_cdtor_merge, gate_ipa_cdtor_merge,
	pass_ipa_cdtor_merge): New.
	* passes.c (init_optimization_passes): Enqueue pass_ipa_cdtor_merge.
	* ipa-prop.c (update_indirect_edges_after_inlining): Avoid out of bounds access.
Richard Henderson - Aug. 23, 2010, 6:16 p.m.
On 08/21/2010 05:11 AM, Jan Hubicka wrote:
> +  while (i < len)
> +    {
> +      tree body;
> +      tree fn;
> +      priority_type priority;
> +
> +      priority = 0;
> +      body = NULL_TREE;
> +      j = i;
> +      do
> +	{
> +	  priority_type p;
> +	  fn = cdtors[i];

Surely this should be J.  Nothing in this loop looks at J.

> +      /* Find the next batch of constructors/destructors with the same
> +	 initialization priority.  */

Probably clearer to write the loop

  for (; i < j; ++i)

rather than the do/while with the break in the middle.

> +    /* Ensure a stable sort.  Constructors are executed in backwarding
> +       order to make LTO initialize braries first.  */
> +    return DECL_UID (f2) - DECL_UID (f1);

I have trouble believing that this is reliable.  If you do this,
put a checking assert that the original vector is monotonically
increasing in DECL_UID.

> +      build_cdtor (/*ctor_p=*/true,
> +		   VEC_address (tree, static_ctors),
> +		   VEC_length (tree, static_ctors));

Surely it would be better to pass in the vector, rather than
decomposing it into address+length.


r~
Jan Hubicka - Aug. 24, 2010, 9:50 a.m.
> On 08/21/2010 05:11 AM, Jan Hubicka wrote:
> > +  while (i < len)
> > +    {
> > +      tree body;
> > +      tree fn;
> > +      priority_type priority;
> > +
> > +      priority = 0;
> > +      body = NULL_TREE;
> > +      j = i;
> > +      do
> > +	{
> > +	  priority_type p;
> > +	  fn = cdtors[i];
> 
> Surely this should be J.  Nothing in this loop looks at J.

Oops, thanks
> 
> > +      /* Find the next batch of constructors/destructors with the same
> > +	 initialization priority.  */
> 
> Probably clearer to write the loop
> 
>   for (; i < j; ++i)
> 
> rather than the do/while with the break in the middle.

Yep, wil fix that.
> 
> > +    /* Ensure a stable sort.  Constructors are executed in backwarding
> > +       order to make LTO initialize braries first.  */
> > +    return DECL_UID (f2) - DECL_UID (f1);
> 
> I have trouble believing that this is reliable.  If you do this,
> put a checking assert that the original vector is monotonically
> increasing in DECL_UID.

My understanding is that within compilation unit the order of constructors
does not matter, while for LTO we need to care libraries to be initialized
before program.  So the sort is supposed to do what crtstuff does for constructors
in LTO world, while just ensuring stable output across qsort implementations
at non-LTO (yep, the comment should have been updates, since originally
we wanted to keep vector stable)
> 
> > +      build_cdtor (/*ctor_p=*/true,
> > +		   VEC_address (tree, static_ctors),
> > +		   VEC_length (tree, static_ctors));
> 
> Surely it would be better to pass in the vector, rather than
> decomposing it into address+length.

This comes literaly from old code, will clean this up too.

Thanks,
Honza
> 
> 
> r~
Mike Stump - Aug. 24, 2010, 3 p.m.
On Aug 24, 2010, at 2:50 AM, Jan Hubicka wrote:
> My understanding is that within compilation unit the order of constructors
> does not matter,

Ouch.  Maybe things have changed, but I liked the well defined ordering we had.  Right to left of the link line and top to bottom inside the translation unit, if memory serves.
Richard Henderson - Aug. 24, 2010, 4:40 p.m.
On 08/24/2010 02:50 AM, Jan Hubicka wrote:
>>> +    /* Ensure a stable sort.  Constructors are executed in backwarding
>>> +       order to make LTO initialize braries first.  */
>>> +    return DECL_UID (f2) - DECL_UID (f1);
>>
>> I have trouble believing that this is reliable.  If you do this,
>> put a checking assert that the original vector is monotonically
>> increasing in DECL_UID.
> 
> My understanding is that within compilation unit the order of constructors
> does not matter, while for LTO we need to care libraries to be initialized
> before program.  So the sort is supposed to do what crtstuff does for constructors
> in LTO world, while just ensuring stable output across qsort implementations
> at non-LTO (yep, the comment should have been updates, since originally
> we wanted to keep vector stable)

Within the original translation units, you are correct.  Though as Mike
says, our implementation was predictable and no doubt there are programs
that had come to rely on it.

However, think about this in the context of WHOPR, where we have more than
one translation unit involved.  As you say, the sort is *supposed* to do
what crtstuff does.  Can you prove that it does?  Why must DECL_UID
correspond to a linear scan of the object files?  I can see that it might,
right now, but I could see that's something that could change behind your
back and you wouldn't notice.

If you add the monotonically increasing check under ENABLE_CHECKING, then
we'll be sure to catch any changes in this area that are unexpected.


r~
Mike Stump - Aug. 24, 2010, 11:05 p.m.
On Aug 24, 2010, at 9:40 AM, Richard Henderson wrote:
> On 08/24/2010 02:50 AM, Jan Hubicka wrote:
>>>> +    /* Ensure a stable sort.  Constructors are executed in backwarding
>>>> +       order to make LTO initialize braries first.  */
>>>> +    return DECL_UID (f2) - DECL_UID (f1);
>>> 
>>> I have trouble believing that this is reliable.  If you do this,
>>> put a checking assert that the original vector is monotonically
>>> increasing in DECL_UID.
>> 
>> My understanding is that within compilation unit the order of constructors
>> does not matter, while for LTO we need to care libraries to be initialized
>> before program.  So the sort is supposed to do what crtstuff does for constructors
>> in LTO world, while just ensuring stable output across qsort implementations
>> at non-LTO (yep, the comment should have been updates, since originally
>> we wanted to keep vector stable)
> 
> Within the original translation units, you are correct.

Does having a random order allow one to fulfill this requirement from the standard:

  Objects  with  static
  storage  duration  defined  in namespace scope in the same translation
  unit and dynamically initialized shall be initialized in the order  in
  which  their  definition  appears  in  the  translation  unit.

?

> Though as Mike says, our implementation was predictable and no doubt there are programs that had come to rely on it.
> 
> Why must DECL_UID correspond to a linear scan of the object files?

The right answer here would be because it is a documented property of UID...  anything less is asking for trouble longer term.  If it happens to be true, and people want to use it to ensure ordering, then we should expand the doc to cover it.
Jan Hubicka - Aug. 25, 2010, 9:56 a.m.
> On 08/24/2010 02:50 AM, Jan Hubicka wrote:
> >>> +    /* Ensure a stable sort.  Constructors are executed in backwarding
> >>> +       order to make LTO initialize braries first.  */
> >>> +    return DECL_UID (f2) - DECL_UID (f1);
> >>
> >> I have trouble believing that this is reliable.  If you do this,
> >> put a checking assert that the original vector is monotonically
> >> increasing in DECL_UID.
> > 
> > My understanding is that within compilation unit the order of constructors
> > does not matter, while for LTO we need to care libraries to be initialized
> > before program.  So the sort is supposed to do what crtstuff does for constructors
> > in LTO world, while just ensuring stable output across qsort implementations
> > at non-LTO (yep, the comment should have been updates, since originally
> > we wanted to keep vector stable)
> 
> Within the original translation units, you are correct.  Though as Mike
> says, our implementation was predictable and no doubt there are programs
> that had come to rely on it.

This was not true on ELF targets since cgraph started reordering functions,
since we get same order in ctor section as the order in asm files I believe.
> 
> However, think about this in the context of WHOPR, where we have more than
> one translation unit involved.  As you say, the sort is *supposed* to do
> what crtstuff does.  Can you prove that it does?  Why must DECL_UID
> correspond to a linear scan of the object files?  I can see that it might,

The global decls from each object are read in order and DECL_UIDs are assigned
sequentially.

I am not sure about much more robust way to ensure ordering.  Cgraph nodes do
have sequentially assigned uids too and I can use them, but they are more or
less equivalent since they are assigned by same order.  There are also ids used
for -fno-toplevel-reorder, perhaps I can use those to drive sort with adding
proper streaming mechanizm (currently I guess they are completely random at LTO
stage)

> right now, but I could see that's something that could change behind your
> back and you wouldn't notice.
> 
> If you add the monotonically increasing check under ENABLE_CHECKING, then
> we'll be sure to catch any changes in this area that are unexpected.

Well, if the vectors was always supposed to be monotonically increasing,
I would go for that, but it is not true. The vector is collected from
cgraph_node linked list that is not supposed to be in any particlar order
(thoug I guess it will be downwards for non-LTO and upwards at LTO stage
since we always insert at beggining of linked list)

Honza
> 
> 
> r~
Jan Hubicka - Aug. 25, 2010, 9:58 a.m.
> On Aug 24, 2010, at 9:40 AM, Richard Henderson wrote:
> > On 08/24/2010 02:50 AM, Jan Hubicka wrote:
> >>>> +    /* Ensure a stable sort.  Constructors are executed in backwarding
> >>>> +       order to make LTO initialize braries first.  */
> >>>> +    return DECL_UID (f2) - DECL_UID (f1);
> >>> 
> >>> I have trouble believing that this is reliable.  If you do this,
> >>> put a checking assert that the original vector is monotonically
> >>> increasing in DECL_UID.
> >> 
> >> My understanding is that within compilation unit the order of constructors
> >> does not matter, while for LTO we need to care libraries to be initialized
> >> before program.  So the sort is supposed to do what crtstuff does for constructors
> >> in LTO world, while just ensuring stable output across qsort implementations
> >> at non-LTO (yep, the comment should have been updates, since originally
> >> we wanted to keep vector stable)
> > 
> > Within the original translation units, you are correct.
> 
> Does having a random order allow one to fulfill this requirement from the standard:
> 
>   Objects  with  static
>   storage  duration  defined  in namespace scope in the same translation
>   unit and dynamically initialized shall be initialized in the order  in
>   which  their  definition  appears  in  the  translation  unit.
> 
> ?

C++ frontend is combining all the consturctor into single, so this should be
safe.
> 
> > Though as Mike says, our implementation was predictable and no doubt there are programs that had come to rely on it.
> > 
> > Why must DECL_UID correspond to a linear scan of the object files?
> 
> The right answer here would be because it is a documented property of UID...  anything less is asking for trouble longer term.  If it happens to be true, and people want to use it to ensure ordering, then we should expand the doc to cover it.

We can document DECL_UId to be assigned sequentially as DECL_UIDs are created, or i can swith the code to
cgraph_node->order field (after reorganizing to collect cgraph nodes and fixing lto-cgraph to stream those
sanily)

Honza

Patch

Index: tree-pass.h
===================================================================
--- tree-pass.h	(revision 163440)
+++ tree-pass.h	(working copy)
@@ -468,6 +468,7 @@  extern struct simple_ipa_opt_pass pass_i
 extern struct ipa_opt_pass_d pass_ipa_lto_wpa_fixup;
 extern struct ipa_opt_pass_d pass_ipa_lto_finish_out;
 extern struct ipa_opt_pass_d pass_ipa_profile;
+extern struct ipa_opt_pass_d pass_ipa_cdtor_merge;
 
 extern struct gimple_opt_pass pass_all_optimizations;
 extern struct gimple_opt_pass pass_cleanup_cfg_post_optimizing;
Index: cgraphunit.c
===================================================================
--- cgraphunit.c	(revision 163440)
+++ cgraphunit.c	(working copy)
@@ -147,174 +147,9 @@  static void cgraph_analyze_function (str
 
 FILE *cgraph_dump_file;
 
-/* A vector of FUNCTION_DECLs declared as static constructors.  */
-static GTY (()) VEC(tree, gc) *static_ctors;
-/* A vector of FUNCTION_DECLs declared as static destructors.  */
-static GTY (()) VEC(tree, gc) *static_dtors;
-
 /* Used for vtable lookup in thunk adjusting.  */
 static GTY (()) tree vtable_entry_type;
 
-/* When target does not have ctors and dtors, we call all constructor
-   and destructor by special initialization/destruction function
-   recognized by collect2.
-
-   When we are going to build this function, collect all constructors and
-   destructors and turn them into normal functions.  */
-
-static void
-record_cdtor_fn (tree fndecl)
-{
-  struct cgraph_node *node;
-  if (targetm.have_ctors_dtors
-      || (!DECL_STATIC_CONSTRUCTOR (fndecl)
-	  && !DECL_STATIC_DESTRUCTOR (fndecl)))
-    return;
-
-  if (DECL_STATIC_CONSTRUCTOR (fndecl))
-    {
-      VEC_safe_push (tree, gc, static_ctors, fndecl);
-      DECL_STATIC_CONSTRUCTOR (fndecl) = 0;
-    }
-  if (DECL_STATIC_DESTRUCTOR (fndecl))
-    {
-      VEC_safe_push (tree, gc, static_dtors, fndecl);
-      DECL_STATIC_DESTRUCTOR (fndecl) = 0;
-    }
-  node = cgraph_node (fndecl);
-  node->local.disregard_inline_limits = 1;
-  cgraph_mark_reachable_node (node);
-}
-
-/* Define global constructors/destructor functions for the CDTORS, of
-   which they are LEN.  The CDTORS are sorted by initialization
-   priority.  If CTOR_P is true, these are constructors; otherwise,
-   they are destructors.  */
-
-static void
-build_cdtor (bool ctor_p, tree *cdtors, size_t len)
-{
-  size_t i;
-
-  i = 0;
-  while (i < len)
-    {
-      tree body;
-      tree fn;
-      priority_type priority;
-
-      priority = 0;
-      body = NULL_TREE;
-      /* Find the next batch of constructors/destructors with the same
-	 initialization priority.  */
-      do
-	{
-	  priority_type p;
-	  tree call;
-	  fn = cdtors[i];
-	  p = ctor_p ? DECL_INIT_PRIORITY (fn) : DECL_FINI_PRIORITY (fn);
-	  if (!body)
-	    priority = p;
-	  else if (p != priority)
-	    break;
-	  call = build_call_expr (fn, 0);
-	  append_to_statement_list (call, &body);
-	  ++i;
-	}
-      while (i < len);
-      gcc_assert (body != NULL_TREE);
-      /* Generate a function to call all the function of like
-	 priority.  */
-      cgraph_build_static_cdtor (ctor_p ? 'I' : 'D', body, priority);
-    }
-}
-
-/* Comparison function for qsort.  P1 and P2 are actually of type
-   "tree *" and point to static constructors.  DECL_INIT_PRIORITY is
-   used to determine the sort order.  */
-
-static int
-compare_ctor (const void *p1, const void *p2)
-{
-  tree f1;
-  tree f2;
-  int priority1;
-  int priority2;
-
-  f1 = *(const tree *)p1;
-  f2 = *(const tree *)p2;
-  priority1 = DECL_INIT_PRIORITY (f1);
-  priority2 = DECL_INIT_PRIORITY (f2);
-
-  if (priority1 < priority2)
-    return -1;
-  else if (priority1 > priority2)
-    return 1;
-  else
-    /* Ensure a stable sort.  */
-    return (const tree *)p1 - (const tree *)p2;
-}
-
-/* Comparison function for qsort.  P1 and P2 are actually of type
-   "tree *" and point to static destructors.  DECL_FINI_PRIORITY is
-   used to determine the sort order.  */
-
-static int
-compare_dtor (const void *p1, const void *p2)
-{
-  tree f1;
-  tree f2;
-  int priority1;
-  int priority2;
-
-  f1 = *(const tree *)p1;
-  f2 = *(const tree *)p2;
-  priority1 = DECL_FINI_PRIORITY (f1);
-  priority2 = DECL_FINI_PRIORITY (f2);
-
-  if (priority1 < priority2)
-    return -1;
-  else if (priority1 > priority2)
-    return 1;
-  else
-    /* Ensure a stable sort.  */
-    return (const tree *)p1 - (const tree *)p2;
-}
-
-/* Generate functions to call static constructors and destructors
-   for targets that do not support .ctors/.dtors sections.  These
-   functions have magic names which are detected by collect2.  */
-
-static void
-cgraph_build_cdtor_fns (void)
-{
-  if (!VEC_empty (tree, static_ctors))
-    {
-      gcc_assert (!targetm.have_ctors_dtors);
-      qsort (VEC_address (tree, static_ctors),
-	     VEC_length (tree, static_ctors),
-	     sizeof (tree),
-	     compare_ctor);
-      build_cdtor (/*ctor_p=*/true,
-		   VEC_address (tree, static_ctors),
-		   VEC_length (tree, static_ctors));
-      VEC_truncate (tree, static_ctors, 0);
-    }
-
-  if (!VEC_empty (tree, static_dtors))
-    {
-      gcc_assert (!targetm.have_ctors_dtors);
-      qsort (VEC_address (tree, static_dtors),
-	     VEC_length (tree, static_dtors),
-	     sizeof (tree),
-	     compare_dtor);
-      build_cdtor (/*ctor_p=*/false,
-		   VEC_address (tree, static_dtors),
-		   VEC_length (tree, static_dtors));
-      VEC_truncate (tree, static_dtors, 0);
-    }
-}
-
 /* Determine if function DECL is needed.  That is, visible to something
    either outside this translation unit, something magic in the system
    configury.  */
@@ -519,7 +354,6 @@  cgraph_finalize_function (tree decl, boo
   node->local.finalized = true;
   node->lowered = DECL_STRUCT_FUNCTION (decl)->cfg != NULL;
   node->finalized_by_frontend = true;
-  record_cdtor_fn (node->decl);
 
   if (cgraph_decide_is_function_needed (node, decl))
     cgraph_mark_needed_node (node);
@@ -1155,10 +989,6 @@  cgraph_finalize_compilation_unit (void)
   /* Emit size functions we didn't inline.  */
   finalize_size_functions ();
 
-  /* Call functions declared with the "constructor" or "destructor"
-     attribute.  */
-  cgraph_build_cdtor_fns ();
-
   /* Mark alias targets necessary and emit diagnostics.  */
   finish_aliases_1 ();
 
@@ -2007,78 +1837,6 @@  cgraph_optimize (void)
 #endif
 }
 
-
-/* Generate and emit a static constructor or destructor.  WHICH must
-   be one of 'I' (for a constructor) or 'D' (for a destructor).  BODY
-   is a STATEMENT_LIST containing GENERIC statements.  PRIORITY is the
-   initialization priority for this constructor or destructor.  */
-
-void
-cgraph_build_static_cdtor (char which, tree body, int priority)
-{
-  static int counter = 0;
-  char which_buf[16];
-  tree decl, name, resdecl;
-
-  /* The priority is encoded in the constructor or destructor name.
-     collect2 will sort the names and arrange that they are called at
-     program startup.  */
-  sprintf (which_buf, "%c_%.5d_%d", which, priority, counter++);
-  name = get_file_function_name (which_buf);
-
-  decl = build_decl (input_location, FUNCTION_DECL, name,
-		     build_function_type_list (void_type_node, NULL_TREE));
-  current_function_decl = decl;
-
-  resdecl = build_decl (input_location,
-			RESULT_DECL, NULL_TREE, void_type_node);
-  DECL_ARTIFICIAL (resdecl) = 1;
-  DECL_RESULT (decl) = resdecl;
-  DECL_CONTEXT (resdecl) = decl;
-
-  allocate_struct_function (decl, false);
-
-  TREE_STATIC (decl) = 1;
-  TREE_USED (decl) = 1;
-  DECL_ARTIFICIAL (decl) = 1;
-  DECL_NO_INSTRUMENT_FUNCTION_ENTRY_EXIT (decl) = 1;
-  DECL_SAVED_TREE (decl) = body;
-  if (!targetm.have_ctors_dtors)
-    {
-      TREE_PUBLIC (decl) = 1;
-      DECL_PRESERVE_P (decl) = 1;
-    }
-  DECL_UNINLINABLE (decl) = 1;
-
-  DECL_INITIAL (decl) = make_node (BLOCK);
-  TREE_USED (DECL_INITIAL (decl)) = 1;
-
-  DECL_SOURCE_LOCATION (decl) = input_location;
-  cfun->function_end_locus = input_location;
-
-  switch (which)
-    {
-    case 'I':
-      DECL_STATIC_CONSTRUCTOR (decl) = 1;
-      decl_init_priority_insert (decl, priority);
-      break;
-    case 'D':
-      DECL_STATIC_DESTRUCTOR (decl) = 1;
-      decl_fini_priority_insert (decl, priority);
-      break;
-    default:
-      gcc_unreachable ();
-    }
-
-  gimplify_function_tree (decl);
-
-  cgraph_add_new_function (decl, false);
-  cgraph_mark_needed_node (cgraph_node (decl));
-
-  set_cfun (NULL);
-  current_function_decl = NULL;
-}
-
 void
 init_cgraph (void)
 {
Index: ipa.c
===================================================================
--- ipa.c	(revision 163440)
+++ ipa.c	(working copy)
@@ -29,6 +29,8 @@  along with GCC; see the file COPYING3.  
 #include "ggc.h"
 #include "flags.h"
 #include "pointer-set.h"
+#include "target.h"
+#include "tree-iterator.h"
 
 /* Fill array order with all nodes with output flag set in the reverse
    topological order.  */
@@ -1317,3 +1319,310 @@  struct ipa_opt_pass_d pass_ipa_profile =
  NULL,			                /* function_transform */
  NULL					/* variable_transform */
 };
+
+/* Generate and emit a static constructor or destructor.  WHICH must
+   be one of 'I' (for a constructor) or 'D' (for a destructor).  BODY
+   is a STATEMENT_LIST containing GENERIC statements.  PRIORITY is the
+   initialization priority for this constructor or destructor.  */
+
+void
+cgraph_build_static_cdtor (char which, tree body, int priority)
+{
+  static int counter = 0;
+  char which_buf[16];
+  tree decl, name, resdecl;
+
+  /* The priority is encoded in the constructor or destructor name.
+     collect2 will sort the names and arrange that they are called at
+     program startup.  */
+  sprintf (which_buf, "%c_%.5d_%d", which, priority, counter++);
+  name = get_file_function_name (which_buf);
+
+  decl = build_decl (input_location, FUNCTION_DECL, name,
+		     build_function_type_list (void_type_node, NULL_TREE));
+  current_function_decl = decl;
+
+  resdecl = build_decl (input_location,
+			RESULT_DECL, NULL_TREE, void_type_node);
+  DECL_ARTIFICIAL (resdecl) = 1;
+  DECL_RESULT (decl) = resdecl;
+  DECL_CONTEXT (resdecl) = decl;
+
+  allocate_struct_function (decl, false);
+
+  TREE_STATIC (decl) = 1;
+  TREE_USED (decl) = 1;
+  DECL_ARTIFICIAL (decl) = 1;
+  DECL_NO_INSTRUMENT_FUNCTION_ENTRY_EXIT (decl) = 1;
+  DECL_SAVED_TREE (decl) = body;
+  if (!targetm.have_ctors_dtors)
+    {
+      TREE_PUBLIC (decl) = 1;
+      DECL_PRESERVE_P (decl) = 1;
+    }
+  DECL_UNINLINABLE (decl) = 1;
+
+  DECL_INITIAL (decl) = make_node (BLOCK);
+  TREE_USED (DECL_INITIAL (decl)) = 1;
+
+  DECL_SOURCE_LOCATION (decl) = input_location;
+  cfun->function_end_locus = input_location;
+
+  switch (which)
+    {
+    case 'I':
+      DECL_STATIC_CONSTRUCTOR (decl) = 1;
+      decl_init_priority_insert (decl, priority);
+      break;
+    case 'D':
+      DECL_STATIC_DESTRUCTOR (decl) = 1;
+      decl_fini_priority_insert (decl, priority);
+      break;
+    default:
+      gcc_unreachable ();
+    }
+
+  gimplify_function_tree (decl);
+
+  cgraph_add_new_function (decl, false);
+
+  set_cfun (NULL);
+  current_function_decl = NULL;
+}
+
+
+/* A vector of FUNCTION_DECLs declared as static constructors.  */
+static VEC(tree, heap) *static_ctors;
+/* A vector of FUNCTION_DECLs declared as static destructors.  */
+static VEC(tree, heap) *static_dtors;
+
+/* When target does not have ctors and dtors, we call all constructor
+   and destructor by special initialization/destruction function
+   recognized by collect2.
+
+   When we are going to build this function, collect all constructors and
+   destructors and turn them into normal functions.  */
+
+static void
+record_cdtor_fn (struct cgraph_node *node)
+{
+  if (DECL_STATIC_CONSTRUCTOR (node->decl))
+    VEC_safe_push (tree, heap, static_ctors, node->decl);
+  if (DECL_STATIC_DESTRUCTOR (node->decl))
+    VEC_safe_push (tree, heap, static_dtors, node->decl);
+  node = cgraph_node (node->decl);
+  node->local.disregard_inline_limits = 1;
+}
+
+/* Define global constructors/destructor functions for the CDTORS, of
+   which they are LEN.  The CDTORS are sorted by initialization
+   priority.  If CTOR_P is true, these are constructors; otherwise,
+   they are destructors.  */
+
+static void
+build_cdtor (bool ctor_p, tree *cdtors, size_t len)
+{
+  size_t i,j;
+
+  i = 0;
+  while (i < len)
+    {
+      tree body;
+      tree fn;
+      priority_type priority;
+
+      priority = 0;
+      body = NULL_TREE;
+      j = i;
+      do
+	{
+	  priority_type p;
+	  fn = cdtors[i];
+	  p = ctor_p ? DECL_INIT_PRIORITY (fn) : DECL_FINI_PRIORITY (fn);
+	  if (j==i)
+	    priority = p;
+	  else if (p != priority)
+	    break;
+	  j++;
+	}
+      while (j < len);
+
+      /* When there is only once constructor and target supports them, do nothing.  */
+      if (j == i + 1
+	  && targetm.have_ctors_dtors)
+	continue;
+      /* Find the next batch of constructors/destructors with the same
+	 initialization priority.  */
+      do
+	{
+	  priority_type p;
+	  tree call;
+	  fn = cdtors[i];
+	  p = ctor_p ? DECL_INIT_PRIORITY (fn) : DECL_FINI_PRIORITY (fn);
+	  if (p != priority)
+	    break;
+	  call = build_call_expr (fn, 0);
+	  if (ctor_p)
+	    DECL_STATIC_CONSTRUCTOR (fn) = 0;
+	  else
+	    DECL_STATIC_DESTRUCTOR (fn) = 0;
+	  /* We do not want to optimize away pure/const calls here.
+	     When optimizing, these should be already removed, when not
+	     optimizing, we want user to be able to breakpoint in them.  */
+	  TREE_SIDE_EFFECTS (call) = 1;
+	  append_to_statement_list (call, &body);
+	  ++i;
+	}
+      while (i < len);
+      gcc_assert (body != NULL_TREE);
+      /* Generate a function to call all the function of like
+	 priority.  */
+      cgraph_build_static_cdtor (ctor_p ? 'I' : 'D', body, priority);
+    }
+}
+
+/* Comparison function for qsort.  P1 and P2 are actually of type
+   "tree *" and point to static constructors.  DECL_INIT_PRIORITY is
+   used to determine the sort order.  */
+
+static int
+compare_ctor (const void *p1, const void *p2)
+{
+  tree f1;
+  tree f2;
+  int priority1;
+  int priority2;
+
+  f1 = *(const tree *)p1;
+  f2 = *(const tree *)p2;
+  priority1 = DECL_INIT_PRIORITY (f1);
+  priority2 = DECL_INIT_PRIORITY (f2);
+
+  if (priority1 < priority2)
+    return -1;
+  else if (priority1 > priority2)
+    return 1;
+  else
+    /* Ensure a stable sort.  Constructors are executed in backwarding
+       order to make LTO initialize braries first.  */
+    return DECL_UID (f2) - DECL_UID (f1);
+}
+
+/* Comparison function for qsort.  P1 and P2 are actually of type
+   "tree *" and point to static destructors.  DECL_FINI_PRIORITY is
+   used to determine the sort order.  */
+
+static int
+compare_dtor (const void *p1, const void *p2)
+{
+  tree f1;
+  tree f2;
+  int priority1;
+  int priority2;
+
+  f1 = *(const tree *)p1;
+  f2 = *(const tree *)p2;
+  priority1 = DECL_FINI_PRIORITY (f1);
+  priority2 = DECL_FINI_PRIORITY (f2);
+
+  if (priority1 < priority2)
+    return -1;
+  else if (priority1 > priority2)
+    return 1;
+  else
+    /* Ensure a stable sort.  */
+    return DECL_UID (f1) - DECL_UID (f2);
+}
+
+/* Generate functions to call static constructors and destructors
+   for targets that do not support .ctors/.dtors sections.  These
+   functions have magic names which are detected by collect2.  */
+
+static void
+build_cdtor_fns (void)
+{
+  if (!VEC_empty (tree, static_ctors))
+    {
+      gcc_assert (!targetm.have_ctors_dtors || in_lto_p);
+      qsort (VEC_address (tree, static_ctors),
+	     VEC_length (tree, static_ctors),
+	     sizeof (tree),
+	     compare_ctor);
+      build_cdtor (/*ctor_p=*/true,
+		   VEC_address (tree, static_ctors),
+		   VEC_length (tree, static_ctors));
+      VEC_truncate (tree, static_ctors, 0);
+    }
+
+  if (!VEC_empty (tree, static_dtors))
+    {
+      gcc_assert (!targetm.have_ctors_dtors || in_lto_p);
+      qsort (VEC_address (tree, static_dtors),
+	     VEC_length (tree, static_dtors),
+	     sizeof (tree),
+	     compare_dtor);
+      build_cdtor (/*ctor_p=*/false,
+		   VEC_address (tree, static_dtors),
+		   VEC_length (tree, static_dtors));
+      VEC_truncate (tree, static_dtors, 0);
+    }
+}
+
+/* Look for constructors and destructors and produce function calling them.
+   This is needed for targets not supporting ctors or dtors, but we perform the
+   transformation also at linktime to merge possibly numberous
+   constructors/destructors into single function to improve code locality and
+   reduce size.  */
+
+static unsigned int
+ipa_cdtor_merge (void)
+{
+  struct cgraph_node *node;
+  for (node = cgraph_nodes; node; node = node->next)
+    if (node->analyzed
+	&& (DECL_STATIC_CONSTRUCTOR (node->decl)
+	    || DECL_STATIC_DESTRUCTOR (node->decl)))
+       record_cdtor_fn (node);
+  build_cdtor_fns ();
+  VEC_free (tree, heap, static_ctors);
+  VEC_free (tree, heap, static_dtors);
+  return 0;
+}
+
+/* Perform the pass when we have no ctors/dtors support
+   or at LTO time to merge multiple constructors into single
+   function.  */
+
+static bool
+gate_ipa_cdtor_merge (void)
+{
+  return !targetm.have_ctors_dtors || (optimize && in_lto_p);
+}
+
+struct ipa_opt_pass_d pass_ipa_cdtor_merge =
+{
+ {
+  IPA_PASS,
+  "cdtor",				/* name */
+  gate_ipa_cdtor_merge,			/* gate */
+  ipa_cdtor_merge,		        /* execute */
+  NULL,					/* sub */
+  NULL,					/* next */
+  0,					/* static_pass_number */
+  TV_CGRAPHOPT,			        /* tv_id */
+  0,	                                /* properties_required */
+  0,					/* properties_provided */
+  0,					/* properties_destroyed */
+  0,					/* todo_flags_start */
+  0                                     /* todo_flags_finish */
+ },
+ NULL,				        /* generate_summary */
+ NULL,					/* write_summary */
+ NULL,					/* read_summary */
+ NULL,					/* write_optimization_summary */
+ NULL,					/* read_optimization_summary */
+ NULL,					/* stmt_fixup */
+ 0,					/* TODOs */
+ NULL,			                /* function_transform */
+ NULL					/* variable_transform */
+};
Index: passes.c
===================================================================
--- passes.c	(revision 163440)
+++ passes.c	(working copy)
@@ -811,6 +811,7 @@  init_optimization_passes (void)
   NEXT_PASS (pass_ipa_whole_program_visibility);
   NEXT_PASS (pass_ipa_profile);
   NEXT_PASS (pass_ipa_cp);
+  NEXT_PASS (pass_ipa_cdtor_merge);
   NEXT_PASS (pass_ipa_inline);
   NEXT_PASS (pass_ipa_pure_const);
   NEXT_PASS (pass_ipa_reference);
Index: ipa-prop.c
===================================================================
--- ipa-prop.c	(revision 163440)
+++ ipa-prop.c	(working copy)
@@ -1541,11 +1541,12 @@  update_indirect_edges_after_inlining (st
 				      struct cgraph_node *node,
 				      VEC (cgraph_edge_p, heap) **new_edges)
 {
-  struct ipa_edge_args *top = IPA_EDGE_REF (cs);
+  struct ipa_edge_args *top;
   struct cgraph_edge *ie, *next_ie, *new_direct_edge;
   bool res = false;
 
   ipa_check_create_edge_args ();
+  top = IPA_EDGE_REF (cs);
 
   for (ie = node->indirect_calls; ie; ie = next_ie)
     {