Patchwork [PR52640] Fix quadratic behavior with many referenced extern functions

login
register
mail settings
Submitter Steven Bosscher
Date March 21, 2012, 7:30 p.m.
Message ID <CABu31nNSwAg_N0AJyxwVUd8Z5Ou_U6OiMwjJxf_Nu4Zp42fesQ@mail.gmail.com>
Download mbox | patch
Permalink /patch/148066/
State New
Headers show

Comments

Steven Bosscher - March 21, 2012, 7:30 p.m.
Hello,

The test case for this bug triggeres O(extern_delcs**2) behavior
because value_member traverses the pending_assemble_externals list
from start to end for every new extern decl.

The solution I've picked, is to add a pointer set, and while there I
made pending_assemble_externals a VEC instead of a TREE_LIST. I've
also added a FIXME to clarify that this whole situation of having
places calling assemble_external is not desirable.

On gcc110, the test case compiles in ~4s with the patch, and ~24s
without. If I add LIM5(Y) and LIM5(Z), the compile time is ~15s with
the patch, and ~372s without the patch. I am not sure what is
reasonable for this kind of test case, but on a smaller machine the
test case will probably blow up if I add those two extra LIM5s.

Anyway. Bootstrapped&tested on powerpc64-unknown-linux-gnu.
OK for trunk?
OK for all open release branches too?

Ciao!
Steven

gcc/
	* varasm.c (pending_assemble_externals): Make a VEC.
	(pending_assemble_externals_set): New pointer set.
	(process_pending_assemble_externals): Traverse the VEC instead
	of the TREE_LIST. Destroy the pointer set.
	(assemble_external): See if decl is in pending_assemble_externals_set,
	and add it to pending_assemble_externals if necessary.
	(init_varasm_once): Allocate pending_assemble_externals and
	pending_assemble_externals_set.

testsuite/
	* gcc.c-torture/compile/limits-externdecl.c: New test for PR62640.
gcc/
	* varasm.c (pending_assemble_externals): Make a VEC.
	(pending_assemble_externals_set): New pointer set.
	(process_pending_assemble_externals): Traverse the VEC instead
	of the TREE_LIST. Destroy the pointer set.
	(assemble_external): See if decl is in pending_assemble_externals_set,
	and add it to pending_assemble_externals if necessary.
	(init_varasm_once): Allocate pending_assemble_externals and
	pending_assemble_externals_set.

testsuite/
	* gcc.c-torture/compile/limits-externdecl.c: New test for PR62640.
Diego Novillo - March 21, 2012, 8:35 p.m.
On 3/21/12 3:30 PM, Steven Bosscher wrote:

> +/* FIXME: Trunk is at GCC 4.8 now and the above problem still hasn't been
> +   addressed properly.  This caused PR 52640 due to O(external_decls**2)
> +   lookups in the pending_assemble_externals queue in assemble_external.
> +   Paper over with this pointer set.  (And pending_assemble_externals even
> +   was a TREE_LIST before?!)  */
> +static struct pointer_set_t *pending_assemble_externals_set;

Can you add some description on what this pointer set does?


> +
>  #ifdef ASM_OUTPUT_EXTERNAL
>  /* True if DECL is a function decl for which no out-of-line copy exists.
>     It is assumed that DECL's assembler name has been set.  */
> @@ -2146,11 +2153,14 @@ void
>  process_pending_assemble_externals (void)
>  {
>  #ifdef ASM_OUTPUT_EXTERNAL
> -  tree list;
> -  for (list = pending_assemble_externals; list; list = TREE_CHAIN (list))
> -    assemble_external_real (TREE_VALUE (list));
> +  size_t i;
> +  tree decl;
>
> -  pending_assemble_externals = 0;
> +  FOR_EACH_VEC_ELT (tree, pending_assemble_externals, i, decl)
> +    assemble_external_real (decl);
> +  VEC_free (tree, gc, pending_assemble_externals);

Just pending_assemble_externals = NULL;

OK for all release branches (if they are open for fixes).


Diego.
Steven Bosscher - March 26, 2012, 5:01 p.m.
On Wed, Mar 21, 2012 at 9:35 PM, Diego Novillo <dnovillo@google.com> wrote:
> On 3/21/12 3:30 PM, Steven Bosscher wrote:
>
>> +/* FIXME: Trunk is at GCC 4.8 now and the above problem still hasn't been
>> +   addressed properly.  This caused PR 52640 due to O(external_decls**2)
>> +   lookups in the pending_assemble_externals queue in assemble_external.
>> +   Paper over with this pointer set.  (And pending_assemble_externals
>> even
>> +   was a TREE_LIST before?!)  */
>> +static struct pointer_set_t *pending_assemble_externals_set;
>
>
> Can you add some description on what this pointer set does?

Done, I hope you agree it's sufficiently clear now what this set is for.

> OK for all release branches (if they are open for fixes).

I've committed a version to the release branches where the
pending_assemble_externals remains a TREE_LIST. I saw no reason to
complicate the patch more than strictly necessary for the release
branches. For trunk I'm trying to make pending_assemble_externals go
away.

Ciao!
Steven

Patch

Index: varasm.c
===================================================================
--- varasm.c	(revision 185603)
+++ varasm.c	(working copy)
@@ -2097,8 +2097,15 @@  contains_pointers_p (tree type)
    the compilation unit is finalized.  This is the best we can do for
    right now (i.e. stage 3 of GCC 4.0) - the right thing is to delay
    it all the way to final.  See PR 17982 for further discussion.  */
-static GTY(()) tree pending_assemble_externals;
+static GTY(()) VEC(tree,gc) *pending_assemble_externals;
 
+/* FIXME: Trunk is at GCC 4.8 now and the above problem still hasn't been
+   addressed properly.  This caused PR 52640 due to O(external_decls**2)
+   lookups in the pending_assemble_externals queue in assemble_external.
+   Paper over with this pointer set.  (And pending_assemble_externals even
+   was a TREE_LIST before?!)  */
+static struct pointer_set_t *pending_assemble_externals_set;
+
 #ifdef ASM_OUTPUT_EXTERNAL
 /* True if DECL is a function decl for which no out-of-line copy exists.
    It is assumed that DECL's assembler name has been set.  */
@@ -2146,11 +2153,14 @@  void
 process_pending_assemble_externals (void)
 {
 #ifdef ASM_OUTPUT_EXTERNAL
-  tree list;
-  for (list = pending_assemble_externals; list; list = TREE_CHAIN (list))
-    assemble_external_real (TREE_VALUE (list));
+  size_t i;
+  tree decl;
 
-  pending_assemble_externals = 0;
+  FOR_EACH_VEC_ELT (tree, pending_assemble_externals, i, decl)
+    assemble_external_real (decl);
+  VEC_free (tree, gc, pending_assemble_externals);
+
+  pointer_set_destroy (pending_assemble_externals_set);
 #endif
 }
 
@@ -2191,9 +2201,8 @@  assemble_external (tree decl ATTRIBUTE_UNUSED)
     weak_decls = tree_cons (NULL, decl, weak_decls);
 
 #ifdef ASM_OUTPUT_EXTERNAL
-  if (value_member (decl, pending_assemble_externals) == NULL_TREE)
-    pending_assemble_externals = tree_cons (NULL, decl,
-					    pending_assemble_externals);
+  if (! pointer_set_insert (pending_assemble_externals_set, decl))
+    VEC_safe_push (tree, gc, pending_assemble_externals, decl);
 #endif
 }
 
@@ -6168,6 +6177,11 @@  init_varasm_once (void)
 
   if (readonly_data_section == NULL)
     readonly_data_section = text_section;
+
+#ifdef ASM_OUTPUT_EXTERNAL
+  pending_assemble_externals = VEC_alloc (tree, gc, 12);
+  pending_assemble_externals_set = pointer_set_create ();
+#endif
 }
 
 enum tls_model
Index: testsuite/gcc.c-torture/compile/limits-externdecl.c
===================================================================
--- testsuite/gcc.c-torture/compile/limits-externdecl.c	(revision 0)
+++ testsuite/gcc.c-torture/compile/limits-externdecl.c	(revision 0)
@@ -0,0 +1,56 @@ 
+/* Inspired by the test case for PR middle-end/52640.  */
+
+typedef struct
+{
+    char *value;
+} REFERENCE;
+
+/* Add a few "extern int Xxxxxx ();" declarations.  */
+#undef DEF
+#undef LIM1
+#undef LIM2
+#undef LIM3
+#undef LIM4
+#undef LIM5
+#undef LIM6
+#define DEF(x) 	extern int x ()
+#define LIM1(x) DEF(x##0); DEF(x##1); DEF(x##2); DEF(x##3); DEF(x##4); \
+		DEF(x##5); DEF(x##6); DEF(x##7); DEF(x##8); DEF(x##9);
+#define LIM2(x) LIM1(x##0) LIM1(x##1) LIM1(x##2) LIM1(x##3) LIM1(x##4) \
+		LIM1(x##5) LIM1(x##6) LIM1(x##7) LIM1(x##8) LIM1(x##9)
+#define LIM3(x) LIM2(x##0) LIM2(x##1) LIM2(x##2) LIM2(x##3) LIM2(x##4) \
+		LIM2(x##5) LIM2(x##6) LIM2(x##7) LIM2(x##8) LIM2(x##9)
+#define LIM4(x) LIM3(x##0) LIM3(x##1) LIM3(x##2) LIM3(x##3) LIM3(x##4) \
+		LIM3(x##5) LIM3(x##6) LIM3(x##7) LIM3(x##8) LIM3(x##9)
+#define LIM5(x) LIM4(x##0) LIM4(x##1) LIM4(x##2) LIM4(x##3) LIM4(x##4) \
+		LIM4(x##5) LIM4(x##6) LIM4(x##7) LIM4(x##8) LIM4(x##9)
+#define LIM6(x) LIM5(x##0) LIM5(x##1) LIM5(x##2) LIM5(x##3) LIM5(x##4) \
+		LIM5(x##5) LIM5(x##6) LIM5(x##7) LIM5(x##8) LIM5(x##9)
+LIM5 (X);
+
+/* Add references to them, or GCC will simply ignore the extern decls.  */
+#undef DEF
+#undef LIM1
+#undef LIM2
+#undef LIM3
+#undef LIM4
+#undef LIM5
+#undef LIM6
+#define DEF(x)	(char *) x
+#define LIM1(x) DEF(x##0), DEF(x##1), DEF(x##2), DEF(x##3), DEF(x##4), \
+		DEF(x##5), DEF(x##6), DEF(x##7), DEF(x##8), DEF(x##9),
+#define LIM2(x) LIM1(x##0) LIM1(x##1) LIM1(x##2) LIM1(x##3) LIM1(x##4) \
+		LIM1(x##5) LIM1(x##6) LIM1(x##7) LIM1(x##8) LIM1(x##9)
+#define LIM3(x) LIM2(x##0) LIM2(x##1) LIM2(x##2) LIM2(x##3) LIM2(x##4) \
+		LIM2(x##5) LIM2(x##6) LIM2(x##7) LIM2(x##8) LIM2(x##9)
+#define LIM4(x) LIM3(x##0) LIM3(x##1) LIM3(x##2) LIM3(x##3) LIM3(x##4) \
+		LIM3(x##5) LIM3(x##6) LIM3(x##7) LIM3(x##8) LIM3(x##9)
+#define LIM5(x) LIM4(x##0) LIM4(x##1) LIM4(x##2) LIM4(x##3) LIM4(x##4) \
+		LIM4(x##5) LIM4(x##6) LIM4(x##7) LIM4(x##8) LIM4(x##9)
+#define LIM6(x) LIM5(x##0) LIM5(x##1) LIM5(x##2) LIM5(x##3) LIM5(x##4) \
+		LIM5(x##5) LIM5(x##6) LIM5(x##7) LIM5(x##8) LIM5(x##9)
+REFERENCE references[] = {
+  LIM5 (X)
+  0
+};
+