diff mbox

Reduce global decl stream

Message ID 20151211013409.GB5527@kam.mff.cuni.cz
State New
Headers show

Commit Message

Jan Hubicka Dec. 11, 2015, 1:34 a.m. UTC
Hi,
this patch saves about 30% of global decl stream size in firefox.  While
implementing the lto sections for initializers I put very stupid heursitcs
to get_symbol_initial_value deciding whether the initializer is better streamed
inline or offline.  This ignores strings and may get bit out of hand.

With this patch and the compression, the largest ltrans unit is 
118479156 bytes and 103584016 out of that is a global decl stream.

Bootstrapped/regtested x86_64-linux OK?

Honza

	* lto-streamer-out.c (subtract_estimated_size): New function
	(get_symbol_initial_value): Use it.

Comments

Richard Biener Dec. 11, 2015, 9:09 a.m. UTC | #1
On Fri, 11 Dec 2015, Jan Hubicka wrote:

> Hi,
> this patch saves about 30% of global decl stream size in firefox.  While
> implementing the lto sections for initializers I put very stupid heursitcs
> to get_symbol_initial_value deciding whether the initializer is better streamed
> inline or offline.  This ignores strings and may get bit out of hand.
> 
> With this patch and the compression, the largest ltrans unit is 
> 118479156 bytes and 103584016 out of that is a global decl stream.

So that's still 87% global decl stream.  At least for this ltrans
unit there couldn't have been a 30% saving.

Btw, the separate initializer sections will get separate string
encoders, right?  So we might end up with larger files due
to less string sharing which would happen when the strings go into
the decl section.

> Bootstrapped/regtested x86_64-linux OK?
> 
> Honza
> 
> 	* lto-streamer-out.c (subtract_estimated_size): New function
> 	(get_symbol_initial_value): Use it.
> Index: lto-streamer-out.c
> ===================================================================
> --- lto-streamer-out.c	(revision 231546)
> +++ lto-streamer-out.c	(working copy)
> @@ -309,6 +309,33 @@ lto_is_streamable (tree expr)
>  	     || TREE_CODE_CLASS (code) != tcc_statement);
>  }
>  
> +/* Very rough estimate of streaming size of the initializer.  If we ignored
> +   presence of strings, we could simply just count number of non-indexable
> +   tree nodes and number of references to indexable nodes.  Strings however
> +   may be very large and we do not want to dump them int othe global stream.
> +
> +   Count the size of initializer until the size in DATA is positive.  */
> +
> +static tree
> +subtract_estimated_size (tree *tp, int *ws, void *data)
> +{
> +  long *sum = (long *)data;
> +  if (tree_is_indexable (*tp))
> +    {
> +      *sum -= 4;
> +      *ws = 0;
> +    }
> +  if (TREE_CODE (*tp) == STRING_CST)
> +    *sum -= TREE_STRING_LENGTH (*tp) + 8;
> +  if (TREE_CODE (*tp) == IDENTIFIER_NODE)
> +    *sum -= IDENTIFIER_LENGTH (*tp) + 8;

I doubt we can ever see those.

> +  else
> +    *sum -= 16;
> +  if (*sum < 0)
> +    return *tp;
> +  return NULL_TREE;
> +}
> +

I'd like to see an explanation for the magic constants.  Also
a FE might construct

 int *a = &CONST_DECL;

with CONST_DECL having a large array initializer.  walk_tree doesn't
traverse CONST_DECLs DECL_INITIAL.

[insert rant about STRING_CSTs being special and not tied to a CONST_DECL]

>  /* For EXPR lookup and return what we want to stream to OB as DECL_INITIAL.  */
>  
> @@ -329,10 +356,16 @@ get_symbol_initial_value (lto_symtab_enc
>        varpool_node *vnode;
>        /* Extra section needs about 30 bytes; do not produce it for simple
>  	 scalar values.  */
> -      if (TREE_CODE (DECL_INITIAL (expr)) == CONSTRUCTOR
> -	  || !(vnode = varpool_node::get (expr))
> +      if (!(vnode = varpool_node::get (expr))
>  	  || !lto_symtab_encoder_encode_initializer_p (encoder, vnode))
>          initial = error_mark_node;
> +      if (initial != error_mark_node)
> +	{
> +	  long max_size = 30;
> +	  if (walk_tree (&initial, subtract_estimated_size, (void *)&max_size,
> +			 NULL))
> +	    initial = error_mark_node;
> +	}
>      }
>    return initial;
> 
>
Jan Hubicka Dec. 11, 2015, 9:22 a.m. UTC | #2
> On Fri, 11 Dec 2015, Jan Hubicka wrote:
> 
> > Hi,
> > this patch saves about 30% of global decl stream size in firefox.  While
> > implementing the lto sections for initializers I put very stupid heursitcs
> > to get_symbol_initial_value deciding whether the initializer is better streamed
> > inline or offline.  This ignores strings and may get bit out of hand.
> > 
> > With this patch and the compression, the largest ltrans unit is 
> > 118479156 bytes and 103584016 out of that is a global decl stream.
> 
> So that's still 87% global decl stream.  At least for this ltrans
> unit there couldn't have been a 30% saving.

Yeah, 30% saving was the overall size of global decls streams produced
by WPA, so this one had bad luck. 
> 
> Btw, the separate initializer sections will get separate string
> encoders, right?  So we might end up with larger files due
> to less string sharing which would happen when the strings go into
> the decl section.

Yep, if we get many duplicated stirngs in decl sections, then we will
end up with larger files.  Same thing hapens if you would use the string
in many function initializers..
> > +  if (TREE_CODE (*tp) == STRING_CST)
> > +    *sum -= TREE_STRING_LENGTH (*tp) + 8;
> > +  if (TREE_CODE (*tp) == IDENTIFIER_NODE)
> > +    *sum -= IDENTIFIER_LENGTH (*tp) + 8;
> 
> I doubt we can ever see those.

Me too, I just went though the vairable length trees for completeness.
Can just add gcc_unreachable.
> 
> > +  else
> > +    *sum -= 16;
> > +  if (*sum < 0)
> > +    return *tp;
> > +  return NULL_TREE;
> > +}
> > +
> 
> I'd like to see an explanation for the magic constants.  Also

OK, nothing really scientific behind them.  I simply divided size
of the stream by number of trees in it from the lto stats and picked
nearest power of 2.

> a FE might construct
> 
>  int *a = &CONST_DECL;
> 
> with CONST_DECL having a large array initializer.  walk_tree doesn't
> traverse CONST_DECLs DECL_INITIAL.

CONST_DECL is indexale, so in this case we will always have it in global
stream. We may want to stream initializers for those but for that we need
const decl to be in symbol table (it should) for which we need to sanity
the visibility bits... We have open PR for that.

Honza
diff mbox

Patch

Index: lto-streamer-out.c
===================================================================
--- lto-streamer-out.c	(revision 231546)
+++ lto-streamer-out.c	(working copy)
@@ -309,6 +309,33 @@  lto_is_streamable (tree expr)
 	     || TREE_CODE_CLASS (code) != tcc_statement);
 }
 
+/* Very rough estimate of streaming size of the initializer.  If we ignored
+   presence of strings, we could simply just count number of non-indexable
+   tree nodes and number of references to indexable nodes.  Strings however
+   may be very large and we do not want to dump them int othe global stream.
+
+   Count the size of initializer until the size in DATA is positive.  */
+
+static tree
+subtract_estimated_size (tree *tp, int *ws, void *data)
+{
+  long *sum = (long *)data;
+  if (tree_is_indexable (*tp))
+    {
+      *sum -= 4;
+      *ws = 0;
+    }
+  if (TREE_CODE (*tp) == STRING_CST)
+    *sum -= TREE_STRING_LENGTH (*tp) + 8;
+  if (TREE_CODE (*tp) == IDENTIFIER_NODE)
+    *sum -= IDENTIFIER_LENGTH (*tp) + 8;
+  else
+    *sum -= 16;
+  if (*sum < 0)
+    return *tp;
+  return NULL_TREE;
+}
+
 
 /* For EXPR lookup and return what we want to stream to OB as DECL_INITIAL.  */
 
@@ -329,10 +356,16 @@  get_symbol_initial_value (lto_symtab_enc
       varpool_node *vnode;
       /* Extra section needs about 30 bytes; do not produce it for simple
 	 scalar values.  */
-      if (TREE_CODE (DECL_INITIAL (expr)) == CONSTRUCTOR
-	  || !(vnode = varpool_node::get (expr))
+      if (!(vnode = varpool_node::get (expr))
 	  || !lto_symtab_encoder_encode_initializer_p (encoder, vnode))
         initial = error_mark_node;
+      if (initial != error_mark_node)
+	{
+	  long max_size = 30;
+	  if (walk_tree (&initial, subtract_estimated_size, (void *)&max_size,
+			 NULL))
+	    initial = error_mark_node;
+	}
     }
 
   return initial;