Message ID | 20151211013409.GB5527@kam.mff.cuni.cz |
---|---|
State | New |
Headers | show |
On Fri, 11 Dec 2015, Jan Hubicka wrote: > Hi, > this patch saves about 30% of global decl stream size in firefox. While > implementing the lto sections for initializers I put very stupid heursitcs > to get_symbol_initial_value deciding whether the initializer is better streamed > inline or offline. This ignores strings and may get bit out of hand. > > With this patch and the compression, the largest ltrans unit is > 118479156 bytes and 103584016 out of that is a global decl stream. So that's still 87% global decl stream. At least for this ltrans unit there couldn't have been a 30% saving. Btw, the separate initializer sections will get separate string encoders, right? So we might end up with larger files due to less string sharing which would happen when the strings go into the decl section. > Bootstrapped/regtested x86_64-linux OK? > > Honza > > * lto-streamer-out.c (subtract_estimated_size): New function > (get_symbol_initial_value): Use it. > Index: lto-streamer-out.c > =================================================================== > --- lto-streamer-out.c (revision 231546) > +++ lto-streamer-out.c (working copy) > @@ -309,6 +309,33 @@ lto_is_streamable (tree expr) > || TREE_CODE_CLASS (code) != tcc_statement); > } > > +/* Very rough estimate of streaming size of the initializer. If we ignored > + presence of strings, we could simply just count number of non-indexable > + tree nodes and number of references to indexable nodes. Strings however > + may be very large and we do not want to dump them int othe global stream. > + > + Count the size of initializer until the size in DATA is positive. */ > + > +static tree > +subtract_estimated_size (tree *tp, int *ws, void *data) > +{ > + long *sum = (long *)data; > + if (tree_is_indexable (*tp)) > + { > + *sum -= 4; > + *ws = 0; > + } > + if (TREE_CODE (*tp) == STRING_CST) > + *sum -= TREE_STRING_LENGTH (*tp) + 8; > + if (TREE_CODE (*tp) == IDENTIFIER_NODE) > + *sum -= IDENTIFIER_LENGTH (*tp) + 8; I doubt we can ever see those. > + else > + *sum -= 16; > + if (*sum < 0) > + return *tp; > + return NULL_TREE; > +} > + I'd like to see an explanation for the magic constants. Also a FE might construct int *a = &CONST_DECL; with CONST_DECL having a large array initializer. walk_tree doesn't traverse CONST_DECLs DECL_INITIAL. [insert rant about STRING_CSTs being special and not tied to a CONST_DECL] > /* For EXPR lookup and return what we want to stream to OB as DECL_INITIAL. */ > > @@ -329,10 +356,16 @@ get_symbol_initial_value (lto_symtab_enc > varpool_node *vnode; > /* Extra section needs about 30 bytes; do not produce it for simple > scalar values. */ > - if (TREE_CODE (DECL_INITIAL (expr)) == CONSTRUCTOR > - || !(vnode = varpool_node::get (expr)) > + if (!(vnode = varpool_node::get (expr)) > || !lto_symtab_encoder_encode_initializer_p (encoder, vnode)) > initial = error_mark_node; > + if (initial != error_mark_node) > + { > + long max_size = 30; > + if (walk_tree (&initial, subtract_estimated_size, (void *)&max_size, > + NULL)) > + initial = error_mark_node; > + } > } > return initial; > >
> On Fri, 11 Dec 2015, Jan Hubicka wrote: > > > Hi, > > this patch saves about 30% of global decl stream size in firefox. While > > implementing the lto sections for initializers I put very stupid heursitcs > > to get_symbol_initial_value deciding whether the initializer is better streamed > > inline or offline. This ignores strings and may get bit out of hand. > > > > With this patch and the compression, the largest ltrans unit is > > 118479156 bytes and 103584016 out of that is a global decl stream. > > So that's still 87% global decl stream. At least for this ltrans > unit there couldn't have been a 30% saving. Yeah, 30% saving was the overall size of global decls streams produced by WPA, so this one had bad luck. > > Btw, the separate initializer sections will get separate string > encoders, right? So we might end up with larger files due > to less string sharing which would happen when the strings go into > the decl section. Yep, if we get many duplicated stirngs in decl sections, then we will end up with larger files. Same thing hapens if you would use the string in many function initializers.. > > + if (TREE_CODE (*tp) == STRING_CST) > > + *sum -= TREE_STRING_LENGTH (*tp) + 8; > > + if (TREE_CODE (*tp) == IDENTIFIER_NODE) > > + *sum -= IDENTIFIER_LENGTH (*tp) + 8; > > I doubt we can ever see those. Me too, I just went though the vairable length trees for completeness. Can just add gcc_unreachable. > > > + else > > + *sum -= 16; > > + if (*sum < 0) > > + return *tp; > > + return NULL_TREE; > > +} > > + > > I'd like to see an explanation for the magic constants. Also OK, nothing really scientific behind them. I simply divided size of the stream by number of trees in it from the lto stats and picked nearest power of 2. > a FE might construct > > int *a = &CONST_DECL; > > with CONST_DECL having a large array initializer. walk_tree doesn't > traverse CONST_DECLs DECL_INITIAL. CONST_DECL is indexale, so in this case we will always have it in global stream. We may want to stream initializers for those but for that we need const decl to be in symbol table (it should) for which we need to sanity the visibility bits... We have open PR for that. Honza
Index: lto-streamer-out.c =================================================================== --- lto-streamer-out.c (revision 231546) +++ lto-streamer-out.c (working copy) @@ -309,6 +309,33 @@ lto_is_streamable (tree expr) || TREE_CODE_CLASS (code) != tcc_statement); } +/* Very rough estimate of streaming size of the initializer. If we ignored + presence of strings, we could simply just count number of non-indexable + tree nodes and number of references to indexable nodes. Strings however + may be very large and we do not want to dump them int othe global stream. + + Count the size of initializer until the size in DATA is positive. */ + +static tree +subtract_estimated_size (tree *tp, int *ws, void *data) +{ + long *sum = (long *)data; + if (tree_is_indexable (*tp)) + { + *sum -= 4; + *ws = 0; + } + if (TREE_CODE (*tp) == STRING_CST) + *sum -= TREE_STRING_LENGTH (*tp) + 8; + if (TREE_CODE (*tp) == IDENTIFIER_NODE) + *sum -= IDENTIFIER_LENGTH (*tp) + 8; + else + *sum -= 16; + if (*sum < 0) + return *tp; + return NULL_TREE; +} + /* For EXPR lookup and return what we want to stream to OB as DECL_INITIAL. */ @@ -329,10 +356,16 @@ get_symbol_initial_value (lto_symtab_enc varpool_node *vnode; /* Extra section needs about 30 bytes; do not produce it for simple scalar values. */ - if (TREE_CODE (DECL_INITIAL (expr)) == CONSTRUCTOR - || !(vnode = varpool_node::get (expr)) + if (!(vnode = varpool_node::get (expr)) || !lto_symtab_encoder_encode_initializer_p (encoder, vnode)) initial = error_mark_node; + if (initial != error_mark_node) + { + long max_size = 30; + if (walk_tree (&initial, subtract_estimated_size, (void *)&max_size, + NULL)) + initial = error_mark_node; + } } return initial;