Message ID | 20160115110802.GD77658@kam.mff.cuni.cz |
---|---|
State | New |
Headers | show |
On Fri, 15 Jan 2016, Jan Hubicka wrote: > Hi, > this is last of my patches to improve partitionability of programs. > This patch compacts BLOCK_VARs lists and trows away TYPE_DECLs for -g0 > and redundant TYPE_DECLs on all levels. It does make noticeable difference > on firefox, but I managed to erase the numbers (can re-test them if requested) > > Bootstrapped/regtested x86_64-linux, OK? Hmm. So I wonder why remove_unused_locals does not do this then or if it does not because it would change code-gen based on -g/-g0 I wonder why your patch wouldn't introduce that issue with LTO. That said - improve remove_unused_locals instead please? Thanks, Richard. > * tree.c (needed_in_block_vars_p): New function. > (free_lang_data_in_decl): Use it. > Index: tree.c > =================================================================== > --- tree.c (revision 232407) > +++ tree.c (working copy) > @@ -5343,6 +5343,42 @@ need_assembler_name_p (tree decl) > return true; > } > > +/* Return true if DECL should stay in BLOCK_VARs list. */ > + > +static inline bool > +needed_in_block_vars_p (const_tree decl) > +{ > + if (TREE_CODE (decl) == VAR_DECL && TREE_USED (decl) > + && (!TREE_STATIC (decl) || DECL_EXTERNAL (decl))) > + return true; > + if (TREE_CODE (decl) == RESULT_DECL) > + return true; > + /* Dwarf2out output non-fortran and non-ada const decls when the > + enum is output and it will not consider language to be ada or fortran > + in LTO. */ > + if (TREE_CODE (decl) == CONST_DECL) > + return false; > + if (debug_info_level >= DINFO_LEVEL_TERSE) > + return false; > + /* Strip builtins from the translation-unit BLOCK. We still have targets > + without builtin_decl_explicit support and also builtins are shared > + nodes and thus we can't use TREE_CHAIN in multiple lists. */ > + if (TREE_CODE (decl) == FUNCTION_DECL > + && DECL_BUILT_IN (decl)) > + return false; > + if (DECL_IGNORED_P (decl)) > + return false; > + if (TREE_CODE (decl) == TYPE_DECL > + && is_redundant_typedef (decl)) > + return false; > + gcc_checking_assert (TREE_CODE (decl) == TYPE_DECL > + || TREE_CODE (decl) == VAR_DECL > + || TREE_CODE (decl) == FUNCTION_DECL > + || TREE_CODE (decl) == LABEL_DECL > + || TREE_CODE (decl) == RESULT_DECL); > + return true; > +} > + > > /* Reset all language specific information still present in symbol > DECL. */ > @@ -5445,15 +5481,11 @@ free_lang_data_in_decl (tree decl) > && DECL_INITIAL (decl) > && TREE_CODE (DECL_INITIAL (decl)) == BLOCK) > { > - /* Strip builtins from the translation-unit BLOCK. We still have targets > - without builtin_decl_explicit support and also builtins are shared > - nodes and thus we can't use TREE_CHAIN in multiple lists. */ > tree *nextp = &BLOCK_VARS (DECL_INITIAL (decl)); > while (*nextp) > { > tree var = *nextp; > - if (TREE_CODE (var) == FUNCTION_DECL > - && DECL_BUILT_IN (var)) > + if (needed_in_block_vars_p (var)) > *nextp = TREE_CHAIN (var); > else > nextp = &TREE_CHAIN (var); > >
> On Fri, 15 Jan 2016, Jan Hubicka wrote: > > > Hi, > > this is last of my patches to improve partitionability of programs. > > This patch compacts BLOCK_VARs lists and trows away TYPE_DECLs for -g0 > > and redundant TYPE_DECLs on all levels. It does make noticeable difference > > on firefox, but I managed to erase the numbers (can re-test them if requested) > > > > Bootstrapped/regtested x86_64-linux, OK? > > Hmm. So I wonder why remove_unused_locals does not do this then > or if it does not because it would change code-gen based on -g/-g0 > I wonder why your patch wouldn't introduce that issue with LTO. Usually the things stay to make DECL_UID stable across -g and -g0. This is not the case with LTO, so ATM we do not have same codegen with lto and -g/-g0. I guess it is something we can start shooting for once early debug is in. > > That said - improve remove_unused_locals instead please? Yep, that makes sense (though i am not sure every BLOCK_VARs seen by lto streamer is also seen by this function). Will update the patch and re-measure the effect. Honza
On January 16, 2016 11:27:09 AM GMT+01:00, Jan Hubicka <hubicka@ucw.cz> wrote: >> On Fri, 15 Jan 2016, Jan Hubicka wrote: >> >> > Hi, >> > this is last of my patches to improve partitionability of programs. > >> > This patch compacts BLOCK_VARs lists and trows away TYPE_DECLs for >-g0 >> > and redundant TYPE_DECLs on all levels. It does make noticeable >difference >> > on firefox, but I managed to erase the numbers (can re-test them if >requested) >> > >> > Bootstrapped/regtested x86_64-linux, OK? >> >> Hmm. So I wonder why remove_unused_locals does not do this then >> or if it does not because it would change code-gen based on -g/-g0 >> I wonder why your patch wouldn't introduce that issue with LTO. > >Usually the things stay to make DECL_UID stable across -g and -g0. Only DECL_UID order needs to be stable I think. >This is not >the case with LTO, I don't see how LTO does behave differently here. so ATM we do not have same codegen with lto and >-g/-g0. I >guess it is something we can start shooting for once early debug is in. >> >> That said - improve remove_unused_locals instead please? > >Yep, that makes sense (though i am not sure every BLOCK_VARs seen by >lto >streamer is also seen by this function). Will update the patch and >re-measure >the effect. > >Honza
On Sat, Jan 16, 2016 at 11:31:49AM +0100, Richard Biener wrote: > On January 16, 2016 11:27:09 AM GMT+01:00, Jan Hubicka <hubicka@ucw.cz> wrote: > >> On Fri, 15 Jan 2016, Jan Hubicka wrote: > >> > >> > Hi, > >> > this is last of my patches to improve partitionability of programs. > > > >> > This patch compacts BLOCK_VARs lists and trows away TYPE_DECLs for > >-g0 > >> > and redundant TYPE_DECLs on all levels. It does make noticeable > >difference > >> > on firefox, but I managed to erase the numbers (can re-test them if > >requested) > >> > > >> > Bootstrapped/regtested x86_64-linux, OK? > >> > >> Hmm. So I wonder why remove_unused_locals does not do this then > >> or if it does not because it would change code-gen based on -g/-g0 > >> I wonder why your patch wouldn't introduce that issue with LTO. > > > >Usually the things stay to make DECL_UID stable across -g and -g0. > > Only DECL_UID order needs to be stable I think. There is certainly something we only care about order and not exact numbers, but I'm not sure if it is DECL_UIDs - otherwise we'd happily use allocate_decl_uid () even for DECL_DEBUG_EXPRs, but we instead use a different counter for those. Alex? Jakub
> >Usually the things stay to make DECL_UID stable across -g and -g0. > > Only DECL_UID order needs to be stable I think. Well, when you remove a reference to a declaration, the DECL_UID order will change. We stream declarations in the order they are inserted to the encoder and that depends on specific references from the IL. Creating the decl and not inserting it to il/optimizing it out early is safe. We do so with debug statements, block local vars and other stuff. The difference to non-LTO is that in non-LTO you only need to be sure you create same decls in the same order. With LTO we need to be sure to keep all references streamed in the same order. I suppose we could avoid some of it by streaming DECL_UIDs and merging them same way as we do symbol_table->order but that will still have fun effects wrt merging. > > >This is not > >the case with LTO, > > I don't see how LTO does behave differently here. > > so ATM we do not have same codegen with lto and > >-g/-g0. I > >guess it is something we can start shooting for once early debug is in. > >> > >> That said - improve remove_unused_locals instead please? I looked into remove_unused_locals and in fact it is removing those decls aready to some degree, it just keeps all TYPE_DECLs when debug info is on. I will extend it to skip redundant typedefs. Honza
Index: tree.c =================================================================== --- tree.c (revision 232407) +++ tree.c (working copy) @@ -5343,6 +5343,42 @@ need_assembler_name_p (tree decl) return true; } +/* Return true if DECL should stay in BLOCK_VARs list. */ + +static inline bool +needed_in_block_vars_p (const_tree decl) +{ + if (TREE_CODE (decl) == VAR_DECL && TREE_USED (decl) + && (!TREE_STATIC (decl) || DECL_EXTERNAL (decl))) + return true; + if (TREE_CODE (decl) == RESULT_DECL) + return true; + /* Dwarf2out output non-fortran and non-ada const decls when the + enum is output and it will not consider language to be ada or fortran + in LTO. */ + if (TREE_CODE (decl) == CONST_DECL) + return false; + if (debug_info_level >= DINFO_LEVEL_TERSE) + return false; + /* Strip builtins from the translation-unit BLOCK. We still have targets + without builtin_decl_explicit support and also builtins are shared + nodes and thus we can't use TREE_CHAIN in multiple lists. */ + if (TREE_CODE (decl) == FUNCTION_DECL + && DECL_BUILT_IN (decl)) + return false; + if (DECL_IGNORED_P (decl)) + return false; + if (TREE_CODE (decl) == TYPE_DECL + && is_redundant_typedef (decl)) + return false; + gcc_checking_assert (TREE_CODE (decl) == TYPE_DECL + || TREE_CODE (decl) == VAR_DECL + || TREE_CODE (decl) == FUNCTION_DECL + || TREE_CODE (decl) == LABEL_DECL + || TREE_CODE (decl) == RESULT_DECL); + return true; +} + /* Reset all language specific information still present in symbol DECL. */ @@ -5445,15 +5481,11 @@ free_lang_data_in_decl (tree decl) && DECL_INITIAL (decl) && TREE_CODE (DECL_INITIAL (decl)) == BLOCK) { - /* Strip builtins from the translation-unit BLOCK. We still have targets - without builtin_decl_explicit support and also builtins are shared - nodes and thus we can't use TREE_CHAIN in multiple lists. */ tree *nextp = &BLOCK_VARS (DECL_INITIAL (decl)); while (*nextp) { tree var = *nextp; - if (TREE_CODE (var) == FUNCTION_DECL - && DECL_BUILT_IN (var)) + if (needed_in_block_vars_p (var)) *nextp = TREE_CHAIN (var); else nextp = &TREE_CHAIN (var);