Message ID | 1286792263-9244-3-git-send-email-andi@firstfloor.org |
---|---|
State | New |
On Mon, Oct 11, 2010 at 12:17 PM, Andi Kleen <andi@firstfloor.org> wrote:
> From: Andi Kleen <ak@linux.intel.com>
>
> This adds a new LTO mode "lto slim" that only puts LTO information
> into the object file, no "fallback code". This improves compilation
> performance because the code has to be generated only once. It also
> allows some future extensions I am planning to do.
>
> The disadvantage is that a slim build requires more tool chain support
> because there is no "safety net" of fallback code in the object file.
>
> For example ar and ranlib need to use the gcc lto plugin (and currently
> need to be binutils mainline) to be able to create the symbol indexes.
> Also libtool needs to be updated. And it will only work with
> the linker plugin.
>
> Because of these complications slim LTO is not default, but a separate
> -flto-slim option. The option works with whopr and with LTO.
>
> Spreading the checking of the flag into the various frontends is ugly,
> but I didn't find a simple nicer way to do this.
>
> Thanks to Jan Hubicka for some fixes to my initial version.
>
> Requires earlier patches to fix up collect2.
>
> This passes normal bootstrap and testing.
>
> I also did a full lto slim bootstrap on x86_64-linux, this works
> with some caveats:
>
> - libtool needs to be updated to 2.4 in tree
>   (Ralf W. indicated he would do that later)
> - It requires a binutils git snapshot with plugin support for ar
> - Suitable ar/ranlib wrappers that specify the linker plugin
>   need to be passed in explicitly to the build
>   (I hope this can be auto detected later once the wrappers
>   will be part of gcc)
> - The fixincl Makefile needs to be manually fixed to
>   use -fwhopr -fuse-linker-plugin in stage2 and 3 to be able
>   to handle slim libiberty
>   (this seems like a fixincl bug really; it should be able
>   to use different CFLAGS for stage 2 and 3)
>   An alternative would be to disable LTO for libiberty, this
>   might be preferable anyways if libiberty is installed by
>   "make install"
>
> Ok to commit?

Well. Instead of putting checks everywhere for flag_lto_slim
we should re-organize how/when the frontends dispatch to the
middle-end (thus, look at the langhooks and see what common
code that write_global_declarations does we can factor out).

For example cgraph_optimize () should be called by common code
and generally be the last thing the frontend does in w_g_d.

Richard.

> gcc/ada/
>
> 2010-10-11  Andi Kleen  <ak@linux.intel.com>
>
>        * utils.c (gnat_write_global_declarations): Bail out early
>        when flag_lto_slim is set.
>
> gcc/
>
> 2010-10-11  Andi Kleen  <ak@linux.intel.com>
>            Jan Hubicka  <jh@suse.cz>
>
>        * cgraphunit.c (ipa_passes): Check for flag_lto_slim.
>        (cgraph_optimize): Bail out early if flag_lto_slim is set.
>        * common.opt (flto-slim): Add.
>        * doc/invoke.texi (-flto-slim): Document.
>        * langhooks.c (write_global_declarations): Check for
>        flag_lto_slim.
>        * toplev.c (compile_file): Check for flag_lto_slim.
>
> gcc/cp
>
> 2010-10-11  Andi Kleen  <ak@linux.intel.com>
>
>        * decl2.c (cp_write_global_declarations): Check for
>        flag_lto_slim.
>
> gcc/lto
>
> 2010-10-11  Andi Kleen  <ak@linux.intel.com>
>
>        * lto.c (lto_main): Set flag_lto_slim to zero.
> Well. Instead of putting checks everywhere for flag_lto_slim
> we should re-organize how/when the frontends dispatch to the
> middle-end (thus, look at the langhooks and see what common
> code that write_global_declarations does we can factor out).
>
> For example cgraph_optimize () should be called by common code
> and generally be the last thing the frontend does in w_g_d.

The problem is you still have to bail out and sometimes the frontends
do stuff afterwards which should not be skipped (e.g. generating
some warnings or updating repo and plugins).

Maybe some of that code could be moved earlier, but at least
I don't understand it all well enough to say if that is safe to do or not.

So yes that could be done, but you would still need to have a lot
of checks in the callers. I suspect it wouldn't reduce the checking
significantly.

-Andi
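The disagreement above can be made concrete with a toy model. The sketch below is not GCC code: the step names and both functions are hypothetical stand-ins, loosely modelled on the shape of the toplev.c and front-end hunks in this patch. It only illustrates why a single early return in common code is not enough when the front end still has work that must run in slim mode.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical step identifiers for illustration; none are GCC names.  */
enum step
{
  STEP_FINALIZE_UNIT = 1 << 0,	/* hand the unit to the middle end (LTO IR) */
  STEP_EMIT_DEBUG    = 1 << 1,	/* debug info for globals */
  STEP_LATE_WARNINGS = 1 << 2,	/* front-end checks that must always run */
  STEP_UPDATE_REPO   = 1 << 3,	/* bookkeeping that must always run */
  STEP_CODEGEN       = 1 << 4	/* assembler output */
};

/* Toy front-end hook: the slim check sits in the middle, because some
   work after the middle-end dispatch must not be skipped.  */
unsigned
toy_frontend_write_globals (bool slim)
{
  unsigned done = STEP_FINALIZE_UNIT;

  if (!slim)
    done |= STEP_EMIT_DEBUG;

  done |= STEP_LATE_WARNINGS;
  done |= STEP_UPDATE_REPO;
  return done;
}

/* Toy driver: common code still needs its own bail-out before codegen.  */
unsigned
toy_compile_file (bool slim)
{
  unsigned done = toy_frontend_write_globals (slim);

  /* The front end never generates code itself in this model.  */
  assert (!(done & STEP_CODEGEN));

  if (slim)
    return done;

  return done | STEP_CODEGEN;
}
```

In this model the slim path skips debug emission and code generation but keeps the late warnings and the bookkeeping, so the check appears both in the hook and in the driver. Whether the real front ends could be reordered to avoid that is exactly the open question in the thread.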
diff --git a/gcc/ada/gcc-interface/utils.c b/gcc/ada/gcc-interface/utils.c
index 876556c..e339d88 100644
--- a/gcc/ada/gcc-interface/utils.c
+++ b/gcc/ada/gcc-interface/utils.c
@@ -4646,6 +4646,9 @@ gnat_write_global_declarations (void)
      FIXME: shouldn't be the front end's responsibility to call this.  */
   cgraph_finalize_compilation_unit ();
 
+  if (flag_lto_slim)
+    return;
+
   /* Emit debug info for all global declarations.  */
   emit_debug_global_declarations (VEC_address (tree, global_decls),
 				  VEC_length (tree, global_decls));
diff --git a/gcc/cgraphunit.c b/gcc/cgraphunit.c
index e9d1f1d..103a724 100644
--- a/gcc/cgraphunit.c
+++ b/gcc/cgraphunit.c
@@ -1730,7 +1730,7 @@ ipa_passes (void)
   if (flag_generate_lto)
     targetm.asm_out.lto_end ();
 
-  if (!flag_ltrans)
+  if (!flag_ltrans && !flag_lto_slim)
     execute_ipa_pass_list (all_regular_ipa_passes);
   invoke_plugin_callbacks (PLUGIN_ALL_IPA_PASSES_END, NULL);
 
@@ -1775,6 +1775,17 @@ cgraph_optimize (void)
       return;
     }
 
+  if (flag_lto_slim)
+    {
+      if (!quiet_flag)
+	fprintf (stderr, "In slim LTO mode. Stopping output.\n");
+      timevar_pop (TV_CGRAPHOPT);
+      /* Release the trees here? Right now it doesn't matter because
+	 we exit soon.  */
+      return;
+    }
+
+
   /* This pass remove bodies of extern inline functions we never inlined.
      Do this later so other IPA passes see what is really going on.  */
   cgraph_remove_unreachable_nodes (false, dump_file);
diff --git a/gcc/common.opt b/gcc/common.opt
index b0e40c1..37d2eff 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -1089,6 +1089,11 @@ flto-report
 Common Report Var(flag_lto_report) Init(0) Optimization
 Report various link-time optimization statistics
 
+flto-slim
+Common Report Var(flag_lto_slim) Init(0) Optimization
+Only generate LTO output, no assembler. This flag is ignored for the LTO
+frontend. Will only work with -fuse-linker-plugin.
+
 fmath-errno
 Common Report Var(flag_errno_math) Init(1) Optimization
 Set errno after built-in math functions
diff --git a/gcc/cp/decl2.c b/gcc/cp/decl2.c
index fcc83fb..ebac286 100644
--- a/gcc/cp/decl2.c
+++ b/gcc/cp/decl2.c
@@ -3934,8 +3934,9 @@ cp_write_global_declarations (void)
     {
       check_global_declarations (VEC_address (tree, pending_statics),
 				 VEC_length (tree, pending_statics));
-      emit_debug_global_declarations (VEC_address (tree, pending_statics),
-				      VEC_length (tree, pending_statics));
+      if (!flag_lto_slim)
+	emit_debug_global_declarations (VEC_address (tree, pending_statics),
+					VEC_length (tree, pending_statics));
     }
 
   perform_deferred_noexcept_checks ();
@@ -3943,7 +3944,8 @@ cp_write_global_declarations (void)
   /* Generate hidden aliases for Java.  */
   if (candidates)
     {
-      build_java_method_aliases (candidates);
+      if (!flag_lto_slim)
+	build_java_method_aliases (candidates);
       pointer_set_destroy (candidates);
     }
 
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 776fdd0..33548b1 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -355,7 +355,7 @@ Objective-C and Objective-C++ Dialects}.
 -fivopts -fkeep-inline-functions -fkeep-static-consts @gol
 -floop-block -floop-flatten -floop-interchange -floop-strip-mine @gol
 -floop-parallelize-all -flto -flto-compression-level -flto-partition=@var{alg} @gol
--flto-report -fltrans -fltrans-output-list -fmerge-all-constants @gol
+-flto-report -flto-slim -fltrans -fltrans-output-list -fmerge-all-constants @gol
 -fmerge-constants -fmodulo-sched -fmodulo-sched-allow-regmoves @gol
 -fmove-loop-invariants fmudflap -fmudflapir -fmudflapth -fno-branch-count-reg @gol
 -fno-default-inline @gol
@@ -7655,6 +7655,14 @@ files in LTO mode (via @option{-fwhopr} or @option{-flto}).
 
 Disabled by default.
 
+@item -flto-slim
+Only generate LTO output, no assembler. The result is that the assembler
+code is only generated once for a LTO build. This flag is ignored for the LTO
+frontend. Will only work with the @option{-fuse-linker-plugin} and
+with the @command{gold} linker.
+
+Disabled by default
+
 @item -fuse-linker-plugin
 Enables the extraction of objects with GIMPLE bytecode information
 from library archives. This option relies on features available only
diff --git a/gcc/langhooks.c b/gcc/langhooks.c
index 2217a24..1032ceb 100644
--- a/gcc/langhooks.c
+++ b/gcc/langhooks.c
@@ -323,9 +323,12 @@ write_global_declarations (void)
   for (i = 0, decl = globals; i < len; i++, decl = DECL_CHAIN (decl))
     vec[len - i - 1] = decl;
 
-  wrapup_global_declarations (vec, len);
+  if (!flag_lto_slim)
+    wrapup_global_declarations (vec, len);
   check_global_declarations (vec, len);
-  emit_debug_global_declarations (vec, len);
+
+  if (!flag_lto_slim)
+    emit_debug_global_declarations (vec, len);
 
   /* Clean up.  */
   free (vec);
diff --git a/gcc/lto/lto.c b/gcc/lto/lto.c
index 3baea80..aec50ae 100644
--- a/gcc/lto/lto.c
+++ b/gcc/lto/lto.c
@@ -2424,6 +2424,9 @@ lto_main (int debug_p ATTRIBUTE_UNUSED)
 
   lto_init_reader ();
 
+  /* In LTO mode don't ever be slim */
+  flag_lto_slim = 0;
+
   /* Read all the symbols and call graph from all the files in the
      command line.  */
   read_cgraph_and_symbols (num_in_fnames, in_fnames);
diff --git a/gcc/toplev.c b/gcc/toplev.c
index a6c13f1..e9404de 100644
--- a/gcc/toplev.c
+++ b/gcc/toplev.c
@@ -930,8 +930,11 @@ compile_file (void)
   /* This must also call cgraph_finalize_compilation_unit.  */
   lang_hooks.decls.final_write_globals ();
 
-  if (seen_error ())
-    return;
+  if (seen_error () || flag_lto_slim)
+    {
+      invoke_plugin_callbacks (PLUGIN_FINISH_UNIT, NULL);
+      return;
+    }
 
   varpool_assemble_pending_decls ();
   finish_aliases_2 ();
From: Andi Kleen <ak@linux.intel.com>

This adds a new LTO mode "lto slim" that only puts LTO information
into the object file, no "fallback code". This improves compilation
performance because the code has to be generated only once. It also
allows some future extensions I am planning to do.

The disadvantage is that a slim build requires more tool chain support
because there is no "safety net" of fallback code in the object file.

For example ar and ranlib need to use the gcc lto plugin (and currently
need to be binutils mainline) to be able to create the symbol indexes.
Also libtool needs to be updated. And it will only work with
the linker plugin.

Because of these complications slim LTO is not default, but a separate
-flto-slim option. The option works with whopr and with LTO.

Spreading the checking of the flag into the various frontends is ugly,
but I didn't find a simple nicer way to do this.

Thanks to Jan Hubicka for some fixes to my initial version.

Requires earlier patches to fix up collect2.

This passes normal bootstrap and testing.

I also did a full lto slim bootstrap on x86_64-linux, this works
with some caveats:

- libtool needs to be updated to 2.4 in tree
  (Ralf W. indicated he would do that later)
- It requires a binutils git snapshot with plugin support for ar
- Suitable ar/ranlib wrappers that specify the linker plugin
  need to be passed in explicitly to the build
  (I hope this can be auto detected later once the wrappers
  will be part of gcc)
- The fixincl Makefile needs to be manually fixed to
  use -fwhopr -fuse-linker-plugin in stage2 and 3 to be able
  to handle slim libiberty
  (this seems like a fixincl bug really; it should be able
  to use different CFLAGS for stage 2 and 3)
  An alternative would be to disable LTO for libiberty, this
  might be preferable anyways if libiberty is installed by
  "make install"

Ok to commit?

gcc/ada/

2010-10-11  Andi Kleen  <ak@linux.intel.com>

	* utils.c (gnat_write_global_declarations): Bail out early
	when flag_lto_slim is set.

gcc/

2010-10-11  Andi Kleen  <ak@linux.intel.com>
	    Jan Hubicka  <jh@suse.cz>

	* cgraphunit.c (ipa_passes): Check for flag_lto_slim.
	(cgraph_optimize): Bail out early if flag_lto_slim is set.
	* common.opt (flto-slim): Add.
	* doc/invoke.texi (-flto-slim): Document.
	* langhooks.c (write_global_declarations): Check for
	flag_lto_slim.
	* toplev.c (compile_file): Check for flag_lto_slim.

gcc/cp

2010-10-11  Andi Kleen  <ak@linux.intel.com>

	* decl2.c (cp_write_global_declarations): Check for
	flag_lto_slim.

gcc/lto

2010-10-11  Andi Kleen  <ak@linux.intel.com>

	* lto.c (lto_main): Set flag_lto_slim to zero.
---
 gcc/ada/gcc-interface/utils.c |    3 +++
 gcc/cgraphunit.c              |   13 ++++++++++++-
 gcc/common.opt                |    5 +++++
 gcc/cp/decl2.c                |    8 +++++---
 gcc/doc/invoke.texi           |   10 +++++++++-
 gcc/langhooks.c               |    7 +++++--
 gcc/lto/lto.c                 |    3 +++
 gcc/toplev.c                  |    7 +++++--
 8 files changed, 47 insertions(+), 9 deletions(-)
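The tool chain requirement in the description follows from what ends up in the object file. As a rough illustration only, the sketch below models an object as two booleans and asks whether an archiver-like tool can build a symbol index for it. The struct, both function names, and the assumption that LTO is already enabled are all hypothetical; real objects and the real plugin interface are considerably more involved.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of an object file produced by an LTO compile.  */
struct toy_object
{
  bool has_lto_ir;		/* streamed LTO information */
  bool has_machine_code;	/* the "fallback code" */
};

/* Assumes LTO is enabled; SLIM models the proposed -flto-slim option.  */
struct toy_object
toy_compile_with_lto (bool slim)
{
  struct toy_object o;

  o.has_lto_ir = true;
  o.has_machine_code = !slim;	/* slim objects carry no fallback code */
  return o;
}

/* A tool such as ar or ranlib can read symbols from machine code on its
   own, but needs the LTO plugin to read them from the LTO information.  */
bool
toy_tool_can_index (struct toy_object o, bool tool_has_lto_plugin)
{
  /* An object with neither kind of content is outside this model.  */
  assert (o.has_lto_ir || o.has_machine_code);

  return o.has_machine_code || (o.has_lto_ir && tool_has_lto_plugin);
}
```

Under this model a regular LTO object can be indexed by any tool, while a slim object can only be indexed by a tool that loads the plugin, which matches the caveats listed for the slim bootstrap.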