Patchwork [3/3] Implement -flto-slim

Submitter Andi Kleen
Date Oct. 11, 2010, 10:17 a.m.
Message ID <1286792263-9244-3-git-send-email-andi@firstfloor.org>
Permalink /patch/67406/
State New

Comments

Andi Kleen - Oct. 11, 2010, 10:17 a.m.
From: Andi Kleen <ak@linux.intel.com>

This adds a new LTO mode "lto slim" that only puts LTO information
into the object file, no "fallback code". This improves compilation
performance because the code only has to be generated once. It also
allows some future extensions I am planning to do.
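
To make the distinction concrete, here is a small C model of what one
compilation leaves in the object file in each mode. It is only a sketch,
not GCC code: struct lto_flags, object_contents and the OUT_* bits are
hypothetical names made up for illustration, and it assumes that
-flto-slim suppresses code generation regardless of whether LTO sections
are being written.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the compiler flags.  */
struct lto_flags
{
  bool generate_lto;  /* -flto or -fwhopr was given.  */
  bool lto_slim;      /* -flto-slim was given.  */
};

/* Bits describing what ends up in the object file.  */
enum { OUT_LTO_SECTIONS = 1, OUT_MACHINE_CODE = 2 };

/* Return the set of outputs a single compilation produces.  */
static int
object_contents (struct lto_flags f)
{
  int out = 0;

  if (f.generate_lto)
    out |= OUT_LTO_SECTIONS;
  /* The "fallback code" is skipped only in slim mode.  */
  if (!f.lto_slim)
    out |= OUT_MACHINE_CODE;
  return out;
}

/* Spell out the three interesting cases.  */
static void
self_check (void)
{
  struct lto_flags plain = { false, false };
  struct lto_flags fat = { true, false };
  struct lto_flags slim = { true, true };

  assert (object_contents (plain) == OUT_MACHINE_CODE);
  assert (object_contents (fat) == (OUT_LTO_SECTIONS | OUT_MACHINE_CODE));
  assert (object_contents (slim) == OUT_LTO_SECTIONS);
}
```

In the "fat" case machine code is produced here and again at link time;
in the slim case only the link-time run generates code.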

The disadvantage is that a slim build requires more tool chain support
because there is no "safety net" of fallback code in the object file.

For example, ar and ranlib need to use the gcc lto plugin (and currently
need to be binutils mainline) to be able to create the symbol indexes.
Also, libtool needs to be updated. And it will only work with
the linker plugin.

Because of these complications slim LTO is not the default, but a separate
-flto-slim option. The option works with whopr and with LTO.

Spreading the checking of the flag into the various frontends is ugly,
but I didn't find a simple nicer way to do this.

Thanks to Jan Hubicka for some fixes to my initial version.

Requires earlier patches to fix up collect2.

This passes normal bootstrap and testing.

I also did a full lto slim bootstrap on x86_64-linux; this works
with some caveats:

- libtool needs to be updated to 2.4 in tree
  (Ralf W. indicated he would do that later)
- It requires a binutils git snapshot with plugin support for ar
- Suitable ar/ranlib wrappers that specify the linker plugin
  need to be passed in explicitly to the build
  (I hope this can be auto detected later once the wrappers
  are part of gcc)
- The fixincl Makefile needs to be manually fixed to
  use -fwhopr -fuse-linker-plugin in stage2 and 3 to be able
  to handle slim libiberty
  (this seems like a fixincl bug really; it should be able
  to use different CFLAGS for stage 2 and 3).
  An alternative would be to disable LTO for libiberty; this
  might be preferable anyway if libiberty is installed by
  "make install"

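The ar/ranlib wrappers mentioned in the list above could look roughly
like the following. This is a minimal sketch under stated assumptions,
not the wrappers used for the bootstrap: wrap_with_plugin is a
hypothetical helper, and the plugin path is install specific and must be
supplied by the caller. It relies only on the --plugin option of
binutils ar; whether a given ranlib accepts the same option depends on
the binutils version.

```c
#include <assert.h>
#include <stdlib.h>

/* Build an argument vector that runs TOOL (e.g. "ar") with the LTO
   plugin loaded, followed by the wrapper's own arguments.  ARGV[0],
   the wrapper's name, is dropped.  The caller frees the returned
   array, but not the strings it points to.  */
static char **
wrap_with_plugin (const char *tool, const char *plugin_path,
                  int argc, char **argv)
{
  char **out;
  int n = 0;

  assert (argc >= 1);
  /* tool, --plugin, path, argc - 1 user arguments, NULL.  */
  out = malloc ((size_t) (argc + 3) * sizeof *out);
  if (out == NULL)
    return NULL;

  out[n++] = (char *) tool;
  out[n++] = (char *) "--plugin";
  out[n++] = (char *) plugin_path;
  for (int i = 1; i < argc; i++)
    out[n++] = argv[i];
  out[n] = NULL;
  return out;
}
```

A wrapper's main would pass the result to execvp (out[0], out), so that
"wrapper rcs libfoo.a foo.o" becomes
"ar --plugin <path> rcs libfoo.a foo.o".
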
Ok to commit?

gcc/ada/

2010-10-11  Andi Kleen  <ak@linux.intel.com>

	* gcc-interface/utils.c (gnat_write_global_declarations): Bail out early
	when flag_lto_slim is set.

gcc/

2010-10-11  Andi Kleen  <ak@linux.intel.com>
	    Jan Hubicka <jh@suse.cz>

	* cgraphunit.c (ipa_passes): Check for flag_lto_slim.
	(cgraph_optimize): Bail out early if flag_lto_slim is set.
	* common.opt (flto-slim): Add.
	* doc/invoke.texi (-flto-slim): Document.
	* langhooks.c (write_global_declarations): Check for
	flag_lto_slim.
	* toplev.c (compile_file): Check for flag_lto_slim.

gcc/cp/

2010-10-11  Andi Kleen  <ak@linux.intel.com>

	* decl2.c (cp_write_global_declarations): Check for
	flag_lto_slim.

gcc/lto/

2010-10-11  Andi Kleen  <ak@linux.intel.com>

	* lto.c (lto_main): Set flag_lto_slim to zero.
---
 gcc/ada/gcc-interface/utils.c |    3 +++
 gcc/cgraphunit.c              |   13 ++++++++++++-
 gcc/common.opt                |    5 +++++
 gcc/cp/decl2.c                |    8 +++++---
 gcc/doc/invoke.texi           |   10 +++++++++-
 gcc/langhooks.c               |    7 +++++--
 gcc/lto/lto.c                 |    3 +++
 gcc/toplev.c                  |    7 +++++--
 8 files changed, 47 insertions(+), 9 deletions(-)
Richard Guenther - Oct. 11, 2010, 10:38 a.m.
On Mon, Oct 11, 2010 at 12:17 PM, Andi Kleen <andi@firstfloor.org> wrote:
> From: Andi Kleen <ak@linux.intel.com>
>
> This adds a new LTO mode "lto slim" that only puts LTO information
> into the object file, no "fallback code". This improves compilation
> performance because the code has to be only generated once. It also
> allows some future extensions I am planning to do.
>
> The disadvantage is that a slim build requires more tool chain support
> because there is no "safety net" of fallback code in the object file.
>
> For example, ar and ranlib need to use the gcc lto plugin (and currently
> need to be binutils mainline) to be able to create the symbol indexes.
> Also libtool needs to be updated. And it will only work with
> the linker plugin.
>
> Because of these complications slim LTO is not default, but a separate
> -flto-slim option. The option works with whopr and with LTO.
>
> Spreading the checking of the flag into the various frontends is ugly,
> but I didn't find a simple nicer way to do this.
>
> Thanks to Jan Hubicka for some fixes to my initial version.
>
> Requires earlier patches to fix up collect2.
>
> This passes normal bootstrap and testing.
>
> I also did a full lto slim bootstrap on x86_64-linux, this works
> with some caveats:
>
> - libtool needs to be updated to 2.4 in tree
> (Ralf W. indicated he would do that later)
> - It requires a binutils git snapshot with plugin support for ar
> - Suitable ar/ranlib wrappers that specify the linker plugin
> need to be passed in explicitly to the build
> (I hope this can be auto detected later once the wrappers
> will be part of gcc)
> - The fixincl Makefile needs to be manually fixed to
> use -fwhopr -fuse-linker-plugin in stage2 and 3 to be able
> to handle slim libiberty
> (this seems like a fixincl bug really; it should be able
> to use different CFLAGS for stage 2 and 3)
> An alternative would be to disable LTO for libiberty, this
> might be preferable anyways if libiberty is installed by
> "make install"
>
> Ok to commit?

Well.  Instead of putting checks everywhere for flag_lto_slim
we should re-organize how/when the frontends dispatch to the
middle-end (thus, look at the langhooks and see what common
code that write_global_declarations does we can factor out).

For example cgraph_optimize () should be called by common code
and generally be the last thing the frontend does in w_g_d.

Richard.
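
The restructuring suggested here can be sketched in miniature as
follows. This is only an illustration of the idea, not a proposal for
the actual GCC interfaces: struct frontend_hooks,
finish_compilation_unit and the *_stub functions are hypothetical
stand-ins, and the counters exist only so the behaviour can be observed.
The point is that the front end hook does language-specific work only,
and common code decides once whether to go on to code generation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

static bool flag_lto_slim;
static int frontend_calls, stream_calls, codegen_calls;

/* Hypothetical, much reduced language-hook table.  */
struct frontend_hooks
{
  void (*write_globals) (void);  /* Front-end specific wrap-up only.  */
};

/* Stand-ins for writing the LTO sections and for code generation.  */
static void stream_lto_stub (void) { stream_calls++; }
static void generate_code_stub (void) { codegen_calls++; }

/* Common driver: the slim check lives in exactly one place.  */
static void
finish_compilation_unit (const struct frontend_hooks *fe)
{
  assert (fe != NULL && fe->write_globals != NULL);

  fe->write_globals ();
  stream_lto_stub ();
  if (flag_lto_slim)
    return;
  generate_code_stub ();
}

/* Example front-end hook that knows nothing about slim mode.  */
static void c_like_write_globals (void) { frontend_calls++; }
```

With a table such as { c_like_write_globals }, calling
finish_compilation_unit with flag_lto_slim set runs the hook and the
streaming step but never reaches code generation.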

Andi Kleen - Oct. 11, 2010, 10:42 a.m.
> Well.  Instead of putting checks everywhere for flag_lto_slim
> we should re-organize how/when the frontends dispatch to the
> middle-end (thus, look at the langhooks and see what common
> code that write_global_declarations does we can factor out).
> 
> For example cgraph_optimize () should be called by common code
> and generally be the last thing the frontend does in w_g_d.

The problem is you still have to bail out, and sometimes
the frontends do stuff afterwards which should not be skipped
(e.g. generating some warnings or updating repo and plugins).

Maybe some of that code could be moved earlier, but at least
I don't understand it all well enough to say if that is safe to
do or not.

So yes, that could be done, but you would still need to have
a lot of checks in the callers. I suspect it wouldn't reduce
the checking significantly.

-Andi
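
The objection can be illustrated with a toy front end hook that keeps
working after the middle end returns. This is a sketch only:
frontend_write_globals, middle_end_stub and the counters are
hypothetical, loosely following the shape of the hooks touched by the
patch rather than reproducing any of them.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

static bool flag_lto_slim;

/* Counters standing in for observable side effects.  */
static int warnings_issued, debug_info_emitted, code_generated;

/* Stand-in for the middle end: bails out in slim mode.  */
static void
middle_end_stub (void)
{
  if (flag_lto_slim)
    return;
  code_generated++;
}

/* Hypothetical front-end hook with work after the middle end.  */
static void
frontend_write_globals (void (*middle_end) (void))
{
  assert (middle_end != NULL);
  middle_end ();

  /* Late diagnostics must still run in slim mode...  */
  warnings_issued++;

  /* ...but anything that writes assembler output must not, so the
     front end needs its own check even though the middle end already
     has one.  */
  if (!flag_lto_slim)
    debug_info_emitted++;
}
```

Moving the call to the middle end into common code removes one check,
but the check guarding the late output stays in the front end.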

Patch

diff --git a/gcc/ada/gcc-interface/utils.c b/gcc/ada/gcc-interface/utils.c
index 876556c..e339d88 100644
--- a/gcc/ada/gcc-interface/utils.c
+++ b/gcc/ada/gcc-interface/utils.c
@@ -4646,6 +4646,9 @@  gnat_write_global_declarations (void)
      FIXME: shouldn't be the front end's responsibility to call this.  */
   cgraph_finalize_compilation_unit ();
 
+  if (flag_lto_slim)
+    return;
+
   /* Emit debug info for all global declarations.  */
   emit_debug_global_declarations (VEC_address (tree, global_decls),
 				  VEC_length (tree, global_decls));
diff --git a/gcc/cgraphunit.c b/gcc/cgraphunit.c
index e9d1f1d..103a724 100644
--- a/gcc/cgraphunit.c
+++ b/gcc/cgraphunit.c
@@ -1730,7 +1730,7 @@  ipa_passes (void)
   if (flag_generate_lto)
     targetm.asm_out.lto_end ();
 
-  if (!flag_ltrans)
+  if (!flag_ltrans && !flag_lto_slim)
     execute_ipa_pass_list (all_regular_ipa_passes);
   invoke_plugin_callbacks (PLUGIN_ALL_IPA_PASSES_END, NULL);
 
@@ -1775,6 +1775,17 @@  cgraph_optimize (void)
       return;
     }
 
+  if (flag_lto_slim)
+    {
+      if (!quiet_flag)
+	fprintf (stderr, "In slim LTO mode. Stopping output.\n");
+      timevar_pop (TV_CGRAPHOPT);
+      /* Release the trees here?  Right now it doesn't matter because
+	 we exit soon.  */
+      return;
+    }
+
+
   /* This pass remove bodies of extern inline functions we never inlined.
      Do this later so other IPA passes see what is really going on.  */
   cgraph_remove_unreachable_nodes (false, dump_file);
diff --git a/gcc/common.opt b/gcc/common.opt
index b0e40c1..37d2eff 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -1089,6 +1089,11 @@  flto-report
 Common Report Var(flag_lto_report) Init(0) Optimization
 Report various link-time optimization statistics
 
+flto-slim
+Common Report Var(flag_lto_slim) Init(0) Optimization
+Only generate LTO output, no assembler. This flag is ignored for the LTO
+frontend. Will only work with -fuse-linker-plugin.
+
 fmath-errno
 Common Report Var(flag_errno_math) Init(1) Optimization
 Set errno after built-in math functions
diff --git a/gcc/cp/decl2.c b/gcc/cp/decl2.c
index fcc83fb..ebac286 100644
--- a/gcc/cp/decl2.c
+++ b/gcc/cp/decl2.c
@@ -3934,8 +3934,9 @@  cp_write_global_declarations (void)
     {
       check_global_declarations (VEC_address (tree, pending_statics),
 				 VEC_length (tree, pending_statics));
-      emit_debug_global_declarations (VEC_address (tree, pending_statics),
-				      VEC_length (tree, pending_statics));
+      if (!flag_lto_slim)
+	emit_debug_global_declarations (VEC_address (tree, pending_statics),
+					VEC_length (tree, pending_statics));
     }
 
   perform_deferred_noexcept_checks ();
@@ -3943,7 +3944,8 @@  cp_write_global_declarations (void)
   /* Generate hidden aliases for Java.  */
   if (candidates)
     {
-      build_java_method_aliases (candidates);
+      if (!flag_lto_slim)
+	build_java_method_aliases (candidates);
       pointer_set_destroy (candidates);
     }
 
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 776fdd0..33548b1 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -355,7 +355,7 @@  Objective-C and Objective-C++ Dialects}.
 -fivopts -fkeep-inline-functions -fkeep-static-consts @gol
 -floop-block -floop-flatten -floop-interchange -floop-strip-mine @gol
 -floop-parallelize-all -flto -flto-compression-level -flto-partition=@var{alg} @gol
--flto-report -fltrans -fltrans-output-list -fmerge-all-constants @gol
+-flto-report -flto-slim -fltrans -fltrans-output-list -fmerge-all-constants @gol
 -fmerge-constants -fmodulo-sched -fmodulo-sched-allow-regmoves @gol
 -fmove-loop-invariants fmudflap -fmudflapir -fmudflapth -fno-branch-count-reg @gol
 -fno-default-inline @gol
@@ -7655,6 +7655,14 @@  files in LTO mode (via @option{-fwhopr} or @option{-flto}).
 
 Disabled by default.
 
+@item -flto-slim
+Only generate LTO output, no assembler. The result is that the assembler
+code is only generated once for an LTO build. This flag is ignored for the LTO
+frontend. It only works with @option{-fuse-linker-plugin} and
+with the @command{gold} linker.
+
+Disabled by default.
+
 @item -fuse-linker-plugin
 Enables the extraction of objects with GIMPLE bytecode information
 from library archives.  This option relies on features available only
diff --git a/gcc/langhooks.c b/gcc/langhooks.c
index 2217a24..1032ceb 100644
--- a/gcc/langhooks.c
+++ b/gcc/langhooks.c
@@ -323,9 +323,12 @@  write_global_declarations (void)
   for (i = 0, decl = globals; i < len; i++, decl = DECL_CHAIN (decl))
     vec[len - i - 1] = decl;
 
-  wrapup_global_declarations (vec, len);
+  if (!flag_lto_slim)
+    wrapup_global_declarations (vec, len);
   check_global_declarations (vec, len);
-  emit_debug_global_declarations (vec, len);
+
+  if (!flag_lto_slim)
+    emit_debug_global_declarations (vec, len);
 
   /* Clean up.  */
   free (vec);
diff --git a/gcc/lto/lto.c b/gcc/lto/lto.c
index 3baea80..aec50ae 100644
--- a/gcc/lto/lto.c
+++ b/gcc/lto/lto.c
@@ -2424,6 +2424,9 @@  lto_main (int debug_p ATTRIBUTE_UNUSED)
 
   lto_init_reader ();
 
+  /* In LTO mode don't ever be slim.  */
+  flag_lto_slim = 0;
+
   /* Read all the symbols and call graph from all the files in the
      command line.  */
   read_cgraph_and_symbols (num_in_fnames, in_fnames);
diff --git a/gcc/toplev.c b/gcc/toplev.c
index a6c13f1..e9404de 100644
--- a/gcc/toplev.c
+++ b/gcc/toplev.c
@@ -930,8 +930,11 @@  compile_file (void)
   /* This must also call cgraph_finalize_compilation_unit.  */
   lang_hooks.decls.final_write_globals ();
 
-  if (seen_error ())
-    return;
+  if (seen_error () || flag_lto_slim)
+    {
+      invoke_plugin_callbacks (PLUGIN_FINISH_UNIT, NULL);
+      return;
+    }
 
   varpool_assemble_pending_decls ();
   finish_aliases_2 ();