Patchwork Make timevar phases mutually exclusive. (issue6302064)

login
register
mail settings
Submitter Lawrence Crowl
Date June 11, 2012, 6:45 p.m.
Message ID <20120611184558.0A4FB2225FB@jade.mtv.corp.google.com>
Download mbox | patch
Permalink /patch/164238/
State New
Headers show

Comments

Lawrence Crowl - June 11, 2012, 6:45 p.m.
The intent of the phases was to have a high-level but mutually exclusive
accounting of compile time.  We want to track compile time in a way that
tells us which conceptual phases are taking the most time.  That intent
is not currently satisfied.  This patch restores that intent.

Add code to verify that the sum of the phase times is less than the total
time, to detect when phases are overlapped.  A slight amount of leeway
is required due to time jitters.  This verification is done as the last
step in printing timevars so that any timevar information is not lost.

Rename the phases to be clearer about what they measure, so that they are
less likely to be modified to be overlapping.  The primary example is
to change TV_PHASE_GENERATE to TV_PHASE_LATE_ASM, meaning late work on
the assembly.  This name change avoids confusion n moving the timevar
start call after the call to lang_hooks.decls.final_write_globals,
which prevents overlapping phases.

Each implementation of lang_hooks.decls.final_write_globals, is
responsible for starting and stopping its own phases.  Each implementation
currently has a first phase of TV_PHASE_DEFERRED for front-end work
deferred until after parsing is complete.  The second phase has been
renamed from TV_PHASE_CGRAPH to TV_PHASE_OPT_GEN, to better reflect its
use as the main optimization and generation phase.  This phase accounts
for 70%-80% of compilation time.  The third phase is TV_PHASE_DBGINFO,
except in cp/decl2.c, where it is TV_PHASE_CHECK_DBGINFO because
cc1plus mixes checking in with debug info generation.  In langhooks.c,
write_global_declarations was using TV_PHASE_CHECK_DBGINFO, but it was
doing no checking.  So, it now uses TV_PHASE_DBGINFO.

The changes to LTO are significant.  First, initialization now uses
TV_PHASE_SETUP.  Reading files now uses TV_PHASE_STREAM_IN.  Writing files
now uses TV_PHASE_STREAM_OUT.  The remaining phase is TV_PHASE_OPT_GEN
(formerly TV_PHASE_CGRAPH).

Tested on x86_64.

Okay for trunk?



--
This patch is available for review at http://codereview.appspot.com/6302064
Diego Novillo - June 13, 2012, 12:46 p.m.
On 12-06-11 14:45 , Lawrence Crowl wrote:
> The intent of the phases was to have a high-level but mutually exclusive
> accounting of compile time.  We want to track compile time in a way that
> tells us which conceptual phases are taking the most time.  That intent
> is not currently satisfied.  This patch restores that intent.
>
> Add code to verify that the sum of the phase times is less than the total
> time, to detect when phases are overlapped.  A slight amount of leeway
> is required due to time jitters.  This verification is done as the last
> step in printing timevars so that any timevar information is not lost.
>
> Rename the phases to be clearer about what they measure, so that they are
> less likely to be modified to be overlapping.  The primary example is
> to change TV_PHASE_GENERATE to TV_PHASE_LATE_ASM, meaning late work on
> the assembly.  This name change avoids confusion n moving the timevar
> start call after the call to lang_hooks.decls.final_write_globals,
> which prevents overlapping phases.
>
> Each implementation of lang_hooks.decls.final_write_globals, is
> responsible for starting and stopping its own phases.  Each implementation
> currently has a first phase of TV_PHASE_DEFERRED for front-end work
> deferred until after parsing is complete.  The second phase has been
> renamed from TV_PHASE_CGRAPH to TV_PHASE_OPT_GEN, to better reflect its
> use as the main optimization and generation phase.  This phase accounts
> for 70%-80% of compilation time.  The third phase is TV_PHASE_DBGINFO,
> except in cp/decl2.c, where it is TV_PHASE_CHECK_DBGINFO because
> cc1plus mixes checking in with debug info generation.  In langhooks.c,
> write_global_declarations was using TV_PHASE_CHECK_DBGINFO, but it was
> doing no checking.  So, it now uses TV_PHASE_DBGINFO.
>
> The changes to LTO are significant.  First, initialization now uses
> TV_PHASE_SETUP.  Reading files now uses TV_PHASE_STREAM_IN.  Writing files
> now uses TV_PHASE_STREAM_OUT.  The remaining phase is TV_PHASE_OPT_GEN
> (formerly TV_PHASE_CGRAPH).
>
> Tested on x86_64.
>
> Okay for trunk?
>
>
> Index: gcc/ChangeLog
>
> 2012-06-11  Lawrence Crowl<crowl@google.com>
>
> 	* timevar.def (TV_PHASE_GENERATE): Rename to TV_PHASE_LATE_ASM.
> 	(TV_PHASE_CGRAPH): Rename to TV_PHASE_OPT_GEN.
> 	(TV_PHASE_STREAM_IN): New.
> 	(TV_PHASE_STREAM_OUT): New.
> 	* timevar.c (validate_phases): New.
> 	(timevar_print): Call validate_phases.
> 	* c-decl.c (c_write_global_declarations): Rename use of TV_PHASE_CGRAPH
> 	to TV_PHASE_OPT_GEN.
> 	* langhooks.c (write_global_declarations): Rename use of
> 	TV_PHASE_CGRAPH to TV_PHASE_OPT_GEN.  Use TV_PHASE_DBGINFO instead of
> 	TV_PHASE_CHECK_DBGINFO.
> 	* toplev.c (compile_file): Rename use of TV_PHASE_GENERATE to
> 	TV_PHASE_LATE_ASM.  Move start of TV_PHASE_LATE_ASM to after call to
> 	lang_hooks.decls.final_write_globals.
>
> Index: gcc/cp/ChangeLog
>
> 2012-06-11  Lawrence Crowl<crowl@google.com>
>
> 	* decl2.c (cp_write_global_declarations): Rename use of TV_PHASE_CGRAPH
> 	to TV_PHASE_OPT_GEN.
>
> Index: gcc/lto/ChangeLog
>
> 2012-06-11  Lawrence Crowl<crowl@google.com>
>
>          * lto.c (do_whole_program_analysis):  Rename use of TV_PHASE_CGRAPH to
> 	TV_PHASE_OPT_GEN.  Use new timevar TV_PHASE_STREAM_OUT around the call
> 	to lto_wpa_write_files.
> 	(lto_main):  Rename use of TV_PHASE_CGRAPH to TV_PHASE_OPT_GEN.  Move
> 	start of TV_PHASE_OPT_GEN to include call to materialize_cgraph.  Use
> 	TV_PHASE_SETUP for the call to lto_init.  Use new timevar
> 	TV_PHASE_STREAM_IN around the call to read_cgraph_and_symbols.

The LTO bits are fine.  I would prefer if an FE maintainer takes a 
second look over the other bits.  Jason, Joseph?


Thanks.  Diego.
Diego Novillo - June 13, 2012, 1:58 p.m.
On 12-06-13 08:46 , Diego Novillo wrote:

> The LTO bits are fine. I would prefer if an FE maintainer takes a second
> look over the other bits. Jason, Joseph?

Incidentally, could you please test it with an LTO-enabled bootstrap?

$ ../src/configure --with-build-config=bootstrap-lto 
--enable-languages=c++,fortran
$ make profiledbootstrap

If you cut and paste one of the -flto compiles you see during bootstrap, 
you can add -ftime-report to it to make sure that the LTO timers are 
properly setup.


Diego.
Joseph S. Myers - June 13, 2012, 5:56 p.m.
On Wed, 13 Jun 2012, Diego Novillo wrote:

> The LTO bits are fine.  I would prefer if an FE maintainer takes a second look
> over the other bits.  Jason, Joseph?

The c-decl.c changes are fine with me.
Jason Merrill - June 14, 2012, 5:59 a.m.
OK.

Jason
Lawrence Crowl - June 19, 2012, 1:33 a.m.
On 6/13/12, Diego Novillo <dnovillo@google.com> wrote:
> On 12-06-13 08:46 , Diego Novillo wrote:
> > The LTO bits are fine. I would prefer if an FE maintainer takes
> > a second look over the other bits. Jason, Joseph?
>
> Incidentally, could you please test it with an LTO-enabled bootstrap?
>
> $ ../src/configure --with-build-config=bootstrap-lto
> --enable-languages=c++,fortran
> $ make profiledbootstrap
>
> If you cut and paste one of the -flto compiles you see during
> bootstrap, you can add -ftime-report to it to make sure that the
> LTO timers are properly setup.

I had to make a slight adjustment here.  LTO runs as a front-end,
and the parse timers are automatically started for the front ends.
But LTO is not parsing.  Rather than change every front end to
handle those timers, I turned the timers off and then back on again
within LTO.

Committed.

Patch

Index: gcc/ChangeLog

2012-06-11  Lawrence Crowl  <crowl@google.com>

	* timevar.def (TV_PHASE_GENERATE): Rename to TV_PHASE_LATE_ASM.
	(TV_PHASE_CGRAPH): Rename to TV_PHASE_OPT_GEN.
	(TV_PHASE_STREAM_IN): New.
	(TV_PHASE_STREAM_OUT): New.
	* timevar.c (validate_phases): New.
	(timevar_print): Call validate_phases.
	* c-decl.c (c_write_global_declarations): Rename use of TV_PHASE_CGRAPH
	to TV_PHASE_OPT_GEN.
	* langhooks.c (write_global_declarations): Rename use of
	TV_PHASE_CGRAPH to TV_PHASE_OPT_GEN.  Use TV_PHASE_DBGINFO instead of
	TV_PHASE_CHECK_DBGINFO.
	* toplev.c (compile_file): Rename use of TV_PHASE_GENERATE to
	TV_PHASE_LATE_ASM.  Move start of TV_PHASE_LATE_ASM to after call to
	lang_hooks.decls.final_write_globals.

Index: gcc/cp/ChangeLog

2012-06-11  Lawrence Crowl  <crowl@google.com>

	* decl2.c (cp_write_global_declarations): Rename use of TV_PHASE_CGRAPH
	to TV_PHASE_OPT_GEN.

Index: gcc/lto/ChangeLog

2012-06-11  Lawrence Crowl  <crowl@google.com>

        * lto.c (do_whole_program_analysis):  Rename use of TV_PHASE_CGRAPH to
	TV_PHASE_OPT_GEN.  Use new timevar TV_PHASE_STREAM_OUT around the call
	to lto_wpa_write_files.
	(lto_main):  Rename use of TV_PHASE_CGRAPH to TV_PHASE_OPT_GEN.  Move
	start of TV_PHASE_OPT_GEN to include call to materialize_cgraph.  Use
	TV_PHASE_SETUP for the call to lto_init.  Use new timevar
	TV_PHASE_STREAM_IN around the call to read_cgraph_and_symbols.

Index: gcc/toplev.c
===================================================================
--- gcc/toplev.c	(revision 188335)
+++ gcc/toplev.c	(working copy)
@@ -558,18 +558,15 @@  compile_file (void)
   if (flag_syntax_only || flag_wpa)
     return;
 
-  timevar_start (TV_PHASE_GENERATE);
-
   ggc_protect_identifiers = false;
 
   /* This must also call finalize_compilation_unit.  */
   lang_hooks.decls.final_write_globals ();
 
   if (seen_error ())
-    {
-      timevar_stop (TV_PHASE_GENERATE);
-      return;
-    }
+    return;
+
+  timevar_start (TV_PHASE_LATE_ASM);
 
   /* Compilation unit is finalized.  When producing non-fat LTO object, we are
      basically finished.  */
@@ -670,7 +667,7 @@  compile_file (void)
      assembly file after this point.  */
   targetm.asm_out.file_end ();
 
-  timevar_stop (TV_PHASE_GENERATE);
+  timevar_stop (TV_PHASE_LATE_ASM);
 }
 
 /* Print version information to FILE.
Index: gcc/cp/decl2.c
===================================================================
--- gcc/cp/decl2.c	(revision 188335)
+++ gcc/cp/decl2.c	(working copy)
@@ -4017,11 +4017,11 @@  cp_write_global_declarations (void)
   candidates = collect_candidates_for_java_method_aliases ();
 
   timevar_stop (TV_PHASE_DEFERRED);
-  timevar_start (TV_PHASE_CGRAPH);
+  timevar_start (TV_PHASE_OPT_GEN);
 
   finalize_compilation_unit ();
 
-  timevar_stop (TV_PHASE_CGRAPH);
+  timevar_stop (TV_PHASE_OPT_GEN);
   timevar_start (TV_PHASE_CHECK_DBGINFO);
 
   /* Now, issue warnings about static, but not defined, functions,
Index: gcc/timevar.def
===================================================================
--- gcc/timevar.def	(revision 188335)
+++ gcc/timevar.def	(working copy)
@@ -33,13 +33,17 @@ 
 
 /* The total execution time.  */
 DEFTIMEVAR (TV_TOTAL                 , "total time")
+/* The compiler phases.  These must be mutually exclusive.
+   Ideally, they would sum to near the total time.  */
 DEFTIMEVAR (TV_PHASE_SETUP           , "phase setup")
 DEFTIMEVAR (TV_PHASE_PARSING         , "phase parsing")
 DEFTIMEVAR (TV_PHASE_DEFERRED        , "phase lang. deferred")
-DEFTIMEVAR (TV_PHASE_CGRAPH          , "phase cgraph")
+DEFTIMEVAR (TV_PHASE_OPT_GEN         , "phase opt and generate")
 DEFTIMEVAR (TV_PHASE_DBGINFO         , "phase debug info")
 DEFTIMEVAR (TV_PHASE_CHECK_DBGINFO   , "phase check & debug info")
-DEFTIMEVAR (TV_PHASE_GENERATE        , "phase generate")
+DEFTIMEVAR (TV_PHASE_LATE_ASM        , "phase last asm")
+DEFTIMEVAR (TV_PHASE_STREAM_IN       , "phase stream in")
+DEFTIMEVAR (TV_PHASE_STREAM_OUT      , "phase stream out")
 DEFTIMEVAR (TV_PHASE_FINALIZE        , "phase finalize")
 
 /* Concurrent timers, indicated by "|".  */
Index: gcc/timevar.c
===================================================================
--- gcc/timevar.c	(revision 188335)
+++ gcc/timevar.c	(working copy)
@@ -421,6 +421,57 @@  timevar_cond_stop (timevar_id_t timevar,
 }
 
 
+/* Validate that phase times are consistent.  */
+
+static void
+validate_phases (FILE *fp)
+{
+  unsigned int /* timevar_id_t */ id;
+  struct timevar_time_def *total = &timevars[TV_TOTAL].elapsed;
+  double phase_user = 0.0;
+  double phase_sys = 0.0;
+  double phase_wall = 0.0;
+  unsigned phase_ggc_mem = 0;
+  static char phase_prefix[] = "phase ";
+  const double tolerance = 1.000001;  /* One part in a million.  */
+
+  for (id = 0; id < (unsigned int) TIMEVAR_LAST; ++id)
+    {
+      struct timevar_def *tv = &timevars[(timevar_id_t) id];
+
+      /* Don't evaluate timing variables that were never used.  */
+      if (!tv->used)
+	continue;
+
+      if (strncmp (tv->name, phase_prefix, sizeof phase_prefix - 1) == 0)
+	{
+	  phase_user += tv->elapsed.user;
+	  phase_sys += tv->elapsed.sys;
+	  phase_wall += tv->elapsed.wall;
+	  phase_ggc_mem += tv->elapsed.ggc_mem;
+	}
+    }
+
+  if (phase_user > total->user * tolerance
+      || phase_sys > total->sys * tolerance
+      || phase_wall > total->wall * tolerance
+      || phase_ggc_mem > total->ggc_mem * tolerance)
+    {
+
+      fprintf (fp, "Timing error: total of phase timers exceeds total time.\n");
+      if (phase_user > total->user)
+	fprintf (fp, "user    %24.18e > %24.18e\n", phase_user, total->user);
+      if (phase_sys > total->sys)
+	fprintf (fp, "sys     %24.18e > %24.18e\n", phase_sys, total->sys);
+      if (phase_wall > total->wall)
+	fprintf (fp, "wall    %24.18e > %24.18e\n", phase_wall, total->wall);
+      if (phase_ggc_mem > total->ggc_mem)
+	fprintf (fp, "ggc_mem %24u > %24u\n", phase_ggc_mem, total->ggc_mem);
+      gcc_unreachable ();
+    }
+}
+
+
 /* Summarize timing variables to FP.  The timing variable TV_TOTAL has
    a special meaning -- it's considered to be the total elapsed time,
    for normalizing the others, and is displayed last.  */
@@ -535,6 +586,8 @@  timevar_print (FILE *fp)
 
 #endif /* defined (HAVE_USER_TIME) || defined (HAVE_SYS_TIME)
 	  || defined (HAVE_WALL_TIME) */
+
+  validate_phases (fp);
 }
 
 /* Prints a message to stderr stating that time elapsed in STR is
Index: gcc/c-decl.c
===================================================================
--- gcc/c-decl.c	(revision 188335)
+++ gcc/c-decl.c	(working copy)
@@ -10098,13 +10098,13 @@  c_write_global_declarations (void)
   c_write_global_declarations_1 (BLOCK_VARS (ext_block));
 
   timevar_stop (TV_PHASE_DEFERRED);
-  timevar_start (TV_PHASE_CGRAPH);
+  timevar_start (TV_PHASE_OPT_GEN);
 
   /* We're done parsing; proceed to optimize and emit assembly.
      FIXME: shouldn't be the front end's responsibility to call this.  */
   finalize_compilation_unit ();
 
-  timevar_stop (TV_PHASE_CGRAPH);
+  timevar_stop (TV_PHASE_OPT_GEN);
   timevar_start (TV_PHASE_DBGINFO);
 
   /* After cgraph has had a chance to emit everything that's going to
Index: gcc/langhooks.c
===================================================================
--- gcc/langhooks.c	(revision 188335)
+++ gcc/langhooks.c	(working copy)
@@ -317,15 +317,15 @@  write_global_declarations (void)
   check_global_declarations (vec, len);
   timevar_stop (TV_PHASE_DEFERRED);
 
-  timevar_start (TV_PHASE_CGRAPH);
+  timevar_start (TV_PHASE_OPT_GEN);
   /* This lang hook is dual-purposed, and also finalizes the
      compilation unit.  */
   finalize_compilation_unit ();
-  timevar_stop (TV_PHASE_CGRAPH);
+  timevar_stop (TV_PHASE_OPT_GEN);
 
-  timevar_start (TV_PHASE_CHECK_DBGINFO);
+  timevar_start (TV_PHASE_DBGINFO);
   emit_debug_global_declarations (vec, len);
-  timevar_stop (TV_PHASE_CHECK_DBGINFO);
+  timevar_stop (TV_PHASE_DBGINFO);
 
   /* Clean up.  */
   free (vec);
Index: gcc/lto/lto.c
===================================================================
--- gcc/lto/lto.c	(revision 188335)
+++ gcc/lto/lto.c	(working copy)
@@ -1935,7 +1935,8 @@  materialize_cgraph (void)
 static void
 do_whole_program_analysis (void)
 {
-  timevar_start (TV_PHASE_CGRAPH);
+  timevar_start (TV_PHASE_OPT_GEN);
+
   /* Note that since we are in WPA mode, materialize_cgraph will not
      actually read in all the function bodies.  It only materializes
      the decls and cgraph nodes so that analysis can be performed.  */
@@ -1979,23 +1980,27 @@  do_whole_program_analysis (void)
   else
     lto_balanced_map ();
 
+  timevar_stop (TV_PHASE_OPT_GEN);
+  timevar_start (TV_PHASE_STREAM_OUT);
+
   if (!quiet_flag)
     {
       fprintf (stderr, "\nStreaming out");
       fflush (stderr);
     }
   lto_wpa_write_files ();
-  ggc_collect ();
   if (!quiet_flag)
     fprintf (stderr, "\n");
 
+  timevar_stop (TV_PHASE_STREAM_OUT);
+
+  ggc_collect ();
   if (post_ipa_mem_report)
     {
       fprintf (stderr, "Memory consumption after IPA\n");
       dump_memory_report (false);
     }
 
-  timevar_stop (TV_PHASE_CGRAPH);
   /* Show the LTO report before launching LTRANS.  */
   if (flag_lto_report)
     print_lto_report ();
@@ -2075,13 +2080,20 @@  lto_init (void)
 void
 lto_main (void)
 {
+  timevar_start (TV_PHASE_SETUP);
+
   /* Initialize the LTO front end.  */
   lto_init ();
 
+  timevar_stop (TV_PHASE_SETUP);
+  timevar_start (TV_PHASE_STREAM_IN);
+
   /* Read all the symbols and call graph from all the files in the
      command line.  */
   read_cgraph_and_symbols (num_in_fnames, in_fnames);
 
+  timevar_stop (TV_PHASE_STREAM_IN);
+
   if (!seen_error ())
     {
       /* If WPA is enabled analyze the whole call graph and create an
@@ -2091,13 +2103,15 @@  lto_main (void)
 	do_whole_program_analysis ();
       else
 	{
+	  timevar_start (TV_PHASE_OPT_GEN);
+
 	  materialize_cgraph ();
 
 	  /* Let the middle end know that we have read and merged all of
 	     the input files.  */ 
-	  timevar_start (TV_PHASE_CGRAPH);
 	  compile ();
-	  timevar_stop (TV_PHASE_CGRAPH);
+
+	  timevar_stop (TV_PHASE_OPT_GEN);
 
 	  /* FIXME lto, if the processes spawned by WPA fail, we miss
 	     the chance to print WPA's report, so WPA will call