Patchwork Remove global state from gcc/tracer.c

login
register
mail settings
Submitter David Malcolm
Date May 23, 2013, 12:45 a.m.
Message ID <1369269945.26167.50.camel@surprise>
Download mbox | patch
Permalink /patch/245792/
State New
Headers show

Comments

David Malcolm - May 23, 2013, 12:45 a.m.
I'm attempting to eliminate global state from the insides of gcc.

gcc/tracer.c has various global variables, which are only used during
the lifetime of the execute callback of that pass, and cleaned up at the
end of each invocation of the pass.

The attached patch introduces a class to hold the state of the pass
("tracer_state"), eliminating these globals.  An instance of the state
is created on the stack, and all of the various "static" functions in
tracer.c that used the globals become member functions of the state.
Hence the state is passed around by the implicit "this" of the
tracer_state, avoiding the need to patch each individual use of a field
within the state, minimizing the diff.

Bootstrapped and tested on x86_64-unknown-linux-gnu against r199189, and
has the same test results as an unpatched bootstrap of that revision.

OK to commit to trunk?

2013-05-21  David Malcolm  <dmalcolm@redhat.com>

	* tracer.c (tracer_state): New class
	(probability_cutoff): Convert global variable to be member data
	within tracer_state
	(branch_ratio_cutoff): ditto
	(bb_seen): ditto
	(mark_bb_seen): convert to...
	(tracer_state::mark_bb_seen): ...member function of new
	tracer_state class
	(find_best_predecessor): likewise, convert to...
	(tracer_state::find_best_predecessor): ...this
	(find_trace): likewise, convert to...
	(tracer_state::find_trace): ...this
	(tail_duplicate): likewise, convert to...
	(tracer_state::tail_duplicate): ...this
	(tracer): introduce tracer_state instance to move state within
	the pass from being global to being on the stack
Jakub Jelinek - May 23, 2013, 5:14 a.m.
On Wed, May 22, 2013 at 08:45:45PM -0400, David Malcolm wrote:
> I'm attempting to eliminate global state from the insides of gcc.
> 
> gcc/tracer.c has various global variables, which are only used during
> the lifetime of the execute callback of that pass, and cleaned up at the
> end of each invocation of the pass.
> 
> The attached patch introduces a class to hold the state of the pass
> ("tracer_state"), eliminating these globals.  An instance of the state
> is created on the stack, and all of the various "static" functions in
> tracer.c that used the globals become member functions of the state.
> Hence the state is passed around by the implicit "this" of the
> tracer_state, avoiding the need to patch each individual use of a field
> within the state, minimizing the diff.

But do we want to handle the global state this way?  This adds overhead
to (almost?) every single function (now method) in the file (because it gets
an extra argument).  While that might be fine for rarely executed functions,
if it involves also hot functions called many times, where especially on
register starved hosts it could increase register pressure, plus the
overhead of passing the this argument everywhere, this could start to be
noticeable.  Sure, if you plan to do that just in one pass (but, why then?),
it might be tiny slowdown, but after you convert the hundreds of passes in
gcc that contain global state it might become significant.

There are alternative approaches that should be considered.
E.g. global state of a pass can be moved into a per-pass structure,
and have some way how to aggregate those per pass structures together from
all the passes in the whole compiler (that can be either manual process,
say each pass providing its own *-passstate.h and one big header including
all that together), or automatic ones (say gengstate or a new tool could
create those for us from special markings in the source, say new option on
GTY or something) and have some magic macro how to access the global state
within the pass (thispass->fieldname ?).  Then e.g. depending on how the
compiler would be configured and built, thispass could be just address of a
pass struct var (i.e. essentially keep the global state as is, for
performance reasons), or when trying to build compiler as a library (with
-fpic overhead we probably don't want for cc1/cc1plus - we can build all the
*.o files twice, like libtool does) thispass could expand to __thread
pointer var dereference plus a field inside of the global compiler state
structure it points to for the current pass.  Thus, the library version
of the compiler would be somewhat slower (both -fpic overhead and TLS
overhead), and would need either a few of the entrypoints tweaked to adjust
the TLS pointer to the global state, or we could require users to just call
a special function to make the global state current in the current thread
before calling compiler internals.

	Jakub

Patch

commit a5aea6ce6802fa3d74a8a801cd7809869a63a96a
Author: David Malcolm <dmalcolm@redhat.com>
Date:   Wed May 22 06:20:30 2013 -0400

    Remove global state from gcc/tracer.c
    
    gcc/
    	* tracer.c (tracer_state): New class
    	(probability_cutoff): Convert global variable to be member data
    	within tracer_state
    	(branch_ratio_cutoff): ditto
    	(bb_seen): ditto
    	(mark_bb_seen): convert to...
    	(tracer_state::mark_bb_seen): ...member function of new tracer_state
    	class
    	(find_best_predecessor): likewise, convert to...
    	(tracer_state::find_best_predecessor): ...this
    	(find_trace): likewise, convert to...
    	(tracer_state::find_trace): ...this
    	(tail_duplicate): likewise, convert to...
    	(tracer_state::tail_duplicate): ...this
    	(tracer): introduce tracer_state instance to move state within the
    	pass from being global to being on the stack

diff --git a/gcc/tracer.c b/gcc/tracer.c
index 975cadb..977adcb 100644
--- a/gcc/tracer.c
+++ b/gcc/tracer.c
@@ -53,20 +53,43 @@ 
 static int count_insns (basic_block);
 static bool ignore_bb_p (const_basic_block);
 static bool better_p (const_edge, const_edge);
-static edge find_best_successor (basic_block);
-static edge find_best_predecessor (basic_block);
-static int find_trace (basic_block, basic_block *);
 
-/* Minimal outgoing edge probability considered for superblock formation.  */
-static int probability_cutoff;
-static int branch_ratio_cutoff;
+class tracer_state
+{
+public:
+  tracer_state();
+
+  bool tail_duplicate ();
+
+private:
+
+  edge find_best_successor (basic_block);
+  edge find_best_predecessor (basic_block);
+  int find_trace (basic_block, basic_block *);
+  void mark_bb_seen (basic_block bb);
+  bool bb_seen_p (basic_block bb);
+
+private:
+
+  /* Minimal outgoing edge probability considered for superblock formation.  */
+  int probability_cutoff;
+  int branch_ratio_cutoff;
 
-/* A bit BB->index is set if BB has already been seen, i.e. it is
-   connected to some trace already.  */
-sbitmap bb_seen;
+  /* A bit BB->index is set if BB has already been seen, i.e. it is
+     connected to some trace already.  */
+  sbitmap bb_seen;
 
-static inline void
-mark_bb_seen (basic_block bb)
+}; // tracer_state
+
+tracer_state::tracer_state()
+  : probability_cutoff(0),
+    branch_ratio_cutoff(0),
+    bb_seen()
+{
+}
+
+inline void
+tracer_state::mark_bb_seen (basic_block bb)
 {
   unsigned int size = SBITMAP_SIZE (bb_seen);
 
@@ -76,8 +99,8 @@  mark_bb_seen (basic_block bb)
   bitmap_set_bit (bb_seen, bb->index);
 }
 
-static inline bool
-bb_seen_p (basic_block bb)
+inline bool
+tracer_state::bb_seen_p (basic_block bb)
 {
   return bitmap_bit_p (bb_seen, bb->index);
 }
@@ -138,8 +161,8 @@  better_p (const_edge e1, const_edge e2)
 
 /* Return most frequent successor of basic block BB.  */
 
-static edge
-find_best_successor (basic_block bb)
+edge
+tracer_state::find_best_successor (basic_block bb)
 {
   edge e;
   edge best = NULL;
@@ -157,8 +180,8 @@  find_best_successor (basic_block bb)
 
 /* Return most frequent predecessor of basic block BB.  */
 
-static edge
-find_best_predecessor (basic_block bb)
+edge
+tracer_state::find_best_predecessor (basic_block bb)
 {
   edge e;
   edge best = NULL;
@@ -178,8 +201,8 @@  find_best_predecessor (basic_block bb)
 /* Find the trace using bb and record it in the TRACE array.
    Return number of basic blocks recorded.  */
 
-static int
-find_trace (basic_block bb, basic_block *trace)
+int
+tracer_state::find_trace (basic_block bb, basic_block *trace)
 {
   int i = 0;
   edge e;
@@ -220,8 +243,8 @@  find_trace (basic_block bb, basic_block *trace)
 /* Look for basic blocks in frequency order, construct traces and tail duplicate
    if profitable.  */
 
-static bool
-tail_duplicate (void)
+bool
+tracer_state::tail_duplicate ()
 {
   fibnode_t *blocks = XCNEWVEC (fibnode_t, last_basic_block);
   basic_block *trace = XNEWVEC (basic_block, n_basic_blocks);
@@ -376,7 +399,8 @@  tracer (void)
     brief_dump_cfg (dump_file, dump_flags);
 
   /* Trace formation is done on the fly inside tail_duplicate */
-  changed = tail_duplicate ();
+  tracer_state state;
+  changed = state.tail_duplicate ();
   if (changed)
     {
       free_dominance_info (CDI_DOMINATORS);