Patchwork Remove global state from gcc/tracer.c

login
register
mail settings
Submitter David Malcolm
Date July 1, 2013, 7:59 p.m.
Message ID <1372708759.1789.177.camel@surprise>
Download mbox | patch
Permalink /patch/256214/
State New
Headers show

Comments

David Malcolm - July 1, 2013, 7:59 p.m.
(restarting this thread:
http://gcc.gnu.org/ml/gcc-patches/2013-05/msg01318.html )

On Thu, 2013-05-23 at 13:51 -0700, Richard Henderson wrote:
> On 05/23/2013 12:06 PM, Richard Henderson wrote:
> > Another thing I should mention while you're doing all of these static function
> > to class member conversions is that as written we're losing target-specific
> > optimizations that can be done on purely local functions.  This is trivially
> > fixed by placing these new classes in an anonymous namespace.
> 
> At which point it occurs to me that we don't need to distinguish between static
> and normal member methods, nor play with sub-state structures.  All we need to
> require is that IPA constant propagation sees that the initial "this" argument
> is passed a constant value, and let it propagate and eliminate.
> 
> E.g.
> 
> namespace {
> 
> class pass_state
> {
>   private:
>     int x, y, z;
> 
>   public:
>     constexpr pass_state()
>       : x(0), y(0), z(0)
>     { }
> 
>     void doit();
> 
>   private:
>     void a();
>     void b();
>     void c();
> };
> 
> // ...
> 
> } // anon namespace
> 
> #ifdef GLOBAL_STATE
> static pass_state ps;
> #endif
> 
> void toplev()
> {
> #ifndef GLOBAL_STATE
>   pass_state ps;
> #endif
>   ps.doit();
> }
> 
> With an example that's probably too small, I verified that gcc will eliminate
> all of the this parameters with GLOBAL_STATE defined.  It's certainly something
> worth investigating further, as this scheme has the least amount of boilerplate
> of any so far advanced.

I tried this approach for tracer, but there were non-trivial differences
between the old and new tracer.o files, enough to make me uncomfortable
with this approach.

So I've been investigating another approach: making all the member
data/code of the pass state be "static" (and thus no "this") in a global
state build, and non-static in a shared-library build.

This is similar to the macro-based approach from:
  http://gcc.gnu.org/ml/gcc-patches/2013-05/msg01351.html
but to minimize boilerplate, we could introduce a new compiler
attribute.

I recently posted:
  http://gcc.gnu.org/ml/gcc-patches/2013-07/msg00017.html
which adds __attribute__(force_static) to the compiler: classes marked
with this implicitly have "static" on all their members.  Hence we can
optimize away the "this" with a single annotation at the top of the
class; the optimization takes effect in stages 2 and 3 in a global-state
build, but isn't present for a shared-library build.  "implicit_static"
might be a better name for the attribute.

I've posted more information in the plan linked to from:
http://gcc.gnu.org/ml/gcc/2013-06/msg00215.html
specifically in:
http://dmalcolm.fedorapeople.org/gcc/global-state/singletons.html#another-singleton-removal-optimization

In the meantime, I'm attaching a patch which introduces a tracer_state
class, and for now makes all members explicitly static.

I've successfully bootstrapped this on x86_64-unknown-linux-gnu; all
testcases have the same results as an unpatched build, and indeed the
generated machine code has no differences between old and new tracer.o
beyond the ordering of the functions in the .text section, and some
functions moving to their own .text section).

I also tried this with an anonymous namespace to hold tracer_state, but
it did not appear to affect the generated code beyond making it more
difficult to debug in gdb - without the namespace one can easily print
tracer_state static data.  With it, I had to be at a frame within the
file to be able to print them.  Hence I've dropped the anonymous
namespace.  (alternative idea: put it in a "gccint" namespace? i.e. "gcc
internals").  This was with gdb-7.4.50.20120120-54.fc17.x86_64 fwiw.

This patch does the bulk of the work needed to allow this pass to be
threadsafe in a shared-library build; the remaining work would be to add
the new attribute (via a macro), remove the explicit "static" calls, and
introduce a local variable instance of tracer_state, as per:
http://dmalcolm.fedorapeople.org/gcc/global-state/pass-patterns.html#per-invocation-state-with-no-gty-markings

I would prefer to implement my state-removal work on trunk for 4.9, but
if I have to do it on a branch, is this kind of "enabling" patch OK for
trunk?

FWIW I'm using this pass as my example of a pass with no GTY markings
and with only "per-invocation" state (i.e. setup/cleaned up at the
start/end of each "execute" call), which is one of the three kinds of
pass in the plan I posted here:
http://dmalcolm.fedorapeople.org/gcc/global-state/pass-patterns.html#proposed-implementation


2013-06-28  David Malcolm  <dmalcolm@redhat.com>

	Introduce class tracer_state

	* tracer.c (tail_duplicate): Convert to a static member function
	of a new class tracer_state...
	(tracer_state::tail_duplicate): ...here.
	(mark_bb_seen): Likewise, converting to...
	(tracer_state::mark_bb_seen): ...this.
	(bb_seen_p): Likewise, converting to...
	(tracer_state::bb_seen_p): ...this.
	(find_best_successor): Likewise, converting to...
	(tracer_state::find_best_successor): ...this.
	(find_best_predecessor): Likewise, converting to...
	(tracer_state::find_best_predecessor): ...this.
	(find_trace): Likewise, converting to...
	(tracer_state::find_trace): ...this.
	(probability_cutoff): Convert to a static member of new class
	tracer_state...
	(tracer_state::probability_cutoff): ...here.
	(branch_ratio_cutoff): Likewise, converting to...
	(tracer_state::branch_ratio_cutoff): ...this.
	(bb_seen): Likewise, converting to...
	(tracer_state::bb_seen): ...this.
	(tracer): Convert call from tail_duplicate to
	tracer_state::tail_duplicate.
Richard Henderson - July 1, 2013, 8:07 p.m.
On 07/01/2013 12:59 PM, David Malcolm wrote:
> I tried this approach for tracer, but there were non-trivial differences
> between the old and new tracer.o files, enough to make me uncomfortable
> with this approach.

Like what?  Are these just bugs in the IPA-CP pass failing to propagate?  If
so, we should file bugs against it.


> +++ b/gcc/tracer.c
> @@ -53,20 +53,39 @@
>  static int count_insns (basic_block);
>  static bool ignore_bb_p (const_basic_block);
>  static bool better_p (const_edge, const_edge);
> -static edge find_best_successor (basic_block);
> -static edge find_best_predecessor (basic_block);
> -static int find_trace (basic_block, basic_block *);
>  
> -/* Minimal outgoing edge probability considered for superblock formation.  */
> -static int probability_cutoff;
> -static int branch_ratio_cutoff;
> +/* Internal state of one invocation of the tracer pass.  */
>  
> -/* A bit BB->index is set if BB has already been seen, i.e. it is
> -   connected to some trace already.  */
> -sbitmap bb_seen;
> -
> -static inline void
> -mark_bb_seen (basic_block bb)
> +class tracer_state
> +{
> +public:
> +  static bool tail_duplicate ();
> +
> +private:
> +  static void mark_bb_seen (basic_block);
> +  static bool bb_seen_p (basic_block);
> +  static edge find_best_successor (basic_block);
> +  static edge find_best_predecessor (basic_block);
> +  static int find_trace (basic_block, basic_block *);
> +

Regardless, the new class should still be in an anonymous namespace, so that
all the member functions remain private to the translation unit.


r~

Patch

diff --git a/gcc/tracer.c b/gcc/tracer.c
index 975cadb..7044256 100644
--- a/gcc/tracer.c
+++ b/gcc/tracer.c
@@ -53,20 +53,39 @@ 
 static int count_insns (basic_block);
 static bool ignore_bb_p (const_basic_block);
 static bool better_p (const_edge, const_edge);
-static edge find_best_successor (basic_block);
-static edge find_best_predecessor (basic_block);
-static int find_trace (basic_block, basic_block *);
 
-/* Minimal outgoing edge probability considered for superblock formation.  */
-static int probability_cutoff;
-static int branch_ratio_cutoff;
+/* Internal state of one invocation of the tracer pass.  */
 
-/* A bit BB->index is set if BB has already been seen, i.e. it is
-   connected to some trace already.  */
-sbitmap bb_seen;
-
-static inline void
-mark_bb_seen (basic_block bb)
+class tracer_state
+{
+public:
+  static bool tail_duplicate ();
+
+private:
+  static void mark_bb_seen (basic_block);
+  static bool bb_seen_p (basic_block);
+  static edge find_best_successor (basic_block);
+  static edge find_best_predecessor (basic_block);
+  static int find_trace (basic_block, basic_block *);
+
+private:
+  /* Minimal outgoing edge probability considered for superblock
+     formation.  */
+  static int probability_cutoff;
+  static int branch_ratio_cutoff;
+
+  /* A bit BB->index is set if BB has already been seen, i.e. it is
+     connected to some trace already.  */
+  static sbitmap bb_seen;
+
+}; // class tracer_state
+
+int tracer_state::probability_cutoff;
+int tracer_state::branch_ratio_cutoff;
+sbitmap tracer_state::bb_seen;
+
+inline void
+tracer_state::mark_bb_seen (basic_block bb)
 {
   unsigned int size = SBITMAP_SIZE (bb_seen);
 
@@ -76,8 +95,8 @@  mark_bb_seen (basic_block bb)
   bitmap_set_bit (bb_seen, bb->index);
 }
 
-static inline bool
-bb_seen_p (basic_block bb)
+inline bool
+tracer_state::bb_seen_p (basic_block bb)
 {
   return bitmap_bit_p (bb_seen, bb->index);
 }
@@ -138,8 +157,8 @@  better_p (const_edge e1, const_edge e2)
 
 /* Return most frequent successor of basic block BB.  */
 
-static edge
-find_best_successor (basic_block bb)
+edge
+tracer_state::find_best_successor (basic_block bb)
 {
   edge e;
   edge best = NULL;
@@ -157,8 +176,8 @@  find_best_successor (basic_block bb)
 
 /* Return most frequent predecessor of basic block BB.  */
 
-static edge
-find_best_predecessor (basic_block bb)
+edge
+tracer_state::find_best_predecessor (basic_block bb)
 {
   edge e;
   edge best = NULL;
@@ -178,8 +197,8 @@  find_best_predecessor (basic_block bb)
 /* Find the trace using bb and record it in the TRACE array.
    Return number of basic blocks recorded.  */
 
-static int
-find_trace (basic_block bb, basic_block *trace)
+int
+tracer_state::find_trace (basic_block bb, basic_block *trace)
 {
   int i = 0;
   edge e;
@@ -220,8 +239,8 @@  find_trace (basic_block bb, basic_block *trace)
 /* Look for basic blocks in frequency order, construct traces and tail duplicate
    if profitable.  */
 
-static bool
-tail_duplicate (void)
+bool
+tracer_state::tail_duplicate ()
 {
   fibnode_t *blocks = XCNEWVEC (fibnode_t, last_basic_block);
   basic_block *trace = XNEWVEC (basic_block, n_basic_blocks);
@@ -376,7 +395,7 @@  tracer (void)
     brief_dump_cfg (dump_file, dump_flags);
 
   /* Trace formation is done on the fly inside tail_duplicate */
-  changed = tail_duplicate ();
+  changed = tracer_state::tail_duplicate ();
   if (changed)
     {
       free_dominance_info (CDI_DOMINATORS);