From patchwork Mon Jul 1 19:59:19 2013 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Malcolm X-Patchwork-Id: 256214 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client CN "localhost", Issuer "www.qmailtoaster.com" (not verified)) by ozlabs.org (Postfix) with ESMTPS id D56A32C008C for ; Tue, 2 Jul 2013 05:59:36 +1000 (EST) DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender :message-id:subject:from:to:cc:date:in-reply-to:references :content-type:mime-version; q=dns; s=default; b=Fr9k0mfonIwomW4A L/0BjQpWeHuQfPU+AkfpTwudxVh89qni0odKdqew6IVJ+ugL+zP238BdqRqVe6tb IEmHApUv0BqAgPNmTF7GDmXGLSmHF0ZTampycDQNC/C4axSco2EesQ7IM/BWlk9o pU58HHA6WgspfpYQ/YMKw4MTMHc= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender :message-id:subject:from:to:cc:date:in-reply-to:references :content-type:mime-version; s=default; bh=EUthm/Tp28nfPIu9S7Zxa0 HI1k0=; b=d2hHZMuw0Gr6y5oNN6Dron+mnWaBWcp6ND4NPk10xNAeN1CD/8BA+8 CX38zUDjD9zzUV45JhNBbL1zYC2CXT9G9t8OWGsvF6Xh1MGjUF4p3wo/QD8r2NLa tOB2jpNFCnqqZoYKlmsx0nFo8NJjNj+kwZk5nZ+/kDp9sn9oypl4c= Received: (qmail 8876 invoked by alias); 1 Jul 2013 19:59:29 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Received: (qmail 8857 invoked by uid 89); 1 Jul 2013 19:59:26 -0000 X-Spam-SWARE-Status: No, score=-6.6 required=5.0 tests=AWL, BAYES_00, KHOP_THREADED, RCVD_IN_HOSTKARMA_W, RCVD_IN_HOSTKARMA_WL, RP_MATCHES_RCVD, SPF_HELO_PASS, SPF_PASS autolearn=ham version=3.3.1 Received: from mx1.redhat.com (HELO mx1.redhat.com) (209.132.183.28) by sourceware.org (qpsmtpd/0.84/v0.84-167-ge50287c) with ESMTP; Mon, 01 Jul 2013 19:59:25 +0000 Received: from int-mx09.intmail.prod.int.phx2.redhat.com (int-mx09.intmail.prod.int.phx2.redhat.com [10.5.11.22]) by mx1.redhat.com (8.14.4/8.14.4) with ESMTP id r61JxNFx015681 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Mon, 1 Jul 2013 15:59:24 -0400 Received: from [10.18.25.93] ([10.18.25.93]) by int-mx09.intmail.prod.int.phx2.redhat.com (8.14.4/8.14.4) with ESMTP id r61JxNQs029020; Mon, 1 Jul 2013 15:59:23 -0400 Message-ID: <1372708759.1789.177.camel@surprise> Subject: Re: Remove global state from gcc/tracer.c From: David Malcolm To: Richard Henderson Cc: Jakub Jelinek , GCC Patches , Jason Merrill Date: Mon, 01 Jul 2013 15:59:19 -0400 In-Reply-To: <519E8168.70803@redhat.com> References: <1369269945.26167.50.camel@surprise> <20130523051453.GN1377@tucnak.redhat.com> <1369306601.26167.70.camel@surprise> <519E6899.9030506@redhat.com> <519E8168.70803@redhat.com> Mime-Version: 1.0 X-Virus-Found: No (restarting this thread: http://gcc.gnu.org/ml/gcc-patches/2013-05/msg01318.html ) On Thu, 2013-05-23 at 13:51 -0700, Richard Henderson wrote: > On 05/23/2013 12:06 PM, Richard Henderson wrote: > > Another thing I should mention while you're doing all of these static function > > to class member conversions is that as written we're losing target-specific > > optimizations that can be done on purely local functions. This is trivially > > fixed by placing these new classes in an anonymous namespace. > > At which point it occurs to me that we don't need to distinguish between static > and normal member methods, nor play with sub-state structures. All we need to > require is that IPA constant propagation sees that the initial "this" argument > is passed a constant value, and let it propagate and eliminate. > > E.g. > > namespace { > > class pass_state > { > private: > int x, y, z; > > public: > constexpr pass_state() > : x(0), y(0), z(0) > { } > > void doit(); > > private: > void a(); > void b(); > void c(); > }; > > // ... > > } // anon namespace > > #ifdef GLOBAL_STATE > static pass_state ps; > #endif > > void toplev() > { > #ifndef GLOBAL_STATE > pass_state ps; > #endif > ps.doit(); > } > > With an example that's probably too small, I verified that gcc will eliminate > all of the this parameters with GLOBAL_STATE defined. It's certainly something > worth investigating further, as this scheme has the least amount of boilerplate > of any so far advanced. I tried this approach for tracer, but there were non-trivial differences between the old and new tracer.o files, enough to make me uncomfortable with this approach. So I've been investigating another approach: making all the member data/code of the pass state be "static" (and thus no "this") in a global state build, and non-static in a shared-library build. This is similar to the macro-based approach from: http://gcc.gnu.org/ml/gcc-patches/2013-05/msg01351.html but to minimize boilerplate, we could introduce a new compiler attribute. I recently posted: http://gcc.gnu.org/ml/gcc-patches/2013-07/msg00017.html which adds __attribute__(force_static) to the compiler: classes marked with this implicitly have "static" on all their members. Hence we can optimize away the "this" with a single annotation at the top of the class; the optimization takes effect in stages 2 and 3 in a global-state build, but isn't present for a shared-library build. "implicit_static" might be a better name for the attribute. I've posted more information in the plan linked to from: http://gcc.gnu.org/ml/gcc/2013-06/msg00215.html specifically in: http://dmalcolm.fedorapeople.org/gcc/global-state/singletons.html#another-singleton-removal-optimization In the meantime, I'm attaching a patch which introduces a tracer_state class, and for now makes all members explicitly static. I've successfully bootstrapped this on x86_64-unknown-linux-gnu; all testcases have the same results as an unpatched build, and indeed the generated machine code has no differences between old and new tracer.o beyond the ordering of the functions in the .text section, and some functions moving to their own .text section). I also tried this with an anonymous namespace to hold tracer_state, but it did not appear to affect the generated code beyond making it more difficult to debug in gdb - without the namespace one can easily print tracer_state static data. With it, I had to be at a frame within the file to be able to print them. Hence I've dropped the anonymous namespace. (alternative idea: put it in a "gccint" namespace? i.e. "gcc internals"). This was with gdb-7.4.50.20120120-54.fc17.x86_64 fwiw. This patch does the bulk of the work needed to allow this pass to be threadsafe in a shared-library build; the remaining work would be to add the new attribute (via a macro), remove the explicit "static" calls, and introduce a local variable instance of tracer_state, as per: http://dmalcolm.fedorapeople.org/gcc/global-state/pass-patterns.html#per-invocation-state-with-no-gty-markings I would prefer to implement my state-removal work on trunk for 4.9, but if I have to do it on a branch, is this kind of "enabling" patch OK for trunk? FWIW I'm using this pass as my example of a pass with no GTY markings and with only "per-invocation" state (i.e. setup/cleaned up at the start/end of each "execute" call), which is one of the three kinds of pass in the plan I posted here: http://dmalcolm.fedorapeople.org/gcc/global-state/pass-patterns.html#proposed-implementation 2013-06-28 David Malcolm Introduce class tracer_state * tracer.c (tail_duplicate): Convert to a static member function of a new class tracer_state... (tracer_state::tail_duplicate): ...here. (mark_bb_seen): Likewise, converting to... (tracer_state::mark_bb_seen): ...this. (bb_seen_p): Likewise, converting to... (tracer_state::bb_seen_p): ...this. (find_best_successor): Likewise, converting to... (tracer_state::find_best_successor): ...this. (find_best_predecessor): Likewise, converting to... (tracer_state::find_best_predecessor): ...this. (find_trace): Likewise, converting to... (tracer_state::find_trace): ...this. (probability_cutoff): Convert to a static member of new class tracer_state... (tracer_state::probability_cutoff): ...here. (branch_ratio_cutoff): Likewise, converting to... (tracer_state::branch_ratio_cutoff): ...this. (bb_seen): Likewise, converting to... (tracer_state::bb_seen): ...this. (tracer): Convert call from tail_duplicate to tracer_state::tail_duplicate. diff --git a/gcc/tracer.c b/gcc/tracer.c index 975cadb..7044256 100644 --- a/gcc/tracer.c +++ b/gcc/tracer.c @@ -53,20 +53,39 @@ static int count_insns (basic_block); static bool ignore_bb_p (const_basic_block); static bool better_p (const_edge, const_edge); -static edge find_best_successor (basic_block); -static edge find_best_predecessor (basic_block); -static int find_trace (basic_block, basic_block *); -/* Minimal outgoing edge probability considered for superblock formation. */ -static int probability_cutoff; -static int branch_ratio_cutoff; +/* Internal state of one invocation of the tracer pass. */ -/* A bit BB->index is set if BB has already been seen, i.e. it is - connected to some trace already. */ -sbitmap bb_seen; - -static inline void -mark_bb_seen (basic_block bb) +class tracer_state +{ +public: + static bool tail_duplicate (); + +private: + static void mark_bb_seen (basic_block); + static bool bb_seen_p (basic_block); + static edge find_best_successor (basic_block); + static edge find_best_predecessor (basic_block); + static int find_trace (basic_block, basic_block *); + +private: + /* Minimal outgoing edge probability considered for superblock + formation. */ + static int probability_cutoff; + static int branch_ratio_cutoff; + + /* A bit BB->index is set if BB has already been seen, i.e. it is + connected to some trace already. */ + static sbitmap bb_seen; + +}; // class tracer_state + +int tracer_state::probability_cutoff; +int tracer_state::branch_ratio_cutoff; +sbitmap tracer_state::bb_seen; + +inline void +tracer_state::mark_bb_seen (basic_block bb) { unsigned int size = SBITMAP_SIZE (bb_seen); @@ -76,8 +95,8 @@ mark_bb_seen (basic_block bb) bitmap_set_bit (bb_seen, bb->index); } -static inline bool -bb_seen_p (basic_block bb) +inline bool +tracer_state::bb_seen_p (basic_block bb) { return bitmap_bit_p (bb_seen, bb->index); } @@ -138,8 +157,8 @@ better_p (const_edge e1, const_edge e2) /* Return most frequent successor of basic block BB. */ -static edge -find_best_successor (basic_block bb) +edge +tracer_state::find_best_successor (basic_block bb) { edge e; edge best = NULL; @@ -157,8 +176,8 @@ find_best_successor (basic_block bb) /* Return most frequent predecessor of basic block BB. */ -static edge -find_best_predecessor (basic_block bb) +edge +tracer_state::find_best_predecessor (basic_block bb) { edge e; edge best = NULL; @@ -178,8 +197,8 @@ find_best_predecessor (basic_block bb) /* Find the trace using bb and record it in the TRACE array. Return number of basic blocks recorded. */ -static int -find_trace (basic_block bb, basic_block *trace) +int +tracer_state::find_trace (basic_block bb, basic_block *trace) { int i = 0; edge e; @@ -220,8 +239,8 @@ find_trace (basic_block bb, basic_block *trace) /* Look for basic blocks in frequency order, construct traces and tail duplicate if profitable. */ -static bool -tail_duplicate (void) +bool +tracer_state::tail_duplicate () { fibnode_t *blocks = XCNEWVEC (fibnode_t, last_basic_block); basic_block *trace = XNEWVEC (basic_block, n_basic_blocks); @@ -376,7 +395,7 @@ tracer (void) brief_dump_cfg (dump_file, dump_flags); /* Trace formation is done on the fly inside tail_duplicate */ - changed = tail_duplicate (); + changed = tracer_state::tail_duplicate (); if (changed) { free_dominance_info (CDI_DOMINATORS);