From patchwork Tue Dec 2 20:28:47 2014 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Martin Jambor X-Patchwork-Id: 417133 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 8C310140185 for ; Wed, 3 Dec 2014 07:28:59 +1100 (AEDT) DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:date :from:to:cc:subject:message-id:references:mime-version :content-type:in-reply-to; q=dns; s=default; b=GppZWj7LOdCMq1w3R X1iCYBJnIuQOt1SZDXwebhw38V8bgiK6whtePM4VBv+dPI4NKYacih2nHSzcwccs VW7NFbV4IogikCzUnibmHXt7urQo0pVghLmu5H8nC9s2CQlZvB92IFxfgrH2qqFL nntTblHrPhuhX4O8YwxU+B5PZY= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:date :from:to:cc:subject:message-id:references:mime-version :content-type:in-reply-to; s=default; bh=sBAUeJ104QDytuJ7sJNYsva /N+Q=; b=KQ9KesLKTUhNASiOPrKUXFWYLb9GG3rh8tXe4fRmraE5ybmLy96bm7J Zy5tFBVl+zU5B/Ay+B5Qjubz+cYsLYGEL5PnSYFh9Fbznoo/jrPE739K448nHQH0 H3xnI9FjNmP6EeqWhnRKxaV3c7W4MnMpA8w89CkGExAFRZ92oVuo= Received: (qmail 11417 invoked by alias); 2 Dec 2014 20:28:52 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Received: (qmail 11405 invoked by uid 89); 2 Dec 2014 20:28:52 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-2.2 required=5.0 tests=AWL, BAYES_00, COMPENSATION autolearn=no version=3.3.2 X-HELO: mx2.suse.de Received: from cantor2.suse.de (HELO mx2.suse.de) (195.135.220.15) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with (CAMELLIA256-SHA encrypted) ESMTPS; Tue, 02 Dec 2014 20:28:50 +0000 Received: from relay2.suse.de (charybdis-ext.suse.de [195.135.220.254]) by mx2.suse.de (Postfix) with ESMTP id A7909AC24; Tue, 2 Dec 2014 20:28:47 +0000 (UTC) Date: Tue, 2 Dec 2014 21:28:47 +0100 From: Martin Jambor To: Jan Hubicka Cc: GCC Patches Subject: Re: [PATCH 2/2, PR 63814] Do not re-create expanded artificial thunks Message-ID: <20141202202847.GJ8214@virgil.suse> Mail-Followup-To: Jan Hubicka , GCC Patches References: <20141121191812.GC7784@virgil.suse> <20141201172631.GD8214@virgil.suse> <20141201214319.GB11741@kam.mff.cuni.cz> MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: <20141201214319.GB11741@kam.mff.cuni.cz> User-Agent: Mutt/1.5.21 (2010-09-15) X-IsSubscribed: yes Hi, On Mon, Dec 01, 2014 at 10:43:19PM +0100, Jan Hubicka wrote: > > On Fri, Nov 21, 2014 at 08:18:12PM +0100, Martin Jambor wrote: > > > Hi, > > > > > > when debugging PR 63814 I noticed that when cgraph_node::create_clone > > > was using redirect_edge_duplicating_thunks to redirect two edges to a > > > thunk of a clone, two thunks were created, one for each edge. The > > > reason is that even though duplicate_thunk_for_node attempts to locate > > > an already created thunk, it does so by looking for a caller with > > > thunk.thunk_p set and the previously created one does not have it set > > > because (on i686) expand_thunk has expanded the thunk to gimple and > > > cleared the flag. > > > > > > This patch fixes the issue by marking such expanded thunks with yet > > > another flag and then uses the flag to identify such expanded thunks. > > > Bootstrapped and tested on x86_64-linux and i686-linux. Honza, do you > > > think this is a good approach? Is the patch OK for trunk? > > Hmm, I was sort of hoping to drop the whole .thunk structure into summary and > do not keep it for functions with Gimple body, so this change will imply the > need to keep this datastructure for thunks that are already expanded. > moreover we do not stream the flag, so not all expanded thunks are handled, > only those that was introduced during WPA. > > It may be cleaner to simply donot expand thunks proactively before all > the clonning is finished? > I did not really like it either (although I must say that in general I really dislike the need to call expand_thunk but telling it to not really expand anything if possible). So here is another approach, which works by delaying expansion of thunks but only after all redirections within create_clone are done. This has the advantage that the new method expand_all_artificial_thunks is not exposed to every (indirect) user of cgraph_node::create_clone (as opposed to requiring them all to call that function at some point after making any redirections at all). However, it also has the disadvantage that in the unlikeliest of circumstances (strongly connected components in IPA-CP value dependencies connected by thunks and these thunks ceasing to be assembly thunks in the course of cloning at the same time), IPA-CP might still create extra gimple-expanded thunks. The alternative not suffering from this problem would be either exposing expand_all_artificial_thunks to all users and requiring them to call it, with IPA-CP doing it for all created clones at the very end of propagation, or teaching the infrastructure and pass manager about this and then expanding them at some convenient time (before/after inlining?). However, the first option seems to be too ugly for very little benefit (but I am willing to implement that) and the second does not seem to be stage3 material. What do you think? Bootstrapped and tested on x86_64-linux. Martin 2014-12-02 Martin Jambor * cgraph.h (cgraph_node): New method expand_all_artificial_thunks. (cgraph_edge): New method redirect_callee_duplicating_thunks. * cgraphclones.c (duplicate_thunk_for_node): Donot expand newly created thunks. (redirect_edge_duplicating_thunks): Turned into edge method redirect_callee_duplicating_thunks. (cgraph_node::expand_all_artificial_thunks): New method. (create_clone): Call expand_all_artificial_thunks. * ipa-cp.c (perhaps_add_new_callers): Call redirect_callee_duplicating_thunks instead of redirect_callee. Also call expand_all_artificial_thunks. Index: src/gcc/cgraph.h =================================================================== --- src.orig/gcc/cgraph.h +++ src/gcc/cgraph.h @@ -908,6 +908,10 @@ public: thunks that are not lowered. */ bool expand_thunk (bool output_asm_thunks, bool force_gimple_thunk); + /* Call expand_thunk on all callers that are thunks and if analyze those + nodes that were expanded. */ + void expand_all_artificial_thunks (); + /* Assemble thunks and aliases associated to node. */ void assemble_thunks_and_aliases (void); @@ -1477,6 +1481,12 @@ struct GTY((chain_next ("%h.next_caller" call expression. */ void redirect_callee (cgraph_node *n); + /* If the edge does not lead to a thunk, simply redirect it to N. Otherwise + create one or more equivalent thunks for N and redirect E to the first in + the chain. Note that it is then necessary to call + n->expand_all_artificial_thunks once all callers are redirected. */ + void redirect_callee_duplicating_thunks (cgraph_node *n); + /* Make an indirect edge with an unknown callee an ordinary edge leading to CALLEE. DELTA is an integer constant that is to be added to the this pointer (first parameter) to compensate for skipping Index: src/gcc/cgraphclones.c =================================================================== --- src.orig/gcc/cgraphclones.c +++ src/gcc/cgraphclones.c @@ -370,28 +370,47 @@ duplicate_thunk_for_node (cgraph_node *t CGRAPH_FREQ_BASE); e->call_stmt_cannot_inline_p = true; symtab->call_edge_duplication_hooks (thunk->callees, e); - if (new_thunk->expand_thunk (false, false)) - { - new_thunk->thunk.thunk_p = false; - new_thunk->analyze (); - } - symtab->call_cgraph_duplication_hooks (thunk, new_thunk); return new_thunk; } /* If E does not lead to a thunk, simply redirect it to N. Otherwise create one or more equivalent thunks for N and redirect E to the first in the - chain. */ + chain. Note that it is then necessary to call + n->expand_all_artificial_thunks once all callers are redirected. */ void -redirect_edge_duplicating_thunks (cgraph_edge *e, cgraph_node *n) +cgraph_edge::redirect_callee_duplicating_thunks (cgraph_node *n) { - cgraph_node *orig_to = e->callee->ultimate_alias_target (); + cgraph_node *orig_to = callee->ultimate_alias_target (); if (orig_to->thunk.thunk_p) n = duplicate_thunk_for_node (orig_to, n); - e->redirect_callee (n); + redirect_callee (n); +} + +/* Call expand_thunk on all callers that are thunks and if analyze those nodes + that were expanded. */ + +void +cgraph_node::expand_all_artificial_thunks () +{ + cgraph_edge *e; + for (e = callers; e;) + if (e->caller->thunk.thunk_p) + { + cgraph_node *thunk = e->caller; + + e = e->next_caller; + if (thunk->expand_thunk (false, false)) + { + thunk->thunk.thunk_p = false; + thunk->analyze (); + } + thunk->expand_all_artificial_thunks (); + } + else + e = e->next_caller; } /* Create node representing clone of N executed COUNT times. Decrease @@ -483,8 +502,9 @@ cgraph_node::create_clone (tree decl, gc if (!e->callee || DECL_BUILT_IN_CLASS (e->callee->decl) != BUILT_IN_NORMAL || DECL_FUNCTION_CODE (e->callee->decl) != BUILT_IN_UNREACHABLE) - redirect_edge_duplicating_thunks (e, new_node); + e->redirect_callee_duplicating_thunks (new_node); } + new_node->expand_all_artificial_thunks (); for (e = callees;e; e=e->next_callee) e->clone (new_node, e->call_stmt, e->lto_stmt_uid, count_scale, Index: src/gcc/ipa-cp.c =================================================================== --- src.orig/gcc/ipa-cp.c +++ src/gcc/ipa-cp.c @@ -3806,7 +3806,8 @@ perhaps_add_new_callers (cgraph_node *no xstrdup (val->spec_node->name ()), val->spec_node->order); - cs->redirect_callee (val->spec_node); + cs->redirect_callee_duplicating_thunks (val->spec_node); + val->spec_node->expand_all_artificial_thunks (); redirected_sum += cs->count; } cs = get_next_cgraph_edge_clone (cs);