From patchwork Tue Feb 17 17:14:36 2015 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: =?utf-8?q?Martin_Li=C5=A1ka?= X-Patchwork-Id: 440673 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 987E414019D for ; Wed, 18 Feb 2015 04:14:53 +1100 (AEDT) DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender :message-id:date:from:mime-version:to:cc:subject:content-type; q=dns; s=default; b=oj5W/WIbC4xr7rhvZwjajRFjaaWhPOGLkWLpEgx6BNy XRzkEJD9bbWyeSQxJR3t6Ci93QDKTLNizIE/hAFaEkgagI25jdDaF7sI2HG7LM6R 7/bETX5CHRTbNhngWjgHOOrBCEalbMvAhYPHkKA8YeCw9JfusNCyn8gefAXRqbbM = DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender :message-id:date:from:mime-version:to:cc:subject:content-type; s=default; bh=yYpu9FSck7tk/WmUUCrTlLWmMPo=; b=TZHNWlIfec4SGlzrl IqG7qzFPjo7dYzmqPSdYwpGDb6jBBlJZN3RLtIo/YP5k6rWQwMvjuoK4nKsyYN36 +qhrFdxypug+G5UiCrhJJLypV2vf4J+GXDx6z2E4Dzeyh4iFnTuXP7g13VZLc6st q5U0A7zUU4W6EYAU/WIgD186mg= Received: (qmail 21913 invoked by alias); 17 Feb 2015 17:14:45 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Received: (qmail 21902 invoked by uid 89); 17 Feb 2015 17:14:44 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-1.2 required=5.0 tests=AWL, BAYES_00, T_FILL_THIS_FORM_SHORT autolearn=ham version=3.3.2 X-HELO: mx2.suse.de Received: from cantor2.suse.de 
(HELO mx2.suse.de) (195.135.220.15) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with (CAMELLIA256-SHA encrypted) ESMTPS; Tue, 17 Feb 2015 17:14:41 +0000
Received: from relay2.suse.de (charybdis-ext.suse.de [195.135.220.254]) by mx2.suse.de (Postfix) with ESMTP id B3D99AB08; Tue, 17 Feb 2015 17:14:37 +0000 (UTC)
Message-ID: <54E376FC.9080709@suse.cz>
Date: Tue, 17 Feb 2015 18:14:36 +0100
From: =?UTF-8?B?TWFydGluIExpxaFrYQ==?=
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.2.0
MIME-Version: 1.0
To: GCC Patches
CC: "hubicka >> Jan Hubicka"
Subject: [RFC, PATCH] LTO: IPA inline speed up for large apps (Chrome)
X-IsSubscribed: yes

Hello.

After debugging LTO builds of Chrome, Honza and I noticed that the WPA phase
takes quite a long time. The following patch is an attempt to cache IPA
inliner predicates that are constant during inline_small_functions. As you
can see in the attached report, the patch reduces the time spent in WPA by
~40%, which is a really big improvement.

A disadvantage of the solution is that the patch adds 4 new bitfields to the
cgraph_node class. We could move these flags to inline_summary, but as that
struct is not accessible from cgraph.h, we would lose the inlining that is
crucial for these predicates.

I welcome any ideas about the solution. I'm not sure if it's acceptable for
stage 4, which is the reason why no ChangeLog entry has been prepared.

Thanks,
Martin

Hello.
The following mini patchset is a speed-up for LTO WPA, measured on the chromium binary:

Before:

Execution times (seconds)
phase setup             :   0.01 ( 0%) usr 0.00 ( 0%) sys   0.01 ( 0%) wall     1977 kB ( 0%) ggc
phase opt and generate  : 179.87 (66%) usr 1.67 (45%) sys 181.47 (66%) wall  2682287 kB (13%) ggc
phase stream in         :  92.75 (34%) usr 2.05 (55%) sys  94.77 (34%) wall 18738391 kB (87%) ggc
callgraph optimization  :   0.71 ( 0%) usr 0.00 ( 0%) sys   0.71 ( 0%) wall       16 kB ( 0%) ggc
ipa dead code removal   :   5.20 ( 2%) usr 0.05 ( 1%) sys   5.26 ( 2%) wall        0 kB ( 0%) ggc
ipa virtual call target :   3.22 ( 1%) usr 0.03 ( 1%) sys   3.20 ( 1%) wall        0 kB ( 0%) ggc
ipa devirtualization    :   0.28 ( 0%) usr 0.01 ( 0%) sys   0.26 ( 0%) wall    32638 kB ( 0%) ggc
ipa cp                  :   4.27 ( 2%) usr 0.24 ( 6%) sys   4.55 ( 2%) wall   851324 kB ( 4%) ggc
ipa inlining heuristics : 127.09 (47%) usr 0.27 ( 7%) sys 127.25 (46%) wall   807884 kB ( 4%) ggc
ipa comdats             :   0.57 ( 0%) usr 0.00 ( 0%) sys   0.57 ( 0%) wall        0 kB ( 0%) ggc
ipa lto gimple in       :   5.47 ( 2%) usr 0.92 (25%) sys   6.37 ( 2%) wall  1370242 kB ( 6%) ggc
ipa lto decl in         :  79.23 (29%) usr 1.32 (35%) sys  80.53 (29%) wall 16957392 kB (79%) ggc
ipa lto constructors in :   0.33 ( 0%) usr 0.03 ( 1%) sys   0.44 ( 0%) wall    22897 kB ( 0%) ggc
ipa lto cgraph I/O      :   1.41 ( 1%) usr 0.21 ( 6%) sys   1.62 ( 1%) wall   901987 kB ( 4%) ggc
ipa lto decl merge      :   3.22 ( 1%) usr 0.00 ( 0%) sys   3.22 ( 1%) wall    16383 kB ( 0%) ggc
ipa lto cgraph merge    :   5.10 ( 2%) usr 0.01 ( 0%) sys   5.11 ( 2%) wall    20432 kB ( 0%) ggc
whopr wpa               :   1.95 ( 1%) usr 0.00 ( 0%) sys   1.94 ( 1%) wall        2 kB ( 0%) ggc
whopr partitioning      :   5.22 ( 2%) usr 0.01 ( 0%) sys   5.23 ( 2%) wall     7800 kB ( 0%) ggc
ipa reference           :   2.97 ( 1%) usr 0.06 ( 2%) sys   3.02 ( 1%) wall        0 kB ( 0%) ggc
ipa profile             :   0.52 ( 0%) usr 0.04 ( 1%) sys   0.56 ( 0%) wall        0 kB ( 0%) ggc
ipa pure const          :   3.51 ( 1%) usr 0.04 ( 1%) sys   3.56 ( 1%) wall        0 kB ( 0%) ggc
ipa icf                 :  19.33 ( 7%) usr 0.12 ( 3%) sys  19.52 ( 7%) wall     3089 kB ( 0%) ggc
tree SSA rewrite        :   0.35 ( 0%) usr 0.02 ( 1%) sys   0.37 ( 0%) wall    51191 kB ( 0%) ggc
tree SSA other          :   0.01 ( 0%) usr 0.00 ( 0%) sys   0.00 ( 0%) wall        0 kB ( 0%) ggc
tree SSA incremental    :   0.48 ( 0%) usr 0.06 ( 2%) sys   0.37 ( 0%) wall    33552 kB ( 0%) ggc
tree operand scan       :   0.41 ( 0%) usr 0.08 ( 2%) sys   0.53 ( 0%) wall   343835 kB ( 2%) ggc
dominance frontiers     :   0.04 ( 0%) usr 0.01 ( 0%) sys   0.03 ( 0%) wall        0 kB ( 0%) ggc
dominance computation   :   0.36 ( 0%) usr 0.09 ( 2%) sys   0.55 ( 0%) wall        0 kB ( 0%) ggc
varconst                :   0.03 ( 0%) usr 0.03 ( 1%) sys   0.06 ( 0%) wall        0 kB ( 0%) ggc
loop fini               :   0.08 ( 0%) usr 0.00 ( 0%) sys   0.09 ( 0%) wall        0 kB ( 0%) ggc
unaccounted todo        :   1.18 ( 0%) usr 0.00 ( 0%) sys   1.19 ( 0%) wall        0 kB ( 0%) ggc
TOTAL                   : 272.63           3.72           276.25            21422657 kB

After:

Execution times (seconds)
phase setup             :   0.00 ( 0%) usr 0.00 ( 0%) sys   0.00 ( 0%) wall     1977 kB ( 0%) ggc
phase opt and generate  :  73.30 (43%) usr 1.79 (44%) sys  75.06 (43%) wall  2682287 kB (13%) ggc
phase stream in         :  95.72 (57%) usr 2.25 (56%) sys  97.94 (57%) wall 18738391 kB (87%) ggc
callgraph optimization  :   0.75 ( 0%) usr 0.00 ( 0%) sys   0.76 ( 0%) wall       16 kB ( 0%) ggc
ipa dead code removal   :   5.19 ( 3%) usr 0.03 ( 1%) sys   5.25 ( 3%) wall        0 kB ( 0%) ggc
ipa virtual call target :   2.81 ( 2%) usr 0.03 ( 1%) sys   3.15 ( 2%) wall        0 kB ( 0%) ggc
ipa devirtualization    :   0.29 ( 0%) usr 0.00 ( 0%) sys   0.26 ( 0%) wall    32638 kB ( 0%) ggc
ipa cp                  :   4.59 ( 3%) usr 0.24 ( 6%) sys   4.76 ( 3%) wall   851324 kB ( 4%) ggc
ipa inlining heuristics :  22.09 (13%) usr 0.26 ( 6%) sys  22.20 (13%) wall   807884 kB ( 4%) ggc
ipa comdats             :   0.57 ( 0%) usr 0.00 ( 0%) sys   0.57 ( 0%) wall        0 kB ( 0%) ggc
ipa lto gimple in       :   5.67 ( 3%) usr 0.93 (23%) sys   6.51 ( 4%) wall  1370242 kB ( 6%) ggc
ipa lto decl in         :  81.86 (48%) usr 1.45 (36%) sys  83.29 (48%) wall 16957392 kB (79%) ggc
ipa lto constructors in :   0.41 ( 0%) usr 0.09 ( 2%) sys   0.36 ( 0%) wall    22897 kB ( 0%) ggc
ipa lto cgraph I/O      :   1.49 ( 1%) usr 0.25 ( 6%) sys   1.73 ( 1%) wall   901987 kB ( 4%) ggc
ipa lto decl merge      :   3.55 ( 2%) usr 0.00 ( 0%) sys   3.55 ( 2%) wall    16383 kB ( 0%) ggc
ipa lto cgraph merge    :   5.05 ( 3%) usr 0.00 ( 0%) sys   5.07 ( 3%) wall    20432 kB ( 0%) ggc
whopr wpa               :   1.88 ( 1%) usr 0.00 ( 0%) sys   1.86 ( 1%) wall        2 kB ( 0%) ggc
whopr partitioning      :   4.89 ( 3%) usr 0.02 ( 0%) sys   4.90 ( 3%) wall     7800 kB ( 0%) ggc
ipa reference           :   2.85 ( 2%) usr 0.05 ( 1%) sys   2.91 ( 2%) wall        0 kB ( 0%) ggc
ipa profile             :   0.55 ( 0%) usr 0.04 ( 1%) sys   0.59 ( 0%) wall        0 kB ( 0%) ggc
ipa pure const          :   3.28 ( 2%) usr 0.04 ( 1%) sys   3.33 ( 2%) wall        0 kB ( 0%) ggc
ipa icf                 :  18.23 (11%) usr 0.12 ( 3%) sys  18.29 (11%) wall     3089 kB ( 0%) ggc
tree SSA rewrite        :   0.26 ( 0%) usr 0.04 ( 1%) sys   0.32 ( 0%) wall    51191 kB ( 0%) ggc
tree SSA other          :   0.00 ( 0%) usr 0.00 ( 0%) sys   0.01 ( 0%) wall        0 kB ( 0%) ggc
tree SSA incremental    :   0.51 ( 0%) usr 0.16 ( 4%) sys   0.60 ( 0%) wall    33552 kB ( 0%) ggc
tree operand scan       :   0.36 ( 0%) usr 0.13 ( 3%) sys   0.49 ( 0%) wall   343835 kB ( 2%) ggc
dominance frontiers     :   0.04 ( 0%) usr 0.00 ( 0%) sys   0.03 ( 0%) wall        0 kB ( 0%) ggc
dominance computation   :   0.39 ( 0%) usr 0.06 ( 1%) sys   0.63 ( 0%) wall        0 kB ( 0%) ggc
varconst                :   0.05 ( 0%) usr 0.04 ( 1%) sys   0.06 ( 0%) wall        0 kB ( 0%) ggc
loop fini               :   0.10 ( 0%) usr 0.00 ( 0%) sys   0.12 ( 0%) wall        0 kB ( 0%) ggc
unaccounted todo        :   1.26 ( 1%) usr 0.00 ( 0%) sys   1.26 ( 1%) wall        0 kB ( 0%) ggc
TOTAL                   : 169.02           4.04           173.00            21422657 kB

perf report after:

 10.17%  lto1-wpa  lto1          [.] inflate_fast
  3.74%  lto1-wpa  lto1          [.] compare_tree_sccs_1(tree_node*, tree_node*, tree_node***)
  3.56%  lto1-wpa  lto1          [.] streamer_read_uhwi(lto_input_block*)
  3.16%  lto1-wpa  lto1          [.] ht_lookup_with_hash(ht*, unsigned char const*, unsigned long, unsigned int, ht_lookup_option)
  3.01%  lto1-wpa  lto1          [.] unify_scc(streamer_tree_cache_d*, unsigned int, unsigned int, unsigned int, unsigned int)
  2.69%  lto1-wpa  lto1          [.] streamer_read_tree_bitfields(lto_input_block*, data_in*, tree_node*)
  2.16%  lto1-wpa  lto1          [.] lto_cgraph_replace_node(cgraph_node*, cgraph_node*)
  2.00%  lto1-wpa  lto1          [.] streamer_get_pickled_tree(lto_input_block*, data_in*)
  2.00%  lto1-wpa  libc-2.19.so  [.] msort_with_tmp.part.0
  1.91%  lto1-wpa  lto1          [.] ipa_icf::sem_variable::equals(tree_node*, tree_node*)
  1.72%  lto1-wpa  libc-2.19.so  [.] _int_malloc
  1.70%  lto1-wpa  lto1          [.] symbol_table::remove_unreachable_nodes(_IO_FILE*)
  1.54%  lto1-wpa  lto1          [.] lto_input_tree_1(lto_input_block*, data_in*, LTO_tags, unsigned int)
  1.33%  lto1-wpa  lto1          [.] inflate
  1.21%  lto1-wpa  lto1          [.] adler32
  1.16%  lto1-wpa  lto1          [.] cgraph_node::call_for_symbol_thunks_and_aliases(bool (*)(cgraph_node*, void*), void*, bool, bool)
  1.11%  lto1-wpa  lto1          [.] lto_input_tree(lto_input_block*, data_in*)
  1.07%  lto1-wpa  lto1          [.] streamer_read_tree_body(lto_input_block*, data_in*, tree_node*)
  1.03%  lto1-wpa  lto1          [.] lto_input_location(bitpack_d*, data_in*)
  1.01%  lto1-wpa  lto1          [.] htab_hash_string
  0.99%  lto1-wpa  lto1          [.] estimate_calls_size_and_time(cgraph_node*, int*, int*, int*, int*, unsigned int, vec, vec, vec) [clone .isra.137]
  0.92%  lto1-wpa  lto1          [.] ht_lookup(ht*, unsigned char const*, unsigned long, ht_lookup_option)
  0.92%  lto1-wpa  lto1          [.] ggc_internal_alloc(unsigned long, void (*)(void*), unsigned long, unsigned long)
  0.86%  lto1-wpa  lto1          [.] splay_tree_splay
  0.83%  lto1-wpa  lto1          [.] bp_unpack_var_len_unsigned(bitpack_d*)
  0.80%  lto1-wpa  libc-2.19.so  [.] malloc_consolidate
  0.77%  lto1-wpa  lto1          [.] can_inline_edge_p(cgraph_edge*, bool, bool)
  0.72%  lto1-wpa  lto1          [.] gimple_has_body_p(tree_node*)

Thanks,
Martin

From 4e878a928ff7e9fe4eee0ea4b241c01c4440bd60 Mon Sep 17 00:00:00 2001
From: mliska
Date: Mon, 16 Feb 2015 16:48:01 +0100
Subject: [PATCH] ipa-inline: introduce computed value that speeds up IPA inliner.

--- gcc/cgraph.c | 77 ------------- gcc/cgraph.h | 309 ++++++++++++++++++++++++++++++++++++++++++++++++++++- gcc/ipa-inline.c | 2 + gcc/lto-streamer.c | 2 + gcc/symtab.c | 48 ++++++--- 5 files changed, 345 insertions(+), 93 deletions(-) diff --git a/gcc/cgraph.c b/gcc/cgraph.c index 3548bd0..b72a6c0 100644 --- a/gcc/cgraph.c +++ b/gcc/cgraph.c @@ -2403,83 +2403,6 @@ cgraph_edge::maybe_hot_p (void) return true; } -/* Worker for cgraph_can_remove_if_no_direct_calls_p. */ - -static bool -nonremovable_p (cgraph_node *node, void *) -{ - return !node->can_remove_if_no_direct_calls_and_refs_p (); -} - -/* Return true when function cgraph_node and its aliases can be removed from - callgraph if all direct calls are eliminated. */ - -bool -cgraph_node::can_remove_if_no_direct_calls_p (void) -{ - /* Extern inlines can always go, we will use the external definition. */ - if (DECL_EXTERNAL (decl)) - return true; - if (address_taken) - return false; - return !call_for_symbol_and_aliases (nonremovable_p, NULL, true); -} - -/* Return true when function cgraph_node can be expected to be removed - from program when direct calls in this compilation unit are removed. - - As a special case COMDAT functions are - cgraph_can_remove_if_no_direct_calls_p while the are not - cgraph_only_called_directly_p (it is possible they are called from other - unit) - - This function behaves as cgraph_only_called_directly_p because eliminating - all uses of COMDAT function does not make it necessarily disappear from - the program unless we are compiling whole program or we do LTO. In this - case we know we win since dynamic linking will not really discard the - linkonce section. 
*/ - -bool -cgraph_node::will_be_removed_from_program_if_no_direct_calls_p (void) -{ - gcc_assert (!global.inlined_to); - - if (call_for_symbol_and_aliases (used_from_object_file_p_worker, - NULL, true)) - return false; - if (!in_lto_p && !flag_whole_program) - return only_called_directly_p (); - else - { - if (DECL_EXTERNAL (decl)) - return true; - return can_remove_if_no_direct_calls_p (); - } -} - - -/* Worker for cgraph_only_called_directly_p. */ - -static bool -cgraph_not_only_called_directly_p_1 (cgraph_node *node, void *) -{ - return !node->only_called_directly_or_aliased_p (); -} - -/* Return true when function cgraph_node and all its aliases are only called - directly. - i.e. it is not externally visible, address was not taken and - it is not used in any other non-standard way. */ - -bool -cgraph_node::only_called_directly_p (void) -{ - gcc_assert (ultimate_alias_target () == this); - return !call_for_symbol_and_aliases (cgraph_not_only_called_directly_p_1, - NULL, true); -} - - /* Collect all callers of NODE. Worker for collect_callers_of_node. */ static bool diff --git a/gcc/cgraph.h b/gcc/cgraph.h index 06d2704..39cb340 100644 --- a/gcc/cgraph.h +++ b/gcc/cgraph.h @@ -261,17 +261,29 @@ public: void *data, bool include_overwrite); + /* Call callback on symtab node and aliases associated to this node. + When INCLUDE_OVERWRITABLE is false, overwritable aliases and thunks are + skipped. */ + template + bool call_for_symbol_and_aliases (Arg data, bool include_overwrite); + /* If node can not be interposable by static or dynamic linker to point to different definition, return this symbol. Otherwise look for alias with such property and if none exists, introduce new one. */ symtab_node *noninterposable_alias (void); + /* Worker searching noninterposable alias. */ + static bool noninterposable_alias (symtab_node *node, symtab_node **data); + /* Return node that alias is aliasing. 
*/ inline symtab_node *get_alias_target (void); /* Set section for symbol and its aliases. */ void set_section (const char *section); + /* Worker for set_section. */ + static bool set_section (symtab_node *n, const char *s); + /* Set section, do not recurse into aliases. When one wants to change section of symbol and its aliases, use set_section. */ @@ -523,6 +535,11 @@ protected: bool call_for_symbol_and_aliases_1 (bool (*callback) (symtab_node *, void *), void *data, bool include_overwrite); + + /* Worker for call_for_symbol_and_aliases. */ + template + bool call_for_symbol_and_aliases_1 (Arg data, bool include_overwritable); + private: /* Worker for set_section. */ static bool set_section (symtab_node *n, void *s); @@ -1042,6 +1059,13 @@ public: void *), void *data, bool include_overwritable); + /* Call callback on function and aliases associated to the function. + When INCLUDE_OVERWRITABLE is false, overwritable aliases and thunks are + skipped. */ + template + bool call_for_symbol_and_aliases (Arg data, bool include_overwritable); + + /* Call callback on cgraph_node, thunks and aliases associated to NODE. When INCLUDE_OVERWRITABLE is false, overwritable aliases and thunks are skipped. When EXCLUDE_VIRTUAL_THUNKS is true, virtual thunks are @@ -1052,6 +1076,15 @@ public: bool include_overwritable, bool exclude_virtual_thunks = false); + /* Call callback on cgraph_node, thunks and aliases associated to NODE. + When INCLUDE_OVERWRITABLE is false, overwritable aliases and thunks are + skipped. When EXCLUDE_VIRTUAL_THUNKS is true, virtual thunks are + skipped. */ + template + bool call_for_symbol_thunks_and_aliases (Arg data, + bool include_overwritable, + bool exclude_virtual_thunks = false); + /* Likewise indicate that a node is needed, i.e. reachable via some external means. */ inline void mark_force_output (void); @@ -1093,6 +1126,9 @@ public: the program unless we are compiling whole program or we do LTO. 
In this case we know we win since dynamic linking will not really discard the linkonce section. */ + bool will_be_removed_from_program_if_no_direct_calls_compute_p (void); + + /* Wrapper for will_be_removed_from_program_if_no_direct_calls_compute_p. */ bool will_be_removed_from_program_if_no_direct_calls_p (void); /* Return true when function can be removed from callgraph @@ -1101,8 +1137,15 @@ public: /* Return true when function cgraph_node and its aliases can be removed from callgraph if all direct calls are eliminated. */ + bool can_remove_if_no_direct_calls_compute_p (void); + + /* Wrapper for can_remove_if_no_direct_calls_compute_p. */ bool can_remove_if_no_direct_calls_p (void); + /* Worker for cgraph_can_remove_if_no_direct_calls_p. */ + static bool nonremovable_p (cgraph_node *node, void *); + static bool nonremovable_compute_p (cgraph_node *node, void *); + /* Return true when callgraph node is a function with Gimple body defined in current unit. Functions can also be define externally or they can be thunks with no Gimple representation. @@ -1295,11 +1338,24 @@ public: /* True if there was multiple COMDAT bodies merged by lto-symtab. */ unsigned merged : 1; + /* IPA inline cached values. */ + unsigned inline_nonremovable_init: 1; + unsigned inline_can_remove_if_no_direct_calls_init: 1; + unsigned inline_will_be_removed_if_no_direct_calls_init: 1; + + unsigned inline_nonremovable: 1; + unsigned inline_can_remove_if_no_direct_calls: 1; + unsigned inline_will_be_removed_if_no_direct_calls: 1; + private: /* Worker for call_for_symbol_and_aliases. */ bool call_for_symbol_and_aliases_1 (bool (*callback) (cgraph_node *, void *), void *data, bool include_overwritable); + + /* Worker for call_for_symbol_and_aliases. */ + template + bool call_for_symbol_and_aliases_1 (Arg data, bool include_overwritable); }; /* A cgraph node set is a collection of cgraph nodes. 
A cgraph node @@ -1683,6 +1739,12 @@ public: void *data, bool include_overwritable); + /* Call calback on varpool symbol and aliases associated to varpool symbol. + When INCLUDE_OVERWRITABLE is false, overwritable aliases and thunks are + skipped. */ + template + bool call_for_symbol_and_aliases (Arg data, bool include_overwritable); + /* Return true when variable should be considered externally visible. */ bool externally_visible_p (void); @@ -1761,6 +1823,10 @@ private: bool call_for_symbol_and_aliases_1 (bool (*callback) (varpool_node *, void *), void *data, bool include_overwritable); + + /* Worker for call_for_symbol_and_aliases. */ + template + bool call_for_symbol_and_aliases_1 (Arg data, bool include_overwritable); }; /* Every top level asm statement is put into a asm_node. */ @@ -1862,7 +1928,7 @@ public: friend class cgraph_node; friend class cgraph_edge; - symbol_table (): cgraph_max_summary_uid (1) + symbol_table (): cgraph_max_summary_uid (1), enable_inline_cache (false) { } @@ -2101,6 +2167,9 @@ public: FILE* GTY ((skip)) dump_file; + /* Inline cache flag. */ + bool enable_inline_cache; + private: /* Allocate new callgraph node. */ inline cgraph_node * allocate_cgraph_symbol (void); @@ -2987,6 +3056,21 @@ symtab_node::call_for_symbol_and_aliases (bool (*callback) (symtab_node *, return false; } +template +inline bool +symtab_node::call_for_symbol_and_aliases (Arg data, bool include_overwritable) +{ + ipa_ref *ref; + + if (callback (this, data)) + return true; + if (iterate_direct_aliases (0, ref)) + return call_for_symbol_and_aliases_1 + (data, include_overwritable); + return false; +} + + /* Call callback on function and aliases associated to the function. When INCLUDE_OVERWRITABLE is false, overwritable aliases and thunks are skipped. */ @@ -3004,6 +3088,43 @@ cgraph_node::call_for_symbol_and_aliases (bool (*callback) (cgraph_node *, return false; } +/* Call callback on function and aliases associated to the function. 
+ When INCLUDE_OVERWRITABLE is false, overwritable aliases and thunks are + skipped. */ + +template +inline bool +cgraph_node::call_for_symbol_and_aliases (Arg data, bool include_overwritable) +{ + ipa_ref *ref; + + if (callback (this, data)) + return true; + + if (iterate_direct_aliases (0, ref)) + return call_for_symbol_and_aliases_1 (data, include_overwritable); + + return false; +} + +template +inline bool +cgraph_node::call_for_symbol_and_aliases_1 (Arg data, bool include_overwritable) +{ + ipa_ref *ref; + FOR_EACH_ALIAS (this, ref) + { + cgraph_node *alias = dyn_cast (ref->referring); + if (include_overwritable + || alias->get_availability () > AVAIL_INTERPOSABLE) + if (alias->call_for_symbol_and_aliases (data, include_overwritable)) + return true; + } + + return false; +} + + /* Call calback on varpool symbol and aliases associated to varpool symbol. When INCLUDE_OVERWRITABLE is false, overwritable aliases and thunks are skipped. */ @@ -3021,6 +3142,47 @@ varpool_node::call_for_symbol_and_aliases (bool (*callback) (varpool_node *, return false; } + +/* Call calback on varpool symbol and aliases associated to varpool symbol. + When INCLUDE_OVERWRITABLE is false, overwritable aliases and thunks are + skipped. */ + +template +inline bool +varpool_node::call_for_symbol_and_aliases (Arg data, bool include_overwritable) +{ + ipa_ref *ref; + + if (callback (this, data)) + return true; + if (iterate_direct_aliases (0, ref)) + return call_for_symbol_and_aliases_1 + (data, include_overwritable); + + return false; +} + +/* Worker for call_for_symbol_and_aliases. 
*/ + +template +bool +varpool_node::call_for_symbol_and_aliases_1 (Arg data, + bool include_overwritable) +{ + ipa_ref *ref; + + FOR_EACH_ALIAS (this, ref) + { + varpool_node *alias = dyn_cast (ref->referring); + if (include_overwritable + || alias->get_availability () > AVAIL_INTERPOSABLE) + if (alias->call_for_symbol_and_aliases + (data, include_overwritable)) + return true; + } + return false; +} + /* Build polymorphic call context for indirect call E. */ inline @@ -3094,6 +3256,151 @@ cgraph_local_p (cgraph_node *node) return node->local.local && node->instrumented_version->local.local; } +inline bool +cgraph_node::nonremovable_compute_p (cgraph_node *node, void *) +{ + return !node->can_remove_if_no_direct_calls_and_refs_p (); +} + +inline bool +cgraph_node::nonremovable_p (cgraph_node *node, void *) +{ + bool retval; + + if (symtab->enable_inline_cache) + { + if (!node->inline_nonremovable_init) + { + node->inline_nonremovable = nonremovable_compute_p (node, NULL); + node->inline_nonremovable_init = true; + } + + retval = node->inline_nonremovable; + + gcc_checking_assert (retval == nonremovable_compute_p (node, NULL)); + } + else + retval = nonremovable_compute_p (node, NULL); + + return retval; +} + +inline bool +cgraph_node::can_remove_if_no_direct_calls_compute_p (void) +{ + if (DECL_EXTERNAL (decl)) + return true; + if (address_taken) + return false; + + return !call_for_symbol_and_aliases + (NULL, true); +} + +/* Return true when function cgraph_node and its aliases can be removed from + callgraph if all direct calls are eliminated. 
*/ + +inline bool +cgraph_node::can_remove_if_no_direct_calls_p (void) +{ + bool retval; + + if (symtab->enable_inline_cache) + { + if (!inline_can_remove_if_no_direct_calls_init) + { + inline_can_remove_if_no_direct_calls = can_remove_if_no_direct_calls_compute_p (); + inline_can_remove_if_no_direct_calls_init = true; + } + + retval = inline_can_remove_if_no_direct_calls; + + gcc_checking_assert + (retval == can_remove_if_no_direct_calls_compute_p ()); + } + else + retval = can_remove_if_no_direct_calls_compute_p (); + + return retval; +} + +/* Return true when function cgraph_node can be expected to be removed + from program when direct calls in this compilation unit are removed. + + As a special case COMDAT functions are + cgraph_can_remove_if_no_direct_calls_p while the are not + cgraph_only_called_directly_p (it is possible they are called from other + unit) + + This function behaves as cgraph_only_called_directly_p because eliminating + all uses of COMDAT function does not make it necessarily disappear from + the program unless we are compiling whole program or we do LTO. In this + case we know we win since dynamic linking will not really discard the + linkonce section. */ + +inline bool +cgraph_node::will_be_removed_from_program_if_no_direct_calls_compute_p (void) +{ + gcc_assert (!global.inlined_to); + + if (call_for_symbol_and_aliases + (NULL, true)) + return false; + if (!in_lto_p && !flag_whole_program) + return only_called_directly_p (); + else + { + if (DECL_EXTERNAL (decl)) + return true; + return can_remove_if_no_direct_calls_p (); + } +} + +/* Wrapper for will_be_removed_from_program_if_no_direct_calls_computed_p. 
*/ + +inline bool +cgraph_node::will_be_removed_from_program_if_no_direct_calls_p (void) +{ + if (symtab->enable_inline_cache) + { + if (!inline_will_be_removed_if_no_direct_calls_init) + { + inline_will_be_removed_if_no_direct_calls + = will_be_removed_from_program_if_no_direct_calls_compute_p (); + + inline_will_be_removed_if_no_direct_calls_init = true; + } + + gcc_checking_assert (inline_will_be_removed_if_no_direct_calls == + will_be_removed_from_program_if_no_direct_calls_compute_p ()); + return inline_will_be_removed_if_no_direct_calls; + } + + return will_be_removed_from_program_if_no_direct_calls_compute_p (); +} + +/* Worker for cgraph_only_called_directly_p. */ + +static bool +cgraph_not_only_called_directly_p_1 (cgraph_node *node, void *) +{ + return !node->only_called_directly_or_aliased_p (); +} + +/* Return true when function cgraph_node and all its aliases are only called + directly. + i.e. it is not externally visible, address was not taken and + it is not used in any other non-standard way. */ + +inline bool +cgraph_node::only_called_directly_p (void) +{ + gcc_assert (ultimate_alias_target () == this); + return !call_for_symbol_and_aliases (cgraph_not_only_called_directly_p_1, + NULL, true); +} + + /* When using fprintf (or similar), problems can arise with transient generated strings. Many string-generation APIs only support one result being alive at once (e.g. 
by diff --git a/gcc/ipa-inline.c b/gcc/ipa-inline.c index 287a6dd..8a07e04 100644 --- a/gcc/ipa-inline.c +++ b/gcc/ipa-inline.c @@ -1651,6 +1651,7 @@ inline_small_functions (void) ipa_reduced_postorder (order, true, true, NULL); free (order); + symtab->enable_inline_cache = true; FOR_EACH_DEFINED_FUNCTION (node) if (!node->global.inlined_to) { @@ -1966,6 +1967,7 @@ inline_small_functions (void) } } + symtab->enable_inline_cache = false; free_growth_caches (); if (dump_file) fprintf (dump_file, diff --git a/gcc/lto-streamer.c b/gcc/lto-streamer.c index 836dce9..542a813 100644 --- a/gcc/lto-streamer.c +++ b/gcc/lto-streamer.c @@ -319,11 +319,13 @@ static hash_table *tree_htab; void lto_streamer_init (void) { +#ifdef ENABLE_CHECKING /* Check that all the TS_* handled by the reader and writer routines match exactly the structures defined in treestruct.def. When a new TS_* astructure is added, the streamer should be updated to handle it. */ streamer_check_handled_ts_structures (); +#endif #ifdef LTO_STREAMER_DEBUG tree_htab = new hash_table (31); diff --git a/gcc/symtab.c b/gcc/symtab.c index ee47a73..df0950b 100644 --- a/gcc/symtab.c +++ b/gcc/symtab.c @@ -1337,9 +1337,9 @@ symtab_node::set_section_for_node (const char *section) /* Worker for set_section. */ bool -symtab_node::set_section (symtab_node *n, void *s) +symtab_node::set_section (symtab_node *n, const char *s) { - n->set_section_for_node ((char *)s); + n->set_section_for_node (s); return false; } @@ -1349,8 +1349,7 @@ void symtab_node::set_section (const char *section) { gcc_assert (!this->alias); - call_for_symbol_and_aliases - (symtab_node::set_section, const_cast(section), true); + call_for_symbol_and_aliases (section, true); } /* Return the initialization priority. 
*/ @@ -1491,10 +1490,11 @@ symtab_node::resolve_alias (symtab_node *target) { error ("section of alias %q+D must match section of its target", decl); } - call_for_symbol_and_aliases (symtab_node::set_section, - const_cast(target->get_section ()), true); + call_for_symbol_and_aliases + (const_cast(target->get_section ()), true); if (target->implicit_section) - call_for_symbol_and_aliases (set_implicit_section, NULL, true); + call_for_symbol_and_aliases + (NULL, true); /* Alias targets become redundant after alias is resolved into an reference. We do not want to keep it around or we would have to mind updating them @@ -1513,7 +1513,7 @@ symtab_node::resolve_alias (symtab_node *target) /* Worker searching noninterposable alias. */ bool -symtab_node::noninterposable_alias (symtab_node *node, void *data) +symtab_node::noninterposable_alias (symtab_node *node, symtab_node **data) { if (decl_binds_to_current_def_p (node->decl)) { @@ -1530,7 +1530,7 @@ symtab_node::noninterposable_alias (symtab_node *node, void *data) || DECL_ATTRIBUTES (node->decl) != DECL_ATTRIBUTES (fn->decl)) return false; - *(symtab_node **)data = node; + *data = node; return true; } return false; @@ -1550,8 +1550,8 @@ symtab_node::noninterposable_alias (void) (if that is already non-overwritable). */ symtab_node *node = ultimate_alias_target (); gcc_assert (!node->alias && !node->weakref); - node->call_for_symbol_and_aliases (symtab_node::noninterposable_alias, - (void *)&new_node, true); + node->call_for_symbol_and_aliases + (&new_node, true); if (new_node) return new_node; #ifndef ASM_OUTPUT_DEF @@ -1840,10 +1840,8 @@ symtab_node::equal_address_to (symtab_node *s2) /* Worker for call_for_symbol_and_aliases. 
*/ bool -symtab_node::call_for_symbol_and_aliases_1 (bool (*callback) (symtab_node *, - void *), - void *data, - bool include_overwritable) +symtab_node::call_for_symbol_and_aliases_1 (bool (*callback) (symtab_node *,void *), + void *data, bool include_overwritable) { ipa_ref *ref; FOR_EACH_ALIAS (this, ref) @@ -1857,3 +1855,23 @@ symtab_node::call_for_symbol_and_aliases_1 (bool (*callback) (symtab_node *, } return false; } + +/* Worker for call_for_symbol_and_aliases. */ + +template +bool +symtab_node::call_for_symbol_and_aliases_1 (Arg data, + bool include_overwritable) +{ + ipa_ref *ref; + FOR_EACH_ALIAS (this, ref) + { + symtab_node *alias = ref->referring; + if (include_overwritable + || alias->get_availability () > AVAIL_INTERPOSABLE) + if (alias->call_for_symbol_and_aliases (data, + include_overwritable)) + return true; + } + return false; +} -- 2.1.2
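
For readers who want the caching idea in isolation, the wrapper pattern used above for can_remove_if_no_direct_calls_p and friends can be sketched roughly as follows. This is a minimal standalone sketch, not GCC code: `Node`, `enable_cache`, `compute_calls`, `can_remove_compute_p` and `can_remove_p` are hypothetical names standing in for cgraph_node, symtab->enable_inline_cache and the *_compute_p / *_p pairs, and the predicate body is simplified.

```cpp
#include <cassert>

// Hypothetical stand-in for cgraph_node; not GCC's real class.
struct Node
{
  bool external = false;       // plays the role of DECL_EXTERNAL (decl)
  bool address_taken = false;
  int compute_calls = 0;       // instrumentation for this example only

  // Cached predicate: one bit records "already computed", one the value.
  unsigned can_remove_init : 1;
  unsigned can_remove : 1;

  Node () : can_remove_init (0), can_remove (0) {}
};

// Hypothetical stand-in for symtab->enable_inline_cache.
static bool enable_cache = false;

// Uncached predicate (the role of the *_compute_p functions).
inline bool
can_remove_compute_p (Node &n)
{
  ++n.compute_calls;
  if (n.external)
    return true;
  return !n.address_taken;
}

// Wrapper: compute once while the cache is enabled, otherwise
// always fall back to the uncached predicate.
inline bool
can_remove_p (Node &n)
{
  if (!enable_cache)
    return can_remove_compute_p (n);

  if (!n.can_remove_init)
    {
      n.can_remove = can_remove_compute_p (n);
      n.can_remove_init = 1;
    }
  return n.can_remove;
}
```

With `enable_cache` set, repeated calls to `can_remove_p` on the same node run the underlying predicate only once; with it cleared, every call recomputes. The patch additionally re-checks the cached value with gcc_checking_assert in checking builds, which is left out here because it would defeat the call counter used for illustration.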