[RFC] LTO: IPA inline speed up for large apps (Chrome)

Message ID 54E46955.1040903@suse.cz
State New

Commit Message

Martin Liška Feb. 18, 2015, 10:28 a.m. UTC
On 02/17/2015 07:38 PM, Jan Hubicka wrote:
> Hi,
> thanks for working on it.  There are basically 3 independent changes in the patch
>   - The patch to make checking in lto_streamer_init ENABLE_CHECKING only, which I
>     think can be committed as obvious.

Hello.

The following email contains the fix for that, which I'm going to install.

>   - Templates for call_for_symbol_and_aliases
>     I do not think these should be strictly necessary for performance, because once we
>     spend too much time in these we are a bit screwed.
>     I however see it also makes things a bit nicer by not needing typecasts on the data pointer.
>     Perhaps that could be further cleaned?
>
>     An alternative would be to implement a FOR_EACH_ALIAS macro with a tree walking iterator.
>     You have all the structure to not require a stack.  The iterator will contain a
>     root node, current node and index to ref.
>     This may be even easier to use and will probably wind up generating about the same code,
>     given that the for-each template needs to produce a self-recursive function anyway.
>
>     I would not care about for_symbol_thunk_and_aliases.  That function is heavy because it walks
>     all callers anyway and should not be used in hot code.
>     I have a patch that removes its use from the inliner - it is more or less a leftover from the time
>     we represented thunks as special aliases instead of functions w/o gimple body.

Yes, I was also thinking about a flat iterator capable of iterating over thunks/aliases, and
I prefer that approach to recursive functions. I think we can prepare it for the next release;
as you said, it does not bring much of a performance gain.
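
A minimal sketch of what such a stack-free iterator could look like. Everything
here is hypothetical: Node, add_alias, AliasIterator and collect_names are
stand-ins for illustration and are not GCC's symtab_node or ipa_ref API. The
sketch also assumes each alias records its position in its target's alias list,
which plays the role of the "index to ref" mentioned above.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a symbol table node; not GCC's symtab_node.
struct Node {
  std::string name;
  Node *target = nullptr;           // node this one aliases (nullptr for a root)
  std::size_t index_in_target = 0;  // position inside target->aliases
  std::vector<Node *> aliases;      // nodes that alias this one
};

// Attach ALIAS to TARGET and record the back-link used by the iterator.
inline void add_alias(Node &target, Node &alias) {
  assert(alias.target == nullptr && "an alias has exactly one target");
  alias.target = &target;
  alias.index_in_target = target.aliases.size();
  target.aliases.push_back(&alias);
}

// Pre-order walk over ROOT and its transitive aliases.  The only state is
// the root and the current node; no explicit stack and no recursion.
class AliasIterator {
public:
  explicit AliasIterator(Node *root) : root_(root), current_(root) {}

  // Current node, or nullptr once the walk is finished.
  Node *get() const { return current_; }

  void next() {
    if (!current_)
      return;
    // Descend to the first alias if there is one.
    if (!current_->aliases.empty()) {
      current_ = current_->aliases[0];
      return;
    }
    // Otherwise climb until some ancestor has an unvisited sibling.
    while (current_ != root_) {
      Node *parent = current_->target;
      std::size_t index = current_->index_in_target + 1;
      if (index < parent->aliases.size()) {
        current_ = parent->aliases[index];
        return;
      }
      current_ = parent;
    }
    current_ = nullptr;  // climbed back to the root: walk is complete
  }

private:
  Node *root_;
  Node *current_;
};

// Usage example: gather the names of a node and all of its aliases.
std::vector<std::string> collect_names(Node *root) {
  std::vector<std::string> out;
  for (AliasIterator it(root); it.get(); it.next())
    out.push_back(it.get()->name);
  return out;
}
```

A FOR_EACH_ALIAS-style macro could wrap the for loop in collect_names; whether
that ends up faster than the recursive template would need measuring.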

>   - the caching itself.
>
> I will look into the caching in detail.  I am not quite sure I like the idea of exposing an
> inline-only cache in cgraph.h.  You could just keep the predicates as they are, but have inline_
> variants in ipa-inline.h that do the caching for you.
>
> Allocating the bits directly in cgraph_node is probably OK, we don't really have shortage there
> and can be revisited easily later...
>
> Honza
>

Please take a look at the caching; it would be a crucial part of the speed improvement.
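
To make the suggested split concrete, here is a rough sketch of an uncached
predicate plus an inline_ wrapper that caches its result in two bits stored on
the node. All names (FuncNode, expensive_predicate, inline_expensive_predicate,
invalidate_inline_cache) are hypothetical and only illustrate the pattern; they
are not the functions touched by the patch under discussion.

```cpp
// Hypothetical stand-in for a call graph node; not GCC's cgraph_node.
struct FuncNode {
  bool address_taken;
  int predicate_evaluations;        // demo counter: how often the slow path ran
  unsigned predicate_cache_valid : 1;
  unsigned predicate_cached : 1;

  FuncNode()
      : address_taken(false), predicate_evaluations(0),
        predicate_cache_valid(0), predicate_cached(0) {}
};

// The generic predicate stays as it is and knows nothing about the cache.
// Assume it is expensive in practice (e.g. it walks aliases or callers).
bool expensive_predicate(FuncNode &node) {
  ++node.predicate_evaluations;
  return !node.address_taken;
}

// Inliner-only variant: computes the predicate once and reuses the result.
bool inline_expensive_predicate(FuncNode &node) {
  if (!node.predicate_cache_valid) {
    node.predicate_cached = expensive_predicate(node);
    node.predicate_cache_valid = 1;
  }
  return node.predicate_cached;
}

// Must be called whenever something the predicate depends on changes.
void invalidate_inline_cache(FuncNode &node) {
  node.predicate_cache_valid = 0;
}
```

The cost of this design is that every change affecting the predicate has to
invalidate the cache, which is the part that needs careful review.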

Martin

Patch

From eb9d34244c43ae1d0576b2ae1002f5267c6cd547 Mon Sep 17 00:00:00 2001
From: mliska <mliska@suse.cz>
Date: Wed, 18 Feb 2015 11:18:47 +0100
Subject: [PATCH] Add checking macro within lto_streamer_init.

gcc/ChangeLog:

2015-02-18  Martin Liska  <mliska@suse.cz>

	* lto-streamer.c (lto_streamer_init): Encapsulate
	streamer_check_handled_ts_structures with checking macro.
---
 gcc/lto-streamer.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/gcc/lto-streamer.c b/gcc/lto-streamer.c
index 836dce9..542a813 100644
--- a/gcc/lto-streamer.c
+++ b/gcc/lto-streamer.c
@@ -319,11 +319,13 @@  static hash_table<tree_hash_entry> *tree_htab;
 void
 lto_streamer_init (void)
 {
+#ifdef ENABLE_CHECKING
   /* Check that all the TS_* handled by the reader and writer routines
      match exactly the structures defined in treestruct.def.  When a
      new TS_* astructure is added, the streamer should be updated to
      handle it.  */
   streamer_check_handled_ts_structures ();
+#endif
 
 #ifdef LTO_STREAMER_DEBUG
   tree_htab = new hash_table<tree_hash_entry> (31);
-- 
2.1.2