Patchwork Relax limits of early inliner for the forwarder functions

Submitter Jan Hubicka
Date Nov. 5, 2012, 11:23 a.m.
Message ID <20121105112336.GC11052@kam.mff.cuni.cz>
Permalink /patch/197192/
State New

Comments

Jan Hubicka - Nov. 5, 2012, 11:23 a.m.
Hi,
in the 4.6 timeframe I limited early inliner growth to apply only to leaf functions.
This does not work really well, because with less propagation of address expressions
we are not 100% successful at detecting C++ forwarders and predicting them to be
zero cost.  This patch simply divides the growth limit by the number of calls in the
callee (plus one), similarly to LLVM.
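
For illustration, the growth test this patch introduces can be sketched as a
standalone predicate.  This is a minimal sketch, not the GCC code:
EARLY_INLINING_INSNS is a hypothetical stand-in for
PARAM_VALUE (PARAM_EARLY_INLINING_INSNS) with an arbitrary value of 12,
growth_allows_early_inline is an invented name, and the hot-edge check done by
want_early_inline_function_p is left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for PARAM_VALUE (PARAM_EARLY_INLINING_INSNS).
   The value 12 is an assumption chosen only to keep the arithmetic tidy.  */
enum { EARLY_INLINING_INSNS = 12 };

/* Return true when the growth checks alone would still permit early
   inlining.  GROWTH is the estimated size change of the caller and
   NUM_CALLS the number of non-builtin calls inside the callee.  */
static bool
growth_allows_early_inline (int growth, int num_calls)
{
  assert (num_calls >= 0);

  /* Inlining that does not grow the caller is always acceptable.  */
  if (growth <= 0)
    return true;

  /* Hard cap that applies to every callee, leaf or not.  */
  if (growth > EARLY_INLINING_INSNS)
    return false;

  /* Non-leaf callees get a budget scaled down by the calls they contain.  */
  if (num_calls != 0 && growth * (num_calls + 1) > EARLY_INLINING_INSNS)
    return false;

  return true;
}
```

With the assumed budget of 12, a leaf callee may still grow the caller by 12,
a callee containing one call by 6, and a callee containing two calls by 4.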

Bootstrapped/regtested x86_64-linux, benchmarked and committed.
The patch seems to be a consistent win in all benchmarks, most noticeably in tramp3d.

	* ipa-inline.c (leaf_node_p): Rename to ...
	(num_calls) ... this one.
	(want_early_inline_function_p): Allow small growth on non-leafs.
Richard Guenther - Jan. 8, 2013, 2:07 p.m.
On Mon, Nov 5, 2012 at 12:23 PM, Jan Hubicka <hubicka@ucw.cz> wrote:
> Hi,
> in the 4.6 timeframe I limited early inliner growth to apply only to leaf functions.
> This does not work really well, because with less propagation of address expressions
> we are not 100% successful at detecting C++ forwarders and predicting them to be
> zero cost.  This patch simply divides the growth limit by the number of calls in the
> callee (plus one), similarly to LLVM.
>
> Bootstrapped/regtested x86_64-linux, benchmarked and committed.
> The patch seems to be a consistent win in all benchmarks, most noticeably in tramp3d.
>
>         * ipa-inline.c (leaf_node_p): Rename to ...
>         (num_calls) ... this one.
>         (want_early_inline_function_p): Allow small growth on non-leafs.
> Index: ipa-inline.c
> ===================================================================
> --- ipa-inline.c        (revision 193134)
> +++ ipa-inline.c        (working copy)
> @@ -380,17 +380,18 @@ can_early_inline_edge_p (struct cgraph_e
>  }
>
>
> -/* Return true when N is leaf function.  Accept cheap builtins
> -   in leaf functions.  */
> +/* Return number of calls in N.  Ignore cheap builtins.  */
>
> -static bool
> -leaf_node_p (struct cgraph_node *n)
> +static int
> +num_calls (struct cgraph_node *n)
>  {
>    struct cgraph_edge *e;
> +  int num = 0;
> +
>    for (e = n->callees; e; e = e->next_callee)
>      if (!is_inexpensive_builtin (e->callee->symbol.decl))
> -      return false;
> -  return true;
> +      num++;
> +  return num;
>  }

This counts all calls in 'n'

>
> @@ -414,6 +415,8 @@ want_early_inline_function_p (struct cgr
>    else
>      {
>        int growth = estimate_edge_growth (e);
> +      int n;
> +
>        if (growth <= 0)
>         ;
>        else if (!cgraph_maybe_hot_edge_p (e)
> @@ -427,22 +430,23 @@ want_early_inline_function_p (struct cgr
>                      growth);
>           want_inline = false;
>         }
> -      else if (!leaf_node_p (callee)
> -              && growth > 0)
> +      else if (growth > PARAM_VALUE (PARAM_EARLY_INLINING_INSNS))
>         {
>           if (dump_file)
>             fprintf (dump_file, "  will not early inline: %s/%i->%s/%i, "
> -                    "callee is not leaf and code would grow by %i\n",
> +                    "growth %i exceeds --param early-inlining-insns\n",
>                      xstrdup (cgraph_node_name (e->caller)), e->caller->uid,
>                      xstrdup (cgraph_node_name (callee)), callee->uid,
>                      growth);
>           want_inline = false;
>         }
> -      else if (growth > PARAM_VALUE (PARAM_EARLY_INLINING_INSNS))
> +      else if ((n = num_calls (callee)) != 0
> +              && growth * (n + 1) > PARAM_VALUE (PARAM_EARLY_INLINING_INSNS))

So this counts all calls in the function we want to inline (!?).  That's
completely backward to me.  In fact for forwarder functions you still only
allow half of the early-inlining-insns growth.  Previously for non-leafs we
didn't allow any growth (hm, why?).
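
The "half" follows from the arithmetic of the new test: for positive integers,
growth * (n + 1) > limit rejects exactly the growths above limit / (n + 1) in
integer division.  A small sketch of that budget, where max_allowed_growth is
a hypothetical helper that is not part of the patch:

```c
#include <assert.h>

/* Hypothetical helper, not in the patch: the largest growth that still
   satisfies growth * (n + 1) <= limit for a callee containing N calls.
   For a leaf callee (N == 0) this is simply the full limit.  */
static int
max_allowed_growth (int limit, int n)
{
  assert (limit >= 0 && n >= 0);
  return limit / (n + 1);
}
```

A forwarder contains exactly one call, so n is 1 and its budget is limit / 2.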

Now with relaxing that and allowing functions with calls to be inlined more
frequently we run into PR55797 which shows that we cannot limit recursive
inlining anymore if it is indirect one level.  By means of early inlining
iteration we blow up completely (8 iterations at most?!).  Also because we do
not compute overall function growth (because we rely on early inlining only
shrinking code size ...).

I believe we at least need to track recursive inlining during early inliner
iteration by means of some ->aux marking or so.
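
One possible reading of that suggestion, sketched on a toy call graph rather
than on GCC's cgraph: remember which callees were already inlined into the
function in earlier iterations and refuse to inline them again.  Everything
here is an assumption made for illustration (struct toy_graph,
iterate_early_inlining, the marked[] array standing in for an ->aux mark, and
the cap of 8 taken from the tentative figure above); it is not how the early
inliner is implemented.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

enum { MAX_FUNCS = 8, MAX_CALLS = 64, MAX_ITERATIONS = 8 };

/* Toy call graph: callees[f] lists the functions that f calls directly.  */
struct toy_graph
{
  int num_callees[MAX_FUNCS];
  int callees[MAX_FUNCS][MAX_FUNCS];
};

/* Iterate inlining into ROOT: each round replaces every call in ROOT's body
   by the calls of the inlined callee.  When TRACK_MARKS is set, a callee
   inlined in an earlier round is never inlined again.  Return the number of
   call sites that were inlined.  */
static int
iterate_early_inlining (const struct toy_graph *g, int root, bool track_marks)
{
  bool marked[MAX_FUNCS] = { false };
  int body[MAX_CALLS], next[MAX_CALLS];
  int body_len = 0, inlined = 0;

  assert (root >= 0 && root < MAX_FUNCS);
  for (int i = 0; i < g->num_callees[root]; i++)
    body[body_len++] = g->callees[root][i];

  for (int iter = 0; iter < MAX_ITERATIONS; iter++)
    {
      bool newly_marked[MAX_FUNCS] = { false };
      int next_len = 0, inlined_now = 0;

      for (int i = 0; i < body_len; i++)
        {
          int callee = body[i];
          int n = g->num_callees[callee];
          int remaining = body_len - i - 1;

          /* Keep the call when it is directly recursive, already marked,
             or when inlining it would overflow the toy buffer.  */
          if (callee == root
              || (track_marks && marked[callee])
              || next_len + n + remaining > MAX_CALLS)
            {
              next[next_len++] = callee;
              continue;
            }
          for (int j = 0; j < n; j++)
            next[next_len++] = g->callees[callee][j];
          newly_marked[callee] = true;
          inlined_now++;
        }

      if (inlined_now == 0)
        break;
      inlined += inlined_now;
      /* Marks take effect only from the next round on, so several calls to
         the same callee within one round are all still inlined.  */
      for (int f = 0; f < MAX_FUNCS; f++)
        marked[f] = marked[f] || newly_marked[f];
      memcpy (body, next, sizeof (int) * (size_t) next_len);
      body_len = next_len;
    }
  return inlined;
}
```

For a root that calls g, where g and h call each other, the marked variant
stops after inlining g and h once each, while the unmarked variant keeps
alternating between them until the iteration cap is reached.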

Honza - please have a look at the ICE in PR55797 and the issues with
this patch enabling more inlining.

Thanks,
Richard.

>         {
>           if (dump_file)
>             fprintf (dump_file, "  will not early inline: %s/%i->%s/%i, "
> -                    "growth %i exceeds --param early-inlining-insns\n",
> +                    "growth %i exceeds --param early-inlining-insns "
> +                    "divided by number of calls\n",
>                      xstrdup (cgraph_node_name (e->caller)), e->caller->uid,
>                      xstrdup (cgraph_node_name (callee)), callee->uid,
>                      growth);
Jan Hubicka - Jan. 8, 2013, 3:08 p.m.
> 
> So this counts all calls in the function we want to inline (!?).  That's
> completely backward to me.  In fact for forwarder functions you still only
> allow half of the early-inlining-insns growth.  Previously for non-leafs we
> didn't allow any growth (hm, why?).

Well, the idea is that inlining leaf functions is almost always a good idea
(i.e. you can assume that the function's body will optimize well with the surrounding
code, and eliminating a call is a good thing).
Inlining functions that have calls in them is less cool.  I introduced the non-leaf/leaf
logic around the 4.6 timeframe, after late inlining became more informed about anticipated
optimizations, but it really caused quite some trouble with C++ abstraction,
so relaxing this logic somewhat seemed like a reasonable idea.
> 
> Now with relaxing that and allowing functions with calls to be inlined more
> frequently we run into PR55797 which shows that we cannot limit recursive
> inlining anymore if it is indirect one level.  By means of early inlining
> iteration we blow up completely (8 iterations at most?!).  Also because we do
> not compute overall function growth (because we rely on early inlining only
> shrinking code size ...).

Well, we compute function growth, but for each iteration separately.
> 
> I believe we at least need to track recursive inlining during early inliner
> iteration by means of some ->aux marking or so.

Hmm, I guess we want to disable recursive inlining in the early inliner completely.
I will take a look.

Honza

Patch

Index: ipa-inline.c
===================================================================
--- ipa-inline.c	(revision 193134)
+++ ipa-inline.c	(working copy)
@@ -380,17 +380,18 @@  can_early_inline_edge_p (struct cgraph_e
 }
 
 
-/* Return true when N is leaf function.  Accept cheap builtins
-   in leaf functions.  */
+/* Return number of calls in N.  Ignore cheap builtins.  */
 
-static bool
-leaf_node_p (struct cgraph_node *n)
+static int
+num_calls (struct cgraph_node *n)
 {
   struct cgraph_edge *e;
+  int num = 0;
+
   for (e = n->callees; e; e = e->next_callee)
     if (!is_inexpensive_builtin (e->callee->symbol.decl))
-      return false;
-  return true;
+      num++;
+  return num;
 }
 
 
@@ -414,6 +415,8 @@  want_early_inline_function_p (struct cgr
   else
     {
       int growth = estimate_edge_growth (e);
+      int n;
+
       if (growth <= 0)
 	;
       else if (!cgraph_maybe_hot_edge_p (e)
@@ -427,22 +430,23 @@  want_early_inline_function_p (struct cgr
 		     growth);
 	  want_inline = false;
 	}
-      else if (!leaf_node_p (callee)
-	       && growth > 0)
+      else if (growth > PARAM_VALUE (PARAM_EARLY_INLINING_INSNS))
 	{
 	  if (dump_file)
 	    fprintf (dump_file, "  will not early inline: %s/%i->%s/%i, "
-		     "callee is not leaf and code would grow by %i\n",
+		     "growth %i exceeds --param early-inlining-insns\n",
 		     xstrdup (cgraph_node_name (e->caller)), e->caller->uid,
 		     xstrdup (cgraph_node_name (callee)), callee->uid,
 		     growth);
 	  want_inline = false;
 	}
-      else if (growth > PARAM_VALUE (PARAM_EARLY_INLINING_INSNS))
+      else if ((n = num_calls (callee)) != 0
+	       && growth * (n + 1) > PARAM_VALUE (PARAM_EARLY_INLINING_INSNS))
 	{
 	  if (dump_file)
 	    fprintf (dump_file, "  will not early inline: %s/%i->%s/%i, "
-		     "growth %i exceeds --param early-inlining-insns\n",
+		     "growth %i exceeds --param early-inlining-insns "
+		     "divided by number of calls\n",
 		     xstrdup (cgraph_node_name (e->caller)), e->caller->uid,
 		     xstrdup (cgraph_node_name (callee)), callee->uid,
 		     growth);