Patchwork Account for devirtualization opportunities in inliner

login
register
mail settings
Submitter Jan Hubicka
Date Oct. 20, 2011, 9:11 a.m.
Message ID <20111020091109.GD2467@kam.mff.cuni.cz>
Download mbox | patch
Permalink /patch/120774/
State New
Headers show

Comments

Jan Hubicka - Oct. 20, 2011, 9:11 a.m.
Hi,
sorry for delayed review.  I am still trying to get ipa-inline-analysis to behave well on real codebases
and make my mind around how to get more advanced hints, like this one, into it.


I am not quite sure how to estimate the actual benefits.  estimate_num_insns
doesn't really make a difference in between direct and indirect calls.

I see it is good idea to inline more then the destination is known & inlinable.
This is an example when we have additional knowledge that we want to mix into
badness metric that does not directly translate to time/size.  There are multiple
cases like this.  I was thinking of adding kind of bonus metric for this purpose,
but I would suggest doing this incrementally.

What about
 1) extending estimate_num_insns wieghts to account direct calls differently
    from indirect calls (i.e. adding indirect_call cost value into eni wights)
    I would set it 2 for size metrics and 15 for time metrics for start
 2) make estimate_edge_time_and_size to subtract difference of those two metrics
    from edge costs when destination is direct.

Incrementally we can think of how to extra prioritize direct calls with inlinable
targets.
+/* Increase SIZE and TIME for size and time needed to handle all calls in NODE.
+   POSSIBLE_TRUTHS, KNOWN_VALS and KNOWN_BINFOS describe context of the call
+   site.  */
 
 static void
 estimate_calls_size_and_time (struct cgraph_node *node, int *size, int *time,
-			      clause_t possible_truths)
+			      clause_t possible_truths,
+			      VEC (tree, heap) *known_vals,
+			      VEC (tree, heap) *known_binfos)
 {
   struct cgraph_edge *e;
   for (e = node->callees; e; e = e->next_callee)
@@ -2125,25 +2207,35 @@ estimate_calls_size_and_time (struct cgraph_node *node, int *size, int *time,
 	    }
 	  else
 	    estimate_calls_size_and_time (e->callee, size, time,
-					  possible_truths);
+					  possible_truths,
+					  /* TODO: remap KNOWN_VALS and
+					     KNOWN_BINFOS to E->CALLEE
+					     parameters, and use them.  */
+					  NULL, NULL);

Remapping should not be needed here - the jump functions are merged after marking edge inline, so jump
functions in inlined functions actually reffer to the parameters of the function they are inlined to.

Honza
Maxim Kuvyrkov - Oct. 28, 2011, 2:43 a.m.
On 20/10/2011, at 10:11 PM, Jan Hubicka wrote:
> static clause_t
> -evaluate_conditions_for_edge (struct cgraph_edge *e, bool inline_p)
> +evaluate_conditions_vals_binfos_for_edge (struct cgraph_edge *e,
> +					  bool inline_p,
> +					  VEC (tree, heap) **known_vals_ptr,
> +					  VEC (tree, heap) **known_binfos_ptr)
> 
> Hmm, I would make clause also returned by reference to be sonsistent and perhaps
> call it something like edge_properties
> since it is not really only about evaulating the clause anymore.

Agree.

> 
> -/* Increase SIZE and TIME for size and time needed to handle all calls in NODE.  */
> +/* Estimate benefit devirtualizing indirect edge IE, provided KNOWN_VALS and
> +   KNOWN_BINFOS.  */
> +
> +static void
> +estimate_edge_devirt_benefit (struct cgraph_edge *ie,
> +			      int *size, int *time, int prob,
> +			      VEC (tree, heap) *known_vals,
> +			      VEC (tree, heap) *known_binfos)
> 
> I think this whole logic should go into estimate_edge_time_and_size.  This way
> we will save all the duplication of scaling logic
> Just add the known_vals/binfos arguments.

Then devirtualization benefit will not be available through estimate_node_size_and_time, which is the primary interface for users of ipa-inline-analysis other than the inliner itself.  I.e., estimate_ipcp_clone_size_and_time, which is the only other user of the analysis at the moment, will not see devirtualization benefit.

> 
> I am not quite sure how to estimate the actual benefits.  estimate_num_insns
> doesn't really make a difference in between direct and indirect calls.
> 
> I see it is good idea to inline more then the destination is known & inlinable.
> This is an example when we have additional knowledge that we want to mix into
> badness metric that does not directly translate to time/size.  There are multiple
> cases like this.  I was thinking of adding kind of bonus metric for this purpose,
> but I would suggest doing this incrementally.

I too thought about this, and decided to keep the bonus metric part to bare minimum in this patch.

> 
> What about
> 1) extending estimate_num_insns wieghts to account direct calls differently
>    from indirect calls (i.e. adding indirect_call cost value into eni wights)
>    I would set it 2 for size metrics and 15 for time metrics for start
> 2) make estimate_edge_time_and_size to subtract difference of those two metrics
>    from edge costs when destination is direct.

OK, I'll try this.

>  @@ -2125,25 +2207,35 @@ estimate_calls_size_and_time (struct cgraph_node *node, int *size, int *time,
> 	    }
> 	  else
> 	    estimate_calls_size_and_time (e->callee, size, time,
> -					  possible_truths);
> +					  possible_truths,
> +					  /* TODO: remap KNOWN_VALS and
> +					     KNOWN_BINFOS to E->CALLEE
> +					     parameters, and use them.  */
> +					  NULL, NULL);
> 
> Remapping should not be needed here - the jump functions are merged after marking edge inline, so jump
> functions in inlined functions actually reffer to the parameters of the function they are inlined to.

I remember it crashing on some testcase and thought the lack of remapping was the cause.  I'll look into this.

Thank you,

--
Maxim Kuvyrkov
CodeSourcery / Mentor Graphics
Maxim Kuvyrkov - Oct. 28, 2011, 8:01 a.m.
Jan,

Attached is the updated patch.  The only major change is the addition of indirect_call_cost to size and time weights.  I've set the size cost of indirect call to 3, which is what I remember calculating when I looked into costs couple of months ago: one call instruction for the call itself, one memory instruction to pull the call address out of vtable, and one ALU instruction to calculate the address inside vtable.  On architectures with base+offset addressing the above can be shrunk into 2 instructions.

The remapping of the known_vals and known_binfos did indeed turned out to work just fine.  Probably, that was a bug that was fixed since.

The patch bootstraps and passes regtest.

Comments?  OK for trunk?

--
Maxim Kuvyrkov
CodeSourcery / Mentor Graphics



On 28/10/2011, at 3:43 PM, Maxim Kuvyrkov wrote:

> On 20/10/2011, at 10:11 PM, Jan Hubicka wrote:
>> static clause_t
>> -evaluate_conditions_for_edge (struct cgraph_edge *e, bool inline_p)
>> +evaluate_conditions_vals_binfos_for_edge (struct cgraph_edge *e,
>> +					  bool inline_p,
>> +					  VEC (tree, heap) **known_vals_ptr,
>> +					  VEC (tree, heap) **known_binfos_ptr)
>> 
>> Hmm, I would make clause also returned by reference to be sonsistent and perhaps
>> call it something like edge_properties
>> since it is not really only about evaulating the clause anymore.
> 
> Agree.
> 
>> 
>> -/* Increase SIZE and TIME for size and time needed to handle all calls in NODE.  */
>> +/* Estimate benefit devirtualizing indirect edge IE, provided KNOWN_VALS and
>> +   KNOWN_BINFOS.  */
>> +
>> +static void
>> +estimate_edge_devirt_benefit (struct cgraph_edge *ie,
>> +			      int *size, int *time, int prob,
>> +			      VEC (tree, heap) *known_vals,
>> +			      VEC (tree, heap) *known_binfos)
>> 
>> I think this whole logic should go into estimate_edge_time_and_size.  This way
>> we will save all the duplication of scaling logic
>> Just add the known_vals/binfos arguments.
> 
> Then devirtualization benefit will not be available through estimate_node_size_and_time, which is the primary interface for users of ipa-inline-analysis other than the inliner itself.  I.e., estimate_ipcp_clone_size_and_time, which is the only other user of the analysis at the moment, will not see devirtualization benefit.
> 
>> 
>> I am not quite sure how to estimate the actual benefits.  estimate_num_insns
>> doesn't really make a difference in between direct and indirect calls.
>> 
>> I see it is good idea to inline more then the destination is known & inlinable.
>> This is an example when we have additional knowledge that we want to mix into
>> badness metric that does not directly translate to time/size.  There are multiple
>> cases like this.  I was thinking of adding kind of bonus metric for this purpose,
>> but I would suggest doing this incrementally.
> 
> I too thought about this, and decided to keep the bonus metric part to bare minimum in this patch.
> 
>> 
>> What about
>> 1) extending estimate_num_insns wieghts to account direct calls differently
>>   from indirect calls (i.e. adding indirect_call cost value into eni wights)
>>   I would set it 2 for size metrics and 15 for time metrics for start
>> 2) make estimate_edge_time_and_size to subtract difference of those two metrics
>>   from edge costs when destination is direct.
> 
> OK, I'll try this.
> 
>> @@ -2125,25 +2207,35 @@ estimate_calls_size_and_time (struct cgraph_node *node, int *size, int *time,
>> 	    }
>> 	  else
>> 	    estimate_calls_size_and_time (e->callee, size, time,
>> -					  possible_truths);
>> +					  possible_truths,
>> +					  /* TODO: remap KNOWN_VALS and
>> +					     KNOWN_BINFOS to E->CALLEE
>> +					     parameters, and use them.  */
>> +					  NULL, NULL);
>> 
>> Remapping should not be needed here - the jump functions are merged after marking edge inline, so jump
>> functions in inlined functions actually reffer to the parameters of the function they are inlined to.
> 
> I remember it crashing on some testcase and thought the lack of remapping was the cause.  I'll look into this.
> 
> Thank you,
> 
> --
> Maxim Kuvyrkov
> CodeSourcery / Mentor Graphics
>
Maxim Kuvyrkov - Nov. 9, 2011, 4:20 a.m.
Ping.

--
Maxim Kuvyrkov
www.trivialbugs.com



On 28/10/2011, at 9:01 PM, Maxim Kuvyrkov wrote:

> Jan,
> 
> Attached is the updated patch.  The only major change is the addition of indirect_call_cost to size and time weights.  I've set the size cost of indirect call to 3, which is what I remember calculating when I looked into costs couple of months ago: one call instruction for the call itself, one memory instruction to pull the call address out of vtable, and one ALU instruction to calculate the address inside vtable.  On architectures with base+offset addressing the above can be shrunk into 2 instructions.
> 
> The remapping of the known_vals and known_binfos did indeed turned out to work just fine.  Probably, that was a bug that was fixed since.
> 
> The patch bootstraps and passes regtest.
> 
> Comments?  OK for trunk?
> 
> --
> Maxim Kuvyrkov
> CodeSourcery / Mentor Graphics
> 
> 
> 
> On 28/10/2011, at 3:43 PM, Maxim Kuvyrkov wrote:
> 
>> On 20/10/2011, at 10:11 PM, Jan Hubicka wrote:
>>> static clause_t
>>> -evaluate_conditions_for_edge (struct cgraph_edge *e, bool inline_p)
>>> +evaluate_conditions_vals_binfos_for_edge (struct cgraph_edge *e,
>>> +					  bool inline_p,
>>> +					  VEC (tree, heap) **known_vals_ptr,
>>> +					  VEC (tree, heap) **known_binfos_ptr)
>>> 
>>> Hmm, I would make clause also returned by reference to be sonsistent and perhaps
>>> call it something like edge_properties
>>> since it is not really only about evaulating the clause anymore.
>> 
>> Agree.
>> 
>>> 
>>> -/* Increase SIZE and TIME for size and time needed to handle all calls in NODE.  */
>>> +/* Estimate benefit devirtualizing indirect edge IE, provided KNOWN_VALS and
>>> +   KNOWN_BINFOS.  */
>>> +
>>> +static void
>>> +estimate_edge_devirt_benefit (struct cgraph_edge *ie,
>>> +			      int *size, int *time, int prob,
>>> +			      VEC (tree, heap) *known_vals,
>>> +			      VEC (tree, heap) *known_binfos)
>>> 
>>> I think this whole logic should go into estimate_edge_time_and_size.  This way
>>> we will save all the duplication of scaling logic
>>> Just add the known_vals/binfos arguments.
>> 
>> Then devirtualization benefit will not be available through estimate_node_size_and_time, which is the primary interface for users of ipa-inline-analysis other than the inliner itself.  I.e., estimate_ipcp_clone_size_and_time, which is the only other user of the analysis at the moment, will not see devirtualization benefit.
>> 
>>> 
>>> I am not quite sure how to estimate the actual benefits.  estimate_num_insns
>>> doesn't really make a difference in between direct and indirect calls.
>>> 
>>> I see it is good idea to inline more then the destination is known & inlinable.
>>> This is an example when we have additional knowledge that we want to mix into
>>> badness metric that does not directly translate to time/size.  There are multiple
>>> cases like this.  I was thinking of adding kind of bonus metric for this purpose,
>>> but I would suggest doing this incrementally.
>> 
>> I too thought about this, and decided to keep the bonus metric part to bare minimum in this patch.
>> 
>>> 
>>> What about
>>> 1) extending estimate_num_insns wieghts to account direct calls differently
>>>  from indirect calls (i.e. adding indirect_call cost value into eni wights)
>>>  I would set it 2 for size metrics and 15 for time metrics for start
>>> 2) make estimate_edge_time_and_size to subtract difference of those two metrics
>>>  from edge costs when destination is direct.
>> 
>> OK, I'll try this.
>> 
>>> @@ -2125,25 +2207,35 @@ estimate_calls_size_and_time (struct cgraph_node *node, int *size, int *time,
>>> 	    }
>>> 	  else
>>> 	    estimate_calls_size_and_time (e->callee, size, time,
>>> -					  possible_truths);
>>> +					  possible_truths,
>>> +					  /* TODO: remap KNOWN_VALS and
>>> +					     KNOWN_BINFOS to E->CALLEE
>>> +					     parameters, and use them.  */
>>> +					  NULL, NULL);
>>> 
>>> Remapping should not be needed here - the jump functions are merged after marking edge inline, so jump
>>> functions in inlined functions actually reffer to the parameters of the function they are inlined to.
>> 
>> I remember it crashing on some testcase and thought the lack of remapping was the cause.  I'll look into this.
>> 
>> Thank you,
>> 
>> --
>> Maxim Kuvyrkov
>> CodeSourcery / Mentor Graphics
>> 
> 
> <fsf-gcc-devirt-account-2.ChangeLog><fsf-gcc-devirt-account-2.patch>
Jan Hubicka - Nov. 14, 2011, 5:05 p.m.
Hi,
the patch is OK. Thanks!
Honza

Patch

diff --git a/gcc/ipa-inline-analysis.c b/gcc/ipa-inline-analysis.c
index bd4d2ea..5e88c2d 100644
--- a/gcc/ipa-inline-analysis.c
+++ b/gcc/ipa-inline-analysis.c
@@ -711,14 +711,23 @@  evaluate_conditions_for_known_args (struct cgraph_node *node,
 /* Work out what conditions might be true at invocation of E.  */
 
 static clause_t
-evaluate_conditions_for_edge (struct cgraph_edge *e, bool inline_p)
+evaluate_conditions_vals_binfos_for_edge (struct cgraph_edge *e,
+					  bool inline_p,
+					  VEC (tree, heap) **known_vals_ptr,
+					  VEC (tree, heap) **known_binfos_ptr)

Hmm, I would make clause also returned by reference to be sonsistent and perhaps
call it something like edge_properties
since it is not really only about evaulating the clause anymore.

-/* Increase SIZE and TIME for size and time needed to handle all calls in NODE.  */
+/* Estimate benefit devirtualizing indirect edge IE, provided KNOWN_VALS and
+   KNOWN_BINFOS.  */
+
+static void
+estimate_edge_devirt_benefit (struct cgraph_edge *ie,
+			      int *size, int *time, int prob,
+			      VEC (tree, heap) *known_vals,
+			      VEC (tree, heap) *known_binfos)

I think this whole logic should go into estimate_edge_time_and_size.  This way
we will save all the duplication of scaling logic
Just add the known_vals/binfos arguments.