Patchwork [google,gcc-4_8] fix size_estimation for builtin_expect

login
register
mail settings
Submitter Rong Xu
Date Sept. 26, 2013, 10:17 p.m.
Message ID <CAF1bQ=TS1--pLwzS+B4coX-0mkuRkn0VyfQhzL+qrxZB_TAR6Q@mail.gmail.com>
Download mbox | patch
Permalink /patch/278284/
State New
Headers show

Comments

Rong Xu - Sept. 26, 2013, 10:17 p.m.
Hi,

builtin_expect should be a NOP in size_estimation. Indeed, the call
stmt itself is 0 weight in size and time. But it may introduce
an extra relation expr which has non-zero size/time. The end result
is: for w/ and w/o builtin_expect, we have different size/time estimation
for early inlining.

This patch fixes this problem.

-Rong
2013-09-26  Rong Xu  <xur@google.com>

	* ipa-inline-analysis.c (estimate_function_body_sizes): fix
        the size estimation for builtin_expect.
Jan Hubicka - Sept. 26, 2013, 10:23 p.m.
> Hi,
> 
> builtin_expect should be a NOP in size_estimation. Indeed, the call
> stmt itself is 0 weight in size and time. But it may introduce
> an extra relation expr which has non-zero size/time. The end result
> is: for w/ and w/o builtin_expect, we have different size/time estimation
> for early inlining.
> 
> This patch fixes this problem.
> 
> -Rong

> 2013-09-26  Rong Xu  <xur@google.com>
> 
> 	* ipa-inline-analysis.c (estimate_function_body_sizes): fix
>         the size estimation for builtin_expect.

This seems fine with an comment in the code what it is about.
I also think we want to support mutiple builtin_expects in a BB so perhaps
we want to have pointer set of statements to ignore?

To avoid spagetti code, please just move the new logic into separate functions.

Honza
> 
> Index: ipa-inline-analysis.c
> ===================================================================
> --- ipa-inline-analysis.c	(revision 202638)
> +++ ipa-inline-analysis.c	(working copy)
> @@ -2266,6 +2266,8 @@ estimate_function_body_sizes (struct cgraph_node *
>    /* Estimate static overhead for function prologue/epilogue and alignment. */
>    int overhead = PARAM_VALUE (PARAM_INLINE_FUNCTION_OVERHEAD_SIZE);
>    int size = overhead;
> +  gimple fix_expect_builtin;
> +
>    /* Benefits are scaled by probability of elimination that is in range
>       <0,2>.  */
>    basic_block bb;
> @@ -2359,14 +2361,73 @@ estimate_function_body_sizes (struct cgraph_node *
>  	    }
>  	}
>  
> +      fix_expect_builtin = NULL;
>        for (bsi = gsi_start_bb (bb); !gsi_end_p (bsi); gsi_next (&bsi))
>  	{
>  	  gimple stmt = gsi_stmt (bsi);
> +	  if (gimple_call_builtin_p (stmt, BUILT_IN_EXPECT))
> +            {
> +              tree var = gimple_call_lhs (stmt);
> +              tree arg = gimple_call_arg (stmt, 0);
> +              use_operand_p use_p;
> +              gimple use_stmt;
> +              bool match = false;
> +              bool done = false;
> +              gcc_assert (var && arg);
> +              gcc_assert (TREE_CODE (var) == SSA_NAME);
> +
> +              while (TREE_CODE (arg) == SSA_NAME)
> +                {
> +                  gimple stmt_tmp = SSA_NAME_DEF_STMT (arg);
> +                  switch (gimple_assign_rhs_code (stmt_tmp))
> +                    {
> +                      case LT_EXPR:
> +                      case LE_EXPR:
> +                      case GT_EXPR:
> +                      case GE_EXPR:
> +                      case EQ_EXPR:
> +                      case NE_EXPR:
> +                        match = true;
> +                        done = true;
> +                        break;
> +                      case NOP_EXPR:
> +                        break;
> +                      default:
> +                        done = true;
> +                        break;
> +                    }
> +                  if (done)
> +                    break;
> +                  arg = gimple_assign_rhs1 (stmt_tmp);
> +                }
> +
> +              if (match && single_imm_use (var, &use_p, &use_stmt))
> +                {
> +                  if (gimple_code (use_stmt) == GIMPLE_COND)
> +                    {
> +                      fix_expect_builtin = use_stmt;
> +                    }
> +                }
> +
> +              /* we should see one builtin_expert call in one bb.  */
> +              break;
> +            }
> +        }
> +
> +      for (bsi = gsi_start_bb (bb); !gsi_end_p (bsi); gsi_next (&bsi))
> +	{
> +	  gimple stmt = gsi_stmt (bsi);
>  	  int this_size = estimate_num_insns (stmt, &eni_size_weights);
>  	  int this_time = estimate_num_insns (stmt, &eni_time_weights);
>  	  int prob;
>  	  struct predicate will_be_nonconstant;
>  
> +	  if (stmt == fix_expect_builtin)
> +            {
> +              this_size--;
> +              this_time--;
> +            }
> +
>  	  if (dump_file && (dump_flags & TDF_DETAILS))
>  	    {
>  	      fprintf (dump_file, "  ");
Richard Guenther - Sept. 27, 2013, 8:20 a.m.
On Fri, Sep 27, 2013 at 12:23 AM, Jan Hubicka <hubicka@ucw.cz> wrote:
>> Hi,
>>
>> builtin_expect should be a NOP in size_estimation. Indeed, the call
>> stmt itself is 0 weight in size and time. But it may introduce
>> an extra relation expr which has non-zero size/time. The end result
>> is: for w/ and w/o builtin_expect, we have different size/time estimation
>> for early inlining.
>>
>> This patch fixes this problem.
>>
>> -Rong
>
>> 2013-09-26  Rong Xu  <xur@google.com>
>>
>>       * ipa-inline-analysis.c (estimate_function_body_sizes): fix
>>         the size estimation for builtin_expect.
>
> This seems fine with an comment in the code what it is about.
> I also think we want to support mutiple builtin_expects in a BB so perhaps
> we want to have pointer set of statements to ignore?
>
> To avoid spagetti code, please just move the new logic into separate functions.

Looks like this could use tree-ssa.c:walk_use_def_chains (please
change its implementation as necessary, make it C++, etc. - you will
be the first user again).

Richard.

> Honza
>>
>> Index: ipa-inline-analysis.c
>> ===================================================================
>> --- ipa-inline-analysis.c     (revision 202638)
>> +++ ipa-inline-analysis.c     (working copy)
>> @@ -2266,6 +2266,8 @@ estimate_function_body_sizes (struct cgraph_node *
>>    /* Estimate static overhead for function prologue/epilogue and alignment. */
>>    int overhead = PARAM_VALUE (PARAM_INLINE_FUNCTION_OVERHEAD_SIZE);
>>    int size = overhead;
>> +  gimple fix_expect_builtin;
>> +
>>    /* Benefits are scaled by probability of elimination that is in range
>>       <0,2>.  */
>>    basic_block bb;
>> @@ -2359,14 +2361,73 @@ estimate_function_body_sizes (struct cgraph_node *
>>           }
>>       }
>>
>> +      fix_expect_builtin = NULL;
>>        for (bsi = gsi_start_bb (bb); !gsi_end_p (bsi); gsi_next (&bsi))
>>       {
>>         gimple stmt = gsi_stmt (bsi);
>> +       if (gimple_call_builtin_p (stmt, BUILT_IN_EXPECT))
>> +            {
>> +              tree var = gimple_call_lhs (stmt);
>> +              tree arg = gimple_call_arg (stmt, 0);
>> +              use_operand_p use_p;
>> +              gimple use_stmt;
>> +              bool match = false;
>> +              bool done = false;
>> +              gcc_assert (var && arg);
>> +              gcc_assert (TREE_CODE (var) == SSA_NAME);
>> +
>> +              while (TREE_CODE (arg) == SSA_NAME)
>> +                {
>> +                  gimple stmt_tmp = SSA_NAME_DEF_STMT (arg);
>> +                  switch (gimple_assign_rhs_code (stmt_tmp))
>> +                    {
>> +                      case LT_EXPR:
>> +                      case LE_EXPR:
>> +                      case GT_EXPR:
>> +                      case GE_EXPR:
>> +                      case EQ_EXPR:
>> +                      case NE_EXPR:
>> +                        match = true;
>> +                        done = true;
>> +                        break;
>> +                      case NOP_EXPR:
>> +                        break;
>> +                      default:
>> +                        done = true;
>> +                        break;
>> +                    }
>> +                  if (done)
>> +                    break;
>> +                  arg = gimple_assign_rhs1 (stmt_tmp);
>> +                }
>> +
>> +              if (match && single_imm_use (var, &use_p, &use_stmt))
>> +                {
>> +                  if (gimple_code (use_stmt) == GIMPLE_COND)
>> +                    {
>> +                      fix_expect_builtin = use_stmt;
>> +                    }
>> +                }
>> +
>> +              /* we should see one builtin_expert call in one bb.  */
>> +              break;
>> +            }
>> +        }
>> +
>> +      for (bsi = gsi_start_bb (bb); !gsi_end_p (bsi); gsi_next (&bsi))
>> +     {
>> +       gimple stmt = gsi_stmt (bsi);
>>         int this_size = estimate_num_insns (stmt, &eni_size_weights);
>>         int this_time = estimate_num_insns (stmt, &eni_time_weights);
>>         int prob;
>>         struct predicate will_be_nonconstant;
>>
>> +       if (stmt == fix_expect_builtin)
>> +            {
>> +              this_size--;
>> +              this_time--;
>> +            }
>> +
>>         if (dump_file && (dump_flags & TDF_DETAILS))
>>           {
>>             fprintf (dump_file, "  ");
>
Rong Xu - Sept. 27, 2013, 5:03 p.m.
On Fri, Sep 27, 2013 at 1:20 AM, Richard Biener
<richard.guenther@gmail.com> wrote:
> On Fri, Sep 27, 2013 at 12:23 AM, Jan Hubicka <hubicka@ucw.cz> wrote:
>>> Hi,
>>>
>>> builtin_expect should be a NOP in size_estimation. Indeed, the call
>>> stmt itself is 0 weight in size and time. But it may introduce
>>> an extra relation expr which has non-zero size/time. The end result
>>> is: for w/ and w/o builtin_expect, we have different size/time estimation
>>> for early inlining.
>>>
>>> This patch fixes this problem.
>>>
>>> -Rong
>>
>>> 2013-09-26  Rong Xu  <xur@google.com>
>>>
>>>       * ipa-inline-analysis.c (estimate_function_body_sizes): fix
>>>         the size estimation for builtin_expect.
>>
>> This seems fine with an comment in the code what it is about.
>> I also think we want to support mutiple builtin_expects in a BB so perhaps
>> we want to have pointer set of statements to ignore?
>>
>> To avoid spagetti code, please just move the new logic into separate functions.
>
> Looks like this could use tree-ssa.c:walk_use_def_chains (please
> change its implementation as necessary, make it C++, etc. - you will
> be the first user again).
>

Thanks for the suggestion. But it seems walk_use_def_chains() was
removed by r201951.

-Rong

> Richard.
>
>> Honza
>>>
>>> Index: ipa-inline-analysis.c
>>> ===================================================================
>>> --- ipa-inline-analysis.c     (revision 202638)
>>> +++ ipa-inline-analysis.c     (working copy)
>>> @@ -2266,6 +2266,8 @@ estimate_function_body_sizes (struct cgraph_node *
>>>    /* Estimate static overhead for function prologue/epilogue and alignment. */
>>>    int overhead = PARAM_VALUE (PARAM_INLINE_FUNCTION_OVERHEAD_SIZE);
>>>    int size = overhead;
>>> +  gimple fix_expect_builtin;
>>> +
>>>    /* Benefits are scaled by probability of elimination that is in range
>>>       <0,2>.  */
>>>    basic_block bb;
>>> @@ -2359,14 +2361,73 @@ estimate_function_body_sizes (struct cgraph_node *
>>>           }
>>>       }
>>>
>>> +      fix_expect_builtin = NULL;
>>>        for (bsi = gsi_start_bb (bb); !gsi_end_p (bsi); gsi_next (&bsi))
>>>       {
>>>         gimple stmt = gsi_stmt (bsi);
>>> +       if (gimple_call_builtin_p (stmt, BUILT_IN_EXPECT))
>>> +            {
>>> +              tree var = gimple_call_lhs (stmt);
>>> +              tree arg = gimple_call_arg (stmt, 0);
>>> +              use_operand_p use_p;
>>> +              gimple use_stmt;
>>> +              bool match = false;
>>> +              bool done = false;
>>> +              gcc_assert (var && arg);
>>> +              gcc_assert (TREE_CODE (var) == SSA_NAME);
>>> +
>>> +              while (TREE_CODE (arg) == SSA_NAME)
>>> +                {
>>> +                  gimple stmt_tmp = SSA_NAME_DEF_STMT (arg);
>>> +                  switch (gimple_assign_rhs_code (stmt_tmp))
>>> +                    {
>>> +                      case LT_EXPR:
>>> +                      case LE_EXPR:
>>> +                      case GT_EXPR:
>>> +                      case GE_EXPR:
>>> +                      case EQ_EXPR:
>>> +                      case NE_EXPR:
>>> +                        match = true;
>>> +                        done = true;
>>> +                        break;
>>> +                      case NOP_EXPR:
>>> +                        break;
>>> +                      default:
>>> +                        done = true;
>>> +                        break;
>>> +                    }
>>> +                  if (done)
>>> +                    break;
>>> +                  arg = gimple_assign_rhs1 (stmt_tmp);
>>> +                }
>>> +
>>> +              if (match && single_imm_use (var, &use_p, &use_stmt))
>>> +                {
>>> +                  if (gimple_code (use_stmt) == GIMPLE_COND)
>>> +                    {
>>> +                      fix_expect_builtin = use_stmt;
>>> +                    }
>>> +                }
>>> +
>>> +              /* we should see one builtin_expert call in one bb.  */
>>> +              break;
>>> +            }
>>> +        }
>>> +
>>> +      for (bsi = gsi_start_bb (bb); !gsi_end_p (bsi); gsi_next (&bsi))
>>> +     {
>>> +       gimple stmt = gsi_stmt (bsi);
>>>         int this_size = estimate_num_insns (stmt, &eni_size_weights);
>>>         int this_time = estimate_num_insns (stmt, &eni_time_weights);
>>>         int prob;
>>>         struct predicate will_be_nonconstant;
>>>
>>> +       if (stmt == fix_expect_builtin)
>>> +            {
>>> +              this_size--;
>>> +              this_time--;
>>> +            }
>>> +
>>>         if (dump_file && (dump_flags & TDF_DETAILS))
>>>           {
>>>             fprintf (dump_file, "  ");
>>
Rong Xu - Sept. 27, 2013, 6:04 p.m.
On Thu, Sep 26, 2013 at 3:23 PM, Jan Hubicka <hubicka@ucw.cz> wrote:
>> Hi,
>>
>> builtin_expect should be a NOP in size_estimation. Indeed, the call
>> stmt itself is 0 weight in size and time. But it may introduce
>> an extra relation expr which has non-zero size/time. The end result
>> is: for w/ and w/o builtin_expect, we have different size/time estimation
>> for early inlining.
>>
>> This patch fixes this problem.
>>
>> -Rong
>
>> 2013-09-26  Rong Xu  <xur@google.com>
>>
>>       * ipa-inline-analysis.c (estimate_function_body_sizes): fix
>>         the size estimation for builtin_expect.
>
> This seems fine with an comment in the code what it is about.
> I also think we want to support mutiple builtin_expects in a BB so perhaps
> we want to have pointer set of statements to ignore?
>

Thanks for feedback. I'll add the comment and split the code into a
new function.

You have a good pointer about multiple builtin_expect. I think I need
to remove the early exit (the break stmt).
But I would argue that using pointer_set is probably not necessary.

Here is the reasoning: The typical usage of builtin_expect is like:
  if (__builtin_expect (a<=b, 1))  { ... }
The IR is:
  t1 = a <= b;
  t2 = (long int) t1;
  t3 = __builtin_expect (t2, 1);
  if (t3 != 0) goto ...
Without the builtin, the IR is
  if (a<=b) goto...

The code in the patch check the use of the builtin_expect return
value, to see if it's
used in a COND stmt. So even there are multiple builtins, only one can match.

-Rong

> To avoid spagetti code, please just move the new logic into separate functions.
>
> Honza
>>
>> Index: ipa-inline-analysis.c
>> ===================================================================
>> --- ipa-inline-analysis.c     (revision 202638)
>> +++ ipa-inline-analysis.c     (working copy)
>> @@ -2266,6 +2266,8 @@ estimate_function_body_sizes (struct cgraph_node *
>>    /* Estimate static overhead for function prologue/epilogue and alignment. */
>>    int overhead = PARAM_VALUE (PARAM_INLINE_FUNCTION_OVERHEAD_SIZE);
>>    int size = overhead;
>> +  gimple fix_expect_builtin;
>> +
>>    /* Benefits are scaled by probability of elimination that is in range
>>       <0,2>.  */
>>    basic_block bb;
>> @@ -2359,14 +2361,73 @@ estimate_function_body_sizes (struct cgraph_node *
>>           }
>>       }
>>
>> +      fix_expect_builtin = NULL;
>>        for (bsi = gsi_start_bb (bb); !gsi_end_p (bsi); gsi_next (&bsi))
>>       {
>>         gimple stmt = gsi_stmt (bsi);
>> +       if (gimple_call_builtin_p (stmt, BUILT_IN_EXPECT))
>> +            {
>> +              tree var = gimple_call_lhs (stmt);
>> +              tree arg = gimple_call_arg (stmt, 0);
>> +              use_operand_p use_p;
>> +              gimple use_stmt;
>> +              bool match = false;
>> +              bool done = false;
>> +              gcc_assert (var && arg);
>> +              gcc_assert (TREE_CODE (var) == SSA_NAME);
>> +
>> +              while (TREE_CODE (arg) == SSA_NAME)
>> +                {
>> +                  gimple stmt_tmp = SSA_NAME_DEF_STMT (arg);
>> +                  switch (gimple_assign_rhs_code (stmt_tmp))
>> +                    {
>> +                      case LT_EXPR:
>> +                      case LE_EXPR:
>> +                      case GT_EXPR:
>> +                      case GE_EXPR:
>> +                      case EQ_EXPR:
>> +                      case NE_EXPR:
>> +                        match = true;
>> +                        done = true;
>> +                        break;
>> +                      case NOP_EXPR:
>> +                        break;
>> +                      default:
>> +                        done = true;
>> +                        break;
>> +                    }
>> +                  if (done)
>> +                    break;
>> +                  arg = gimple_assign_rhs1 (stmt_tmp);
>> +                }
>> +
>> +              if (match && single_imm_use (var, &use_p, &use_stmt))
>> +                {
>> +                  if (gimple_code (use_stmt) == GIMPLE_COND)
>> +                    {
>> +                      fix_expect_builtin = use_stmt;
>> +                    }
>> +                }
>> +
>> +              /* we should see one builtin_expert call in one bb.  */
>> +              break;
>> +            }
>> +        }
>> +
>> +      for (bsi = gsi_start_bb (bb); !gsi_end_p (bsi); gsi_next (&bsi))
>> +     {
>> +       gimple stmt = gsi_stmt (bsi);
>>         int this_size = estimate_num_insns (stmt, &eni_size_weights);
>>         int this_time = estimate_num_insns (stmt, &eni_time_weights);
>>         int prob;
>>         struct predicate will_be_nonconstant;
>>
>> +       if (stmt == fix_expect_builtin)
>> +            {
>> +              this_size--;
>> +              this_time--;
>> +            }
>> +
>>         if (dump_file && (dump_flags & TDF_DETAILS))
>>           {
>>             fprintf (dump_file, "  ");
>

Patch

Index: ipa-inline-analysis.c
===================================================================
--- ipa-inline-analysis.c	(revision 202638)
+++ ipa-inline-analysis.c	(working copy)
@@ -2266,6 +2266,8 @@  estimate_function_body_sizes (struct cgraph_node *
   /* Estimate static overhead for function prologue/epilogue and alignment. */
   int overhead = PARAM_VALUE (PARAM_INLINE_FUNCTION_OVERHEAD_SIZE);
   int size = overhead;
+  gimple fix_expect_builtin;
+
   /* Benefits are scaled by probability of elimination that is in range
      <0,2>.  */
   basic_block bb;
@@ -2359,14 +2361,73 @@  estimate_function_body_sizes (struct cgraph_node *
 	    }
 	}
 
+      fix_expect_builtin = NULL;
       for (bsi = gsi_start_bb (bb); !gsi_end_p (bsi); gsi_next (&bsi))
 	{
 	  gimple stmt = gsi_stmt (bsi);
+	  if (gimple_call_builtin_p (stmt, BUILT_IN_EXPECT))
+            {
+              tree var = gimple_call_lhs (stmt);
+              tree arg = gimple_call_arg (stmt, 0);
+              use_operand_p use_p;
+              gimple use_stmt;
+              bool match = false;
+              bool done = false;
+              gcc_assert (var && arg);
+              gcc_assert (TREE_CODE (var) == SSA_NAME);
+
+              while (TREE_CODE (arg) == SSA_NAME)
+                {
+                  gimple stmt_tmp = SSA_NAME_DEF_STMT (arg);
+                  switch (gimple_assign_rhs_code (stmt_tmp))
+                    {
+                      case LT_EXPR:
+                      case LE_EXPR:
+                      case GT_EXPR:
+                      case GE_EXPR:
+                      case EQ_EXPR:
+                      case NE_EXPR:
+                        match = true;
+                        done = true;
+                        break;
+                      case NOP_EXPR:
+                        break;
+                      default:
+                        done = true;
+                        break;
+                    }
+                  if (done)
+                    break;
+                  arg = gimple_assign_rhs1 (stmt_tmp);
+                }
+
+              if (match && single_imm_use (var, &use_p, &use_stmt))
+                {
+                  if (gimple_code (use_stmt) == GIMPLE_COND)
+                    {
+                      fix_expect_builtin = use_stmt;
+                    }
+                }
+
+              /* we should see one builtin_expert call in one bb.  */
+              break;
+            }
+        }
+
+      for (bsi = gsi_start_bb (bb); !gsi_end_p (bsi); gsi_next (&bsi))
+	{
+	  gimple stmt = gsi_stmt (bsi);
 	  int this_size = estimate_num_insns (stmt, &eni_size_weights);
 	  int this_time = estimate_num_insns (stmt, &eni_time_weights);
 	  int prob;
 	  struct predicate will_be_nonconstant;
 
+	  if (stmt == fix_expect_builtin)
+            {
+              this_size--;
+              this_time--;
+            }
+
 	  if (dump_file && (dump_flags & TDF_DETAILS))
 	    {
 	      fprintf (dump_file, "  ");