diff mbox

[Fortran] RFC / RFA patch for using build_predict_expr instead of builtin_expect / PR 58721

Message ID 52A7958E.9060903@net-b.de
State New
Headers show

Commit Message

Tobias Burnus Dec. 10, 2013, 10:28 p.m. UTC
Pre-remark:

I had hoped that something like the following patch would work. However, 
it will lead to a bunch of run-time segfaults in the test suite - but 
the original dump seems to be fine and I also fail to spot the problem 
when looking at the patch. Thus, instead of posting a working patch, I 
have below a nonworking draft patch. Questions:

(a) Does the build_predict_expr() usage look in principle right - or do 
I do something obviously wrong?

(b) Is it possible to use build_predict_expr() for code like "a = (cond) 
? (expr1) : (expr2)"? After gimplifying, there should be a BB [and a 
phi] - thus, adding the predict to the BB should be possible. But is it 
also possible on tree level with code like above? That mainly applies to 
trans-array.c but also in another case.

(c) How to handle:  if (cond1 || overflow_cond) {...}? Currently, 
"overflow_cond" is marked as unlikely but as build_predict_expr() 
applies to the branch/basic block and not to the condition, how should 
it be handled?

(d) And in general: Does the approach look right and the choice of PRED_ 
and their values?

* * *

gfortran uses internally builtin_expect [i.e. gfc_likely()/gfc_unlikely()].

On the other hand, builtin_expect is also used by (C/C++) users. Google 
did some tests and empirical result was that unlikely is more likely 
than GCC's prediction handling expected. I believe, the likelyhood for 
likely() reduced from a hit rate of 99 to 90.

For gfortran, the change leads to PR58721, where some code is regarded 
as less likely as before and to no inlining. In many cases, the 
unlikely() is not necessary as those branches have a call to an error 
function, annotated with "noreturn". As explicit user request overrides 
other heurisitic, the builtin_expect will cause that the predictor 
regards the gfc_unlikely() as *more* likely than without (as "noreturn" 
is less likely).


This patch tries to address this for the different cases:

a) If there is a "noreturn" function called, the gfc_(un)likely has been 
removed. [Unless, I wrongly classified is in one of the next items; 
however, a "noreturn" will override another PRED_*. Thus, it shouldn't 
matter.]

b) As OVERFLOWs should only occur extremely rarely, I used PRED_OVERFLOW 
with PROB_VERY_LIKELY (i.e. the branch is very likely to be NOT_TAKEN).

c) In many cases, GCC permits to replace run-time aborts for failures by 
asking for a status, e.g. "DEALLOCATE(var, stat=status)"; I have used 
PRED_FAIL with HITRATE(80) for those - but one can argue about the value.

d) gfortran supports a run-time warning. That's currently only used for 
copy-in/out. I think both branches can be approximately equally likely 
(with no-warning being more common in most codes). However, the warning 
is only shown once. I have set PRED_WARN_CALL's hitrate to  to 70 - but 
that's also an arbitrary value.

e) As in the trans_image_index: There one had:  "if (cond || 
invalid_bound) {... } else {...}". The "invalid_bound" is marked as 
unlikely but PRED_* cannot be use as only part of the branch condition 
is unlikely. [I've removed the unlikely.]


Does the patch look sensible? Does the addition of the PRED_* make 
sense? How about their values?


Tobias

Comments

Jan Hubicka Dec. 15, 2013, 3:09 p.m. UTC | #1
Hi,
sorry for taking time to return back to this.
> Pre-remark:
> 
> I had hoped that something like the following patch would work.
> However, it will lead to a bunch of run-time segfaults in the test
> suite - but the original dump seems to be fine and I also fail to
> spot the problem when looking at the patch. Thus, instead of posting
> a working patch, I have below a nonworking draft patch. Questions:
> 
> (a) Does the build_predict_expr() usage look in principle right - or
> do I do something obviously wrong?
> 
> (b) Is it possible to use build_predict_expr() for code like "a =
> (cond) ? (expr1) : (expr2)"? After gimplifying, there should be a BB
> [and a phi] - thus, adding the predict to the BB should be possible.
> But is it also possible on tree level with code like above? That
> mainly applies to trans-array.c but also in another case.
> 
> (c) How to handle:  if (cond1 || overflow_cond) {...}? Currently,
> "overflow_cond" is marked as unlikely but as build_predict_expr()
> applies to the branch/basic block and not to the condition, how
> should it be handled?
> 
> (d) And in general: Does the approach look right and the choice of
> PRED_ and their values?
> 
> * * *
> 
> gfortran uses internally builtin_expect [i.e. gfc_likely()/gfc_unlikely()].
> 
> On the other hand, builtin_expect is also used by (C/C++) users.
> Google did some tests and empirical result was that unlikely is more
> likely than GCC's prediction handling expected. I believe, the
> likelyhood for likely() reduced from a hit rate of 99 to 90.
> 
> For gfortran, the change leads to PR58721, where some code is
> regarded as less likely as before and to no inlining. In many cases,
> the unlikely() is not necessary as those branches have a call to an
> error function, annotated with "noreturn". As explicit user request
> overrides other heurisitic, the builtin_expect will cause that the
> predictor regards the gfc_unlikely() as *more* likely than without
> (as "noreturn" is less likely).
> 
> 
> This patch tries to address this for the different cases:
> 
> a) If there is a "noreturn" function called, the gfc_(un)likely has
> been removed. [Unless, I wrongly classified is in one of the next
> items; however, a "noreturn" will override another PRED_*. Thus, it
> shouldn't matter.]
> 
> b) As OVERFLOWs should only occur extremely rarely, I used
> PRED_OVERFLOW with PROB_VERY_LIKELY (i.e. the branch is very likely
> to be NOT_TAKEN).
> 
> c) In many cases, GCC permits to replace run-time aborts for
> failures by asking for a status, e.g. "DEALLOCATE(var,
> stat=status)"; I have used PRED_FAIL with HITRATE(80) for those -
> but one can argue about the value.
> 
> d) gfortran supports a run-time warning. That's currently only used
> for copy-in/out. I think both branches can be approximately equally
> likely (with no-warning being more common in most codes). However,
> the warning is only shown once. I have set PRED_WARN_CALL's hitrate
> to  to 70 - but that's also an arbitrary value.
> 
> e) As in the trans_image_index: There one had:  "if (cond ||
> invalid_bound) {... } else {...}". The "invalid_bound" is marked as
> unlikely but PRED_* cannot be use as only part of the branch
> condition is unlikely. [I've removed the unlikely.]
> 
> 
> Does the patch look sensible? Does the addition of the PRED_* make
> sense? How about their values?
> 
> 
> Tobias

>  predict.def               |   16 +++++++
> 
>  fortran/trans-array.c     |   26 +++++++-----
>  fortran/trans-expr.c      |   15 ++++---
>  fortran/trans-intrinsic.c |    3 -
>  fortran/trans-io.c        |    1 
>  fortran/trans-stmt.c      |   26 ++++++++----
>  fortran/trans.c           |   97 +++++++++++++++++++++++-----------------------
>  fortran/trans.h           |    4 -
> 
>  8 files changed, 107 insertions(+), 81 deletions(-)
> 
> diff --git a/gcc/fortran/trans-array.c b/gcc/fortran/trans-array.c
> index 78b08d7..8f4db5a 100644
> --- a/gcc/fortran/trans-array.c
> +++ b/gcc/fortran/trans-array.c
> @@ -79,6 +79,7 @@ along with GCC; see the file COPYING3.  If not see
>  #include "system.h"
>  #include "coretypes.h"
>  #include "tree.h"
> +#include "predict.h"	/* For PRED_FAIL.  */
> @@ -5222,6 +5222,8 @@ gfc_array_allocate (gfc_se * se, gfc_expr * expr, tree status, tree errmsg,
>    overflow = integer_zero_node;
>  
>    gfc_init_block (&set_descriptor_block);
> +  gfc_add_expr_to_block (&set_descriptor_block,
> +			 build_predict_expr (PRED_FAIL, TAKEN));
>    size = gfc_array_init_size (se->expr, ref->u.ar.as->rank,
>  			      ref->u.ar.as->corank, &offset, lower, upper,
>  			      &se->pre, &set_descriptor_block, &overflow,
> @@ -5248,6 +5250,8 @@ gfc_array_allocate (gfc_se * se, gfc_expr * expr, tree status, tree errmsg,
>  	  stmtblock_t set_status_block;
>  
>  	  gfc_start_block (&set_status_block);
> +	  gfc_add_expr_to_block (&set_status_block,
> +				 build_predict_expr (PRED_FAIL, NOT_TAKEN));
>  	  gfc_add_modify (&set_status_block, status,
>  			  build_int_cst (status_type, LIBERROR_ALLOCATION));
>  	  error = gfc_finish_block (&set_status_block);

I am bit confused by the expansion, but in general it seems resonable.
You do not need to attach predict_expr on both sides of one branch.  I.e. it seems enough
to attach the NOT_TAKEN predict_expr to the failure path.

There is important different in between builtin_expect and predict_expr.
The first speaks about the value, while the second speaks about its current basic block.

I.e.

if (__builtin_expect (a,0))
  unlikely path
else
  likely path.

translates as
if (a)
  predict_expr not_taken
  unlikely path
else
  likely path.

But if you have something like

a=__builtin_expect (b?1:0,0)

and you produce

a=b?predict_expr not_taken, 0,0
...
if (a)
  unlikely path

We need to check how it goes down to gimple.  Most probably we will end up with:

if (b)
  predict_expr not_taken
  a=0
else
  a=1;
if (a)
  unlikely path

predict.c itself is not smart enough to work out that A in this case has
expected value.  I am not sure if it should be - this is a difficult case
where you already need to have branch predictions to derive predictions
on value ranges.  There is a paper on the topic, but overall it seems
bit expensive and with questionable benefits for production use.

We still get proper prediction if we manage to crossjump second condition.
Crossjumping currently happens only after IPA so it won't help the branch
prediction accuracy in general. Moreover we will not have

So in short if we want the predictions to work reliably, we need to be sure
we produce them in a way backend can reliably consume them.

I think we should
  1) make use of predict_expr where it makes sense (i.e. forntend know it is
     generating the fallback path by itself).  it is cheaper and also more safe
     about fact that we will actually take the hint.
     __builtin_expect handling has its limitations and there are cases where we
     simply end up ignoring it since its use is too complicated.
  2) if there are the other cases (i.e. fortran language allows the failure
     flag to be stored into user variable and handled by user later),
     I will add internal use only argument to predict_expr and extend its handling.
     Again it won't be 100% reliable, since one conditional can then be handled
     by multiple predict_exprs that leads to generally unsolvable problem on
     how to combine probabilities.
     But hopefully those cases will mostly be simple - i.e. user will have
     if (failure_flag)
       make horrible death in controlled way
     statement somewhre after the allcation.
     We are smart enough to work out simple variants like
     if (failure_flag1 || failure_flag2)
       make horrible death in controlled way
  3) make sure the runtime library calls are correctly annotated with noreturn/cold
     flags that may help back end to work out the cases it failed otherwise.

Let me know if you need me to make patch for 2).
otherwise the patch seems OK with changes bellow

> diff --git a/gcc/predict.def b/gcc/predict.def
> index 2ce135c..ca58b26 100644
> --- a/gcc/predict.def
> +++ b/gcc/predict.def
> @@ -133,3 +133,19 @@ DEF_PREDICTOR (PRED_HOT_LABEL, "hot label", HITRATE (85), 0)
>  /* Branches to cold labels are extremely unlikely.  */
>  DEF_PREDICTOR (PRED_COLD_LABEL, "cold label", PROB_VERY_LIKELY,
>  	       PRED_FLAG_FIRST_MATCH)
> +
> +
> +/* The following are used in Fortran. */
> +
> +/* Branch leading to a run-time warning message are less likely.  */
> +DEF_PREDICTOR (PRED_WARN_CALL, "warn call", HITRATE (70), 0)
> +
> +/* Branch leading to an failure status are unlikely.  */
> +DEF_PREDICTOR (PRED_FAIL, "fail", HITRATE(80), 0)
> +
> +/* Branch belonging to a zero-sized array or absent argument are unlikely.  */
> +DEF_PREDICTOR (PRED_ZERO, "zero", HITRATE(80), 0)
> +
> +/* Branch leading to an overflow are extremely unlikely.  */
> +DEF_PREDICTOR (PRED_OVERFLOW, "overflow", PROB_VERY_LIKELY,
> +	       PRED_FLAG_FIRST_MATCH)

I would even gor for PROB_ALWAYS here.  As observed in the testcase, we may have
_very_ many early exits from the function because it allocated buch of arrays.
It realy depends if you compute 1^50, 0.99^50 or 0.9^50

I would go for a bit more dscriptive names, like "fortran alloc overflow"/"fortran zero size alloc" etc.
Those names will hopefuly be more obvious for middle end developer like myself.

Honza
Tobias Burnus Dec. 15, 2013, 6:55 p.m. UTC | #2
Hi Honza,

Jan Hubicka wrote:
> But if you have something like
>
> a=__builtin_expect (b?1:0,0)
>
> and you produce
>
> a=b?predict_expr not_taken, 0,0
> ...
> if (a)
>    unlikely path
>
> We need to check how it goes down to gimple.

It seems as if something doesn't work in that case – at least I do not 
understand the failure in gfortran.dg/elemental_optional_args_5.f03. 
With the patch one has in gfc_conv_procedure_call:

+ gfc_init_block (&err_block);
+ gfc_add_expr_to_block (&err_block,
+ build_predict_expr (PRED_ZERO, NOT_TAKEN));
+ gfc_add_expr_to_block (&err_block,
+ fold_convert (TREE_TYPE (parmse.expr),
+ null_pointer_node));


tmp = fold_build2_loc (input_location, EQ_EXPR, boolean_type_node,
descriptor_data,
fold_convert (TREE_TYPE (descriptor_data),
null_pointer_node));
parmse.expr
= fold_build3_loc (input_location, COND_EXPR,
+ TREE_TYPE (parmse.expr), tmp,
+ gfc_finish_block (&err_block), parmse.expr);


And for some reasons that will fail with -O0/-Os - it works with -O1 or 
when one removes the PREDICT.

Any idea what goes wrong here? I thought predictions can only produce 
inefficient code but not wrong code.


> So in short if we want the predictions to work reliably, we need to be sure
> we produce them in a way backend can reliably consume them.

I fully concur.

>    2) if there are the other cases (i.e. fortran language allows the failure
>       flag to be stored into user variable and handled by user later),
>       I will add internal use only argument to predict_expr and extend its handling.
>       Again it won't be 100% reliable, since one conditional can then be handled
>       by multiple predict_exprs that leads to generally unsolvable problem on
>       how to combine probabilities.
>       But hopefully those cases will mostly be simple - i.e. user will have
>       if (failure_flag)
>         make horrible death in controlled way
>       statement somewhre after the allcation.
>       We are smart enough to work out simple variants like
>       if (failure_flag1 || failure_flag2)
>         make horrible death in controlled way

I think that's the case for, e.g.
ALLOCATE(variable, stat=status)
where gives status /= 0 in case an error occurs.

>    3) make sure the runtime library calls are correctly annotated with noreturn/cold
>       flags that may help back end to work out the cases it failed otherwise.

I think that's already the case - and is the most important case.

>> +/* Branch leading to an overflow are extremely unlikely.  */
>> +DEF_PREDICTOR (PRED_OVERFLOW, "overflow", PROB_VERY_LIKELY,
>> +	       PRED_FLAG_FIRST_MATCH)
> I would even gor for PROB_ALWAYS here.

Fine with me - I was only fearing that it would regard it as always and 
remove the NOT_TAKEN path. But if it keeps it, using PROB_ALWAYS makes 
sense.

> I would go for a bit more dscriptive names, like "fortran alloc overflow"/"fortran zero size alloc" etc.
> Those names will hopefuly be more obvious for middle end developer like myself.

Okay. When the issue with elemental_optional_args_5.f03 is understood, I 
will update the patch.

Tobias
Jan Hubicka Dec. 15, 2013, 7:48 p.m. UTC | #3
> Hi Honza,
> 
> Jan Hubicka wrote:
> >But if you have something like
> >
> >a=__builtin_expect (b?1:0,0)
> >
> >and you produce
> >
> >a=b?predict_expr not_taken, 0,0
> >...
> >if (a)
> >   unlikely path
> >
> >We need to check how it goes down to gimple.
> 
> It seems as if something doesn't work in that case – at least I do
> not understand the failure in
> gfortran.dg/elemental_optional_args_5.f03. With the patch one has in
> gfc_conv_procedure_call:
> 
> + gfc_init_block (&err_block);
> + gfc_add_expr_to_block (&err_block,
> + build_predict_expr (PRED_ZERO, NOT_TAKEN));
> + gfc_add_expr_to_block (&err_block,
> + fold_convert (TREE_TYPE (parmse.expr),
> + null_pointer_node));
> 
> 
> tmp = fold_build2_loc (input_location, EQ_EXPR, boolean_type_node,
> descriptor_data,
> fold_convert (TREE_TYPE (descriptor_data),
> null_pointer_node));
> parmse.expr
> = fold_build3_loc (input_location, COND_EXPR,
> + TREE_TYPE (parmse.expr), tmp,
> + gfc_finish_block (&err_block), parmse.expr);
> 
> 
> And for some reasons that will fail with -O0/-Os - it works with -O1
> or when one removes the PREDICT.
> 
> Any idea what goes wrong here? I thought predictions can only
> produce inefficient code but not wrong code.

Yep, they should not roduce incorrect code. Isn't the problem
that you expect the whole expression to have value and predit_expr
"reutrns" nothing?
Can you check what lands into gimple?
> 
> 
> >So in short if we want the predictions to work reliably, we need to be sure
> >we produce them in a way backend can reliably consume them.
> 
> I fully concur.
> 
> >   2) if there are the other cases (i.e. fortran language allows the failure
> >      flag to be stored into user variable and handled by user later),
> >      I will add internal use only argument to predict_expr and extend its handling.
> >      Again it won't be 100% reliable, since one conditional can then be handled
> >      by multiple predict_exprs that leads to generally unsolvable problem on
> >      how to combine probabilities.
> >      But hopefully those cases will mostly be simple - i.e. user will have
> >      if (failure_flag)
> >        make horrible death in controlled way
> >      statement somewhre after the allcation.
> >      We are smart enough to work out simple variants like
> >      if (failure_flag1 || failure_flag2)
> >        make horrible death in controlled way
> 
> I think that's the case for, e.g.
> ALLOCATE(variable, stat=status)
> where gives status /= 0 in case an error occurs.

Yep, in that case I will make the patch extending builtin_expect semantic then.
> 
> >   3) make sure the runtime library calls are correctly annotated with noreturn/cold
> >      flags that may help back end to work out the cases it failed otherwise.
> 
> I think that's already the case - and is the most important case.
> 
> >>+/* Branch leading to an overflow are extremely unlikely.  */
> >>+DEF_PREDICTOR (PRED_OVERFLOW, "overflow", PROB_VERY_LIKELY,
> >>+	       PRED_FLAG_FIRST_MATCH)
> >I would even gor for PROB_ALWAYS here.
> 
> Fine with me - I was only fearing that it would regard it as always
> and remove the NOT_TAKEN path. But if it keeps it, using PROB_ALWAYS
> makes sense.

It is prediction only, not a fact one can use for optimization. Basically
things that will never ever be on time critical path can be marked
as PROB_ALWAYS.

Here I think we can be pretty sure (well of course except for artifically
constructed testcase.  We can also make difference in between case where
frontend autogenerate termination of the program and where user asks for
status, perhaps?)
> 
> >I would go for a bit more dscriptive names, like "fortran alloc overflow"/"fortran zero size alloc" etc.
> >Those names will hopefuly be more obvious for middle end developer like myself.
> 
> Okay. When the issue with elemental_optional_args_5.f03 is
> understood, I will update the patch.
> 
> Tobias
Tobias Burnus Dec. 15, 2013, 8:58 p.m. UTC | #4
Jan Hubicka wrote:
> Yep, they should not roduce incorrect code. Isn't the problem
> that you expect the whole expression to have value and predit_expr
> "reutrns" nothing?
> Can you check what lands into gimple?

That could well be the case – I replace "0" by "{ built_predict; 0 }" 
and I wouldn't be surprised if the built_predict causes problem as it 
returns 'nothing'. At least the basic block belonging to "else" 
(<D.1934>) is empty:

Original dump (4.8 and 4.9 with predict patch):
           sub1 (&v[S.0 + -1], (logical(kind=4)) __builtin_expect 
((integer(kind=8)) (D.1897 == 0B), 0) ? 0B : &(*D.1897)[(S.0 + D.1901) * 
D.1903 + D.1898]);
and
           sub1 (&v[S.0 + -1], D.1917 != 0B ? &(*D.1917)[(S.0 + D.1921) 
* D.1923 + D.1918] : (void) 0);

Gimple of 4.8:
       if (D.1916 != 0) goto <D.1917>; else goto <D.1918>;
       <D.1917>:
       iftmp.2 = 0B;
       goto <D.1919>;
       <D.1918>:
...
       iftmp.2 = &*D.1897[D.1922];
       <D.1919>:
..
       sub1 (D.1924, iftmp.2);

gimple of 4.9 with patch:
       if (D.1917 != 0B) goto <D.1933>; else goto <D.1934>;
       <D.1933>:
       D.1935 = S.0 + D.1921;
       D.1936 = D.1935 * D.1923;
       D.1937 = D.1936 + D.1918;
       iftmp.2 = &*D.1917[D.1937];
       goto <D.1938>;
       <D.1934>:
       <D.1938>:
       D.1939 = S.0 + -1;
       D.1940 = &v[D.1939];
       sub1 (D.1940, iftmp.2);

Tobias

PS: That's for the code:

   implicit none
   type t2
     integer, allocatable :: a
     integer, pointer :: p2(:) => null()
   end type t2
   type(t2), save :: x
   integer, save :: s, v(2)
   call sub1 (v, x%p2)
contains
   elemental subroutine sub1 (x, y)
     integer, intent(inout) :: x
     integer, intent(in), optional :: y
   end subroutine sub1
end
Jan Hubicka Dec. 15, 2013, 10:15 p.m. UTC | #5
> Jan Hubicka wrote:
> >Yep, they should not roduce incorrect code. Isn't the problem
> >that you expect the whole expression to have value and predit_expr
> >"reutrns" nothing?
> >Can you check what lands into gimple?
> 
> That could well be the case – I replace "0" by "{ built_predict; 0
> }" and I wouldn't be surprised if the built_predict causes problem

Yep, though this is also the case whrere you really want to predict
a value rather than code path, so the extended bultin_expect is probably
only resonable approach here.

I am not really generic person, but perhaps it is a difference in between {
built_predict; 0 } that is stmt with no value and built_predict,0 that is an
expression with value?

I will look into the builtin_expect extension. Then we can probably
keem gfc_likely/unlikely with extra argument specifying the predictor
for cases like this and use predict_expr in cases where you
really produce runtime conditional like
if (...)
  { bulitin_predict; abort ();}

Honza
> as it returns 'nothing'. At least the basic block belonging to
> "else" (<D.1934>) is empty:
> 
> Original dump (4.8 and 4.9 with predict patch):
>           sub1 (&v[S.0 + -1], (logical(kind=4)) __builtin_expect
> ((integer(kind=8)) (D.1897 == 0B), 0) ? 0B : &(*D.1897)[(S.0 +
> D.1901) * D.1903 + D.1898]);
> and
>           sub1 (&v[S.0 + -1], D.1917 != 0B ? &(*D.1917)[(S.0 +
> D.1921) * D.1923 + D.1918] : (void) 0);
> 
> Gimple of 4.8:
>       if (D.1916 != 0) goto <D.1917>; else goto <D.1918>;
>       <D.1917>:
>       iftmp.2 = 0B;
>       goto <D.1919>;
>       <D.1918>:
> ...
>       iftmp.2 = &*D.1897[D.1922];
>       <D.1919>:
> ..
>       sub1 (D.1924, iftmp.2);
> 
> gimple of 4.9 with patch:
>       if (D.1917 != 0B) goto <D.1933>; else goto <D.1934>;
>       <D.1933>:
>       D.1935 = S.0 + D.1921;
>       D.1936 = D.1935 * D.1923;
>       D.1937 = D.1936 + D.1918;
>       iftmp.2 = &*D.1917[D.1937];
>       goto <D.1938>;
>       <D.1934>:
>       <D.1938>:
>       D.1939 = S.0 + -1;
>       D.1940 = &v[D.1939];
>       sub1 (D.1940, iftmp.2);
> 
> Tobias
> 
> PS: That's for the code:
> 
>   implicit none
>   type t2
>     integer, allocatable :: a
>     integer, pointer :: p2(:) => null()
>   end type t2
>   type(t2), save :: x
>   integer, save :: s, v(2)
>   call sub1 (v, x%p2)
> contains
>   elemental subroutine sub1 (x, y)
>     integer, intent(inout) :: x
>     integer, intent(in), optional :: y
>   end subroutine sub1
> end
diff mbox

Patch

 predict.def               |   16 +++++++

 fortran/trans-array.c     |   26 +++++++-----
 fortran/trans-expr.c      |   15 ++++---
 fortran/trans-intrinsic.c |    3 -
 fortran/trans-io.c        |    1 
 fortran/trans-stmt.c      |   26 ++++++++----
 fortran/trans.c           |   97 +++++++++++++++++++++++-----------------------
 fortran/trans.h           |    4 -

 8 files changed, 107 insertions(+), 81 deletions(-)

diff --git a/gcc/fortran/trans-array.c b/gcc/fortran/trans-array.c
index 78b08d7..8f4db5a 100644
--- a/gcc/fortran/trans-array.c
+++ b/gcc/fortran/trans-array.c
@@ -79,6 +79,7 @@  along with GCC; see the file COPYING3.  If not see
 #include "system.h"
 #include "coretypes.h"
 #include "tree.h"
+#include "predict.h"	/* For PRED_FAIL.  */
 #include "gimple-expr.h"
 #include "diagnostic-core.h"	/* For internal_error/fatal_error.  */
 #include "flags.h"
@@ -4987,13 +4988,12 @@  gfc_array_init_size (tree descriptor, int rank, int corank, tree * poffset,
 			     fold_convert (gfc_array_index_type,
 					   TYPE_MAX_VALUE (gfc_array_index_type)),
 					   size);
-      cond = gfc_unlikely (fold_build2_loc (input_location, LT_EXPR,
-					    boolean_type_node, tmp, stride));
+      cond = fold_build2_loc (input_location, LT_EXPR, boolean_type_node, tmp, stride);
       tmp = fold_build3_loc (input_location, COND_EXPR, integer_type_node, cond,
 			     integer_one_node, integer_zero_node);
-      cond = gfc_unlikely (fold_build2_loc (input_location, EQ_EXPR,
+      cond = /*gfc_unlikely (*/fold_build2_loc (input_location, EQ_EXPR,
 					    boolean_type_node, size,
-					    gfc_index_zero_node));
+					    gfc_index_zero_node);
       tmp = fold_build3_loc (input_location, COND_EXPR, integer_type_node, cond,
 			     integer_zero_node, tmp);
       tmp = fold_build2_loc (input_location, PLUS_EXPR, integer_type_node,
@@ -5089,13 +5089,13 @@  gfc_array_init_size (tree descriptor, int rank, int corank, tree * poffset,
   tmp = fold_build2_loc (input_location, TRUNC_DIV_EXPR,
 			 size_type_node,
 			 TYPE_MAX_VALUE (size_type_node), element_size);
-  cond = gfc_unlikely (fold_build2_loc (input_location, LT_EXPR,
-					boolean_type_node, tmp, stride));
+  cond = /*gfc_unlikely (*/fold_build2_loc (input_location, LT_EXPR,
+					boolean_type_node, tmp, stride);
   tmp = fold_build3_loc (input_location, COND_EXPR, integer_type_node, cond,
 			 integer_one_node, integer_zero_node);
-  cond = gfc_unlikely (fold_build2_loc (input_location, EQ_EXPR,
+  cond = /*gfc_unlikely (*/fold_build2_loc (input_location, EQ_EXPR,
 					boolean_type_node, element_size,
-					build_int_cst (size_type_node, 0)));
+					build_int_cst (size_type_node, 0));
   tmp = fold_build3_loc (input_location, COND_EXPR, integer_type_node, cond,
 			 integer_zero_node, tmp);
   tmp = fold_build2_loc (input_location, PLUS_EXPR, integer_type_node,
@@ -5222,6 +5222,8 @@  gfc_array_allocate (gfc_se * se, gfc_expr * expr, tree status, tree errmsg,
   overflow = integer_zero_node;
 
   gfc_init_block (&set_descriptor_block);
+  gfc_add_expr_to_block (&set_descriptor_block,
+			 build_predict_expr (PRED_FAIL, TAKEN));
   size = gfc_array_init_size (se->expr, ref->u.ar.as->rank,
 			      ref->u.ar.as->corank, &offset, lower, upper,
 			      &se->pre, &set_descriptor_block, &overflow,
@@ -5248,6 +5250,8 @@  gfc_array_allocate (gfc_se * se, gfc_expr * expr, tree status, tree errmsg,
 	  stmtblock_t set_status_block;
 
 	  gfc_start_block (&set_status_block);
+	  gfc_add_expr_to_block (&set_status_block,
+				 build_predict_expr (PRED_FAIL, NOT_TAKEN));
 	  gfc_add_modify (&set_status_block, status,
 			  build_int_cst (status_type, LIBERROR_ALLOCATION));
 	  error = gfc_finish_block (&set_status_block);
@@ -5276,8 +5280,8 @@  gfc_array_allocate (gfc_se * se, gfc_expr * expr, tree status, tree errmsg,
 
   if (dimension)
     {
-      cond = gfc_unlikely (fold_build2_loc (input_location, NE_EXPR,
-			   boolean_type_node, var_overflow, integer_zero_node));
+      cond = fold_build2_loc (input_location, NE_EXPR, boolean_type_node,
+			      var_overflow, integer_zero_node);
       tmp = fold_build3_loc (input_location, COND_EXPR, void_type_node, cond,
 			     error, gfc_finish_block (&elseblock));
     }
@@ -5298,7 +5302,7 @@  gfc_array_allocate (gfc_se * se, gfc_expr * expr, tree status, tree errmsg,
 			  build_int_cst (TREE_TYPE (status), 0));
       gfc_add_expr_to_block (&se->pre,
 		 fold_build3_loc (input_location, COND_EXPR, void_type_node,
-				  gfc_likely (cond), set_descriptor,
+				  cond, set_descriptor,
 				  build_empty_stmt (input_location)));
     }
   else
diff --git a/gcc/fortran/trans-expr.c b/gcc/fortran/trans-expr.c
index 62ba932..9dda723 100644
--- a/gcc/fortran/trans-expr.c
+++ b/gcc/fortran/trans-expr.c
@@ -25,6 +25,7 @@  along with GCC; see the file COPYING3.  If not see
 #include "system.h"
 #include "coretypes.h"
 #include "tree.h"
+#include "predict.h"	/* For PRED_FAIL.  */
 #include "stringpool.h"
 #include "diagnostic-core.h"	/* For fatal_error.  */
 #include "langhooks.h"
@@ -4080,7 +4081,14 @@  gfc_conv_procedure_call (gfc_se * se, gfc_symbol * sym,
 	  if (ss->info->can_be_null_ref && ss->info->type != GFC_SS_REFERENCE)
 	    {
 	      tree descriptor_data;
+	      stmtblock_t err_block;
 
+	      gfc_init_block (&err_block);
+	      gfc_add_expr_to_block (&err_block,
+				     build_predict_expr (PRED_ZERO, NOT_TAKEN));
+	      gfc_add_expr_to_block (&err_block,
+				     fold_convert (TREE_TYPE (parmse.expr),
+						   null_pointer_node));
 	      descriptor_data = ss->info->data.array.data;
 	      tmp = fold_build2_loc (input_location, EQ_EXPR, boolean_type_node,
 				     descriptor_data,
@@ -4088,11 +4096,8 @@  gfc_conv_procedure_call (gfc_se * se, gfc_symbol * sym,
 						   null_pointer_node));
 	      parmse.expr
 		= fold_build3_loc (input_location, COND_EXPR,
-				   TREE_TYPE (parmse.expr),
-				   gfc_unlikely (tmp),
-				   fold_convert (TREE_TYPE (parmse.expr),
-						 null_pointer_node),
-				   parmse.expr);
+				   TREE_TYPE (parmse.expr), tmp,
+				   gfc_finish_block (&err_block), parmse.expr);
 	    }
 
 	  /* The scalarizer does not repackage the reference to a class
diff --git a/gcc/fortran/trans-intrinsic.c b/gcc/fortran/trans-intrinsic.c
index 4acdc8d..f7df6b7 100644
--- a/gcc/fortran/trans-intrinsic.c
+++ b/gcc/fortran/trans-intrinsic.c
@@ -1196,9 +1196,6 @@  trans_image_index (gfc_se * se, gfc_expr *expr)
 				       boolean_type_node, invalid_bound, cond);
     }
 
-  invalid_bound = gfc_unlikely (invalid_bound);
-
-
   /* See Fortran 2008, C.10 for the following algorithm.  */
 
   /* coindex = sub(corank) - lcobound(n).  */
diff --git a/gcc/fortran/trans-io.c b/gcc/fortran/trans-io.c
index 9b46a4e..fd5f50b 100644
--- a/gcc/fortran/trans-io.c
+++ b/gcc/fortran/trans-io.c
@@ -268,7 +268,6 @@  gfc_trans_io_runtime_check (tree cond, tree var, int error_code,
     }
   else
     {
-      cond = gfc_unlikely (cond);
       tmp = build3_v (COND_EXPR, cond, body, build_empty_stmt (input_location));
       gfc_add_expr_to_block (pblock, tmp);
     }
diff --git a/gcc/fortran/trans-stmt.c b/gcc/fortran/trans-stmt.c
index 4f21197..76d1124 100644
--- a/gcc/fortran/trans-stmt.c
+++ b/gcc/fortran/trans-stmt.c
@@ -24,6 +24,7 @@  along with GCC; see the file COPYING3.  If not see
 #include "system.h"
 #include "coretypes.h"
 #include "tree.h"
+#include "predict.h"	/* For PRED_FAIL.  */
 #include "stringpool.h"
 #include "gfortran.h"
 #include "flags.h"
@@ -5086,13 +5087,18 @@  gfc_trans_allocate (gfc_code * code)
       /* Error checking -- Note: ERRMSG only makes sense with STAT.  */
       if (code->expr1)
 	{
-	  tmp = build1_v (GOTO_EXPR, label_errmsg);
+	  stmtblock_t err_block;
+	  gfc_init_block (&err_block);
+	  gfc_add_expr_to_block (&err_block, build_predict_expr (PRED_FAIL,
+								 NOT_TAKEN));
+	  gfc_add_expr_to_block (&err_block,
+				 build1_v (GOTO_EXPR, label_errmsg));
 	  parm = fold_build2_loc (input_location, NE_EXPR,
 				  boolean_type_node, stat,
 				  build_int_cst (TREE_TYPE (stat), 0));
 	  tmp = fold_build3_loc (input_location, COND_EXPR, void_type_node,
-				 gfc_unlikely (parm), tmp,
-				     build_empty_stmt (input_location));
+				 parm, gfc_finish_block (&err_block),
+				 build_empty_stmt (input_location));
 	  gfc_add_expr_to_block (&block, tmp);
 	}
 
@@ -5448,13 +5454,16 @@  gfc_trans_deallocate (gfc_code *code)
 
       if (code->expr1)
 	{
-          tree cond;
-
+	  tree cond;
+	  stmtblock_t err_block;
+	  gfc_init_block (&err_block);
+	  gfc_add_expr_to_block (&err_block, build_predict_expr (PRED_FAIL,
+								 NOT_TAKEN));
+	  gfc_add_expr_to_block (&err_block, build1_v (GOTO_EXPR, label_errmsg));
 	  cond = fold_build2_loc (input_location, NE_EXPR, boolean_type_node, stat,
 				  build_int_cst (TREE_TYPE (stat), 0));
 	  tmp = fold_build3_loc (input_location, COND_EXPR, void_type_node,
-				 gfc_unlikely (cond),
-				 build1_v (GOTO_EXPR, label_errmsg),
+				 cond, gfc_finish_block (&err_block),
 				 build_empty_stmt (input_location));
 	  gfc_add_expr_to_block (&se.pre, tmp);
 	}
@@ -5493,8 +5502,7 @@  gfc_trans_deallocate (gfc_code *code)
       cond = fold_build2_loc (input_location, NE_EXPR, boolean_type_node, stat,
 			     build_int_cst (TREE_TYPE (stat), 0));
       tmp = fold_build3_loc (input_location, COND_EXPR, void_type_node,
-			     gfc_unlikely (cond), tmp,
-			     build_empty_stmt (input_location));
+			     cond, tmp, build_empty_stmt (input_location));
 
       gfc_add_expr_to_block (&block, tmp);
     }
diff --git a/gcc/fortran/trans.c b/gcc/fortran/trans.c
index 9e57058..580c981 100644
--- a/gcc/fortran/trans.c
+++ b/gcc/fortran/trans.c
@@ -25,6 +25,7 @@  along with GCC; see the file COPYING3.  If not see
 #include "gimple-expr.h"	/* For create_tmp_var_raw.  */
 #include "stringpool.h"
 #include "tree-iterator.h"
+#include "predict.h"	/* For PRED_WARN_CALL.  */
 #include "diagnostic-core.h"  /* For internal_error.  */
 #include "flags.h"
 #include "gfortran.h"
@@ -491,6 +492,12 @@  gfc_trans_runtime_check (bool error, bool once, tree cond, stmtblock_t * pblock,
   if (integer_zerop (cond))
     return;
 
+  /* Mark branch belonging to the warning condition as less likely.  Error is
+     handled via noreturn.  */
+  if (!error)
+    gfc_add_expr_to_block (pblock, build_predict_expr (PRED_WARN_CALL,
+						       NOT_TAKEN));
+
   if (once)
     {
        tmpvar = gfc_create_var (boolean_type_node, "print_warning");
@@ -526,7 +533,6 @@  gfc_trans_runtime_check (bool error, bool once, tree cond, stmtblock_t * pblock,
       else
 	cond = fold_convert (long_integer_type_node, cond);
 
-      cond = gfc_unlikely (cond);
       tmp = fold_build3_loc (where->lb->location, COND_EXPR, void_type_node,
 			     cond, body,
 			     build_empty_stmt (where->lb->location));
@@ -641,9 +647,18 @@  gfc_allocate_using_malloc (stmtblock_t * block, tree pointer,
 
   /* What to do in case of error.  */
   if (status != NULL_TREE)
-    on_error = fold_build2_loc (input_location, MODIFY_EXPR, status_type,
-			status, build_int_cst (status_type, LIBERROR_ALLOCATION));
+    {
+      stmtblock_t err_block;
+      gfc_init_block (&err_block);
+      gfc_add_expr_to_block (&err_block, build_predict_expr (PRED_OVERFLOW,
+							     NOT_TAKEN));
+      tmp = fold_build2_loc (input_location, MODIFY_EXPR, status_type, status,
+			     build_int_cst (status_type, LIBERROR_ALLOCATION));
+      gfc_add_expr_to_block (&err_block, tmp);
+      on_error = gfc_finish_block (&err_block);
+    }
   else
+    /* No need for a predict as noreturn is set.  */
     on_error = build_call_expr_loc (input_location, gfor_fndecl_os_error, 1,
 		    gfc_build_addr_expr (pchar_type_node,
 				 gfc_build_localized_cstring_const
@@ -653,7 +668,7 @@  gfc_allocate_using_malloc (stmtblock_t * block, tree pointer,
 				boolean_type_node, pointer,
 				build_int_cst (prvoid_type_node, 0));
   tmp = fold_build3_loc (input_location, COND_EXPR, void_type_node,
-			 gfc_unlikely (error_cond), on_error,
+			 error_cond, on_error,
 			 build_empty_stmt (input_location));
 
   gfc_add_expr_to_block (block, tmp);
@@ -748,9 +763,8 @@  gfc_allocate_allocatable (stmtblock_t * block, tree mem, tree size, tree token,
   if (TREE_TYPE (size) != TREE_TYPE (size_type_node))
     size = fold_convert (size_type_node, size);
 
-  null_mem = gfc_unlikely (fold_build2_loc (input_location, NE_EXPR,
-					    boolean_type_node, mem,
-					    build_int_cst (type, 0)));
+  null_mem = fold_build2_loc (input_location, NE_EXPR, boolean_type_node, mem,
+			      build_int_cst (type, 0));
 
   /* If mem is NULL, we call gfc_allocate_using_malloc or
      gfc_allocate_using_lib.  */
@@ -765,12 +779,17 @@  gfc_allocate_allocatable (stmtblock_t * block, tree mem, tree size, tree token,
 			      errmsg, errlen);
       if (status != NULL_TREE)
 	{
+          stmtblock_t err_block;
+
+          gfc_init_block (&err_block);
 	  TREE_USED (label_finish) = 1;
-	  tmp = build1_v (GOTO_EXPR, label_finish);
+	  gfc_add_expr_to_block (&err_block, build_predict_expr (PRED_FAIL,
+								 NOT_TAKEN));
+	  gfc_add_expr_to_block (&err_block,  build1_v (GOTO_EXPR, label_finish));
 	  cond = fold_build2_loc (input_location, NE_EXPR, boolean_type_node,
 				  status, build_zero_cst (TREE_TYPE (status)));
 	  tmp = fold_build3_loc (input_location, COND_EXPR, void_type_node,
-				 gfc_unlikely (cond), tmp,
+				 cond, gfc_finish_block (&err_block),
 				 build_empty_stmt (input_location));
 	  gfc_add_expr_to_block (&alloc_block, tmp);
 	}
@@ -802,10 +821,16 @@  gfc_allocate_allocatable (stmtblock_t * block, tree mem, tree size, tree token,
 
   if (status != NULL_TREE)
     {
+      stmtblock_t err_block;
       tree status_type = TREE_TYPE (status);
 
+      gfc_init_block (&err_block);
+      gfc_add_expr_to_block (&err_block, build_predict_expr (PRED_FAIL,
+							     NOT_TAKEN));
       error = fold_build2_loc (input_location, MODIFY_EXPR, status_type,
 	      status, build_int_cst (status_type, LIBERROR_ALLOCATION));
+      gfc_add_expr_to_block (&err_block, error);
+      error = gfc_finish_block (&err_block);
     }
 
   tmp = fold_build3_loc (input_location, COND_EXPR, void_type_node, null_mem,
@@ -1257,18 +1282,23 @@  gfc_deallocate_with_status (tree pointer, tree status, tree errmsg,
       if (status != NULL_TREE && !integer_zerop (status))
 	{
 	  /* We set STATUS to zero if it is present.  */
+	  stmtblock_t err_block;
 	  tree status_type = TREE_TYPE (TREE_TYPE (status));
 	  tree cond2;
 
 	  cond2 = fold_build2_loc (input_location, NE_EXPR, boolean_type_node,
 				   status,
 				   build_int_cst (TREE_TYPE (status), 0));
-	  tmp = fold_build2_loc (input_location, MODIFY_EXPR, status_type,
+	  gfc_init_block (&err_block);
+	  gfc_add_expr_to_block (&err_block, build_predict_expr (PRED_FAIL,
+								 NOT_TAKEN));
+	  gfc_add_expr_to_block (&err_block,
+		fold_build2_loc (input_location, MODIFY_EXPR, status_type,
 				 fold_build1_loc (input_location, INDIRECT_REF,
 						  status_type, status),
-				 build_int_cst (status_type, 0));
+				 build_int_cst (status_type, 0)));
 	  tmp = fold_build3_loc (input_location, COND_EXPR, void_type_node,
-				 gfc_unlikely (cond2), tmp,
+				 cond2, gfc_finish_block (&err_block),
 				 build_empty_stmt (input_location));
 	  gfc_add_expr_to_block (&non_null, tmp);
 	}
@@ -1320,14 +1350,19 @@  gfc_deallocate_with_status (tree pointer, tree status, tree errmsg,
 
       if (status != NULL_TREE)
 	{
+	  stmtblock_t err_block;
 	  tree stat = build_fold_indirect_ref_loc (input_location, status);
 
+	  gfc_init_block (&err_block);
+	  gfc_add_expr_to_block (&err_block, build_predict_expr (PRED_FAIL,
+								 NOT_TAKEN));
 	  TREE_USED (label_finish) = 1;
-	  tmp = build1_v (GOTO_EXPR, label_finish);
+	  gfc_add_expr_to_block (&err_block,
+				 build1_v (GOTO_EXPR, label_finish));
 	  cond2 = fold_build2_loc (input_location, NE_EXPR, boolean_type_node,
 				   stat, build_zero_cst (TREE_TYPE (stat)));
 	  tmp = fold_build3_loc (input_location, COND_EXPR, void_type_node,
-        			 gfc_unlikely (cond2), tmp,
+				 cond2, gfc_finish_block (&err_block),
 				 build_empty_stmt (input_location));
 	  gfc_add_expr_to_block (&non_null, tmp);
 	}
@@ -2010,37 +2045,3 @@  gfc_finish_wrapped_block (gfc_wrapped_block* block)
 
   return result;
 }
-
-
-/* Helper function for marking a boolean expression tree as unlikely.  */
-
-tree
-gfc_unlikely (tree cond)
-{
-  tree tmp;
-
-  cond = fold_convert (long_integer_type_node, cond);
-  tmp = build_zero_cst (long_integer_type_node);
-  cond = build_call_expr_loc (input_location,
-			      builtin_decl_explicit (BUILT_IN_EXPECT),
-			      2, cond, tmp);
-  cond = fold_convert (boolean_type_node, cond);
-  return cond;
-}
-
-
-/* Helper function for marking a boolean expression tree as likely.  */
-
-tree
-gfc_likely (tree cond)
-{
-  tree tmp;
-
-  cond = fold_convert (long_integer_type_node, cond);
-  tmp = build_one_cst (long_integer_type_node);
-  cond = build_call_expr_loc (input_location,
-			      builtin_decl_explicit (BUILT_IN_EXPECT),
-			      2, cond, tmp);
-  cond = fold_convert (boolean_type_node, cond);
-  return cond;
-}
diff --git a/gcc/fortran/trans.h b/gcc/fortran/trans.h
index 424ce7a..2ed006f 100644
--- a/gcc/fortran/trans.h
+++ b/gcc/fortran/trans.h
@@ -577,10 +577,6 @@  void gfc_generate_constructors (void);
 /* Get the string length of an array constructor.  */
 bool get_array_ctor_strlen (stmtblock_t *, gfc_constructor_base, tree *);
 
-/* Mark a condition as likely or unlikely.  */
-tree gfc_likely (tree);
-tree gfc_unlikely (tree);
-
 /* Generate a runtime error call.  */
 tree gfc_trans_runtime_error (bool, locus*, const char*, ...);
 
diff --git a/gcc/predict.def b/gcc/predict.def
index 2ce135c..ca58b26 100644
--- a/gcc/predict.def
+++ b/gcc/predict.def
@@ -133,3 +133,19 @@  DEF_PREDICTOR (PRED_HOT_LABEL, "hot label", HITRATE (85), 0)
 /* Branches to cold labels are extremely unlikely.  */
 DEF_PREDICTOR (PRED_COLD_LABEL, "cold label", PROB_VERY_LIKELY,
 	       PRED_FLAG_FIRST_MATCH)
+
+
+/* The following are used in Fortran. */
+
+/* Branch leading to a run-time warning message are less likely.  */
+DEF_PREDICTOR (PRED_WARN_CALL, "warn call", HITRATE (70), 0)
+
+/* Branch leading to an failure status are unlikely.  */
+DEF_PREDICTOR (PRED_FAIL, "fail", HITRATE(80), 0)
+
+/* Branch belonging to a zero-sized array or absent argument are unlikely.  */
+DEF_PREDICTOR (PRED_ZERO, "zero", HITRATE(80), 0)
+
+/* Branch leading to an overflow are extremely unlikely.  */
+DEF_PREDICTOR (PRED_OVERFLOW, "overflow", PROB_VERY_LIKELY,
+	       PRED_FLAG_FIRST_MATCH)