diff mbox series

[v1] tree-ssa-sink: Improve code sinking pass.

Message ID acdb8e2b-7c95-0fd2-2189-51f661f15bc5@linux.ibm.com
State New
Headers show
Series [v1] tree-ssa-sink: Improve code sinking pass. | expand

Commit Message

Ajit Agarwal May 18, 2023, 7:14 a.m. UTC
Hello All:

This patch improves code sinking pass to sink statements before call to reduce
register pressure.
Review comments are incorporated.

Bootstrapped and regtested on powerpc64-linux-gnu.

Thanks & Regards
Ajit


tree-ssa-sink: Improve code sinking pass.

Code Sinking sinks the blocks after call. This increases
register pressure for callee-saved registers. Improves
code sinking before call in the use blocks or immediate
dominator of use blocks.

2023-05-18  Ajit Kumar Agarwal  <aagarwa1@linux.ibm.com>

gcc/ChangeLog:

	* tree-ssa-sink.cc (statement_sink_location): Modifed to
	move statements before calls.
	(block_call_p): New function.
	(def_use_same_block): New function.
	(select_best_block): Add heuristics to select the best
	blocks in the immediate post dominator.

gcc/testsuite/ChangeLog:

	* gcc.dg/tree-ssa/ssa-sink-20.c: New testcase.
	* gcc.dg/tree-ssa/ssa-sink-21.c: New testcase.
---
 gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-20.c |  16 ++
 gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-21.c |  20 +++
 gcc/tree-ssa-sink.cc                        | 159 ++++++++++++++++++--
 3 files changed, 185 insertions(+), 10 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-20.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-21.c

Comments

Segher Boessenkool May 18, 2023, 4:57 p.m. UTC | #1
Hi!

On Thu, May 18, 2023 at 12:44:28PM +0530, Ajit Agarwal wrote:
> This patch improves code sinking pass to sink statements before call to reduce
> register pressure.

An example would be useful :-)

> 	* tree-ssa-sink.cc (statement_sink_location): Modifed to
> 	move statements before calls.

Spello ("modified").  But, you should write in the imperative mood
anyway, so "modify".  But, every change is a modification, so do without
the fluff altogether?  "Move statements before calls."

> 	(block_call_p): New function.
> 	(def_use_same_block): New function.
> 	(select_best_block): Add heuristics to select the best
> 	blocks in the immediate post dominator.

Please don't break lines
early
it makes things
harder to
read.

:-)

> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-20.c
> @@ -0,0 +1,16 @@
> +/* { dg-do compile } */

This is the default, you can just leave it out.

> +/* { dg-options "-O2 -fdump-tree-sink -fdump-tree-optimized -fdump-tree-sink-stats" } */

You don't need -fdump-tree-sink without options since you have
-fdump-tree-sink-stats as well.

> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-21.c
> @@ -0,0 +1,20 @@
> +/* { dg-do compile } */ 
> +/* { dg-options "-O2 -fdump-tree-sink-stats -fdump-tree-sink-stats" } */

You don't need to say it twice either :-)

> +/* Return TRUE if immediate uses of the defs in
> +   USE occur in the same block as USE, FALSE otherwise.  */
> +
> +bool
> +def_use_same_block (gimple *stmt)
> +{

There is no function parameter "use" here?  STMT instead?

> +  use_operand_p use_p;
> +  def_operand_p def_p;

Neither of these is a predicate.  Lose the _p please?

> +	  if (use_p
> +	      && (gimple_bb (USE_STMT (use_p)) == gimple_bb (stmt)))

Please fit this on one line.  And no parens around random things please.

> +/* Return TRUE if the block has only calls, FALSE otherwise. */
> +
> +bool
> +block_call_p (basic_block bb)

> +	   /* We have already seen a call.  */
> +	   if (is_call)
> +	     return false;
> +
> +	   if (is_gimple_call (stmt))
> +	     is_call = true;
> +	   else
> +	     return false;

> +  if (is_call && i == 1)
> +    return true;
> +
> +  return false;

This doesn't do what the function comment says?  It is very important
that function comments say exactly what a function does.  It can perhaps
leave out some details, but it should be correct by and large.

> +		/* Update sinking point as stmt before call if the sinking block
> +		   has only calls. Otherwise update sinking point as the use
> +		   stmt. */

(two spaces after full stop, twice)

> +		if (gsi_stmt (gsi) == use
> +		    && !is_gimple_call (last_stmt)
> +		    && (gimple_code (last_stmt) != GIMPLE_SWITCH)
> +		    && (gimple_code (last_stmt) != GIMPLE_COND)
> +		    && (gimple_code (last_stmt) != GIMPLE_GOTO)
> +		    && (!gimple_vdef (use) || !def_use_same_block (def_stmt)))

Please no unnecessary parens.  At first I didn't notice the last line
here *does* need it!


Segher
Richard Biener May 22, 2023, 12:56 p.m. UTC | #2
On Thu, May 18, 2023 at 9:14 AM Ajit Agarwal <aagarwa1@linux.ibm.com> wrote:
>
> Hello All:
>
> This patch improves code sinking pass to sink statements before call to reduce
> register pressure.
> Review comments are incorporated.
>
> Bootstrapped and regtested on powerpc64-linux-gnu.
>
> Thanks & Regards
> Ajit
>
>
> tree-ssa-sink: Improve code sinking pass.
>
> Code Sinking sinks the blocks after call. This increases
> register pressure for callee-saved registers. Improves
> code sinking before call in the use blocks or immediate
> dominator of use blocks.
>
> 2023-05-18  Ajit Kumar Agarwal  <aagarwa1@linux.ibm.com>
>
> gcc/ChangeLog:
>
>         * tree-ssa-sink.cc (statement_sink_location): Modifed to
>         move statements before calls.
>         (block_call_p): New function.
>         (def_use_same_block): New function.
>         (select_best_block): Add heuristics to select the best
>         blocks in the immediate post dominator.
>
> gcc/testsuite/ChangeLog:
>
>         * gcc.dg/tree-ssa/ssa-sink-20.c: New testcase.
>         * gcc.dg/tree-ssa/ssa-sink-21.c: New testcase.
> ---
>  gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-20.c |  16 ++
>  gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-21.c |  20 +++
>  gcc/tree-ssa-sink.cc                        | 159 ++++++++++++++++++--
>  3 files changed, 185 insertions(+), 10 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-20.c
>  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-21.c
>
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-20.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-20.c
> new file mode 100644
> index 00000000000..716bc1f9257
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-20.c
> @@ -0,0 +1,16 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fdump-tree-sink -fdump-tree-optimized -fdump-tree-sink-stats" } */
> +
> +void bar();
> +int j;
> +void foo(int a, int b, int c, int d, int e, int f)
> +{
> +  int l;
> +  l = a + b + c + d +e + f;
> +  if (a != 5)
> +    {
> +      bar();
> +      j = l;
> +    }
> +}
> +/* { dg-final { scan-tree-dump-times "Sunk statements: 5" 1 "sink" } } */

this doesn't verify the place we sink to?

> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-21.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-21.c
> new file mode 100644
> index 00000000000..ff41e2ea8ae
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-21.c
> @@ -0,0 +1,20 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fdump-tree-sink-stats -fdump-tree-sink-stats" } */
> +
> +void bar();
> +int j, x;
> +void foo(int a, int b, int c, int d, int e, int f)
> +{
> +  int l;
> +  l = a + b + c + d +e + f;
> +  if (a != 5)
> +    {
> +      bar();
> +      if (b != 3)
> +        x = 3;
> +      else
> +        x = 5;
> +      j = l;
> +    }
> +}
> +/* { dg-final { scan-tree-dump-times "Sunk statements: 5" 1 "sink" } } */

likewise.  So both tests already pass before the patch?

> diff --git a/gcc/tree-ssa-sink.cc b/gcc/tree-ssa-sink.cc
> index 87b1d40c174..76556e7795b 100644
> --- a/gcc/tree-ssa-sink.cc
> +++ b/gcc/tree-ssa-sink.cc
> @@ -171,6 +171,72 @@ nearest_common_dominator_of_uses (def_operand_p def_p, bool *debug_stmts)
>    return commondom;
>  }
>
> +/* Return TRUE if immediate uses of the defs in
> +   USE occur in the same block as USE, FALSE otherwise.  */
> +
> +bool
> +def_use_same_block (gimple *stmt)
> +{
> +  use_operand_p use_p;
> +  def_operand_p def_p;
> +  imm_use_iterator imm_iter;
> +  ssa_op_iter iter;
> +
> +  FOR_EACH_SSA_DEF_OPERAND (def_p, stmt, iter, SSA_OP_DEF)
> +    {
> +      FOR_EACH_IMM_USE_FAST (use_p, imm_iter, DEF_FROM_PTR (def_p))
> +       {
> +         if (is_gimple_debug (USE_STMT (use_p)))
> +           continue;
> +
> +         if (use_p

use_p is never null

> +             && (gimple_bb (USE_STMT (use_p)) == gimple_bb (stmt)))
> +           return true;

the function behavior is obviously odd ...

> +       }
> +     }
> +  return false;
> +}
> +
> +/* Return TRUE if the block has only calls, FALSE otherwise. */
> +
> +bool
> +block_call_p (basic_block bb)
> +{
> +  int i = 0;
> +  bool is_call = false;
> +  gimple_stmt_iterator gsi = gsi_last_bb (bb);
> +  gimple *last_stmt = gsi_stmt (gsi);
> +
> +  if (last_stmt && gimple_code (last_stmt) == GIMPLE_COND)
> +    {
> +      if (!gsi_end_p (gsi))
> +       gsi_prev (&gsi);
> +
> +       for (; !gsi_end_p (gsi);)
> +        {
> +          gimple *stmt = gsi_stmt (gsi);
> +
> +          /* We have already seen a call.  */
> +          if (is_call)
> +            return false;

Likewise.  Do you want to check whether a block has
a single stmt and that is a call and that is followed by
a condition?  It looks like a very convoluted way to write this.

> +
> +          if (is_gimple_call (stmt))
> +            is_call = true;
> +          else
> +            return false;
> +
> +          if (!gsi_end_p (gsi))
> +            gsi_prev (&gsi);
> +
> +           ++i;
> +       }
> +     }
> +  if (is_call && i == 1)
> +    return true;
> +
> +  return false;
> +}
> +
>  /* Given EARLY_BB and LATE_BB, two blocks in a path through the dominator
>     tree, return the best basic block between them (inclusive) to place
>     statements.
> @@ -190,7 +256,8 @@ nearest_common_dominator_of_uses (def_operand_p def_p, bool *debug_stmts)
>  static basic_block
>  select_best_block (basic_block early_bb,
>                    basic_block late_bb,
> -                  gimple *stmt)
> +                  gimple *stmt,
> +                  gimple *use)

please update the function comment

>  {
>    basic_block best_bb = late_bb;
>    basic_block temp_bb = late_bb;
> @@ -230,14 +297,47 @@ select_best_block (basic_block early_bb,
>        if (threshold > 100)
>         threshold = 100;
>      }
> -
>    /* If BEST_BB is at the same nesting level, then require it to have
>       significantly lower execution frequency to avoid gratuitous movement.  */
>    if (bb_loop_depth (best_bb) == bb_loop_depth (early_bb)
>        /* If result of comparsion is unknown, prefer EARLY_BB.
>          Thus use !(...>=..) rather than (...<...)  */
>        && !(best_bb->count * 100 >= early_bb->count * threshold))
> -    return best_bb;
> +    {
> +      basic_block new_best_bb = get_immediate_dominator (CDI_DOMINATORS, best_bb);
> +      /* Return best_bb if def and use are in same block otherwise new_best_bb.
> +
> +        Things to consider:
> +
> +          new_best_bb is not equal to best_bb and early_bb.
> +
> +          stmt is not call.
> +
> +          new_best_bb doesnt have any phis.
> +
> +          use basic block is not equal to early_bb.
> +
> +          use basic block post dominates to new_best_bb.
> +
> +          new_best_bb dominates early_bb. */
> +      if (new_best_bb && use
> +         && (new_best_bb != best_bb)
> +         && (new_best_bb != early_bb)
> +         && !is_gimple_call (stmt)
> +         && gsi_end_p (gsi_start_phis (new_best_bb))
> +         && (gimple_bb (use) != early_bb)
> +         && !is_gimple_call (use)
> +         && dominated_by_p (CDI_POST_DOMINATORS, new_best_bb, gimple_bb(use))
> +         && dominated_by_p (CDI_DOMINATORS, new_best_bb, early_bb)
> +         && block_call_p (new_best_bb))
> +       {
> +         if (def_use_same_block (use))
> +           return best_bb;

given the odd implementation of the predicates this matches very very
specific cases.

Consider

 if (..)
  {
    foo();
    bar();
    ... = l;
  }

and C++ where foo and bar might throw.  You then likely want to sink
before foo ().

What's the reason to only consider blocks with exactly 'call; cond;' ?

> +
> +         return new_best_bb;
> +       }
> +       return best_bb;
> +    }
>
>    /* No better block found, so return EARLY_BB, which happens to be the
>       statement's original block.  */
> @@ -439,7 +539,7 @@ statement_sink_location (gimple *stmt, basic_block frombb,
>        if (!dominated_by_p (CDI_DOMINATORS, commondom, frombb))
>         return false;
>
> -      commondom = select_best_block (frombb, commondom, stmt);
> +      commondom = select_best_block (frombb, commondom, stmt, NULL);
>
>        if (commondom == frombb)
>         return false;
> @@ -456,19 +556,58 @@ statement_sink_location (gimple *stmt, basic_block frombb,
>             continue;
>           break;
>         }
> +
>        use = USE_STMT (one_use);
>
>        if (gimple_code (use) != GIMPLE_PHI)
>         {
> -         sinkbb = select_best_block (frombb, gimple_bb (use), stmt);
> +         sinkbb = select_best_block (frombb, gimple_bb (use), stmt, use);
>
>           if (sinkbb == frombb)
>             return false;
>
> -         if (sinkbb == gimple_bb (use))
> -           *togsi = gsi_for_stmt (use);
> -         else
> -           *togsi = gsi_after_labels (sinkbb);
> +          gimple *def_stmt = SSA_NAME_DEF_STMT (DEF_FROM_PTR (def_p));
> +
> +          if ((gimple_bb (def_stmt) == gimple_bb (use))
> +               && (gimple_bb (use) != sinkbb))
> +            sinkbb = gimple_bb (use);
> +
> +           if (sinkbb == gimple_bb (use))
> +             {
> +               gimple_stmt_iterator gsi = gsi_last_bb (sinkbb);
> +               gimple *def_stmt = SSA_NAME_DEF_STMT (DEF_FROM_PTR (def_p));
> +               gimple *last_stmt = gsi_stmt (gsi);
> +
> +               /* Update sinking point as stmt before call if the sinking block
> +                  has only calls. Otherwise update sinking point as the use
> +                  stmt. */
> +               if (gsi_stmt (gsi) == use
> +                   && !is_gimple_call (last_stmt)
> +                   && (gimple_code (last_stmt) != GIMPLE_SWITCH)
> +                   && (gimple_code (last_stmt) != GIMPLE_COND)
> +                   && (gimple_code (last_stmt) != GIMPLE_GOTO)
> +                   && (!gimple_vdef (use) || !def_use_same_block (def_stmt)))
> +                 {
> +                   if (!gsi_end_p (gsi))
> +                     gsi_prev (&gsi);
> +
> +                   gimple *stmt = gsi_stmt (gsi);
> +
> +                   if (!gsi_end_p (gsi))
> +                     gsi_prev (&gsi);
> +
> +                   if (gsi_end_p (gsi) && stmt && is_gimple_call (stmt)
> +                       && gsi_end_p (gsi_start_phis (sinkbb))
> +                       && !is_gimple_call (def_stmt))
> +                     *togsi = gsi_for_stmt (stmt);
> +                   else
> +                     *togsi = gsi_for_stmt (use);
> +                  }
> +               else
> +                 *togsi = gsi_for_stmt(use);
> +              }
> +            else
> +               *togsi = gsi_after_labels (sinkbb);

This is very convoluted.  I think that in the end you want to compute (once) the
position of the first call in each block.  Since we're waking the CFG backwards
in post-dominator order this information can be gathered during this walk.
This would determine the location to sink to iff the use stmt is dominated by
this location (you can for example use gimple_uid to mark stmts before it).

The alternative is to simply always sink to the start of blocks even for the
use stmt block in case that has a call before the use (but you still need to
efficiently compute that).

Richard.

>
>           return true;
>         }
> @@ -480,7 +619,7 @@ statement_sink_location (gimple *stmt, basic_block frombb,
>    if (!sinkbb)
>      return false;
>
> -  sinkbb = select_best_block (frombb, sinkbb, stmt);
> +  sinkbb = select_best_block (frombb, sinkbb, stmt, NULL);
>    if (!sinkbb || sinkbb == frombb)
>      return false;
>
> --
> 2.31.1
>
Ajit Agarwal May 30, 2023, 5:06 a.m. UTC | #3
Hello Richard:

On 22/05/23 6:26 pm, Richard Biener wrote:
> On Thu, May 18, 2023 at 9:14 AM Ajit Agarwal <aagarwa1@linux.ibm.com> wrote:
>>
>> Hello All:
>>
>> This patch improves code sinking pass to sink statements before call to reduce
>> register pressure.
>> Review comments are incorporated.
>>
>> Bootstrapped and regtested on powerpc64-linux-gnu.
>>
>> Thanks & Regards
>> Ajit
>>
>>
>> tree-ssa-sink: Improve code sinking pass.
>>
>> Code Sinking sinks the blocks after call. This increases
>> register pressure for callee-saved registers. Improves
>> code sinking before call in the use blocks or immediate
>> dominator of use blocks.
>>
>> 2023-05-18  Ajit Kumar Agarwal  <aagarwa1@linux.ibm.com>
>>
>> gcc/ChangeLog:
>>
>>         * tree-ssa-sink.cc (statement_sink_location): Modifed to
>>         move statements before calls.
>>         (block_call_p): New function.
>>         (def_use_same_block): New function.
>>         (select_best_block): Add heuristics to select the best
>>         blocks in the immediate post dominator.
>>
>> gcc/testsuite/ChangeLog:
>>
>>         * gcc.dg/tree-ssa/ssa-sink-20.c: New testcase.
>>         * gcc.dg/tree-ssa/ssa-sink-21.c: New testcase.
>> ---
>>  gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-20.c |  16 ++
>>  gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-21.c |  20 +++
>>  gcc/tree-ssa-sink.cc                        | 159 ++++++++++++++++++--
>>  3 files changed, 185 insertions(+), 10 deletions(-)
>>  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-20.c
>>  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-21.c
>>
>> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-20.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-20.c
>> new file mode 100644
>> index 00000000000..716bc1f9257
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-20.c
>> @@ -0,0 +1,16 @@
>> +/* { dg-do compile } */
>> +/* { dg-options "-O2 -fdump-tree-sink -fdump-tree-optimized -fdump-tree-sink-stats" } */
>> +
>> +void bar();
>> +int j;
>> +void foo(int a, int b, int c, int d, int e, int f)
>> +{
>> +  int l;
>> +  l = a + b + c + d +e + f;
>> +  if (a != 5)
>> +    {
>> +      bar();
>> +      j = l;
>> +    }
>> +}
>> +/* { dg-final { scan-tree-dump-times "Sunk statements: 5" 1 "sink" } } */
> 
> this doesn't verify the place we sink to?
>

I am not sure how to verify the place we sink to with dg-final.
 
>> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-21.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-21.c
>> new file mode 100644
>> index 00000000000..ff41e2ea8ae
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-21.c
>> @@ -0,0 +1,20 @@
>> +/* { dg-do compile } */
>> +/* { dg-options "-O2 -fdump-tree-sink-stats -fdump-tree-sink-stats" } */
>> +
>> +void bar();
>> +int j, x;
>> +void foo(int a, int b, int c, int d, int e, int f)
>> +{
>> +  int l;
>> +  l = a + b + c + d +e + f;
>> +  if (a != 5)
>> +    {
>> +      bar();
>> +      if (b != 3)
>> +        x = 3;
>> +      else
>> +        x = 5;
>> +      j = l;
>> +    }
>> +}
>> +/* { dg-final { scan-tree-dump-times "Sunk statements: 5" 1 "sink" } } */
> 
> likewise.  So both tests already pass before the patch?
> 
>> diff --git a/gcc/tree-ssa-sink.cc b/gcc/tree-ssa-sink.cc
>> index 87b1d40c174..76556e7795b 100644
>> --- a/gcc/tree-ssa-sink.cc
>> +++ b/gcc/tree-ssa-sink.cc
>> @@ -171,6 +171,72 @@ nearest_common_dominator_of_uses (def_operand_p def_p, bool *debug_stmts)
>>    return commondom;
>>  }
>>
>> +/* Return TRUE if immediate uses of the defs in
>> +   USE occur in the same block as USE, FALSE otherwise.  */
>> +
>> +bool
>> +def_use_same_block (gimple *stmt)
>> +{
>> +  use_operand_p use_p;
>> +  def_operand_p def_p;
>> +  imm_use_iterator imm_iter;
>> +  ssa_op_iter iter;
>> +
>> +  FOR_EACH_SSA_DEF_OPERAND (def_p, stmt, iter, SSA_OP_DEF)
>> +    {
>> +      FOR_EACH_IMM_USE_FAST (use_p, imm_iter, DEF_FROM_PTR (def_p))
>> +       {
>> +         if (is_gimple_debug (USE_STMT (use_p)))
>> +           continue;
>> +
>> +         if (use_p
> 
> use_p is never null
> 
>> +             && (gimple_bb (USE_STMT (use_p)) == gimple_bb (stmt)))
>> +           return true;
> 
> the function behavior is obviously odd ...
> 
>> +       }
>> +     }
>> +  return false;
>> +}
>> +
>> +/* Return TRUE if the block has only calls, FALSE otherwise. */
>> +
>> +bool
>> +block_call_p (basic_block bb)
>> +{
>> +  int i = 0;
>> +  bool is_call = false;
>> +  gimple_stmt_iterator gsi = gsi_last_bb (bb);
>> +  gimple *last_stmt = gsi_stmt (gsi);
>> +
>> +  if (last_stmt && gimple_code (last_stmt) == GIMPLE_COND)
>> +    {
>> +      if (!gsi_end_p (gsi))
>> +       gsi_prev (&gsi);
>> +
>> +       for (; !gsi_end_p (gsi);)
>> +        {
>> +          gimple *stmt = gsi_stmt (gsi);
>> +
>> +          /* We have already seen a call.  */
>> +          if (is_call)
>> +            return false;
> 
> Likewise.  Do you want to check whether a block has
> a single stmt and that is a call and that is followed by
> a condition?  It looks like a very convoluted way to write this.
> 
>> +
>> +          if (is_gimple_call (stmt))
>> +            is_call = true;
>> +          else
>> +            return false;
>> +
>> +          if (!gsi_end_p (gsi))
>> +            gsi_prev (&gsi);
>> +
>> +           ++i;
>> +       }
>> +     }
>> +  if (is_call && i == 1)
>> +    return true;
>> +
>> +  return false;
>> +}
>> +
>>  /* Given EARLY_BB and LATE_BB, two blocks in a path through the dominator
>>     tree, return the best basic block between them (inclusive) to place
>>     statements.
>> @@ -190,7 +256,8 @@ nearest_common_dominator_of_uses (def_operand_p def_p, bool *debug_stmts)
>>  static basic_block
>>  select_best_block (basic_block early_bb,
>>                    basic_block late_bb,
>> -                  gimple *stmt)
>> +                  gimple *stmt,
>> +                  gimple *use)
> 
> please update the function comment
> 
>>  {
>>    basic_block best_bb = late_bb;
>>    basic_block temp_bb = late_bb;
>> @@ -230,14 +297,47 @@ select_best_block (basic_block early_bb,
>>        if (threshold > 100)
>>         threshold = 100;
>>      }
>> -
>>    /* If BEST_BB is at the same nesting level, then require it to have
>>       significantly lower execution frequency to avoid gratuitous movement.  */
>>    if (bb_loop_depth (best_bb) == bb_loop_depth (early_bb)
>>        /* If result of comparsion is unknown, prefer EARLY_BB.
>>          Thus use !(...>=..) rather than (...<...)  */
>>        && !(best_bb->count * 100 >= early_bb->count * threshold))
>> -    return best_bb;
>> +    {
>> +      basic_block new_best_bb = get_immediate_dominator (CDI_DOMINATORS, best_bb);
>> +      /* Return best_bb if def and use are in same block otherwise new_best_bb.
>> +
>> +        Things to consider:
>> +
>> +          new_best_bb is not equal to best_bb and early_bb.
>> +
>> +          stmt is not call.
>> +
>> +          new_best_bb doesnt have any phis.
>> +
>> +          use basic block is not equal to early_bb.
>> +
>> +          use basic block post dominates to new_best_bb.
>> +
>> +          new_best_bb dominates early_bb. */
>> +      if (new_best_bb && use
>> +         && (new_best_bb != best_bb)
>> +         && (new_best_bb != early_bb)
>> +         && !is_gimple_call (stmt)
>> +         && gsi_end_p (gsi_start_phis (new_best_bb))
>> +         && (gimple_bb (use) != early_bb)
>> +         && !is_gimple_call (use)
>> +         && dominated_by_p (CDI_POST_DOMINATORS, new_best_bb, gimple_bb(use))
>> +         && dominated_by_p (CDI_DOMINATORS, new_best_bb, early_bb)
>> +         && block_call_p (new_best_bb))
>> +       {
>> +         if (def_use_same_block (use))
>> +           return best_bb;
> 
> given the odd implementation of the predicates this matches very very
> specific cases.
> 
> Consider
> 
>  if (..)
>   {
>     foo();
>     bar();
>     ... = l;
>   }
> 
> and C++ where foo and bar might throw.  You then likely want to sink
> before foo ().
> 
> What's the reason to only consider blocks with exactly 'call; cond;' ?
> 
>> +
>> +         return new_best_bb;
>> +       }
>> +       return best_bb;
>> +    }
>>
>>    /* No better block found, so return EARLY_BB, which happens to be the
>>       statement's original block.  */
>> @@ -439,7 +539,7 @@ statement_sink_location (gimple *stmt, basic_block frombb,
>>        if (!dominated_by_p (CDI_DOMINATORS, commondom, frombb))
>>         return false;
>>
>> -      commondom = select_best_block (frombb, commondom, stmt);
>> +      commondom = select_best_block (frombb, commondom, stmt, NULL);
>>
>>        if (commondom == frombb)
>>         return false;
>> @@ -456,19 +556,58 @@ statement_sink_location (gimple *stmt, basic_block frombb,
>>             continue;
>>           break;
>>         }
>> +
>>        use = USE_STMT (one_use);
>>
>>        if (gimple_code (use) != GIMPLE_PHI)
>>         {
>> -         sinkbb = select_best_block (frombb, gimple_bb (use), stmt);
>> +         sinkbb = select_best_block (frombb, gimple_bb (use), stmt, use);
>>
>>           if (sinkbb == frombb)
>>             return false;
>>
>> -         if (sinkbb == gimple_bb (use))
>> -           *togsi = gsi_for_stmt (use);
>> -         else
>> -           *togsi = gsi_after_labels (sinkbb);
>> +          gimple *def_stmt = SSA_NAME_DEF_STMT (DEF_FROM_PTR (def_p));
>> +
>> +          if ((gimple_bb (def_stmt) == gimple_bb (use))
>> +               && (gimple_bb (use) != sinkbb))
>> +            sinkbb = gimple_bb (use);
>> +
>> +           if (sinkbb == gimple_bb (use))
>> +             {
>> +               gimple_stmt_iterator gsi = gsi_last_bb (sinkbb);
>> +               gimple *def_stmt = SSA_NAME_DEF_STMT (DEF_FROM_PTR (def_p));
>> +               gimple *last_stmt = gsi_stmt (gsi);
>> +
>> +               /* Update sinking point as stmt before call if the sinking block
>> +                  has only calls. Otherwise update sinking point as the use
>> +                  stmt. */
>> +               if (gsi_stmt (gsi) == use
>> +                   && !is_gimple_call (last_stmt)
>> +                   && (gimple_code (last_stmt) != GIMPLE_SWITCH)
>> +                   && (gimple_code (last_stmt) != GIMPLE_COND)
>> +                   && (gimple_code (last_stmt) != GIMPLE_GOTO)
>> +                   && (!gimple_vdef (use) || !def_use_same_block (def_stmt)))
>> +                 {
>> +                   if (!gsi_end_p (gsi))
>> +                     gsi_prev (&gsi);
>> +
>> +                   gimple *stmt = gsi_stmt (gsi);
>> +
>> +                   if (!gsi_end_p (gsi))
>> +                     gsi_prev (&gsi);
>> +
>> +                   if (gsi_end_p (gsi) && stmt && is_gimple_call (stmt)
>> +                       && gsi_end_p (gsi_start_phis (sinkbb))
>> +                       && !is_gimple_call (def_stmt))
>> +                     *togsi = gsi_for_stmt (stmt);
>> +                   else
>> +                     *togsi = gsi_for_stmt (use);
>> +                  }
>> +               else
>> +                 *togsi = gsi_for_stmt(use);
>> +              }
>> +            else
>> +               *togsi = gsi_after_labels (sinkbb);
> 
> This is very convoluted.  I think that in the end you want to compute (once) the
> position of the first call in each block.  Since we're waking the CFG backwards
> in post-dominator order this information can be gathered during this walk.
> This would determine the location to sink to iff the use stmt is dominated by
> this location (you can for example use gimple_uid to mark stmts before it).
> 
> The alternative is to simply always sink to the start of blocks even for the
> use stmt block in case that has a call before the use (but you still need to
> efficiently compute that).
>

Incorporated the above comments and sent a separate patch.

Thanks & Regards
Ajit
 
> Richard.
> 
>>
>>           return true;
>>         }
>> @@ -480,7 +619,7 @@ statement_sink_location (gimple *stmt, basic_block frombb,
>>    if (!sinkbb)
>>      return false;
>>
>> -  sinkbb = select_best_block (frombb, sinkbb, stmt);
>> +  sinkbb = select_best_block (frombb, sinkbb, stmt, NULL);
>>    if (!sinkbb || sinkbb == frombb)
>>      return false;
>>
>> --
>> 2.31.1
>>
Richard Biener May 30, 2023, 7:04 a.m. UTC | #4
On Tue, May 30, 2023 at 7:06 AM Ajit Agarwal <aagarwa1@linux.ibm.com> wrote:
>
> Hello Richard:
>
> On 22/05/23 6:26 pm, Richard Biener wrote:
> > On Thu, May 18, 2023 at 9:14 AM Ajit Agarwal <aagarwa1@linux.ibm.com> wrote:
> >>
> >> Hello All:
> >>
> >> This patch improves code sinking pass to sink statements before call to reduce
> >> register pressure.
> >> Review comments are incorporated.
> >>
> >> Bootstrapped and regtested on powerpc64-linux-gnu.
> >>
> >> Thanks & Regards
> >> Ajit
> >>
> >>
> >> tree-ssa-sink: Improve code sinking pass.
> >>
> >> Code Sinking sinks the blocks after call. This increases
> >> register pressure for callee-saved registers. Improves
> >> code sinking before call in the use blocks or immediate
> >> dominator of use blocks.
> >>
> >> 2023-05-18  Ajit Kumar Agarwal  <aagarwa1@linux.ibm.com>
> >>
> >> gcc/ChangeLog:
> >>
> >>         * tree-ssa-sink.cc (statement_sink_location): Modifed to
> >>         move statements before calls.
> >>         (block_call_p): New function.
> >>         (def_use_same_block): New function.
> >>         (select_best_block): Add heuristics to select the best
> >>         blocks in the immediate post dominator.
> >>
> >> gcc/testsuite/ChangeLog:
> >>
> >>         * gcc.dg/tree-ssa/ssa-sink-20.c: New testcase.
> >>         * gcc.dg/tree-ssa/ssa-sink-21.c: New testcase.
> >> ---
> >>  gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-20.c |  16 ++
> >>  gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-21.c |  20 +++
> >>  gcc/tree-ssa-sink.cc                        | 159 ++++++++++++++++++--
> >>  3 files changed, 185 insertions(+), 10 deletions(-)
> >>  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-20.c
> >>  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-21.c
> >>
> >> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-20.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-20.c
> >> new file mode 100644
> >> index 00000000000..716bc1f9257
> >> --- /dev/null
> >> +++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-20.c
> >> @@ -0,0 +1,16 @@
> >> +/* { dg-do compile } */
> >> +/* { dg-options "-O2 -fdump-tree-sink -fdump-tree-optimized -fdump-tree-sink-stats" } */
> >> +
> >> +void bar();
> >> +int j;
> >> +void foo(int a, int b, int c, int d, int e, int f)
> >> +{
> >> +  int l;
> >> +  l = a + b + c + d +e + f;
> >> +  if (a != 5)
> >> +    {
> >> +      bar();
> >> +      j = l;
> >> +    }
> >> +}
> >> +/* { dg-final { scan-tree-dump-times "Sunk statements: 5" 1 "sink" } } */
> >
> > this doesn't verify the place we sink to?
> >
>
> I am not sure how to verify the place we sink to with dg-final.

I think dejagnu supports matching multi-line regexps so I suggest
to scan for the sunk expr RHS to be followed by the call?

> >> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-21.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-21.c
> >> new file mode 100644
> >> index 00000000000..ff41e2ea8ae
> >> --- /dev/null
> >> +++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-21.c
> >> @@ -0,0 +1,20 @@
> >> +/* { dg-do compile } */
> >> +/* { dg-options "-O2 -fdump-tree-sink-stats -fdump-tree-sink-stats" } */
> >> +
> >> +void bar();
> >> +int j, x;
> >> +void foo(int a, int b, int c, int d, int e, int f)
> >> +{
> >> +  int l;
> >> +  l = a + b + c + d +e + f;
> >> +  if (a != 5)
> >> +    {
> >> +      bar();
> >> +      if (b != 3)
> >> +        x = 3;
> >> +      else
> >> +        x = 5;
> >> +      j = l;
> >> +    }
> >> +}
> >> +/* { dg-final { scan-tree-dump-times "Sunk statements: 5" 1 "sink" } } */
> >
> > likewise.  So both tests already pass before the patch?
> >
> >> diff --git a/gcc/tree-ssa-sink.cc b/gcc/tree-ssa-sink.cc
> >> index 87b1d40c174..76556e7795b 100644
> >> --- a/gcc/tree-ssa-sink.cc
> >> +++ b/gcc/tree-ssa-sink.cc
> >> @@ -171,6 +171,72 @@ nearest_common_dominator_of_uses (def_operand_p def_p, bool *debug_stmts)
> >>    return commondom;
> >>  }
> >>
> >> +/* Return TRUE if immediate uses of the defs in
> >> +   USE occur in the same block as USE, FALSE otherwise.  */
> >> +
> >> +bool
> >> +def_use_same_block (gimple *stmt)
> >> +{
> >> +  use_operand_p use_p;
> >> +  def_operand_p def_p;
> >> +  imm_use_iterator imm_iter;
> >> +  ssa_op_iter iter;
> >> +
> >> +  FOR_EACH_SSA_DEF_OPERAND (def_p, stmt, iter, SSA_OP_DEF)
> >> +    {
> >> +      FOR_EACH_IMM_USE_FAST (use_p, imm_iter, DEF_FROM_PTR (def_p))
> >> +       {
> >> +         if (is_gimple_debug (USE_STMT (use_p)))
> >> +           continue;
> >> +
> >> +         if (use_p
> >
> > use_p is never null
> >
> >> +             && (gimple_bb (USE_STMT (use_p)) == gimple_bb (stmt)))
> >> +           return true;
> >
> > the function behavior is obviously odd ...
> >
> >> +       }
> >> +     }
> >> +  return false;
> >> +}
> >> +
> >> +/* Return TRUE if the block has only calls, FALSE otherwise. */
> >> +
> >> +bool
> >> +block_call_p (basic_block bb)
> >> +{
> >> +  int i = 0;
> >> +  bool is_call = false;
> >> +  gimple_stmt_iterator gsi = gsi_last_bb (bb);
> >> +  gimple *last_stmt = gsi_stmt (gsi);
> >> +
> >> +  if (last_stmt && gimple_code (last_stmt) == GIMPLE_COND)
> >> +    {
> >> +      if (!gsi_end_p (gsi))
> >> +       gsi_prev (&gsi);
> >> +
> >> +       for (; !gsi_end_p (gsi);)
> >> +        {
> >> +          gimple *stmt = gsi_stmt (gsi);
> >> +
> >> +          /* We have already seen a call.  */
> >> +          if (is_call)
> >> +            return false;
> >
> > Likewise.  Do you want to check whether a block has
> > a single stmt and that is a call and that is followed by
> > a condition?  It looks like a very convoluted way to write this.
> >
> >> +
> >> +          if (is_gimple_call (stmt))
> >> +            is_call = true;
> >> +          else
> >> +            return false;
> >> +
> >> +          if (!gsi_end_p (gsi))
> >> +            gsi_prev (&gsi);
> >> +
> >> +           ++i;
> >> +       }
> >> +     }
> >> +  if (is_call && i == 1)
> >> +    return true;
> >> +
> >> +  return false;
> >> +}
> >> +
> >>  /* Given EARLY_BB and LATE_BB, two blocks in a path through the dominator
> >>     tree, return the best basic block between them (inclusive) to place
> >>     statements.
> >> @@ -190,7 +256,8 @@ nearest_common_dominator_of_uses (def_operand_p def_p, bool *debug_stmts)
> >>  static basic_block
> >>  select_best_block (basic_block early_bb,
> >>                    basic_block late_bb,
> >> -                  gimple *stmt)
> >> +                  gimple *stmt,
> >> +                  gimple *use)
> >
> > please update the function comment
> >
> >>  {
> >>    basic_block best_bb = late_bb;
> >>    basic_block temp_bb = late_bb;
> >> @@ -230,14 +297,47 @@ select_best_block (basic_block early_bb,
> >>        if (threshold > 100)
> >>         threshold = 100;
> >>      }
> >> -
> >>    /* If BEST_BB is at the same nesting level, then require it to have
> >>       significantly lower execution frequency to avoid gratuitous movement.  */
> >>    if (bb_loop_depth (best_bb) == bb_loop_depth (early_bb)
> >>        /* If result of comparsion is unknown, prefer EARLY_BB.
> >>          Thus use !(...>=..) rather than (...<...)  */
> >>        && !(best_bb->count * 100 >= early_bb->count * threshold))
> >> -    return best_bb;
> >> +    {
> >> +      basic_block new_best_bb = get_immediate_dominator (CDI_DOMINATORS, best_bb);
> >> +      /* Return best_bb if def and use are in same block otherwise new_best_bb.
> >> +
> >> +        Things to consider:
> >> +
> >> +          new_best_bb is not equal to best_bb and early_bb.
> >> +
> >> +          stmt is not call.
> >> +
> >> +          new_best_bb doesnt have any phis.
> >> +
> >> +          use basic block is not equal to early_bb.
> >> +
> >> +          use basic block post dominates to new_best_bb.
> >> +
> >> +          new_best_bb dominates early_bb. */
> >> +      if (new_best_bb && use
> >> +         && (new_best_bb != best_bb)
> >> +         && (new_best_bb != early_bb)
> >> +         && !is_gimple_call (stmt)
> >> +         && gsi_end_p (gsi_start_phis (new_best_bb))
> >> +         && (gimple_bb (use) != early_bb)
> >> +         && !is_gimple_call (use)
> >> +         && dominated_by_p (CDI_POST_DOMINATORS, new_best_bb, gimple_bb(use))
> >> +         && dominated_by_p (CDI_DOMINATORS, new_best_bb, early_bb)
> >> +         && block_call_p (new_best_bb))
> >> +       {
> >> +         if (def_use_same_block (use))
> >> +           return best_bb;
> >
> > given the odd implementation of the predicates this matches very very
> > specific cases.
> >
> > Consider
> >
> >  if (..)
> >   {
> >     foo();
> >     bar();
> >     ... = l;
> >   }
> >
> > and C++ where foo and bar might throw.  You then likely want to sink
> > before foo ().
> >
> > What's the reason to only consider blocks with exactly 'call; cond;' ?
> >
> >> +
> >> +         return new_best_bb;
> >> +       }
> >> +       return best_bb;
> >> +    }
> >>
> >>    /* No better block found, so return EARLY_BB, which happens to be the
> >>       statement's original block.  */
> >> @@ -439,7 +539,7 @@ statement_sink_location (gimple *stmt, basic_block frombb,
> >>        if (!dominated_by_p (CDI_DOMINATORS, commondom, frombb))
> >>         return false;
> >>
> >> -      commondom = select_best_block (frombb, commondom, stmt);
> >> +      commondom = select_best_block (frombb, commondom, stmt, NULL);
> >>
> >>        if (commondom == frombb)
> >>         return false;
> >> @@ -456,19 +556,58 @@ statement_sink_location (gimple *stmt, basic_block frombb,
> >>             continue;
> >>           break;
> >>         }
> >> +
> >>        use = USE_STMT (one_use);
> >>
> >>        if (gimple_code (use) != GIMPLE_PHI)
> >>         {
> >> -         sinkbb = select_best_block (frombb, gimple_bb (use), stmt);
> >> +         sinkbb = select_best_block (frombb, gimple_bb (use), stmt, use);
> >>
> >>           if (sinkbb == frombb)
> >>             return false;
> >>
> >> -         if (sinkbb == gimple_bb (use))
> >> -           *togsi = gsi_for_stmt (use);
> >> -         else
> >> -           *togsi = gsi_after_labels (sinkbb);
> >> +          gimple *def_stmt = SSA_NAME_DEF_STMT (DEF_FROM_PTR (def_p));
> >> +
> >> +          if ((gimple_bb (def_stmt) == gimple_bb (use))
> >> +               && (gimple_bb (use) != sinkbb))
> >> +            sinkbb = gimple_bb (use);
> >> +
> >> +           if (sinkbb == gimple_bb (use))
> >> +             {
> >> +               gimple_stmt_iterator gsi = gsi_last_bb (sinkbb);
> >> +               gimple *def_stmt = SSA_NAME_DEF_STMT (DEF_FROM_PTR (def_p));
> >> +               gimple *last_stmt = gsi_stmt (gsi);
> >> +
> >> +               /* Update sinking point as stmt before call if the sinking block
> >> +                  has only calls. Otherwise update sinking point as the use
> >> +                  stmt. */
> >> +               if (gsi_stmt (gsi) == use
> >> +                   && !is_gimple_call (last_stmt)
> >> +                   && (gimple_code (last_stmt) != GIMPLE_SWITCH)
> >> +                   && (gimple_code (last_stmt) != GIMPLE_COND)
> >> +                   && (gimple_code (last_stmt) != GIMPLE_GOTO)
> >> +                   && (!gimple_vdef (use) || !def_use_same_block (def_stmt)))
> >> +                 {
> >> +                   if (!gsi_end_p (gsi))
> >> +                     gsi_prev (&gsi);
> >> +
> >> +                   gimple *stmt = gsi_stmt (gsi);
> >> +
> >> +                   if (!gsi_end_p (gsi))
> >> +                     gsi_prev (&gsi);
> >> +
> >> +                   if (gsi_end_p (gsi) && stmt && is_gimple_call (stmt)
> >> +                       && gsi_end_p (gsi_start_phis (sinkbb))
> >> +                       && !is_gimple_call (def_stmt))
> >> +                     *togsi = gsi_for_stmt (stmt);
> >> +                   else
> >> +                     *togsi = gsi_for_stmt (use);
> >> +                  }
> >> +               else
> >> +                 *togsi = gsi_for_stmt(use);
> >> +              }
> >> +            else
> >> +               *togsi = gsi_after_labels (sinkbb);
> >
> > This is very convoluted.  I think that in the end you want to compute (once) the
> > position of the first call in each block.  Since we're waking the CFG backwards
> > in post-dominator order this information can be gathered during this walk.
> > This would determine the location to sink to iff the use stmt is dominated by
> > this location (you can for example use gimple_uid to mark stmts before it).
> >
> > The alternative is to simply always sink to the start of blocks even for the
> > use stmt block in case that has a call before the use (but you still need to
> > efficiently compute that).
> >
>
> Incorporated the above comments and sent a separate patch.
>
> Thanks & Regards
> Ajit
>
> > Richard.
> >
> >>
> >>           return true;
> >>         }
> >> @@ -480,7 +619,7 @@ statement_sink_location (gimple *stmt, basic_block frombb,
> >>    if (!sinkbb)
> >>      return false;
> >>
> >> -  sinkbb = select_best_block (frombb, sinkbb, stmt);
> >> +  sinkbb = select_best_block (frombb, sinkbb, stmt, NULL);
> >>    if (!sinkbb || sinkbb == frombb)
> >>      return false;
> >>
> >> --
> >> 2.31.1
> >>
Ajit Agarwal May 30, 2023, 7:32 a.m. UTC | #5
Hello Richard:

On 30/05/23 12:34 pm, Richard Biener wrote:
> On Tue, May 30, 2023 at 7:06 AM Ajit Agarwal <aagarwa1@linux.ibm.com> wrote:
>>
>> Hello Richard:
>>
>> On 22/05/23 6:26 pm, Richard Biener wrote:
>>> On Thu, May 18, 2023 at 9:14 AM Ajit Agarwal <aagarwa1@linux.ibm.com> wrote:
>>>>
>>>> Hello All:
>>>>
>>>> This patch improves code sinking pass to sink statements before call to reduce
>>>> register pressure.
>>>> Review comments are incorporated.
>>>>
>>>> Bootstrapped and regtested on powerpc64-linux-gnu.
>>>>
>>>> Thanks & Regards
>>>> Ajit
>>>>
>>>>
>>>> tree-ssa-sink: Improve code sinking pass.
>>>>
>>>> Code Sinking sinks the blocks after call. This increases
>>>> register pressure for callee-saved registers. Improves
>>>> code sinking before call in the use blocks or immediate
>>>> dominator of use blocks.
>>>>
>>>> 2023-05-18  Ajit Kumar Agarwal  <aagarwa1@linux.ibm.com>
>>>>
>>>> gcc/ChangeLog:
>>>>
>>>>         * tree-ssa-sink.cc (statement_sink_location): Modifed to
>>>>         move statements before calls.
>>>>         (block_call_p): New function.
>>>>         (def_use_same_block): New function.
>>>>         (select_best_block): Add heuristics to select the best
>>>>         blocks in the immediate post dominator.
>>>>
>>>> gcc/testsuite/ChangeLog:
>>>>
>>>>         * gcc.dg/tree-ssa/ssa-sink-20.c: New testcase.
>>>>         * gcc.dg/tree-ssa/ssa-sink-21.c: New testcase.
>>>> ---
>>>>  gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-20.c |  16 ++
>>>>  gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-21.c |  20 +++
>>>>  gcc/tree-ssa-sink.cc                        | 159 ++++++++++++++++++--
>>>>  3 files changed, 185 insertions(+), 10 deletions(-)
>>>>  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-20.c
>>>>  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-21.c
>>>>
>>>> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-20.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-20.c
>>>> new file mode 100644
>>>> index 00000000000..716bc1f9257
>>>> --- /dev/null
>>>> +++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-20.c
>>>> @@ -0,0 +1,16 @@
>>>> +/* { dg-do compile } */
>>>> +/* { dg-options "-O2 -fdump-tree-sink -fdump-tree-optimized -fdump-tree-sink-stats" } */
>>>> +
>>>> +void bar();
>>>> +int j;
>>>> +void foo(int a, int b, int c, int d, int e, int f)
>>>> +{
>>>> +  int l;
>>>> +  l = a + b + c + d +e + f;
>>>> +  if (a != 5)
>>>> +    {
>>>> +      bar();
>>>> +      j = l;
>>>> +    }
>>>> +}
>>>> +/* { dg-final { scan-tree-dump-times "Sunk statements: 5" 1 "sink" } } */
>>>
>>> this doesn't verify the place we sink to?
>>>
>>
>> I am not sure how to verify the place we sink to with dg-final.
> 
> I think dejagnu supports matching multi-line regexps so I suggest
> to scan for the sunk expr RHS to be followed by the call?
> 

You meant to use dg-begin-multiline-output and dg-end-multiline-output.

Thanks & Regards
Ajit
>>>> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-21.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-21.c
>>>> new file mode 100644
>>>> index 00000000000..ff41e2ea8ae
>>>> --- /dev/null
>>>> +++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-21.c
>>>> @@ -0,0 +1,20 @@
>>>> +/* { dg-do compile } */
>>>> +/* { dg-options "-O2 -fdump-tree-sink-stats -fdump-tree-sink-stats" } */
>>>> +
>>>> +void bar();
>>>> +int j, x;
>>>> +void foo(int a, int b, int c, int d, int e, int f)
>>>> +{
>>>> +  int l;
>>>> +  l = a + b + c + d +e + f;
>>>> +  if (a != 5)
>>>> +    {
>>>> +      bar();
>>>> +      if (b != 3)
>>>> +        x = 3;
>>>> +      else
>>>> +        x = 5;
>>>> +      j = l;
>>>> +    }
>>>> +}
>>>> +/* { dg-final { scan-tree-dump-times "Sunk statements: 5" 1 "sink" } } */
>>>
>>> likewise.  So both tests already pass before the patch?
>>>
>>>> diff --git a/gcc/tree-ssa-sink.cc b/gcc/tree-ssa-sink.cc
>>>> index 87b1d40c174..76556e7795b 100644
>>>> --- a/gcc/tree-ssa-sink.cc
>>>> +++ b/gcc/tree-ssa-sink.cc
>>>> @@ -171,6 +171,72 @@ nearest_common_dominator_of_uses (def_operand_p def_p, bool *debug_stmts)
>>>>    return commondom;
>>>>  }
>>>>
>>>> +/* Return TRUE if immediate uses of the defs in
>>>> +   USE occur in the same block as USE, FALSE otherwise.  */
>>>> +
>>>> +bool
>>>> +def_use_same_block (gimple *stmt)
>>>> +{
>>>> +  use_operand_p use_p;
>>>> +  def_operand_p def_p;
>>>> +  imm_use_iterator imm_iter;
>>>> +  ssa_op_iter iter;
>>>> +
>>>> +  FOR_EACH_SSA_DEF_OPERAND (def_p, stmt, iter, SSA_OP_DEF)
>>>> +    {
>>>> +      FOR_EACH_IMM_USE_FAST (use_p, imm_iter, DEF_FROM_PTR (def_p))
>>>> +       {
>>>> +         if (is_gimple_debug (USE_STMT (use_p)))
>>>> +           continue;
>>>> +
>>>> +         if (use_p
>>>
>>> use_p is never null
>>>
>>>> +             && (gimple_bb (USE_STMT (use_p)) == gimple_bb (stmt)))
>>>> +           return true;
>>>
>>> the function behavior is obviously odd ...
>>>
>>>> +       }
>>>> +     }
>>>> +  return false;
>>>> +}
>>>> +
>>>> +/* Return TRUE if the block has only calls, FALSE otherwise. */
>>>> +
>>>> +bool
>>>> +block_call_p (basic_block bb)
>>>> +{
>>>> +  int i = 0;
>>>> +  bool is_call = false;
>>>> +  gimple_stmt_iterator gsi = gsi_last_bb (bb);
>>>> +  gimple *last_stmt = gsi_stmt (gsi);
>>>> +
>>>> +  if (last_stmt && gimple_code (last_stmt) == GIMPLE_COND)
>>>> +    {
>>>> +      if (!gsi_end_p (gsi))
>>>> +       gsi_prev (&gsi);
>>>> +
>>>> +       for (; !gsi_end_p (gsi);)
>>>> +        {
>>>> +          gimple *stmt = gsi_stmt (gsi);
>>>> +
>>>> +          /* We have already seen a call.  */
>>>> +          if (is_call)
>>>> +            return false;
>>>
>>> Likewise.  Do you want to check whether a block has
>>> a single stmt and that is a call and that is followed by
>>> a condition?  It looks like a very convoluted way to write this.
>>>
>>>> +
>>>> +          if (is_gimple_call (stmt))
>>>> +            is_call = true;
>>>> +          else
>>>> +            return false;
>>>> +
>>>> +          if (!gsi_end_p (gsi))
>>>> +            gsi_prev (&gsi);
>>>> +
>>>> +           ++i;
>>>> +       }
>>>> +     }
>>>> +  if (is_call && i == 1)
>>>> +    return true;
>>>> +
>>>> +  return false;
>>>> +}
>>>> +
>>>>  /* Given EARLY_BB and LATE_BB, two blocks in a path through the dominator
>>>>     tree, return the best basic block between them (inclusive) to place
>>>>     statements.
>>>> @@ -190,7 +256,8 @@ nearest_common_dominator_of_uses (def_operand_p def_p, bool *debug_stmts)
>>>>  static basic_block
>>>>  select_best_block (basic_block early_bb,
>>>>                    basic_block late_bb,
>>>> -                  gimple *stmt)
>>>> +                  gimple *stmt,
>>>> +                  gimple *use)
>>>
>>> please update the function comment
>>>
>>>>  {
>>>>    basic_block best_bb = late_bb;
>>>>    basic_block temp_bb = late_bb;
>>>> @@ -230,14 +297,47 @@ select_best_block (basic_block early_bb,
>>>>        if (threshold > 100)
>>>>         threshold = 100;
>>>>      }
>>>> -
>>>>    /* If BEST_BB is at the same nesting level, then require it to have
>>>>       significantly lower execution frequency to avoid gratuitous movement.  */
>>>>    if (bb_loop_depth (best_bb) == bb_loop_depth (early_bb)
>>>>        /* If result of comparsion is unknown, prefer EARLY_BB.
>>>>          Thus use !(...>=..) rather than (...<...)  */
>>>>        && !(best_bb->count * 100 >= early_bb->count * threshold))
>>>> -    return best_bb;
>>>> +    {
>>>> +      basic_block new_best_bb = get_immediate_dominator (CDI_DOMINATORS, best_bb);
>>>> +      /* Return best_bb if def and use are in same block otherwise new_best_bb.
>>>> +
>>>> +        Things to consider:
>>>> +
>>>> +          new_best_bb is not equal to best_bb and early_bb.
>>>> +
>>>> +          stmt is not call.
>>>> +
>>>> +          new_best_bb doesnt have any phis.
>>>> +
>>>> +          use basic block is not equal to early_bb.
>>>> +
>>>> +          use basic block post dominates to new_best_bb.
>>>> +
>>>> +          new_best_bb dominates early_bb. */
>>>> +      if (new_best_bb && use
>>>> +         && (new_best_bb != best_bb)
>>>> +         && (new_best_bb != early_bb)
>>>> +         && !is_gimple_call (stmt)
>>>> +         && gsi_end_p (gsi_start_phis (new_best_bb))
>>>> +         && (gimple_bb (use) != early_bb)
>>>> +         && !is_gimple_call (use)
>>>> +         && dominated_by_p (CDI_POST_DOMINATORS, new_best_bb, gimple_bb(use))
>>>> +         && dominated_by_p (CDI_DOMINATORS, new_best_bb, early_bb)
>>>> +         && block_call_p (new_best_bb))
>>>> +       {
>>>> +         if (def_use_same_block (use))
>>>> +           return best_bb;
>>>
>>> given the odd implementation of the predicates this matches very very
>>> specific cases.
>>>
>>> Consider
>>>
>>>  if (..)
>>>   {
>>>     foo();
>>>     bar();
>>>     ... = l;
>>>   }
>>>
>>> and C++ where foo and bar might throw.  You then likely want to sink
>>> before foo ().
>>>
>>> What's the reason to only consider blocks with exactly 'call; cond;' ?
>>>
>>>> +
>>>> +         return new_best_bb;
>>>> +       }
>>>> +       return best_bb;
>>>> +    }
>>>>
>>>>    /* No better block found, so return EARLY_BB, which happens to be the
>>>>       statement's original block.  */
>>>> @@ -439,7 +539,7 @@ statement_sink_location (gimple *stmt, basic_block frombb,
>>>>        if (!dominated_by_p (CDI_DOMINATORS, commondom, frombb))
>>>>         return false;
>>>>
>>>> -      commondom = select_best_block (frombb, commondom, stmt);
>>>> +      commondom = select_best_block (frombb, commondom, stmt, NULL);
>>>>
>>>>        if (commondom == frombb)
>>>>         return false;
>>>> @@ -456,19 +556,58 @@ statement_sink_location (gimple *stmt, basic_block frombb,
>>>>             continue;
>>>>           break;
>>>>         }
>>>> +
>>>>        use = USE_STMT (one_use);
>>>>
>>>>        if (gimple_code (use) != GIMPLE_PHI)
>>>>         {
>>>> -         sinkbb = select_best_block (frombb, gimple_bb (use), stmt);
>>>> +         sinkbb = select_best_block (frombb, gimple_bb (use), stmt, use);
>>>>
>>>>           if (sinkbb == frombb)
>>>>             return false;
>>>>
>>>> -         if (sinkbb == gimple_bb (use))
>>>> -           *togsi = gsi_for_stmt (use);
>>>> -         else
>>>> -           *togsi = gsi_after_labels (sinkbb);
>>>> +          gimple *def_stmt = SSA_NAME_DEF_STMT (DEF_FROM_PTR (def_p));
>>>> +
>>>> +          if ((gimple_bb (def_stmt) == gimple_bb (use))
>>>> +               && (gimple_bb (use) != sinkbb))
>>>> +            sinkbb = gimple_bb (use);
>>>> +
>>>> +           if (sinkbb == gimple_bb (use))
>>>> +             {
>>>> +               gimple_stmt_iterator gsi = gsi_last_bb (sinkbb);
>>>> +               gimple *def_stmt = SSA_NAME_DEF_STMT (DEF_FROM_PTR (def_p));
>>>> +               gimple *last_stmt = gsi_stmt (gsi);
>>>> +
>>>> +               /* Update sinking point as stmt before call if the sinking block
>>>> +                  has only calls. Otherwise update sinking point as the use
>>>> +                  stmt. */
>>>> +               if (gsi_stmt (gsi) == use
>>>> +                   && !is_gimple_call (last_stmt)
>>>> +                   && (gimple_code (last_stmt) != GIMPLE_SWITCH)
>>>> +                   && (gimple_code (last_stmt) != GIMPLE_COND)
>>>> +                   && (gimple_code (last_stmt) != GIMPLE_GOTO)
>>>> +                   && (!gimple_vdef (use) || !def_use_same_block (def_stmt)))
>>>> +                 {
>>>> +                   if (!gsi_end_p (gsi))
>>>> +                     gsi_prev (&gsi);
>>>> +
>>>> +                   gimple *stmt = gsi_stmt (gsi);
>>>> +
>>>> +                   if (!gsi_end_p (gsi))
>>>> +                     gsi_prev (&gsi);
>>>> +
>>>> +                   if (gsi_end_p (gsi) && stmt && is_gimple_call (stmt)
>>>> +                       && gsi_end_p (gsi_start_phis (sinkbb))
>>>> +                       && !is_gimple_call (def_stmt))
>>>> +                     *togsi = gsi_for_stmt (stmt);
>>>> +                   else
>>>> +                     *togsi = gsi_for_stmt (use);
>>>> +                  }
>>>> +               else
>>>> +                 *togsi = gsi_for_stmt(use);
>>>> +              }
>>>> +            else
>>>> +               *togsi = gsi_after_labels (sinkbb);
>>>
>>> This is very convoluted.  I think that in the end you want to compute (once) the
>>> position of the first call in each block.  Since we're waking the CFG backwards
>>> in post-dominator order this information can be gathered during this walk.
>>> This would determine the location to sink to iff the use stmt is dominated by
>>> this location (you can for example use gimple_uid to mark stmts before it).
>>>
>>> The alternative is to simply always sink to the start of blocks even for the
>>> use stmt block in case that has a call before the use (but you still need to
>>> efficiently compute that).
>>>
>>
>> Incorporated the above comments and sent a separate patch.
>>
>> Thanks & Regards
>> Ajit
>>
>>> Richard.
>>>
>>>>
>>>>           return true;
>>>>         }
>>>> @@ -480,7 +619,7 @@ statement_sink_location (gimple *stmt, basic_block frombb,
>>>>    if (!sinkbb)
>>>>      return false;
>>>>
>>>> -  sinkbb = select_best_block (frombb, sinkbb, stmt);
>>>> +  sinkbb = select_best_block (frombb, sinkbb, stmt, NULL);
>>>>    if (!sinkbb || sinkbb == frombb)
>>>>      return false;
>>>>
>>>> --
>>>> 2.31.1
>>>>
Richard Biener May 30, 2023, 11:24 a.m. UTC | #6
On Tue, May 30, 2023 at 9:35 AM Ajit Agarwal <aagarwa1@linux.ibm.com> wrote:
>
> Hello Richard:
>
> On 30/05/23 12:34 pm, Richard Biener wrote:
> > On Tue, May 30, 2023 at 7:06 AM Ajit Agarwal <aagarwa1@linux.ibm.com> wrote:
> >>
> >> Hello Richard:
> >>
> >> On 22/05/23 6:26 pm, Richard Biener wrote:
> >>> On Thu, May 18, 2023 at 9:14 AM Ajit Agarwal <aagarwa1@linux.ibm.com> wrote:
> >>>>
> >>>> Hello All:
> >>>>
> >>>> This patch improves code sinking pass to sink statements before call to reduce
> >>>> register pressure.
> >>>> Review comments are incorporated.
> >>>>
> >>>> Bootstrapped and regtested on powerpc64-linux-gnu.
> >>>>
> >>>> Thanks & Regards
> >>>> Ajit
> >>>>
> >>>>
> >>>> tree-ssa-sink: Improve code sinking pass.
> >>>>
> >>>> Code Sinking sinks the blocks after call. This increases
> >>>> register pressure for callee-saved registers. Improves
> >>>> code sinking before call in the use blocks or immediate
> >>>> dominator of use blocks.
> >>>>
> >>>> 2023-05-18  Ajit Kumar Agarwal  <aagarwa1@linux.ibm.com>
> >>>>
> >>>> gcc/ChangeLog:
> >>>>
> >>>>         * tree-ssa-sink.cc (statement_sink_location): Modifed to
> >>>>         move statements before calls.
> >>>>         (block_call_p): New function.
> >>>>         (def_use_same_block): New function.
> >>>>         (select_best_block): Add heuristics to select the best
> >>>>         blocks in the immediate post dominator.
> >>>>
> >>>> gcc/testsuite/ChangeLog:
> >>>>
> >>>>         * gcc.dg/tree-ssa/ssa-sink-20.c: New testcase.
> >>>>         * gcc.dg/tree-ssa/ssa-sink-21.c: New testcase.
> >>>> ---
> >>>>  gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-20.c |  16 ++
> >>>>  gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-21.c |  20 +++
> >>>>  gcc/tree-ssa-sink.cc                        | 159 ++++++++++++++++++--
> >>>>  3 files changed, 185 insertions(+), 10 deletions(-)
> >>>>  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-20.c
> >>>>  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-21.c
> >>>>
> >>>> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-20.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-20.c
> >>>> new file mode 100644
> >>>> index 00000000000..716bc1f9257
> >>>> --- /dev/null
> >>>> +++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-20.c
> >>>> @@ -0,0 +1,16 @@
> >>>> +/* { dg-do compile } */
> >>>> +/* { dg-options "-O2 -fdump-tree-sink -fdump-tree-optimized -fdump-tree-sink-stats" } */
> >>>> +
> >>>> +void bar();
> >>>> +int j;
> >>>> +void foo(int a, int b, int c, int d, int e, int f)
> >>>> +{
> >>>> +  int l;
> >>>> +  l = a + b + c + d +e + f;
> >>>> +  if (a != 5)
> >>>> +    {
> >>>> +      bar();
> >>>> +      j = l;
> >>>> +    }
> >>>> +}
> >>>> +/* { dg-final { scan-tree-dump-times "Sunk statements: 5" 1 "sink" } } */
> >>>
> >>> this doesn't verify the place we sink to?
> >>>
> >>
> >> I am not sure how to verify the place we sink to with dg-final.
> >
> > I think dejagnu supports matching multi-line regexps so I suggest
> > to scan for the sunk expr RHS to be followed by the call?
> >
>
> You meant to use dg-begin-multiline-output and dg-end-multiline-output.

I was referring to uses like that in gcc.dg/debug/dwarf2/pr41445-6.c

> Thanks & Regards
> Ajit
> >>>> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-21.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-21.c
> >>>> new file mode 100644
> >>>> index 00000000000..ff41e2ea8ae
> >>>> --- /dev/null
> >>>> +++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-21.c
> >>>> @@ -0,0 +1,20 @@
> >>>> +/* { dg-do compile } */
> >>>> +/* { dg-options "-O2 -fdump-tree-sink-stats -fdump-tree-sink-stats" } */
> >>>> +
> >>>> +void bar();
> >>>> +int j, x;
> >>>> +void foo(int a, int b, int c, int d, int e, int f)
> >>>> +{
> >>>> +  int l;
> >>>> +  l = a + b + c + d +e + f;
> >>>> +  if (a != 5)
> >>>> +    {
> >>>> +      bar();
> >>>> +      if (b != 3)
> >>>> +        x = 3;
> >>>> +      else
> >>>> +        x = 5;
> >>>> +      j = l;
> >>>> +    }
> >>>> +}
> >>>> +/* { dg-final { scan-tree-dump-times "Sunk statements: 5" 1 "sink" } } */
> >>>
> >>> likewise.  So both tests already pass before the patch?
> >>>
> >>>> diff --git a/gcc/tree-ssa-sink.cc b/gcc/tree-ssa-sink.cc
> >>>> index 87b1d40c174..76556e7795b 100644
> >>>> --- a/gcc/tree-ssa-sink.cc
> >>>> +++ b/gcc/tree-ssa-sink.cc
> >>>> @@ -171,6 +171,72 @@ nearest_common_dominator_of_uses (def_operand_p def_p, bool *debug_stmts)
> >>>>    return commondom;
> >>>>  }
> >>>>
> >>>> +/* Return TRUE if immediate uses of the defs in
> >>>> +   USE occur in the same block as USE, FALSE otherwise.  */
> >>>> +
> >>>> +bool
> >>>> +def_use_same_block (gimple *stmt)
> >>>> +{
> >>>> +  use_operand_p use_p;
> >>>> +  def_operand_p def_p;
> >>>> +  imm_use_iterator imm_iter;
> >>>> +  ssa_op_iter iter;
> >>>> +
> >>>> +  FOR_EACH_SSA_DEF_OPERAND (def_p, stmt, iter, SSA_OP_DEF)
> >>>> +    {
> >>>> +      FOR_EACH_IMM_USE_FAST (use_p, imm_iter, DEF_FROM_PTR (def_p))
> >>>> +       {
> >>>> +         if (is_gimple_debug (USE_STMT (use_p)))
> >>>> +           continue;
> >>>> +
> >>>> +         if (use_p
> >>>
> >>> use_p is never null
> >>>
> >>>> +             && (gimple_bb (USE_STMT (use_p)) == gimple_bb (stmt)))
> >>>> +           return true;
> >>>
> >>> the function behavior is obviously odd ...
> >>>
> >>>> +       }
> >>>> +     }
> >>>> +  return false;
> >>>> +}
> >>>> +
> >>>> +/* Return TRUE if the block has only calls, FALSE otherwise. */
> >>>> +
> >>>> +bool
> >>>> +block_call_p (basic_block bb)
> >>>> +{
> >>>> +  int i = 0;
> >>>> +  bool is_call = false;
> >>>> +  gimple_stmt_iterator gsi = gsi_last_bb (bb);
> >>>> +  gimple *last_stmt = gsi_stmt (gsi);
> >>>> +
> >>>> +  if (last_stmt && gimple_code (last_stmt) == GIMPLE_COND)
> >>>> +    {
> >>>> +      if (!gsi_end_p (gsi))
> >>>> +       gsi_prev (&gsi);
> >>>> +
> >>>> +       for (; !gsi_end_p (gsi);)
> >>>> +        {
> >>>> +          gimple *stmt = gsi_stmt (gsi);
> >>>> +
> >>>> +          /* We have already seen a call.  */
> >>>> +          if (is_call)
> >>>> +            return false;
> >>>
> >>> Likewise.  Do you want to check whether a block has
> >>> a single stmt and that is a call and that is followed by
> >>> a condition?  It looks like a very convoluted way to write this.
> >>>
> >>>> +
> >>>> +          if (is_gimple_call (stmt))
> >>>> +            is_call = true;
> >>>> +          else
> >>>> +            return false;
> >>>> +
> >>>> +          if (!gsi_end_p (gsi))
> >>>> +            gsi_prev (&gsi);
> >>>> +
> >>>> +           ++i;
> >>>> +       }
> >>>> +     }
> >>>> +  if (is_call && i == 1)
> >>>> +    return true;
> >>>> +
> >>>> +  return false;
> >>>> +}
> >>>> +
> >>>>  /* Given EARLY_BB and LATE_BB, two blocks in a path through the dominator
> >>>>     tree, return the best basic block between them (inclusive) to place
> >>>>     statements.
> >>>> @@ -190,7 +256,8 @@ nearest_common_dominator_of_uses (def_operand_p def_p, bool *debug_stmts)
> >>>>  static basic_block
> >>>>  select_best_block (basic_block early_bb,
> >>>>                    basic_block late_bb,
> >>>> -                  gimple *stmt)
> >>>> +                  gimple *stmt,
> >>>> +                  gimple *use)
> >>>
> >>> please update the function comment
> >>>
> >>>>  {
> >>>>    basic_block best_bb = late_bb;
> >>>>    basic_block temp_bb = late_bb;
> >>>> @@ -230,14 +297,47 @@ select_best_block (basic_block early_bb,
> >>>>        if (threshold > 100)
> >>>>         threshold = 100;
> >>>>      }
> >>>> -
> >>>>    /* If BEST_BB is at the same nesting level, then require it to have
> >>>>       significantly lower execution frequency to avoid gratuitous movement.  */
> >>>>    if (bb_loop_depth (best_bb) == bb_loop_depth (early_bb)
> >>>>        /* If result of comparsion is unknown, prefer EARLY_BB.
> >>>>          Thus use !(...>=..) rather than (...<...)  */
> >>>>        && !(best_bb->count * 100 >= early_bb->count * threshold))
> >>>> -    return best_bb;
> >>>> +    {
> >>>> +      basic_block new_best_bb = get_immediate_dominator (CDI_DOMINATORS, best_bb);
> >>>> +      /* Return best_bb if def and use are in same block otherwise new_best_bb.
> >>>> +
> >>>> +        Things to consider:
> >>>> +
> >>>> +          new_best_bb is not equal to best_bb and early_bb.
> >>>> +
> >>>> +          stmt is not call.
> >>>> +
> >>>> +          new_best_bb doesnt have any phis.
> >>>> +
> >>>> +          use basic block is not equal to early_bb.
> >>>> +
> >>>> +          use basic block post dominates to new_best_bb.
> >>>> +
> >>>> +          new_best_bb dominates early_bb. */
> >>>> +      if (new_best_bb && use
> >>>> +         && (new_best_bb != best_bb)
> >>>> +         && (new_best_bb != early_bb)
> >>>> +         && !is_gimple_call (stmt)
> >>>> +         && gsi_end_p (gsi_start_phis (new_best_bb))
> >>>> +         && (gimple_bb (use) != early_bb)
> >>>> +         && !is_gimple_call (use)
> >>>> +         && dominated_by_p (CDI_POST_DOMINATORS, new_best_bb, gimple_bb(use))
> >>>> +         && dominated_by_p (CDI_DOMINATORS, new_best_bb, early_bb)
> >>>> +         && block_call_p (new_best_bb))
> >>>> +       {
> >>>> +         if (def_use_same_block (use))
> >>>> +           return best_bb;
> >>>
> >>> given the odd implementation of the predicates this matches very very
> >>> specific cases.
> >>>
> >>> Consider
> >>>
> >>>  if (..)
> >>>   {
> >>>     foo();
> >>>     bar();
> >>>     ... = l;
> >>>   }
> >>>
> >>> and C++ where foo and bar might throw.  You then likely want to sink
> >>> before foo ().
> >>>
> >>> What's the reason to only consider blocks with exactly 'call; cond;' ?
> >>>
> >>>> +
> >>>> +         return new_best_bb;
> >>>> +       }
> >>>> +       return best_bb;
> >>>> +    }
> >>>>
> >>>>    /* No better block found, so return EARLY_BB, which happens to be the
> >>>>       statement's original block.  */
> >>>> @@ -439,7 +539,7 @@ statement_sink_location (gimple *stmt, basic_block frombb,
> >>>>        if (!dominated_by_p (CDI_DOMINATORS, commondom, frombb))
> >>>>         return false;
> >>>>
> >>>> -      commondom = select_best_block (frombb, commondom, stmt);
> >>>> +      commondom = select_best_block (frombb, commondom, stmt, NULL);
> >>>>
> >>>>        if (commondom == frombb)
> >>>>         return false;
> >>>> @@ -456,19 +556,58 @@ statement_sink_location (gimple *stmt, basic_block frombb,
> >>>>             continue;
> >>>>           break;
> >>>>         }
> >>>> +
> >>>>        use = USE_STMT (one_use);
> >>>>
> >>>>        if (gimple_code (use) != GIMPLE_PHI)
> >>>>         {
> >>>> -         sinkbb = select_best_block (frombb, gimple_bb (use), stmt);
> >>>> +         sinkbb = select_best_block (frombb, gimple_bb (use), stmt, use);
> >>>>
> >>>>           if (sinkbb == frombb)
> >>>>             return false;
> >>>>
> >>>> -         if (sinkbb == gimple_bb (use))
> >>>> -           *togsi = gsi_for_stmt (use);
> >>>> -         else
> >>>> -           *togsi = gsi_after_labels (sinkbb);
> >>>> +          gimple *def_stmt = SSA_NAME_DEF_STMT (DEF_FROM_PTR (def_p));
> >>>> +
> >>>> +          if ((gimple_bb (def_stmt) == gimple_bb (use))
> >>>> +               && (gimple_bb (use) != sinkbb))
> >>>> +            sinkbb = gimple_bb (use);
> >>>> +
> >>>> +           if (sinkbb == gimple_bb (use))
> >>>> +             {
> >>>> +               gimple_stmt_iterator gsi = gsi_last_bb (sinkbb);
> >>>> +               gimple *def_stmt = SSA_NAME_DEF_STMT (DEF_FROM_PTR (def_p));
> >>>> +               gimple *last_stmt = gsi_stmt (gsi);
> >>>> +
> >>>> +               /* Update sinking point as stmt before call if the sinking block
> >>>> +                  has only calls. Otherwise update sinking point as the use
> >>>> +                  stmt. */
> >>>> +               if (gsi_stmt (gsi) == use
> >>>> +                   && !is_gimple_call (last_stmt)
> >>>> +                   && (gimple_code (last_stmt) != GIMPLE_SWITCH)
> >>>> +                   && (gimple_code (last_stmt) != GIMPLE_COND)
> >>>> +                   && (gimple_code (last_stmt) != GIMPLE_GOTO)
> >>>> +                   && (!gimple_vdef (use) || !def_use_same_block (def_stmt)))
> >>>> +                 {
> >>>> +                   if (!gsi_end_p (gsi))
> >>>> +                     gsi_prev (&gsi);
> >>>> +
> >>>> +                   gimple *stmt = gsi_stmt (gsi);
> >>>> +
> >>>> +                   if (!gsi_end_p (gsi))
> >>>> +                     gsi_prev (&gsi);
> >>>> +
> >>>> +                   if (gsi_end_p (gsi) && stmt && is_gimple_call (stmt)
> >>>> +                       && gsi_end_p (gsi_start_phis (sinkbb))
> >>>> +                       && !is_gimple_call (def_stmt))
> >>>> +                     *togsi = gsi_for_stmt (stmt);
> >>>> +                   else
> >>>> +                     *togsi = gsi_for_stmt (use);
> >>>> +                  }
> >>>> +               else
> >>>> +                 *togsi = gsi_for_stmt(use);
> >>>> +              }
> >>>> +            else
> >>>> +               *togsi = gsi_after_labels (sinkbb);
> >>>
> >>> This is very convoluted.  I think that in the end you want to compute (once) the
> >>> position of the first call in each block.  Since we're waking the CFG backwards
> >>> in post-dominator order this information can be gathered during this walk.
> >>> This would determine the location to sink to iff the use stmt is dominated by
> >>> this location (you can for example use gimple_uid to mark stmts before it).
> >>>
> >>> The alternative is to simply always sink to the start of blocks even for the
> >>> use stmt block in case that has a call before the use (but you still need to
> >>> efficiently compute that).
> >>>
> >>
> >> Incorporated the above comments and sent a separate patch.
> >>
> >> Thanks & Regards
> >> Ajit
> >>
> >>> Richard.
> >>>
> >>>>
> >>>>           return true;
> >>>>         }
> >>>> @@ -480,7 +619,7 @@ statement_sink_location (gimple *stmt, basic_block frombb,
> >>>>    if (!sinkbb)
> >>>>      return false;
> >>>>
> >>>> -  sinkbb = select_best_block (frombb, sinkbb, stmt);
> >>>> +  sinkbb = select_best_block (frombb, sinkbb, stmt, NULL);
> >>>>    if (!sinkbb || sinkbb == frombb)
> >>>>      return false;
> >>>>
> >>>> --
> >>>> 2.31.1
> >>>>
diff mbox series

Patch

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-20.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-20.c
new file mode 100644
index 00000000000..716bc1f9257
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-20.c
@@ -0,0 +1,16 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-sink -fdump-tree-optimized -fdump-tree-sink-stats" } */
+
+void bar();
+int j;
+void foo(int a, int b, int c, int d, int e, int f)
+{
+  int l;
+  l = a + b + c + d +e + f;
+  if (a != 5)
+    {
+      bar();
+      j = l;
+    }
+}
+/* { dg-final { scan-tree-dump-times "Sunk statements: 5" 1 "sink" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-21.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-21.c
new file mode 100644
index 00000000000..ff41e2ea8ae
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-21.c
@@ -0,0 +1,20 @@ 
+/* { dg-do compile } */ 
+/* { dg-options "-O2 -fdump-tree-sink-stats -fdump-tree-sink-stats" } */
+
+void bar();
+int j, x;
+void foo(int a, int b, int c, int d, int e, int f)
+{
+  int l;
+  l = a + b + c + d +e + f;
+  if (a != 5)
+    {
+      bar();
+      if (b != 3)
+        x = 3;
+      else
+        x = 5;
+      j = l;
+    }
+}
+/* { dg-final { scan-tree-dump-times "Sunk statements: 5" 1 "sink" } } */
diff --git a/gcc/tree-ssa-sink.cc b/gcc/tree-ssa-sink.cc
index 87b1d40c174..76556e7795b 100644
--- a/gcc/tree-ssa-sink.cc
+++ b/gcc/tree-ssa-sink.cc
@@ -171,6 +171,72 @@  nearest_common_dominator_of_uses (def_operand_p def_p, bool *debug_stmts)
   return commondom;
 }
 
+/* Return TRUE if immediate uses of the defs in
+   USE occur in the same block as USE, FALSE otherwise.  */
+
+bool
+def_use_same_block (gimple *stmt)
+{
+  use_operand_p use_p;
+  def_operand_p def_p;
+  imm_use_iterator imm_iter;
+  ssa_op_iter iter;
+
+  FOR_EACH_SSA_DEF_OPERAND (def_p, stmt, iter, SSA_OP_DEF)
+    {
+      FOR_EACH_IMM_USE_FAST (use_p, imm_iter, DEF_FROM_PTR (def_p))
+	{
+	  if (is_gimple_debug (USE_STMT (use_p)))
+	    continue;
+
+	  if (use_p
+	      && (gimple_bb (USE_STMT (use_p)) == gimple_bb (stmt)))
+	    return true;
+	}
+     }
+  return false;
+}
+
+/* Return TRUE if the block has only calls, FALSE otherwise. */
+
+bool
+block_call_p (basic_block bb)
+{
+  int i = 0;
+  bool is_call = false;
+  gimple_stmt_iterator gsi = gsi_last_bb (bb);
+  gimple *last_stmt = gsi_stmt (gsi);
+
+  if (last_stmt && gimple_code (last_stmt) == GIMPLE_COND)
+    {
+      if (!gsi_end_p (gsi))
+	gsi_prev (&gsi);
+
+       for (; !gsi_end_p (gsi);)
+	 {
+	   gimple *stmt = gsi_stmt (gsi);
+
+	   /* We have already seen a call.  */
+	   if (is_call)
+	     return false;
+
+	   if (is_gimple_call (stmt))
+	     is_call = true;
+	   else
+	     return false;
+
+	   if (!gsi_end_p (gsi))
+	     gsi_prev (&gsi);
+
+	    ++i;
+	}
+     }
+  if (is_call && i == 1)
+    return true;
+
+  return false;
+}
+
 /* Given EARLY_BB and LATE_BB, two blocks in a path through the dominator
    tree, return the best basic block between them (inclusive) to place
    statements.
@@ -190,7 +256,8 @@  nearest_common_dominator_of_uses (def_operand_p def_p, bool *debug_stmts)
 static basic_block
 select_best_block (basic_block early_bb,
 		   basic_block late_bb,
-		   gimple *stmt)
+		   gimple *stmt,
+		   gimple *use)
 {
   basic_block best_bb = late_bb;
   basic_block temp_bb = late_bb;
@@ -230,14 +297,47 @@  select_best_block (basic_block early_bb,
       if (threshold > 100)
 	threshold = 100;
     }
-
   /* If BEST_BB is at the same nesting level, then require it to have
      significantly lower execution frequency to avoid gratuitous movement.  */
   if (bb_loop_depth (best_bb) == bb_loop_depth (early_bb)
       /* If result of comparsion is unknown, prefer EARLY_BB.
 	 Thus use !(...>=..) rather than (...<...)  */
       && !(best_bb->count * 100 >= early_bb->count * threshold))
-    return best_bb;
+    {
+      basic_block new_best_bb = get_immediate_dominator (CDI_DOMINATORS, best_bb);
+      /* Return best_bb if def and use are in same block otherwise new_best_bb.
+
+	 Things to consider:
+
+	   new_best_bb is not equal to best_bb and early_bb.
+
+	   stmt is not call.
+
+	   new_best_bb doesnt have any phis.
+
+	   use basic block is not equal to early_bb.
+
+	   use basic block post dominates to new_best_bb.
+
+	   new_best_bb dominates early_bb. */
+      if (new_best_bb && use
+	  && (new_best_bb != best_bb)
+	  && (new_best_bb != early_bb)
+	  && !is_gimple_call (stmt)
+	  && gsi_end_p (gsi_start_phis (new_best_bb))
+	  && (gimple_bb (use) != early_bb)
+	  && !is_gimple_call (use)
+	  && dominated_by_p (CDI_POST_DOMINATORS, new_best_bb, gimple_bb(use))
+	  && dominated_by_p (CDI_DOMINATORS, new_best_bb, early_bb)
+	  && block_call_p (new_best_bb))
+	{
+	  if (def_use_same_block (use))
+	    return best_bb;
+
+	  return new_best_bb;
+	}
+	return best_bb;
+    }
 
   /* No better block found, so return EARLY_BB, which happens to be the
      statement's original block.  */
@@ -439,7 +539,7 @@  statement_sink_location (gimple *stmt, basic_block frombb,
       if (!dominated_by_p (CDI_DOMINATORS, commondom, frombb))
 	return false;
 
-      commondom = select_best_block (frombb, commondom, stmt);
+      commondom = select_best_block (frombb, commondom, stmt, NULL);
 
       if (commondom == frombb)
 	return false;	
@@ -456,19 +556,58 @@  statement_sink_location (gimple *stmt, basic_block frombb,
 	    continue;
 	  break;
 	}
+
       use = USE_STMT (one_use);
 
       if (gimple_code (use) != GIMPLE_PHI)
 	{
-	  sinkbb = select_best_block (frombb, gimple_bb (use), stmt);
+	  sinkbb = select_best_block (frombb, gimple_bb (use), stmt, use);
 
 	  if (sinkbb == frombb)
 	    return false;
 
-	  if (sinkbb == gimple_bb (use))
-	    *togsi = gsi_for_stmt (use);
-	  else
-	    *togsi = gsi_after_labels (sinkbb);
+	   gimple *def_stmt = SSA_NAME_DEF_STMT (DEF_FROM_PTR (def_p));
+
+	   if ((gimple_bb (def_stmt) == gimple_bb (use))
+		&& (gimple_bb (use) != sinkbb))
+	     sinkbb = gimple_bb (use);
+
+	    if (sinkbb == gimple_bb (use))
+	      {
+		gimple_stmt_iterator gsi = gsi_last_bb (sinkbb);
+		gimple *def_stmt = SSA_NAME_DEF_STMT (DEF_FROM_PTR (def_p));
+		gimple *last_stmt = gsi_stmt (gsi);
+
+		/* Update sinking point as stmt before call if the sinking block
+		   has only calls. Otherwise update sinking point as the use
+		   stmt. */
+		if (gsi_stmt (gsi) == use
+		    && !is_gimple_call (last_stmt)
+		    && (gimple_code (last_stmt) != GIMPLE_SWITCH)
+		    && (gimple_code (last_stmt) != GIMPLE_COND)
+		    && (gimple_code (last_stmt) != GIMPLE_GOTO)
+		    && (!gimple_vdef (use) || !def_use_same_block (def_stmt)))
+		  {
+		    if (!gsi_end_p (gsi))
+		      gsi_prev (&gsi);
+
+		    gimple *stmt = gsi_stmt (gsi);
+
+		    if (!gsi_end_p (gsi))
+		      gsi_prev (&gsi);
+
+		    if (gsi_end_p (gsi) && stmt && is_gimple_call (stmt)
+			&& gsi_end_p (gsi_start_phis (sinkbb))
+			&& !is_gimple_call (def_stmt))
+		      *togsi = gsi_for_stmt (stmt);
+		    else
+		      *togsi = gsi_for_stmt (use);
+		   }
+		else
+		  *togsi = gsi_for_stmt(use);
+	       }
+	     else
+		*togsi = gsi_after_labels (sinkbb);
 
 	  return true;
 	}
@@ -480,7 +619,7 @@  statement_sink_location (gimple *stmt, basic_block frombb,
   if (!sinkbb)
     return false;
   
-  sinkbb = select_best_block (frombb, sinkbb, stmt);
+  sinkbb = select_best_block (frombb, sinkbb, stmt, NULL);
   if (!sinkbb || sinkbb == frombb)
     return false;