diff mbox

[SMS] Schedule normalization after scheduling branch

Message ID CAJnFk2eiT9SOPQwaAe8KqkMvx-fD_0YrJ1YqvF9NC5ny2phE1g@mail.gmail.com
State New
Headers show

Commit Message

Roman Zhuykov Dec. 29, 2011, 2:04 p.m. UTC
This week I investigated modulo scheduler on IA64.  Enabling SMS by default
(-fmodulo-sched -fmodulo-sched-allow-regmoves) leads to bootstrap failure
on IA64: gcc/build/genautomata.o differs while comparing stages 2 and 3.

I haven't studied this issue in detail, because the combination of these
my patches fixes this problem:

[Additional edges to instructions with clobber]
http://gcc.gnu.org/ml/gcc-patches/2011-12/msg00505.html
[Correctly delete anti-dep edges for renaming]
http://gcc.gnu.org/ml/gcc-patches/2011-12/msg00506.html

Then I have regtested two compilers - first is clean trunk, and the second
is trunk with SMS enabled by default and two patches mentioned above.
Comparing the results shows several new failures.

FAIL: gcc.dg/pr45259.c (internal compiler error)
FAIL: gcc.dg/pr45259.c (test for excess errors)
FAIL: gcc.dg/pr47881.c (test for excess errors)
FAIL: tr1/5_numerical_facilities/special_functions/08_cyl_bessel_i/check_value.cc
execution test
FAIL: tr1/5_numerical_facilities/special_functions/09_cyl_bessel_j/check_value.cc
execution test
FAIL: tr1/5_numerical_facilities/special_functions/11_cyl_neumann/check_value.cc
execution test
FAIL: tr1/5_numerical_facilities/special_functions/21_sph_bessel/check_value.cc
execution test
FAIL: tr1/5_numerical_facilities/special_functions/23_sph_neumann/check_value.cc
execution test

Problem with gcc.dg/pr45259.c is an ICE, which I earlier fixed by this patch:
[Correct extracting loop exit condition]
http://gcc.gnu.org/ml/gcc-patches/2011-09/msg02049.html

In gcc.dg/pr47881.c -fcompare-debug failure happens. The difference between
-fcompare-debug dumps is only some NOTE_INSN_DELETED entries are placed
differently.  I haven't studied this problem.

And the last 5 new failures have dissappered after fixing the following
described issue.

Imagine the following doloop (each use and set is a fmad in real example):

use1 reg
set reg
use2 reg
insn
cloop

After SMS it looks like this, I write a scheduling stage and cycle before each
instruction.

0  0 set reg
0  0 use1 reg_copy
0  4 use2 regR
0 -4 reg_copy = reg
0  8 insn
0 -1 cloop

So all instructions were wrongly classified to stage zero.  While copying them
to prologue the regmove remains to be placed after use1, and as a
result, the register
reg_copy is used uninititalized in prologue.  This leads to miscompilation.

I have found that the issue can be fixed by additional schedule normalizarion
after scheduling branch instruction in optimize_sc function.  The situation
here is the same as in patch by Richard Sandiford
http://gcc.gnu.org/ml/gcc-patches/2011-10/msg00748.html
which enables scheduling regmoves.
"Moves that handle incoming values might have been added
to a new first stage.  Bump the stage count if so."
The same bumping should be done after scheduling branch.

In my model example branch is scheduled on cycle -1 and remains in zero stage.
When regmove is later scheduled on cycle -4 the Richard's check doesn't cause
normalization, because new PS_MIN_CYCLE is -4, but min_cycle is -12:

  min_cycle = PS_MIN_CYCLE (ps) - SMODULO (PS_MIN_CYCLE (ps), ps->ii);
  ...
  call schedule_reg_moves (ps)
  ...
  if (PS_MIN_CYCLE (ps) < min_cycle)
    {
      reset_sched_times (ps, 0);
      stage_count++;
    }

I attach patch which adds the same check into optimize_sc function.
With -fmodulo-sched -fmodulo-sched-allow-regmoves enabled it passes
bootstrap and regtest on IA64.  It also passes bootstrap and regtest
on x86_64 with SMS patched to schedule non-doloop loops.
[Support new loop pattern]
http://gcc.gnu.org/ml/gcc-patches/2011-12/msg00495.html

OK for trunk or maybe 4.8?

Happy holidays!

--
Roman Zhuykov
2011-12-29  Roman Zhuykov  <zhroma@ispras.ru>
	* modulo-sched.c (optimize_sc):	Allow branch-scheduling to add a new
	first stage.
---

Comments

Roman Zhuykov Jan. 18, 2016, 9:38 p.m. UTC | #1
Hello,

4 years ago when I create some SMS patches nobody cares about SMS
failures on ia64.  Now ia64 is even more dead but at least one of bugs
appears on powerpc - PR69252.
Proposed patch is here:
https://gcc.gnu.org/ml/gcc-patches/2011-12/txt00266.txt and it even
suits current trunk without modification.

Powerpc PR69252 situation seems to be the same as I described earlier
on ia64, I can discuss it once again.

There were the following doloop:

  set_before reg
Loop:
  use1 reg
  set reg
  use2 reg
  insn
  cloop

After SMS it looked like this (I add scheduling stage and cycle before
each loop instruction):

  set_before reg
SMS-prologue:
  set reg
  use1 reg_copy #uninitialized
  use2 reg
  reg_copy <- reg
Loop:
0  0 set reg
0  0 use1 reg_copy
0  4 use2 reg
0 -4 reg_copy <- reg
0  8 insn
0 -1 cloop

All instructions were wrongly classified to stage zero.  While copying
them to prologue the regmove remains to be placed after use1, and as a
result, the register reg_copy is used uninititalized in prologue.
This leads to miscompilation.

I have found that the issue can be fixed by additional schedule
normalizarion after scheduling branch instruction in optimize_sc
function.  The situation here is the same as in patch by Richard
Sandiford
http://gcc.gnu.org/ml/gcc-patches/2011-10/msg00748.html which enables
scheduling regmoves.  "Moves that handle incoming values might have
been added to a new first stage.  Bump the stage count if so."  The
same bumping should be done after scheduling branch, and my patch
simply implements this.

As Martin Sebor have already done bootstrap and regtest in PR69252, I
want to ask him to attach PR69252 example as a new regression test to
this patch.  I don't know if it should be execution test or maybe
there are some special scan-assembler techniques.

Ok for trunk with regtest?

--
Roman

Here is the whole old letter about the patch:
2011-12-29 17:04 GMT+03:00 Roman Zhuykov <zhroma@ispras.ru>:
> This week I investigated modulo scheduler on IA64.  Enabling SMS by default
> (-fmodulo-sched -fmodulo-sched-allow-regmoves) leads to bootstrap failure
> on IA64: gcc/build/genautomata.o differs while comparing stages 2 and 3.
>
> I haven't studied this issue in detail, because the combination of these
> my patches fixes this problem:
>
> [Additional edges to instructions with clobber]
> http://gcc.gnu.org/ml/gcc-patches/2011-12/msg00505.html
> [Correctly delete anti-dep edges for renaming]
> http://gcc.gnu.org/ml/gcc-patches/2011-12/msg00506.html
>
> Then I have regtested two compilers - first is clean trunk, and the second
> is trunk with SMS enabled by default and two patches mentioned above.
> Comparing the results shows several new failures.
>
> FAIL: gcc.dg/pr45259.c (internal compiler error)
> FAIL: gcc.dg/pr45259.c (test for excess errors)
> FAIL: gcc.dg/pr47881.c (test for excess errors)
> FAIL: tr1/5_numerical_facilities/special_functions/08_cyl_bessel_i/check_value.cc
> execution test
> FAIL: tr1/5_numerical_facilities/special_functions/09_cyl_bessel_j/check_value.cc
> execution test
> FAIL: tr1/5_numerical_facilities/special_functions/11_cyl_neumann/check_value.cc
> execution test
> FAIL: tr1/5_numerical_facilities/special_functions/21_sph_bessel/check_value.cc
> execution test
> FAIL: tr1/5_numerical_facilities/special_functions/23_sph_neumann/check_value.cc
> execution test
>
> Problem with gcc.dg/pr45259.c is an ICE, which I earlier fixed by this patch:
> [Correct extracting loop exit condition]
> http://gcc.gnu.org/ml/gcc-patches/2011-09/msg02049.html
>
> In gcc.dg/pr47881.c -fcompare-debug failure happens. The difference between
> -fcompare-debug dumps is only some NOTE_INSN_DELETED entries are placed
> differently.  I haven't studied this problem.
>
> And the last 5 new failures have dissappered after fixing the following
> described issue.
>
> Imagine the following doloop (each use and set is a fmad in real example):
>
> use1 reg
> set reg
> use2 reg
> insn
> cloop
>
> After SMS it looks like this, I write a scheduling stage and cycle before each
> instruction.
>
> 0  0 set reg
> 0  0 use1 reg_copy
> 0  4 use2 regR
> 0 -4 reg_copy = reg
> 0  8 insn
> 0 -1 cloop
>
> So all instructions were wrongly classified to stage zero.  While copying them
> to prologue the regmove remains to be placed after use1, and as a
> result, the register
> reg_copy is used uninititalized in prologue.  This leads to miscompilation.
>
> I have found that the issue can be fixed by additional schedule normalizarion
> after scheduling branch instruction in optimize_sc function.  The situation
> here is the same as in patch by Richard Sandiford
> http://gcc.gnu.org/ml/gcc-patches/2011-10/msg00748.html
> which enables scheduling regmoves.
> "Moves that handle incoming values might have been added
> to a new first stage.  Bump the stage count if so."
> The same bumping should be done after scheduling branch.
>
> In my model example branch is scheduled on cycle -1 and remains in zero stage.
> When regmove is later scheduled on cycle -4 the Richard's check doesn't cause
> normalization, because new PS_MIN_CYCLE is -4, but min_cycle is -12:
>
>   min_cycle = PS_MIN_CYCLE (ps) - SMODULO (PS_MIN_CYCLE (ps), ps->ii);
>   ...
>   call schedule_reg_moves (ps)
>   ...
>   if (PS_MIN_CYCLE (ps) < min_cycle)
>     {
>       reset_sched_times (ps, 0);
>       stage_count++;
>     }
>
> I attach patch which adds the same check into optimize_sc function.
> With -fmodulo-sched -fmodulo-sched-allow-regmoves enabled it passes
> bootstrap and regtest on IA64.  It also passes bootstrap and regtest
> on x86_64 with SMS patched to schedule non-doloop loops.
> [Support new loop pattern]
> http://gcc.gnu.org/ml/gcc-patches/2011-12/msg00495.html
>
> OK for trunk or maybe 4.8?
>
> Happy holidays!
>
> --
> Roman Zhuykov
Jeff Law Jan. 21, 2016, 3:04 a.m. UTC | #2
On 01/18/2016 02:38 PM, Roman Zhuykov wrote:
> Hello,
>
> 4 years ago when I create some SMS patches nobody cares about SMS
> failures on ia64.  Now ia64 is even more dead but at least one of bugs
> appears on powerpc - PR69252.
> Proposed patch is here:
> https://gcc.gnu.org/ml/gcc-patches/2011-12/txt00266.txt and it even
> suits current trunk without modification.
>
> Powerpc PR69252 situation seems to be the same as I described earlier
> on ia64, I can discuss it once again.
Actually a pointer back to this is actually more helpful:

https://gcc.gnu.org/ml/gcc-patches/2011-12/msg01800.html

As that has more of the rationale for the change, including the key note 
that it's essentially duplicating a check/adjustment that Richard added 
in sms_schedule in a place that would make it apply after scheduling the 
branch instruction.

This is OK for the trunk.  Please go ahead and install the change.

Thanks,
jeff
diff mbox

Patch

diff --git a/gcc/modulo-sched.c b/gcc/modulo-sched.c
index 969b273..e5de595 100644
--- a/gcc/modulo-sched.c
+++ b/gcc/modulo-sched.c
@@ -998,7 +998,7 @@  optimize_sc (partial_schedule_ptr ps, ddg_ptr g)
       int row = SMODULO (branch_cycle, ps->ii);
       int num_splits = 0;
       sbitmap must_precede, must_follow, tmp_precede, tmp_follow;
-      int c;
+      int min_cycle, c;
 
       if (dump_file)
 	fprintf (dump_file, "\nTrying to schedule node %d "
@@ -1053,6 +1053,7 @@  optimize_sc (partial_schedule_ptr ps, ddg_ptr g)
 	if (next_ps_i->id == g->closing_branch->cuid)
 	  break;
 
+      min_cycle = PS_MIN_CYCLE (ps) - SMODULO (PS_MIN_CYCLE (ps), ps->ii);
       remove_node_from_ps (ps, next_ps_i);
       success =
 	try_scheduling_node_in_cycle (ps, g->closing_branch->cuid, c,
@@ -1092,6 +1093,10 @@  optimize_sc (partial_schedule_ptr ps, ddg_ptr g)
 	  ok = true;
 	}
 
+      /* This might have been added to a new first stage.  */
+      if (PS_MIN_CYCLE (ps) < min_cycle)
+	reset_sched_times (ps, 0);
+
       free (must_precede);
       free (must_follow);
     }