Comments
Patch
@@ -998,7 +998,7 @@ optimize_sc (partial_schedule_ptr ps, ddg_ptr g)
int row = SMODULO (branch_cycle, ps->ii);
int num_splits = 0;
sbitmap must_precede, must_follow, tmp_precede, tmp_follow;
- int c;
+ int min_cycle, c;
if (dump_file)
fprintf (dump_file, "\nTrying to schedule node %d "
@@ -1053,6 +1053,7 @@ optimize_sc (partial_schedule_ptr ps, ddg_ptr g)
if (next_ps_i->id == g->closing_branch->cuid)
break;
+ min_cycle = PS_MIN_CYCLE (ps) - SMODULO (PS_MIN_CYCLE (ps), ps->ii);
remove_node_from_ps (ps, next_ps_i);
success =
try_scheduling_node_in_cycle (ps, g->closing_branch->cuid, c,
@@ -1092,6 +1093,10 @@ optimize_sc (partial_schedule_ptr ps, ddg_ptr g)
ok = true;
}
+ /* This might have been added to a new first stage. */
+ if (PS_MIN_CYCLE (ps) < min_cycle)
+ reset_sched_times (ps, 0);
+
free (must_precede);
free (must_follow);
}
This week I investigated modulo scheduler on IA64. Enabling SMS by default (-fmodulo-sched -fmodulo-sched-allow-regmoves) leads to bootstrap failure on IA64: gcc/build/genautomata.o differs while comparing stages 2 and 3. I haven't studied this issue in detail, because the combination of these my patches fixes this problem: [Additional edges to instructions with clobber] http://gcc.gnu.org/ml/gcc-patches/2011-12/msg00505.html [Correctly delete anti-dep edges for renaming] http://gcc.gnu.org/ml/gcc-patches/2011-12/msg00506.html Then I have regtested two compilers - first is clean trunk, and the second is trunk with SMS enabled by default and two patches mentioned above. Comparing the results shows several new failures. FAIL: gcc.dg/pr45259.c (internal compiler error) FAIL: gcc.dg/pr45259.c (test for excess errors) FAIL: gcc.dg/pr47881.c (test for excess errors) FAIL: tr1/5_numerical_facilities/special_functions/08_cyl_bessel_i/check_value.cc execution test FAIL: tr1/5_numerical_facilities/special_functions/09_cyl_bessel_j/check_value.cc execution test FAIL: tr1/5_numerical_facilities/special_functions/11_cyl_neumann/check_value.cc execution test FAIL: tr1/5_numerical_facilities/special_functions/21_sph_bessel/check_value.cc execution test FAIL: tr1/5_numerical_facilities/special_functions/23_sph_neumann/check_value.cc execution test Problem with gcc.dg/pr45259.c is an ICE, which I earlier fixed by this patch: [Correct extracting loop exit condition] http://gcc.gnu.org/ml/gcc-patches/2011-09/msg02049.html In gcc.dg/pr47881.c -fcompare-debug failure happens. The difference between -fcompare-debug dumps is only some NOTE_INSN_DELETED entries are placed differently. I haven't studied this problem. And the last 5 new failures have dissappered after fixing the following described issue. Imagine the following doloop (each use and set is a fmad in real example): use1 reg set reg use2 reg insn cloop After SMS it looks like this, I write a scheduling stage and cycle before each instruction. 0 0 set reg 0 0 use1 reg_copy 0 4 use2 regR 0 -4 reg_copy = reg 0 8 insn 0 -1 cloop So all instructions were wrongly classified to stage zero. While copying them to prologue the regmove remains to be placed after use1, and as a result, the register reg_copy is used uninititalized in prologue. This leads to miscompilation. I have found that the issue can be fixed by additional schedule normalizarion after scheduling branch instruction in optimize_sc function. The situation here is the same as in patch by Richard Sandiford http://gcc.gnu.org/ml/gcc-patches/2011-10/msg00748.html which enables scheduling regmoves. "Moves that handle incoming values might have been added to a new first stage. Bump the stage count if so." The same bumping should be done after scheduling branch. In my model example branch is scheduled on cycle -1 and remains in zero stage. When regmove is later scheduled on cycle -4 the Richard's check doesn't cause normalization, because new PS_MIN_CYCLE is -4, but min_cycle is -12: min_cycle = PS_MIN_CYCLE (ps) - SMODULO (PS_MIN_CYCLE (ps), ps->ii); ... call schedule_reg_moves (ps) ... if (PS_MIN_CYCLE (ps) < min_cycle) { reset_sched_times (ps, 0); stage_count++; } I attach patch which adds the same check into optimize_sc function. With -fmodulo-sched -fmodulo-sched-allow-regmoves enabled it passes bootstrap and regtest on IA64. It also passes bootstrap and regtest on x86_64 with SMS patched to schedule non-doloop loops. [Support new loop pattern] http://gcc.gnu.org/ml/gcc-patches/2011-12/msg00495.html OK for trunk or maybe 4.8? Happy holidays! -- Roman Zhuykov 2011-12-29 Roman Zhuykov <zhroma@ispras.ru> * modulo-sched.c (optimize_sc): Allow branch-scheduling to add a new first stage. ---