diff mbox

Fwd: [PATCH] Scheduling result adjustment to enable macro-fusion

Message ID CA+4CFy74eA6arbvph8G1zMAXzDURx_E4NbRdeQXkbRLsOHnvRg@mail.gmail.com
State New
Headers show

Commit Message

Wei Mi Nov. 24, 2013, 7:30 a.m. UTC
Sorry about the problem.

For the failed testcase, it was compiled using -fmodulo-sched.
modulo-sched phase set SCHED_GROUP_P of a jump insn to be true, which
means the jump insn should be scheduled with prev insn as a group.
When modulo scheduling is finished, the flag of SCHED_GROUP_P is not
cleaned up. After that, pass_jump2 phase split the bb and move the
prev insn to another bb. Then pass_sched2 see the flag and mistakenly
try to bind the jump insn with a code label.

I am thinking other cases setting SCHED_GROUP_P should have the same
problem because SCHED_GROUP_P is not cleaned up after scheduling is
done. The flag may become inconsistent after some optimizations and
may cause problem if it is used by later scheduling passes. I don't
know why similar problem was never exposed before.

The fix is to simply cleanup SCHED_GROUP_P flag in sched_finish.

bootstrap is ok. regression test is going on. Is it ok if regression passes?

Thanks,
Wei.

2013-11-23  Wei Mi  <wmi@google.com>

        PR rtl-optimization/59020
        * haifa-sched.c (cleanup_sched_group): New function.
        (sched_finish): Call cleanup_sched_group to cleanup SCHED_GROUP_P.

2013-11-23  Wei Mi  <wmi@google.com>
        PR rtl-optimization/59020
        * testsuite/gcc.dg/pr59020.c (void f):


On Sat, Nov 23, 2013 at 4:34 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
> On Mon, Nov 4, 2013 at 1:51 PM, Wei Mi <wmi@google.com> wrote:
>> Thanks! The three patches are commited as r204367, r204369 and r204371.
>>
>
> r204369 caused:
>
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59020
>
> --
> H.J.

Comments

Alexander Monakov Nov. 25, 2013, 10:08 a.m. UTC | #1
On Sat, 23 Nov 2013, Wei Mi wrote:
> For the failed testcase, it was compiled using -fmodulo-sched.
> modulo-sched phase set SCHED_GROUP_P of a jump insn to be true, which
> means the jump insn should be scheduled with prev insn as a group.

SMS doesn't set SCHED_GROUP_P by itself; did you mean that SCHED_GROUP_P is
set by dependency analysis code similar to sched2?

> When modulo scheduling is finished, the flag of SCHED_GROUP_P is not
> cleaned up. After that, pass_jump2 phase split the bb and move the
> prev insn to another bb. Then pass_sched2 see the flag and mistakenly
> try to bind the jump insn with a code label.

I think the analysis is incomplete.  Looking at the backtrace posted in the
bug report, the failure path goes through chain_to_prev_insn, which protects
against such failure:

  prev_nonnote = prev_nonnote_nondebug_insn (insn);
  if (BLOCK_FOR_INSN (insn) == BLOCK_FOR_INSN (prev_nonnote)
      && ! sched_insns_conditions_mutex_p (insn, prev_nonnote))
    add_dependence (insn, prev_nonnote, REG_DEP_ANTI);

Why does it end up with a label at the assertion failure point?

Alexander
Wei Mi Nov. 25, 2013, 5:41 p.m. UTC | #2
On Mon, Nov 25, 2013 at 2:08 AM, Alexander Monakov <amonakov@ispras.ru> wrote:
> On Sat, 23 Nov 2013, Wei Mi wrote:
>> For the failed testcase, it was compiled using -fmodulo-sched.
>> modulo-sched phase set SCHED_GROUP_P of a jump insn to be true, which
>> means the jump insn should be scheduled with prev insn as a group.
>
> SMS doesn't set SCHED_GROUP_P by itself; did you mean that SCHED_GROUP_P is
> set by dependency analysis code similar to sched2?
>

SCHED_GROUP_P is set in sched_analyze for "call + return value" group
and other groups, and in sched_init for macrofusion. Both
sched_analyze and sched_init are used by SMS (sched_analyze is used by
creating ddg). I think sched1 may have the same problem when it set
SCHED_GROUP_P and sched2 uses it.

>> When modulo scheduling is finished, the flag of SCHED_GROUP_P is not
>> cleaned up. After that, pass_jump2 phase split the bb and move the
>> prev insn to another bb. Then pass_sched2 see the flag and mistakenly
>> try to bind the jump insn with a code label.
>
> I think the analysis is incomplete.  Looking at the backtrace posted in the
> bug report, the failure path goes through chain_to_prev_insn, which protects
> against such failure:
>
>   prev_nonnote = prev_nonnote_nondebug_insn (insn);
>   if (BLOCK_FOR_INSN (insn) == BLOCK_FOR_INSN (prev_nonnote)
>       && ! sched_insns_conditions_mutex_p (insn, prev_nonnote))
>     add_dependence (insn, prev_nonnote, REG_DEP_ANTI);
>
> Why does it end up with a label at the assertion failure point?
>
> Alexander

Because code label is not a note or debug insn.

I think it is impossible to detect such inconsistency in
chain_to_prev_insn. If the prev_nonnote is not a code label, the bug
will not be exposed this time. Suppose some other optimizations insert
a real insn before the jump marked as SCHED_GROUP_P, following
scheduler pass will schedule them together silently. That is why I
think it is necessary to cleanup SCHED_GROUP_P when a scheduling pass
is finished.

Thanks,
Wei.
Jeff Law Nov. 25, 2013, 6:36 p.m. UTC | #3
On 11/24/13 00:30, Wei Mi wrote:
> Sorry about the problem.
>
> For the failed testcase, it was compiled using -fmodulo-sched.
> modulo-sched phase set SCHED_GROUP_P of a jump insn to be true, which
> means the jump insn should be scheduled with prev insn as a group.
> When modulo scheduling is finished, the flag of SCHED_GROUP_P is not
> cleaned up. After that, pass_jump2 phase split the bb and move the
> prev insn to another bb. Then pass_sched2 see the flag and mistakenly
> try to bind the jump insn with a code label.
>
> I am thinking other cases setting SCHED_GROUP_P should have the same
> problem because SCHED_GROUP_P is not cleaned up after scheduling is
> done. The flag may become inconsistent after some optimizations and
> may cause problem if it is used by later scheduling passes. I don't
> know why similar problem was never exposed before.
>
> The fix is to simply cleanup SCHED_GROUP_P flag in sched_finish.
I think this is showing up because this is the first time we have used 
SCHED_GROUP_P in cases where we merely want to keep two instructions 
consecutive vs cases where we are required to keep certain instructions 
consecutive.  For example, all the RTL passes already know they need to 
keep a cc0 setter and cc0 user consecutive on a HAVE_cc0 target.

In the latter case passes should already be doing what is necessary to 
keep those instructions consecutive.  In the former case, we'd have to 
audit & fix passes to honor the desire to keep certain instructions 
consecutive.




>
> bootstrap is ok. regression test is going on. Is it ok if regression passes?
>
> Thanks,
> Wei.
>
> 2013-11-23  Wei Mi  <wmi@google.com>
>
>          PR rtl-optimization/59020
>          * haifa-sched.c (cleanup_sched_group): New function.
>          (sched_finish): Call cleanup_sched_group to cleanup SCHED_GROUP_P.
>
> 2013-11-23  Wei Mi  <wmi@google.com>
>          PR rtl-optimization/59020
>          * testsuite/gcc.dg/pr59020.c (void f):
I'll note you're doing an extra pass over all the RTL here.   Is there 
any clean way you can clean SCHED_GROUP_P without that extra pass over 
the RTL?  Perhaps when the group actually gets scheduled?

jeff
diff mbox

Patch

Index: haifa-sched.c
===================================================================
--- haifa-sched.c       (revision 204923)
+++ haifa-sched.c       (working copy)
@@ -6598,6 +6598,23 @@  group_insns_for_macro_fusion ()
     try_group_insn (BB_END (bb));
 }

+/* Cleanup SCHED_GROUP_P after scheduling is done. This is necessary because
+   bb may be changed by other optimizations and the flag from last scheduling
+   may become invalid. If later scheduler see the flag generated from last
+   scheduling, it may produces incorrect result.  */
+
+static void
+cleanup_sched_group ()
+{
+  basic_block bb;
+  rtx insn;
+
+  FOR_EACH_BB (bb)
+    FOR_BB_INSNS(bb, insn)
+      if (INSN_P (insn) && SCHED_GROUP_P (insn))
+       SCHED_GROUP_P (insn) = 0;
+}
+
 /* Initialize some global state for the scheduler.  This function works
    with the common data shared between all the schedulers.  It is called
    from the scheduler specific initialization routine.  */
@@ -6841,6 +6858,8 @@  sched_finish (void)
     }
   free (curr_state);

+  cleanup_sched_group ();
+
   if (targetm.sched.finish_global)
     targetm.sched.finish_global (sched_dump, sched_verbose);

Index: testsuite/gcc.dg/pr59020.c
===================================================================
--- testsuite/gcc.dg/pr59020.c  (revision 0)
+++ testsuite/gcc.dg/pr59020.c  (revision 0)
@@ -0,0 +1,15 @@ 
+/* PR rtl-optimization/59020 */
+
+/* { dg-do compile { target i?86-*-* x86_64-*-* } } */
+/* { dg-options "-O2 -fmodulo-sched -fno-inline -march=corei7" } */
+
+int a, b, d;
+unsigned c;
+
+void f()
+{
+  unsigned q;
+  for(; a; a++)
+    if(((c %= d && 1) ? : 1) & 1)
+      for(; b; q++);
+}