Patchwork [PR50764] Fix for ICE in maybe_record_trace_start with -fsched2-use-superblocks

login
register
mail settings
Submitter Tom de Vries
Date Oct. 29, 2011, 7:17 p.m.
Message ID <4EAC514A.2090607@mentor.com>
Download mbox | patch
Permalink /patch/122557/
State New
Headers show

Comments

Tom de Vries - Oct. 29, 2011, 7:17 p.m.
Richard,

I have a tentative fix for PR50764.

In the example from the test-case, -fsched2-use-superblocks moves an insn from
block 4 to block 3.

           2
          bar
           |
    -------+-----
   /             \
  *               *
  5 *------------ 3
abort            bar
                  |
                  |
                  *
                  4
                return


The insn has a REG_CFA_DEF_CFA note and is frame-related.
...
(insn/f 51 50 52 4 (set (reg:DI 39 r10)
        (mem/c:DI (plus:DI (reg/f:DI 6 bp)
                (const_int -8 [0xfffffffffffffff8])) [3 S8 A8])) pr50764.c:13
62 {*movdi_internal_rex64}
     (expr_list:REG_CFA_DEF_CFA (reg:DI 39 r10)
        (nil)))
...

This causes the assert in maybe_record_trace_start to trigger:
...
      /* We ought to have the same state incoming to a given trace no
	 matter how we arrive at the trace.  Anything else means we've
	 got some kind of optimization error.  */
      gcc_checking_assert (cfi_row_equal_p (cur_row, ti->beg_row));
...

The assert does not occur with -fno-tree-tail-merge, but that is due to the
following:
- -fsched-use-superblocks does not handle dead labels explicitly
- -freorder-blocks introduces a dead label, which is not removed until after
  sched2
- -ftree-tail-merge makes a difference in which block -freorder-blocks
  introduces the dead label. In the case of -ftree-tail-merge, the dead label
  is introduced at the start of block 3, and block 3 and 4 end up in the same
  ebb. In the case of -fno-tree-tail-merge, the dead label is introduced at the
  start of block 4, and block 3 and 4 don't end up in the same ebb.

attached untested patch fixes PR50764 in a similar way as the patch for PR49994,
which is also about an ICE in maybe_record_trace_start with
-fsched2-use-superblocks.

The patch for PR49994 makes sure frame-related instructions are not moved past
the following jump.

Attached patch makes sure frame-related instructions are not moved past the
preceding jump.

Is this the way to fix this PR?

Thanks,
- Tom

2011-10-29  Tom de Vries  <tom@codesourcery.com>

	PR rtl-optimization/50764
	* (sched_analyze_insn): Make sure frame-related insns are not moved past
	preceding jump.
Tom de Vries - Oct. 30, 2011, 7:38 a.m.
On 10/29/2011 09:17 PM, Tom de Vries wrote:
> Richard,
> 
> I have a tentative fix for PR50764.
> 
> In the example from the test-case, -fsched2-use-superblocks moves an insn from
> block 4 to block 3.
> 
>            2
>           bar
>            |
>     -------+-----
>    /             \
>   *               *
>   5 *------------ 3
> abort            bar
>                   |
>                   |
>                   *
>                   4
>                 return
> 
> 
> The insn has a REG_CFA_DEF_CFA note and is frame-related.
> ...
> (insn/f 51 50 52 4 (set (reg:DI 39 r10)
>         (mem/c:DI (plus:DI (reg/f:DI 6 bp)
>                 (const_int -8 [0xfffffffffffffff8])) [3 S8 A8])) pr50764.c:13
> 62 {*movdi_internal_rex64}
>      (expr_list:REG_CFA_DEF_CFA (reg:DI 39 r10)
>         (nil)))
> ...
> 
> This causes the assert in maybe_record_trace_start to trigger:
> ...
>       /* We ought to have the same state incoming to a given trace no
> 	 matter how we arrive at the trace.  Anything else means we've
> 	 got some kind of optimization error.  */
>       gcc_checking_assert (cfi_row_equal_p (cur_row, ti->beg_row));
> ...
> 
> The assert does not occur with -fno-tree-tail-merge, but that is due to the
> following:
> - -fsched-use-superblocks does not handle dead labels explicitly
> - -freorder-blocks introduces a dead label, which is not removed until after
>   sched2
> - -ftree-tail-merge makes a difference in which block -freorder-blocks
>   introduces the dead label. In the case of -ftree-tail-merge, the dead label
>   is introduced at the start of block 3, and block 3 and 4 end up in the same
>   ebb. In the case of -fno-tree-tail-merge, the dead label is introduced at the
>   start of block 4, and block 3 and 4 don't end up in the same ebb.
> 
> attached untested patch fixes PR50764 in a similar way as the patch for PR49994,
> which is also about an ICE in maybe_record_trace_start with
> -fsched2-use-superblocks.
> 
> The patch for PR49994 makes sure frame-related instructions are not moved past
> the following jump.
> 
> Attached patch makes sure frame-related instructions are not moved past the
> preceding jump.
> 
> Is this the way to fix this PR?
> 

Bootstrapped and reg-tested on x86_64.

Ok for trunk?

Thanks,
- Tom

> Thanks,
> - Tom
> 
> 2011-10-29  Tom de Vries  <tom@codesourcery.com>
> 
> 	PR rtl-optimization/50764
> 	* (sched_analyze_insn): Make sure frame-related insns are not moved past
> 	preceding jump.
> 
> 
>
Tom de Vries - Nov. 7, 2011, 5:42 p.m.
On 10/30/2011 08:38 AM, Tom de Vries wrote:
> On 10/29/2011 09:17 PM, Tom de Vries wrote:
>> Richard,
>>
>> I have a tentative fix for PR50764.
>>
>> In the example from the test-case, -fsched2-use-superblocks moves an insn from
>> block 4 to block 3.
>>
>>            2
>>           bar
>>            |
>>     -------+-----
>>    /             \
>>   *               *
>>   5 *------------ 3
>> abort            bar
>>                   |
>>                   |
>>                   *
>>                   4
>>                 return
>>
>>
>> The insn has a REG_CFA_DEF_CFA note and is frame-related.
>> ...
>> (insn/f 51 50 52 4 (set (reg:DI 39 r10)
>>         (mem/c:DI (plus:DI (reg/f:DI 6 bp)
>>                 (const_int -8 [0xfffffffffffffff8])) [3 S8 A8])) pr50764.c:13
>> 62 {*movdi_internal_rex64}
>>      (expr_list:REG_CFA_DEF_CFA (reg:DI 39 r10)
>>         (nil)))
>> ...
>>
>> This causes the assert in maybe_record_trace_start to trigger:
>> ...
>>       /* We ought to have the same state incoming to a given trace no
>> 	 matter how we arrive at the trace.  Anything else means we've
>> 	 got some kind of optimization error.  */
>>       gcc_checking_assert (cfi_row_equal_p (cur_row, ti->beg_row));
>> ...
>>
>> The assert does not occur with -fno-tree-tail-merge, but that is due to the
>> following:
>> - -fsched-use-superblocks does not handle dead labels explicitly
>> - -freorder-blocks introduces a dead label, which is not removed until after
>>   sched2
>> - -ftree-tail-merge makes a difference in which block -freorder-blocks
>>   introduces the dead label. In the case of -ftree-tail-merge, the dead label
>>   is introduced at the start of block 3, and block 3 and 4 end up in the same
>>   ebb. In the case of -fno-tree-tail-merge, the dead label is introduced at the
>>   start of block 4, and block 3 and 4 don't end up in the same ebb.
>>
>> attached untested patch fixes PR50764 in a similar way as the patch for PR49994,
>> which is also about an ICE in maybe_record_trace_start with
>> -fsched2-use-superblocks.
>>
>> The patch for PR49994 makes sure frame-related instructions are not moved past
>> the following jump.
>>
>> Attached patch makes sure frame-related instructions are not moved past the
>> preceding jump.
>>
>> Is this the way to fix this PR?
>>
> 
> Bootstrapped and reg-tested on x86_64.
> 
> Ok for trunk?
> 

Ping.

Thanks,
- Tom

> Thanks,
> - Tom
> 
>> Thanks,
>> - Tom
>>
>> 2011-10-29  Tom de Vries  <tom@codesourcery.com>
>>
>> 	PR rtl-optimization/50764
>> 	* (sched_analyze_insn): Make sure frame-related insns are not moved past
>> 	preceding jump.
>>
>>
>>
>
Maxim Kuvyrkov - Nov. 15, 2011, 9:07 p.m.
On 30/10/2011, at 8:17 AM, Tom de Vries wrote:

> Richard,
> 
> I have a tentative fix for PR50764.

Richard,

Tom's patch is good (with the comments below addressed), and I would appreciate you validating my review with your formal approval.

> 
> In the example from the test-case, -fsched2-use-superblocks moves an insn from
> block 4 to block 3.
> 
>           2
>          bar
>           |
>    -------+-----
>   /             \
>  *               *
>  5 *------------ 3
> abort            bar
>                  |
>                  |
>                  *
>                  4
>                return
> 
> 
> The insn has a REG_CFA_DEF_CFA note and is frame-related.
> ...
> (insn/f 51 50 52 4 (set (reg:DI 39 r10)
>        (mem/c:DI (plus:DI (reg/f:DI 6 bp)
>                (const_int -8 [0xfffffffffffffff8])) [3 S8 A8])) pr50764.c:13
> 62 {*movdi_internal_rex64}
>     (expr_list:REG_CFA_DEF_CFA (reg:DI 39 r10)
>        (nil)))
> ...
> 
> This causes the assert in maybe_record_trace_start to trigger:
> ...
>      /* We ought to have the same state incoming to a given trace no
> 	 matter how we arrive at the trace.  Anything else means we've
> 	 got some kind of optimization error.  */
>      gcc_checking_assert (cfi_row_equal_p (cur_row, ti->beg_row));
> ...
> 
> The assert does not occur with -fno-tree-tail-merge, but that is due to the
> following:
> - -fsched-use-superblocks does not handle dead labels explicitly
> - -freorder-blocks introduces a dead label, which is not removed until after
>  sched2
> - -ftree-tail-merge makes a difference in which block -freorder-blocks
>  introduces the dead label. In the case of -ftree-tail-merge, the dead label
>  is introduced at the start of block 3, and block 3 and 4 end up in the same
>  ebb. In the case of -fno-tree-tail-merge, the dead label is introduced at the
>  start of block 4, and block 3 and 4 don't end up in the same ebb.
> 
> attached untested patch fixes PR50764 in a similar way as the patch for PR49994,
> which is also about an ICE in maybe_record_trace_start with
> -fsched2-use-superblocks.
> 
> The patch for PR49994 makes sure frame-related instructions are not moved past
> the following jump.
> 
> Attached patch makes sure frame-related instructions are not moved past the
> preceding jump.
> 
> Is this the way to fix this PR?

Tom,

Thank you for good analysis, your patch is the right way to go.

Scheduler should not move frame-related insns from either prologue or epilogue basic blocks.  Currently sched-deps analysis handles the prologue case, and your patch fixes the epilogue case.  The primary reason why we didn't hit the assert before is due to the fact that we do interblock scheduling after reload only on few architectures.  With single-block scheduling after reload, which is what we do for most architectures, this issue cannot arise.

> 
> Index: gcc/sched-deps.c
> ===================================================================
> --- gcc/sched-deps.c (revision 180521)
> +++ gcc/sched-deps.c (working copy)
> @@ -2812,8 +2812,13 @@ sched_analyze_insn (struct deps_desc *de
>      during prologue generation and avoid marking the frame pointer setup
>      as frame-related at all.  */
>   if (RTX_FRAME_RELATED_P (insn))
> -    deps->sched_before_next_jump
> -      = alloc_INSN_LIST (insn, deps->sched_before_next_jump);
> +    {
> +      deps->sched_before_next_jump
> +	= alloc_INSN_LIST (insn, deps->sched_before_next_jump);

This code is rather obscure, so additional comments would be helpful.  Please add something like "Make sure prologue INSN is scheduled before next jump." before the first statement; and add something like "Make sure epilogue INSN is not moved before preceding jumps." before the second statement.

> +
> +      if (deps->pending_jump_insns)
> +	add_dependence (insn, XEXP (deps->pending_jump_insns, 0), REG_DEP_ANTI);

Please use "add_dependence_list (insn, deps->pending_jump_insns, 1, REG_DEP_ANTI);" instead.  We want INSN to depend upon all of pending jumps, not just one of them.  The situation where pending_jump_insns has more than a single jump does not happen in current setup of scheduling runs (as sched-rgn does not do interblock scheduling after reload), but that may change in the future.

OK upon formal approval from Richard or other reviewer.

Thank you,

--
Maxim Kuvyrkov
CodeSourcery / Mentor Graphics

Patch

Index: gcc/sched-deps.c
===================================================================
--- gcc/sched-deps.c (revision 180521)
+++ gcc/sched-deps.c (working copy)
@@ -2812,8 +2812,13 @@  sched_analyze_insn (struct deps_desc *de
      during prologue generation and avoid marking the frame pointer setup
      as frame-related at all.  */
   if (RTX_FRAME_RELATED_P (insn))
-    deps->sched_before_next_jump
-      = alloc_INSN_LIST (insn, deps->sched_before_next_jump);
+    {
+      deps->sched_before_next_jump
+	= alloc_INSN_LIST (insn, deps->sched_before_next_jump);
+
+      if (deps->pending_jump_insns)
+	add_dependence (insn, XEXP (deps->pending_jump_insns, 0), REG_DEP_ANTI);
+    }
 
   if (code == COND_EXEC)
     {