diff mbox

[RFA,2/n] Don't lift loads above register using jumps in postreload-gcse.c

Message ID 1721935.92zzXaUnD9@e103209-lin
State New
Headers show

Commit Message

Matthew Gretton-Dann Sept. 5, 2012, 11:42 a.m. UTC
All,

When implementing ARM/Thumb support for -freorder-blocks-and-partition I
encountered the following silent code generation fault.

Given the following CFG:

     |              |
    93             97
     |              |
 (FALLTHRU)    (CROSSING)
     \             /
      \----\  /---/
            \/
            94
             |

Basic Block 94 has the following insn in it which we want to lift into
blocks 93 and 97:

(insn/v 62 767 63 94 (set (reg:SI 3 r3 [orig:1468 ivtmp.85 ] [1468])
        (mem/c:SI (plus:SI (reg/f:SI 13 sp)
                (const_int 20 [0x14])) [7 %sfp+-52 S4 A32]))

For block 93 this becomes a move from r0 to r3 - and everything is OK.

For block 97 there is no appropriate move so the compiler tries
to copy the load, and insert it on the edge from 97 to 94.  This edge is
a crossing edge, and so is implemented by an indirect jump:

(insn 2795 2590 3940 97 (set (reg:SI 3 r3 [2464])
        (mem/u/c:SI (symbol_ref/u:SI ("*.LC19") [flags 0x2]) [2 S4
A32])) 634 {*arm_movsi_vfp}
     (insn_list:REG_LABEL_OPERAND 887 (expr_list:REG_EQUIV (label_ref:SI 887)
            (nil))))
(jump_insn 2796 3940 2593 97 (set (pc)
        (reg:SI 3 r3 [2464])) 264 {*arm_indirect_jump}
     (expr_list:REG_CROSSING_JUMP (nil)
        (nil)))

The compiler tries to insert the copy of insn 62 (in this case it
becomes insn 3940) immediately before the jump_insn - which because this
is a crossing edge is implemented as an indirect jump using a register
in the ARM backend:

(insn 2795 2590 3940 97 (set (reg:SI 3 r3 [2464])
        (mem/u/c:SI (symbol_ref/u:SI ("*.LC19") [flags 0x2]) [2 S4
A32])) 634 {*arm_movsi_vfp}
     (insn_list:REG_LABEL_OPERAND 887 (expr_list:REG_EQUIV (label_ref:SI 887)
            (nil))))
(insn 3940 2795 2796 97 (set (reg:SI 3 r3 [orig:1468 ivtmp.85 ] [1468])
        (mem/c:SI (plus:SI (reg/f:SI 13 sp)
                (const_int 20 [0x14])) [7 %sfp+-52 S4 A32])) -1
     (nil))
(jump_insn 2796 3940 2593 97 (set (pc)
        (reg:SI 3 r3 [2464])) 264 {*arm_indirect_jump}
     (expr_list:REG_CROSSING_JUMP (nil)
        (nil)))

However, this is incorrect as insn 3940 sets r3, and the jump_insn
2796 wants to use r3 (as set by 2795).

The patch fixes this by checking that the register set by the load is
not used by the jump before allowing the load to be lifted.

Whilst this fix works for this particular case I am not sure it is the
best fix for the general issue, and so if others have a better idea how
to fix this I would be very happy.

In particular I wonder whether we should be defining
TARGET_CANNOT_MODIFY_JUMPS_P for the ARM backend as indirect jumps use
registers in a similar way to the SH backend.  Not that this would have helped 
in this particular instance.

Tested cross arm-linux-none-gnueabi with in progress ARM -freorder-blocks-and-
partition enabling patch.

OK?

Thanks,

Matt

gcc/ChangeLog:

2012-09-05  Matthew Gretton-Dann  <matthew.gretton-dann@linaro.org>

	* postreload-gcse.c (eliminate_partially_redundant_load): Ensure
	that loads are not lifted over branches which use the register
	loaded.

Comments

Steven Bosscher Sept. 5, 2012, 12:02 p.m. UTC | #1
On Wed, Sep 5, 2012 at 1:42 PM, Matthew Gretton-Dann wrote:
> Whilst this fix works for this particular case I am not sure it is the
> best fix for the general issue, and so if others have a better idea how
> to fix this I would be very happy.

postreload-gcse.c is broken in "interesting" ways. Look at this gem for example:

static bool
reg_changed_after_insn_p (rtx x, int cuid)
{
  unsigned int regno, end_regno;

  regno = REGNO (x);
  end_regno = END_HARD_REGNO (x);
  do
    if (reg_avail_info[regno] > cuid)
      return true;
  while (++regno < end_regno);
  return false;
}

So the more conservative the fix, the better :-)

The patch looks correct to me. But perhaps the pass should just punt
on blocks not ending in a simple jump in
bb_has_well_behaved_predecessors?

Ciao!
Steven
Richard Earnshaw Sept. 5, 2012, 12:45 p.m. UTC | #2
On 05/09/12 13:02, Steven Bosscher wrote:
> On Wed, Sep 5, 2012 at 1:42 PM, Matthew Gretton-Dann wrote:
>> Whilst this fix works for this particular case I am not sure it is the
>> best fix for the general issue, and so if others have a better idea how
>> to fix this I would be very happy.
> 
> postreload-gcse.c is broken in "interesting" ways. Look at this gem for example:
> 
> static bool
> reg_changed_after_insn_p (rtx x, int cuid)
> {
>   unsigned int regno, end_regno;
> 
>   regno = REGNO (x);
>   end_regno = END_HARD_REGNO (x);
>   do
>     if (reg_avail_info[regno] > cuid)
>       return true;
>   while (++regno < end_regno);
>   return false;
> }
> 
> So the more conservative the fix, the better :-)
> 
> The patch looks correct to me. But perhaps the pass should just punt
> on blocks not ending in a simple jump in
> bb_has_well_behaved_predecessors?
> 
> Ciao!
> Steven
> 


That sort of makes sense.  Why would we ever want to hoist an insn out
of a cold block into a hot one?  I could see it making sense to do the
reverse on occasion, but clearly care is needed here.

R.
Matthew Gretton-Dann Sept. 5, 2012, 1:18 p.m. UTC | #3
On 5 September 2012 13:45, Richard Earnshaw <rearnsha@arm.com> wrote:
> On 05/09/12 13:02, Steven Bosscher wrote:
>> On Wed, Sep 5, 2012 at 1:42 PM, Matthew Gretton-Dann wrote:
>>> Whilst this fix works for this particular case I am not sure it is the
>>> best fix for the general issue, and so if others have a better idea how
>>> to fix this I would be very happy.
>>
>> postreload-gcse.c is broken in "interesting" ways. Look at this gem for example:
>>
>> static bool
>> reg_changed_after_insn_p (rtx x, int cuid)
>> {
>>   unsigned int regno, end_regno;
>>
>>   regno = REGNO (x);
>>   end_regno = END_HARD_REGNO (x);
>>   do
>>     if (reg_avail_info[regno] > cuid)
>>       return true;
>>   while (++regno < end_regno);
>>   return false;
>> }
>>
>> So the more conservative the fix, the better :-)

I suppose removing the pass is too conservative :-)

>> The patch looks correct to me. But perhaps the pass should just punt
>> on blocks not ending in a simple jump in
>> bb_has_well_behaved_predecessors?

By 'simple jump' you mean any block with at most only EDGE_FALLTHRU on the edge?

> That sort of makes sense.  Why would we ever want to hoist an insn out
> of a cold block into a hot one?  I could see it making sense to do the
> reverse on occasion, but clearly care is needed here.

So whilst testing -freorder-blocks-and-partition has caused this
behaviour to be exhibited, I believe there is nothing stopping this
happening with any indirect jump - not just crossing jumps.

Thanks,

Matt
Steven Bosscher Sept. 5, 2012, 3:40 p.m. UTC | #4
On Wed, Sep 5, 2012 at 3:18 PM, Matthew Gretton-Dann
<matthew.gretton-dann@linaro.org> wrote:
> On 5 September 2012 13:45, Richard Earnshaw <rearnsha@arm.com> wrote:
>> On 05/09/12 13:02, Steven Bosscher wrote:
>>> On Wed, Sep 5, 2012 at 1:42 PM, Matthew Gretton-Dann wrote:
>>>> Whilst this fix works for this particular case I am not sure it is the
>>>> best fix for the general issue, and so if others have a better idea how
>>>> to fix this I would be very happy.
>>>
>>> postreload-gcse.c is broken in "interesting" ways. Look at this gem for example:
>>>
>>> static bool
>>> reg_changed_after_insn_p (rtx x, int cuid)
>>> {
>>>   unsigned int regno, end_regno;
>>>
>>>   regno = REGNO (x);
>>>   end_regno = END_HARD_REGNO (x);
>>>   do
>>>     if (reg_avail_info[regno] > cuid)
>>>       return true;
>>>   while (++regno < end_regno);
>>>   return false;
>>> }
>>>
>>> So the more conservative the fix, the better :-)
>
> I suppose removing the pass is too conservative :-)
>
>>> The patch looks correct to me. But perhaps the pass should just punt
>>> on blocks not ending in a simple jump in
>>> bb_has_well_behaved_predecessors?
>
> By 'simple jump' you mean any block with at most only EDGE_FALLTHRU on the edge?

No, I mean using the onlyjump_p predicate.

Ciao!
Steven
diff mbox

Patch

diff --git a/gcc/postreload-gcse.c b/gcc/postreload-gcse.c
index b464d1f..85fb9b3 100644
--- a/gcc/postreload-gcse.c
+++ b/gcc/postreload-gcse.c
@@ -1048,6 +1048,13 @@  eliminate_partially_redundant_load (basic_block bb, rtx
 	  /* Adding a load on a critical edge will cause a split.  */
 	  if (EDGE_CRITICAL_P (pred))
 	    critical_edge_split = true;
+
+	  /* If the destination register is used at the BB end we can not
+	     insert the load.  */
+	  if (reg_used_between_p (dest, PREV_INSN (BB_END (pred_bb)),
+				  next_pred_bb_end))
+	    goto cleanup;
+
 	  not_ok_count += pred->count;
 	  unoccr = (struct unoccr *) obstack_alloc (&unoccr_obstack,
 						    sizeof (struct unoccr));