diff mbox

PR51374: combine.c and volatile correctness

Message ID 4F26CBF3.5050806@gjlay.de
State New
Headers show

Commit Message

Georg-Johann Lay Jan. 30, 2012, 4:57 p.m. UTC
Richard Sandiford wrote:
> Georg-Johann Lay <avr@gjlay.de> writes:
>> This patch fixes PR51374 by more strictly updating mem_last_set.
>> Sloppy handling of mem_last_set can lead to error in volatile correctness
>> because combine then is allowed to drag one volatile access over an other
>> volatile thing (volatile access, asm volatile, unspec_volatile...).
>>
>> As explained in the source comment, any volatile might change memory, even a
>> volatile read.
> 
> This looks too broad to me.  Volatile MEMs don't usually alias
> non-volatile ones (so for example, reads from volatile memory shouldn't
> affect reads from non-volatile memory).

The patch tried to fix the issue. If there is less invasive solutions, that's
always preferred, of course.

> What do you think about instead changing:
> 
>   /* If there are any volatile insns between INSN and I3, reject, because
>      they might affect machine state.  */
> 
>   for (p = NEXT_INSN (insn); p != i3; p = NEXT_INSN (p))
>     if (INSN_P (p) && p != succ && p != succ2 && volatile_insn_p (PATTERN (p)))
>       return 0;
> 
> to:
> 
>   /* If INSN contains volatile references (specifically volatile MEMs),
>      we cannot combine across any other volatile references.
>      Even if INSN doesn't contain volatile references, any intervening
>      volatile insn might affect machine state.  */
> 
>   pred = volatile_refs_p (insn) ? volatile_refs_p : volatile_insn_p;
>   for (p = NEXT_INSN (insn); p != i3; p = NEXT_INSN (p))
>     if (NONDEBUG_INSN_P (p) && p != succ && p != succ2 && pred (PATTERN (p)))
>       return 0;

This would still allow a volatile_refs_p to cross a volatile_insn_p?

And some lines above there is this:

  /* If INSN contains anything volatile, or is an `asm' (whether volatile
     or not), reject, unless nothing volatile comes between it and I3 */

  if (GET_CODE (src) == ASM_OPERANDS || volatile_refs_p (src))
    {
      /* Make sure neither succ nor succ2 contains a volatile reference.  */
      if (succ2 != 0 && volatile_refs_p (PATTERN (succ2)))
	return 0;
      if (succ != 0 && volatile_refs_p (PATTERN (succ)))
	return 0;
      /* We'll check insns between INSN and I3 below.  */
    }

Would our change (whatever it will look like) make that code superfluous?

> As you say in the PR notes, we've already handled volatile_insn_p (insn)
> by this stage, so the new loop is handling:
> 
>    volatile_refs_p (insn) && !volatile_insn_p (insn)

There is this part of an if-clause in can_combine_p:

      /* Make sure that the value that is to be substituted for the register
	 does not use any registers whose values alter in between.  However,
	 If the insns are adjacent, a use can't cross a set even though we
	 think it might (this can happen for a sequence of insns each setting
	 the same destination; last_set of that register might point to
	 a NOTE).  If INSN has a REG_EQUIV note, the register is always
	 equivalent to the memory so the substitution is valid even if there
	 are intervening stores.  Also, don't move a volatile asm or
	 UNSPEC_VOLATILE across any other insns.  */
      || (! all_adjacent
	  && (((!MEM_P (src)
		|| ! find_reg_note (insn, REG_EQUIV, src))
	       && use_crosses_set_p (src, DF_INSN_LUID (insn)))
	      || (GET_CODE (src) == ASM_OPERANDS && MEM_VOLATILE_P (src))
	      || GET_CODE (src) == UNSPEC_VOLATILE))

so it's not a verbatim volatile_insn_p. Maybe it's for historical reason or for
some subtle differences not explained; only someone familiar with combine will
know... perhaps performance issue.

>> Moreover, writes of the shape (set (zero_extract (mem ... update mem_last_set.
> 
> Good catch.  I think we should use note_stores for added safety though,
> rather than call zero_extract out as a special case.  (I realise the
> function already handles subregs, but still.)

record_dead_and_set_regs_1 is called through note_stores so that (set
(zero_extract (mem)) is handled already and thus no change to
record_dead_and_set_regs_1 is needed, right?

Attached you find a new, tentative patch. It resolves the issue in my small
test program.
However, I think someone with more insight into combine should take over the patch.

Johann

 	PR rtl-optimization/51374
	* combine.c (can_combine_p): Don't allow volatile_refs_p insns
	to cross other volatile_refs_p insns.

Comments

Richard Sandiford Jan. 30, 2012, 7:16 p.m. UTC | #1
Georg-Johann Lay <avr@gjlay.de> writes:
> Richard Sandiford wrote:
>> What do you think about instead changing:
>> 
>>   /* If there are any volatile insns between INSN and I3, reject, because
>>      they might affect machine state.  */
>> 
>>   for (p = NEXT_INSN (insn); p != i3; p = NEXT_INSN (p))
>>     if (INSN_P (p) && p != succ && p != succ2 && volatile_insn_p (PATTERN (p)))
>>       return 0;
>> 
>> to:
>> 
>>   /* If INSN contains volatile references (specifically volatile MEMs),
>>      we cannot combine across any other volatile references.
>>      Even if INSN doesn't contain volatile references, any intervening
>>      volatile insn might affect machine state.  */
>> 
>>   pred = volatile_refs_p (insn) ? volatile_refs_p : volatile_insn_p;
>>   for (p = NEXT_INSN (insn); p != i3; p = NEXT_INSN (p))
>>     if (NONDEBUG_INSN_P (p) && p != succ && p != succ2 && pred (PATTERN (p)))
>>       return 0;
>
> This would still allow a volatile_refs_p to cross a volatile_insn_p?

No, volatile_insn_p => volatile_refs_p.

>   /* If INSN contains anything volatile, or is an `asm' (whether volatile
>      or not), reject, unless nothing volatile comes between it and I3 */
>
>   if (GET_CODE (src) == ASM_OPERANDS || volatile_refs_p (src))
>     {
>       /* Make sure neither succ nor succ2 contains a volatile reference.  */
>       if (succ2 != 0 && volatile_refs_p (PATTERN (succ2)))
> 	return 0;
>       if (succ != 0 && volatile_refs_p (PATTERN (succ)))
> 	return 0;
>       /* We'll check insns between INSN and I3 below.  */
>     }
>
> Would our change (whatever it will look like) make that code superfluous?

No, the loop skips succ and succ2, and this if-block is handling
non-volatile ASM_OPERANDS as well as volatile ones.

>>> Moreover, writes of the shape (set (zero_extract (mem ... update mem_last_set.
>> 
>> Good catch.  I think we should use note_stores for added safety though,
>> rather than call zero_extract out as a special case.  (I realise the
>> function already handles subregs, but still.)
>
> record_dead_and_set_regs_1 is called through note_stores so that (set
> (zero_extract (mem)) is handled already and thus no change to
> record_dead_and_set_regs_1 is needed, right?

Oops, yes.

> Attached you find a new, tentative patch. It resolves the issue in my
> small test program.  However, I think someone with more insight into
> combine should take over the patch.

OK, point taken :-)  If you'd prefer someone else to approve it
that's fine.

Richard
Georg-Johann Lay Jan. 30, 2012, 9:30 p.m. UTC | #2
Richard Sandiford schrieb:
> Georg-Johann Lay <avr@> writes:
> 
>>Attached you find a new, tentative patch. It resolves the issue in my
>>small test program.  However, I think someone with more insight into
>>combine should take over the patch.
> 
> OK, point taken :-)  If you'd prefer someone else to approve it
> that's fine.

Sorry for the misunderstanding.

The "someone else" didn't refer to the approver part, it's because I am 
unexperienced with combine.

If you are fine with the patch (and the local block around predicat) I'd 
gladly check it in and backport to 4.6 where the PR also pops up.

Johann
Richard Sandiford Jan. 30, 2012, 9:52 p.m. UTC | #3
Georg-Johann Lay <avr@gjlay.de> writes:
> Richard Sandiford schrieb:
>> Georg-Johann Lay <avr@> writes:
>> 
>>>Attached you find a new, tentative patch. It resolves the issue in my
>>>small test program.  However, I think someone with more insight into
>>>combine should take over the patch.
>> 
>> OK, point taken :-)  If you'd prefer someone else to approve it
>> that's fine.
>
> Sorry for the misunderstanding.
>
> The "someone else" didn't refer to the approver part, it's because I am 
> unexperienced with combine.

Ah, in that case I'm the one who should apologise.  I probably misunderstood
because it's all too true that I don't know this code very well. :-)

> If you are fine with the patch (and the local block around predicat) I'd 
> gladly check it in and backport to 4.6 where the PR also pops up.

I'd prefer it without the local block, and the variable declared (but not
initialised) at the start of the function, but otherwise it's OK for
both trunk and 4.6 as far as I'm concerned.  I realise you posted the
patch a while ago now, but let's leave it 24 hours just in case anyone
wants to object.

Thanks,
Richard
diff mbox

Patch

Index: combine.c
===================================================================
--- combine.c	(revision 183695)
+++ combine.c	(working copy)
@@ -1947,12 +1947,21 @@  can_combine_p (rtx insn, rtx i3, rtx pre
       && REG_P (dest) && REGNO (dest) < FIRST_PSEUDO_REGISTER)
     return 0;
 
-  /* If there are any volatile insns between INSN and I3, reject, because
-     they might affect machine state.  */
+  /* If INSN contains volatile references (specifically volatile MEMs),
+     we cannot combine across any other volatile references.
+     Even if INSN doesn't contain volatile references, any intervening
+     volatile insn might affect machine state.  */
+  {
+    int (*predicat) (const_rtx);
 
-  for (p = NEXT_INSN (insn); p != i3; p = NEXT_INSN (p))
-    if (INSN_P (p) && p != succ && p != succ2 && volatile_insn_p (PATTERN (p)))
-      return 0;
+    predicat = volatile_refs_p (PATTERN (insn))
+      ? volatile_refs_p
+      : volatile_insn_p;
+    
+    for (p = NEXT_INSN (insn); p != i3; p = NEXT_INSN (p))
+      if (INSN_P (p) && p != succ && p != succ2 && predicat (PATTERN (p)))
+        return 0;
+  }
 
   /* If INSN contains an autoincrement or autodecrement, make sure that
      register is not used between there and I3, and not already used in