diff mbox

gcse/cse: Revert parts of the PR25130 patch

Message ID 4C291A42.8010604@codesourcery.com
State New
Headers show

Commit Message

Bernd Schmidt June 28, 2010, 9:55 p.m. UTC
I ran into some interesting problems when I was trying to change
thumb1_legitimate_address.  A minor change, which left initial RTL
generation identical at first glance, produced optimization regressions
later on in gcse.

It turns out there are two paths for expanding things like ARRAY_REF or
COMPONENT_REF, which are taken depending on whether direct_load is true
for a given mode.  In one of these paths, we call set_mem_attrs on the
MEM we generate, in the other, we do not.  (There'll probably need to be
one or two bug fixes here later.)

This causes gcse.c to behave differently for some testcases.  When we
don't set_mem_attrs, several MEMs are identical, and we get a single
expression of the form

(zero_extend:SI (mem/s:HI (plus:SI (reg/v/f:SI 149 [ o ])
	        (const_int 16 [0x10])) [0+16 S2 A32])

while in the other case, there's more information and we have
different-looking MEMs:

(zero_extend:SI (mem/s:HI (plus:SI (reg/v/f:SI 149 [ o ])
	        (const_int 16 [0x10])) [6 o_3(D)->op_type+0 S2 A32]

(zero_extend:SI (mem/s:HI (plus:SI (reg/v/f:SI 149 [ o ])
	        (const_int 16 [0x10])) [6 o_1->op_type+0 S2 A32])

The MEMs were generated as component_refs with different SSA names as
pointers.  They are identical in structure, but differ in MEM_ATTRS.  In
2006, as a fix for PR25130, there was a change in how gcse would treat
MEMs; previously, it only verified that the alias sets were the same,
now it is very strict and tests MEM_ATTRS for equality, so it will no
longer optimize the two latter expressions.

This change seems to be unnecessary; I believe the real problem was
fixed in PR41033.  There, nonoverlapping_component_ref was disabled for
the !flag_strict_aliasing case.  In comment #9 for PR25130, we get a
hint that RTL alias analysis is in fact the problem:
canon_true_dependence returns false for two MEMs with identical
structure (but different MEM_ATTRS).

I verified with a 4.1 compiler that when the PR25130 patch is reverted
and the one for PR41033 is applied, the compiler still produces correct
output for the testcase.  Based on this, I suggest the following patch,
which reverts the important part of the PR25130 patch on mainline and
fixes the optimization regressions I observed.

Bootstrapped and regression tested (with some other changes) on
i686-linux.  Ok?


Bernd
Revert parts of the change for PR25130.
	* cse.c (exp_equiv_p): For MEMs, if for_gcse, only compare
	MEM_ALIAS_SET.

Comments

Richard Biener June 29, 2010, 9:27 a.m. UTC | #1
On Mon, Jun 28, 2010 at 11:55 PM, Bernd Schmidt <bernds@codesourcery.com> wrote:
> I ran into some interesting problems when I was trying to change
> thumb1_legitimate_address.  A minor change, which left initial RTL
> generation identical at first glance, produced optimization regressions
> later on in gcse.
>
> It turns out there are two paths for expanding things like ARRAY_REF or
> COMPONENT_REF, which are taken depending on whether direct_load is true
> for a given mode.  In one of these paths, we call set_mem_attrs on the
> MEM we generate, in the other, we do not.  (There'll probably need to be
> one or two bug fixes here later.)
>
> This causes gcse.c to behave differently for some testcases.  When we
> don't set_mem_attrs, several MEMs are identical, and we get a single
> expression of the form
>
> (zero_extend:SI (mem/s:HI (plus:SI (reg/v/f:SI 149 [ o ])
>                (const_int 16 [0x10])) [0+16 S2 A32])
>
> while in the other case, there's more information and we have
> different-looking MEMs:
>
> (zero_extend:SI (mem/s:HI (plus:SI (reg/v/f:SI 149 [ o ])
>                (const_int 16 [0x10])) [6 o_3(D)->op_type+0 S2 A32]
>
> (zero_extend:SI (mem/s:HI (plus:SI (reg/v/f:SI 149 [ o ])
>                (const_int 16 [0x10])) [6 o_1->op_type+0 S2 A32])
>
> The MEMs were generated as component_refs with different SSA names as
> pointers.  They are identical in structure, but differ in MEM_ATTRS.  In
> 2006, as a fix for PR25130, there was a change in how gcse would treat
> MEMs; previously, it only verified that the alias sets were the same,
> now it is very strict and tests MEM_ATTRS for equality, so it will no
> longer optimize the two latter expressions.
>
> This change seems to be unnecessary; I believe the real problem was
> fixed in PR41033.  There, nonoverlapping_component_ref was disabled for
> the !flag_strict_aliasing case.  In comment #9 for PR25130, we get a
> hint that RTL alias analysis is in fact the problem:
> canon_true_dependence returns false for two MEMs with identical
> structure (but different MEM_ATTRS).
>
> I verified with a 4.1 compiler that when the PR25130 patch is reverted
> and the one for PR41033 is applied, the compiler still produces correct
> output for the testcase.  Based on this, I suggest the following patch,
> which reverts the important part of the PR25130 patch on mainline and
> fixes the optimization regressions I observed.
>
> Bootstrapped and regression tested (with some other changes) on
> i686-linux.  Ok?

Ok.

Thanks,
Richard.

>
> Bernd
>
diff mbox

Patch

Index: cse.c
===================================================================
--- cse.c	(revision 161371)
+++ cse.c	(working copy)
@@ -2669,25 +2669,15 @@  exp_equiv_p (const_rtx x, const_rtx y, i
     case MEM:
       if (for_gcse)
 	{
-	  /* A volatile mem should not be considered equivalent to any
-	     other.  */
-	  if (MEM_VOLATILE_P (x) || MEM_VOLATILE_P (y))
-	    return 0;
-
 	  /* Can't merge two expressions in different alias sets, since we
 	     can decide that the expression is transparent in a block when
-	     it isn't, due to it being set with the different alias set.
-
-	     Also, can't merge two expressions with different MEM_ATTRS.
-	     They could e.g. be two different entities allocated into the
-	     same space on the stack (see e.g. PR25130).  In that case, the
-	     MEM addresses can be the same, even though the two MEMs are
-	     absolutely not equivalent.
+	     it isn't, due to it being set with the different alias set.  */
+	  if (MEM_ALIAS_SET (x) != MEM_ALIAS_SET (y))
+	    return 0;
 
-	     But because really all MEM attributes should be the same for
-	     equivalent MEMs, we just use the invariant that MEMs that have
-	     the same attributes share the same mem_attrs data structure.  */
-	  if (MEM_ATTRS (x) != MEM_ATTRS (y))
+	  /* A volatile mem should not be considered equivalent to any
+	     other.  */
+	  if (MEM_VOLATILE_P (x) || MEM_VOLATILE_P (y))
 	    return 0;
 	}
       break;