Message ID | 4C291A42.8010604@codesourcery.com |
---|---|
State | New |
Headers | show |
On Mon, Jun 28, 2010 at 11:55 PM, Bernd Schmidt <bernds@codesourcery.com> wrote: > I ran into some interesting problems when I was trying to change > thumb1_legitimate_address. A minor change, which left initial RTL > generation identical at first glance, produced optimization regressions > later on in gcse. > > It turns out there are two paths for expanding things like ARRAY_REF or > COMPONENT_REF, which are taken depending on whether direct_load is true > for a given mode. In one of these paths, we call set_mem_attrs on the > MEM we generate, in the other, we do not. (There'll probably need to be > one or two bug fixes here later.) > > This causes gcse.c to behave differently for some testcases. When we > don't set_mem_attrs, several MEMs are identical, and we get a single > expression of the form > > (zero_extend:SI (mem/s:HI (plus:SI (reg/v/f:SI 149 [ o ]) > (const_int 16 [0x10])) [0+16 S2 A32]) > > while in the other case, there's more information and we have > different-looking MEMs: > > (zero_extend:SI (mem/s:HI (plus:SI (reg/v/f:SI 149 [ o ]) > (const_int 16 [0x10])) [6 o_3(D)->op_type+0 S2 A32] > > (zero_extend:SI (mem/s:HI (plus:SI (reg/v/f:SI 149 [ o ]) > (const_int 16 [0x10])) [6 o_1->op_type+0 S2 A32]) > > The MEMs were generated as component_refs with different SSA names as > pointers. They are identical in structure, but differ in MEM_ATTRS. In > 2006, as a fix for PR25130, there was a change in how gcse would treat > MEMs; previously, it only verified that the alias sets were the same, > now it is very strict and tests MEM_ATTRS for equality, so it will no > longer optimize the two latter expressions. > > This change seems to be unnecessary; I believe the real problem was > fixed in PR41033. There, nonoverlapping_component_ref was disabled for > the !flag_strict_aliasing case. In comment #9 for PR25130, we get a > hint that RTL alias analysis is in fact the problem: > canon_true_dependence returns false for two MEMs with identical > structure (but different MEM_ATTRS). > > I verified with a 4.1 compiler that when the PR25130 patch is reverted > and the one for PR41033 is applied, the compiler still produces correct > output for the testcase. Based on this, I suggest the following patch, > which reverts the important part of the PR25130 patch on mainline and > fixes the optimization regressions I observed. > > Bootstrapped and regression tested (with some other changes) on > i686-linux. Ok? Ok. Thanks, Richard. > > Bernd >
Index: cse.c =================================================================== --- cse.c (revision 161371) +++ cse.c (working copy) @@ -2669,25 +2669,15 @@ exp_equiv_p (const_rtx x, const_rtx y, i case MEM: if (for_gcse) { - /* A volatile mem should not be considered equivalent to any - other. */ - if (MEM_VOLATILE_P (x) || MEM_VOLATILE_P (y)) - return 0; - /* Can't merge two expressions in different alias sets, since we can decide that the expression is transparent in a block when - it isn't, due to it being set with the different alias set. - - Also, can't merge two expressions with different MEM_ATTRS. - They could e.g. be two different entities allocated into the - same space on the stack (see e.g. PR25130). In that case, the - MEM addresses can be the same, even though the two MEMs are - absolutely not equivalent. + it isn't, due to it being set with the different alias set. */ + if (MEM_ALIAS_SET (x) != MEM_ALIAS_SET (y)) + return 0; - But because really all MEM attributes should be the same for - equivalent MEMs, we just use the invariant that MEMs that have - the same attributes share the same mem_attrs data structure. */ - if (MEM_ATTRS (x) != MEM_ATTRS (y)) + /* A volatile mem should not be considered equivalent to any + other. */ + if (MEM_VOLATILE_P (x) || MEM_VOLATILE_P (y)) return 0; } break;