Fix PR 91708
diff mbox series

Message ID AM6PR10MB2566D9B47A2676B890457CB3E4B60@AM6PR10MB2566.EURPRD10.PROD.OUTLOOK.COM
State New
Headers show
Series
  • Fix PR 91708
Related show

Commit Message

Bernd Edlinger Sept. 10, 2019, 7:51 p.m. UTC
Hi!

This ICE happens when compiling real_nextafter in real.c.
CSE sees this:

(insn 179 178 180 11 (set (reg:SI 319)
        (reg/v/f:SI 273 [ rD.73757 ])) "../../gcc-trunk-1/gcc/real.c":120:10 643 {*thumb2_movsi_vfp}
     (nil))
[...]
(insn 181 180 182 11 (set (mem:SI (reg:SI 319) [0 MEM <charD.7[1:24]> [(voidD.73 *)r_77(D)]+0 S4 A8])
        (unspec:SI [
                (reg:SI 320)
            ] UNSPEC_UNALIGNED_STORE)) "../../gcc-trunk-1/gcc/real.c":120:10 129 {unaligned_storesi}
     (nil))
[...]
(insn 186 185 187 11 (set (mem:SI (plus:SI (reg/v/f:SI 273 [ rD.73757 ])
                (const_int 20 [0x14])) [0 MEM <charD.7[1:24]> [(voidD.73 *)r_77(D)]+20 S4 A8])
        (unspec:SI [
                (reg:SI 320)
            ] UNSPEC_UNALIGNED_STORE)) "../../gcc-trunk-1/gcc/real.c":120:10 129 {unaligned_storesi}
     (expr_list:REG_DEAD (reg:SI 320)
        (expr_list:REG_DEAD (reg/f:SI 319 [ rD.73757 ])
            (nil))))
[...]
(insn 234 233 235 11 (set (reg:SI 340)
        (mem:SI (reg/v/f:SI 273 [ rD.73757 ]) [52 MEM <unsigned int> [(struct real_valueD.28367 *)r_77(D)]+0 S4 A32])) "../../gcc-trunk-1/gcc/real.c":5185:9 643 {*thumb2_movsi_vfp}
     (nil))


... and transforms insn 234 in an invalid insn:


(insn 234 233 235 11 (set (reg:SI 340 [ MEM <unsigned int> [(struct real_valueD.28367 *)r_77(D)] ])
        (mem:SI (plus:SI (reg/v/f:SI 273 [ rD.73757 ])
                (const_int 20 [0x14])) [0 MEM <charD.7[1:24]> [(voidD.73 *)r_77(D)]+20 S4 A8])) "../../gcc-trunk-1/gcc/real.c":5185:9 643 {*thumb2_movsi_vfp}
     (nil))

which triggers the assertion in the arm back-end, because the MEM is not aligned.

To fix that I changed exp_equiv_p to consider MEMs with different MEM_ALIGN or
ALIAS_SET as different.

This patch fixes the arm bootstrap for --with-cpu=cortex-a57 --with-mode=thumb --with-fpu=fp-armv8 --with-float=hard
which I confirmed using a cross compiler.  And it fixes the test case that is attached to the PR, but it is way
too large for the test suite.


Bootstrapped and reg-tested on x86_64-pc-linux-gnu.
Is it OK for trunk?


Thanks
Bernd.

Comments

Jeff Law Sept. 10, 2019, 10:43 p.m. UTC | #1
On 9/10/19 1:51 PM, Bernd Edlinger wrote:
> Hi!
> 
> This ICE happens when compiling real_nextafter in real.c.
> CSE sees this:
> 
> (insn 179 178 180 11 (set (reg:SI 319)
>         (reg/v/f:SI 273 [ rD.73757 ])) "../../gcc-trunk-1/gcc/real.c":120:10 643 {*thumb2_movsi_vfp}
>      (nil))
> [...]
> (insn 181 180 182 11 (set (mem:SI (reg:SI 319) [0 MEM <charD.7[1:24]> [(voidD.73 *)r_77(D)]+0 S4 A8])
>         (unspec:SI [
>                 (reg:SI 320)
>             ] UNSPEC_UNALIGNED_STORE)) "../../gcc-trunk-1/gcc/real.c":120:10 129 {unaligned_storesi}
>      (nil))
> [...]
> (insn 186 185 187 11 (set (mem:SI (plus:SI (reg/v/f:SI 273 [ rD.73757 ])
>                 (const_int 20 [0x14])) [0 MEM <charD.7[1:24]> [(voidD.73 *)r_77(D)]+20 S4 A8])
>         (unspec:SI [
>                 (reg:SI 320)
>             ] UNSPEC_UNALIGNED_STORE)) "../../gcc-trunk-1/gcc/real.c":120:10 129 {unaligned_storesi}
>      (expr_list:REG_DEAD (reg:SI 320)
>         (expr_list:REG_DEAD (reg/f:SI 319 [ rD.73757 ])
>             (nil))))
> [...]
> (insn 234 233 235 11 (set (reg:SI 340)
>         (mem:SI (reg/v/f:SI 273 [ rD.73757 ]) [52 MEM <unsigned int> [(struct real_valueD.28367 *)r_77(D)]+0 S4 A32])) "../../gcc-trunk-1/gcc/real.c":5185:9 643 {*thumb2_movsi_vfp}
>      (nil))
> 
> 
> ... and transforms insn 234 in an invalid insn:
> 
> 
> (insn 234 233 235 11 (set (reg:SI 340 [ MEM <unsigned int> [(struct real_valueD.28367 *)r_77(D)] ])
>         (mem:SI (plus:SI (reg/v/f:SI 273 [ rD.73757 ])
>                 (const_int 20 [0x14])) [0 MEM <charD.7[1:24]> [(voidD.73 *)r_77(D)]+20 S4 A8])) "../../gcc-trunk-1/gcc/real.c":5185:9 643 {*thumb2_movsi_vfp}
>      (nil))
> 
> which triggers the assertion in the arm back-end, because the MEM is not aligned.
> 
> To fix that I changed exp_equiv_p to consider MEMs with different MEM_ALIGN or
> ALIAS_SET as different.
> 
> This patch fixes the arm bootstrap for --with-cpu=cortex-a57 --with-mode=thumb --with-fpu=fp-armv8 --with-float=hard
> which I confirmed using a cross compiler.  And it fixes the test case that is attached to the PR, but it is way
> too large for the test suite.
> 
> 
> Bootstrapped and reg-tested on x86_64-pc-linux-gnu.
> Is it OK for trunk?
> 
> 
> Thanks
> Bernd.
> 
> 
> patch-pr91708.diff
> 
> 2019-09-10  Bernd Edlinger  <bernd.edlinger@hotmail.de>
> 
> 	PR middle-end/91708
> 	* cse.c (exp_equiv_p): Consider MEMs with different
> 	alias set or alignment as different.
Hmm, I would have expected the address space and alignment checks to be
handled by the call to mem_attrs_eq_p.

Which aspect of the MEMs is really causing the problem here?

jeff
Richard Biener Sept. 11, 2019, 7:23 a.m. UTC | #2
On Tue, 10 Sep 2019, Bernd Edlinger wrote:

> Hi!
> 
> This ICE happens when compiling real_nextafter in real.c.
> CSE sees this:
> 
> (insn 179 178 180 11 (set (reg:SI 319)
>         (reg/v/f:SI 273 [ rD.73757 ])) "../../gcc-trunk-1/gcc/real.c":120:10 643 {*thumb2_movsi_vfp}
>      (nil))
> [...]
> (insn 181 180 182 11 (set (mem:SI (reg:SI 319) [0 MEM <charD.7[1:24]> [(voidD.73 *)r_77(D)]+0 S4 A8])
>         (unspec:SI [
>                 (reg:SI 320)
>             ] UNSPEC_UNALIGNED_STORE)) "../../gcc-trunk-1/gcc/real.c":120:10 129 {unaligned_storesi}
>      (nil))
> [...]
> (insn 186 185 187 11 (set (mem:SI (plus:SI (reg/v/f:SI 273 [ rD.73757 ])
>                 (const_int 20 [0x14])) [0 MEM <charD.7[1:24]> [(voidD.73 *)r_77(D)]+20 S4 A8])
>         (unspec:SI [
>                 (reg:SI 320)
>             ] UNSPEC_UNALIGNED_STORE)) "../../gcc-trunk-1/gcc/real.c":120:10 129 {unaligned_storesi}
>      (expr_list:REG_DEAD (reg:SI 320)
>         (expr_list:REG_DEAD (reg/f:SI 319 [ rD.73757 ])
>             (nil))))
> [...]
> (insn 234 233 235 11 (set (reg:SI 340)
>         (mem:SI (reg/v/f:SI 273 [ rD.73757 ]) [52 MEM <unsigned int> [(struct real_valueD.28367 *)r_77(D)]+0 S4 A32])) "../../gcc-trunk-1/gcc/real.c":5185:9 643 {*thumb2_movsi_vfp}
>      (nil))
> 
> 
> ... and transforms insn 234 in an invalid insn:
> 
> 
> (insn 234 233 235 11 (set (reg:SI 340 [ MEM <unsigned int> [(struct real_valueD.28367 *)r_77(D)] ])
>         (mem:SI (plus:SI (reg/v/f:SI 273 [ rD.73757 ])
>                 (const_int 20 [0x14])) [0 MEM <charD.7[1:24]> [(voidD.73 *)r_77(D)]+20 S4 A8])) "../../gcc-trunk-1/gcc/real.c":5185:9 643 {*thumb2_movsi_vfp}
>      (nil))
> 
> which triggers the assertion in the arm back-end, because the MEM is not aligned.
>
> To fix that I changed exp_equiv_p to consider MEMs with different MEM_ALIGN or
> ALIAS_SET as different.

I think the appropriate fix is to retain the mem_attrs of the original
MEM since obviously the replacement does not change those?  It's a mere
change of the MEMs address but implementation-wise CSE replaces the whole
MEM?

For CSE exp_equiv_p simply compares the addresses, only if for_gcse
we do further checks like mem_attrs_eq_p.  IMHO that is correct.

I'm not sure what CSE does above, that is, what does it think it does?
It doesn't replace the loaded value, it seems to do sth else which
is odd for CSE (replaces a REG with a PLUS - but why?)
 
> This patch fixes the arm bootstrap for --with-cpu=cortex-a57 --with-mode=thumb --with-fpu=fp-armv8 --with-float=hard
> which I confirmed using a cross compiler.  And it fixes the test case that is attached to the PR, but it is way
> too large for the test suite.
> 
> 
> Bootstrapped and reg-tested on x86_64-pc-linux-gnu.
> Is it OK for trunk?

So no, I don't think this is correct.

Thanks,
Richard.
Bernd Edlinger Sept. 11, 2019, 1:33 p.m. UTC | #3
On 9/11/19 12:43 AM, Jeff Law wrote:
> On 9/10/19 1:51 PM, Bernd Edlinger wrote:
>> Hi!
>>
>> This ICE happens when compiling real_nextafter in real.c.
>> CSE sees this:
>>
>> (insn 179 178 180 11 (set (reg:SI 319)
>>         (reg/v/f:SI 273 [ rD.73757 ])) "../../gcc-trunk-1/gcc/real.c":120:10 643 {*thumb2_movsi_vfp}
>>      (nil))
>> [...]
>> (insn 181 180 182 11 (set (mem:SI (reg:SI 319) [0 MEM <charD.7[1:24]> [(voidD.73 *)r_77(D)]+0 S4 A8])
>>         (unspec:SI [
>>                 (reg:SI 320)
>>             ] UNSPEC_UNALIGNED_STORE)) "../../gcc-trunk-1/gcc/real.c":120:10 129 {unaligned_storesi}
>>      (nil))
>> [...]
>> (insn 186 185 187 11 (set (mem:SI (plus:SI (reg/v/f:SI 273 [ rD.73757 ])
>>                 (const_int 20 [0x14])) [0 MEM <charD.7[1:24]> [(voidD.73 *)r_77(D)]+20 S4 A8])
>>         (unspec:SI [
>>                 (reg:SI 320)
>>             ] UNSPEC_UNALIGNED_STORE)) "../../gcc-trunk-1/gcc/real.c":120:10 129 {unaligned_storesi}
>>      (expr_list:REG_DEAD (reg:SI 320)
>>         (expr_list:REG_DEAD (reg/f:SI 319 [ rD.73757 ])
>>             (nil))))
>> [...]
>> (insn 234 233 235 11 (set (reg:SI 340)
>>         (mem:SI (reg/v/f:SI 273 [ rD.73757 ]) [52 MEM <unsigned int> [(struct real_valueD.28367 *)r_77(D)]+0 S4 A32])) "../../gcc-trunk-1/gcc/real.c":5185:9 643 {*thumb2_movsi_vfp}
>>      (nil))
>>
>>
>> ... and transforms insn 234 in an invalid insn:
>>
>>
>> (insn 234 233 235 11 (set (reg:SI 340 [ MEM <unsigned int> [(struct real_valueD.28367 *)r_77(D)] ])
>>         (mem:SI (plus:SI (reg/v/f:SI 273 [ rD.73757 ])
>>                 (const_int 20 [0x14])) [0 MEM <charD.7[1:24]> [(voidD.73 *)r_77(D)]+20 S4 A8])) "../../gcc-trunk-1/gcc/real.c":5185:9 643 {*thumb2_movsi_vfp}
>>      (nil))
>>
>> which triggers the assertion in the arm back-end, because the MEM is not aligned.
>>
>> To fix that I changed exp_equiv_p to consider MEMs with different MEM_ALIGN or
>> ALIAS_SET as different.
>>
>> This patch fixes the arm bootstrap for --with-cpu=cortex-a57 --with-mode=thumb --with-fpu=fp-armv8 --with-float=hard
>> which I confirmed using a cross compiler.  And it fixes the test case that is attached to the PR, but it is way
>> too large for the test suite.
>>
>>
>> Bootstrapped and reg-tested on x86_64-pc-linux-gnu.
>> Is it OK for trunk?
>>
>>
>> Thanks
>> Bernd.
>>
>>
>> patch-pr91708.diff
>>
>> 2019-09-10  Bernd Edlinger  <bernd.edlinger@hotmail.de>
>>
>> 	PR middle-end/91708
>> 	* cse.c (exp_equiv_p): Consider MEMs with different
>> 	alias set or alignment as different.
> Hmm, I would have expected the address space and alignment checks to be
> handled by the call to mem_attrs_eq_p.
> 

Yes, but exp_equiv_p is called from here:

lookup (rtx x, unsigned int hash, machine_mode mode)
{
  struct table_elt *p;

  for (p = table[hash]; p; p = p->next_same_hash)
    if (mode == p->mode && ((x == p->exp && REG_P (x))
                            || exp_equiv_p (x, p->exp, !REG_P (x), false)))
      return p;

with for_gcse = false si mem_attrs_eq_p is not used.

> Which aspect of the MEMs is really causing the problem here?
> 

The alignment is (formally) different, but since also the address is different,
the alignment could be also be really wrong.

Originally it was [(struct real_valueD.28367 *)r_77(D)]+0 S4 A32]
but it is replaced with [0 MEM <charD.7[1:24]> [(voidD.73 *)r_77(D)]+20 S4 A8],
that appears to be done, because there is the same value that was in the original
location.

But also the wrong alias set will cause problems if the instruction
is re-materialized in another place (which actually happened in LRA and caused the
assertion, BTW).


Bernd.
Richard Biener Sept. 11, 2019, 1:37 p.m. UTC | #4
On Wed, 11 Sep 2019, Bernd Edlinger wrote:

> On 9/11/19 12:43 AM, Jeff Law wrote:
> > On 9/10/19 1:51 PM, Bernd Edlinger wrote:
> >> Hi!
> >>
> >> This ICE happens when compiling real_nextafter in real.c.
> >> CSE sees this:
> >>
> >> (insn 179 178 180 11 (set (reg:SI 319)
> >>         (reg/v/f:SI 273 [ rD.73757 ])) "../../gcc-trunk-1/gcc/real.c":120:10 643 {*thumb2_movsi_vfp}
> >>      (nil))
> >> [...]
> >> (insn 181 180 182 11 (set (mem:SI (reg:SI 319) [0 MEM <charD.7[1:24]> [(voidD.73 *)r_77(D)]+0 S4 A8])
> >>         (unspec:SI [
> >>                 (reg:SI 320)
> >>             ] UNSPEC_UNALIGNED_STORE)) "../../gcc-trunk-1/gcc/real.c":120:10 129 {unaligned_storesi}
> >>      (nil))
> >> [...]
> >> (insn 186 185 187 11 (set (mem:SI (plus:SI (reg/v/f:SI 273 [ rD.73757 ])
> >>                 (const_int 20 [0x14])) [0 MEM <charD.7[1:24]> [(voidD.73 *)r_77(D)]+20 S4 A8])
> >>         (unspec:SI [
> >>                 (reg:SI 320)
> >>             ] UNSPEC_UNALIGNED_STORE)) "../../gcc-trunk-1/gcc/real.c":120:10 129 {unaligned_storesi}
> >>      (expr_list:REG_DEAD (reg:SI 320)
> >>         (expr_list:REG_DEAD (reg/f:SI 319 [ rD.73757 ])
> >>             (nil))))
> >> [...]
> >> (insn 234 233 235 11 (set (reg:SI 340)
> >>         (mem:SI (reg/v/f:SI 273 [ rD.73757 ]) [52 MEM <unsigned int> [(struct real_valueD.28367 *)r_77(D)]+0 S4 A32])) "../../gcc-trunk-1/gcc/real.c":5185:9 643 {*thumb2_movsi_vfp}
> >>      (nil))
> >>
> >>
> >> ... and transforms insn 234 in an invalid insn:
> >>
> >>
> >> (insn 234 233 235 11 (set (reg:SI 340 [ MEM <unsigned int> [(struct real_valueD.28367 *)r_77(D)] ])
> >>         (mem:SI (plus:SI (reg/v/f:SI 273 [ rD.73757 ])
> >>                 (const_int 20 [0x14])) [0 MEM <charD.7[1:24]> [(voidD.73 *)r_77(D)]+20 S4 A8])) "../../gcc-trunk-1/gcc/real.c":5185:9 643 {*thumb2_movsi_vfp}
> >>      (nil))
> >>
> >> which triggers the assertion in the arm back-end, because the MEM is not aligned.
> >>
> >> To fix that I changed exp_equiv_p to consider MEMs with different MEM_ALIGN or
> >> ALIAS_SET as different.
> >>
> >> This patch fixes the arm bootstrap for --with-cpu=cortex-a57 --with-mode=thumb --with-fpu=fp-armv8 --with-float=hard
> >> which I confirmed using a cross compiler.  And it fixes the test case that is attached to the PR, but it is way
> >> too large for the test suite.
> >>
> >>
> >> Bootstrapped and reg-tested on x86_64-pc-linux-gnu.
> >> Is it OK for trunk?
> >>
> >>
> >> Thanks
> >> Bernd.
> >>
> >>
> >> patch-pr91708.diff
> >>
> >> 2019-09-10  Bernd Edlinger  <bernd.edlinger@hotmail.de>
> >>
> >> 	PR middle-end/91708
> >> 	* cse.c (exp_equiv_p): Consider MEMs with different
> >> 	alias set or alignment as different.
> > Hmm, I would have expected the address space and alignment checks to be
> > handled by the call to mem_attrs_eq_p.
> > 
> 
> Yes, but exp_equiv_p is called from here:
> 
> lookup (rtx x, unsigned int hash, machine_mode mode)
> {
>   struct table_elt *p;
> 
>   for (p = table[hash]; p; p = p->next_same_hash)
>     if (mode == p->mode && ((x == p->exp && REG_P (x))
>                             || exp_equiv_p (x, p->exp, !REG_P (x), false)))
>       return p;
> 
> with for_gcse = false si mem_attrs_eq_p is not used.
> 
> > Which aspect of the MEMs is really causing the problem here?
> > 
> 
> The alignment is (formally) different, but since also the address is different,
> the alignment could be also be really wrong.
> 
> Originally it was [(struct real_valueD.28367 *)r_77(D)]+0 S4 A32]
> but it is replaced with [0 MEM <charD.7[1:24]> [(voidD.73 *)r_77(D)]+20 S4 A8],
> that appears to be done, because there is the same value that was in the original
> location.
> 
> But also the wrong alias set will cause problems if the instruction
> is re-materialized in another place (which actually happened in LRA and caused the
> assertion, BTW).

But !for_gcse shouldn't do this kind of things with MEMs.  It should
only replace a MEM with a REG, not put in some other "equivalent"
MEMs.

Richard.
Bernd Edlinger Sept. 11, 2019, 1:49 p.m. UTC | #5
On 9/11/19 9:23 AM, Richard Biener wrote:
> On Tue, 10 Sep 2019, Bernd Edlinger wrote:
> 
>> Hi!
>>
>> This ICE happens when compiling real_nextafter in real.c.
>> CSE sees this:
>>
>> (insn 179 178 180 11 (set (reg:SI 319)
>>         (reg/v/f:SI 273 [ rD.73757 ])) "../../gcc-trunk-1/gcc/real.c":120:10 643 {*thumb2_movsi_vfp}
>>      (nil))
>> [...]
>> (insn 181 180 182 11 (set (mem:SI (reg:SI 319) [0 MEM <charD.7[1:24]> [(voidD.73 *)r_77(D)]+0 S4 A8])
>>         (unspec:SI [
>>                 (reg:SI 320)
>>             ] UNSPEC_UNALIGNED_STORE)) "../../gcc-trunk-1/gcc/real.c":120:10 129 {unaligned_storesi}
>>      (nil))
>> [...]
>> (insn 186 185 187 11 (set (mem:SI (plus:SI (reg/v/f:SI 273 [ rD.73757 ])
>>                 (const_int 20 [0x14])) [0 MEM <charD.7[1:24]> [(voidD.73 *)r_77(D)]+20 S4 A8])
>>         (unspec:SI [
>>                 (reg:SI 320)
>>             ] UNSPEC_UNALIGNED_STORE)) "../../gcc-trunk-1/gcc/real.c":120:10 129 {unaligned_storesi}
>>      (expr_list:REG_DEAD (reg:SI 320)
>>         (expr_list:REG_DEAD (reg/f:SI 319 [ rD.73757 ])
>>             (nil))))
>> [...]
>> (insn 234 233 235 11 (set (reg:SI 340)
>>         (mem:SI (reg/v/f:SI 273 [ rD.73757 ]) [52 MEM <unsigned int> [(struct real_valueD.28367 *)r_77(D)]+0 S4 A32])) "../../gcc-trunk-1/gcc/real.c":5185:9 643 {*thumb2_movsi_vfp}
>>      (nil))
>>
>>
>> ... and transforms insn 234 in an invalid insn:
>>
>>
>> (insn 234 233 235 11 (set (reg:SI 340 [ MEM <unsigned int> [(struct real_valueD.28367 *)r_77(D)] ])
>>         (mem:SI (plus:SI (reg/v/f:SI 273 [ rD.73757 ])
>>                 (const_int 20 [0x14])) [0 MEM <charD.7[1:24]> [(voidD.73 *)r_77(D)]+20 S4 A8])) "../../gcc-trunk-1/gcc/real.c":5185:9 643 {*thumb2_movsi_vfp}
>>      (nil))
>>
>> which triggers the assertion in the arm back-end, because the MEM is not aligned.
>>
>> To fix that I changed exp_equiv_p to consider MEMs with different MEM_ALIGN or
>> ALIAS_SET as different.
> 
> I think the appropriate fix is to retain the mem_attrs of the original
> MEM since obviously the replacement does not change those?  It's a mere
> change of the MEMs address but implementation-wise CSE replaces the whole
> MEM?
> 
> For CSE exp_equiv_p simply compares the addresses, only if for_gcse
> we do further checks like mem_attrs_eq_p.  IMHO that is correct.
> 
> I'm not sure what CSE does above, that is, what does it think it does?
> It doesn't replace the loaded value, it seems to do sth else which
> is odd for CSE (replaces a REG with a PLUS - but why?)
>  

What I think CSE is thinking there is this:

insn 181 stores the value of (reg:SI 320) at (mem:SI (reg:SI 319) [0 MEM <charD.7[1:24]> [(voidD.73 *)r_77(D)]+0 S4 A8]
that is the same address as where insn 234 reads the value back but there we know it is aligned.

insn 186 stores the same value at (mem:SI (plus:SI (reg/v/f:SI 273 [ rD.73757 ]) (const_int 20 [0x14])) [0 MEM <charD.7[1:24]> [(voidD.73 *)r_77(D)]+20 S4 A8]

But since both reg:SI 320 is dead and reg:SI 319 is dead as well, when
insn 234 processed, it replaces one mem with another mem that has the same
value.


Usually that would be okay, but not when the RTX is extracted from an unspec
unalinged_storedi.


Bernd.


>> This patch fixes the arm bootstrap for --with-cpu=cortex-a57 --with-mode=thumb --with-fpu=fp-armv8 --with-float=hard
>> which I confirmed using a cross compiler.  And it fixes the test case that is attached to the PR, but it is way
>> too large for the test suite.
>>
>>
>> Bootstrapped and reg-tested on x86_64-pc-linux-gnu.
>> Is it OK for trunk?
> 
> So no, I don't think this is correct.
> 
> Thanks,
> Richard.
>
Richard Biener Sept. 11, 2019, 1:55 p.m. UTC | #6
On Wed, 11 Sep 2019, Bernd Edlinger wrote:

> On 9/11/19 9:23 AM, Richard Biener wrote:
> > On Tue, 10 Sep 2019, Bernd Edlinger wrote:
> > 
> >> Hi!
> >>
> >> This ICE happens when compiling real_nextafter in real.c.
> >> CSE sees this:
> >>
> >> (insn 179 178 180 11 (set (reg:SI 319)
> >>         (reg/v/f:SI 273 [ rD.73757 ])) "../../gcc-trunk-1/gcc/real.c":120:10 643 {*thumb2_movsi_vfp}
> >>      (nil))
> >> [...]
> >> (insn 181 180 182 11 (set (mem:SI (reg:SI 319) [0 MEM <charD.7[1:24]> [(voidD.73 *)r_77(D)]+0 S4 A8])
> >>         (unspec:SI [
> >>                 (reg:SI 320)
> >>             ] UNSPEC_UNALIGNED_STORE)) "../../gcc-trunk-1/gcc/real.c":120:10 129 {unaligned_storesi}
> >>      (nil))
> >> [...]
> >> (insn 186 185 187 11 (set (mem:SI (plus:SI (reg/v/f:SI 273 [ rD.73757 ])
> >>                 (const_int 20 [0x14])) [0 MEM <charD.7[1:24]> [(voidD.73 *)r_77(D)]+20 S4 A8])
> >>         (unspec:SI [
> >>                 (reg:SI 320)
> >>             ] UNSPEC_UNALIGNED_STORE)) "../../gcc-trunk-1/gcc/real.c":120:10 129 {unaligned_storesi}
> >>      (expr_list:REG_DEAD (reg:SI 320)
> >>         (expr_list:REG_DEAD (reg/f:SI 319 [ rD.73757 ])
> >>             (nil))))
> >> [...]
> >> (insn 234 233 235 11 (set (reg:SI 340)
> >>         (mem:SI (reg/v/f:SI 273 [ rD.73757 ]) [52 MEM <unsigned int> [(struct real_valueD.28367 *)r_77(D)]+0 S4 A32])) "../../gcc-trunk-1/gcc/real.c":5185:9 643 {*thumb2_movsi_vfp}
> >>      (nil))
> >>
> >>
> >> ... and transforms insn 234 in an invalid insn:
> >>
> >>
> >> (insn 234 233 235 11 (set (reg:SI 340 [ MEM <unsigned int> [(struct real_valueD.28367 *)r_77(D)] ])
> >>         (mem:SI (plus:SI (reg/v/f:SI 273 [ rD.73757 ])
> >>                 (const_int 20 [0x14])) [0 MEM <charD.7[1:24]> [(voidD.73 *)r_77(D)]+20 S4 A8])) "../../gcc-trunk-1/gcc/real.c":5185:9 643 {*thumb2_movsi_vfp}
> >>      (nil))
> >>
> >> which triggers the assertion in the arm back-end, because the MEM is not aligned.
> >>
> >> To fix that I changed exp_equiv_p to consider MEMs with different MEM_ALIGN or
> >> ALIAS_SET as different.
> > 
> > I think the appropriate fix is to retain the mem_attrs of the original
> > MEM since obviously the replacement does not change those?  It's a mere
> > change of the MEMs address but implementation-wise CSE replaces the whole
> > MEM?
> > 
> > For CSE exp_equiv_p simply compares the addresses, only if for_gcse
> > we do further checks like mem_attrs_eq_p.  IMHO that is correct.
> > 
> > I'm not sure what CSE does above, that is, what does it think it does?
> > It doesn't replace the loaded value, it seems to do sth else which
> > is odd for CSE (replaces a REG with a PLUS - but why?)
> >  
> 
> What I think CSE is thinking there is this:
> 
> insn 181 stores the value of (reg:SI 320) at (mem:SI (reg:SI 319) [0 MEM <charD.7[1:24]> [(voidD.73 *)r_77(D)]+0 S4 A8]
> that is the same address as where insn 234 reads the value back but there we know it is aligned.
> 
> insn 186 stores the same value at (mem:SI (plus:SI (reg/v/f:SI 273 [ rD.73757 ]) (const_int 20 [0x14])) [0 MEM <charD.7[1:24]> [(voidD.73 *)r_77(D)]+20 S4 A8]
> 
> But since both reg:SI 320 is dead and reg:SI 319 is dead as well, when
> insn 234 processed, it replaces one mem with another mem that has the same
> value.

But why?  I see zero benefit doing that.  Note the addresses are equal,
so if it for some reason is beneficial CSE could still simply 
_preserve_ the original mem_attrs.

Richard.
Bernd Edlinger Sept. 11, 2019, 2:41 p.m. UTC | #7
On 9/11/19 3:55 PM, Richard Biener wrote:
> On Wed, 11 Sep 2019, Bernd Edlinger wrote:
> 
>> On 9/11/19 9:23 AM, Richard Biener wrote:
>>> On Tue, 10 Sep 2019, Bernd Edlinger wrote:
>>>
>>>> Hi!
>>>>
>>>> This ICE happens when compiling real_nextafter in real.c.
>>>> CSE sees this:
>>>>
>>>> (insn 179 178 180 11 (set (reg:SI 319)
>>>>         (reg/v/f:SI 273 [ rD.73757 ])) "../../gcc-trunk-1/gcc/real.c":120:10 643 {*thumb2_movsi_vfp}
>>>>      (nil))
>>>> [...]
>>>> (insn 181 180 182 11 (set (mem:SI (reg:SI 319) [0 MEM <charD.7[1:24]> [(voidD.73 *)r_77(D)]+0 S4 A8])
>>>>         (unspec:SI [
>>>>                 (reg:SI 320)
>>>>             ] UNSPEC_UNALIGNED_STORE)) "../../gcc-trunk-1/gcc/real.c":120:10 129 {unaligned_storesi}
>>>>      (nil))
>>>> [...]
>>>> (insn 186 185 187 11 (set (mem:SI (plus:SI (reg/v/f:SI 273 [ rD.73757 ])
>>>>                 (const_int 20 [0x14])) [0 MEM <charD.7[1:24]> [(voidD.73 *)r_77(D)]+20 S4 A8])
>>>>         (unspec:SI [
>>>>                 (reg:SI 320)
>>>>             ] UNSPEC_UNALIGNED_STORE)) "../../gcc-trunk-1/gcc/real.c":120:10 129 {unaligned_storesi}
>>>>      (expr_list:REG_DEAD (reg:SI 320)
>>>>         (expr_list:REG_DEAD (reg/f:SI 319 [ rD.73757 ])
>>>>             (nil))))
>>>> [...]
>>>> (insn 234 233 235 11 (set (reg:SI 340)
>>>>         (mem:SI (reg/v/f:SI 273 [ rD.73757 ]) [52 MEM <unsigned int> [(struct real_valueD.28367 *)r_77(D)]+0 S4 A32])) "../../gcc-trunk-1/gcc/real.c":5185:9 643 {*thumb2_movsi_vfp}
>>>>      (nil))
>>>>
>>>>
>>>> ... and transforms insn 234 in an invalid insn:
>>>>
>>>>
>>>> (insn 234 233 235 11 (set (reg:SI 340 [ MEM <unsigned int> [(struct real_valueD.28367 *)r_77(D)] ])
>>>>         (mem:SI (plus:SI (reg/v/f:SI 273 [ rD.73757 ])
>>>>                 (const_int 20 [0x14])) [0 MEM <charD.7[1:24]> [(voidD.73 *)r_77(D)]+20 S4 A8])) "../../gcc-trunk-1/gcc/real.c":5185:9 643 {*thumb2_movsi_vfp}
>>>>      (nil))
>>>>
>>>> which triggers the assertion in the arm back-end, because the MEM is not aligned.
>>>>
>>>> To fix that I changed exp_equiv_p to consider MEMs with different MEM_ALIGN or
>>>> ALIAS_SET as different.
>>>
>>> I think the appropriate fix is to retain the mem_attrs of the original
>>> MEM since obviously the replacement does not change those?  It's a mere
>>> change of the MEMs address but implementation-wise CSE replaces the whole
>>> MEM?
>>>
>>> For CSE exp_equiv_p simply compares the addresses, only if for_gcse
>>> we do further checks like mem_attrs_eq_p.  IMHO that is correct.
>>>
>>> I'm not sure what CSE does above, that is, what does it think it does?
>>> It doesn't replace the loaded value, it seems to do sth else which
>>> is odd for CSE (replaces a REG with a PLUS - but why?)
>>>  
>>
>> What I think CSE is thinking there is this:
>>
>> insn 181 stores the value of (reg:SI 320) at (mem:SI (reg:SI 319) [0 MEM <charD.7[1:24]> [(voidD.73 *)r_77(D)]+0 S4 A8]
>> that is the same address as where insn 234 reads the value back but there we know it is aligned.
>>
>> insn 186 stores the same value at (mem:SI (plus:SI (reg/v/f:SI 273 [ rD.73757 ]) (const_int 20 [0x14])) [0 MEM <charD.7[1:24]> [(voidD.73 *)r_77(D)]+20 S4 A8]
>>
>> But since both reg:SI 320 is dead and reg:SI 319 is dead as well, when
>> insn 234 processed, it replaces one mem with another mem that has the same
>> value.
> 
> But why?  I see zero benefit doing that.  Note the addresses are equal,
> so if it for some reason is beneficial CSE could still simply 
> _preserve_ the original mem_attrs.
> 

Yes, but the addresses are simply *not* equal.

--- cse.c.orig	2019-07-24 21:21:53.590065924 +0200
+++ cse.c	2019-09-11 16:03:43.429236836 +0200
@@ -4564,6 +4564,10 @@
   else if (GET_CODE (x) == PARALLEL)
     sets = XALLOCAVEC (struct set, XVECLEN (x, 0));
 
+  if (INSN_UID(insn) == 234 && GET_CODE(x) == SET
+      && GET_CODE(SET_DEST(x)) == REG && REGNO(SET_DEST(x)) == 340
+      && getenv("debug")) asm("int3");
+
   this_insn = insn;
   /* Records what this insn does to set CC0.  */
   this_insn_cc0 = 0;

(gdb) n
4809		elt = lookup (src, sets[i].src_hash, mode);
(gdb) call debug(src)
(mem:SI (reg/v/f:SI 273 [ r ]) [52 MEM <unsigned int> [(struct real_value *)r_77(D)]+0 S4 A32])
(gdb) call dump_class(elt)
Equivalence chain for (mem:SI (reg/v/f:SI 273 [ r ]) [0 MEM <char[1:24]> [(void *)r_77(D)]+0 S4 A8]): 
(unspec:SI [
        (reg:SI 320)
    ] UNSPEC_UNALIGNED_STORE)
(mem:SI (plus:SI (reg/v/f:SI 273 [ r ])
        (const_int 20 [0x14])) [0 MEM <char[1:24]> [(void *)r_77(D)]+20 S4 A8])
(mem:SI (plus:SI (reg/v/f:SI 273 [ r ])
        (const_int 16 [0x10])) [0 MEM <char[1:24]> [(void *)r_77(D)]+16 S4 A8])
(mem:SI (plus:SI (reg/v/f:SI 273 [ r ])
        (const_int 12 [0xc])) [0 MEM <char[1:24]> [(void *)r_77(D)]+12 S4 A8])
(mem:SI (plus:SI (reg/v/f:SI 273 [ r ])
        (const_int 8 [0x8])) [0 MEM <char[1:24]> [(void *)r_77(D)]+8 S4 A8])
(mem:SI (plus:SI (reg/v/f:SI 273 [ r ])
        (const_int 4 [0x4])) [0 MEM <char[1:24]> [(void *)r_77(D)]+4 S4 A8])
(mem:SI (reg/v/f:SI 273 [ r ]) [0 MEM <char[1:24]> [(void *)r_77(D)]+0 S4 A8])

[...]
5393		  else if (validate_unshare_change (insn, &SET_SRC (sets[i].rtl),
(gdb) call debug(insn)
(insn 234 233 235 11 (set (reg:SI 340 [ MEM <unsigned int> [(struct real_value *)r_77(D)] ])
        (mem:SI (reg/v/f:SI 273 [ r ]) [52 MEM <unsigned int> [(struct real_value *)r_77(D)]+0 S4 A32])) "../../gcc-trunk-1/gcc/real.c":5185:9 643 {*thumb2_movsi_vfp}
     (nil))
(gdb) call debug(trial)
(mem:SI (plus:SI (reg/v/f:SI 273 [ r ])
        (const_int 20 [0x14])) [0 MEM <char[1:24]> [(void *)r_77(D)]+20 S4 A8])
(gdb) n
5396		      rtx new_rtx = canon_reg (SET_SRC (sets[i].rtl), insn);
(gdb) call debug(insn)
(insn 234 233 235 11 (set (reg:SI 340 [ MEM <unsigned int> [(struct real_value *)r_77(D)] ])
        (mem:SI (plus:SI (reg/v/f:SI 273 [ r ])
                (const_int 20 [0x14])) [0 MEM <char[1:24]> [(void *)r_77(D)]+20 S4 A8])) "../../gcc-trunk-1/gcc/real.c":5185:9 643 {*thumb2_movsi_vfp}
     (nil))


It  looks like it is designed to pick a different mem with a different address,
but the same value.

Right?


Bernd.
Richard Biener Sept. 11, 2019, 3:38 p.m. UTC | #8
On September 11, 2019 4:41:10 PM GMT+02:00, Bernd Edlinger <bernd.edlinger@hotmail.de> wrote:
>On 9/11/19 3:55 PM, Richard Biener wrote:
>> On Wed, 11 Sep 2019, Bernd Edlinger wrote:
>> 
>>> On 9/11/19 9:23 AM, Richard Biener wrote:
>>>> On Tue, 10 Sep 2019, Bernd Edlinger wrote:
>>>>
>>>>> Hi!
>>>>>
>>>>> This ICE happens when compiling real_nextafter in real.c.
>>>>> CSE sees this:
>>>>>
>>>>> (insn 179 178 180 11 (set (reg:SI 319)
>>>>>         (reg/v/f:SI 273 [ rD.73757 ]))
>"../../gcc-trunk-1/gcc/real.c":120:10 643 {*thumb2_movsi_vfp}
>>>>>      (nil))
>>>>> [...]
>>>>> (insn 181 180 182 11 (set (mem:SI (reg:SI 319) [0 MEM
><charD.7[1:24]> [(voidD.73 *)r_77(D)]+0 S4 A8])
>>>>>         (unspec:SI [
>>>>>                 (reg:SI 320)
>>>>>             ] UNSPEC_UNALIGNED_STORE))
>"../../gcc-trunk-1/gcc/real.c":120:10 129 {unaligned_storesi}
>>>>>      (nil))
>>>>> [...]
>>>>> (insn 186 185 187 11 (set (mem:SI (plus:SI (reg/v/f:SI 273 [
>rD.73757 ])
>>>>>                 (const_int 20 [0x14])) [0 MEM <charD.7[1:24]>
>[(voidD.73 *)r_77(D)]+20 S4 A8])
>>>>>         (unspec:SI [
>>>>>                 (reg:SI 320)
>>>>>             ] UNSPEC_UNALIGNED_STORE))
>"../../gcc-trunk-1/gcc/real.c":120:10 129 {unaligned_storesi}
>>>>>      (expr_list:REG_DEAD (reg:SI 320)
>>>>>         (expr_list:REG_DEAD (reg/f:SI 319 [ rD.73757 ])
>>>>>             (nil))))
>>>>> [...]
>>>>> (insn 234 233 235 11 (set (reg:SI 340)
>>>>>         (mem:SI (reg/v/f:SI 273 [ rD.73757 ]) [52 MEM <unsigned
>int> [(struct real_valueD.28367 *)r_77(D)]+0 S4 A32]))
>"../../gcc-trunk-1/gcc/real.c":5185:9 643 {*thumb2_movsi_vfp}
>>>>>      (nil))
>>>>>
>>>>>
>>>>> ... and transforms insn 234 in an invalid insn:
>>>>>
>>>>>
>>>>> (insn 234 233 235 11 (set (reg:SI 340 [ MEM <unsigned int>
>[(struct real_valueD.28367 *)r_77(D)] ])
>>>>>         (mem:SI (plus:SI (reg/v/f:SI 273 [ rD.73757 ])
>>>>>                 (const_int 20 [0x14])) [0 MEM <charD.7[1:24]>
>[(voidD.73 *)r_77(D)]+20 S4 A8])) "../../gcc-trunk-1/gcc/real.c":5185:9
>643 {*thumb2_movsi_vfp}
>>>>>      (nil))
>>>>>
>>>>> which triggers the assertion in the arm back-end, because the MEM
>is not aligned.
>>>>>
>>>>> To fix that I changed exp_equiv_p to consider MEMs with different
>MEM_ALIGN or
>>>>> ALIAS_SET as different.
>>>>
>>>> I think the appropriate fix is to retain the mem_attrs of the
>original
>>>> MEM since obviously the replacement does not change those?  It's a
>mere
>>>> change of the MEMs address but implementation-wise CSE replaces the
>whole
>>>> MEM?
>>>>
>>>> For CSE exp_equiv_p simply compares the addresses, only if for_gcse
>>>> we do further checks like mem_attrs_eq_p.  IMHO that is correct.
>>>>
>>>> I'm not sure what CSE does above, that is, what does it think it
>does?
>>>> It doesn't replace the loaded value, it seems to do sth else which
>>>> is odd for CSE (replaces a REG with a PLUS - but why?)
>>>>  
>>>
>>> What I think CSE is thinking there is this:
>>>
>>> insn 181 stores the value of (reg:SI 320) at (mem:SI (reg:SI 319) [0
>MEM <charD.7[1:24]> [(voidD.73 *)r_77(D)]+0 S4 A8]
>>> that is the same address as where insn 234 reads the value back but
>there we know it is aligned.
>>>
>>> insn 186 stores the same value at (mem:SI (plus:SI (reg/v/f:SI 273 [
>rD.73757 ]) (const_int 20 [0x14])) [0 MEM <charD.7[1:24]> [(voidD.73
>*)r_77(D)]+20 S4 A8]
>>>
>>> But since both reg:SI 320 is dead and reg:SI 319 is dead as well,
>when
>>> insn 234 processed, it replaces one mem with another mem that has
>the same
>>> value.
>> 
>> But why?  I see zero benefit doing that.  Note the addresses are
>equal,
>> so if it for some reason is beneficial CSE could still simply 
>> _preserve_ the original mem_attrs.
>> 
>
>Yes, but the addresses are simply *not* equal.
>
>--- cse.c.orig	2019-07-24 21:21:53.590065924 +0200
>+++ cse.c	2019-09-11 16:03:43.429236836 +0200
>@@ -4564,6 +4564,10 @@
>   else if (GET_CODE (x) == PARALLEL)
>     sets = XALLOCAVEC (struct set, XVECLEN (x, 0));
> 
>+  if (INSN_UID(insn) == 234 && GET_CODE(x) == SET
>+      && GET_CODE(SET_DEST(x)) == REG && REGNO(SET_DEST(x)) == 340
>+      && getenv("debug")) asm("int3");
>+
>   this_insn = insn;
>   /* Records what this insn does to set CC0.  */
>   this_insn_cc0 = 0;
>
>(gdb) n
>4809		elt = lookup (src, sets[i].src_hash, mode);
>(gdb) call debug(src)
>(mem:SI (reg/v/f:SI 273 [ r ]) [52 MEM <unsigned int> [(struct
>real_value *)r_77(D)]+0 S4 A32])
>(gdb) call dump_class(elt)
>Equivalence chain for (mem:SI (reg/v/f:SI 273 [ r ]) [0 MEM
><char[1:24]> [(void *)r_77(D)]+0 S4 A8]): 
>(unspec:SI [
>        (reg:SI 320)
>    ] UNSPEC_UNALIGNED_STORE)
>(mem:SI (plus:SI (reg/v/f:SI 273 [ r ])
>(const_int 20 [0x14])) [0 MEM <char[1:24]> [(void *)r_77(D)]+20 S4 A8])
>(mem:SI (plus:SI (reg/v/f:SI 273 [ r ])
>(const_int 16 [0x10])) [0 MEM <char[1:24]> [(void *)r_77(D)]+16 S4 A8])
>(mem:SI (plus:SI (reg/v/f:SI 273 [ r ])
> (const_int 12 [0xc])) [0 MEM <char[1:24]> [(void *)r_77(D)]+12 S4 A8])
>(mem:SI (plus:SI (reg/v/f:SI 273 [ r ])
>   (const_int 8 [0x8])) [0 MEM <char[1:24]> [(void *)r_77(D)]+8 S4 A8])
>(mem:SI (plus:SI (reg/v/f:SI 273 [ r ])
>   (const_int 4 [0x4])) [0 MEM <char[1:24]> [(void *)r_77(D)]+4 S4 A8])
>(mem:SI (reg/v/f:SI 273 [ r ]) [0 MEM <char[1:24]> [(void *)r_77(D)]+0
>S4 A8])
>
>[...]
>5393		  else if (validate_unshare_change (insn, &SET_SRC (sets[i].rtl),
>(gdb) call debug(insn)
>(insn 234 233 235 11 (set (reg:SI 340 [ MEM <unsigned int> [(struct
>real_value *)r_77(D)] ])
>(mem:SI (reg/v/f:SI 273 [ r ]) [52 MEM <unsigned int> [(struct
>real_value *)r_77(D)]+0 S4 A32])) "../../gcc-trunk-1/gcc/real.c":5185:9
>643 {*thumb2_movsi_vfp}
>     (nil))
>(gdb) call debug(trial)
>(mem:SI (plus:SI (reg/v/f:SI 273 [ r ])
>(const_int 20 [0x14])) [0 MEM <char[1:24]> [(void *)r_77(D)]+20 S4 A8])
>(gdb) n
>5396		      rtx new_rtx = canon_reg (SET_SRC (sets[i].rtl), insn);
>(gdb) call debug(insn)
>(insn 234 233 235 11 (set (reg:SI 340 [ MEM <unsigned int> [(struct
>real_value *)r_77(D)] ])
>        (mem:SI (plus:SI (reg/v/f:SI 273 [ r ])
>(const_int 20 [0x14])) [0 MEM <char[1:24]> [(void *)r_77(D)]+20 S4
>A8])) "../../gcc-trunk-1/gcc/real.c":5185:9 643 {*thumb2_movsi_vfp}
>     (nil))
>
>
>It  looks like it is designed to pick a different mem with a different
>address,
>but the same value.
>
>Right?

But the function you originally patched recursed into the MEM operands and thus the address, comparing them. So I conclude that CSE thinks they have the same value. 

Richard. 

>
>Bernd.
Jeff Law Sept. 11, 2019, 4:03 p.m. UTC | #9
On 9/11/19 7:37 AM, Richard Biener wrote:
> On Wed, 11 Sep 2019, Bernd Edlinger wrote:
> 
>> On 9/11/19 12:43 AM, Jeff Law wrote:
>>> On 9/10/19 1:51 PM, Bernd Edlinger wrote:
>>>> Hi!
>>>>
>>>> This ICE happens when compiling real_nextafter in real.c.
>>>> CSE sees this:
>>>>
>>>> (insn 179 178 180 11 (set (reg:SI 319)
>>>>         (reg/v/f:SI 273 [ rD.73757 ])) "../../gcc-trunk-1/gcc/real.c":120:10 643 {*thumb2_movsi_vfp}
>>>>      (nil))
>>>> [...]
>>>> (insn 181 180 182 11 (set (mem:SI (reg:SI 319) [0 MEM <charD.7[1:24]> [(voidD.73 *)r_77(D)]+0 S4 A8])
>>>>         (unspec:SI [
>>>>                 (reg:SI 320)
>>>>             ] UNSPEC_UNALIGNED_STORE)) "../../gcc-trunk-1/gcc/real.c":120:10 129 {unaligned_storesi}
>>>>      (nil))
>>>> [...]
>>>> (insn 186 185 187 11 (set (mem:SI (plus:SI (reg/v/f:SI 273 [ rD.73757 ])
>>>>                 (const_int 20 [0x14])) [0 MEM <charD.7[1:24]> [(voidD.73 *)r_77(D)]+20 S4 A8])
>>>>         (unspec:SI [
>>>>                 (reg:SI 320)
>>>>             ] UNSPEC_UNALIGNED_STORE)) "../../gcc-trunk-1/gcc/real.c":120:10 129 {unaligned_storesi}
>>>>      (expr_list:REG_DEAD (reg:SI 320)
>>>>         (expr_list:REG_DEAD (reg/f:SI 319 [ rD.73757 ])
>>>>             (nil))))
>>>> [...]
>>>> (insn 234 233 235 11 (set (reg:SI 340)
>>>>         (mem:SI (reg/v/f:SI 273 [ rD.73757 ]) [52 MEM <unsigned int> [(struct real_valueD.28367 *)r_77(D)]+0 S4 A32])) "../../gcc-trunk-1/gcc/real.c":5185:9 643 {*thumb2_movsi_vfp}
>>>>      (nil))
>>>>
>>>>
>>>> ... and transforms insn 234 in an invalid insn:
>>>>
>>>>
>>>> (insn 234 233 235 11 (set (reg:SI 340 [ MEM <unsigned int> [(struct real_valueD.28367 *)r_77(D)] ])
>>>>         (mem:SI (plus:SI (reg/v/f:SI 273 [ rD.73757 ])
>>>>                 (const_int 20 [0x14])) [0 MEM <charD.7[1:24]> [(voidD.73 *)r_77(D)]+20 S4 A8])) "../../gcc-trunk-1/gcc/real.c":5185:9 643 {*thumb2_movsi_vfp}
>>>>      (nil))
>>>>
>>>> which triggers the assertion in the arm back-end, because the MEM is not aligned.
>>>>
>>>> To fix that I changed exp_equiv_p to consider MEMs with different MEM_ALIGN or
>>>> ALIAS_SET as different.
>>>>
>>>> This patch fixes the arm bootstrap for --with-cpu=cortex-a57 --with-mode=thumb --with-fpu=fp-armv8 --with-float=hard
>>>> which I confirmed using a cross compiler.  And it fixes the test case that is attached to the PR, but it is way
>>>> too large for the test suite.
>>>>
>>>>
>>>> Bootstrapped and reg-tested on x86_64-pc-linux-gnu.
>>>> Is it OK for trunk?
>>>>
>>>>
>>>> Thanks
>>>> Bernd.
>>>>
>>>>
>>>> patch-pr91708.diff
>>>>
>>>> 2019-09-10  Bernd Edlinger  <bernd.edlinger@hotmail.de>
>>>>
>>>> 	PR middle-end/91708
>>>> 	* cse.c (exp_equiv_p): Consider MEMs with different
>>>> 	alias set or alignment as different.
>>> Hmm, I would have expected the address space and alignment checks to be
>>> handled by the call to mem_attrs_eq_p.
>>>
>>
>> Yes, but exp_equiv_p is called from here:
>>
>> lookup (rtx x, unsigned int hash, machine_mode mode)
>> {
>>   struct table_elt *p;
>>
>>   for (p = table[hash]; p; p = p->next_same_hash)
>>     if (mode == p->mode && ((x == p->exp && REG_P (x))
>>                             || exp_equiv_p (x, p->exp, !REG_P (x), false)))
>>       return p;
>>
>> with for_gcse = false si mem_attrs_eq_p is not used.
>>
>>> Which aspect of the MEMs is really causing the problem here?
>>>
>>
>> The alignment is (formally) different, but since also the address is different,
>> the alignment could be also be really wrong.
>>
>> Originally it was [(struct real_valueD.28367 *)r_77(D)]+0 S4 A32]
>> but it is replaced with [0 MEM <charD.7[1:24]> [(voidD.73 *)r_77(D)]+20 S4 A8],
>> that appears to be done, because there is the same value that was in the original
>> location.
>>
>> But also the wrong alias set will cause problems if the instruction
>> is re-materialized in another place (which actually happened in LRA and caused the
>> assertion, BTW).
> 
> But !for_gcse shouldn't do this kind of things with MEMs.  It should
> only replace a MEM with a REG, not put in some other "equivalent"
> MEMs.
Agreed, generally replacing one MEM with another MEM would be silly.
Though I wouldn't be terribly surprised if CSE did so if the form of one
of the MEMs was cheaper than the other.

It be better to use a REG though.

jeff
Jeff Law Sept. 11, 2019, 4:08 p.m. UTC | #10
On 9/11/19 7:49 AM, Bernd Edlinger wrote:
> On 9/11/19 9:23 AM, Richard Biener wrote:
>> On Tue, 10 Sep 2019, Bernd Edlinger wrote:
>>
>>> Hi!
>>>
>>> This ICE happens when compiling real_nextafter in real.c.
>>> CSE sees this:
>>>
>>> (insn 179 178 180 11 (set (reg:SI 319)
>>>         (reg/v/f:SI 273 [ rD.73757 ])) "../../gcc-trunk-1/gcc/real.c":120:10 643 {*thumb2_movsi_vfp}
>>>      (nil))
>>> [...]
>>> (insn 181 180 182 11 (set (mem:SI (reg:SI 319) [0 MEM <charD.7[1:24]> [(voidD.73 *)r_77(D)]+0 S4 A8])
>>>         (unspec:SI [
>>>                 (reg:SI 320)
>>>             ] UNSPEC_UNALIGNED_STORE)) "../../gcc-trunk-1/gcc/real.c":120:10 129 {unaligned_storesi}
>>>      (nil))
>>> [...]
>>> (insn 186 185 187 11 (set (mem:SI (plus:SI (reg/v/f:SI 273 [ rD.73757 ])
>>>                 (const_int 20 [0x14])) [0 MEM <charD.7[1:24]> [(voidD.73 *)r_77(D)]+20 S4 A8])
>>>         (unspec:SI [
>>>                 (reg:SI 320)
>>>             ] UNSPEC_UNALIGNED_STORE)) "../../gcc-trunk-1/gcc/real.c":120:10 129 {unaligned_storesi}
>>>      (expr_list:REG_DEAD (reg:SI 320)
>>>         (expr_list:REG_DEAD (reg/f:SI 319 [ rD.73757 ])
>>>             (nil))))
>>> [...]
>>> (insn 234 233 235 11 (set (reg:SI 340)
>>>         (mem:SI (reg/v/f:SI 273 [ rD.73757 ]) [52 MEM <unsigned int> [(struct real_valueD.28367 *)r_77(D)]+0 S4 A32])) "../../gcc-trunk-1/gcc/real.c":5185:9 643 {*thumb2_movsi_vfp}
>>>      (nil))
>>>
>>>
>>> ... and transforms insn 234 in an invalid insn:
>>>
>>>
>>> (insn 234 233 235 11 (set (reg:SI 340 [ MEM <unsigned int> [(struct real_valueD.28367 *)r_77(D)] ])
>>>         (mem:SI (plus:SI (reg/v/f:SI 273 [ rD.73757 ])
>>>                 (const_int 20 [0x14])) [0 MEM <charD.7[1:24]> [(voidD.73 *)r_77(D)]+20 S4 A8])) "../../gcc-trunk-1/gcc/real.c":5185:9 643 {*thumb2_movsi_vfp}
>>>      (nil))
>>>
>>> which triggers the assertion in the arm back-end, because the MEM is not aligned.
>>>
>>> To fix that I changed exp_equiv_p to consider MEMs with different MEM_ALIGN or
>>> ALIAS_SET as different.
>>
>> I think the appropriate fix is to retain the mem_attrs of the original
>> MEM since obviously the replacement does not change those?  It's a mere
>> change of the MEMs address but implementation-wise CSE replaces the whole
>> MEM?
>>
>> For CSE exp_equiv_p simply compares the addresses, only if for_gcse
>> we do further checks like mem_attrs_eq_p.  IMHO that is correct.
>>
>> I'm not sure what CSE does above, that is, what does it think it does?
>> It doesn't replace the loaded value, it seems to do sth else which
>> is odd for CSE (replaces a REG with a PLUS - but why?)
>>  
> 
> What I think CSE is thinking there is this:
> 
> insn 181 stores the value of (reg:SI 320) at (mem:SI (reg:SI 319) [0 MEM <charD.7[1:24]> [(voidD.73 *)r_77(D)]+0 S4 A8]
> that is the same address as where insn 234 reads the value back but there we know it is aligned.
> 
> insn 186 stores the same value at (mem:SI (plus:SI (reg/v/f:SI 273 [ rD.73757 ]) (const_int 20 [0x14])) [0 MEM <charD.7[1:24]> [(voidD.73 *)r_77(D)]+20 S4 A8]
> 
> But since both reg:SI 320 is dead and reg:SI 319 is dead as well, when
> insn 234 processed, it replaces one mem with another mem that has the same
> value.
Just to make sure I understand.  Are you saying the addresses for the
MEMs are equal or the contents of the memory location are equal.

For the former the alignment has to be the same, plain and simple, even
if GCC isn't aware the alignments have to be the same.

For the latter we'd be better off loading the data into a REG, then
using the REG for the source of both memory stores.

Jeff
Bernd Edlinger Sept. 11, 2019, 5:41 p.m. UTC | #11
On 9/11/19 6:08 PM, Jeff Law wrote:
> On 9/11/19 7:49 AM, Bernd Edlinger wrote:
>> On 9/11/19 9:23 AM, Richard Biener wrote:
>>> On Tue, 10 Sep 2019, Bernd Edlinger wrote:
>>>
>>>> Hi!
>>>>
>>>> This ICE happens when compiling real_nextafter in real.c.
>>>> CSE sees this:
>>>>
>>>> (insn 179 178 180 11 (set (reg:SI 319)
>>>>         (reg/v/f:SI 273 [ rD.73757 ])) "../../gcc-trunk-1/gcc/real.c":120:10 643 {*thumb2_movsi_vfp}
>>>>      (nil))
>>>> [...]
>>>> (insn 181 180 182 11 (set (mem:SI (reg:SI 319) [0 MEM <charD.7[1:24]> [(voidD.73 *)r_77(D)]+0 S4 A8])
>>>>         (unspec:SI [
>>>>                 (reg:SI 320)
>>>>             ] UNSPEC_UNALIGNED_STORE)) "../../gcc-trunk-1/gcc/real.c":120:10 129 {unaligned_storesi}
>>>>      (nil))
>>>> [...]
>>>> (insn 186 185 187 11 (set (mem:SI (plus:SI (reg/v/f:SI 273 [ rD.73757 ])
>>>>                 (const_int 20 [0x14])) [0 MEM <charD.7[1:24]> [(voidD.73 *)r_77(D)]+20 S4 A8])
>>>>         (unspec:SI [
>>>>                 (reg:SI 320)
>>>>             ] UNSPEC_UNALIGNED_STORE)) "../../gcc-trunk-1/gcc/real.c":120:10 129 {unaligned_storesi}
>>>>      (expr_list:REG_DEAD (reg:SI 320)
>>>>         (expr_list:REG_DEAD (reg/f:SI 319 [ rD.73757 ])
>>>>             (nil))))
>>>> [...]
>>>> (insn 234 233 235 11 (set (reg:SI 340)
>>>>         (mem:SI (reg/v/f:SI 273 [ rD.73757 ]) [52 MEM <unsigned int> [(struct real_valueD.28367 *)r_77(D)]+0 S4 A32])) "../../gcc-trunk-1/gcc/real.c":5185:9 643 {*thumb2_movsi_vfp}
>>>>      (nil))
>>>>
>>>>
>>>> ... and transforms insn 234 in an invalid insn:
>>>>
>>>>
>>>> (insn 234 233 235 11 (set (reg:SI 340 [ MEM <unsigned int> [(struct real_valueD.28367 *)r_77(D)] ])
>>>>         (mem:SI (plus:SI (reg/v/f:SI 273 [ rD.73757 ])
>>>>                 (const_int 20 [0x14])) [0 MEM <charD.7[1:24]> [(voidD.73 *)r_77(D)]+20 S4 A8])) "../../gcc-trunk-1/gcc/real.c":5185:9 643 {*thumb2_movsi_vfp}
>>>>      (nil))
>>>>
>>>> which triggers the assertion in the arm back-end, because the MEM is not aligned.
>>>>
>>>> To fix that I changed exp_equiv_p to consider MEMs with different MEM_ALIGN or
>>>> ALIAS_SET as different.
>>>
>>> I think the appropriate fix is to retain the mem_attrs of the original
>>> MEM since obviously the replacement does not change those?  It's a mere
>>> change of the MEMs address but implementation-wise CSE replaces the whole
>>> MEM?
>>>
>>> For CSE exp_equiv_p simply compares the addresses, only if for_gcse
>>> we do further checks like mem_attrs_eq_p.  IMHO that is correct.
>>>
>>> I'm not sure what CSE does above, that is, what does it think it does?
>>> It doesn't replace the loaded value, it seems to do sth else which
>>> is odd for CSE (replaces a REG with a PLUS - but why?)
>>>  
>>
>> What I think CSE is thinking there is this:
>>
>> insn 181 stores the value of (reg:SI 320) at (mem:SI (reg:SI 319) [0 MEM <charD.7[1:24]> [(voidD.73 *)r_77(D)]+0 S4 A8]
>> that is the same address as where insn 234 reads the value back but there we know it is aligned.
>>
>> insn 186 stores the same value at (mem:SI (plus:SI (reg/v/f:SI 273 [ rD.73757 ]) (const_int 20 [0x14])) [0 MEM <charD.7[1:24]> [(voidD.73 *)r_77(D)]+20 S4 A8]
>>
>> But since both reg:SI 320 is dead and reg:SI 319 is dead as well, when
>> insn 234 processed, it replaces one mem with another mem that has the same
>> value.
> Just to make sure I understand.  Are you saying the addresses for the
> MEMs are equal or the contents of the memory location are equal.
> 

The MEMs all have different addresses, but the same value, they are from a
memset or something:

(gdb) call dump_class(elt)
Equivalence chain for (mem:SI (reg/v/f:SI 273 [ r ]) [0 MEM <char[1:24]> [(void *)r_77(D)]+0 S4 A8]): 
(unspec:SI [
        (reg:SI 320)
    ] UNSPEC_UNALIGNED_STORE)
(mem:SI (plus:SI (reg/v/f:SI 273 [ r ])
        (const_int 20 [0x14])) [0 MEM <char[1:24]> [(void *)r_77(D)]+20 S4 A8])
(mem:SI (plus:SI (reg/v/f:SI 273 [ r ])
        (const_int 16 [0x10])) [0 MEM <char[1:24]> [(void *)r_77(D)]+16 S4 A8])
(mem:SI (plus:SI (reg/v/f:SI 273 [ r ])
        (const_int 12 [0xc])) [0 MEM <char[1:24]> [(void *)r_77(D)]+12 S4 A8])
(mem:SI (plus:SI (reg/v/f:SI 273 [ r ])
        (const_int 8 [0x8])) [0 MEM <char[1:24]> [(void *)r_77(D)]+8 S4 A8])
(mem:SI (plus:SI (reg/v/f:SI 273 [ r ])
        (const_int 4 [0x4])) [0 MEM <char[1:24]> [(void *)r_77(D)]+4 S4 A8])
(mem:SI (reg/v/f:SI 273 [ r ]) [0 MEM <char[1:24]> [(void *)r_77(D)]+0 S4 A8])


The insn 234, uses the same address as the last in the list
(mem:SI (reg/v/f:SI 273 [ r ]) [0 MEM <char[1:24]> [(void *)r_77(D)]+0 S4 A8])
but incompatible alignment and alias set.

since we only are interested in the value CSE picks the most silly variant,
(mem:SI (plus:SI (reg/v/f:SI 273 [ r ])
        (const_int 20 [0x14])) [0 MEM <char[1:24]> [(void *)r_77(D)]+20 S4 A8])

now this has a wrong alias set, a wrong alignment and a different address,
and is much more expensive, no wonder that lra needs to rewrite that.

> For the former the alignment has to be the same, plain and simple, even
> if GCC isn't aware the alignments have to be the same.
> 

Yes.

> For the latter we'd be better off loading the data into a REG, then
> using the REG for the source of both memory stores.
> 

I think Richi said, replacing MEM with a REG or a CONST would make sense,
We could just try not replace MEM with different MEM.


Bernd.
Richard Biener Sept. 11, 2019, 6:30 p.m. UTC | #12
On September 11, 2019 7:41:22 PM GMT+02:00, Bernd Edlinger <bernd.edlinger@hotmail.de> wrote:
>On 9/11/19 6:08 PM, Jeff Law wrote:
>> On 9/11/19 7:49 AM, Bernd Edlinger wrote:
>>> On 9/11/19 9:23 AM, Richard Biener wrote:
>>>> On Tue, 10 Sep 2019, Bernd Edlinger wrote:
>>>>
>>>>> Hi!
>>>>>
>>>>> This ICE happens when compiling real_nextafter in real.c.
>>>>> CSE sees this:
>>>>>
>>>>> (insn 179 178 180 11 (set (reg:SI 319)
>>>>>         (reg/v/f:SI 273 [ rD.73757 ]))
>"../../gcc-trunk-1/gcc/real.c":120:10 643 {*thumb2_movsi_vfp}
>>>>>      (nil))
>>>>> [...]
>>>>> (insn 181 180 182 11 (set (mem:SI (reg:SI 319) [0 MEM
><charD.7[1:24]> [(voidD.73 *)r_77(D)]+0 S4 A8])
>>>>>         (unspec:SI [
>>>>>                 (reg:SI 320)
>>>>>             ] UNSPEC_UNALIGNED_STORE))
>"../../gcc-trunk-1/gcc/real.c":120:10 129 {unaligned_storesi}
>>>>>      (nil))
>>>>> [...]
>>>>> (insn 186 185 187 11 (set (mem:SI (plus:SI (reg/v/f:SI 273 [
>rD.73757 ])
>>>>>                 (const_int 20 [0x14])) [0 MEM <charD.7[1:24]>
>[(voidD.73 *)r_77(D)]+20 S4 A8])
>>>>>         (unspec:SI [
>>>>>                 (reg:SI 320)
>>>>>             ] UNSPEC_UNALIGNED_STORE))
>"../../gcc-trunk-1/gcc/real.c":120:10 129 {unaligned_storesi}
>>>>>      (expr_list:REG_DEAD (reg:SI 320)
>>>>>         (expr_list:REG_DEAD (reg/f:SI 319 [ rD.73757 ])
>>>>>             (nil))))
>>>>> [...]
>>>>> (insn 234 233 235 11 (set (reg:SI 340)
>>>>>         (mem:SI (reg/v/f:SI 273 [ rD.73757 ]) [52 MEM <unsigned
>int> [(struct real_valueD.28367 *)r_77(D)]+0 S4 A32]))
>"../../gcc-trunk-1/gcc/real.c":5185:9 643 {*thumb2_movsi_vfp}
>>>>>      (nil))
>>>>>
>>>>>
>>>>> ... and transforms insn 234 in an invalid insn:
>>>>>
>>>>>
>>>>> (insn 234 233 235 11 (set (reg:SI 340 [ MEM <unsigned int>
>[(struct real_valueD.28367 *)r_77(D)] ])
>>>>>         (mem:SI (plus:SI (reg/v/f:SI 273 [ rD.73757 ])
>>>>>                 (const_int 20 [0x14])) [0 MEM <charD.7[1:24]>
>[(voidD.73 *)r_77(D)]+20 S4 A8])) "../../gcc-trunk-1/gcc/real.c":5185:9
>643 {*thumb2_movsi_vfp}
>>>>>      (nil))
>>>>>
>>>>> which triggers the assertion in the arm back-end, because the MEM
>is not aligned.
>>>>>
>>>>> To fix that I changed exp_equiv_p to consider MEMs with different
>MEM_ALIGN or
>>>>> ALIAS_SET as different.
>>>>
>>>> I think the appropriate fix is to retain the mem_attrs of the
>original
>>>> MEM since obviously the replacement does not change those?  It's a
>mere
>>>> change of the MEMs address but implementation-wise CSE replaces the
>whole
>>>> MEM?
>>>>
>>>> For CSE exp_equiv_p simply compares the addresses, only if for_gcse
>>>> we do further checks like mem_attrs_eq_p.  IMHO that is correct.
>>>>
>>>> I'm not sure what CSE does above, that is, what does it think it
>does?
>>>> It doesn't replace the loaded value, it seems to do sth else which
>>>> is odd for CSE (replaces a REG with a PLUS - but why?)
>>>>  
>>>
>>> What I think CSE is thinking there is this:
>>>
>>> insn 181 stores the value of (reg:SI 320) at (mem:SI (reg:SI 319) [0
>MEM <charD.7[1:24]> [(voidD.73 *)r_77(D)]+0 S4 A8]
>>> that is the same address as where insn 234 reads the value back but
>there we know it is aligned.
>>>
>>> insn 186 stores the same value at (mem:SI (plus:SI (reg/v/f:SI 273 [
>rD.73757 ]) (const_int 20 [0x14])) [0 MEM <charD.7[1:24]> [(voidD.73
>*)r_77(D)]+20 S4 A8]
>>>
>>> But since both reg:SI 320 is dead and reg:SI 319 is dead as well,
>when
>>> insn 234 processed, it replaces one mem with another mem that has
>the same
>>> value.
>> Just to make sure I understand.  Are you saying the addresses for the
>> MEMs are equal or the contents of the memory location are equal.
>> 
>
>The MEMs all have different addresses, but the same value, they are
>from a
>memset or something:
>
>(gdb) call dump_class(elt)
>Equivalence chain for (mem:SI (reg/v/f:SI 273 [ r ]) [0 MEM
><char[1:24]> [(void *)r_77(D)]+0 S4 A8]): 
>(unspec:SI [
>        (reg:SI 320)
>    ] UNSPEC_UNALIGNED_STORE)
>(mem:SI (plus:SI (reg/v/f:SI 273 [ r ])
>(const_int 20 [0x14])) [0 MEM <char[1:24]> [(void *)r_77(D)]+20 S4 A8])
>(mem:SI (plus:SI (reg/v/f:SI 273 [ r ])
>(const_int 16 [0x10])) [0 MEM <char[1:24]> [(void *)r_77(D)]+16 S4 A8])
>(mem:SI (plus:SI (reg/v/f:SI 273 [ r ])
> (const_int 12 [0xc])) [0 MEM <char[1:24]> [(void *)r_77(D)]+12 S4 A8])
>(mem:SI (plus:SI (reg/v/f:SI 273 [ r ])
>   (const_int 8 [0x8])) [0 MEM <char[1:24]> [(void *)r_77(D)]+8 S4 A8])
>(mem:SI (plus:SI (reg/v/f:SI 273 [ r ])
>   (const_int 4 [0x4])) [0 MEM <char[1:24]> [(void *)r_77(D)]+4 S4 A8])
>(mem:SI (reg/v/f:SI 273 [ r ]) [0 MEM <char[1:24]> [(void *)r_77(D)]+0
>S4 A8])
>
>
>The insn 234, uses the same address as the last in the list
>(mem:SI (reg/v/f:SI 273 [ r ]) [0 MEM <char[1:24]> [(void *)r_77(D)]+0
>S4 A8])
>but incompatible alignment and alias set.
>
>since we only are interested in the value CSE picks the most silly
>variant,
>(mem:SI (plus:SI (reg/v/f:SI 273 [ r ])
>(const_int 20 [0x14])) [0 MEM <char[1:24]> [(void *)r_77(D)]+20 S4 A8])
>
>now this has a wrong alias set, a wrong alignment and a different
>address,
>and is much more expensive, no wonder that lra needs to rewrite that.

But the alias and alignment are properties of the expression representing the value and have no influence on the value itself. 
For expression replacement you usually don't want expression equality but other checks (even the for_gcse ones show lack of distinction here). 

>> For the former the alignment has to be the same, plain and simple,
>even
>> if GCC isn't aware the alignments have to be the same.
>> 
>
>Yes.
>
>> For the latter we'd be better off loading the data into a REG, then
>> using the REG for the source of both memory stores.
>> 
>
>I think Richi said, replacing MEM with a REG or a CONST would make
>sense,
>We could just try not replace MEM with different MEM.

Yes. We can only replace expressions with ones that are obviously safe to insert here. MEMs are not obviously safe (some might even trap). Regs and constants are. 

Richard. 


>
>Bernd.
Bernd Edlinger Sept. 11, 2019, 7:32 p.m. UTC | #13
On 9/11/19 8:30 PM, Richard Biener wrote:
> On September 11, 2019 7:41:22 PM GMT+02:00, Bernd Edlinger <bernd.edlinger@hotmail.de> wrote:
>> On 9/11/19 6:08 PM, Jeff Law wrote:
>>> On 9/11/19 7:49 AM, Bernd Edlinger wrote:
>>>> On 9/11/19 9:23 AM, Richard Biener wrote:
>>>>> On Tue, 10 Sep 2019, Bernd Edlinger wrote:
>>>>>
>>>>>> Hi!
>>>>>>
>>>>>> This ICE happens when compiling real_nextafter in real.c.
>>>>>> CSE sees this:
>>>>>>
>>>>>> (insn 179 178 180 11 (set (reg:SI 319)
>>>>>>         (reg/v/f:SI 273 [ rD.73757 ]))
>> "../../gcc-trunk-1/gcc/real.c":120:10 643 {*thumb2_movsi_vfp}
>>>>>>      (nil))
>>>>>> [...]
>>>>>> (insn 181 180 182 11 (set (mem:SI (reg:SI 319) [0 MEM
>> <charD.7[1:24]> [(voidD.73 *)r_77(D)]+0 S4 A8])
>>>>>>         (unspec:SI [
>>>>>>                 (reg:SI 320)
>>>>>>             ] UNSPEC_UNALIGNED_STORE))
>> "../../gcc-trunk-1/gcc/real.c":120:10 129 {unaligned_storesi}
>>>>>>      (nil))
>>>>>> [...]
>>>>>> (insn 186 185 187 11 (set (mem:SI (plus:SI (reg/v/f:SI 273 [
>> rD.73757 ])
>>>>>>                 (const_int 20 [0x14])) [0 MEM <charD.7[1:24]>
>> [(voidD.73 *)r_77(D)]+20 S4 A8])
>>>>>>         (unspec:SI [
>>>>>>                 (reg:SI 320)
>>>>>>             ] UNSPEC_UNALIGNED_STORE))
>> "../../gcc-trunk-1/gcc/real.c":120:10 129 {unaligned_storesi}
>>>>>>      (expr_list:REG_DEAD (reg:SI 320)
>>>>>>         (expr_list:REG_DEAD (reg/f:SI 319 [ rD.73757 ])
>>>>>>             (nil))))
>>>>>> [...]
>>>>>> (insn 234 233 235 11 (set (reg:SI 340)
>>>>>>         (mem:SI (reg/v/f:SI 273 [ rD.73757 ]) [52 MEM <unsigned
>> int> [(struct real_valueD.28367 *)r_77(D)]+0 S4 A32]))
>> "../../gcc-trunk-1/gcc/real.c":5185:9 643 {*thumb2_movsi_vfp}
>>>>>>      (nil))
>>>>>>
>>>>>>
>>>>>> ... and transforms insn 234 in an invalid insn:
>>>>>>
>>>>>>
>>>>>> (insn 234 233 235 11 (set (reg:SI 340 [ MEM <unsigned int>
>> [(struct real_valueD.28367 *)r_77(D)] ])
>>>>>>         (mem:SI (plus:SI (reg/v/f:SI 273 [ rD.73757 ])
>>>>>>                 (const_int 20 [0x14])) [0 MEM <charD.7[1:24]>
>> [(voidD.73 *)r_77(D)]+20 S4 A8])) "../../gcc-trunk-1/gcc/real.c":5185:9
>> 643 {*thumb2_movsi_vfp}
>>>>>>      (nil))
>>>>>>
>>>>>> which triggers the assertion in the arm back-end, because the MEM
>> is not aligned.
>>>>>>
>>>>>> To fix that I changed exp_equiv_p to consider MEMs with different
>> MEM_ALIGN or
>>>>>> ALIAS_SET as different.
>>>>>
>>>>> I think the appropriate fix is to retain the mem_attrs of the
>> original
>>>>> MEM since obviously the replacement does not change those?  It's a
>> mere
>>>>> change of the MEMs address but implementation-wise CSE replaces the
>> whole
>>>>> MEM?
>>>>>
>>>>> For CSE exp_equiv_p simply compares the addresses, only if for_gcse
>>>>> we do further checks like mem_attrs_eq_p.  IMHO that is correct.
>>>>>
>>>>> I'm not sure what CSE does above, that is, what does it think it
>> does?
>>>>> It doesn't replace the loaded value, it seems to do sth else which
>>>>> is odd for CSE (replaces a REG with a PLUS - but why?)
>>>>>  
>>>>
>>>> What I think CSE is thinking there is this:
>>>>
>>>> insn 181 stores the value of (reg:SI 320) at (mem:SI (reg:SI 319) [0
>> MEM <charD.7[1:24]> [(voidD.73 *)r_77(D)]+0 S4 A8]
>>>> that is the same address as where insn 234 reads the value back but
>> there we know it is aligned.
>>>>
>>>> insn 186 stores the same value at (mem:SI (plus:SI (reg/v/f:SI 273 [
>> rD.73757 ]) (const_int 20 [0x14])) [0 MEM <charD.7[1:24]> [(voidD.73
>> *)r_77(D)]+20 S4 A8]
>>>>
>>>> But since both reg:SI 320 is dead and reg:SI 319 is dead as well,
>> when
>>>> insn 234 processed, it replaces one mem with another mem that has
>> the same
>>>> value.
>>> Just to make sure I understand.  Are you saying the addresses for the
>>> MEMs are equal or the contents of the memory location are equal.
>>>
>>
>> The MEMs all have different addresses, but the same value, they are
>>from a
>> memset or something:
>>
>> (gdb) call dump_class(elt)
>> Equivalence chain for (mem:SI (reg/v/f:SI 273 [ r ]) [0 MEM
>> <char[1:24]> [(void *)r_77(D)]+0 S4 A8]): 
>> (unspec:SI [
>>        (reg:SI 320)
>>    ] UNSPEC_UNALIGNED_STORE)
>> (mem:SI (plus:SI (reg/v/f:SI 273 [ r ])
>> (const_int 20 [0x14])) [0 MEM <char[1:24]> [(void *)r_77(D)]+20 S4 A8])
>> (mem:SI (plus:SI (reg/v/f:SI 273 [ r ])
>> (const_int 16 [0x10])) [0 MEM <char[1:24]> [(void *)r_77(D)]+16 S4 A8])
>> (mem:SI (plus:SI (reg/v/f:SI 273 [ r ])
>> (const_int 12 [0xc])) [0 MEM <char[1:24]> [(void *)r_77(D)]+12 S4 A8])
>> (mem:SI (plus:SI (reg/v/f:SI 273 [ r ])
>>   (const_int 8 [0x8])) [0 MEM <char[1:24]> [(void *)r_77(D)]+8 S4 A8])
>> (mem:SI (plus:SI (reg/v/f:SI 273 [ r ])
>>   (const_int 4 [0x4])) [0 MEM <char[1:24]> [(void *)r_77(D)]+4 S4 A8])
>> (mem:SI (reg/v/f:SI 273 [ r ]) [0 MEM <char[1:24]> [(void *)r_77(D)]+0
>> S4 A8])
>>
>>
>> The insn 234, uses the same address as the last in the list
>> (mem:SI (reg/v/f:SI 273 [ r ]) [0 MEM <char[1:24]> [(void *)r_77(D)]+0
>> S4 A8])
>> but incompatible alignment and alias set.
>>
>> since we only are interested in the value CSE picks the most silly
>> variant,
>> (mem:SI (plus:SI (reg/v/f:SI 273 [ r ])
>> (const_int 20 [0x14])) [0 MEM <char[1:24]> [(void *)r_77(D)]+20 S4 A8])
>>
>> now this has a wrong alias set, a wrong alignment and a different
>> address,
>> and is much more expensive, no wonder that lra needs to rewrite that.
> 
> But the alias and alignment are properties of the expression representing the value and have no influence on the value itself. 
> For expression replacement you usually don't want expression equality but other checks (even the for_gcse ones show lack of distinction here). 
> 
>>> For the former the alignment has to be the same, plain and simple,
>> even
>>> if GCC isn't aware the alignments have to be the same.
>>>
>>
>> Yes.
>>
>>> For the latter we'd be better off loading the data into a REG, then
>>> using the REG for the source of both memory stores.
>>>
>>
>> I think Richi said, replacing MEM with a REG or a CONST would make
>> sense,
>> We could just try not replace MEM with different MEM.
> 
> Yes. We can only replace expressions with ones that are obviously safe to insert here. MEMs are not obviously safe (some might even trap). Regs and constants are. 
> 

Hmm, yes, I can just ignore any misaligned MEMs in the equivalence class.
This seems to fix the test case as it seems.

Are we sure that replacing a read from a different alias set cannot cause
problems here?  I almost start to think so...


So, how about this?


Thanks
Bernd.
Richard Biener Sept. 12, 2019, 8:08 a.m. UTC | #14
On Wed, 11 Sep 2019, Bernd Edlinger wrote:

> On 9/11/19 8:30 PM, Richard Biener wrote:
> > On September 11, 2019 7:41:22 PM GMT+02:00, Bernd Edlinger <bernd.edlinger@hotmail.de> wrote:
> >> On 9/11/19 6:08 PM, Jeff Law wrote:
> >>> On 9/11/19 7:49 AM, Bernd Edlinger wrote:
> >>>> On 9/11/19 9:23 AM, Richard Biener wrote:
> >>>>> On Tue, 10 Sep 2019, Bernd Edlinger wrote:
> >>>>>
> >>>>>> Hi!
> >>>>>>
> >>>>>> This ICE happens when compiling real_nextafter in real.c.
> >>>>>> CSE sees this:
> >>>>>>
> >>>>>> (insn 179 178 180 11 (set (reg:SI 319)
> >>>>>>         (reg/v/f:SI 273 [ rD.73757 ]))
> >> "../../gcc-trunk-1/gcc/real.c":120:10 643 {*thumb2_movsi_vfp}
> >>>>>>      (nil))
> >>>>>> [...]
> >>>>>> (insn 181 180 182 11 (set (mem:SI (reg:SI 319) [0 MEM
> >> <charD.7[1:24]> [(voidD.73 *)r_77(D)]+0 S4 A8])
> >>>>>>         (unspec:SI [
> >>>>>>                 (reg:SI 320)
> >>>>>>             ] UNSPEC_UNALIGNED_STORE))
> >> "../../gcc-trunk-1/gcc/real.c":120:10 129 {unaligned_storesi}
> >>>>>>      (nil))
> >>>>>> [...]
> >>>>>> (insn 186 185 187 11 (set (mem:SI (plus:SI (reg/v/f:SI 273 [
> >> rD.73757 ])
> >>>>>>                 (const_int 20 [0x14])) [0 MEM <charD.7[1:24]>
> >> [(voidD.73 *)r_77(D)]+20 S4 A8])
> >>>>>>         (unspec:SI [
> >>>>>>                 (reg:SI 320)
> >>>>>>             ] UNSPEC_UNALIGNED_STORE))
> >> "../../gcc-trunk-1/gcc/real.c":120:10 129 {unaligned_storesi}
> >>>>>>      (expr_list:REG_DEAD (reg:SI 320)
> >>>>>>         (expr_list:REG_DEAD (reg/f:SI 319 [ rD.73757 ])
> >>>>>>             (nil))))
> >>>>>> [...]
> >>>>>> (insn 234 233 235 11 (set (reg:SI 340)
> >>>>>>         (mem:SI (reg/v/f:SI 273 [ rD.73757 ]) [52 MEM <unsigned
> >> int> [(struct real_valueD.28367 *)r_77(D)]+0 S4 A32]))
> >> "../../gcc-trunk-1/gcc/real.c":5185:9 643 {*thumb2_movsi_vfp}
> >>>>>>      (nil))
> >>>>>>
> >>>>>>
> >>>>>> ... and transforms insn 234 in an invalid insn:
> >>>>>>
> >>>>>>
> >>>>>> (insn 234 233 235 11 (set (reg:SI 340 [ MEM <unsigned int>
> >> [(struct real_valueD.28367 *)r_77(D)] ])
> >>>>>>         (mem:SI (plus:SI (reg/v/f:SI 273 [ rD.73757 ])
> >>>>>>                 (const_int 20 [0x14])) [0 MEM <charD.7[1:24]>
> >> [(voidD.73 *)r_77(D)]+20 S4 A8])) "../../gcc-trunk-1/gcc/real.c":5185:9
> >> 643 {*thumb2_movsi_vfp}
> >>>>>>      (nil))
> >>>>>>
> >>>>>> which triggers the assertion in the arm back-end, because the MEM
> >> is not aligned.
> >>>>>>
> >>>>>> To fix that I changed exp_equiv_p to consider MEMs with different
> >> MEM_ALIGN or
> >>>>>> ALIAS_SET as different.
> >>>>>
> >>>>> I think the appropriate fix is to retain the mem_attrs of the
> >> original
> >>>>> MEM since obviously the replacement does not change those?  It's a
> >> mere
> >>>>> change of the MEMs address but implementation-wise CSE replaces the
> >> whole
> >>>>> MEM?
> >>>>>
> >>>>> For CSE exp_equiv_p simply compares the addresses, only if for_gcse
> >>>>> we do further checks like mem_attrs_eq_p.  IMHO that is correct.
> >>>>>
> >>>>> I'm not sure what CSE does above, that is, what does it think it
> >> does?
> >>>>> It doesn't replace the loaded value, it seems to do sth else which
> >>>>> is odd for CSE (replaces a REG with a PLUS - but why?)
> >>>>>  
> >>>>
> >>>> What I think CSE is thinking there is this:
> >>>>
> >>>> insn 181 stores the value of (reg:SI 320) at (mem:SI (reg:SI 319) [0
> >> MEM <charD.7[1:24]> [(voidD.73 *)r_77(D)]+0 S4 A8]
> >>>> that is the same address as where insn 234 reads the value back but
> >> there we know it is aligned.
> >>>>
> >>>> insn 186 stores the same value at (mem:SI (plus:SI (reg/v/f:SI 273 [
> >> rD.73757 ]) (const_int 20 [0x14])) [0 MEM <charD.7[1:24]> [(voidD.73
> >> *)r_77(D)]+20 S4 A8]
> >>>>
> >>>> But since both reg:SI 320 is dead and reg:SI 319 is dead as well,
> >> when
> >>>> insn 234 processed, it replaces one mem with another mem that has
> >> the same
> >>>> value.
> >>> Just to make sure I understand.  Are you saying the addresses for the
> >>> MEMs are equal or the contents of the memory location are equal.
> >>>
> >>
> >> The MEMs all have different addresses, but the same value, they are
> >>from a
> >> memset or something:
> >>
> >> (gdb) call dump_class(elt)
> >> Equivalence chain for (mem:SI (reg/v/f:SI 273 [ r ]) [0 MEM
> >> <char[1:24]> [(void *)r_77(D)]+0 S4 A8]): 
> >> (unspec:SI [
> >>        (reg:SI 320)
> >>    ] UNSPEC_UNALIGNED_STORE)
> >> (mem:SI (plus:SI (reg/v/f:SI 273 [ r ])
> >> (const_int 20 [0x14])) [0 MEM <char[1:24]> [(void *)r_77(D)]+20 S4 A8])
> >> (mem:SI (plus:SI (reg/v/f:SI 273 [ r ])
> >> (const_int 16 [0x10])) [0 MEM <char[1:24]> [(void *)r_77(D)]+16 S4 A8])
> >> (mem:SI (plus:SI (reg/v/f:SI 273 [ r ])
> >> (const_int 12 [0xc])) [0 MEM <char[1:24]> [(void *)r_77(D)]+12 S4 A8])
> >> (mem:SI (plus:SI (reg/v/f:SI 273 [ r ])
> >>   (const_int 8 [0x8])) [0 MEM <char[1:24]> [(void *)r_77(D)]+8 S4 A8])
> >> (mem:SI (plus:SI (reg/v/f:SI 273 [ r ])
> >>   (const_int 4 [0x4])) [0 MEM <char[1:24]> [(void *)r_77(D)]+4 S4 A8])
> >> (mem:SI (reg/v/f:SI 273 [ r ]) [0 MEM <char[1:24]> [(void *)r_77(D)]+0
> >> S4 A8])
> >>
> >>
> >> The insn 234, uses the same address as the last in the list
> >> (mem:SI (reg/v/f:SI 273 [ r ]) [0 MEM <char[1:24]> [(void *)r_77(D)]+0
> >> S4 A8])
> >> but incompatible alignment and alias set.
> >>
> >> since we only are interested in the value CSE picks the most silly
> >> variant,
> >> (mem:SI (plus:SI (reg/v/f:SI 273 [ r ])
> >> (const_int 20 [0x14])) [0 MEM <char[1:24]> [(void *)r_77(D)]+20 S4 A8])
> >>
> >> now this has a wrong alias set, a wrong alignment and a different
> >> address,
> >> and is much more expensive, no wonder that lra needs to rewrite that.
> > 
> > But the alias and alignment are properties of the expression representing the value and have no influence on the value itself. 
> > For expression replacement you usually don't want expression equality but other checks (even the for_gcse ones show lack of distinction here). 
> > 
> >>> For the former the alignment has to be the same, plain and simple,
> >> even
> >>> if GCC isn't aware the alignments have to be the same.
> >>>
> >>
> >> Yes.
> >>
> >>> For the latter we'd be better off loading the data into a REG, then
> >>> using the REG for the source of both memory stores.
> >>>
> >>
> >> I think Richi said, replacing MEM with a REG or a CONST would make
> >> sense,
> >> We could just try not replace MEM with different MEM.
> > 
> > Yes. We can only replace expressions with ones that are obviously safe to insert here. MEMs are not obviously safe (some might even trap). Regs and constants are. 
> > 
> 
> Hmm, yes, I can just ignore any misaligned MEMs in the equivalence class.
> This seems to fix the test case as it seems.
> 
> Are we sure that replacing a read from a different alias set cannot cause
> problems here?  I almost start to think so...

Sure, those will be a problem as well.

> 
> So, how about this?

More like the following?  I wonder if we can assert that
MEM_NOTRAP_P () are equal (see all the for_gcse checks in exp_equiv_p).
But as said earlier I wonder in which cases a replacement is
profitable at all - thus, if a

  else if (MEM_P (trial))
    /* Do not replace anything with a MEM.  */
    ;

isn't better?  I see we detect noop MEM = MEM earlier thus this kind
of check would need to be done after that.  And note that code
is also buggy since it doesn't check for the case where the
store changes the dynamic type of the memory...

Index: cse.c
===================================================================
--- cse.c       (revision 275648)
+++ cse.c       (working copy)
@@ -5385,6 +5385,14 @@ cse_insn (rtx_insn *insn)
            /* Do nothing for this case.  */
            ;
 
+         else if (MEM_P (trial)
+                  && MEM_P (SET_SRC (sets[i].rtl))
+                  && !mem_attrs_eq_p (MEM_ATTRS (trial),
+                                      MEM_ATTRS (SET_SRC (sets[i].rtl))))
+           /* Do not try to replace a MEM with another MEM when
+              attributes like alignment and alias-set do not match.  */
+           ;
+
          /* Look for a substitution that makes a valid insn.  */
          else if (validate_unshare_change (insn, &SET_SRC (sets[i].rtl),
                                            trial, 0))
Bernd Edlinger Sept. 12, 2019, 2:27 p.m. UTC | #15
On 9/12/19 10:08 AM, Richard Biener wrote:
> On Wed, 11 Sep 2019, Bernd Edlinger wrote:
> 
>> On 9/11/19 8:30 PM, Richard Biener wrote:
> 
> More like the following?  I wonder if we can assert that
> MEM_NOTRAP_P () are equal (see all the for_gcse checks in exp_equiv_p).
> But as said earlier I wonder in which cases a replacement is
> profitable at all - thus, if a
> 
>   else if (MEM_P (trial))
>     /* Do not replace anything with a MEM.  */
>     ;
> 

Yes, I like that better, since there is essentially nothing stopping
it from replacing a REG with a MEM, except the rtxcost function, maybe.

It turned out that the previous version got into an endless loop here,
since the loop termination happens when replacing MEM by itself, except
when something else terminates the search.

So how about this?

The only possible MEM->MEM transformations are now:
- replacing trial = dest (result mov dest, dest; which is a no-op insn)
- replacing trial = src (which is a no-op transformation, and exits the loop)

Therefore the overlapping mem move handling no longer necessary. 


Bootstrap and reg-tested on x86_64-pc-linux-gnu.
Is it OK for trunk?


Thanks
Bernd.
Richard Biener Sept. 13, 2019, 11:23 a.m. UTC | #16
On Thu, 12 Sep 2019, Bernd Edlinger wrote:

> On 9/12/19 10:08 AM, Richard Biener wrote:
> > On Wed, 11 Sep 2019, Bernd Edlinger wrote:
> > 
> >> On 9/11/19 8:30 PM, Richard Biener wrote:
> > 
> > More like the following?  I wonder if we can assert that
> > MEM_NOTRAP_P () are equal (see all the for_gcse checks in exp_equiv_p).
> > But as said earlier I wonder in which cases a replacement is
> > profitable at all - thus, if a
> > 
> >   else if (MEM_P (trial))
> >     /* Do not replace anything with a MEM.  */
> >     ;
> > 
> 
> Yes, I like that better, since there is essentially nothing stopping
> it from replacing a REG with a MEM, except the rtxcost function, maybe.
> 
> It turned out that the previous version got into an endless loop here,
> since the loop termination happens when replacing MEM by itself, except
> when something else terminates the search.

Is that how it works without the patch as well?  I admit the loop
is a bit hard to follow since it is not only iterating
over elt->next_same_value ...

> So how about this?
> 
> The only possible MEM->MEM transformations are now:
> - replacing trial = dest (result mov dest, dest; which is a no-op insn)
> - replacing trial = src (which is a no-op transformation, and exits the loop)
> 
> Therefore the overlapping mem move handling no longer necessary. 
> 
> 
> Bootstrap and reg-tested on x86_64-pc-linux-gnu.
> Is it OK for trunk?

Looks OK to me.  Can you see if this has any unexpected code-generation
effects?  I would expect it to not make a difference at all, but you
never know -- any differences in cc1files?

Not really my primary area of expertise...

Thanks,
Richard.
Bernd Edlinger Sept. 13, 2019, 3:26 p.m. UTC | #17
On 9/13/19 1:23 PM, Richard Biener wrote:
> On Thu, 12 Sep 2019, Bernd Edlinger wrote:
> 
>> On 9/12/19 10:08 AM, Richard Biener wrote:
>>> On Wed, 11 Sep 2019, Bernd Edlinger wrote:
>>>
>>>> On 9/11/19 8:30 PM, Richard Biener wrote:
>>>
>>> More like the following?  I wonder if we can assert that
>>> MEM_NOTRAP_P () are equal (see all the for_gcse checks in exp_equiv_p).
>>> But as said earlier I wonder in which cases a replacement is
>>> profitable at all - thus, if a
>>>
>>>   else if (MEM_P (trial))
>>>     /* Do not replace anything with a MEM.  */
>>>     ;
>>>
>>
>> Yes, I like that better, since there is essentially nothing stopping
>> it from replacing a REG with a MEM, except the rtxcost function, maybe.
>>
>> It turned out that the previous version got into an endless loop here,
>> since the loop termination happens when replacing MEM by itself, except
>> when something else terminates the search.
> 
> Is that how it works without the patch as well?  I admit the loop
> is a bit hard to follow since it is not only iterating
> over elt->next_same_value ...
> 

Yes, that seems to be the case, since the memory reference itself
is always in the set.  There was either an endless loop, when
the src variable was set this if can always be taken, which makes
no progress:

          else if (src
                   && preferable (src_cost, src_regcost,
                                  src_eqv_cost, src_eqv_regcost) <= 0
                   && preferable (src_cost, src_regcost,
                                  src_related_cost, src_related_regcost) <= 0
                   && preferable (src_cost, src_regcost,
                                  src_elt_cost, src_elt_regcost) <= 0)
            trial = src, src_cost = MAX_COST;

or there was a crash when the above was not taken then:

          else
            {
              trial = elt->exp;
              elt = elt->next_same_value;
              src_elt_cost = MAX_COST;
            }

crashes, because elt == NULL at some point.

>> So how about this?
>>
>> The only possible MEM->MEM transformations are now:
>> - replacing trial = dest (result mov dest, dest; which is a no-op insn)
>> - replacing trial = src (which is a no-op transformation, and exits the loop)
>>
>> Therefore the overlapping mem move handling no longer necessary. 
>>
>>
>> Bootstrap and reg-tested on x86_64-pc-linux-gnu.
>> Is it OK for trunk?
> 
> Looks OK to me.  Can you see if this has any unexpected code-generation
> effects?  I would expect it to not make a difference at all, but you
> never know -- any differences in cc1files?
> 

I tried to install the patch after _stage1_ was competed,
but there was no bootstrap miscompare (on x86_64-pc-linux-gnu)
so, no this does not make any difference here.


Bernd.

> Not really my primary area of expertise...
> 
> Thanks,
> Richard.
>
Richard Biener Sept. 13, 2019, 4:56 p.m. UTC | #18
On Fri, 13 Sep 2019, Bernd Edlinger wrote:

> On 9/13/19 1:23 PM, Richard Biener wrote:
> > On Thu, 12 Sep 2019, Bernd Edlinger wrote:
> > 
> >> On 9/12/19 10:08 AM, Richard Biener wrote:
> >>> On Wed, 11 Sep 2019, Bernd Edlinger wrote:
> >>>
> >>>> On 9/11/19 8:30 PM, Richard Biener wrote:
> >>>
> >>> More like the following?  I wonder if we can assert that
> >>> MEM_NOTRAP_P () are equal (see all the for_gcse checks in exp_equiv_p).
> >>> But as said earlier I wonder in which cases a replacement is
> >>> profitable at all - thus, if a
> >>>
> >>>   else if (MEM_P (trial))
> >>>     /* Do not replace anything with a MEM.  */
> >>>     ;
> >>>
> >>
> >> Yes, I like that better, since there is essentially nothing stopping
> >> it from replacing a REG with a MEM, except the rtxcost function, maybe.
> >>
> >> It turned out that the previous version got into an endless loop here,
> >> since the loop termination happens when replacing MEM by itself, except
> >> when something else terminates the search.
> > 
> > Is that how it works without the patch as well?  I admit the loop
> > is a bit hard to follow since it is not only iterating
> > over elt->next_same_value ...
> > 
> 
> Yes, that seems to be the case, since the memory reference itself
> is always in the set.  There was either an endless loop, when
> the src variable was set this if can always be taken, which makes
> no progress:
> 
>           else if (src
>                    && preferable (src_cost, src_regcost,
>                                   src_eqv_cost, src_eqv_regcost) <= 0
>                    && preferable (src_cost, src_regcost,
>                                   src_related_cost, src_related_regcost) <= 0
>                    && preferable (src_cost, src_regcost,
>                                   src_elt_cost, src_elt_regcost) <= 0)
>             trial = src, src_cost = MAX_COST;
> 
> or there was a crash when the above was not taken then:
> 
>           else
>             {
>               trial = elt->exp;
>               elt = elt->next_same_value;
>               src_elt_cost = MAX_COST;
>             }
> 
> crashes, because elt == NULL at some point.
> 
> >> So how about this?
> >>
> >> The only possible MEM->MEM transformations are now:
> >> - replacing trial = dest (result mov dest, dest; which is a no-op insn)
> >> - replacing trial = src (which is a no-op transformation, and exits the loop)
> >>
> >> Therefore the overlapping mem move handling no longer necessary. 
> >>
> >>
> >> Bootstrap and reg-tested on x86_64-pc-linux-gnu.
> >> Is it OK for trunk?
> > 
> > Looks OK to me.  Can you see if this has any unexpected code-generation
> > effects?  I would expect it to not make a difference at all, but you
> > never know -- any differences in cc1files?
> > 
> 
> I tried to install the patch after _stage1_ was competed,
> but there was no bootstrap miscompare (on x86_64-pc-linux-gnu)
> so, no this does not make any difference here.

The patch is OK for trunk then.

Thanks,
Richard.

Patch
diff mbox series

2019-09-10  Bernd Edlinger  <bernd.edlinger@hotmail.de>

	PR middle-end/91708
	* cse.c (exp_equiv_p): Consider MEMs with different
	alias set or alignment as different.

--- gcc/cse.c.orig	2019-07-24 21:21:53.590065924 +0200
+++ gcc/cse.c	2019-09-10 16:15:37.899933738 +0200
@@ -2637,8 +2637,13 @@  exp_equiv_p (const_rtx x, const_rtx y, i
   if (GET_MODE (x) != GET_MODE (y))
     return 0;
 
-  /* MEMs referring to different address space are not equivalent.  */
-  if (code == MEM && MEM_ADDR_SPACE (x) != MEM_ADDR_SPACE (y))
+  /* MEMs referring to different address spaces are not equivalent.
+     MEMs with different alias sets are not equivalent either.
+     Also the MEM_ALIGN needs to be identical in order not to break
+     constraints of insn's that need certain alignment (see PR91708).  */
+  if (code == MEM && (MEM_ADDR_SPACE (x) != MEM_ADDR_SPACE (y)
+		      || MEM_ALIAS_SET (x) != MEM_ALIAS_SET (y)
+		      || MEM_ALIGN (x) != MEM_ALIGN (y)))
     return 0;
 
   switch (code)