Fix PR tree-optimization/51624

Message ID	201112310033.24438.ebotcazou@adacore.com
State	New
Headers	show Return-Path: <gcc-patches-return-310663-incoming=patchwork.ozlabs.org@gcc.gnu.org> Comment: DKIM? See http://www.dkim.org Comment: DomainKeys? See http://antispam.yahoo.com/domainkeys DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=default; d=gcc.gnu.org; h=Received:Received:X-SWARE-Spam-Status:X-Spam-Check-By:Received:Received:Received:Received:From:To:Subject:Date:User-Agent:MIME-Version:Message-Id:Content-Type:Mailing-List:Precedence:List-Id:List-Unsubscribe:List-Archive:List-Post:List-Help:Sender:Delivered-To; b=X+30drDsz0s85T+f+VtNLJYY6ThfJ4VLWHI1lvL9qSDTXRsH5gLc1mBAnTYGrX aeMz1QDjn4jIG2kwmyp8pXixGhS8Am0dly9w1xdTCN0aIZybp26R0FB3Jp4NTBPB /2ElOdjK9WZNsASI6E+QN3x4rxMgwmkVYuqWuwS6ujAqY=; From: Eric Botcazou <ebotcazou@adacore.com> To: gcc-patches@gcc.gnu.org Subject: [patch] Fix PR tree-optimization/51624 Date: Sat, 31 Dec 2011 00:33:24 +0100 User-Agent: KMail/1.13.5 (Linux/2.6.34.10-0.4-desktop; KDE/4.4.4; x86_64; ; ) MIME-Version: 1.0 Message-Id: <201112310033.24438.ebotcazou@adacore.com> Content-Type: Multipart/Mixed; boundary="Boundary-00=_Epk/OLbpy9QcREv" Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk Sender: gcc-patches-owner@gcc.gnu.org

Eric Botcazou Dec. 30, 2011, 11:33 p.m. UTC

This is the bootstrap failure of the Ada compiler on MIPS/IRIX, a recent 
regression present on mainline and 4.6 branch.  The stage 2 compiler 
miscompiles the stage 3 compiler (sem_type.adb:Disambiguate) because of an 
oversight in the fix for PR tree-opt/50569 which changed build_ref_for_model to 
replicate chains of COMPONENT_REFs instead of just the last ones.

The problem is that, when build_ref_for_model is called on the RHS of an 
aggregate assignment of the form:

  b1.f1 = b2

with the model associated with a child of the LHS (say b1.f1.f2), it will build 
something like:

  MEM [&b2 + -4B].f1.f2

which will wreak havoc downstream.  I think that the most straightforward way 
out is to stop going up the chain of COMPONENT_REFs as soon as the type of the 
COMPONENT_REF matches that of the base.

Bootstrapped on MIPS/IRIX and regtested on x86_64-suse-linux.  OK after a full 
testing cycle?


2011-12-30  Eric Botcazou  <ebotcazou@adacore.com>

	PR tree-optimization/51624
	* tree-sra.c (build_ref_for_model): When replicating a chain of
	COMPONENT_REFs, stop as soon as the type of the COMPONENT_REF
	matches that of the base.

Richard Biener Jan. 2, 2012, 11:49 a.m. UTC | #1

On Sat, Dec 31, 2011 at 12:33 AM, Eric Botcazou <ebotcazou@adacore.com>
wrote:
> This is the bootstrap failure of the Ada compiler on MIPS/IRIX, a recent
> regression present on mainline and 4.6 branch.  The stage 2 compiler
> miscompiles the stage 3 compiler (sem_type.adb:Disambiguate) because of an
> oversight in the fix for PR tree-opt/50569 which changed build_ref_for_model to
> replicate chains of COMPONENT_REFs instead of just the last ones.
>
> The problem is that, when build_ref_for_model is called on the RHS of an
> aggregate assignment of the form:
>
>  b1.f1 = b2
>
> with the model associated with a child of the LHS (say b1.f1.f2), it will build
> something like:
>
>  MEM [&b2 + -4B].f1.f2
>
> which will wreak havoc downstream.  I think that the most straightforward way
> out is to stop going up the chain of COMPONENT_REFs as soon as the type of the
> COMPONENT_REF matches that of the base.
>
> Bootstrapped on MIPS/IRIX and regtested on x86_64-suse-linux.  OK after a full
> testing cycle?

Note that you'll get ICEs whenever TYPE_CANONICAL of some
aggregate type is NULL (thus, TYPE_STRUCTURAL_EQUALITY), so this
seems inherently fragile (similar to using TYPE_CANONICAL == TYPE_CANONICAL
here, we'd break the TYPE_STRUCTURAL_EQUALITY case again).
I think that either the original fix needs to be re-thought or we need to pass
the base FIELD_DECL to build_ref_for_model.

That is, why not stop extracting the component-refs in

      do {
        tree field = TREE_OPERAND (expr, 1);
        tree cr_offset = component_ref_field_offset (expr);
        gcc_assert (cr_offset && host_integerp (cr_offset, 1));

        offset -= TREE_INT_CST_LOW (cr_offset) * BITS_PER_UNIT;
        offset -= TREE_INT_CST_LOW (DECL_FIELD_BIT_OFFSET (field));

        VEC_safe_push (tree, stack, cr_stack, expr);

        expr = TREE_OPERAND (expr, 0);
        type = TREE_TYPE (expr);
      } while (TREE_CODE (expr) == COMPONENT_REF);

when expr == base?

The whole point of course was to never need something like
build_ref_for_model given that we have MEM_REFs.

Thanks,
Richard.

> 2011-12-30  Eric Botcazou  <ebotcazou@adacore.com>
>
>        PR tree-optimization/51624
>        * tree-sra.c (build_ref_for_model): When replicating a chain of
>        COMPONENT_REFs, stop as soon as the type of the COMPONENT_REF
>        matches that of the base.
>
>
> --
> Eric Botcazou

Eric Botcazou Jan. 2, 2012, 12:06 p.m. UTC | #2

> Note that you'll get ICEs whenever TYPE_CANONICAL of some
> aggregate type is NULL (thus, TYPE_STRUCTURAL_EQUALITY), so this
> seems inherently fragile (similar to using TYPE_CANONICAL == TYPE_CANONICAL
> here, we'd break the TYPE_STRUCTURAL_EQUALITY case again).

Could you elaborate?  The patch was successfully regtested.

> I think that either the original fix needs to be re-thought or we need to
> pass the base FIELD_DECL to build_ref_for_model.

> That is, why not stop extracting the component-refs in
>
>       do {
>         tree field = TREE_OPERAND (expr, 1);
>         tree cr_offset = component_ref_field_offset (expr);
>         gcc_assert (cr_offset && host_integerp (cr_offset, 1));
>
>         offset -= TREE_INT_CST_LOW (cr_offset) * BITS_PER_UNIT;
>         offset -= TREE_INT_CST_LOW (DECL_FIELD_BIT_OFFSET (field));
>
>         VEC_safe_push (tree, stack, cr_stack, expr);
>
>         expr = TREE_OPERAND (expr, 0);
>         type = TREE_TYPE (expr);
>       } while (TREE_CODE (expr) == COMPONENT_REF);
>
> when expr == base?

Because SRA uses the model of the LHS to build the access on the RHS, so this 
kind of equalitty condition cannot work.

> The whole point of course was to never need something like
> build_ref_for_model given that we have MEM_REFs.

MEM_REF is nice, but SRA should stop rewriting component accesses that it 
doesn't scalarize, this is waste of time and memory, and introduces bugs.

Richard Biener Jan. 2, 2012, 12:37 p.m. UTC | #3

On Mon, Jan 2, 2012 at 1:06 PM, Eric Botcazou <ebotcazou@adacore.com> wrote:
>> Note that you'll get ICEs whenever TYPE_CANONICAL of some
>> aggregate type is NULL (thus, TYPE_STRUCTURAL_EQUALITY), so this
>> seems inherently fragile (similar to using TYPE_CANONICAL == TYPE_CANONICAL
>> here, we'd break the TYPE_STRUCTURAL_EQUALITY case again).
>
> Could you elaborate?  The patch was successfully regtested.

If you have nested structure types that are TYPE_STRUCTURAL_EQUALITY
then they have TYPE_CANONICAL == NULL_TREE and types_compatible_p
will return false (it ICEd in previous releases I believe) and thus you won't
stop attaching COMPONENT_REFs.

So the fix isn't complete.

>> I think that either the original fix needs to be re-thought or we need to
>> pass the base FIELD_DECL to build_ref_for_model.
>
>> That is, why not stop extracting the component-refs in
>>
>>       do {
>>         tree field = TREE_OPERAND (expr, 1);
>>         tree cr_offset = component_ref_field_offset (expr);
>>         gcc_assert (cr_offset && host_integerp (cr_offset, 1));
>>
>>         offset -= TREE_INT_CST_LOW (cr_offset) * BITS_PER_UNIT;
>>         offset -= TREE_INT_CST_LOW (DECL_FIELD_BIT_OFFSET (field));
>>
>>         VEC_safe_push (tree, stack, cr_stack, expr);
>>
>>         expr = TREE_OPERAND (expr, 0);
>>         type = TREE_TYPE (expr);
>>       } while (TREE_CODE (expr) == COMPONENT_REF);
>>
>> when expr == base?
>
> Because SRA uses the model of the LHS to build the access on the RHS, so this
> kind of equalitty condition cannot work.

Ugh.

>> The whole point of course was to never need something like
>> build_ref_for_model given that we have MEM_REFs.
>
> MEM_REF is nice, but SRA should stop rewriting component accesses that it
> doesn't scalarize, this is waste of time and memory, and introduces bugs.

I agree - I didn't know it does that.  Do you have an example?

Richard.

> --
> Eric Botcazou

Eric Botcazou Jan. 3, 2012, 10:16 a.m. UTC | #4

> If you have nested structure types that are TYPE_STRUCTURAL_EQUALITY
> then they have TYPE_CANONICAL == NULL_TREE and types_compatible_p
> will return false (it ICEd in previous releases I believe) and thus you
> won't stop attaching COMPONENT_REFs.
>
> So the fix isn't complete.

OK, I see.  Then the only way out I can think of is to stop going up the chain 
of COMPONENT_REFs as soon as the offset becomes negative.

> > MEM_REF is nice, but SRA should stop rewriting component accesses that it
> > doesn't scalarize, this is waste of time and memory, and introduces bugs.
>
> I agree - I didn't know it does that.  Do you have an example?

The very problem I'm debugging:

  SR.37_109 = it.nam;
  SR.38_110 = it.typ;

What's the point in generating MEM_REF to access a simple component?  Of 
course, this is somewhat hidden since the tree pretty-printer is clever 
(evil?) enough to print

  <COMPONENT_REF <MEM_REF <ADDR_EXPR <VAR_DECL it>>>, nam>

as

  it.nam

but this wastes 2 tree nodes.

Richard Biener Jan. 3, 2012, 10:36 a.m. UTC | #5

On Tue, Jan 3, 2012 at 11:16 AM, Eric Botcazou <ebotcazou@adacore.com> wrote:
>> If you have nested structure types that are TYPE_STRUCTURAL_EQUALITY
>> then they have TYPE_CANONICAL == NULL_TREE and types_compatible_p
>> will return false (it ICEd in previous releases I believe) and thus you
>> won't stop attaching COMPONENT_REFs.
>>
>> So the fix isn't complete.
>
> OK, I see.  Then the only way out I can think of is to stop going up the chain
> of COMPONENT_REFs as soon as the offset becomes negative.

Yeah, that sounds like a better solution.

>> > MEM_REF is nice, but SRA should stop rewriting component accesses that it
>> > doesn't scalarize, this is waste of time and memory, and introduces bugs.
>>
>> I agree - I didn't know it does that.  Do you have an example?
>
> The very problem I'm debugging:
>
>  SR.37_109 = it.nam;
>  SR.38_110 = it.typ;
>
> What's the point in generating MEM_REF to access a simple component?  Of
> course, this is somewhat hidden since the tree pretty-printer is clever
> (evil?) enough to print
>
>  <COMPONENT_REF <MEM_REF <ADDR_EXPR <VAR_DECL it>>>, nam>
>
> as
>
>  it.nam
>
> but this wastes 2 tree nodes.

Ah.  Well - the point is that SRA analyzes accesses using a type, offset, size
triple, thus it handles scalarizing unions well even if multiple fields are used
from it.  But it indeed can as well use the original access tree if it does not
change the access (by decomposing it, for example).  That's probably
a worthwhile change anyway - Martin?

Thanks,
Richard.

> --
> Eric Botcazou

Eric Botcazou Jan. 6, 2012, 6:17 p.m. UTC | #6

> > OK, I see.  Then the only way out I can think of is to stop going up the
> > chain of COMPONENT_REFs as soon as the offset becomes negative.
>
> Yeah, that sounds like a better solution.

PR ada/51775 shows that this pessimizes though because if you have an enclosing 
sub-component at offset 0 on the LHS and not on the RHS, and you're using the 
model of the LHS for the RHS, you'll build a convoluted (but correct) MEM on 
the RHS, and this is apparently enough to thwart FRE/PRE and the likes in some 
simple cases:

  t$F$i2_9 = MEM[(struct pack9__r2 *)y_1(D)].F.i2;
  D.2233_3 = y_1(D)->i2;

The compiler doesn't see that D.2233_3 == t$F$i2_9.  Additionally using the 
type trick would help, at least for PR ada/51775 and I guess in most cases.  

Thoughts?

Richard Biener Jan. 9, 2012, 9:43 a.m. UTC | #7

On Fri, Jan 6, 2012 at 7:17 PM, Eric Botcazou <ebotcazou@adacore.com> wrote:
>> > OK, I see.  Then the only way out I can think of is to stop going up the
>> > chain of COMPONENT_REFs as soon as the offset becomes negative.
>>
>> Yeah, that sounds like a better solution.
>
> PR ada/51775 shows that this pessimizes though because if you have an enclosing
> sub-component at offset 0 on the LHS and not on the RHS, and you're using the
> model of the LHS for the RHS, you'll build a convoluted (but correct) MEM on
> the RHS, and this is apparently enough to thwart FRE/PRE and the likes in some
> simple cases:
>
>  t$F$i2_9 = MEM[(struct pack9__r2 *)y_1(D)].F.i2;
>  D.2233_3 = y_1(D)->i2;
>
> The compiler doesn't see that D.2233_3 == t$F$i2_9.  Additionally using the
> type trick would help, at least for PR ada/51775 and I guess in most cases.
>
> Thoughts?

Can you get me a testcase that I can compile?  This should be fixed in FRE.

Thanks,
Richard.

> --
> Eric Botcazou

Eric Botcazou Jan. 9, 2012, 10:33 a.m. UTC | #8

> Can you get me a testcase that I can compile?  This should be fixed in FRE.

This is gnat.dg/pack9.ad[sb].  You need a bare cross-compiler to s390x-linux: 
configure, build with make CFLAGS=-g, when the build fails, do gcc/ada then 
make gnatlib.  Go back to the build dir, copy gnat.dg/pack9.ad[sb] and do:

gcc/xgcc -Bgcc -S -O2 -gnatp pack9.adb -Igcc/ada/rts -fdump-tree-optimized

The .optimized dump shouldn't contain any call to gnat_rcheck_21.


If you install a change, please mention PR tree-optimization/51775.  TIA.

Richard Biener Jan. 9, 2012, 11:21 a.m. UTC | #9

On Mon, Jan 9, 2012 at 11:33 AM, Eric Botcazou <ebotcazou@adacore.com> wrote:
>> Can you get me a testcase that I can compile?  This should be fixed in FRE.
>
> This is gnat.dg/pack9.ad[sb].  You need a bare cross-compiler to s390x-linux:
> configure, build with make CFLAGS=-g, when the build fails, do gcc/ada then
> make gnatlib.  Go back to the build dir, copy gnat.dg/pack9.ad[sb] and do:
>
> gcc/xgcc -Bgcc -S -O2 -gnatp pack9.adb -Igcc/ada/rts -fdump-tree-optimized
>
> The .optimized dump shouldn't contain any call to gnat_rcheck_21.
>
>
> If you install a change, please mention PR tree-optimization/51775.  TIA.

Hmm, it seems to be because we do not value-number loads that
satisfy stmt_could_throw_p (for whatever reason ...).  Seems to date
back to rev. 131610, the fix for PR34648.  Looking at that bug it
seems that we could at least allow stmts that only throw externally
(but generally CSE should still work, so value-number them when
we are doing FRE but not PRE).

Richard.

> --
> Eric Botcazou

Eric Botcazou Jan. 9, 2012, 11:45 a.m. UTC | #10

> Hmm, it seems to be because we do not value-number loads that
> satisfy stmt_could_throw_p (for whatever reason ...).  Seems to date
> back to rev. 131610, the fix for PR34648.  Looking at that bug it
> seems that we could at least allow stmts that only throw externally
> (but generally CSE should still work, so value-number them when
> we are doing FRE but not PRE).

OK.  The annoying thing is that it's optimized if you have y_1(D)->i2:

  t$F$i2_9 = y_1(D)->i2;
  D.1579_3 = y_1(D)->i2;

and this is equivalent wrt throwing (in particular TREE_THIS_NOTRAP is not set 
anywhere).

Richard Biener Jan. 9, 2012, 12:26 p.m. UTC | #11

On Mon, Jan 9, 2012 at 12:45 PM, Eric Botcazou <ebotcazou@adacore.com> wrote:
>> Hmm, it seems to be because we do not value-number loads that
>> satisfy stmt_could_throw_p (for whatever reason ...).  Seems to date
>> back to rev. 131610, the fix for PR34648.  Looking at that bug it
>> seems that we could at least allow stmts that only throw externally
>> (but generally CSE should still work, so value-number them when
>> we are doing FRE but not PRE).
>
> OK.  The annoying thing is that it's optimized if you have y_1(D)->i2:
>
>  t$F$i2_9 = y_1(D)->i2;
>  D.1579_3 = y_1(D)->i2;
>
> and this is equivalent wrt throwing (in particular TREE_THIS_NOTRAP is not set
> anywhere).

Yeah, there is no good reason to disallow CSEing throwing loads (or even
volatile loads).  The change that introduced this is no longer necessary
as the PRE set filling was re-organized completely, avoiding the original
issue (PRE indeed should not insert throwing stmts - well, at least if
they throw internally, but we can't distinguish the case we have two
matching loads, one throwing internally and one externally).

I have a patch.

Richard.

> --
> Eric Botcazou

Fix PR tree-optimization/51624

Commit Message

Comments

Patch