diff mbox series

sra: Fix sra_modify_expr handling of partial writes (PR 94482)

Message ID ri6h7xw8dgy.fsf@suse.cz
State New
Headers show
Series sra: Fix sra_modify_expr handling of partial writes (PR 94482) | expand

Commit Message

Martin Jambor April 6, 2020, 10:18 p.m. UTC
Hi,

when sra_modify_expr is invoked on an expression that modifies only
part of the underlying replacement, such as a BIT_FIELD_REF on a LHS
of an assignment and the SRA replacement's type is not compatible with
what is being replaced (0th operand of the B_F_R in the above
example), it does not work properly, basically throwing away the partd
of the expr that should have stayed intact.

This is fixed in two ways.  For BIT_FIELD_REFs, which operate on the
binary image of the replacement (and so in a way serve as a
VIEW_CONVERT_EXPR) we just do not bother with convertsing.  For
REALPART_EXPRs and IMAGPART_EXPRs, we insert a VIEW_CONVERT_EXPR under
the complex partial access expression, which is OK even in a LHS of an
assignment (and other write contexts) because in those cases the
replacement is not going to be a giple register anyway.

The testcase for handling REALPART_EXPR and IMAGPART_EXPR is a bit
fragile because SRA prefers complex and vector types over anything else
(and in between the two it decides based on TYPE_UID which in my testing
today always preferred complex types) and even when GIMPLE is wrong I
often still did not get failing runs, so I only run it at -O1 (which is
the only level where the the test fails for me).

Bootstrapped and tested on x86_64-linux, bootstrap and testing on
i686-linux and aarch64-linux underway.

OK for trunk (and subsequently for release branches) if it passes?

Thanks,

Martin

2020-04-06  Martin Jambor  <mjambor@suse.cz>

	PR tree-optimization/94482
	* tree-sra.c (create_access_replacement): Dump new replacement with
	TDF_UID.
	(sra_modify_expr): Fix handling of cases when the original EXPR writes
	to only part of the replacement.

	testsuite/
	* gcc.dg/torture/pr94482.c: New test.
	* gcc.dg/tree-ssa/pr94482-2.c: Likewise.
---
 gcc/ChangeLog                             |  8 ++++
 gcc/testsuite/ChangeLog                   |  6 +++
 gcc/testsuite/gcc.dg/torture/pr94482.c    | 34 +++++++++++++++
 gcc/testsuite/gcc.dg/tree-ssa/pr94482-2.c | 50 +++++++++++++++++++++++
 gcc/tree-sra.c                            | 17 ++++++--
 5 files changed, 111 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/torture/pr94482.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr94482-2.c

Comments

Richard Biener April 7, 2020, 9:41 a.m. UTC | #1
On Tue, 7 Apr 2020, Martin Jambor wrote:

> Hi,
> 
> when sra_modify_expr is invoked on an expression that modifies only
> part of the underlying replacement, such as a BIT_FIELD_REF on a LHS
> of an assignment and the SRA replacement's type is not compatible with
> what is being replaced (0th operand of the B_F_R in the above
> example), it does not work properly, basically throwing away the partd
> of the expr that should have stayed intact.
> 
> This is fixed in two ways.  For BIT_FIELD_REFs, which operate on the
> binary image of the replacement (and so in a way serve as a
> VIEW_CONVERT_EXPR) we just do not bother with convertsing.  For
> REALPART_EXPRs and IMAGPART_EXPRs, we insert a VIEW_CONVERT_EXPR under
> the complex partial access expression, which is OK even in a LHS of an
> assignment (and other write contexts) because in those cases the
> replacement is not going to be a giple register anyway.
> 
> The testcase for handling REALPART_EXPR and IMAGPART_EXPR is a bit
> fragile because SRA prefers complex and vector types over anything else
> (and in between the two it decides based on TYPE_UID which in my testing
> today always preferred complex types) and even when GIMPLE is wrong I
> often still did not get failing runs, so I only run it at -O1 (which is
> the only level where the the test fails for me).
> 
> Bootstrapped and tested on x86_64-linux, bootstrap and testing on
> i686-linux and aarch64-linux underway.
> 
> OK for trunk (and subsequently for release branches) if it passes?

OK.

generate_subtree_copies contains similar code and the call in
sra_modify_expr suggests that an (outer?) bit-field-ref is possible
here, so is generate_subtree_copies affected as well?  I'm not sure
how subtree copy generation "works" when we already replaced sth
but maybe access->grp_to_be_replaced is never true when
access->first_child is not NULL?  But then we still pass down
the "stripped" orig_expr (op0 of BIT_FIELD_REF/REALPART_EXPR).

Thanks,
Richard.

> Thanks,
> 
> Martin
> 
> 2020-04-06  Martin Jambor  <mjambor@suse.cz>
> 
> 	PR tree-optimization/94482
> 	* tree-sra.c (create_access_replacement): Dump new replacement with
> 	TDF_UID.
> 	(sra_modify_expr): Fix handling of cases when the original EXPR writes
> 	to only part of the replacement.
> 
> 	testsuite/
> 	* gcc.dg/torture/pr94482.c: New test.
> 	* gcc.dg/tree-ssa/pr94482-2.c: Likewise.
> ---
>  gcc/ChangeLog                             |  8 ++++
>  gcc/testsuite/ChangeLog                   |  6 +++
>  gcc/testsuite/gcc.dg/torture/pr94482.c    | 34 +++++++++++++++
>  gcc/testsuite/gcc.dg/tree-ssa/pr94482-2.c | 50 +++++++++++++++++++++++
>  gcc/tree-sra.c                            | 17 ++++++--
>  5 files changed, 111 insertions(+), 4 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.dg/torture/pr94482.c
>  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr94482-2.c
> 
> diff --git a/gcc/ChangeLog b/gcc/ChangeLog
> index e10fb251c16..36ddef64afd 100644
> --- a/gcc/ChangeLog
> +++ b/gcc/ChangeLog
> @@ -1,3 +1,11 @@
> +2020-04-06  Martin Jambor  <mjambor@suse.cz>
> +
> +	PR tree-optimization/94482
> +	* tree-sra.c (create_access_replacement): Dump new replacement with
> +	TDF_UID.
> +	(sra_modify_expr): Fix handling of cases when the original EXPR writes
> +	to only part of the replacement.
> +
>  2020-04-05 Zachary Spytz  <zspytz@gmail.com>
>  
>  	* extend.texi: Add free to list of ISO C90 functions that
> diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
> index 88bab5d3d19..8b85e55afe8 100644
> --- a/gcc/testsuite/ChangeLog
> +++ b/gcc/testsuite/ChangeLog
> @@ -1,3 +1,9 @@
> +2020-04-06  Martin Jambor  <mjambor@suse.cz>
> +
> +	PR tree-optimization/94482
> +	* gcc.dg/torture/pr94482.c: New test.
> +	* gcc.dg/tree-ssa/pr94482-2.c: Likewise.
> +
>  2020-04-05  Iain Sandoe  <iain@sandoe.co.uk>
>  
>  	* g++.dg/coroutines/torture/co-await-14-template-traits.C: Rename...
> diff --git a/gcc/testsuite/gcc.dg/torture/pr94482.c b/gcc/testsuite/gcc.dg/torture/pr94482.c
> new file mode 100644
> index 00000000000..5bb19e745c2
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/torture/pr94482.c
> @@ -0,0 +1,34 @@
> +/* { dg-do run } */
> +
> +typedef unsigned V __attribute__ ((__vector_size__ (16)));
> +union U
> +{
> +  V j;
> +  unsigned long long i __attribute__ ((__vector_size__ (16)));
> +};
> +
> +static inline __attribute__((always_inline)) V
> +foo (unsigned long long a)
> +{
> +  union U z = { .j = (V) {} };
> +  for (unsigned long i = 0; i < 1; i++)
> +    z.i[i] = a;
> +  return z.j;
> +}
> +
> +static inline __attribute__((always_inline)) V
> +bar (V a, unsigned long long i, int q)
> +{
> +  union U z = { .j = a };
> +  z.i[q] = i;
> +  return z.j;
> +}
> +
> +int
> +main ()
> +{
> +  union U z = { .j = bar (foo (1729), 2, 1) };
> +  if (z.i[0] != 1729)
> +    __builtin_abort ();
> +  return 0;
> +}
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr94482-2.c b/gcc/testsuite/gcc.dg/tree-ssa/pr94482-2.c
> new file mode 100644
> index 00000000000..fcac9d5e439
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr94482-2.c
> @@ -0,0 +1,50 @@
> +/* { dg-do run } */
> +/* { dg-options "-O1" } */
> +
> +typedef unsigned long V __attribute__ ((__vector_size__ (8)));
> +typedef _Complex int Ci;
> +typedef _Complex float Cf;
> +
> +union U
> +{
> +  Ci ci;
> +  Cf cf;
> +};
> +
> +volatile Ci vgi;
> +
> +Cf foo (Cf c)
> +{
> +  __real c = 0x1ffp10;
> +  return c;
> +}
> +
> +Ci ioo (Ci c)
> +{
> +  __real c = 50;
> +  return c;
> +}
> +
> +
> +int main (int argc, char *argv[])
> +{
> +  union U u;
> +
> +  __real u.ci = 500;
> +  __imag u.ci = 1000;
> +  vgi = u.ci;
> +
> +  u.ci = ioo (u.ci);
> +  __imag u.ci = 100;
> +
> +  if (__real u.ci != 50 || __imag u.ci != 100)
> +    __builtin_abort();
> +
> +  u.cf = foo (u.cf);
> +  __imag u.cf = 0x1p3;
> +
> +  if (__real u.cf != 0x1ffp10 || __imag u.cf != 0x1p3)
> +    __builtin_abort();
> +
> +  return 0;
> +}
> diff --git a/gcc/tree-sra.c b/gcc/tree-sra.c
> index b2056b58750..494ab609149 100644
> --- a/gcc/tree-sra.c
> +++ b/gcc/tree-sra.c
> @@ -2257,7 +2257,7 @@ create_access_replacement (struct access *access, tree reg_type = NULL_TREE)
>  	  print_generic_expr (dump_file, access->base);
>  	  fprintf (dump_file, " offset: %u, size: %u: ",
>  		   (unsigned) access->offset, (unsigned) access->size);
> -	  print_generic_expr (dump_file, repl);
> +	  print_generic_expr (dump_file, repl, TDF_UID);
>  	  fprintf (dump_file, "\n");
>  	}
>      }
> @@ -3698,6 +3698,7 @@ sra_modify_expr (tree *expr, gimple_stmt_iterator *gsi, bool write)
>    location_t loc;
>    struct access *access;
>    tree type, bfr, orig_expr;
> +  bool partial_cplx_access = false;
>  
>    if (TREE_CODE (*expr) == BIT_FIELD_REF)
>      {
> @@ -3708,7 +3709,10 @@ sra_modify_expr (tree *expr, gimple_stmt_iterator *gsi, bool write)
>      bfr = NULL_TREE;
>  
>    if (TREE_CODE (*expr) == REALPART_EXPR || TREE_CODE (*expr) == IMAGPART_EXPR)
> -    expr = &TREE_OPERAND (*expr, 0);
> +    {
> +      expr = &TREE_OPERAND (*expr, 0);
> +      partial_cplx_access = true;
> +    }
>    access = get_access_for_expr (*expr);
>    if (!access)
>      return false;
> @@ -3736,13 +3740,18 @@ sra_modify_expr (tree *expr, gimple_stmt_iterator *gsi, bool write)
>           be accessed as a different type too, potentially creating a need for
>           type conversion (see PR42196) and when scalarized unions are involved
>           in assembler statements (see PR42398).  */
> -      if (!useless_type_conversion_p (type, access->type))
> +      if (!bfr && !useless_type_conversion_p (type, access->type))
>  	{
>  	  tree ref;
>  
>  	  ref = build_ref_for_model (loc, orig_expr, 0, access, gsi, false);
>  
> -	  if (write)
> +	  if (partial_cplx_access)
> +	    /* VIEW_CONVERT_EXPRs in partial complex access are fine even in
> +	       the case of a write because in such case the replacement cannot
> +	       be a gimple register.  */
> +	    *expr = build1 (VIEW_CONVERT_EXPR, type, repl);
> +	  else if (write)
>  	    {
>  	      gassign *stmt;
>  
>
Martin Jambor April 7, 2020, 4:25 p.m. UTC | #2
Hi,

On Tue, Apr 07 2020, Richard Biener wrote:
> On Tue, 7 Apr 2020, Martin Jambor wrote:
>
>> Hi,
>> 
>> when sra_modify_expr is invoked on an expression that modifies only
>> part of the underlying replacement, such as a BIT_FIELD_REF on a LHS
>> of an assignment and the SRA replacement's type is not compatible with
>> what is being replaced (0th operand of the B_F_R in the above
>> example), it does not work properly, basically throwing away the partd
>> of the expr that should have stayed intact.
>> 
>> This is fixed in two ways.  For BIT_FIELD_REFs, which operate on the
>> binary image of the replacement (and so in a way serve as a
>> VIEW_CONVERT_EXPR) we just do not bother with convertsing.  For
>> REALPART_EXPRs and IMAGPART_EXPRs, we insert a VIEW_CONVERT_EXPR under
>> the complex partial access expression, which is OK even in a LHS of an
>> assignment (and other write contexts) because in those cases the
>> replacement is not going to be a giple register anyway.
>> 
>> The testcase for handling REALPART_EXPR and IMAGPART_EXPR is a bit
>> fragile because SRA prefers complex and vector types over anything else
>> (and in between the two it decides based on TYPE_UID which in my testing
>> today always preferred complex types) and even when GIMPLE is wrong I
>> often still did not get failing runs, so I only run it at -O1 (which is
>> the only level where the the test fails for me).
>> 
>> Bootstrapped and tested on x86_64-linux, bootstrap and testing on
>> i686-linux and aarch64-linux underway.
>> 
>> OK for trunk (and subsequently for release branches) if it passes?
>
> OK.

I must have been already half asleep when writing that it passed
testing.  It did not, testsuite/gcc.c-torture/compile/pr42196-3.c fails
even on x86_64, because fwprop happily combines

  u$ci_10 = MEM[(union U *)&u];

and

  f1_6 = IMAGPART_EXPR <VIEW_CONVERT_EXPR<complex float>(u$ci_10)>;

into

  f1_6 = IMAGPART_EXPR <MEM[(union U *)&u]>;

and the gimple verifier of course does not like that.  I'll have a look
at that tomorrow.

>
> generate_subtree_copies contains similar code and the call in
> sra_modify_expr suggests that an (outer?) bit-field-ref is possible
> here, so is generate_subtree_copies affected as well?  I'm not sure
> how subtree copy generation "works" when we already replaced sth

Because they were just horribly problematic, SRA does not create
replacements for any scalar access that has a scalar ancestor or any
children in the access tree.  So the generate_subtree_copies call is
only invoked when the expr itself is not replaced (because there are
children), even after SRA the expression will still load/store the
original bit of the original aggregate.  The task of
generate_subtree_copies is to write all children replacements back to
the aggregate before it is read or re-load the replacements after a
write to the aggregate.  So no data should be forgotten here.

Martin
Richard Biener April 7, 2020, 4:27 p.m. UTC | #3
On April 7, 2020 6:25:26 PM GMT+02:00, Martin Jambor <mjambor@suse.cz> wrote:
>Hi,
>
>On Tue, Apr 07 2020, Richard Biener wrote:
>> On Tue, 7 Apr 2020, Martin Jambor wrote:
>>
>>> Hi,
>>> 
>>> when sra_modify_expr is invoked on an expression that modifies only
>>> part of the underlying replacement, such as a BIT_FIELD_REF on a LHS
>>> of an assignment and the SRA replacement's type is not compatible
>with
>>> what is being replaced (0th operand of the B_F_R in the above
>>> example), it does not work properly, basically throwing away the
>partd
>>> of the expr that should have stayed intact.
>>> 
>>> This is fixed in two ways.  For BIT_FIELD_REFs, which operate on the
>>> binary image of the replacement (and so in a way serve as a
>>> VIEW_CONVERT_EXPR) we just do not bother with convertsing.  For
>>> REALPART_EXPRs and IMAGPART_EXPRs, we insert a VIEW_CONVERT_EXPR
>under
>>> the complex partial access expression, which is OK even in a LHS of
>an
>>> assignment (and other write contexts) because in those cases the
>>> replacement is not going to be a giple register anyway.
>>> 
>>> The testcase for handling REALPART_EXPR and IMAGPART_EXPR is a bit
>>> fragile because SRA prefers complex and vector types over anything
>else
>>> (and in between the two it decides based on TYPE_UID which in my
>testing
>>> today always preferred complex types) and even when GIMPLE is wrong
>I
>>> often still did not get failing runs, so I only run it at -O1 (which
>is
>>> the only level where the the test fails for me).
>>> 
>>> Bootstrapped and tested on x86_64-linux, bootstrap and testing on
>>> i686-linux and aarch64-linux underway.
>>> 
>>> OK for trunk (and subsequently for release branches) if it passes?
>>
>> OK.
>
>I must have been already half asleep when writing that it passed
>testing.  It did not, testsuite/gcc.c-torture/compile/pr42196-3.c fails
>even on x86_64, because fwprop happily combines
>
>  u$ci_10 = MEM[(union U *)&u];
>
>and
>
>  f1_6 = IMAGPART_EXPR <VIEW_CONVERT_EXPR<complex float>(u$ci_10)>;
>
>into
>
>  f1_6 = IMAGPART_EXPR <MEM[(union U *)&u]>;
>
>and the gimple verifier of course does not like that.  I'll have a look
>at that tomorrow.

Ah, that might be a genuine fwprop bug. 

>>
>> generate_subtree_copies contains similar code and the call in
>> sra_modify_expr suggests that an (outer?) bit-field-ref is possible
>> here, so is generate_subtree_copies affected as well?  I'm not sure
>> how subtree copy generation "works" when we already replaced sth
>
>Because they were just horribly problematic, SRA does not create
>replacements for any scalar access that has a scalar ancestor or any
>children in the access tree.  So the generate_subtree_copies call is
>only invoked when the expr itself is not replaced (because there are
>children), even after SRA the expression will still load/store the
>original bit of the original aggregate.  The task of
>generate_subtree_copies is to write all children replacements back to
>the aggregate before it is read or re-load the replacements after a
>write to the aggregate.  So no data should be forgotten here.
>
>Martin
Li, Pan2 via Gcc-patches April 7, 2020, 7:08 p.m. UTC | #4
On Tue, 2020-04-07 at 18:25 +0200, Martin Jambor wrote:
> Hi,
> 
> On Tue, Apr 07 2020, Richard Biener wrote:
> > On Tue, 7 Apr 2020, Martin Jambor wrote:
> > 
> > > Hi,
> > > 
> > > when sra_modify_expr is invoked on an expression that modifies only
> > > part of the underlying replacement, such as a BIT_FIELD_REF on a LHS
> > > of an assignment and the SRA replacement's type is not compatible with
> > > what is being replaced (0th operand of the B_F_R in the above
> > > example), it does not work properly, basically throwing away the partd
> > > of the expr that should have stayed intact.
> > > 
> > > This is fixed in two ways.  For BIT_FIELD_REFs, which operate on the
> > > binary image of the replacement (and so in a way serve as a
> > > VIEW_CONVERT_EXPR) we just do not bother with convertsing.  For
> > > REALPART_EXPRs and IMAGPART_EXPRs, we insert a VIEW_CONVERT_EXPR under
> > > the complex partial access expression, which is OK even in a LHS of an
> > > assignment (and other write contexts) because in those cases the
> > > replacement is not going to be a giple register anyway.
> > > 
> > > The testcase for handling REALPART_EXPR and IMAGPART_EXPR is a bit
> > > fragile because SRA prefers complex and vector types over anything else
> > > (and in between the two it decides based on TYPE_UID which in my testing
> > > today always preferred complex types) and even when GIMPLE is wrong I
> > > often still did not get failing runs, so I only run it at -O1 (which is
> > > the only level where the the test fails for me).
> > > 
> > > Bootstrapped and tested on x86_64-linux, bootstrap and testing on
> > > i686-linux and aarch64-linux underway.
> > > 
> > > OK for trunk (and subsequently for release branches) if it passes?
> > 
> > OK.
> 
> I must have been already half asleep when writing that it passed
> testing.  It did not, testsuite/gcc.c-torture/compile/pr42196-3.c fails
> even on x86_64, because fwprop happily combines
Yea, my tester complained about that on msp430-elf as well.  There's other recent
failures on msp430, ft32 and h8300 that I'm digging into now.
> 

Jeff
Martin Jambor April 7, 2020, 7:31 p.m. UTC | #5
Hi Jeff,

On Tue, Apr 07 2020, Jeff Law wrote:
> On Tue, 2020-04-07 at 18:25 +0200, Martin Jambor wrote:
>> Hi,
>> 
>> On Tue, Apr 07 2020, Richard Biener wrote:
>> > On Tue, 7 Apr 2020, Martin Jambor wrote:
>> > 
>> > > Hi,
>> > > 
>> > > when sra_modify_expr is invoked on an expression that modifies only
>> > > part of the underlying replacement, such as a BIT_FIELD_REF on a LHS
>> > > of an assignment and the SRA replacement's type is not compatible with
>> > > what is being replaced (0th operand of the B_F_R in the above
>> > > example), it does not work properly, basically throwing away the partd
>> > > of the expr that should have stayed intact.
>> > > 
>> > > This is fixed in two ways.  For BIT_FIELD_REFs, which operate on the
>> > > binary image of the replacement (and so in a way serve as a
>> > > VIEW_CONVERT_EXPR) we just do not bother with convertsing.  For
>> > > REALPART_EXPRs and IMAGPART_EXPRs, we insert a VIEW_CONVERT_EXPR under
>> > > the complex partial access expression, which is OK even in a LHS of an
>> > > assignment (and other write contexts) because in those cases the
>> > > replacement is not going to be a giple register anyway.
>> > > 
>> > > The testcase for handling REALPART_EXPR and IMAGPART_EXPR is a bit
>> > > fragile because SRA prefers complex and vector types over anything else
>> > > (and in between the two it decides based on TYPE_UID which in my testing
>> > > today always preferred complex types) and even when GIMPLE is wrong I
>> > > often still did not get failing runs, so I only run it at -O1 (which is
>> > > the only level where the the test fails for me).
>> > > 
>> > > Bootstrapped and tested on x86_64-linux, bootstrap and testing on
>> > > i686-linux and aarch64-linux underway.
>> > > 
>> > > OK for trunk (and subsequently for release branches) if it passes?
>> > 
>> > OK.
>> 
>> I must have been already half asleep when writing that it passed
>> testing.  It did not, testsuite/gcc.c-torture/compile/pr42196-3.c fails
>> even on x86_64, because fwprop happily combines
> Yea, my tester complained about that on msp430-elf as well.  There's other recent
> failures on msp430, ft32 and h8300 that I'm digging into now.
>> 

I did not commit the patch.  I waited to look at test results from other
platforms and saw the failure there and so did not proceed.  So unless
you tested it independently, it's not caused by my patch.

Martin
Li, Pan2 via Gcc-patches April 7, 2020, 7:34 p.m. UTC | #6
On Tue, 2020-04-07 at 21:31 +0200, Martin Jambor wrote:
> Hi Jeff,
> 
> On Tue, Apr 07 2020, Jeff Law wrote:
> > On Tue, 2020-04-07 at 18:25 +0200, Martin Jambor wrote:
> > > Hi,
> > > 
> > > On Tue, Apr 07 2020, Richard Biener wrote:
> > > > On Tue, 7 Apr 2020, Martin Jambor wrote:
> > > > 
> > > > > Hi,
> > > > > 
> > > > > when sra_modify_expr is invoked on an expression that modifies only
> > > > > part of the underlying replacement, such as a BIT_FIELD_REF on a LHS
> > > > > of an assignment and the SRA replacement's type is not compatible with
> > > > > what is being replaced (0th operand of the B_F_R in the above
> > > > > example), it does not work properly, basically throwing away the partd
> > > > > of the expr that should have stayed intact.
> > > > > 
> > > > > This is fixed in two ways.  For BIT_FIELD_REFs, which operate on the
> > > > > binary image of the replacement (and so in a way serve as a
> > > > > VIEW_CONVERT_EXPR) we just do not bother with convertsing.  For
> > > > > REALPART_EXPRs and IMAGPART_EXPRs, we insert a VIEW_CONVERT_EXPR under
> > > > > the complex partial access expression, which is OK even in a LHS of an
> > > > > assignment (and other write contexts) because in those cases the
> > > > > replacement is not going to be a giple register anyway.
> > > > > 
> > > > > The testcase for handling REALPART_EXPR and IMAGPART_EXPR is a bit
> > > > > fragile because SRA prefers complex and vector types over anything else
> > > > > (and in between the two it decides based on TYPE_UID which in my
> > > > > testing
> > > > > today always preferred complex types) and even when GIMPLE is wrong I
> > > > > often still did not get failing runs, so I only run it at -O1 (which is
> > > > > the only level where the the test fails for me).
> > > > > 
> > > > > Bootstrapped and tested on x86_64-linux, bootstrap and testing on
> > > > > i686-linux and aarch64-linux underway.
> > > > > 
> > > > > OK for trunk (and subsequently for release branches) if it passes?
> > > > 
> > > > OK.
> > > 
> > > I must have been already half asleep when writing that it passed
> > > testing.  It did not, testsuite/gcc.c-torture/compile/pr42196-3.c fails
> > > even on x86_64, because fwprop happily combines
> > Yea, my tester complained about that on msp430-elf as well.  There's other
> > recent
> > failures on msp430, ft32 and h8300 that I'm digging into now.
> 
> I did not commit the patch.  I waited to look at test results from other
> platforms and saw the failure there and so did not proceed.  So unless
> you tested it independently, it's not caused by my patch.
ohhh, interesting, no I did not test that independently...  I'll put it back in
the queue of things to investigate.

jeff
>
Richard Biener April 8, 2020, 8:26 a.m. UTC | #7
On Tue, 7 Apr 2020, Richard Biener wrote:

> On April 7, 2020 6:25:26 PM GMT+02:00, Martin Jambor <mjambor@suse.cz> wrote:
> >Hi,
> >
> >On Tue, Apr 07 2020, Richard Biener wrote:
> >> On Tue, 7 Apr 2020, Martin Jambor wrote:
> >>
> >>> Hi,
> >>> 
> >>> when sra_modify_expr is invoked on an expression that modifies only
> >>> part of the underlying replacement, such as a BIT_FIELD_REF on a LHS
> >>> of an assignment and the SRA replacement's type is not compatible
> >with
> >>> what is being replaced (0th operand of the B_F_R in the above
> >>> example), it does not work properly, basically throwing away the
> >partd
> >>> of the expr that should have stayed intact.
> >>> 
> >>> This is fixed in two ways.  For BIT_FIELD_REFs, which operate on the
> >>> binary image of the replacement (and so in a way serve as a
> >>> VIEW_CONVERT_EXPR) we just do not bother with convertsing.  For
> >>> REALPART_EXPRs and IMAGPART_EXPRs, we insert a VIEW_CONVERT_EXPR
> >under
> >>> the complex partial access expression, which is OK even in a LHS of
> >an
> >>> assignment (and other write contexts) because in those cases the
> >>> replacement is not going to be a giple register anyway.
> >>> 
> >>> The testcase for handling REALPART_EXPR and IMAGPART_EXPR is a bit
> >>> fragile because SRA prefers complex and vector types over anything
> >else
> >>> (and in between the two it decides based on TYPE_UID which in my
> >testing
> >>> today always preferred complex types) and even when GIMPLE is wrong
> >I
> >>> often still did not get failing runs, so I only run it at -O1 (which
> >is
> >>> the only level where the the test fails for me).
> >>> 
> >>> Bootstrapped and tested on x86_64-linux, bootstrap and testing on
> >>> i686-linux and aarch64-linux underway.
> >>> 
> >>> OK for trunk (and subsequently for release branches) if it passes?
> >>
> >> OK.
> >
> >I must have been already half asleep when writing that it passed
> >testing.  It did not, testsuite/gcc.c-torture/compile/pr42196-3.c fails
> >even on x86_64, because fwprop happily combines
> >
> >  u$ci_10 = MEM[(union U *)&u];
> >
> >and
> >
> >  f1_6 = IMAGPART_EXPR <VIEW_CONVERT_EXPR<complex float>(u$ci_10)>;
> >
> >into
> >
> >  f1_6 = IMAGPART_EXPR <MEM[(union U *)&u]>;
> >
> >and the gimple verifier of course does not like that.  I'll have a look
> >at that tomorrow.
> 
> Ah, that might be a genuine fwprop bug. 

On a second thought

  f1_6 = IMAGPART_EXPR <VIEW_CONVERT_EXPR<complex float>(u$ci_10)>;

shouldn't be valid GIMPLE since 'complex float' is a register type
and thus it should have been

  _11 = VIEW_CONVERT_EXPR<complex float>(u$ci_10);
  f1_6 = IMAGPART_EXPR <_11>;

now it's a bit of a grey area since a load like

  f1_6 = IMAGPART_EXPR <VIEW_CONVERT_EXPR<complex float>(u);

is valid (we don't want to force a whole _Complex load here).

For example in VN

  f1_6 = IMAGPART_EXPR <VIEW_CONVERT_EXPR<complex float>(u$ci_10)>;

goes through the load machinery.

So let's say the IL is undesirable at least.

The following fixes the forwprop laziness, please include it in the
patch so it gets cherry-picked for backports.

Thanks,
Richard.

From 29348ec5efbbc0430714a2929c6b44d174f78533 Mon Sep 17 00:00:00 2001
From: Richard Biener <rguenther@suse.de>
Date: Wed, 8 Apr 2020 10:23:35 +0200
Subject: [PATCH] Properly verify forwprop merging into REAL/IMAGPART and
 BIT_FIELD_REF
To: gcc-patches@gcc.gnu.org

2020-04-08  Richard Biener  <rguenther@suse.de>

	* tree-ssa-forwprop.c (pass_forwprop::execute): Properly
	verify the first operand of combinations into REAL/IMAGPART_EXPR
	and BIT_FIELD_REF.
---
 gcc/tree-ssa-forwprop.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/gcc/tree-ssa-forwprop.c b/gcc/tree-ssa-forwprop.c
index e7eaa18ccad..3d8acf7eb03 100644
--- a/gcc/tree-ssa-forwprop.c
+++ b/gcc/tree-ssa-forwprop.c
@@ -2815,7 +2815,8 @@ pass_forwprop::execute (function *fun)
 		    continue;
 		  if (!is_gimple_assign (use_stmt)
 		      || (gimple_assign_rhs_code (use_stmt) != REALPART_EXPR
-			  && gimple_assign_rhs_code (use_stmt) != IMAGPART_EXPR))
+			  && gimple_assign_rhs_code (use_stmt) != IMAGPART_EXPR)
+		      || TREE_OPERAND (gimple_assign_rhs1 (use_stmt), 0) != lhs)
 		    {
 		      rewrite = false;
 		      break;
@@ -2877,7 +2878,8 @@ pass_forwprop::execute (function *fun)
 		  if (is_gimple_debug (use_stmt))
 		    continue;
 		  if (!is_gimple_assign (use_stmt)
-		      || gimple_assign_rhs_code (use_stmt) != BIT_FIELD_REF)
+		      || gimple_assign_rhs_code (use_stmt) != BIT_FIELD_REF
+		      || TREE_OPERAND (gimple_assign_rhs1 (use_stmt), 0) != lhs)
 		    {
 		      rewrite = false;
 		      break;
Martin Jambor April 8, 2020, 6:34 p.m. UTC | #8
Hi,

On Wed, Apr 08 2020, Richard Biener wrote:
> On Tue, 7 Apr 2020, Richard Biener wrote:
>
>> On April 7, 2020 6:25:26 PM GMT+02:00, Martin Jambor <mjambor@suse.cz> wrote:
>> >Hi,
>> >
>> >On Tue, Apr 07 2020, Richard Biener wrote:
>> >> On Tue, 7 Apr 2020, Martin Jambor wrote:
>> >>
>> >>> Hi,
>> >>> 
>> >>> when sra_modify_expr is invoked on an expression that modifies only
>> >>> part of the underlying replacement, such as a BIT_FIELD_REF on a LHS
>> >>> of an assignment and the SRA replacement's type is not compatible
>> >with
>> >>> what is being replaced (0th operand of the B_F_R in the above
>> >>> example), it does not work properly, basically throwing away the
>> >partd
>> >>> of the expr that should have stayed intact.
>> >>> 
>> >>> This is fixed in two ways.  For BIT_FIELD_REFs, which operate on the
>> >>> binary image of the replacement (and so in a way serve as a
>> >>> VIEW_CONVERT_EXPR) we just do not bother with convertsing.  For
>> >>> REALPART_EXPRs and IMAGPART_EXPRs, we insert a VIEW_CONVERT_EXPR
>> >under
>> >>> the complex partial access expression, which is OK even in a LHS of
>> >an
>> >>> assignment (and other write contexts) because in those cases the
>> >>> replacement is not going to be a giple register anyway.
>> >>> 
>> >>> The testcase for handling REALPART_EXPR and IMAGPART_EXPR is a bit
>> >>> fragile because SRA prefers complex and vector types over anything
>> >else
>> >>> (and in between the two it decides based on TYPE_UID which in my
>> >testing
>> >>> today always preferred complex types) and even when GIMPLE is wrong
>> >I
>> >>> often still did not get failing runs, so I only run it at -O1 (which
>> >is
>> >>> the only level where the the test fails for me).
>> >>> 
>> >>> Bootstrapped and tested on x86_64-linux, bootstrap and testing on
>> >>> i686-linux and aarch64-linux underway.
>> >>> 
>> >>> OK for trunk (and subsequently for release branches) if it passes?
>> >>
>> >> OK.
>> >
>> >I must have been already half asleep when writing that it passed
>> >testing.  It did not, testsuite/gcc.c-torture/compile/pr42196-3.c fails
>> >even on x86_64, because fwprop happily combines
>> >
>> >  u$ci_10 = MEM[(union U *)&u];
>> >
>> >and
>> >
>> >  f1_6 = IMAGPART_EXPR <VIEW_CONVERT_EXPR<complex float>(u$ci_10)>;
>> >
>> >into
>> >
>> >  f1_6 = IMAGPART_EXPR <MEM[(union U *)&u]>;
>> >
>> >and the gimple verifier of course does not like that.  I'll have a look
>> >at that tomorrow.
>> 
>> Ah, that might be a genuine fwprop bug. 
>
> On a second thought
>
>   f1_6 = IMAGPART_EXPR <VIEW_CONVERT_EXPR<complex float>(u$ci_10)>;
>
> shouldn't be valid GIMPLE since 'complex float' is a register type
> and thus it should have been
>
>   _11 = VIEW_CONVERT_EXPR<complex float>(u$ci_10);
>   f1_6 = IMAGPART_EXPR <_11>;
>

OK, the newest version of the patch below is careful to do that if the
replacement is going to be a gimple_register,

> now it's a bit of a grey area since a load like
>
>   f1_6 = IMAGPART_EXPR <VIEW_CONVERT_EXPR<complex float>(u);
>
> is valid (we don't want to force a whole _Complex load here).

and the above for non-gimple-registers.

>
> For example in VN
>
>   f1_6 = IMAGPART_EXPR <VIEW_CONVERT_EXPR<complex float>(u$ci_10)>;
>
> goes through the load machinery.
>
> So let's say the IL is undesirable at least.
>
> The following fixes the forwprop laziness, please include it in the
> patch so it gets cherry-picked for backports.

I added it to the modified patch, it was still necessary.  The version
below has passed bootstrap and testing on x86-64-linux and aarch64-linux
(and I have double checked!) and bootstrap on i686-linux, testing on the
32-bit platform is still ongoing.  OK if it passes there too?

Thanks,

Martin


sra: Fix sra_modify_expr handling of partial writes (PR 94482)

when sra_modify_expr is invoked on an expression that modifies only
part of the underlying replacement, such as a BIT_FIELD_REF on a LHS
of an assignment and the SRA replacement's type is not compatible with
what is being replaced (0th operand of the B_F_R in the above
example), it does not work properly, basically throwing away the partd
of the expr that should have stayed intact.

This is fixed in two ways.  For BIT_FIELD_REFs, which operate on the
binary image of the replacement (and so in a way serve as a
VIEW_CONVERT_EXPR) we just do not bother with convertsing.  For
REALPART_EXPRs and IMAGPART_EXPRs, if the replacement is not a
register, we insert a VIEW_CONVERT_EXPR under
the complex partial access expression, which is always OK, for loads
from registers we take the extra step of converting it to a temporary.

This revealed a bug in fwprop which is fixed with the hunk from Richi.

The testcase for handling REALPART_EXPR and IMAGPART_EXPR is a bit
fragile because SRA prefers complex and vector types over anything
else (and in between the two it decides based on TYPE_UID which in my
testing today always preferred complex types) and so I only run it at
-O1 (which is the only level where the the test fails for me).


2020-04-08  Martin Jambor  <mjambor@suse.cz>
	    Richard Biener  <rguenther@suse.de>

	PR tree-optimization/94482
	* tree-sra.c (create_access_replacement): Dump new replacement with
	TDF_UID.
	(sra_modify_expr): Fix handling of cases when the original EXPR writes
	to only part of the replacement.
	* tree-ssa-forwprop.c (pass_forwprop::execute): Properly verify
	the first operand of combinations into REAL/IMAGPART_EXPR and
	BIT_FIELD_REF.

	testsuite/
	* gcc.dg/torture/pr94482.c: New test.
	* gcc.dg/tree-ssa/pr94482-2.c: Likewise.
---
 gcc/ChangeLog                             | 12 ++++++
 gcc/testsuite/ChangeLog                   |  6 +++
 gcc/testsuite/gcc.dg/torture/pr94482.c    | 34 +++++++++++++++
 gcc/testsuite/gcc.dg/tree-ssa/pr94482-2.c | 50 +++++++++++++++++++++++
 gcc/tree-sra.c                            | 31 ++++++++++++--
 gcc/tree-ssa-forwprop.c                   |  6 ++-
 6 files changed, 133 insertions(+), 6 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/torture/pr94482.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr94482-2.c

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index e10fb251c16..675ac2a4b6a 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,15 @@
+2020-04-08  Martin Jambor  <mjambor@suse.cz>
+	    Richard Biener  <rguenther@suse.de>
+
+	PR tree-optimization/94482
+	* tree-sra.c (create_access_replacement): Dump new replacement with
+	TDF_UID.
+	(sra_modify_expr): Fix handling of cases when the original EXPR writes
+	to only part of the replacement.
+	* tree-ssa-forwprop.c (pass_forwprop::execute): Properly verify
+	the first operand of combinations into REAL/IMAGPART_EXPR and
+	BIT_FIELD_REF.
+
 2020-04-05 Zachary Spytz  <zspytz@gmail.com>
 
 	* extend.texi: Add free to list of ISO C90 functions that
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index 88bab5d3d19..b0d94b5394e 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,9 @@
+2020-04-08  Martin Jambor  <mjambor@suse.cz>
+
+	PR tree-optimization/94482
+	* gcc.dg/torture/pr94482.c: New test.
+	* gcc.dg/tree-ssa/pr94482-2.c: Likewise.
+
 2020-04-05  Iain Sandoe  <iain@sandoe.co.uk>
 
 	* g++.dg/coroutines/torture/co-await-14-template-traits.C: Rename...
diff --git a/gcc/testsuite/gcc.dg/torture/pr94482.c b/gcc/testsuite/gcc.dg/torture/pr94482.c
new file mode 100644
index 00000000000..5bb19e745c2
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/pr94482.c
@@ -0,0 +1,34 @@
+/* { dg-do run } */
+
+typedef unsigned V __attribute__ ((__vector_size__ (16)));
+union U
+{
+  V j;
+  unsigned long long i __attribute__ ((__vector_size__ (16)));
+};
+
+static inline __attribute__((always_inline)) V
+foo (unsigned long long a)
+{
+  union U z = { .j = (V) {} };
+  for (unsigned long i = 0; i < 1; i++)
+    z.i[i] = a;
+  return z.j;
+}
+
+static inline __attribute__((always_inline)) V
+bar (V a, unsigned long long i, int q)
+{
+  union U z = { .j = a };
+  z.i[q] = i;
+  return z.j;
+}
+
+int
+main ()
+{
+  union U z = { .j = bar (foo (1729), 2, 1) };
+  if (z.i[0] != 1729)
+    __builtin_abort ();
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr94482-2.c b/gcc/testsuite/gcc.dg/tree-ssa/pr94482-2.c
new file mode 100644
index 00000000000..fcac9d5e439
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr94482-2.c
@@ -0,0 +1,50 @@
+/* { dg-do run } */
+/* { dg-options "-O1" } */
+
+typedef unsigned long V __attribute__ ((__vector_size__ (8)));
+typedef _Complex int Ci;
+typedef _Complex float Cf;
+
+union U
+{
+  Ci ci;
+  Cf cf;
+};
+
+volatile Ci vgi;
+
+Cf foo (Cf c)
+{
+  __real c = 0x1ffp10;
+  return c;
+}
+
+Ci ioo (Ci c)
+{
+  __real c = 50;
+  return c;
+}
+
+
+int main (int argc, char *argv[])
+{
+  union U u;
+
+  __real u.ci = 500;
+  __imag u.ci = 1000;
+  vgi = u.ci;
+
+  u.ci = ioo (u.ci);
+  __imag u.ci = 100;
+
+  if (__real u.ci != 50 || __imag u.ci != 100)
+    __builtin_abort();
+
+  u.cf = foo (u.cf);
+  __imag u.cf = 0x1p3;
+
+  if (__real u.cf != 0x1ffp10 || __imag u.cf != 0x1p3)
+    __builtin_abort();
+
+  return 0;
+}
diff --git a/gcc/tree-sra.c b/gcc/tree-sra.c
index b2056b58750..84c113c403c 100644
--- a/gcc/tree-sra.c
+++ b/gcc/tree-sra.c
@@ -2257,7 +2257,7 @@ create_access_replacement (struct access *access, tree reg_type = NULL_TREE)
 	  print_generic_expr (dump_file, access->base);
 	  fprintf (dump_file, " offset: %u, size: %u: ",
 		   (unsigned) access->offset, (unsigned) access->size);
-	  print_generic_expr (dump_file, repl);
+	  print_generic_expr (dump_file, repl, TDF_UID);
 	  fprintf (dump_file, "\n");
 	}
     }
@@ -3698,6 +3698,7 @@ sra_modify_expr (tree *expr, gimple_stmt_iterator *gsi, bool write)
   location_t loc;
   struct access *access;
   tree type, bfr, orig_expr;
+  bool partial_cplx_access = false;
 
   if (TREE_CODE (*expr) == BIT_FIELD_REF)
     {
@@ -3708,7 +3709,10 @@ sra_modify_expr (tree *expr, gimple_stmt_iterator *gsi, bool write)
     bfr = NULL_TREE;
 
   if (TREE_CODE (*expr) == REALPART_EXPR || TREE_CODE (*expr) == IMAGPART_EXPR)
-    expr = &TREE_OPERAND (*expr, 0);
+    {
+      expr = &TREE_OPERAND (*expr, 0);
+      partial_cplx_access = true;
+    }
   access = get_access_for_expr (*expr);
   if (!access)
     return false;
@@ -3736,13 +3740,32 @@ sra_modify_expr (tree *expr, gimple_stmt_iterator *gsi, bool write)
          be accessed as a different type too, potentially creating a need for
          type conversion (see PR42196) and when scalarized unions are involved
          in assembler statements (see PR42398).  */
-      if (!useless_type_conversion_p (type, access->type))
+      if (!bfr && !useless_type_conversion_p (type, access->type))
 	{
 	  tree ref;
 
 	  ref = build_ref_for_model (loc, orig_expr, 0, access, gsi, false);
 
-	  if (write)
+	  if (partial_cplx_access)
+	    {
+	    /* VIEW_CONVERT_EXPRs in partial complex access are always fine in
+	       the case of a write because in such case the replacement cannot
+	       be a gimple register.  In the case of a load, we have to
+	       differentiate in between a register an non-register
+	       replacement.  */
+	      tree t = build1 (VIEW_CONVERT_EXPR, type, repl);
+	      gcc_checking_assert (!write || access->grp_partial_lhs);
+	      if (!access->grp_partial_lhs)
+		{
+		  tree tmp = make_ssa_name (type);
+		  gassign *stmt = gimple_build_assign (tmp, t);
+		  /* This is always a read. */
+		  gsi_insert_before (gsi, stmt, GSI_SAME_STMT);
+		  t = tmp;
+		}
+	      *expr = t;
+	    }
+	  else if (write)
 	    {
 	      gassign *stmt;
 
diff --git a/gcc/tree-ssa-forwprop.c b/gcc/tree-ssa-forwprop.c
index e7eaa18ccad..3d8acf7eb03 100644
--- a/gcc/tree-ssa-forwprop.c
+++ b/gcc/tree-ssa-forwprop.c
@@ -2815,7 +2815,8 @@ pass_forwprop::execute (function *fun)
 		    continue;
 		  if (!is_gimple_assign (use_stmt)
 		      || (gimple_assign_rhs_code (use_stmt) != REALPART_EXPR
-			  && gimple_assign_rhs_code (use_stmt) != IMAGPART_EXPR))
+			  && gimple_assign_rhs_code (use_stmt) != IMAGPART_EXPR)
+		      || TREE_OPERAND (gimple_assign_rhs1 (use_stmt), 0) != lhs)
 		    {
 		      rewrite = false;
 		      break;
@@ -2877,7 +2878,8 @@ pass_forwprop::execute (function *fun)
 		  if (is_gimple_debug (use_stmt))
 		    continue;
 		  if (!is_gimple_assign (use_stmt)
-		      || gimple_assign_rhs_code (use_stmt) != BIT_FIELD_REF)
+		      || gimple_assign_rhs_code (use_stmt) != BIT_FIELD_REF
+		      || TREE_OPERAND (gimple_assign_rhs1 (use_stmt), 0) != lhs)
 		    {
 		      rewrite = false;
 		      break;
Richard Biener April 8, 2020, 6:56 p.m. UTC | #9
On April 8, 2020 8:34:08 PM GMT+02:00, Martin Jambor <mjambor@suse.cz> wrote:
>Hi,
>
>On Wed, Apr 08 2020, Richard Biener wrote:
>> On Tue, 7 Apr 2020, Richard Biener wrote:
>>
>>> On April 7, 2020 6:25:26 PM GMT+02:00, Martin Jambor
><mjambor@suse.cz> wrote:
>>> >Hi,
>>> >
>>> >On Tue, Apr 07 2020, Richard Biener wrote:
>>> >> On Tue, 7 Apr 2020, Martin Jambor wrote:
>>> >>
>>> >>> Hi,
>>> >>> 
>>> >>> when sra_modify_expr is invoked on an expression that modifies
>only
>>> >>> part of the underlying replacement, such as a BIT_FIELD_REF on a
>LHS
>>> >>> of an assignment and the SRA replacement's type is not
>compatible
>>> >with
>>> >>> what is being replaced (0th operand of the B_F_R in the above
>>> >>> example), it does not work properly, basically throwing away the
>>> >partd
>>> >>> of the expr that should have stayed intact.
>>> >>> 
>>> >>> This is fixed in two ways.  For BIT_FIELD_REFs, which operate on
>the
>>> >>> binary image of the replacement (and so in a way serve as a
>>> >>> VIEW_CONVERT_EXPR) we just do not bother with convertsing.  For
>>> >>> REALPART_EXPRs and IMAGPART_EXPRs, we insert a VIEW_CONVERT_EXPR
>>> >under
>>> >>> the complex partial access expression, which is OK even in a LHS
>of
>>> >an
>>> >>> assignment (and other write contexts) because in those cases the
>>> >>> replacement is not going to be a giple register anyway.
>>> >>> 
>>> >>> The testcase for handling REALPART_EXPR and IMAGPART_EXPR is a
>bit
>>> >>> fragile because SRA prefers complex and vector types over
>anything
>>> >else
>>> >>> (and in between the two it decides based on TYPE_UID which in my
>>> >testing
>>> >>> today always preferred complex types) and even when GIMPLE is
>wrong
>>> >I
>>> >>> often still did not get failing runs, so I only run it at -O1
>(which
>>> >is
>>> >>> the only level where the the test fails for me).
>>> >>> 
>>> >>> Bootstrapped and tested on x86_64-linux, bootstrap and testing
>on
>>> >>> i686-linux and aarch64-linux underway.
>>> >>> 
>>> >>> OK for trunk (and subsequently for release branches) if it
>passes?
>>> >>
>>> >> OK.
>>> >
>>> >I must have been already half asleep when writing that it passed
>>> >testing.  It did not, testsuite/gcc.c-torture/compile/pr42196-3.c
>fails
>>> >even on x86_64, because fwprop happily combines
>>> >
>>> >  u$ci_10 = MEM[(union U *)&u];
>>> >
>>> >and
>>> >
>>> >  f1_6 = IMAGPART_EXPR <VIEW_CONVERT_EXPR<complex float>(u$ci_10)>;
>>> >
>>> >into
>>> >
>>> >  f1_6 = IMAGPART_EXPR <MEM[(union U *)&u]>;
>>> >
>>> >and the gimple verifier of course does not like that.  I'll have a
>look
>>> >at that tomorrow.
>>> 
>>> Ah, that might be a genuine fwprop bug. 
>>
>> On a second thought
>>
>>   f1_6 = IMAGPART_EXPR <VIEW_CONVERT_EXPR<complex float>(u$ci_10)>;
>>
>> shouldn't be valid GIMPLE since 'complex float' is a register type
>> and thus it should have been
>>
>>   _11 = VIEW_CONVERT_EXPR<complex float>(u$ci_10);
>>   f1_6 = IMAGPART_EXPR <_11>;
>>
>
>OK, the newest version of the patch below is careful to do that if the
>replacement is going to be a gimple_register,
>
>> now it's a bit of a grey area since a load like
>>
>>   f1_6 = IMAGPART_EXPR <VIEW_CONVERT_EXPR<complex float>(u);
>>
>> is valid (we don't want to force a whole _Complex load here).
>
>and the above for non-gimple-registers.
>
>>
>> For example in VN
>>
>>   f1_6 = IMAGPART_EXPR <VIEW_CONVERT_EXPR<complex float>(u$ci_10)>;
>>
>> goes through the load machinery.
>>
>> So let's say the IL is undesirable at least.
>>
>> The following fixes the forwprop laziness, please include it in the
>> patch so it gets cherry-picked for backports.
>
>I added it to the modified patch, it was still necessary.  The version
>below has passed bootstrap and testing on x86-64-linux and
>aarch64-linux
>(and I have double checked!) and bootstrap on i686-linux, testing on
>the
>32-bit platform is still ongoing.  OK if it passes there too?

OK. 

Richard. 

>Thanks,
>
>Martin
>
>
>sra: Fix sra_modify_expr handling of partial writes (PR 94482)
>
>when sra_modify_expr is invoked on an expression that modifies only
>part of the underlying replacement, such as a BIT_FIELD_REF on a LHS
>of an assignment and the SRA replacement's type is not compatible with
>what is being replaced (0th operand of the B_F_R in the above
>example), it does not work properly, basically throwing away the partd
>of the expr that should have stayed intact.
>
>This is fixed in two ways.  For BIT_FIELD_REFs, which operate on the
>binary image of the replacement (and so in a way serve as a
>VIEW_CONVERT_EXPR) we just do not bother with convertsing.  For
>REALPART_EXPRs and IMAGPART_EXPRs, if the replacement is not a
>register, we insert a VIEW_CONVERT_EXPR under
>the complex partial access expression, which is always OK, for loads
>from registers we take the extra step of converting it to a temporary.
>
>This revealed a bug in fwprop which is fixed with the hunk from Richi.
>
>The testcase for handling REALPART_EXPR and IMAGPART_EXPR is a bit
>fragile because SRA prefers complex and vector types over anything
>else (and in between the two it decides based on TYPE_UID which in my
>testing today always preferred complex types) and so I only run it at
>-O1 (which is the only level where the the test fails for me).
>
>
>2020-04-08  Martin Jambor  <mjambor@suse.cz>
>	    Richard Biener  <rguenther@suse.de>
>
>	PR tree-optimization/94482
>	* tree-sra.c (create_access_replacement): Dump new replacement with
>	TDF_UID.
>	(sra_modify_expr): Fix handling of cases when the original EXPR writes
>	to only part of the replacement.
>	* tree-ssa-forwprop.c (pass_forwprop::execute): Properly verify
>	the first operand of combinations into REAL/IMAGPART_EXPR and
>	BIT_FIELD_REF.
>
>	testsuite/
>	* gcc.dg/torture/pr94482.c: New test.
>	* gcc.dg/tree-ssa/pr94482-2.c: Likewise.
>---
> gcc/ChangeLog                             | 12 ++++++
> gcc/testsuite/ChangeLog                   |  6 +++
> gcc/testsuite/gcc.dg/torture/pr94482.c    | 34 +++++++++++++++
> gcc/testsuite/gcc.dg/tree-ssa/pr94482-2.c | 50 +++++++++++++++++++++++
> gcc/tree-sra.c                            | 31 ++++++++++++--
> gcc/tree-ssa-forwprop.c                   |  6 ++-
> 6 files changed, 133 insertions(+), 6 deletions(-)
> create mode 100644 gcc/testsuite/gcc.dg/torture/pr94482.c
> create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr94482-2.c
>
>diff --git a/gcc/ChangeLog b/gcc/ChangeLog
>index e10fb251c16..675ac2a4b6a 100644
>--- a/gcc/ChangeLog
>+++ b/gcc/ChangeLog
>@@ -1,3 +1,15 @@
>+2020-04-08  Martin Jambor  <mjambor@suse.cz>
>+	    Richard Biener  <rguenther@suse.de>
>+
>+	PR tree-optimization/94482
>+	* tree-sra.c (create_access_replacement): Dump new replacement with
>+	TDF_UID.
>+	(sra_modify_expr): Fix handling of cases when the original EXPR
>writes
>+	to only part of the replacement.
>+	* tree-ssa-forwprop.c (pass_forwprop::execute): Properly verify
>+	the first operand of combinations into REAL/IMAGPART_EXPR and
>+	BIT_FIELD_REF.
>+
> 2020-04-05 Zachary Spytz  <zspytz@gmail.com>
> 
> 	* extend.texi: Add free to list of ISO C90 functions that
>diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
>index 88bab5d3d19..b0d94b5394e 100644
>--- a/gcc/testsuite/ChangeLog
>+++ b/gcc/testsuite/ChangeLog
>@@ -1,3 +1,9 @@
>+2020-04-08  Martin Jambor  <mjambor@suse.cz>
>+
>+	PR tree-optimization/94482
>+	* gcc.dg/torture/pr94482.c: New test.
>+	* gcc.dg/tree-ssa/pr94482-2.c: Likewise.
>+
> 2020-04-05  Iain Sandoe  <iain@sandoe.co.uk>
> 
> 	* g++.dg/coroutines/torture/co-await-14-template-traits.C: Rename...
>diff --git a/gcc/testsuite/gcc.dg/torture/pr94482.c
>b/gcc/testsuite/gcc.dg/torture/pr94482.c
>new file mode 100644
>index 00000000000..5bb19e745c2
>--- /dev/null
>+++ b/gcc/testsuite/gcc.dg/torture/pr94482.c
>@@ -0,0 +1,34 @@
>+/* { dg-do run } */
>+
>+typedef unsigned V __attribute__ ((__vector_size__ (16)));
>+union U
>+{
>+  V j;
>+  unsigned long long i __attribute__ ((__vector_size__ (16)));
>+};
>+
>+static inline __attribute__((always_inline)) V
>+foo (unsigned long long a)
>+{
>+  union U z = { .j = (V) {} };
>+  for (unsigned long i = 0; i < 1; i++)
>+    z.i[i] = a;
>+  return z.j;
>+}
>+
>+static inline __attribute__((always_inline)) V
>+bar (V a, unsigned long long i, int q)
>+{
>+  union U z = { .j = a };
>+  z.i[q] = i;
>+  return z.j;
>+}
>+
>+int
>+main ()
>+{
>+  union U z = { .j = bar (foo (1729), 2, 1) };
>+  if (z.i[0] != 1729)
>+    __builtin_abort ();
>+  return 0;
>+}
>diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr94482-2.c
>b/gcc/testsuite/gcc.dg/tree-ssa/pr94482-2.c
>new file mode 100644
>index 00000000000..fcac9d5e439
>--- /dev/null
>+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr94482-2.c
>@@ -0,0 +1,50 @@
>+/* { dg-do run } */
>+/* { dg-options "-O1" } */
>+
>+typedef unsigned long V __attribute__ ((__vector_size__ (8)));
>+typedef _Complex int Ci;
>+typedef _Complex float Cf;
>+
>+union U
>+{
>+  Ci ci;
>+  Cf cf;
>+};
>+
>+volatile Ci vgi;
>+
>+Cf foo (Cf c)
>+{
>+  __real c = 0x1ffp10;
>+  return c;
>+}
>+
>+Ci ioo (Ci c)
>+{
>+  __real c = 50;
>+  return c;
>+}
>+
>+
>+int main (int argc, char *argv[])
>+{
>+  union U u;
>+
>+  __real u.ci = 500;
>+  __imag u.ci = 1000;
>+  vgi = u.ci;
>+
>+  u.ci = ioo (u.ci);
>+  __imag u.ci = 100;
>+
>+  if (__real u.ci != 50 || __imag u.ci != 100)
>+    __builtin_abort();
>+
>+  u.cf = foo (u.cf);
>+  __imag u.cf = 0x1p3;
>+
>+  if (__real u.cf != 0x1ffp10 || __imag u.cf != 0x1p3)
>+    __builtin_abort();
>+
>+  return 0;
>+}
>diff --git a/gcc/tree-sra.c b/gcc/tree-sra.c
>index b2056b58750..84c113c403c 100644
>--- a/gcc/tree-sra.c
>+++ b/gcc/tree-sra.c
>@@ -2257,7 +2257,7 @@ create_access_replacement (struct access *access,
>tree reg_type = NULL_TREE)
> 	  print_generic_expr (dump_file, access->base);
> 	  fprintf (dump_file, " offset: %u, size: %u: ",
> 		   (unsigned) access->offset, (unsigned) access->size);
>-	  print_generic_expr (dump_file, repl);
>+	  print_generic_expr (dump_file, repl, TDF_UID);
> 	  fprintf (dump_file, "\n");
> 	}
>     }
>@@ -3698,6 +3698,7 @@ sra_modify_expr (tree *expr, gimple_stmt_iterator
>*gsi, bool write)
>   location_t loc;
>   struct access *access;
>   tree type, bfr, orig_expr;
>+  bool partial_cplx_access = false;
> 
>   if (TREE_CODE (*expr) == BIT_FIELD_REF)
>     {
>@@ -3708,7 +3709,10 @@ sra_modify_expr (tree *expr,
>gimple_stmt_iterator *gsi, bool write)
>     bfr = NULL_TREE;
> 
>if (TREE_CODE (*expr) == REALPART_EXPR || TREE_CODE (*expr) ==
>IMAGPART_EXPR)
>-    expr = &TREE_OPERAND (*expr, 0);
>+    {
>+      expr = &TREE_OPERAND (*expr, 0);
>+      partial_cplx_access = true;
>+    }
>   access = get_access_for_expr (*expr);
>   if (!access)
>     return false;
>@@ -3736,13 +3740,32 @@ sra_modify_expr (tree *expr,
>gimple_stmt_iterator *gsi, bool write)
>   be accessed as a different type too, potentially creating a need for
>  type conversion (see PR42196) and when scalarized unions are involved
>          in assembler statements (see PR42398).  */
>-      if (!useless_type_conversion_p (type, access->type))
>+      if (!bfr && !useless_type_conversion_p (type, access->type))
> 	{
> 	  tree ref;
> 
> 	  ref = build_ref_for_model (loc, orig_expr, 0, access, gsi, false);
> 
>-	  if (write)
>+	  if (partial_cplx_access)
>+	    {
>+	    /* VIEW_CONVERT_EXPRs in partial complex access are always fine
>in
>+	       the case of a write because in such case the replacement
>cannot
>+	       be a gimple register.  In the case of a load, we have to
>+	       differentiate in between a register an non-register
>+	       replacement.  */
>+	      tree t = build1 (VIEW_CONVERT_EXPR, type, repl);
>+	      gcc_checking_assert (!write || access->grp_partial_lhs);
>+	      if (!access->grp_partial_lhs)
>+		{
>+		  tree tmp = make_ssa_name (type);
>+		  gassign *stmt = gimple_build_assign (tmp, t);
>+		  /* This is always a read. */
>+		  gsi_insert_before (gsi, stmt, GSI_SAME_STMT);
>+		  t = tmp;
>+		}
>+	      *expr = t;
>+	    }
>+	  else if (write)
> 	    {
> 	      gassign *stmt;
> 
>diff --git a/gcc/tree-ssa-forwprop.c b/gcc/tree-ssa-forwprop.c
>index e7eaa18ccad..3d8acf7eb03 100644
>--- a/gcc/tree-ssa-forwprop.c
>+++ b/gcc/tree-ssa-forwprop.c
>@@ -2815,7 +2815,8 @@ pass_forwprop::execute (function *fun)
> 		    continue;
> 		  if (!is_gimple_assign (use_stmt)
> 		      || (gimple_assign_rhs_code (use_stmt) != REALPART_EXPR
>-			  && gimple_assign_rhs_code (use_stmt) != IMAGPART_EXPR))
>+			  && gimple_assign_rhs_code (use_stmt) != IMAGPART_EXPR)
>+		      || TREE_OPERAND (gimple_assign_rhs1 (use_stmt), 0) != lhs)
> 		    {
> 		      rewrite = false;
> 		      break;
>@@ -2877,7 +2878,8 @@ pass_forwprop::execute (function *fun)
> 		  if (is_gimple_debug (use_stmt))
> 		    continue;
> 		  if (!is_gimple_assign (use_stmt)
>-		      || gimple_assign_rhs_code (use_stmt) != BIT_FIELD_REF)
>+		      || gimple_assign_rhs_code (use_stmt) != BIT_FIELD_REF
>+		      || TREE_OPERAND (gimple_assign_rhs1 (use_stmt), 0) != lhs)
> 		    {
> 		      rewrite = false;
> 		      break;
Martin Jambor April 21, 2020, 9:50 a.m. UTC | #10
On Wed, Apr 08 2020, Martin Jambor wrote:
>
[...]
>
> 2020-04-08  Martin Jambor  <mjambor@suse.cz>
> 	    Richard Biener  <rguenther@suse.de>
>
> 	PR tree-optimization/94482
> 	* tree-sra.c (create_access_replacement): Dump new replacement with
> 	TDF_UID.
> 	(sra_modify_expr): Fix handling of cases when the original EXPR writes
> 	to only part of the replacement.
> 	* tree-ssa-forwprop.c (pass_forwprop::execute): Properly verify
> 	the first operand of combinations into REAL/IMAGPART_EXPR and
> 	BIT_FIELD_REF.
>
> 	testsuite/
> 	* gcc.dg/torture/pr94482.c: New test.
> 	* gcc.dg/tree-ssa/pr94482-2.c: Likewise.

I'd like to backport the patch to the gcc-9 and gcc-8 branches.  The
backport patch is the same for both release branches, the differences
compared to the mainline one are:

1. the code handling BIT_FIELD_REF if fwprop is not present on the
   release branches so the "fixing" hunk was dropped, and

2. the testcase options were changed to what Jakub put there on
   the mainline to suppress all vector ABI warnings.

Bootstrapped and tested on x86_64-linux.  OK for both?

Thanks,

Martin


2020-04-21  Martin Jambor  <mjambor@suse.cz>

        Backport from master
	2020-04-09  Martin Jambor  <mjambor@suse.cz>
                    Richard Biener  <rguenther@suse.de>

	PR tree-optimization/94482
	* tree-sra.c (create_access_replacement): Dump new replacement with
	TDF_UID.
	(sra_modify_expr): Fix handling of cases when the original EXPR writes
	to only part of the replacement.
	* tree-ssa-forwprop.c (pass_forwprop::execute): Properly verify
	the first operand of combinations into REAL/IMAGPART_EXPR and
	BIT_FIELD_REF.

	testsuite/
	* gcc.dg/torture/pr94482.c: New test.
	* gcc.dg/tree-ssa/pr94482-2.c: Likewise.
---
 gcc/ChangeLog                             | 15 +++++++
 gcc/testsuite/ChangeLog                   |  9 ++++
 gcc/testsuite/gcc.dg/torture/pr94482.c    | 36 ++++++++++++++++
 gcc/testsuite/gcc.dg/tree-ssa/pr94482-2.c | 50 +++++++++++++++++++++++
 gcc/tree-sra.c                            | 31 ++++++++++++--
 gcc/tree-ssa-forwprop.c                   |  3 +-
 6 files changed, 139 insertions(+), 5 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/torture/pr94482.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr94482-2.c

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 7df3a0d0990..be843b53e6c 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,18 @@
+2020-04-21  Martin Jambor  <mjambor@suse.cz>
+
+	Backport from master
+	2020-04-09  Martin Jambor  <mjambor@suse.cz>
+	            Richard Biener  <rguenther@suse.de>
+
+	PR tree-optimization/94482
+	* tree-sra.c (create_access_replacement): Dump new replacement with
+	TDF_UID.
+	(sra_modify_expr): Fix handling of cases when the original EXPR writes
+	to only part of the replacement.
+	* tree-ssa-forwprop.c (pass_forwprop::execute): Properly verify
+	the first operand of combinations into REAL/IMAGPART_EXPR and
+	BIT_FIELD_REF.
+
 2020-04-17  H.J. Lu  <hongjiu.lu@intel.com>
 
 	Backport from master
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index 3df9dd78d0c..248244343c4 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,12 @@
+2020-04-21  Martin Jambor  <mjambor@suse.cz>
+
+	Backport from master
+	2020-04-09  Martin Jambor  <mjambor@suse.cz>
+
+	PR tree-optimization/94482
+	* gcc.dg/torture/pr94482.c: New test.
+	* gcc.dg/tree-ssa/pr94482-2.c: Likewise.
+
 2020-04-17  H.J. Lu  <hongjiu.lu@intel.com>
 
 	Backport from master
diff --git a/gcc/testsuite/gcc.dg/torture/pr94482.c b/gcc/testsuite/gcc.dg/torture/pr94482.c
new file mode 100644
index 00000000000..9264842e349
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/pr94482.c
@@ -0,0 +1,36 @@
+/* { dg-do run } */
+/* { dg-additional-options "-Wno-psabi -w" } */
+/* { dg-additional-options "-msse2" { target sse2_runtime } } */
+
+typedef unsigned V __attribute__ ((__vector_size__ (16)));
+union U
+{
+  V j;
+  unsigned long long i __attribute__ ((__vector_size__ (16)));
+};
+
+static inline __attribute__((always_inline)) V
+foo (unsigned long long a)
+{
+  union U z = { .j = (V) {} };
+  for (unsigned long i = 0; i < 1; i++)
+    z.i[i] = a;
+  return z.j;
+}
+
+static inline __attribute__((always_inline)) V
+bar (V a, unsigned long long i, int q)
+{
+  union U z = { .j = a };
+  z.i[q] = i;
+  return z.j;
+}
+
+int
+main ()
+{
+  union U z = { .j = bar (foo (1729), 2, 1) };
+  if (z.i[0] != 1729)
+    __builtin_abort ();
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr94482-2.c b/gcc/testsuite/gcc.dg/tree-ssa/pr94482-2.c
new file mode 100644
index 00000000000..fcac9d5e439
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr94482-2.c
@@ -0,0 +1,50 @@
+/* { dg-do run } */
+/* { dg-options "-O1" } */
+
+typedef unsigned long V __attribute__ ((__vector_size__ (8)));
+typedef _Complex int Ci;
+typedef _Complex float Cf;
+
+union U
+{
+  Ci ci;
+  Cf cf;
+};
+
+volatile Ci vgi;
+
+Cf foo (Cf c)
+{
+  __real c = 0x1ffp10;
+  return c;
+}
+
+Ci ioo (Ci c)
+{
+  __real c = 50;
+  return c;
+}
+
+
+int main (int argc, char *argv[])
+{
+  union U u;
+
+  __real u.ci = 500;
+  __imag u.ci = 1000;
+  vgi = u.ci;
+
+  u.ci = ioo (u.ci);
+  __imag u.ci = 100;
+
+  if (__real u.ci != 50 || __imag u.ci != 100)
+    __builtin_abort();
+
+  u.cf = foo (u.cf);
+  __imag u.cf = 0x1p3;
+
+  if (__real u.cf != 0x1ffp10 || __imag u.cf != 0x1p3)
+    __builtin_abort();
+
+  return 0;
+}
diff --git a/gcc/tree-sra.c b/gcc/tree-sra.c
index 0a5c24a183a..805839cac50 100644
--- a/gcc/tree-sra.c
+++ b/gcc/tree-sra.c
@@ -2329,7 +2329,7 @@ create_access_replacement (struct access *access, tree reg_type = NULL_TREE)
 	  print_generic_expr (dump_file, access->base);
 	  fprintf (dump_file, " offset: %u, size: %u: ",
 		   (unsigned) access->offset, (unsigned) access->size);
-	  print_generic_expr (dump_file, repl);
+	  print_generic_expr (dump_file, repl, TDF_UID);
 	  fprintf (dump_file, "\n");
 	}
     }
@@ -3187,6 +3187,7 @@ sra_modify_expr (tree *expr, gimple_stmt_iterator *gsi, bool write)
   location_t loc;
   struct access *access;
   tree type, bfr, orig_expr;
+  bool partial_cplx_access = false;
 
   if (TREE_CODE (*expr) == BIT_FIELD_REF)
     {
@@ -3197,7 +3198,10 @@ sra_modify_expr (tree *expr, gimple_stmt_iterator *gsi, bool write)
     bfr = NULL_TREE;
 
   if (TREE_CODE (*expr) == REALPART_EXPR || TREE_CODE (*expr) == IMAGPART_EXPR)
-    expr = &TREE_OPERAND (*expr, 0);
+    {
+      expr = &TREE_OPERAND (*expr, 0);
+      partial_cplx_access = true;
+    }
   access = get_access_for_expr (*expr);
   if (!access)
     return false;
@@ -3225,13 +3229,32 @@ sra_modify_expr (tree *expr, gimple_stmt_iterator *gsi, bool write)
          be accessed as a different type too, potentially creating a need for
          type conversion (see PR42196) and when scalarized unions are involved
          in assembler statements (see PR42398).  */
-      if (!useless_type_conversion_p (type, access->type))
+      if (!bfr && !useless_type_conversion_p (type, access->type))
 	{
 	  tree ref;
 
 	  ref = build_ref_for_model (loc, orig_expr, 0, access, gsi, false);
 
-	  if (write)
+	  if (partial_cplx_access)
+	    {
+	    /* VIEW_CONVERT_EXPRs in partial complex access are always fine in
+	       the case of a write because in such case the replacement cannot
+	       be a gimple register.  In the case of a load, we have to
+	       differentiate in between a register an non-register
+	       replacement.  */
+	      tree t = build1 (VIEW_CONVERT_EXPR, type, repl);
+	      gcc_checking_assert (!write || access->grp_partial_lhs);
+	      if (!access->grp_partial_lhs)
+		{
+		  tree tmp = make_ssa_name (type);
+		  gassign *stmt = gimple_build_assign (tmp, t);
+		  /* This is always a read. */
+		  gsi_insert_before (gsi, stmt, GSI_SAME_STMT);
+		  t = tmp;
+		}
+	      *expr = t;
+	    }
+	  else if (write)
 	    {
 	      gassign *stmt;
 
diff --git a/gcc/tree-ssa-forwprop.c b/gcc/tree-ssa-forwprop.c
index bbfa1bc6fae..ca5443b79b3 100644
--- a/gcc/tree-ssa-forwprop.c
+++ b/gcc/tree-ssa-forwprop.c
@@ -2357,7 +2357,8 @@ pass_forwprop::execute (function *fun)
 		    continue;
 		  if (!is_gimple_assign (use_stmt)
 		      || (gimple_assign_rhs_code (use_stmt) != REALPART_EXPR
-			  && gimple_assign_rhs_code (use_stmt) != IMAGPART_EXPR))
+			  && gimple_assign_rhs_code (use_stmt) != IMAGPART_EXPR)
+		      || TREE_OPERAND (gimple_assign_rhs1 (use_stmt), 0) != lhs)
 		    {
 		      rewrite = false;
 		      break;
Richard Biener April 21, 2020, 9:51 a.m. UTC | #11
On Tue, 21 Apr 2020, Martin Jambor wrote:

> On Wed, Apr 08 2020, Martin Jambor wrote:
> >
> [...]
> >
> > 2020-04-08  Martin Jambor  <mjambor@suse.cz>
> > 	    Richard Biener  <rguenther@suse.de>
> >
> > 	PR tree-optimization/94482
> > 	* tree-sra.c (create_access_replacement): Dump new replacement with
> > 	TDF_UID.
> > 	(sra_modify_expr): Fix handling of cases when the original EXPR writes
> > 	to only part of the replacement.
> > 	* tree-ssa-forwprop.c (pass_forwprop::execute): Properly verify
> > 	the first operand of combinations into REAL/IMAGPART_EXPR and
> > 	BIT_FIELD_REF.
> >
> > 	testsuite/
> > 	* gcc.dg/torture/pr94482.c: New test.
> > 	* gcc.dg/tree-ssa/pr94482-2.c: Likewise.
> 
> I'd like to backport the patch to the gcc-9 and gcc-8 branches.  The
> backport patch is the same for both release branches, the differences
> compared to the mainline one are:
> 
> 1. the code handling BIT_FIELD_REF if fwprop is not present on the
>    release branches so the "fixing" hunk was dropped, and
> 
> 2. the testcase options were changed to what Jakub put there on
>    the mainline to suppress all vector ABI warnings.
> 
> Bootstrapped and tested on x86_64-linux.  OK for both?

OK.

Thanks,
Richard.

> Thanks,
> 
> Martin
> 
> 
> 2020-04-21  Martin Jambor  <mjambor@suse.cz>
> 
>         Backport from master
> 	2020-04-09  Martin Jambor  <mjambor@suse.cz>
>                     Richard Biener  <rguenther@suse.de>
> 
> 	PR tree-optimization/94482
> 	* tree-sra.c (create_access_replacement): Dump new replacement with
> 	TDF_UID.
> 	(sra_modify_expr): Fix handling of cases when the original EXPR writes
> 	to only part of the replacement.
> 	* tree-ssa-forwprop.c (pass_forwprop::execute): Properly verify
> 	the first operand of combinations into REAL/IMAGPART_EXPR and
> 	BIT_FIELD_REF.
> 
> 	testsuite/
> 	* gcc.dg/torture/pr94482.c: New test.
> 	* gcc.dg/tree-ssa/pr94482-2.c: Likewise.
> ---
>  gcc/ChangeLog                             | 15 +++++++
>  gcc/testsuite/ChangeLog                   |  9 ++++
>  gcc/testsuite/gcc.dg/torture/pr94482.c    | 36 ++++++++++++++++
>  gcc/testsuite/gcc.dg/tree-ssa/pr94482-2.c | 50 +++++++++++++++++++++++
>  gcc/tree-sra.c                            | 31 ++++++++++++--
>  gcc/tree-ssa-forwprop.c                   |  3 +-
>  6 files changed, 139 insertions(+), 5 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.dg/torture/pr94482.c
>  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr94482-2.c
> 
> diff --git a/gcc/ChangeLog b/gcc/ChangeLog
> index 7df3a0d0990..be843b53e6c 100644
> --- a/gcc/ChangeLog
> +++ b/gcc/ChangeLog
> @@ -1,3 +1,18 @@
> +2020-04-21  Martin Jambor  <mjambor@suse.cz>
> +
> +	Backport from master
> +	2020-04-09  Martin Jambor  <mjambor@suse.cz>
> +	            Richard Biener  <rguenther@suse.de>
> +
> +	PR tree-optimization/94482
> +	* tree-sra.c (create_access_replacement): Dump new replacement with
> +	TDF_UID.
> +	(sra_modify_expr): Fix handling of cases when the original EXPR writes
> +	to only part of the replacement.
> +	* tree-ssa-forwprop.c (pass_forwprop::execute): Properly verify
> +	the first operand of combinations into REAL/IMAGPART_EXPR and
> +	BIT_FIELD_REF.
> +
>  2020-04-17  H.J. Lu  <hongjiu.lu@intel.com>
>  
>  	Backport from master
> diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
> index 3df9dd78d0c..248244343c4 100644
> --- a/gcc/testsuite/ChangeLog
> +++ b/gcc/testsuite/ChangeLog
> @@ -1,3 +1,12 @@
> +2020-04-21  Martin Jambor  <mjambor@suse.cz>
> +
> +	Backport from master
> +	2020-04-09  Martin Jambor  <mjambor@suse.cz>
> +
> +	PR tree-optimization/94482
> +	* gcc.dg/torture/pr94482.c: New test.
> +	* gcc.dg/tree-ssa/pr94482-2.c: Likewise.
> +
>  2020-04-17  H.J. Lu  <hongjiu.lu@intel.com>
>  
>  	Backport from master
> diff --git a/gcc/testsuite/gcc.dg/torture/pr94482.c b/gcc/testsuite/gcc.dg/torture/pr94482.c
> new file mode 100644
> index 00000000000..9264842e349
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/torture/pr94482.c
> @@ -0,0 +1,36 @@
> +/* { dg-do run } */
> +/* { dg-additional-options "-Wno-psabi -w" } */
> +/* { dg-additional-options "-msse2" { target sse2_runtime } } */
> +
> +typedef unsigned V __attribute__ ((__vector_size__ (16)));
> +union U
> +{
> +  V j;
> +  unsigned long long i __attribute__ ((__vector_size__ (16)));
> +};
> +
> +static inline __attribute__((always_inline)) V
> +foo (unsigned long long a)
> +{
> +  union U z = { .j = (V) {} };
> +  for (unsigned long i = 0; i < 1; i++)
> +    z.i[i] = a;
> +  return z.j;
> +}
> +
> +static inline __attribute__((always_inline)) V
> +bar (V a, unsigned long long i, int q)
> +{
> +  union U z = { .j = a };
> +  z.i[q] = i;
> +  return z.j;
> +}
> +
> +int
> +main ()
> +{
> +  union U z = { .j = bar (foo (1729), 2, 1) };
> +  if (z.i[0] != 1729)
> +    __builtin_abort ();
> +  return 0;
> +}
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr94482-2.c b/gcc/testsuite/gcc.dg/tree-ssa/pr94482-2.c
> new file mode 100644
> index 00000000000..fcac9d5e439
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr94482-2.c
> @@ -0,0 +1,50 @@
> +/* { dg-do run } */
> +/* { dg-options "-O1" } */
> +
> +typedef unsigned long V __attribute__ ((__vector_size__ (8)));
> +typedef _Complex int Ci;
> +typedef _Complex float Cf;
> +
> +union U
> +{
> +  Ci ci;
> +  Cf cf;
> +};
> +
> +volatile Ci vgi;
> +
> +Cf foo (Cf c)
> +{
> +  __real c = 0x1ffp10;
> +  return c;
> +}
> +
> +Ci ioo (Ci c)
> +{
> +  __real c = 50;
> +  return c;
> +}
> +
> +
> +int main (int argc, char *argv[])
> +{
> +  union U u;
> +
> +  __real u.ci = 500;
> +  __imag u.ci = 1000;
> +  vgi = u.ci;
> +
> +  u.ci = ioo (u.ci);
> +  __imag u.ci = 100;
> +
> +  if (__real u.ci != 50 || __imag u.ci != 100)
> +    __builtin_abort();
> +
> +  u.cf = foo (u.cf);
> +  __imag u.cf = 0x1p3;
> +
> +  if (__real u.cf != 0x1ffp10 || __imag u.cf != 0x1p3)
> +    __builtin_abort();
> +
> +  return 0;
> +}
> diff --git a/gcc/tree-sra.c b/gcc/tree-sra.c
> index 0a5c24a183a..805839cac50 100644
> --- a/gcc/tree-sra.c
> +++ b/gcc/tree-sra.c
> @@ -2329,7 +2329,7 @@ create_access_replacement (struct access *access, tree reg_type = NULL_TREE)
>  	  print_generic_expr (dump_file, access->base);
>  	  fprintf (dump_file, " offset: %u, size: %u: ",
>  		   (unsigned) access->offset, (unsigned) access->size);
> -	  print_generic_expr (dump_file, repl);
> +	  print_generic_expr (dump_file, repl, TDF_UID);
>  	  fprintf (dump_file, "\n");
>  	}
>      }
> @@ -3187,6 +3187,7 @@ sra_modify_expr (tree *expr, gimple_stmt_iterator *gsi, bool write)
>    location_t loc;
>    struct access *access;
>    tree type, bfr, orig_expr;
> +  bool partial_cplx_access = false;
>  
>    if (TREE_CODE (*expr) == BIT_FIELD_REF)
>      {
> @@ -3197,7 +3198,10 @@ sra_modify_expr (tree *expr, gimple_stmt_iterator *gsi, bool write)
>      bfr = NULL_TREE;
>  
>    if (TREE_CODE (*expr) == REALPART_EXPR || TREE_CODE (*expr) == IMAGPART_EXPR)
> -    expr = &TREE_OPERAND (*expr, 0);
> +    {
> +      expr = &TREE_OPERAND (*expr, 0);
> +      partial_cplx_access = true;
> +    }
>    access = get_access_for_expr (*expr);
>    if (!access)
>      return false;
> @@ -3225,13 +3229,32 @@ sra_modify_expr (tree *expr, gimple_stmt_iterator *gsi, bool write)
>           be accessed as a different type too, potentially creating a need for
>           type conversion (see PR42196) and when scalarized unions are involved
>           in assembler statements (see PR42398).  */
> -      if (!useless_type_conversion_p (type, access->type))
> +      if (!bfr && !useless_type_conversion_p (type, access->type))
>  	{
>  	  tree ref;
>  
>  	  ref = build_ref_for_model (loc, orig_expr, 0, access, gsi, false);
>  
> -	  if (write)
> +	  if (partial_cplx_access)
> +	    {
> +	    /* VIEW_CONVERT_EXPRs in partial complex access are always fine in
> +	       the case of a write because in such case the replacement cannot
> +	       be a gimple register.  In the case of a load, we have to
> +	       differentiate in between a register an non-register
> +	       replacement.  */
> +	      tree t = build1 (VIEW_CONVERT_EXPR, type, repl);
> +	      gcc_checking_assert (!write || access->grp_partial_lhs);
> +	      if (!access->grp_partial_lhs)
> +		{
> +		  tree tmp = make_ssa_name (type);
> +		  gassign *stmt = gimple_build_assign (tmp, t);
> +		  /* This is always a read. */
> +		  gsi_insert_before (gsi, stmt, GSI_SAME_STMT);
> +		  t = tmp;
> +		}
> +	      *expr = t;
> +	    }
> +	  else if (write)
>  	    {
>  	      gassign *stmt;
>  
> diff --git a/gcc/tree-ssa-forwprop.c b/gcc/tree-ssa-forwprop.c
> index bbfa1bc6fae..ca5443b79b3 100644
> --- a/gcc/tree-ssa-forwprop.c
> +++ b/gcc/tree-ssa-forwprop.c
> @@ -2357,7 +2357,8 @@ pass_forwprop::execute (function *fun)
>  		    continue;
>  		  if (!is_gimple_assign (use_stmt)
>  		      || (gimple_assign_rhs_code (use_stmt) != REALPART_EXPR
> -			  && gimple_assign_rhs_code (use_stmt) != IMAGPART_EXPR))
> +			  && gimple_assign_rhs_code (use_stmt) != IMAGPART_EXPR)
> +		      || TREE_OPERAND (gimple_assign_rhs1 (use_stmt), 0) != lhs)
>  		    {
>  		      rewrite = false;
>  		      break;
>
diff mbox series

Patch

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index e10fb251c16..36ddef64afd 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,11 @@ 
+2020-04-06  Martin Jambor  <mjambor@suse.cz>
+
+	PR tree-optimization/94482
+	* tree-sra.c (create_access_replacement): Dump new replacement with
+	TDF_UID.
+	(sra_modify_expr): Fix handling of cases when the original EXPR writes
+	to only part of the replacement.
+
 2020-04-05 Zachary Spytz  <zspytz@gmail.com>
 
 	* extend.texi: Add free to list of ISO C90 functions that
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index 88bab5d3d19..8b85e55afe8 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,9 @@ 
+2020-04-06  Martin Jambor  <mjambor@suse.cz>
+
+	PR tree-optimization/94482
+	* gcc.dg/torture/pr94482.c: New test.
+	* gcc.dg/tree-ssa/pr94482-2.c: Likewise.
+
 2020-04-05  Iain Sandoe  <iain@sandoe.co.uk>
 
 	* g++.dg/coroutines/torture/co-await-14-template-traits.C: Rename...
diff --git a/gcc/testsuite/gcc.dg/torture/pr94482.c b/gcc/testsuite/gcc.dg/torture/pr94482.c
new file mode 100644
index 00000000000..5bb19e745c2
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/pr94482.c
@@ -0,0 +1,34 @@ 
+/* { dg-do run } */
+
+typedef unsigned V __attribute__ ((__vector_size__ (16)));
+union U
+{
+  V j;
+  unsigned long long i __attribute__ ((__vector_size__ (16)));
+};
+
+static inline __attribute__((always_inline)) V
+foo (unsigned long long a)
+{
+  union U z = { .j = (V) {} };
+  for (unsigned long i = 0; i < 1; i++)
+    z.i[i] = a;
+  return z.j;
+}
+
+static inline __attribute__((always_inline)) V
+bar (V a, unsigned long long i, int q)
+{
+  union U z = { .j = a };
+  z.i[q] = i;
+  return z.j;
+}
+
+int
+main ()
+{
+  union U z = { .j = bar (foo (1729), 2, 1) };
+  if (z.i[0] != 1729)
+    __builtin_abort ();
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr94482-2.c b/gcc/testsuite/gcc.dg/tree-ssa/pr94482-2.c
new file mode 100644
index 00000000000..fcac9d5e439
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr94482-2.c
@@ -0,0 +1,50 @@ 
+/* { dg-do run } */
+/* { dg-options "-O1" } */
+
+typedef unsigned long V __attribute__ ((__vector_size__ (8)));
+typedef _Complex int Ci;
+typedef _Complex float Cf;
+
+union U
+{
+  Ci ci;
+  Cf cf;
+};
+
+volatile Ci vgi;
+
+Cf foo (Cf c)
+{
+  __real c = 0x1ffp10;
+  return c;
+}
+
+Ci ioo (Ci c)
+{
+  __real c = 50;
+  return c;
+}
+
+
+int main (int argc, char *argv[])
+{
+  union U u;
+
+  __real u.ci = 500;
+  __imag u.ci = 1000;
+  vgi = u.ci;
+
+  u.ci = ioo (u.ci);
+  __imag u.ci = 100;
+
+  if (__real u.ci != 50 || __imag u.ci != 100)
+    __builtin_abort();
+
+  u.cf = foo (u.cf);
+  __imag u.cf = 0x1p3;
+
+  if (__real u.cf != 0x1ffp10 || __imag u.cf != 0x1p3)
+    __builtin_abort();
+
+  return 0;
+}
diff --git a/gcc/tree-sra.c b/gcc/tree-sra.c
index b2056b58750..494ab609149 100644
--- a/gcc/tree-sra.c
+++ b/gcc/tree-sra.c
@@ -2257,7 +2257,7 @@  create_access_replacement (struct access *access, tree reg_type = NULL_TREE)
 	  print_generic_expr (dump_file, access->base);
 	  fprintf (dump_file, " offset: %u, size: %u: ",
 		   (unsigned) access->offset, (unsigned) access->size);
-	  print_generic_expr (dump_file, repl);
+	  print_generic_expr (dump_file, repl, TDF_UID);
 	  fprintf (dump_file, "\n");
 	}
     }
@@ -3698,6 +3698,7 @@  sra_modify_expr (tree *expr, gimple_stmt_iterator *gsi, bool write)
   location_t loc;
   struct access *access;
   tree type, bfr, orig_expr;
+  bool partial_cplx_access = false;
 
   if (TREE_CODE (*expr) == BIT_FIELD_REF)
     {
@@ -3708,7 +3709,10 @@  sra_modify_expr (tree *expr, gimple_stmt_iterator *gsi, bool write)
     bfr = NULL_TREE;
 
   if (TREE_CODE (*expr) == REALPART_EXPR || TREE_CODE (*expr) == IMAGPART_EXPR)
-    expr = &TREE_OPERAND (*expr, 0);
+    {
+      expr = &TREE_OPERAND (*expr, 0);
+      partial_cplx_access = true;
+    }
   access = get_access_for_expr (*expr);
   if (!access)
     return false;
@@ -3736,13 +3740,18 @@  sra_modify_expr (tree *expr, gimple_stmt_iterator *gsi, bool write)
          be accessed as a different type too, potentially creating a need for
          type conversion (see PR42196) and when scalarized unions are involved
          in assembler statements (see PR42398).  */
-      if (!useless_type_conversion_p (type, access->type))
+      if (!bfr && !useless_type_conversion_p (type, access->type))
 	{
 	  tree ref;
 
 	  ref = build_ref_for_model (loc, orig_expr, 0, access, gsi, false);
 
-	  if (write)
+	  if (partial_cplx_access)
+	    /* VIEW_CONVERT_EXPRs in partial complex access are fine even in
+	       the case of a write because in such case the replacement cannot
+	       be a gimple register.  */
+	    *expr = build1 (VIEW_CONVERT_EXPR, type, repl);
+	  else if (write)
 	    {
 	      gassign *stmt;