diff mbox

Remove self-assignments

Message ID alpine.DEB.2.02.1306091817340.24602@stedding.saclay.inria.fr
State New
Headers show

Commit Message

Marc Glisse June 9, 2013, 4:25 p.m. UTC
Hello,

this patch removes some self-assignments. I don't know if this is the best 
way, but it passes a bootstrap and the testsuite on x86_64-linux-gnu.

2013-06-10  Marc Glisse  <marc.glisse@inria.fr>

 	PR tree-optimization/57361
gcc/
 	* tree-ssa-dse.c (dse_possible_dead_store_p): Handle self-assignment.

gcc/testsuite/
 	* gcc.dg/tree-ssa/pr57361.c: New file.

Comments

Jeff Law June 11, 2013, 6:37 p.m. UTC | #1
On 06/09/13 10:25, Marc Glisse wrote:
> Hello,
>
> this patch removes some self-assignments. I don't know if this is the
> best way, but it passes a bootstrap and the testsuite on x86_64-linux-gnu.
>
> 2013-06-10  Marc Glisse  <marc.glisse@inria.fr>
>
>      PR tree-optimization/57361
> gcc/
>      * tree-ssa-dse.c (dse_possible_dead_store_p): Handle self-assignment.
>
> gcc/testsuite/
>      * gcc.dg/tree-ssa/pr57361.c: New file.
So dse_optimize_stmt will verify the statement does not have volatile 
operands.  However, it doesn't verify the statement does not potentially 
throw (think about a segfault on the store when async exceptions are 
enabled).  I think you need to test for that explicitly.


Is there some reason this won't work in tree-ssa-dce.c?  That gets run 
more often and these stores may be preventing code motion opportunities, 
so getting them out of the IL stream as early as possible would be good.

I'd be curious how often this triggers in GCC itself as well.

jeff
Marc Glisse June 11, 2013, 7:30 p.m. UTC | #2
On Tue, 11 Jun 2013, Jeff Law wrote:

> On 06/09/13 10:25, Marc Glisse wrote:
>> Hello,
>> 
>> this patch removes some self-assignments. I don't know if this is the
>> best way, but it passes a bootstrap and the testsuite on x86_64-linux-gnu.
>> 
>> 2013-06-10  Marc Glisse  <marc.glisse@inria.fr>
>>
>>      PR tree-optimization/57361
>> gcc/
>>      * tree-ssa-dse.c (dse_possible_dead_store_p): Handle self-assignment.
>> 
>> gcc/testsuite/
>>      * gcc.dg/tree-ssa/pr57361.c: New file.
> So dse_optimize_stmt will verify the statement does not have volatile 
> operands.

operand_equal_p also does it, so we are well covered there ;-)

> However, it doesn't verify the statement does not potentially 
> throw (think about a segfault on the store when async exceptions are 
> enabled).  I think you need to test for that explicitly.

Hmm, I am not at all familiar with that. Google drowns me in C# and 
javascript links, and grepping through the sources only pointed me to the 
-fasynchronous-unwind-tables flag.

Richard noticed in the PR that expand_assignment already does:

   /* Optimize away no-op moves without side-effects.  */
   if (operand_equal_p (to, from, 0))
     return;

so it looks like the operand_equal_p test should be sufficient (or the 
compiler already breaks that code).

> Is there some reason this won't work in tree-ssa-dce.c?  That gets run more 
> often and these stores may be preventing code motion opportunities, so 
> getting them out of the IL stream as early as possible would be good.

In the first version of the patch:
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57303#c6
I was doing it in fold_stmt_1, which should be called often enough. 
Richard suggested DSE in the PR, but that might be because he had a more 
subtle test in mind. I am certainly ok with moving it to DCE or anywhere 
else...

> I'd be curious how often this triggers in GCC itself as well.

Do you know a convenient way to test that?
Marek Polacek June 11, 2013, 8:02 p.m. UTC | #3
On Tue, Jun 11, 2013 at 09:30:29PM +0200, Marc Glisse wrote:
> >I'd be curious how often this triggers in GCC itself as well.
> 
> Do you know a convenient way to test that?

Perhaps you could put in the 
if (gimple_assign_rhs_code (stmt) == TREE_CODE (gimple_assign_lhs (stmt))
    && operand_equal_p (gimple_assign_rhs1 (stmt),
			gimple_assign_lhs (stmt), 0))
{
 ...
}

something like
  FILE *f = fopen ("/tmp/self", "a");
  fprintf (f, "%s ", main_input_filename);
  print_gimple_stmt (f, stmt, 0, TDF_VOPS|TDF_MEMSYMS);
  fclose (f);

(completely untested)

	Marek
Richard Biener June 12, 2013, 8:03 a.m. UTC | #4
On Tue, Jun 11, 2013 at 9:30 PM, Marc Glisse <marc.glisse@inria.fr> wrote:
> On Tue, 11 Jun 2013, Jeff Law wrote:
>
>> On 06/09/13 10:25, Marc Glisse wrote:
>>>
>>> Hello,
>>>
>>> this patch removes some self-assignments. I don't know if this is the
>>> best way, but it passes a bootstrap and the testsuite on
>>> x86_64-linux-gnu.
>>>
>>> 2013-06-10  Marc Glisse  <marc.glisse@inria.fr>
>>>
>>>      PR tree-optimization/57361
>>> gcc/
>>>      * tree-ssa-dse.c (dse_possible_dead_store_p): Handle
>>> self-assignment.
>>>
>>> gcc/testsuite/
>>>      * gcc.dg/tree-ssa/pr57361.c: New file.
>>
>> So dse_optimize_stmt will verify the statement does not have volatile
>> operands.
>
>
> operand_equal_p also does it, so we are well covered there ;-)
>
>
>> However, it doesn't verify the statement does not potentially throw (think
>> about a segfault on the store when async exceptions are enabled).  I think
>> you need to test for that explicitly.
>
>
> Hmm, I am not at all familiar with that. Google drowns me in C# and
> javascript links, and grepping through the sources only pointed me to the
> -fasynchronous-unwind-tables flag.
>
> Richard noticed in the PR that expand_assignment already does:
>
>   /* Optimize away no-op moves without side-effects.  */
>   if (operand_equal_p (to, from, 0))
>     return;
>
> so it looks like the operand_equal_p test should be sufficient (or the
> compiler already breaks that code).
>
>
>> Is there some reason this won't work in tree-ssa-dce.c?  That gets run
>> more often and these stores may be preventing code motion opportunities, so
>> getting them out of the IL stream as early as possible would be good.
>
>
> In the first version of the patch:
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57303#c6
> I was doing it in fold_stmt_1, which should be called often enough. Richard
> suggested DSE in the PR, but that might be because he had a more subtle test
> in mind. I am certainly ok with moving it to DCE or anywhere else...

DSE looks like the right place to me as we are removing a store.  Yes,
DCE removes a limited set of stores as well, but the way we remove this kind
of store makes it much more suited to DSE.

As of possibly trapping/throwing stores, we do not bother to preserve those
(even with -fnon-call-exceptions).

Thus, the patch is ok with me if you agree with that and with the following
adjustment:

+  /* Self-assignments are zombies.  */
+  if (gimple_assign_rhs_code (stmt) == TREE_CODE (gimple_assign_lhs (stmt))
+      && operand_equal_p (gimple_assign_rhs1 (stmt),
+                         gimple_assign_lhs (stmt), 0))

I see no need to compare the codes of the LHS and the RHS, that's redundant
with what operand_equal_p does.

Thus, ok with removing that test if it bootstraps / regtests ok and if Jeff
has no further comments.

Thanks,
Richard.

>> I'd be curious how often this triggers in GCC itself as well.
>
>
> Do you know a convenient way to test that?
>
> --
> Marc Glisse
Marc Glisse June 12, 2013, 8:47 a.m. UTC | #5
On Wed, 12 Jun 2013, Richard Biener wrote:

> On Tue, Jun 11, 2013 at 9:30 PM, Marc Glisse <marc.glisse@inria.fr> wrote:
>> On Tue, 11 Jun 2013, Jeff Law wrote:
>>
>>> On 06/09/13 10:25, Marc Glisse wrote:
>>>>
>>>> Hello,
>>>>
>>>> this patch removes some self-assignments. I don't know if this is the
>>>> best way, but it passes a bootstrap and the testsuite on
>>>> x86_64-linux-gnu.
>>>>
>>>> 2013-06-10  Marc Glisse  <marc.glisse@inria.fr>
>>>>
>>>>      PR tree-optimization/57361
>>>> gcc/
>>>>      * tree-ssa-dse.c (dse_possible_dead_store_p): Handle
>>>> self-assignment.
>>>>
>>>> gcc/testsuite/
>>>>      * gcc.dg/tree-ssa/pr57361.c: New file.
>>>
>>> So dse_optimize_stmt will verify the statement does not have volatile
>>> operands.
>>
>>
>> operand_equal_p also does it, so we are well covered there ;-)
>>
>>
>>> However, it doesn't verify the statement does not potentially throw (think
>>> about a segfault on the store when async exceptions are enabled).  I think
>>> you need to test for that explicitly.
>>
>>
>> Hmm, I am not at all familiar with that. Google drowns me in C# and
>> javascript links, and grepping through the sources only pointed me to the
>> -fasynchronous-unwind-tables flag.
>>
>> Richard noticed in the PR that expand_assignment already does:
>>
>>   /* Optimize away no-op moves without side-effects.  */
>>   if (operand_equal_p (to, from, 0))
>>     return;
>>
>> so it looks like the operand_equal_p test should be sufficient (or the
>> compiler already breaks that code).
>>
>>
>>> Is there some reason this won't work in tree-ssa-dce.c?  That gets run
>>> more often and these stores may be preventing code motion opportunities, so
>>> getting them out of the IL stream as early as possible would be good.
>>
>>
>> In the first version of the patch:
>> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57303#c6
>> I was doing it in fold_stmt_1, which should be called often enough. Richard
>> suggested DSE in the PR, but that might be because he had a more subtle test
>> in mind. I am certainly ok with moving it to DCE or anywhere else...
>
> DSE looks like the right place to me as we are removing a store.  Yes,
> DCE removes a limited set of stores as well, but the way we remove this kind
> of store makes it much more suited to DSE.
>
> As of possibly trapping/throwing stores, we do not bother to preserve those
> (even with -fnon-call-exceptions).
>
> Thus, the patch is ok with me if you agree with that and with the following
> adjustment:
>
> +  /* Self-assignments are zombies.  */
> +  if (gimple_assign_rhs_code (stmt) == TREE_CODE (gimple_assign_lhs (stmt))
> +      && operand_equal_p (gimple_assign_rhs1 (stmt),
> +                         gimple_assign_lhs (stmt), 0))
>
> I see no need to compare the codes of the LHS and the RHS, that's redundant
> with what operand_equal_p does.

I was trying to make sure first that it was a pure assignment and thus 
that rhs1 was the only thing to look at, not for instance *a=-*a, but I 
guess in gimple that would be 3 statements. I'll retest without it.

> Thus, ok with removing that test if it bootstraps / regtests ok and if Jeff
> has no further comments.


>>> I'd be curious how often this triggers in GCC itself as well.

Essentially never. I tried with the fold_stmt version of the patch, and 
libstdc++-v3/src/c++98/concept-inst.cc is the only file where it triggers. 
Note that the case:
b=*a
*a=b
is already handled by FRE which removes *a=b (copyprop later removes the 
dead b=*a and release_ssa removes the unused variable b), it is only *a=*a 
that wasn't handled. Now that I look at it, it is a bit surprising that:

struct A {int i;};
void f(A*a,A*b){ A c=*b; *a=c; }
void g(A*a,A*b){ *a=*b; }

gives 2 different .optimized gimple:

   c$i_5 = MEM[(const struct A &)b_2(D)];
   MEM[(struct A *)a_3(D)] = c$i_5;

and:

   *a_2(D) = MEM[(const struct A &)b_3(D)];

Aren't they equivalent? And if so, which form should be preferred?
Richard Biener June 12, 2013, 9:44 a.m. UTC | #6
On Wed, Jun 12, 2013 at 10:47 AM, Marc Glisse <marc.glisse@inria.fr> wrote:
> On Wed, 12 Jun 2013, Richard Biener wrote:
>
>> On Tue, Jun 11, 2013 at 9:30 PM, Marc Glisse <marc.glisse@inria.fr> wrote:
>>>
>>> On Tue, 11 Jun 2013, Jeff Law wrote:
>>>
>>>> On 06/09/13 10:25, Marc Glisse wrote:
>>>>>
>>>>>
>>>>> Hello,
>>>>>
>>>>> this patch removes some self-assignments. I don't know if this is the
>>>>> best way, but it passes a bootstrap and the testsuite on
>>>>> x86_64-linux-gnu.
>>>>>
>>>>> 2013-06-10  Marc Glisse  <marc.glisse@inria.fr>
>>>>>
>>>>>      PR tree-optimization/57361
>>>>> gcc/
>>>>>      * tree-ssa-dse.c (dse_possible_dead_store_p): Handle
>>>>> self-assignment.
>>>>>
>>>>> gcc/testsuite/
>>>>>      * gcc.dg/tree-ssa/pr57361.c: New file.
>>>>
>>>>
>>>> So dse_optimize_stmt will verify the statement does not have volatile
>>>> operands.
>>>
>>>
>>>
>>> operand_equal_p also does it, so we are well covered there ;-)
>>>
>>>
>>>> However, it doesn't verify the statement does not potentially throw
>>>> (think
>>>> about a segfault on the store when async exceptions are enabled).  I
>>>> think
>>>> you need to test for that explicitly.
>>>
>>>
>>>
>>> Hmm, I am not at all familiar with that. Google drowns me in C# and
>>> javascript links, and grepping through the sources only pointed me to the
>>> -fasynchronous-unwind-tables flag.
>>>
>>> Richard noticed in the PR that expand_assignment already does:
>>>
>>>   /* Optimize away no-op moves without side-effects.  */
>>>   if (operand_equal_p (to, from, 0))
>>>     return;
>>>
>>> so it looks like the operand_equal_p test should be sufficient (or the
>>> compiler already breaks that code).
>>>
>>>
>>>> Is there some reason this won't work in tree-ssa-dce.c?  That gets run
>>>> more often and these stores may be preventing code motion opportunities,
>>>> so
>>>> getting them out of the IL stream as early as possible would be good.
>>>
>>>
>>>
>>> In the first version of the patch:
>>> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57303#c6
>>> I was doing it in fold_stmt_1, which should be called often enough.
>>> Richard
>>> suggested DSE in the PR, but that might be because he had a more subtle
>>> test
>>> in mind. I am certainly ok with moving it to DCE or anywhere else...
>>
>>
>> DSE looks like the right place to me as we are removing a store.  Yes,
>> DCE removes a limited set of stores as well, but the way we remove this
>> kind
>> of store makes it much more suited to DSE.
>>
>> As of possibly trapping/throwing stores, we do not bother to preserve
>> those
>> (even with -fnon-call-exceptions).
>>
>> Thus, the patch is ok with me if you agree with that and with the
>> following
>> adjustment:
>>
>> +  /* Self-assignments are zombies.  */
>> +  if (gimple_assign_rhs_code (stmt) == TREE_CODE (gimple_assign_lhs
>> (stmt))
>> +      && operand_equal_p (gimple_assign_rhs1 (stmt),
>> +                         gimple_assign_lhs (stmt), 0))
>>
>> I see no need to compare the codes of the LHS and the RHS, that's
>> redundant
>> with what operand_equal_p does.
>
>
> I was trying to make sure first that it was a pure assignment and thus that
> rhs1 was the only thing to look at, not for instance *a=-*a, but I guess in
> gimple that would be 3 statements. I'll retest without it.
>
>
>> Thus, ok with removing that test if it bootstraps / regtests ok and if
>> Jeff
>> has no further comments.
>
>
>
>>>> I'd be curious how often this triggers in GCC itself as well.
>
>
> Essentially never. I tried with the fold_stmt version of the patch, and
> libstdc++-v3/src/c++98/concept-inst.cc is the only file where it triggers.
> Note that the case:
> b=*a
> *a=b
> is already handled by FRE which removes *a=b (copyprop later removes the
> dead b=*a and release_ssa removes the unused variable b), it is only *a=*a
> that wasn't handled. Now that I look at it, it is a bit surprising that:
>
> struct A {int i;};
> void f(A*a,A*b){ A c=*b; *a=c; }
> void g(A*a,A*b){ *a=*b; }
>
> gives 2 different .optimized gimple:
>
>   c$i_5 = MEM[(const struct A &)b_2(D)];
>   MEM[(struct A *)a_3(D)] = c$i_5;
>
> and:
>
>   *a_2(D) = MEM[(const struct A &)b_3(D)];
>
> Aren't they equivalent? And if so, which form should be preferred?

Well, the first is optimized by SRA to copy element-wise and thus the
loads/stores have is_gimple_reg_type () which require separate loads/stores.
The second is an aggregate copy where we cannot generate SSA temporaries
for the result of the load (!is_gimple_reg_type ()) and thus we are required
to have a single statement.

One of my pending GIMPLE re-org tasks is to always separate loads and
stores and allow SSA names of aggregate type, thus we'd have

 tem_1 = MEM[(const struct A &)b_3(D)];
 *a_2(D) = tem_1;

even for the 2nd case.  That solves the fact that we are missing an
aggregate copy propagation pass quite nicely.

Yes, you have to watch for not creating (too many) overlapping life-ranges
as out-of-SSA won't be able to assign the temporary aggregate SSA names
to registers but possibly has to allocate costly stack space for them.

Setting SSA_NAME_OCCURS_IN_ABNORMAL_PHI on them solves this
issue in a hacky way.

On my list since about 5 years ... ;)

Richard.

> --
> Marc Glisse
Jeff Law June 12, 2013, 3:42 p.m. UTC | #7
On 06/12/13 02:03, Richard Biener wrote:
>
> DSE looks like the right place to me as we are removing a store.  Yes,
> DCE removes a limited set of stores as well, but the way we remove this kind
> of store makes it much more suited to DSE.
>
> As of possibly trapping/throwing stores, we do not bother to preserve those
> (even with -fnon-call-exceptions).
A bit of a surprise to hear that.  I don't mind much though, as 
-fnon-call-exceptions isn't something I find terribly useful.

Consider my "objections" withdrawn.
Jeff
Jeff Law June 12, 2013, 3:46 p.m. UTC | #8
On 06/11/13 13:30, Marc Glisse wrote:
>
>
>> I'd be curious how often this triggers in GCC itself as well.
>
> Do you know a convenient way to test that?
I usually put in some kind of debugging printfs during early development 
which I can then grep for in build logs.  Not very sexy, but effective 
to see if a particular transformation is triggering outside contrived 
testcodes.

jeff
Jakub Jelinek June 12, 2013, 3:49 p.m. UTC | #9
On Wed, Jun 12, 2013 at 09:42:55AM -0600, Jeff Law wrote:
> On 06/12/13 02:03, Richard Biener wrote:
> >DSE looks like the right place to me as we are removing a store.  Yes,
> >DCE removes a limited set of stores as well, but the way we remove this kind
> >of store makes it much more suited to DSE.
> >
> >As of possibly trapping/throwing stores, we do not bother to preserve those
> >(even with -fnon-call-exceptions).
> A bit of a surprise to hear that.  I don't mind much though, as
> -fnon-call-exceptions isn't something I find terribly useful.

Well, a segfault accessing invalid pointer is undefined behavior, so it is
fine to optimize it away.

	Jakub
Marc Glisse June 12, 2013, 6 p.m. UTC | #10
On Wed, 12 Jun 2013, Richard Biener wrote:

> On Wed, Jun 12, 2013 at 10:47 AM, Marc Glisse <marc.glisse@inria.fr> wrote:
>
>> Essentially never. I tried with the fold_stmt version of the patch, and
>> libstdc++-v3/src/c++98/concept-inst.cc is the only file where it triggers.
>> Note that the case:
>> b=*a
>> *a=b
>> is already handled by FRE which removes *a=b (copyprop later removes the
>> dead b=*a and release_ssa removes the unused variable b), it is only *a=*a
>> that wasn't handled. Now that I look at it, it is a bit surprising that:
>>
>> struct A {int i;};
>> void f(A*a,A*b){ A c=*b; *a=c; }
>> void g(A*a,A*b){ *a=*b; }
>>
>> gives 2 different .optimized gimple:
>>
>>   c$i_5 = MEM[(const struct A &)b_2(D)];
>>   MEM[(struct A *)a_3(D)] = c$i_5;
>>
>> and:
>>
>>   *a_2(D) = MEM[(const struct A &)b_3(D)];
>>
>> Aren't they equivalent? And if so, which form should be preferred?
>
> Well, the first is optimized by SRA to copy element-wise and thus the
> loads/stores have is_gimple_reg_type () which require separate loads/stores.
> The second is an aggregate copy where we cannot generate SSA temporaries
> for the result of the load (!is_gimple_reg_type ()) and thus we are required
> to have a single statement.
>
> One of my pending GIMPLE re-org tasks is to always separate loads and
> stores and allow SSA names of aggregate type, thus we'd have
>
> tem_1 = MEM[(const struct A &)b_3(D)];
> *a_2(D) = tem_1;
>
> even for the 2nd case.  That solves the fact that we are missing an
> aggregate copy propagation pass quite nicely.
>
> Yes, you have to watch for not creating (too many) overlapping life-ranges
> as out-of-SSA won't be able to assign the temporary aggregate SSA names
> to registers but possibly has to allocate costly stack space for them.
>
> Setting SSA_NAME_OCCURS_IN_ABNORMAL_PHI on them solves this
> issue in a hacky way.
>
> On my list since about 5 years ... ;)

I was going to ask if I should wait for it (it makes my patch 
unnecessary), but I guess that answers it. Since Jeff seems ok, I 
committed it at r200034.
diff mbox

Patch

Index: testsuite/gcc.dg/tree-ssa/pr57361.c
===================================================================
--- testsuite/gcc.dg/tree-ssa/pr57361.c	(revision 0)
+++ testsuite/gcc.dg/tree-ssa/pr57361.c	(revision 0)
@@ -0,0 +1,9 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O -fdump-tree-dse1-details" } */
+
+struct A { int x; double y; };
+void f (struct A *a) {
+  *a = *a;
+}
+
+/* { dg-final { scan-tree-dump "Deleted dead store" "dse1"} } */

Property changes on: testsuite/gcc.dg/tree-ssa/pr57361.c
___________________________________________________________________
Added: svn:keywords
   + Author Date Id Revision URL
Added: svn:eol-style
   + native

Index: tree-ssa-dse.c
===================================================================
--- tree-ssa-dse.c	(revision 199867)
+++ tree-ssa-dse.c	(working copy)
@@ -77,20 +77,29 @@  static void dse_enter_block (struct dom_
    Return TRUE if the above conditions are met, otherwise FALSE.  */
 
 static bool
 dse_possible_dead_store_p (gimple stmt, gimple *use_stmt)
 {
   gimple temp;
   unsigned cnt = 0;
 
   *use_stmt = NULL;
 
+  /* Self-assignments are zombies.  */
+  if (gimple_assign_rhs_code (stmt) == TREE_CODE (gimple_assign_lhs (stmt))
+      && operand_equal_p (gimple_assign_rhs1 (stmt),
+			  gimple_assign_lhs (stmt), 0))
+    {
+      *use_stmt = stmt;
+      return true;
+    }
+
   /* Find the first dominated statement that clobbers (part of) the
      memory stmt stores to with no intermediate statement that may use
      part of the memory stmt stores.  That is, find a store that may
      prove stmt to be a dead store.  */
   temp = stmt;
   do
     {
       gimple use_stmt, defvar_def;
       imm_use_iterator ui;
       bool fail = false;