Message ID | alpine.DEB.2.02.1306091817340.24602@stedding.saclay.inria.fr |
---|---|
State | New |
Headers | show |
On 06/09/13 10:25, Marc Glisse wrote: > Hello, > > this patch removes some self-assignments. I don't know if this is the > best way, but it passes a bootstrap and the testsuite on x86_64-linux-gnu. > > 2013-06-10 Marc Glisse <marc.glisse@inria.fr> > > PR tree-optimization/57361 > gcc/ > * tree-ssa-dse.c (dse_possible_dead_store_p): Handle self-assignment. > > gcc/testsuite/ > * gcc.dg/tree-ssa/pr57361.c: New file. So dse_optimize_stmt will verify the statement does not have volatile operands. However, it doesn't verify the statement does not potentially throw (think about a segfault on the store when async exceptions are enabled). I think you need to test for that explicitly. Is there some reason this won't work in tree-ssa-dce.c? That gets run more often and these stores may be preventing code motion opportunities, so getting them out of the IL stream as early as possible would be good. I'd be curious how often this triggers in GCC itself as well. jeff
On Tue, 11 Jun 2013, Jeff Law wrote: > On 06/09/13 10:25, Marc Glisse wrote: >> Hello, >> >> this patch removes some self-assignments. I don't know if this is the >> best way, but it passes a bootstrap and the testsuite on x86_64-linux-gnu. >> >> 2013-06-10 Marc Glisse <marc.glisse@inria.fr> >> >> PR tree-optimization/57361 >> gcc/ >> * tree-ssa-dse.c (dse_possible_dead_store_p): Handle self-assignment. >> >> gcc/testsuite/ >> * gcc.dg/tree-ssa/pr57361.c: New file. > So dse_optimize_stmt will verify the statement does not have volatile > operands. operand_equal_p also does it, so we are well covered there ;-) > However, it doesn't verify the statement does not potentially > throw (think about a segfault on the store when async exceptions are > enabled). I think you need to test for that explicitly. Hmm, I am not at all familiar with that. Google drowns me in C# and javascript links, and grepping through the sources only pointed me to the -fasynchronous-unwind-tables flag. Richard noticed in the PR that expand_assignment already does: /* Optimize away no-op moves without side-effects. */ if (operand_equal_p (to, from, 0)) return; so it looks like the operand_equal_p test should be sufficient (or the compiler already breaks that code). > Is there some reason this won't work in tree-ssa-dce.c? That gets run more > often and these stores may be preventing code motion opportunities, so > getting them out of the IL stream as early as possible would be good. In the first version of the patch: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57303#c6 I was doing it in fold_stmt_1, which should be called often enough. Richard suggested DSE in the PR, but that might be because he had a more subtle test in mind. I am certainly ok with moving it to DCE or anywhere else... > I'd be curious how often this triggers in GCC itself as well. Do you know a convenient way to test that?
On Tue, Jun 11, 2013 at 09:30:29PM +0200, Marc Glisse wrote: > >I'd be curious how often this triggers in GCC itself as well. > > Do you know a convenient way to test that? Perhaps you could put in the if (gimple_assign_rhs_code (stmt) == TREE_CODE (gimple_assign_lhs (stmt)) && operand_equal_p (gimple_assign_rhs1 (stmt), gimple_assign_lhs (stmt), 0)) { ... } something like FILE *f = fopen ("/tmp/self", "a"); fprintf (f, "%s ", main_input_filename); print_gimple_stmt (f, stmt, 0, TDF_VOPS|TDF_MEMSYMS); fclose (f); (completely untested) Marek
On Tue, Jun 11, 2013 at 9:30 PM, Marc Glisse <marc.glisse@inria.fr> wrote: > On Tue, 11 Jun 2013, Jeff Law wrote: > >> On 06/09/13 10:25, Marc Glisse wrote: >>> >>> Hello, >>> >>> this patch removes some self-assignments. I don't know if this is the >>> best way, but it passes a bootstrap and the testsuite on >>> x86_64-linux-gnu. >>> >>> 2013-06-10 Marc Glisse <marc.glisse@inria.fr> >>> >>> PR tree-optimization/57361 >>> gcc/ >>> * tree-ssa-dse.c (dse_possible_dead_store_p): Handle >>> self-assignment. >>> >>> gcc/testsuite/ >>> * gcc.dg/tree-ssa/pr57361.c: New file. >> >> So dse_optimize_stmt will verify the statement does not have volatile >> operands. > > > operand_equal_p also does it, so we are well covered there ;-) > > >> However, it doesn't verify the statement does not potentially throw (think >> about a segfault on the store when async exceptions are enabled). I think >> you need to test for that explicitly. > > > Hmm, I am not at all familiar with that. Google drowns me in C# and > javascript links, and grepping through the sources only pointed me to the > -fasynchronous-unwind-tables flag. > > Richard noticed in the PR that expand_assignment already does: > > /* Optimize away no-op moves without side-effects. */ > if (operand_equal_p (to, from, 0)) > return; > > so it looks like the operand_equal_p test should be sufficient (or the > compiler already breaks that code). > > >> Is there some reason this won't work in tree-ssa-dce.c? That gets run >> more often and these stores may be preventing code motion opportunities, so >> getting them out of the IL stream as early as possible would be good. > > > In the first version of the patch: > http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57303#c6 > I was doing it in fold_stmt_1, which should be called often enough. Richard > suggested DSE in the PR, but that might be because he had a more subtle test > in mind. I am certainly ok with moving it to DCE or anywhere else... DSE looks like the right place to me as we are removing a store. Yes, DCE removes a limited set of stores as well, but the way we remove this kind of store makes it much more suited to DSE. As of possibly trapping/throwing stores, we do not bother to preserve those (even with -fnon-call-exceptions). Thus, the patch is ok with me if you agree with that and with the following adjustment: + /* Self-assignments are zombies. */ + if (gimple_assign_rhs_code (stmt) == TREE_CODE (gimple_assign_lhs (stmt)) + && operand_equal_p (gimple_assign_rhs1 (stmt), + gimple_assign_lhs (stmt), 0)) I see no need to compare the codes of the LHS and the RHS, that's redundant with what operand_equal_p does. Thus, ok with removing that test if it bootstraps / regtests ok and if Jeff has no further comments. Thanks, Richard. >> I'd be curious how often this triggers in GCC itself as well. > > > Do you know a convenient way to test that? > > -- > Marc Glisse
On Wed, 12 Jun 2013, Richard Biener wrote: > On Tue, Jun 11, 2013 at 9:30 PM, Marc Glisse <marc.glisse@inria.fr> wrote: >> On Tue, 11 Jun 2013, Jeff Law wrote: >> >>> On 06/09/13 10:25, Marc Glisse wrote: >>>> >>>> Hello, >>>> >>>> this patch removes some self-assignments. I don't know if this is the >>>> best way, but it passes a bootstrap and the testsuite on >>>> x86_64-linux-gnu. >>>> >>>> 2013-06-10 Marc Glisse <marc.glisse@inria.fr> >>>> >>>> PR tree-optimization/57361 >>>> gcc/ >>>> * tree-ssa-dse.c (dse_possible_dead_store_p): Handle >>>> self-assignment. >>>> >>>> gcc/testsuite/ >>>> * gcc.dg/tree-ssa/pr57361.c: New file. >>> >>> So dse_optimize_stmt will verify the statement does not have volatile >>> operands. >> >> >> operand_equal_p also does it, so we are well covered there ;-) >> >> >>> However, it doesn't verify the statement does not potentially throw (think >>> about a segfault on the store when async exceptions are enabled). I think >>> you need to test for that explicitly. >> >> >> Hmm, I am not at all familiar with that. Google drowns me in C# and >> javascript links, and grepping through the sources only pointed me to the >> -fasynchronous-unwind-tables flag. >> >> Richard noticed in the PR that expand_assignment already does: >> >> /* Optimize away no-op moves without side-effects. */ >> if (operand_equal_p (to, from, 0)) >> return; >> >> so it looks like the operand_equal_p test should be sufficient (or the >> compiler already breaks that code). >> >> >>> Is there some reason this won't work in tree-ssa-dce.c? That gets run >>> more often and these stores may be preventing code motion opportunities, so >>> getting them out of the IL stream as early as possible would be good. >> >> >> In the first version of the patch: >> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57303#c6 >> I was doing it in fold_stmt_1, which should be called often enough. Richard >> suggested DSE in the PR, but that might be because he had a more subtle test >> in mind. I am certainly ok with moving it to DCE or anywhere else... > > DSE looks like the right place to me as we are removing a store. Yes, > DCE removes a limited set of stores as well, but the way we remove this kind > of store makes it much more suited to DSE. > > As of possibly trapping/throwing stores, we do not bother to preserve those > (even with -fnon-call-exceptions). > > Thus, the patch is ok with me if you agree with that and with the following > adjustment: > > + /* Self-assignments are zombies. */ > + if (gimple_assign_rhs_code (stmt) == TREE_CODE (gimple_assign_lhs (stmt)) > + && operand_equal_p (gimple_assign_rhs1 (stmt), > + gimple_assign_lhs (stmt), 0)) > > I see no need to compare the codes of the LHS and the RHS, that's redundant > with what operand_equal_p does. I was trying to make sure first that it was a pure assignment and thus that rhs1 was the only thing to look at, not for instance *a=-*a, but I guess in gimple that would be 3 statements. I'll retest without it. > Thus, ok with removing that test if it bootstraps / regtests ok and if Jeff > has no further comments. >>> I'd be curious how often this triggers in GCC itself as well. Essentially never. I tried with the fold_stmt version of the patch, and libstdc++-v3/src/c++98/concept-inst.cc is the only file where it triggers. Note that the case: b=*a *a=b is already handled by FRE which removes *a=b (copyprop later removes the dead b=*a and release_ssa removes the unused variable b), it is only *a=*a that wasn't handled. Now that I look at it, it is a bit surprising that: struct A {int i;}; void f(A*a,A*b){ A c=*b; *a=c; } void g(A*a,A*b){ *a=*b; } gives 2 different .optimized gimple: c$i_5 = MEM[(const struct A &)b_2(D)]; MEM[(struct A *)a_3(D)] = c$i_5; and: *a_2(D) = MEM[(const struct A &)b_3(D)]; Aren't they equivalent? And if so, which form should be preferred?
On Wed, Jun 12, 2013 at 10:47 AM, Marc Glisse <marc.glisse@inria.fr> wrote: > On Wed, 12 Jun 2013, Richard Biener wrote: > >> On Tue, Jun 11, 2013 at 9:30 PM, Marc Glisse <marc.glisse@inria.fr> wrote: >>> >>> On Tue, 11 Jun 2013, Jeff Law wrote: >>> >>>> On 06/09/13 10:25, Marc Glisse wrote: >>>>> >>>>> >>>>> Hello, >>>>> >>>>> this patch removes some self-assignments. I don't know if this is the >>>>> best way, but it passes a bootstrap and the testsuite on >>>>> x86_64-linux-gnu. >>>>> >>>>> 2013-06-10 Marc Glisse <marc.glisse@inria.fr> >>>>> >>>>> PR tree-optimization/57361 >>>>> gcc/ >>>>> * tree-ssa-dse.c (dse_possible_dead_store_p): Handle >>>>> self-assignment. >>>>> >>>>> gcc/testsuite/ >>>>> * gcc.dg/tree-ssa/pr57361.c: New file. >>>> >>>> >>>> So dse_optimize_stmt will verify the statement does not have volatile >>>> operands. >>> >>> >>> >>> operand_equal_p also does it, so we are well covered there ;-) >>> >>> >>>> However, it doesn't verify the statement does not potentially throw >>>> (think >>>> about a segfault on the store when async exceptions are enabled). I >>>> think >>>> you need to test for that explicitly. >>> >>> >>> >>> Hmm, I am not at all familiar with that. Google drowns me in C# and >>> javascript links, and grepping through the sources only pointed me to the >>> -fasynchronous-unwind-tables flag. >>> >>> Richard noticed in the PR that expand_assignment already does: >>> >>> /* Optimize away no-op moves without side-effects. */ >>> if (operand_equal_p (to, from, 0)) >>> return; >>> >>> so it looks like the operand_equal_p test should be sufficient (or the >>> compiler already breaks that code). >>> >>> >>>> Is there some reason this won't work in tree-ssa-dce.c? That gets run >>>> more often and these stores may be preventing code motion opportunities, >>>> so >>>> getting them out of the IL stream as early as possible would be good. >>> >>> >>> >>> In the first version of the patch: >>> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57303#c6 >>> I was doing it in fold_stmt_1, which should be called often enough. >>> Richard >>> suggested DSE in the PR, but that might be because he had a more subtle >>> test >>> in mind. I am certainly ok with moving it to DCE or anywhere else... >> >> >> DSE looks like the right place to me as we are removing a store. Yes, >> DCE removes a limited set of stores as well, but the way we remove this >> kind >> of store makes it much more suited to DSE. >> >> As of possibly trapping/throwing stores, we do not bother to preserve >> those >> (even with -fnon-call-exceptions). >> >> Thus, the patch is ok with me if you agree with that and with the >> following >> adjustment: >> >> + /* Self-assignments are zombies. */ >> + if (gimple_assign_rhs_code (stmt) == TREE_CODE (gimple_assign_lhs >> (stmt)) >> + && operand_equal_p (gimple_assign_rhs1 (stmt), >> + gimple_assign_lhs (stmt), 0)) >> >> I see no need to compare the codes of the LHS and the RHS, that's >> redundant >> with what operand_equal_p does. > > > I was trying to make sure first that it was a pure assignment and thus that > rhs1 was the only thing to look at, not for instance *a=-*a, but I guess in > gimple that would be 3 statements. I'll retest without it. > > >> Thus, ok with removing that test if it bootstraps / regtests ok and if >> Jeff >> has no further comments. > > > >>>> I'd be curious how often this triggers in GCC itself as well. > > > Essentially never. I tried with the fold_stmt version of the patch, and > libstdc++-v3/src/c++98/concept-inst.cc is the only file where it triggers. > Note that the case: > b=*a > *a=b > is already handled by FRE which removes *a=b (copyprop later removes the > dead b=*a and release_ssa removes the unused variable b), it is only *a=*a > that wasn't handled. Now that I look at it, it is a bit surprising that: > > struct A {int i;}; > void f(A*a,A*b){ A c=*b; *a=c; } > void g(A*a,A*b){ *a=*b; } > > gives 2 different .optimized gimple: > > c$i_5 = MEM[(const struct A &)b_2(D)]; > MEM[(struct A *)a_3(D)] = c$i_5; > > and: > > *a_2(D) = MEM[(const struct A &)b_3(D)]; > > Aren't they equivalent? And if so, which form should be preferred? Well, the first is optimized by SRA to copy element-wise and thus the loads/stores have is_gimple_reg_type () which require separate loads/stores. The second is an aggregate copy where we cannot generate SSA temporaries for the result of the load (!is_gimple_reg_type ()) and thus we are required to have a single statement. One of my pending GIMPLE re-org tasks is to always separate loads and stores and allow SSA names of aggregate type, thus we'd have tem_1 = MEM[(const struct A &)b_3(D)]; *a_2(D) = tem_1; even for the 2nd case. That solves the fact that we are missing an aggregate copy propagation pass quite nicely. Yes, you have to watch for not creating (too many) overlapping life-ranges as out-of-SSA won't be able to assign the temporary aggregate SSA names to registers but possibly has to allocate costly stack space for them. Setting SSA_NAME_OCCURS_IN_ABNORMAL_PHI on them solves this issue in a hacky way. On my list since about 5 years ... ;) Richard. > -- > Marc Glisse
On 06/12/13 02:03, Richard Biener wrote: > > DSE looks like the right place to me as we are removing a store. Yes, > DCE removes a limited set of stores as well, but the way we remove this kind > of store makes it much more suited to DSE. > > As of possibly trapping/throwing stores, we do not bother to preserve those > (even with -fnon-call-exceptions). A bit of a surprise to hear that. I don't mind much though, as -fnon-call-exceptions isn't something I find terribly useful. Consider my "objections" withdrawn. Jeff
On 06/11/13 13:30, Marc Glisse wrote: > > >> I'd be curious how often this triggers in GCC itself as well. > > Do you know a convenient way to test that? I usually put in some kind of debugging printfs during early development which I can then grep for in build logs. Not very sexy, but effective to see if a particular transformation is triggering outside contrived testcodes. jeff
On Wed, Jun 12, 2013 at 09:42:55AM -0600, Jeff Law wrote: > On 06/12/13 02:03, Richard Biener wrote: > >DSE looks like the right place to me as we are removing a store. Yes, > >DCE removes a limited set of stores as well, but the way we remove this kind > >of store makes it much more suited to DSE. > > > >As of possibly trapping/throwing stores, we do not bother to preserve those > >(even with -fnon-call-exceptions). > A bit of a surprise to hear that. I don't mind much though, as > -fnon-call-exceptions isn't something I find terribly useful. Well, a segfault accessing invalid pointer is undefined behavior, so it is fine to optimize it away. Jakub
On Wed, 12 Jun 2013, Richard Biener wrote: > On Wed, Jun 12, 2013 at 10:47 AM, Marc Glisse <marc.glisse@inria.fr> wrote: > >> Essentially never. I tried with the fold_stmt version of the patch, and >> libstdc++-v3/src/c++98/concept-inst.cc is the only file where it triggers. >> Note that the case: >> b=*a >> *a=b >> is already handled by FRE which removes *a=b (copyprop later removes the >> dead b=*a and release_ssa removes the unused variable b), it is only *a=*a >> that wasn't handled. Now that I look at it, it is a bit surprising that: >> >> struct A {int i;}; >> void f(A*a,A*b){ A c=*b; *a=c; } >> void g(A*a,A*b){ *a=*b; } >> >> gives 2 different .optimized gimple: >> >> c$i_5 = MEM[(const struct A &)b_2(D)]; >> MEM[(struct A *)a_3(D)] = c$i_5; >> >> and: >> >> *a_2(D) = MEM[(const struct A &)b_3(D)]; >> >> Aren't they equivalent? And if so, which form should be preferred? > > Well, the first is optimized by SRA to copy element-wise and thus the > loads/stores have is_gimple_reg_type () which require separate loads/stores. > The second is an aggregate copy where we cannot generate SSA temporaries > for the result of the load (!is_gimple_reg_type ()) and thus we are required > to have a single statement. > > One of my pending GIMPLE re-org tasks is to always separate loads and > stores and allow SSA names of aggregate type, thus we'd have > > tem_1 = MEM[(const struct A &)b_3(D)]; > *a_2(D) = tem_1; > > even for the 2nd case. That solves the fact that we are missing an > aggregate copy propagation pass quite nicely. > > Yes, you have to watch for not creating (too many) overlapping life-ranges > as out-of-SSA won't be able to assign the temporary aggregate SSA names > to registers but possibly has to allocate costly stack space for them. > > Setting SSA_NAME_OCCURS_IN_ABNORMAL_PHI on them solves this > issue in a hacky way. > > On my list since about 5 years ... ;) I was going to ask if I should wait for it (it makes my patch unnecessary), but I guess that answers it. Since Jeff seems ok, I committed it at r200034.
Index: testsuite/gcc.dg/tree-ssa/pr57361.c =================================================================== --- testsuite/gcc.dg/tree-ssa/pr57361.c (revision 0) +++ testsuite/gcc.dg/tree-ssa/pr57361.c (revision 0) @@ -0,0 +1,9 @@ +/* { dg-do compile } */ +/* { dg-options "-O -fdump-tree-dse1-details" } */ + +struct A { int x; double y; }; +void f (struct A *a) { + *a = *a; +} + +/* { dg-final { scan-tree-dump "Deleted dead store" "dse1"} } */ Property changes on: testsuite/gcc.dg/tree-ssa/pr57361.c ___________________________________________________________________ Added: svn:keywords + Author Date Id Revision URL Added: svn:eol-style + native Index: tree-ssa-dse.c =================================================================== --- tree-ssa-dse.c (revision 199867) +++ tree-ssa-dse.c (working copy) @@ -77,20 +77,29 @@ static void dse_enter_block (struct dom_ Return TRUE if the above conditions are met, otherwise FALSE. */ static bool dse_possible_dead_store_p (gimple stmt, gimple *use_stmt) { gimple temp; unsigned cnt = 0; *use_stmt = NULL; + /* Self-assignments are zombies. */ + if (gimple_assign_rhs_code (stmt) == TREE_CODE (gimple_assign_lhs (stmt)) + && operand_equal_p (gimple_assign_rhs1 (stmt), + gimple_assign_lhs (stmt), 0)) + { + *use_stmt = stmt; + return true; + } + /* Find the first dominated statement that clobbers (part of) the memory stmt stores to with no intermediate statement that may use part of the memory stmt stores. That is, find a store that may prove stmt to be a dead store. */ temp = stmt; do { gimple use_stmt, defvar_def; imm_use_iterator ui; bool fail = false;