Message ID | 4F987310.5000603@mentor.com |
---|---|
State | New |
Headers | show |
On Wed, Apr 25, 2012 at 11:56 PM, Tom de Vries <Tom_deVries@mentor.com> wrote: > On 25/04/12 11:57, Richard Guenther wrote: > <SNIP> >>>> >> Hmm. I'm not sure we can conclude that they have the same value! >>>> >> >>>> >> +int bar (int); >>>> >> +void baz (int); >>>> >> +void bla (int); >>>> >> + >>>> >> +void >>>> >> +foo (int y) >>>> >> +{ >>>> >> + int a; >>>> >> + if (y == 6) >>>> >> + { >>>> >> + bla (5); >>>> >> + a = bar (7); >>>> >> + } >>>> >> + else >>>> >> + { >>>> >> + bla (5); >>>> >> + a = bar (7); >>>> >> + } >>>> >> + baz (a); >>>> >> +} >>>> >> >>>> >> I can implement bla as >>>> >> >>>> >> void bla(int) { a = random (); } >>>> >> >>>> >> thus a would not have the same value on both paths >>> > >>> > I think it's hard to decide whether they have the same value or not, since we >>> > cannot write a test-case that compares the results of calls from 2 branches of >>> > an if statement. >> void *foo () >> { >> return __builtin_return_address (0); >> } >> >> void *bar (_Bool b) >> { >> if (b) >> return foo (); >> else >> return foo (); >> } >> >> int main() >> { >> if (bar(true) == bar(false)) >> abort (); >> } >> >> ok ... outside of the scope of standard "C", but we certainly _can_ do this. >> Which would question tail-merging the above at all, of course. >> > > I added noinline to make sure there's no variation from inlining. > ... > extern void abort (void) __attribute__ ((__nothrow__)) __attribute__ > ((__noreturn__)); > > __attribute__((noinline)) > void *foo () > { > return __builtin_return_address (0); > } > > __attribute__((noinline)) > void *bar (int b) > { > if (b) > return foo (); > else > return foo (); > } > > int main() > { > void *a, *b; > a = bar (1); > b = bar (0); > if (a == b) > abort (); > return 0; > } > ... > > This test-case passes with: > ... > gcc -O2 calls.c -fno-optimize-sibling-calls -fno-tree-tail-merge -fno-crossjumping > ... > > and fails with: > ... > gcc -O2 calls.c -fno-optimize-sibling-calls -fno-tree-tail-merge -fcrossjumping > ... > > with the proposed patch, it also fails with: > ... > gcc -O2 calls.c -fno-optimize-sibling-calls -ftree-tail-merge -fno-crossjumping > ... > > Tree-tail-merge performs the same transformation as cross-jumping for this > test-case. I think that the transformation is valid, and that the test-case > behaves as expected. Subsequent calls to foo may or may not return the same > value, depending on the optimizations, we can't assert a != b, that's just the > nature of __builtin_return_address. > > Btw, this example is not what I meant when I said 'a test-case that compares the > results of calls from 2 branches of an if statement'. I tried to say that the > semantics of C is defined in terms of taking either 1 branch or the other, > rather than executing both in parallel, after which both results would be > available. From that point of view, this example compares subsequent calls to > foo, not parallel. Ah, I see. Btw, I was not suggesting that we should not optimize the above. >>> > I started looking at the problem in terms of subsequent calls. Consider the >>> > imaginary attribute nosideeffect, similar to const and pure. >>> > >>> > const: >>> > - no effects except the return value >>> > - the return value depends only on the parameters >>> > >>> > pure: >>> > - no effects except the return value >>> > - the return value depends only on the parameters and/or global variables >>> > >>> > nosideeffect: >>> > - no effects except the return value >>> > >>> > The code fragment: >>> > ... >>> > extern int f (int) __attribute ((nosideeffect)); >>> > int g() >>> > { >>> > int a; >>> > int b; >>> > a = f (5); >>> > b = f (5); >>> > return a + b; >>> > } >>> > ... >>> > >>> > would result in a gimple sequence more or less like this: >>> > ... >>> > # VUSE <.MEMD.1712_4(D)> >>> > aD.1707_1 = fD.1704 (5); >>> > # VUSE <.MEMD.1712_4(D)> >>> > bD.1708_2 = fD.1704 (5); >>> > D.1710_3 = aD.1707_1 + bD.1708_2; >>> > # VUSE <.MEMD.1712_6> >>> > return D.1710_3; >>> > ... >>> > >>> > The value numbering modification I'm proposing would say that a and b have the >>> > same value, which is not true. I updated the patch to ensure in this case that a >>> > and b would be seen as different. >>> > Note that things won't break if the attribute is missing on a function, since >>> > that just means that the function would have a vdef, and there would be no 2 >>> > subsequent function calls with the same vuse. >>> > >>>> >> - but that is not what >>>> >> matters - because obviously we can still merge the blocks. >>> > >>> > Right. >>> > >>>> >> That is, >>>> >> we cannot be sure that the side-effects on memory of bla (5) and bla (5) >>>> >> are the same. But is that not what your value-numbering changes will assert? >>> > >>> > In the case of calls in different branches of an control flow statement, we can >>> > aggressively assume they're the same, since we cannot write a check that would >>> > start failing. >> Apart from the above ... which shows that even the returned values can matter. >> >>> > In the case of subsequent calls with side effects, the vuses will never be the same. >>> > >>> > In the case of subsequent calls without side effects, but not pure or const, we >>> > can't assume they have the same result. AFAIK, there's currently no way to >>> > specify such a function, but the updated patch should be able handle this. >>> > >>>> >> (not sure how you achieve this, but it looks like you insert/lookup general >>>> >> calls, thus not pure/const ones >>> > >>> > Correct. >>> > >>>> >> - that could easily lead to miscompiles(?)). >>>> >> >> And to increased compile-time (not sure if it matters). >> >>> > I have bootstrapped and reg-tested the updated (and the previous) patch on >>> > x86_64 (ada inclusive), and found no issues. >>> > >>> > Can you think of a test-case that fails with the previous or the updated patch? >> I see you do not handle >> > > I assume you mean 'not handle' in the sense of 'handle conservatively'. Yes. >> struct S { int i; }; >> struct S foo (void); >> struct S bar (void) >> { >> struct S s1, s2; >> if (...) >> s = foo (); >> else >> s = foo (); >> >> because the calls have a LHS that is not an SSA name. > > Indeed, the gvn patch handles this example conservatively, and tree-tail-merge > fails to optimize this test-case: > ... > struct S { int i; }; > extern struct S foo (void); > extern int foo2 (void); > struct S s; > int bar (int c) { > int r; > if (c) > { > s = foo (); > r = foo2 (); > } > else > { > s = foo (); > r = foo2 (); > } > return r; > } > ... > > A todo. > >> In general >> value-numbering virtual operands will change what further consumers lookup. >> So changing >> >> # MEM_2 = VUSE <MEM_3> >> foo (); >> # VUSE <MEM_2> >> ... = x; >> >> by value-numbering MEM_2 to MEM_3 will make the lookup of ... = x use >> MEM_3, not considering a clobber by foo. > > AFAIU, the patch doesn't do this type of value numbering. > >> This might not actually be an >> issue (after all we do this for stores, too). >> >> I see you do not handle >> >> int bar (int); >> void baz (int); >> void bla (int); >> int s; >> void >> foo (int y) >> { >> int a; >> if (y == 6) >> { >> s = 1; >> bla (5); >> a = bar (7); >> } >> else >> { >> s = 1; >> bla (5); >> a = bar (7); >> } >> baz (a); >> } >> > > I think a more basic example is this one from PR52009, which is basically about > stores: > ... > int z; > > void > foo (int y) > { > int a; > if (y == 6) > z = 5; > else > z = 5; > } > ... > > I proposed a patch for that separately: > http://gcc.gnu.org/ml/gcc-patches/2012-01/msg01731.html. Mmmm, looking at the > patch now, I suppose I could use part of that patch as infrastructure for the > todo above. Yes, I think so. >> and I don't see how you need the new ref->vdef member > > I'm regarding the result of the ref as a compound <vdef, result>. Most of the > time I could (as I did in an earlier version of the patch) deduce vdef as > gimple_vdef (SSA_NAME_DEF_STMT (result)), but not always, in particular when > result is NULL_TREE. > >> - that is also >> not handled consistently (not in hashing or comparing, etc.). > > I handle the vdef as a result. Ok. Can you please rename ->vdef to ->result_vdef then? >> Can't you >> always use SSA_VAL (vuse)? > > I'm interested in storing the vdef, not the vuse. I meant that result_vdef will always be SSA_VAL (vuse) (the vdef of the stmt defining vuse). No? >> >> Btw, what's >> >> + if (vdef) >> + VN_INFO (vdef)->use_processed = true; >> >> for? We arrive here by visiting the VDEF after all and visit_use does already >> set the use processed. Is it that we end up visiting the call twice because >> of the lhs SSA name definition and the virtual operand definition? > > Yes. > >> If so then >> visit_use should instead do >> >> FOR_EACH_SSA_DEF_OPERAND (...) >> VN_INFO (def)->use_processed = true; >> >> and defs_to_varying then no longer needs to do that. >> > > factored mark_use_processed out of defs_to_varying, calling mark_use_processed > at start of visit_use. Thanks. Which case hits SSA_NAME_DEF_STMT == NULL btw? The default-def for the virtual operand? >>> > Otherwise, ok for trunk? >> Let's give this one more iteration, but I guess the basic idea should work. >> > > bootstrapped and reg-tested on x86_64 (ada inclusive). > > Is this patch ok, or is the todo required? No, you can followup with that. Ok with s/vdef/result_vdef/ Thanks, Richard. > Thanks, > - Tom > > 2012-04-25 Tom de Vries <tom@codesourcery.com> > > PR tree-optimization/51879 > * tree-ssa-sccvn.h (struct vn_reference_s): Add vdef field. > * tree-ssa-sccvn.c (mark_use_processed): New function, factored out > of ... > (defs_to_varying): ... here. > (visit_reference_op_call): Handle gimple_vdef. > Handle case that lhs is NULL_TREE. > (visit_use): Use mark_use_processed. Handle calls with side-effect > using visit_reference_op_call. > > gcc.dg/pr51879.c: New test. > gcc.dg/pr51879-2.c: Same. > gcc.dg/pr51879-3.c: Same. > gcc.dg/pr51879-4.c: Same. > gcc.dg/pr51879-6.c: Same.
On 26/04/12 12:20, Richard Guenther wrote: > On Wed, Apr 25, 2012 at 11:56 PM, Tom de Vries <Tom_deVries@mentor.com> wrote: >> On 25/04/12 11:57, Richard Guenther wrote: >> <SNIP> >>>>>>> Hmm. I'm not sure we can conclude that they have the same value! >>>>>>> >>>>>>> +int bar (int); >>>>>>> +void baz (int); >>>>>>> +void bla (int); >>>>>>> + >>>>>>> +void >>>>>>> +foo (int y) >>>>>>> +{ >>>>>>> + int a; >>>>>>> + if (y == 6) >>>>>>> + { >>>>>>> + bla (5); >>>>>>> + a = bar (7); >>>>>>> + } >>>>>>> + else >>>>>>> + { >>>>>>> + bla (5); >>>>>>> + a = bar (7); >>>>>>> + } >>>>>>> + baz (a); >>>>>>> +} >>>>>>> >>>>>>> I can implement bla as >>>>>>> >>>>>>> void bla(int) { a = random (); } >>>>>>> >>>>>>> thus a would not have the same value on both paths >>>>> >>>>> I think it's hard to decide whether they have the same value or not, since we >>>>> cannot write a test-case that compares the results of calls from 2 branches of >>>>> an if statement. >>> void *foo () >>> { >>> return __builtin_return_address (0); >>> } >>> >>> void *bar (_Bool b) >>> { >>> if (b) >>> return foo (); >>> else >>> return foo (); >>> } >>> >>> int main() >>> { >>> if (bar(true) == bar(false)) >>> abort (); >>> } >>> >>> ok ... outside of the scope of standard "C", but we certainly _can_ do this. >>> Which would question tail-merging the above at all, of course. >>> >> >> I added noinline to make sure there's no variation from inlining. >> ... >> extern void abort (void) __attribute__ ((__nothrow__)) __attribute__ >> ((__noreturn__)); >> >> __attribute__((noinline)) >> void *foo () >> { >> return __builtin_return_address (0); >> } >> >> __attribute__((noinline)) >> void *bar (int b) >> { >> if (b) >> return foo (); >> else >> return foo (); >> } >> >> int main() >> { >> void *a, *b; >> a = bar (1); >> b = bar (0); >> if (a == b) >> abort (); >> return 0; >> } >> ... >> >> This test-case passes with: >> ... >> gcc -O2 calls.c -fno-optimize-sibling-calls -fno-tree-tail-merge -fno-crossjumping >> ... >> >> and fails with: >> ... >> gcc -O2 calls.c -fno-optimize-sibling-calls -fno-tree-tail-merge -fcrossjumping >> ... >> >> with the proposed patch, it also fails with: >> ... >> gcc -O2 calls.c -fno-optimize-sibling-calls -ftree-tail-merge -fno-crossjumping >> ... >> >> Tree-tail-merge performs the same transformation as cross-jumping for this >> test-case. I think that the transformation is valid, and that the test-case >> behaves as expected. Subsequent calls to foo may or may not return the same >> value, depending on the optimizations, we can't assert a != b, that's just the >> nature of __builtin_return_address. >> >> Btw, this example is not what I meant when I said 'a test-case that compares the >> results of calls from 2 branches of an if statement'. I tried to say that the >> semantics of C is defined in terms of taking either 1 branch or the other, >> rather than executing both in parallel, after which both results would be >> available. From that point of view, this example compares subsequent calls to >> foo, not parallel. > > Ah, I see. Btw, I was not suggesting that we should not optimize the above. > >>>>> I started looking at the problem in terms of subsequent calls. Consider the >>>>> imaginary attribute nosideeffect, similar to const and pure. >>>>> >>>>> const: >>>>> - no effects except the return value >>>>> - the return value depends only on the parameters >>>>> >>>>> pure: >>>>> - no effects except the return value >>>>> - the return value depends only on the parameters and/or global variables >>>>> >>>>> nosideeffect: >>>>> - no effects except the return value >>>>> >>>>> The code fragment: >>>>> ... >>>>> extern int f (int) __attribute ((nosideeffect)); >>>>> int g() >>>>> { >>>>> int a; >>>>> int b; >>>>> a = f (5); >>>>> b = f (5); >>>>> return a + b; >>>>> } >>>>> ... >>>>> >>>>> would result in a gimple sequence more or less like this: >>>>> ... >>>>> # VUSE <.MEMD.1712_4(D)> >>>>> aD.1707_1 = fD.1704 (5); >>>>> # VUSE <.MEMD.1712_4(D)> >>>>> bD.1708_2 = fD.1704 (5); >>>>> D.1710_3 = aD.1707_1 + bD.1708_2; >>>>> # VUSE <.MEMD.1712_6> >>>>> return D.1710_3; >>>>> ... >>>>> >>>>> The value numbering modification I'm proposing would say that a and b have the >>>>> same value, which is not true. I updated the patch to ensure in this case that a >>>>> and b would be seen as different. >>>>> Note that things won't break if the attribute is missing on a function, since >>>>> that just means that the function would have a vdef, and there would be no 2 >>>>> subsequent function calls with the same vuse. >>>>> >>>>>>> - but that is not what >>>>>>> matters - because obviously we can still merge the blocks. >>>>> >>>>> Right. >>>>> >>>>>>> That is, >>>>>>> we cannot be sure that the side-effects on memory of bla (5) and bla (5) >>>>>>> are the same. But is that not what your value-numbering changes will assert? >>>>> >>>>> In the case of calls in different branches of an control flow statement, we can >>>>> aggressively assume they're the same, since we cannot write a check that would >>>>> start failing. >>> Apart from the above ... which shows that even the returned values can matter. >>> >>>>> In the case of subsequent calls with side effects, the vuses will never be the same. >>>>> >>>>> In the case of subsequent calls without side effects, but not pure or const, we >>>>> can't assume they have the same result. AFAIK, there's currently no way to >>>>> specify such a function, but the updated patch should be able handle this. >>>>> >>>>>>> (not sure how you achieve this, but it looks like you insert/lookup general >>>>>>> calls, thus not pure/const ones >>>>> >>>>> Correct. >>>>> >>>>>>> - that could easily lead to miscompiles(?)). >>>>>>> >>> And to increased compile-time (not sure if it matters). >>> >>>>> I have bootstrapped and reg-tested the updated (and the previous) patch on >>>>> x86_64 (ada inclusive), and found no issues. >>>>> >>>>> Can you think of a test-case that fails with the previous or the updated patch? >>> I see you do not handle >>> >> >> I assume you mean 'not handle' in the sense of 'handle conservatively'. > > Yes. > >>> struct S { int i; }; >>> struct S foo (void); >>> struct S bar (void) >>> { >>> struct S s1, s2; >>> if (...) >>> s = foo (); >>> else >>> s = foo (); >>> >>> because the calls have a LHS that is not an SSA name. >> >> Indeed, the gvn patch handles this example conservatively, and tree-tail-merge >> fails to optimize this test-case: >> ... >> struct S { int i; }; >> extern struct S foo (void); >> extern int foo2 (void); >> struct S s; >> int bar (int c) { >> int r; >> if (c) >> { >> s = foo (); >> r = foo2 (); >> } >> else >> { >> s = foo (); >> r = foo2 (); >> } >> return r; >> } >> ... >> >> A todo. >> >>> In general >>> value-numbering virtual operands will change what further consumers lookup. >>> So changing >>> >>> # MEM_2 = VUSE <MEM_3> >>> foo (); >>> # VUSE <MEM_2> >>> ... = x; >>> >>> by value-numbering MEM_2 to MEM_3 will make the lookup of ... = x use >>> MEM_3, not considering a clobber by foo. >> >> AFAIU, the patch doesn't do this type of value numbering. >> >>> This might not actually be an >>> issue (after all we do this for stores, too). >>> >>> I see you do not handle >>> >>> int bar (int); >>> void baz (int); >>> void bla (int); >>> int s; >>> void >>> foo (int y) >>> { >>> int a; >>> if (y == 6) >>> { >>> s = 1; >>> bla (5); >>> a = bar (7); >>> } >>> else >>> { >>> s = 1; >>> bla (5); >>> a = bar (7); >>> } >>> baz (a); >>> } >>> >> >> I think a more basic example is this one from PR52009, which is basically about >> stores: >> ... >> int z; >> >> void >> foo (int y) >> { >> int a; >> if (y == 6) >> z = 5; >> else >> z = 5; >> } >> ... >> >> I proposed a patch for that separately: >> http://gcc.gnu.org/ml/gcc-patches/2012-01/msg01731.html. Mmmm, looking at the >> patch now, I suppose I could use part of that patch as infrastructure for the >> todo above. > > Yes, I think so. > >>> and I don't see how you need the new ref->vdef member >> >> I'm regarding the result of the ref as a compound <vdef, result>. Most of the >> time I could (as I did in an earlier version of the patch) deduce vdef as >> gimple_vdef (SSA_NAME_DEF_STMT (result)), but not always, in particular when >> result is NULL_TREE. >> >>> - that is also >>> not handled consistently (not in hashing or comparing, etc.). >> >> I handle the vdef as a result. > > Ok. Can you please rename ->vdef to ->result_vdef then? > Done. >>> Can't you >>> always use SSA_VAL (vuse)? >> >> I'm interested in storing the vdef, not the vuse. > > I meant that result_vdef will always be SSA_VAL (vuse) (the vdef of > the stmt defining vuse). No? > Right. >>> >>> Btw, what's >>> >>> + if (vdef) >>> + VN_INFO (vdef)->use_processed = true; >>> >>> for? We arrive here by visiting the VDEF after all and visit_use does already >>> set the use processed. Is it that we end up visiting the call twice because >>> of the lhs SSA name definition and the virtual operand definition? >> >> Yes. >> >>> If so then >>> visit_use should instead do >>> >>> FOR_EACH_SSA_DEF_OPERAND (...) >>> VN_INFO (def)->use_processed = true; >>> >>> and defs_to_varying then no longer needs to do that. >>> >> >> factored mark_use_processed out of defs_to_varying, calling mark_use_processed >> at start of visit_use. > > Thanks. Which case hits SSA_NAME_DEF_STMT == NULL btw? The > default-def for the virtual operand? > SSA_NAME_DEF_STMT == NULL is tested by !stmt in mark_use_processed. I suppose I better use SSA_NAME_IS_DEFAULT_DEF, as is done in visit_use. I'm still somewhat confused as to what is the difference. I suppose we might have a nop statement that is a default def, while it's unequal to NULL. Updated in committed patch. >>>>> Otherwise, ok for trunk? >>> Let's give this one more iteration, but I guess the basic idea should work. >>> >> >> bootstrapped and reg-tested on x86_64 (ada inclusive). >> >> Is this patch ok, or is the todo required? > > No, you can followup with that. > > Ok with s/vdef/result_vdef/ > retested and committed with this change and SSA_NAME_IS_DEFAULT_DEF instead of '!stmt'. Thanks. - Tom
On Fri, Apr 27, 2012 at 8:20 AM, Tom de Vries <Tom_deVries@mentor.com> wrote: > On 26/04/12 12:20, Richard Guenther wrote: >> On Wed, Apr 25, 2012 at 11:56 PM, Tom de Vries <Tom_deVries@mentor.com> wrote: >>> On 25/04/12 11:57, Richard Guenther wrote: >>> <SNIP> >>>>>>>> Hmm. I'm not sure we can conclude that they have the same value! >>>>>>>> >>>>>>>> +int bar (int); >>>>>>>> +void baz (int); >>>>>>>> +void bla (int); >>>>>>>> + >>>>>>>> +void >>>>>>>> +foo (int y) >>>>>>>> +{ >>>>>>>> + int a; >>>>>>>> + if (y == 6) >>>>>>>> + { >>>>>>>> + bla (5); >>>>>>>> + a = bar (7); >>>>>>>> + } >>>>>>>> + else >>>>>>>> + { >>>>>>>> + bla (5); >>>>>>>> + a = bar (7); >>>>>>>> + } >>>>>>>> + baz (a); >>>>>>>> +} >>>>>>>> >>>>>>>> I can implement bla as >>>>>>>> >>>>>>>> void bla(int) { a = random (); } >>>>>>>> >>>>>>>> thus a would not have the same value on both paths >>>>>> >>>>>> I think it's hard to decide whether they have the same value or not, since we >>>>>> cannot write a test-case that compares the results of calls from 2 branches of >>>>>> an if statement. >>>> void *foo () >>>> { >>>> return __builtin_return_address (0); >>>> } >>>> >>>> void *bar (_Bool b) >>>> { >>>> if (b) >>>> return foo (); >>>> else >>>> return foo (); >>>> } >>>> >>>> int main() >>>> { >>>> if (bar(true) == bar(false)) >>>> abort (); >>>> } >>>> >>>> ok ... outside of the scope of standard "C", but we certainly _can_ do this. >>>> Which would question tail-merging the above at all, of course. >>>> >>> >>> I added noinline to make sure there's no variation from inlining. >>> ... >>> extern void abort (void) __attribute__ ((__nothrow__)) __attribute__ >>> ((__noreturn__)); >>> >>> __attribute__((noinline)) >>> void *foo () >>> { >>> return __builtin_return_address (0); >>> } >>> >>> __attribute__((noinline)) >>> void *bar (int b) >>> { >>> if (b) >>> return foo (); >>> else >>> return foo (); >>> } >>> >>> int main() >>> { >>> void *a, *b; >>> a = bar (1); >>> b = bar (0); >>> if (a == b) >>> abort (); >>> return 0; >>> } >>> ... >>> >>> This test-case passes with: >>> ... >>> gcc -O2 calls.c -fno-optimize-sibling-calls -fno-tree-tail-merge -fno-crossjumping >>> ... >>> >>> and fails with: >>> ... >>> gcc -O2 calls.c -fno-optimize-sibling-calls -fno-tree-tail-merge -fcrossjumping >>> ... >>> >>> with the proposed patch, it also fails with: >>> ... >>> gcc -O2 calls.c -fno-optimize-sibling-calls -ftree-tail-merge -fno-crossjumping >>> ... >>> >>> Tree-tail-merge performs the same transformation as cross-jumping for this >>> test-case. I think that the transformation is valid, and that the test-case >>> behaves as expected. Subsequent calls to foo may or may not return the same >>> value, depending on the optimizations, we can't assert a != b, that's just the >>> nature of __builtin_return_address. >>> >>> Btw, this example is not what I meant when I said 'a test-case that compares the >>> results of calls from 2 branches of an if statement'. I tried to say that the >>> semantics of C is defined in terms of taking either 1 branch or the other, >>> rather than executing both in parallel, after which both results would be >>> available. From that point of view, this example compares subsequent calls to >>> foo, not parallel. >> >> Ah, I see. Btw, I was not suggesting that we should not optimize the above. >> >>>>>> I started looking at the problem in terms of subsequent calls. Consider the >>>>>> imaginary attribute nosideeffect, similar to const and pure. >>>>>> >>>>>> const: >>>>>> - no effects except the return value >>>>>> - the return value depends only on the parameters >>>>>> >>>>>> pure: >>>>>> - no effects except the return value >>>>>> - the return value depends only on the parameters and/or global variables >>>>>> >>>>>> nosideeffect: >>>>>> - no effects except the return value >>>>>> >>>>>> The code fragment: >>>>>> ... >>>>>> extern int f (int) __attribute ((nosideeffect)); >>>>>> int g() >>>>>> { >>>>>> int a; >>>>>> int b; >>>>>> a = f (5); >>>>>> b = f (5); >>>>>> return a + b; >>>>>> } >>>>>> ... >>>>>> >>>>>> would result in a gimple sequence more or less like this: >>>>>> ... >>>>>> # VUSE <.MEMD.1712_4(D)> >>>>>> aD.1707_1 = fD.1704 (5); >>>>>> # VUSE <.MEMD.1712_4(D)> >>>>>> bD.1708_2 = fD.1704 (5); >>>>>> D.1710_3 = aD.1707_1 + bD.1708_2; >>>>>> # VUSE <.MEMD.1712_6> >>>>>> return D.1710_3; >>>>>> ... >>>>>> >>>>>> The value numbering modification I'm proposing would say that a and b have the >>>>>> same value, which is not true. I updated the patch to ensure in this case that a >>>>>> and b would be seen as different. >>>>>> Note that things won't break if the attribute is missing on a function, since >>>>>> that just means that the function would have a vdef, and there would be no 2 >>>>>> subsequent function calls with the same vuse. >>>>>> >>>>>>>> - but that is not what >>>>>>>> matters - because obviously we can still merge the blocks. >>>>>> >>>>>> Right. >>>>>> >>>>>>>> That is, >>>>>>>> we cannot be sure that the side-effects on memory of bla (5) and bla (5) >>>>>>>> are the same. But is that not what your value-numbering changes will assert? >>>>>> >>>>>> In the case of calls in different branches of an control flow statement, we can >>>>>> aggressively assume they're the same, since we cannot write a check that would >>>>>> start failing. >>>> Apart from the above ... which shows that even the returned values can matter. >>>> >>>>>> In the case of subsequent calls with side effects, the vuses will never be the same. >>>>>> >>>>>> In the case of subsequent calls without side effects, but not pure or const, we >>>>>> can't assume they have the same result. AFAIK, there's currently no way to >>>>>> specify such a function, but the updated patch should be able handle this. >>>>>> >>>>>>>> (not sure how you achieve this, but it looks like you insert/lookup general >>>>>>>> calls, thus not pure/const ones >>>>>> >>>>>> Correct. >>>>>> >>>>>>>> - that could easily lead to miscompiles(?)). >>>>>>>> >>>> And to increased compile-time (not sure if it matters). >>>> >>>>>> I have bootstrapped and reg-tested the updated (and the previous) patch on >>>>>> x86_64 (ada inclusive), and found no issues. >>>>>> >>>>>> Can you think of a test-case that fails with the previous or the updated patch? >>>> I see you do not handle >>>> >>> >>> I assume you mean 'not handle' in the sense of 'handle conservatively'. >> >> Yes. >> >>>> struct S { int i; }; >>>> struct S foo (void); >>>> struct S bar (void) >>>> { >>>> struct S s1, s2; >>>> if (...) >>>> s = foo (); >>>> else >>>> s = foo (); >>>> >>>> because the calls have a LHS that is not an SSA name. >>> >>> Indeed, the gvn patch handles this example conservatively, and tree-tail-merge >>> fails to optimize this test-case: >>> ... >>> struct S { int i; }; >>> extern struct S foo (void); >>> extern int foo2 (void); >>> struct S s; >>> int bar (int c) { >>> int r; >>> if (c) >>> { >>> s = foo (); >>> r = foo2 (); >>> } >>> else >>> { >>> s = foo (); >>> r = foo2 (); >>> } >>> return r; >>> } >>> ... >>> >>> A todo. >>> >>>> In general >>>> value-numbering virtual operands will change what further consumers lookup. >>>> So changing >>>> >>>> # MEM_2 = VUSE <MEM_3> >>>> foo (); >>>> # VUSE <MEM_2> >>>> ... = x; >>>> >>>> by value-numbering MEM_2 to MEM_3 will make the lookup of ... = x use >>>> MEM_3, not considering a clobber by foo. >>> >>> AFAIU, the patch doesn't do this type of value numbering. >>> >>>> This might not actually be an >>>> issue (after all we do this for stores, too). >>>> >>>> I see you do not handle >>>> >>>> int bar (int); >>>> void baz (int); >>>> void bla (int); >>>> int s; >>>> void >>>> foo (int y) >>>> { >>>> int a; >>>> if (y == 6) >>>> { >>>> s = 1; >>>> bla (5); >>>> a = bar (7); >>>> } >>>> else >>>> { >>>> s = 1; >>>> bla (5); >>>> a = bar (7); >>>> } >>>> baz (a); >>>> } >>>> >>> >>> I think a more basic example is this one from PR52009, which is basically about >>> stores: >>> ... >>> int z; >>> >>> void >>> foo (int y) >>> { >>> int a; >>> if (y == 6) >>> z = 5; >>> else >>> z = 5; >>> } >>> ... >>> >>> I proposed a patch for that separately: >>> http://gcc.gnu.org/ml/gcc-patches/2012-01/msg01731.html. Mmmm, looking at the >>> patch now, I suppose I could use part of that patch as infrastructure for the >>> todo above. >> >> Yes, I think so. >> >>>> and I don't see how you need the new ref->vdef member >>> >>> I'm regarding the result of the ref as a compound <vdef, result>. Most of the >>> time I could (as I did in an earlier version of the patch) deduce vdef as >>> gimple_vdef (SSA_NAME_DEF_STMT (result)), but not always, in particular when >>> result is NULL_TREE. >>> >>>> - that is also >>>> not handled consistently (not in hashing or comparing, etc.). >>> >>> I handle the vdef as a result. >> >> Ok. Can you please rename ->vdef to ->result_vdef then? >> > > Done. > >>>> Can't you >>>> always use SSA_VAL (vuse)? >>> >>> I'm interested in storing the vdef, not the vuse. >> >> I meant that result_vdef will always be SSA_VAL (vuse) (the vdef of >> the stmt defining vuse). No? >> > > Right. > >>>> >>>> Btw, what's >>>> >>>> + if (vdef) >>>> + VN_INFO (vdef)->use_processed = true; >>>> >>>> for? We arrive here by visiting the VDEF after all and visit_use does already >>>> set the use processed. Is it that we end up visiting the call twice because >>>> of the lhs SSA name definition and the virtual operand definition? >>> >>> Yes. >>> >>>> If so then >>>> visit_use should instead do >>>> >>>> FOR_EACH_SSA_DEF_OPERAND (...) >>>> VN_INFO (def)->use_processed = true; >>>> >>>> and defs_to_varying then no longer needs to do that. >>>> >>> >>> factored mark_use_processed out of defs_to_varying, calling mark_use_processed >>> at start of visit_use. >> >> Thanks. Which case hits SSA_NAME_DEF_STMT == NULL btw? The >> default-def for the virtual operand? >> > > SSA_NAME_DEF_STMT == NULL is tested by !stmt in mark_use_processed. > > I suppose I better use SSA_NAME_IS_DEFAULT_DEF, as is done in visit_use. I'm > still somewhat confused as to what is the difference. I suppose we might have a > nop statement that is a default def, while it's unequal to NULL. All default defs have a GIMPLE_NOP as definition, so I was wondering for which case you see a NULL SSA_NAME_DEF_STMT (and only came up with a virtual operand maybe?) > Updated in committed patch. > >>>>>> Otherwise, ok for trunk? >>>> Let's give this one more iteration, but I guess the basic idea should work. >>>> >>> >>> bootstrapped and reg-tested on x86_64 (ada inclusive). >>> >>> Is this patch ok, or is the todo required? >> >> No, you can followup with that. >> >> Ok with s/vdef/result_vdef/ >> > > retested and committed with this change and SSA_NAME_IS_DEFAULT_DEF instead of > '!stmt'. Thanks, Richard. > Thanks. > - Tom
Index: gcc/tree-ssa-sccvn.c =================================================================== --- gcc/tree-ssa-sccvn.c (revision 185028) +++ gcc/tree-ssa-sccvn.c (working copy) @@ -2525,6 +2525,30 @@ set_ssa_val_to (tree from, tree to) return false; } +/* Mark as processed all the definitions in the defining stmt of USE, or + the USE itself. */ + +static void +mark_use_processed (tree use) +{ + ssa_op_iter iter; + def_operand_p defp; + gimple stmt = SSA_NAME_DEF_STMT (use); + + if (!stmt || gimple_code (stmt) == GIMPLE_PHI) + { + VN_INFO (use)->use_processed = true; + return; + } + + FOR_EACH_SSA_DEF_OPERAND (defp, stmt, iter, SSA_OP_ALL_DEFS) + { + tree def = DEF_FROM_PTR (defp); + + VN_INFO (def)->use_processed = true; + } +} + /* Set all definitions in STMT to value number to themselves. Return true if a value number changed. */ @@ -2538,8 +2562,6 @@ defs_to_varying (gimple stmt) FOR_EACH_SSA_DEF_OPERAND (defp, stmt, iter, SSA_OP_ALL_DEFS) { tree def = DEF_FROM_PTR (defp); - - VN_INFO (def)->use_processed = true; changed |= set_ssa_val_to (def, def); } return changed; @@ -2598,27 +2620,41 @@ visit_reference_op_call (tree lhs, gimpl { bool changed = false; struct vn_reference_s vr1; - tree result; + vn_reference_t vnresult = NULL; tree vuse = gimple_vuse (stmt); + tree vdef = gimple_vdef (stmt); vr1.vuse = vuse ? SSA_VAL (vuse) : NULL_TREE; vr1.operands = valueize_shared_reference_ops_from_call (stmt); vr1.type = gimple_expr_type (stmt); vr1.set = 0; vr1.hashcode = vn_reference_compute_hash (&vr1); - result = vn_reference_lookup_1 (&vr1, NULL); - if (result) + vn_reference_lookup_1 (&vr1, &vnresult); + + if (vnresult) { - changed = set_ssa_val_to (lhs, result); - if (TREE_CODE (result) == SSA_NAME - && VN_INFO (result)->has_constants) - VN_INFO (lhs)->has_constants = true; + if (vnresult->vdef) + changed |= set_ssa_val_to (vdef, vnresult->vdef); + + if (!vnresult->result && lhs) + vnresult->result = lhs; + + if (vnresult->result && lhs) + { + changed |= set_ssa_val_to (lhs, vnresult->result); + + if (VN_INFO (vnresult->result)->has_constants) + VN_INFO (lhs)->has_constants = true; + } } else { void **slot; vn_reference_t vr2; - changed = set_ssa_val_to (lhs, lhs); + if (vdef) + changed |= set_ssa_val_to (vdef, vdef); + if (lhs) + changed |= set_ssa_val_to (lhs, lhs); vr2 = (vn_reference_t) pool_alloc (current_info->references_pool); vr2->vuse = vr1.vuse; vr2->operands = valueize_refs (create_reference_ops_from_call (stmt)); @@ -2626,6 +2662,7 @@ visit_reference_op_call (tree lhs, gimpl vr2->set = vr1.set; vr2->hashcode = vr1.hashcode; vr2->result = lhs; + vr2->vdef = vdef; slot = htab_find_slot_with_hash (current_info->references, vr2, vr2->hashcode, INSERT); if (*slot) @@ -2795,7 +2832,6 @@ visit_reference_op_store (tree lhs, tree going to valueize the references in-place. */ if ((vdef = gimple_vdef (stmt))) { - VN_INFO (vdef)->use_processed = true; changed |= set_ssa_val_to (vdef, vdef); } @@ -2817,7 +2853,6 @@ visit_reference_op_store (tree lhs, tree def = gimple_vdef (stmt); use = gimple_vuse (stmt); - VN_INFO (def)->use_processed = true; changed |= set_ssa_val_to (def, SSA_VAL (use)); } @@ -3167,7 +3202,7 @@ visit_use (tree use) bool changed = false; gimple stmt = SSA_NAME_DEF_STMT (use); - VN_INFO (use)->use_processed = true; + mark_use_processed (use); gcc_assert (!SSA_NAME_IN_FREE_LIST (use)); if (dump_file && (dump_flags & TDF_DETAILS) @@ -3186,8 +3221,7 @@ visit_use (tree use) { if (gimple_code (stmt) == GIMPLE_PHI) changed = visit_phi (stmt); - else if (!gimple_has_lhs (stmt) - || gimple_has_volatile_ops (stmt)) + else if (gimple_has_volatile_ops (stmt)) changed = defs_to_varying (stmt); else if (is_gimple_assign (stmt)) { @@ -3349,34 +3383,44 @@ visit_use (tree use) /* ??? We could try to simplify calls. */ - if (stmt_has_constants (stmt) - && TREE_CODE (lhs) == SSA_NAME) - VN_INFO (lhs)->has_constants = true; - else if (TREE_CODE (lhs) == SSA_NAME) + if (lhs && TREE_CODE (lhs) == SSA_NAME) { - /* We reset expr and constantness here because we may - have been value numbering optimistically, and - iterating. They may become non-constant in this case, - even if they were optimistically constant. */ - VN_INFO (lhs)->has_constants = false; - VN_INFO (lhs)->expr = NULL_TREE; + if (stmt_has_constants (stmt)) + VN_INFO (lhs)->has_constants = true; + else + { + /* We reset expr and constantness here because we may + have been value numbering optimistically, and + iterating. They may become non-constant in this case, + even if they were optimistically constant. */ + VN_INFO (lhs)->has_constants = false; + VN_INFO (lhs)->expr = NULL_TREE; + } + + if (SSA_NAME_OCCURS_IN_ABNORMAL_PHI (lhs)) + { + changed = defs_to_varying (stmt); + goto done; + } } - if (TREE_CODE (lhs) == SSA_NAME - && SSA_NAME_OCCURS_IN_ABNORMAL_PHI (lhs)) - changed = defs_to_varying (stmt); /* ??? We should handle stores from calls. */ - else if (TREE_CODE (lhs) == SSA_NAME) + if (!gimple_call_internal_p (stmt) + && (gimple_call_flags (stmt) & (ECF_PURE | ECF_CONST) + /* If the call has side effects, subsequent calls won't have + the same incoming vuse, so it's save to assume + equality. */ + || gimple_has_side_effects (stmt)) + && ((lhs && TREE_CODE (lhs) == SSA_NAME) + || (!lhs && gimple_vdef (stmt)))) { - if (!gimple_call_internal_p (stmt) - && gimple_call_flags (stmt) & (ECF_PURE | ECF_CONST)) - changed = visit_reference_op_call (lhs, stmt); - else - changed = defs_to_varying (stmt); + changed = visit_reference_op_call (lhs, stmt); } else changed = defs_to_varying (stmt); } + else + changed = defs_to_varying (stmt); } done: return changed; Index: gcc/tree-ssa-sccvn.h =================================================================== --- gcc/tree-ssa-sccvn.h (revision 185028) +++ gcc/tree-ssa-sccvn.h (working copy) @@ -110,6 +110,7 @@ typedef struct vn_reference_s tree type; VEC (vn_reference_op_s, heap) *operands; tree result; + tree vdef; } *vn_reference_t; typedef const struct vn_reference_s *const_vn_reference_t; Index: gcc/testsuite/gcc.dg/pr51879-2.c =================================================================== --- /dev/null (new file) +++ gcc/testsuite/gcc.dg/pr51879-2.c (revision 0) @@ -0,0 +1,19 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fdump-tree-pre" } */ + +int bar (int); +void baz (int); + +void +foo (int y) +{ + int a; + if (y) + baz (bar (7) + 6); + else + baz (bar (7) + 6); +} + +/* { dg-final { scan-tree-dump-times "bar \\(" 1 "pre"} } */ +/* { dg-final { scan-tree-dump-times "baz \\(" 1 "pre"} } */ +/* { dg-final { cleanup-tree-dump "pre" } } */ Index: gcc/testsuite/gcc.dg/pr51879-6.c =================================================================== --- /dev/null (new file) +++ gcc/testsuite/gcc.dg/pr51879-6.c (revision 0) @@ -0,0 +1,27 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fdump-tree-pre" } */ + + +int bar (int); +void baz (int); +void bla (int); + +void +foo (int y) +{ + int a; + if (y == 6) + { + bla (5); + a = bar (7); + } + else + { + bla (5); + a = bar (7); + } + baz (a); +} + +/* { dg-final { scan-tree-dump-times "bar \\(" 1 "pre"} } */ +/* { dg-final { cleanup-tree-dump "pre" } } */ Index: gcc/testsuite/gcc.dg/pr51879.c =================================================================== --- /dev/null (new file) +++ gcc/testsuite/gcc.dg/pr51879.c (revision 0) @@ -0,0 +1,19 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fdump-tree-pre" } */ + +int bar (int); +void baz (int); + +void +foo (int y) +{ + int a; + if (y == 6) + a = bar (7); + else + a = bar (7); + baz (a); +} + +/* { dg-final { scan-tree-dump-times "bar \\(" 1 "pre"} } */ +/* { dg-final { cleanup-tree-dump "pre" } } */ Index: gcc/testsuite/gcc.dg/pr51879-3.c =================================================================== --- /dev/null (new file) +++ gcc/testsuite/gcc.dg/pr51879-3.c (revision 0) @@ -0,0 +1,19 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fdump-tree-pre" } */ + +int bar (int); +void baz (int); + +void +foo (int y) +{ + int a; + if (y == 6) + a = bar (7) + 6; + else + a = bar (7) + 6; + baz (a); +} + +/* { dg-final { scan-tree-dump-times "bar \\(" 1 "pre"} } */ +/* { dg-final { cleanup-tree-dump "pre" } } */ Index: gcc/testsuite/gcc.dg/pr51879-4.c =================================================================== --- /dev/null (new file) +++ gcc/testsuite/gcc.dg/pr51879-4.c (revision 0) @@ -0,0 +1,16 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fdump-tree-pre" } */ + +int bar (int); +void baz (int); + +int foo (int y) +{ + int a, b; + a = bar (7) + 6; + b = bar (7) + 6; + return a + b; +} + +/* { dg-final { scan-tree-dump-times "bar \\(" 2 "pre"} } */ +/* { dg-final { cleanup-tree-dump "pre" } } */