diff mbox

Fix for PR51879 - Missed tail merging with non-const/pure calls

Message ID 4F987310.5000603@mentor.com
State New
Headers show

Commit Message

Tom de Vries April 25, 2012, 9:56 p.m. UTC
On 25/04/12 11:57, Richard Guenther wrote:
<SNIP>
>>> >> Hmm.  I'm not sure we can conclude that they have the same value!
>>> >>
>>> >> +int bar (int);
>>> >> +void baz (int);
>>> >> +void bla (int);
>>> >> +
>>> >> +void
>>> >> +foo (int y)
>>> >> +{
>>> >> +  int a;
>>> >> +  if (y == 6)
>>> >> +    {
>>> >> +      bla (5);
>>> >> +      a = bar (7);
>>> >> +    }
>>> >> +  else
>>> >> +    {
>>> >> +      bla (5);
>>> >> +      a = bar (7);
>>> >> +    }
>>> >> +  baz (a);
>>> >> +}
>>> >>
>>> >> I can implement bla as
>>> >>
>>> >> void bla(int) { a = random (); }
>>> >>
>>> >> thus a would not have the same value on both paths
>> >
>> > I think it's hard to decide whether they have the same value or not, since we
>> > cannot write a test-case that compares the results of calls from 2 branches of
>> > an if statement.
> void *foo ()
> {
>   return __builtin_return_address (0);
> }
> 
> void *bar (_Bool b)
> {
>   if (b)
>     return foo ();
>   else
>     return foo ();
> }
> 
> int main()
> {
>   if (bar(true) == bar(false))
>     abort ();
> }
> 
> ok ... outside of the scope of standard "C", but we certainly _can_ do this.
> Which would question tail-merging the above at all, of course.
> 

I added noinline to make sure there's no variation from inlining.
...
extern void abort (void) __attribute__ ((__nothrow__)) __attribute__
((__noreturn__));

__attribute__((noinline))
void *foo ()
{
  return __builtin_return_address (0);
}

__attribute__((noinline))
void *bar (int b)
{
  if (b)
    return foo ();
  else
    return foo ();
}

int main()
{
  void *a, *b;
  a = bar (1);
  b = bar (0);
  if (a == b)
    abort ();
  return 0;
}
...

This test-case passes with:
...
gcc -O2 calls.c -fno-optimize-sibling-calls -fno-tree-tail-merge -fno-crossjumping
...

and fails with:
...
gcc -O2 calls.c -fno-optimize-sibling-calls -fno-tree-tail-merge -fcrossjumping
...

with the proposed patch, it also fails with:
...
gcc -O2 calls.c -fno-optimize-sibling-calls -ftree-tail-merge -fno-crossjumping
...

Tree-tail-merge performs the same transformation as cross-jumping for this
test-case. I think that the transformation is valid, and that the test-case
behaves as expected. Subsequent calls to foo may or may not return the same
value, depending on the optimizations, we can't assert a != b, that's just the
nature of __builtin_return_address.

Btw, this example is not what I meant when I said 'a test-case that compares the
results of calls from 2 branches of an if statement'. I tried to say that the
semantics of C is defined in terms of taking either 1 branch or the other,
rather than executing both in parallel, after which both results would be
available. From that point of view, this example compares subsequent calls to
foo, not parallel.

>> > I started looking at the problem in terms of subsequent calls. Consider the
>> > imaginary attribute nosideeffect, similar to const and pure.
>> >
>> > const:
>> > - no effects except the return value
>> > - the return value depends only on the parameters
>> >
>> > pure:
>> > - no effects except the return value
>> > - the return value depends only on the parameters and/or global variables
>> >
>> > nosideeffect:
>> > - no effects except the return value
>> >
>> > The code fragment:
>> > ...
>> > extern int f (int) __attribute ((nosideeffect));
>> > int g()
>> > {
>> >  int a;
>> >  int b;
>> >  a = f (5);
>> >  b = f (5);
>> >  return a + b;
>> > }
>> > ...
>> >
>> > would result in a gimple sequence more or less like this:
>> > ...
>> >  # VUSE <.MEMD.1712_4(D)>
>> >  aD.1707_1 = fD.1704 (5);
>> >  # VUSE <.MEMD.1712_4(D)>
>> >  bD.1708_2 = fD.1704 (5);
>> >  D.1710_3 = aD.1707_1 + bD.1708_2;
>> >  # VUSE <.MEMD.1712_6>
>> >  return D.1710_3;
>> > ...
>> >
>> > The value numbering modification I'm proposing would say that a and b have the
>> > same value, which is not true. I updated the patch to ensure in this case that a
>> > and b would be seen as different.
>> > Note that things won't break if the attribute is missing on a function, since
>> > that just means that the function would have a vdef, and there would be no 2
>> > subsequent function calls with the same vuse.
>> >
>>> >> - but that is not what
>>> >> matters - because obviously we can still merge the blocks.
>> >
>> > Right.
>> >
>>> >>  That is,
>>> >> we cannot be sure that the side-effects on memory of bla (5) and bla (5)
>>> >> are the same.  But is that not what your value-numbering changes will assert?
>> >
>> > In the case of calls in different branches of an control flow statement, we can
>> > aggressively assume they're the same, since we cannot write a check that would
>> > start failing.
> Apart from the above ... which shows that even the returned values can matter.
> 
>> > In the case of subsequent calls with side effects, the vuses will never be the same.
>> >
>> > In the case of subsequent calls without side effects, but not pure or const, we
>> > can't assume they have the same result. AFAIK, there's currently no way to
>> > specify such a function, but the updated patch should be able handle this.
>> >
>>> >> (not sure how you achieve this, but it looks like you insert/lookup general
>>> >> calls, thus not pure/const ones
>> >
>> > Correct.
>> >
>>> >> - that could easily lead to miscompiles(?)).
>>> >>
> And to increased compile-time (not sure if it matters).
> 
>> > I have bootstrapped and reg-tested the updated (and the previous) patch on
>> > x86_64 (ada inclusive), and found no issues.
>> >
>> > Can you think of a test-case that fails with the previous or the updated patch?
> I see you do not handle
> 

I assume you mean 'not handle' in the sense of 'handle conservatively'.

> struct S { int i; };
> struct S foo (void);
> struct S bar (void)
> {
>   struct S s1, s2;
>   if (...)
>    s = foo ();
>   else
>    s = foo ();
> 
> because the calls have a LHS that is not an SSA name.

Indeed, the gvn patch handles this example conservatively, and tree-tail-merge
fails to optimize this test-case:
...
struct S { int i; };
extern struct S foo (void);
extern int foo2 (void);
struct S s;
int bar (int c) {
  int r;
  if (c)
    {
      s = foo ();
      r = foo2 ();
    }
  else
    {
      s = foo ();
      r = foo2 ();
    }
  return r;
}
...

A todo.

>  In general
> value-numbering virtual operands will change what further consumers lookup.
> So changing
> 
>   # MEM_2 = VUSE <MEM_3>
>   foo ();
>   # VUSE <MEM_2>
>   ... = x;
> 
> by value-numbering MEM_2 to MEM_3 will make the lookup of ... = x use
> MEM_3, not considering a clobber by foo.

AFAIU, the patch doesn't do this type of value numbering.

> This might not actually be an
> issue (after all we do this for stores, too).
> 
> I see you do not handle
> 
> int bar (int);
> void baz (int);
> void bla (int);
> int s;
> void
> foo (int y)
> {
>   int a;
>   if (y == 6)
>     {
>       s = 1;
>       bla (5);
>       a = bar (7);
>     }
>   else
>     {
>       s = 1;
>       bla (5);
>       a = bar (7);
>     }
>   baz (a);
> }
> 

I think a more basic example is this one from PR52009, which is basically about
stores:
...
int z;

void
foo (int y)
{
  int a;
  if (y == 6)
    z = 5;
  else
    z = 5;
}
...

I proposed a patch for that separately:
http://gcc.gnu.org/ml/gcc-patches/2012-01/msg01731.html. Mmmm, looking at the
patch now, I suppose I could use part of that patch as infrastructure for the
todo above.

> and I don't see how you need the new ref->vdef member

I'm regarding the result of the ref as a compound <vdef, result>. Most of the
time I could (as I did in an earlier version of the patch) deduce vdef as
gimple_vdef (SSA_NAME_DEF_STMT (result)), but not always, in particular when
result is NULL_TREE.

> - that is also
> not handled consistently (not in hashing or comparing, etc.).

I handle the vdef as a result.

> Can't you
> always use SSA_VAL (vuse)?

I'm interested in storing the vdef, not the vuse.

> 
> Btw, what's
> 
> +  if (vdef)
> +    VN_INFO (vdef)->use_processed = true;
> 
> for?  We arrive here by visiting the VDEF after all and visit_use does already
> set the use processed.  Is it that we end up visiting the call twice because
> of the lhs SSA name definition and the virtual operand definition? 

Yes.

> If so then
> visit_use should instead do
> 
> FOR_EACH_SSA_DEF_OPERAND (...)
>    VN_INFO (def)->use_processed = true;
> 
> and defs_to_varying then no longer needs to do that.
> 

factored mark_use_processed out of defs_to_varying, calling mark_use_processed
at start of visit_use.

>> > Otherwise, ok for trunk?
> Let's give this one more iteration, but I guess the basic idea should work.
> 

bootstrapped and reg-tested on x86_64 (ada inclusive).

Is this patch ok, or is the todo required?

Thanks,
 - Tom

2012-04-25  Tom de Vries  <tom@codesourcery.com>

	PR tree-optimization/51879
	* tree-ssa-sccvn.h (struct vn_reference_s): Add vdef field.
	* tree-ssa-sccvn.c (mark_use_processed): New function, factored out
	of ...
	(defs_to_varying): ... here.
	(visit_reference_op_call): Handle gimple_vdef.
	Handle case that lhs is NULL_TREE.
	(visit_use): Use mark_use_processed.  Handle calls with side-effect
	using visit_reference_op_call.

	gcc.dg/pr51879.c: New test.
	gcc.dg/pr51879-2.c: Same.
	gcc.dg/pr51879-3.c: Same.
	gcc.dg/pr51879-4.c: Same.
	gcc.dg/pr51879-6.c: Same.

Comments

Richard Biener April 26, 2012, 10:20 a.m. UTC | #1
On Wed, Apr 25, 2012 at 11:56 PM, Tom de Vries <Tom_deVries@mentor.com> wrote:
> On 25/04/12 11:57, Richard Guenther wrote:
> <SNIP>
>>>> >> Hmm.  I'm not sure we can conclude that they have the same value!
>>>> >>
>>>> >> +int bar (int);
>>>> >> +void baz (int);
>>>> >> +void bla (int);
>>>> >> +
>>>> >> +void
>>>> >> +foo (int y)
>>>> >> +{
>>>> >> +  int a;
>>>> >> +  if (y == 6)
>>>> >> +    {
>>>> >> +      bla (5);
>>>> >> +      a = bar (7);
>>>> >> +    }
>>>> >> +  else
>>>> >> +    {
>>>> >> +      bla (5);
>>>> >> +      a = bar (7);
>>>> >> +    }
>>>> >> +  baz (a);
>>>> >> +}
>>>> >>
>>>> >> I can implement bla as
>>>> >>
>>>> >> void bla(int) { a = random (); }
>>>> >>
>>>> >> thus a would not have the same value on both paths
>>> >
>>> > I think it's hard to decide whether they have the same value or not, since we
>>> > cannot write a test-case that compares the results of calls from 2 branches of
>>> > an if statement.
>> void *foo ()
>> {
>>   return __builtin_return_address (0);
>> }
>>
>> void *bar (_Bool b)
>> {
>>   if (b)
>>     return foo ();
>>   else
>>     return foo ();
>> }
>>
>> int main()
>> {
>>   if (bar(true) == bar(false))
>>     abort ();
>> }
>>
>> ok ... outside of the scope of standard "C", but we certainly _can_ do this.
>> Which would question tail-merging the above at all, of course.
>>
>
> I added noinline to make sure there's no variation from inlining.
> ...
> extern void abort (void) __attribute__ ((__nothrow__)) __attribute__
> ((__noreturn__));
>
> __attribute__((noinline))
> void *foo ()
> {
>  return __builtin_return_address (0);
> }
>
> __attribute__((noinline))
> void *bar (int b)
> {
>  if (b)
>    return foo ();
>  else
>    return foo ();
> }
>
> int main()
> {
>  void *a, *b;
>  a = bar (1);
>  b = bar (0);
>  if (a == b)
>    abort ();
>  return 0;
> }
> ...
>
> This test-case passes with:
> ...
> gcc -O2 calls.c -fno-optimize-sibling-calls -fno-tree-tail-merge -fno-crossjumping
> ...
>
> and fails with:
> ...
> gcc -O2 calls.c -fno-optimize-sibling-calls -fno-tree-tail-merge -fcrossjumping
> ...
>
> with the proposed patch, it also fails with:
> ...
> gcc -O2 calls.c -fno-optimize-sibling-calls -ftree-tail-merge -fno-crossjumping
> ...
>
> Tree-tail-merge performs the same transformation as cross-jumping for this
> test-case. I think that the transformation is valid, and that the test-case
> behaves as expected. Subsequent calls to foo may or may not return the same
> value, depending on the optimizations, we can't assert a != b, that's just the
> nature of __builtin_return_address.
>
> Btw, this example is not what I meant when I said 'a test-case that compares the
> results of calls from 2 branches of an if statement'. I tried to say that the
> semantics of C is defined in terms of taking either 1 branch or the other,
> rather than executing both in parallel, after which both results would be
> available. From that point of view, this example compares subsequent calls to
> foo, not parallel.

Ah, I see.  Btw, I was not suggesting that we should not optimize the above.

>>> > I started looking at the problem in terms of subsequent calls. Consider the
>>> > imaginary attribute nosideeffect, similar to const and pure.
>>> >
>>> > const:
>>> > - no effects except the return value
>>> > - the return value depends only on the parameters
>>> >
>>> > pure:
>>> > - no effects except the return value
>>> > - the return value depends only on the parameters and/or global variables
>>> >
>>> > nosideeffect:
>>> > - no effects except the return value
>>> >
>>> > The code fragment:
>>> > ...
>>> > extern int f (int) __attribute ((nosideeffect));
>>> > int g()
>>> > {
>>> >  int a;
>>> >  int b;
>>> >  a = f (5);
>>> >  b = f (5);
>>> >  return a + b;
>>> > }
>>> > ...
>>> >
>>> > would result in a gimple sequence more or less like this:
>>> > ...
>>> >  # VUSE <.MEMD.1712_4(D)>
>>> >  aD.1707_1 = fD.1704 (5);
>>> >  # VUSE <.MEMD.1712_4(D)>
>>> >  bD.1708_2 = fD.1704 (5);
>>> >  D.1710_3 = aD.1707_1 + bD.1708_2;
>>> >  # VUSE <.MEMD.1712_6>
>>> >  return D.1710_3;
>>> > ...
>>> >
>>> > The value numbering modification I'm proposing would say that a and b have the
>>> > same value, which is not true. I updated the patch to ensure in this case that a
>>> > and b would be seen as different.
>>> > Note that things won't break if the attribute is missing on a function, since
>>> > that just means that the function would have a vdef, and there would be no 2
>>> > subsequent function calls with the same vuse.
>>> >
>>>> >> - but that is not what
>>>> >> matters - because obviously we can still merge the blocks.
>>> >
>>> > Right.
>>> >
>>>> >>  That is,
>>>> >> we cannot be sure that the side-effects on memory of bla (5) and bla (5)
>>>> >> are the same.  But is that not what your value-numbering changes will assert?
>>> >
>>> > In the case of calls in different branches of an control flow statement, we can
>>> > aggressively assume they're the same, since we cannot write a check that would
>>> > start failing.
>> Apart from the above ... which shows that even the returned values can matter.
>>
>>> > In the case of subsequent calls with side effects, the vuses will never be the same.
>>> >
>>> > In the case of subsequent calls without side effects, but not pure or const, we
>>> > can't assume they have the same result. AFAIK, there's currently no way to
>>> > specify such a function, but the updated patch should be able handle this.
>>> >
>>>> >> (not sure how you achieve this, but it looks like you insert/lookup general
>>>> >> calls, thus not pure/const ones
>>> >
>>> > Correct.
>>> >
>>>> >> - that could easily lead to miscompiles(?)).
>>>> >>
>> And to increased compile-time (not sure if it matters).
>>
>>> > I have bootstrapped and reg-tested the updated (and the previous) patch on
>>> > x86_64 (ada inclusive), and found no issues.
>>> >
>>> > Can you think of a test-case that fails with the previous or the updated patch?
>> I see you do not handle
>>
>
> I assume you mean 'not handle' in the sense of 'handle conservatively'.

Yes.

>> struct S { int i; };
>> struct S foo (void);
>> struct S bar (void)
>> {
>>   struct S s1, s2;
>>   if (...)
>>    s = foo ();
>>   else
>>    s = foo ();
>>
>> because the calls have a LHS that is not an SSA name.
>
> Indeed, the gvn patch handles this example conservatively, and tree-tail-merge
> fails to optimize this test-case:
> ...
> struct S { int i; };
> extern struct S foo (void);
> extern int foo2 (void);
> struct S s;
> int bar (int c) {
>  int r;
>  if (c)
>    {
>      s = foo ();
>      r = foo2 ();
>    }
>  else
>    {
>      s = foo ();
>      r = foo2 ();
>    }
>  return r;
> }
> ...
>
> A todo.
>
>>  In general
>> value-numbering virtual operands will change what further consumers lookup.
>> So changing
>>
>>   # MEM_2 = VUSE <MEM_3>
>>   foo ();
>>   # VUSE <MEM_2>
>>   ... = x;
>>
>> by value-numbering MEM_2 to MEM_3 will make the lookup of ... = x use
>> MEM_3, not considering a clobber by foo.
>
> AFAIU, the patch doesn't do this type of value numbering.
>
>> This might not actually be an
>> issue (after all we do this for stores, too).
>>
>> I see you do not handle
>>
>> int bar (int);
>> void baz (int);
>> void bla (int);
>> int s;
>> void
>> foo (int y)
>> {
>>   int a;
>>   if (y == 6)
>>     {
>>       s = 1;
>>       bla (5);
>>       a = bar (7);
>>     }
>>   else
>>     {
>>       s = 1;
>>       bla (5);
>>       a = bar (7);
>>     }
>>   baz (a);
>> }
>>
>
> I think a more basic example is this one from PR52009, which is basically about
> stores:
> ...
> int z;
>
> void
> foo (int y)
> {
>  int a;
>  if (y == 6)
>    z = 5;
>  else
>    z = 5;
> }
> ...
>
> I proposed a patch for that separately:
> http://gcc.gnu.org/ml/gcc-patches/2012-01/msg01731.html. Mmmm, looking at the
> patch now, I suppose I could use part of that patch as infrastructure for the
> todo above.

Yes, I think so.

>> and I don't see how you need the new ref->vdef member
>
> I'm regarding the result of the ref as a compound <vdef, result>. Most of the
> time I could (as I did in an earlier version of the patch) deduce vdef as
> gimple_vdef (SSA_NAME_DEF_STMT (result)), but not always, in particular when
> result is NULL_TREE.
>
>> - that is also
>> not handled consistently (not in hashing or comparing, etc.).
>
> I handle the vdef as a result.

Ok.  Can you please rename ->vdef to ->result_vdef then?

>> Can't you
>> always use SSA_VAL (vuse)?
>
> I'm interested in storing the vdef, not the vuse.

I meant that result_vdef will always be SSA_VAL (vuse) (the vdef of
the stmt defining vuse).  No?

>>
>> Btw, what's
>>
>> +  if (vdef)
>> +    VN_INFO (vdef)->use_processed = true;
>>
>> for?  We arrive here by visiting the VDEF after all and visit_use does already
>> set the use processed.  Is it that we end up visiting the call twice because
>> of the lhs SSA name definition and the virtual operand definition?
>
> Yes.
>
>> If so then
>> visit_use should instead do
>>
>> FOR_EACH_SSA_DEF_OPERAND (...)
>>    VN_INFO (def)->use_processed = true;
>>
>> and defs_to_varying then no longer needs to do that.
>>
>
> factored mark_use_processed out of defs_to_varying, calling mark_use_processed
> at start of visit_use.

Thanks.  Which case hits SSA_NAME_DEF_STMT == NULL btw?  The
default-def for the virtual operand?

>>> > Otherwise, ok for trunk?
>> Let's give this one more iteration, but I guess the basic idea should work.
>>
>
> bootstrapped and reg-tested on x86_64 (ada inclusive).
>
> Is this patch ok, or is the todo required?

No, you can followup with that.

Ok with s/vdef/result_vdef/

Thanks,
Richard.

> Thanks,
>  - Tom
>
> 2012-04-25  Tom de Vries  <tom@codesourcery.com>
>
>        PR tree-optimization/51879
>        * tree-ssa-sccvn.h (struct vn_reference_s): Add vdef field.
>        * tree-ssa-sccvn.c (mark_use_processed): New function, factored out
>        of ...
>        (defs_to_varying): ... here.
>        (visit_reference_op_call): Handle gimple_vdef.
>        Handle case that lhs is NULL_TREE.
>        (visit_use): Use mark_use_processed.  Handle calls with side-effect
>        using visit_reference_op_call.
>
>        gcc.dg/pr51879.c: New test.
>        gcc.dg/pr51879-2.c: Same.
>        gcc.dg/pr51879-3.c: Same.
>        gcc.dg/pr51879-4.c: Same.
>        gcc.dg/pr51879-6.c: Same.
Tom de Vries April 27, 2012, 6:20 a.m. UTC | #2
On 26/04/12 12:20, Richard Guenther wrote:
> On Wed, Apr 25, 2012 at 11:56 PM, Tom de Vries <Tom_deVries@mentor.com> wrote:
>> On 25/04/12 11:57, Richard Guenther wrote:
>> <SNIP>
>>>>>>> Hmm.  I'm not sure we can conclude that they have the same value!
>>>>>>>
>>>>>>> +int bar (int);
>>>>>>> +void baz (int);
>>>>>>> +void bla (int);
>>>>>>> +
>>>>>>> +void
>>>>>>> +foo (int y)
>>>>>>> +{
>>>>>>> +  int a;
>>>>>>> +  if (y == 6)
>>>>>>> +    {
>>>>>>> +      bla (5);
>>>>>>> +      a = bar (7);
>>>>>>> +    }
>>>>>>> +  else
>>>>>>> +    {
>>>>>>> +      bla (5);
>>>>>>> +      a = bar (7);
>>>>>>> +    }
>>>>>>> +  baz (a);
>>>>>>> +}
>>>>>>>
>>>>>>> I can implement bla as
>>>>>>>
>>>>>>> void bla(int) { a = random (); }
>>>>>>>
>>>>>>> thus a would not have the same value on both paths
>>>>>
>>>>> I think it's hard to decide whether they have the same value or not, since we
>>>>> cannot write a test-case that compares the results of calls from 2 branches of
>>>>> an if statement.
>>> void *foo ()
>>> {
>>>   return __builtin_return_address (0);
>>> }
>>>
>>> void *bar (_Bool b)
>>> {
>>>   if (b)
>>>     return foo ();
>>>   else
>>>     return foo ();
>>> }
>>>
>>> int main()
>>> {
>>>   if (bar(true) == bar(false))
>>>     abort ();
>>> }
>>>
>>> ok ... outside of the scope of standard "C", but we certainly _can_ do this.
>>> Which would question tail-merging the above at all, of course.
>>>
>>
>> I added noinline to make sure there's no variation from inlining.
>> ...
>> extern void abort (void) __attribute__ ((__nothrow__)) __attribute__
>> ((__noreturn__));
>>
>> __attribute__((noinline))
>> void *foo ()
>> {
>>  return __builtin_return_address (0);
>> }
>>
>> __attribute__((noinline))
>> void *bar (int b)
>> {
>>  if (b)
>>    return foo ();
>>  else
>>    return foo ();
>> }
>>
>> int main()
>> {
>>  void *a, *b;
>>  a = bar (1);
>>  b = bar (0);
>>  if (a == b)
>>    abort ();
>>  return 0;
>> }
>> ...
>>
>> This test-case passes with:
>> ...
>> gcc -O2 calls.c -fno-optimize-sibling-calls -fno-tree-tail-merge -fno-crossjumping
>> ...
>>
>> and fails with:
>> ...
>> gcc -O2 calls.c -fno-optimize-sibling-calls -fno-tree-tail-merge -fcrossjumping
>> ...
>>
>> with the proposed patch, it also fails with:
>> ...
>> gcc -O2 calls.c -fno-optimize-sibling-calls -ftree-tail-merge -fno-crossjumping
>> ...
>>
>> Tree-tail-merge performs the same transformation as cross-jumping for this
>> test-case. I think that the transformation is valid, and that the test-case
>> behaves as expected. Subsequent calls to foo may or may not return the same
>> value, depending on the optimizations, we can't assert a != b, that's just the
>> nature of __builtin_return_address.
>>
>> Btw, this example is not what I meant when I said 'a test-case that compares the
>> results of calls from 2 branches of an if statement'. I tried to say that the
>> semantics of C is defined in terms of taking either 1 branch or the other,
>> rather than executing both in parallel, after which both results would be
>> available. From that point of view, this example compares subsequent calls to
>> foo, not parallel.
> 
> Ah, I see.  Btw, I was not suggesting that we should not optimize the above.
> 
>>>>> I started looking at the problem in terms of subsequent calls. Consider the
>>>>> imaginary attribute nosideeffect, similar to const and pure.
>>>>>
>>>>> const:
>>>>> - no effects except the return value
>>>>> - the return value depends only on the parameters
>>>>>
>>>>> pure:
>>>>> - no effects except the return value
>>>>> - the return value depends only on the parameters and/or global variables
>>>>>
>>>>> nosideeffect:
>>>>> - no effects except the return value
>>>>>
>>>>> The code fragment:
>>>>> ...
>>>>> extern int f (int) __attribute ((nosideeffect));
>>>>> int g()
>>>>> {
>>>>>  int a;
>>>>>  int b;
>>>>>  a = f (5);
>>>>>  b = f (5);
>>>>>  return a + b;
>>>>> }
>>>>> ...
>>>>>
>>>>> would result in a gimple sequence more or less like this:
>>>>> ...
>>>>>  # VUSE <.MEMD.1712_4(D)>
>>>>>  aD.1707_1 = fD.1704 (5);
>>>>>  # VUSE <.MEMD.1712_4(D)>
>>>>>  bD.1708_2 = fD.1704 (5);
>>>>>  D.1710_3 = aD.1707_1 + bD.1708_2;
>>>>>  # VUSE <.MEMD.1712_6>
>>>>>  return D.1710_3;
>>>>> ...
>>>>>
>>>>> The value numbering modification I'm proposing would say that a and b have the
>>>>> same value, which is not true. I updated the patch to ensure in this case that a
>>>>> and b would be seen as different.
>>>>> Note that things won't break if the attribute is missing on a function, since
>>>>> that just means that the function would have a vdef, and there would be no 2
>>>>> subsequent function calls with the same vuse.
>>>>>
>>>>>>> - but that is not what
>>>>>>> matters - because obviously we can still merge the blocks.
>>>>>
>>>>> Right.
>>>>>
>>>>>>>  That is,
>>>>>>> we cannot be sure that the side-effects on memory of bla (5) and bla (5)
>>>>>>> are the same.  But is that not what your value-numbering changes will assert?
>>>>>
>>>>> In the case of calls in different branches of an control flow statement, we can
>>>>> aggressively assume they're the same, since we cannot write a check that would
>>>>> start failing.
>>> Apart from the above ... which shows that even the returned values can matter.
>>>
>>>>> In the case of subsequent calls with side effects, the vuses will never be the same.
>>>>>
>>>>> In the case of subsequent calls without side effects, but not pure or const, we
>>>>> can't assume they have the same result. AFAIK, there's currently no way to
>>>>> specify such a function, but the updated patch should be able handle this.
>>>>>
>>>>>>> (not sure how you achieve this, but it looks like you insert/lookup general
>>>>>>> calls, thus not pure/const ones
>>>>>
>>>>> Correct.
>>>>>
>>>>>>> - that could easily lead to miscompiles(?)).
>>>>>>>
>>> And to increased compile-time (not sure if it matters).
>>>
>>>>> I have bootstrapped and reg-tested the updated (and the previous) patch on
>>>>> x86_64 (ada inclusive), and found no issues.
>>>>>
>>>>> Can you think of a test-case that fails with the previous or the updated patch?
>>> I see you do not handle
>>>
>>
>> I assume you mean 'not handle' in the sense of 'handle conservatively'.
> 
> Yes.
> 
>>> struct S { int i; };
>>> struct S foo (void);
>>> struct S bar (void)
>>> {
>>>   struct S s1, s2;
>>>   if (...)
>>>    s = foo ();
>>>   else
>>>    s = foo ();
>>>
>>> because the calls have a LHS that is not an SSA name.
>>
>> Indeed, the gvn patch handles this example conservatively, and tree-tail-merge
>> fails to optimize this test-case:
>> ...
>> struct S { int i; };
>> extern struct S foo (void);
>> extern int foo2 (void);
>> struct S s;
>> int bar (int c) {
>>  int r;
>>  if (c)
>>    {
>>      s = foo ();
>>      r = foo2 ();
>>    }
>>  else
>>    {
>>      s = foo ();
>>      r = foo2 ();
>>    }
>>  return r;
>> }
>> ...
>>
>> A todo.
>>
>>>  In general
>>> value-numbering virtual operands will change what further consumers lookup.
>>> So changing
>>>
>>>   # MEM_2 = VUSE <MEM_3>
>>>   foo ();
>>>   # VUSE <MEM_2>
>>>   ... = x;
>>>
>>> by value-numbering MEM_2 to MEM_3 will make the lookup of ... = x use
>>> MEM_3, not considering a clobber by foo.
>>
>> AFAIU, the patch doesn't do this type of value numbering.
>>
>>> This might not actually be an
>>> issue (after all we do this for stores, too).
>>>
>>> I see you do not handle
>>>
>>> int bar (int);
>>> void baz (int);
>>> void bla (int);
>>> int s;
>>> void
>>> foo (int y)
>>> {
>>>   int a;
>>>   if (y == 6)
>>>     {
>>>       s = 1;
>>>       bla (5);
>>>       a = bar (7);
>>>     }
>>>   else
>>>     {
>>>       s = 1;
>>>       bla (5);
>>>       a = bar (7);
>>>     }
>>>   baz (a);
>>> }
>>>
>>
>> I think a more basic example is this one from PR52009, which is basically about
>> stores:
>> ...
>> int z;
>>
>> void
>> foo (int y)
>> {
>>  int a;
>>  if (y == 6)
>>    z = 5;
>>  else
>>    z = 5;
>> }
>> ...
>>
>> I proposed a patch for that separately:
>> http://gcc.gnu.org/ml/gcc-patches/2012-01/msg01731.html. Mmmm, looking at the
>> patch now, I suppose I could use part of that patch as infrastructure for the
>> todo above.
> 
> Yes, I think so.
> 
>>> and I don't see how you need the new ref->vdef member
>>
>> I'm regarding the result of the ref as a compound <vdef, result>. Most of the
>> time I could (as I did in an earlier version of the patch) deduce vdef as
>> gimple_vdef (SSA_NAME_DEF_STMT (result)), but not always, in particular when
>> result is NULL_TREE.
>>
>>> - that is also
>>> not handled consistently (not in hashing or comparing, etc.).
>>
>> I handle the vdef as a result.
> 
> Ok.  Can you please rename ->vdef to ->result_vdef then?
> 

Done.

>>> Can't you
>>> always use SSA_VAL (vuse)?
>>
>> I'm interested in storing the vdef, not the vuse.
> 
> I meant that result_vdef will always be SSA_VAL (vuse) (the vdef of
> the stmt defining vuse).  No?
> 

Right.

>>>
>>> Btw, what's
>>>
>>> +  if (vdef)
>>> +    VN_INFO (vdef)->use_processed = true;
>>>
>>> for?  We arrive here by visiting the VDEF after all and visit_use does already
>>> set the use processed.  Is it that we end up visiting the call twice because
>>> of the lhs SSA name definition and the virtual operand definition?
>>
>> Yes.
>>
>>> If so then
>>> visit_use should instead do
>>>
>>> FOR_EACH_SSA_DEF_OPERAND (...)
>>>    VN_INFO (def)->use_processed = true;
>>>
>>> and defs_to_varying then no longer needs to do that.
>>>
>>
>> factored mark_use_processed out of defs_to_varying, calling mark_use_processed
>> at start of visit_use.
> 
> Thanks.  Which case hits SSA_NAME_DEF_STMT == NULL btw?  The
> default-def for the virtual operand?
> 

SSA_NAME_DEF_STMT == NULL is tested by !stmt in mark_use_processed.

I suppose I better use SSA_NAME_IS_DEFAULT_DEF, as is done in visit_use. I'm
still somewhat confused as to what is the difference. I suppose we might have a
nop statement that is a default def, while it's unequal to NULL.

Updated in committed patch.

>>>>> Otherwise, ok for trunk?
>>> Let's give this one more iteration, but I guess the basic idea should work.
>>>
>>
>> bootstrapped and reg-tested on x86_64 (ada inclusive).
>>
>> Is this patch ok, or is the todo required?
> 
> No, you can followup with that.
> 
> Ok with s/vdef/result_vdef/
> 

retested and committed with this change and SSA_NAME_IS_DEFAULT_DEF instead of
'!stmt'.

Thanks.
- Tom
Richard Biener April 27, 2012, 9:01 a.m. UTC | #3
On Fri, Apr 27, 2012 at 8:20 AM, Tom de Vries <Tom_deVries@mentor.com> wrote:
> On 26/04/12 12:20, Richard Guenther wrote:
>> On Wed, Apr 25, 2012 at 11:56 PM, Tom de Vries <Tom_deVries@mentor.com> wrote:
>>> On 25/04/12 11:57, Richard Guenther wrote:
>>> <SNIP>
>>>>>>>> Hmm.  I'm not sure we can conclude that they have the same value!
>>>>>>>>
>>>>>>>> +int bar (int);
>>>>>>>> +void baz (int);
>>>>>>>> +void bla (int);
>>>>>>>> +
>>>>>>>> +void
>>>>>>>> +foo (int y)
>>>>>>>> +{
>>>>>>>> +  int a;
>>>>>>>> +  if (y == 6)
>>>>>>>> +    {
>>>>>>>> +      bla (5);
>>>>>>>> +      a = bar (7);
>>>>>>>> +    }
>>>>>>>> +  else
>>>>>>>> +    {
>>>>>>>> +      bla (5);
>>>>>>>> +      a = bar (7);
>>>>>>>> +    }
>>>>>>>> +  baz (a);
>>>>>>>> +}
>>>>>>>>
>>>>>>>> I can implement bla as
>>>>>>>>
>>>>>>>> void bla(int) { a = random (); }
>>>>>>>>
>>>>>>>> thus a would not have the same value on both paths
>>>>>>
>>>>>> I think it's hard to decide whether they have the same value or not, since we
>>>>>> cannot write a test-case that compares the results of calls from 2 branches of
>>>>>> an if statement.
>>>> void *foo ()
>>>> {
>>>>   return __builtin_return_address (0);
>>>> }
>>>>
>>>> void *bar (_Bool b)
>>>> {
>>>>   if (b)
>>>>     return foo ();
>>>>   else
>>>>     return foo ();
>>>> }
>>>>
>>>> int main()
>>>> {
>>>>   if (bar(true) == bar(false))
>>>>     abort ();
>>>> }
>>>>
>>>> ok ... outside of the scope of standard "C", but we certainly _can_ do this.
>>>> Which would question tail-merging the above at all, of course.
>>>>
>>>
>>> I added noinline to make sure there's no variation from inlining.
>>> ...
>>> extern void abort (void) __attribute__ ((__nothrow__)) __attribute__
>>> ((__noreturn__));
>>>
>>> __attribute__((noinline))
>>> void *foo ()
>>> {
>>>  return __builtin_return_address (0);
>>> }
>>>
>>> __attribute__((noinline))
>>> void *bar (int b)
>>> {
>>>  if (b)
>>>    return foo ();
>>>  else
>>>    return foo ();
>>> }
>>>
>>> int main()
>>> {
>>>  void *a, *b;
>>>  a = bar (1);
>>>  b = bar (0);
>>>  if (a == b)
>>>    abort ();
>>>  return 0;
>>> }
>>> ...
>>>
>>> This test-case passes with:
>>> ...
>>> gcc -O2 calls.c -fno-optimize-sibling-calls -fno-tree-tail-merge -fno-crossjumping
>>> ...
>>>
>>> and fails with:
>>> ...
>>> gcc -O2 calls.c -fno-optimize-sibling-calls -fno-tree-tail-merge -fcrossjumping
>>> ...
>>>
>>> with the proposed patch, it also fails with:
>>> ...
>>> gcc -O2 calls.c -fno-optimize-sibling-calls -ftree-tail-merge -fno-crossjumping
>>> ...
>>>
>>> Tree-tail-merge performs the same transformation as cross-jumping for this
>>> test-case. I think that the transformation is valid, and that the test-case
>>> behaves as expected. Subsequent calls to foo may or may not return the same
>>> value, depending on the optimizations, we can't assert a != b, that's just the
>>> nature of __builtin_return_address.
>>>
>>> Btw, this example is not what I meant when I said 'a test-case that compares the
>>> results of calls from 2 branches of an if statement'. I tried to say that the
>>> semantics of C is defined in terms of taking either 1 branch or the other,
>>> rather than executing both in parallel, after which both results would be
>>> available. From that point of view, this example compares subsequent calls to
>>> foo, not parallel.
>>
>> Ah, I see.  Btw, I was not suggesting that we should not optimize the above.
>>
>>>>>> I started looking at the problem in terms of subsequent calls. Consider the
>>>>>> imaginary attribute nosideeffect, similar to const and pure.
>>>>>>
>>>>>> const:
>>>>>> - no effects except the return value
>>>>>> - the return value depends only on the parameters
>>>>>>
>>>>>> pure:
>>>>>> - no effects except the return value
>>>>>> - the return value depends only on the parameters and/or global variables
>>>>>>
>>>>>> nosideeffect:
>>>>>> - no effects except the return value
>>>>>>
>>>>>> The code fragment:
>>>>>> ...
>>>>>> extern int f (int) __attribute ((nosideeffect));
>>>>>> int g()
>>>>>> {
>>>>>>  int a;
>>>>>>  int b;
>>>>>>  a = f (5);
>>>>>>  b = f (5);
>>>>>>  return a + b;
>>>>>> }
>>>>>> ...
>>>>>>
>>>>>> would result in a gimple sequence more or less like this:
>>>>>> ...
>>>>>>  # VUSE <.MEMD.1712_4(D)>
>>>>>>  aD.1707_1 = fD.1704 (5);
>>>>>>  # VUSE <.MEMD.1712_4(D)>
>>>>>>  bD.1708_2 = fD.1704 (5);
>>>>>>  D.1710_3 = aD.1707_1 + bD.1708_2;
>>>>>>  # VUSE <.MEMD.1712_6>
>>>>>>  return D.1710_3;
>>>>>> ...
>>>>>>
>>>>>> The value numbering modification I'm proposing would say that a and b have the
>>>>>> same value, which is not true. I updated the patch to ensure in this case that a
>>>>>> and b would be seen as different.
>>>>>> Note that things won't break if the attribute is missing on a function, since
>>>>>> that just means that the function would have a vdef, and there would be no 2
>>>>>> subsequent function calls with the same vuse.
>>>>>>
>>>>>>>> - but that is not what
>>>>>>>> matters - because obviously we can still merge the blocks.
>>>>>>
>>>>>> Right.
>>>>>>
>>>>>>>>  That is,
>>>>>>>> we cannot be sure that the side-effects on memory of bla (5) and bla (5)
>>>>>>>> are the same.  But is that not what your value-numbering changes will assert?
>>>>>>
>>>>>> In the case of calls in different branches of an control flow statement, we can
>>>>>> aggressively assume they're the same, since we cannot write a check that would
>>>>>> start failing.
>>>> Apart from the above ... which shows that even the returned values can matter.
>>>>
>>>>>> In the case of subsequent calls with side effects, the vuses will never be the same.
>>>>>>
>>>>>> In the case of subsequent calls without side effects, but not pure or const, we
>>>>>> can't assume they have the same result. AFAIK, there's currently no way to
>>>>>> specify such a function, but the updated patch should be able handle this.
>>>>>>
>>>>>>>> (not sure how you achieve this, but it looks like you insert/lookup general
>>>>>>>> calls, thus not pure/const ones
>>>>>>
>>>>>> Correct.
>>>>>>
>>>>>>>> - that could easily lead to miscompiles(?)).
>>>>>>>>
>>>> And to increased compile-time (not sure if it matters).
>>>>
>>>>>> I have bootstrapped and reg-tested the updated (and the previous) patch on
>>>>>> x86_64 (ada inclusive), and found no issues.
>>>>>>
>>>>>> Can you think of a test-case that fails with the previous or the updated patch?
>>>> I see you do not handle
>>>>
>>>
>>> I assume you mean 'not handle' in the sense of 'handle conservatively'.
>>
>> Yes.
>>
>>>> struct S { int i; };
>>>> struct S foo (void);
>>>> struct S bar (void)
>>>> {
>>>>   struct S s1, s2;
>>>>   if (...)
>>>>    s = foo ();
>>>>   else
>>>>    s = foo ();
>>>>
>>>> because the calls have a LHS that is not an SSA name.
>>>
>>> Indeed, the gvn patch handles this example conservatively, and tree-tail-merge
>>> fails to optimize this test-case:
>>> ...
>>> struct S { int i; };
>>> extern struct S foo (void);
>>> extern int foo2 (void);
>>> struct S s;
>>> int bar (int c) {
>>>  int r;
>>>  if (c)
>>>    {
>>>      s = foo ();
>>>      r = foo2 ();
>>>    }
>>>  else
>>>    {
>>>      s = foo ();
>>>      r = foo2 ();
>>>    }
>>>  return r;
>>> }
>>> ...
>>>
>>> A todo.
>>>
>>>>  In general
>>>> value-numbering virtual operands will change what further consumers lookup.
>>>> So changing
>>>>
>>>>   # MEM_2 = VUSE <MEM_3>
>>>>   foo ();
>>>>   # VUSE <MEM_2>
>>>>   ... = x;
>>>>
>>>> by value-numbering MEM_2 to MEM_3 will make the lookup of ... = x use
>>>> MEM_3, not considering a clobber by foo.
>>>
>>> AFAIU, the patch doesn't do this type of value numbering.
>>>
>>>> This might not actually be an
>>>> issue (after all we do this for stores, too).
>>>>
>>>> I see you do not handle
>>>>
>>>> int bar (int);
>>>> void baz (int);
>>>> void bla (int);
>>>> int s;
>>>> void
>>>> foo (int y)
>>>> {
>>>>   int a;
>>>>   if (y == 6)
>>>>     {
>>>>       s = 1;
>>>>       bla (5);
>>>>       a = bar (7);
>>>>     }
>>>>   else
>>>>     {
>>>>       s = 1;
>>>>       bla (5);
>>>>       a = bar (7);
>>>>     }
>>>>   baz (a);
>>>> }
>>>>
>>>
>>> I think a more basic example is this one from PR52009, which is basically about
>>> stores:
>>> ...
>>> int z;
>>>
>>> void
>>> foo (int y)
>>> {
>>>  int a;
>>>  if (y == 6)
>>>    z = 5;
>>>  else
>>>    z = 5;
>>> }
>>> ...
>>>
>>> I proposed a patch for that separately:
>>> http://gcc.gnu.org/ml/gcc-patches/2012-01/msg01731.html. Mmmm, looking at the
>>> patch now, I suppose I could use part of that patch as infrastructure for the
>>> todo above.
>>
>> Yes, I think so.
>>
>>>> and I don't see how you need the new ref->vdef member
>>>
>>> I'm regarding the result of the ref as a compound <vdef, result>. Most of the
>>> time I could (as I did in an earlier version of the patch) deduce vdef as
>>> gimple_vdef (SSA_NAME_DEF_STMT (result)), but not always, in particular when
>>> result is NULL_TREE.
>>>
>>>> - that is also
>>>> not handled consistently (not in hashing or comparing, etc.).
>>>
>>> I handle the vdef as a result.
>>
>> Ok.  Can you please rename ->vdef to ->result_vdef then?
>>
>
> Done.
>
>>>> Can't you
>>>> always use SSA_VAL (vuse)?
>>>
>>> I'm interested in storing the vdef, not the vuse.
>>
>> I meant that result_vdef will always be SSA_VAL (vuse) (the vdef of
>> the stmt defining vuse).  No?
>>
>
> Right.
>
>>>>
>>>> Btw, what's
>>>>
>>>> +  if (vdef)
>>>> +    VN_INFO (vdef)->use_processed = true;
>>>>
>>>> for?  We arrive here by visiting the VDEF after all and visit_use does already
>>>> set the use processed.  Is it that we end up visiting the call twice because
>>>> of the lhs SSA name definition and the virtual operand definition?
>>>
>>> Yes.
>>>
>>>> If so then
>>>> visit_use should instead do
>>>>
>>>> FOR_EACH_SSA_DEF_OPERAND (...)
>>>>    VN_INFO (def)->use_processed = true;
>>>>
>>>> and defs_to_varying then no longer needs to do that.
>>>>
>>>
>>> factored mark_use_processed out of defs_to_varying, calling mark_use_processed
>>> at start of visit_use.
>>
>> Thanks.  Which case hits SSA_NAME_DEF_STMT == NULL btw?  The
>> default-def for the virtual operand?
>>
>
> SSA_NAME_DEF_STMT == NULL is tested by !stmt in mark_use_processed.
>
> I suppose I better use SSA_NAME_IS_DEFAULT_DEF, as is done in visit_use. I'm
> still somewhat confused as to what is the difference. I suppose we might have a
> nop statement that is a default def, while it's unequal to NULL.

All default defs have a GIMPLE_NOP as definition, so I was wondering for
which case you see a NULL SSA_NAME_DEF_STMT (and only came up
with a virtual operand maybe?)

> Updated in committed patch.
>
>>>>>> Otherwise, ok for trunk?
>>>> Let's give this one more iteration, but I guess the basic idea should work.
>>>>
>>>
>>> bootstrapped and reg-tested on x86_64 (ada inclusive).
>>>
>>> Is this patch ok, or is the todo required?
>>
>> No, you can followup with that.
>>
>> Ok with s/vdef/result_vdef/
>>
>
> retested and committed with this change and SSA_NAME_IS_DEFAULT_DEF instead of
> '!stmt'.

Thanks,
Richard.

> Thanks.
> - Tom
diff mbox

Patch

Index: gcc/tree-ssa-sccvn.c
===================================================================
--- gcc/tree-ssa-sccvn.c (revision 185028)
+++ gcc/tree-ssa-sccvn.c (working copy)
@@ -2525,6 +2525,30 @@  set_ssa_val_to (tree from, tree to)
   return false;
 }
 
+/* Mark as processed all the definitions in the defining stmt of USE, or
+   the USE itself.  */
+
+static void
+mark_use_processed (tree use)
+{
+  ssa_op_iter iter;
+  def_operand_p defp;
+  gimple stmt = SSA_NAME_DEF_STMT (use);
+
+  if (!stmt || gimple_code (stmt) == GIMPLE_PHI)
+    {
+      VN_INFO (use)->use_processed = true;
+      return;
+    }
+
+  FOR_EACH_SSA_DEF_OPERAND (defp, stmt, iter, SSA_OP_ALL_DEFS)
+    {
+      tree def = DEF_FROM_PTR (defp);
+
+      VN_INFO (def)->use_processed = true;
+    }
+}
+
 /* Set all definitions in STMT to value number to themselves.
    Return true if a value number changed. */
 
@@ -2538,8 +2562,6 @@  defs_to_varying (gimple stmt)
   FOR_EACH_SSA_DEF_OPERAND (defp, stmt, iter, SSA_OP_ALL_DEFS)
     {
       tree def = DEF_FROM_PTR (defp);
-
-      VN_INFO (def)->use_processed = true;
       changed |= set_ssa_val_to (def, def);
     }
   return changed;
@@ -2598,27 +2620,41 @@  visit_reference_op_call (tree lhs, gimpl
 {
   bool changed = false;
   struct vn_reference_s vr1;
-  tree result;
+  vn_reference_t vnresult = NULL;
   tree vuse = gimple_vuse (stmt);
+  tree vdef = gimple_vdef (stmt);
 
   vr1.vuse = vuse ? SSA_VAL (vuse) : NULL_TREE;
   vr1.operands = valueize_shared_reference_ops_from_call (stmt);
   vr1.type = gimple_expr_type (stmt);
   vr1.set = 0;
   vr1.hashcode = vn_reference_compute_hash (&vr1);
-  result = vn_reference_lookup_1 (&vr1, NULL);
-  if (result)
+  vn_reference_lookup_1 (&vr1, &vnresult);
+
+  if (vnresult)
     {
-      changed = set_ssa_val_to (lhs, result);
-      if (TREE_CODE (result) == SSA_NAME
-	  && VN_INFO (result)->has_constants)
-	VN_INFO (lhs)->has_constants = true;
+      if (vnresult->vdef)
+	changed |= set_ssa_val_to (vdef, vnresult->vdef);
+
+      if (!vnresult->result && lhs)
+	vnresult->result = lhs;
+
+      if (vnresult->result && lhs)
+	{
+	  changed |= set_ssa_val_to (lhs, vnresult->result);
+
+	  if (VN_INFO (vnresult->result)->has_constants)
+	    VN_INFO (lhs)->has_constants = true;
+	}
     }
   else
     {
       void **slot;
       vn_reference_t vr2;
-      changed = set_ssa_val_to (lhs, lhs);
+      if (vdef)
+	changed |= set_ssa_val_to (vdef, vdef);
+      if (lhs)
+	changed |= set_ssa_val_to (lhs, lhs);
       vr2 = (vn_reference_t) pool_alloc (current_info->references_pool);
       vr2->vuse = vr1.vuse;
       vr2->operands = valueize_refs (create_reference_ops_from_call (stmt));
@@ -2626,6 +2662,7 @@  visit_reference_op_call (tree lhs, gimpl
       vr2->set = vr1.set;
       vr2->hashcode = vr1.hashcode;
       vr2->result = lhs;
+      vr2->vdef = vdef;
       slot = htab_find_slot_with_hash (current_info->references,
 				       vr2, vr2->hashcode, INSERT);
       if (*slot)
@@ -2795,7 +2832,6 @@  visit_reference_op_store (tree lhs, tree
 	 going to valueize the references in-place.  */
       if ((vdef = gimple_vdef (stmt)))
 	{
-	  VN_INFO (vdef)->use_processed = true;
 	  changed |= set_ssa_val_to (vdef, vdef);
 	}
 
@@ -2817,7 +2853,6 @@  visit_reference_op_store (tree lhs, tree
       def = gimple_vdef (stmt);
       use = gimple_vuse (stmt);
 
-      VN_INFO (def)->use_processed = true;
       changed |= set_ssa_val_to (def, SSA_VAL (use));
     }
 
@@ -3167,7 +3202,7 @@  visit_use (tree use)
   bool changed = false;
   gimple stmt = SSA_NAME_DEF_STMT (use);
 
-  VN_INFO (use)->use_processed = true;
+  mark_use_processed (use);
 
   gcc_assert (!SSA_NAME_IN_FREE_LIST (use));
   if (dump_file && (dump_flags & TDF_DETAILS)
@@ -3186,8 +3221,7 @@  visit_use (tree use)
     {
       if (gimple_code (stmt) == GIMPLE_PHI)
 	changed = visit_phi (stmt);
-      else if (!gimple_has_lhs (stmt)
-	       || gimple_has_volatile_ops (stmt))
+      else if (gimple_has_volatile_ops (stmt))
 	changed = defs_to_varying (stmt);
       else if (is_gimple_assign (stmt))
 	{
@@ -3349,34 +3383,44 @@  visit_use (tree use)
 
 	  /* ???  We could try to simplify calls.  */
 
-	  if (stmt_has_constants (stmt)
-	      && TREE_CODE (lhs) == SSA_NAME)
-	    VN_INFO (lhs)->has_constants = true;
-	  else if (TREE_CODE (lhs) == SSA_NAME)
+	  if (lhs && TREE_CODE (lhs) == SSA_NAME)
 	    {
-	      /* We reset expr and constantness here because we may
-		 have been value numbering optimistically, and
-		 iterating. They may become non-constant in this case,
-		 even if they were optimistically constant. */
-	      VN_INFO (lhs)->has_constants = false;
-	      VN_INFO (lhs)->expr = NULL_TREE;
+	      if (stmt_has_constants (stmt))
+		VN_INFO (lhs)->has_constants = true;
+	      else
+		{
+		  /* We reset expr and constantness here because we may
+		     have been value numbering optimistically, and
+		     iterating.  They may become non-constant in this case,
+		     even if they were optimistically constant.  */
+		  VN_INFO (lhs)->has_constants = false;
+		  VN_INFO (lhs)->expr = NULL_TREE;
+		}
+
+	      if (SSA_NAME_OCCURS_IN_ABNORMAL_PHI (lhs))
+		{
+		  changed = defs_to_varying (stmt);
+		  goto done;
+		}
 	    }
 
-	  if (TREE_CODE (lhs) == SSA_NAME
-	      && SSA_NAME_OCCURS_IN_ABNORMAL_PHI (lhs))
-	    changed = defs_to_varying (stmt);
 	  /* ???  We should handle stores from calls.  */
-	  else if (TREE_CODE (lhs) == SSA_NAME)
+	  if (!gimple_call_internal_p (stmt)
+	      && (gimple_call_flags (stmt) & (ECF_PURE | ECF_CONST)
+		  /* If the call has side effects, subsequent calls won't have
+		     the same incoming vuse, so it's save to assume
+		     equality.  */
+		  || gimple_has_side_effects (stmt))
+	      && ((lhs && TREE_CODE (lhs) == SSA_NAME)
+		  || (!lhs && gimple_vdef (stmt))))
 	    {
-	      if (!gimple_call_internal_p (stmt)
-		  && gimple_call_flags (stmt) & (ECF_PURE | ECF_CONST))
-		changed = visit_reference_op_call (lhs, stmt);
-	      else
-		changed = defs_to_varying (stmt);
+	      changed = visit_reference_op_call (lhs, stmt);
 	    }
 	  else
 	    changed = defs_to_varying (stmt);
 	}
+      else
+	changed = defs_to_varying (stmt);
     }
  done:
   return changed;
Index: gcc/tree-ssa-sccvn.h
===================================================================
--- gcc/tree-ssa-sccvn.h (revision 185028)
+++ gcc/tree-ssa-sccvn.h (working copy)
@@ -110,6 +110,7 @@  typedef struct vn_reference_s
   tree type;
   VEC (vn_reference_op_s, heap) *operands;
   tree result;
+  tree vdef;
 } *vn_reference_t;
 typedef const struct vn_reference_s *const_vn_reference_t;
 
Index: gcc/testsuite/gcc.dg/pr51879-2.c
===================================================================
--- /dev/null (new file)
+++ gcc/testsuite/gcc.dg/pr51879-2.c (revision 0)
@@ -0,0 +1,19 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-pre" } */
+
+int bar (int);
+void baz (int);
+
+void
+foo (int y)
+{
+  int a;
+  if (y)
+    baz (bar (7) + 6);
+  else
+    baz (bar (7) + 6);
+}
+
+/* { dg-final { scan-tree-dump-times "bar \\(" 1 "pre"} } */
+/* { dg-final { scan-tree-dump-times "baz \\(" 1 "pre"} } */
+/* { dg-final { cleanup-tree-dump "pre" } } */
Index: gcc/testsuite/gcc.dg/pr51879-6.c
===================================================================
--- /dev/null (new file)
+++ gcc/testsuite/gcc.dg/pr51879-6.c (revision 0)
@@ -0,0 +1,27 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-pre" } */
+
+
+int bar (int);
+void baz (int);
+void bla (int);
+
+void
+foo (int y)
+{
+  int a;
+  if (y == 6)
+    {
+      bla (5);
+      a = bar (7);
+    }
+  else
+    {
+      bla (5);
+      a = bar (7);
+    }
+  baz (a);
+}
+
+/* { dg-final { scan-tree-dump-times "bar \\(" 1 "pre"} } */
+/* { dg-final { cleanup-tree-dump "pre" } } */
Index: gcc/testsuite/gcc.dg/pr51879.c
===================================================================
--- /dev/null (new file)
+++ gcc/testsuite/gcc.dg/pr51879.c (revision 0)
@@ -0,0 +1,19 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-pre" } */
+
+int bar (int);
+void baz (int);
+
+void
+foo (int y)
+{
+  int a;
+  if (y == 6)
+    a = bar (7);
+  else
+    a = bar (7);
+  baz (a);
+}
+
+/* { dg-final { scan-tree-dump-times "bar \\(" 1 "pre"} } */
+/* { dg-final { cleanup-tree-dump "pre" } } */
Index: gcc/testsuite/gcc.dg/pr51879-3.c
===================================================================
--- /dev/null (new file)
+++ gcc/testsuite/gcc.dg/pr51879-3.c (revision 0)
@@ -0,0 +1,19 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-pre" } */
+
+int bar (int);
+void baz (int);
+
+void
+foo (int y)
+{
+  int a;
+  if (y == 6)
+    a = bar (7) + 6;
+  else
+    a = bar (7) + 6;
+  baz (a);
+}
+
+/* { dg-final { scan-tree-dump-times "bar \\(" 1 "pre"} } */
+/* { dg-final { cleanup-tree-dump "pre" } } */
Index: gcc/testsuite/gcc.dg/pr51879-4.c
===================================================================
--- /dev/null (new file)
+++ gcc/testsuite/gcc.dg/pr51879-4.c (revision 0)
@@ -0,0 +1,16 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-pre" } */
+
+int bar (int);
+void baz (int);
+
+int foo (int y)
+{
+  int a, b;
+  a = bar (7) + 6;
+  b = bar (7) + 6;
+  return a + b;
+}
+
+/* { dg-final { scan-tree-dump-times "bar \\(" 2 "pre"} } */
+/* { dg-final { cleanup-tree-dump "pre" } } */