diff mbox

[PR,tree-optimization/67955] Exploit PTA in DSE

Message ID 4a6ac23e-162f-5d06-bc55-87bb17783bf0@redhat.com
State New
Headers show

Commit Message

Jeff Law Jan. 4, 2017, 7:24 p.m. UTC
On 01/04/2017 11:55 AM, Jeff Law wrote:
> On 12/09/2016 01:28 AM, Richard Biener wrote:
>> On Wed, Dec 7, 2016 at 12:18 AM, Jeff Law <law@redhat.com> wrote:
>>>
>>>
>>> So I was going through the various DSE related bugs as stumbled across
>>> 67955.
>>>
>>> The basic issue in 67955 is that DSE is too simplistic when trying to
>>> determine if two writes hit the same memory address.  There are cases
>>> were
>>> PTA analysis can get us that information.
>>>
>>> The unfortunate problem is that PTA can generate a single points-to
>>> set for
>>> pointers to different elements in the same array.  So we can only
>>> exploit a
>>> special case.  Namely when we get a PTA singleton and the size of the
>>> store
>>> is the same size of the pointed-to object.
>>>
>>> That doesn't happen in a bootstrap, but it does happen for the
>>> testcase in
>>> the BZ as well as a handful of tests in the testsuite -- Tom reported 6
>>> unique tests where it triggered, I ran my own tests where two more
>>> were spit
>>> out.  Not huge.  But the cost is low.
>>>
>>> All that I've done with Tom's patch is update the call into the PTA
>>> system.
>>> It's very clearly Tom's work.
>>>
>>> Bootstrapped and regression tested on x86_64-linux-gnu.  Installing
>>> on the
>>> trunk.
>>
>> Just double-checked it doesn't break
>>
>> int foo (int *p, int b)
>> {
>>   int *q;
>>   int i = 1;
>>   if (b)
>>     q = &i;
>>   else
>>     q = (void *)0;
>>   *q = 2;
>>   return i;
>> }
> Right.  ISTM this shouldn't have a singleton points-to set and thus
> shouldn't change behavior.    But looking at things more closely, it
> looks like points-to-null is distinct from the points-to variables.  ie,
> we want to know if its a singleton and ! null.  Thus we have to check
> pt_solution_singleton_or_null_p (pi->pt, &pt_uid) && !pi->pt->null
>
> I think it's working for you because the path isolation code splits off
> the NULL dereference path.  Adding
> -fno-isolate-erroneous-paths-dereference to the switches shows DSE2,
> incorrectly IMHO, removing the i = 1 store.
>
> Thoughts?
The more I think about this the more I'm sure we need to verify pt.null 
is not in the points-to set.    I've taken the above testcase and added 
it as a negative test.  Bootstrapped, regression tested and committed to 
the trunk along with the other minor cleanups you pointed out.

Thanks,

Jeff

Comments

Richard Biener Jan. 5, 2017, 8:34 a.m. UTC | #1
On Wed, Jan 4, 2017 at 8:24 PM, Jeff Law <law@redhat.com> wrote:
> On 01/04/2017 11:55 AM, Jeff Law wrote:
>>
>> On 12/09/2016 01:28 AM, Richard Biener wrote:
>>>
>>> On Wed, Dec 7, 2016 at 12:18 AM, Jeff Law <law@redhat.com> wrote:
>>>>
>>>>
>>>>
>>>> So I was going through the various DSE related bugs as stumbled across
>>>> 67955.
>>>>
>>>> The basic issue in 67955 is that DSE is too simplistic when trying to
>>>> determine if two writes hit the same memory address.  There are cases
>>>> were
>>>> PTA analysis can get us that information.
>>>>
>>>> The unfortunate problem is that PTA can generate a single points-to
>>>> set for
>>>> pointers to different elements in the same array.  So we can only
>>>> exploit a
>>>> special case.  Namely when we get a PTA singleton and the size of the
>>>> store
>>>> is the same size of the pointed-to object.
>>>>
>>>> That doesn't happen in a bootstrap, but it does happen for the
>>>> testcase in
>>>> the BZ as well as a handful of tests in the testsuite -- Tom reported 6
>>>> unique tests where it triggered, I ran my own tests where two more
>>>> were spit
>>>> out.  Not huge.  But the cost is low.
>>>>
>>>> All that I've done with Tom's patch is update the call into the PTA
>>>> system.
>>>> It's very clearly Tom's work.
>>>>
>>>> Bootstrapped and regression tested on x86_64-linux-gnu.  Installing
>>>> on the
>>>> trunk.
>>>
>>>
>>> Just double-checked it doesn't break
>>>
>>> int foo (int *p, int b)
>>> {
>>>   int *q;
>>>   int i = 1;
>>>   if (b)
>>>     q = &i;
>>>   else
>>>     q = (void *)0;
>>>   *q = 2;
>>>   return i;
>>> }
>>
>> Right.  ISTM this shouldn't have a singleton points-to set and thus
>> shouldn't change behavior.    But looking at things more closely, it
>> looks like points-to-null is distinct from the points-to variables.  ie,
>> we want to know if its a singleton and ! null.  Thus we have to check
>> pt_solution_singleton_or_null_p (pi->pt, &pt_uid) && !pi->pt->null
>>
>> I think it's working for you because the path isolation code splits off
>> the NULL dereference path.  Adding
>> -fno-isolate-erroneous-paths-dereference to the switches shows DSE2,
>> incorrectly IMHO, removing the i = 1 store.
>>
>> Thoughts?
>
> The more I think about this the more I'm sure we need to verify pt.null is
> not in the points-to set.    I've taken the above testcase and added it as a
> negative test.  Bootstrapped, regression tested and committed to the trunk
> along with the other minor cleanups you pointed out.

Note disabling this for pt.null == 1 makes it pretty useless given we compute
that conservatively to always 1 in points-to analysis (and only VRP ever sets
it to zero).  See PTAs find_what_p_points_to.  This is because PTA does
not conservatively compute whether a pointer may be NULL (all bugs, I have
partly fixed some and have an incomplete patch to fix others -- at the point
I looked into this we had no users of pt.null info and thus I decided the
extra constraints and complexity wasn't worth the compile-time/memory-use).

Without -fnon-call-exceptions removing the *0 = 2 store is IMHO ok, so we
only have to make sure to not break the exception case.

For

int foo (int *p, int b)
{
  int *q;
  int i = 1;
  if (b)
    q = &i;
  else
    q = (void *)0;
  *q = 2;
  i = 3;
  return *q;
}

we remove all stores but the last store to i and the load from q (but we don't
replace q with &i here, a missed optimization if removing the other stores is
valid).

So for

int foo (int *p, int b)
{
  int i = 1;
  try {
  int *q;
  if (b)
    q = &i;
  else
    q = (int *)0;
  *q = 2;
  } catch (...) {
      return 0;
  }
  return i;
}

we do not remove the store to i with -fnon-call-exceptions.  Which means we are
fine not testing for pt.null != 1.

?

Richard.

> Thanks,
>
> Jeff
>
>
> diff --git a/gcc/ChangeLog b/gcc/ChangeLog
> index d43c9bc..3a7ad9d 100644
> --- a/gcc/ChangeLog
> +++ b/gcc/ChangeLog
> @@ -1,3 +1,11 @@
> +2017-01-04  Jeff Law  <law@redhat.com>
> +
> +       PR tree-optimizatin/67955
> +       * tree-ssa-alias.c (same_addr_size_stores_p): Check offsets first.
> +       Allow any SSA_VAR_P as the base objects.  Use integer_zerop.  Verify
> +       the points-to solution does not include pt_null.  Use DECL_PT_UID
> +       unconditionally.
> +
>  2017-01-04  Uros Bizjak  <ubizjak@gmail.com>
>
>         * config/i386/i386.md (HI/SImode test with imm to QImode splitters):
> diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
> index d8ff32f..0cbc8cc 100644
> --- a/gcc/testsuite/ChangeLog
> +++ b/gcc/testsuite/ChangeLog
> @@ -1,3 +1,8 @@
> +2017-01-03  Jeff Law  <law@redhat.com>
> +
> +       PR tree-optimization/67955
> +       * gcc.dg/tree-ssa/ssa-dse-28.c: New test.
> +
>  2017-01-04  Marek Polacek  <polacek@redhat.com>
>
>         PR c++/77545
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-dse-28.c
> b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dse-28.c
> new file mode 100644
> index 0000000..d35377b
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dse-28.c
> @@ -0,0 +1,20 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fdump-tree-dse-details -fexceptions
> -fnon-call-exceptions -fno-isolate-erroneous-paths-dereference" } */
> +
> +
> +int foo (int *p, int b)
> +{
> +  int *q;
> +  int i = 1;
> +  if (b)
> +    q = &i;
> +  else
> +    q = (void *)0;
> +  *q = 2;
> +  return i;
> +}
> +
> +/* { dg-final { scan-tree-dump-not "Deleted dead store" "dse1"} } */
> +/* { dg-final { scan-tree-dump-not "Deleted dead store" "dse2"} } */
> +/* { dg-final { scan-tree-dump-not "Deleted dead store" "dse3"} } */
> +
> diff --git a/gcc/tree-ssa-alias.c b/gcc/tree-ssa-alias.c
> index 0a9fc74..871fa12 100644
> --- a/gcc/tree-ssa-alias.c
> +++ b/gcc/tree-ssa-alias.c
> @@ -2326,9 +2326,13 @@ same_addr_size_stores_p (tree base1, HOST_WIDE_INT
> offset1, HOST_WIDE_INT size1,
>                          tree base2, HOST_WIDE_INT offset2, HOST_WIDE_INT
> size2,
>                          HOST_WIDE_INT max_size2)
>  {
> -  /* For now, just handle VAR_DECL.  */
> -  bool base1_obj_p = VAR_P (base1);
> -  bool base2_obj_p = VAR_P (base2);
> +  /* Offsets need to be 0.  */
> +  if (offset1 != 0
> +      || offset2 != 0)
> +    return false;
> +
> +  bool base1_obj_p = SSA_VAR_P (base1);
> +  bool base2_obj_p = SSA_VAR_P (base2);
>
>    /* We need one object.  */
>    if (base1_obj_p == base2_obj_p)
> @@ -2356,13 +2360,9 @@ same_addr_size_stores_p (tree base1, HOST_WIDE_INT
> offset1, HOST_WIDE_INT size1,
>    if (size1 != size2)
>      return false;
>
> -  /* Offsets need to be 0.  */
> -  if (offset1 != 0
> -      || offset2 != 0)
> -    return false;
>
>    /* Check that memref is a store to pointer with singleton points-to info.
> */
> -  if (!tree_int_cst_equal (TREE_OPERAND (memref, 1), integer_zero_node))
> +  if (!integer_zerop (TREE_OPERAND (memref, 1)))
>      return false;
>    tree ptr = TREE_OPERAND (memref, 0);
>    if (TREE_CODE (ptr) != SSA_NAME)
> @@ -2373,10 +2373,13 @@ same_addr_size_stores_p (tree base1, HOST_WIDE_INT
> offset1, HOST_WIDE_INT size1,
>        || !pt_solution_singleton_or_null_p (&pi->pt, &pt_uid))
>      return false;
>
> +  /* If the solution has a singleton and NULL, then we can not
> +     be sure that the two stores hit the same address.  */
> +  if (pi->pt.null)
> +    return false;
> +
>    /* Check that ptr points relative to obj.  */
> -  unsigned int obj_uid = (DECL_PT_UID_SET_P (obj)
> -                         ? DECL_PT_UID (obj)
> -                         : DECL_UID (obj));
> +  unsigned int obj_uid = DECL_PT_UID (obj);
>    if (obj_uid != pt_uid)
>      return false;
>
>
Christophe Lyon Jan. 6, 2017, 3:22 p.m. UTC | #2
Hi Jeff,


On 5 January 2017 at 09:34, Richard Biener <richard.guenther@gmail.com> wrote:
> On Wed, Jan 4, 2017 at 8:24 PM, Jeff Law <law@redhat.com> wrote:
>> On 01/04/2017 11:55 AM, Jeff Law wrote:
>>>
>>> On 12/09/2016 01:28 AM, Richard Biener wrote:
>>>>
>>>> On Wed, Dec 7, 2016 at 12:18 AM, Jeff Law <law@redhat.com> wrote:
>>>>>
>>>>>
>>>>>
>>>>> So I was going through the various DSE related bugs as stumbled across
>>>>> 67955.
>>>>>
>>>>> The basic issue in 67955 is that DSE is too simplistic when trying to
>>>>> determine if two writes hit the same memory address.  There are cases
>>>>> were
>>>>> PTA analysis can get us that information.
>>>>>
>>>>> The unfortunate problem is that PTA can generate a single points-to
>>>>> set for
>>>>> pointers to different elements in the same array.  So we can only
>>>>> exploit a
>>>>> special case.  Namely when we get a PTA singleton and the size of the
>>>>> store
>>>>> is the same size of the pointed-to object.
>>>>>
>>>>> That doesn't happen in a bootstrap, but it does happen for the
>>>>> testcase in
>>>>> the BZ as well as a handful of tests in the testsuite -- Tom reported 6
>>>>> unique tests where it triggered, I ran my own tests where two more
>>>>> were spit
>>>>> out.  Not huge.  But the cost is low.
>>>>>
>>>>> All that I've done with Tom's patch is update the call into the PTA
>>>>> system.
>>>>> It's very clearly Tom's work.
>>>>>
>>>>> Bootstrapped and regression tested on x86_64-linux-gnu.  Installing
>>>>> on the
>>>>> trunk.
>>>>
>>>>
>>>> Just double-checked it doesn't break
>>>>
>>>> int foo (int *p, int b)
>>>> {
>>>>   int *q;
>>>>   int i = 1;
>>>>   if (b)
>>>>     q = &i;
>>>>   else
>>>>     q = (void *)0;
>>>>   *q = 2;
>>>>   return i;
>>>> }
>>>
>>> Right.  ISTM this shouldn't have a singleton points-to set and thus
>>> shouldn't change behavior.    But looking at things more closely, it
>>> looks like points-to-null is distinct from the points-to variables.  ie,
>>> we want to know if its a singleton and ! null.  Thus we have to check
>>> pt_solution_singleton_or_null_p (pi->pt, &pt_uid) && !pi->pt->null
>>>
>>> I think it's working for you because the path isolation code splits off
>>> the NULL dereference path.  Adding
>>> -fno-isolate-erroneous-paths-dereference to the switches shows DSE2,
>>> incorrectly IMHO, removing the i = 1 store.
>>>
>>> Thoughts?
>>
>> The more I think about this the more I'm sure we need to verify pt.null is
>> not in the points-to set.    I've taken the above testcase and added it as a
>> negative test.  Bootstrapped, regression tested and committed to the trunk
>> along with the other minor cleanups you pointed out.
>
> Note disabling this for pt.null == 1 makes it pretty useless given we compute
> that conservatively to always 1 in points-to analysis (and only VRP ever sets
> it to zero).  See PTAs find_what_p_points_to.  This is because PTA does
> not conservatively compute whether a pointer may be NULL (all bugs, I have
> partly fixed some and have an incomplete patch to fix others -- at the point
> I looked into this we had no users of pt.null info and thus I decided the
> extra constraints and complexity wasn't worth the compile-time/memory-use).
>
> Without -fnon-call-exceptions removing the *0 = 2 store is IMHO ok, so we
> only have to make sure to not break the exception case.
>
> For
>
> int foo (int *p, int b)
> {
>   int *q;
>   int i = 1;
>   if (b)
>     q = &i;
>   else
>     q = (void *)0;
>   *q = 2;
>   i = 3;
>   return *q;
> }
>
> we remove all stores but the last store to i and the load from q (but we don't
> replace q with &i here, a missed optimization if removing the other stores is
> valid).
>
> So for
>
> int foo (int *p, int b)
> {
>   int i = 1;
>   try {
>   int *q;
>   if (b)
>     q = &i;
>   else
>     q = (int *)0;
>   *q = 2;
>   } catch (...) {
>       return 0;
>   }
>   return i;
> }
>
> we do not remove the store to i with -fnon-call-exceptions.  Which means we are
> fine not testing for pt.null != 1.
>
> ?
>
> Richard.
>
>> Thanks,
>>
>> Jeff
>>
>>
>> diff --git a/gcc/ChangeLog b/gcc/ChangeLog
>> index d43c9bc..3a7ad9d 100644
>> --- a/gcc/ChangeLog
>> +++ b/gcc/ChangeLog
>> @@ -1,3 +1,11 @@
>> +2017-01-04  Jeff Law  <law@redhat.com>
>> +
>> +       PR tree-optimizatin/67955
>> +       * tree-ssa-alias.c (same_addr_size_stores_p): Check offsets first.
>> +       Allow any SSA_VAR_P as the base objects.  Use integer_zerop.  Verify
>> +       the points-to solution does not include pt_null.  Use DECL_PT_UID
>> +       unconditionally.
>> +
>>  2017-01-04  Uros Bizjak  <ubizjak@gmail.com>
>>
>>         * config/i386/i386.md (HI/SImode test with imm to QImode splitters):
>> diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
>> index d8ff32f..0cbc8cc 100644
>> --- a/gcc/testsuite/ChangeLog
>> +++ b/gcc/testsuite/ChangeLog
>> @@ -1,3 +1,8 @@
>> +2017-01-03  Jeff Law  <law@redhat.com>
>> +
>> +       PR tree-optimization/67955
>> +       * gcc.dg/tree-ssa/ssa-dse-28.c: New test.
>> +

Since you committed this patch (r244067), I have noticed that
gcc.dg/tree-ssa/dse-points-to.c scan-tree-dump-times dse1 "Deleted
dead store.*p_1" 1
now fails on arm/aarch64.

Christophe




>>  2017-01-04  Marek Polacek  <polacek@redhat.com>
>>
>>         PR c++/77545
>> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-dse-28.c
>> b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dse-28.c
>> new file mode 100644
>> index 0000000..d35377b
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dse-28.c
>> @@ -0,0 +1,20 @@
>> +/* { dg-do compile } */
>> +/* { dg-options "-O2 -fdump-tree-dse-details -fexceptions
>> -fnon-call-exceptions -fno-isolate-erroneous-paths-dereference" } */
>> +
>> +
>> +int foo (int *p, int b)
>> +{
>> +  int *q;
>> +  int i = 1;
>> +  if (b)
>> +    q = &i;
>> +  else
>> +    q = (void *)0;
>> +  *q = 2;
>> +  return i;
>> +}
>> +
>> +/* { dg-final { scan-tree-dump-not "Deleted dead store" "dse1"} } */
>> +/* { dg-final { scan-tree-dump-not "Deleted dead store" "dse2"} } */
>> +/* { dg-final { scan-tree-dump-not "Deleted dead store" "dse3"} } */
>> +
>> diff --git a/gcc/tree-ssa-alias.c b/gcc/tree-ssa-alias.c
>> index 0a9fc74..871fa12 100644
>> --- a/gcc/tree-ssa-alias.c
>> +++ b/gcc/tree-ssa-alias.c
>> @@ -2326,9 +2326,13 @@ same_addr_size_stores_p (tree base1, HOST_WIDE_INT
>> offset1, HOST_WIDE_INT size1,
>>                          tree base2, HOST_WIDE_INT offset2, HOST_WIDE_INT
>> size2,
>>                          HOST_WIDE_INT max_size2)
>>  {
>> -  /* For now, just handle VAR_DECL.  */
>> -  bool base1_obj_p = VAR_P (base1);
>> -  bool base2_obj_p = VAR_P (base2);
>> +  /* Offsets need to be 0.  */
>> +  if (offset1 != 0
>> +      || offset2 != 0)
>> +    return false;
>> +
>> +  bool base1_obj_p = SSA_VAR_P (base1);
>> +  bool base2_obj_p = SSA_VAR_P (base2);
>>
>>    /* We need one object.  */
>>    if (base1_obj_p == base2_obj_p)
>> @@ -2356,13 +2360,9 @@ same_addr_size_stores_p (tree base1, HOST_WIDE_INT
>> offset1, HOST_WIDE_INT size1,
>>    if (size1 != size2)
>>      return false;
>>
>> -  /* Offsets need to be 0.  */
>> -  if (offset1 != 0
>> -      || offset2 != 0)
>> -    return false;
>>
>>    /* Check that memref is a store to pointer with singleton points-to info.
>> */
>> -  if (!tree_int_cst_equal (TREE_OPERAND (memref, 1), integer_zero_node))
>> +  if (!integer_zerop (TREE_OPERAND (memref, 1)))
>>      return false;
>>    tree ptr = TREE_OPERAND (memref, 0);
>>    if (TREE_CODE (ptr) != SSA_NAME)
>> @@ -2373,10 +2373,13 @@ same_addr_size_stores_p (tree base1, HOST_WIDE_INT
>> offset1, HOST_WIDE_INT size1,
>>        || !pt_solution_singleton_or_null_p (&pi->pt, &pt_uid))
>>      return false;
>>
>> +  /* If the solution has a singleton and NULL, then we can not
>> +     be sure that the two stores hit the same address.  */
>> +  if (pi->pt.null)
>> +    return false;
>> +
>>    /* Check that ptr points relative to obj.  */
>> -  unsigned int obj_uid = (DECL_PT_UID_SET_P (obj)
>> -                         ? DECL_PT_UID (obj)
>> -                         : DECL_UID (obj));
>> +  unsigned int obj_uid = DECL_PT_UID (obj);
>>    if (obj_uid != pt_uid)
>>      return false;
>>
>>
Jeff Law Jan. 7, 2017, 6:01 p.m. UTC | #3
On 01/05/2017 01:34 AM, Richard Biener wrote:
> On Wed, Jan 4, 2017 at 8:24 PM, Jeff Law <law@redhat.com> wrote:
>>
>> The more I think about this the more I'm sure we need to verify pt.null is
>> not in the points-to set.    I've taken the above testcase and added it as a
>> negative test.  Bootstrapped, regression tested and committed to the trunk
>> along with the other minor cleanups you pointed out.
>
> Note disabling this for pt.null == 1 makes it pretty useless given we compute
> that conservatively to always 1 in points-to analysis (and only VRP ever sets
> it to zero).  See PTAs find_what_p_points_to.  This is because PTA does
> not conservatively compute whether a pointer may be NULL (all bugs, I have
> partly fixed some and have an incomplete patch to fix others -- at the point
> I looked into this we had no users of pt.null info and thus I decided the
> extra constraints and complexity wasn't worth the compile-time/memory-use).
>
> Without -fnon-call-exceptions removing the *0 = 2 store is IMHO ok, so we
> only have to make sure to not break the exception case.
I spent a goodly amount of time thinking about this...  I think the key 
point is whether or not removing the store is observable in a conforming 
program.

Essentially if we get a non-call exception or receive a signal between 
the "dead" store and the subsequent store, then we could observe that 
the "dead" store was removed if the object being stored escapes.

This seems to have larger implications than just the cases we're looking 
at (assume "a" is something in memory, of course).


a = 1;
<trigger non-call exception>
a = 2;


If "a" escapes such that its value can be queried in the exception 
handler, then the exception handler would be able to observe the first 
store and thus it should not be removed.

We also have to be cognizant of systems where there is memory mapped at 
location 0.  When that is true, we must check pt.null and honor it, even 
if it pessimizes code.



> For
>
> int foo (int *p, int b)
> {
>   int *q;
>   int i = 1;
>   if (b)
>     q = &i;
>   else
>     q = (void *)0;
>   *q = 2;
>   i = 3;
>   return *q;
> }
So on a system where *0 is a valid memory address, *q = 2 does not make 
anything dead, nor does i = 3 unless we were to isolate the THEN/ELSE 
blocks.

On a system where *0 traps, there is no way to observe the value of "i" 
in the handler.  Thus i = 1 is a dead store.  I believe we must keep the 
*q = 2 store because it can trigger a signal/exception which is itself 
an observable side effect?  Right?









>
> we remove all stores but the last store to i and the load from q (but we don't
> replace q with &i here, a missed optimization if removing the other stores is
> valid).
But if we remove the *q = 2 store, we remove an observable side effect, 
the trap/exception itself if we reach that statement via the ELSE path.

Jeff
Richard Biener Jan. 9, 2017, 9:36 a.m. UTC | #4
On Sat, Jan 7, 2017 at 7:01 PM, Jeff Law <law@redhat.com> wrote:
> On 01/05/2017 01:34 AM, Richard Biener wrote:
>>
>> On Wed, Jan 4, 2017 at 8:24 PM, Jeff Law <law@redhat.com> wrote:
>>>
>>>
>>> The more I think about this the more I'm sure we need to verify pt.null
>>> is
>>> not in the points-to set.    I've taken the above testcase and added it
>>> as a
>>> negative test.  Bootstrapped, regression tested and committed to the
>>> trunk
>>> along with the other minor cleanups you pointed out.
>>
>>
>> Note disabling this for pt.null == 1 makes it pretty useless given we
>> compute
>> that conservatively to always 1 in points-to analysis (and only VRP ever
>> sets
>> it to zero).  See PTAs find_what_p_points_to.  This is because PTA does
>> not conservatively compute whether a pointer may be NULL (all bugs, I have
>> partly fixed some and have an incomplete patch to fix others -- at the
>> point
>> I looked into this we had no users of pt.null info and thus I decided the
>> extra constraints and complexity wasn't worth the
>> compile-time/memory-use).
>>
>> Without -fnon-call-exceptions removing the *0 = 2 store is IMHO ok, so we
>> only have to make sure to not break the exception case.
>
> I spent a goodly amount of time thinking about this...  I think the key
> point is whether or not removing the store is observable in a conforming
> program.
>
> Essentially if we get a non-call exception or receive a signal between the
> "dead" store and the subsequent store, then we could observe that the "dead"
> store was removed if the object being stored escapes.
>
> This seems to have larger implications than just the cases we're looking at
> (assume "a" is something in memory, of course).
>
>
> a = 1;
> <trigger non-call exception>
> a = 2;
>
>
> If "a" escapes such that its value can be queried in the exception handler,
> then the exception handler would be able to observe the first store and thus
> it should not be removed.

Yes, and it won't as long as the EH is thrown internally (and thus we have
a CFG reflecting it).  When it's only externally catched we lose of course...

We'd need an Ada testcase to actually show behavior that is not conforming
to an existing language specification though.

I suspect we have a similar issue in C++ for sth like

void __attribute__((const)) foo () { throw; }

int x;
void bar ()
{
  x = 1;
  foo ();
  x = 2;
}

where foo is const but not nothrow.

> We also have to be cognizant of systems where there is memory mapped at
> location 0.  When that is true, we must check pt.null and honor it, even if
> it pessimizes code.

With -fno-delete-null-pointer-checks (that's what such systems set) PTA computes
0 as "nonlocal" and thus it won't be a singleton points-to solution.

>
>
>> For
>>
>> int foo (int *p, int b)
>> {
>>   int *q;
>>   int i = 1;
>>   if (b)
>>     q = &i;
>>   else
>>     q = (void *)0;
>>   *q = 2;
>>   i = 3;
>>   return *q;
>> }
>
> So on a system where *0 is a valid memory address, *q = 2 does not make
> anything dead, nor does i = 3 unless we were to isolate the THEN/ELSE
> blocks.
>
> On a system where *0 traps, there is no way to observe the value of "i" in
> the handler.  Thus i = 1 is a dead store.  I believe we must keep the *q = 2
> store because it can trigger a signal/exception which is itself an
> observable side effect?  Right?

But writing to 0 invokes undefined behavior which we have no obligation to
preserve (unless we make it well-defined with -fnon-call-exceptions -fexceptions
as a GCC extension).

>>
>> we remove all stores but the last store to i and the load from q (but we
>> don't
>> replace q with &i here, a missed optimization if removing the other stores
>> is
>> valid).
>
> But if we remove the *q = 2 store, we remove an observable side effect, the
> trap/exception itself if we reach that statement via the ELSE path.

As said above - I don't think we have to care for C/C++ w/o
-fnon-call-exceptions.

Richard.

>
> Jeff
Jeff Law Jan. 9, 2017, 5:02 p.m. UTC | #5
On 01/09/2017 02:36 AM, Richard Biener wrote:
>>
>>
>> a = 1;
>> <trigger non-call exception>
>> a = 2;
>>
>>
>> If "a" escapes such that its value can be queried in the exception handler,
>> then the exception handler would be able to observe the first store and thus
>> it should not be removed.
>
> Yes, and it won't as long as the EH is thrown internally (and thus we have
> a CFG reflecting it).  When it's only externally catched we lose of course...
>
> We'd need an Ada testcase to actually show behavior that is not conforming
> to an existing language specification though.
I'm not versed enough in Ada to even attempt to pull together a testcase 
for this.

>
> I suspect we have a similar issue in C++ for sth like
>
> void __attribute__((const)) foo () { throw; }
>
> int x;
> void bar ()
> {
>   x = 1;
>   foo ();
>   x = 2;
> }
>
> where foo is const but not nothrow.
I wouldn't be surprised if there's other problems with const functions 
that can throw.

>
>> We also have to be cognizant of systems where there is memory mapped at
>> location 0.  When that is true, we must check pt.null and honor it, even if
>> it pessimizes code.
>
> With -fno-delete-null-pointer-checks (that's what such systems set) PTA computes
> 0 as "nonlocal" and thus it won't be a singleton points-to solution.
Ah, good.

>
>>
>>
>>> For
>>>
>>> int foo (int *p, int b)
>>> {
>>>   int *q;
>>>   int i = 1;
>>>   if (b)
>>>     q = &i;
>>>   else
>>>     q = (void *)0;
>>>   *q = 2;
>>>   i = 3;
>>>   return *q;
>>> }
>>
>> So on a system where *0 is a valid memory address, *q = 2 does not make
>> anything dead, nor does i = 3 unless we were to isolate the THEN/ELSE
>> blocks.
>>
>> On a system where *0 traps, there is no way to observe the value of "i" in
>> the handler.  Thus i = 1 is a dead store.  I believe we must keep the *q = 2
>> store because it can trigger a signal/exception which is itself an
>> observable side effect?  Right?
>
> But writing to 0 invokes undefined behavior which we have no obligation to
> preserve (unless we make it well-defined with -fnon-call-exceptions -fexceptions
> as a GCC extension).
It may invoke undefined behavior, but to date we have explicitly chosen 
to preserve the *0 = <something> to trigger a fault via the virtual 
memory system.  We kicked this around extensively in Nov 2013 with the 
introduction of isolation of erroneous paths.

I think part of what pushed us that direction was that a program could 
catch the signal related to the NULL dereference, then do something 
sensible in the handler.  That also happens to match what the Go 
language requires.



>
>>>
>>> we remove all stores but the last store to i and the load from q (but we
>>> don't
>>> replace q with &i here, a missed optimization if removing the other stores
>>> is
>>> valid).
>>
>> But if we remove the *q = 2 store, we remove an observable side effect, the
>> trap/exception itself if we reach that statement via the ELSE path.
>
> As said above - I don't think we have to care for C/C++ w/o
> -fnon-call-exceptions.
So in the immediate term I propose we conditionalize the pt.null check 
on non-call exceptions.  THen I'll look more closely at the example 
above and see what we can reasonably do there.

jeff
Richard Biener Jan. 9, 2017, 5:19 p.m. UTC | #6
On January 9, 2017 6:02:16 PM GMT+01:00, Jeff Law <law@redhat.com> wrote:
>On 01/09/2017 02:36 AM, Richard Biener wrote:
>>>
>>>
>>> a = 1;
>>> <trigger non-call exception>
>>> a = 2;
>>>
>>>
>>> If "a" escapes such that its value can be queried in the exception
>handler,
>>> then the exception handler would be able to observe the first store
>and thus
>>> it should not be removed.
>>
>> Yes, and it won't as long as the EH is thrown internally (and thus we
>have
>> a CFG reflecting it).  When it's only externally catched we lose of
>course...
>>
>> We'd need an Ada testcase to actually show behavior that is not
>conforming
>> to an existing language specification though.
>I'm not versed enough in Ada to even attempt to pull together a
>testcase 
>for this.
>
>>
>> I suspect we have a similar issue in C++ for sth like
>>
>> void __attribute__((const)) foo () { throw; }
>>
>> int x;
>> void bar ()
>> {
>>   x = 1;
>>   foo ();
>>   x = 2;
>> }
>>
>> where foo is const but not nothrow.
>I wouldn't be surprised if there's other problems with const functions 
>that can throw.
>
>>
>>> We also have to be cognizant of systems where there is memory mapped
>at
>>> location 0.  When that is true, we must check pt.null and honor it,
>even if
>>> it pessimizes code.
>>
>> With -fno-delete-null-pointer-checks (that's what such systems set)
>PTA computes
>> 0 as "nonlocal" and thus it won't be a singleton points-to solution.
>Ah, good.
>
>>
>>>
>>>
>>>> For
>>>>
>>>> int foo (int *p, int b)
>>>> {
>>>>   int *q;
>>>>   int i = 1;
>>>>   if (b)
>>>>     q = &i;
>>>>   else
>>>>     q = (void *)0;
>>>>   *q = 2;
>>>>   i = 3;
>>>>   return *q;
>>>> }
>>>
>>> So on a system where *0 is a valid memory address, *q = 2 does not
>make
>>> anything dead, nor does i = 3 unless we were to isolate the
>THEN/ELSE
>>> blocks.
>>>
>>> On a system where *0 traps, there is no way to observe the value of
>"i" in
>>> the handler.  Thus i = 1 is a dead store.  I believe we must keep
>the *q = 2
>>> store because it can trigger a signal/exception which is itself an
>>> observable side effect?  Right?
>>
>> But writing to 0 invokes undefined behavior which we have no
>obligation to
>> preserve (unless we make it well-defined with -fnon-call-exceptions
>-fexceptions
>> as a GCC extension).
>It may invoke undefined behavior, but to date we have explicitly chosen
>
>to preserve the *0 = <something> to trigger a fault via the virtual 
>memory system.  We kicked this around extensively in Nov 2013 with the 
>introduction of isolation of erroneous paths.

Note we do not preserve traps with -ftrapv -fnon-call-exceptions either.  Or generally we happily reorder global sides effects with externally throwing stmts.

>I think part of what pushed us that direction was that a program could 
>catch the signal related to the NULL dereference, then do something 
>sensible in the handler.  That also happens to match what the Go 
>language requires.
>
>
>
>>
>>>>
>>>> we remove all stores but the last store to i and the load from q
>(but we
>>>> don't
>>>> replace q with &i here, a missed optimization if removing the other
>stores
>>>> is
>>>> valid).
>>>
>>> But if we remove the *q = 2 store, we remove an observable side
>effect, the
>>> trap/exception itself if we reach that statement via the ELSE path.
>>
>> As said above - I don't think we have to care for C/C++ w/o
>> -fnon-call-exceptions.
>So in the immediate term I propose we conditionalize the pt.null check 
>on non-call exceptions.  THen I'll look more closely at the example 
>above and see what we can reasonably do there.

Fix the pt-null bugs in PTA so we do not need the conservative fallback.  As said I have partial patches for this.

Richard.

>jeff
diff mbox

Patch

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index d43c9bc..3a7ad9d 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,11 @@ 
+2017-01-04  Jeff Law  <law@redhat.com>
+
+	PR tree-optimizatin/67955
+	* tree-ssa-alias.c (same_addr_size_stores_p): Check offsets first.
+	Allow any SSA_VAR_P as the base objects.  Use integer_zerop.  Verify
+	the points-to solution does not include pt_null.  Use DECL_PT_UID
+	unconditionally.
+
 2017-01-04  Uros Bizjak  <ubizjak@gmail.com>
 
 	* config/i386/i386.md (HI/SImode test with imm to QImode splitters):
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index d8ff32f..0cbc8cc 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,8 @@ 
+2017-01-03  Jeff Law  <law@redhat.com>
+
+	PR tree-optimization/67955
+	* gcc.dg/tree-ssa/ssa-dse-28.c: New test.
+
 2017-01-04  Marek Polacek  <polacek@redhat.com>
 
 	PR c++/77545
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-dse-28.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dse-28.c
new file mode 100644
index 0000000..d35377b
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dse-28.c
@@ -0,0 +1,20 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-dse-details -fexceptions -fnon-call-exceptions -fno-isolate-erroneous-paths-dereference" } */
+
+
+int foo (int *p, int b)
+{
+  int *q;
+  int i = 1;
+  if (b)
+    q = &i;
+  else
+    q = (void *)0;
+  *q = 2;
+  return i;
+}
+
+/* { dg-final { scan-tree-dump-not "Deleted dead store" "dse1"} } */
+/* { dg-final { scan-tree-dump-not "Deleted dead store" "dse2"} } */
+/* { dg-final { scan-tree-dump-not "Deleted dead store" "dse3"} } */
+
diff --git a/gcc/tree-ssa-alias.c b/gcc/tree-ssa-alias.c
index 0a9fc74..871fa12 100644
--- a/gcc/tree-ssa-alias.c
+++ b/gcc/tree-ssa-alias.c
@@ -2326,9 +2326,13 @@  same_addr_size_stores_p (tree base1, HOST_WIDE_INT offset1, HOST_WIDE_INT size1,
 			 tree base2, HOST_WIDE_INT offset2, HOST_WIDE_INT size2,
 			 HOST_WIDE_INT max_size2)
 {
-  /* For now, just handle VAR_DECL.  */
-  bool base1_obj_p = VAR_P (base1);
-  bool base2_obj_p = VAR_P (base2);
+  /* Offsets need to be 0.  */
+  if (offset1 != 0
+      || offset2 != 0)
+    return false;
+
+  bool base1_obj_p = SSA_VAR_P (base1);
+  bool base2_obj_p = SSA_VAR_P (base2);
 
   /* We need one object.  */
   if (base1_obj_p == base2_obj_p)
@@ -2356,13 +2360,9 @@  same_addr_size_stores_p (tree base1, HOST_WIDE_INT offset1, HOST_WIDE_INT size1,
   if (size1 != size2)
     return false;
 
-  /* Offsets need to be 0.  */
-  if (offset1 != 0
-      || offset2 != 0)
-    return false;
 
   /* Check that memref is a store to pointer with singleton points-to info.  */
-  if (!tree_int_cst_equal (TREE_OPERAND (memref, 1), integer_zero_node))
+  if (!integer_zerop (TREE_OPERAND (memref, 1)))
     return false;
   tree ptr = TREE_OPERAND (memref, 0);
   if (TREE_CODE (ptr) != SSA_NAME)
@@ -2373,10 +2373,13 @@  same_addr_size_stores_p (tree base1, HOST_WIDE_INT offset1, HOST_WIDE_INT size1,
       || !pt_solution_singleton_or_null_p (&pi->pt, &pt_uid))
     return false;
 
+  /* If the solution has a singleton and NULL, then we can not
+     be sure that the two stores hit the same address.  */
+  if (pi->pt.null)
+    return false;
+
   /* Check that ptr points relative to obj.  */
-  unsigned int obj_uid = (DECL_PT_UID_SET_P (obj)
-			  ? DECL_PT_UID (obj)
-			  : DECL_UID (obj));
+  unsigned int obj_uid = DECL_PT_UID (obj);
   if (obj_uid != pt_uid)
     return false;