diff mbox

free is a killer

Message ID alpine.DEB.2.10.1310282228150.4104@laptop-mg.saclay.inria.fr
State New
Headers show

Commit Message

Marc Glisse Oct. 28, 2013, 10:05 p.m. UTC
On Mon, 28 Oct 2013, Jeff Law wrote:

> On 10/26/13 01:15, Marc Glisse wrote:
>> Hello,
>> 
>> this patch teaches gcc that free kills the memory its argument points
>> to. The equality test is probably too strict, I guess we can loosen it
>> later (unless you have suggestions?).
>> 
>> Note that the corresponding code for BUILT_IN_MEMCPY and others seems
>> suspicious to me, it looks like it is testing for equality between a
>> pointer and a mem_ref, which is unlikely to happen.
>> 
>> Bootstrap+testsuite on x86_64-unknown-linux-gnu.
>> 
>> 2013-10-27  Marc Glisse  <marc.glisse@inria.fr>
>>
>>      PR tree-optimization/19831
>> gcc/
>>      * tree-ssa-alias.c (stmt_kills_ref_p_1): Handle BUILT_IN_FREE.
>> 
>> gcc/testsuite/
>>      * gcc.dg/tree-ssa/alias-25.c: New file.
> OK for the trunk.

Thanks.

> I agree the MEM_REF* and VA_END cases look strange and I think they're wrong 
> as well.  Did you happen to try fixing them and running the testsuite to see 
> if anything fails?
>
> ISTM your testcase could be tweaked to test conclusively if they're doing the 
> right thing -- and regardless of what's found that test can/should make its 
> way into the testsuite.

I checked and it does the wrong thing (I don't have the testcase handy 
anymore, but it shouldn't be hard to recreate one), I even wrote a patch 
(attached) but it is related to:
http://gcc.gnu.org/ml/gcc-patches/2013-10/msg02238.html
so it can't go in. A more limited fix (still missing many cases) would be 
possible. I didn't test VA_END, but it doesn't look as bad (it is the 
SSA_NAME case that is wrong for memcpy).


While I am posting non-submitted patches, does the following vaguely make 
sense? I couldn't find a testcase that exercised it without some local 
patches, but the idea (IIRC) is that with:
struct A { struct B b; int i; }
we can easily end up with one ao_ref.base that is a MEM_REF[p] and 
another one a MEM_REF[(B*)q] where p and q are of type A*, and that 
prevents us from noticing that p->i and q->b don't alias. Maybe I should 
instead find a way to get MEM_REF[q] as ao_ref.base, but if q actually 
points to a larger structure that starts with A it is likely to fail as 
well...




In the category "ugly hack that I won't submit", let me also attach this 
patch that sometimes helps notice that malloc doesn't alias other things 
(I don't know if the first hunk ever fires).

Comments

Jeff Law Oct. 29, 2013, 5:35 a.m. UTC | #1
On 10/28/13 16:05, Marc Glisse wrote:
>
> I checked and it does the wrong thing (I don't have the testcase handy
> anymore, but it shouldn't be hard to recreate one), I even wrote a patch
> (attached) but it is related to:
> http://gcc.gnu.org/ml/gcc-patches/2013-10/msg02238.html
> so it can't go in. A more limited fix (still missing many cases) would
> be possible. I didn't test VA_END, but it doesn't look as bad (it is the
> SSA_NAME case that is wrong for memcpy
Actually, it's fairly easy to see with memset.  Something like this:

/* { dg-do compile } */
/* { dg-options "-O1 -fdump-tree-optimized" } */

void f (long *p) {
   *p = 42;
   p[4] = 42;
   __builtin_memset (p, 0, 100);
}

/* { dg-final { scan-tree-dump-not "= 42" "optimized" } } */
/* { dg-final { cleanup-tree-dump "optimized" } } */


I'm sure it can be made to work with memcpy as well, it's just harder 
because you have to ensure the source/dest args don't alias.


>
>
> While I am posting non-submitted patches, does the following vaguely
> make sense? I couldn't find a testcase that exercised it without some
> local patches, but the idea (IIRC) is that with:
> struct A { struct B b; int i; }
> we can easily end up with one ao_ref.base that is a MEM_REF[p] and
> another one a MEM_REF[(B*)q] where p and q are of type A*, and that
> prevents us from noticing that p->i and q->b don't alias. Maybe I should
> instead find a way to get MEM_REF[q] as ao_ref.base, but if q actually
> points to a larger structure that starts with A it is likely to fail as
> well...
I've never looked at the alias oracle stuff, but ISTM that offsets from 
the base ought to be enough to disambiguate?

>
>
>
> In the category "ugly hack that I won't submit", let me also attach this
> patch that sometimes helps notice that malloc doesn't alias other things
> (I don't know if the first hunk ever fires).
Well, one thing that can be done here (I think it comes from Steensgaard 
and I'm pretty sure it's discussed in Morgan's text) is you assign a 
storage class and propagate that.   Once you do so, you can eliminate 
certain things.  You use a vrp-like engine with a fairly simple meet 
operator to handle propagation of the storage class.

Once you have storage classes for all the pointers, you can make certain 
deductions about aliasing relationships.  Including what things in the 
malloc storage class can and can not alias.

Jeff
Marc Glisse Oct. 29, 2013, 6:38 a.m. UTC | #2
On Mon, 28 Oct 2013, Jeff Law wrote:

> On 10/28/13 16:05, Marc Glisse wrote:
>> 
>> I checked and it does the wrong thing (I don't have the testcase handy
>> anymore, but it shouldn't be hard to recreate one), I even wrote a patch
>> (attached) but it is related to:
>> http://gcc.gnu.org/ml/gcc-patches/2013-10/msg02238.html
>> so it can't go in. A more limited fix (still missing many cases) would
>> be possible. I didn't test VA_END, but it doesn't look as bad (it is the
>> SSA_NAME case that is wrong for memcpy
> Actually, it's fairly easy to see with memset.  Something like this:
>
> /* { dg-do compile } */
> /* { dg-options "-O1 -fdump-tree-optimized" } */
>
> void f (long *p) {
>  *p = 42;
>  p[4] = 42;
>  __builtin_memset (p, 0, 100);
> }
>
> /* { dg-final { scan-tree-dump-not "= 42" "optimized" } } */
> /* { dg-final { cleanup-tree-dump "optimized" } } */

Indeed. Do you want to commit it xfailed or put it in bugzilla so we don't 
lose it? (it becomes harder if you replace p with p-1 in the memset 
arguments).

>> While I am posting non-submitted patches, does the following vaguely
>> make sense? I couldn't find a testcase that exercised it without some
>> local patches, but the idea (IIRC) is that with:
>> struct A { struct B b; int i; }
>> we can easily end up with one ao_ref.base that is a MEM_REF[p] and
>> another one a MEM_REF[(B*)q] where p and q are of type A*, and that
>> prevents us from noticing that p->i and q->b don't alias. Maybe I should
>> instead find a way to get MEM_REF[q] as ao_ref.base, but if q actually
>> points to a larger structure that starts with A it is likely to fail as
>> well...
> I've never looked at the alias oracle stuff, but ISTM that offsets from the 
> base ought to be enough to disambiguate?

Yes, the issue here is making them notice that they can be seen as having 
bases of the same type (and thus that it is meaningful to compare 
offsets), since there are multiple possible base choices when a structure 
starts with a structure which starts with...

>> In the category "ugly hack that I won't submit", let me also attach this
>> patch that sometimes helps notice that malloc doesn't alias other things
>> (I don't know if the first hunk ever fires).
> Well, one thing that can be done here (I think it comes from Steensgaard and 
> I'm pretty sure it's discussed in Morgan's text) is you assign a storage 
> class and propagate that.   Once you do so, you can eliminate certain things. 
> You use a vrp-like engine with a fairly simple meet operator to handle 
> propagation of the storage class.
>
> Once you have storage classes for all the pointers, you can make certain 
> deductions about aliasing relationships.  Including what things in the malloc 
> storage class can and can not alias.

Sounds good, but I am not really up to rewriting the points-to analysis at 
the moment...

Thanks,
Richard Biener Oct. 29, 2013, 10:37 a.m. UTC | #3
On Mon, Oct 28, 2013 at 11:05 PM, Marc Glisse <marc.glisse@inria.fr> wrote:
> On Mon, 28 Oct 2013, Jeff Law wrote:
>
>> On 10/26/13 01:15, Marc Glisse wrote:
>>>
>>> Hello,
>>>
>>> this patch teaches gcc that free kills the memory its argument points
>>> to. The equality test is probably too strict, I guess we can loosen it
>>> later (unless you have suggestions?).
>>>
>>> Note that the corresponding code for BUILT_IN_MEMCPY and others seems
>>> suspicious to me, it looks like it is testing for equality between a
>>> pointer and a mem_ref, which is unlikely to happen.
>>>
>>> Bootstrap+testsuite on x86_64-unknown-linux-gnu.
>>>
>>> 2013-10-27  Marc Glisse  <marc.glisse@inria.fr>
>>>
>>>      PR tree-optimization/19831
>>> gcc/
>>>      * tree-ssa-alias.c (stmt_kills_ref_p_1): Handle BUILT_IN_FREE.
>>>
>>> gcc/testsuite/
>>>      * gcc.dg/tree-ssa/alias-25.c: New file.
>>
>> OK for the trunk.
>
>
> Thanks.
>
>
>> I agree the MEM_REF* and VA_END cases look strange and I think they're
>> wrong as well.  Did you happen to try fixing them and running the testsuite
>> to see if anything fails?
>>
>> ISTM your testcase could be tweaked to test conclusively if they're doing
>> the right thing -- and regardless of what's found that test can/should make
>> its way into the testsuite.

The VA_END case looks ok (though == equality testing is a bit
too conservative, the SSA_NAME case of the builtins handling
looks wrong indeed.

> I checked and it does the wrong thing (I don't have the testcase handy
> anymore, but it shouldn't be hard to recreate one), I even wrote a patch
> (attached) but it is related to:
> http://gcc.gnu.org/ml/gcc-patches/2013-10/msg02238.html
> so it can't go in. A more limited fix (still missing many cases) would be
> possible. I didn't test VA_END, but it doesn't look as bad (it is the
> SSA_NAME case that is wrong for memcpy).
>
>
> While I am posting non-submitted patches, does the following vaguely make
> sense? I couldn't find a testcase that exercised it without some local
> patches, but the idea (IIRC) is that with:
> struct A { struct B b; int i; }
> we can easily end up with one ao_ref.base that is a MEM_REF[p] and another
> one a MEM_REF[(B*)q] where p and q are of type A*, and that prevents us from
> noticing that p->i and q->b don't alias. Maybe I should instead find a way
> to get MEM_REF[q] as ao_ref.base, but if q actually points to a larger
> structure that starts with A it is likely to fail as well...
>
> --- gcc/tree-ssa-alias.c        (revision 204095)
> +++ gcc/tree-ssa-alias.c        (working copy)
> @@ -1099,22 +1099,24 @@ indirect_refs_may_alias_p (tree ref1 ATT
>
>    /* If both references are through the same type, they do not alias
>       if the accesses do not overlap.  This does extra disambiguation
>       for mixed/pointer accesses but requires strict aliasing.  */
>    if ((TREE_CODE (base1) != TARGET_MEM_REF
>         || (!TMR_INDEX (base1) && !TMR_INDEX2 (base1)))
>        && (TREE_CODE (base2) != TARGET_MEM_REF
>           || (!TMR_INDEX (base2) && !TMR_INDEX2 (base2)))
>        && same_type_for_tbaa (TREE_TYPE (base1), TREE_TYPE (ptrtype1)) == 1
>        && same_type_for_tbaa (TREE_TYPE (base2), TREE_TYPE (ptrtype2)) == 1
> -      && same_type_for_tbaa (TREE_TYPE (ptrtype1),
> -                            TREE_TYPE (ptrtype2)) == 1)
> +      && (same_type_for_tbaa (TREE_TYPE (ptrtype1),
> +                             TREE_TYPE (ptrtype2)) == 1
> +         || same_type_for_tbaa (TREE_TYPE (TREE_TYPE (ptr1)),
> +                                TREE_TYPE (TREE_TYPE (ptr2))) == 1))

No, the type of 'ptr' are not in any way relevant.

Richard.

>      return ranges_overlap_p (offset1, max_size1, offset2, max_size2);
>
>    /* Do type-based disambiguation.  */
>    if (base1_alias_set != base2_alias_set
>        && !alias_sets_conflict_p (base1_alias_set, base2_alias_set))
>      return false;
>
>    /* Do access-path based disambiguation.  */
>    if (ref1 && ref2
>        && (handled_component_p (ref1) || handled_component_p (ref2))
>
>
>
> In the category "ugly hack that I won't submit", let me also attach this
> patch that sometimes helps notice that malloc doesn't alias other things (I
> don't know if the first hunk ever fires).
>
> --
> Marc Glisse
Richard Biener Oct. 29, 2013, 10:40 a.m. UTC | #4
On Tue, Oct 29, 2013 at 6:35 AM, Jeff Law <law@redhat.com> wrote:
> On 10/28/13 16:05, Marc Glisse wrote:
>>
>>
>> I checked and it does the wrong thing (I don't have the testcase handy
>> anymore, but it shouldn't be hard to recreate one), I even wrote a patch
>> (attached) but it is related to:
>> http://gcc.gnu.org/ml/gcc-patches/2013-10/msg02238.html
>> so it can't go in. A more limited fix (still missing many cases) would
>> be possible. I didn't test VA_END, but it doesn't look as bad (it is the
>> SSA_NAME case that is wrong for memcpy
>
> Actually, it's fairly easy to see with memset.  Something like this:
>
> /* { dg-do compile } */
> /* { dg-options "-O1 -fdump-tree-optimized" } */
>
> void f (long *p) {
>   *p = 42;
>   p[4] = 42;
>   __builtin_memset (p, 0, 100);
> }
>
> /* { dg-final { scan-tree-dump-not "= 42" "optimized" } } */
> /* { dg-final { cleanup-tree-dump "optimized" } } */
>
>
> I'm sure it can be made to work with memcpy as well, it's just harder
> because you have to ensure the source/dest args don't alias.
>
>
>
>>
>>
>> While I am posting non-submitted patches, does the following vaguely
>> make sense? I couldn't find a testcase that exercised it without some
>> local patches, but the idea (IIRC) is that with:
>> struct A { struct B b; int i; }
>> we can easily end up with one ao_ref.base that is a MEM_REF[p] and
>> another one a MEM_REF[(B*)q] where p and q are of type A*, and that
>> prevents us from noticing that p->i and q->b don't alias. Maybe I should
>> instead find a way to get MEM_REF[q] as ao_ref.base, but if q actually
>> points to a larger structure that starts with A it is likely to fail as
>> well...
>
> I've never looked at the alias oracle stuff, but ISTM that offsets from the
> base ought to be enough to disambiguate?
>
>
>>
>>
>>
>> In the category "ugly hack that I won't submit", let me also attach this
>> patch that sometimes helps notice that malloc doesn't alias other things
>> (I don't know if the first hunk ever fires).
>
> Well, one thing that can be done here (I think it comes from Steensgaard and
> I'm pretty sure it's discussed in Morgan's text) is you assign a storage
> class and propagate that.   Once you do so, you can eliminate certain
> things.  You use a vrp-like engine with a fairly simple meet operator to
> handle propagation of the storage class.
>
> Once you have storage classes for all the pointers, you can make certain
> deductions about aliasing relationships.  Including what things in the
> malloc storage class can and can not alias.

Of course in the example we have the "global memory" storage class
(incoming function argument) and "malloc memory" which is really
the same storage class.  It only becomes a different storage class
if you factor in flow analysis (for which the current PTA code is weak
once you leave the SSA web).

Richard.

> Jeff
>
>
Jeff Law Oct. 29, 2013, 3:01 p.m. UTC | #5
On 10/29/13 04:40, Richard Biener wrote:
>
> Of course in the example we have the "global memory" storage class
> (incoming function argument) and "malloc memory" which is really
> the same storage class.  It only becomes a different storage class
> if you factor in flow analysis (for which the current PTA code is weak
> once you leave the SSA web).
The global memory storage class is really better thought of as "i 
haven't a clue" class.

You want a different class for things passed in as you know they can't 
conflict with anything in the current function's stack or anything 
malloc'd by the current function.


Jeff
Richard Biener Oct. 29, 2013, 3:15 p.m. UTC | #6
On Tue, Oct 29, 2013 at 4:01 PM, Jeff Law <law@redhat.com> wrote:
> On 10/29/13 04:40, Richard Biener wrote:
>>
>>
>> Of course in the example we have the "global memory" storage class
>> (incoming function argument) and "malloc memory" which is really
>> the same storage class.  It only becomes a different storage class
>> if you factor in flow analysis (for which the current PTA code is weak
>> once you leave the SSA web).
>
> The global memory storage class is really better thought of as "i haven't a
> clue" class.

No, that's the "anything" class in the current PTA.

> You want a different class for things passed in as you know they can't
> conflict with anything in the current function's stack or anything malloc'd
> by the current function.

That's the NONLOCAL class in the current PTA.  Which also means
that simple cases should work.  It's only when the malloc result
escapes somewhere that it gets reachable by NONLOCAL because
of the limited flow-sensitiveness of PTA (and its lack of context sensitivity).

Richard.



>
> Jeff
Jeff Law Oct. 29, 2013, 3:37 p.m. UTC | #7
On 10/29/13 09:15, Richard Biener wrote:
> On Tue, Oct 29, 2013 at 4:01 PM, Jeff Law <law@redhat.com> wrote:
>> On 10/29/13 04:40, Richard Biener wrote:
>>>
>>>
>>> Of course in the example we have the "global memory" storage class
>>> (incoming function argument) and "malloc memory" which is really
>>> the same storage class.  It only becomes a different storage class
>>> if you factor in flow analysis (for which the current PTA code is weak
>>> once you leave the SSA web).
>>
>> The global memory storage class is really better thought of as "i haven't a
>> clue" class.
>
> No, that's the "anything" class in the current PTA.
>
>> You want a different class for things passed in as you know they can't
>> conflict with anything in the current function's stack or anything malloc'd
>> by the current function.
>
> That's the NONLOCAL class in the current PTA.  Which also means
> that simple cases should work.  It's only when the malloc result
> escapes somewhere that it gets reachable by NONLOCAL because
> of the limited flow-sensitiveness of PTA (and its lack of context sensitivity).
OK.  I think we've got all the same concepts and may just be arguing 
about terminology :-)

jeff
Jeff Law Oct. 29, 2013, 3:53 p.m. UTC | #8
On 10/29/13 00:38, Marc Glisse wrote:
>
> Indeed. Do you want to commit it xfailed or put it in bugzilla so we
> don't lose it? (it becomes harder if you replace p with p-1 in the
> memset arguments).
I'll either add it as xfailed, or I'll add it as-is if I fix the alias 
code.   It'd primarly be to address the current incorrect state of the 
code as it stands today rather than to extensively feature test this code.


>
> Sounds good, but I am not really up to rewriting the points-to analysis
> at the moment...
Understandable.  Never hurts to put a bug in someone's ear, you never 
know when they (or someone else reading) will run with it.

jeff
Jeff Law Oct. 29, 2013, 7:23 p.m. UTC | #9
On 10/29/13 00:38, Marc Glisse wrote:
> On Mon, 28 Oct 2013, Jeff Law wrote:
>
>> On 10/28/13 16:05, Marc Glisse wrote:
>>>
>>> I checked and it does the wrong thing (I don't have the testcase handy
>>> anymore, but it shouldn't be hard to recreate one), I even wrote a patch
>>> (attached) but it is related to:
>>> http://gcc.gnu.org/ml/gcc-patches/2013-10/msg02238.html
>>> so it can't go in. A more limited fix (still missing many cases) would
>>> be possible. I didn't test VA_END, but it doesn't look as bad (it is the
>>> SSA_NAME case that is wrong for memcpy
>> Actually, it's fairly easy to see with memset.  Something like this:
>>
>> /* { dg-do compile } */
>> /* { dg-options "-O1 -fdump-tree-optimized" } */
>>
>> void f (long *p) {
>>  *p = 42;
>>  p[4] = 42;
>>  __builtin_memset (p, 0, 100);
>> }
>>
>> /* { dg-final { scan-tree-dump-not "= 42" "optimized" } } */
>> /* { dg-final { cleanup-tree-dump "optimized" } } */
>
> Indeed. Do you want to commit it xfailed or put it in bugzilla so we
> don't lose it? (it becomes harder if you replace p with p-1 in the
> memset arguments).
BTW, you can get a VAR_DECL (and presumably a PARM_DECL) from 
ao_ref_base.  Some of the ssa-dse tests show this.

Jeff
diff mbox

Patch

--- gcc/tree-ssa-alias.c	(revision 204095)
+++ gcc/tree-ssa-alias.c	(working copy)
@@ -1099,22 +1099,24 @@  indirect_refs_may_alias_p (tree ref1 ATT

    /* If both references are through the same type, they do not alias
       if the accesses do not overlap.  This does extra disambiguation
       for mixed/pointer accesses but requires strict aliasing.  */
    if ((TREE_CODE (base1) != TARGET_MEM_REF
         || (!TMR_INDEX (base1) && !TMR_INDEX2 (base1)))
        && (TREE_CODE (base2) != TARGET_MEM_REF
  	  || (!TMR_INDEX (base2) && !TMR_INDEX2 (base2)))
        && same_type_for_tbaa (TREE_TYPE (base1), TREE_TYPE (ptrtype1)) == 1
        && same_type_for_tbaa (TREE_TYPE (base2), TREE_TYPE (ptrtype2)) == 1
-      && same_type_for_tbaa (TREE_TYPE (ptrtype1),
-			     TREE_TYPE (ptrtype2)) == 1)
+      && (same_type_for_tbaa (TREE_TYPE (ptrtype1),
+			      TREE_TYPE (ptrtype2)) == 1
+	  || same_type_for_tbaa (TREE_TYPE (TREE_TYPE (ptr1)),
+				 TREE_TYPE (TREE_TYPE (ptr2))) == 1))
      return ranges_overlap_p (offset1, max_size1, offset2, max_size2);

    /* Do type-based disambiguation.  */
    if (base1_alias_set != base2_alias_set
        && !alias_sets_conflict_p (base1_alias_set, base2_alias_set))
      return false;

    /* Do access-path based disambiguation.  */
    if (ref1 && ref2
        && (handled_component_p (ref1) || handled_component_p (ref2))