[PR,middle-end/79123] cast false positive in -Walloca-larger-than=
diff mbox

Message ID 8a5496d5-1811-a89d-ea75-b15c68cb6ddf@redhat.com
State New
Headers show

Commit Message

Aldy Hernandez Jan. 19, 2017, 12:17 p.m. UTC
In the attached testcase, we have a clearly bounded case of alloca which 
is being incorrectly reported:

void g (int *p, int *q)
{
    size_t n = (size_t)(p - q);

    if (n < 10)
      f (__builtin_alloca (n));
}

The problem is that VRP gives us an anti-range for `n' which may be out 
of range:

   # RANGE ~[2305843009213693952, 16140901064495857663]
    n_9 = (long unsigned int) _4;

We do a less than stellar job with casts and VR_ANTI_RANGE's, mostly 
because we're trying various heuristics to make up for the fact that we 
have crappy range info from VRP.  More specifically, we're basically 
punting on an VR_ANTI_RANGE and ignoring that the casted result (n_9) 
has a bound later on.

Luckily, we already have code to check simple ranges coming into the 
alloca by looking into all the basic blocks feeding it.  The attached 
patch delays the final decision on anti ranges until we have examined 
the basic blocks and determined for that we are definitely out of range.

I expect all this to disappear with Andrew's upcoming range info overhaul.

OK for trunk?
commit 07677ba03a01cbd1f1c4747b4df333b35d0d3afd
Author: Aldy Hernandez <aldyh@redhat.com>
Date:   Thu Jan 19 05:44:58 2017 -0500

            PR middle-end/79123
            * gimple-ssa-warn-alloca.c (alloca_call_type): Make sure
            casts from signed to unsigned really don't have a range.

Comments

Richard Biener Jan. 19, 2017, 12:45 p.m. UTC | #1
On Thu, Jan 19, 2017 at 1:17 PM, Aldy Hernandez <aldyh@redhat.com> wrote:
> In the attached testcase, we have a clearly bounded case of alloca which is
> being incorrectly reported:
>
> void g (int *p, int *q)
> {
>    size_t n = (size_t)(p - q);
>
>    if (n < 10)
>      f (__builtin_alloca (n));
> }
>
> The problem is that VRP gives us an anti-range for `n' which may be out of
> range:
>
>   # RANGE ~[2305843009213693952, 16140901064495857663]
>    n_9 = (long unsigned int) _4;
>
> We do a less than stellar job with casts and VR_ANTI_RANGE's, mostly because
> we're trying various heuristics to make up for the fact that we have crappy
> range info from VRP.  More specifically, we're basically punting on an
> VR_ANTI_RANGE and ignoring that the casted result (n_9) has a bound later
> on.
>
> Luckily, we already have code to check simple ranges coming into the alloca
> by looking into all the basic blocks feeding it.  The attached patch delays
> the final decision on anti ranges until we have examined the basic blocks
> and determined for that we are definitely out of range.
>
> I expect all this to disappear with Andrew's upcoming range info overhaul.
>
> OK for trunk?

I _really_ wonder why all the range consuming warnings are not emitted
from VRP itself (like we do for -Warray-bounds).  There we'd still see
a range for the argument derived from the if () rather than needing to
do our own mini-VRP from the needessly "incomplete" range-info on
SSA vars.

Richard.

>
Aldy Hernandez Jan. 19, 2017, 1:02 p.m. UTC | #2
On 01/19/2017 07:45 AM, Richard Biener wrote:
> On Thu, Jan 19, 2017 at 1:17 PM, Aldy Hernandez <aldyh@redhat.com> wrote:
>> In the attached testcase, we have a clearly bounded case of alloca which is
>> being incorrectly reported:
>>
>> void g (int *p, int *q)
>> {
>>    size_t n = (size_t)(p - q);
>>
>>    if (n < 10)
>>      f (__builtin_alloca (n));
>> }
>>
>> The problem is that VRP gives us an anti-range for `n' which may be out of
>> range:
>>
>>   # RANGE ~[2305843009213693952, 16140901064495857663]
>>    n_9 = (long unsigned int) _4;
>>
>> We do a less than stellar job with casts and VR_ANTI_RANGE's, mostly because
>> we're trying various heuristics to make up for the fact that we have crappy
>> range info from VRP.  More specifically, we're basically punting on an
>> VR_ANTI_RANGE and ignoring that the casted result (n_9) has a bound later
>> on.
>>
>> Luckily, we already have code to check simple ranges coming into the alloca
>> by looking into all the basic blocks feeding it.  The attached patch delays
>> the final decision on anti ranges until we have examined the basic blocks
>> and determined for that we are definitely out of range.
>>
>> I expect all this to disappear with Andrew's upcoming range info overhaul.
>>
>> OK for trunk?
>
> I _really_ wonder why all the range consuming warnings are not emitted
> from VRP itself (like we do for -Warray-bounds).  There we'd still see
> a range for the argument derived from the if () rather than needing to
> do our own mini-VRP from the needessly "incomplete" range-info on
> SSA vars.
>
> Richard.

My original implementation was within VRP itself, using the 
ASSERT_EXPR's, but Jeff suggested doing it in it's own pass.  I can't 
remember the details (I could look it up), but I think it had to do with 
getting better information post-inlining about the call to alloca (not 
sure).

Also, with Andrew's GCC8 work on ranges, it seemed pointless to bend 
over backwards handling ASSERT_EXPR's etc.  And IIRC, implementing it in 
VRP was ugly.

Perhaps Jeff remembers the details better.

Aldy
Richard Biener Jan. 19, 2017, 1:42 p.m. UTC | #3
On Thu, Jan 19, 2017 at 2:02 PM, Aldy Hernandez <aldyh@redhat.com> wrote:
> On 01/19/2017 07:45 AM, Richard Biener wrote:
>>
>> On Thu, Jan 19, 2017 at 1:17 PM, Aldy Hernandez <aldyh@redhat.com> wrote:
>>>
>>> In the attached testcase, we have a clearly bounded case of alloca which
>>> is
>>> being incorrectly reported:
>>>
>>> void g (int *p, int *q)
>>> {
>>>    size_t n = (size_t)(p - q);
>>>
>>>    if (n < 10)
>>>      f (__builtin_alloca (n));
>>> }
>>>
>>> The problem is that VRP gives us an anti-range for `n' which may be out
>>> of
>>> range:
>>>
>>>   # RANGE ~[2305843009213693952, 16140901064495857663]
>>>    n_9 = (long unsigned int) _4;
>>>
>>> We do a less than stellar job with casts and VR_ANTI_RANGE's, mostly
>>> because
>>> we're trying various heuristics to make up for the fact that we have
>>> crappy
>>> range info from VRP.  More specifically, we're basically punting on an
>>> VR_ANTI_RANGE and ignoring that the casted result (n_9) has a bound later
>>> on.
>>>
>>> Luckily, we already have code to check simple ranges coming into the
>>> alloca
>>> by looking into all the basic blocks feeding it.  The attached patch
>>> delays
>>> the final decision on anti ranges until we have examined the basic blocks
>>> and determined for that we are definitely out of range.
>>>
>>> I expect all this to disappear with Andrew's upcoming range info
>>> overhaul.
>>>
>>> OK for trunk?
>>
>>
>> I _really_ wonder why all the range consuming warnings are not emitted
>> from VRP itself (like we do for -Warray-bounds).  There we'd still see
>> a range for the argument derived from the if () rather than needing to
>> do our own mini-VRP from the needessly "incomplete" range-info on
>> SSA vars.
>>
>> Richard.
>
>
> My original implementation was within VRP itself, using the ASSERT_EXPR's,
> but Jeff suggested doing it in it's own pass.  I can't remember the details
> (I could look it up), but I think it had to do with getting better
> information post-inlining about the call to alloca (not sure).
>
> Also, with Andrew's GCC8 work on ranges, it seemed pointless to bend over
> backwards handling ASSERT_EXPR's etc.  And IIRC, implementing it in VRP was
> ugly.

You don't need to handle ASSERT_EXPRs you simply use the VRP lattice after
propagation (or during if you hook into EVRP of course).  And VRP runs after
inlining.

Richard.

> Perhaps Jeff remembers the details better.
>
> Aldy
Martin Sebor Jan. 19, 2017, 4:53 p.m. UTC | #4
On 01/19/2017 05:45 AM, Richard Biener wrote:
> On Thu, Jan 19, 2017 at 1:17 PM, Aldy Hernandez <aldyh@redhat.com> wrote:
>> In the attached testcase, we have a clearly bounded case of alloca which is
>> being incorrectly reported:
>>
>> void g (int *p, int *q)
>> {
>>    size_t n = (size_t)(p - q);
>>
>>    if (n < 10)
>>      f (__builtin_alloca (n));
>> }
>>
>> The problem is that VRP gives us an anti-range for `n' which may be out of
>> range:
>>
>>   # RANGE ~[2305843009213693952, 16140901064495857663]
>>    n_9 = (long unsigned int) _4;
>>
>> We do a less than stellar job with casts and VR_ANTI_RANGE's, mostly because
>> we're trying various heuristics to make up for the fact that we have crappy
>> range info from VRP.  More specifically, we're basically punting on an
>> VR_ANTI_RANGE and ignoring that the casted result (n_9) has a bound later
>> on.
>>
>> Luckily, we already have code to check simple ranges coming into the alloca
>> by looking into all the basic blocks feeding it.  The attached patch delays
>> the final decision on anti ranges until we have examined the basic blocks
>> and determined for that we are definitely out of range.
>>
>> I expect all this to disappear with Andrew's upcoming range info overhaul.
>>
>> OK for trunk?
>
> I _really_ wonder why all the range consuming warnings are not emitted
> from VRP itself (like we do for -Warray-bounds).  There we'd still see
> a range for the argument derived from the if () rather than needing to
> do our own mini-VRP from the needessly "incomplete" range-info on
> SSA vars.

Can you explain why the range info is only available in VRP and
not outside, via the get_range_info() API?  It sounds as though
the API shouldn't be relied on (it was virtually unused before
GCC 7).

To answer your question, the gimple-ssa-sprintf pass that depends
heavily on ranges would also benefit from having access to the data
computed in the strlen pass that's not available outside it.

The -Wstringop-overflow and -Walloc-size-larger-than warnings depend
on both VRP and tree-object-size.

I have been thinking about how to integrate these passes in GCC 8
to improve the overall quality.  (By integrating I don't necessarily
mean merging the code but rather having them share data or be able
to make calls into one another.)

I'd be very interested in your thoughts on this.

Thanks
Martin
Richard Biener Jan. 20, 2017, 8:17 a.m. UTC | #5
On Thu, Jan 19, 2017 at 5:53 PM, Martin Sebor <msebor@gmail.com> wrote:
> On 01/19/2017 05:45 AM, Richard Biener wrote:
>>
>> On Thu, Jan 19, 2017 at 1:17 PM, Aldy Hernandez <aldyh@redhat.com> wrote:
>>>
>>> In the attached testcase, we have a clearly bounded case of alloca which
>>> is
>>> being incorrectly reported:
>>>
>>> void g (int *p, int *q)
>>> {
>>>    size_t n = (size_t)(p - q);
>>>
>>>    if (n < 10)
>>>      f (__builtin_alloca (n));
>>> }
>>>
>>> The problem is that VRP gives us an anti-range for `n' which may be out
>>> of
>>> range:
>>>
>>>   # RANGE ~[2305843009213693952, 16140901064495857663]
>>>    n_9 = (long unsigned int) _4;
>>>
>>> We do a less than stellar job with casts and VR_ANTI_RANGE's, mostly
>>> because
>>> we're trying various heuristics to make up for the fact that we have
>>> crappy
>>> range info from VRP.  More specifically, we're basically punting on an
>>> VR_ANTI_RANGE and ignoring that the casted result (n_9) has a bound later
>>> on.
>>>
>>> Luckily, we already have code to check simple ranges coming into the
>>> alloca
>>> by looking into all the basic blocks feeding it.  The attached patch
>>> delays
>>> the final decision on anti ranges until we have examined the basic blocks
>>> and determined for that we are definitely out of range.
>>>
>>> I expect all this to disappear with Andrew's upcoming range info
>>> overhaul.
>>>
>>> OK for trunk?
>>
>>
>> I _really_ wonder why all the range consuming warnings are not emitted
>> from VRP itself (like we do for -Warray-bounds).  There we'd still see
>> a range for the argument derived from the if () rather than needing to
>> do our own mini-VRP from the needessly "incomplete" range-info on
>> SSA vars.
>
>
> Can you explain why the range info is only available in VRP and
> not outside, via the get_range_info() API?  It sounds as though
> the API shouldn't be relied on (it was virtually unused before
> GCC 7).

It's very simple.  Look at the testcase from above

void g (int *p, int *q)
{
   size_t n = (size_t)(p - q);

   if (n < 10)
     f (__builtin_alloca (n));
}

The IL outside of VRP is

  <bb 2> [100.00%]:
  p.0_1 = (long int) p_7(D);
  q.1_2 = (long int) q_8(D);
  _3 = p.0_1 - q.1_2;
  _4 = _3 /[ex] 4;
  n_9 = (size_t) _4;
  if (n_9 <= 9)
    goto <bb 3>; [36.64%]
  else
    goto <bb 4>; [63.36%]

  <bb 3> [36.64%]:
  _5 = __builtin_alloca (n_9);
  f (_5);

so there is no SSA name we can tack a range to covering the n_9 <= 9
condition that dominates __builtin_alloca.  Inside VRP we have

  <bb 2> [100.00%]:
  p.0_1 = (long int) p_7(D);
  q.1_2 = (long int) q_8(D);
  _3 = p.0_1 - q.1_2;
  _4 = _3 /[ex] 4;
  n_9 = (size_t) _4;
  if (n_9 <= 9)
    goto <bb 3>; [36.64%]
  else
    goto <bb 4>; [63.36%]

  <bb 3> [36.64%]:
  n_13 = ASSERT_EXPR <n_9, n_9 <= 9>;
  _5 = __builtin_alloca (n_13);
  f (_5);

and viola - now the alloca call uses n_13 which is defined at a point
dominated by if (n_9 <= 9) and thus it has an improved range:

n_13: [0, 9]  EQUIVALENCES: { n_9 } (1 elements)

When in EVRP you get similar behavior (well, there are some missed
cases I have patches queued for for GCC 8), but instead of modifying
the IL EVRP just temporarily sets the above range when it processes
BBs dominated by the condition.

So while for VRP you can put the warning code after propagation for
EVRP you'd have to do warning processing during propagation itself
(and EVRP doesn't iterate).

> To answer your question, the gimple-ssa-sprintf pass that depends
> heavily on ranges would also benefit from having access to the data
> computed in the strlen pass that's not available outside it.
>
> The -Wstringop-overflow and -Walloc-size-larger-than warnings depend
> on both VRP and tree-object-size.
>
> I have been thinking about how to integrate these passes in GCC 8
> to improve the overall quality.  (By integrating I don't necessarily
> mean merging the code but rather having them share data or be able
> to make calls into one another.)
>
> I'd be very interested in your thoughts on this.

Well, my thought is that we should not have N SSA propagation and
M "DOM" propagation passes but one of each kind.  My thought is
also that object-size and strlen are of either kind.

So ideally DOM and EVRP would merge and CCP, copyprop and VRP
would merge.  It's probably not possible (or wise) to merge their lattices
(maybe to some extent).

Those two passes could be used by a static analysis phase prior to
any optimization to emit warnings (just computing the lattice, not doing
any IL modification).

Richard.

> Thanks
> Martin
Aldy Hernandez Jan. 20, 2017, 10:52 a.m. UTC | #6
On 01/20/2017 03:17 AM, Richard Biener wrote:
> On Thu, Jan 19, 2017 at 5:53 PM, Martin Sebor <msebor@gmail.com> wrote:
>> On 01/19/2017 05:45 AM, Richard Biener wrote:
>>>
>>> On Thu, Jan 19, 2017 at 1:17 PM, Aldy Hernandez <aldyh@redhat.com> wrote:
>>>>
>>>> In the attached testcase, we have a clearly bounded case of alloca which
>>>> is
>>>> being incorrectly reported:
>>>>
>>>> void g (int *p, int *q)
>>>> {
>>>>    size_t n = (size_t)(p - q);
>>>>
>>>>    if (n < 10)
>>>>      f (__builtin_alloca (n));
>>>> }
>>>>
>>>> The problem is that VRP gives us an anti-range for `n' which may be out
>>>> of
>>>> range:
>>>>
>>>>   # RANGE ~[2305843009213693952, 16140901064495857663]
>>>>    n_9 = (long unsigned int) _4;
>>>>
>>>> We do a less than stellar job with casts and VR_ANTI_RANGE's, mostly
>>>> because
>>>> we're trying various heuristics to make up for the fact that we have
>>>> crappy
>>>> range info from VRP.  More specifically, we're basically punting on an
>>>> VR_ANTI_RANGE and ignoring that the casted result (n_9) has a bound later
>>>> on.
>>>>
>>>> Luckily, we already have code to check simple ranges coming into the
>>>> alloca
>>>> by looking into all the basic blocks feeding it.  The attached patch
>>>> delays
>>>> the final decision on anti ranges until we have examined the basic blocks
>>>> and determined for that we are definitely out of range.
>>>>
>>>> I expect all this to disappear with Andrew's upcoming range info
>>>> overhaul.
>>>>
>>>> OK for trunk?
>>>
>>>
>>> I _really_ wonder why all the range consuming warnings are not emitted
>>> from VRP itself (like we do for -Warray-bounds).  There we'd still see
>>> a range for the argument derived from the if () rather than needing to
>>> do our own mini-VRP from the needessly "incomplete" range-info on
>>> SSA vars.
>>
>>
>> Can you explain why the range info is only available in VRP and
>> not outside, via the get_range_info() API?  It sounds as though
>> the API shouldn't be relied on (it was virtually unused before
>> GCC 7).
>
> It's very simple.  Look at the testcase from above
>
> void g (int *p, int *q)
> {
>    size_t n = (size_t)(p - q);
>
>    if (n < 10)
>      f (__builtin_alloca (n));
> }
>
> The IL outside of VRP is
>
>   <bb 2> [100.00%]:
>   p.0_1 = (long int) p_7(D);
>   q.1_2 = (long int) q_8(D);
>   _3 = p.0_1 - q.1_2;
>   _4 = _3 /[ex] 4;
>   n_9 = (size_t) _4;
>   if (n_9 <= 9)
>     goto <bb 3>; [36.64%]
>   else
>     goto <bb 4>; [63.36%]
>
>   <bb 3> [36.64%]:
>   _5 = __builtin_alloca (n_9);
>   f (_5);
>
> so there is no SSA name we can tack a range to covering the n_9 <= 9
> condition that dominates __builtin_alloca.  Inside VRP we have
>
>   <bb 2> [100.00%]:
>   p.0_1 = (long int) p_7(D);
>   q.1_2 = (long int) q_8(D);
>   _3 = p.0_1 - q.1_2;
>   _4 = _3 /[ex] 4;
>   n_9 = (size_t) _4;
>   if (n_9 <= 9)
>     goto <bb 3>; [36.64%]
>   else
>     goto <bb 4>; [63.36%]
>
>   <bb 3> [36.64%]:
>   n_13 = ASSERT_EXPR <n_9, n_9 <= 9>;
>   _5 = __builtin_alloca (n_13);
>   f (_5);
>
> and viola - now the alloca call uses n_13 which is defined at a point
> dominated by if (n_9 <= 9) and thus it has an improved range:
>
> n_13: [0, 9]  EQUIVALENCES: { n_9 } (1 elements)

Yes, I remember getting much better range information when doing things 
within VRP2, however since VRP2 runs after jump threading sometimes I'd 
get more false positives.

Tell you what, for GCC8 I volunteer to rewrite gimple-ssa-warn-alloca.c 
within VRP (or using whatever range information we decide upon or is 
ready for GCC8).  In the meantime, can we fix the PR with my patch (or a 
similar approach)?

Aldy
Richard Biener Jan. 20, 2017, 11:39 a.m. UTC | #7
On Fri, Jan 20, 2017 at 11:52 AM, Aldy Hernandez <aldyh@redhat.com> wrote:
> On 01/20/2017 03:17 AM, Richard Biener wrote:
>>
>> On Thu, Jan 19, 2017 at 5:53 PM, Martin Sebor <msebor@gmail.com> wrote:
>>>
>>> On 01/19/2017 05:45 AM, Richard Biener wrote:
>>>>
>>>>
>>>> On Thu, Jan 19, 2017 at 1:17 PM, Aldy Hernandez <aldyh@redhat.com>
>>>> wrote:
>>>>>
>>>>>
>>>>> In the attached testcase, we have a clearly bounded case of alloca
>>>>> which
>>>>> is
>>>>> being incorrectly reported:
>>>>>
>>>>> void g (int *p, int *q)
>>>>> {
>>>>>    size_t n = (size_t)(p - q);
>>>>>
>>>>>    if (n < 10)
>>>>>      f (__builtin_alloca (n));
>>>>> }
>>>>>
>>>>> The problem is that VRP gives us an anti-range for `n' which may be out
>>>>> of
>>>>> range:
>>>>>
>>>>>   # RANGE ~[2305843009213693952, 16140901064495857663]
>>>>>    n_9 = (long unsigned int) _4;
>>>>>
>>>>> We do a less than stellar job with casts and VR_ANTI_RANGE's, mostly
>>>>> because
>>>>> we're trying various heuristics to make up for the fact that we have
>>>>> crappy
>>>>> range info from VRP.  More specifically, we're basically punting on an
>>>>> VR_ANTI_RANGE and ignoring that the casted result (n_9) has a bound
>>>>> later
>>>>> on.
>>>>>
>>>>> Luckily, we already have code to check simple ranges coming into the
>>>>> alloca
>>>>> by looking into all the basic blocks feeding it.  The attached patch
>>>>> delays
>>>>> the final decision on anti ranges until we have examined the basic
>>>>> blocks
>>>>> and determined for that we are definitely out of range.
>>>>>
>>>>> I expect all this to disappear with Andrew's upcoming range info
>>>>> overhaul.
>>>>>
>>>>> OK for trunk?
>>>>
>>>>
>>>>
>>>> I _really_ wonder why all the range consuming warnings are not emitted
>>>> from VRP itself (like we do for -Warray-bounds).  There we'd still see
>>>> a range for the argument derived from the if () rather than needing to
>>>> do our own mini-VRP from the needessly "incomplete" range-info on
>>>> SSA vars.
>>>
>>>
>>>
>>> Can you explain why the range info is only available in VRP and
>>> not outside, via the get_range_info() API?  It sounds as though
>>> the API shouldn't be relied on (it was virtually unused before
>>> GCC 7).
>>
>>
>> It's very simple.  Look at the testcase from above
>>
>> void g (int *p, int *q)
>> {
>>    size_t n = (size_t)(p - q);
>>
>>    if (n < 10)
>>      f (__builtin_alloca (n));
>> }
>>
>> The IL outside of VRP is
>>
>>   <bb 2> [100.00%]:
>>   p.0_1 = (long int) p_7(D);
>>   q.1_2 = (long int) q_8(D);
>>   _3 = p.0_1 - q.1_2;
>>   _4 = _3 /[ex] 4;
>>   n_9 = (size_t) _4;
>>   if (n_9 <= 9)
>>     goto <bb 3>; [36.64%]
>>   else
>>     goto <bb 4>; [63.36%]
>>
>>   <bb 3> [36.64%]:
>>   _5 = __builtin_alloca (n_9);
>>   f (_5);
>>
>> so there is no SSA name we can tack a range to covering the n_9 <= 9
>> condition that dominates __builtin_alloca.  Inside VRP we have
>>
>>   <bb 2> [100.00%]:
>>   p.0_1 = (long int) p_7(D);
>>   q.1_2 = (long int) q_8(D);
>>   _3 = p.0_1 - q.1_2;
>>   _4 = _3 /[ex] 4;
>>   n_9 = (size_t) _4;
>>   if (n_9 <= 9)
>>     goto <bb 3>; [36.64%]
>>   else
>>     goto <bb 4>; [63.36%]
>>
>>   <bb 3> [36.64%]:
>>   n_13 = ASSERT_EXPR <n_9, n_9 <= 9>;
>>   _5 = __builtin_alloca (n_13);
>>   f (_5);
>>
>> and viola - now the alloca call uses n_13 which is defined at a point
>> dominated by if (n_9 <= 9) and thus it has an improved range:
>>
>> n_13: [0, 9]  EQUIVALENCES: { n_9 } (1 elements)
>
>
> Yes, I remember getting much better range information when doing things
> within VRP2, however since VRP2 runs after jump threading sometimes I'd get
> more false positives.
>
> Tell you what, for GCC8 I volunteer to rewrite gimple-ssa-warn-alloca.c
> within VRP (or using whatever range information we decide upon or is ready
> for GCC8).  In the meantime, can we fix the PR with my patch (or a similar
> approach)?

Sure.

Richard.

> Aldy
Martin Sebor Jan. 20, 2017, 4:37 p.m. UTC | #8
On 01/20/2017 01:17 AM, Richard Biener wrote:
> On Thu, Jan 19, 2017 at 5:53 PM, Martin Sebor <msebor@gmail.com> wrote:
>> On 01/19/2017 05:45 AM, Richard Biener wrote:
>>>
>>> On Thu, Jan 19, 2017 at 1:17 PM, Aldy Hernandez <aldyh@redhat.com> wrote:
>>>>
>>>> In the attached testcase, we have a clearly bounded case of alloca which
>>>> is
>>>> being incorrectly reported:
>>>>
>>>> void g (int *p, int *q)
>>>> {
>>>>    size_t n = (size_t)(p - q);
>>>>
>>>>    if (n < 10)
>>>>      f (__builtin_alloca (n));
>>>> }
>>>>
>>>> The problem is that VRP gives us an anti-range for `n' which may be out
>>>> of
>>>> range:
>>>>
>>>>   # RANGE ~[2305843009213693952, 16140901064495857663]
>>>>    n_9 = (long unsigned int) _4;
>>>>
>>>> We do a less than stellar job with casts and VR_ANTI_RANGE's, mostly
>>>> because
>>>> we're trying various heuristics to make up for the fact that we have
>>>> crappy
>>>> range info from VRP.  More specifically, we're basically punting on an
>>>> VR_ANTI_RANGE and ignoring that the casted result (n_9) has a bound later
>>>> on.
>>>>
>>>> Luckily, we already have code to check simple ranges coming into the
>>>> alloca
>>>> by looking into all the basic blocks feeding it.  The attached patch
>>>> delays
>>>> the final decision on anti ranges until we have examined the basic blocks
>>>> and determined for that we are definitely out of range.
>>>>
>>>> I expect all this to disappear with Andrew's upcoming range info
>>>> overhaul.
>>>>
>>>> OK for trunk?
>>>
>>>
>>> I _really_ wonder why all the range consuming warnings are not emitted
>>> from VRP itself (like we do for -Warray-bounds).  There we'd still see
>>> a range for the argument derived from the if () rather than needing to
>>> do our own mini-VRP from the needessly "incomplete" range-info on
>>> SSA vars.
>>
>>
>> Can you explain why the range info is only available in VRP and
>> not outside, via the get_range_info() API?  It sounds as though
>> the API shouldn't be relied on (it was virtually unused before
>> GCC 7).
>
> It's very simple.  Look at the testcase from above

Thanks for the detailed answer.  A few more questions below.

>
> void g (int *p, int *q)
> {
>    size_t n = (size_t)(p - q);
>
>    if (n < 10)
>      f (__builtin_alloca (n));
> }
>
> The IL outside of VRP is
>
>   <bb 2> [100.00%]:
>   p.0_1 = (long int) p_7(D);
>   q.1_2 = (long int) q_8(D);
>   _3 = p.0_1 - q.1_2;
>   _4 = _3 /[ex] 4;
>   n_9 = (size_t) _4;
>   if (n_9 <= 9)
>     goto <bb 3>; [36.64%]
>   else
>     goto <bb 4>; [63.36%]
>
>   <bb 3> [36.64%]:
>   _5 = __builtin_alloca (n_9);
>   f (_5);
>
> so there is no SSA name we can tack a range to covering the n_9 <= 9
> condition that dominates __builtin_alloca.

This may be a naive question but why is it not possible to create
such an SSA name?

Having the get_range_info() API depend on this subtlety makes its
clients quite unreliable.  It may be acceptable for optimization
but when it's visible to users in the form of false positives or
negatives it's a source of bug reports.

> Inside VRP we have
>
>   <bb 2> [100.00%]:
>   p.0_1 = (long int) p_7(D);
>   q.1_2 = (long int) q_8(D);
>   _3 = p.0_1 - q.1_2;
>   _4 = _3 /[ex] 4;
>   n_9 = (size_t) _4;
>   if (n_9 <= 9)
>     goto <bb 3>; [36.64%]
>   else
>     goto <bb 4>; [63.36%]
>
>   <bb 3> [36.64%]:
>   n_13 = ASSERT_EXPR <n_9, n_9 <= 9>;
>   _5 = __builtin_alloca (n_13);
>   f (_5);
>
> and viola - now the alloca call uses n_13 which is defined at a point
> dominated by if (n_9 <= 9) and thus it has an improved range:
>
> n_13: [0, 9]  EQUIVALENCES: { n_9 } (1 elements)

Yes, I see that.  But when I change size_t to unsigned int (in LP64)
like so:

   void g (int *p, int *q)
   {
     unsigned n = (unsigned)(p - q);

     if (n < 10)
       f (__builtin_alloca (n));
   }

-Walloca-larger-than=100 still complains:

   warning: argument to ‘alloca’ may be too large
   note: limit is 100 bytes, but argument may be as large as 4294967295

and the VRP dump shows

   _5: [0, 4294967295]
   _14: [0, 4294967295]
   ...
     _3 = p.0_1 - q.1_2;
     _4 = _3 /[ex] 4;
     n_10 = (unsigned int) _4;
     if (n_10 <= 9)
       goto <bb 3>; [36.64%]
     else
       goto <bb 4>; [63.36%]

     <bb 3> [36.64%]:
     _14 = _4 & 4294967295;
     _5 = (long unsigned int) _14;
     _6 = __builtin_alloca (_5);

so it seems that even in VRP itself the range information isn't
quite good enough to avoid false positives.  (Perhaps that's one
the missed cases you mention below?)

> When in EVRP you get similar behavior (well, there are some missed
> cases I have patches queued for for GCC 8), but instead of modifying
> the IL EVRP just temporarily sets the above range when it processes
> BBs dominated by the condition.
>
> So while for VRP you can put the warning code after propagation for
> EVRP you'd have to do warning processing during propagation itself
> (and EVRP doesn't iterate).
>
>> To answer your question, the gimple-ssa-sprintf pass that depends
>> heavily on ranges would also benefit from having access to the data
>> computed in the strlen pass that's not available outside it.
>>
>> The -Wstringop-overflow and -Walloc-size-larger-than warnings depend
>> on both VRP and tree-object-size.
>>
>> I have been thinking about how to integrate these passes in GCC 8
>> to improve the overall quality.  (By integrating I don't necessarily
>> mean merging the code but rather having them share data or be able
>> to make calls into one another.)
>>
>> I'd be very interested in your thoughts on this.
>
> Well, my thought is that we should not have N SSA propagation and
> M "DOM" propagation passes but one of each kind.  My thought is
> also that object-size and strlen are of either kind.
>
> So ideally DOM and EVRP would merge and CCP, copyprop and VRP
> would merge.  It's probably not possible (or wise) to merge their lattices
> (maybe to some extent).
>
> Those two passes could be used by a static analysis phase prior to
> any optimization to emit warnings (just computing the lattice, not doing
> any IL modification).

Thanks.  I'll ponder that.

Martin
Jeff Law Jan. 20, 2017, 7:30 p.m. UTC | #9
On 01/19/2017 05:45 AM, Richard Biener wrote:
> On Thu, Jan 19, 2017 at 1:17 PM, Aldy Hernandez <aldyh@redhat.com> wrote:
>> In the attached testcase, we have a clearly bounded case of alloca which is
>> being incorrectly reported:
>>
>> void g (int *p, int *q)
>> {
>>    size_t n = (size_t)(p - q);
>>
>>    if (n < 10)
>>      f (__builtin_alloca (n));
>> }
>>
>> The problem is that VRP gives us an anti-range for `n' which may be out of
>> range:
>>
>>   # RANGE ~[2305843009213693952, 16140901064495857663]
>>    n_9 = (long unsigned int) _4;
>>
>> We do a less than stellar job with casts and VR_ANTI_RANGE's, mostly because
>> we're trying various heuristics to make up for the fact that we have crappy
>> range info from VRP.  More specifically, we're basically punting on an
>> VR_ANTI_RANGE and ignoring that the casted result (n_9) has a bound later
>> on.
>>
>> Luckily, we already have code to check simple ranges coming into the alloca
>> by looking into all the basic blocks feeding it.  The attached patch delays
>> the final decision on anti ranges until we have examined the basic blocks
>> and determined for that we are definitely out of range.
>>
>> I expect all this to disappear with Andrew's upcoming range info overhaul.
>>
>> OK for trunk?
>
> I _really_ wonder why all the range consuming warnings are not emitted
> from VRP itself (like we do for -Warray-bounds).  There we'd still see
> a range for the argument derived from the if () rather than needing to
> do our own mini-VRP from the needessly "incomplete" range-info on
> SSA vars.
Blame me.

My thinking was that the warnings themselves are just part of what we 
want to be doing.  We ought to be warning *and* doing erroneous path 
isolation when we find these out of bounds situations.  I felt that VRP 
was already complex and big enough that adding the additional code to do 
the path specific bounds checking and path isolation wasn't appropriate.

The downside is that we lose a lot of valuable information when we leave 
VRP -- essentially what's left attached to a given SSA_NAME is a 
conservative range that applies in all contexts where the SSA_NAME is 
used.  And that is usually too imprecise to give the warnings we want.

jeff
Jeff Law Jan. 20, 2017, 7:35 p.m. UTC | #10
On 01/20/2017 09:37 AM, Martin Sebor wrote:
> On 01/20/2017 01:17 AM, Richard Biener wrote:
>> On Thu, Jan 19, 2017 at 5:53 PM, Martin Sebor <msebor@gmail.com> wrote:
>>> On 01/19/2017 05:45 AM, Richard Biener wrote:
>>>>
>>>> On Thu, Jan 19, 2017 at 1:17 PM, Aldy Hernandez <aldyh@redhat.com>
>>>> wrote:
>>>>>
>>>>> In the attached testcase, we have a clearly bounded case of alloca
>>>>> which
>>>>> is
>>>>> being incorrectly reported:
>>>>>
>>>>> void g (int *p, int *q)
>>>>> {
>>>>>    size_t n = (size_t)(p - q);
>>>>>
>>>>>    if (n < 10)
>>>>>      f (__builtin_alloca (n));
>>>>> }
>>>>>
>>>>> The problem is that VRP gives us an anti-range for `n' which may be
>>>>> out
>>>>> of
>>>>> range:
>>>>>
>>>>>   # RANGE ~[2305843009213693952, 16140901064495857663]
>>>>>    n_9 = (long unsigned int) _4;
>>>>>
>>>>> We do a less than stellar job with casts and VR_ANTI_RANGE's, mostly
>>>>> because
>>>>> we're trying various heuristics to make up for the fact that we have
>>>>> crappy
>>>>> range info from VRP.  More specifically, we're basically punting on an
>>>>> VR_ANTI_RANGE and ignoring that the casted result (n_9) has a bound
>>>>> later
>>>>> on.
>>>>>
>>>>> Luckily, we already have code to check simple ranges coming into the
>>>>> alloca
>>>>> by looking into all the basic blocks feeding it.  The attached patch
>>>>> delays
>>>>> the final decision on anti ranges until we have examined the basic
>>>>> blocks
>>>>> and determined for that we are definitely out of range.
>>>>>
>>>>> I expect all this to disappear with Andrew's upcoming range info
>>>>> overhaul.
>>>>>
>>>>> OK for trunk?
>>>>
>>>>
>>>> I _really_ wonder why all the range consuming warnings are not emitted
>>>> from VRP itself (like we do for -Warray-bounds).  There we'd still see
>>>> a range for the argument derived from the if () rather than needing to
>>>> do our own mini-VRP from the needessly "incomplete" range-info on
>>>> SSA vars.
>>>
>>>
>>> Can you explain why the range info is only available in VRP and
>>> not outside, via the get_range_info() API?  It sounds as though
>>> the API shouldn't be relied on (it was virtually unused before
>>> GCC 7).
>>
>> It's very simple.  Look at the testcase from above
>
> Thanks for the detailed answer.  A few more questions below.
>
>>
>> void g (int *p, int *q)
>> {
>>    size_t n = (size_t)(p - q);
>>
>>    if (n < 10)
>>      f (__builtin_alloca (n));
>> }
>>
>> The IL outside of VRP is
>>
>>   <bb 2> [100.00%]:
>>   p.0_1 = (long int) p_7(D);
>>   q.1_2 = (long int) q_8(D);
>>   _3 = p.0_1 - q.1_2;
>>   _4 = _3 /[ex] 4;
>>   n_9 = (size_t) _4;
>>   if (n_9 <= 9)
>>     goto <bb 3>; [36.64%]
>>   else
>>     goto <bb 4>; [63.36%]
>>
>>   <bb 3> [36.64%]:
>>   _5 = __builtin_alloca (n_9);
>>   f (_5);
>>
>> so there is no SSA name we can tack a range to covering the n_9 <= 9
>> condition that dominates __builtin_alloca.
>
> This may be a naive question but why is it not possible to create
> such an SSA name?
Time and space complexity.  To get the range information in this case we 
have to create new SSA_NAMEs and PHI nodes to merge them at BB3.

Even if you create them, passes post VRP are going to blow them away :-)

This is the problem Andrew is working on.  In simplest terms the ability 
to query the range of an object on a path through the CFG.  So we could 
ask for the range of n_9 on the edge 2->3 (<=9) or we could ask for the 
range of n_9 on the edge 2->4 which would be > 9.




Jeff
Jakub Jelinek Jan. 20, 2017, 7:43 p.m. UTC | #11
On Fri, Jan 20, 2017 at 12:35:32PM -0700, Jeff Law wrote:
> > This may be a naive question but why is it not possible to create
> > such an SSA name?
> Time and space complexity.  To get the range information in this case we
> have to create new SSA_NAMEs and PHI nodes to merge them at BB3.

Not just that.  Also lots of optimizations rely on the extra SSA_NAME copies
to be removed to be able to optimize well.  If you'd have lots of SSA_NAMEs with
the same value, just differing with their range info, then unless passes
perform SCCVN or something similar, they wouldn't be able to figure equality
etc.  And if in SCCVN those SSA_NAMEs with different range info, but same
value, compare the same, then they are going to be replaced just by one of
them anyway.

> Even if you create them, passes post VRP are going to blow them away :-)

Exactly.

	Jakub
Jeff Law Jan. 20, 2017, 7:48 p.m. UTC | #12
On 01/20/2017 01:17 AM, Richard Biener wrote:
> On Thu, Jan 19, 2017 at 5:53 PM, Martin Sebor <msebor@gmail.com> wrote:
>> On 01/19/2017 05:45 AM, Richard Biener wrote:
>>>
>>> On Thu, Jan 19, 2017 at 1:17 PM, Aldy Hernandez <aldyh@redhat.com> wrote:
>>>>
>>>> In the attached testcase, we have a clearly bounded case of alloca which
>>>> is
>>>> being incorrectly reported:
>>>>
>>>> void g (int *p, int *q)
>>>> {
>>>>    size_t n = (size_t)(p - q);
>>>>
>>>>    if (n < 10)
>>>>      f (__builtin_alloca (n));
>>>> }
>>>>
>>>> The problem is that VRP gives us an anti-range for `n' which may be out
>>>> of
>>>> range:
>>>>
>>>>   # RANGE ~[2305843009213693952, 16140901064495857663]
>>>>    n_9 = (long unsigned int) _4;
>>>>
>>>> We do a less than stellar job with casts and VR_ANTI_RANGE's, mostly
>>>> because
>>>> we're trying various heuristics to make up for the fact that we have
>>>> crappy
>>>> range info from VRP.  More specifically, we're basically punting on an
>>>> VR_ANTI_RANGE and ignoring that the casted result (n_9) has a bound later
>>>> on.
>>>>
>>>> Luckily, we already have code to check simple ranges coming into the
>>>> alloca
>>>> by looking into all the basic blocks feeding it.  The attached patch
>>>> delays
>>>> the final decision on anti ranges until we have examined the basic blocks
>>>> and determined for that we are definitely out of range.
>>>>
>>>> I expect all this to disappear with Andrew's upcoming range info
>>>> overhaul.
>>>>
>>>> OK for trunk?
>>>
>>>
>>> I _really_ wonder why all the range consuming warnings are not emitted
>>> from VRP itself (like we do for -Warray-bounds).  There we'd still see
>>> a range for the argument derived from the if () rather than needing to
>>> do our own mini-VRP from the needessly "incomplete" range-info on
>>> SSA vars.
>>
>>
>> Can you explain why the range info is only available in VRP and
>> not outside, via the get_range_info() API?  It sounds as though
>> the API shouldn't be relied on (it was virtually unused before
>> GCC 7).
>
> It's very simple.  Look at the testcase from above
>
> void g (int *p, int *q)
> {
>    size_t n = (size_t)(p - q);
>
>    if (n < 10)
>      f (__builtin_alloca (n));
> }
>
> The IL outside of VRP is
>
>   <bb 2> [100.00%]:
>   p.0_1 = (long int) p_7(D);
>   q.1_2 = (long int) q_8(D);
>   _3 = p.0_1 - q.1_2;
>   _4 = _3 /[ex] 4;
>   n_9 = (size_t) _4;
>   if (n_9 <= 9)
>     goto <bb 3>; [36.64%]
>   else
>     goto <bb 4>; [63.36%]
>
>   <bb 3> [36.64%]:
>   _5 = __builtin_alloca (n_9);
>   f (_5);
>
> so there is no SSA name we can tack a range to covering the n_9 <= 9
> condition that dominates __builtin_alloca.  Inside VRP we have
>
>   <bb 2> [100.00%]:
>   p.0_1 = (long int) p_7(D);
>   q.1_2 = (long int) q_8(D);
>   _3 = p.0_1 - q.1_2;
>   _4 = _3 /[ex] 4;
>   n_9 = (size_t) _4;
>   if (n_9 <= 9)
>     goto <bb 3>; [36.64%]
>   else
>     goto <bb 4>; [63.36%]
>
>   <bb 3> [36.64%]:
>   n_13 = ASSERT_EXPR <n_9, n_9 <= 9>;
>   _5 = __builtin_alloca (n_13);
>   f (_5);
>
> and viola - now the alloca call uses n_13 which is defined at a point
> dominated by if (n_9 <= 9) and thus it has an improved range:
>
> n_13: [0, 9]  EQUIVALENCES: { n_9 } (1 elements)
>
> When in EVRP you get similar behavior (well, there are some missed
> cases I have patches queued for for GCC 8), but instead of modifying
> the IL EVRP just temporarily sets the above range when it processes
> BBs dominated by the condition.
>
> So while for VRP you can put the warning code after propagation for
> EVRP you'd have to do warning processing during propagation itself
> (and EVRP doesn't iterate).
>
>> To answer your question, the gimple-ssa-sprintf pass that depends
>> heavily on ranges would also benefit from having access to the data
>> computed in the strlen pass that's not available outside it.
>>
>> The -Wstringop-overflow and -Walloc-size-larger-than warnings depend
>> on both VRP and tree-object-size.
>>
>> I have been thinking about how to integrate these passes in GCC 8
>> to improve the overall quality.  (By integrating I don't necessarily
>> mean merging the code but rather having them share data or be able
>> to make calls into one another.)
>>
>> I'd be very interested in your thoughts on this.
>
> Well, my thought is that we should not have N SSA propagation and
> M "DOM" propagation passes but one of each kind.  My thought is
> also that object-size and strlen are of either kind.
>
> So ideally DOM and EVRP would merge and CCP, copyprop and VRP
> would merge.  It's probably not possible (or wise) to merge their lattices
> (maybe to some extent)
DOM's equivalency tables could be easily superceded by other 
implementations.  It's just two datastructures.  A trivial const/copy 
table and a stack to allow unwinding the state of the const/copy table. 
Throwing away the basic const/copy table and replacing it with something 
built by copyprop ought to be trivial.

The big thing that would be left would be the scoped hash table for 
tracking redundant expressions.  But I don't think that we'd necessarily 
have to rip that out to merge DOM with one of hte other passes.

Hell we know DOM can fit well in any dominator based walk -- we used to 
do it as part of the into-ssa transformation.


Jeff
Richard Biener Jan. 23, 2017, 9:45 a.m. UTC | #13
On Fri, Jan 20, 2017 at 5:37 PM, Martin Sebor <msebor@gmail.com> wrote:
> On 01/20/2017 01:17 AM, Richard Biener wrote:
>>
>> On Thu, Jan 19, 2017 at 5:53 PM, Martin Sebor <msebor@gmail.com> wrote:
>>>
>>> On 01/19/2017 05:45 AM, Richard Biener wrote:
>>>>
>>>>
>>>> On Thu, Jan 19, 2017 at 1:17 PM, Aldy Hernandez <aldyh@redhat.com>
>>>> wrote:
>>>>>
>>>>>
>>>>> In the attached testcase, we have a clearly bounded case of alloca
>>>>> which
>>>>> is
>>>>> being incorrectly reported:
>>>>>
>>>>> void g (int *p, int *q)
>>>>> {
>>>>>    size_t n = (size_t)(p - q);
>>>>>
>>>>>    if (n < 10)
>>>>>      f (__builtin_alloca (n));
>>>>> }
>>>>>
>>>>> The problem is that VRP gives us an anti-range for `n' which may be out
>>>>> of
>>>>> range:
>>>>>
>>>>>   # RANGE ~[2305843009213693952, 16140901064495857663]
>>>>>    n_9 = (long unsigned int) _4;
>>>>>
>>>>> We do a less than stellar job with casts and VR_ANTI_RANGE's, mostly
>>>>> because
>>>>> we're trying various heuristics to make up for the fact that we have
>>>>> crappy
>>>>> range info from VRP.  More specifically, we're basically punting on an
>>>>> VR_ANTI_RANGE and ignoring that the casted result (n_9) has a bound
>>>>> later
>>>>> on.
>>>>>
>>>>> Luckily, we already have code to check simple ranges coming into the
>>>>> alloca
>>>>> by looking into all the basic blocks feeding it.  The attached patch
>>>>> delays
>>>>> the final decision on anti ranges until we have examined the basic
>>>>> blocks
>>>>> and determined for that we are definitely out of range.
>>>>>
>>>>> I expect all this to disappear with Andrew's upcoming range info
>>>>> overhaul.
>>>>>
>>>>> OK for trunk?
>>>>
>>>>
>>>>
>>>> I _really_ wonder why all the range consuming warnings are not emitted
>>>> from VRP itself (like we do for -Warray-bounds).  There we'd still see
>>>> a range for the argument derived from the if () rather than needing to
>>>> do our own mini-VRP from the needessly "incomplete" range-info on
>>>> SSA vars.
>>>
>>>
>>>
>>> Can you explain why the range info is only available in VRP and
>>> not outside, via the get_range_info() API?  It sounds as though
>>> the API shouldn't be relied on (it was virtually unused before
>>> GCC 7).
>>
>>
>> It's very simple.  Look at the testcase from above
>
>
> Thanks for the detailed answer.  A few more questions below.
>
>>
>> void g (int *p, int *q)
>> {
>>    size_t n = (size_t)(p - q);
>>
>>    if (n < 10)
>>      f (__builtin_alloca (n));
>> }
>>
>> The IL outside of VRP is
>>
>>   <bb 2> [100.00%]:
>>   p.0_1 = (long int) p_7(D);
>>   q.1_2 = (long int) q_8(D);
>>   _3 = p.0_1 - q.1_2;
>>   _4 = _3 /[ex] 4;
>>   n_9 = (size_t) _4;
>>   if (n_9 <= 9)
>>     goto <bb 3>; [36.64%]
>>   else
>>     goto <bb 4>; [63.36%]
>>
>>   <bb 3> [36.64%]:
>>   _5 = __builtin_alloca (n_9);
>>   f (_5);
>>
>> so there is no SSA name we can tack a range to covering the n_9 <= 9
>> condition that dominates __builtin_alloca.
>
>
> This may be a naive question but why is it not possible to create
> such an SSA name?

You'd insert a copy (like VRP does with ASSERTs) but those will be
propagated out quickly again.

> Having the get_range_info() API depend on this subtlety makes its
> clients quite unreliable.  It may be acceptable for optimization
> but when it's visible to users in the form of false positives or
> negatives it's a source of bug reports.

Yes, but that's the fault of the warning machinery not the range-info API
which was designed for optimization.

>> Inside VRP we have
>>
>>   <bb 2> [100.00%]:
>>   p.0_1 = (long int) p_7(D);
>>   q.1_2 = (long int) q_8(D);
>>   _3 = p.0_1 - q.1_2;
>>   _4 = _3 /[ex] 4;
>>   n_9 = (size_t) _4;
>>   if (n_9 <= 9)
>>     goto <bb 3>; [36.64%]
>>   else
>>     goto <bb 4>; [63.36%]
>>
>>   <bb 3> [36.64%]:
>>   n_13 = ASSERT_EXPR <n_9, n_9 <= 9>;
>>   _5 = __builtin_alloca (n_13);
>>   f (_5);
>>
>> and viola - now the alloca call uses n_13 which is defined at a point
>> dominated by if (n_9 <= 9) and thus it has an improved range:
>>
>> n_13: [0, 9]  EQUIVALENCES: { n_9 } (1 elements)
>
>
> Yes, I see that.  But when I change size_t to unsigned int (in LP64)
> like so:
>
>   void g (int *p, int *q)
>   {
>     unsigned n = (unsigned)(p - q);
>
>     if (n < 10)
>       f (__builtin_alloca (n));
>   }
>
> -Walloca-larger-than=100 still complains:
>
>   warning: argument to ‘alloca’ may be too large
>   note: limit is 100 bytes, but argument may be as large as 4294967295
>
> and the VRP dump shows
>
>   _5: [0, 4294967295]
>   _14: [0, 4294967295]
>   ...
>     _3 = p.0_1 - q.1_2;
>     _4 = _3 /[ex] 4;
>     n_10 = (unsigned int) _4;
>     if (n_10 <= 9)
>       goto <bb 3>; [36.64%]
>     else
>       goto <bb 4>; [63.36%]
>
>     <bb 3> [36.64%]:
>     _14 = _4 & 4294967295;
>     _5 = (long unsigned int) _14;
>     _6 = __builtin_alloca (_5);
>
> so it seems that even in VRP itself the range information isn't
> quite good enough to avoid false positives.  (Perhaps that's one
> the missed cases you mention below?)

Well, you see that there's no obvious way to use the n_10 <= 9 conditional
here.  VRP would need to register a [0, 9] range for the lower half of n_10
and then figure after seeing

>     _14 = _4 & 4294967295;

that _14 is now [0, 9].

That's a neat trick VRP cannot handle at the moment (there's no way the
lattice can encode a range for a sub-set of all bits of a SSA name).

Your source is bogus in the way that (unsigned)(p - q) might truncate
the pointer difference.

Richard.

>
>
>> When in EVRP you get similar behavior (well, there are some missed
>> cases I have patches queued for for GCC 8), but instead of modifying
>> the IL EVRP just temporarily sets the above range when it processes
>> BBs dominated by the condition.
>>
>> So while for VRP you can put the warning code after propagation for
>> EVRP you'd have to do warning processing during propagation itself
>> (and EVRP doesn't iterate).
>>
>>> To answer your question, the gimple-ssa-sprintf pass that depends
>>> heavily on ranges would also benefit from having access to the data
>>> computed in the strlen pass that's not available outside it.
>>>
>>> The -Wstringop-overflow and -Walloc-size-larger-than warnings depend
>>> on both VRP and tree-object-size.
>>>
>>> I have been thinking about how to integrate these passes in GCC 8
>>> to improve the overall quality.  (By integrating I don't necessarily
>>> mean merging the code but rather having them share data or be able
>>> to make calls into one another.)
>>>
>>> I'd be very interested in your thoughts on this.
>>
>>
>> Well, my thought is that we should not have N SSA propagation and
>> M "DOM" propagation passes but one of each kind.  My thought is
>> also that object-size and strlen are of either kind.
>>
>> So ideally DOM and EVRP would merge and CCP, copyprop and VRP
>> would merge.  It's probably not possible (or wise) to merge their lattices
>> (maybe to some extent).
>>
>> Those two passes could be used by a static analysis phase prior to
>> any optimization to emit warnings (just computing the lattice, not doing
>> any IL modification).
>
>
> Thanks.  I'll ponder that.
>
> Martin
Richard Biener Jan. 23, 2017, 9:50 a.m. UTC | #14
On Fri, Jan 20, 2017 at 8:48 PM, Jeff Law <law@redhat.com> wrote:
> On 01/20/2017 01:17 AM, Richard Biener wrote:
>>
>> On Thu, Jan 19, 2017 at 5:53 PM, Martin Sebor <msebor@gmail.com> wrote:
>>>
>>> On 01/19/2017 05:45 AM, Richard Biener wrote:
>>>>
>>>>
>>>> On Thu, Jan 19, 2017 at 1:17 PM, Aldy Hernandez <aldyh@redhat.com>
>>>> wrote:
>>>>>
>>>>>
>>>>> In the attached testcase, we have a clearly bounded case of alloca
>>>>> which
>>>>> is
>>>>> being incorrectly reported:
>>>>>
>>>>> void g (int *p, int *q)
>>>>> {
>>>>>    size_t n = (size_t)(p - q);
>>>>>
>>>>>    if (n < 10)
>>>>>      f (__builtin_alloca (n));
>>>>> }
>>>>>
>>>>> The problem is that VRP gives us an anti-range for `n' which may be out
>>>>> of
>>>>> range:
>>>>>
>>>>>   # RANGE ~[2305843009213693952, 16140901064495857663]
>>>>>    n_9 = (long unsigned int) _4;
>>>>>
>>>>> We do a less than stellar job with casts and VR_ANTI_RANGE's, mostly
>>>>> because
>>>>> we're trying various heuristics to make up for the fact that we have
>>>>> crappy
>>>>> range info from VRP.  More specifically, we're basically punting on an
>>>>> VR_ANTI_RANGE and ignoring that the casted result (n_9) has a bound
>>>>> later
>>>>> on.
>>>>>
>>>>> Luckily, we already have code to check simple ranges coming into the
>>>>> alloca
>>>>> by looking into all the basic blocks feeding it.  The attached patch
>>>>> delays
>>>>> the final decision on anti ranges until we have examined the basic
>>>>> blocks
>>>>> and determined for that we are definitely out of range.
>>>>>
>>>>> I expect all this to disappear with Andrew's upcoming range info
>>>>> overhaul.
>>>>>
>>>>> OK for trunk?
>>>>
>>>>
>>>>
>>>> I _really_ wonder why all the range consuming warnings are not emitted
>>>> from VRP itself (like we do for -Warray-bounds).  There we'd still see
>>>> a range for the argument derived from the if () rather than needing to
>>>> do our own mini-VRP from the needessly "incomplete" range-info on
>>>> SSA vars.
>>>
>>>
>>>
>>> Can you explain why the range info is only available in VRP and
>>> not outside, via the get_range_info() API?  It sounds as though
>>> the API shouldn't be relied on (it was virtually unused before
>>> GCC 7).
>>
>>
>> It's very simple.  Look at the testcase from above
>>
>> void g (int *p, int *q)
>> {
>>    size_t n = (size_t)(p - q);
>>
>>    if (n < 10)
>>      f (__builtin_alloca (n));
>> }
>>
>> The IL outside of VRP is
>>
>>   <bb 2> [100.00%]:
>>   p.0_1 = (long int) p_7(D);
>>   q.1_2 = (long int) q_8(D);
>>   _3 = p.0_1 - q.1_2;
>>   _4 = _3 /[ex] 4;
>>   n_9 = (size_t) _4;
>>   if (n_9 <= 9)
>>     goto <bb 3>; [36.64%]
>>   else
>>     goto <bb 4>; [63.36%]
>>
>>   <bb 3> [36.64%]:
>>   _5 = __builtin_alloca (n_9);
>>   f (_5);
>>
>> so there is no SSA name we can tack a range to covering the n_9 <= 9
>> condition that dominates __builtin_alloca.  Inside VRP we have
>>
>>   <bb 2> [100.00%]:
>>   p.0_1 = (long int) p_7(D);
>>   q.1_2 = (long int) q_8(D);
>>   _3 = p.0_1 - q.1_2;
>>   _4 = _3 /[ex] 4;
>>   n_9 = (size_t) _4;
>>   if (n_9 <= 9)
>>     goto <bb 3>; [36.64%]
>>   else
>>     goto <bb 4>; [63.36%]
>>
>>   <bb 3> [36.64%]:
>>   n_13 = ASSERT_EXPR <n_9, n_9 <= 9>;
>>   _5 = __builtin_alloca (n_13);
>>   f (_5);
>>
>> and viola - now the alloca call uses n_13 which is defined at a point
>> dominated by if (n_9 <= 9) and thus it has an improved range:
>>
>> n_13: [0, 9]  EQUIVALENCES: { n_9 } (1 elements)
>>
>> When in EVRP you get similar behavior (well, there are some missed
>> cases I have patches queued for for GCC 8), but instead of modifying
>> the IL EVRP just temporarily sets the above range when it processes
>> BBs dominated by the condition.
>>
>> So while for VRP you can put the warning code after propagation for
>> EVRP you'd have to do warning processing during propagation itself
>> (and EVRP doesn't iterate).
>>
>>> To answer your question, the gimple-ssa-sprintf pass that depends
>>> heavily on ranges would also benefit from having access to the data
>>> computed in the strlen pass that's not available outside it.
>>>
>>> The -Wstringop-overflow and -Walloc-size-larger-than warnings depend
>>> on both VRP and tree-object-size.
>>>
>>> I have been thinking about how to integrate these passes in GCC 8
>>> to improve the overall quality.  (By integrating I don't necessarily
>>> mean merging the code but rather having them share data or be able
>>> to make calls into one another.)
>>>
>>> I'd be very interested in your thoughts on this.
>>
>>
>> Well, my thought is that we should not have N SSA propagation and
>> M "DOM" propagation passes but one of each kind.  My thought is
>> also that object-size and strlen are of either kind.
>>
>> So ideally DOM and EVRP would merge and CCP, copyprop and VRP
>> would merge.  It's probably not possible (or wise) to merge their lattices
>> (maybe to some extent)
>
> DOM's equivalency tables could be easily superceded by other
> implementations.  It's just two datastructures.  A trivial const/copy table
> and a stack to allow unwinding the state of the const/copy table. Throwing
> away the basic const/copy table and replacing it with something built by
> copyprop ought to be trivial.
>
> The big thing that would be left would be the scoped hash table for tracking
> redundant expressions.  But I don't think that we'd necessarily have to rip
> that out to merge DOM with one of hte other passes.
>
> Hell we know DOM can fit well in any dominator based walk -- we used to do
> it as part of the into-ssa transformation.

Sure.

The question is whether we want to make warning "passes" more expensive
by essentially doing a [E]VRP/DOM pass (but not doing any IL transform).

I believe doing that is more reasonable than doing their own hacky analysis.

Having less passes to choose to copy from for such static analysis (and the
ability to tame compile-time by some switches) would be a good thing to have.

Richard.

>
> Jeff
Martin Sebor Jan. 23, 2017, 3:55 p.m. UTC | #15
>> Yes, I see that.  But when I change size_t to unsigned int (in LP64)
>> like so:
>>
>>   void g (int *p, int *q)
>>   {
>>     unsigned n = (unsigned)(p - q);
>>
>>     if (n < 10)
>>       f (__builtin_alloca (n));
>>   }
>>
>> -Walloca-larger-than=100 still complains:
>>
>>   warning: argument to ‘alloca’ may be too large
>>   note: limit is 100 bytes, but argument may be as large as 4294967295
>>
>> and the VRP dump shows
>>
>>   _5: [0, 4294967295]
>>   _14: [0, 4294967295]
>>   ...
>>     _3 = p.0_1 - q.1_2;
>>     _4 = _3 /[ex] 4;
>>     n_10 = (unsigned int) _4;
>>     if (n_10 <= 9)
>>       goto <bb 3>; [36.64%]
>>     else
>>       goto <bb 4>; [63.36%]
>>
>>     <bb 3> [36.64%]:
>>     _14 = _4 & 4294967295;
>>     _5 = (long unsigned int) _14;
>>     _6 = __builtin_alloca (_5);
>>
>> so it seems that even in VRP itself the range information isn't
>> quite good enough to avoid false positives.  (Perhaps that's one
>> the missed cases you mention below?)
>
> Well, you see that there's no obvious way to use the n_10 <= 9 conditional
> here.  VRP would need to register a [0, 9] range for the lower half of n_10
> and then figure after seeing
>
>>     _14 = _4 & 4294967295;
>
> that _14 is now [0, 9].
>
> That's a neat trick VRP cannot handle at the moment (there's no way the
> lattice can encode a range for a sub-set of all bits of a SSA name).

Sure.  My point is just that it looks like there will be some false
positives whether the warnings are implemented as part of the VRP
pass or elsewhere.  (I admit I haven't studied the implementation
of the VRP pass to understand whether these false positives are
avoidable and I'm happy to take your word if if you say they can.)

> Your source is bogus in the way that (unsigned)(p - q) might truncate
> the pointer difference.

The problem is independent of the pointer difference and can be
reproduced by any conversion from a wider type to a narrower
unsigned type (even the safe ones), as the test case in bug 79191
I opened for it shows:

   void f (unsigned long n)
   {
     if (n > 3)
       __builtin_abort ();
   }

   void h (unsigned long m)
   {
     unsigned n = m;

     if (n < 3)
       f (n);
   }

Martin
Jeff Law Jan. 23, 2017, 4:07 p.m. UTC | #16
On 01/23/2017 02:50 AM, Richard Biener wrote:
>>>
>>> So ideally DOM and EVRP would merge and CCP, copyprop and VRP
>>> would merge.  It's probably not possible (or wise) to merge their lattices
>>> (maybe to some extent)
>>
>> DOM's equivalency tables could be easily superceded by other
>> implementations.  It's just two datastructures.  A trivial const/copy table
>> and a stack to allow unwinding the state of the const/copy table. Throwing
>> away the basic const/copy table and replacing it with something built by
>> copyprop ought to be trivial.
>>
>> The big thing that would be left would be the scoped hash table for tracking
>> redundant expressions.  But I don't think that we'd necessarily have to rip
>> that out to merge DOM with one of hte other passes.
>>
>> Hell we know DOM can fit well in any dominator based walk -- we used to do
>> it as part of the into-ssa transformation.
>
> Sure.
>
> The question is whether we want to make warning "passes" more expensive
> by essentially doing a [E]VRP/DOM pass (but not doing any IL transform).
>
> I believe doing that is more reasonable than doing their own hacky analysis.
>
> Having less passes to choose to copy from for such static analysis (and the
> ability to tame compile-time by some switches) would be a good thing to have.
Very true.  But your comment seemed to be about merging DOM and EVRP or 
other passes -- which is certainly possible and possibly even desirable.

The more I look at our SCCVN implementation, the more I want to explore 
trying to re-use that infrastructure to simplify DOM.


Jeff
Jeff Law Jan. 23, 2017, 10:24 p.m. UTC | #17
On 01/19/2017 05:17 AM, Aldy Hernandez wrote:
> In the attached testcase, we have a clearly bounded case of alloca which
> is being incorrectly reported:
>
> void g (int *p, int *q)
> {
>    size_t n = (size_t)(p - q);
>
>    if (n < 10)
>      f (__builtin_alloca (n));
> }
>
> The problem is that VRP gives us an anti-range for `n' which may be out
> of range:
>
>   # RANGE ~[2305843009213693952, 16140901064495857663]
>    n_9 = (long unsigned int) _4;
>
> We do a less than stellar job with casts and VR_ANTI_RANGE's, mostly
> because we're trying various heuristics to make up for the fact that we
> have crappy range info from VRP.  More specifically, we're basically
> punting on an VR_ANTI_RANGE and ignoring that the casted result (n_9)
> has a bound later on.
>
> Luckily, we already have code to check simple ranges coming into the
> alloca by looking into all the basic blocks feeding it.  The attached
> patch delays the final decision on anti ranges until we have examined
> the basic blocks and determined for that we are definitely out of range.
>
> I expect all this to disappear with Andrew's upcoming range info overhaul.
>
> OK for trunk?
>
Yes this is another case where we're hoping that Andrew's work can help 
significantly.   The recorded range for n_9 is the range that appplies 
over the entire function, when we really want the range for n_9 that 
occurs within the TRUE arm of the conditional.

jeff
Jeff Law Jan. 23, 2017, 10:25 p.m. UTC | #18
On 01/19/2017 05:17 AM, Aldy Hernandez wrote:
> In the attached testcase, we have a clearly bounded case of alloca which
> is being incorrectly reported:
>
> void g (int *p, int *q)
> {
>    size_t n = (size_t)(p - q);
>
>    if (n < 10)
>      f (__builtin_alloca (n));
> }
>
> The problem is that VRP gives us an anti-range for `n' which may be out
> of range:
>
>   # RANGE ~[2305843009213693952, 16140901064495857663]
>    n_9 = (long unsigned int) _4;
>
> We do a less than stellar job with casts and VR_ANTI_RANGE's, mostly
> because we're trying various heuristics to make up for the fact that we
> have crappy range info from VRP.  More specifically, we're basically
> punting on an VR_ANTI_RANGE and ignoring that the casted result (n_9)
> has a bound later on.
>
> Luckily, we already have code to check simple ranges coming into the
> alloca by looking into all the basic blocks feeding it.  The attached
> patch delays the final decision on anti ranges until we have examined
> the basic blocks and determined for that we are definitely out of range.
>
> I expect all this to disappear with Andrew's upcoming range info overhaul.
>
> OK for trunk?
>
Oh, and this is fine for the trunk.

jeff
Richard Biener Jan. 24, 2017, 10:15 a.m. UTC | #19
On Mon, Jan 23, 2017 at 5:07 PM, Jeff Law <law@redhat.com> wrote:
> On 01/23/2017 02:50 AM, Richard Biener wrote:
>>>>
>>>>
>>>> So ideally DOM and EVRP would merge and CCP, copyprop and VRP
>>>> would merge.  It's probably not possible (or wise) to merge their
>>>> lattices
>>>> (maybe to some extent)
>>>
>>>
>>> DOM's equivalency tables could be easily superceded by other
>>> implementations.  It's just two datastructures.  A trivial const/copy
>>> table
>>> and a stack to allow unwinding the state of the const/copy table.
>>> Throwing
>>> away the basic const/copy table and replacing it with something built by
>>> copyprop ought to be trivial.
>>>
>>> The big thing that would be left would be the scoped hash table for
>>> tracking
>>> redundant expressions.  But I don't think that we'd necessarily have to
>>> rip
>>> that out to merge DOM with one of hte other passes.
>>>
>>> Hell we know DOM can fit well in any dominator based walk -- we used to
>>> do
>>> it as part of the into-ssa transformation.
>>
>>
>> Sure.
>>
>> The question is whether we want to make warning "passes" more expensive
>> by essentially doing a [E]VRP/DOM pass (but not doing any IL transform).
>>
>> I believe doing that is more reasonable than doing their own hacky
>> analysis.
>>
>> Having less passes to choose to copy from for such static analysis (and
>> the
>> ability to tame compile-time by some switches) would be a good thing to
>> have.
>
> Very true.  But your comment seemed to be about merging DOM and EVRP or
> other passes -- which is certainly possible and possibly even desirable.
>
> The more I look at our SCCVN implementation, the more I want to explore
> trying to re-use that infrastructure to simplify DOM.

Certainly having a single way to hash/record stmts/expressions on GIMPLE would
be nice.  Not sure if the SCCVN one is perfect (enhancing the memory
part further
is on my TODO list).

But I also want to merge SCCVN and DOM in the end.  Well, rip out the 'SCC'
part of SCCVN and replace it with a RPO style VN which makes it very similar
to DOM.  Ideally we'd have one "VN" we can configure (like enable a VRP lattice,
disable memory handling to get a pure CCP, disable optimistic VN and thus
require no iteration, etc.).

It may make sense to keep (a similarly single!) SSA-propagator based VN
(ccp/copyprop/VRP).

Richard.

>
> Jeff
Jeff Law Jan. 24, 2017, 3:30 p.m. UTC | #20
On 01/24/2017 03:15 AM, Richard Biener wrote:

>> The more I look at our SCCVN implementation, the more I want to explore
>> trying to re-use that infrastructure to simplify DOM.
>
> Certainly having a single way to hash/record stmts/expressions on GIMPLE would
> be nice.  Not sure if the SCCVN one is perfect (enhancing the memory
> part further is on my TODO list).
It doesn't have to be perfect.  From my wanderings in its code I think 
it's probably sufficient to replace 90% of the non-path specific DOM 
optimizations.

The idea is to do value numbering as an independent step (SCCVN, RPO 
walk, whatever).  Then use a scheduler similar to Click's 95 work to 
place statements where they belong.  The result is far simpler than what 
DOM does and re-uses the VN framework.


In that world DOM morphs away from generic CSE/simplifications and 
focuses on path specific stuff.  And *that* should be reimplemented as 
backwards walk with value numbering.  It's hard to tease that out right 
now given how we depend on DOM for generic CSE/simplifications.

Jeff

Patch
diff mbox

diff --git a/gcc/gimple-ssa-warn-alloca.c b/gcc/gimple-ssa-warn-alloca.c
index a27eea1..d553a34 100644
--- a/gcc/gimple-ssa-warn-alloca.c
+++ b/gcc/gimple-ssa-warn-alloca.c
@@ -272,6 +272,7 @@  static struct alloca_type_and_limit
 alloca_call_type (gimple *stmt, bool is_vla, tree *invalid_casted_type)
 {
   gcc_assert (gimple_alloca_call_p (stmt));
+  bool tentative_cast_from_signed = false;
   tree len = gimple_call_arg (stmt, 0);
   tree len_casted = NULL;
   wide_int min, max;
@@ -352,8 +353,26 @@  alloca_call_type (gimple *stmt, bool is_vla, tree *invalid_casted_type)
 	  // with this heuristic.  Hopefully, this VR_ANTI_RANGE
 	  // nonsense will go away, and we won't have to catch the
 	  // sign conversion problems with this crap.
+	  //
+	  // This is here to catch things like:
+	  // void foo(signed int n) {
+	  //   if (n < 100)
+	  //     alloca(n);
+	  //   ...
+	  // }
 	  if (cast_from_signed_p (len, invalid_casted_type))
-	    return alloca_type_and_limit (ALLOCA_CAST_FROM_SIGNED);
+	    {
+	      // Unfortunately this also triggers:
+	      //
+	      // __SIZE_TYPE__ n = (__SIZE_TYPE__)blah;
+	      // if (n < 100)
+	      //   alloca(n);
+	      //
+	      // ...which is clearly bounded.  So, double check that
+	      // the paths leading up to the size definitely don't
+	      // have a bound.
+	      tentative_cast_from_signed = true;
+	    }
 	}
       // No easily determined range and try other things.
     }
@@ -371,10 +390,12 @@  alloca_call_type (gimple *stmt, bool is_vla, tree *invalid_casted_type)
 	  ret = alloca_call_type_by_arg (len, len_casted,
 					 EDGE_PRED (bb, ix), max_size);
 	  if (ret.type != ALLOCA_OK)
-	    return ret;
+	    break;
 	}
     }
 
+  if (tentative_cast_from_signed && ret.type != ALLOCA_OK)
+    return alloca_type_and_limit (ALLOCA_CAST_FROM_SIGNED);
   return ret;
 }
 
diff --git a/gcc/testsuite/gcc.dg/Walloca-13.c b/gcc/testsuite/gcc.dg/Walloca-13.c
new file mode 100644
index 0000000..f9bdcef
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/Walloca-13.c
@@ -0,0 +1,11 @@ 
+/* { dg-do compile } */
+/* { dg-options "-Walloca-larger-than=100 -O2" } */
+
+void f (void*);
+
+void g (int *p, int *q)
+{
+  __SIZE_TYPE__ n = (__SIZE_TYPE__)(p - q);
+  if (n < 100)
+    f (__builtin_alloca (n));
+}