Message ID | 8a5496d5-1811-a89d-ea75-b15c68cb6ddf@redhat.com |
---|---|
State | New |
Headers | show |
On Thu, Jan 19, 2017 at 1:17 PM, Aldy Hernandez <aldyh@redhat.com> wrote: > In the attached testcase, we have a clearly bounded case of alloca which is > being incorrectly reported: > > void g (int *p, int *q) > { > size_t n = (size_t)(p - q); > > if (n < 10) > f (__builtin_alloca (n)); > } > > The problem is that VRP gives us an anti-range for `n' which may be out of > range: > > # RANGE ~[2305843009213693952, 16140901064495857663] > n_9 = (long unsigned int) _4; > > We do a less than stellar job with casts and VR_ANTI_RANGE's, mostly because > we're trying various heuristics to make up for the fact that we have crappy > range info from VRP. More specifically, we're basically punting on an > VR_ANTI_RANGE and ignoring that the casted result (n_9) has a bound later > on. > > Luckily, we already have code to check simple ranges coming into the alloca > by looking into all the basic blocks feeding it. The attached patch delays > the final decision on anti ranges until we have examined the basic blocks > and determined for that we are definitely out of range. > > I expect all this to disappear with Andrew's upcoming range info overhaul. > > OK for trunk? I _really_ wonder why all the range consuming warnings are not emitted from VRP itself (like we do for -Warray-bounds). There we'd still see a range for the argument derived from the if () rather than needing to do our own mini-VRP from the needessly "incomplete" range-info on SSA vars. Richard. >
On 01/19/2017 07:45 AM, Richard Biener wrote: > On Thu, Jan 19, 2017 at 1:17 PM, Aldy Hernandez <aldyh@redhat.com> wrote: >> In the attached testcase, we have a clearly bounded case of alloca which is >> being incorrectly reported: >> >> void g (int *p, int *q) >> { >> size_t n = (size_t)(p - q); >> >> if (n < 10) >> f (__builtin_alloca (n)); >> } >> >> The problem is that VRP gives us an anti-range for `n' which may be out of >> range: >> >> # RANGE ~[2305843009213693952, 16140901064495857663] >> n_9 = (long unsigned int) _4; >> >> We do a less than stellar job with casts and VR_ANTI_RANGE's, mostly because >> we're trying various heuristics to make up for the fact that we have crappy >> range info from VRP. More specifically, we're basically punting on an >> VR_ANTI_RANGE and ignoring that the casted result (n_9) has a bound later >> on. >> >> Luckily, we already have code to check simple ranges coming into the alloca >> by looking into all the basic blocks feeding it. The attached patch delays >> the final decision on anti ranges until we have examined the basic blocks >> and determined for that we are definitely out of range. >> >> I expect all this to disappear with Andrew's upcoming range info overhaul. >> >> OK for trunk? > > I _really_ wonder why all the range consuming warnings are not emitted > from VRP itself (like we do for -Warray-bounds). There we'd still see > a range for the argument derived from the if () rather than needing to > do our own mini-VRP from the needessly "incomplete" range-info on > SSA vars. > > Richard. My original implementation was within VRP itself, using the ASSERT_EXPR's, but Jeff suggested doing it in it's own pass. I can't remember the details (I could look it up), but I think it had to do with getting better information post-inlining about the call to alloca (not sure). Also, with Andrew's GCC8 work on ranges, it seemed pointless to bend over backwards handling ASSERT_EXPR's etc. And IIRC, implementing it in VRP was ugly. Perhaps Jeff remembers the details better. Aldy
On Thu, Jan 19, 2017 at 2:02 PM, Aldy Hernandez <aldyh@redhat.com> wrote: > On 01/19/2017 07:45 AM, Richard Biener wrote: >> >> On Thu, Jan 19, 2017 at 1:17 PM, Aldy Hernandez <aldyh@redhat.com> wrote: >>> >>> In the attached testcase, we have a clearly bounded case of alloca which >>> is >>> being incorrectly reported: >>> >>> void g (int *p, int *q) >>> { >>> size_t n = (size_t)(p - q); >>> >>> if (n < 10) >>> f (__builtin_alloca (n)); >>> } >>> >>> The problem is that VRP gives us an anti-range for `n' which may be out >>> of >>> range: >>> >>> # RANGE ~[2305843009213693952, 16140901064495857663] >>> n_9 = (long unsigned int) _4; >>> >>> We do a less than stellar job with casts and VR_ANTI_RANGE's, mostly >>> because >>> we're trying various heuristics to make up for the fact that we have >>> crappy >>> range info from VRP. More specifically, we're basically punting on an >>> VR_ANTI_RANGE and ignoring that the casted result (n_9) has a bound later >>> on. >>> >>> Luckily, we already have code to check simple ranges coming into the >>> alloca >>> by looking into all the basic blocks feeding it. The attached patch >>> delays >>> the final decision on anti ranges until we have examined the basic blocks >>> and determined for that we are definitely out of range. >>> >>> I expect all this to disappear with Andrew's upcoming range info >>> overhaul. >>> >>> OK for trunk? >> >> >> I _really_ wonder why all the range consuming warnings are not emitted >> from VRP itself (like we do for -Warray-bounds). There we'd still see >> a range for the argument derived from the if () rather than needing to >> do our own mini-VRP from the needessly "incomplete" range-info on >> SSA vars. >> >> Richard. > > > My original implementation was within VRP itself, using the ASSERT_EXPR's, > but Jeff suggested doing it in it's own pass. I can't remember the details > (I could look it up), but I think it had to do with getting better > information post-inlining about the call to alloca (not sure). > > Also, with Andrew's GCC8 work on ranges, it seemed pointless to bend over > backwards handling ASSERT_EXPR's etc. And IIRC, implementing it in VRP was > ugly. You don't need to handle ASSERT_EXPRs you simply use the VRP lattice after propagation (or during if you hook into EVRP of course). And VRP runs after inlining. Richard. > Perhaps Jeff remembers the details better. > > Aldy
On 01/19/2017 05:45 AM, Richard Biener wrote: > On Thu, Jan 19, 2017 at 1:17 PM, Aldy Hernandez <aldyh@redhat.com> wrote: >> In the attached testcase, we have a clearly bounded case of alloca which is >> being incorrectly reported: >> >> void g (int *p, int *q) >> { >> size_t n = (size_t)(p - q); >> >> if (n < 10) >> f (__builtin_alloca (n)); >> } >> >> The problem is that VRP gives us an anti-range for `n' which may be out of >> range: >> >> # RANGE ~[2305843009213693952, 16140901064495857663] >> n_9 = (long unsigned int) _4; >> >> We do a less than stellar job with casts and VR_ANTI_RANGE's, mostly because >> we're trying various heuristics to make up for the fact that we have crappy >> range info from VRP. More specifically, we're basically punting on an >> VR_ANTI_RANGE and ignoring that the casted result (n_9) has a bound later >> on. >> >> Luckily, we already have code to check simple ranges coming into the alloca >> by looking into all the basic blocks feeding it. The attached patch delays >> the final decision on anti ranges until we have examined the basic blocks >> and determined for that we are definitely out of range. >> >> I expect all this to disappear with Andrew's upcoming range info overhaul. >> >> OK for trunk? > > I _really_ wonder why all the range consuming warnings are not emitted > from VRP itself (like we do for -Warray-bounds). There we'd still see > a range for the argument derived from the if () rather than needing to > do our own mini-VRP from the needessly "incomplete" range-info on > SSA vars. Can you explain why the range info is only available in VRP and not outside, via the get_range_info() API? It sounds as though the API shouldn't be relied on (it was virtually unused before GCC 7). To answer your question, the gimple-ssa-sprintf pass that depends heavily on ranges would also benefit from having access to the data computed in the strlen pass that's not available outside it. The -Wstringop-overflow and -Walloc-size-larger-than warnings depend on both VRP and tree-object-size. I have been thinking about how to integrate these passes in GCC 8 to improve the overall quality. (By integrating I don't necessarily mean merging the code but rather having them share data or be able to make calls into one another.) I'd be very interested in your thoughts on this. Thanks Martin
On Thu, Jan 19, 2017 at 5:53 PM, Martin Sebor <msebor@gmail.com> wrote: > On 01/19/2017 05:45 AM, Richard Biener wrote: >> >> On Thu, Jan 19, 2017 at 1:17 PM, Aldy Hernandez <aldyh@redhat.com> wrote: >>> >>> In the attached testcase, we have a clearly bounded case of alloca which >>> is >>> being incorrectly reported: >>> >>> void g (int *p, int *q) >>> { >>> size_t n = (size_t)(p - q); >>> >>> if (n < 10) >>> f (__builtin_alloca (n)); >>> } >>> >>> The problem is that VRP gives us an anti-range for `n' which may be out >>> of >>> range: >>> >>> # RANGE ~[2305843009213693952, 16140901064495857663] >>> n_9 = (long unsigned int) _4; >>> >>> We do a less than stellar job with casts and VR_ANTI_RANGE's, mostly >>> because >>> we're trying various heuristics to make up for the fact that we have >>> crappy >>> range info from VRP. More specifically, we're basically punting on an >>> VR_ANTI_RANGE and ignoring that the casted result (n_9) has a bound later >>> on. >>> >>> Luckily, we already have code to check simple ranges coming into the >>> alloca >>> by looking into all the basic blocks feeding it. The attached patch >>> delays >>> the final decision on anti ranges until we have examined the basic blocks >>> and determined for that we are definitely out of range. >>> >>> I expect all this to disappear with Andrew's upcoming range info >>> overhaul. >>> >>> OK for trunk? >> >> >> I _really_ wonder why all the range consuming warnings are not emitted >> from VRP itself (like we do for -Warray-bounds). There we'd still see >> a range for the argument derived from the if () rather than needing to >> do our own mini-VRP from the needessly "incomplete" range-info on >> SSA vars. > > > Can you explain why the range info is only available in VRP and > not outside, via the get_range_info() API? It sounds as though > the API shouldn't be relied on (it was virtually unused before > GCC 7). It's very simple. Look at the testcase from above void g (int *p, int *q) { size_t n = (size_t)(p - q); if (n < 10) f (__builtin_alloca (n)); } The IL outside of VRP is <bb 2> [100.00%]: p.0_1 = (long int) p_7(D); q.1_2 = (long int) q_8(D); _3 = p.0_1 - q.1_2; _4 = _3 /[ex] 4; n_9 = (size_t) _4; if (n_9 <= 9) goto <bb 3>; [36.64%] else goto <bb 4>; [63.36%] <bb 3> [36.64%]: _5 = __builtin_alloca (n_9); f (_5); so there is no SSA name we can tack a range to covering the n_9 <= 9 condition that dominates __builtin_alloca. Inside VRP we have <bb 2> [100.00%]: p.0_1 = (long int) p_7(D); q.1_2 = (long int) q_8(D); _3 = p.0_1 - q.1_2; _4 = _3 /[ex] 4; n_9 = (size_t) _4; if (n_9 <= 9) goto <bb 3>; [36.64%] else goto <bb 4>; [63.36%] <bb 3> [36.64%]: n_13 = ASSERT_EXPR <n_9, n_9 <= 9>; _5 = __builtin_alloca (n_13); f (_5); and viola - now the alloca call uses n_13 which is defined at a point dominated by if (n_9 <= 9) and thus it has an improved range: n_13: [0, 9] EQUIVALENCES: { n_9 } (1 elements) When in EVRP you get similar behavior (well, there are some missed cases I have patches queued for for GCC 8), but instead of modifying the IL EVRP just temporarily sets the above range when it processes BBs dominated by the condition. So while for VRP you can put the warning code after propagation for EVRP you'd have to do warning processing during propagation itself (and EVRP doesn't iterate). > To answer your question, the gimple-ssa-sprintf pass that depends > heavily on ranges would also benefit from having access to the data > computed in the strlen pass that's not available outside it. > > The -Wstringop-overflow and -Walloc-size-larger-than warnings depend > on both VRP and tree-object-size. > > I have been thinking about how to integrate these passes in GCC 8 > to improve the overall quality. (By integrating I don't necessarily > mean merging the code but rather having them share data or be able > to make calls into one another.) > > I'd be very interested in your thoughts on this. Well, my thought is that we should not have N SSA propagation and M "DOM" propagation passes but one of each kind. My thought is also that object-size and strlen are of either kind. So ideally DOM and EVRP would merge and CCP, copyprop and VRP would merge. It's probably not possible (or wise) to merge their lattices (maybe to some extent). Those two passes could be used by a static analysis phase prior to any optimization to emit warnings (just computing the lattice, not doing any IL modification). Richard. > Thanks > Martin
On 01/20/2017 03:17 AM, Richard Biener wrote: > On Thu, Jan 19, 2017 at 5:53 PM, Martin Sebor <msebor@gmail.com> wrote: >> On 01/19/2017 05:45 AM, Richard Biener wrote: >>> >>> On Thu, Jan 19, 2017 at 1:17 PM, Aldy Hernandez <aldyh@redhat.com> wrote: >>>> >>>> In the attached testcase, we have a clearly bounded case of alloca which >>>> is >>>> being incorrectly reported: >>>> >>>> void g (int *p, int *q) >>>> { >>>> size_t n = (size_t)(p - q); >>>> >>>> if (n < 10) >>>> f (__builtin_alloca (n)); >>>> } >>>> >>>> The problem is that VRP gives us an anti-range for `n' which may be out >>>> of >>>> range: >>>> >>>> # RANGE ~[2305843009213693952, 16140901064495857663] >>>> n_9 = (long unsigned int) _4; >>>> >>>> We do a less than stellar job with casts and VR_ANTI_RANGE's, mostly >>>> because >>>> we're trying various heuristics to make up for the fact that we have >>>> crappy >>>> range info from VRP. More specifically, we're basically punting on an >>>> VR_ANTI_RANGE and ignoring that the casted result (n_9) has a bound later >>>> on. >>>> >>>> Luckily, we already have code to check simple ranges coming into the >>>> alloca >>>> by looking into all the basic blocks feeding it. The attached patch >>>> delays >>>> the final decision on anti ranges until we have examined the basic blocks >>>> and determined for that we are definitely out of range. >>>> >>>> I expect all this to disappear with Andrew's upcoming range info >>>> overhaul. >>>> >>>> OK for trunk? >>> >>> >>> I _really_ wonder why all the range consuming warnings are not emitted >>> from VRP itself (like we do for -Warray-bounds). There we'd still see >>> a range for the argument derived from the if () rather than needing to >>> do our own mini-VRP from the needessly "incomplete" range-info on >>> SSA vars. >> >> >> Can you explain why the range info is only available in VRP and >> not outside, via the get_range_info() API? It sounds as though >> the API shouldn't be relied on (it was virtually unused before >> GCC 7). > > It's very simple. Look at the testcase from above > > void g (int *p, int *q) > { > size_t n = (size_t)(p - q); > > if (n < 10) > f (__builtin_alloca (n)); > } > > The IL outside of VRP is > > <bb 2> [100.00%]: > p.0_1 = (long int) p_7(D); > q.1_2 = (long int) q_8(D); > _3 = p.0_1 - q.1_2; > _4 = _3 /[ex] 4; > n_9 = (size_t) _4; > if (n_9 <= 9) > goto <bb 3>; [36.64%] > else > goto <bb 4>; [63.36%] > > <bb 3> [36.64%]: > _5 = __builtin_alloca (n_9); > f (_5); > > so there is no SSA name we can tack a range to covering the n_9 <= 9 > condition that dominates __builtin_alloca. Inside VRP we have > > <bb 2> [100.00%]: > p.0_1 = (long int) p_7(D); > q.1_2 = (long int) q_8(D); > _3 = p.0_1 - q.1_2; > _4 = _3 /[ex] 4; > n_9 = (size_t) _4; > if (n_9 <= 9) > goto <bb 3>; [36.64%] > else > goto <bb 4>; [63.36%] > > <bb 3> [36.64%]: > n_13 = ASSERT_EXPR <n_9, n_9 <= 9>; > _5 = __builtin_alloca (n_13); > f (_5); > > and viola - now the alloca call uses n_13 which is defined at a point > dominated by if (n_9 <= 9) and thus it has an improved range: > > n_13: [0, 9] EQUIVALENCES: { n_9 } (1 elements) Yes, I remember getting much better range information when doing things within VRP2, however since VRP2 runs after jump threading sometimes I'd get more false positives. Tell you what, for GCC8 I volunteer to rewrite gimple-ssa-warn-alloca.c within VRP (or using whatever range information we decide upon or is ready for GCC8). In the meantime, can we fix the PR with my patch (or a similar approach)? Aldy
On Fri, Jan 20, 2017 at 11:52 AM, Aldy Hernandez <aldyh@redhat.com> wrote: > On 01/20/2017 03:17 AM, Richard Biener wrote: >> >> On Thu, Jan 19, 2017 at 5:53 PM, Martin Sebor <msebor@gmail.com> wrote: >>> >>> On 01/19/2017 05:45 AM, Richard Biener wrote: >>>> >>>> >>>> On Thu, Jan 19, 2017 at 1:17 PM, Aldy Hernandez <aldyh@redhat.com> >>>> wrote: >>>>> >>>>> >>>>> In the attached testcase, we have a clearly bounded case of alloca >>>>> which >>>>> is >>>>> being incorrectly reported: >>>>> >>>>> void g (int *p, int *q) >>>>> { >>>>> size_t n = (size_t)(p - q); >>>>> >>>>> if (n < 10) >>>>> f (__builtin_alloca (n)); >>>>> } >>>>> >>>>> The problem is that VRP gives us an anti-range for `n' which may be out >>>>> of >>>>> range: >>>>> >>>>> # RANGE ~[2305843009213693952, 16140901064495857663] >>>>> n_9 = (long unsigned int) _4; >>>>> >>>>> We do a less than stellar job with casts and VR_ANTI_RANGE's, mostly >>>>> because >>>>> we're trying various heuristics to make up for the fact that we have >>>>> crappy >>>>> range info from VRP. More specifically, we're basically punting on an >>>>> VR_ANTI_RANGE and ignoring that the casted result (n_9) has a bound >>>>> later >>>>> on. >>>>> >>>>> Luckily, we already have code to check simple ranges coming into the >>>>> alloca >>>>> by looking into all the basic blocks feeding it. The attached patch >>>>> delays >>>>> the final decision on anti ranges until we have examined the basic >>>>> blocks >>>>> and determined for that we are definitely out of range. >>>>> >>>>> I expect all this to disappear with Andrew's upcoming range info >>>>> overhaul. >>>>> >>>>> OK for trunk? >>>> >>>> >>>> >>>> I _really_ wonder why all the range consuming warnings are not emitted >>>> from VRP itself (like we do for -Warray-bounds). There we'd still see >>>> a range for the argument derived from the if () rather than needing to >>>> do our own mini-VRP from the needessly "incomplete" range-info on >>>> SSA vars. >>> >>> >>> >>> Can you explain why the range info is only available in VRP and >>> not outside, via the get_range_info() API? It sounds as though >>> the API shouldn't be relied on (it was virtually unused before >>> GCC 7). >> >> >> It's very simple. Look at the testcase from above >> >> void g (int *p, int *q) >> { >> size_t n = (size_t)(p - q); >> >> if (n < 10) >> f (__builtin_alloca (n)); >> } >> >> The IL outside of VRP is >> >> <bb 2> [100.00%]: >> p.0_1 = (long int) p_7(D); >> q.1_2 = (long int) q_8(D); >> _3 = p.0_1 - q.1_2; >> _4 = _3 /[ex] 4; >> n_9 = (size_t) _4; >> if (n_9 <= 9) >> goto <bb 3>; [36.64%] >> else >> goto <bb 4>; [63.36%] >> >> <bb 3> [36.64%]: >> _5 = __builtin_alloca (n_9); >> f (_5); >> >> so there is no SSA name we can tack a range to covering the n_9 <= 9 >> condition that dominates __builtin_alloca. Inside VRP we have >> >> <bb 2> [100.00%]: >> p.0_1 = (long int) p_7(D); >> q.1_2 = (long int) q_8(D); >> _3 = p.0_1 - q.1_2; >> _4 = _3 /[ex] 4; >> n_9 = (size_t) _4; >> if (n_9 <= 9) >> goto <bb 3>; [36.64%] >> else >> goto <bb 4>; [63.36%] >> >> <bb 3> [36.64%]: >> n_13 = ASSERT_EXPR <n_9, n_9 <= 9>; >> _5 = __builtin_alloca (n_13); >> f (_5); >> >> and viola - now the alloca call uses n_13 which is defined at a point >> dominated by if (n_9 <= 9) and thus it has an improved range: >> >> n_13: [0, 9] EQUIVALENCES: { n_9 } (1 elements) > > > Yes, I remember getting much better range information when doing things > within VRP2, however since VRP2 runs after jump threading sometimes I'd get > more false positives. > > Tell you what, for GCC8 I volunteer to rewrite gimple-ssa-warn-alloca.c > within VRP (or using whatever range information we decide upon or is ready > for GCC8). In the meantime, can we fix the PR with my patch (or a similar > approach)? Sure. Richard. > Aldy
On 01/20/2017 01:17 AM, Richard Biener wrote: > On Thu, Jan 19, 2017 at 5:53 PM, Martin Sebor <msebor@gmail.com> wrote: >> On 01/19/2017 05:45 AM, Richard Biener wrote: >>> >>> On Thu, Jan 19, 2017 at 1:17 PM, Aldy Hernandez <aldyh@redhat.com> wrote: >>>> >>>> In the attached testcase, we have a clearly bounded case of alloca which >>>> is >>>> being incorrectly reported: >>>> >>>> void g (int *p, int *q) >>>> { >>>> size_t n = (size_t)(p - q); >>>> >>>> if (n < 10) >>>> f (__builtin_alloca (n)); >>>> } >>>> >>>> The problem is that VRP gives us an anti-range for `n' which may be out >>>> of >>>> range: >>>> >>>> # RANGE ~[2305843009213693952, 16140901064495857663] >>>> n_9 = (long unsigned int) _4; >>>> >>>> We do a less than stellar job with casts and VR_ANTI_RANGE's, mostly >>>> because >>>> we're trying various heuristics to make up for the fact that we have >>>> crappy >>>> range info from VRP. More specifically, we're basically punting on an >>>> VR_ANTI_RANGE and ignoring that the casted result (n_9) has a bound later >>>> on. >>>> >>>> Luckily, we already have code to check simple ranges coming into the >>>> alloca >>>> by looking into all the basic blocks feeding it. The attached patch >>>> delays >>>> the final decision on anti ranges until we have examined the basic blocks >>>> and determined for that we are definitely out of range. >>>> >>>> I expect all this to disappear with Andrew's upcoming range info >>>> overhaul. >>>> >>>> OK for trunk? >>> >>> >>> I _really_ wonder why all the range consuming warnings are not emitted >>> from VRP itself (like we do for -Warray-bounds). There we'd still see >>> a range for the argument derived from the if () rather than needing to >>> do our own mini-VRP from the needessly "incomplete" range-info on >>> SSA vars. >> >> >> Can you explain why the range info is only available in VRP and >> not outside, via the get_range_info() API? It sounds as though >> the API shouldn't be relied on (it was virtually unused before >> GCC 7). > > It's very simple. Look at the testcase from above Thanks for the detailed answer. A few more questions below. > > void g (int *p, int *q) > { > size_t n = (size_t)(p - q); > > if (n < 10) > f (__builtin_alloca (n)); > } > > The IL outside of VRP is > > <bb 2> [100.00%]: > p.0_1 = (long int) p_7(D); > q.1_2 = (long int) q_8(D); > _3 = p.0_1 - q.1_2; > _4 = _3 /[ex] 4; > n_9 = (size_t) _4; > if (n_9 <= 9) > goto <bb 3>; [36.64%] > else > goto <bb 4>; [63.36%] > > <bb 3> [36.64%]: > _5 = __builtin_alloca (n_9); > f (_5); > > so there is no SSA name we can tack a range to covering the n_9 <= 9 > condition that dominates __builtin_alloca. This may be a naive question but why is it not possible to create such an SSA name? Having the get_range_info() API depend on this subtlety makes its clients quite unreliable. It may be acceptable for optimization but when it's visible to users in the form of false positives or negatives it's a source of bug reports. > Inside VRP we have > > <bb 2> [100.00%]: > p.0_1 = (long int) p_7(D); > q.1_2 = (long int) q_8(D); > _3 = p.0_1 - q.1_2; > _4 = _3 /[ex] 4; > n_9 = (size_t) _4; > if (n_9 <= 9) > goto <bb 3>; [36.64%] > else > goto <bb 4>; [63.36%] > > <bb 3> [36.64%]: > n_13 = ASSERT_EXPR <n_9, n_9 <= 9>; > _5 = __builtin_alloca (n_13); > f (_5); > > and viola - now the alloca call uses n_13 which is defined at a point > dominated by if (n_9 <= 9) and thus it has an improved range: > > n_13: [0, 9] EQUIVALENCES: { n_9 } (1 elements) Yes, I see that. But when I change size_t to unsigned int (in LP64) like so: void g (int *p, int *q) { unsigned n = (unsigned)(p - q); if (n < 10) f (__builtin_alloca (n)); } -Walloca-larger-than=100 still complains: warning: argument to ‘alloca’ may be too large note: limit is 100 bytes, but argument may be as large as 4294967295 and the VRP dump shows _5: [0, 4294967295] _14: [0, 4294967295] ... _3 = p.0_1 - q.1_2; _4 = _3 /[ex] 4; n_10 = (unsigned int) _4; if (n_10 <= 9) goto <bb 3>; [36.64%] else goto <bb 4>; [63.36%] <bb 3> [36.64%]: _14 = _4 & 4294967295; _5 = (long unsigned int) _14; _6 = __builtin_alloca (_5); so it seems that even in VRP itself the range information isn't quite good enough to avoid false positives. (Perhaps that's one the missed cases you mention below?) > When in EVRP you get similar behavior (well, there are some missed > cases I have patches queued for for GCC 8), but instead of modifying > the IL EVRP just temporarily sets the above range when it processes > BBs dominated by the condition. > > So while for VRP you can put the warning code after propagation for > EVRP you'd have to do warning processing during propagation itself > (and EVRP doesn't iterate). > >> To answer your question, the gimple-ssa-sprintf pass that depends >> heavily on ranges would also benefit from having access to the data >> computed in the strlen pass that's not available outside it. >> >> The -Wstringop-overflow and -Walloc-size-larger-than warnings depend >> on both VRP and tree-object-size. >> >> I have been thinking about how to integrate these passes in GCC 8 >> to improve the overall quality. (By integrating I don't necessarily >> mean merging the code but rather having them share data or be able >> to make calls into one another.) >> >> I'd be very interested in your thoughts on this. > > Well, my thought is that we should not have N SSA propagation and > M "DOM" propagation passes but one of each kind. My thought is > also that object-size and strlen are of either kind. > > So ideally DOM and EVRP would merge and CCP, copyprop and VRP > would merge. It's probably not possible (or wise) to merge their lattices > (maybe to some extent). > > Those two passes could be used by a static analysis phase prior to > any optimization to emit warnings (just computing the lattice, not doing > any IL modification). Thanks. I'll ponder that. Martin
On 01/19/2017 05:45 AM, Richard Biener wrote: > On Thu, Jan 19, 2017 at 1:17 PM, Aldy Hernandez <aldyh@redhat.com> wrote: >> In the attached testcase, we have a clearly bounded case of alloca which is >> being incorrectly reported: >> >> void g (int *p, int *q) >> { >> size_t n = (size_t)(p - q); >> >> if (n < 10) >> f (__builtin_alloca (n)); >> } >> >> The problem is that VRP gives us an anti-range for `n' which may be out of >> range: >> >> # RANGE ~[2305843009213693952, 16140901064495857663] >> n_9 = (long unsigned int) _4; >> >> We do a less than stellar job with casts and VR_ANTI_RANGE's, mostly because >> we're trying various heuristics to make up for the fact that we have crappy >> range info from VRP. More specifically, we're basically punting on an >> VR_ANTI_RANGE and ignoring that the casted result (n_9) has a bound later >> on. >> >> Luckily, we already have code to check simple ranges coming into the alloca >> by looking into all the basic blocks feeding it. The attached patch delays >> the final decision on anti ranges until we have examined the basic blocks >> and determined for that we are definitely out of range. >> >> I expect all this to disappear with Andrew's upcoming range info overhaul. >> >> OK for trunk? > > I _really_ wonder why all the range consuming warnings are not emitted > from VRP itself (like we do for -Warray-bounds). There we'd still see > a range for the argument derived from the if () rather than needing to > do our own mini-VRP from the needessly "incomplete" range-info on > SSA vars. Blame me. My thinking was that the warnings themselves are just part of what we want to be doing. We ought to be warning *and* doing erroneous path isolation when we find these out of bounds situations. I felt that VRP was already complex and big enough that adding the additional code to do the path specific bounds checking and path isolation wasn't appropriate. The downside is that we lose a lot of valuable information when we leave VRP -- essentially what's left attached to a given SSA_NAME is a conservative range that applies in all contexts where the SSA_NAME is used. And that is usually too imprecise to give the warnings we want. jeff
On 01/20/2017 09:37 AM, Martin Sebor wrote: > On 01/20/2017 01:17 AM, Richard Biener wrote: >> On Thu, Jan 19, 2017 at 5:53 PM, Martin Sebor <msebor@gmail.com> wrote: >>> On 01/19/2017 05:45 AM, Richard Biener wrote: >>>> >>>> On Thu, Jan 19, 2017 at 1:17 PM, Aldy Hernandez <aldyh@redhat.com> >>>> wrote: >>>>> >>>>> In the attached testcase, we have a clearly bounded case of alloca >>>>> which >>>>> is >>>>> being incorrectly reported: >>>>> >>>>> void g (int *p, int *q) >>>>> { >>>>> size_t n = (size_t)(p - q); >>>>> >>>>> if (n < 10) >>>>> f (__builtin_alloca (n)); >>>>> } >>>>> >>>>> The problem is that VRP gives us an anti-range for `n' which may be >>>>> out >>>>> of >>>>> range: >>>>> >>>>> # RANGE ~[2305843009213693952, 16140901064495857663] >>>>> n_9 = (long unsigned int) _4; >>>>> >>>>> We do a less than stellar job with casts and VR_ANTI_RANGE's, mostly >>>>> because >>>>> we're trying various heuristics to make up for the fact that we have >>>>> crappy >>>>> range info from VRP. More specifically, we're basically punting on an >>>>> VR_ANTI_RANGE and ignoring that the casted result (n_9) has a bound >>>>> later >>>>> on. >>>>> >>>>> Luckily, we already have code to check simple ranges coming into the >>>>> alloca >>>>> by looking into all the basic blocks feeding it. The attached patch >>>>> delays >>>>> the final decision on anti ranges until we have examined the basic >>>>> blocks >>>>> and determined for that we are definitely out of range. >>>>> >>>>> I expect all this to disappear with Andrew's upcoming range info >>>>> overhaul. >>>>> >>>>> OK for trunk? >>>> >>>> >>>> I _really_ wonder why all the range consuming warnings are not emitted >>>> from VRP itself (like we do for -Warray-bounds). There we'd still see >>>> a range for the argument derived from the if () rather than needing to >>>> do our own mini-VRP from the needessly "incomplete" range-info on >>>> SSA vars. >>> >>> >>> Can you explain why the range info is only available in VRP and >>> not outside, via the get_range_info() API? It sounds as though >>> the API shouldn't be relied on (it was virtually unused before >>> GCC 7). >> >> It's very simple. Look at the testcase from above > > Thanks for the detailed answer. A few more questions below. > >> >> void g (int *p, int *q) >> { >> size_t n = (size_t)(p - q); >> >> if (n < 10) >> f (__builtin_alloca (n)); >> } >> >> The IL outside of VRP is >> >> <bb 2> [100.00%]: >> p.0_1 = (long int) p_7(D); >> q.1_2 = (long int) q_8(D); >> _3 = p.0_1 - q.1_2; >> _4 = _3 /[ex] 4; >> n_9 = (size_t) _4; >> if (n_9 <= 9) >> goto <bb 3>; [36.64%] >> else >> goto <bb 4>; [63.36%] >> >> <bb 3> [36.64%]: >> _5 = __builtin_alloca (n_9); >> f (_5); >> >> so there is no SSA name we can tack a range to covering the n_9 <= 9 >> condition that dominates __builtin_alloca. > > This may be a naive question but why is it not possible to create > such an SSA name? Time and space complexity. To get the range information in this case we have to create new SSA_NAMEs and PHI nodes to merge them at BB3. Even if you create them, passes post VRP are going to blow them away :-) This is the problem Andrew is working on. In simplest terms the ability to query the range of an object on a path through the CFG. So we could ask for the range of n_9 on the edge 2->3 (<=9) or we could ask for the range of n_9 on the edge 2->4 which would be > 9. Jeff
On Fri, Jan 20, 2017 at 12:35:32PM -0700, Jeff Law wrote: > > This may be a naive question but why is it not possible to create > > such an SSA name? > Time and space complexity. To get the range information in this case we > have to create new SSA_NAMEs and PHI nodes to merge them at BB3. Not just that. Also lots of optimizations rely on the extra SSA_NAME copies to be removed to be able to optimize well. If you'd have lots of SSA_NAMEs with the same value, just differing with their range info, then unless passes perform SCCVN or something similar, they wouldn't be able to figure equality etc. And if in SCCVN those SSA_NAMEs with different range info, but same value, compare the same, then they are going to be replaced just by one of them anyway. > Even if you create them, passes post VRP are going to blow them away :-) Exactly. Jakub
On 01/20/2017 01:17 AM, Richard Biener wrote: > On Thu, Jan 19, 2017 at 5:53 PM, Martin Sebor <msebor@gmail.com> wrote: >> On 01/19/2017 05:45 AM, Richard Biener wrote: >>> >>> On Thu, Jan 19, 2017 at 1:17 PM, Aldy Hernandez <aldyh@redhat.com> wrote: >>>> >>>> In the attached testcase, we have a clearly bounded case of alloca which >>>> is >>>> being incorrectly reported: >>>> >>>> void g (int *p, int *q) >>>> { >>>> size_t n = (size_t)(p - q); >>>> >>>> if (n < 10) >>>> f (__builtin_alloca (n)); >>>> } >>>> >>>> The problem is that VRP gives us an anti-range for `n' which may be out >>>> of >>>> range: >>>> >>>> # RANGE ~[2305843009213693952, 16140901064495857663] >>>> n_9 = (long unsigned int) _4; >>>> >>>> We do a less than stellar job with casts and VR_ANTI_RANGE's, mostly >>>> because >>>> we're trying various heuristics to make up for the fact that we have >>>> crappy >>>> range info from VRP. More specifically, we're basically punting on an >>>> VR_ANTI_RANGE and ignoring that the casted result (n_9) has a bound later >>>> on. >>>> >>>> Luckily, we already have code to check simple ranges coming into the >>>> alloca >>>> by looking into all the basic blocks feeding it. The attached patch >>>> delays >>>> the final decision on anti ranges until we have examined the basic blocks >>>> and determined for that we are definitely out of range. >>>> >>>> I expect all this to disappear with Andrew's upcoming range info >>>> overhaul. >>>> >>>> OK for trunk? >>> >>> >>> I _really_ wonder why all the range consuming warnings are not emitted >>> from VRP itself (like we do for -Warray-bounds). There we'd still see >>> a range for the argument derived from the if () rather than needing to >>> do our own mini-VRP from the needessly "incomplete" range-info on >>> SSA vars. >> >> >> Can you explain why the range info is only available in VRP and >> not outside, via the get_range_info() API? It sounds as though >> the API shouldn't be relied on (it was virtually unused before >> GCC 7). > > It's very simple. Look at the testcase from above > > void g (int *p, int *q) > { > size_t n = (size_t)(p - q); > > if (n < 10) > f (__builtin_alloca (n)); > } > > The IL outside of VRP is > > <bb 2> [100.00%]: > p.0_1 = (long int) p_7(D); > q.1_2 = (long int) q_8(D); > _3 = p.0_1 - q.1_2; > _4 = _3 /[ex] 4; > n_9 = (size_t) _4; > if (n_9 <= 9) > goto <bb 3>; [36.64%] > else > goto <bb 4>; [63.36%] > > <bb 3> [36.64%]: > _5 = __builtin_alloca (n_9); > f (_5); > > so there is no SSA name we can tack a range to covering the n_9 <= 9 > condition that dominates __builtin_alloca. Inside VRP we have > > <bb 2> [100.00%]: > p.0_1 = (long int) p_7(D); > q.1_2 = (long int) q_8(D); > _3 = p.0_1 - q.1_2; > _4 = _3 /[ex] 4; > n_9 = (size_t) _4; > if (n_9 <= 9) > goto <bb 3>; [36.64%] > else > goto <bb 4>; [63.36%] > > <bb 3> [36.64%]: > n_13 = ASSERT_EXPR <n_9, n_9 <= 9>; > _5 = __builtin_alloca (n_13); > f (_5); > > and viola - now the alloca call uses n_13 which is defined at a point > dominated by if (n_9 <= 9) and thus it has an improved range: > > n_13: [0, 9] EQUIVALENCES: { n_9 } (1 elements) > > When in EVRP you get similar behavior (well, there are some missed > cases I have patches queued for for GCC 8), but instead of modifying > the IL EVRP just temporarily sets the above range when it processes > BBs dominated by the condition. > > So while for VRP you can put the warning code after propagation for > EVRP you'd have to do warning processing during propagation itself > (and EVRP doesn't iterate). > >> To answer your question, the gimple-ssa-sprintf pass that depends >> heavily on ranges would also benefit from having access to the data >> computed in the strlen pass that's not available outside it. >> >> The -Wstringop-overflow and -Walloc-size-larger-than warnings depend >> on both VRP and tree-object-size. >> >> I have been thinking about how to integrate these passes in GCC 8 >> to improve the overall quality. (By integrating I don't necessarily >> mean merging the code but rather having them share data or be able >> to make calls into one another.) >> >> I'd be very interested in your thoughts on this. > > Well, my thought is that we should not have N SSA propagation and > M "DOM" propagation passes but one of each kind. My thought is > also that object-size and strlen are of either kind. > > So ideally DOM and EVRP would merge and CCP, copyprop and VRP > would merge. It's probably not possible (or wise) to merge their lattices > (maybe to some extent) DOM's equivalency tables could be easily superceded by other implementations. It's just two datastructures. A trivial const/copy table and a stack to allow unwinding the state of the const/copy table. Throwing away the basic const/copy table and replacing it with something built by copyprop ought to be trivial. The big thing that would be left would be the scoped hash table for tracking redundant expressions. But I don't think that we'd necessarily have to rip that out to merge DOM with one of hte other passes. Hell we know DOM can fit well in any dominator based walk -- we used to do it as part of the into-ssa transformation. Jeff
On Fri, Jan 20, 2017 at 5:37 PM, Martin Sebor <msebor@gmail.com> wrote: > On 01/20/2017 01:17 AM, Richard Biener wrote: >> >> On Thu, Jan 19, 2017 at 5:53 PM, Martin Sebor <msebor@gmail.com> wrote: >>> >>> On 01/19/2017 05:45 AM, Richard Biener wrote: >>>> >>>> >>>> On Thu, Jan 19, 2017 at 1:17 PM, Aldy Hernandez <aldyh@redhat.com> >>>> wrote: >>>>> >>>>> >>>>> In the attached testcase, we have a clearly bounded case of alloca >>>>> which >>>>> is >>>>> being incorrectly reported: >>>>> >>>>> void g (int *p, int *q) >>>>> { >>>>> size_t n = (size_t)(p - q); >>>>> >>>>> if (n < 10) >>>>> f (__builtin_alloca (n)); >>>>> } >>>>> >>>>> The problem is that VRP gives us an anti-range for `n' which may be out >>>>> of >>>>> range: >>>>> >>>>> # RANGE ~[2305843009213693952, 16140901064495857663] >>>>> n_9 = (long unsigned int) _4; >>>>> >>>>> We do a less than stellar job with casts and VR_ANTI_RANGE's, mostly >>>>> because >>>>> we're trying various heuristics to make up for the fact that we have >>>>> crappy >>>>> range info from VRP. More specifically, we're basically punting on an >>>>> VR_ANTI_RANGE and ignoring that the casted result (n_9) has a bound >>>>> later >>>>> on. >>>>> >>>>> Luckily, we already have code to check simple ranges coming into the >>>>> alloca >>>>> by looking into all the basic blocks feeding it. The attached patch >>>>> delays >>>>> the final decision on anti ranges until we have examined the basic >>>>> blocks >>>>> and determined for that we are definitely out of range. >>>>> >>>>> I expect all this to disappear with Andrew's upcoming range info >>>>> overhaul. >>>>> >>>>> OK for trunk? >>>> >>>> >>>> >>>> I _really_ wonder why all the range consuming warnings are not emitted >>>> from VRP itself (like we do for -Warray-bounds). There we'd still see >>>> a range for the argument derived from the if () rather than needing to >>>> do our own mini-VRP from the needessly "incomplete" range-info on >>>> SSA vars. >>> >>> >>> >>> Can you explain why the range info is only available in VRP and >>> not outside, via the get_range_info() API? It sounds as though >>> the API shouldn't be relied on (it was virtually unused before >>> GCC 7). >> >> >> It's very simple. Look at the testcase from above > > > Thanks for the detailed answer. A few more questions below. > >> >> void g (int *p, int *q) >> { >> size_t n = (size_t)(p - q); >> >> if (n < 10) >> f (__builtin_alloca (n)); >> } >> >> The IL outside of VRP is >> >> <bb 2> [100.00%]: >> p.0_1 = (long int) p_7(D); >> q.1_2 = (long int) q_8(D); >> _3 = p.0_1 - q.1_2; >> _4 = _3 /[ex] 4; >> n_9 = (size_t) _4; >> if (n_9 <= 9) >> goto <bb 3>; [36.64%] >> else >> goto <bb 4>; [63.36%] >> >> <bb 3> [36.64%]: >> _5 = __builtin_alloca (n_9); >> f (_5); >> >> so there is no SSA name we can tack a range to covering the n_9 <= 9 >> condition that dominates __builtin_alloca. > > > This may be a naive question but why is it not possible to create > such an SSA name? You'd insert a copy (like VRP does with ASSERTs) but those will be propagated out quickly again. > Having the get_range_info() API depend on this subtlety makes its > clients quite unreliable. It may be acceptable for optimization > but when it's visible to users in the form of false positives or > negatives it's a source of bug reports. Yes, but that's the fault of the warning machinery not the range-info API which was designed for optimization. >> Inside VRP we have >> >> <bb 2> [100.00%]: >> p.0_1 = (long int) p_7(D); >> q.1_2 = (long int) q_8(D); >> _3 = p.0_1 - q.1_2; >> _4 = _3 /[ex] 4; >> n_9 = (size_t) _4; >> if (n_9 <= 9) >> goto <bb 3>; [36.64%] >> else >> goto <bb 4>; [63.36%] >> >> <bb 3> [36.64%]: >> n_13 = ASSERT_EXPR <n_9, n_9 <= 9>; >> _5 = __builtin_alloca (n_13); >> f (_5); >> >> and viola - now the alloca call uses n_13 which is defined at a point >> dominated by if (n_9 <= 9) and thus it has an improved range: >> >> n_13: [0, 9] EQUIVALENCES: { n_9 } (1 elements) > > > Yes, I see that. But when I change size_t to unsigned int (in LP64) > like so: > > void g (int *p, int *q) > { > unsigned n = (unsigned)(p - q); > > if (n < 10) > f (__builtin_alloca (n)); > } > > -Walloca-larger-than=100 still complains: > > warning: argument to ‘alloca’ may be too large > note: limit is 100 bytes, but argument may be as large as 4294967295 > > and the VRP dump shows > > _5: [0, 4294967295] > _14: [0, 4294967295] > ... > _3 = p.0_1 - q.1_2; > _4 = _3 /[ex] 4; > n_10 = (unsigned int) _4; > if (n_10 <= 9) > goto <bb 3>; [36.64%] > else > goto <bb 4>; [63.36%] > > <bb 3> [36.64%]: > _14 = _4 & 4294967295; > _5 = (long unsigned int) _14; > _6 = __builtin_alloca (_5); > > so it seems that even in VRP itself the range information isn't > quite good enough to avoid false positives. (Perhaps that's one > the missed cases you mention below?) Well, you see that there's no obvious way to use the n_10 <= 9 conditional here. VRP would need to register a [0, 9] range for the lower half of n_10 and then figure after seeing > _14 = _4 & 4294967295; that _14 is now [0, 9]. That's a neat trick VRP cannot handle at the moment (there's no way the lattice can encode a range for a sub-set of all bits of a SSA name). Your source is bogus in the way that (unsigned)(p - q) might truncate the pointer difference. Richard. > > >> When in EVRP you get similar behavior (well, there are some missed >> cases I have patches queued for for GCC 8), but instead of modifying >> the IL EVRP just temporarily sets the above range when it processes >> BBs dominated by the condition. >> >> So while for VRP you can put the warning code after propagation for >> EVRP you'd have to do warning processing during propagation itself >> (and EVRP doesn't iterate). >> >>> To answer your question, the gimple-ssa-sprintf pass that depends >>> heavily on ranges would also benefit from having access to the data >>> computed in the strlen pass that's not available outside it. >>> >>> The -Wstringop-overflow and -Walloc-size-larger-than warnings depend >>> on both VRP and tree-object-size. >>> >>> I have been thinking about how to integrate these passes in GCC 8 >>> to improve the overall quality. (By integrating I don't necessarily >>> mean merging the code but rather having them share data or be able >>> to make calls into one another.) >>> >>> I'd be very interested in your thoughts on this. >> >> >> Well, my thought is that we should not have N SSA propagation and >> M "DOM" propagation passes but one of each kind. My thought is >> also that object-size and strlen are of either kind. >> >> So ideally DOM and EVRP would merge and CCP, copyprop and VRP >> would merge. It's probably not possible (or wise) to merge their lattices >> (maybe to some extent). >> >> Those two passes could be used by a static analysis phase prior to >> any optimization to emit warnings (just computing the lattice, not doing >> any IL modification). > > > Thanks. I'll ponder that. > > Martin
On Fri, Jan 20, 2017 at 8:48 PM, Jeff Law <law@redhat.com> wrote: > On 01/20/2017 01:17 AM, Richard Biener wrote: >> >> On Thu, Jan 19, 2017 at 5:53 PM, Martin Sebor <msebor@gmail.com> wrote: >>> >>> On 01/19/2017 05:45 AM, Richard Biener wrote: >>>> >>>> >>>> On Thu, Jan 19, 2017 at 1:17 PM, Aldy Hernandez <aldyh@redhat.com> >>>> wrote: >>>>> >>>>> >>>>> In the attached testcase, we have a clearly bounded case of alloca >>>>> which >>>>> is >>>>> being incorrectly reported: >>>>> >>>>> void g (int *p, int *q) >>>>> { >>>>> size_t n = (size_t)(p - q); >>>>> >>>>> if (n < 10) >>>>> f (__builtin_alloca (n)); >>>>> } >>>>> >>>>> The problem is that VRP gives us an anti-range for `n' which may be out >>>>> of >>>>> range: >>>>> >>>>> # RANGE ~[2305843009213693952, 16140901064495857663] >>>>> n_9 = (long unsigned int) _4; >>>>> >>>>> We do a less than stellar job with casts and VR_ANTI_RANGE's, mostly >>>>> because >>>>> we're trying various heuristics to make up for the fact that we have >>>>> crappy >>>>> range info from VRP. More specifically, we're basically punting on an >>>>> VR_ANTI_RANGE and ignoring that the casted result (n_9) has a bound >>>>> later >>>>> on. >>>>> >>>>> Luckily, we already have code to check simple ranges coming into the >>>>> alloca >>>>> by looking into all the basic blocks feeding it. The attached patch >>>>> delays >>>>> the final decision on anti ranges until we have examined the basic >>>>> blocks >>>>> and determined for that we are definitely out of range. >>>>> >>>>> I expect all this to disappear with Andrew's upcoming range info >>>>> overhaul. >>>>> >>>>> OK for trunk? >>>> >>>> >>>> >>>> I _really_ wonder why all the range consuming warnings are not emitted >>>> from VRP itself (like we do for -Warray-bounds). There we'd still see >>>> a range for the argument derived from the if () rather than needing to >>>> do our own mini-VRP from the needessly "incomplete" range-info on >>>> SSA vars. >>> >>> >>> >>> Can you explain why the range info is only available in VRP and >>> not outside, via the get_range_info() API? It sounds as though >>> the API shouldn't be relied on (it was virtually unused before >>> GCC 7). >> >> >> It's very simple. Look at the testcase from above >> >> void g (int *p, int *q) >> { >> size_t n = (size_t)(p - q); >> >> if (n < 10) >> f (__builtin_alloca (n)); >> } >> >> The IL outside of VRP is >> >> <bb 2> [100.00%]: >> p.0_1 = (long int) p_7(D); >> q.1_2 = (long int) q_8(D); >> _3 = p.0_1 - q.1_2; >> _4 = _3 /[ex] 4; >> n_9 = (size_t) _4; >> if (n_9 <= 9) >> goto <bb 3>; [36.64%] >> else >> goto <bb 4>; [63.36%] >> >> <bb 3> [36.64%]: >> _5 = __builtin_alloca (n_9); >> f (_5); >> >> so there is no SSA name we can tack a range to covering the n_9 <= 9 >> condition that dominates __builtin_alloca. Inside VRP we have >> >> <bb 2> [100.00%]: >> p.0_1 = (long int) p_7(D); >> q.1_2 = (long int) q_8(D); >> _3 = p.0_1 - q.1_2; >> _4 = _3 /[ex] 4; >> n_9 = (size_t) _4; >> if (n_9 <= 9) >> goto <bb 3>; [36.64%] >> else >> goto <bb 4>; [63.36%] >> >> <bb 3> [36.64%]: >> n_13 = ASSERT_EXPR <n_9, n_9 <= 9>; >> _5 = __builtin_alloca (n_13); >> f (_5); >> >> and viola - now the alloca call uses n_13 which is defined at a point >> dominated by if (n_9 <= 9) and thus it has an improved range: >> >> n_13: [0, 9] EQUIVALENCES: { n_9 } (1 elements) >> >> When in EVRP you get similar behavior (well, there are some missed >> cases I have patches queued for for GCC 8), but instead of modifying >> the IL EVRP just temporarily sets the above range when it processes >> BBs dominated by the condition. >> >> So while for VRP you can put the warning code after propagation for >> EVRP you'd have to do warning processing during propagation itself >> (and EVRP doesn't iterate). >> >>> To answer your question, the gimple-ssa-sprintf pass that depends >>> heavily on ranges would also benefit from having access to the data >>> computed in the strlen pass that's not available outside it. >>> >>> The -Wstringop-overflow and -Walloc-size-larger-than warnings depend >>> on both VRP and tree-object-size. >>> >>> I have been thinking about how to integrate these passes in GCC 8 >>> to improve the overall quality. (By integrating I don't necessarily >>> mean merging the code but rather having them share data or be able >>> to make calls into one another.) >>> >>> I'd be very interested in your thoughts on this. >> >> >> Well, my thought is that we should not have N SSA propagation and >> M "DOM" propagation passes but one of each kind. My thought is >> also that object-size and strlen are of either kind. >> >> So ideally DOM and EVRP would merge and CCP, copyprop and VRP >> would merge. It's probably not possible (or wise) to merge their lattices >> (maybe to some extent) > > DOM's equivalency tables could be easily superceded by other > implementations. It's just two datastructures. A trivial const/copy table > and a stack to allow unwinding the state of the const/copy table. Throwing > away the basic const/copy table and replacing it with something built by > copyprop ought to be trivial. > > The big thing that would be left would be the scoped hash table for tracking > redundant expressions. But I don't think that we'd necessarily have to rip > that out to merge DOM with one of hte other passes. > > Hell we know DOM can fit well in any dominator based walk -- we used to do > it as part of the into-ssa transformation. Sure. The question is whether we want to make warning "passes" more expensive by essentially doing a [E]VRP/DOM pass (but not doing any IL transform). I believe doing that is more reasonable than doing their own hacky analysis. Having less passes to choose to copy from for such static analysis (and the ability to tame compile-time by some switches) would be a good thing to have. Richard. > > Jeff
>> Yes, I see that. But when I change size_t to unsigned int (in LP64) >> like so: >> >> void g (int *p, int *q) >> { >> unsigned n = (unsigned)(p - q); >> >> if (n < 10) >> f (__builtin_alloca (n)); >> } >> >> -Walloca-larger-than=100 still complains: >> >> warning: argument to ‘alloca’ may be too large >> note: limit is 100 bytes, but argument may be as large as 4294967295 >> >> and the VRP dump shows >> >> _5: [0, 4294967295] >> _14: [0, 4294967295] >> ... >> _3 = p.0_1 - q.1_2; >> _4 = _3 /[ex] 4; >> n_10 = (unsigned int) _4; >> if (n_10 <= 9) >> goto <bb 3>; [36.64%] >> else >> goto <bb 4>; [63.36%] >> >> <bb 3> [36.64%]: >> _14 = _4 & 4294967295; >> _5 = (long unsigned int) _14; >> _6 = __builtin_alloca (_5); >> >> so it seems that even in VRP itself the range information isn't >> quite good enough to avoid false positives. (Perhaps that's one >> the missed cases you mention below?) > > Well, you see that there's no obvious way to use the n_10 <= 9 conditional > here. VRP would need to register a [0, 9] range for the lower half of n_10 > and then figure after seeing > >> _14 = _4 & 4294967295; > > that _14 is now [0, 9]. > > That's a neat trick VRP cannot handle at the moment (there's no way the > lattice can encode a range for a sub-set of all bits of a SSA name). Sure. My point is just that it looks like there will be some false positives whether the warnings are implemented as part of the VRP pass or elsewhere. (I admit I haven't studied the implementation of the VRP pass to understand whether these false positives are avoidable and I'm happy to take your word if if you say they can.) > Your source is bogus in the way that (unsigned)(p - q) might truncate > the pointer difference. The problem is independent of the pointer difference and can be reproduced by any conversion from a wider type to a narrower unsigned type (even the safe ones), as the test case in bug 79191 I opened for it shows: void f (unsigned long n) { if (n > 3) __builtin_abort (); } void h (unsigned long m) { unsigned n = m; if (n < 3) f (n); } Martin
On 01/23/2017 02:50 AM, Richard Biener wrote: >>> >>> So ideally DOM and EVRP would merge and CCP, copyprop and VRP >>> would merge. It's probably not possible (or wise) to merge their lattices >>> (maybe to some extent) >> >> DOM's equivalency tables could be easily superceded by other >> implementations. It's just two datastructures. A trivial const/copy table >> and a stack to allow unwinding the state of the const/copy table. Throwing >> away the basic const/copy table and replacing it with something built by >> copyprop ought to be trivial. >> >> The big thing that would be left would be the scoped hash table for tracking >> redundant expressions. But I don't think that we'd necessarily have to rip >> that out to merge DOM with one of hte other passes. >> >> Hell we know DOM can fit well in any dominator based walk -- we used to do >> it as part of the into-ssa transformation. > > Sure. > > The question is whether we want to make warning "passes" more expensive > by essentially doing a [E]VRP/DOM pass (but not doing any IL transform). > > I believe doing that is more reasonable than doing their own hacky analysis. > > Having less passes to choose to copy from for such static analysis (and the > ability to tame compile-time by some switches) would be a good thing to have. Very true. But your comment seemed to be about merging DOM and EVRP or other passes -- which is certainly possible and possibly even desirable. The more I look at our SCCVN implementation, the more I want to explore trying to re-use that infrastructure to simplify DOM. Jeff
On 01/19/2017 05:17 AM, Aldy Hernandez wrote: > In the attached testcase, we have a clearly bounded case of alloca which > is being incorrectly reported: > > void g (int *p, int *q) > { > size_t n = (size_t)(p - q); > > if (n < 10) > f (__builtin_alloca (n)); > } > > The problem is that VRP gives us an anti-range for `n' which may be out > of range: > > # RANGE ~[2305843009213693952, 16140901064495857663] > n_9 = (long unsigned int) _4; > > We do a less than stellar job with casts and VR_ANTI_RANGE's, mostly > because we're trying various heuristics to make up for the fact that we > have crappy range info from VRP. More specifically, we're basically > punting on an VR_ANTI_RANGE and ignoring that the casted result (n_9) > has a bound later on. > > Luckily, we already have code to check simple ranges coming into the > alloca by looking into all the basic blocks feeding it. The attached > patch delays the final decision on anti ranges until we have examined > the basic blocks and determined for that we are definitely out of range. > > I expect all this to disappear with Andrew's upcoming range info overhaul. > > OK for trunk? > Yes this is another case where we're hoping that Andrew's work can help significantly. The recorded range for n_9 is the range that appplies over the entire function, when we really want the range for n_9 that occurs within the TRUE arm of the conditional. jeff
On 01/19/2017 05:17 AM, Aldy Hernandez wrote: > In the attached testcase, we have a clearly bounded case of alloca which > is being incorrectly reported: > > void g (int *p, int *q) > { > size_t n = (size_t)(p - q); > > if (n < 10) > f (__builtin_alloca (n)); > } > > The problem is that VRP gives us an anti-range for `n' which may be out > of range: > > # RANGE ~[2305843009213693952, 16140901064495857663] > n_9 = (long unsigned int) _4; > > We do a less than stellar job with casts and VR_ANTI_RANGE's, mostly > because we're trying various heuristics to make up for the fact that we > have crappy range info from VRP. More specifically, we're basically > punting on an VR_ANTI_RANGE and ignoring that the casted result (n_9) > has a bound later on. > > Luckily, we already have code to check simple ranges coming into the > alloca by looking into all the basic blocks feeding it. The attached > patch delays the final decision on anti ranges until we have examined > the basic blocks and determined for that we are definitely out of range. > > I expect all this to disappear with Andrew's upcoming range info overhaul. > > OK for trunk? > Oh, and this is fine for the trunk. jeff
On Mon, Jan 23, 2017 at 5:07 PM, Jeff Law <law@redhat.com> wrote: > On 01/23/2017 02:50 AM, Richard Biener wrote: >>>> >>>> >>>> So ideally DOM and EVRP would merge and CCP, copyprop and VRP >>>> would merge. It's probably not possible (or wise) to merge their >>>> lattices >>>> (maybe to some extent) >>> >>> >>> DOM's equivalency tables could be easily superceded by other >>> implementations. It's just two datastructures. A trivial const/copy >>> table >>> and a stack to allow unwinding the state of the const/copy table. >>> Throwing >>> away the basic const/copy table and replacing it with something built by >>> copyprop ought to be trivial. >>> >>> The big thing that would be left would be the scoped hash table for >>> tracking >>> redundant expressions. But I don't think that we'd necessarily have to >>> rip >>> that out to merge DOM with one of hte other passes. >>> >>> Hell we know DOM can fit well in any dominator based walk -- we used to >>> do >>> it as part of the into-ssa transformation. >> >> >> Sure. >> >> The question is whether we want to make warning "passes" more expensive >> by essentially doing a [E]VRP/DOM pass (but not doing any IL transform). >> >> I believe doing that is more reasonable than doing their own hacky >> analysis. >> >> Having less passes to choose to copy from for such static analysis (and >> the >> ability to tame compile-time by some switches) would be a good thing to >> have. > > Very true. But your comment seemed to be about merging DOM and EVRP or > other passes -- which is certainly possible and possibly even desirable. > > The more I look at our SCCVN implementation, the more I want to explore > trying to re-use that infrastructure to simplify DOM. Certainly having a single way to hash/record stmts/expressions on GIMPLE would be nice. Not sure if the SCCVN one is perfect (enhancing the memory part further is on my TODO list). But I also want to merge SCCVN and DOM in the end. Well, rip out the 'SCC' part of SCCVN and replace it with a RPO style VN which makes it very similar to DOM. Ideally we'd have one "VN" we can configure (like enable a VRP lattice, disable memory handling to get a pure CCP, disable optimistic VN and thus require no iteration, etc.). It may make sense to keep (a similarly single!) SSA-propagator based VN (ccp/copyprop/VRP). Richard. > > Jeff
On 01/24/2017 03:15 AM, Richard Biener wrote: >> The more I look at our SCCVN implementation, the more I want to explore >> trying to re-use that infrastructure to simplify DOM. > > Certainly having a single way to hash/record stmts/expressions on GIMPLE would > be nice. Not sure if the SCCVN one is perfect (enhancing the memory > part further is on my TODO list). It doesn't have to be perfect. From my wanderings in its code I think it's probably sufficient to replace 90% of the non-path specific DOM optimizations. The idea is to do value numbering as an independent step (SCCVN, RPO walk, whatever). Then use a scheduler similar to Click's 95 work to place statements where they belong. The result is far simpler than what DOM does and re-uses the VN framework. In that world DOM morphs away from generic CSE/simplifications and focuses on path specific stuff. And *that* should be reimplemented as backwards walk with value numbering. It's hard to tease that out right now given how we depend on DOM for generic CSE/simplifications. Jeff
diff --git a/gcc/gimple-ssa-warn-alloca.c b/gcc/gimple-ssa-warn-alloca.c index a27eea1..d553a34 100644 --- a/gcc/gimple-ssa-warn-alloca.c +++ b/gcc/gimple-ssa-warn-alloca.c @@ -272,6 +272,7 @@ static struct alloca_type_and_limit alloca_call_type (gimple *stmt, bool is_vla, tree *invalid_casted_type) { gcc_assert (gimple_alloca_call_p (stmt)); + bool tentative_cast_from_signed = false; tree len = gimple_call_arg (stmt, 0); tree len_casted = NULL; wide_int min, max; @@ -352,8 +353,26 @@ alloca_call_type (gimple *stmt, bool is_vla, tree *invalid_casted_type) // with this heuristic. Hopefully, this VR_ANTI_RANGE // nonsense will go away, and we won't have to catch the // sign conversion problems with this crap. + // + // This is here to catch things like: + // void foo(signed int n) { + // if (n < 100) + // alloca(n); + // ... + // } if (cast_from_signed_p (len, invalid_casted_type)) - return alloca_type_and_limit (ALLOCA_CAST_FROM_SIGNED); + { + // Unfortunately this also triggers: + // + // __SIZE_TYPE__ n = (__SIZE_TYPE__)blah; + // if (n < 100) + // alloca(n); + // + // ...which is clearly bounded. So, double check that + // the paths leading up to the size definitely don't + // have a bound. + tentative_cast_from_signed = true; + } } // No easily determined range and try other things. } @@ -371,10 +390,12 @@ alloca_call_type (gimple *stmt, bool is_vla, tree *invalid_casted_type) ret = alloca_call_type_by_arg (len, len_casted, EDGE_PRED (bb, ix), max_size); if (ret.type != ALLOCA_OK) - return ret; + break; } } + if (tentative_cast_from_signed && ret.type != ALLOCA_OK) + return alloca_type_and_limit (ALLOCA_CAST_FROM_SIGNED); return ret; } diff --git a/gcc/testsuite/gcc.dg/Walloca-13.c b/gcc/testsuite/gcc.dg/Walloca-13.c new file mode 100644 index 0000000..f9bdcef --- /dev/null +++ b/gcc/testsuite/gcc.dg/Walloca-13.c @@ -0,0 +1,11 @@ +/* { dg-do compile } */ +/* { dg-options "-Walloca-larger-than=100 -O2" } */ + +void f (void*); + +void g (int *p, int *q) +{ + __SIZE_TYPE__ n = (__SIZE_TYPE__)(p - q); + if (n < 100) + f (__builtin_alloca (n)); +}