diff mbox series

canonicalize unsigned [1,MAX] ranges into ~[0,0]

Message ID f57ee68e-f1ee-e406-e90b-32c651a7683e@redhat.com
State New
Headers show
Series canonicalize unsigned [1,MAX] ranges into ~[0,0] | expand

Commit Message

Aldy Hernandez Oct. 4, 2019, 12:59 p.m. UTC
When I did the value_range canonicalization work, I noticed that an 
unsigned [1,MAX] and an ~[0,0] could be two different representations 
for the same thing.  I didn't address the problem then because callers 
to ranges_from_anti_range() would go into an infinite loop trying to 
extract ~[0,0] into [1,MAX] and [].  We had a lot of callers to 
ranges_from_anti_range, and it smelled like a rat's nest, so I bailed.

Now that we have one main caller (from the symbolic PLUS/MINUS 
handling), it's a lot easier to contain.  Well, singleton_p also calls 
it, but it's already handling nonzero specially, so it wouldn't be affected.

With some upcoming cleanups I'm about to post, the fact that [1,MAX] and 
~[0,0] are equal_p(), but not nonzero_p(), matters.  Plus, it's just 
good form to have one representation, giving us the ability to pick at 
nonzero_p ranges with ease.

The code in extract_range_from_plus_minus_expr() continues to be a mess 
(as it has always been), but at least it's contained, and with this 
patch, it's slightly smaller.

Note, I'm avoiding adding a comment header for functions with highly 
descriptive obvious names.

OK?

Aldy

Comments

Jeff Law Oct. 4, 2019, 3:38 p.m. UTC | #1
On 10/4/19 6:59 AM, Aldy Hernandez wrote:
> When I did the value_range canonicalization work, I noticed that an
> unsigned [1,MAX] and an ~[0,0] could be two different representations
> for the same thing.  I didn't address the problem then because callers
> to ranges_from_anti_range() would go into an infinite loop trying to
> extract ~[0,0] into [1,MAX] and [].  We had a lot of callers to
> ranges_from_anti_range, and it smelled like a rat's nest, so I bailed.
> 
> Now that we have one main caller (from the symbolic PLUS/MINUS
> handling), it's a lot easier to contain.  Well, singleton_p also calls
> it, but it's already handling nonzero specially, so it wouldn't be affected.
> 
> 
> With some upcoming cleanups I'm about to post, the fact that [1,MAX] and
> ~[0,0] are equal_p(), but not nonzero_p(), matters.  Plus, it's just
> good form to have one representation, giving us the ability to pick at
> nonzero_p ranges with ease.
> 
> The code in extract_range_from_plus_minus_expr() continues to be a mess
> (as it has always been), but at least it's contained, and with this
> patch, it's slightly smaller.
> 
> Note, I'm avoiding adding a comment header for functions with highly
> descriptive obvious names.
> 
> OK?
> 
> Aldy
> 
> canonicalize-nonzero-ranges.patch
> 
> commit 1c333730deeb4ddadc46ad6d12d5344f92c0352c
> Author: Aldy Hernandez <aldyh@redhat.com>
> Date:   Fri Oct 4 08:51:25 2019 +0200
> 
>     Canonicalize UNSIGNED [1,MAX] into ~[0,0].
>     
>     Adapt PLUS/MINUS symbolic handling, so it doesn't call
>     ranges_from_anti_range with a VR_ANTI_RANGE containing one sub-range.
> 
> diff --git a/gcc/ChangeLog b/gcc/ChangeLog
> index 6e4f145af46..3934b41fdf9 100644
> --- a/gcc/ChangeLog
> +++ b/gcc/ChangeLog
> @@ -1,3 +1,18 @@
> +2019-10-04  Aldy Hernandez  <aldyh@redhat.com>
> +
> +	* tree-vrp.c (value_range_base::singleton_p): Use num_pairs
> +	instead of calling vrp_val_is_*.
> +	(value_range_base::set): Canonicalize unsigned [1,MAX] into
> +	non-zero.
> +	(range_has_numeric_bounds_p): New.
> +	(range_int_cst_p): Use range_has_numeric_bounds_p.
> +	(ranges_from_anti_range): Assert that we won't recurse
> +	indefinitely.
> +	(extract_extremes_from_range): New.
> +	(extract_range_from_plus_minus_expr): Adapt so we don't call
> +	ranges_from_anti_range with an anti-range containing only one
> +	sub-range.
So no problem with the implementation, but I do have a higher level
question.

One of the goals of the representation side of the Ranger project is to
drop anti-ranges.  Canonicalizing [1, MAX] to ~[0,0] seems to be going
in the opposite direction.   So do we really want to canonicalize to ~[0,0]?

jeff
Aldy Hernandez Oct. 4, 2019, 3:49 p.m. UTC | #2
On 10/4/19 11:38 AM, Jeff Law wrote:
> On 10/4/19 6:59 AM, Aldy Hernandez wrote:
>> When I did the value_range canonicalization work, I noticed that an
>> unsigned [1,MAX] and an ~[0,0] could be two different representations
>> for the same thing.  I didn't address the problem then because callers
>> to ranges_from_anti_range() would go into an infinite loop trying to
>> extract ~[0,0] into [1,MAX] and [].  We had a lot of callers to
>> ranges_from_anti_range, and it smelled like a rat's nest, so I bailed.
>>
>> Now that we have one main caller (from the symbolic PLUS/MINUS
>> handling), it's a lot easier to contain.  Well, singleton_p also calls
>> it, but it's already handling nonzero specially, so it wouldn't be affected.
>>
>>
>> With some upcoming cleanups I'm about to post, the fact that [1,MAX] and
>> ~[0,0] are equal_p(), but not nonzero_p(), matters.  Plus, it's just
>> good form to have one representation, giving us the ability to pick at
>> nonzero_p ranges with ease.
>>
>> The code in extract_range_from_plus_minus_expr() continues to be a mess
>> (as it has always been), but at least it's contained, and with this
>> patch, it's slightly smaller.
>>
>> Note, I'm avoiding adding a comment header for functions with highly
>> descriptive obvious names.
>>
>> OK?
>>
>> Aldy
>>
>> canonicalize-nonzero-ranges.patch
>>
>> commit 1c333730deeb4ddadc46ad6d12d5344f92c0352c
>> Author: Aldy Hernandez <aldyh@redhat.com>
>> Date:   Fri Oct 4 08:51:25 2019 +0200
>>
>>      Canonicalize UNSIGNED [1,MAX] into ~[0,0].
>>      
>>      Adapt PLUS/MINUS symbolic handling, so it doesn't call
>>      ranges_from_anti_range with a VR_ANTI_RANGE containing one sub-range.
>>
>> diff --git a/gcc/ChangeLog b/gcc/ChangeLog
>> index 6e4f145af46..3934b41fdf9 100644
>> --- a/gcc/ChangeLog
>> +++ b/gcc/ChangeLog
>> @@ -1,3 +1,18 @@
>> +2019-10-04  Aldy Hernandez  <aldyh@redhat.com>
>> +
>> +	* tree-vrp.c (value_range_base::singleton_p): Use num_pairs
>> +	instead of calling vrp_val_is_*.
>> +	(value_range_base::set): Canonicalize unsigned [1,MAX] into
>> +	non-zero.
>> +	(range_has_numeric_bounds_p): New.
>> +	(range_int_cst_p): Use range_has_numeric_bounds_p.
>> +	(ranges_from_anti_range): Assert that we won't recurse
>> +	indefinitely.
>> +	(extract_extremes_from_range): New.
>> +	(extract_range_from_plus_minus_expr): Adapt so we don't call
>> +	ranges_from_anti_range with an anti-range containing only one
>> +	sub-range.
> So no problem with the implementation, but I do have a higher level
> question.
> 
> One of the goals of the representation side of the Ranger project is to
> drop anti-ranges.  Canonicalizing [1, MAX] to ~[0,0] seems to be going
> in the opposite direction.   So do we really want to canonicalize to ~[0,0]?

Hmmm, Andrew had the same question.

It really doesn't matter what we canonicalize too, as long as we're 
consistent, but there are a bunch of non-zero tests throughout that were 
checking for the ~[0,0] construct, and I didn't want to rock the boat 
too much.  Although in all honesty, most of those should already be 
converted to the ::nonzero_p() API.

However, if we canonicalize into [1,MAX] for unsigned, we have the 
problem that a signed non-zero will still be ~[0,0], so our ::nonzero_p 
code will have to check two different representations, not to mention it 
will now have to check TYPE_UNSIGNED(type).

Aldy
Jeff Law Oct. 4, 2019, 4:02 p.m. UTC | #3
On 10/4/19 9:49 AM, Aldy Hernandez wrote:
> 
> 
> On 10/4/19 11:38 AM, Jeff Law wrote:
>> On 10/4/19 6:59 AM, Aldy Hernandez wrote:
>>> When I did the value_range canonicalization work, I noticed that an
>>> unsigned [1,MAX] and an ~[0,0] could be two different representations
>>> for the same thing.  I didn't address the problem then because callers
>>> to ranges_from_anti_range() would go into an infinite loop trying to
>>> extract ~[0,0] into [1,MAX] and [].  We had a lot of callers to
>>> ranges_from_anti_range, and it smelled like a rat's nest, so I bailed.
>>>
>>> Now that we have one main caller (from the symbolic PLUS/MINUS
>>> handling), it's a lot easier to contain.  Well, singleton_p also calls
>>> it, but it's already handling nonzero specially, so it wouldn't be affected.
>>>
>>>
>>>
>>> With some upcoming cleanups I'm about to post, the fact that [1,MAX] and
>>> ~[0,0] are equal_p(), but not nonzero_p(), matters.  Plus, it's just
>>> good form to have one representation, giving us the ability to pick at
>>> nonzero_p ranges with ease.
>>>
>>> The code in extract_range_from_plus_minus_expr() continues to be a mess
>>> (as it has always been), but at least it's contained, and with this
>>> patch, it's slightly smaller.
>>>
>>> Note, I'm avoiding adding a comment header for functions with highly
>>> descriptive obvious names.
>>>
>>> OK?
>>>
>>> Aldy
>>>
>>> canonicalize-nonzero-ranges.patch
>>>
>>> commit 1c333730deeb4ddadc46ad6d12d5344f92c0352c
>>> Author: Aldy Hernandez <aldyh@redhat.com>
>>> Date:   Fri Oct 4 08:51:25 2019 +0200
>>>
>>>      Canonicalize UNSIGNED [1,MAX] into ~[0,0].
>>>           Adapt PLUS/MINUS symbolic handling, so it doesn't call
>>>      ranges_from_anti_range with a VR_ANTI_RANGE containing one
>>> sub-range.
>>>
>>> diff --git a/gcc/ChangeLog b/gcc/ChangeLog
>>> index 6e4f145af46..3934b41fdf9 100644
>>> --- a/gcc/ChangeLog
>>> +++ b/gcc/ChangeLog
>>> @@ -1,3 +1,18 @@
>>> +2019-10-04  Aldy Hernandez  <aldyh@redhat.com>
>>> +
>>> +    * tree-vrp.c (value_range_base::singleton_p): Use num_pairs
>>> +    instead of calling vrp_val_is_*.
>>> +    (value_range_base::set): Canonicalize unsigned [1,MAX] into
>>> +    non-zero.
>>> +    (range_has_numeric_bounds_p): New.
>>> +    (range_int_cst_p): Use range_has_numeric_bounds_p.
>>> +    (ranges_from_anti_range): Assert that we won't recurse
>>> +    indefinitely.
>>> +    (extract_extremes_from_range): New.
>>> +    (extract_range_from_plus_minus_expr): Adapt so we don't call
>>> +    ranges_from_anti_range with an anti-range containing only one
>>> +    sub-range.
>> So no problem with the implementation, but I do have a higher level
>> question.
>>
>> One of the goals of the representation side of the Ranger project is to
>> drop anti-ranges.  Canonicalizing [1, MAX] to ~[0,0] seems to be going
>> in the opposite direction.   So do we really want to canonicalize to
>> ~[0,0]?
> 
> Hmmm, Andrew had the same question.
> 
> It really doesn't matter what we canonicalize too, as long as we're
> consistent, but there are a bunch of non-zero tests throughout that were
> checking for the ~[0,0] construct, and I didn't want to rock the boat
> too much.  Although in all honesty, most of those should already be
> converted to the ::nonzero_p() API.
> 
> However, if we canonicalize into [1,MAX] for unsigned, we have the
> problem that a signed non-zero will still be ~[0,0], so our ::nonzero_p
> code will have to check two different representations, not to mention it
> will now have to check TYPE_UNSIGNED(type).
ISTM that the right thing to do is to first move to the ::nonzero_p API,
which should be a behavior preserving change.  It'd probably be a medium
sized change, but highly mechanical and behavior preserving, so easy to
review.

Once that's in place, then I'd seriously look at [1,MAX] as the
canonical form for unsigned types and [MIN, -1] [1, MAX] as the
canonical form for signed types.  If we're trying to get away from ANTI
ranges, canonicalizing on ~[0,0] just seems wrong.

Where things may get painful is things like [MIN, -3] [3, MAX] which
occur due to arithmetic and pointer alignments.  I think we have a hack
somewhere which special cases this into [MIN, -1], [1, MAX] even though
it's technically less precise.

jeff
Aldy Hernandez Oct. 4, 2019, 4:14 p.m. UTC | #4
On 10/4/19 12:02 PM, Jeff Law wrote:
> On 10/4/19 9:49 AM, Aldy Hernandez wrote:
>>
>>
>> On 10/4/19 11:38 AM, Jeff Law wrote:
>>> On 10/4/19 6:59 AM, Aldy Hernandez wrote:
>>>> When I did the value_range canonicalization work, I noticed that an
>>>> unsigned [1,MAX] and an ~[0,0] could be two different representations
>>>> for the same thing.  I didn't address the problem then because callers
>>>> to ranges_from_anti_range() would go into an infinite loop trying to
>>>> extract ~[0,0] into [1,MAX] and [].  We had a lot of callers to
>>>> ranges_from_anti_range, and it smelled like a rat's nest, so I bailed.
>>>>
>>>> Now that we have one main caller (from the symbolic PLUS/MINUS
>>>> handling), it's a lot easier to contain.  Well, singleton_p also calls
>>>> it, but it's already handling nonzero specially, so it wouldn't be affected.
>>>>
>>>>
>>>>
>>>> With some upcoming cleanups I'm about to post, the fact that [1,MAX] and
>>>> ~[0,0] are equal_p(), but not nonzero_p(), matters.  Plus, it's just
>>>> good form to have one representation, giving us the ability to pick at
>>>> nonzero_p ranges with ease.
>>>>
>>>> The code in extract_range_from_plus_minus_expr() continues to be a mess
>>>> (as it has always been), but at least it's contained, and with this
>>>> patch, it's slightly smaller.
>>>>
>>>> Note, I'm avoiding adding a comment header for functions with highly
>>>> descriptive obvious names.
>>>>
>>>> OK?
>>>>
>>>> Aldy
>>>>
>>>> canonicalize-nonzero-ranges.patch
>>>>
>>>> commit 1c333730deeb4ddadc46ad6d12d5344f92c0352c
>>>> Author: Aldy Hernandez <aldyh@redhat.com>
>>>> Date:   Fri Oct 4 08:51:25 2019 +0200
>>>>
>>>>       Canonicalize UNSIGNED [1,MAX] into ~[0,0].
>>>>            Adapt PLUS/MINUS symbolic handling, so it doesn't call
>>>>       ranges_from_anti_range with a VR_ANTI_RANGE containing one
>>>> sub-range.
>>>>
>>>> diff --git a/gcc/ChangeLog b/gcc/ChangeLog
>>>> index 6e4f145af46..3934b41fdf9 100644
>>>> --- a/gcc/ChangeLog
>>>> +++ b/gcc/ChangeLog
>>>> @@ -1,3 +1,18 @@
>>>> +2019-10-04  Aldy Hernandez  <aldyh@redhat.com>
>>>> +
>>>> +    * tree-vrp.c (value_range_base::singleton_p): Use num_pairs
>>>> +    instead of calling vrp_val_is_*.
>>>> +    (value_range_base::set): Canonicalize unsigned [1,MAX] into
>>>> +    non-zero.
>>>> +    (range_has_numeric_bounds_p): New.
>>>> +    (range_int_cst_p): Use range_has_numeric_bounds_p.
>>>> +    (ranges_from_anti_range): Assert that we won't recurse
>>>> +    indefinitely.
>>>> +    (extract_extremes_from_range): New.
>>>> +    (extract_range_from_plus_minus_expr): Adapt so we don't call
>>>> +    ranges_from_anti_range with an anti-range containing only one
>>>> +    sub-range.
>>> So no problem with the implementation, but I do have a higher level
>>> question.
>>>
>>> One of the goals of the representation side of the Ranger project is to
>>> drop anti-ranges.  Canonicalizing [1, MAX] to ~[0,0] seems to be going
>>> in the opposite direction.   So do we really want to canonicalize to
>>> ~[0,0]?
>>
>> Hmmm, Andrew had the same question.
>>
>> It really doesn't matter what we canonicalize too, as long as we're
>> consistent, but there are a bunch of non-zero tests throughout that were
>> checking for the ~[0,0] construct, and I didn't want to rock the boat
>> too much.  Although in all honesty, most of those should already be
>> converted to the ::nonzero_p() API.
>>
>> However, if we canonicalize into [1,MAX] for unsigned, we have the
>> problem that a signed non-zero will still be ~[0,0], so our ::nonzero_p
>> code will have to check two different representations, not to mention it
>> will now have to check TYPE_UNSIGNED(type).
> ISTM that the right thing to do is to first move to the ::nonzero_p API,
> which should be a behavior preserving change.  It'd probably be a medium
> sized change, but highly mechanical and behavior preserving, so easy to
> review.

Ughh, please no?  This was just a means to get the general range_fold* 
cleanups which are important and make everything easier to read.  I'd 
rather not get distracted by having to audit all the nonzero checking 
throughout.

Besides, we can't get away from anti-ranges inasmuch as we have 
value_range_base, which hopefully we can move away from in a future 
release.  So this will eventually become a non-issue.  At that point, 
we'll loose ALL anti-ranges once and for all.

~[0,0] has been the accepted way for a long time, I'd really prefer to 
keep that (for now).

And really, this is just plain ugly:

bool
value_range_base::nonzero_p ()
{
   if (m_kind == VR_ANTI_RANGE
       && !TYPE_UNSIGNED (type ())
       && integer_zerop (m_min)
       && integer_zerop (m_max))
     return true;

   return (m_kind == VR_RANGE
	  && TYPE_UNSIGNED (type ())
	  && integer_onep (m_min)
	  && vrp_val_is_max (m_max));
}

Aldy

> 
> Once that's in place, then I'd seriously look at [1,MAX] as the
> canonical form for unsigned types and [MIN, -1] [1, MAX] as the
> canonical form for signed types.  If we're trying to get away from ANTI
> ranges, canonicalizing on ~[0,0] just seems wrong.
> 
> Where things may get painful is things like [MIN, -3] [3, MAX] which
> occur due to arithmetic and pointer alignments.  I think we have a hack
> somewhere which special cases this into [MIN, -1], [1, MAX] even though
> it's technically less precise.
> 
> jeff
>
Richard Biener Oct. 4, 2019, 4:29 p.m. UTC | #5
On October 4, 2019 5:38:09 PM GMT+02:00, Jeff Law <law@redhat.com> wrote:
>On 10/4/19 6:59 AM, Aldy Hernandez wrote:
>> When I did the value_range canonicalization work, I noticed that an
>> unsigned [1,MAX] and an ~[0,0] could be two different representations
>> for the same thing.  I didn't address the problem then because
>callers
>> to ranges_from_anti_range() would go into an infinite loop trying to
>> extract ~[0,0] into [1,MAX] and [].  We had a lot of callers to
>>
>ranges_from_anti_range, and it smelled like a rat's nest, so I bailed.
>> 
>> Now that we have one main caller (from the symbolic PLUS/MINUS
>> handling), it's a lot easier to contain.  Well, singleton_p also
>calls
>>
>it, but it's already handling nonzero specially, so it wouldn't be affected.
>> 
>> 
>> With some upcoming cleanups I'm about to post, the fact that [1,MAX]
>and
>> ~[0,0] are equal_p(), but not nonzero_p(), matters.  Plus, it's just
>> good form to have one representation, giving us the ability to pick
>at
>> nonzero_p ranges with ease.
>> 
>> The code in extract_range_from_plus_minus_expr() continues to be a
>mess
>> (as it has always been), but at least it's contained, and with this
>> patch, it's slightly smaller.
>> 
>> Note, I'm avoiding adding a comment header for functions with highly
>> descriptive obvious names.
>> 
>> OK?
>> 
>> Aldy
>> 
>> canonicalize-nonzero-ranges.patch
>> 
>> commit 1c333730deeb4ddadc46ad6d12d5344f92c0352c
>> Author: Aldy Hernandez <aldyh@redhat.com>
>> Date:   Fri Oct 4 08:51:25 2019 +0200
>> 
>>     Canonicalize UNSIGNED [1,MAX] into ~[0,0].
>>     
>>     Adapt PLUS/MINUS symbolic handling, so it doesn't call
>>     ranges_from_anti_range with a VR_ANTI_RANGE containing one
>sub-range.
>> 
>> diff --git a/gcc/ChangeLog b/gcc/ChangeLog
>> index 6e4f145af46..3934b41fdf9 100644
>> --- a/gcc/ChangeLog
>> +++ b/gcc/ChangeLog
>> @@ -1,3 +1,18 @@
>> +2019-10-04  Aldy Hernandez  <aldyh@redhat.com>
>> +
>> +	* tree-vrp.c (value_range_base::singleton_p): Use num_pairs
>> +	instead of calling vrp_val_is_*.
>> +	(value_range_base::set): Canonicalize unsigned [1,MAX] into
>> +	non-zero.
>> +	(range_has_numeric_bounds_p): New.
>> +	(range_int_cst_p): Use range_has_numeric_bounds_p.
>> +	(ranges_from_anti_range): Assert that we won't recurse
>> +	indefinitely.
>> +	(extract_extremes_from_range): New.
>> +	(extract_range_from_plus_minus_expr): Adapt so we don't call
>> +	ranges_from_anti_range with an anti-range containing only one
>> +	sub-range.
>So no problem with the implementation, but I do have a higher level
>question.
>
>One of the goals of the representation side of the Ranger project is to
>drop anti-ranges.  Canonicalizing [1, MAX] to ~[0,0] seems to be going
>in the opposite direction.   So do we really want to canonicalize to
>~[0,0]?

No, we don't. 

Richard. 

>jeff
Jeff Law Oct. 4, 2019, 5:17 p.m. UTC | #6
On 10/4/19 10:14 AM, Aldy Hernandez wrote:
> 
> 
> On 10/4/19 12:02 PM, Jeff Law wrote:
>> On 10/4/19 9:49 AM, Aldy Hernandez wrote:
>>>
>>>
>>> On 10/4/19 11:38 AM, Jeff Law wrote:
>>>> On 10/4/19 6:59 AM, Aldy Hernandez wrote:
>>>>> When I did the value_range canonicalization work, I noticed that an
>>>>> unsigned [1,MAX] and an ~[0,0] could be two different representations
>>>>> for the same thing.  I didn't address the problem then because callers
>>>>> to ranges_from_anti_range() would go into an infinite loop trying to
>>>>> extract ~[0,0] into [1,MAX] and [].  We had a lot of callers to
>>>>> ranges_from_anti_range, and it smelled like a rat's nest, so I bailed.
>>>>>
>>>>> Now that we have one main caller (from the symbolic PLUS/MINUS
>>>>> handling), it's a lot easier to contain.  Well, singleton_p also calls
>>>>> it, but it's already handling nonzero specially, so it wouldn't be affected.
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> With some upcoming cleanups I'm about to post, the fact that
>>>>> [1,MAX] and
>>>>> ~[0,0] are equal_p(), but not nonzero_p(), matters.  Plus, it's just
>>>>> good form to have one representation, giving us the ability to pick at
>>>>> nonzero_p ranges with ease.
>>>>>
>>>>> The code in extract_range_from_plus_minus_expr() continues to be a
>>>>> mess
>>>>> (as it has always been), but at least it's contained, and with this
>>>>> patch, it's slightly smaller.
>>>>>
>>>>> Note, I'm avoiding adding a comment header for functions with highly
>>>>> descriptive obvious names.
>>>>>
>>>>> OK?
>>>>>
>>>>> Aldy
>>>>>
>>>>> canonicalize-nonzero-ranges.patch
>>>>>
>>>>> commit 1c333730deeb4ddadc46ad6d12d5344f92c0352c
>>>>> Author: Aldy Hernandez <aldyh@redhat.com>
>>>>> Date:   Fri Oct 4 08:51:25 2019 +0200
>>>>>
>>>>>       Canonicalize UNSIGNED [1,MAX] into ~[0,0].
>>>>>            Adapt PLUS/MINUS symbolic handling, so it doesn't call
>>>>>       ranges_from_anti_range with a VR_ANTI_RANGE containing one
>>>>> sub-range.
>>>>>
>>>>> diff --git a/gcc/ChangeLog b/gcc/ChangeLog
>>>>> index 6e4f145af46..3934b41fdf9 100644
>>>>> --- a/gcc/ChangeLog
>>>>> +++ b/gcc/ChangeLog
>>>>> @@ -1,3 +1,18 @@
>>>>> +2019-10-04  Aldy Hernandez  <aldyh@redhat.com>
>>>>> +
>>>>> +    * tree-vrp.c (value_range_base::singleton_p): Use num_pairs
>>>>> +    instead of calling vrp_val_is_*.
>>>>> +    (value_range_base::set): Canonicalize unsigned [1,MAX] into
>>>>> +    non-zero.
>>>>> +    (range_has_numeric_bounds_p): New.
>>>>> +    (range_int_cst_p): Use range_has_numeric_bounds_p.
>>>>> +    (ranges_from_anti_range): Assert that we won't recurse
>>>>> +    indefinitely.
>>>>> +    (extract_extremes_from_range): New.
>>>>> +    (extract_range_from_plus_minus_expr): Adapt so we don't call
>>>>> +    ranges_from_anti_range with an anti-range containing only one
>>>>> +    sub-range.
>>>> So no problem with the implementation, but I do have a higher level
>>>> question.
>>>>
>>>> One of the goals of the representation side of the Ranger project is to
>>>> drop anti-ranges.  Canonicalizing [1, MAX] to ~[0,0] seems to be going
>>>> in the opposite direction.   So do we really want to canonicalize to
>>>> ~[0,0]?
>>>
>>> Hmmm, Andrew had the same question.
>>>
>>> It really doesn't matter what we canonicalize too, as long as we're
>>> consistent, but there are a bunch of non-zero tests throughout that were
>>> checking for the ~[0,0] construct, and I didn't want to rock the boat
>>> too much.  Although in all honesty, most of those should already be
>>> converted to the ::nonzero_p() API.
>>>
>>> However, if we canonicalize into [1,MAX] for unsigned, we have the
>>> problem that a signed non-zero will still be ~[0,0], so our ::nonzero_p
>>> code will have to check two different representations, not to mention it
>>> will now have to check TYPE_UNSIGNED(type).
>> ISTM that the right thing to do is to first move to the ::nonzero_p API,
>> which should be a behavior preserving change.  It'd probably be a medium
>> sized change, but highly mechanical and behavior preserving, so easy to
>> review.
> 
> Ughh, please no?  This was just a means to get the general range_fold*
> cleanups which are important and make everything easier to read.  I'd
> rather not get distracted by having to audit all the nonzero checking
> throughout.
Doesn't the audit just have to look for an open-coded check for ~[0,0]
and convert any to use the API?  I don't think we have *that* many
anti-range bits I wouldn't think this would be terrible.

What am I missing here that makes this so painful?


> 
> Besides, we can't get away from anti-ranges inasmuch as we have
> value_range_base, which hopefully we can move away from in a future
> release.  So this will eventually become a non-issue.  At that point,
> we'll loose ALL anti-ranges once and for all.
Even if we can't get away from them, introducing more, particularly
canonicalizing on a form using anti-ranges just seems like we're going
backwards.

If we funnel everything through the established API, then changing the
canonical form becomes trivial because it's isolated to just two places.
 The test ::nonzero_p method and the canonicalization point.

> 
> ~[0,0] has been the accepted way for a long time, I'd really prefer to
> keep that (for now).
It has.  Very true.  But I don't necessarily think that means we should
be introducing even more of 'em.


> 
> And really, this is just plain ugly:
> 
> bool
> value_range_base::nonzero_p ()
> {
>   if (m_kind == VR_ANTI_RANGE
>       && !TYPE_UNSIGNED (type ())
>       && integer_zerop (m_min)
>       && integer_zerop (m_max))
>     return true;
> 
>   return (m_kind == VR_RANGE
>       && TYPE_UNSIGNED (type ())
>       && integer_onep (m_min)
>       && vrp_val_is_max (m_max));
> }
That doesn't seem bad at all, particularly if this is where we contain
the wart.

jeff
Aldy Hernandez Oct. 7, 2019, 12:28 p.m. UTC | #7
On 10/4/19 1:17 PM, Jeff Law wrote:
> On 10/4/19 10:14 AM, Aldy Hernandez wrote:
>>
>>
>> On 10/4/19 12:02 PM, Jeff Law wrote:
>>> On 10/4/19 9:49 AM, Aldy Hernandez wrote:
>>>>
>>>>
>>>> On 10/4/19 11:38 AM, Jeff Law wrote:
>>>>> On 10/4/19 6:59 AM, Aldy Hernandez wrote:
>>>>>> When I did the value_range canonicalization work, I noticed that an
>>>>>> unsigned [1,MAX] and an ~[0,0] could be two different representations
>>>>>> for the same thing.  I didn't address the problem then because callers
>>>>>> to ranges_from_anti_range() would go into an infinite loop trying to
>>>>>> extract ~[0,0] into [1,MAX] and [].  We had a lot of callers to
>>>>>> ranges_from_anti_range, and it smelled like a rat's nest, so I bailed.
>>>>>>
>>>>>> Now that we have one main caller (from the symbolic PLUS/MINUS
>>>>>> handling), it's a lot easier to contain.  Well, singleton_p also calls
>>>>>> it, but it's already handling nonzero specially, so it wouldn't be affected.
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> With some upcoming cleanups I'm about to post, the fact that
>>>>>> [1,MAX] and
>>>>>> ~[0,0] are equal_p(), but not nonzero_p(), matters.  Plus, it's just
>>>>>> good form to have one representation, giving us the ability to pick at
>>>>>> nonzero_p ranges with ease.
>>>>>>
>>>>>> The code in extract_range_from_plus_minus_expr() continues to be a
>>>>>> mess
>>>>>> (as it has always been), but at least it's contained, and with this
>>>>>> patch, it's slightly smaller.
>>>>>>
>>>>>> Note, I'm avoiding adding a comment header for functions with highly
>>>>>> descriptive obvious names.
>>>>>>
>>>>>> OK?
>>>>>>
>>>>>> Aldy
>>>>>>
>>>>>> canonicalize-nonzero-ranges.patch
>>>>>>
>>>>>> commit 1c333730deeb4ddadc46ad6d12d5344f92c0352c
>>>>>> Author: Aldy Hernandez <aldyh@redhat.com>
>>>>>> Date:   Fri Oct 4 08:51:25 2019 +0200
>>>>>>
>>>>>>        Canonicalize UNSIGNED [1,MAX] into ~[0,0].
>>>>>>             Adapt PLUS/MINUS symbolic handling, so it doesn't call
>>>>>>        ranges_from_anti_range with a VR_ANTI_RANGE containing one
>>>>>> sub-range.
>>>>>>
>>>>>> diff --git a/gcc/ChangeLog b/gcc/ChangeLog
>>>>>> index 6e4f145af46..3934b41fdf9 100644
>>>>>> --- a/gcc/ChangeLog
>>>>>> +++ b/gcc/ChangeLog
>>>>>> @@ -1,3 +1,18 @@
>>>>>> +2019-10-04  Aldy Hernandez  <aldyh@redhat.com>
>>>>>> +
>>>>>> +    * tree-vrp.c (value_range_base::singleton_p): Use num_pairs
>>>>>> +    instead of calling vrp_val_is_*.
>>>>>> +    (value_range_base::set): Canonicalize unsigned [1,MAX] into
>>>>>> +    non-zero.
>>>>>> +    (range_has_numeric_bounds_p): New.
>>>>>> +    (range_int_cst_p): Use range_has_numeric_bounds_p.
>>>>>> +    (ranges_from_anti_range): Assert that we won't recurse
>>>>>> +    indefinitely.
>>>>>> +    (extract_extremes_from_range): New.
>>>>>> +    (extract_range_from_plus_minus_expr): Adapt so we don't call
>>>>>> +    ranges_from_anti_range with an anti-range containing only one
>>>>>> +    sub-range.
>>>>> So no problem with the implementation, but I do have a higher level
>>>>> question.
>>>>>
>>>>> One of the goals of the representation side of the Ranger project is to
>>>>> drop anti-ranges.  Canonicalizing [1, MAX] to ~[0,0] seems to be going
>>>>> in the opposite direction.   So do we really want to canonicalize to
>>>>> ~[0,0]?
>>>>
>>>> Hmmm, Andrew had the same question.
>>>>
>>>> It really doesn't matter what we canonicalize too, as long as we're
>>>> consistent, but there are a bunch of non-zero tests throughout that were
>>>> checking for the ~[0,0] construct, and I didn't want to rock the boat
>>>> too much.  Although in all honesty, most of those should already be
>>>> converted to the ::nonzero_p() API.
>>>>
>>>> However, if we canonicalize into [1,MAX] for unsigned, we have the
>>>> problem that a signed non-zero will still be ~[0,0], so our ::nonzero_p
>>>> code will have to check two different representations, not to mention it
>>>> will now have to check TYPE_UNSIGNED(type).
>>> ISTM that the right thing to do is to first move to the ::nonzero_p API,
>>> which should be a behavior preserving change.  It'd probably be a medium
>>> sized change, but highly mechanical and behavior preserving, so easy to
>>> review.
>>
>> Ughh, please no?  This was just a means to get the general range_fold*
>> cleanups which are important and make everything easier to read.  I'd
>> rather not get distracted by having to audit all the nonzero checking
>> throughout.
> Doesn't the audit just have to look for an open-coded check for ~[0,0]
> and convert any to use the API?  I don't think we have *that* many
> anti-range bits I wouldn't think this would be terrible.
> 
> What am I missing here that makes this so painful?

I think I'm just suffering from PTSD from anything VRP related.  If you 
fix something correctly, 20 things are bound to break because they were 
expecting incorrect behavior.

> 
> 
>>
>> Besides, we can't get away from anti-ranges inasmuch as we have
>> value_range_base, which hopefully we can move away from in a future
>> release.  So this will eventually become a non-issue.  At that point,
>> we'll loose ALL anti-ranges once and for all.
> Even if we can't get away from them, introducing more, particularly
> canonicalizing on a form using anti-ranges just seems like we're going
> backwards.
> 
> If we funnel everything through the established API, then changing the
> canonical form becomes trivial because it's isolated to just two places.
>   The test ::nonzero_p method and the canonicalization point.
> 
>>
>> ~[0,0] has been the accepted way for a long time, I'd really prefer to
>> keep that (for now).
> It has.  Very true.  But I don't necessarily think that means we should
> be introducing even more of 'em.
> 
> 
>>
>> And really, this is just plain ugly:
>>
>> bool
>> value_range_base::nonzero_p ()
>> {
>>    if (m_kind == VR_ANTI_RANGE
>>        && !TYPE_UNSIGNED (type ())
>>        && integer_zerop (m_min)
>>        && integer_zerop (m_max))
>>      return true;
>>
>>    return (m_kind == VR_RANGE
>>        && TYPE_UNSIGNED (type ())
>>        && integer_onep (m_min)
>>        && vrp_val_is_max (m_max));
>> }
> That doesn't seem bad at all, particularly if this is where we contain
> the wart.

Fair enough.  I guess I don't care what we settle on, inasmuch as we 
canonicalize into one true value.  For some reason, I thought the above 
nonzero would cause you to cringe, I guess not :).

Happily, normalizing into ~0 for signed and [1,MAX] for unsigned, 
simplified the patch because it on longer needs tweaks to 
ranges_from_anti_range.

OK for trunk?

Aldy

p.s. This still leaves open the issue with ipa_vr's handling of 
nonzero_p.  As per my last message, I've adjusted it to check for either 
~[0,0] or the [1,MAX] variant for unsigned, since before this patch 
there were two ways of representing the same thing.  However, ipa-prop 
has no API (it open codes everything), as it has rolled its own version 
of "value ranges" with wide-ints.

class GTY(()) ipa_vr
{
public:
   /* The data fields below are valid only if known is true.  */
   bool known;
   enum value_range_kind type;
   wide_int min;
   wide_int max;
   bool nonzero_p (tree) const;
};

I'm tempted to leave the nonzero_p which checks for both ~0 and [1,MAX]:

bool
ipa_vr::nonzero_p (tree expr_type) const
{
   if (type == VR_ANTI_RANGE && wi::eq_p (min, 0) && wi::eq_p (max, 0))
     return true;

   unsigned prec = TYPE_PRECISION (expr_type);
   return (type == VR_RANGE
	  && TYPE_UNSIGNED (expr_type)
	  && wi::eq_p (min, wi::one (prec))
	  && wi::eq_p (max, wi::max_value (prec, TYPE_SIGN (expr_type))));
}

I don't care either way.
Jeff Law Oct. 13, 2019, 4:28 p.m. UTC | #8
On 10/7/19 6:28 AM, Aldy Hernandez wrote:

> 
> Fair enough.  I guess I don't care what we settle on, inasmuch as we
> canonicalize into one true value.  For some reason, I thought the above
> nonzero would cause you to cringe, I guess not :).
:-)  Takes more than that these days..  And just to reiterate for the
larger audience what we discussed in IRC -- I'm absolutely for
canonicalization.  We're just debating what the canonical form should be
and how to get there.

> 
> Happily, normalizing into ~0 for signed and [1,MAX] for unsigned,
> simplified the patch because it on longer needs tweaks to
> ranges_from_anti_range.
Looks like it was ultimately far smaller than I expected.  Unless we
haven't audited existing code looking for open-coded ~[0,0] tests.


> 
> OK for trunk?
With a ChangeLog, yes.


> 
> Aldy
> 
> p.s. This still leaves open the issue with ipa_vr's handling of
> nonzero_p.  As per my last message, I've adjusted it to check for either
> ~[0,0] or the [1,MAX] variant for unsigned, since before this patch
> there were two ways of representing the same thing.  However, ipa-prop
> has no API (it open codes everything), as it has rolled its own version
> of "value ranges" with wide-ints.
> 
> class GTY(()) ipa_vr
> {
> public:
>   /* The data fields below are valid only if known is true.  */
>   bool known;
>   enum value_range_kind type;
>   wide_int min;
>   wide_int max;
>   bool nonzero_p (tree) const;
> };
> 
> I'm tempted to leave the nonzero_p which checks for both ~0 and [1,MAX]:
> 
> bool
> ipa_vr::nonzero_p (tree expr_type) const
> {
>   if (type == VR_ANTI_RANGE && wi::eq_p (min, 0) && wi::eq_p (max, 0))
>     return true;
> 
>   unsigned prec = TYPE_PRECISION (expr_type);
>   return (type == VR_RANGE
>       && TYPE_UNSIGNED (expr_type)
>       && wi::eq_p (min, wi::one (prec))
>       && wi::eq_p (max, wi::max_value (prec, TYPE_SIGN (expr_type))));
> }
> 
> I don't care either way.
Sigh.  I can live with either here as well.  I'm hesitant to start
mucking around with it simply because it's well outside of where we've
got expertise.

jeff
Rainer Orth Oct. 15, 2019, 11:58 a.m. UTC | #9
Hi Aldy,

>>> ~[0,0] has been the accepted way for a long time, I'd really prefer to
>>> keep that (for now).
>> It has.  Very true.  But I don't necessarily think that means we should
>> be introducing even more of 'em.
[...]
> Happily, normalizing into ~0 for signed and [1,MAX] for unsigned,
> simplified the patch because it on longer needs tweaks to
> ranges_from_anti_range.
>
> OK for trunk?

the new testcase FAILs on several (all?) 32-bit targets:

+FAIL: gcc.dg/tree-ssa/evrp4.c scan-tree-dump evrp "\\\\[1B, -1B\\\\]"

I'm seeing this on 32-bit i386-pc-solaris2.11 and sparc-sun-solaris2.11,
with more reports for armv8l, pru, and s390x.

Comparing the dumps between 64 and 32-bit, I see

-_1: int * [1B, -1B]
+_1: int * [1B, 4294967295B]

	Rainer
Aldy Hernandez Oct. 15, 2019, 12:35 p.m. UTC | #10
On 10/15/19 7:58 AM, Rainer Orth wrote:
> Hi Aldy,
> 
>>>> ~[0,0] has been the accepted way for a long time, I'd really prefer to
>>>> keep that (for now).
>>> It has.  Very true.  But I don't necessarily think that means we should
>>> be introducing even more of 'em.
> [...]
>> Happily, normalizing into ~0 for signed and [1,MAX] for unsigned,
>> simplified the patch because it on longer needs tweaks to
>> ranges_from_anti_range.
>>
>> OK for trunk?
> 
> the new testcase FAILs on several (all?) 32-bit targets:
> 
> +FAIL: gcc.dg/tree-ssa/evrp4.c scan-tree-dump evrp "\\\\[1B, -1B\\\\]"

That's unfortunate.

Is this the only test that is failing?

> 
> I'm seeing this on 32-bit i386-pc-solaris2.11 and sparc-sun-solaris2.11,
> with more reports for armv8l, pru, and s390x.
> 
> Comparing the dumps between 64 and 32-bit, I see
> 
> -_1: int * [1B, -1B]
> +_1: int * [1B, 4294967295B]

I wonder why 32-bit targets at displaying 4294967295 instead of -1.  Or 
are pointers 64-bits here?

I wonder if we should just change value_range_base::dump() to dump 
~[0,0] for ::nonzero_p(), that way we have a consistent way of 
*displaying* non-zeroness for tests.

Aldy
Rainer Orth Oct. 15, 2019, 12:37 p.m. UTC | #11
Hi Aldy,

> On 10/15/19 7:58 AM, Rainer Orth wrote:
>> Hi Aldy,
>>
>>>>> ~[0,0] has been the accepted way for a long time, I'd really prefer to
>>>>> keep that (for now).
>>>> It has.  Very true.  But I don't necessarily think that means we should
>>>> be introducing even more of 'em.
>> [...]
>>> Happily, normalizing into ~0 for signed and [1,MAX] for unsigned,
>>> simplified the patch because it on longer needs tweaks to
>>> ranges_from_anti_range.
>>>
>>> OK for trunk?
>>
>> the new testcase FAILs on several (all?) 32-bit targets:
>>
>> +FAIL: gcc.dg/tree-ssa/evrp4.c scan-tree-dump evrp "\\\\[1B, -1B\\\\]"
>
> That's unfortunate.
>
> Is this the only test that is failing?

it's the only on on Solaris/SPARC and Solaris/x86.  Haven't checked
other affected targets, though.

>> I'm seeing this on 32-bit i386-pc-solaris2.11 and sparc-sun-solaris2.11,
>> with more reports for armv8l, pru, and s390x.
>>
>> Comparing the dumps between 64 and 32-bit, I see
>>
>> -_1: int * [1B, -1B]
>> +_1: int * [1B, 4294967295B]
>
> I wonder why 32-bit targets at displaying 4294967295 instead of -1.  Or are
> pointers 64-bits here?

No, it's a pure 32-bit target.  The compiler is 32-bit, too, but
bi-arch (32 and 64-bit).

	Rainer
Iain Sandoe Oct. 15, 2019, 12:53 p.m. UTC | #12
Hi Aldy,

Rainer Orth <ro@CeBiTec.Uni-Bielefeld.DE> wrote:
>> 
>> On 10/15/19 7:58 AM, Rainer Orth wrote:
>>> Hi Aldy,
>>> 
>>>>>> ~[0,0] has been the accepted way for a long time, I'd really prefer to
>>>>>> keep that (for now).
>>>>> It has.  Very true.  But I don't necessarily think that means we should
>>>>> be introducing even more of 'em.
>>> [...]
>>>> Happily, normalizing into ~0 for signed and [1,MAX] for unsigned,
>>>> simplified the patch because it on longer needs tweaks to
>>>> ranges_from_anti_range.
>>>> 
>>>> OK for trunk?
>>> 
>>> the new testcase FAILs on several (all?) 32-bit targets:
>>> 
>>> +FAIL: gcc.dg/tree-ssa/evrp4.c scan-tree-dump evrp "\\\\[1B, -1B\\\\]"
>> 
>> That's unfortunate.
>> 
>> Is this the only test that is failing?
> 
> it's the only on on Solaris/SPARC and Solaris/x86.  Haven't checked
> other affected targets, though.
> 
>>> I'm seeing this on 32-bit i386-pc-solaris2.11 and sparc-sun-solaris2.11,
>>> with more reports for armv8l, pru, and s390x.
>>> 
>>> Comparing the dumps between 64 and 32-bit, I see
>>> 
>>> -_1: int * [1B, -1B]
>>> +_1: int * [1B, 4294967295B]
>> 
>> I wonder why 32-bit targets at displaying 4294967295 instead of -1.  Or are
>> pointers 64-bits here?
> 
> No, it's a pure 32-bit target.  The compiler is 32-bit, too, but
> bi-arch (32 and 64-bit).

identical result on i686-darwin9, also a pure32b target (with a 64b multilb).
Iain
Jakub Jelinek Oct. 15, 2019, 6:16 p.m. UTC | #13
On Tue, Oct 15, 2019 at 08:35:07AM -0400, Aldy Hernandez wrote:
> > I'm seeing this on 32-bit i386-pc-solaris2.11 and sparc-sun-solaris2.11,
> > with more reports for armv8l, pru, and s390x.
> > 
> > Comparing the dumps between 64 and 32-bit, I see
> > 
> > -_1: int * [1B, -1B]
> > +_1: int * [1B, 4294967295B]
> 
> I wonder why 32-bit targets at displaying 4294967295 instead of -1.  Or are
> pointers 64-bits here?

Because the dump method does:
      if (INTEGRAL_TYPE_P (ttype)
          && vrp_val_is_max (max ())
          && TYPE_PRECISION (ttype) != 1)
        fprintf (file, "+INF");
      else
        print_generic_expr (file, max ());
so for integral types and maximum value, it prints +INF, but not for
pointers.
Perhaps we want to print +INF also for pointers,
      if ((INTEGRAL_TYPE_P (ttype) || POINTER_TYPE_P (ttype))
	  && vrp_val_is_max (max (), true)
          && TYPE_PRECISION (ttype) != 1)
        fprintf (file, "+INF");
      else
        print_generic_expr (file, max ());
but maybe vrp_val_is_{min,max} should be rewritten for pointer types to be
more efficient, don't create trees, for min just use integer_zerop and for
max just compare wide_int?

	Jakub
Aldy Hernandez Oct. 16, 2019, 7:38 a.m. UTC | #14
On 10/15/19 2:16 PM, Jakub Jelinek wrote:
> On Tue, Oct 15, 2019 at 08:35:07AM -0400, Aldy Hernandez wrote:
>>> I'm seeing this on 32-bit i386-pc-solaris2.11 and sparc-sun-solaris2.11,
>>> with more reports for armv8l, pru, and s390x.
>>>
>>> Comparing the dumps between 64 and 32-bit, I see
>>>
>>> -_1: int * [1B, -1B]
>>> +_1: int * [1B, 4294967295B]
>>
>> I wonder why 32-bit targets at displaying 4294967295 instead of -1.  Or are
>> pointers 64-bits here?
> 
> Because the dump method does:
>        if (INTEGRAL_TYPE_P (ttype)
>            && vrp_val_is_max (max ())
>            && TYPE_PRECISION (ttype) != 1)
>          fprintf (file, "+INF");
>        else
>          print_generic_expr (file, max ());
> so for integral types and maximum value, it prints +INF, but not for
> pointers.

Ah, I see.

> Perhaps we want to print +INF also for pointers,
>        if ((INTEGRAL_TYPE_P (ttype) || POINTER_TYPE_P (ttype))
> 	  && vrp_val_is_max (max (), true)
>            && TYPE_PRECISION (ttype) != 1)
>          fprintf (file, "+INF");
>        else
>          print_generic_expr (file, max ());

That sounds reasonable, though I would use supports_type_p() instead of 
open-coding the check for INTEGRAL and POINTER.

Would you take care of this, or shall I?

> but maybe vrp_val_is_{min,max} should be rewritten for pointer types to be
> more efficient, don't create trees, for min just use integer_zerop and for
> max just compare wide_int?

That sounds like a separate issue, but sure.  No complaints.

Note, that I highly dislike the whole handle_pointers=bool argument in 
vrp_*val*, even though I added it.  I think it should default to the 
handle_pointers behavior, though I was unsure what would break, so I 
kept existing behavior gated by a bool (yuck).

Thanks.
Aldy
Jakub Jelinek Oct. 16, 2019, 7:46 a.m. UTC | #15
On Wed, Oct 16, 2019 at 03:38:38AM -0400, Aldy Hernandez wrote:
> Would you take care of this, or shall I?

Will defer to you, I have quite a lot of stuff on my plate ATM.

	Jakub
Aldy Hernandez Oct. 17, 2019, 7:15 a.m. UTC | #16
On 10/16/19 3:46 AM, Jakub Jelinek wrote:
> On Wed, Oct 16, 2019 at 03:38:38AM -0400, Aldy Hernandez wrote:
>> Would you take care of this, or shall I?
> 
> Will defer to you, I have quite a lot of stuff on my plate ATM.
> 
> 	Jakub
> 

No problem.  Thanks for your analysis though.

The attached patch fixes the regression.

OK pending tests?
Jakub Jelinek Oct. 17, 2019, 7:19 a.m. UTC | #17
On Thu, Oct 17, 2019 at 03:15:28AM -0400, Aldy Hernandez wrote:
> On 10/16/19 3:46 AM, Jakub Jelinek wrote:
> > On Wed, Oct 16, 2019 at 03:38:38AM -0400, Aldy Hernandez wrote:
> > > Would you take care of this, or shall I?
> > 
> > Will defer to you, I have quite a lot of stuff on my plate ATM.
> > 
> > 	Jakub
> > 
> 
> No problem.  Thanks for your analysis though.
> 
> The attached patch fixes the regression.
> 
> OK pending tests?

> gcc/
> 
> 	PR tree-optimization/92131
> 	* tree-vrp.c (value_range_base::dump): Display +INF for both
> 	pointers and integers when appropriate.
> 
> gcc/testsuite/
> 
> 	* gcc.dg/tree-ssa/evrp4.c: Check for +INF instead of -1.

LGTM.

	Jakub
diff mbox series

Patch

commit 1c333730deeb4ddadc46ad6d12d5344f92c0352c
Author: Aldy Hernandez <aldyh@redhat.com>
Date:   Fri Oct 4 08:51:25 2019 +0200

    Canonicalize UNSIGNED [1,MAX] into ~[0,0].
    
    Adapt PLUS/MINUS symbolic handling, so it doesn't call
    ranges_from_anti_range with a VR_ANTI_RANGE containing one sub-range.

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 6e4f145af46..3934b41fdf9 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,18 @@ 
+2019-10-04  Aldy Hernandez  <aldyh@redhat.com>
+
+	* tree-vrp.c (value_range_base::singleton_p): Use num_pairs
+	instead of calling vrp_val_is_*.
+	(value_range_base::set): Canonicalize unsigned [1,MAX] into
+	non-zero.
+	(range_has_numeric_bounds_p): New.
+	(range_int_cst_p): Use range_has_numeric_bounds_p.
+	(ranges_from_anti_range): Assert that we won't recurse
+	indefinitely.
+	(extract_extremes_from_range): New.
+	(extract_range_from_plus_minus_expr): Adapt so we don't call
+	ranges_from_anti_range with an anti-range containing only one
+	sub-range.
+
 2019-10-04  Aldy Hernandez  <aldyh@redhat.com>
 
 	(value_range_from_overflowed_bounds): Rename from
diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c
index a2ab4a21925..97dd2b7a734 100644
--- a/gcc/tree-vrp.c
+++ b/gcc/tree-vrp.c
@@ -379,10 +379,7 @@  value_range_base::singleton_p (tree *result) const
 	    }
 	  return false;
 	}
-
-      /* An anti-range that includes an extreme, is just a range with
-	 one sub-range.  Use the one sub-range.  */
-      if (vrp_val_is_min (m_min, true) || vrp_val_is_max (m_max, true))
+      if (num_pairs () == 1)
 	{
 	  value_range_base vr0, vr1;
 	  ranges_from_anti_range (this, &vr0, &vr1, true);
@@ -843,6 +840,17 @@  value_range_base::set (enum value_range_kind kind, tree min, tree max)
       return;
     }
 
+  /* Unsigned [1, MAX] is canonicalized as ~[0, 0].  */
+  if (kind == VR_RANGE && TYPE_UNSIGNED (type)
+      && prec > 1
+      && wi::eq_p (wi::to_wide (min), wi::one (prec))
+      && wi::eq_p (wi::to_wide (max), wi::max_value (prec, sign)))
+    {
+      m_kind = VR_ANTI_RANGE;
+      m_min = m_max = build_int_cst (type, 0);
+      return;
+    }
+
   /* Do not drop [-INF(OVF), +INF(OVF)] to varying.  (OVF) has to be sticky
      to make sure VRP iteration terminates, otherwise we can get into
      oscillations.  */
@@ -913,15 +921,21 @@  vrp_bitmap_equal_p (const_bitmap b1, const_bitmap b2)
 	      && bitmap_equal_p (b1, b2)));
 }
 
+static bool
+range_has_numeric_bounds_p (const value_range_base *vr)
+{
+  return (vr->min ()
+	  && TREE_CODE (vr->min ()) == INTEGER_CST
+	  && TREE_CODE (vr->max ()) == INTEGER_CST);
+}
+
 /* Return true if max and min of VR are INTEGER_CST.  It's not necessary
    a singleton.  */
 
 bool
 range_int_cst_p (const value_range_base *vr)
 {
-  return (vr->kind () == VR_RANGE
-	  && TREE_CODE (vr->min ()) == INTEGER_CST
-	  && TREE_CODE (vr->max ()) == INTEGER_CST);
+  return (vr->kind () == VR_RANGE && range_has_numeric_bounds_p (vr));
 }
 
 /* Return true if VR is a INTEGER_CST singleton.  */
@@ -1327,6 +1341,10 @@  ranges_from_anti_range (const value_range_base *ar,
 			value_range_base *vr0, value_range_base *vr1,
 			bool handle_pointers)
 {
+  /* An unsigned ~[0,0] cannot be split into [1,MAX] because it gets
+     normalized back into ~[0,0].  Avoid infinite loop.  */
+  gcc_checking_assert (!ar->nonzero_p () || !TYPE_UNSIGNED (ar->type ()));
+
   tree type = ar->type ();
 
   vr0->set_undefined ();
@@ -1582,6 +1600,22 @@  extract_range_from_pointer_plus_expr (value_range_base *vr,
     vr->set_varying (expr_type);
 }
 
+static void
+extract_extremes_from_range (const value_range_base *vr, tree *min, tree *max)
+{
+  if (range_has_numeric_bounds_p (vr))
+    {
+      *min = wide_int_to_tree (vr->type (), vr->lower_bound ());
+      *max = wide_int_to_tree (vr->type (), vr->upper_bound ());
+    }
+  else
+    {
+      gcc_checking_assert (vr->kind () != VR_ANTI_RANGE);
+      *min = vr->min ();
+      *max = vr->max ();
+    }
+}
+
 /* Extract range information from a PLUS/MINUS_EXPR and store the
    result in *VR.  */
 
@@ -1597,9 +1631,18 @@  extract_range_from_plus_minus_expr (value_range_base *vr,
   value_range_base vr0 = *vr0_, vr1 = *vr1_;
   value_range_base vrtem0, vrtem1;
 
+  /* We can't do anything with symbolic anti ranges.  */
+  if ((vr0.kind () == VR_ANTI_RANGE && !range_has_numeric_bounds_p (&vr0))
+      || (vr1.kind () == VR_ANTI_RANGE && !range_has_numeric_bounds_p (&vr1)))
+    {
+      vr->set_varying (expr_type);
+      return;
+    }
+
   /* Now canonicalize anti-ranges to ranges when they are not symbolic
      and express ~[] op X as ([]' op X) U ([]'' op X).  */
   if (vr0.kind () == VR_ANTI_RANGE
+      && vr0.num_pairs () == 2
       && ranges_from_anti_range (&vr0, &vrtem0, &vrtem1))
     {
       extract_range_from_plus_minus_expr (vr, code, expr_type, &vrtem0, vr1_);
@@ -1614,6 +1657,7 @@  extract_range_from_plus_minus_expr (value_range_base *vr,
     }
   /* Likewise for X op ~[].  */
   if (vr1.kind () == VR_ANTI_RANGE
+      && vr1.num_pairs () == 2
       && ranges_from_anti_range (&vr1, &vrtem0, &vrtem1))
     {
       extract_range_from_plus_minus_expr (vr, code, expr_type, vr0_, &vrtem0);
@@ -1628,26 +1672,10 @@  extract_range_from_plus_minus_expr (value_range_base *vr,
     }
 
   value_range_kind kind;
-  value_range_kind vr0_kind = vr0.kind (), vr1_kind = vr1.kind ();
-  tree vr0_min = vr0.min (), vr0_max = vr0.max ();
-  tree vr1_min = vr1.min (), vr1_max = vr1.max ();
   tree min = NULL, max = NULL;
-
-  /* This will normalize things such that calculating
-     [0,0] - VR_VARYING is not dropped to varying, but is
-     calculated as [MIN+1, MAX].  */
-  if (vr0.varying_p ())
-    {
-      vr0_kind = VR_RANGE;
-      vr0_min = vrp_val_min (expr_type);
-      vr0_max = vrp_val_max (expr_type);
-    }
-  if (vr1.varying_p ())
-    {
-      vr1_kind = VR_RANGE;
-      vr1_min = vrp_val_min (expr_type);
-      vr1_max = vrp_val_max (expr_type);
-    }
+  tree vr0_min, vr0_max, vr1_min, vr1_max;
+  extract_extremes_from_range (&vr0, &vr0_min, &vr0_max);
+  extract_extremes_from_range (&vr1, &vr1_min, &vr1_max);
 
   const bool minus_p = (code == MINUS_EXPR);
   tree min_op0 = vr0_min;
@@ -1666,10 +1694,9 @@  extract_range_from_plus_minus_expr (value_range_base *vr,
      single-symbolic ranges, try to compute the precise resulting range,
      but only if we know that this resulting range will also be constant
      or single-symbolic.  */
-  if (vr0_kind == VR_RANGE && vr1_kind == VR_RANGE
-      && (TREE_CODE (min_op0) == INTEGER_CST
-	  || (sym_min_op0
-	      = get_single_symbol (min_op0, &neg_min_op0, &min_op0)))
+  if ((TREE_CODE (min_op0) == INTEGER_CST
+       || (sym_min_op0
+	   = get_single_symbol (min_op0, &neg_min_op0, &min_op0)))
       && (TREE_CODE (min_op1) == INTEGER_CST
 	  || (sym_min_op1
 	      = get_single_symbol (min_op1, &neg_min_op1, &min_op1)))
@@ -1724,18 +1751,6 @@  extract_range_from_plus_minus_expr (value_range_base *vr,
     }
   else
     {
-      /* For other cases, for example if we have a PLUS_EXPR with two
-	 VR_ANTI_RANGEs, drop to VR_VARYING.  It would take more effort
-	 to compute a precise range for such a case.
-	 ???  General even mixed range kind operations can be expressed
-	 by for example transforming ~[3, 5] + [1, 2] to range-only
-	 operations and a union primitive:
-	 [-INF, 2] + [1, 2]  U  [5, +INF] + [1, 2]
-	 [-INF+1, 4]     U    [6, +INF(OVF)]
-	 though usually the union is not exactly representable with
-	 a single range or anti-range as the above is
-	 [-INF+1, +INF(OVF)] intersected with ~[5, 5]
-	 but one could use a scheme similar to equivalences for this. */
       vr->set_varying (expr_type);
       return;
     }