VRP: abstract out wide int CONVERT_EXPR_P code
diff mbox series

Message ID ca140cb9-b884-6ff2-3a6e-494fb22cee22@redhat.com
State New
Headers show
Series
  • VRP: abstract out wide int CONVERT_EXPR_P code
Related show

Commit Message

Aldy Hernandez Aug. 27, 2018, 12:24 p.m. UTC
Howdy!

Phew, I think this is the last abstraction.  This handles the unary 
CONVERT_EXPR_P code.

It's the usual story-- normalize the symbolics to [-MIN,+MAX] and handle 
everything generically.

Normalizing the symbolics brought about some nice surprises.  We now 
handle a few things we were punting on before, which I've documented in 
the patch, but can remove if so desired.  I wrote them mainly for myself:

/* NOTES: Previously we were returning VARYING for all symbolics, but
    we can do better by treating them as [-MIN, +MAX].  For
    example, converting [SYM, SYM] from INT to LONG UNSIGNED,
    we can return: ~[0x8000000, 0xffffffff7fffffff].

    We were also failing to convert ~[0,0] from char* to unsigned,
    instead choosing to return VR_VARYING.  Now we return ~[0,0].  */

Tested on x86-64 by the usual bootstrap and regtest gymnastics, 
including --enable-languages=all, because my past sins are still 
haunting me.

OK?

Comments

Richard Biener Aug. 28, 2018, 9:27 a.m. UTC | #1
On Mon, Aug 27, 2018 at 2:24 PM Aldy Hernandez <aldyh@redhat.com> wrote:
>
> Howdy!
>
> Phew, I think this is the last abstraction.  This handles the unary
> CONVERT_EXPR_P code.
>
> It's the usual story-- normalize the symbolics to [-MIN,+MAX] and handle
> everything generically.
>
> Normalizing the symbolics brought about some nice surprises.  We now
> handle a few things we were punting on before, which I've documented in
> the patch, but can remove if so desired.  I wrote them mainly for myself:
>
> /* NOTES: Previously we were returning VARYING for all symbolics, but
>     we can do better by treating them as [-MIN, +MAX].  For
>     example, converting [SYM, SYM] from INT to LONG UNSIGNED,
>     we can return: ~[0x8000000, 0xffffffff7fffffff].
>
>     We were also failing to convert ~[0,0] from char* to unsigned,
>     instead choosing to return VR_VARYING.  Now we return ~[0,0].  */
>
> Tested on x86-64 by the usual bootstrap and regtest gymnastics,
> including --enable-languages=all, because my past sins are still
> haunting me.
>
> OK?

The new wide_int_range_convert_tree looks odd given it returns
tree's.  I'd have expected an API that does the conversion resulting
in a wide_int range and the VRP code adapting to that by converting
the result to trees.

Richard.
Aldy Hernandez Aug. 28, 2018, 9:30 a.m. UTC | #2
On 08/28/2018 05:27 AM, Richard Biener wrote:
> On Mon, Aug 27, 2018 at 2:24 PM Aldy Hernandez <aldyh@redhat.com> wrote:
>>
>> Howdy!
>>
>> Phew, I think this is the last abstraction.  This handles the unary
>> CONVERT_EXPR_P code.
>>
>> It's the usual story-- normalize the symbolics to [-MIN,+MAX] and handle
>> everything generically.
>>
>> Normalizing the symbolics brought about some nice surprises.  We now
>> handle a few things we were punting on before, which I've documented in
>> the patch, but can remove if so desired.  I wrote them mainly for myself:
>>
>> /* NOTES: Previously we were returning VARYING for all symbolics, but
>>      we can do better by treating them as [-MIN, +MAX].  For
>>      example, converting [SYM, SYM] from INT to LONG UNSIGNED,
>>      we can return: ~[0x8000000, 0xffffffff7fffffff].
>>
>>      We were also failing to convert ~[0,0] from char* to unsigned,
>>      instead choosing to return VR_VARYING.  Now we return ~[0,0].  */
>>
>> Tested on x86-64 by the usual bootstrap and regtest gymnastics,
>> including --enable-languages=all, because my past sins are still
>> haunting me.
>>
>> OK?
> 
> The new wide_int_range_convert_tree looks odd given it returns
> tree's.  I'd have expected an API that does the conversion resulting
> in a wide_int range and the VRP code adapting to that by converting
> the result to trees.

Hmmm, yeah.  I agree.  I'll think about this some more and see what I 
can come up with.

Thanks.
Aldy
Bernhard Reutner-Fischer Aug. 29, 2018, 2:12 p.m. UTC | #3
On 27 August 2018 14:24:33 CEST, Aldy Hernandez <aldyh@redhat.com> wrote:
>Howdy!
>
>Phew, I think this is the last abstraction.  This handles the unary 
>CONVERT_EXPR_P code.

+bool
+wide_int_range_convert_tree (tree &tmin, tree &tmax,
+			     tree outer_type,
+			     signop inner_sign,
+			     unsigned inner_prec,
+			     const wide_int &vr0_min,
+			     const wide_int &vr0_max)
+{
+  /* If the conversion is not truncating we can convert the min and
+     max values and canonicalize the resulting range.  Otherwise we
+     can do the conversion if the size of the range is less than what
+     the precision of the target type can represent.  */
+  if (TYPE_PRECISION (outer_type) >= inner_prec
+      || wi::rshift (wi::sub (vr0_max, vr0_min),
+		     wi::uhwi (TYPE_PRECISION (outer_type), inner_prec),
+		     inner_sign) == 0)
+    {
+      signop outer_sign = TYPE_SIGN (outer_type);
+      outer_sign = inner_sign;

AFAIU you don't need outer_sign at all ( since you want inner_sign always ) or what subtlety  am I missing?

Thanks,

+      tmin = force_fit_type (outer_type,
+			     widest_int::from (vr0_min, outer_sign), 0, false);
+      tmax = force_fit_type (outer_type,
+			     widest_int::from (vr0_max, outer_sign), 0, false);
+      return (!operand_equal_p (tmin, TYPE_MIN_VALUE (outer_type), 0)
+	      || !operand_equal_p (tmax, TYPE_MAX_VALUE (outer_type), 0));
+    }
+  return false;
+}
Aldy Hernandez Sept. 3, 2018, 11:32 a.m. UTC | #4
On 08/28/2018 05:27 AM, Richard Biener wrote:
> On Mon, Aug 27, 2018 at 2:24 PM Aldy Hernandez <aldyh@redhat.com> wrote:
>>
>> Howdy!
>>
>> Phew, I think this is the last abstraction.  This handles the unary
>> CONVERT_EXPR_P code.
>>
>> It's the usual story-- normalize the symbolics to [-MIN,+MAX] and handle
>> everything generically.
>>
>> Normalizing the symbolics brought about some nice surprises.  We now
>> handle a few things we were punting on before, which I've documented in
>> the patch, but can remove if so desired.  I wrote them mainly for myself:
>>
>> /* NOTES: Previously we were returning VARYING for all symbolics, but
>>      we can do better by treating them as [-MIN, +MAX].  For
>>      example, converting [SYM, SYM] from INT to LONG UNSIGNED,
>>      we can return: ~[0x8000000, 0xffffffff7fffffff].
>>
>>      We were also failing to convert ~[0,0] from char* to unsigned,
>>      instead choosing to return VR_VARYING.  Now we return ~[0,0].  */
>>
>> Tested on x86-64 by the usual bootstrap and regtest gymnastics,
>> including --enable-languages=all, because my past sins are still
>> haunting me.
>>
>> OK?
> 
> The new wide_int_range_convert_tree looks odd given it returns
> tree's.  I'd have expected an API that does the conversion resulting
> in a wide_int range and the VRP code adapting to that by converting
> the result to trees.

Rewritten as suggested.

A few notes.

First.  I am not using widest_int as was done previously.  We were 
passing 0/false to the overflow args in force_fit_type so the call was 
just degrading into wide_int_to_tree() anyhow.  Am I missing some 
subtlety here, and must we use widest_int and then force_fit_type on the 
caller?

Second.  I need extract_range_into_wide_ints to work with anti ranges of 
constants.  It seems like only the unary handling of CONVERT_EXPR here 
uses it that way, so I believe it should go into one patch.  If you 
believe strongly otherwise, I could split out the 4 lines into a 
separate patch, but I'd prefer not to.

Finally, I could use vr0_min.get_precision() instead of inner_precision, 
but I think it's more regular and readable the way I have it.

Tested on all languages on x86-64 Linux.

OK for trunk?
gcc/

	* wide-int-range.cc (wide_int_range_convert): New.
	* wide-int-range.h (wide_int_range_convert): New.
	* tree-vrp.c (extract_range_from_unary_expr): Abstract wide int
	code into wide_int_range_convert.
	(extract_range_into_wide_ints): Do not munge anti range constants
	into the entire domain.  Just return the range back.

diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c
index f20730a85ba..a78ed45c713 100644
--- a/gcc/tree-vrp.c
+++ b/gcc/tree-vrp.c
@@ -995,7 +995,7 @@ ranges_from_anti_range (value_range *ar,
 /* Extract the components of a value range into a pair of wide ints in
    [WMIN, WMAX].
 
-   If the value range is anything but a VR_RANGE of constants, the
+   If the value range is anything but a VR_*RANGE of constants, the
    resulting wide ints are set to [-MIN, +MAX] for the type.  */
 
 static void inline
@@ -1003,7 +1003,10 @@ extract_range_into_wide_ints (value_range *vr,
 			      signop sign, unsigned prec,
 			      wide_int &wmin, wide_int &wmax)
 {
-  if (range_int_cst_p (vr))
+  if ((vr->type == VR_RANGE
+       || vr->type == VR_ANTI_RANGE)
+      && TREE_CODE (vr->min) == INTEGER_CST
+      && TREE_CODE (vr->max) == INTEGER_CST)
     {
       wmin = wi::to_wide (vr->min);
       wmax = wi::to_wide (vr->max);
@@ -1894,44 +1897,41 @@ extract_range_from_unary_expr (value_range *vr,
 	  return;
 	}
 
-      /* If VR0 is varying and we increase the type precision, assume
-	 a full range for the following transformation.  */
-      if (vr0.type == VR_VARYING
-	  && INTEGRAL_TYPE_P (inner_type)
-	  && TYPE_PRECISION (inner_type) < TYPE_PRECISION (outer_type))
-	{
-	  vr0.type = VR_RANGE;
-	  vr0.min = TYPE_MIN_VALUE (inner_type);
-	  vr0.max = TYPE_MAX_VALUE (inner_type);
-	}
-
-      /* If VR0 is a constant range or anti-range and the conversion is
-	 not truncating we can convert the min and max values and
-	 canonicalize the resulting range.  Otherwise we can do the
-	 conversion if the size of the range is less than what the
-	 precision of the target type can represent and the range is
-	 not an anti-range.  */
-      if ((vr0.type == VR_RANGE
-	   || vr0.type == VR_ANTI_RANGE)
+      /* We normalize everything to a VR_RANGE, but for constant
+	 anti-ranges we must handle them by leaving the final result
+	 as an anti range.  This allows us to convert things like
+	 ~[0,5] seamlessly.  */
+      value_range_type vr_type = VR_RANGE;
+      if (vr0.type == VR_ANTI_RANGE
 	  && TREE_CODE (vr0.min) == INTEGER_CST
-	  && TREE_CODE (vr0.max) == INTEGER_CST
-	  && (TYPE_PRECISION (outer_type) >= TYPE_PRECISION (inner_type)
-	      || (vr0.type == VR_RANGE
-		  && integer_zerop (int_const_binop (RSHIFT_EXPR,
-		       int_const_binop (MINUS_EXPR, vr0.max, vr0.min),
-		         size_int (TYPE_PRECISION (outer_type)))))))
+	  && TREE_CODE (vr0.max) == INTEGER_CST)
+	vr_type = VR_ANTI_RANGE;
+
+      /* NOTES: Previously we were returning VARYING for all symbolics, but
+	 we can do better by treating them as [-MIN, +MAX].  For
+	 example, converting [SYM, SYM] from INT to LONG UNSIGNED,
+	 we can return: ~[0x8000000, 0xffffffff7fffffff].
+
+	 We were also failing to convert ~[0,0] from char* to unsigned,
+	 instead choosing to return VR_VARYING.  Now we return ~[0,0].  */
+      wide_int vr0_min, vr0_max, wmin, wmax;
+      signop inner_sign = TYPE_SIGN (inner_type);
+      signop outer_sign = TYPE_SIGN (outer_type);
+      unsigned inner_prec = TYPE_PRECISION (inner_type);
+      unsigned outer_prec = TYPE_PRECISION (outer_type);
+      extract_range_into_wide_ints (&vr0, inner_sign, inner_prec,
+				    vr0_min, vr0_max);
+      if (wide_int_range_convert (wmin, wmax,
+				  inner_sign, inner_prec,
+				  outer_sign, outer_prec,
+				  vr0_min, vr0_max))
 	{
-	  tree new_min, new_max;
-	  new_min = force_fit_type (outer_type, wi::to_widest (vr0.min),
-				    0, false);
-	  new_max = force_fit_type (outer_type, wi::to_widest (vr0.max),
-				    0, false);
-	  set_and_canonicalize_value_range (vr, vr0.type,
-					    new_min, new_max, NULL);
-	  return;
+	  tree min = wide_int_to_tree (outer_type, wmin);
+	  tree max = wide_int_to_tree (outer_type, wmax);
+	  set_and_canonicalize_value_range (vr, vr_type, min, max, NULL);
 	}
-
-      set_value_range_to_varying (vr);
+      else
+	set_value_range_to_varying (vr);
       return;
     }
   else if (code == ABS_EXPR)
diff --git a/gcc/wide-int-range.cc b/gcc/wide-int-range.cc
index 3cdcede04cd..5f257966819 100644
--- a/gcc/wide-int-range.cc
+++ b/gcc/wide-int-range.cc
@@ -665,6 +665,39 @@ wide_int_range_abs (wide_int &min, wide_int &max,
   return true;
 }
 
+/* Convert range in [VR0_MIN, VR0_MAX] with INNER_SIGN and INNER_PREC,
+   to a range in [MIN, MAX] with OUTER_SIGN and OUTER_PREC.
+
+   Return TRUE if we were able to successfully calculate the new range.
+
+   Caller is responsible for canonicalizing the resulting range.  */
+
+bool
+wide_int_range_convert (wide_int &min, wide_int &max,
+			signop inner_sign,
+			unsigned inner_prec,
+			signop outer_sign,
+			unsigned outer_prec,
+			const wide_int &vr0_min,
+			const wide_int &vr0_max)
+{
+  /* If the conversion is not truncating we can convert the min and
+     max values and canonicalize the resulting range.  Otherwise we
+     can do the conversion if the size of the range is less than what
+     the precision of the target type can represent.  */
+  if (outer_prec >= inner_prec
+      || wi::rshift (wi::sub (vr0_max, vr0_min),
+		     wi::uhwi (outer_prec, inner_prec),
+		     inner_sign) == 0)
+    {
+      min = wide_int::from (vr0_min, outer_prec, inner_sign);
+      max = wide_int::from (vr0_max, outer_prec, inner_sign);
+      return (!wi::eq_p (min, wi::min_value (outer_prec, outer_sign))
+	      || !wi::eq_p (max, wi::max_value (outer_prec, outer_sign)));
+    }
+  return false;
+}
+
 /* Calculate a division operation on two ranges and store the result in
    [WMIN, WMAX] U [EXTRA_MIN, EXTRA_MAX].
 
diff --git a/gcc/wide-int-range.h b/gcc/wide-int-range.h
index 427ef34c6b4..83574b7cac1 100644
--- a/gcc/wide-int-range.h
+++ b/gcc/wide-int-range.h
@@ -99,6 +99,13 @@ extern bool wide_int_range_abs (wide_int &min, wide_int &max,
 				const wide_int &vr0_min,
 				const wide_int &vr0_max,
 				bool overflow_undefined);
+extern bool wide_int_range_convert (wide_int &min, wide_int &max,
+				    signop inner_sign,
+				    unsigned inner_prec,
+				    signop outer_sign,
+				    unsigned outer_prec,
+				    const wide_int &vr0_min,
+				    const wide_int &vr0_max);
 extern bool wide_int_range_div (wide_int &wmin, wide_int &wmax,
 				enum tree_code code,
 				signop sign, unsigned prec,
Richard Biener Sept. 4, 2018, 11:58 a.m. UTC | #5
On Mon, Sep 3, 2018 at 1:32 PM Aldy Hernandez <aldyh@redhat.com> wrote:
>
>
>
> On 08/28/2018 05:27 AM, Richard Biener wrote:
> > On Mon, Aug 27, 2018 at 2:24 PM Aldy Hernandez <aldyh@redhat.com> wrote:
> >>
> >> Howdy!
> >>
> >> Phew, I think this is the last abstraction.  This handles the unary
> >> CONVERT_EXPR_P code.
> >>
> >> It's the usual story-- normalize the symbolics to [-MIN,+MAX] and handle
> >> everything generically.
> >>
> >> Normalizing the symbolics brought about some nice surprises.  We now
> >> handle a few things we were punting on before, which I've documented in
> >> the patch, but can remove if so desired.  I wrote them mainly for myself:
> >>
> >> /* NOTES: Previously we were returning VARYING for all symbolics, but
> >>      we can do better by treating them as [-MIN, +MAX].  For
> >>      example, converting [SYM, SYM] from INT to LONG UNSIGNED,
> >>      we can return: ~[0x8000000, 0xffffffff7fffffff].
> >>
> >>      We were also failing to convert ~[0,0] from char* to unsigned,
> >>      instead choosing to return VR_VARYING.  Now we return ~[0,0].  */
> >>
> >> Tested on x86-64 by the usual bootstrap and regtest gymnastics,
> >> including --enable-languages=all, because my past sins are still
> >> haunting me.
> >>
> >> OK?
> >
> > The new wide_int_range_convert_tree looks odd given it returns
> > tree's.  I'd have expected an API that does the conversion resulting
> > in a wide_int range and the VRP code adapting to that by converting
> > the result to trees.
>
> Rewritten as suggested.
>
> A few notes.
>
> First.  I am not using widest_int as was done previously.  We were
> passing 0/false to the overflow args in force_fit_type so the call was
> just degrading into wide_int_to_tree() anyhow.  Am I missing some
> subtlety here, and must we use widest_int and then force_fit_type on the
> caller?

The issue is that the wide-ints get "converted", so it's indeed subtle.

+      || wi::rshift (wi::sub (vr0_max, vr0_min),
+                    wi::uhwi (outer_prec, inner_prec),
+                    inner_sign) == 0)

this was previously allowed only for VR_RANGEs - now it's also for
VR_ANTI_RANGE.
That looks incorrect.  Testcase welcome ;)

+      return (!wi::eq_p (min, wi::min_value (outer_prec, outer_sign))
+             || !wi::eq_p (max, wi::max_value (outer_prec, outer_sign)));

so you return false for VARYING?  Why?  I'd simply return true and
return VARYING in the new type in the path you return false.

Richard.



>
> Second.  I need extract_range_into_wide_ints to work with anti ranges of
> constants.  It seems like only the unary handling of CONVERT_EXPR here
> uses it that way, so I believe it should go into one patch.  If you
> believe strongly otherwise, I could split out the 4 lines into a
> separate patch, but I'd prefer not to.
>
> Finally, I could use vr0_min.get_precision() instead of inner_precision,
> but I think it's more regular and readable the way I have it.
>
> Tested on all languages on x86-64 Linux.
>
> OK for trunk?
Aldy Hernandez Sept. 4, 2018, 12:41 p.m. UTC | #6
On 09/04/2018 07:58 AM, Richard Biener wrote:
> On Mon, Sep 3, 2018 at 1:32 PM Aldy Hernandez <aldyh@redhat.com> wrote:
>>
>>
>>
>> On 08/28/2018 05:27 AM, Richard Biener wrote:
>>> On Mon, Aug 27, 2018 at 2:24 PM Aldy Hernandez <aldyh@redhat.com> wrote:
>>>>
>>>> Howdy!
>>>>
>>>> Phew, I think this is the last abstraction.  This handles the unary
>>>> CONVERT_EXPR_P code.
>>>>
>>>> It's the usual story-- normalize the symbolics to [-MIN,+MAX] and handle
>>>> everything generically.
>>>>
>>>> Normalizing the symbolics brought about some nice surprises.  We now
>>>> handle a few things we were punting on before, which I've documented in
>>>> the patch, but can remove if so desired.  I wrote them mainly for myself:
>>>>
>>>> /* NOTES: Previously we were returning VARYING for all symbolics, but
>>>>       we can do better by treating them as [-MIN, +MAX].  For
>>>>       example, converting [SYM, SYM] from INT to LONG UNSIGNED,
>>>>       we can return: ~[0x8000000, 0xffffffff7fffffff].
>>>>
>>>>       We were also failing to convert ~[0,0] from char* to unsigned,
>>>>       instead choosing to return VR_VARYING.  Now we return ~[0,0].  */
>>>>
>>>> Tested on x86-64 by the usual bootstrap and regtest gymnastics,
>>>> including --enable-languages=all, because my past sins are still
>>>> haunting me.
>>>>
>>>> OK?
>>>
>>> The new wide_int_range_convert_tree looks odd given it returns
>>> tree's.  I'd have expected an API that does the conversion resulting
>>> in a wide_int range and the VRP code adapting to that by converting
>>> the result to trees.
>>
>> Rewritten as suggested.
>>
>> A few notes.
>>
>> First.  I am not using widest_int as was done previously.  We were
>> passing 0/false to the overflow args in force_fit_type so the call was
>> just degrading into wide_int_to_tree() anyhow.  Am I missing some
>> subtlety here, and must we use widest_int and then force_fit_type on the
>> caller?
> 
> The issue is that the wide-ints get "converted", so it's indeed subtle.

This seems like it should be documented-- at the very least in 
wide_int_range_convert.

What do you mean "converted"?  I'd like to understand this better before 
I write out the comment :).  Do we lose precision by keeping it in a 
wide_int as opposed to a widest_int?

Also, can we have the caller to wide_int_range_convert() use 
wide_int_to_tree directly instead of passing through force_fit_type?  It 
seems force_fit_type with overflowable=0 and overflowed=false is just a 
call to wide_int_to_tree anyhow.

> 
> +      || wi::rshift (wi::sub (vr0_max, vr0_min),
> +                    wi::uhwi (outer_prec, inner_prec),
> +                    inner_sign) == 0)
> 
> this was previously allowed only for VR_RANGEs - now it's also for
> VR_ANTI_RANGE.
> That looks incorrect.  Testcase welcome ;)

See change to extract_range_into_wide_ints.  VR_ANTI_RANGEs of constants 
will remain as is, whereas VR_ANTI_RANGEs of symbols will degrade into 
[-MIN,+MAX] and be handled generically.

Also, see comment:

+      /* We normalize everything to a VR_RANGE, but for constant
+        anti-ranges we must handle them by leaving the final result
+        as an anti range.  This allows us to convert things like
+        ~[0,5] seamlessly.  */

> 
> +      return (!wi::eq_p (min, wi::min_value (outer_prec, outer_sign))
> +             || !wi::eq_p (max, wi::max_value (outer_prec, outer_sign)));
> 
> so you return false for VARYING?  Why?  I'd simply return true and
> return VARYING in the new type in the path you return false.

Since symbols and VR_VARYING and other things get turned into a 
[-MIN,+MAX] by extract_range_into_wide_ints, it's a lot easier to handle 
things generically and return varying when the range spans the entire 
domain.  It also keeps with the rest of the wide_int_range_* functions 
that return false when we know it's going to be VR_VARYING.

Aldy

> 
> Richard.
> 
> 
> 
>>
>> Second.  I need extract_range_into_wide_ints to work with anti ranges of
>> constants.  It seems like only the unary handling of CONVERT_EXPR here
>> uses it that way, so I believe it should go into one patch.  If you
>> believe strongly otherwise, I could split out the 4 lines into a
>> separate patch, but I'd prefer not to.
>>
>> Finally, I could use vr0_min.get_precision() instead of inner_precision,
>> but I think it's more regular and readable the way I have it.
>>
>> Tested on all languages on x86-64 Linux.
>>
>> OK for trunk?
Richard Biener Sept. 4, 2018, 12:49 p.m. UTC | #7
On Tue, Sep 4, 2018 at 2:41 PM Aldy Hernandez <aldyh@redhat.com> wrote:
>
>
>
> On 09/04/2018 07:58 AM, Richard Biener wrote:
> > On Mon, Sep 3, 2018 at 1:32 PM Aldy Hernandez <aldyh@redhat.com> wrote:
> >>
> >>
> >>
> >> On 08/28/2018 05:27 AM, Richard Biener wrote:
> >>> On Mon, Aug 27, 2018 at 2:24 PM Aldy Hernandez <aldyh@redhat.com> wrote:
> >>>>
> >>>> Howdy!
> >>>>
> >>>> Phew, I think this is the last abstraction.  This handles the unary
> >>>> CONVERT_EXPR_P code.
> >>>>
> >>>> It's the usual story-- normalize the symbolics to [-MIN,+MAX] and handle
> >>>> everything generically.
> >>>>
> >>>> Normalizing the symbolics brought about some nice surprises.  We now
> >>>> handle a few things we were punting on before, which I've documented in
> >>>> the patch, but can remove if so desired.  I wrote them mainly for myself:
> >>>>
> >>>> /* NOTES: Previously we were returning VARYING for all symbolics, but
> >>>>       we can do better by treating them as [-MIN, +MAX].  For
> >>>>       example, converting [SYM, SYM] from INT to LONG UNSIGNED,
> >>>>       we can return: ~[0x8000000, 0xffffffff7fffffff].
> >>>>
> >>>>       We were also failing to convert ~[0,0] from char* to unsigned,
> >>>>       instead choosing to return VR_VARYING.  Now we return ~[0,0].  */
> >>>>
> >>>> Tested on x86-64 by the usual bootstrap and regtest gymnastics,
> >>>> including --enable-languages=all, because my past sins are still
> >>>> haunting me.
> >>>>
> >>>> OK?
> >>>
> >>> The new wide_int_range_convert_tree looks odd given it returns
> >>> tree's.  I'd have expected an API that does the conversion resulting
> >>> in a wide_int range and the VRP code adapting to that by converting
> >>> the result to trees.
> >>
> >> Rewritten as suggested.
> >>
> >> A few notes.
> >>
> >> First.  I am not using widest_int as was done previously.  We were
> >> passing 0/false to the overflow args in force_fit_type so the call was
> >> just degrading into wide_int_to_tree() anyhow.  Am I missing some
> >> subtlety here, and must we use widest_int and then force_fit_type on the
> >> caller?
> >
> > The issue is that the wide-ints get "converted", so it's indeed subtle.
>
> This seems like it should be documented-- at the very least in
> wide_int_range_convert.

I don't think you need it there.

> What do you mean "converted"?  I'd like to understand this better before
> I write out the comment :).  Do we lose precision by keeping it in a
> wide_int as opposed to a widest_int?

As you're using wide_int::from that "changes" precision now.

> Also, can we have the caller to wide_int_range_convert() use
> wide_int_to_tree directly instead of passing through force_fit_type?

I think so.

> It
> seems force_fit_type with overflowable=0 and overflowed=false is just a
> call to wide_int_to_tree anyhow.
>
> >
> > +      || wi::rshift (wi::sub (vr0_max, vr0_min),
> > +                    wi::uhwi (outer_prec, inner_prec),
> > +                    inner_sign) == 0)
> >
> > this was previously allowed only for VR_RANGEs - now it's also for
> > VR_ANTI_RANGE.
> > That looks incorrect.  Testcase welcome ;)
>
> See change to extract_range_into_wide_ints.  VR_ANTI_RANGEs of constants
> will remain as is, whereas VR_ANTI_RANGEs of symbols will degrade into
> [-MIN,+MAX] and be handled generically.
>
> Also, see comment:
>
> +      /* We normalize everything to a VR_RANGE, but for constant
> +        anti-ranges we must handle them by leaving the final result
> +        as an anti range.  This allows us to convert things like
> +        ~[0,5] seamlessly.  */

Yes, but for truncation of ~[0, 5] from say int to short it's not correct
to make the result ~[0, 5], is it?  At least the original code dropped
that to VARYING.  For the same reason truncating [3, 765] from
short to unsigned char isn't [3, 253].  But maybe I'm missing something.

> >
> > +      return (!wi::eq_p (min, wi::min_value (outer_prec, outer_sign))
> > +             || !wi::eq_p (max, wi::max_value (outer_prec, outer_sign)));
> >
> > so you return false for VARYING?  Why?  I'd simply return true and
> > return VARYING in the new type in the path you return false.
>
> Since symbols and VR_VARYING and other things get turned into a
> [-MIN,+MAX] by extract_range_into_wide_ints, it's a lot easier to handle
> things generically and return varying when the range spans the entire
> domain.  It also keeps with the rest of the wide_int_range_* functions
> that return false when we know it's going to be VR_VARYING.

Ah, OK, didn't know they did that.  Not sure if that's convenient though
given VR_VARYING has no representation in the wide-int-range "range"
so callers chaining ops would need to do sth like

if (!wide_int_range_* (...., &out_min, &out_max))
  {
    out_min = wide_int::min (prec);
    out_max = wide_int::max (prec);
  }
if (!wide_int_range_* (out_min, out_max, ....))
...

to not lose cases where VARYING inputs produce non-VARYING results?

Richard.

> Aldy
>
> >
> > Richard.
> >
> >
> >
> >>
> >> Second.  I need extract_range_into_wide_ints to work with anti ranges of
> >> constants.  It seems like only the unary handling of CONVERT_EXPR here
> >> uses it that way, so I believe it should go into one patch.  If you
> >> believe strongly otherwise, I could split out the 4 lines into a
> >> separate patch, but I'd prefer not to.
> >>
> >> Finally, I could use vr0_min.get_precision() instead of inner_precision,
> >> but I think it's more regular and readable the way I have it.
> >>
> >> Tested on all languages on x86-64 Linux.
> >>
> >> OK for trunk?
Aldy Hernandez Sept. 4, 2018, 2:12 p.m. UTC | #8
On 09/04/2018 08:49 AM, Richard Biener wrote:
> On Tue, Sep 4, 2018 at 2:41 PM Aldy Hernandez <aldyh@redhat.com> wrote:
>>
>>
>>
>> On 09/04/2018 07:58 AM, Richard Biener wrote:
>>> On Mon, Sep 3, 2018 at 1:32 PM Aldy Hernandez <aldyh@redhat.com> wrote:
>>>>
>>>>
>>>>
>>>> On 08/28/2018 05:27 AM, Richard Biener wrote:
>>>>> On Mon, Aug 27, 2018 at 2:24 PM Aldy Hernandez <aldyh@redhat.com> wrote:
>>>>>>
>>>>>> Howdy!
>>>>>>
>>>>>> Phew, I think this is the last abstraction.  This handles the unary
>>>>>> CONVERT_EXPR_P code.
>>>>>>
>>>>>> It's the usual story-- normalize the symbolics to [-MIN,+MAX] and handle
>>>>>> everything generically.
>>>>>>
>>>>>> Normalizing the symbolics brought about some nice surprises.  We now
>>>>>> handle a few things we were punting on before, which I've documented in
>>>>>> the patch, but can remove if so desired.  I wrote them mainly for myself:
>>>>>>
>>>>>> /* NOTES: Previously we were returning VARYING for all symbolics, but
>>>>>>        we can do better by treating them as [-MIN, +MAX].  For
>>>>>>        example, converting [SYM, SYM] from INT to LONG UNSIGNED,
>>>>>>        we can return: ~[0x8000000, 0xffffffff7fffffff].
>>>>>>
>>>>>>        We were also failing to convert ~[0,0] from char* to unsigned,
>>>>>>        instead choosing to return VR_VARYING.  Now we return ~[0,0].  */
>>>>>>
>>>>>> Tested on x86-64 by the usual bootstrap and regtest gymnastics,
>>>>>> including --enable-languages=all, because my past sins are still
>>>>>> haunting me.
>>>>>>
>>>>>> OK?
>>>>>
>>>>> The new wide_int_range_convert_tree looks odd given it returns
>>>>> tree's.  I'd have expected an API that does the conversion resulting
>>>>> in a wide_int range and the VRP code adapting to that by converting
>>>>> the result to trees.
>>>>
>>>> Rewritten as suggested.
>>>>
>>>> A few notes.
>>>>
>>>> First.  I am not using widest_int as was done previously.  We were
>>>> passing 0/false to the overflow args in force_fit_type so the call was
>>>> just degrading into wide_int_to_tree() anyhow.  Am I missing some
>>>> subtlety here, and must we use widest_int and then force_fit_type on the
>>>> caller?
>>>
>>> The issue is that the wide-ints get "converted", so it's indeed subtle.
>>
>> This seems like it should be documented-- at the very least in
>> wide_int_range_convert.
> 
> I don't think you need it there.
> 
>> What do you mean "converted"?  I'd like to understand this better before
>> I write out the comment :).  Do we lose precision by keeping it in a
>> wide_int as opposed to a widest_int?
> 
> As you're using wide_int::from that "changes" precision now.

Ah, so it's wide_int_to_tree that wants to see the widest precision.  I see.

> 
>> Also, can we have the caller to wide_int_range_convert() use
>> wide_int_to_tree directly instead of passing through force_fit_type?
> 
> I think so.
> 
>> It
>> seems force_fit_type with overflowable=0 and overflowed=false is just a
>> call to wide_int_to_tree anyhow.
>>
>>>
>>> +      || wi::rshift (wi::sub (vr0_max, vr0_min),
>>> +                    wi::uhwi (outer_prec, inner_prec),
>>> +                    inner_sign) == 0)
>>>
>>> this was previously allowed only for VR_RANGEs - now it's also for
>>> VR_ANTI_RANGE.
>>> That looks incorrect.  Testcase welcome ;)
>>
>> See change to extract_range_into_wide_ints.  VR_ANTI_RANGEs of constants
>> will remain as is, whereas VR_ANTI_RANGEs of symbols will degrade into
>> [-MIN,+MAX] and be handled generically.
>>
>> Also, see comment:
>>
>> +      /* We normalize everything to a VR_RANGE, but for constant
>> +        anti-ranges we must handle them by leaving the final result
>> +        as an anti range.  This allows us to convert things like
>> +        ~[0,5] seamlessly.  */
> 
> Yes, but for truncation of ~[0, 5] from say int to short it's not correct
> to make the result ~[0, 5], is it?  At least the original code dropped
> that to VARYING.  For the same reason truncating [3, 765] from
> short to unsigned char isn't [3, 253].  But maybe I'm missing something.

Correct, but in that case we will realize that in wide_int_range_convert 
and refuse to do the conversion:

   /* If the conversion is not truncating we can convert the min and
      max values and canonicalize the resulting range.  Otherwise we
      can do the conversion if the size of the range is less than what
      the precision of the target type can represent.  */
   if (outer_prec >= inner_prec
       || wi::rshift (wi::sub (vr0_max, vr0_min),
		     wi::uhwi (outer_prec, inner_prec),
		     inner_sign) == 0)

> 
>>>
>>> +      return (!wi::eq_p (min, wi::min_value (outer_prec, outer_sign))
>>> +             || !wi::eq_p (max, wi::max_value (outer_prec, outer_sign)));
>>>
>>> so you return false for VARYING?  Why?  I'd simply return true and
>>> return VARYING in the new type in the path you return false.
>>
>> Since symbols and VR_VARYING and other things get turned into a
>> [-MIN,+MAX] by extract_range_into_wide_ints, it's a lot easier to handle
>> things generically and return varying when the range spans the entire
>> domain.  It also keeps with the rest of the wide_int_range_* functions
>> that return false when we know it's going to be VR_VARYING.
> 
> Ah, OK, didn't know they did that.  Not sure if that's convenient though
> given VR_VARYING has no representation in the wide-int-range "range"
> so callers chaining ops would need to do sth like
> 
> if (!wide_int_range_* (...., &out_min, &out_max))
>    {
>      out_min = wide_int::min (prec);
>      out_max = wide_int::max (prec);
>    }
> if (!wide_int_range_* (out_min, out_max, ....))
> ...
> 
> to not lose cases where VARYING inputs produce non-VARYING results?

Well in the ssa-range branch we treat [-MIN,+MAX] as varying.  We call 
it range_for_type().

Also, the above if(!wide_int_range*) chaining is not necessary in the 
ssa-range work.  See op_wi() in range-op.c.  We just return false and 
the caller loop takes care of things.

(Andrew's forte is not readable function names ;-).  When I have my way 
with range-op.c, this op_RANDOM_TWO_LETTERS nonsense will go away.)

OK for trunk?
Aldy
gcc/

	* wide-int-range.cc (wide_int_range_convert): New.
	* wide-int-range.h (wide_int_range_convert): New.
	* tree-vrp.c (extract_range_from_unary_expr): Abstract wide int
	code into wide_int_range_convert.
	(extract_range_into_wide_ints): Do not munge anti range constants
	into the entire domain.  Just return the range back.

diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c
index d29a2c9b507..2a3797e7f24 100644
--- a/gcc/tree-vrp.c
+++ b/gcc/tree-vrp.c
@@ -995,7 +995,7 @@ ranges_from_anti_range (const value_range *ar,
 /* Extract the components of a value range into a pair of wide ints in
    [WMIN, WMAX].
 
-   If the value range is anything but a VR_RANGE of constants, the
+   If the value range is anything but a VR_*RANGE of constants, the
    resulting wide ints are set to [-MIN, +MAX] for the type.  */
 
 static void inline
@@ -1003,7 +1003,10 @@ extract_range_into_wide_ints (const value_range *vr,
 			      signop sign, unsigned prec,
 			      wide_int &wmin, wide_int &wmax)
 {
-  if (range_int_cst_p (vr))
+  if ((vr->type == VR_RANGE
+       || vr->type == VR_ANTI_RANGE)
+      && TREE_CODE (vr->min) == INTEGER_CST
+      && TREE_CODE (vr->max) == INTEGER_CST)
     {
       wmin = wi::to_wide (vr->min);
       wmax = wi::to_wide (vr->max);
@@ -1850,44 +1853,42 @@ extract_range_from_unary_expr (value_range *vr,
 	  return;
 	}
 
-      /* If VR0 is varying and we increase the type precision, assume
-	 a full range for the following transformation.  */
-      if (vr0.type == VR_VARYING
-	  && INTEGRAL_TYPE_P (inner_type)
-	  && TYPE_PRECISION (inner_type) < TYPE_PRECISION (outer_type))
-	{
-	  vr0.type = VR_RANGE;
-	  vr0.min = TYPE_MIN_VALUE (inner_type);
-	  vr0.max = TYPE_MAX_VALUE (inner_type);
-	}
-
-      /* If VR0 is a constant range or anti-range and the conversion is
-	 not truncating we can convert the min and max values and
-	 canonicalize the resulting range.  Otherwise we can do the
-	 conversion if the size of the range is less than what the
-	 precision of the target type can represent and the range is
-	 not an anti-range.  */
-      if ((vr0.type == VR_RANGE
-	   || vr0.type == VR_ANTI_RANGE)
+      /* We normalize everything to a VR_RANGE, but for constant
+	 anti-ranges we must handle them by leaving the final result
+	 as an anti range.  This allows us to convert things like
+	 ~[0,5] seamlessly.  */
+      value_range_type vr_type = VR_RANGE;
+      if (vr0.type == VR_ANTI_RANGE
 	  && TREE_CODE (vr0.min) == INTEGER_CST
-	  && TREE_CODE (vr0.max) == INTEGER_CST
-	  && (TYPE_PRECISION (outer_type) >= TYPE_PRECISION (inner_type)
-	      || (vr0.type == VR_RANGE
-		  && integer_zerop (int_const_binop (RSHIFT_EXPR,
-		       int_const_binop (MINUS_EXPR, vr0.max, vr0.min),
-		         size_int (TYPE_PRECISION (outer_type)))))))
+	  && TREE_CODE (vr0.max) == INTEGER_CST)
+	vr_type = VR_ANTI_RANGE;
+
+      /* NOTES: Previously we were returning VARYING for all symbolics, but
+	 we can do better by treating them as [-MIN, +MAX].  For
+	 example, converting [SYM, SYM] from INT to LONG UNSIGNED,
+	 we can return: ~[0x8000000, 0xffffffff7fffffff].
+
+	 We were also failing to convert ~[0,0] from char* to unsigned,
+	 instead choosing to return VR_VARYING.  Now we return ~[0,0].  */
+      wide_int vr0_min, vr0_max;
+      widest_int wmin, wmax;
+      signop inner_sign = TYPE_SIGN (inner_type);
+      signop outer_sign = TYPE_SIGN (outer_type);
+      unsigned inner_prec = TYPE_PRECISION (inner_type);
+      unsigned outer_prec = TYPE_PRECISION (outer_type);
+      extract_range_into_wide_ints (&vr0, inner_sign, inner_prec,
+				    vr0_min, vr0_max);
+      if (wide_int_range_convert (wmin, wmax,
+				  inner_sign, inner_prec,
+				  outer_sign, outer_prec,
+				  vr0_min, vr0_max))
 	{
-	  tree new_min, new_max;
-	  new_min = force_fit_type (outer_type, wi::to_widest (vr0.min),
-				    0, false);
-	  new_max = force_fit_type (outer_type, wi::to_widest (vr0.max),
-				    0, false);
-	  set_and_canonicalize_value_range (vr, vr0.type,
-					    new_min, new_max, NULL);
-	  return;
+	  tree min = wide_int_to_tree (outer_type, wmin);
+	  tree max = wide_int_to_tree (outer_type, wmax);
+	  set_and_canonicalize_value_range (vr, vr_type, min, max, NULL);
 	}
-
-      set_value_range_to_varying (vr);
+      else
+	set_value_range_to_varying (vr);
       return;
     }
   else if (code == ABS_EXPR)
diff --git a/gcc/wide-int-range.cc b/gcc/wide-int-range.cc
index 04d391b33d5..aa25b88b982 100644
--- a/gcc/wide-int-range.cc
+++ b/gcc/wide-int-range.cc
@@ -735,6 +735,42 @@ wide_int_range_abs (wide_int &min, wide_int &max,
   return true;
 }
 
+/* Convert range in [VR0_MIN, VR0_MAX] with INNER_SIGN and INNER_PREC,
+   to a widest_int range in [MIN, MAX] with OUTER_SIGN and OUTER_PREC.
+
+   Return TRUE if we were able to successfully calculate the new range.
+
+   Caller is responsible for canonicalizing the resulting range.  */
+
+bool
+wide_int_range_convert (widest_int &min, widest_int &max,
+			signop inner_sign,
+			unsigned inner_prec,
+			signop outer_sign,
+			unsigned outer_prec,
+			const wide_int &vr0_min,
+			const wide_int &vr0_max)
+{
+  /* If the conversion is not truncating we can convert the min and
+     max values and canonicalize the resulting range.  Otherwise we
+     can do the conversion if the size of the range is less than what
+     the precision of the target type can represent.  */
+  if (outer_prec >= inner_prec
+      || wi::rshift (wi::sub (vr0_max, vr0_min),
+		     wi::uhwi (outer_prec, inner_prec),
+		     inner_sign) == 0)
+    {
+      min = widest_int::from (vr0_min, inner_sign);
+      max = widest_int::from (vr0_max, inner_sign);
+      widest_int min_value
+	= widest_int::from (wi::min_value (outer_prec, outer_sign), outer_sign);
+      widest_int max_value
+	= widest_int::from (wi::max_value (outer_prec, outer_sign), outer_sign);
+      return !wi::eq_p (min, min_value) || !wi::eq_p (max, max_value);
+    }
+  return false;
+}
+
 /* Calculate a division operation on two ranges and store the result in
    [WMIN, WMAX] U [EXTRA_MIN, EXTRA_MAX].
 
diff --git a/gcc/wide-int-range.h b/gcc/wide-int-range.h
index eaf47bca449..d0efd5ca694 100644
--- a/gcc/wide-int-range.h
+++ b/gcc/wide-int-range.h
@@ -109,6 +109,13 @@ extern bool wide_int_range_abs (wide_int &min, wide_int &max,
 				const wide_int &vr0_min,
 				const wide_int &vr0_max,
 				bool overflow_undefined);
+extern bool wide_int_range_convert (widest_int &min, widest_int &max,
+				    signop inner_sign,
+				    unsigned inner_prec,
+				    signop outer_sign,
+				    unsigned outer_prec,
+				    const wide_int &vr0_min,
+				    const wide_int &vr0_max);
 extern bool wide_int_range_div (wide_int &wmin, wide_int &wmax,
 				enum tree_code code,
 				signop sign, unsigned prec,
Richard Biener Sept. 4, 2018, 2:20 p.m. UTC | #9
On Tue, Sep 4, 2018 at 4:12 PM Aldy Hernandez <aldyh@redhat.com> wrote:
>
>
>
> On 09/04/2018 08:49 AM, Richard Biener wrote:
> > On Tue, Sep 4, 2018 at 2:41 PM Aldy Hernandez <aldyh@redhat.com> wrote:
> >>
> >>
> >>
> >> On 09/04/2018 07:58 AM, Richard Biener wrote:
> >>> On Mon, Sep 3, 2018 at 1:32 PM Aldy Hernandez <aldyh@redhat.com> wrote:
> >>>>
> >>>>
> >>>>
> >>>> On 08/28/2018 05:27 AM, Richard Biener wrote:
> >>>>> On Mon, Aug 27, 2018 at 2:24 PM Aldy Hernandez <aldyh@redhat.com> wrote:
> >>>>>>
> >>>>>> Howdy!
> >>>>>>
> >>>>>> Phew, I think this is the last abstraction.  This handles the unary
> >>>>>> CONVERT_EXPR_P code.
> >>>>>>
> >>>>>> It's the usual story-- normalize the symbolics to [-MIN,+MAX] and handle
> >>>>>> everything generically.
> >>>>>>
> >>>>>> Normalizing the symbolics brought about some nice surprises.  We now
> >>>>>> handle a few things we were punting on before, which I've documented in
> >>>>>> the patch, but can remove if so desired.  I wrote them mainly for myself:
> >>>>>>
> >>>>>> /* NOTES: Previously we were returning VARYING for all symbolics, but
> >>>>>>        we can do better by treating them as [-MIN, +MAX].  For
> >>>>>>        example, converting [SYM, SYM] from INT to LONG UNSIGNED,
> >>>>>>        we can return: ~[0x8000000, 0xffffffff7fffffff].
> >>>>>>
> >>>>>>        We were also failing to convert ~[0,0] from char* to unsigned,
> >>>>>>        instead choosing to return VR_VARYING.  Now we return ~[0,0].  */
> >>>>>>
> >>>>>> Tested on x86-64 by the usual bootstrap and regtest gymnastics,
> >>>>>> including --enable-languages=all, because my past sins are still
> >>>>>> haunting me.
> >>>>>>
> >>>>>> OK?
> >>>>>
> >>>>> The new wide_int_range_convert_tree looks odd given it returns
> >>>>> tree's.  I'd have expected an API that does the conversion resulting
> >>>>> in a wide_int range and the VRP code adapting to that by converting
> >>>>> the result to trees.
> >>>>
> >>>> Rewritten as suggested.
> >>>>
> >>>> A few notes.
> >>>>
> >>>> First.  I am not using widest_int as was done previously.  We were
> >>>> passing 0/false to the overflow args in force_fit_type so the call was
> >>>> just degrading into wide_int_to_tree() anyhow.  Am I missing some
> >>>> subtlety here, and must we use widest_int and then force_fit_type on the
> >>>> caller?
> >>>
> >>> The issue is that the wide-ints get "converted", so it's indeed subtle.
> >>
> >> This seems like it should be documented-- at the very least in
> >> wide_int_range_convert.
> >
> > I don't think you need it there.
> >
> >> What do you mean "converted"?  I'd like to understand this better before
> >> I write out the comment :).  Do we lose precision by keeping it in a
> >> wide_int as opposed to a widest_int?
> >
> > As you're using wide_int::from that "changes" precision now.
>
> Ah, so it's wide_int_to_tree that wants to see the widest precision.  I see.

I honestly don't remember exactly but I suppose precision mismatched somehow.

Your code was OK with not using widest_ints.

> >
> >> Also, can we have the caller to wide_int_range_convert() use
> >> wide_int_to_tree directly instead of passing through force_fit_type?
> >
> > I think so.
> >
> >> It
> >> seems force_fit_type with overflowable=0 and overflowed=false is just a
> >> call to wide_int_to_tree anyhow.
> >>
> >>>
> >>> +      || wi::rshift (wi::sub (vr0_max, vr0_min),
> >>> +                    wi::uhwi (outer_prec, inner_prec),
> >>> +                    inner_sign) == 0)
> >>>
> >>> this was previously allowed only for VR_RANGEs - now it's also for
> >>> VR_ANTI_RANGE.
> >>> That looks incorrect.  Testcase welcome ;)
> >>
> >> See change to extract_range_into_wide_ints.  VR_ANTI_RANGEs of constants
> >> will remain as is, whereas VR_ANTI_RANGEs of symbols will degrade into
> >> [-MIN,+MAX] and be handled generically.
> >>
> >> Also, see comment:
> >>
> >> +      /* We normalize everything to a VR_RANGE, but for constant
> >> +        anti-ranges we must handle them by leaving the final result
> >> +        as an anti range.  This allows us to convert things like
> >> +        ~[0,5] seamlessly.  */
> >
> > Yes, but for truncation of ~[0, 5] from say int to short it's not correct
> > to make the result ~[0, 5], is it?  At least the original code dropped
> > that to VARYING.  For the same reason truncating [3, 765] from
> > short to unsigned char isn't [3, 253].  But maybe I'm missing something.
>
> Correct, but in that case we will realize that in wide_int_range_convert
> and refuse to do the conversion:
>
>    /* If the conversion is not truncating we can convert the min and
>       max values and canonicalize the resulting range.  Otherwise we
>       can do the conversion if the size of the range is less than what
>       the precision of the target type can represent.  */
>    if (outer_prec >= inner_prec
>        || wi::rshift (wi::sub (vr0_max, vr0_min),
>                      wi::uhwi (outer_prec, inner_prec),
>                      inner_sign) == 0)

so you say the check is also sufficient for all anti-ranges?  I see the
current VRP code does canonicalization to ranges before running into
the CONVERT_EXPR_CODE case.

> >
> >>>
> >>> +      return (!wi::eq_p (min, wi::min_value (outer_prec, outer_sign))
> >>> +             || !wi::eq_p (max, wi::max_value (outer_prec, outer_sign)));
> >>>
> >>> so you return false for VARYING?  Why?  I'd simply return true and
> >>> return VARYING in the new type in the path you return false.
> >>
> >> Since symbols and VR_VARYING and other things get turned into a
> >> [-MIN,+MAX] by extract_range_into_wide_ints, it's a lot easier to handle
> >> things generically and return varying when the range spans the entire
> >> domain.  It also keeps with the rest of the wide_int_range_* functions
> >> that return false when we know it's going to be VR_VARYING.
> >
> > Ah, OK, didn't know they did that.  Not sure if that's convenient though
> > given VR_VARYING has no representation in the wide-int-range "range"
> > so callers chaining ops would need to do sth like
> >
> > if (!wide_int_range_* (...., &out_min, &out_max))
> >    {
> >      out_min = wide_int::min (prec);
> >      out_max = wide_int::max (prec);
> >    }
> > if (!wide_int_range_* (out_min, out_max, ....))
> > ...
> >
> > to not lose cases where VARYING inputs produce non-VARYING results?
>
> Well in the ssa-range branch we treat [-MIN,+MAX] as varying.  We call
> it range_for_type().

Sure.  But out_min/max is uninitialized when the functions return false?

> Also, the above if(!wide_int_range*) chaining is not necessary in the
> ssa-range work.  See op_wi() in range-op.c.  We just return false and
> the caller loop takes care of things.

I see.  I'm still trying to view the wide-int-range.{cc,h} stuff as a generally
useful API rather than just factored out "stuff" to be used by VRP and
range-op.c ...

Anyhow, let's go with your original patch.

I hope I can spend some time going over wide-int-range.{cc,h} and sanitizing
its interface (yeah, didn't forget the class thing ... maybe tomorrow
before I leave ;))

Richard.

> (Andrew's forte is not readable function names ;-).  When I have my way
> with range-op.c, this op_RANDOM_TWO_LETTERS nonsense will go away.)
>
> OK for trunk?
> Aldy
Aldy Hernandez Sept. 4, 2018, 2:25 p.m. UTC | #10
On 09/04/2018 10:20 AM, Richard Biener wrote:
> On Tue, Sep 4, 2018 at 4:12 PM Aldy Hernandez <aldyh@redhat.com> wrote:
>>
>>
>>
>> On 09/04/2018 08:49 AM, Richard Biener wrote:
>>> On Tue, Sep 4, 2018 at 2:41 PM Aldy Hernandez <aldyh@redhat.com> wrote:
>>>>
>>>>
>>>>
>>>> On 09/04/2018 07:58 AM, Richard Biener wrote:
>>>>> On Mon, Sep 3, 2018 at 1:32 PM Aldy Hernandez <aldyh@redhat.com> wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 08/28/2018 05:27 AM, Richard Biener wrote:
>>>>>>> On Mon, Aug 27, 2018 at 2:24 PM Aldy Hernandez <aldyh@redhat.com> wrote:
>>>>>>>>
>>>>>>>> Howdy!
>>>>>>>>
>>>>>>>> Phew, I think this is the last abstraction.  This handles the unary
>>>>>>>> CONVERT_EXPR_P code.
>>>>>>>>
>>>>>>>> It's the usual story-- normalize the symbolics to [-MIN,+MAX] and handle
>>>>>>>> everything generically.
>>>>>>>>
>>>>>>>> Normalizing the symbolics brought about some nice surprises.  We now
>>>>>>>> handle a few things we were punting on before, which I've documented in
>>>>>>>> the patch, but can remove if so desired.  I wrote them mainly for myself:
>>>>>>>>
>>>>>>>> /* NOTES: Previously we were returning VARYING for all symbolics, but
>>>>>>>>         we can do better by treating them as [-MIN, +MAX].  For
>>>>>>>>         example, converting [SYM, SYM] from INT to LONG UNSIGNED,
>>>>>>>>         we can return: ~[0x8000000, 0xffffffff7fffffff].
>>>>>>>>
>>>>>>>>         We were also failing to convert ~[0,0] from char* to unsigned,
>>>>>>>>         instead choosing to return VR_VARYING.  Now we return ~[0,0].  */
>>>>>>>>
>>>>>>>> Tested on x86-64 by the usual bootstrap and regtest gymnastics,
>>>>>>>> including --enable-languages=all, because my past sins are still
>>>>>>>> haunting me.
>>>>>>>>
>>>>>>>> OK?
>>>>>>>
>>>>>>> The new wide_int_range_convert_tree looks odd given it returns
>>>>>>> tree's.  I'd have expected an API that does the conversion resulting
>>>>>>> in a wide_int range and the VRP code adapting to that by converting
>>>>>>> the result to trees.
>>>>>>
>>>>>> Rewritten as suggested.
>>>>>>
>>>>>> A few notes.
>>>>>>
>>>>>> First.  I am not using widest_int as was done previously.  We were
>>>>>> passing 0/false to the overflow args in force_fit_type so the call was
>>>>>> just degrading into wide_int_to_tree() anyhow.  Am I missing some
>>>>>> subtlety here, and must we use widest_int and then force_fit_type on the
>>>>>> caller?
>>>>>
>>>>> The issue is that the wide-ints get "converted", so it's indeed subtle.
>>>>
>>>> This seems like it should be documented-- at the very least in
>>>> wide_int_range_convert.
>>>
>>> I don't think you need it there.
>>>
>>>> What do you mean "converted"?  I'd like to understand this better before
>>>> I write out the comment :).  Do we lose precision by keeping it in a
>>>> wide_int as opposed to a widest_int?
>>>
>>> As you're using wide_int::from that "changes" precision now.
>>
>> Ah, so it's wide_int_to_tree that wants to see the widest precision.  I see.
> 
> I honestly don't remember exactly but I suppose precision mismatched somehow.
> 
> Your code was OK with not using widest_ints.
> 
>>>
>>>> Also, can we have the caller to wide_int_range_convert() use
>>>> wide_int_to_tree directly instead of passing through force_fit_type?
>>>
>>> I think so.
>>>
>>>> It
>>>> seems force_fit_type with overflowable=0 and overflowed=false is just a
>>>> call to wide_int_to_tree anyhow.
>>>>
>>>>>
>>>>> +      || wi::rshift (wi::sub (vr0_max, vr0_min),
>>>>> +                    wi::uhwi (outer_prec, inner_prec),
>>>>> +                    inner_sign) == 0)
>>>>>
>>>>> this was previously allowed only for VR_RANGEs - now it's also for
>>>>> VR_ANTI_RANGE.
>>>>> That looks incorrect.  Testcase welcome ;)
>>>>
>>>> See change to extract_range_into_wide_ints.  VR_ANTI_RANGEs of constants
>>>> will remain as is, whereas VR_ANTI_RANGEs of symbols will degrade into
>>>> [-MIN,+MAX] and be handled generically.
>>>>
>>>> Also, see comment:
>>>>
>>>> +      /* We normalize everything to a VR_RANGE, but for constant
>>>> +        anti-ranges we must handle them by leaving the final result
>>>> +        as an anti range.  This allows us to convert things like
>>>> +        ~[0,5] seamlessly.  */
>>>
>>> Yes, but for truncation of ~[0, 5] from say int to short it's not correct
>>> to make the result ~[0, 5], is it?  At least the original code dropped
>>> that to VARYING.  For the same reason truncating [3, 765] from
>>> short to unsigned char isn't [3, 253].  But maybe I'm missing something.
>>
>> Correct, but in that case we will realize that in wide_int_range_convert
>> and refuse to do the conversion:
>>
>>     /* If the conversion is not truncating we can convert the min and
>>        max values and canonicalize the resulting range.  Otherwise we
>>        can do the conversion if the size of the range is less than what
>>        the precision of the target type can represent.  */
>>     if (outer_prec >= inner_prec
>>         || wi::rshift (wi::sub (vr0_max, vr0_min),
>>                       wi::uhwi (outer_prec, inner_prec),
>>                       inner_sign) == 0)
> 
> so you say the check is also sufficient for all anti-ranges?  I see the
> current VRP code does canonicalization to ranges before running into
> the CONVERT_EXPR_CODE case.
> 
>>>
>>>>>
>>>>> +      return (!wi::eq_p (min, wi::min_value (outer_prec, outer_sign))
>>>>> +             || !wi::eq_p (max, wi::max_value (outer_prec, outer_sign)));
>>>>>
>>>>> so you return false for VARYING?  Why?  I'd simply return true and
>>>>> return VARYING in the new type in the path you return false.
>>>>
>>>> Since symbols and VR_VARYING and other things get turned into a
>>>> [-MIN,+MAX] by extract_range_into_wide_ints, it's a lot easier to handle
>>>> things generically and return varying when the range spans the entire
>>>> domain.  It also keeps with the rest of the wide_int_range_* functions
>>>> that return false when we know it's going to be VR_VARYING.
>>>
>>> Ah, OK, didn't know they did that.  Not sure if that's convenient though
>>> given VR_VARYING has no representation in the wide-int-range "range"
>>> so callers chaining ops would need to do sth like
>>>
>>> if (!wide_int_range_* (...., &out_min, &out_max))
>>>     {
>>>       out_min = wide_int::min (prec);
>>>       out_max = wide_int::max (prec);
>>>     }
>>> if (!wide_int_range_* (out_min, out_max, ....))
>>> ...
>>>
>>> to not lose cases where VARYING inputs produce non-VARYING results?
>>
>> Well in the ssa-range branch we treat [-MIN,+MAX] as varying.  We call
>> it range_for_type().
> 
> Sure.  But out_min/max is uninitialized when the functions return false?
> 
>> Also, the above if(!wide_int_range*) chaining is not necessary in the
>> ssa-range work.  See op_wi() in range-op.c.  We just return false and
>> the caller loop takes care of things.
> 
> I see.  I'm still trying to view the wide-int-range.{cc,h} stuff as a generally
> useful API rather than just factored out "stuff" to be used by VRP and
> range-op.c ...
> 
> Anyhow, let's go with your original patch.

I assume you mean with my original patch but using widest_int instead of 
wide_int, because of the subtlety mentioned?  If so, that's the patch I 
just attached (and am doing so again for the record).

Please let me know and I will commit.

> 
> I hope I can spend some time going over wide-int-range.{cc,h} and sanitizing
> its interface (yeah, didn't forget the class thing ... maybe tomorrow
> before I leave ;))

Thanks.  I appreciate your comments.

Aldy
gcc/

	* wide-int-range.cc (wide_int_range_convert): New.
	* wide-int-range.h (wide_int_range_convert): New.
	* tree-vrp.c (extract_range_from_unary_expr): Abstract wide int
	code into wide_int_range_convert.
	(extract_range_into_wide_ints): Do not munge anti range constants
	into the entire domain.  Just return the range back.

diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c
index d29a2c9b507..2a3797e7f24 100644
--- a/gcc/tree-vrp.c
+++ b/gcc/tree-vrp.c
@@ -995,7 +995,7 @@ ranges_from_anti_range (const value_range *ar,
 /* Extract the components of a value range into a pair of wide ints in
    [WMIN, WMAX].
 
-   If the value range is anything but a VR_RANGE of constants, the
+   If the value range is anything but a VR_*RANGE of constants, the
    resulting wide ints are set to [-MIN, +MAX] for the type.  */
 
 static void inline
@@ -1003,7 +1003,10 @@ extract_range_into_wide_ints (const value_range *vr,
 			      signop sign, unsigned prec,
 			      wide_int &wmin, wide_int &wmax)
 {
-  if (range_int_cst_p (vr))
+  if ((vr->type == VR_RANGE
+       || vr->type == VR_ANTI_RANGE)
+      && TREE_CODE (vr->min) == INTEGER_CST
+      && TREE_CODE (vr->max) == INTEGER_CST)
     {
       wmin = wi::to_wide (vr->min);
       wmax = wi::to_wide (vr->max);
@@ -1850,44 +1853,42 @@ extract_range_from_unary_expr (value_range *vr,
 	  return;
 	}
 
-      /* If VR0 is varying and we increase the type precision, assume
-	 a full range for the following transformation.  */
-      if (vr0.type == VR_VARYING
-	  && INTEGRAL_TYPE_P (inner_type)
-	  && TYPE_PRECISION (inner_type) < TYPE_PRECISION (outer_type))
-	{
-	  vr0.type = VR_RANGE;
-	  vr0.min = TYPE_MIN_VALUE (inner_type);
-	  vr0.max = TYPE_MAX_VALUE (inner_type);
-	}
-
-      /* If VR0 is a constant range or anti-range and the conversion is
-	 not truncating we can convert the min and max values and
-	 canonicalize the resulting range.  Otherwise we can do the
-	 conversion if the size of the range is less than what the
-	 precision of the target type can represent and the range is
-	 not an anti-range.  */
-      if ((vr0.type == VR_RANGE
-	   || vr0.type == VR_ANTI_RANGE)
+      /* We normalize everything to a VR_RANGE, but for constant
+	 anti-ranges we must handle them by leaving the final result
+	 as an anti range.  This allows us to convert things like
+	 ~[0,5] seamlessly.  */
+      value_range_type vr_type = VR_RANGE;
+      if (vr0.type == VR_ANTI_RANGE
 	  && TREE_CODE (vr0.min) == INTEGER_CST
-	  && TREE_CODE (vr0.max) == INTEGER_CST
-	  && (TYPE_PRECISION (outer_type) >= TYPE_PRECISION (inner_type)
-	      || (vr0.type == VR_RANGE
-		  && integer_zerop (int_const_binop (RSHIFT_EXPR,
-		       int_const_binop (MINUS_EXPR, vr0.max, vr0.min),
-		         size_int (TYPE_PRECISION (outer_type)))))))
+	  && TREE_CODE (vr0.max) == INTEGER_CST)
+	vr_type = VR_ANTI_RANGE;
+
+      /* NOTES: Previously we were returning VARYING for all symbolics, but
+	 we can do better by treating them as [-MIN, +MAX].  For
+	 example, converting [SYM, SYM] from INT to LONG UNSIGNED,
+	 we can return: ~[0x8000000, 0xffffffff7fffffff].
+
+	 We were also failing to convert ~[0,0] from char* to unsigned,
+	 instead choosing to return VR_VARYING.  Now we return ~[0,0].  */
+      wide_int vr0_min, vr0_max;
+      widest_int wmin, wmax;
+      signop inner_sign = TYPE_SIGN (inner_type);
+      signop outer_sign = TYPE_SIGN (outer_type);
+      unsigned inner_prec = TYPE_PRECISION (inner_type);
+      unsigned outer_prec = TYPE_PRECISION (outer_type);
+      extract_range_into_wide_ints (&vr0, inner_sign, inner_prec,
+				    vr0_min, vr0_max);
+      if (wide_int_range_convert (wmin, wmax,
+				  inner_sign, inner_prec,
+				  outer_sign, outer_prec,
+				  vr0_min, vr0_max))
 	{
-	  tree new_min, new_max;
-	  new_min = force_fit_type (outer_type, wi::to_widest (vr0.min),
-				    0, false);
-	  new_max = force_fit_type (outer_type, wi::to_widest (vr0.max),
-				    0, false);
-	  set_and_canonicalize_value_range (vr, vr0.type,
-					    new_min, new_max, NULL);
-	  return;
+	  tree min = wide_int_to_tree (outer_type, wmin);
+	  tree max = wide_int_to_tree (outer_type, wmax);
+	  set_and_canonicalize_value_range (vr, vr_type, min, max, NULL);
 	}
-
-      set_value_range_to_varying (vr);
+      else
+	set_value_range_to_varying (vr);
       return;
     }
   else if (code == ABS_EXPR)
diff --git a/gcc/wide-int-range.cc b/gcc/wide-int-range.cc
index 04d391b33d5..aa25b88b982 100644
--- a/gcc/wide-int-range.cc
+++ b/gcc/wide-int-range.cc
@@ -735,6 +735,42 @@ wide_int_range_abs (wide_int &min, wide_int &max,
   return true;
 }
 
+/* Convert range in [VR0_MIN, VR0_MAX] with INNER_SIGN and INNER_PREC,
+   to a widest_int range in [MIN, MAX] with OUTER_SIGN and OUTER_PREC.
+
+   Return TRUE if we were able to successfully calculate the new range.
+
+   Caller is responsible for canonicalizing the resulting range.  */
+
+bool
+wide_int_range_convert (widest_int &min, widest_int &max,
+			signop inner_sign,
+			unsigned inner_prec,
+			signop outer_sign,
+			unsigned outer_prec,
+			const wide_int &vr0_min,
+			const wide_int &vr0_max)
+{
+  /* If the conversion is not truncating we can convert the min and
+     max values and canonicalize the resulting range.  Otherwise we
+     can do the conversion if the size of the range is less than what
+     the precision of the target type can represent.  */
+  if (outer_prec >= inner_prec
+      || wi::rshift (wi::sub (vr0_max, vr0_min),
+		     wi::uhwi (outer_prec, inner_prec),
+		     inner_sign) == 0)
+    {
+      min = widest_int::from (vr0_min, inner_sign);
+      max = widest_int::from (vr0_max, inner_sign);
+      widest_int min_value
+	= widest_int::from (wi::min_value (outer_prec, outer_sign), outer_sign);
+      widest_int max_value
+	= widest_int::from (wi::max_value (outer_prec, outer_sign), outer_sign);
+      return !wi::eq_p (min, min_value) || !wi::eq_p (max, max_value);
+    }
+  return false;
+}
+
 /* Calculate a division operation on two ranges and store the result in
    [WMIN, WMAX] U [EXTRA_MIN, EXTRA_MAX].
 
diff --git a/gcc/wide-int-range.h b/gcc/wide-int-range.h
index eaf47bca449..d0efd5ca694 100644
--- a/gcc/wide-int-range.h
+++ b/gcc/wide-int-range.h
@@ -109,6 +109,13 @@ extern bool wide_int_range_abs (wide_int &min, wide_int &max,
 				const wide_int &vr0_min,
 				const wide_int &vr0_max,
 				bool overflow_undefined);
+extern bool wide_int_range_convert (widest_int &min, widest_int &max,
+				    signop inner_sign,
+				    unsigned inner_prec,
+				    signop outer_sign,
+				    unsigned outer_prec,
+				    const wide_int &vr0_min,
+				    const wide_int &vr0_max);
 extern bool wide_int_range_div (wide_int &wmin, wide_int &wmax,
 				enum tree_code code,
 				signop sign, unsigned prec,
Aldy Hernandez Sept. 4, 2018, 2:35 p.m. UTC | #11
On 09/04/2018 10:20 AM, Richard Biener wrote:
> On Tue, Sep 4, 2018 at 4:12 PM Aldy Hernandez <aldyh@redhat.com> wrote:

> I honestly don't remember exactly but I suppose precision mismatched somehow.
> 
> Your code was OK with not using widest_ints.

Ah, this got lost in the noise.  Sorry.

Will commit the original patch.

>>> Yes, but for truncation of ~[0, 5] from say int to short it's not correct
>>> to make the result ~[0, 5], is it?  At least the original code dropped
>>> that to VARYING.  For the same reason truncating [3, 765] from
>>> short to unsigned char isn't [3, 253].  But maybe I'm missing something.
>>
>> Correct, but in that case we will realize that in wide_int_range_convert
>> and refuse to do the conversion:
>>
>>     /* If the conversion is not truncating we can convert the min and
>>        max values and canonicalize the resulting range.  Otherwise we
>>        can do the conversion if the size of the range is less than what
>>        the precision of the target type can represent.  */
>>     if (outer_prec >= inner_prec
>>         || wi::rshift (wi::sub (vr0_max, vr0_min),
>>                       wi::uhwi (outer_prec, inner_prec),
>>                       inner_sign) == 0)
> 
> so you say the check is also sufficient for all anti-ranges?  I see the
> current VRP code does canonicalization to ranges before running into
> the CONVERT_EXPR_CODE case.

Yes.  exactract_range_into_wide_ints() will do the right thing.

Thanks again.
Aldy
Michael Matz Sept. 5, 2018, 12:58 p.m. UTC | #12
Hi,

On Tue, 4 Sep 2018, Aldy Hernandez wrote:

> > to make the result ~[0, 5], is it?  At least the original code dropped
> > that to VARYING.  For the same reason truncating [3, 765] from
> > short to unsigned char isn't [3, 253].  But maybe I'm missing something.

Sorry for chiming in, but this catched my eye:

> Correct, but in that case we will realize that in wide_int_range_convert and
> refuse to do the conversion:
> 
>   /* If the conversion is not truncating we can convert the min and
>      max values and canonicalize the resulting range.  Otherwise we
>      can do the conversion if the size of the range is less than what
>      the precision of the target type can represent.  */
>   if (outer_prec >= inner_prec
>       || wi::rshift (wi::sub (vr0_max, vr0_min),
> 		     wi::uhwi (outer_prec, inner_prec),
> 		     inner_sign) == 0)
(followed by this code:)
+    {
+      min = widest_int::from (vr0_min, inner_sign);
+      max = widest_int::from (vr0_max, inner_sign);
+      widest_int min_value
+       = widest_int::from (wi::min_value (outer_prec, outer_sign),outer_sign);
+      widest_int max_value
+       = widest_int::from (wi::max_value (outer_prec, outer_sign),outer_sign);
+      return !wi::eq_p (min, min_value) || !wi::eq_p (max, max_value);
+    }
+  return false;

How can this test and following code catch all problematic cases?  Assume 
a range of [253..257], truncating to 8 bits unsigned.  The size of the 
range is 5 (not 4 as the code above calculates), well representable in 8 
bits.  Nevertheless, the min of the new range isn't 253, nor is the max of 
the new range 257.  In fact the new range must be [0..255] (if you have no 
multi ranges or anti-ranges).  So whatever this function is supposed to 
do (what btw?), it certainly does not convert a range.


Ciao,
Michael.
Aldy Hernandez Sept. 5, 2018, 2:01 p.m. UTC | #13
On 09/05/2018 08:58 AM, Michael Matz wrote:
> Hi,
> 
> On Tue, 4 Sep 2018, Aldy Hernandez wrote:
> 
>>> to make the result ~[0, 5], is it?  At least the original code dropped
>>> that to VARYING.  For the same reason truncating [3, 765] from
>>> short to unsigned char isn't [3, 253].  But maybe I'm missing something.
> 
> Sorry for chiming in, but this catched my eye:

No apologies needed.  I welcome all masochists to join me in my personal 
hell :).

> 
>> Correct, but in that case we will realize that in wide_int_range_convert and
>> refuse to do the conversion:
>>
>>    /* If the conversion is not truncating we can convert the min and
>>       max values and canonicalize the resulting range.  Otherwise we
>>       can do the conversion if the size of the range is less than what
>>       the precision of the target type can represent.  */
>>    if (outer_prec >= inner_prec
>>        || wi::rshift (wi::sub (vr0_max, vr0_min),
>> 		     wi::uhwi (outer_prec, inner_prec),
>> 		     inner_sign) == 0)
> (followed by this code:)
> +    {
> +      min = widest_int::from (vr0_min, inner_sign);
> +      max = widest_int::from (vr0_max, inner_sign);
> +      widest_int min_value
> +       = widest_int::from (wi::min_value (outer_prec, outer_sign),outer_sign);
> +      widest_int max_value
> +       = widest_int::from (wi::max_value (outer_prec, outer_sign),outer_sign);
> +      return !wi::eq_p (min, min_value) || !wi::eq_p (max, max_value);
> +    }
> +  return false;
> 
> How can this test and following code catch all problematic cases?  Assume
> a range of [253..257], truncating to 8 bits unsigned.  The size of the
> range is 5 (not 4 as the code above calculates), well representable in 8
> bits.  Nevertheless, the min of the new range isn't 253, nor is the max of
> the new range 257.  In fact the new range must be [0..255] (if you have no
> multi ranges or anti-ranges).  So whatever this function is supposed to
> do (what btw?), it certainly does not convert a range.

Welcome to the wonderful world of anti ranges, where nothing is as it seems.

First you're not looking at what's currently in mainline.  Richard 
approved the first incantation of this code.  But even so, I believe the 
above is correct as well.

What you're missing is the call to set_and_canonicalize_value_range in 
the caller which will transform ranges into anti-ranges when the 
returned range has min/max swapped:

      if (wide_int_range_convert (wmin, wmax,
				  inner_sign, inner_prec,
				  outer_sign, outer_prec,
				  vr0_min, vr0_max))
	{
	  tree min = wide_int_to_tree (outer_type, wmin);
	  tree max = wide_int_to_tree (outer_type, wmax);
	  set_and_canonicalize_value_range (vr, vr_type, min, max, NULL);
	}


Take your truncating example of [253, 257].  Wide_int_range_convert will 
return [253,1] and set_and_canonicalize_value_range will transform this 
into ~[2, 252].

See:
   /* Wrong order for min and max, to swap them and the VR type we need
      to adjust them.  */
   if (tree_int_cst_lt (max, min))

Also, I took care of documenting this need for canonizing in my code:

...
    Caller is responsible for canonicalizing the resulting range.  */

If desired, I could further document that the range returned may have 
its contents swapped and that is why it needs canonizing.  But hey, that 
shit was already broken when I got here ;-).

If you follow the code in tree-vrp.c before my patch you will see that 
the path is the same... we calculate a nonsensical range of [253, 1] and 
then massage it into ~[2, 252] in set_and_canonicalize_value_range.

I think the fact that it's so hard to get all this right, is a strong 
argument for getting rid of anti ranges.  Almost all the bugs or 
oversights I've fixed over the past months in VRP have been related to 
anti ranges.

Aldy
Michael Matz Sept. 5, 2018, 2:57 p.m. UTC | #14
Hi,

On Wed, 5 Sep 2018, Aldy Hernandez wrote:

> No apologies needed.  I welcome all masochists to join me in my personal hell
> :).

;-)

> > How can this test and following code catch all problematic cases?  Assume
> > a range of [253..257], truncating to 8 bits unsigned.  The size of the
> > range is 5 (not 4 as the code above calculates), well representable in 8
> > bits.  Nevertheless, the min of the new range isn't 253, nor is the max of
> > the new range 257.  In fact the new range must be [0..255] (if you have no
> > multi ranges or anti-ranges).  So whatever this function is supposed to
> > do (what btw?), it certainly does not convert a range.
> 
> Welcome to the wonderful world of anti ranges, where nothing is as it 
> seems.

Yes, as I said, to precisely capture the result of '[253..257] & 255' you 
either need multi-ranges or an anti-range.  

> First you're not looking at what's currently in mainline.  Richard 
> approved the first incantation of this code.  But even so, I believe the 
> above is correct as well.

Well, maybe as part of a larger sequence of code, but the above code 
doesn't even return [253..1], there's actually no conversion/change of the 
inputs done at all.  Yes, the current code actually does the truncation.  
I guess I was triggered by the name range_convert that didn't actually do 
any conversion.  I now have no issue with the code in mainline anymore 
(well, maybe except disliking functions with eight parameters).

> If desired, I could further document that the range returned may have 
> its contents swapped and that is why it needs canonizing.  But hey, that 
> shit was already broken when I got here ;-).
> 
> If you follow the code in tree-vrp.c before my patch you will see that 
> the path is the same... we calculate a nonsensical range of [253, 1] and 
> then massage it into ~[2, 252] in set_and_canonicalize_value_range.

Yes, I saw that, I think I was sceptical about splitting off a part of the 
code that wasn't self-sufficient.

> I think the fact that it's so hard to get all this right, is a strong 
> argument for getting rid of anti ranges.  Almost all the bugs or 
> oversights I've fixed over the past months in VRP have been related to 
> anti ranges.

Well, without anti-ranges you need multi-ranges (or at least two-ranges); 
it's a tradeoff.  If I'd have to choose a high-level interface and the 
choice is between supporting two-ranges directly (as two parameters) or a 
single range/anti-range I'd always take the latter (internally of course 
it would be represented as a two-range to make arguing and implementing 
stuff easier).


Ciao,
Michael.
Aldy Hernandez Sept. 5, 2018, 6 p.m. UTC | #15
On 09/05/2018 10:57 AM, Michael Matz wrote:
> Hi,
> 
> On Wed, 5 Sep 2018, Aldy Hernandez wrote:
> 
>> No apologies needed.  I welcome all masochists to join me in my personal hell
>> :).
> 
> ;-)
> 
>>> How can this test and following code catch all problematic cases?  Assume
>>> a range of [253..257], truncating to 8 bits unsigned.  The size of the
>>> range is 5 (not 4 as the code above calculates), well representable in 8
>>> bits.  Nevertheless, the min of the new range isn't 253, nor is the max of
>>> the new range 257.  In fact the new range must be [0..255] (if you have no
>>> multi ranges or anti-ranges).  So whatever this function is supposed to
>>> do (what btw?), it certainly does not convert a range.
>>
>> Welcome to the wonderful world of anti ranges, where nothing is as it
>> seems.
> 
> Yes, as I said, to precisely capture the result of '[253..257] & 255' you
> either need multi-ranges or an anti-range.
> 
>> First you're not looking at what's currently in mainline.  Richard
>> approved the first incantation of this code.  But even so, I believe the
>> above is correct as well.
> 
> Well, maybe as part of a larger sequence of code, but the above code
> doesn't even return [253..1], there's actually no conversion/change of the
> inputs done at all.  Yes, the current code actually does the truncation.
> I guess I was triggered by the name range_convert that didn't actually do
> any conversion.  I now have no issue with the code in mainline anymore
> (well, maybe except disliking functions with eight parameters).

If only someone had suggested a class to stuff those parameters in... ;-).

> 
>> If desired, I could further document that the range returned may have
>> its contents swapped and that is why it needs canonizing.  But hey, that
>> shit was already broken when I got here ;-).
>>
>> If you follow the code in tree-vrp.c before my patch you will see that
>> the path is the same... we calculate a nonsensical range of [253, 1] and
>> then massage it into ~[2, 252] in set_and_canonicalize_value_range.
> 
> Yes, I saw that, I think I was sceptical about splitting off a part of the
> code that wasn't self-sufficient.
> 
>> I think the fact that it's so hard to get all this right, is a strong
>> argument for getting rid of anti ranges.  Almost all the bugs or
>> oversights I've fixed over the past months in VRP have been related to
>> anti ranges.
> 
> Well, without anti-ranges you need multi-ranges (or at least two-ranges);
> it's a tradeoff.  If I'd have to choose a high-level interface and the
> choice is between supporting two-ranges directly (as two parameters) or a
> single range/anti-range I'd always take the latter (internally of course
> it would be represented as a two-range to make arguing and implementing
> stuff easier).

Yup, in the ssa-range work we have multi-ranges, and they're kept in one 
class we can pass as a unit.

Aldy

Patch
diff mbox series

gcc/

	* wide-int-range.cc (wide_int_range_convert_tree): New.
	* wide-int-range.h (wide_int_range_convert_tree): New.
	* tree-vrp.c (extract_range_from_unary_expr): Abstract wide int
	code into wide_int_range_convert_tree.
	(extract_range_into_wide_ints): Do not munge anti range constants
	into the entire domain.  Just return the range back.

diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c
index f20730a85ba..89fbbffcd2e 100644
--- a/gcc/tree-vrp.c
+++ b/gcc/tree-vrp.c
@@ -995,7 +995,7 @@  ranges_from_anti_range (value_range *ar,
 /* Extract the components of a value range into a pair of wide ints in
    [WMIN, WMAX].
 
-   If the value range is anything but a VR_RANGE of constants, the
+   If the value range is anything but a VR_*RANGE of constants, the
    resulting wide ints are set to [-MIN, +MAX] for the type.  */
 
 static void inline
@@ -1003,7 +1003,10 @@  extract_range_into_wide_ints (value_range *vr,
 			      signop sign, unsigned prec,
 			      wide_int &wmin, wide_int &wmax)
 {
-  if (range_int_cst_p (vr))
+  if ((vr->type == VR_RANGE
+       || vr->type == VR_ANTI_RANGE)
+      && TREE_CODE (vr->min) == INTEGER_CST
+      && TREE_CODE (vr->max) == INTEGER_CST)
     {
       wmin = wi::to_wide (vr->min);
       wmax = wi::to_wide (vr->max);
@@ -1894,44 +1897,35 @@  extract_range_from_unary_expr (value_range *vr,
 	  return;
 	}
 
-      /* If VR0 is varying and we increase the type precision, assume
-	 a full range for the following transformation.  */
-      if (vr0.type == VR_VARYING
-	  && INTEGRAL_TYPE_P (inner_type)
-	  && TYPE_PRECISION (inner_type) < TYPE_PRECISION (outer_type))
-	{
-	  vr0.type = VR_RANGE;
-	  vr0.min = TYPE_MIN_VALUE (inner_type);
-	  vr0.max = TYPE_MAX_VALUE (inner_type);
-	}
-
-      /* If VR0 is a constant range or anti-range and the conversion is
-	 not truncating we can convert the min and max values and
-	 canonicalize the resulting range.  Otherwise we can do the
-	 conversion if the size of the range is less than what the
-	 precision of the target type can represent and the range is
-	 not an anti-range.  */
-      if ((vr0.type == VR_RANGE
-	   || vr0.type == VR_ANTI_RANGE)
+      /* We normalize everything to a VR_RANGE, but for constant
+	 anti-ranges we must handle them by leaving the final result
+	 as an anti range.  This allows us to convert things like
+	 ~[0,5] seamlessly.  */
+      value_range_type vr_type = VR_RANGE;
+      if (vr0.type == VR_ANTI_RANGE
 	  && TREE_CODE (vr0.min) == INTEGER_CST
-	  && TREE_CODE (vr0.max) == INTEGER_CST
-	  && (TYPE_PRECISION (outer_type) >= TYPE_PRECISION (inner_type)
-	      || (vr0.type == VR_RANGE
-		  && integer_zerop (int_const_binop (RSHIFT_EXPR,
-		       int_const_binop (MINUS_EXPR, vr0.max, vr0.min),
-		         size_int (TYPE_PRECISION (outer_type)))))))
-	{
-	  tree new_min, new_max;
-	  new_min = force_fit_type (outer_type, wi::to_widest (vr0.min),
-				    0, false);
-	  new_max = force_fit_type (outer_type, wi::to_widest (vr0.max),
-				    0, false);
-	  set_and_canonicalize_value_range (vr, vr0.type,
-					    new_min, new_max, NULL);
-	  return;
-	}
+	  && TREE_CODE (vr0.max) == INTEGER_CST)
+	vr_type = VR_ANTI_RANGE;
 
-      set_value_range_to_varying (vr);
+      /* NOTES: Previously we were returning VARYING for all symbolics, but
+	 we can do better by treating them as [-MIN, +MAX].  For
+	 example, converting [SYM, SYM] from INT to LONG UNSIGNED,
+	 we can return: ~[0x8000000, 0xffffffff7fffffff].
+
+	 We were also failing to convert ~[0,0] from char* to unsigned,
+	 instead choosing to return VR_VARYING.  Now we return ~[0,0].  */
+      wide_int vr0_min, vr0_max;
+      signop inner_sign = TYPE_SIGN (inner_type);
+      unsigned inner_prec = TYPE_PRECISION (inner_type);
+      tree tmin, tmax;
+      extract_range_into_wide_ints (&vr0, inner_sign, inner_prec,
+				    vr0_min, vr0_max);
+      if (wide_int_range_convert_tree (tmin, tmax, outer_type,
+				       inner_sign, inner_prec,
+				       vr0_min, vr0_max))
+	set_and_canonicalize_value_range (vr, vr_type, tmin, tmax, NULL);
+      else
+	set_value_range_to_varying (vr);
       return;
     }
   else if (code == ABS_EXPR)
diff --git a/gcc/wide-int-range.cc b/gcc/wide-int-range.cc
index 3cdcede04cd..d3e1424e77f 100644
--- a/gcc/wide-int-range.cc
+++ b/gcc/wide-int-range.cc
@@ -665,6 +665,42 @@  wide_int_range_abs (wide_int &min, wide_int &max,
   return true;
 }
 
+/* Convert range in VR0 to OUTER_TYPE and store the result in a pair of
+   trees [TMIN, TMAX].
+
+   Return TRUE if we were able to successfully calculate the new range.
+
+   Caller is responsible for canonicalizing the resulting range.  */
+
+bool
+wide_int_range_convert_tree (tree &tmin, tree &tmax,
+			     tree outer_type,
+			     signop inner_sign,
+			     unsigned inner_prec,
+			     const wide_int &vr0_min,
+			     const wide_int &vr0_max)
+{
+  /* If the conversion is not truncating we can convert the min and
+     max values and canonicalize the resulting range.  Otherwise we
+     can do the conversion if the size of the range is less than what
+     the precision of the target type can represent.  */
+  if (TYPE_PRECISION (outer_type) >= inner_prec
+      || wi::rshift (wi::sub (vr0_max, vr0_min),
+		     wi::uhwi (TYPE_PRECISION (outer_type), inner_prec),
+		     inner_sign) == 0)
+    {
+      signop outer_sign = TYPE_SIGN (outer_type);
+      outer_sign = inner_sign;
+      tmin = force_fit_type (outer_type,
+			     widest_int::from (vr0_min, outer_sign), 0, false);
+      tmax = force_fit_type (outer_type,
+			     widest_int::from (vr0_max, outer_sign), 0, false);
+      return (!operand_equal_p (tmin, TYPE_MIN_VALUE (outer_type), 0)
+	      || !operand_equal_p (tmax, TYPE_MAX_VALUE (outer_type), 0));
+    }
+  return false;
+}
+
 /* Calculate a division operation on two ranges and store the result in
    [WMIN, WMAX] U [EXTRA_MIN, EXTRA_MAX].
 
diff --git a/gcc/wide-int-range.h b/gcc/wide-int-range.h
index 427ef34c6b4..6aa0b76b6e3 100644
--- a/gcc/wide-int-range.h
+++ b/gcc/wide-int-range.h
@@ -99,6 +99,12 @@  extern bool wide_int_range_abs (wide_int &min, wide_int &max,
 				const wide_int &vr0_min,
 				const wide_int &vr0_max,
 				bool overflow_undefined);
+extern bool wide_int_range_convert_tree (tree &tmin, tree &tmax,
+					 tree outer_type,
+					 signop inner_sign,
+					 unsigned inner_prec,
+					 const wide_int &vr0_min,
+					 const wide_int &vr0_max);
 extern bool wide_int_range_div (wide_int &wmin, wide_int &wmax,
 				enum tree_code code,
 				signop sign, unsigned prec,