diff mbox series

[range-ops] patch 01/04: types for VR_UNDEFINED and VR_VARYING

Message ID 481033aa-6ecc-3bcb-c874-becee14c5605@redhat.com
State New
Headers show
Series [range-ops] patch 01/04: types for VR_UNDEFINED and VR_VARYING | expand

Commit Message

Aldy Hernandez July 1, 2019, 8:52 a.m. UTC
As discussed before, this enforces types on undefined and varying, which 
makes everything more regular, and removes some special casing 
throughout range handling.

The min/max fields will contain TYPE_MIN_VALUE and TYPE_MAX_VALUE, which 
will make it easy to get at the bounds of a range later on.  Since 
pointers don't have TYPE_MIN/MAX_VALUE, we are using build_zero_cst() 
and wide_int_to_tree(wi::max_value(precision)), for consistency. 
UNDEFINED is set similarly, though nobody should ever ask for anything 
except type() from it.  That is, no one should be asking for the bounds.

There is one wrinkle, ipa-cp creates VR_VARYING ranges of floats, 
presumably to pass around state??  This causes value_range_base::type() 
and others to fail, even though I haven't actually seen a case of 
someone actually trying to set a VR_RANGE of a float.  For now, I've 
built a NOP_EXPR wrapper so type() works correctly.  The correct 
approach would probably be to avoid creating these varying/undefined 
ranges in ipa-cp, but I don't have enough ipa-cp-foo to do so. 
Suggestions welcome, if you don't like special casing this for ipa-cp. 
Or perhaps as a follow up.

In this patch I start introducing a couple small API changes that will 
be needed for range-ops.  Since they're needed/useful for this patch, I 
started adding them on a need-to-use basis.  They are:

value_range_base (tree type)

	This is our constructor for building a varying:
		value_range_base foo (some_type);

value_range_base::supports_type_p(tree type)

	We use this instead of having to test everywhere for
	INTEGRAL_TYPE_P and POINTER_TYPE_P which VRP uses throughout.
	I have not ported every use of the INTEGRAL || POINTER in the
	compiler to this function.  But that could be a follow up
	cleanup if someone (else) is interested :).

value_range_base_normalize_symbolics():

	This normalizes symbolics into constants.  In VRP we usually
	normalize necessary symbolics into [MIN, MAX].  This patch does
	slightly better.  Now we transform:

	  // [SYM, SYM] -> VARYING
	  // [SYM, NUM] -> [-MIN, NUM]
	  // [NUM, SYM] -> [NUM, +MAX]
	  // ~[SYM, NUM] -> [NUM + 1, +MAX]
	  // ~[NUM, SYM] -> [-MIN, NUM - 1]

	TBH, this bit and its use in *multiplicative_op probably fits
	better with the canonicalization patch, but as I said.  They're
	a bit intertwined.  Ughh.

Finally, as you mentioned before, we need a way of caching varyings in 
the allocation pool.  The type_range_cache class in the attached patch 
is Andrew's doing, but I'll be happy to take the blame and address 
anything that needs doing.

Tested on x86-64 Linux with --enable-languages=all.

Aldy

Comments

Jeff Law July 2, 2019, 8:17 p.m. UTC | #1
On 7/1/19 2:52 AM, Aldy Hernandez wrote:
> As discussed before, this enforces types on undefined and varying, which
> makes everything more regular, and removes some special casing
> throughout range handling.
> 
> The min/max fields will contain TYPE_MIN_VALUE and TYPE_MAX_VALUE, which
> will make it easy to get at the bounds of a range later on.  Since
> pointers don't have TYPE_MIN/MAX_VALUE, we are using build_zero_cst()
> and wide_int_to_tree(wi::max_value(precision)), for consistency.
> UNDEFINED is set similarly, though nobody should ever ask for anything
> except type() from it.  That is, no one should be asking for the bounds.
> 
> There is one wrinkle, ipa-cp creates VR_VARYING ranges of floats,
> presumably to pass around state??  This causes value_range_base::type()
> and others to fail, even though I haven't actually seen a case of
> someone actually trying to set a VR_RANGE of a float.  For now, I've
> built a NOP_EXPR wrapper so type() works correctly.  The correct
> approach would probably be to avoid creating these varying/undefined
> ranges in ipa-cp, but I don't have enough ipa-cp-foo to do so.
> Suggestions welcome, if you don't like special casing this for ipa-cp.
> Or perhaps as a follow up.
No idea why we create ranges of floats from ipa-cp.  I presume it's
coming from propagate_vr_across_jump_function?  Or somewhere else?

> 
> In this patch I start introducing a couple small API changes that will
> be needed for range-ops.  Since they're needed/useful for this patch, I
> started adding them on a need-to-use basis.  They are:
> 
> value_range_base (tree type)
> 
>     This is our constructor for building a varying:
>         value_range_base foo (some_type);
> 
> value_range_base::supports_type_p(tree type)
> 
>     We use this instead of having to test everywhere for
>     INTEGRAL_TYPE_P and POINTER_TYPE_P which VRP uses throughout.
>     I have not ported every use of the INTEGRAL || POINTER in the
>     compiler to this function.  But that could be a follow up
>     cleanup if someone (else) is interested :).
Cleanups of this nature are pre-approved once this patch is ACK'd and
installed.


> 
> value_range_base_normalize_symbolics():
> 
>     This normalizes symbolics into constants.  In VRP we usually
>     normalize necessary symbolics into [MIN, MAX].  This patch does
>     slightly better.  Now we transform:
> 
>       // [SYM, SYM] -> VARYING
>       // [SYM, NUM] -> [-MIN, NUM]
>       // [NUM, SYM] -> [NUM, +MAX]
>       // ~[SYM, NUM] -> [NUM + 1, +MAX]
>       // ~[NUM, SYM] -> [-MIN, NUM - 1]
> 
>     TBH, this bit and its use in *multiplicative_op probably fits
>     better with the canonicalization patch, but as I said.  They're
>     a bit intertwined.  Ughh.
I think it does fit better there, but we ought to be able to manage.


> 
> Finally, as you mentioned before, we need a way of caching varyings in
> the allocation pool.  The type_range_cache class in the attached patch
> is Andrew's doing, but I'll be happy to take the blame and address
> anything that needs doing.
> 
> Tested on x86-64 Linux with --enable-languages=all.
> 
> Aldy
> 
> range-ops-type.patch
> 
> commit 24c3a6a2cb7424a9c022930cada3a5f3c84a388a
> Author: Aldy Hernandez <aldyh@redhat.com>
> Date:   Fri Jun 28 11:00:10 2019 +0200
> 
>     VR_UNDEFINED and VR_VARYING now have a type.
> 
> diff --git a/gcc/ChangeLog b/gcc/ChangeLog
> index 01fb97cedb2..a5247735694 100644
> --- a/gcc/ChangeLog
> +++ b/gcc/ChangeLog
> @@ -1,3 +1,72 @@
> +2019-07-01  Aldy Hernandez  <aldyh@redhat.com>
> +
> +	* gimple-ssa-evrp-analyze.c (record_ranges_from_phis): Skip PHIs
> +	who's result are not valid for a value_range.
> +	Set type for varying value_range.
> +	* ipa-cp.c (ipcp_vr_lattice::init): Set type for varying/undef
> +	value_range.
> +	(ipcp_vr_lattice::set_to_bottom): Same.
> +	(initialize_node_lattices): Same.
> +	* tree-ssa-threadedge.c (record_temporary_equivalences_from_phis):
> +	Same.
> +	* tree-ssanames.c (get_range_info): Same.
> +	* tree-vrp.c (value_range::set_equiv): Do not set equiv on
> +	undef/varying.
> +	(value_range_base::value_range_base): New constructor.
> +	(value_range_base::check): Remove assert for empty min/max.
> +	(value_range_base::equal_p): Allow comparison of typeless undefs.
> +	(value_range_base::set_undefined): Add type.
> +	(value_range::set_undefined): Same.
> +	(value_range_base::set_varying): Same.
> +	(value_range::set_varying): Same.
> +	(value_range_base::type): Remove assert.
> +	(value_range_base::dump): Display type for varying/undef.
> +	(value_range_base::dump): Add argument-less overload.
> +	(value_range::dump): Same.
> +	(vrp_val_max): Add handle_pointers argument.
> +	(vrp_val_min): Same.
> +	(vrp_val_is_max): Same.
> +	(vrp_val_is_min): Same.
> +	(value_range_base::set_and_canonicalize): Adjust so type is
> +	allowed for varying/undef.
> +	(ranges_from_anti_range): Same.
> +	(extract_range_from_muliplicative_op): Same.
> +	(extract_range_from_binary_expr): Same.
> +	(extract_range_from_unary_expr): Same.
> +	(vrp_prop::vrp_initialize): Same.
> +	(vrp_prop::visit_stmt): Same.
> +	(value_range_base::union_helper): Same.
> +	(value_range_base::normalize_symbolics): Same.
> +	(determine_value_range_1): Same.
> +	* tree-vrp.h (value_range_base): Add value_range_base(tree type)
> +	constructor.
> +	Add dump (), supports_type_p,
> +	value_range_base_normalize_symbolics, set_varying, and
> +	set_undefined.
> +	(value_range): Add set_varying, set_undefined, and dump().
> +	(vrp_val_is_min): Add argument.
> +	(vrp_val_is_max): Same.
> +	(vrp_val_min): Same.
> +	(vrp_val_max): Same.
> +	(range_includes_zero_p): Adjust for varying/undef with types.
> +	* vr-values.c (class type_range_cache): New.
> +	(set_value_range_to_truthvalue): Adjust for varying/undef with
> +	types.
> +	(get_value_range): Same.
> +	(vr_values::set_defs_to_varying): Same.
> +	(vr_values::update_value_range): Same.
> +	(extract_range_for_var_from_comparison_expr): Same.
> +	(extract_range_from_binary_expr): Same.
> +	(extract_range_from_cond_expr): Same.
> +	(check_for_binary_op_overflow): Same.
> +	(extract_range_basic): Same.
> +	(extract_range_from_assignment): Same.
> +	(vr_values::vr_values): Initialize type cache.
> +	(vr_values::~vr_values): Free type cache.
> +	(extract_range_from_phi_node): Adjust for varying/undef with
> +	types.
> +	* vr-values.h (class vr_values): Add type_cache.
> +
>  2019-06-26  Jeff Law  <law@redhat.com>
>  


> +
> +/* Allocate a new range from the obstack and set it to VARYING for TYPE.  */
> +inline value_range *
> +type_range_cache::new_varying (tree type)
> +{
> +  /* Allocate memory.  */
> +  void *p = XOBNEW (&m_range_obj, value_range);
> +  /* Invoke the constructors on the memory using placement new.  */
> +  value_range *new_p = new (p) value_range ();
> +  /* Initialize it to varying.  */
> +  new_p->set_varying (type);
> +  return new_p;
> +}
So is placement new C++98/C++03 or C++11?

If the former then we can use it, if the latter we probably can't since
we haven't stepped forward to C++11.

jeff
Richard Biener July 3, 2019, 8:28 a.m. UTC | #2
On Mon, Jul 1, 2019 at 10:52 AM Aldy Hernandez <aldyh@redhat.com> wrote:
>
> As discussed before, this enforces types on undefined and varying, which
> makes everything more regular, and removes some special casing
> throughout range handling.

I don't like it too much given you need to introduce that "cache".

Why do VARYING or UNDEFINED need a type?  Nobody should ever need
to ask for the type of a range anyhow - the type should be always that from
the context we're looking at (the SSA name the range is associated with,
the operation we are performing on one or two ranges, etc.).

Thinking again about this it looks fundamentally wrong to associate
a type with the VARYING or UNDEFINED lattice states.

Oh, and no, IPA shouldn't create ranges for floats - probably a sanity check
missing somewhere.

Richard.

> The min/max fields will contain TYPE_MIN_VALUE and TYPE_MAX_VALUE, which
> will make it easy to get at the bounds of a range later on.  Since
> pointers don't have TYPE_MIN/MAX_VALUE, we are using build_zero_cst()
> and wide_int_to_tree(wi::max_value(precision)), for consistency.
> UNDEFINED is set similarly, though nobody should ever ask for anything
> except type() from it.  That is, no one should be asking for the bounds.
>
> There is one wrinkle, ipa-cp creates VR_VARYING ranges of floats,
> presumably to pass around state??  This causes value_range_base::type()
> and others to fail, even though I haven't actually seen a case of
> someone actually trying to set a VR_RANGE of a float.  For now, I've
> built a NOP_EXPR wrapper so type() works correctly.  The correct
> approach would probably be to avoid creating these varying/undefined
> ranges in ipa-cp, but I don't have enough ipa-cp-foo to do so.
> Suggestions welcome, if you don't like special casing this for ipa-cp.
> Or perhaps as a follow up.
>
> In this patch I start introducing a couple small API changes that will
> be needed for range-ops.  Since they're needed/useful for this patch, I
> started adding them on a need-to-use basis.  They are:
>
> value_range_base (tree type)
>
>         This is our constructor for building a varying:
>                 value_range_base foo (some_type);
>
> value_range_base::supports_type_p(tree type)
>
>         We use this instead of having to test everywhere for
>         INTEGRAL_TYPE_P and POINTER_TYPE_P which VRP uses throughout.
>         I have not ported every use of the INTEGRAL || POINTER in the
>         compiler to this function.  But that could be a follow up
>         cleanup if someone (else) is interested :).
>
> value_range_base_normalize_symbolics():
>
>         This normalizes symbolics into constants.  In VRP we usually
>         normalize necessary symbolics into [MIN, MAX].  This patch does
>         slightly better.  Now we transform:
>
>           // [SYM, SYM] -> VARYING
>           // [SYM, NUM] -> [-MIN, NUM]
>           // [NUM, SYM] -> [NUM, +MAX]
>           // ~[SYM, NUM] -> [NUM + 1, +MAX]
>           // ~[NUM, SYM] -> [-MIN, NUM - 1]
>
>         TBH, this bit and its use in *multiplicative_op probably fits
>         better with the canonicalization patch, but as I said.  They're
>         a bit intertwined.  Ughh.
>
> Finally, as you mentioned before, we need a way of caching varyings in
> the allocation pool.  The type_range_cache class in the attached patch
> is Andrew's doing, but I'll be happy to take the blame and address
> anything that needs doing.
>
> Tested on x86-64 Linux with --enable-languages=all.
>
> Aldy
Aldy Hernandez July 3, 2019, 9:19 a.m. UTC | #3
On 7/3/19 4:28 AM, Richard Biener wrote:
> On Mon, Jul 1, 2019 at 10:52 AM Aldy Hernandez <aldyh@redhat.com> wrote:
>>
>> As discussed before, this enforces types on undefined and varying, which
>> makes everything more regular, and removes some special casing
>> throughout range handling.
> 
> I don't like it too much given you need to introduce that "cache".
> 
> Why do VARYING or UNDEFINED need a type?  Nobody should ever need
> to ask for the type of a range anyhow - the type should be always that from
> the context we're looking at (the SSA name the range is associated with,
> the operation we are performing on one or two ranges, etc.).
> 
> Thinking again about this it looks fundamentally wrong to associate
> a type with the VARYING or UNDEFINED lattice states.

We discussed this 2 weeks ago, and it was my understanding that we had 
an reached an agreement on the general approach.  Types on varying and 
undefined was the *first* thing I brought up.  Explanation quoted below.

By the way, the type for varying/undefined requires no space in the 
value_range_base structure, as it is kept in the min/max fields which we 
already use for VR_RANGE/VR_ANTI_RANGE.

Aldy

https://gcc.gnu.org/ml/gcc-patches/2019-06/msg01292.html

In order to unify the APIs for value_range and irange, we'd like to make
some minor changes to value_range.  We believe most of these changes
could go in now, and would prefer so, to get broader testing and
minimize the plethora of changes we drag around on our branch.

First, introduce a type for VR_VARYING and VR_UNDEFINED.
--------------------------------------------------------

irange utilizes 0 or more sub-ranges to represent a range, and VARYING
is simply one subrange [MIN, MAX].    value_range represents this with
VR_VARYING, and since there is no type associated with it, we cannot
calculate the lower and upper bounds for the range.  There is also a
lack of canonicalness in value range in that VR_VARYING and [MIN, MAX]
are two different representations of the same value.

We tried to adjust irange to not associate a type with the empty range
[] (representing undefined), but found we were unable to perform all
operations properly.  In particular, we cannot invert an empty range.
i.e. invert ( [] ) should produce [MIN, MAX].  Again, we need to have a
type associated with this empty range.

We'd like to tweak value_range so that set_varying() and set_undefined()
both take a type, and then always set the min/max fields based on that
type.  This takes no additional memory in the structure, and is
virtually transparent to all the existing uses of value_range.

This allows:
     1)  invert to be implemented properly for both VARYING and UNDEFINED
by simply changing one to the other.
     2)  the type() method to always work without any special casing by
simply returning TREE_TYPE(min)
     3)  the new incoming bounds() routines to work trivially for these
cases as well (lbound/ubound, num_pairs(), etc).
Aldy Hernandez July 3, 2019, 9:46 a.m. UTC | #4
On 7/2/19 4:17 PM, Jeff Law wrote:
> On 7/1/19 2:52 AM, Aldy Hernandez wrote:
>> As discussed before, this enforces types on undefined and varying, which
>> makes everything more regular, and removes some special casing
>> throughout range handling.
>>
>> The min/max fields will contain TYPE_MIN_VALUE and TYPE_MAX_VALUE, which
>> will make it easy to get at the bounds of a range later on.  Since
>> pointers don't have TYPE_MIN/MAX_VALUE, we are using build_zero_cst()
>> and wide_int_to_tree(wi::max_value(precision)), for consistency.
>> UNDEFINED is set similarly, though nobody should ever ask for anything
>> except type() from it.  That is, no one should be asking for the bounds.
>>
>> There is one wrinkle, ipa-cp creates VR_VARYING ranges of floats,
>> presumably to pass around state??  This causes value_range_base::type()
>> and others to fail, even though I haven't actually seen a case of
>> someone actually trying to set a VR_RANGE of a float.  For now, I've
>> built a NOP_EXPR wrapper so type() works correctly.  The correct
>> approach would probably be to avoid creating these varying/undefined
>> ranges in ipa-cp, but I don't have enough ipa-cp-foo to do so.
>> Suggestions welcome, if you don't like special casing this for ipa-cp.
>> Or perhaps as a follow up.
> No idea why we create ranges of floats from ipa-cp.  I presume it's
> coming from propagate_vr_across_jump_function?  Or somewhere else?

I believe it was ipcp_vr_lattice::set_to_bottom, while changing an 
UNDEFINED to VARYING.  IMO, we shouldn't even have created UNDEFINED 
ranges of floats.  It's not like we can do anything with float ranges.

> 
>>
>> In this patch I start introducing a couple small API changes that will
>> be needed for range-ops.  Since they're needed/useful for this patch, I
>> started adding them on a need-to-use basis.  They are:
>>
>> value_range_base (tree type)
>>
>>      This is our constructor for building a varying:
>>          value_range_base foo (some_type);
>>
>> value_range_base::supports_type_p(tree type)
>>
>>      We use this instead of having to test everywhere for
>>      INTEGRAL_TYPE_P and POINTER_TYPE_P which VRP uses throughout.
>>      I have not ported every use of the INTEGRAL || POINTER in the
>>      compiler to this function.  But that could be a follow up
>>      cleanup if someone (else) is interested :).
> Cleanups of this nature are pre-approved once this patch is ACK'd and
> installed.
> 
> 
>>
>> value_range_base_normalize_symbolics():
>>
>>      This normalizes symbolics into constants.  In VRP we usually
>>      normalize necessary symbolics into [MIN, MAX].  This patch does
>>      slightly better.  Now we transform:
>>
>>        // [SYM, SYM] -> VARYING
>>        // [SYM, NUM] -> [-MIN, NUM]
>>        // [NUM, SYM] -> [NUM, +MAX]
>>        // ~[SYM, NUM] -> [NUM + 1, +MAX]
>>        // ~[NUM, SYM] -> [-MIN, NUM - 1]
>>
>>      TBH, this bit and its use in *multiplicative_op probably fits
>>      better with the canonicalization patch, but as I said.  They're
>>      a bit intertwined.  Ughh.
> I think it does fit better there, but we ought to be able to manage.
> 
> 
>>
>> Finally, as you mentioned before, we need a way of caching varyings in
>> the allocation pool.  The type_range_cache class in the attached patch
>> is Andrew's doing, but I'll be happy to take the blame and address
>> anything that needs doing.
>>
>> Tested on x86-64 Linux with --enable-languages=all.
>>
>> Aldy
>>
>> range-ops-type.patch
>>
>> commit 24c3a6a2cb7424a9c022930cada3a5f3c84a388a
>> Author: Aldy Hernandez <aldyh@redhat.com>
>> Date:   Fri Jun 28 11:00:10 2019 +0200
>>
>>      VR_UNDEFINED and VR_VARYING now have a type.
>>
>> diff --git a/gcc/ChangeLog b/gcc/ChangeLog
>> index 01fb97cedb2..a5247735694 100644
>> --- a/gcc/ChangeLog
>> +++ b/gcc/ChangeLog
>> @@ -1,3 +1,72 @@
>> +2019-07-01  Aldy Hernandez  <aldyh@redhat.com>
>> +
>> +	* gimple-ssa-evrp-analyze.c (record_ranges_from_phis): Skip PHIs
>> +	who's result are not valid for a value_range.
>> +	Set type for varying value_range.
>> +	* ipa-cp.c (ipcp_vr_lattice::init): Set type for varying/undef
>> +	value_range.
>> +	(ipcp_vr_lattice::set_to_bottom): Same.
>> +	(initialize_node_lattices): Same.
>> +	* tree-ssa-threadedge.c (record_temporary_equivalences_from_phis):
>> +	Same.
>> +	* tree-ssanames.c (get_range_info): Same.
>> +	* tree-vrp.c (value_range::set_equiv): Do not set equiv on
>> +	undef/varying.
>> +	(value_range_base::value_range_base): New constructor.
>> +	(value_range_base::check): Remove assert for empty min/max.
>> +	(value_range_base::equal_p): Allow comparison of typeless undefs.
>> +	(value_range_base::set_undefined): Add type.
>> +	(value_range::set_undefined): Same.
>> +	(value_range_base::set_varying): Same.
>> +	(value_range::set_varying): Same.
>> +	(value_range_base::type): Remove assert.
>> +	(value_range_base::dump): Display type for varying/undef.
>> +	(value_range_base::dump): Add argument-less overload.
>> +	(value_range::dump): Same.
>> +	(vrp_val_max): Add handle_pointers argument.
>> +	(vrp_val_min): Same.
>> +	(vrp_val_is_max): Same.
>> +	(vrp_val_is_min): Same.
>> +	(value_range_base::set_and_canonicalize): Adjust so type is
>> +	allowed for varying/undef.
>> +	(ranges_from_anti_range): Same.
>> +	(extract_range_from_muliplicative_op): Same.
>> +	(extract_range_from_binary_expr): Same.
>> +	(extract_range_from_unary_expr): Same.
>> +	(vrp_prop::vrp_initialize): Same.
>> +	(vrp_prop::visit_stmt): Same.
>> +	(value_range_base::union_helper): Same.
>> +	(value_range_base::normalize_symbolics): Same.
>> +	(determine_value_range_1): Same.
>> +	* tree-vrp.h (value_range_base): Add value_range_base(tree type)
>> +	constructor.
>> +	Add dump (), supports_type_p,
>> +	value_range_base_normalize_symbolics, set_varying, and
>> +	set_undefined.
>> +	(value_range): Add set_varying, set_undefined, and dump().
>> +	(vrp_val_is_min): Add argument.
>> +	(vrp_val_is_max): Same.
>> +	(vrp_val_min): Same.
>> +	(vrp_val_max): Same.
>> +	(range_includes_zero_p): Adjust for varying/undef with types.
>> +	* vr-values.c (class type_range_cache): New.
>> +	(set_value_range_to_truthvalue): Adjust for varying/undef with
>> +	types.
>> +	(get_value_range): Same.
>> +	(vr_values::set_defs_to_varying): Same.
>> +	(vr_values::update_value_range): Same.
>> +	(extract_range_for_var_from_comparison_expr): Same.
>> +	(extract_range_from_binary_expr): Same.
>> +	(extract_range_from_cond_expr): Same.
>> +	(check_for_binary_op_overflow): Same.
>> +	(extract_range_basic): Same.
>> +	(extract_range_from_assignment): Same.
>> +	(vr_values::vr_values): Initialize type cache.
>> +	(vr_values::~vr_values): Free type cache.
>> +	(extract_range_from_phi_node): Adjust for varying/undef with
>> +	types.
>> +	* vr-values.h (class vr_values): Add type_cache.
>> +
>>   2019-06-26  Jeff Law  <law@redhat.com>
>>   
> 
> 
>> +
>> +/* Allocate a new range from the obstack and set it to VARYING for TYPE.  */
>> +inline value_range *
>> +type_range_cache::new_varying (tree type)
>> +{
>> +  /* Allocate memory.  */
>> +  void *p = XOBNEW (&m_range_obj, value_range);
>> +  /* Invoke the constructors on the memory using placement new.  */
>> +  value_range *new_p = new (p) value_range ();
>> +  /* Initialize it to varying.  */
>> +  new_p->set_varying (type);
>> +  return new_p;
>> +}
> So is placement new C++98/C++03 or C++11?
> 
> If the former then we can use it, if the latter we probably can't since
> we haven't stepped forward to C++11.

Google isn't cooperating here to narrow the specific C++ version, but 
I'm seeing some very old references to placement new, from the mid to 
the late 1990s.

FWIW, the above snippet shows no warnings when compiled with -std=c++-98 
-Wall.

Aldy
Richard Biener July 3, 2019, 11:08 a.m. UTC | #5
On Wed, Jul 3, 2019 at 11:19 AM Aldy Hernandez <aldyh@redhat.com> wrote:
>
>
>
> On 7/3/19 4:28 AM, Richard Biener wrote:
> > On Mon, Jul 1, 2019 at 10:52 AM Aldy Hernandez <aldyh@redhat.com> wrote:
> >>
> >> As discussed before, this enforces types on undefined and varying, which
> >> makes everything more regular, and removes some special casing
> >> throughout range handling.
> >
> > I don't like it too much given you need to introduce that "cache".
> >
> > Why do VARYING or UNDEFINED need a type?  Nobody should ever need
> > to ask for the type of a range anyhow - the type should be always that from
> > the context we're looking at (the SSA name the range is associated with,
> > the operation we are performing on one or two ranges, etc.).
> >
> > Thinking again about this it looks fundamentally wrong to associate
> > a type with the VARYING or UNDEFINED lattice states.
>
> We discussed this 2 weeks ago, and it was my understanding that we had
> an reached an agreement on the general approach.  Types on varying and
> undefined was the *first* thing I brought up.  Explanation quoted below.

Yes, and I asked how you handled that const static node for VARYING
which you answered, well - we could do some caching per type and I
replied I didn't like that very much.

So - I see why it might be "convenient" but I still see no actual
technical requirement to have a type for them.  I see you have
canonicalization for symbolic ranges to integer ranges so
you can have one for varying/undefined to integer ranges as well;
just make that canonicalization take a type argument.

> By the way, the type for varying/undefined requires no space in the
> value_range_base structure, as it is kept in the min/max fields which we
> already use for VR_RANGE/VR_ANTI_RANGE.

You can as well populate those with actual canonical integer range values
then.  [MIN, MAX] and [MAX, MIN] (or whatever we'd consider canonical for
the empty range).

But as said, point me to the place where you need the type of VARYING.
It should already exist since the current code does away without.

I refuse to uglify the current VRP with a not needed type-indexed cache
for VARYING nodes just to make ranger intergation more happy.   Just
ignore that extra 'type' argument in the ranger API then?

Richard.

> Aldy
>
> https://gcc.gnu.org/ml/gcc-patches/2019-06/msg01292.html
>
> In order to unify the APIs for value_range and irange, we'd like to make
> some minor changes to value_range.  We believe most of these changes
> could go in now, and would prefer so, to get broader testing and
> minimize the plethora of changes we drag around on our branch.
>
> First, introduce a type for VR_VARYING and VR_UNDEFINED.
> --------------------------------------------------------
>
> irange utilizes 0 or more sub-ranges to represent a range, and VARYING
> is simply one subrange [MIN, MAX].    value_range represents this with
> VR_VARYING, and since there is no type associated with it, we cannot
> calculate the lower and upper bounds for the range.  There is also a
> lack of canonicalness in value range in that VR_VARYING and [MIN, MAX]
> are two different representations of the same value.
>
> We tried to adjust irange to not associate a type with the empty range
> [] (representing undefined), but found we were unable to perform all
> operations properly.  In particular, we cannot invert an empty range.
> i.e. invert ( [] ) should produce [MIN, MAX].  Again, we need to have a
> type associated with this empty range.
>
> We'd like to tweak value_range so that set_varying() and set_undefined()
> both take a type, and then always set the min/max fields based on that
> type.  This takes no additional memory in the structure, and is
> virtually transparent to all the existing uses of value_range.
>
> This allows:
>      1)  invert to be implemented properly for both VARYING and UNDEFINED
> by simply changing one to the other.
>      2)  the type() method to always work without any special casing by
> simply returning TREE_TYPE(min)
>      3)  the new incoming bounds() routines to work trivially for these
> cases as well (lbound/ubound, num_pairs(), etc).
Aldy Hernandez July 3, 2019, 12:17 p.m. UTC | #6
On 7/3/19 7:08 AM, Richard Biener wrote:
> On Wed, Jul 3, 2019 at 11:19 AM Aldy Hernandez <aldyh@redhat.com> wrote:
>>
>>
>>
>> On 7/3/19 4:28 AM, Richard Biener wrote:
>>> On Mon, Jul 1, 2019 at 10:52 AM Aldy Hernandez <aldyh@redhat.com> wrote:
>>>>
>>>> As discussed before, this enforces types on undefined and varying, which
>>>> makes everything more regular, and removes some special casing
>>>> throughout range handling.
>>>
>>> I don't like it too much given you need to introduce that "cache".
>>>
>>> Why do VARYING or UNDEFINED need a type?  Nobody should ever need
>>> to ask for the type of a range anyhow - the type should be always that from
>>> the context we're looking at (the SSA name the range is associated with,
>>> the operation we are performing on one or two ranges, etc.).
>>>
>>> Thinking again about this it looks fundamentally wrong to associate
>>> a type with the VARYING or UNDEFINED lattice states.
>>
>> We discussed this 2 weeks ago, and it was my understanding that we had
>> an reached an agreement on the general approach.  Types on varying and
>> undefined was the *first* thing I brought up.  Explanation quoted below.
> 
> Yes, and I asked how you handled that const static node for VARYING
> which you answered, well - we could do some caching per type and I
> replied I didn't like that very much.
> 
> So - I see why it might be "convenient" but I still see no actual
> technical requirement to have a type for them.  I see you have
> canonicalization for symbolic ranges to integer ranges so
> you can have one for varying/undefined to integer ranges as well;
> just make that canonicalization take a type argument.
> 
>> By the way, the type for varying/undefined requires no space in the
>> value_range_base structure, as it is kept in the min/max fields which we
>> already use for VR_RANGE/VR_ANTI_RANGE.
> 
> You can as well populate those with actual canonical integer range values
> then.  [MIN, MAX] and [MAX, MIN] (or whatever we'd consider canonical for
> the empty range).
> 
> But as said, point me to the place where you need the type of VARYING.
> It should already exist since the current code does away without.
> 
> I refuse to uglify the current VRP with a not needed type-indexed cache
> for VARYING nodes just to make ranger intergation more happy.   Just
> ignore that extra 'type' argument in the ranger API then?

Ok, I see.  Your main beef is with the type cache.

How about we keep VARYING and UNDEFINED typeless until right before we 
call into the ranger.  At which point, we have can populate min/max 
because we have the tree_code and the type handy.  So right before we 
call into the ranger do:

	if (varying_p ())
	  foo->set_varying(TYPE);

This would avoid the type cache, and keep the ranger happy.

Another option, as you've hinted, would be to normalize VARYING into 
[MIN, MAX] before calling into the ranger.  The problem with this 
approach is that we would then need to change varying_p() to something like:

value_range_base::varying_p()
{
   return (m_kind == VR_VARYING ||
	  (vrp_val_is_min (m_min) && vrp_val_is_max (m_max));
}

Thus slowing everyone down (remember both range-ops and tree-vrp will 
share the implementation for varying_p).  Plus, I'd prefer to keep one 
representation for VARYING, that is m_kind == VR_VARYING.

Perhaps this last alternative would be more consistent-- never allowing 
types for VARYING, at the expense of the calls to vrp_val_is*.

Thoughts?

Aldy
Jeff Law July 3, 2019, 10:38 p.m. UTC | #7
On 7/3/19 3:46 AM, Aldy Hernandez wrote:
> 
> 
> On 7/2/19 4:17 PM, Jeff Law wrote:
>> On 7/1/19 2:52 AM, Aldy Hernandez wrote:
>>> As discussed before, this enforces types on undefined and varying, which
>>> makes everything more regular, and removes some special casing
>>> throughout range handling.
>>>
>>> The min/max fields will contain TYPE_MIN_VALUE and TYPE_MAX_VALUE, which
>>> will make it easy to get at the bounds of a range later on.  Since
>>> pointers don't have TYPE_MIN/MAX_VALUE, we are using build_zero_cst()
>>> and wide_int_to_tree(wi::max_value(precision)), for consistency.
>>> UNDEFINED is set similarly, though nobody should ever ask for anything
>>> except type() from it.  That is, no one should be asking for the bounds.
>>>
>>> There is one wrinkle, ipa-cp creates VR_VARYING ranges of floats,
>>> presumably to pass around state??  This causes value_range_base::type()
>>> and others to fail, even though I haven't actually seen a case of
>>> someone actually trying to set a VR_RANGE of a float.  For now, I've
>>> built a NOP_EXPR wrapper so type() works correctly.  The correct
>>> approach would probably be to avoid creating these varying/undefined
>>> ranges in ipa-cp, but I don't have enough ipa-cp-foo to do so.
>>> Suggestions welcome, if you don't like special casing this for ipa-cp.
>>> Or perhaps as a follow up.
>> No idea why we create ranges of floats from ipa-cp.  I presume it's
>> coming from propagate_vr_across_jump_function?  Or somewhere else?
> 
> I believe it was ipcp_vr_lattice::set_to_bottom, while changing an
> UNDEFINED to VARYING.  IMO, we shouldn't even have created UNDEFINED
> ranges of floats.  It's not like we can do anything with float ranges.
Note that propagate_vr_across_jump_function calls set_to_bottom ;-)

I zero'd in on that one because it did so when presented with something
that wasn't an INTEGRAL_TYPE_P and wasn't POINTER_TYPE_P.

I think digging into where these are coming from would be advisable.
Hell, if you've got types, abort the first time we try to create a range
for something that isn't an integer/pointer, then walk backwards.

I wouldn't be surprised if we find just a couple sites that are
responsible for these problems in ipa-cp.


>>> +/* Allocate a new range from the obstack and set it to VARYING for
>>> TYPE.  */
>>> +inline value_range *
>>> +type_range_cache::new_varying (tree type)
>>> +{
>>> +  /* Allocate memory.  */
>>> +  void *p = XOBNEW (&m_range_obj, value_range);
>>> +  /* Invoke the constructors on the memory using placement new.  */
>>> +  value_range *new_p = new (p) value_range ();
>>> +  /* Initialize it to varying.  */
>>> +  new_p->set_varying (type);
>>> +  return new_p;
>>> +}
>> So is placement new C++98/C++03 or C++11?
>>
>> If the former then we can use it, if the latter we probably can't since
>> we haven't stepped forward to C++11.
> 
> Google isn't cooperating here to narrow the specific C++ version, but
> I'm seeing some very old references to placement new, from the mid to
> the late 1990s.
> 
> FWIW, the above snippet shows no warnings when compiled with -std=c++-98
> -Wall.
Given it compiles in C++-98 mode, let's consider it a non-issue.

jeff
Richard Biener July 4, 2019, 10:33 a.m. UTC | #8
On Wed, Jul 3, 2019 at 2:17 PM Aldy Hernandez <aldyh@redhat.com> wrote:
>
> On 7/3/19 7:08 AM, Richard Biener wrote:
> > On Wed, Jul 3, 2019 at 11:19 AM Aldy Hernandez <aldyh@redhat.com> wrote:
> >>
> >>
> >>
> >> On 7/3/19 4:28 AM, Richard Biener wrote:
> >>> On Mon, Jul 1, 2019 at 10:52 AM Aldy Hernandez <aldyh@redhat.com> wrote:
> >>>>
> >>>> As discussed before, this enforces types on undefined and varying, which
> >>>> makes everything more regular, and removes some special casing
> >>>> throughout range handling.
> >>>
> >>> I don't like it too much given you need to introduce that "cache".
> >>>
> >>> Why do VARYING or UNDEFINED need a type?  Nobody should ever need
> >>> to ask for the type of a range anyhow - the type should be always that from
> >>> the context we're looking at (the SSA name the range is associated with,
> >>> the operation we are performing on one or two ranges, etc.).
> >>>
> >>> Thinking again about this it looks fundamentally wrong to associate
> >>> a type with the VARYING or UNDEFINED lattice states.
> >>
> >> We discussed this 2 weeks ago, and it was my understanding that we had
> >> an reached an agreement on the general approach.  Types on varying and
> >> undefined was the *first* thing I brought up.  Explanation quoted below.
> >
> > Yes, and I asked how you handled that const static node for VARYING
> > which you answered, well - we could do some caching per type and I
> > replied I didn't like that very much.
> >
> > So - I see why it might be "convenient" but I still see no actual
> > technical requirement to have a type for them.  I see you have
> > canonicalization for symbolic ranges to integer ranges so
> > you can have one for varying/undefined to integer ranges as well;
> > just make that canonicalization take a type argument.
> >
> >> By the way, the type for varying/undefined requires no space in the
> >> value_range_base structure, as it is kept in the min/max fields which we
> >> already use for VR_RANGE/VR_ANTI_RANGE.
> >
> > You can as well populate those with actual canonical integer range values
> > then.  [MIN, MAX] and [MAX, MIN] (or whatever we'd consider canonical for
> > the empty range).
> >
> > But as said, point me to the place where you need the type of VARYING.
> > It should already exist since the current code does away without.
> >
> > I refuse to uglify the current VRP with a not needed type-indexed cache
> > for VARYING nodes just to make ranger intergation more happy.   Just
> > ignore that extra 'type' argument in the ranger API then?
>
> Ok, I see.  Your main beef is with the type cache.

yes.

> How about we keep VARYING and UNDEFINED typeless until right before we
> call into the ranger.  At which point, we have can populate min/max
> because we have the tree_code and the type handy.  So right before we
> call into the ranger do:
>
>         if (varying_p ())
>           foo->set_varying(TYPE);
>
> This would avoid the type cache, and keep the ranger happy.

you cannot do set_varying on the static const range but instead you'd do

  value_range tem (*foo);
  if (varying_p ())
   tem->set_full_range (TYPE);

which I think we already do in some places.  Thus my question _where_
you actually need this.

> Another option, as you've hinted, would be to normalize VARYING into
> [MIN, MAX] before calling into the ranger.  The problem with this
> approach is that we would then need to change varying_p() to something like:
>
> value_range_base::varying_p()
> {
>    return (m_kind == VR_VARYING ||
>           (vrp_val_is_min (m_min) && vrp_val_is_max (m_max));
> }
>
> Thus slowing everyone down (remember both range-ops and tree-vrp will
> share the implementation for varying_p).  Plus, I'd prefer to keep one
> representation for VARYING, that is m_kind == VR_VARYING.

Yes, I wasn't really suggesting the above but have m_kind == VR_VARYING
and for ranges we want to support [MIN, MAX] on have those pre-populated
when we drop sth to varying.  For pointers this range is pointless for example
and the static const could simply leave it unset as well.

> Perhaps this last alternative would be more consistent-- never allowing
> types for VARYING, at the expense of the calls to vrp_val_is*.

It doesn't work for the static const varying though.

Another way to circumvent the issue is to make get_value_range
return NULL if it doesn't have any range in the lattice (thus the
static const varying goes away).  Then callers need to deal with this
of course.

Just teach ranger about different kind of ranges?

Richard.

> Thoughts?
>
> Aldy
Aldy Hernandez July 9, 2019, 7:28 a.m. UTC | #9
On 7/4/19 6:33 AM, Richard Biener wrote:
> On Wed, Jul 3, 2019 at 2:17 PM Aldy Hernandez <aldyh@redhat.com> wrote:
>>
>> On 7/3/19 7:08 AM, Richard Biener wrote:
>>> On Wed, Jul 3, 2019 at 11:19 AM Aldy Hernandez <aldyh@redhat.com> wrote:
> 
>> How about we keep VARYING and UNDEFINED typeless until right before we
>> call into the ranger.  At which point, we have can populate min/max
>> because we have the tree_code and the type handy.  So right before we
>> call into the ranger do:
>>
>>          if (varying_p ())
>>            foo->set_varying(TYPE);
>>
>> This would avoid the type cache, and keep the ranger happy.
> 
> you cannot do set_varying on the static const range but instead you'd do
> 
>    value_range tem (*foo);
>    if (varying_p ())
>     tem->set_full_range (TYPE);
> 
> which I think we already do in some places.  Thus my question _where_
> you actually need this.

Basically, everywhere.  By having a type for varying/undefined, we don't 
have to special case anything.  Sure, we could for example, special case 
the invert operation for undefined / varying.  And we could special case 
everything dealing with ranges to handle varying and undefined, but why? 
  We could also pass a type argument everywhere, but that's just ugly. 
However, I do understand your objection to the type cache.

How about the attached approach?  Set the type for varying/undefined 
when we know it, while avoiding touching the CONST varying.  Then right 
before calling the ranger, pass down a new varying node with min/max for 
any varyings that were still typeless until that point.

I have taken care of never adding a set_varying() that was not already 
there.  Would this keep the const happy?

Technically we don't need to set varying/undef types for every instance 
in VRP, but we need it at least for the code that will be shared with 
range-ops (extract_range_from_multiplicative_op, union, intersect, etc). 
  I just figured if we have the information, might as well set it for 
consistency.

If you like this approach, I can rebase the other patches that depend on 
this one.

Aldy
Richard Biener July 9, 2019, 9:56 a.m. UTC | #10
On Tue, Jul 9, 2019 at 9:28 AM Aldy Hernandez <aldyh@redhat.com> wrote:
>
>
>
> On 7/4/19 6:33 AM, Richard Biener wrote:
> > On Wed, Jul 3, 2019 at 2:17 PM Aldy Hernandez <aldyh@redhat.com> wrote:
> >>
> >> On 7/3/19 7:08 AM, Richard Biener wrote:
> >>> On Wed, Jul 3, 2019 at 11:19 AM Aldy Hernandez <aldyh@redhat.com> wrote:
> >
> >> How about we keep VARYING and UNDEFINED typeless until right before we
> >> call into the ranger.  At which point, we have can populate min/max
> >> because we have the tree_code and the type handy.  So right before we
> >> call into the ranger do:
> >>
> >>          if (varying_p ())
> >>            foo->set_varying(TYPE);
> >>
> >> This would avoid the type cache, and keep the ranger happy.
> >
> > you cannot do set_varying on the static const range but instead you'd do
> >
> >    value_range tem (*foo);
> >    if (varying_p ())
> >     tem->set_full_range (TYPE);
> >
> > which I think we already do in some places.  Thus my question _where_
> > you actually need this.
>
> Basically, everywhere.  By having a type for varying/undefined, we don't
> have to special case anything.  Sure, we could for example, special case
> the invert operation for undefined / varying.  And we could special case
> everything dealing with ranges to handle varying and undefined, but why?
>   We could also pass a type argument everywhere, but that's just ugly.
> However, I do understand your objection to the type cache.
>
> How about the attached approach?  Set the type for varying/undefined
> when we know it, while avoiding touching the CONST varying.  Then right
> before calling the ranger, pass down a new varying node with min/max for
> any varyings that were still typeless until that point.
>
> I have taken care of never adding a set_varying() that was not already
> there.  Would this keep the const happy?
>
> Technically we don't need to set varying/undef types for every instance
> in VRP, but we need it at least for the code that will be shared with
> range-ops (extract_range_from_multiplicative_op, union, intersect, etc).
>   I just figured if we have the information, might as well set it for
> consistency.
>
> If you like this approach, I can rebase the other patches that depend on
> this one.

OK, so I went ant checked what you do for class irange which has
a type but no kind member (but constructors with a kind).  It also
uses wide_int members for storage.  For a pure integer constant
range representation this represents somewhat odd choices;  I'd
have elided the m_type member completely here, it seems fully
redundant.  Only range operations need to be carried out in a
specific type (what I was suggesting above).  Even the precision
encoded in the wide_int members is redundant then (I'd have
expected widest_int here and trailing-wide-ints for optimizing
storage).

Then class range_operator looks a bit strange to me (looking
just at the header).  Ugh, so it is all virtual because
you have one instance per tree code.  What an odd choice.
Why didn't you simply go with passing tree_code  (and type!)
to fold_range/op_range?  The API also seems to be oddly
constrained to binary ops.  Anyway, the way you build
the operator table requires an awful lot of global C++ ctor
invocations, sth we generally try to avoid.  But I'm getting
into too many details here.

So - to answer your question above, I'd like you to pass down
a type to operations.  Because that's what is fundamentally
required - a range doesn't have a "type" and the current
value_range_base doesn't fall into the trap of needing one.

Richard.

>
> Aldy
Andrew MacLeod July 16, 2019, 6:37 p.m. UTC | #11
On 7/9/19 5:56 AM, Richard Biener wrote:
> On Tue, Jul 9, 2019 at 9:28 AM Aldy Hernandez <aldyh@redhat.com> wrote:
>>
>>
>> On 7/4/19 6:33 AM, Richard Biener wrote:
>>> On Wed, Jul 3, 2019 at 2:17 PM Aldy Hernandez <aldyh@redhat.com> wrote:
>>>> On 7/3/19 7:08 AM, Richard Biener wrote:
>>>>> On Wed, Jul 3, 2019 at 11:19 AM Aldy Hernandez <aldyh@redhat.com> wrote:
>>>> How about we keep VARYING and UNDEFINED typeless until right before we
>>>> call into the ranger.  At which point, we have can populate min/max
>>>> because we have the tree_code and the type handy.  So right before we
>>>> call into the ranger do:
>>>>
>>>>           if (varying_p ())
>>>>             foo->set_varying(TYPE);
>>>>
>>>> This would avoid the type cache, and keep the ranger happy.
>>> you cannot do set_varying on the static const range but instead you'd do
>>>
>>>     value_range tem (*foo);
>>>     if (varying_p ())
>>>      tem->set_full_range (TYPE);
>>>
>>> which I think we already do in some places.  Thus my question _where_
>>> you actually need this.
>> Basically, everywhere.  By having a type for varying/undefined, we don't
>> have to special case anything.  Sure, we could for example, special case
>> the invert operation for undefined / varying.  And we could special case
>> everything dealing with ranges to handle varying and undefined, but why?
>>    We could also pass a type argument everywhere, but that's just ugly.
>> However, I do understand your objection to the type cache.
>>
>> How about the attached approach?  Set the type for varying/undefined
>> when we know it, while avoiding touching the CONST varying.  Then right
>> before calling the ranger, pass down a new varying node with min/max for
>> any varyings that were still typeless until that point.
>>
>> I have taken care of never adding a set_varying() that was not already
>> there.  Would this keep the const happy?
>>
>> Technically we don't need to set varying/undef types for every instance
>> in VRP, but we need it at least for the code that will be shared with
>> range-ops (extract_range_from_multiplicative_op, union, intersect, etc).
>>    I just figured if we have the information, might as well set it for
>> consistency.
>>
>> If you like this approach, I can rebase the other patches that depend on
>> this one.
> OK, so I went ant checked what you do for class irange which has
> a type but no kind member (but constructors with a kind).  It also
> uses wide_int members for storage.  For a pure integer constant
> range representation this represents somewhat odd choices;  I'd
> have elided the m_type member completely here, it seems fully
> redundant.  Only range operations need to be carried out in a
> specific type (what I was suggesting above).  Even the precision
> encoded in the wide_int members is redundant then (I'd have
> expected widest_int here and trailing-wide-ints for optimizing
> storage).

What irange contains internally is a bit arbitrary.  It's really an API 
for managing a set of subranges.  We could have used trees internally 
just as easily, then we wouldnt need a type field. Since we were doing 
lots of operations, rather than going back and forth from trees all the 
time, we just used the underlying wide_int directly.   we could have 
fiddle farted around with HOST_WIDE_INT or whatever, but wide_int is 
there, has all the operations, and it works fine. so thats what it 
currently is on the branch.

We are treating a range object as a unique self contained object. 
Therefore, the range has a type so we know how to print it, and can 
confirm before any operation that the ranges being operated on are 
appropriately matched.  We found and opened bugzillas over the past 
couple years for places where our code caught bugs because a range was 
created and then operated on in a way that was not compatible with 
another range.  I think there is a still an open one against ada(?) 
where the switch and case  are different precision.

 From my point of view, a range object is similar to a tree node. A tree 
node has the bits to indicate what the value is, but also associates a 
type with those bits within the same object.  This is less error prone 
than passing around the bits and the type separately.  As ranges are 
starting to be used in many places outside of VRP, we should do the same 
thing with ranges.  WIth value_range it would actually be free since 
there is already a tree for the bounds already which contains the type.





> to fold_range/op_range?  The API also seems to be oddly
> constrained to binary ops.  Anyway, the way you build
> the operator table requires an awful lot of global C++ ctor
> invocations, sth we generally try to avoid.  But I'm getting
> into too many details here.

Its "oddly constrained" because what you are looking at  is just the 
standardized unary/binary ops code.. ie the equivalent of 
extract_range_from_binary_expr() and extract_range_from_unary_expr().   
The other ops we care about have specialized requirements, like PHIs  
and the arbitrary numbers of parameters in a call, or anything less 
common than one or two operands.    You are just not seeing those parts.

>
> So - to answer your question above, I'd like you to pass down
> a type to operations.  Because that's what is fundamentally
> required - a range doesn't have a "type" and the current
> value_range_base doesn't fall into the trap of needing one.
>
> Richard.
>

Why is having a type associated with the data a "trap"?   Perhaps the 
old VRP lattice idea didn't need a type with the UNDEFINED and VARYING 
lattice values, but we're moving past lattice values and into a realm 
where we have ranges as useful things outside of VRP, and trying to 
shoehorn lattice values does not seem appropriate anymore.

I looked at implementing range-ops without a type in the range, and we 
end up passing 2 parameters everywhere each time we do anything with a 
range.  This doubled the number of parameters in most routines, and when 
we had chains of calls, we were simply passing the type along with the 
range.  It seems archaic to be constantly passing information around 
instead of associating it with the range itself.  Its just another place 
for an error to creep in.. Aldy found a place where we were creating 
varying nodes for floats IIRC.. the type checking in the range 
operations caught it precisely because the range returned wasn't the 
type it was assumed to be.

That said. I believe we can do away with the need for a type with an 
'UNDEFINED' range.  That is not too onerous, and there doesn't really 
seem to be too many ripple effect from have a typeless undefined range. 
I think we can contain that, and prevents us from having to add a hack 
to value_range for that situation.

VARYING  is another situation completely.  We adopted the term 'varying' 
for ease of compatibility with VRP, but varying is simply a range going 
from MIN to MAX for a type.  Removing the type from that would then 
require we always pass a type in with every range which gets back to  
doubling the number of of parameters everywhere for no good reason.

If we standardize value_range so that MIN and MAX are set for varying, 
then everything works very smoothly, we can make value_range and irange 
interchangeable and facilitate getting the range ops code into trunk.

It seems like the only reason we cant do this right now is the CONST 
varying nodes that are returned from get_value_range().
Looking at that routine, it seems there are only 2 cases where that can 
be returned
  1) we ask for an ssa-name beyond the end of the local ssa-name vector
  2) once values_propagated is set  to true and an ssa-name has no entry.

Both seem pretty straightforward to fix...
1)  if we ask for an ssa-Name beyond the end, simply reallocate the 
vector to be big enough.   I doubt this will trigger a lot, and if we 
initially allocate it with room for an extra 10% names it'll probably 
never trigger.  Or pick whatever number seems appropriate.
2) if values_propagated is true, simply allocate a node for the 
ssa-name,  set it to varying and return it. THIs accomplishes the same 
thing.

the number of additional nodes we allocate will be pretty minimal, and 
it will never exceed the number of ssa-names. Then we never have to 
worry about having CONST set for a varying node either. I see places 
where there is special processing to avoid calling set_varying  because 
we dont know if the vbalue_range in hand is in constant memory and would 
cause the compiler to trap. This seems like a really bad situation, and 
it would eliminate that issue.  We can also then eliminate the places 
where VARYING is expanded to min/max for a given type.. instead we can 
just pick up min/max directly.    It seems much cleaner overall.

Something like the simple attached patch would resolve that issue, and 
remove any lurking concerns/bugs with the CONST code.

Then we can associate a type with varying, canonicalize it consistently, 
and set MIN/MAX appropriately.   This will give us full 
interchangeability between irange and value_range, establish a 
solid/consistent API,  and we can move on to the next thing :-)

Does this not seem reasonable?

Andrew
Index: vr-values.c
===================================================================
--- vr-values.c	(revision 272242)
+++ vr-values.c	(working copy)
@@ -91,18 +91,28 @@
      We should get here at most from the substitute-and-fold stage which
      will never try to change values.  */
   if (ver >= num_vr_values)
-    return CONST_CAST (value_range *, &vr_const_varying);
+    {
+      unsigned int old_sz = num_vr_values;
+      num_vr_values = num_ssa_names + num_ssa_names / 10;
+      vr_value = XRESIZEVEC (value_range *, vr_value, num_vr_values);
+      for ( ; old_sz < num_vr_values; old_sz++)
+        vr_value [old_sz] = NULL;
+    }
 
   vr = vr_value[ver];
   if (vr)
     return vr;
 
-  /* After propagation finished do not allocate new value-ranges.  */
+  /* Create a default value range.  */
+  vr_value[ver] = vr = vrp_value_range_pool.allocate ();
+
+  /* After propagation finished return varying.  */
   if (values_propagated)
-    return CONST_CAST (value_range *, &vr_const_varying);
+    {
+      vr->set_varying ();
+      return vr;
+    }
 
-  /* Create a default value range.  */
-  vr_value[ver] = vr = vrp_value_range_pool.allocate ();
   vr->set_undefined ();
 
   /* If VAR is a default definition of a parameter, the variable can
@@ -1920,7 +1930,7 @@
 vr_values::vr_values () : vrp_value_range_pool ("Tree VRP value ranges")
 {
   values_propagated = false;
-  num_vr_values = num_ssa_names;
+  num_vr_values = num_ssa_names + num_ssa_names / 10;   
   vr_value = XCNEWVEC (value_range *, num_vr_values);
   vr_phi_edge_counts = XCNEWVEC (int, num_ssa_names);
   bitmap_obstack_initialize (&vrp_equiv_obstack);
Aldy Hernandez July 22, 2019, 3:36 p.m. UTC | #12
On 7/16/19 2:37 PM, Andrew MacLeod wrote:
> On 7/9/19 5:56 AM, Richard Biener wrote:
>> On Tue, Jul 9, 2019 at 9:28 AM Aldy Hernandez <aldyh@redhat.com> wrote:
>>>
>>>
>>> On 7/4/19 6:33 AM, Richard Biener wrote:
>>>> On Wed, Jul 3, 2019 at 2:17 PM Aldy Hernandez <aldyh@redhat.com> wrote:
>>>>> On 7/3/19 7:08 AM, Richard Biener wrote:
>>>>>> On Wed, Jul 3, 2019 at 11:19 AM Aldy Hernandez <aldyh@redhat.com> 
>>>>>> wrote:
>>>>> How about we keep VARYING and UNDEFINED typeless until right before we
>>>>> call into the ranger.  At which point, we have can populate min/max
>>>>> because we have the tree_code and the type handy.  So right before we
>>>>> call into the ranger do:
>>>>>
>>>>>           if (varying_p ())
>>>>>             foo->set_varying(TYPE);
>>>>>
>>>>> This would avoid the type cache, and keep the ranger happy.
>>>> you cannot do set_varying on the static const range but instead 
>>>> you'd do
>>>>
>>>>     value_range tem (*foo);
>>>>     if (varying_p ())
>>>>      tem->set_full_range (TYPE);
>>>>
>>>> which I think we already do in some places.  Thus my question _where_
>>>> you actually need this.
>>> Basically, everywhere.  By having a type for varying/undefined, we don't
>>> have to special case anything.  Sure, we could for example, special case
>>> the invert operation for undefined / varying.  And we could special case
>>> everything dealing with ranges to handle varying and undefined, but why?
>>>    We could also pass a type argument everywhere, but that's just ugly.
>>> However, I do understand your objection to the type cache.
>>>
>>> How about the attached approach?  Set the type for varying/undefined
>>> when we know it, while avoiding touching the CONST varying.  Then right
>>> before calling the ranger, pass down a new varying node with min/max for
>>> any varyings that were still typeless until that point.
>>>
>>> I have taken care of never adding a set_varying() that was not already
>>> there.  Would this keep the const happy?
>>>
>>> Technically we don't need to set varying/undef types for every instance
>>> in VRP, but we need it at least for the code that will be shared with
>>> range-ops (extract_range_from_multiplicative_op, union, intersect, etc).
>>>    I just figured if we have the information, might as well set it for
>>> consistency.
>>>
>>> If you like this approach, I can rebase the other patches that depend on
>>> this one.
>> OK, so I went ant checked what you do for class irange which has
>> a type but no kind member (but constructors with a kind).  It also
>> uses wide_int members for storage.  For a pure integer constant
>> range representation this represents somewhat odd choices;  I'd
>> have elided the m_type member completely here, it seems fully
>> redundant.  Only range operations need to be carried out in a
>> specific type (what I was suggesting above).  Even the precision
>> encoded in the wide_int members is redundant then (I'd have
>> expected widest_int here and trailing-wide-ints for optimizing
>> storage).
> 
> What irange contains internally is a bit arbitrary.  It's really an API 
> for managing a set of subranges.  We could have used trees internally 
> just as easily, then we wouldnt need a type field. Since we were doing 
> lots of operations, rather than going back and forth from trees all the 
> time, we just used the underlying wide_int directly.   we could have 
> fiddle farted around with HOST_WIDE_INT or whatever, but wide_int is 
> there, has all the operations, and it works fine. so thats what it 
> currently is on the branch.
> 
> We are treating a range object as a unique self contained object. 
> Therefore, the range has a type so we know how to print it, and can 
> confirm before any operation that the ranges being operated on are 
> appropriately matched.  We found and opened bugzillas over the past 
> couple years for places where our code caught bugs because a range was 
> created and then operated on in a way that was not compatible with 
> another range.  I think there is a still an open one against ada(?) 
> where the switch and case  are different precision.
> 
>  From my point of view, a range object is similar to a tree node. A tree 
> node has the bits to indicate what the value is, but also associates a 
> type with those bits within the same object.  This is less error prone 
> than passing around the bits and the type separately.  As ranges are 
> starting to be used in many places outside of VRP, we should do the same 
> thing with ranges.  WIth value_range it would actually be free since 
> there is already a tree for the bounds already which contains the type.
> 
> 
> 
> 
> 
>> to fold_range/op_range?  The API also seems to be oddly
>> constrained to binary ops.  Anyway, the way you build
>> the operator table requires an awful lot of global C++ ctor
>> invocations, sth we generally try to avoid.  But I'm getting
>> into too many details here.
> 
> Its "oddly constrained" because what you are looking at  is just the 
> standardized unary/binary ops code.. ie the equivalent of 
> extract_range_from_binary_expr() and extract_range_from_unary_expr(). 
> The other ops we care about have specialized requirements, like PHIs and 
> the arbitrary numbers of parameters in a call, or anything less common 
> than one or two operands.    You are just not seeing those parts.
> 
>>
>> So - to answer your question above, I'd like you to pass down
>> a type to operations.  Because that's what is fundamentally
>> required - a range doesn't have a "type" and the current
>> value_range_base doesn't fall into the trap of needing one.
>>
>> Richard.
>>
> 
> Why is having a type associated with the data a "trap"?   Perhaps the 
> old VRP lattice idea didn't need a type with the UNDEFINED and VARYING 
> lattice values, but we're moving past lattice values and into a realm 
> where we have ranges as useful things outside of VRP, and trying to 
> shoehorn lattice values does not seem appropriate anymore.
> 
> I looked at implementing range-ops without a type in the range, and we 
> end up passing 2 parameters everywhere each time we do anything with a 
> range.  This doubled the number of parameters in most routines, and when 
> we had chains of calls, we were simply passing the type along with the 
> range.  It seems archaic to be constantly passing information around 
> instead of associating it with the range itself.  Its just another place 
> for an error to creep in.. Aldy found a place where we were creating 
> varying nodes for floats IIRC.. the type checking in the range 
> operations caught it precisely because the range returned wasn't the 
> type it was assumed to be.
> 
> That said. I believe we can do away with the need for a type with an 
> 'UNDEFINED' range.  That is not too onerous, and there doesn't really 
> seem to be too many ripple effect from have a typeless undefined range. 
> I think we can contain that, and prevents us from having to add a hack 
> to value_range for that situation.
> 
> VARYING  is another situation completely.  We adopted the term 'varying' 
> for ease of compatibility with VRP, but varying is simply a range going 
> from MIN to MAX for a type.  Removing the type from that would then 
> require we always pass a type in with every range which gets back to 
> doubling the number of of parameters everywhere for no good reason.
> 
> If we standardize value_range so that MIN and MAX are set for varying, 
> then everything works very smoothly, we can make value_range and irange 
> interchangeable and facilitate getting the range ops code into trunk.
> 
> It seems like the only reason we cant do this right now is the CONST 
> varying nodes that are returned from get_value_range().
> Looking at that routine, it seems there are only 2 cases where that can 
> be returned
>   1) we ask for an ssa-name beyond the end of the local ssa-name vector
>   2) once values_propagated is set  to true and an ssa-name has no entry.
> 
> Both seem pretty straightforward to fix...
> 1)  if we ask for an ssa-Name beyond the end, simply reallocate the 
> vector to be big enough.   I doubt this will trigger a lot, and if we 
> initially allocate it with room for an extra 10% names it'll probably 
> never trigger.  Or pick whatever number seems appropriate.
> 2) if values_propagated is true, simply allocate a node for the 
> ssa-name,  set it to varying and return it. THIs accomplishes the same 
> thing.
> 
> the number of additional nodes we allocate will be pretty minimal, and 
> it will never exceed the number of ssa-names. Then we never have to 
> worry about having CONST set for a varying node either. I see places 
> where there is special processing to avoid calling set_varying  because 
> we dont know if the vbalue_range in hand is in constant memory and would 
> cause the compiler to trap. This seems like a really bad situation, and 
> it would eliminate that issue.  We can also then eliminate the places 
> where VARYING is expanded to min/max for a given type.. instead we can 
> just pick up min/max directly.    It seems much cleaner overall.
> 
> Something like the simple attached patch would resolve that issue, and 
> remove any lurking concerns/bugs with the CONST code.
> 
> Then we can associate a type with varying, canonicalize it consistently, 
> and set MIN/MAX appropriately.   This will give us full 
> interchangeability between irange and value_range, establish a 
> solid/consistent API,  and we can move on to the next thing :-)
> 
> Does this not seem reasonable?

The attached patch takes this one step further, and adds types to 
varying (not undefined), while getting rid of vr_const_varying 
altogether.  It depends on the canonicalization patch, but I can rebase 
it to come before it if desired.  Either 01 had to depend on 02 or vice 
versa.  I had to choose :).

There is a small caveat in ipa-cp, which seems to want to create 
varyings of floats, vectors, structures, and other nonsensical things. 
I tried to plug the callers everywhere, but eventually found it easier 
with the included one-liner:

-  m_vr.set_varying ();
+  /* ?? We create all sorts of VARYING ranges for floats, structures,
+     and other types for which we obviously cannot handle as ranges.
+     We should probably avoid handling non pointers and non integers
+     throughout, but it's simpler to create a sensible VARYING here
+     and let the lattice propagate.  */
+  m_vr.set_varying (integer_type_node);

As mentioned before, I'll add ChangeLog entries if we agree on the approach.

What do y'all think?

Tested on x86-64 Linux.

Aldy
Jeff Law July 22, 2019, 11:59 p.m. UTC | #13
On 7/16/19 12:37 PM, Andrew MacLeod wrote:
> On 7/9/19 5:56 AM, Richard Biener wrote:
>> On Tue, Jul 9, 2019 at 9:28 AM Aldy Hernandez <aldyh@redhat.com> wrote:
>>>
>>>
>>> On 7/4/19 6:33 AM, Richard Biener wrote:
>>>> On Wed, Jul 3, 2019 at 2:17 PM Aldy Hernandez <aldyh@redhat.com> wrote:
>>>>> On 7/3/19 7:08 AM, Richard Biener wrote:
>>>>>> On Wed, Jul 3, 2019 at 11:19 AM Aldy Hernandez <aldyh@redhat.com>
>>>>>> wrote:
>>>>> How about we keep VARYING and UNDEFINED typeless until right before we
>>>>> call into the ranger.  At which point, we have can populate min/max
>>>>> because we have the tree_code and the type handy.  So right before we
>>>>> call into the ranger do:
>>>>>
>>>>>           if (varying_p ())
>>>>>             foo->set_varying(TYPE);
>>>>>
>>>>> This would avoid the type cache, and keep the ranger happy.
>>>> you cannot do set_varying on the static const range but instead
>>>> you'd do
>>>>
>>>>     value_range tem (*foo);
>>>>     if (varying_p ())
>>>>      tem->set_full_range (TYPE);
>>>>
>>>> which I think we already do in some places.  Thus my question _where_
>>>> you actually need this.
>>> Basically, everywhere.  By having a type for varying/undefined, we don't
>>> have to special case anything.  Sure, we could for example, special case
>>> the invert operation for undefined / varying.  And we could special case
>>> everything dealing with ranges to handle varying and undefined, but why?
>>>    We could also pass a type argument everywhere, but that's just ugly.
>>> However, I do understand your objection to the type cache.
>>>
>>> How about the attached approach?  Set the type for varying/undefined
>>> when we know it, while avoiding touching the CONST varying.  Then right
>>> before calling the ranger, pass down a new varying node with min/max for
>>> any varyings that were still typeless until that point.
>>>
>>> I have taken care of never adding a set_varying() that was not already
>>> there.  Would this keep the const happy?
>>>
>>> Technically we don't need to set varying/undef types for every instance
>>> in VRP, but we need it at least for the code that will be shared with
>>> range-ops (extract_range_from_multiplicative_op, union, intersect, etc).
>>>    I just figured if we have the information, might as well set it for
>>> consistency.
>>>
>>> If you like this approach, I can rebase the other patches that depend on
>>> this one.
>> OK, so I went ant checked what you do for class irange which has
>> a type but no kind member (but constructors with a kind).  It also
>> uses wide_int members for storage.  For a pure integer constant
>> range representation this represents somewhat odd choices;  I'd
>> have elided the m_type member completely here, it seems fully
>> redundant.  Only range operations need to be carried out in a
>> specific type (what I was suggesting above).  Even the precision
>> encoded in the wide_int members is redundant then (I'd have
>> expected widest_int here and trailing-wide-ints for optimizing
>> storage).
> 
> What irange contains internally is a bit arbitrary.  It's really an API
> for managing a set of subranges.  We could have used trees internally
> just as easily, then we wouldnt need a type field. Since we were doing
> lots of operations, rather than going back and forth from trees all the
> time, we just used the underlying wide_int directly.   we could have
> fiddle farted around with HOST_WIDE_INT or whatever, but wide_int is
> there, has all the operations, and it works fine. so thats what it
> currently is on the branch.
> 
> We are treating a range object as a unique self contained object.
> Therefore, the range has a type so we know how to print it, and can
> confirm before any operation that the ranges being operated on are
> appropriately matched.  We found and opened bugzillas over the past
> couple years for places where our code caught bugs because a range was
> created and then operated on in a way that was not compatible with
> another range.  I think there is a still an open one against ada(?)
> where the switch and case  are different precision.
> 
> From my point of view, a range object is similar to a tree node. A tree
> node has the bits to indicate what the value is, but also associates a
> type with those bits within the same object.  This is less error prone
> than passing around the bits and the type separately.  As ranges are
> starting to be used in many places outside of VRP, we should do the same
> thing with ranges.  WIth value_range it would actually be free since
> there is already a tree for the bounds already which contains the type.
> 
> 
> 
> 
> 
>> to fold_range/op_range?  The API also seems to be oddly
>> constrained to binary ops.  Anyway, the way you build
>> the operator table requires an awful lot of global C++ ctor
>> invocations, sth we generally try to avoid.  But I'm getting
>> into too many details here.
> 
> Its "oddly constrained" because what you are looking at  is just the
> standardized unary/binary ops code.. ie the equivalent of
> extract_range_from_binary_expr() and extract_range_from_unary_expr().  
> The other ops we care about have specialized requirements, like PHIs 
> and the arbitrary numbers of parameters in a call, or anything less
> common than one or two operands.    You are just not seeing those parts.
> 
>>
>> So - to answer your question above, I'd like you to pass down
>> a type to operations.  Because that's what is fundamentally
>> required - a range doesn't have a "type" and the current
>> value_range_base doesn't fall into the trap of needing one.
>>
>> Richard.
>>
> 
> Why is having a type associated with the data a "trap"?   Perhaps the
> old VRP lattice idea didn't need a type with the UNDEFINED and VARYING
> lattice values, but we're moving past lattice values and into a realm
> where we have ranges as useful things outside of VRP, and trying to
> shoehorn lattice values does not seem appropriate anymore.
> 
> I looked at implementing range-ops without a type in the range, and we
> end up passing 2 parameters everywhere each time we do anything with a
> range.  This doubled the number of parameters in most routines, and when
> we had chains of calls, we were simply passing the type along with the
> range.  It seems archaic to be constantly passing information around
> instead of associating it with the range itself.  Its just another place
> for an error to creep in.. Aldy found a place where we were creating
> varying nodes for floats IIRC.. the type checking in the range
> operations caught it precisely because the range returned wasn't the
> type it was assumed to be.
> 
> That said. I believe we can do away with the need for a type with an
> 'UNDEFINED' range.  That is not too onerous, and there doesn't really
> seem to be too many ripple effect from have a typeless undefined range.
> I think we can contain that, and prevents us from having to add a hack
> to value_range for that situation.
> 
> VARYING  is another situation completely.  We adopted the term 'varying'
> for ease of compatibility with VRP, but varying is simply a range going
> from MIN to MAX for a type.  Removing the type from that would then
> require we always pass a type in with every range which gets back to 
> doubling the number of of parameters everywhere for no good reason.
> 
> If we standardize value_range so that MIN and MAX are set for varying,
> then everything works very smoothly, we can make value_range and irange
> interchangeable and facilitate getting the range ops code into trunk.
> 
> It seems like the only reason we cant do this right now is the CONST
> varying nodes that are returned from get_value_range().
> Looking at that routine, it seems there are only 2 cases where that can
> be returned
>  1) we ask for an ssa-name beyond the end of the local ssa-name vector
>  2) once values_propagated is set  to true and an ssa-name has no entry.
> 
> Both seem pretty straightforward to fix...
> 1)  if we ask for an ssa-Name beyond the end, simply reallocate the
> vector to be big enough.   I doubt this will trigger a lot, and if we
> initially allocate it with room for an extra 10% names it'll probably
> never trigger.  Or pick whatever number seems appropriate.
> 2) if values_propagated is true, simply allocate a node for the
> ssa-name,  set it to varying and return it. THIs accomplishes the same
> thing.
> 
> the number of additional nodes we allocate will be pretty minimal, and
> it will never exceed the number of ssa-names. Then we never have to
> worry about having CONST set for a varying node either. I see places
> where there is special processing to avoid calling set_varying  because
> we dont know if the vbalue_range in hand is in constant memory and would
> cause the compiler to trap. This seems like a really bad situation, and
> it would eliminate that issue.  We can also then eliminate the places
> where VARYING is expanded to min/max for a given type.. instead we can
> just pick up min/max directly.    It seems much cleaner overall.
> 
> Something like the simple attached patch would resolve that issue, and
> remove any lurking concerns/bugs with the CONST code.
> 
> Then we can associate a type with varying, canonicalize it consistently,
> and set MIN/MAX appropriately.   This will give us full
> interchangeability between irange and value_range, establish a
> solid/consistent API,  and we can move on to the next thing :-)
> 
> Does this not seem reasonable?
One might even claim that this patch in and of itself is a step forward
independent of all the other work going on.

THe first time I saw that code when I copied it into vr-values I thought
it was ripe for fixing, but never got to it.  As it stands, it's a hack,
no more, no less and it's a hack that we can easily remove and do
something better.


Jeff
Richard Biener July 23, 2019, 9:33 a.m. UTC | #14
On Tue, Jul 16, 2019 at 8:37 PM Andrew MacLeod <amacleod@redhat.com> wrote:
>
> On 7/9/19 5:56 AM, Richard Biener wrote:
> > On Tue, Jul 9, 2019 at 9:28 AM Aldy Hernandez <aldyh@redhat.com> wrote:
> >>
> >>
> >> On 7/4/19 6:33 AM, Richard Biener wrote:
> >>> On Wed, Jul 3, 2019 at 2:17 PM Aldy Hernandez <aldyh@redhat.com> wrote:
> >>>> On 7/3/19 7:08 AM, Richard Biener wrote:
> >>>>> On Wed, Jul 3, 2019 at 11:19 AM Aldy Hernandez <aldyh@redhat.com> wrote:
> >>>> How about we keep VARYING and UNDEFINED typeless until right before we
> >>>> call into the ranger.  At which point, we have can populate min/max
> >>>> because we have the tree_code and the type handy.  So right before we
> >>>> call into the ranger do:
> >>>>
> >>>>           if (varying_p ())
> >>>>             foo->set_varying(TYPE);
> >>>>
> >>>> This would avoid the type cache, and keep the ranger happy.
> >>> you cannot do set_varying on the static const range but instead you'd do
> >>>
> >>>     value_range tem (*foo);
> >>>     if (varying_p ())
> >>>      tem->set_full_range (TYPE);
> >>>
> >>> which I think we already do in some places.  Thus my question _where_
> >>> you actually need this.
> >> Basically, everywhere.  By having a type for varying/undefined, we don't
> >> have to special case anything.  Sure, we could for example, special case
> >> the invert operation for undefined / varying.  And we could special case
> >> everything dealing with ranges to handle varying and undefined, but why?
> >>    We could also pass a type argument everywhere, but that's just ugly.
> >> However, I do understand your objection to the type cache.
> >>
> >> How about the attached approach?  Set the type for varying/undefined
> >> when we know it, while avoiding touching the CONST varying.  Then right
> >> before calling the ranger, pass down a new varying node with min/max for
> >> any varyings that were still typeless until that point.
> >>
> >> I have taken care of never adding a set_varying() that was not already
> >> there.  Would this keep the const happy?
> >>
> >> Technically we don't need to set varying/undef types for every instance
> >> in VRP, but we need it at least for the code that will be shared with
> >> range-ops (extract_range_from_multiplicative_op, union, intersect, etc).
> >>    I just figured if we have the information, might as well set it for
> >> consistency.
> >>
> >> If you like this approach, I can rebase the other patches that depend on
> >> this one.
> > OK, so I went ant checked what you do for class irange which has
> > a type but no kind member (but constructors with a kind).  It also
> > uses wide_int members for storage.  For a pure integer constant
> > range representation this represents somewhat odd choices;  I'd
> > have elided the m_type member completely here, it seems fully
> > redundant.  Only range operations need to be carried out in a
> > specific type (what I was suggesting above).  Even the precision
> > encoded in the wide_int members is redundant then (I'd have
> > expected widest_int here and trailing-wide-ints for optimizing
> > storage).
>
> What irange contains internally is a bit arbitrary.  It's really an API
> for managing a set of subranges.  We could have used trees internally
> just as easily, then we wouldnt need a type field. Since we were doing
> lots of operations, rather than going back and forth from trees all the
> time, we just used the underlying wide_int directly.   we could have
> fiddle farted around with HOST_WIDE_INT or whatever, but wide_int is
> there, has all the operations, and it works fine. so thats what it
> currently is on the branch.

But a range has no type.  Period.  The fact that with the in-tree implementation
there's access to a type is artificial.  All workers in the in-tree
implementation
get a type for the operation they carry out.

But somehow irange doesn't get that.

In fact, an irange shouldn't be bound to any type, and the usual
"carry out multiplication of two integer typed vars with the result
in integer" on irange should be multiply two iranges (with unbound
result) plus a "truncate to integer type" operation.

>
> We are treating a range object as a unique self contained object.
> Therefore, the range has a type so we know how to print it, and can
> confirm before any operation that the ranges being operated on are
> appropriately matched.  We found and opened bugzillas over the past
> couple years for places where our code caught bugs because a range was
> created and then operated on in a way that was not compatible with
> another range.  I think there is a still an open one against ada(?)
> where the switch and case  are different precision.

I'm fine with sanity-checking where it makes sense but looking at
the irange code the fact that there is a type is a fundamental requirement.
IMHO that's bad design.

>
>  From my point of view, a range object is similar to a tree node. A tree
> node has the bits to indicate what the value is, but also associates a
> type with those bits within the same object.  This is less error prone
> than passing around the bits and the type separately.  As ranges are
> starting to be used in many places outside of VRP, we should do the same
> thing with ranges.  WIth value_range it would actually be free since
> there is already a tree for the bounds already which contains the type.

Not for VARYING or UNDEFINED.

>
>
>
>
> > to fold_range/op_range?  The API also seems to be oddly
> > constrained to binary ops.  Anyway, the way you build
> > the operator table requires an awful lot of global C++ ctor
> > invocations, sth we generally try to avoid.  But I'm getting
> > into too many details here.
>
> Its "oddly constrained" because what you are looking at  is just the
> standardized unary/binary ops code.. ie the equivalent of
> extract_range_from_binary_expr() and extract_range_from_unary_expr().
> The other ops we care about have specialized requirements, like PHIs
> and the arbitrary numbers of parameters in a call, or anything less
> common than one or two operands.    You are just not seeing those parts.
>
> >
> > So - to answer your question above, I'd like you to pass down
> > a type to operations.  Because that's what is fundamentally
> > required - a range doesn't have a "type" and the current
> > value_range_base doesn't fall into the trap of needing one.
> >
> > Richard.
> >
>
> Why is having a type associated with the data a "trap"?   Perhaps the
> old VRP lattice idea didn't need a type with the UNDEFINED and VARYING
> lattice values, but we're moving past lattice values and into a realm
> where we have ranges as useful things outside of VRP, and trying to
> shoehorn lattice values does not seem appropriate anymore.
>
> I looked at implementing range-ops without a type in the range, and we
> end up passing 2 parameters everywhere each time we do anything with a
> range.  This doubled the number of parameters in most routines, and when
> we had chains of calls, we were simply passing the type along with the
> range.  It seems archaic to be constantly passing information around
> instead of associating it with the range itself.  Its just another place
> for an error to creep in.. Aldy found a place where we were creating
> varying nodes for floats IIRC.. the type checking in the range
> operations caught it precisely because the range returned wasn't the
> type it was assumed to be.

Well, I don't agree.  It's not archaic, it's the way GCC works everywhere
else.  So it's consistent.  Archaically consistend in your view.

But [1, 2] doesn't have a "type".  [1, 2] and [10000, 10004] can
be added just fine even if [1, 2] is from 'char' and [10000, 10004] is
from 'int'.  If you want sanity-checking you can do

irange
irange_plus (tree in_type, irange op0, irange op1)
{
   gcc_assert (range_fits_type (in_type, op0) && range_fits_type
(in_type, op1));
   irange res = op0 + op1;
   return res.truncate_to (in_type);
}

or so easily.  See - even the checking part doesn't need the
type in the range.

I also do not have to "invent" a type when I want to work with irange
on widest_ints.  Oh, another comment when reading the code was
that irange storage should not be wide_int but trailing-widest-int.
Because obviously irange has storage requirement.  It makes
in-place modification of iranges impossible but that's the way modern
C++ code works (and be consistent with how wide-int works).

I don't really like you invented an API unlike any existing in GCC.

You could have gone after wide-int at least.

> That said. I believe we can do away with the need for a type with an
> 'UNDEFINED' range.  That is not too onerous, and there doesn't really
> seem to be too many ripple effect from have a typeless undefined range.
> I think we can contain that, and prevents us from having to add a hack
> to value_range for that situation.
>
> VARYING  is another situation completely.  We adopted the term 'varying'
> for ease of compatibility with VRP, but varying is simply a range going
> from MIN to MAX for a type.  Removing the type from that would then
> require we always pass a type in with every range which gets back to
> doubling the number of of parameters everywhere for no good reason.
>
> If we standardize value_range so that MIN and MAX are set for varying,
> then everything works very smoothly, we can make value_range and irange
> interchangeable and facilitate getting the range ops code into trunk.
>
> It seems like the only reason we cant do this right now is the CONST
> varying nodes that are returned from get_value_range().
> Looking at that routine, it seems there are only 2 cases where that can
> be returned
>   1) we ask for an ssa-name beyond the end of the local ssa-name vector
>   2) once values_propagated is set  to true and an ssa-name has no entry.
>
> Both seem pretty straightforward to fix...
> 1)  if we ask for an ssa-Name beyond the end, simply reallocate the
> vector to be big enough.   I doubt this will trigger a lot, and if we
> initially allocate it with room for an extra 10% names it'll probably
> never trigger.  Or pick whatever number seems appropriate.
> 2) if values_propagated is true, simply allocate a node for the
> ssa-name,  set it to varying and return it. THIs accomplishes the same
> thing.

But it's completely pointless to do this work!  I refuse to make the
existing code slower just to make ranger easier to merge (and your
compile-time comparisons worse for the old code - not that all the
ranger-driven C++ification helped here).

> the number of additional nodes we allocate will be pretty minimal, and
> it will never exceed the number of ssa-names. Then we never have to
> worry about having CONST set for a varying node either. I see places
> where there is special processing to avoid calling set_varying  because
> we dont know if the vbalue_range in hand is in constant memory and would
> cause the compiler to trap. This seems like a really bad situation, and
> it would eliminate that issue.  We can also then eliminate the places
> where VARYING is expanded to min/max for a given type.. instead we can
> just pick up min/max directly.    It seems much cleaner overall.
>
> Something like the simple attached patch would resolve that issue, and
> remove any lurking concerns/bugs with the CONST code.
>
> Then we can associate a type with varying, canonicalize it consistently,
> and set MIN/MAX appropriately.   This will give us full
> interchangeability between irange and value_range, establish a
> solid/consistent API,  and we can move on to the next thing :-)
>
> Does this not seem reasonable?

I still do not agree with the design decision of the ranger API.  You do
want that to be included in the end, no?  Of course I have no veto power
or whatnot but the API and implementation is awkward.  You didn't
answer my concern about all the global ctors, did you?

Richard.

> Andrew
>
Richard Biener July 23, 2019, 9:42 a.m. UTC | #15
On Tue, Jul 23, 2019 at 1:59 AM Jeff Law <law@redhat.com> wrote:
>
> On 7/16/19 12:37 PM, Andrew MacLeod wrote:
> > On 7/9/19 5:56 AM, Richard Biener wrote:
> >> On Tue, Jul 9, 2019 at 9:28 AM Aldy Hernandez <aldyh@redhat.com> wrote:
> >>>
> >>>
> >>> On 7/4/19 6:33 AM, Richard Biener wrote:
> >>>> On Wed, Jul 3, 2019 at 2:17 PM Aldy Hernandez <aldyh@redhat.com> wrote:
> >>>>> On 7/3/19 7:08 AM, Richard Biener wrote:
> >>>>>> On Wed, Jul 3, 2019 at 11:19 AM Aldy Hernandez <aldyh@redhat.com>
> >>>>>> wrote:
> >>>>> How about we keep VARYING and UNDEFINED typeless until right before we
> >>>>> call into the ranger.  At which point, we have can populate min/max
> >>>>> because we have the tree_code and the type handy.  So right before we
> >>>>> call into the ranger do:
> >>>>>
> >>>>>           if (varying_p ())
> >>>>>             foo->set_varying(TYPE);
> >>>>>
> >>>>> This would avoid the type cache, and keep the ranger happy.
> >>>> you cannot do set_varying on the static const range but instead
> >>>> you'd do
> >>>>
> >>>>     value_range tem (*foo);
> >>>>     if (varying_p ())
> >>>>      tem->set_full_range (TYPE);
> >>>>
> >>>> which I think we already do in some places.  Thus my question _where_
> >>>> you actually need this.
> >>> Basically, everywhere.  By having a type for varying/undefined, we don't
> >>> have to special case anything.  Sure, we could for example, special case
> >>> the invert operation for undefined / varying.  And we could special case
> >>> everything dealing with ranges to handle varying and undefined, but why?
> >>>    We could also pass a type argument everywhere, but that's just ugly.
> >>> However, I do understand your objection to the type cache.
> >>>
> >>> How about the attached approach?  Set the type for varying/undefined
> >>> when we know it, while avoiding touching the CONST varying.  Then right
> >>> before calling the ranger, pass down a new varying node with min/max for
> >>> any varyings that were still typeless until that point.
> >>>
> >>> I have taken care of never adding a set_varying() that was not already
> >>> there.  Would this keep the const happy?
> >>>
> >>> Technically we don't need to set varying/undef types for every instance
> >>> in VRP, but we need it at least for the code that will be shared with
> >>> range-ops (extract_range_from_multiplicative_op, union, intersect, etc).
> >>>    I just figured if we have the information, might as well set it for
> >>> consistency.
> >>>
> >>> If you like this approach, I can rebase the other patches that depend on
> >>> this one.
> >> OK, so I went ant checked what you do for class irange which has
> >> a type but no kind member (but constructors with a kind).  It also
> >> uses wide_int members for storage.  For a pure integer constant
> >> range representation this represents somewhat odd choices;  I'd
> >> have elided the m_type member completely here, it seems fully
> >> redundant.  Only range operations need to be carried out in a
> >> specific type (what I was suggesting above).  Even the precision
> >> encoded in the wide_int members is redundant then (I'd have
> >> expected widest_int here and trailing-wide-ints for optimizing
> >> storage).
> >
> > What irange contains internally is a bit arbitrary.  It's really an API
> > for managing a set of subranges.  We could have used trees internally
> > just as easily, then we wouldnt need a type field. Since we were doing
> > lots of operations, rather than going back and forth from trees all the
> > time, we just used the underlying wide_int directly.   we could have
> > fiddle farted around with HOST_WIDE_INT or whatever, but wide_int is
> > there, has all the operations, and it works fine. so thats what it
> > currently is on the branch.
> >
> > We are treating a range object as a unique self contained object.
> > Therefore, the range has a type so we know how to print it, and can
> > confirm before any operation that the ranges being operated on are
> > appropriately matched.  We found and opened bugzillas over the past
> > couple years for places where our code caught bugs because a range was
> > created and then operated on in a way that was not compatible with
> > another range.  I think there is a still an open one against ada(?)
> > where the switch and case  are different precision.
> >
> > From my point of view, a range object is similar to a tree node. A tree
> > node has the bits to indicate what the value is, but also associates a
> > type with those bits within the same object.  This is less error prone
> > than passing around the bits and the type separately.  As ranges are
> > starting to be used in many places outside of VRP, we should do the same
> > thing with ranges.  WIth value_range it would actually be free since
> > there is already a tree for the bounds already which contains the type.
> >
> >
> >
> >
> >
> >> to fold_range/op_range?  The API also seems to be oddly
> >> constrained to binary ops.  Anyway, the way you build
> >> the operator table requires an awful lot of global C++ ctor
> >> invocations, sth we generally try to avoid.  But I'm getting
> >> into too many details here.
> >
> > Its "oddly constrained" because what you are looking at  is just the
> > standardized unary/binary ops code.. ie the equivalent of
> > extract_range_from_binary_expr() and extract_range_from_unary_expr().
> > The other ops we care about have specialized requirements, like PHIs
> > and the arbitrary numbers of parameters in a call, or anything less
> > common than one or two operands.    You are just not seeing those parts.
> >
> >>
> >> So - to answer your question above, I'd like you to pass down
> >> a type to operations.  Because that's what is fundamentally
> >> required - a range doesn't have a "type" and the current
> >> value_range_base doesn't fall into the trap of needing one.
> >>
> >> Richard.
> >>
> >
> > Why is having a type associated with the data a "trap"?   Perhaps the
> > old VRP lattice idea didn't need a type with the UNDEFINED and VARYING
> > lattice values, but we're moving past lattice values and into a realm
> > where we have ranges as useful things outside of VRP, and trying to
> > shoehorn lattice values does not seem appropriate anymore.
> >
> > I looked at implementing range-ops without a type in the range, and we
> > end up passing 2 parameters everywhere each time we do anything with a
> > range.  This doubled the number of parameters in most routines, and when
> > we had chains of calls, we were simply passing the type along with the
> > range.  It seems archaic to be constantly passing information around
> > instead of associating it with the range itself.  Its just another place
> > for an error to creep in.. Aldy found a place where we were creating
> > varying nodes for floats IIRC.. the type checking in the range
> > operations caught it precisely because the range returned wasn't the
> > type it was assumed to be.
> >
> > That said. I believe we can do away with the need for a type with an
> > 'UNDEFINED' range.  That is not too onerous, and there doesn't really
> > seem to be too many ripple effect from have a typeless undefined range.
> > I think we can contain that, and prevents us from having to add a hack
> > to value_range for that situation.
> >
> > VARYING  is another situation completely.  We adopted the term 'varying'
> > for ease of compatibility with VRP, but varying is simply a range going
> > from MIN to MAX for a type.  Removing the type from that would then
> > require we always pass a type in with every range which gets back to
> > doubling the number of of parameters everywhere for no good reason.
> >
> > If we standardize value_range so that MIN and MAX are set for varying,
> > then everything works very smoothly, we can make value_range and irange
> > interchangeable and facilitate getting the range ops code into trunk.
> >
> > It seems like the only reason we cant do this right now is the CONST
> > varying nodes that are returned from get_value_range().
> > Looking at that routine, it seems there are only 2 cases where that can
> > be returned
> >  1) we ask for an ssa-name beyond the end of the local ssa-name vector
> >  2) once values_propagated is set  to true and an ssa-name has no entry.
> >
> > Both seem pretty straightforward to fix...
> > 1)  if we ask for an ssa-Name beyond the end, simply reallocate the
> > vector to be big enough.   I doubt this will trigger a lot, and if we
> > initially allocate it with room for an extra 10% names it'll probably
> > never trigger.  Or pick whatever number seems appropriate.
> > 2) if values_propagated is true, simply allocate a node for the
> > ssa-name,  set it to varying and return it. THIs accomplishes the same
> > thing.
> >
> > the number of additional nodes we allocate will be pretty minimal, and
> > it will never exceed the number of ssa-names. Then we never have to
> > worry about having CONST set for a varying node either. I see places
> > where there is special processing to avoid calling set_varying  because
> > we dont know if the vbalue_range in hand is in constant memory and would
> > cause the compiler to trap. This seems like a really bad situation, and
> > it would eliminate that issue.  We can also then eliminate the places
> > where VARYING is expanded to min/max for a given type.. instead we can
> > just pick up min/max directly.    It seems much cleaner overall.
> >
> > Something like the simple attached patch would resolve that issue, and
> > remove any lurking concerns/bugs with the CONST code.
> >
> > Then we can associate a type with varying, canonicalize it consistently,
> > and set MIN/MAX appropriately.   This will give us full
> > interchangeability between irange and value_range, establish a
> > solid/consistent API,  and we can move on to the next thing :-)
> >
> > Does this not seem reasonable?
> One might even claim that this patch in and of itself is a step forward
> independent of all the other work going on.
>
> THe first time I saw that code when I copied it into vr-values I thought
> it was ripe for fixing, but never got to it.  As it stands, it's a hack,
> no more, no less and it's a hack that we can easily remove and do
> something better.

It's not a hack - it's an optimization and a _sanitization_.  Iff you
want to remove it then
make it return NULL (no range) and make all callers deal with that.

That we return a pointer to read-only storage makes sure callers do not
change the value-range.  That's better than the alternative below which
is the appropriate "fix" for the existing code if you want to get rid of the
statically allocated constant varying and the "hack" of CONST_CASTing it.

But you lose the sanitizing that nobody can change it and the
changed info leaks to other SSA vars.

As said, fix all callers to deal with NULL.

But I argue the current code is exactly optimal and safe.

Richard.

Index: gcc/vr-values.c
===================================================================
--- gcc/vr-values.c     (revision 273657)
+++ gcc/vr-values.c     (working copy)
@@ -78,7 +78,6 @@ set_value_range_to_truthvalue (value_ran
 value_range *
 vr_values::get_value_range (const_tree var)
 {
-  static const value_range vr_const_varying (VR_VARYING, NULL, NULL);
   value_range *vr;
   tree sym;
   unsigned ver = SSA_NAME_VERSION (var);
@@ -91,7 +90,7 @@ vr_values::get_value_range (const_tree v
      We should get here at most from the substitute-and-fold stage which
      will never try to change values.  */
   if (ver >= num_vr_values)
-    return CONST_CAST (value_range *, &vr_const_varying);
+    return vr_value[0];

   vr = vr_value[ver];
   if (vr)
@@ -99,7 +98,7 @@ vr_values::get_value_range (const_tree v

   /* After propagation finished do not allocate new value-ranges.  */
   if (values_propagated)
-    return CONST_CAST (value_range *, &vr_const_varying);
+    return vr_value[0];

   /* Create a default value range.  */
   vr_value[ver] = vr = vrp_value_range_pool.allocate ();
@@ -1922,6 +1947,8 @@ vr_values::vr_values () : vrp_value_rang
   values_propagated = false;
   num_vr_values = num_ssa_names;
   vr_value = XCNEWVEC (value_range *, num_vr_values);
+  vr_value[0] = vrp_value_range_pool.allocate ();
+  vr_value[0]->set_varying ();
   vr_phi_edge_counts = XCNEWVEC (int, num_ssa_names);
   bitmap_obstack_initialize (&vrp_equiv_obstack);
   to_remove_edges = vNULL;


>
> Jeff
>
Andrew MacLeod July 23, 2019, 10:56 p.m. UTC | #16
On 7/23/19 5:33 AM, Richard Biener wrote:
>
>> What irange contains internally is a bit arbitrary.  It's really an API
>> for managing a set of subranges.  We could have used trees internally
>> just as easily, then we wouldnt need a type field. Since we were doing
>> lots of operations, rather than going back and forth from trees all the
>> time, we just used the underlying wide_int directly.   we could have
>> fiddle farted around with HOST_WIDE_INT or whatever, but wide_int is
>> there, has all the operations, and it works fine. so thats what it
>> currently is on the branch.
> But a range has no type.  Period.  The fact that with the in-tree implementation
> there's access to a type is artificial.  All workers in the in-tree
> implementation
> get a type for the operation they carry out.
>
> But somehow irange doesn't get that.
>
> In fact, an irange shouldn't be bound to any type, and the usual
> "carry out multiplication of two integer typed vars with the result
> in integer" on irange should be multiply two iranges (with unbound
> result) plus a "truncate to integer type" operation.
>

so following thru on the implementation details of doing that,  we do 
the exact same thing that VRP does for multiply.. we eventually call 
wide_int_range_multiplicative_op.
The code from tree-vrp.c:

   wide_int res_lb, res_ub;
   wide_int vr0_lb = wi::to_wide (vr0_min);
   wide_int vr0_ub = wi::to_wide (vr0_max);
   wide_int vr1_lb = wi::to_wide (vr1->min ());
   wide_int vr1_ub = wi::to_wide (vr1->max ());
   bool overflow_undefined = TYPE_OVERFLOW_UNDEFINED (type);
   unsigned prec = TYPE_PRECISION (type);

   if (wide_int_range_multiplicative_op (res_lb, res_ub,
                                         code, TYPE_SIGN (type), prec,
                                         vr0_lb, vr0_ub, vr1_lb, vr1_ub,
                                         overflow_undefined))
     vr->set (VR_RANGE, wide_int_to_tree (type, res_lb),
              wide_int_to_tree (type, res_ub));


Which , lo and behold, it requires a type in order to get the signop 
right in the 4th argument.  we also use the type to figure out the 
precision in the 5th argument, as well as the overflow condition at the 
end.

So it *is* bound to a type in order to do the operation, its just a 
matter of whether you pass that type around, or include it with the 
object.  I simply can't imagine why you think it isn't.

sure, in this case the LHS, vr0, and vr1 are all the same type.. (or 
should be type compatible_p), so we can pass in one type, but not all 
operations follow that pattern... casts, shifts, comparisons, etc can 
have different types in the operand positions.  We include it with each 
range and  we always have accurate info associated with the operand.

How is that  a bad thing?

>> We are treating a range object as a unique self contained object.
>> Therefore, the range has a type so we know how to print it, and can
>> confirm before any operation that the ranges being operated on are
>> appropriately matched.  We found and opened bugzillas over the past
>> couple years for places where our code caught bugs because a range was
>> created and then operated on in a way that was not compatible with
>> another range.  I think there is a still an open one against ada(?)
>> where the switch and case  are different precision.
> I'm fine with sanity-checking where it makes sense but looking at
> the irange code the fact that there is a type is a fundamental requirement.
> IMHO that's bad design.

we could remove the type from the range object.. we aren't critically 
tied to it, but then virtually everywhere we pass a range, we'll be 
passing in the type to be operated on, or the sign flag, or overflow 
flag,   or some combination of those.  Its only a matter of time until 
someone gets them out of whack.. It seems far more logical  to simply 
keep the type associated so we can pick up overflow, signop, precision 
and such as needed.. and do some sanity checking along the way.  thats 
what the type field is for after all, to consolidate all the info you 
might want...  Why pass the extra parameters when you don' t need to.


>>   From my point of view, a range object is similar to a tree node. A tree
>> node has the bits to indicate what the value is, but also associates a
>> type with those bits within the same object.  This is less error prone
>> than passing around the bits and the type separately.  As ranges are
>> starting to be used in many places outside of VRP, we should do the same
>> thing with ranges.  WIth value_range it would actually be free since
>> there is already a tree for the bounds already which contains the type.
> Not for VARYING or UNDEFINED.

Personally, Id put it in both for consistency.  I view a range as an 
object we are associating with either an expression or an ssa-name. That 
range should then have a consistent type with that name or expression.  
Even if the range is empty, I would still expect it to be compatible 
since I'm usually associating it with an ssa_name or expression.


Because you seem so dead set against it, I can also see some consistency 
in not having a type for undefined... Since there are 0 ranges, we can't 
ask for a lower_bound () or upper_bound () on an empty range, so I can 
see an argument for extending that not being able to ask the type() 
either.  I agree that this is also a plausible viewpoint, and we are 
changing our code to make that happen, even though we have to pass a few 
extra types around and will lose some minor sanity checking when doing 
intersection or union with an empty range.   To me an empty range is 
akin to a NaN.. it isn't a real float value, but it does still have a 
type associated with it.


Being against a type for varying makes no sense to me since varying is 
simply shorthand for [MIN, MAX]. This is no different than some other 
range like [3, 45]  its just a range with 2 end points.  VRP may happen 
to special case it to represent the lattice value that maps to this and 
doesn't bother filling it in, but it doesn't change the fact that it 
represents [MIN, MAX]. And VRP does occasionally expand varying into its 
component limits... Aldy found cases where 2 ranges combined would 
result in VR_RANGE [MIN, MAX], but would not be considered varying 
because it wasn't normalized to VR_VARYING.. so there are 2 
representations of the same value which is also bad.

So we can now normalize that code to not special case varying.. simply 
ask for lower_bound and upper_bound ().  Its less special casing, its 
consistent, costs no memory.. It all seems like a really good cleanup 
which you appear to be opposing for reasons I can't quite fathom.


>
>>
>>
>>
>>> to fold_range/op_range?  The API also seems to be oddly
>>> constrained to binary ops.  Anyway, the way you build
>>> the operator table requires an awful lot of global C++ ctor
>>> invocations, sth we generally try to avoid.  But I'm getting
>>> into too many details here.
>> Its "oddly constrained" because what you are looking at  is just the
>> standardized unary/binary ops code.. ie the equivalent of
>> extract_range_from_binary_expr() and extract_range_from_unary_expr().
>> The other ops we care about have specialized requirements, like PHIs
>> and the arbitrary numbers of parameters in a call, or anything less
>> common than one or two operands.    You are just not seeing those parts.
>>
>>> So - to answer your question above, I'd like you to pass down
>>> a type to operations.  Because that's what is fundamentally
>>> required - a range doesn't have a "type" and the current
>>> value_range_base doesn't fall into the trap of needing one.
>>>
>>> Richard.
>>>
>> Why is having a type associated with the data a "trap"?   Perhaps the
>> old VRP lattice idea didn't need a type with the UNDEFINED and VARYING
>> lattice values, but we're moving past lattice values and into a realm
>> where we have ranges as useful things outside of VRP, and trying to
>> shoehorn lattice values does not seem appropriate anymore.
>>
>> I looked at implementing range-ops without a type in the range, and we
>> end up passing 2 parameters everywhere each time we do anything with a
>> range.  This doubled the number of parameters in most routines, and when
>> we had chains of calls, we were simply passing the type along with the
>> range.  It seems archaic to be constantly passing information around
>> instead of associating it with the range itself.  Its just another place
>> for an error to creep in.. Aldy found a place where we were creating
>> varying nodes for floats IIRC.. the type checking in the range
>> operations caught it precisely because the range returned wasn't the
>> type it was assumed to be.
> Well, I don't agree.  It's not archaic, it's the way GCC works everywhere
> else.  So it's consistent.  Archaically consistend in your view.

Are you really arguing we should write code the same way for 30+ years 
and never change?

Indeed, I would argue that when we have the opportunity to modernize 
code in our code base, we really ought to be doing it... and we don't do 
it nearly enough.

Why don't we just pass a type along with every tree node instead of 
including it in the node? we'd save memory in each tree node.   I 
*really* fail to understand this argument of keeping the type separate 
from the range when the range by definition represents sub-ranges of a 
specific type.



> But [1, 2] doesn't have a "type".  [1, 2] and [10000, 10004] can
> be added just fine even if [1, 2] is from 'char' and [10000, 10004] is
> from 'int'.  If you want sanity-checking you can do
But of course it does.  I defy you to create a value_range for [1,2] 
where I cant ask for the type of either the 1 or 2.

When  I write

int
  foo (unsigned x, char y)
{
   return x + y;
}

  we get

   _1 = (unsigned int) y_3(D);
   _2 = _1 + x_4(D);
   _5 = (int) _2;
   return _5;

In order to make sure everything works as expected, GCC inserts code to 
make sure the operands of the '+' are types_compatible_p.  It then uses 
the information in the TYPE field to perform the operation.   if GCC 
isn't going to add a 'char' to an 'unsigned int', why should the range 
code? we're using the same code under the covers.  I would say the same 
argument for ensuring the operands are compatible in GCC's '+' apply to 
determining the results of a ranges '+' operation...  If we don't, Im 
sure we'll get the wrong results at some point...  Either that, or GCC 
is wasting its time doing all this work.

What we are doing to calculate results is *exactly* the same from my 
point of view.  We use the type checking mostly for intersection and 
union, and while we don't do much type checking on operations like plus, 
(we could, but we assume GCC has already done that), We do use the 
fields of the TYPE to execute the operation in case we need to know the 
signop or overflow condition of the type.

This is where I do not understand what kind of point you are trying to 
make.

When we perform an operation like '+" on a range, we are calling gcc's 
routines to do the add on each of the operands of the range, which means 
we are trying to map range operations as directly to the IL operation as 
possible.  ie, PLUS is still the same PLUS that would be operated on by 
the two const tree values in 3 + 4.... , we are just teaching the 
compiler how ranges are different than constants, and require a few more 
operations combined to come up with the result... so for plus, we're 
teaching it
       [L1, U1] + [L2, U2] results in [L1+L2, U1+U2]  plus variants 
based on overflow and sign
and we use gcc's add operation for whatever the type  L1,L2, U1, and U2 
are, and we need the info from the types of those endpoints to do the 
operation.

Why on earth we'd want to go out of our way to not include the type when 
we'll need pieces of it passed in separately I cannot imagine.

In the case of irange, we added the type field since it isn't available 
from the wide_int structure.. and the primary reason we 'need' it is so 
we can pick up the correct signop value and overflow style when we are 
performing certain wide int operations.  In the case of value_range, end 
points are already a tree, so we have the type when we need it and don't 
need to add anything extra.  so its inherently free in value range, 
which should make it an even easier sell.



> irange
> irange_plus (tree in_type, irange op0, irange op1)
> {
>     gcc_assert (range_fits_type (in_type, op0) && range_fits_type
> (in_type, op1));
>     irange res = op0 + op1;
>     return res.truncate_to (in_type);
> }
>
> or so easily.  See - even the checking part doesn't need the
> type in the range.
That code could easily be the same , except our assert would read:

gcc_assert (types_compatible_p (op0.type (), op1.type ());

which ensures that those ranges came from places with the kinds of 
values we expect, and looks a lot like every thing else we see in GCC.

we don't actually do that, but we could if you like.  Its just a bit 
redundant since virtually every place we call into the fold operations 
have had ranges extracted from a gimple statement  and the resulting 
queries have a type matching the operand, so at that point we're pretty 
sure they will pass the type_compatible_p test. And even if they don't 
the underlying wide_int operational routines will complain.

Most of the type checking is for the benefit of the ranger, which goes 
off and finds ranges and brings them back. It typically has to perform a 
chain of operations over multiple statements , and when you are 
traveling over the CFG and combining computations, it includes unions 
and intersections.  its very helpful to ensure the range you are getting 
at back at each stage of the evaluation matches the type you were 
expecting. Its been a critical debugging aid.

There are also some operations, like shift, where the type of operands  
may not even be the same sign as 'in_type'...  so we need to pass in the 
right type for that to truly get it right.  I noticed the existing VRP 
code does some fudging for that particular case... Im not convinced 
there isn't a lurking bug there..   but in any case we'll just have the 
precise info right there in the range and wont have to do anything special.

a = b < c is another example.  The LHS is a different type (boolean)  
than op1 and op2.. its obscured in VRP because there is special code for 
comparison ops, but range-ops treats relationals exactly like any other 
generic binary operation.   We'd have to pass the type or signop for the 
RHS to the operation because its different than 'in_type'.  Basically, 
the API would have to have a type for each operand to be sure we are 
correct for all operations.

Range-ops provides a generic API that works for *all* binary or unary 
ops, and as such, you sometimes need the types of each operand in ways 
that can be hard to predict until you actually implement them.   We've 
found it to be very useful to have the type of the range easily 
available for some of today's tree codes...  and who knows in the 
future...  especially if we expand ranges to other types like floats 
which we are also preparing for.


> I also do not have to "invent" a type when I want to work with irange
> on widest_ints.  Oh, another comment when reading the code was

Well you had to get those wide ints from somewhere,  and you certainly 
have to "invent" a signop and possibly overflow flag to do many 
operations on wide_ints.
In all our code, I have yet to "invent" a type.. Any range operation I 
want to do is associated with something, and that something always has a 
type.

Are you thinking of something in particular that is type-less that you 
need or want to do range operations on?



> that irange storage should not be wide_int but trailing-widest-int.
> Because obviously irange has storage requirement.  It makes
> in-place modification of iranges impossible but that's the way modern
> C++ code works (and be consistent with how wide-int works).
>
> I don't really like you invented an API unlike any existing in GCC.
>

I'd say you are kidding, but I'm afraid you aren't.

Regardless, we don't need to discuss irange_storage details as we aren't 
proposing integration of irange, nor irange_storage in the near 
future... so I don't think the internal details of that prototype matter 
at this point, only what we are trying to do with value_range.

>> That said. I believe we can do away with the need for a type with an
>> 'UNDEFINED' range.  That is not too onerous, and there doesn't really
>> seem to be too many ripple effect from have a typeless undefined range.
>> I think we can contain that, and prevents us from having to add a hack
>> to value_range for that situation.
>>
>> VARYING  is another situation completely.  We adopted the term 'varying'
>> for ease of compatibility with VRP, but varying is simply a range going
>> from MIN to MAX for a type.  Removing the type from that would then
>> require we always pass a type in with every range which gets back to
>> doubling the number of of parameters everywhere for no good reason.
>>
>> If we standardize value_range so that MIN and MAX are set for varying,
>> then everything works very smoothly, we can make value_range and irange
>> interchangeable and facilitate getting the range ops code into trunk.
>>
>> It seems like the only reason we cant do this right now is the CONST
>> varying nodes that are returned from get_value_range().
>> Looking at that routine, it seems there are only 2 cases where that can
>> be returned
>>    1) we ask for an ssa-name beyond the end of the local ssa-name vector
>>    2) once values_propagated is set  to true and an ssa-name has no entry.
>>
>> Both seem pretty straightforward to fix...
>> 1)  if we ask for an ssa-Name beyond the end, simply reallocate the
>> vector to be big enough.   I doubt this will trigger a lot, and if we
>> initially allocate it with room for an extra 10% names it'll probably
>> never trigger.  Or pick whatever number seems appropriate.
>> 2) if values_propagated is true, simply allocate a node for the
>> ssa-name,  set it to varying and return it. THIs accomplishes the same
>> thing.
> But it's completely pointless to do this work!  I refuse to make the
> existing code slower just to make ranger easier to merge (and your
> compile-time comparisons worse for the old code - not that all the
> ranger-driven C++ification helped here).
slower? You really think you will *ever* measure this to be slower? 
mostly this is filling in the min and max fields for varying... Thats 
hardly expensive.

Most of the "c++ification" that Aldy did was simple factoring out of the 
common code so we could also call it...  and did a lot of bug fixing 
along the way.  Code reuse should be a *good* thing and I would be very 
surprised if that refactoring has any tangible difference on execution 
speed.

I fail to see how any of that, or what we are proposing here, will make 
things slower in any kind of measurable way.

And on top of that, thanks to this work of Aldy's, we are also able to 
use value_range internally in the ranger now.  When we wish to try to 
integrate range_ops or the ranger, it will be using value_range as 
well.  So if we are making value_range slower, we will be making 
ourselves slower by the same amount since we'll share the code base.   
There is no nefarious plan to slow VRP down so we look better...  we're 
actually trying to  make it easier to compare whats in trunk to what we 
have.


>
>> the number of additional nodes we allocate will be pretty minimal, and
>> it will never exceed the number of ssa-names. Then we never have to
>> worry about having CONST set for a varying node either. I see places
>> where there is special processing to avoid calling set_varying  because
>> we dont know if the vbalue_range in hand is in constant memory and would
>> cause the compiler to trap. This seems like a really bad situation, and
>> it would eliminate that issue.  We can also then eliminate the places
>> where VARYING is expanded to min/max for a given type.. instead we can
>> just pick up min/max directly.    It seems much cleaner overall.
>>
>> Something like the simple attached patch would resolve that issue, and
>> remove any lurking concerns/bugs with the CONST code.
>>
>> Then we can associate a type with varying, canonicalize it consistently,
>> and set MIN/MAX appropriately.   This will give us full
>> interchangeability between irange and value_range, establish a
>> solid/consistent API,  and we can move on to the next thing :-)
>>
>> Does this not seem reasonable?
> I still do not agree with the design decision of the ranger API.  You do
> want that to be included in the end, no?  Of course I have no veto power
> or whatnot but the API and implementation is awkward.  You didn't

You mean you disagree with our range API?     the ranger API is 
completely disjoint from this and is not being proposed right now... and 
in fact EVRP can easily be mapped to the ranger API. It a pretty simple 
API for querying ranges.  Or maybe you meant the range-ops code and API 
for evaluating ranges (Which we have also not proposed yet).  It's being 
reworked since it was in pure prototype mode and hasn't been sanitized.

The API we are proposing here applies only to ranges, and specifically 
value_range. It is designed to mirror what we do with tree nodes. It is 
designed to promote ranges to be "just another kind of type" in GCC, one 
that we could eventually represent with a tree_node if we wanted to. Its 
really a type which specifies one or more sub-ranges for a type.   As 
such, it contains values which are expected to be of that type. Thats it.

Our value_range type() query is simply extracting TREE_TYPE () from one 
of the bounds...  its really shorthand for the tree equivalent of 
TREE_TYPE (TREE_OPERAND (t, 0)) if there were a RANGE_TYPE tree node.

If we want to know the bounds for what is referred to as  a varying 
node  ([MIN, MAX]), you have to have a type available. and if we store 
min and max, its always there.

I still fail to understand why you are so dead set against representing 
varying as [MIN, MAX].  Its what varying *is*.

And really, that's the only only fundamental change we are proposing 
here... And we aren't even adding a field to the structure to do it!

Andrew
> answer my concern about all the global ctors, did you?
>
> Richard.
>

We arent proposing anything with global constructors yet, so I didn't 
see the need to comment on it.

1) the entire range-ops code base is being restructured before we 
actually propose it.   Aldy provided whats currently there for 
completeness so you could see how we were using things if you wanted.   
It needs some cleaning up and I'm in the middle of that now.   It still 
retains a lot of the old switch structure that doesn't need to be there. 
Operational flow will be much easier to read and localized.

2) You don't like global constructors.  What is the actual concern? To 
the best of my knowledge the tidbit of things that would be happening 
there are not going to have an
impact on anything. Each of the, what, 30ish constructors,  set a 'enum 
tree_code' field, and sets an array index to point to this object. 
That's it.

Im not aware of any policy that says we cant have compile time 
constructors.  Regardless, if the number of small constructors  is 
really a big concern, they can all be rolled into a single initializer 
for the table... assuming it even looks like this when we're ready to 
submit a proposal.      Its a similar amount of executable code, it just 
puts the onus on the author to register them since since they no longer 
self register with the table... Just an extra thing you have to remember 
to do when writing code for a new opcode.  And it would take 20 minutes 
to change.

Six of one, half dozen of the other by my reckoning...  I just prefer to 
use useful language features when available and remove the human error 
element.
Richard Biener July 24, 2019, 1:25 p.m. UTC | #17
On Wed, Jul 24, 2019 at 12:56 AM Andrew MacLeod <amacleod@redhat.com> wrote:
>
> On 7/23/19 5:33 AM, Richard Biener wrote:
> >
> >> What irange contains internally is a bit arbitrary.  It's really an API
> >> for managing a set of subranges.  We could have used trees internally
> >> just as easily, then we wouldnt need a type field. Since we were doing
> >> lots of operations, rather than going back and forth from trees all the
> >> time, we just used the underlying wide_int directly.   we could have
> >> fiddle farted around with HOST_WIDE_INT or whatever, but wide_int is
> >> there, has all the operations, and it works fine. so thats what it
> >> currently is on the branch.
> > But a range has no type.  Period.  The fact that with the in-tree implementation
> > there's access to a type is artificial.  All workers in the in-tree
> > implementation
> > get a type for the operation they carry out.
> >
> > But somehow irange doesn't get that.
> >
> > In fact, an irange shouldn't be bound to any type, and the usual
> > "carry out multiplication of two integer typed vars with the result
> > in integer" on irange should be multiply two iranges (with unbound
> > result) plus a "truncate to integer type" operation.
> >
>
> so following thru on the implementation details of doing that,  we do
> the exact same thing that VRP does for multiply.. we eventually call
> wide_int_range_multiplicative_op.
> The code from tree-vrp.c:
>
>    wide_int res_lb, res_ub;
>    wide_int vr0_lb = wi::to_wide (vr0_min);
>    wide_int vr0_ub = wi::to_wide (vr0_max);
>    wide_int vr1_lb = wi::to_wide (vr1->min ());
>    wide_int vr1_ub = wi::to_wide (vr1->max ());
>    bool overflow_undefined = TYPE_OVERFLOW_UNDEFINED (type);
>    unsigned prec = TYPE_PRECISION (type);
>
>    if (wide_int_range_multiplicative_op (res_lb, res_ub,
>                                          code, TYPE_SIGN (type), prec,
>                                          vr0_lb, vr0_ub, vr1_lb, vr1_ub,
>                                          overflow_undefined))
>      vr->set (VR_RANGE, wide_int_to_tree (type, res_lb),
>               wide_int_to_tree (type, res_ub));
>
>
> Which , lo and behold, it requires a type in order to get the signop
> right in the 4th argument.  we also use the type to figure out the
> precision in the 5th argument, as well as the overflow condition at the
> end.
>
> So it *is* bound to a type in order to do the operation, its just a
> matter of whether you pass that type around, or include it with the
> object.  I simply can't imagine why you think it isn't.

Because wide_int_range_multiplicative_op includes the truncating
and because for some unknown reason the caller of
extract_range_from_multiplicative_op doesn't pass down the
type of the computation we extract it from the type of operand zero.
The caller being extract_range_from_binary_expr.

That said, the wide_int_range_* stuff isn't the prettiest stuff in the
world and it uses wide_ints which have a precision (I didn't agree
to that - heh).

> sure, in this case the LHS, vr0, and vr1 are all the same type.. (or
> should be type compatible_p), so we can pass in one type, but not all
> operations follow that pattern... casts, shifts, comparisons, etc can
> have different types in the operand positions.  We include it with each
> range and  we always have accurate info associated with the operand.
>
> How is that  a bad thing?

It makes you _require_ a type on operands.  But constants do not
have a type.  A 1 is a 1 if it is unsigned int or bool.

> >> We are treating a range object as a unique self contained object.
> >> Therefore, the range has a type so we know how to print it, and can
> >> confirm before any operation that the ranges being operated on are
> >> appropriately matched.  We found and opened bugzillas over the past
> >> couple years for places where our code caught bugs because a range was
> >> created and then operated on in a way that was not compatible with
> >> another range.  I think there is a still an open one against ada(?)
> >> where the switch and case  are different precision.
> > I'm fine with sanity-checking where it makes sense but looking at
> > the irange code the fact that there is a type is a fundamental requirement.
> > IMHO that's bad design.
>
> we could remove the type from the range object.. we aren't critically
> tied to it, but then virtually everywhere we pass a range, we'll be
> passing in the type to be operated on, or the sign flag, or overflow
> flag,   or some combination of those.  Its only a matter of time until
> someone gets them out of whack.. It seems far more logical  to simply
> keep the type associated so we can pick up overflow, signop, precision
> and such as needed.. and do some sanity checking along the way.  thats
> what the type field is for after all, to consolidate all the info you
> might want...  Why pass the extra parameters when you don' t need to.

Because you might want to use a range without first needing to change
its type?  Like for the use-case to compute whether we can widen operands
without needing to add truncations we'd like to simulate the computes with
the original input ranges and see if anywhere in the intermediate operations
the resulting range computed in the wider type would still fit in the narrower
type.  The _ranges_ here are all typeless, originally computed by ops
in the smaller type, then we simulate in the larger type (no need to
change the input ranges), and see if range_fits (smaller_type).

It's simply more powerful and more flexible.

> >>   From my point of view, a range object is similar to a tree node. A tree
> >> node has the bits to indicate what the value is, but also associates a
> >> type with those bits within the same object.  This is less error prone
> >> than passing around the bits and the type separately.  As ranges are
> >> starting to be used in many places outside of VRP, we should do the same
> >> thing with ranges.  WIth value_range it would actually be free since
> >> there is already a tree for the bounds already which contains the type.
> > Not for VARYING or UNDEFINED.
>
> Personally, Id put it in both for consistency.  I view a range as an
> object we are associating with either an expression or an ssa-name. That
> range should then have a consistent type with that name or expression.
> Even if the range is empty, I would still expect it to be compatible
> since I'm usually associating it with an ssa_name or expression.

Imagine you have int i; long l; and you want to ask if both variables
have the same value range, how would you do that?  What is "same" here?

> Because you seem so dead set against it, I can also see some consistency
> in not having a type for undefined... Since there are 0 ranges, we can't
> ask for a lower_bound () or upper_bound () on an empty range, so I can
> see an argument for extending that not being able to ask the type()
> either.  I agree that this is also a plausible viewpoint, and we are
> changing our code to make that happen, even though we have to pass a few
> extra types around and will lose some minor sanity checking when doing
> intersection or union with an empty range.   To me an empty range is
> akin to a NaN.. it isn't a real float value, but it does still have a
> type associated with it.

Both VARYING and UNDEFINED are currently lattice states, not real
(integer) ranges.  irange (rightfully so) isn't convering the lattice state
and the current value_range does so for historical reasons.  In
principle the VRP lattice could be

  struct { enum lattice_state state; irange *range; };

where for state != VR_[ANTI]RANGE 'range' is NULL.

You shouldn't tie your hands by trying to design something that plugs
1:1 into the existing uses.  That constrains you too much IMHO.

Given that we do not represent VARYING as [-INF, +INF] currently
and you appearantly want to preserve that(?) it should be
straight-forward to add a irange_from_varying  (type)
irange_from_undefined (type)
if you want to get a range for a lattice state VARYING/UNDEFINED.

> Being against a type for varying makes no sense to me since varying is
> simply shorthand for [MIN, MAX]. This is no different than some other
> range like [3, 45]  its just a range with 2 end points.  VRP may happen
> to special case it to represent the lattice value that maps to this and
> doesn't bother filling it in, but it doesn't change the fact that it
> represents [MIN, MAX]. And VRP does occasionally expand varying into its
> component limits... Aldy found cases where 2 ranges combined would
> result in VR_RANGE [MIN, MAX], but would not be considered varying
> because it wasn't normalized to VR_VARYING.. so there are 2
> representations of the same value which is also bad.

You are mixing the lattice state and the range representation.  For
efficiency those are "merged" in tree-vrp.c.

> So we can now normalize that code to not special case varying.. simply
> ask for lower_bound and upper_bound ().  Its less special casing, its
> consistent, costs no memory.. It all seems like a really good cleanup
> which you appear to be opposing for reasons I can't quite fathom.
>
>
> >
> >>
> >>
> >>
> >>> to fold_range/op_range?  The API also seems to be oddly
> >>> constrained to binary ops.  Anyway, the way you build
> >>> the operator table requires an awful lot of global C++ ctor
> >>> invocations, sth we generally try to avoid.  But I'm getting
> >>> into too many details here.
> >> Its "oddly constrained" because what you are looking at  is just the
> >> standardized unary/binary ops code.. ie the equivalent of
> >> extract_range_from_binary_expr() and extract_range_from_unary_expr().
> >> The other ops we care about have specialized requirements, like PHIs
> >> and the arbitrary numbers of parameters in a call, or anything less
> >> common than one or two operands.    You are just not seeing those parts.
> >>
> >>> So - to answer your question above, I'd like you to pass down
> >>> a type to operations.  Because that's what is fundamentally
> >>> required - a range doesn't have a "type" and the current
> >>> value_range_base doesn't fall into the trap of needing one.
> >>>
> >>> Richard.
> >>>
> >> Why is having a type associated with the data a "trap"?   Perhaps the
> >> old VRP lattice idea didn't need a type with the UNDEFINED and VARYING
> >> lattice values, but we're moving past lattice values and into a realm
> >> where we have ranges as useful things outside of VRP, and trying to
> >> shoehorn lattice values does not seem appropriate anymore.
> >>
> >> I looked at implementing range-ops without a type in the range, and we
> >> end up passing 2 parameters everywhere each time we do anything with a
> >> range.  This doubled the number of parameters in most routines, and when
> >> we had chains of calls, we were simply passing the type along with the
> >> range.  It seems archaic to be constantly passing information around
> >> instead of associating it with the range itself.  Its just another place
> >> for an error to creep in.. Aldy found a place where we were creating
> >> varying nodes for floats IIRC.. the type checking in the range
> >> operations caught it precisely because the range returned wasn't the
> >> type it was assumed to be.
> > Well, I don't agree.  It's not archaic, it's the way GCC works everywhere
> > else.  So it's consistent.  Archaically consistend in your view.
>
> Are you really arguing we should write code the same way for 30+ years
> and never change?

Yes..

> Indeed, I would argue that when we have the opportunity to modernize
> code in our code base, we really ought to be doing it... and we don't do
> it nearly enough.
>
> Why don't we just pass a type along with every tree node instead of
> including it in the node? we'd save memory in each tree node.   I
> *really* fail to understand this argument of keeping the type separate
> from the range when the range by definition represents sub-ranges of a
> specific type.

Does it?  Only if you consider the narrow view that each range
comes from an operation on a GIMPLE/GENERIC expression.

But like wide_ints iranges might be used for general range computations
(see my example above).

>
>
>
> > But [1, 2] doesn't have a "type".  [1, 2] and [10000, 10004] can
> > be added just fine even if [1, 2] is from 'char' and [10000, 10004] is
> > from 'int'.  If you want sanity-checking you can do
> But of course it does.  I defy you to create a value_range for [1,2]
> where I cant ask for the type of either the 1 or 2.
>
> When  I write
>
> int
>   foo (unsigned x, char y)
> {
>    return x + y;
> }
>
>   we get
>
>    _1 = (unsigned int) y_3(D);
>    _2 = _1 + x_4(D);
>    _5 = (int) _2;
>    return _5;
>
> In order to make sure everything works as expected, GCC inserts code to
> make sure the operands of the '+' are types_compatible_p.  It then uses
> the information in the TYPE field to perform the operation.   if GCC
> isn't going to add a 'char' to an 'unsigned int', why should the range
> code? we're using the same code under the covers.  I would say the same
> argument for ensuring the operands are compatible in GCC's '+' apply to
> determining the results of a ranges '+' operation...  If we don't, Im
> sure we'll get the wrong results at some point...  Either that, or GCC
> is wasting its time doing all this work.
>
> What we are doing to calculate results is *exactly* the same from my
> point of view.  We use the type checking mostly for intersection and
> union, and while we don't do much type checking on operations like plus,
> (we could, but we assume GCC has already done that), We do use the
> fields of the TYPE to execute the operation in case we need to know the
> signop or overflow condition of the type.
>
> This is where I do not understand what kind of point you are trying to
> make.
>
> When we perform an operation like '+" on a range, we are calling gcc's
> routines to do the add on each of the operands of the range, which means
> we are trying to map range operations as directly to the IL operation as
> possible.  ie, PLUS is still the same PLUS that would be operated on by
> the two const tree values in 3 + 4.... , we are just teaching the
> compiler how ranges are different than constants, and require a few more
> operations combined to come up with the result... so for plus, we're
> teaching it
>        [L1, U1] + [L2, U2] results in [L1+L2, U1+U2]  plus variants
> based on overflow and sign
> and we use gcc's add operation for whatever the type  L1,L2, U1, and U2
> are, and we need the info from the types of those endpoints to do the
> operation.
>
> Why on earth we'd want to go out of our way to not include the type when
> we'll need pieces of it passed in separately I cannot imagine.
>
> In the case of irange, we added the type field since it isn't available
> from the wide_int structure.. and the primary reason we 'need' it is so
> we can pick up the correct signop value and overflow style when we are
> performing certain wide int operations.  In the case of value_range, end
> points are already a tree, so we have the type when we need it and don't
> need to add anything extra.  so its inherently free in value range,
> which should make it an even easier sell.
>
>
>
> > irange
> > irange_plus (tree in_type, irange op0, irange op1)
> > {
> >     gcc_assert (range_fits_type (in_type, op0) && range_fits_type
> > (in_type, op1));
> >     irange res = op0 + op1;
> >     return res.truncate_to (in_type);
> > }
> >
> > or so easily.  See - even the checking part doesn't need the
> > type in the range.
> That code could easily be the same , except our assert would read:
>
> gcc_assert (types_compatible_p (op0.type (), op1.type ());
>
> which ensures that those ranges came from places with the kinds of
> values we expect, and looks a lot like every thing else we see in GCC.
>
> we don't actually do that, but we could if you like.  Its just a bit
> redundant since virtually every place we call into the fold operations
> have had ranges extracted from a gimple statement  and the resulting
> queries have a type matching the operand, so at that point we're pretty
> sure they will pass the type_compatible_p test. And even if they don't
> the underlying wide_int operational routines will complain.
>
> Most of the type checking is for the benefit of the ranger, which goes
> off and finds ranges and brings them back. It typically has to perform a
> chain of operations over multiple statements , and when you are
> traveling over the CFG and combining computations, it includes unions
> and intersections.  its very helpful to ensure the range you are getting
> at back at each stage of the evaluation matches the type you were
> expecting. Its been a critical debugging aid.
>
> There are also some operations, like shift, where the type of operands
> may not even be the same sign as 'in_type'...  so we need to pass in the
> right type for that to truly get it right.  I noticed the existing VRP
> code does some fudging for that particular case... Im not convinced
> there isn't a lurking bug there..   but in any case we'll just have the
> precise info right there in the range and wont have to do anything special.
>
> a = b < c is another example.  The LHS is a different type (boolean)
> than op1 and op2.. its obscured in VRP because there is special code for
> comparison ops, but range-ops treats relationals exactly like any other
> generic binary operation.   We'd have to pass the type or signop for the
> RHS to the operation because its different than 'in_type'.  Basically,
> the API would have to have a type for each operand to be sure we are
> correct for all operations.
>
> Range-ops provides a generic API that works for *all* binary or unary
> ops, and as such, you sometimes need the types of each operand in ways
> that can be hard to predict until you actually implement them.   We've
> found it to be very useful to have the type of the range easily
> available for some of today's tree codes...  and who knows in the
> future...  especially if we expand ranges to other types like floats
> which we are also preparing for.
>
>
> > I also do not have to "invent" a type when I want to work with irange
> > on widest_ints.  Oh, another comment when reading the code was
>
> Well you had to get those wide ints from somewhere,  and you certainly
> have to "invent" a signop and possibly overflow flag to do many
> operations on wide_ints.
> In all our code, I have yet to "invent" a type.. Any range operation I
> want to do is associated with something, and that something always has a
> type.
>
> Are you thinking of something in particular that is type-less that you
> need or want to do range operations on?
>
>
>
> > that irange storage should not be wide_int but trailing-widest-int.
> > Because obviously irange has storage requirement.  It makes
> > in-place modification of iranges impossible but that's the way modern
> > C++ code works (and be consistent with how wide-int works).
> >
> > I don't really like you invented an API unlike any existing in GCC.
> >
>
> I'd say you are kidding, but I'm afraid you aren't.
>
> Regardless, we don't need to discuss irange_storage details as we aren't
> proposing integration of irange, nor irange_storage in the near
> future... so I don't think the internal details of that prototype matter
> at this point, only what we are trying to do with value_range.

Oh, you are not?  Why do we argue changing tree-vrp.c then?

> >> That said. I believe we can do away with the need for a type with an
> >> 'UNDEFINED' range.  That is not too onerous, and there doesn't really
> >> seem to be too many ripple effect from have a typeless undefined range.
> >> I think we can contain that, and prevents us from having to add a hack
> >> to value_range for that situation.
> >>
> >> VARYING  is another situation completely.  We adopted the term 'varying'
> >> for ease of compatibility with VRP, but varying is simply a range going
> >> from MIN to MAX for a type.  Removing the type from that would then
> >> require we always pass a type in with every range which gets back to
> >> doubling the number of of parameters everywhere for no good reason.
> >>
> >> If we standardize value_range so that MIN and MAX are set for varying,
> >> then everything works very smoothly, we can make value_range and irange
> >> interchangeable and facilitate getting the range ops code into trunk.
> >>
> >> It seems like the only reason we cant do this right now is the CONST
> >> varying nodes that are returned from get_value_range().
> >> Looking at that routine, it seems there are only 2 cases where that can
> >> be returned
> >>    1) we ask for an ssa-name beyond the end of the local ssa-name vector
> >>    2) once values_propagated is set  to true and an ssa-name has no entry.
> >>
> >> Both seem pretty straightforward to fix...
> >> 1)  if we ask for an ssa-Name beyond the end, simply reallocate the
> >> vector to be big enough.   I doubt this will trigger a lot, and if we
> >> initially allocate it with room for an extra 10% names it'll probably
> >> never trigger.  Or pick whatever number seems appropriate.
> >> 2) if values_propagated is true, simply allocate a node for the
> >> ssa-name,  set it to varying and return it. THIs accomplishes the same
> >> thing.
> > But it's completely pointless to do this work!  I refuse to make the
> > existing code slower just to make ranger easier to merge (and your
> > compile-time comparisons worse for the old code - not that all the
> > ranger-driven C++ification helped here).
> slower? You really think you will *ever* measure this to be slower?
> mostly this is filling in the min and max fields for varying... Thats
> hardly expensive.
>
> Most of the "c++ification" that Aldy did was simple factoring out of the
> common code so we could also call it...  and did a lot of bug fixing
> along the way.  Code reuse should be a *good* thing and I would be very
> surprised if that refactoring has any tangible difference on execution
> speed.
>
> I fail to see how any of that, or what we are proposing here, will make
> things slower in any kind of measurable way.
>
> And on top of that, thanks to this work of Aldy's, we are also able to
> use value_range internally in the ranger now.  When we wish to try to
> integrate range_ops or the ranger, it will be using value_range as
> well.  So if we are making value_range slower, we will be making
> ourselves slower by the same amount since we'll share the code base.
> There is no nefarious plan to slow VRP down so we look better...  we're
> actually trying to  make it easier to compare whats in trunk to what we
> have.
>
>
> >
> >> the number of additional nodes we allocate will be pretty minimal, and
> >> it will never exceed the number of ssa-names. Then we never have to
> >> worry about having CONST set for a varying node either. I see places
> >> where there is special processing to avoid calling set_varying  because
> >> we dont know if the vbalue_range in hand is in constant memory and would
> >> cause the compiler to trap. This seems like a really bad situation, and
> >> it would eliminate that issue.  We can also then eliminate the places
> >> where VARYING is expanded to min/max for a given type.. instead we can
> >> just pick up min/max directly.    It seems much cleaner overall.
> >>
> >> Something like the simple attached patch would resolve that issue, and
> >> remove any lurking concerns/bugs with the CONST code.
> >>
> >> Then we can associate a type with varying, canonicalize it consistently,
> >> and set MIN/MAX appropriately.   This will give us full
> >> interchangeability between irange and value_range, establish a
> >> solid/consistent API,  and we can move on to the next thing :-)
> >>
> >> Does this not seem reasonable?
> > I still do not agree with the design decision of the ranger API.  You do
> > want that to be included in the end, no?  Of course I have no veto power
> > or whatnot but the API and implementation is awkward.  You didn't
>
> You mean you disagree with our range API?     the ranger API is
> completely disjoint from this and is not being proposed right now... and
> in fact EVRP can easily be mapped to the ranger API. It a pretty simple
> API for querying ranges.  Or maybe you meant the range-ops code and API
> for evaluating ranges (Which we have also not proposed yet).  It's being
> reworked since it was in pure prototype mode and hasn't been sanitized.

Since you never posted actual patches I had to look at the SVN repo
and find the files that do not exist on trunk in looking for the reason
you seem to insist on having a type.

> The API we are proposing here applies only to ranges, and specifically
> value_range. It is designed to mirror what we do with tree nodes. It is
> designed to promote ranges to be "just another kind of type" in GCC, one
> that we could eventually represent with a tree_node if we wanted to. Its
> really a type which specifies one or more sub-ranges for a type.   As
> such, it contains values which are expected to be of that type. Thats it.
>
> Our value_range type() query is simply extracting TREE_TYPE () from one
> of the bounds...  its really shorthand for the tree equivalent of
> TREE_TYPE (TREE_OPERAND (t, 0)) if there were a RANGE_TYPE tree node.
>
> If we want to know the bounds for what is referred to as  a varying
> node  ([MIN, MAX]), you have to have a type available. and if we store
> min and max, its always there.
>
> I still fail to understand why you are so dead set against representing
> varying as [MIN, MAX].  Its what varying *is*.

But you are actually _not_ doing that because then you'd not need to
store the type!  And generally we are only using operands with
lattice state(!) VARYING for operations that possibly narrow its range
(as an optimization, saving compile-time).

> And really, that's the only only fundamental change we are proposing
> here... And we aren't even adding a field to the structure to do it!

You are changing a function that essentially sets the lattice state
from "has a range" to "varying" to need a type.  That's fundamentally
wrong.

> Andrew
> > answer my concern about all the global ctors, did you?
> >
> > Richard.
> >
>
> We arent proposing anything with global constructors yet, so I didn't
> see the need to comment on it.
>
> 1) the entire range-ops code base is being restructured before we
> actually propose it.   Aldy provided whats currently there for
> completeness so you could see how we were using things if you wanted.
> It needs some cleaning up and I'm in the middle of that now.   It still
> retains a lot of the old switch structure that doesn't need to be there.
> Operational flow will be much easier to read and localized.
>
> 2) You don't like global constructors.  What is the actual concern? To
> the best of my knowledge the tidbit of things that would be happening
> there are not going to have an
> impact on anything. Each of the, what, 30ish constructors,  set a 'enum
> tree_code' field, and sets an array index to point to this object.
> That's it.

Startup cost, ctor ordering.

Richard.

> Im not aware of any policy that says we cant have compile time
> constructors.  Regardless, if the number of small constructors  is
> really a big concern, they can all be rolled into a single initializer
> for the table... assuming it even looks like this when we're ready to
> submit a proposal.      Its a similar amount of executable code, it just
> puts the onus on the author to register them since since they no longer
> self register with the table... Just an extra thing you have to remember
> to do when writing code for a new opcode.  And it would take 20 minutes
> to change.
>
> Six of one, half dozen of the other by my reckoning...  I just prefer to
> use useful language features when available and remove the human error
> element.
>
>
>
>
>
>
Jeff Law July 24, 2019, 4 p.m. UTC | #18
On 7/23/19 3:42 AM, Richard Biener wrote:
> On Tue, Jul 23, 2019 at 1:59 AM Jeff Law <law@redhat.com> wrote:
>>
>> On 7/16/19 12:37 PM, Andrew MacLeod wrote:
>>> On 7/9/19 5:56 AM, Richard Biener wrote:
>>>> On Tue, Jul 9, 2019 at 9:28 AM Aldy Hernandez <aldyh@redhat.com> wrote:
>>>>>
>>>>>
>>>>> On 7/4/19 6:33 AM, Richard Biener wrote:
>>>>>> On Wed, Jul 3, 2019 at 2:17 PM Aldy Hernandez <aldyh@redhat.com> wrote:
>>>>>>> On 7/3/19 7:08 AM, Richard Biener wrote:
>>>>>>>> On Wed, Jul 3, 2019 at 11:19 AM Aldy Hernandez <aldyh@redhat.com>
>>>>>>>> wrote:
>>>>>>> How about we keep VARYING and UNDEFINED typeless until right before we
>>>>>>> call into the ranger.  At which point, we have can populate min/max
>>>>>>> because we have the tree_code and the type handy.  So right before we
>>>>>>> call into the ranger do:
>>>>>>>
>>>>>>>           if (varying_p ())
>>>>>>>             foo->set_varying(TYPE);
>>>>>>>
>>>>>>> This would avoid the type cache, and keep the ranger happy.
>>>>>> you cannot do set_varying on the static const range but instead
>>>>>> you'd do
>>>>>>
>>>>>>     value_range tem (*foo);
>>>>>>     if (varying_p ())
>>>>>>      tem->set_full_range (TYPE);
>>>>>>
>>>>>> which I think we already do in some places.  Thus my question _where_
>>>>>> you actually need this.
>>>>> Basically, everywhere.  By having a type for varying/undefined, we don't
>>>>> have to special case anything.  Sure, we could for example, special case
>>>>> the invert operation for undefined / varying.  And we could special case
>>>>> everything dealing with ranges to handle varying and undefined, but why?
>>>>>    We could also pass a type argument everywhere, but that's just ugly.
>>>>> However, I do understand your objection to the type cache.
>>>>>
>>>>> How about the attached approach?  Set the type for varying/undefined
>>>>> when we know it, while avoiding touching the CONST varying.  Then right
>>>>> before calling the ranger, pass down a new varying node with min/max for
>>>>> any varyings that were still typeless until that point.
>>>>>
>>>>> I have taken care of never adding a set_varying() that was not already
>>>>> there.  Would this keep the const happy?
>>>>>
>>>>> Technically we don't need to set varying/undef types for every instance
>>>>> in VRP, but we need it at least for the code that will be shared with
>>>>> range-ops (extract_range_from_multiplicative_op, union, intersect, etc).
>>>>>    I just figured if we have the information, might as well set it for
>>>>> consistency.
>>>>>
>>>>> If you like this approach, I can rebase the other patches that depend on
>>>>> this one.
>>>> OK, so I went ant checked what you do for class irange which has
>>>> a type but no kind member (but constructors with a kind).  It also
>>>> uses wide_int members for storage.  For a pure integer constant
>>>> range representation this represents somewhat odd choices;  I'd
>>>> have elided the m_type member completely here, it seems fully
>>>> redundant.  Only range operations need to be carried out in a
>>>> specific type (what I was suggesting above).  Even the precision
>>>> encoded in the wide_int members is redundant then (I'd have
>>>> expected widest_int here and trailing-wide-ints for optimizing
>>>> storage).
>>>
>>> What irange contains internally is a bit arbitrary.  It's really an API
>>> for managing a set of subranges.  We could have used trees internally
>>> just as easily, then we wouldnt need a type field. Since we were doing
>>> lots of operations, rather than going back and forth from trees all the
>>> time, we just used the underlying wide_int directly.   we could have
>>> fiddle farted around with HOST_WIDE_INT or whatever, but wide_int is
>>> there, has all the operations, and it works fine. so thats what it
>>> currently is on the branch.
>>>
>>> We are treating a range object as a unique self contained object.
>>> Therefore, the range has a type so we know how to print it, and can
>>> confirm before any operation that the ranges being operated on are
>>> appropriately matched.  We found and opened bugzillas over the past
>>> couple years for places where our code caught bugs because a range was
>>> created and then operated on in a way that was not compatible with
>>> another range.  I think there is a still an open one against ada(?)
>>> where the switch and case  are different precision.
>>>
>>> From my point of view, a range object is similar to a tree node. A tree
>>> node has the bits to indicate what the value is, but also associates a
>>> type with those bits within the same object.  This is less error prone
>>> than passing around the bits and the type separately.  As ranges are
>>> starting to be used in many places outside of VRP, we should do the same
>>> thing with ranges.  WIth value_range it would actually be free since
>>> there is already a tree for the bounds already which contains the type.
>>>
>>>
>>>
>>>
>>>
>>>> to fold_range/op_range?  The API also seems to be oddly
>>>> constrained to binary ops.  Anyway, the way you build
>>>> the operator table requires an awful lot of global C++ ctor
>>>> invocations, sth we generally try to avoid.  But I'm getting
>>>> into too many details here.
>>>
>>> Its "oddly constrained" because what you are looking at  is just the
>>> standardized unary/binary ops code.. ie the equivalent of
>>> extract_range_from_binary_expr() and extract_range_from_unary_expr().
>>> The other ops we care about have specialized requirements, like PHIs
>>> and the arbitrary numbers of parameters in a call, or anything less
>>> common than one or two operands.    You are just not seeing those parts.
>>>
>>>>
>>>> So - to answer your question above, I'd like you to pass down
>>>> a type to operations.  Because that's what is fundamentally
>>>> required - a range doesn't have a "type" and the current
>>>> value_range_base doesn't fall into the trap of needing one.
>>>>
>>>> Richard.
>>>>
>>>
>>> Why is having a type associated with the data a "trap"?   Perhaps the
>>> old VRP lattice idea didn't need a type with the UNDEFINED and VARYING
>>> lattice values, but we're moving past lattice values and into a realm
>>> where we have ranges as useful things outside of VRP, and trying to
>>> shoehorn lattice values does not seem appropriate anymore.
>>>
>>> I looked at implementing range-ops without a type in the range, and we
>>> end up passing 2 parameters everywhere each time we do anything with a
>>> range.  This doubled the number of parameters in most routines, and when
>>> we had chains of calls, we were simply passing the type along with the
>>> range.  It seems archaic to be constantly passing information around
>>> instead of associating it with the range itself.  Its just another place
>>> for an error to creep in.. Aldy found a place where we were creating
>>> varying nodes for floats IIRC.. the type checking in the range
>>> operations caught it precisely because the range returned wasn't the
>>> type it was assumed to be.
>>>
>>> That said. I believe we can do away with the need for a type with an
>>> 'UNDEFINED' range.  That is not too onerous, and there doesn't really
>>> seem to be too many ripple effect from have a typeless undefined range.
>>> I think we can contain that, and prevents us from having to add a hack
>>> to value_range for that situation.
>>>
>>> VARYING  is another situation completely.  We adopted the term 'varying'
>>> for ease of compatibility with VRP, but varying is simply a range going
>>> from MIN to MAX for a type.  Removing the type from that would then
>>> require we always pass a type in with every range which gets back to
>>> doubling the number of of parameters everywhere for no good reason.
>>>
>>> If we standardize value_range so that MIN and MAX are set for varying,
>>> then everything works very smoothly, we can make value_range and irange
>>> interchangeable and facilitate getting the range ops code into trunk.
>>>
>>> It seems like the only reason we cant do this right now is the CONST
>>> varying nodes that are returned from get_value_range().
>>> Looking at that routine, it seems there are only 2 cases where that can
>>> be returned
>>>  1) we ask for an ssa-name beyond the end of the local ssa-name vector
>>>  2) once values_propagated is set  to true and an ssa-name has no entry.
>>>
>>> Both seem pretty straightforward to fix...
>>> 1)  if we ask for an ssa-Name beyond the end, simply reallocate the
>>> vector to be big enough.   I doubt this will trigger a lot, and if we
>>> initially allocate it with room for an extra 10% names it'll probably
>>> never trigger.  Or pick whatever number seems appropriate.
>>> 2) if values_propagated is true, simply allocate a node for the
>>> ssa-name,  set it to varying and return it. THIs accomplishes the same
>>> thing.
>>>
>>> the number of additional nodes we allocate will be pretty minimal, and
>>> it will never exceed the number of ssa-names. Then we never have to
>>> worry about having CONST set for a varying node either. I see places
>>> where there is special processing to avoid calling set_varying  because
>>> we dont know if the vbalue_range in hand is in constant memory and would
>>> cause the compiler to trap. This seems like a really bad situation, and
>>> it would eliminate that issue.  We can also then eliminate the places
>>> where VARYING is expanded to min/max for a given type.. instead we can
>>> just pick up min/max directly.    It seems much cleaner overall.
>>>
>>> Something like the simple attached patch would resolve that issue, and
>>> remove any lurking concerns/bugs with the CONST code.
>>>
>>> Then we can associate a type with varying, canonicalize it consistently,
>>> and set MIN/MAX appropriately.   This will give us full
>>> interchangeability between irange and value_range, establish a
>>> solid/consistent API,  and we can move on to the next thing :-)
>>>
>>> Does this not seem reasonable?
>> One might even claim that this patch in and of itself is a step forward
>> independent of all the other work going on.
>>
>> THe first time I saw that code when I copied it into vr-values I thought
>> it was ripe for fixing, but never got to it.  As it stands, it's a hack,
>> no more, no less and it's a hack that we can easily remove and do
>> something better.
> 
> It's not a hack - it's an optimization and a _sanitization_.  Iff you
> want to remove it then
> make it return NULL (no range) and make all callers deal with that.
We can do better here.  Callers shouldn't have to cope with the fact
that SSA_NAMEs and thus ranges can be created on the fly (gimple-fold)
and they should be able to get a valid range object that is no different
than any other range object.   NULL is no better than the constant
varying node.

If you really think we need some kind of special case here for a client,
can you please elaborate which client and why?  Clearly it's something
you feel strongly about and I'm open to the possibility there's a case
I'm not aware of.



> 
> That we return a pointer to read-only storage makes sure callers do not
> change the value-range.  That's better than the alternative below which
> is the appropriate "fix" for the existing code if you want to get rid of the
> statically allocated constant varying and the "hack" of CONST_CASTing it.
But I'd claim that if callers are required not to change these ranges,
then the callers are fundamentally broken.  I'm not sure what the
"sanitization" is really buying you here.  Can you point to something
specific?

> 
> But you lose the sanitizing that nobody can change it and the
> changed info leaks to other SSA vars.
> 
> As said, fix all callers to deal with NULL.
> 
> But I argue the current code is exactly optimal and safe.
ANd I'd argue that it's just plain broken and that the sanitization
you're referring to points to something broken elsewhere,  higher up in
the callers.

jeff
Richard Biener July 24, 2019, 5 p.m. UTC | #19
On July 24, 2019 6:00:04 PM GMT+02:00, Jeff Law <law@redhat.com> wrote:
>On 7/23/19 3:42 AM, Richard Biener wrote:
>> On Tue, Jul 23, 2019 at 1:59 AM Jeff Law <law@redhat.com> wrote:
>>>
>>> On 7/16/19 12:37 PM, Andrew MacLeod wrote:
>>>> On 7/9/19 5:56 AM, Richard Biener wrote:
>>>>> On Tue, Jul 9, 2019 at 9:28 AM Aldy Hernandez <aldyh@redhat.com>
>wrote:
>>>>>>
>>>>>>
>>>>>> On 7/4/19 6:33 AM, Richard Biener wrote:
>>>>>>> On Wed, Jul 3, 2019 at 2:17 PM Aldy Hernandez <aldyh@redhat.com>
>wrote:
>>>>>>>> On 7/3/19 7:08 AM, Richard Biener wrote:
>>>>>>>>> On Wed, Jul 3, 2019 at 11:19 AM Aldy Hernandez
><aldyh@redhat.com>
>>>>>>>>> wrote:
>>>>>>>> How about we keep VARYING and UNDEFINED typeless until right
>before we
>>>>>>>> call into the ranger.  At which point, we have can populate
>min/max
>>>>>>>> because we have the tree_code and the type handy.  So right
>before we
>>>>>>>> call into the ranger do:
>>>>>>>>
>>>>>>>>           if (varying_p ())
>>>>>>>>             foo->set_varying(TYPE);
>>>>>>>>
>>>>>>>> This would avoid the type cache, and keep the ranger happy.
>>>>>>> you cannot do set_varying on the static const range but instead
>>>>>>> you'd do
>>>>>>>
>>>>>>>     value_range tem (*foo);
>>>>>>>     if (varying_p ())
>>>>>>>      tem->set_full_range (TYPE);
>>>>>>>
>>>>>>> which I think we already do in some places.  Thus my question
>_where_
>>>>>>> you actually need this.
>>>>>> Basically, everywhere.  By having a type for varying/undefined,
>we don't
>>>>>> have to special case anything.  Sure, we could for example,
>special case
>>>>>> the invert operation for undefined / varying.  And we could
>special case
>>>>>> everything dealing with ranges to handle varying and undefined,
>but why?
>>>>>>    We could also pass a type argument everywhere, but that's just
>ugly.
>>>>>> However, I do understand your objection to the type cache.
>>>>>>
>>>>>> How about the attached approach?  Set the type for
>varying/undefined
>>>>>> when we know it, while avoiding touching the CONST varying.  Then
>right
>>>>>> before calling the ranger, pass down a new varying node with
>min/max for
>>>>>> any varyings that were still typeless until that point.
>>>>>>
>>>>>> I have taken care of never adding a set_varying() that was not
>already
>>>>>> there.  Would this keep the const happy?
>>>>>>
>>>>>> Technically we don't need to set varying/undef types for every
>instance
>>>>>> in VRP, but we need it at least for the code that will be shared
>with
>>>>>> range-ops (extract_range_from_multiplicative_op, union,
>intersect, etc).
>>>>>>    I just figured if we have the information, might as well set
>it for
>>>>>> consistency.
>>>>>>
>>>>>> If you like this approach, I can rebase the other patches that
>depend on
>>>>>> this one.
>>>>> OK, so I went ant checked what you do for class irange which has
>>>>> a type but no kind member (but constructors with a kind).  It also
>>>>> uses wide_int members for storage.  For a pure integer constant
>>>>> range representation this represents somewhat odd choices;  I'd
>>>>> have elided the m_type member completely here, it seems fully
>>>>> redundant.  Only range operations need to be carried out in a
>>>>> specific type (what I was suggesting above).  Even the precision
>>>>> encoded in the wide_int members is redundant then (I'd have
>>>>> expected widest_int here and trailing-wide-ints for optimizing
>>>>> storage).
>>>>
>>>> What irange contains internally is a bit arbitrary.  It's really an
>API
>>>> for managing a set of subranges.  We could have used trees
>internally
>>>> just as easily, then we wouldnt need a type field. Since we were
>doing
>>>> lots of operations, rather than going back and forth from trees all
>the
>>>> time, we just used the underlying wide_int directly.   we could
>have
>>>> fiddle farted around with HOST_WIDE_INT or whatever, but wide_int
>is
>>>> there, has all the operations, and it works fine. so thats what it
>>>> currently is on the branch.
>>>>
>>>> We are treating a range object as a unique self contained object.
>>>> Therefore, the range has a type so we know how to print it, and can
>>>> confirm before any operation that the ranges being operated on are
>>>> appropriately matched.  We found and opened bugzillas over the past
>>>> couple years for places where our code caught bugs because a range
>was
>>>> created and then operated on in a way that was not compatible with
>>>> another range.  I think there is a still an open one against ada(?)
>>>> where the switch and case  are different precision.
>>>>
>>>> From my point of view, a range object is similar to a tree node. A
>tree
>>>> node has the bits to indicate what the value is, but also
>associates a
>>>> type with those bits within the same object.  This is less error
>prone
>>>> than passing around the bits and the type separately.  As ranges
>are
>>>> starting to be used in many places outside of VRP, we should do the
>same
>>>> thing with ranges.  WIth value_range it would actually be free
>since
>>>> there is already a tree for the bounds already which contains the
>type.
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>> to fold_range/op_range?  The API also seems to be oddly
>>>>> constrained to binary ops.  Anyway, the way you build
>>>>> the operator table requires an awful lot of global C++ ctor
>>>>> invocations, sth we generally try to avoid.  But I'm getting
>>>>> into too many details here.
>>>>
>>>> Its "oddly constrained" because what you are looking at  is just
>the
>>>> standardized unary/binary ops code.. ie the equivalent of
>>>> extract_range_from_binary_expr() and
>extract_range_from_unary_expr().
>>>> The other ops we care about have specialized requirements, like
>PHIs
>>>> and the arbitrary numbers of parameters in a call, or anything less
>>>> common than one or two operands.    You are just not seeing those
>parts.
>>>>
>>>>>
>>>>> So - to answer your question above, I'd like you to pass down
>>>>> a type to operations.  Because that's what is fundamentally
>>>>> required - a range doesn't have a "type" and the current
>>>>> value_range_base doesn't fall into the trap of needing one.
>>>>>
>>>>> Richard.
>>>>>
>>>>
>>>> Why is having a type associated with the data a "trap"?   Perhaps
>the
>>>> old VRP lattice idea didn't need a type with the UNDEFINED and
>VARYING
>>>> lattice values, but we're moving past lattice values and into a
>realm
>>>> where we have ranges as useful things outside of VRP, and trying to
>>>> shoehorn lattice values does not seem appropriate anymore.
>>>>
>>>> I looked at implementing range-ops without a type in the range, and
>we
>>>> end up passing 2 parameters everywhere each time we do anything
>with a
>>>> range.  This doubled the number of parameters in most routines, and
>when
>>>> we had chains of calls, we were simply passing the type along with
>the
>>>> range.  It seems archaic to be constantly passing information
>around
>>>> instead of associating it with the range itself.  Its just another
>place
>>>> for an error to creep in.. Aldy found a place where we were
>creating
>>>> varying nodes for floats IIRC.. the type checking in the range
>>>> operations caught it precisely because the range returned wasn't
>the
>>>> type it was assumed to be.
>>>>
>>>> That said. I believe we can do away with the need for a type with
>an
>>>> 'UNDEFINED' range.  That is not too onerous, and there doesn't
>really
>>>> seem to be too many ripple effect from have a typeless undefined
>range.
>>>> I think we can contain that, and prevents us from having to add a
>hack
>>>> to value_range for that situation.
>>>>
>>>> VARYING  is another situation completely.  We adopted the term
>'varying'
>>>> for ease of compatibility with VRP, but varying is simply a range
>going
>>>> from MIN to MAX for a type.  Removing the type from that would then
>>>> require we always pass a type in with every range which gets back
>to
>>>> doubling the number of of parameters everywhere for no good reason.
>>>>
>>>> If we standardize value_range so that MIN and MAX are set for
>varying,
>>>> then everything works very smoothly, we can make value_range and
>irange
>>>> interchangeable and facilitate getting the range ops code into
>trunk.
>>>>
>>>> It seems like the only reason we cant do this right now is the
>CONST
>>>> varying nodes that are returned from get_value_range().
>>>> Looking at that routine, it seems there are only 2 cases where that
>can
>>>> be returned
>>>>  1) we ask for an ssa-name beyond the end of the local ssa-name
>vector
>>>>  2) once values_propagated is set  to true and an ssa-name has no
>entry.
>>>>
>>>> Both seem pretty straightforward to fix...
>>>> 1)  if we ask for an ssa-Name beyond the end, simply reallocate the
>>>> vector to be big enough.   I doubt this will trigger a lot, and if
>we
>>>> initially allocate it with room for an extra 10% names it'll
>probably
>>>> never trigger.  Or pick whatever number seems appropriate.
>>>> 2) if values_propagated is true, simply allocate a node for the
>>>> ssa-name,  set it to varying and return it. THIs accomplishes the
>same
>>>> thing.
>>>>
>>>> the number of additional nodes we allocate will be pretty minimal,
>and
>>>> it will never exceed the number of ssa-names. Then we never have to
>>>> worry about having CONST set for a varying node either. I see
>places
>>>> where there is special processing to avoid calling set_varying 
>because
>>>> we dont know if the vbalue_range in hand is in constant memory and
>would
>>>> cause the compiler to trap. This seems like a really bad situation,
>and
>>>> it would eliminate that issue.  We can also then eliminate the
>places
>>>> where VARYING is expanded to min/max for a given type.. instead we
>can
>>>> just pick up min/max directly.    It seems much cleaner overall.
>>>>
>>>> Something like the simple attached patch would resolve that issue,
>and
>>>> remove any lurking concerns/bugs with the CONST code.
>>>>
>>>> Then we can associate a type with varying, canonicalize it
>consistently,
>>>> and set MIN/MAX appropriately.   This will give us full
>>>> interchangeability between irange and value_range, establish a
>>>> solid/consistent API,  and we can move on to the next thing :-)
>>>>
>>>> Does this not seem reasonable?
>>> One might even claim that this patch in and of itself is a step
>forward
>>> independent of all the other work going on.
>>>
>>> THe first time I saw that code when I copied it into vr-values I
>thought
>>> it was ripe for fixing, but never got to it.  As it stands, it's a
>hack,
>>> no more, no less and it's a hack that we can easily remove and do
>>> something better.
>> 
>> It's not a hack - it's an optimization and a _sanitization_.  Iff you
>> want to remove it then
>> make it return NULL (no range) and make all callers deal with that.
>We can do better here.  Callers shouldn't have to cope with the fact
>that SSA_NAMEs and thus ranges can be created on the fly (gimple-fold)
>and they should be able to get a valid range object that is no
>different
>than any other range object.   NULL is no better than the constant
>varying node.
>
>If you really think we need some kind of special case here for a
>client,
>can you please elaborate which client and why?  Clearly it's something
>you feel strongly about and I'm open to the possibility there's a case
>I'm not aware of.
>
>
>
>> 
>> That we return a pointer to read-only storage makes sure callers do
>not
>> change the value-range.  That's better than the alternative below
>which
>> is the appropriate "fix" for the existing code if you want to get rid
>of the
>> statically allocated constant varying and the "hack" of CONST_CASTing
>it.
>But I'd claim that if callers are required not to change these ranges,
>then the callers are fundamentally broken.  I'm not sure what the
>"sanitization" is really buying you here.  Can you point to something
>specific?
>
>> 
>> But you lose the sanitizing that nobody can change it and the
>> changed info leaks to other SSA vars.
>> 
>> As said, fix all callers to deal with NULL.
>> 
>> But I argue the current code is exactly optimal and safe.
>ANd I'd argue that it's just plain broken and that the sanitization
>you're referring to points to something broken elsewhere,  higher up in
>the callers.

Another option is to make get_value_range return by value and the only way to change the lattice to call an appropriate set function. I think we already do the latter in all cases (but we use get_value_range in the setter) and returning by reference is just eliding the copy. 

Richard. 

>
>jeff
Andrew MacLeod July 24, 2019, 5:43 p.m. UTC | #20
On 7/24/19 9:25 AM, Richard Biener wrote:
> On Wed, Jul 24, 2019 at 12:56 AM Andrew MacLeod <amacleod@redhat.com> wrote:
>> On 7/23/19 5:33 AM, Richard Biener wrote:
>>
>>>>    From my point of view, a range object is similar to a tree node. A tree
>>>> node has the bits to indicate what the value is, but also associates a
>>>> type with those bits within the same object.  This is less error prone
>>>> than passing around the bits and the type separately.  As ranges are
>>>> starting to be used in many places outside of VRP, we should do the same
>>>> thing with ranges.  WIth value_range it would actually be free since
>>>> there is already a tree for the bounds already which contains the type.
>>> Not for VARYING or UNDEFINED.
>> Personally, Id put it in both for consistency.  I view a range as an
>> object we are associating with either an expression or an ssa-name. That
>> range should then have a consistent type with that name or expression.
>> Even if the range is empty, I would still expect it to be compatible
>> since I'm usually associating it with an ssa_name or expression.
> Imagine you have int i; long l; and you want to ask if both variables
> have the same value range, how would you do that?  What is "same" here?


you would do it exactly the same as you do with value_range today. What 
does adding a type() method which maps to TREE_TYPE (range.min()), which 
also works on VARYING now, have to do with any of this?
We aren't suddenly adding a bunch of restrictions to value_range...

you are arguing about irrelevant things to this patch.   we aren't 
adding a new set of restrictions that don't allow you to do something 
you could do before.


>
>> Because you seem so dead set against it, I can also see some consistency
>> in not having a type for undefined... Since there are 0 ranges, we can't
>> ask for a lower_bound () or upper_bound () on an empty range, so I can
>> see an argument for extending that not being able to ask the type()
>> either.  I agree that this is also a plausible viewpoint, and we are
>> changing our code to make that happen, even though we have to pass a few
>> extra types around and will lose some minor sanity checking when doing
>> intersection or union with an empty range.   To me an empty range is
>> akin to a NaN.. it isn't a real float value, but it does still have a
>> type associated with it.
> Both VARYING and UNDEFINED are currently lattice states, not real
> (integer) ranges.  irange (rightfully so) isn't convering the lattice state
> and the current value_range does so for historical reasons.  In
> principle the VRP lattice could be
>
>    struct { enum lattice_state state; irange *range; };
>
> where for state != VR_[ANTI]RANGE 'range' is NULL.
>
> You shouldn't tie your hands by trying to design something that plugs
> 1:1 into the existing uses.  That constrains you too much IMHO.
>
> Given that we do not represent VARYING as [-INF, +INF] currently
> and you appearantly want to preserve that(?) it should be
> straight-forward to add a irange_from_varying  (type)
> irange_from_undefined (type)
> if you want to get a range for a lattice state VARYING/UNDEFINED.
>

You've read the patch...  of course that is the main change we are 
proposing.
varying == [-INF, +INF] ==  [MIN, MAX]

the patch enforces that if you set the state to varying, it will set the 
endpoints to [MIN, MAX] for that object.
It also enforces that if you set the range to [MIN, MAX], it will also 
change the lattice value to VARYING for you, which doesn't always happen 
today.

And it is disengenuous to suggest that VARYING is typeless.   a varying 
char and a varying long are not equivalent.
If we assign a varying char to an unsigned int, the result is no longer 
varying, we get  a range of [0,255].
By claiming its some magical typeless state, the unsigned int would then 
become varying as well , or [0, 4294967295] which is wrong.

so varying is *not* typeless, and it is always representative of 
something which has a type.

A varying char has a range of [0,255]. We just want to make that 
available in value_range without everyone having to write
   if (r.varying_p())
      lbound = min_for_type (type_passed in);
   else
     lbound = r.min()

everytime they want to look at the lower bound.   The lower bound is now 
just always set.


thats it.  Thats what the patch is about.




>> So we can now normalize that code to not special case varying.. simply
>> ask for lower_bound and upper_bound ().  Its less special casing, its
>> consistent, costs no memory.. It all seems like a really good cleanup
>> which you appear to be opposing for reasons I can't quite fathom.
>>
>>
>>>>
>>> Well, I don't agree.  It's not archaic, it's the way GCC works everywhere
>>> else.  So it's consistent.  Archaically consistend in your view.
>> Are you really arguing we should write code the same way for 30+ years
>> and never change?
> Yes..

That is problematic at best.

>> Indeed, I would argue that when we have the opportunity to modernize
>> code in our code base, we really ought to be doing it... and we don't do
>> it nearly enough.
>>
>> Why don't we just pass a type along with every tree node instead of
>> including it in the node? we'd save memory in each tree node.   I
>> *really* fail to understand this argument of keeping the type separate
>> from the range when the range by definition represents sub-ranges of a
>> specific type.
> Does it?  Only if you consider the narrow view that each range
> comes from an operation on a GIMPLE/GENERIC expression.
>
> But like wide_ints iranges might be used for general range computations
> (see my example above).

again, we are not talking about irange.  we are using value range. In 
trunk.  In our branch.  everywhere.   Back to the patch please.

Not to mention the presence of a 'type' field in the structure has 
absolutely nothing to do with what you can do with irange.  We can 
extract the none-type part into a base class.. and implement it as:

class range_base {
   wide_int [num]
}

class gcc_irange : range_base
{
   tree type;
}

and you can do whatever you want with range_base in a typeless way.


This entire argument line is pointless and  has nothing to do with this 
patch.




>>
>>> that irange storage should not be wide_int but trailing-widest-int.
>>> Because obviously irange has storage requirement.  It makes
>>> in-place modification of iranges impossible but that's the way modern
>>> C++ code works (and be consistent with how wide-int works).
>>>
>>> I don't really like you invented an API unlike any existing in GCC.
>>>
>> I'd say you are kidding, but I'm afraid you aren't.
>>
>> Regardless, we don't need to discuss irange_storage details as we aren't
>> proposing integration of irange, nor irange_storage in the near
>> future... so I don't think the internal details of that prototype matter
>> at this point, only what we are trying to do with value_range.
> Oh, you are not?  Why do we argue changing tree-vrp.c then?

because we are using value_range as the range representation and setting 
the min/max for varying is important.
forget irange.
we may NEVER use irange.
We may instead extend value_range to handle multiple subranges someday.

And we aren't doing that right now either, so lets not comment on that 
either.


>> The API we are proposing here applies only to ranges, and specifically
>> value_range. It is designed to mirror what we do with tree nodes. It is
>> designed to promote ranges to be "just another kind of type" in GCC, one
>> that we could eventually represent with a tree_node if we wanted to. Its
>> really a type which specifies one or more sub-ranges for a type.   As
>> such, it contains values which are expected to be of that type. Thats it.
>>
>> Our value_range type() query is simply extracting TREE_TYPE () from one
>> of the bounds...  its really shorthand for the tree equivalent of
>> TREE_TYPE (TREE_OPERAND (t, 0)) if there were a RANGE_TYPE tree node.
>>
>> If we want to know the bounds for what is referred to as  a varying
>> node  ([MIN, MAX]), you have to have a type available. and if we store
>> min and max, its always there.
>>
>> I still fail to understand why you are so dead set against representing
>> varying as [MIN, MAX].  Its what varying *is*.
> But you are actually _not_ doing that because then you'd not need to
> store the type!  And generally we are only using operands with
> lattice state(!) VARYING for operations that possibly narrow its range
> (as an optimization, saving compile-time).
>

Are you talking about irange again?   we aren't storing a type in 
value_range.

we are merely accessing the type of the minimum range value, and for 
convenience making sure the bounds are also set for varying nodes which 
represents [min, max] for a type..






>> And really, that's the only only fundamental change we are proposing
>> here... And we aren't even adding a field to the structure to do it!
> You are changing a function that essentially sets the lattice state
> from "has a range" to "varying" to need a type.  That's fundamentally
> wrong.


And thats where we disagree.  As I pointed out earlier, varying is not a 
magical typeless state, no matter how much you pretend it is.

"has a range" obviously has a type, I can look at the lower bound and 
see what it is.  Moving it to varying doesn't magically change that.  I 
don't  "need" a type, its right there.  Of course it has a type.  The 
range of that object is now [MIN, MAX], which for convenience of VRP's 
algorithm, also has a lattice value of 'varying'     And that's all 
we're doing is actually setting those MIN/MAX values when it is moved to 
varying. How can that possibly be "fundamentally wrong".   Thats just an 
opinion mired in history because in 2005 Diego didn't bother setting the 
min/max for varying when he wrote it, but he easily could have.

In fact, It prompted me to chat with him about this...  I will quote 
him.  "you can also do VARYING == RANGE[MIN, MAX].    inyour 
is_varying_p() predicate you just return true when the range is [min, 
max].  It's really that simple."

If he had done that in the first place we wouldn't even be having this 
conversation. The change we are proposing has *ZERO* impact on how value 
range is used today (other than fixing some lurking issues).    In every 
single place where set_varying is called, we are setting varying for an 
object which has a type.  We never have to "make up" a type.

I can see no technical argument for why this is not acceptable. 
Everything you have argued has merely been vague opinions along the 
lines of "varying shouldn't have a type, or maybe I want to do this... 
", yet everywhere in the compiler where we use it, it *is* associated 
with a type.

Its an easy patch and this is taking far too long for something so 
trivial. I still can't figure out why its an issue you choose to 
resist.  Quite frankly, the painfulness of this is an example of why it 
is so hard to get people to contribute to gcc.   We really need to 
encourage innovation in our compiler, not stifle it.... or we will be dead.

Andrew
Jeff Law July 24, 2019, 6:18 p.m. UTC | #21
On 7/24/19 11:00 AM, Richard Biener wrote:
[ Big snip, ignore missing reply attributions... ]

>> it. But I'd claim that if callers are required not to change these
>> ranges, then the callers are fundamentally broken.  I'm not sure
>> what the "sanitization" is really buying you here.  Can you point
>> to something specific?
>> 
>>> 
>>> But you lose the sanitizing that nobody can change it and the 
>>> changed info leaks to other SSA vars.
>>> 
>>> As said, fix all callers to deal with NULL.
>>> 
>>> But I argue the current code is exactly optimal and safe.
>> ANd I'd argue that it's just plain broken and that the
>> sanitization you're referring to points to something broken
>> elsewhere,  higher up in the callers.
> 
> Another option is to make get_value_range return by value and the
> only way to change the lattice to call an appropriate set function. I
> think we already do the latter in all cases (but we use
> get_value_range in the setter) and returning by reference is just
> eliding the copy.
OK, so what I think you're getting at (and please correct me if I'm
wrong) is that once the lattice values are set, you don't want something
changing the recorded ranges underneath?

ISTM the way to enforce that is to embed the concept in the class and
enforce it by not allowing direct manipulation of range by the clients.
 So a client that wants this behavior somehow tells the class that
ranges are "set in stone" and from that point the setters don't allow
changing the underlying ranges.

I just want to make sure we're on the same page WRT why you think the
constant varying range object is useful.

jeff
Richard Biener July 24, 2019, 6:33 p.m. UTC | #22
On July 24, 2019 8:18:57 PM GMT+02:00, Jeff Law <law@redhat.com> wrote:
>On 7/24/19 11:00 AM, Richard Biener wrote:
>[ Big snip, ignore missing reply attributions... ]
>
>>> it. But I'd claim that if callers are required not to change these
>>> ranges, then the callers are fundamentally broken.  I'm not sure
>>> what the "sanitization" is really buying you here.  Can you point
>>> to something specific?
>>> 
>>>> 
>>>> But you lose the sanitizing that nobody can change it and the 
>>>> changed info leaks to other SSA vars.
>>>> 
>>>> As said, fix all callers to deal with NULL.
>>>> 
>>>> But I argue the current code is exactly optimal and safe.
>>> ANd I'd argue that it's just plain broken and that the
>>> sanitization you're referring to points to something broken
>>> elsewhere,  higher up in the callers.
>> 
>> Another option is to make get_value_range return by value and the
>> only way to change the lattice to call an appropriate set function. I
>> think we already do the latter in all cases (but we use
>> get_value_range in the setter) and returning by reference is just
>> eliding the copy.
>OK, so what I think you're getting at (and please correct me if I'm
>wrong) is that once the lattice values are set, you don't want
>something
>changing the recorded ranges underneath?
>
>ISTM the way to enforce that is to embed the concept in the class and
>enforce it by not allowing direct manipulation of range by the clients.
> So a client that wants this behavior somehow tells the class that
>ranges are "set in stone" and from that point the setters don't allow
>changing the underlying ranges.

Yes. You'll see that nearly all callers do

Value_range vr = *get_value_range (name);

Modify 

Update_value_range (name, &vr) ;

And returning by reference was mostly an optimization. We _did_ have callers Changing the range in place and the const varying catched those. 

When returning by value we can return individual VARYINGs not in the lattice if we decide that's what we want. 

>I just want to make sure we're on the same page WRT why you think the
>constant varying range object is useful.

As said it's an optimization. We do not want to reallocate the lattice. And we want lattice updating to happen in a controlled manner, so returning a pointer into the lattice is bad design at this point. 

Richard. 

>jeff
Andrew MacLeod July 24, 2019, 7:50 p.m. UTC | #23
On 7/24/19 2:18 PM, Jeff Law wrote:
> On 7/24/19 11:00 AM, Richard Biener wrote:
> [ Big snip, ignore missing reply attributions... ]
>
>>> it. But I'd claim that if callers are required not to change these
>>> ranges, then the callers are fundamentally broken.  I'm not sure
>>> what the "sanitization" is really buying you here.  Can you point
>>> to something specific?
>>>
>>>> But you lose the sanitizing that nobody can change it and the
>>>> changed info leaks to other SSA vars.
>>>>
>>>> As said, fix all callers to deal with NULL.
>>>>
>>>> But I argue the current code is exactly optimal and safe.
>>> ANd I'd argue that it's just plain broken and that the
>>> sanitization you're referring to points to something broken
>>> elsewhere,  higher up in the callers.
>> Another option is to make get_value_range return by value and the
>> only way to change the lattice to call an appropriate set function. I
>> think we already do the latter in all cases (but we use
>> get_value_range in the setter) and returning by reference is just
>> eliding the copy.
> OK, so what I think you're getting at (and please correct me if I'm
> wrong) is that once the lattice values are set, you don't want something
> changing the recorded ranges underneath?
>
> ISTM the way to enforce that is to embed the concept in the class and
> enforce it by not allowing direct manipulation of range by the clients.
>   So a client that wants this behavior somehow tells the class that
> ranges are "set in stone" and from that point the setters don't allow
> changing the underlying ranges.
>
> I just want to make sure we're on the same page WRT why you think the
> constant varying range object is useful.
>
> jeff

That is not the functionality we are seeing.

whenever get_value_range silently returns a CONST varying node,  the 
ONLY way you can tell that the node might possibly be const elsewhere 
would be if you first check that it is varying, like in  :

    void
    vr_values::set_defs_to_varying (gimple *stmt)
    {
       ssa_op_iter i;
       tree def;
       FOR_EACH_SSA_TREE_OPERAND (def, stmt, i, SSA_OP_DEF)
         {
           value_range *vr = get_value_range (def);
           /* Avoid writing to vr_const_varying get_value_range may
    return.  */
           if (!vr->varying_p ())
             vr->set_varying ();
         }
    }



Which means there can be *no* context in which we ever try move one of 
these nodes from varying to anything else, or we'd trap on a write to 
read-only space.

Which means no place is ever trying to change those nodes from varying 
to anything else.  But nothing is preventing changes from other ranges 
to something else.

Which also means the only thing this approach accomplishes is to force 
us to check if a node is already varying, so that we don't overwrite the 
node to varying just in case its a hidden const.

how can this hidden const node really be useful?

I submit this is just a dangerous way to flag previously unprocessed 
nodes as VARYING for the duration of the pass after values_propagated is 
set...  not some higher level "Don't change this range any more" plan.  
Its already bottom of the lattice..  it isn't going anywhere.

Andrew
Richard Biener July 25, 2019, 8:34 a.m. UTC | #24
On Wed, Jul 24, 2019 at 9:50 PM Andrew MacLeod <amacleod@redhat.com> wrote:
>
> On 7/24/19 2:18 PM, Jeff Law wrote:
>
> On 7/24/19 11:00 AM, Richard Biener wrote:
> [ Big snip, ignore missing reply attributions... ]
>
> it. But I'd claim that if callers are required not to change these
> ranges, then the callers are fundamentally broken.  I'm not sure
> what the "sanitization" is really buying you here.  Can you point
> to something specific?
>
> But you lose the sanitizing that nobody can change it and the
> changed info leaks to other SSA vars.
>
> As said, fix all callers to deal with NULL.
>
> But I argue the current code is exactly optimal and safe.
>
> ANd I'd argue that it's just plain broken and that the
> sanitization you're referring to points to something broken
> elsewhere,  higher up in the callers.
>
> Another option is to make get_value_range return by value and the
> only way to change the lattice to call an appropriate set function. I
> think we already do the latter in all cases (but we use
> get_value_range in the setter) and returning by reference is just
> eliding the copy.
>
> OK, so what I think you're getting at (and please correct me if I'm
> wrong) is that once the lattice values are set, you don't want something
> changing the recorded ranges underneath?
>
> ISTM the way to enforce that is to embed the concept in the class and
> enforce it by not allowing direct manipulation of range by the clients.
>  So a client that wants this behavior somehow tells the class that
> ranges are "set in stone" and from that point the setters don't allow
> changing the underlying ranges.
>
> I just want to make sure we're on the same page WRT why you think the
> constant varying range object is useful.
>
> jeff
>
>
> That is not the functionality we are seeing.
>
> whenever get_value_range silently returns a CONST varying node,  the ONLY way you can tell that the node might possibly be const elsewhere would be if you first check that it is varying, like in  :
>
> void
> vr_values::set_defs_to_varying (gimple *stmt)
> {
>   ssa_op_iter i;
>   tree def;
>   FOR_EACH_SSA_TREE_OPERAND (def, stmt, i, SSA_OP_DEF)
>     {
>       value_range *vr = get_value_range (def);
>       /* Avoid writing to vr_const_varying get_value_range may return.  */
>       if (!vr->varying_p ())
>         vr->set_varying ();
>     }
> }
>
>
>
> Which means there can be *no* context in which we ever try move one of these nodes from varying to anything else, or we'd trap on a write to read-only space.
>
> Which means no place is ever trying to change those nodes from varying to anything else.  But nothing is preventing changes from other ranges to something else.
>
> Which also means the only thing this approach accomplishes is to force us to check if a node is already varying, so that we don't  overwrite the node to varying just in case its a hidden const.
>
> how can this hidden const node really be useful?
>
> I submit this is just a dangerous way to flag previously unprocessed nodes as VARYING for the duration of the pass after values_propagated is set...  not some higher level "Don't change this range any more" plan.  Its already bottom of the lattice..  it isn't going anywhere.

It's for avoiding to reallocate the lattice during propagation where
we end up creating new SSA names.  And for convenience
you like so much - in this case so callers do not need to check for NULL.

I've played a bit with returning value_range by value but that breaks
down with the way we currently handle the
equivalence bitmap -- that bitmap is not supposed to be shared so we
can modify it in place but that means
each lattice query would need to do deep copying of the bitmap which
is of course bad.

Isolating modification of the lattice can be done in other ways.

Richard.

>
> Andrew
>
Jeff Law July 26, 2019, 3:37 a.m. UTC | #25
On 7/24/19 12:33 PM, Richard Biener wrote:
> On July 24, 2019 8:18:57 PM GMT+02:00, Jeff Law <law@redhat.com>
> wrote:
>> On 7/24/19 11:00 AM, Richard Biener wrote: [ Big snip, ignore
>> missing reply attributions... ]
>> 
>>>> it. But I'd claim that if callers are required not to change
>>>> these ranges, then the callers are fundamentally broken.  I'm
>>>> not sure what the "sanitization" is really buying you here.
>>>> Can you point to something specific?
>>>> 
>>>>> 
>>>>> But you lose the sanitizing that nobody can change it and the
>>>>>  changed info leaks to other SSA vars.
>>>>> 
>>>>> As said, fix all callers to deal with NULL.
>>>>> 
>>>>> But I argue the current code is exactly optimal and safe.
>>>> ANd I'd argue that it's just plain broken and that the 
>>>> sanitization you're referring to points to something broken 
>>>> elsewhere,  higher up in the callers.
>>> 
>>> Another option is to make get_value_range return by value and
>>> the only way to change the lattice to call an appropriate set
>>> function. I think we already do the latter in all cases (but we
>>> use get_value_range in the setter) and returning by reference is
>>> just eliding the copy.
>> OK, so what I think you're getting at (and please correct me if
>> I'm wrong) is that once the lattice values are set, you don't want 
>> something changing the recorded ranges underneath?
>> 
>> ISTM the way to enforce that is to embed the concept in the class
>> and enforce it by not allowing direct manipulation of range by the
>> clients. So a client that wants this behavior somehow tells the
>> class that ranges are "set in stone" and from that point the
>> setters don't allow changing the underlying ranges.
> 
> Yes. You'll see that nearly all callers do
> 
> Value_range vr = *get_value_range (name);
> 
> Modify
> 
> Update_value_range (name, &vr) ;
> 
> And returning by reference was mostly an optimization. We _did_ have
> callers Changing the range in place and the const varying catched
> those.
> 
> When returning by value we can return individual VARYINGs not in the
> lattice if we decide that's what we want.
> 
>> I just want to make sure we're on the same page WRT why you think
>> the constant varying range object is useful.
> 
> As said it's an optimization. We do not want to reallocate the
> lattice. And we want lattice updating to happen in a controlled
> manner, so returning a pointer into the lattice is bad design at this
> point.
But I would claim that the current state is dreadful.  Consider that
when gimple-fold asks for a new SSA_NAME, it could get a recycled one,
in which case we get a real range.  Or if it doens't get a recycled
name, then we get the const varying node.  The inconsistently is silly
when we can just reallocate the underlying object.

Between recycling of SSA_NAMEs and allocating a bit of additional space
(say rounding up to some reasonable boundary) I'd bet you'd never be
able to measure the reallocation in practice.

jeff
Jeff Law July 26, 2019, 3:41 a.m. UTC | #26
On 7/24/19 12:33 PM, Richard Biener wrote:
> On July 24, 2019 8:18:57 PM GMT+02:00, Jeff Law <law@redhat.com>
> wrote:
>> On 7/24/19 11:00 AM, Richard Biener wrote: [ Big snip, ignore
>> missing reply attributions... ]
>> 
>>>> it. But I'd claim that if callers are required not to change
>>>> these ranges, then the callers are fundamentally broken.  I'm
>>>> not sure what the "sanitization" is really buying you here.
>>>> Can you point to something specific?
>>>> 
>>>>> 
>>>>> But you lose the sanitizing that nobody can change it and the
>>>>>  changed info leaks to other SSA vars.
>>>>> 
>>>>> As said, fix all callers to deal with NULL.
>>>>> 
>>>>> But I argue the current code is exactly optimal and safe.
>>>> ANd I'd argue that it's just plain broken and that the 
>>>> sanitization you're referring to points to something broken 
>>>> elsewhere,  higher up in the callers.
>>> 
>>> Another option is to make get_value_range return by value and
>>> the only way to change the lattice to call an appropriate set
>>> function. I think we already do the latter in all cases (but we
>>> use get_value_range in the setter) and returning by reference is
>>> just eliding the copy.
>> OK, so what I think you're getting at (and please correct me if
>> I'm wrong) is that once the lattice values are set, you don't want 
>> something changing the recorded ranges underneath?
>> 
>> ISTM the way to enforce that is to embed the concept in the class
>> and enforce it by not allowing direct manipulation of range by the
>> clients. So a client that wants this behavior somehow tells the
>> class that ranges are "set in stone" and from that point the
>> setters don't allow changing the underlying ranges.
> 
> Yes. You'll see that nearly all callers do
> 
> Value_range vr = *get_value_range (name);
> 
> Modify
> 
> Update_value_range (name, &vr) ;
> 
> And returning by reference was mostly an optimization. We _did_ have
> callers Changing the range in place and the const varying catched
> those.
Well, that's the kind of thing we want to avoid at the API level.  One
way was to simply prohibit changes with a by-value return and forcing
changes through a setter.  Another would be returning everything as a
const, which is what I think your patch from today did.   In both cases
you have to use the appropriate APIs to make changes.  I prefer the
former because someone could cast away the const property, but if the
former isn't really feasible, then always returning a const object is a
reasonable compromise I guess.

jeff
Andrew MacLeod July 26, 2019, 2:32 p.m. UTC | #27
On 7/25/19 11:37 PM, Jeff Law wrote:
> On 7/24/19 12:33 PM, Richard Biener wrote:
>> On July 24, 2019 8:18:57 PM GMT+02:00, Jeff Law <law@redhat.com>
>> wrote:
>>> On 7/24/19 11:00 AM, Richard Biener wrote: [ Big snip, ignore
>>> missing reply attributions... ]
>>>
>>>>> it. But I'd claim that if callers are required not to change
>>>>> these ranges, then the callers are fundamentally broken.  I'm
>>>>> not sure what the "sanitization" is really buying you here.
>>>>> Can you point to something specific?
>>>>>
>>>>>> But you lose the sanitizing that nobody can change it and the
>>>>>>   changed info leaks to other SSA vars.
>>>>>>
>>>>>> As said, fix all callers to deal with NULL.
>>>>>>
>>>>>> But I argue the current code is exactly optimal and safe.
>>>>> ANd I'd argue that it's just plain broken and that the
>>>>> sanitization you're referring to points to something broken
>>>>> elsewhere,  higher up in the callers.
>>>> Another option is to make get_value_range return by value and
>>>> the only way to change the lattice to call an appropriate set
>>>> function. I think we already do the latter in all cases (but we
>>>> use get_value_range in the setter) and returning by reference is
>>>> just eliding the copy.
>>> OK, so what I think you're getting at (and please correct me if
>>> I'm wrong) is that once the lattice values are set, you don't want
>>> something changing the recorded ranges underneath?
>>>
>>> ISTM the way to enforce that is to embed the concept in the class
>>> and enforce it by not allowing direct manipulation of range by the
>>> clients. So a client that wants this behavior somehow tells the
>>> class that ranges are "set in stone" and from that point the
>>> setters don't allow changing the underlying ranges.
>> Yes. You'll see that nearly all callers do
>>
>> Value_range vr = *get_value_range (name);
>>
>> Modify
>>
>> Update_value_range (name, &vr) ;
>>
>> And returning by reference was mostly an optimization. We _did_ have
>> callers Changing the range in place and the const varying catched
>> those.
>>
>> When returning by value we can return individual VARYINGs not in the
>> lattice if we decide that's what we want.
>>
>>> I just want to make sure we're on the same page WRT why you think
>>> the constant varying range object is useful.
>> As said it's an optimization. We do not want to reallocate the
>> lattice. And we want lattice updating to happen in a controlled
>> manner, so returning a pointer into the lattice is bad design at this
>> point.
> But I would claim that the current state is dreadful.  Consider that
> when gimple-fold asks for a new SSA_NAME, it could get a recycled one,
> in which case we get a real range.  Or if it doens't get a recycled
> name, then we get the const varying node.  The inconsistently is silly
> when we can just reallocate the underlying object.
>
> Between recycling of SSA_NAMEs and allocating a bit of additional space
> (say rounding up to some reasonable boundary) I'd bet you'd never be
> able to measure the reallocation in practice.
>
I annotated the patch yesterday to actually log reallocations and ran a 
couple of bootstraps...

If we don't add any extra space in the vector initially (as it is 
allocated today) , it is re-sized a total of 201 times.  Of those, 93 
get back the same pointer so no resize actually happens.

IF we use the patch as I initially propose, where we add 10% to the 
vector to start, we re-size 10 times, all from either 18 to 25 , or 8 to 
14,so very small numbers of ssaname functions, where the 10% doesnt 
really do much.  Those were all re-allocations but one.   The initial 
resize does seem to help prevent any larger reallocations,

THat doesn't seem like that bad of a thing over all, and we could tweak 
the initial size a bit more if it would help? to deal with the small 
cases, we could make it num_names+10%+10 or something even.

I feel it would be safer.. It seems to me the CONST solution cannot be 
disabled.. ie, even a non-checking production compiler would go boom if 
it triggered.

As for addressing the issue that the CONST approach is trying to 
resolve, Could we change all the set/update routines to do something like

gcc_checking_assert (new_range->varying_p ()  || !values_propagated);

that way we'll trigger an assert if we try to change any value to 
something other than varying when values_propagated is set?

Andrew
Richard Biener July 30, 2019, 8:55 a.m. UTC | #28
On Fri, Jul 26, 2019 at 4:32 PM Andrew MacLeod <amacleod@redhat.com> wrote:
>
> On 7/25/19 11:37 PM, Jeff Law wrote:
> > On 7/24/19 12:33 PM, Richard Biener wrote:
> >> On July 24, 2019 8:18:57 PM GMT+02:00, Jeff Law <law@redhat.com>
> >> wrote:
> >>> On 7/24/19 11:00 AM, Richard Biener wrote: [ Big snip, ignore
> >>> missing reply attributions... ]
> >>>
> >>>>> it. But I'd claim that if callers are required not to change
> >>>>> these ranges, then the callers are fundamentally broken.  I'm
> >>>>> not sure what the "sanitization" is really buying you here.
> >>>>> Can you point to something specific?
> >>>>>
> >>>>>> But you lose the sanitizing that nobody can change it and the
> >>>>>>   changed info leaks to other SSA vars.
> >>>>>>
> >>>>>> As said, fix all callers to deal with NULL.
> >>>>>>
> >>>>>> But I argue the current code is exactly optimal and safe.
> >>>>> ANd I'd argue that it's just plain broken and that the
> >>>>> sanitization you're referring to points to something broken
> >>>>> elsewhere,  higher up in the callers.
> >>>> Another option is to make get_value_range return by value and
> >>>> the only way to change the lattice to call an appropriate set
> >>>> function. I think we already do the latter in all cases (but we
> >>>> use get_value_range in the setter) and returning by reference is
> >>>> just eliding the copy.
> >>> OK, so what I think you're getting at (and please correct me if
> >>> I'm wrong) is that once the lattice values are set, you don't want
> >>> something changing the recorded ranges underneath?
> >>>
> >>> ISTM the way to enforce that is to embed the concept in the class
> >>> and enforce it by not allowing direct manipulation of range by the
> >>> clients. So a client that wants this behavior somehow tells the
> >>> class that ranges are "set in stone" and from that point the
> >>> setters don't allow changing the underlying ranges.
> >> Yes. You'll see that nearly all callers do
> >>
> >> Value_range vr = *get_value_range (name);
> >>
> >> Modify
> >>
> >> Update_value_range (name, &vr) ;
> >>
> >> And returning by reference was mostly an optimization. We _did_ have
> >> callers Changing the range in place and the const varying catched
> >> those.
> >>
> >> When returning by value we can return individual VARYINGs not in the
> >> lattice if we decide that's what we want.
> >>
> >>> I just want to make sure we're on the same page WRT why you think
> >>> the constant varying range object is useful.
> >> As said it's an optimization. We do not want to reallocate the
> >> lattice. And we want lattice updating to happen in a controlled
> >> manner, so returning a pointer into the lattice is bad design at this
> >> point.
> > But I would claim that the current state is dreadful.  Consider that
> > when gimple-fold asks for a new SSA_NAME, it could get a recycled one,
> > in which case we get a real range.  Or if it doens't get a recycled
> > name, then we get the const varying node.  The inconsistently is silly
> > when we can just reallocate the underlying object.
> >
> > Between recycling of SSA_NAMEs and allocating a bit of additional space
> > (say rounding up to some reasonable boundary) I'd bet you'd never be
> > able to measure the reallocation in practice.
> >
> I annotated the patch yesterday to actually log reallocations and ran a
> couple of bootstraps...
>
> If we don't add any extra space in the vector initially (as it is
> allocated today) , it is re-sized a total of 201 times.  Of those, 93
> get back the same pointer so no resize actually happens.
>
> IF we use the patch as I initially propose, where we add 10% to the
> vector to start, we re-size 10 times, all from either 18 to 25 , or 8 to
> 14,so very small numbers of ssaname functions, where the 10% doesnt
> really do much.  Those were all re-allocations but one.   The initial
> resize does seem to help prevent any larger reallocations,
>
> THat doesn't seem like that bad of a thing over all, and we could tweak
> the initial size a bit more if it would help? to deal with the small
> cases, we could make it num_names+10%+10 or something even.
>
> I feel it would be safer.. It seems to me the CONST solution cannot be
> disabled.. ie, even a non-checking production compiler would go boom if
> it triggered.
>
> As for addressing the issue that the CONST approach is trying to
> resolve, Could we change all the set/update routines to do something like
>
> gcc_checking_assert (new_range->varying_p ()  || !values_propagated);
>
> that way we'll trigger an assert if we try to change any value to
> something other than varying when values_propagated is set?

I think the constness is appropriately addressed by my recent API update
to split the get-value from get-lattice-entry and properly forcing lattice
update to go through the few setters.

I'm not totally against making the lattice expandable, as the followup
patch to the original re-org notices we do run into "new" stmts during
the combined analysis/propagation stages the DOM-based users use.
And yes, allocating the lattice with some initial head-room is of course
the way to go (I'd even just do 2 * num_ssa_names...).  Avoiding
"useless" allocations of VR_VARYING lattice entries (where a NULL
does it just fine) is another optimization.  But yeah, we do not
"free" lattice entries when they become VR_VARYING and further
return the shared const entry (but we could).

That still leaves me with the objection to making VARYING typed.
IMHO the vr-values.c lattice should look like

enum lattice_val { UNDEFINED, CONST, VARYING };
struct lattice_entry {
  lattice_val val;
  value_range *range;
};

and we'd have _no_ value_range (NULL) for UNDEFINED and VARYING
in the _lattice_.  And CONST (aka classical we have a value) would then
record a value_range.

That we currently mix lattice state and the value-range types
(range, anti-range) is just historical.  The CONST lattice state could even
be split further, CONST_SCALAR, CONST_RANGE, VARYING and the
entry be a union of either value_range * or tree.

Richard.

> Andrew
>
Andrew MacLeod July 30, 2019, 2:54 p.m. UTC | #29
On 7/30/19 4:55 AM, Richard Biener wrote:
> On Fri, Jul 26, 2019 at 4:32 PM Andrew MacLeod <amacleod@redhat.com> wrote:
>> On 7/25/19 11:37 PM, Jeff Law wrote:
>>> On 7/24/19 12:33 PM, Richard Biener wrote:
>>>> On July 24, 2019 8:18:57 PM GMT+02:00, Jeff Law <law@redhat.com>
>>>> wrote:
>>>>> On 7/24/19 11:00 AM, Richard Biener wrote: [ Big snip, ignore
>>>>> missing reply attributions... ]
>>>>>
>>>>>>> it. But I'd claim that if callers are required not to change
>>>>>>> these ranges, then the callers are fundamentally broken.  I'm
>>>>>>> not sure what the "sanitization" is really buying you here.
>>>>>>> Can you point to something specific?
>>>>>>>
>>>>>>>> But you lose the sanitizing that nobody can change it and the
>>>>>>>>    changed info leaks to other SSA vars.
>>>>>>>>
>>>>>>>> As said, fix all callers to deal with NULL.
>>>>>>>>
>>>>>>>> But I argue the current code is exactly optimal and safe.
>>>>>>> ANd I'd argue that it's just plain broken and that the
>>>>>>> sanitization you're referring to points to something broken
>>>>>>> elsewhere,  higher up in the callers.
>>>>>> Another option is to make get_value_range return by value and
>>>>>> the only way to change the lattice to call an appropriate set
>>>>>> function. I think we already do the latter in all cases (but we
>>>>>> use get_value_range in the setter) and returning by reference is
>>>>>> just eliding the copy.
>>>>> OK, so what I think you're getting at (and please correct me if
>>>>> I'm wrong) is that once the lattice values are set, you don't want
>>>>> something changing the recorded ranges underneath?
>>>>>
>>>>> ISTM the way to enforce that is to embed the concept in the class
>>>>> and enforce it by not allowing direct manipulation of range by the
>>>>> clients. So a client that wants this behavior somehow tells the
>>>>> class that ranges are "set in stone" and from that point the
>>>>> setters don't allow changing the underlying ranges.
>>>> Yes. You'll see that nearly all callers do
>>>>
>>>> Value_range vr = *get_value_range (name);
>>>>
>>>> Modify
>>>>
>>>> Update_value_range (name, &vr) ;
>>>>
>>>> And returning by reference was mostly an optimization. We _did_ have
>>>> callers Changing the range in place and the const varying catched
>>>> those.
>>>>
>>>> When returning by value we can return individual VARYINGs not in the
>>>> lattice if we decide that's what we want.
>>>>
>>>>> I just want to make sure we're on the same page WRT why you think
>>>>> the constant varying range object is useful.
>>>> As said it's an optimization. We do not want to reallocate the
>>>> lattice. And we want lattice updating to happen in a controlled
>>>> manner, so returning a pointer into the lattice is bad design at this
>>>> point.
>>> But I would claim that the current state is dreadful.  Consider that
>>> when gimple-fold asks for a new SSA_NAME, it could get a recycled one,
>>> in which case we get a real range.  Or if it doens't get a recycled
>>> name, then we get the const varying node.  The inconsistently is silly
>>> when we can just reallocate the underlying object.
>>>
>>> Between recycling of SSA_NAMEs and allocating a bit of additional space
>>> (say rounding up to some reasonable boundary) I'd bet you'd never be
>>> able to measure the reallocation in practice.
>>>
>> I annotated the patch yesterday to actually log reallocations and ran a
>> couple of bootstraps...
>>
>> If we don't add any extra space in the vector initially (as it is
>> allocated today) , it is re-sized a total of 201 times.  Of those, 93
>> get back the same pointer so no resize actually happens.
>>
>> IF we use the patch as I initially propose, where we add 10% to the
>> vector to start, we re-size 10 times, all from either 18 to 25 , or 8 to
>> 14,so very small numbers of ssaname functions, where the 10% doesnt
>> really do much.  Those were all re-allocations but one.   The initial
>> resize does seem to help prevent any larger reallocations,
>>
>> THat doesn't seem like that bad of a thing over all, and we could tweak
>> the initial size a bit more if it would help? to deal with the small
>> cases, we could make it num_names+10%+10 or something even.
>>
>> I feel it would be safer.. It seems to me the CONST solution cannot be
>> disabled.. ie, even a non-checking production compiler would go boom if
>> it triggered.
>>
>> As for addressing the issue that the CONST approach is trying to
>> resolve, Could we change all the set/update routines to do something like
>>
>> gcc_checking_assert (new_range->varying_p ()  || !values_propagated);
>>
>> that way we'll trigger an assert if we try to change any value to
>> something other than varying when values_propagated is set?
> I think the constness is appropriately addressed by my recent API update
> to split the get-value from get-lattice-entry and properly forcing lattice
> update to go through the few setters.
>
> I'm not totally against making the lattice expandable, as the followup
> patch to the original re-org notices we do run into "new" stmts during
> the combined analysis/propagation stages the DOM-based users use.
> And yes, allocating the lattice with some initial head-room is of course
> the way to go (I'd even just do 2 * num_ssa_names...).  Avoiding
> "useless" allocations of VR_VARYING lattice entries (where a NULL
> does it just fine) is another optimization.  But yeah, we do not
> "free" lattice entries when they become VR_VARYING and further
> return the shared const entry (but we could).
sure,  2 * num_ssa_names  is a good start, it'll probably never trigger 
the growth code.
> That still leaves me with the objection to making VARYING typed.

Even the original author of the code says that is irrelevant. its just a 
choice he made at the time. As I pointed out, in practice you cannot 
disassociate VARYING from the type since a varying char and a varying 
long are not equivalent.  So varying only means something in the context 
of an ssa-name or expression, and those must have a type.  And it means 
[min, max] for that type.

> IMHO the vr-values.c lattice should look like
>
> enum lattice_val { UNDEFINED, CONST, VARYING };
> struct lattice_entry {
>    lattice_val val;
>    value_range *range;
> };
>
> and we'd have _no_ value_range (NULL) for UNDEFINED and VARYING
> in the _lattice_.  And CONST (aka classical we have a value) would then
> record a value_range.
Thats sounds great, and there is nothing preventing you from doing 
that.   Then value range can look *exactly* like we want... just a 
range.  We're just uniting the current implementation of value-range to 
properly reflect varying and [min,max] for consistency, which it doesn't 
have right now.  We can put this code in now, and then if you later get 
to implementing a new view of lattices, its pretty easy to change a few 
places where we called set_varying with a type.  That's incredibly 
trivial.  Or you could just ignore the type parameter.   That would work 
too.  Regardless, when you want the range of a varying node, you'll have 
to get the MIN and MAX appropriate for the type of that name, and we'll 
have done that part.
> That we currently mix lattice state and the value-range types
> (range, anti-range) is just historical.  The CONST lattice state could even
> be split further, CONST_SCALAR, CONST_RANGE, VARYING and the
> entry be a union of either value_range * or tree.
>
> Richard.
Everything in this compiler is historical. We did everything we could to 
save memory back in the old days, I know, I was there when we did this. 
   Combining the lattice and the range made sense in that context.  In 
fact, you could get rid of the lattice structure entirely with the way 
we represent irange... we have varying and undefined values query-able 
from the range without anything special.  That seems even more efficient 
to me, and that's the way I'd implement it today... no artificial 
lattice overlay needed, just the range.

Lattices will eventually go away. I personally don't see any point in us 
spinning our wheels re-implementing them when they are slated to be 
removed. Until then, there is absolutely nothing wrong with the way it 
works right now.  Adding the min and max to the varying node has no 
impact on how lattices work, it just allows value_range to interact with 
new range code better..

This change is trivial. it actually fixes a few warts. It lets us move 
on with other more significant things.

I've been trying to play nice and get agreement, but after a week and a 
half it seems like that isn't going to happen. I welcome any further 
comments on the patch, but otherwise I will approve the patch.

I will note this is the first time in a very long time that I have had 
to exercise my maintainer authority as one of the original architects of 
tree-ssa. I have been content to let others review and arrive at a 
consensus, but the system is not working this time. Its unclear to me 
why you are being so insistent about this trivial matter.  I will 
continue to make approvals as necessary going forward, but I still 
welcome your input and would prefer to settle on agreement to future 
patches.

Andrew
Richard Biener July 31, 2019, 8:36 a.m. UTC | #30
On Tue, Jul 30, 2019 at 4:54 PM Andrew MacLeod <amacleod@redhat.com> wrote:
>
> On 7/30/19 4:55 AM, Richard Biener wrote:
> > On Fri, Jul 26, 2019 at 4:32 PM Andrew MacLeod <amacleod@redhat.com> wrote:
> >> On 7/25/19 11:37 PM, Jeff Law wrote:
> >>> On 7/24/19 12:33 PM, Richard Biener wrote:
> >>>> On July 24, 2019 8:18:57 PM GMT+02:00, Jeff Law <law@redhat.com>
> >>>> wrote:
> >>>>> On 7/24/19 11:00 AM, Richard Biener wrote: [ Big snip, ignore
> >>>>> missing reply attributions... ]
> >>>>>
> >>>>>>> it. But I'd claim that if callers are required not to change
> >>>>>>> these ranges, then the callers are fundamentally broken.  I'm
> >>>>>>> not sure what the "sanitization" is really buying you here.
> >>>>>>> Can you point to something specific?
> >>>>>>>
> >>>>>>>> But you lose the sanitizing that nobody can change it and the
> >>>>>>>>    changed info leaks to other SSA vars.
> >>>>>>>>
> >>>>>>>> As said, fix all callers to deal with NULL.
> >>>>>>>>
> >>>>>>>> But I argue the current code is exactly optimal and safe.
> >>>>>>> ANd I'd argue that it's just plain broken and that the
> >>>>>>> sanitization you're referring to points to something broken
> >>>>>>> elsewhere,  higher up in the callers.
> >>>>>> Another option is to make get_value_range return by value and
> >>>>>> the only way to change the lattice to call an appropriate set
> >>>>>> function. I think we already do the latter in all cases (but we
> >>>>>> use get_value_range in the setter) and returning by reference is
> >>>>>> just eliding the copy.
> >>>>> OK, so what I think you're getting at (and please correct me if
> >>>>> I'm wrong) is that once the lattice values are set, you don't want
> >>>>> something changing the recorded ranges underneath?
> >>>>>
> >>>>> ISTM the way to enforce that is to embed the concept in the class
> >>>>> and enforce it by not allowing direct manipulation of range by the
> >>>>> clients. So a client that wants this behavior somehow tells the
> >>>>> class that ranges are "set in stone" and from that point the
> >>>>> setters don't allow changing the underlying ranges.
> >>>> Yes. You'll see that nearly all callers do
> >>>>
> >>>> Value_range vr = *get_value_range (name);
> >>>>
> >>>> Modify
> >>>>
> >>>> Update_value_range (name, &vr) ;
> >>>>
> >>>> And returning by reference was mostly an optimization. We _did_ have
> >>>> callers Changing the range in place and the const varying catched
> >>>> those.
> >>>>
> >>>> When returning by value we can return individual VARYINGs not in the
> >>>> lattice if we decide that's what we want.
> >>>>
> >>>>> I just want to make sure we're on the same page WRT why you think
> >>>>> the constant varying range object is useful.
> >>>> As said it's an optimization. We do not want to reallocate the
> >>>> lattice. And we want lattice updating to happen in a controlled
> >>>> manner, so returning a pointer into the lattice is bad design at this
> >>>> point.
> >>> But I would claim that the current state is dreadful.  Consider that
> >>> when gimple-fold asks for a new SSA_NAME, it could get a recycled one,
> >>> in which case we get a real range.  Or if it doens't get a recycled
> >>> name, then we get the const varying node.  The inconsistently is silly
> >>> when we can just reallocate the underlying object.
> >>>
> >>> Between recycling of SSA_NAMEs and allocating a bit of additional space
> >>> (say rounding up to some reasonable boundary) I'd bet you'd never be
> >>> able to measure the reallocation in practice.
> >>>
> >> I annotated the patch yesterday to actually log reallocations and ran a
> >> couple of bootstraps...
> >>
> >> If we don't add any extra space in the vector initially (as it is
> >> allocated today) , it is re-sized a total of 201 times.  Of those, 93
> >> get back the same pointer so no resize actually happens.
> >>
> >> IF we use the patch as I initially propose, where we add 10% to the
> >> vector to start, we re-size 10 times, all from either 18 to 25 , or 8 to
> >> 14,so very small numbers of ssaname functions, where the 10% doesnt
> >> really do much.  Those were all re-allocations but one.   The initial
> >> resize does seem to help prevent any larger reallocations,
> >>
> >> THat doesn't seem like that bad of a thing over all, and we could tweak
> >> the initial size a bit more if it would help? to deal with the small
> >> cases, we could make it num_names+10%+10 or something even.
> >>
> >> I feel it would be safer.. It seems to me the CONST solution cannot be
> >> disabled.. ie, even a non-checking production compiler would go boom if
> >> it triggered.
> >>
> >> As for addressing the issue that the CONST approach is trying to
> >> resolve, Could we change all the set/update routines to do something like
> >>
> >> gcc_checking_assert (new_range->varying_p ()  || !values_propagated);
> >>
> >> that way we'll trigger an assert if we try to change any value to
> >> something other than varying when values_propagated is set?
> > I think the constness is appropriately addressed by my recent API update
> > to split the get-value from get-lattice-entry and properly forcing lattice
> > update to go through the few setters.
> >
> > I'm not totally against making the lattice expandable, as the followup
> > patch to the original re-org notices we do run into "new" stmts during
> > the combined analysis/propagation stages the DOM-based users use.
> > And yes, allocating the lattice with some initial head-room is of course
> > the way to go (I'd even just do 2 * num_ssa_names...).  Avoiding
> > "useless" allocations of VR_VARYING lattice entries (where a NULL
> > does it just fine) is another optimization.  But yeah, we do not
> > "free" lattice entries when they become VR_VARYING and further
> > return the shared const entry (but we could).
> sure,  2 * num_ssa_names  is a good start, it'll probably never trigger
> the growth code.
> > That still leaves me with the objection to making VARYING typed.
>
> Even the original author of the code says that is irrelevant. its just a
> choice he made at the time. As I pointed out, in practice you cannot
> disassociate VARYING from the type since a varying char and a varying
> long are not equivalent.  So varying only means something in the context
> of an ssa-name or expression, and those must have a type.  And it means
> [min, max] for that type.
>
> > IMHO the vr-values.c lattice should look like
> >
> > enum lattice_val { UNDEFINED, CONST, VARYING };
> > struct lattice_entry {
> >    lattice_val val;
> >    value_range *range;
> > };
> >
> > and we'd have _no_ value_range (NULL) for UNDEFINED and VARYING
> > in the _lattice_.  And CONST (aka classical we have a value) would then
> > record a value_range.
> Thats sounds great, and there is nothing preventing you from doing
> that.   Then value range can look *exactly* like we want... just a
> range.  We're just uniting the current implementation of value-range to
> properly reflect varying and [min,max] for consistency, which it doesn't
> have right now.  We can put this code in now, and then if you later get
> to implementing a new view of lattices, its pretty easy to change a few
> places where we called set_varying with a type.  That's incredibly
> trivial.  Or you could just ignore the type parameter.   That would work
> too.  Regardless, when you want the range of a varying node, you'll have
> to get the MIN and MAX appropriate for the type of that name, and we'll
> have done that part.
> > That we currently mix lattice state and the value-range types
> > (range, anti-range) is just historical.  The CONST lattice state could even
> > be split further, CONST_SCALAR, CONST_RANGE, VARYING and the
> > entry be a union of either value_range * or tree.
> >
> > Richard.
> Everything in this compiler is historical. We did everything we could to
> save memory back in the old days, I know, I was there when we did this.
>    Combining the lattice and the range made sense in that context.  In
> fact, you could get rid of the lattice structure entirely with the way
> we represent irange... we have varying and undefined values query-able
> from the range without anything special.  That seems even more efficient
> to me, and that's the way I'd implement it today... no artificial
> lattice overlay needed, just the range.
>
> Lattices will eventually go away. I personally don't see any point in us
> spinning our wheels re-implementing them when they are slated to be
> removed. Until then, there is absolutely nothing wrong with the way it
> works right now.  Adding the min and max to the varying node has no
> impact on how lattices work, it just allows value_range to interact with
> new range code better..
>
> This change is trivial. it actually fixes a few warts. It lets us move
> on with other more significant things.
>
> I've been trying to play nice and get agreement, but after a week and a
> half it seems like that isn't going to happen. I welcome any further
> comments on the patch, but otherwise I will approve the patch.
>
> I will note this is the first time in a very long time that I have had
> to exercise my maintainer authority as one of the original architects of
> tree-ssa. I have been content to let others review and arrive at a
> consensus, but the system is not working this time. Its unclear to me
> why you are being so insistent about this trivial matter.  I will
> continue to make approvals as necessary going forward, but I still
> welcome your input and would prefer to settle on agreement to future
> patches.

Fair enough - have fun.

Richard.

> Andrew
Andrew MacLeod July 31, 2019, 11:24 p.m. UTC | #31
On 7/31/19 4:36 AM, Richard Biener wrote:
> On Tue, Jul 30, 2019 at 4:54 PM Andrew MacLeod <amacleod@redhat.com> wrote:
>> Everything in this compiler is historical. We did everything we could to
>> save memory back in the old days, I know, I was there when we did this.
>>     Combining the lattice and the range made sense in that context.  In
>> fact, you could get rid of the lattice structure entirely with the way
>> we represent irange... we have varying and undefined values query-able
>> from the range without anything special.  That seems even more efficient
>> to me, and that's the way I'd implement it today... no artificial
>> lattice overlay needed, just the range.
>>
>> Lattices will eventually go away. I personally don't see any point in us
>> spinning our wheels re-implementing them when they are slated to be
>> removed. Until then, there is absolutely nothing wrong with the way it
>> works right now.  Adding the min and max to the varying node has no
>> impact on how lattices work, it just allows value_range to interact with
>> new range code better..
>>
>> This change is trivial. it actually fixes a few warts. It lets us move
>> on with other more significant things.
>>
>> I've been trying to play nice and get agreement, but after a week and a
>> half it seems like that isn't going to happen. I welcome any further
>> comments on the patch, but otherwise I will approve the patch.
>>
>> I will note this is the first time in a very long time that I have had
>> to exercise my maintainer authority as one of the original architects of
>> tree-ssa. I have been content to let others review and arrive at a
>> consensus, but the system is not working this time. Its unclear to me
>> why you are being so insistent about this trivial matter.  I will
>> continue to make approvals as necessary going forward, but I still
>> welcome your input and would prefer to settle on agreement to future
>> patches.
> Fair enough - have fun.
>
> Richard.
>

In summary, We don't want to do it your way so you wash your hands of it 
and assign all VRP bugs to me stating I am now VRP maintainer.

Aldy started this relatively straightforward submission a month ago (!! 
July 1st!), and we've made virtually no progress since then. You weren't 
holding us up on technical merits, it appeared to be a personal 
preference to not do things our way, despite the explanations and 
cleanups it provided.   I felt I had no choice but to move forward or we 
will not get any code in before stage 1 ends. We're already a month 
behind now, and with vacations looming I fear our initial goals are 
already in serious danger.  Our code certainly won't be in for as long 
as I would have liked.   I still don't know why you were so insistent it 
had to be all your way...    We made a lot of effort to accommodate you 
out of respect for what you do for GCC, it would have been nice to see a 
little in return.

Regardless, you have knowledge in this space which is useful, and we 
welcome your input along the way should you decide to provide it.

Now, on to trying to improve ranges and VRP.

Andrew
Richard Biener Aug. 1, 2019, 2:11 p.m. UTC | #32
On Thu, Aug 1, 2019 at 1:24 AM Andrew MacLeod <amacleod@redhat.com> wrote:
>
> On 7/31/19 4:36 AM, Richard Biener wrote:
> > On Tue, Jul 30, 2019 at 4:54 PM Andrew MacLeod <amacleod@redhat.com> wrote:
> >> Everything in this compiler is historical. We did everything we could to
> >> save memory back in the old days, I know, I was there when we did this.
> >>     Combining the lattice and the range made sense in that context.  In
> >> fact, you could get rid of the lattice structure entirely with the way
> >> we represent irange... we have varying and undefined values query-able
> >> from the range without anything special.  That seems even more efficient
> >> to me, and that's the way I'd implement it today... no artificial
> >> lattice overlay needed, just the range.
> >>
> >> Lattices will eventually go away. I personally don't see any point in us
> >> spinning our wheels re-implementing them when they are slated to be
> >> removed. Until then, there is absolutely nothing wrong with the way it
> >> works right now.  Adding the min and max to the varying node has no
> >> impact on how lattices work, it just allows value_range to interact with
> >> new range code better..
> >>
> >> This change is trivial. it actually fixes a few warts. It lets us move
> >> on with other more significant things.
> >>
> >> I've been trying to play nice and get agreement, but after a week and a
> >> half it seems like that isn't going to happen. I welcome any further
> >> comments on the patch, but otherwise I will approve the patch.
> >>
> >> I will note this is the first time in a very long time that I have had
> >> to exercise my maintainer authority as one of the original architects of
> >> tree-ssa. I have been content to let others review and arrive at a
> >> consensus, but the system is not working this time. Its unclear to me
> >> why you are being so insistent about this trivial matter.  I will
> >> continue to make approvals as necessary going forward, but I still
> >> welcome your input and would prefer to settle on agreement to future
> >> patches.
> > Fair enough - have fun.
> >
> > Richard.
> >
>
> In summary, We don't want to do it your way so you wash your hands of it
> and assign all VRP bugs to me stating I am now VRP maintainer.

I didn't assing them, I unassigned them from myself and CCed you.
Three bugs it was I think.

> Aldy started this relatively straightforward submission a month ago (!!
> July 1st!), and we've made virtually no progress since then.

Actually I think it's a waste of time on my side since I actually sat
down and looked at the branch - because while you posted several
"merge plans / proposals" the actual patches you proposed for integration
were just changes to the existing code.  But then when I tried to
understand why you need those by looking at the branch (OK, criticizing
some stuff I saw there) you said "well, don't look - it's all just a prototype
and even the APIs are not finished".

That makes me wonder why you do changes to trunk or proposing
merge plans when you've not even finished designing stuff...

As said, I feel I am wasting my time here so please feel free to
do whatever you are up to with VRP.

I did help with considerable resources even during last stage1/3 with
"fixing" the half-way conversion of value_range to a class.  I've not
manged to beat sense into the wide-int-range.h API, but oh well.

>You weren't
> holding us up on technical merits, it appeared to be a personal
> preference to not do things our way, despite the explanations and
> cleanups it provided.   I felt I had no choice but to move forward or we
> will not get any code in before stage 1 ends. We're already a month
> behind now, and with vacations looming I fear our initial goals are
> already in serious danger.  Our code certainly won't be in for as long
> as I would have liked.   I still don't know why you were so insistent it
> had to be all your way...    We made a lot of effort to accommodate you
> out of respect for what you do for GCC, it would have been nice to see a
> little in return.
>
> Regardless, you have knowledge in this space which is useful, and we
> welcome your input along the way should you decide to provide it.
>
> Now, on to trying to improve ranges and VRP.

And I'm on doing non-VRP stuff now.  There's other capable reviewers
that will help you out.

Richard.

> Andrew
>
>
>
Andrew MacLeod Aug. 1, 2019, 4:34 p.m. UTC | #33
On 8/1/19 10:11 AM, Richard Biener wrote:
> On Thu, Aug 1, 2019 at 1:24 AM Andrew MacLeod <amacleod@redhat.com> wrote:
>
>> Aldy started this relatively straightforward submission a month ago (!!
>> July 1st!), and we've made virtually no progress since then.
> Actually I think it's a waste of time on my side since I actually sat
> down and looked at the branch - because while you posted several
> "merge plans / proposals" the actual patches you proposed for integration
> were just changes to the existing code.  But then when I tried to

They are preliminary patches which enable us to align irange and 
value_range so they can be used interchangeably as the range class 
everywhere. We've adjusted irange on the branch to align with 
value_range, and these tweaks to the value_range API make them fully 
interchangeable.

So yes, this first set of patches are just changes to the existing 
code... We need them before we can bring in range-ops.    After that 
comes the outgoing range computational component, but I'm worried we're 
short on time for that now :-(  we'll see.



> understand why you need those by looking at the branch (OK, criticizing
> some stuff I saw there) you said "well, don't look - it's all just a prototype
> and even the APIs are not finished".
I explained numerous times why we needed the changes, you appear to 
simply not believe me. There should not have been a need to look at 
unprepared code.

As for being not finished, the API is being tweaked for trunk.. 2 things 
in particular:

  1)  I'm adding a type to the calls since we won't have a type for 
undefined any more. The aggregation of sub-ranges always started with 
undefined and built up from there, and it can no longer find the type 
from undefined.
  2)  The Implementation of each operation was still using chains of 
calls into legacy code which often used big switches...  I'm collapsing 
those into a basic operation call on wide-ints  which is being added to 
the API.    This will dramatically change the look of the file, without 
changing its basic behaviour.

So its virtually the same code, but with a very different (and cleaner) 
look designed for trunk...  we didn't really care what it looked like on 
the branch, we just wanted functionality...  and it shows.  Hence my 
desire to not have it critiqued :-)  The intention was always to rework 
it for trunk before having it examined.

I'll also modify the entire file to use value_range instead of irange.  
We don't want to contaminate trunk with pointless #defines or other 
silliness. We'll manage irange compatibility for range-ops on the 
branch. The varying min/max change we were trying to get in is a hard 
prerequisite for value_range to function properly in this role.


> That makes me wonder why you do changes to trunk or proposing
> merge plans when you've not even finished designing stuff...
>
> As said, I feel I am wasting my time here so please feel free to
> do whatever you are up to with VRP.
>
> I did help with considerable resources even during last stage1/3 with
> "fixing" the half-way conversion of value_range to a class.  I've not
> manged to beat sense into the wide-int-range.h API, but oh well.
yes, and I really do appreciate that work, it was critical to paving the 
way to allowing us to use value_range instead of irange...  We just need 
these final couple of things to finish that transition, and they were 
just not happening, thus my reaction.

I think the new  range-op operation on wide-ints that is being added to 
the API will probably replace most, if not all, of the wide-int-range.h 
stuff...  We'll see how it goes when I finally get back to finishing it.

Anyway, I'm off after today for a week, so we wont do anything further 
until I get back.
The weekend will be here shortly... Have a good one.

Andrew
diff mbox series

Patch

commit 24c3a6a2cb7424a9c022930cada3a5f3c84a388a
Author: Aldy Hernandez <aldyh@redhat.com>
Date:   Fri Jun 28 11:00:10 2019 +0200

    VR_UNDEFINED and VR_VARYING now have a type.

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 01fb97cedb2..a5247735694 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,72 @@ 
+2019-07-01  Aldy Hernandez  <aldyh@redhat.com>
+
+	* gimple-ssa-evrp-analyze.c (record_ranges_from_phis): Skip PHIs
+	who's result are not valid for a value_range.
+	Set type for varying value_range.
+	* ipa-cp.c (ipcp_vr_lattice::init): Set type for varying/undef
+	value_range.
+	(ipcp_vr_lattice::set_to_bottom): Same.
+	(initialize_node_lattices): Same.
+	* tree-ssa-threadedge.c (record_temporary_equivalences_from_phis):
+	Same.
+	* tree-ssanames.c (get_range_info): Same.
+	* tree-vrp.c (value_range::set_equiv): Do not set equiv on
+	undef/varying.
+	(value_range_base::value_range_base): New constructor.
+	(value_range_base::check): Remove assert for empty min/max.
+	(value_range_base::equal_p): Allow comparison of typeless undefs.
+	(value_range_base::set_undefined): Add type.
+	(value_range::set_undefined): Same.
+	(value_range_base::set_varying): Same.
+	(value_range::set_varying): Same.
+	(value_range_base::type): Remove assert.
+	(value_range_base::dump): Display type for varying/undef.
+	(value_range_base::dump): Add argument-less overload.
+	(value_range::dump): Same.
+	(vrp_val_max): Add handle_pointers argument.
+	(vrp_val_min): Same.
+	(vrp_val_is_max): Same.
+	(vrp_val_is_min): Same.
+	(value_range_base::set_and_canonicalize): Adjust so type is
+	allowed for varying/undef.
+	(ranges_from_anti_range): Same.
+	(extract_range_from_muliplicative_op): Same.
+	(extract_range_from_binary_expr): Same.
+	(extract_range_from_unary_expr): Same.
+	(vrp_prop::vrp_initialize): Same.
+	(vrp_prop::visit_stmt): Same.
+	(value_range_base::union_helper): Same.
+	(value_range_base::normalize_symbolics): Same.
+	(determine_value_range_1): Same.
+	* tree-vrp.h (value_range_base): Add value_range_base(tree type)
+	constructor.
+	Add dump (), supports_type_p,
+	value_range_base_normalize_symbolics, set_varying, and
+	set_undefined.
+	(value_range): Add set_varying, set_undefined, and dump().
+	(vrp_val_is_min): Add argument.
+	(vrp_val_is_max): Same.
+	(vrp_val_min): Same.
+	(vrp_val_max): Same.
+	(range_includes_zero_p): Adjust for varying/undef with types.
+	* vr-values.c (class type_range_cache): New.
+	(set_value_range_to_truthvalue): Adjust for varying/undef with
+	types.
+	(get_value_range): Same.
+	(vr_values::set_defs_to_varying): Same.
+	(vr_values::update_value_range): Same.
+	(extract_range_for_var_from_comparison_expr): Same.
+	(extract_range_from_binary_expr): Same.
+	(extract_range_from_cond_expr): Same.
+	(check_for_binary_op_overflow): Same.
+	(extract_range_basic): Same.
+	(extract_range_from_assignment): Same.
+	(vr_values::vr_values): Initialize type cache.
+	(vr_values::~vr_values): Free type cache.
+	(extract_range_from_phi_node): Adjust for varying/undef with
+	types.
+	* vr-values.h (class vr_values): Add type_cache.
+
 2019-06-26  Jeff Law  <law@redhat.com>
 
 	PR tree-optimization/90883
diff --git a/gcc/gimple-ssa-evrp-analyze.c b/gcc/gimple-ssa-evrp-analyze.c
index 4c68af847e1..0184703a13c 100644
--- a/gcc/gimple-ssa-evrp-analyze.c
+++ b/gcc/gimple-ssa-evrp-analyze.c
@@ -251,13 +251,18 @@  evrp_range_analyzer::record_ranges_from_phis (basic_block bb)
       if (virtual_operand_p (lhs))
 	continue;
 
+      /* Skips floats and other things we can't represent in a
+	 range.  */
+      if (!value_range_base::supports_type_p (TREE_TYPE (lhs)))
+	continue;
+
       value_range vr_result;
       bool interesting = stmt_interesting_for_vrp (phi);
       if (!has_unvisited_preds && interesting)
 	vr_values->extract_range_from_phi_node (phi, &vr_result);
       else
 	{
-	  vr_result.set_varying ();
+	  vr_result.set_varying (TREE_TYPE (lhs));
 	  /* When we have an unvisited executable predecessor we can't
 	     use PHI arg ranges which may be still UNDEFINED but have
 	     to use VARYING for them.  But we can still resort to
diff --git a/gcc/ipa-cp.c b/gcc/ipa-cp.c
index 69c00a9c5a5..d98ddf1b0c3 100644
--- a/gcc/ipa-cp.c
+++ b/gcc/ipa-cp.c
@@ -314,7 +314,7 @@  public:
   inline bool set_to_bottom ();
   bool meet_with (const value_range_base *p_vr);
   bool meet_with (const ipcp_vr_lattice &other);
-  void init () { gcc_assert (m_vr.undefined_p ()); }
+  void init (tree type) { m_vr.set_undefined (type); }
   void print (FILE * f);
 
 private:
@@ -977,7 +977,7 @@  ipcp_vr_lattice::set_to_bottom ()
 {
   if (m_vr.varying_p ())
     return false;
-  m_vr.set_varying ();
+  m_vr.set_varying (m_vr.type ());
   return true;
 }
 
@@ -1204,7 +1204,8 @@  initialize_node_lattices (struct cgraph_node *node)
   for (i = 0; i < ipa_get_param_count (info); i++)
     {
       struct ipcp_param_lattices *plats = ipa_get_parm_lattices (info, i);
-      plats->m_value_range.init ();
+      tree type = ipa_get_type (info, i);
+      plats->m_value_range.init (type);
     }
 
   if (disable || variable)
diff --git a/gcc/tree-ssa-threadedge.c b/gcc/tree-ssa-threadedge.c
index 785227df690..781f802f365 100644
--- a/gcc/tree-ssa-threadedge.c
+++ b/gcc/tree-ssa-threadedge.c
@@ -183,7 +183,7 @@  record_temporary_equivalences_from_phis (edge e,
 	  else if (TREE_CODE (src) == INTEGER_CST)
 	    new_vr->set (src);
 	  else
-	    new_vr->set_varying ();
+	    new_vr->set_varying (TREE_TYPE (src));
 
 	  /* This is a temporary range for DST, so push it.  */
 	  evrp_range_analyzer->push_value_range (dst, new_vr);
diff --git a/gcc/tree-ssanames.c b/gcc/tree-ssanames.c
index 8b80bce8945..343e98488a9 100644
--- a/gcc/tree-ssanames.c
+++ b/gcc/tree-ssanames.c
@@ -440,14 +440,16 @@  get_range_info (const_tree name, value_range_base &vr)
   wide_int wmin, wmax;
   enum value_range_kind kind = get_range_info (name, &wmin, &wmax);
 
-  if (kind == VR_VARYING || kind == VR_UNDEFINED)
-    min = max = NULL;
+  if (kind == VR_VARYING)
+    vr.set_varying (TREE_TYPE (name));
+  else if (kind == VR_UNDEFINED)
+    vr.set_undefined (TREE_TYPE (name));
   else
     {
       min = wide_int_to_tree (TREE_TYPE (name), wmin);
       max = wide_int_to_tree (TREE_TYPE (name), wmax);
+      vr.set (kind, min, max);
     }
-  vr.set (kind, min, max);
   return kind;
 }
 
diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c
index 594ee9adc17..97046c22ed1 100644
--- a/gcc/tree-vrp.c
+++ b/gcc/tree-vrp.c
@@ -86,6 +86,8 @@  value_range_base::set (enum value_range_kind kind, tree min, tree max)
 void
 value_range::set_equiv (bitmap equiv)
 {
+  if (undefined_p () || varying_p ())
+    equiv = NULL;
   /* Since updating the equivalence set involves deep copying the
      bitmaps, only do it if absolutely necessary.
 
@@ -134,6 +136,11 @@  value_range::value_range (const value_range_base &other)
   set (other.kind (), other.min(), other.max (), NULL);
 }
 
+value_range_base::value_range_base (tree type)
+{
+  set_varying (type);
+}
+
 /* Like set, but keep the equivalences in place.  */
 
 void
@@ -190,7 +197,6 @@  value_range_base::check ()
       }
     case VR_UNDEFINED:
     case VR_VARYING:
-      gcc_assert (!min () && !max ());
       break;
     default:
       gcc_unreachable ();
@@ -217,6 +223,10 @@  value_range::check ()
 bool
 value_range_base::equal_p (const value_range_base &other) const
 {
+  /* Ignore types for undefined.  All undefines are equal.  */
+  if (undefined_p ())
+    return m_kind == other.m_kind;
+
   return (m_kind == other.m_kind
 	  && vrp_operand_equal_p (m_min, other.m_min)
 	  && vrp_operand_equal_p (m_max, other.m_max));
@@ -259,27 +269,56 @@  value_range_base::constant_p () const
 }
 
 void
-value_range_base::set_undefined ()
+value_range_base::set_undefined (tree type)
 {
-  set (VR_UNDEFINED, NULL, NULL);
+  m_kind = VR_UNDEFINED;
+  if (type)
+    {
+      if (supports_type_p (type))
+	{
+	  m_min = vrp_val_min (type, true);
+	  m_max = vrp_val_max (type, true);
+	}
+      else
+	{
+	  /* This is a temporary kludge for ipa-cp which is building
+	     undefined/varying of floats.  ??  */
+	  m_min = m_max = build1 (NOP_EXPR, type, type);
+	}
+    }
+  else
+    m_min = m_max = NULL;
 }
 
 void
-value_range::set_undefined ()
+value_range::set_undefined (tree type)
 {
-  set (VR_UNDEFINED, NULL, NULL, NULL);
+  value_range_base::set_undefined (type);
+  equiv_clear ();
 }
 
 void
-value_range_base::set_varying ()
+value_range_base::set_varying (tree type)
 {
-  set (VR_VARYING, NULL, NULL);
+  m_kind = VR_VARYING;
+  if (supports_type_p (type))
+    {
+      m_min = vrp_val_min (type, true);
+      m_max = vrp_val_max (type, true);
+    }
+  else
+    {
+      /* This is a temporary kludge for ipa-cp which is building
+	 undefined/varying of floats.  ??  */
+      m_min = m_max = build1 (NOP_EXPR, type, type);
+    }
 }
 
 void
-value_range::set_varying ()
+value_range::set_varying (tree type)
 {
-  set (VR_VARYING, NULL, NULL, NULL);
+  value_range_base::set_varying (type);
+  equiv_clear ();
 }
 
 /* Return TRUE if it is possible that range contains VAL.  */
@@ -338,21 +377,24 @@  value_range_base::singleton_p (tree *result) const
 tree
 value_range_base::type () const
 {
-  /* Types are only valid for VR_RANGE and VR_ANTI_RANGE, which are
-     known to have non-zero min/max.  */
-  gcc_assert (min ());
   return TREE_TYPE (min ());
 }
 
 void
 value_range_base::dump (FILE *file) const
 {
+  tree ttype;
+  if (undefined_p () && !m_min)
+    ttype = void_type_node;
+  else ttype = TREE_TYPE (m_min);
+
   if (undefined_p ())
-    fprintf (file, "UNDEFINED");
+    {
+      print_generic_expr (file, ttype);
+      fprintf (file, " UNDEFINED");
+    }
   else if (m_kind == VR_RANGE || m_kind == VR_ANTI_RANGE)
     {
-      tree ttype = type ();
-
       print_generic_expr (file, ttype);
       fprintf (file, " ");
 
@@ -378,11 +420,20 @@  value_range_base::dump (FILE *file) const
       fprintf (file, "]");
     }
   else if (varying_p ())
-    fprintf (file, "VARYING");
+    {
+      print_generic_expr (file, ttype);
+      fprintf (file, " VARYING");
+    }
   else
     gcc_unreachable ();
 }
 
+void
+value_range_base::dump () const
+{
+  dump (stderr);
+}
+
 void
 value_range::dump (FILE *file) const
 {
@@ -406,6 +457,12 @@  value_range::dump (FILE *file) const
     }
 }
 
+void
+value_range::dump () const
+{
+  dump (stderr);
+}
+
 void
 dump_value_range (FILE *file, const value_range *vr)
 {
@@ -499,10 +556,18 @@  static assert_locus **asserts_for;
 /* Return the maximum value for TYPE.  */
 
 tree
-vrp_val_max (const_tree type)
+vrp_val_max (const_tree type, bool handle_pointers)
 {
   if (!INTEGRAL_TYPE_P (type))
-    return NULL_TREE;
+    {
+      if (POINTER_TYPE_P (type) && handle_pointers)
+	{
+	  wide_int max = wi::max_value (TYPE_PRECISION (type),
+					TYPE_SIGN (type));
+	  return wide_int_to_tree (const_cast<tree> (type), max);
+	}
+      return NULL_TREE;
+    }
 
   return TYPE_MAX_VALUE (type);
 }
@@ -510,23 +575,22 @@  vrp_val_max (const_tree type)
 /* Return the minimum value for TYPE.  */
 
 tree
-vrp_val_min (const_tree type)
+vrp_val_min (const_tree type, bool handle_pointers)
 {
   if (!INTEGRAL_TYPE_P (type))
-    return NULL_TREE;
+    {
+      if (POINTER_TYPE_P (type) && handle_pointers)
+	return build_zero_cst (const_cast<tree> (type));
+      return NULL_TREE;
+    }
 
   return TYPE_MIN_VALUE (type);
 }
 
-/* Return whether VAL is equal to the maximum value of its type.
-   We can't do a simple equality comparison with TYPE_MAX_VALUE because
-   C typedefs and Ada subtypes can produce types whose TYPE_MAX_VALUE
-   is not == to the integer constant with the same value in the type.  */
-
 bool
-vrp_val_is_max (const_tree val)
+vrp_val_is_max (const_tree val, bool handle_pointers)
 {
-  tree type_max = vrp_val_max (TREE_TYPE (val));
+  tree type_max = vrp_val_max (TREE_TYPE (val), handle_pointers);
   return (val == type_max
 	  || (type_max != NULL_TREE
 	      && operand_equal_p (val, type_max, 0)));
@@ -535,9 +599,9 @@  vrp_val_is_max (const_tree val)
 /* Return whether VAL is equal to the minimum value of its type.  */
 
 bool
-vrp_val_is_min (const_tree val)
+vrp_val_is_min (const_tree val, bool handle_pointers)
 {
-  tree type_min = vrp_val_min (TREE_TYPE (val));
+  tree type_min = vrp_val_min (TREE_TYPE (val), handle_pointers);
   return (val == type_min
 	  || (type_min != NULL_TREE
 	      && operand_equal_p (val, type_min, 0)));
@@ -629,15 +693,17 @@  void
 value_range_base::set_and_canonicalize (enum value_range_kind kind,
 					tree min, tree max)
 {
-  /* Use the canonical setters for VR_UNDEFINED and VR_VARYING.  */
   if (kind == VR_UNDEFINED)
     {
-      set_undefined ();
+      if (min)
+	set_undefined (TREE_TYPE (min));
+      else
+	set_undefined ();
       return;
     }
   else if (kind == VR_VARYING)
     {
-      set_varying ();
+      set_varying (TREE_TYPE (min));
       return;
     }
 
@@ -660,7 +726,7 @@  value_range_base::set_and_canonicalize (enum value_range_kind kind,
 	 for VR_ANTI_RANGE empty range, so drop to varying as well.  */
       if (TYPE_PRECISION (TREE_TYPE (min)) == 1)
 	{
-	  set_varying ();
+	  set_varying (TREE_TYPE (min));
 	  return;
 	}
 
@@ -674,7 +740,7 @@  value_range_base::set_and_canonicalize (enum value_range_kind kind,
 	 to varying in this case.  */
       if (tree_int_cst_lt (max, min))
 	{
-	  set_varying ();
+	  set_varying (TREE_TYPE (min));
 	  return;
 	}
 
@@ -696,7 +762,7 @@  value_range_base::set_and_canonicalize (enum value_range_kind kind,
 	{
 	  /* We cannot deal with empty ranges, drop to varying.
 	     ???  This could be VR_UNDEFINED instead.  */
-	  set_varying ();
+	  set_varying (type);
 	  return;
 	}
       else if (TYPE_PRECISION (TREE_TYPE (min)) == 1
@@ -1178,8 +1244,8 @@  ranges_from_anti_range (const value_range_base *ar,
 {
   tree type = ar->type ();
 
-  vr0->set_undefined ();
-  vr1->set_undefined ();
+  vr0->set_undefined (type);
+  vr1->set_undefined (type);
 
   /* As a future improvement, we could handle ~[0, A] as: [-INF, -1] U
      [A+1, +INF].  Not sure if this helps in practice, though.  */
@@ -1202,7 +1268,7 @@  ranges_from_anti_range (const value_range_base *ar,
   if (vr0->undefined_p ())
     {
       *vr0 = *vr1;
-      vr1->set_undefined ();
+      vr1->set_undefined (type);
     }
 
   return !vr0->undefined_p ();
@@ -1240,7 +1306,8 @@  static void
 extract_range_from_multiplicative_op (value_range_base *vr,
 				      enum tree_code code,
 				      const value_range_base *vr0,
-				      const value_range_base *vr1)
+				      const value_range_base *vr1,
+				      tree type)
 {
   gcc_assert (code == MULT_EXPR
 	      || code == TRUNC_DIV_EXPR
@@ -1250,13 +1317,31 @@  extract_range_from_multiplicative_op (value_range_base *vr,
 	      || code == ROUND_DIV_EXPR
 	      || code == RSHIFT_EXPR
 	      || code == LSHIFT_EXPR);
-  gcc_assert (vr0->kind () == VR_RANGE
-	      && vr0->kind () == vr1->kind ());
+  if (!range_int_cst_p (vr1))
+    {
+      vr->set_varying (type);
+      return;
+    }
+
+  /* Even if vr0 is VARYING or otherwise not usable, we can derive
+     useful ranges just from the shift count.  E.g.
+     x >> 63 for signed 64-bit x is always [-1, 0].  */
+  value_range_base tem = vr0->normalize_symbolics ();
+  tree vr0_min, vr0_max;
+  if (tem.kind () == VR_RANGE)
+    {
+      vr0_min = tem.min ();
+      vr0_max = tem.max ();
+    }
+  else
+    {
+      vr0_min = vrp_val_min (type);
+      vr0_max = vrp_val_max (type);
+    }
 
-  tree type = vr0->type ();
   wide_int res_lb, res_ub;
-  wide_int vr0_lb = wi::to_wide (vr0->min ());
-  wide_int vr0_ub = wi::to_wide (vr0->max ());
+  wide_int vr0_lb = wi::to_wide (vr0_min);
+  wide_int vr0_ub = wi::to_wide (vr0_max);
   wide_int vr1_lb = wi::to_wide (vr1->min ());
   wide_int vr1_ub = wi::to_wide (vr1->max ());
   bool overflow_undefined = TYPE_OVERFLOW_UNDEFINED (type);
@@ -1270,7 +1355,7 @@  extract_range_from_multiplicative_op (value_range_base *vr,
 			      wide_int_to_tree (type, res_lb),
 			      wide_int_to_tree (type, res_ub));
   else
-    vr->set_varying ();
+    vr->set_varying (type);
 }
 
 /* If BOUND will include a symbolic bound, adjust it accordingly,
@@ -1479,7 +1564,7 @@  extract_range_from_binary_expr (value_range_base *vr,
   if (!INTEGRAL_TYPE_P (expr_type)
       && !POINTER_TYPE_P (expr_type))
     {
-      vr->set_varying ();
+      vr->set_varying (expr_type);
       return;
     }
 
@@ -1503,14 +1588,14 @@  extract_range_from_binary_expr (value_range_base *vr,
       && code != BIT_IOR_EXPR
       && code != BIT_XOR_EXPR)
     {
-      vr->set_varying ();
+      vr->set_varying (expr_type);
       return;
     }
 
   /* If both ranges are UNDEFINED, so is the result.  */
   if (vr0.undefined_p () && vr1.undefined_p ())
     {
-      vr->set_undefined ();
+      vr->set_undefined (expr_type);
       return;
     }
   /* If one of the ranges is UNDEFINED drop it to VARYING for the following
@@ -1518,9 +1603,9 @@  extract_range_from_binary_expr (value_range_base *vr,
      have UNDEFINED result for all or some value-ranges of the not UNDEFINED
      operand.  */
   else if (vr0.undefined_p ())
-    vr0.set_varying ();
+    vr0.set_varying (expr_type);
   else if (vr1.undefined_p ())
-    vr1.set_varying ();
+    vr1.set_varying (expr_type);
 
   /* We get imprecise results from ranges_from_anti_range when
      code is EXACT_DIV_EXPR.  We could mask out bits in the resulting
@@ -1592,7 +1677,7 @@  extract_range_from_binary_expr (value_range_base *vr,
 	  || vr0.symbolic_p ()
 	  || vr1.symbolic_p ()))
     {
-      vr->set_varying ();
+      vr->set_varying (expr_type);
       return;
     }
 
@@ -1610,7 +1695,7 @@  extract_range_from_binary_expr (value_range_base *vr,
 	  else if (vr0.zero_p () && vr1.zero_p ())
 	    vr->set_zero (expr_type);
 	  else
-	    vr->set_varying ();
+	    vr->set_varying (expr_type);
 	}
       else if (code == POINTER_PLUS_EXPR)
 	{
@@ -1639,7 +1724,7 @@  extract_range_from_binary_expr (value_range_base *vr,
 	  else if (vr0.zero_p () && vr1.zero_p ())
 	    vr->set_zero (expr_type);
 	  else
-	    vr->set_varying ();
+	    vr->set_varying (expr_type);
 	}
       else if (code == BIT_AND_EXPR)
 	{
@@ -1650,10 +1735,10 @@  extract_range_from_binary_expr (value_range_base *vr,
 	  else if (vr0.zero_p () || vr1.zero_p ())
 	    vr->set_zero (expr_type);
 	  else
-	    vr->set_varying ();
+	    vr->set_varying (expr_type);
 	}
       else
-	vr->set_varying ();
+	vr->set_varying (expr_type);
 
       return;
     }
@@ -1720,7 +1805,7 @@  extract_range_from_binary_expr (value_range_base *vr,
 	  if (((bool)min_ovf && sym_min_op0 != sym_min_op1)
 	      || ((bool)max_ovf && sym_max_op0 != sym_max_op1))
 	    {
-	      vr->set_varying ();
+	      vr->set_varying (expr_type);
 	      return;
 	    }
 
@@ -1731,7 +1816,7 @@  extract_range_from_binary_expr (value_range_base *vr,
 					 wmin, wmax, min_ovf, max_ovf);
 	  if (type == VR_VARYING)
 	    {
-	      vr->set_varying ();
+	      vr->set_varying (expr_type);
 	      return;
 	    }
 
@@ -1757,7 +1842,7 @@  extract_range_from_binary_expr (value_range_base *vr,
 	     a single range or anti-range as the above is
 		 [-INF+1, +INF(OVF)] intersected with ~[5, 5]
 	     but one could use a scheme similar to equivalences for this. */
-	  vr->set_varying ();
+	  vr->set_varying (expr_type);
 	  return;
 	}
     }
@@ -1774,7 +1859,7 @@  extract_range_from_binary_expr (value_range_base *vr,
 	vr->set (VR_RANGE, wide_int_to_tree (expr_type, wmin),
 		 wide_int_to_tree (expr_type, wmax));
       else
-	vr->set_varying ();
+	vr->set_varying (expr_type);
       return;
     }
   else if (code == MULT_EXPR)
@@ -1782,10 +1867,10 @@  extract_range_from_binary_expr (value_range_base *vr,
       if (!range_int_cst_p (&vr0)
 	  || !range_int_cst_p (&vr1))
 	{
-	  vr->set_varying ();
+	  vr->set_varying (expr_type);
 	  return;
 	}
-      extract_range_from_multiplicative_op (vr, code, &vr0, &vr1);
+      extract_range_from_multiplicative_op (vr, code, &vr0, &vr1, expr_type);
       return;
     }
   else if (code == RSHIFT_EXPR
@@ -1800,13 +1885,8 @@  extract_range_from_binary_expr (value_range_base *vr,
 	{
 	  if (code == RSHIFT_EXPR)
 	    {
-	      /* Even if vr0 is VARYING or otherwise not usable, we can derive
-		 useful ranges just from the shift count.  E.g.
-		 x >> 63 for signed 64-bit x is always [-1, 0].  */
-	      if (vr0.kind () != VR_RANGE || vr0.symbolic_p ())
-		vr0.set (VR_RANGE, vrp_val_min (expr_type),
-			 vrp_val_max (expr_type));
-	      extract_range_from_multiplicative_op (vr, code, &vr0, &vr1);
+	      extract_range_from_multiplicative_op (vr, code, &vr0, &vr1,
+						    expr_type);
 	      return;
 	    }
 	  else if (code == LSHIFT_EXPR
@@ -1827,7 +1907,7 @@  extract_range_from_binary_expr (value_range_base *vr,
 		}
 	    }
 	}
-      vr->set_varying ();
+      vr->set_varying (expr_type);
       return;
     }
   else if (code == TRUNC_DIV_EXPR
@@ -1843,7 +1923,7 @@  extract_range_from_binary_expr (value_range_base *vr,
       /* Special case explicit division by zero as undefined.  */
       if (vr1.zero_p ())
 	{
-	  vr->set_undefined ();
+	  vr->set_undefined (expr_type);
 	  return;
 	}
 
@@ -1864,7 +1944,7 @@  extract_range_from_binary_expr (value_range_base *vr,
 			       TYPE_OVERFLOW_UNDEFINED (expr_type),
 			       extra_range_p, extra_min, extra_max))
 	{
-	  vr->set_varying ();
+	  vr->set_varying (expr_type);
 	  return;
 	}
       vr->set (VR_RANGE, wide_int_to_tree (expr_type, wmin),
@@ -1882,7 +1962,7 @@  extract_range_from_binary_expr (value_range_base *vr,
     {
       if (vr1.zero_p ())
 	{
-	  vr->set_undefined ();
+	  vr->set_undefined (expr_type);
 	  return;
 	}
       wide_int wmin, wmax, tmp;
@@ -1923,7 +2003,7 @@  extract_range_from_binary_expr (value_range_base *vr,
 	      vr->set (VR_RANGE, min, max);
 	    }
 	  else
-	    vr->set_varying ();
+	    vr->set_varying (expr_type);
 	  return;
 	}
       else if (code == BIT_IOR_EXPR)
@@ -1941,7 +2021,7 @@  extract_range_from_binary_expr (value_range_base *vr,
 	      vr->set (VR_RANGE, min, max);
 	    }
 	  else
-	    vr->set_varying ();
+	    vr->set_varying (expr_type);
 	  return;
 	}
       else if (code == BIT_XOR_EXPR)
@@ -1957,7 +2037,7 @@  extract_range_from_binary_expr (value_range_base *vr,
 	      vr->set (VR_RANGE, min, max);
 	    }
 	  else
-	    vr->set_varying ();
+	    vr->set_varying (expr_type);
 	  return;
 	}
     }
@@ -1971,7 +2051,7 @@  extract_range_from_binary_expr (value_range_base *vr,
       || max == NULL_TREE
       || TREE_OVERFLOW_P (max))
     {
-      vr->set_varying ();
+      vr->set_varying (expr_type);
       return;
     }
 
@@ -1980,7 +2060,7 @@  extract_range_from_binary_expr (value_range_base *vr,
      Note that we do accept [-INF, -INF] and [+INF, +INF].  */
   if (vrp_val_is_min (min) && vrp_val_is_max (max))
     {
-      vr->set_varying ();
+      vr->set_varying (expr_type);
       return;
     }
 
@@ -1990,7 +2070,7 @@  extract_range_from_binary_expr (value_range_base *vr,
       /* If the new range has its limits swapped around (MIN > MAX),
 	 then the operation caused one of them to wrap around, mark
 	 the new range VARYING.  */
-      vr->set_varying ();
+      vr->set_varying (expr_type);
     }
   else
     vr->set (type, min, max);
@@ -2016,14 +2096,14 @@  extract_range_from_unary_expr (value_range_base *vr,
       || !(INTEGRAL_TYPE_P (type)
 	   || POINTER_TYPE_P (type)))
     {
-      vr->set_varying ();
+      vr->set_varying (type);
       return;
     }
 
   /* If VR0 is UNDEFINED, so is the result.  */
   if (vr0.undefined_p ())
     {
-      vr->set_undefined ();
+      vr->set_undefined (type);
       return;
     }
 
@@ -2088,7 +2168,7 @@  extract_range_from_unary_expr (value_range_base *vr,
 	  else if (vr0.zero_p ())
 	    vr->set_zero (type);
 	  else
-	    vr->set_varying ();
+	    vr->set_varying (type);
 	  return;
 	}
 
@@ -2123,7 +2203,7 @@  extract_range_from_unary_expr (value_range_base *vr,
 	  vr->set_and_canonicalize (VR_RANGE, min, max);
 	}
       else
-	vr->set_varying ();
+	vr->set_varying (outer_type);
       return;
     }
   else if (code == ABS_EXPR)
@@ -2136,7 +2216,7 @@  extract_range_from_unary_expr (value_range_base *vr,
 	vr->set (VR_RANGE, wide_int_to_tree (type, wmin),
 		 wide_int_to_tree (type, wmax));
       else
-	vr->set_varying ();
+	vr->set_varying (type);
       return;
     }
   else if (code == ABSU_EXPR)
@@ -2151,7 +2231,7 @@  extract_range_from_unary_expr (value_range_base *vr,
     }
 
   /* For unhandled operations fall back to varying.  */
-  vr->set_varying ();
+  vr->set_varying (type);
   return;
 }
 
@@ -5147,7 +5227,7 @@  vrp_prop::vrp_initialize ()
 	  if (!stmt_interesting_for_vrp (phi))
 	    {
 	      tree lhs = PHI_RESULT (phi);
-	      get_value_range (lhs)->set_varying ();
+	      get_value_range (lhs)->set_varying (TREE_TYPE (lhs));
 	      prop_set_simulate_again (phi, false);
 	    }
 	  else
@@ -5342,7 +5422,7 @@  vrp_prop::visit_stmt (gimple *stmt, edge *taken_edge_p, tree *output_p)
 	    use_operand_p use_p;
 	    enum ssa_prop_result res = SSA_PROP_VARYING;
 
-	    get_value_range (lhs)->set_varying ();
+	    get_value_range (lhs)->set_varying (TREE_TYPE (lhs));
 
 	    FOR_EACH_IMM_USE_FAST (use_p, iter, lhs)
 	      {
@@ -6132,7 +6212,12 @@  value_range_base::union_helper (const value_range_base *vr0,
 
   /* Work on a temporary so we can still use vr0 when union returns varying.  */
   value_range_base tem;
-  tem.set_and_canonicalize (vr0type, vr0min, vr0max);
+  if (vr0type == VR_UNDEFINED)
+    tem.set_undefined (TREE_TYPE (vr0->min ()));
+  else if (vr0type == VR_VARYING)
+    tem.set_varying (TREE_TYPE (vr0->min ()));
+  else
+    tem.set_and_canonicalize (vr0type, vr0min, vr0max);
 
   /* Failed to find an efficient meet.  Before giving up and setting
      the result to VARYING, see if we can at least derive a useful
@@ -6212,6 +6297,48 @@  value_range::union_ (const value_range *other)
     }
 }
 
+/* Normalize symbolics into constants.  */
+
+value_range_base
+value_range_base::normalize_symbolics () const
+{
+  tree ttype = type ();
+  bool min_symbolic = !is_gimple_min_invariant (min ());
+  bool max_symbolic = !is_gimple_min_invariant (max ());
+  if (varying_p () || undefined_p () || (!min_symbolic && !max_symbolic))
+    return *this;
+
+  // [SYM, SYM] -> VARYING
+  if (min_symbolic && max_symbolic)
+    return value_range_base (ttype);
+  if (kind () == VR_RANGE)
+    {
+      // [SYM, NUM] -> [-MIN, NUM]
+      if (min_symbolic)
+	return value_range_base (VR_RANGE, vrp_val_min (ttype), max ());
+      // [NUM, SYM] -> [NUM, +MAX]
+      return value_range_base (VR_RANGE, min (), vrp_val_max (ttype));
+    }
+  gcc_assert (kind () == VR_ANTI_RANGE);
+  // ~[SYM, NUM] -> [NUM + 1, +MAX]
+  if (min_symbolic)
+    {
+      if (!vrp_val_is_max (max ()))
+	{
+	  tree n = wide_int_to_tree (ttype, wi::to_wide (max ()) + 1);
+	  return value_range_base (VR_RANGE, n, vrp_val_max (ttype));
+	}
+      return value_range_base (ttype);
+    }
+  // ~[NUM, SYM] -> [-MIN, NUM - 1]
+  if (!vrp_val_is_min (min ()))
+    {
+      tree n = wide_int_to_tree (ttype, wi::to_wide (min ()) - 1);
+      return value_range_base (VR_RANGE, vrp_val_min (ttype), n);
+    }
+  return value_range_base (ttype);
+}
+
 /* Visit all arguments for PHI node PHI that flow through executable
    edges.  If a valid value range can be derived from all the incoming
    value ranges, set a new range for the LHS of PHI.  */
@@ -6885,7 +7012,7 @@  determine_value_range_1 (value_range_base *vr, tree expr)
 	vr->set (kind, wide_int_to_tree (TREE_TYPE (expr), min),
 		 wide_int_to_tree (TREE_TYPE (expr), max));
       else
-	vr->set_varying ();
+	vr->set_varying (TREE_TYPE (expr));
     }
 }
 
diff --git a/gcc/tree-vrp.h b/gcc/tree-vrp.h
index 4ec974f5fdb..5baabfd8b8d 100644
--- a/gcc/tree-vrp.h
+++ b/gcc/tree-vrp.h
@@ -43,6 +43,7 @@  class GTY((for_user)) value_range_base
 public:
   value_range_base ();
   value_range_base (value_range_kind, tree, tree);
+  value_range_base (tree type);
 
   void set (value_range_kind, tree, tree);
   void set (tree);
@@ -58,8 +59,8 @@  public:
   bool constant_p () const;
   bool undefined_p () const;
   bool varying_p () const;
-  void set_varying ();
-  void set_undefined ();
+  void set_varying (tree);
+  void set_undefined (tree = NULL);
 
   void union_ (const value_range_base *);
   void intersect (const value_range_base *);
@@ -76,6 +77,10 @@  public:
   bool nonzero_p () const;
   bool singleton_p (tree *result = NULL) const;
   void dump (FILE *) const;
+  void dump () const;
+
+  static bool supports_type_p (tree type);
+  value_range_base normalize_symbolics () const;
 
 protected:
   void check ();
@@ -133,8 +138,8 @@  class GTY((user)) value_range : public value_range_base
   bool equal_p (const value_range &, bool ignore_equivs) const;
 
   /* Types of value ranges.  */
-  void set_undefined ();
-  void set_varying ();
+  void set_undefined (tree = NULL);
+  void set_varying (tree);
 
   /* Equivalence bitmap methods.  */
   bitmap equiv () const;
@@ -145,6 +150,7 @@  class GTY((user)) value_range : public value_range_base
   void deep_copy (const value_range *);
   void set_and_canonicalize (enum value_range_kind, tree, tree, bitmap = NULL);
   void dump (FILE *) const;
+  void dump () const;
 
  private:
   /* Deep-copies bitmap argument.  */
@@ -254,6 +260,17 @@  struct assert_info
   tree expr;
 };
 
+// Return true if TYPE is a valid type for value_range to operate on.
+// Otherwise return FALSE.
+
+inline bool
+value_range_base::supports_type_p (tree type)
+{
+  if (type && (INTEGRAL_TYPE_P (type) || POINTER_TYPE_P (type)))
+    return type;
+  return NULL;
+}
+
 extern void register_edge_assert_for (tree, edge, enum tree_code,
 				      tree, tree, vec<assert_info> &);
 extern bool stmt_interesting_for_vrp (gimple *);
@@ -267,11 +284,11 @@  extern bool range_int_cst_singleton_p (const value_range_base *);
 extern int compare_values (tree, tree);
 extern int compare_values_warnv (tree, tree, bool *);
 extern int operand_less_p (tree, tree);
-extern bool vrp_val_is_min (const_tree);
-extern bool vrp_val_is_max (const_tree);
+extern bool vrp_val_is_min (const_tree, bool handle_pointers = false);
+extern bool vrp_val_is_max (const_tree, bool handle_pointers = false);
 
-extern tree vrp_val_min (const_tree);
-extern tree vrp_val_max (const_tree);
+extern tree vrp_val_min (const_tree, bool handle_pointers = false);
+extern tree vrp_val_max (const_tree, bool handle_pointers = false);
 
 extern void extract_range_from_unary_expr (value_range_base *vr,
 					   enum tree_code code,
@@ -301,12 +318,10 @@  extern value_range_kind determine_value_range (tree, wide_int *, wide_int *);
 inline bool
 range_includes_zero_p (const value_range_base *vr)
 {
+  /* UNDEFINED may not have a type in uninitialized ranges.  */
   if (vr->undefined_p ())
     return false;
 
-  if (vr->varying_p ())
-    return true;
-
   return vr->may_contain_p (build_zero_cst (vr->type ()));
 }
 
diff --git a/gcc/vr-values.c b/gcc/vr-values.c
index 3f20c1a6fe8..6569af391ed 100644
--- a/gcc/vr-values.c
+++ b/gcc/vr-values.c
@@ -48,6 +48,68 @@  along with GCC; see the file COPYING3.  If not see
 #include "attribs.h"
 #include "vr-values.h"
 #include "cfghooks.h"
+#include "wide-int-range.h"
+
+
+/* Cache VARYING value_ranges indexed by type.  */
+class type_range_cache
+{
+public:
+  type_range_cache ();
+  ~type_range_cache ();
+  value_range *varying_range (tree type);
+private:
+  value_range *new_varying (tree type);
+  hash_map<tree, value_range *> *m_type_table;
+  obstack m_range_obj;
+};
+
+/* Delete type cache.  */
+type_range_cache::~type_range_cache ()
+{
+  delete m_type_table;
+  obstack_free (&m_range_obj, NULL);
+}
+
+/* Create a new type cache.  */
+type_range_cache::type_range_cache ()
+{
+  /* Allocate a map and a local obstack.  */
+  m_type_table = new hash_map<tree, value_range *>;
+  gcc_obstack_init (&m_range_obj);
+}
+
+/* Allocate a new range from the obstack and set it to VARYING for TYPE.  */
+inline value_range *
+type_range_cache::new_varying (tree type)
+{
+  /* Allocate memory.  */
+  void *p = XOBNEW (&m_range_obj, value_range);
+  /* Invoke the constructors on the memory using placement new.  */
+  value_range *new_p = new (p) value_range ();
+  /* Initialize it to varying.  */
+  new_p->set_varying (type);
+  return new_p;
+}
+
+/* Return a varying object for TYPE.  If it already exists, return it.
+   Otherwise allocate a new one and register it in the table.  */
+value_range *
+type_range_cache::varying_range (tree type)
+{
+  bool existed;
+  value_range *&slot = m_type_table->get_or_insert (type, &existed);
+  if (!existed)
+    slot = new_varying (type);
+  else
+    {
+      /* Sanity check to ensure this varying hasn't been modified.  */
+      value_range v;
+      v.set_varying (type);
+      gcc_checking_assert (v.equal_p (*slot, true));
+    }
+  return slot;
+}
 
 /* Set value range VR to a non-negative range of type TYPE.  */
 
@@ -64,7 +126,7 @@  static inline void
 set_value_range_to_truthvalue (value_range *vr, tree type)
 {
   if (TYPE_PRECISION (type) == 1)
-    vr->set_varying ();
+    vr->set_varying (type);
   else
     vr->update (VR_RANGE, build_int_cst (type, 0), build_int_cst (type, 1));
 }
@@ -78,7 +140,7 @@  set_value_range_to_truthvalue (value_range *vr, tree type)
 value_range *
 vr_values::get_value_range (const_tree var)
 {
-  static const value_range vr_const_varying (VR_VARYING, NULL, NULL);
+  tree type = TREE_TYPE (var);
   value_range *vr;
   tree sym;
   unsigned ver = SSA_NAME_VERSION (var);
@@ -91,7 +153,7 @@  vr_values::get_value_range (const_tree var)
      We should get here at most from the substitute-and-fold stage which
      will never try to change values.  */
   if (ver >= num_vr_values)
-    return CONST_CAST (value_range *, &vr_const_varying);
+    return type_cache->varying_range (type);
 
   vr = vr_value[ver];
   if (vr)
@@ -99,11 +161,15 @@  vr_values::get_value_range (const_tree var)
 
   /* After propagation finished do not allocate new value-ranges.  */
   if (values_propagated)
-    return CONST_CAST (value_range *, &vr_const_varying);
+    {
+      vr = vrp_value_range_pool.allocate ();
+      vr->set_varying (type);
+      return vr;
+    }
 
   /* Create a default value range.  */
   vr_value[ver] = vr = vrp_value_range_pool.allocate ();
-  vr->set_undefined ();
+  vr->set_undefined (type);
 
   /* If VAR is a default definition of a parameter, the variable can
      take any value in VAR's type.  */
@@ -126,10 +192,10 @@  vr_values::get_value_range (const_tree var)
 	    {
 	      get_range_info (var, *vr);
 	      if (vr->undefined_p ())
-		vr->set_varying ();
+		vr->set_varying (TREE_TYPE (sym));
 	    }
 	  else
-	    vr->set_varying ();
+	    vr->set_varying (TREE_TYPE (sym));
 	}
       else if (TREE_CODE (sym) == RESULT_DECL
 	       && DECL_BY_REFERENCE (sym))
@@ -150,12 +216,11 @@  vr_values::set_defs_to_varying (gimple *stmt)
   ssa_op_iter i;
   tree def;
   FOR_EACH_SSA_TREE_OPERAND (def, stmt, i, SSA_OP_DEF)
-    {
-      value_range *vr = get_value_range (def);
-      /* Avoid writing to vr_const_varying get_value_range may return.  */
-      if (!vr->varying_p ())
-	vr->set_varying ();
-    }
+    if (value_range_base::supports_type_p (TREE_TYPE (def)))
+      {
+	value_range *vr = get_value_range (def);
+	vr->set_varying (TREE_TYPE (def));
+      }
 }
 
 /* Update the value range and equivalence set for variable VAR to
@@ -198,13 +263,13 @@  vr_values::update_value_range (const_tree var, value_range *new_vr)
 	 called, if we are anyway, keep it VARYING.  */
       if (old_vr->varying_p ())
 	{
-	  new_vr->set_varying ();
+	  new_vr->set_varying (new_vr->type ());
 	  is_new = false;
 	}
       else if (new_vr->undefined_p ())
 	{
-	  old_vr->set_varying ();
-	  new_vr->set_varying ();
+	  old_vr->set_varying (TREE_TYPE (var));
+	  new_vr->set_varying (TREE_TYPE (var));
 	  return true;
 	}
       else
@@ -435,7 +500,7 @@  vr_values::extract_range_for_var_from_comparison_expr (tree var,
   if ((POINTER_TYPE_P (type) && cond_code != NE_EXPR && cond_code != EQ_EXPR)
       || limit == var)
     {
-      vr_p->set_varying ();
+      vr_p->set_varying (type);
       return;
     }
 
@@ -595,7 +660,7 @@  vr_values::extract_range_for_var_from_comparison_expr (tree var,
 	 all should be optimized away above us.  */
       if (cond_code == LT_EXPR
 	  && compare_values (max, min) == 0)
-	vr_p->set_varying ();
+	vr_p->set_varying (TREE_TYPE (min));
       else
 	{
 	  /* For LT_EXPR, we create the range [MIN, MAX - 1].  */
@@ -635,7 +700,7 @@  vr_values::extract_range_for_var_from_comparison_expr (tree var,
 	 all should be optimized away above us.  */
       if (cond_code == GT_EXPR
 	  && compare_values (min, max) == 0)
-	vr_p->set_varying ();
+	vr_p->set_varying (TREE_TYPE (min));
       else
 	{
 	  /* For GT_EXPR, we create the range [MIN + 1, MAX].  */
@@ -743,14 +808,14 @@  vr_values::extract_range_from_binary_expr (value_range *vr,
   else if (is_gimple_min_invariant (op0))
     vr0.set (op0);
   else
-    vr0.set_varying ();
+    vr0.set_varying (TREE_TYPE (op0));
 
   if (TREE_CODE (op1) == SSA_NAME)
     vr1 = *(get_value_range (op1));
   else if (is_gimple_min_invariant (op1))
     vr1.set (op1);
   else
-    vr1.set_varying ();
+    vr1.set_varying (TREE_TYPE (op1));
 
   /* If one argument is varying, we can sometimes still deduce a
      range for the output: any + [3, +INF] is in [MIN+3, +INF].  */
@@ -891,7 +956,7 @@  vr_values::extract_range_from_unary_expr (value_range *vr, enum tree_code code,
   else if (is_gimple_min_invariant (op0))
     vr0.set (op0);
   else
-    vr0.set_varying ();
+    vr0.set_varying (type);
 
   ::extract_range_from_unary_expr (vr, code, type, &vr0, TREE_TYPE (op0));
 }
@@ -913,7 +978,7 @@  vr_values::extract_range_from_cond_expr (value_range *vr, gassign *stmt)
   else if (is_gimple_min_invariant (op0))
     tem0.set (op0);
   else
-    tem0.set_varying ();
+    tem0.set_varying (TREE_TYPE (op0));
 
   tree op1 = gimple_assign_rhs3 (stmt);
   value_range tem1;
@@ -923,7 +988,7 @@  vr_values::extract_range_from_cond_expr (value_range *vr, gassign *stmt)
   else if (is_gimple_min_invariant (op1))
     tem1.set (op1);
   else
-    tem1.set_varying ();
+    tem1.set_varying (TREE_TYPE (op1));
 
   /* The resulting value range is the union of the operand ranges */
   vr->deep_copy (vr0);
@@ -975,14 +1040,14 @@  vr_values::check_for_binary_op_overflow (enum tree_code subcode, tree type,
   else if (TREE_CODE (op0) == INTEGER_CST)
     vr0.set (op0);
   else
-    vr0.set_varying ();
+    vr0.set_varying (TREE_TYPE (op0));
 
   if (TREE_CODE (op1) == SSA_NAME)
     vr1 = *get_value_range (op1);
   else if (TREE_CODE (op1) == INTEGER_CST)
     vr1.set (op1);
   else
-    vr1.set_varying ();
+    vr1.set_varying (TREE_TYPE (op1));
 
   tree vr0min = vr0.min (), vr0max = vr0.max ();
   tree vr1min = vr1.min (), vr1max = vr1.max ();
@@ -1310,7 +1375,7 @@  vr_values::extract_range_basic (value_range *vr, gimple *stmt)
 	  if (vr->kind () == VR_RANGE
 	      && (vr->min () == vr->max ()
 		  || operand_equal_p (vr->min (), vr->max (), 0)))
-	    vr->set_varying ();
+	    vr->set_varying (vr->type ());
 	  return;
 	}
     }
@@ -1366,7 +1431,7 @@  vr_values::extract_range_basic (value_range *vr, gimple *stmt)
 			vr->set (build_int_cst (type, ovf));
 		      else if (TYPE_PRECISION (type) == 1
 			       && !TYPE_UNSIGNED (type))
-			vr->set_varying ();
+			vr->set_varying (type);
 		      else
 			vr->set (VR_RANGE, build_int_cst (type, 0),
 				 build_int_cst (type, 1));
@@ -1411,7 +1476,7 @@  vr_values::extract_range_basic (value_range *vr, gimple *stmt)
       vr->equiv_clear ();
     }
   else
-    vr->set_varying ();
+    vr->set_varying (type);
 }
 
 
@@ -1447,7 +1512,7 @@  vr_values::extract_range_from_assignment (value_range *vr, gassign *stmt)
 	   && is_gimple_min_invariant (gimple_assign_rhs1 (stmt)))
     vr->set (gimple_assign_rhs1 (stmt));
   else
-    vr->set_varying ();
+    vr->set_varying (TREE_TYPE (gimple_assign_lhs (stmt)));
 
   if (vr->varying_p ())
     extract_range_basic (vr, stmt);
@@ -1926,12 +1991,14 @@  vr_values::vr_values () : vrp_value_range_pool ("Tree VRP value ranges")
   bitmap_obstack_initialize (&vrp_equiv_obstack);
   to_remove_edges = vNULL;
   to_update_switch_stmts = vNULL;
+  type_cache = new type_range_cache;
 }
 
 /* Free VRP lattice.  */
 
 vr_values::~vr_values ()
 {
+  delete type_cache;
   /* Free allocated memory.  */
   free (vr_value);
   free (vr_phi_edge_counts);
@@ -2856,7 +2923,7 @@  vr_values::extract_range_from_phi_node (gphi *phi, value_range *vr_result)
 		      vr_arg_tem.set (vr_arg_->kind (), vr_arg_->min (),
 				      vr_arg_->max (), NULL);
 		      if (vr_arg_tem.symbolic_p ())
-			vr_arg_tem.set_varying ();
+			vr_arg_tem.set_varying (vr_arg_->type ());
 		    }
 		  else
 		    vr_arg = vr_arg_;
@@ -2978,7 +3045,7 @@  vr_values::extract_range_from_phi_node (gphi *phi, value_range *vr_result)
   goto update_range;
 
 varying:
-  vr_result->set_varying ();
+  vr_result->set_varying (TREE_TYPE (lhs));
 
 scev_check:
   /* If this is a loop PHI node SCEV may known more about its value-range.
@@ -2999,7 +3066,7 @@  infinite_check:
 	   || compare_values (vr_result->min (), vr_result->max ()) > 0))
     ;
   else
-    vr_result->set_varying ();
+    vr_result->set_varying (TREE_TYPE (lhs));
 
   /* If the new range is different than the previous value, keep
      iterating.  */
diff --git a/gcc/vr-values.h b/gcc/vr-values.h
index bd67f73701e..f14a4fa9973 100644
--- a/gcc/vr-values.h
+++ b/gcc/vr-values.h
@@ -137,6 +137,7 @@  class vr_values
     tree vec;
   };
 
+  class type_range_cache *type_cache;
   vec<edge> to_remove_edges;
   vec<switch_update> to_update_switch_stmts;
 };