diff mbox

[RFC,IPA-VRP] Early VRP Implementation

Message ID 19ff8188-aed7-0f9e-cc0b-0603698787ff@linaro.org
State New
Headers show

Commit Message

Kugan Vivekanandarajah July 26, 2016, 12:27 p.m. UTC
Hi Riachard,

Thanks for the review. Here is an updated patch with comments below.

> +/* Restore/Pop all the old VRs maintained in the cond_stack.  */
> +
> +void evrp_dom_walker::finalize_dom_walker ()
> +{
> +  while (!cond_stack.is_empty ())
> +    {
> +      tree var = cond_stack.last ().second;
> +      pop_value_range (var);
> +      cond_stack.pop ();
> +    }
>
> why is this necessary at all?  Looks like a waste of time (and the
> stack should be empty here
> anyway).  I think you leak cond_stack as well, I suppose using
> auto_vec might work here,
> you have to check.
Done.

>
>>>
>>> +         /* Discover VR when condition is true.  */
>>> +         if (te == e
>>> +             && !TREE_OVERFLOW_P (op0)
>>> +             && !TREE_OVERFLOW_P (op1))
>>
>>
>> This is mainly to handle gcc.dg/pr38934.c. This would result in ICE from
>> set_value_range otherwise:
>>
>> gcc_assert ((!TREE_OVERFLOW_P (min) || is_overflow_infinity (min))
>>                   && (!TREE_OVERFLOW_P (max) || is_overflow_infinity
>> (max)));
>
> Ok, instead make sure to remove the overflow flag via
>
>   if (TREE_OVERFLOW_P (op0))
>     op0 = drop_tree_overflow (op0);
  ...
Done.

>
> it looks like you want to merge the if ( & EDGE_TRUE_VALUE) and FALSE_VALUE
> cases and only invert the tree comparison for the false edge.
Done.

>
> +             tree cond = build2 (code, boolean_type_node, op0, op1);
> +             extract_range_for_var_from_comparison_expr (&vr, op0, cond);
>
> no wasteful tree building here please.  Instead of cond pass in code,
> op0 and op1
> as separate arguments.
Done.

>
> +         /* If we found any usable VR, set the VR to ssa_name and create a
> +            PUSH old value in the cond_stack with the old VR.  */
> +         if (vr.type == VR_RANGE || vr.type == VR_ANTI_RANGE)
> +           {
> +             value_range *new_vr = vrp_value_range_pool.allocate ();
> +             memset (new_vr, 0, sizeof (*new_vr));
> +             *new_vr = vr;
> +             cond_stack.safe_push (std::make_pair (bb, op0));
>
> the memset looks redundant given you copy-assing from vr anyway.
Done.

>
> You seem to miss the chance to register ranges for x_2 in i_1 < x_2 where
> both i_1 and x_2 might have ranges that are useful.
I will address this once the basic structure is in place if that is OK 
with you.

>
> push and pop_value_range should be methods in the dom walker class
> and vrp_stack and cond_stack should be a single stack.  You can omit
> basic_block if you use a "magic" NULL_TREE var as marker (simply
> push a NULL_TREE, NULL pair at the end of before_dom_children).
>
Done.

>>>
>>> you can use e->flags & EDGE_TRUE_VALUE/EDGE_FALSE_VALUE
>>>
>>> why do you need those TREE_OVERFLOW checks?
>>>
>>> +             tree cond = build2 (code, boolean_type_node, op0, op1);
>>> +             tree a = build2 (ASSERT_EXPR, TREE_TYPE (op0), op0, cond);
>>> +             extract_range_from_assert (&vr, a);
>>>
>>> so I was hoping that the "refactoring" patch in the series would expose a
>>> more
>>> useful interface than extract_range_from_assert ... namely one that can
>>> extract a range from the comparison directly and does not require building
>>> a scratch ASSERT_EXPR.
>>
>>
>> I have split extract_range_from_assert into
>> extract_range_for_var_from_comparison_expr and using it now. Is that better?
>
> See above.
>
>>>
>>>
>>> +         /* If we found any usable VR, set the VR to ssa_name and create
>>> a
>>> +            restore point in the cond_stack with the  old VR. */
>>> +         if (vr.type == VR_RANGE || vr.type == VR_ANTI_RANGE)
>>> +           {
>>> +             value_range *new_vr = XCNEW (value_range);
>>> +             *new_vr = vr;
>>> +             cond_stack.safe_push (std::make_pair (bb,
>>> +                                                   std::make_pair (op0,
>>> +
>>> old_vr)));
>>> +             change_value_range (op0, new_vr);
>>>
>>> I don't like 'change_value_range' as interface, please integrate that into
>>> a push/pop_value_range style interface instead.
>>
>>
>> Tried using push/pop interface.
>>>
>>>
>>> +       vrp_visit_stmt (stmt, &taken_edge_p, &output_p);
>>> +    }
>>> +
>>> +  return NULL;
>>>
>>> you should return taken_edge_p (misnamed as it isn't a pointer) and take
>>> advantage of EDGE_EXECUTABLE.  Again see DOM/SCCVN (might want to
>>> defer this as a followup improvement).
>>
>>
>> I have added a TODO to this effect and will comeback to it.
>>>
>>>
>>> Note that the advantage of a DOM-based VRP is that backtracking is easy
>>> to implement (you don't do that yet).  That is, after DEF got a (better)
>>> value-range you can simply re-visit all the defs of its uses (and
>>> recursively).
>>> I think you have to be careful with stmts that might prematurely leave a
>>> BB
>>> though (like via EH).  So sth for a followup as well.
>>
>>
>> Is this OK now. Bootstrapped and regression tested on x86_64-linux  with no
>> new regressions.
>
> Better, still not perfect.
>
> I'm also arguing on the pass placement now - you put it where it may make sense
> for IPA VRP (but then IPA VRP could simply do this in its analysis phase) but
> not so much where it makes sense during the early pipeline.  I think it makes
> sense after FRE.
>
> Looking at how you finalize I see several issues with simply re-using
> vrp_finalize.
>
> First of all the loop doing the set_range_info will turn up with
> nothing as you've
> popped off all value-ranges from the lattice.  Same for substitute-and-fold (or
> jump-threading).
I am not sure understanding you. I am poping the value ranges for scope 
when we leave them. Other value ranges which lives in all the regions 
will remain.

>
> What you instead need to do with the DOM scheme is at the point you
> call vrp_visit_stmt do sth like
>
>      vrp_visit_stmt (....);
>      if (fold_stmt (&gsi, follow_single_use_edges))
>        update_stmt ();
I have added this. I think this will be good as we would be able to 
optimize with value ranges that is valid within the scope.

>      tree def = SINGLE_SSA_DEF_OPERAND (stmt, SSA_OP_DEF);
>      if (def
>          && INTEGRAL_TYPE_P (TREE_TYPE (def))
>          && (TREE_CODE (vr_value[i]->min) == INTEGER_CST)
>           && (TREE_CODE (vr_value[i]->max) == INTEGER_CST)
>           && (vr_value[i]->type == VR_RANGE
>               || vr_value[i]->type == VR_ANTI_RANGE))
>         set_range_info (name, vr_value[i]->type, vr_value[i]->min,
>                         vr_value[i]->max);
>
I am not sure if this is needed. I dont know if my push/pop value range 
implementation is not what you wanted. If you still prefer, I am happy 
to add this.

> thus please split vrp_finalize into two parts, one of it with the memory
> freeing which is the one you'd call.
>
> Note that EVRP is not enabled by default at any optimization level
> it seems so bootstrap / test of it would be useless (did you only
> test with the full series?  I never looked at the IPA VRP part)
>
I have:

+ftree-evrp
+Common Report Var(flag_tree_early_vrp) Init(1) Optimization
+Perform Early Value Range Propagation on trees.

I would like to run this only for -O2 and above but for now I am using 
this to test.

I  have tested the last set of patch separately.

I will do more testing on this patch based on your feedback. Does this 
look better?

Thanks,
Kugan

Comments

Richard Biener July 26, 2016, 1:37 p.m. UTC | #1
On Tue, Jul 26, 2016 at 2:27 PM, kugan
<kugan.vivekanandarajah@linaro.org> wrote:
>
>
> Hi Riachard,
>
> Thanks for the review. Here is an updated patch with comments below.
>
>> +/* Restore/Pop all the old VRs maintained in the cond_stack.  */
>> +
>> +void evrp_dom_walker::finalize_dom_walker ()
>> +{
>> +  while (!cond_stack.is_empty ())
>> +    {
>> +      tree var = cond_stack.last ().second;
>> +      pop_value_range (var);
>> +      cond_stack.pop ();
>> +    }
>>
>> why is this necessary at all?  Looks like a waste of time (and the
>> stack should be empty here
>> anyway).  I think you leak cond_stack as well, I suppose using
>> auto_vec might work here,
>> you have to check.
>
> Done.
>
>>
>>>>
>>>> +         /* Discover VR when condition is true.  */
>>>> +         if (te == e
>>>> +             && !TREE_OVERFLOW_P (op0)
>>>> +             && !TREE_OVERFLOW_P (op1))
>>>
>>>
>>>
>>> This is mainly to handle gcc.dg/pr38934.c. This would result in ICE from
>>> set_value_range otherwise:
>>>
>>> gcc_assert ((!TREE_OVERFLOW_P (min) || is_overflow_infinity (min))
>>>                   && (!TREE_OVERFLOW_P (max) || is_overflow_infinity
>>> (max)));
>>
>>
>> Ok, instead make sure to remove the overflow flag via
>>
>>   if (TREE_OVERFLOW_P (op0))
>>     op0 = drop_tree_overflow (op0);
>
>  ...
> Done.
>
>>
>> it looks like you want to merge the if ( & EDGE_TRUE_VALUE) and
>> FALSE_VALUE
>> cases and only invert the tree comparison for the false edge.
>
> Done.
>
>>
>> +             tree cond = build2 (code, boolean_type_node, op0, op1);
>> +             extract_range_for_var_from_comparison_expr (&vr, op0, cond);
>>
>> no wasteful tree building here please.  Instead of cond pass in code,
>> op0 and op1
>> as separate arguments.
>
> Done.
>
>>
>> +         /* If we found any usable VR, set the VR to ssa_name and create
>> a
>> +            PUSH old value in the cond_stack with the old VR.  */
>> +         if (vr.type == VR_RANGE || vr.type == VR_ANTI_RANGE)
>> +           {
>> +             value_range *new_vr = vrp_value_range_pool.allocate ();
>> +             memset (new_vr, 0, sizeof (*new_vr));
>> +             *new_vr = vr;
>> +             cond_stack.safe_push (std::make_pair (bb, op0));
>>
>> the memset looks redundant given you copy-assing from vr anyway.
>
> Done.
>
>>
>> You seem to miss the chance to register ranges for x_2 in i_1 < x_2 where
>> both i_1 and x_2 might have ranges that are useful.
>
> I will address this once the basic structure is in place if that is OK with
> you.

Sure.  Just note it down somewhere.

>>
>> push and pop_value_range should be methods in the dom walker class
>> and vrp_stack and cond_stack should be a single stack.  You can omit
>> basic_block if you use a "magic" NULL_TREE var as marker (simply
>> push a NULL_TREE, NULL pair at the end of before_dom_children).
>>
> Done.
>
>
>>>>
>>>> you can use e->flags & EDGE_TRUE_VALUE/EDGE_FALSE_VALUE
>>>>
>>>> why do you need those TREE_OVERFLOW checks?
>>>>
>>>> +             tree cond = build2 (code, boolean_type_node, op0, op1);
>>>> +             tree a = build2 (ASSERT_EXPR, TREE_TYPE (op0), op0, cond);
>>>> +             extract_range_from_assert (&vr, a);
>>>>
>>>> so I was hoping that the "refactoring" patch in the series would expose
>>>> a
>>>> more
>>>> useful interface than extract_range_from_assert ... namely one that can
>>>> extract a range from the comparison directly and does not require
>>>> building
>>>> a scratch ASSERT_EXPR.
>>>
>>>
>>>
>>> I have split extract_range_from_assert into
>>> extract_range_for_var_from_comparison_expr and using it now. Is that
>>> better?
>>
>>
>> See above.
>>
>>>>
>>>>
>>>> +         /* If we found any usable VR, set the VR to ssa_name and
>>>> create
>>>> a
>>>> +            restore point in the cond_stack with the  old VR. */
>>>> +         if (vr.type == VR_RANGE || vr.type == VR_ANTI_RANGE)
>>>> +           {
>>>> +             value_range *new_vr = XCNEW (value_range);
>>>> +             *new_vr = vr;
>>>> +             cond_stack.safe_push (std::make_pair (bb,
>>>> +                                                   std::make_pair (op0,
>>>> +
>>>> old_vr)));
>>>> +             change_value_range (op0, new_vr);
>>>>
>>>> I don't like 'change_value_range' as interface, please integrate that
>>>> into
>>>> a push/pop_value_range style interface instead.
>>>
>>>
>>>
>>> Tried using push/pop interface.
>>>>
>>>>
>>>>
>>>> +       vrp_visit_stmt (stmt, &taken_edge_p, &output_p);
>>>> +    }
>>>> +
>>>> +  return NULL;
>>>>
>>>> you should return taken_edge_p (misnamed as it isn't a pointer) and take
>>>> advantage of EDGE_EXECUTABLE.  Again see DOM/SCCVN (might want to
>>>> defer this as a followup improvement).
>>>
>>>
>>>
>>> I have added a TODO to this effect and will comeback to it.
>>>>
>>>>
>>>>
>>>> Note that the advantage of a DOM-based VRP is that backtracking is easy
>>>> to implement (you don't do that yet).  That is, after DEF got a (better)
>>>> value-range you can simply re-visit all the defs of its uses (and
>>>> recursively).
>>>> I think you have to be careful with stmts that might prematurely leave a
>>>> BB
>>>> though (like via EH).  So sth for a followup as well.
>>>
>>>
>>>
>>> Is this OK now. Bootstrapped and regression tested on x86_64-linux  with
>>> no
>>> new regressions.
>>
>>
>> Better, still not perfect.
>>
>> I'm also arguing on the pass placement now - you put it where it may make
>> sense
>> for IPA VRP (but then IPA VRP could simply do this in its analysis phase)
>> but
>> not so much where it makes sense during the early pipeline.  I think it
>> makes
>> sense after FRE.
>>
>> Looking at how you finalize I see several issues with simply re-using
>> vrp_finalize.
>>
>> First of all the loop doing the set_range_info will turn up with
>> nothing as you've
>> popped off all value-ranges from the lattice.  Same for
>> substitute-and-fold (or
>> jump-threading).
>
> I am not sure understanding you. I am poping the value ranges for scope when
> we leave them. Other value ranges which lives in all the regions will
> remain.

Ah, yeah.  Sorry for the confusion.

>>
>> What you instead need to do with the DOM scheme is at the point you
>> call vrp_visit_stmt do sth like
>>
>>      vrp_visit_stmt (....);
>>      if (fold_stmt (&gsi, follow_single_use_edges))
>>        update_stmt ();
>
> I have added this. I think this will be good as we would be able to optimize
> with value ranges that is valid within the scope.

Yes, it also improves memory access locality and thus compile-time I think.

>>      tree def = SINGLE_SSA_DEF_OPERAND (stmt, SSA_OP_DEF);
>>      if (def
>>          && INTEGRAL_TYPE_P (TREE_TYPE (def))
>>          && (TREE_CODE (vr_value[i]->min) == INTEGER_CST)
>>           && (TREE_CODE (vr_value[i]->max) == INTEGER_CST)
>>           && (vr_value[i]->type == VR_RANGE
>>               || vr_value[i]->type == VR_ANTI_RANGE))
>>         set_range_info (name, vr_value[i]->type, vr_value[i]->min,
>>                         vr_value[i]->max);
>>
> I am not sure if this is needed. I dont know if my push/pop value range
> implementation is not what you wanted. If you still prefer, I am happy to
> add this.

See above, also if you fold at this point names used should have their
ranges updated so match.pd patterns can utilize them.

I see you are still calling substitute_and_fold though so if you want to
continue doing that (for now) then you don't need any of the above.
If you stop doing that then you need to supply vrp_valueize instead of
follow_single_use_edges to fold_stmt, otherwise you won't get any
constants propagated.

I'd prefer to not call substitute_and_fold as that is a function of the
SSA propagator.

>
>> thus please split vrp_finalize into two parts, one of it with the memory
>> freeing which is the one you'd call.
>>
>> Note that EVRP is not enabled by default at any optimization level
>> it seems so bootstrap / test of it would be useless (did you only
>> test with the full series?  I never looked at the IPA VRP part)
>>
> I have:
>
> +ftree-evrp
> +Common Report Var(flag_tree_early_vrp) Init(1) Optimization
> +Perform Early Value Range Propagation on trees.
>
> I would like to run this only for -O2 and above but for now I am using this
> to test.

There is an array in opts.c, default_options_table, where you can set
defaults based on the optimization level.  If you want to turn it on
at -O2 that would match the setting for -ftree-vrp in which case I would
rather not add another option.  For debugging one can use
-fdisable-tree-evrp1 for example.

+value_range *evrp_dom_walker::pop_value_range (const_tree var)
+{

coding style nit: newline missing after return type.

It seems that in your pop_value_range you assume you only pop one
range per BB - while that's likely true at the moment it will be a limitation
in the future.  You want to pop ranges until you hit the NULL marker
in after_dom_children and unconditionally push a NULL marker.

For example to match current VRPs behavior on say

   i_2 = (int) j_3;
   if (i_2 < 0)
     ...

which can register an assert for j_3 when i_2 < 0 is true we'd do that
by re-simulating DEFs of uses we figured out new ranges of (and all
their uses).  All those ranges would be temporary as well, thus they'd
need to be pushed/popped.  In my quick prototype this was done
using a worklist seeded by the names we can derive a range from from
conditionals and "SSA propagating" from it.  Note that for this
the generic vrp_visit_stmt cannot be re-used as it doesn't push/pop,
factoring out the lattice update is what is needed here.

+/* Visit the basic blocks in the dominance order and set the Value Ranges (VR)
+   for SSA_NAMEs in the scope.  Use this VR to discover more VRs.  Restore the
+   old VR once the scope is exited.  */
+
+static bool
+evrp_visit_phi_node_local (gphi *phi)
+{
+  size_t i;
+  tree lhs = PHI_RESULT (phi);
+  value_range vr_result = VR_INITIALIZER;
+  bool first = true;
+  int edges;
+
+  edges = 0;
+  for (i = 0; i < gimple_phi_num_args (phi); i++)
+    {
+      edge e = gimple_phi_arg_edge (phi, i);
+      tree arg = PHI_ARG_DEF (phi, i);
+      value_range vr_arg = VR_INITIALIZER;
+      ++edges;
+
+      /* If there is a back-edge, set the result to VARYING.  */
+      if (e->flags & (EDGE_DFS_BACK | EDGE_COMPLEX))
+       {
+         set_value_range_to_varying (&vr_result);
+         break;
+       }
...
+      /* If any of the RHS value is VARYING, set the result to VARYING.  */
+      if ((vr_arg.type != VR_RANGE)
+         && (vr_arg.type != VR_ANTI_RANGE))
+       {
+         set_value_range_to_varying (&vr_result);
+         break;
+       }

this shows that you need to start conservative for a DOM based VRP,
thus with all lattice values initialized to VARYING (but undefined SSA
names of course still can be UNDEFINED) rather than UNDEFINED.

+      if (TREE_CODE (arg) == SSA_NAME)
+       vr_arg = *(get_value_range (arg));
+      else
+       set_value_range_to_varying (&vr_arg);

err - what about constants?  When you initialize the lattice properly
you should be able to re-use vrp_visit_phi_node (maybe split out
its head to avoid using SCEV or the iteration limitation).

Btw, you don't want to call vrp_initialize in evrp either, this is setting
SSA propagator state which you do not want to do.  Please factor
out lattice allocation/deallocation instead.  I see that might require
really factoring vrp_visit_stmt into a function that omits updating
the lattice and just returns a range for the single DEF.

That said, a good refactoring is to split the SSA propagator callback
implementations (vrp_visit_stmt and vrp_visit_phi_node) into workers
returning a value range and the callback that does the update_value_range
call plus returing a SSA propgator state.  You can then re-use the worker.

Thanks,
Richard.

> I  have tested the last set of patch separately.
>
> I will do more testing on this patch based on your feedback. Does this look
> better?
>
> Thanks,
> Kugan
>
>
Kugan Vivekanandarajah July 28, 2016, 7:35 a.m. UTC | #2
Hi Richard,

Thanks for the review.
>
> It seems that in your pop_value_range you assume you only pop one
> range per BB - while that's likely true at the moment it will be a limitation
> in the future.  You want to pop ranges until you hit the NULL marker
> in after_dom_children and unconditionally push a NULL marker.
>
I understand. Right now, I am adding only one assert based on the 
condition. But in future, we will be adding more so this is needed. I 
will do that.

> For example to match current VRPs behavior on say
>
>    i_2 = (int) j_3;
>    if (i_2 < 0)
>      ...
>
> which can register an assert for j_3 when i_2 < 0 is true we'd do that
> by re-simulating DEFs of uses we figured out new ranges of (and all
> their uses).  All those ranges would be temporary as well, thus they'd
> need to be pushed/popped.  In my quick prototype this was done
> using a worklist seeded by the names we can derive a range from from
> conditionals and "SSA propagating" from it.  Note that for this
> the generic vrp_visit_stmt cannot be re-used as it doesn't push/pop,
> factoring out the lattice update is what is needed here.
>

I dont think I understand this part. vrp_visit_stmt is going to add 
value ranges for the variables defined in the if-block (in the example 
below it is for t). If we push the value range for i_2 and j_3 when we 
enter if-block, vrp_visit_stmt should compute "t" correctly. When we 
leave the if-block, we will pop i_2 and j_3.

     i_2 = (int) j_3;
     if (i_2 < 0)
     {
       t = j_2 * 2;	
     }
Am I missing something here?

> +/* Visit the basic blocks in the dominance order and set the Value Ranges (VR)
> +   for SSA_NAMEs in the scope.  Use this VR to discover more VRs.  Restore the
> +   old VR once the scope is exited.  */
> +
> +static bool
> +evrp_visit_phi_node_local (gphi *phi)
> +{
> +  size_t i;
> +  tree lhs = PHI_RESULT (phi);
> +  value_range vr_result = VR_INITIALIZER;
> +  bool first = true;
> +  int edges;
> +
> +  edges = 0;
> +  for (i = 0; i < gimple_phi_num_args (phi); i++)
> +    {
> +      edge e = gimple_phi_arg_edge (phi, i);
> +      tree arg = PHI_ARG_DEF (phi, i);
> +      value_range vr_arg = VR_INITIALIZER;
> +      ++edges;
> +
> +      /* If there is a back-edge, set the result to VARYING.  */
> +      if (e->flags & (EDGE_DFS_BACK | EDGE_COMPLEX))
> +       {
> +         set_value_range_to_varying (&vr_result);
> +         break;
> +       }
> ...
> +      /* If any of the RHS value is VARYING, set the result to VARYING.  */
> +      if ((vr_arg.type != VR_RANGE)
> +         && (vr_arg.type != VR_ANTI_RANGE))
> +       {
> +         set_value_range_to_varying (&vr_result);
> +         break;
> +       }
>
> this shows that you need to start conservative for a DOM based VRP,
> thus with all lattice values initialized to VARYING (but undefined SSA
> names of course still can be UNDEFINED) rather than UNDEFINED.
>
> +      if (TREE_CODE (arg) == SSA_NAME)
> +       vr_arg = *(get_value_range (arg));
> +      else
> +       set_value_range_to_varying (&vr_arg);
>
> err - what about constants?  When you initialize the lattice properly
> you should be able to re-use vrp_visit_phi_node (maybe split out
> its head to avoid using SCEV or the iteration limitation).

I also like re-using vrp_visit_phi_node but the issue is, we will have 
to keep a work-list of nodes to be re-evaluated till the lattice reach a 
fixpoint. Is that OK with you?

If we are to do this, we should be able to reuse the callbacks 
vrp_visit_phi_node and vrp_visit_stmt as it is.

Do you have a reference to your DOM based prototype?

Thanks,
Kugan

> Btw, you don't want to call vrp_initialize in evrp either, this is setting
> SSA propagator state which you do not want to do.  Please factor
> out lattice allocation/deallocation instead.  I see that might require
> really factoring vrp_visit_stmt into a function that omits updating
> the lattice and just returns a range for the single DEF.
>
> That said, a good refactoring is to split the SSA propagator callback
> implementations (vrp_visit_stmt and vrp_visit_phi_node) into workers
> returning a value range and the callback that does the update_value_range
> call plus returing a SSA propgator state.  You can then re-use the worker.
>
> Thanks,
> Richard.
>
>> I  have tested the last set of patch separately.
>>
>> I will do more testing on this patch based on your feedback. Does this look
>> better?
>>
>> Thanks,
>> Kugan
>>
>>
Richard Biener July 28, 2016, 11:34 a.m. UTC | #3
On Thu, Jul 28, 2016 at 9:35 AM, kugan
<kugan.vivekanandarajah@linaro.org> wrote:
> Hi Richard,
>
> Thanks for the review.
>>
>>
>> It seems that in your pop_value_range you assume you only pop one
>> range per BB - while that's likely true at the moment it will be a
>> limitation
>> in the future.  You want to pop ranges until you hit the NULL marker
>> in after_dom_children and unconditionally push a NULL marker.
>>
> I understand. Right now, I am adding only one assert based on the condition.
> But in future, we will be adding more so this is needed. I will do that.
>
>> For example to match current VRPs behavior on say
>>
>>    i_2 = (int) j_3;
>>    if (i_2 < 0)
>>      ...
>>
>> which can register an assert for j_3 when i_2 < 0 is true we'd do that
>> by re-simulating DEFs of uses we figured out new ranges of (and all
>> their uses).  All those ranges would be temporary as well, thus they'd
>> need to be pushed/popped.  In my quick prototype this was done
>> using a worklist seeded by the names we can derive a range from from
>> conditionals and "SSA propagating" from it.  Note that for this
>> the generic vrp_visit_stmt cannot be re-used as it doesn't push/pop,
>> factoring out the lattice update is what is needed here.
>>
>
> I dont think I understand this part. vrp_visit_stmt is going to add value
> ranges for the variables defined in the if-block (in the example below it is
> for t). If we push the value range for i_2 and j_3 when we enter if-block,
> vrp_visit_stmt should compute "t" correctly. When we leave the if-block, we
> will pop i_2 and j_3.
>
>     i_2 = (int) j_3;
>     if (i_2 < 0)
>     {
>       t = j_2 * 2;
>     }
> Am I missing something here?

It works if you push the old value before calling vrp_visit_stmt, yes.
But I think
you want to do that only if the value-range changed to avoid too many changes
on the stack.  I guess we can defer further refactoring and
optimization of this case
to the point where we consider looking back very aggressively.

>> +/* Visit the basic blocks in the dominance order and set the Value Ranges
>> (VR)
>> +   for SSA_NAMEs in the scope.  Use this VR to discover more VRs.
>> Restore the
>> +   old VR once the scope is exited.  */
>> +
>> +static bool
>> +evrp_visit_phi_node_local (gphi *phi)
>> +{
>> +  size_t i;
>> +  tree lhs = PHI_RESULT (phi);
>> +  value_range vr_result = VR_INITIALIZER;
>> +  bool first = true;
>> +  int edges;
>> +
>> +  edges = 0;
>> +  for (i = 0; i < gimple_phi_num_args (phi); i++)
>> +    {
>> +      edge e = gimple_phi_arg_edge (phi, i);
>> +      tree arg = PHI_ARG_DEF (phi, i);
>> +      value_range vr_arg = VR_INITIALIZER;
>> +      ++edges;
>> +
>> +      /* If there is a back-edge, set the result to VARYING.  */
>> +      if (e->flags & (EDGE_DFS_BACK | EDGE_COMPLEX))
>> +       {
>> +         set_value_range_to_varying (&vr_result);
>> +         break;
>> +       }
>> ...
>> +      /* If any of the RHS value is VARYING, set the result to VARYING.
>> */
>> +      if ((vr_arg.type != VR_RANGE)
>> +         && (vr_arg.type != VR_ANTI_RANGE))
>> +       {
>> +         set_value_range_to_varying (&vr_result);
>> +         break;
>> +       }
>>
>> this shows that you need to start conservative for a DOM based VRP,
>> thus with all lattice values initialized to VARYING (but undefined SSA
>> names of course still can be UNDEFINED) rather than UNDEFINED.
>>
>> +      if (TREE_CODE (arg) == SSA_NAME)
>> +       vr_arg = *(get_value_range (arg));
>> +      else
>> +       set_value_range_to_varying (&vr_arg);
>>
>> err - what about constants?  When you initialize the lattice properly
>> you should be able to re-use vrp_visit_phi_node (maybe split out
>> its head to avoid using SCEV or the iteration limitation).
>
>
> I also like re-using vrp_visit_phi_node but the issue is, we will have to
> keep a work-list of nodes to be re-evaluated till the lattice reach a
> fixpoint. Is that OK with you?

No, why would you need to iterate here?  As said, the key point is to
initialize value-ranges as VARYING rather than UNDEFINED.

> If we are to do this, we should be able to reuse the callbacks
> vrp_visit_phi_node and vrp_visit_stmt as it is.
>
> Do you have a reference to your DOM based prototype?

I never posted it I think, it's structure is similar to yours with lots
of ??? comments ;)

Richard.

> Thanks,
> Kugan
>
>
>> Btw, you don't want to call vrp_initialize in evrp either, this is setting
>> SSA propagator state which you do not want to do.  Please factor
>> out lattice allocation/deallocation instead.  I see that might require
>> really factoring vrp_visit_stmt into a function that omits updating
>> the lattice and just returns a range for the single DEF.
>>
>> That said, a good refactoring is to split the SSA propagator callback
>> implementations (vrp_visit_stmt and vrp_visit_phi_node) into workers
>> returning a value range and the callback that does the update_value_range
>> call plus returing a SSA propgator state.  You can then re-use the worker.
>>
>> Thanks,
>> Richard.
>>
>>> I  have tested the last set of patch separately.
>>>
>>> I will do more testing on this patch based on your feedback. Does this
>>> look
>>> better?
>>>
>>> Thanks,
>>> Kugan
>>>
>>>
>
diff mbox

Patch

From eefcd1c5444cf5dd9f121e8bd04148d324d06ffc Mon Sep 17 00:00:00 2001
From: Kugan Vivekanandarajah <kugan.vivekanandarajah@linaro.org>
Date: Fri, 24 Jun 2016 14:45:24 +1000
Subject: [PATCH 5/7] Add early vrp

---
 gcc/common.opt                            |   4 +
 gcc/doc/invoke.texi                       |   9 +
 gcc/passes.def                            |   1 +
 gcc/testsuite/g++.dg/tree-ssa/pr31146-2.C |   2 +-
 gcc/testsuite/gcc.dg/tree-ssa/evrp1.c     |  13 ++
 gcc/testsuite/gcc.dg/tree-ssa/evrp2.c     |  18 ++
 gcc/testsuite/gcc.dg/tree-ssa/evrp3.c     |  15 ++
 gcc/testsuite/gcc.dg/tree-ssa/pr20318.c   |   4 +-
 gcc/testsuite/gcc.dg/tree-ssa/pr22117.c   |   2 +-
 gcc/testsuite/gcc.dg/tree-ssa/pr25382.c   |   4 +-
 gcc/testsuite/gcc.dg/tree-ssa/pr68431.c   |   4 +-
 gcc/testsuite/gcc.dg/tree-ssa/vrp19.c     |   6 +-
 gcc/testsuite/gcc.dg/tree-ssa/vrp23.c     |   4 +-
 gcc/testsuite/gcc.dg/tree-ssa/vrp24.c     |   5 +-
 gcc/testsuite/gcc.dg/tree-ssa/vrp58.c     |   4 +-
 gcc/testsuite/gcc.dg/tree-ssa/vrp67.c     |   4 +-
 gcc/testsuite/gcc.dg/tree-ssa/vrp98.c     |   6 +-
 gcc/timevar.def                           |   1 +
 gcc/tree-pass.h                           |   1 +
 gcc/tree-vrp.c                            | 354 ++++++++++++++++++++++++++----
 20 files changed, 397 insertions(+), 64 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/evrp1.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/evrp2.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/evrp3.c

diff --git a/gcc/common.opt b/gcc/common.opt
index f0d7196..29d0e4d 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -2471,6 +2471,10 @@  ftree-vrp
 Common Report Var(flag_tree_vrp) Init(0) Optimization
 Perform Value Range Propagation on trees.
 
+ftree-evrp
+Common Report Var(flag_tree_early_vrp) Init(1) Optimization
+Perform Early Value Range Propagation on trees.
+
 fsplit-paths
 Common Report Var(flag_split_paths) Init(0) Optimization
 Split paths leading to loop backedges.
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index e000218..f4dc88d 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -7665,6 +7665,10 @@  enabled by default at @option{-O2} and higher.  Null pointer check
 elimination is only done if @option{-fdelete-null-pointer-checks} is
 enabled.
 
+@item -ftree-evrp
+@opindex ftree-evrp
+Perform Early Value Range Propagation on trees.
+
 @item -fsplit-paths
 @opindex fsplit-paths
 Split paths leading to loop backedges.  This can improve dead code
@@ -12270,6 +12274,11 @@  is made by appending @file{.slp} to the source file name.
 Dump each function after Value Range Propagation (VRP).  The file name
 is made by appending @file{.vrp} to the source file name.
 
+@item early vrp
+@opindex fdump-tree-evrp
+Dump each function after Early Value Range Propagation (EVRP).  The file name
+is made by appending @file{.evrp} to the source file name.
+
 @item oaccdevlow
 @opindex fdump-tree-oaccdevlow
 Dump each function after applying device-specific OpenACC transformations.
diff --git a/gcc/passes.def b/gcc/passes.def
index 3647e90..ebd360b 100644
--- a/gcc/passes.def
+++ b/gcc/passes.def
@@ -89,6 +89,7 @@  along with GCC; see the file COPYING3.  If not see
 	     execute TODO_rebuild_alias at this point.  */
 	  NEXT_PASS (pass_build_ealias);
 	  NEXT_PASS (pass_fre);
+	  NEXT_PASS (pass_early_vrp);
 	  NEXT_PASS (pass_merge_phi);
           NEXT_PASS (pass_dse);
 	  NEXT_PASS (pass_cd_dce);
diff --git a/gcc/testsuite/g++.dg/tree-ssa/pr31146-2.C b/gcc/testsuite/g++.dg/tree-ssa/pr31146-2.C
index 5e09583..dce05d6 100644
--- a/gcc/testsuite/g++.dg/tree-ssa/pr31146-2.C
+++ b/gcc/testsuite/g++.dg/tree-ssa/pr31146-2.C
@@ -1,5 +1,5 @@ 
 /* { dg-do compile } */
-/* { dg-options "-O -fdump-tree-forwprop1" } */
+/* { dg-options "-O -fno-tree-evrp -fdump-tree-forwprop1" } */
 
 #include <new>
 
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/evrp1.c b/gcc/testsuite/gcc.dg/tree-ssa/evrp1.c
new file mode 100644
index 0000000..8c6e4e6
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/evrp1.c
@@ -0,0 +1,13 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-evrp" } */
+
+int foo (int i);
+int bar (int j)
+{
+  if (j > 2)
+    return foo (j + 2);
+  else
+    return j;
+}
+
+/* { dg-final { scan-tree-dump "\\\[5, \\+INF" "evrp" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/evrp2.c b/gcc/testsuite/gcc.dg/tree-ssa/evrp2.c
new file mode 100644
index 0000000..e6d4235
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/evrp2.c
@@ -0,0 +1,18 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-evrp" } */
+
+int foo (int i);
+int bar2 (int j)
+{
+  if (j > 2)
+    {
+      if (j < 7)
+	return foo (j + 1);
+      else
+	return foo (j + 2);
+    }
+  return j;
+}
+
+
+/* { dg-final { scan-tree-dump "\\\[4, 7\\\]" "evrp" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/evrp3.c b/gcc/testsuite/gcc.dg/tree-ssa/evrp3.c
new file mode 100644
index 0000000..1a3bbd5
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/evrp3.c
@@ -0,0 +1,15 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-evrp" } */
+
+int foo (int i);
+void bar (int j)
+{
+  unsigned int i;
+  for (i = 0; i < 10; ++i)
+    {
+      bar (i + 1);
+    }
+}
+
+/* { dg-final { scan-tree-dump "\\\[1, 10\\\]" "evrp" } } */
+
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr20318.c b/gcc/testsuite/gcc.dg/tree-ssa/pr20318.c
index 41f569e..8c12863 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/pr20318.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr20318.c
@@ -1,5 +1,5 @@ 
 /* { dg-do compile { target { ! keeps_null_pointer_checks } } } */
-/* { dg-options "-O2 -fdump-tree-original -fdump-tree-vrp1 -fdelete-null-pointer-checks" } */
+/* { dg-options "-O2 -fdump-tree-original -fdump-tree-evrp  -fdelete-null-pointer-checks" } */
 
 extern int* f(int) __attribute__((returns_nonnull));
 extern void eliminate ();
@@ -14,4 +14,4 @@  void h () {
 }
 
 /* { dg-final { scan-tree-dump-times "== 0" 1 "original" } } */
-/* { dg-final { scan-tree-dump-times "Folding predicate\[^\\n\]*to 0" 1 "vrp1" } } */
+/* { dg-final { scan-tree-dump-times "Folding predicate\[^\\n\]*to 0" 1 "evrp" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr22117.c b/gcc/testsuite/gcc.dg/tree-ssa/pr22117.c
index 7efdd63..43cea0b 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/pr22117.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr22117.c
@@ -3,7 +3,7 @@ 
    known to be zero after entering the first two "if" statements.  */
 
 /* { dg-do compile } */
-/* { dg-options "-O2 -fdump-tree-vrp1" } */
+/* { dg-options "-O2 -fno-tree-evrp -fdump-tree-vrp" } */
 
 void link_error (void);
 
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr25382.c b/gcc/testsuite/gcc.dg/tree-ssa/pr25382.c
index dcf9148..0d19d0d 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/pr25382.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr25382.c
@@ -3,7 +3,7 @@ 
    Check that VRP now gets ranges from BIT_AND_EXPRs.  */
 
 /* { dg-do compile } */
-/* { dg-options "-O2 -fno-tree-ccp -fdump-tree-vrp1" } */
+/* { dg-options "-O2 -fno-tree-ccp -fdump-tree-evrp" } */
 
 int
 foo (int a)
@@ -15,4 +15,4 @@  foo (int a)
     return 1;
 }
 
-/* { dg-final { scan-tree-dump-times "Folding predicate b_.* > 300 to 0" 1 "vrp1" } } */
+/* { dg-final { scan-tree-dump-times "Folding predicate b_.* > 300 to 0" 1 "evrp" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr68431.c b/gcc/testsuite/gcc.dg/tree-ssa/pr68431.c
index 3bd3843..a73aa00 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/pr68431.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr68431.c
@@ -1,5 +1,5 @@ 
 /* PR tree-optimization/68431 */
-/* { dg-options "-O2 -fdump-tree-vrp1-details" } */
+/* { dg-options "-O2 -fdump-tree-evrp-details" } */
 
 unsigned int x = 1;
 int
@@ -13,4 +13,4 @@  main (void)
   return 0;
 }
 
-/* { dg-final { scan-tree-dump-times "Folding predicate .*to 0" 1 "vrp1" } } */
+/* { dg-final { scan-tree-dump-times "Folding predicate .*to 0" 1 "evrp" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/vrp19.c b/gcc/testsuite/gcc.dg/tree-ssa/vrp19.c
index cecacb6..3d47bfd 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/vrp19.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/vrp19.c
@@ -1,5 +1,5 @@ 
 /* { dg-do compile } */
-/* { dg-options "-fwrapv -O1 -ftree-vrp -fdump-tree-vrp1" } */
+/* { dg-options "-fwrapv -O1 -ftree-vrp -fdump-tree-evrp" } */
 
 #include <limits.h>
 extern void abort ();
@@ -22,5 +22,5 @@  int g (int b) {
 	}
 	return 1;
 }
-/* { dg-final { scan-tree-dump "Folding predicate a_. < 0 to 0" "vrp1" } } */
-/* { dg-final { scan-tree-dump "Folding predicate b_. >= 0 to 1" "vrp1" } } */
+/* { dg-final { scan-tree-dump "Folding predicate a_. < 0 to 0" "evrp" } } */
+/* { dg-final { scan-tree-dump "Folding predicate b_. >= 0 to 1" "evrp" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/vrp23.c b/gcc/testsuite/gcc.dg/tree-ssa/vrp23.c
index b877ccc..a6d2a24 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/vrp23.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/vrp23.c
@@ -1,5 +1,5 @@ 
 /* { dg-do compile } */
-/* { dg-options "-O2 -fdump-tree-vrp1-details" } */
+/* { dg-options "-O2 -fdump-tree-evrp-details" } */
 
 void aa (void);
 void aos (void);
@@ -45,5 +45,5 @@  L8:
 /* The n_sets > 0 test can be simplified into n_sets == 1 since the
    only way to reach the test is when n_sets <= 1, and the only value
    which satisfies both conditions is n_sets == 1.  */
-/* { dg-final { scan-tree-dump-times "Simplified relational" 1 "vrp1" } } */
+/* { dg-final { scan-tree-dump-times "Simplified relational" 1 "evrp" } } */
 
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/vrp24.c b/gcc/testsuite/gcc.dg/tree-ssa/vrp24.c
index e740575..0e1e69c 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/vrp24.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/vrp24.c
@@ -1,5 +1,5 @@ 
 /* { dg-do compile } */
-/* { dg-options "-O2 -fdump-tree-vrp1-details" } */
+/* { dg-options "-O2 -fdump-tree-evrp-details -fdump-tree-vrp1-details" } */
 
 
 struct rtx_def;
@@ -91,5 +91,6 @@  L7:
    The second n_sets > 0 test can also be simplified into n_sets == 1
    as the only way to reach the tests is when n_sets <= 1 and the only
    value which satisfies both conditions is n_sets == 1.  */
-/* { dg-final { scan-tree-dump-times "Simplified relational" 2 "vrp1" } } */
+/* { dg-final { scan-tree-dump-times "Simplified relational" 1 "vrp1" } } */
+/* { dg-final { scan-tree-dump-times "Simplified relational" 1 "evrp" } } */
 
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/vrp58.c b/gcc/testsuite/gcc.dg/tree-ssa/vrp58.c
index 5b44ae2..6df91ca 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/vrp58.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/vrp58.c
@@ -1,5 +1,5 @@ 
 /* { dg-do compile } */
-/* { dg-options "-O2 -fdump-tree-vrp1-details" } */
+/* { dg-options "-O2 -fdump-tree-evrp-details" } */
 
 long long
 foo (long long a, signed char b, signed char c)
@@ -9,4 +9,4 @@  foo (long long a, signed char b, signed char c)
 }
 
 /* { dg-final { scan-tree-dump "Folded into" "vrp1" { target int32plus } } } */
-/* { dg-final { scan-tree-dump "Folding statement: _\[0-9\]\* = \\(long long int\\) bc_\[0-9\]\*;" "vrp1" { target int16 } } } */
+/* { dg-final { scan-tree-dump "Folding statement: _\[0-9\]\* = \\(long long int\\) bc_\[0-9\]\*;" "evrp" { target int16 } } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/vrp67.c b/gcc/testsuite/gcc.dg/tree-ssa/vrp67.c
index ef5e8f9..b19b546 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/vrp67.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/vrp67.c
@@ -1,5 +1,5 @@ 
 /* { dg-do compile } */
-/* { dg-options "-O2 -fdump-tree-vrp1" } */
+/* { dg-options "-O2 -fdump-tree-evrp" } */
 
 extern void link_error (void);
 
@@ -36,4 +36,4 @@  unsigned baz (unsigned i)
   return i;
 }
 
-/* { dg-final { scan-tree-dump-times "Folding predicate" 3 "vrp1" } } */
+/* { dg-final { scan-tree-dump-times "Folding predicate" 3 "evrp" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/vrp98.c b/gcc/testsuite/gcc.dg/tree-ssa/vrp98.c
index 982f091..7ad818c 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/vrp98.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/vrp98.c
@@ -1,6 +1,6 @@ 
 /* { dg-do compile } */
 /* { dg-require-effective-target int128 } */
-/* { dg-options "-Os -fdump-tree-vrp1-details" } */
+/* { dg-options "-Os -fdump-tree-evrp-details" } */
 
 #include <stdint.h>
 #include <limits.h>
@@ -36,6 +36,6 @@  foo (bigger_than_word a, word b, uint8_t c)
   return ret;
 }
 
-/* { dg-final { scan-tree-dump "Folded into: if \\(_\[0-9\]+ == 1\\)" "vrp1" } } */
+/* { dg-final { scan-tree-dump "Folded into: if \\(_\[0-9\]+ == 1\\)" "evrp" } } */
 /* { dg-final { scan-tree-dump-not "Folded into: if \\(_\[0-9\]+ == 2\\)" "vrp1" } } */
-/* { dg-final { scan-tree-dump "Folded into: if \\(_\[0-9\]+ == 3\\)" "vrp1" } } */
+/* { dg-final { scan-tree-dump "Folded into: if \\(_\[0-9\]+ == 3\\)" "evrp" } } */
diff --git a/gcc/timevar.def b/gcc/timevar.def
index 362aa6e..8d308ac 100644
--- a/gcc/timevar.def
+++ b/gcc/timevar.def
@@ -147,6 +147,7 @@  DEFTIMEVAR (TV_TREE_CFG		     , "tree CFG construction")
 DEFTIMEVAR (TV_TREE_CLEANUP_CFG	     , "tree CFG cleanup")
 DEFTIMEVAR (TV_TREE_TAIL_MERGE       , "tree tail merge")
 DEFTIMEVAR (TV_TREE_VRP              , "tree VRP")
+DEFTIMEVAR (TV_TREE_EARLY_VRP        , "tree Early VRP")
 DEFTIMEVAR (TV_TREE_COPY_PROP        , "tree copy propagation")
 DEFTIMEVAR (TV_FIND_REFERENCED_VARS  , "tree find ref. vars")
 DEFTIMEVAR (TV_TREE_PTA		     , "tree PTA")
diff --git a/gcc/tree-pass.h b/gcc/tree-pass.h
index 36299a6..d836d57 100644
--- a/gcc/tree-pass.h
+++ b/gcc/tree-pass.h
@@ -440,6 +440,7 @@  extern gimple_opt_pass *make_pass_fre (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_check_data_deps (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_copy_prop (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_isolate_erroneous_paths (gcc::context *ctxt);
+extern gimple_opt_pass *make_pass_early_vrp (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_vrp (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_uncprop (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_return_slot (gcc::context *ctxt);
diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c
index 83579e3..de755bc 100644
--- a/gcc/tree-vrp.c
+++ b/gcc/tree-vrp.c
@@ -59,6 +59,7 @@  along with GCC; see the file COPYING3.  If not see
 #include "target.h"
 #include "case-cfn-macros.h"
 #include "alloc-pool.h"
+#include "tree-vrp.h"
 #include "domwalk.h"
 
 #define VR_INITIALIZER { VR_UNDEFINED, NULL_TREE, NULL_TREE, NULL }
@@ -1437,44 +1438,17 @@  op_with_boolean_value_range_p (tree op)
 	  && integer_onep (vr->max));
 }
 
-/* Extract value range information from an ASSERT_EXPR EXPR and store
-   it in *VR_P.  */
+/* Extract value range information for VAR when (OP COND_CODE LIMIT) is
+   true and store it in *VR_P.  */
 
 static void
-extract_range_from_assert (value_range *vr_p, tree expr)
+extract_range_for_var_from_comparison_expr (tree var, enum tree_code cond_code,
+					    tree op, tree limit,
+					    value_range *vr_p)
 {
-  tree var, cond, limit, min, max, type;
+  tree  min, max, type;
   value_range *limit_vr;
-  enum tree_code cond_code;
-
-  var = ASSERT_EXPR_VAR (expr);
-  cond = ASSERT_EXPR_COND (expr);
-
-  gcc_assert (COMPARISON_CLASS_P (cond));
-
-  /* Find VAR in the ASSERT_EXPR conditional.  */
-  if (var == TREE_OPERAND (cond, 0)
-      || TREE_CODE (TREE_OPERAND (cond, 0)) == PLUS_EXPR
-      || TREE_CODE (TREE_OPERAND (cond, 0)) == NOP_EXPR)
-    {
-      /* If the predicate is of the form VAR COMP LIMIT, then we just
-	 take LIMIT from the RHS and use the same comparison code.  */
-      cond_code = TREE_CODE (cond);
-      limit = TREE_OPERAND (cond, 1);
-      cond = TREE_OPERAND (cond, 0);
-    }
-  else
-    {
-      /* If the predicate is of the form LIMIT COMP VAR, then we need
-	 to flip around the comparison code to create the proper range
-	 for VAR.  */
-      cond_code = swap_tree_comparison (TREE_CODE (cond));
-      limit = TREE_OPERAND (cond, 0);
-      cond = TREE_OPERAND (cond, 1);
-    }
-
   limit = avoid_overflow_infinity (limit);
-
   type = TREE_TYPE (var);
   gcc_assert (limit != var);
 
@@ -1517,15 +1491,15 @@  extract_range_from_assert (value_range *vr_p, tree expr)
      as well build the range [b_4, +INF] for it.
      One special case we handle is extracting a range from a
      range test encoded as (unsigned)var + CST <= limit.  */
-  if (TREE_CODE (cond) == NOP_EXPR
-      || TREE_CODE (cond) == PLUS_EXPR)
+  if (TREE_CODE (op) == NOP_EXPR
+      || TREE_CODE (op) == PLUS_EXPR)
     {
-      if (TREE_CODE (cond) == PLUS_EXPR)
+      if (TREE_CODE (op) == PLUS_EXPR)
         {
-          min = fold_build1 (NEGATE_EXPR, TREE_TYPE (TREE_OPERAND (cond, 1)),
-			     TREE_OPERAND (cond, 1));
+          min = fold_build1 (NEGATE_EXPR, TREE_TYPE (TREE_OPERAND (op, 1)),
+			     TREE_OPERAND (op, 1));
           max = int_const_binop (PLUS_EXPR, limit, min);
-	  cond = TREE_OPERAND (cond, 0);
+	  op = TREE_OPERAND (op, 0);
 	}
       else
 	{
@@ -1709,6 +1683,41 @@  extract_range_from_assert (value_range *vr_p, tree expr)
   vrp_intersect_ranges (vr_p, get_value_range (var));
 }
 
+/* Extract value range information from an ASSERT_EXPR EXPR and store
+   it in *VR_P.  */
+
+static void
+extract_range_from_assert (value_range *vr_p, tree expr)
+{
+  tree var = ASSERT_EXPR_VAR (expr);
+  tree cond = ASSERT_EXPR_COND (expr);
+  tree limit, op;
+  enum tree_code cond_code;
+  gcc_assert (COMPARISON_CLASS_P (cond));
+
+  /* Find VAR in the ASSERT_EXPR conditional.  */
+  if (var == TREE_OPERAND (cond, 0)
+      || TREE_CODE (TREE_OPERAND (cond, 0)) == PLUS_EXPR
+      || TREE_CODE (TREE_OPERAND (cond, 0)) == NOP_EXPR)
+    {
+      /* If the predicate is of the form VAR COMP LIMIT, then we just
+	 take LIMIT from the RHS and use the same comparison code.  */
+      cond_code = TREE_CODE (cond);
+      limit = TREE_OPERAND (cond, 1);
+      op = TREE_OPERAND (cond, 0);
+    }
+  else
+    {
+      /* If the predicate is of the form LIMIT COMP VAR, then we need
+	 to flip around the comparison code to create the proper range
+	 for VAR.  */
+      cond_code = swap_tree_comparison (TREE_CODE (cond));
+      limit = TREE_OPERAND (cond, 0);
+      op = TREE_OPERAND (cond, 1);
+    }
+  extract_range_for_var_from_comparison_expr (var, cond_code, op,
+					      limit, vr_p);
+}
 
 /* Extract range information from SSA name VAR and store it in VR.  If
    VAR has an interesting range, use it.  Otherwise, create the
@@ -10144,7 +10153,7 @@  finalize_jump_threads (void)
 /* Traverse all the blocks folding conditionals with known ranges.  */
 
 static void
-vrp_finalize (bool jump_thread_p, bool warn_array_bounds_p)
+vrp_finalize (bool early_vrp_p, bool warn_array_bounds_p)
 {
   size_t i;
 
@@ -10185,7 +10194,7 @@  vrp_finalize (bool jump_thread_p, bool warn_array_bounds_p)
 
   /* We must identify jump threading opportunities before we release
      the datastructures built by VRP.  */
-  if (jump_thread_p)
+  if (!early_vrp_p)
     identify_jump_threads ();
 
   /* Free allocated memory.  */
@@ -10201,6 +10210,229 @@  vrp_finalize (bool jump_thread_p, bool warn_array_bounds_p)
 }
 
 
+/* Visit the basic blocks in the dominance order and set the Value Ranges (VR)
+   for SSA_NAMEs in the scope.  Use this VR to discover more VRs.  Restore the
+   old VR once the scope is exited.  */
+
+static bool
+evrp_visit_phi_node_local (gphi *phi)
+{
+  size_t i;
+  tree lhs = PHI_RESULT (phi);
+  value_range vr_result = VR_INITIALIZER;
+  bool first = true;
+  int edges;
+
+  edges = 0;
+  for (i = 0; i < gimple_phi_num_args (phi); i++)
+    {
+      edge e = gimple_phi_arg_edge (phi, i);
+      tree arg = PHI_ARG_DEF (phi, i);
+      value_range vr_arg = VR_INITIALIZER;
+      ++edges;
+
+      /* If there is a back-edge, set the result to VARYING.  */
+      if (e->flags & (EDGE_DFS_BACK | EDGE_COMPLEX))
+	{
+	  set_value_range_to_varying (&vr_result);
+	  break;
+	}
+
+      if (TREE_CODE (arg) == SSA_NAME)
+	vr_arg = *(get_value_range (arg));
+      else
+	set_value_range_to_varying (&vr_arg);
+
+      /* If any of the RHS value is VARYING, set the result to VARYING.  */
+      if ((vr_arg.type != VR_RANGE)
+	  && (vr_arg.type != VR_ANTI_RANGE))
+	{
+	  set_value_range_to_varying (&vr_result);
+	  break;
+	}
+
+      /* Set/meet the RHS value range with the result so far.  */
+      if (first)
+	set_value_range (&vr_result, vr_arg.type, vr_arg.min,
+			 vr_arg.max, vr_arg.equiv);
+      else
+	vrp_meet (&vr_result, &vr_arg);
+      first = false;
+      if (vr_result.type == VR_VARYING)
+	break;
+    }
+
+  /* Check if the value range has changed and return accordingly.  */
+  if (update_value_range (lhs, &vr_result))
+    return true;
+  else
+    return false;
+}
+
+/* evrp_dom_walker visits the basic blocks in the dominance order and set the
+   Value Ranges (VR) for SSA_NAMEs in the scope.  Use this VR to discover
+   more VRs.  */
+
+class evrp_dom_walker : public dom_walker
+{
+public:
+  evrp_dom_walker ()
+    : dom_walker (CDI_DOMINATORS), stack (10) {}
+
+  virtual edge before_dom_children (basic_block);
+  virtual void after_dom_children (basic_block);
+  void push_value_range (const_tree var, value_range *vr);
+  value_range *pop_value_range (const_tree var);
+
+  /* Cond_stack holds the old VR.  */
+  auto_vec<std::pair <const_tree, value_range*>, 0 > stack;
+};
+
+/* See if there is any new scope is entered with new VR and set that VR to
+   ssa_name before visiting the statements in the scope.  */
+
+edge evrp_dom_walker::before_dom_children (basic_block bb)
+{
+  value_range *new_vr = NULL;
+  tree op0 = NULL_TREE;
+  if (single_pred_p (bb))
+    {
+      edge e = single_pred_edge (bb);
+      value_range vr = VR_INITIALIZER;
+      gimple *stmt = last_stmt (e->src);
+
+      if (stmt
+	  && gimple_code (stmt) == GIMPLE_COND
+	  && INTEGRAL_TYPE_P (TREE_TYPE (gimple_cond_lhs (stmt))))
+	{
+	  op0 = gimple_cond_lhs (stmt);
+	  tree op1 = gimple_cond_rhs (stmt);
+	  tree_code code = gimple_cond_code (stmt);
+	  value_range *old_vr = get_value_range (op0);
+
+	  if (TREE_OVERFLOW_P (op1))
+	    op1 = drop_tree_overflow (op1);
+
+	  /* If condition is false, invert the cond.  */
+	  if (e->flags & EDGE_FALSE_VALUE)
+	    code = invert_tree_comparison (gimple_cond_code (stmt),
+					   HONOR_NANS (op0));
+	  /* Discover VR when condition is true.  */
+	  extract_range_for_var_from_comparison_expr (op0, code, op0, op1, &vr);
+	  if (old_vr->type == VR_RANGE || old_vr->type == VR_ANTI_RANGE)
+	    vrp_intersect_ranges (&vr, old_vr);
+
+	  /* If we found any usable VR, set the VR to ssa_name and create a
+	     PUSH old value in the stack with the old VR.  */
+	  if (vr.type == VR_RANGE || vr.type == VR_ANTI_RANGE)
+	    {
+	      new_vr = vrp_value_range_pool.allocate ();
+	      *new_vr = vr;
+	    }
+	}
+    }
+  push_value_range (op0, new_vr);
+
+  /* Visit PHI stmts and discover any new VRs possible.  */
+  gimple_stmt_iterator gsi;
+  for (gphi_iterator gpi = gsi_start_phis (bb);
+       !gsi_end_p (gpi); gsi_next (&gpi))
+    {
+      gphi *phi = gpi.phi ();
+      if (stmt_interesting_for_vrp (phi))
+	evrp_visit_phi_node_local (phi);
+    }
+
+  /* Visit all other stmts and discover any new VRs possible.  */
+  for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
+    {
+      gimple *stmt = gsi_stmt (gsi);
+      edge taken_edge;
+      tree output;
+      /* TODO, if found taken_edge, we should visit (return it) and travel
+	 again to improve VR as done in DOM/SCCVN optimizations.  It should
+	 be done carefully as stmts might prematurely leave a BB like
+	 in EH.  */
+      if (stmt_interesting_for_vrp (stmt))
+	{
+	  vrp_visit_stmt (stmt, &taken_edge, &output);
+	  if (fold_stmt (&gsi, follow_single_use_edges))
+	    update_stmt (gsi_stmt (gsi));
+	}
+    }
+  return NULL;
+}
+
+/* Restore/Pop VRs valid only for BB when we leave BB.  */
+
+void evrp_dom_walker::after_dom_children (basic_block bb ATTRIBUTE_UNUSED)
+{
+  gcc_checking_assert (!stack.is_empty ());
+  const_tree var = stack.last ().first;
+  pop_value_range (var);
+}
+
+/* Push the Value Range of VAR to the stack and upadate it with new VR.  */
+
+void evrp_dom_walker::push_value_range (const_tree var, value_range *vr)
+{
+  if  (vr != NULL)
+    {
+      unsigned ver = SSA_NAME_VERSION (var);
+      gcc_checking_assert (vr_value);
+      stack.safe_push (std::make_pair (var, vr_value[ver]));
+
+      if (ver < num_vr_values)
+	vr_value[ver] = vr;
+    }
+  else
+    stack.safe_push (std::make_pair (var, vr));
+}
+
+/* Pop the Value Range from the vrp_stack and update VAR with it.  */
+
+value_range *evrp_dom_walker::pop_value_range (const_tree var)
+{
+  value_range *vr = stack.last ().second;
+  if (vr != NULL)
+    {
+      unsigned ver = SSA_NAME_VERSION (var);
+      gcc_checking_assert (var == stack.last ().first);
+      gcc_checking_assert (vr_value);
+
+      if (ver < num_vr_values)
+	vr_value[ver] = vr;
+    }
+  stack.pop ();
+  return vr;
+}
+
+
+/* Main entry point for the early vrp pass which is a simplified non-iterative
+   version of VRP.  Value ranges discovered in early vrp will be used by ipa-vrp.  */
+
+static unsigned int
+execute_early_vrp ()
+{
+  basic_block bb;
+  edge e;
+  edge_iterator ei;
+
+  calculate_dominance_info (CDI_DOMINATORS);
+  vrp_initialize ();
+  FOR_EACH_BB_FN (bb, cfun)
+    {
+      FOR_EACH_EDGE (e, ei, bb->preds)
+	e->flags |= EDGE_EXECUTABLE;
+    }
+
+  evrp_dom_walker walker;
+  walker.walk (ENTRY_BLOCK_PTR_FOR_FN (cfun));
+
+  vrp_finalize (true, false);
+  return 0;
+}
+
 /* Main entry point to VRP (Value Range Propagation).  This pass is
    loosely based on J. R. C. Patterson, ``Accurate Static Branch
    Prediction by Value Range Propagation,'' in SIGPLAN Conference on
@@ -10270,7 +10502,7 @@  execute_vrp (bool warn_array_bounds_p)
 
   vrp_initialize ();
   ssa_propagate (vrp_visit_stmt, vrp_visit_phi_node);
-  vrp_finalize (true, warn_array_bounds_p);
+  vrp_finalize (false, warn_array_bounds_p);
 
   free_numbers_of_iterations_estimates (cfun);
 
@@ -10368,3 +10600,41 @@  make_pass_vrp (gcc::context *ctxt)
 {
   return new pass_vrp (ctxt);
 }
+
+namespace {
+
+const pass_data pass_data_early_vrp =
+{
+  GIMPLE_PASS, /* type */
+  "evrp", /* name */
+  OPTGROUP_NONE, /* optinfo_flags */
+  TV_TREE_EARLY_VRP, /* tv_id */
+  PROP_ssa, /* properties_required */
+  0, /* properties_provided */
+  0, /* properties_destroyed */
+  0, /* todo_flags_start */
+  ( TODO_cleanup_cfg | TODO_update_ssa | TODO_verify_all ),
+};
+
+class pass_early_vrp : public gimple_opt_pass
+{
+public:
+  pass_early_vrp (gcc::context *ctxt)
+    : gimple_opt_pass (pass_data_early_vrp, ctxt)
+    {}
+
+  /* opt_pass methods: */
+  opt_pass * clone () { return new pass_early_vrp (m_ctxt); }
+  virtual bool gate (function *) { return flag_tree_early_vrp != 0; }
+  virtual unsigned int execute (function *)
+    { return execute_early_vrp (); }
+
+}; // class pass_vrp
+} // anon namespace
+
+gimple_opt_pass *
+make_pass_early_vrp (gcc::context *ctxt)
+{
+  return new pass_early_vrp (ctxt);
+}
+
-- 
1.9.1