diff mbox series

[RFA,P1,PR,tree-optimization/83298] Avoid over-optimistic result range for COND_EXPR

Message ID f314214c-2b9d-dba5-d1c0-f182827781b6@redhat.com
State New
Headers show
Series [RFA,P1,PR,tree-optimization/83298] Avoid over-optimistic result range for COND_EXPR | expand

Commit Message

Jeff Law Dec. 8, 2017, 12:18 a.m. UTC
So the underlying issue here is quite simple.  Given something like

x = (cond) ? res1 : res2;

EVRP analysis will compute the resultant range using vrp_meet of the
ranges for res1 and res2.  Seems pretty natural.

vrp_meet makes optimistic assumptions if either range is VR_UNDEFINED
and will set the resultant range to the range of the other operand.

Some callers explicitly mention this is the desired behavior (PHI
processing).  Other callers avoid calling vrp_meet when one of the
ranges is VR_UNDEFINED and do something sensible
(extract_range_from_unary_expr, extract_range_from_binary_expr_1).

extract_range_from_cond_expr neither mentions that it wants the
optimistic behavior nor does it avoid calling vrp_meet with a
VR_UNDEFINED range.  It naturally seems to fit in better with the other
extract_range_from_* routines.

I'm not at all familiar with the ipa-cp bits, but from a quick look they
also seems to fall into the extract_* camp.


Anyway, normally in a domwalk the only place where we're going to see
VR_UNDEFINED would be in the PHI nodes.  It's one of the nice properties
of a domwalk :-)

However, for jump threading we look past the dominance frontier;
furthermore, we do not currently record ranges for statements we process
as part of the jump threading.  But we do try to extract the range each
statement generates -- we're primarily looking for cases where the
statement generates a singleton range.

While the plan does include recording ranges as we look past the
dominance frontier, I strongly believe some serious code cleanup in DOM
and jump threading needs to happen first.  So I don't want to go down
that path for gcc-8.

So we're kind-of stuck with the fact that we might query for a resultant
range when one or more input operands may not have recorded range
information.  Thankfully that's easily resolved by making
extract_range_from_cond_expr work like the other range extraction
routines and avoid calling vrp_meet when one or more operands is
VR_UNDEFINED.

Bootstrapped and regression tested on x86_64.  OK for the trunk?

Jeff
PR tree-optimization/83298
	* vr-values.c (vr_values::extract_range_from_cond_expr): Do not
	call vrp_meet if one of the input operands is VR_UNDEFINED.

	PR tree-optimization/83298
	* gcc.c-torture/execute/pr83298.c: New test.

Comments

Richard Biener Dec. 8, 2017, 11:17 a.m. UTC | #1
On Fri, Dec 8, 2017 at 1:18 AM, Jeff Law <law@redhat.com> wrote:
>
> So the underlying issue here is quite simple.  Given something like
>
> x = (cond) ? res1 : res2;
>
> EVRP analysis will compute the resultant range using vrp_meet of the
> ranges for res1 and res2.  Seems pretty natural.
>
> vrp_meet makes optimistic assumptions if either range is VR_UNDEFINED
> and will set the resultant range to the range of the other operand.
>
> Some callers explicitly mention this is the desired behavior (PHI
> processing).  Other callers avoid calling vrp_meet when one of the
> ranges is VR_UNDEFINED and do something sensible
> (extract_range_from_unary_expr, extract_range_from_binary_expr_1).
>
> extract_range_from_cond_expr neither mentions that it wants the
> optimistic behavior nor does it avoid calling vrp_meet with a
> VR_UNDEFINED range.  It naturally seems to fit in better with the other
> extract_range_from_* routines.
>
> I'm not at all familiar with the ipa-cp bits, but from a quick look they
> also seems to fall into the extract_* camp.
>
>
> Anyway, normally in a domwalk the only place where we're going to see
> VR_UNDEFINED would be in the PHI nodes.  It's one of the nice properties
> of a domwalk :-)
>
> However, for jump threading we look past the dominance frontier;
> furthermore, we do not currently record ranges for statements we process
> as part of the jump threading.  But we do try to extract the range each
> statement generates -- we're primarily looking for cases where the
> statement generates a singleton range.
>
> While the plan does include recording ranges as we look past the
> dominance frontier, I strongly believe some serious code cleanup in DOM
> and jump threading needs to happen first.  So I don't want to go down
> that path for gcc-8.
>
> So we're kind-of stuck with the fact that we might query for a resultant
> range when one or more input operands may not have recorded range
> information.  Thankfully that's easily resolved by making
> extract_range_from_cond_expr work like the other range extraction
> routines and avoid calling vrp_meet when one or more operands is
> VR_UNDEFINED.
>
> Bootstrapped and regression tested on x86_64.  OK for the trunk?

But that does regress the case of either arm being an uninitialized
variable.

I'm not convinced that when you look forward past the dominance frontier
and do VRP analysis on stmts without analyzing all intermediate
stmts on the path (or at least push all defs on that path temporarily
to VR_VARYING) is fixed by this patch.  It merely looks like a wrong
workaround for a fundamental issue in how DOM now uses the
interface?

Thanks,
Richard.

> Jeff
>
>
>         PR tree-optimization/83298
>         * vr-values.c (vr_values::extract_range_from_cond_expr): Do not
>         call vrp_meet if one of the input operands is VR_UNDEFINED.
>
>         PR tree-optimization/83298
>         * gcc.c-torture/execute/pr83298.c: New test.
>
> diff --git a/gcc/testsuite/gcc.c-torture/execute/pr83298.c b/gcc/testsuite/gcc.c-torture/execute/pr83298.c
> new file mode 100644
> index 00000000000..0e51ababf5c
> --- /dev/null
> +++ b/gcc/testsuite/gcc.c-torture/execute/pr83298.c
> @@ -0,0 +1,11 @@
> +
> +int a, b, c = 1;
> +
> +int main ()
> +{
> +  for (; b < 1; b++)
> +    ;
> +  if (!(c * (a < 1)))
> +    __builtin_abort ();
> +  return 0;
> +}
> diff --git a/gcc/vr-values.c b/gcc/vr-values.c
> index 9352e120d9d..ee5ae3c6a27 100644
> --- a/gcc/vr-values.c
> +++ b/gcc/vr-values.c
> @@ -912,6 +912,23 @@ vr_values::extract_range_from_cond_expr (value_range *vr, gassign *stmt)
>    else
>      set_value_range_to_varying (&vr1);
>
> +  /* If either range is VR_UNDEFINED, vrp_meet will make the optimistic
> +     choice and use the range of the other operand as the result range.
> +
> +     Other users of vrp_meet either explicitly filter the calls for
> +     this case, or they do not care (PHI processing where unexecutable
> +     edges are explicitly expected to be ignored).
> +
> +     Like most other callers, we can not generally tolerate the optimistic
> +     choice here.  Consider jump threading where we're looking into a
> +     non-dominated block and thus may not necessarily have processed the
> +     ranges for statements within that non-dominated block.  */
> +  if (vr0.type == VR_UNDEFINED || vr1.type == VR_UNDEFINED)
> +    {
> +      set_value_range_to_varying (vr);
> +      return;
> +    }
> +
>    /* The resulting value range is the union of the operand ranges */
>    copy_value_range (vr, &vr0);
>    vrp_meet (vr, &vr1);
>
Richard Biener Dec. 8, 2017, 11:20 a.m. UTC | #2
On Fri, Dec 8, 2017 at 1:18 AM, Jeff Law <law@redhat.com> wrote:
>
> So the underlying issue here is quite simple.  Given something like
>
> x = (cond) ? res1 : res2;
>
> EVRP analysis will compute the resultant range using vrp_meet of the
> ranges for res1 and res2.  Seems pretty natural.
>
> vrp_meet makes optimistic assumptions if either range is VR_UNDEFINED
> and will set the resultant range to the range of the other operand.
>
> Some callers explicitly mention this is the desired behavior (PHI
> processing).  Other callers avoid calling vrp_meet when one of the
> ranges is VR_UNDEFINED and do something sensible
> (extract_range_from_unary_expr, extract_range_from_binary_expr_1).

Note that "those" simply optimize the VR_UNDEFINED case by ignoring
the range.  They basically take a shortcut around vrp_meet and avoid
calling extract_range_* on VR_UNDEFINED ranges.

> extract_range_from_cond_expr neither mentions that it wants the
> optimistic behavior nor does it avoid calling vrp_meet with a
> VR_UNDEFINED range.  It naturally seems to fit in better with the other
> extract_range_from_* routines.
>
> I'm not at all familiar with the ipa-cp bits, but from a quick look they
> also seems to fall into the extract_* camp.
>
>
> Anyway, normally in a domwalk the only place where we're going to see
> VR_UNDEFINED would be in the PHI nodes.  It's one of the nice properties
> of a domwalk :-)
>
> However, for jump threading we look past the dominance frontier;
> furthermore, we do not currently record ranges for statements we process
> as part of the jump threading.  But we do try to extract the range each
> statement generates -- we're primarily looking for cases where the
> statement generates a singleton range.
>
> While the plan does include recording ranges as we look past the
> dominance frontier, I strongly believe some serious code cleanup in DOM
> and jump threading needs to happen first.  So I don't want to go down
> that path for gcc-8.
>
> So we're kind-of stuck with the fact that we might query for a resultant
> range when one or more input operands may not have recorded range
> information.  Thankfully that's easily resolved by making
> extract_range_from_cond_expr work like the other range extraction
> routines and avoid calling vrp_meet when one or more operands is
> VR_UNDEFINED.
>
> Bootstrapped and regression tested on x86_64.  OK for the trunk?
>
> Jeff
>
>
>         PR tree-optimization/83298
>         * vr-values.c (vr_values::extract_range_from_cond_expr): Do not
>         call vrp_meet if one of the input operands is VR_UNDEFINED.
>
>         PR tree-optimization/83298
>         * gcc.c-torture/execute/pr83298.c: New test.
>
> diff --git a/gcc/testsuite/gcc.c-torture/execute/pr83298.c b/gcc/testsuite/gcc.c-torture/execute/pr83298.c
> new file mode 100644
> index 00000000000..0e51ababf5c
> --- /dev/null
> +++ b/gcc/testsuite/gcc.c-torture/execute/pr83298.c
> @@ -0,0 +1,11 @@
> +
> +int a, b, c = 1;
> +
> +int main ()
> +{
> +  for (; b < 1; b++)
> +    ;
> +  if (!(c * (a < 1)))
> +    __builtin_abort ();
> +  return 0;
> +}
> diff --git a/gcc/vr-values.c b/gcc/vr-values.c
> index 9352e120d9d..ee5ae3c6a27 100644
> --- a/gcc/vr-values.c
> +++ b/gcc/vr-values.c
> @@ -912,6 +912,23 @@ vr_values::extract_range_from_cond_expr (value_range *vr, gassign *stmt)
>    else
>      set_value_range_to_varying (&vr1);
>
> +  /* If either range is VR_UNDEFINED, vrp_meet will make the optimistic
> +     choice and use the range of the other operand as the result range.
> +
> +     Other users of vrp_meet either explicitly filter the calls for
> +     this case, or they do not care (PHI processing where unexecutable
> +     edges are explicitly expected to be ignored).
> +
> +     Like most other callers, we can not generally tolerate the optimistic
> +     choice here.  Consider jump threading where we're looking into a
> +     non-dominated block and thus may not necessarily have processed the
> +     ranges for statements within that non-dominated block.  */
> +  if (vr0.type == VR_UNDEFINED || vr1.type == VR_UNDEFINED)
> +    {
> +      set_value_range_to_varying (vr);
> +      return;
> +    }
> +
>    /* The resulting value range is the union of the operand ranges */
>    copy_value_range (vr, &vr0);
>    vrp_meet (vr, &vr1);
>
Jeff Law Dec. 11, 2017, 6:32 p.m. UTC | #3
On 12/08/2017 04:17 AM, Richard Biener wrote:
> On Fri, Dec 8, 2017 at 1:18 AM, Jeff Law <law@redhat.com> wrote:
>>
>> So the underlying issue here is quite simple.  Given something like
>>
>> x = (cond) ? res1 : res2;
>>
>> EVRP analysis will compute the resultant range using vrp_meet of the
>> ranges for res1 and res2.  Seems pretty natural.
>>
>> vrp_meet makes optimistic assumptions if either range is VR_UNDEFINED
>> and will set the resultant range to the range of the other operand.
>>
>> Some callers explicitly mention this is the desired behavior (PHI
>> processing).  Other callers avoid calling vrp_meet when one of the
>> ranges is VR_UNDEFINED and do something sensible
>> (extract_range_from_unary_expr, extract_range_from_binary_expr_1).
>>
>> extract_range_from_cond_expr neither mentions that it wants the
>> optimistic behavior nor does it avoid calling vrp_meet with a
>> VR_UNDEFINED range.  It naturally seems to fit in better with the other
>> extract_range_from_* routines.
>>
>> I'm not at all familiar with the ipa-cp bits, but from a quick look they
>> also seems to fall into the extract_* camp.
>>
>>
>> Anyway, normally in a domwalk the only place where we're going to see
>> VR_UNDEFINED would be in the PHI nodes.  It's one of the nice properties
>> of a domwalk :-)
>>
>> However, for jump threading we look past the dominance frontier;
>> furthermore, we do not currently record ranges for statements we process
>> as part of the jump threading.  But we do try to extract the range each
>> statement generates -- we're primarily looking for cases where the
>> statement generates a singleton range.
>>
>> While the plan does include recording ranges as we look past the
>> dominance frontier, I strongly believe some serious code cleanup in DOM
>> and jump threading needs to happen first.  So I don't want to go down
>> that path for gcc-8.
>>
>> So we're kind-of stuck with the fact that we might query for a resultant
>> range when one or more input operands may not have recorded range
>> information.  Thankfully that's easily resolved by making
>> extract_range_from_cond_expr work like the other range extraction
>> routines and avoid calling vrp_meet when one or more operands is
>> VR_UNDEFINED.
>>
>> Bootstrapped and regression tested on x86_64.  OK for the trunk?
> 
> But that does regress the case of either arm being an uninitialized
> variable.
> 
> I'm not convinced that when you look forward past the dominance frontier
> and do VRP analysis on stmts without analyzing all intermediate
> stmts on the path (or at least push all defs on that path temporarily
> to VR_VARYING) is fixed by this patch.  It merely looks like a wrong
> workaround for a fundamental issue in how DOM now uses the
> interface?I'm going back and pulling together the bits to fix this in a more
consistent way.  Specifically, recording ranges as we process statements
outside the dominance frontier so we don't ever see a VR_UNDEFINED range.

It's not bad as long as I resist the urge to pull in cleanups along the
way :-)

The only thing really painful is that normally when we generate a range
for a statement's output, we can just update the VR and reflect the
updated range into the global table -- that range is globally
applicable.    There's no need to push stuff onto the unwind stack or
the like.  The update_value_range API is sufficient here as NEW_VR is
just a temporary -- we copy the relevant bits from NEW_VR into the
actual tables.


When threading we must add entries to the unwinding stack.  Worse yet,
we can't use the existing VR because it's a stack local and obviously
does not persist.  We have to allocate a new object and copy from the
stack temporary into the new object.  Ugh.

We don't see any of this complexity with the other tables (like
const_and_copies) because we always update the global state and always
unwind it.  The VRP bits are a bit more efficient because they don't
bother with unwinding entries in cases where the result is globally
applicable.

Jeff
jeff
Jeff Law Dec. 11, 2017, 7:34 p.m. UTC | #4
On 12/08/2017 04:17 AM, Richard Biener wrote:
> 
> I'm not convinced that when you look forward past the dominance frontier
> and do VRP analysis on stmts without analyzing all intermediate
> stmts on the path (or at least push all defs on that path temporarily
> to VR_VARYING) is fixed by this patch.  It merely looks like a wrong
> workaround for a fundamental issue in how DOM now uses the
> interface?
So here's the bits to record ranges (with unwind entries so we can roll
them back) as we process beyond the dominance frontier.  This was always
in the plan, but I wanted to do some cleanups prior to adding this
capability.

I've stayed away from doing any of the cleanups at this time.

At a high level we break out routines to push a marker and pop to a
marker on the unwinding stack.  We then define the enter/leave in terms
of those new routines.  This allows us to push/pop scopes as we process
thread paths which don't want the same processing we see in the enter
method.

We add a boolean to record_ranges_from_stmt to indicate if any generated
range is of a temporary nature   The pre-existing calls all pass false
here to indicate the range is global.  WHen true (from threading) we
generate the necessary unwind entries and avoid changing any global state.

push_value_range becomes a public member function so it can be used when
threading to record a temporary range created by a PHI.  It's not
usually necessary, but could be for cases where we're unable to
propagate the PHI equivalences.

--


We pass in the evrp_range_analyzer instance from DOM into the threader.
From VRP we just pass in a NULL pointer as VRP doesn't use the EVRP
analyzer and we have to check it in various places.  This is one of the
many cleanups that will occur as we drop threading from tree-vrp.c.

We pass that down to a couple functions.  Nothing significant.

Then it's just a matter of recording something for PHIs and statements
we encounter -- ensuring that in both cases we create unwind entries.
That obviously fixes 83298 and it's duplicate (testcases for both are
included).  It probably enables more jump threading as well, but I
didn't specifically look for cases where that happened.

Bootstrapped and regression tested on x86.

OK for the trunk?


Jeff
* gimple-ssa-evrp-analyze.h (class evrp_range_analyzer): Make
	push_value_range a public interface.  Add new argument to
	record_ranges_from_stmt.
	* gimple-ssa-evrp-analyze.c
	(evrp_range_analyzer::record_ranges_from_stmt): Add new argument.
	Update comments.  Handle recording temporary equivalences.
	* tree-ssa-dom.c (dom_opt_opt_walker::before_dom_children): Add
	new argument to call to evrp_range_analyzer::record_ranges_from_stmt.
	* gimple-ssa-evrp.c (evrp_dom_walker::before_dom_children): Likewise.
	* tree-ssa-threadedge.c: Include alloc-pool.h, vr-values.h and
	gimple-ssa-evrp-analyze.h.
	(record_temporary_equivalences_from_phis): Add new argument.  When
	the PHI arg is an SSA_NAME, set the result's range to the range
	of the PHI arg.
	(record_temporary_equivalences_from_stmts_at_dest): Record ranges
	from statements too.
	(thread_through_normal_block): Accept new argument, evrp_range_analyzer.
	Pass it down to children as needed.
	(thread_outgoing_edges): Likewise.
	(thread_across_edge): Likewise.   Push/pop range state as needed.
	* tree-ssa-threadedge.h (thread_outgoing_edges): Update prototype.

	* gcc.c-torture/execute/pr83298.c: New test.
	* gcc.c-torture/execute/pr83362.c New test.

diff --git a/gcc/gimple-ssa-evrp-analyze.h b/gcc/gimple-ssa-evrp-analyze.h
index 4783e6f772e..3968cfd805a 100644
--- a/gcc/gimple-ssa-evrp-analyze.h
+++ b/gcc/gimple-ssa-evrp-analyze.h
@@ -31,13 +31,18 @@ class evrp_range_analyzer
   }
 
   void enter (basic_block);
+  void push_marker (void);
+  void pop_to_marker (void);
   void leave (basic_block);
-  void record_ranges_from_stmt (gimple *);
+  void record_ranges_from_stmt (gimple *, bool);
 
   /* Main interface to retrieve range information.  */
   value_range *get_value_range (const_tree op)
     { return vr_values->get_value_range (op); }
 
+  /* Record a new unwindable range.  */
+  void push_value_range (tree var, value_range *vr);
+
   /* Dump all the current value ranges.  This is primarily
      a debugging interface.  */
   void dump_all_value_ranges (FILE *fp)
@@ -57,7 +62,6 @@ class evrp_range_analyzer
   DISABLE_COPY_AND_ASSIGN (evrp_range_analyzer);
   class vr_values *vr_values;
 
-  void push_value_range (tree var, value_range *vr);
   value_range *pop_value_range (tree var);
   value_range *try_find_new_range (tree, tree op, tree_code code, tree limit);
   void record_ranges_from_incoming_edge (basic_block);
diff --git a/gcc/gimple-ssa-evrp-analyze.c b/gcc/gimple-ssa-evrp-analyze.c
index fb3d3297a78..8e9881b6964 100644
--- a/gcc/gimple-ssa-evrp-analyze.c
+++ b/gcc/gimple-ssa-evrp-analyze.c
@@ -56,10 +56,20 @@ evrp_range_analyzer::evrp_range_analyzer () : stack (10)
   vr_values = new class vr_values;
 }
 
+/* Push an unwinding marker onto the unwinding stack.  */
+
 void
-evrp_range_analyzer::enter (basic_block bb)
+evrp_range_analyzer::push_marker ()
 {
   stack.safe_push (std::make_pair (NULL_TREE, (value_range *)NULL));
+}
+
+/* Analyze ranges as we enter basic block BB.  */
+
+void
+evrp_range_analyzer::enter (basic_block bb)
+{
+  push_marker ();
   record_ranges_from_incoming_edge (bb);
   record_ranges_from_phis (bb);
   bb->flags |= BB_VISITED;
@@ -259,8 +269,13 @@ evrp_range_analyzer::record_ranges_from_phis (basic_block bb)
     }
 }
 
+/* Record ranges from STMT into our VR_VALUES class.  If TEMPORARY is
+   true, then this is a temporary equivalence and should be recorded
+   into the unwind table.  Othewise record the equivalence into the
+   global table.  */
+
 void
-evrp_range_analyzer::record_ranges_from_stmt (gimple *stmt)
+evrp_range_analyzer::record_ranges_from_stmt (gimple *stmt, bool temporary)
 {
   tree output = NULL_TREE;
 
@@ -273,10 +288,36 @@ evrp_range_analyzer::record_ranges_from_stmt (gimple *stmt)
       vr_values->extract_range_from_stmt (stmt, &taken_edge, &output, &vr);
       if (output)
 	{
-	  vr_values->update_value_range (output, &vr);
-
-	  /* Set the SSA with the value range.  */
-	  set_ssa_range_info (output, &vr);
+	  /* Set the SSA with the value range.  There are two cases to
+	     consider.  First (the the most common) is we are processing
+	     STMT in a context where its resulting range globally holds
+	     and thus it can be reflected into the global ranges and need
+	     not be unwound as we leave scope.
+
+	     The second case occurs if we are processing a statement in
+	     a context where the resulting range must not be reflected
+	     into the global tables and must be unwound as we leave
+	     the current context.  This happens in jump threading for
+	     example.  */
+	  if (!temporary)
+	    {
+	      /* Case one.  We can just update the underlying range
+		 information as well as the global information.  */
+	      vr_values->update_value_range (output, &vr);
+	      set_ssa_range_info (output, &vr);
+	    }
+	  else
+	    {
+	      /* We're going to need to unwind this range.  We can
+		 not use VR as that's a stack object.  We have to allocate
+		 a new range and push the old range onto the stack.  We
+		 also have to be very careful about sharing the underlying
+		 bitmaps.  Ugh.  */
+	      value_range *new_vr = vr_values->allocate_value_range ();
+	      *new_vr = vr;
+	      new_vr->equiv = NULL;
+	      push_value_range (output, new_vr);
+	    }
 	}
       else
 	vr_values->set_defs_to_varying (stmt);
@@ -333,10 +374,10 @@ evrp_range_analyzer::record_ranges_from_stmt (gimple *stmt)
     }
 }
 
-/* Restore/pop VRs valid only for BB when we leave BB.  */
+/* Unwind recorded ranges to their most recent state.  */
 
 void
-evrp_range_analyzer::leave (basic_block bb ATTRIBUTE_UNUSED)
+evrp_range_analyzer::pop_to_marker (void)
 {
   gcc_checking_assert (!stack.is_empty ());
   while (stack.last ().first != NULL_TREE)
@@ -344,6 +385,15 @@ evrp_range_analyzer::leave (basic_block bb ATTRIBUTE_UNUSED)
   stack.pop ();
 }
 
+/* Restore/pop VRs valid only for BB when we leave BB.  */
+
+void
+evrp_range_analyzer::leave (basic_block bb ATTRIBUTE_UNUSED)
+{
+  pop_to_marker ();
+}
+
+
 /* Push the Value Range of VAR to the stack and update it with new VR.  */
 
 void
diff --git a/gcc/gimple-ssa-evrp.c b/gcc/gimple-ssa-evrp.c
index 609ce38f218..112a43f4e8e 100644
--- a/gcc/gimple-ssa-evrp.c
+++ b/gcc/gimple-ssa-evrp.c
@@ -134,7 +134,7 @@ evrp_dom_walker::before_dom_children (basic_block bb)
 	  print_gimple_stmt (dump_file, stmt, 0);
 	}
 
-      evrp_range_analyzer.record_ranges_from_stmt (stmt);
+      evrp_range_analyzer.record_ranges_from_stmt (stmt, false);
 
       if (gcond *cond = dyn_cast <gcond *> (stmt))
 	{
diff --git a/gcc/testsuite/gcc.c-torture/execute/pr83298.c b/gcc/testsuite/gcc.c-torture/execute/pr83298.c
new file mode 100644
index 00000000000..0e51ababf5c
--- /dev/null
+++ b/gcc/testsuite/gcc.c-torture/execute/pr83298.c
@@ -0,0 +1,11 @@
+
+int a, b, c = 1;
+
+int main ()
+{
+  for (; b < 1; b++)
+    ;
+  if (!(c * (a < 1))) 
+    __builtin_abort ();
+  return 0; 
+}
diff --git a/gcc/testsuite/gcc.c-torture/execute/pr83362.c b/gcc/testsuite/gcc.c-torture/execute/pr83362.c
new file mode 100644
index 00000000000..54350819c56
--- /dev/null
+++ b/gcc/testsuite/gcc.c-torture/execute/pr83362.c
@@ -0,0 +1,31 @@
+typedef unsigned char u8;
+typedef unsigned int u32;
+
+u32 a, b, d, e;
+u8 c;
+
+static u32 __attribute__ ((noinline, noclone))
+foo (u32 p)
+{
+  do
+    {
+      e /= 0xfff;
+      if (p > c)
+	d = 0;
+      e -= 3;
+      e *= b <= a;
+    }
+  while (e >= 88030);
+  return e;
+}
+
+int
+main (void)
+{
+  u32 x = foo (1164);
+  if (x != 0xfd)
+    __builtin_abort ();
+  return 0;
+}
+
+
diff --git a/gcc/tree-ssa-dom.c b/gcc/tree-ssa-dom.c
index 59a7d87898e..93c992a9215 100644
--- a/gcc/tree-ssa-dom.c
+++ b/gcc/tree-ssa-dom.c
@@ -1433,7 +1433,7 @@ dom_opt_dom_walker::before_dom_children (basic_block bb)
   edge taken_edge = NULL;
   for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
     {
-      evrp_range_analyzer.record_ranges_from_stmt (gsi_stmt (gsi));
+      evrp_range_analyzer.record_ranges_from_stmt (gsi_stmt (gsi), false);
       taken_edge = this->optimize_stmt (bb, gsi);
     }
 
@@ -1456,6 +1456,7 @@ dom_opt_dom_walker::after_dom_children (basic_block bb)
   x_vr_values = evrp_range_analyzer.get_vr_values ();
   thread_outgoing_edges (bb, m_dummy_cond, m_const_and_copies,
 			 m_avail_exprs_stack,
+			 &evrp_range_analyzer,
 			 simplify_stmt_for_jump_threading);
   x_vr_values = NULL;
 
diff --git a/gcc/tree-ssa-threadedge.c b/gcc/tree-ssa-threadedge.c
index 536c4717b72..1f176de715c 100644
--- a/gcc/tree-ssa-threadedge.c
+++ b/gcc/tree-ssa-threadedge.c
@@ -37,6 +37,9 @@ along with GCC; see the file COPYING3.  If not see
 #include "tree-ssa-dom.h"
 #include "gimple-fold.h"
 #include "cfganal.h"
+#include "alloc-pool.h"
+#include "vr-values.h"
+#include "gimple-ssa-evrp-analyze.h"
 
 /* To avoid code explosion due to jump threading, we limit the
    number of statements we are going to copy.  This variable
@@ -114,17 +117,16 @@ potentially_threadable_block (basic_block bb)
 }
 
 /* Record temporary equivalences created by PHIs at the target of the
-   edge E.  Record unwind information for the equivalences onto STACK.
+   edge E.  Record unwind information for the equivalences into
+   CONST_AND_COPIES and EVRP_RANGE_DATA.
 
    If a PHI which prevents threading is encountered, then return FALSE
-   indicating we should not thread this edge, else return TRUE.
-
-   If SRC_MAP/DST_MAP exist, then mark the source and destination SSA_NAMEs
-   of any equivalences recorded.  We use this to make invalidation after
-   traversing back edges less painful.  */
+   indicating we should not thread this edge, else return TRUE.  */
 
 static bool
-record_temporary_equivalences_from_phis (edge e, const_and_copies *const_and_copies)
+record_temporary_equivalences_from_phis (edge e,
+    const_and_copies *const_and_copies,
+    evrp_range_analyzer *evrp_range_analyzer)
 {
   gphi_iterator gsi;
 
@@ -152,6 +154,14 @@ record_temporary_equivalences_from_phis (edge e, const_and_copies *const_and_cop
 	stmt_count++;
 
       const_and_copies->record_const_or_copy (dst, src);
+
+      /* Also update the value range associated with DST, using
+	 the range from SRC.  */
+      if (evrp_range_analyzer && TREE_CODE (src) == SSA_NAME)
+	{
+	  value_range *vr = evrp_range_analyzer->get_value_range (src);
+	  evrp_range_analyzer->push_value_range (dst, vr);
+	}
     }
   return true;
 }
@@ -191,6 +201,7 @@ static gimple *
 record_temporary_equivalences_from_stmts_at_dest (edge e,
     const_and_copies *const_and_copies,
     avail_exprs_stack *avail_exprs_stack,
+    evrp_range_analyzer *evrp_range_analyzer,
     pfn_simplify simplify)
 {
   gimple *stmt = NULL;
@@ -235,6 +246,11 @@ record_temporary_equivalences_from_stmts_at_dest (edge e,
       if (stmt_count > max_stmt_count)
 	return NULL;
 
+      /* These are temporary ranges, do nto reflect them back into
+	 the global range data.  */
+      if (evrp_range_analyzer)
+	evrp_range_analyzer->record_ranges_from_stmt (stmt, true);
+
       /* If this is not a statement that sets an SSA_NAME to a new
 	 value, then do not try to simplify this statement as it will
 	 not simplify in any way that is helpful for jump threading.  */
@@ -972,6 +988,7 @@ thread_through_normal_block (edge e,
 			     gcond *dummy_cond,
 			     const_and_copies *const_and_copies,
 			     avail_exprs_stack *avail_exprs_stack,
+			     evrp_range_analyzer *evrp_range_analyzer,
 			     pfn_simplify simplify,
 			     vec<jump_thread_edge *> *path,
 			     bitmap visited)
@@ -983,7 +1000,8 @@ thread_through_normal_block (edge e,
      Note that if we found a PHI that made the block non-threadable, then
      we need to bubble that up to our caller in the same manner we do
      when we prematurely stop processing statements below.  */
-  if (!record_temporary_equivalences_from_phis (e, const_and_copies))
+  if (!record_temporary_equivalences_from_phis (e, const_and_copies,
+					        evrp_range_analyzer))
     return -1;
 
   /* Now walk each statement recording any context sensitive
@@ -991,6 +1009,7 @@ thread_through_normal_block (edge e,
   gimple *stmt
     = record_temporary_equivalences_from_stmts_at_dest (e, const_and_copies,
 							avail_exprs_stack,
+							evrp_range_analyzer,
 							simplify);
 
   /* There's two reasons STMT might be null, and distinguishing
@@ -1105,12 +1124,15 @@ thread_across_edge (gcond *dummy_cond,
 		    edge e,
 		    class const_and_copies *const_and_copies,
 		    class avail_exprs_stack *avail_exprs_stack,
+		    class evrp_range_analyzer *evrp_range_analyzer,
 		    pfn_simplify simplify)
 {
   bitmap visited = BITMAP_ALLOC (NULL);
 
   const_and_copies->push_marker ();
   avail_exprs_stack->push_marker ();
+  if (evrp_range_analyzer)
+    evrp_range_analyzer->push_marker ();
 
   stmt_count = 0;
 
@@ -1124,6 +1146,7 @@ thread_across_edge (gcond *dummy_cond,
     threaded = thread_through_normal_block (e, dummy_cond,
 					    const_and_copies,
 					    avail_exprs_stack,
+					    evrp_range_analyzer,
 					    simplify, path,
 					    visited);
   else
@@ -1135,6 +1158,8 @@ thread_across_edge (gcond *dummy_cond,
 					   e->dest);
       const_and_copies->pop_to_marker ();
       avail_exprs_stack->pop_to_marker ();
+      if (evrp_range_analyzer)
+	evrp_range_analyzer->pop_to_marker ();
       BITMAP_FREE (visited);
       register_jump_thread (path);
       return;
@@ -1160,6 +1185,8 @@ thread_across_edge (gcond *dummy_cond,
 	  BITMAP_FREE (visited);
 	  const_and_copies->pop_to_marker ();
           avail_exprs_stack->pop_to_marker ();
+	  if (evrp_range_analyzer)
+	    evrp_range_analyzer->pop_to_marker ();
 	  return;
 	}
     }
@@ -1187,6 +1214,8 @@ thread_across_edge (gcond *dummy_cond,
 	{
 	  const_and_copies->pop_to_marker ();
           avail_exprs_stack->pop_to_marker ();
+	  if (evrp_range_analyzer)
+	    evrp_range_analyzer->pop_to_marker ();
 	  BITMAP_FREE (visited);
 	  return;
 	}
@@ -1202,6 +1231,8 @@ thread_across_edge (gcond *dummy_cond,
 	   for each of E->dest's successors.  */
 	const_and_copies->push_marker ();
 	avail_exprs_stack->push_marker ();
+	if (evrp_range_analyzer)
+	  evrp_range_analyzer->push_marker ();
 
 	/* Avoid threading to any block we have already visited.  */
 	bitmap_clear (visited);
@@ -1229,6 +1260,7 @@ thread_across_edge (gcond *dummy_cond,
 	  found = thread_through_normal_block (path->last ()->e, dummy_cond,
 					       const_and_copies,
 					       avail_exprs_stack,
+					       evrp_range_analyzer,
 					       simplify, path,
 					       visited) > 0;
 
@@ -1244,12 +1276,16 @@ thread_across_edge (gcond *dummy_cond,
 	  delete_jump_thread_path (path);
 
 	/* And unwind the equivalence table.  */
+	if (evrp_range_analyzer)
+	  evrp_range_analyzer->pop_to_marker ();
 	avail_exprs_stack->pop_to_marker ();
 	const_and_copies->pop_to_marker ();
       }
     BITMAP_FREE (visited);
   }
 
+  if (evrp_range_analyzer)
+    evrp_range_analyzer->pop_to_marker ();
   const_and_copies->pop_to_marker ();
   avail_exprs_stack->pop_to_marker ();
 }
@@ -1271,6 +1307,7 @@ void
 thread_outgoing_edges (basic_block bb, gcond *dummy_cond,
 		       class const_and_copies *const_and_copies,
 		       class avail_exprs_stack *avail_exprs_stack,
+		       class evrp_range_analyzer *evrp_range_analyzer,
 		       tree (*simplify) (gimple *, gimple *,
 					 class avail_exprs_stack *,
 					 basic_block))
@@ -1288,7 +1325,7 @@ thread_outgoing_edges (basic_block bb, gcond *dummy_cond,
     {
       thread_across_edge (dummy_cond, single_succ_edge (bb),
 			  const_and_copies, avail_exprs_stack,
-			  simplify);
+			  evrp_range_analyzer, simplify);
     }
   else if ((last = last_stmt (bb))
 	   && gimple_code (last) == GIMPLE_COND
@@ -1304,11 +1341,13 @@ thread_outgoing_edges (basic_block bb, gcond *dummy_cond,
 	 more than one predecessor and more than one successor.  */
       if (potentially_threadable_block (true_edge->dest))
 	thread_across_edge (dummy_cond, true_edge,
-			    const_and_copies, avail_exprs_stack, simplify);
+			    const_and_copies, avail_exprs_stack,
+			    evrp_range_analyzer, simplify);
 
       /* Similarly for the ELSE arm.  */
       if (potentially_threadable_block (false_edge->dest))
 	thread_across_edge (dummy_cond, false_edge,
-			    const_and_copies, avail_exprs_stack, simplify);
+			    const_and_copies, avail_exprs_stack,
+			    evrp_range_analyzer, simplify);
     }
 }
diff --git a/gcc/tree-ssa-threadedge.h b/gcc/tree-ssa-threadedge.h
index 49dfa9c94d4..0f2e39f72e3 100644
--- a/gcc/tree-ssa-threadedge.h
+++ b/gcc/tree-ssa-threadedge.h
@@ -30,9 +30,11 @@ extern void threadedge_initialize_values (void);
 extern void threadedge_finalize_values (void);
 extern bool potentially_threadable_block (basic_block);
 extern void propagate_threaded_block_debug_into (basic_block, basic_block);
+class evrp_range_analyzer;
 extern void thread_outgoing_edges (basic_block, gcond *,
 				   const_and_copies *,
 				   avail_exprs_stack *,
+				   evrp_range_analyzer *,
 				   tree (*) (gimple *, gimple *,
 					     avail_exprs_stack *, basic_block));
 
diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c
index a86b38208ab..7df6f244657 100644
--- a/gcc/tree-vrp.c
+++ b/gcc/tree-vrp.c
@@ -6677,7 +6677,7 @@ vrp_dom_walker::after_dom_children (basic_block bb)
 
   x_vr_values = vr_values;
   thread_outgoing_edges (bb, m_dummy_cond, m_const_and_copies,
-			 m_avail_exprs_stack,
+			 m_avail_exprs_stack, NULL,
 			 simplify_stmt_for_jump_threading);
   x_vr_values = NULL;
Richard Biener Dec. 12, 2017, 8:39 a.m. UTC | #5
On Mon, Dec 11, 2017 at 8:34 PM, Jeff Law <law@redhat.com> wrote:
> On 12/08/2017 04:17 AM, Richard Biener wrote:
>>
>> I'm not convinced that when you look forward past the dominance frontier
>> and do VRP analysis on stmts without analyzing all intermediate
>> stmts on the path (or at least push all defs on that path temporarily
>> to VR_VARYING) is fixed by this patch.  It merely looks like a wrong
>> workaround for a fundamental issue in how DOM now uses the
>> interface?
> So here's the bits to record ranges (with unwind entries so we can roll
> them back) as we process beyond the dominance frontier.  This was always
> in the plan, but I wanted to do some cleanups prior to adding this
> capability.
>
> I've stayed away from doing any of the cleanups at this time.
>
> At a high level we break out routines to push a marker and pop to a
> marker on the unwinding stack.  We then define the enter/leave in terms
> of those new routines.  This allows us to push/pop scopes as we process
> thread paths which don't want the same processing we see in the enter
> method.
>
> We add a boolean to record_ranges_from_stmt to indicate if any generated
> range is of a temporary nature   The pre-existing calls all pass false
> here to indicate the range is global.  WHen true (from threading) we
> generate the necessary unwind entries and avoid changing any global state.
>
> push_value_range becomes a public member function so it can be used when
> threading to record a temporary range created by a PHI.  It's not
> usually necessary, but could be for cases where we're unable to
> propagate the PHI equivalences.
>
> --
>
>
> We pass in the evrp_range_analyzer instance from DOM into the threader.
> From VRP we just pass in a NULL pointer as VRP doesn't use the EVRP
> analyzer and we have to check it in various places.  This is one of the
> many cleanups that will occur as we drop threading from tree-vrp.c.
>
> We pass that down to a couple functions.  Nothing significant.
>
> Then it's just a matter of recording something for PHIs and statements
> we encounter -- ensuring that in both cases we create unwind entries.
> That obviously fixes 83298 and it's duplicate (testcases for both are
> included).  It probably enables more jump threading as well, but I
> didn't specifically look for cases where that happened.
>
> Bootstrapped and regression tested on x86.
>
> OK for the trunk?

LGTM.  I notice (existing code) that we use set_vr_value from push_value_range
vs. update_value_range from the regular code.  Also for some unrelated missed
optimization I'd like to be able to set and unwind SSA range info via the stack
as well.  Both future things we might want to cleanup.  I guess once the
SSA propagator VRP can go there's an opportunity to cleanup even more of the
code.

Ok.

Thanks,
Richard.

>
> Jeff
>
>
>
>         * gimple-ssa-evrp-analyze.h (class evrp_range_analyzer): Make
>         push_value_range a public interface.  Add new argument to
>         record_ranges_from_stmt.
>         * gimple-ssa-evrp-analyze.c
>         (evrp_range_analyzer::record_ranges_from_stmt): Add new argument.
>         Update comments.  Handle recording temporary equivalences.
>         * tree-ssa-dom.c (dom_opt_opt_walker::before_dom_children): Add
>         new argument to call to evrp_range_analyzer::record_ranges_from_stmt.
>         * gimple-ssa-evrp.c (evrp_dom_walker::before_dom_children): Likewise.
>         * tree-ssa-threadedge.c: Include alloc-pool.h, vr-values.h and
>         gimple-ssa-evrp-analyze.h.
>         (record_temporary_equivalences_from_phis): Add new argument.  When
>         the PHI arg is an SSA_NAME, set the result's range to the range
>         of the PHI arg.
>         (record_temporary_equivalences_from_stmts_at_dest): Record ranges
>         from statements too.
>         (thread_through_normal_block): Accept new argument, evrp_range_analyzer.
>         Pass it down to children as needed.
>         (thread_outgoing_edges): Likewise.
>         (thread_across_edge): Likewise.   Push/pop range state as needed.
>         * tree-ssa-threadedge.h (thread_outgoing_edges): Update prototype.
>
>         * gcc.c-torture/execute/pr83298.c: New test.
>         * gcc.c-torture/execute/pr83362.c New test.
>
> diff --git a/gcc/gimple-ssa-evrp-analyze.h b/gcc/gimple-ssa-evrp-analyze.h
> index 4783e6f772e..3968cfd805a 100644
> --- a/gcc/gimple-ssa-evrp-analyze.h
> +++ b/gcc/gimple-ssa-evrp-analyze.h
> @@ -31,13 +31,18 @@ class evrp_range_analyzer
>    }
>
>    void enter (basic_block);
> +  void push_marker (void);
> +  void pop_to_marker (void);
>    void leave (basic_block);
> -  void record_ranges_from_stmt (gimple *);
> +  void record_ranges_from_stmt (gimple *, bool);
>
>    /* Main interface to retrieve range information.  */
>    value_range *get_value_range (const_tree op)
>      { return vr_values->get_value_range (op); }
>
> +  /* Record a new unwindable range.  */
> +  void push_value_range (tree var, value_range *vr);
> +
>    /* Dump all the current value ranges.  This is primarily
>       a debugging interface.  */
>    void dump_all_value_ranges (FILE *fp)
> @@ -57,7 +62,6 @@ class evrp_range_analyzer
>    DISABLE_COPY_AND_ASSIGN (evrp_range_analyzer);
>    class vr_values *vr_values;
>
> -  void push_value_range (tree var, value_range *vr);
>    value_range *pop_value_range (tree var);
>    value_range *try_find_new_range (tree, tree op, tree_code code, tree limit);
>    void record_ranges_from_incoming_edge (basic_block);
> diff --git a/gcc/gimple-ssa-evrp-analyze.c b/gcc/gimple-ssa-evrp-analyze.c
> index fb3d3297a78..8e9881b6964 100644
> --- a/gcc/gimple-ssa-evrp-analyze.c
> +++ b/gcc/gimple-ssa-evrp-analyze.c
> @@ -56,10 +56,20 @@ evrp_range_analyzer::evrp_range_analyzer () : stack (10)
>    vr_values = new class vr_values;
>  }
>
> +/* Push an unwinding marker onto the unwinding stack.  */
> +
>  void
> -evrp_range_analyzer::enter (basic_block bb)
> +evrp_range_analyzer::push_marker ()
>  {
>    stack.safe_push (std::make_pair (NULL_TREE, (value_range *)NULL));
> +}
> +
> +/* Analyze ranges as we enter basic block BB.  */
> +
> +void
> +evrp_range_analyzer::enter (basic_block bb)
> +{
> +  push_marker ();
>    record_ranges_from_incoming_edge (bb);
>    record_ranges_from_phis (bb);
>    bb->flags |= BB_VISITED;
> @@ -259,8 +269,13 @@ evrp_range_analyzer::record_ranges_from_phis (basic_block bb)
>      }
>  }
>
> +/* Record ranges from STMT into our VR_VALUES class.  If TEMPORARY is
> +   true, then this is a temporary equivalence and should be recorded
> +   into the unwind table.  Othewise record the equivalence into the
> +   global table.  */
> +
>  void
> -evrp_range_analyzer::record_ranges_from_stmt (gimple *stmt)
> +evrp_range_analyzer::record_ranges_from_stmt (gimple *stmt, bool temporary)
>  {
>    tree output = NULL_TREE;
>
> @@ -273,10 +288,36 @@ evrp_range_analyzer::record_ranges_from_stmt (gimple *stmt)
>        vr_values->extract_range_from_stmt (stmt, &taken_edge, &output, &vr);
>        if (output)
>         {
> -         vr_values->update_value_range (output, &vr);
> -
> -         /* Set the SSA with the value range.  */
> -         set_ssa_range_info (output, &vr);
> +         /* Set the SSA with the value range.  There are two cases to
> +            consider.  First (the the most common) is we are processing
> +            STMT in a context where its resulting range globally holds
> +            and thus it can be reflected into the global ranges and need
> +            not be unwound as we leave scope.
> +
> +            The second case occurs if we are processing a statement in
> +            a context where the resulting range must not be reflected
> +            into the global tables and must be unwound as we leave
> +            the current context.  This happens in jump threading for
> +            example.  */
> +         if (!temporary)
> +           {
> +             /* Case one.  We can just update the underlying range
> +                information as well as the global information.  */
> +             vr_values->update_value_range (output, &vr);
> +             set_ssa_range_info (output, &vr);
> +           }
> +         else
> +           {
> +             /* We're going to need to unwind this range.  We can
> +                not use VR as that's a stack object.  We have to allocate
> +                a new range and push the old range onto the stack.  We
> +                also have to be very careful about sharing the underlying
> +                bitmaps.  Ugh.  */
> +             value_range *new_vr = vr_values->allocate_value_range ();
> +             *new_vr = vr;
> +             new_vr->equiv = NULL;
> +             push_value_range (output, new_vr);
> +           }
>         }
>        else
>         vr_values->set_defs_to_varying (stmt);
> @@ -333,10 +374,10 @@ evrp_range_analyzer::record_ranges_from_stmt (gimple *stmt)
>      }
>  }
>
> -/* Restore/pop VRs valid only for BB when we leave BB.  */
> +/* Unwind recorded ranges to their most recent state.  */
>
>  void
> -evrp_range_analyzer::leave (basic_block bb ATTRIBUTE_UNUSED)
> +evrp_range_analyzer::pop_to_marker (void)
>  {
>    gcc_checking_assert (!stack.is_empty ());
>    while (stack.last ().first != NULL_TREE)
> @@ -344,6 +385,15 @@ evrp_range_analyzer::leave (basic_block bb ATTRIBUTE_UNUSED)
>    stack.pop ();
>  }
>
> +/* Restore/pop VRs valid only for BB when we leave BB.  */
> +
> +void
> +evrp_range_analyzer::leave (basic_block bb ATTRIBUTE_UNUSED)
> +{
> +  pop_to_marker ();
> +}
> +
> +
>  /* Push the Value Range of VAR to the stack and update it with new VR.  */
>
>  void
> diff --git a/gcc/gimple-ssa-evrp.c b/gcc/gimple-ssa-evrp.c
> index 609ce38f218..112a43f4e8e 100644
> --- a/gcc/gimple-ssa-evrp.c
> +++ b/gcc/gimple-ssa-evrp.c
> @@ -134,7 +134,7 @@ evrp_dom_walker::before_dom_children (basic_block bb)
>           print_gimple_stmt (dump_file, stmt, 0);
>         }
>
> -      evrp_range_analyzer.record_ranges_from_stmt (stmt);
> +      evrp_range_analyzer.record_ranges_from_stmt (stmt, false);
>
>        if (gcond *cond = dyn_cast <gcond *> (stmt))
>         {
> diff --git a/gcc/testsuite/gcc.c-torture/execute/pr83298.c b/gcc/testsuite/gcc.c-torture/execute/pr83298.c
> new file mode 100644
> index 00000000000..0e51ababf5c
> --- /dev/null
> +++ b/gcc/testsuite/gcc.c-torture/execute/pr83298.c
> @@ -0,0 +1,11 @@
> +
> +int a, b, c = 1;
> +
> +int main ()
> +{
> +  for (; b < 1; b++)
> +    ;
> +  if (!(c * (a < 1)))
> +    __builtin_abort ();
> +  return 0;
> +}
> diff --git a/gcc/testsuite/gcc.c-torture/execute/pr83362.c b/gcc/testsuite/gcc.c-torture/execute/pr83362.c
> new file mode 100644
> index 00000000000..54350819c56
> --- /dev/null
> +++ b/gcc/testsuite/gcc.c-torture/execute/pr83362.c
> @@ -0,0 +1,31 @@
> +typedef unsigned char u8;
> +typedef unsigned int u32;
> +
> +u32 a, b, d, e;
> +u8 c;
> +
> +static u32 __attribute__ ((noinline, noclone))
> +foo (u32 p)
> +{
> +  do
> +    {
> +      e /= 0xfff;
> +      if (p > c)
> +       d = 0;
> +      e -= 3;
> +      e *= b <= a;
> +    }
> +  while (e >= 88030);
> +  return e;
> +}
> +
> +int
> +main (void)
> +{
> +  u32 x = foo (1164);
> +  if (x != 0xfd)
> +    __builtin_abort ();
> +  return 0;
> +}
> +
> +
> diff --git a/gcc/tree-ssa-dom.c b/gcc/tree-ssa-dom.c
> index 59a7d87898e..93c992a9215 100644
> --- a/gcc/tree-ssa-dom.c
> +++ b/gcc/tree-ssa-dom.c
> @@ -1433,7 +1433,7 @@ dom_opt_dom_walker::before_dom_children (basic_block bb)
>    edge taken_edge = NULL;
>    for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
>      {
> -      evrp_range_analyzer.record_ranges_from_stmt (gsi_stmt (gsi));
> +      evrp_range_analyzer.record_ranges_from_stmt (gsi_stmt (gsi), false);
>        taken_edge = this->optimize_stmt (bb, gsi);
>      }
>
> @@ -1456,6 +1456,7 @@ dom_opt_dom_walker::after_dom_children (basic_block bb)
>    x_vr_values = evrp_range_analyzer.get_vr_values ();
>    thread_outgoing_edges (bb, m_dummy_cond, m_const_and_copies,
>                          m_avail_exprs_stack,
> +                        &evrp_range_analyzer,
>                          simplify_stmt_for_jump_threading);
>    x_vr_values = NULL;
>
> diff --git a/gcc/tree-ssa-threadedge.c b/gcc/tree-ssa-threadedge.c
> index 536c4717b72..1f176de715c 100644
> --- a/gcc/tree-ssa-threadedge.c
> +++ b/gcc/tree-ssa-threadedge.c
> @@ -37,6 +37,9 @@ along with GCC; see the file COPYING3.  If not see
>  #include "tree-ssa-dom.h"
>  #include "gimple-fold.h"
>  #include "cfganal.h"
> +#include "alloc-pool.h"
> +#include "vr-values.h"
> +#include "gimple-ssa-evrp-analyze.h"
>
>  /* To avoid code explosion due to jump threading, we limit the
>     number of statements we are going to copy.  This variable
> @@ -114,17 +117,16 @@ potentially_threadable_block (basic_block bb)
>  }
>
>  /* Record temporary equivalences created by PHIs at the target of the
> -   edge E.  Record unwind information for the equivalences onto STACK.
> +   edge E.  Record unwind information for the equivalences into
> +   CONST_AND_COPIES and EVRP_RANGE_DATA.
>
>     If a PHI which prevents threading is encountered, then return FALSE
> -   indicating we should not thread this edge, else return TRUE.
> -
> -   If SRC_MAP/DST_MAP exist, then mark the source and destination SSA_NAMEs
> -   of any equivalences recorded.  We use this to make invalidation after
> -   traversing back edges less painful.  */
> +   indicating we should not thread this edge, else return TRUE.  */
>
>  static bool
> -record_temporary_equivalences_from_phis (edge e, const_and_copies *const_and_copies)
> +record_temporary_equivalences_from_phis (edge e,
> +    const_and_copies *const_and_copies,
> +    evrp_range_analyzer *evrp_range_analyzer)
>  {
>    gphi_iterator gsi;
>
> @@ -152,6 +154,14 @@ record_temporary_equivalences_from_phis (edge e, const_and_copies *const_and_cop
>         stmt_count++;
>
>        const_and_copies->record_const_or_copy (dst, src);
> +
> +      /* Also update the value range associated with DST, using
> +        the range from SRC.  */
> +      if (evrp_range_analyzer && TREE_CODE (src) == SSA_NAME)
> +       {
> +         value_range *vr = evrp_range_analyzer->get_value_range (src);
> +         evrp_range_analyzer->push_value_range (dst, vr);
> +       }
>      }
>    return true;
>  }
> @@ -191,6 +201,7 @@ static gimple *
>  record_temporary_equivalences_from_stmts_at_dest (edge e,
>      const_and_copies *const_and_copies,
>      avail_exprs_stack *avail_exprs_stack,
> +    evrp_range_analyzer *evrp_range_analyzer,
>      pfn_simplify simplify)
>  {
>    gimple *stmt = NULL;
> @@ -235,6 +246,11 @@ record_temporary_equivalences_from_stmts_at_dest (edge e,
>        if (stmt_count > max_stmt_count)
>         return NULL;
>
> +      /* These are temporary ranges, do nto reflect them back into
> +        the global range data.  */
> +      if (evrp_range_analyzer)
> +       evrp_range_analyzer->record_ranges_from_stmt (stmt, true);
> +
>        /* If this is not a statement that sets an SSA_NAME to a new
>          value, then do not try to simplify this statement as it will
>          not simplify in any way that is helpful for jump threading.  */
> @@ -972,6 +988,7 @@ thread_through_normal_block (edge e,
>                              gcond *dummy_cond,
>                              const_and_copies *const_and_copies,
>                              avail_exprs_stack *avail_exprs_stack,
> +                            evrp_range_analyzer *evrp_range_analyzer,
>                              pfn_simplify simplify,
>                              vec<jump_thread_edge *> *path,
>                              bitmap visited)
> @@ -983,7 +1000,8 @@ thread_through_normal_block (edge e,
>       Note that if we found a PHI that made the block non-threadable, then
>       we need to bubble that up to our caller in the same manner we do
>       when we prematurely stop processing statements below.  */
> -  if (!record_temporary_equivalences_from_phis (e, const_and_copies))
> +  if (!record_temporary_equivalences_from_phis (e, const_and_copies,
> +                                               evrp_range_analyzer))
>      return -1;
>
>    /* Now walk each statement recording any context sensitive
> @@ -991,6 +1009,7 @@ thread_through_normal_block (edge e,
>    gimple *stmt
>      = record_temporary_equivalences_from_stmts_at_dest (e, const_and_copies,
>                                                         avail_exprs_stack,
> +                                                       evrp_range_analyzer,
>                                                         simplify);
>
>    /* There's two reasons STMT might be null, and distinguishing
> @@ -1105,12 +1124,15 @@ thread_across_edge (gcond *dummy_cond,
>                     edge e,
>                     class const_and_copies *const_and_copies,
>                     class avail_exprs_stack *avail_exprs_stack,
> +                   class evrp_range_analyzer *evrp_range_analyzer,
>                     pfn_simplify simplify)
>  {
>    bitmap visited = BITMAP_ALLOC (NULL);
>
>    const_and_copies->push_marker ();
>    avail_exprs_stack->push_marker ();
> +  if (evrp_range_analyzer)
> +    evrp_range_analyzer->push_marker ();
>
>    stmt_count = 0;
>
> @@ -1124,6 +1146,7 @@ thread_across_edge (gcond *dummy_cond,
>      threaded = thread_through_normal_block (e, dummy_cond,
>                                             const_and_copies,
>                                             avail_exprs_stack,
> +                                           evrp_range_analyzer,
>                                             simplify, path,
>                                             visited);
>    else
> @@ -1135,6 +1158,8 @@ thread_across_edge (gcond *dummy_cond,
>                                            e->dest);
>        const_and_copies->pop_to_marker ();
>        avail_exprs_stack->pop_to_marker ();
> +      if (evrp_range_analyzer)
> +       evrp_range_analyzer->pop_to_marker ();
>        BITMAP_FREE (visited);
>        register_jump_thread (path);
>        return;
> @@ -1160,6 +1185,8 @@ thread_across_edge (gcond *dummy_cond,
>           BITMAP_FREE (visited);
>           const_and_copies->pop_to_marker ();
>            avail_exprs_stack->pop_to_marker ();
> +         if (evrp_range_analyzer)
> +           evrp_range_analyzer->pop_to_marker ();
>           return;
>         }
>      }
> @@ -1187,6 +1214,8 @@ thread_across_edge (gcond *dummy_cond,
>         {
>           const_and_copies->pop_to_marker ();
>            avail_exprs_stack->pop_to_marker ();
> +         if (evrp_range_analyzer)
> +           evrp_range_analyzer->pop_to_marker ();
>           BITMAP_FREE (visited);
>           return;
>         }
> @@ -1202,6 +1231,8 @@ thread_across_edge (gcond *dummy_cond,
>            for each of E->dest's successors.  */
>         const_and_copies->push_marker ();
>         avail_exprs_stack->push_marker ();
> +       if (evrp_range_analyzer)
> +         evrp_range_analyzer->push_marker ();
>
>         /* Avoid threading to any block we have already visited.  */
>         bitmap_clear (visited);
> @@ -1229,6 +1260,7 @@ thread_across_edge (gcond *dummy_cond,
>           found = thread_through_normal_block (path->last ()->e, dummy_cond,
>                                                const_and_copies,
>                                                avail_exprs_stack,
> +                                              evrp_range_analyzer,
>                                                simplify, path,
>                                                visited) > 0;
>
> @@ -1244,12 +1276,16 @@ thread_across_edge (gcond *dummy_cond,
>           delete_jump_thread_path (path);
>
>         /* And unwind the equivalence table.  */
> +       if (evrp_range_analyzer)
> +         evrp_range_analyzer->pop_to_marker ();
>         avail_exprs_stack->pop_to_marker ();
>         const_and_copies->pop_to_marker ();
>        }
>      BITMAP_FREE (visited);
>    }
>
> +  if (evrp_range_analyzer)
> +    evrp_range_analyzer->pop_to_marker ();
>    const_and_copies->pop_to_marker ();
>    avail_exprs_stack->pop_to_marker ();
>  }
> @@ -1271,6 +1307,7 @@ void
>  thread_outgoing_edges (basic_block bb, gcond *dummy_cond,
>                        class const_and_copies *const_and_copies,
>                        class avail_exprs_stack *avail_exprs_stack,
> +                      class evrp_range_analyzer *evrp_range_analyzer,
>                        tree (*simplify) (gimple *, gimple *,
>                                          class avail_exprs_stack *,
>                                          basic_block))
> @@ -1288,7 +1325,7 @@ thread_outgoing_edges (basic_block bb, gcond *dummy_cond,
>      {
>        thread_across_edge (dummy_cond, single_succ_edge (bb),
>                           const_and_copies, avail_exprs_stack,
> -                         simplify);
> +                         evrp_range_analyzer, simplify);
>      }
>    else if ((last = last_stmt (bb))
>            && gimple_code (last) == GIMPLE_COND
> @@ -1304,11 +1341,13 @@ thread_outgoing_edges (basic_block bb, gcond *dummy_cond,
>          more than one predecessor and more than one successor.  */
>        if (potentially_threadable_block (true_edge->dest))
>         thread_across_edge (dummy_cond, true_edge,
> -                           const_and_copies, avail_exprs_stack, simplify);
> +                           const_and_copies, avail_exprs_stack,
> +                           evrp_range_analyzer, simplify);
>
>        /* Similarly for the ELSE arm.  */
>        if (potentially_threadable_block (false_edge->dest))
>         thread_across_edge (dummy_cond, false_edge,
> -                           const_and_copies, avail_exprs_stack, simplify);
> +                           const_and_copies, avail_exprs_stack,
> +                           evrp_range_analyzer, simplify);
>      }
>  }
> diff --git a/gcc/tree-ssa-threadedge.h b/gcc/tree-ssa-threadedge.h
> index 49dfa9c94d4..0f2e39f72e3 100644
> --- a/gcc/tree-ssa-threadedge.h
> +++ b/gcc/tree-ssa-threadedge.h
> @@ -30,9 +30,11 @@ extern void threadedge_initialize_values (void);
>  extern void threadedge_finalize_values (void);
>  extern bool potentially_threadable_block (basic_block);
>  extern void propagate_threaded_block_debug_into (basic_block, basic_block);
> +class evrp_range_analyzer;
>  extern void thread_outgoing_edges (basic_block, gcond *,
>                                    const_and_copies *,
>                                    avail_exprs_stack *,
> +                                  evrp_range_analyzer *,
>                                    tree (*) (gimple *, gimple *,
>                                              avail_exprs_stack *, basic_block));
>
> diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c
> index a86b38208ab..7df6f244657 100644
> --- a/gcc/tree-vrp.c
> +++ b/gcc/tree-vrp.c
> @@ -6677,7 +6677,7 @@ vrp_dom_walker::after_dom_children (basic_block bb)
>
>    x_vr_values = vr_values;
>    thread_outgoing_edges (bb, m_dummy_cond, m_const_and_copies,
> -                        m_avail_exprs_stack,
> +                        m_avail_exprs_stack, NULL,
>                          simplify_stmt_for_jump_threading);
>    x_vr_values = NULL;
>
>
Jeff Law Dec. 12, 2017, 4:49 p.m. UTC | #6
On 12/12/2017 01:39 AM, Richard Biener wrote:
> 
> LGTM.  I notice (existing code) that we use set_vr_value from push_value_range
> vs. update_value_range from the regular code.  Also for some unrelated missed
> optimization I'd like to be able to set and unwind SSA range info via the stack
> as well.  Both future things we might want to cleanup.  I guess once the
> SSA propagator VRP can go there's an opportunity to cleanup even more of the
> code.
Right.  It's clearly an area ripe for a bit of cleanup.  The avail_exprs
and const/copies tables take the approach of recording everything in the
unwind tables.  The VRP bits try to be smarter and only use the
unwinding stack when it's absolutely necessary.  I'm torn as I can see
pros and cons of both approaches.  I'd certainly like to settle on just
one for consistency's sake.

The API for update_value_range is painful -- it assumes the new range is
a temporary range and that the equivs hung off it can be free'd.  I'm
not sure why it's done that way, but it introduces a certain amount of
pain.  So we have to be careful in how we use it.

A part of me wants to classify the lowest level range structure
(value_range) and introduce some invariants (ie, can they be shared or
are they copied).  But it feels like not enough gain right now.

I absolutely agree that we'll want the push/pop scope primitives to be
exposed.  I've consistently seen value in that with the other tables,
thus exposing those in the VRP bits seemed advisable.

Jeff

ps.  Another report came in today that is almost certainly a DUP of
83298. I'll include its testcase in the commit.
diff mbox series

Patch

diff --git a/gcc/testsuite/gcc.c-torture/execute/pr83298.c b/gcc/testsuite/gcc.c-torture/execute/pr83298.c
new file mode 100644
index 00000000000..0e51ababf5c
--- /dev/null
+++ b/gcc/testsuite/gcc.c-torture/execute/pr83298.c
@@ -0,0 +1,11 @@ 
+
+int a, b, c = 1;
+
+int main ()
+{
+  for (; b < 1; b++)
+    ;
+  if (!(c * (a < 1))) 
+    __builtin_abort ();
+  return 0; 
+}
diff --git a/gcc/vr-values.c b/gcc/vr-values.c
index 9352e120d9d..ee5ae3c6a27 100644
--- a/gcc/vr-values.c
+++ b/gcc/vr-values.c
@@ -912,6 +912,23 @@  vr_values::extract_range_from_cond_expr (value_range *vr, gassign *stmt)
   else
     set_value_range_to_varying (&vr1);
 
+  /* If either range is VR_UNDEFINED, vrp_meet will make the optimistic
+     choice and use the range of the other operand as the result range.
+
+     Other users of vrp_meet either explicitly filter the calls for
+     this case, or they do not care (PHI processing where unexecutable
+     edges are explicitly expected to be ignored).
+
+     Like most other callers, we can not generally tolerate the optimistic
+     choice here.  Consider jump threading where we're looking into a
+     non-dominated block and thus may not necessarily have processed the
+     ranges for statements within that non-dominated block.  */
+  if (vr0.type == VR_UNDEFINED || vr1.type == VR_UNDEFINED)
+    {
+      set_value_range_to_varying (vr);
+      return;
+    }
+
   /* The resulting value range is the union of the operand ranges */
   copy_value_range (vr, &vr0);
   vrp_meet (vr, &vr1);