
Optimize in VRP loads from constant arrays (take 2)

Message ID 20170429162849.GQ1809@tucnak
State New

Commit Message

Jakub Jelinek April 29, 2017, 4:28 p.m. UTC
On Fri, Apr 21, 2017 at 04:42:03PM +0200, Richard Biener wrote:
> On April 21, 2017 4:06:59 PM GMT+02:00, Jakub Jelinek <jakub@redhat.com> wrote:
> >This patch attempts to implement the optimization mentioned in
> >http://gcc.gnu.org/ml/gcc-patches/2017-02/msg00945.html
> >If we know a (relatively small) value range for ARRAY_REF index
> >in constant array load and the values in the array for all the indexes
> >in the array are the same (and constant), we can just replace the load
> >with the constant.
> 
> But this should be done during propagation (and thus can even produce a range rather than just a constant).

True, I've changed the implementation to do that.
Except that it is only useful during propagation for integral or pointer
types; for anything else (and for pointers when not just doing a NULL vs.
non-NULL vr) it is also performed during simplification (in that case
it requires operand_equal_p values).

The patch also newly computes a range for the index and a range from the array
bounds (taking into account arrays at struct end) and intersects them,
plus takes into account that some iterators might have a couple of zero LSB
bits (as computed by ccp).
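
To illustrate the kind of load this targets, here is a made-up example
(the real testcase is gcc.dg/tree-ssa/vrp113.c below): if all entries
reachable through the known index range hold the same constant, the load
folds to that constant; for integral types a range can be derived even
when the values differ, and non-integral types go through the
operand_equal_p simplification path.

/* Made-up illustration, not part of the patch.  */
static const int tbl[8] = { 7, 7, 7, 7, 5, 6, 7, 8 };
static const float ftbl[4] = { 2.5f, 2.5f, 2.5f, 2.5f };

int
ex1 (int x)
{
  return tbl[x & 3];		/* index in [0, 3], all entries 7 -> 7 */
}

int
ex2 (int x)
{
  return tbl[(x & 3) + 4];	/* index in [4, 7], values 5..8 -> VR [5, 8] */
}

float
ex3 (int x)
{
  return ftbl[x & 3];		/* non-integral: simplified to 2.5f */
}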

> Also much of the fold_const_aggregate_ref work can be shared for all indices.

Maybe.  Is that required for the patch, or can it be done incrementally?

> >Shall I introduce a param for the size of the range to consider?
> 
> I think so.  Eventually we can pre-compute/cache a "range tree" for a
> Constant initializer?

param introduced.

> That said I'm happy with moving it to propagation time and considering ranges
> And leave compile-time optimization for future work (with possibly increasing the number of elements to consider)

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Note, the patch doesn't handle the case where the constant load is an aggregate,
i.e. where fold_const_aggregate_ref_1 returns a CONSTRUCTOR.  A CONSTRUCTOR is not
a gimple min invariant, and when not empty can't be in the IL anyway, so I was
thinking we could perhaps in that case just modify the rhs to use a constant
index (e.g. vr0.min) instead of the rhs with the variable index.  It didn't work
because operand_equal_p doesn't handle non-empty CONSTRUCTORs (they compare
unequal even if they have the same elements).
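
A made-up example of that unhandled aggregate case:

/* Made-up illustration, not part of the patch: the rhs of the load is a
   whole struct, so fold_const_aggregate_ref_1 returns a CONSTRUCTOR,
   which cannot be put into the IL, and the patch gives up.  */
struct A { int a, b; };
static const struct A tab[4] = { { 1, 2 }, { 1, 2 }, { 1, 2 }, { 1, 2 } };

struct A
ex4 (int x)
{
  return tab[x & 3];	/* aggregate load, not optimized by this patch */
}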

2017-04-29  Jakub Jelinek  <jakub@redhat.com>

	* params.def (PARAM_MAX_VRP_CONSTANT_ARRAY_LOADS): New.
	* tree.c (array_at_struct_end_p): Return false if ref is STRING_CST.
	* tree-vrp.c: Include gimplify.h.
	(load_index): New variable.
	(vrp_load_valueize, extract_range_from_load): New functions.
	(extract_range_from_assignment, simplify_stmt_using_ranges): Use
	extract_range_from_load.
	(stmt_interesting_for_vrp): Return true for loads with handled
	component rhs.

	* gcc.dg/tree-ssa/vrp113.c: New test.



	Jakub

Comments

Richard Biener May 2, 2017, 11:13 a.m. UTC | #1
On Sat, 29 Apr 2017, Jakub Jelinek wrote:

> On Fri, Apr 21, 2017 at 04:42:03PM +0200, Richard Biener wrote:
> > On April 21, 2017 4:06:59 PM GMT+02:00, Jakub Jelinek <jakub@redhat.com> wrote:
> > >This patch attempts to implement the optimization mentioned in
> > >http://gcc.gnu.org/ml/gcc-patches/2017-02/msg00945.html
> > >If we know a (relatively small) value range for ARRAY_REF index
> > >in constant array load and the values in the array for all the indexes
> > >in the array are the same (and constant), we can just replace the load
> > >with the constant.
> > 
> > But this should be done during propagation (and thus can even produce a range rather than just a constant).
> 
> True, I've changed the implementation to do that.
> Except that it is only useful during propagation for integral or pointer
> types, for anything else (and for pointers when not just doing NULL vs.
> non-NULL vr) it is also performed during simplification (in that case
> it requires operand_equal_p values).
> 
> The patch also newly computes range for the index and range from the array
> bounds (taking into account arrays at struct end) and intersects them,
> plus takes into account that some iterators might have a couple of zero LBS
> bits (as computed by ccp).

Hmm, while I can see how this helps, I'd rather not do it this way
(see PR80533 and my followup with a testcase which shows
array_at_struct_end_p is wrong).  Ideally we'd add asserts for array
indices instead.  Thus

  i_3 = ASSERT_EXPR <i_2, (unsigned)i_2 <= 3>
  .. = a[i_3];

which of course needs adjustment to -Warray-bounds to look at the
range of the original SSA name (and in loops even that might degrade...).

Was this necessary to get any reasonable results?

> > Also much of the fold_const_aggregate_ref work can be shared for all indices.
> 
> Maybe.  Is that required for the patch, or can it be done incrementally?

Incrementally.

Still given the high cost of get_array_ctor_element_at_index which
does linear searching I'd add a few early-outs like

  base = get_base_address (rhs);
  ctor = ctor_for_folding (base);
  if (! ctor)
    return NULL_TREE;

I'd restructure the patch quite differently, using for_each_index on the
ref to gather an array of index pointers (bail out on sth unhandled).
Then I'd see if I have interesting ranges for them; if not, bail out.
Also compute the size product of all ranges and test that against
PARAM_MAX_VRP_CONSTANT_ARRAY_LOADS.  Then store the minimum range
value in the index places (temporarily) and use get_ref_base_and_extent to
get at the constant "starting" offset.  From there iterate using
the remembered indices (remember the ref tree as well via for_each_index),
directly adjusting the constant offset so you can feed
fold_ctor_reference the constant offsets of all elements that need to
be considered.  As an optimization fold_ctor_reference would know how
to start from the "last" offset (much refactoring would need to be
done here given nested ctors and multiple indices, I guess).

That said, restricting to a single index and open-coding
for_each_index intermingled with index range handling makes the code
quite hard to follow.

For large ctors we probably need to do sth to 
get_array_ctor_element_at_index, but given the way we lay out
ctors (idx being optional plus ranges) doing better than a linear search
might be tricky...

Thanks,
Richard.

> > >Shall I introduce a param for the size of the range to consider?
> > 
> > I think so.  Eventually we can pre-compute/cache a "range tree" for a
> > Constant initializer?
> 
> param introduced.
> 
> > That said I'm happy with moving it to propagation time and considering ranges
> > And leave compile-time optimization for future work (with possibly increasing the number of elements to consider)
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> 
> Note, the patch doesn't handle the case when the constant load is aggregate,
> where fold_const_aggregate_ref_1 returns a CONSTRUCTOR.  CONSTRUCTOR is not
> gimple min invariant, and when not empty can't be in the IL anyway, so I was
> thinking we could perhaps in that case just modify the rhs to use a constant
> index (e.g. vr0.min) instead of the rhs with variable index.  It didn't work
> because operand_equal_p doesn't handle non-empty CONSTRUCTORs (they compare
> unequal even if they have the same elements).
> 
> 2017-04-29  Jakub Jelinek  <jakub@redhat.com>
> 
> 	* params.def (PARAM_MAX_VRP_CONSTANT_ARRAY_LOADS): New.
> 	* tree.c (array_at_struct_end_p): Return false if ref is STRING_CST.
> 	* tree-vrp.c: Include gimplify.h.
> 	(load_index): New variable.
> 	(vrp_load_valueize, extract_range_from_load): New functions.
> 	(extract_range_from_assignment, simplify_stmt_using_ranges): Use
> 	extract_range_from_load.
> 	(stmt_interesting_for_vrp): Return true for loads with handled
> 	component rhs.
> 
> 	* gcc.dg/tree-ssa/vrp113.c: New test.
> 
> --- gcc/params.def.jj	2017-03-19 11:57:24.000000000 +0100
> +++ gcc/params.def	2017-04-28 13:24:04.670084868 +0200
> @@ -1275,6 +1275,12 @@ DEFPARAM (PARAM_MAX_VRP_SWITCH_ASSERTION
>  	  "edge of a switch statement during VRP",
>  	  10, 0, 0)
>  
> +DEFPARAM (PARAM_MAX_VRP_CONSTANT_ARRAY_LOADS,
> +	  "max-vrp-constant-array-loads",
> +	  "Maximum number of constant array load indexes to check for "
> +	  "VRP optimizations",
> +	  256, 0, 0)
> +
>  DEFPARAM (PARAM_VECT_EPILOGUES_NOMASK,
>  	  "vect-epilogues-nomask",
>  	  "Enable loop epilogue vectorization using smaller vector size.",
> --- gcc/tree.c.jj	2017-04-27 15:41:53.000000000 +0200
> +++ gcc/tree.c	2017-04-28 15:26:18.005275533 +0200
> @@ -13271,6 +13271,10 @@ array_at_struct_end_p (tree ref, bool al
>        && !(flag_unconstrained_commons
>  	   && VAR_P (ref) && DECL_COMMON (ref)))
>      return false;
> +  /* Similarly for string literals, we are constrained by their given
> +     domain.  */
> +  else if (TREE_CODE (ref) == STRING_CST)
> +    return false;
>  
>    return true;
>  }
> --- gcc/tree-vrp.c.jj	2017-04-25 18:35:35.606504460 +0200
> +++ gcc/tree-vrp.c	2017-04-28 17:12:29.310298865 +0200
> @@ -62,6 +62,7 @@ along with GCC; see the file COPYING3.
>  #include "alloc-pool.h"
>  #include "domwalk.h"
>  #include "tree-cfgcleanup.h"
> +#include "gimplify.h"
>  
>  #define VR_INITIALIZER { VR_UNDEFINED, NULL_TREE, NULL_TREE, NULL }
>  
> @@ -3779,6 +3780,211 @@ extract_range_from_comparison (value_ran
>      set_value_range_to_truthvalue (vr, type);
>  }
>  
> +/* Variables used to communicate index and its current value
> +   between extract_range_from_load and its valueize function.  */
> +static tree load_index[2];
> +
> +/* Valueize callback for extract_range_from_load.  */
> +
> +static tree
> +vrp_load_valueize (tree name)
> +{
> +  if (name == load_index[0])
> +    return load_index[1];
> +  return name;
> +}
> +
> +/* Extract a range from a constant load or simplify it to a constant,
> +   if there is a single ARRAY_REF among handled components, we have
> +   a narrow range for the index or the array bounds imply a narrow
> +   range and constant loads with all the indexes in the range yield
> +   the same constant or a range can be derived from them.
> +   RHS is the gimple_assign_rhs1 of the load.
> +   If VR is non-NULL, the function extracts a value range from the
> +   constant load values.
> +   If VR is NULL, all constants must be the same and we return that
> +   constant.  */
> +
> +static tree
> +extract_range_from_load (value_range *vr, tree rhs)
> +{
> +  tree t, *p = NULL;
> +  tree index = NULL_TREE;
> +  value_range vr0 = VR_INITIALIZER;
> +  int step = 1;
> +  if (vr)
> +    set_value_range_to_varying (vr);
> +  for (t = rhs; handled_component_p (t); t = TREE_OPERAND (t, 0))
> +    switch (TREE_CODE (t))
> +      {
> +      case ARRAY_REF:
> +	if (TREE_CODE (TREE_OPERAND (t, 1)) == INTEGER_CST)
> +	  break;
> +	if (TREE_CODE (TREE_OPERAND (t, 1)) == SSA_NAME)
> +	  {
> +	    if (index)
> +	      return NULL_TREE;
> +	    index = TREE_OPERAND (t, 1);
> +	    if (!INTEGRAL_TYPE_P (TREE_TYPE (index)))
> +	      return NULL_TREE;
> +	    tree lb = array_ref_low_bound (t);
> +	    if (lb == NULL_TREE || TREE_CODE (lb) != INTEGER_CST)
> +	      return NULL_TREE;
> +	    value_range vr1 = VR_INITIALIZER;
> +	    extract_range_from_unary_expr (&vr0, NOP_EXPR, TREE_TYPE (lb),
> +					   index);
> +	    tree ub = NULL_TREE;
> +	    if (!array_at_struct_end_p (t))
> +	      {
> +		ub = array_ref_up_bound (t);
> +		if (ub == NULL_TREE
> +		    || TREE_CODE (ub) != INTEGER_CST
> +		    || tree_int_cst_lt (ub, lb))
> +		  ub = NULL_TREE;
> +	      }
> +	    set_value_range (&vr1, VR_RANGE, lb,
> +			     ub ? ub : vrp_val_max (TREE_TYPE (lb)), NULL);
> +	    vrp_intersect_ranges (&vr0, &vr1);
> +	    if (vr0.type != VR_RANGE)
> +	      return NULL_TREE;
> +	    step = wi::ffs (get_nonzero_bits (index));
> +	    if (step <= 1)
> +	      step = 1;
> +	    else
> +	      {
> +		step = MIN (step - 1, 30);
> +		step = MIN (step, TYPE_PRECISION (TREE_TYPE (index))
> +				  + TYPE_UNSIGNED (TREE_TYPE (index)) - 2);
> +		wide_int w = vr0.min;
> +		w += wi::mask (step, false, w.get_precision ());
> +		w &= wi::mask (step, true, w.get_precision ());
> +		if (wi::gt_p (w, vr0.max, TYPE_SIGN (TREE_TYPE (vr0.min)))
> +		    || wi::lt_p (w, vr0.min, TYPE_SIGN (TREE_TYPE (vr0.min))))
> +		  return NULL_TREE;
> +		vr0.min = wide_int_to_tree (TREE_TYPE (vr0.min), w);
> +		step = 1 << step;
> +	      }
> +	    wide_int diff = vr0.max;
> +	    diff -= vr0.min;
> +	    diff = wi::udiv_trunc (diff, step);
> +
> +	    /* As we have to check every index in the range, avoid
> +	       doing it too many times.  */
> +	    if (!wi::ltu_p (diff,
> +			    PARAM_VALUE (PARAM_MAX_VRP_CONSTANT_ARRAY_LOADS)))
> +	      return NULL_TREE;
> +	    if (tree_int_cst_lt (vr0.min, vrp_val_min (TREE_TYPE (index)))
> +		|| tree_int_cst_lt (vrp_val_max (TREE_TYPE (index)), vr0.max))
> +	      return NULL_TREE;
> +	  }
> +	break;
> +      case ARRAY_RANGE_REF:
> +	return NULL_TREE;
> +      default:
> +	break;
> +      }
> +  if (index == NULL_TREE)
> +    return NULL_TREE;
> +  load_index[0] = index;
> +  load_index[1] = vr0.min;
> +  tree c = fold_const_aggregate_ref_1 (rhs, vrp_load_valueize);
> +  /* fold_const_aggregate_ref_1 can unfortunately only valueize a very
> +     small subset of expressions so far.  */
> +  if (c == NULL_TREE)
> +    {
> +      rhs = unshare_expr (rhs);
> +      for (t = rhs;
> +	   TREE_CODE (t) != ARRAY_REF || TREE_OPERAND (t, 1) != index;
> +	   t = TREE_OPERAND (t, 0))
> +	;
> +      p = &TREE_OPERAND (t, 1);
> +      *p = load_index[1];
> +      c = fold_const_aggregate_ref_1 (rhs, vrp_load_valueize);
> +    }
> +  if (c == NULL_TREE || !is_gimple_min_invariant (c))
> +    return NULL_TREE;
> +  if (vr)
> +    {
> +      if (INTEGRAL_TYPE_P (TREE_TYPE (rhs)))
> +	{
> +	  if (TREE_CODE (c) != INTEGER_CST)
> +	    return NULL_TREE;
> +	  set_value_range_to_value (vr, c, NULL);
> +	}
> +      else if (POINTER_TYPE_P (TREE_TYPE (rhs)))
> +	{
> +	  bool strict_overflow_p;
> +	  if (integer_zerop (c))
> +	    set_value_range_to_null (vr, TREE_TYPE (rhs));
> +	  else if (tree_single_nonzero_warnv_p (c, &strict_overflow_p))
> +	    set_value_range_to_nonnull (vr, TREE_TYPE (rhs));
> +	  else
> +	    return NULL_TREE;
> +	}
> +      else
> +	return NULL_TREE;
> +    }
> +  wide_int w = vr0.min;
> +  while (1)
> +    {
> +      w += step;
> +      if (wi::gt_p (w, vr0.max, TYPE_SIGN (TREE_TYPE (vr0.min)))
> +	  || wi::le_p (w, vr0.min, TYPE_SIGN (TREE_TYPE (vr0.min))))
> +	break;
> +      load_index[1] = wide_int_to_tree (TREE_TYPE (vr0.min), w);
> +      if (p)
> +	*p = load_index[1];
> +      tree c2 = fold_const_aggregate_ref_1 (rhs, vrp_load_valueize);
> +      if (c2 == NULL_TREE || !is_gimple_min_invariant (c2))
> +	{
> +	  c = NULL_TREE;
> +	  break;
> +	}
> +      if (vr && INTEGRAL_TYPE_P (TREE_TYPE (rhs)))
> +	{
> +	  if (TREE_CODE (c2) != INTEGER_CST)
> +	    {
> +	      c = NULL_TREE;
> +	      break;
> +	    }
> +	  if (tree_int_cst_lt (c2, vr->min))
> +	    {
> +	      if (TREE_OVERFLOW_P (c2))
> +		c2 = drop_tree_overflow (c2);
> +	      vr->min = c2;
> +	    }
> +	  else if (tree_int_cst_lt (vr->max, c2))
> +	    {
> +	      if (TREE_OVERFLOW_P (c2))
> +		c2 = drop_tree_overflow (c2);
> +	      vr->max = c2;
> +	    }
> +	}
> +      else if (vr && integer_zerop (c))
> +	{
> +	  if (!integer_zerop (c2))
> +	    {
> +	      c = NULL_TREE;
> +	      break;
> +	    }
> +	}
> +      else if (vr)
> +	{
> +	  bool strict_overflow_p;
> +	  if (!tree_single_nonzero_warnv_p (c2, &strict_overflow_p))
> +	    {
> +	      c = NULL_TREE;
> +	      break;
> +	    }
> +	}
> +      else if (!operand_equal_p (c, c2, 0))
> +	return NULL_TREE;
> +    }
> +  if (c == NULL_TREE && vr)
> +    set_value_range_to_varying (vr);
> +  return c;
> +}
> +
>  /* Helper function for simplify_internal_call_using_ranges and
>     extract_range_basic.  Return true if OP0 SUBCODE OP1 for
>     SUBCODE {PLUS,MINUS,MULT}_EXPR is known to never overflow or
> @@ -4280,6 +4486,9 @@ extract_range_from_assignment (value_ran
>    else if (get_gimple_rhs_class (code) == GIMPLE_SINGLE_RHS
>  	   && is_gimple_min_invariant (gimple_assign_rhs1 (stmt)))
>      set_value_range_to_value (vr, gimple_assign_rhs1 (stmt), NULL);
> +  else if (gimple_assign_single_p (stmt)
> +	   && handled_component_p (gimple_assign_rhs1 (stmt)))
> +    extract_range_from_load (vr, gimple_assign_rhs1 (stmt));
>    else
>      set_value_range_to_varying (vr);
>  
> @@ -7339,12 +7548,14 @@ stmt_interesting_for_vrp (gimple *stmt)
>  
>        /* In general, assignments with virtual operands are not useful
>  	 for deriving ranges, with the obvious exception of calls to
> -	 builtin functions.  */
> +	 builtin functions and for handled_component_p loads.  */
>        if (lhs && TREE_CODE (lhs) == SSA_NAME
>  	  && (INTEGRAL_TYPE_P (TREE_TYPE (lhs))
>  	      || POINTER_TYPE_P (TREE_TYPE (lhs)))
> -	  && (is_gimple_call (stmt)
> -	      || !gimple_vuse (stmt)))
> +	  && (!gimple_vuse (stmt)
> +	      || is_gimple_call (stmt)
> +	      || (gimple_assign_single_p (stmt)
> +		  && handled_component_p (gimple_assign_rhs1 (stmt)))))
>  	return true;
>        else if (is_gimple_call (stmt) && gimple_call_internal_p (stmt))
>  	switch (gimple_call_internal_fn (stmt))
> @@ -10742,6 +10953,23 @@ simplify_stmt_using_ranges (gimple_stmt_
>  	  return simplify_min_or_max_using_ranges (gsi, stmt);
>  
>  	default:
> +	  if (gimple_assign_single_p (stmt)
> +	      && handled_component_p (rhs1)
> +	      && !INTEGRAL_TYPE_P (TREE_TYPE (rhs1)))
> +	    {
> +	      /* For integral loads this is handled by computing
> +		 value ranges and if they are singleton, replacing.
> +		 For pointers we only compute NULL or non-NULL ranges,
> +		 and can here actually simplify to a particular
> +		 ADDR_EXPR if identical everywhere.  For other types
> +		 this is the only possibility to optimize.  */
> +	      tree c = extract_range_from_load (NULL, rhs1);
> +	      if (c)
> +		{
> +		  gimple_assign_set_rhs_from_tree (gsi, c);
> +		  return true;
> +		}
> +	    }
>  	  break;
>  	}
>      }
> --- gcc/testsuite/gcc.dg/tree-ssa/vrp113.c.jj	2017-04-27 16:34:42.170486063 +0200
> +++ gcc/testsuite/gcc.dg/tree-ssa/vrp113.c	2017-04-28 17:23:43.431690269 +0200
> @@ -0,0 +1,74 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fdump-tree-evrp -fdump-tree-vrp1" } */
> +
> +#define NULL (void *) 0
> +struct S { int a, b; };
> +const struct S var1[16] = { { 1, 2 }, { 1, 2 }, { 1, 2 },
> +			    { 1, 2 }, { 2, 3 }, { 4, 5 } };
> +const float var2[32] = { 7.0f, -64.0f, 12.0f, 24.0f,
> +			 7.0f, -65.0f, 13.0f, 25.0f,
> +			 7.0f, -66.0f, 14.0f, 26.0f,
> +			 7.0f, -67.0f, 15.0f, 27.0f,
> +			 7.0f, -66.0f, 14.0f, 26.0f,
> +			 7.0f, -67.0f, 15.0f, 27.0f,
> +			 7.0f, -68.0f, 16.0f, 28.0f,
> +			 7.0f, -69.0f, 17.0f, 27.0f };
> +const signed char var3[] = { -12, 8, 0, 9, 21, 2, 22, 3, 20, 4, 21, 5,
> +			     20, 6, 21, 7, 22, 8, 21, 9, 10, 11, 12, 13 };
> +const char str[] = "abc";
> +const char *const var4[] = { str, str, str, NULL, NULL, NULL };
> +void link_error (void);
> +
> +int
> +f1 (int x)
> +{
> +  return var1[x & 3].a + var1[(x & 1) + 2].b + "abcd\4\4\4\4\4\4\4\4efgh"[(x & 7) + 4];
> +}
> +
> +float
> +f2 (int x)
> +{
> +  return var2[x & -4];
> +}
> +
> +void
> +f3 (int x)
> +{
> +  if (x >= 2 && x <= 9)
> +    {
> +      if (var3[x * 2] < 20 || var3[x * 2] > 22)
> +	link_error ();
> +    }
> +  if (x > 0)
> +    {
> +      if (var3[x] < 0 || var3[x] > 22)
> +	link_error ();
> +    }
> +  if (var3[x] < -12)
> +    link_error ();
> +  if (x < 3)
> +    {
> +      if (var4[x] == NULL)
> +	link_error ();
> +    }
> +  else
> +    {
> +      if (var4[x] != NULL)
> +	link_error ();
> +    }
> +}
> +
> +const char *
> +f4 (int x)
> +{
> +  if (x >= 3)
> +    return "def";
> +  return var4[x];
> +}
> +
> +/* { dg-final { scan-tree-dump "return 7;" "evrp" } } */
> +/* { dg-final { scan-tree-dump "(_\[0-9]* =|return) 7\.0e\\+0;" "vrp1" } } */
> +/* { dg-final { scan-tree-dump-not "link_error" "vrp1" } } */
> +/* { dg-final { scan-tree-dump-not "var4" "vrp1" } } */
> +/* { dg-final { scan-tree-dump "&str" "vrp1" } } */
> +
> 
> 
> 	Jakub
> 
>
Jakub Jelinek May 2, 2017, 12:12 p.m. UTC | #2
On Tue, May 02, 2017 at 01:13:19PM +0200, Richard Biener wrote:
> On Sat, 29 Apr 2017, Jakub Jelinek wrote:
> 
> > On Fri, Apr 21, 2017 at 04:42:03PM +0200, Richard Biener wrote:
> > > On April 21, 2017 4:06:59 PM GMT+02:00, Jakub Jelinek <jakub@redhat.com> wrote:
> > > >This patch attempts to implement the optimization mentioned in
> > > >http://gcc.gnu.org/ml/gcc-patches/2017-02/msg00945.html
> > > >If we know a (relatively small) value range for ARRAY_REF index
> > > >in constant array load and the values in the array for all the indexes
> > > >in the array are the same (and constant), we can just replace the load
> > > >with the constant.
> > > 
> > > But this should be done during propagation (and thus can even produce a range rather than just a constant).
> > 
> > True, I've changed the implementation to do that.
> > Except that it is only useful during propagation for integral or pointer
> > types, for anything else (and for pointers when not just doing NULL vs.
> > non-NULL vr) it is also performed during simplification (in that case
> > it requires operand_equal_p values).
> > 
> > The patch also newly computes range for the index and range from the array
> > bounds (taking into account arrays at struct end) and intersects them,
> > plus takes into account that some iterators might have a couple of zero LBS
> > bits (as computed by ccp).
> 
> Hmm, while I can see how this helps I'd rather not do this in this way.
> (see PR80533 and my followup with a testcase which shows
> array_at_struct_end_p is wrong)  Ideally we'd add asserts for array
> indices instead.  Thus
> 
>   i_3 = ASSERT_EXPR <i_2, (unsigned)i_2 <= 3>
>   .. = a[i_3];
> 
> which of course needs adjustment to -Warray-bounds to look at the
> range of the original SSA name (and in loops even that might degrade...).
> 
> Was this necessary to get any reasonable results?

Yes, it is very common that you end up with a VR that has a negative min, or
a very large max, but if the array is in the middle of a struct, or it is a
toplevel non-common array variable etc., which is quite a common case, then
we should still be able to optimize it.
Adding extra ASSERT_EXPRs would work too, but wouldn't it be too expensive?
I mean there can be lots of array accesses with the same index; if we added
an ASSERT_EXPR for every one, VRP work could increase many times.  It is
true that we'd be able to optimize:
int foo [5] = { 1, 2, 3, 4, 5 };
void
bar (int i, int j)
{
  if (j > 10)
    {
      foo[i] = 10;
      if (i < 0 || i >= 5)
	link_error ();
    }
  foo[i]++;
}

If array_at_struct_end_p is wrong, it should be fixed ;)

> Still given the high cost of get_array_ctor_element_at_index which
> does linear searching I'd add a few early-outs like
> 
>   base = get_base_address (rhs);
>   ctor = ctor_for_folding (base);
>   if (! ctor)
>     return NULL_TREE;

This one I can add easily.

> I'd restructure the patch quite different, using for_each_index on the
> ref gather an array of index pointers (bail out on sth unhandled).
> Then I'd see if I have interesting ranges for them, if not, bail out.
> Also compute the size product of all ranges and test that against
> PARAM_MAX_VRP_CONSTANT_ARRAY_LOADS.  Then store the minimum range
> value in the index places (temporarily) and use get_base_ref_and_extent to
> get at the constant "starting" offset.  From there iterate using
> the remembered indices (remember the ref tree as well via for_each_index),
> directly adjusting the constant offset so you can feed
> fold_ctor_reference the constant offsets of all elements that need to
> be considered.  As optimization fold_ctor_reference would know how
> to start from the "last" offset (much refactoring would need to be
> done here given nested ctors and multiple indices I guess).

But for this, don't you want to take it over?
I agree that the current implementation is not very efficient and that is
why it is also limited to that small number of iterations.
As many cases just aren't able to use the valueize callback, handling
arbitrary numbers of non-constant indexes would be harder.

	Jakub
Richard Biener May 2, 2017, 12:50 p.m. UTC | #3
On Tue, 2 May 2017, Jakub Jelinek wrote:

> On Tue, May 02, 2017 at 01:13:19PM +0200, Richard Biener wrote:
> > On Sat, 29 Apr 2017, Jakub Jelinek wrote:
> > 
> > > On Fri, Apr 21, 2017 at 04:42:03PM +0200, Richard Biener wrote:
> > > > On April 21, 2017 4:06:59 PM GMT+02:00, Jakub Jelinek <jakub@redhat.com> wrote:
> > > > >This patch attempts to implement the optimization mentioned in
> > > > >http://gcc.gnu.org/ml/gcc-patches/2017-02/msg00945.html
> > > > >If we know a (relatively small) value range for ARRAY_REF index
> > > > >in constant array load and the values in the array for all the indexes
> > > > >in the array are the same (and constant), we can just replace the load
> > > > >with the constant.
> > > > 
> > > > But this should be done during propagation (and thus can even produce a range rather than just a constant).
> > > 
> > > True, I've changed the implementation to do that.
> > > Except that it is only useful during propagation for integral or pointer
> > > types, for anything else (and for pointers when not just doing NULL vs.
> > > non-NULL vr) it is also performed during simplification (in that case
> > > it requires operand_equal_p values).
> > > 
> > > The patch also newly computes range for the index and range from the array
> > > bounds (taking into account arrays at struct end) and intersects them,
> > > plus takes into account that some iterators might have a couple of zero LBS
> > > bits (as computed by ccp).
> > 
> > Hmm, while I can see how this helps I'd rather not do this in this way.
> > (see PR80533 and my followup with a testcase which shows
> > array_at_struct_end_p is wrong)  Ideally we'd add asserts for array
> > indices instead.  Thus
> > 
> >   i_3 = ASSERT_EXPR <i_2, (unsigned)i_2 <= 3>
> >   .. = a[i_3];
> > 
> > which of course needs adjustment to -Warray-bounds to look at the
> > range of the original SSA name (and in loops even that might degrade...).
> > 
> > Was this necessary to get any reasonable results?
> 
> Yes, it is very common that you end up with a VR that has negative min, or
> very large max, but if the array is in the middle of a struct, or it is a
> toplevel non-common array variable etc., which is quite common case, then
> we still should be able to optimize it.
> Adding extra ASSERT_EXPRs would work too, but wouldn't it be too expensive?
> I mean there can be lots of array accesses with the same index, if we'd add
> an ASSERT_EXPR for every case, VRP work could increase many times.  It is
> true that we'd be able to optimize:
> int foo [5] = { 1, 2, 3, 4, 5 };
> void
> bar (int i, int j)
> {
>   if (j > 10)
>     {
>       foo[i] = 10;
>       if (i < 0 || i >= 5)
> 	link_error ();
>     }
>   foo[i]++;
> }

I don't think it's too expensive.  Yes, you get one additional assert
(and SSA name) per array index.  My main gripe would be the
effect on -Warray-bounds...

> 
> If array_at_struct_end_p is wrong, it should be fixed ;)

Indeed.  It was originally meant to say false if you can trust
TYPE_DOMAIN of the array but now it says false if there's some means
to constrain the array size (the DECL_P path and now your STRING_CST
one).  But all callers afterwards just look at TYPE_DOMAIN ...
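
For reference, a minimal made-up example of the trailing-array case under
discussion, where the declared domain cannot be trusted:

/* Made-up illustration: the trailing array is declared with one element,
   but objects are usually allocated with more storage, so clamping the
   index to TYPE_DOMAIN ([0, 0]) would be wrong; this is the
   array-at-struct-end case.  */
struct T
{
  int n;
  int data[1];	/* pre-C99 style flexible trailing array */
};

int
ex5 (struct T *p, int i)
{
  return p->data[i];
}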

> > Still given the high cost of get_array_ctor_element_at_index which
> > does linear searching I'd add a few early-outs like
> > 
> >   base = get_base_address (rhs);
> >   ctor = ctor_for_folding (base);
> >   if (! ctor)
> >     return NULL_TREE;
> 
> This one I can add easily.
> 
> > I'd restructure the patch quite different, using for_each_index on the
> > ref gather an array of index pointers (bail out on sth unhandled).
> > Then I'd see if I have interesting ranges for them, if not, bail out.
> > Also compute the size product of all ranges and test that against
> > PARAM_MAX_VRP_CONSTANT_ARRAY_LOADS.  Then store the minimum range
> > value in the index places (temporarily) and use get_base_ref_and_extent to
> > get at the constant "starting" offset.  From there iterate using
> > the remembered indices (remember the ref tree as well via for_each_index),
> > directly adjusting the constant offset so you can feed
> > fold_ctor_reference the constant offsets of all elements that need to
> > be considered.  As optimization fold_ctor_reference would know how
> > to start from the "last" offset (much refactoring would need to be
> > done here given nested ctors and multiple indices I guess).
> 
> But for this, don't you want to take it over?

I can try.  Is there a PR for this?

> I agree that the current implementation is not very efficient and that is
> why it is also limited to that small number of iterations.
> As many cases just aren't able to use the valueize callback, handling
> arbitrary numbers of non-constant indexes would be harder.

Sure.  I'd have expected you simply handle ARRAY_REF of a VAR_DECL
and nothing else ;)

Richard.
Jakub Jelinek May 3, 2017, 8:03 a.m. UTC | #4
On Tue, May 02, 2017 at 02:50:16PM +0200, Richard Biener wrote:
> > If array_at_struct_end_p is wrong, it should be fixed ;)
> 
> Indeed.  It was originally meant to say false if you can trust
> TYPE_DOMAIN of the array but now it says false if there's some means
> to constrain the array size (the DECL_P path and now your STRING_CST
> one).  But all callers afterwards just look at TYPE_DOMAIN ...

So shall we verify that TYPE_DOMAIN is consistent with the object size
in that case inside of array_at_struct_end_p?

> > > I'd restructure the patch quite different, using for_each_index on the
> > > ref gather an array of index pointers (bail out on sth unhandled).
> > > Then I'd see if I have interesting ranges for them, if not, bail out.
> > > Also compute the size product of all ranges and test that against
> > > PARAM_MAX_VRP_CONSTANT_ARRAY_LOADS.  Then store the minimum range
> > > value in the index places (temporarily) and use get_base_ref_and_extent to
> > > get at the constant "starting" offset.  From there iterate using
> > > the remembered indices (remember the ref tree as well via for_each_index),
> > > directly adjusting the constant offset so you can feed
> > > fold_ctor_reference the constant offsets of all elements that need to
> > > be considered.  As optimization fold_ctor_reference would know how
> > > to start from the "last" offset (much refactoring would need to be
> > > done here given nested ctors and multiple indices I guess).
> > 
> > But for this, don't you want to take it over?
> 
> I can try.  Is there a PR for this?

Ok, filed PR80603, it is now all yours.

> > I agree that the current implementation is not very efficient and that is
> > why it is also limited to that small number of iterations.
> > As many cases just aren't able to use the valueize callback, handling
> > arbitrary numbers of non-constant indexes would be harder.
> 
> Sure.  I'd have expected you simply handle ARRAY_REF of a VAR_DECL
> and nothing else ;)

That would be too simple and boring ;)

	Jakub

Patch

--- gcc/params.def.jj	2017-03-19 11:57:24.000000000 +0100
+++ gcc/params.def	2017-04-28 13:24:04.670084868 +0200
@@ -1275,6 +1275,12 @@  DEFPARAM (PARAM_MAX_VRP_SWITCH_ASSERTION
 	  "edge of a switch statement during VRP",
 	  10, 0, 0)
 
+DEFPARAM (PARAM_MAX_VRP_CONSTANT_ARRAY_LOADS,
+	  "max-vrp-constant-array-loads",
+	  "Maximum number of constant array load indexes to check for "
+	  "VRP optimizations",
+	  256, 0, 0)
+
 DEFPARAM (PARAM_VECT_EPILOGUES_NOMASK,
 	  "vect-epilogues-nomask",
 	  "Enable loop epilogue vectorization using smaller vector size.",
--- gcc/tree.c.jj	2017-04-27 15:41:53.000000000 +0200
+++ gcc/tree.c	2017-04-28 15:26:18.005275533 +0200
@@ -13271,6 +13271,10 @@  array_at_struct_end_p (tree ref, bool al
       && !(flag_unconstrained_commons
 	   && VAR_P (ref) && DECL_COMMON (ref)))
     return false;
+  /* Similarly for string literals, we are constrained by their given
+     domain.  */
+  else if (TREE_CODE (ref) == STRING_CST)
+    return false;
 
   return true;
 }
--- gcc/tree-vrp.c.jj	2017-04-25 18:35:35.606504460 +0200
+++ gcc/tree-vrp.c	2017-04-28 17:12:29.310298865 +0200
@@ -62,6 +62,7 @@  along with GCC; see the file COPYING3.
 #include "alloc-pool.h"
 #include "domwalk.h"
 #include "tree-cfgcleanup.h"
+#include "gimplify.h"
 
 #define VR_INITIALIZER { VR_UNDEFINED, NULL_TREE, NULL_TREE, NULL }
 
@@ -3779,6 +3780,211 @@  extract_range_from_comparison (value_ran
     set_value_range_to_truthvalue (vr, type);
 }
 
+/* Variables used to communicate index and its current value
+   between extract_range_from_load and its valueize function.  */
+static tree load_index[2];
+
+/* Valueize callback for extract_range_from_load.  */
+
+static tree
+vrp_load_valueize (tree name)
+{
+  if (name == load_index[0])
+    return load_index[1];
+  return name;
+}
+
+/* Extract a range from a constant load or simplify it to a constant,
+   if there is a single ARRAY_REF among handled components, we have
+   a narrow range for the index or the array bounds imply a narrow
+   range and constant loads with all the indexes in the range yield
+   the same constant or a range can be derived from them.
+   RHS is the gimple_assign_rhs1 of the load.
+   If VR is non-NULL, the function extracts a value range from the
+   constant load values.
+   If VR is NULL, all constants must be the same and we return that
+   constant.  */
+
+static tree
+extract_range_from_load (value_range *vr, tree rhs)
+{
+  tree t, *p = NULL;
+  tree index = NULL_TREE;
+  value_range vr0 = VR_INITIALIZER;
+  int step = 1;
+  if (vr)
+    set_value_range_to_varying (vr);
+  for (t = rhs; handled_component_p (t); t = TREE_OPERAND (t, 0))
+    switch (TREE_CODE (t))
+      {
+      case ARRAY_REF:
+	if (TREE_CODE (TREE_OPERAND (t, 1)) == INTEGER_CST)
+	  break;
+	if (TREE_CODE (TREE_OPERAND (t, 1)) == SSA_NAME)
+	  {
+	    if (index)
+	      return NULL_TREE;
+	    index = TREE_OPERAND (t, 1);
+	    if (!INTEGRAL_TYPE_P (TREE_TYPE (index)))
+	      return NULL_TREE;
+	    tree lb = array_ref_low_bound (t);
+	    if (lb == NULL_TREE || TREE_CODE (lb) != INTEGER_CST)
+	      return NULL_TREE;
+	    value_range vr1 = VR_INITIALIZER;
+	    extract_range_from_unary_expr (&vr0, NOP_EXPR, TREE_TYPE (lb),
+					   index);
+	    tree ub = NULL_TREE;
+	    if (!array_at_struct_end_p (t))
+	      {
+		ub = array_ref_up_bound (t);
+		if (ub == NULL_TREE
+		    || TREE_CODE (ub) != INTEGER_CST
+		    || tree_int_cst_lt (ub, lb))
+		  ub = NULL_TREE;
+	      }
+	    set_value_range (&vr1, VR_RANGE, lb,
+			     ub ? ub : vrp_val_max (TREE_TYPE (lb)), NULL);
+	    vrp_intersect_ranges (&vr0, &vr1);
+	    if (vr0.type != VR_RANGE)
+	      return NULL_TREE;
+	    step = wi::ffs (get_nonzero_bits (index));
+	    if (step <= 1)
+	      step = 1;
+	    else
+	      {
+		step = MIN (step - 1, 30);
+		step = MIN (step, TYPE_PRECISION (TREE_TYPE (index))
+				  + TYPE_UNSIGNED (TREE_TYPE (index)) - 2);
+		wide_int w = vr0.min;
+		w += wi::mask (step, false, w.get_precision ());
+		w &= wi::mask (step, true, w.get_precision ());
+		if (wi::gt_p (w, vr0.max, TYPE_SIGN (TREE_TYPE (vr0.min)))
+		    || wi::lt_p (w, vr0.min, TYPE_SIGN (TREE_TYPE (vr0.min))))
+		  return NULL_TREE;
+		vr0.min = wide_int_to_tree (TREE_TYPE (vr0.min), w);
+		step = 1 << step;
+	      }
+	    wide_int diff = vr0.max;
+	    diff -= vr0.min;
+	    diff = wi::udiv_trunc (diff, step);
+
+	    /* As we have to check every index in the range, avoid
+	       doing it too many times.  */
+	    if (!wi::ltu_p (diff,
+			    PARAM_VALUE (PARAM_MAX_VRP_CONSTANT_ARRAY_LOADS)))
+	      return NULL_TREE;
+	    if (tree_int_cst_lt (vr0.min, vrp_val_min (TREE_TYPE (index)))
+		|| tree_int_cst_lt (vrp_val_max (TREE_TYPE (index)), vr0.max))
+	      return NULL_TREE;
+	  }
+	break;
+      case ARRAY_RANGE_REF:
+	return NULL_TREE;
+      default:
+	break;
+      }
+  if (index == NULL_TREE)
+    return NULL_TREE;
+  load_index[0] = index;
+  load_index[1] = vr0.min;
+  tree c = fold_const_aggregate_ref_1 (rhs, vrp_load_valueize);
+  /* fold_const_aggregate_ref_1 can unfortunately only valueize a very
+     small subset of expressions so far.  */
+  if (c == NULL_TREE)
+    {
+      rhs = unshare_expr (rhs);
+      for (t = rhs;
+	   TREE_CODE (t) != ARRAY_REF || TREE_OPERAND (t, 1) != index;
+	   t = TREE_OPERAND (t, 0))
+	;
+      p = &TREE_OPERAND (t, 1);
+      *p = load_index[1];
+      c = fold_const_aggregate_ref_1 (rhs, vrp_load_valueize);
+    }
+  if (c == NULL_TREE || !is_gimple_min_invariant (c))
+    return NULL_TREE;
+  if (vr)
+    {
+      if (INTEGRAL_TYPE_P (TREE_TYPE (rhs)))
+	{
+	  if (TREE_CODE (c) != INTEGER_CST)
+	    return NULL_TREE;
+	  set_value_range_to_value (vr, c, NULL);
+	}
+      else if (POINTER_TYPE_P (TREE_TYPE (rhs)))
+	{
+	  bool strict_overflow_p;
+	  if (integer_zerop (c))
+	    set_value_range_to_null (vr, TREE_TYPE (rhs));
+	  else if (tree_single_nonzero_warnv_p (c, &strict_overflow_p))
+	    set_value_range_to_nonnull (vr, TREE_TYPE (rhs));
+	  else
+	    return NULL_TREE;
+	}
+      else
+	return NULL_TREE;
+    }
+  wide_int w = vr0.min;
+  while (1)
+    {
+      w += step;
+      if (wi::gt_p (w, vr0.max, TYPE_SIGN (TREE_TYPE (vr0.min)))
+	  || wi::le_p (w, vr0.min, TYPE_SIGN (TREE_TYPE (vr0.min))))
+	break;
+      load_index[1] = wide_int_to_tree (TREE_TYPE (vr0.min), w);
+      if (p)
+	*p = load_index[1];
+      tree c2 = fold_const_aggregate_ref_1 (rhs, vrp_load_valueize);
+      if (c2 == NULL_TREE || !is_gimple_min_invariant (c2))
+	{
+	  c = NULL_TREE;
+	  break;
+	}
+      if (vr && INTEGRAL_TYPE_P (TREE_TYPE (rhs)))
+	{
+	  if (TREE_CODE (c2) != INTEGER_CST)
+	    {
+	      c = NULL_TREE;
+	      break;
+	    }
+	  if (tree_int_cst_lt (c2, vr->min))
+	    {
+	      if (TREE_OVERFLOW_P (c2))
+		c2 = drop_tree_overflow (c2);
+	      vr->min = c2;
+	    }
+	  else if (tree_int_cst_lt (vr->max, c2))
+	    {
+	      if (TREE_OVERFLOW_P (c2))
+		c2 = drop_tree_overflow (c2);
+	      vr->max = c2;
+	    }
+	}
+      else if (vr && integer_zerop (c))
+	{
+	  if (!integer_zerop (c2))
+	    {
+	      c = NULL_TREE;
+	      break;
+	    }
+	}
+      else if (vr)
+	{
+	  bool strict_overflow_p;
+	  if (!tree_single_nonzero_warnv_p (c2, &strict_overflow_p))
+	    {
+	      c = NULL_TREE;
+	      break;
+	    }
+	}
+      else if (!operand_equal_p (c, c2, 0))
+	return NULL_TREE;
+    }
+  if (c == NULL_TREE && vr)
+    set_value_range_to_varying (vr);
+  return c;
+}
+
 /* Helper function for simplify_internal_call_using_ranges and
    extract_range_basic.  Return true if OP0 SUBCODE OP1 for
    SUBCODE {PLUS,MINUS,MULT}_EXPR is known to never overflow or
@@ -4280,6 +4486,9 @@  extract_range_from_assignment (value_ran
   else if (get_gimple_rhs_class (code) == GIMPLE_SINGLE_RHS
 	   && is_gimple_min_invariant (gimple_assign_rhs1 (stmt)))
     set_value_range_to_value (vr, gimple_assign_rhs1 (stmt), NULL);
+  else if (gimple_assign_single_p (stmt)
+	   && handled_component_p (gimple_assign_rhs1 (stmt)))
+    extract_range_from_load (vr, gimple_assign_rhs1 (stmt));
   else
     set_value_range_to_varying (vr);
 
@@ -7339,12 +7548,14 @@  stmt_interesting_for_vrp (gimple *stmt)
 
       /* In general, assignments with virtual operands are not useful
 	 for deriving ranges, with the obvious exception of calls to
-	 builtin functions.  */
+	 builtin functions and for handled_component_p loads.  */
       if (lhs && TREE_CODE (lhs) == SSA_NAME
 	  && (INTEGRAL_TYPE_P (TREE_TYPE (lhs))
 	      || POINTER_TYPE_P (TREE_TYPE (lhs)))
-	  && (is_gimple_call (stmt)
-	      || !gimple_vuse (stmt)))
+	  && (!gimple_vuse (stmt)
+	      || is_gimple_call (stmt)
+	      || (gimple_assign_single_p (stmt)
+		  && handled_component_p (gimple_assign_rhs1 (stmt)))))
 	return true;
       else if (is_gimple_call (stmt) && gimple_call_internal_p (stmt))
 	switch (gimple_call_internal_fn (stmt))
@@ -10742,6 +10953,23 @@  simplify_stmt_using_ranges (gimple_stmt_
 	  return simplify_min_or_max_using_ranges (gsi, stmt);
 
 	default:
+	  if (gimple_assign_single_p (stmt)
+	      && handled_component_p (rhs1)
+	      && !INTEGRAL_TYPE_P (TREE_TYPE (rhs1)))
+	    {
+	      /* For integral loads this is handled by computing
+		 value ranges and if they are singleton, replacing.
+		 For pointers we only compute NULL or non-NULL ranges,
+		 and can here actually simplify to a particular
+		 ADDR_EXPR if identical everywhere.  For other types
+		 this is the only possibility to optimize.  */
+	      tree c = extract_range_from_load (NULL, rhs1);
+	      if (c)
+		{
+		  gimple_assign_set_rhs_from_tree (gsi, c);
+		  return true;
+		}
+	    }
 	  break;
 	}
     }
--- gcc/testsuite/gcc.dg/tree-ssa/vrp113.c.jj	2017-04-27 16:34:42.170486063 +0200
+++ gcc/testsuite/gcc.dg/tree-ssa/vrp113.c	2017-04-28 17:23:43.431690269 +0200
@@ -0,0 +1,74 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-evrp -fdump-tree-vrp1" } */
+
+#define NULL (void *) 0
+struct S { int a, b; };
+const struct S var1[16] = { { 1, 2 }, { 1, 2 }, { 1, 2 },
+			    { 1, 2 }, { 2, 3 }, { 4, 5 } };
+const float var2[32] = { 7.0f, -64.0f, 12.0f, 24.0f,
+			 7.0f, -65.0f, 13.0f, 25.0f,
+			 7.0f, -66.0f, 14.0f, 26.0f,
+			 7.0f, -67.0f, 15.0f, 27.0f,
+			 7.0f, -66.0f, 14.0f, 26.0f,
+			 7.0f, -67.0f, 15.0f, 27.0f,
+			 7.0f, -68.0f, 16.0f, 28.0f,
+			 7.0f, -69.0f, 17.0f, 27.0f };
+const signed char var3[] = { -12, 8, 0, 9, 21, 2, 22, 3, 20, 4, 21, 5,
+			     20, 6, 21, 7, 22, 8, 21, 9, 10, 11, 12, 13 };
+const char str[] = "abc";
+const char *const var4[] = { str, str, str, NULL, NULL, NULL };
+void link_error (void);
+
+int
+f1 (int x)
+{
+  return var1[x & 3].a + var1[(x & 1) + 2].b + "abcd\4\4\4\4\4\4\4\4efgh"[(x & 7) + 4];
+}
+
+float
+f2 (int x)
+{
+  return var2[x & -4];
+}
+
+void
+f3 (int x)
+{
+  if (x >= 2 && x <= 9)
+    {
+      if (var3[x * 2] < 20 || var3[x * 2] > 22)
+	link_error ();
+    }
+  if (x > 0)
+    {
+      if (var3[x] < 0 || var3[x] > 22)
+	link_error ();
+    }
+  if (var3[x] < -12)
+    link_error ();
+  if (x < 3)
+    {
+      if (var4[x] == NULL)
+	link_error ();
+    }
+  else
+    {
+      if (var4[x] != NULL)
+	link_error ();
+    }
+}
+
+const char *
+f4 (int x)
+{
+  if (x >= 3)
+    return "def";
+  return var4[x];
+}
+
+/* { dg-final { scan-tree-dump "return 7;" "evrp" } } */
+/* { dg-final { scan-tree-dump "(_\[0-9]* =|return) 7\.0e\\+0;" "vrp1" } } */
+/* { dg-final { scan-tree-dump-not "link_error" "vrp1" } } */
+/* { dg-final { scan-tree-dump-not "var4" "vrp1" } } */
+/* { dg-final { scan-tree-dump "&str" "vrp1" } } */
+