Determine more IVs to be non-overflowing

Message ID	20160630130301.GA40315@kam.mff.cuni.cz
State	New
Headers	show Return-Path: <gcc-patches-return-430783-incoming=patchwork.ozlabs.org@gcc.gnu.org> DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:date :from:to:cc:subject:message-id:references:mime-version :content-type:in-reply-to; q=dns; s=default; b=K7vKrzcyUKlNaudVw M5lIz0ARMJDmVVHkuw9JjeZK/iadfxSvurX1UFNvFAt1cpJBOMiGM2mTgha/Z+Id P7QYpEzXp3AF6Y9uZNEyFvbppAQTsyvBQcfWa7hSpGLpRh95OxNetdZMXWabjHDJ 9gsFOSHb6Cu5mzvCqrzTsbZ28s= Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk Sender: gcc-patches-owner@gcc.gnu.org Date: Thu, 30 Jun 2016 15:03:01 +0200 From: Jan Hubicka <hubicka@ucw.cz> To: Jan Hubicka <hubicka@ucw.cz> Cc: Richard Biener <rguenther@suse.de>, gcc-patches@gcc.gnu.org Subject: Re: Determine more IVs to be non-overflowing Message-ID: <20160630130301.GA40315@kam.mff.cuni.cz> References: <20160627140938.GF98078@kam.mff.cuni.cz> <alpine.LSU.2.11.1606290934010.29772@t29.fhfr.qr> <20160629080804.GA12114@kam.mff.cuni.cz> <alpine.LSU.2.11.1606291010410.29772@t29.fhfr.qr> <20160629105100.GA17019@kam.mff.cuni.cz> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20160629105100.GA17019@kam.mff.cuni.cz> User-Agent: Mutt/1.5.21 (2010-09-15)

Message ID

20160630130301.GA40315@kam.mff.cuni.cz

State

New

Headers

DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id
	:list-unsubscribe:list-archive:list-post:list-help:sender:date
	:from:to:cc:subject:message-id:references:mime-version
	:content-type:in-reply-to; q=dns; s=default; b=K7vKrzcyUKlNaudVw
	M5lIz0ARMJDmVVHkuw9JjeZK/iadfxSvurX1UFNvFAt1cpJBOMiGM2mTgha/Z+Id
	P7QYpEzXp3AF6Y9uZNEyFvbppAQTsyvBQcfWa7hSpGLpRh95OxNetdZMXWabjHDJ
	9gsFOSHb6Cu5mzvCqrzTsbZ28s=
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
Sender: gcc-patches-owner@gcc.gnu.org
Date: Thu, 30 Jun 2016 15:03:01 +0200
From: Jan Hubicka <hubicka@ucw.cz>
To: Jan Hubicka <hubicka@ucw.cz>
Cc: Richard Biener <rguenther@suse.de>, gcc-patches@gcc.gnu.org
Subject: Re: Determine more IVs to be non-overflowing
Message-ID: <20160630130301.GA40315@kam.mff.cuni.cz>
References: <20160627140938.GF98078@kam.mff.cuni.cz>
	<alpine.LSU.2.11.1606290934010.29772@t29.fhfr.qr>
	<20160629080804.GA12114@kam.mff.cuni.cz>
	<alpine.LSU.2.11.1606291010410.29772@t29.fhfr.qr>
	<20160629105100.GA17019@kam.mff.cuni.cz>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20160629105100.GA17019@kam.mff.cuni.cz>
User-Agent: Mutt/1.5.21 (2010-09-15)

Commit Message

Jan Hubicka June 30, 2016, 1:03 p.m. UTC

Hi,
i sent the email before, but it seems it never left my machine. Sorry for
possible duplicates.  This is updated version which does not overflow (by
capping nit), but still using widest_int for the calculatoins. It seems more
readable to do so than using wi:add/wi:mult/wi:lt and overflow checks, but
i can definitly update the patch to do it, too.

Bootstrapped/regtested x86_64-linux, OK?

Honza

	* tree-scalar-evolution.h (iv_can_overflow_p): Declare.
	* tree-scalar-evolution.c (iv_can_overflow_p): New funcition.
	(simple_iv): Use it.
	* tree-ssa-loop-niter.c (nowrap_type_p): Use ANY_INTEGRAL_TYPE_P

	* gcc.dg/tree-ssa/scev-14.c: New testcase.

Comments

Richard Biener July 1, 2016, 11:45 a.m. UTC | #1

On Thu, 30 Jun 2016, Jan Hubicka wrote:

> Hi,
> i sent the email before, but it seems it never left my machine. Sorry for
> possible duplicates.  This is updated version which does not overflow (by
> capping nit), but still using widest_int for the calculatoins. It seems more
> readable to do so than using wi:add/wi:mult/wi:lt and overflow checks, but
> i can definitly update the patch to do it, too.
> 
> Bootstrapped/regtested x86_64-linux, OK?
> 
> Honza
> 
> 	* tree-scalar-evolution.h (iv_can_overflow_p): Declare.
> 	* tree-scalar-evolution.c (iv_can_overflow_p): New funcition.
> 	(simple_iv): Use it.
> 	* tree-ssa-loop-niter.c (nowrap_type_p): Use ANY_INTEGRAL_TYPE_P
> 
> 	* gcc.dg/tree-ssa/scev-14.c: New testcase.
> 
> Index: testsuite/gcc.dg/tree-ssa/scev-14.c
> ===================================================================
> --- testsuite/gcc.dg/tree-ssa/scev-14.c	(revision 0)
> +++ testsuite/gcc.dg/tree-ssa/scev-14.c	(working copy)
> @@ -0,0 +1,11 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fdump-tree-ivopts-details" } */
> +int a[100];
> +void t(unsigned int n)
> +{
> +  unsigned int i;
> +  for (i=0; i<n; i++)
> +     a[i]++;
> +}
> +/* { dg-final { scan-tree-dump "Overflowness wrto loop niter:	No-overflow"  "ivopts" } } */
> +/* { dg-final { scan-tree-dump-not "Overflowness wrto loop niter:	Overflow" "ivopts" } } */
> Index: tree-scalar-evolution.c
> ===================================================================
> --- tree-scalar-evolution.c	(revision 237856)
> +++ tree-scalar-evolution.c	(working copy)
> @@ -280,6 +280,7 @@ along with GCC; see the file COPYING3.
>  #include "params.h"
>  #include "tree-ssa-propagate.h"
>  #include "gimple-fold.h"
> +#include "print-tree.h"

Don't see you need this.

>  static tree analyze_scalar_evolution_1 (struct loop *, tree, tree);
>  static tree analyze_scalar_evolution_for_address_of (struct loop *loop,
> @@ -3309,6 +3310,60 @@ scev_reset (void)
>      }
>  }
>  
> +/* Return true if the IV calculation in TYPE can overflow based on the knowledge
> +   of the upper bound on the number of iterations of LOOP, the BASE and STEP
> +   of IV.
> +
> +   We do not use information whether TYPE can overflow so it is safe to
> +   use this test even for derived IVs not computed every iteration or
> +   hypotetical IVs to be inserted into code.  */
> +
> +bool
> +iv_can_overflow_p (struct loop *loop, tree type, tree base, tree step)

Exporting this is also not necessary?

> +{
> +  widest_int nit;
> +  wide_int base_min, base_max, step_min, step_max, type_min, type_max;
> +  signop sgn = TYPE_SIGN (type);
> +  signop base_sgn = TYPE_SIGN (TREE_TYPE (base));
> +  signop step_sgn = TYPE_SIGN (TREE_TYPE (step));
> +
> +  if (step == 0)
> +    return false;

Err - you probably mean

     if (integer_zerop (step))

here?  your check is a NULL pointer check ...

> +
> +  if (TREE_CODE (base) == INTEGER_CST)
> +    base_min = base_max = base;
> +  else if (TREE_CODE (base) == SSA_NAME && !POINTER_TYPE_P (TREE_TYPE (base))
> +	   && get_range_info (base, &base_min, &base_max) == VR_RANGE)

split &&s into separate lines.  Also use && INTEGRAL_TYPE_P rather
than a negative check.  This means that pointer IVs are always
considered to overflow.

> +    ;
> +  else
> +    return true;
> +
> +  if (TREE_CODE (step) == INTEGER_CST)
> +    step_min = step_max = step;
> +  else if (TREE_CODE (step) == SSA_NAME && !POINTER_TYPE_P (TREE_TYPE (step))
> +	   && get_range_info (step, &step_min, &step_max) == VR_RANGE)

Likewise.

> +    ;
> +  else
> +    return true;
> +
> +  if (!get_max_loop_iterations (loop, &nit))
> +    return true;
> +
> +  type_min = wi::min_value (type);
> +  type_max = wi::max_value (type);
> +  /* Watch overflow.  */
> +  if ((widest_int)1 << TYPE_PRECISION (type) < nit)
> +    return true;

TYPE_PRECISION (type) - 1?  The comment can be improved I think.

> +  if ((widest_int::from (base_max, base_sgn)
> +       + widest_int::from (step_max, step_sgn) * (nit + 1))
> +       > widest_int::from (type_max, sgn)
> +      || (widest_int::from (type_min, sgn)
> +	  > (widest_int::from (base_min, base_sgn)
> +	     + widest_int::from (step_min, step_sgn) * (nit + 1))))
> +    return true;

and this lacks any comment...  so it decodes to

 (base_max + step_max * (nit + 1) > type_max)
 || (type_min > base_min + step_min * (nit + 1))

and it basically assumes infinite precision arithmetic for
the computation.  As mentioned previously for __int128 widest_int
does _not_ guarantee this so you need to use FIXED_WIDE_INT
with a precision of WIDE_INT_MAX_PRECISION * 2 (+1?) like VRP does.

Note as both type_min/max and base_min/max are wide_ints the above
might be simplified

  step_max * (nit + 1) > type_max - base_max

where the subtraction can be carried out in wide_int(?) and you
should also CSE nit + 1 or transform it further to

  step_max * nit > type_max - base_max - step_max

where if I understand correctly, if type_max - base_max - step_max
underflows we already wrap.

Thanks,
Richard.

> +  return false;
> +}
> +
>  /* Checks whether use of OP in USE_LOOP behaves as a simple affine iv with
>     respect to WRTO_LOOP and returns its base and step in IV if possible
>     (see analyze_scalar_evolution_in_loop for more details on USE_LOOP
> @@ -3375,8 +3430,12 @@ simple_iv (struct loop *wrto_loop, struc
>    if (tree_contains_chrecs (iv->base, NULL))
>      return false;
>  
> -  iv->no_overflow = (!folded_casts && ANY_INTEGRAL_TYPE_P (type)
> -		     && TYPE_OVERFLOW_UNDEFINED (type));
> +  iv->no_overflow = !folded_casts && nowrap_type_p (type);
> +
> +  if (!iv->no_overflow
> +      && !iv_can_overflow_p (wrto_loop, type, iv->base, iv->step))
> +    iv->no_overflow = true;
> +
>  
>    /* Try to simplify iv base:
>  
> Index: tree-scalar-evolution.h
> ===================================================================
> --- tree-scalar-evolution.h	(revision 237856)
> +++ tree-scalar-evolution.h	(working copy)
> @@ -38,6 +38,7 @@ extern unsigned int scev_const_prop (voi
>  extern bool expression_expensive_p (tree);
>  extern bool simple_iv (struct loop *, struct loop *, tree, struct affine_iv *,
>  		       bool);
> +extern bool iv_can_overflow_p (struct loop *, tree, tree, tree);
>  extern tree compute_overall_effect_of_inner_loop (struct loop *, tree);
>  
>  /* Returns the basic block preceding LOOP, or the CFG entry block when
> Index: tree-ssa-loop-niter.c
> ===================================================================
> --- tree-ssa-loop-niter.c	(revision 237856)
> +++ tree-ssa-loop-niter.c	(working copy)
> @@ -4105,7 +4105,7 @@ n_of_executions_at_most (gimple *stmt,
>  bool
>  nowrap_type_p (tree type)
>  {
> -  if (INTEGRAL_TYPE_P (type)
> +  if (ANY_INTEGRAL_TYPE_P (type)
>        && TYPE_OVERFLOW_UNDEFINED (type))
>      return true;
>  
> 
>

Jan Hubicka July 1, 2016, 1:16 p.m. UTC | #2

> > Index: tree-scalar-evolution.c
> > ===================================================================
> > --- tree-scalar-evolution.c	(revision 237856)
> > +++ tree-scalar-evolution.c	(working copy)
> > @@ -280,6 +280,7 @@ along with GCC; see the file COPYING3.
> >  #include "params.h"
> >  #include "tree-ssa-propagate.h"
> >  #include "gimple-fold.h"
> > +#include "print-tree.h"
> 
> Don't see you need this.

Yes, i forgot this from debugging.
> 
> >  static tree analyze_scalar_evolution_1 (struct loop *, tree, tree);
> >  static tree analyze_scalar_evolution_for_address_of (struct loop *loop,
> > @@ -3309,6 +3310,60 @@ scev_reset (void)
> >      }
> >  }
> >  
> > +/* Return true if the IV calculation in TYPE can overflow based on the knowledge
> > +   of the upper bound on the number of iterations of LOOP, the BASE and STEP
> > +   of IV.
> > +
> > +   We do not use information whether TYPE can overflow so it is safe to
> > +   use this test even for derived IVs not computed every iteration or
> > +   hypotetical IVs to be inserted into code.  */
> > +
> > +bool
> > +iv_can_overflow_p (struct loop *loop, tree type, tree base, tree step)
> 
> Exporting this is also not necessary?

The reason why I export this is that incrementally I plan to use it in ivopts.
It is constructing its own candidates and also needs to know if they will
overflow or not.

I will drop it for now and include it in incremental patch.
> 
> > +{
> > +  widest_int nit;
> > +  wide_int base_min, base_max, step_min, step_max, type_min, type_max;
> > +  signop sgn = TYPE_SIGN (type);
> > +  signop base_sgn = TYPE_SIGN (TREE_TYPE (base));
> > +  signop step_sgn = TYPE_SIGN (TREE_TYPE (step));
> > +
> > +  if (step == 0)
> > +    return false;
> 
> Err - you probably mean
> 
>      if (integer_zerop (step))
> 
> here?  your check is a NULL pointer check ...

Yes, sorry. inteer_zerop (step).  Probably does not matter.

> > +  type_min = wi::min_value (type);
> > +  type_max = wi::max_value (type);
> > +  /* Watch overflow.  */
> > +  if ((widest_int)1 << TYPE_PRECISION (type) < nit)
> > +    return true;
> 
> TYPE_PRECISION (type) - 1?  The comment can be improved I think.

For type==char I think TYPE_PRECISION should be 8 and useful nit values are in
range 0...255 (because of the +1 addition bellow). So I think it should be
  if ((widest_int)1 << TYPE_PRECISION (type) < nit)

> 
> > +  if ((widest_int::from (base_max, base_sgn)
> > +       + widest_int::from (step_max, step_sgn) * (nit + 1))
> > +       > widest_int::from (type_max, sgn)
> > +      || (widest_int::from (type_min, sgn)
> > +	  > (widest_int::from (base_min, base_sgn)
> > +	     + widest_int::from (step_min, step_sgn) * (nit + 1))))
> > +    return true;
> 
> and this lacks any comment...  so it decodes to
> 
>  (base_max + step_max * (nit + 1) > type_max)
>  || (type_min > base_min + step_min * (nit + 1))

Yes, it is trying to compute the final value of IV when loop reaches maximal
number of iteration and check that it is in the range of the target type.
> 
> and it basically assumes infinite precision arithmetic for
> the computation.  As mentioned previously for __int128 widest_int
> does _not_ guarantee this so you need to use FIXED_WIDE_INT
> with a precision of WIDE_INT_MAX_PRECISION * 2 (+1?) like VRP does.

With the NIT check I know that all 4 values are representable in a target type.

     3) widest_int.  This representation is an approximation of
     infinite precision math.  However, it is not really infinite
     precision math as in the GMP library.  It is really finite
     precision math where the precision is 4 times the size of the
     largest integer that the target port can represent.

My understanding is that to do the mutiplcation I need two times of the bits of
TYPE_PRECISION (type) (+2 for the sign and addition perhaps) and widest_int has
4 times. If this is not true, perhaps the comment can be made more explicit?

Thanks for the pointer to vrp, now at least I see how the widening/explicit
precision work.

I suppose then the type should be FIXED_WIDE_INT (WIDE_INT_MAX_PRECISION * 2 + 2)?
> 
> Note as both type_min/max and base_min/max are wide_ints the above
> might be simplified
> 
>   step_max * (nit + 1) > type_max - base_max
> 
> where the subtraction can be carried out in wide_int(?) and you
> should also CSE nit + 1 or transform it further to
> 
>   step_max * nit > type_max - base_max - step_max
> 
> where if I understand correctly, if type_max - base_max - step_max
> underflows we already wrap.

Yes, this look like a good idea and yes, if type_max-base_max-step_max is
negative, we wrap.  I suppose even with some inlining doing manual CSE is not
bad plan.

Thanks, wide_int.h is still bit of black magic to me ;)
Honza
> 
> Thanks,
> Richard.

Index: testsuite/gcc.dg/tree-ssa/scev-14.c
===================================================================
--- testsuite/gcc.dg/tree-ssa/scev-14.c	(revision 0)
+++ testsuite/gcc.dg/tree-ssa/scev-14.c	(working copy)
@@ -0,0 +1,11 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-ivopts-details" } */
+int a[100];
+void t(unsigned int n)
+{
+  unsigned int i;
+  for (i=0; i<n; i++)
+     a[i]++;
+}
+/* { dg-final { scan-tree-dump "Overflowness wrto loop niter:	No-overflow"  "ivopts" } } */
+/* { dg-final { scan-tree-dump-not "Overflowness wrto loop niter:	Overflow" "ivopts" } } */
Index: tree-scalar-evolution.c
===================================================================
--- tree-scalar-evolution.c	(revision 237856)
+++ tree-scalar-evolution.c	(working copy)
@@ -280,6 +280,7 @@  along with GCC; see the file COPYING3.
 #include "params.h"
 #include "tree-ssa-propagate.h"
 #include "gimple-fold.h"
+#include "print-tree.h"
 
 static tree analyze_scalar_evolution_1 (struct loop *, tree, tree);
 static tree analyze_scalar_evolution_for_address_of (struct loop *loop,
@@ -3309,6 +3310,60 @@  scev_reset (void)
     }
 }
 
+/* Return true if the IV calculation in TYPE can overflow based on the knowledge
+   of the upper bound on the number of iterations of LOOP, the BASE and STEP
+   of IV.
+
+   We do not use information whether TYPE can overflow so it is safe to
+   use this test even for derived IVs not computed every iteration or
+   hypotetical IVs to be inserted into code.  */
+
+bool
+iv_can_overflow_p (struct loop *loop, tree type, tree base, tree step)
+{
+  widest_int nit;
+  wide_int base_min, base_max, step_min, step_max, type_min, type_max;
+  signop sgn = TYPE_SIGN (type);
+  signop base_sgn = TYPE_SIGN (TREE_TYPE (base));
+  signop step_sgn = TYPE_SIGN (TREE_TYPE (step));
+
+  if (step == 0)
+    return false;
+
+  if (TREE_CODE (base) == INTEGER_CST)
+    base_min = base_max = base;
+  else if (TREE_CODE (base) == SSA_NAME && !POINTER_TYPE_P (TREE_TYPE (base))
+	   && get_range_info (base, &base_min, &base_max) == VR_RANGE)
+    ;
+  else
+    return true;
+
+  if (TREE_CODE (step) == INTEGER_CST)
+    step_min = step_max = step;
+  else if (TREE_CODE (step) == SSA_NAME && !POINTER_TYPE_P (TREE_TYPE (step))
+	   && get_range_info (step, &step_min, &step_max) == VR_RANGE)
+    ;
+  else
+    return true;
+
+  if (!get_max_loop_iterations (loop, &nit))
+    return true;
+
+  type_min = wi::min_value (type);
+  type_max = wi::max_value (type);
+  /* Watch overflow.  */
+  if ((widest_int)1 << TYPE_PRECISION (type) < nit)
+    return true;
+  if ((widest_int::from (base_max, base_sgn)
+       + widest_int::from (step_max, step_sgn) * (nit + 1))
+       > widest_int::from (type_max, sgn)
+      || (widest_int::from (type_min, sgn)
+	  > (widest_int::from (base_min, base_sgn)
+	     + widest_int::from (step_min, step_sgn) * (nit + 1))))
+    return true;
+  return false;
+}
+
 /* Checks whether use of OP in USE_LOOP behaves as a simple affine iv with
    respect to WRTO_LOOP and returns its base and step in IV if possible
    (see analyze_scalar_evolution_in_loop for more details on USE_LOOP
@@ -3375,8 +3430,12 @@  simple_iv (struct loop *wrto_loop, struc
   if (tree_contains_chrecs (iv->base, NULL))
     return false;
 
-  iv->no_overflow = (!folded_casts && ANY_INTEGRAL_TYPE_P (type)
-		     && TYPE_OVERFLOW_UNDEFINED (type));
+  iv->no_overflow = !folded_casts && nowrap_type_p (type);
+
+  if (!iv->no_overflow
+      && !iv_can_overflow_p (wrto_loop, type, iv->base, iv->step))
+    iv->no_overflow = true;
+
 
   /* Try to simplify iv base:
 
Index: tree-scalar-evolution.h
===================================================================
--- tree-scalar-evolution.h	(revision 237856)
+++ tree-scalar-evolution.h	(working copy)
@@ -38,6 +38,7 @@  extern unsigned int scev_const_prop (voi
 extern bool expression_expensive_p (tree);
 extern bool simple_iv (struct loop *, struct loop *, tree, struct affine_iv *,
 		       bool);
+extern bool iv_can_overflow_p (struct loop *, tree, tree, tree);
 extern tree compute_overall_effect_of_inner_loop (struct loop *, tree);
 
 /* Returns the basic block preceding LOOP, or the CFG entry block when
Index: tree-ssa-loop-niter.c
===================================================================
--- tree-ssa-loop-niter.c	(revision 237856)
+++ tree-ssa-loop-niter.c	(working copy)
@@ -4105,7 +4105,7 @@  n_of_executions_at_most (gimple *stmt,
 bool
 nowrap_type_p (tree type)
 {
-  if (INTEGRAL_TYPE_P (type)
+  if (ANY_INTEGRAL_TYPE_P (type)
       && TYPE_OVERFLOW_UNDEFINED (type))
     return true;

Determine more IVs to be non-overflowing

Commit Message

Comments

Patch