Patchwork Reduce complette unrolling & peeling limits

login
register
mail settings
Submitter Jan Hubicka
Date Nov. 21, 2012, 4:25 p.m.
Message ID <20121121162535.GA32567@kam.mff.cuni.cz>
Download mbox | patch
Permalink /patch/200802/
State New
Headers show

Comments

Jan Hubicka - Nov. 21, 2012, 4:25 p.m.
Hi,
here is updated patch.  It should get the bounds safe enough to not have effect
on codegen of complette unrolling.

There is IMO no way to cut the walk of loop body w/o affecting codegen in 
unrolling for size mode.  The condition on unroling to happen is

 unrolled_size * 2 / 3 < original_size

The patch makes the function walking body to stop after minimal number of
duplicated insns is large (PARAM_MAX_COMPLETELY_PEELED_INSNS). The formula
above allows unlimited duplication when loop body is large enough. This is
more a bug than feature, so I think it is safe to alter it.

Bootstrapped/regtested x86_64-linux, OK?

Honza

 	* tree-ssa-loop-ivcanon.c (tree_estimate_loop_size): Add UPPER_BOUND
 	parameter.
 	(try_unroll_loop_completely) Update.
Eric Botcazou - Dec. 3, 2012, 12:03 p.m.
> here is updated patch.  It should get the bounds safe enough to not have
> effect on codegen of complette unrolling.
> 
> There is IMO no way to cut the walk of loop body w/o affecting codegen in
> unrolling for size mode.  The condition on unroling to happen is
> 
>  unrolled_size * 2 / 3 < original_size
> 
> The patch makes the function walking body to stop after minimal number of
> duplicated insns is large (PARAM_MAX_COMPLETELY_PEELED_INSNS). The formula
> above allows unlimited duplication when loop body is large enough. This is
> more a bug than feature, so I think it is safe to alter it.
> 
> Bootstrapped/regtested x86_64-linux, OK?
> 
> Honza
> 
>  	* tree-ssa-loop-ivcanon.c (tree_estimate_loop_size): Add UPPER_BOUND
>  	parameter.
>  	(try_unroll_loop_completely) Update.

The patch hasn't been installed, has it?  The test still takes 20s to compile 
at -O3 on a fast x86-64 box, so you can imagine what this yields on slower 
machines (and that's before the x4 because of the various dg-torture options).
Jan Hubicka - Dec. 4, 2012, 6:27 p.m.
> > here is updated patch.  It should get the bounds safe enough to not have
> > effect on codegen of complette unrolling.
> > 
> > There is IMO no way to cut the walk of loop body w/o affecting codegen in
> > unrolling for size mode.  The condition on unroling to happen is
> > 
> >  unrolled_size * 2 / 3 < original_size
> > 
> > The patch makes the function walking body to stop after minimal number of
> > duplicated insns is large (PARAM_MAX_COMPLETELY_PEELED_INSNS). The formula
> > above allows unlimited duplication when loop body is large enough. This is
> > more a bug than feature, so I think it is safe to alter it.
> > 
> > Bootstrapped/regtested x86_64-linux, OK?
> > 
> > Honza
> > 
> >  	* tree-ssa-loop-ivcanon.c (tree_estimate_loop_size): Add UPPER_BOUND
> >  	parameter.
> >  	(try_unroll_loop_completely) Update.
> 
> The patch hasn't been installed, has it?  The test still takes 20s to compile 
> at -O3 on a fast x86-64 box, so you can imagine what this yields on slower 
> machines (and that's before the x4 because of the various dg-torture options).

Yes, I need approval for this one.
http://gcc.gnu.org/ml/gcc-patches/2012-11/msg01798.html

Honza
Richard Guenther - Dec. 6, 2012, 9:31 a.m.
On Tue, 4 Dec 2012, Jan Hubicka wrote:

> > > here is updated patch.  It should get the bounds safe enough to not have
> > > effect on codegen of complette unrolling.
> > > 
> > > There is IMO no way to cut the walk of loop body w/o affecting codegen in
> > > unrolling for size mode.  The condition on unroling to happen is
> > > 
> > >  unrolled_size * 2 / 3 < original_size
> > > 
> > > The patch makes the function walking body to stop after minimal number of
> > > duplicated insns is large (PARAM_MAX_COMPLETELY_PEELED_INSNS). The formula
> > > above allows unlimited duplication when loop body is large enough. This is
> > > more a bug than feature, so I think it is safe to alter it.
> > > 
> > > Bootstrapped/regtested x86_64-linux, OK?
> > > 
> > > Honza
> > > 
> > >  	* tree-ssa-loop-ivcanon.c (tree_estimate_loop_size): Add UPPER_BOUND
> > >  	parameter.
> > >  	(try_unroll_loop_completely) Update.
> > 
> > The patch hasn't been installed, has it?  The test still takes 20s to compile 
> > at -O3 on a fast x86-64 box, so you can imagine what this yields on slower 
> > machines (and that's before the x4 because of the various dg-torture options).
> 
> Yes, I need approval for this one.
> http://gcc.gnu.org/ml/gcc-patches/2012-11/msg01798.html

Ok.

Thanks,
Richard.

Patch

Index: tree-ssa-loop-ivcanon.c
===================================================================
--- tree-ssa-loop-ivcanon.c	(revision 193694)
+++ tree-ssa-loop-ivcanon.c	(working copy)
@@ -1,5 +1,5 @@ 
-/* Induction variable canonicalization.
-   Copyright (C) 2004, 2005, 2007, 2008, 2010
+/* Induction variable canonicalization and loop peeling.
+   Copyright (C) 2004, 2005, 2007, 2008, 2010, 2012
    Free Software Foundation, Inc.
 
 This file is part of GCC.
@@ -207,10 +210,12 @@  constant_after_peeling (tree op, gimple
    iteration of the loop.
    EDGE_TO_CANCEL (if non-NULL) is an non-exit edge eliminated in the last iteration
    of loop.
-   Return results in SIZE, estimate benefits for complete unrolling exiting by EXIT.  */
+   Return results in SIZE, estimate benefits for complete unrolling exiting by EXIT. 
+   Stop estimating after UPPER_BOUND is met. Return true in this case */
 
-static void
-tree_estimate_loop_size (struct loop *loop, edge exit, edge edge_to_cancel, struct loop_size *size)
+static bool
+tree_estimate_loop_size (struct loop *loop, edge exit, edge edge_to_cancel, struct loop_size *size,
+			 int upper_bound)
 {
   basic_block *body = get_loop_body (loop);
   gimple_stmt_iterator gsi;
@@ -316,6 +321,12 @@  tree_estimate_loop_size (struct loop *lo
 	      if (likely_eliminated || likely_eliminated_last)
 		size->last_iteration_eliminated_by_peeling += num;
 	    }
+	  if ((size->overall * 3 / 2 - size->eliminated_by_peeling
+	      - size->last_iteration_eliminated_by_peeling) > upper_bound)
+	    {
+              free (body);
+	      return true;
+	    }
 	}
     }
   while (path.length ())
@@ -357,6 +368,7 @@  tree_estimate_loop_size (struct loop *lo
 	     size->last_iteration_eliminated_by_peeling);
 
   free (body);
+  return false;
 }
 
 /* Estimate number of insns of completely unrolled loop.
@@ -699,12 +711,22 @@  try_unroll_loop_completely (struct loop
       sbitmap wont_exit;
       edge e;
       unsigned i;
-      vec<edge> to_remove = vNULL;
+      bool large;
+      vec<edge> to_remove = vNULL;
       if (ul == UL_SINGLE_ITER)
 	return false;
 
-      tree_estimate_loop_size (loop, exit, edge_to_cancel, &size);
+      large = tree_estimate_loop_size
+		 (loop, exit, edge_to_cancel, &size,
+		  PARAM_VALUE (PARAM_MAX_COMPLETELY_PEELED_INSNS));
       ninsns = size.overall;
+      if (large)
+	{
+	  if (dump_file && (dump_flags & TDF_DETAILS))
+	    fprintf (dump_file, "Not unrolling loop %d: it is too large.\n",
+		     loop->num);
+	  return false;
+	}
 
       unr_insns = estimated_unrolled_size (&size, n_unroll);
       if (dump_file && (dump_flags & TDF_DETAILS))