Patchwork PATCH: adjust ivopts costs for -Os

login
register
mail settings
Submitter Sandra Loosemore
Date June 9, 2010, 3:30 a.m.
Message ID <4C0F0ADC.9090909@codesourcery.com>
Download mbox | patch
Permalink /patch/55047/
State New
Headers show

Comments

Sandra Loosemore - June 9, 2010, 3:30 a.m.
This patch fixes a problem I spotted while working on PR42505.  It does not fix 
the test case reported in that issue.

Currently, the ivopts costs model divides costs for IV setup by the expected 
number of loop iterations.  This is OK when optimizing for speed, but when 
optimizing for code size, the code required for IV setup doesn't magically 
become smaller just because it's located outside the loop.

I benchmarked this with CSiBE on x86-64, and the total code size measured was 
3482313 originally, 3481067 with this patch.  I also tried on ARM with the 
options mentioned in PR42505 and came up with 2640825 vs 2639463.  So, as well 
as seeming like the abstractly Right Thing, there does seem to be a measurable 
improvement from this patch.

OK to check in?

-Sandra


2010-06-08  Sandra Loosemore  <sandra@codesourcery.com>

	gcc/
	* tree-ssa-loop-ivopts.c (adjust_setup_cost): New function.
	(get_computation_cost_at): Use it.
	(determine_use_iv_cost_condition): Likewise.
	(determine_iv_cost): Likewise.
Zdenek Dvorak - June 9, 2010, 10:08 a.m.
Hi,

> This patch fixes a problem I spotted while working on PR42505.  It does 
> not fix the test case reported in that issue.
>
> Currently, the ivopts costs model divides costs for IV setup by the 
> expected number of loop iterations.  This is OK when optimizing for 
> speed, but when optimizing for code size, the code required for IV setup 
> doesn't magically become smaller just because it's located outside the 
> loop.
>
> I benchmarked this with CSiBE on x86-64, and the total code size measured 
> was 3482313 originally, 3481067 with this patch.  I also tried on ARM 
> with the options mentioned in PR42505 and came up with 2640825 vs 
> 2639463.  So, as well as seeming like the abstractly Right Thing, there 
> does seem to be a measurable improvement from this patch.
>
> OK to check in?

OK.

Zdenek

Patch

Index: gcc/tree-ssa-loop-ivopts.c
===================================================================
--- gcc/tree-ssa-loop-ivopts.c	(revision 160444)
+++ gcc/tree-ssa-loop-ivopts.c	(working copy)
@@ -2936,6 +2936,20 @@  get_computation (struct loop *loop, stru
   return get_computation_at (loop, use, cand, use->stmt);
 }
 
+/* Adjust the cost COST for being in loop setup rather than loop body.
+   If we're optimizing for space, the loop setup overhead is constant;
+   if we're optimizing for speed, amortize it over the per-iteration cost.  */
+static unsigned
+adjust_setup_cost (struct ivopts_data *data, unsigned cost)
+{
+  if (cost == INFTY)
+    return cost;
+  else if (optimize_loop_for_speed_p (data->current_loop))
+    return cost / AVG_LOOP_NITER (data->current_loop);
+  else
+    return cost;
+}
+
 /* Returns cost of addition in MODE.  */
 
 static unsigned
@@ -3848,8 +3862,8 @@  get_computation_cost_at (struct ivopts_d
   /* Symbol + offset should be compile-time computable so consider that they
       are added once to the variable, if present.  */
   if (var_present && (symbol_present || offset))
-    cost.cost += add_cost (TYPE_MODE (ctype), speed)
-		 / AVG_LOOP_NITER (data->current_loop);
+    cost.cost += adjust_setup_cost (data,
+				    add_cost (TYPE_MODE (ctype), speed));
 
   /* Having offset does not affect runtime cost in case it is added to
      symbol, but it increases complexity.  */
@@ -4114,7 +4128,7 @@  determine_use_iv_cost_condition (struct 
       elim_cost = force_var_cost (data, bound, &depends_on_elim);
       /* The bound is a loop invariant, so it will be only computed
 	 once.  */
-      elim_cost.cost /= AVG_LOOP_NITER (data->current_loop);
+      elim_cost.cost = adjust_setup_cost (data, elim_cost.cost);
     }
   else
     elim_cost = infinite_cost;
@@ -4361,7 +4375,7 @@  determine_iv_cost (struct ivopts_data *d
   cost_base = force_var_cost (data, base, NULL);
   cost_step = add_cost (TYPE_MODE (TREE_TYPE (base)), data->speed);
 
-  cost = cost_step + cost_base.cost / AVG_LOOP_NITER (current_loop);
+  cost = cost_step + adjust_setup_cost (data, cost_base.cost);
 
   /* Prefer the original ivs unless we may gain something by replacing it.
      The reason is to make debugging simpler; so this is not relevant for