Change IVOPTS and strength reduction to use expmed cost model

Message ID 1343232822.4638.16.camel@oc2474580526.ibm.com
State New

Commit Message

Bill Schmidt July 25, 2012, 4:13 p.m. UTC
Per Richard Henderson's suggestion
(http://gcc.gnu.org/ml/gcc-patches/2012-06/msg01370.html), this patch
changes the IVOPTS and straight-line strength reduction passes to make
use of data computed by init_expmed.  This required adding a new
convert_cost array in expmed to store the costs of converting between
various scalar integer modes, and exposing expmed's multiplication hash
table for external use (new function mult_by_coeff_cost).  Richard H,
I'd appreciate it if you could look at what I did there and make sure
it's correct.  Thanks!

I decided it wasn't worth distinguishing between reg-reg add costs and
reg-constant add costs, so I simplified the strength reduction
calculations rather than adding another array to expmed for this
purpose.  But I can make this distinction if that's preferable.

Bootstrapped and tested on powerpc64-unknown-linux-gnu with no new
regressions.  Ok for trunk?

Thanks,
Bill
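
The fallback rule in the new mult_by_coeff_cost can be illustrated with a
small standalone model. This is only a sketch, not GCC code: the toy_*
names, the unit costs, and the bit-counting pricing are all assumptions
made for illustration, standing in for choose_mult_variant and the
target's rtx costs. What it mirrors from the patch is the shape of the
decision: price a shift/add synthesis against the cost of a general
multiply, and fall back to the multiply cost when no cheaper sequence is
found.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical unit costs; in GCC these come from the target's rtx costs.  */
enum { TOY_ADD_COST = 4, TOY_SHIFT_COST = 4, TOY_MULT_COST = 16 };

/* Toy stand-in for choose_mult_variant (an assumption, not the real
   algorithm): price COEFF as one shift per set bit above bit 0 plus one
   add to join each pair of partial products.  Return true and store the
   price in *COST only when it is strictly below MAX_COST.  */
static bool
toy_choose_mult (unsigned long coeff, int max_cost, int *cost)
{
  int bits = 0;
  int shifts = 0;

  for (int pos = 0; coeff != 0; coeff >>= 1, pos++)
    if (coeff & 1)
      {
        bits++;
        if (pos > 0)
          shifts++;
      }

  if (bits == 0)
    return false;

  int total = shifts * TOY_SHIFT_COST + (bits - 1) * TOY_ADD_COST;
  if (total >= max_cost)
    return false;

  *cost = total;
  return true;
}

/* Same shape as the patch's mult_by_coeff_cost: use the synthesized
   sequence when it beats a general multiply, else charge the multiply.  */
int
toy_mult_by_coeff_cost (unsigned long coeff)
{
  int cost;

  assert (coeff != 0);
  if (toy_choose_mult (coeff, TOY_MULT_COST, &cost))
    return cost;
  return TOY_MULT_COST;
}
```

Under these made-up costs, multiplying by 8 is priced as a single shift,
while multiplying by 7 is no cheaper than a multiply and is charged the
multiply cost.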


2012-07-25  Bill Schmidt  <wschmidt@linux.ibm.com>

	* tree-ssa-loop-ivopts.c (mbc_entry_hash): Remove.
	(mbc_entry_eq): Likewise.
	(mult_costs): Likewise.
	(cost_tables_exist): Likewise.
	(initialize_costs): Likewise.
	(finalize_costs): Likewise.
	(tree_ssa_iv_optimize_init): Remove call to initialize_costs.
	(add_regs_cost): Remove.
	(multiply_regs_cost): Likewise.
	(add_const_cost): Likewise.
	(extend_or_trunc_reg_cost): Likewise.
	(negate_reg_cost): Likewise.
	(struct mbc_entry): Likewise.
	(multiply_by_const_cost): Likewise.
	(get_address_cost): Change add_regs_cost calls to add_cost lookups;
	change multiply_by_const_cost to mult_by_coeff_cost.
	(force_expr_to_var_cost): Likewise.
	(difference_cost): Change multiply_by_const_cost to mult_by_coeff_cost.
	(get_computation_cost_at): Change add_regs_cost calls to add_cost
	lookups; change multiply_by_const_cost to mult_by_coeff_cost.
	(determine_iv_cost): Change add_regs_cost calls to add_cost lookups.
	(tree_ssa_iv_optimize_finalize): Remove call to finalize_costs.
	* tree-ssa-address.c (expmed.h): New #include.
	(most_expensive_mult_to_index): Change multiply_by_const_cost to
	mult_by_coeff_cost.
	* gimple-ssa-strength-reduction.c (expmed.h): New #include.
	(stmt_cost): Change to use mult_by_coeff_cost, mul_cost, add_cost,
	neg_cost, and convert_cost instead of IVOPTS interfaces.
	(execute_strength_reduction): Remove calls to initialize_costs and
	finalize_costs.
	* expmed.c (struct init_expmed_rtl): Add convert rtx_def.
	(init_expmed_one_mode): Initialize convert rtx_def; initialize
	convert_cost for related modes.
	(mult_by_coeff_cost): New function.
	* expmed.h (struct target_expmed): Add x_convert_cost matrix.
	(convert_cost): New #define.
	(mult_by_coeff_cost): New extern decl.
	* tree-flow.h (initialize_costs): Remove decl.
	(finalize_costs): Likewise.
	(multiply_by_const_cost): Likewise.
	(add_regs_cost): Likewise.
	(multiply_regs_cost): Likewise.
	(add_const_cost): Likewise.
	(extend_or_trunc_reg_cost): Likewise.
	(negate_reg_cost): Likewise.
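
The layout of the new convert_cost table (speed first, then destination
mode, then source mode, with entries defined only for modes of different
sizes) can be sketched as below. This is a standalone model rather than
GCC code: the toy_* modes, their sizes, and the two unit costs are
assumptions chosen for illustration. As in the patch, a narrowing
conversion is priced as a truncate and a widening one as a zero-extend.

```c
#include <assert.h>

/* Hypothetical scalar integer modes, narrowest first, sizes in bytes.  */
enum toy_mode { TOY_QI, TOY_HI, TOY_SI, TOY_DI, TOY_NUM_MODES };
static const int toy_mode_size[TOY_NUM_MODES] = { 1, 2, 4, 8 };

/* Hypothetical unit costs; GCC would ask the target via set_src_cost.  */
enum { TOY_TRUNC_COST = 1, TOY_EXTEND_COST = 4 };

/* Indexed [speed][to][from]: destination mode first, source mode second.  */
static int toy_convert_cost[2][TOY_NUM_MODES][TOY_NUM_MODES];

/* Fill every entry whose two modes differ in size; same-size pairs are
   left at zero because no conversion cost is defined for them.  */
void
toy_init_convert_costs (void)
{
  for (int speed = 0; speed < 2; speed++)
    for (int to = 0; to < TOY_NUM_MODES; to++)
      for (int from = 0; from < TOY_NUM_MODES; from++)
        {
          if (toy_mode_size[to] < toy_mode_size[from])
            toy_convert_cost[speed][to][from] = TOY_TRUNC_COST;
          else if (toy_mode_size[from] < toy_mode_size[to])
            toy_convert_cost[speed][to][from] = TOY_EXTEND_COST;
        }
}

/* Look up the cost of converting FROM into TO.  */
int
toy_lookup_convert_cost (int speed, enum toy_mode to, enum toy_mode from)
{
  assert (to != from);
  return toy_convert_cost[speed][to][from];
}
```

A caller such as stmt_cost would pass the mode of the left-hand side as
the destination and the mode of the operand as the source.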

Comments

Richard Henderson July 25, 2012, 4:59 p.m. UTC | #1
On 07/25/2012 09:13 AM, William J. Schmidt wrote:
> Per Richard Henderson's suggestion
> (http://gcc.gnu.org/ml/gcc-patches/2012-06/msg01370.html), this patch
> changes the IVOPTS and straight-line strength reduction passes to make
> use of data computed by init_expmed.  This required adding a new
> convert_cost array in expmed to store the costs of converting between
> various scalar integer modes, and exposing expmed's multiplication hash
> table for external use (new function mult_by_coeff_cost).  Richard H,
> I'd appreciate it if you could look at what I did there and make sure
> it's correct.  Thanks!

Correctness looks good.

> I decided it wasn't worth distinguishing between reg-reg add costs and
> reg-constant add costs, so I simplified the strength reduction
> calculations rather than adding another array to expmed for this
> purpose.  But I can make this distinction if that's preferable.

I don't think this is worth thinking about at this level.  This is
something that some rtl-level optimization ought to be able to fix
up trivially, e.g. cse.

> Index: gcc/expmed.h
> ===================================================================
> --- gcc/expmed.h	(revision 189845)
> +++ gcc/expmed.h	(working copy)
> @@ -155,6 +155,11 @@ struct target_expmed {
>    int x_udiv_cost[2][NUM_MACHINE_MODES];
>    int x_mul_widen_cost[2][NUM_MACHINE_MODES];
>    int x_mul_highpart_cost[2][NUM_MACHINE_MODES];
> +
> +  /* Conversion costs are only defined between two scalar integer modes
> +     of different sizes.  The first machine mode is the destination mode,
> +     and the second is the source mode.  */
> +  int x_convert_cost[2][NUM_MACHINE_MODES][NUM_MACHINE_MODES];
>  };

2 * NUM_MACHINE_MODES * NUM_MACHINE_MODES is quite large...  I think we could do better with

#define NUM_MODE_INT (MAX_MODE_INT - MIN_MODE_INT + 1)

  x_convert_cost[2][NUM_MODE_INT][NUM_MODE_INT];

though really that could be done with all of these fields all at once.

That does suggest it would be better to leave at least inline functions
to access these elements, rather than open code the array access.


r~
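
The review suggestion above (index the table by integer modes only, and
hide the indexing behind inline accessors) might look roughly like the
following. This is a standalone sketch under stated assumptions: the
toy_* enumeration and accessor names are invented for illustration and
are not GCC's; only the NUM_MODE_INT formula comes from the review.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical mode enumeration in which the integer modes form one
   contiguous block surrounded by other modes.  */
enum toy_mode
{
  TOY_VOID, TOY_BLK,
  TOY_QI, TOY_HI, TOY_SI, TOY_DI,
  TOY_SF, TOY_DF,
  TOY_NUM_MODES
};

#define TOY_MIN_MODE_INT TOY_QI
#define TOY_MAX_MODE_INT TOY_DI
#define TOY_NUM_MODE_INT (TOY_MAX_MODE_INT - TOY_MIN_MODE_INT + 1)

/* Compact table: only integer modes get rows and columns.  */
static int toy_compact_convert_cost[2][TOY_NUM_MODE_INT][TOY_NUM_MODE_INT];

/* Single place that knows how a mode maps to a table index.  */
static inline int *
toy_convert_cost_ptr (bool speed, enum toy_mode to, enum toy_mode from)
{
  assert (to >= TOY_MIN_MODE_INT && to <= TOY_MAX_MODE_INT);
  assert (from >= TOY_MIN_MODE_INT && from <= TOY_MAX_MODE_INT);
  return &toy_compact_convert_cost[speed][to - TOY_MIN_MODE_INT]
                                  [from - TOY_MIN_MODE_INT];
}

static inline void
toy_set_convert_cost (bool speed, enum toy_mode to, enum toy_mode from,
                      int cost)
{
  *toy_convert_cost_ptr (speed, to, from) = cost;
}

static inline int
toy_get_convert_cost (bool speed, enum toy_mode to, enum toy_mode from)
{
  return *toy_convert_cost_ptr (speed, to, from);
}
```

With accessors like these, callers never open-code the array subscript,
so the table's dimensions can shrink without touching any user.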

Patch

Index: gcc/tree-ssa-loop-ivopts.c
===================================================================
--- gcc/tree-ssa-loop-ivopts.c	(revision 189845)
+++ gcc/tree-ssa-loop-ivopts.c	(working copy)
@@ -88,9 +88,6 @@  along with GCC; see the file COPYING3.  If not see
 #include "tree-ssa-propagate.h"
 #include "expmed.h"
 
-static hashval_t mbc_entry_hash (const void *);
-static int mbc_entry_eq (const void*, const void *);
-
 /* FIXME: Expressions are expanded to RTL in this pass to determine the
    cost of different addressing modes.  This should be moved to a TBD
    interface between the GIMPLE and RTL worlds.  */
@@ -381,11 +378,6 @@  struct iv_ca_delta
 
 static VEC(tree,heap) *decl_rtl_to_reset;
 
-/* Cached costs for multiplies by constants, and a flag to indicate
-   when they're valid.  */
-static htab_t mult_costs[2];
-static bool cost_tables_exist = false;
-
 static comp_cost force_expr_to_var_cost (tree, bool);
 
 /* Number of uses recorded in DATA.  */
@@ -851,26 +843,6 @@  htab_inv_expr_hash (const void *ent)
   return expr->hash;
 }
 
-/* Allocate data structures for the cost model.  */
-
-void
-initialize_costs (void)
-{
-  mult_costs[0] = htab_create (100, mbc_entry_hash, mbc_entry_eq, free);
-  mult_costs[1] = htab_create (100, mbc_entry_hash, mbc_entry_eq, free);
-  cost_tables_exist = true;
-}
-
-/* Release data structures for the cost model.  */
-
-void
-finalize_costs (void)
-{
-  cost_tables_exist = false;
-  htab_delete (mult_costs[0]);
-  htab_delete (mult_costs[1]);
-}
-
 /* Initializes data structures used by the iv optimization pass, stored
    in DATA.  */
 
@@ -889,8 +861,6 @@  tree_ssa_iv_optimize_init (struct ivopts_data *dat
                                     htab_inv_expr_eq, free);
   data->inv_expr_id = 0;
   decl_rtl_to_reset = VEC_alloc (tree, heap, 20);
-
-  initialize_costs ();
 }
 
 /* Returns a memory object to that EXPR points.  In case we are able to
@@ -3077,250 +3047,6 @@  adjust_setup_cost (struct ivopts_data *data, unsig
     return cost;
 }
 
-/* Returns cost of addition in MODE.  */
-
-unsigned
-add_regs_cost (enum machine_mode mode, bool speed)
-{
-  static unsigned costs[NUM_MACHINE_MODES][2];
-  rtx seq;
-  unsigned cost;
-
-  if (costs[mode][speed])
-    return costs[mode][speed];
-
-  start_sequence ();
-  force_operand (gen_rtx_fmt_ee (PLUS, mode,
-				 gen_raw_REG (mode, LAST_VIRTUAL_REGISTER + 1),
-				 gen_raw_REG (mode, LAST_VIRTUAL_REGISTER + 2)),
-		 NULL_RTX);
-  seq = get_insns ();
-  end_sequence ();
-
-  cost = seq_cost (seq, speed);
-  if (!cost)
-    cost = 1;
-
-  costs[mode][speed] = cost;
-
-  if (dump_file && (dump_flags & TDF_DETAILS))
-    fprintf (dump_file, "Addition in %s costs %d\n",
-	     GET_MODE_NAME (mode), cost);
-  return cost;
-}
-
-/* Returns cost of multiplication in MODE.  */
-
-unsigned
-multiply_regs_cost (enum machine_mode mode, bool speed)
-{
-  static unsigned costs[NUM_MACHINE_MODES][2];
-  rtx seq;
-  unsigned cost;
-
-  if (costs[mode][speed])
-    return costs[mode][speed];
-
-  start_sequence ();
-  force_operand (gen_rtx_fmt_ee (MULT, mode,
-				 gen_raw_REG (mode, LAST_VIRTUAL_REGISTER + 1),
-				 gen_raw_REG (mode, LAST_VIRTUAL_REGISTER + 2)),
-		 NULL_RTX);
-  seq = get_insns ();
-  end_sequence ();
-
-  cost = seq_cost (seq, speed);
-  if (!cost)
-    cost = 1;
-
-  costs[mode][speed] = cost;
-
-  if (dump_file && (dump_flags & TDF_DETAILS))
-    fprintf (dump_file, "Multiplication in %s costs %d\n",
-	     GET_MODE_NAME (mode), cost);
-  return cost;
-}
-
-/* Returns cost of addition with a constant in MODE.  */
-
-unsigned
-add_const_cost (enum machine_mode mode, bool speed)
-{
-  static unsigned costs[NUM_MACHINE_MODES][2];
-  rtx seq;
-  unsigned cost;
-
-  if (costs[mode][speed])
-    return costs[mode][speed];
-
-  /* Arbitrarily generate insns for x + 2, as the exact constant
-     shouldn't matter.  */
-  start_sequence ();
-  force_operand (gen_rtx_fmt_ee (PLUS, mode,
-				 gen_raw_REG (mode, LAST_VIRTUAL_REGISTER + 1),
-				 gen_int_mode (2, mode)),
-		 NULL_RTX);
-  seq = get_insns ();
-  end_sequence ();
-
-  cost = seq_cost (seq, speed);
-  if (!cost)
-    cost = 1;
-
-  costs[mode][speed] = cost;
-
-  if (dump_file && (dump_flags & TDF_DETAILS))
-    fprintf (dump_file, "Addition to constant in %s costs %d\n",
-	     GET_MODE_NAME (mode), cost);
-  return cost;
-}
-
-/* Returns cost of extend or truncate in MODE.  */
-
-unsigned
-extend_or_trunc_reg_cost (tree type_to, tree type_from, bool speed)
-{
-  static unsigned costs[NUM_MACHINE_MODES][NUM_MACHINE_MODES][2];
-  rtx seq;
-  unsigned cost;
-  enum machine_mode mode_to = TYPE_MODE (type_to);
-  enum machine_mode mode_from = TYPE_MODE (type_from);
-  tree size_to = TYPE_SIZE (type_to);
-  tree size_from = TYPE_SIZE (type_from);
-  enum rtx_code code;
-
-  gcc_assert (TREE_CODE (size_to) == INTEGER_CST
-	      && TREE_CODE (size_from) == INTEGER_CST);
-
-  if (costs[mode_to][mode_from][speed])
-    return costs[mode_to][mode_from][speed];
-
-  if (tree_int_cst_lt (size_to, size_from))
-    code = TRUNCATE;
-  else if (TYPE_UNSIGNED (type_to))
-    code = ZERO_EXTEND;
-  else
-    code = SIGN_EXTEND;
-
-  start_sequence ();
-  gen_rtx_fmt_e (code, mode_to,
-		 gen_raw_REG (mode_from, LAST_VIRTUAL_REGISTER + 1));
-  seq = get_insns ();
-  end_sequence ();
-
-  cost = seq_cost (seq, speed);
-  if (!cost)
-    cost = 1;
-
-  if (dump_file && (dump_flags & TDF_DETAILS))
-    fprintf (dump_file, "Conversion from %s to %s costs %d\n",
-	     GET_MODE_NAME (mode_to), GET_MODE_NAME (mode_from), cost);
-
-  costs[mode_to][mode_from][speed] = cost;
-  return cost;
-}
-
-/* Returns cost of negation in MODE.  */
-
-unsigned
-negate_reg_cost (enum machine_mode mode, bool speed)
-{
-  static unsigned costs[NUM_MACHINE_MODES][2];
-  rtx seq;
-  unsigned cost;
-
-  if (costs[mode][speed])
-    return costs[mode][speed];
-
-  start_sequence ();
-  force_operand (gen_rtx_fmt_e (NEG, mode,
-				gen_raw_REG (mode, LAST_VIRTUAL_REGISTER + 1)),
-		 NULL_RTX);
-  seq = get_insns ();
-  end_sequence ();
-
-  cost = seq_cost (seq, speed);
-  if (!cost)
-    cost = 1;
-
-  costs[mode][speed] = cost;
-
-  if (dump_file && (dump_flags & TDF_DETAILS))
-    fprintf (dump_file, "Negation in %s costs %d\n",
-	     GET_MODE_NAME (mode), cost);
-  return cost;
-}
-
-/* Entry in a hashtable of already known costs for multiplication.  */
-struct mbc_entry
-{
-  HOST_WIDE_INT cst;		/* The constant to multiply by.  */
-  enum machine_mode mode;	/* In mode.  */
-  unsigned cost;		/* The cost.  */
-};
-
-/* Counts hash value for the ENTRY.  */
-
-static hashval_t
-mbc_entry_hash (const void *entry)
-{
-  const struct mbc_entry *e = (const struct mbc_entry *) entry;
-
-  return 57 * (hashval_t) e->mode + (hashval_t) (e->cst % 877);
-}
-
-/* Compares the hash table entries ENTRY1 and ENTRY2.  */
-
-static int
-mbc_entry_eq (const void *entry1, const void *entry2)
-{
-  const struct mbc_entry *e1 = (const struct mbc_entry *) entry1;
-  const struct mbc_entry *e2 = (const struct mbc_entry *) entry2;
-
-  return (e1->mode == e2->mode
-	  && e1->cst == e2->cst);
-}
-
-/* Returns cost of multiplication by constant CST in MODE.  */
-
-unsigned
-multiply_by_const_cost (HOST_WIDE_INT cst, enum machine_mode mode, bool speed)
-{
-  struct mbc_entry **cached, act;
-  rtx seq;
-  unsigned cost;
-
-  gcc_assert (cost_tables_exist);
-
-  act.mode = mode;
-  act.cst = cst;
-  cached = (struct mbc_entry **)
-    htab_find_slot (mult_costs[speed], &act, INSERT);
-    
-  if (*cached)
-    return (*cached)->cost;
-
-  *cached = XNEW (struct mbc_entry);
-  (*cached)->mode = mode;
-  (*cached)->cst = cst;
-
-  start_sequence ();
-  expand_mult (mode, gen_raw_REG (mode, LAST_VIRTUAL_REGISTER + 1),
-	       gen_int_mode (cst, mode), NULL_RTX, 0);
-  seq = get_insns ();
-  end_sequence ();
-
-  cost = seq_cost (seq, speed);
-
-  if (dump_file && (dump_flags & TDF_DETAILS))
-    fprintf (dump_file, "Multiplication by %d in %s costs %d\n",
-	     (int) cst, GET_MODE_NAME (mode), cost);
-
-  (*cached)->cost = cost;
-
-  return cost;
-}
-
 /* Returns true if multiplying by RATIO is allowed in an address.  Test the
    validity for a memory reference accessing memory of mode MODE in
    address space AS.  */
@@ -3582,7 +3308,7 @@  get_address_cost (bool symbol_present, bool var_pr
 	 If VAR_PRESENT is true, try whether the mode with
 	 SYMBOL_PRESENT = false is cheaper even with cost of addition, and
 	 if this is the case, use it.  */
-      add_c = add_regs_cost (address_mode, speed);
+      add_c = add_cost[speed][address_mode];
       for (i = 0; i < 8; i++)
 	{
 	  var_p = i & 1;
@@ -3663,10 +3389,10 @@  get_address_cost (bool symbol_present, bool var_pr
 	     && multiplier_allowed_in_address_p (ratio, mem_mode, as));
 
   if (ratio != 1 && !ratio_p)
-    cost += multiply_by_const_cost (ratio, address_mode, speed);
+    cost += mult_by_coeff_cost (ratio, address_mode, speed);
 
   if (s_offset && !offset_p && !symbol_present)
-    cost += add_regs_cost (address_mode, speed);
+    cost += add_cost[speed][address_mode];
 
   if (may_autoinc)
     *may_autoinc = autoinc;
@@ -3833,7 +3559,7 @@  force_expr_to_var_cost (tree expr, bool speed)
     case PLUS_EXPR:
     case MINUS_EXPR:
     case NEGATE_EXPR:
-      cost = new_cost (add_regs_cost (mode, speed), 0);
+      cost = new_cost (add_cost[speed][mode], 0);
       if (TREE_CODE (expr) != NEGATE_EXPR)
         {
           tree mult = NULL_TREE;
@@ -3853,11 +3579,11 @@  force_expr_to_var_cost (tree expr, bool speed)
 
     case MULT_EXPR:
       if (cst_and_fits_in_hwi (op0))
-	cost = new_cost (multiply_by_const_cost (int_cst_value (op0),
-						 mode, speed), 0);
+	cost = new_cost (mult_by_coeff_cost (int_cst_value (op0),
+					     mode, speed), 0);
       else if (cst_and_fits_in_hwi (op1))
-	cost = new_cost (multiply_by_const_cost (int_cst_value (op1),
-						 mode, speed), 0);
+	cost = new_cost (mult_by_coeff_cost (int_cst_value (op1),
+					     mode, speed), 0);
       else
 	return new_cost (target_spill_cost [speed], 0);
       break;
@@ -4023,7 +3749,7 @@  difference_cost (struct ivopts_data *data,
   if (integer_zerop (e1))
     {
       comp_cost cost = force_var_cost (data, e2, depends_on);
-      cost.cost += multiply_by_const_cost (-1, mode, data->speed);
+      cost.cost += mult_by_coeff_cost (-1, mode, data->speed);
       return cost;
     }
 
@@ -4334,7 +4060,7 @@  get_computation_cost_at (struct ivopts_data *data,
 					 &symbol_present, &var_present,
 					 &offset, depends_on));
       cost.cost /= avg_loop_niter (data->current_loop);
-      cost.cost += add_regs_cost (TYPE_MODE (ctype), data->speed);
+      cost.cost += add_cost[data->speed][TYPE_MODE (ctype)];
     }
 
   if (inv_expr_id)
@@ -4367,7 +4093,7 @@  get_computation_cost_at (struct ivopts_data *data,
   if (!symbol_present && !var_present && !offset)
     {
       if (ratio != 1)
-	cost.cost += multiply_by_const_cost (ratio, TYPE_MODE (ctype), speed);
+	cost.cost += mult_by_coeff_cost (ratio, TYPE_MODE (ctype), speed);
       return cost;
     }
 
@@ -4375,18 +4101,18 @@  get_computation_cost_at (struct ivopts_data *data,
       are added once to the variable, if present.  */
   if (var_present && (symbol_present || offset))
     cost.cost += adjust_setup_cost (data,
-				    add_regs_cost (TYPE_MODE (ctype), speed));
+				    add_cost[speed][TYPE_MODE (ctype)]);
 
   /* Having offset does not affect runtime cost in case it is added to
      symbol, but it increases complexity.  */
   if (offset)
     cost.complexity++;
 
-  cost.cost += add_regs_cost (TYPE_MODE (ctype), speed);
+  cost.cost += add_cost[speed][TYPE_MODE (ctype)];
 
   aratio = ratio > 0 ? ratio : -ratio;
   if (aratio != 1)
-    cost.cost += multiply_by_const_cost (aratio, TYPE_MODE (ctype), speed);
+    cost.cost += mult_by_coeff_cost (aratio, TYPE_MODE (ctype), speed);
   return cost;
 
 fallback:
@@ -5232,7 +4958,7 @@  determine_iv_cost (struct ivopts_data *data, struc
      or a const set.  */
   if (cost_base.cost == 0)
     cost_base.cost = COSTS_N_INSNS (1);
-  cost_step = add_regs_cost (TYPE_MODE (TREE_TYPE (base)), data->speed);
+  cost_step = add_cost[data->speed][TYPE_MODE (TREE_TYPE (base))];
 
   cost = cost_step + adjust_setup_cost (data, cost_base.cost);
 
@@ -6804,8 +6530,6 @@  tree_ssa_iv_optimize_finalize (struct ivopts_data
   VEC_free (iv_use_p, heap, data->iv_uses);
   VEC_free (iv_cand_p, heap, data->iv_candidates);
   htab_delete (data->inv_expr_tab);
-
-  finalize_costs ();
 }
 
 /* Returns true if the loop body BODY includes any function calls.  */
Index: gcc/tree-ssa-address.c
===================================================================
--- gcc/tree-ssa-address.c	(revision 189845)
+++ gcc/tree-ssa-address.c	(working copy)
@@ -42,6 +42,7 @@  along with GCC; see the file COPYING3.  If not see
 #include "expr.h"
 #include "ggc.h"
 #include "target.h"
+#include "expmed.h"
 
 /* TODO -- handling of symbols (according to Richard Hendersons
    comments, http://gcc.gnu.org/ml/gcc-patches/2005-04/msg00949.html):
@@ -554,7 +555,7 @@  most_expensive_mult_to_index (tree type, struct me
 	  || !multiplier_allowed_in_address_p (coef, TYPE_MODE (type), as))
 	continue;
 
-      acost = multiply_by_const_cost (coef, address_mode, speed);
+      acost = mult_by_coeff_cost (coef, address_mode, speed);
 
       if (acost > best_mult_cost)
 	{
Index: gcc/gimple-ssa-strength-reduction.c
===================================================================
--- gcc/gimple-ssa-strength-reduction.c	(revision 189845)
+++ gcc/gimple-ssa-strength-reduction.c	(working copy)
@@ -54,6 +54,7 @@  along with GCC; see the file COPYING3.  If not see
 #include "tree-flow.h"
 #include "domwalk.h"
 #include "pointer-set.h"
+#include "expmed.h"
 
 /* Information about a strength reduction candidate.  Each statement
    in the candidate table represents an expression of one of the
@@ -340,29 +341,22 @@  stmt_cost (gimple gs, bool speed)
       rhs2 = gimple_assign_rhs2 (gs);
 
       if (host_integerp (rhs2, 0))
-	return multiply_by_const_cost (TREE_INT_CST_LOW (rhs2), lhs_mode,
-				       speed);
+	return mult_by_coeff_cost (TREE_INT_CST_LOW (rhs2), lhs_mode, speed);
 
       gcc_assert (TREE_CODE (rhs1) != INTEGER_CST);
-      return multiply_regs_cost (TYPE_MODE (TREE_TYPE (lhs)), speed);
+      return mul_cost[speed][lhs_mode];
 
     case PLUS_EXPR:
     case POINTER_PLUS_EXPR:
     case MINUS_EXPR:
       rhs2 = gimple_assign_rhs2 (gs);
+      return add_cost[speed][lhs_mode];
 
-      if (host_integerp (rhs2, 0))
-	return add_const_cost (TYPE_MODE (TREE_TYPE (rhs1)), speed);
-
-      gcc_assert (TREE_CODE (rhs1) != INTEGER_CST);
-      return add_regs_cost (lhs_mode, speed);
-
     case NEGATE_EXPR:
-      return negate_reg_cost (lhs_mode, speed);
+      return neg_cost[speed][lhs_mode];
 
     case NOP_EXPR:
-      return extend_or_trunc_reg_cost (TREE_TYPE (lhs), TREE_TYPE (rhs1),
-				       speed);
+      return convert_cost[speed][lhs_mode][TYPE_MODE (TREE_TYPE (rhs1))];
 
     /* Note that we don't assign costs to copies that in most cases
        will go away.  */
@@ -1460,9 +1454,6 @@  execute_strength_reduction (void)
      back edges, and this gives us dominator information as well.  */
   loop_optimizer_init (AVOID_CFG_MODIFICATIONS);
 
-  /* Initialize costs tables in IVOPTS.  */
-  initialize_costs ();
-
   /* Set up callbacks for the generic dominator tree walker.  */
   walk_data.dom_direction = CDI_DOMINATORS;
   walk_data.initialize_block_local_data = NULL;
@@ -1493,7 +1484,6 @@  execute_strength_reduction (void)
   pointer_map_destroy (stmt_cand_map);
   VEC_free (slsr_cand_t, heap, cand_vec);
   obstack_free (&cand_obstack, NULL);
-  finalize_costs ();
 
   return 0;
 }
Index: gcc/expmed.c
===================================================================
--- gcc/expmed.c	(revision 189845)
+++ gcc/expmed.c	(working copy)
@@ -112,6 +112,7 @@  struct init_expmed_rtl
   struct rtx_def shift_add;	rtunion shift_add_fld1;
   struct rtx_def shift_sub0;	rtunion shift_sub0_fld1;
   struct rtx_def shift_sub1;	rtunion shift_sub1_fld1;
+  struct rtx_def convert;
 
   rtx pow2[MAX_BITS_PER_WORD];
   rtx cint[MAX_BITS_PER_WORD];
@@ -122,6 +123,7 @@  init_expmed_one_mode (struct init_expmed_rtl *all,
 		      enum machine_mode mode, int speed)
 {
   int m, n, mode_bitsize;
+  enum machine_mode mode_from;
 
   mode_bitsize = GET_MODE_UNIT_BITSIZE (mode);
 
@@ -139,6 +141,7 @@  init_expmed_one_mode (struct init_expmed_rtl *all,
   PUT_MODE (&all->shift_add, mode);
   PUT_MODE (&all->shift_sub0, mode);
   PUT_MODE (&all->shift_sub1, mode);
+  PUT_MODE (&all->convert, mode);
 
   add_cost[speed][mode] = set_src_cost (&all->plus, speed);
   neg_cost[speed][mode] = set_src_cost (&all->neg, speed);
@@ -183,6 +186,30 @@  init_expmed_one_mode (struct init_expmed_rtl *all,
 	  mul_highpart_cost[speed][mode]
 	    = set_src_cost (&all->wide_trunc, speed);
 	}
+
+      for (mode_from = GET_CLASS_NARROWEST_MODE (MODE_INT);
+	   mode_from != VOIDmode;
+	   mode_from = GET_MODE_WIDER_MODE (mode_from))
+	if (mode != mode_from)
+	  {
+	    unsigned short size_to = GET_MODE_SIZE (mode);
+	    unsigned short size_from = GET_MODE_SIZE (mode_from);
+	    if (size_to < size_from)
+	      {
+		PUT_CODE (&all->convert, TRUNCATE);
+		PUT_MODE (&all->reg, mode_from);
+		convert_cost[speed][mode][mode_from]
+		  = set_src_cost (&all->convert, speed);
+	      }
+	    else if (size_from < size_to)
+	      {
+		/* Assume cost of zero-extend and sign-extend is the same.  */
+		PUT_CODE (&all->convert, ZERO_EXTEND);
+		PUT_MODE (&all->reg, mode_from);
+		convert_cost[speed][mode][mode_from]
+		  = set_src_cost (&all->convert, speed);
+	      }
+	  }
     }
 }
 
@@ -262,6 +289,9 @@  init_expmed (void)
   XEXP (&all.shift_sub1, 0) = &all.reg;
   XEXP (&all.shift_sub1, 1) = &all.shift_mult;
 
+  PUT_CODE (&all.convert, TRUNCATE);
+  XEXP (&all.convert, 0) = &all.reg;
+
   for (speed = 0; speed < 2; speed++)
     {
       crtl->maybe_hot_insn_p = speed;
@@ -3262,6 +3292,24 @@  expand_mult (enum machine_mode mode, rtx op0, rtx
   return op0;
 }
 
+/* Return a cost estimate for multiplying a register by the given
+   COEFFicient in the given MODE and SPEED.  */
+
+int
+mult_by_coeff_cost (HOST_WIDE_INT coeff, enum machine_mode mode, bool speed)
+{
+  int max_cost;
+  struct algorithm algorithm;
+  enum mult_variant variant;
+
+  rtx fake_reg = gen_raw_REG (mode, LAST_VIRTUAL_REGISTER + 1);
+  max_cost = set_src_cost (gen_rtx_MULT (mode, fake_reg, fake_reg), speed);
+  if (choose_mult_variant (mode, coeff, &algorithm, &variant, max_cost))
+    return algorithm.cost.cost;
+  else
+    return max_cost;
+}
+
 /* Perform a widening multiplication and return an rtx for the result.
    MODE is mode of value; OP0 and OP1 are what to multiply (rtx's);
    TARGET is a suggestion for where to store the result (an rtx).
Index: gcc/expmed.h
===================================================================
--- gcc/expmed.h	(revision 189845)
+++ gcc/expmed.h	(working copy)
@@ -155,6 +155,11 @@  struct target_expmed {
   int x_udiv_cost[2][NUM_MACHINE_MODES];
   int x_mul_widen_cost[2][NUM_MACHINE_MODES];
   int x_mul_highpart_cost[2][NUM_MACHINE_MODES];
+
+  /* Conversion costs are only defined between two scalar integer modes
+     of different sizes.  The first machine mode is the destination mode,
+     and the second is the source mode.  */
+  int x_convert_cost[2][NUM_MACHINE_MODES][NUM_MACHINE_MODES];
 };
 
 extern struct target_expmed default_target_expmed;
@@ -196,5 +201,8 @@  extern struct target_expmed *this_target_expmed;
   (this_target_expmed->x_mul_widen_cost)
 #define mul_highpart_cost \
   (this_target_expmed->x_mul_highpart_cost)
+#define convert_cost \
+  (this_target_expmed->x_convert_cost)
 
+extern int mult_by_coeff_cost (HOST_WIDE_INT, enum machine_mode, bool);
 #endif
Index: gcc/tree-flow.h
===================================================================
--- gcc/tree-flow.h	(revision 189845)
+++ gcc/tree-flow.h	(working copy)
@@ -806,14 +806,6 @@  bool expr_invariant_in_loop_p (struct loop *, tree
 bool stmt_invariant_in_loop_p (struct loop *, gimple);
 bool multiplier_allowed_in_address_p (HOST_WIDE_INT, enum machine_mode,
 				      addr_space_t);
-void initialize_costs (void);
-void finalize_costs (void);
-unsigned multiply_by_const_cost (HOST_WIDE_INT, enum machine_mode, bool);
-unsigned add_regs_cost (enum machine_mode, bool);
-unsigned multiply_regs_cost (enum machine_mode, bool);
-unsigned add_const_cost (enum machine_mode, bool);
-unsigned extend_or_trunc_reg_cost (tree, tree, bool);
-unsigned negate_reg_cost (enum machine_mode, bool);
 bool may_be_nonaddressable_p (tree expr);
 
 /* In tree-ssa-threadupdate.c.  */