From patchwork Wed Jul 25 19:40:13 2012
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Bill Schmidt
X-Patchwork-Id: 173248
Subject: Re: [PATCH] Change IVOPTS and strength reduction to use expmed cost model
From: "William J. Schmidt"
To: Richard Henderson
Cc: gcc-patches@gcc.gnu.org, bergner@vnet.ibm.com, rguenther@suse.de
In-Reply-To: <501025F2.1060408@redhat.com>
References: <1343232822.4638.16.camel@oc2474580526.ibm.com> <501025F2.1060408@redhat.com>
Date: Wed, 25 Jul 2012 14:40:13 -0500
Message-ID: <1343245213.4638.21.camel@oc2474580526.ibm.com>
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm

On Wed, 2012-07-25 at 09:59 -0700, Richard Henderson wrote:
> On 07/25/2012 09:13 AM, William J. Schmidt wrote:
> > Per Richard Henderson's suggestion
> > (http://gcc.gnu.org/ml/gcc-patches/2012-06/msg01370.html), this patch
> > changes the IVOPTS and straight-line strength reduction passes to make
> > use of data computed by init_expmed.
> > This required adding a new convert_cost array in expmed to store the
> > costs of converting between various scalar integer modes, and exposing
> > expmed's multiplication hash table for external use (new function
> > mult_by_coeff_cost).  Richard H, I'd appreciate it if you could look at
> > what I did there and make sure it's correct.  Thanks!
>
> Correctness looks good.
>
> > I decided it wasn't worth distinguishing between reg-reg add costs and
> > reg-constant add costs, so I simplified the strength reduction
> > calculations rather than adding another array to expmed for this
> > purpose.  But I can make this distinction if that's preferable.
>
> I don't think this is worth thinking about at this level.  This is
> something that some rtl-level optimization ought to be able to fix
> up trivially, e.g. cse.
>
> > Index: gcc/expmed.h
> > ===================================================================
> > --- gcc/expmed.h	(revision 189845)
> > +++ gcc/expmed.h	(working copy)
> > @@ -155,6 +155,11 @@ struct target_expmed {
> >    int x_udiv_cost[2][NUM_MACHINE_MODES];
> >    int x_mul_widen_cost[2][NUM_MACHINE_MODES];
> >    int x_mul_highpart_cost[2][NUM_MACHINE_MODES];
> > +
> > +  /* Conversion costs are only defined between two scalar integer modes
> > +     of different sizes.  The first machine mode is the destination mode,
> > +     and the second is the source mode.  */
> > +  int x_convert_cost[2][NUM_MACHINE_MODES][NUM_MACHINE_MODES];
> >  };
>
> 2 * NUM_MACHINE_MODES is quite large...  I think we could do better with
>
>   #define NUM_MODE_INT (MAX_MODE_INT - MIN_MODE_INT + 1)
>
>   x_convert_cost[2][NUM_MODE_INT][NUM_MODE_INT];
>
> though really that could be done with all of these fields all at once.
>
> That does suggest it would be better to leave at least inline functions
> to access these elements, rather than open code the array access.
>
>
> r~
>

Thanks for the quick review!  Excellent point about the array size.  The
attached revised patch follows your suggestion to limit the size.  I only
did this for the new field, as changing all the existing accessors to
inline functions is more effort than I have time for right now.  This is
left as an exercise for the reader. ;)

Bootstrapped and tested on powerpc64-unknown-linux-gnu with no new
failures.  Is this ok?

Thanks,
Bill


2012-07-25  Bill Schmidt

	* tree-ssa-loop-ivopts.c (mbc_entry_hash): Remove.
	(mbc_entry_eq): Likewise.
	(mult_costs): Likewise.
	(cost_tables_exist): Likewise.
	(initialize_costs): Likewise.
	(finalize_costs): Likewise.
	(tree_ssa_iv_optimize_init): Remove call to initialize_costs.
	(add_regs_cost): Remove.
	(multiply_regs_cost): Likewise.
	(add_const_cost): Likewise.
	(extend_or_trunc_reg_cost): Likewise.
	(negate_reg_cost): Likewise.
	(struct mbc_entry): Likewise.
	(multiply_by_const_cost): Likewise.
	(get_address_cost): Change add_regs_cost calls to add_cost lookups;
	change multiply_by_const_cost to mult_by_coeff_cost.
	(force_expr_to_var_cost): Likewise.
	(difference_cost): Change multiply_by_const_cost to mult_by_coeff_cost.
	(get_computation_cost_at): Change add_regs_cost calls to add_cost
	lookups; change multiply_by_const_cost to mult_by_coeff_cost.
	(determine_iv_cost): Change add_regs_cost calls to add_cost lookups.
	(tree_ssa_iv_optimize_finalize): Remove call to finalize_costs.
	* tree-ssa-address.c (expmed.h): New #include.
	(most_expensive_mult_to_index): Change multiply_by_const_cost to
	mult_by_coeff_cost.
	* gimple-ssa-strength-reduction.c (expmed.h): New #include.
	(stmt_cost): Change to use mult_by_coeff_cost, mul_cost, add_cost,
	neg_cost, and convert_cost instead of IVOPTS interfaces.
	(execute_strength_reduction): Remove calls to initialize_costs and
	finalize_costs.
	* expmed.c (struct init_expmed_rtl): Add convert rtx_def.
	(init_expmed_one_mode): Initialize convert rtx_def; initialize
	x_convert_cost for related modes.
	(mult_by_coeff_cost): New function.
	* expmed.h (NUM_MODE_INT): New #define.
(struct target_expmed): Add x_convert_cost matrix. (set_convert_cost): New inline function. (convert_cost): Likewise. (mult_by_coeff_cost): New extern decl. * tree-flow.h (initialize_costs): Remove decl. (finalize_costs): Likewise. (multiply_by_const_cost): Likewise. (add_regs_cost): Likewise. (multiply_regs_cost): Likewise. (add_const_cost): Likewise. (extend_or_trunc_reg_cost): Likewise. (negate_reg_cost): Likewise. Index: gcc/tree-ssa-loop-ivopts.c =================================================================== --- gcc/tree-ssa-loop-ivopts.c (revision 189845) +++ gcc/tree-ssa-loop-ivopts.c (working copy) @@ -88,9 +88,6 @@ along with GCC; see the file COPYING3. If not see #include "tree-ssa-propagate.h" #include "expmed.h" -static hashval_t mbc_entry_hash (const void *); -static int mbc_entry_eq (const void*, const void *); - /* FIXME: Expressions are expanded to RTL in this pass to determine the cost of different addressing modes. This should be moved to a TBD interface between the GIMPLE and RTL worlds. */ @@ -381,11 +378,6 @@ struct iv_ca_delta static VEC(tree,heap) *decl_rtl_to_reset; -/* Cached costs for multiplies by constants, and a flag to indicate - when they're valid. */ -static htab_t mult_costs[2]; -static bool cost_tables_exist = false; - static comp_cost force_expr_to_var_cost (tree, bool); /* Number of uses recorded in DATA. */ @@ -851,26 +843,6 @@ htab_inv_expr_hash (const void *ent) return expr->hash; } -/* Allocate data structures for the cost model. */ - -void -initialize_costs (void) -{ - mult_costs[0] = htab_create (100, mbc_entry_hash, mbc_entry_eq, free); - mult_costs[1] = htab_create (100, mbc_entry_hash, mbc_entry_eq, free); - cost_tables_exist = true; -} - -/* Release data structures for the cost model. */ - -void -finalize_costs (void) -{ - cost_tables_exist = false; - htab_delete (mult_costs[0]); - htab_delete (mult_costs[1]); -} - /* Initializes data structures used by the iv optimization pass, stored in DATA. 
*/ @@ -889,8 +861,6 @@ tree_ssa_iv_optimize_init (struct ivopts_data *dat htab_inv_expr_eq, free); data->inv_expr_id = 0; decl_rtl_to_reset = VEC_alloc (tree, heap, 20); - - initialize_costs (); } /* Returns a memory object to that EXPR points. In case we are able to @@ -3077,250 +3047,6 @@ adjust_setup_cost (struct ivopts_data *data, unsig return cost; } -/* Returns cost of addition in MODE. */ - -unsigned -add_regs_cost (enum machine_mode mode, bool speed) -{ - static unsigned costs[NUM_MACHINE_MODES][2]; - rtx seq; - unsigned cost; - - if (costs[mode][speed]) - return costs[mode][speed]; - - start_sequence (); - force_operand (gen_rtx_fmt_ee (PLUS, mode, - gen_raw_REG (mode, LAST_VIRTUAL_REGISTER + 1), - gen_raw_REG (mode, LAST_VIRTUAL_REGISTER + 2)), - NULL_RTX); - seq = get_insns (); - end_sequence (); - - cost = seq_cost (seq, speed); - if (!cost) - cost = 1; - - costs[mode][speed] = cost; - - if (dump_file && (dump_flags & TDF_DETAILS)) - fprintf (dump_file, "Addition in %s costs %d\n", - GET_MODE_NAME (mode), cost); - return cost; -} - -/* Returns cost of multiplication in MODE. */ - -unsigned -multiply_regs_cost (enum machine_mode mode, bool speed) -{ - static unsigned costs[NUM_MACHINE_MODES][2]; - rtx seq; - unsigned cost; - - if (costs[mode][speed]) - return costs[mode][speed]; - - start_sequence (); - force_operand (gen_rtx_fmt_ee (MULT, mode, - gen_raw_REG (mode, LAST_VIRTUAL_REGISTER + 1), - gen_raw_REG (mode, LAST_VIRTUAL_REGISTER + 2)), - NULL_RTX); - seq = get_insns (); - end_sequence (); - - cost = seq_cost (seq, speed); - if (!cost) - cost = 1; - - costs[mode][speed] = cost; - - if (dump_file && (dump_flags & TDF_DETAILS)) - fprintf (dump_file, "Multiplication in %s costs %d\n", - GET_MODE_NAME (mode), cost); - return cost; -} - -/* Returns cost of addition with a constant in MODE. 
*/ - -unsigned -add_const_cost (enum machine_mode mode, bool speed) -{ - static unsigned costs[NUM_MACHINE_MODES][2]; - rtx seq; - unsigned cost; - - if (costs[mode][speed]) - return costs[mode][speed]; - - /* Arbitrarily generate insns for x + 2, as the exact constant - shouldn't matter. */ - start_sequence (); - force_operand (gen_rtx_fmt_ee (PLUS, mode, - gen_raw_REG (mode, LAST_VIRTUAL_REGISTER + 1), - gen_int_mode (2, mode)), - NULL_RTX); - seq = get_insns (); - end_sequence (); - - cost = seq_cost (seq, speed); - if (!cost) - cost = 1; - - costs[mode][speed] = cost; - - if (dump_file && (dump_flags & TDF_DETAILS)) - fprintf (dump_file, "Addition to constant in %s costs %d\n", - GET_MODE_NAME (mode), cost); - return cost; -} - -/* Returns cost of extend or truncate in MODE. */ - -unsigned -extend_or_trunc_reg_cost (tree type_to, tree type_from, bool speed) -{ - static unsigned costs[NUM_MACHINE_MODES][NUM_MACHINE_MODES][2]; - rtx seq; - unsigned cost; - enum machine_mode mode_to = TYPE_MODE (type_to); - enum machine_mode mode_from = TYPE_MODE (type_from); - tree size_to = TYPE_SIZE (type_to); - tree size_from = TYPE_SIZE (type_from); - enum rtx_code code; - - gcc_assert (TREE_CODE (size_to) == INTEGER_CST - && TREE_CODE (size_from) == INTEGER_CST); - - if (costs[mode_to][mode_from][speed]) - return costs[mode_to][mode_from][speed]; - - if (tree_int_cst_lt (size_to, size_from)) - code = TRUNCATE; - else if (TYPE_UNSIGNED (type_to)) - code = ZERO_EXTEND; - else - code = SIGN_EXTEND; - - start_sequence (); - gen_rtx_fmt_e (code, mode_to, - gen_raw_REG (mode_from, LAST_VIRTUAL_REGISTER + 1)); - seq = get_insns (); - end_sequence (); - - cost = seq_cost (seq, speed); - if (!cost) - cost = 1; - - if (dump_file && (dump_flags & TDF_DETAILS)) - fprintf (dump_file, "Conversion from %s to %s costs %d\n", - GET_MODE_NAME (mode_to), GET_MODE_NAME (mode_from), cost); - - costs[mode_to][mode_from][speed] = cost; - return cost; -} - -/* Returns cost of negation in MODE. 
*/ - -unsigned -negate_reg_cost (enum machine_mode mode, bool speed) -{ - static unsigned costs[NUM_MACHINE_MODES][2]; - rtx seq; - unsigned cost; - - if (costs[mode][speed]) - return costs[mode][speed]; - - start_sequence (); - force_operand (gen_rtx_fmt_e (NEG, mode, - gen_raw_REG (mode, LAST_VIRTUAL_REGISTER + 1)), - NULL_RTX); - seq = get_insns (); - end_sequence (); - - cost = seq_cost (seq, speed); - if (!cost) - cost = 1; - - costs[mode][speed] = cost; - - if (dump_file && (dump_flags & TDF_DETAILS)) - fprintf (dump_file, "Negation in %s costs %d\n", - GET_MODE_NAME (mode), cost); - return cost; -} - -/* Entry in a hashtable of already known costs for multiplication. */ -struct mbc_entry -{ - HOST_WIDE_INT cst; /* The constant to multiply by. */ - enum machine_mode mode; /* In mode. */ - unsigned cost; /* The cost. */ -}; - -/* Counts hash value for the ENTRY. */ - -static hashval_t -mbc_entry_hash (const void *entry) -{ - const struct mbc_entry *e = (const struct mbc_entry *) entry; - - return 57 * (hashval_t) e->mode + (hashval_t) (e->cst % 877); -} - -/* Compares the hash table entries ENTRY1 and ENTRY2. */ - -static int -mbc_entry_eq (const void *entry1, const void *entry2) -{ - const struct mbc_entry *e1 = (const struct mbc_entry *) entry1; - const struct mbc_entry *e2 = (const struct mbc_entry *) entry2; - - return (e1->mode == e2->mode - && e1->cst == e2->cst); -} - -/* Returns cost of multiplication by constant CST in MODE. 
*/ - -unsigned -multiply_by_const_cost (HOST_WIDE_INT cst, enum machine_mode mode, bool speed) -{ - struct mbc_entry **cached, act; - rtx seq; - unsigned cost; - - gcc_assert (cost_tables_exist); - - act.mode = mode; - act.cst = cst; - cached = (struct mbc_entry **) - htab_find_slot (mult_costs[speed], &act, INSERT); - - if (*cached) - return (*cached)->cost; - - *cached = XNEW (struct mbc_entry); - (*cached)->mode = mode; - (*cached)->cst = cst; - - start_sequence (); - expand_mult (mode, gen_raw_REG (mode, LAST_VIRTUAL_REGISTER + 1), - gen_int_mode (cst, mode), NULL_RTX, 0); - seq = get_insns (); - end_sequence (); - - cost = seq_cost (seq, speed); - - if (dump_file && (dump_flags & TDF_DETAILS)) - fprintf (dump_file, "Multiplication by %d in %s costs %d\n", - (int) cst, GET_MODE_NAME (mode), cost); - - (*cached)->cost = cost; - - return cost; -} - /* Returns true if multiplying by RATIO is allowed in an address. Test the validity for a memory reference accessing memory of mode MODE in address space AS. */ @@ -3582,7 +3308,7 @@ get_address_cost (bool symbol_present, bool var_pr If VAR_PRESENT is true, try whether the mode with SYMBOL_PRESENT = false is cheaper even with cost of addition, and if this is the case, use it. 
*/ - add_c = add_regs_cost (address_mode, speed); + add_c = add_cost[speed][address_mode]; for (i = 0; i < 8; i++) { var_p = i & 1; @@ -3663,10 +3389,10 @@ get_address_cost (bool symbol_present, bool var_pr && multiplier_allowed_in_address_p (ratio, mem_mode, as)); if (ratio != 1 && !ratio_p) - cost += multiply_by_const_cost (ratio, address_mode, speed); + cost += mult_by_coeff_cost (ratio, address_mode, speed); if (s_offset && !offset_p && !symbol_present) - cost += add_regs_cost (address_mode, speed); + cost += add_cost[speed][address_mode]; if (may_autoinc) *may_autoinc = autoinc; @@ -3833,7 +3559,7 @@ force_expr_to_var_cost (tree expr, bool speed) case PLUS_EXPR: case MINUS_EXPR: case NEGATE_EXPR: - cost = new_cost (add_regs_cost (mode, speed), 0); + cost = new_cost (add_cost[speed][mode], 0); if (TREE_CODE (expr) != NEGATE_EXPR) { tree mult = NULL_TREE; @@ -3853,11 +3579,11 @@ force_expr_to_var_cost (tree expr, bool speed) case MULT_EXPR: if (cst_and_fits_in_hwi (op0)) - cost = new_cost (multiply_by_const_cost (int_cst_value (op0), - mode, speed), 0); + cost = new_cost (mult_by_coeff_cost (int_cst_value (op0), + mode, speed), 0); else if (cst_and_fits_in_hwi (op1)) - cost = new_cost (multiply_by_const_cost (int_cst_value (op1), - mode, speed), 0); + cost = new_cost (mult_by_coeff_cost (int_cst_value (op1), + mode, speed), 0); else return new_cost (target_spill_cost [speed], 0); break; @@ -4023,7 +3749,7 @@ difference_cost (struct ivopts_data *data, if (integer_zerop (e1)) { comp_cost cost = force_var_cost (data, e2, depends_on); - cost.cost += multiply_by_const_cost (-1, mode, data->speed); + cost.cost += mult_by_coeff_cost (-1, mode, data->speed); return cost; } @@ -4334,7 +4060,7 @@ get_computation_cost_at (struct ivopts_data *data, &symbol_present, &var_present, &offset, depends_on)); cost.cost /= avg_loop_niter (data->current_loop); - cost.cost += add_regs_cost (TYPE_MODE (ctype), data->speed); + cost.cost += add_cost[data->speed][TYPE_MODE (ctype)]; } if 
(inv_expr_id) @@ -4367,7 +4093,7 @@ get_computation_cost_at (struct ivopts_data *data, if (!symbol_present && !var_present && !offset) { if (ratio != 1) - cost.cost += multiply_by_const_cost (ratio, TYPE_MODE (ctype), speed); + cost.cost += mult_by_coeff_cost (ratio, TYPE_MODE (ctype), speed); return cost; } @@ -4375,18 +4101,18 @@ get_computation_cost_at (struct ivopts_data *data, are added once to the variable, if present. */ if (var_present && (symbol_present || offset)) cost.cost += adjust_setup_cost (data, - add_regs_cost (TYPE_MODE (ctype), speed)); + add_cost[speed][TYPE_MODE (ctype)]); /* Having offset does not affect runtime cost in case it is added to symbol, but it increases complexity. */ if (offset) cost.complexity++; - cost.cost += add_regs_cost (TYPE_MODE (ctype), speed); + cost.cost += add_cost[speed][TYPE_MODE (ctype)]; aratio = ratio > 0 ? ratio : -ratio; if (aratio != 1) - cost.cost += multiply_by_const_cost (aratio, TYPE_MODE (ctype), speed); + cost.cost += mult_by_coeff_cost (aratio, TYPE_MODE (ctype), speed); return cost; fallback: @@ -5232,7 +4958,7 @@ determine_iv_cost (struct ivopts_data *data, struc or a const set. */ if (cost_base.cost == 0) cost_base.cost = COSTS_N_INSNS (1); - cost_step = add_regs_cost (TYPE_MODE (TREE_TYPE (base)), data->speed); + cost_step = add_cost[data->speed][TYPE_MODE (TREE_TYPE (base))]; cost = cost_step + adjust_setup_cost (data, cost_base.cost); @@ -6804,8 +6530,6 @@ tree_ssa_iv_optimize_finalize (struct ivopts_data VEC_free (iv_use_p, heap, data->iv_uses); VEC_free (iv_cand_p, heap, data->iv_candidates); htab_delete (data->inv_expr_tab); - - finalize_costs (); } /* Returns true if the loop body BODY includes any function calls. */ Index: gcc/tree-ssa-address.c =================================================================== --- gcc/tree-ssa-address.c (revision 189845) +++ gcc/tree-ssa-address.c (working copy) @@ -42,6 +42,7 @@ along with GCC; see the file COPYING3. 
If not see #include "expr.h" #include "ggc.h" #include "target.h" +#include "expmed.h" /* TODO -- handling of symbols (according to Richard Hendersons comments, http://gcc.gnu.org/ml/gcc-patches/2005-04/msg00949.html): @@ -554,7 +555,7 @@ most_expensive_mult_to_index (tree type, struct me || !multiplier_allowed_in_address_p (coef, TYPE_MODE (type), as)) continue; - acost = multiply_by_const_cost (coef, address_mode, speed); + acost = mult_by_coeff_cost (coef, address_mode, speed); if (acost > best_mult_cost) { Index: gcc/gimple-ssa-strength-reduction.c =================================================================== --- gcc/gimple-ssa-strength-reduction.c (revision 189845) +++ gcc/gimple-ssa-strength-reduction.c (working copy) @@ -54,6 +54,7 @@ along with GCC; see the file COPYING3. If not see #include "tree-flow.h" #include "domwalk.h" #include "pointer-set.h" +#include "expmed.h" /* Information about a strength reduction candidate. Each statement in the candidate table represents an expression of one of the @@ -340,29 +341,22 @@ stmt_cost (gimple gs, bool speed) rhs2 = gimple_assign_rhs2 (gs); if (host_integerp (rhs2, 0)) - return multiply_by_const_cost (TREE_INT_CST_LOW (rhs2), lhs_mode, - speed); + return mult_by_coeff_cost (TREE_INT_CST_LOW (rhs2), lhs_mode, speed); gcc_assert (TREE_CODE (rhs1) != INTEGER_CST); - return multiply_regs_cost (TYPE_MODE (TREE_TYPE (lhs)), speed); + return mul_cost[speed][lhs_mode]; case PLUS_EXPR: case POINTER_PLUS_EXPR: case MINUS_EXPR: rhs2 = gimple_assign_rhs2 (gs); + return add_cost[speed][lhs_mode]; - if (host_integerp (rhs2, 0)) - return add_const_cost (TYPE_MODE (TREE_TYPE (rhs1)), speed); - - gcc_assert (TREE_CODE (rhs1) != INTEGER_CST); - return add_regs_cost (lhs_mode, speed); - case NEGATE_EXPR: - return negate_reg_cost (lhs_mode, speed); + return neg_cost[speed][lhs_mode]; case NOP_EXPR: - return extend_or_trunc_reg_cost (TREE_TYPE (lhs), TREE_TYPE (rhs1), - speed); + return convert_cost (lhs_mode, TYPE_MODE 
(TREE_TYPE (rhs1)), speed); /* Note that we don't assign costs to copies that in most cases will go away. */ @@ -1460,9 +1454,6 @@ execute_strength_reduction (void) back edges, and this gives us dominator information as well. */ loop_optimizer_init (AVOID_CFG_MODIFICATIONS); - /* Initialize costs tables in IVOPTS. */ - initialize_costs (); - /* Set up callbacks for the generic dominator tree walker. */ walk_data.dom_direction = CDI_DOMINATORS; walk_data.initialize_block_local_data = NULL; @@ -1493,7 +1484,6 @@ execute_strength_reduction (void) pointer_map_destroy (stmt_cand_map); VEC_free (slsr_cand_t, heap, cand_vec); obstack_free (&cand_obstack, NULL); - finalize_costs (); return 0; } Index: gcc/expmed.c =================================================================== --- gcc/expmed.c (revision 189845) +++ gcc/expmed.c (working copy) @@ -112,6 +112,7 @@ struct init_expmed_rtl struct rtx_def shift_add; rtunion shift_add_fld1; struct rtx_def shift_sub0; rtunion shift_sub0_fld1; struct rtx_def shift_sub1; rtunion shift_sub1_fld1; + struct rtx_def convert; rtx pow2[MAX_BITS_PER_WORD]; rtx cint[MAX_BITS_PER_WORD]; @@ -122,6 +123,7 @@ init_expmed_one_mode (struct init_expmed_rtl *all, enum machine_mode mode, int speed) { int m, n, mode_bitsize; + enum machine_mode mode_from; mode_bitsize = GET_MODE_UNIT_BITSIZE (mode); @@ -139,6 +141,7 @@ init_expmed_one_mode (struct init_expmed_rtl *all, PUT_MODE (&all->shift_add, mode); PUT_MODE (&all->shift_sub0, mode); PUT_MODE (&all->shift_sub1, mode); + PUT_MODE (&all->convert, mode); add_cost[speed][mode] = set_src_cost (&all->plus, speed); neg_cost[speed][mode] = set_src_cost (&all->neg, speed); @@ -183,6 +186,30 @@ init_expmed_one_mode (struct init_expmed_rtl *all, mul_highpart_cost[speed][mode] = set_src_cost (&all->wide_trunc, speed); } + + for (mode_from = GET_CLASS_NARROWEST_MODE (MODE_INT); + mode_from != VOIDmode; + mode_from = GET_MODE_WIDER_MODE (mode_from)) + if (mode != mode_from) + { + unsigned short size_to = 
GET_MODE_SIZE (mode); + unsigned short size_from = GET_MODE_SIZE (mode_from); + if (size_to < size_from) + { + PUT_CODE (&all->convert, TRUNCATE); + PUT_MODE (&all->reg, mode_from); + set_convert_cost (mode, mode_from, speed, + set_src_cost (&all->convert, speed)); + } + else if (size_from < size_to) + { + /* Assume cost of zero-extend and sign-extend is the same. */ + PUT_CODE (&all->convert, ZERO_EXTEND); + PUT_MODE (&all->reg, mode_from); + set_convert_cost (mode, mode_from, speed, + set_src_cost (&all->convert, speed)); + } + } } } @@ -262,6 +289,9 @@ init_expmed (void) XEXP (&all.shift_sub1, 0) = &all.reg; XEXP (&all.shift_sub1, 1) = &all.shift_mult; + PUT_CODE (&all.convert, TRUNCATE); + XEXP (&all.convert, 0) = &all.reg; + for (speed = 0; speed < 2; speed++) { crtl->maybe_hot_insn_p = speed; @@ -3262,6 +3292,24 @@ expand_mult (enum machine_mode mode, rtx op0, rtx return op0; } +/* Return a cost estimate for multiplying a register by the given + COEFFicient in the given MODE and SPEED. */ + +int +mult_by_coeff_cost (HOST_WIDE_INT coeff, enum machine_mode mode, bool speed) +{ + int max_cost; + struct algorithm algorithm; + enum mult_variant variant; + + rtx fake_reg = gen_raw_REG (mode, LAST_VIRTUAL_REGISTER + 1); + max_cost = set_src_cost (gen_rtx_MULT (mode, fake_reg, fake_reg), speed); + if (choose_mult_variant (mode, coeff, &algorithm, &variant, max_cost)) + return algorithm.cost.cost; + else + return max_cost; +} + /* Perform a widening multiplication and return an rtx for the result. MODE is mode of value; OP0 and OP1 are what to multiply (rtx's); TARGET is a suggestion for where to store the result (an rtx). Index: gcc/expmed.h =================================================================== --- gcc/expmed.h (revision 189845) +++ gcc/expmed.h (working copy) @@ -124,6 +124,8 @@ struct alg_hash_entry { #define NUM_ALG_HASH_ENTRIES 307 #endif +#define NUM_MODE_INT (MAX_MODE_INT - MIN_MODE_INT + 1) + /* Target-dependent globals. 
*/ struct target_expmed { /* Each entry of ALG_HASH caches alg_code for some integer. This is @@ -155,6 +157,11 @@ struct target_expmed { int x_udiv_cost[2][NUM_MACHINE_MODES]; int x_mul_widen_cost[2][NUM_MACHINE_MODES]; int x_mul_highpart_cost[2][NUM_MACHINE_MODES]; + + /* Conversion costs are only defined between two scalar integer modes + of different sizes. The first machine mode is the destination mode, + and the second is the source mode. */ + int x_convert_cost[2][NUM_MODE_INT][NUM_MODE_INT]; }; extern struct target_expmed default_target_expmed; @@ -197,4 +204,43 @@ extern struct target_expmed *this_target_expmed; #define mul_highpart_cost \ (this_target_expmed->x_mul_highpart_cost) +/* Set the COST for converting from FROM_MODE to TO_MODE when optimizing + for SPEED. */ + +static inline void +set_convert_cost (enum machine_mode to_mode, enum machine_mode from_mode, + bool speed, int cost) +{ + int to_idx, from_idx; + + gcc_assert (to_mode >= MIN_MODE_INT + && to_mode <= MAX_MODE_INT + && from_mode >= MIN_MODE_INT + && from_mode <= MAX_MODE_INT); + + to_idx = to_mode - MIN_MODE_INT; + from_idx = from_mode - MIN_MODE_INT; + this_target_expmed->x_convert_cost[speed][to_idx][from_idx] = cost; +} + +/* Return the cost for converting from FROM_MODE to TO_MODE when optimizing + for SPEED. 
*/ + +static inline int +convert_cost (enum machine_mode to_mode, enum machine_mode from_mode, + bool speed) +{ + int to_idx, from_idx; + + gcc_assert (to_mode >= MIN_MODE_INT + && to_mode <= MAX_MODE_INT + && from_mode >= MIN_MODE_INT + && from_mode <= MAX_MODE_INT); + + to_idx = to_mode - MIN_MODE_INT; + from_idx = from_mode - MIN_MODE_INT; + return this_target_expmed->x_convert_cost[speed][to_idx][from_idx]; +} + +extern int mult_by_coeff_cost (HOST_WIDE_INT, enum machine_mode, bool); #endif Index: gcc/tree-flow.h =================================================================== --- gcc/tree-flow.h (revision 189845) +++ gcc/tree-flow.h (working copy) @@ -806,14 +806,6 @@ bool expr_invariant_in_loop_p (struct loop *, tree bool stmt_invariant_in_loop_p (struct loop *, gimple); bool multiplier_allowed_in_address_p (HOST_WIDE_INT, enum machine_mode, addr_space_t); -void initialize_costs (void); -void finalize_costs (void); -unsigned multiply_by_const_cost (HOST_WIDE_INT, enum machine_mode, bool); -unsigned add_regs_cost (enum machine_mode, bool); -unsigned multiply_regs_cost (enum machine_mode, bool); -unsigned add_const_cost (enum machine_mode, bool); -unsigned extend_or_trunc_reg_cost (tree, tree, bool); -unsigned negate_reg_cost (enum machine_mode, bool); bool may_be_nonaddressable_p (tree expr); /* In tree-ssa-threadupdate.c. */
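
For readers following the array-size discussion above, the index
compaction behind x_convert_cost can be sketched in isolation.  This is a
minimal standalone sketch, not GCC code: the toy_mode enum and every
toy_* name are hypothetical stand-ins for enum machine_mode,
MIN_MODE_INT, MAX_MODE_INT, and the set_convert_cost/convert_cost
accessors, and it assumes only that the scalar integer modes occupy one
contiguous range of the enum.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for enum machine_mode: a contiguous run of
   scalar integer modes surrounded by other kinds of modes.  */
enum toy_mode
{
  TOY_VOID, TOY_CC,
  TOY_QI, TOY_HI, TOY_SI, TOY_DI,	/* scalar integer modes */
  TOY_SF, TOY_DF,
  TOY_NUM_MODES
};

#define TOY_MIN_MODE_INT TOY_QI
#define TOY_MAX_MODE_INT TOY_DI
#define TOY_NUM_MODE_INT (TOY_MAX_MODE_INT - TOY_MIN_MODE_INT + 1)

/* Indexed by [speed][destination][source], sized by the number of
   integer modes only rather than by TOY_NUM_MODES.  */
static int toy_convert_cost_tab[2][TOY_NUM_MODE_INT][TOY_NUM_MODE_INT];

/* Map an integer mode to its compact index; reject any other mode.  */
static int
toy_mode_index (enum toy_mode mode)
{
  assert (mode >= TOY_MIN_MODE_INT && mode <= TOY_MAX_MODE_INT);
  return mode - TOY_MIN_MODE_INT;
}

/* Record COST for converting FROM_MODE to TO_MODE when optimizing for
   SPEED.  */
static void
toy_set_convert_cost (enum toy_mode to_mode, enum toy_mode from_mode,
		      bool speed, int cost)
{
  toy_convert_cost_tab[speed][toy_mode_index (to_mode)]
		      [toy_mode_index (from_mode)] = cost;
}

/* Return the recorded cost for converting FROM_MODE to TO_MODE when
   optimizing for SPEED; entries never set read as zero.  */
static int
toy_convert_cost (enum toy_mode to_mode, enum toy_mode from_mode,
		  bool speed)
{
  return toy_convert_cost_tab[speed][toy_mode_index (to_mode)]
			     [toy_mode_index (from_mode)];
}
```

With this toy enum the table holds 2 * 4 * 4 entries instead of
2 * 8 * 8, and keeping the offset arithmetic inside accessors means
callers never index the array directly, which is the point of the
review comment about inline functions.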