From patchwork Thu Jun 28 21:45:32 2012
X-Patchwork-Submitter: Bill Schmidt
X-Patchwork-Id: 167960
Message-ID: <1340919932.2861.15.camel@gnopaine>
Subject: [PATCH] Fix PR46556 (straight-line strength reduction, part 2)
From: "William J. Schmidt"
To: gcc-patches@gcc.gnu.org
Cc: bergner@vnet.ibm.com, rguenther@suse.de
Date: Thu, 28 Jun 2012 16:45:32 -0500

Here's a relatively small piece of strength reduction that solves that
pesky addressing bug that got me looking at this in the first place...

The main part of the code is the stuff that was reviewed last year, but
which needed to find a good home, so hopefully that's in pretty good
shape.  I recast base_cand_map as an htab again, since I now need to
look up trees other than SSA names.  I plan to put together a follow-up
patch to change code and commentary references so that "base_name"
becomes "base_expr"; doing that now would clutter up the patch too much.

Bootstrapped and tested on powerpc64-linux-gnu with no new regressions.
Ok for trunk?

Thanks,
Bill


gcc:

	PR tree-optimization/46556
	* gimple-ssa-strength-reduction.c (enum cand_kind): Add CAND_REF.
	(base_cand_map): Change to hash table.
	(base_cand_hash): New function.
	(base_cand_free): Likewise.
	(base_cand_eq): Likewise.
	(lookup_cand): Change base_cand_map to hash table.
	(find_basis_for_candidate): Likewise.
	(base_cand_from_table): Exclude CAND_REF.
	(restructure_reference): New function.
	(slsr_process_ref): Likewise.
	(find_candidates_in_block): Call slsr_process_ref.
	(dump_candidate): Handle CAND_REF.
	(base_cand_dump_callback): New function.
	(dump_cand_chains): Change base_cand_map to hash table.
	(replace_ref): New function.
	(replace_refs): Likewise.
	(analyze_candidates_and_replace): Call replace_refs.
	(execute_strength_reduction): Change base_cand_map to hash table.

gcc/testsuite:

	PR tree-optimization/46556
	* gcc.dg/tree-ssa/slsr-27.c: New.
	* gcc.dg/tree-ssa/slsr-28.c: New.
	* gcc.dg/tree-ssa/slsr-29.c: New.
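To make the intent concrete, here is roughly what happens to the kind
of code in the new tests.  The GIMPLE in the comments is illustrative
rather than an exact dump; names and casts will differ in practice.

  struct x { int a[16]; int b[16]; int c[16]; };  /* b at +64, c at +128 bytes */

  extern void foo (int, int, int);

  void
  f (struct x *p, unsigned int n)
  {
    /* Before the pass, each of the three references hides its own
       implicit multiply (n * 4) inside its addressing expression.
       After the pass, the multiply is made explicit once and shared,
       and the field offsets are folded into the constant part of each
       MEM_REF, roughly:

         _1 = n * 4;
         _2 = p + _1;                      <-- POINTER_PLUS_EXPR
         ... = MEM[(struct x *)_2];        <-- p->a[n]
         ... = MEM[(struct x *)_2 + 128];  <-- p->c[n]
         ... = MEM[(struct x *)_2 + 64];   <-- p->b[n]

       which is what the "* 4;" and "MEM[(struct x *)" scans in the
       dom2 dumps of slsr-27.c and friends are counting.  */
    foo (p->a[n], p->c[n], p->b[n]);
  }
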
Index: gcc/testsuite/gcc.dg/tree-ssa/slsr-27.c =================================================================== --- gcc/testsuite/gcc.dg/tree-ssa/slsr-27.c (revision 0) +++ gcc/testsuite/gcc.dg/tree-ssa/slsr-27.c (revision 0) @@ -0,0 +1,22 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fdump-tree-dom2" } */ + +struct x +{ + int a[16]; + int b[16]; + int c[16]; +}; + +extern void foo (int, int, int); + +void +f (struct x *p, unsigned int n) +{ + foo (p->a[n], p->c[n], p->b[n]); +} + +/* { dg-final { scan-tree-dump-times "\\* 4;" 1 "dom2" } } */ +/* { dg-final { scan-tree-dump-times "p_\\d\+\\(D\\) \\+ D" 1 "dom2" } } */ +/* { dg-final { scan-tree-dump-times "MEM\\\[\\(struct x \\*\\)D" 3 "dom2" } } */ +/* { dg-final { cleanup-tree-dump "dom2" } } */ Index: gcc/testsuite/gcc.dg/tree-ssa/slsr-28.c =================================================================== --- gcc/testsuite/gcc.dg/tree-ssa/slsr-28.c (revision 0) +++ gcc/testsuite/gcc.dg/tree-ssa/slsr-28.c (revision 0) @@ -0,0 +1,26 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fdump-tree-dom2" } */ + +struct x +{ + int a[16]; + int b[16]; + int c[16]; +}; + +extern void foo (int, int, int); + +void +f (struct x *p, unsigned int n) +{ + foo (p->a[n], p->c[n], p->b[n]); + if (n > 12) + foo (p->a[n], p->c[n], p->b[n]); + else if (n > 3) + foo (p->b[n], p->a[n], p->c[n]); +} + +/* { dg-final { scan-tree-dump-times "\\* 4;" 1 "dom2" } } */ +/* { dg-final { scan-tree-dump-times "p_\\d\+\\(D\\) \\+ D" 1 "dom2" } } */ +/* { dg-final { scan-tree-dump-times "MEM\\\[\\(struct x \\*\\)D" 9 "dom2" } } */ +/* { dg-final { cleanup-tree-dump "dom2" } } */ Index: gcc/testsuite/gcc.dg/tree-ssa/slsr-29.c =================================================================== --- gcc/testsuite/gcc.dg/tree-ssa/slsr-29.c (revision 0) +++ gcc/testsuite/gcc.dg/tree-ssa/slsr-29.c (revision 0) @@ -0,0 +1,28 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fdump-tree-dom2" } */ + +struct x +{ + int a[16]; + int b[16]; + int c[16]; +}; + +extern void foo (int, int, int); + +void +f (struct x *p, unsigned int n) +{ + foo (p->a[n], p->c[n], p->b[n]); + if (n > 3) + { + foo (p->a[n], p->c[n], p->b[n]); + if (n > 12) + foo (p->b[n], p->a[n], p->c[n]); + } +} + +/* { dg-final { scan-tree-dump-times "\\* 4;" 1 "dom2" } } */ +/* { dg-final { scan-tree-dump-times "p_\\d\+\\(D\\) \\+ D" 1 "dom2" } } */ +/* { dg-final { scan-tree-dump-times "MEM\\\[\\(struct x \\*\\)D" 9 "dom2" } } */ +/* { dg-final { cleanup-tree-dump "dom2" } } */ Index: gcc/gimple-ssa-strength-reduction.c =================================================================== --- gcc/gimple-ssa-strength-reduction.c (revision 189025) +++ gcc/gimple-ssa-strength-reduction.c (working copy) @@ -32,7 +32,7 @@ along with GCC; see the file COPYING3. If not see 2) Explicit multiplies, unknown constant multipliers, no conditional increments. (data gathering complete, replacements pending) - 3) Implicit multiplies in addressing expressions. (pending) + 3) Implicit multiplies in addressing expressions. (complete) 4) Explicit multiplies, conditional increments. (pending) It would also be possible to apply strength reduction to divisions @@ -107,9 +107,49 @@ along with GCC; see the file COPYING3. If not see as a strength reduction opportunity, even though this S1 would also be replaceable by the S1' above. This can be added if it - comes up in practice. */ + comes up in practice. + Strength reduction in addressing + -------------------------------- + There is another kind of candidate known as CAND_REF. 
A CAND_REF + describes a statement containing a memory reference having + complex addressing that might benefit from strength reduction. + Specifically, we are interested in references for which + get_inner_reference returns a base address, offset, and bitpos as + follows: + base: MEM_REF (T1, C1) + offset: MULT_EXPR (PLUS_EXPR (T2, C2), C3) + bitpos: C4 * BITS_PER_UNIT + + Here T1 and T2 are arbitrary trees, and C1, C2, C3, C4 are + arbitrary integer constants. Note that C2 may be zero, in which + case the offset will be MULT_EXPR (T2, C3). + + When this pattern is recognized, the original memory reference + can be replaced with: + + MEM_REF (POINTER_PLUS_EXPR (T1, MULT_EXPR (T2, C3)), + C1 + (C2 * C3) + C4) + + which distributes the multiply to allow constant folding. When + two or more addressing expressions can be represented by MEM_REFs + of this form, differing only in the constants C1, C2, and C4, + making this substitution produces more efficient addressing during + the RTL phases. When there are not at least two expressions with + the same values of T1, T2, and C3, there is nothing to be gained + by the replacement. + + Strength reduction of CAND_REFs uses the same infrastructure as + that used by CAND_MULTs and CAND_ADDs. We record T1 in the base (B) + field, MULT_EXPR (T2, C3) in the stride (S) field, and + C1 + (C2 * C3) + C4 in the index (i) field. A basis for a CAND_REF + is thus another CAND_REF with the same B and S values. When at + least two CAND_REFs are chained together using the basis relation, + each of them is replaced as above, resulting in improved code + generation for addressing. */ + + /* Index into the candidate vector, offset by 1. VECs are zero-based, while cand_idx's are one-based, with zero indicating null. */ typedef unsigned cand_idx; @@ -118,7 +158,8 @@ typedef unsigned cand_idx; enum cand_kind { CAND_MULT, - CAND_ADD + CAND_ADD, + CAND_REF }; struct slsr_cand_d @@ -137,7 +178,9 @@ struct slsr_cand_d /* The type of the candidate. This is normally the type of base_name, but casts may have occurred when combining feeding instructions. - A candidate can only be a basis for candidates of the same final type. */ + A candidate can only be a basis for candidates of the same final type. + (For CAND_REFs, this is the type to be used for operand 1 of the + replacement MEM_REF.) */ tree cand_type; /* The kind of candidate (CAND_MULT, etc.). */ @@ -211,8 +254,8 @@ static struct pointer_map_t *stmt_cand_map; /* Obstack for candidates. */ static struct obstack cand_obstack; -/* Array mapping from base SSA names to chains of candidates. */ -static cand_chain_t *base_cand_map; +/* Hash table embodying a mapping from base names to chains of candidates. */ +static htab_t base_cand_map; /* Obstack for candidate chains. */ static struct obstack chain_obstack; @@ -225,6 +268,33 @@ lookup_cand (cand_idx idx) return VEC_index (slsr_cand_t, cand_vec, idx - 1); } +/* Callback to produce a hash value for a candidate chain header. */ + +static hashval_t +base_cand_hash (const void *p) +{ + tree base_expr = ((const_cand_chain_t) p)->base_name; + return iterative_hash_expr (base_expr, 0); +} + +/* Callback when an element is removed from the hash table. + We never remove entries until the entire table is released. */ + +static void +base_cand_free (void *p ATTRIBUTE_UNUSED) +{ +} + +/* Callback to return true if two candidate chain headers are equal. 
*/ + +static int +base_cand_eq (const void *p1, const void *p2) +{ + const_cand_chain_t const chain1 = (const_cand_chain_t) p1; + const_cand_chain_t const chain2 = (const_cand_chain_t) p2; + return operand_equal_p (chain1->base_name, chain2->base_name, 0); +} + /* Use the base name from candidate C to look for possible candidates that can serve as a basis for C. Each potential basis must also appear in a block that dominates the candidate statement and have @@ -235,11 +305,12 @@ lookup_cand (cand_idx idx) static int find_basis_for_candidate (slsr_cand_t c) { + cand_chain mapping_key; cand_chain_t chain; slsr_cand_t basis = NULL; - gcc_assert (TREE_CODE (c->base_name) == SSA_NAME); - chain = base_cand_map[SSA_NAME_VERSION (c->base_name)]; + mapping_key.base_name = c->base_name; + chain = (cand_chain_t) htab_find (base_cand_map, &mapping_key); for (; chain; chain = chain->next) { @@ -273,23 +344,23 @@ find_basis_for_candidate (slsr_cand_t c) static void record_potential_basis (slsr_cand_t c) { - cand_chain_t node, head; - int index; + cand_chain_t node; + void **slot; node = (cand_chain_t) obstack_alloc (&chain_obstack, sizeof (cand_chain)); node->base_name = c->base_name; node->cand = c; node->next = NULL; - index = SSA_NAME_VERSION (c->base_name); - head = base_cand_map[index]; + slot = htab_find_slot (base_cand_map, node, INSERT); - if (head) + if (*slot) { + cand_chain_t head = (cand_chain_t) (*slot); node->next = head->next; head->next = node; } else - base_cand_map[index] = node; + *slot = node; } /* Allocate storage for a new candidate and initialize its fields. @@ -390,10 +461,11 @@ base_cand_from_table (tree base_in) return (slsr_cand_t) NULL; result = (slsr_cand_t *) pointer_map_contains (stmt_cand_map, def); - if (!result) - return (slsr_cand_t) NULL; + + if (result && (*result)->kind != CAND_REF) + return *result; - return *result; + return (slsr_cand_t) NULL; } /* Add an entry to the statement-to-candidate mapping. */ @@ -406,6 +478,127 @@ add_cand_for_stmt (gimple gs, slsr_cand_t c) *slot = c; } +/* Look for the following pattern: + + *PBASE: MEM_REF (T1, C1) + + *POFFSET: MULT_EXPR (T2, C3) [C2 is zero] + or + MULT_EXPR (PLUS_EXPR (T2, C2), C3) + or + MULT_EXPR (MINUS_EXPR (T2, -C2), C3) + + *PINDEX: C4 * BITS_PER_UNIT + + If not present, leave the input values unchanged and return FALSE. 
+ Otherwise, modify the input values as follows and return TRUE: + + *PBASE: T1 + *POFFSET: MULT_EXPR (T2, C3) + *PINDEX: C1 + (C2 * C3) + C4 */ + +static bool +restructure_reference (tree *pbase, tree *poffset, double_int *pindex, + tree *ptype) +{ + tree base = *pbase, offset = *poffset; + double_int index = *pindex; + double_int bpu = uhwi_to_double_int (BITS_PER_UNIT); + tree mult_op0, mult_op1, t1, t2, type; + double_int c1, c2, c3, c4; + + if (!base + || !offset + || TREE_CODE (base) != MEM_REF + || TREE_CODE (offset) != MULT_EXPR + || TREE_CODE (TREE_OPERAND (offset, 1)) != INTEGER_CST + || !double_int_zero_p (double_int_umod (index, bpu, FLOOR_MOD_EXPR))) + return false; + + t1 = TREE_OPERAND (base, 0); + c1 = mem_ref_offset (base); + type = TREE_TYPE (TREE_OPERAND (base, 1)); + + mult_op0 = TREE_OPERAND (offset, 0); + mult_op1 = TREE_OPERAND (offset, 1); + + c3 = tree_to_double_int (mult_op1); + + if (TREE_CODE (mult_op0) == PLUS_EXPR) + + if (TREE_CODE (TREE_OPERAND (mult_op0, 1)) == INTEGER_CST) + { + t2 = TREE_OPERAND (mult_op0, 0); + c2 = tree_to_double_int (TREE_OPERAND (mult_op0, 1)); + } + else + return false; + + else if (TREE_CODE (mult_op0) == MINUS_EXPR) + + if (TREE_CODE (TREE_OPERAND (mult_op0, 1)) == INTEGER_CST) + { + t2 = TREE_OPERAND (mult_op0, 0); + c2 = double_int_neg (tree_to_double_int (TREE_OPERAND (mult_op0, 1))); + } + else + return false; + + else + { + t2 = mult_op0; + c2 = double_int_zero; + } + + c4 = double_int_udiv (index, bpu, FLOOR_DIV_EXPR); + + *pbase = t1; + *poffset = fold_build2 (MULT_EXPR, sizetype, t2, + double_int_to_tree (sizetype, c3)); + *pindex = double_int_add (double_int_add (c1, double_int_mul (c2, c3)), c4); + *ptype = type; + + return true; +} + +/* Given GS which contains a data reference, create a CAND_REF entry in + the candidate table and attempt to find a basis. */ + +static void +slsr_process_ref (gimple gs) +{ + tree ref_expr, base, offset, type; + HOST_WIDE_INT bitsize, bitpos; + enum machine_mode mode; + int unsignedp, volatilep; + double_int index; + slsr_cand_t c; + + if (gimple_vdef (gs)) + ref_expr = gimple_assign_lhs (gs); + else + ref_expr = gimple_assign_rhs1 (gs); + + if (!handled_component_p (ref_expr) + || TREE_CODE (ref_expr) == BIT_FIELD_REF + || (TREE_CODE (ref_expr) == COMPONENT_REF + && DECL_BIT_FIELD (TREE_OPERAND (ref_expr, 1)))) + return; + + base = get_inner_reference (ref_expr, &bitsize, &bitpos, &offset, &mode, + &unsignedp, &volatilep, false); + index = uhwi_to_double_int (bitpos); + + if (!restructure_reference (&base, &offset, &index, &type)) + return; + + c = alloc_cand_and_find_basis (CAND_REF, gs, base, index, offset, + type, 0); + + /* Add the candidate to the statement-candidate mapping. */ + add_cand_for_stmt (gs, c); +} + /* Create a candidate entry for a statement GS, where GS multiplies two SSA names BASE_IN and STRIDE_IN. Propagate any known information about the two SSA names into the new candidate. 
Return the new @@ -1056,8 +1249,12 @@ find_candidates_in_block (struct dom_walk_data *wa { gimple gs = gsi_stmt (gsi); - if (is_gimple_assign (gs) - && SCALAR_INT_MODE_P (TYPE_MODE (TREE_TYPE (gimple_assign_lhs (gs))))) + if (gimple_vuse (gs) && gimple_assign_single_p (gs)) + slsr_process_ref (gs); + + else if (is_gimple_assign (gs) + && SCALAR_INT_MODE_P + (TYPE_MODE (TREE_TYPE (gimple_assign_lhs (gs))))) { tree rhs1 = NULL_TREE, rhs2 = NULL_TREE; @@ -1151,6 +1348,15 @@ dump_candidate (slsr_cand_t c) print_generic_expr (dump_file, c->stride, 0); fputs (") : ", dump_file); break; + case CAND_REF: + fputs (" REF : ", dump_file); + print_generic_expr (dump_file, c->base_name, 0); + fputs (" + (", dump_file); + print_generic_expr (dump_file, c->stride, 0); + fputs (") + ", dump_file); + dump_double_int (dump_file, c->index, false); + fputs (" : ", dump_file); + break; default: gcc_unreachable (); } @@ -1181,36 +1387,33 @@ dump_cand_vec (void) dump_candidate (c); } -/* Dump the candidate chains. */ +/* Callback used to dump the candidate chains hash table. */ -static void -dump_cand_chains (void) +static int +base_cand_dump_callback (void **slot, void *ignored ATTRIBUTE_UNUSED) { - unsigned i; + const_cand_chain_t chain = *((const_cand_chain_t *) slot); + cand_chain_t p; - fprintf (dump_file, "\nStrength reduction candidate chains:\n\n"); + print_generic_expr (dump_file, chain->base_name, 0); + fprintf (dump_file, " -> %d", chain->cand->cand_num); - for (i = 0; i < num_ssa_names; i++) - { - const_cand_chain_t chain = base_cand_map[i]; + for (p = chain->next; p; p = p->next) + fprintf (dump_file, " -> %d", p->cand->cand_num); - if (chain) - { - cand_chain_t p; + fputs ("\n", dump_file); + return 1; +} - print_generic_expr (dump_file, chain->base_name, 0); - fprintf (dump_file, " -> %d", chain->cand->cand_num); +/* Dump the candidate chains. */ - for (p = chain->next; p; p = p->next) - fprintf (dump_file, " -> %d", p->cand->cand_num); - - fputs ("\n", dump_file); - } - } - +static void +dump_cand_chains (void) +{ + fprintf (dump_file, "\nStrength reduction candidate chains:\n\n"); + htab_traverse_noresize (base_cand_map, base_cand_dump_callback, NULL); fputs ("\n", dump_file); } - /* Recursive helper for unconditional_cands_with_known_stride_p. Returns TRUE iff C, its siblings, and its dependents are all @@ -1246,6 +1449,53 @@ unconditional_cands_with_known_stride_p (slsr_cand return unconditional_cands (lookup_cand (root->dependent)); } +/* Replace *EXPR in candidate C with an equivalent strength-reduced + data reference. */ + +static void +replace_ref (tree *expr, slsr_cand_t c) +{ + tree add_expr = fold_build2 (POINTER_PLUS_EXPR, TREE_TYPE (c->base_name), + c->base_name, c->stride); + tree mem_ref = fold_build2 (MEM_REF, TREE_TYPE (*expr), add_expr, + double_int_to_tree (c->cand_type, c->index)); + + /* Gimplify the base addressing expression for the new MEM_REF tree. */ + gimple_stmt_iterator gsi = gsi_for_stmt (c->cand_stmt); + TREE_OPERAND (mem_ref, 0) + = force_gimple_operand_gsi (&gsi, TREE_OPERAND (mem_ref, 0), + /*simple_p=*/true, NULL, + /*before=*/true, GSI_SAME_STMT); + copy_ref_info (mem_ref, *expr); + *expr = mem_ref; + update_stmt (c->cand_stmt); +} + +/* Replace CAND_REF candidate C, each sibling of candidate C, and each + dependent of candidate C with an equivalent strength-reduced data + reference. 
*/ + +static void +replace_refs (slsr_cand_t c) +{ + if (gimple_vdef (c->cand_stmt)) + { + tree *lhs = gimple_assign_lhs_ptr (c->cand_stmt); + replace_ref (lhs, c); + } + else + { + tree *rhs = gimple_assign_rhs1_ptr (c->cand_stmt); + replace_ref (rhs, c); + } + + if (c->sibling) + replace_refs (lookup_cand (c->sibling)); + + if (c->dependent) + replace_refs (lookup_cand (c->dependent)); +} + /* Calculate the increment required for candidate C relative to its basis. */ @@ -1413,13 +1663,18 @@ analyze_candidates_and_replace (void) first_dep = lookup_cand (c->dependent); + /* If this is a chain of CAND_REFs, unconditionally replace + each of them with a strength-reduced data reference. */ + if (c->kind == CAND_REF) + replace_refs (c); + /* If the common stride of all related candidates is a known constant, and none of these has a phi-dependence, then all replacements are considered profitable. Each replaces a multiply by a single add, with the possibility that a feeding add also goes dead as a result. */ - if (unconditional_cands_with_known_stride_p (c)) + else if (unconditional_cands_with_known_stride_p (c)) replace_dependents (first_dep); /* TODO: When the stride is an SSA name, it may still be @@ -1428,9 +1683,6 @@ analyze_candidates_and_replace (void) can be reused, or are less expensive to calculate than the replaced statements. */ - /* TODO: Strength-reduce data references with implicit - multiplication in their addressing expressions. */ - /* TODO: When conditional increments occur so that a candidate is dependent upon a phi-basis, the cost of introducing a temporary must be accounted for. */ @@ -1455,8 +1707,8 @@ execute_strength_reduction (void) gcc_obstack_init (&chain_obstack); /* Allocate the mapping from base names to candidate chains. */ - base_cand_map = XNEWVEC (cand_chain_t, num_ssa_names); - memset (base_cand_map, 0, num_ssa_names * sizeof (cand_chain_t)); + base_cand_map = htab_create (500, base_cand_hash, + base_cand_eq, base_cand_free); /* Initialize the loop optimizer. We need to detect flow across back edges, and this gives us dominator information as well. */ @@ -1490,7 +1742,7 @@ execute_strength_reduction (void) /* Free resources. */ fini_walk_dominator_tree (&walk_data); loop_optimizer_finalize (); - free (base_cand_map); + htab_delete (base_cand_map); obstack_free (&chain_obstack, NULL); pointer_map_destroy (stmt_cand_map); VEC_free (slsr_cand_t, heap, cand_vec);
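
For reference, here is how the bookkeeping above works out on p->b[n]
from the new tests; the values follow from the struct layout, and the
exact tree shapes may differ slightly from what get_inner_reference
returns in a real compilation.

  For p->b[n] with struct x as in slsr-27.c:

    base:    MEM_REF (p, 0)         -->  T1 = p,  C1 = 0
    offset:  (sizetype) n * 4       -->  T2 = n,  C2 = 0, C3 = 4
    bitpos:  64 * BITS_PER_UNIT     -->  C4 = 64

  restructure_reference then produces

    *pbase   = p
    *poffset = n * 4
    *pindex  = C1 + (C2 * C3) + C4 = 64

  and replace_ref rewrites the reference as

    MEM_REF (POINTER_PLUS_EXPR (p, n * 4), 64)

  The resulting CAND_REF records p as its base (B), n * 4 as its
  stride (S), and 64 as its index (i).  p->a[n] and p->c[n] yield the
  same B and S with indices 0 and 128, so the three references chain
  together via the basis relation and all of them are replaced.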