From patchwork Mon Oct 8 10:38:34 2012
X-Patchwork-Submitter: Eric Botcazou
X-Patchwork-Id: 189986
From: Eric Botcazou
To: gcc-patches@gcc.gnu.org
Subject: [RFC] Implement load sinking in loops
Date: Mon, 08 Oct 2012 12:38:34 +0200
Message-ID: <5318138.62Wp5QHEft@polaris>

Hi,

we recently noticed that, even at -O3, the compiler doesn't figure out that
the following loop is dumb:

#define SIZE 64

int foo (int v[])
{
  int r, i;

  for (i = 0; i < SIZE; i++)
    r = v[i];

  return r;
}

which was a bit of a surprise.
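[Editorial note: since the loop overwrites r on every iteration, the whole
function is equivalent to the hand-written version below; load sinking,
followed by removal of the then-empty loop by later passes, is what should
eventually get us there.  This is only an illustrative sketch, not compiler
output.]

#define SIZE 64

int foo (int v[])
{
  /* Only the value loaded on the last iteration is ever observed,
     so the function reduces to a single load.  */
  return v[SIZE - 1];
}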
On second thoughts, this isn't entirely unexpected, as it probably matters
only for (slightly) pathological cases.  The attached patch nevertheless
implements a form of load sinking in loops so as to optimize these cases.
It's combined with invariant motion to optimize:

int foo (int v[], int a)
{
  int r, i;

  for (i = 0; i < SIZE; i++)
    r = v[i] + a;

  return r;
}

and with store sinking to optimize:

int foo (int v1[], int v2[])
{
  int r[SIZE];
  int i, j;

  for (j = 0; j < SIZE; j++)
    for (i = 0; i < SIZE; i++)
      r[j] = v1[j] + v2[i];

  return r[SIZE - 1];
}

The optimization is enabled at -O2 in the patch for measurement purposes but,
given how rarely it triggers (e.g. exactly 10 occurrences in a GCC bootstrap,
compiler-only, all languages except Go), it's probably best suited to -O3.
Or perhaps we don't care and it should simply be dropped...  Thoughts?

Tested on x86_64-suse-linux.


2012-10-08  Eric Botcazou

        * gimple.h (gsi_insert_seq_on_edge_before): Declare.
        * gimple-iterator.c (gsi_insert_seq_on_edge_before): New function.
        * tree-ssa-loop-im.c (struct mem_ref_loc): Add LHS field.
        (mem_ref_in_stmt): Remove gcc_assert.
        (copy_load_and_single_use_chain): New function.
        (execute_lm): Likewise.
        (hoist_memory_references): Hoist the loads after the stores.
        (ref_always_accessed_p): Rename into...
        (ref_always_stored_p): ...this.  Remove STORE_P and add ONCE_P.
        (can_lsm_ref_p): New function extracted from...
        (can_sm_ref_p): ...here.  Call it.
        (follow_invariant_single_use_chain): New function.
        (can_lm_ref_p): Likewise.
        (find_refs_for_sm): Rename into...
        (find_refs_for_lsm): ...this.  Find load hoisting opportunities.
        (loop_suitable_for_sm): Rename into...
        (loop_suitable_for_lsm): ...this.
        (store_motion_loop): Rename into...
        (load_store_motion_loop): ...this.  Adjust calls to above functions.
        (tree_ssa_lim): Likewise.


2012-10-08  Eric Botcazou

        * gcc.dg/tree-ssa/loadmotion-1.c: New test.
        * gcc.dg/tree-ssa/loadmotion-2.c: New test.
        * gcc.dg/tree-ssa/loadmotion-3.c: New test.


Index: gimple.h
===================================================================
--- gimple.h	(revision 192137)
+++ gimple.h	(working copy)
@@ -5196,6 +5196,7 @@ void gsi_move_before (gimple_stmt_iterat
 void gsi_move_to_bb_end (gimple_stmt_iterator *, basic_block);
 void gsi_insert_on_edge (edge, gimple);
 void gsi_insert_seq_on_edge (edge, gimple_seq);
+void gsi_insert_seq_on_edge_before (edge, gimple_seq);
 basic_block gsi_insert_on_edge_immediate (edge, gimple);
 basic_block gsi_insert_seq_on_edge_immediate (edge, gimple_seq);
 void gsi_commit_one_edge_insert (edge, basic_block *);
Index: gimple-iterator.c
===================================================================
--- gimple-iterator.c	(revision 192137)
+++ gimple-iterator.c	(working copy)
@@ -677,6 +677,16 @@ gsi_insert_seq_on_edge (edge e, gimple_s
   gimple_seq_add_seq (&PENDING_STMT (e), seq);
 }

+/* Likewise, but append it instead of prepending it.  */
+
+void
+gsi_insert_seq_on_edge_before (edge e, gimple_seq seq)
+{
+  gimple_seq pending = NULL;
+  gimple_seq_add_seq (&pending, seq);
+  gimple_seq_add_seq (&pending, PENDING_STMT (e));
+  PENDING_STMT (e) = pending;
+}

 /* Insert the statement pointed-to by GSI into edge E.  Every attempt
    is made to place the statement in an existing basic block, but
Index: tree-ssa-loop-im.c
===================================================================
--- tree-ssa-loop-im.c	(revision 192137)
+++ tree-ssa-loop-im.c	(working copy)
@@ -103,6 +103,7 @@ typedef struct mem_ref_loc
 {
   tree *ref;			/* The reference itself.  */
   gimple stmt;			/* The statement in that it occurs.  */
+  tree lhs;			/* The (ultimate) LHS for a load.  */
 } *mem_ref_loc_p;

 DEF_VEC_P(mem_ref_loc_p);
@@ -674,7 +675,6 @@ mem_ref_in_stmt (gimple stmt)
   if (!mem)
     return NULL;
-  gcc_assert (!store);

   hash = iterative_hash_expr (*mem, 0);
   ref = (mem_ref_p) htab_find_with_hash (memory_accesses.refs, *mem, hash);

@@ -2192,6 +2192,140 @@ execute_sm (struct loop *loop, VEC (edge
   execute_sm_if_changed (ex, ref->mem, tmp_var, store_flag);
 }

+/* Copy the load and the chain of single uses described by LOC and return the
+   sequence of new statements.  Also set NEW_LHS to the copy of LOC->LHS.  */
+
+static gimple_seq
+copy_load_and_single_use_chain (mem_ref_loc_p loc, tree *new_lhs)
+{
+  tree mem = *loc->ref;
+  tree lhs, tmp_var, ssa_name;
+  gimple_seq seq = NULL;
+  gimple stmt;
+  unsigned n = 0;
+
+  /* First copy the load and create the new LHS for it.  */
+  lhs = gimple_assign_lhs (loc->stmt);
+  tmp_var = create_tmp_reg (TREE_TYPE (lhs), get_lsm_tmp_name (mem, n++));
+  stmt = gimple_build_assign (tmp_var, unshare_expr (mem));
+  ssa_name = make_ssa_name (tmp_var, stmt);
+  gimple_assign_set_lhs (stmt, ssa_name);
+  gimple_seq_add_stmt (&seq, stmt);
+
+  /* Then follow the single-use chain and repeat the operation.  */
+  while (lhs != loc->lhs)
+    {
+      tree op1, op2;
+      use_operand_p use;
+      gimple use_stmt;
+      bool ret = single_imm_use (lhs, &use, &use_stmt);
+      enum tree_code rhs_code = gimple_assign_rhs_code (use_stmt);
+      gcc_assert (ret);
+
+      if (gimple_assign_rhs1 (use_stmt) == lhs)
+        {
+          op1 = ssa_name;
+          op2 = gimple_assign_rhs2 (use_stmt);
+        }
+      else
+        {
+          op1 = gimple_assign_rhs1 (use_stmt);
+          op2 = ssa_name;
+        }
+
+      lhs = gimple_assign_lhs (use_stmt);
+      tmp_var = create_tmp_reg (TREE_TYPE (lhs), get_lsm_tmp_name (mem, n++));
+      stmt = gimple_build_assign_with_ops (rhs_code, tmp_var, op1, op2);
+      ssa_name = make_ssa_name (tmp_var, stmt);
+      gimple_assign_set_lhs (stmt, ssa_name);
+      gimple_seq_add_stmt (&seq, stmt);
+    }
+
+  *new_lhs = ssa_name;
+  return seq;
+}
+
+/* Execute load motion of memory reference REF from LOOP.
+   Exits from the LOOP are stored in EXITS.  The original loads are turned
+   into null assignments and the new loads are emitted on the exit edges.  */
+
+static void
+execute_lm (struct loop *loop, VEC (edge, heap) *exits, mem_ref_p ref)
+{
+  VEC (mem_ref_loc_p, heap) *locs = NULL;
+  mem_ref_loc_p loc;
+  unsigned i;
+
+  if (dump_file && (dump_flags & TDF_DETAILS))
+    {
+      fprintf (dump_file, "Executing load motion of ");
+      print_generic_expr (dump_file, ref->mem, 0);
+      fprintf (dump_file, " from loop %d\n", loop->num);
+    }
+
+  get_all_locs_in_loop (loop, ref, &locs);
+
+  /* We have two cases: either the LHS was assigned to a reference that has
+     been hoisted out of the loop or it wasn't used within the loop.  */
+  FOR_EACH_VEC_ELT (mem_ref_loc_p, locs, i, loc)
+    {
+      imm_use_iterator iter;
+      gimple use_stmt;
+      tree new_lhs;
+
+      FOR_EACH_IMM_USE_STMT (use_stmt, iter, loc->lhs)
+        if (gimple_debug_bind_p (use_stmt))
+          {
+            gimple_debug_bind_reset_value (use_stmt);
+            update_stmt (use_stmt);
+          }
+        else if (gimple_assign_single_p (use_stmt))
+          {
+            tree var = gimple_assign_lhs (use_stmt);
+            unsigned j;
+            edge ex;
+
+            /* This must be a variable replacing a former store.  */
+            gcc_assert (TREE_CODE (var) == VAR_DECL);
+
+            /* Insert the load and the single-use chain on every exit edge,
+               before the store that has been previously inserted there.  */
+            FOR_EACH_VEC_ELT (edge, exits, j, ex)
+              {
+                gimple_seq s = copy_load_and_single_use_chain (loc, &new_lhs);
+                gimple stmt = gimple_build_assign (var, new_lhs);
+                gimple_seq_add_stmt (&s, stmt);
+                gsi_insert_seq_on_edge_before (ex, s);
+              }
+          }
+        else
+          {
+            use_operand_p use;
+
+            /* We are supposed to be in loop-closed SSA form.  */
+            gcc_assert (gimple_code (use_stmt) == GIMPLE_PHI);
+
+            /* Insert the load and the single-use chain on every relevant edge
+               of the PHI node and set the PHI argument to the new LHS.  */
+            FOR_EACH_IMM_USE_ON_STMT (use, iter)
+              {
+                edge e = gimple_phi_arg_edge (use_stmt,
+                                              PHI_ARG_INDEX_FROM_USE (use));
+                gimple_seq s = copy_load_and_single_use_chain (loc, &new_lhs);
+                gsi_insert_seq_on_edge (e, s);
+                SET_USE (use, new_lhs);
+              }
+          }
+
+      /* Finally kill the original load.  */
+      gimple_assign_set_rhs1 (loc->stmt,
+                              build_zero_cst (TREE_TYPE (ref->mem)));
+      update_stmt (loc->stmt);
+    }
+
+  VEC_free (mem_ref_loc_p, heap, locs);
+}
+
 /* Hoists memory references MEM_REFS out of LOOP.  EXITS is the list of exit
    edges of the LOOP.  */

@@ -2203,67 +2337,77 @@ hoist_memory_references (struct loop *lo
   unsigned i;
   bitmap_iterator bi;

+  /* First hoist the stores.  */
+  EXECUTE_IF_SET_IN_BITMAP (mem_refs, 0, i, bi)
+    {
+      ref = VEC_index (mem_ref_p, memory_accesses.refs_list, i);
+      if (bitmap_bit_p (ref->stored, loop->num))
+        execute_sm (loop, exits, ref);
+    }
+
+  /* Then hoist the loads.  */
   EXECUTE_IF_SET_IN_BITMAP (mem_refs, 0, i, bi)
     {
       ref = VEC_index (mem_ref_p, memory_accesses.refs_list, i);
-      execute_sm (loop, exits, ref);
+      if (!bitmap_bit_p (ref->stored, loop->num))
+        execute_lm (loop, exits, ref);
     }
 }

-/* Returns true if REF is always accessed in LOOP.  If STORED_P is true
-   make sure REF is always stored to in LOOP.  */
+/* Return true if REF is always stored to in LOOP.  If ONCE_P is true, make
+   sure REF is stored to exactly once in LOOP.  */

 static bool
-ref_always_accessed_p (struct loop *loop, mem_ref_p ref, bool stored_p)
+ref_always_stored_p (struct loop *loop, mem_ref_p ref, bool once_p)
 {
   VEC (mem_ref_loc_p, heap) *locs = NULL;
   unsigned i;
   mem_ref_loc_p loc;
   bool ret = false;
   struct loop *must_exec;
-  tree base;
+  tree base, lhs;

   base = get_base_address (ref->mem);
-  if (INDIRECT_REF_P (base)
-      || TREE_CODE (base) == MEM_REF)
+  if (INDIRECT_REF_P (base) || TREE_CODE (base) == MEM_REF)
     base = TREE_OPERAND (base, 0);

   get_all_locs_in_loop (loop, ref, &locs);

+  if (once_p && VEC_length (mem_ref_loc_p, locs) != 1)
+    {
+      VEC_free (mem_ref_loc_p, heap, locs);
+      return false;
+    }
+
   FOR_EACH_VEC_ELT (mem_ref_loc_p, locs, i, loc)
     {
       if (!get_lim_data (loc->stmt))
        continue;

-      /* If we require an always executed store make sure the statement
-         stores to the reference.  */
-      if (stored_p)
-        {
-          tree lhs;
-          if (!gimple_get_lhs (loc->stmt))
-            continue;
-          lhs = get_base_address (gimple_get_lhs (loc->stmt));
-          if (!lhs)
-            continue;
-          if (INDIRECT_REF_P (lhs)
-              || TREE_CODE (lhs) == MEM_REF)
-            lhs = TREE_OPERAND (lhs, 0);
-          if (lhs != base)
-            continue;
-        }
+      if (!gimple_get_lhs (loc->stmt))
+        continue;
+
+      lhs = get_base_address (gimple_get_lhs (loc->stmt));
+      if (!lhs)
+        continue;
+
+      if (INDIRECT_REF_P (lhs) || TREE_CODE (lhs) == MEM_REF)
+        lhs = TREE_OPERAND (lhs, 0);
+
+      if (lhs != base)
+        continue;

       must_exec = get_lim_data (loc->stmt)->always_executed_in;
       if (!must_exec)
        continue;

-      if (must_exec == loop
-          || flow_loop_nested_p (must_exec, loop))
+      if (must_exec == loop || flow_loop_nested_p (must_exec, loop))
        {
          ret = true;
          break;
        }
     }
-  VEC_free (mem_ref_loc_p, heap, locs);

+  VEC_free (mem_ref_loc_p, heap, locs);
   return ret;
 }

@@ -2375,77 +2519,234 @@ ref_indep_loop_p (struct loop *loop, mem
   return ret;
 }

-/* Returns true if we can perform store motion of REF from LOOP.  */
+/* Return true if we can perform load or store motion of REF from LOOP.
+   FOR_SM is true if this is for store motion or false for load motion.  */

 static bool
-can_sm_ref_p (struct loop *loop, mem_ref_p ref)
+can_lsm_ref_p (struct loop *loop, mem_ref_p ref, bool for_sm)
 {
-  tree base;
+  bitmap refs;
+  bool stored;

   /* Can't hoist unanalyzable refs.  */
   if (!MEM_ANALYZABLE (ref))
     return false;

-  /* Unless the reference is stored in the loop, there is nothing to do.  */
-  if (!bitmap_bit_p (ref->stored, loop->num))
+  /* Unless the reference is accessed in the loop, there is nothing to do.  */
+  refs = VEC_index (bitmap, memory_accesses.all_refs_in_loop, loop->num);
+  if (!bitmap_bit_p (refs, ref->id))
+    return false;
+
+  /* The reference must be stored to for store motion and may not be stored
+     to for load motion.  */
+  stored = bitmap_bit_p (ref->stored, loop->num);
+  if (stored != for_sm)
     return false;

   /* It should be movable.  */
   if (!is_gimple_reg_type (TREE_TYPE (ref->mem))
-      || TREE_THIS_VOLATILE (ref->mem)
-      || !for_each_index (&ref->mem, may_move_till, loop))
+      || TREE_THIS_VOLATILE (ref->mem))
     return false;

   /* If it can throw fail, we do not properly update EH info.  */
   if (tree_could_throw_p (ref->mem))
     return false;

-  /* If it can trap, it must be always executed in LOOP.
-     Readonly memory locations may trap when storing to them, but
+  return true;
+}
+
+/* Return true if we can perform store motion of REF from LOOP.  */
+
+static bool
+can_sm_ref_p (struct loop *loop, mem_ref_p ref)
+{
+  tree base;
+
+  if (!can_lsm_ref_p (loop, ref, true))
+    return false;
+
+  /* It must be possible to hoist the index out of the loop.  */
+  if (!for_each_index (&ref->mem, may_move_till, loop))
+    return false;
+
+  /* If it can trap, a store must be always executed in LOOP.
+     Read-only memory locations may trap when storing to them, but
      tree_could_trap_p is a predicate for rvalues, so check that
     explicitly.  */
   base = get_base_address (ref->mem);
   if ((tree_could_trap_p (ref->mem)
        || (DECL_P (base) && TREE_READONLY (base)))
-      && !ref_always_accessed_p (loop, ref, true))
+      && !ref_always_stored_p (loop, ref, false))
     return false;

-  /* And it must be independent on all other memory references
-     in LOOP.  */
+  /* And the store must be independent of other memory references in LOOP.  */
   if (!ref_indep_loop_p (loop, ref))
     return false;

   return true;
 }

-/* Marks the references in LOOP for that store motion should be performed
-   in REFS_TO_SM.  SM_EXECUTED is the set of references for that store
+/* Follow a chain of single uses in LOOP starting from SSA_NAME and containing
+   only invariants of LOOP.  Return the end of the chain.  */
+
+static tree
+follow_invariant_single_use_chain (struct loop *loop, tree ssa_name)
+{
+  use_operand_p use;
+  gimple use_stmt;
+
+  while (single_imm_use (ssa_name, &use, &use_stmt)
+         && flow_bb_inside_loop_p (loop, gimple_bb (use_stmt)))
+    {
+      tree other_op;
+
+      if (!is_gimple_assign (use_stmt)
+          || gimple_assign_rhs_class (use_stmt) != GIMPLE_BINARY_RHS)
+        break;
+
+      if (gimple_assign_rhs1 (use_stmt) == ssa_name)
+        other_op = gimple_assign_rhs2 (use_stmt);
+      else
+        other_op = gimple_assign_rhs1 (use_stmt);
+
+      if (outermost_invariant_loop (other_op, loop) == NULL)
+        break;
+
+      ssa_name = gimple_assign_lhs (use_stmt);
+    }
+
+  return ssa_name;
+}
+
+/* Return true if we can perform load motion of REF from LOOP.
+   REFS_TO_SM is the set of references for which store motion will be
+   performed in LOOP.  */
+
+static bool
+can_lm_ref_p (struct loop *loop, mem_ref_p ref, bitmap refs_to_sm)
+{
+  VEC (mem_ref_loc_p, heap) *locs = NULL;
+  mem_ref_loc_p loc;
+  unsigned i;
+
+  if (!can_lsm_ref_p (loop, ref, false))
+    return false;
+
+  get_all_locs_in_loop (loop, ref, &locs);
+
+  /* The LHS may not be used within the loop or the LHS must be the head of
+     an invariant single-use chain whose end has this property.  */
+  FOR_EACH_VEC_ELT (mem_ref_loc_p, locs, i, loc)
+    {
+      struct lim_aux_data *lim_data;
+      struct loop *must_exec;
+      imm_use_iterator iter;
+      gimple use_stmt;
+      bool fail = false;
+      tree lhs;
+
+      /* Don't look at loads that will be hoisted as invariant.  */
+      lim_data = get_lim_data (loc->stmt);
+      if (!lim_data || lim_data->max_loop != NULL)
+        goto failure;
+
+      /* The loads must always be executed in LOOP.  */
+      must_exec = lim_data->always_executed_in;
+      if (!must_exec
+          || !(must_exec == loop || flow_loop_nested_p (must_exec, loop)))
+        goto failure;
+
+      lhs = gimple_assign_lhs (loc->stmt);
+
+      /* Follow an invariant single-use chain and retrieve its end.  */
+      lhs = follow_invariant_single_use_chain (loop, lhs);
+
+      FOR_EACH_IMM_USE_STMT (use_stmt, iter, lhs)
+        if (gimple_debug_bind_p (use_stmt))
+          ;
+        else if (flow_bb_inside_loop_p (loop, gimple_bb (use_stmt)))
+          {
+            mem_ref_p use_ref;
+
+            /* If the LHS is assigned to references that will be hoisted out
+               of the loop, then the condition will be fulfilled after store
+               motion.  But we need to make sure the assignments are always
+               executed once since we cannot insert PHI nodes on edges.  */
+            if (!gimple_has_volatile_ops (use_stmt)
+                && gimple_assign_single_p (use_stmt)
+                && gimple_assign_rhs1 (use_stmt) == lhs
+                && gimple_vdef (use_stmt)
+                && (use_ref = mem_ref_in_stmt (use_stmt)) != NULL
+                && bitmap_bit_p (refs_to_sm, use_ref->id)
+                && ref_always_stored_p (loop, use_ref, true))
+              continue;
+
+            fail = true;
+            BREAK_FROM_IMM_USE_STMT (iter);
+          }
+
+      if (fail)
+        goto failure;
+
+      /* Record the end of the chain.  */
+      loc->lhs = lhs;
+    }
+
+  /* And the load must be independent of other memory references in LOOP.  */
+  if (!ref_indep_loop_p (loop, ref))
+    goto failure;
+
+  VEC_free (mem_ref_loc_p, heap, locs);
+  return true;
+
+failure:
+  VEC_free (mem_ref_loc_p, heap, locs);
+  return false;
+}
+
+/* Mark references in LOOP for which load-store motion should be performed
+   in REFS_TO_LSM.  LSM_EXECUTED is the set of references for which load-store
    motion was performed in one of the outer loops.  */

 static void
-find_refs_for_sm (struct loop *loop, bitmap sm_executed, bitmap refs_to_sm)
+find_refs_for_lsm (struct loop *loop, bitmap lsm_executed, bitmap refs_to_lsm)
 {
+  bitmap refs_to_sm = BITMAP_ALLOC (NULL);
   bitmap refs = VEC_index (bitmap, memory_accesses.all_refs_in_loop,
                           loop->num);
   unsigned i;
   bitmap_iterator bi;
   mem_ref_p ref;

-  EXECUTE_IF_AND_COMPL_IN_BITMAP (refs, sm_executed, 0, i, bi)
+  /* First mark references for which store motion should be performed, as
+     this will enable load motion for further references.  */
+  EXECUTE_IF_AND_COMPL_IN_BITMAP (refs, lsm_executed, 0, i, bi)
     {
       ref = VEC_index (mem_ref_p, memory_accesses.refs_list, i);
       if (can_sm_ref_p (loop, ref))
        bitmap_set_bit (refs_to_sm, i);
     }
+
+  /* Then mark references for which load motion should be performed, taking
+     into account the result of the above analysis.  */
+  EXECUTE_IF_AND_COMPL_IN_BITMAP (refs, lsm_executed, 0, i, bi)
+    {
+      ref = VEC_index (mem_ref_p, memory_accesses.refs_list, i);
+      if (can_lm_ref_p (loop, ref, refs_to_sm))
+        bitmap_set_bit (refs_to_lsm, i);
+    }
+
+  bitmap_ior_into (refs_to_lsm, refs_to_sm);
+  BITMAP_FREE (refs_to_sm);
 }

-/* Checks whether LOOP (with exits stored in EXITS array) is suitable
-   for a store motion optimization (i.e. whether we can insert statement
+/* Check whether LOOP (with exits stored in EXITS array) is suitable for
+   a load-store motion optimization (i.e. whether we can insert statement
    on its exits).  */

 static bool
-loop_suitable_for_sm (struct loop *loop ATTRIBUTE_UNUSED,
-                      VEC (edge, heap) *exits)
+loop_suitable_for_lsm (struct loop *loop ATTRIBUTE_UNUSED,
+                       VEC (edge, heap) *exits)
 {
   unsigned i;
   edge ex;

@@ -2457,44 +2758,44 @@ loop_suitable_for_sm (struct loop
   return true;
 }

-/* Try to perform store motion for all memory references modified inside
-   LOOP.  SM_EXECUTED is the bitmap of the memory references for that
+/* Try to perform load-store motion for all memory references modified inside
+   LOOP.  LSM_EXECUTED is the bitmap of the memory references for which load-
    store motion was executed in one of the outer loops.  */

 static void
-store_motion_loop (struct loop *loop, bitmap sm_executed)
+load_store_motion_loop (struct loop *loop, bitmap lsm_executed)
 {
   VEC (edge, heap) *exits = get_loop_exit_edges (loop);
   struct loop *subloop;
-  bitmap sm_in_loop = BITMAP_ALLOC (NULL);
+  bitmap lsm_in_loop = BITMAP_ALLOC (NULL);

-  if (loop_suitable_for_sm (loop, exits))
+  if (loop_suitable_for_lsm (loop, exits))
     {
-      find_refs_for_sm (loop, sm_executed, sm_in_loop);
-      hoist_memory_references (loop, sm_in_loop, exits);
+      find_refs_for_lsm (loop, lsm_executed, lsm_in_loop);
+      hoist_memory_references (loop, lsm_in_loop, exits);
     }
   VEC_free (edge, heap, exits);

-  bitmap_ior_into (sm_executed, sm_in_loop);
+  bitmap_ior_into (lsm_executed, lsm_in_loop);
   for (subloop = loop->inner; subloop != NULL; subloop = subloop->next)
-    store_motion_loop (subloop, sm_executed);
-  bitmap_and_compl_into (sm_executed, sm_in_loop);
-  BITMAP_FREE (sm_in_loop);
+    load_store_motion_loop (subloop, lsm_executed);
+  bitmap_and_compl_into (lsm_executed, lsm_in_loop);
+  BITMAP_FREE (lsm_in_loop);
 }

-/* Try to perform store motion for all memory references modified inside
+/* Try to perform load-store motion for all memory references modified inside
    loops.  */

 static void
-store_motion (void)
+load_store_motion (void)
 {
   struct loop *loop;
-  bitmap sm_executed = BITMAP_ALLOC (NULL);
+  bitmap lsm_executed = BITMAP_ALLOC (NULL);

   for (loop = current_loops->tree_root->inner; loop != NULL; loop = loop->next)
-    store_motion_loop (loop, sm_executed);
+    load_store_motion_loop (loop, lsm_executed);

-  BITMAP_FREE (sm_executed);
+  BITMAP_FREE (lsm_executed);
   gsi_commit_edge_inserts ();
 }

@@ -2652,9 +2953,9 @@ tree_ssa_lim (void)
      invariant and cost for computing the invariant.  */
   determine_invariantness ();

-  /* Execute store motion.  Force the necessary invariants to be moved
+  /* Execute load-store motion.  Force the necessary invariants to be moved
      out of the loops as well.  */
-  store_motion ();
+  load_store_motion ();

   /* Move the expressions that are expensive enough.  */
   todo = move_computations ();
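
[Editorial note: for the third example above, a hand-written sketch of the
intended end result is shown below.  The patch itself only sinks the store to
r[j] out of the inner loop and sinks the load of v2[i] along with it onto the
exit edge; removing the resulting empty inner loop is left to later passes.
This is illustrative only, not compiler output.]

#define SIZE 64

int foo (int v1[], int v2[])
{
  int r[SIZE];
  int i, j;

  for (j = 0; j < SIZE; j++)
    {
      for (i = 0; i < SIZE; i++)
        ;
      /* Only the last iteration's value reaches the sunk store.  */
      r[j] = v1[j] + v2[SIZE - 1];
    }

  return r[SIZE - 1];
}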