From patchwork Wed Apr 6 11:08:34 2011
X-Patchwork-Submitter: Andrey Belevantsev
X-Patchwork-Id: 90006
Message-ID: <4D9C49B2.9030302@ispras.ru>
Date: Wed, 06 Apr 2011 15:08:34 +0400
From: Andrey Belevantsev
To: GCC Patches
CC: Steve Ellcey, Jakub Jelinek, "Vladimir N. Makarov", Alexander Monakov
Subject: [4.5] Backport various selective scheduler patches to 4.5 branch

Hello,

As we discussed in PR 43603, we need to do another round of sel-sched patches' backporting to 4.5 -- there were already a few reports about problems that are fixed on trunk.
I have successfully bootstrapped and tested on x86-64 and ia64 the unified patch (attached) that backports fixes for the following PRs: 43603 45352 45354 45570 45652 46204 46518 46521 46522 46585 46602 46649 46875 47036 48144

The only changes outside of the sel-sched* files are the introduction of the get_reg_base_value function in alias.c for PR 45652 and the fix to sched_create_recovery_edges in haifa-sched.c for dominator info updates in PR 43603. Both of those are safe. All of the patches have been on trunk for quite some time.

There are 13 separate patches (some of the bugs required several patches, which were merged into one during the backport), and I plan to commit them separately tomorrow unless the RMs (or anybody else) object. The patch for PR 48144 is not yet committed to 4.6; I will do that backport separately (it is the only one that requires backporting atm).

Andrey

diff --git a/gcc/ChangeLog b/gcc/ChangeLog index 682ae23..9d9324c 100644 --- a/gcc/ChangeLog +++ b/gcc/ChangeLog @@ -1,3 +1,178 @@ +2011-04-06 Andrey Belevantsev + + Backport from mainline + 2011-03-26 Andrey Belevantsev + + PR rtl-optimization/48144 + * sel-sched-ir.c (merge_history_vect): Factor out from ... + (merge_expr_data): ... here. + (av_set_intersect): Rename to av_set_code_motion_filter. + Update all callers. Call merge_history_vect when an expression + is found in both sets. + * sel-sched-ir.h (av_set_code_motion_filter): Add prototype. + +2011-04-06 Andrey Belevantsev + + Backport from mainline + 2011-01-13 Andrey Belevantsev + + PR rtl-optimization/45352 + * sel-sched.c: Update copyright years. + (reset_sched_cycles_in_current_ebb): Also recheck the DFA state + in the advancing loop when we have issued issue_rate insns. + + Backport from mainline + 2010-12-22 Andrey Belevantsev + + PR rtl-optimization/45352 + PR rtl-optimization/46521 + PR rtl-optimization/46522 + * sel-sched.c (reset_sched_cycles_in_current_ebb): Recheck the DFA state + on the last iteration of the advancing loop.
+ (sel_sched_region_1): Propagate the rescheduling bit to the next block + also for empty blocks. + + Backport from mainline + 2010-11-08 Andrey Belevantsev + + PR rtl-optimization/45352 + * sel-sched.c (find_best_expr): Do not set pneed_stall when + the variable_issue hook is not implemented. + (fill_insns): Remove dead variable stall_iterations. + (init_seqno_1): Force EBB start for resetting sched cycles on any + successor blocks of the rescheduled region. + (sel_sched_region_1): Use bitmap_bit_p instead of bitmap_clear_bit. + (reset_sched_cycles_in_current_ebb): Add debug printing. + New variable issued_insns. Advance state when we have issued + issue_rate insns. + +2011-04-06 Andrey Belevantsev + + Backport from mainline + 2010-12-24 Alexander Monakov + + PR rtl-optimization/47036 + * sel-sched-ir.c (fallthru_bb_of_jump): Remove special support for + unconditional jumps. + * sel-sched.c (moveup_expr): Ditto. + +2011-04-06 Andrey Belevantsev + + Backport from mainline + 2010-12-15 Alexander Monakov + + PR rtl-optimization/46649 + * sel-sched-ir.c (purge_empty_blocks): Unconditionally skip the first + basic block in the region. + +2011-04-06 Andrey Belevantsev + + Backport from mainline + 2010-12-14 Alexander Monakov + + PR rtl-optimization/46875 + * sched-vis.c (print_pattern): Dump "sequence" for ADDR_VECs. + * sel-sched-ir.c (bb_has_removable_jump_to_p): Forbid table jumps. + +2011-04-06 Andrey Belevantsev + + Backport from mainline + 2010-12-07 Andrey Belevantsev + PR target/43603 + * haifa-sched.c (sched_create_recovery_edges): Update + dominator info. + * sel-sched-ir.c (maybe_tidy_empty_bb): Update dominator info + after deleting an empty block. + (tidy_control_flow): Also verify dominators. + (sel_remove_bb): Update dominator info after removing a block. + (sel_redirect_edge_and_branch_force): Assert that no unreachable + blocks will be created. Update dominator info. 
+ (sel_redirect_edge_and_branch): Update dominator info when + basic blocks do not become unreachable. + (sel_remove_loop_preheader): Update dominator info. + + 2010-10-14 Andrey Belevantsev + + * sel-sched-ir.c (maybe_tidy_empty_bb): Simplify comment. + (tidy_control_flow): Tidy vertical space. + (sel_remove_bb): New variable idx. Use it to remember the basic + block index before deleting the block. + (sel_remove_empty_bb): Remove dead code, simplify and insert to ... + (sel_merge_blocks): ... here. + * sel-sched-ir.h (sel_remove_empty_bb): Remove prototype. + +2011-04-06 Andrey Belevantsev + + Backport from mainline + 2010-12-03 Alexander Monakov + + PR rtl-optimization/45354 + * sel-sched-ir.c (jump_leads_only_to_bb_p): Rename to ... + (bb_has_removable_jump_to_p): This. Update all callers. Make static. + Allow BBs ending with a conditional jump. Forbid EDGE_CROSSING jumps. + * sel-sched-ir.h (jump_leads_only_to_bb_p): Delete prototype. + +2011-04-06 Andrey Belevantsev + + Backport from mainline + 2010-11-25 Alexander Monakov + + PR rtl-optimization/46585 + * sel-sched-ir.c (return_regset_to_pool): Verify that RS is not NULL. + (vinsn_init): Skip computation of dependencies for local NOPs. + (vinsn_delete): Don't try to free regsets for local NOPs. + (setup_nop_and_exit_insns): Change definition of nop_pattern. + +2011-04-06 Andrey Belevantsev + + Backport from mainline + 2010-11-25 Alexander Monakov + + PR rtl-optimization/46602 + * sel-sched-ir.c (maybe_tidy_empty_bb): Move checking ... + (tidy_control_flow): Here. + +2011-04-06 Andrey Belevantsev + + Backport from mainline + 2010-11-22 Alexander Monakov + + PR rtl-optimization/45652 + * alias.c (get_reg_base_value): New. + * rtl.h (get_reg_base_value): Add prototype. + * sel-sched.c (init_regs_for_mode): Use it. Don't use registers with + non-null REG_BASE_VALUE for renaming. 
+ +2011-04-06 Andrey Belevantsev + + Backport from mainline + 2010-11-18 Alexander Monakov + + PR middle-end/46518 + * sel-sched-ir.c (init_expr): Use the correct type for + target_available. + * sel-sched.c (fill_vec_av_set): Use explicitly signed char type. + +2011-04-06 Andrey Belevantsev + + Backport from mainline + 2010-11-12 Alexander Monakov + + PR rtl-optimization/46204 + * sel-sched-ir.c (maybe_tidy_empty_bb): Remove second argument. + Update all callers. Do not recompute topological order. Adjust + fallthrough edges following a degenerate conditional jump. + +2011-04-06 Andrey Belevantsev + + Backport from mainline + 2010-10-14 Andrey Belevantsev + + PR rtl-optimization/45570 + * sel-sched-ir.c (cfg_preds_1): When walking out of the region, + assert that we are pipelining outer loops. Allow returning + zero predecessors. + 2011-03-30 H.J. Lu Backport from mainline diff --git a/gcc/alias.c b/gcc/alias.c index 7d3d343..dcf3ec5 100644 --- a/gcc/alias.c +++ b/gcc/alias.c @@ -1228,6 +1228,14 @@ record_set (rtx dest, const_rtx set, void *data ATTRIBUTE_UNUSED) reg_seen[regno] = 1; } +/* Return REG_BASE_VALUE for REGNO. Selective scheduler uses this to avoid + using hard registers with non-null REG_BASE_VALUE for renaming. */ +rtx +get_reg_base_value (unsigned int regno) +{ + return VEC_index (rtx, reg_base_value, regno); +} + /* If a value is known for REGNO, return it. */ rtx diff --git a/gcc/haifa-sched.c b/gcc/haifa-sched.c index 09ac219..04c027d 100644 --- a/gcc/haifa-sched.c +++ b/gcc/haifa-sched.c @@ -4442,6 +4442,8 @@ sched_create_recovery_edges (basic_block first_bb, basic_block rec, edge_flags = 0; make_single_succ_edge (rec, second_bb, edge_flags); + if (dom_info_available_p (CDI_DOMINATORS)) + set_immediate_dominator (CDI_DOMINATORS, rec, first_bb); } /* This function creates recovery code for INSN.
If MUTATE_P is nonzero, diff --git a/gcc/rtl.h b/gcc/rtl.h index 3a3882b..707b53e 100644 --- a/gcc/rtl.h +++ b/gcc/rtl.h @@ -2410,6 +2410,7 @@ extern rtx find_base_term (rtx); extern rtx gen_hard_reg_clobber (enum machine_mode, unsigned int); extern rtx get_reg_known_value (unsigned int); extern bool get_reg_known_equiv_p (unsigned int); +extern rtx get_reg_base_value (unsigned int); #ifdef STACK_REGS extern int stack_regs_mentioned (const_rtx insn); diff --git a/gcc/sched-vis.c b/gcc/sched-vis.c index 5754a56..805b8cf 100644 --- a/gcc/sched-vis.c +++ b/gcc/sched-vis.c @@ -600,7 +600,7 @@ print_pattern (char *buf, const_rtx x, int verbose) sprintf (buf, "asm {%s}", XSTR (x, 0)); break; case ADDR_VEC: - break; + /* Fall through. */ case ADDR_DIFF_VEC: print_value (buf, XEXP (x, 0), verbose); break; diff --git a/gcc/sel-sched-ir.c b/gcc/sel-sched-ir.c index 47dcf73..429f4ce 100644 --- a/gcc/sel-sched-ir.c +++ b/gcc/sel-sched-ir.c @@ -152,7 +152,9 @@ static void free_history_vect (VEC (expr_history_def, heap) **); static void move_bb_info (basic_block, basic_block); static void remove_empty_bb (basic_block, bool); +static void sel_merge_blocks (basic_block, basic_block); static void sel_remove_loop_preheader (void); +static bool bb_has_removable_jump_to_p (basic_block, basic_block); static bool insn_is_the_only_one_in_bb_p (insn_t); static void create_initial_data_sets (basic_block); @@ -939,6 +941,7 @@ get_clear_regset_from_pool (void) void return_regset_to_pool (regset rs) { + gcc_assert (rs); regset_pool.diff--; if (regset_pool.n == regset_pool.s) @@ -1172,6 +1175,9 @@ vinsn_init (vinsn_t vi, insn_t insn, bool force_unique_p) VINSN_COUNT (vi) = 0; vi->cost = -1; + if (INSN_NOP_P (insn)) + return; + if (DF_INSN_UID_SAFE_GET (INSN_UID (insn)) != NULL) init_id_from_df (VINSN_ID (vi), insn, force_unique_p); else @@ -1253,9 +1259,12 @@ vinsn_delete (vinsn_t vi) { gcc_assert (VINSN_COUNT (vi) == 0); - return_regset_to_pool (VINSN_REG_SETS (vi)); - return_regset_to_pool 
(VINSN_REG_USES (vi)); - return_regset_to_pool (VINSN_REG_CLOBBERS (vi)); + if (!INSN_NOP_P (VINSN_INSN_RTX (vi))) + { + return_regset_to_pool (VINSN_REG_SETS (vi)); + return_regset_to_pool (VINSN_REG_USES (vi)); + return_regset_to_pool (VINSN_REG_CLOBBERS (vi)); + } free (vi); } @@ -1555,6 +1564,20 @@ free_history_vect (VEC (expr_history_def, heap) **pvect) *pvect = NULL; } +/* Merge vector FROM to PVECT. */ +static void +merge_history_vect (VEC (expr_history_def, heap) **pvect, + VEC (expr_history_def, heap) *from) +{ + expr_history_def *phist; + int i; + + /* We keep this vector sorted. */ + for (i = 0; VEC_iterate (expr_history_def, from, i, phist); i++) + insert_in_history_vect (pvect, phist->uid, phist->type, + phist->old_expr_vinsn, phist->new_expr_vinsn, + phist->spec_ds); +} /* Compare two vinsns as rhses if possible and as vinsns otherwise. */ bool @@ -1592,7 +1615,7 @@ static void init_expr (expr_t expr, vinsn_t vi, int spec, int use, int priority, int sched_times, int orig_bb_index, ds_t spec_done_ds, ds_t spec_to_check_ds, int orig_sched_cycle, - VEC(expr_history_def, heap) *history, bool target_available, + VEC(expr_history_def, heap) *history, signed char target_available, bool was_substituted, bool was_renamed, bool needs_spec_check_p, bool cant_move) { @@ -1787,9 +1810,6 @@ update_speculative_bits (expr_t to, expr_t from, insn_t split_point) void merge_expr_data (expr_t to, expr_t from, insn_t split_point) { - int i; - expr_history_def *phist; - /* For now, we just set the spec of resulting expr to be minimum of the specs of merged exprs. */ if (EXPR_SPEC (to) > EXPR_SPEC (from)) @@ -1813,20 +1833,12 @@ merge_expr_data (expr_t to, expr_t from, insn_t split_point) EXPR_ORIG_SCHED_CYCLE (to) = MIN (EXPR_ORIG_SCHED_CYCLE (to), EXPR_ORIG_SCHED_CYCLE (from)); - /* We keep this vector sorted. 
*/ - for (i = 0; - VEC_iterate (expr_history_def, EXPR_HISTORY_OF_CHANGES (from), - i, phist); - i++) - insert_in_history_vect (&EXPR_HISTORY_OF_CHANGES (to), - phist->uid, phist->type, - phist->old_expr_vinsn, phist->new_expr_vinsn, - phist->spec_ds); - EXPR_WAS_SUBSTITUTED (to) |= EXPR_WAS_SUBSTITUTED (from); EXPR_WAS_RENAMED (to) |= EXPR_WAS_RENAMED (from); EXPR_CANT_MOVE (to) |= EXPR_CANT_MOVE (from); + merge_history_vect (&EXPR_HISTORY_OF_CHANGES (to), + EXPR_HISTORY_OF_CHANGES (from)); update_target_availability (to, from, split_point); update_speculative_bits (to, from, split_point); } @@ -2319,16 +2331,24 @@ av_set_split_usefulness (av_set_t av, int prob, int all_prob) } /* Leave in AVP only those expressions, which are present in AV, - and return it. */ + and return it, merging history expressions. */ void -av_set_intersect (av_set_t *avp, av_set_t av) +av_set_code_motion_filter (av_set_t *avp, av_set_t av) { av_set_iterator i; - expr_t expr; + expr_t expr, expr2; FOR_EACH_EXPR_1 (expr, i, avp) - if (av_set_lookup (av, EXPR_VINSN (expr)) == NULL) + if ((expr2 = av_set_lookup (av, EXPR_VINSN (expr))) == NULL) av_set_iter_remove (&i); + else + /* When updating av sets in bookkeeping blocks, we can add more insns + there which will be transformed but the upper av sets will not + reflect those transformations. We then fail to undo those + when searching for such insns. So merge the history saved + in the av set of the block we are processing. */ + merge_history_vect (&EXPR_HISTORY_OF_CHANGES (expr), + EXPR_HISTORY_OF_CHANGES (expr2)); } @@ -3539,9 +3559,10 @@ sel_recompute_toporder (void) /* Tidy the possibly empty block BB. 
*/ static bool -maybe_tidy_empty_bb (basic_block bb, bool recompute_toporder_p) +maybe_tidy_empty_bb (basic_block bb) { basic_block succ_bb, pred_bb; + VEC (basic_block, heap) *dom_bbs; edge e; edge_iterator ei; bool rescan_p; @@ -3577,6 +3598,7 @@ maybe_tidy_empty_bb (basic_block bb, bool recompute_toporder_p) succ_bb = single_succ (bb); rescan_p = true; pred_bb = NULL; + dom_bbs = NULL; /* Redirect all non-fallthru edges to the next bb. */ while (rescan_p) @@ -3589,20 +3611,44 @@ maybe_tidy_empty_bb (basic_block bb, bool recompute_toporder_p) if (!(e->flags & EDGE_FALLTHRU)) { - recompute_toporder_p |= sel_redirect_edge_and_branch (e, succ_bb); + /* We can not invalidate computed topological order by moving + the edge destination block (E->SUCC) along a fallthru edge. + + We will update dominators here only when we'll get + an unreachable block when redirecting, otherwise + sel_redirect_edge_and_branch will take care of it. */ + if (e->dest != bb + && single_pred_p (e->dest)) + VEC_safe_push (basic_block, heap, dom_bbs, e->dest); + sel_redirect_edge_and_branch (e, succ_bb); rescan_p = true; break; } + /* If the edge is fallthru, but PRED_BB ends in a conditional jump + to BB (so there is no non-fallthru edge from PRED_BB to BB), we + still have to adjust it. */ + else if (single_succ_p (pred_bb) && any_condjump_p (BB_END (pred_bb))) + { + /* If possible, try to remove the unneeded conditional jump. */ + if (INSN_SCHED_TIMES (BB_END (pred_bb)) == 0 + && !IN_CURRENT_FENCE_P (BB_END (pred_bb))) + { + if (!sel_remove_insn (BB_END (pred_bb), false, false)) + tidy_fallthru_edge (e); + } + else + sel_redirect_edge_and_branch (e, succ_bb); + rescan_p = true; + break; + } } } - /* If it is possible - merge BB with its predecessor. */ if (can_merge_blocks_p (bb->prev_bb, bb)) sel_merge_blocks (bb->prev_bb, bb); else - /* Otherwise this is a block without fallthru predecessor. - Just delete it. */ { + /* This is a block without fallthru predecessor. Just delete it. 
*/ gcc_assert (pred_bb != NULL); if (in_current_region_p (pred_bb)) @@ -3610,12 +3656,12 @@ maybe_tidy_empty_bb (basic_block bb, bool recompute_toporder_p) remove_empty_bb (bb, true); } - if (recompute_toporder_p) - sel_recompute_toporder (); - -#ifdef ENABLE_CHECKING - verify_backedges (); -#endif + if (!VEC_empty (basic_block, dom_bbs)) + { + VEC_safe_push (basic_block, heap, dom_bbs, succ_bb); + iterate_fix_dominators (CDI_DOMINATORS, dom_bbs, false); + VEC_free (basic_block, heap, dom_bbs); + } return true; } @@ -3630,12 +3676,12 @@ tidy_control_flow (basic_block xbb, bool full_tidying) insn_t first, last; /* First check whether XBB is empty. */ - changed = maybe_tidy_empty_bb (xbb, false); + changed = maybe_tidy_empty_bb (xbb); if (changed || !full_tidying) return changed; /* Check if there is a unnecessary jump after insn left. */ - if (jump_leads_only_to_bb_p (BB_END (xbb), xbb->next_bb) + if (bb_has_removable_jump_to_p (xbb, xbb->next_bb) && INSN_SCHED_TIMES (BB_END (xbb)) == 0 && !IN_CURRENT_FENCE_P (BB_END (xbb))) { @@ -3676,7 +3722,7 @@ tidy_control_flow (basic_block xbb, bool full_tidying) /* And unconditional jump in previous basic block leads to next basic block of XBB and this jump can be safely removed. */ && in_current_region_p (xbb->prev_bb) - && jump_leads_only_to_bb_p (BB_END (xbb->prev_bb), xbb->next_bb) + && bb_has_removable_jump_to_p (xbb->prev_bb, xbb->next_bb) && INSN_SCHED_TIMES (BB_END (xbb->prev_bb)) == 0 /* Also this jump is not at the scheduling boundary. */ && !IN_CURRENT_FENCE_P (BB_END (xbb->prev_bb))) @@ -3694,11 +3740,16 @@ tidy_control_flow (basic_block xbb, bool full_tidying) that contained that jump, becomes empty too. In such case remove it too. 
*/ if (sel_bb_empty_p (xbb->prev_bb)) - changed = maybe_tidy_empty_bb (xbb->prev_bb, recompute_toporder_p); - else if (recompute_toporder_p) + changed = maybe_tidy_empty_bb (xbb->prev_bb); + if (recompute_toporder_p) sel_recompute_toporder (); } +#ifdef ENABLE_CHECKING + verify_backedges (); + verify_dominators (CDI_DOMINATORS); +#endif + return changed; } @@ -3706,14 +3757,14 @@ tidy_control_flow (basic_block xbb, bool full_tidying) void purge_empty_blocks (void) { - /* Do not attempt to delete preheader. */ - int i = sel_is_loop_preheader_p (BASIC_BLOCK (BB_TO_BLOCK (0))) ? 1 : 0; + int i; - while (i < current_nr_blocks) + /* Do not attempt to delete the first basic block in the region. */ + for (i = 1; i < current_nr_blocks; ) { basic_block b = BASIC_BLOCK (BB_TO_BLOCK (i)); - if (maybe_tidy_empty_bb (b, false)) + if (maybe_tidy_empty_bb (b)) continue; i++; @@ -4381,9 +4432,6 @@ fallthru_bb_of_jump (rtx jump) if (!JUMP_P (jump)) return NULL; - if (any_uncondjump_p (jump)) - return single_succ (BLOCK_FOR_INSN (jump)); - if (!any_condjump_p (jump)) return NULL; @@ -4578,8 +4626,12 @@ cfg_preds_1 (basic_block bb, insn_t **preds, int *n, int *size) basic_block pred_bb = e->src; insn_t bb_end = BB_END (pred_bb); - /* ??? This code is not supposed to walk out of a region. 
*/ - gcc_assert (in_current_region_p (pred_bb)); + if (!in_current_region_p (pred_bb)) + { + gcc_assert (flag_sel_sched_pipelining_outer_loops + && current_loop_nest); + continue; + } if (sel_bb_empty_p (pred_bb)) cfg_preds_1 (pred_bb, preds, n, size); @@ -4592,7 +4644,9 @@ cfg_preds_1 (basic_block bb, insn_t **preds, int *n, int *size) } } - gcc_assert (*n != 0); + gcc_assert (*n != 0 + || (flag_sel_sched_pipelining_outer_loops + && current_loop_nest)); } /* Find all predecessors of BB and record them in PREDS and their number @@ -5018,16 +5072,23 @@ sel_add_bb (basic_block bb) static void sel_remove_bb (basic_block bb, bool remove_from_cfg_p) { + unsigned idx = bb->index; + gcc_assert (bb != NULL && BB_NOTE_LIST (bb) == NULL_RTX); remove_bb_from_region (bb); return_bb_to_pool (bb); - bitmap_clear_bit (blocks_to_reschedule, bb->index); + bitmap_clear_bit (blocks_to_reschedule, idx); if (remove_from_cfg_p) - delete_and_free_basic_block (bb); + { + basic_block succ = single_succ (bb); + delete_and_free_basic_block (bb); + set_immediate_dominator (CDI_DOMINATORS, succ, + recompute_dominator (CDI_DOMINATORS, succ)); + } - rgn_setup_region (CONTAINING_RGN (bb->index)); + rgn_setup_region (CONTAINING_RGN (idx)); } /* Concatenate info of EMPTY_BB to info of MERGE_BB. */ @@ -5042,50 +5103,6 @@ move_bb_info (basic_block merge_bb, basic_block empty_bb) } -/* Remove an empty basic block EMPTY_BB. When MERGE_UP_P is true, we put - EMPTY_BB's note lists into its predecessor instead of putting them - into the successor. When REMOVE_FROM_CFG_P is true, also remove - the empty block. 
*/ -void -sel_remove_empty_bb (basic_block empty_bb, bool merge_up_p, - bool remove_from_cfg_p) -{ - basic_block merge_bb; - - gcc_assert (sel_bb_empty_p (empty_bb)); - - if (merge_up_p) - { - merge_bb = empty_bb->prev_bb; - gcc_assert (EDGE_COUNT (empty_bb->preds) == 1 - && EDGE_PRED (empty_bb, 0)->src == merge_bb); - } - else - { - edge e; - edge_iterator ei; - - merge_bb = bb_next_bb (empty_bb); - - /* Redirect incoming edges (except fallthrough one) of EMPTY_BB to its - successor block. */ - for (ei = ei_start (empty_bb->preds); - (e = ei_safe_edge (ei)); ) - { - if (! (e->flags & EDGE_FALLTHRU)) - sel_redirect_edge_and_branch (e, merge_bb); - else - ei_next (&ei); - } - - gcc_assert (EDGE_COUNT (empty_bb->succs) == 1 - && EDGE_SUCC (empty_bb, 0)->dest == merge_bb); - } - - move_bb_info (merge_bb, empty_bb); - remove_empty_bb (empty_bb, remove_from_cfg_p); -} - /* Remove EMPTY_BB. If REMOVE_FROM_CFG_P is false, remove EMPTY_BB from region, but keep it in CFG. */ static void @@ -5385,12 +5402,16 @@ sel_create_recovery_block (insn_t orig_insn) } /* Merge basic block B into basic block A. */ -void +static void sel_merge_blocks (basic_block a, basic_block b) { - sel_remove_empty_bb (b, true, false); - merge_blocks (a, b); + gcc_assert (sel_bb_empty_p (b) + && EDGE_COUNT (b->preds) == 1 + && EDGE_PRED (b, 0)->src == b->prev_bb); + move_bb_info (b->prev_bb, b); + remove_empty_bb (b, false); + merge_blocks (a, b); change_loops_latches (b, a); } @@ -5400,12 +5421,15 @@ sel_merge_blocks (basic_block a, basic_block b) void sel_redirect_edge_and_branch_force (edge e, basic_block to) { - basic_block jump_bb, src; + basic_block jump_bb, src, orig_dest = e->dest; int prev_max_uid; rtx jump; - gcc_assert (!sel_bb_empty_p (e->src)); - + /* This function is now used only for bookkeeping code creation, where + we'll never get the single pred of orig_dest block and thus will not + hit unreachable blocks when updating dominator info. 
*/ + gcc_assert (!sel_bb_empty_p (e->src) + && !single_pred_p (orig_dest)); src = e->src; prev_max_uid = get_max_uid (); jump_bb = redirect_edge_and_branch_force (e, to); @@ -5422,6 +5446,10 @@ sel_redirect_edge_and_branch_force (edge e, basic_block to) jump = find_new_jump (src, jump_bb, prev_max_uid); if (jump) sel_init_new_insn (jump, INSN_INIT_TODO_LUID | INSN_INIT_TODO_SIMPLEJUMP); + set_immediate_dominator (CDI_DOMINATORS, to, + recompute_dominator (CDI_DOMINATORS, to)); + set_immediate_dominator (CDI_DOMINATORS, orig_dest, + recompute_dominator (CDI_DOMINATORS, orig_dest)); } /* A wrapper for redirect_edge_and_branch. Return TRUE if blocks connected by @@ -5430,11 +5458,12 @@ bool sel_redirect_edge_and_branch (edge e, basic_block to) { bool latch_edge_p; - basic_block src; + basic_block src, orig_dest = e->dest; int prev_max_uid; rtx jump; edge redirected; bool recompute_toporder_p = false; + bool maybe_unreachable = single_pred_p (orig_dest); latch_edge_p = (pipelining_p && current_loop_nest @@ -5465,6 +5494,15 @@ sel_redirect_edge_and_branch (edge e, basic_block to) if (jump) sel_init_new_insn (jump, INSN_INIT_TODO_LUID | INSN_INIT_TODO_SIMPLEJUMP); + /* Only update dominator info when we don't have unreachable blocks. + Otherwise we'll update in maybe_tidy_empty_bb. */ + if (!maybe_unreachable) + { + set_immediate_dominator (CDI_DOMINATORS, to, + recompute_dominator (CDI_DOMINATORS, to)); + set_immediate_dominator (CDI_DOMINATORS, orig_dest, + recompute_dominator (CDI_DOMINATORS, orig_dest)); + } return recompute_toporder_p; } @@ -5603,7 +5641,7 @@ setup_nop_and_exit_insns (void) gcc_assert (nop_pattern == NULL_RTX && exit_insn == NULL_RTX); - nop_pattern = gen_nop (); + nop_pattern = constm1_rtx; start_sequence (); emit_insn (nop_pattern); @@ -6093,22 +6131,20 @@ sel_is_loop_preheader_p (basic_block bb) return false; } -/* Checks whether JUMP leads to basic block DEST_BB and no other blocks. 
*/ -bool -jump_leads_only_to_bb_p (insn_t jump, basic_block dest_bb) +/* Check whether JUMP_BB ends with a jump insn that leads only to DEST_BB and + can be removed, making the corresponding edge fallthrough (assuming that + all basic blocks between JUMP_BB and DEST_BB are empty). */ +static bool +bb_has_removable_jump_to_p (basic_block jump_bb, basic_block dest_bb) { - basic_block jump_bb = BLOCK_FOR_INSN (jump); - - /* It is not jump, jump with side-effects or jump can lead to several - basic blocks. */ - if (!onlyjump_p (jump) - || !any_uncondjump_p (jump)) + if (!onlyjump_p (BB_END (jump_bb)) + || tablejump_p (BB_END (jump_bb), NULL, NULL)) return false; /* Several outgoing edges, abnormal edge or destination of jump is not DEST_BB. */ if (EDGE_COUNT (jump_bb->succs) != 1 - || EDGE_SUCC (jump_bb, 0)->flags & EDGE_ABNORMAL + || EDGE_SUCC (jump_bb, 0)->flags & (EDGE_ABNORMAL | EDGE_CROSSING) || EDGE_SUCC (jump_bb, 0)->dest != dest_bb) return false; @@ -6188,12 +6224,16 @@ sel_remove_loop_preheader (void) basic block if it becomes empty. 
*/ if (next_bb->prev_bb == prev_bb && prev_bb != ENTRY_BLOCK_PTR - && jump_leads_only_to_bb_p (BB_END (prev_bb), next_bb)) + && bb_has_removable_jump_to_p (prev_bb, next_bb)) { redirect_edge_and_branch (EDGE_SUCC (prev_bb, 0), next_bb); if (BB_END (prev_bb) == bb_note (prev_bb)) free_data_sets (prev_bb); } + + set_immediate_dominator (CDI_DOMINATORS, next_bb, + recompute_dominator (CDI_DOMINATORS, + next_bb)); } } VEC_free (basic_block, heap, preheader_blocks); diff --git a/gcc/sel-sched-ir.h b/gcc/sel-sched-ir.h index acf25b2..bd354dc 100644 --- a/gcc/sel-sched-ir.h +++ b/gcc/sel-sched-ir.h @@ -1565,7 +1565,7 @@ extern void av_set_leave_one_nonspec (av_set_t *); extern expr_t av_set_element (av_set_t, int); extern void av_set_substract_cond_branches (av_set_t *); extern void av_set_split_usefulness (av_set_t, int, int); -extern void av_set_intersect (av_set_t *, av_set_t); +extern void av_set_code_motion_filter (av_set_t *, av_set_t); extern void sel_save_haifa_priorities (void); @@ -1590,7 +1590,6 @@ extern bool sel_remove_insn (insn_t, bool, bool); extern bool bb_header_p (insn_t); extern void sel_init_invalid_data_sets (insn_t); extern bool insn_at_boundary_p (insn_t); -extern bool jump_leads_only_to_bb_p (insn_t, basic_block); /* Basic block and CFG functions. 
*/ @@ -1618,11 +1617,9 @@ extern bool in_same_ebb_p (insn_t, insn_t); extern bool tidy_control_flow (basic_block, bool); extern void free_bb_note_pool (void); -extern void sel_remove_empty_bb (basic_block, bool, bool); extern void purge_empty_blocks (void); extern basic_block sel_split_edge (edge); extern basic_block sel_create_recovery_block (insn_t); -extern void sel_merge_blocks (basic_block, basic_block); extern bool sel_redirect_edge_and_branch (edge, basic_block); extern void sel_redirect_edge_and_branch_force (edge, basic_block); extern void sel_init_pipelining (void); diff --git a/gcc/sel-sched.c b/gcc/sel-sched.c index cb2bfeb..1ffb787 100644 --- a/gcc/sel-sched.c +++ b/gcc/sel-sched.c @@ -1,5 +1,6 @@ /* Instruction scheduling pass. Selective scheduler and pipeliner. - Copyright (C) 2006, 2007, 2008, 2009, 2010 Free Software Foundation, Inc. + Copyright (C) 2006, 2007, 2008, 2009, 2010, 2011 + Free Software Foundation, Inc. This file is part of GCC. @@ -1138,6 +1139,9 @@ init_regs_for_mode (enum machine_mode mode) /* Can't use regs which aren't saved by the prologue. */ || !TEST_HARD_REG_BIT (sel_hrd.regs_ever_used, cur_reg + i) + /* Can't use regs with non-null REG_BASE_VALUE, because adjusting + it affects aliasing globally and invalidates all AV sets. */ + || get_reg_base_value (cur_reg + i) #ifdef LEAF_REGISTERS /* We can't use a non-leaf register if we're in a leaf function. */ @@ -2167,10 +2171,8 @@ moveup_expr (expr_t expr, insn_t through_insn, bool inside_insn_group, || ! in_current_region_p (fallthru_bb)) return MOVEUP_EXPR_NULL; - /* And it should be mutually exclusive with through_insn, or - be an unconditional jump. */ - if (! any_uncondjump_p (insn) - && ! sched_insns_conditions_mutex_p (insn, through_insn) + /* And it should be mutually exclusive with through_insn. */ + if (! sched_insns_conditions_mutex_p (insn, through_insn) && ! 
	  DEBUG_INSN_P (through_insn))
	return MOVEUP_EXPR_NULL;
     }
@@ -3735,7 +3737,7 @@ fill_vec_av_set (av_set_t av, blist_t bnds, fence_t fence,
     {
       expr_t expr = VEC_index (expr_t, vec_av_set, n);
       insn_t insn = EXPR_INSN_RTX (expr);
-      char target_available;
+      signed char target_available;
       bool is_orig_reg_p = true;
       int need_cycles, new_prio;
@@ -4403,7 +4405,8 @@ find_best_expr (av_set_t *av_vliw_ptr, blist_t bnds, fence_t fence,
     {
       can_issue_more = invoke_aftermath_hooks (fence, EXPR_INSN_RTX (best),
					       can_issue_more);
-      if (can_issue_more == 0)
+      if (targetm.sched.variable_issue
+	  && can_issue_more == 0)
	*pneed_stall = 1;
     }
@@ -5514,7 +5517,7 @@ fill_insns (fence_t fence, int seqno, ilist_t **scheduled_insns_tailpp)
   blist_t *bnds_tailp1, *bndsp;
   expr_t expr_vliw;
   int need_stall;
-  int was_stall = 0, scheduled_insns = 0, stall_iterations = 0;
+  int was_stall = 0, scheduled_insns = 0;
   int max_insns = pipelining_p ? issue_rate : 2 * issue_rate;
   int max_stall = pipelining_p ? 1 : 3;
   bool last_insn_was_debug = false;
@@ -5533,16 +5536,15 @@ fill_insns (fence_t fence, int seqno, ilist_t **scheduled_insns_tailpp)
   do
     {
       expr_vliw = find_best_expr (&av_vliw, bnds, fence, &need_stall);
-      if (!expr_vliw && need_stall)
+      if (! expr_vliw && need_stall)
	{
	  /* All expressions required a stall.  Do not recompute av sets
	     as we'll get the same answer (modulo the insns between
	     the fence and its boundary, which will not be available for
-	     pipelining).  */
-	  gcc_assert (! expr_vliw && stall_iterations < 2);
-	  was_stall++;
-	  /* If we are going to stall for too long, break to recompute av
+	     pipelining).
+	     If we are going to stall for too long, break to recompute av
	     sets and bring more insns for pipelining.  */
+	  was_stall++;
	  if (need_stall <= 3)
	    stall_for_cycles (fence, need_stall);
	  else
@@ -6475,7 +6477,7 @@ code_motion_path_driver (insn_t insn, av_set_t orig_ops, ilist_t path,
 
   /* Filter the orig_ops set.  */
   if (AV_SET_VALID_P (insn))
-    av_set_intersect (&orig_ops, AV_SET (insn));
+    av_set_code_motion_filter (&orig_ops, AV_SET (insn));
 
   /* If no more original ops, return immediately.  */
   if (!orig_ops)
@@ -6717,6 +6719,8 @@ init_seqno_1 (basic_block bb, sbitmap visited_bbs, bitmap blocks_to_reschedule)
	    init_seqno_1 (succ, visited_bbs, blocks_to_reschedule);
	}
+      else if (blocks_to_reschedule)
+	bitmap_set_bit (forced_ebb_heads, succ->index);
     }
 
   for (insn = BB_END (bb); insn != note; insn = PREV_INSN (insn))
@@ -6971,6 +6975,7 @@ reset_sched_cycles_in_current_ebb (void)
   int last_clock = 0;
   int haifa_last_clock = -1;
   int haifa_clock = 0;
+  int issued_insns = 0;
   insn_t insn;
 
   if (targetm.sched.md_init)
@@ -6989,7 +6994,7 @@ reset_sched_cycles_in_current_ebb (void)
     {
       int cost, haifa_cost;
       int sort_p;
-      bool asm_p, real_insn, after_stall;
+      bool asm_p, real_insn, after_stall, all_issued;
       int clock;
 
       if (!INSN_P (insn))
@@ -7025,7 +7030,9 @@ reset_sched_cycles_in_current_ebb (void)
	      haifa_cost = cost;
	      after_stall = 1;
	    }
-
+	  all_issued = issued_insns == issue_rate;
+	  if (haifa_cost == 0 && all_issued)
+	    haifa_cost = 1;
	  if (haifa_cost > 0)
	    {
	      int i = 0;
@@ -7033,6 +7040,7 @@ reset_sched_cycles_in_current_ebb (void)
	      while (haifa_cost--)
		{
		  advance_state (curr_state);
+		  issued_insns = 0;
		  i++;
 
		  if (sched_verbose >= 2)
@@ -7049,9 +7057,22 @@ reset_sched_cycles_in_current_ebb (void)
		      && haifa_cost > 0
		      && estimate_insn_cost (insn, curr_state) == 0)
		    break;
-		}
+
+		  /* When the data dependency stall is longer than the DFA stall,
+		     and when we have issued exactly issue_rate insns and stalled,
+		     it could be that after this longer stall the insn will again
+		     become unavailable to the DFA restrictions.  Looks strange
+		     but happens e.g. on x86-64.  So recheck DFA on the last
+		     iteration.  */
+		  if ((after_stall || all_issued)
+		      && real_insn
+		      && haifa_cost == 0)
+		    haifa_cost = estimate_insn_cost (insn, curr_state);
+		}
 
	      haifa_clock += i;
+	      if (sched_verbose >= 2)
+		sel_print ("haifa clock: %d\n", haifa_clock);
	    }
	  else
	    gcc_assert (haifa_cost == 0);
@@ -7065,21 +7086,27 @@ reset_sched_cycles_in_current_ebb (void)
	       &sort_p))
	    {
	      advance_state (curr_state);
+	      issued_insns = 0;
	      haifa_clock++;
	      if (sched_verbose >= 2)
		{
		  sel_print ("advance_state (dfa_new_cycle)\n");
		  debug_state (curr_state);
+		  sel_print ("haifa clock: %d\n", haifa_clock + 1);
		}
	    }
 
	  if (real_insn)
	    {
	      cost = state_transition (curr_state, insn);
+	      issued_insns++;
 
	      if (sched_verbose >= 2)
-		debug_state (curr_state);
-
+		{
+		  sel_print ("scheduled insn %d, clock %d\n", INSN_UID (insn),
+			     haifa_clock + 1);
+		  debug_state (curr_state);
+		}
	      gcc_assert (cost < 0);
	    }
@@ -7492,21 +7519,23 @@ sel_sched_region_1 (void)
	{
	  basic_block bb = EBB_FIRST_BB (i);
 
-	  if (sel_bb_empty_p (bb))
-	    {
-	      bitmap_clear_bit (blocks_to_reschedule, bb->index);
-	      continue;
-	    }
-
	  if (bitmap_bit_p (blocks_to_reschedule, bb->index))
	    {
+	      if (! bb_ends_ebb_p (bb))
+		bitmap_set_bit (blocks_to_reschedule, bb_next_bb (bb)->index);
+	      if (sel_bb_empty_p (bb))
+		{
+		  bitmap_clear_bit (blocks_to_reschedule, bb->index);
+		  continue;
+		}
	      clear_outdated_rtx_info (bb);
	      if (sel_insn_is_speculation_check (BB_END (bb))
		  && JUMP_P (BB_END (bb)))
		bitmap_set_bit (blocks_to_reschedule,
				BRANCH_EDGE (bb)->dest->index);
	    }
-	  else if (INSN_SCHED_TIMES (sel_bb_head (bb)) <= 0)
+	  else if (! sel_bb_empty_p (bb)
+		   && INSN_SCHED_TIMES (sel_bb_head (bb)) <= 0)
	    bitmap_set_bit (blocks_to_reschedule, bb->index);
	}
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index 7fef765..ecc0e16 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,92 @@
+2011-04-06  Andrey Belevantsev
+
+	Backport from mainline
+	2011-03-26  Andrey Belevantsev
+
+	PR rtl-optimization/48144
+	* gcc.dg/pr48144.c: New test.
+
+2011-04-06  Andrey Belevantsev
+
+	Backport from mainline
+	2011-01-13  Andrey Belevantsev
+
+	PR rtl-optimization/45352
+	* gcc.dg/pr45352-3.c: New.
+
+	Backport from mainline
+	2010-12-22  Andrey Belevantsev
+
+	PR rtl-optimization/45352
+	PR rtl-optimization/46521
+	PR rtl-optimization/46522
+	* gcc.dg/pr46521.c: New.
+	* gcc.dg/pr46522.c: New.
+
+	Backport from mainline
+	2010-11-08  Andrey Belevantsev
+
+	PR rtl-optimization/45352
+	gcc.dg/pr45352.c, gcc.dg/pr45352-1.c, gcc.dg/pr45352-2.c: New tests.
+	gcc.target/i386/pr45352.c, gcc.target/i386/pr45352-1.c,
+	gcc.target/i386/pr45352-2.c: New tests.
+
+2011-04-06  Andrey Belevantsev
+
+	Backport from mainline
+	2010-12-24  Alexander Monakov
+
+	PR rtl-optimization/47036
+	* g++.dg/opt/pr47036.C: New.
+
+2011-04-06  Andrey Belevantsev
+
+	Backport from mainline
+	2010-12-15  Alexander Monakov
+
+	PR rtl-optimization/46649
+	* g++.dg/opt/pr46649.C: New.
+
+2011-04-06  Andrey Belevantsev
+
+	Backport from mainline
+	2010-12-14  Alexander Monakov
+
+	PR rtl-optimization/46875
+	* gcc.dg/pr46875.c: New.
+
+2011-04-06  Andrey Belevantsev
+
+	Backport from mainline
+	2010-12-07  Andrey Belevantsev
+
+	PR target/43603
+	* gcc.target/ia64/pr43603.c: New.
+	* gcc/testsuite/g++.dg/opt/pr46640.C: New.
+
+2011-04-06  Andrey Belevantsev
+
+	Backport from mainline
+	2010-12-03  Alexander Monakov
+
+	PR rtl-optimization/45354
+	* gcc.dg/tree-prof/pr45354.c: New.
+
+2011-04-06  Andrey Belevantsev
+
+	Backport from mainline
+	2010-11-22  Alexander Monakov
+
+	PR rtl-optimization/45652
+	* gcc.dg/pr45652.c: New.
+
+2011-04-06  Andrey Belevantsev
+
+	Backport from mainline
+	2010-10-14  Andrey Belevantsev
+	PR rtl-optimization/45570
+	* gcc.dg/pr45570.c: New test.
+
 2011-03-31  Rainer Orth
 
 	PR target/16292
diff --git a/gcc/testsuite/g++.dg/opt/pr46640.C b/gcc/testsuite/g++.dg/opt/pr46640.C
new file mode 100644
index 0000000..0892c9a
--- /dev/null
+++ b/gcc/testsuite/g++.dg/opt/pr46640.C
@@ -0,0 +1,44 @@
+// { dg-do compile { target x86_64-*-* } }
+// { dg-options "-fschedule-insns2 -fsel-sched-pipelining -fselective-scheduling2 -fno-exceptions -O" }
+
+struct QBasicAtomicInt
+{
+  int i, j;
+  bool deref ()
+  {
+    asm volatile ("":"=m" (i), "=qm" (j));
+  }
+};
+
+struct Data
+{
+  QBasicAtomicInt ref;
+  void *data;
+};
+
+struct QByteArray
+{
+  Data * d;
+  ~QByteArray ()
+  {
+    d->ref.deref ();
+  }
+};
+
+int indexOf (unsigned);
+int stat (void *, int *);
+QByteArray encodeName ();
+
+bool makeDir (unsigned len)
+{
+  unsigned i = 0;
+  while (len)
+    {
+      int st;
+      int pos = indexOf (i);
+      QByteArray baseEncoded = encodeName ();
+      if (stat (baseEncoded.d->data, &st) && stat (baseEncoded.d, &st))
+	return false;
+      i = pos;
+    }
+}
diff --git a/gcc/testsuite/g++.dg/opt/pr46649.C b/gcc/testsuite/g++.dg/opt/pr46649.C
new file mode 100644
index 0000000..1428e8b
--- /dev/null
+++ b/gcc/testsuite/g++.dg/opt/pr46649.C
@@ -0,0 +1,9 @@
+// { dg-do compile { target powerpc*-*-* ia64-*-* x86_64-*-* } }
+// { dg-options "-fschedule-insns -fselective-scheduling" }
+
+void foo ()
+{
+  for (;;)
+    for (;;({break;}))
+      ;
+}
diff --git a/gcc/testsuite/g++.dg/opt/pr47036.C b/gcc/testsuite/g++.dg/opt/pr47036.C
new file mode 100644
index 0000000..d6d5adc
--- /dev/null
+++ b/gcc/testsuite/g++.dg/opt/pr47036.C
@@ -0,0 +1,10 @@
+// { dg-do compile { target powerpc*-*-* ia64-*-* x86_64-*-* } }
+// { dg-options "-fschedule-insns -fselective-scheduling -fno-dce" }
+
+
+void foo ()
+{
+  for (;;)
+    for (;;({break;}));
+}
+
diff --git a/gcc/testsuite/gcc.dg/pr45352-1.c b/gcc/testsuite/gcc.dg/pr45352-1.c
new file mode 100644
index 0000000..3b092cd
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr45352-1.c
@@ -0,0 +1,13 @@
+/* { dg-do compile { target powerpc*-*-* ia64-*-* x86_64-*-* } } */
+/* { dg-options "-O3 -fschedule-insns -fschedule-insns2 -fselective-scheduling2 -fsel-sched-pipelining -funroll-loops -fprefetch-loop-arrays" } */
+
+void main1 (float *pa, float *pc)
+{
+  int i;
+  float b[256];
+  float c[256];
+  for (i = 0; i < 256; i++)
+    b[i] = c[i] = pc[i];
+  for (i = 0; i < 256; i++)
+    pa[i] = b[i] * c[i];
+}
diff --git a/gcc/testsuite/gcc.dg/pr45352-2.c b/gcc/testsuite/gcc.dg/pr45352-2.c
new file mode 100644
index 0000000..eed3847
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr45352-2.c
@@ -0,0 +1,17 @@
+/* { dg-do compile { target powerpc*-*-* ia64-*-* x86_64-*-* } } */
+/* { dg-options "-O1 -freorder-blocks -fschedule-insns2 -funswitch-loops -fselective-scheduling2 -fsel-sched-pipelining -funroll-all-loops" } */
+void
+foo1 (int *s)
+{
+  s[0] = s[1];
+  while (s[6] - s[8])
+    {
+      s[6] -= s[8];
+      if (s[8] || s[0])
+	{
+	  s[3] += s[0];
+	  s[4] += s[1];
+	}
+      s[7]++;
+    }
+}
diff --git a/gcc/testsuite/gcc.dg/pr45352-3.c b/gcc/testsuite/gcc.dg/pr45352-3.c
new file mode 100644
index 0000000..ce7879f
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr45352-3.c
@@ -0,0 +1,16 @@
+/* { dg-do compile { target powerpc*-*-* ia64-*-* x86_64-*-* } } */
+/* { dg-options "-O -fprofile-generate -fgcse -fno-gcse-lm -fgcse-sm -fno-ivopts -fno-tree-loop-im -ftree-pre -funroll-loops -fno-web -fschedule-insns2 -fselective-scheduling2 -fsel-sched-pipelining" } */
+
+extern volatile float f[];
+
+void foo (void)
+{
+  int i;
+  for (i = 0; i < 100; i++)
+    f[i] = 0;
+  for (i = 0; i < 100; i++)
+    f[i] = 0;
+  for (i = 0; i < 100; i++)
+    if (f[i])
+      __builtin_abort ();
+}
diff --git a/gcc/testsuite/gcc.dg/pr45352.c b/gcc/testsuite/gcc.dg/pr45352.c
new file mode 100644
index 0000000..75f9a21
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr45352.c
@@ -0,0 +1,24 @@
+/* { dg-do compile { target powerpc*-*-* ia64-*-* x86_64-*-* } } */
+/* { dg-options "-Os -fselective-scheduling2 -fsel-sched-pipelining -fprofile-generate" } */
+
+static inline void
+bmp_iter_next (int *bi, int *bit_no)
+{
+  *bi >>= 1;
+  *bit_no += 1;
+}
+
+int bmp_iter_set (int *bi, int *bit_no);
+void bitmap_initialize_stat (int, ...);
+void bitmap_clear (void);
+
+void
+df_md_alloc (int bi, int bb_index, void *bb_info)
+{
+  for (; bmp_iter_set (&bi, &bb_index); bmp_iter_next (&bi, &bb_index))
+
+    if (bb_info)
+      bitmap_clear ();
+    else
+      bitmap_initialize_stat (0);
+}
diff --git a/gcc/testsuite/gcc.dg/pr45570.c b/gcc/testsuite/gcc.dg/pr45570.c
new file mode 100644
index 0000000..8a25951
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr45570.c
@@ -0,0 +1,28 @@
+/* { dg-do compile { target powerpc*-*-* ia64-*-* i?86-*-* x86_64-*-* } } */
+/* { dg-options "-O3 -fselective-scheduling2 -fsel-sched-pipelining -fsel-sched-pipelining-outer-loops -ftracer" } */
+void
+parser_get_next_char (char c, int qm, char *p)
+{
+  int quote_mode = 0;
+  for (; *p; p++)
+    {
+      if (qm)
+	{
+	  if (quote_mode == 0 && *p == '"' && *(p - 1))
+	    {
+	      quote_mode = 1;
+	      continue;
+	    }
+	  if (quote_mode && *p == '"' && *(p - 1))
+	    quote_mode = 0;
+	}
+      if (quote_mode == 0 && *p == c && *(p - 1))
+	break;
+    }
+}
+
+void
+parser_get_next_parameter (char *p)
+{
+  parser_get_next_char (':', 1, p);
+}
diff --git a/gcc/testsuite/gcc.dg/pr45652.c b/gcc/testsuite/gcc.dg/pr45652.c
new file mode 100644
index 0000000..8f55f0c
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr45652.c
@@ -0,0 +1,39 @@
+/* { dg-do run { target powerpc*-*-* ia64-*-* x86_64-*-* } } */
+/* { dg-options "-O2 -fselective-scheduling2" } */
+
+struct S {
+  double i[2];
+};
+
+void __attribute__ ((noinline)) checkcd (struct S x)
+{
+  if (x.i[0] != 7.0 || x.i[1] != 8.0)
+    __builtin_abort ();
+}
+
+void __attribute__ ((noinline)) testvacd (int n, ...)
+{
+  int i;
+  __builtin_va_list ap;
+  __builtin_va_start (ap, n);
+  for (i = 0; i < n; i++)
+    {
+      struct S t = __builtin_va_arg (ap, struct S);
+      checkcd (t);
+    }
+  __builtin_va_end (ap);
+}
+
+void
+testitcd (void)
+{
+  struct S x = { { 7.0, 8.0 } };
+  testvacd (2, x, x);
+}
+
+int
+main ()
+{
+  testitcd ();
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.dg/pr46521.c b/gcc/testsuite/gcc.dg/pr46521.c
new file mode 100644
index 0000000..0c41c43
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr46521.c
@@ -0,0 +1,20 @@
+/* { dg-do compile { target powerpc*-*-* ia64-*-* x86_64-*-* } } */
+/* { dg-options "-Os -fselective-scheduling2 -fsel-sched-pipelining -fprofile-generate -fno-early-inlining" } */
+
+static void bmp_iter_next (int *bi)
+{
+  *bi >>= 1;
+}
+
+int bmp_iter_set (int *, int);
+void bitmap_clear (void);
+void bitmap_initialize_stat (void);
+
+void df_md_alloc (int bi, int bb_index, int bb_info)
+{
+  for (; bmp_iter_set (&bi, bb_index); bmp_iter_next (&bi))
+    if (bb_info)
+      bitmap_clear ();
+    else
+      bitmap_initialize_stat ();
+}
diff --git a/gcc/testsuite/gcc.dg/pr46522.c b/gcc/testsuite/gcc.dg/pr46522.c
new file mode 100644
index 0000000..13a5aa9
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr46522.c
@@ -0,0 +1,33 @@
+/* { dg-do compile { target powerpc*-*-* ia64-*-* x86_64-*-* } } */
+/* { dg-options "-O3 -fkeep-inline-functions -fsel-sched-pipelining -fselective-scheduling2 -funroll-loops" } */
+
+struct S
+{
+  unsigned i, j;
+};
+
+static inline void
+bar (struct S *s)
+{
+  if (s->i++ == 1)
+    {
+      s->i = 0;
+      s->j++;
+    }
+}
+
+void
+foo1 (struct S *s)
+{
+  bar (s);
+}
+
+void
+foo2 (struct S s1, struct S s2, int i)
+{
+  while (s1.i != s2.i) {
+    if (i)
+      *(unsigned *) 0 |= (1U << s1.i);
+    bar (&s1);
+  }
+}
diff --git a/gcc/testsuite/gcc.dg/pr46585.c b/gcc/testsuite/gcc.dg/pr46585.c
new file mode 100644
index 0000000..32befdf
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr46585.c
@@ -0,0 +1,11 @@
+/* { dg-do compile { target powerpc*-*-* ia64-*-* i?86-*-* x86_64-*-* } } */
+/* { dg-options "-fno-dce -fschedule-insns -fselective-scheduling" } */
+void
+foo (void)
+{
+  switch (0)
+    {
+    default:
+      break;
+    }
+}
diff --git a/gcc/testsuite/gcc.dg/pr46875.c b/gcc/testsuite/gcc.dg/pr46875.c
new file mode 100644
index 0000000..c601708
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr46875.c
@@ -0,0 +1,27 @@
+/* { dg-do compile { target powerpc*-*-* ia64-*-* x86_64-*-* } } */
+/* { dg-options "-Os -fselective-scheduling2" } */
+
+long
+foo (int x, long *y)
+{
+  long a = 0;
+  switch (x)
+    {
+    case 0:
+      a = *y;
+      break;
+    case 1:
+      a = *y;
+      break;
+    case 2:
+      a = *y;
+      break;
+    case 3:
+      a = *y;
+      break;
+    case 4:
+      a = *y;
+      break;
+    }
+  return a;
+}
diff --git a/gcc/testsuite/gcc.dg/pr48144.c b/gcc/testsuite/gcc.dg/pr48144.c
new file mode 100644
index 0000000..030202d
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr48144.c
@@ -0,0 +1,16 @@
+/* { dg-do compile { target powerpc*-*-* ia64-*-* i?86-*-* x86_64-*-* } } */
+/* { dg-options "-O -frerun-cse-after-loop -fschedule-insns2 -fselective-scheduling2 -fno-tree-ch -funroll-loops --param=max-sched-extend-regions-iters=2 --param=max-sched-region-blocks=15" } */
+extern void *memcpy(void *dest, const void *src, __SIZE_TYPE__ n);
+
+void bar (void *, void *, void *);
+
+void foo
+  (void *p, char *data, unsigned data_len)
+{
+  int buffer[8];
+  int buf2[8];
+  unsigned i;
+  for (i = 0; i + 8 <= data_len; i += 8)
+    bar (p, buffer, data + i);
+  memcpy (buf2, data + i, data_len);
+}
diff --git a/gcc/testsuite/gcc.dg/tree-prof/pr45354.c b/gcc/testsuite/gcc.dg/tree-prof/pr45354.c
new file mode 100644
index 0000000..b30ad77
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-prof/pr45354.c
@@ -0,0 +1,43 @@
+/* { dg-require-effective-target freorder } */
+/* { dg-options "-O -freorder-blocks-and-partition -fschedule-insns -fselective-scheduling" { target powerpc*-*-* ia64-*-* x86_64-*-* } } */
+
+extern void abort (void);
+
+int ifelse_val2;
+
+int __attribute__((noinline))
+test_ifelse2 (int i)
+{
+  int result = 0;
+  if (!i)			/* count(6) */
+    result = 1;			/* count(1) */
+  if (i == 1)			/* count(6) */
+    result = 1024;
+  if (i == 2)			/* count(6) */
+    result = 2;			/* count(3) */
+  if (i == 3)			/* count(6) */
+    return 8;			/* count(2) */
+  if (i == 4)			/* count(4) */
+    return 2048;
+  return result;		/* count(4) */
+}
+
+void __attribute__((noinline))
+call_ifelse ()
+{
+  ifelse_val2 += test_ifelse2 (0);
+  ifelse_val2 += test_ifelse2 (2);
+  ifelse_val2 += test_ifelse2 (2);
+  ifelse_val2 += test_ifelse2 (2);
+  ifelse_val2 += test_ifelse2 (3);
+  ifelse_val2 += test_ifelse2 (3);
+}
+
+int
+main()
+{
+  call_ifelse ();
+  if (ifelse_val2 != 23)
+    abort ();
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/i386/pr45352-1.c b/gcc/testsuite/gcc.target/i386/pr45352-1.c
new file mode 100644
index 0000000..5cd1bd8
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr45352-1.c
@@ -0,0 +1,19 @@
+/* { dg-do compile } */
+/* { dg-options "-mtune=amdfam10 -O3 -fpeel-loops -fselective-scheduling2 -fsel-sched-pipelining -fPIC" } */
+
+static int FIR_Tab_16[16][16];
+
+void
+V_Pass_Avrg_16_C_ref (int *Dst, int *Src, int W, int BpS, int Rnd)
+{
+  while (W-- > 0)
+    {
+      int i, k;
+      int Sums[16] = { };
+      for (i = 0; i < 16; ++i)
+	for (k = 0; k < 16; ++k)
+	  Sums[k] += FIR_Tab_16[i][k] * Src[i];
+      for (i = 0; i < 16; ++i)
+	Dst[i] = Sums[i] + Src[i];
+    }
+}
diff --git a/gcc/testsuite/gcc.target/i386/pr45352-2.c b/gcc/testsuite/gcc.target/i386/pr45352-2.c
new file mode 100644
index 0000000..5f9ebb1
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr45352-2.c
@@ -0,0 +1,109 @@
+/* { dg-do compile } */
+/* { dg-options "-O1 -mtune=amdfam10 -fexpensive-optimizations -fgcse -foptimize-register-move -freorder-blocks -fschedule-insns2 -funswitch-loops -fgcse-las -fselective-scheduling2 -fsel-sched-pipelining -funroll-all-loops" } */
+
+typedef char uint8_t;
+typedef uint32_t;
+typedef vo_frame_t;
+struct vo_frame_s
+{
+  uint8_t base[3];
+  int pitches[3];};
+typedef struct
+{
+void
+  (*proc_macro_block)
+  (void);
+}
+xine_xvmc_t;
+typedef struct
+{
+  uint8_t ref[2][3];
+int pmv;
+}
+motion_t;
+typedef struct
+{
+  uint32_t bitstream_buf;
+  int bitstream_bits;
+  uint8_t * bitstream_ptr;
+  uint8_t dest[3];
+  int pitches[3];
+  int offset;
+  motion_t b_motion;
+  motion_t f_motion;
+  int v_offset;
+  int coded_picture_width;
+  int picture_structure;
+struct vo_frame_s *current_frame;}
+picture_t;
+typedef struct
+{
+int xvmc_last_slice_code;}
+mpeg2dec_accel_t;
+static int bitstream_init (picture_t * picture, void *start)
+{
+  picture->bitstream_ptr = start;
+  return (int) (long) start;
+}
+static slice_xvmc_init (picture_t * picture, int code)
+{
+  int offset;
+  struct vo_frame_s *forward_reference_frame;
+  offset = picture->picture_structure == 2;
+  picture->pitches[0] = picture->current_frame->pitches[0];
+  picture->pitches[1] = picture->current_frame->pitches[1];
+  if (picture)
+    picture->f_motion.ref
+      [0]
+      [0]
+      = (char) (long) (forward_reference_frame->base + (offset ? picture->pitches[0] : 0));
+  picture->f_motion.ref[0][1] = (offset);
+  if (picture->picture_structure)
+    picture->pitches[0] <<= picture->pitches[1] <<= 1;
+  offset = 0;
+  while (1)
+    {
+      if (picture->bitstream_buf >= 0x08000000)
+	break;
+      switch (picture->bitstream_buf >> 12)
+	{
+	case 8:
+	  offset += 33;
+	  picture->bitstream_buf
+	    |=
+	    picture->bitstream_ptr[1] << picture->bitstream_bits;
+	}
+    }
+  picture->offset = (offset);
+  while (picture->offset - picture->coded_picture_width >= 0)
+    {
+      picture->offset -= picture->coded_picture_width;
+      if (picture->current_frame)
+	{
+	  picture->dest[0] += picture->pitches[0];
+	  picture->dest[1] += picture->pitches[1];
+	}
+      picture->v_offset += 16;
+    }
+}
+
+void
+mpeg2_xvmc_slice
+  (mpeg2dec_accel_t * accel, picture_t * picture, int code, uint8_t buffer,int mba_inc)
+{
+  xine_xvmc_t * xvmc = (xine_xvmc_t *) (long) bitstream_init (picture, (void *) (long) buffer);
+  slice_xvmc_init (picture, code);
+  while (1)
+    {
+      if (picture)
+	break;
+      switch (picture->bitstream_buf)
+	{
+	case 8:
+	  mba_inc += accel->xvmc_last_slice_code = code;
+	  xvmc->proc_macro_block ();
+	  while (mba_inc)
+	    ;
+	}
+    }
+}
diff --git a/gcc/testsuite/gcc.target/i386/pr45352.c b/gcc/testsuite/gcc.target/i386/pr45352.c
new file mode 100644
index 0000000..ef710ce
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr45352.c
@@ -0,0 +1,25 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -march=amdfam10 -fselective-scheduling2 -fsel-sched-pipelining -funroll-all-loops" } */
+
+struct S
+{
+  struct
+  {
+    int i;
+  } **p;
+  int x;
+  int y;
+};
+
+extern int baz (void);
+extern int bar (void *, int, int);
+
+void
+foo (struct S *s)
+{
+  int i;
+  for (i = 0; i < s->x; i++)
+    bar (s->p[i], baz (), s->y);
+  for (i = 0; i < s->x; i++)
+    s->p[i]->i++;
+}
diff --git a/gcc/testsuite/gcc.target/ia64/pr43603.c b/gcc/testsuite/gcc.target/ia64/pr43603.c
new file mode 100644
index 0000000..ad3a5b1
--- /dev/null
+++ b/gcc/testsuite/gcc.target/ia64/pr43603.c
@@ -0,0 +1,39 @@
+/* { dg-do compile } */
+/* { dg-options "-O3" } */
+
+int
+foo( long * np, int * dp, int qn)
+{
+  int i;
+  int n0;
+  int d0;
+  int a;
+  int b;
+  int c;
+  int d;
+
+  a = 1;
+  b = 0;
+  c = 1;
+  d = 1;
+
+  d0 = dp[0];
+
+  for (i = qn; i >= 0; i--) {
+    if (bar((c == 0)) && (np[1] == d0)) {
+      car(np - 3, dp, 3);
+    } else {
+      __asm__ ("xma.hu %0 = %2, %3, f0\n\txma.l %1 = %2, %3, f0" : "=&f" ((a)),
+"=f" (b) : "f" ((c)), "f" ((d)));
+      n0 = np[0];
+      if (n0 < d0)
+	c = 1;
+      else
+	c = 0;
+
+    }
+    *--np = a;
+  }
+
+  return 0;
+}