From patchwork Fri Aug 3 12:05:15 2012
X-Patchwork-Submitter: Bernd Schmidt
X-Patchwork-Id: 174982
Message-ID: <501BBE7B.7030104@codesourcery.com>
Date: Fri, 3 Aug 2012 14:05:15 +0200
From: Bernd Schmidt
To: GCC Patches
Subject: Scheduler: Allow breaking dependencies by modifying patterns

This patch allows us to change

  rn++
  rm=[rn]

into

  rm=[rn + 4]
  rn++

Opportunities to do this are discovered by a mini-pass over the instructions after generating dependencies and before scheduling a block.
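For concreteness, here is how the pieces fit together on the example above. This is a hand-written illustration, not part of the patch; the struct is the dep_replacement added to sched-int.h below, and the RTL in the comments is an assumed sketch of what find_modifiable_mems would record for the load "rm=[rn]" once its dep on "rn++" is selected for breaking:

  /* Illustration only: the replacement description attached to the
     broken dep.  apply_replacement installs NEWVAL at *LOC via
     validate_change; restore_pattern puts ORIG back on backtracking.  */
  struct dep_replacement
  {
    rtx *loc;     /* Where the MEM occurs in the load's pattern.  */
    rtx orig;     /* Original reference: (mem (reg rn)).  */
    rtx newval;   /* Rewritten one: (mem (plus (reg rn) (const_int 4))).  */
    rtx insn;     /* The insn to modify: the load into rm.  */
  };

With the dep broken, the load no longer has to wait for the increment, and the scheduler is free to reorder the two.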
At that point (after dependency generation, but before the block is scheduled) we have all the information required to verify that a candidate dep between two instructions represents only a register dependence, and to ensure that every insn with a memory reference is subject to at most one dep that causes a pattern change.

The dep_t structure is extended to hold an optional pointer to a "replacement description", which holds information about what to change when a dependency is broken. The time at which a replacement is applied depends on whether the changed insn is the DEP_CON (in which case the pattern is changed once the broken dependency becomes the last remaining one) or the DEP_PRO (in which case we make the change once the corresponding DEP_CON has been scheduled). This ensures that the ready list always contains insns with the correct pattern.

A few additional bits are needed in the dep structure: one to record whether a dependency occurs multiple times, and one to distinguish dependencies that are purely for register values from those with other meanings (e.g. memory references). Also, sched-rgn.c was changed to use a new bit, DEP_POSTPONED, rather than HARD_DEP, to indicate that we don't want to schedule an insn in the current block.

A possible future extension would be to also allow auto-increment addressing modes as the increment insn.

Bootstrapped and tested on x86_64-linux, and also tested on c6x-elf (quite a number of changes were necessary to make it work there). It was originally written for a MIPS target and tested there in the context of a 4.6 tree. I've also run SPEC2000 on x86_64, with no change that looked like anything other than noise.

Ok?


Bernd

	* dbgcnt.def (sched_breakdep): New counter.
	* haifa-sched.c (update_insn_after_change): New static function,
	broken out of haifa_change_pattern.
	(haifa_change_pattern): Call it.
	(dep_t heap vecs): Declare.
	(INSN_COST): Define earlier.
	(next_cycle_replace_deps, next_cycle_apply): New static variables.
	(apply_replacement): New static function.
	(recompute_todo_spec): New argument FOR_BACKTRACK.  All callers
	changed.  Handle DEP_REPLACE deps.
	(contributes_to_priority_p): False for replaceable deps.
	(must_restore_pattern_p, restore_pattern): New static functions.
	(schedule_insn): Use them.  Apply replacements for broken deps.
	(struct haifa_saved_data): Add new fields to keep track of
	replacements.
	(save_backtrack_point): Initialize them.
	(undo_replacements_for_backtrack): New static function.
	(restore_last_backtrack_point, free_topmost_backtrack_point): Use
	it and keep track of replacements.
	(perform_replacements_new_cycle, undo_all_replacements): New static
	functions.
	(schedule_block): Call these two as necessary.  Call
	find_modifiable_mems.
	(try_ready): Tweak the assert.  Check for DEP_POSTPONED.
	* sched-deps.c: Include "emit-rtl.h".
	(init_dep_1): Initialize DEP_NONREG, DEP_MULTIPLE and DEP_REPLACE.
	(dep_spec_p): True for DEP_REPLACE deps.
	(mark_as_hard): New static variable.
	(update_dep): Update DEP_NONREG and DEP_MULTIPLE.
	(add_dependence_list): New argument hard.  All callers changed.
	Set and clear mark_as_hard around function body.
	(add_dependence_list_and_free): Likewise.
	(haifa_note_mem_dep): Set DEP_NONREG.
	(sched_analyze_insn): Switch loop with if statement testing for
	sel_sched_p.
	(struct mem_inc_info): New.
	(attempt_change, parse_add_or_inc, find_inc, find_mem): New static
	functions.
	(find_modifiable_mems): New function.
	* sched-int.h (struct dep_replacement): New.
	(struct _dep): Add replace, nonreg and multiple fields.  Make type
	and cost bitfields.
(UNKNOWN_DEP_COST): Change to match the bitfield. (DEP_NONREG, DEP_MULTIPLE, DEP_REPLACE): New macros. (DEP_POSTPONED): New macro. (DEP_CANCELLED): Renumber. (find_modifiable_mems): Declare. (enum SCHED_FLAGS): Add DONT_BREAK_DEPENDENCIES. * sched-rgn.c (init_ready_list): Set TODO_SPEC here. (new_ready): Don't set HARD_DEP, use DEP_POSTPONED. (debug_dependencies): Dump DEP_NONREG and DEP_MULTIPLE. * Makefile.in (sched-deps.o): Update dependencies. * config/c6x/c6x.c (in_hwloop): New static variable. (c6x_set_sched_flags): If it is true, add DONT_BREAK_DEPENDENCIES. (hwloop_optimize): Set and clear it around preliminary scheduling pass. Index: gcc/dbgcnt.def =================================================================== --- gcc/dbgcnt.def (revision 189284) +++ gcc/dbgcnt.def (working copy) @@ -176,6 +176,7 @@ DEBUG_COUNTER (sched2_func) DEBUG_COUNTER (sched_block) DEBUG_COUNTER (sched_func) DEBUG_COUNTER (sched_insn) +DEBUG_COUNTER (sched_breakdep) DEBUG_COUNTER (sched_region) DEBUG_COUNTER (sel_sched_cnt) DEBUG_COUNTER (sel_sched_region_cnt) Index: gcc/haifa-sched.c =================================================================== --- gcc/haifa-sched.c (revision 189284) +++ gcc/haifa-sched.c (working copy) @@ -213,6 +213,9 @@ struct common_sched_info_def *common_sch #define FEEDS_BACKTRACK_INSN(INSN) (HID (INSN)->feeds_backtrack_insn) #define SHADOW_P(INSN) (HID (INSN)->shadow_p) #define MUST_RECOMPUTE_SPEC_P(INSN) (HID (INSN)->must_recompute_spec) +/* Cached cost of the instruction. Use insn_cost to get cost of the + insn. -1 here means that the field is not initialized. */ +#define INSN_COST(INSN) (HID (INSN)->cost) /* If INSN_TICK of an instruction is equal to INVALID_TICK, then it should be recalculated from scratch. */ @@ -1114,18 +1117,59 @@ cond_clobbered_p (rtx insn, HARD_REG_SET return false; } +/* This function should be called after modifying the pattern of INSN, + to update scheduler data structures as needed. */ +static void +update_insn_after_change (rtx insn) +{ + sd_iterator_def sd_it; + dep_t dep; + + dfa_clear_single_insn_cache (insn); + + sd_it = sd_iterator_start (insn, + SD_LIST_FORW | SD_LIST_BACK | SD_LIST_RES_BACK); + while (sd_iterator_cond (&sd_it, &dep)) + { + DEP_COST (dep) = UNKNOWN_DEP_COST; + sd_iterator_next (&sd_it); + } + + /* Invalidate INSN_COST, so it'll be recalculated. */ + INSN_COST (insn) = -1; + /* Invalidate INSN_TICK, so it'll be recalculated. */ + INSN_TICK (insn) = INVALID_TICK; +} + +DEF_VEC_P(dep_t); +DEF_VEC_ALLOC_P(dep_t, heap); + +/* Two VECs, one to hold dependencies for which pattern replacements + need to be applied or restored at the start of the next cycle, and + another to hold an integer that is either one, to apply the + corresponding replacement, or zero to restore it. */ +static VEC(dep_t, heap) *next_cycle_replace_deps; +static VEC(int, heap) *next_cycle_apply; + +static void apply_replacement (dep_t, bool); +static void restore_pattern (dep_t, bool); + /* Look at the remaining dependencies for insn NEXT, and compute and return the TODO_SPEC value we should use for it. This is called after one of - NEXT's dependencies has been resolved. */ + NEXT's dependencies has been resolved. + We also perform pattern replacements for predication, and for broken + replacement dependencies. The latter is only done if FOR_BACKTRACK is + false. 
*/ static ds_t -recompute_todo_spec (rtx next) +recompute_todo_spec (rtx next, bool for_backtrack) { ds_t new_ds; sd_iterator_def sd_it; - dep_t dep, control_dep = NULL; + dep_t dep, modify_dep = NULL; int n_spec = 0; int n_control = 0; + int n_replace = 0; bool first_p = true; if (sd_lists_empty_p (next, SD_LIST_BACK)) @@ -1142,9 +1186,10 @@ recompute_todo_spec (rtx next) FOR_EACH_DEP (next, SD_LIST_BACK, sd_it, dep) { + rtx pro = DEP_PRO (dep); ds_t ds = DEP_STATUS (dep) & SPECULATIVE; - if (DEBUG_INSN_P (DEP_PRO (dep)) && !DEBUG_INSN_P (next)) + if (DEBUG_INSN_P (pro) && !DEBUG_INSN_P (next)) continue; if (ds) @@ -1159,15 +1204,47 @@ recompute_todo_spec (rtx next) else new_ds = ds_merge (new_ds, ds); } - if (DEP_TYPE (dep) == REG_DEP_CONTROL) + else if (DEP_TYPE (dep) == REG_DEP_CONTROL) { - n_control++; - control_dep = dep; + if (QUEUE_INDEX (pro) != QUEUE_SCHEDULED) + { + n_control++; + modify_dep = dep; + } + DEP_STATUS (dep) &= ~DEP_CANCELLED; + } + else if (DEP_REPLACE (dep) != NULL) + { + if (QUEUE_INDEX (pro) != QUEUE_SCHEDULED) + { + n_replace++; + modify_dep = dep; + } DEP_STATUS (dep) &= ~DEP_CANCELLED; } } - if (n_control == 1 && n_spec == 0) + if (n_replace > 0 && n_control == 0 && n_spec == 0) + { + if (!dbg_cnt (sched_breakdep)) + return HARD_DEP; + FOR_EACH_DEP (next, SD_LIST_BACK, sd_it, dep) + { + struct dep_replacement *desc = DEP_REPLACE (dep); + if (desc != NULL) + { + if (desc->insn == next && !for_backtrack) + { + gcc_assert (n_replace == 1); + apply_replacement (dep, true); + } + DEP_STATUS (dep) |= DEP_CANCELLED; + } + } + return 0; + } + + else if (n_control == 1 && n_replace == 0 && n_spec == 0) { rtx pro, other, new_pat; rtx cond = NULL_RTX; @@ -1181,7 +1258,7 @@ recompute_todo_spec (rtx next) && PREDICATED_PAT (next) == NULL_RTX)) return HARD_DEP; - pro = DEP_PRO (control_dep); + pro = DEP_PRO (modify_dep); other = real_insn_for_shadow (pro); if (other != NULL_RTX) pro = other; @@ -1220,7 +1297,7 @@ recompute_todo_spec (rtx next) PREDICATED_PAT (next)); gcc_assert (success); } - DEP_STATUS (control_dep) |= DEP_CANCELLED; + DEP_STATUS (modify_dep) |= DEP_CANCELLED; return DEP_CONTROL; } @@ -1237,11 +1314,12 @@ recompute_todo_spec (rtx next) dependencies, so we return HARD_DEP in such a case. Also fail if we have speculative dependencies with not enough points, or more than one control dependency. */ - if ((n_spec > 0 && n_control > 0) + if ((n_spec > 0 && (n_control > 0 || n_replace > 0)) || (n_spec > 0 /* Too few points? */ && ds_weak (new_ds) < spec_info->data_weakness_cutoff) - || (n_control > 1)) + || n_control > 0 + || n_replace > 0) return HARD_DEP; return new_ds; @@ -1261,10 +1339,6 @@ static rtx last_nondebug_scheduled_insn; first unscheduled one. */ static rtx nonscheduled_insns_begin; -/* Cached cost of the instruction. Use below function to get cost of the - insn. -1 here means that the field is not initialized. */ -#define INSN_COST(INSN) (HID (INSN)->cost) - /* Compute cost of executing INSN. This is the number of cycles between instruction issue and instruction results. */ @@ -1443,6 +1517,9 @@ contributes_to_priority_p (dep_t dep) DEP_PRO (dep))) return false; + if (DEP_REPLACE (dep) != NULL) + return false; + /* If flag COUNT_SPEC_IN_CRITICAL_PATH is set, then speculative instructions will less likely be scheduled. 
That is because the priority of @@ -2136,6 +2213,31 @@ model_recompute (rtx insn) if (print_p) fprintf (sched_dump, MODEL_BAR); } + +/* After DEP, which was cancelled, has been resolved for insn NEXT, + check whether the insn's pattern needs restoring. */ +static bool +must_restore_pattern_p (rtx next, dep_t dep) +{ + if (QUEUE_INDEX (next) == QUEUE_SCHEDULED) + return false; + + if (DEP_TYPE (dep) == REG_DEP_CONTROL) + { + gcc_assert (ORIG_PAT (next) != NULL_RTX); + gcc_assert (next == DEP_CON (dep)); + } + else + { + struct dep_replacement *desc = DEP_REPLACE (dep); + if (desc->insn != next) + { + gcc_assert (*desc->loc == desc->orig); + return false; + } + } + return true; +} /* model_spill_cost (CL, P, P') returns the cost of increasing the pressure on CL from P to P'. We use this to calculate a "base ECC", @@ -3735,7 +3837,20 @@ schedule_insn (rtx insn) check_clobbered_conditions (insn); - /* Update dependent instructions. */ + /* Update dependent instructions. First, see if by scheduling this insn + now we broke a dependence in a way that requires us to change another + insn. */ + for (sd_it = sd_iterator_start (insn, SD_LIST_SPEC_BACK); + sd_iterator_cond (&sd_it, &dep); sd_iterator_next (&sd_it)) + { + struct dep_replacement *desc = DEP_REPLACE (dep); + rtx pro = DEP_PRO (dep); + if (QUEUE_INDEX (pro) != QUEUE_SCHEDULED + && desc != NULL && desc->insn == pro) + apply_replacement (dep, false); + } + + /* Go through and resolve forward dependencies. */ for (sd_it = sd_iterator_start (insn, SD_LIST_FORW); sd_iterator_cond (&sd_it, &dep);) { @@ -3749,17 +3864,8 @@ schedule_insn (rtx insn) if (cancelled) { - if (QUEUE_INDEX (next) != QUEUE_SCHEDULED) - { - int tick = INSN_TICK (next); - gcc_assert (ORIG_PAT (next) != NULL_RTX); - haifa_change_pattern (next, ORIG_PAT (next)); - INSN_TICK (next) = tick; - if (sd_lists_empty_p (next, SD_LIST_BACK)) - TODO_SPEC (next) = 0; - else if (!sd_lists_empty_p (next, SD_LIST_HARD_BACK)) - TODO_SPEC (next) = HARD_DEP; - } + if (must_restore_pattern_p (next, dep)) + restore_pattern (dep, false); continue; } @@ -3917,6 +4023,16 @@ struct haifa_saved_data to 0 when restoring. */ int q_size; rtx *insn_queue; + + /* Describe pattern replacements that occurred since this backtrack point + was queued. */ + VEC (dep_t, heap) *replacement_deps; + VEC (int, heap) *replace_apply; + + /* A copy of the next-cycle replacement vectors at the time of the backtrack + point. */ + VEC (dep_t, heap) *next_cycle_deps; + VEC (int, heap) *next_cycle_apply; }; /* A record, in reverse order, of all scheduled insns which have delay slots @@ -3973,6 +4089,11 @@ save_backtrack_point (struct delay_pair save->sched_block = sched_block; + save->replacement_deps = NULL; + save->replace_apply = NULL; + save->next_cycle_deps = VEC_copy (dep_t, heap, next_cycle_replace_deps); + save->next_cycle_apply = VEC_copy (int, heap, next_cycle_apply); + if (current_sched_info->save_state) save->fe_saved_data = (*current_sched_info->save_state) (); @@ -4042,6 +4163,25 @@ toggle_cancelled_flags (bool set) } } +/* Undo the replacements that have occurred after backtrack point SAVE + was placed. 
*/ +static void +undo_replacements_for_backtrack (struct haifa_saved_data *save) +{ + while (!VEC_empty (dep_t, save->replacement_deps)) + { + dep_t dep = VEC_pop (dep_t, save->replacement_deps); + int apply_p = VEC_pop (int, save->replace_apply); + + if (apply_p) + restore_pattern (dep, true); + else + apply_replacement (dep, true); + } + VEC_free (dep_t, heap, save->replacement_deps); + VEC_free (int, heap, save->replace_apply); +} + /* Pop entries from the SCHEDULED_INSNS vector up to and including INSN. Restore their dependencies to an unresolved state, and mark them as queued nowhere. */ @@ -4107,7 +4247,7 @@ unschedule_insns_until (rtx insn) haifa_change_pattern (con, ORIG_PAT (con)); } else if (QUEUE_INDEX (con) != QUEUE_SCHEDULED) - TODO_SPEC (con) = recompute_todo_spec (con); + TODO_SPEC (con) = recompute_todo_spec (con, true); } VEC_free (rtx, heap, recompute_vec); } @@ -4135,6 +4275,10 @@ restore_last_backtrack_point (struct sch targetm.sched.free_sched_context (save->be_saved_data); } + /* Do this first since it clobbers INSN_TICK of the involved + instructions. */ + undo_replacements_for_backtrack (save); + /* Clear the QUEUE_INDEX of everything in the ready list or one of the queues. */ if (ready.n_ready > 0) @@ -4170,7 +4314,7 @@ restore_last_backtrack_point (struct sch { rtx insn = first[i]; QUEUE_INDEX (insn) = QUEUE_READY; - TODO_SPEC (insn) = recompute_todo_spec (insn); + TODO_SPEC (insn) = recompute_todo_spec (insn, true); INSN_TICK (insn) = save->clock_var; } } @@ -4187,7 +4331,7 @@ restore_last_backtrack_point (struct sch { rtx x = XEXP (link, 0); QUEUE_INDEX (x) = i; - TODO_SPEC (x) = recompute_todo_spec (x); + TODO_SPEC (x) = recompute_todo_spec (x, true); INSN_TICK (x) = save->clock_var + i; } } @@ -4208,6 +4352,10 @@ restore_last_backtrack_point (struct sch mark_backtrack_feeds (save->delay_pair->i2, 0); + gcc_assert (VEC_empty (dep_t, next_cycle_replace_deps)); + next_cycle_replace_deps = VEC_copy (dep_t, heap, save->next_cycle_deps); + next_cycle_apply = VEC_copy (int, heap, save->next_cycle_apply); + free (save); for (save = backtrack_queue; save; save = save->next) @@ -4237,7 +4385,14 @@ free_topmost_backtrack_point (bool reset INSN_EXACT_TICK (pair->i2) = INVALID_TICK; pair = pair->next_same_i1; } + undo_replacements_for_backtrack (save); } + else + { + VEC_free (dep_t, heap, save->replacement_deps); + VEC_free (int, heap, save->replace_apply); + } + if (targetm.sched.free_sched_context) targetm.sched.free_sched_context (save->be_saved_data); if (current_sched_info->restore_state) @@ -4258,6 +4413,124 @@ free_backtrack_queue (void) free_topmost_backtrack_point (false); } +/* Apply a replacement described by DESC. If IMMEDIATELY is false, we + may have to postpone the replacement until the start of the next cycle, + at which point we will be called again with IMMEDIATELY true. This is + only done for machines which have instruction packets with explicit + parallelism however. 
*/ +static void +apply_replacement (dep_t dep, bool immediately) +{ + struct dep_replacement *desc = DEP_REPLACE (dep); + if (!immediately && targetm.sched.exposed_pipeline && reload_completed) + { + VEC_safe_push (dep_t, heap, next_cycle_replace_deps, dep); + VEC_safe_push (int, heap, next_cycle_apply, 1); + } + else + { + bool success; + + if (QUEUE_INDEX (desc->insn) == QUEUE_SCHEDULED) + return; + + if (sched_verbose >= 5) + fprintf (sched_dump, "applying replacement for insn %d\n", + INSN_UID (desc->insn)); + + success = validate_change (desc->insn, desc->loc, desc->newval, 0); + gcc_assert (success); + + update_insn_after_change (desc->insn); + if ((TODO_SPEC (desc->insn) & (HARD_DEP | DEP_POSTPONED)) == 0) + fix_tick_ready (desc->insn); + + if (backtrack_queue != NULL) + { + VEC_safe_push (dep_t, heap, backtrack_queue->replacement_deps, dep); + VEC_safe_push (int, heap, backtrack_queue->replace_apply, 1); + } + } +} + +/* We have determined that a pattern involved in DEP must be restored. + If IMMEDIATELY is false, we may have to postpone the replacement + until the start of the next cycle, at which point we will be called + again with IMMEDIATELY true. */ +static void +restore_pattern (dep_t dep, bool immediately) +{ + rtx next = DEP_CON (dep); + int tick = INSN_TICK (next); + + /* If we already scheduled the insn, the modified version is + correct. */ + if (QUEUE_INDEX (next) == QUEUE_SCHEDULED) + return; + + if (!immediately && targetm.sched.exposed_pipeline && reload_completed) + { + VEC_safe_push (dep_t, heap, next_cycle_replace_deps, dep); + VEC_safe_push (int, heap, next_cycle_apply, 0); + return; + } + + + if (DEP_TYPE (dep) == REG_DEP_CONTROL) + { + if (sched_verbose >= 5) + fprintf (sched_dump, "restoring pattern for insn %d\n", + INSN_UID (next)); + haifa_change_pattern (next, ORIG_PAT (next)); + } + else + { + struct dep_replacement *desc = DEP_REPLACE (dep); + bool success; + + if (sched_verbose >= 5) + fprintf (sched_dump, "restoring pattern for insn %d\n", + INSN_UID (desc->insn)); + tick = INSN_TICK (desc->insn); + + success = validate_change (desc->insn, desc->loc, desc->orig, 0); + gcc_assert (success); + update_insn_after_change (desc->insn); + if (backtrack_queue != NULL) + { + VEC_safe_push (dep_t, heap, backtrack_queue->replacement_deps, dep); + VEC_safe_push (int, heap, backtrack_queue->replace_apply, 0); + } + } + INSN_TICK (next) = tick; + if (TODO_SPEC (next) == DEP_POSTPONED) + return; + + if (sd_lists_empty_p (next, SD_LIST_BACK)) + TODO_SPEC (next) = 0; + else if (!sd_lists_empty_p (next, SD_LIST_HARD_BACK)) + TODO_SPEC (next) = HARD_DEP; +} + +/* Perform pattern replacements that were queued up until the next + cycle. */ +static void +perform_replacements_new_cycle (void) +{ + int i; + dep_t dep; + FOR_EACH_VEC_ELT (dep_t, next_cycle_replace_deps, i, dep) + { + int apply_p = VEC_index (int, next_cycle_apply, i); + if (apply_p) + apply_replacement (dep, true); + else + restore_pattern (dep, true); + } + VEC_truncate (dep_t, next_cycle_replace_deps, 0); + VEC_truncate (int, next_cycle_apply, 0); +} + /* Compute INSN_TICK_ESTIMATE for INSN. PROCESSED is a bitmap of instructions we've previously encountered, a set bit prevents recursion. BUDGET is a limit on how far ahead we look, it is @@ -4515,6 +4788,30 @@ restore_other_notes (rtx head, basic_blo return head; } +/* When we know we are going to discard the schedule due to a failed attempt + at modulo scheduling, undo all replacements. 
*/ +static void +undo_all_replacements (void) +{ + rtx insn; + int i; + + FOR_EACH_VEC_ELT (rtx, scheduled_insns, i, insn) + { + sd_iterator_def sd_it; + dep_t dep; + + /* See if we must undo a replacement. */ + for (sd_it = sd_iterator_start (insn, SD_LIST_RES_FORW); + sd_iterator_cond (&sd_it, &dep); sd_iterator_next (&sd_it)) + { + struct dep_replacement *desc = DEP_REPLACE (dep); + if (desc != NULL) + validate_change (desc->insn, desc->loc, desc->orig, 0); + } + } +} + /* Move insns that became ready to fire from queue to ready list. */ static void @@ -5556,6 +5853,9 @@ schedule_block (basic_block *target_bb) rtx head = NEXT_INSN (prev_head); rtx tail = PREV_INSN (next_tail); + if ((current_sched_info->flags & DONT_BREAK_DEPENDENCIES) == 0) + find_modifiable_mems (head, tail); + /* We used to have code to avoid getting parameters moved from hard argument registers into pseudos. @@ -5676,6 +5976,7 @@ schedule_block (basic_block *target_bb) /* Loop until all the insns in BB are scheduled. */ while ((*current_sched_info->schedule_more_p) ()) { + perform_replacements_new_cycle (); do { start_clock_var = clock_var; @@ -5892,7 +6193,7 @@ schedule_block (basic_block *target_bb) /* We normally get here only if we don't want to move insn from the split block. */ { - TODO_SPEC (insn) = HARD_DEP; + TODO_SPEC (insn) = DEP_POSTPONED; goto restart_choose_ready; } @@ -6003,6 +6304,8 @@ schedule_block (basic_block *target_bb) gcc_assert (failed); failed_insn = failed->delay_pair->i1; + /* Clear these queues. */ + perform_replacements_new_cycle (); toggle_cancelled_flags (false); unschedule_insns_until (failed_insn); while (failed != backtrack_queue) @@ -6030,6 +6333,7 @@ schedule_block (basic_block *target_bb) if (ls.modulo_epilogue) success = true; end_schedule: + perform_replacements_new_cycle (); if (modulo_ii > 0) { /* Once again, debug insn suckiness: they can be on the ready list @@ -6069,6 +6373,9 @@ schedule_block (basic_block *target_bb) } } + if (!success) + undo_all_replacements (); + /* Debug info. */ if (sched_verbose) { @@ -6552,14 +6859,15 @@ try_ready (rtx next) old_ts = TODO_SPEC (next); - gcc_assert (!(old_ts & ~(SPECULATIVE | HARD_DEP | DEP_CONTROL)) - && ((old_ts & HARD_DEP) + gcc_assert (!(old_ts & ~(SPECULATIVE | HARD_DEP | DEP_CONTROL | DEP_POSTPONED)) + && (old_ts == HARD_DEP + || old_ts == DEP_POSTPONED || (old_ts & SPECULATIVE) - || (old_ts & DEP_CONTROL))); + || old_ts == DEP_CONTROL)); - new_ts = recompute_todo_spec (next); + new_ts = recompute_todo_spec (next, false); - if (new_ts & HARD_DEP) + if (new_ts & (HARD_DEP | DEP_POSTPONED)) gcc_assert (new_ts == old_ts && QUEUE_INDEX (next) == QUEUE_NOWHERE); else if (current_sched_info->new_ready) @@ -6627,7 +6935,7 @@ try_ready (rtx next) TODO_SPEC (next) = new_ts; - if (new_ts & HARD_DEP) + if (new_ts & (HARD_DEP | DEP_POSTPONED)) { /* We can't assert (QUEUE_INDEX (next) == QUEUE_NOWHERE) here because control-speculative NEXT could have been discarded by sched-rgn.c @@ -7647,27 +7955,13 @@ fix_recovery_deps (basic_block rec) static bool haifa_change_pattern (rtx insn, rtx new_pat) { - sd_iterator_def sd_it; - dep_t dep; int t; t = validate_change (insn, &PATTERN (insn), new_pat, 0); if (!t) return false; - dfa_clear_single_insn_cache (insn); - sd_it = sd_iterator_start (insn, - SD_LIST_FORW | SD_LIST_BACK | SD_LIST_RES_BACK); - while (sd_iterator_cond (&sd_it, &dep)) - { - DEP_COST (dep) = UNKNOWN_DEP_COST; - sd_iterator_next (&sd_it); - } - - /* Invalidate INSN_COST, so it'll be recalculated. 
*/ - INSN_COST (insn) = -1; - /* Invalidate INSN_TICK, so it'll be recalculated. */ - INSN_TICK (insn) = INVALID_TICK; + update_insn_after_change (insn); return true; } Index: gcc/sched-deps.c =================================================================== --- gcc/sched-deps.c (revision 189284) +++ gcc/sched-deps.c (working copy) @@ -38,6 +38,7 @@ along with GCC; see the file COPYING3. #include "insn-attr.h" #include "except.h" #include "recog.h" +#include "emit-rtl.h" #include "sched-int.h" #include "params.h" #include "cselib.h" @@ -108,6 +109,9 @@ init_dep_1 (dep_t dep, rtx pro, rtx con, DEP_TYPE (dep) = type; DEP_STATUS (dep) = ds; DEP_COST (dep) = UNKNOWN_DEP_COST; + DEP_NONREG (dep) = 0; + DEP_MULTIPLE (dep) = 0; + DEP_REPLACE (dep) = NULL; } /* Init DEP with the arguments. @@ -433,6 +437,8 @@ dep_spec_p (dep_t dep) if (DEP_TYPE (dep) == REG_DEP_CONTROL) return true; } + if (DEP_REPLACE (dep) != NULL) + return true; return false; } @@ -471,11 +477,14 @@ static bitmap_head *control_dependency_c static bitmap_head *spec_dependency_cache = NULL; static int cache_size; +/* True if we should mark added dependencies as a non-register deps. */ +static bool mark_as_hard; + static int deps_may_trap_p (const_rtx); static void add_dependence_1 (rtx, rtx, enum reg_note); -static void add_dependence_list (rtx, rtx, int, enum reg_note); +static void add_dependence_list (rtx, rtx, int, enum reg_note, bool); static void add_dependence_list_and_free (struct deps_desc *, rtx, - rtx *, int, enum reg_note); + rtx *, int, enum reg_note, bool); static void delete_all_dependences (rtx); static void chain_to_prev_insn (rtx); @@ -1135,6 +1144,9 @@ update_dep (dep_t dep, dep_t new_dep, enum reg_note old_type = DEP_TYPE (dep); bool was_spec = dep_spec_p (dep); + DEP_NONREG (dep) |= DEP_NONREG (new_dep); + DEP_MULTIPLE (dep) = 1; + /* If this is a more restrictive type of dependence than the existing one, then change the existing dependence to this type. */ @@ -1537,31 +1549,36 @@ add_dependence (rtx con, rtx pro, enum r fprintf (sched_dump, "making DEP_CONTROL for %d\n", INSN_UID (real_pro)); add_dependence_list (con, INSN_COND_DEPS (real_pro), 0, - REG_DEP_TRUE); + REG_DEP_TRUE, false); } } add_dependence_1 (con, pro, dep_type); } -/* A convenience wrapper to operate on an entire list. */ +/* A convenience wrapper to operate on an entire list. HARD should be + true if DEP_NONREG should be set on newly created dependencies. */ static void -add_dependence_list (rtx insn, rtx list, int uncond, enum reg_note dep_type) +add_dependence_list (rtx insn, rtx list, int uncond, enum reg_note dep_type, + bool hard) { + mark_as_hard = hard; for (; list; list = XEXP (list, 1)) { if (uncond || ! sched_insns_conditions_mutex_p (insn, XEXP (list, 0))) add_dependence (insn, XEXP (list, 0), dep_type); } + mark_as_hard = false; } /* Similar, but free *LISTP at the same time, when the context - is not readonly. */ + is not readonly. HARD should be true if DEP_NONREG should be set on + newly created dependencies. */ static void add_dependence_list_and_free (struct deps_desc *deps, rtx insn, rtx *listp, - int uncond, enum reg_note dep_type) + int uncond, enum reg_note dep_type, bool hard) { rtx list, next; @@ -1570,10 +1587,11 @@ add_dependence_list_and_free (struct dep disregarded. 
*/ if (deps->readonly || DEBUG_INSN_P (insn)) { - add_dependence_list (insn, *listp, uncond, dep_type); + add_dependence_list (insn, *listp, uncond, dep_type, hard); return; } + mark_as_hard = hard; for (list = *listp, *listp = NULL; list ; list = next) { next = XEXP (list, 1); @@ -1581,6 +1599,7 @@ add_dependence_list_and_free (struct dep add_dependence (insn, XEXP (list, 0), dep_type); free_INSN_LIST_node (list); } + mark_as_hard = false; } /* Remove all occurrences of INSN from LIST. Return the number of @@ -1746,7 +1765,7 @@ flush_pending_lists (struct deps_desc *d if (for_write) { add_dependence_list_and_free (deps, insn, &deps->pending_read_insns, - 1, REG_DEP_ANTI); + 1, REG_DEP_ANTI, true); if (!deps->readonly) { free_EXPR_LIST_list (&deps->pending_read_mems); @@ -1755,14 +1774,16 @@ flush_pending_lists (struct deps_desc *d } add_dependence_list_and_free (deps, insn, &deps->pending_write_insns, 1, - for_read ? REG_DEP_ANTI : REG_DEP_OUTPUT); + for_read ? REG_DEP_ANTI : REG_DEP_OUTPUT, + true); add_dependence_list_and_free (deps, insn, &deps->last_pending_memory_flush, 1, - for_read ? REG_DEP_ANTI : REG_DEP_OUTPUT); + for_read ? REG_DEP_ANTI : REG_DEP_OUTPUT, + true); add_dependence_list_and_free (deps, insn, &deps->pending_jump_insns, 1, - REG_DEP_ANTI); + REG_DEP_ANTI, true); if (!deps->readonly) { @@ -1827,6 +1848,7 @@ haifa_note_mem_dep (rtx mem, rtx pending init_dep_1 (dep, pending_insn, cur_insn, ds_to_dt (ds), current_sched_info->flags & USE_DEPS_LIST ? ds : 0); + DEP_NONREG (dep) = 1; maybe_add_or_update_dep_1 (dep, false, pending_mem, mem); } @@ -1839,6 +1861,8 @@ haifa_note_dep (rtx elem, ds_t ds) dep_t dep = &_dep; init_dep (dep, elem, cur_insn, ds_to_dt (ds)); + if (mark_as_hard) + DEP_NONREG (dep) = 1; maybe_add_or_update_dep_1 (dep, false, NULL_RTX, NULL_RTX); } @@ -2343,7 +2367,7 @@ sched_analyze_reg (struct deps_desc *dep = alloc_INSN_LIST (insn, deps->sched_before_next_call); else add_dependence_list (insn, deps->last_function_call, 1, - REG_DEP_ANTI); + REG_DEP_ANTI, false); } } } @@ -2499,9 +2523,9 @@ sched_analyze_1 (struct deps_desc *deps, } add_dependence_list (insn, deps->last_pending_memory_flush, 1, - REG_DEP_ANTI); + REG_DEP_ANTI, true); add_dependence_list (insn, deps->pending_jump_insns, 1, - REG_DEP_CONTROL); + REG_DEP_CONTROL, true); if (!deps->readonly) add_insn_mem_dependence (deps, false, insn, dest); @@ -2801,7 +2825,7 @@ sched_analyze_insn (struct deps_desc *de /* Avoid moving trapping instructions across function calls that might not always return. */ add_dependence_list (insn, deps->last_function_call_may_noreturn, - 1, REG_DEP_ANTI); + 1, REG_DEP_ANTI, true); /* We must avoid creating a situation in which two successors of the current block have different unwind info after scheduling. If at any @@ -2818,7 +2842,8 @@ sched_analyze_insn (struct deps_desc *de = alloc_INSN_LIST (insn, deps->sched_before_next_jump); /* Make sure epilogue insn is scheduled after preceding jumps. */ - add_dependence_list (insn, deps->pending_jump_insns, 1, REG_DEP_ANTI); + add_dependence_list (insn, deps->pending_jump_insns, 1, REG_DEP_ANTI, + true); } if (code == COND_EXEC) @@ -2839,7 +2864,7 @@ sched_analyze_insn (struct deps_desc *de instruction so that reg-stack won't get confused. 
*/ if (code == CLOBBER) add_dependence_list (insn, deps->last_function_call, 1, - REG_DEP_OUTPUT); + REG_DEP_OUTPUT, true); } else if (code == PARALLEL) { @@ -2900,11 +2925,12 @@ sched_analyze_insn (struct deps_desc *de EXECUTE_IF_SET_IN_REG_SET (reg_pending_control_uses, 0, i, rsi) { struct deps_reg *reg_last = &deps->reg_last[i]; - add_dependence_list (insn, reg_last->sets, 0, REG_DEP_ANTI); + add_dependence_list (insn, reg_last->sets, 0, REG_DEP_ANTI, + false); add_dependence_list (insn, reg_last->implicit_sets, - 0, REG_DEP_ANTI); + 0, REG_DEP_ANTI, false); add_dependence_list (insn, reg_last->clobbers, 0, - REG_DEP_ANTI); + REG_DEP_ANTI, false); } } @@ -2934,9 +2960,9 @@ sched_analyze_insn (struct deps_desc *de } add_dependence_list (insn, deps->last_pending_memory_flush, 1, - REG_DEP_ANTI); + REG_DEP_ANTI, true); add_dependence_list (insn, deps->pending_jump_insns, 1, - REG_DEP_ANTI); + REG_DEP_ANTI, true); } } @@ -2969,19 +2995,20 @@ sched_analyze_insn (struct deps_desc *de add_dependence (insn, prev, REG_DEP_ANTI); add_dependence_list (insn, deps->last_function_call, 1, - REG_DEP_ANTI); + REG_DEP_ANTI, false); - for (u = deps->last_pending_memory_flush; u; u = XEXP (u, 1)) - if (!sel_sched_p ()) + if (!sel_sched_p ()) + for (u = deps->last_pending_memory_flush; u; u = XEXP (u, 1)) add_dependence (insn, XEXP (u, 0), REG_DEP_ANTI); EXECUTE_IF_SET_IN_REG_SET (reg_pending_uses, 0, i, rsi) { struct deps_reg *reg_last = &deps->reg_last[i]; - add_dependence_list (insn, reg_last->sets, 1, REG_DEP_ANTI); + add_dependence_list (insn, reg_last->sets, 1, REG_DEP_ANTI, false); /* There's no point in making REG_DEP_CONTROL dependencies for debug insns. */ - add_dependence_list (insn, reg_last->clobbers, 1, REG_DEP_ANTI); + add_dependence_list (insn, reg_last->clobbers, 1, REG_DEP_ANTI, + false); if (!deps->readonly) reg_last->uses = alloc_INSN_LIST (insn, reg_last->uses); @@ -3007,9 +3034,11 @@ sched_analyze_insn (struct deps_desc *de EXECUTE_IF_SET_IN_REG_SET (reg_pending_uses, 0, i, rsi) { struct deps_reg *reg_last = &deps->reg_last[i]; - add_dependence_list (insn, reg_last->sets, 0, REG_DEP_TRUE); - add_dependence_list (insn, reg_last->implicit_sets, 0, REG_DEP_ANTI); - add_dependence_list (insn, reg_last->clobbers, 0, REG_DEP_TRUE); + add_dependence_list (insn, reg_last->sets, 0, REG_DEP_TRUE, false); + add_dependence_list (insn, reg_last->implicit_sets, 0, REG_DEP_ANTI, + false); + add_dependence_list (insn, reg_last->clobbers, 0, REG_DEP_TRUE, + false); if (!deps->readonly) { @@ -3022,10 +3051,11 @@ sched_analyze_insn (struct deps_desc *de if (TEST_HARD_REG_BIT (implicit_reg_pending_uses, i)) { struct deps_reg *reg_last = &deps->reg_last[i]; - add_dependence_list (insn, reg_last->sets, 0, REG_DEP_TRUE); + add_dependence_list (insn, reg_last->sets, 0, REG_DEP_TRUE, false); add_dependence_list (insn, reg_last->implicit_sets, 0, - REG_DEP_ANTI); - add_dependence_list (insn, reg_last->clobbers, 0, REG_DEP_TRUE); + REG_DEP_ANTI, false); + add_dependence_list (insn, reg_last->clobbers, 0, REG_DEP_TRUE, + false); if (!deps->readonly) { @@ -3060,12 +3090,14 @@ sched_analyze_insn (struct deps_desc *de EXECUTE_IF_SET_IN_REG_SET (reg_pending_clobbers, 0, i, rsi) { struct deps_reg *reg_last = &deps->reg_last[i]; - add_dependence_list (insn, reg_last->sets, 0, REG_DEP_OUTPUT); + add_dependence_list (insn, reg_last->sets, 0, REG_DEP_OUTPUT, + false); add_dependence_list (insn, reg_last->implicit_sets, 0, - REG_DEP_ANTI); - add_dependence_list (insn, reg_last->uses, 0, REG_DEP_ANTI); + REG_DEP_ANTI, 
false); + add_dependence_list (insn, reg_last->uses, 0, REG_DEP_ANTI, + false); add_dependence_list (insn, reg_last->control_uses, 0, - REG_DEP_CONTROL); + REG_DEP_CONTROL, false); if (!deps->readonly) { @@ -3077,13 +3109,16 @@ sched_analyze_insn (struct deps_desc *de EXECUTE_IF_SET_IN_REG_SET (reg_pending_sets, 0, i, rsi) { struct deps_reg *reg_last = &deps->reg_last[i]; - add_dependence_list (insn, reg_last->sets, 0, REG_DEP_OUTPUT); + add_dependence_list (insn, reg_last->sets, 0, REG_DEP_OUTPUT, + false); add_dependence_list (insn, reg_last->implicit_sets, 0, - REG_DEP_ANTI); - add_dependence_list (insn, reg_last->clobbers, 0, REG_DEP_OUTPUT); - add_dependence_list (insn, reg_last->uses, 0, REG_DEP_ANTI); + REG_DEP_ANTI, false); + add_dependence_list (insn, reg_last->clobbers, 0, REG_DEP_OUTPUT, + false); + add_dependence_list (insn, reg_last->uses, 0, REG_DEP_ANTI, + false); add_dependence_list (insn, reg_last->control_uses, 0, - REG_DEP_CONTROL); + REG_DEP_CONTROL, false); if (!deps->readonly) reg_last->sets = alloc_INSN_LIST (insn, reg_last->sets); @@ -3098,17 +3133,18 @@ sched_analyze_insn (struct deps_desc *de || reg_last->clobbers_length > MAX_PENDING_LIST_LENGTH) { add_dependence_list_and_free (deps, insn, ®_last->sets, 0, - REG_DEP_OUTPUT); + REG_DEP_OUTPUT, false); add_dependence_list_and_free (deps, insn, ®_last->implicit_sets, 0, - REG_DEP_ANTI); + REG_DEP_ANTI, false); add_dependence_list_and_free (deps, insn, ®_last->uses, 0, - REG_DEP_ANTI); + REG_DEP_ANTI, false); add_dependence_list_and_free (deps, insn, ®_last->control_uses, 0, - REG_DEP_ANTI); - add_dependence_list_and_free - (deps, insn, ®_last->clobbers, 0, REG_DEP_OUTPUT); + REG_DEP_ANTI, false); + add_dependence_list_and_free (deps, insn, + ®_last->clobbers, 0, + REG_DEP_OUTPUT, false); if (!deps->readonly) { @@ -3119,12 +3155,14 @@ sched_analyze_insn (struct deps_desc *de } else { - add_dependence_list (insn, reg_last->sets, 0, REG_DEP_OUTPUT); + add_dependence_list (insn, reg_last->sets, 0, REG_DEP_OUTPUT, + false); add_dependence_list (insn, reg_last->implicit_sets, 0, - REG_DEP_ANTI); - add_dependence_list (insn, reg_last->uses, 0, REG_DEP_ANTI); + REG_DEP_ANTI, false); + add_dependence_list (insn, reg_last->uses, 0, REG_DEP_ANTI, + false); add_dependence_list (insn, reg_last->control_uses, 0, - REG_DEP_CONTROL); + REG_DEP_CONTROL, false); } if (!deps->readonly) @@ -3139,16 +3177,16 @@ sched_analyze_insn (struct deps_desc *de struct deps_reg *reg_last = &deps->reg_last[i]; add_dependence_list_and_free (deps, insn, ®_last->sets, 0, - REG_DEP_OUTPUT); + REG_DEP_OUTPUT, false); add_dependence_list_and_free (deps, insn, ®_last->implicit_sets, - 0, REG_DEP_ANTI); + 0, REG_DEP_ANTI, false); add_dependence_list_and_free (deps, insn, ®_last->clobbers, 0, - REG_DEP_OUTPUT); + REG_DEP_OUTPUT, false); add_dependence_list_and_free (deps, insn, ®_last->uses, 0, - REG_DEP_ANTI); + REG_DEP_ANTI, false); add_dependence_list (insn, reg_last->control_uses, 0, - REG_DEP_CONTROL); + REG_DEP_CONTROL, false); if (!deps->readonly) { @@ -3173,10 +3211,11 @@ sched_analyze_insn (struct deps_desc *de if (TEST_HARD_REG_BIT (implicit_reg_pending_clobbers, i)) { struct deps_reg *reg_last = &deps->reg_last[i]; - add_dependence_list (insn, reg_last->sets, 0, REG_DEP_ANTI); - add_dependence_list (insn, reg_last->clobbers, 0, REG_DEP_ANTI); - add_dependence_list (insn, reg_last->uses, 0, REG_DEP_ANTI); - add_dependence_list (insn, reg_last->control_uses, 0, REG_DEP_ANTI); + add_dependence_list (insn, reg_last->sets, 0, REG_DEP_ANTI, false); + 
add_dependence_list (insn, reg_last->clobbers, 0, REG_DEP_ANTI, false); + add_dependence_list (insn, reg_last->uses, 0, REG_DEP_ANTI, false); + add_dependence_list (insn, reg_last->control_uses, 0, REG_DEP_ANTI, + false); if (!deps->readonly) reg_last->implicit_sets @@ -3214,15 +3253,16 @@ sched_analyze_insn (struct deps_desc *de EXECUTE_IF_SET_IN_REG_SET (&deps->reg_last_in_use, 0, i, rsi) { struct deps_reg *reg_last = &deps->reg_last[i]; - add_dependence_list (insn, reg_last->uses, 0, REG_DEP_ANTI); + add_dependence_list (insn, reg_last->uses, 0, REG_DEP_ANTI, + true); add_dependence_list (insn, reg_last->sets, 0, reg_pending_barrier == TRUE_BARRIER - ? REG_DEP_TRUE : REG_DEP_ANTI); + ? REG_DEP_TRUE : REG_DEP_ANTI, true); add_dependence_list (insn, reg_last->implicit_sets, 0, - REG_DEP_ANTI); + REG_DEP_ANTI, true); add_dependence_list (insn, reg_last->clobbers, 0, reg_pending_barrier == TRUE_BARRIER - ? REG_DEP_TRUE : REG_DEP_ANTI); + ? REG_DEP_TRUE : REG_DEP_ANTI, true); } } else @@ -3231,19 +3271,21 @@ sched_analyze_insn (struct deps_desc *de { struct deps_reg *reg_last = &deps->reg_last[i]; add_dependence_list_and_free (deps, insn, ®_last->uses, 0, - REG_DEP_ANTI); + REG_DEP_ANTI, true); add_dependence_list_and_free (deps, insn, ®_last->control_uses, 0, - REG_DEP_CONTROL); + REG_DEP_CONTROL, true); add_dependence_list_and_free (deps, insn, ®_last->sets, 0, reg_pending_barrier == TRUE_BARRIER - ? REG_DEP_TRUE : REG_DEP_ANTI); + ? REG_DEP_TRUE : REG_DEP_ANTI, + true); add_dependence_list_and_free (deps, insn, ®_last->implicit_sets, 0, - REG_DEP_ANTI); + REG_DEP_ANTI, true); add_dependence_list_and_free (deps, insn, ®_last->clobbers, 0, reg_pending_barrier == TRUE_BARRIER - ? REG_DEP_TRUE : REG_DEP_ANTI); + ? REG_DEP_TRUE : REG_DEP_ANTI, + true); if (!deps->readonly) { @@ -3526,7 +3568,7 @@ deps_analyze_insn (struct deps_desc *dep /* For each insn which shouldn't cross a jump, add a dependence. */ add_dependence_list_and_free (deps, insn, &deps->sched_before_next_jump, 1, - REG_DEP_ANTI); + REG_DEP_ANTI, true); sched_analyze_insn (deps, PATTERN (insn), insn); } @@ -3582,7 +3624,7 @@ deps_analyze_insn (struct deps_desc *dep between that insn and this call insn. */ add_dependence_list_and_free (deps, insn, &deps->sched_before_next_call, 1, - REG_DEP_ANTI); + REG_DEP_ANTI, true); sched_analyze_insn (deps, PATTERN (insn), insn); @@ -4476,4 +4518,305 @@ check_dep (dep_t dep, bool relaxed_p) } #endif /* ENABLE_CHECKING */ +/* Holds information about a pair of memory reference and register increment + insns which depend on each other, but could possibly be interchanged. */ +struct mem_inc_info +{ + rtx inc_insn; + rtx mem_insn; + + rtx *mem_loc; + /* A register occurring in the memory address for which we wish to break + the dependence. This must be identical to the destination register of + the increment. */ + rtx mem_reg0; + /* Any kind of index that is added to that register. */ + rtx mem_index; + /* The constant offset used in the memory address. */ + HOST_WIDE_INT mem_constant; + /* The constant added in the increment insn. Negated if the increment is + after the memory address. */ + HOST_WIDE_INT inc_constant; + /* The source register used in the increment. May be different from mem_reg0 + if the increment occurs before the memory address. */ + rtx inc_input; +}; + +/* Verify that the memory location described in MII can be replaced with + one using NEW_ADDR. Return the new memory reference or NULL_RTX. The + insn remains unchanged by this function. 
*/ + +static rtx +attempt_change (struct mem_inc_info *mii, rtx new_addr) +{ + rtx mem = *mii->mem_loc; + rtx new_mem; + + /* Jump thru a lot of hoops to keep the attributes up to date. We + do not want to call one of the change address variants that take + an offset even though we know the offset in many cases. These + assume you are changing where the address is pointing by the + offset. */ + new_mem = replace_equiv_address_nv (mem, new_addr); + if (! validate_change (mii->mem_insn, mii->mem_loc, new_mem, 0)) + { + if (sched_verbose >= 5) + fprintf (sched_dump, "validation failure\n"); + return NULL_RTX; + } + + /* Put back the old one. */ + validate_change (mii->mem_insn, mii->mem_loc, mem, 0); + + return new_mem; +} + +/* Return true if INSN is of a form "a = b op c" where a and b are + regs. op is + if c is a reg and +|- if c is a const. Fill in + informantion in MII about what is found. + BEFORE_MEM indicates whether the increment is found before or after + a corresponding memory reference. */ + +static bool +parse_add_or_inc (struct mem_inc_info *mii, rtx insn, bool before_mem) +{ + rtx pat = single_set (insn); + rtx src, cst; + bool regs_equal; + + if (RTX_FRAME_RELATED_P (insn) || !pat) + return false; + + /* Result must be single reg. */ + if (!REG_P (SET_DEST (pat))) + return false; + + if (GET_CODE (SET_SRC (pat)) != PLUS + && GET_CODE (SET_SRC (pat)) != MINUS) + return false; + + mii->inc_insn = insn; + src = SET_SRC (pat); + mii->inc_input = XEXP (src, 0); + + if (!REG_P (XEXP (src, 0))) + return false; + + if (!rtx_equal_p (SET_DEST (pat), mii->mem_reg0)) + return false; + + cst = XEXP (src, 1); + if (!CONST_INT_P (cst)) + return false; + mii->inc_constant = INTVAL (cst); + + regs_equal = rtx_equal_p (mii->inc_input, mii->mem_reg0); + + if (!before_mem) + { + mii->inc_constant = -mii->inc_constant; + if (!regs_equal) + return false; + } + + if (regs_equal && REGNO (SET_DEST (pat)) == STACK_POINTER_REGNUM) + /* Note that the sign has already been reversed for !before_mem. */ + return mii->inc_constant > 0; + + return true; +} + +/* Once a suitable mem reference has been found and the corresponding data + in MII has been filled in, this function is called to find a suitable + add or inc insn involving the register we found in the memory + reference. */ + +static bool +find_inc (struct mem_inc_info *mii, bool backwards) +{ + sd_iterator_def sd_it; + dep_t dep; + + sd_it = sd_iterator_start (mii->mem_insn, + backwards ? SD_LIST_HARD_BACK : SD_LIST_FORW); + while (sd_iterator_cond (&sd_it, &dep)) + { + dep_node_t node = DEP_LINK_NODE (*sd_it.linkp); + rtx pro = DEP_PRO (dep); + rtx con = DEP_CON (dep); + rtx inc_cand = backwards ? pro : con; + if (DEP_NONREG (dep) || DEP_MULTIPLE (dep)) + goto next; + if (parse_add_or_inc (mii, inc_cand, backwards)) + { + struct dep_replacement *desc; + df_ref *def_rec; + rtx newaddr, newmem; + + if (sched_verbose >= 5) + fprintf (sched_dump, "candidate mem/inc pair: %d %d\n", + INSN_UID (mii->mem_insn), INSN_UID (inc_cand)); + + /* Need to assure that none of the operands of the inc + instruction are assigned to by the mem insn. 
*/ + for (def_rec = DF_INSN_DEFS (mii->mem_insn); *def_rec; def_rec++) + { + df_ref def = *def_rec; + if (reg_overlap_mentioned_p (DF_REF_REG (def), mii->inc_input) + || reg_overlap_mentioned_p (DF_REF_REG (def), mii->mem_reg0)) + { + if (sched_verbose >= 5) + fprintf (sched_dump, + "inc conflicts with store failure.\n"); + goto next; + } + } + newaddr = mii->inc_input; + if (mii->mem_index != NULL_RTX) + newaddr = gen_rtx_PLUS (GET_MODE (newaddr), newaddr, + mii->mem_index); + newaddr = plus_constant (GET_MODE (newaddr), newaddr, + mii->mem_constant + mii->inc_constant); + newmem = attempt_change (mii, newaddr); + if (newmem == NULL_RTX) + goto next; + if (sched_verbose >= 5) + fprintf (sched_dump, "successful address replacement\n"); + desc = XCNEW (struct dep_replacement); + DEP_REPLACE (dep) = desc; + desc->loc = mii->mem_loc; + desc->newval = newmem; + desc->orig = *desc->loc; + desc->insn = mii->mem_insn; + move_dep_link (DEP_NODE_BACK (node), INSN_HARD_BACK_DEPS (con), + INSN_SPEC_BACK_DEPS (con)); + if (backwards) + { + FOR_EACH_DEP (mii->inc_insn, SD_LIST_BACK, sd_it, dep) + if (modified_in_p (mii->inc_input, DEP_PRO (dep))) + add_dependence_1 (mii->mem_insn, DEP_PRO (dep), + REG_DEP_TRUE); + } + else + { + FOR_EACH_DEP (mii->inc_insn, SD_LIST_FORW, sd_it, dep) + if (modified_in_p (mii->inc_input, DEP_CON (dep))) + add_dependence_1 (DEP_CON (dep), mii->mem_insn, + REG_DEP_ANTI); + } + return true; + } + next: + sd_iterator_next (&sd_it); + } + return false; +} + +/* A recursive function that walks ADDRESS_OF_X to find memory references + which could be modified during scheduling. We call find_inc for each + one we find that has a recognizable form. MII holds information about + the pair of memory/increment instructions. + We ensure that every instruction with a memory reference (which will be + the location of the replacement) is assigned at most one breakable + dependency. */ + +static bool +find_mem (struct mem_inc_info *mii, rtx *address_of_x) +{ + rtx x = *address_of_x; + enum rtx_code code = GET_CODE (x); + const char *const fmt = GET_RTX_FORMAT (code); + int i; + + if (code == MEM) + { + rtx reg0 = XEXP (x, 0); + + mii->mem_loc = address_of_x; + mii->mem_index = NULL_RTX; + mii->mem_constant = 0; + if (GET_CODE (reg0) == PLUS && CONST_INT_P (XEXP (reg0, 1))) + { + mii->mem_constant = INTVAL (XEXP (reg0, 1)); + reg0 = XEXP (reg0, 0); + } + if (GET_CODE (reg0) == PLUS) + { + mii->mem_index = XEXP (reg0, 1); + reg0 = XEXP (reg0, 0); + } + if (REG_P (reg0)) + { + df_ref *def_rec; + int occurrences = 0; + + /* Make sure this reg appears only once in this insn. Can't use + count_occurrences since that only works for pseudos. */ + for (def_rec = DF_INSN_USES (mii->mem_insn); *def_rec; def_rec++) + { + df_ref def = *def_rec; + if (reg_overlap_mentioned_p (reg0, DF_REF_REG (def))) + if (++occurrences > 1) + { + if (sched_verbose >= 5) + fprintf (sched_dump, "mem count failure\n"); + return false; + } + } + + mii->mem_reg0 = reg0; + return find_inc (mii, true) || find_inc (mii, false); + } + return false; + } + + if (code == SIGN_EXTRACT || code == ZERO_EXTRACT) + { + /* If REG occurs inside a MEM used in a bit-field reference, + that is unacceptable. */ + return false; + } + + /* Time for some deep diving. 
*/ + for (i = GET_RTX_LENGTH (code) - 1; i >= 0; i--) + { + if (fmt[i] == 'e') + { + if (find_mem (mii, &XEXP (x, i))) + return true; + } + else if (fmt[i] == 'E') + { + int j; + for (j = XVECLEN (x, i) - 1; j >= 0; j--) + if (find_mem (mii, &XVECEXP (x, i, j))) + return true; + } + } + return false; +} + + +/* Examine the instructions between HEAD and TAIL and try to find + dependencies that can be broken by modifying one of the patterns. */ + +void +find_modifiable_mems (rtx head, rtx tail) +{ + rtx insn; + int success_in_block = 0; + + for (insn = head; insn != tail; insn = NEXT_INSN (insn)) + { + struct mem_inc_info mii; + + if (!NONDEBUG_INSN_P (insn) || RTX_FRAME_RELATED_P (insn)) + continue; + + mii.mem_insn = insn; + if (find_mem (&mii, &PATTERN (insn))) + success_in_block++; + } +} + #endif /* INSN_SCHEDULING */ Index: gcc/sched-int.h =================================================================== --- gcc/sched-int.h (revision 189284) +++ gcc/sched-int.h (working copy) @@ -206,6 +206,18 @@ typedef int dw_t; extern enum reg_note ds_to_dk (ds_t); extern ds_t dk_to_ds (enum reg_note); +/* Describe a dependency that can be broken by making a replacement + in one of the patterns. LOC is the location, ORIG and NEWVAL the + two alternative contents, and INSN the instruction that must be + changed. */ +struct dep_replacement +{ + rtx *loc; + rtx orig; + rtx newval; + rtx insn; +}; + /* Information about the dependency. */ struct _dep { @@ -215,18 +227,30 @@ struct _dep /* Consumer. */ rtx con; - /* Dependency major type. This field is superseded by STATUS below. - Though, it is still in place because some targets use it. */ - enum reg_note type; + /* If nonnull, holds a pointer to information about how to break the + dependency by making a replacement in one of the insns. There is + only one such dependency for each insn that must be modified in + order to break such a dependency. */ + struct dep_replacement *replace; /* Dependency status. This field holds all dependency types and additional information for speculative dependencies. */ ds_t status; - /* Cached cost of the dependency. */ - int cost; + /* Dependency major type. This field is superseded by STATUS below. + Though, it is still in place because some targets use it. */ + ENUM_BITFIELD(reg_note) type:6; + + unsigned nonreg:1; + unsigned multiple:1; + + /* Cached cost of the dependency. Make sure to update UNKNOWN_DEP_COST + when changing the size of this field. */ + int cost:20; }; +#define UNKNOWN_DEP_COST (-1<<19) + typedef struct _dep dep_def; typedef dep_def *dep_t; @@ -235,8 +259,9 @@ typedef dep_def *dep_t; #define DEP_TYPE(D) ((D)->type) #define DEP_STATUS(D) ((D)->status) #define DEP_COST(D) ((D)->cost) - -#define UNKNOWN_DEP_COST INT_MIN +#define DEP_NONREG(D) ((D)->nonreg) +#define DEP_MULTIPLE(D) ((D)->multiple) +#define DEP_REPLACE(D) ((D)->replace) /* Functions to work with dep. */ @@ -1047,7 +1072,11 @@ enum SPEC_TYPES_OFFSETS { Therefore, it can appear only in TODO_SPEC field of an instruction. */ #define HARD_DEP (DEP_CONTROL << 1) -#define DEP_CANCELLED (HARD_DEP << 1) +/* Set in the TODO_SPEC field of an instruction for which new_ready + has decided not to schedule it speculatively. */ +#define DEP_POSTPONED (HARD_DEP << 1) + +#define DEP_CANCELLED (DEP_POSTPONED << 1) /* This represents the results of calling sched-deps.c functions, which modify dependencies. 
*/ @@ -1074,7 +1103,8 @@ enum SCHED_FLAGS { DO_SPECULATION = USE_DEPS_LIST << 1, DO_BACKTRACKING = DO_SPECULATION << 1, DO_PREDICATION = DO_BACKTRACKING << 1, - SCHED_RGN = DO_PREDICATION << 1, + DONT_BREAK_DEPENDENCIES = DO_PREDICATION << 1, + SCHED_RGN = DONT_BREAK_DEPENDENCIES << 1, SCHED_EBB = SCHED_RGN << 1, /* Scheduler can possibly create new basic blocks. Used for assertions. */ NEW_BBS = SCHED_EBB << 1, @@ -1406,6 +1436,8 @@ extern void dump_region_dot_file (const extern void haifa_sched_init (void); extern void haifa_sched_finish (void); +extern void find_modifiable_mems (rtx, rtx); + /* sched-deps.c interface to walk, add, search, update, resolve, delete and debug instruction dependencies. */ Index: gcc/Makefile.in =================================================================== --- gcc/Makefile.in (revision 189284) +++ gcc/Makefile.in (working copy) @@ -3263,7 +3263,7 @@ haifa-sched.o : haifa-sched.c $(CONFIG_H sched-deps.o : sched-deps.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) \ $(RTL_H) $(SCHED_INT_H) $(REGS_H) hard-reg-set.h $(FLAGS_H) insn-config.h \ $(FUNCTION_H) $(INSN_ATTR_H) $(DIAGNOSTIC_CORE_H) $(RECOG_H) $(EXCEPT_H) cselib.h \ - ira.h $(PARAMS_H) $(TM_P_H) ira.h $(TARGET_H) + ira.h $(PARAMS_H) $(TM_P_H) ira.h $(TARGET_H) $(EMIT_RTL_H) sched-rgn.o : sched-rgn.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) \ $(RTL_H) $(SCHED_INT_H) $(REGS_H) hard-reg-set.h $(FLAGS_H) insn-config.h \ $(FUNCTION_H) $(INSN_ATTR_H) $(DIAGNOSTIC_CORE_H) $(RECOG_H) $(EXCEPT_H) $(PARAMS_H) \ Index: gcc/sched-rgn.c =================================================================== --- gcc/sched-rgn.c (revision 189284) +++ gcc/sched-rgn.c (working copy) @@ -2103,6 +2103,8 @@ init_ready_list (void) Count number of insns in the target block being scheduled. */ for (insn = NEXT_INSN (prev_head); insn != next_tail; insn = NEXT_INSN (insn)) { + gcc_assert (TODO_SPEC (insn) == HARD_DEP || TODO_SPEC (insn) == DEP_POSTPONED); + TODO_SPEC (insn) = HARD_DEP; try_ready (insn); target_n_insns++; @@ -2126,7 +2128,11 @@ init_ready_list (void) for (insn = src_head; insn != src_next_tail; insn = NEXT_INSN (insn)) if (INSN_P (insn)) - try_ready (insn); + { + gcc_assert (TODO_SPEC (insn) == HARD_DEP || TODO_SPEC (insn) == DEP_POSTPONED); + TODO_SPEC (insn) = HARD_DEP; + try_ready (insn); + } } } @@ -2218,11 +2224,11 @@ new_ready (rtx next, ds_t ts) ts = new_ds; else /* NEXT isn't ready yet. */ - ts = (ts & ~SPECULATIVE) | HARD_DEP; + ts = DEP_POSTPONED; } else /* NEXT isn't ready yet. */ - ts = (ts & ~SPECULATIVE) | HARD_DEP; + ts = DEP_POSTPONED; } } @@ -2826,7 +2832,9 @@ void debug_dependencies (rtx head, rtx t dep_t dep; FOR_EACH_DEP (insn, SD_LIST_FORW, sd_it, dep) - fprintf (sched_dump, "%d ", INSN_UID (DEP_CON (dep))); + fprintf (sched_dump, "%d%s%s ", INSN_UID (DEP_CON (dep)), + DEP_NONREG (dep) ? "n" : "", + DEP_MULTIPLE (dep) ? "m" : ""); } fprintf (sched_dump, "\n"); } Index: gcc/config/c6x/c6x.c =================================================================== --- gcc/config/c6x/c6x.c (revision 189284) +++ gcc/config/c6x/c6x.c (working copy) @@ -3911,6 +3911,13 @@ c6x_free_sched_context (void *_sc) free (_sc); } +/* True if we are currently performing a preliminary scheduling + pass before modulo scheduling; we can't allow the scheduler to + modify instruction patterns using packetization assumptions, + since there will be another scheduling pass later if modulo + scheduling fails. 
*/ +static bool in_hwloop; + /* Provide information about speculation capabilities, and set the DO_BACKTRACKING flag. */ static void @@ -3922,6 +3929,8 @@ c6x_set_sched_flags (spec_info_t spec_in { *flags |= DO_BACKTRACKING | DO_PREDICATION; } + if (in_hwloop) + *flags |= DONT_BREAK_DEPENDENCIES; spec_info->mask = 0; } @@ -5535,9 +5544,11 @@ hwloop_optimize (hwloop_info loop) reshuffle_units (loop->head); + in_hwloop = true; schedule_ebbs_init (); schedule_ebb (BB_HEAD (loop->tail), loop->loop_end, true); schedule_ebbs_finish (); + in_hwloop = false; bb = loop->head; loop_earliest = bb_earliest_end_cycle (bb, loop->loop_end) + 1;