From patchwork Fri Apr 5 20:14:58 2013 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Vladimir Makarov X-Patchwork-Id: 234270 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client CN "localhost", Issuer "www.qmailtoaster.com" (not verified)) by ozlabs.org (Postfix) with ESMTPS id 3C8932C010E for ; Sat, 6 Apr 2013 07:15:16 +1100 (EST) DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender :message-id:date:from:mime-version:to:subject:content-type; q= dns; s=default; b=CZ0APMlZJQydh4Vya9QMe1bpKLRNuVnPfQyVkfGHKwr7d+ LYlxf6iPo3YcqW0iHPZDVbiyPIhX4oYJPBhVfhgql3uHiUmM0FzPhGuse4tS3hQb NMBkuYupB9oxZ8u/Qgh7pfmkz0apVWpijzBazd4As7oTXp/8GapckFqWqc2XA= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender :message-id:date:from:mime-version:to:subject:content-type; s= default; bh=Zm+zOwkGtnEjiXX6SSN+wRiDdxU=; b=WTQyRUvWurZc5pfAhgv+ 3O7Kw92tNnw+VWbPfCLsuKLoXCi5zcQyWfAUH2VM4+IfY+Jz09KOOfhvQU2VVui5 /bndY1jQM+K7ohBWXIp1Yy3XrNQjqm8e3M6kD87widAmMQr5iJegIQ0/rAHK11td Ekc1xRI5hlyQg1AD8F0lBMg= Received: (qmail 9239 invoked by alias); 5 Apr 2013 20:15:09 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Received: (qmail 9225 invoked by uid 89); 5 Apr 2013 20:15:09 -0000 X-Spam-SWARE-Status: No, score=-7.7 required=5.0 tests=AWL, BAYES_00, KHOP_RCVD_UNTRUST, RCVD_IN_DNSWL_HI, RCVD_IN_HOSTKARMA_W, RP_MATCHES_RCVD, SPF_HELO_PASS autolearn=ham version=3.3.1 Received: from mx1.redhat.com (HELO mx1.redhat.com) (209.132.183.28) by sourceware.org (qpsmtpd/0.84/v0.84-167-ge50287c) with ESMTP; Fri, 05 Apr 2013 20:15:05 +0000 Received: from int-mx09.intmail.prod.int.phx2.redhat.com (int-mx09.intmail.prod.int.phx2.redhat.com [10.5.11.22]) by mx1.redhat.com (8.14.4/8.14.4) with ESMTP id r35KF1Bh007459 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Fri, 5 Apr 2013 16:15:01 -0400 Received: from Mair.local (vpn-55-130.rdu2.redhat.com [10.10.55.130]) by int-mx09.intmail.prod.int.phx2.redhat.com (8.14.4/8.14.4) with ESMTP id r35KExlx014689 for ; Fri, 5 Apr 2013 16:14:59 -0400 Message-ID: <515F30C2.2090205@redhat.com> Date: Fri, 05 Apr 2013 16:14:58 -0400 From: Vladimir Makarov User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:17.0) Gecko/20130328 Thunderbird/17.0.5 MIME-Version: 1.0 To: GCC Patches Subject: [ira-improv] patch for new code dealing with hard reg preferences X-Virus-Found: No The following patch adds a new functionality to IRA to improve RA in presence of hard registers in RTL. IRA already had some mechanism dealing with hard regs but it affected only pseudos in *the same insn* containing hard register. This technique was not good enough to remove regmove pass code making transformations for matching constraints (e.g. 2-op insn machine) although IRA can make better decision for pseudos but still not enough good for hard registers. So just removing the code in regmove resulted in worse generated code performance. On the other hand there are performance PRs which is a result of bad decision in pre-mature optimization of regmove pass for matching constraints. We already have an IRA code propagating pseudo preferences to other pseudos through net of copies. We need the same code for better dealing with hard register preferences. The patch adds such code. I could modify structure ira_copy for this but decided to use new separate structures as the copy structure is too big and we don't need most of its fields for hard-register preferences. By the way, LRA has already code for dealing with hard register preferences but in simplified way as we work on RTL mostly locally. This is not a final patch as the big part of regmove code and old IRA code for hard register preferences are not removed. This code is just switched off by -fira-hard-reg-pref option. I am going to play a bit more with the new code and when I decide to submit it to trunk, I remove the unnecessary code and the option. Right now, the results look pretty good for SPEC2000 on x86/x86-64. The compiler is always faster (as a regmove RTL pass is switched off), generates in average smaller code (the best is 0.05% decrease for x86-64 SPECFP2000), and generates in average better code (0.6% and 0.4% SPECFP2000 improvement on x86 and x86-64 correspondingly). The results were gotten on Intel Core I7-2600 in -O3 optimization mode. The patch was successfully bootstrapped on x86/x86-64 with the new code switched on. Committed as rev. 197525. 2013-04-05 Vladimir Makarov * common.opt (fira-hard-reg-pref): New. * regmove.c (regmove_optimize): Don't call regmove_forward_pass for flag_ira_hard_reg_pref. * ira-int.h (ira_pref_t, allocno_prefs, ALLOCNO_PREFS): New. (struct ira_allocno_pref, ira_prefs, ira_prefs_num): New. (ira_debug_pref, ira_debug_prefs, ira_debug_allocno_prefs): New. (ira_create_pref, ira_create_copy): New. (ira_add_allocno_copy_to_list): Remove. (ira_swap_allocno_copy_ends_if_necessary): Ditto. (ira_pref_iterator, ira_pref_iter_init, ira_pref_iter_cond): New. (FOR_EACH_PREF): New. * ira-build.c (ira_prefs, ira_prefs_num): New. (ira_create_allocno): Reset preferences. (pref_pool, pref_vec, initiate_prefs, find_allocno_pref) (ira_create_pref, add_allocno_pref_to_list, ira_add_allocno_pref) (print_pref, ira_debug_pref, print_prefs, ira_debug_prefs) (print_allocno_prefs, ira_debug_allocno_prefs, finish_pref) (finish_prefs): New. (ira_add_allocno_copy_to_list): Rename to add_allocno_copy_to_list. Make static. (ira_swap_allocno_copy_ends_if_necessary): Rename to swap_allocno_copy_ends_if_necessary. Make static. (ira_build, ira_destroy): Initialize and finish the prefs. * ira-color.c (update_allocno_cost, update_costs_from_allocno): New. (update_copy_costs): Rename to update_costs_from_copies. Use update_costs_from_allocno. (update_costs_from_prefs, update_costs_from_copies): New. (assign_hard_reg): Call update_costs_from_prefs. (color_allocnos, color_pass): Ditto. * ira-costs.c (find_costs_and_classes): Improve code. Add code for hard reg preferences. (process_bb_node_for_hard_reg_moves): Add code for hard reg preferences. Index: ChangeLog =================================================================== --- ChangeLog (revision 195439) +++ ChangeLog (working copy) @@ -1,3 +1,40 @@ +2013-04-05 Vladimir Makarov + + * common.opt (fira-hard-reg-pref): New. + * regmove.c (regmove_optimize): Don't call regmove_forward_pass + for flag_ira_hard_reg_pref. + * ira-int.h (ira_pref_t, allocno_prefs, ALLOCNO_PREFS): New. + (struct ira_allocno_pref, ira_prefs, ira_prefs_num): New. + (ira_debug_pref, ira_debug_prefs, ira_debug_allocno_prefs): New. + (ira_create_pref, ira_create_copy): New. + (ira_add_allocno_copy_to_list): Remove. + (ira_swap_allocno_copy_ends_if_necessary): Ditto. + (ira_pref_iterator, ira_pref_iter_init, ira_pref_iter_cond): New. + (FOR_EACH_PREF): New. + * ira-build.c (ira_prefs, ira_prefs_num): New. + (ira_create_allocno): Reset preferences. + (pref_pool, pref_vec, initiate_prefs, find_allocno_pref) + (ira_create_pref, add_allocno_pref_to_list, ira_add_allocno_pref) + (print_pref, ira_debug_pref, print_prefs, ira_debug_prefs) + (print_allocno_prefs, ira_debug_allocno_prefs, finish_pref) + (finish_prefs): New. + (ira_add_allocno_copy_to_list): Rename to + add_allocno_copy_to_list. Make static. + (ira_swap_allocno_copy_ends_if_necessary): Rename to + swap_allocno_copy_ends_if_necessary. Make static. + (ira_build, ira_destroy): Initialize and finish the prefs. + * ira-color.c (update_allocno_cost, update_costs_from_allocno): + New. + (update_copy_costs): Rename to update_costs_from_copies. Use + update_costs_from_allocno. + (update_costs_from_prefs, update_costs_from_copies): New. + (assign_hard_reg): Call update_costs_from_prefs. + (color_allocnos, color_pass): Ditto. + * ira-costs.c (find_costs_and_classes): Improve code. Add code for + hard reg preferences. + (process_bb_node_for_hard_reg_moves): Add code for hard reg + preferences. + 2013-01-24 Shenghou Ma * doc/invoke.texi: fix typo. Index: common.opt =================================================================== --- common.opt (revision 195438) +++ common.opt (working copy) @@ -1470,6 +1470,10 @@ Common Report Var(flag_errno_math) Init(1) Optimization SetByCombined Set errno after built-in math functions +fira-hard-reg-pref +Common Report Var(flag_ira_hard_reg_pref) Init(1) Optimization +Use new IRA code for hard reg preference and switch of remove constraint matching code + fmax-errors= Common Joined RejectNegative UInteger Var(flag_max_errors) -fmax-errors= Maximum number of errors to report Index: ira-build.c =================================================================== --- ira-build.c (revision 195438) +++ ira-build.c (working copy) @@ -76,6 +76,13 @@ /* Map a conflict id to its conflict record. */ ira_object_t *ira_object_id_map; +/* Array of references to all allocno preferences. The order number + of the preference corresponds to the index in the array. */ +ira_pref_t *ira_prefs; + +/* Size of the previous array. */ +int ira_prefs_num; + /* Array of references to all copies. The order number of the copy corresponds to the index in the array. Removed copies have NULL element value. */ @@ -514,6 +521,7 @@ ALLOCNO_BAD_SPILL_P (a) = false; ALLOCNO_ASSIGNED_P (a) = false; ALLOCNO_MODE (a) = (regno < 0 ? VOIDmode : PSEUDO_REGNO_MODE (regno)); + ALLOCNO_PREFS (a) = NULL; ALLOCNO_COPIES (a) = NULL; ALLOCNO_HARD_REG_COSTS (a) = NULL; ALLOCNO_CONFLICT_HARD_REG_COSTS (a) = NULL; @@ -1162,6 +1170,158 @@ +/* Pools for allocno preferences. */ +static alloc_pool pref_pool; + +/* Vec containing references to all created preferences. It is a + container of array ira_prefs. */ +static vec pref_vec; + +/* The function initializes data concerning allocno prefs. */ +static void +initiate_prefs (void) +{ + pref_pool + = create_alloc_pool ("prefs", sizeof (struct ira_allocno_pref), 100); + pref_vec.create (get_max_uid ()); + ira_prefs = NULL; + ira_prefs_num = 0; +} + +/* Return pref for A and HARD_REGNO if any. */ +static ira_pref_t +find_allocno_pref (ira_allocno_t a, int hard_regno) +{ + ira_pref_t pref; + + for (pref = ALLOCNO_PREFS (a); pref != NULL; pref = pref->next_pref) + if (pref->allocno == a && pref->hard_regno == hard_regno) + return pref; + return NULL; +} + +/* Create and return pref with given attributes A, HARD_REGNO, and FREQ. */ +ira_pref_t +ira_create_pref (ira_allocno_t a, int hard_regno, int freq) +{ + ira_pref_t pref; + + pref = (ira_pref_t) pool_alloc (pref_pool); + pref->num = ira_prefs_num; + pref->allocno = a; + pref->hard_regno = hard_regno; + pref->freq = freq; + pref_vec.safe_push (pref); + ira_prefs = pref_vec.address (); + ira_prefs_num = pref_vec.length (); + return pref; +} + +/* Attach a pref PREF to the cooresponding allocno. */ +static void +add_allocno_pref_to_list (ira_pref_t pref) +{ + ira_allocno_t a = pref->allocno; + + pref->next_pref = ALLOCNO_PREFS (a); + ALLOCNO_PREFS (a) = pref; +} + +/* Create (or update frequency if the pref already exists) the pref of + allocnos A preferring HARD_REGNO with frequency FREQ. */ +void +ira_add_allocno_pref (ira_allocno_t a, int hard_regno, int freq) +{ + ira_pref_t pref; + + if (freq <= 0) + return; + if ((pref = find_allocno_pref (a, hard_regno)) != NULL) + { + pref->freq += freq; + return; + } + pref = ira_create_pref (a, hard_regno, freq); + ira_assert (a != NULL); + add_allocno_pref_to_list (pref); +} + +/* Print info about PREF into file F. */ +static void +print_pref (FILE *f, ira_pref_t pref) +{ + fprintf (f, " pref%d:a%d(r%d)<-hr%d@%d\n", pref->num, + ALLOCNO_NUM (pref->allocno), ALLOCNO_REGNO (pref->allocno), + pref->hard_regno, pref->freq); +} + +/* Print info about PREF into stderr. */ +void +ira_debug_pref (ira_pref_t pref) +{ + print_pref (stderr, pref); +} + +/* Print info about all prefs into file F. */ +static void +print_prefs (FILE *f) +{ + ira_pref_t pref; + ira_pref_iterator pi; + + FOR_EACH_PREF (pref, pi) + print_pref (f, pref); +} + +/* Print info about all prefs into stderr. */ +void +ira_debug_prefs (void) +{ + print_prefs (stderr); +} + +/* Print info about prefs involving allocno A into file F. */ +static void +print_allocno_prefs (FILE *f, ira_allocno_t a) +{ + ira_pref_t pref; + + fprintf (f, " a%d(r%d):", ALLOCNO_NUM (a), ALLOCNO_REGNO (a)); + for (pref = ALLOCNO_PREFS (a); pref != NULL; pref = pref->next_pref) + fprintf (f, " pref%d:hr%d@%d", pref->num, pref->hard_regno, pref->freq); + fprintf (f, "\n"); +} + +/* Print info about prefs involving allocno A into stderr. */ +void +ira_debug_allocno_prefs (ira_allocno_t a) +{ + print_allocno_prefs (stderr, a); +} + +/* The function frees memory allocated for PREF. */ +static void +finish_pref (ira_pref_t pref) +{ + pool_free (pref_pool, pref); +} + + +/* Free memory allocated for all prefs. */ +static void +finish_prefs (void) +{ + ira_pref_t pref; + ira_pref_iterator pi; + + FOR_EACH_PREF (pref, pi) + finish_pref (pref); + pref_vec.release (); + free_alloc_pool (pref_pool); +} + + + /* Pools for copies. */ static alloc_pool copy_pool; @@ -1234,8 +1394,8 @@ } /* Attach a copy CP to allocnos involved into the copy. */ -void -ira_add_allocno_copy_to_list (ira_copy_t cp) +static void +add_allocno_copy_to_list (ira_copy_t cp) { ira_allocno_t first = cp->first, second = cp->second; @@ -1263,8 +1423,8 @@ /* Make a copy CP a canonical copy where number of the first allocno is less than the second one. */ -void -ira_swap_allocno_copy_ends_if_necessary (ira_copy_t cp) +static void +swap_allocno_copy_ends_if_necessary (ira_copy_t cp) { ira_allocno_t temp; ira_copy_t temp_cp; @@ -1304,8 +1464,8 @@ cp = ira_create_copy (first, second, freq, constraint_p, insn, loop_tree_node); ira_assert (first != NULL && second != NULL); - ira_add_allocno_copy_to_list (cp); - ira_swap_allocno_copy_ends_if_necessary (cp); + add_allocno_copy_to_list (cp); + swap_allocno_copy_ends_if_necessary (cp); return cp; } @@ -3099,8 +3259,8 @@ ira_assert (ALLOCNO_LOOP_TREE_NODE (cp->first) == ira_loop_tree_root && ALLOCNO_LOOP_TREE_NODE (cp->second) == ira_loop_tree_root); - ira_add_allocno_copy_to_list (cp); - ira_swap_allocno_copy_ends_if_necessary (cp); + add_allocno_copy_to_list (cp); + swap_allocno_copy_ends_if_necessary (cp); } rebuild_regno_allocno_maps (); if (ira_max_point != ira_max_point_before_emit) @@ -3188,6 +3348,7 @@ df_analyze (); initiate_cost_vectors (); initiate_allocnos (); + initiate_prefs (); initiate_copies (); create_loop_tree_nodes (); form_loop_tree (); @@ -3272,6 +3433,7 @@ ira_destroy (void) { finish_loop_tree_nodes (); + finish_prefs (); finish_copies (); finish_allocnos (); finish_cost_vectors (); Index: ira-color.c =================================================================== --- ira-color.c (revision 195438) +++ ira-color.c (working copy) @@ -1219,29 +1219,44 @@ return true; } -/* Update the cost of allocnos to increase chances to remove some - copies as the result of subsequent assignment. */ +/* Increase costs of HARD_REGNO by UPDATE_COST for ALLOCNO. Return + true if we really modified the cost. */ +static bool +update_allocno_cost (ira_allocno_t allocno, int hard_regno, int update_cost) +{ + int i; + enum reg_class aclass = ALLOCNO_CLASS (allocno); + + i = ira_class_hard_reg_index[aclass][hard_regno]; + if (i < 0) + return false; + ira_allocate_and_set_or_copy_costs + (&ALLOCNO_UPDATED_HARD_REG_COSTS (allocno), aclass, + ALLOCNO_UPDATED_CLASS_COST (allocno), + ALLOCNO_HARD_REG_COSTS (allocno)); + ira_allocate_and_set_or_copy_costs + (&ALLOCNO_UPDATED_CONFLICT_HARD_REG_COSTS (allocno), + aclass, 0, ALLOCNO_CONFLICT_HARD_REG_COSTS (allocno)); + ALLOCNO_UPDATED_HARD_REG_COSTS (allocno)[i] += update_cost; + ALLOCNO_UPDATED_CONFLICT_HARD_REG_COSTS (allocno)[i] + += update_cost; + return true; +} + +/* Update (decrease if DECR_P) HARD_REGNO cost of allocnos connected + by copies to ALLOCNO to increase chances to remove some copies as + the result of subsequent assignment. */ static void -update_copy_costs (ira_allocno_t allocno, bool decr_p) +update_costs_from_allocno (ira_allocno_t allocno, int hard_regno, + int divisor, bool decr_p) { - int i, cost, update_cost, hard_regno, divisor; + int cost, update_cost; enum machine_mode mode; enum reg_class rclass, aclass; ira_allocno_t another_allocno; ira_copy_t cp, next_cp; - hard_regno = ALLOCNO_HARD_REGNO (allocno); - ira_assert (hard_regno >= 0); - - aclass = ALLOCNO_CLASS (allocno); - if (aclass == NO_REGS) - return; - i = ira_class_hard_reg_index[aclass][hard_regno]; - ira_assert (i >= 0); rclass = REGNO_REG_CLASS (hard_regno); - - start_update_cost (); - divisor = 1; do { mode = ALLOCNO_MODE (allocno); @@ -1277,26 +1292,54 @@ if (update_cost == 0) continue; - ira_allocate_and_set_or_copy_costs - (&ALLOCNO_UPDATED_HARD_REG_COSTS (another_allocno), aclass, - ALLOCNO_UPDATED_CLASS_COST (another_allocno), - ALLOCNO_HARD_REG_COSTS (another_allocno)); - ira_allocate_and_set_or_copy_costs - (&ALLOCNO_UPDATED_CONFLICT_HARD_REG_COSTS (another_allocno), - aclass, 0, ALLOCNO_CONFLICT_HARD_REG_COSTS (another_allocno)); - i = ira_class_hard_reg_index[aclass][hard_regno]; - if (i < 0) + if (! update_allocno_cost (another_allocno, hard_regno, update_cost)) continue; - ALLOCNO_UPDATED_HARD_REG_COSTS (another_allocno)[i] += update_cost; - ALLOCNO_UPDATED_CONFLICT_HARD_REG_COSTS (another_allocno)[i] - += update_cost; - queue_update_cost (another_allocno, divisor * COST_HOP_DIVISOR); } } while (get_next_update_cost (&allocno, &divisor)); } +/* Update (decrease if DECR_P) preferred ALLOCNO hard register costs + and costs of allocnos connected to ALLOCNO through copy. */ +static void +update_costs_from_prefs (ira_allocno_t allocno, bool decr_p) +{ + int cost; + enum machine_mode mode; + enum reg_class rclass, aclass; + ira_pref_t pref; + + mode = ALLOCNO_MODE (allocno); + aclass = ALLOCNO_CLASS (allocno); + start_update_cost (); + for (pref = ALLOCNO_PREFS (allocno); pref != NULL; pref = pref->next_pref) + { + rclass = REGNO_REG_CLASS (pref->hard_regno); + cost = pref->freq * ira_register_move_cost[mode][rclass][aclass]; + if (decr_p) + cost = -cost; + if (update_allocno_cost (allocno, pref->hard_regno, cost)) + update_costs_from_allocno (allocno, pref->hard_regno, + COST_HOP_DIVISOR, decr_p); + } +} + +/* Update (decrease if DECR_P) the cost of allocnos connected to + ALLOCNO through copies to increase chances to remove some copies as + the result of subsequent assignment. ALLOCNO was just assigned to + a hard register. */ +static void +update_costs_from_copies (ira_allocno_t allocno, bool decr_p) +{ + int hard_regno; + + hard_regno = ALLOCNO_HARD_REGNO (allocno); + ira_assert (hard_regno >= 0 && ALLOCNO_CLASS (allocno) != NO_REGS); + start_update_cost (); + update_costs_from_allocno (allocno, hard_regno, 1, decr_p); +} + /* This function updates COSTS (decrease if DECR_P) for hard_registers of ACLASS by conflict costs of the unassigned allocnos connected by copies with allocnos in update_cost_queue. This @@ -1711,7 +1754,8 @@ ALLOCNO_HARD_REGNO (a) = best_hard_regno; ALLOCNO_ASSIGNED_P (a) = true; if (best_hard_regno >= 0) - update_copy_costs (a, true); + update_costs_from_copies (a, true); + update_costs_from_prefs (a, false); ira_assert (ALLOCNO_CLASS (a) == aclass); /* We don't need updated costs anymore: */ ira_free_allocno_updated_costs (a); @@ -2602,7 +2646,10 @@ { a = ira_allocnos[i]; if (ALLOCNO_CLASS (a) != NO_REGS && ! empty_profitable_hard_regs (a)) - ALLOCNO_COLOR_DATA (a)->in_graph_p = true; + { + ALLOCNO_COLOR_DATA (a)->in_graph_p = true; + update_costs_from_prefs (a, true); + } else { ALLOCNO_HARD_REGNO (a) = -1; @@ -2769,7 +2816,7 @@ ALLOCNO_HARD_REGNO (subloop_allocno) = hard_regno; ALLOCNO_ASSIGNED_P (subloop_allocno) = true; if (hard_regno >= 0) - update_copy_costs (subloop_allocno, true); + update_costs_from_copies (subloop_allocno, true); /* We don't need updated costs anymore: */ ira_free_allocno_updated_costs (subloop_allocno); } @@ -2813,7 +2860,7 @@ ALLOCNO_HARD_REGNO (subloop_allocno) = hard_regno; ALLOCNO_ASSIGNED_P (subloop_allocno) = true; if (hard_regno >= 0) - update_copy_costs (subloop_allocno, true); + update_costs_from_copies (subloop_allocno, true); /* We don't need updated costs anymore: */ ira_free_allocno_updated_costs (subloop_allocno); } @@ -2829,7 +2876,7 @@ ALLOCNO_HARD_REGNO (subloop_allocno) = hard_regno; ALLOCNO_ASSIGNED_P (subloop_allocno) = true; if (hard_regno >= 0) - update_copy_costs (subloop_allocno, true); + update_costs_from_copies (subloop_allocno, true); /* We don't need updated costs anymore: */ ira_free_allocno_updated_costs (subloop_allocno); } @@ -3810,7 +3857,7 @@ ? ALLOCNO_CLASS_COST (a) : ALLOCNO_HARD_REG_COSTS (a) [ira_class_hard_reg_index[aclass][old_hard_regno]]); - update_copy_costs (a, false); + update_costs_from_copies (a, false); } ira_overall_cost -= cost; ALLOCNO_HARD_REGNO (a) = hard_regno; @@ -3825,7 +3872,7 @@ ? ALLOCNO_CLASS_COST (a) : ALLOCNO_HARD_REG_COSTS (a) [ira_class_hard_reg_index[aclass][hard_regno]]); - update_copy_costs (a, true); + update_costs_from_copies (a, true); } else /* Reload changed class of the allocno. */ Index: ira-costs.c =================================================================== --- ira-costs.c (revision 195438) +++ ira-costs.c (working copy) @@ -1733,14 +1733,15 @@ a != NULL; a = ALLOCNO_NEXT_REGNO_ALLOCNO (a)) { - a_num = ALLOCNO_NUM (a); - if (regno_aclass[i] == NO_REGS) + enum reg_class aclass = regno_aclass[i]; + int a_num = ALLOCNO_NUM (a); + int *total_a_costs = COSTS (total_allocno_costs, a_num)->cost; + int *a_costs = COSTS (costs, a_num)->cost; + + if (aclass == NO_REGS) best = NO_REGS; else { - int *total_a_costs = COSTS (total_allocno_costs, a_num)->cost; - int *a_costs = COSTS (costs, a_num)->cost; - /* Finding best class which is subset of the common class. */ best_cost = (1 << (HOST_BITS_PER_INT - 2)) - 1; @@ -1749,7 +1750,7 @@ for (k = 0; k < cost_classes_ptr->num; k++) { rclass = cost_classes[k]; - if (! ira_class_subset_p[rclass][regno_aclass[i]]) + if (! ira_class_subset_p[rclass][aclass]) continue; /* Ignore classes that are too small or invalid for this operand. */ @@ -1784,9 +1785,25 @@ ALLOCNO_LOOP_TREE_NODE (a)->loop_num); fprintf (dump_file, ") best %s, allocno %s\n", reg_class_names[best], - reg_class_names[regno_aclass[i]]); + reg_class_names[aclass]); } pref[a_num] = best; + if (flag_ira_hard_reg_pref && pass != start && best != aclass + && ira_class_hard_regs_num[best] > 0 + && (ira_reg_class_max_nregs[best][ALLOCNO_MODE (a)] + >= ira_class_hard_regs_num[best])) + { + int ind = cost_classes_ptr->index[aclass]; + + ira_assert (ind >= 0); + ira_add_allocno_pref (a, ira_class_hard_regs[best][0], + (a_costs[ind] - ALLOCNO_CLASS_COST (a)) + / (ira_register_move_cost + [ALLOCNO_MODE (a)][best][aclass])); + for (k = 0; k < cost_classes_ptr->num; k++) + if (ira_class_subset_p[cost_classes[k]][best]) + a_costs[k] = a_costs[ind]; + } } } @@ -1812,11 +1829,10 @@ static void process_bb_node_for_hard_reg_moves (ira_loop_tree_node_t loop_tree_node) { - int i, freq, cost, src_regno, dst_regno, hard_regno; + int i, freq, src_regno, dst_regno, hard_regno; bool to_p; ira_allocno_t a; - enum reg_class rclass, hard_reg_class; - enum machine_mode mode; + enum reg_class rclass; basic_block bb; rtx insn, set, src, dst; @@ -1843,15 +1859,15 @@ && src_regno < FIRST_PSEUDO_REGISTER) { hard_regno = src_regno; + a = ira_curr_regno_allocno_map[dst_regno]; to_p = true; - a = ira_curr_regno_allocno_map[dst_regno]; } else if (src_regno >= FIRST_PSEUDO_REGISTER && dst_regno < FIRST_PSEUDO_REGISTER) { hard_regno = dst_regno; + a = ira_curr_regno_allocno_map[src_regno]; to_p = false; - a = ira_curr_regno_allocno_map[src_regno]; } else continue; @@ -1861,20 +1877,29 @@ i = ira_class_hard_reg_index[rclass][hard_regno]; if (i < 0) continue; - mode = ALLOCNO_MODE (a); - hard_reg_class = REGNO_REG_CLASS (hard_regno); - ira_init_register_move_cost_if_necessary (mode); - cost - = (to_p ? ira_register_move_cost[mode][hard_reg_class][rclass] - : ira_register_move_cost[mode][rclass][hard_reg_class]) * freq; - ira_allocate_and_set_costs (&ALLOCNO_HARD_REG_COSTS (a), rclass, - ALLOCNO_CLASS_COST (a)); - ira_allocate_and_set_costs (&ALLOCNO_CONFLICT_HARD_REG_COSTS (a), - rclass, 0); - ALLOCNO_HARD_REG_COSTS (a)[i] -= cost; - ALLOCNO_CONFLICT_HARD_REG_COSTS (a)[i] -= cost; - ALLOCNO_CLASS_COST (a) = MIN (ALLOCNO_CLASS_COST (a), - ALLOCNO_HARD_REG_COSTS (a)[i]); + if (flag_ira_hard_reg_pref) + ira_add_allocno_pref (a, hard_regno, freq); + else + { + int cost; + enum reg_class hard_reg_class; + enum machine_mode mode; + + mode = ALLOCNO_MODE (a); + hard_reg_class = REGNO_REG_CLASS (hard_regno); + ira_init_register_move_cost_if_necessary (mode); + cost + = (to_p ? ira_register_move_cost[mode][hard_reg_class][rclass] + : ira_register_move_cost[mode][rclass][hard_reg_class]) * freq; + ira_allocate_and_set_costs (&ALLOCNO_HARD_REG_COSTS (a), rclass, + ALLOCNO_CLASS_COST (a)); + ira_allocate_and_set_costs (&ALLOCNO_CONFLICT_HARD_REG_COSTS (a), + rclass, 0); + ALLOCNO_HARD_REG_COSTS (a)[i] -= cost; + ALLOCNO_CONFLICT_HARD_REG_COSTS (a)[i] -= cost; + ALLOCNO_CLASS_COST (a) = MIN (ALLOCNO_CLASS_COST (a), + ALLOCNO_HARD_REG_COSTS (a)[i]); + } } } Index: ira-int.h =================================================================== --- ira-int.h (revision 195438) +++ ira-int.h (working copy) @@ -60,6 +60,7 @@ allocnos. */ typedef struct live_range *live_range_t; typedef struct ira_allocno *ira_allocno_t; +typedef struct ira_allocno_pref *ira_pref_t; typedef struct ira_allocno_copy *ira_copy_t; typedef struct ira_object *ira_object_t; @@ -349,6 +350,8 @@ register class living at the point than number of hard-registers of the class available for the allocation. */ int excess_pressure_points_num; + /* Allocno hard reg preferences. */ + ira_pref_t allocno_prefs; /* Copies to other non-conflicting allocnos. The copies can represent move insn or potential move insn usually because of two operand insn constraints. */ @@ -429,6 +432,7 @@ #define ALLOCNO_BAD_SPILL_P(A) ((A)->bad_spill_p) #define ALLOCNO_ASSIGNED_P(A) ((A)->assigned_p) #define ALLOCNO_MODE(A) ((A)->mode) +#define ALLOCNO_PREFS(A) ((A)->allocno_prefs) #define ALLOCNO_COPIES(A) ((A)->allocno_copies) #define ALLOCNO_HARD_REG_COSTS(A) ((A)->hard_reg_costs) #define ALLOCNO_UPDATED_HARD_REG_COSTS(A) ((A)->updated_hard_reg_costs) @@ -519,6 +523,33 @@ /* The size of the previous array. */ extern int ira_objects_num; +/* The following structure represents a hard register prefererence of + allocno. The preference represent move insns or potential move + insns usually because of two operand insn constraints. One move + operand is a hard register. */ +struct ira_allocno_pref +{ + /* The unique order number of the preference node starting with 0. */ + int num; + /* Preferred hard register. */ + int hard_regno; + /* Accumulated execution frequency of insns from which the + preference created. */ + int freq; + /* Given allocno. */ + ira_allocno_t allocno; + /* All prefernces with the same allocno are linked by the following + member. */ + ira_pref_t next_pref; +}; + +/* Array of references to all allocno preferences. The order number + of the preference corresponds to the index in the array. */ +extern ira_pref_t *ira_prefs; + +/* Size of the previous array. */ +extern int ira_prefs_num; + /* The following structure represents a copy of two allocnos. The copies represent move insns or potential move insns usually because of two operand insn constraints. To remove register shuffle, we @@ -935,6 +966,10 @@ extern ira_loop_tree_node_t ira_curr_loop_tree_node; extern ira_allocno_t *ira_curr_regno_allocno_map; +extern void ira_debug_pref (ira_pref_t); +extern void ira_debug_prefs (void); +extern void ira_debug_allocno_prefs (ira_allocno_t); + extern void ira_debug_copy (ira_copy_t); extern void ira_debug_copies (void); extern void ira_debug_allocno_copies (ira_allocno_t); @@ -961,10 +996,10 @@ extern void ira_finish_live_range (live_range_t); extern void ira_finish_live_range_list (live_range_t); extern void ira_free_allocno_updated_costs (ira_allocno_t); +extern ira_pref_t ira_create_pref (ira_allocno_t, int, int); +extern void ira_add_allocno_pref (ira_allocno_t, int, int); extern ira_copy_t ira_create_copy (ira_allocno_t, ira_allocno_t, int, bool, rtx, ira_loop_tree_node_t); -extern void ira_add_allocno_copy_to_list (ira_copy_t); -extern void ira_swap_allocno_copy_ends_if_necessary (ira_copy_t); extern ira_copy_t ira_add_allocno_copy (ira_allocno_t, ira_allocno_t, int, bool, rtx, ira_loop_tree_node_t); @@ -1147,6 +1182,44 @@ ira_allocno_object_iter_cond (&(ITER), (A), &(O));) +/* The iterator for prefs. */ +typedef struct { + /* The number of the current element in IRA_PREFS. */ + int n; +} ira_pref_iterator; + +/* Initialize the iterator I. */ +static inline void +ira_pref_iter_init (ira_pref_iterator *i) +{ + i->n = 0; +} + +/* Return TRUE if we have more prefs to visit, in which case *PREF is + set to the pref to be visited. Otherwise, return FALSE. */ +static inline bool +ira_pref_iter_cond (ira_pref_iterator *i, ira_pref_t *pref) +{ + int n; + + for (n = i->n; n < ira_prefs_num; n++) + if (ira_prefs[n] != NULL) + { + *pref = ira_prefs[n]; + i->n = n + 1; + return true; + } + return false; +} + +/* Loop over all prefs. In each iteration, P is set to the next + pref. ITER is an instance of ira_pref_iterator used to iterate + the prefs. */ +#define FOR_EACH_PREF(P, ITER) \ + for (ira_pref_iter_init (&(ITER)); \ + ira_pref_iter_cond (&(ITER), &(P));) + + /* The iterator for copies. */ typedef struct { /* The number of the current element in IRA_COPIES. */ Index: regmove.c =================================================================== --- regmove.c (revision 195438) +++ regmove.c (working copy) @@ -1241,9 +1241,9 @@ for (i = nregs; --i >= 0; ) regno_src_regno[i] = -1; - /* A forward pass. Replace output operands with input operands. */ - regmove_forward_pass (); - + if (! flag_ira_hard_reg_pref) + /* A forward pass. Replace output operands with input operands. */ + regmove_forward_pass (); /* A backward pass. Replace input operands with output operands. */ regmove_backward_pass ();