From patchwork Thu Sep 27 22:58:19 2012
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Vladimir Makarov
X-Patchwork-Id: 187522
Message-ID: <5064DA0B.4020701@redhat.com>
Date: Thu, 27 Sep 2012 18:58:19 -0400
From: Vladimir Makarov
To: GCC Patches
Subject: RFC: LRA for x86/x86-64 [5/9]

The following patch mostly prepares some data from IRA which will be used by LRA.
It is done by moving some definitions from ira-int.h to ira.h. New data, ira_reg_class_subset, is generated in IRA for LRA. New functions dealing with equivalences are created. They will be used by LRA, and some IRA code is rewritten to use them too. The patch also adds wrapper code in IRA to prepare for calling LRA.

2012-09-27  Vladimir Makarov

	* ira-int.h (struct target_ira_int): Remove x_ira_class_subset_p and x_ira_reg_classes_intersect_p.
	(ira_class_subset_p, ira_reg_classes_intersect_p): Remove.
	(ira_reg_equiv_len, ira_reg_equiv_invariant_p): Ditto.
	(ira_reg_equiv_const): Ditto.
	(ira_equiv_no_lvalue_p): New function.
	* ira-color.c (color_pass, move_spill_restore, coalesce_allocnos): Use ira_equiv_no_lvalue_p.
	(coalesce_spill_slots, ira_sort_regnos_for_alter_reg): Ditto.
	* ira-emit.c (ira_create_new_reg): Call ira_expand_reg_equiv.
	(generate_edge_moves, change_loop): Use ira_equiv_no_lvalue_p.
	(emit_move_list): Simplify code. Call ira_update_equiv_info_by_shuffle_insn. Use ira_reg_equiv instead of ira_reg_equiv_invariant_p and ira_reg_equiv_const. Change assert.
	* ira.c (setup_reg_class_relations): Set up ira_reg_class_subset.
	(ira_reg_equiv_invariant_p, ira_reg_equiv_const): Remove.
	(find_reg_equiv_invariant_const): Ditto.
	(setup_reg_renumber): Use ira_equiv_no_lvalue_p instead of ira_reg_equiv_invariant_p. Skip caps for LRA.
	(setup_reg_equiv_init, ira_update_equiv_info_by_shuffle_insn): New functions.
	(ira_reg_equiv_len): Move it before ira_reg_equiv. Change comment.
	(ira_reg_equiv): New.
	(ira_expand_reg_equiv, finish_reg_equiv): New functions.
	(no_equiv, update_equiv_regs): Use ira_reg_equiv instead of reg_equiv_init.
	(setup_reg_equiv): New function.
	(ira_use_lra_p): New global.
	(ira): Move initialization of ira_obstack and ira_bitmap_obstack upper. Call init_reg_equiv, setup_reg_equiv, and setup_reg_equiv_init instead of initialization of ira_reg_equiv_len, ira_reg_equiv_invariant_p, and ira_reg_equiv_const. Don't flatten IRA IR for LRA.
Don't reassign conflict allocnos for LRA. Call finish_reg_equiv. (do_reload): Prepare code for LRA call. * ira.h (ira_use_lra_p): New external. (struct target_ira): Add members x_ira_class_subset_p x_ira_reg_class_subset, and x_ira_reg_classes_intersect_p. (ira_class_subset_p, ira_reg_class_subset): New macros. (ira_reg_classes_intersect_p): New macro. (ira_reg_equiv_len, ira_reg_equiv): New externals. (struct ira_reg_equiv): New. (ira_expand_reg_equiv, ira_update_equiv_info_by_shuffle_insn): New prototypes. Index: ira-int.h =================================================================== --- ira-int.h (revision 191771) +++ ira-int.h (working copy) @@ -795,11 +795,6 @@ struct target_ira_int { /* Map class->true if class is a pressure class, false otherwise. */ bool x_ira_reg_pressure_class_p[N_REG_CLASSES]; - /* Register class subset relation: TRUE if the first class is a subset - of the second one considering only hard registers available for the - allocation. */ - int x_ira_class_subset_p[N_REG_CLASSES][N_REG_CLASSES]; - /* Array of the number of hard registers of given class which are available for allocation. The order is defined by the hard register numbers. */ @@ -838,13 +833,8 @@ struct target_ira_int { taking all hard-registers including fixed ones into account. */ enum reg_class x_ira_reg_class_intersect[N_REG_CLASSES][N_REG_CLASSES]; - /* True if the two classes (that is calculated taking only hard - registers available for allocation into account; are - intersected. */ - bool x_ira_reg_classes_intersect_p[N_REG_CLASSES][N_REG_CLASSES]; - /* Classes with end marker LIM_REG_CLASSES which are intersected with - given class (the first index;. That includes given class itself. + given class (the first index). That includes given class itself. This is calculated taking only hard registers available for allocation into account. 
*/ enum reg_class x_ira_reg_class_super_classes[N_REG_CLASSES][N_REG_CLASSES]; @@ -861,7 +851,7 @@ struct target_ira_int { /* For each reg class, table listing all the classes contained in it (excluding the class itself. Non-allocatable registers are - excluded from the consideration;. */ + excluded from the consideration). */ enum reg_class x_alloc_reg_class_subclasses[N_REG_CLASSES][N_REG_CLASSES]; /* Array whose values are hard regset of hard registers for which @@ -894,8 +884,6 @@ extern struct target_ira_int *this_targe (this_target_ira_int->x_ira_reg_allocno_class_p) #define ira_reg_pressure_class_p \ (this_target_ira_int->x_ira_reg_pressure_class_p) -#define ira_class_subset_p \ - (this_target_ira_int->x_ira_class_subset_p) #define ira_non_ordered_class_hard_regs \ (this_target_ira_int->x_ira_non_ordered_class_hard_regs) #define ira_class_hard_reg_index \ @@ -912,8 +900,6 @@ extern struct target_ira_int *this_targe (this_target_ira_int->x_ira_uniform_class_p) #define ira_reg_class_intersect \ (this_target_ira_int->x_ira_reg_class_intersect) -#define ira_reg_classes_intersect_p \ - (this_target_ira_int->x_ira_reg_classes_intersect_p) #define ira_reg_class_super_classes \ (this_target_ira_int->x_ira_reg_class_super_classes) #define ira_reg_class_subunion \ @@ -934,17 +920,6 @@ extern void ira_debug_disposition (void) extern void ira_debug_allocno_classes (void); extern void ira_init_register_move_cost (enum machine_mode); -/* The length of the two following arrays. */ -extern int ira_reg_equiv_len; - -/* The element value is TRUE if the corresponding regno value is - invariant. */ -extern bool *ira_reg_equiv_invariant_p; - -/* The element value is equiv constant of given pseudo-register or - NULL_RTX. */ -extern rtx *ira_reg_equiv_const; - /* ira-build.c */ /* The current loop tree node and its regno allocno map. */ @@ -1028,6 +1003,20 @@ extern void ira_emit (bool); +/* Return true if equivalence of pseudo REGNO is not a lvalue. 
*/ +static inline bool +ira_equiv_no_lvalue_p (int regno) +{ + if (regno >= ira_reg_equiv_len) + return false; + return (ira_reg_equiv[regno].constant != NULL_RTX + || ira_reg_equiv[regno].invariant != NULL_RTX + || (ira_reg_equiv[regno].memory != NULL_RTX + && MEM_READONLY_P (ira_reg_equiv[regno].memory))); +} + + + /* Initialize register costs for MODE if necessary. */ static inline void ira_init_register_move_cost_if_necessary (enum machine_mode mode) Index: ira.c =================================================================== --- ira.c (revision 191771) +++ ira.c (working copy) @@ -1201,6 +1201,7 @@ setup_reg_class_relations (void) { ira_reg_classes_intersect_p[cl1][cl2] = false; ira_reg_class_intersect[cl1][cl2] = NO_REGS; + ira_reg_class_subset[cl1][cl2] = NO_REGS; COPY_HARD_REG_SET (temp_hard_regset, reg_class_contents[cl1]); AND_COMPL_HARD_REG_SET (temp_hard_regset, no_unit_alloc_regs); COPY_HARD_REG_SET (temp_set2, reg_class_contents[cl2]); @@ -1248,9 +1249,8 @@ setup_reg_class_relations (void) COPY_HARD_REG_SET (union_set, reg_class_contents[cl1]); IOR_HARD_REG_SET (union_set, reg_class_contents[cl2]); AND_COMPL_HARD_REG_SET (union_set, no_unit_alloc_regs); - for (i = 0; i < ira_important_classes_num; i++) + for (cl3 = 0; cl3 < N_REG_CLASSES; cl3++) { - cl3 = ira_important_classes[i]; COPY_HARD_REG_SET (temp_hard_regset, reg_class_contents[cl3]); AND_COMPL_HARD_REG_SET (temp_hard_regset, no_unit_alloc_regs); if (hard_reg_set_subset_p (temp_hard_regset, intersection_set)) @@ -1258,25 +1258,45 @@ setup_reg_class_relations (void) /* CL3 allocatable hard register set is inside of intersection of allocatable hard register sets of CL1 and CL2. */ + if (important_class_p[cl3]) + { + COPY_HARD_REG_SET + (temp_set2, + reg_class_contents + [(int) ira_reg_class_intersect[cl1][cl2]]); + AND_COMPL_HARD_REG_SET (temp_set2, no_unit_alloc_regs); + if (! 
hard_reg_set_subset_p (temp_hard_regset, temp_set2) + /* If the allocatable hard register sets are + the same, prefer GENERAL_REGS or the + smallest class for debugging + purposes. */ + || (hard_reg_set_equal_p (temp_hard_regset, temp_set2) + && (cl3 == GENERAL_REGS + || ((ira_reg_class_intersect[cl1][cl2] + != GENERAL_REGS) + && hard_reg_set_subset_p + (reg_class_contents[cl3], + reg_class_contents + [(int) + ira_reg_class_intersect[cl1][cl2]]))))) + ira_reg_class_intersect[cl1][cl2] = (enum reg_class) cl3; + } COPY_HARD_REG_SET (temp_set2, - reg_class_contents[(int) - ira_reg_class_intersect[cl1][cl2]]); + reg_class_contents[(int) ira_reg_class_subset[cl1][cl2]]); AND_COMPL_HARD_REG_SET (temp_set2, no_unit_alloc_regs); - if (! hard_reg_set_subset_p (temp_hard_regset, temp_set2) - /* If the allocatable hard register sets are the - same, prefer GENERAL_REGS or the smallest - class for debugging purposes. */ + if (! hard_reg_set_subset_p (temp_hard_regset, temp_set2) + /* Ignore unavailable hard registers and prefer + smallest class for debugging purposes. */ || (hard_reg_set_equal_p (temp_hard_regset, temp_set2) - && (cl3 == GENERAL_REGS - || (ira_reg_class_intersect[cl1][cl2] != GENERAL_REGS - && hard_reg_set_subset_p - (reg_class_contents[cl3], - reg_class_contents - [(int) ira_reg_class_intersect[cl1][cl2]]))))) - ira_reg_class_intersect[cl1][cl2] = (enum reg_class) cl3; + && hard_reg_set_subset_p + (reg_class_contents[cl3], + reg_class_contents + [(int) ira_reg_class_subset[cl1][cl2]]))) + ira_reg_class_subset[cl1][cl2] = (enum reg_class) cl3; } - if (hard_reg_set_subset_p (temp_hard_regset, union_set)) + if (important_class_p[cl3] + && hard_reg_set_subset_p (temp_hard_regset, union_set)) { /* CL3 allocatbale hard register set is inside of union of allocatable hard register sets of CL1 @@ -1885,66 +1905,6 @@ ira_setup_eliminable_regset (void) -/* The length of the following two arrays. 
*/ -int ira_reg_equiv_len; - -/* The element value is TRUE if the corresponding regno value is - invariant. */ -bool *ira_reg_equiv_invariant_p; - -/* The element value is equiv constant of given pseudo-register or - NULL_RTX. */ -rtx *ira_reg_equiv_const; - -/* Set up the two arrays declared above. */ -static void -find_reg_equiv_invariant_const (void) -{ - unsigned int i; - bool invariant_p; - rtx list, insn, note, constant, x; - - for (i = FIRST_PSEUDO_REGISTER; i < VEC_length (reg_equivs_t, reg_equivs); i++) - { - constant = NULL_RTX; - invariant_p = false; - for (list = reg_equiv_init (i); list != NULL_RTX; list = XEXP (list, 1)) - { - insn = XEXP (list, 0); - note = find_reg_note (insn, REG_EQUIV, NULL_RTX); - - if (note == NULL_RTX) - continue; - - x = XEXP (note, 0); - - if (! CONSTANT_P (x) - || ! flag_pic || LEGITIMATE_PIC_OPERAND_P (x)) - { - /* It can happen that a REG_EQUIV note contains a MEM - that is not a legitimate memory operand. As later - stages of the reload assume that all addresses found - in the reg_equiv_* arrays were originally legitimate, - we ignore such REG_EQUIV notes. */ - if (memory_operand (x, VOIDmode)) - invariant_p = MEM_READONLY_P (x); - else if (function_invariant_p (x)) - { - if (GET_CODE (x) == PLUS - || x == frame_pointer_rtx || x == arg_pointer_rtx) - invariant_p = true; - else - constant = x; - } - } - } - ira_reg_equiv_invariant_p[i] = invariant_p; - ira_reg_equiv_const[i] = constant; - } -} - - - /* Vector of substitutions of register numbers, used to map pseudo regs into hardware regs. This is set up as a result of register allocation. @@ -1965,6 +1925,8 @@ setup_reg_renumber (void) caller_save_needed = 0; FOR_EACH_ALLOCNO (a, ai) { + if (ira_use_lra_p && ALLOCNO_CAP_MEMBER (a) != NULL) + continue; /* There are no caps at this point. */ ira_assert (ALLOCNO_CAP_MEMBER (a) == NULL); if (! 
ALLOCNO_ASSIGNED_P (a)) @@ -1996,9 +1958,7 @@ setup_reg_renumber (void) ira_assert (!optimize || flag_caller_saves || (ALLOCNO_CALLS_CROSSED_NUM (a) == ALLOCNO_CHEAP_CALLS_CROSSED_NUM (a)) - || regno >= ira_reg_equiv_len - || ira_reg_equiv_const[regno] - || ira_reg_equiv_invariant_p[regno]); + || ira_equiv_no_lvalue_p (regno)); caller_save_needed = 1; } } @@ -2165,6 +2125,109 @@ check_allocation (void) } #endif +/* Allocate REG_EQUIV_INIT. Set up it from IRA_REG_EQUIV which should + be already calculated. */ +static void +setup_reg_equiv_init (void) +{ + int i; + int max_regno = max_reg_num (); + + for (i = 0; i < max_regno; i++) + reg_equiv_init (i) = ira_reg_equiv[i].init_insns; +} + +/* Update equiv regno from movement of FROM_REGNO to TO_REGNO. INSNS + are insns which were generated for such movement. It is assumed + that FROM_REGNO and TO_REGNO always have the same value at the + point of any move containing such registers. This function is used + to update equiv info for register shuffles on the region borders + and for caller save/restore insns. */ +void +ira_update_equiv_info_by_shuffle_insn (int to_regno, int from_regno, rtx insns) +{ + rtx insn, x, note; + + if (! ira_reg_equiv[from_regno].defined_p + && (! ira_reg_equiv[to_regno].defined_p + || ((x = ira_reg_equiv[to_regno].memory) != NULL_RTX + && ! MEM_READONLY_P (x)))) + return; + insn = insns; + if (NEXT_INSN (insn) != NULL_RTX) + { + if (! 
ira_reg_equiv[to_regno].defined_p) + { + ira_assert (ira_reg_equiv[to_regno].init_insns == NULL_RTX); + return; + } + ira_reg_equiv[to_regno].defined_p = false; + ira_reg_equiv[to_regno].memory + = ira_reg_equiv[to_regno].constant + = ira_reg_equiv[to_regno].invariant + = ira_reg_equiv[to_regno].init_insns = NULL_RTX; + if (internal_flag_ira_verbose > 3 && ira_dump_file != NULL) + fprintf (ira_dump_file, + " Invalidating equiv info for reg %d\n", to_regno); + return; + } + /* It is possible that FROM_REGNO still has no equivalence because + in shuffles to_regno<-from_regno and from_regno<-to_regno the 2nd + insn was not processed yet. */ + if (ira_reg_equiv[from_regno].defined_p) + { + ira_reg_equiv[to_regno].defined_p = true; + if ((x = ira_reg_equiv[from_regno].memory) != NULL_RTX) + { + ira_assert (ira_reg_equiv[from_regno].invariant == NULL_RTX + && ira_reg_equiv[from_regno].constant == NULL_RTX); + ira_assert (ira_reg_equiv[to_regno].memory == NULL_RTX + || rtx_equal_p (ira_reg_equiv[to_regno].memory, x)); + ira_reg_equiv[to_regno].memory = x; + if (! MEM_READONLY_P (x)) + /* We don't add the insn to insn init list because memory + equivalence is just to say what memory is better to use + when the pseudo is spilled. 
*/ + return; + } + else if ((x = ira_reg_equiv[from_regno].constant) != NULL_RTX) + { + ira_assert (ira_reg_equiv[from_regno].invariant == NULL_RTX); + ira_assert (ira_reg_equiv[to_regno].constant == NULL_RTX + || rtx_equal_p (ira_reg_equiv[to_regno].constant, x)); + ira_reg_equiv[to_regno].constant = x; + } + else + { + x = ira_reg_equiv[from_regno].invariant; + ira_assert (x != NULL_RTX); + ira_assert (ira_reg_equiv[to_regno].invariant == NULL_RTX + || rtx_equal_p (ira_reg_equiv[to_regno].invariant, x)); + ira_reg_equiv[to_regno].invariant = x; + } + if (find_reg_note (insn, REG_EQUIV, x) == NULL_RTX) + { + note = set_unique_reg_note (insn, REG_EQUIV, x); + gcc_assert (note != NULL_RTX); + if (internal_flag_ira_verbose > 3 && ira_dump_file != NULL) + { + fprintf (ira_dump_file, + " Adding equiv note to insn %u for reg %d ", + INSN_UID (insn), to_regno); + print_value_slim (ira_dump_file, x, 1); + fprintf (ira_dump_file, "\n"); + } + } + } + ira_reg_equiv[to_regno].init_insns + = gen_rtx_INSN_LIST (VOIDmode, insn, + ira_reg_equiv[to_regno].init_insns); + if (internal_flag_ira_verbose > 3 && ira_dump_file != NULL) + fprintf (ira_dump_file, + " Adding equiv init move insn %u to reg %d\n", + INSN_UID (insn), to_regno); +} + /* Fix values of array REG_EQUIV_INIT after live range splitting done by IRA. */ static void @@ -2202,6 +2265,7 @@ fix_reg_equiv_init (void) prev = x; else { + /* Remove the wrong list element. */ if (prev == NULL_RTX) reg_equiv_init (i) = next; else @@ -2334,6 +2398,46 @@ mark_elimination (int from, int to) +/* The length of the following array. */ +int ira_reg_equiv_len; + +/* Info about equiv. info for each register. */ +struct ira_reg_equiv *ira_reg_equiv; + +/* Expand ira_reg_equiv if necessary. 
*/ +void +ira_expand_reg_equiv (void) +{ + int old = ira_reg_equiv_len; + + if (ira_reg_equiv_len > max_reg_num ()) + return; + ira_reg_equiv_len = max_reg_num () * 3 / 2 + 1; + ira_reg_equiv + = (struct ira_reg_equiv *) xrealloc (ira_reg_equiv, + ira_reg_equiv_len + * sizeof (struct ira_reg_equiv)); + gcc_assert (old < ira_reg_equiv_len); + memset (ira_reg_equiv + old, 0, + sizeof (struct ira_reg_equiv) * (ira_reg_equiv_len - old)); +} + +static void +init_reg_equiv (void) +{ + ira_reg_equiv_len = 0; + ira_reg_equiv = NULL; + ira_expand_reg_equiv (); +} + +static void +finish_reg_equiv (void) +{ + free (ira_reg_equiv); +} + + + struct equivalence { /* Set when a REG_EQUIV note is found or created. Use to @@ -2707,7 +2811,8 @@ no_equiv (rtx reg, const_rtx store ATTRI should keep their initialization insns. */ if (reg_equiv[regno].is_arg_equivalence) return; - reg_equiv_init (regno) = NULL_RTX; + ira_reg_equiv[regno].defined_p = false; + ira_reg_equiv[regno].init_insns = NULL_RTX; for (; list; list = XEXP (list, 1)) { rtx insn = XEXP (list, 0); @@ -2743,7 +2848,7 @@ static int recorded_label_ref; value into the using insn. If it succeeds, we can eliminate the register completely. - Initialize the REG_EQUIV_INIT array of initializing insns. + Initialize init_insns in ira_reg_equiv array. Return non-zero if jump label rebuilding should be done. */ static int @@ -2818,14 +2923,16 @@ update_equiv_regs (void) gcc_assert (REG_P (dest)); regno = REGNO (dest); - /* Note that we don't want to clear reg_equiv_init even if there - are multiple sets of this register. */ + /* Note that we don't want to clear init_insns in + ira_reg_equiv even if there are multiple sets of this + register. */ reg_equiv[regno].is_arg_equivalence = 1; /* Record for reload that this is an equivalencing insn. 
*/ if (rtx_equal_p (src, XEXP (note, 0))) - reg_equiv_init (regno) - = gen_rtx_INSN_LIST (VOIDmode, insn, reg_equiv_init (regno)); + ira_reg_equiv[regno].init_insns + = gen_rtx_INSN_LIST (VOIDmode, insn, + ira_reg_equiv[regno].init_insns); /* Continue normally in case this is a candidate for replacements. */ @@ -2925,8 +3032,9 @@ update_equiv_regs (void) /* If we haven't done so, record for reload that this is an equivalencing insn. */ if (!reg_equiv[regno].is_arg_equivalence) - reg_equiv_init (regno) - = gen_rtx_INSN_LIST (VOIDmode, insn, reg_equiv_init (regno)); + ira_reg_equiv[regno].init_insns + = gen_rtx_INSN_LIST (VOIDmode, insn, + ira_reg_equiv[regno].init_insns); /* Record whether or not we created a REG_EQUIV note for a LABEL_REF. We might end up substituting the LABEL_REF for uses of the @@ -3026,7 +3134,7 @@ update_equiv_regs (void) { /* This insn makes the equivalence, not the one initializing the register. */ - reg_equiv_init (regno) + ira_reg_equiv[regno].init_insns = gen_rtx_INSN_LIST (VOIDmode, insn, NULL_RTX); df_notes_rescan (init_insn); } @@ -3080,9 +3188,10 @@ update_equiv_regs (void) /* reg_equiv[REGNO].replace gets set only when REG_N_REFS[REGNO] is 2, i.e. the register is set - once and used once. (If it were only set, but not used, - flow would have deleted the setting insns.) Hence - there can only be one insn in reg_equiv[REGNO].init_insns. */ + once and used once. (If it were only set, but + not used, flow would have deleted the setting + insns.) Hence there can only be one insn in + reg_equiv[REGNO].init_insns. 
*/ gcc_assert (reg_equiv[regno].init_insns && !XEXP (reg_equiv[regno].init_insns, 1)); equiv_insn = XEXP (reg_equiv[regno].init_insns, 0); @@ -3129,7 +3238,7 @@ update_equiv_regs (void) reg_equiv[regno].init_insns = XEXP (reg_equiv[regno].init_insns, 1); - reg_equiv_init (regno) = NULL_RTX; + ira_reg_equiv[regno].init_insns = NULL_RTX; bitmap_set_bit (cleared_regs, regno); } /* Move the initialization of the register to just before @@ -3162,7 +3271,7 @@ update_equiv_regs (void) if (insn == BB_HEAD (bb)) BB_HEAD (bb) = PREV_INSN (insn); - reg_equiv_init (regno) + ira_reg_equiv[regno].init_insns = gen_rtx_INSN_LIST (VOIDmode, new_insn, NULL_RTX); bitmap_set_bit (cleared_regs, regno); } @@ -3208,6 +3317,88 @@ update_equiv_regs (void) +/* Set up fields memory, constant, and invariant from init_insns in + the structures of array ira_reg_equiv. */ +static void +setup_reg_equiv (void) +{ + int i; + rtx elem, insn, set, x; + + for (i = FIRST_PSEUDO_REGISTER; i < ira_reg_equiv_len; i++) + for (elem = ira_reg_equiv[i].init_insns; elem; elem = XEXP (elem, 1)) + { + insn = XEXP (elem, 0); + set = single_set (insn); + + /* Init insns can set up equivalence when the reg is a destination or + a source (in this case the destination is memory). */ + if (set != 0 && (REG_P (SET_DEST (set)) || REG_P (SET_SRC (set)))) + { + if ((x = find_reg_note (insn, REG_EQUIV, NULL_RTX)) != NULL) + x = XEXP (x, 0); + else if (REG_P (SET_DEST (set)) + && REGNO (SET_DEST (set)) == (unsigned int) i) + x = SET_SRC (set); + else + { + gcc_assert (REG_P (SET_SRC (set)) + && REGNO (SET_SRC (set)) == (unsigned int) i); + x = SET_DEST (set); + } + if (! function_invariant_p (x) + || ! flag_pic + /* A function invariant is often CONSTANT_P but may + include a register. We promise to only pass + CONSTANT_P objects to LEGITIMATE_PIC_OPERAND_P. */ + || (CONSTANT_P (x) && LEGITIMATE_PIC_OPERAND_P (x))) + { + /* It can happen that a REG_EQUIV note contains a MEM + that is not a legitimate memory operand. 
As later + stages of reload assume that all addresses found in + the lra_regno_equiv_* arrays were originally + legitimate, we ignore such REG_EQUIV notes. */ + if (memory_operand (x, VOIDmode)) + { + ira_reg_equiv[i].defined_p = true; + ira_reg_equiv[i].memory = x; + continue; + } + else if (function_invariant_p (x)) + { + enum machine_mode mode; + + mode = GET_MODE (SET_DEST (set)); + if (GET_CODE (x) == PLUS + || x == frame_pointer_rtx || x == arg_pointer_rtx) + /* This is PLUS of frame pointer and a constant, + or fp, or argp. */ + ira_reg_equiv[i].invariant = x; + else if (targetm.legitimate_constant_p (mode, x)) + ira_reg_equiv[i].constant = x; + else + { + ira_reg_equiv[i].memory = force_const_mem (mode, x); + if (ira_reg_equiv[i].memory == NULL_RTX) + { + ira_reg_equiv[i].defined_p = false; + ira_reg_equiv[i].init_insns = NULL_RTX; + break; + } + } + ira_reg_equiv[i].defined_p = true; + continue; + } + } + } + ira_reg_equiv[i].defined_p = false; + ira_reg_equiv[i].init_insns = NULL_RTX; + break; + } +} + + + /* Print chain C to FILE. */ static void print_insn_chain (FILE *file, struct insn_chain *c) @@ -4102,6 +4293,11 @@ allocate_initial_values (void) } } + +/* True when we use LRA instead of reload pass for the current + function. */ +bool ira_use_lra_p; + /* All natural loops. 
*/ struct loops ira_loops; @@ -4120,6 +4316,13 @@ ira (FILE *f) int max_regno_before_ira, ira_max_point_before_emit; int rebuild_p; + ira_use_lra_p = targetm.lra_p (); + +#ifndef IRA_NO_OBSTACK + gcc_obstack_init (&ira_obstack); +#endif + bitmap_obstack_initialize (&ira_bitmap_obstack); + if (flag_caller_saves) init_caller_save (); @@ -4166,30 +4369,18 @@ ira (FILE *f) if (resize_reg_info () && flag_ira_loop_pressure) ira_set_pseudo_classes (ira_dump_file); + init_reg_equiv (); rebuild_p = update_equiv_regs (); + setup_reg_equiv (); + setup_reg_equiv_init (); -#ifndef IRA_NO_OBSTACK - gcc_obstack_init (&ira_obstack); -#endif - bitmap_obstack_initialize (&ira_bitmap_obstack); - if (optimize) + if (optimize && rebuild_p) { - max_regno = max_reg_num (); - ira_reg_equiv_len = max_regno; - ira_reg_equiv_invariant_p - = (bool *) ira_allocate (max_regno * sizeof (bool)); - memset (ira_reg_equiv_invariant_p, 0, max_regno * sizeof (bool)); - ira_reg_equiv_const = (rtx *) ira_allocate (max_regno * sizeof (rtx)); - memset (ira_reg_equiv_const, 0, max_regno * sizeof (rtx)); - find_reg_equiv_invariant_const (); - if (rebuild_p) - { - timevar_push (TV_JUMP); - rebuild_jump_labels (get_insns ()); - if (purge_all_dead_edges ()) - delete_unreachable_blocks (); - timevar_pop (TV_JUMP); - } + timevar_push (TV_JUMP); + rebuild_jump_labels (get_insns ()); + if (purge_all_dead_edges ()) + delete_unreachable_blocks (); + timevar_pop (TV_JUMP); } allocated_reg_info_size = max_reg_num (); @@ -4241,19 +4432,32 @@ ira (FILE *f) ira_emit (loops_p); + max_regno = max_reg_num (); if (ira_conflicts_p) { - max_regno = max_reg_num (); - if (! loops_p) - ira_initiate_assign (); + { + if (! 
ira_use_lra_p) + ira_initiate_assign (); + } else { expand_reg_info (); - if (internal_flag_ira_verbose > 0 && ira_dump_file != NULL) - fprintf (ira_dump_file, "Flattening IR\n"); - ira_flattening (max_regno_before_ira, ira_max_point_before_emit); + if (ira_use_lra_p) + { + ira_allocno_t a; + ira_allocno_iterator ai; + + FOR_EACH_ALLOCNO (a, ai) + ALLOCNO_REGNO (a) = REGNO (ALLOCNO_EMIT_DATA (a)->reg); + } + else + { + if (internal_flag_ira_verbose > 0 && ira_dump_file != NULL) + fprintf (ira_dump_file, "Flattening IR\n"); + ira_flattening (max_regno_before_ira, ira_max_point_before_emit); + } /* New insns were generated: add notes and recalculate live info. */ df_analyze (); @@ -4262,9 +4466,12 @@ ira (FILE *f) record_loop_exits (); current_loops = &ira_loops; - setup_allocno_assignment_flags (); - ira_initiate_assign (); - ira_reassign_conflict_allocnos (max_regno); + if (! ira_use_lra_p) + { + setup_allocno_assignment_flags (); + ira_initiate_assign (); + ira_reassign_conflict_allocnos (max_regno); + } } } @@ -4322,45 +4529,72 @@ do_reload (void) if (flag_ira_verbose < 10) ira_dump_file = dump_file; - df_set_flags (DF_NO_INSN_RESCAN); - build_insn_chain (); + timevar_push (TV_RELOAD); + if (ira_use_lra_p) + { + if (current_loops != NULL) + { + flow_loops_free (&ira_loops); + free_dominance_info (CDI_DOMINATORS); + } + FOR_ALL_BB (bb) + bb->loop_father = NULL; + current_loops = NULL; + + if (ira_conflicts_p) + ira_free (ira_spilled_reg_stack_slots); - need_dce = reload (get_insns (), ira_conflicts_p); + ira_destroy (); + + VEC_free (reg_equivs_t, gc, reg_equivs); + reg_equivs = NULL; + need_dce = false; + } + else + { + df_set_flags (DF_NO_INSN_RESCAN); + build_insn_chain (); + + need_dce = reload (get_insns (), ira_conflicts_p); + + } + + timevar_pop (TV_RELOAD); timevar_push (TV_IRA); - if (ira_conflicts_p) + if (ira_conflicts_p && ! 
ira_use_lra_p) { ira_free (ira_spilled_reg_stack_slots); - ira_finish_assign (); } + if (internal_flag_ira_verbose > 0 && ira_dump_file != NULL && overall_cost_before != ira_overall_cost) fprintf (ira_dump_file, "+++Overall after reload %d\n", ira_overall_cost); - ira_destroy (); flag_ira_share_spill_slots = saved_flag_ira_share_spill_slots; - if (current_loops != NULL) + if (! ira_use_lra_p) { - flow_loops_free (&ira_loops); - free_dominance_info (CDI_DOMINATORS); + ira_destroy (); + if (current_loops != NULL) + { + flow_loops_free (&ira_loops); + free_dominance_info (CDI_DOMINATORS); + } + FOR_ALL_BB (bb) + bb->loop_father = NULL; + current_loops = NULL; + + regstat_free_ri (); + regstat_free_n_sets_and_refs (); } - FOR_ALL_BB (bb) - bb->loop_father = NULL; - current_loops = NULL; - - regstat_free_ri (); - regstat_free_n_sets_and_refs (); if (optimize) - { - cleanup_cfg (CLEANUP_EXPENSIVE); + cleanup_cfg (CLEANUP_EXPENSIVE); - ira_free (ira_reg_equiv_invariant_p); - ira_free (ira_reg_equiv_const); - } + finish_reg_equiv (); bitmap_obstack_release (&ira_bitmap_obstack); #ifndef IRA_NO_OBSTACK Index: ira.h =================================================================== --- ira.h (revision 191771) +++ ira.h (working copy) @@ -20,11 +20,16 @@ You should have received a copy of the G along with GCC; see the file COPYING3. If not see . */ +/* True when we use LRA instead of reload pass for the current + function. */ +extern bool ira_use_lra_p; + /* True if we have allocno conflicts. It is false for non-optimized mode or when the conflict table is too big. */ extern bool ira_conflicts_p; -struct target_ira { +struct target_ira +{ /* Map: hard register number -> allocno class it belongs to. If the corresponding class is NO_REGS, the hard register is not available for allocation. */ @@ -79,6 +84,23 @@ struct target_ira { class. 
*/ int x_ira_class_hard_regs_num[N_REG_CLASSES]; + /* Register class subset relation: TRUE if the first class is a subset + of the second one considering only hard registers available for the + allocation. */ + int x_ira_class_subset_p[N_REG_CLASSES][N_REG_CLASSES]; + + /* The biggest class inside of intersection of the two classes (that + is calculated taking only hard registers available for allocation + into account. If the both classes contain no hard registers + available for allocation, the value is calculated with taking all + hard-registers including fixed ones into account. */ + enum reg_class x_ira_reg_class_subset[N_REG_CLASSES][N_REG_CLASSES]; + + /* True if the two classes (that is calculated taking only hard + registers available for allocation into account; are + intersected. */ + bool x_ira_reg_classes_intersect_p[N_REG_CLASSES][N_REG_CLASSES]; + /* Function specific hard registers can not be used for the register allocation. */ HARD_REG_SET x_ira_no_alloc_regs; @@ -117,9 +139,37 @@ extern struct target_ira *this_target_ir (this_target_ira->x_ira_class_hard_regs) #define ira_class_hard_regs_num \ (this_target_ira->x_ira_class_hard_regs_num) +#define ira_class_subset_p \ + (this_target_ira->x_ira_class_subset_p) +#define ira_reg_class_subset \ + (this_target_ira->x_ira_reg_class_subset) +#define ira_reg_classes_intersect_p \ + (this_target_ira->x_ira_reg_classes_intersect_p) #define ira_no_alloc_regs \ (this_target_ira->x_ira_no_alloc_regs) +/* Major structure describing equivalence info for a pseudo. */ +struct ira_reg_equiv +{ + /* True if we can use this equivalence. */ + bool defined_p; + /* True if the usage of the equivalence is profitable. */ + bool profitable_p; + /* Equiv. memory, constant, invariant, and initializing insns of + given pseudo-register or NULL_RTX. */ + rtx memory; + rtx constant; + rtx invariant; + /* Always NULL_RTX if defined_p is false. */ + rtx init_insns; +}; + +/* The length of the following array. 
*/ +extern int ira_reg_equiv_len; + +/* Info about equiv. info for each register. */ +extern struct ira_reg_equiv *ira_reg_equiv; + extern void ira_init_once (void); extern void ira_init (void); extern void ira_finish_once (void); @@ -127,6 +177,8 @@ extern void ira_setup_eliminable_regset extern rtx ira_eliminate_regs (rtx, enum machine_mode); extern void ira_set_pseudo_classes (FILE *); extern void ira_implicitly_set_insn_hard_regs (HARD_REG_SET *); +extern void ira_expand_reg_equiv (void); +extern void ira_update_equiv_info_by_shuffle_insn (int, int, rtx); extern void ira_sort_regnos_for_alter_reg (int *, int, unsigned int *); extern void ira_mark_allocation_change (int); Index: ira-color.c =================================================================== --- ira-color.c (revision 191771) +++ ira-color.c (working copy) @@ -2835,8 +2835,7 @@ color_pass (ira_loop_tree_node_t loop_tr exit_freq = ira_loop_edge_freq (subloop_node, regno, true); enter_freq = ira_loop_edge_freq (subloop_node, regno, false); ira_assert (regno < ira_reg_equiv_len); - if (ira_reg_equiv_invariant_p[regno] - || ira_reg_equiv_const[regno] != NULL_RTX) + if (ira_equiv_no_lvalue_p (regno)) { if (! ALLOCNO_ASSIGNED_P (subloop_allocno)) { @@ -2941,9 +2940,7 @@ move_spill_restore (void) copies and the reload pass can spill the allocno set by copy although the allocno will not get memory slot. */ - || (regno < ira_reg_equiv_len - && (ira_reg_equiv_invariant_p[regno] - || ira_reg_equiv_const[regno] != NULL_RTX)) + || ira_equiv_no_lvalue_p (regno) || !bitmap_bit_p (loop_node->border_allocnos, ALLOCNO_NUM (a))) continue; mode = ALLOCNO_MODE (a); @@ -3367,9 +3364,7 @@ coalesce_allocnos (void) a = ira_allocnos[j]; regno = ALLOCNO_REGNO (a); if (! 
ALLOCNO_ASSIGNED_P (a) || ALLOCNO_HARD_REGNO (a) >= 0 - || (regno < ira_reg_equiv_len - && (ira_reg_equiv_const[regno] != NULL_RTX - || ira_reg_equiv_invariant_p[regno]))) + || ira_equiv_no_lvalue_p (regno)) continue; for (cp = ALLOCNO_COPIES (a); cp != NULL; cp = next_cp) { @@ -3384,9 +3379,7 @@ coalesce_allocnos (void) if ((cp->insn != NULL || cp->constraint_p) && ALLOCNO_ASSIGNED_P (cp->second) && ALLOCNO_HARD_REGNO (cp->second) < 0 - && (regno >= ira_reg_equiv_len - || (! ira_reg_equiv_invariant_p[regno] - && ira_reg_equiv_const[regno] == NULL_RTX))) + && ! ira_equiv_no_lvalue_p (regno)) sorted_copies[cp_num++] = cp; } else if (cp->second == a) @@ -3652,9 +3645,7 @@ coalesce_spill_slots (ira_allocno_t *spi allocno = spilled_coalesced_allocnos[i]; if (ALLOCNO_COALESCE_DATA (allocno)->first != allocno || bitmap_bit_p (set_jump_crosses, ALLOCNO_REGNO (allocno)) - || (ALLOCNO_REGNO (allocno) < ira_reg_equiv_len - && (ira_reg_equiv_const[ALLOCNO_REGNO (allocno)] != NULL_RTX - || ira_reg_equiv_invariant_p[ALLOCNO_REGNO (allocno)]))) + || ira_equiv_no_lvalue_p (ALLOCNO_REGNO (allocno))) continue; for (j = 0; j < i; j++) { @@ -3662,9 +3653,7 @@ coalesce_spill_slots (ira_allocno_t *spi n = ALLOCNO_COALESCE_DATA (a)->temp; if (ALLOCNO_COALESCE_DATA (a)->first == a && ! bitmap_bit_p (set_jump_crosses, ALLOCNO_REGNO (a)) - && (ALLOCNO_REGNO (a) >= ira_reg_equiv_len - || (! ira_reg_equiv_invariant_p[ALLOCNO_REGNO (a)] - && ira_reg_equiv_const[ALLOCNO_REGNO (a)] == NULL_RTX)) + && ! ira_equiv_no_lvalue_p (ALLOCNO_REGNO (a)) && ! 
slot_coalesced_allocno_live_ranges_intersect_p (allocno, n)) break; } @@ -3772,9 +3761,7 @@ ira_sort_regnos_for_alter_reg (int *pseu allocno = spilled_coalesced_allocnos[i]; if (ALLOCNO_COALESCE_DATA (allocno)->first != allocno || ALLOCNO_HARD_REGNO (allocno) >= 0 - || (ALLOCNO_REGNO (allocno) < ira_reg_equiv_len - && (ira_reg_equiv_const[ALLOCNO_REGNO (allocno)] != NULL_RTX - || ira_reg_equiv_invariant_p[ALLOCNO_REGNO (allocno)]))) + || ira_equiv_no_lvalue_p (ALLOCNO_REGNO (allocno))) continue; if (internal_flag_ira_verbose > 3 && ira_dump_file != NULL) fprintf (ira_dump_file, " Slot %d (freq,size):", slot_num); Index: ira-emit.c =================================================================== --- ira-emit.c (revision 191771) +++ ira-emit.c (working copy) @@ -340,6 +340,7 @@ ira_create_new_reg (rtx original_reg) if (internal_flag_ira_verbose > 3 && ira_dump_file != NULL) fprintf (ira_dump_file, " Creating newreg=%i from oldreg=%i\n", REGNO (new_reg), REGNO (original_reg)); + ira_expand_reg_equiv (); return new_reg; } @@ -515,8 +516,7 @@ generate_edge_moves (edge e) /* Remove unnecessary stores at the region exit. We should do this for readonly memory for sure and this is guaranteed by that we never generate moves on region borders (see - checking ira_reg_equiv_invariant_p in function - change_loop). */ + checking in function change_loop). */ if (ALLOCNO_HARD_REGNO (dest_allocno) < 0 && ALLOCNO_HARD_REGNO (src_allocno) >= 0 && store_can_be_removed_p (src_allocno, dest_allocno)) @@ -610,8 +610,7 @@ change_loop (ira_loop_tree_node_t node) /* don't create copies because reload can spill an allocno set by copy although the allocno will not get memory slot. 
*/ - || ira_reg_equiv_invariant_p[regno] - || ira_reg_equiv_const[regno] != NULL_RTX)) + || ira_equiv_no_lvalue_p (regno))) continue; original_reg = allocno_emit_reg (allocno); if (parent_allocno == NULL @@ -899,17 +898,22 @@ modify_move_list (move_t list) static rtx emit_move_list (move_t list, int freq) { - int cost, regno; - rtx result, insn, set, to; + rtx to, from, dest; + int to_regno, from_regno, cost, regno; + rtx result, insn, set; enum machine_mode mode; enum reg_class aclass; + grow_reg_equivs (); start_sequence (); for (; list != NULL; list = list->next) { start_sequence (); - emit_move_insn (allocno_emit_reg (list->to), - allocno_emit_reg (list->from)); + to = allocno_emit_reg (list->to); + to_regno = REGNO (to); + from = allocno_emit_reg (list->from); + from_regno = REGNO (from); + emit_move_insn (to, from); list->insn = get_insns (); end_sequence (); for (insn = list->insn; insn != NULL_RTX; insn = NEXT_INSN (insn)) @@ -925,21 +929,22 @@ emit_move_list (move_t list, int freq) to use the equivalence. */ if ((set = single_set (insn)) != NULL_RTX) { - to = SET_DEST (set); - if (GET_CODE (to) == SUBREG) - to = SUBREG_REG (to); - ira_assert (REG_P (to)); - regno = REGNO (to); + dest = SET_DEST (set); + if (GET_CODE (dest) == SUBREG) + dest = SUBREG_REG (dest); + ira_assert (REG_P (dest)); + regno = REGNO (dest); if (regno >= ira_reg_equiv_len - || (! ira_reg_equiv_invariant_p[regno] - && ira_reg_equiv_const[regno] == NULL_RTX)) + || (ira_reg_equiv[regno].invariant == NULL_RTX + && ira_reg_equiv[regno].constant == NULL_RTX)) continue; /* regno has no equivalence. */ ira_assert ((int) VEC_length (reg_equivs_t, reg_equivs) - >= ira_reg_equiv_len); + > regno); reg_equiv_init (regno) = gen_rtx_INSN_LIST (VOIDmode, insn, reg_equiv_init (regno)); } } + ira_update_equiv_info_by_shuffle_insn (to_regno, from_regno, list->insn); emit_insn (list->insn); mode = ALLOCNO_MODE (list->to); aclass = ALLOCNO_CLASS (list->to);