From patchwork Wed Oct 24 01:42:04 2012 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Joern Rennecke X-Patchwork-Id: 193614 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) by ozlabs.org (Postfix) with SMTP id 5FC4F2C0147 for ; Wed, 24 Oct 2012 12:42:21 +1100 (EST) Comment: DKIM? See http://www.dkim.org DKIM-Signature: v=1; a=rsa-sha1; c=relaxed/relaxed; d=gcc.gnu.org; s=default; x=1351647742; h=Comment: DomainKey-Signature:Received:Received:Received:Received:Received: Message-ID:Date:From:To:Cc:Subject:References:In-Reply-To: MIME-Version:Content-Type:Content-Transfer-Encoding:User-Agent: Mailing-List:Precedence:List-Id:List-Unsubscribe:List-Archive: List-Post:List-Help:Sender:Delivered-To; bh=jrqu90JAzni1PBPrzlqL lWQh2x0=; b=R2qwBJ4IEZzTRcdW0Q0J9u3qo+M56bOK/Ah27aJ6OQviDsfv+Uyi njpgjHnuAIudR+6aUpXNpaTBiElra5wHFQKXLaNy+8+389TOt4PofAtP142/g9d6 1A4XIebVKebZc7xnjudagt/w4M1xCLyq1ie/w6TR0iQV2UI/DBPBDZk= Comment: DomainKeys? See http://antispam.yahoo.com/domainkeys DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=default; d=gcc.gnu.org; h=Received:Received:X-SWARE-Spam-Status:X-Spam-Check-By:Received:Received:Received:Message-ID:Date:From:To:Cc:Subject:References:In-Reply-To:MIME-Version:Content-Type:Content-Transfer-Encoding:User-Agent:Mailing-List:Precedence:List-Id:List-Unsubscribe:List-Archive:List-Post:List-Help:Sender:Delivered-To; b=efsBdtQzazuxddh746mYgi1QKjDc2GKqqyIFRbSpButqXOryvhPRSrLhYYrlLf 6PGqRk3CUvyrfgIJwFPhwQMsgl6KlqjsEFBVVWKQolasjCyLsZat3VXAuHYkdHR4 L4bascbmu/GijVgJbiTUYGQeJjrnwi5yavaLJhKgSMJCc=; Received: (qmail 13650 invoked by alias); 24 Oct 2012 01:42:16 -0000 Received: (qmail 13641 invoked by uid 22791); 24 Oct 2012 01:42:15 -0000 X-SWARE-Spam-Status: No, hits=-1.8 required=5.0 tests=AWL, BAYES_05, KHOP_THREADED, MIME_QP_LONG_LINE, TW_DJ, TW_EG, TW_GP, TW_JG, TW_SD X-Spam-Check-By: sourceware.org Received: from c62.cesmail.net (HELO c62.cesmail.net) (216.154.195.54) by sourceware.org (qpsmtpd/0.43rc1) with ESMTP; Wed, 24 Oct 2012 01:42:06 +0000 Received: from unknown (HELO epsilon2) ([192.168.1.60]) by c62.cesmail.net with ESMTP; 23 Oct 2012 21:42:04 -0400 Received: from cust213-dsl91-135-11.idnet.net (cust213-dsl91-135-11.idnet.net [91.135.11.213]) by webmail.spamcop.net (Horde MIME library) with HTTP; Tue, 23 Oct 2012 21:42:04 -0400 Message-ID: <20121023214204.vsnfi46iyscw4sos-nzlynne@webmail.spamcop.net> Date: Tue, 23 Oct 2012 21:42:04 -0400 From: Joern Rennecke To: Richard Biener Cc: Richard Sandiford , gcc-patches@gcc.gnu.org, Eric Botcazou Subject: RFA: Add lock_lenth attribute to support the ARC port (Was: Re: Ping: RFA: add lock_length attribute to break branch-shortening cycles) References: <20121015013935.v6k1ge59foks8s88-nzlynne@webmail.spamcop.net> <87391flcjy.fsf@talisman.home> <20121016153515.0nmsft8o2soo4s40-nzlynne@webmail.spamcop.net> In-Reply-To: MIME-Version: 1.0 User-Agent: Internet Messaging Program (IMP) H3 (4.1.4) Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Quoting Richard Biener : > On Tue, Oct 16, 2012 at 9:35 PM, Joern Rennecke > wrote: .. >> Well, we could split it anyway, and give ports without the need for >> multiple length attributes the benefit of the optimistic algorithm. >> >> I have attached a patch that implements this. > > Looks reasonable to me, though I'm not familiar enough with the code > to approve it. Now that Richard Sandiford has reviewed that split-off part and it's in the source tree, we can return to the remaining functionality needed by for the ARC port. > I'd strongly suggest to try harder to make things work for you without > the new attribute even though I wasn't really able to follow your reasoning > on why that wouldn't work. It may be easier to motivate this change > once the port is in without that attribute so one can actually look at > the machine description and port details. Well, it doesn't simply drop in with the existing branch shortening - libgcc won't compile because of out-of-range branches. I tried to lump the length and lock_length atribute together, and that just gives genattrtab indigestion. It sits there looping forever. I could start debugging this, but that would take an unknown amount of time, and then the review of the fix would take an unknown amount of time, and then the ARC port probably needs fixing up again because it just doesn't work right with these fudged lengths. And even if we could get everything required in before the close of phase 1, the branch shortening would be substandard. It seems more productive to get the branch shortening working now. The lock_length atrtibute is completely optional, so no port maintainer would be forced to use the functionality if it's not desired. The issue is that the some instructions need to be aligned or unaligned for performance or in a few cases even for correctness. Just inserting random nops would be a waste of code space and cycles, since most of the time, the desired (mis)alignment can be archived by selectively making a short instruction long. If an instruction that is long once was forced to stay long henceforth, that would actually defeat the purpose of getting the desired alignment. Then another short instruction - if it can be found - would need to be upsized. So a size increase could ripple through as alignments are distorted. The natural thing to do is really when the alignemnt changes is really to let the upsized instruction be short again. Only length-determined branch sizes have to be locked to avoid cycles. This is the documentation for the new role of the lock_length attribute (reduced from my previous attempt): @cindex lock_length Usually, when doing optimizing branch shortening, the instruction length is calculated by avaluating the @code{length} attribute, applying @code{ADJUST_INSN_LENGTH}, and taking the maximum of the resultant value and the length of the instruction in the previous iteration. If you define the @code{lock_length} attribute, the @code{lock_length} attribute will be evaluated, and then the maximum with of @code{lock_length} value from the previous iteration will be formed and saved. Then the maximum of that value with the @code{length} attribute will be formed, and @code{ADJUST_INSN_LENGTH} will be applied. Thus, you can allow the length to vary downwards as well as upwards across iterations with suitable definitions of the @code{length} attribute and/or @code{ADJUST_INSN_LENGTH}. Care has to be taken that this does not lead to infinite loops. The new patch builds on this patch: http://gcc.gnu.org/ml/gcc-patches/2012-10/msg01890.html as a prerequisite. build tested with libraries in revision 192654 for i686-pc-linux-gnu X mipsel-elf . bootstrapped in revision 192703 on i686-pc-linux-gnu; I've also successfully run config-list.mk with the patch applied to this revision. The following ports had pre-existing failures, which are documented in the sub-PRs or PR47093/PR44756: alpha64-dec-vms alpha-dec-vms am33_2.0-linux arm-netbsdelf arm-wrs-vxworks avr-elf avr-rtems c6x-elf c6x-uclinux cr16-elf fr30-elf i686-interix3 --enable-obsolete i686-openbsd3.0 i686-pc-msdosdjgpp ia64-hp-vms iq2000-elf lm32-elf lm32-rtems lm32-uclinux mep-elf microblaze-elf microblaze-linux mn10300-elf moxie-elf moxie-rtems moxie-uclinux pdp11-aout picochip-elf --enable-obsolete rl78-elf rx-elf score-elf --enable-obsolete sh5el-netbsd sh64-elf --with-newlib sh64-linux sh64-netbsd sh-elf shle-linux sh-netbsdelf sh-rtems sh-superh-elf sh-wrs-vxworks tilegx-linux-gnu tilepro-linux-gnu vax-linux-gnu vax-netbsdelf vax-openbsd x86_64-knetbsd-gnu I'll be posting the ARC port shortly; it does not fit into a single 100 KB posting, so I'm thinking of splitting it in a configury patch and zx compressed files/tarballs for arc.c, arc.md, libgcc, and the rest of the port. 2012-10-22 Joern Rennecke * doc/md.texi (node Defining Attributes): Add lock_length to table of special attributes. (node Insn Lengths): Document lock_length attribute. * final.c (uid_lock_length): New variable. (insn_max_length, get_attr_lock_length): New functions. (get_attr_length): Use insn_max_length. (get_attr_length_1): Use direct recursion rather than calling get_attr_length. (INSN_VARIABLE_LENGTH_P): Define. (shorten_branches): Implement lock_length attribute functionality. * genattrtab.c (lock_length_str): New variable. (make_length_attrs): New parameter base. (main): Initialize lock_length_str. Generate lock_lengths attributes. * genattr.c (gen_attr): Emit declarations for lock_length attribute related functions. (main): Emit stub defines for lock_length attribute related functions. Index: doc/md.texi =================================================================== --- doc/md.texi (revision 2660) +++ doc/md.texi (working copy) @@ -7515,6 +7515,9 @@ extern enum attr_type get_attr_type (); code chunks. This is especially important when verifying branch distances. @xref{Insn Lengths}. +@item lock_length +The @code{lock_length} attribute. + @item enabled The @code{enabled} attribute can be defined to prevent certain alternatives of an insn definition from being used during code @@ -8038,6 +8041,22 @@ (define_insn "jump" (const_int 6)))]) @end smallexample +@cindex lock_length +Usually, when doing optimizing branch shortening, the instruction length +is calculated by avaluating the @code{length} attribute, applying +@code{ADJUST_INSN_LENGTH}, and taking the maximum of the resultant +value and the length of the instruction in the previous iteration. + +If you define the @code{lock_length} attribute, the @code{lock_length} +attribute will be evaluated, and then the maximum with of @code{lock_length} +value from the previous iteration will be formed and saved. +Then the maximum of that value with the @code{length} attribute will +be formed, and @code{ADJUST_INSN_LENGTH} will be applied. +Thus, you can allow the length to vary downwards as well as upwards +across iterations with suitable definitions of the @code{length} attribute +and/or @code{ADJUST_INSN_LENGTH}. Care has to be taken that this does not +lead to infinite loops. + @end ifset @ifset INTERNALS @node Constant Attributes Index: final.c =================================================================== --- final.c (revision 2660) +++ final.c (working copy) @@ -308,6 +308,7 @@ dbr_sequence_length (void) `insn_current_length'. */ static int *insn_lengths; +static int *uid_lock_length; VEC(int,heap) *insn_addresses_; @@ -430,14 +431,41 @@ get_attr_length_1 (rtx insn, int (*fallb return length; } +/* Calculate the maximum length of INSN. */ +static int +insn_max_length (rtx insn) +{ + int length, lock_length; + + length = insn_default_length (insn); + if (HAVE_ATTR_lock_length) + { + lock_length = insn_default_lock_length (insn); + if (length < lock_length) + length = lock_length; + } + return length; +} + /* Obtain the current length of an insn. If branch shortening has been done, get its actual length. Otherwise, get its maximum length. */ int get_attr_length (rtx insn) { - return get_attr_length_1 (insn, insn_default_length); + return get_attr_length_1 (insn, insn_max_length); } +int +get_attr_lock_length (rtx insn) +{ + if (uid_lock_length && insn_lengths_max_uid > INSN_UID (insn)) + return uid_lock_length[INSN_UID (insn)]; + return get_attr_length_1 (insn, insn_min_lock_length); +} + +#define INSN_VARIABLE_LENGTH_P(INSN) \ + (insn_variable_length_p (INSN) || insn_variable_lock_length_p (INSN)) + /* Obtain the current length of an insn. If branch shortening has been done, get its actual length. Otherwise, get its minimum length. */ int @@ -966,6 +994,11 @@ shorten_branches (rtx first) /* Allocate the rest of the arrays. */ insn_lengths = XNEWVEC (int, max_uid); insn_lengths_max_uid = max_uid; + if (HAVE_ATTR_lock_length) + uid_lock_length = XCNEWVEC (int, max_uid); + else + uid_lock_length = insn_lengths; + /* Syntax errors can lead to labels being outside of the main insn stream. Initialize insn_addresses, so that we get reproducible results. */ INSN_ADDRESSES_ALLOC (max_uid); @@ -1064,7 +1097,7 @@ shorten_branches (rtx first) #endif /* CASE_VECTOR_SHORTEN_MODE */ /* Compute initial lengths, addresses, and varying flags for each insn. */ - int (*length_fun) (rtx) = increasing ? insn_min_length : insn_default_length; + int (*length_fun) (rtx) = increasing ? insn_min_length : insn_max_length; for (insn_current_address = 0, insn = first; insn != 0; @@ -1106,7 +1139,7 @@ shorten_branches (rtx first) /* Alignment is handled by ADDR_VEC_ALIGN. */ } else if (GET_CODE (body) == ASM_INPUT || asm_noperands (body) >= 0) - insn_lengths[uid] = asm_insn_count (body) * insn_default_length (insn); + insn_lengths[uid] = asm_insn_count (body) * insn_max_length (insn); else if (GET_CODE (body) == SEQUENCE) { int i; @@ -1117,7 +1150,7 @@ shorten_branches (rtx first) const_delay_slots = 0; #endif int (*inner_length_fun) (rtx) - = const_delay_slots ? length_fun : insn_default_length; + = const_delay_slots ? length_fun : insn_max_length; /* Inside a delay slot sequence, we do not do any branch shortening if the shortening could change the number of delay slots of the branch. */ @@ -1130,7 +1163,7 @@ shorten_branches (rtx first) if (GET_CODE (body) == ASM_INPUT || asm_noperands (PATTERN (XVECEXP (body, 0, i))) >= 0) inner_length = (asm_insn_count (PATTERN (inner_insn)) - * insn_default_length (inner_insn)); + * insn_max_length (inner_insn)); else inner_length = inner_length_fun (inner_insn); @@ -1138,7 +1171,7 @@ shorten_branches (rtx first) if (const_delay_slots) { if ((varying_length[inner_uid] - = insn_variable_length_p (inner_insn)) != 0) + = INSN_VARIABLE_LENGTH_P (inner_insn)) != 0) varying_length[uid] = 1; INSN_ADDRESSES (inner_uid) = (insn_current_address + insn_lengths[uid]); @@ -1151,7 +1184,7 @@ shorten_branches (rtx first) else if (GET_CODE (body) != USE && GET_CODE (body) != CLOBBER) { insn_lengths[uid] = length_fun (insn); - varying_length[uid] = insn_variable_length_p (insn); + varying_length[uid] = INSN_VARIABLE_LENGTH_P (insn); } /* If needed, do any adjustment. */ @@ -1354,7 +1387,7 @@ shorten_branches (rtx first) { rtx inner_insn = XVECEXP (body, 0, i); int inner_uid = INSN_UID (inner_insn); - int inner_length; + int inner_length, lock_length; INSN_ADDRESSES (inner_uid) = insn_current_address; @@ -1365,16 +1398,23 @@ shorten_branches (rtx first) else inner_length = insn_current_length (inner_insn); - if (inner_length != insn_lengths[inner_uid]) + lock_length + = (HAVE_ATTR_lock_length + ? insn_current_lock_length (inner_insn) : inner_length); + if (lock_length != uid_lock_length[inner_uid]) { - if (!increasing || inner_length > insn_lengths[inner_uid]) + if (!increasing + || lock_length > uid_lock_length[inner_uid]) { - insn_lengths[inner_uid] = inner_length; + uid_lock_length[inner_uid] = lock_length; something_changed = 1; } else - inner_length = insn_lengths[inner_uid]; + lock_length = uid_lock_length[inner_uid]; } + if (inner_length < lock_length) + inner_length = lock_length; + insn_lengths[inner_uid] = inner_length; insn_current_address += inner_length; new_length += inner_length; } @@ -1382,6 +1422,17 @@ shorten_branches (rtx first) else { new_length = insn_current_length (insn); + if (HAVE_ATTR_lock_length) + { + int lock_length = insn_current_lock_length (insn); + + if (!increasing || lock_length > uid_lock_length[uid]) + uid_lock_length[uid] = lock_length; + else + lock_length = uid_lock_length[uid]; + if (new_length < lock_length) + new_length = lock_length; + } insn_current_address += new_length; } @@ -1393,7 +1444,8 @@ shorten_branches (rtx first) #endif if (new_length != insn_lengths[uid] - && (!increasing || new_length > insn_lengths[uid])) + && (!increasing || HAVE_ATTR_lock_length + || new_length > insn_lengths[uid])) { insn_lengths[uid] = new_length; something_changed = 1; @@ -1407,6 +1459,11 @@ shorten_branches (rtx first) } free (varying_length); + if (HAVE_ATTR_lock_length) + { + free (uid_lock_length); + uid_lock_length = 0; + } } /* Given the body of an INSN known to be generated by an ASM statement, return Index: genattr.c =================================================================== --- genattr.c (revision 2660) +++ genattr.c (working copy) @@ -71,6 +71,10 @@ extern int insn_default_length (rtx);\n\ extern int insn_min_length (rtx);\n\ extern int insn_variable_length_p (rtx);\n\ extern int insn_current_length (rtx);\n\n\ +extern int insn_default_lock_length (rtx);\n\ +extern int insn_min_lock_length (rtx);\n\ +extern int insn_variable_lock_length_p (rtx);\n\ +extern int insn_current_lock_length (rtx);\n\n\ #include \"insn-addr.h\"\n"); } } @@ -337,7 +341,8 @@ main (int argc, char **argv) } /* Special-purpose atributes should be tested with if, not #ifdef. */ - const char * const special_attrs[] = { "length", "enabled", 0 }; + const char * const special_attrs[] + = { "length", "lock_length", "enabled", 0 }; for (const char * const *p = special_attrs; *p; p++) { printf ("#ifndef HAVE_ATTR_%s\n" @@ -346,13 +351,19 @@ main (int argc, char **argv) } /* We make an exception here to provide stub definitions for insn_*_length* functions. */ - puts ("#if !HAVE_ATTR_length\n" - "extern int hook_int_rtx_0 (rtx);\n" + puts ("extern int hook_int_rtx_0 (rtx);\n" + "#if !HAVE_ATTR_length\n" "#define insn_default_length hook_int_rtx_0\n" "#define insn_min_length hook_int_rtx_0\n" "#define insn_variable_length_p hook_int_rtx_0\n" "#define insn_current_length hook_int_rtx_0\n" "#include \"insn-addr.h\"\n" + "#endif\n" + "#if !HAVE_ATTR_lock_length\n" + "#define insn_default_lock_length hook_int_rtx_0\n" + "#define insn_min_lock_length hook_int_rtx_0\n" + "#define insn_variable_lock_length_p hook_int_rtx_0\n" + "#define insn_current_lock_length hook_int_rtx_0\n" "#endif\n"); /* Output flag masks for use by reorg. Index: genattrtab.c =================================================================== --- genattrtab.c (revision 2660) +++ genattrtab.c (working copy) @@ -242,6 +242,7 @@ struct attr_value_list **insn_code_value static const char *alternative_name; static const char *length_str; +static const char *lock_length_str; static const char *delay_type_str; static const char *delay_1_0_str; static const char *num_delay_slots_str; @@ -1541,14 +1542,14 @@ substitute_address (rtx exp, rtx (*no_ad */ static void -make_length_attrs (void) +make_length_attrs (const char **base) { static const char *new_names[] = { - "*insn_default_length", - "*insn_min_length", - "*insn_variable_length_p", - "*insn_current_length" + "*insn_default_%s", + "*insn_min_%s", + "*insn_variable_%s_p", + "*insn_current_%s" }; static rtx (*const no_address_fn[]) (rtx) = {identity_fn,identity_fn, zero_fn, zero_fn}; @@ -1561,7 +1562,7 @@ make_length_attrs (void) /* See if length attribute is defined. If so, it must be numeric. Make it special so we don't output anything for it. */ - length_attr = find_attr (&length_str, 0); + length_attr = find_attr (base, 0); if (length_attr == 0) return; @@ -1574,11 +1575,14 @@ make_length_attrs (void) /* Make each new attribute, in turn. */ for (i = 0; i < ARRAY_SIZE (new_names); i++) { - make_internal_attr (new_names[i], + const char *p = attr_printf (strlen (new_names[i]) - 2 + strlen (*base), + new_names[i], *base); + + make_internal_attr (p, substitute_address (length_attr->default_val->value, no_address_fn[i], address_fn[i]), ATTR_NONE); - new_attr = find_attr (&new_names[i], 0); + new_attr = find_attr (&p, 0); for (av = length_attr->first_value; av; av = av->next) for (ie = av->first_insn; ie; ie = ie->next) { @@ -5179,6 +5183,7 @@ main (int argc, char **argv) alternative_name = DEF_ATTR_STRING ("alternative"); length_str = DEF_ATTR_STRING ("length"); + lock_length_str = DEF_ATTR_STRING ("lock_length"); delay_type_str = DEF_ATTR_STRING ("*delay_type"); delay_1_0_str = DEF_ATTR_STRING ("*delay_1_0"); num_delay_slots_str = DEF_ATTR_STRING ("*num_delay_slots"); @@ -5275,7 +5280,8 @@ main (int argc, char **argv) fill_attr (attr); /* Construct extra attributes for `length'. */ - make_length_attrs (); + make_length_attrs (&length_str); + make_length_attrs (&lock_length_str); /* Perform any possible optimizations to speed up compilation. */ optimize_attrs ();