| Submitter | Joern Rennecke |
|---|---|
| Date | Oct. 3, 2012, 6:22 p.m. |
| Message ID | <20121003142206.9zuttl646c808so4-nzlynne@webmail.spamcop.net> |
| Download | mbox | patch |
| Permalink | /patch/188889/ |
| State | New |
| Headers | show |
Comments
On Wed, Oct 3, 2012 at 8:22 PM, Joern Rennecke <joern.rennecke@embecosm.com> wrote: > The ARCompact architecture has some pipelining features that result in > the vanilla branch shortening not always converging. > > Moreover, there are some short, complex branch instructions with very short > offsets; replacing them with a multi-insn sequence when the offset doesn't > reach makes the code significantly longer. Thus, when starting branch > shortening with pessimistic assumptions, the short branches are often not > used because of the pessimistic branch length causing the offsets going out > of range. > This problem can be avoided when starting with a low estimate and working > upwards. However, that makes the incidence of infinite branch shortening > cycles higher, and also makes it impossible to just break out after some > iteration count. > > To address these issues, I've made the generator programs recognize the > optional lock_length attribute. > > To quote from the documentation added for this feature: > > If you define the `lock_length' attribute, branch shortening will work > the other way round: it starts out assuming minimum instruction lengths > and iterates from there. In addition, the value of the `lock_length' > attribute does not decrease across iterations, and the value computed > for the `length' attribute will be no smaller than that of the > `lock_length' attribute. I miss a few things in this description: - what is the value of lock_length supposed to be? From the "lock" prefix it sounds like it is something unchanging, maybe even constant, thus a maximum? - the length attribute still needs to be specified when lock_length is? how do they relate? Is lock_length always smaller / bigger than length? - what happens if you have patterns with lock_length and patterns without? - what patterns does lock_length apply to? In general optimistically attacking this kind of problem should be always better - did you try simply switching this for all targets? It shouldn't be slower and the only thing you need to guarantee is that during iteration you never make insn-lenghts smaller again. Richard. > bootstrapped and regression tested on i686-pc-linux-gnu > > 2012-10-03 Joern Rennecke <joern.rennecke@embecosm.com> > > * final.c (get_attr_length_1): Use direct recursion rather than > calling get_attr_length. > (get_attr_lock_length): New function. > (INSN_VARIABLE_LENGTH_P): Define. > (shorten_branches): Take HAVE_ATTR_lock_length into account. > Don't overwrite non-delay slot insn lengths with the lengths of > delay slot insns with same uid. > * genattrtab.c (lock_length_str): New variable. > (make_length_attrs): New parameter base. > (main): Initialize lock_length_str. > Generate lock_lengths attributes. > * genattr.c (gen_attr): Emit declarations for lock_length attribute > related functions. > * doc/md.texi (node Insn Lengths): Document lock_length attribute. > > Index: doc/md.texi > =================================================================== > --- doc/md.texi (revision 192036) > +++ doc/md.texi (working copy) > @@ -8004,6 +8004,20 @@ (define_insn "jump" > (const_int 6)))]) > @end smallexample > > +@cindex lock_length > +Usually, branch shortening is done assuming the worst case (i.e. longest) > +lengths, and then iterating (if optimizing) to smaller lengths till > +no further changed occur. This does not work so well for architectures > +that have very small minimum offsets and considerable jumps in instruction > +lengths. > + > +If you define the @code{lock_length} attribute, branch shortening will > +work the other way round: it starts out assuming minimum instruction > +lengths and iterates from there. In addition, the value of the > +@code{lock_length} attribute does not decrease across iterations, and > +the value computed for the @code{length} attribute will be no smaller > +than that of the @code{lock_length} attribute. > + > @end ifset > @ifset INTERNALS > @node Constant Attributes > Index: final.c > =================================================================== > --- final.c (revision 192036) > +++ final.c (working copy) > @@ -312,6 +312,7 @@ dbr_sequence_length (void) > `insn_current_length'. */ > > static int *insn_lengths; > +static char *uid_lock_length; > > VEC(int,heap) *insn_addresses_; > > @@ -447,6 +448,20 @@ get_attr_length (rtx insn) > return get_attr_length_1 (insn, insn_default_length); > } > > +#ifdef HAVE_ATTR_lock_length > +int > +get_attr_lock_length (rtx insn) > +{ > + if (uid_lock_length && insn_lengths_max_uid > INSN_UID (insn)) > + return uid_lock_length[INSN_UID (insn)]; > + return get_attr_length_1 (insn, insn_min_lock_length); > +} > +#define INSN_VARIABLE_LENGTH_P(INSN) \ > + (insn_variable_length_p (INSN) || insn_variable_lock_length_p (INSN)) > +#else > +#define INSN_VARIABLE_LENGTH_P(INSN) (insn_variable_length_p (INSN)) > +#endif > + > /* Obtain the current length of an insn. If branch shortening has been > done, > get its actual length. Otherwise, get its minimum length. */ > int > @@ -865,6 +880,7 @@ shorten_branches (rtx first ATTRIBUTE_UN > rtx body; > int uid; > rtx align_tab[MAX_CODE_ALIGN]; > + int (*length_fun) (rtx); > > #endif > > @@ -875,6 +891,10 @@ shorten_branches (rtx first ATTRIBUTE_UN > free (uid_shuid); > > uid_shuid = XNEWVEC (int, max_uid); > +#ifdef HAVE_ATTR_lock_length > + uid_lock_length = XNEWVEC (char, max_uid); > + memset (uid_lock_length, 0, max_uid); > +#endif > > if (max_labelno != max_label_num ()) > { > @@ -1067,6 +1087,10 @@ shorten_branches (rtx first ATTRIBUTE_UN > #endif /* CASE_VECTOR_SHORTEN_MODE */ > > /* Compute initial lengths, addresses, and varying flags for each insn. > */ > + length_fun = insn_default_length; > +#ifdef HAVE_ATTR_lock_length > + length_fun = insn_min_length; > +#endif > for (insn_current_address = 0, insn = first; > insn != 0; > insn_current_address += insn_lengths[uid], insn = NEXT_INSN (insn)) > @@ -1131,26 +1155,32 @@ shorten_branches (rtx first ATTRIBUTE_UN > inner_length = (asm_insn_count (PATTERN (inner_insn)) > * insn_default_length (inner_insn)); > else > - inner_length = insn_default_length (inner_insn); > + inner_length = length_fun (inner_insn); > > insn_lengths[inner_uid] = inner_length; > if (const_delay_slots) > { > if ((varying_length[inner_uid] > - = insn_variable_length_p (inner_insn)) != 0) > + = INSN_VARIABLE_LENGTH_P (inner_insn)) != 0) > varying_length[uid] = 1; > INSN_ADDRESSES (inner_uid) = (insn_current_address > + insn_lengths[uid]); > } > else > - varying_length[inner_uid] = 0; > + { > + /* We'd need to make this code a bit more complicated > + to properly support non-const-delay-slots with the > + lock_length attribute. */ > + gcc_assert (length_fun == &insn_default_length); > + varying_length[inner_uid] = 0; > + } > insn_lengths[uid] += inner_length; > } > } > else if (GET_CODE (body) != USE && GET_CODE (body) != CLOBBER) > { > - insn_lengths[uid] = insn_default_length (insn); > - varying_length[uid] = insn_variable_length_p (insn); > + insn_lengths[uid] = length_fun (insn); > + varying_length[uid] = INSN_VARIABLE_LENGTH_P (insn); > } > > /* If needed, do any adjustment. */ > @@ -1220,6 +1250,7 @@ shorten_branches (rtx first ATTRIBUTE_UN > rtx prev; > int rel_align = 0; > addr_diff_vec_flags flags; > + enum machine_mode vec_mode; > > /* Avoid automatic aggregate initialization. */ > flags = ADDR_DIFF_VEC_FLAGS (body); > @@ -1298,9 +1329,15 @@ shorten_branches (rtx first ATTRIBUTE_UN > else > max_addr += align_fuzz (max_lab, rel_lab, 0, 0); > } > - PUT_MODE (body, CASE_VECTOR_SHORTEN_MODE (min_addr - rel_addr, > - max_addr - rel_addr, > - body)); > + vec_mode = CASE_VECTOR_SHORTEN_MODE (min_addr - rel_addr, > + max_addr - rel_addr, > body); > + if (!uid_lock_length > + || !uid_lock_length[uid] > + || (GET_MODE_SIZE (vec_mode) > + >= GET_MODE_SIZE (GET_MODE (body)))) > + PUT_MODE (body, vec_mode); > + if (uid_lock_length) > + uid_lock_length[uid] = 1; > if (JUMP_TABLES_IN_TEXT_SECTION > || readonly_data_section == text_section) > { > @@ -1360,18 +1397,44 @@ shorten_branches (rtx first ATTRIBUTE_UN > else > inner_length = insn_current_length (inner_insn); > > - if (inner_length != insn_lengths[inner_uid]) > + /* We can't record lengths of delay slot insns. */ > + if (i == 0 && inner_length != insn_lengths[inner_uid]) > { > - insn_lengths[inner_uid] = inner_length; > - something_changed = 1; > +#ifdef HAVE_ATTR_lock_length > + int lock_length = insn_current_lock_length > (inner_insn); > + > + if (lock_length > uid_lock_length[inner_uid]) > + uid_lock_length[inner_uid] = lock_length; > + else > + lock_length = uid_lock_length[inner_uid]; > + if (inner_length < lock_length) > + inner_length = lock_length; > + if (inner_length != insn_lengths[inner_uid]) > +#endif > + { > + insn_lengths[inner_uid] = inner_length; > + something_changed = 1; > + } > } > - insn_current_address += insn_lengths[inner_uid]; > + insn_current_address += inner_length; > new_length += inner_length; > } > } > else > { > new_length = insn_current_length (insn); > +#ifdef HAVE_ATTR_lock_length > + { > + int lock_length = insn_current_lock_length (insn); > + > + if (lock_length > uid_lock_length[uid]) > + uid_lock_length[uid] = lock_length; > + else > + lock_length = uid_lock_length[uid]; > + if (new_length < lock_length) > + new_length = lock_length; > + } > +#endif > insn_current_address += new_length; > } > > @@ -1388,12 +1451,17 @@ shorten_branches (rtx first ATTRIBUTE_UN > something_changed = 1; > } > } > +#ifndef HAVE_ATTR_lock_length > /* For a non-optimizing compile, do only a single pass. */ > if (!optimize) > break; > +#endif > } > > free (varying_length); > + if (uid_lock_length) > + free (uid_lock_length); > + uid_lock_length = 0; > > #endif /* HAVE_ATTR_length */ > } > Index: genattr.c > =================================================================== > --- genattr.c (revision 192036) > +++ genattr.c (working copy) > @@ -71,6 +71,10 @@ extern int insn_default_length (rtx);\n\ > extern int insn_min_length (rtx);\n\ > extern int insn_variable_length_p (rtx);\n\ > extern int insn_current_length (rtx);\n\n\ > +extern int insn_default_lock_length (rtx);\n\ > +extern int insn_min_lock_length (rtx);\n\ > +extern int insn_variable_lock_length_p (rtx);\n\ > +extern int insn_current_lock_length (rtx);\n\n\ > #include \"insn-addr.h\"\n"); > } > } > Index: genattrtab.c > =================================================================== > --- genattrtab.c (revision 192036) > +++ genattrtab.c (working copy) > @@ -242,6 +242,7 @@ struct attr_value_list **insn_code_value > > static const char *alternative_name; > static const char *length_str; > +static const char *lock_length_str; > static const char *delay_type_str; > static const char *delay_1_0_str; > static const char *num_delay_slots_str; > @@ -1541,14 +1542,14 @@ substitute_address (rtx exp, rtx (*no_ad > */ > > static void > -make_length_attrs (void) > +make_length_attrs (const char **base) > { > static const char *new_names[] = > { > - "*insn_default_length", > - "*insn_min_length", > - "*insn_variable_length_p", > - "*insn_current_length" > + "*insn_default_%s", > + "*insn_min_%s", > + "*insn_variable_%s_p", > + "*insn_current_%s" > }; > static rtx (*const no_address_fn[]) (rtx) > = {identity_fn,identity_fn, zero_fn, zero_fn}; > @@ -1561,7 +1562,7 @@ make_length_attrs (void) > > /* See if length attribute is defined. If so, it must be numeric. Make > it special so we don't output anything for it. */ > - length_attr = find_attr (&length_str, 0); > + length_attr = find_attr (base, 0); > if (length_attr == 0) > return; > > @@ -1574,11 +1575,14 @@ make_length_attrs (void) > /* Make each new attribute, in turn. */ > for (i = 0; i < ARRAY_SIZE (new_names); i++) > { > - make_internal_attr (new_names[i], > + const char *p = attr_printf (strlen (new_names[i]) - 2 + strlen > (*base), > + new_names[i], *base); > + > + make_internal_attr (p, > substitute_address > (length_attr->default_val->value, > no_address_fn[i], > address_fn[i]), > ATTR_NONE); > - new_attr = find_attr (&new_names[i], 0); > + new_attr = find_attr (&p, 0); > for (av = length_attr->first_value; av; av = av->next) > for (ie = av->first_insn; ie; ie = ie->next) > { > @@ -5178,6 +5182,7 @@ main (int argc, char **argv) > > alternative_name = DEF_ATTR_STRING ("alternative"); > length_str = DEF_ATTR_STRING ("length"); > + lock_length_str = DEF_ATTR_STRING ("lock_length"); > delay_type_str = DEF_ATTR_STRING ("*delay_type"); > delay_1_0_str = DEF_ATTR_STRING ("*delay_1_0"); > num_delay_slots_str = DEF_ATTR_STRING ("*num_delay_slots"); > @@ -5274,7 +5279,8 @@ main (int argc, char **argv) > fill_attr (attr); > > /* Construct extra attributes for `length'. */ > - make_length_attrs (); > + make_length_attrs (&length_str); > + make_length_attrs (&lock_length_str); > > /* Perform any possible optimizations to speed up compilation. */ > optimize_attrs (); >
Patch
Index: doc/md.texi =================================================================== --- doc/md.texi (revision 192036) +++ doc/md.texi (working copy) @@ -8004,6 +8004,20 @@ (define_insn "jump" (const_int 6)))]) @end smallexample +@cindex lock_length +Usually, branch shortening is done assuming the worst case (i.e. longest) +lengths, and then iterating (if optimizing) to smaller lengths till +no further changed occur. This does not work so well for architectures +that have very small minimum offsets and considerable jumps in instruction +lengths. + +If you define the @code{lock_length} attribute, branch shortening will +work the other way round: it starts out assuming minimum instruction +lengths and iterates from there. In addition, the value of the +@code{lock_length} attribute does not decrease across iterations, and +the value computed for the @code{length} attribute will be no smaller +than that of the @code{lock_length} attribute. + @end ifset @ifset INTERNALS @node Constant Attributes Index: final.c =================================================================== --- final.c (revision 192036) +++ final.c (working copy) @@ -312,6 +312,7 @@ dbr_sequence_length (void) `insn_current_length'. */ static int *insn_lengths; +static char *uid_lock_length; VEC(int,heap) *insn_addresses_; @@ -447,6 +448,20 @@ get_attr_length (rtx insn) return get_attr_length_1 (insn, insn_default_length); } +#ifdef HAVE_ATTR_lock_length +int +get_attr_lock_length (rtx insn) +{ + if (uid_lock_length && insn_lengths_max_uid > INSN_UID (insn)) + return uid_lock_length[INSN_UID (insn)]; + return get_attr_length_1 (insn, insn_min_lock_length); +} +#define INSN_VARIABLE_LENGTH_P(INSN) \ + (insn_variable_length_p (INSN) || insn_variable_lock_length_p (INSN)) +#else +#define INSN_VARIABLE_LENGTH_P(INSN) (insn_variable_length_p (INSN)) +#endif + /* Obtain the current length of an insn. If branch shortening has been done, get its actual length. Otherwise, get its minimum length. */ int @@ -865,6 +880,7 @@ shorten_branches (rtx first ATTRIBUTE_UN rtx body; int uid; rtx align_tab[MAX_CODE_ALIGN]; + int (*length_fun) (rtx); #endif @@ -875,6 +891,10 @@ shorten_branches (rtx first ATTRIBUTE_UN free (uid_shuid); uid_shuid = XNEWVEC (int, max_uid); +#ifdef HAVE_ATTR_lock_length + uid_lock_length = XNEWVEC (char, max_uid); + memset (uid_lock_length, 0, max_uid); +#endif if (max_labelno != max_label_num ()) { @@ -1067,6 +1087,10 @@ shorten_branches (rtx first ATTRIBUTE_UN #endif /* CASE_VECTOR_SHORTEN_MODE */ /* Compute initial lengths, addresses, and varying flags for each insn. */ + length_fun = insn_default_length; +#ifdef HAVE_ATTR_lock_length + length_fun = insn_min_length; +#endif for (insn_current_address = 0, insn = first; insn != 0; insn_current_address += insn_lengths[uid], insn = NEXT_INSN (insn)) @@ -1131,26 +1155,32 @@ shorten_branches (rtx first ATTRIBUTE_UN inner_length = (asm_insn_count (PATTERN (inner_insn)) * insn_default_length (inner_insn)); else - inner_length = insn_default_length (inner_insn); + inner_length = length_fun (inner_insn); insn_lengths[inner_uid] = inner_length; if (const_delay_slots) { if ((varying_length[inner_uid] - = insn_variable_length_p (inner_insn)) != 0) + = INSN_VARIABLE_LENGTH_P (inner_insn)) != 0) varying_length[uid] = 1; INSN_ADDRESSES (inner_uid) = (insn_current_address + insn_lengths[uid]); } else - varying_length[inner_uid] = 0; + { + /* We'd need to make this code a bit more complicated + to properly support non-const-delay-slots with the + lock_length attribute. */ + gcc_assert (length_fun == &insn_default_length); + varying_length[inner_uid] = 0; + } insn_lengths[uid] += inner_length; } } else if (GET_CODE (body) != USE && GET_CODE (body) != CLOBBER) { - insn_lengths[uid] = insn_default_length (insn); - varying_length[uid] = insn_variable_length_p (insn); + insn_lengths[uid] = length_fun (insn); + varying_length[uid] = INSN_VARIABLE_LENGTH_P (insn); } /* If needed, do any adjustment. */ @@ -1220,6 +1250,7 @@ shorten_branches (rtx first ATTRIBUTE_UN rtx prev; int rel_align = 0; addr_diff_vec_flags flags; + enum machine_mode vec_mode; /* Avoid automatic aggregate initialization. */ flags = ADDR_DIFF_VEC_FLAGS (body); @@ -1298,9 +1329,15 @@ shorten_branches (rtx first ATTRIBUTE_UN else max_addr += align_fuzz (max_lab, rel_lab, 0, 0); } - PUT_MODE (body, CASE_VECTOR_SHORTEN_MODE (min_addr - rel_addr, - max_addr - rel_addr, - body)); + vec_mode = CASE_VECTOR_SHORTEN_MODE (min_addr - rel_addr, + max_addr - rel_addr, body); + if (!uid_lock_length + || !uid_lock_length[uid] + || (GET_MODE_SIZE (vec_mode) + >= GET_MODE_SIZE (GET_MODE (body)))) + PUT_MODE (body, vec_mode); + if (uid_lock_length) + uid_lock_length[uid] = 1; if (JUMP_TABLES_IN_TEXT_SECTION || readonly_data_section == text_section) { @@ -1360,18 +1397,44 @@ shorten_branches (rtx first ATTRIBUTE_UN else inner_length = insn_current_length (inner_insn); - if (inner_length != insn_lengths[inner_uid]) + /* We can't record lengths of delay slot insns. */ + if (i == 0 && inner_length != insn_lengths[inner_uid]) { - insn_lengths[inner_uid] = inner_length; - something_changed = 1; +#ifdef HAVE_ATTR_lock_length + int lock_length = insn_current_lock_length (inner_insn); + + if (lock_length > uid_lock_length[inner_uid]) + uid_lock_length[inner_uid] = lock_length; + else + lock_length = uid_lock_length[inner_uid]; + if (inner_length < lock_length) + inner_length = lock_length; + if (inner_length != insn_lengths[inner_uid]) +#endif + { + insn_lengths[inner_uid] = inner_length; + something_changed = 1; + } } - insn_current_address += insn_lengths[inner_uid]; + insn_current_address += inner_length; new_length += inner_length; } } else { new_length = insn_current_length (insn); +#ifdef HAVE_ATTR_lock_length + { + int lock_length = insn_current_lock_length (insn); + + if (lock_length > uid_lock_length[uid]) + uid_lock_length[uid] = lock_length; + else + lock_length = uid_lock_length[uid]; + if (new_length < lock_length) + new_length = lock_length; + } +#endif insn_current_address += new_length; } @@ -1388,12 +1451,17 @@ shorten_branches (rtx first ATTRIBUTE_UN something_changed = 1; } } +#ifndef HAVE_ATTR_lock_length /* For a non-optimizing compile, do only a single pass. */ if (!optimize) break; +#endif } free (varying_length); + if (uid_lock_length) + free (uid_lock_length); + uid_lock_length = 0; #endif /* HAVE_ATTR_length */ } Index: genattr.c =================================================================== --- genattr.c (revision 192036) +++ genattr.c (working copy) @@ -71,6 +71,10 @@ extern int insn_default_length (rtx);\n\ extern int insn_min_length (rtx);\n\ extern int insn_variable_length_p (rtx);\n\ extern int insn_current_length (rtx);\n\n\ +extern int insn_default_lock_length (rtx);\n\ +extern int insn_min_lock_length (rtx);\n\ +extern int insn_variable_lock_length_p (rtx);\n\ +extern int insn_current_lock_length (rtx);\n\n\ #include \"insn-addr.h\"\n"); } } Index: genattrtab.c =================================================================== --- genattrtab.c (revision 192036) +++ genattrtab.c (working copy) @@ -242,6 +242,7 @@ struct attr_value_list **insn_code_value static const char *alternative_name; static const char *length_str; +static const char *lock_length_str; static const char *delay_type_str; static const char *delay_1_0_str; static const char *num_delay_slots_str; @@ -1541,14 +1542,14 @@ substitute_address (rtx exp, rtx (*no_ad */ static void -make_length_attrs (void) +make_length_attrs (const char **base) { static const char *new_names[] = { - "*insn_default_length", - "*insn_min_length", - "*insn_variable_length_p", - "*insn_current_length" + "*insn_default_%s", + "*insn_min_%s", + "*insn_variable_%s_p", + "*insn_current_%s" }; static rtx (*const no_address_fn[]) (rtx) = {identity_fn,identity_fn, zero_fn, zero_fn}; @@ -1561,7 +1562,7 @@ make_length_attrs (void) /* See if length attribute is defined. If so, it must be numeric. Make it special so we don't output anything for it. */ - length_attr = find_attr (&length_str, 0); + length_attr = find_attr (base, 0); if (length_attr == 0) return; @@ -1574,11 +1575,14 @@ make_length_attrs (void) /* Make each new attribute, in turn. */ for (i = 0; i < ARRAY_SIZE (new_names); i++) { - make_internal_attr (new_names[i], + const char *p = attr_printf (strlen (new_names[i]) - 2 + strlen (*base), + new_names[i], *base); + + make_internal_attr (p, substitute_address (length_attr->default_val->value, no_address_fn[i], address_fn[i]), ATTR_NONE); - new_attr = find_attr (&new_names[i], 0); + new_attr = find_attr (&p, 0); for (av = length_attr->first_value; av; av = av->next) for (ie = av->first_insn; ie; ie = ie->next) { @@ -5178,6 +5182,7 @@ main (int argc, char **argv) alternative_name = DEF_ATTR_STRING ("alternative"); length_str = DEF_ATTR_STRING ("length"); + lock_length_str = DEF_ATTR_STRING ("lock_length"); delay_type_str = DEF_ATTR_STRING ("*delay_type"); delay_1_0_str = DEF_ATTR_STRING ("*delay_1_0"); num_delay_slots_str = DEF_ATTR_STRING ("*num_delay_slots"); @@ -5274,7 +5279,8 @@ main (int argc, char **argv) fill_attr (attr); /* Construct extra attributes for `length'. */ - make_length_attrs (); + make_length_attrs (&length_str); + make_length_attrs (&lock_length_str); /* Perform any possible optimizations to speed up compilation. */ optimize_attrs ();
The ARCompact architecture has some pipelining features that result in the vanilla branch shortening not always converging. Moreover, there are some short, complex branch instructions with very short offsets; replacing them with a multi-insn sequence when the offset doesn't reach makes the code significantly longer. Thus, when starting branch shortening with pessimistic assumptions, the short branches are often not used because of the pessimistic branch length causing the offsets going out of range. This problem can be avoided when starting with a low estimate and working upwards. However, that makes the incidence of infinite branch shortening cycles higher, and also makes it impossible to just break out after some iteration count. To address these issues, I've made the generator programs recognize the optional lock_length attribute. To quote from the documentation added for this feature: If you define the `lock_length' attribute, branch shortening will work the other way round: it starts out assuming minimum instruction lengths and iterates from there. In addition, the value of the `lock_length' attribute does not decrease across iterations, and the value computed for the `length' attribute will be no smaller than that of the `lock_length' attribute. bootstrapped and regression tested on i686-pc-linux-gnu 2012-10-03 Joern Rennecke <joern.rennecke@embecosm.com> * final.c (get_attr_length_1): Use direct recursion rather than calling get_attr_length. (get_attr_lock_length): New function. (INSN_VARIABLE_LENGTH_P): Define. (shorten_branches): Take HAVE_ATTR_lock_length into account. Don't overwrite non-delay slot insn lengths with the lengths of delay slot insns with same uid. * genattrtab.c (lock_length_str): New variable. (make_length_attrs): New parameter base. (main): Initialize lock_length_str. Generate lock_lengths attributes. * genattr.c (gen_attr): Emit declarations for lock_length attribute related functions. * doc/md.texi (node Insn Lengths): Document lock_length attribute.