From patchwork Tue Jun 15 16:26:21 2010
X-Patchwork-Submitter: Michael Matz
X-Patchwork-Id: 55763
Date: Tue, 15 Jun 2010 18:26:21 +0200 (CEST)
From: Michael Matz
To: gcc-patches@gcc.gnu.org
Subject: Speed up genattrtab

Hi,

okay, let's try again, after many years.  Maybe this time :)

This speeds up genattrtab so that it is no longer a time issue during bootstrap.  Over the years I have worked on many approaches to this.  My first one was to throttle down the optimization, then I completely removed the optimization, then I implemented different kinds of optimizations, then I combined them with the throttled-down ones, and now I'm back to more or less only throttling the optimizations.
Obviously, switching off all optimizations gives the fastest combination of genattrtab plus compiling insn-attrtab.c.  But it has some effect on the overall speed of the compiler.  Not too bad, as I rectified this a bit, but maybe too much to be acceptable to everyone.  Still, for all development during the last years I have used a genattrtab modified this way, and it's really nice :)

Now, after much benchmarking last week, I'm proposing the patch below.  It does not get rid of all optimizations, but throttles them significantly (and reorders the computation so that lowering the limits doesn't have too much effect for small attributes).

Numbers follow.  First the architecture, then the version of genattrtab, then four numbers:

  gen_u == seconds to run an optimized (!) genattrtab
  st1_u == seconds to compile the generated insn-attrtab.c with an optimized cc1
  big_u == seconds to compile an artificial piece of code that generates large functions, many loops, scheduling opportunities
  kde_u == seconds to compile kdecore.cc, a one-file variant of an older version of libkdecore

The genattrtab versions are:

  clean == as in SVN
  try3  == no call to optimize_attrs, otherwise same as proposed patch
  try   == the proposed variant

These measurements were taken on genattrtab versions that didn't contain the latest changes to support enum attributes, but those have no speed effect (I've checked for some combinations).
arch     name   gen_u  st1_u  big_u  kde_u
alpha    clean      0   0.75  43.21  32.52
alpha    try3       0   0.26  44.19  32.85
alpha    try        0   0.35  43.26  32.50
arm      clean      6  19.66  49.25  37.89
arm      try3       0   1.78  50.04  38.00
arm      try        2   2.35  49.85  38.10
crisv32  clean      0   0.21  36.17  27.33
crisv32  try3       0   0.15  36.49  27.53
crisv32  try        0   0.23  36.21  27.43
hppa     clean      0   1.11  46.77  31.97
hppa     try3       0   0.58  46.85  32.00
hppa     try        0   0.64  46.97  31.84
i386     clean     38  34.25  33.51  29.93
i386     try3       1   1.99  34.26  30.64
i386     try        6   2.31  33.78  30.12
ia64     clean      1   1.88  66.55  49.81
ia64     try3       0   0.71  67.08  50.33
ia64     try        0   0.95  66.62  49.81
mips     clean     74  17.08  51.23      -
mips     try3       0   1.74  52.11      -
mips     try        4   2.29  50.77      -
powerpc  clean     56  48.59  49.74  34.15
powerpc  try3       0   2.59  50.60  34.97
powerpc  try        5   2.17  49.38  34.71
s390x    clean      0   1.82  47.26  32.83
s390x    try3       0   0.62  47.63  33.75
s390x    try        0   0.78  47.41  33.64
sh       clean      0   1.46  50.99  38.09
sh       try3       0   0.68  51.05  38.30
sh       try        0   0.91  50.79  38.13
sparc    clean      0   1.11  44.21  32.98
sparc    try3       0   0.57  44.45  33.22
sparc    try        0   0.57  43.27  32.93
x86_64   clean     52  43.81  28.78  28.72
x86_64   try3       1   2.16  29.22  29.40
x86_64   try        6   2.98  28.96  28.96

(mips wasn't able to compile kdecore.cc).  These are all cross compilers to $arch-linux, all running on the same host machine (an x86_64-linux iCore7 machine).

As said, I'm proposing "try", so compare the first and third lines for each architecture.  It will hugely help i386, mips, powerpc and x86_64, and arm a bit; the others aren't a problem right now anyway.  The speed difference of the compiler is acceptable I think, actually even speeding up the compiler sometimes (probably cache effects, because the .text size of insn-attrtab is _much_ smaller) or being in the noise.

So, if included, we go from 95 seconds to 9 seconds for x86_64 for an optimized cc1; the difference will be even larger for stage2 (using a possibly unoptimized cc1).

As said, I have been bootstrapping with variants of this for years, but of course I'm currently regstrapping this on x86_64-linux.  Okay for trunk?

Ciao,
Michael.
Index: genattrtab.c =================================================================== --- genattrtab.c (revision 160784) +++ genattrtab.c (working copy) @@ -241,6 +241,7 @@ static const char *length_str; static const char *delay_type_str; static const char *delay_1_0_str; static const char *num_delay_slots_str; +static const char *insn_code_str; /* Simplify an expression. Only call the routine if there is something to simplify. */ @@ -1621,6 +1622,58 @@ write_length_unit_log (void) printf ("EXPORTED_CONST int length_unit_log = %u;\n", length_unit_log); } +/* Compute approximate cost of the expression. Used to decide whether + expression is cheap enough for inline. */ +static int +attr_rtx_cost (rtx x) +{ + int cost = 1; + enum rtx_code code; + if (!x) + return 0; + code = GET_CODE (x); + switch (code) + { + case MATCH_OPERAND: + if (XSTR (x, 1)[0]) + return 10; + else + return 1; + + case EQ_ATTR_ALT: + return 1; + + case EQ_ATTR: + /* Alternatives don't result into function call. */ + if (!strcmp_check (XSTR (x, 0), alternative_name) + || !strcmp_check (XSTR (x, 0), insn_code_str)) + return 1; + else + return 5; + default: + { + int i, j; + const char *fmt = GET_RTX_FORMAT (code); + for (i = GET_RTX_LENGTH (code) - 1; i >= 0; i--) + { + switch (fmt[i]) + { + case 'V': + case 'E': + for (j = 0; j < XVECLEN (x, i); j++) + cost += attr_rtx_cost (XVECEXP (x, i, j)); + break; + case 'e': + cost += attr_rtx_cost (XEXP (x, i)); + break; + } + } + } + break; + } + return cost; +} + /* Take a COND expression and see if any of the conditions in it can be simplified. If any are known true or known false for the particular insn code, the COND can be further simplified. @@ -1744,6 +1797,111 @@ simplify_cond (rtx exp, int insn_code, i return ret; } +/* Taken a COND expression EXP and a value AV, and returns a possibly + optimized variant of EXP. 
This is similar to simplify_cond, but + instead of optimizing for just one instruction we optimize for the + set of instructions found in AV. */ + +static rtx +simplify_cond_insn_list (rtx exp, struct attr_value *av) +{ + int i; + /* We store the desired contents here, + then build a new expression if they don't match EXP. */ + rtx defval = XEXP (exp, 1); + rtx new_defval = XEXP (exp, 1); + int len = XVECLEN (exp, 0); + rtx *tests; + int allsame = 1; + rtx ret; + + if (av->has_asm_insn) + return exp; + + tests = XNEWVEC (rtx, len); + /* This lets us free all storage allocated below, if appropriate. */ + obstack_finish (rtl_obstack); + + memcpy (tests, XVEC (exp, 0)->elem, len * sizeof (rtx)); + + /* See if default value needs simplification. */ + if (GET_CODE (defval) == COND) + new_defval = simplify_cond_insn_list (defval, av); + + for (i = 0; i < len; i += 2) + { + rtx newtest, newval; + struct insn_ent *ie; + int oldcost; + + oldcost = attr_rtx_cost (XVECEXP (exp, 0, i)); + tests[i] = false_rtx; + for (ie = av->first_insn; ie != 0; ie = ie->next) + { + rtx orig_test = XVECEXP (exp, 0, i); + gcc_assert (ie->def->insn_code >= 0); + newtest = simplify_test_exp_in_temp (orig_test, ie->def->insn_code, + ie->def->insn_index); + /* We couldn't simplify the expression for this insn code, + so it makes no sense to try further, as syntactically we + can't shorten the test anymore. Further, also the + simplifications done up to now can be removed. */ + if (newtest == orig_test) + { + tests[i] = orig_test; + break; + } + if (newtest != false_rtx) + { + newtest = attr_rtx (AND, + attr_eq (insn_code_str, + attr_numeral (ie->def->insn_code)), + newtest); + tests[i] = insert_right_side (IOR, tests[i], newtest, -2, -2); + } + } + + if (attr_rtx_cost (tests[i]) > oldcost) + tests[i] = XVECEXP (exp, 0, i); + + /* Simplify this test. 
*/ + newtest = simplify_test_exp_in_temp (tests[i], -2, -2); + tests[i] = newtest; + + newval = tests[i + 1]; + /* See if this value may need simplification. */ + if (GET_CODE (newval) == COND) + newval = simplify_cond_insn_list (newval, av); + + tests[i + 1] = newval; + } + + /* See if we changed anything. */ + if (len != XVECLEN (exp, 0) || new_defval != XEXP (exp, 1)) + allsame = 0; + else + for (i = 0; i < len; i++) + if (! attr_equal_p (tests[i], XVECEXP (exp, 0, i))) + { + allsame = 0; + break; + } + + if (allsame) + ret = exp; + else + { + rtx newexp = rtx_alloc (COND); + + XVEC (newexp, 0) = rtvec_alloc (len); + memcpy (XVEC (newexp, 0)->elem, tests, len * sizeof (rtx)); + XEXP (newexp, 1) = new_defval; + ret = newexp; + } + free (tests); + return ret; +} + /* Remove an insn entry from an attribute value. */ static void @@ -2216,57 +2374,6 @@ simplify_or_tree (rtx exp, rtx *pterm, i return exp; } -/* Compute approximate cost of the expression. Used to decide whether - expression is cheap enough for inline. */ -static int -attr_rtx_cost (rtx x) -{ - int cost = 0; - enum rtx_code code; - if (!x) - return 0; - code = GET_CODE (x); - switch (code) - { - case MATCH_OPERAND: - if (XSTR (x, 1)[0]) - return 10; - else - return 0; - - case EQ_ATTR_ALT: - return 0; - - case EQ_ATTR: - /* Alternatives don't result into function call. */ - if (!strcmp_check (XSTR (x, 0), alternative_name)) - return 0; - else - return 5; - default: - { - int i, j; - const char *fmt = GET_RTX_FORMAT (code); - for (i = GET_RTX_LENGTH (code) - 1; i >= 0; i--) - { - switch (fmt[i]) - { - case 'V': - case 'E': - for (j = 0; j < XVECLEN (x, i); j++) - cost += attr_rtx_cost (XVECEXP (x, i, j)); - break; - case 'e': - cost += attr_rtx_cost (XEXP (x, i)); - break; - } - } - } - break; - } - return cost; -} - /* Simplify test expression and use temporary obstack in order to avoid memory bloat. Use ATTR_IND_SIMPLIFIED to avoid unnecessary simplifications and avoid unnecessary copying if possible. 
*/ @@ -2599,6 +2706,25 @@ simplify_test_exp (rtx exp, int insn_cod return SIMPLIFY_TEST_EXP (newexp, insn_code, insn_index); } + /* Similarly, + convert (ior (and (y) (x)) + (and (z) (x))) + to (and (ior (y) (z)) + (x)) + Note that we want the common term to stay at the end. + */ + + else if (GET_CODE (left) == AND && GET_CODE (right) == AND + && attr_equal_p (XEXP (left, 1), XEXP (right, 1))) + { + newexp = attr_rtx (IOR, XEXP (left, 0), XEXP (right, 0)); + + left = newexp; + right = XEXP (right, 1); + newexp = attr_rtx (AND, left, right); + return SIMPLIFY_TEST_EXP (newexp, insn_code, insn_index); + } + /* See if all or all but one of the insn's alternatives are specified in this tree. Optimize if so. */ @@ -2702,6 +2828,13 @@ simplify_test_exp (rtx exp, int insn_cod break; } + if (XSTR (exp, 0) == insn_code_str) + { + if (insn_code >= 0) + newexp = atoi (XSTR (exp, 1)) == insn_code ? true_rtx : false_rtx; + break; + } + /* Look at the value for this insn code in the specified attribute. We normally can replace this comparison with the condition that would give this insn the values being tested for. */ @@ -2734,7 +2867,7 @@ simplify_test_exp (rtx exp, int insn_cod x = evaluate_eq_attr (exp, attr, av->value, insn_code, insn_index); x = SIMPLIFY_TEST_EXP (x, insn_code, insn_index); - if (attr_rtx_cost(x) < 20) + if (attr_rtx_cost(x) < 7) return x; } } @@ -2754,6 +2887,131 @@ simplify_test_exp (rtx exp, int insn_cod return newexp; } +/* Return 1 if any EQ_ATTR subexpression of P refers to ATTR, + otherwise return 0. 
*/ + +static int +tests_attr_p (rtx p, struct attr_desc *attr) +{ + const char *fmt; + int i, ie, j, je; + + if (GET_CODE (p) == EQ_ATTR) + { + if (XSTR (p, 0) != attr->name) + return 0; + return 1; + } + + fmt = GET_RTX_FORMAT (GET_CODE (p)); + ie = GET_RTX_LENGTH (GET_CODE (p)); + for (i = 0; i < ie; i++) + { + switch (*fmt++) + { + case 'e': + if (tests_attr_p (XEXP (p, i), attr)) + return 1; + break; + + case 'E': + je = XVECLEN (p, i); + for (j = 0; j < je; ++j) + if (tests_attr_p (XVECEXP (p, i, j), attr)) + return 1; + break; + } + } + + return 0; +} + +/* Calculate a topological sorting of all attributes so that + all attributes only depend on attributes in front of it. + Place the result in *RET (which is a pointer to an array of + attr_desc pointers), and return the size of that array. */ + +static int +get_attr_order (struct attr_desc ***ret) +{ + int i, j; + int num = 0; + struct attr_desc *attr; + struct attr_desc **all, **sorted; + char *handled; + for (i = 0; i < MAX_ATTRS_INDEX; i++) + for (attr = attrs[i]; attr; attr = attr->next) + num++; + all = XNEWVEC (struct attr_desc *, num); + sorted = XNEWVEC (struct attr_desc *, num); + handled = XCNEWVEC (char, num); + num = 0; + for (i = 0; i < MAX_ATTRS_INDEX; i++) + for (attr = attrs[i]; attr; attr = attr->next) + all[num++] = attr; + + j = 0; + for (i = 0; i < num; i++) + if (all[i]->is_const) + handled[i] = 1, sorted[j++] = all[i]; + + /* We have only few attributes hence we can live with the inner + loop being O(n^2), unlike the normal fast variants of topological + sorting. */ + while (j < num) + { + for (i = 0; i < num; i++) + if (!handled[i]) + { + /* Let's see if I depends on anything interesting. */ + int k; + for (k = 0; k < num; k++) + if (!handled[k]) + { + struct attr_value *av; + for (av = all[i]->first_value; av; av = av->next) + if (av->num_insns != 0) + if (tests_attr_p (av->value, all[k])) + break; + + if (av) + /* Something in I depends on K. 
*/ + break; + } + if (k == num) + { + /* Nothing in I depended on anything intersting, so + it's done. */ + handled[i] = 1; + sorted[j++] = all[i]; + } + } + } + + for (j = 0; j < num; j++) + { + struct attr_desc *attr2; + struct attr_value *av; + + attr = sorted[j]; + fprintf (stderr, "%s depends on: ", attr->name); + for (i = 0; i < MAX_ATTRS_INDEX; ++i) + for (attr2 = attrs[i]; attr2; attr2 = attr2->next) + if (!attr2->is_const) + for (av = attr->first_value; av; av = av->next) + if (av->num_insns != 0) + if (tests_attr_p (av->value, attr2)) + { + fprintf (stderr, "%s, ", attr2->name); + break; + } + fprintf (stderr, "\n"); + } + free (all); + *ret = sorted; + return num; +} + /* Optimize the attribute lists by seeing if we can determine conditional values from the known values of other attributes. This will save subroutine calls during the compilation. */ @@ -2768,6 +3026,8 @@ optimize_attrs (void) int i; struct attr_value_list *ivbuf; struct attr_value_list *iv; + struct attr_desc **topsort; + int topnum; /* For each insn code, make a list of all the insn_ent's for it, for all values for all attributes. */ @@ -2783,18 +3043,22 @@ optimize_attrs (void) iv = ivbuf = XNEWVEC (struct attr_value_list, num_insn_ents); - for (i = 0; i < MAX_ATTRS_INDEX; i++) - for (attr = attrs[i]; attr; attr = attr->next) - for (av = attr->first_value; av; av = av->next) - for (ie = av->first_insn; ie; ie = ie->next) - { - iv->attr = attr; - iv->av = av; - iv->ie = ie; - iv->next = insn_code_values[ie->def->insn_code]; - insn_code_values[ie->def->insn_code] = iv; - iv++; - } + /* Create the chain of insn*attr values such that we see dependend + attributes after their dependencies. As we use a stack via the + next pointers start from the end of the topological order. 
*/ + topnum = get_attr_order (&topsort); + for (i = topnum - 1; i >= 0; i--) + for (av = topsort[i]->first_value; av; av = av->next) + for (ie = av->first_insn; ie; ie = ie->next) + { + iv->attr = topsort[i]; + iv->av = av; + iv->ie = ie; + iv->next = insn_code_values[ie->def->insn_code]; + insn_code_values[ie->def->insn_code] = iv; + iv++; + } + free (topsort); /* Sanity check on num_insn_ents. */ gcc_assert (iv == ivbuf + num_insn_ents); @@ -2829,7 +3093,15 @@ optimize_attrs (void) } rtl_obstack = old; - if (newexp != av->value) + /* If we created a new value for this instruction, and it's + cheaper than the old value, and overall cheap, use that + one as specific value for the current instruction. + The last test is to avoid exploding the get_attr_ function + sizes for no much gain. */ + if (newexp != av->value + && attr_rtx_cost (newexp) < attr_rtx_cost (av->value) + && attr_rtx_cost (newexp) < 26 + ) { newexp = attr_copy_rtx (newexp); remove_insn_ent (av, ie); @@ -3332,6 +3604,12 @@ write_test_expr (rtx exp, int flags) break; } + if (XSTR (exp, 0) == insn_code_str) + { + printf ("insn_code == %s", XSTR (exp, 1)); + break; + } + attr = find_attr (&XSTR (exp, 0), 0); gcc_assert (attr); @@ -3618,10 +3896,80 @@ walk_attr_value (rtx exp) } } -/* Write out a function to obtain the attribute for a given INSN. */ +/* If a subexpression of P refers to ATTR, write out C code to retrieve + the value of that attribute storing it in a local variable. 
*/ + +static int +write_expr_attr_cache (rtx p, struct attr_desc *attr, int indent) +{ + const char *fmt; + int i, ie, j, je; + + if (GET_CODE (p) == EQ_ATTR) + { + if (XSTR (p, 0) != attr->name) + return 0; + + write_indent (indent); + if (attr->enum_name) + printf (" enum %s ", attr->enum_name); + else if (!attr->is_numeric) + printf (" enum attr_%s ", attr->name); + else + printf (" int "); + + printf ("attr_%s ATTRIBUTE_UNUSED = get_attr_%s (insn);\n", + attr->name, attr->name); + return 1; + } + + fmt = GET_RTX_FORMAT (GET_CODE (p)); + ie = GET_RTX_LENGTH (GET_CODE (p)); + for (i = 0; i < ie; i++) + { + switch (*fmt++) + { + case 'e': + if (write_expr_attr_cache (XEXP (p, i), attr, indent)) + return 1; + break; + + case 'E': + je = XVECLEN (p, i); + for (j = 0; j < je; ++j) + if (write_expr_attr_cache (XVECEXP (p, i, j), attr, indent)) + return 1; + break; + } + } + + return 0; +} + +/* Given a VALUE (possibly having EQ_ATTR tests in subexpression) + write out C code to retrieve the value of all attributes used + in tests embedded therein. */ static void -write_attr_get (struct attr_desc *attr) +write_cache_used_attr_for_value (rtx value, int indent) +{ + struct attr_desc *attr2; + int i; + + for (i = 0; i < MAX_ATTRS_INDEX; ++i) + for (attr2 = attrs[i]; attr2; attr2 = attr2->next) + if (!attr2->is_const) + write_expr_attr_cache (value, attr2, indent); +} + +/* Write out C code to calculate the value of ATTR per instruction. + PREFIX and SUFFIX are used to delimit the 'return' statement delivering + the value to our caller. Either it will form a real return statement, + or an accumulator update. */ + +static void +write_attr_switch (struct attr_desc *attr, int indent, const char *prefix, + const char *suffix) { struct attr_value *av, *common_av; @@ -3629,6 +3977,32 @@ write_attr_get (struct attr_desc *attr) switch we will generate. 
*/ common_av = find_most_used (attr); + write_indent (indent); + printf ("{\n"); + write_indent (indent); + printf (" int insn_code = recog_memoized (insn);\n"); + write_indent (indent); + printf (" switch (insn_code)\n"); + write_indent (indent); + printf (" {\n"); + + for (av = attr->first_value; av; av = av->next) + if (av != common_av) + write_attr_case (attr, av, 1, prefix, suffix, indent + 4, true_rtx); + + write_attr_case (attr, common_av, 0, prefix, suffix, indent + 4, true_rtx); + + write_indent (indent); + printf (" }\n"); + write_indent (indent); + printf ("}\n\n"); +} + +/* Write out a function to obtain the attribute for a given INSN. */ + +static void +write_attr_get (struct attr_desc *attr) +{ /* Write out start of function, then all values with explicit `case' lines, then a `default', then the value with the most uses. */ if (attr->enum_name) @@ -3646,6 +4020,7 @@ write_attr_get (struct attr_desc *attr) printf ("get_attr_%s (rtx insn ATTRIBUTE_UNUSED)\n", attr->name); else { + struct attr_value *av; printf ("get_attr_%s (void)\n", attr->name); printf ("{\n"); @@ -3656,22 +4031,13 @@ write_attr_get (struct attr_desc *attr) av->first_insn->def->insn_index); else if (av->num_insns != 0) write_attr_set (attr, 2, av->value, "return", ";", - true_rtx, -2, 0); + true_rtx, -2, -2); printf ("}\n\n"); return; } - printf ("{\n"); - printf (" switch (recog_memoized (insn))\n"); - printf (" {\n"); - - for (av = attr->first_value; av; av = av->next) - if (av != common_av) - write_attr_case (attr, av, 1, "return", ";", 4, true_rtx); - - write_attr_case (attr, common_av, 0, "return", ";", 4, true_rtx); - printf (" }\n}\n\n"); + write_attr_switch (attr, 0, "return", ";"); } /* Given an AND tree of known true terms (because we are inside an `if' with @@ -3727,6 +4093,10 @@ write_attr_set (struct attr_desc *attr, rtx testexp; rtx inner_true; + /* Reset our_known_true after some time to not accumulate + too much cruft (slowing down genattrtab). 
*/ + if ((i & 31) == 0) + our_known_true = known_true; testexp = eliminate_known_true (our_known_true, XVECEXP (value, 0, i), insn_code, insn_index); @@ -3755,7 +4125,7 @@ write_attr_set (struct attr_desc *attr, write_indent (indent); printf ("%sif ", first_if ? "" : "else "); first_if = 0; - write_test_expr (testexp, 0); + write_test_expr (testexp, 2); printf ("\n"); write_indent (indent + 2); printf ("{\n"); @@ -3820,6 +4190,7 @@ write_attr_case (struct attr_desc *attr, int write_case_lines, const char *prefix, const char *suffix, int indent, rtx known_true) { + rtx opt_val; if (av->num_insns == 0) return; @@ -3843,9 +4214,33 @@ write_attr_case (struct attr_desc *attr, printf ("default:\n"); } + indent += 2; + write_indent (indent); + printf ("{\n"); + + opt_val = av->value; + /* If we have multiple but only few instructions associated with + this value we can possibly optimize this. */ + if (GET_CODE (opt_val) == COND && av->num_insns > 1 && av->num_insns < 10) + opt_val = simplify_cond_insn_list (opt_val, av); + while (GET_CODE (opt_val) == COND) + { + rtx newexp2; + if (av->num_insns == 1) + newexp2 = simplify_cond (opt_val, av->first_insn->def->insn_code, + av->first_insn->def->insn_index); + else + newexp2 = simplify_cond (opt_val, -2, -2); + if (newexp2 == opt_val) + break; + opt_val = newexp2; + } + /* See what we have to do to output this value. 
*/ must_extract = must_constrain = address_used = 0; - walk_attr_value (av->value); + walk_attr_value (opt_val); + + write_cache_used_attr_for_value (opt_val, indent); if (must_constrain) { @@ -3859,11 +4254,11 @@ write_attr_case (struct attr_desc *attr, } if (av->num_insns == 1) - write_attr_set (attr, indent + 2, av->value, prefix, suffix, + write_attr_set (attr, indent + 2, opt_val, prefix, suffix, known_true, av->first_insn->def->insn_code, av->first_insn->def->insn_index); else - write_attr_set (attr, indent + 2, av->value, prefix, suffix, + write_attr_set (attr, indent + 2, opt_val, prefix, suffix, known_true, -2, 0); if (strncmp (prefix, "return", 6)) @@ -3871,6 +4266,8 @@ write_attr_case (struct attr_desc *attr, write_indent (indent + 2); printf ("break;\n"); } + write_indent (indent); + printf ("}\n"); printf ("\n"); } @@ -3993,7 +4390,6 @@ write_eligible_delay (const char *kind) char str[50]; const char *pstr; struct attr_desc *attr; - struct attr_value *av, *common_av; int i; /* Compute the maximum number of delay slots required. We use the delay @@ -4026,19 +4422,11 @@ write_eligible_delay (const char *kind) { attr = find_attr (&delay_type_str, 0); gcc_assert (attr); - common_av = find_most_used (attr); - - printf (" insn = delay_insn;\n"); - printf (" switch (recog_memoized (insn))\n"); - printf (" {\n"); sprintf (str, " * %d;\n break;", max_slots); - for (av = attr->first_value; av; av = av->next) - if (av != common_av) - write_attr_case (attr, av, 1, "slot +=", str, 4, true_rtx); - write_attr_case (attr, common_av, 0, "slot +=", str, 4, true_rtx); - printf (" }\n\n"); + printf (" insn = delay_insn;\n"); + write_attr_switch (attr, 2, "slot +=", str); /* Ensure matched. Otherwise, shouldn't have been called. */ printf (" gcc_assert (slot >= %d);\n\n", max_slots); @@ -4047,20 +4435,10 @@ write_eligible_delay (const char *kind) /* If just one type of delay slot, write simple switch. 
*/ if (num_delays == 1 && max_slots == 1) { - printf (" insn = candidate_insn;\n"); - printf (" switch (recog_memoized (insn))\n"); - printf (" {\n"); - attr = find_attr (&delay_1_0_str, 0); gcc_assert (attr); - common_av = find_most_used (attr); - - for (av = attr->first_value; av; av = av->next) - if (av != common_av) - write_attr_case (attr, av, 1, "return", ";", 4, true_rtx); - - write_attr_case (attr, common_av, 0, "return", ";", 4, true_rtx); - printf (" }\n"); + printf (" insn = candidate_insn;\n"); + write_attr_switch (attr, 2, "return", ";"); } else @@ -4076,21 +4454,12 @@ write_eligible_delay (const char *kind) { printf (" case %d:\n", (i / 3) + (num_delays == 1 ? 0 : delay->num * max_slots)); - printf (" switch (recog_memoized (insn))\n"); - printf ("\t{\n"); sprintf (str, "*%s_%d_%d", kind, delay->num, i / 3); pstr = str; attr = find_attr (&pstr, 0); gcc_assert (attr); - common_av = find_most_used (attr); - - for (av = attr->first_value; av; av = av->next) - if (av != common_av) - write_attr_case (attr, av, 1, "return", ";", 8, true_rtx); - - write_attr_case (attr, common_av, 0, "return", ";", 8, true_rtx); - printf (" }\n"); + write_attr_switch (attr, 6, "return", ";"); } printf (" default:\n"); @@ -4133,7 +4502,7 @@ find_attr (const char **name_p, int crea /* Before we resort to using `strcmp', see if the string address matches anywhere. In most cases, it should have been canonicalized to do so. */ - if (name == alternative_name) + if (name == alternative_name || name == insn_code_str) return NULL; index = name[0] & (MAX_ATTRS_INDEX - 1); @@ -4458,6 +4827,7 @@ main (int argc, char **argv) delay_type_str = DEF_ATTR_STRING ("*delay_type"); delay_1_0_str = DEF_ATTR_STRING ("*delay_1_0"); num_delay_slots_str = DEF_ATTR_STRING ("*num_delay_slots"); + insn_code_str = DEF_ATTR_STRING ("insn_code"); printf ("/* Generated automatically by the program `genattrtab'\n\ from the machine description file `md'. 
*/\n\n"); @@ -4573,7 +4943,7 @@ from the machine description file `md'. /* Construct extra attributes for `length'. */ make_length_attrs (); - /* Perform any possible optimizations to speed up compilation. */ + /* Perform some optimizations to speed up compilation. */ optimize_attrs (); /* Now write out all the `gen_attr_...' routines. Do these before the