From patchwork Tue May 20 08:16:23 2014 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Sandiford X-Patchwork-Id: 350550 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id E11A7140086 for ; Tue, 20 May 2014 18:16:38 +1000 (EST) DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:from :to:subject:date:message-id:mime-version:content-type; q=dns; s= default; b=mBQ7zkHWz1k7OqyWvAvMtyra0jzPw5REVF6OWVLtDXujLyf6lpEEK y3XifKp77yokEm2kvvMisTI6xg15CKp4vDugfJYG2lIXqNa8nL4St2gkXyEKbexq SMQV+Hgbc6lEaT4oMNQbetRDbyurz4z8CNE9WC2jn8cPUfdoXUpQJA= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:from :to:subject:date:message-id:mime-version:content-type; s= default; bh=PnaspE0iLaEt38Tb6w7elWZflzw=; b=UG2458UDsoQfbQSGNZJS +fHAi19rTv92NMDu+Pqbd8NGYTAI/kBaYIYDilWDzxogVWuwpH3aoP/+sdVHT6rv zYb8X2IpZK5AwGS901hgHXYESc50QuOUmCcUDWx8OKxH2PUFZvlT7enjAmEosZ2W 80tR/W7ilVDK31QxbB2nQcw= Received: (qmail 28304 invoked by alias); 20 May 2014 08:16:31 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Received: (qmail 28295 invoked by uid 89); 20 May 2014 08:16:30 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-2.3 required=5.0 tests=AWL, BAYES_00, FREEMAIL_FROM, RCVD_IN_DNSWL_LOW, SPF_PASS autolearn=ham version=3.3.2 X-HELO: mail-wi0-f172.google.com Received: from mail-wi0-f172.google.com (HELO mail-wi0-f172.google.com) (209.85.212.172) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with (AES128-SHA encrypted) ESMTPS; Tue, 20 May 2014 08:16:28 +0000 Received: by mail-wi0-f172.google.com with SMTP id hi2so5443902wib.5 for ; Tue, 20 May 2014 01:16:25 -0700 (PDT) X-Received: by 10.180.183.5 with SMTP id ei5mr2527599wic.54.1400573785452; Tue, 20 May 2014 01:16:25 -0700 (PDT) Received: from localhost ([2.26.169.52]) by mx.google.com with ESMTPSA id r15sm14542902wiw.17.2014.05.20.01.16.24 for (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Tue, 20 May 2014 01:16:24 -0700 (PDT) From: Richard Sandiford To: gcc-patches@gcc.gnu.org Mail-Followup-To: gcc-patches@gcc.gnu.org, rdsandiford@googlemail.com Subject: RFA: cache enabled attribute by insn code Date: Tue, 20 May 2014 09:16:23 +0100 Message-ID: <87k39gkgq0.fsf@talisman.default> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/24.3 (gnu/linux) MIME-Version: 1.0 get_attr_enabled was showing up high in a -O0 compile of fold-const.ii. At the moment, extract_insn calls this function for every alternative on each extraction, which can be expensive for instructions like moves that have many alternatives. The attribute is only supposed to depend on the insn code and the current target. It isn't allowed to depend on the values of operands. LRA already makes use of this to cache the enabled attributes based on code. This patch makes the caching compiler-wide. It also converts the bool array into a plain int bitfield, since at the moment we don't allow more than 30 alternatives. (It's possible we might want to increase it beyond 32 at some point, but then we can change the type to uint64_t instead. Wanting to go beyond 64 would suggest deeper problems somewhere. :-)) The patch gives a consistent compile-time improvement of about ~3.5% on the -O0 fold-const.ii testcase. There were a few instances of the construct: cl = NO_REGS; for (ignore_p = false, curr_alt = 0; (c = *constraints); constraints += CONSTRAINT_LEN (c, constraints)) if (c == '#' || !recog_data.alternative_enabled_p[curr_alt]) ignore_p = true; else if (c == ',') { curr_alt++; ignore_p = false; } else if (! ignore_p) This had the effect of ignoring all alternatives after the first attribute-disabled one, since once we found a disabled alternative we'd always enter the first arm of the "if" and never increment curr_alt. I don't think that was intentional. Tested on x86_64-linux-gnu. OK to install? I notice ira_setup_alts is using a HARD_REG_SET to store a mask of alternatives. If this patch is OK, I'll follow up with a patch to use alternative_mask there too. Thanks, Richard gcc/ * system.h (TEST_BIT): New macro. * recog.h (alternative_mask): New type. (ALL_ALTERNATIVES, ALTERNATIVE_BIT): New macros. (recog_data_d): Replace alternative_enabled_p array with enabled_alternatives. (target_recog): New structure. (default_target_recog, this_target_recog): Declare. (get_enabled_alternatives): Likewise. * recog.c (default_target_recog, this_target_recog): New variables. (get_enabled_alternatives): New function. (extract_insn): Use it. (preprocess_constraints, constrain_operands): Adjust for change to recog_data. * postreload.c (reload_cse_simplify_operands): Likewise. * reload.c (find_reloads): Likewise. * ira-costs.c (record_reg_classes): Likewise. * ira-lives.c (single_reg_class): Likewise. Fix bug in which all alternatives after a disabled one would be skipped. (ira_implicitly_set_insn_hard_regs): Likewise. * ira.c (commutative_constraint_p): Likewise. (ira_setup_alts): Adjust for change to recog_data. * lra-int.h (lra_insn_recog_data): Replace alternative_enabled_p with enabled_alternatives. * lra.c (free_insn_recog_data): Update accordingly. (lra_update_insn_recog_data): Likewise. (lra_set_insn_recog_data): Likewise. Use get_enabled_alternatives. * lra-constraints.c (process_alt_operands): Likewise. Handle only_alternative as part of the enabled mask. * target-globals.h (this_target_recog): Declare. (target_globals): Add a recog field. (restore_target_globals): Restore this_target_recog. * target-globals.c: Include recog.h. (default_target_globals): Initialize recog field. (save_target_globals): Likewise. Index: gcc/recog.h =================================================================== --- gcc/recog.h 2014-05-19 20:42:50.830279171 +0100 +++ gcc/recog.h 2014-05-19 21:02:22.926604180 +0100 @@ -46,18 +46,18 @@ struct operand_alternative const char *constraint; /* The register class valid for this alternative (possibly NO_REGS). */ - enum reg_class cl; + ENUM_BITFIELD (reg_class) cl : 16; /* "Badness" of this alternative, computed from number of '?' and '!' characters in the constraint string. */ - unsigned int reject; + unsigned int reject : 16; /* -1 if no matching constraint was found, or an operand number. */ - int matches; + int matches : 8; /* The same information, but reversed: -1 if this operand is not matched by any other, or the operand number of the operand that matches this one. */ - int matched; + int matched : 8; /* Nonzero if '&' was found in the constraint string. */ unsigned int earlyclobber:1; @@ -77,8 +77,9 @@ struct operand_alternative /* Nonzero if 'X' was found in the constraint string, or if the constraint string for this alternative was empty. */ unsigned int anything_ok:1; -}; + unsigned int unused : 8; +}; extern void init_recog (void); extern void init_recog_no_volatile (void); @@ -134,7 +135,7 @@ extern void insn_extract (rtx); extern void extract_insn (rtx); extern void extract_constrain_insn_cached (rtx); extern void extract_insn_cached (rtx); -extern void preprocess_constraints (void); +extern void preprocess_constraints (rtx); extern rtx peep2_next_insn (int); extern int peep2_regno_dead_p (int, int); extern int peep2_reg_dead_p (int, rtx); @@ -258,7 +259,7 @@ struct recog_data_d /* Contains a vector of operand_alternative structures for every operand. Set up by preprocess_constraints. */ -extern struct operand_alternative recog_op_alt[MAX_RECOG_OPERANDS][MAX_RECOG_ALTERNATIVES]; +extern operand_alternative **recog_op_alt; /* A table defined in insn-output.c that give information about each insn-code value. */ @@ -376,6 +377,7 @@ struct insn_data_d /* Target-dependent globals. */ struct target_recog { alternative_mask x_enabled_alternatives[LAST_INSN_CODE]; + operand_alternative **x_op_alt[LAST_INSN_CODE]; }; extern struct target_recog default_target_recog; Index: gcc/recog.c =================================================================== --- gcc/recog.c 2014-05-19 20:43:28.978647678 +0100 +++ gcc/recog.c 2014-05-19 23:26:02.037573014 +0100 @@ -80,7 +80,11 @@ struct recog_data_d recog_data; /* Contains a vector of operand_alternative structures for every operand. Set up by preprocess_constraints. */ -struct operand_alternative recog_op_alt[MAX_RECOG_OPERANDS][MAX_RECOG_ALTERNATIVES]; +operand_alternative **recog_op_alt; + +static operand_alternative asm_op_alt_1[MAX_RECOG_OPERANDS + * MAX_RECOG_ALTERNATIVES]; +static operand_alternative *asm_op_alt[MAX_RECOG_OPERANDS]; /* On return from `constrain_operands', indicate which alternative was satisfied. */ @@ -2326,23 +2330,43 @@ extract_insn (rtx insn) information from the constraint strings into a more usable form. The collected data is stored in recog_op_alt. */ void -preprocess_constraints (void) +preprocess_constraints (rtx insn) { int i; - for (i = 0; i < recog_data.n_operands; i++) - memset (recog_op_alt[i], 0, (recog_data.n_alternatives - * sizeof (struct operand_alternative))); + int code = INSN_CODE (insn); + if (code >= 0 && this_target_recog->x_op_alt[code]) + { + recog_op_alt = this_target_recog->x_op_alt[code]; + return; + } + + int n_alternatives = recog_data.n_alternatives; + int n_operands = recog_data.n_operands; + int n_entries = n_operands * n_alternatives; + + operand_alternative *op_alt; + if (code < 0) + { + memset (asm_op_alt_1, 0, n_entries * sizeof (operand_alternative)); + op_alt = asm_op_alt_1; + recog_op_alt = asm_op_alt; + } + else + { + op_alt = XCNEWVEC (operand_alternative, n_entries); + recog_op_alt = XNEWVEC (operand_alternative *, n_operands); + this_target_recog->x_op_alt[code] = recog_op_alt; + } - for (i = 0; i < recog_data.n_operands; i++) + for (i = 0; i < n_operands; i++, op_alt += n_alternatives) { int j; - struct operand_alternative *op_alt; const char *p = recog_data.constraints[i]; - op_alt = recog_op_alt[i]; + recog_op_alt[i] = op_alt; - for (j = 0; j < recog_data.n_alternatives; j++) + for (j = 0; j < n_alternatives; j++) { op_alt[j].cl = NO_REGS; op_alt[j].constraint = p; Index: gcc/ira-lives.c =================================================================== --- gcc/ira-lives.c 2014-05-19 20:42:50.819279065 +0100 +++ gcc/ira-lives.c 2014-05-19 21:01:12.577922691 +0100 @@ -1238,7 +1238,7 @@ process_bb_node_lives (ira_loop_tree_nod } extract_insn (insn); - preprocess_constraints (); + preprocess_constraints (insn); process_single_reg_class_operands (false, freq); /* See which defined values die here. */ Index: gcc/reg-stack.c =================================================================== --- gcc/reg-stack.c 2014-05-16 09:47:34.336936052 +0100 +++ gcc/reg-stack.c 2014-05-19 21:01:50.349288768 +0100 @@ -473,7 +473,7 @@ check_asm_stack_operands (rtx insn) constrain_operands (1); alt = which_alternative; - preprocess_constraints (); + preprocess_constraints (insn); get_asm_operands_in_out (body, &n_outputs, &n_inputs); @@ -2032,7 +2032,7 @@ subst_asm_stack_regs (rtx insn, stack_pt constrain_operands (1); alt = which_alternative; - preprocess_constraints (); + preprocess_constraints (insn); get_asm_operands_in_out (body, &n_outputs, &n_inputs); Index: gcc/regcprop.c =================================================================== --- gcc/regcprop.c 2014-05-17 07:59:06.436021428 +0100 +++ gcc/regcprop.c 2014-05-19 21:01:29.001081988 +0100 @@ -774,7 +774,7 @@ copyprop_hardreg_forward_1 (basic_block extract_insn (insn); if (! constrain_operands (1)) fatal_insn_not_found (insn); - preprocess_constraints (); + preprocess_constraints (insn); alt = which_alternative; n_ops = recog_data.n_operands; is_asm = asm_noperands (PATTERN (insn)) >= 0; @@ -880,7 +880,7 @@ copyprop_hardreg_forward_1 (basic_block extract_insn (insn); if (! constrain_operands (1)) fatal_insn_not_found (insn); - preprocess_constraints (); + preprocess_constraints (insn); } /* Otherwise, try all valid registers and see if its valid. */ @@ -908,7 +908,7 @@ copyprop_hardreg_forward_1 (basic_block extract_insn (insn); if (! constrain_operands (1)) fatal_insn_not_found (insn); - preprocess_constraints (); + preprocess_constraints (insn); } } } Index: gcc/regrename.c =================================================================== --- gcc/regrename.c 2014-05-06 18:38:47.928200023 +0100 +++ gcc/regrename.c 2014-05-19 21:01:35.724147073 +0100 @@ -1571,7 +1571,7 @@ build_def_use (basic_block bb) extract_insn (insn); if (! constrain_operands (1)) fatal_insn_not_found (insn); - preprocess_constraints (); + preprocess_constraints (insn); alt = which_alternative; n_ops = recog_data.n_operands; untracked_operands = 0; Index: gcc/sched-deps.c =================================================================== --- gcc/sched-deps.c 2014-05-16 09:47:34.347936161 +0100 +++ gcc/sched-deps.c 2014-05-19 21:01:58.526367964 +0100 @@ -2865,7 +2865,7 @@ sched_analyze_insn (struct deps_desc *de HARD_REG_SET temp; extract_insn (insn); - preprocess_constraints (); + preprocess_constraints (insn); ira_implicitly_set_insn_hard_regs (&temp); AND_COMPL_HARD_REG_SET (temp, ira_no_alloc_regs); IOR_HARD_REG_SET (implicit_reg_pending_clobbers, temp); Index: gcc/sel-sched.c =================================================================== --- gcc/sel-sched.c 2014-03-04 21:19:43.120097702 +0000 +++ gcc/sel-sched.c 2014-05-19 21:02:14.471522363 +0100 @@ -1019,7 +1019,7 @@ get_reg_class (rtx insn) extract_insn (insn); if (! constrain_operands (1)) fatal_insn_not_found (insn); - preprocess_constraints (); + preprocess_constraints (insn); alt = which_alternative; n_ops = recog_data.n_operands; @@ -2141,7 +2141,7 @@ implicit_clobber_conflict_p (insn_t thro /* Calculate implicit clobbers. */ extract_insn (insn); - preprocess_constraints (); + preprocess_constraints (insn); ira_implicitly_set_insn_hard_regs (&temp); AND_COMPL_HARD_REG_SET (temp, ira_no_alloc_regs); Index: gcc/config/arm/arm.c =================================================================== --- gcc/config/arm/arm.c 2014-05-19 07:46:29.013179879 +0100 +++ gcc/config/arm/arm.c 2014-05-19 21:05:50.289605508 +0100 @@ -11335,7 +11335,7 @@ xscale_sched_adjust_cost (rtx insn, rtx that overlaps with SHIFTED_OPERAND, then we have increase the cost of this dependency. */ extract_insn (dep); - preprocess_constraints (); + preprocess_constraints (dep); for (opno = 0; opno < recog_data.n_operands; opno++) { /* We can ignore strict inputs. */ @@ -16870,7 +16870,7 @@ note_invalid_constants (rtx insn, HOST_W /* Fill in recog_op_alt with information about the constraints of this insn. */ - preprocess_constraints (); + preprocess_constraints (insn); for (opno = 0; opno < recog_data.n_operands; opno++) { Index: gcc/config/i386/i386.c =================================================================== --- gcc/config/i386/i386.c 2014-05-19 07:46:26.771160339 +0100 +++ gcc/config/i386/i386.c 2014-05-19 21:05:39.461501255 +0100 @@ -5826,7 +5826,7 @@ ix86_legitimate_combined_insn (rtx insn) int i; extract_insn (insn); - preprocess_constraints (); + preprocess_constraints (insn); for (i = 0; i < recog_data.n_operands; i++) {