X-Patchwork-Id: 118569
Date: Sun, 9 Oct 2011 00:57:56 +0300 (EEST)
From: Dimitrios Apostolou
To: Dimitrios Apostolou
cc: Steven Bosscher, Kenneth Zadeck, gcc-patches@gcc.gnu.org, Paolo Bonzini,
    seongbae.park@gmail.com, Jakub Jelinek, Richard Guenther, Manolis Marazakis
Subject: Re: [DF] [performance] generate DF_REF_BASE REFs in REGNO order
References: <4E32F277.1080502@naturalbridge.com>

Hello all,

I received my GSoC t-shirt yesterday, which reminded me I have a promise to
keep...
After realising that it can take forever to find enough free time to work on
GCC, I decided to work a couple of hours whenever I can and post updates to
my patches as time permits. Hopefully some of them will make it into 4.7?

On Mon, 22 Aug 2011, Dimitrios Apostolou wrote:
>
> For the record I'm posting here the final version of this patch, in case it
> gets applied. It adds minor stylistic fixes, plus a small change in
> alloc_pool sizes. Any further testing I do will be posted under this thread.
>
> The previously posted Changelog applies, with the following addition:
>
>         (df_scan_alloc): Rounded up allocation pools size, reduced the
>         mw_reg_pool size, it was unnecessarily large.
>
> Paolo, did I assume correctly that the mw_reg_pool is significantly smaller
> than the rest? That was the case on i386, I assumed it would be similar in
> other arch as well.
>

The attached patch (df2b.diff, exactly the same as the one in the parent
email) applies cleanly to the latest GCC snapshot. In addition to the
previous testing (i386, x86_64), I've just finished testing on
sparc-linux-gnu at the GCC compile farm, with no regressions. Finally, I
think Steven's tests on IA64 went OK. Wasn't testing the only thing holding
this patch back?

On sparc, the user time for compiling df-scan.c seems to have dropped from
34s to 33s for a debug build (--enable-checking=assert,misc,runtime,rtl,df).
The measurements are too flaky to rely on, though, since the node is busy.

The complete changelog is the following:

2011-07-29  Dimitrios Apostolou
            Paolo Bonzini

        (df_def_record_1): Assert a parallel must contain an EXPR_LIST at
        this point. Receive the LOC and move its extraction...
        (df_defs_record): ... here. Rewrote logic with a switch statement
        instead of multiple if-else.
        (df_find_hard_reg_defs, df_find_hard_reg_defs_1): New functions
        that duplicate the logic of df_defs_record() and df_def_record_1()
        but without actually recording any DEFs, only marking them in the
        defs HARD_REG_SET.
        (df_get_call_refs): Call df_find_hard_reg_defs() to mark DEFs that
        are the result of the call. Record DF_REF_BASE DEFs in REGNO order.
        Use regs_invalidated_by_call HARD_REG_SET instead of
        regs_invalidated_by_call_regset bitmap.
        (df_insn_refs_collect): Record DF_REF_REGULAR DEFs after
        df_get_call_refs().
        (df_scan_alloc): Rounded up allocation pools size, reduced the
        mw_reg_pool size, it was unnecessarily large.

Thanks,
Dimitris

=== modified file 'gcc/df-scan.c'
--- gcc/df-scan.c 2011-02-02 20:08:06 +0000
+++ gcc/df-scan.c 2011-08-22 15:17:18 +0000
@@ -111,7 +111,7 @@ static void df_ref_record (enum df_ref_c
                            rtx, rtx *,
                            basic_block, struct df_insn_info *,
                            enum df_ref_type, int ref_flags);
-static void df_def_record_1 (struct df_collection_rec *, rtx,
+static void df_def_record_1 (struct df_collection_rec *, rtx *,
                              basic_block, struct df_insn_info *,
                              int ref_flags);
 static void df_defs_record (struct df_collection_rec *, rtx,
@@ -318,7 +318,7 @@ df_scan_alloc (bitmap all_blocks ATTRIBU
 {
   struct df_scan_problem_data *problem_data;
   unsigned int insn_num = get_max_uid () + 1;
-  unsigned int block_size = 400;
+  unsigned int block_size = 512;
   basic_block bb;
 
   /* Given the number of pools, this is really faster than tearing
@@ -347,7 +347,7 @@ df_scan_alloc (bitmap all_blocks ATTRIBU
                          sizeof (struct df_reg_info), block_size);
   problem_data->mw_reg_pool
     = create_alloc_pool ("df_scan mw_reg",
-                         sizeof (struct df_mw_hardreg), block_size);
+                         sizeof (struct df_mw_hardreg), block_size / 16);
 
   bitmap_obstack_initialize (&problem_data->reg_bitmaps);
   bitmap_obstack_initialize (&problem_data->insn_bitmaps);
@@ -2916,40 +2916,27 @@ df_read_modify_subreg_p (rtx x)
 }
 
 
-/* Process all the registers defined in the rtx, X.
+/* Process all the registers defined in the rtx pointed by LOC.
    Autoincrement/decrement definitions will be picked up by
    df_uses_record. */
 
 static void
 df_def_record_1 (struct df_collection_rec *collection_rec,
-                 rtx x, basic_block bb, struct df_insn_info *insn_info,
+                 rtx *loc, basic_block bb, struct df_insn_info *insn_info,
                  int flags)
 {
-  rtx *loc;
-  rtx dst;
-
-  /* We may recursively call ourselves on EXPR_LIST when dealing with PARALLEL
-     construct. */
-  if (GET_CODE (x) == EXPR_LIST || GET_CODE (x) == CLOBBER)
-    loc = &XEXP (x, 0);
-  else
-    loc = &SET_DEST (x);
-  dst = *loc;
+  rtx dst = *loc;
 
   /* It is legal to have a set destination be a parallel. */
   if (GET_CODE (dst) == PARALLEL)
     {
       int i;
-
       for (i = XVECLEN (dst, 0) - 1; i >= 0; i--)
         {
           rtx temp = XVECEXP (dst, 0, i);
-          if (GET_CODE (temp) == EXPR_LIST || GET_CODE (temp) == CLOBBER
-              || GET_CODE (temp) == SET)
-            df_def_record_1 (collection_rec,
-                             temp, bb, insn_info,
-                             GET_CODE (temp) == CLOBBER
-                             ? flags | DF_REF_MUST_CLOBBER : flags);
+          gcc_assert (GET_CODE (temp) == EXPR_LIST);
+          df_def_record_1 (collection_rec, &XEXP (temp, 0),
+                           bb, insn_info, flags);
         }
       return;
     }
@@ -3003,26 +2990,98 @@ df_defs_record (struct df_collection_rec
                 int flags)
 {
   RTX_CODE code = GET_CODE (x);
+  int i;
 
-  if (code == SET || code == CLOBBER)
-    {
-      /* Mark the single def within the pattern. */
-      int clobber_flags = flags;
-      clobber_flags |= (code == CLOBBER) ? DF_REF_MUST_CLOBBER : 0;
-      df_def_record_1 (collection_rec, x, bb, insn_info, clobber_flags);
-    }
-  else if (code == COND_EXEC)
+  switch (code)
     {
+    case SET:
+      df_def_record_1 (collection_rec, &SET_DEST (x), bb, insn_info, flags);
+      break;
+
+    case CLOBBER:
+      flags |= DF_REF_MUST_CLOBBER;
+      df_def_record_1 (collection_rec, &XEXP (x, 0), bb, insn_info, flags);
+      break;
+
+    case COND_EXEC:
       df_defs_record (collection_rec, COND_EXEC_CODE (x),
                       bb, insn_info, DF_REF_CONDITIONAL);
+      break;
+
+    case PARALLEL:
+      for (i = XVECLEN (x, 0) - 1; i >= 0; i--)
+        df_defs_record (collection_rec, XVECEXP (x, 0, i),
+                        bb, insn_info, flags);
+      break;
+    default:
+      /* No DEFs to record in other cases */
+      break;
     }
-  else if (code == PARALLEL)
+}
+
+/* Set the bits in *defs of registers defined in the pattern rtx */
+
+static void
+df_find_hard_reg_defs_1 (rtx *loc, basic_block bb,
+                         int flags, HARD_REG_SET *defs)
+{
+  rtx dst = *loc;
+
+  /* It is legal to have a set destination be a parallel. */
+  if (GET_CODE (dst) == PARALLEL)
     {
       int i;
+      for (i = XVECLEN (dst, 0) - 1; i >= 0; i--)
+        {
+          rtx temp = XVECEXP (dst, 0, i);
+          gcc_assert (GET_CODE (temp) == EXPR_LIST);
+          df_find_hard_reg_defs_1 (&XEXP (temp, 0), bb, flags, defs);
+        }
+      return;
+    }
+
+  if (GET_CODE (dst) == STRICT_LOW_PART)
+    dst = XEXP (dst, 0);
+
+  if (GET_CODE (dst) == ZERO_EXTRACT)
+    dst = XEXP (dst, 0);
 
-      /* Mark the multiple defs within the pattern. */
+  /* At this point if we do not have a reg or a subreg, just return. */
+  if (REG_P (dst))
+    SET_HARD_REG_BIT (*defs, REGNO (dst));
+  else if (GET_CODE (dst) == SUBREG && REG_P (SUBREG_REG (dst)))
+    SET_HARD_REG_BIT (*defs, REGNO (SUBREG_REG (dst)));
+}
+
+static void
+df_find_hard_reg_defs (rtx x, basic_block bb,
+                       int flags, HARD_REG_SET *defs)
+{
+  RTX_CODE code = GET_CODE (x);
+  int i;
+
+  switch (code)
+    {
+    case SET:
+      df_find_hard_reg_defs_1 (&SET_DEST (x), bb, flags, defs);
+      break;
+
+    case CLOBBER:
+      flags |= DF_REF_MUST_CLOBBER;
+      df_find_hard_reg_defs_1 (&XEXP (x, 0), bb, flags, defs);
+      break;
+
+    case COND_EXEC:
+      df_find_hard_reg_defs (COND_EXEC_CODE (x), bb, DF_REF_CONDITIONAL, defs);
+      break;
+
+    case PARALLEL:
       for (i = XVECLEN (x, 0) - 1; i >= 0; i--)
-        df_defs_record (collection_rec, XVECEXP (x, 0, i), bb, insn_info, flags);
+        df_find_hard_reg_defs (XVECEXP (x, 0, i), bb, flags, defs);
+      break;
+    default:
+      /* No DEFs to record in other cases */
+      break;
     }
 }
 
@@ -3308,7 +3367,7 @@ df_get_conditional_uses (struct df_colle
 }
 
 
-/* Get call's extra defs and uses. */
+/* Get call's extra defs and uses (track caller-saved registers). */
 
 static void
 df_get_call_refs (struct df_collection_rec * collection_rec,
@@ -3317,20 +3376,50 @@ df_get_call_refs (struct df_collection_r
                   int flags)
 {
   rtx note;
-  bitmap_iterator bi;
-  unsigned int ui;
   bool is_sibling_call;
   unsigned int i;
-  df_ref def;
-  bitmap_head defs_generated;
+  HARD_REG_SET defs_generated;
 
-  bitmap_initialize (&defs_generated, &df_bitmap_obstack);
+  CLEAR_HARD_REG_SET (defs_generated);
+  df_find_hard_reg_defs (PATTERN (insn_info->insn), bb,
+                         0, &defs_generated);
 
-  /* Do not generate clobbers for registers that are the result of the
-     call. This causes ordering problems in the chain building code
-     depending on which def is seen first. */
-  FOR_EACH_VEC_ELT (df_ref, collection_rec->def_vec, i, def)
-    bitmap_set_bit (&defs_generated, DF_REF_REGNO (def));
+  is_sibling_call = SIBLING_CALL_P (insn_info->insn);
+
+  for (i = 0; i < FIRST_PSEUDO_REGISTER; i++)
+    {
+      if (i == STACK_POINTER_REGNUM)
+        /* The stack ptr is used (honorarily) by a CALL insn. */
+        df_ref_record (DF_REF_BASE, collection_rec, regno_reg_rtx[i],
+                       NULL, bb, insn_info, DF_REF_REG_USE,
+                       DF_REF_CALL_STACK_USAGE | flags);
+      else if (global_regs[i])
+        {
+          /* Calls to const functions cannot access any global registers and
+             calls to pure functions cannot set them. All other calls may
+             reference any of the global registers, so they are recorded as
+             used. */
+          if (!RTL_CONST_CALL_P (insn_info->insn))
+            {
+              df_ref_record (DF_REF_BASE, collection_rec, regno_reg_rtx[i],
+                             NULL, bb, insn_info, DF_REF_REG_USE, flags);
+              if (!RTL_PURE_CALL_P (insn_info->insn))
+                df_ref_record (DF_REF_BASE, collection_rec, regno_reg_rtx[i],
+                               NULL, bb, insn_info, DF_REF_REG_DEF, flags);
+            }
+        }
+      else
+        if (TEST_HARD_REG_BIT (regs_invalidated_by_call, i)
+            /* no clobbers for regs that are the result of the call */
+            && !TEST_HARD_REG_BIT (defs_generated, i)
+            && (!is_sibling_call
+                || !bitmap_bit_p (df->exit_block_uses, i)
+                || refers_to_regno_p (i, i+1,
+                                      crtl->return_rtx, NULL)))
+          df_ref_record (DF_REF_BASE, collection_rec, regno_reg_rtx[i],
+                         NULL, bb, insn_info, DF_REF_REG_DEF,
+                         DF_REF_MAY_CLOBBER | flags);
+    }
 
   /* Record the registers used to pass arguments, and explicitly
      noted as clobbered. */
@@ -3345,7 +3434,7 @@ df_get_call_refs (struct df_collection_r
           if (REG_P (XEXP (XEXP (note, 0), 0)))
             {
               unsigned int regno = REGNO (XEXP (XEXP (note, 0), 0));
-              if (!bitmap_bit_p (&defs_generated, regno))
+              if (!TEST_HARD_REG_BIT (defs_generated, regno))
                 df_defs_record (collection_rec, XEXP (note, 0), bb,
                                 insn_info, flags);
             }
@@ -3355,40 +3444,6 @@ df_get_call_refs (struct df_collection_r
         }
     }
 
-  /* The stack ptr is used (honorarily) by a CALL insn. */
-  df_ref_record (DF_REF_BASE, collection_rec, regno_reg_rtx[STACK_POINTER_REGNUM],
-                 NULL, bb, insn_info, DF_REF_REG_USE,
-                 DF_REF_CALL_STACK_USAGE | flags);
-
-  /* Calls to const functions cannot access any global registers and calls to
-     pure functions cannot set them. All other calls may reference any of the
-     global registers, so they are recorded as used. */
-  if (!RTL_CONST_CALL_P (insn_info->insn))
-    for (i = 0; i < FIRST_PSEUDO_REGISTER; i++)
-      if (global_regs[i])
-        {
-          df_ref_record (DF_REF_BASE, collection_rec, regno_reg_rtx[i],
-                         NULL, bb, insn_info, DF_REF_REG_USE, flags);
-          if (!RTL_PURE_CALL_P (insn_info->insn))
-            df_ref_record (DF_REF_BASE, collection_rec, regno_reg_rtx[i],
-                           NULL, bb, insn_info, DF_REF_REG_DEF, flags);
-        }
-
-  is_sibling_call = SIBLING_CALL_P (insn_info->insn);
-  EXECUTE_IF_SET_IN_BITMAP (regs_invalidated_by_call_regset, 0, ui, bi)
-    {
-      if (!global_regs[ui]
-          && (!bitmap_bit_p (&defs_generated, ui))
-          && (!is_sibling_call
-              || !bitmap_bit_p (df->exit_block_uses, ui)
-              || refers_to_regno_p (ui, ui+1,
-                                    crtl->return_rtx, NULL)))
-        df_ref_record (DF_REF_BASE, collection_rec, regno_reg_rtx[ui],
-                       NULL, bb, insn_info, DF_REF_REG_DEF,
-                       DF_REF_MAY_CLOBBER | flags);
-    }
-
-  bitmap_clear (&defs_generated);
   return;
 }
 
@@ -3398,7 +3453,7 @@ df_get_call_refs (struct df_collection_r
    and reg chains. */
 
 static void
-df_insn_refs_collect (struct df_collection_rec* collection_rec,
+df_insn_refs_collect (struct df_collection_rec *collection_rec,
                       basic_block bb, struct df_insn_info *insn_info)
 {
   rtx note;
@@ -3410,9 +3465,6 @@ df_insn_refs_collect (struct df_collecti
   VEC_truncate (df_ref, collection_rec->eq_use_vec, 0);
   VEC_truncate (df_mw_hardreg_ptr, collection_rec->mw_vec, 0);
 
-  /* Record register defs. */
-  df_defs_record (collection_rec, PATTERN (insn_info->insn), bb, insn_info, 0);
-
   /* Process REG_EQUIV/REG_EQUAL notes. */
   for (note = REG_NOTES (insn_info->insn); note;
        note = XEXP (note, 1))
@@ -3444,12 +3496,17 @@ df_insn_refs_collect (struct df_collecti
     }
 
   if (CALL_P (insn_info->insn))
+    /* Record DF_REF_BASE register defs for CALL_INSNs. */
     df_get_call_refs (collection_rec, bb, insn_info,
                       (is_cond_exec) ? DF_REF_CONDITIONAL : 0);
 
+  /* Record DF_REF_REGULAR defs and uses. */
+  df_defs_record (collection_rec, PATTERN (insn_info->insn),
+                  bb, insn_info, 0);
+
   /* Record the register uses. */
-  df_uses_record (collection_rec,
-                  &PATTERN (insn_info->insn), DF_REF_REG_USE, bb, insn_info, 0);
+  df_uses_record (collection_rec, &PATTERN (insn_info->insn),
+                  DF_REF_REG_USE, bb, insn_info, 0);
 
   /* DF_REF_CONDITIONAL needs corresponding USES. */
   if (is_cond_exec)
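
For anyone who wants a feel for the df_get_call_refs change without the GCC
internals, here is a minimal standalone sketch. It is not the patch's code:
toy_reg_set, NUM_HARD_REGS and the toy_* helpers are hypothetical stand-ins
for HARD_REG_SET, FIRST_PSEUDO_REGISTER and the SET_HARD_REG_BIT /
TEST_HARD_REG_BIT macros. It only illustrates the idea of walking register
numbers in ascending order and skipping registers already marked as defined
by the call pattern, so that the remaining clobbers come out in REGNO order.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical stand-ins, not GCC's real definitions.  */
#define NUM_HARD_REGS 64u   /* plays the role of FIRST_PSEUDO_REGISTER */
#define WORD_BITS (sizeof (unsigned long) * CHAR_BIT)
#define NUM_WORDS ((NUM_HARD_REGS + WORD_BITS - 1) / WORD_BITS)

/* A fixed-size bit set with one bit per hard register.  */
typedef struct { unsigned long w[NUM_WORDS]; } toy_reg_set;

static void
toy_clear (toy_reg_set *s)
{
  for (size_t i = 0; i < NUM_WORDS; i++)
    s->w[i] = 0;
}

static void
toy_set_bit (toy_reg_set *s, unsigned r)
{
  s->w[r / WORD_BITS] |= 1UL << (r % WORD_BITS);
}

static int
toy_test_bit (const toy_reg_set *s, unsigned r)
{
  return (int) ((s->w[r / WORD_BITS] >> (r % WORD_BITS)) & 1UL);
}

/* Store in OUT, in ascending register order, every register that is in
   INVALIDATED but not in DEFS_GENERATED.  Return how many were stored.
   OUT must have room for NUM_HARD_REGS entries.  */
static size_t
toy_collect_clobbers (const toy_reg_set *invalidated,
                      const toy_reg_set *defs_generated,
                      unsigned *out)
{
  size_t n = 0;
  for (unsigned r = 0; r < NUM_HARD_REGS; r++)
    if (toy_test_bit (invalidated, r) && !toy_test_bit (defs_generated, r))
      out[n++] = r;
  return n;
}
```

With registers 0, 2 and 5 marked as invalidated and register 0 marked in
defs_generated, toy_collect_clobbers stores 2 and 5 in out, in that order,
regardless of the order in which the bits were set.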