From patchwork Mon Oct 18 15:36:43 2010 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Tom de Vries X-Patchwork-Id: 68206 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) by ozlabs.org (Postfix) with SMTP id 150A4B70AA for ; Tue, 19 Oct 2010 02:36:37 +1100 (EST) Received: (qmail 695 invoked by alias); 18 Oct 2010 15:36:35 -0000 Received: (qmail 32531 invoked by uid 22791); 18 Oct 2010 15:36:29 -0000 X-SWARE-Spam-Status: No, hits=0.9 required=5.0 tests=BAYES_50, TW_CL, TW_EG, T_RP_MATCHES_RCVD X-Spam-Check-By: sourceware.org Received: from mail.codesourcery.com (HELO mail.codesourcery.com) (38.113.113.100) by sourceware.org (qpsmtpd/0.43rc1) with ESMTP; Mon, 18 Oct 2010 15:36:13 +0000 Received: (qmail 30442 invoked from network); 18 Oct 2010 15:36:10 -0000 Received: from unknown (HELO ?192.168.1.66?) (vries@127.0.0.2) by mail.codesourcery.com with ESMTPA; 18 Oct 2010 15:36:10 -0000 Message-ID: <4CBC698B.3080204@codesourcery.com> Date: Mon, 18 Oct 2010 17:36:43 +0200 From: Tom de Vries User-Agent: Thunderbird 2.0.0.24 (X11/20100411) MIME-Version: 1.0 To: gcc-patches@gcc.gnu.org CC: Bernd Schmidt Subject: new sign/zero extension elimination pass Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org I created a new sign/zero extension elimination pass. The motivating example for this pass is: void f(unsigned char *p, short s, int c, int *z) { if (c) *z = 0; *p ^= (unsigned char)s; } For MIPS, compilation results in the following insns. (set (reg/v:SI 199) (sign_extend:SI (subreg:HI (reg:SI 200) 2))) ... (set (reg:QI 203) (subreg:QI (reg/v:SI 199) 3)) These insns are the only def and the only use of reg 199, each located in a different bb. The sign-extension preserves the lower half of reg 200 and copies it to reg 199, and the subreg use of reg 199 only reads the least significant byte. The sign extension is therefore redundant (the extension part, not the copy part), and can safely be replaced with a regcopy from reg 200 to reg 199: (set (reg/v:SI 199) (reg:SI 200)) This new pass manages to analyze this pattern, and replace the sign_extend with a regcopy, which results in 1 less instruction in the assembly. The other passes that eliminate sign/zero extensions do no manage to do that. Combine doesn't work since the def and the use of reg 199 are in a different bb. Implicit_zee does not work here since it only combines an extension with the defs of its src operand, which is not applicable in this case. The pass does a couple of linear scans over the insns, so it's not expensive in terms of runtime. See ee.c for a more detailed writeup related to motivating example, comparison to other passes that do sign/zero extension elimination, intended effect, implementation and limitations. The pass has been tested on X86_64, MIPS, ARM. An earlier version of the pass was tested on PPC as well. All on Linux host. At the moment there are 2 known issue: - a bug on ARM, which I'm fixing currently. - an assert while building on MIPS. I hit the assert gcc_assert ((*inner_mode == *outer_mode) != (*extend != UNKNOWN)); in loop-iv.c:get_biv_step. I disabled it and no tests fell over, but that needs more investigation. I would like review comments on the general idea of the pass. - Tom de Vries Index: gcc/tree-pass.h =================================================================== --- gcc/tree-pass.h (revision 165080) +++ gcc/tree-pass.h (working copy) @@ -479,6 +479,7 @@ extern struct rtl_opt_pass pass_initial_value_sets; extern struct rtl_opt_pass pass_unshare_all_rtl; extern struct rtl_opt_pass pass_instantiate_virtual_regs; +extern struct rtl_opt_pass pass_ee; extern struct rtl_opt_pass pass_rtl_fwprop; extern struct rtl_opt_pass pass_rtl_fwprop_addr; extern struct rtl_opt_pass pass_jump2; Index: gcc/opts.c =================================================================== --- gcc/opts.c (revision 165080) +++ gcc/opts.c (working copy) @@ -823,6 +823,7 @@ flag_tree_switch_conversion = opt2; flag_ipa_cp = opt2; flag_ipa_sra = opt2; + flag_ee = opt2; /* Track fields in field-sensitive alias analysis. */ set_param_value ("max-fields-for-field-sensitive", Index: gcc/timevar.def =================================================================== --- gcc/timevar.def (revision 165080) +++ gcc/timevar.def (working copy) @@ -179,6 +179,7 @@ DEFTIMEVAR (TV_VARCONST , "varconst") DEFTIMEVAR (TV_LOWER_SUBREG , "lower subreg") DEFTIMEVAR (TV_JUMP , "jump") +DEFTIMEVAR (TV_EE , "extension elimination") DEFTIMEVAR (TV_FWPROP , "forward prop") DEFTIMEVAR (TV_CSE , "CSE") DEFTIMEVAR (TV_DCE , "dead code elimination") Index: gcc/ee.c =================================================================== --- gcc/ee.c (revision 0) +++ gcc/ee.c (revision 0) @@ -0,0 +1,907 @@ +/* Redundant extension elimination + Copyright (C) 2010 Free Software Foundation, Inc. + Contributed by Tom de Vries (tom@codesourcery.com) + +This file is part of GCC. + +GCC is free software; you can redistribute it and/or modify it under +the terms of the GNU General Public License as published by the Free +Software Foundation; either version 3, or (at your option) any later +version. + +GCC is distributed in the hope that it will be useful, but WITHOUT ANY +WARRANTY; without even the implied warranty of MERCHANTABILITY or +FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License +for more details. + +You should have received a copy of the GNU General Public License +along with GCC; see the file COPYING3. If not see +. */ + +/* + + MOTIVATING EXAMPLE + + The motivating example for this pass is: + + void f(unsigned char *p, short s, int c, int *z) + { + if (c) + *z = 0; + *p ^= (unsigned char)s; + } + + For MIPS, compilation results in the following insns. + + (set (reg/v:SI 199) + (sign_extend:SI (subreg:HI (reg:SI 200) 2))) + + ... + + (set (reg:QI 203) + (subreg:QI (reg/v:SI 199) 3)) + + These insns are the only def and the only use of reg 199, each located in a + different bb. + + The sign-extension preserves the lower half of reg 200 and copies them to + reg 199, and the subreg use of reg 199 only reads the least significant byte. + The sign extension is therefore redundant (the extension part, not the copy + part), and can safely be replaced with a regcopy from reg 200 to reg 199. + + + OTHER SIGN/ZERO EXTENSION ELIMINATION PASSES + + There are other passes which eliminate sign/zero-extension: combine and + implicit_zee. Both attempt to eliminate extensions by combining them with + other instructions. The combine pass does this at bb level, + implicit_zee works at inter-bb level. + + The combine pass combine an extension with either: + - all uses of the extension, or + - all defs of the operand of the extension. + The implicit_zee pass only implements the latter. + + For our motivating example, combine doesn't work since the def and the use of + reg 199 are in a different bb. + + Implicit_zee does not work since it only combines an extension with the defs + of its operand. + + + INTENDED EFFECT + + This pass works by removing sign/zero-extensions, or replacing them with + regcopies. The idea there is that the regcopy might be eliminated by a later + pass. In case the regcopy cannot be eliminated, it might at least be cheaper + than the extension. + + + IMPLEMENTATION + + The pass scans a number of times over all instructions. + + The first scan collects all extensions. + + The second scan registers all uses of a reg in the biggest_use array. After + that first scan, the biggest_use array contains the size in bits of the + biggest use of each reg, which allows us to find redundant extensions. + + A number of backward scans, bounded by --param ee-max-propagate= is done + in which information from the previous scan is used. As a runtime improvement, + also information from the current scan is used in propagation, if we know that + the information from the current scan is final (all uses have been + registered). If there are no more changes in size of biggest use, or there are + no more extensions, no new scan is initiated. + + After the last scan, redundant extensions are deleted or replaced. + + In case that the src and dest reg of the replacement are not of the same size, + we do not replace with a normal regcopy, but with a truncate or with the copy + of a paradoxical subreg instead. + + + LIMITATIONS + + The scope of the analysis is limited to an extension and its uses. The other + type of analysis (related to the defs of the operand of an extension) is not + done. + + Furthermore, we do the analysis of biggest use per reg. So when determining + whether an extension is redundant, we take all uses of a the dest reg into + account, also the ones that are not uses of the extension. This could be + overcome by calculating the def-use chains and using those for analysis + instead. + + Finally, there is propagation of information, but not in its most runtime + efficient form. In order to have that a text-book backward iterative approach + is needed. */ + + +#include "config.h" +#include "system.h" +#include "coretypes.h" +#include "tm.h" +#include "rtl.h" +#include "tree.h" +#include "tm_p.h" +#include "flags.h" +#include "regs.h" +#include "hard-reg-set.h" +#include "basic-block.h" +#include "insn-config.h" +#include "function.h" +#include "expr.h" +#include "insn-attr.h" +#include "recog.h" +#include "toplev.h" +#include "target.h" +#include "timevar.h" +#include "optabs.h" +#include "insn-codes.h" +#include "rtlhooks-def.h" +#include "output.h" +#include "params.h" +#include "timevar.h" +#include "tree-pass.h" +#include "cgraph.h" + +#define SKIP_REG (-1) + +/* Array to register the biggest use of a reg, in bits. */ + +static int *biggest_use; + +/* Array to register the amount of uses of a reg. */ + +static int *nr_uses; + +/* Arrays to keep the data from the previous run available. */ + +static int *prev_biggest_use; +static int *prev_nr_uses; + +/* Variable that indicates whether the biggest_use info has changed relative to + the previous run. */ + +bool changed; + +/* Vector that contains the extensions in the function. */ + +VEC (rtx,heap) *extensions; + +/* Vector that contains the extensions in the function that are going to be + removed or replaced. */ + +VEC (rtx,heap) *redundant_extensions; + +/* Forward declarations. */ + +static void note_use (rtx *x, void *data); +static bool skip_reg_p (int regno); + +/* Convenience macro to swap 2 variables. */ + +#define SWAP(T, a, b) do { T tmp; tmp = (a); (a) = (b); (b) = tmp; } while (0) + +/* Check whether this is a paradoxical subreg. */ + +static bool +paradoxical_subreg_p (rtx subreg) +{ + enum machine_mode subreg_mode, reg_mode; + + if (GET_CODE (subreg) != SUBREG) + return false; + + subreg_mode = GET_MODE (subreg); + reg_mode = GET_MODE (SUBREG_REG (subreg)); + + if (GET_MODE_SIZE (subreg_mode) > GET_MODE_SIZE (reg_mode)) + return true; + + return false; +} + +/* Get the size and reg number of a REG or SUBREG use. */ + +static bool +reg_use_p (rtx use, int *size, unsigned int *regno) +{ + rtx reg; + + if (REG_P (use)) + { + *regno = REGNO (use); + *size = GET_MODE_BITSIZE (GET_MODE (use)); + return true; + } + else if (GET_CODE (use) == SUBREG) + { + reg = SUBREG_REG (use); + + if (!REG_P (reg)) + return false; + + *regno = REGNO (reg); + + if (paradoxical_subreg_p (use)) + *size = GET_MODE_BITSIZE (GET_MODE (reg)); + else + *size = subreg_lsb (use) + GET_MODE_BITSIZE (GET_MODE (use)); + + return true; + } + + return false; +} + +/* Register the use of a reg. */ + +static void +register_use (int size, unsigned int regno) +{ + int *current; + int prev; + + gcc_assert (size >= 0); + gcc_assert (regno < (unsigned int)max_reg_num ()); + + current = &biggest_use[regno]; + + if (*current == SKIP_REG) + return; + + *current = MAX (*current, size); + nr_uses[regno]++; + + if (prev_nr_uses == NULL) + return; + + prev = prev_biggest_use[regno]; + if (nr_uses[regno] == prev_nr_uses[regno] && *current < prev) + { + if (dump_file) + fprintf (dump_file, "reg %d: size %d -> %d\n", regno, prev, *current); + changed = true; + } +} + +/* Get the biggest use of a reg from a previous scan. In case all uses are + already registered in the current pass, use the biggest use from the current + pass. */ + +static int +get_prev_biggest_use (unsigned int regno) +{ + int current, res; + + gcc_assert (!skip_reg_p (regno)); + + res = prev_biggest_use[regno]; + + current = biggest_use[regno]; + gcc_assert (current <= res); + if (nr_uses[regno] == prev_nr_uses[regno] && current < res) + res = current; + + gcc_assert (res >= 0); + return res; +} + +/* Handle embedded uses. */ + +static void +note_embedded_uses (rtx use, rtx pattern) +{ + const char *format_ptr; + int i, j; + + format_ptr = GET_RTX_FORMAT (GET_CODE (use)); + for (i = 0; i < GET_RTX_LENGTH (GET_CODE (use)); i++) + if (format_ptr[i] == 'e') + note_use (&XEXP (use, i), pattern); + else if (format_ptr[i] == 'E') + for (j = 0; j < XVECLEN (use, i); j++) + note_use (&XVECEXP (use, i, j), pattern); +} + +/* Get the set that has use as its SRC operand. */ + +static rtx +get_set (rtx use, rtx pattern) +{ + rtx sub; + int i; + + if (GET_CODE (pattern) == SET && SET_SRC (pattern) == use) + return pattern; + + if (GET_CODE (pattern) == PARALLEL) + for (i = 0; i < XVECLEN (pattern, 0); ++i) + { + sub = XVECEXP (pattern, 0, i); + if (GET_CODE (sub) == SET && SET_SRC (sub) == use) + return sub; + } + + return NULL_RTX; +} + +/* Handle a restricted op use. In this context restricted means that a bit in an + operand influences only the same bit or more significant bits in the result. + The bitwise ops are a subclass, but PLUS is one as well. */ + +static void +note_restricted_op_use (rtx use, unsigned int nr_operands, rtx pattern) +{ + unsigned int i, smallest; + int operand_size[2]; + int used_size; + unsigned int operand_regno[2]; + bool operand_reg[2]; + bool operand_ignore[2]; + rtx set; + + /* Init operand_reg, operand_size, operand_regno and operand_ignore. */ + for (i = 0; i < nr_operands; ++i) + { + operand_reg[i] = reg_use_p (XEXP (use, i), &operand_size[i], + &operand_regno[i]); + operand_ignore[i] = false; + } + + /* Handle case of reg and-masked with const. */ + if (GET_CODE (use) == AND && CONST_INT_P (XEXP (use, 1)) && operand_reg[0]) + { + used_size = + HOST_BITS_PER_WIDE_INT - clz_hwi (UINTVAL (XEXP (use, 1))); + operand_size[0] = MIN (operand_size[0], used_size); + } + + /* Handle case of reg or-masked with const. */ + if (GET_CODE (use) == IOR && CONST_INT_P (XEXP (use, 1)) && operand_reg[0]) + { + used_size = + HOST_BITS_PER_WIDE_INT - clz_hwi (~UINTVAL (XEXP (use, 1))); + operand_size[0] = MIN (operand_size[0], used_size); + } + + /* Ignore the use of a in 'a = a + b'. */ + set = get_set (use, pattern); + /* TODO: handle GET_CODE ((SET_DEST (set))) == SUBREG. */ + if (set != NULL_RTX && REG_P (SET_DEST (set))) + for (i = 0; i < nr_operands; ++i) + operand_ignore[i] = (operand_reg[i] + && (REGNO (SET_DEST (set)) == operand_regno[i])); + + /* Handle the case a reg is combined with don't care bits. */ + if (nr_operands == 2 && operand_reg[0] && operand_reg[1] + && operand_size[0] != operand_size[1]) + { + smallest = operand_size[0] > operand_size[1]; + + if (paradoxical_subreg_p (XEXP (use, smallest)) + && !SUBREG_PROMOTED_VAR_P (XEXP (use, smallest))) + operand_size[1 - smallest] = operand_size[smallest]; + } + + /* TODO: handle GET_CODE ((SET_DEST (set))) == SUBREG. */ + if (prev_biggest_use != NULL && set != NULL_RTX && REG_P (SET_DEST (set)) + && !skip_reg_p (REGNO (SET_DEST (set)))) + for (i = 0; i < nr_operands; ++i) + operand_size[i] = MIN (operand_size[i], + get_prev_biggest_use (REGNO (SET_DEST (set)))); + + /* Register the operand use, if necessary. */ + for (i = 0; i < nr_operands; ++i) + if (!operand_reg[i]) + note_use (&XEXP (use, i), pattern); + else if (!operand_ignore[i]) + register_use (operand_size[i], operand_regno[i]); +} + +/* Handle all uses noted by note_uses. */ + +static void +note_use (rtx *x, void *data) +{ + rtx use = *x; + rtx pattern = (rtx)data; + int use_size; + unsigned int use_regno; + rtx set; + + switch (GET_CODE (use)) + { + case REG: + case SUBREG: + if (!reg_use_p (use, &use_size, &use_regno)) + { + note_embedded_uses (use, pattern); + return; + } + if (prev_biggest_use != NULL) + { + set = get_set (use, pattern); + /* TODO: handle GET_CODE ((SET_DEST (set))) == SUBREG. */ + if (set != NULL_RTX && REG_P (SET_DEST (set)) + && !skip_reg_p (REGNO (SET_DEST (set)))) + use_size = MIN (use_size, + get_prev_biggest_use (REGNO (SET_DEST (set)))); + } + register_use (use_size, use_regno); + return; + case IOR: + case AND: + case XOR: + case PLUS: + case MINUS: + note_restricted_op_use (use, 2, pattern); + return; + case NOT: + case NEG: + note_restricted_op_use (use, 1, pattern); + return; + case ASHIFT: + if (!reg_use_p (XEXP (use, 0), &use_size, &use_regno) + || !CONST_INT_P (XEXP (use, 1)) + || INTVAL (XEXP (use, 1)) <= 0 + || paradoxical_subreg_p (XEXP (use, 0))) + { + note_embedded_uses (use, pattern); + return; + } + register_use (use_size - INTVAL (XEXP (use, 1)), use_regno); + return; + default: + note_embedded_uses (use, pattern); + return; + } +} + +/* Check whether reg is implicitly used. */ + +static bool +implicit_use_p (int regno ATTRIBUTE_UNUSED) +{ +#ifdef EPILOGUE_USES + if (EPILOGUE_USES (regno)) + return true; +#endif + +#ifdef EH_USES + if (EH_USES (regno)) + return true; +#endif + + return false; +} + +/* Check whether reg should be skipped in analysis. */ + +static bool +skip_reg_p (int regno) +{ + /* TODO: handle hard registers. The problem with hard registers is that + a DI use of r0 can mean a 64bit use of r0 and a 32 bit use of r1. + We don't handle that properly. */ + return implicit_use_p (regno) || HARD_REGISTER_NUM_P (regno); +} + +/* Note the uses of argument registers in a call. */ + +static void +note_call_uses (rtx insn) +{ + rtx link, link_expr; + + if (!CALL_P (insn)) + return; + + for (link = CALL_INSN_FUNCTION_USAGE (insn); link; link = XEXP (link, 1)) + { + link_expr = XEXP (link, 0); + + if (GET_CODE (link_expr) == USE) + note_use (&XEXP (link_expr, 0), link); + } +} + +/* Dump the biggest uses found. */ + +static void +dump_biggest_use (void) +{ + int i; + + if (!dump_file) + return; + + for (i = 0; i < max_reg_num (); i++) + if (biggest_use[i] > 0) + fprintf (dump_file, "reg %d: size %d\n", i, biggest_use[i]); + + fprintf (dump_file, "\n"); +} + +/* Calculate the biggest use mode for all regs. */ + +static void +calculate_biggest_use (void) +{ + int i; + basic_block bb; + rtx insn; + + /* Initialize biggest_use for all regs to 0. If a reg is used implicitly, we + handle that reg conservatively and set it to SKIP_REG instead. */ + for (i = 0; i < max_reg_num (); i++) + { + biggest_use[i] = skip_reg_p (i) ? SKIP_REG : 0; + nr_uses[i] = skip_reg_p (i) ? SKIP_REG : 0; + } + + /* For all insns, call note_use for each use in insn. */ + FOR_EACH_BB (bb) + FOR_BB_INSNS_REVERSE (bb, insn) + { + if (!NONDEBUG_INSN_P (insn)) + continue; + + note_uses (&PATTERN (insn), note_use, PATTERN (insn)); + + if (CALL_P (insn)) + note_call_uses (insn); + } +} + +/* Check whether this is a sign/zero extension. */ + +static bool +extension_p (rtx insn, rtx *dest, rtx *inner, int *preserved_size) +{ + rtx src, op0; + + /* Detect set of reg. */ + if (GET_CODE (PATTERN (insn)) != SET) + return false; + + src = SET_SRC (PATTERN (insn)); + *dest = SET_DEST (PATTERN (insn)); + + if (!REG_P (*dest)) + return false; + + /* Detect sign or zero extension. */ + if (GET_CODE (src) == ZERO_EXTEND || GET_CODE (src) == SIGN_EXTEND + || (GET_CODE (src) == AND && CONST_INT_P (XEXP (src, 1)))) + { + op0 = XEXP (src, 0); + + /* Determine amount of least significant bits preserved by operation. */ + if (GET_CODE (src) == AND) + *preserved_size = ctz_hwi (~UINTVAL (XEXP (src, 1))); + else + *preserved_size = GET_MODE_BITSIZE (GET_MODE (op0)); + + if (GET_CODE (op0) == SUBREG) + { + if (subreg_lsb (op0) != 0) + return false; + + *inner = SUBREG_REG (op0); + return true; + } + else if (REG_P (op0)) + { + *inner = op0; + return true; + } + } + + return false; +} + +/* Find extensions and store them in the extensions vector. */ + +static bool +find_extensions (void) +{ + basic_block bb; + rtx insn, dest, inner; + int preserved_size; + + /* For all insns, call note_use for each use in insn. */ + FOR_EACH_BB (bb) + FOR_BB_INSNS (bb, insn) + { + if (!NONDEBUG_INSN_P (insn)) + continue; + + if (!extension_p (insn, &dest, &inner, &preserved_size)) + continue; + + VEC_safe_push (rtx, heap, extensions, insn); + + if (dump_file) + fprintf (dump_file, + "found extension %u with preserved size %d defining" + " reg %d\n", + INSN_UID (insn), preserved_size, REGNO (dest)); + } + + if (dump_file) + { + if (!VEC_empty (rtx, extensions)) + fprintf (dump_file, "\n"); + else + fprintf (dump_file, "no extensions found.\n"); + } + + return !VEC_empty (rtx, extensions); +} + +/* Check whether this is a redundant sign/zero extension. */ + +static bool +redundant_extension_p (rtx insn, rtx *dest, rtx *inner, int *preserved_size) +{ + int biggest_dest_use; + + if (!extension_p (insn, dest, inner, preserved_size)) + return false; + + biggest_dest_use = biggest_use[REGNO (*dest)]; + + if (biggest_dest_use == SKIP_REG) + return false; + + if (*preserved_size < biggest_dest_use) + return false; + + return true; +} + +/* Find the redundant extensions in the extensions vector and move them to the + redundant_extensions vector. */ + +static void +find_redundant_extensions (void) +{ + rtx insn, dest, inner; + int ix; + bool found = false; + int preserved_size; + + FOR_EACH_VEC_ELT_REVERSE (rtx, extensions, ix, insn) + if (redundant_extension_p (insn, &dest, &inner, &preserved_size)) + { + VEC_safe_push (rtx, heap, redundant_extensions, insn); + VEC_unordered_remove (rtx, extensions, ix); + + if (dump_file) + fprintf (dump_file, + "found superfluous extension %u with preserved size %d" + " defining reg %d\n", + INSN_UID (insn), preserved_size, REGNO (dest)); + found = true; + } + + if (dump_file && found) + fprintf (dump_file, "\n"); +} + +/* run calculate_biggest_use iteratively. */ + +static void +propagate (void) +{ + unsigned int nr_propagate = 0; + unsigned int max_propagate = PARAM_VALUE (PARAM_EE_MAX_PROPAGATE); + + while (nr_propagate < max_propagate && !VEC_empty (rtx, extensions)) + { + ++nr_propagate; + + if (dump_file) + fprintf (dump_file, "propagating(%u)\n", nr_propagate); + + SWAP (int *, biggest_use, prev_biggest_use); + SWAP (int *, nr_uses, prev_nr_uses); + + changed = false; + calculate_biggest_use (); + if (!changed) + break; + + if (dump_file) + fprintf (dump_file, "\n"); + + find_redundant_extensions (); + } +} + +/* Try to remove or replace the redundant extension. */ + +static void +try_remove_or_replace_extension (rtx insn, rtx dest, rtx inner) +{ + rtx cp_src, cp_dest, seq, one; + + /* Check whether replacement is needed. */ + if (dest != inner) + { + start_sequence (); + + /* Determine the proper replacement operation. */ + if (GET_MODE (dest) == GET_MODE (inner)) + { + cp_src = inner; + cp_dest = dest; + } + else if (GET_MODE_SIZE (GET_MODE (dest)) + > GET_MODE_SIZE (GET_MODE (inner))) + { + emit_clobber (dest); + cp_src = inner; + cp_dest = gen_lowpart_SUBREG (GET_MODE (inner), dest); + } + else + { + cp_src = gen_rtx_TRUNCATE (GET_MODE (dest), inner); + cp_dest = dest; + } + + emit_move_insn (cp_dest, cp_src); + + seq = get_insns (); + end_sequence (); + + /* If the replacement is not supported, bail out. */ + for (one = seq; one != NULL_RTX; one = NEXT_INSN (one)) + if (recog_memoized (one) < 0 && GET_CODE (PATTERN (one)) != CLOBBER) + return; + + /* Insert the replacement. */ + emit_insn_before (seq, insn); + + if (dump_file) + { + fprintf (dump_file, "superfluous extension %u replaced by\n", + INSN_UID (insn)); + for (one = seq; one != NULL_RTX; one = NEXT_INSN (one)) + fprintf (dump_file, " %u", INSN_UID (seq)); + fprintf (dump_file, "\n"); + } + } + else + if (dump_file) + fprintf (dump_file, "superfluous extension %u removed\n", INSN_UID (insn)); + + /* Remove the extension. */ + delete_insn (insn); +} + +/* Remove the redundant extensions. */ + +static void +remove_redundant_extensions (void) +{ + rtx insn, dest, inner; + int preserved_size; + int ix; + + FOR_EACH_VEC_ELT (rtx, redundant_extensions, ix, insn) + { + extension_p (insn, &dest, &inner, &preserved_size); + try_remove_or_replace_extension (insn, dest, inner); + } + + if (dump_file) + fprintf (dump_file, "\n"); +} + +/* Setup the variables at the start of the pass. */ + +static void +init_pass (bool for_propagate) +{ + if (for_propagate) + { + prev_biggest_use = XNEWVEC (int, max_reg_num ()); + prev_nr_uses = XNEWVEC (int, max_reg_num ()); + return; + } + + biggest_use = XNEWVEC (int, max_reg_num ()); + nr_uses = XNEWVEC (int, max_reg_num ()); + + prev_biggest_use = NULL; + prev_nr_uses = NULL; + + extensions = VEC_alloc (rtx, heap, 10); + redundant_extensions = VEC_alloc (rtx, heap, 10); +} + +/* Free the variables at the end of the pass. */ + +static void +finish_pass (void) +{ + XDELETEVEC (biggest_use); + XDELETEVEC (nr_uses); + + XDELETEVEC (prev_biggest_use); + XDELETEVEC (prev_nr_uses); + + VEC_free (rtx, heap, extensions); + VEC_free (rtx, heap, redundant_extensions); +} + +/* Find redundant extensions and remove or replace them if possible. */ + +static void +find_and_remove_redundant_extensions (void) +{ + init_pass (false); + + if (!find_extensions ()) + { + finish_pass (); + return; + } + + calculate_biggest_use (); + dump_biggest_use (); + + find_redundant_extensions (); + + if (optimize >= 2 && !VEC_empty (rtx, extensions)) + { + init_pass (true); + propagate (); + } + + remove_redundant_extensions (); + + finish_pass (); +} + +/* Remove redundant extensions. */ + +static unsigned int +rest_of_handle_ee (void) +{ + find_and_remove_redundant_extensions (); + return 0; +} + +/* Run ee pass when flag_ee is set at optimization level > 0. */ + +static bool +gate_handle_ee (void) +{ + return (optimize > 0 && flag_ee); +} + +struct rtl_opt_pass pass_ee = +{ + { + RTL_PASS, + "ee", /* name */ + gate_handle_ee, /* gate */ + rest_of_handle_ee, /* execute */ + NULL, /* sub */ + NULL, /* next */ + 0, /* static_pass_number */ + TV_EE, /* tv_id */ + 0, /* properties_required */ + 0, /* properties_provided */ + 0, /* properties_destroyed */ + 0, /* todo_flags_start */ + TODO_ggc_collect | + TODO_dump_func | + TODO_verify_rtl_sharing, /* todo_flags_finish */ + } +}; Index: gcc/common.opt =================================================================== --- gcc/common.opt (revision 165080) +++ gcc/common.opt (working copy) @@ -761,6 +761,10 @@ Common Report Var(flag_eliminate_dwarf2_dups) Perform DWARF2 duplicate elimination +fextension-elimination +Common Report Var(flag_ee) Init(0) Optimization +Perform extension elimination + fipa-sra Common Report Var(flag_ipa_sra) Init(0) Optimization Perform interprocedural reduction of aggregates Index: gcc/Makefile.in =================================================================== --- gcc/Makefile.in (revision 165080) +++ gcc/Makefile.in (working copy) @@ -1218,6 +1218,7 @@ dwarf2asm.o \ dwarf2out.o \ ebitmap.o \ + ee.o \ emit-rtl.o \ et-forest.o \ except.o \ @@ -3107,6 +3108,11 @@ web.o : web.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(RTL_H) \ hard-reg-set.h $(FLAGS_H) $(BASIC_BLOCK_H) $(FUNCTION_H) output.h $(TOPLEV_H) $(DIAGNOSTIC_CORE_H) \ insn-config.h $(RECOG_H) $(DF_H) $(OBSTACK_H) $(TIMEVAR_H) $(TREE_PASS_H) +ee.o : ee.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(RTL_H) \ + hard-reg-set.h $(FLAGS_H) $(BASIC_BLOCK_H) $(FUNCTION_H) output.h \ + $(DF_H) $(TIMEVAR_H) tree-pass.h $(RECOG_H) $(EXPR_H) \ + $(REGS_H) $(TREE_H) $(TM_P_H) insn-config.h $(INSN_ATTR_H) $(TOPLEV_H) $(DIAGNOSTIC_CORE_H) \ + $(TARGET_H) $(OPTABS_H) insn-codes.h rtlhooks-def.h $(PARAMS_H) $(CGRAPH_H) implicit-zee.o : implicit-zee.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(RTL_H) \ hard-reg-set.h $(FLAGS_H) $(BASIC_BLOCK_H) $(FUNCTION_H) output.h \ $(DF_H) $(TIMEVAR_H) tree-pass.h $(RECOG_H) $(EXPR_H) \ Index: gcc/passes.c =================================================================== --- gcc/passes.c (revision 165080) +++ gcc/passes.c (working copy) @@ -976,6 +976,7 @@ NEXT_PASS (pass_lower_subreg); NEXT_PASS (pass_df_initialize_opt); NEXT_PASS (pass_cse); + NEXT_PASS (pass_ee); NEXT_PASS (pass_rtl_fwprop); NEXT_PASS (pass_rtl_cprop); NEXT_PASS (pass_rtl_pre); Index: gcc/params.def =================================================================== --- gcc/params.def (revision 165080) +++ gcc/params.def (working copy) @@ -849,6 +849,12 @@ "lto-min-partition", "Size of minimal paritition for WHOPR (in estimated instructions)", 1000, 0, 0) + +DEFPARAM (PARAM_EE_MAX_PROPAGATE, + "ee-max-propagate", + "Maximum number of scans over all insn to do propagation", + 3, 0, 0) + /* Local variables: mode:c Index: gcc/testsuite/gcc.dg/extend-4.c =================================================================== --- gcc/testsuite/gcc.dg/extend-4.c (revision 0) +++ gcc/testsuite/gcc.dg/extend-4.c (revision 0) @@ -0,0 +1,13 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fdump-rtl-ee" } */ + +unsigned char f(unsigned int a) +{ + unsigned int b = a & 0x10ff; + return b; +} + +/* { dg-final { scan-rtl-dump-times "and:" 0 "ee" { target mips*-*-* } } } */ +/* { dg-final { scan-rtl-dump-times "superfluous extension \[0-9\]+ replaced" 1 "ee" { target mips*-*-* } } } */ +/* { dg-final { cleanup-rtl-dump "ee" } } */ + Index: gcc/testsuite/gcc.dg/extend-1.c =================================================================== --- gcc/testsuite/gcc.dg/extend-1.c (revision 0) +++ gcc/testsuite/gcc.dg/extend-1.c (revision 0) @@ -0,0 +1,13 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fdump-rtl-ee" } */ + +void f(unsigned char * p, short s, int c, int *z) +{ + if (c) + *z = 0; + *p ^= (unsigned char)s; +} + +/* { dg-final { scan-rtl-dump-times "sign_extend:" 0 "ee" { target mips*-*-* } } } */ +/* { dg-final { scan-rtl-dump-times "superfluous extension \[0-9\]+ replaced" 1 "ee" { target mips*-*-* } } } */ +/* { dg-final { cleanup-rtl-dump "ee" } } */ Index: gcc/testsuite/gcc.dg/extend-2.c =================================================================== --- gcc/testsuite/gcc.dg/extend-2.c (revision 0) +++ gcc/testsuite/gcc.dg/extend-2.c (revision 0) @@ -0,0 +1,20 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fdump-rtl-ee" } */ +/* { dg-require-effective-target ilp32 } */ + +void f(unsigned char * p, short *s, int c) +{ + short or = 0; + while (c) + { + or = or | s[c]; + c --; + } + *p = (unsigned char)or; +} + +/* { dg-final { scan-rtl-dump-times "zero_extend" 0 "ee" { target mips*-*-* } } } */ +/* { dg-final { scan-rtl-dump-times "sign_extend" 0 "ee" { target mips*-*-* } } } */ +/* { dg-final { scan-rtl-dump-times "superfluous extension \[0-9\]+ replaced" 2 "ee" { target mips*-*-* } } } */ +/* { dg-final { cleanup-rtl-dump "ee" } } */ + Index: gcc/testsuite/gcc.dg/extend-2-64.c =================================================================== --- gcc/testsuite/gcc.dg/extend-2-64.c (revision 0) +++ gcc/testsuite/gcc.dg/extend-2-64.c (revision 0) @@ -0,0 +1,20 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fdump-rtl-ee" } */ +/* { dg-require-effective-target mips64 } */ + +void f(unsigned char * p, short *s, int c) +{ + short or = 0; + while (c) + { + or = or | s[c]; + c --; + } + *p = (unsigned char)or; +} + +/* { dg-final { scan-rtl-dump-times "zero_extend:" 1 "ee" { target mips*-*-* } } } */ +/* { dg-final { scan-rtl-dump-times "sign_extend:" 0 "ee" { target mips*-*-* } } } */ +/* { dg-final { scan-rtl-dump-times "superfluous extension \[0-9\]+ replaced" 3 "ee" { target mips*-*-* } } } */ +/* { dg-final { cleanup-rtl-dump "ee" } } */ + Index: gcc/testsuite/gcc.dg/extend-3.c =================================================================== --- gcc/testsuite/gcc.dg/extend-3.c (revision 0) +++ gcc/testsuite/gcc.dg/extend-3.c (revision 0) @@ -0,0 +1,12 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fdump-rtl-ee" } */ + +unsigned int f(unsigned char byte) +{ + return byte << 25; +} + +/* { dg-final { scan-rtl-dump-times "zero_extend:" 0 "ee" { target mips*-*-* } } } */ +/* { dg-final { scan-rtl-dump "superfluous extension \[0-9\]+ replaced" "ee" { target mips*-*-* } } } */ +/* { dg-final { cleanup-rtl-dump "ee" } } */ +