From patchwork Sun Jun 27 21:05:04 2010
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Eric Botcazou
X-Patchwork-Id: 57104
Return-Path:
X-Original-To: incoming@patchwork.ozlabs.org
Delivered-To: patchwork-incoming@bilbo.ozlabs.org
Received: from sourceware.org (server1.sourceware.org [209.132.180.131])
 by ozlabs.org (Postfix) with SMTP id 902DAB6EE9
 for ; Mon, 28 Jun 2010 07:09:33 +1000 (EST)
Received: (qmail 13435 invoked by alias); 27 Jun 2010 21:09:31 -0000
Received: (qmail 13414 invoked by uid 22791); 27 Jun 2010 21:09:30 -0000
X-SWARE-Spam-Status: No, hits=-2.1 required=5.0 tests=AWL,BAYES_00,TW_FC
X-Spam-Check-By: sourceware.org
Received: from mel.act-europe.fr (HELO mel.act-europe.fr) (212.99.106.210)
 by sourceware.org (qpsmtpd/0.43rc1) with ESMTP;
 Sun, 27 Jun 2010 21:09:26 +0000
Received: from localhost (localhost [127.0.0.1])
 by filtered-smtp.eu.adacore.com (Postfix) with ESMTP id 83F26CB0238;
 Sun, 27 Jun 2010 23:09:36 +0200 (CEST)
Received: from mel.act-europe.fr ([127.0.0.1])
 by localhost (smtp.eu.adacore.com [127.0.0.1]) (amavisd-new, port 10024)
 with ESMTP id feMF4vP3SOw6; Sun, 27 Jun 2010 23:09:36 +0200 (CEST)
Received: from [192.168.1.2] (bon31-9-83-155-120-49.fbx.proxad.net [83.155.120.49])
 (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by mel.act-europe.fr (Postfix) with ESMTP id 48EC7CB0202;
 Sun, 27 Jun 2010 23:09:36 +0200 (CEST)
From: Eric Botcazou
To: Bernd Schmidt
Subject: Re: combine/dce patch for PR36003, PR42575
Date: Sun, 27 Jun 2010 23:05:04 +0200
User-Agent: KMail/1.9.9
Cc: gcc-patches@gcc.gnu.org
References: <4C20938E.2060606@codesourcery.com>
 <201006241516.38790.ebotcazou@adacore.com>
 <4C24873D.6050405@codesourcery.com>
In-Reply-To: <4C24873D.6050405@codesourcery.com>
MIME-Version: 1.0
Content-Disposition: inline
Message-Id: <201006272305.05010.ebotcazou@adacore.com>
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id:
List-Unsubscribe:
List-Archive:
List-Post:
List-Help:
Sender: gcc-patches-owner@gcc.gnu.org
Delivered-To: mailing list gcc-patches@gcc.gnu.org

> Surely that's for RA/reload/sched which are naturally expensive?  Things
> like lower-subreg, ifcvt, dce tend to show up with 0% when I look at the
> time report if they show up with any time at all.

See Steven's recent figures: while lower-subreg is indeed cheap, ifcvt1 +
ifcvt2 account for 0.91% and dce for 0.45% of the compilation time, without
the DF overhead.  The combined DF overhead is 11%.

> Anyhow.  Here's a very simple implementation, grafted onto lower-subreg,
> which gets rid of the unnecessary insn in PR42575.  It also gets rid of
> all the byte_lr code :)  I have another version where it's done as an
> extra pass inside DCE.

lower-subreg should already be able to handle this but it cannot because
expand_doubleword_mult generates unnecessarily tangled RTL.  The first
attached patch is sufficient to untangle it, but this has an unexpected side
effect on x86 (turning an LEA into ADD + MOVE).

Another approach is to teach lower-subreg to handle MULT on the RHS specially,
as it already does for ASM_OPERANDS.  This is the second attached patch.  It
has no effect on x86 (and SPARC and others) because of the ??? comment.

Both very lightly tested.  The part that gets rid of byte DF is OK if it is
approved by a DF maintainer.


	* optabs.c (expand_doubleword_mult): If no target is specified, copy
	the result of the widening multiplication into a temporary before
	adjusting it.
	* stmt.c (expand_return): Do not assign the return value a constant
	temporary if it is scalar.


	* lower-subreg.c (special_source_operand): New static function.
	(simple_move): Use it to recognize special source operands.
	(resolve_simple_move): Handle special source operands specially.
	(decompose_multiword_subreg): Do not treat ASM_OPERANDS specially.

Index: lower-subreg.c
===================================================================
--- lower-subreg.c	(revision 161426)
+++ lower-subreg.c	(working copy)
@@ -67,6 +67,24 @@ static bitmap non_decomposable_context;
    copy from reg M to reg N.  */
 static VEC(bitmap,heap) *reg_copy_graph;
 
+/* Return nonzero if X is a special source operand for which we may want to
+   decompose the destination.  The value is the number of sub-operands.  */
+
+static int
+special_source_operand (rtx x)
+{
+  /* We handle ASM_OPERANDS as it is beneficial for things like x86 rdtsc
+     which returns a DImode value.  */
+  if (GET_CODE (x) == ASM_OPERANDS)
+    return 1;
+
+  /* We handle MULT as it is beneficial for double-word multiplication.  */
+  if (GET_CODE (x) == MULT)
+    return 2;
+
+  return 0;
+}
+
 /* Return whether X is a simple object which we can take a word_mode
    subreg of.  */
 
@@ -100,11 +118,12 @@ simple_move_operand (rtx x)
 static rtx
 simple_move (rtx insn)
 {
-  rtx x;
-  rtx set;
+  const int n = recog_data.n_operands;
   enum machine_mode mode;
+  rtx set, x;
+  int sso;
 
-  if (recog_data.n_operands != 2)
+  if (n < 2)
     return NULL_RTX;
 
   set = single_set (insn);
@@ -118,13 +137,22 @@ simple_move (rtx insn)
     return NULL_RTX;
 
   x = SET_SRC (set);
-  if (x != recog_data.operand[0] && x != recog_data.operand[1])
-    return NULL_RTX;
-  /* For the src we can handle ASM_OPERANDS, and it is beneficial for
-     things like x86 rdtsc which returns a DImode value.  */
-  if (GET_CODE (x) != ASM_OPERANDS
-      && !simple_move_operand (x))
-    return NULL_RTX;
+  sso = special_source_operand (x);
+  if (sso)
+    {
+      if (n != 1 + sso)
+        return NULL_RTX;
+      /* ??? This probably disables the MULT case for several platforms.  */
+      if (GET_CODE (PATTERN (insn)) == PARALLEL)
+        return NULL_RTX;
+    }
+  else
+    {
+      if (x != recog_data.operand[0] && x != recog_data.operand[1])
+        return NULL_RTX;
+      if (!simple_move_operand (x))
+        return NULL_RTX;
+    }
 
   /* We try to decompose in integer modes, to avoid generating
      inefficient code copying between integer and floating point
@@ -714,16 +742,23 @@ resolve_simple_move (rtx set, rtx insn)
       gcc_assert (acg);
     }
 
+  /* If SRC is a special operand, we need to move via a temporary register.  */
+  if (special_source_operand (src))
+    {
+      int acg;
+      rtx reg = gen_reg_rtx (orig_mode);
+      for_each_rtx (&src, resolve_subreg_use, NULL_RTX);
+      acg = apply_change_group ();
+      gcc_assert (acg);
+      emit_move_insn (reg, src);
+      src = reg;
+    }
+
   /* If SRC is a register which we can't decompose, or has side
      effects, we need to move via a temporary register.  */
-
-  if (!can_decompose_p (src)
-      || side_effects_p (src)
-      || GET_CODE (src) == ASM_OPERANDS)
+  else if (!can_decompose_p (src) || side_effects_p (src))
     {
-      rtx reg;
-
-      reg = gen_reg_rtx (orig_mode);
+      rtx reg = gen_reg_rtx (orig_mode);
       emit_move_insn (reg, src);
       src = reg;
     }
@@ -1104,7 +1139,7 @@ decompose_multiword_subregs (void)
         {
           rtx set;
           enum classify_move_insn cmi;
-          int i, n;
+          int i;
 
           if (!INSN_P (insn)
               || GET_CODE (PATTERN (insn)) == CLOBBER
@@ -1129,25 +1164,11 @@ decompose_multiword_subregs (void)
               cmi = SIMPLE_MOVE;
             }
 
-          n = recog_data.n_operands;
-          for (i = 0; i < n; ++i)
-            {
-              for_each_rtx (&recog_data.operand[i],
-                            find_decomposable_subregs,
-                            &cmi);
-
-              /* We handle ASM_OPERANDS as a special case to support
-                 things like x86 rdtsc which returns a DImode value.
-                 We can decompose the output, which will certainly be
-                 operand 0, but not the inputs.  */
-
-              if (cmi == SIMPLE_MOVE
-                  && GET_CODE (SET_SRC (set)) == ASM_OPERANDS)
-                {
-                  gcc_assert (i == 0);
-                  cmi = NOT_SIMPLE_MOVE;
-                }
-            }
+          for (i = 0; i < recog_data.n_operands; ++i)
+            for_each_rtx (&recog_data.operand[i],
+                          find_decomposable_subregs,
+                          &cmi);
+
         }
     }
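
For readers who want to see the new filter in simple_move in isolation, here is a minimal standalone sketch of the operand-count and PARALLEL checks from the second patch.  It is not GCC code: the toy_* names and the enum are hypothetical stand-ins for rtx codes, and the remaining checks in simple_move (single_set, simple_move_operand, the mode tests) are deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for rtx codes; not GCC's real definitions.  */
enum toy_code { TOY_REG, TOY_ASM_OPERANDS, TOY_MULT };

/* Modelled on special_source_operand from the patch: return the number
   of sub-operands of a special source, or 0 if it is not special.  */
static int
toy_special_source_operand (enum toy_code code)
{
  if (code == TOY_ASM_OPERANDS)
    return 1;
  if (code == TOY_MULT)
    return 2;
  return 0;
}

/* Model only the operand-count and PARALLEL filters of simple_move.
   N_OPERANDS plays the role of recog_data.n_operands and IS_PARALLEL
   the role of GET_CODE (PATTERN (insn)) == PARALLEL.  */
static bool
toy_passes_filter (enum toy_code src, int n_operands, bool is_parallel)
{
  int sso;

  assert (n_operands >= 0);
  if (n_operands < 2)
    return false;

  sso = toy_special_source_operand (src);
  if (sso)
    /* One destination plus SSO sub-operands, and no PARALLEL.  */
    return n_operands == 1 + sso && !is_parallel;

  /* Ordinary sources: the real code also requires the source to be
     operand 0 or 1 and a simple_move_operand; omitted in this sketch.  */
  return true;
}
```

Under this model a MULT source with three operands passes only when the pattern is not a PARALLEL, which is the check marked ??? in the patch; an ASM_OPERANDS source needs exactly two operands.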