From patchwork Fri May 17 08:25:01 2013
X-Patchwork-Submitter: Jakub Jelinek
X-Patchwork-Id: 244527
Date: Fri, 17 May 2013 10:25:01 +0200
From: Jakub Jelinek
To: Uros Bizjak, Richard Henderson, Andreas Krebbel
Cc: gcc-patches@gcc.gnu.org
Subject: [PATCH] Fix extendsidi2_1 splitting (PR rtl-optimization/57281, PR rtl-optimization/57300 wrong-code, alternative)
Message-ID: <20130517082501.GQ1377@tucnak.redhat.com>
Reply-To: Jakub Jelinek
References: <20130516162210.GI1377@tucnak.redhat.com>
In-Reply-To: <20130516162210.GI1377@tucnak.redhat.com>

On Thu, May 16, 2013 at 06:22:10PM +0200, Jakub Jelinek wrote:
> As discussed in the PR, there seem to be only 3 define_split
> patterns that use dead_or_set_p, one in i386.md and two in s390.md,
> but unfortunately insn splitting is done in many passes
> (combine, split{1,2,3,4,5}, dbr, pro_and_epilogue, final, sometimes mach)
> and only in combine the note problem is computed.
> Computing the note
> problem in split{1,2,3,4,5} just because of the single pattern on i?86 -m32
> and one on s390x -m64 might be too expensive, and while neither of these
> targets do dbr scheduling, e.g. during final without cfg one can't
> df_analyze.
>
> So, the following patch fixes it by doing the transformation instead
> in the peephole2 pass which computes the notes problem and has REG_DEAD
> notes up2date (and peep2_reg_dead_p is used there heavily and works).

An alternative, so far untested, patch is to let the "register is not dead"
splitter always do its job during split2 and just fix it up during
peephole2 if the register was dead.

For the non-cltd case the peephole2 is always desirable: we get rid of
a register move and free one hard register for potential other uses
after peephole2 (cprop_hardreg? anything else that could benefit from that?).

For the cltd case it is questionable: while we gain a free hard register
at that spot, it isn't guaranteed that any pass will benefit from it, and
cltd is 2 bytes smaller than sarl $31, %eax.  I'm not sure about the
performance, though.  So, the patch below doesn't use the second peephole2
for -Os.

2013-05-17  Jakub Jelinek

        PR rtl-optimization/57281
        PR rtl-optimization/57300
        * config/i386/i386.md (extendsidi2_1 dead reg splitter): Remove.
        (extendsidi2_1 peephole2s): Add instead 2 new peephole2s, that undo
        what the other splitter did if the registers are dead.

        * gcc.dg/pr57300.c: New test.
        * gcc.c-torture/execute/pr57281.c: New test.

        Jakub

--- gcc/config/i386/i386.md.jj 2013-05-16 18:22:59.000000000 +0200
+++ gcc/config/i386/i386.md 2013-05-17 10:11:20.365455394 +0200
@@ -3332,22 +3332,8 @@ (define_insn "extendsidi2_1"
   "!TARGET_64BIT"
   "#")
 
-;; Extend to memory case when source register does die.
-(define_split
-  [(set (match_operand:DI 0 "memory_operand")
-        (sign_extend:DI (match_operand:SI 1 "register_operand")))
-   (clobber (reg:CC FLAGS_REG))
-   (clobber (match_operand:SI 2 "register_operand"))]
-  "(reload_completed
-    && dead_or_set_p (insn, operands[1])
-    && !reg_mentioned_p (operands[1], operands[0]))"
-  [(set (match_dup 3) (match_dup 1))
-   (parallel [(set (match_dup 1) (ashiftrt:SI (match_dup 1) (const_int 31)))
-              (clobber (reg:CC FLAGS_REG))])
-   (set (match_dup 4) (match_dup 1))]
-  "split_double_mode (DImode, &operands[0], 1, &operands[3], &operands[4]);")
-
-;; Extend to memory case when source register does not die.
+;; Split the memory case.  If the source register doesn't die, it will stay
+;; this way, if it does die, following peephole2s take care of it.
 (define_split
   [(set (match_operand:DI 0 "memory_operand")
         (sign_extend:DI (match_operand:SI 1 "register_operand")))
@@ -3376,6 +3362,48 @@ (define_split
   DONE;
 })
 
+;; Peepholes for the case where the source register does die, after
+;; being split with the above splitter.
+(define_peephole2
+  [(set (match_operand:SI 0 "memory_operand")
+        (match_operand:SI 1 "register_operand"))
+   (set (match_operand:SI 2 "register_operand") (match_dup 1))
+   (parallel [(set (match_dup 2)
+                   (ashiftrt:SI (match_dup 2)
+                                (match_operand:QI 3 "const_int_operand")))
+              (clobber (reg:CC FLAGS_REG))])
+   (set (match_operand:SI 4 "memory_operand") (match_dup 2))]
+  "INTVAL (operands[3]) == 31
+   && REGNO (operands[1]) != REGNO (operands[2])
+   && peep2_reg_dead_p (2, operands[1])
+   && peep2_reg_dead_p (4, operands[2])
+   && !reg_mentioned_p (operands[2], operands[4])"
+  [(set (match_dup 0) (match_dup 1))
+   (parallel [(set (match_dup 1) (ashiftrt:SI (match_dup 1) (const_int 31)))
+              (clobber (reg:CC FLAGS_REG))])
+   (set (match_dup 4) (match_dup 1))])
+
+(define_peephole2
+  [(set (match_operand:SI 0 "memory_operand")
+        (match_operand:SI 1 "register_operand"))
+   (parallel [(set (match_operand:SI 2 "register_operand")
+                   (ashiftrt:SI (match_dup 1)
+                                (match_operand:QI 3 "const_int_operand")))
+              (clobber (reg:CC FLAGS_REG))])
+   (set (match_operand:SI 4 "memory_operand") (match_dup 2))]
+  "INTVAL (operands[3]) == 31
+   /* cltd is shorter than sarl $31, %eax */
+   && !optimize_function_for_size_p (cfun)
+   && true_regnum (operands[1]) == AX_REG
+   && true_regnum (operands[2]) == DX_REG
+   && peep2_reg_dead_p (2, operands[1])
+   && peep2_reg_dead_p (3, operands[2])
+   && !reg_mentioned_p (operands[2], operands[4])"
+  [(set (match_dup 0) (match_dup 1))
+   (parallel [(set (match_dup 1) (ashiftrt:SI (match_dup 1) (const_int 31)))
+              (clobber (reg:CC FLAGS_REG))])
+   (set (match_dup 4) (match_dup 1))])
+
 ;; Extend to register case.  Optimize case where source and destination
 ;; registers match and cases where we can use cltd.
 (define_split
--- gcc/testsuite/gcc.dg/pr57300.c.jj 2013-05-16 15:51:25.084707211 +0200
+++ gcc/testsuite/gcc.dg/pr57300.c 2013-05-16 15:51:25.084707211 +0200
@@ -0,0 +1,21 @@
+/* PR rtl-optimization/57300 */
+/* { dg-do run } */
+/* { dg-options "-O3" } */
+/* { dg-additional-options "-msse2" { target sse2_runtime } } */
+
+extern void abort (void);
+int a, b, d[10];
+long long c;
+
+int
+main ()
+{
+  int e;
+  for (e = 0; e < 10; e++)
+    d[e] = 1;
+  if (d[0])
+    c = a = (b == 0 || 1 % b);
+  if (a != 1)
+    abort ();
+  return 0;
+}
--- gcc/testsuite/gcc.c-torture/execute/pr57281.c.jj 2013-05-16 15:51:25.085707131 +0200
+++ gcc/testsuite/gcc.c-torture/execute/pr57281.c 2013-05-16 15:51:25.085707131 +0200
@@ -0,0 +1,25 @@
+/* PR rtl-optimization/57281 */
+
+int a = 1, b, d, *e = &d;
+long long c, *g = &c;
+volatile long long f;
+
+int
+foo (int h)
+{
+  int j = *g = b;
+  return h == 0 ? j : 0;
+}
+
+int
+main ()
+{
+  int h = a;
+  for (; b != -20; b--)
+    {
+      (int) f;
+      *e = 0;
+      *e = foo (h);
+    }
+  return 0;
+}
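
For illustration only (this is not part of the patch): the store sequence the splitter and the peephole2s produce for extendsidi2_1 to memory boils down to "store the low word, shift the register right by 31, store that as the high word".  A minimal C sketch of that arithmetic follows.  The names extend_by_halves and self_check are hypothetical, and the sketch assumes that >> on a negative int is an arithmetic shift, which GCC documents but ISO C leaves implementation-defined.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not part of the patch): rebuild the 64-bit value
   from the two 32-bit words the split sequence stores to memory.  */
uint64_t
extend_by_halves (int32_t src)
{
  /* movl %reg, mem: the low word is the source unchanged.  */
  uint32_t lo = (uint32_t) src;
  /* sarl $31, %reg (or cltd for %eax -> %edx): every bit becomes a copy
     of the sign bit.  Assumes >> on a negative int is an arithmetic
     shift (GCC behaviour; implementation-defined in ISO C).  */
  uint32_t hi = (uint32_t) (src >> 31);
  return ((uint64_t) hi << 32) | lo;
}

/* Hypothetical self-check: the two-word result must match a plain
   int -> long long conversion for a few boundary values.  */
void
self_check (void)
{
  static const int32_t samples[]
    = { 0, 1, -1, 42, -42, INT32_MAX, INT32_MIN };
  for (unsigned i = 0; i < sizeof samples / sizeof samples[0]; i++)
    assert (extend_by_halves (samples[i])
            == (uint64_t) (int64_t) samples[i]);
}
```

Calling self_check () from a small main should return silently.  The point of the sketch is only that the high word depends on nothing but the sign bit of the source, which is why the peephole2s can reuse the source register for the shift once it is known to be dead.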