From patchwork Thu May 16 16:22:10 2013 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jakub Jelinek X-Patchwork-Id: 244392 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client CN "localhost", Issuer "www.qmailtoaster.com" (not verified)) by ozlabs.org (Postfix) with ESMTPS id C23A52C049C for ; Fri, 17 May 2013 02:22:27 +1000 (EST) DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:date :from:to:cc:subject:message-id:reply-to:mime-version :content-type; q=dns; s=default; b=GQHzqH04Kh3mX8HlZRzio9Q65eyLX 5Yw3H0KdakVV9TnAO1a+tLVhUyRjqlGlRUN1+VNwHSvHXb2Y/8EJj/lEhgBcsXMN 9BLqmtQEfjY3l1zMCfqMHmikvbJ4XXQhIY3T7KAhD3FA0c4k1+so4HTlouajxQTJ Ccrs4uB5HnYwXU= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:date :from:to:cc:subject:message-id:reply-to:mime-version :content-type; s=default; bh=VetkxU4P+DDPkA03IuEA8i1UcSo=; b=IAV EU6Vy3rHlrnngBrUlvyZUJUdbM15GPGIGwxrP6zwYk0MixA/CmtPFK8X/N/XpACD uPdyToTrQOk5Bw4m5fWRp3qyoPZPu+SBAcUigQwhqrKCkOWl3qev1wVeYfTCEwX8 XCTA3FrhYMGGIG+LjV3/ihJ70nftcd4HHbWRCJd0= Received: (qmail 498 invoked by alias); 16 May 2013 16:22:21 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Received: (qmail 479 invoked by uid 89); 16 May 2013 16:22:19 -0000 X-Spam-SWARE-Status: No, score=-6.6 required=5.0 tests=AWL, BAYES_00, RCVD_IN_HOSTKARMA_W, RCVD_IN_HOSTKARMA_WL, RP_MATCHES_RCVD, SPF_HELO_PASS, SPF_PASS, TW_CF autolearn=ham version=3.3.1 Received: from mx1.redhat.com (HELO mx1.redhat.com) (209.132.183.28) by sourceware.org (qpsmtpd/0.84/v0.84-167-ge50287c) with ESMTP; Thu, 16 May 2013 16:22:18 +0000 Received: from int-mx02.intmail.prod.int.phx2.redhat.com (int-mx02.intmail.prod.int.phx2.redhat.com [10.5.11.12]) by mx1.redhat.com (8.14.4/8.14.4) with ESMTP id r4GGMHgJ010802 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK); Thu, 16 May 2013 12:22:17 -0400 Received: from zalov.cz (vpn-57-115.rdu2.redhat.com [10.10.57.115]) by int-mx02.intmail.prod.int.phx2.redhat.com (8.13.8/8.13.8) with ESMTP id r4GGMFiv016901 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Thu, 16 May 2013 12:22:16 -0400 Received: from zalov.cz (localhost [127.0.0.1]) by zalov.cz (8.14.5/8.14.5) with ESMTP id r4GGMD0A004828; Thu, 16 May 2013 18:22:14 +0200 Received: (from jakub@localhost) by zalov.cz (8.14.5/8.14.5/Submit) id r4GGMBYq004827; Thu, 16 May 2013 18:22:11 +0200 Date: Thu, 16 May 2013 18:22:10 +0200 From: Jakub Jelinek To: Uros Bizjak , Richard Henderson , Andreas Krebbel Cc: gcc-patches@gcc.gnu.org Subject: [PATCH] Fix extendsidi2_1 splitting (PR rtl-optimization/57281, PR rtl-optimization/57300 wrong-code) Message-ID: <20130516162210.GI1377@tucnak.redhat.com> Reply-To: Jakub Jelinek MIME-Version: 1.0 Content-Disposition: inline User-Agent: Mutt/1.5.21 (2010-09-15) Hi! As discussed in the PR, there seem to be only 3 define_split patterns that use dead_or_set_p, one in i386.md and two in s390.md, but unfortunately insn splitting is done in many passes (combine, split{1,2,3,4,5}, dbr, pro_and_epilogue, final, sometimes mach) and only in combine the note problem is computed. Computing the note problem in split{1,2,3,4,5} just because of the single pattern on i?86 -m32 and one on s390x -m64 might be too expensive, and while neither of these targets do dbr scheduling, e.g. during final without cfg one can't df_analyze. So, the following patch fixes it by doing the transformation instead in the peephole2 pass which computes the notes problem and has REG_DEAD notes up2date (and peep2_reg_dead_p is used there heavily and works). The splitters in i386.md for extendsidi2_1 were reload_completed, and I think the usual case is that the pattern emerges already from register allocation, the ordering of the relevant passes for i386 then is: split2, ..., pro_and_epilogue, ..., peephole2, ..., split{4,3}, ... split5 not run, ... final When the first splitter (for the case where the register is dead) is turned into peephole2, and the latter remains define_split (so that it is split even if matched later on), we need to prevent the second splitter from splitting during split2, because then peephole2 would likely never match. That is done through the epilogue_completed check, while it isn't exactly peephole2_completed, it is close enough (there are no other splitting passes in between pro_and_epilogue and peephole2). For -O0, which doesn't run either split{3,4,5} passes, we need to use reload_completed, so that it is split at split2, peephole2 won't be run anyway, and if -fno-peephole2, there is no point not splitting immediately during split2 either. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk/4.8? If applied, I think s390x should do something similar. 2013-05-16 Jakub Jelinek PR rtl-optimization/57281 PR rtl-optimization/57300 * config/i386/i386.md (extendsidi2_1 splits): Turn the first one into define_peephole2 from define_split, use peep2_reg_dead_p instead of dead_or_set_p. Guard the second splitter with epilogue_completed if possible. * gcc.dg/pr57300.c: New test. * gcc.c-torture/execute/pr57281.c: New test. Jakub --- gcc/config/i386/i386.md.jj 2013-05-16 12:36:29.669418198 +0200 +++ gcc/config/i386/i386.md 2013-05-16 16:03:08.833424642 +0200 @@ -3333,13 +3333,13 @@ (define_insn "extendsidi2_1" "#") ;; Extend to memory case when source register does die. -(define_split - [(set (match_operand:DI 0 "memory_operand") - (sign_extend:DI (match_operand:SI 1 "register_operand"))) - (clobber (reg:CC FLAGS_REG)) - (clobber (match_operand:SI 2 "register_operand"))] - "(reload_completed - && dead_or_set_p (insn, operands[1]) +;; This is a peephole2, so that it can use peep2_reg_dead_p. +(define_peephole2 + [(parallel [(set (match_operand:DI 0 "memory_operand") + (sign_extend:DI (match_operand:SI 1 "register_operand"))) + (clobber (reg:CC FLAGS_REG)) + (clobber (match_operand:SI 2 "register_operand"))])] + "(peep2_reg_dead_p (1, operands[1]) && !reg_mentioned_p (operands[1], operands[0]))" [(set (match_dup 3) (match_dup 1)) (parallel [(set (match_dup 1) (ashiftrt:SI (match_dup 1) (const_int 31))) @@ -3348,12 +3348,15 @@ (define_split "split_double_mode (DImode, &operands[0], 1, &operands[3], &operands[4]);") ;; Extend to memory case when source register does not die. +;; Guarded by epilogue_completed, instead of reload_completed, if possible, +;; so that it isn't split before peephole2 has been run. Don't do that +;; if none of the split[345] passes will be run or peephole2 will not happen. (define_split [(set (match_operand:DI 0 "memory_operand") (sign_extend:DI (match_operand:SI 1 "register_operand"))) (clobber (reg:CC FLAGS_REG)) (clobber (match_operand:SI 2 "register_operand"))] - "reload_completed" + "(optimize > 0 && flag_peephole2) ? epilogue_completed : reload_completed" [(const_int 0)] { split_double_mode (DImode, &operands[0], 1, &operands[3], &operands[4]); --- gcc/testsuite/gcc.dg/pr57300.c.jj 2013-05-16 15:51:25.084707211 +0200 +++ gcc/testsuite/gcc.dg/pr57300.c 2013-05-16 15:51:25.084707211 +0200 @@ -0,0 +1,21 @@ +/* PR rtl-optimization/57300 */ +/* { dg-do run } */ +/* { dg-options "-O3" } */ +/* { dg-additional-options "-msse2" { target sse2_runtime } } */ + +extern void abort (void); +int a, b, d[10]; +long long c; + +int +main () +{ + int e; + for (e = 0; e < 10; e++) + d[e] = 1; + if (d[0]) + c = a = (b == 0 || 1 % b); + if (a != 1) + abort (); + return 0; +} --- gcc/testsuite/gcc.c-torture/execute/pr57281.c.jj 2013-05-16 15:51:25.085707131 +0200 +++ gcc/testsuite/gcc.c-torture/execute/pr57281.c 2013-05-16 15:51:25.085707131 +0200 @@ -0,0 +1,25 @@ +/* PR rtl-optimization/57281 */ + +int a = 1, b, d, *e = &d; +long long c, *g = &c; +volatile long long f; + +int +foo (int h) +{ + int j = *g = b; + return h == 0 ? j : 0; +} + +int +main () +{ + int h = a; + for (; b != -20; b--) + { + (int) f; + *e = 0; + *e = foo (h); + } + return 0; +}