From patchwork Wed May 29 17:15:07 2013 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Meador Inge X-Patchwork-Id: 247354 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client CN "localhost", Issuer "www.qmailtoaster.com" (not verified)) by ozlabs.org (Postfix) with ESMTPS id DC30D2C00A9 for ; Thu, 30 May 2013 03:15:22 +1000 (EST) DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:from :to:cc:subject:date:message-id:mime-version:content-type; q=dns; s=default; b=D+HtWgYgww5BPA6DHfJ5rS16VQ7Vrr9W+HfGdxZa7FdujX+eyh tsO0PtxKEiodS9lXGBANsxtCLamgLtInFP0jZF4aoDWE7RS750rQ/3MzRd2xeekC kFpIAwia4Ageu3t7GG3kWS83yb68VRNNK6BSyfNYYQHdpyt95xenMkczA= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:from :to:cc:subject:date:message-id:mime-version:content-type; s= default; bh=FOuSLNK65Hz+rPtltyvTjMvs/yY=; b=Jbr3t6rQjyDrHjnnId/7 sCJn9MnSgZ0j2SSEM9RjfR5EgETxTpQ2UitHelBN9ObxPyLVIQbviz3SnmYVnJXR zCqlRtcWe6T933y9N4PVB04Avrk3dNQ9UfeauTvyhCgloKnrmScuRm3CNdCc75Qa 5t+pFTdaiDTCGYj90VXmIh4= Received: (qmail 6939 invoked by alias); 29 May 2013 17:15:15 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Received: (qmail 6926 invoked by uid 89); 29 May 2013 17:15:15 -0000 X-Spam-SWARE-Status: No, score=-4.0 required=5.0 tests=AWL, BAYES_00, KHOP_RCVD_UNTRUST, RCVD_IN_HOSTKARMA_W, RCVD_IN_HOSTKARMA_WL, TW_DR autolearn=ham version=3.3.1 Received: from relay1.mentorg.com (HELO relay1.mentorg.com) (192.94.38.131) by sourceware.org (qpsmtpd/0.84/v0.84-167-ge50287c) with ESMTP; Wed, 29 May 2013 17:15:12 +0000 Received: from svr-orw-exc-10.mgc.mentorg.com ([147.34.98.58]) by relay1.mentorg.com with esmtp id 1Uhjxu-0002GI-Hy from meador_inge@mentor.com ; Wed, 29 May 2013 10:15:10 -0700 Received: from SVR-ORW-FEM-03.mgc.mentorg.com ([147.34.97.39]) by SVR-ORW-EXC-10.mgc.mentorg.com with Microsoft SMTPSVC(6.0.3790.4675); Wed, 29 May 2013 10:15:10 -0700 Received: from pepper.mgc.mentorg.com (147.34.91.1) by svr-orw-fem-03.mgc.mentorg.com (147.34.97.39) with Microsoft SMTP Server id 14.2.247.3; Wed, 29 May 2013 10:15:15 -0700 From: Meador Inge To: CC: , Subject: [PATCH] ARM: Don't clobber CC reg when it is live after the peephole window Date: Wed, 29 May 2013 12:15:07 -0500 Message-ID: <1369847707-8357-1-git-send-email-meadori@codesourcery.com> MIME-Version: 1.0 Hi All, This patch fixes a bug in one of the ARM peephole2 optimizations. The peephole2 optimization in question was changed to use the CC-updating form for all of the instructions produced by the peephole so that the encoding will be smaller when compiling for thumb [1]. However, I don't think that is always safe. For example, the CC register might be used by something *after* the peephole window. The current peephole will transform: (insn:TI 7 49 18 2 (set (reg:CC 24 cc) (compare:CC (reg:SI 3 r3 [orig:136 *endname_1(D) ] [136]) (const_int 0 [0]))) repro.c:5 212 {*arm_cmpsi_insn} (nil)) (insn:TI 18 7 11 2 (cond_exec (ne (reg:CC 24 cc) (const_int 0 [0])) (set (reg:SI 3 r3 [140]) (const_int 0 [0]))) repro.c:8 3366 {*p *arm_movsi_vfp} (expr_list:REG_EQUIV (const_int 0 [0]) (nil))) (insn 11 18 19 2 (cond_exec (eq (reg:CC 24 cc) (const_int 0 [0])) (set (reg:SI 3 r3 [138]) (const_int 1 [0x1]))) repro.c:6 3366 {*p *arm_movsi_vfp} (expr_list:REG_EQUIV (const_int 1 [0x1]) (nil))) (insn:TI 19 11 12 2 (cond_exec (ne (reg:CC 24 cc) (const_int 0 [0])) (set (mem/c:SI (reg/f:SI 2 r2 [143]) [2 atend+0 S4 A32]) (reg:SI 3 r3 [140]))) repro.c:8 3366 {*p *arm_movsi_vfp} (expr_list:REG_DEAD (reg/f:SI 2 r2 [143]) (nil))) (insn:TI 12 19 22 2 (cond_exec (eq (reg:CC 24 cc) (const_int 0 [0])) (set (mem/c:SI (reg/f:SI 2 r2 [143]) [2 atend+0 S4 A32]) (reg:SI 3 r3 [138]))) repro.c:6 3366 {*p *arm_movsi_vfp} (nil)) (insn:TI 22 12 58 2 (cond_exec (ne (reg:CC 24 cc) (const_int 0 [0])) (set (mem:QI (reg/v/f:SI 0 r0 [orig:135 endname ] [135]) [0 *endname_1(D)+0 S1 A8]) (reg:QI 3 r3 [140]))) repro.c:9 3115 {*p *arm_movqi_insn} (expr_list:REG_DEAD (reg:CC 24 cc) (expr_list:REG_DEAD (reg:QI 3 r3 [140]) (expr_list:REG_DEAD (reg/v/f:SI 0 r0 [orig:135 endname ] [135]) (nil))))) into the following: (insn 59 49 60 2 (parallel [ (set (reg:CC 24 cc) (compare:CC (reg:SI 3 r3 [orig:136 *endname_1(D) ] [136]) (const_int 0 [0]))) (set (reg:SI 1 r1) (minus:SI (reg:SI 3 r3 [orig:136 *endname_1(D) ] [136]) (const_int 0 [0]))) ]) repro.c:6 -1 (nil)) (insn 60 59 61 2 (parallel [ (set (reg:CC 24 cc) (compare:CC (const_int 0 [0]) (reg:SI 1 r1))) (set (reg:SI 3 r3 [140]) (minus:SI (const_int 0 [0]) (reg:SI 1 r1))) ]) repro.c:6 -1 (nil)) (insn 61 60 19 2 (parallel [ (set (reg:SI 3 r3 [140]) (plus:SI (plus:SI (reg:SI 3 r3 [140]) (reg:SI 1 r1)) (geu:SI (reg:CC 24 cc) (const_int 0 [0])))) (clobber (reg:CC 24 cc)) ]) repro.c:6 -1 (nil)) (insn:TI 19 61 12 2 (cond_exec (ne (reg:CC 24 cc) (const_int 0 [0])) (set (mem/c:SI (reg/f:SI 2 r2 [143]) [2 atend+0 S4 A32]) (reg:SI 3 r3 [140]))) repro.c:8 3366 {*p *arm_movsi_vfp} (nil)) (insn:TI 12 19 22 2 (cond_exec (eq (reg:CC 24 cc) (const_int 0 [0])) (set (mem/c:SI (reg/f:SI 2 r2 [143]) [2 atend+0 S4 A32]) (reg:SI 3 r3 [138]))) repro.c:6 3366 {*p *arm_movsi_vfp} (expr_list:REG_DEAD (reg/f:SI 2 r2 [143]) (nil))) (insn:TI 22 12 58 2 (cond_exec (ne (reg:CC 24 cc) (const_int 0 [0])) (set (mem:QI (reg/v/f:SI 0 r0 [orig:135 endname ] [135]) [0 *endname_1(D)+0 S1 A8]) (reg:QI 3 r3 [140]))) repro.c:9 3115 {*p *arm_movqi_insn} (expr_list:REG_DEAD (reg:CC 24 cc) (expr_list:REG_DEAD (reg:QI 3 r3 [140]) (expr_list:REG_DEAD (reg/v/f:SI 0 r0 [orig:135 endname ] [135]) (nil))))) This gets compiled into the incorrect sequence: ldrb r3, [r0, #0] ldr r2, .L4 subs r1, r3, #0 rsbs r3, r1, #0 adcs r3, r3, r1 strne r3, [r2, #0] streq r3, [r2, #0] strneb r3, [r0, #0] The conditional stores are now dealing with an incorrect condition state. This patch fixes the problem by ensuring that the CC reg is dead after the peephole window for the current peephole definition and falls back on the original pre-PR46975 peephole when it is live. Unfortunately I had trouble coming up with a reproduction case against mainline. I only noticed the bug while working with some local changes that exposed it. Built and tested a full ARM GNU/Linux toolchain. No regressions in the GCC test suite. OK? gcc/ 2013-05-29 Meador Inge * config/arm/arm.md (conditional move peephole2): Only clobber CC register when it is dead after the peephole window. [1] http://gcc.gnu.org/ml/gcc-patches/2010-12/msg01336.html Index: gcc/config/arm/arm.md =================================================================== --- gcc/config/arm/arm.md (revision 199414) +++ gcc/config/arm/arm.md (working copy) @@ -9978,29 +9978,48 @@ ;; Attempt to improve the sequence generated by the compare_scc splitters ;; not to use conditional execution. (define_peephole2 - [(set (reg:CC CC_REGNUM) + [(set (match_operand 0 "cc_register" "") (compare:CC (match_operand:SI 1 "register_operand" "") (match_operand:SI 2 "arm_rhs_operand" ""))) (cond_exec (ne (reg:CC CC_REGNUM) (const_int 0)) - (set (match_operand:SI 0 "register_operand" "") (const_int 0))) + (set (match_operand:SI 3 "register_operand" "") (const_int 0))) (cond_exec (eq (reg:CC CC_REGNUM) (const_int 0)) - (set (match_dup 0) (const_int 1))) - (match_scratch:SI 3 "r")] - "TARGET_32BIT" + (set (match_dup 3) (const_int 1))) + (match_scratch:SI 4 "r")] + "TARGET_32BIT && peep2_reg_dead_p (3, operands[0])" [(parallel [(set (reg:CC CC_REGNUM) (compare:CC (match_dup 1) (match_dup 2))) - (set (match_dup 3) (minus:SI (match_dup 1) (match_dup 2)))]) + (set (match_dup 4) (minus:SI (match_dup 1) (match_dup 2)))]) (parallel [(set (reg:CC CC_REGNUM) - (compare:CC (const_int 0) (match_dup 3))) - (set (match_dup 0) (minus:SI (const_int 0) (match_dup 3)))]) + (compare:CC (const_int 0) (match_dup 4))) + (set (match_dup 3) (minus:SI (const_int 0) (match_dup 4)))]) (parallel - [(set (match_dup 0) - (plus:SI (plus:SI (match_dup 0) (match_dup 3)) + [(set (match_dup 3) + (plus:SI (plus:SI (match_dup 3) (match_dup 4)) (geu:SI (reg:CC CC_REGNUM) (const_int 0)))) (clobber (reg:CC CC_REGNUM))])]) +(define_peephole2 + [(set (reg:CC CC_REGNUM) + (compare:CC (match_operand:SI 0 "register_operand" "") + (match_operand:SI 1 "arm_rhs_operand" ""))) + (cond_exec (ne (reg:CC CC_REGNUM) (const_int 0)) + (set (match_operand:SI 2 "register_operand" "") (const_int 0))) + (cond_exec (eq (reg:CC CC_REGNUM) (const_int 0)) + (set (match_dup 2) (const_int 1))) + (match_scratch:SI 3 "r")] + "TARGET_32BIT && !peep2_reg_dead_p (3, operands[0])" + [(set (match_dup 3) (minus:SI (match_dup 0) (match_dup 1))) + (parallel + [(set (reg:CC CC_REGNUM) + (compare:CC (const_int 0) (match_dup 3))) + (set (match_dup 2) (minus:SI (const_int 0) (match_dup 3)))]) + (set (match_dup 2) + (plus:SI (plus:SI (match_dup 2) (match_dup 3)) + (geu:SI (reg:CC CC_REGNUM) (const_int 0))))]) + (define_insn "*cond_move" [(set (match_operand:SI 0 "s_register_operand" "=r,r,r") (if_then_else:SI (match_operator 3 "equality_operator"