From patchwork Wed Jun 19 12:32:56 2013
X-Patchwork-Submitter: Richard Earnshaw
X-Patchwork-Id: 252571
Message-ID: <51C1A4F8.5080702@arm.com>
Date: Wed, 19 Jun 2013 13:32:56 +0100
From: Richard Earnshaw
To: Meador Inge
CC: "gcc-patches@gcc.gnu.org", Ramana Radhakrishnan
Subject: Re: [PATCH] ARM: Don't clobber CC reg when it is live after the peephole window
References: <1369847707-8357-1-git-send-email-meadori@codesourcery.com> <51B08A77.7020700@arm.com> <51B0D3A3.2020306@codesourcery.com> <51C08957.3080406@codesourcery.com>
In-Reply-To: <51C08957.3080406@codesourcery.com>

On 18/06/13 17:22, Meador Inge wrote:
> Ping.
>
> On 06/06/2013 01:23 PM, Meador Inge wrote:
>> On 06/06/2013 08:11 AM, Richard Earnshaw wrote:
>>
>>> I understand (and agree with) this bit...
>>>
>>>> +(define_peephole2
>>>> +  [(set (reg:CC CC_REGNUM)
>>>> +        (compare:CC (match_operand:SI 0 "register_operand" "")
>>>> +                    (match_operand:SI 1 "arm_rhs_operand" "")))
>>>> +   (cond_exec (ne (reg:CC CC_REGNUM) (const_int 0))
>>>> +     (set (match_operand:SI 2 "register_operand" "") (const_int 0)))
>>>> +   (cond_exec (eq (reg:CC CC_REGNUM) (const_int 0))
>>>> +     (set (match_dup 2) (const_int 1)))
>>>> +   (match_scratch:SI 3 "r")]
>>>> +  "TARGET_32BIT && !peep2_reg_dead_p (3, operands[0])"
>>>> +  [(set (match_dup 3) (minus:SI (match_dup 0) (match_dup 1)))
>>>> +   (parallel
>>>> +    [(set (reg:CC CC_REGNUM)
>>>> +          (compare:CC (const_int 0) (match_dup 3)))
>>>> +     (set (match_dup 2) (minus:SI (const_int 0) (match_dup 3)))])
>>>> +   (set (match_dup 2)
>>>> +        (plus:SI (plus:SI (match_dup 2) (match_dup 3))
>>>> +                 (geu:SI (reg:CC CC_REGNUM) (const_int 0))))])
>>>> +
>>>
>>> ... but what's this bit about?
>>
>> The original intent was to revert back to the original peephole pattern
>> (pre-PR 46975) when the CC reg is still live, but that doesn't properly
>> maintain the CC state either (it just happened to pass in the test
>> case I was looking at because I only cared about the Z flag, which is
>> maintained the same).
>>
>> OK with the above bit left out?
>>
>
>

Sorry for the delay, I've been sidetracked onto other things.

Having looked at this patch I realized that we were missing a trick on
ARMv5 and later, when a more efficient sequence exists, particularly for
Cortex-A15. By using CLZ we can avoid the need to set the condition code
register at all, which gives us far more scheduling freedom. It's also
best not to unnecessarily clobber the condition code register even if
there are other instructions in the sequence that do set/use the flags
(the peepholer pass right at the end will do this optimization when it is
useful), so I've tweaked some of the existing alternatives as well.
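
For readers who want to convince themselves of the CLZ trick, here is a
minimal sketch in Python (not part of the patch) that models the
two-instruction sequence `clz Rd, Rn ; lsr Rd, Rd, #5` on 32-bit values.
The helper names clz32, eq_zero_via_clz and eq_via_clz are made up for
this illustration. It relies on the fact that CLZ yields 32 for a zero
input and a value in 0..31 otherwise, so bit 5 of the count is exactly
the result of (Rn == 0), and no flags are involved.

```python
MASK32 = 0xFFFFFFFF  # model 32-bit ARM registers


def clz32(x: int) -> int:
    """Count leading zeros of a 32-bit value; 32 when the value is zero."""
    x &= MASK32
    return 32 - x.bit_length()


def eq_zero_via_clz(x: int) -> int:
    """Model of: clz Rd, x ; lsr Rd, Rd, #5  (hypothetical helper)."""
    return clz32(x) >> 5


def eq_via_clz(a: int, b: int) -> int:
    """Model of: sub Rd, a, b ; clz Rd, Rd ; lsr Rd, Rd, #5."""
    return eq_zero_via_clz((a - b) & MASK32)


if __name__ == "__main__":
    for a, b in [(0, 0), (7, 7), (7, 8), (0, MASK32), (0x80000000, 0)]:
        print(hex(a), hex(b), eq_via_clz(a, b))
```

The subtraction wraps modulo 2^32, so the difference is zero only when
the two inputs are equal, which is why the same shift works for the
register/immediate comparison as well.
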
Finally, we can use peep2_regno_dead_p (rather than peep2_reg_dead_p) to
avoid having to create extra match_operand values.

The result is that I've now committed the patch below.

R.

2013-06-19 Richard Earnshaw

    arm.md (split for eq(reg, 0)): Add variants for ARMv5 and Thumb2.
    (peepholes for eq(reg, not-0)): Ensure condition register is dead
    after pattern. Use more efficient sequences on ARMv5 and Thumb2.

--- gcc/config/arm/arm.md (revision 200187)
+++ gcc/config/arm/arm.md (local)
@@ -10021,6 +10021,16 @@ (define_split
         (eq:SI (match_operand:SI 1 "s_register_operand" "")
                (const_int 0)))
    (clobber (reg:CC CC_REGNUM))]
+  "arm_arch5 && TARGET_32BIT"
+  [(set (match_dup 0) (clz:SI (match_dup 1)))
+   (set (match_dup 0) (lshiftrt:SI (match_dup 0) (const_int 5)))]
+)
+
+(define_split
+  [(set (match_operand:SI 0 "s_register_operand" "")
+        (eq:SI (match_operand:SI 1 "s_register_operand" "")
+               (const_int 0)))
+   (clobber (reg:CC CC_REGNUM))]
   "TARGET_32BIT && reload_completed"
   [(parallel
     [(set (reg:CC CC_REGNUM)
@@ -10090,29 +10100,87 @@ (define_insn_and_split "*compare_scc"
 
 ;; Attempt to improve the sequence generated by the compare_scc splitters
 ;; not to use conditional execution.
+
+;; Rd = (eq (reg1) (const_int0))  // ARMv5
+;;      clz Rd, reg1
+;;      lsr Rd, Rd, #5
 (define_peephole2
   [(set (reg:CC CC_REGNUM)
         (compare:CC (match_operand:SI 1 "register_operand" "")
-                    (match_operand:SI 2 "arm_rhs_operand" "")))
+                    (const_int 0)))
+   (cond_exec (ne (reg:CC CC_REGNUM) (const_int 0))
+              (set (match_operand:SI 0 "register_operand" "") (const_int 0)))
+   (cond_exec (eq (reg:CC CC_REGNUM) (const_int 0))
+              (set (match_dup 0) (const_int 1)))]
+  "arm_arch5 && TARGET_32BIT && peep2_regno_dead_p (3, CC_REGNUM)"
+  [(set (match_dup 0) (clz:SI (match_dup 1)))
+   (set (match_dup 0) (lshiftrt:SI (match_dup 0) (const_int 5)))]
+)
+
+;; Rd = (eq (reg1) (const_int0))  // !ARMv5
+;;      negs Rd, reg1
+;;      adc  Rd, Rd, reg1
+(define_peephole2
+  [(set (reg:CC CC_REGNUM)
+        (compare:CC (match_operand:SI 1 "register_operand" "")
+                    (const_int 0)))
    (cond_exec (ne (reg:CC CC_REGNUM) (const_int 0))
               (set (match_operand:SI 0 "register_operand" "") (const_int 0)))
    (cond_exec (eq (reg:CC CC_REGNUM) (const_int 0))
               (set (match_dup 0) (const_int 1)))
-   (match_scratch:SI 3 "r")]
-  "TARGET_32BIT"
+   (match_scratch:SI 2 "r")]
+  "TARGET_32BIT && peep2_regno_dead_p (3, CC_REGNUM)"
   [(parallel
     [(set (reg:CC CC_REGNUM)
-          (compare:CC (match_dup 1) (match_dup 2)))
-     (set (match_dup 3) (minus:SI (match_dup 1) (match_dup 2)))])
+          (compare:CC (const_int 0) (match_dup 1)))
+     (set (match_dup 2) (minus:SI (const_int 0) (match_dup 1)))])
+   (set (match_dup 0)
+        (plus:SI (plus:SI (match_dup 1) (match_dup 2))
+                 (geu:SI (reg:CC CC_REGNUM) (const_int 0))))]
+)
+
+;; Rd = (eq (reg1) (reg2/imm))  // ARMv5
+;;      sub Rd, Reg1, reg2
+;;      clz Rd, Rd
+;;      lsr Rd, Rd, #5
+(define_peephole2
+  [(set (reg:CC CC_REGNUM)
+        (compare:CC (match_operand:SI 1 "register_operand" "")
+                    (match_operand:SI 2 "arm_rhs_operand" "")))
+   (cond_exec (ne (reg:CC CC_REGNUM) (const_int 0))
+              (set (match_operand:SI 0 "register_operand" "") (const_int 0)))
+   (cond_exec (eq (reg:CC CC_REGNUM) (const_int 0))
+              (set (match_dup 0) (const_int 1)))]
+  "arm_arch5 && TARGET_32BIT && peep2_regno_dead_p (3, CC_REGNUM)"
+  [(set (match_dup 0) (minus:SI (match_dup 1) (match_dup 2)))
+   (set (match_dup 0) (clz:SI (match_dup 0)))
+   (set (match_dup 0) (lshiftrt:SI (match_dup 0) (const_int 5)))]
+)
+
+
+;; Rd = (eq (reg1) (reg2/imm))  // ! ARMv5
+;;      sub T1, Reg1, reg2
+;;      negs Rd, T1
+;;      adc Rd, Rd, T1
+(define_peephole2
+  [(set (reg:CC CC_REGNUM)
+        (compare:CC (match_operand:SI 1 "register_operand" "")
+                    (match_operand:SI 2 "arm_rhs_operand" "")))
+   (cond_exec (ne (reg:CC CC_REGNUM) (const_int 0))
+              (set (match_operand:SI 0 "register_operand" "") (const_int 0)))
+   (cond_exec (eq (reg:CC CC_REGNUM) (const_int 0))
+              (set (match_dup 0) (const_int 1)))
+   (match_scratch:SI 3 "r")]
+  "TARGET_32BIT && peep2_regno_dead_p (3, CC_REGNUM)"
+  [(set (match_dup 3) (minus:SI (match_dup 1) (match_dup 2)))
    (parallel
     [(set (reg:CC CC_REGNUM)
           (compare:CC (const_int 0) (match_dup 3)))
      (set (match_dup 0) (minus:SI (const_int 0) (match_dup 3)))])
-   (parallel
-    [(set (match_dup 0)
-          (plus:SI (plus:SI (match_dup 0) (match_dup 3))
-                   (geu:SI (reg:CC CC_REGNUM) (const_int 0))))
-     (clobber (reg:CC CC_REGNUM))])])
+   (set (match_dup 0)
+        (plus:SI (plus:SI (match_dup 0) (match_dup 3))
+                 (geu:SI (reg:CC CC_REGNUM) (const_int 0))))]
+)
 
 (define_insn "*cond_move"
   [(set (match_operand:SI 0 "s_register_operand" "=r,r,r")
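
As a sanity check on the negs/adc fallback used by the non-ARMv5
peepholes in the patch above, the following Python sketch (not part of
the patch) models `sub T1, Reg1, reg2 ; negs Rd, T1 ; adc Rd, Rd, T1`.
The function names are hypothetical, and the carry model assumes the
usual ARM convention that a subtraction computes a + NOT(b) + 1 with C
set to the carry out (that is, C = NOT borrow). Under that assumption
0 - T1 produces C = 1 only when T1 is zero, and the adc then computes
(-T1) + T1 + C = C, which is the value of (Reg1 == reg2).

```python
MASK32 = 0xFFFFFFFF  # model 32-bit ARM registers


def subs(a: int, b: int) -> tuple[int, int]:
    """Flag-setting subtract: returns (result, carry), carry = NOT borrow."""
    total = (a & MASK32) + (~b & MASK32) + 1
    return total & MASK32, total >> 32


def negs(t: int) -> tuple[int, int]:
    """Model of: negs Rd, t  (i.e. rsbs Rd, t, #0)."""
    return subs(0, t)


def adc(a: int, b: int, carry: int) -> int:
    """Model of: adc Rd, a, b  using the carry from the previous step."""
    return (a + b + carry) & MASK32


def eq_via_negs_adc(a: int, b: int) -> int:
    """Model of: sub T1, a, b ; negs Rd, T1 ; adc Rd, Rd, T1."""
    t1 = (a - b) & MASK32
    rd, carry = negs(t1)
    return adc(rd, t1, carry)


if __name__ == "__main__":
    for a, b in [(0, 0), (9, 9), (9, 10), (MASK32, 0), (0, MASK32)]:
        print(hex(a), hex(b), eq_via_negs_adc(a, b))
```

Because negs rewrites the flags with values that differ from those of
the original compare, this replacement is only safe when the condition
register is dead afterwards, which is what the peep2_regno_dead_p
condition in the patterns checks.
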