From patchwork Tue Oct 29 12:29:02 2013
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Hurugalawadi, Naveen"
X-Patchwork-Id: 286825
From: "Hurugalawadi, Naveen"
To: Ramana Radhakrishnan
CC: "gcc-patches@gcc.gnu.org", Richard Earnshaw, Marcus Shawcroft
Subject: RE: [PING] [AArch64] Peepholes to generate ldp and stp instructions
Date: Tue, 29 Oct 2013 12:29:02 +0000
Message-ID: <87e23a53a6384ed6b9886d9424e12816@SN2PR07MB029.namprd07.prod.outlook.com>
References: <7aad32511cb04e4f9707b3626e1116a2@SN2PR07MB029.namprd07.prod.outlook.com>,
 <526F8552.6030607@arm.com>
In-Reply-To: <526F8552.6030607@arm.com>

Hi,

>> You are better off CCing the maintainers for such reviews. Let me do
>> that for you. I cannot approve or reject this patch but I have a few
>> comments as below.

Thanks for the quick review and comments.
Please find attached the patch, modified as per the review comments.
Please review it and let me know if it's okay.

Built and tested on aarch64-thunder-elf (using Cavium's internal
simulator). No new regressions.

2013-10-29  Naveen H.S

gcc/
	* config/aarch64/aarch64.md (peephole2 to generate ldp
	instruction for 2 consecutive loads from memory): New.
	(peephole2 to generate stp instruction for 2 consecutive
	stores to memory in integer mode): New.
	(peephole2 to generate ldp instruction for 2 consecutive
	loads from memory in floating point mode): New.
	(peephole2 to generate stp instruction for 2 consecutive
	stores to memory in floating point mode): New.

gcc/testsuite
	* gcc.target/aarch64/ldp-stp.c: New testcase.

Thanks,
Naveen

--- gcc/config/aarch64/aarch64.md	2013-10-28 17:15:52.363975264 +0530
+++ gcc/config/aarch64/aarch64.md	2013-10-29 17:40:48.516129561 +0530
@@ -1068,6 +1068,27 @@
    (set_attr "mode" "<MODE>")]
 )
 
+(define_peephole2
+  [(set (match_operand:GPI 0 "register_operand")
+        (match_operand:GPI 1 "aarch64_mem_pair_operand"))
+   (set (match_operand:GPI 2 "register_operand")
+        (match_operand:GPI 3 "memory_operand"))]
+  "GET_CODE (operands[1]) == MEM
+   && GET_CODE (XEXP (operands[1], 0)) == PLUS
+   && REG_P (XEXP (XEXP (operands[1], 0), 0))
+   && CONST_INT_P (XEXP (XEXP (operands[1], 0), 1))
+   && GET_MODE (operands[0]) == GET_MODE (XEXP (XEXP (operands[1], 0), 0))
+   && REGNO (operands[0]) != REGNO (operands[2])
+   && GP_REGNUM_P (REGNO (operands[0])) && GP_REGNUM_P (REGNO (operands[2]))
+   && REGNO_REG_CLASS (REGNO (operands[0]))
+      == REGNO_REG_CLASS (REGNO (operands[2]))
+   && rtx_equal_p (XEXP (operands[3], 0),
+                   plus_constant (Pmode, XEXP (operands[1], 0),
+                                  GET_MODE_SIZE (<MODE>mode)))"
+  [(parallel [(set (match_dup 0) (match_dup 1))
+              (set (match_dup 2) (match_dup 3))])]
+)
+
 ;; Operands 0 and 2 are tied together by the final condition; so we allow
 ;; fairly lax checking on the second memory operation.
 (define_insn "store_pair<mode>"
@@ -1085,6 +1106,27 @@
    (set_attr "mode" "<MODE>")]
 )
 
+(define_peephole2
+  [(set (match_operand:GPI 0 "aarch64_mem_pair_operand")
+        (match_operand:GPI 1 "register_operand"))
+   (set (match_operand:GPI 2 "memory_operand")
+        (match_operand:GPI 3 "register_operand"))]
+  "GET_CODE (operands[0]) == MEM
+   && GET_CODE (XEXP (operands[0], 0)) == PLUS
+   && REG_P (XEXP (XEXP (operands[0], 0), 0))
+   && CONST_INT_P (XEXP (XEXP (operands[0], 0), 1))
+   && GET_MODE (operands[1]) == GET_MODE (XEXP (XEXP (operands[0], 0), 0))
+   && REGNO (operands[1]) != REGNO (operands[3])
+   && GP_REGNUM_P (REGNO (operands[1])) && GP_REGNUM_P (REGNO (operands[3]))
+   && REGNO_REG_CLASS (REGNO (operands[1]))
+      == REGNO_REG_CLASS (REGNO (operands[3]))
+   && rtx_equal_p (XEXP (operands[2], 0),
+                   plus_constant (Pmode, XEXP (operands[0], 0),
+                                  GET_MODE_SIZE (<MODE>mode)))"
+  [(parallel [(set (match_dup 0) (match_dup 1))
+              (set (match_dup 2) (match_dup 3))])]
+)
+
 ;; Operands 1 and 3 are tied together by the final condition; so we allow
 ;; fairly lax checking on the second memory operation.
 (define_insn "load_pair<mode>"
@@ -1102,6 +1144,27 @@
    (set_attr "mode" "<MODE>")]
 )
 
+(define_peephole2
+  [(set (match_operand:GPF 0 "register_operand")
+        (match_operand:GPF 1 "aarch64_mem_pair_operand"))
+   (set (match_operand:GPF 2 "register_operand")
+        (match_operand:GPF 3 "memory_operand"))]
+  "GET_CODE (operands[1]) == MEM
+   && GET_CODE (XEXP (operands[1], 0)) == PLUS
+   && REG_P (XEXP (XEXP (operands[1], 0), 0))
+   && CONST_INT_P (XEXP (XEXP (operands[1], 0), 1))
+   && GET_MODE (operands[0]) == GET_MODE (XEXP (XEXP (operands[1], 0), 0))
+   && REGNO (operands[0]) != REGNO (operands[2])
+   && FP_REGNUM_P (REGNO (operands[0])) && FP_REGNUM_P (REGNO (operands[2]))
+   && REGNO_REG_CLASS (REGNO (operands[0]))
+      == REGNO_REG_CLASS (REGNO (operands[2]))
+   && rtx_equal_p (XEXP (operands[3], 0),
+                   plus_constant (Pmode, XEXP (operands[1], 0),
+                                  GET_MODE_SIZE (<MODE>mode)))"
+  [(parallel [(set (match_dup 0) (match_dup 1))
+              (set (match_dup 2) (match_dup 3))])]
+)
+
 ;; Operands 0 and 2 are tied together by the final condition; so we allow
 ;; fairly lax checking on the second memory operation.
 (define_insn "store_pair<mode>"
@@ -1119,6 +1182,27 @@
    (set_attr "mode" "<MODE>")]
 )
 
+(define_peephole2
+  [(set (match_operand:GPF 0 "aarch64_mem_pair_operand")
+        (match_operand:GPF 1 "register_operand"))
+   (set (match_operand:GPF 2 "memory_operand")
+        (match_operand:GPF 3 "register_operand"))]
+  "GET_CODE (operands[0]) == MEM
+   && GET_CODE (XEXP (operands[0], 0)) == PLUS
+   && REG_P (XEXP (XEXP (operands[0], 0), 0))
+   && CONST_INT_P (XEXP (XEXP (operands[0], 0), 1))
+   && GET_MODE (operands[1]) == GET_MODE (XEXP (XEXP (operands[0], 0), 0))
+   && REGNO (operands[1]) != REGNO (operands[3])
+   && FP_REGNUM_P (REGNO (operands[1])) && FP_REGNUM_P (REGNO (operands[3]))
+   && REGNO_REG_CLASS (REGNO (operands[1]))
+      == REGNO_REG_CLASS (REGNO (operands[3]))
+   && rtx_equal_p (XEXP (operands[2], 0),
+                   plus_constant (Pmode, XEXP (operands[0], 0),
+                                  GET_MODE_SIZE (<MODE>mode)))"
+  [(parallel [(set (match_dup 0) (match_dup 1))
+              (set (match_dup 2) (match_dup 3))])]
+)
+
 ;; Load pair with writeback.  This is primarily used in function epilogues
 ;; when restoring [fp,lr]
 (define_insn "loadwb_pair<GPI:mode>_<PTR:mode>"
--- gcc/testsuite/gcc.target/aarch64/ldp-stp.c	1970-01-01 05:30:00.000000000 +0530
+++ gcc/testsuite/gcc.target/aarch64/ldp-stp.c	2013-10-28 19:01:11.695986357 +0530
@@ -0,0 +1,33 @@
+/* { dg-options "-Os" } */
+
+extern void abort (void);
+
+typedef struct
+{
+  long int x, y;
+} ldst;
+
+void
+f (ldst p0, ldst p1, ldst p2, ldst p3, ldst p4, ldst p5)
+{
+  if (p2.x != 1 || p2.y != -1
+      || p3.x != -1 || p3.y != 1 || p4.x != 0 || p4.y != -1)
+    abort ();
+}
+
+void
+foo ()
+{
+  ldst p0, p1, p2, p3, p4, p5;
+
+  p4.x = 0;
+  p4.y = -1;
+
+  p5.x = 1;
+  p5.y = 0;
+
+  f (p0, p1, p2, p3, p4, p5);
+}
+
+/* { dg-final { scan-assembler-times "ldp\tx\[0-9\]+, x\[0-9\]" 3 } } */
+/* { dg-final { scan-assembler-times "stp\tx\[0-9\]+, x\[0-9\]" 3 } } */
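
For readers less familiar with RTL, the core of the peephole conditions
above can be modelled in plain C. This is only an illustrative sketch, not
code from the patch or from GCC: `struct mem_access` and `can_pair` are
hypothetical names, the register-class checks are omitted, and the range
check assumes the usual ldp/stp encoding (a signed 7-bit immediate scaled
by the access size), which is roughly what `aarch64_mem_pair_operand` is
expected to enforce on the first memory operand.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified model of one access to [base_reg + offset].  */
struct mem_access
{
  int base_reg;   /* base register number */
  long offset;    /* constant byte offset from the base */
  int data_reg;   /* register that is loaded or stored */
};

/* Return true if accesses A and B, each SIZE bytes wide, look like
   candidates for a single ldp/stp.  Mirrors three of the peephole
   checks: distinct data registers, B's address equal to A's address
   plus SIZE, and A's offset encodable as a scaled 7-bit immediate.  */
static bool
can_pair (struct mem_access a, struct mem_access b, long size)
{
  assert (size > 0);

  /* REGNO (first) != REGNO (second) in the patch.  */
  if (a.data_reg == b.data_reg)
    return false;

  /* rtx_equal_p (addr_b, plus_constant (addr_a, size)) in the patch.  */
  if (a.base_reg != b.base_reg || b.offset != a.offset + size)
    return false;

  /* Assumed encoding limit: offset is a multiple of SIZE in the range
     [-64 * SIZE, 63 * SIZE].  */
  if (a.offset % size != 0)
    return false;
  return a.offset >= -64 * size && a.offset <= 63 * size;
}
```

With 8-byte accesses, `[sp, 16]` followed by `[sp, 24]` into two different
registers passes this model, while `[sp, 16]` followed by `[sp, 32]` does
not, and neither does a pair whose first offset is 512 (outside the assumed
-512..504 range for 8-byte accesses).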