From patchwork Tue Sep 30 15:00:10 2014
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jiong Wang
X-Patchwork-Id: 395010
Delivered-To: mailing list gcc-patches@gcc.gnu.org
Message-ID: <542AC57A.20701@arm.com>
Date: Tue, 30 Sep 2014 16:00:10 +0100
From: Jiong Wang
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.1.1
To: Kugan, "gcc-patches@gcc.gnu.org"
CC: Marcus Shawcroft, Richard Earnshaw
Subject: Re: [PATCH][AArch64] LR register not used in leaf functions
References: <54204387.5090105@linaro.org> <54204727.6080002@arm.com> <54272A2D.7030306@linaro.org>
In-Reply-To: <54272A2D.7030306@linaro.org>
X-IsSubscribed: yes

On 27/09/14 22:20, Kugan wrote:
>
> On 23/09/14 01:58, Jiong Wang wrote:
>> On 22/09/14 16:43, Kugan wrote:
>>
>>> AArch64 has the same issue ARM had where the LR register was not used in
>>> leaf functions. This was reported in
>>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=42017. In AArch64, this
>>> test-case need to be added with more live ranges for the need for the
>>> LR_REGNUM. i.e test-case in the PR needs additional loops up to r31 for
>>> the case AArch64 to see this.
>>>
>>> The same fix (from the thread
>>> https://gcc.gnu.org/ml/gcc-patches/2011-04/msg02191.html) which went
>>> into ARM should apply to AArch64 as well. Regression tested on qemu for
>>> aarch64-none-linux-gnu with no new regressions. Is this OK for trunk?
>> This still be a partial fix. LR should be a caller-saved register free
>> to use in case it's saved properly to across function call.
> Indeed. This should be improved from the generic code.
> Right now, if a hard register is used in EPILOGUE_USES, it conflicts
> with all the live ranges till a call site kills. I think we should have
> this patch till the generic code can be improved.

Below is my local patch. LR is treated as a free register and, strictly
following the AArch64 ABI, the frame should always be created and FP
maintained properly if LR is clobbered under -fno-omit-frame-pointer.

gcc/

    * config/aarch64/aarch64.h (CALL_USED_REGISTERS): Mark LR as
    caller-save.
    (EPILOGUE_USES): Guard the check by epilogue_completed.
    * config/aarch64/aarch64.c (aarch64_layout_frame): Explicitly check
    for LR.
    (aarch64_can_eliminate): Check LR_REGNUM liveness.

gcc/testsuite/

    * gcc.target/aarch64/lr_free_1.c: New testcase for
    -fomit-frame-pointer.
    * gcc.target/aarch64/lr_free_2.c: New testcase for leaf
    -fno-omit-frame-pointer.

diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h
index db950da..892b310 100644
--- a/gcc/config/aarch64/aarch64.h
+++ b/gcc/config/aarch64/aarch64.h
@@ -250,7 +250,7 @@ extern unsigned long aarch64_tune_flags;
     1, 1, 1, 1, 1, 1, 1, 1, /* R0 - R7 */ \
     1, 1, 1, 1, 1, 1, 1, 1, /* R8 - R15 */ \
     1, 1, 1, 0, 0, 0, 0, 0, /* R16 - R23 */ \
-    0, 0, 0, 0, 0, 1, 0, 1, /* R24 - R30, SP */ \
+    0, 0, 0, 0, 0, 1, 1, 1, /* R24 - R30, SP */ \
     1, 1, 1, 1, 1, 1, 1, 1, /* V0 - V7 */ \
     0, 0, 0, 0, 0, 0, 0, 0, /* V8 - V15 */ \
     1, 1, 1, 1, 1, 1, 1, 1, /* V16 - V23 */ \
@@ -309,7 +309,7 @@ extern unsigned long aarch64_tune_flags;
    considered live at the start of the called function. */
 
 #define EPILOGUE_USES(REGNO) \
-  ((REGNO) == LR_REGNUM)
+  (epilogue_completed && (REGNO) == LR_REGNUM)
 
 /* EXIT_IGNORE_STACK should be nonzero if, when returning from a function,
    the stack pointer does not matter. The value is tested only in
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 15c7be6..8b39b2a 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -1864,7 +1864,8 @@ aarch64_layout_frame (void)
   /* ... and any callee saved register that dataflow says is live. */
   for (regno = R0_REGNUM; regno <= R30_REGNUM; regno++)
     if (df_regs_ever_live_p (regno)
-        && !call_used_regs[regno])
+        && (regno == R30_REGNUM
+            || !call_used_regs[regno]))
       cfun->machine->frame.reg_offset[regno] = SLOT_REQUIRED;
 
   for (regno = V0_REGNUM; regno <= V31_REGNUM; regno++)
@@ -4313,6 +4314,16 @@ aarch64_can_eliminate (const int from, const int to)
 
       return false;
     }
+  else
+    {
+      /* If we decided that we didn't need a leaf frame pointer but then used
+         LR in the function, then we'll want a frame pointer after all, so
+         prevent this elimination to ensure a frame pointer is used. */
+      if (to == STACK_POINTER_REGNUM
+          && flag_omit_leaf_frame_pointer
+          && df_regs_ever_live_p (LR_REGNUM))
+        return false;
+    }
 
   return true;
 }
diff --git a/gcc/testsuite/gcc.target/aarch64/lr_free_1.c b/gcc/testsuite/gcc.target/aarch64/lr_free_1.c
new file mode 100644
index 0000000..4c530a2
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/lr_free_1.c
@@ -0,0 +1,38 @@
+/* { dg-do run } */
+/* { dg-options "-fno-inline -O2 -fomit-frame-pointer -ffixed-x2 -ffixed-x3 -ffixed-x4 -ffixed-x5 -ffixed-x6 -ffixed-x7 -ffixed-x8 -ffixed-x9 -ffixed-x10 -ffixed-x11 -ffixed-x12 -ffixed-x13 -ffixed-x14 -ffixed-x15 -ffixed-x16 -ffixed-x17 -ffixed-x18 -ffixed-x19 -ffixed-x20 -ffixed-x21 -ffixed-x22 -ffixed-x23 -ffixed-x24 -ffixed-x25 -ffixed-x26 -ffixed-x27 -ffixed-28 -ffixed-29 --save-temps -mgeneral-regs-only -fno-ipa-cp" } */
+
+extern void abort ();
+
+int
+dec (int a, int b)
+{
+  return a + b;
+}
+
+int
+cal (int a, int b)
+{
+  int sum1 = a * b;
+  int sum2 = a / b;
+  int sum = dec (sum1, sum2);
+  return a + b + sum + sum1 + sum2;
+}
+
+int
+main (int argc, char **argv)
+{
+  int ret = cal (2, 1);
+
+  if (ret != 11)
+    abort ();
+
+  return 0;
+}
+
+/* { dg-final { scan-assembler-times "str\tx30, \\\[sp, -\[0-9\]+\\\]!" 2 } } */
+/* { dg-final { scan-assembler "str\tw30, \\\[sp, \[0-9\]+\\\]" } } */
+
+/* { dg-final { scan-assembler "ldr\tw30, \\\[sp, \[0-9\]+\\\]" } } */
+/* { dg-final { scan-assembler-times "ldr\tx30, \\\[sp\\\], \[0-9\]+" 2 } } */
+
+/* { dg-final { cleanup-saved-temps } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/lr_free_2.c b/gcc/testsuite/gcc.target/aarch64/lr_free_2.c
new file mode 100644
index 0000000..2bfb6ad
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/lr_free_2.c
@@ -0,0 +1,29 @@
+/* { dg-do run } */
+/* { dg-options "-fno-inline -O2 -ffixed-x2 -ffixed-x3 -ffixed-x4 -ffixed-x5 -ffixed-x6 -ffixed-x7 -ffixed-x8 -ffixed-x9 -ffixed-x10 -ffixed-x11 -ffixed-x12 -ffixed-x13 -ffixed-x14 -ffixed-x15 -ffixed-x16 -ffixed-x17 -ffixed-x18 -ffixed-x19 -ffixed-x20 -ffixed-x21 -ffixed-x22 -ffixed-x23 -ffixed-x24 -ffixed-x25 -ffixed-x26 -ffixed-x27 -ffixed-x28 --save-temps -mgeneral-regs-only -fno-ipa-cp -fdump-rtl-ira" } */
+
+extern void abort ();
+
+int
+cal (int a, int b)
+{
+  /* { dg-final { scan-assembler-times "stp\tx29, x30, \\\[sp, -\[0-9\]+\\\]!" 2 } } */
+  int sum = a + b;
+  int sum1 = a * b;
+  /* { dg-final { scan-assembler-times "ldp\tx29, x30, \\\[sp\\\], \[0-9\]+" 2 } } */
+  /* { dg-final { scan-rtl-dump "assign reg 30" "ira" } } */
+  return (a + b + sum + sum1);
+}
+
+int
+main (int argc, char **argv)
+{
+  int ret = cal (1, 2);
+
+  if (ret != 8)
+    abort ();
+
+  return 0;
+}
+
+/* { dg-final { cleanup-saved-temps } } */
+/* { dg-final { cleanup-rtl-dump "ira" } } */