diff mbox

[AArch64] LR register not used in leaf functions

Message ID 542AC57A.20701@arm.com
State New
Headers show

Commit Message

Jiong Wang Sept. 30, 2014, 3 p.m. UTC
On 27/09/14 22:20, Kugan wrote:
>
> On 23/09/14 01:58, Jiong Wang wrote:
>> On 22/09/14 16:43, Kugan wrote:
>>
>>> AArch64 has the same issue ARM had where the LR register was not used in
>>> leaf functions. This was reported in
>>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=42017. In AArch64, this
>>> test-case need to be added with more live ranges for the need for the
>>> LR_REGNUM. i.e test-case in the PR needs additional loops up to r31 for
>>> the case AArch64 to see this.
>>>
>>> The same fix (from the thread
>>> https://gcc.gnu.org/ml/gcc-patches/2011-04/msg02191.html) which went
>>> into ARM should apply to AArch64 as well. Regression tested on qemu for
>>> aarch64-none-linux-gnu with no new regressions. Is this OK for trunk?
>> This still be a partial fix. LR should be a caller-saved register free
>> to use in case it's saved properly to across function call.
> Indeed. This should be improved from the generic code. Right now, if a
> hard register is used in EPILOGUE_USES, it conflicts with all the live
> ranges till a call site kills.  I think we should have this patch till
> the generic code can be improved.

below is my local patch. LR is treated as free register, and strictly
following AArch64 ABI, frame should always be created, FP maintained
properly if LR clobbered under -fno-omit-frame-pointer.


gcc/
   * config/aarch64/aarch64.h (CALL_USED_REGISTERS): Mark LR as caller-save.
   (EPILOGUE_USES): Guard the check by epilogue_completed.
   * config/aarch64/aarch64.c (aarch64_layout_frame): Explictly check for LR.
   (aarch64_can_eliminate): Check LR_REGNUM liveness.

gcc/testsuite/
   * gcc.target/aarch64/lr_free_1.c: New testcase for -fomit-frame-pointer.
   * gcc.target/aarch64/lr_free_2.c: New testcase for leaf -fno-omit-frame-pointer.

Comments

Kugan Vivekanandarajah Oct. 1, 2014, 8 a.m. UTC | #1
On 01/10/14 01:00, Jiong Wang wrote:
> On 27/09/14 22:20, Kugan wrote:
>>
>> On 23/09/14 01:58, Jiong Wang wrote:
>>> On 22/09/14 16:43, Kugan wrote:
>>>
>>>> AArch64 has the same issue ARM had where the LR register was not
>>>> used in
>>>> leaf functions. This was reported in
>>>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=42017. In AArch64, this
>>>> test-case need to be added with more live ranges for the need for the
>>>> LR_REGNUM. i.e test-case in the PR needs additional loops up to r31 for
>>>> the case AArch64 to see this.
>>>>
>>>> The same fix (from the thread
>>>> https://gcc.gnu.org/ml/gcc-patches/2011-04/msg02191.html) which went
>>>> into ARM should apply to AArch64 as well. Regression tested on qemu for
>>>> aarch64-none-linux-gnu with no new regressions. Is this OK for trunk?
>>> This still be a partial fix. LR should be a caller-saved register free
>>> to use in case it's saved properly to across function call.
>> Indeed. This should be improved from the generic code. Right now, if a
>> hard register is used in EPILOGUE_USES, it conflicts with all the live
>> ranges till a call site kills.  I think we should have this patch till
>> the generic code can be improved.
> 
> below is my local patch. LR is treated as free register, and strictly
> following AArch64 ABI, frame should always be created, FP maintained
> properly if LR clobbered under -fno-omit-frame-pointer.

Thanks Jiong. Sorry I missed your point. As for the additions in your patch:


   /* ... and any callee saved register that dataflow says is live.  */
   for (regno = R0_REGNUM; regno <= R30_REGNUM; regno++)
     if (df_regs_ever_live_p (regno)
-	&& !call_used_regs[regno])
+	&& (regno == R30_REGNUM
+	    || !call_used_regs[regno]))
       cfun->machine->frame.reg_offset[regno] = SLOT_REQUIRED;

AArch64 CALL_USED_REGISTERS defines R30_REGNUM to be zero. Therefore
shouldn’t this be redundant?


   for (regno = V0_REGNUM; regno <= V31_REGNUM; regno++)
@@ -4313,6 +4314,16 @@ aarch64_can_eliminate (const int from, const int to)

       return false;
     }
+  else
+    {
+      /* If we decided that we didn't need a leaf frame pointer but
then used
+	 LR in the function, then we'll want a frame pointer after all, so
+	 prevent this elimination to ensure a frame pointer is used.  */
+      if (to == STACK_POINTER_REGNUM
+	  && flag_omit_leaf_frame_pointer
+	  && df_regs_ever_live_p (LR_REGNUM))
+	return false;
+    }

aarch64_frame_pointer_required makes frame pointer needed when
flag_omit_leaf_frame_pointer and df_regs_ever_live_p (LR_REGNUM).
Is this addition still needed?

Thanks,
Kugan
Jiong Wang Nov. 4, 2014, 10:56 a.m. UTC | #2
On 01/10/14 09:00, Kugan wrote:
> On 01/10/14 01:00, Jiong Wang wrote:
>> On 27/09/14 22:20, Kugan wrote:
>>> On 23/09/14 01:58, Jiong Wang wrote:
>>>> On 22/09/14 16:43, Kugan wrote:
>>>>
>>>>> AArch64 has the same issue ARM had where the LR register was not
>>>>> used in
>>>>> leaf functions. This was reported in
>>>>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=42017. In AArch64, this
>>>>> test-case need to be added with more live ranges for the need for the
>>>>> LR_REGNUM. i.e test-case in the PR needs additional loops up to r31 for
>>>>> the case AArch64 to see this.
>>>>>
>>>>> The same fix (from the thread
>>>>> https://gcc.gnu.org/ml/gcc-patches/2011-04/msg02191.html) which went
>>>>> into ARM should apply to AArch64 as well. Regression tested on qemu for
>>>>> aarch64-none-linux-gnu with no new regressions. Is this OK for trunk?
>>>> This still be a partial fix. LR should be a caller-saved register free
>>>> to use in case it's saved properly to across function call.
>>> Indeed. This should be improved from the generic code. Right now, if a
>>> hard register is used in EPILOGUE_USES, it conflicts with all the live
>>> ranges till a call site kills.  I think we should have this patch till
>>> the generic code can be improved.
>> below is my local patch. LR is treated as free register, and strictly
>> following AArch64 ABI, frame should always be created, FP maintained
>> properly if LR clobbered under -fno-omit-frame-pointer.
> Thanks Jiong. Sorry I missed your point. As for the additions in your patch:

I am really sorry, just noticed the reply...

>     /* ... and any callee saved register that dataflow says is live.  */
>     for (regno = R0_REGNUM; regno <= R30_REGNUM; regno++)
>       if (df_regs_ever_live_p (regno)
> -	&& !call_used_regs[regno])
> +	&& (regno == R30_REGNUM
> +	    || !call_used_regs[regno]))
>         cfun->machine->frame.reg_offset[regno] = SLOT_REQUIRED;
>
> AArch64 CALL_USED_REGISTERS defines R30_REGNUM to be zero. Therefore
> shouldn’t this be redundant?

that's because the patch also modified r30 to be caller saved.


-    0, 0, 0, 0,   0, 1, 0, 1,	/* R24 - R30, SP */	\
+    0, 0, 0, 0,   0, 1, 1, 1,	/* R24 - R30, SP */	\

thus, we don't need ad-hoc code in pro/epi to save/restore LR.

on the other hand, we want LR to be a callee-saved register that it could be
used as free register is some scenario.

you can't compile the testcase lr_free_2.c for details.

>       }
> +  else
> +    {
> +      /* If we decided that we didn't need a leaf frame pointer but
> then used
> +	 LR in the function, then we'll want a frame pointer after all, so
> +	 prevent this elimination to ensure a frame pointer is used.  */
> +      if (to == STACK_POINTER_REGNUM
> +	  && flag_omit_leaf_frame_pointer
> +	  && df_regs_ever_live_p (LR_REGNUM))
> +	return false;
> +    }
>
> aarch64_frame_pointer_required makes frame pointer needed when
> flag_omit_leaf_frame_pointer and df_regs_ever_live_p (LR_REGNUM).
> Is this addition still needed?

it's needed.

the problem is the df_reg_ever_live_p (LR_REGNUM) in "aarch64_frame_pointer_required"

   if (flag_omit_leaf_frame_pointer
       && (!crtl->is_leaf || df_regs_ever_live_p (LR_REGNUM)))

is invoked before register allocation that it can only catch those LR_REGNUM alive scenarios produced
by inline assembly clobber, for example __asm__ ("":::"x30"), written by user explicitly in code, while
it can't catch the scenarios that LR_REGNUM allocated by register allocator.

so, we need to add another check in aarch64_can_eliminate which is invoked after register allocation to
catch those scenarios where LR allocated by register allocator.

thanks.

>
> Thanks,
> Kugan
>
Marcus Shawcroft Nov. 11, 2014, 2:48 p.m. UTC | #3
On 30 September 2014 16:00, Jiong Wang <jiong.wang@arm.com> wrote:

> gcc/
>   * config/aarch64/aarch64.h (CALL_USED_REGISTERS): Mark LR as caller-save.
>   (EPILOGUE_USES): Guard the check by epilogue_completed.
>   * config/aarch64/aarch64.c (aarch64_layout_frame): Explictly check for LR.
>   (aarch64_can_eliminate): Check LR_REGNUM liveness.
>
> gcc/testsuite/
>   * gcc.target/aarch64/lr_free_1.c: New testcase for -fomit-frame-pointer.
>   * gcc.target/aarch64/lr_free_2.c: New testcase for leaf
> -fno-omit-frame-pointer.

OK /Marcus
Andrew Pinski Nov. 15, 2014, 12:15 a.m. UTC | #4
On Tue, Sep 30, 2014 at 8:00 AM, Jiong Wang <jiong.wang@arm.com> wrote:
> On 27/09/14 22:20, Kugan wrote:
>>
>>
>> On 23/09/14 01:58, Jiong Wang wrote:
>>>
>>> On 22/09/14 16:43, Kugan wrote:
>>>
>>>> AArch64 has the same issue ARM had where the LR register was not used in
>>>> leaf functions. This was reported in
>>>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=42017. In AArch64, this
>>>> test-case need to be added with more live ranges for the need for the
>>>> LR_REGNUM. i.e test-case in the PR needs additional loops up to r31 for
>>>> the case AArch64 to see this.
>>>>
>>>> The same fix (from the thread
>>>> https://gcc.gnu.org/ml/gcc-patches/2011-04/msg02191.html) which went
>>>> into ARM should apply to AArch64 as well. Regression tested on qemu for
>>>> aarch64-none-linux-gnu with no new regressions. Is this OK for trunk?
>>>
>>> This still be a partial fix. LR should be a caller-saved register free
>>> to use in case it's saved properly to across function call.
>>
>> Indeed. This should be improved from the generic code. Right now, if a
>> hard register is used in EPILOGUE_USES, it conflicts with all the live
>> ranges till a call site kills.  I think we should have this patch till
>> the generic code can be improved.
>
>
> below is my local patch. LR is treated as free register, and strictly
> following AArch64 ABI, frame should always be created, FP maintained
> properly if LR clobbered under -fno-omit-frame-pointer.
>
>
> gcc/
>   * config/aarch64/aarch64.h (CALL_USED_REGISTERS): Mark LR as caller-save.
>   (EPILOGUE_USES): Guard the check by epilogue_completed.
>   * config/aarch64/aarch64.c (aarch64_layout_frame): Explictly check for LR.
>   (aarch64_can_eliminate): Check LR_REGNUM liveness.
>
> gcc/testsuite/
>   * gcc.target/aarch64/lr_free_1.c: New testcase for -fomit-frame-pointer.
>   * gcc.target/aarch64/lr_free_2.c: New testcase for leaf
> -fno-omit-frame-pointer.

+      /* If we decided that we didn't need a leaf frame pointer but then used
+ LR in the function, then we'll want a frame pointer after all, so
+ prevent this elimination to ensure a frame pointer is used.  */
+      if (to == STACK_POINTER_REGNUM
+  && flag_omit_leaf_frame_pointer
+  && df_regs_ever_live_p (LR_REGNUM))
+ return false;

This breaks my build on aarch64-elf (with some local modifications) as
aarch64_frame_pointer_required returns true but then we use LR but now
aarch64_can_eliminate and aarch64_frame_pointer_required are
inconsitant which is not a valid thing for LRA (and reload).

This was mentioned in https://gcc.gnu.org/ml/gcc-patches/2013-12/msg00151.html :
" IRA calls hook frame_pointer_required and it returns false. After
that LRA calls can_eliminate hook and it returns false which means
that fp can not be used for allocation and we should spill all pseudos
assigned to it."

Can you revert your patch until you can figure out how to get LRA (and
reload) to play nicely with what you want to do?

Thanks,
Andrew Pinski
Wang Jiong Nov. 15, 2014, 2:08 p.m. UTC | #5
2014-11-15 0:15 GMT+00:00 Andrew Pinski <pinskia@gmail.com>:
> On Tue, Sep 30, 2014 at 8:00 AM, Jiong Wang <jiong.wang@arm.com> wrote:
>> On 27/09/14 22:20, Kugan wrote:
>>>
>>>
>>> On 23/09/14 01:58, Jiong Wang wrote:
>
> +      /* If we decided that we didn't need a leaf frame pointer but then used
> + LR in the function, then we'll want a frame pointer after all, so
> + prevent this elimination to ensure a frame pointer is used.  */
> +      if (to == STACK_POINTER_REGNUM
> +  && flag_omit_leaf_frame_pointer
> +  && df_regs_ever_live_p (LR_REGNUM))
> + return false;
>
> This breaks my build on aarch64-elf (with some local modifications)

Hi Andrew,

  then what's your local modification?
  I think the problem is we need to figure out why there is an ICE
after your local modification?
  can you please send me your local modification and testcase if possible.


> aarch64_frame_pointer_required returns true but then we use LR but now
> aarch64_can_eliminate and aarch64_frame_pointer_required are
> inconsitant which is not a valid thing for LRA (and reload).
>
> This was mentioned in https://gcc.gnu.org/ml/gcc-patches/2013-12/msg00151.html :
> " IRA calls hook frame_pointer_required and it returns false. After
> that LRA calls can_eliminate hook and it returns false which means
> that fp can not be used for allocation and we should spill all pseudos
> assigned to it."
>
> Thanks,
> Andrew Pinski
Andrew Pinski Nov. 15, 2014, 3:21 p.m. UTC | #6
On Sat, Nov 15, 2014 at 6:08 AM, Jiong Wang
<wong.kwongyuan.tools@gmail.com> wrote:
> 2014-11-15 0:15 GMT+00:00 Andrew Pinski <pinskia@gmail.com>:
>> On Tue, Sep 30, 2014 at 8:00 AM, Jiong Wang <jiong.wang@arm.com> wrote:
>>> On 27/09/14 22:20, Kugan wrote:
>>>>
>>>>
>>>> On 23/09/14 01:58, Jiong Wang wrote:
>>
>> +      /* If we decided that we didn't need a leaf frame pointer but then used
>> + LR in the function, then we'll want a frame pointer after all, so
>> + prevent this elimination to ensure a frame pointer is used.  */
>> +      if (to == STACK_POINTER_REGNUM
>> +  && flag_omit_leaf_frame_pointer
>> +  && df_regs_ever_live_p (LR_REGNUM))
>> + return false;
>>
>> This breaks my build on aarch64-elf (with some local modifications)
>
> Hi Andrew,
>
>   then what's your local modification?
>   I think the problem is we need to figure out why there is an ICE
> after your local modification?
>   can you please send me your local modification and testcase if possible.

My local modifications can be found in the gcc git at
apinski/thunderx-cost.  Note I reverted this patch so I can continue
working.  The testcase is compiling newlib.  Let me try to get it
again.
I was configuring a combined build with:
--disable-fixed-point --without-ppl --without-python --disable-werror
--enable-plugins --enable-checking --disable-sim --with-newlib
--disable-tls --with-cpu=thunderx --with-multilib-list=lp64,ilp32
--target=aarch64-thunderx-elf --enable-languages=c,c++

Thanks,
Andrew Pinski

>
>
>> aarch64_frame_pointer_required returns true but then we use LR but now
>> aarch64_can_eliminate and aarch64_frame_pointer_required are
>> inconsitant which is not a valid thing for LRA (and reload).
>>
>> This was mentioned in https://gcc.gnu.org/ml/gcc-patches/2013-12/msg00151.html :
>> " IRA calls hook frame_pointer_required and it returns false. After
>> that LRA calls can_eliminate hook and it returns false which means
>> that fp can not be used for allocation and we should spill all pseudos
>> assigned to it."
>>
>> Thanks,
>> Andrew Pinski
Andrew Pinski Nov. 15, 2014, 3:49 p.m. UTC | #7
On Sat, Nov 15, 2014 at 7:21 AM, Andrew Pinski <pinskia@gmail.com> wrote:
> On Sat, Nov 15, 2014 at 6:08 AM, Jiong Wang
> <wong.kwongyuan.tools@gmail.com> wrote:
>> 2014-11-15 0:15 GMT+00:00 Andrew Pinski <pinskia@gmail.com>:
>>> On Tue, Sep 30, 2014 at 8:00 AM, Jiong Wang <jiong.wang@arm.com> wrote:
>>>> On 27/09/14 22:20, Kugan wrote:
>>>>>
>>>>>
>>>>> On 23/09/14 01:58, Jiong Wang wrote:
>>>
>>> +      /* If we decided that we didn't need a leaf frame pointer but then used
>>> + LR in the function, then we'll want a frame pointer after all, so
>>> + prevent this elimination to ensure a frame pointer is used.  */
>>> +      if (to == STACK_POINTER_REGNUM
>>> +  && flag_omit_leaf_frame_pointer
>>> +  && df_regs_ever_live_p (LR_REGNUM))
>>> + return false;
>>>
>>> This breaks my build on aarch64-elf (with some local modifications)
>>
>> Hi Andrew,
>>
>>   then what's your local modification?
>>   I think the problem is we need to figure out why there is an ICE
>> after your local modification?
>>   can you please send me your local modification and testcase if possible.
>
> My local modifications can be found in the gcc git at
> apinski/thunderx-cost.  Note I reverted this patch so I can continue
> working.  The testcase is compiling newlib.  Let me try to get it
> again.
> I was configuring a combined build with:
> --disable-fixed-point --without-ppl --without-python --disable-werror
> --enable-plugins --enable-checking --disable-sim --with-newlib
> --disable-tls --with-cpu=thunderx --with-multilib-list=lp64,ilp32
> --target=aarch64-thunderx-elf --enable-languages=c,c++

Attached is the preprocessed source.
cc1 strtol.i -mabi=ilp32 -O2
is enough to reproduce the ICE.

Thanks,
Andrew

>
> Thanks,
> Andrew Pinski
>
>>
>>
>>> aarch64_frame_pointer_required returns true but then we use LR but now
>>> aarch64_can_eliminate and aarch64_frame_pointer_required are
>>> inconsitant which is not a valid thing for LRA (and reload).
>>>
>>> This was mentioned in https://gcc.gnu.org/ml/gcc-patches/2013-12/msg00151.html :
>>> " IRA calls hook frame_pointer_required and it returns false. After
>>> that LRA calls can_eliminate hook and it returns false which means
>>> that fp can not be used for allocation and we should spill all pseudos
>>> assigned to it."
>>>
>>> Thanks,
>>> Andrew Pinski
Wang Jiong Nov. 15, 2014, 7:53 p.m. UTC | #8
2014-11-15 15:49 GMT+00:00 Andrew Pinski <pinskia@gmail.com>:
>> My local modifications can be found in the gcc git at
>> apinski/thunderx-cost.  Note I reverted this patch so I can continue
>> working.  The testcase is compiling newlib.  Let me try to get it
>> again.
>> I was configuring a combined build with:
>> --disable-fixed-point --without-ppl --without-python --disable-werror
>> --enable-plugins --enable-checking --disable-sim --with-newlib
>> --disable-tls --with-cpu=thunderx --with-multilib-list=lp64,ilp32
>> --target=aarch64-thunderx-elf --enable-languages=c,c++
>
> Attached is the preprocessed source.
> cc1 strtol.i -mabi=ilp32 -O2
> is enough to reproduce the ICE.

thanks. I can reproduce this ICE under -mabi=ilp32 on your code base.

and it's strange LR marked as alive while the final assembly don't have it.
the reason of this is the following pattern

define_insn "*tb<optab><mode>1"
define_insn "*cb<optab><mode>1"

always declare to clobber some register while they don't always use them
in code generation, so sub-optimal code generated, some registers are wasted.

you can see before my patch x19 is allocated for that clobber, thus there
is an unnecessary save of x19 to stack, while after my patch, x30 is allocated
for that clobber, so aarch64_can_eliminate invoked after this will
think this function
require frame pointer according to our ABI, so there are unncessary frame setup
instruction.

basically, it's not inconsistent between aarch64_require_frame_p and
aarch64_can_eliminate.  it's inconsistent between
aarch64_can_eliminate
invoked before assign_by_spills and after that.

and my first impression is that the gcc_assert in lra-eliminate.c is
to strong and need to be relaxed in some situation.

>
> Thanks,
> Andrew
>
diff mbox

Patch

diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h
index db950da..892b310 100644
--- a/gcc/config/aarch64/aarch64.h
+++ b/gcc/config/aarch64/aarch64.h
@@ -250,7 +250,7 @@  extern unsigned long aarch64_tune_flags;
     1, 1, 1, 1,   1, 1, 1, 1,	/* R0 - R7 */		\
     1, 1, 1, 1,   1, 1, 1, 1,	/* R8 - R15 */		\
     1, 1, 1, 0,   0, 0, 0, 0,	/* R16 - R23 */		\
-    0, 0, 0, 0,   0, 1, 0, 1,	/* R24 - R30, SP */	\
+    0, 0, 0, 0,   0, 1, 1, 1,	/* R24 - R30, SP */	\
     1, 1, 1, 1,   1, 1, 1, 1,	/* V0 - V7 */		\
     0, 0, 0, 0,   0, 0, 0, 0,	/* V8 - V15 */		\
     1, 1, 1, 1,   1, 1, 1, 1,   /* V16 - V23 */         \
@@ -309,7 +309,7 @@  extern unsigned long aarch64_tune_flags;
    considered live at the start of the called function.  */
 
 #define EPILOGUE_USES(REGNO) \
-  ((REGNO) == LR_REGNUM)
+  (epilogue_completed && (REGNO) == LR_REGNUM)
 
 /* EXIT_IGNORE_STACK should be nonzero if, when returning from a function,
    the stack pointer does not matter.  The value is tested only in
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 15c7be6..8b39b2a 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -1864,7 +1864,8 @@  aarch64_layout_frame (void)
   /* ... and any callee saved register that dataflow says is live.  */
   for (regno = R0_REGNUM; regno <= R30_REGNUM; regno++)
     if (df_regs_ever_live_p (regno)
-	&& !call_used_regs[regno])
+	&& (regno == R30_REGNUM
+	    || !call_used_regs[regno]))
       cfun->machine->frame.reg_offset[regno] = SLOT_REQUIRED;
 
   for (regno = V0_REGNUM; regno <= V31_REGNUM; regno++)
@@ -4313,6 +4314,16 @@  aarch64_can_eliminate (const int from, const int to)
 
       return false;
     }
+  else
+    {
+      /* If we decided that we didn't need a leaf frame pointer but then used
+	 LR in the function, then we'll want a frame pointer after all, so
+	 prevent this elimination to ensure a frame pointer is used.  */
+      if (to == STACK_POINTER_REGNUM
+	  && flag_omit_leaf_frame_pointer
+	  && df_regs_ever_live_p (LR_REGNUM))
+	return false;
+    }
 
   return true;
 }
diff --git a/gcc/testsuite/gcc.target/aarch64/lr_free_1.c b/gcc/testsuite/gcc.target/aarch64/lr_free_1.c
new file mode 100644
index 0000000..4c530a2
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/lr_free_1.c
@@ -0,0 +1,38 @@ 
+/* { dg-do run } */
+/* { dg-options "-fno-inline -O2 -fomit-frame-pointer -ffixed-x2 -ffixed-x3 -ffixed-x4 -ffixed-x5 -ffixed-x6 -ffixed-x7 -ffixed-x8 -ffixed-x9 -ffixed-x10 -ffixed-x11 -ffixed-x12 -ffixed-x13 -ffixed-x14 -ffixed-x15 -ffixed-x16 -ffixed-x17 -ffixed-x18 -ffixed-x19 -ffixed-x20 -ffixed-x21 -ffixed-x22 -ffixed-x23 -ffixed-x24 -ffixed-x25 -ffixed-x26 -ffixed-x27 -ffixed-28 -ffixed-29 --save-temps -mgeneral-regs-only -fno-ipa-cp" } */
+
+extern void abort ();
+
+int
+dec (int a, int b)
+{
+  return a + b;
+}
+
+int
+cal (int a, int b)
+{
+  int sum1 = a * b;
+  int sum2 = a / b;
+  int sum = dec (sum1, sum2);
+  return a + b + sum + sum1 + sum2;
+}
+
+int
+main (int argc, char **argv)
+{
+  int ret = cal (2, 1);
+
+  if (ret != 11)
+    abort ();
+
+  return 0;
+}
+
+/* { dg-final { scan-assembler-times "str\tx30, \\\[sp, -\[0-9\]+\\\]!" 2 } } */
+/* { dg-final { scan-assembler "str\tw30, \\\[sp, \[0-9\]+\\\]" } } */
+
+/* { dg-final { scan-assembler "ldr\tw30, \\\[sp, \[0-9\]+\\\]" } } */
+/* { dg-final { scan-assembler-times "ldr\tx30, \\\[sp\\\], \[0-9\]+" 2 } } */
+
+/* { dg-final { cleanup-saved-temps } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/lr_free_2.c b/gcc/testsuite/gcc.target/aarch64/lr_free_2.c
new file mode 100644
index 0000000..2bfb6ad
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/lr_free_2.c
@@ -0,0 +1,29 @@ 
+/* { dg-do run } */
+/* { dg-options "-fno-inline -O2 -ffixed-x2 -ffixed-x3 -ffixed-x4 -ffixed-x5 -ffixed-x6 -ffixed-x7 -ffixed-x8 -ffixed-x9 -ffixed-x10 -ffixed-x11 -ffixed-x12 -ffixed-x13 -ffixed-x14 -ffixed-x15 -ffixed-x16 -ffixed-x17 -ffixed-x18 -ffixed-x19 -ffixed-x20 -ffixed-x21 -ffixed-x22 -ffixed-x23 -ffixed-x24 -ffixed-x25 -ffixed-x26 -ffixed-x27 -ffixed-x28 --save-temps -mgeneral-regs-only -fno-ipa-cp -fdump-rtl-ira" } */
+
+extern void abort ();
+
+int
+cal (int a, int b)
+{
+  /* { dg-final { scan-assembler-times "stp\tx29, x30, \\\[sp, -\[0-9\]+\\\]!" 2 } } */
+  int sum = a + b;
+  int sum1 = a * b;
+  /* { dg-final { scan-assembler-times "ldr\tx29, x30, \\\[sp\\\], \[0-9\]+" 2 } } */
+  /* { dg-final { scan-rtl-dump "assign reg 30" "ira" } } */
+  return (a + b + sum + sum1);
+}
+
+int
+main (int argc, char **argv)
+{
+  int ret = cal (1, 2);
+
+  if (ret != 8)
+    abort ();
+
+  return 0;
+}
+
+/* { dg-final { cleanup-saved-temps } } */
+/* { dg-final { cleanup-rtl-dump "ira" } } */