[AArch64] Simplify frame layout for stack probing

Message ID DB6PR0801MB20538770C0B5E53FBBB7D48B83B80@DB6PR0801MB2053.eurprd08.prod.outlook.com
State New

Commit Message

Wilco Dijkstra July 25, 2017, 1:58 p.m. UTC
This patch makes some changes to the frame layout in order to simplify
stack probing.  We want to use the save of LR as a probe in any non-leaf
function.  With shrinkwrapping we may only save LR before a call, so it
is useful to define a fixed location in the callee-saves. So force LR at
the bottom of the callee-saves even with -fomit-frame-pointer.
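As a rough illustration of the slot assignment described above, here is a toy model in C. It is not the code in aarch64_layout_frame: the names toy_layout, needs_frame_record, force_lr_bottom and SLOT_UNUSED are made up for this sketch, and it ignores FP/SIMD registers and 16-byte alignment of the save area. It only shows where LR lands with and without the forcing.

```c
#include <stdbool.h>

#define UNITS_PER_WORD 8   /* bytes per X register on AArch64 */
#define NUM_GP_REGS 31     /* x0..x30 */
#define R29 29             /* frame pointer */
#define R30 30             /* link register (LR) */
#define SLOT_UNUSED (-1)   /* hypothetical marker: no slot assigned yet */

/* Toy model of callee-save slot assignment (not GCC's real code).
   saved[r] says whether register r must be saved; offsets are bytes
   from the bottom of the callee-save area.  Returns the area size.  */
static int
toy_layout (const bool saved[NUM_GP_REGS], bool needs_frame_record,
            bool is_leaf, bool force_lr_bottom, int reg_offset[NUM_GP_REGS])
{
  int offset = 0;

  for (int r = 0; r < NUM_GP_REGS; r++)
    reg_offset[r] = SLOT_UNUSED;

  if (needs_frame_record)
    {
      /* Frame record: FP at offset 0, LR directly above it.  */
      reg_offset[R29] = 0;
      reg_offset[R30] = UNITS_PER_WORD;
      offset = 2 * UNITS_PER_WORD;
    }
  else if (force_lr_bottom && !is_leaf)
    {
      /* The case this patch adds: LR takes the lowest slot.  */
      reg_offset[R30] = 0;
      offset = UNITS_PER_WORD;
    }

  /* Remaining saved registers get the next free slots in register order.  */
  for (int r = 0; r < NUM_GP_REGS; r++)
    if (saved[r] && reg_offset[r] == SLOT_UNUSED)
      {
        reg_offset[r] = offset;
        offset += UNITS_PER_WORD;
      }

  return offset;
}
```

For a non-leaf function without a frame record that saves x19, x20 and x30, this model puts x30 at offset 0 when force_lr_bottom is set, and at offset 16 (above x19 and x20) when it is not. The area size is 24 bytes either way.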

Also remove a rarely used frame layout that saves the callee-saves first
with -fomit-frame-pointer.
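
The layout being removed splits the stack adjustment in two: a writeback store pushes the callee-saves first, then a separate subtraction allocates the locals and outgoing arguments. A minimal sketch of that arithmetic follows. The struct and function names are hypothetical, and TOY_MAX_PUSH_OFFSET merely stands in for the max_push_offset limit tested in the patch; its value here is an assumption for illustration.

```c
#include <stdbool.h>

/* Assumption: stands in for the max_push_offset bound checked in the
   patch; the value is illustrative only.  */
#define TOY_MAX_PUSH_OFFSET 512

struct toy_adjust
{
  long callee_adjust;  /* SP drop done by the first writeback store */
  long final_adjust;   /* remaining SP drop for locals and outgoing args */
};

/* Toy model of the removed case.  Returns false when the layout would
   not have been selected; otherwise fills *out with the two-step split.  */
static bool
toy_removed_layout (long frame_size, long saved_regs_size,
                    bool frame_pointer_needed, struct toy_adjust *out)
{
  if (frame_pointer_needed || saved_regs_size >= TOY_MAX_PUSH_OFFSET)
    return false;

  out->callee_adjust = saved_regs_size;
  out->final_adjust = frame_size - saved_regs_size;
  return true;
}
```

With a 4128-byte frame and 32 bytes of callee-saves, the model gives callee_adjust = 32 and final_adjust = 4096, so the register saves land at the top of the frame and the large adjustment comes after them.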

OK for commit (and backport to GCC7)?

ChangeLog:
2017-07-25  Wilco Dijkstra  <wdijkstr@arm.com>

	* config/aarch64/aarch64.c (aarch64_layout_frame):
	Ensure LR is always stored at the bottom of the callee-saves.
	Remove frame option which saves callee-saves at top of frame.

--

Comments

Jeff Law July 26, 2017, 7:15 p.m. UTC | #1
On 07/25/2017 07:58 AM, Wilco Dijkstra wrote:
> This patch makes some changes to the frame layout in order to simplify
> stack probing.  We want to use the save of LR as a probe in any non-leaf
> function.  With shrinkwrapping we may only save LR before a call, so it
> is useful to define a fixed location in the callee-saves. So force LR at
> the bottom of the callee-saves even with -fomit-frame-pointer.
> 
> Also remove a rarely used frame layout that saves the callee-saves first
> with -fomit-frame-pointer.
> 
> OK for commit (and backport to GCC7)?
> 
> ChangeLog:
> 2017-07-25  Wilco Dijkstra  <wdijkstr@arm.com>
> 
> 	* config/aarch64/aarch64.c (aarch64_layout_frame):
> 	Ensure LR is always stored at the bottom of the callee-saves.
> 	Remove frame option which saves callee-saves at top of frame.
I'll let the appropriate aarch64 maintainers comment on correctness.
But I wanted to give an explicit thanks for simplifying this so that we
can rely on it.

Jeff
James Greenhalgh Oct. 26, 2017, 3:19 p.m. UTC | #2
On Tue, Jul 25, 2017 at 02:58:04PM +0100, Wilco Dijkstra wrote:
> This patch makes some changes to the frame layout in order to simplify
> stack probing.  We want to use the save of LR as a probe in any non-leaf
> function.  With shrinkwrapping we may only save LR before a call, so it
> is useful to define a fixed location in the callee-saves. So force LR at
> the bottom of the callee-saves even with -fomit-frame-pointer.
> 
> Also remove a rarely used frame layout that saves the callee-saves first
> with -fomit-frame-pointer.
> 
> OK for commit (and backport to GCC7)?

OK. Leave it a week before backporting.

Reviewed-by: James Greenhalgh <james.greenhalgh@arm.com>

Thanks,
James

> 
> ChangeLog:
> 2017-07-25  Wilco Dijkstra  <wdijkstr@arm.com>
> 
> 	* config/aarch64/aarch64.c (aarch64_layout_frame):
> 	Ensure LR is always stored at the bottom of the callee-saves.
> 	Remove frame option which saves callee-saves at top of frame.
>
James Greenhalgh Oct. 27, 2017, 8:46 a.m. UTC | #3
On Thu, Oct 26, 2017 at 04:19:35PM +0100, James Greenhalgh wrote:
> On Tue, Jul 25, 2017 at 02:58:04PM +0100, Wilco Dijkstra wrote:
> > This patch makes some changes to the frame layout in order to simplify
> > stack probing.  We want to use the save of LR as a probe in any non-leaf
> > function.  With shrinkwrapping we may only save LR before a call, so it
> > is useful to define a fixed location in the callee-saves. So force LR at
> > the bottom of the callee-saves even with -fomit-frame-pointer.
> > 
> > Also remove a rarely used frame layout that saves the callee-saves first
> > with -fomit-frame-pointer.
> > 
> > OK for commit (and backport to GCC7)?
> 
> OK. Leave it a week before backporting.

This caused:

  Failures:
	gcc.target/aarch64/test_frame_4.c
	gcc.target/aarch64/test_frame_2.c
	gcc.target/aarch64/test_frame_7.c
	gcc.target/aarch64/test_frame_10.c
	
  Bisected to: 

  Author: wilco
  Date:   Thu Oct 26 16:40:25 2017 +0000

    Simplify frame layout for stack probing
    
    This patch makes some changes to the frame layout in order to simplify
    stack probing.  We want to use the save of LR as a probe in any non-leaf
    function.  With shrinkwrapping we may only save LR before a call, so it
    is useful to define a fixed location in the callee-saves. So force LR at
    the bottom of the callee-saves even with -fomit-frame-pointer.
    
    Also remove a rarely used frame layout that saves the callee-saves first
    with -fomit-frame-pointer.  Doing so allows the store of LR to be used as
    a valid stack probe in all frames.
    
        gcc/
    	* config/aarch64/aarch64.c (aarch64_layout_frame):
            Ensure LR is always stored at the bottom of the callee-saves.
            Remove rarely used frame layout which saves callee-saves at top of
            frame, so the store of LR can be used as a valid probe in all cases.
    
    
    git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@254112

Please look into this.

This will also block the request to backport the patch until after the
failures have been resolved.

There's no reason we shouldn't be catching bugs like this (simple
scan assembler tests which have been in the port for years, that will
obviously never pass after your changes) before the patch makes it to
trunk. How was this patch tested?

Thanks,
James
Wilco Dijkstra Nov. 3, 2017, 4:47 p.m. UTC | #4
James Greenhalgh wrote:
>
> This caused:
>
>  Failures:
>        gcc.target/aarch64/test_frame_4.c
>        gcc.target/aarch64/test_frame_2.c
>        gcc.target/aarch64/test_frame_7.c
>        gcc.target/aarch64/test_frame_10.c

Sorry, I missed that in testing. I've reverted the part of the patch that caused this.
The tests are definitely too picky but they also uncovered a real code generation
inefficiency, so I need to look into that further.

I've committed this:

2017-11-03  Wilco Dijkstra  <wdijkstr@arm.com>

        PR target/82786
        * config/aarch64/aarch64.c (aarch64_layout_frame):
        Undo forcing of LR at bottom of frame.
--
diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 2fc7db4..949f3cb 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,9 @@
+2017-11-03  Wilco Dijkstra  <wdijkstr@arm.com>
+
+       PR target/82786
+       * config/aarch64/aarch64.c (aarch64_layout_frame):
+       Undo forcing of LR at bottom of frame.
+
 2017-11-03  Jeff Law  <law@redhat.com>
 
        * cfganal.c (single_pred_edge_ignoring_loop_edges): New function
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 1e12645..12f247d 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -2908,8 +2908,7 @@ aarch64_frame_pointer_required (void)
 
 /* Mark the registers that need to be saved by the callee and calculate
    the size of the callee-saved registers area and frame record (both FP
-   and LR may be omitted).  If the function is not a leaf, ensure LR is
-   saved at the bottom of the callee-save area.  */
+   and LR may be omitted).  */
 static void
 aarch64_layout_frame (void)
 {
@@ -2966,13 +2965,6 @@ aarch64_layout_frame (void)
       cfun->machine->frame.wb_candidate2 = R30_REGNUM;
       offset = 2 * UNITS_PER_WORD;
     }
-  else if (!crtl->is_leaf)
-    {
-      /* Ensure LR is saved at the bottom of the callee-saves.  */
-      cfun->machine->frame.reg_offset[R30_REGNUM] = 0;
-      cfun->machine->frame.wb_candidate1 = R30_REGNUM;
-      offset = UNITS_PER_WORD;
-    }
 
   /* Now assign stack slots for them.  */
   for (regno = R0_REGNUM; regno <= R30_REGNUM; regno++)

Patch

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index b8a4160d9de8e689ccd26cb9f0ce046ee65e0ef4..3fc36ae28d18b9635480fd99f1fa7719267e66e4 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -2875,7 +2875,8 @@  aarch64_frame_pointer_required (void)
 
 /* Mark the registers that need to be saved by the callee and calculate
    the size of the callee-saved registers area and frame record (both FP
-   and LR may be omitted).  */
+   and LR may be omitted).  If the function is not a leaf, ensure LR is
+   saved at the bottom of the callee-save area.  */
 static void
 aarch64_layout_frame (void)
 {
@@ -2926,7 +2927,14 @@  aarch64_layout_frame (void)
       cfun->machine->frame.wb_candidate1 = R29_REGNUM;
       cfun->machine->frame.reg_offset[R30_REGNUM] = UNITS_PER_WORD;
       cfun->machine->frame.wb_candidate2 = R30_REGNUM;
-      offset += 2 * UNITS_PER_WORD;
+      offset = 2 * UNITS_PER_WORD;
+    }
+  else if (!crtl->is_leaf)
+    {
+      /* Ensure LR is saved at the bottom of the callee-saves.  */
+      cfun->machine->frame.reg_offset[R30_REGNUM] = 0;
+      cfun->machine->frame.wb_candidate1 = R30_REGNUM;
+      offset = UNITS_PER_WORD;
     }
 
   /* Now assign stack slots for them.  */
@@ -3025,20 +3033,6 @@  aarch64_layout_frame (void)
       cfun->machine->frame.final_adjust
 	= cfun->machine->frame.frame_size - cfun->machine->frame.callee_adjust;
     }
-  else if (!frame_pointer_needed
-	   && varargs_and_saved_regs_size < max_push_offset)
-    {
-      /* Frame with large local area and outgoing arguments (this pushes the
-	 callee-saves first, followed by the locals and outgoing area):
-	 stp reg1, reg2, [sp, -varargs_and_saved_regs_size]!
-	 stp reg3, reg4, [sp, 16]
-	 sub sp, sp, frame_size - varargs_and_saved_regs_size  */
-      cfun->machine->frame.callee_adjust = varargs_and_saved_regs_size;
-      cfun->machine->frame.final_adjust
-	= cfun->machine->frame.frame_size - cfun->machine->frame.callee_adjust;
-      cfun->machine->frame.hard_fp_offset = cfun->machine->frame.callee_adjust;
-      cfun->machine->frame.locals_offset = cfun->machine->frame.hard_fp_offset;
-    }
   else
     {
       /* Frame with large local area and outgoing arguments using frame pointer: