diff mbox

[ARM] Fix register r3 wrongly used to save ip in nested APCS frame

Message ID 2528794.36Q92I15i4@polaris
State New
Headers show

Commit Message

Eric Botcazou Sept. 5, 2013, 2:07 p.m. UTC
Hi,

in case the ip register needs to be saved before establishing an APCS frame, 
i.e. when it is used as the static chain register, the prologue tries various 
tricks in the following order:

	       1. The last argument register r3.
	       2. A slot on the stack above the frame.  (This only
	          works if the function is not a varargs function).
	       3. Register r3 again, after pushing the argument registers
	          onto the stack.

#3 doesn't really work since its implementation reads:

	    {
	      /* Store the args on the stack.  */
	      if (cfun->machine->uses_anonymous_args)
		insn = emit_multi_reg_push
		  ((0xf0 >> (args_to_push / 4)) & 0xf);
	      else
		insn = emit_insn
		  (gen_addsi3 (stack_pointer_rtx, stack_pointer_rtx,
			       GEN_INT (- args_to_push)));

	      RTX_FRAME_RELATED_P (insn) = 1;

	      saved_pretend_args = 1;
	      fp_offset = args_to_push;
	      args_to_push = 0;

	      /* Now reuse r3 to preserve IP.  */
	      emit_set_insn (gen_rtx_REG (SImode, 3), ip_rtx);
	    }

It works only if cfun->machine->uses_anonymous_args is true, because in this 
case r3 is pushed onto the stack as part of the multi-reg push.  Otherwise, 
the contents of r3 are simply overwritten by the last line.

Fixed by saving ip on the stack in this latter case, using one of the slots to 
be used by the pushed arguments.  This eliminates the last ACATS failures on 
ARM/VxWorks.  OK for the mainline?


2013-09-05  Eric Botcazou  <ebotcazou@adacore.com>

	* config/arm/arm.c (arm_expand_prologue): In a nested APCS frame with
	arguments to push onto the stack and no varargs, save ip into a stack
	slot if r3 isn't available on entry.

Comments

Richard Earnshaw Sept. 5, 2013, 2:41 p.m. UTC | #1
On 05/09/13 15:07, Eric Botcazou wrote:
> Hi,
> 
> in case the ip register needs to be saved before establishing an APCS frame, 
> i.e. when it is used as the static chain register, the prologue tries various 
> tricks in the following order:
> 
> 	       1. The last argument register r3.
> 	       2. A slot on the stack above the frame.  (This only
> 	          works if the function is not a varargs function).
> 	       3. Register r3 again, after pushing the argument registers
> 	          onto the stack.
> 
> #3 doesn't really work since its implementation reads:
> 
> 	    {
> 	      /* Store the args on the stack.  */
> 	      if (cfun->machine->uses_anonymous_args)
> 		insn = emit_multi_reg_push
> 		  ((0xf0 >> (args_to_push / 4)) & 0xf);
> 	      else
> 		insn = emit_insn
> 		  (gen_addsi3 (stack_pointer_rtx, stack_pointer_rtx,
> 			       GEN_INT (- args_to_push)));
> 
> 	      RTX_FRAME_RELATED_P (insn) = 1;
> 
> 	      saved_pretend_args = 1;
> 	      fp_offset = args_to_push;
> 	      args_to_push = 0;
> 
> 	      /* Now reuse r3 to preserve IP.  */
> 	      emit_set_insn (gen_rtx_REG (SImode, 3), ip_rtx);
> 	    }
> 
> It works only if cfun->machine->uses_anonymous_args is true, because in this 
> case r3 is pushed onto the stack as part of the multi-reg push.  Otherwise, 
> the contents of r3 are simply overwritten by the last line.
> 
> Fixed by saving ip on the stack in this latter case, using one of the slots to 
> be used by the pushed arguments.  This eliminates the last ACATS failures on 
> ARM/VxWorks.  OK for the mainline?
> 
> 
> 2013-09-05  Eric Botcazou  <ebotcazou@adacore.com>
> 
> 	* config/arm/arm.c (arm_expand_prologue): In a nested APCS frame with
> 	arguments to push onto the stack and no varargs, save ip into a stack
> 	slot if r3 isn't available on entry.
> 
> 

This is all fragile code, so a testcase would be very much appreciated.

R.

> 
> p.diff
> 
> 
> Index: config/arm/arm.c
> ===================================================================
> --- config/arm/arm.c	(revision 202160)
> +++ config/arm/arm.c	(working copy)
> @@ -18209,11 +18209,13 @@ arm_expand_prologue (void)
>  	     whilst the frame is being created.  We try the following
>  	     places in order:
>  
> -	       1. The last argument register r3.
> -	       2. A slot on the stack above the frame.  (This only
> -	          works if the function is not a varargs function).
> +	       1. The last argument register r3 if it is available.
> +	       2. A slot on the stack above the frame if there are no
> +		  arguments to push onto the stack.
>  	       3. Register r3 again, after pushing the argument registers
> -	          onto the stack.
> +	          onto the stack, if this is a varargs function.
> +	       4. A slot on the stack created for the arguments to push,
> +		  if this isn't a varargs function.
>  
>  	     Note - we only need to tell the dwarf2 backend about the SP
>  	     adjustment in the second variant; the static chain register
> @@ -18244,21 +18246,25 @@ arm_expand_prologue (void)
>  	    {
>  	      /* Store the args on the stack.  */
>  	      if (cfun->machine->uses_anonymous_args)
> -		insn = emit_multi_reg_push
> -		  ((0xf0 >> (args_to_push / 4)) & 0xf);
> +		{
> +		  insn
> +		    = emit_multi_reg_push ((0xf0 >> (args_to_push / 4)) & 0xf);
> +		  emit_set_insn (gen_rtx_REG (SImode, 3), ip_rtx);
> +		  saved_pretend_args = 1;
> +		}
>  	      else
> -		insn = emit_insn
> -		  (gen_addsi3 (stack_pointer_rtx, stack_pointer_rtx,
> -			       GEN_INT (- args_to_push)));
> +		{
> +		  rtx x = plus_constant (Pmode, stack_pointer_rtx,
> +					 args_to_push - 4);
> +		  insn = emit_insn (gen_addsi3 (stack_pointer_rtx,
> +						stack_pointer_rtx,
> +						GEN_INT (- args_to_push)));
> +		  emit_set_insn (gen_frame_mem (SImode, x), ip_rtx);
> +		}
>  
>  	      RTX_FRAME_RELATED_P (insn) = 1;
> -
> -	      saved_pretend_args = 1;
>  	      fp_offset = args_to_push;
>  	      args_to_push = 0;
> -
> -	      /* Now reuse r3 to preserve IP.  */
> -	      emit_set_insn (gen_rtx_REG (SImode, 3), ip_rtx);
>  	    }
>  	}
>  
> @@ -18363,7 +18369,7 @@ arm_expand_prologue (void)
>  	      /* Recover the static chain register.  */
>  	      if (!arm_r3_live_at_start_p () || saved_pretend_args)
>  		insn = gen_rtx_REG (SImode, 3);
> -	      else /* if (crtl->args.pretend_args_size == 0) */
> +	      else
>  		{
>  		  insn = plus_constant (Pmode, hard_frame_pointer_rtx, 4);
>  		  insn = gen_frame_mem (SImode, insn);
>
Eric Botcazou Sept. 5, 2013, 2:59 p.m. UTC | #2
> This is all fragile code, so a testcase would be very much appreciated.

A small one is ACATS c45342a (ada/acats/tests/c4/c45342a.ada).
Eric Botcazou Oct. 14, 2013, 8:46 a.m. UTC | #3
> > 2013-09-05  Eric Botcazou  <ebotcazou@adacore.com>
> > 
> > 	* config/arm/arm.c (arm_expand_prologue): In a nested APCS frame with
> > 	arguments to push onto the stack and no varargs, save ip into a stack
> > 	slot if r3 isn't available on entry.
> 
> This is all fragile code, so a testcase would be very much appreciated.

I have attached a reduced Ada testcase, compile it for arm-wrs-vxworks and 
you'll see at the beginning of p__f$1593:

	.type	p__f$1593, %function
p__f$1593:
	@ Nested: function declared inside another function.
	@ args = 16, pretend = 4, frame = 140
	@ frame_needed = 1, uses_anonymous_args = 0
	sub	sp, sp, #4
	mov	r3, ip
	add	ip, sp, #4
	stmfd	sp!, {r4, r5, r6, r7, r8, r9, r10, fp, ip, lr, pc}
	sub	fp, ip, #8
	mov	ip, r3
	@ ip needed
	sub	sp, sp, #140
	str	r0, [fp, #-64]
	str	r1, [fp, #-72]
	str	r2, [fp, #-68]
	str	r3, [fp, #4]

so r3 is live on function's entry but gets clobbered by the ip save.  With the 
patch, the assembly code reads:

p__f$1593:
	@ Nested: function declared inside another function.
	@ args = 16, pretend = 4, frame = 140
	@ frame_needed = 1, uses_anonymous_args = 0
	sub	sp, sp, #4
	str	ip, [sp]
	add	ip, sp, #4
	stmfd	sp!, {r4, r5, r6, r7, r8, r9, r10, fp, ip, lr, pc}
	sub	fp, ip, #8
	ldr	ip, [fp, #4]
	@ ip needed
	sub	sp, sp, #140
	str	r0, [fp, #-64]
	str	r1, [fp, #-72]
	str	r2, [fp, #-68]
	str	r3, [fp, #4]

which looks correct.  FWIW we have had the patch in our tree for 4 months now.
Richard Earnshaw Nov. 28, 2013, 3:11 p.m. UTC | #4
Eric,

My apologies for taking so long to look at this.

> 2013-09-05  Eric Botcazou  <ebotcazou@adacore.com>
>
> 	* config/arm/arm.c (arm_expand_prologue): In a nested APCS frame with
> 	arguments to push onto the stack and no varargs, save ip into a stack
> 	slot if r3 isn't available on entry.

Sorry, but this is not quite right either, as shown by the attached
testcase (in C this time, so we can commit it to gcc.target/arm :-)

The problem is that if we have some alignment padding we end up storing
ip in one location but restoring it from another.

	str	ip, [sp, #4]
	add	ip, sp, #8
	stmfd	sp!, {fp, ip, lr, pc}
	sub	fp, ip, #12
	ldr	ip, [fp, #4]	// <==== Should be fp + 8
	@ ip needed
	str	r3, [fp, #8]
	ldr	ip, [ip]

R.

On 14/10/13 09:46, Eric Botcazou wrote:
> p__f$1593:
> 	@ Nested: function declared inside another function.
> 	@ args = 16, pretend = 4, frame = 140
> 	@ frame_needed = 1, uses_anonymous_args = 0
> 	sub	sp, sp, #4
> 	str	ip, [sp]
> 	add	ip, sp, #4
> 	stmfd	sp!, {r4, r5, r6, r7, r8, r9, r10, fp, ip, lr, pc}


> 	sub	fp, ip, #8
> 	ldr	ip, [fp, #4]
> 	@ ip needed
> 	sub	sp, sp, #140
> 	str	r0, [fp, #-64]
> 	str	r1, [fp, #-72]
> 	str	r2, [fp, #-68]
> 	str	r3, [fp, #4]
> 
> which looks correct.  FWIW we have had the patch in our tree for 4 months now.
> 
> 
> p.adb
> 
> 
> procedure P (I : Integer) is
> 
>   SUBTYPE S IS INTEGER RANGE 1..100;
>   TYPE ARR IS ARRAY (S RANGE <>) OF INTEGER;
> 
>   A : ARR (2..9);
> 
>   FUNCTION F (AR_VAR1, AR_VAR2, AR_VAR3 : ARR) RETURN ARR IS
>   BEGIN
>     if I = 0 then
>       RETURN AR_VAR1 & AR_VAR2 & AR_VAR3;
>     else
>       RETURN AR_VAR1;
>     end if;
>   END;
> 
> begin
>   A := (8,7,6,5,4,3,2,1);
>   if F(A(2..3), A(2..4), A(2..4)) /= (8,7,8,7,6,8,7,6) then
>     raise Program_Error;
>   end if;
> end;
>
/* { dg-do run } */
/* { dg-options "-fno-omit-frame-pointer -mapcs-frame -O" } */
struct x
{
  int y;
  int z;
};

int __attribute__((noinline)) f (int c, int d, int e, int h, int i)
{
  int a;
  struct x b;

  int __attribute__((noinline)) g (int p, int q, int r, struct x s)
  {
    return a + p + q + r + s.y + s.z;
  }

  a = 5;
  b.y = h;
  b.z = i;

  return g(c, d, e, b);
}

int main()
{
  if (f (1, 2, 3, 4, 5) != 20)
    abort();
  exit (0);
}
diff mbox

Patch

Index: config/arm/arm.c
===================================================================
--- config/arm/arm.c	(revision 202160)
+++ config/arm/arm.c	(working copy)
@@ -18209,11 +18209,13 @@  arm_expand_prologue (void)
 	     whilst the frame is being created.  We try the following
 	     places in order:
 
-	       1. The last argument register r3.
-	       2. A slot on the stack above the frame.  (This only
-	          works if the function is not a varargs function).
+	       1. The last argument register r3 if it is available.
+	       2. A slot on the stack above the frame if there are no
+		  arguments to push onto the stack.
 	       3. Register r3 again, after pushing the argument registers
-	          onto the stack.
+	          onto the stack, if this is a varargs function.
+	       4. A slot on the stack created for the arguments to push,
+		  if this isn't a varargs function.
 
 	     Note - we only need to tell the dwarf2 backend about the SP
 	     adjustment in the second variant; the static chain register
@@ -18244,21 +18246,25 @@  arm_expand_prologue (void)
 	    {
 	      /* Store the args on the stack.  */
 	      if (cfun->machine->uses_anonymous_args)
-		insn = emit_multi_reg_push
-		  ((0xf0 >> (args_to_push / 4)) & 0xf);
+		{
+		  insn
+		    = emit_multi_reg_push ((0xf0 >> (args_to_push / 4)) & 0xf);
+		  emit_set_insn (gen_rtx_REG (SImode, 3), ip_rtx);
+		  saved_pretend_args = 1;
+		}
 	      else
-		insn = emit_insn
-		  (gen_addsi3 (stack_pointer_rtx, stack_pointer_rtx,
-			       GEN_INT (- args_to_push)));
+		{
+		  rtx x = plus_constant (Pmode, stack_pointer_rtx,
+					 args_to_push - 4);
+		  insn = emit_insn (gen_addsi3 (stack_pointer_rtx,
+						stack_pointer_rtx,
+						GEN_INT (- args_to_push)));
+		  emit_set_insn (gen_frame_mem (SImode, x), ip_rtx);
+		}
 
 	      RTX_FRAME_RELATED_P (insn) = 1;
-
-	      saved_pretend_args = 1;
 	      fp_offset = args_to_push;
 	      args_to_push = 0;
-
-	      /* Now reuse r3 to preserve IP.  */
-	      emit_set_insn (gen_rtx_REG (SImode, 3), ip_rtx);
 	    }
 	}
 
@@ -18363,7 +18369,7 @@  arm_expand_prologue (void)
 	      /* Recover the static chain register.  */
 	      if (!arm_r3_live_at_start_p () || saved_pretend_args)
 		insn = gen_rtx_REG (SImode, 3);
-	      else /* if (crtl->args.pretend_args_size == 0) */
+	      else
 		{
 		  insn = plus_constant (Pmode, hard_frame_pointer_rtx, 4);
 		  insn = gen_frame_mem (SImode, insn);