Patchwork [CFT] Generic stack alignment for user variables

login
register
mail settings
Submitter Richard Henderson
Date Oct. 2, 2010, 10:51 p.m.
Message ID <4CA7B76E.1070008@redhat.com>
Download mbox | patch
Permalink /patch/66587/
State New
Headers show

Comments

Richard Henderson - Oct. 2, 2010, 10:51 p.m.
On 10/01/2010 06:14 PM, Jack Howarth wrote:
>    This patch causes massive breakage on x86_64-apple-darwin10. The testsuite
> is still running but at -m32 alone in the gcc testsuite I get the new failures of...
...
> FAIL: gcc.dg/torture/stackalign/alloca-1.c  -O0  execution test

Feh.  Should have tried plain linux myself; the -mpreferred-stack-boundary
switch does stuff that the other targets I tested can't.

A small bug in the explow.c change.  Here's a replacement patch for that file.


r~
Jack Howarth - Oct. 5, 2010, 1:02 a.m.
On Sat, Oct 02, 2010 at 03:51:26PM -0700, Richard Henderson wrote:
> On 10/01/2010 06:14 PM, Jack Howarth wrote:
> >    This patch causes massive breakage on x86_64-apple-darwin10. The testsuite
> > is still running but at -m32 alone in the gcc testsuite I get the new failures of...
> ...
> > FAIL: gcc.dg/torture/stackalign/alloca-1.c  -O0  execution test
> 
> Feh.  Should have tried plain linux myself; the -mpreferred-stack-boundary
> switch does stuff that the other targets I tested can't.
> 
> A small bug in the explow.c change.  Here's a replacement patch for that file.
> 
> 
> r~

Richard,
   It appears that you only tested -m64 on x86_64-apple-darwin10. Using the patch
below with the original d-stackalign-1 patch and testing with...

make -k check RUNTESTFLAGS="--target_board=unix'{-m32,-m64}'"

reveals failures at -m32 such as...

Running /sw/src/fink.build/gcc46-4.6.0-1000/gcc-4.6-20101004/gcc/testsuite/gcc.target/i386/stackalign/stackalign.exp ...
FAIL: gcc.target/i386/stackalign/longlong-1.c -mstackrealign (internal compiler error)
FAIL: gcc.target/i386/stackalign/longlong-1.c -mstackrealign (test for excess errors)
ERROR: gcc.target/i386/stackalign/longlong-1.c: error executing dg-final: couldn't open "longlong-1.s": no such file or directory
FAIL: gcc.target/i386/stackalign/longlong-1.c -mno-stackrealign (internal compiler error)
FAIL: gcc.target/i386/stackalign/longlong-1.c -mno-stackrealign (test for excess errors)
ERROR: gcc.target/i386/stackalign/longlong-1.c: error executing dg-final: couldn't open "longlong-1.s": no such file or directory

which appear in the form...

Executing on host: /sw/src/fink.build/gcc46-4.6.0-1000/darwin_objdir/gcc/xgcc -B/sw/src/fink.build/gcc46-4.6.0-1000/darwin_objdir/gcc/ /sw/src/fink.build/gcc46-4.6.0-1000/gcc-4.6-20101004/gcc/testsuite/gcc.target/i386/stackalign/longlong-1.c  -mstackrealign -O2 -mpreferred-stack-boundary=2 -S  -m32 -o longlong-1.s    (timeout = 300)
/sw/src/fink.build/gcc46-4.6.0-1000/gcc-4.6-20101004/gcc/testsuite/gcc.target/i386/stackalign/longlong-1.c: In function 'f1':
/sw/src/fink.build/gcc46-4.6.0-1000/gcc-4.6-20101004/gcc/testsuite/gcc.target/i386/stackalign/longlong-1.c:11:6: internal compiler error: in expand_one_stack_var_at, at cfgexpand.c:740
Please submit a full bug report,
with preprocessed source if appropriate.
See <http://gcc.gnu.org/bugs.html> for instructions.
compiler exited with status 1

Unfortunately this doesn't backtrace after the crash.
                       Jack

> Index: explow.c
> ===================================================================
> --- explow.c	(revision 164907)
> +++ explow.c	(working copy)
> @@ -1123,16 +1123,19 @@
>  }
>  
>  /* Return an rtx representing the address of an area of memory dynamically
> -   pushed on the stack.  This region of memory is always aligned to
> -   a multiple of BIGGEST_ALIGNMENT.
> +   pushed on the stack.
>  
>     Any required stack pointer alignment is preserved.
>  
>     SIZE is an rtx representing the size of the area.
> -   TARGET is a place in which the address can be placed.
>  
> -   KNOWN_ALIGN is the alignment (in bits) that we know SIZE has.
> +   SIZE_ALIGN is the alignment (in bits) that we know SIZE has.  This
> +   parameter may be zero.  If so, a proper value will be extracted 
> +   from SIZE if it is constant, otherwise BITS_PER_UNIT will be assumed.
>  
> +   REQUIRED_ALIGN is the alignment (in bits) required for the region
> +   of memory.
> +
>     If CANNOT_ACCUMULATE is set to TRUE, the caller guarantees that the
>     stack space allocated by the generated code cannot be added with itself
>     in the course of the execution of the function.  It is always safe to
> @@ -1141,12 +1144,12 @@
>     loops to it executes the associated deallocation code.  */
>  
>  rtx
> -allocate_dynamic_stack_space (rtx size, rtx target, int known_align,
> -			      bool cannot_accumulate)
> +allocate_dynamic_stack_space (rtx size, unsigned size_align,
> +			      unsigned required_align, bool cannot_accumulate)
>  {
>    HOST_WIDE_INT stack_usage_size = -1;
> -  bool known_align_valid = true;
> -  rtx final_label, final_target;
> +  rtx final_label, final_target, target;
> +  bool must_align;
>  
>    /* If we're asking for zero bytes, it doesn't matter what we point
>       to since we can't dereference it.  But return a reasonable
> @@ -1192,6 +1195,23 @@
>    if (GET_MODE (size) != VOIDmode && GET_MODE (size) != Pmode)
>      size = convert_to_mode (Pmode, size, 1);
>  
> +  /* Adjust SIZE_ALIGN, if needed.  */
> +  if (CONST_INT_P (size))
> +    {
> +      unsigned HOST_WIDE_INT lsb;
> +
> +      lsb = INTVAL (size);
> +      lsb &= -lsb;
> +
> +      /* Watch out for overflow truncating to "unsigned".  */
> +      if (lsb > UINT_MAX / BITS_PER_UNIT)
> +	size_align = 1u << (HOST_BITS_PER_INT - 1);
> +      else
> +	size_align = (unsigned)lsb * BITS_PER_UNIT;
> +    }
> +  else if (size_align < BITS_PER_UNIT)
> +    size_align = BITS_PER_UNIT;
> +
>    /* We can't attempt to minimize alignment necessary, because we don't
>       know the final value of preferred_stack_boundary yet while executing
>       this code.  */
> @@ -1199,35 +1219,39 @@
>      crtl->preferred_stack_boundary = PREFERRED_STACK_BOUNDARY;
>  
>    /* We will need to ensure that the address we return is aligned to
> -     BIGGEST_ALIGNMENT.  If STACK_DYNAMIC_OFFSET is defined, we don't
> +     REQUIRED_ALIGN.  If STACK_DYNAMIC_OFFSET is defined, we don't
>       always know its final value at this point in the compilation (it
>       might depend on the size of the outgoing parameter lists, for
>       example), so we must align the value to be returned in that case.
>       (Note that STACK_DYNAMIC_OFFSET will have a default nonzero value if
>       STACK_POINTER_OFFSET or ACCUMULATE_OUTGOING_ARGS are defined).
>       We must also do an alignment operation on the returned value if
> -     the stack pointer alignment is less strict that BIGGEST_ALIGNMENT.
> +     the stack pointer alignment is less strict than REQUIRED_ALIGN.
>  
>       If we have to align, we must leave space in SIZE for the hole
>       that might result from the alignment operation.  */
>  
> +  must_align = (crtl->preferred_stack_boundary < required_align);
>  #if defined (STACK_DYNAMIC_OFFSET) || defined (STACK_POINTER_OFFSET)
> -#define MUST_ALIGN 1
> -#else
> -#define MUST_ALIGN (crtl->preferred_stack_boundary < BIGGEST_ALIGNMENT)
> +  must_align = true;
>  #endif
>  
> -  if (MUST_ALIGN)
> +  if (must_align)
>      {
> -      size
> -        = force_operand (plus_constant (size,
> -					BIGGEST_ALIGNMENT / BITS_PER_UNIT - 1),
> -			 NULL_RTX);
> +      unsigned extra = required_align;
>  
> +      extra -= (required_align > PREFERRED_STACK_BOUNDARY
> +		? PREFERRED_STACK_BOUNDARY : 1);
> +      extra /= BITS_PER_UNIT;
> +
> +      size = plus_constant (size, extra);
> +      size = force_operand (size, NULL_RTX);
> +
>        if (flag_stack_usage)
> -	stack_usage_size += BIGGEST_ALIGNMENT / BITS_PER_UNIT - 1;
> +	stack_usage_size += extra;
>  
> -      known_align_valid = false;
> +      if (size_align > PREFERRED_STACK_BOUNDARY)
> +	size_align = PREFERRED_STACK_BOUNDARY;
>      }
>  
>  #ifdef SETJMP_VIA_SAVE_AREA
> @@ -1257,7 +1281,8 @@
>        if (flag_stack_usage)
>  	current_function_dynamic_alloc_count++;
>  
> -      known_align_valid = false;
> +      /* ??? Can we infer a minimum of STACK_BOUNDARY here?  */
> +      size_align = BITS_PER_UNIT;
>      }
>  #endif /* SETJMP_VIA_SAVE_AREA */
>  
> @@ -1274,7 +1299,7 @@
>       insns.  Since this is an extremely rare event, we have no reliable
>       way of knowing which systems have this problem.  So we avoid even
>       momentarily mis-aligning the stack.  */
> -  if (!known_align_valid || known_align % MAX_SUPPORTED_STACK_ALIGNMENT != 0)
> +  if (size_align % MAX_SUPPORTED_STACK_ALIGNMENT != 0)
>      {
>        size = round_push (size);
>  
> @@ -1285,14 +1310,8 @@
>  	}
>      }
>  
> -  /* Don't use a TARGET that isn't a pseudo or is the wrong mode.  */
> -  if (target == 0 || !REG_P (target)
> -      || REGNO (target) < FIRST_PSEUDO_REGISTER
> -      || GET_MODE (target) != Pmode)
> -    target = gen_reg_rtx (Pmode);
> +  target = gen_reg_rtx (Pmode);
>  
> -  mark_reg_pointer (target, known_align);
> -
>    /* The size is supposed to be fully adjusted at this point so record it
>       if stack usage info is requested.  */
>    if (flag_stack_usage)
> @@ -1341,7 +1360,6 @@
>  	return space;
>  
>        final_target = gen_reg_rtx (Pmode);
> -      mark_reg_pointer (final_target, known_align);
>  
>        emit_move_insn (final_target, space);
>  
> @@ -1440,35 +1458,38 @@
>  #endif
>      }
>  
> -  if (MUST_ALIGN)
> +  /* Finish up the split stack handling.  */
> +  if (final_label != NULL_RTX)
>      {
> +      gcc_assert (flag_split_stack);
> +      emit_move_insn (final_target, target);
> +      emit_label (final_label);
> +      target = final_target;
> +    }
> +
> +  if (must_align)
> +    {
>        /* CEIL_DIV_EXPR needs to worry about the addition overflowing,
>  	 but we know it can't.  So add ourselves and then do
>  	 TRUNC_DIV_EXPR.  */
>        target = expand_binop (Pmode, add_optab, target,
> -			     GEN_INT (BIGGEST_ALIGNMENT / BITS_PER_UNIT - 1),
> +			     GEN_INT (required_align / BITS_PER_UNIT - 1),
>  			     NULL_RTX, 1, OPTAB_LIB_WIDEN);
>        target = expand_divmod (0, TRUNC_DIV_EXPR, Pmode, target,
> -			      GEN_INT (BIGGEST_ALIGNMENT / BITS_PER_UNIT),
> +			      GEN_INT (required_align / BITS_PER_UNIT),
>  			      NULL_RTX, 1);
>        target = expand_mult (Pmode, target,
> -			    GEN_INT (BIGGEST_ALIGNMENT / BITS_PER_UNIT),
> +			    GEN_INT (required_align / BITS_PER_UNIT),
>  			    NULL_RTX, 1);
>      }
>  
> +  /* Now that we've committed to a return value, mark its alignment.  */
> +  mark_reg_pointer (target, required_align);
> +
>    /* Record the new stack level for nonlocal gotos.  */
>    if (cfun->nonlocal_goto_save_area != 0)
>      update_nonlocal_goto_save_area ();
>  
> -  /* Finish up the split stack handling.  */
> -  if (final_label != NULL_RTX)
> -    {
> -      gcc_assert (flag_split_stack);
> -      emit_move_insn (final_target, target);
> -      emit_label (final_label);
> -      target = final_target;
> -    }
> -
>    return target;
>  }
>

Patch

Index: explow.c
===================================================================
--- explow.c	(revision 164907)
+++ explow.c	(working copy)
@@ -1123,16 +1123,19 @@ 
 }
 
 /* Return an rtx representing the address of an area of memory dynamically
-   pushed on the stack.  This region of memory is always aligned to
-   a multiple of BIGGEST_ALIGNMENT.
+   pushed on the stack.
 
    Any required stack pointer alignment is preserved.
 
    SIZE is an rtx representing the size of the area.
-   TARGET is a place in which the address can be placed.
 
-   KNOWN_ALIGN is the alignment (in bits) that we know SIZE has.
+   SIZE_ALIGN is the alignment (in bits) that we know SIZE has.  This
+   parameter may be zero.  If so, a proper value will be extracted 
+   from SIZE if it is constant, otherwise BITS_PER_UNIT will be assumed.
 
+   REQUIRED_ALIGN is the alignment (in bits) required for the region
+   of memory.
+
    If CANNOT_ACCUMULATE is set to TRUE, the caller guarantees that the
    stack space allocated by the generated code cannot be added with itself
    in the course of the execution of the function.  It is always safe to
@@ -1141,12 +1144,12 @@ 
    loops to it executes the associated deallocation code.  */
 
 rtx
-allocate_dynamic_stack_space (rtx size, rtx target, int known_align,
-			      bool cannot_accumulate)
+allocate_dynamic_stack_space (rtx size, unsigned size_align,
+			      unsigned required_align, bool cannot_accumulate)
 {
   HOST_WIDE_INT stack_usage_size = -1;
-  bool known_align_valid = true;
-  rtx final_label, final_target;
+  rtx final_label, final_target, target;
+  bool must_align;
 
   /* If we're asking for zero bytes, it doesn't matter what we point
      to since we can't dereference it.  But return a reasonable
@@ -1192,6 +1195,23 @@ 
   if (GET_MODE (size) != VOIDmode && GET_MODE (size) != Pmode)
     size = convert_to_mode (Pmode, size, 1);
 
+  /* Adjust SIZE_ALIGN, if needed.  */
+  if (CONST_INT_P (size))
+    {
+      unsigned HOST_WIDE_INT lsb;
+
+      lsb = INTVAL (size);
+      lsb &= -lsb;
+
+      /* Watch out for overflow truncating to "unsigned".  */
+      if (lsb > UINT_MAX / BITS_PER_UNIT)
+	size_align = 1u << (HOST_BITS_PER_INT - 1);
+      else
+	size_align = (unsigned)lsb * BITS_PER_UNIT;
+    }
+  else if (size_align < BITS_PER_UNIT)
+    size_align = BITS_PER_UNIT;
+
   /* We can't attempt to minimize alignment necessary, because we don't
      know the final value of preferred_stack_boundary yet while executing
      this code.  */
@@ -1199,35 +1219,39 @@ 
     crtl->preferred_stack_boundary = PREFERRED_STACK_BOUNDARY;
 
   /* We will need to ensure that the address we return is aligned to
-     BIGGEST_ALIGNMENT.  If STACK_DYNAMIC_OFFSET is defined, we don't
+     REQUIRED_ALIGN.  If STACK_DYNAMIC_OFFSET is defined, we don't
      always know its final value at this point in the compilation (it
      might depend on the size of the outgoing parameter lists, for
      example), so we must align the value to be returned in that case.
      (Note that STACK_DYNAMIC_OFFSET will have a default nonzero value if
      STACK_POINTER_OFFSET or ACCUMULATE_OUTGOING_ARGS are defined).
      We must also do an alignment operation on the returned value if
-     the stack pointer alignment is less strict that BIGGEST_ALIGNMENT.
+     the stack pointer alignment is less strict than REQUIRED_ALIGN.
 
      If we have to align, we must leave space in SIZE for the hole
      that might result from the alignment operation.  */
 
+  must_align = (crtl->preferred_stack_boundary < required_align);
 #if defined (STACK_DYNAMIC_OFFSET) || defined (STACK_POINTER_OFFSET)
-#define MUST_ALIGN 1
-#else
-#define MUST_ALIGN (crtl->preferred_stack_boundary < BIGGEST_ALIGNMENT)
+  must_align = true;
 #endif
 
-  if (MUST_ALIGN)
+  if (must_align)
     {
-      size
-        = force_operand (plus_constant (size,
-					BIGGEST_ALIGNMENT / BITS_PER_UNIT - 1),
-			 NULL_RTX);
+      unsigned extra = required_align;
 
+      extra -= (required_align > PREFERRED_STACK_BOUNDARY
+		? PREFERRED_STACK_BOUNDARY : 1);
+      extra /= BITS_PER_UNIT;
+
+      size = plus_constant (size, extra);
+      size = force_operand (size, NULL_RTX);
+
       if (flag_stack_usage)
-	stack_usage_size += BIGGEST_ALIGNMENT / BITS_PER_UNIT - 1;
+	stack_usage_size += extra;
 
-      known_align_valid = false;
+      if (size_align > PREFERRED_STACK_BOUNDARY)
+	size_align = PREFERRED_STACK_BOUNDARY;
     }
 
 #ifdef SETJMP_VIA_SAVE_AREA
@@ -1257,7 +1281,8 @@ 
       if (flag_stack_usage)
 	current_function_dynamic_alloc_count++;
 
-      known_align_valid = false;
+      /* ??? Can we infer a minimum of STACK_BOUNDARY here?  */
+      size_align = BITS_PER_UNIT;
     }
 #endif /* SETJMP_VIA_SAVE_AREA */
 
@@ -1274,7 +1299,7 @@ 
      insns.  Since this is an extremely rare event, we have no reliable
      way of knowing which systems have this problem.  So we avoid even
      momentarily mis-aligning the stack.  */
-  if (!known_align_valid || known_align % MAX_SUPPORTED_STACK_ALIGNMENT != 0)
+  if (size_align % MAX_SUPPORTED_STACK_ALIGNMENT != 0)
     {
       size = round_push (size);
 
@@ -1285,14 +1310,8 @@ 
 	}
     }
 
-  /* Don't use a TARGET that isn't a pseudo or is the wrong mode.  */
-  if (target == 0 || !REG_P (target)
-      || REGNO (target) < FIRST_PSEUDO_REGISTER
-      || GET_MODE (target) != Pmode)
-    target = gen_reg_rtx (Pmode);
+  target = gen_reg_rtx (Pmode);
 
-  mark_reg_pointer (target, known_align);
-
   /* The size is supposed to be fully adjusted at this point so record it
      if stack usage info is requested.  */
   if (flag_stack_usage)
@@ -1341,7 +1360,6 @@ 
 	return space;
 
       final_target = gen_reg_rtx (Pmode);
-      mark_reg_pointer (final_target, known_align);
 
       emit_move_insn (final_target, space);
 
@@ -1440,35 +1458,38 @@ 
 #endif
     }
 
-  if (MUST_ALIGN)
+  /* Finish up the split stack handling.  */
+  if (final_label != NULL_RTX)
     {
+      gcc_assert (flag_split_stack);
+      emit_move_insn (final_target, target);
+      emit_label (final_label);
+      target = final_target;
+    }
+
+  if (must_align)
+    {
       /* CEIL_DIV_EXPR needs to worry about the addition overflowing,
 	 but we know it can't.  So add ourselves and then do
 	 TRUNC_DIV_EXPR.  */
       target = expand_binop (Pmode, add_optab, target,
-			     GEN_INT (BIGGEST_ALIGNMENT / BITS_PER_UNIT - 1),
+			     GEN_INT (required_align / BITS_PER_UNIT - 1),
 			     NULL_RTX, 1, OPTAB_LIB_WIDEN);
       target = expand_divmod (0, TRUNC_DIV_EXPR, Pmode, target,
-			      GEN_INT (BIGGEST_ALIGNMENT / BITS_PER_UNIT),
+			      GEN_INT (required_align / BITS_PER_UNIT),
 			      NULL_RTX, 1);
       target = expand_mult (Pmode, target,
-			    GEN_INT (BIGGEST_ALIGNMENT / BITS_PER_UNIT),
+			    GEN_INT (required_align / BITS_PER_UNIT),
 			    NULL_RTX, 1);
     }
 
+  /* Now that we've committed to a return value, mark its alignment.  */
+  mark_reg_pointer (target, required_align);
+
   /* Record the new stack level for nonlocal gotos.  */
   if (cfun->nonlocal_goto_save_area != 0)
     update_nonlocal_goto_save_area ();
 
-  /* Finish up the split stack handling.  */
-  if (final_label != NULL_RTX)
-    {
-      gcc_assert (flag_split_stack);
-      emit_move_insn (final_target, target);
-      emit_label (final_label);
-      target = final_target;
-    }
-
   return target;
 }