diff mbox

Fix PR middle-end/46894

Message ID 201101131720.52843.ebotcazou@adacore.com
State New
Headers show

Commit Message

Eric Botcazou Jan. 13, 2011, 4:20 p.m. UTC
Hi,

this is a fallout of Richard's stack realignment patch (PR rtl-opt/33721) 
visible on PowerPC 32-bit.  The problem is that in the checkg_32qi function:

typedef int __attribute__((mode(QI))) qi;

typedef qi __attribute__((vector_size (32))) v32qi;

typedef union U32QI { v32qi v; qi a[32]; } u32qi;

extern v32qi g_v32qi;

void checkg_32qi (void)
{
  u32qi u;
  qi *p = u.a;
  u.v = g_v32qi;
  checkp_32qi (p);
}

'u' overlaps 'p' on the stack:

(gdb) p &u
$26 = (u32qi *) 0xffffe500
(gdb) p &p
$27 = (qi **) 0xffffe518

The dynamic allocation for the super-aligned variable is broken:

(gdb) p/x $r1
$28 = 0xffffe510

(gdb) p/x $r0
$30 = 0xffffe500

the difference should be at least 32.

The wrong computation is done in allocate_dynamic_stack_space:

  if (must_align)
    {
      unsigned extra, extra_align;

      if (required_align > PREFERRED_STACK_BOUNDARY)
	extra_align = PREFERRED_STACK_BOUNDARY;
      else if (required_align > STACK_BOUNDARY)
	extra_align = STACK_BOUNDARY;
      else
	extra_align = BITS_PER_UNIT;
      extra = (required_align - extra_align) / BITS_PER_UNIT;

      size = plus_constant (size, extra);
      size = force_operand (size, NULL_RTX);

      if (flag_stack_usage)
	stack_usage_size += extra;

      if (extra && size_align > extra_align)
	size_align = extra_align;
    }

(gdb) p required_align
$10 = 256
(gdb) p extra_align
$11 = 128
(gdb) p extra
$12 = 16

but virtual_stack_dynamic_rtx isn't aligned, i.e. STACK_DYNAMIC_OFFSET is 
defined and isn't a multiple of PREFERRED_STACK_BOUNDARY, it's 8 here.

The attached patch forces extra_align to BITS_PER_UNIT in this case, i.e.

#if defined (STACK_DYNAMIC_OFFSET) || defined (STACK_POINTER_OFFSET)
  must_align = true;
#endif

Now there is a hitch: since

2003-10-07  Geoffrey Keating  <geoffk@apple.com>

	* function.c (pad_to_arg_alignment): Take STACK_POINTER_OFFSET into
	account	when aligning arguments.
	* calls.c (STACK_POINTER_OFFSET): Move default from here ...
	* defaults.h (STACK_POINTER_OFFSET): ... to here.

STACK_POINTER_OFFSET is always defined, so we always align; with the patch, 
we'll additionally always assume BITS_PER_UNIT.  But I guess this latter 
pessimization is dwarfed by the former.  What do you think, Richard?

Lightly tested on x86 and PowerPC Linux for now.


2011-01-13  Eric Botcazou  <ebotcazou@adacore.com>

	PR middle-end/46894
	* explow.c (allocate_dynamic_stack_space): Do not assume more than
	BITS_PER_UNIT alignment if STACK_DYNAMIC_OFFSET or STACK_POINTER_OFFSET
	are defined.

Comments

Richard Henderson Jan. 13, 2011, 4:42 p.m. UTC | #1
On 01/13/2011 08:20 AM, Eric Botcazou wrote:
> Now there is a hitch: since
> 
> 2003-10-07  Geoffrey Keating  <geoffk@apple.com>
> 
> 	* function.c (pad_to_arg_alignment): Take STACK_POINTER_OFFSET into
> 	account	when aligning arguments.
> 	* calls.c (STACK_POINTER_OFFSET): Move default from here ...
> 	* defaults.h (STACK_POINTER_OFFSET): ... to here.
> 
> STACK_POINTER_OFFSET is always defined, so we always align; with the patch, 
> we'll additionally always assume BITS_PER_UNIT.  But I guess this latter 
> pessimization is dwarfed by the former.  What do you think, Richard?

I think that you shouldn't change the behaviour of alignment wrt
STACK_POINTER_OFFSET.  So far, all targets do have (sp+S_P_O) with
the required alignment.

What about

  must_align = (crtl->preferred_stack_boundary < required_align);
  if (STACK_POINTER_OFFSET * BITS_PER_UNIT % required_align)
    must_align = true;

  if (must_align)
    {
      if (required_align > P_S_B)
	...
    }

#ifdef STACK_DYNAMIC_OFFSET
  /* ??? S_D_O will not be finalized until we've finished expanding the
     function.  It would be nice to know what minimum alignment we might
     assume.  E.g. PUSH_ROUNDING or something.  */
  must_align = true;
  extra_align = BITS_PER_UNIT;
#endif

  if (must_align)
    {
      unsigned extra = ...


r~
Eric Botcazou Jan. 13, 2011, 5:14 p.m. UTC | #2
> I think that you shouldn't change the behaviour of alignment wrt
> STACK_POINTER_OFFSET.  So far, all targets do have (sp+S_P_O) with
> the required alignment.
>
> What about
>
>   must_align = (crtl->preferred_stack_boundary < required_align);
>   if (STACK_POINTER_OFFSET * BITS_PER_UNIT % required_align)
>     must_align = true;
>
>   if (must_align)
>     {
>       if (required_align > P_S_B)
> 	...
>     }
>
> #ifdef STACK_DYNAMIC_OFFSET
>   /* ??? S_D_O will not be finalized until we've finished expanding the
>      function.  It would be nice to know what minimum alignment we might
>      assume.  E.g. PUSH_ROUNDING or something.  */
>   must_align = true;
>   extra_align = BITS_PER_UNIT;
> #endif
>
>   if (must_align)
>     {
>       unsigned extra = ...

Fine with me, but an explicit ack from a RM would probably be in order because 
you're changing the behavior for all targets where STACK_DYNAMIC_OFFSET isn't 
defined, i.e. all of them except for PPC, PA and s390; the default definition 
of STACK_DYNAMIC_OFFSET in function.c doesn't seem to guarantee that it will 
always maintain PREFERRED_STACK_BOUNDARY alignment.
Richard Henderson Jan. 13, 2011, 6:08 p.m. UTC | #3
On 01/13/2011 09:14 AM, Eric Botcazou wrote:
> Fine with me, but an explicit ack from a RM would probably be in order because 
> you're changing the behavior for all targets where STACK_DYNAMIC_OFFSET isn't 
> defined...

How's that?  If S_D_O is not defined here, then the ifdef would not have
triggered before either.  Or do you simply mean that we're currently 
always aligning, perhaps hiding problems with the default definition of
S_D_O from function.c?


r~
Eric Botcazou Jan. 13, 2011, 6:47 p.m. UTC | #4
> How's that?  If S_D_O is not defined here, then the ifdef would not have
> triggered before either.  Or do you simply mean that we're currently
> always aligning, perhaps hiding problems with the default definition of
> S_D_O from function.c?

Yes, I thought this was clear enough in the chunk of text you quoted, but 
re-reading it, this might indeed have been ambiguous.  To rephrase, since

2003-10-07  Geoffrey Keating  <geoffk@apple.com>

        * function.c (pad_to_arg_alignment): Take STACK_POINTER_OFFSET into
        account when aligning arguments.
        * calls.c (STACK_POINTER_OFFSET): Move default from here ...
        * defaults.h (STACK_POINTER_OFFSET): ... to here.

STACK_POINTER_OFFSET has always been defined, so we have always aligned in 
allocate_dynamic_stack_space.  On the 4.5 branch there is in defaults.h:

#ifndef STACK_POINTER_OFFSET
#define STACK_POINTER_OFFSET    0
#endif

and in allocate_dynamic_stack_space:

#if defined (STACK_DYNAMIC_OFFSET) || defined (STACK_POINTER_OFFSET)
#define MUST_ALIGN 1
#else
#define MUST_ALIGN (PREFERRED_STACK_BOUNDARY < BIGGEST_ALIGNMENT)
#endif

  if (MUST_ALIGN)
    {
      size
        = force_operand (plus_constant (size,
					BIGGEST_ALIGNMENT / BITS_PER_UNIT - 1),
			 NULL_RTX);

      if (flag_stack_usage_info)
	stack_usage_size += BIGGEST_ALIGNMENT / BITS_PER_UNIT - 1;

      known_align_valid = false;
    }

[...]

  if (MUST_ALIGN)
    {
      /* CEIL_DIV_EXPR needs to worry about the addition overflowing,
	 but we know it can't.  So add ourselves and then do
	 TRUNC_DIV_EXPR.  */
      target = expand_binop (Pmode, add_optab, target,
			     GEN_INT (BIGGEST_ALIGNMENT / BITS_PER_UNIT - 1),
			     NULL_RTX, 1, OPTAB_LIB_WIDEN);
      target = expand_divmod (0, TRUNC_DIV_EXPR, Pmode, target,
			      GEN_INT (BIGGEST_ALIGNMENT / BITS_PER_UNIT),
			      NULL_RTX, 1);
      target = expand_mult (Pmode, target,
			    GEN_INT (BIGGEST_ALIGNMENT / BITS_PER_UNIT),
			    NULL_RTX, 1);
    }
Iain Sandoe Jan. 14, 2011, 5:12 p.m. UTC | #5
Hi Eric,

FWIW, the posted patch reg-strapped on powerpc-darwin9 and fixes the  
bug.
(not that I'm suggesting the additional deliberations are unnecessary  
- just a data point).

cheers
Iain

On 13 Jan 2011, at 18:47, Eric Botcazou wrote:

>> How's that?  If S_D_O is not defined here, then the ifdef would not  
>> have
>> triggered before either.  Or do you simply mean that we're currently
>> always aligning, perhaps hiding problems with the default  
>> definition of
>> S_D_O from function.c?
>
> Yes, I thought this was clear enough in the chunk of text you  
> quoted, but
> re-reading it, this might indeed have been ambiguous.  To rephrase,  
> since
>
> 2003-10-07  Geoffrey Keating  <geoffk@apple.com>
>
>        * function.c (pad_to_arg_alignment): Take  
> STACK_POINTER_OFFSET into
>        account when aligning arguments.
>        * calls.c (STACK_POINTER_OFFSET): Move default from here ...
>        * defaults.h (STACK_POINTER_OFFSET): ... to here.
>
> STACK_POINTER_OFFSET has always been defined, so we have always  
> aligned in
> allocate_dynamic_stack_space.  On the 4.5 branch there is in  
> defaults.h:
>
> #ifndef STACK_POINTER_OFFSET
> #define STACK_POINTER_OFFSET    0
> #endif
>
> and in allocate_dynamic_stack_space:
>
> #if defined (STACK_DYNAMIC_OFFSET) || defined (STACK_POINTER_OFFSET)
> #define MUST_ALIGN 1
> #else
> #define MUST_ALIGN (PREFERRED_STACK_BOUNDARY < BIGGEST_ALIGNMENT)
> #endif
>
>  if (MUST_ALIGN)
>    {
>      size
>        = force_operand (plus_constant (size,
> 					BIGGEST_ALIGNMENT / BITS_PER_UNIT - 1),
> 			 NULL_RTX);
>
>      if (flag_stack_usage_info)
> 	stack_usage_size += BIGGEST_ALIGNMENT / BITS_PER_UNIT - 1;
>
>      known_align_valid = false;
>    }
>
> [...]
>
>  if (MUST_ALIGN)
>    {
>      /* CEIL_DIV_EXPR needs to worry about the addition overflowing,
> 	 but we know it can't.  So add ourselves and then do
> 	 TRUNC_DIV_EXPR.  */
>      target = expand_binop (Pmode, add_optab, target,
> 			     GEN_INT (BIGGEST_ALIGNMENT / BITS_PER_UNIT - 1),
> 			     NULL_RTX, 1, OPTAB_LIB_WIDEN);
>      target = expand_divmod (0, TRUNC_DIV_EXPR, Pmode, target,
> 			      GEN_INT (BIGGEST_ALIGNMENT / BITS_PER_UNIT),
> 			      NULL_RTX, 1);
>      target = expand_mult (Pmode, target,
> 			    GEN_INT (BIGGEST_ALIGNMENT / BITS_PER_UNIT),
> 			    NULL_RTX, 1);
>    }
>
> -- 
> Eric Botcazou
Eric Botcazou Jan. 15, 2011, noon UTC | #6
> FWIW, the posted patch reg-strapped on powerpc-darwin9 and fixes the
> bug.
> (not that I'm suggesting the additional deliberations are unnecessary
> - just a data point).

Thanks for the testing.

Here's my proposal: let's fix the regression by applying my original patch and 
open a new PR with Richard's patch attached and 4.7 as target milestone.
diff mbox

Patch

Index: explow.c
===================================================================
--- explow.c	(revision 168704)
+++ explow.c	(working copy)
@@ -1148,6 +1148,7 @@  allocate_dynamic_stack_space (rtx size,
 {
   HOST_WIDE_INT stack_usage_size = -1;
   rtx final_label, final_target, target;
+  unsigned extra_align = 0;
   bool must_align;
 
   /* If we're asking for zero bytes, it doesn't matter what we point
@@ -1230,22 +1231,26 @@  allocate_dynamic_stack_space (rtx size,
      If we have to align, we must leave space in SIZE for the hole
      that might result from the alignment operation.  */
 
-  must_align = (crtl->preferred_stack_boundary < required_align);
-#if defined (STACK_DYNAMIC_OFFSET) || defined (STACK_POINTER_OFFSET)
-  must_align = true;
-#endif
-
-  if (must_align)
+  if (crtl->preferred_stack_boundary < required_align)
     {
-      unsigned extra, extra_align;
-
+      must_align = true;
       if (required_align > PREFERRED_STACK_BOUNDARY)
 	extra_align = PREFERRED_STACK_BOUNDARY;
       else if (required_align > STACK_BOUNDARY)
 	extra_align = STACK_BOUNDARY;
       else
 	extra_align = BITS_PER_UNIT;
-      extra = (required_align - extra_align) / BITS_PER_UNIT;
+    }
+
+  /* ??? STACK_POINTER_OFFSET is always defined now.  */
+#if defined (STACK_DYNAMIC_OFFSET) || defined (STACK_POINTER_OFFSET)
+  must_align = true;
+  extra_align = BITS_PER_UNIT;
+#endif
+
+  if (must_align)
+    {
+      unsigned extra = (required_align - extra_align) / BITS_PER_UNIT;
 
       size = plus_constant (size, extra);
       size = force_operand (size, NULL_RTX);