diff mbox

[2/2,v4] Drop excess size used for run time allocated stack variables.

Message ID 20160726155317.GA5718@linux.vnet.ibm.com
State New
Headers show

Commit Message

Dominik Vogt July 26, 2016, 3:53 p.m. UTC
Finally a patch that works and is simple.  Bootstrapped and
regression tested on s390, s390x biarch and x86_64.  The new patch
exploits the known alignment of (stack pointer +
STACK_DYNAMIC_OFFSET) as described earlier (see below).  I think
that is the right way to get rid of the extra allocation.  It
took a long time to understand the problem.

As the patch triggers a bug in the fortran compiler, the
der_type.f90 test case may fail on some targets if this patch is
used without the fortran fix that I've posted in another thread.

(The patch also contains a fix for a typo in a comment in the
patched function.)

See ChangeLog for a full description of the new patch.

Since the patch is all new, we're not going to commit it without a
new OK.

> Actually I was goind to abandon the patch in its current state.
> :-)  We talked about it internally and concluded that the problem
> is really this:
> 
>  * get_dynamic_stack_size is passed a SIZE of a data block (which
>    is allocated elsewhere), the SIZE_ALIGN of the SIZE (i.e. the
>    alignment of the underlying memory units (e.g. 32 bytes split
>    into 4 times 8 bytes = 64 bit alignment) and the
>    REQUIRED_ALIGN of the data portion of the allocated memory.
>  * Assuming the function is called with SIZE = 2, SIZE_ALIGN = 8
>    and REQUIRED_ALIGN = 64 it first adds 7 bytes to SIZE -> 9.
>    This is what is needed to have two bytes 8-byte-aligned at some
>    memory location without any known alignment.
>  * Finally round_push is called to round up SIZE to a multiple of
>    the stack slot size.
> 
> The key to understanding this is that the function assumes that
> STACK_DYNMAIC_OFFSET is completely unknown at the time its called
> and therefore it does not make assumptions about the alignment of
> STACKPOINTER + STACK_DYNMAIC_OFFSET.  The latest patch simply
> hard-codes that SP + SDO is supposed to be aligned to at least
> stack slot size (and does that in a very complicated way).  Since
> there is no guarantee that this is the case on all targets, the
> patch is broken.  It may miscalculate a SIZE that is too small in
> some cases.
> 
> However, on many targets there is some guarantee about the
> alignment of SP + SDO even if the actual value of SDO is unknown.
> On s390x it's always 8-byte-aligned (stack slot size).  So the
> right fix should be to add knowledge about the target's guaranteed
> alignment of SP + SDO to the function.  I'm right now testing a
> much simpler patch that uses
> REGNO_POINTER_ALIGN(VIRTUAL_STACK_DYNAMIC_REGNUM) as the
> alignment.

Ciao

Dominik ^_^  ^_^

Comments

Jeff Law Aug. 18, 2016, 4:20 p.m. UTC | #1
On 07/26/2016 09:53 AM, Dominik Vogt wrote:
> Finally a patch that works and is simple.  Bootstrapped and
> regression tested on s390, s390x biarch and x86_64.  The new patch
> exploits the known alignment of (stack pointer +
> STACK_DYNAMIC_OFFSET) as described earlier (see below).  I think
> that is the right way to get rid of the extra allocation.  It
> took a long time to understand the problem.
>
> As the patch triggers a bug in the fortran compiler, the
> der_type.f90 test case may fail on some targets if this patch is
> used without the fortran fix that I've posted in another thread.
>
> (The patch also contains a fix for a typo in a comment in the
> patched function.)
>
> See ChangeLog for a full description of the new patch.
>
> Since the patch is all new, we're not going to commit it without a
> new OK.
I like this one much better :-)

OK.

Thanks for your patience,
JEff
Andreas Krebbel Aug. 23, 2016, 9:23 a.m. UTC | #2
> gcc/ChangeLog
> 
> 	* explow.c (get_dynamic_stack_size): Take known alignment of stack
> 	pointer + STACK_DYNAMIC_OFFSET into account when calculating the size
> 	needed.
> 	Correct a typo in a comment.

Applied.  Thanks!

-Andreas-
diff mbox

Patch

diff --git a/gcc/explow.c b/gcc/explow.c
index a345690..f97a214 100644
--- a/gcc/explow.c
+++ b/gcc/explow.c
@@ -1224,9 +1224,15 @@  get_dynamic_stack_size (rtx *psize, unsigned size_align,
      example), so we must preventively align the value.  We leave space
      in SIZE for the hole that might result from the alignment operation.  */
 
-  extra = (required_align - BITS_PER_UNIT) / BITS_PER_UNIT;
-  size = plus_constant (Pmode, size, extra);
-  size = force_operand (size, NULL_RTX);
+  unsigned known_align = REGNO_POINTER_ALIGN (VIRTUAL_STACK_DYNAMIC_REGNUM);
+  if (known_align == 0)
+    known_align = BITS_PER_UNIT;
+  if (required_align > known_align)
+    {
+      extra = (required_align - known_align) / BITS_PER_UNIT;
+      size = plus_constant (Pmode, size, extra);
+      size = force_operand (size, NULL_RTX);
+    }
 
   if (flag_stack_usage_info && pstack_usage_size)
     *pstack_usage_size += extra;
@@ -1235,7 +1241,7 @@  get_dynamic_stack_size (rtx *psize, unsigned size_align,
     size_align = BITS_PER_UNIT;
 
   /* Round the size to a multiple of the required stack alignment.
-     Since the stack if presumed to be rounded before this allocation,
+     Since the stack is presumed to be rounded before this allocation,
      this will maintain the required alignment.
 
      If the stack grows downward, we could save an insn by subtracting