diff mbox

[v4] avoid alignment of static variables affecting stack's

Message ID 566AE23802000078000BEAE4@prv-mh.provo.novell.com
State New
Headers show

Commit Message

Jan Beulich Dec. 11, 2015, 1:48 p.m. UTC
Function (or more narrow) scope static variables (as well as others not
placed on the stack) should also not have any effect on the stack
alignment. I noticed the issue first with Linux'es dynamic_pr_debug()
construct using an 8-byte aligned sub-file-scope local variable.

According to my checking bad behavior started with 4.6.x (4.5.3 was
still okay), but generated code got quite a bit worse as of 4.9.0.

[v4: Bail early, using is_global_var(), as requested by Bernd.]
[v3: Re-base to current trunk.]
[v2: Drop inclusion of hard register variables, as requested by
     Jakub and Richard.]

gcc/
2015-12-11  Jan Beulich  <jbeulich@suse.com>

	* cfgexpand.c (expand_one_var): Exit early for static and
	external variables when adjusting stack alignment related.

gcc/testsuite/
2015-12-11  Jan Beulich  <jbeulich@suse.com>

	* gcc.c-torture/execute/stkalign.c: New.
avoid alignment of static variables affecting stack's

Function (or more narrow) scope static variables (as well as others not
placed on the stack) should also not have any effect on the stack
alignment. I noticed the issue first with Linux'es dynamic_pr_debug()
construct using an 8-byte aligned sub-file-scope local variable.

According to my checking bad behavior started with 4.6.x (4.5.3 was
still okay), but generated code got quite a bit worse as of 4.9.0.

[v4: Bail early, using is_global_var(), as requested by Bernd.]
[v3: Re-base to current trunk.]
[v2: Drop inclusion of hard register variables, as requested by
     Jakub and Richard.]

gcc/
2015-12-11  Jan Beulich  <jbeulich@suse.com>

	* cfgexpand.c (expand_one_var): Exit early for static and
	external variables when adjusting stack alignment related.

gcc/testsuite/
2015-12-11  Jan Beulich  <jbeulich@suse.com>

	* gcc.c-torture/execute/stkalign.c: New.

--- 2015-12-09/gcc/cfgexpand.c
+++ 2015-12-09/gcc/cfgexpand.c
@@ -1550,6 +1550,9 @@ expand_one_var (tree var, bool toplevel,
 
   if (TREE_TYPE (var) != error_mark_node && TREE_CODE (var) == VAR_DECL)
     {
+      if (is_global_var (var))
+	return 0;
+
       /* Because we don't know if VAR will be in register or on stack,
 	 we conservatively assume it will be on stack even if VAR is
 	 eventually put into register after RA pass.  For non-automatic
--- 2015-12-09/gcc/testsuite/gcc.c-torture/execute/stkalign.c
+++ 2015-12-09/gcc/testsuite/gcc.c-torture/execute/stkalign.c
@@ -0,0 +1,26 @@
+/* { dg-options "-fno-inline" } */
+
+#include <assert.h>
+
+#define ALIGNMENT 64
+
+unsigned test(unsigned n, unsigned p)
+{
+  static struct { char __attribute__((__aligned__(ALIGNMENT))) c; } s;
+  unsigned x;
+
+  assert(__alignof__(s) == ALIGNMENT);
+  asm ("" : "=g" (x), "+m" (s) : "0" (&x));
+
+  return n ? test(n - 1, x) : (x ^ p);
+}
+
+int main (int argc, char *argv[] __attribute__((unused)))
+{
+  unsigned int x = test(argc, 0);
+
+  x |= test(argc + 1, 0);
+  x |= test(argc + 2, 0);
+
+  return !(x & (ALIGNMENT - 1));
+}

Comments

Bernd Schmidt Dec. 11, 2015, 1:54 p.m. UTC | #1
On 12/11/2015 02:48 PM, Jan Beulich wrote:
> Function (or more narrow) scope static variables (as well as others not
> placed on the stack) should also not have any effect on the stack
> alignment. I noticed the issue first with Linux'es dynamic_pr_debug()
> construct using an 8-byte aligned sub-file-scope local variable.
>
> According to my checking bad behavior started with 4.6.x (4.5.3 was
> still okay), but generated code got quite a bit worse as of 4.9.0.
>
> [v4: Bail early, using is_global_var(), as requested by Bernd.]

In case I haven't made it obvious, this is OK.


Bernd
Richard Biener Dec. 14, 2015, 8:35 a.m. UTC | #2
On Fri, Dec 11, 2015 at 2:54 PM, Bernd Schmidt <bschmidt@redhat.com> wrote:
> On 12/11/2015 02:48 PM, Jan Beulich wrote:
>>
>> Function (or more narrow) scope static variables (as well as others not
>> placed on the stack) should also not have any effect on the stack
>> alignment. I noticed the issue first with Linux'es dynamic_pr_debug()
>> construct using an 8-byte aligned sub-file-scope local variable.
>>
>> According to my checking bad behavior started with 4.6.x (4.5.3 was
>> still okay), but generated code got quite a bit worse as of 4.9.0.
>>
>> [v4: Bail early, using is_global_var(), as requested by Bernd.]
>
>
> In case I haven't made it obvious, this is OK.

But I wonder if it makes sense because shortly after the early-out we check

      if (TREE_STATIC (var)
          || DECL_EXTERNAL (var)
          || (TREE_CODE (origvar) == SSA_NAME && use_register_for_decl (var)))

so either there are obvious cleanup opportunities left or the patch is broken.

Richard.

>
> Bernd
Jan Beulich Dec. 14, 2015, 8:44 a.m. UTC | #3
>>> On 14.12.15 at 09:35, <richard.guenther@gmail.com> wrote:
> On Fri, Dec 11, 2015 at 2:54 PM, Bernd Schmidt <bschmidt@redhat.com> wrote:
>> On 12/11/2015 02:48 PM, Jan Beulich wrote:
>>>
>>> Function (or more narrow) scope static variables (as well as others not
>>> placed on the stack) should also not have any effect on the stack
>>> alignment. I noticed the issue first with Linux'es dynamic_pr_debug()
>>> construct using an 8-byte aligned sub-file-scope local variable.
>>>
>>> According to my checking bad behavior started with 4.6.x (4.5.3 was
>>> still okay), but generated code got quite a bit worse as of 4.9.0.
>>>
>>> [v4: Bail early, using is_global_var(), as requested by Bernd.]
>>
>>
>> In case I haven't made it obvious, this is OK.
> 
> But I wonder if it makes sense because shortly after the early-out we check
> 
>       if (TREE_STATIC (var)
>           || DECL_EXTERNAL (var)
>           || (TREE_CODE (origvar) == SSA_NAME && use_register_for_decl (var)))
> 
> so either there are obvious cleanup opportunities left or the patch is 
> broken.

Looks like a cleanup opportunity I overlooked when following
Bernd's advice.

Jan
Richard Biener Dec. 14, 2015, 9:07 a.m. UTC | #4
On Mon, Dec 14, 2015 at 9:44 AM, Jan Beulich <JBeulich@suse.com> wrote:
>>>> On 14.12.15 at 09:35, <richard.guenther@gmail.com> wrote:
>> On Fri, Dec 11, 2015 at 2:54 PM, Bernd Schmidt <bschmidt@redhat.com> wrote:
>>> On 12/11/2015 02:48 PM, Jan Beulich wrote:
>>>>
>>>> Function (or more narrow) scope static variables (as well as others not
>>>> placed on the stack) should also not have any effect on the stack
>>>> alignment. I noticed the issue first with Linux'es dynamic_pr_debug()
>>>> construct using an 8-byte aligned sub-file-scope local variable.
>>>>
>>>> According to my checking bad behavior started with 4.6.x (4.5.3 was
>>>> still okay), but generated code got quite a bit worse as of 4.9.0.
>>>>
>>>> [v4: Bail early, using is_global_var(), as requested by Bernd.]
>>>
>>>
>>> In case I haven't made it obvious, this is OK.
>>
>> But I wonder if it makes sense because shortly after the early-out we check
>>
>>       if (TREE_STATIC (var)
>>           || DECL_EXTERNAL (var)
>>           || (TREE_CODE (origvar) == SSA_NAME && use_register_for_decl (var)))
>>
>> so either there are obvious cleanup opportunities left or the patch is
>> broken.
>
> Looks like a cleanup opportunity I overlooked when following
> Bernd's advice.

Well, looking at callers it doesn't sound so obvious ...

/* Expand all variables used in the function.  */

static rtx_insn *
expand_used_vars (void)
{
..
  len = vec_safe_length (cfun->local_decls);
  FOR_EACH_LOCAL_DECL (cfun, i, var)
    {
...
      /* We didn't set a block for static or extern because it's hard
         to tell the difference between a global variable (re)declared
         in a local scope, and one that's really declared there to
         begin with.  And it doesn't really matter much, since we're
         not giving them stack space.  Expand them now.  */
      else if (TREE_STATIC (var) || DECL_EXTERNAL (var))
        expand_now = true;
...
      if (expand_now)
        expand_one_var (var, true, true);

but then you'll immediately return.

So to make sense of your patch a larger refactorig is necessary.  expand_one_var
also later contains

  else if (DECL_EXTERNAL (var))
    ;
  else if (DECL_HAS_VALUE_EXPR_P (var))
    ;
  else if (TREE_STATIC (var))
    ;

so expects externals/statics after recording alignment.  Which means
recording alignment is necessary.

Note that we also record alignment to make sure we can spill to properly
aligned stack slots.

I don't see why we don't need to do that for used statics/externs.  That is
are you sure we never need to spill a var of their type?

Richard.

> Jan
>
Jan Beulich Dec. 14, 2015, 9:37 a.m. UTC | #5
>>> On 14.12.15 at 10:07, <richard.guenther@gmail.com> wrote:
> Note that we also record alignment to make sure we can spill to properly
> aligned stack slots.
> 
> I don't see why we don't need to do that for used statics/externs.  That is
> are you sure we never need to spill a var of their type?

No, I'm not, but note that the discussion on v1/v2 of this patch never
really led anywhere, which prompted me to resend the patch after
several months of silence. Also I'm not convinced that hypothetical
spilling needs should lead to unconditional stack alignment increases.
I.e. either at the time alignment gets recorded it is known that a spill
is needed, or the spill gets avoided when stack alignment isn't large
enough (after all such spilling should only be an optimization, not a
correctness requirement aiui). For me to really look into this situation
I'd need to know conditions that would result in such a spill to actually
occur (I've never observed one in practice).

In any event (and again taking into consideration the long period of
silence on the previous discussion thread) I don't mind my change to
be reverted if only the problem finally gets taken care of. Globally
changing very many functions' stack alignment in e.g. the Linux kernel
just because of a function local static debugging variable getting
emitted in certain not uncommon configurations is not acceptable imo.

Jan
Bernd Schmidt Dec. 14, 2015, 12:09 p.m. UTC | #6
On 12/14/2015 10:07 AM, Richard Biener wrote:

> Note that we also record alignment to make sure we can spill to properly
> aligned stack slots.

> I don't see why we don't need to do that for used statics/externs.  That is
> are you sure we never need to spill a var of their type?

Why would they be different from other global variables declared outside 
a function? We don't have to worry about those.


Bernd
Richard Biener Dec. 14, 2015, 2:20 p.m. UTC | #7
On Mon, Dec 14, 2015 at 1:09 PM, Bernd Schmidt <bschmidt@redhat.com> wrote:
> On 12/14/2015 10:07 AM, Richard Biener wrote:
>
>> Note that we also record alignment to make sure we can spill to properly
>> aligned stack slots.
>
>
>> I don't see why we don't need to do that for used statics/externs.  That
>> is
>> are you sure we never need to spill a var of their type?
>
>
> Why would they be different from other global variables declared outside a
> function? We don't have to worry about those.

No idea.

But there must be a reason statics/externals are expected and handled in all the
callers (and the function).

The new early-out just makes the code even more confusing than before.

Richard.

>
> Bernd
>
Bernd Schmidt Dec. 14, 2015, 2:42 p.m. UTC | #8
On 12/14/2015 03:20 PM, Richard Biener wrote:

> But there must be a reason statics/externals are expected and handled in all the
> callers (and the function).

At the time this was written (git rev 60d031234), expand_one_var looked 
very different. It called a subroutine expand_one_static_var which did 
things like call rest_of_decl_compilation. That stuff was then removed 
by Jan in a patch to drop non-unit-at-a-time compilation.

> The new early-out just makes the code even more confusing than before.

Ok, so let's bail out even earlier. The whole expand_now thing for 
statics is likely to be just a historical artifact at this point.


Bernd
diff mbox

Patch

--- 2015-12-09/gcc/cfgexpand.c
+++ 2015-12-09/gcc/cfgexpand.c
@@ -1550,6 +1550,9 @@  expand_one_var (tree var, bool toplevel,
 
   if (TREE_TYPE (var) != error_mark_node && TREE_CODE (var) == VAR_DECL)
     {
+      if (is_global_var (var))
+	return 0;
+
       /* Because we don't know if VAR will be in register or on stack,
 	 we conservatively assume it will be on stack even if VAR is
 	 eventually put into register after RA pass.  For non-automatic
--- 2015-12-09/gcc/testsuite/gcc.c-torture/execute/stkalign.c
+++ 2015-12-09/gcc/testsuite/gcc.c-torture/execute/stkalign.c
@@ -0,0 +1,26 @@ 
+/* { dg-options "-fno-inline" } */
+
+#include <assert.h>
+
+#define ALIGNMENT 64
+
+unsigned test(unsigned n, unsigned p)
+{
+  static struct { char __attribute__((__aligned__(ALIGNMENT))) c; } s;
+  unsigned x;
+
+  assert(__alignof__(s) == ALIGNMENT);
+  asm ("" : "=g" (x), "+m" (s) : "0" (&x));
+
+  return n ? test(n - 1, x) : (x ^ p);
+}
+
+int main (int argc, char *argv[] __attribute__((unused)))
+{
+  unsigned int x = test(argc, 0);
+
+  x |= test(argc + 1, 0);
+  x |= test(argc + 2, 0);
+
+  return !(x & (ALIGNMENT - 1));
+}