Patchwork DATA_ALIGNMENT vs. DATA_ABI_ALIGNMENT (PR target/56564)

login
register
mail settings
Submitter Jakub Jelinek
Date June 7, 2013, 7:25 p.m.
Message ID <20130607192540.GH1493@tucnak.redhat.com>
Download mbox | patch
Permalink /patch/249799/
State New
Headers show

Comments

Jakub Jelinek - June 7, 2013, 7:25 p.m.
Hi!

This PR is about DATA_ALIGNMENT macro increasing alignment of some decls
for optimization purposes beyond ABI mandated levels.  It is fine to emit
the vars aligned as much as we want for optimization purposes, but if we
can't be sure that references to that decl bind to the definition we
increased the alignment on (e.g. common variables, or -fpic code without
hidden visibility, weak vars etc.), we can't assume that alignment.
As DECL_ALIGN is used for both the alignment emitted for the definitions
and alignment assumed on code referring to it, this patch increases
DECL_ALIGN only on decls where decl_binds_to_current_def_p is true,
and otherwise the optimization part on top of that emits only when
aligning definition.
On x86_64, DATA_ALIGNMENT macro was partly an optimization, partly ABI
mandated alignment increase, so I've introduced a new macro,
DATA_ABI_ALIGNMENT, which is the ABI mandated increase only (on x86-64
I think the only one is that arrays with size 16 bytes or more (and VLAs,
but that is not handled by DATA*ALIGNMENT) are at least 16 byte aligned).

Bootstrapped/regtested on x86_64-linux and i686-linux.  No idea about other
targets, I've kept them all using DATA_ALIGNMENT, which is considered
optimization increase only now, if there is some ABI mandated alignment
increase on other targets, that should be done in DATA_ABI_ALIGNMENT as
well as DATA_ALIGNMENT.  The patch causes some vectorization regressions
(tweaked in the testsuite), especially for common vars where we used to
align say common arrays to 256 bits rather than the ABI mandated 128 bits,
or for -fpic code, but I'm afraid we need to live with that, if you compile
another file with say icc or some other compiler which doesn't increase
alignment beyond ABI mandated level and that other file defines the var say
as non-common, we have wrong-code.

2013-06-07  Jakub Jelinek  <jakub@redhat.com>

	PR target/56564
	* varasm.c (align_variable): Don't use DATA_ALIGNMENT or
	CONSTANT_ALIGNMENT if !decl_binds_to_current_def_p (decl).
	Use DATA_ABI_ALIGNMENT for that case instead if defined.
	(get_variable_align): New function.
	(get_variable_section, emit_bss, emit_common,
	assemble_variable_contents, place_block_symbol): Use
	get_variable_align instead of DECL_ALIGN.
	(assemble_noswitch_variable): Add align argument, use it
	instead of DECL_ALIGN.
	(assemble_variable): Adjust caller.  Use get_variable_align
	instead of DECL_ALIGN.
	* config/i386/i386.h (DATA_ALIGNMENT): Adjust x86_data_alignment
	caller.
	(DATA_ABI_ALIGNMENT): Define.
	* config/i386/i386-protos.h (x86_data_alignment): Adjust prototype.
	* config/i386/i386.c (x86_data_alignment): Add opt argument.  If
	opt is false, only return the psABI mandated alignment increase.
	* doc/tm.texi.in (DATA_ABI_ALIGNMENT): Document.
	* doc/tm.texi: Regenerated.

	* gcc.target/i386/pr56564-1.c: New test.
	* gcc.target/i386/pr56564-2.c: New test.
	* gcc.target/i386/pr56564-3.c: New test.
	* gcc.target/i386/pr56564-4.c: New test.
	* gcc.target/i386/avx256-unaligned-load-4.c: Add -fno-common.
	* gcc.target/i386/avx256-unaligned-store-1.c: Likewise.
	* gcc.target/i386/avx256-unaligned-store-3.c: Likewise.
	* gcc.target/i386/avx256-unaligned-store-4.c: Likewise.
	* gcc.target/i386/vect-sizes-1.c: Likewise.
	* gcc.target/i386/memcpy-1.c: Likewise.
	* gcc.dg/vect/costmodel/i386/costmodel-vect-31.c (tmp): Initialize.
	* gcc.dg/vect/costmodel/x86_64/costmodel-vect-31.c (tmp): Likewise.


	Jakub
Richard Henderson - June 7, 2013, 8:43 p.m.
On 06/07/2013 12:25 PM, Jakub Jelinek wrote:
> This PR is about DATA_ALIGNMENT macro increasing alignment of some decls
> for optimization purposes beyond ABI mandated levels.  It is fine to emit
> the vars aligned as much as we want for optimization purposes, but if we
> can't be sure that references to that decl bind to the definition we
> increased the alignment on (e.g. common variables, or -fpic code without
> hidden visibility, weak vars etc.), we can't assume that alignment.

When the linker merges common blocks, it chooses both maximum size and maximum
alignment.  Thus for any common block for which we can prove the block must
reside in the module (any executable, or hidden common in shared object), we
can go ahead and use the increased alignment.

It's only in shared objects with non-hidden common blocks that we have a
problem, since in that case we may resolve the common block via copy reloc to a
memory block in another module.

So while decl_binds_to_current_def_p is a good starting point, I think we can
do a little better with common blocks.  Which ought to take care of those
vectorization regressions you mention.

> @@ -966,8 +966,12 @@ align_variable (tree decl, bool dont_out
>        align = MAX_OFILE_ALIGNMENT;
>      }
>  
> -  /* On some machines, it is good to increase alignment sometimes.  */
> -  if (! DECL_USER_ALIGN (decl))
> +  /* On some machines, it is good to increase alignment sometimes.
> +     But as DECL_ALIGN is used both for actually emitting the variable
> +     and for code accessing the variable as guaranteed alignment, we
> +     can only increase the alignment if it is a performance optimization
> +     if the references to it must bind to the current definition.  */
> +  if (! DECL_USER_ALIGN (decl) && decl_binds_to_current_def_p (decl))
>      {
>  #ifdef DATA_ALIGNMENT
>        unsigned int data_align = DATA_ALIGNMENT (TREE_TYPE (decl), align);
> @@ -988,12 +992,69 @@ align_variable (tree decl, bool dont_out
>  	}
>  #endif
>      }
> +#ifdef DATA_ABI_ALIGNMENT
> +  else if (! DECL_USER_ALIGN (decl))
> +    {
> +      unsigned int data_align = DATA_ABI_ALIGNMENT (TREE_TYPE (decl), align);
> +      /* For backwards compatibility, don't assume the ABI alignment for
> +	 TLS variables.  */
> +      if (! DECL_THREAD_LOCAL_P (decl) || data_align <= BITS_PER_WORD)
> +	align = data_align;
> +    }
> +#endif

This structure would seem to do the wrong thing if DATA_ABI_ALIGNMENT is
defined, but DATA_ALIGNMENT isn't.  And while I realize you documented it, I
don't like the restriction that D_A /must/ return something larger than D_A_A.
 All that means is that in complex cases D_A will have to call D_A_A itself.

I would think that it would be better to rearrange as

  if (!D_U_A)
    {
  #ifdef D_A_A
      align = ...
  #endif
  #ifdef D_A
      if (d_b_t_c_d_p)
        align = ...
  #endif
    }

Why the special case for TLS?  If we really want that special case surely that
test should go into D_A_A itself, and not here in generic code.

> Bootstrapped/regtested on x86_64-linux and i686-linux.  No idea about other
> targets, I've kept them all using DATA_ALIGNMENT, which is considered
> optimization increase only now, if there is some ABI mandated alignment
> increase on other targets, that should be done in DATA_ABI_ALIGNMENT as
> well as DATA_ALIGNMENT.

I've had a brief look over the instances of D_A within the tree atm.  Most of
them carry the cut-n-paste comment "for the same reasons".  These I believe
never intended an ABI change, and were really only interested in optimization.

But these I think require a good hard look to see if they really intended an
ABI alignment:

c6x	comment explicitly mentions abi
cris	compiler options for alignment -- systemwide or local?
mmix	comment mentions GETA instruction
s390	comment mentions LARL instruction
rs6000	SPE and E500 portion of the alignment non-optional?

Relevant port maintainers CCed.


r~
Jakub Jelinek - June 7, 2013, 9:14 p.m.
On Fri, Jun 07, 2013 at 01:43:27PM -0700, Richard Henderson wrote:
> On 06/07/2013 12:25 PM, Jakub Jelinek wrote:
> > This PR is about DATA_ALIGNMENT macro increasing alignment of some decls
> > for optimization purposes beyond ABI mandated levels.  It is fine to emit
> > the vars aligned as much as we want for optimization purposes, but if we
> > can't be sure that references to that decl bind to the definition we
> > increased the alignment on (e.g. common variables, or -fpic code without
> > hidden visibility, weak vars etc.), we can't assume that alignment.
> 
> When the linker merges common blocks, it chooses both maximum size and maximum
> alignment.  Thus for any common block for which we can prove the block must
> reside in the module (any executable, or hidden common in shared object), we
> can go ahead and use the increased alignment.

But consider say:
one TU:
struct S { char buf[15]; } s __attribute__((aligned (32)));

another TU:
char c = 7;
struct S { char buf[15]; } s = { { 1, 2 } };
char d = 8;
int main () { return 0; }

(the aligned(32) is there just to simulate the DATA_ALIGNMENT optimization
increase).  Linker warns about this (thus the question is if we want to
increase the alignment for optimization on commons at all) and doesn't align
it.

> This structure would seem to do the wrong thing if DATA_ABI_ALIGNMENT is
> defined, but DATA_ALIGNMENT isn't.  And while I realize you documented it, I
> don't like the restriction that D_A /must/ return something larger than D_A_A.
>  All that means is that in complex cases D_A will have to call D_A_A itself.

Yeah, I guess I can rearrange it.  The reason I wrote it that way was to
avoid an extra function call, but that is probably not big enough overhead.

> I would think that it would be better to rearrange as
> 
>   if (!D_U_A)
>     {
>   #ifdef D_A_A
>       align = ...
>   #endif
>   #ifdef D_A
>       if (d_b_t_c_d_p)
>         align = ...
>   #endif
>     }
> 
> Why the special case for TLS?  If we really want that special case surely that
> test should go into D_A_A itself, and not here in generic code.

I special case TLS for backwards ABI compatibility with GCC <= 4.8.1,
because we ignored DATA_ALIGNMENT for TLS vars, even if what it returned
wasn't just an optimization, but ABI requirement.  So, if you have:
__thread char a[16] = { 1, 2 };
in one shared library and
__thread char a[16] = { 1, 2 };
use (a);
in another shared library, compile one with GCC 4.8.1 e.g. and the other one
with 4.9.0, if use (a) would assume the psABI mandated 16 byte alignment,
but the actual definition was from 4.8.1 compiled object and was only 1 byte
aligned, it could crash.  Also, neither DATA_ALIGNMENT nor DATA_ABI_ALIGNMENT sees
the decl itself, it just looks at type and previously assumed alignment.

> > Bootstrapped/regtested on x86_64-linux and i686-linux.  No idea about other
> > targets, I've kept them all using DATA_ALIGNMENT, which is considered
> > optimization increase only now, if there is some ABI mandated alignment
> > increase on other targets, that should be done in DATA_ABI_ALIGNMENT as
> > well as DATA_ALIGNMENT.
> 
> I've had a brief look over the instances of D_A within the tree atm.  Most of
> them carry the cut-n-paste comment "for the same reasons".  These I believe
> never intended an ABI change, and were really only interested in optimization.

Thanks for looking into this.

> But these I think require a good hard look to see if they really intended an
> ABI alignment:
> 
> c6x	comment explicitly mentions abi
> cris	compiler options for alignment -- systemwide or local?
> mmix	comment mentions GETA instruction
> s390	comment mentions LARL instruction
> rs6000	SPE and E500 portion of the alignment non-optional?
> 
> Relevant port maintainers CCed.

	Jakub
Hans-Peter Nilsson - June 7, 2013, 10:56 p.m.
On Fri, 7 Jun 2013, Richard Henderson wrote:
> I've had a brief look over the instances of D_A within the tree atm.  Most of
> them carry the cut-n-paste comment "for the same reasons".  These I believe
> never intended an ABI change, and were really only interested in optimization.
>
> But these I think require a good hard look to see if they really intended an
> ABI alignment:

I'm not sure what is about to change how?

> cris	compiler options for alignment -- systemwide or local?

No, DATA_ALIGNMENT in cris.h is not intended as an ABI
indication, but as an optimization when emitting data.
(This was the way to do it at the time.  Has this changed?)

The ABI is as indicated by BIGGEST_ALIGNMENT: 8 (bits; one
byte).  Nothing is guaranteed (to the data referer) to have a
bigger alignment - unless otherwise indicated by attribute-align.

(Unfortunately I can't change BIGGEST_ALIGNMENT to indicate that
atomic variables require "natural alignment", or actually not to
straddle a cache-boundary, as increasing BIGGEST_ALIGNMENT makes
GCC change the ABI.  But that's a slightly different issue.)

> mmix	comment mentions GETA instruction

Yep, data must be at least 32-bit-aligned so addresses can be
formed with a GETA insn.  BIGGEST_ALIGNMENT is 64 though and
STRICT_ALIGNMENT; natural alignment is required for proper
interpretation as the low bits are ignored.

brgds, H-P
Jakub Jelinek - June 8, 2013, 3:05 p.m.
On Fri, Jun 07, 2013 at 06:56:34PM -0400, Hans-Peter Nilsson wrote:
> > cris	compiler options for alignment -- systemwide or local?
> 
> No, DATA_ALIGNMENT in cris.h is not intended as an ABI
> indication, but as an optimization when emitting data.
> (This was the way to do it at the time.  Has this changed?)
> 
> The ABI is as indicated by BIGGEST_ALIGNMENT: 8 (bits; one
> byte).  Nothing is guaranteed (to the data referer) to have a
> bigger alignment - unless otherwise indicated by attribute-align.
> 
> (Unfortunately I can't change BIGGEST_ALIGNMENT to indicate that
> atomic variables require "natural alignment", or actually not to
> straddle a cache-boundary, as increasing BIGGEST_ALIGNMENT makes
> GCC change the ABI.  But that's a slightly different issue.)

Right now it is unfortunately part of ABI, which is something the patch
attempts to cure in a backwards compatible way (i.e., data will be still
alignment the way it used to be, but code will no longer assume it is that
much aligned, unless it is DATA_ABI_ALIGNMENT).
> 
> > mmix	comment mentions GETA instruction
> 
> Yep, data must be at least 32-bit-aligned so addresses can be
> formed with a GETA insn.  BIGGEST_ALIGNMENT is 64 though and
> STRICT_ALIGNMENT; natural alignment is required for proper
> interpretation as the low bits are ignored.

Then mmix would probably want to define DATA_ABI_ALIGNMENT instead of
DATA_ALIGNMENT after the patch (or both).

	Jakub
Bernd Schmidt - June 10, 2013, 10:51 a.m.
On 06/07/2013 10:43 PM, Richard Henderson wrote:
> But these I think require a good hard look to see if they really intended an
> ABI alignment:
> 
> c6x	comment explicitly mentions abi

The ABI specifies a minimum alignment for arrays.


Bernd
Jakub Jelinek - June 10, 2013, 10:55 a.m.
On Mon, Jun 10, 2013 at 12:51:05PM +0200, Bernd Schmidt wrote:
> On 06/07/2013 10:43 PM, Richard Henderson wrote:
> > But these I think require a good hard look to see if they really intended an
> > ABI alignment:
> > 
> > c6x	comment explicitly mentions abi
> 
> The ABI specifies a minimum alignment for arrays.

Thus after the patch c6x.h (DATA_ALIGNMENT) should be renamed to
DATA_ABI_ALIGNMENT, right?

	Jakub
Bernd Schmidt - June 10, 2013, 11:03 a.m.
On 06/10/2013 12:55 PM, Jakub Jelinek wrote:
> On Mon, Jun 10, 2013 at 12:51:05PM +0200, Bernd Schmidt wrote:
>> On 06/07/2013 10:43 PM, Richard Henderson wrote:
>>> But these I think require a good hard look to see if they really intended an
>>> ABI alignment:
>>>
>>> c6x	comment explicitly mentions abi
>>
>> The ABI specifies a minimum alignment for arrays.
> 
> Thus after the patch c6x.h (DATA_ALIGNMENT) should be renamed to
> DATA_ABI_ALIGNMENT, right?

I think so.


Bernd
Richard Henderson - June 10, 2013, 2:51 p.m.
On 06/07/2013 02:14 PM, Jakub Jelinek wrote:
>> > When the linker merges common blocks, it chooses both maximum size and maximum
>> > alignment.  Thus for any common block for which we can prove the block must
>> > reside in the module (any executable, or hidden common in shared object), we
>> > can go ahead and use the increased alignment.
> But consider say:
> one TU:
> struct S { char buf[15]; } s __attribute__((aligned (32)));
> 
> another TU:
> char c = 7;
> struct S { char buf[15]; } s = { { 1, 2 } };
> char d = 8;
> int main () { return 0; }
> 
> (the aligned(32) is there just to simulate the DATA_ALIGNMENT optimization
> increase).  Linker warns about this (thus the question is if we want to
> increase the alignment for optimization on commons at all) and doesn't align
> it.
> 

Oh, right.  I hadn't considered commons unifying with non-common variables,
and the failure of commoning in that case.  I'd mostly been thinking of
uninitialized Fortran-like common blocks, where it is more common for the
blocks to have nothing in common but the name.



r~
Edmar Wienskoski - June 12, 2013, 5:52 p.m.
The e500v2 (SPE) hardware is such that if the address of vector (double world
load / stores) are not double world aligned the instruction will trap.

So this alignment is not optional.

Edmar


On Fri, Jun 7, 2013 at 3:43 PM, Richard Henderson <rth@redhat.com> wrote:
> On 06/07/2013 12:25 PM, Jakub Jelinek wrote:
>> This PR is about DATA_ALIGNMENT macro increasing alignment of some decls
>> for optimization purposes beyond ABI mandated levels.  It is fine to emit
>> the vars aligned as much as we want for optimization purposes, but if we
>> can't be sure that references to that decl bind to the definition we
>> increased the alignment on (e.g. common variables, or -fpic code without
>> hidden visibility, weak vars etc.), we can't assume that alignment.
>
> When the linker merges common blocks, it chooses both maximum size and maximum
> alignment.  Thus for any common block for which we can prove the block must
> reside in the module (any executable, or hidden common in shared object), we
> can go ahead and use the increased alignment.
>
> It's only in shared objects with non-hidden common blocks that we have a
> problem, since in that case we may resolve the common block via copy reloc to a
> memory block in another module.
>
> So while decl_binds_to_current_def_p is a good starting point, I think we can
> do a little better with common blocks.  Which ought to take care of those
> vectorization regressions you mention.
>
>> @@ -966,8 +966,12 @@ align_variable (tree decl, bool dont_out
>>        align = MAX_OFILE_ALIGNMENT;
>>      }
>>
>> -  /* On some machines, it is good to increase alignment sometimes.  */
>> -  if (! DECL_USER_ALIGN (decl))
>> +  /* On some machines, it is good to increase alignment sometimes.
>> +     But as DECL_ALIGN is used both for actually emitting the variable
>> +     and for code accessing the variable as guaranteed alignment, we
>> +     can only increase the alignment if it is a performance optimization
>> +     if the references to it must bind to the current definition.  */
>> +  if (! DECL_USER_ALIGN (decl) && decl_binds_to_current_def_p (decl))
>>      {
>>  #ifdef DATA_ALIGNMENT
>>        unsigned int data_align = DATA_ALIGNMENT (TREE_TYPE (decl), align);
>> @@ -988,12 +992,69 @@ align_variable (tree decl, bool dont_out
>>       }
>>  #endif
>>      }
>> +#ifdef DATA_ABI_ALIGNMENT
>> +  else if (! DECL_USER_ALIGN (decl))
>> +    {
>> +      unsigned int data_align = DATA_ABI_ALIGNMENT (TREE_TYPE (decl), align);
>> +      /* For backwards compatibility, don't assume the ABI alignment for
>> +      TLS variables.  */
>> +      if (! DECL_THREAD_LOCAL_P (decl) || data_align <= BITS_PER_WORD)
>> +     align = data_align;
>> +    }
>> +#endif
>
> This structure would seem to do the wrong thing if DATA_ABI_ALIGNMENT is
> defined, but DATA_ALIGNMENT isn't.  And while I realize you documented it, I
> don't like the restriction that D_A /must/ return something larger than D_A_A.
>  All that means is that in complex cases D_A will have to call D_A_A itself.
>
> I would think that it would be better to rearrange as
>
>   if (!D_U_A)
>     {
>   #ifdef D_A_A
>       align = ...
>   #endif
>   #ifdef D_A
>       if (d_b_t_c_d_p)
>         align = ...
>   #endif
>     }
>
> Why the special case for TLS?  If we really want that special case surely that
> test should go into D_A_A itself, and not here in generic code.
>
>> Bootstrapped/regtested on x86_64-linux and i686-linux.  No idea about other
>> targets, I've kept them all using DATA_ALIGNMENT, which is considered
>> optimization increase only now, if there is some ABI mandated alignment
>> increase on other targets, that should be done in DATA_ABI_ALIGNMENT as
>> well as DATA_ALIGNMENT.
>
> I've had a brief look over the instances of D_A within the tree atm.  Most of
> them carry the cut-n-paste comment "for the same reasons".  These I believe
> never intended an ABI change, and were really only interested in optimization.
>
> But these I think require a good hard look to see if they really intended an
> ABI alignment:
>
> c6x     comment explicitly mentions abi
> cris    compiler options for alignment -- systemwide or local?
> mmix    comment mentions GETA instruction
> s390    comment mentions LARL instruction
> rs6000  SPE and E500 portion of the alignment non-optional?
>
> Relevant port maintainers CCed.
>
>
> r~
Igor Zamyatin - June 19, 2013, 7:01 a.m.
The change also affects vectorizer in avx case which could be seen for
gcc.dg/tree-ssa/loop-19.c test.

After the change report says

loop-19_bad.c:16: note: === vect_analyze_data_refs_alignment ===
loop-19_bad.c:16: note: vect_compute_data_ref_alignment:
loop-19_bad.c:16: note: can't force alignment of ref: a[j_9]
loop-19_bad.c:16: note: vect_compute_data_ref_alignment:
loop-19_bad.c:16: note: can't force alignment of ref: c[j_9]

AFAICS first condition in ix86_data_alignment was true before the
change so 256 was a return value.

Do we need to tweak this test also?

Thanks,
Igor

> Hi!
>
> This PR is about DATA_ALIGNMENT macro increasing alignment of some decls for optimization purposes beyond ABI mandated levels.  It is fine to emit the vars aligned as much as we want for optimization purposes, but if we can't be sure that references to that decl bind to the definition we increased the alignment on (e.g. common variables, or -fpic code without hidden visibility, weak vars etc.), we can't assume that alignment.
> As DECL_ALIGN is used for both the alignment emitted for the definitions and alignment assumed on code referring to it, this patch increases DECL_ALIGN only on decls where decl_binds_to_current_def_p is true, and otherwise the optimization part on top of that emits only when aligning definition.
> On x86_64, DATA_ALIGNMENT macro was partly an optimization, partly ABI mandated alignment increase, so I've introduced a new macro, DATA_ABI_ALIGNMENT, which is the ABI mandated increase only (on x86-64 I think the only one is that arrays with size 16 bytes or more (and VLAs, but that is not handled by DATA*ALIGNMENT) are at least 16 byte aligned).
>
> Bootstrapped/regtested on x86_64-linux and i686-linux.  No idea about other targets, I've kept them all using DATA_ALIGNMENT, which is considered optimization increase only now, if there is some ABI mandated alignment increase on other targets, that should be done in DATA_ABI_ALIGNMENT as well as DATA_ALIGNMENT.  The patch causes some vectorization regressions (tweaked in the testsuite), especially for common vars where we used to align say common arrays to 256 bits rather than the ABI mandated 128 bits, or for -fpic code, but I'm afraid we need to live with that, if you compile another file with say icc or some other compiler which doesn't increase alignment beyond ABI mandated level and that other file defines the var say as non-common, we have wrong-code.
>
> 2013-06-07  Jakub Jelinek  <jakub@redhat.com>
>
>         PR target/56564
>         * varasm.c (align_variable): Don't use DATA_ALIGNMENT or
>         CONSTANT_ALIGNMENT if !decl_binds_to_current_def_p (decl).
>         Use DATA_ABI_ALIGNMENT for that case instead if defined.
>         (get_variable_align): New function.
>         (get_variable_section, emit_bss, emit_common,
>         assemble_variable_contents, place_block_symbol): Use
>         get_variable_align instead of DECL_ALIGN.
>         (assemble_noswitch_variable): Add align argument, use it
>         instead of DECL_ALIGN.
>         (assemble_variable): Adjust caller.  Use get_variable_align
>         instead of DECL_ALIGN.
>         * config/i386/i386.h (DATA_ALIGNMENT): Adjust x86_data_alignment
>         caller.
>         (DATA_ABI_ALIGNMENT): Define.
>         * config/i386/i386-protos.h (x86_data_alignment): Adjust prototype.
>         * config/i386/i386.c (x86_data_alignment): Add opt argument.  If
>         opt is false, only return the psABI mandated alignment increase.
>         * doc/tm.texi.in (DATA_ABI_ALIGNMENT): Document.
>         * doc/tm.texi: Regenerated.
>
>         * gcc.target/i386/pr56564-1.c: New test.
>         * gcc.target/i386/pr56564-2.c: New test.
>         * gcc.target/i386/pr56564-3.c: New test.
>         * gcc.target/i386/pr56564-4.c: New test.
>         * gcc.target/i386/avx256-unaligned-load-4.c: Add -fno-common.
>         * gcc.target/i386/avx256-unaligned-store-1.c: Likewise.
>         * gcc.target/i386/avx256-unaligned-store-3.c: Likewise.
>         * gcc.target/i386/avx256-unaligned-store-4.c: Likewise.
>         * gcc.target/i386/vect-sizes-1.c: Likewise.
>         * gcc.target/i386/memcpy-1.c: Likewise.
>         * gcc.dg/vect/costmodel/i386/costmodel-vect-31.c (tmp): Initialize.
>         * gcc.dg/vect/costmodel/x86_64/costmodel-vect-31.c (tmp): Likewise.
>
Jakub Jelinek - June 19, 2013, 7:05 a.m.
On Wed, Jun 19, 2013 at 11:01:59AM +0400, Igor Zamyatin wrote:
> The change also affects vectorizer in avx case which could be seen for
> gcc.dg/tree-ssa/loop-19.c test.
> 
> After the change report says
> 
> loop-19_bad.c:16: note: === vect_analyze_data_refs_alignment ===
> loop-19_bad.c:16: note: vect_compute_data_ref_alignment:
> loop-19_bad.c:16: note: can't force alignment of ref: a[j_9]
> loop-19_bad.c:16: note: vect_compute_data_ref_alignment:
> loop-19_bad.c:16: note: can't force alignment of ref: c[j_9]
> 
> AFAICS first condition in ix86_data_alignment was true before the
> change so 256 was a return value.
> 
> Do we need to tweak this test also?

I'd add -fno-common to the test.

	Jakub

Patch

--- gcc/varasm.c.jj	2013-06-07 13:17:17.000000000 +0200
+++ gcc/varasm.c	2013-06-07 15:38:36.710908852 +0200
@@ -966,8 +966,12 @@  align_variable (tree decl, bool dont_out
       align = MAX_OFILE_ALIGNMENT;
     }
 
-  /* On some machines, it is good to increase alignment sometimes.  */
-  if (! DECL_USER_ALIGN (decl))
+  /* On some machines, it is good to increase alignment sometimes.
+     But as DECL_ALIGN is used both for actually emitting the variable
+     and for code accessing the variable as guaranteed alignment, we
+     can only increase the alignment if it is a performance optimization
+     if the references to it must bind to the current definition.  */
+  if (! DECL_USER_ALIGN (decl) && decl_binds_to_current_def_p (decl))
     {
 #ifdef DATA_ALIGNMENT
       unsigned int data_align = DATA_ALIGNMENT (TREE_TYPE (decl), align);
@@ -988,12 +992,69 @@  align_variable (tree decl, bool dont_out
 	}
 #endif
     }
+#ifdef DATA_ABI_ALIGNMENT
+  else if (! DECL_USER_ALIGN (decl))
+    {
+      unsigned int data_align = DATA_ABI_ALIGNMENT (TREE_TYPE (decl), align);
+      /* For backwards compatibility, don't assume the ABI alignment for
+	 TLS variables.  */
+      if (! DECL_THREAD_LOCAL_P (decl) || data_align <= BITS_PER_WORD)
+	align = data_align;
+    }
+#endif
 
   /* Reset the alignment in case we have made it tighter, so we can benefit
      from it in get_pointer_alignment.  */
   DECL_ALIGN (decl) = align;
 }
 
+/* Return DECL_ALIGN (decl), possibly increased for optimization purposes
+   beyond what align_variable returned.  */
+
+static unsigned int
+get_variable_align (tree decl)
+{
+  unsigned int align = DECL_ALIGN (decl);
+
+  /* For user aligned vars or static vars align_variable already did
+     everything.  */
+  if (DECL_USER_ALIGN (decl) || !TREE_PUBLIC (decl))
+    return align;
+
+  /* For decls that bind to the current definition, align_variable
+     did also everything, except for not assuming ABI required alignment
+     of TLS variables.  For other vars, increase the alignment here
+     as an optimization.  */
+  if (!decl_binds_to_current_def_p (decl))
+    {
+      /* On some machines, it is good to increase alignment sometimes.  */
+#ifdef DATA_ALIGNMENT
+      unsigned int data_align = DATA_ALIGNMENT (TREE_TYPE (decl), align);
+      /* Don't increase alignment too much for TLS variables - TLS space
+	 is too precious.  */
+      if (! DECL_THREAD_LOCAL_P (decl) || data_align <= BITS_PER_WORD)
+	align = data_align;
+#endif
+#ifdef CONSTANT_ALIGNMENT
+      if (DECL_INITIAL (decl) != 0 && DECL_INITIAL (decl) != error_mark_node)
+	{
+	  unsigned int const_align = CONSTANT_ALIGNMENT (DECL_INITIAL (decl),
+							 align);
+	  /* Don't increase alignment too much for TLS variables - TLS space
+	     is too precious.  */
+	  if (! DECL_THREAD_LOCAL_P (decl) || const_align <= BITS_PER_WORD)
+	    align = const_align;
+	}
+    }
+#endif
+
+#ifdef DATA_ABI_ALIGNMENT
+  if (DECL_THREAD_LOCAL_P (decl))
+    return DATA_ABI_ALIGNMENT (TREE_TYPE (decl), align);
+#endif
+  return align;
+}
+
 /* Return the section into which the given VAR_DECL or CONST_DECL
    should be placed.  PREFER_NOSWITCH_P is true if a noswitch
    section should be used wherever possible.  */
@@ -1043,7 +1104,8 @@  get_variable_section (tree decl, bool pr
 	return bss_noswitch_section;
     }
 
-  return targetm.asm_out.select_section (decl, reloc, DECL_ALIGN (decl));
+  return targetm.asm_out.select_section (decl, reloc,
+					 get_variable_align (decl));
 }
 
 /* Return the block into which object_block DECL should be placed.  */
@@ -1780,7 +1842,8 @@  emit_bss (tree decl ATTRIBUTE_UNUSED,
 	  unsigned HOST_WIDE_INT rounded ATTRIBUTE_UNUSED)
 {
 #if defined ASM_OUTPUT_ALIGNED_BSS
-  ASM_OUTPUT_ALIGNED_BSS (asm_out_file, decl, name, size, DECL_ALIGN (decl));
+  ASM_OUTPUT_ALIGNED_BSS (asm_out_file, decl, name, size,
+			  get_variable_align (decl));
   return true;
 #endif
 }
@@ -1796,10 +1859,11 @@  emit_common (tree decl ATTRIBUTE_UNUSED,
 {
 #if defined ASM_OUTPUT_ALIGNED_DECL_COMMON
   ASM_OUTPUT_ALIGNED_DECL_COMMON (asm_out_file, decl, name,
-				  size, DECL_ALIGN (decl));
+				  size, get_variable_align (decl));
   return true;
 #elif defined ASM_OUTPUT_ALIGNED_COMMON
-  ASM_OUTPUT_ALIGNED_COMMON (asm_out_file, name, size, DECL_ALIGN (decl));
+  ASM_OUTPUT_ALIGNED_COMMON (asm_out_file, name, size,
+			     get_variable_align (decl));
   return true;
 #else
   ASM_OUTPUT_COMMON (asm_out_file, name, size, rounded);
@@ -1828,7 +1892,8 @@  emit_tls_common (tree decl ATTRIBUTE_UNU
    NAME is the name of DECL's SYMBOL_REF.  */
 
 static void
-assemble_noswitch_variable (tree decl, const char *name, section *sect)
+assemble_noswitch_variable (tree decl, const char *name, section *sect,
+			    unsigned int align)
 {
   unsigned HOST_WIDE_INT size, rounded;
 
@@ -1850,7 +1915,7 @@  assemble_noswitch_variable (tree decl, c
 	     * (BIGGEST_ALIGNMENT / BITS_PER_UNIT));
 
   if (!sect->noswitch.callback (decl, name, size, rounded)
-      && (unsigned HOST_WIDE_INT) DECL_ALIGN_UNIT (decl) > rounded)
+      && (unsigned HOST_WIDE_INT) (align / BITS_PER_UNIT) > rounded)
     warning (0, "requested alignment for %q+D is greater than "
 	     "implemented alignment of %wu", decl, rounded);
 }
@@ -1880,7 +1945,7 @@  assemble_variable_contents (tree decl, c
 	/* Output the actual data.  */
 	output_constant (DECL_INITIAL (decl),
 			 tree_low_cst (DECL_SIZE_UNIT (decl), 1),
-			 DECL_ALIGN (decl));
+			 get_variable_align (decl));
       else
 	/* Leave space for it.  */
 	assemble_zeros (tree_low_cst (DECL_SIZE_UNIT (decl), 1));
@@ -1904,6 +1969,7 @@  assemble_variable (tree decl, int top_le
   const char *name;
   rtx decl_rtl, symbol;
   section *sect;
+  unsigned int align;
   bool asan_protected = false;
 
   /* This function is supposed to handle VARIABLES.  Ensure we have one.  */
@@ -2003,6 +2069,8 @@  assemble_variable (tree decl, int top_le
 
   set_mem_align (decl_rtl, DECL_ALIGN (decl));
 
+  align = get_variable_align (decl);
+
   if (TREE_PUBLIC (decl))
     maybe_assemble_visibility (decl);
 
@@ -2032,12 +2100,12 @@  assemble_variable (tree decl, int top_le
       place_block_symbol (symbol);
     }
   else if (SECTION_STYLE (sect) == SECTION_NOSWITCH)
-    assemble_noswitch_variable (decl, name, sect);
+    assemble_noswitch_variable (decl, name, sect, align);
   else
     {
       switch_to_section (sect);
-      if (DECL_ALIGN (decl) > BITS_PER_UNIT)
-	ASM_OUTPUT_ALIGN (asm_out_file, floor_log2 (DECL_ALIGN_UNIT (decl)));
+      if (align > BITS_PER_UNIT)
+	ASM_OUTPUT_ALIGN (asm_out_file, floor_log2 (align / BITS_PER_UNIT));
       assemble_variable_contents (decl, name, dont_output_data);
       if (asan_protected)
 	{
@@ -6967,7 +7035,7 @@  place_block_symbol (rtx symbol)
   else
     {
       decl = SYMBOL_REF_DECL (symbol);
-      alignment = DECL_ALIGN (decl);
+      alignment = get_variable_align (decl);
       size = tree_low_cst (DECL_SIZE_UNIT (decl), 1);
       if (flag_asan && asan_protect_global (decl))
 	{
--- gcc/config/i386/i386.h.jj	2013-06-03 19:15:34.000000000 +0200
+++ gcc/config/i386/i386.h	2013-06-07 14:48:36.430589051 +0200
@@ -859,7 +859,19 @@  enum target_cpu_default
    cause character arrays to be word-aligned so that `strcpy' calls
    that copy constants to character arrays can be done inline.  */
 
-#define DATA_ALIGNMENT(TYPE, ALIGN) ix86_data_alignment ((TYPE), (ALIGN))
+#define DATA_ALIGNMENT(TYPE, ALIGN) \
+  ix86_data_alignment ((TYPE), (ALIGN), true)
+
+/* Similar to DATA_ALIGNMENT, but for the cases where the ABI mandates
+   some alignment increase, instead of optimization only purposes.  E.g.
+   AMD x86-64 psABI says that variables with array type larger than 15 bytes
+   must be aligned to 16 byte boundaries.  DATA_ALIGNMENT should always
+   return the same or larger value than DATA_ABI_ALIGNMENT.
+
+   If this macro is not defined, then ALIGN is used.  */
+
+#define DATA_ABI_ALIGNMENT(TYPE, ALIGN) \
+  ix86_data_alignment ((TYPE), (ALIGN), false)
 
 /* If defined, a C expression to compute the alignment for a local
    variable.  TYPE is the data type, and ALIGN is the alignment that
--- gcc/config/i386/i386-protos.h.jj	2013-05-14 21:30:19.000000000 +0200
+++ gcc/config/i386/i386-protos.h	2013-06-07 13:31:21.937823575 +0200
@@ -207,7 +207,7 @@  extern void init_cumulative_args (CUMULA
 #endif	/* RTX_CODE  */
 
 #ifdef TREE_CODE
-extern int ix86_data_alignment (tree, int);
+extern int ix86_data_alignment (tree, int, bool);
 extern unsigned int ix86_local_alignment (tree, enum machine_mode,
 					  unsigned int);
 extern unsigned int ix86_minimum_alignment (tree, enum machine_mode,
--- gcc/config/i386/i386.c.jj	2013-06-07 13:17:17.000000000 +0200
+++ gcc/config/i386/i386.c	2013-06-07 13:37:24.845416361 +0200
@@ -25375,11 +25375,12 @@  ix86_constant_alignment (tree exp, int a
    instead of that alignment to align the object.  */
 
 int
-ix86_data_alignment (tree type, int align)
+ix86_data_alignment (tree type, int align, bool opt)
 {
   int max_align = optimize_size ? BITS_PER_WORD : MIN (256, MAX_OFILE_ALIGNMENT);
 
-  if (AGGREGATE_TYPE_P (type)
+  if (opt
+      && AGGREGATE_TYPE_P (type)
       && TYPE_SIZE (type)
       && TREE_CODE (TYPE_SIZE (type)) == INTEGER_CST
       && (TREE_INT_CST_LOW (TYPE_SIZE (type)) >= (unsigned) max_align
@@ -25391,14 +25392,17 @@  ix86_data_alignment (tree type, int alig
      to 16byte boundary.  */
   if (TARGET_64BIT)
     {
-      if (AGGREGATE_TYPE_P (type)
-	   && TYPE_SIZE (type)
-	   && TREE_CODE (TYPE_SIZE (type)) == INTEGER_CST
-	   && (TREE_INT_CST_LOW (TYPE_SIZE (type)) >= 128
-	       || TREE_INT_CST_HIGH (TYPE_SIZE (type))) && align < 128)
+      if ((opt ? AGGREGATE_TYPE_P (type) : TREE_CODE (type) == ARRAY_TYPE)
+	  && TYPE_SIZE (type)
+	  && TREE_CODE (TYPE_SIZE (type)) == INTEGER_CST
+	  && (TREE_INT_CST_LOW (TYPE_SIZE (type)) >= 128
+	      || TREE_INT_CST_HIGH (TYPE_SIZE (type))) && align < 128)
 	return 128;
     }
 
+  if (!opt)
+    return align;
+
   if (TREE_CODE (type) == ARRAY_TYPE)
     {
       if (TYPE_MODE (TREE_TYPE (type)) == DFmode && align < 64)
--- gcc/doc/tm.texi.in.jj	2013-06-01 14:47:17.000000000 +0200
+++ gcc/doc/tm.texi.in	2013-06-07 14:47:10.192003375 +0200
@@ -1062,6 +1062,16 @@  arrays to be word-aligned so that @code{
 constants to character arrays can be done inline.
 @end defmac
 
+@defmac DATA_ABI_ALIGNMENT (@var{type}, @var{basic-align})
+Similar to @code{DATA_ALIGNMENT}, but for the cases where the ABI mandates
+some alignment increase, instead of optimization only purposes.  E.g.@
+AMD x86-64 psABI says that variables with array type larger than 15 bytes
+must be aligned to 16 byte boundaries.  @code{DATA_ALIGNMENT} should always
+return the same or larger value than @code{DATA_ABI_ALIGNMENT}.
+
+If this macro is not defined, then @var{basic-align} is used.
+@end defmac
+
 @defmac CONSTANT_ALIGNMENT (@var{constant}, @var{basic-align})
 If defined, a C expression to compute the alignment given to a constant
 that is being placed in memory.  @var{constant} is the constant and
--- gcc/doc/tm.texi.jj	2013-06-01 14:47:17.241680273 +0200
+++ gcc/doc/tm.texi	2013-06-07 14:47:29.400694547 +0200
@@ -1078,6 +1078,16 @@  arrays to be word-aligned so that @code{
 constants to character arrays can be done inline.
 @end defmac
 
+@defmac DATA_ABI_ALIGNMENT (@var{type}, @var{basic-align})
+Similar to @code{DATA_ALIGNMENT}, but for the cases where the ABI mandates
+some alignment increase, instead of optimization only purposes.  E.g.@
+AMD x86-64 psABI says that variables with array type larger than 15 bytes
+must be aligned to 16 byte boundaries.  @code{DATA_ALIGNMENT} should always
+return the same or larger value than @code{DATA_ABI_ALIGNMENT}.
+
+If this macro is not defined, then @var{basic-align} is used.
+@end defmac
+
 @defmac CONSTANT_ALIGNMENT (@var{constant}, @var{basic-align})
 If defined, a C expression to compute the alignment given to a constant
 that is being placed in memory.  @var{constant} is the constant and
--- gcc/testsuite/gcc.target/i386/pr56564-1.c.jj	2013-06-07 15:17:15.879403383 +0200
+++ gcc/testsuite/gcc.target/i386/pr56564-1.c	2013-06-07 15:44:18.386232149 +0200
@@ -0,0 +1,25 @@ 
+/* PR target/56564 */
+/* { dg-do compile { target { fpic && lp64 } } } */
+/* { dg-options "-O3 -fpic -fdump-tree-optimized" } */
+
+struct S { long a, b; } s = { 5, 6 };
+char t[16] = { 7 };
+
+int
+foo (void)
+{
+  return ((__UINTPTR_TYPE__) &s) & 15;
+}
+
+int
+bar (void)
+{
+  return ((__UINTPTR_TYPE__) &t[0]) & 15;
+}
+
+/* { dg-final { scan-tree-dump-times "&s" 1 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "&t" 0 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "return 0" 1 "optimized" } } */
+/* { dg-final { scan-assembler ".align\[ \t]*16\[^:]*\[\n\r]s:" { target { *-*-linux* } } } } */
+/* { dg-final { scan-assembler ".align\[ \t]*16\[^:]*\[\n\r]t:" { target { *-*-linux* } } } } */
+/* { dg-final { cleanup-tree-dump "optimized" } } */
--- gcc/testsuite/gcc.target/i386/pr56564-2.c.jj	2013-06-07 15:19:28.900986237 +0200
+++ gcc/testsuite/gcc.target/i386/pr56564-2.c	2013-06-07 15:44:25.120129885 +0200
@@ -0,0 +1,25 @@ 
+/* PR target/56564 */
+/* { dg-do compile { target { *-*-linux* && lp64 } } } */
+/* { dg-options "-O3 -fno-pic -fdump-tree-optimized" } */
+
+struct S { long a, b; } s = { 5, 6 };
+char t[16] = { 7 };
+
+int
+foo (void)
+{
+  return ((__UINTPTR_TYPE__) &s) & 15;
+}
+
+int
+bar (void)
+{
+  return ((__UINTPTR_TYPE__) &t[0]) & 15;
+}
+
+/* { dg-final { scan-tree-dump-times "&s" 0 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "&t" 0 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "return 0" 2 "optimized" } } */
+/* { dg-final { scan-assembler ".align\[ \t]*16\[^:]*\[\n\r]s:" { target { *-*-linux* } } } } */
+/* { dg-final { scan-assembler ".align\[ \t]*16\[^:]*\[\n\r]t:" { target { *-*-linux* } } } } */
+/* { dg-final { cleanup-tree-dump "optimized" } } */
--- gcc/testsuite/gcc.target/i386/pr56564-3.c.jj	2013-06-07 15:22:26.470079983 +0200
+++ gcc/testsuite/gcc.target/i386/pr56564-3.c	2013-06-07 15:44:34.872968234 +0200
@@ -0,0 +1,28 @@ 
+/* PR target/56564 */
+/* { dg-do compile { target { fpic && lp64 } } } */
+/* { dg-options "-O3 -fpic -fdump-tree-optimized" } */
+
+__thread struct S { long a, b; } s = { 5, 6 };
+__thread char t[16] = { 7 };
+
+int
+foo (void)
+{
+  return ((__UINTPTR_TYPE__) &s) & 15;
+}
+
+/* For backwards compatibility we don't assume that t must
+   be aligned to 16 bytes, but align it anyway.  */
+
+int
+bar (void)
+{
+  return ((__UINTPTR_TYPE__) &t[0]) & 15;
+}
+
+/* { dg-final { scan-tree-dump-times "&s" 1 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "&t" 1 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "return 0" 0 "optimized" } } */
+/* { dg-final { scan-assembler-not ".align\[ \t]*16\[^:]*\[\n\r]s:" { target { *-*-linux* } } } } */
+/* { dg-final { scan-assembler ".align\[ \t]*16\[^:]*\[\n\r]t:" { target { *-*-linux* } } } } */
+/* { dg-final { cleanup-tree-dump "optimized" } } */
--- gcc/testsuite/gcc.target/i386/pr56564-4.c.jj	2013-06-07 15:25:12.638326084 +0200
+++ gcc/testsuite/gcc.target/i386/pr56564-4.c	2013-06-07 15:45:59.489567940 +0200
@@ -0,0 +1,22 @@ 
+/* PR target/56564 */
+/* { dg-do compile { target { *-*-linux* && lp64 } } } */
+/* { dg-options "-O3 -fno-pic -fdump-tree-optimized" } */
+
+__thread struct S { long a, b; } s = { 5, 6 };
+__thread char t[16] = { 7 };
+
+int
+foo (void)
+{
+  return ((__UINTPTR_TYPE__) &s) & 15;
+}
+
+int
+bar (void)
+{
+  return ((__UINTPTR_TYPE__) &t[0]) & 15;
+}
+
+/* { dg-final { scan-assembler-not ".align\[ \t]*16\[^:]*\[\n\r]s:" { target { *-*-linux* } } } } */
+/* { dg-final { scan-assembler ".align\[ \t]*16\[^:]*\[\n\r]t:" { target { *-*-linux* } } } } */
+/* { dg-final { cleanup-tree-dump "optimized" } } */
--- gcc/testsuite/gcc.target/i386/avx256-unaligned-load-4.c.jj	2012-10-16 13:15:44.000000000 +0200
+++ gcc/testsuite/gcc.target/i386/avx256-unaligned-load-4.c	2013-06-07 21:07:22.341380267 +0200
@@ -1,5 +1,5 @@ 
 /* { dg-do compile } */
-/* { dg-options "-O3 -dp -mavx -mno-avx256-split-unaligned-load -mno-avx256-split-unaligned-store" } */
+/* { dg-options "-O3 -dp -mavx -mno-avx256-split-unaligned-load -mno-avx256-split-unaligned-store -fno-common" } */
 
 #define N 1024
 
--- gcc/testsuite/gcc.target/i386/avx256-unaligned-store-1.c.jj	2012-10-16 13:15:44.000000000 +0200
+++ gcc/testsuite/gcc.target/i386/avx256-unaligned-store-1.c	2013-06-07 21:06:06.911946765 +0200
@@ -1,5 +1,5 @@ 
 /* { dg-do compile } */
-/* { dg-options "-O3 -dp -mavx -mavx256-split-unaligned-store" } */
+/* { dg-options "-O3 -dp -mavx -mavx256-split-unaligned-store -fno-common" } */
 
 #define N 1024
 
--- gcc/testsuite/gcc.target/i386/avx256-unaligned-store-3.c.jj	2012-10-16 13:15:44.000000000 +0200
+++ gcc/testsuite/gcc.target/i386/avx256-unaligned-store-3.c	2013-06-07 21:06:41.421248309 +0200
@@ -1,5 +1,5 @@ 
 /* { dg-do compile } */
-/* { dg-options "-O3 -dp -mavx -mavx256-split-unaligned-store -mtune=generic" } */
+/* { dg-options "-O3 -dp -mavx -mavx256-split-unaligned-store -mtune=generic -fno-common" } */
 
 #define N 1024
 
--- gcc/testsuite/gcc.target/i386/avx256-unaligned-store-4.c.jj	2012-10-16 13:15:44.000000000 +0200
+++ gcc/testsuite/gcc.target/i386/avx256-unaligned-store-4.c	2013-06-07 21:06:53.516302188 +0200
@@ -1,5 +1,5 @@ 
 /* { dg-do compile } */
-/* { dg-options "-O3 -dp -mavx -mno-avx256-split-unaligned-load -mno-avx256-split-unaligned-store" } */
+/* { dg-options "-O3 -dp -mavx -mno-avx256-split-unaligned-load -mno-avx256-split-unaligned-store -fno-common" } */
 
 #define N 1024
 
--- gcc/testsuite/gcc.target/i386/vect-sizes-1.c.jj	2010-11-01 09:06:30.000000000 +0100
+++ gcc/testsuite/gcc.target/i386/vect-sizes-1.c	2013-06-07 21:08:07.851581595 +0200
@@ -1,5 +1,5 @@ 
 /* { dg-do compile } */
-/* { dg-options "-O3 -ffast-math -mavx -mtune=generic" } */
+/* { dg-options "-O3 -ffast-math -mavx -mtune=generic -fno-common" } */
 
 double a[1024];
 
--- gcc/testsuite/gcc.target/i386/memcpy-1.c.jj	2011-07-11 10:39:29.000000000 +0200
+++ gcc/testsuite/gcc.target/i386/memcpy-1.c	2013-06-07 21:08:46.263945653 +0200
@@ -1,6 +1,6 @@ 
 /* { dg-do compile } */
 /* { dg-require-effective-target ia32 } */
-/* { dg-options "-O2 -march=pentiumpro -minline-all-stringops" } */
+/* { dg-options "-O2 -march=pentiumpro -minline-all-stringops -fno-common" } */
 /* { dg-final { scan-assembler "rep" } } */
 /* { dg-final { scan-assembler "movs" } } */
 /* { dg-final { scan-assembler-not "test" } } */
--- gcc/testsuite/gcc.dg/vect/costmodel/i386/costmodel-vect-31.c.jj	2009-11-04 08:15:26.000000000 +0100
+++ gcc/testsuite/gcc.dg/vect/costmodel/i386/costmodel-vect-31.c	2013-06-07 20:58:02.091268267 +0200
@@ -18,7 +18,7 @@  struct s{
   struct t e;   /* unaligned (offset 2N+4N+4 B) */
 };
  
-struct s tmp;
+struct s tmp = { 1 };
 
 int main1 ()
 {  
--- gcc/testsuite/gcc.dg/vect/costmodel/x86_64/costmodel-vect-31.c.jj	2009-11-04 08:15:26.000000000 +0100
+++ gcc/testsuite/gcc.dg/vect/costmodel/x86_64/costmodel-vect-31.c	2013-06-07 20:58:38.201668248 +0200
@@ -18,7 +18,7 @@  struct s{
   struct t e;   /* unaligned (offset 2N+4N+4 B) */
 };
  
-struct s tmp;
+struct s tmp = { 1 };
 
 int main1 ()
 {