Patchwork DATA_ALIGNMENT vs. DATA_ABI_ALIGNMENT (PR target/56564)

login
register
mail settings
Submitter Alan Modra
Date June 13, 2013, 3:37 p.m.
Message ID <20130613153701.GI21523@bubble.grove.modra.org>
Download mbox | patch
Permalink /patch/251095/
State New
Headers show

Comments

Alan Modra - June 13, 2013, 3:37 p.m.
On Thu, Jun 13, 2013 at 05:10:51PM +0930, Alan Modra wrote:
> On Wed, Jun 12, 2013 at 12:52:03PM -0500, Edmar Wienskoski wrote:
> > The e500v2 (SPE) hardware is such that if the address of vector (double world
> > load / stores) are not double world aligned the instruction will trap.
> > 
> > So this alignment is not optional.
> 
> Vector type alignment is also specified by the ppc64 abi.  I think we
> want the following.  Note that DATA_ALIGNMENT has been broken for
> vectors right from the initial vector support (and the error was
> copied for e500 double).  For example
> 
> typedef int vec_align __attribute__ ((vector_size(16), aligned(32)));
> vec_align x = { 0, 0, 0, 0 };
> 
> currently loses the extra alignment.  Fixed by never decreasing
> alignment in DATA_ABI_ALIGNMENT.  Testing in progress.  OK to
> apply assuming bootstrap is good?  (I think I need a change in
> offsettable_ok_by_alignment too.  I'll do that in a separate patch.)

Revised patch with offsettable_ok_by_alignment change, avoiding dumb
idea of using statement expressions.  This one actually bootstraps and
passes regression testing.

	* config/rs6000/rs6000.h (enum data_align): New.
	(LOCAL_ALIGNMENT, DATA_ALIGNMENT): Use rs6000_data_alignment.
	(DATA_ABI_ALIGNMENT): Define.
	(CONSTANT_ALIGNMENT): Correct comment.
	* config/rs6000/rs6000-protos.h (rs6000_data_alignment): Declare.
	* config/rs6000/rs6000.c (rs6000_data_alignment): New function.
	(offsettable_ok_by_alignment): Align by DATA_ABI_ALIGNMENT.
	Pass "type" not "decl" to DATA_ALIGNMENT.
Jakub Jelinek - June 13, 2013, 3:42 p.m.
On Fri, Jun 14, 2013 at 01:07:01AM +0930, Alan Modra wrote:
> @@ -5774,10 +5818,11 @@ offsettable_ok_by_alignment (rtx op, HOST_WIDE_INT
>        type = TREE_TYPE (decl);
>  
>        dalign = TYPE_ALIGN (type);
> +      dalign = DATA_ABI_ALIGNMENT (type, dalign);
>        if (CONSTANT_CLASS_P (decl))
>  	dalign = CONSTANT_ALIGNMENT (decl, dalign);
>        else
> -	dalign = DATA_ALIGNMENT (decl, dalign);
> +	dalign = DATA_ALIGNMENT (type, dalign);
>  
>        if (dsize == 0)
>  	{

What is this code trying to do?  Shouldn't it just use DECL_ALIGN
which should be set to the right value from get_variable_alignment?
I mean, if !decl_binds_to_current_def_p (decl), then using DATA_ALIGNMENT
or CONSTANT_ALIGNMENT (for anything but actually emitting the var into
object, or just as an optimization hint that very likely the decl will be
aligned enough, but not guaranteed), which are optimization, is wrong
(an ABI problem).

	Jakub
Alan Modra - June 13, 2013, 10:48 p.m.
On Thu, Jun 13, 2013 at 05:42:17PM +0200, Jakub Jelinek wrote:
> On Fri, Jun 14, 2013 at 01:07:01AM +0930, Alan Modra wrote:
> > @@ -5774,10 +5818,11 @@ offsettable_ok_by_alignment (rtx op, HOST_WIDE_INT
> >        type = TREE_TYPE (decl);
> >  
> >        dalign = TYPE_ALIGN (type);
> > +      dalign = DATA_ABI_ALIGNMENT (type, dalign);
> >        if (CONSTANT_CLASS_P (decl))
> >  	dalign = CONSTANT_ALIGNMENT (decl, dalign);
> >        else
> > -	dalign = DATA_ALIGNMENT (decl, dalign);
> > +	dalign = DATA_ALIGNMENT (type, dalign);
> >  
> >        if (dsize == 0)
> >  	{
> 
> What is this code trying to do?  Shouldn't it just use DECL_ALIGN
> which should be set to the right value from get_variable_alignment?
> I mean, if !decl_binds_to_current_def_p (decl), then using DATA_ALIGNMENT
> or CONSTANT_ALIGNMENT (for anything but actually emitting the var into
> object, or just as an optimization hint that very likely the decl will be
> aligned enough, but not guaranteed), which are optimization, is wrong
> (an ABI problem).

It is handling !DECL_P trees, which must be local.  I know I saw
STRING_CST here when I wrote offsettable_ok_by_alignment, hence the
use of CONSTANT_ALIGNMENT.  I'm not so sure about the need for
DATA_ALIGNMENT now, but if it was correct before then we ought to
be using both DATA_ABI_ALIGNMENT and DATA_ALIGNMENT after your
changes.
Jakub Jelinek - June 14, 2013, 8:59 a.m.
On Fri, Jun 14, 2013 at 08:18:19AM +0930, Alan Modra wrote:
> On Thu, Jun 13, 2013 at 05:42:17PM +0200, Jakub Jelinek wrote:
> > On Fri, Jun 14, 2013 at 01:07:01AM +0930, Alan Modra wrote:
> > > @@ -5774,10 +5818,11 @@ offsettable_ok_by_alignment (rtx op, HOST_WIDE_INT
> > >        type = TREE_TYPE (decl);
> > >  
> > >        dalign = TYPE_ALIGN (type);
> > > +      dalign = DATA_ABI_ALIGNMENT (type, dalign);
> > >        if (CONSTANT_CLASS_P (decl))
> > >  	dalign = CONSTANT_ALIGNMENT (decl, dalign);
> > >        else
> > > -	dalign = DATA_ALIGNMENT (decl, dalign);
> > > +	dalign = DATA_ALIGNMENT (type, dalign);
> > >  
> > >        if (dsize == 0)
> > >  	{
> > 
> > What is this code trying to do?  Shouldn't it just use DECL_ALIGN
> > which should be set to the right value from get_variable_alignment?
> > I mean, if !decl_binds_to_current_def_p (decl), then using DATA_ALIGNMENT
> > or CONSTANT_ALIGNMENT (for anything but actually emitting the var into
> > object, or just as an optimization hint that very likely the decl will be
> > aligned enough, but not guaranteed), which are optimization, is wrong
> > (an ABI problem).
> 
> It is handling !DECL_P trees, which must be local.  I know I saw
> STRING_CST here when I wrote offsettable_ok_by_alignment, hence the
> use of CONSTANT_ALIGNMENT.  I'm not so sure about the need for
> DATA_ALIGNMENT now, but if it was correct before then we ought to
> be using both DATA_ABI_ALIGNMENT and DATA_ALIGNMENT after your
> changes.

Yeah, then it makes sense.  Sorry for not looking up earlier that this is
the !DECL_P case.

As for the 
typedef int vec_align __attribute__ ((vector_size(16), aligned(32)));                                                                            
vec_align x = { 0, 0, 0, 0 };                                                                                                                    
changes, that is ABI changing bugfix, so the question is, are you fine with
breaking the ABI (between 4.8 and 4.9, or if you wanted to backport it to
4.8 too (I certainly plan to backport the non-ppc DATA_ABI_ALIGNMENT changes
to 4.8.2, already am using it in our compilers))?  The other option is
to fix the ABI, but keep things backwards ABI compatible.  That would be
done by decreasing the alignment as it used to do before in DATA_ABI_ALIGNMENT,
and increasing it to the desirable level only in DATA_ALIGNMENT.  That has
the effect that when emitting the decls into assembly e.g. the above will
now be correctly 32 byte aligned, but accesses to such decl in compiler
generated code will only assume that alignment if
decl_binds_to_current_def_p, otherwise they will keep assuming the old
(broken) lowered alignment.  At least for 4.8 backport IMHO that would be a
better idea (but of course would need big comment explaning it).

	Jakub
Alan Modra - June 14, 2013, 10:42 a.m.
On Fri, Jun 14, 2013 at 10:59:52AM +0200, Jakub Jelinek wrote:
> On Fri, Jun 14, 2013 at 08:18:19AM +0930, Alan Modra wrote:
> > It is handling !DECL_P trees, which must be local.  I know I saw
> > STRING_CST here when I wrote offsettable_ok_by_alignment, hence the
> > use of CONSTANT_ALIGNMENT.  I'm not so sure about the need for
> > DATA_ALIGNMENT now, but if it was correct before then we ought to
> > be using both DATA_ABI_ALIGNMENT and DATA_ALIGNMENT after your
> > changes.
> 
> Yeah, then it makes sense.  Sorry for not looking up earlier that this is
> the !DECL_P case.

Your comment prodded me into looking at whether the !DECL_P code is
needed in 4.9, and it looks like we never see !DECL_P trees..
Bootstrap and regression tests all langs on powerpc64 didn't hit a
gcc_unreachable() I put there, both with -fsection-anchors and
-fno-section-anchors.  David, please consider that piece of the patch
retracted.

> As for the 
> typedef int vec_align __attribute__ ((vector_size(16), aligned(32)));                                                                            
> vec_align x = { 0, 0, 0, 0 };                                                                                                                    
> changes, that is ABI changing bugfix, so the question is, are you fine with
> breaking the ABI (between 4.8 and 4.9, or if you wanted to backport it to
> 4.8 too (I certainly plan to backport the non-ppc DATA_ABI_ALIGNMENT changes
> to 4.8.2, already am using it in our compilers))?  The other option is
> to fix the ABI, but keep things backwards ABI compatible.  That would be
> done by decreasing the alignment as it used to do before in DATA_ABI_ALIGNMENT,
> and increasing it to the desirable level only in DATA_ALIGNMENT.  That has
> the effect that when emitting the decls into assembly e.g. the above will
> now be correctly 32 byte aligned, but accesses to such decl in compiler
> generated code will only assume that alignment if
> decl_binds_to_current_def_p, otherwise they will keep assuming the old
> (broken) lowered alignment.  At least for 4.8 backport IMHO that would be a
> better idea (but of course would need big comment explaning it).

I see your point, but for there to be a real problem we'd need
a) A library exporting such a type with (supposed) increased
   alignment, and,
b) gcc would need to make use of the increased alignment.

(a) must be rare or non-existent or you'd think we would have had a
bug report about lack of user alignment in vector typedefs.  The code
has been like this since 2001-11-07, so users have had a long time to
discover it.  (Of course, this is an argument for just ignoring the
bug too.)

(b) doesn't happen in the rs6000 backend as far as I'm aware.  Do you
know whether there is some optimisation based on alignment in generic
parts of gcc?  A quick test like

typedef int vec_align __attribute__ ((vector_size(16), aligned(32)));
vec_align x = { 0, 0, 0, 0 };

long f1 (void)
{
  return (long) &x & -32;
}

static int y __attribute__ ((aligned(32)));

long f2 (void)
{
  return (long) &y & -32;
}

shows the "& -32" in both functions isn't optimised away.
Jakub Jelinek - June 14, 2013, 10:54 a.m.
On Fri, Jun 14, 2013 at 08:12:02PM +0930, Alan Modra wrote:
> > As for the 
> > typedef int vec_align __attribute__ ((vector_size(16), aligned(32)));                                                                            
> > vec_align x = { 0, 0, 0, 0 };                                                                                                                    
> > changes, that is ABI changing bugfix, so the question is, are you fine with
> > breaking the ABI (between 4.8 and 4.9, or if you wanted to backport it to
> > 4.8 too (I certainly plan to backport the non-ppc DATA_ABI_ALIGNMENT changes
> > to 4.8.2, already am using it in our compilers))?  The other option is
> > to fix the ABI, but keep things backwards ABI compatible.  That would be
> > done by decreasing the alignment as it used to do before in DATA_ABI_ALIGNMENT,
> > and increasing it to the desirable level only in DATA_ALIGNMENT.  That has
> > the effect that when emitting the decls into assembly e.g. the above will
> > now be correctly 32 byte aligned, but accesses to such decl in compiler
> > generated code will only assume that alignment if
> > decl_binds_to_current_def_p, otherwise they will keep assuming the old
> > (broken) lowered alignment.  At least for 4.8 backport IMHO that would be a
> > better idea (but of course would need big comment explaning it).
> 
> I see your point, but for there to be a real problem we'd need
> a) A library exporting such a type with (supposed) increased
>    alignment, and,
> b) gcc would need to make use of the increased alignment.
> 
> (a) must be rare or non-existent or you'd think we would have had a
> bug report about lack of user alignment in vector typedefs.  The code
> has been like this since 2001-11-07, so users have had a long time to
> discover it.  (Of course, this is an argument for just ignoring the
> bug too.)

It doesn't have to be an exported symbol from a library, it is enough to
compile some objects using one compiler and other objects using another
compiler, then link into the same library.

> (b) doesn't happen in the rs6000 backend as far as I'm aware.  Do you
> know whether there is some optimisation based on alignment in generic
> parts of gcc?  A quick test like

Tons of them, the DECL_ALIGN value is used say by get_pointer_alignment,
vectorizer assumptions, is added to MEM_ATTRS, so anything looking at
alignment in RTL can optimize too.

> typedef int vec_align __attribute__ ((vector_size(16), aligned(32)));
> vec_align x = { 0, 0, 0, 0 };
> 
> long f1 (void)
> {
>   return (long) &x & -32;
> }

Try (long) &x & 31; ?  That &x & -32 not being optimized into &x
is guess a missed optimization.

Consider if you put:
typedef int vec_align __attribute__ ((vector_size(16), aligned(32)));
vec_align x = { 0, 0, 0, 0 };
into one TU and compile with gcc 4.8.1, then
typedef int vec_align __attribute__ ((vector_size(16), aligned(32)));
extern vec_align x;

long f1 (void)
{
  return (long) &x & 31;
}
in another TU and compile with gcc trunk after your patch.  I bet
it will be optimized into return 0; by the trunk + your patch compiler,
while the alignment will be actually just 16 byte.

	Jakub
Alan Modra - June 14, 2013, 2:57 p.m.
On Fri, Jun 14, 2013 at 12:54:40PM +0200, Jakub Jelinek wrote:
> On Fri, Jun 14, 2013 at 08:12:02PM +0930, Alan Modra wrote:
> > I see your point, but for there to be a real problem we'd need
> > a) A library exporting such a type with (supposed) increased
> >    alignment, and,
> > b) gcc would need to make use of the increased alignment.
> > 
> > (a) must be rare or non-existent or you'd think we would have had a
> > bug report about lack of user alignment in vector typedefs.  The code
> > has been like this since 2001-11-07, so users have had a long time to
> > discover it.  (Of course, this is an argument for just ignoring the
> > bug too.)
> 
> It doesn't have to be an exported symbol from a library, it is enough to
> compile some objects using one compiler and other objects using another
> compiler, then link into the same library.

OK.

> Try (long) &x & 31; ?  That &x & -32 not being optimized into &x
> is guess a missed optimization.

Huh, trust me to hit another bug. :)

> Consider if you put:
> typedef int vec_align __attribute__ ((vector_size(16), aligned(32)));
> vec_align x = { 0, 0, 0, 0 };
> into one TU and compile with gcc 4.8.1, then
> typedef int vec_align __attribute__ ((vector_size(16), aligned(32)));
> extern vec_align x;
> 
> long f1 (void)
> {
>   return (long) &x & 31;
> }
> in another TU and compile with gcc trunk after your patch.  I bet
> it will be optimized into return 0; by the trunk + your patch compiler,
> while the alignment will be actually just 16 byte.

Right.  Counterpoint is that gcc made exactly the same sort of error
across TUs and even in the same TU prior to my change.  eg.

typedef int vec_align __attribute__ ((vector_size(16), aligned(32)));
vec_align x = { 0, 0, 0, 0 };
long f1 (void) { return (long)&x & 31; }
int y __attribute__ ((vector_size(16), aligned(32))) = { 0, 0, 0, 0 };
long f2 (void) { return (long)&y & 31; }

compiles to

.L.f1:
        li 3,0
        blr
..
.L.f2:
        li 3,0
        blr
..
        .globl y
        .lcomm  y,16,32
        .type   y, @object
        .globl x
        .lcomm  x,16,16
        .type   x, @object
        .ident  "GCC: (GNU) 4.7.2 20120921 (Red Hat 4.7.2-2)"

My implementation of rs6000_data_alignment therefore doesn't introduce
a *new* ABI incompatibility.  I question whether it is worth
complicating rs6000_data_alignment, especially since your suggestion
of using the older buggy alignment in DATA_ABI_ALIGNMENT then
increasing in DATA_ALIGNMENT isn't as simple as it sounds.  We're
not talking about some fixed increase in DATA_ALIGNMENT but what we
want is the value of alignment before DATA_ABI_ALIGNMENT.  Perhaps
that could be retrieved from TYPE_ALIGN (type) and
MAX_OFILE_ALIGNMENT, but that would make our DATA_ALIGNMENT the only
target to need such tricks.
David Edelsohn - June 17, 2013, 11:37 p.m.
On Thu, Jun 13, 2013 at 11:37 AM, Alan Modra <amodra@gmail.com> wrote:

> Revised patch with offsettable_ok_by_alignment change, avoiding dumb
> idea of using statement expressions.  This one actually bootstraps and
> passes regression testing.
>
>         * config/rs6000/rs6000.h (enum data_align): New.
>         (LOCAL_ALIGNMENT, DATA_ALIGNMENT): Use rs6000_data_alignment.
>         (DATA_ABI_ALIGNMENT): Define.
>         (CONSTANT_ALIGNMENT): Correct comment.
>         * config/rs6000/rs6000-protos.h (rs6000_data_alignment): Declare.
>         * config/rs6000/rs6000.c (rs6000_data_alignment): New function.

The revised patch, without the DECL_P part is okay.

The original code produced the necessary alignment and neither of us
can find any code in public packages that increases the alignment for
PPC vector types.  While there is the possibility that a user could
encounter an object file produced by an older GCC with less strict
alignment and a version of GCC with this fix would make an incorrect
assumption, this does not seem very likely in practice.

Thanks, David

Patch

Index: gcc/config/rs6000/rs6000.h
===================================================================
--- gcc/config/rs6000/rs6000.h	(revision 200055)
+++ gcc/config/rs6000/rs6000.h	(working copy)
@@ -813,12 +813,6 @@  extern unsigned rs6000_pointer_size;
 /* No data type wants to be aligned rounder than this.  */
 #define BIGGEST_ALIGNMENT 128
 
-/* A C expression to compute the alignment for a variables in the
-   local store.  TYPE is the data type, and ALIGN is the alignment
-   that the object would ordinarily have.  */
-#define LOCAL_ALIGNMENT(TYPE, ALIGN)				\
-  DATA_ALIGNMENT (TYPE, ALIGN)
-
 /* Alignment of field after `int : 0' in a structure.  */
 #define EMPTY_FIELD_BOUNDARY 32
 
@@ -828,8 +822,15 @@  extern unsigned rs6000_pointer_size;
 /* A bit-field declared as `int' forces `int' alignment for the struct.  */
 #define PCC_BITFIELD_TYPE_MATTERS 1
 
-/* Make strings word-aligned so strcpy from constants will be faster.
-   Make vector constants quadword aligned.  */
+enum data_align { align_abi, align_opt, align_both };
+
+/* A C expression to compute the alignment for a variables in the
+   local store.  TYPE is the data type, and ALIGN is the alignment
+   that the object would ordinarily have.  */
+#define LOCAL_ALIGNMENT(TYPE, ALIGN)				\
+  rs6000_data_alignment (TYPE, ALIGN, align_both)
+
+/* Make strings word-aligned so strcpy from constants will be faster.  */
 #define CONSTANT_ALIGNMENT(EXP, ALIGN)                           \
   (TREE_CODE (EXP) == STRING_CST	                         \
    && (STRICT_ALIGNMENT || !optimize_size)                       \
@@ -837,21 +838,14 @@  extern unsigned rs6000_pointer_size;
    ? BITS_PER_WORD                                               \
    : (ALIGN))
 
-/* Make arrays of chars word-aligned for the same reasons.
-   Align vectors to 128 bits.  Align SPE vectors and E500 v2 doubles to
+/* Make arrays of chars word-aligned for the same reasons.  */
+#define DATA_ALIGNMENT(TYPE, ALIGN) \
+  rs6000_data_alignment (TYPE, ALIGN, align_opt)
+
+/* Align vectors to 128 bits.  Align SPE vectors and E500 v2 doubles to
    64 bits.  */
-#define DATA_ALIGNMENT(TYPE, ALIGN)					\
-  (TREE_CODE (TYPE) == VECTOR_TYPE					\
-   ? (((TARGET_SPE && SPE_VECTOR_MODE (TYPE_MODE (TYPE)))		\
-       || (TARGET_PAIRED_FLOAT && PAIRED_VECTOR_MODE (TYPE_MODE (TYPE)))) \
-      ? 64 : 128)							\
-   : ((TARGET_E500_DOUBLE						\
-       && TREE_CODE (TYPE) == REAL_TYPE					\
-       && TYPE_MODE (TYPE) == DFmode)					\
-      ? 64								\
-      : (TREE_CODE (TYPE) == ARRAY_TYPE					\
-	 && TYPE_MODE (TREE_TYPE (TYPE)) == QImode			\
-	 && (ALIGN) < BITS_PER_WORD) ? BITS_PER_WORD : (ALIGN)))
+#define DATA_ABI_ALIGNMENT(TYPE, ALIGN) \
+  rs6000_data_alignment (TYPE, ALIGN, align_abi)
 
 /* Nonzero if move instructions will actually fail to work
    when given unaligned data.  */
Index: gcc/config/rs6000/rs6000-protos.h
===================================================================
--- gcc/config/rs6000/rs6000-protos.h	(revision 200055)
+++ gcc/config/rs6000/rs6000-protos.h	(working copy)
@@ -141,6 +141,7 @@  extern int rs6000_loop_align (rtx);
 #endif /* RTX_CODE */
 
 #ifdef TREE_CODE
+extern unsigned int rs6000_data_alignment (tree, unsigned int, enum data_align);
 extern unsigned int rs6000_special_round_type_align (tree, unsigned int,
 						     unsigned int);
 extern unsigned int darwin_rs6000_special_round_type_align (tree, unsigned int,
Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c	(revision 200055)
+++ gcc/config/rs6000/rs6000.c	(working copy)
@@ -5384,6 +5390,44 @@  invalid_e500_subreg (rtx op, enum machine_mode mod
   return false;
 }
 
+/* Return alignment of TYPE.  Existing alignment is ALIGN.  HOW
+   selects whether the alignment is abi mandated, optional, or
+   both abi and optional alignment.  */
+   
+unsigned int
+rs6000_data_alignment (tree type, unsigned int align, enum data_align how)
+{
+  if (how != align_opt)
+    {
+      if (TREE_CODE (type) == VECTOR_TYPE)
+	{
+	  if ((TARGET_SPE && SPE_VECTOR_MODE (TYPE_MODE (type)))
+	      || (TARGET_PAIRED_FLOAT && PAIRED_VECTOR_MODE (TYPE_MODE (type))))
+	    {
+	      if (align < 64)
+		align = 64;
+	    }
+	  else if (align < 128)
+	    align = 128;
+	}
+      else if (TARGET_E500_DOUBLE
+	       && TREE_CODE (type) == REAL_TYPE
+	       && TYPE_MODE (type) == DFmode
+	       && align < 64)
+	align = 64;
+    }
+
+  if (how != align_abi)
+    {
+      if (TREE_CODE (type) == ARRAY_TYPE
+	  && TYPE_MODE (TREE_TYPE (type)) == QImode
+	  && align < BITS_PER_WORD)
+	align = BITS_PER_WORD;
+    }
+
+  return align;
+}
+
 /* AIX increases natural record alignment to doubleword if the first
    field is an FP double while the FP fields remain word aligned.  */
 
@@ -5774,10 +5818,11 @@  offsettable_ok_by_alignment (rtx op, HOST_WIDE_INT
       type = TREE_TYPE (decl);
 
       dalign = TYPE_ALIGN (type);
+      dalign = DATA_ABI_ALIGNMENT (type, dalign);
       if (CONSTANT_CLASS_P (decl))
 	dalign = CONSTANT_ALIGNMENT (decl, dalign);
       else
-	dalign = DATA_ALIGNMENT (decl, dalign);
+	dalign = DATA_ALIGNMENT (type, dalign);
 
       if (dsize == 0)
 	{