diff mbox

[RS6000] strict alignment for little-endian

Message ID 20131119065927.GB22514@bubble.grove.modra.org
State New
Headers show

Commit Message

Alan Modra Nov. 19, 2013, 6:59 a.m. UTC
On Mon, Jun 10, 2013 at 03:14:04PM -0400, Michael Meissner wrote:
> On Fri, Jun 07, 2013 at 10:54:39AM +0930, Alan Modra wrote:
> > I'd like to remove -mstrict-align for little-endian powerpc, because
> > the assumption that mis-aligned accesses are massively slow isn't true
> > for current powerpc processors.  However I also don't want to break
> > old machines, so probably should add -mstrict-align back for some set
> > of cpus.  Can anyone tell me the set?
> 
> I don't know the set.  I recall the original PPC's (604, 603, etc.) had the
> strict alignment rule in little endian rule, and the 750 relaxed it for GPRs
> (but not FPRs), but I don't know about later machines.

After some research and bothering people who ought to know, I came
up with the following:

Since modern powerpc processors that support little-endian (power6 and
later) do so properly rather than via the low bit swizzling of older
processors (603, 604, 750, 74xx) we don't take traps on all unaligned
accesses.  Even so, power6 and power7 still take more alignment traps
in little-endian mode than big-endian, so it may be advantageous to
leave -mstrict-align as the default for power7.  Anton tells me that
for power8, -mstrict-align is not necessary.  It's not desirable since
gcc easily loses track of alignment, for instance with -mstrict-align

void foo (char *p, char *q)
{
  __builtin_memcpy (p, q, 4);
}

generates

	lbz 7,0(4)
	lbz 8,1(4)
	lbz 10,2(4)
	lbz 9,3(4)
	stb 7,0(3)
	stb 8,1(3)
	stb 10,2(3)
	stb 9,3(3)
	blr

whereas -mno-strict-align gives

	lwz 9,0(4)
	stw 9,0(3)
	blr

This patch disables the default -mstrict-align on power8.  Rather than
testing values from enum processor_type, which is prone to error when
adding a new processor, I chose one of the mask bits that happens to
be set for power8, and is likely to be set for future new processors.
I also take the opportunity to update PROCESSOR_DEFAULT64 to the value
most likely appropriate for machines running ELFv2.  Bootstrapped and
regression tested powerpc64-linux.  OK to apply?

David, it also occurs to me that PROCESSOR_DEFAULT* isn't the best
choice of macro name nowadays.  The macro is only used to default
-mtune, not -mcpu.  If you think it a good idea, I'll
s/PROCESSOR_DEFAULT/TUNE_DEFAULT/ throughout config/rs6000 as a
followup commit to this one.

	* config/rs6000/sysv4.h (CC1_ENDIAN_LITTLE_SPEC): Define as empty.
	* config/rs6000/rs6000.c (rs6000_option_override_internal): Default
	to strict alignment on older processors when little-endian.
	* config/rs6000/linux64.h (PROCESSOR_DEFAULT64): Default to power8
	for ELFv2.

Comments

Eric Botcazou Nov. 19, 2013, 9:42 a.m. UTC | #1
> It's not desirable since gcc easily loses track of alignment, for instance
> with -mstrict-align
> 
> void foo (char *p, char *q)
> {
>   __builtin_memcpy (p, q, 4);
> }
> 
> generates
> 
> 	lbz 7,0(4)
> 	lbz 8,1(4)
> 	lbz 10,2(4)
> 	lbz 9,3(4)
> 	stb 7,0(3)
> 	stb 8,1(3)
> 	stb 10,2(3)
> 	stb 9,3(3)
> 	blr
> 
> whereas -mno-strict-align gives
> 
> 	lwz 9,0(4)
> 	stw 9,0(3)
> 	blr

I presume you meant:

void foo (int *p, int *q)
{
  __builtin_memcpy (p, q, 4);
}

which will yield the same generated code.  Yes, it's an unfortunate regression 
on strict-alignment platforms, up to 4.5 the generated code was the same.
David Edelsohn Nov. 19, 2013, 3:53 p.m. UTC | #2
On Tue, Nov 19, 2013 at 1:59 AM, Alan Modra <amodra@gmail.com> wrote:

> This patch disables the default -mstrict-align on power8.  Rather than
> testing values from enum processor_type, which is prone to error when
> adding a new processor, I chose one of the mask bits that happens to
> be set for power8, and is likely to be set for future new processors.
> I also take the opportunity to update PROCESSOR_DEFAULT64 to the value
> most likely appropriate for machines running ELFv2.  Bootstrapped and
> regression tested powerpc64-linux.  OK to apply?
>
> David, it also occurs to me that PROCESSOR_DEFAULT* isn't the best
> choice of macro name nowadays.  The macro is only used to default
> -mtune, not -mcpu.  If you think it a good idea, I'll
> s/PROCESSOR_DEFAULT/TUNE_DEFAULT/ throughout config/rs6000 as a
> followup commit to this one.
>
>         * config/rs6000/sysv4.h (CC1_ENDIAN_LITTLE_SPEC): Define as empty.
>         * config/rs6000/rs6000.c (rs6000_option_override_internal): Default
>         to strict alignment on older processors when little-endian.
>         * config/rs6000/linux64.h (PROCESSOR_DEFAULT64): Default to power8
>         for ELFv2.

The patch is okay.

I agree that the mechanism for selecting the default tuning should be
cleaned up, but renaming PROCESSOR_DEFAULT to TUNE_DEFAULT just papers
over the issue.

Thanks, David
diff mbox

Patch

Index: gcc/config/rs6000/linux64.h
===================================================================
--- gcc/config/rs6000/linux64.h	(revision 205002)
+++ gcc/config/rs6000/linux64.h	(working copy)
@@ -71,7 +71,11 @@ 
 #undef  PROCESSOR_DEFAULT
 #define PROCESSOR_DEFAULT PROCESSOR_POWER7
 #undef  PROCESSOR_DEFAULT64
+#ifdef LINUX64_DEFAULT_ABI_ELFv2
+#define PROCESSOR_DEFAULT64 PROCESSOR_POWER8
+#else
 #define PROCESSOR_DEFAULT64 PROCESSOR_POWER7
+#endif
 
 /* We don't need to generate entries in .fixup, except when
    -mrelocatable or -mrelocatable-lib is given.  */
Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c	(revision 205002)
+++ gcc/config/rs6000/rs6000.c	(working copy)
@@ -3217,6 +3217,12 @@ 
 	}
     }
 
+  /* If little-endian, default to -mstrict-align on older processors.
+     Testing for htm matches power8 and later.  */
+  if (!BYTES_BIG_ENDIAN
+      && !(processor_target_table[tune_index].target_enable & OPTION_MASK_HTM))
+    rs6000_isa_flags |= ~rs6000_isa_flags_explicit & OPTION_MASK_STRICT_ALIGN;
+
   /* Add some warnings for VSX.  */
   if (TARGET_VSX)
     {
Index: gcc/config/rs6000/sysv4.h
===================================================================
--- gcc/config/rs6000/sysv4.h	(revision 205002)
+++ gcc/config/rs6000/sysv4.h	(working copy)
@@ -538,12 +538,7 @@ 
 
 #define	CC1_ENDIAN_BIG_SPEC ""
 
-#define	CC1_ENDIAN_LITTLE_SPEC "\
-%{!mstrict-align: %{!mno-strict-align: \
-    %{!mcall-i960-old: \
-	-mstrict-align \
-    } \
-}}"
+#define	CC1_ENDIAN_LITTLE_SPEC ""
 
 #define	CC1_ENDIAN_DEFAULT_SPEC "%(cc1_endian_big)"