, PR 64019, fix power7/power8 regression

Message ID 20141201224829.GA27632@ibm-tiger.the-meissners.org
State New

Commit Message

Michael Meissner Dec. 1, 2014, 10:48 p.m. UTC
In my change on November 24th (adding support for scalar floating point
values in Altivec registers) there was a regression when Spec 2000 was compiled
for 32-bit big endian power7 systems.  The rs6000_legitimize_reload_address
function generated a reg+offset address for scalar values.  This resulted in
the compiler generating these addresses to load a constant in some cases in
32-bit mode.  With this patch, the function no longer returns an optimized
address for scalar types that can go in Altivec registers, since those
registers do not have d-form (reg+offset) addressing.  Without the 'optimized'
address, reload falls back to trying other options, and it eventually generates
the lfd instruction with an offset instead of an lxsdx.
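
The shape of the new check can be sketched in isolation.  This is a minimal
illustration, not the code in rs6000.c: the toy_* names, the reduced mode list,
and the choice of DFmode as the only mode flagged for Altivec registers are all
assumptions made for the example.  Only the scalar_in_vmx_p field name and the
TFmode exclusion are taken from the patch.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for a few GCC machine modes; not GCC's real enum.  */
enum toy_mode { TOY_SImode, TOY_SFmode, TOY_DFmode, TOY_TFmode, TOY_NUM_MODES };

/* Toy analogue of the per-mode reg_addr[] table that the patch consults.  */
struct toy_reg_addr
{
  bool scalar_in_vmx_p;	/* A scalar of this mode may live in an Altivec reg.  */
};

/* Assumption for illustration: only DFmode is allowed in Altivec registers.
   Entries not listed are zero-initialized, so the flag is false for them.  */
static const struct toy_reg_addr toy_reg_addr[TOY_NUM_MODES] = {
  [TOY_DFmode] = { .scalar_in_vmx_p = true },
};

/* Return true if the toy "legitimize" step may hand reload a reg+offset
   (LO_SUM style) address for MODE.  Returning false models declining the
   optimization, which leaves reload free to try its other options.  */
static bool
toy_allow_lo_sum_offset (enum toy_mode mode)
{
  return !toy_reg_addr[mode].scalar_in_vmx_p
	 && mode != TOY_TFmode;
}
```

In this sketch toy_allow_lo_sum_offset (TOY_DFmode) is false because the mode
is flagged in the table, while TOY_SImode and TOY_SFmode still take the
optimized path.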

I have bootstrapped these patches on big endian power7, big endian power8, and
little endian power8 systems, and there were no regressions.  Is the patch ok
to install?

[gcc]
2014-12-01  Michael Meissner  <meissner@linux.vnet.ibm.com>

	PR target/64019
	* config/rs6000/rs6000.c (rs6000_legitimize_reload_address): Do
	not create LO_SUM address for constant addresses if the type can
	go in Altivec registers.

[gcc/testsuite]
2014-12-01  Michael Meissner  <meissner@linux.vnet.ibm.com>

	PR target/64019
	* gcc.target/powerpc/pr64019.c: New file.

Comments

David Edelsohn Dec. 2, 2014, 3:39 a.m. UTC | #1
On Mon, Dec 1, 2014 at 5:48 PM, Michael Meissner
<meissner@linux.vnet.ibm.com> wrote:
> In my change on November 24th (adding support for scalar floating point
> values in Altivec registers) there was a regression when Spec 2000 was compiled
> for 32-bit big endian power7 systems.  The rs6000_legitimize_reload_address
> function generated a reg+offset address for scalar values.  This resulted in
> the compiler generating these addresses to load a constant in some cases in
> 32-bit mode.  With this patch, the function no longer returns an optimized
> address for scalar types that can go in Altivec registers, since those
> registers do not have d-form (reg+offset) addressing.  Without the 'optimized'
> address, reload falls back to trying other options, and it eventually generates
> the lfd instruction with an offset instead of an lxsdx.
>
> I have bootstrapped these patches on big endian power7, big endian power8, and
> little endian power8 systems, and there were no regressions.  Is the patch ok
> to install?
>
> [gcc]
> 2014-12-01  Michael Meissner  <meissner@linux.vnet.ibm.com>
>
>         PR target/64019
>         * config/rs6000/rs6000.c (rs6000_legitimize_reload_address): Do
>         not create LO_SUM address for constant addresses if the type can
>         go in Altivec registers.
>
> [gcc/testsuite]
> 2014-12-01  Michael Meissner  <meissner@linux.vnet.ibm.com>
>
>         PR target/64019
>         * gcc.target/powerpc/pr64019.c: New file.

Okay.

Thanks, David

Patch

Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c	(revision 218090)
+++ gcc/config/rs6000/rs6000.c	(working copy)
@@ -7593,7 +7593,11 @@  rs6000_legitimize_reload_address (rtx x,
 	 naturally aligned.  Since we say the address is good here, we
 	 can't disable offsets from LO_SUMs in mem_operand_gpr.
 	 FIXME: Allow offset from lo_sum for other modes too, when
-	 mem is sufficiently aligned.  */
+	 mem is sufficiently aligned.
+
+	 Also disallow this if the type can go in VMX/Altivec registers, since
+	 those registers do not have d-form (reg+offset) address modes.  */
+      && !reg_addr[mode].scalar_in_vmx_p
       && mode != TFmode
       && mode != TDmode
       && (mode != TImode || !TARGET_VSX_TIMODE)
Index: gcc/testsuite/gcc.target/powerpc/pr64019.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/pr64019.c	(revision 0)
+++ gcc/testsuite/gcc.target/powerpc/pr64019.c	(revision 0)
@@ -0,0 +1,71 @@ 
+/* { dg-do compile { target { powerpc*-*-* } } } */
+/* { dg-skip-if "" { powerpc*-*-darwin* } { "*" } { "" } } */
+/* { dg-require-effective-target powerpc_vsx_ok } */
+/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { "-mcpu=power7" } } */
+/* { dg-options "-O2 -ffast-math -mcpu=power7" } */
+
+#include <math.h>
+
+typedef struct
+{
+  double x, y, z;
+  double q, a, b, mass;
+  double vx, vy, vz, vw, dx, dy, dz;
+}
+ATOM;
+int
+u_f_nonbon (lambda)
+     double lambda;
+{
+  double r, r0, xt, yt, zt;
+  double lcutoff, cutoff, get_f_variable ();
+  double rdebye;
+  int inbond, inangle, i;
+  ATOM *a1, *a2, *bonded[10], *angled[10];
+  ATOM *(*use)[];
+  int uselist (), nuse, used;
+  ATOM *cp, *bp;
+  int a_number (), inbuffer;
+  double (*buffer)[], xx, yy, zz, k;
+  int invector, atomsused, ii, jj, imax;
+  double (*vector)[];
+  ATOM *(*atms)[];
+  double dielectric;
+  rdebye = cutoff / 2.;
+  dielectric = get_f_variable ("dielec");
+  imax = a_number ();
+  for (jj = 1; jj < imax; jj++, a1 = bp)
+    {
+      if ((*use)[used] == a1)
+	{
+	  used += 1;
+	}
+      while ((*use)[used] != a1)
+	{
+	  for (i = 0; i < inbuffer; i++)
+	    {
+	    }
+	  xx = a1->x + lambda * a1->dx;
+	  yy = a1->y + lambda * a1->dy;
+	  zz = a1->z + lambda * a1->dz;
+	  for (i = 0; i < inbuffer; i++)
+	    {
+	      xt = xx - (*buffer)[3 * i];
+	      yt = yy - (*buffer)[3 * i + 1];
+	      zt = zz - (*buffer)[3 * i + 2];
+	      r = xt * xt + yt * yt + zt * zt;
+	      r0 = sqrt (r);
+	      xt = xt / r0;
+	      zt = zt / r0;
+	      k =
+		-a1->q * (*atms)[i]->q * dielectric * exp (-r0 / rdebye) *
+		(1. / (rdebye * r0) + 1. / r);
+	      k += a1->a * (*atms)[i]->a / r / r0 * 6;
+	      k -= a1->b * (*atms)[i]->b / r / r / r0 * 12;
+	      (*vector)[3 * i] = xt * k;
+	      (*vector)[3 * i + 1] = yt * k;
+	      (*vector)[3 * i + 2] = zt * k;
+	    }
+	}
+    }
+}