diff mbox

[rs6000] Avoid optimization problem for VEC_PERM_EXPR

Message ID 1390520607.3449.92.camel@gnopaine
State New
Headers show

Commit Message

Bill Schmidt Jan. 23, 2014, 11:43 p.m. UTC
Hi,

While testing another patch, I hit a regression at -O2 for two of the
vector shuffle tests.  This patch fixes the problem.

The problem was introduced with the little endian fixes for
VEC_PERM_EXPR.  The original change performed the necessary
transformation at expand time, but this is incorrect.  This creates a
technically incorrect representation in the RTL, and RTL simplification
can subsequently fold the expression in an unexpected way.  The correct
fix is to delay the transformation until final code generation.

Bootstrapped and tested on powerpc64{,le}-unknown-linux-gnu with no
regressions.  Is this ok for trunk?  I will also plan to backport this
to ibm/gcc-4_8-branch after burn-in.

Thanks,
Bill


2014-01-23  Bill Schmidt  <wschmidt@linux.vnet.ibm.com>

	* config/rs6000/rs6000.c (rs6000_expand_vec_perm_const_1): Remove
	correction for little endian...
	* config/rs6000/vsx.md (vsx_xxpermdi2_<mode>_1): ...and move it to
	here.

Comments

David Edelsohn Jan. 24, 2014, 12:42 a.m. UTC | #1
On Thu, Jan 23, 2014 at 6:43 PM, Bill Schmidt
<wschmidt@linux.vnet.ibm.com> wrote:
> Hi,
>
> While testing another patch, I hit a regression at -O2 for two of the
> vector shuffle tests.  This patch fixes the problem.
>
> The problem was introduced with the little endian fixes for
> VEC_PERM_EXPR.  The original change performed the necessary
> transformation at expand time, but this is incorrect.  This creates a
> technically incorrect representation in the RTL, and RTL simplification
> can subsequently fold the expression in an unexpected way.  The correct
> fix is to delay the transformation until final code generation.
>
> Bootstrapped and tested on powerpc64{,le}-unknown-linux-gnu with no
> regressions.  Is this ok for trunk?  I will also plan to backport this
> to ibm/gcc-4_8-branch after burn-in.
>
> Thanks,
> Bill
>
>
> 2014-01-23  Bill Schmidt  <wschmidt@linux.vnet.ibm.com>
>
>         * config/rs6000/rs6000.c (rs6000_expand_vec_perm_const_1): Remove
>         correction for little endian...
>         * config/rs6000/vsx.md (vsx_xxpermdi2_<mode>_1): ...and move it to
>         here.

Okay. Would you please move the comment also?

Thanks, David
Bill Schmidt Jan. 24, 2014, 1:48 a.m. UTC | #2
On Thu, 2014-01-23 at 19:42 -0500, David Edelsohn wrote:
> On Thu, Jan 23, 2014 at 6:43 PM, Bill Schmidt
> <wschmidt@linux.vnet.ibm.com> wrote:
> > Hi,
> >
> > While testing another patch, I hit a regression at -O2 for two of the
> > vector shuffle tests.  This patch fixes the problem.
> >
> > The problem was introduced with the little endian fixes for
> > VEC_PERM_EXPR.  The original change performed the necessary
> > transformation at expand time, but this is incorrect.  This creates a
> > technically incorrect representation in the RTL, and RTL simplification
> > can subsequently fold the expression in an unexpected way.  The correct
> > fix is to delay the transformation until final code generation.
> >
> > Bootstrapped and tested on powerpc64{,le}-unknown-linux-gnu with no
> > regressions.  Is this ok for trunk?  I will also plan to backport this
> > to ibm/gcc-4_8-branch after burn-in.
> >
> > Thanks,
> > Bill
> >
> >
> > 2014-01-23  Bill Schmidt  <wschmidt@linux.vnet.ibm.com>
> >
> >         * config/rs6000/rs6000.c (rs6000_expand_vec_perm_const_1): Remove
> >         correction for little endian...
> >         * config/rs6000/vsx.md (vsx_xxpermdi2_<mode>_1): ...and move it to
> >         here.
> 
> Okay. Would you please move the comment also?

Oh, sure.  Will do.

Bill
> 
> Thanks, David
>
diff mbox

Patch

Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c	(revision 206889)
+++ gcc/config/rs6000/rs6000.c	(working copy)
@@ -30115,22 +30121,6 @@  rs6000_expand_vec_perm_const_1 (rtx target, rtx op
       vmode = GET_MODE (target);
       gcc_assert (GET_MODE_NUNITS (vmode) == 2);
       dmode = mode_for_vector (GET_MODE_INNER (vmode), 4);
-
-      /* For little endian, swap operands and invert/swap selectors
-	 to get the correct xxpermdi.  The operand swap sets up the
-	 inputs as a little endian array.  The selectors are swapped
-	 because they are defined to use big endian ordering.  The
-	 selectors are inverted to get the correct doublewords for
-	 little endian ordering.  */
-      if (!BYTES_BIG_ENDIAN)
-	{
-	  int n;
-	  perm0 = 3 - perm0;
-	  perm1 = 3 - perm1;
-	  n = perm0, perm0 = perm1, perm1 = n;
-	  x = op0, op0 = op1, op1 = x;
-	}
-
       x = gen_rtx_VEC_CONCAT (dmode, op0, op1);
       v = gen_rtvec (2, GEN_INT (perm0), GEN_INT (perm1));
       x = gen_rtx_VEC_SELECT (vmode, x, gen_rtx_PARALLEL (VOIDmode, v));
Index: gcc/config/rs6000/vsx.md
===================================================================
--- gcc/config/rs6000/vsx.md	(revision 206889)
+++ gcc/config/rs6000/vsx.md	(working copy)
@@ -1634,9 +1634,26 @@ 
 		     (match_operand 4 "const_2_to_3_operand" "")])))]
   "VECTOR_MEM_VSX_P (<MODE>mode)"
 {
-  int mask = (INTVAL (operands[3]) << 1) | (INTVAL (operands[4]) - 2);
+  int op3, op4, mask;
+
+  if (BYTES_BIG_ENDIAN)
+    {
+      op3 = INTVAL (operands[3]);
+      op4 = INTVAL (operands[4]);
+    }
+  else
+    {
+      op3 = 3 - INTVAL (operands[4]);
+      op4 = 3 - INTVAL (operands[3]);
+    }
+
+  mask = (op3 << 1) | (op4 - 2);
   operands[3] = GEN_INT (mask);
-  return "xxpermdi %x0,%x1,%x2,%3";
+
+  if (BYTES_BIG_ENDIAN)
+    return "xxpermdi %x0,%x1,%x2,%3";
+  else
+    return "xxpermdi %x0,%x2,%x1,%3";
 }
   [(set_attr "type" "vecperm")])