[rs6000] Generate correct constant permutes using xxpermdi

Message ID 1385134908.3052.4.camel@gnopaine
State New

Commit Message

Bill Schmidt Nov. 22, 2013, 3:41 p.m. UTC
Hi,

Most of our constant vector permutes use the vperm instructions, but for
V2DImode and V2DFmode we use xxpermdi.  This patch adjusts the
generated xxpermdi so that it is also correct for little endian, which
fixes failures of the test cases gcc.dg/torture/vshuf-v2d[fi].c.  Note
that we can't fix this directly in the pattern for xxpermdi, because
that pattern is also used by the corresponding intrinsic.

Bootstrapped and tested on powerpc64{,le}-unknown-linux-gnu with no
regressions.  Ok for trunk?

Thanks,
Bill


2013-11-22  Bill Schmidt  <wschmidt@linux.vnet.ibm.com>

	* config/rs6000/rs6000.c (rs6000_expand_vec_perm_const_1): Correct
	for little endian.

Comments

David Edelsohn Nov. 23, 2013, 3:19 a.m. UTC | #1
On Fri, Nov 22, 2013 at 10:41 AM, Bill Schmidt
<wschmidt@linux.vnet.ibm.com> wrote:
> Hi,
>
> Most of our constant vector permutes use the vperm instructions, but for
> V2DImode and V2DFmode we use xxpermdi.  This patch adjusts the
> generated xxpermdi so that it is also correct for little endian, which
> fixes failures of the test cases gcc.dg/torture/vshuf-v2d[fi].c.  Note
> that we can't fix this directly in the pattern for xxpermdi, because
> that pattern is also used by the corresponding intrinsic.
>
> Bootstrapped and tested on powerpc64{,le}-unknown-linux-gnu with no
> regressions.  Ok for trunk?
>
> Thanks,
> Bill
>
>
> 2013-11-22  Bill Schmidt  <wschmidt@linux.vnet.ibm.com>
>
>         * config/rs6000/rs6000.c (rs6000_expand_vec_perm_const_1): Correct
>         for little endian.

Okay.

The intrinsic could use a separate, dedicated pattern.

Thanks, David

Patch

Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c	(revision 205243)
+++ gcc/config/rs6000/rs6000.c	(working copy)
@@ -30021,6 +30021,21 @@  rs6000_expand_vec_perm_const_1 (rtx target, rtx op
       gcc_assert (GET_MODE_NUNITS (vmode) == 2);
       dmode = mode_for_vector (GET_MODE_INNER (vmode), 4);
 
+      /* For little endian, swap operands and invert/swap selectors
+	 to get the correct xxpermdi.  The operand swap sets up the
+	 inputs as a little endian array.  The selectors are swapped
+	 because they are defined to use big endian ordering.  The
+	 selectors are inverted to get the correct doublewords for
+	 little endian ordering.  */
+      if (!BYTES_BIG_ENDIAN)
+	{
+	  int n;
+	  perm0 = 3 - perm0;
+	  perm1 = 3 - perm1;
+	  n = perm0, perm0 = perm1, perm1 = n;
+	  x = op0, op0 = op1, op1 = x;
+	}
+
       x = gen_rtx_VEC_CONCAT (dmode, op0, op1);
       v = gen_rtvec (2, GEN_INT (perm0), GEN_INT (perm1));
       x = gen_rtx_VEC_SELECT (vmode, x, gen_rtx_PARALLEL (VOIDmode, v));
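
For readers who want to convince themselves that the selector and operand
adjustment in the hunk above does what its comment says, here is a small
standalone sketch.  It is not GCC code.  The type vreg and the functions
select_be, select_le_reference, select_le_via_patch and count_mismatches
are hypothetical names made up for this illustration, and the model
assumes only what the patch comment states: the selectors of the
VEC_SELECT over VEC_CONCAT use big endian doubleword numbering.  The
sketch applies the same four assignments as the patch and compares the
result with a directly computed little endian permute.

```c
#include <stdint.h>

/* A 128-bit register modelled as two doublewords in big endian order:
   be[0] is the most significant doubleword.  (Illustrative model only.)  */
typedef struct { uint64_t be[2]; } vreg;

/* Model of the select with big endian numbering: indices 0-1 name the
   doublewords of op0, indices 2-3 name the doublewords of op1.  */
static vreg select_be(vreg op0, vreg op1, int perm0, int perm1)
{
    uint64_t cat[4] = { op0.be[0], op0.be[1], op1.be[0], op1.be[1] };
    vreg r = { { cat[perm0], cat[perm1] } };
    return r;
}

/* Little endian element i of a register is big endian doubleword 1 - i.  */
static uint64_t le_elt(vreg v, int i)
{
    return v.be[1 - i];
}

/* Reference: the result a little endian permute of {a, b} should have,
   computed directly in little endian element numbering.  */
static vreg select_le_reference(vreg a, vreg b, int perm0, int perm1)
{
    uint64_t cat[4] = { le_elt(a, 0), le_elt(a, 1), le_elt(b, 0), le_elt(b, 1) };
    vreg r;
    r.be[1] = cat[perm0];   /* little endian element 0 */
    r.be[0] = cat[perm1];   /* little endian element 1 */
    return r;
}

/* The adjustment from the patch: invert both selectors, swap them, swap
   the operands, then use the big endian select.  */
static vreg select_le_via_patch(vreg op0, vreg op1, int perm0, int perm1)
{
    int n;
    vreg x;
    perm0 = 3 - perm0;
    perm1 = 3 - perm1;
    n = perm0, perm0 = perm1, perm1 = n;
    x = op0, op0 = op1, op1 = x;
    return select_be(op0, op1, perm0, perm1);
}

/* Count selector pairs for which the adjusted select disagrees with the
   little endian reference.  */
int count_mismatches(void)
{
    vreg a = { { 0xA0, 0xA1 } };
    vreg b = { { 0xB0, 0xB1 } };
    int bad = 0;
    for (int p0 = 0; p0 < 4; p0++)
        for (int p1 = 0; p1 < 4; p1++) {
            vreg want = select_le_reference(a, b, p0, p1);
            vreg got = select_le_via_patch(a, b, p0, p1);
            if (want.be[0] != got.be[0] || want.be[1] != got.be[1])
                bad++;
        }
    return bad;
}
```

In this model the two agree because a big endian view of the
concatenation {op1, op0} is the little endian view of {op0, op1} read in
reverse, so index j in one numbering is index 3 - j in the other, and
the two result doublewords trade places for the same reason.  The sketch
says nothing about the actual instruction encoding or the insn pattern
constraints.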