[1/2,x86] Add palignr support for AVX2.

Message ID 20141001113815.GQ1986@tucnak.redhat.com
State New

Commit Message

Jakub Jelinek Oct. 1, 2014, 11:38 a.m. UTC
On Wed, Oct 01, 2014 at 12:35:14PM +0200, Jakub Jelinek wrote:
> On Wed, Oct 01, 2014 at 12:28:51PM +0200, Uros Bizjak wrote:
> > On Wed, Oct 1, 2014 at 12:16 PM, Evgeny Stupachenko <evstupac@gmail.com> wrote:
> > > Getting back to initial patch, is it ok?
> > 
> > IMO, we should start with Jakub's proposed patch [1]
> > 
> > [1] https://gcc.gnu.org/ml/gcc-patches/2014-10/msg00010.html
> 
> That doesn't compile, will post a new version; got interrupted when
> I found that in
> GCC_TEST_RUN_EXPENSIVE=1 make check-gcc RUNTESTFLAGS='--target_board=unix/-mavx2 dg-torture.exp=vshuf*.c'
> one test is miscompiled even with unpatched compiler, debugging that now.

Let's start with the bugfix.  The || doesn't make any sense, and we really
want to fill in 4 bits (0, 1, 4, 5) of the immediate, not just two, anyway.
valid_perm_using_mode_p (V2TImode, d) should already guarantee that
it is possible to permute it as V2TI, so all we care about are the
values of d->perm[0] and d->perm[nelt / 2]; what matters is not just
which lane each one refers to, but also which operand (src1 or src2).
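
To illustrate the point, here is a minimal standalone sketch, not the GCC
code itself, that compares the old and the new immediate computation.  The
function names permv2ti_imm_old and permv2ti_imm_fixed are hypothetical and
exist only for this example; perm and nelt stand in for d->perm and d->nelt.
It assumes the usual vperm2i128 encoding, where bits 1:0 of the immediate
select the source of the low 128-bit lane and bits 5:4 the source of the
high lane (0 = src1 low, 1 = src1 high, 2 = src2 low, 3 = src2 high).

```c
#include <assert.h>

/* Old computation: the logical || collapses the result to 0 or 1, so the
   selector for the high lane is lost and src2 can never be selected.  */
unsigned
permv2ti_imm_old (const unsigned char *perm, unsigned nelt)
{
  return ((perm[0] & (nelt / 2)) ? 1 : 0)
	 || ((perm[nelt / 2] & (nelt / 2)) ? 2 : 0);
}

/* Fixed computation: perm[i] / (nelt / 2) is a 2-bit lane selector in the
   range 0..3 that encodes both the operand and the lane within it.  The
   selector for the low lane goes into bits 1:0, the one for the high lane
   into bits 5:4 (hence the multiplication by 16).  */
unsigned
permv2ti_imm_fixed (const unsigned char *perm, unsigned nelt)
{
  /* Assumption of this sketch: nelt is even and indices are < 2 * nelt.  */
  assert (nelt >= 2 && nelt % 2 == 0);
  assert (perm[0] < 2 * nelt && perm[nelt / 2] < 2 * nelt);

  return (perm[0] / (nelt / 2))
	 | ((perm[nelt / 2] / (nelt / 2)) * 16);
}
```

For a four-element permutation { 4, 5, 2, 3 } (low lane from the low lane of
src2, high lane from the high lane of src1), the fixed version yields 0x12,
while the old one yields 1, which would select the high lane of src1 for the
low half and the low lane of src1 for the high half.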

Tested with
GCC_TEST_RUN_EXPENSIVE=1 make check-gcc RUNTESTFLAGS='--target_board=unix/-mavx2 dg-torture.exp=vshuf*.c'
Ok for trunk/4.9/4.8?

2014-10-01  Jakub Jelinek  <jakub@redhat.com>

	PR target/63428
	* config/i386/i386.c (expand_vec_perm_pshufb): Fix up rperm[0]
	argument to avx2_permv2ti.

	* gcc.dg/torture/vshuf-4.inc: Move test 122 from EXPTESTS
	to test 24 in TESTS.



	Jakub

Comments

Uros Bizjak Oct. 1, 2014, 11:45 a.m. UTC | #1
On Wed, Oct 1, 2014 at 1:38 PM, Jakub Jelinek <jakub@redhat.com> wrote:

>> That doesn't compile, will post a new version; got interrupted when
>> I found that in
>> GCC_TEST_RUN_EXPENSIVE=1 make check-gcc RUNTESTFLAGS='--target_board=unix/-mavx2 dg-torture.exp=vshuf*.c'
>> one test is miscompiled even with unpatched compiler, debugging that now.
>
> Let's start with the bugfix.  The || doesn't make any sense, and we really
> want to fill in 4 bits (0, 1, 4, 5) of the immediate, not just two, anyway.
> valid_perm_using_mode_p (V2TImode, d) should already guarantee that
> it is possible to permute it as V2TI, so all we care about are the
> values of d->perm[0] and d->perm[nelt / 2], but we care not just which
> lane it is, but also which operand (src1 or src2).
>
> Tested with
> GCC_TEST_RUN_EXPENSIVE=1 make check-gcc RUNTESTFLAGS='--target_board=unix/-mavx2 dg-torture.exp=vshuf*.c'
> Ok for trunk/4.9/4.8?
>
> 2014-10-01  Jakub Jelinek  <jakub@redhat.com>
>
>         PR target/63428
>         * config/i386/i386.c (expand_vec_perm_pshufb): Fix up rperm[0]
>         argument to avx2_permv2ti.
>
>         * gcc.dg/torture/vshuf-4.inc: Move test 122 from EXPTESTS
>         to test 24 in TESTS.

OK.

Thanks,
Uros.

Patch

--- gcc/config/i386/i386.c.jj	2014-10-01 11:22:09.000000000 +0200
+++ gcc/config/i386/i386.c	2014-10-01 13:00:30.835809031 +0200
@@ -42961,8 +42961,8 @@  expand_vec_perm_pshufb (struct expand_ve
 	      op0 = gen_lowpart (V4DImode, d->op0);
 	      op1 = gen_lowpart (V4DImode, d->op1);
 	      rperm[0]
-		= GEN_INT (((d->perm[0] & (nelt / 2)) ? 1 : 0)
-			   || ((d->perm[nelt / 2] & (nelt / 2)) ? 2 : 0));
+		= GEN_INT ((d->perm[0] / (nelt / 2))
+			   | ((d->perm[nelt / 2] / (nelt / 2)) * 16));
 	      emit_insn (gen_avx2_permv2ti (target, op0, op1, rperm[0]));
 	      if (target != d->target)
 		emit_move_insn (d->target, gen_lowpart (d->vmode, target));
--- gcc/testsuite/gcc.dg/torture/vshuf-4.inc.jj	2012-03-20 08:51:25.000000000 +0100
+++ gcc/testsuite/gcc.dg/torture/vshuf-4.inc	2014-10-01 13:23:07.163090945 +0200
@@ -23,7 +23,8 @@  T (19,	3, 2, 1, 0) \
 T (20,	0, 4, 1, 5) \
 T (21,	2, 6, 3, 7) \
 T (22,	1, 2, 3, 0) \
-T (23,	2, 1, 0, 3)
+T (23,	2, 1, 0, 3) \
+T (24,	2, 5, 6, 3)
 #define EXPTESTS \
 T (116,	1, 2, 4, 3) \
 T (117,	7, 3, 3, 0) \
@@ -31,7 +32,6 @@  T (118,	5, 3, 2, 7) \
 T (119,	0, 3, 5, 6) \
 T (120,	0, 0, 1, 5) \
 T (121,	4, 6, 2, 1) \
-T (122,	2, 5, 6, 3) \
 T (123,	4, 6, 3, 2) \
 T (124,	4, 7, 5, 6) \
 T (125,	0, 4, 2, 4) \