
[RFC,RFH,i386] Fix gcc.target/i386/pr61403.c FAIL with -mavx2

Message ID CAFULd4bNS96-6hfHRBXA4xUGqOrsqFkD42W1EWro2ROnL1_edQ@mail.gmail.com
State New

Commit Message

Uros Bizjak Oct. 2, 2014, 6:34 p.m. UTC
On Wed, Oct 1, 2014 at 9:03 PM, Uros Bizjak <ubizjak@gmail.com> wrote:

>> And now the expand_vec_perm_palignr improvement, tested
>> with GCC_TEST_RUN_EXPENSIVE=1 make check-gcc \
>> RUNTESTFLAGS='--target_board=unix/-mavx2 dg-torture.exp=vshuf*.c'
>> E.g.
>> typedef unsigned long long V __attribute__ ((vector_size (32)));
>> extern void abort (void);
>> V a, b, c, d;
>> void test_14 (void)
>> {
>>   V mask = { 6, 1, 3, 4 };
>>   int i;
>>   c = __builtin_shuffle (a, mask);
>>   d = __builtin_shuffle (a, b, mask);
>> }
>> (distilled from test 15 in vshuf-v4di.c) results in:
>> -       vmovdqa a(%rip), %ymm0
>> -       vpermq  $54, %ymm0, %ymm1
>> -       vpshufb .LC1(%rip), %ymm0, %ymm0
>> -       vmovdqa %ymm1, c(%rip)
>> -       vmovdqa b(%rip), %ymm1
>> -       vpshufb .LC0(%rip), %ymm1, %ymm1
>> -       vpermq  $78, %ymm1, %ymm1
>> -       vpor    %ymm1, %ymm0, %ymm0
>> +       vmovdqa a(%rip), %ymm1
>> +       vpermq  $54, %ymm1, %ymm0
>> +       vmovdqa %ymm0, c(%rip)
>> +       vmovdqa b(%rip), %ymm0
>> +       vpalignr        $8, %ymm1, %ymm0, %ymm0
>> +       vpermq  $99, %ymm0, %ymm0
>>         vmovdqa %ymm0, d(%rip)
>>         vzeroupper
>>         ret
>> change (and two fewer .rodata constants).
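For readers following the quoted test case, here is a minimal portable sketch of what the two __builtin_shuffle calls compute for mask { 6, 1, 3, 4 }. It is not GCC code: the helper names (shuffle1, shuffle2, vpermq_imm) and NELT are made up for illustration, and the only assumptions are the documented __builtin_shuffle semantics (indices taken modulo the element count, or modulo twice the count for two operands) and the vpermq imm8 layout of two bits per destination element.

```c
#define NELT 4  /* V is four 64-bit elements (vector_size (32)) */

/* One-operand shuffle: each mask index is taken modulo NELT
   and selects an element of a.  */
static void
shuffle1 (const unsigned long long *a, const unsigned long long *mask,
          unsigned long long *out)
{
  for (int i = 0; i < NELT; i++)
    out[i] = a[mask[i] % NELT];
}

/* Two-operand shuffle: each mask index is taken modulo 2 * NELT;
   indices below NELT select from a, the rest select from b.  */
static void
shuffle2 (const unsigned long long *a, const unsigned long long *b,
          const unsigned long long *mask, unsigned long long *out)
{
  for (int i = 0; i < NELT; i++)
    {
      unsigned idx = (unsigned) (mask[i] % (2 * NELT));
      out[i] = idx < NELT ? a[idx] : b[idx - NELT];
    }
}

/* Build a vpermq-style imm8 for the one-operand case:
   bits [2*i+1 : 2*i] hold the source index of destination element i.  */
static unsigned
vpermq_imm (const unsigned long long *mask)
{
  unsigned imm = 0;
  for (int i = 0; i < NELT; i++)
    imm |= (unsigned) (mask[i] % NELT) << (2 * i);
  return imm;
}
```

With mask { 6, 1, 3, 4 }, vpermq_imm yields 54, which matches the "vpermq $54" used for c in both the old and the new sequence, and shuffle2 picks { b[2], a[1], a[3], b[0] } for d.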
>
> On a related note, I would like to point out that
> gcc.target/i386/pr61403.c also fails to generate blend insn with
> -mavx2. The new insn sequence includes lots of new vpshufb insns with
> memory access.

The following patch fixes the failure:

--cut here--
--cut here--

The comment above expand_vec_perm_pblendv claims that:

  /* Use the same checks as in expand_vec_perm_blend, but skipping
     AVX and AVX2 as they require more than 2 instructions.  */

However, I see a significant reduction in vpshufb and vpor
instructions (33->16 and 22->11, respectively), plus 6 new vblendps insns.

BTW: I have no access to an AVX2 target, so I can't test the patch with
runtime tests. OTOH, it doesn't ICE for "GCC_TEST_RUN_EXPENSIVE=1 make
check-gcc RUNTESTFLAGS='--target_board=unix/-mavx2
dg-torture.exp=vshuf*.c'".

Jakub, what do you think?

Uros.

Patch

Index: i386.c
===================================================================
--- i386.c      (revision 215802)
+++ i386.c      (working copy)
@@ -43407,8 +43407,10 @@  expand_vec_perm_pblendv (struct expand_vec_perm_d
      AVX and AVX2 as they require more than 2 instructions.  */
   if (d->one_operand_p)
     return false;
-  if (TARGET_SSE4_1 && GET_MODE_SIZE (vmode) == 16)
+  if (TARGET_AVX2 && GET_MODE_SIZE (vmode) == 32)
     ;
+  else if (TARGET_SSE4_1 && GET_MODE_SIZE (vmode) == 16)
+    ;
   else
     return false;
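
As a rough model of the hunk above, the early mode check can be written as a standalone predicate. This is only a sketch: struct target_model and both function names are hypothetical stand-ins for TARGET_SSE4_1, TARGET_AVX2 and GET_MODE_SIZE (vmode), and a true result means only that the function would continue past the early return, not that the expansion succeeds.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the target flags the check consults.  */
struct target_model
{
  bool sse4_1;  /* models TARGET_SSE4_1 */
  bool avx2;    /* models TARGET_AVX2 */
};

/* Mode check before the patch: only 16-byte modes with SSE4.1 pass.  */
static bool
pblendv_check_before (const struct target_model *t, unsigned mode_size,
                      bool one_operand_p)
{
  if (one_operand_p)
    return false;
  if (t->sse4_1 && mode_size == 16)
    return true;
  return false;
}

/* Mode check after the patch: 32-byte modes with AVX2 pass as well.  */
static bool
pblendv_check_after (const struct target_model *t, unsigned mode_size,
                     bool one_operand_p)
{
  if (one_operand_p)
    return false;
  if (t->avx2 && mode_size == 32)
    return true;
  else if (t->sse4_1 && mode_size == 16)
    return true;
  return false;
}
```

Under this model the only behavioural difference is for two-operand permutations in 32-byte modes on an AVX2 target, which were rejected before and are now allowed to proceed; 16-byte handling is unchanged.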