Extend shift permutations on power of 2 cases

Message ID CAOvf_xwgDvfNkhLGmAy==02cvtfiCgEt-yCKKNPddiAym55Z7Q@mail.gmail.com
State New

Commit Message

Evgeny Stupachenko Nov. 28, 2014, 11:33 a.m. UTC
Hi,

After the fix for PR60451 (pack scheme instead of pshufb), general
power-of-2 permutations became more profitable.
So I'd like to revert the changes while keeping the algorithm alive, or
implement a new hook to pick the best algorithm/size for a particular -mtune.

The reverting patch is below. Bootstrap and make check passed.
Is it ok?

2014-11-28  Evgeny Stupachenko  <evstupac@gmail.com>

gcc/testsuite
        * gcc.target/i386/pr52252-atom-1.c: Delete.

gcc/
        * tree-vect-data-refs.c (vect_transform_grouped_load): Limit shift
        permutations to loads group of size 3.

On Wed, Nov 12, 2014 at 4:16 PM, Uros Bizjak <ubizjak@gmail.com> wrote:
> On Wed, Nov 12, 2014 at 2:15 PM, Evgeny Stupachenko <evstupac@gmail.com> wrote:
>> To avoid misunderstanding.
>> I haven't yet committed this obvious fix.
>> Is it ok?
>
> If it is obvious, then it doesn't need an approval.
>
> So, OK.
>
> Thanks,
> Uros.

Comments

Richard Biener Nov. 28, 2014, 12:05 p.m. UTC | #1
On Fri, Nov 28, 2014 at 12:33 PM, Evgeny Stupachenko <evstupac@gmail.com> wrote:
> Hi,
>
> After the fix for PR60451 (pack scheme instead of pshufb), general
> power-of-2 permutations became more profitable.
> So I'd like to revert the changes while keeping the algorithm alive, or
> implement a new hook to pick the best algorithm/size for a particular -mtune.
>
> The reverting patch is below. Bootstrap and make check passed.
> Is it ok?

Ok.

Thanks,
Richard.

> 2014-11-28  Evgeny Stupachenko  <evstupac@gmail.com>
>
> gcc/testsuite
>         * gcc.target/i386/pr52252-atom-1.c: Delete.
>
> gcc/
>         * tree-vect-data-refs.c (vect_transform_grouped_load): Limit shift
>         permutations to loads group of size 3.
>
> diff --git a/gcc/testsuite/gcc.target/i386/pr52252-atom-1.c b/gcc/testsuite/gcc.target/i386/pr52252-atom-1.c
> deleted file mode 100644
> index 020e983..0000000
> --- a/gcc/testsuite/gcc.target/i386/pr52252-atom-1.c
> +++ /dev/null
> @@ -1,22 +0,0 @@
> -/* { dg-do compile } */
> -/* { dg-require-effective-target ssse3 } */
> -/* { dg-options "-O2 -ftree-vectorize -mssse3 -mtune=slm" } */
> -#define byte unsigned char
> -
> -void
> -pair_mul_sum(byte *in, byte *out, int size)
> -{
> -  int j;
> -  for(j = 0; j < size; j++)
> -    {
> -      byte a = in[0];
> -      byte b = in[1];
> -      byte c = in[2];
> -      byte d = in[3];
> -      out[0] = (byte)(a * b) + (byte)(b * c) + (byte)(c * d) + (byte)(d * a);
> -      in += 4;
> -      out += 1;
> -    }
> -}
> -
> -/* { dg-final { scan-assembler "perm2i128|palignr" } } */
> diff --git a/gcc/tree-vect-data-refs.c b/gcc/tree-vect-data-refs.c
> index 35d0e0f..8451bda 100644
> --- a/gcc/tree-vect-data-refs.c
> +++ b/gcc/tree-vect-data-refs.c
> @@ -5617,6 +5617,7 @@ vect_transform_grouped_load (gimple stmt, vec<tree> dr_chain, int size,
>       get chain for loads group using vect_shift_permute_load_chain.  */
>    mode = TYPE_MODE (STMT_VINFO_VECTYPE (vinfo_for_stmt (stmt)));
>    if (targetm.sched.reassociation_width (VEC_PERM_EXPR, mode) > 1
> +      || exact_log2 (size) != -1
>        || !vect_shift_permute_load_chain (dr_chain, size, stmt,
>                                          gsi, &result_chain))
>      vect_permute_load_chain (dr_chain, size, stmt, gsi, &result_chain);
>
> On Wed, Nov 12, 2014 at 4:16 PM, Uros Bizjak <ubizjak@gmail.com> wrote:
>> On Wed, Nov 12, 2014 at 2:15 PM, Evgeny Stupachenko <evstupac@gmail.com> wrote:
>>> To avoid misunderstanding.
>>> I haven't yet committed this obvious fix.
>>> Is it ok?
>>
>> If it is obvious, then it doesn't need an approval.
>>
>> So, OK.
>>
>> Thanks,
>> Uros.

Patch

diff --git a/gcc/testsuite/gcc.target/i386/pr52252-atom-1.c b/gcc/testsuite/gcc.target/i386/pr52252-atom-1.c
deleted file mode 100644
index 020e983..0000000
--- a/gcc/testsuite/gcc.target/i386/pr52252-atom-1.c
+++ /dev/null
@@ -1,22 +0,0 @@ 
-/* { dg-do compile } */
-/* { dg-require-effective-target ssse3 } */
-/* { dg-options "-O2 -ftree-vectorize -mssse3 -mtune=slm" } */
-#define byte unsigned char
-
-void
-pair_mul_sum(byte *in, byte *out, int size)
-{
-  int j;
-  for(j = 0; j < size; j++)
-    {
-      byte a = in[0];
-      byte b = in[1];
-      byte c = in[2];
-      byte d = in[3];
-      out[0] = (byte)(a * b) + (byte)(b * c) + (byte)(c * d) + (byte)(d * a);
-      in += 4;
-      out += 1;
-    }
-}
-
-/* { dg-final { scan-assembler "perm2i128|palignr" } } */
diff --git a/gcc/tree-vect-data-refs.c b/gcc/tree-vect-data-refs.c
index 35d0e0f..8451bda 100644
--- a/gcc/tree-vect-data-refs.c
+++ b/gcc/tree-vect-data-refs.c
@@ -5617,6 +5617,7 @@ vect_transform_grouped_load (gimple stmt, vec<tree> dr_chain, int size,
      get chain for loads group using vect_shift_permute_load_chain.  */
   mode = TYPE_MODE (STMT_VINFO_VECTYPE (vinfo_for_stmt (stmt)));
   if (targetm.sched.reassociation_width (VEC_PERM_EXPR, mode) > 1
+      || exact_log2 (size) != -1
       || !vect_shift_permute_load_chain (dr_chain, size, stmt,
                                         gsi, &result_chain))
     vect_permute_load_chain (dr_chain, size, stmt, gsi, &result_chain);
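
Note on the added condition: the one-line change makes every power-of-2 group size take the generic vect_permute_load_chain path, so the shift-based permutation is only attempted for the remaining case, a load group of size 3. The following is a minimal standalone sketch of that selection logic, not the GCC code itself. The names exact_log2_sketch and uses_generic_permute are hypothetical and exist only for illustration. The helper assumes the usual exact_log2 contract (log2 of the argument when it is an exact power of two, -1 otherwise), and the flag shift_permute_ok stands in for the return value of vect_shift_permute_load_chain.

```c
/* Hypothetical stand-in for GCC's exact_log2: returns log2 (x) when x is
   an exact power of two, and -1 otherwise (including x == 0).  */
static int
exact_log2_sketch (unsigned long x)
{
  int log = 0;

  if (x == 0 || (x & (x - 1)) != 0)
    return -1;
  while (x >>= 1)
    log++;
  return log;
}

/* Hypothetical model of the patched condition in
   vect_transform_grouped_load.  Returns 1 when the generic
   vect_permute_load_chain path would be taken, 0 when the shift-based
   permutation is used.  REASSOC_WIDTH models the value of
   targetm.sched.reassociation_width (VEC_PERM_EXPR, mode);
   SHIFT_PERMUTE_OK models a successful vect_shift_permute_load_chain.
   SIZE is assumed to be positive.  */
static int
uses_generic_permute (int reassoc_width, int size, int shift_permute_ok)
{
  return reassoc_width > 1
	 || exact_log2_sketch ((unsigned long) size) != -1
	 || !shift_permute_ok;
}
```

Under these assumptions, uses_generic_permute (1, 3, 1) is 0 (shift permutation is used), while uses_generic_permute (1, 4, 1) is 1 (the group of four loads from the deleted pr52252-atom-1.c test now goes through the generic path). In the real function the || chain short-circuits, so vect_shift_permute_load_chain is not called at all when an earlier condition already holds.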