Vector shuffling

Message ID CABYV9SW_krtcEK_V21uaN7YgfRwU-Td+fGh7Q_-7Ha0VgHArSA@mail.gmail.com
State New

Commit Message

Artem Shinkarov Oct. 3, 2011, 11:04 p.m. UTC
On Mon, Oct 3, 2011 at 6:12 PM, Richard Henderson <rth@redhat.com> wrote:
> On 10/03/2011 09:43 AM, Artem Shinkarov wrote:
>> Hi, Richard
>>
>> There is a problem with the test cases of the patch you have committed
>> for me. The code in every test case is doubled. Could you please
>> apply the following patch? Otherwise all the tests from the
>> vector-shuffle patch would fail.
>
> Huh.  Dunno what happened there.  Fixed.
>
>> Also, if it is possible, could you change my name in the
>> ChangeLog from "Artem Shinkarov" to "Artjoms Sinkarovs"? The latter
>> is the way my name is spelled in my passport, and the name I use in
>> the ChangeLog.
>
> Fixed.
>
>
> r~
>

Richard, there was a problem causing a segfault in ix86_expand_vshuffle,
which I have fixed with the attached patch.

Another thing I cannot figure out is the following case:
#define vector(elcount, type)  \
__attribute__((vector_size((elcount)*sizeof(type)))) type

vector (8, short) __attribute__ ((noinline))
f (vector (8, short) x, vector (8, short) y, vector (8, short) mask) {
    return  __builtin_shuffle (x, y, mask);
}

int main (int argc, char *argv[]) {
    vector (8, short) v0 = {argc, 1,2,3,4,5,6,7};
    vector (8, short) v1 = {argc, 1,argc,3,4,5,argc,7};
    vector (8, short) mask0 = {0,2,3,1,4,5,6,7};
    vector (8, short) v2;
    int i;

    v2 = f (v0, v1,  mask0);
    /* v2 =  __builtin_shuffle (v0, v1, mask0); */
    for (i = 0; i < 8; i ++)
      __builtin_printf ("%i, ", v2[i]);

    return 0;
}

I am compiling with SSSE3 support; in my case the command is
./xgcc -B. b.c -O3 -mtune=core2 -march=core2

And I get 1, 1, 1, 3, 4, 5, 1, 7, as the output, which is wrong.

But if I call __builtin_shuffle directly, the answer is correct.
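
For reference, here is a minimal scalar sketch of what the two-operand
__builtin_shuffle should compute for this test case. It assumes the
semantics documented for GCC: the elements of the two inputs are numbered
0..15 across the concatenation {x, y}, and each mask value is taken modulo
16. The helper name shuffle2_ref is hypothetical and not part of GCC.

```c
#include <assert.h>
#include <stddef.h>

enum { NELT = 8 };  /* eight shorts per 16-byte vector, as in the test case */

/* Hypothetical scalar reference for the two-operand shuffle.
   Assumption: selectors index the concatenation {x, y} and are
   reduced modulo 2*NELT.  */
static void
shuffle2_ref (const short x[NELT], const short y[NELT],
              const short mask[NELT], short out[NELT])
{
  size_t i;

  assert (x && y && mask && out);

  for (i = 0; i < NELT; i++)
    {
      /* 2*NELT is a power of two, so a bitwise AND performs the
         modulo reduction and also copes with negative selectors.  */
      unsigned sel = (unsigned) mask[i] & (2 * NELT - 1);

      out[i] = sel < NELT ? x[sel] : y[sel - NELT];
    }
}
```

Assuming the program is run without arguments (argc == 1), this reference
gives 1, 2, 3, 1, 4, 5, 6, 7 for v0, v1 and mask0. The output reported above,
1, 1, 1, 3, 4, 5, 1, 7, is element for element the unshuffled v1 in that case.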

Any ideas?


Thanks,
Artem.

Comments

Artem Shinkarov Oct. 4, 2011, 3:18 p.m. UTC | #1
Ping.

Richard, the patch in the attachment should be submitted asap. The
other problem could wait for a while.

Thanks,
Artem.

Richard Henderson Oct. 4, 2011, 4:30 p.m. UTC | #2
On 10/04/2011 08:18 AM, Artem Shinkarov wrote:
> Ping.
> 
> Richard, the patch in the attachment should be submitted asap. The
> other problem could wait for a while.

The patch in the attachment is wrong too.  I've re-written the x86
backend support, adding TARGET_XOP in the process.  I've also re-written
the test cases so that they actually test what we wanted.

Patch to follow once testing is complete.


r~

Patch

Index: gcc/config/i386/i386.c
===================================================================
--- gcc/config/i386/i386.c	(revision 179464)
+++ gcc/config/i386/i386.c	(working copy)
@@ -19312,14 +19312,17 @@  ix86_expand_vshuffle (rtx operands[])
       xops[1] = operands[1];
       xops[2] = operands[2];
       xops[3] = gen_rtx_EQ (mode, mask, w_vector);
-      xops[4] = t1;
-      xops[5] = t2;
+      xops[4] = t2;
+      xops[5] = t1;
 
       return ix86_expand_int_vcond (xops);
     }
 
-  /* mask = mask * {w, w, ...}  */
-  new_mask = expand_simple_binop (maskmode, MULT, new_mask, w_vector,
+  /* mask = mask * {16/w, 16/w, ...}  */
+  for (i = 0; i < w; i++)
+    vec[i] = GEN_INT (16/w);
+  vt = gen_rtx_CONST_VECTOR (maskmode, gen_rtvec_v (w, vec));
+  new_mask = expand_simple_binop (maskmode, MULT, new_mask, vt,
 				  NULL_RTX, 0, OPTAB_DIRECT);
 
   /* Convert mask to vector of chars.  */
@@ -19332,7 +19335,7 @@  ix86_expand_vshuffle (rtx operands[])
      ...  */
   for (i = 0; i < w; i++)
     for (j = 0; j < 16/w; j++)
-      vec[i*w+j] = GEN_INT (i*16/w);
+      vec[i*(16/w)+j] = GEN_INT (i*16/w);
   vt = gen_rtx_CONST_VECTOR (V16QImode, gen_rtvec_v (16, vec));
   vt = force_reg (V16QImode, vt);
 
@@ -19344,7 +19347,7 @@  ix86_expand_vshuffle (rtx operands[])
      new_mask = new_mask + {0,1,..,16/w, 0,1,..,16/w, ...}  */
   for (i = 0; i < w; i++)
     for (j = 0; j < 16/w; j++)
-      vec[i*w+j] = GEN_INT (j);
+      vec[i*(16/w)+j] = GEN_INT (j);
 
   vt = gen_rtx_CONST_VECTOR (V16QImode, gen_rtvec_v (16, vec));
   new_mask = expand_simple_binop (V16QImode, PLUS, new_mask, vt,
@@ -19386,8 +19389,8 @@  ix86_expand_vshuffle (rtx operands[])
       xops[1] = operands[1];
       xops[2] = operands[2];
       xops[3] = gen_rtx_EQ (mode, mask, w_vector);
-      xops[4] = t1;
-      xops[5] = t2;
+      xops[4] = t2;
+      xops[5] = t1;
 
       return ix86_expand_int_vcond (xops);
     }
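
The indexing change in the hunks above (vec[i*w+j] becoming
vec[i*(16/w)+j]) can be illustrated outside the compiler. The sketch below
expands an element-level mask for w elements of a 16-byte vector into a
byte-level control of the kind pshufb consumes, and applies it with a plain
byte permutation. It covers only the one-operand case, the helper names
expand_mask_to_bytes and byte_shuffle are hypothetical, and it is not the
rewritten backend code mentioned in comment #2.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: turn a mask over W elements into a 16-byte
   control.  Byte j of output element i selects byte j of input
   element mask[i].  */
static void
expand_mask_to_bytes (const unsigned char *mask, int w,
                      unsigned char ctrl[16])
{
  int bytes, i, j;

  assert (w > 0 && 16 % w == 0);
  bytes = 16 / w;  /* size of one element in bytes */

  for (i = 0; i < w; i++)
    for (j = 0; j < bytes; j++)
      /* Index by the element size (16/w), not the element count (w):
         i*w+j would reach 57 for w == 8 and overrun a 16-entry array.  */
      ctrl[i * bytes + j] = (unsigned char) ((mask[i] % w) * bytes + j);
}

/* Byte permutation in the style of pshufb.  Assumption: the
   instruction's zeroing behaviour for control bytes with the high
   bit set is ignored here.  */
static void
byte_shuffle (const unsigned char src[16], const unsigned char ctrl[16],
              unsigned char dst[16])
{
  unsigned char tmp[16];
  int k;

  for (k = 0; k < 16; k++)
    tmp[k] = src[ctrl[k] & 15];

  /* Copy through a temporary so that dst may alias src.  */
  memcpy (dst, tmp, sizeof tmp);
}
```

For w == 8 and the mask {0,2,3,1,4,5,6,7}, the control starts 0,1, 4,5, 6,7,
2,3, so whole 2-byte elements move together and the result does not depend
on byte order.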