Patchwork PATCH: PR target/46051: [4.6 Regression] ICE: in extract_insn, at recog.c:2110 with -mavx -ftree-vectorize

login
register
mail settings
Submitter H.J. Lu
Date Oct. 16, 2010, 11:07 p.m.
Message ID <AANLkTinSJXcETZoqPs1fxDbh6bGDhrFozSFuZcyUmKG-@mail.gmail.com>
Download mbox | patch
Permalink /patch/68056/
State New
Headers show

Comments

H.J. Lu - Oct. 16, 2010, 11:07 p.m.
On Tue, Oct 12, 2010 at 9:57 AM, Richard Henderson <rth@redhat.com> wrote:
> On 10/11/2010 06:49 PM, H.J. Lu wrote:
>>  (define_expand "storent<mode>"
>> +  [(set (match_operand:AVX256MODEF2P 0 "memory_operand" "")
>> +     (unspec:AVX256MODEF2P
>> +       [(match_operand:AVX256MODEF2P 1 "register_operand" "")]
>> +       UNSPEC_MOVNT))]
>> +  "AVX256_VEC_FLOAT_MODE_P (<MODE>mode)"
>> +  "")
>> +
>> +(define_expand "storent<mode>"
>>    [(set (match_operand:SSEMODEF2P 0 "memory_operand" "")
>>       (unspec:SSEMODEF2P
>>         [(match_operand:SSEMODEF2P 1 "register_operand" "")]
>
> Do not duplicate all of these patterns.  Instead create a
> mode iterator that only supplies the AVX modes when AVX is
> enabled.
>
>> +(define_insn "*vec_concat_lo_<mode>_avx"
>> +(define_insn "*vec_concat_hi_<mode>_avx"
>> +(define_insn "*vec_concat_lo_<mode>_avx"
>> +(define_insn "*vec_concat_hi_<mode>_avx"
>
> Why are you doing this as separate patterns like this?
> Canonically, the select goes on the outside of the concat,
>
>  (vec_select (vec_concat x y) (parallel ...))
>
> instead of how you're doing it here.
>
> Indeed, putting the vec_select on the outside would allow
> a match_parallel to describe the full permutation of this
> insn.  You're only considering half of the cases here.
>

Here is the reason where those patterns are used. OK for
trunk?

Thanks.
Uros Bizjak - Oct. 17, 2010, 8:34 a.m.
On Sat, 2010-10-16 at 16:07 -0700, H.J. Lu wrote:
> On Tue, Oct 12, 2010 at 9:57 AM, Richard Henderson <rth@redhat.com> wrote:
> > On 10/11/2010 06:49 PM, H.J. Lu wrote:
> >>  (define_expand "storent<mode>"
> >> +  [(set (match_operand:AVX256MODEF2P 0 "memory_operand" "")
> >> +     (unspec:AVX256MODEF2P
> >> +       [(match_operand:AVX256MODEF2P 1 "register_operand" "")]
> >> +       UNSPEC_MOVNT))]
> >> +  "AVX256_VEC_FLOAT_MODE_P (<MODE>mode)"
> >> +  "")
> >> +
> >> +(define_expand "storent<mode>"
> >>    [(set (match_operand:SSEMODEF2P 0 "memory_operand" "")
> >>       (unspec:SSEMODEF2P
> >>         [(match_operand:SSEMODEF2P 1 "register_operand" "")]
> >
> > Do not duplicate all of these patterns.  Instead create a
> > mode iterator that only supplies the AVX modes when AVX is
> > enabled.
> >
> >> +(define_insn "*vec_concat_lo_<mode>_avx"
> >> +(define_insn "*vec_concat_hi_<mode>_avx"
> >> +(define_insn "*vec_concat_lo_<mode>_avx"
> >> +(define_insn "*vec_concat_hi_<mode>_avx"
> >
> > Why are you doing this as separate patterns like this?
> > Canonically, the select goes on the outside of the concat,
> >
> >  (vec_select (vec_concat x y) (parallel ...))
> >
> > instead of how you're doing it here.
> >
> > Indeed, putting the vec_select on the outside would allow
> > a match_parallel to describe the full permutation of this
> > insn.  You're only considering half of the cases here.

In the attached patch, you still use vec_concat before vec_select.
However, I think that new insn patterns are not needed at all, since
*avx_vperm2f128<mode>_nozero should handle all these RTXes.

Please fix the expanders that generate wrong patterns instead, i.e.
vec_interleave_lowv4df.

Uros.

Patch

diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index db5e4de..6d0ab60 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -12311,3 +12311,67 @@ 
   [(set_attr "type" "ssecvt")
    (set_attr "prefix" "vex")
    (set_attr "mode" "V8SF")])
+
+(define_insn "*vec_concat_lo_<mode>_avx"
+  [(set (match_operand:AVX256MODE4P 0 "register_operand" "=x")
+	(vec_concat:AVX256MODE4P
+	  (vec_select:<avxhalfvecmode>
+	    (match_operand:AVX256MODE4P 1 "register_operand" "x")
+	    (parallel [(const_int 0) (const_int 1)]))
+	  (vec_select:<avxhalfvecmode>
+	    (match_operand:AVX256MODE4P 2 "nonimmediate_operand" "xm")
+	    (parallel [(const_int 0) (const_int 1)]))))]
+  "TARGET_AVX"
+  "vperm2f128\t{$0x20, %2, %1, %0|%0, %1, %2, 0x20}"
+  [(set_attr "type" "sselog")
+   (set_attr "prefix" "vex")
+   (set_attr "mode" "V8SF")])
+
+(define_insn "*vec_concat_hi_<mode>_avx"
+  [(set (match_operand:AVX256MODE4P 0 "register_operand" "=x")
+	(vec_concat:AVX256MODE4P
+	  (vec_select:<avxhalfvecmode>
+	    (match_operand:AVX256MODE4P 1 "register_operand" "x")
+	    (parallel [(const_int 2) (const_int 3)]))
+	  (vec_select:<avxhalfvecmode>
+	    (match_operand:AVX256MODE4P 2 "nonimmediate_operand" "xm")
+	    (parallel [(const_int 2) (const_int 3)]))))]
+  "TARGET_AVX"
+  "vperm2f128\t{$0x31, %2, %1, %0|%0, %1, %2, 0x31}"
+  [(set_attr "type" "sselog")
+   (set_attr "prefix" "vex")
+   (set_attr "mode" "V8SF")])
+
+(define_insn "*vec_concat_lo_<mode>_avx"
+  [(set (match_operand:AVX256MODE8P 0 "register_operand" "=x")
+	(vec_concat:AVX256MODE8P
+	  (vec_select:<avxhalfvecmode>
+	    (match_operand:AVX256MODE8P 1 "register_operand" "x")
+	    (parallel [(const_int 0) (const_int 1)
+		       (const_int 2) (const_int 3)]))
+	  (vec_select:<avxhalfvecmode>
+	    (match_operand:AVX256MODE8P 2 "nonimmediate_operand" "xm")
+	    (parallel [(const_int 0) (const_int 1)
+		       (const_int 2) (const_int 3)]))))]
+  "TARGET_AVX"
+  "vperm2f128\t{$0x20, %2, %1, %0|%0, %1, %2, 0x20}"
+  [(set_attr "type" "sselog")
+   (set_attr "prefix" "vex")
+   (set_attr "mode" "V8SF")])
+
+(define_insn "*vec_concat_hi_<mode>_avx"
+  [(set (match_operand:AVX256MODE8P 0 "register_operand" "=x")
+	(vec_concat:AVX256MODE8P
+	  (vec_select:<avxhalfvecmode>
+	    (match_operand:AVX256MODE8P 1 "register_operand" "x")
+	    (parallel [(const_int 4) (const_int 5)
+		       (const_int 6) (const_int 7)]))
+	  (vec_select:<avxhalfvecmode>
+	    (match_operand:AVX256MODE8P 2 "nonimmediate_operand" "xm")
+	    (parallel [(const_int 4) (const_int 5)
+		       (const_int 6) (const_int 7)]))))]
+  "TARGET_AVX"
+  "vperm2f128\t{$0x31, %2, %1, %0|%0, %1, %2, 00x31}"
+  [(set_attr "type" "sselog")
+   (set_attr "prefix" "vex")
+   (set_attr "mode" "V8SF")])
--- /dev/null	2010-10-13 10:55:58.381855970 -0700
+++ gcc/gcc/testsuite/gcc.target/i386/pr46051.c	2010-10-16 16:02:16.336909362 -0700
@@ -0,0 +1,18 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2 -ftree-vectorize -mavx -mtune=generic" } */
+
+double val1[4][2], val2[4][2], chk[4][2];
+
+void
+foo (void)
+{
+  int i, j;
+  for (i = 0; i < 4; i++)
+    {
+      double tmp = 0;
+      for (j = 0; j < 2; j++)
+	tmp += val1[i][j] * val2[i][j];
+      for (j = 0; j < 2; j++)
+	chk[i][j] = tmp;
+    }
+}