diff mbox

PR64387

Message ID 20150114111823.GA66571@msticlxl7.ims.intel.com
State New
Headers show

Commit Message

Ilya Tocar Jan. 14, 2015, 11:18 a.m. UTC
Hi,

This patch fixes https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64387
Which was caused by different in predicates between vec_unpacks_hi
and vec_extract_hi.
Ok for trunk?
ChangeLog:

gcc/
	PR target/64387
	* config/i386/sse.md (vec_unpacks_hi_v8sf): Fix predicate.
	(vec_unpacks_hi_v16sf): Ditto.

testsuite/
	PR target/64387
	* gcc.target/i386/pr64387.c: New test.
---
 gcc/config/i386/sse.md                  |  4 ++--
 gcc/testsuite/gcc.target/i386/pr64387.c | 15 +++++++++++++++
 2 files changed, 17 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr64387.c

Comments

Uros Bizjak Jan. 14, 2015, 11:36 a.m. UTC | #1
On Wed, Jan 14, 2015 at 12:18 PM, Ilya Tocar <tocarip.intel@gmail.com> wrote:
> Hi,
>
> This patch fixes https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64387
> Which was caused by different in predicates between vec_unpacks_hi
> and vec_extract_hi.

Why are vec_unpacks_hi_{v8sf,v16sf} expanders different than
vec_unpacks_hi_v4sf? I think that these should all be expand in the
same way, similar to vec_unpacks_hi_v4sf.

Uros.
Ilya Tocar Jan. 14, 2015, 12:14 p.m. UTC | #2
On 14 Jan 12:36, Uros Bizjak wrote:
> On Wed, Jan 14, 2015 at 12:18 PM, Ilya Tocar <tocarip.intel@gmail.com> wrote:
> > Hi,
> >
> > This patch fixes https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64387
> > Which was caused by different in predicates between vec_unpacks_hi
> > and vec_extract_hi.
> 
> Why are vec_unpacks_hi_{v8sf,v16sf} expanders different than
> vec_unpacks_hi_v4sf? I think that these should all be expand in the
> same way, similar to vec_unpacks_hi_v4sf.
>
In v4sf case we use movhlps, which is not avalible in v{8,16}sf case.
Uros Bizjak Jan. 14, 2015, 12:22 p.m. UTC | #3
On Wed, Jan 14, 2015 at 1:14 PM, Ilya Tocar <tocarip.intel@gmail.com> wrote:
> On 14 Jan 12:36, Uros Bizjak wrote:
>> On Wed, Jan 14, 2015 at 12:18 PM, Ilya Tocar <tocarip.intel@gmail.com> wrote:
>> > Hi,
>> >
>> > This patch fixes https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64387
>> > Which was caused by different in predicates between vec_unpacks_hi
>> > and vec_extract_hi.
>>
>> Why are vec_unpacks_hi_{v8sf,v16sf} expanders different than
>> vec_unpacks_hi_v4sf? I think that these should all be expand in the
>> same way, similar to vec_unpacks_hi_v4sf.
>>
> In v4sf case we use movhlps, which is not avalible in v{8,16}sf case.

I see. We are generating vextract<something> here, that has
"register_operand" for its operand 1 constraint.

Patch is OK.

Thanks,
Uros.
Ilya Tocar Feb. 4, 2015, 11:05 a.m. UTC | #4
I think that fix for avx2 part should be backported to 4.8/4.9
What do you think?

On 14 Jan 14:18, Ilya Tocar wrote:
> Hi,
> 
> This patch fixes https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64387
> Which was caused by different in predicates between vec_unpacks_hi
> and vec_extract_hi.
> Ok for trunk?
> ChangeLog:
> 
> gcc/
> 	PR target/64387
> 	* config/i386/sse.md (vec_unpacks_hi_v8sf): Fix predicate.
> 	(vec_unpacks_hi_v16sf): Ditto.
> 
> testsuite/
> 	PR target/64387
> 	* gcc.target/i386/pr64387.c: New test.
> ---
>  gcc/config/i386/sse.md                  |  4 ++--
>  gcc/testsuite/gcc.target/i386/pr64387.c | 15 +++++++++++++++
>  2 files changed, 17 insertions(+), 2 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr64387.c
> 
> diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
> index 3b9108c..cd4af4e 100644
> --- a/gcc/config/i386/sse.md
> +++ b/gcc/config/i386/sse.md
> @@ -5078,7 +5078,7 @@
>  (define_expand "vec_unpacks_hi_v8sf"
>    [(set (match_dup 2)
>  	(vec_select:V4SF
> -	  (match_operand:V8SF 1 "nonimmediate_operand")
> +	  (match_operand:V8SF 1 "register_operand")
>  	  (parallel [(const_int 4) (const_int 5)
>  		     (const_int 6) (const_int 7)])))
>     (set (match_operand:V4DF 0 "register_operand")
> @@ -5090,7 +5090,7 @@
>  (define_expand "vec_unpacks_hi_v16sf"
>    [(set (match_dup 2)
>  	(vec_select:V8SF
> -	  (match_operand:V16SF 1 "nonimmediate_operand")
> +	  (match_operand:V16SF 1 "register_operand")
>  	  (parallel [(const_int 8) (const_int 9)
>  		     (const_int 10) (const_int 11)
>  		     (const_int 12) (const_int 13)
> diff --git a/gcc/testsuite/gcc.target/i386/pr64387.c b/gcc/testsuite/gcc.target/i386/pr64387.c
> new file mode 100644
> index 0000000..dd38142
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/pr64387.c
> @@ -0,0 +1,15 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -ftree-vectorize -ffloat-store -mavx512er" } */
> +
> +float x[256];
> +
> +double *
> +foo (void)
> +{
> +  double *z = __builtin_malloc (sizeof (double) * 256);
> +  int i;
> +  for (i = 0; i < 256; ++i)
> +    z[i] = x[i] + 1.0f;
> +  foo ();
> +  return 0;
> +}
> -- 
> 1.8.3.1
>
Uros Bizjak Feb. 4, 2015, 1:11 p.m. UTC | #5
On Wed, Feb 4, 2015 at 12:05 PM, Ilya Tocar <tocarip.intel@gmail.com> wrote:
> I think that fix for avx2 part should be backported to 4.8/4.9
> What do you think?

OK also for branches.

Thanks,
Uros.
diff mbox

Patch

diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index 3b9108c..cd4af4e 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -5078,7 +5078,7 @@ 
 (define_expand "vec_unpacks_hi_v8sf"
   [(set (match_dup 2)
 	(vec_select:V4SF
-	  (match_operand:V8SF 1 "nonimmediate_operand")
+	  (match_operand:V8SF 1 "register_operand")
 	  (parallel [(const_int 4) (const_int 5)
 		     (const_int 6) (const_int 7)])))
    (set (match_operand:V4DF 0 "register_operand")
@@ -5090,7 +5090,7 @@ 
 (define_expand "vec_unpacks_hi_v16sf"
   [(set (match_dup 2)
 	(vec_select:V8SF
-	  (match_operand:V16SF 1 "nonimmediate_operand")
+	  (match_operand:V16SF 1 "register_operand")
 	  (parallel [(const_int 8) (const_int 9)
 		     (const_int 10) (const_int 11)
 		     (const_int 12) (const_int 13)
diff --git a/gcc/testsuite/gcc.target/i386/pr64387.c b/gcc/testsuite/gcc.target/i386/pr64387.c
new file mode 100644
index 0000000..dd38142
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr64387.c
@@ -0,0 +1,15 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2 -ftree-vectorize -ffloat-store -mavx512er" } */
+
+float x[256];
+
+double *
+foo (void)
+{
+  double *z = __builtin_malloc (sizeof (double) * 256);
+  int i;
+  for (i = 0; i < 256; ++i)
+    z[i] = x[i] + 1.0f;
+  foo ();
+  return 0;
+}