diff mbox

[PR,tree-optimization/70252] Fix boolean vectors conversion

Message ID 20160317120202.GB2050@msticlxl57.ims.intel.com
State New
Headers show

Commit Message

Ilya Enkovich March 17, 2016, 12:02 p.m. UTC
Hi,

Current widening and narrowing vectorization functions may work
incorrectly for scalar masks because we may have different boolean
vector types having the same mode.  E.g. vec<bool>(4) and vec<bool>(8)
both have QImode.  That means if we need to convert vec<bool>(4) into
vec<bool>(16) we may actually find QImode->HImode conversion optab
and try to use it which is incorrect because this optab entry is
used for vec<bool>(8) to vec<bool>(16) conversion.

I suppose the best fix for GCC 6 is to just catch and disable such
conversion by checking number of vetor elements.  It doesn't disable
any vectorization because we don't have any conversion patterns
for vec<bool>(4) anyway.

It's not clear what to do for GCC 7 though to enable such conversions.
It looks like for AVX-512 we have three boolean vectors sharing the
same QImode: vec<bool>(2), vec<bool>(4) and vec<bool>(8).  It means
we can't use optabs to check operations on these vectors even
using conversion optabs instead of direct ones.  Can we use half/quarter
byte modes for such masks or something like that?  Another option is to
handle their conversion separately with no optabs usage at all (target hooks?).

The patch was bootstrapped and regtested on x86_64-unknown-linux-gnu.
OK for trunk?

Thanks,
Ilya
--
gcc/

2016-03-17  Ilya Enkovich  <enkovich.gnu@gmail.com>

	PR tree-optimization/70252
	* tree-vect-stmts.c (supportable_widening_operation): Check resulting
	boolean vector has a proper number of elements.
	(supportable_narrowing_operation): Likewise.

gcc/testsuite/

2016-03-17  Ilya Enkovich  <enkovich.gnu@gmail.com>

	PR tree-optimization/70252
	* gcc.dg/pr70252.c: New test.

Comments

Richard Biener March 17, 2016, 1:16 p.m. UTC | #1
On Thu, Mar 17, 2016 at 1:02 PM, Ilya Enkovich <enkovich.gnu@gmail.com> wrote:
> Hi,
>
> Current widening and narrowing vectorization functions may work
> incorrectly for scalar masks because we may have different boolean
> vector types having the same mode.  E.g. vec<bool>(4) and vec<bool>(8)
> both have QImode.  That means if we need to convert vec<bool>(4) into
> vec<bool>(16) we may actually find QImode->HImode conversion optab
> and try to use it which is incorrect because this optab entry is
> used for vec<bool>(8) to vec<bool>(16) conversion.
>
> I suppose the best fix for GCC 6 is to just catch and disable such
> conversion by checking number of vetor elements.  It doesn't disable
> any vectorization because we don't have any conversion patterns
> for vec<bool>(4) anyway.
>
> It's not clear what to do for GCC 7 though to enable such conversions.
> It looks like for AVX-512 we have three boolean vectors sharing the
> same QImode: vec<bool>(2), vec<bool>(4) and vec<bool>(8).  It means
> we can't use optabs to check operations on these vectors even
> using conversion optabs instead of direct ones.  Can we use half/quarter
> byte modes for such masks or something like that?  Another option is to
> handle their conversion separately with no optabs usage at all (target hooks?).
>
> The patch was bootstrapped and regtested on x86_64-unknown-linux-gnu.
> OK for trunk?

Ok.

Richard.

> Thanks,
> Ilya
> --
> gcc/
>
> 2016-03-17  Ilya Enkovich  <enkovich.gnu@gmail.com>
>
>         PR tree-optimization/70252
>         * tree-vect-stmts.c (supportable_widening_operation): Check resulting
>         boolean vector has a proper number of elements.
>         (supportable_narrowing_operation): Likewise.
>
> gcc/testsuite/
>
> 2016-03-17  Ilya Enkovich  <enkovich.gnu@gmail.com>
>
>         PR tree-optimization/70252
>         * gcc.dg/pr70252.c: New test.
>
>
> diff --git a/gcc/testsuite/gcc.dg/pr70252.c b/gcc/testsuite/gcc.dg/pr70252.c
> new file mode 100644
> index 0000000..209e691
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/pr70252.c
> @@ -0,0 +1,16 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O3" } */
> +/* { dg-additional-options "-march=skylake-avx512" { target { i?86-*-* x86_64-*-* } } } */
> +
> +extern unsigned char a [150];
> +extern unsigned char b [150];
> +extern unsigned char c [150];
> +extern unsigned char d [150];
> +extern unsigned char e [150];
> +
> +void foo () {
> +  for (int i = 92; i <= 141; i += 2) {
> +    int tmp = (d [i] && b [i]) <= (a [i] > c [i]);
> +    e [i] = tmp >> b [i];
> +  }
> +}
> diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c
> index 06b1ab7..d12c062 100644
> --- a/gcc/tree-vect-stmts.c
> +++ b/gcc/tree-vect-stmts.c
> @@ -8940,7 +8940,12 @@ supportable_widening_operation (enum tree_code code, gimple *stmt,
>
>    if (insn_data[icode1].operand[0].mode == TYPE_MODE (wide_vectype)
>        && insn_data[icode2].operand[0].mode == TYPE_MODE (wide_vectype))
> -    return true;
> +      /* For scalar masks we may have different boolean
> +        vector types having the same QImode.  Thus we
> +        add additional check for elements number.  */
> +    return (!VECTOR_BOOLEAN_TYPE_P (vectype)
> +           || (TYPE_VECTOR_SUBPARTS (vectype) / 2
> +               == TYPE_VECTOR_SUBPARTS (wide_vectype)));
>
>    /* Check if it's a multi-step conversion that can be done using intermediate
>       types.  */
> @@ -8991,7 +8996,9 @@ supportable_widening_operation (enum tree_code code, gimple *stmt,
>
>        if (insn_data[icode1].operand[0].mode == TYPE_MODE (wide_vectype)
>           && insn_data[icode2].operand[0].mode == TYPE_MODE (wide_vectype))
> -       return true;
> +       return (!VECTOR_BOOLEAN_TYPE_P (vectype)
> +               || (TYPE_VECTOR_SUBPARTS (intermediate_type) / 2
> +                   == TYPE_VECTOR_SUBPARTS (wide_vectype)));
>
>        prev_type = intermediate_type;
>        prev_mode = intermediate_mode;
> @@ -9075,7 +9082,12 @@ supportable_narrowing_operation (enum tree_code code,
>    *code1 = c1;
>
>    if (insn_data[icode1].operand[0].mode == TYPE_MODE (narrow_vectype))
> -    return true;
> +    /* For scalar masks we may have different boolean
> +       vector types having the same QImode.  Thus we
> +       add additional check for elements number.  */
> +    return (!VECTOR_BOOLEAN_TYPE_P (vectype)
> +           || (TYPE_VECTOR_SUBPARTS (vectype) * 2
> +               == TYPE_VECTOR_SUBPARTS (narrow_vectype)));
>
>    /* Check if it's a multi-step conversion that can be done using intermediate
>       types.  */
> @@ -9140,7 +9152,9 @@ supportable_narrowing_operation (enum tree_code code,
>        (*multi_step_cvt)++;
>
>        if (insn_data[icode1].operand[0].mode == TYPE_MODE (narrow_vectype))
> -       return true;
> +       return (!VECTOR_BOOLEAN_TYPE_P (vectype)
> +               || (TYPE_VECTOR_SUBPARTS (intermediate_type) * 2
> +                   == TYPE_VECTOR_SUBPARTS (narrow_vectype)));
>
>        prev_mode = intermediate_mode;
>        prev_type = intermediate_type;
diff mbox

Patch

diff --git a/gcc/testsuite/gcc.dg/pr70252.c b/gcc/testsuite/gcc.dg/pr70252.c
new file mode 100644
index 0000000..209e691
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr70252.c
@@ -0,0 +1,16 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O3" } */
+/* { dg-additional-options "-march=skylake-avx512" { target { i?86-*-* x86_64-*-* } } } */
+
+extern unsigned char a [150];
+extern unsigned char b [150];
+extern unsigned char c [150];
+extern unsigned char d [150];
+extern unsigned char e [150];
+
+void foo () {
+  for (int i = 92; i <= 141; i += 2) {
+    int tmp = (d [i] && b [i]) <= (a [i] > c [i]);
+    e [i] = tmp >> b [i];
+  }
+}
diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c
index 06b1ab7..d12c062 100644
--- a/gcc/tree-vect-stmts.c
+++ b/gcc/tree-vect-stmts.c
@@ -8940,7 +8940,12 @@  supportable_widening_operation (enum tree_code code, gimple *stmt,
 
   if (insn_data[icode1].operand[0].mode == TYPE_MODE (wide_vectype)
       && insn_data[icode2].operand[0].mode == TYPE_MODE (wide_vectype))
-    return true;
+      /* For scalar masks we may have different boolean
+	 vector types having the same QImode.  Thus we
+	 add additional check for elements number.  */
+    return (!VECTOR_BOOLEAN_TYPE_P (vectype)
+	    || (TYPE_VECTOR_SUBPARTS (vectype) / 2
+		== TYPE_VECTOR_SUBPARTS (wide_vectype)));
 
   /* Check if it's a multi-step conversion that can be done using intermediate
      types.  */
@@ -8991,7 +8996,9 @@  supportable_widening_operation (enum tree_code code, gimple *stmt,
 
       if (insn_data[icode1].operand[0].mode == TYPE_MODE (wide_vectype)
 	  && insn_data[icode2].operand[0].mode == TYPE_MODE (wide_vectype))
-	return true;
+	return (!VECTOR_BOOLEAN_TYPE_P (vectype)
+		|| (TYPE_VECTOR_SUBPARTS (intermediate_type) / 2
+		    == TYPE_VECTOR_SUBPARTS (wide_vectype)));
 
       prev_type = intermediate_type;
       prev_mode = intermediate_mode;
@@ -9075,7 +9082,12 @@  supportable_narrowing_operation (enum tree_code code,
   *code1 = c1;
 
   if (insn_data[icode1].operand[0].mode == TYPE_MODE (narrow_vectype))
-    return true;
+    /* For scalar masks we may have different boolean
+       vector types having the same QImode.  Thus we
+       add additional check for elements number.  */
+    return (!VECTOR_BOOLEAN_TYPE_P (vectype)
+	    || (TYPE_VECTOR_SUBPARTS (vectype) * 2
+		== TYPE_VECTOR_SUBPARTS (narrow_vectype)));
 
   /* Check if it's a multi-step conversion that can be done using intermediate
      types.  */
@@ -9140,7 +9152,9 @@  supportable_narrowing_operation (enum tree_code code,
       (*multi_step_cvt)++;
 
       if (insn_data[icode1].operand[0].mode == TYPE_MODE (narrow_vectype))
-	return true;
+	return (!VECTOR_BOOLEAN_TYPE_P (vectype)
+		|| (TYPE_VECTOR_SUBPARTS (intermediate_type) * 2
+		    == TYPE_VECTOR_SUBPARTS (narrow_vectype)));
 
       prev_mode = intermediate_mode;
       prev_type = intermediate_type;