diff mbox series

i386: Fix -mavx -mno-mavx2 ICE with VEC_COND_EXPR [PR93637]

Message ID 20200210143247.GQ17695@tucnak
State New
Headers show
Series i386: Fix -mavx -mno-mavx2 ICE with VEC_COND_EXPR [PR93637] | expand

Commit Message

Jakub Jelinek Feb. 10, 2020, 2:32 p.m. UTC
Hi!

As mentioned in the PR, for -mavx -mno-avx2 the backend does support
vcondv4div4df and vcondv8siv8sf optabs (while generally 32-byte vectors
aren't much supported in that case, it is performed using
vandps/vandnps/vorps).  The problem is that after the last generic vector
lowering (where the VEC_COND_EXPR still compares two V4DF vectors and
has two V4DI last operands and V4DI result and so is considered ok) fre4
folds the condition into constant, at which point the middle-end during
expansion will try vcond_mask_optab and fall back to trying to expand it
as the constant vector < 0 vcondv4div4di, but neither of them is supported
for -mavx -mno-avx2 and thus we ICE.

So, the options I see is either what the following patch does, also support
vcond_mask_v4div4di and vcond_mask_v4siv4si already for TARGET_AVX, or
require for vcondv4div4df and vcondv8siv8sf TARGET_AVX2 rather than current
TARGET_AVX.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
Or do you prefer the disabling of the vcond patterns instead?

2020-02-10  Jakub Jelinek  <jakub@redhat.com>

	PR target/93637
	* config/i386/sse.md (VI_256_AVX2): New mode iterator.
	(vcond_mask_<mode><sseintvecmodelower>): Use it instead of VI_256.
	Change condition from TARGET_AVX2 to TARGET_AVX.

	* gcc.target/i386/avx-pr93637.c: New test.


	Jakub

Comments

Richard Biener Feb. 10, 2020, 2:40 p.m. UTC | #1
On Mon, Feb 10, 2020 at 3:33 PM Jakub Jelinek <jakub@redhat.com> wrote:
>
> Hi!
>
> As mentioned in the PR, for -mavx -mno-avx2 the backend does support
> vcondv4div4df and vcondv8siv8sf optabs (while generally 32-byte vectors
> aren't much supported in that case, it is performed using
> vandps/vandnps/vorps).  The problem is that after the last generic vector
> lowering (where the VEC_COND_EXPR still compares two V4DF vectors and
> has two V4DI last operands and V4DI result and so is considered ok) fre4
> folds the condition into constant, at which point the middle-end during
> expansion will try vcond_mask_optab and fall back to trying to expand it
> as the constant vector < 0 vcondv4div4di, but neither of them is supported
> for -mavx -mno-avx2 and thus we ICE.

Hmm.  Maybe with FP operands we can also try to implement the mask
as != 0.0 FP condition?  Not sure if -1 (or 1) is enough non-NaNish to
not cause problems of course.

> So, the options I see is either what the following patch does, also support
> vcond_mask_v4div4di and vcond_mask_v4siv4si already for TARGET_AVX, or
> require for vcondv4div4df and vcondv8siv8sf TARGET_AVX2 rather than current
> TARGET_AVX.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> Or do you prefer the disabling of the vcond patterns instead?
>
> 2020-02-10  Jakub Jelinek  <jakub@redhat.com>
>
>         PR target/93637
>         * config/i386/sse.md (VI_256_AVX2): New mode iterator.
>         (vcond_mask_<mode><sseintvecmodelower>): Use it instead of VI_256.
>         Change condition from TARGET_AVX2 to TARGET_AVX.
>
>         * gcc.target/i386/avx-pr93637.c: New test.
>
> --- gcc/config/i386/sse.md.jj   2020-02-10 13:14:02.970131692 +0100
> +++ gcc/config/i386/sse.md      2020-02-10 13:15:54.343473253 +0100
> @@ -3430,13 +3430,19 @@ (define_expand "vcond_mask_<mode><avx512
>           (match_operand:<avx512fmaskmode> 3 "register_operand")))]
>    "TARGET_AVX512BW")
>
> +;; As vcondv4div4df and vcondv8siv8sf are enabled already with TARGET_AVX,
> +;; and their condition can be folded late into a constant, we need to
> +;; support vcond_mask_v4div4di and vcond_mask_v8siv8si for TARGET_AVX.
> +(define_mode_iterator VI_256_AVX2 [(V32QI "TARGET_AVX2") (V16HI "TARGET_AVX2")
> +                                  V8SI V4DI])
> +
>  (define_expand "vcond_mask_<mode><sseintvecmodelower>"
> -  [(set (match_operand:VI_256 0 "register_operand")
> -       (vec_merge:VI_256
> -         (match_operand:VI_256 1 "nonimmediate_operand")
> -         (match_operand:VI_256 2 "nonimm_or_0_operand")
> +  [(set (match_operand:VI_256_AVX2 0 "register_operand")
> +       (vec_merge:VI_256_AVX2
> +         (match_operand:VI_256_AVX2 1 "nonimmediate_operand")
> +         (match_operand:VI_256_AVX2 2 "nonimm_or_0_operand")
>           (match_operand:<sseintvecmode> 3 "register_operand")))]
> -  "TARGET_AVX2"
> +  "TARGET_AVX"
>  {
>    ix86_expand_sse_movcc (operands[0], operands[3],
>                          operands[1], operands[2]);
> --- gcc/testsuite/gcc.target/i386/avx-pr93637.c.jj      2020-02-10 13:19:18.212437488 +0100
> +++ gcc/testsuite/gcc.target/i386/avx-pr93637.c 2020-02-10 13:18:25.651220171 +0100
> @@ -0,0 +1,17 @@
> +/* PR target/93637 */
> +/* { dg-do compile } */
> +/* { dg-options "-mavx -mno-avx2 -O3 --param sccvn-max-alias-queries-per-access=3" } */
> +
> +double
> +foo (void)
> +{
> +  int i;
> +  double r = 7.0;
> +  double a[] = { 0.0, 0.0, -0.0, 0.0, 0.0, -0.0, 1.0, 0.0, 0.0, -0.0, 1.0, 0.0, 1.0, 1.0 };
> +
> +  for (i = 0; i < sizeof (a) / sizeof (a[0]); ++i)
> +    if (a[i] == 0.0)
> +      r = a[i];
> +
> +  return r;
> +}
>
>         Jakub
>
Jakub Jelinek Feb. 10, 2020, 2:47 p.m. UTC | #2
On Mon, Feb 10, 2020 at 03:40:18PM +0100, Richard Biener wrote:
> Hmm.  Maybe with FP operands we can also try to implement the mask
> as != 0.0 FP condition?  Not sure if -1 (or 1) is enough non-NaNish to
> not cause problems of course.

Well, by the time we reach expansion, we have just the VECTOR_CST integral
operand, and we use right now:
      gcc_assert (VECTOR_BOOLEAN_TYPE_P (TREE_TYPE (op0)));
      if (get_vcond_mask_icode (mode, TYPE_MODE (TREE_TYPE (op0)))
          != CODE_FOR_nothing)
        return expand_vec_cond_mask_expr (vec_cond_type, op0, op1,
                                          op2, target);
      /* Fake op0 < 0.  */
      else
        {
          gcc_assert (GET_MODE_CLASS (TYPE_MODE (TREE_TYPE (op0)))
                      == MODE_VECTOR_INT);
          op0a = op0;
          op0b = build_zero_cst (TREE_TYPE (op0));
          tcode = LT_EXPR;
        }
so we'd need to try that op0 < 0 case and if that fails, try to find some
floating point (or other?) mode that has the same element size and precision
and change the condition back to comparison of two floating point constants
and then hope combine will fold it back to vcond_mask* again.

	Jakub
Uros Bizjak Feb. 10, 2020, 8:16 p.m. UTC | #3
On Mon, Feb 10, 2020 at 3:33 PM Jakub Jelinek <jakub@redhat.com> wrote:
>
> Hi!
>
> As mentioned in the PR, for -mavx -mno-avx2 the backend does support
> vcondv4div4df and vcondv8siv8sf optabs (while generally 32-byte vectors
> aren't much supported in that case, it is performed using
> vandps/vandnps/vorps).  The problem is that after the last generic vector
> lowering (where the VEC_COND_EXPR still compares two V4DF vectors and
> has two V4DI last operands and V4DI result and so is considered ok) fre4
> folds the condition into constant, at which point the middle-end during
> expansion will try vcond_mask_optab and fall back to trying to expand it
> as the constant vector < 0 vcondv4div4di, but neither of them is supported
> for -mavx -mno-avx2 and thus we ICE.
>
> So, the options I see is either what the following patch does, also support
> vcond_mask_v4div4di and vcond_mask_v4siv4si already for TARGET_AVX, or
> require for vcondv4div4df and vcondv8siv8sf TARGET_AVX2 rather than current
> TARGET_AVX.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> Or do you prefer the disabling of the vcond patterns instead?
>
> 2020-02-10  Jakub Jelinek  <jakub@redhat.com>
>
>         PR target/93637
>         * config/i386/sse.md (VI_256_AVX2): New mode iterator.
>         (vcond_mask_<mode><sseintvecmodelower>): Use it instead of VI_256.
>         Change condition from TARGET_AVX2 to TARGET_AVX.
>
>         * gcc.target/i386/avx-pr93637.c: New test.

LGTM as an ICE fix for stage4. We can try Richard's suggestion "later".

Thanks,
Uros.

>
> --- gcc/config/i386/sse.md.jj   2020-02-10 13:14:02.970131692 +0100
> +++ gcc/config/i386/sse.md      2020-02-10 13:15:54.343473253 +0100
> @@ -3430,13 +3430,19 @@ (define_expand "vcond_mask_<mode><avx512
>           (match_operand:<avx512fmaskmode> 3 "register_operand")))]
>    "TARGET_AVX512BW")
>
> +;; As vcondv4div4df and vcondv8siv8sf are enabled already with TARGET_AVX,
> +;; and their condition can be folded late into a constant, we need to
> +;; support vcond_mask_v4div4di and vcond_mask_v8siv8si for TARGET_AVX.
> +(define_mode_iterator VI_256_AVX2 [(V32QI "TARGET_AVX2") (V16HI "TARGET_AVX2")
> +                                  V8SI V4DI])
> +
>  (define_expand "vcond_mask_<mode><sseintvecmodelower>"
> -  [(set (match_operand:VI_256 0 "register_operand")
> -       (vec_merge:VI_256
> -         (match_operand:VI_256 1 "nonimmediate_operand")
> -         (match_operand:VI_256 2 "nonimm_or_0_operand")
> +  [(set (match_operand:VI_256_AVX2 0 "register_operand")
> +       (vec_merge:VI_256_AVX2
> +         (match_operand:VI_256_AVX2 1 "nonimmediate_operand")
> +         (match_operand:VI_256_AVX2 2 "nonimm_or_0_operand")
>           (match_operand:<sseintvecmode> 3 "register_operand")))]
> -  "TARGET_AVX2"
> +  "TARGET_AVX"
>  {
>    ix86_expand_sse_movcc (operands[0], operands[3],
>                          operands[1], operands[2]);
> --- gcc/testsuite/gcc.target/i386/avx-pr93637.c.jj      2020-02-10 13:19:18.212437488 +0100
> +++ gcc/testsuite/gcc.target/i386/avx-pr93637.c 2020-02-10 13:18:25.651220171 +0100
> @@ -0,0 +1,17 @@
> +/* PR target/93637 */
> +/* { dg-do compile } */
> +/* { dg-options "-mavx -mno-avx2 -O3 --param sccvn-max-alias-queries-per-access=3" } */
> +
> +double
> +foo (void)
> +{
> +  int i;
> +  double r = 7.0;
> +  double a[] = { 0.0, 0.0, -0.0, 0.0, 0.0, -0.0, 1.0, 0.0, 0.0, -0.0, 1.0, 0.0, 1.0, 1.0 };
> +
> +  for (i = 0; i < sizeof (a) / sizeof (a[0]); ++i)
> +    if (a[i] == 0.0)
> +      r = a[i];
> +
> +  return r;
> +}
>
>         Jakub
>
diff mbox series

Patch

--- gcc/config/i386/sse.md.jj	2020-02-10 13:14:02.970131692 +0100
+++ gcc/config/i386/sse.md	2020-02-10 13:15:54.343473253 +0100
@@ -3430,13 +3430,19 @@  (define_expand "vcond_mask_<mode><avx512
 	  (match_operand:<avx512fmaskmode> 3 "register_operand")))]
   "TARGET_AVX512BW")
 
+;; As vcondv4div4df and vcondv8siv8sf are enabled already with TARGET_AVX,
+;; and their condition can be folded late into a constant, we need to
+;; support vcond_mask_v4div4di and vcond_mask_v8siv8si for TARGET_AVX.
+(define_mode_iterator VI_256_AVX2 [(V32QI "TARGET_AVX2") (V16HI "TARGET_AVX2")
+				   V8SI V4DI])
+
 (define_expand "vcond_mask_<mode><sseintvecmodelower>"
-  [(set (match_operand:VI_256 0 "register_operand")
-	(vec_merge:VI_256
-	  (match_operand:VI_256 1 "nonimmediate_operand")
-	  (match_operand:VI_256 2 "nonimm_or_0_operand")
+  [(set (match_operand:VI_256_AVX2 0 "register_operand")
+	(vec_merge:VI_256_AVX2
+	  (match_operand:VI_256_AVX2 1 "nonimmediate_operand")
+	  (match_operand:VI_256_AVX2 2 "nonimm_or_0_operand")
 	  (match_operand:<sseintvecmode> 3 "register_operand")))]
-  "TARGET_AVX2"
+  "TARGET_AVX"
 {
   ix86_expand_sse_movcc (operands[0], operands[3],
 			 operands[1], operands[2]);
--- gcc/testsuite/gcc.target/i386/avx-pr93637.c.jj	2020-02-10 13:19:18.212437488 +0100
+++ gcc/testsuite/gcc.target/i386/avx-pr93637.c	2020-02-10 13:18:25.651220171 +0100
@@ -0,0 +1,17 @@ 
+/* PR target/93637 */
+/* { dg-do compile } */
+/* { dg-options "-mavx -mno-avx2 -O3 --param sccvn-max-alias-queries-per-access=3" } */
+
+double
+foo (void)
+{
+  int i;
+  double r = 7.0;
+  double a[] = { 0.0, 0.0, -0.0, 0.0, 0.0, -0.0, 1.0, 0.0, 0.0, -0.0, 1.0, 0.0, 1.0, 1.0 };
+
+  for (i = 0; i < sizeof (a) / sizeof (a[0]); ++i)
+    if (a[i] == 0.0)
+      r = a[i];
+
+  return r;
+}