[v1] RISC-V: Support FP irintf auto vectorization

Message ID 20231012015233.2918814-1-pan2.li@intel.com
State New
Series [v1] RISC-V: Support FP irintf auto vectorization

Commit Message

Li, Pan2 Oct. 12, 2023, 1:52 a.m. UTC
From: Pan Li <pan2.li@intel.com>

This patch adds auto-vectorization support for the FP irintf builtin.

* int irintf (float)

Because the vectorizer only allows source and destination data types of
the same size, the standard name lrintmn2 only acts on SF => SI.
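
To make the same-size restriction concrete, the sketch below compares the
element widths of two conversions. It is only an illustration: the helper
names are hypothetical and not part of GCC, and the expected results assume
a mainstream target with a 4-byte float, 4-byte int and 8-byte double.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not a GCC function: returns 1 when the source and
   destination element types have the same width, which is the case the
   commit message says the lrintmn2 standard name can handle.  */
int
same_size_conversion (size_t src_bytes, size_t dst_bytes)
{
  return src_bytes == dst_bytes;
}

/* SF => SI: float to int, the irintf case added by this patch.  */
int
sf_to_si_is_same_size (void)
{
  return same_size_conversion (sizeof (float), sizeof (int));
}

/* DF => SI: double to int, which the commit message leaves to the hook.  */
int
df_to_si_is_same_size (void)
{
  return same_size_conversion (sizeof (double), sizeof (int));
}

/* Self-check under the width assumptions stated above.  */
void
check_same_size_examples (void)
{
  assert (sf_to_si_is_same_size ());
  assert (!df_to_si_is_same_size ());
}
```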

Given we have code like:

void
test_irintf (int *out, float *in, unsigned count)
{
  for (unsigned i = 0; i < count; i++)
    out[i] = __builtin_irintf (in[i]);
}

Before this patch:
.L3:
  ...
  flw      fa5,0(a1)
  fcvt.w.s a5,fa5,dyn
  sw       a5,-4(a0)
  ...
  bne      a1,a4,.L3

After this patch:
.L3:
  ...
  vle32.v     v1,0(a1)
  vfcvt.x.f.v v1,v1
  vse32.v     v1,0(a0)
  ...
  bne         a2,zero,.L3
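
The vectorized loop handles several elements per iteration, and its exit
test (bne a2,zero,.L3) suggests a count of remaining elements reaching zero.
A rough scalar C model of that loop shape is sketched below. The names
VLMAX_MODEL, rint_to_int_model and test_irintf_model are made up for
illustration, and the chunk size is an arbitrary assumption. The rounding
helper only mimics round-to-nearest-even for |x| < 2^22 in the default
rounding mode; the real instruction follows the dynamic rounding mode.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed chunk size standing in for the hardware vector length.  */
#define VLMAX_MODEL ((size_t) 4)

/* Round to nearest, ties to even, for |x| < 2^22 in the default rounding
   mode.  Adding and then subtracting 1.5 * 2^23 makes the float addition
   drop the fraction bits.  volatile keeps the compiler from folding it.  */
static int
rint_to_int_model (float x)
{
  volatile float biased = x + 12582912.0f;
  volatile float rounded = biased - 12582912.0f;
  return (int) rounded;
}

/* Scalar model of the strip-mined loop: each outer iteration handles up to
   VLMAX_MODEL elements, then advances the pointers and the remaining count.  */
void
test_irintf_model (int *out, const float *in, size_t count)
{
  while (count != 0)
    {
      size_t vl = count < VLMAX_MODEL ? count : VLMAX_MODEL;
      for (size_t i = 0; i < vl; i++)
        out[i] = rint_to_int_model (in[i]);
      in += vl;
      out += vl;
      count -= vl;
    }
}

/* Self-check with a count that is not a multiple of the chunk size.  */
void
check_test_irintf_model (void)
{
  const float in[6] = { 1.2f, -1.2f, 0.5f, -0.5f, 2.5f, 3.0f };
  int out[6] = { 0 };
  test_irintf_model (out, in, 6);
  assert (out[0] == 1 && out[1] == -1 && out[2] == 0);
  assert (out[3] == 0 && out[4] == 2 && out[5] == 3);
}
```

The last partial chunk (two elements in the self-check) is handled by the
same loop body, which is the point of counting down the remaining elements
rather than iterating a fixed number of full vectors.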

The remaining cases, such as DF => SI and HF => SI, will be covered by the hook
TARGET_VECTORIZE_BUILTIN_VECTORIZED_FUNCTION.

gcc/ChangeLog:

	* config/riscv/autovec.md (lrint<mode><vlconvert>2): Rename from.
	(lrint<mode><v_i_l_ll_convert>2): Rename to.
	* config/riscv/vector-iterators.md: Rename and remove TARGET_64BIT.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/autovec/unop/math-irint-0.c: New test.
	* gcc.target/riscv/rvv/autovec/unop/math-irint-run-0.c: New test.
	* gcc.target/riscv/rvv/autovec/vls/math-irint-0.c: New test.

Signed-off-by: Pan Li <pan2.li@intel.com>
---
 gcc/config/riscv/autovec.md                   |  9 ++-
 gcc/config/riscv/vector-iterators.md          | 74 +++++++++----------
 .../riscv/rvv/autovec/unop/math-irint-0.c     | 14 ++++
 .../riscv/rvv/autovec/unop/math-irint-run-0.c | 63 ++++++++++++++++
 .../riscv/rvv/autovec/vls/math-irint-0.c      | 30 ++++++++
 5 files changed, 149 insertions(+), 41 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-irint-0.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-irint-run-0.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/math-irint-0.c

Comments

钟居哲 Oct. 12, 2023, 2:01 a.m. UTC | #1
LGTM. Thanks.



juzhe.zhong@rivai.ai
 
Li, Pan2 Oct. 12, 2023, 2:07 a.m. UTC | #2
Committed, thanks Juzhe.

Pan

Patch

diff --git a/gcc/config/riscv/autovec.md b/gcc/config/riscv/autovec.md
index dc76a01d82c..c3a51e22ceb 100644
--- a/gcc/config/riscv/autovec.md
+++ b/gcc/config/riscv/autovec.md
@@ -2240,6 +2240,7 @@  (define_expand "<u>avg<v_double_trunc>3_ceil"
 ;; - trunc/truncf
 ;; - roundeven/roundevenf
 ;; - lrint/lrintf
+;; - irintf
 ;; -------------------------------------------------------------------------
 (define_expand "ceil<mode>2"
   [(match_operand:V_VLSF 0 "register_operand")
@@ -2311,12 +2312,12 @@  (define_expand "roundeven<mode>2"
   }
 )
 
-(define_expand "lrint<mode><vlconvert>2"
-  [(match_operand:<VLCONVERT>     0 "register_operand")
-   (match_operand:V_VLS_FCONVERTL 1 "register_operand")]
+(define_expand "lrint<mode><v_i_l_ll_convert>2"
+  [(match_operand:<V_I_L_LL_CONVERT>    0 "register_operand")
+   (match_operand:V_VLS_FCONVERT_I_L_LL 1 "register_operand")]
   "TARGET_VECTOR && !flag_trapping_math && !flag_rounding_math"
   {
-    riscv_vector::expand_vec_lrint (operands[0], operands[1], <MODE>mode, <VLCONVERT>mode);
+    riscv_vector::expand_vec_lrint (operands[0], operands[1], <MODE>mode, <V_I_L_LL_CONVERT>mode);
     DONE;
   }
 )
diff --git a/gcc/config/riscv/vector-iterators.md b/gcc/config/riscv/vector-iterators.md
index bb0c46ea30a..96ddd34c958 100644
--- a/gcc/config/riscv/vector-iterators.md
+++ b/gcc/config/riscv/vector-iterators.md
@@ -3281,8 +3281,8 @@  (define_mode_attr vnnconvert [
   (V512DI "v512hf")
 ])
 
-;; L indicates convert to long
-(define_mode_attr VLCONVERT [
+;; Convert to int, long and long long
+(define_mode_attr V_I_L_LL_CONVERT [
   (RVVM8SF "RVVM8SI") (RVVM4SF "RVVM4SI") (RVVM2SF "RVVM2SI")
   (RVVM1SF "RVVM1SI") (RVVMF2SF "RVVMF2SI")
 
@@ -3298,7 +3298,7 @@  (define_mode_attr VLCONVERT [
   (V512DF "V512DI")
 ])
 
-(define_mode_attr vlconvert [
+(define_mode_attr v_i_l_ll_convert [
   (RVVM8SF "rvvm8si") (RVVM4SF "rvvm4si") (RVVM2SF "rvvm2si")
   (RVVM1SF "rvvm1si") (RVVMF2SF "rvvmf2si")
 
@@ -3314,40 +3314,40 @@  (define_mode_attr vlconvert [
   (V512DF "v512di")
 ])
 
-(define_mode_iterator V_VLS_FCONVERTL [
-  (RVVM8SF "TARGET_VECTOR_ELEN_FP_32 && !TARGET_64BIT")
-  (RVVM4SF "TARGET_VECTOR_ELEN_FP_32 && !TARGET_64BIT")
-  (RVVM2SF "TARGET_VECTOR_ELEN_FP_32 && !TARGET_64BIT")
-  (RVVM1SF "TARGET_VECTOR_ELEN_FP_32 && !TARGET_64BIT")
-  (RVVMF2SF "TARGET_VECTOR_ELEN_FP_32 && !TARGET_64BIT && TARGET_MIN_VLEN > 32")
-
-  (RVVM8DF "TARGET_VECTOR_ELEN_FP_64 && TARGET_64BIT")
-  (RVVM4DF "TARGET_VECTOR_ELEN_FP_64 && TARGET_64BIT")
-  (RVVM2DF "TARGET_VECTOR_ELEN_FP_64 && TARGET_64BIT")
-  (RVVM1DF "TARGET_VECTOR_ELEN_FP_64 && TARGET_64BIT")
-
-  (V1SF "TARGET_VECTOR_VLS && TARGET_VECTOR_ELEN_FP_32 && !TARGET_64BIT")
-  (V2SF "TARGET_VECTOR_VLS && TARGET_VECTOR_ELEN_FP_32 && !TARGET_64BIT")
-  (V4SF "TARGET_VECTOR_VLS && TARGET_VECTOR_ELEN_FP_32 && !TARGET_64BIT")
-  (V8SF "TARGET_VECTOR_VLS && TARGET_VECTOR_ELEN_FP_32 && !TARGET_64BIT")
-  (V16SF "TARGET_VECTOR_VLS && TARGET_VECTOR_ELEN_FP_32 && !TARGET_64BIT && TARGET_MIN_VLEN >= 64")
-  (V32SF "TARGET_VECTOR_VLS && TARGET_VECTOR_ELEN_FP_32 && !TARGET_64BIT && TARGET_MIN_VLEN >= 128")
-  (V64SF "TARGET_VECTOR_VLS && TARGET_VECTOR_ELEN_FP_32 && !TARGET_64BIT && TARGET_MIN_VLEN >= 256")
-  (V128SF "TARGET_VECTOR_VLS && TARGET_VECTOR_ELEN_FP_32 && !TARGET_64BIT && TARGET_MIN_VLEN >= 512")
-  (V256SF "TARGET_VECTOR_VLS && TARGET_VECTOR_ELEN_FP_32 && !TARGET_64BIT && TARGET_MIN_VLEN >= 1024")
-  (V512SF "TARGET_VECTOR_VLS && TARGET_VECTOR_ELEN_FP_32 && !TARGET_64BIT && TARGET_MIN_VLEN >= 2048")
-  (V1024SF "TARGET_VECTOR_VLS && TARGET_VECTOR_ELEN_FP_32 && !TARGET_64BIT && TARGET_MIN_VLEN >= 4096")
-
-  (V1DF "TARGET_VECTOR_VLS && TARGET_VECTOR_ELEN_FP_64 && TARGET_64BIT")
-  (V2DF "TARGET_VECTOR_VLS && TARGET_VECTOR_ELEN_FP_64 && TARGET_64BIT")
-  (V4DF "TARGET_VECTOR_VLS && TARGET_VECTOR_ELEN_FP_64 && TARGET_64BIT")
-  (V8DF "TARGET_VECTOR_VLS && TARGET_VECTOR_ELEN_FP_64 && TARGET_64BIT && TARGET_MIN_VLEN >= 64")
-  (V16DF "TARGET_VECTOR_VLS && TARGET_VECTOR_ELEN_FP_64 && TARGET_64BIT && TARGET_MIN_VLEN >= 128")
-  (V32DF "TARGET_VECTOR_VLS && TARGET_VECTOR_ELEN_FP_64 && TARGET_64BIT && TARGET_MIN_VLEN >= 256")
-  (V64DF "TARGET_VECTOR_VLS && TARGET_VECTOR_ELEN_FP_64 && TARGET_64BIT && TARGET_MIN_VLEN >= 512")
-  (V128DF "TARGET_VECTOR_VLS && TARGET_VECTOR_ELEN_FP_64 && TARGET_64BIT && TARGET_MIN_VLEN >= 1024")
-  (V256DF "TARGET_VECTOR_VLS && TARGET_VECTOR_ELEN_FP_64 && TARGET_64BIT && TARGET_MIN_VLEN >= 2048")
-  (V512DF "TARGET_VECTOR_VLS && TARGET_VECTOR_ELEN_FP_64 && TARGET_64BIT && TARGET_MIN_VLEN >= 4096")
+(define_mode_iterator V_VLS_FCONVERT_I_L_LL [
+  (RVVM8SF "TARGET_VECTOR_ELEN_FP_32")
+  (RVVM4SF "TARGET_VECTOR_ELEN_FP_32")
+  (RVVM2SF "TARGET_VECTOR_ELEN_FP_32")
+  (RVVM1SF "TARGET_VECTOR_ELEN_FP_32")
+  (RVVMF2SF "TARGET_VECTOR_ELEN_FP_32 && TARGET_MIN_VLEN > 32")
+
+  (RVVM8DF "TARGET_VECTOR_ELEN_FP_64")
+  (RVVM4DF "TARGET_VECTOR_ELEN_FP_64")
+  (RVVM2DF "TARGET_VECTOR_ELEN_FP_64")
+  (RVVM1DF "TARGET_VECTOR_ELEN_FP_64")
+
+  (V1SF "TARGET_VECTOR_VLS && TARGET_VECTOR_ELEN_FP_32")
+  (V2SF "TARGET_VECTOR_VLS && TARGET_VECTOR_ELEN_FP_32")
+  (V4SF "TARGET_VECTOR_VLS && TARGET_VECTOR_ELEN_FP_32")
+  (V8SF "TARGET_VECTOR_VLS && TARGET_VECTOR_ELEN_FP_32")
+  (V16SF "TARGET_VECTOR_VLS && TARGET_VECTOR_ELEN_FP_32 && TARGET_MIN_VLEN >= 64")
+  (V32SF "TARGET_VECTOR_VLS && TARGET_VECTOR_ELEN_FP_32 && TARGET_MIN_VLEN >= 128")
+  (V64SF "TARGET_VECTOR_VLS && TARGET_VECTOR_ELEN_FP_32 && TARGET_MIN_VLEN >= 256")
+  (V128SF "TARGET_VECTOR_VLS && TARGET_VECTOR_ELEN_FP_32 && TARGET_MIN_VLEN >= 512")
+  (V256SF "TARGET_VECTOR_VLS && TARGET_VECTOR_ELEN_FP_32 && TARGET_MIN_VLEN >= 1024")
+  (V512SF "TARGET_VECTOR_VLS && TARGET_VECTOR_ELEN_FP_32 && TARGET_MIN_VLEN >= 2048")
+  (V1024SF "TARGET_VECTOR_VLS && TARGET_VECTOR_ELEN_FP_32 && TARGET_MIN_VLEN >= 4096")
+
+  (V1DF "TARGET_VECTOR_VLS && TARGET_VECTOR_ELEN_FP_64")
+  (V2DF "TARGET_VECTOR_VLS && TARGET_VECTOR_ELEN_FP_64")
+  (V4DF "TARGET_VECTOR_VLS && TARGET_VECTOR_ELEN_FP_64")
+  (V8DF "TARGET_VECTOR_VLS && TARGET_VECTOR_ELEN_FP_64 && TARGET_MIN_VLEN >= 64")
+  (V16DF "TARGET_VECTOR_VLS && TARGET_VECTOR_ELEN_FP_64 && TARGET_MIN_VLEN >= 128")
+  (V32DF "TARGET_VECTOR_VLS && TARGET_VECTOR_ELEN_FP_64 && TARGET_MIN_VLEN >= 256")
+  (V64DF "TARGET_VECTOR_VLS && TARGET_VECTOR_ELEN_FP_64 && TARGET_MIN_VLEN >= 512")
+  (V128DF "TARGET_VECTOR_VLS && TARGET_VECTOR_ELEN_FP_64 && TARGET_MIN_VLEN >= 1024")
+  (V256DF "TARGET_VECTOR_VLS && TARGET_VECTOR_ELEN_FP_64 && TARGET_MIN_VLEN >= 2048")
+  (V512DF "TARGET_VECTOR_VLS && TARGET_VECTOR_ELEN_FP_64 && TARGET_MIN_VLEN >= 4096")
 ])
 
 (define_mode_attr VDEMOTE [
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-irint-0.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-irint-0.c
new file mode 100644
index 00000000000..3ca2f651763
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-irint-0.c
@@ -0,0 +1,14 @@ 
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -ftree-vectorize -fno-vect-cost-model -ffast-math -fno-schedule-insns -fno-schedule-insns2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+#include "test-math.h"
+
+/*
+** test_float_int___builtin_irintf:
+**   ...
+**   vsetvli\s+[atx][0-9]+,\s*zero,\s*e32,\s*m1,\s*ta,\s*ma
+**   vfcvt\.x\.f\.v\s+v[0-9]+,\s*v[0-9]+
+**   ...
+*/
+TEST_UNARY_CALL_CVT (float, int, __builtin_irintf)
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-irint-run-0.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-irint-run-0.c
new file mode 100644
index 00000000000..0be38528d0b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-irint-run-0.c
@@ -0,0 +1,63 @@ 
+/* { dg-do run { target { riscv_v } } } */
+/* { dg-additional-options "-std=c99 -O3 -ftree-vectorize -fno-vect-cost-model -ffast-math" } */
+
+#include "test-math.h"
+
+#define ARRAY_SIZE 128
+
+float in[ARRAY_SIZE];
+int out[ARRAY_SIZE];
+int ref[ARRAY_SIZE];
+
+TEST_UNARY_CALL_CVT (float, int, __builtin_irintf)
+TEST_ASSERT (int)
+
+TEST_INIT_CVT (float, 1.2, int, __builtin_irintf (1.2), 1)
+TEST_INIT_CVT (float, -1.2, int, __builtin_irintf (-1.2), 2)
+TEST_INIT_CVT (float, 0.5, int, __builtin_irintf (0.5), 3)
+TEST_INIT_CVT (float, -0.5, int, __builtin_irintf (-0.5), 4)
+TEST_INIT_CVT (float, 0.1, int, __builtin_irintf (0.1), 5)
+TEST_INIT_CVT (float, -0.1, int, __builtin_irintf (-0.1), 6)
+TEST_INIT_CVT (float, 3.0, int, __builtin_irintf (3.0), 7)
+TEST_INIT_CVT (float, -3.0, int, __builtin_irintf (-3.0), 8)
+TEST_INIT_CVT (float, 8388607.5, int, __builtin_irintf (8388607.5), 9)
+TEST_INIT_CVT (float, 8388609.0, int, __builtin_irintf (8388609.0), 10)
+TEST_INIT_CVT (float, -8388607.5, int, __builtin_irintf (-8388607.5), 11)
+TEST_INIT_CVT (float, -8388609.0, int, __builtin_irintf (-8388609.0), 12)
+TEST_INIT_CVT (float, 0.0, int, __builtin_irintf (-0.0), 13)
+TEST_INIT_CVT (float, -0.0, int, __builtin_irintf (-0.0), 14)
+TEST_INIT_CVT (float, 2147483520.0, int, __builtin_irintf (2147483520.0), 15)
+TEST_INIT_CVT (float, 2147483648.0, int, __builtin_irintf (2147483648.0), 16)
+TEST_INIT_CVT (float, -2147483648.0, int, __builtin_irintf (-2147483648.0), 17)
+TEST_INIT_CVT (float, -2147483904.0, int, __builtin_irintf (-2147483904.0), 18)
+TEST_INIT_CVT (float, __builtin_inf (), int, __builtin_irintf (__builtin_inff ()), 19)
+TEST_INIT_CVT (float, -__builtin_inf (), int, __builtin_irintf (-__builtin_inff ()), 20)
+TEST_INIT_CVT (float, __builtin_nanf (""), int, 0x7fffffff, 21)
+
+int
+main ()
+{
+  RUN_TEST_CVT (float, int, 1, __builtin_irintf, in, out, ref, ARRAY_SIZE);
+  RUN_TEST_CVT (float, int, 2, __builtin_irintf, in, out, ref, ARRAY_SIZE);
+  RUN_TEST_CVT (float, int, 3, __builtin_irintf, in, out, ref, ARRAY_SIZE);
+  RUN_TEST_CVT (float, int, 4, __builtin_irintf, in, out, ref, ARRAY_SIZE);
+  RUN_TEST_CVT (float, int, 5, __builtin_irintf, in, out, ref, ARRAY_SIZE);
+  RUN_TEST_CVT (float, int, 6, __builtin_irintf, in, out, ref, ARRAY_SIZE);
+  RUN_TEST_CVT (float, int, 7, __builtin_irintf, in, out, ref, ARRAY_SIZE);
+  RUN_TEST_CVT (float, int, 8, __builtin_irintf, in, out, ref, ARRAY_SIZE);
+  RUN_TEST_CVT (float, int, 9, __builtin_irintf, in, out, ref, ARRAY_SIZE);
+  RUN_TEST_CVT (float, int, 10, __builtin_irintf, in, out, ref, ARRAY_SIZE);
+  RUN_TEST_CVT (float, int, 11, __builtin_irintf, in, out, ref, ARRAY_SIZE);
+  RUN_TEST_CVT (float, int, 12, __builtin_irintf, in, out, ref, ARRAY_SIZE);
+  RUN_TEST_CVT (float, int, 13, __builtin_irintf, in, out, ref, ARRAY_SIZE);
+  RUN_TEST_CVT (float, int, 14, __builtin_irintf, in, out, ref, ARRAY_SIZE);
+  RUN_TEST_CVT (float, int, 15, __builtin_irintf, in, out, ref, ARRAY_SIZE);
+  RUN_TEST_CVT (float, int, 16, __builtin_irintf, in, out, ref, ARRAY_SIZE);
+  RUN_TEST_CVT (float, int, 17, __builtin_irintf, in, out, ref, ARRAY_SIZE);
+  RUN_TEST_CVT (float, int, 18, __builtin_irintf, in, out, ref, ARRAY_SIZE);
+  RUN_TEST_CVT (float, int, 19, __builtin_irintf, in, out, ref, ARRAY_SIZE);
+  RUN_TEST_CVT (float, int, 20, __builtin_irintf, in, out, ref, ARRAY_SIZE);
+  RUN_TEST_CVT (float, int, 21, __builtin_irintf, in, out, ref, ARRAY_SIZE);
+
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/math-irint-0.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/math-irint-0.c
new file mode 100644
index 00000000000..3297bc6ec65
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/math-irint-0.c
@@ -0,0 +1,30 @@ 
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv_zvl4096b -mabi=lp64d -O3 --param=riscv-autovec-lmul=m8 -ffast-math -fdump-tree-optimized" } */
+
+#include "def.h"
+
+DEF_OP_V_CVT (irintf, 1, float, int, __builtin_irintf)
+DEF_OP_V_CVT (irintf, 2, float, int, __builtin_irintf)
+DEF_OP_V_CVT (irintf, 4, float, int, __builtin_irintf)
+DEF_OP_V_CVT (irintf, 8, float, int, __builtin_irintf)
+DEF_OP_V_CVT (irintf, 16, float, int, __builtin_irintf)
+DEF_OP_V_CVT (irintf, 32, float, int, __builtin_irintf)
+DEF_OP_V_CVT (irintf, 64, float, int, __builtin_irintf)
+DEF_OP_V_CVT (irintf, 128, float, int, __builtin_irintf)
+DEF_OP_V_CVT (irintf, 256, float, int, __builtin_irintf)
+DEF_OP_V_CVT (irintf, 512, float, int, __builtin_irintf)
+
+/* { dg-final { scan-assembler-not {csrr} } } */
+/* { dg-final { scan-tree-dump-not "1,1" "optimized" } } */
+/* { dg-final { scan-tree-dump-not "2,2" "optimized" } } */
+/* { dg-final { scan-tree-dump-not "4,4" "optimized" } } */
+/* { dg-final { scan-tree-dump-not "16,16" "optimized" } } */
+/* { dg-final { scan-tree-dump-not "32,32" "optimized" } } */
+/* { dg-final { scan-tree-dump-not "64,64" "optimized" } } */
+/* { dg-final { scan-tree-dump-not "128,128" "optimized" } } */
+/* { dg-final { scan-tree-dump-not "256,256" "optimized" } } */
+/* { dg-final { scan-tree-dump-not "512,512" "optimized" } } */
+/* { dg-final { scan-tree-dump-not "1024,1024" "optimized" } } */
+/* { dg-final { scan-tree-dump-not "2048,2048" "optimized" } } */
+/* { dg-final { scan-tree-dump-not "4096,4096" "optimized" } } */
+/* { dg-final { scan-assembler-times {vfcvt\.x\.f\.v\s+v[0-9]+,\s*v[0-9]+} 9 } } */
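
Several inputs in math-irint-run-0.c sit at single-precision boundaries:
8388607.5 is just below 2^23, and 2147483520.0 is just below 2^31. The
sketch below is not part of the patch. It uses a hypothetical helper,
exact_in_float, to check which constants survive a round trip through float
unchanged, assuming IEEE 754 single and double precision.

```c
#include <assert.h>

/* Hypothetical helper: returns 1 when the double constant is exactly
   representable in single precision, i.e. narrowing to float and widening
   back gives the same value.  */
int
exact_in_float (double value)
{
  volatile float narrowed = (float) value;
  return (double) narrowed == value;
}

/* Self-check on constants taken from the run test plus two neighbours.  */
void
check_exact_in_float (void)
{
  assert (exact_in_float (8388607.5));     /* 2^23 - 0.5, spacing is 0.5  */
  assert (exact_in_float (8388609.0));     /* 2^23 + 1, spacing is 1  */
  assert (exact_in_float (2147483520.0));  /* 2^31 - 128, spacing is 128  */
  assert (exact_in_float (-2147483904.0)); /* -(2^31 + 256), spacing is 256  */
  assert (!exact_in_float (8388608.5));    /* no half steps above 2^23  */
  assert (!exact_in_float (2147483647.0)); /* INT_MAX is not a float  */
}
```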