diff mbox series

[v1] RISC-V: Support RVV FP16 ZVFH floating-point intrinsic API

Message ID 20230605065058.1037581-1-pan2.li@intel.com
State New
Headers show
Series [v1] RISC-V: Support RVV FP16 ZVFH floating-point intrinsic API | expand

Commit Message

Li, Pan2 via Gcc-patches June 5, 2023, 6:50 a.m. UTC
From: Pan Li <pan2.li@intel.com>

This patch support the intrinsic API of FP16 ZVFH floating-point. Aka
SEW=16 for below instructions:

vfadd vfsub vfrsub vfwadd vfwsub
vfmul vfdiv vfrdiv vfwmul
vfmacc vfnmacc vfmsac vfnmsac vfmadd
vfnmadd vfmsub vfnmsub vfwmacc vfwnmacc vfwmsac vfwnmsac
vfsqrt vfrsqrt7 vfrec7
vfmin vfmax
vfsgnj vfsgnjn vfsgnjx
vmfeq vmfne vmflt vmfle vmfgt vmfge
vfclass vfmerge
vfmv
vfcvt vfwcvt vfncvt

Then users can leverage the instrinsic APIs to perform the FP=16 related
operations. Please note not all the instrinsic APIs are coverred in the
test files, only pick some typical ones due to too many. We will perform
the FP16 related instrinsic API test entirely soon.

Signed-off-by: Pan Li <pan2.li@intel.com>

gcc/ChangeLog:

	* config/riscv/riscv-vector-builtins-types.def
	(vfloat32mf2_t): New type for DEF_RVV_WEXTF_OPS.
	(vfloat32m1_t): Ditto.
	(vfloat32m2_t): Ditto.
	(vfloat32m4_t): Ditto.
	(vfloat32m8_t): Ditto.
	(vint16mf4_t): New type for DEF_RVV_CONVERT_I_OPS.
	(vint16mf2_t): Ditto.
	(vint16m1_t): Ditto.
	(vint16m2_t): Ditto.
	(vint16m4_t): Ditto.
	(vint16m8_t): Ditto.
	(vuint16mf4_t): New type for DEF_RVV_CONVERT_U_OPS.
	(vuint16mf2_t): Ditto.
	(vuint16m1_t): Ditto.
	(vuint16m2_t): Ditto.
	(vuint16m4_t): Ditto.
	(vuint16m8_t): Ditto.
	(vint32mf2_t): New type for DEF_RVV_WCONVERT_I_OPS.
	(vint32m1_t): Ditto.
	(vint32m2_t): Ditto.
	(vint32m4_t): Ditto.
	(vint32m8_t): Ditto.
	(vuint32mf2_t): New type for DEF_RVV_WCONVERT_U_OPS.
	(vuint32m1_t): Ditto.
	(vuint32m2_t): Ditto.
	(vuint32m4_t): Ditto.
	(vuint32m8_t): Ditto.
	* config/riscv/vector-iterators.md: Add FP=16 support for V,
	VWCONVERTI, VCONVERT, VNCONVERT, VMUL1 and vlmul1.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/base/zvfh-intrinsic.c: New test.
---
 .../riscv/riscv-vector-builtins-types.def     |  32 ++
 gcc/config/riscv/vector-iterators.md          |  21 +
 .../riscv/rvv/base/zvfh-intrinsic.c           | 418 ++++++++++++++++++
 3 files changed, 471 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/zvfh-intrinsic.c

Comments

juzhe.zhong@rivai.ai June 5, 2023, 7:29 a.m. UTC | #1
+DEF_RVV_WEXTF_OPS (vfloat32mf2_t, RVV_REQUIRE_ELEN_FP_32 | RVV_REQUIRE_MIN_VLEN_64)
+DEF_RVV_WEXTF_OPS (vfloat32m1_t, RVV_REQUIRE_ELEN_FP_32)
+DEF_RVV_WEXTF_OPS (vfloat32m2_t, RVV_REQUIRE_ELEN_FP_32)
+DEF_RVV_WEXTF_OPS (vfloat32m4_t, RVV_REQUIRE_ELEN_FP_32)
+DEF_RVV_WEXTF_OPS (vfloat32m8_t, RVV_REQUIRE_ELEN_FP_32)
Is this used in vfwcvt ? convert FP16 -> FP32, if yes, you should add ZVFHMIN or ZVFH require checking.


+DEF_RVV_CONVERT_I_OPS (vint16mf4_t, RVV_REQUIRE_MIN_VLEN_64)
+DEF_RVV_CONVERT_I_OPS (vint16mf2_t, 0)
+DEF_RVV_CONVERT_I_OPS (vint16m1_t, 0)
+DEF_RVV_CONVERT_I_OPS (vint16m2_t, 0)
+DEF_RVV_CONVERT_I_OPS (vint16m4_t, 0)
+DEF_RVV_CONVERT_I_OPS (vint16m8_t, 0)

same


+DEF_RVV_CONVERT_U_OPS (vuint16mf4_t, RVV_REQUIRE_MIN_VLEN_64)
+DEF_RVV_CONVERT_U_OPS (vuint16mf2_t, 0)
+DEF_RVV_CONVERT_U_OPS (vuint16m1_t, 0)
+DEF_RVV_CONVERT_U_OPS (vuint16m2_t, 0)
+DEF_RVV_CONVERT_U_OPS (vuint16m4_t, 0)
+DEF_RVV_CONVERT_U_OPS (vuint16m8_t, 0

same

+DEF_RVV_WCONVERT_I_OPS (vint32mf2_t, RVV_REQUIRE_MIN_VLEN_64)
+DEF_RVV_WCONVERT_I_OPS (vint32m1_t, 0)
+DEF_RVV_WCONVERT_I_OPS (vint32m2_t, 0)
+DEF_RVV_WCONVERT_I_OPS (vint32m4_t, 0)
+DEF_RVV_WCONVERT_I_OPS (vint32m8_t, 0)


same

+DEF_RVV_WCONVERT_U_OPS (vuint32mf2_t, RVV_REQUIRE_MIN_VLEN_64)
+DEF_RVV_WCONVERT_U_OPS (vuint32m1_t, 0)
+DEF_RVV_WCONVERT_U_OPS (vuint32m2_t, 0)
+DEF_RVV_WCONVERT_U_OPS (vuint32m4_t, 0)
+DEF_RVV_WCONVERT_U_OPS (vuint32m8_t, 0)

same



Otherwise, LGTM.


juzhe.zhong@rivai.ai
 
From: pan2.li
Date: 2023-06-05 14:50
To: gcc-patches
CC: juzhe.zhong; kito.cheng; pan2.li; yanzhang.wang
Subject: [PATCH v1] RISC-V: Support RVV FP16 ZVFH floating-point intrinsic API
From: Pan Li <pan2.li@intel.com>
 
This patch support the intrinsic API of FP16 ZVFH floating-point. Aka
SEW=16 for below instructions:
 
vfadd vfsub vfrsub vfwadd vfwsub
vfmul vfdiv vfrdiv vfwmul
vfmacc vfnmacc vfmsac vfnmsac vfmadd
vfnmadd vfmsub vfnmsub vfwmacc vfwnmacc vfwmsac vfwnmsac
vfsqrt vfrsqrt7 vfrec7
vfmin vfmax
vfsgnj vfsgnjn vfsgnjx
vmfeq vmfne vmflt vmfle vmfgt vmfge
vfclass vfmerge
vfmv
vfcvt vfwcvt vfncvt
 
Then users can leverage the instrinsic APIs to perform the FP=16 related
operations. Please note not all the instrinsic APIs are coverred in the
test files, only pick some typical ones due to too many. We will perform
the FP16 related instrinsic API test entirely soon.
 
Signed-off-by: Pan Li <pan2.li@intel.com>
 
gcc/ChangeLog:
 
* config/riscv/riscv-vector-builtins-types.def
(vfloat32mf2_t): New type for DEF_RVV_WEXTF_OPS.
(vfloat32m1_t): Ditto.
(vfloat32m2_t): Ditto.
(vfloat32m4_t): Ditto.
(vfloat32m8_t): Ditto.
(vint16mf4_t): New type for DEF_RVV_CONVERT_I_OPS.
(vint16mf2_t): Ditto.
(vint16m1_t): Ditto.
(vint16m2_t): Ditto.
(vint16m4_t): Ditto.
(vint16m8_t): Ditto.
(vuint16mf4_t): New type for DEF_RVV_CONVERT_U_OPS.
(vuint16mf2_t): Ditto.
(vuint16m1_t): Ditto.
(vuint16m2_t): Ditto.
(vuint16m4_t): Ditto.
(vuint16m8_t): Ditto.
(vint32mf2_t): New type for DEF_RVV_WCONVERT_I_OPS.
(vint32m1_t): Ditto.
(vint32m2_t): Ditto.
(vint32m4_t): Ditto.
(vint32m8_t): Ditto.
(vuint32mf2_t): New type for DEF_RVV_WCONVERT_U_OPS.
(vuint32m1_t): Ditto.
(vuint32m2_t): Ditto.
(vuint32m4_t): Ditto.
(vuint32m8_t): Ditto.
* config/riscv/vector-iterators.md: Add FP=16 support for V,
VWCONVERTI, VCONVERT, VNCONVERT, VMUL1 and vlmul1.
 
gcc/testsuite/ChangeLog:
 
* gcc.target/riscv/rvv/base/zvfh-intrinsic.c: New test.
---
.../riscv/riscv-vector-builtins-types.def     |  32 ++
gcc/config/riscv/vector-iterators.md          |  21 +
.../riscv/rvv/base/zvfh-intrinsic.c           | 418 ++++++++++++++++++
3 files changed, 471 insertions(+)
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/zvfh-intrinsic.c
 
diff --git a/gcc/config/riscv/riscv-vector-builtins-types.def b/gcc/config/riscv/riscv-vector-builtins-types.def
index 9cb3aca992e..348aa05dd91 100644
--- a/gcc/config/riscv/riscv-vector-builtins-types.def
+++ b/gcc/config/riscv/riscv-vector-builtins-types.def
@@ -518,11 +518,24 @@ DEF_RVV_FULL_V_U_OPS (vuint64m2_t, RVV_REQUIRE_FULL_V)
DEF_RVV_FULL_V_U_OPS (vuint64m4_t, RVV_REQUIRE_FULL_V)
DEF_RVV_FULL_V_U_OPS (vuint64m8_t, RVV_REQUIRE_FULL_V)
+DEF_RVV_WEXTF_OPS (vfloat32mf2_t, RVV_REQUIRE_ELEN_FP_32 | RVV_REQUIRE_MIN_VLEN_64)
+DEF_RVV_WEXTF_OPS (vfloat32m1_t, RVV_REQUIRE_ELEN_FP_32)
+DEF_RVV_WEXTF_OPS (vfloat32m2_t, RVV_REQUIRE_ELEN_FP_32)
+DEF_RVV_WEXTF_OPS (vfloat32m4_t, RVV_REQUIRE_ELEN_FP_32)
+DEF_RVV_WEXTF_OPS (vfloat32m8_t, RVV_REQUIRE_ELEN_FP_32)
+
DEF_RVV_WEXTF_OPS (vfloat64m1_t, RVV_REQUIRE_ELEN_FP_64)
DEF_RVV_WEXTF_OPS (vfloat64m2_t, RVV_REQUIRE_ELEN_FP_64)
DEF_RVV_WEXTF_OPS (vfloat64m4_t, RVV_REQUIRE_ELEN_FP_64)
DEF_RVV_WEXTF_OPS (vfloat64m8_t, RVV_REQUIRE_ELEN_FP_64)
+DEF_RVV_CONVERT_I_OPS (vint16mf4_t, RVV_REQUIRE_MIN_VLEN_64)
+DEF_RVV_CONVERT_I_OPS (vint16mf2_t, 0)
+DEF_RVV_CONVERT_I_OPS (vint16m1_t, 0)
+DEF_RVV_CONVERT_I_OPS (vint16m2_t, 0)
+DEF_RVV_CONVERT_I_OPS (vint16m4_t, 0)
+DEF_RVV_CONVERT_I_OPS (vint16m8_t, 0)
+
DEF_RVV_CONVERT_I_OPS (vint32mf2_t, RVV_REQUIRE_MIN_VLEN_64)
DEF_RVV_CONVERT_I_OPS (vint32m1_t, 0)
DEF_RVV_CONVERT_I_OPS (vint32m2_t, 0)
@@ -533,6 +546,13 @@ DEF_RVV_CONVERT_I_OPS (vint64m2_t, RVV_REQUIRE_ELEN_64)
DEF_RVV_CONVERT_I_OPS (vint64m4_t, RVV_REQUIRE_ELEN_64)
DEF_RVV_CONVERT_I_OPS (vint64m8_t, RVV_REQUIRE_ELEN_64)
+DEF_RVV_CONVERT_U_OPS (vuint16mf4_t, RVV_REQUIRE_MIN_VLEN_64)
+DEF_RVV_CONVERT_U_OPS (vuint16mf2_t, 0)
+DEF_RVV_CONVERT_U_OPS (vuint16m1_t, 0)
+DEF_RVV_CONVERT_U_OPS (vuint16m2_t, 0)
+DEF_RVV_CONVERT_U_OPS (vuint16m4_t, 0)
+DEF_RVV_CONVERT_U_OPS (vuint16m8_t, 0)
+
DEF_RVV_CONVERT_U_OPS (vuint32mf2_t, RVV_REQUIRE_MIN_VLEN_64)
DEF_RVV_CONVERT_U_OPS (vuint32m1_t, 0)
DEF_RVV_CONVERT_U_OPS (vuint32m2_t, 0)
@@ -543,11 +563,23 @@ DEF_RVV_CONVERT_U_OPS (vuint64m2_t, RVV_REQUIRE_ELEN_64)
DEF_RVV_CONVERT_U_OPS (vuint64m4_t, RVV_REQUIRE_ELEN_64)
DEF_RVV_CONVERT_U_OPS (vuint64m8_t, RVV_REQUIRE_ELEN_64)
+DEF_RVV_WCONVERT_I_OPS (vint32mf2_t, RVV_REQUIRE_MIN_VLEN_64)
+DEF_RVV_WCONVERT_I_OPS (vint32m1_t, 0)
+DEF_RVV_WCONVERT_I_OPS (vint32m2_t, 0)
+DEF_RVV_WCONVERT_I_OPS (vint32m4_t, 0)
+DEF_RVV_WCONVERT_I_OPS (vint32m8_t, 0)
+
DEF_RVV_WCONVERT_I_OPS (vint64m1_t, RVV_REQUIRE_ELEN_FP_32 | RVV_REQUIRE_ELEN_64)
DEF_RVV_WCONVERT_I_OPS (vint64m2_t, RVV_REQUIRE_ELEN_FP_32 | RVV_REQUIRE_ELEN_64)
DEF_RVV_WCONVERT_I_OPS (vint64m4_t, RVV_REQUIRE_ELEN_FP_32 | RVV_REQUIRE_ELEN_64)
DEF_RVV_WCONVERT_I_OPS (vint64m8_t, RVV_REQUIRE_ELEN_FP_32 | RVV_REQUIRE_ELEN_64)
+DEF_RVV_WCONVERT_U_OPS (vuint32mf2_t, RVV_REQUIRE_MIN_VLEN_64)
+DEF_RVV_WCONVERT_U_OPS (vuint32m1_t, 0)
+DEF_RVV_WCONVERT_U_OPS (vuint32m2_t, 0)
+DEF_RVV_WCONVERT_U_OPS (vuint32m4_t, 0)
+DEF_RVV_WCONVERT_U_OPS (vuint32m8_t, 0)
+
DEF_RVV_WCONVERT_U_OPS (vuint64m1_t, RVV_REQUIRE_ELEN_FP_32 | RVV_REQUIRE_ELEN_64)
DEF_RVV_WCONVERT_U_OPS (vuint64m2_t, RVV_REQUIRE_ELEN_FP_32 | RVV_REQUIRE_ELEN_64)
DEF_RVV_WCONVERT_U_OPS (vuint64m4_t, RVV_REQUIRE_ELEN_FP_32 | RVV_REQUIRE_ELEN_64)
diff --git a/gcc/config/riscv/vector-iterators.md b/gcc/config/riscv/vector-iterators.md
index 90743ed76c5..e4f2ba90799 100644
--- a/gcc/config/riscv/vector-iterators.md
+++ b/gcc/config/riscv/vector-iterators.md
@@ -296,6 +296,14 @@ (define_mode_iterator VWI_ZVE32 [
])
(define_mode_iterator VF [
+  (VNx1HF "TARGET_VECTOR_ELEN_FP_16 && TARGET_MIN_VLEN < 128")
+  (VNx2HF "TARGET_VECTOR_ELEN_FP_16")
+  (VNx4HF "TARGET_VECTOR_ELEN_FP_16")
+  (VNx8HF "TARGET_VECTOR_ELEN_FP_16")
+  (VNx16HF "TARGET_VECTOR_ELEN_FP_16")
+  (VNx32HF "TARGET_VECTOR_ELEN_FP_16 && TARGET_MIN_VLEN > 32")
+  (VNx64HF "TARGET_VECTOR_ELEN_FP_16 && TARGET_MIN_VLEN >= 128")
+
   (VNx1SF "TARGET_VECTOR_ELEN_FP_32 && TARGET_MIN_VLEN < 128")
   (VNx2SF "TARGET_VECTOR_ELEN_FP_32")
   (VNx4SF "TARGET_VECTOR_ELEN_FP_32")
@@ -496,6 +504,13 @@ (define_mode_iterator VWEXTF [
])
(define_mode_iterator VWCONVERTI [
+  (VNx1SI "TARGET_MIN_VLEN < 128 && TARGET_VECTOR_ELEN_FP_16")
+  (VNx2SI "TARGET_VECTOR_ELEN_FP_16")
+  (VNx4SI "TARGET_VECTOR_ELEN_FP_16")
+  (VNx8SI "TARGET_VECTOR_ELEN_FP_16")
+  (VNx16SI "TARGET_MIN_VLEN > 32 && TARGET_VECTOR_ELEN_FP_16")
+  (VNx32SI "TARGET_MIN_VLEN >= 128 && TARGET_VECTOR_ELEN_FP_16")
+
   (VNx1DI "TARGET_VECTOR_ELEN_64 && TARGET_VECTOR_ELEN_FP_32 && TARGET_MIN_VLEN < 128")
   (VNx2DI "TARGET_VECTOR_ELEN_64 && TARGET_VECTOR_ELEN_FP_32")
   (VNx4DI "TARGET_VECTOR_ELEN_64 && TARGET_VECTOR_ELEN_FP_32")
@@ -1239,17 +1254,21 @@ (define_mode_attr VINDEX_OCT_EXT [
])
(define_mode_attr VCONVERT [
+  (VNx1HF "VNx1HI") (VNx2HF "VNx2HI") (VNx4HF "VNx4HI") (VNx8HF "VNx8HI") (VNx16HF "VNx16HI") (VNx32HF "VNx32HI") (VNx64HF "VNx64HI")
   (VNx1SF "VNx1SI") (VNx2SF "VNx2SI") (VNx4SF "VNx4SI") (VNx8SF "VNx8SI") (VNx16SF "VNx16SI") (VNx32SF "VNx32SI")
   (VNx1DF "VNx1DI") (VNx2DF "VNx2DI") (VNx4DF "VNx4DI") (VNx8DF "VNx8DI") (VNx16DF "VNx16DI")
])
(define_mode_attr vconvert [
+  (VNx1HF "vnx1hi") (VNx2HF "vnx2hi") (VNx4HF "vnx4hi") (VNx8HF "vnx8hi") (VNx16HF "vnx16hi") (VNx32HF "vnx32hi") (VNx64HF "vnx64hi")
   (VNx1SF "vnx1si") (VNx2SF "vnx2si") (VNx4SF "vnx4si") (VNx8SF "vnx8si") (VNx16SF "vnx16si") (VNx32SF "vnx32si")
   (VNx1DF "vnx1di") (VNx2DF "vnx2di") (VNx4DF "vnx4di") (VNx8DF "vnx8di") (VNx16DF "vnx16di")
])
(define_mode_attr VNCONVERT [
+  (VNx1HF "VNx1QI") (VNx2HF "VNx2QI") (VNx4HF "VNx4QI") (VNx8HF "VNx8QI") (VNx16HF "VNx16QI") (VNx32HF "VNx32QI") (VNx64HF "VNx64QI")
   (VNx1SF "VNx1HI") (VNx2SF "VNx2HI") (VNx4SF "VNx4HI") (VNx8SF "VNx8HI") (VNx16SF "VNx16HI") (VNx32SF "VNx32HI")
+  (VNx1SI "VNx1HF") (VNx2SI "VNx2HF") (VNx4SI "VNx4HF") (VNx8SI "VNx8HF") (VNx16SI "VNx16HF") (VNx32SI "VNx32HF")
   (VNx1DI "VNx1SF") (VNx2DI "VNx2SF") (VNx4DI "VNx4SF") (VNx8DI "VNx8SF") (VNx16DI "VNx16SF")
   (VNx1DF "VNx1SI") (VNx2DF "VNx2SI") (VNx4DF "VNx4SI") (VNx8DF "VNx8SI") (VNx16DF "VNx16SI")
])
@@ -1263,6 +1282,7 @@ (define_mode_attr VLMUL1 [
   (VNx8SI "VNx4SI") (VNx16SI "VNx4SI") (VNx32SI "VNx4SI")
   (VNx1DI "VNx2DI") (VNx2DI "VNx2DI")
   (VNx4DI "VNx2DI") (VNx8DI "VNx2DI") (VNx16DI "VNx2DI")
+  (VNx1HF "VNx8HF") (VNx2HF "VNx8HF") (VNx4HF "VNx8HF") (VNx8HF "VNx8HF") (VNx16HF "VNx8HF") (VNx32HF "VNx8HF") (VNx64HF "VNx8HF")
   (VNx1SF "VNx4SF") (VNx2SF "VNx4SF")
   (VNx4SF "VNx4SF") (VNx8SF "VNx4SF") (VNx16SF "VNx4SF") (VNx32SF "VNx4SF")
   (VNx1DF "VNx2DF") (VNx2DF "VNx2DF")
@@ -1333,6 +1353,7 @@ (define_mode_attr vlmul1 [
   (VNx8SI "vnx4si") (VNx16SI "vnx4si") (VNx32SI "vnx4si")
   (VNx1DI "vnx2di") (VNx2DI "vnx2di")
   (VNx4DI "vnx2di") (VNx8DI "vnx2di") (VNx16DI "vnx2di")
+  (VNx1HF "vnx8hf") (VNx2HF "vnx8hf") (VNx4HF "vnx8hf") (VNx8HF "vnx8hf") (VNx16HF "vnx8hf") (VNx32HF "vnx8hf") (VNx64HF "vnx8hf")
   (VNx1SF "vnx4sf") (VNx2SF "vnx4sf")
   (VNx4SF "vnx4sf") (VNx8SF "vnx4sf") (VNx16SF "vnx4sf") (VNx32SF "vnx4sf")
   (VNx1DF "vnx2df") (VNx2DF "vnx2df")
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/zvfh-intrinsic.c b/gcc/testsuite/gcc.target/riscv/rvv/base/zvfh-intrinsic.c
new file mode 100644
index 00000000000..0d244aac9ec
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/zvfh-intrinsic.c
@@ -0,0 +1,418 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv_zvfh -mabi=lp64 -O3" } */
+
+#include "riscv_vector.h"
+
+typedef _Float16 float16_t;
+
+vfloat16mf4_t test_vfadd_vv_f16mf4(vfloat16mf4_t op1, vfloat16mf4_t op2, size_t vl) {
+  return __riscv_vfadd_vv_f16mf4(op1, op2, vl);
+}
+
+vfloat16m8_t test_vfadd_vf_f16m8(vfloat16m8_t op1, float16_t op2, size_t vl) {
+  return __riscv_vfadd_vf_f16m8(op1, op2, vl);
+}
+
+vfloat16mf4_t test_vfsub_vv_f16mf4(vfloat16mf4_t op1, vfloat16mf4_t op2, size_t vl) {
+  return __riscv_vfsub_vv_f16mf4(op1, op2, vl);
+}
+
+vfloat16m8_t test_vfsub_vf_f16m8(vfloat16m8_t op1, float16_t op2, size_t vl) {
+  return __riscv_vfsub_vf_f16m8(op1, op2, vl);
+}
+
+vfloat16mf4_t test_vfrsub_vf_f16mf4(vfloat16mf4_t op1, float16_t op2, size_t vl) {
+  return __riscv_vfrsub_vf_f16mf4(op1, op2, vl);
+}
+
+vfloat16m8_t test_vfrsub_vf_f16m8(vfloat16m8_t op1, float16_t op2, size_t vl) {
+  return __riscv_vfrsub_vf_f16m8(op1, op2, vl);
+}
+
+vfloat32mf2_t test_vfwadd_vv_f32mf2(vfloat16mf4_t op1, vfloat16mf4_t op2, size_t vl) {
+  return __riscv_vfwadd_vv_f32mf2(op1, op2, vl);
+}
+
+vfloat32m8_t test_vfwadd_vv_f32m8(vfloat16m4_t op1, vfloat16m4_t op2, size_t vl) {
+  return __riscv_vfwadd_vv_f32m8(op1, op2, vl);
+}
+
+vfloat32mf2_t test_vfwadd_wv_f32mf2(vfloat32mf2_t op1, vfloat16mf4_t op2, size_t vl) {
+  return __riscv_vfwadd_wv_f32mf2(op1, op2, vl);
+}
+
+vfloat32m8_t test_vfwadd_wv_f32m8(vfloat32m8_t op1, vfloat16m4_t op2, size_t vl) {
+  return __riscv_vfwadd_wv_f32m8(op1, op2, vl);
+}
+
+vfloat32mf2_t test_vfwsub_vv_f32mf2(vfloat16mf4_t op1, vfloat16mf4_t op2, size_t vl) {
+  return __riscv_vfwsub_vv_f32mf2(op1, op2, vl);
+}
+
+vfloat32m8_t test_vfwsub_vv_f32m8(vfloat16m4_t op1, vfloat16m4_t op2, size_t vl) {
+  return __riscv_vfwsub_vv_f32m8(op1, op2, vl);
+}
+
+vfloat32mf2_t test_vfwsub_wv_f32mf2(vfloat32mf2_t op1, vfloat16mf4_t op2, size_t vl) {
+  return __riscv_vfwsub_wv_f32mf2(op1, op2, vl);
+}
+
+vfloat32m8_t test_vfwsub_wv_f32m8(vfloat32m8_t op1, vfloat16m4_t op2, size_t vl) {
+  return __riscv_vfwsub_wv_f32m8(op1, op2, vl);
+}
+
+vfloat16mf4_t test_vfmul_vv_f16mf4(vfloat16mf4_t op1, vfloat16mf4_t op2, size_t vl) {
+  return __riscv_vfmul_vv_f16mf4(op1, op2, vl);
+}
+
+vfloat16m8_t test_vfmul_vf_f16m8(vfloat16m8_t op1, float16_t op2, size_t vl) {
+  return __riscv_vfmul_vf_f16m8(op1, op2, vl);
+}
+
+vfloat16mf4_t test_vfdiv_vv_f16mf4(vfloat16mf4_t op1, vfloat16mf4_t op2, size_t vl) {
+  return __riscv_vfdiv_vv_f16mf4(op1, op2, vl);
+}
+
+vfloat16m8_t test_vfdiv_vf_f16m8(vfloat16m8_t op1, float16_t op2, size_t vl) {
+  return __riscv_vfdiv_vf_f16m8(op1, op2, vl);
+}
+
+vfloat16mf4_t test_vfrdiv_vf_f16mf4(vfloat16mf4_t op1, float16_t op2, size_t vl) {
+  return __riscv_vfrdiv_vf_f16mf4(op1, op2, vl);
+}
+
+vfloat16m8_t test_vfrdiv_vf_f16m8(vfloat16m8_t op1, float16_t op2, size_t vl) {
+  return __riscv_vfrdiv_vf_f16m8(op1, op2, vl);
+}
+
+vfloat32mf2_t test_vfwmul_vv_f32mf2(vfloat16mf4_t op1, vfloat16mf4_t op2, size_t vl) {
+  return __riscv_vfwmul_vv_f32mf2(op1, op2, vl);
+}
+
+vfloat32m8_t test_vfwmul_vf_f32m8(vfloat16m4_t op1, float16_t op2, size_t vl) {
+  return __riscv_vfwmul_vf_f32m8(op1, op2, vl);
+}
+
+vfloat16mf4_t test_vfmacc_vv_f16mf4(vfloat16mf4_t vd, vfloat16mf4_t vs1, vfloat16mf4_t vs2, size_t vl) {
+  return __riscv_vfmacc_vv_f16mf4(vd, vs1, vs2, vl);
+}
+
+vfloat16m8_t test_vfmacc_vf_f16m8(vfloat16m8_t vd, float16_t rs1, vfloat16m8_t vs2, size_t vl) {
+  return __riscv_vfmacc_vf_f16m8(vd, rs1, vs2, vl);
+}
+
+vfloat16mf4_t test_vfnmacc_vv_f16mf4(vfloat16mf4_t vd, vfloat16mf4_t vs1, vfloat16mf4_t vs2, size_t vl) {
+  return __riscv_vfnmacc_vv_f16mf4(vd, vs1, vs2, vl);
+}
+
+vfloat16m8_t test_vfnmacc_vf_f16m8(vfloat16m8_t vd, float16_t rs1, vfloat16m8_t vs2, size_t vl) {
+  return __riscv_vfnmacc_vf_f16m8(vd, rs1, vs2, vl);
+}
+
+vfloat16mf4_t test_vfmsac_vv_f16mf4(vfloat16mf4_t vd, vfloat16mf4_t vs1, vfloat16mf4_t vs2, size_t vl) {
+  return __riscv_vfmsac_vv_f16mf4(vd, vs1, vs2, vl);
+}
+
+vfloat16m8_t test_vfmsac_vf_f16m8(vfloat16m8_t vd, float16_t rs1, vfloat16m8_t vs2, size_t vl) {
+  return __riscv_vfmsac_vf_f16m8(vd, rs1, vs2, vl);
+}
+
+vfloat16mf4_t test_vfnmsac_vv_f16mf4(vfloat16mf4_t vd, vfloat16mf4_t vs1, vfloat16mf4_t vs2, size_t vl) {
+  return __riscv_vfnmsac_vv_f16mf4(vd, vs1, vs2, vl);
+}
+
+vfloat16m8_t test_vfnmsac_vf_f16m8(vfloat16m8_t vd, float16_t rs1, vfloat16m8_t vs2, size_t vl) {
+  return __riscv_vfnmsac_vf_f16m8(vd, rs1, vs2, vl);
+}
+
+vfloat16mf4_t test_vfmadd_vv_f16mf4(vfloat16mf4_t vd, vfloat16mf4_t vs1, vfloat16mf4_t vs2, size_t vl) {
+  return __riscv_vfmadd_vv_f16mf4(vd, vs1, vs2, vl);
+}
+
+vfloat16m8_t test_vfmadd_vf_f16m8(vfloat16m8_t vd, float16_t rs1, vfloat16m8_t vs2, size_t vl) {
+  return __riscv_vfmadd_vf_f16m8(vd, rs1, vs2, vl);
+}
+
+vfloat16mf4_t test_vfnmadd_vv_f16mf4(vfloat16mf4_t vd, vfloat16mf4_t vs1, vfloat16mf4_t vs2, size_t vl) {
+  return __riscv_vfnmadd_vv_f16mf4(vd, vs1, vs2, vl);
+}
+
+vfloat16m8_t test_vfnmadd_vf_f16m8(vfloat16m8_t vd, float16_t rs1, vfloat16m8_t vs2, size_t vl) {
+  return __riscv_vfnmadd_vf_f16m8(vd, rs1, vs2, vl);
+}
+
+vfloat16mf4_t test_vfmsub_vv_f16mf4(vfloat16mf4_t vd, vfloat16mf4_t vs1, vfloat16mf4_t vs2, size_t vl) {
+  return __riscv_vfmsub_vv_f16mf4(vd, vs1, vs2, vl);
+}
+
+vfloat16m8_t test_vfmsub_vf_f16m8(vfloat16m8_t vd, float16_t rs1, vfloat16m8_t vs2, size_t vl) {
+  return __riscv_vfmsub_vf_f16m8(vd, rs1, vs2, vl);
+}
+
+vfloat16mf4_t test_vfnmsub_vv_f16mf4(vfloat16mf4_t vd, vfloat16mf4_t vs1, vfloat16mf4_t vs2, size_t vl) {
+  return __riscv_vfnmsub_vv_f16mf4(vd, vs1, vs2, vl);
+}
+
+vfloat16m8_t test_vfnmsub_vf_f16m8(vfloat16m8_t vd, float16_t rs1, vfloat16m8_t vs2, size_t vl) {
+  return __riscv_vfnmsub_vf_f16m8(vd, rs1, vs2, vl);
+}
+
+vfloat32mf2_t test_vfwmacc_vv_f32mf2(vfloat32mf2_t vd, vfloat16mf4_t vs1, vfloat16mf4_t vs2, size_t vl) {
+  return __riscv_vfwmacc_vv_f32mf2(vd, vs1, vs2, vl);
+}
+
+vfloat32m8_t test_vfwmacc_vf_f32m8(vfloat32m8_t vd, float16_t vs1, vfloat16m4_t vs2, size_t vl) {
+  return __riscv_vfwmacc_vf_f32m8(vd, vs1, vs2, vl);
+}
+
+vfloat32mf2_t test_vfwnmacc_vv_f32mf2(vfloat32mf2_t vd, vfloat16mf4_t vs1, vfloat16mf4_t vs2, size_t vl) {
+  return __riscv_vfwnmacc_vv_f32mf2(vd, vs1, vs2, vl);
+}
+
+vfloat32m8_t test_vfwnmacc_vf_f32m8(vfloat32m8_t vd, float16_t vs1, vfloat16m4_t vs2, size_t vl) {
+  return __riscv_vfwnmacc_vf_f32m8(vd, vs1, vs2, vl);
+}
+
+vfloat32mf2_t test_vfwmsac_vv_f32mf2(vfloat32mf2_t vd, vfloat16mf4_t vs1, vfloat16mf4_t vs2, size_t vl) {
+  return __riscv_vfwmsac_vv_f32mf2(vd, vs1, vs2, vl);
+}
+
+vfloat32m8_t test_vfwmsac_vf_f32m8(vfloat32m8_t vd, float16_t vs1, vfloat16m4_t vs2, size_t vl) {
+  return __riscv_vfwmsac_vf_f32m8(vd, vs1, vs2, vl);
+}
+
+vfloat32mf2_t test_vfwnmsac_vv_f32mf2(vfloat32mf2_t vd, vfloat16mf4_t vs1, vfloat16mf4_t vs2, size_t vl) {
+  return __riscv_vfwnmsac_vv_f32mf2(vd, vs1, vs2, vl);
+}
+
+vfloat32m8_t test_vfwnmsac_vf_f32m8(vfloat32m8_t vd, float16_t vs1, vfloat16m4_t vs2, size_t vl) {
+  return __riscv_vfwnmsac_vf_f32m8(vd, vs1, vs2, vl);
+}
+
+vfloat16mf4_t test_vfsqrt_v_f16mf4(vfloat16mf4_t op1, size_t vl) {
+  return __riscv_vfsqrt_v_f16mf4(op1, vl);
+}
+
+vfloat16m8_t test_vfsqrt_v_f16m8(vfloat16m8_t op1, size_t vl) {
+  return __riscv_vfsqrt_v_f16m8(op1, vl);
+}
+
+vfloat16mf4_t test_vfrsqrt7_v_f16mf4(vfloat16mf4_t op1, size_t vl) {
+  return __riscv_vfrsqrt7_v_f16mf4(op1, vl);
+}
+
+vfloat16m8_t test_vfrsqrt7_v_f16m8(vfloat16m8_t op1, size_t vl) {
+  return __riscv_vfrsqrt7_v_f16m8(op1, vl);
+}
+
+vfloat16mf4_t test_vfrec7_v_f16mf4(vfloat16mf4_t op1, size_t vl) {
+  return __riscv_vfrec7_v_f16mf4(op1, vl);
+}
+
+vfloat16m8_t test_vfrec7_v_f16m8(vfloat16m8_t op1, size_t vl) {
+  return __riscv_vfrec7_v_f16m8(op1, vl);
+}
+
+vfloat16mf4_t test_vfmin_vv_f16mf4(vfloat16mf4_t op1, vfloat16mf4_t op2, size_t vl) {
+  return __riscv_vfmin_vv_f16mf4(op1, op2, vl);
+}
+
+vfloat16m8_t test_vfmin_vf_f16m8(vfloat16m8_t op1, float16_t op2, size_t vl) {
+  return __riscv_vfmin_vf_f16m8(op1, op2, vl);
+}
+
+vfloat16mf4_t test_vfmax_vv_f16mf4(vfloat16mf4_t op1, vfloat16mf4_t op2, size_t vl) {
+  return __riscv_vfmax_vv_f16mf4(op1, op2, vl);
+}
+
+vfloat16m8_t test_vfmax_vf_f16m8(vfloat16m8_t op1, float16_t op2, size_t vl) {
+  return __riscv_vfmax_vf_f16m8(op1, op2, vl);
+}
+
+vfloat16mf4_t test_vfsgnj_vv_f16mf4(vfloat16mf4_t op1, vfloat16mf4_t op2, size_t vl) {
+  return __riscv_vfsgnj_vv_f16mf4(op1, op2, vl);
+}
+
+vfloat16m8_t test_vfsgnj_vf_f16m8(vfloat16m8_t op1, float16_t op2, size_t vl) {
+  return __riscv_vfsgnj_vf_f16m8(op1, op2, vl);
+}
+
+vfloat16mf4_t test_vfsgnjn_vv_f16mf4(vfloat16mf4_t op1, vfloat16mf4_t op2, size_t vl) {
+  return __riscv_vfsgnjn_vv_f16mf4(op1, op2, vl);
+}
+
+vfloat16m8_t test_vfsgnjn_vf_f16m8(vfloat16m8_t op1, float16_t op2, size_t vl) {
+  return __riscv_vfsgnjn_vf_f16m8(op1, op2, vl);
+}
+
+vfloat16mf4_t test_vfsgnjx_vv_f16mf4(vfloat16mf4_t op1, vfloat16mf4_t op2, size_t vl) {
+  return __riscv_vfsgnjx_vv_f16mf4(op1, op2, vl);
+}
+
+vfloat16m8_t test_vfsgnjx_vf_f16m8(vfloat16m8_t op1, float16_t op2, size_t vl) {
+  return __riscv_vfsgnjx_vf_f16m8(op1, op2, vl);
+}
+
+vbool64_t test_vmfeq_vv_f16mf4_b64(vfloat16mf4_t op1, vfloat16mf4_t op2, size_t vl) {
+  return __riscv_vmfeq_vv_f16mf4_b64(op1, op2, vl);
+}
+
+vbool2_t test_vmfeq_vf_f16m8_b2(vfloat16m8_t op1, float16_t op2, size_t vl) {
+  return __riscv_vmfeq_vf_f16m8_b2(op1, op2, vl);
+}
+
+vbool64_t test_vmfne_vv_f16mf4_b64(vfloat16mf4_t op1, vfloat16mf4_t op2, size_t vl) {
+  return __riscv_vmfne_vv_f16mf4_b64(op1, op2, vl);
+}
+
+vbool2_t test_vmfne_vf_f16m8_b2(vfloat16m8_t op1, float16_t op2, size_t vl) {
+  return __riscv_vmfne_vf_f16m8_b2(op1, op2, vl);
+}
+
+vbool64_t test_vmflt_vv_f16mf4_b64(vfloat16mf4_t op1, vfloat16mf4_t op2, size_t vl) {
+  return __riscv_vmflt_vv_f16mf4_b64(op1, op2, vl);
+}
+
+vbool2_t test_vmflt_vf_f16m8_b2(vfloat16m8_t op1, float16_t op2, size_t vl) {
+  return __riscv_vmflt_vf_f16m8_b2(op1, op2, vl);
+}
+
+vbool64_t test_vmfle_vv_f16mf4_b64(vfloat16mf4_t op1, vfloat16mf4_t op2, size_t vl) {
+  return __riscv_vmfle_vv_f16mf4_b64(op1, op2, vl);
+}
+
+vbool2_t test_vmfle_vf_f16m8_b2(vfloat16m8_t op1, float16_t op2, size_t vl) {
+  return __riscv_vmfle_vf_f16m8_b2(op1, op2, vl);
+}
+
+vbool64_t test_vmfgt_vv_f16mf4_b64(vfloat16mf4_t op1, vfloat16mf4_t op2, size_t vl) {
+  return __riscv_vmfgt_vv_f16mf4_b64(op1, op2, vl);
+}
+
+vbool2_t test_vmfgt_vf_f16m8_b2(vfloat16m8_t op1, float16_t op2, size_t vl) {
+  return __riscv_vmfgt_vf_f16m8_b2(op1, op2, vl);
+}
+
+vbool64_t test_vmfge_vv_f16mf4_b64(vfloat16mf4_t op1, vfloat16mf4_t op2, size_t vl) {
+  return __riscv_vmfge_vv_f16mf4_b64(op1, op2, vl);
+}
+
+vbool2_t test_vmfge_vf_f16m8_b2(vfloat16m8_t op1, float16_t op2, size_t vl) {
+  return __riscv_vmfge_vf_f16m8_b2(op1, op2, vl);
+}
+
+vuint16mf4_t test_vfclass_v_u16mf4(vfloat16mf4_t op1, size_t vl) {
+  return __riscv_vfclass_v_u16mf4(op1, vl);
+}
+
+vuint16m8_t test_vfclass_v_u16m8(vfloat16m8_t op1, size_t vl) {
+  return __riscv_vfclass_v_u16m8(op1, vl);
+}
+
+vfloat16mf4_t test_vfmerge_vfm_f16mf4(vfloat16mf4_t op1, float16_t op2, vbool64_t mask, size_t vl) {
+  return __riscv_vfmerge_vfm_f16mf4(op1, op2, mask, vl);
+}
+
+vfloat16m8_t test_vfmerge_vfm_f16m8(vfloat16m8_t op1, float16_t op2, vbool2_t mask, size_t vl) {
+  return __riscv_vfmerge_vfm_f16m8(op1, op2, mask, vl);
+}
+
+vfloat16mf4_t test_vfmv_v_f_f16mf4(float16_t src, size_t vl) {
+  return __riscv_vfmv_v_f_f16mf4(src, vl);
+}
+
+vfloat16m8_t test_vfmv_v_f_f16m8(float16_t src, size_t vl) {
+  return __riscv_vfmv_v_f_f16m8(src, vl);
+}
+
+vint16mf4_t test_vfcvt_x_f_v_i16mf4(vfloat16mf4_t src, size_t vl) {
+  return __riscv_vfcvt_x_f_v_i16mf4(src, vl);
+}
+
+vuint16m8_t test_vfcvt_xu_f_v_u16m8(vfloat16m8_t src, size_t vl) {
+  return __riscv_vfcvt_xu_f_v_u16m8(src, vl);
+}
+
+vfloat16mf4_t test_vfcvt_f_x_v_f16mf4(vint16mf4_t src, size_t vl) {
+  return __riscv_vfcvt_f_x_v_f16mf4(src, vl);
+}
+
+vfloat16m8_t test_vfcvt_f_xu_v_f16m8(vuint16m8_t src, size_t vl) {
+  return __riscv_vfcvt_f_xu_v_f16m8(src, vl);
+}
+
+vint16mf4_t test_vfcvt_rtz_x_f_v_i16mf4(vfloat16mf4_t src, size_t vl) {
+  return __riscv_vfcvt_rtz_x_f_v_i16mf4(src, vl);
+}
+
+vuint16m8_t test_vfcvt_rtz_xu_f_v_u16m8(vfloat16m8_t src, size_t vl) {
+  return __riscv_vfcvt_rtz_xu_f_v_u16m8(src, vl);
+}
+
+vfloat16mf4_t test_vfwcvt_f_x_v_f16mf4(vint8mf8_t src, size_t vl) {
+  return __riscv_vfwcvt_f_x_v_f16mf4(src, vl);
+}
+
+vuint32m8_t test_vfwcvt_xu_f_v_u32m8(vfloat16m4_t src, size_t vl) {
+  return __riscv_vfwcvt_xu_f_v_u32m8(src, vl);
+}
+
+vint8mf8_t test_vfncvt_x_f_w_i8mf8(vfloat16mf4_t src, size_t vl) {
+  return __riscv_vfncvt_x_f_w_i8mf8(src, vl);
+}
+
+vfloat16m4_t test_vfncvt_f_xu_w_f16m4(vuint32m8_t src, size_t vl) {
+  return __riscv_vfncvt_f_xu_w_f16m4(src, vl);
+}
+
+/* { dg-final { scan-assembler-times {vsetvli\s+zero,\s*[a-x0-9]+,\s*e16,\s*mf4,\s*t[au],\s*m[au]} 43 } } */
+/* { dg-final { scan-assembler-times {vsetvli\s+zero,\s*[a-x0-9]+,\s*e16,\s*m4,\s*t[au],\s*m[au]} 11 } } */
+/* { dg-final { scan-assembler-times {vsetvli\s+zero,\s*[a-x0-9]+,\s*e16,\s*m8,\s*t[au],\s*m[au]} 34 } } */
+/* { dg-final { scan-assembler-times {vfadd\.v[fv]\s+v[0-9]+,\s*v[0-9]+,\s*[vfa]+[0-9]+} 2 } } */
+/* { dg-final { scan-assembler-times {vfsub\.v[fv]\s+v[0-9]+,\s*v[0-9]+,\s*[vfa]+[0-9]+} 2 } } */
+/* { dg-final { scan-assembler-times {vfrsub\.vf\s+v[0-9]+,\s*v[0-9]+,\s*[vfa]+[0-9]+} 2 } } */
+/* { dg-final { scan-assembler-times {vfwadd\.[wv]v\s+v[0-9]+,\s*v[0-9]+,\s*[vfa]+[0-9]+} 4 } } */
+/* { dg-final { scan-assembler-times {vfwsub\.[wv]v\s+v[0-9]+,\s*v[0-9]+,\s*[vfa]+[0-9]+} 4 } } */
+/* { dg-final { scan-assembler-times {vfmul\.v[vf]\s+v[0-9]+,\s*v[0-9]+,\s*[vfa]+[0-9]+} 2 } } */
+/* { dg-final { scan-assembler-times {vfdiv\.v[vf]\s+v[0-9]+,\s*v[0-9]+,\s*[vfa]+[0-9]+} 2 } } */
+/* { dg-final { scan-assembler-times {vfrdiv\.vf\s+v[0-9]+,\s*v[0-9]+,\s*[vfa]+[0-9]+} 2 } } */
+/* { dg-final { scan-assembler-times {vfwmul\.v[vf]\s+v[0-9]+,\s*v[0-9]+,\s*[vfa]+[0-9]+} 2 } } */
+/* { dg-final { scan-assembler-times {vfmacc\.v[vf]\s+v[0-9]+,\s*[vfa]+[0-9]+,\s*v[0-9]+} 2 } } */
+/* { dg-final { scan-assembler-times {vfnmacc\.v[vf]\s+v[0-9]+,\s*[vfa]+[0-9]+,\s*v[0-9]+} 2 } } */
+/* { dg-final { scan-assembler-times {vfmsac\.v[vf]\s+v[0-9]+,\s*[vfa]+[0-9]+,\s*v[0-9]+} 2 } } */
+/* { dg-final { scan-assembler-times {vfnmsac\.v[vf]\s+v[0-9]+,\s*[vfa]+[0-9]+,\s*v[0-9]+} 2 } } */
+/* { dg-final { scan-assembler-times {vfmadd\.v[vf]\s+v[0-9]+,\s*[vfa]+[0-9]+,\s*v[0-9]+} 2 } } */
+/* { dg-final { scan-assembler-times {vfnmadd\.v[vf]\s+v[0-9]+,\s*[vfa]+[0-9]+,\s*v[0-9]+} 2 } } */
+/* { dg-final { scan-assembler-times {vfmsub\.v[vf]\s+v[0-9]+,\s*[vfa]+[0-9]+,\s*v[0-9]+} 2 } } */
+/* { dg-final { scan-assembler-times {vfnmsub\.v[vf]\s+v[0-9]+,\s*[vfa]+[0-9]+,\s*v[0-9]+} 2 } } */
+/* { dg-final { scan-assembler-times {vfwmacc\.v[vf]\s+v[0-9]+,\s*[vfa]+[0-9]+,\s*v[0-9]+} 2 } } */
+/* { dg-final { scan-assembler-times {vfwnmacc\.v[vf]\s+v[0-9]+,\s*[vfa]+[0-9]+,\s*v[0-9]+} 2 } } */
+/* { dg-final { scan-assembler-times {vfwmsac\.v[vf]\s+v[0-9]+,\s*[vfa]+[0-9]+,\s*v[0-9]+} 2 } } */
+/* { dg-final { scan-assembler-times {vfwnmsac\.v[vf]\s+v[0-9]+,\s*[vfa]+[0-9]+,\s*v[0-9]+} 2 } } */
+/* { dg-final { scan-assembler-times {vfsqrt\.v\s+v[0-9]+,\s*v[0-9]+} 2 } } */
+/* { dg-final { scan-assembler-times {vfrsqrt7\.v\s+v[0-9]+,\s*v[0-9]+} 2 } } */
+/* { dg-final { scan-assembler-times {vfrec7\.v\s+v[0-9]+,\s*v[0-9]+} 2 } } */
+/* { dg-final { scan-assembler-times {vfmin\.v[vf]\s+v[0-9]+,\s*v[0-9]+,\s*[vfa]+[0-9]+} 2 } } */
+/* { dg-final { scan-assembler-times {vfmax\.v[vf]\s+v[0-9]+,\s*v[0-9]+,\s*[vfa]+[0-9]+} 2 } } */
+/* { dg-final { scan-assembler-times {vfsgnj\.v[vf]\s+v[0-9]+,\s*v[0-9]+,\s*[vfa]+[0-9]+} 2 } } */
+/* { dg-final { scan-assembler-times {vfsgnjn\.v[vf]\s+v[0-9]+,\s*v[0-9]+,\s*[vfa]+[0-9]+} 2 } } */
+/* { dg-final { scan-assembler-times {vfsgnjx\.v[vf]\s+v[0-9]+,\s*v[0-9]+,\s*[vfa]+[0-9]+} 2 } } */
+/* { dg-final { scan-assembler-times {vmfeq\.v[vf]\s+v[0-9]+,\s*v[0-9]+,\s*[vfa]+[0-9]+} 2 } } */
+/* { dg-final { scan-assembler-times {vmfne\.v[vf]\s+v[0-9]+,\s*v[0-9]+,\s*[vfa]+[0-9]+} 2 } } */
+/* { dg-final { scan-assembler-times {vmflt\.v[vf]\s+v[0-9]+,\s*v[0-9]+,\s*[vfa]+[0-9]+} 2 } } */
+/* { dg-final { scan-assembler-times {vmfle\.v[vf]\s+v[0-9]+,\s*v[0-9]+,\s*[vfa]+[0-9]+} 2 } } */
+/* { dg-final { scan-assembler-times {vmfgt\.v[vf]\s+v[0-9]+,\s*v[0-9]+,\s*[vfa]+[0-9]+} 2 } } */
+/* { dg-final { scan-assembler-times {vmfge\.v[vf]\s+v[0-9]+,\s*v[0-9]+,\s*[vfa]+[0-9]+} 2 } } */
+/* { dg-final { scan-assembler-times {vfclass\.v\s+v[0-9]+,\s*v[0-9]+} 2 } } */
+/* { dg-final { scan-assembler-times {vfmerge\.vfm\s+v[0-9]+,\s*v[0-9]+,\s*fa[0-9]+,\s*v0} 2 } } */
+/* { dg-final { scan-assembler-times {vfmv\.v\.f\s+v[0-9]+,\s*fa[0-9]+} 2 } } */
+/* { dg-final { scan-assembler-times {vfcvt\.x\.f\.v\s+v[0-9]+,\s*v[0-9]+} 1 } } */
+/* { dg-final { scan-assembler-times {vfcvt\.xu\.f\.v\s+v[0-9]+,\s*v[0-9]+} 1 } } */
+/* { dg-final { scan-assembler-times {vfcvt\.f\.x\.v\s+v[0-9]+,\s*v[0-9]+} 1 } } */
+/* { dg-final { scan-assembler-times {vfcvt\.f\.x\.v\s+v[0-9]+,\s*v[0-9]+} 1 } } */
+/* { dg-final { scan-assembler-times {vfcvt\.rtz\.x\.f\.v\s+v[0-9]+,\s*v[0-9]+} 1 } } */
+/* { dg-final { scan-assembler-times {vfcvt\.rtz\.xu\.f\.v\s+v[0-9]+,\s*v[0-9]+} 1 } } */
+/* { dg-final { scan-assembler-times {vfwcvt\.f\.x\.v\s+v[0-9]+,\s*v[0-9]+} 1 } } */
+/* { dg-final { scan-assembler-times {vfwcvt\.xu\.f\.v\s+v[0-9]+,\s*v[0-9]+} 1 } } */
+/* { dg-final { scan-assembler-times {vfncvt\.x\.f\.w\s+v[0-9]+,\s*v[0-9]+} 1 } } */
+/* { dg-final { scan-assembler-times {vfncvt\.f\.xu\.w\s+v[0-9]+,\s*v[0-9]+} 1 } } */
Li, Pan2 via Gcc-patches June 5, 2023, 7:37 a.m. UTC | #2
Thanks, make sense, will update V2 for this.

Pan

From: juzhe.zhong@rivai.ai <juzhe.zhong@rivai.ai>
Sent: Monday, June 5, 2023 3:30 PM
To: Li, Pan2 <pan2.li@intel.com>; gcc-patches <gcc-patches@gcc.gnu.org>
Cc: Kito.cheng <kito.cheng@sifive.com>; Li, Pan2 <pan2.li@intel.com>; Wang, Yanzhang <yanzhang.wang@intel.com>
Subject: Re: [PATCH v1] RISC-V: Support RVV FP16 ZVFH floating-point intrinsic API


+DEF_RVV_WEXTF_OPS (vfloat32mf2_t, RVV_REQUIRE_ELEN_FP_32 | RVV_REQUIRE_MIN_VLEN_64)

+DEF_RVV_WEXTF_OPS (vfloat32m1_t, RVV_REQUIRE_ELEN_FP_32)

+DEF_RVV_WEXTF_OPS (vfloat32m2_t, RVV_REQUIRE_ELEN_FP_32)

+DEF_RVV_WEXTF_OPS (vfloat32m4_t, RVV_REQUIRE_ELEN_FP_32)

+DEF_RVV_WEXTF_OPS (vfloat32m8_t, RVV_REQUIRE_ELEN_FP_32)
Is this used in vfwcvt ? convert FP16 -> FP32, if yes, you should add ZVFHMIN or ZVFH require checking.



+DEF_RVV_CONVERT_I_OPS (vint16mf4_t, RVV_REQUIRE_MIN_VLEN_64)

+DEF_RVV_CONVERT_I_OPS (vint16mf2_t, 0)

+DEF_RVV_CONVERT_I_OPS (vint16m1_t, 0)

+DEF_RVV_CONVERT_I_OPS (vint16m2_t, 0)

+DEF_RVV_CONVERT_I_OPS (vint16m4_t, 0)

+DEF_RVV_CONVERT_I_OPS (vint16m8_t, 0)

same



+DEF_RVV_CONVERT_U_OPS (vuint16mf4_t, RVV_REQUIRE_MIN_VLEN_64)

+DEF_RVV_CONVERT_U_OPS (vuint16mf2_t, 0)

+DEF_RVV_CONVERT_U_OPS (vuint16m1_t, 0)

+DEF_RVV_CONVERT_U_OPS (vuint16m2_t, 0)

+DEF_RVV_CONVERT_U_OPS (vuint16m4_t, 0)

+DEF_RVV_CONVERT_U_OPS (vuint16m8_t, 0

same


+DEF_RVV_WCONVERT_I_OPS (vint32mf2_t, RVV_REQUIRE_MIN_VLEN_64)

+DEF_RVV_WCONVERT_I_OPS (vint32m1_t, 0)

+DEF_RVV_WCONVERT_I_OPS (vint32m2_t, 0)

+DEF_RVV_WCONVERT_I_OPS (vint32m4_t, 0)

+DEF_RVV_WCONVERT_I_OPS (vint32m8_t, 0)


same


+DEF_RVV_WCONVERT_U_OPS (vuint32mf2_t, RVV_REQUIRE_MIN_VLEN_64)

+DEF_RVV_WCONVERT_U_OPS (vuint32m1_t, 0)

+DEF_RVV_WCONVERT_U_OPS (vuint32m2_t, 0)

+DEF_RVV_WCONVERT_U_OPS (vuint32m4_t, 0)

+DEF_RVV_WCONVERT_U_OPS (vuint32m8_t, 0)

same



Otherwise, LGTM.
Li, Pan2 via Gcc-patches June 5, 2023, 8:22 a.m. UTC | #3
Updated the PATCH V2 for the ZVFH requirement.

https://gcc.gnu.org/pipermail/gcc-patches/2023-June/620636.html

Pan

From: Li, Pan2
Sent: Monday, June 5, 2023 3:37 PM
To: juzhe.zhong@rivai.ai; gcc-patches <gcc-patches@gcc.gnu.org>
Cc: Kito.cheng <kito.cheng@sifive.com>; Wang, Yanzhang <yanzhang.wang@intel.com>
Subject: RE: [PATCH v1] RISC-V: Support RVV FP16 ZVFH floating-point intrinsic API

Thanks, make sense, will update V2 for this.

Pan

From: juzhe.zhong@rivai.ai<mailto:juzhe.zhong@rivai.ai> <juzhe.zhong@rivai.ai<mailto:juzhe.zhong@rivai.ai>>
Sent: Monday, June 5, 2023 3:30 PM
To: Li, Pan2 <pan2.li@intel.com<mailto:pan2.li@intel.com>>; gcc-patches <gcc-patches@gcc.gnu.org<mailto:gcc-patches@gcc.gnu.org>>
Cc: Kito.cheng <kito.cheng@sifive.com<mailto:kito.cheng@sifive.com>>; Li, Pan2 <pan2.li@intel.com<mailto:pan2.li@intel.com>>; Wang, Yanzhang <yanzhang.wang@intel.com<mailto:yanzhang.wang@intel.com>>
Subject: Re: [PATCH v1] RISC-V: Support RVV FP16 ZVFH floating-point intrinsic API


+DEF_RVV_WEXTF_OPS (vfloat32mf2_t, RVV_REQUIRE_ELEN_FP_32 | RVV_REQUIRE_MIN_VLEN_64)

+DEF_RVV_WEXTF_OPS (vfloat32m1_t, RVV_REQUIRE_ELEN_FP_32)

+DEF_RVV_WEXTF_OPS (vfloat32m2_t, RVV_REQUIRE_ELEN_FP_32)

+DEF_RVV_WEXTF_OPS (vfloat32m4_t, RVV_REQUIRE_ELEN_FP_32)

+DEF_RVV_WEXTF_OPS (vfloat32m8_t, RVV_REQUIRE_ELEN_FP_32)
Is this used in vfwcvt ? convert FP16 -> FP32, if yes, you should add ZVFHMIN or ZVFH require checking.



+DEF_RVV_CONVERT_I_OPS (vint16mf4_t, RVV_REQUIRE_MIN_VLEN_64)

+DEF_RVV_CONVERT_I_OPS (vint16mf2_t, 0)

+DEF_RVV_CONVERT_I_OPS (vint16m1_t, 0)

+DEF_RVV_CONVERT_I_OPS (vint16m2_t, 0)

+DEF_RVV_CONVERT_I_OPS (vint16m4_t, 0)

+DEF_RVV_CONVERT_I_OPS (vint16m8_t, 0)

same



+DEF_RVV_CONVERT_U_OPS (vuint16mf4_t, RVV_REQUIRE_MIN_VLEN_64)

+DEF_RVV_CONVERT_U_OPS (vuint16mf2_t, 0)

+DEF_RVV_CONVERT_U_OPS (vuint16m1_t, 0)

+DEF_RVV_CONVERT_U_OPS (vuint16m2_t, 0)

+DEF_RVV_CONVERT_U_OPS (vuint16m4_t, 0)

+DEF_RVV_CONVERT_U_OPS (vuint16m8_t, 0

same


+DEF_RVV_WCONVERT_I_OPS (vint32mf2_t, RVV_REQUIRE_MIN_VLEN_64)

+DEF_RVV_WCONVERT_I_OPS (vint32m1_t, 0)

+DEF_RVV_WCONVERT_I_OPS (vint32m2_t, 0)

+DEF_RVV_WCONVERT_I_OPS (vint32m4_t, 0)

+DEF_RVV_WCONVERT_I_OPS (vint32m8_t, 0)


same


+DEF_RVV_WCONVERT_U_OPS (vuint32mf2_t, RVV_REQUIRE_MIN_VLEN_64)

+DEF_RVV_WCONVERT_U_OPS (vuint32m1_t, 0)

+DEF_RVV_WCONVERT_U_OPS (vuint32m2_t, 0)

+DEF_RVV_WCONVERT_U_OPS (vuint32m4_t, 0)

+DEF_RVV_WCONVERT_U_OPS (vuint32m8_t, 0)

same



Otherwise, LGTM.
diff mbox series

Patch

diff --git a/gcc/config/riscv/riscv-vector-builtins-types.def b/gcc/config/riscv/riscv-vector-builtins-types.def
index 9cb3aca992e..348aa05dd91 100644
--- a/gcc/config/riscv/riscv-vector-builtins-types.def
+++ b/gcc/config/riscv/riscv-vector-builtins-types.def
@@ -518,11 +518,24 @@  DEF_RVV_FULL_V_U_OPS (vuint64m2_t, RVV_REQUIRE_FULL_V)
 DEF_RVV_FULL_V_U_OPS (vuint64m4_t, RVV_REQUIRE_FULL_V)
 DEF_RVV_FULL_V_U_OPS (vuint64m8_t, RVV_REQUIRE_FULL_V)
 
+DEF_RVV_WEXTF_OPS (vfloat32mf2_t, RVV_REQUIRE_ELEN_FP_32 | RVV_REQUIRE_MIN_VLEN_64)
+DEF_RVV_WEXTF_OPS (vfloat32m1_t, RVV_REQUIRE_ELEN_FP_32)
+DEF_RVV_WEXTF_OPS (vfloat32m2_t, RVV_REQUIRE_ELEN_FP_32)
+DEF_RVV_WEXTF_OPS (vfloat32m4_t, RVV_REQUIRE_ELEN_FP_32)
+DEF_RVV_WEXTF_OPS (vfloat32m8_t, RVV_REQUIRE_ELEN_FP_32)
+
 DEF_RVV_WEXTF_OPS (vfloat64m1_t, RVV_REQUIRE_ELEN_FP_64)
 DEF_RVV_WEXTF_OPS (vfloat64m2_t, RVV_REQUIRE_ELEN_FP_64)
 DEF_RVV_WEXTF_OPS (vfloat64m4_t, RVV_REQUIRE_ELEN_FP_64)
 DEF_RVV_WEXTF_OPS (vfloat64m8_t, RVV_REQUIRE_ELEN_FP_64)
 
+DEF_RVV_CONVERT_I_OPS (vint16mf4_t, RVV_REQUIRE_MIN_VLEN_64)
+DEF_RVV_CONVERT_I_OPS (vint16mf2_t, 0)
+DEF_RVV_CONVERT_I_OPS (vint16m1_t, 0)
+DEF_RVV_CONVERT_I_OPS (vint16m2_t, 0)
+DEF_RVV_CONVERT_I_OPS (vint16m4_t, 0)
+DEF_RVV_CONVERT_I_OPS (vint16m8_t, 0)
+
 DEF_RVV_CONVERT_I_OPS (vint32mf2_t, RVV_REQUIRE_MIN_VLEN_64)
 DEF_RVV_CONVERT_I_OPS (vint32m1_t, 0)
 DEF_RVV_CONVERT_I_OPS (vint32m2_t, 0)
@@ -533,6 +546,13 @@  DEF_RVV_CONVERT_I_OPS (vint64m2_t, RVV_REQUIRE_ELEN_64)
 DEF_RVV_CONVERT_I_OPS (vint64m4_t, RVV_REQUIRE_ELEN_64)
 DEF_RVV_CONVERT_I_OPS (vint64m8_t, RVV_REQUIRE_ELEN_64)
 
+DEF_RVV_CONVERT_U_OPS (vuint16mf4_t, RVV_REQUIRE_MIN_VLEN_64)
+DEF_RVV_CONVERT_U_OPS (vuint16mf2_t, 0)
+DEF_RVV_CONVERT_U_OPS (vuint16m1_t, 0)
+DEF_RVV_CONVERT_U_OPS (vuint16m2_t, 0)
+DEF_RVV_CONVERT_U_OPS (vuint16m4_t, 0)
+DEF_RVV_CONVERT_U_OPS (vuint16m8_t, 0)
+
 DEF_RVV_CONVERT_U_OPS (vuint32mf2_t, RVV_REQUIRE_MIN_VLEN_64)
 DEF_RVV_CONVERT_U_OPS (vuint32m1_t, 0)
 DEF_RVV_CONVERT_U_OPS (vuint32m2_t, 0)
@@ -543,11 +563,23 @@  DEF_RVV_CONVERT_U_OPS (vuint64m2_t, RVV_REQUIRE_ELEN_64)
 DEF_RVV_CONVERT_U_OPS (vuint64m4_t, RVV_REQUIRE_ELEN_64)
 DEF_RVV_CONVERT_U_OPS (vuint64m8_t, RVV_REQUIRE_ELEN_64)
 
+DEF_RVV_WCONVERT_I_OPS (vint32mf2_t, RVV_REQUIRE_MIN_VLEN_64)
+DEF_RVV_WCONVERT_I_OPS (vint32m1_t, 0)
+DEF_RVV_WCONVERT_I_OPS (vint32m2_t, 0)
+DEF_RVV_WCONVERT_I_OPS (vint32m4_t, 0)
+DEF_RVV_WCONVERT_I_OPS (vint32m8_t, 0)
+
 DEF_RVV_WCONVERT_I_OPS (vint64m1_t, RVV_REQUIRE_ELEN_FP_32 | RVV_REQUIRE_ELEN_64)
 DEF_RVV_WCONVERT_I_OPS (vint64m2_t, RVV_REQUIRE_ELEN_FP_32 | RVV_REQUIRE_ELEN_64)
 DEF_RVV_WCONVERT_I_OPS (vint64m4_t, RVV_REQUIRE_ELEN_FP_32 | RVV_REQUIRE_ELEN_64)
 DEF_RVV_WCONVERT_I_OPS (vint64m8_t, RVV_REQUIRE_ELEN_FP_32 | RVV_REQUIRE_ELEN_64)
 
+DEF_RVV_WCONVERT_U_OPS (vuint32mf2_t, RVV_REQUIRE_MIN_VLEN_64)
+DEF_RVV_WCONVERT_U_OPS (vuint32m1_t, 0)
+DEF_RVV_WCONVERT_U_OPS (vuint32m2_t, 0)
+DEF_RVV_WCONVERT_U_OPS (vuint32m4_t, 0)
+DEF_RVV_WCONVERT_U_OPS (vuint32m8_t, 0)
+
 DEF_RVV_WCONVERT_U_OPS (vuint64m1_t, RVV_REQUIRE_ELEN_FP_32 | RVV_REQUIRE_ELEN_64)
 DEF_RVV_WCONVERT_U_OPS (vuint64m2_t, RVV_REQUIRE_ELEN_FP_32 | RVV_REQUIRE_ELEN_64)
 DEF_RVV_WCONVERT_U_OPS (vuint64m4_t, RVV_REQUIRE_ELEN_FP_32 | RVV_REQUIRE_ELEN_64)
diff --git a/gcc/config/riscv/vector-iterators.md b/gcc/config/riscv/vector-iterators.md
index 90743ed76c5..e4f2ba90799 100644
--- a/gcc/config/riscv/vector-iterators.md
+++ b/gcc/config/riscv/vector-iterators.md
@@ -296,6 +296,14 @@  (define_mode_iterator VWI_ZVE32 [
 ])
 
 (define_mode_iterator VF [
+  (VNx1HF "TARGET_VECTOR_ELEN_FP_16 && TARGET_MIN_VLEN < 128")
+  (VNx2HF "TARGET_VECTOR_ELEN_FP_16")
+  (VNx4HF "TARGET_VECTOR_ELEN_FP_16")
+  (VNx8HF "TARGET_VECTOR_ELEN_FP_16")
+  (VNx16HF "TARGET_VECTOR_ELEN_FP_16")
+  (VNx32HF "TARGET_VECTOR_ELEN_FP_16 && TARGET_MIN_VLEN > 32")
+  (VNx64HF "TARGET_VECTOR_ELEN_FP_16 && TARGET_MIN_VLEN >= 128")
+
   (VNx1SF "TARGET_VECTOR_ELEN_FP_32 && TARGET_MIN_VLEN < 128")
   (VNx2SF "TARGET_VECTOR_ELEN_FP_32")
   (VNx4SF "TARGET_VECTOR_ELEN_FP_32")
@@ -496,6 +504,13 @@  (define_mode_iterator VWEXTF [
 ])
 
 (define_mode_iterator VWCONVERTI [
+  (VNx1SI "TARGET_MIN_VLEN < 128 && TARGET_VECTOR_ELEN_FP_16")
+  (VNx2SI "TARGET_VECTOR_ELEN_FP_16")
+  (VNx4SI "TARGET_VECTOR_ELEN_FP_16")
+  (VNx8SI "TARGET_VECTOR_ELEN_FP_16")
+  (VNx16SI "TARGET_MIN_VLEN > 32 && TARGET_VECTOR_ELEN_FP_16")
+  (VNx32SI "TARGET_MIN_VLEN >= 128 && TARGET_VECTOR_ELEN_FP_16")
+
   (VNx1DI "TARGET_VECTOR_ELEN_64 && TARGET_VECTOR_ELEN_FP_32 && TARGET_MIN_VLEN < 128")
   (VNx2DI "TARGET_VECTOR_ELEN_64 && TARGET_VECTOR_ELEN_FP_32")
   (VNx4DI "TARGET_VECTOR_ELEN_64 && TARGET_VECTOR_ELEN_FP_32")
@@ -1239,17 +1254,21 @@  (define_mode_attr VINDEX_OCT_EXT [
 ])
 
 (define_mode_attr VCONVERT [
+  (VNx1HF "VNx1HI") (VNx2HF "VNx2HI") (VNx4HF "VNx4HI") (VNx8HF "VNx8HI") (VNx16HF "VNx16HI") (VNx32HF "VNx32HI") (VNx64HF "VNx64HI")
   (VNx1SF "VNx1SI") (VNx2SF "VNx2SI") (VNx4SF "VNx4SI") (VNx8SF "VNx8SI") (VNx16SF "VNx16SI") (VNx32SF "VNx32SI")
   (VNx1DF "VNx1DI") (VNx2DF "VNx2DI") (VNx4DF "VNx4DI") (VNx8DF "VNx8DI") (VNx16DF "VNx16DI")
 ])
 
 (define_mode_attr vconvert [
+  (VNx1HF "vnx1hi") (VNx2HF "vnx2hi") (VNx4HF "vnx4hi") (VNx8HF "vnx8hi") (VNx16HF "vnx16hi") (VNx32HF "vnx32hi") (VNx64HF "vnx64hi")
   (VNx1SF "vnx1si") (VNx2SF "vnx2si") (VNx4SF "vnx4si") (VNx8SF "vnx8si") (VNx16SF "vnx16si") (VNx32SF "vnx32si")
   (VNx1DF "vnx1di") (VNx2DF "vnx2di") (VNx4DF "vnx4di") (VNx8DF "vnx8di") (VNx16DF "vnx16di")
 ])
 
 (define_mode_attr VNCONVERT [
+  (VNx1HF "VNx1QI") (VNx2HF "VNx2QI") (VNx4HF "VNx4QI") (VNx8HF "VNx8QI") (VNx16HF "VNx16QI") (VNx32HF "VNx32QI") (VNx64HF "VNx64QI")
   (VNx1SF "VNx1HI") (VNx2SF "VNx2HI") (VNx4SF "VNx4HI") (VNx8SF "VNx8HI") (VNx16SF "VNx16HI") (VNx32SF "VNx32HI")
+  (VNx1SI "VNx1HF") (VNx2SI "VNx2HF") (VNx4SI "VNx4HF") (VNx8SI "VNx8HF") (VNx16SI "VNx16HF") (VNx32SI "VNx32HF")
   (VNx1DI "VNx1SF") (VNx2DI "VNx2SF") (VNx4DI "VNx4SF") (VNx8DI "VNx8SF") (VNx16DI "VNx16SF")
   (VNx1DF "VNx1SI") (VNx2DF "VNx2SI") (VNx4DF "VNx4SI") (VNx8DF "VNx8SI") (VNx16DF "VNx16SI")
 ])
@@ -1263,6 +1282,7 @@  (define_mode_attr VLMUL1 [
   (VNx8SI "VNx4SI") (VNx16SI "VNx4SI") (VNx32SI "VNx4SI")
   (VNx1DI "VNx2DI") (VNx2DI "VNx2DI")
   (VNx4DI "VNx2DI") (VNx8DI "VNx2DI") (VNx16DI "VNx2DI")
+  (VNx1HF "VNx8HF") (VNx2HF "VNx8HF") (VNx4HF "VNx8HF") (VNx8HF "VNx8HF") (VNx16HF "VNx8HF") (VNx32HF "VNx8HF") (VNx64HF "VNx8HF")
   (VNx1SF "VNx4SF") (VNx2SF "VNx4SF")
   (VNx4SF "VNx4SF") (VNx8SF "VNx4SF") (VNx16SF "VNx4SF") (VNx32SF "VNx4SF")
   (VNx1DF "VNx2DF") (VNx2DF "VNx2DF")
@@ -1333,6 +1353,7 @@  (define_mode_attr vlmul1 [
   (VNx8SI "vnx4si") (VNx16SI "vnx4si") (VNx32SI "vnx4si")
   (VNx1DI "vnx2di") (VNx2DI "vnx2di")
   (VNx4DI "vnx2di") (VNx8DI "vnx2di") (VNx16DI "vnx2di")
+  (VNx1HF "vnx8hf") (VNx2HF "vnx8hf") (VNx4HF "vnx8hf") (VNx8HF "vnx8hf") (VNx16HF "vnx8hf") (VNx32HF "vnx8hf") (VNx64HF "vnx8hf")
   (VNx1SF "vnx4sf") (VNx2SF "vnx4sf")
   (VNx4SF "vnx4sf") (VNx8SF "vnx4sf") (VNx16SF "vnx4sf") (VNx32SF "vnx4sf")
   (VNx1DF "vnx2df") (VNx2DF "vnx2df")
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/zvfh-intrinsic.c b/gcc/testsuite/gcc.target/riscv/rvv/base/zvfh-intrinsic.c
new file mode 100644
index 00000000000..0d244aac9ec
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/zvfh-intrinsic.c
@@ -0,0 +1,418 @@ 
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv_zvfh -mabi=lp64 -O3" } */
+
+#include "riscv_vector.h"
+
+typedef _Float16 float16_t;
+
+vfloat16mf4_t test_vfadd_vv_f16mf4(vfloat16mf4_t op1, vfloat16mf4_t op2, size_t vl) {
+  return __riscv_vfadd_vv_f16mf4(op1, op2, vl);
+}
+
+vfloat16m8_t test_vfadd_vf_f16m8(vfloat16m8_t op1, float16_t op2, size_t vl) {
+  return __riscv_vfadd_vf_f16m8(op1, op2, vl);
+}
+
+vfloat16mf4_t test_vfsub_vv_f16mf4(vfloat16mf4_t op1, vfloat16mf4_t op2, size_t vl) {
+  return __riscv_vfsub_vv_f16mf4(op1, op2, vl);
+}
+
+vfloat16m8_t test_vfsub_vf_f16m8(vfloat16m8_t op1, float16_t op2, size_t vl) {
+  return __riscv_vfsub_vf_f16m8(op1, op2, vl);
+}
+
+vfloat16mf4_t test_vfrsub_vf_f16mf4(vfloat16mf4_t op1, float16_t op2, size_t vl) {
+  return __riscv_vfrsub_vf_f16mf4(op1, op2, vl);
+}
+
+vfloat16m8_t test_vfrsub_vf_f16m8(vfloat16m8_t op1, float16_t op2, size_t vl) {
+  return __riscv_vfrsub_vf_f16m8(op1, op2, vl);
+}
+
+vfloat32mf2_t test_vfwadd_vv_f32mf2(vfloat16mf4_t op1, vfloat16mf4_t op2, size_t vl) {
+  return __riscv_vfwadd_vv_f32mf2(op1, op2, vl);
+}
+
+vfloat32m8_t test_vfwadd_vv_f32m8(vfloat16m4_t op1, vfloat16m4_t op2, size_t vl) {
+  return __riscv_vfwadd_vv_f32m8(op1, op2, vl);
+}
+
+vfloat32mf2_t test_vfwadd_wv_f32mf2(vfloat32mf2_t op1, vfloat16mf4_t op2, size_t vl) {
+  return __riscv_vfwadd_wv_f32mf2(op1, op2, vl);
+}
+
+vfloat32m8_t test_vfwadd_wv_f32m8(vfloat32m8_t op1, vfloat16m4_t op2, size_t vl) {
+  return __riscv_vfwadd_wv_f32m8(op1, op2, vl);
+}
+
+vfloat32mf2_t test_vfwsub_vv_f32mf2(vfloat16mf4_t op1, vfloat16mf4_t op2, size_t vl) {
+  return __riscv_vfwsub_vv_f32mf2(op1, op2, vl);
+}
+
+vfloat32m8_t test_vfwsub_vv_f32m8(vfloat16m4_t op1, vfloat16m4_t op2, size_t vl) {
+  return __riscv_vfwsub_vv_f32m8(op1, op2, vl);
+}
+
+vfloat32mf2_t test_vfwsub_wv_f32mf2(vfloat32mf2_t op1, vfloat16mf4_t op2, size_t vl) {
+  return __riscv_vfwsub_wv_f32mf2(op1, op2, vl);
+}
+
+vfloat32m8_t test_vfwsub_wv_f32m8(vfloat32m8_t op1, vfloat16m4_t op2, size_t vl) {
+  return __riscv_vfwsub_wv_f32m8(op1, op2, vl);
+}
+
+vfloat16mf4_t test_vfmul_vv_f16mf4(vfloat16mf4_t op1, vfloat16mf4_t op2, size_t vl) {
+  return __riscv_vfmul_vv_f16mf4(op1, op2, vl);
+}
+
+vfloat16m8_t test_vfmul_vf_f16m8(vfloat16m8_t op1, float16_t op2, size_t vl) {
+  return __riscv_vfmul_vf_f16m8(op1, op2, vl);
+}
+
+vfloat16mf4_t test_vfdiv_vv_f16mf4(vfloat16mf4_t op1, vfloat16mf4_t op2, size_t vl) {
+  return __riscv_vfdiv_vv_f16mf4(op1, op2, vl);
+}
+
+vfloat16m8_t test_vfdiv_vf_f16m8(vfloat16m8_t op1, float16_t op2, size_t vl) {
+  return __riscv_vfdiv_vf_f16m8(op1, op2, vl);
+}
+
+vfloat16mf4_t test_vfrdiv_vf_f16mf4(vfloat16mf4_t op1, float16_t op2, size_t vl) {
+  return __riscv_vfrdiv_vf_f16mf4(op1, op2, vl);
+}
+
+vfloat16m8_t test_vfrdiv_vf_f16m8(vfloat16m8_t op1, float16_t op2, size_t vl) {
+  return __riscv_vfrdiv_vf_f16m8(op1, op2, vl);
+}
+
+vfloat32mf2_t test_vfwmul_vv_f32mf2(vfloat16mf4_t op1, vfloat16mf4_t op2, size_t vl) {
+  return __riscv_vfwmul_vv_f32mf2(op1, op2, vl);
+}
+
+vfloat32m8_t test_vfwmul_vf_f32m8(vfloat16m4_t op1, float16_t op2, size_t vl) {
+  return __riscv_vfwmul_vf_f32m8(op1, op2, vl);
+}
+
+vfloat16mf4_t test_vfmacc_vv_f16mf4(vfloat16mf4_t vd, vfloat16mf4_t vs1, vfloat16mf4_t vs2, size_t vl) {
+  return __riscv_vfmacc_vv_f16mf4(vd, vs1, vs2, vl);
+}
+
+vfloat16m8_t test_vfmacc_vf_f16m8(vfloat16m8_t vd, float16_t rs1, vfloat16m8_t vs2, size_t vl) {
+  return __riscv_vfmacc_vf_f16m8(vd, rs1, vs2, vl);
+}
+
+vfloat16mf4_t test_vfnmacc_vv_f16mf4(vfloat16mf4_t vd, vfloat16mf4_t vs1, vfloat16mf4_t vs2, size_t vl) {
+  return __riscv_vfnmacc_vv_f16mf4(vd, vs1, vs2, vl);
+}
+
+vfloat16m8_t test_vfnmacc_vf_f16m8(vfloat16m8_t vd, float16_t rs1, vfloat16m8_t vs2, size_t vl) {
+  return __riscv_vfnmacc_vf_f16m8(vd, rs1, vs2, vl);
+}
+
+vfloat16mf4_t test_vfmsac_vv_f16mf4(vfloat16mf4_t vd, vfloat16mf4_t vs1, vfloat16mf4_t vs2, size_t vl) {
+  return __riscv_vfmsac_vv_f16mf4(vd, vs1, vs2, vl);
+}
+
+vfloat16m8_t test_vfmsac_vf_f16m8(vfloat16m8_t vd, float16_t rs1, vfloat16m8_t vs2, size_t vl) {
+  return __riscv_vfmsac_vf_f16m8(vd, rs1, vs2, vl);
+}
+
+vfloat16mf4_t test_vfnmsac_vv_f16mf4(vfloat16mf4_t vd, vfloat16mf4_t vs1, vfloat16mf4_t vs2, size_t vl) {
+  return __riscv_vfnmsac_vv_f16mf4(vd, vs1, vs2, vl);
+}
+
+vfloat16m8_t test_vfnmsac_vf_f16m8(vfloat16m8_t vd, float16_t rs1, vfloat16m8_t vs2, size_t vl) {
+  return __riscv_vfnmsac_vf_f16m8(vd, rs1, vs2, vl);
+}
+
+vfloat16mf4_t test_vfmadd_vv_f16mf4(vfloat16mf4_t vd, vfloat16mf4_t vs1, vfloat16mf4_t vs2, size_t vl) {
+  return __riscv_vfmadd_vv_f16mf4(vd, vs1, vs2, vl);
+}
+
+vfloat16m8_t test_vfmadd_vf_f16m8(vfloat16m8_t vd, float16_t rs1, vfloat16m8_t vs2, size_t vl) {
+  return __riscv_vfmadd_vf_f16m8(vd, rs1, vs2, vl);
+}
+
+vfloat16mf4_t test_vfnmadd_vv_f16mf4(vfloat16mf4_t vd, vfloat16mf4_t vs1, vfloat16mf4_t vs2, size_t vl) {
+  return __riscv_vfnmadd_vv_f16mf4(vd, vs1, vs2, vl);
+}
+
+vfloat16m8_t test_vfnmadd_vf_f16m8(vfloat16m8_t vd, float16_t rs1, vfloat16m8_t vs2, size_t vl) {
+  return __riscv_vfnmadd_vf_f16m8(vd, rs1, vs2, vl);
+}
+
+vfloat16mf4_t test_vfmsub_vv_f16mf4(vfloat16mf4_t vd, vfloat16mf4_t vs1, vfloat16mf4_t vs2, size_t vl) {
+  return __riscv_vfmsub_vv_f16mf4(vd, vs1, vs2, vl);
+}
+
+vfloat16m8_t test_vfmsub_vf_f16m8(vfloat16m8_t vd, float16_t rs1, vfloat16m8_t vs2, size_t vl) {
+  return __riscv_vfmsub_vf_f16m8(vd, rs1, vs2, vl);
+}
+
+vfloat16mf4_t test_vfnmsub_vv_f16mf4(vfloat16mf4_t vd, vfloat16mf4_t vs1, vfloat16mf4_t vs2, size_t vl) {
+  return __riscv_vfnmsub_vv_f16mf4(vd, vs1, vs2, vl);
+}
+
+vfloat16m8_t test_vfnmsub_vf_f16m8(vfloat16m8_t vd, float16_t rs1, vfloat16m8_t vs2, size_t vl) {
+  return __riscv_vfnmsub_vf_f16m8(vd, rs1, vs2, vl);
+}
+
+vfloat32mf2_t test_vfwmacc_vv_f32mf2(vfloat32mf2_t vd, vfloat16mf4_t vs1, vfloat16mf4_t vs2, size_t vl) {
+  return __riscv_vfwmacc_vv_f32mf2(vd, vs1, vs2, vl);
+}
+
+vfloat32m8_t test_vfwmacc_vf_f32m8(vfloat32m8_t vd, float16_t vs1, vfloat16m4_t vs2, size_t vl) {
+  return __riscv_vfwmacc_vf_f32m8(vd, vs1, vs2, vl);
+}
+
+vfloat32mf2_t test_vfwnmacc_vv_f32mf2(vfloat32mf2_t vd, vfloat16mf4_t vs1, vfloat16mf4_t vs2, size_t vl) {
+  return __riscv_vfwnmacc_vv_f32mf2(vd, vs1, vs2, vl);
+}
+
+vfloat32m8_t test_vfwnmacc_vf_f32m8(vfloat32m8_t vd, float16_t vs1, vfloat16m4_t vs2, size_t vl) {
+  return __riscv_vfwnmacc_vf_f32m8(vd, vs1, vs2, vl);
+}
+
+vfloat32mf2_t test_vfwmsac_vv_f32mf2(vfloat32mf2_t vd, vfloat16mf4_t vs1, vfloat16mf4_t vs2, size_t vl) {
+  return __riscv_vfwmsac_vv_f32mf2(vd, vs1, vs2, vl);
+}
+
+vfloat32m8_t test_vfwmsac_vf_f32m8(vfloat32m8_t vd, float16_t vs1, vfloat16m4_t vs2, size_t vl) {
+  return __riscv_vfwmsac_vf_f32m8(vd, vs1, vs2, vl);
+}
+
+vfloat32mf2_t test_vfwnmsac_vv_f32mf2(vfloat32mf2_t vd, vfloat16mf4_t vs1, vfloat16mf4_t vs2, size_t vl) {
+  return __riscv_vfwnmsac_vv_f32mf2(vd, vs1, vs2, vl);
+}
+
+vfloat32m8_t test_vfwnmsac_vf_f32m8(vfloat32m8_t vd, float16_t vs1, vfloat16m4_t vs2, size_t vl) {
+  return __riscv_vfwnmsac_vf_f32m8(vd, vs1, vs2, vl);
+}
+
+vfloat16mf4_t test_vfsqrt_v_f16mf4(vfloat16mf4_t op1, size_t vl) {
+  return __riscv_vfsqrt_v_f16mf4(op1, vl);
+}
+
+vfloat16m8_t test_vfsqrt_v_f16m8(vfloat16m8_t op1, size_t vl) {
+  return __riscv_vfsqrt_v_f16m8(op1, vl);
+}
+
+vfloat16mf4_t test_vfrsqrt7_v_f16mf4(vfloat16mf4_t op1, size_t vl) {
+  return __riscv_vfrsqrt7_v_f16mf4(op1, vl);
+}
+
+vfloat16m8_t test_vfrsqrt7_v_f16m8(vfloat16m8_t op1, size_t vl) {
+  return __riscv_vfrsqrt7_v_f16m8(op1, vl);
+}
+
+vfloat16mf4_t test_vfrec7_v_f16mf4(vfloat16mf4_t op1, size_t vl) {
+  return __riscv_vfrec7_v_f16mf4(op1, vl);
+}
+
+vfloat16m8_t test_vfrec7_v_f16m8(vfloat16m8_t op1, size_t vl) {
+  return __riscv_vfrec7_v_f16m8(op1, vl);
+}
+
+vfloat16mf4_t test_vfmin_vv_f16mf4(vfloat16mf4_t op1, vfloat16mf4_t op2, size_t vl) {
+  return __riscv_vfmin_vv_f16mf4(op1, op2, vl);
+}
+
+vfloat16m8_t test_vfmin_vf_f16m8(vfloat16m8_t op1, float16_t op2, size_t vl) {
+  return __riscv_vfmin_vf_f16m8(op1, op2, vl);
+}
+
+vfloat16mf4_t test_vfmax_vv_f16mf4(vfloat16mf4_t op1, vfloat16mf4_t op2, size_t vl) {
+  return __riscv_vfmax_vv_f16mf4(op1, op2, vl);
+}
+
+vfloat16m8_t test_vfmax_vf_f16m8(vfloat16m8_t op1, float16_t op2, size_t vl) {
+  return __riscv_vfmax_vf_f16m8(op1, op2, vl);
+}
+
+vfloat16mf4_t test_vfsgnj_vv_f16mf4(vfloat16mf4_t op1, vfloat16mf4_t op2, size_t vl) {
+  return __riscv_vfsgnj_vv_f16mf4(op1, op2, vl);
+}
+
+vfloat16m8_t test_vfsgnj_vf_f16m8(vfloat16m8_t op1, float16_t op2, size_t vl) {
+  return __riscv_vfsgnj_vf_f16m8(op1, op2, vl);
+}
+
+vfloat16mf4_t test_vfsgnjn_vv_f16mf4(vfloat16mf4_t op1, vfloat16mf4_t op2, size_t vl) {
+  return __riscv_vfsgnjn_vv_f16mf4(op1, op2, vl);
+}
+
+vfloat16m8_t test_vfsgnjn_vf_f16m8(vfloat16m8_t op1, float16_t op2, size_t vl) {
+  return __riscv_vfsgnjn_vf_f16m8(op1, op2, vl);
+}
+
+vfloat16mf4_t test_vfsgnjx_vv_f16mf4(vfloat16mf4_t op1, vfloat16mf4_t op2, size_t vl) {
+  return __riscv_vfsgnjx_vv_f16mf4(op1, op2, vl);
+}
+
+vfloat16m8_t test_vfsgnjx_vf_f16m8(vfloat16m8_t op1, float16_t op2, size_t vl) {
+  return __riscv_vfsgnjx_vf_f16m8(op1, op2, vl);
+}
+
+vbool64_t test_vmfeq_vv_f16mf4_b64(vfloat16mf4_t op1, vfloat16mf4_t op2, size_t vl) {
+  return __riscv_vmfeq_vv_f16mf4_b64(op1, op2, vl);
+}
+
+vbool2_t test_vmfeq_vf_f16m8_b2(vfloat16m8_t op1, float16_t op2, size_t vl) {
+  return __riscv_vmfeq_vf_f16m8_b2(op1, op2, vl);
+}
+
+vbool64_t test_vmfne_vv_f16mf4_b64(vfloat16mf4_t op1, vfloat16mf4_t op2, size_t vl) {
+  return __riscv_vmfne_vv_f16mf4_b64(op1, op2, vl);
+}
+
+vbool2_t test_vmfne_vf_f16m8_b2(vfloat16m8_t op1, float16_t op2, size_t vl) {
+  return __riscv_vmfne_vf_f16m8_b2(op1, op2, vl);
+}
+
+vbool64_t test_vmflt_vv_f16mf4_b64(vfloat16mf4_t op1, vfloat16mf4_t op2, size_t vl) {
+  return __riscv_vmflt_vv_f16mf4_b64(op1, op2, vl);
+}
+
+vbool2_t test_vmflt_vf_f16m8_b2(vfloat16m8_t op1, float16_t op2, size_t vl) {
+  return __riscv_vmflt_vf_f16m8_b2(op1, op2, vl);
+}
+
+vbool64_t test_vmfle_vv_f16mf4_b64(vfloat16mf4_t op1, vfloat16mf4_t op2, size_t vl) {
+  return __riscv_vmfle_vv_f16mf4_b64(op1, op2, vl);
+}
+
+vbool2_t test_vmfle_vf_f16m8_b2(vfloat16m8_t op1, float16_t op2, size_t vl) {
+  return __riscv_vmfle_vf_f16m8_b2(op1, op2, vl);
+}
+
+vbool64_t test_vmfgt_vv_f16mf4_b64(vfloat16mf4_t op1, vfloat16mf4_t op2, size_t vl) {
+  return __riscv_vmfgt_vv_f16mf4_b64(op1, op2, vl);
+}
+
+vbool2_t test_vmfgt_vf_f16m8_b2(vfloat16m8_t op1, float16_t op2, size_t vl) {
+  return __riscv_vmfgt_vf_f16m8_b2(op1, op2, vl);
+}
+
+vbool64_t test_vmfge_vv_f16mf4_b64(vfloat16mf4_t op1, vfloat16mf4_t op2, size_t vl) {
+  return __riscv_vmfge_vv_f16mf4_b64(op1, op2, vl);
+}
+
+vbool2_t test_vmfge_vf_f16m8_b2(vfloat16m8_t op1, float16_t op2, size_t vl) {
+  return __riscv_vmfge_vf_f16m8_b2(op1, op2, vl);
+}
+
+vuint16mf4_t test_vfclass_v_u16mf4(vfloat16mf4_t op1, size_t vl) {
+  return __riscv_vfclass_v_u16mf4(op1, vl);
+}
+
+vuint16m8_t test_vfclass_v_u16m8(vfloat16m8_t op1, size_t vl) {
+  return __riscv_vfclass_v_u16m8(op1, vl);
+}
+
+vfloat16mf4_t test_vfmerge_vfm_f16mf4(vfloat16mf4_t op1, float16_t op2, vbool64_t mask, size_t vl) {
+  return __riscv_vfmerge_vfm_f16mf4(op1, op2, mask, vl);
+}
+
+vfloat16m8_t test_vfmerge_vfm_f16m8(vfloat16m8_t op1, float16_t op2, vbool2_t mask, size_t vl) {
+  return __riscv_vfmerge_vfm_f16m8(op1, op2, mask, vl);
+}
+
+vfloat16mf4_t test_vfmv_v_f_f16mf4(float16_t src, size_t vl) {
+  return __riscv_vfmv_v_f_f16mf4(src, vl);
+}
+
+vfloat16m8_t test_vfmv_v_f_f16m8(float16_t src, size_t vl) {
+  return __riscv_vfmv_v_f_f16m8(src, vl);
+}
+
+vint16mf4_t test_vfcvt_x_f_v_i16mf4(vfloat16mf4_t src, size_t vl) {
+  return __riscv_vfcvt_x_f_v_i16mf4(src, vl);
+}
+
+vuint16m8_t test_vfcvt_xu_f_v_u16m8(vfloat16m8_t src, size_t vl) {
+  return __riscv_vfcvt_xu_f_v_u16m8(src, vl);
+}
+
+vfloat16mf4_t test_vfcvt_f_x_v_f16mf4(vint16mf4_t src, size_t vl) {
+  return __riscv_vfcvt_f_x_v_f16mf4(src, vl);
+}
+
+vfloat16m8_t test_vfcvt_f_xu_v_f16m8(vuint16m8_t src, size_t vl) {
+  return __riscv_vfcvt_f_xu_v_f16m8(src, vl);
+}
+
+vint16mf4_t test_vfcvt_rtz_x_f_v_i16mf4(vfloat16mf4_t src, size_t vl) {
+  return __riscv_vfcvt_rtz_x_f_v_i16mf4(src, vl);
+}
+
+vuint16m8_t test_vfcvt_rtz_xu_f_v_u16m8(vfloat16m8_t src, size_t vl) {
+  return __riscv_vfcvt_rtz_xu_f_v_u16m8(src, vl);
+}
+
+vfloat16mf4_t test_vfwcvt_f_x_v_f16mf4(vint8mf8_t src, size_t vl) {
+  return __riscv_vfwcvt_f_x_v_f16mf4(src, vl);
+}
+
+vuint32m8_t test_vfwcvt_xu_f_v_u32m8(vfloat16m4_t src, size_t vl) {
+  return __riscv_vfwcvt_xu_f_v_u32m8(src, vl);
+}
+
+vint8mf8_t test_vfncvt_x_f_w_i8mf8(vfloat16mf4_t src, size_t vl) {
+  return __riscv_vfncvt_x_f_w_i8mf8(src, vl);
+}
+
+vfloat16m4_t test_vfncvt_f_xu_w_f16m4(vuint32m8_t src, size_t vl) {
+  return __riscv_vfncvt_f_xu_w_f16m4(src, vl);
+}
+
+/* { dg-final { scan-assembler-times {vsetvli\s+zero,\s*[a-x0-9]+,\s*e16,\s*mf4,\s*t[au],\s*m[au]} 43 } } */
+/* { dg-final { scan-assembler-times {vsetvli\s+zero,\s*[a-x0-9]+,\s*e16,\s*m4,\s*t[au],\s*m[au]} 11 } } */
+/* { dg-final { scan-assembler-times {vsetvli\s+zero,\s*[a-x0-9]+,\s*e16,\s*m8,\s*t[au],\s*m[au]} 34 } } */
+/* { dg-final { scan-assembler-times {vfadd\.v[fv]\s+v[0-9]+,\s*v[0-9]+,\s*[vfa]+[0-9]+} 2 } } */
+/* { dg-final { scan-assembler-times {vfsub\.v[fv]\s+v[0-9]+,\s*v[0-9]+,\s*[vfa]+[0-9]+} 2 } } */
+/* { dg-final { scan-assembler-times {vfrsub\.vf\s+v[0-9]+,\s*v[0-9]+,\s*[vfa]+[0-9]+} 2 } } */
+/* { dg-final { scan-assembler-times {vfwadd\.[wv]v\s+v[0-9]+,\s*v[0-9]+,\s*[vfa]+[0-9]+} 4 } } */
+/* { dg-final { scan-assembler-times {vfwsub\.[wv]v\s+v[0-9]+,\s*v[0-9]+,\s*[vfa]+[0-9]+} 4 } } */
+/* { dg-final { scan-assembler-times {vfmul\.v[vf]\s+v[0-9]+,\s*v[0-9]+,\s*[vfa]+[0-9]+} 2 } } */
+/* { dg-final { scan-assembler-times {vfdiv\.v[vf]\s+v[0-9]+,\s*v[0-9]+,\s*[vfa]+[0-9]+} 2 } } */
+/* { dg-final { scan-assembler-times {vfrdiv\.vf\s+v[0-9]+,\s*v[0-9]+,\s*[vfa]+[0-9]+} 2 } } */
+/* { dg-final { scan-assembler-times {vfwmul\.v[vf]\s+v[0-9]+,\s*v[0-9]+,\s*[vfa]+[0-9]+} 2 } } */
+/* { dg-final { scan-assembler-times {vfmacc\.v[vf]\s+v[0-9]+,\s*[vfa]+[0-9]+,\s*v[0-9]+} 2 } } */
+/* { dg-final { scan-assembler-times {vfnmacc\.v[vf]\s+v[0-9]+,\s*[vfa]+[0-9]+,\s*v[0-9]+} 2 } } */
+/* { dg-final { scan-assembler-times {vfmsac\.v[vf]\s+v[0-9]+,\s*[vfa]+[0-9]+,\s*v[0-9]+} 2 } } */
+/* { dg-final { scan-assembler-times {vfnmsac\.v[vf]\s+v[0-9]+,\s*[vfa]+[0-9]+,\s*v[0-9]+} 2 } } */
+/* { dg-final { scan-assembler-times {vfmadd\.v[vf]\s+v[0-9]+,\s*[vfa]+[0-9]+,\s*v[0-9]+} 2 } } */
+/* { dg-final { scan-assembler-times {vfnmadd\.v[vf]\s+v[0-9]+,\s*[vfa]+[0-9]+,\s*v[0-9]+} 2 } } */
+/* { dg-final { scan-assembler-times {vfmsub\.v[vf]\s+v[0-9]+,\s*[vfa]+[0-9]+,\s*v[0-9]+} 2 } } */
+/* { dg-final { scan-assembler-times {vfnmsub\.v[vf]\s+v[0-9]+,\s*[vfa]+[0-9]+,\s*v[0-9]+} 2 } } */
+/* { dg-final { scan-assembler-times {vfwmacc\.v[vf]\s+v[0-9]+,\s*[vfa]+[0-9]+,\s*v[0-9]+} 2 } } */
+/* { dg-final { scan-assembler-times {vfwnmacc\.v[vf]\s+v[0-9]+,\s*[vfa]+[0-9]+,\s*v[0-9]+} 2 } } */
+/* { dg-final { scan-assembler-times {vfwmsac\.v[vf]\s+v[0-9]+,\s*[vfa]+[0-9]+,\s*v[0-9]+} 2 } } */
+/* { dg-final { scan-assembler-times {vfwnmsac\.v[vf]\s+v[0-9]+,\s*[vfa]+[0-9]+,\s*v[0-9]+} 2 } } */
+/* { dg-final { scan-assembler-times {vfsqrt\.v\s+v[0-9]+,\s*v[0-9]+} 2 } } */
+/* { dg-final { scan-assembler-times {vfrsqrt7\.v\s+v[0-9]+,\s*v[0-9]+} 2 } } */
+/* { dg-final { scan-assembler-times {vfrec7\.v\s+v[0-9]+,\s*v[0-9]+} 2 } } */
+/* { dg-final { scan-assembler-times {vfmin\.v[vf]\s+v[0-9]+,\s*v[0-9]+,\s*[vfa]+[0-9]+} 2 } } */
+/* { dg-final { scan-assembler-times {vfmax\.v[vf]\s+v[0-9]+,\s*v[0-9]+,\s*[vfa]+[0-9]+} 2 } } */
+/* { dg-final { scan-assembler-times {vfsgnj\.v[vf]\s+v[0-9]+,\s*v[0-9]+,\s*[vfa]+[0-9]+} 2 } } */
+/* { dg-final { scan-assembler-times {vfsgnjn\.v[vf]\s+v[0-9]+,\s*v[0-9]+,\s*[vfa]+[0-9]+} 2 } } */
+/* { dg-final { scan-assembler-times {vfsgnjx\.v[vf]\s+v[0-9]+,\s*v[0-9]+,\s*[vfa]+[0-9]+} 2 } } */
+/* { dg-final { scan-assembler-times {vmfeq\.v[vf]\s+v[0-9]+,\s*v[0-9]+,\s*[vfa]+[0-9]+} 2 } } */
+/* { dg-final { scan-assembler-times {vmfne\.v[vf]\s+v[0-9]+,\s*v[0-9]+,\s*[vfa]+[0-9]+} 2 } } */
+/* { dg-final { scan-assembler-times {vmflt\.v[vf]\s+v[0-9]+,\s*v[0-9]+,\s*[vfa]+[0-9]+} 2 } } */
+/* { dg-final { scan-assembler-times {vmfle\.v[vf]\s+v[0-9]+,\s*v[0-9]+,\s*[vfa]+[0-9]+} 2 } } */
+/* { dg-final { scan-assembler-times {vmfgt\.v[vf]\s+v[0-9]+,\s*v[0-9]+,\s*[vfa]+[0-9]+} 2 } } */
+/* { dg-final { scan-assembler-times {vmfge\.v[vf]\s+v[0-9]+,\s*v[0-9]+,\s*[vfa]+[0-9]+} 2 } } */
+/* { dg-final { scan-assembler-times {vfclass\.v\s+v[0-9]+,\s*v[0-9]+} 2 } } */
+/* { dg-final { scan-assembler-times {vfmerge\.vfm\s+v[0-9]+,\s*v[0-9]+,\s*fa[0-9]+,\s*v0} 2 } } */
+/* { dg-final { scan-assembler-times {vfmv\.v\.f\s+v[0-9]+,\s*fa[0-9]+} 2 } } */
+/* { dg-final { scan-assembler-times {vfcvt\.x\.f\.v\s+v[0-9]+,\s*v[0-9]+} 1 } } */
+/* { dg-final { scan-assembler-times {vfcvt\.xu\.f\.v\s+v[0-9]+,\s*v[0-9]+} 1 } } */
+/* { dg-final { scan-assembler-times {vfcvt\.f\.x\.v\s+v[0-9]+,\s*v[0-9]+} 1 } } */
+/* { dg-final { scan-assembler-times {vfcvt\.f\.x\.v\s+v[0-9]+,\s*v[0-9]+} 1 } } */
+/* { dg-final { scan-assembler-times {vfcvt\.rtz\.x\.f\.v\s+v[0-9]+,\s*v[0-9]+} 1 } } */
+/* { dg-final { scan-assembler-times {vfcvt\.rtz\.xu\.f\.v\s+v[0-9]+,\s*v[0-9]+} 1 } } */
+/* { dg-final { scan-assembler-times {vfwcvt\.f\.x\.v\s+v[0-9]+,\s*v[0-9]+} 1 } } */
+/* { dg-final { scan-assembler-times {vfwcvt\.xu\.f\.v\s+v[0-9]+,\s*v[0-9]+} 1 } } */
+/* { dg-final { scan-assembler-times {vfncvt\.x\.f\.w\s+v[0-9]+,\s*v[0-9]+} 1 } } */
+/* { dg-final { scan-assembler-times {vfncvt\.f\.xu\.w\s+v[0-9]+,\s*v[0-9]+} 1 } } */