diff mbox series

RISC-V: Don't add fractional LMUL types to V_VLS for XTheadVector

Message ID 20240321234552.2140254-1-christoph.muellner@vrull.eu
State New
Headers show
Series RISC-V: Don't add fractional LMUL types to V_VLS for XTheadVector | expand

Commit Message

Christoph Müllner March 21, 2024, 11:45 p.m. UTC
The expansion of `memset` (via expand_builtin_memset_args())
uses clear_by_pieces() and store_by_pieces() to avoid calls
to the C runtime. To check if a type can be used for that purpose
the function by_pieces_mode_supported_p() tests if a `mov` and
a `vec_duplicate` INSN can be expaned by the backend.

The `vec_duplicate` expansion takes arguments of type `V_VLS`.
The `mov` expansions take arguments of type `V`, `VB`, `VT`,
`VLS_AVL_IMM`, and `VLS_AVL_REG`. Some of these types (in fact
not types but type iterators) include fractional LMUL types.
E.g. `V_VLS` includes `V`, which includes `VI`, which includes
`RVVMF2QI`.

This results in an attempt to use fractional LMUL-types for
the `memset` expansion resulting in an ICE for XTheadVector,
because that extension cannot handle fractional LMULs.

This patch addresses this issue by splitting the definition
of the `VI` mode itereator into `VI_NOFRAC` (without fractional
LMUL types) and `VI_FRAC` (only fractional LMUL types).
Further, it defines `V_VLS` such, that `VI_FRAC` types are only
included if XTheadVector is not enabled.

The effect is demonstrated by a new test case that shows
that the by-pieces framework now emits `sb` instructions
instead of triggering an ICE.

Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>

	PR 114194

gcc/ChangeLog:

	* config/riscv/vector-iterators.md: Split VI into VI_FRAC and VI_NOFRAC.
	Only include VI_NOFRAC in V_VLS without TARGET_XTHEADVECTOR.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/xtheadvector/pr114194.c: New test.

Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
---
 gcc/config/riscv/vector-iterators.md          | 19 +++++--
 .../riscv/rvv/xtheadvector/pr114194.c         | 56 +++++++++++++++++++
 2 files changed, 69 insertions(+), 6 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/xtheadvector/pr114194.c

Comments

juzhe.zhong@rivai.ai March 22, 2024, 1:17 a.m. UTC | #1
LGTM.



juzhe.zhong@rivai.ai
 
From: Christoph Müllner
Date: 2024-03-22 07:45
To: gcc-patches; Kito Cheng; Palmer Dabbelt; Andrew Waterman; Philipp Tomsich; Camel Coder; Bruce Hoult; Juzhe-Zhong; Jun Sha; Xianmiao Qu; Jin Ma
CC: Christoph Müllner
Subject: [PATCH] RISC-V: Don't add fractional LMUL types to V_VLS for XTheadVector
The expansion of `memset` (via expand_builtin_memset_args())
uses clear_by_pieces() and store_by_pieces() to avoid calls
to the C runtime. To check if a type can be used for that purpose
the function by_pieces_mode_supported_p() tests if a `mov` and
a `vec_duplicate` INSN can be expaned by the backend.
 
The `vec_duplicate` expansion takes arguments of type `V_VLS`.
The `mov` expansions take arguments of type `V`, `VB`, `VT`,
`VLS_AVL_IMM`, and `VLS_AVL_REG`. Some of these types (in fact
not types but type iterators) include fractional LMUL types.
E.g. `V_VLS` includes `V`, which includes `VI`, which includes
`RVVMF2QI`.
 
This results in an attempt to use fractional LMUL-types for
the `memset` expansion resulting in an ICE for XTheadVector,
because that extension cannot handle fractional LMULs.
 
This patch addresses this issue by splitting the definition
of the `VI` mode itereator into `VI_NOFRAC` (without fractional
LMUL types) and `VI_FRAC` (only fractional LMUL types).
Further, it defines `V_VLS` such, that `VI_FRAC` types are only
included if XTheadVector is not enabled.
 
The effect is demonstrated by a new test case that shows
that the by-pieces framework now emits `sb` instructions
instead of triggering an ICE.
 
Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
 
PR 114194
 
gcc/ChangeLog:
 
* config/riscv/vector-iterators.md: Split VI into VI_FRAC and VI_NOFRAC.
Only include VI_NOFRAC in V_VLS without TARGET_XTHEADVECTOR.
 
gcc/testsuite/ChangeLog:
 
* gcc.target/riscv/rvv/xtheadvector/pr114194.c: New test.
 
Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
---
gcc/config/riscv/vector-iterators.md          | 19 +++++--
.../riscv/rvv/xtheadvector/pr114194.c         | 56 +++++++++++++++++++
2 files changed, 69 insertions(+), 6 deletions(-)
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/xtheadvector/pr114194.c
 
diff --git a/gcc/config/riscv/vector-iterators.md b/gcc/config/riscv/vector-iterators.md
index c2ea7e8b10a..a24e1bf078f 100644
--- a/gcc/config/riscv/vector-iterators.md
+++ b/gcc/config/riscv/vector-iterators.md
@@ -108,17 +108,24 @@ (define_c_enum "unspecv" [
   UNSPECV_FRM_RESTORE_EXIT
])
-(define_mode_iterator VI [
-  RVVM8QI RVVM4QI RVVM2QI RVVM1QI RVVMF2QI RVVMF4QI (RVVMF8QI "TARGET_MIN_VLEN > 32")
-
-  RVVM8HI RVVM4HI RVVM2HI RVVM1HI RVVMF2HI (RVVMF4HI "TARGET_MIN_VLEN > 32")
-
-  RVVM8SI RVVM4SI RVVM2SI RVVM1SI (RVVMF2SI "TARGET_MIN_VLEN > 32")
+;; Subset of VI with fractional LMUL types
+(define_mode_iterator VI_FRAC [
+  RVVMF2QI RVVMF4QI (RVVMF8QI "TARGET_MIN_VLEN > 32")
+  RVVMF2HI (RVVMF4HI "TARGET_MIN_VLEN > 32")
+  (RVVMF2SI "TARGET_MIN_VLEN > 32")
+])
+;; Subset of VI with non-fractional LMUL types
+(define_mode_iterator VI_NOFRAC [
+  RVVM8QI RVVM4QI RVVM2QI RVVM1QI
+  RVVM8HI RVVM4HI RVVM2HI RVVM1HI
+  RVVM8SI RVVM4SI RVVM2SI RVVM1SI
   (RVVM8DI "TARGET_VECTOR_ELEN_64") (RVVM4DI "TARGET_VECTOR_ELEN_64")
   (RVVM2DI "TARGET_VECTOR_ELEN_64") (RVVM1DI "TARGET_VECTOR_ELEN_64")
])
+(define_mode_iterator VI [ VI_NOFRAC (VI_FRAC "!TARGET_XTHEADVECTOR") ])
+
;; This iterator is the same as above but with TARGET_VECTOR_ELEN_FP_16
;; changed to TARGET_ZVFH.  TARGET_VECTOR_ELEN_FP_16 is also true for
;; TARGET_ZVFHMIN while we actually want to disable all instructions apart
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/xtheadvector/pr114194.c b/gcc/testsuite/gcc.target/riscv/rvv/xtheadvector/pr114194.c
new file mode 100644
index 00000000000..fc2d1349425
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/xtheadvector/pr114194.c
@@ -0,0 +1,56 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv32gc_xtheadvector" { target { rv32 } } } */
+/* { dg-options "-march=rv64gc_xtheadvector" { target { rv64 } } } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+/*
+** foo0_1:
+** sb\tzero,0([a-x0-9]+)
+** ret
+*/
+void foo0_1 (void *p)
+{
+  __builtin_memset (p, 0, 1);
+}
+
+/*
+** foo0_7:
+** sb\tzero,0([a-x0-9]+)
+** sb\tzero,1([a-x0-9]+)
+** sb\tzero,2([a-x0-9]+)
+** sb\tzero,3([a-x0-9]+)
+** sb\tzero,4([a-x0-9]+)
+** sb\tzero,5([a-x0-9]+)
+** sb\tzero,6([a-x0-9]+)
+** ret
+*/
+void foo0_7 (void *p)
+{
+  __builtin_memset (p, 0, 7);
+}
+
+/*
+** foo1_1:
+** li\t[a-x0-9]+,1
+** sb\t[a-x0-9]+,0([a-x0-9]+)
+** ret
+*/
+void foo1_1 (void *p)
+{
+  __builtin_memset (p, 1, 1);
+}
+
+/*
+** foo1_5:
+** li\t[a-x0-9]+,1
+** sb\t[a-x0-9]+,0([a-x0-9]+)
+** sb\t[a-x0-9]+,1([a-x0-9]+)
+** sb\t[a-x0-9]+,2([a-x0-9]+)
+** sb\t[a-x0-9]+,3([a-x0-9]+)
+** sb\t[a-x0-9]+,4([a-x0-9]+)
+** ret
+*/
+void foo1_5 (void *p)
+{
+  __builtin_memset (p, 1, 5);
+}
Bruce Hoult March 22, 2024, 3:43 a.m. UTC | #2
> The effect is demonstrated by a new test case that shows
that the by-pieces framework now emits `sb` instructions
instead of triggering an ICE

So these small memset() now don't use RVV at all if xtheadvector is enabled?

I don't have evidence whether the use of RVV (whether V or
xtheadvector) for these memsets is a win or not, but the treatment
should probably be consistent.

I don't know why RVV 1.0 uses a fractional LMUL at all here. It would
work perfectly well with LMUL=1 and just setting vl to the appropriate
length (which is always less than 16 bytes). Use of fractional LMUL
doesn't save any resources.

On Fri, Mar 22, 2024 at 12:46 PM Christoph Müllner
<christoph.muellner@vrull.eu> wrote:
>
> The expansion of `memset` (via expand_builtin_memset_args())
> uses clear_by_pieces() and store_by_pieces() to avoid calls
> to the C runtime. To check if a type can be used for that purpose
> the function by_pieces_mode_supported_p() tests if a `mov` and
> a `vec_duplicate` INSN can be expaned by the backend.
>
> The `vec_duplicate` expansion takes arguments of type `V_VLS`.
> The `mov` expansions take arguments of type `V`, `VB`, `VT`,
> `VLS_AVL_IMM`, and `VLS_AVL_REG`. Some of these types (in fact
> not types but type iterators) include fractional LMUL types.
> E.g. `V_VLS` includes `V`, which includes `VI`, which includes
> `RVVMF2QI`.
>
> This results in an attempt to use fractional LMUL-types for
> the `memset` expansion resulting in an ICE for XTheadVector,
> because that extension cannot handle fractional LMULs.
>
> This patch addresses this issue by splitting the definition
> of the `VI` mode itereator into `VI_NOFRAC` (without fractional
> LMUL types) and `VI_FRAC` (only fractional LMUL types).
> Further, it defines `V_VLS` such, that `VI_FRAC` types are only
> included if XTheadVector is not enabled.
>
> The effect is demonstrated by a new test case that shows
> that the by-pieces framework now emits `sb` instructions
> instead of triggering an ICE.
>
> Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
>
>         PR 114194
>
> gcc/ChangeLog:
>
>         * config/riscv/vector-iterators.md: Split VI into VI_FRAC and VI_NOFRAC.
>         Only include VI_NOFRAC in V_VLS without TARGET_XTHEADVECTOR.
>
> gcc/testsuite/ChangeLog:
>
>         * gcc.target/riscv/rvv/xtheadvector/pr114194.c: New test.
>
> Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
> ---
>  gcc/config/riscv/vector-iterators.md          | 19 +++++--
>  .../riscv/rvv/xtheadvector/pr114194.c         | 56 +++++++++++++++++++
>  2 files changed, 69 insertions(+), 6 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/xtheadvector/pr114194.c
>
> diff --git a/gcc/config/riscv/vector-iterators.md b/gcc/config/riscv/vector-iterators.md
> index c2ea7e8b10a..a24e1bf078f 100644
> --- a/gcc/config/riscv/vector-iterators.md
> +++ b/gcc/config/riscv/vector-iterators.md
> @@ -108,17 +108,24 @@ (define_c_enum "unspecv" [
>    UNSPECV_FRM_RESTORE_EXIT
>  ])
>
> -(define_mode_iterator VI [
> -  RVVM8QI RVVM4QI RVVM2QI RVVM1QI RVVMF2QI RVVMF4QI (RVVMF8QI "TARGET_MIN_VLEN > 32")
> -
> -  RVVM8HI RVVM4HI RVVM2HI RVVM1HI RVVMF2HI (RVVMF4HI "TARGET_MIN_VLEN > 32")
> -
> -  RVVM8SI RVVM4SI RVVM2SI RVVM1SI (RVVMF2SI "TARGET_MIN_VLEN > 32")
> +;; Subset of VI with fractional LMUL types
> +(define_mode_iterator VI_FRAC [
> +  RVVMF2QI RVVMF4QI (RVVMF8QI "TARGET_MIN_VLEN > 32")
> +  RVVMF2HI (RVVMF4HI "TARGET_MIN_VLEN > 32")
> +  (RVVMF2SI "TARGET_MIN_VLEN > 32")
> +])
>
> +;; Subset of VI with non-fractional LMUL types
> +(define_mode_iterator VI_NOFRAC [
> +  RVVM8QI RVVM4QI RVVM2QI RVVM1QI
> +  RVVM8HI RVVM4HI RVVM2HI RVVM1HI
> +  RVVM8SI RVVM4SI RVVM2SI RVVM1SI
>    (RVVM8DI "TARGET_VECTOR_ELEN_64") (RVVM4DI "TARGET_VECTOR_ELEN_64")
>    (RVVM2DI "TARGET_VECTOR_ELEN_64") (RVVM1DI "TARGET_VECTOR_ELEN_64")
>  ])
>
> +(define_mode_iterator VI [ VI_NOFRAC (VI_FRAC "!TARGET_XTHEADVECTOR") ])
> +
>  ;; This iterator is the same as above but with TARGET_VECTOR_ELEN_FP_16
>  ;; changed to TARGET_ZVFH.  TARGET_VECTOR_ELEN_FP_16 is also true for
>  ;; TARGET_ZVFHMIN while we actually want to disable all instructions apart
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/xtheadvector/pr114194.c b/gcc/testsuite/gcc.target/riscv/rvv/xtheadvector/pr114194.c
> new file mode 100644
> index 00000000000..fc2d1349425
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/xtheadvector/pr114194.c
> @@ -0,0 +1,56 @@
> +/* { dg-do compile } */
> +/* { dg-options "-march=rv32gc_xtheadvector" { target { rv32 } } } */
> +/* { dg-options "-march=rv64gc_xtheadvector" { target { rv64 } } } */
> +/* { dg-final { check-function-bodies "**" "" } } */
> +
> +/*
> +** foo0_1:
> +**     sb\tzero,0([a-x0-9]+)
> +**     ret
> +*/
> +void foo0_1 (void *p)
> +{
> +  __builtin_memset (p, 0, 1);
> +}
> +
> +/*
> +** foo0_7:
> +**     sb\tzero,0([a-x0-9]+)
> +**     sb\tzero,1([a-x0-9]+)
> +**     sb\tzero,2([a-x0-9]+)
> +**     sb\tzero,3([a-x0-9]+)
> +**     sb\tzero,4([a-x0-9]+)
> +**     sb\tzero,5([a-x0-9]+)
> +**     sb\tzero,6([a-x0-9]+)
> +**     ret
> +*/
> +void foo0_7 (void *p)
> +{
> +  __builtin_memset (p, 0, 7);
> +}
> +
> +/*
> +** foo1_1:
> +**     li\t[a-x0-9]+,1
> +**     sb\t[a-x0-9]+,0([a-x0-9]+)
> +**     ret
> +*/
> +void foo1_1 (void *p)
> +{
> +  __builtin_memset (p, 1, 1);
> +}
> +
> +/*
> +** foo1_5:
> +**     li\t[a-x0-9]+,1
> +**     sb\t[a-x0-9]+,0([a-x0-9]+)
> +**     sb\t[a-x0-9]+,1([a-x0-9]+)
> +**     sb\t[a-x0-9]+,2([a-x0-9]+)
> +**     sb\t[a-x0-9]+,3([a-x0-9]+)
> +**     sb\t[a-x0-9]+,4([a-x0-9]+)
> +**     ret
> +*/
> +void foo1_5 (void *p)
> +{
> +  __builtin_memset (p, 1, 5);
> +}
> --
> 2.44.0
>
>
Christoph Müllner March 22, 2024, 6:54 a.m. UTC | #3
On Fri, Mar 22, 2024 at 2:18 AM juzhe.zhong@rivai.ai
<juzhe.zhong@rivai.ai> wrote:
>
> LGTM.

Pushed.
Thanks!

>
> ________________________________
> juzhe.zhong@rivai.ai
>
>
> From: Christoph Müllner
> Date: 2024-03-22 07:45
> To: gcc-patches; Kito Cheng; Palmer Dabbelt; Andrew Waterman; Philipp Tomsich; Camel Coder; Bruce Hoult; Juzhe-Zhong; Jun Sha; Xianmiao Qu; Jin Ma
> CC: Christoph Müllner
> Subject: [PATCH] RISC-V: Don't add fractional LMUL types to V_VLS for XTheadVector
> The expansion of `memset` (via expand_builtin_memset_args())
> uses clear_by_pieces() and store_by_pieces() to avoid calls
> to the C runtime. To check if a type can be used for that purpose
> the function by_pieces_mode_supported_p() tests if a `mov` and
> a `vec_duplicate` INSN can be expaned by the backend.
>
> The `vec_duplicate` expansion takes arguments of type `V_VLS`.
> The `mov` expansions take arguments of type `V`, `VB`, `VT`,
> `VLS_AVL_IMM`, and `VLS_AVL_REG`. Some of these types (in fact
> not types but type iterators) include fractional LMUL types.
> E.g. `V_VLS` includes `V`, which includes `VI`, which includes
> `RVVMF2QI`.
>
> This results in an attempt to use fractional LMUL-types for
> the `memset` expansion resulting in an ICE for XTheadVector,
> because that extension cannot handle fractional LMULs.
>
> This patch addresses this issue by splitting the definition
> of the `VI` mode itereator into `VI_NOFRAC` (without fractional
> LMUL types) and `VI_FRAC` (only fractional LMUL types).
> Further, it defines `V_VLS` such, that `VI_FRAC` types are only
> included if XTheadVector is not enabled.
>
> The effect is demonstrated by a new test case that shows
> that the by-pieces framework now emits `sb` instructions
> instead of triggering an ICE.
>
> Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
>
> PR 114194
>
> gcc/ChangeLog:
>
> * config/riscv/vector-iterators.md: Split VI into VI_FRAC and VI_NOFRAC.
> Only include VI_NOFRAC in V_VLS without TARGET_XTHEADVECTOR.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/riscv/rvv/xtheadvector/pr114194.c: New test.
>
> Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
> ---
> gcc/config/riscv/vector-iterators.md          | 19 +++++--
> .../riscv/rvv/xtheadvector/pr114194.c         | 56 +++++++++++++++++++
> 2 files changed, 69 insertions(+), 6 deletions(-)
> create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/xtheadvector/pr114194.c
>
> diff --git a/gcc/config/riscv/vector-iterators.md b/gcc/config/riscv/vector-iterators.md
> index c2ea7e8b10a..a24e1bf078f 100644
> --- a/gcc/config/riscv/vector-iterators.md
> +++ b/gcc/config/riscv/vector-iterators.md
> @@ -108,17 +108,24 @@ (define_c_enum "unspecv" [
>    UNSPECV_FRM_RESTORE_EXIT
> ])
> -(define_mode_iterator VI [
> -  RVVM8QI RVVM4QI RVVM2QI RVVM1QI RVVMF2QI RVVMF4QI (RVVMF8QI "TARGET_MIN_VLEN > 32")
> -
> -  RVVM8HI RVVM4HI RVVM2HI RVVM1HI RVVMF2HI (RVVMF4HI "TARGET_MIN_VLEN > 32")
> -
> -  RVVM8SI RVVM4SI RVVM2SI RVVM1SI (RVVMF2SI "TARGET_MIN_VLEN > 32")
> +;; Subset of VI with fractional LMUL types
> +(define_mode_iterator VI_FRAC [
> +  RVVMF2QI RVVMF4QI (RVVMF8QI "TARGET_MIN_VLEN > 32")
> +  RVVMF2HI (RVVMF4HI "TARGET_MIN_VLEN > 32")
> +  (RVVMF2SI "TARGET_MIN_VLEN > 32")
> +])
> +;; Subset of VI with non-fractional LMUL types
> +(define_mode_iterator VI_NOFRAC [
> +  RVVM8QI RVVM4QI RVVM2QI RVVM1QI
> +  RVVM8HI RVVM4HI RVVM2HI RVVM1HI
> +  RVVM8SI RVVM4SI RVVM2SI RVVM1SI
>    (RVVM8DI "TARGET_VECTOR_ELEN_64") (RVVM4DI "TARGET_VECTOR_ELEN_64")
>    (RVVM2DI "TARGET_VECTOR_ELEN_64") (RVVM1DI "TARGET_VECTOR_ELEN_64")
> ])
> +(define_mode_iterator VI [ VI_NOFRAC (VI_FRAC "!TARGET_XTHEADVECTOR") ])
> +
> ;; This iterator is the same as above but with TARGET_VECTOR_ELEN_FP_16
> ;; changed to TARGET_ZVFH.  TARGET_VECTOR_ELEN_FP_16 is also true for
> ;; TARGET_ZVFHMIN while we actually want to disable all instructions apart
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/xtheadvector/pr114194.c b/gcc/testsuite/gcc.target/riscv/rvv/xtheadvector/pr114194.c
> new file mode 100644
> index 00000000000..fc2d1349425
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/xtheadvector/pr114194.c
> @@ -0,0 +1,56 @@
> +/* { dg-do compile } */
> +/* { dg-options "-march=rv32gc_xtheadvector" { target { rv32 } } } */
> +/* { dg-options "-march=rv64gc_xtheadvector" { target { rv64 } } } */
> +/* { dg-final { check-function-bodies "**" "" } } */
> +
> +/*
> +** foo0_1:
> +** sb\tzero,0([a-x0-9]+)
> +** ret
> +*/
> +void foo0_1 (void *p)
> +{
> +  __builtin_memset (p, 0, 1);
> +}
> +
> +/*
> +** foo0_7:
> +** sb\tzero,0([a-x0-9]+)
> +** sb\tzero,1([a-x0-9]+)
> +** sb\tzero,2([a-x0-9]+)
> +** sb\tzero,3([a-x0-9]+)
> +** sb\tzero,4([a-x0-9]+)
> +** sb\tzero,5([a-x0-9]+)
> +** sb\tzero,6([a-x0-9]+)
> +** ret
> +*/
> +void foo0_7 (void *p)
> +{
> +  __builtin_memset (p, 0, 7);
> +}
> +
> +/*
> +** foo1_1:
> +** li\t[a-x0-9]+,1
> +** sb\t[a-x0-9]+,0([a-x0-9]+)
> +** ret
> +*/
> +void foo1_1 (void *p)
> +{
> +  __builtin_memset (p, 1, 1);
> +}
> +
> +/*
> +** foo1_5:
> +** li\t[a-x0-9]+,1
> +** sb\t[a-x0-9]+,0([a-x0-9]+)
> +** sb\t[a-x0-9]+,1([a-x0-9]+)
> +** sb\t[a-x0-9]+,2([a-x0-9]+)
> +** sb\t[a-x0-9]+,3([a-x0-9]+)
> +** sb\t[a-x0-9]+,4([a-x0-9]+)
> +** ret
> +*/
> +void foo1_5 (void *p)
> +{
> +  __builtin_memset (p, 1, 5);
> +}
> --
> 2.44.0
>
>
Christoph Müllner March 22, 2024, 7:41 a.m. UTC | #4
On Fri, Mar 22, 2024 at 4:43 AM Bruce Hoult <bruce@hoult.org> wrote:
>
> > The effect is demonstrated by a new test case that shows
> that the by-pieces framework now emits `sb` instructions
> instead of triggering an ICE
>
> So these small memset() now don't use RVV at all if xtheadvector is enabled?

Yes, but not directly.
The patch just prevents fractional LMUL modes from being considered
for XTheadVector.
That's necessary because further lowering memory moves with a
fractional LMUL mode
cannot be done for XTheadVector (that's the reason for the ICE).

> I don't have evidence whether the use of RVV (whether V or
> xtheadvector) for these memsets is a win or not, but the treatment
> should probably be consistent.
>
> I don't know why RVV 1.0 uses a fractional LMUL at all here. It would
> work perfectly well with LMUL=1 and just setting vl to the appropriate
> length (which is always less than 16 bytes). Use of fractional LMUL
> doesn't save any resources.

The compiler can consider fractional LMUL values for expansion for RVV,
but that does not mean it will be used in the emitted instruction sequence.
Details like cost model and data alignment also matter.

During testing, I observed that RVV and XTheadVector will both emit sequences
of 'sd' for short memsets with known length, known data to set,
and unknown alignment of the data to be written.
However, I have not excessively tested using all possible tuning parameters,
as my primary goal was to eliminate the reason for the ICE with XTheadVector.

>
> On Fri, Mar 22, 2024 at 12:46 PM Christoph Müllner
> <christoph.muellner@vrull.eu> wrote:
> >
> > The expansion of `memset` (via expand_builtin_memset_args())
> > uses clear_by_pieces() and store_by_pieces() to avoid calls
> > to the C runtime. To check if a type can be used for that purpose
> > the function by_pieces_mode_supported_p() tests if a `mov` and
> > a `vec_duplicate` INSN can be expaned by the backend.
> >
> > The `vec_duplicate` expansion takes arguments of type `V_VLS`.
> > The `mov` expansions take arguments of type `V`, `VB`, `VT`,
> > `VLS_AVL_IMM`, and `VLS_AVL_REG`. Some of these types (in fact
> > not types but type iterators) include fractional LMUL types.
> > E.g. `V_VLS` includes `V`, which includes `VI`, which includes
> > `RVVMF2QI`.
> >
> > This results in an attempt to use fractional LMUL-types for
> > the `memset` expansion resulting in an ICE for XTheadVector,
> > because that extension cannot handle fractional LMULs.
> >
> > This patch addresses this issue by splitting the definition
> > of the `VI` mode itereator into `VI_NOFRAC` (without fractional
> > LMUL types) and `VI_FRAC` (only fractional LMUL types).
> > Further, it defines `V_VLS` such, that `VI_FRAC` types are only
> > included if XTheadVector is not enabled.
> >
> > The effect is demonstrated by a new test case that shows
> > that the by-pieces framework now emits `sb` instructions
> > instead of triggering an ICE.
> >
> > Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
> >
> >         PR 114194
> >
> > gcc/ChangeLog:
> >
> >         * config/riscv/vector-iterators.md: Split VI into VI_FRAC and VI_NOFRAC.
> >         Only include VI_NOFRAC in V_VLS without TARGET_XTHEADVECTOR.
> >
> > gcc/testsuite/ChangeLog:
> >
> >         * gcc.target/riscv/rvv/xtheadvector/pr114194.c: New test.
> >
> > Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
> > ---
> >  gcc/config/riscv/vector-iterators.md          | 19 +++++--
> >  .../riscv/rvv/xtheadvector/pr114194.c         | 56 +++++++++++++++++++
> >  2 files changed, 69 insertions(+), 6 deletions(-)
> >  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/xtheadvector/pr114194.c
> >
> > diff --git a/gcc/config/riscv/vector-iterators.md b/gcc/config/riscv/vector-iterators.md
> > index c2ea7e8b10a..a24e1bf078f 100644
> > --- a/gcc/config/riscv/vector-iterators.md
> > +++ b/gcc/config/riscv/vector-iterators.md
> > @@ -108,17 +108,24 @@ (define_c_enum "unspecv" [
> >    UNSPECV_FRM_RESTORE_EXIT
> >  ])
> >
> > -(define_mode_iterator VI [
> > -  RVVM8QI RVVM4QI RVVM2QI RVVM1QI RVVMF2QI RVVMF4QI (RVVMF8QI "TARGET_MIN_VLEN > 32")
> > -
> > -  RVVM8HI RVVM4HI RVVM2HI RVVM1HI RVVMF2HI (RVVMF4HI "TARGET_MIN_VLEN > 32")
> > -
> > -  RVVM8SI RVVM4SI RVVM2SI RVVM1SI (RVVMF2SI "TARGET_MIN_VLEN > 32")
> > +;; Subset of VI with fractional LMUL types
> > +(define_mode_iterator VI_FRAC [
> > +  RVVMF2QI RVVMF4QI (RVVMF8QI "TARGET_MIN_VLEN > 32")
> > +  RVVMF2HI (RVVMF4HI "TARGET_MIN_VLEN > 32")
> > +  (RVVMF2SI "TARGET_MIN_VLEN > 32")
> > +])
> >
> > +;; Subset of VI with non-fractional LMUL types
> > +(define_mode_iterator VI_NOFRAC [
> > +  RVVM8QI RVVM4QI RVVM2QI RVVM1QI
> > +  RVVM8HI RVVM4HI RVVM2HI RVVM1HI
> > +  RVVM8SI RVVM4SI RVVM2SI RVVM1SI
> >    (RVVM8DI "TARGET_VECTOR_ELEN_64") (RVVM4DI "TARGET_VECTOR_ELEN_64")
> >    (RVVM2DI "TARGET_VECTOR_ELEN_64") (RVVM1DI "TARGET_VECTOR_ELEN_64")
> >  ])
> >
> > +(define_mode_iterator VI [ VI_NOFRAC (VI_FRAC "!TARGET_XTHEADVECTOR") ])
> > +
> >  ;; This iterator is the same as above but with TARGET_VECTOR_ELEN_FP_16
> >  ;; changed to TARGET_ZVFH.  TARGET_VECTOR_ELEN_FP_16 is also true for
> >  ;; TARGET_ZVFHMIN while we actually want to disable all instructions apart
> > diff --git a/gcc/testsuite/gcc.target/riscv/rvv/xtheadvector/pr114194.c b/gcc/testsuite/gcc.target/riscv/rvv/xtheadvector/pr114194.c
> > new file mode 100644
> > index 00000000000..fc2d1349425
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/riscv/rvv/xtheadvector/pr114194.c
> > @@ -0,0 +1,56 @@
> > +/* { dg-do compile } */
> > +/* { dg-options "-march=rv32gc_xtheadvector" { target { rv32 } } } */
> > +/* { dg-options "-march=rv64gc_xtheadvector" { target { rv64 } } } */
> > +/* { dg-final { check-function-bodies "**" "" } } */
> > +
> > +/*
> > +** foo0_1:
> > +**     sb\tzero,0([a-x0-9]+)
> > +**     ret
> > +*/
> > +void foo0_1 (void *p)
> > +{
> > +  __builtin_memset (p, 0, 1);
> > +}
> > +
> > +/*
> > +** foo0_7:
> > +**     sb\tzero,0([a-x0-9]+)
> > +**     sb\tzero,1([a-x0-9]+)
> > +**     sb\tzero,2([a-x0-9]+)
> > +**     sb\tzero,3([a-x0-9]+)
> > +**     sb\tzero,4([a-x0-9]+)
> > +**     sb\tzero,5([a-x0-9]+)
> > +**     sb\tzero,6([a-x0-9]+)
> > +**     ret
> > +*/
> > +void foo0_7 (void *p)
> > +{
> > +  __builtin_memset (p, 0, 7);
> > +}
> > +
> > +/*
> > +** foo1_1:
> > +**     li\t[a-x0-9]+,1
> > +**     sb\t[a-x0-9]+,0([a-x0-9]+)
> > +**     ret
> > +*/
> > +void foo1_1 (void *p)
> > +{
> > +  __builtin_memset (p, 1, 1);
> > +}
> > +
> > +/*
> > +** foo1_5:
> > +**     li\t[a-x0-9]+,1
> > +**     sb\t[a-x0-9]+,0([a-x0-9]+)
> > +**     sb\t[a-x0-9]+,1([a-x0-9]+)
> > +**     sb\t[a-x0-9]+,2([a-x0-9]+)
> > +**     sb\t[a-x0-9]+,3([a-x0-9]+)
> > +**     sb\t[a-x0-9]+,4([a-x0-9]+)
> > +**     ret
> > +*/
> > +void foo1_5 (void *p)
> > +{
> > +  __builtin_memset (p, 1, 5);
> > +}
> > --
> > 2.44.0
> >
> >
diff mbox series

Patch

diff --git a/gcc/config/riscv/vector-iterators.md b/gcc/config/riscv/vector-iterators.md
index c2ea7e8b10a..a24e1bf078f 100644
--- a/gcc/config/riscv/vector-iterators.md
+++ b/gcc/config/riscv/vector-iterators.md
@@ -108,17 +108,24 @@  (define_c_enum "unspecv" [
   UNSPECV_FRM_RESTORE_EXIT
 ])
 
-(define_mode_iterator VI [
-  RVVM8QI RVVM4QI RVVM2QI RVVM1QI RVVMF2QI RVVMF4QI (RVVMF8QI "TARGET_MIN_VLEN > 32")
-
-  RVVM8HI RVVM4HI RVVM2HI RVVM1HI RVVMF2HI (RVVMF4HI "TARGET_MIN_VLEN > 32")
-
-  RVVM8SI RVVM4SI RVVM2SI RVVM1SI (RVVMF2SI "TARGET_MIN_VLEN > 32")
+;; Subset of VI with fractional LMUL types
+(define_mode_iterator VI_FRAC [
+  RVVMF2QI RVVMF4QI (RVVMF8QI "TARGET_MIN_VLEN > 32")
+  RVVMF2HI (RVVMF4HI "TARGET_MIN_VLEN > 32")
+  (RVVMF2SI "TARGET_MIN_VLEN > 32")
+])
 
+;; Subset of VI with non-fractional LMUL types
+(define_mode_iterator VI_NOFRAC [
+  RVVM8QI RVVM4QI RVVM2QI RVVM1QI
+  RVVM8HI RVVM4HI RVVM2HI RVVM1HI
+  RVVM8SI RVVM4SI RVVM2SI RVVM1SI
   (RVVM8DI "TARGET_VECTOR_ELEN_64") (RVVM4DI "TARGET_VECTOR_ELEN_64")
   (RVVM2DI "TARGET_VECTOR_ELEN_64") (RVVM1DI "TARGET_VECTOR_ELEN_64")
 ])
 
+(define_mode_iterator VI [ VI_NOFRAC (VI_FRAC "!TARGET_XTHEADVECTOR") ])
+
 ;; This iterator is the same as above but with TARGET_VECTOR_ELEN_FP_16
 ;; changed to TARGET_ZVFH.  TARGET_VECTOR_ELEN_FP_16 is also true for
 ;; TARGET_ZVFHMIN while we actually want to disable all instructions apart
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/xtheadvector/pr114194.c b/gcc/testsuite/gcc.target/riscv/rvv/xtheadvector/pr114194.c
new file mode 100644
index 00000000000..fc2d1349425
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/xtheadvector/pr114194.c
@@ -0,0 +1,56 @@ 
+/* { dg-do compile } */
+/* { dg-options "-march=rv32gc_xtheadvector" { target { rv32 } } } */
+/* { dg-options "-march=rv64gc_xtheadvector" { target { rv64 } } } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+/*
+** foo0_1:
+**	sb\tzero,0([a-x0-9]+)
+**	ret
+*/
+void foo0_1 (void *p)
+{
+  __builtin_memset (p, 0, 1);
+}
+
+/*
+** foo0_7:
+**	sb\tzero,0([a-x0-9]+)
+**	sb\tzero,1([a-x0-9]+)
+**	sb\tzero,2([a-x0-9]+)
+**	sb\tzero,3([a-x0-9]+)
+**	sb\tzero,4([a-x0-9]+)
+**	sb\tzero,5([a-x0-9]+)
+**	sb\tzero,6([a-x0-9]+)
+**	ret
+*/
+void foo0_7 (void *p)
+{
+  __builtin_memset (p, 0, 7);
+}
+
+/*
+** foo1_1:
+**	li\t[a-x0-9]+,1
+**	sb\t[a-x0-9]+,0([a-x0-9]+)
+**	ret
+*/
+void foo1_1 (void *p)
+{
+  __builtin_memset (p, 1, 1);
+}
+
+/*
+** foo1_5:
+**	li\t[a-x0-9]+,1
+**	sb\t[a-x0-9]+,0([a-x0-9]+)
+**	sb\t[a-x0-9]+,1([a-x0-9]+)
+**	sb\t[a-x0-9]+,2([a-x0-9]+)
+**	sb\t[a-x0-9]+,3([a-x0-9]+)
+**	sb\t[a-x0-9]+,4([a-x0-9]+)
+**	ret
+*/
+void foo1_5 (void *p)
+{
+  __builtin_memset (p, 1, 5);
+}