[V4] RISC-V: Expand VLS mode to scalar mode move[PR111391]

Message ID 20230914104952.4173011-1-juzhe.zhong@rivai.ai
State New

Commit Message

juzhe.zhong@rivai.ai Sept. 14, 2023, 10:49 a.m. UTC
This patch fixes https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111391

	PR target/111391

gcc/ChangeLog:

	* config/riscv/autovec.md (@vec_extract<mode><vel>): Remove @.
	(vec_extract<mode><vel>): Ditto.
	* config/riscv/riscv-vsetvl.cc (emit_vsetvl_insn): Also dump rinsn.
	(pass_vsetvl::local_eliminate_vsetvl_insn): Get the AVL from curr_dem.
	* config/riscv/riscv.cc (riscv_legitimize_move): Expand VLS mode to scalar mode move.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/autovec/partial/slp-9.c: Adapt test.
	* gcc.target/riscv/rvv/autovec/pr111391-1.c: New test.
	* gcc.target/riscv/rvv/autovec/pr111391-2.c: New test.

---
 gcc/config/riscv/autovec.md                   |  2 +-
 gcc/config/riscv/riscv-vsetvl.cc              |  4 +-
 gcc/config/riscv/riscv.cc                     | 64 +++++++++++++++++++
 .../riscv/rvv/autovec/partial/slp-9.c         |  1 -
 .../gcc.target/riscv/rvv/autovec/pr111391-1.c | 28 ++++++++
 .../gcc.target/riscv/rvv/autovec/pr111391-2.c | 10 +++
 6 files changed, 106 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/pr111391-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/pr111391-2.c

Comments

Kito Cheng Sept. 14, 2023, 4:17 p.m. UTC | #1
I think what we are doing here is effectively allowing scalar modes
within vector registers, so I am not sure: should we try to implement
that within the mov pattern?

I guess we need some input from Jeff.


e.g.
diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 0ecda795b38..ffced41588d 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -7621,6 +7621,9 @@ riscv_hard_regno_mode_ok (unsigned int regno, machine_mode mode)
    }
  else if (V_REG_P (regno))
    {
+      if (mode is scalar)
+       return true;
+
      if (!riscv_v_ext_mode_p (mode))
       return false;

diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
index 6d6a2b3748c..50bac39f125 100644
--- a/gcc/config/riscv/riscv.md
+++ b/gcc/config/riscv/riscv.md
@@ -2035,8 +2035,8 @@ (define_insn "*movdi_32bit"
   (set_attr "ext" "base,base,base,base,d,d,d,d,d,vector")])

(define_insn "*movdi_64bit"
-  [(set (match_operand:DI 0 "nonimmediate_operand" "=r,r,r, m, *f,*f,*r,*f,*m,r")
-       (match_operand:DI 1 "move_operand"         " r,T,m,rJ,*r*J,*m,*f,*f,*f,vp"))]
+  [(set (match_operand:DI 0 "nonimmediate_operand" "=r,r,r, m, *f,*f,*r,*f,*m,r,*vr,*r,*vr,*vr,*m")
+       (match_operand:DI 1 "move_operand"         " r,T,m,rJ,*r*J,*m,*f,*f,*f,vp,vr,vr,r,m,vr"))]
  "TARGET_64BIT
   && (register_operand (operands[0], DImode)
       || reg_or_0_operand (operands[1], DImode))"
Robin Dapp Sept. 14, 2023, 9:06 p.m. UTC | #2
> I think what we are doing here is effectively allowing scalar modes
> within vector registers, so I am not sure: should we try to implement
> that within the mov pattern?
> 
> I guess we need some input from Jeff.

Sorry for the late response.  I have also been thinking about this and
it feels a bit like a bandaid to me.  Usually register-class moves like
this are performed by reload (which consults register_move_costs among
other things) and we are working around it.

The situation is that we move a vec_duplicate of QImodes into a vector
register.  Then we want to use this as scalar call argument so we need
to transfer it back to a DImode register.

A perhaps more typical solution would be to allow small VLS vector modes
like V8QI in GPRs (via hard_regno_mode_ok) until reload so we could have
a (set (reg:V8QI a0) (vec_duplicate:V8QI ...)).

The next step would be to have a mov<mode> expander with target "r"
constraint (and source "vr") that performs the actual move.  This is
where Juzhe's mov code could fit in (without the subreg handling).
If I'm not mistaken vmv.x.s without slidedown should be sufficient for
our case as we'd only want to use the whole thing when the full vector
fits into a GPR. 

All that's missing is a (reinterpreting) vtype change to Pmode-sized
elements before. I quickly hacked something together (without the proper
mode change) and the resulting code looks like:

	vsetvli zero, 8, e8, ...
	vmv.v.x	v1,a5
	# missing vsetivli zero, 1, e64, ... or something
	vmv.x.s	a0,v1

Now, whether that's efficient (and desirable) is a separate issue and
should probably be defined by register_move_costs as well as instruction
costs.  I wasn't actually aware of this call/argument optimization that
uses vec_duplicate and I haven't checked what costing (if at all) it
uses.

Regards
 Robin
juzhe.zhong@rivai.ai Sept. 14, 2023, 10:25 p.m. UTC | #3
>> All that's missing is a (reinterpreting) vtype change to Pmode-sized
>> elements before. I quickly hacked something together (without the proper
>> mode change) and the resulting code looks like:
 
>> vsetvli zero, 8, e8, ...
>> vmv.v.x v1,a5
>>         # missing vsetivli zero, 1, e64, ... or something
>> vmv.x.s a0,v1

This issue has been addressed by this patch:
[PATCH V3] RISC-V: Fix ICE in get_avl_or_vl_reg (gnu.org)



juzhe.zhong@rivai.ai Sept. 14, 2023, 10:26 p.m. UTC | #4
I don't think Kito's suggestion can fix the case of -march=rv64gc_zve32x.



juzhe.zhong@rivai.ai Sept. 14, 2023, 10:28 p.m. UTC | #5
>> Now, whether that's efficient (and desirable) is a separate issue and
>> should probably be defined by register_move_costs as well as instruction
>> costs.  I wasn't actually aware of this call/argument optimization that
>> uses vec_duplicate and I haven't checked what costing (if at all) it
>> uses.

This patch is not a performance improvement; it is a bug fix.
I am not trying to optimize the codegen, which is why I put it into the move pattern and handle it statically.



Jeff Law Sept. 15, 2023, 3:27 p.m. UTC | #6
On 9/14/23 16:26, 钟居哲 wrote:
> I don't think it can fix the case when it is -march=rv64gc_zve32x
> 
>     On 2023-09-15 00:17, Kito Cheng wrote:
>     I think what we are doing here is effectively allowing scalar modes
>     within vector registers, so I am not sure: should we try to implement
>     that within the mov pattern?
>     I guess we need some input from Jeff.
It's advantageous if we can avoid it.  It often gets quite ugly when you 
start allowing something like scalar modes in vector registers -- 
particularly if you support something other than simple moves.

But we may end up needing to do that anyway to do something like 
supporting the sqrt & recip estimators in scalar FP modes, which can be 
very advantageous for benchmarks like nab.

So my suggestion would be go ahead if it looks like it can really solve 
this problem -- knowing there will likely be a long tail of fallout.  If 
it doesn't help pr111391, then let's defer until we really dive into 
544.nab/644.nab and try to improve that sqrt (x) and 1/sqrt(x) sequence 
that shows up in there.

jeff
juzhe.zhong@rivai.ai Sept. 15, 2023, 3:34 p.m. UTC | #7
You mean this patch is ok?



Robin Dapp Sept. 15, 2023, 3:43 p.m. UTC | #8
> You mean this patch is ok?

I thought about it a bit more.  From my point of view the patch is OK
for now in order to get the bug out of the way.

In the longer term I would really prefer a more "regular" solution
(i.e. via hard_regno_mode_ok) and related.  I can take care of that
once I have a bit of time but for now let's go ahead.

Regards
 Robin
Li, Pan2 via Gcc-patches Sept. 16, 2023, 9:56 a.m. UTC | #9
Committed, thanks Robin.

Pan

Patch

diff --git a/gcc/config/riscv/autovec.md b/gcc/config/riscv/autovec.md
index e74a1695709..7121bab1716 100644
--- a/gcc/config/riscv/autovec.md
+++ b/gcc/config/riscv/autovec.md
@@ -1442,7 +1442,7 @@ 
 ;; -------------------------------------------------------------------------
 ;; ---- [INT,FP] Extract a vector element.
 ;; -------------------------------------------------------------------------
-(define_expand "@vec_extract<mode><vel>"
+(define_expand "vec_extract<mode><vel>"
   [(set (match_operand:<VEL>	  0 "register_operand")
      (vec_select:<VEL>
        (match_operand:V_VLS	  1 "register_operand")
diff --git a/gcc/config/riscv/riscv-vsetvl.cc b/gcc/config/riscv/riscv-vsetvl.cc
index dc02246756d..5f031c18df5 100644
--- a/gcc/config/riscv/riscv-vsetvl.cc
+++ b/gcc/config/riscv/riscv-vsetvl.cc
@@ -649,6 +649,8 @@  emit_vsetvl_insn (enum vsetvl_type insn_type, enum emit_type emit_type,
     {
       fprintf (dump_file, "\nInsert vsetvl insn PATTERN:\n");
       print_rtl_single (dump_file, pat);
+      fprintf (dump_file, "\nfor insn:\n");
+      print_rtl_single (dump_file, rinsn);
     }
 
   if (emit_type == EMIT_DIRECT)
@@ -3867,7 +3869,7 @@  pass_vsetvl::local_eliminate_vsetvl_insn (const bb_info *bb) const
 	      skip_one = true;
 	    }
 
-	  curr_avl = get_avl (rinsn);
+	  curr_avl = curr_dem.get_avl ();
 
 	  /* Some instrucion like pred_extract_first<mode> don't reqruie avl, so
 	     the avl is null, use vl_placeholder for unify the handling
diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 762937b0e37..8c766e2e2be 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -2513,6 +2513,70 @@  riscv_legitimize_move (machine_mode mode, rtx dest, rtx src)
 	}
       return true;
     }
+  /* Expand
+       (set (reg:DI target) (subreg:DI (reg:V8QI reg) 0))
+     Expand this data movement instead of simply forbidding it, since
+     we can improve the code generation for the following scenario
+     produced by RVV auto-vectorization:
+       (set (reg:V8QI 149) (vec_duplicate:V8QI (reg:QI)))
+       (set (reg:DI target) (subreg:DI (reg:V8QI reg) 0))
+     Since RVV mode and scalar mode are in different REG_CLASS,
+     we need to explicitly move data from V_REGS to GR_REGS by scalar move.  */
+  if (SUBREG_P (src) && riscv_v_ext_mode_p (GET_MODE (SUBREG_REG (src))))
+    {
+      machine_mode vmode = GET_MODE (SUBREG_REG (src));
+      unsigned int mode_size = GET_MODE_SIZE (mode).to_constant ();
+      unsigned int vmode_size = GET_MODE_SIZE (vmode).to_constant ();
+      unsigned int nunits = vmode_size / mode_size;
+      scalar_mode smode = as_a<scalar_mode> (mode);
+      unsigned int index = SUBREG_BYTE (src).to_constant () / mode_size;
+      unsigned int num = smode == DImode && !TARGET_VECTOR_ELEN_64 ? 2 : 1;
+
+      if (num == 2)
+	{
+	  /* If we want to extract 64bit value but ELEN < 64,
+	     we use RVV vector mode with EEW = 32 to extract
+	     the highpart and lowpart.  */
+	  smode = SImode;
+	  nunits = nunits * 2;
+	}
+      vmode = riscv_vector::get_vector_mode (smode, nunits).require ();
+      enum insn_code icode
+	= convert_optab_handler (vec_extract_optab, vmode, smode);
+      gcc_assert (icode != CODE_FOR_nothing);
+      rtx v = gen_lowpart (vmode, SUBREG_REG (src));
+
+      for (unsigned int i = 0; i < num; i++)
+	{
+	  class expand_operand ops[3];
+	  rtx result;
+	  if (num == 1)
+	    result = dest;
+	  else if (i == 0)
+	    result = gen_lowpart (smode, dest);
+	  else
+	    result = gen_reg_rtx (smode);
+	  create_output_operand (&ops[0], result, smode);
+	  ops[0].target = 1;
+	  create_input_operand (&ops[1], v, vmode);
+	  create_integer_operand (&ops[2], index + i);
+	  expand_insn (icode, 3, ops);
+	  if (ops[0].value != result)
+	    emit_move_insn (result, ops[0].value);
+
+	  if (i == 1)
+	    {
+	      rtx tmp
+		= expand_binop (Pmode, ashl_optab, gen_lowpart (Pmode, result),
+				gen_int_mode (32, Pmode), NULL_RTX, 0,
+				OPTAB_DIRECT);
+	      rtx tmp2 = expand_binop (Pmode, ior_optab, tmp, dest, NULL_RTX, 0,
+				       OPTAB_DIRECT);
+	      emit_move_insn (dest, tmp2);
+	    }
+	}
+      return true;
+    }
   /* Expand
        (set (reg:QI target) (mem:QI (address)))
      to
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/slp-9.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/slp-9.c
index 5fba27c7a35..7c42438c9d9 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/slp-9.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/slp-9.c
@@ -29,4 +29,3 @@ 
 TEST_ALL (VEC_PERM)
 
 /* { dg-final { scan-assembler-times {viota.m} 2 } } */
-/* { dg-final { scan-assembler-not {vmv\.v\.i} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr111391-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr111391-1.c
new file mode 100644
index 00000000000..a7f64c937c6
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr111391-1.c
@@ -0,0 +1,28 @@ 
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv -mabi=lp64d -Wno-int-conversion -Wno-implicit-function -Wno-incompatible-pointer-types -Wno-implicit-function-declaration -Ofast -ftree-vectorize" } */
+
+int d ();
+typedef struct
+{
+  int b;
+} c;
+int
+e (char *f, long g)
+{
+  f += g;
+  while (g--)
+    *--f = d;
+}
+
+int
+d (c * f)
+{
+  while (h ())
+    switch (f->b)
+      case 'Q':
+      {
+	long a;
+	e (&a, sizeof (a));
+	i (a);
+      }
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr111391-2.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr111391-2.c
new file mode 100644
index 00000000000..1f170c962e1
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr111391-2.c
@@ -0,0 +1,10 @@ 
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gc_zve32x_zvl128b -mabi=lp64d -Wno-int-conversion -Wno-implicit-function -Wno-incompatible-pointer-types -Wno-implicit-function-declaration -Ofast -ftree-vectorize" } */
+
+#include "pr111391-1.c"
+
+/* { dg-final { scan-assembler-times {vsetivli\s+zero,\s*2,\s*e32,\s*mf2,\s*t[au],\s*m[au]} 1 } } */
+/* { dg-final { scan-assembler-times {vmv\.x\.s} 2 } } */
+/* { dg-final { scan-assembler-times {vslidedown.vi\s+v[0-9]+,\s*v[0-9]+,\s*1} 1 } } */
+/* { dg-final { scan-assembler-times {slli\s+[a-x0-9]+,[a-x0-9]+,32} 1 } } */
+/* { dg-final { scan-assembler-times {or\s+[a-x0-9]+,[a-x0-9]+,[a-x0-9]+} 1 } } */