mbox series

[v6,00/17] riscv: Support vendor extensions and xtheadvector

Message ID 20240503-dev-charlie-support_thead_vector_6_9-v6-0-cb7624e65d82@rivosinc.com
Headers show
Series riscv: Support vendor extensions and xtheadvector | expand

Message

Charlie Jenkins May 3, 2024, 6:18 p.m. UTC
This patch series ended up much larger than expected, please bear with
me! The goal here is to support vendor extensions, starting at probing
the device tree and ending with reporting to userspace.

The main design objective was to allow vendors to operate independently
of each other. This has been achieved by delegating vendor extensions to
a their own files and then accumulating the extensions in
arch/riscv/kernel/vendor_extensions.c.

Each vendor will have their own list of extensions they support.

There is a new hwprobe key RISCV_HWPROBE_KEY_VENDOR_EXT_THEAD_0 that is
used to request which thead vendor extensions are supported on the
current platform. This allows future vendors to allocate hwprobe keys
for their vendor.

On to the xtheadvector specific code. xtheadvector is a custom extension
that is based upon riscv vector version 0.7.1 [1]. All of the vector
routines have been modified to support this alternative vector version
based upon whether xtheadvector was determined to be supported at boot.
I have tested this with an Allwinner Nezha board. I ran into issues
booting the board on 6.9-rc1 so I applied these patches to 6.8. There
are a couple of minor merge conflicts that do arrise when doing that, so
please let me know if you have been able to boot this board with a 6.9
kernel. I used SkiffOS [2] to manage building the image, but upgraded
the U-Boot version to Samuel Holland's more up-to-date version [3] and
changed out the device tree used by U-Boot with the device trees that
are present in upstream linux and this series. Thank you Samuel for all
of the work you did to make this task possible.

To test the integration, I used the riscv vector kselftests. I modified
the test cases to be able to more easily extend them, and then added a
xtheadvector target that works by calling hwprobe and swapping out the
vector asm if needed.

[1] https://github.com/T-head-Semi/thead-extension-spec/blob/95358cb2cca9489361c61d335e03d3134b14133f/xtheadvector.adoc
[2] https://github.com/skiffos/SkiffOS/tree/master/configs/allwinner/nezha
[3] https://github.com/smaeul/u-boot/commit/2e89b706f5c956a70c989cd31665f1429e9a0b48

Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
---
Changes in v6:
- Only check vlenb from of if vector enabled in kernel (Conor)
- No need for variadic args in VENDOR_EXTENSION_SUPPORTED so just use a
  standard argument
- Make 'first' variable in riscv_fill_vendor_ext_list() static so that
  the variable value remains across calls to the function (Evan)
- Link to v5: https://lore.kernel.org/r/20240502-dev-charlie-support_thead_vector_6_9-v5-0-d1b5c013a966@rivosinc.com

Changes in v5:
- Make all vendors have the same size bitmap
- Extract vendor hwprobe code into helper macro
- Fix bug related to the handling of vendor extensions in the parsing of
  the isa string (Conor)
- Fix bug with the vendor bitmap being incorrectly populated (Evan)
- Add vendor extensions to /proc/cpuinfo
- Link to v4: https://lore.kernel.org/r/20240426-dev-charlie-support_thead_vector_6_9-v4-0-5cf53b5bc492@rivosinc.com

Changes in v4:
- Disable vector immediately if vlenb from the device tree is not
  homogeneous
- Hide vendor extension code behind a hidden config that vendor
  extensions select to eliminate the code when kernel is compiled
  without vendor extensions
- Clear up naming conventions and introduce some defines to make the
  vendor extension code clearer
- Link to v3: https://lore.kernel.org/r/20240420-dev-charlie-support_thead_vector_6_9-v3-0-67cff4271d1d@rivosinc.com

Changes in v3:
- Allow any hardware to support any vendor extension, rather than
  restricting the vendor extensions to the same vendor as the hardware
- Introduce config options to enable/disable a vendor's extensions
- Link to v2: https://lore.kernel.org/r/20240415-dev-charlie-support_thead_vector_6_9-v2-0-c7d68c603268@rivosinc.com

Changes in v2:
- Added commit hash to xtheadvector
- Simplified riscv,isa vector removal fix to not mess with the DT
  riscv,vendorid
- Moved riscv,vendorid parsing into a different patch and cache the
  value to be used by alternative patching
- Reduce riscv,vendorid missing severity to "info"
- Separate vendor extension list to vendor files
- xtheadvector no longer puts v in the elf_hwcap
- Only patch vendor extension if all harts are associated with the same
  vendor. This is the best chance the kernel has for working properly if
  there are multiple vendors.
- Split hwprobe vendor keys out into vendor file
- Add attribution for Heiko's patches
- Link to v1: https://lore.kernel.org/r/20240411-dev-charlie-support_thead_vector_6_9-v1-0-4af9815ec746@rivosinc.com

---
Charlie Jenkins (15):
      dt-bindings: riscv: Add xtheadvector ISA extension description
      riscv: vector: Use vlenb from DT
      riscv: dts: allwinner: Add xtheadvector to the D1/D1s devicetree
      riscv: Extend cpufeature.c to detect vendor extensions
      riscv: Add vendor extensions to /proc/cpuinfo
      riscv: Introduce vendor variants of extension helpers
      riscv: cpufeature: Extract common elements from extension checking
      riscv: Convert xandespmu to use the vendor extension framework
      riscv: csr: Add CSR encodings for VCSR_VXRM/VCSR_VXSAT
      riscv: Add xtheadvector instruction definitions
      riscv: vector: Support xtheadvector save/restore
      riscv: hwprobe: Add thead vendor extension probing
      riscv: hwprobe: Document thead vendor extensions and xtheadvector extension
      selftests: riscv: Fix vector tests
      selftests: riscv: Support xtheadvector in vector tests

Conor Dooley (1):
      dt-bindings: riscv: cpus: add a vlen register length property

Heiko Stuebner (1):
      RISC-V: define the elements of the VCSR vector CSR

 Documentation/arch/riscv/hwprobe.rst               |  10 +
 Documentation/devicetree/bindings/riscv/cpus.yaml  |   6 +
 .../devicetree/bindings/riscv/extensions.yaml      |  10 +
 arch/riscv/Kconfig                                 |   2 +
 arch/riscv/Kconfig.vendor                          |  44 +++
 arch/riscv/boot/dts/allwinner/sun20i-d1s.dtsi      |   3 +-
 arch/riscv/errata/andes/errata.c                   |   2 +
 arch/riscv/errata/sifive/errata.c                  |   3 +
 arch/riscv/errata/thead/errata.c                   |   3 +
 arch/riscv/include/asm/cpufeature.h                |  98 ++++---
 arch/riscv/include/asm/csr.h                       |  13 +
 arch/riscv/include/asm/hwcap.h                     |   1 -
 arch/riscv/include/asm/hwprobe.h                   |   4 +-
 arch/riscv/include/asm/switch_to.h                 |   2 +-
 arch/riscv/include/asm/vector.h                    | 247 +++++++++++++----
 arch/riscv/include/asm/vendor_extensions.h         | 103 +++++++
 arch/riscv/include/asm/vendor_extensions/andes.h   |  19 ++
 arch/riscv/include/asm/vendor_extensions/thead.h   |  42 +++
 .../include/asm/vendor_extensions/thead_hwprobe.h  |  18 ++
 .../include/asm/vendor_extensions/vendor_hwprobe.h |  37 +++
 arch/riscv/include/uapi/asm/hwprobe.h              |   3 +-
 arch/riscv/include/uapi/asm/vendor/thead.h         |   3 +
 arch/riscv/kernel/Makefile                         |   2 +
 arch/riscv/kernel/cpu.c                            |  35 ++-
 arch/riscv/kernel/cpufeature.c                     | 175 +++++++++---
 arch/riscv/kernel/kernel_mode_vector.c             |   8 +-
 arch/riscv/kernel/process.c                        |   4 +-
 arch/riscv/kernel/signal.c                         |   6 +-
 arch/riscv/kernel/sys_hwprobe.c                    |   5 +
 arch/riscv/kernel/vector.c                         |  25 +-
 arch/riscv/kernel/vendor_extensions.c              |  66 +++++
 arch/riscv/kernel/vendor_extensions/Makefile       |   5 +
 arch/riscv/kernel/vendor_extensions/andes.c        |  18 ++
 arch/riscv/kernel/vendor_extensions/thead.c        |  18 ++
 .../riscv/kernel/vendor_extensions/thead_hwprobe.c |  19 ++
 drivers/perf/riscv_pmu_sbi.c                       |   9 +-
 tools/testing/selftests/riscv/vector/.gitignore    |   3 +-
 tools/testing/selftests/riscv/vector/Makefile      |  17 +-
 .../selftests/riscv/vector/v_exec_initval_nolibc.c |  93 +++++++
 tools/testing/selftests/riscv/vector/v_helpers.c   |  67 +++++
 tools/testing/selftests/riscv/vector/v_helpers.h   |   7 +
 tools/testing/selftests/riscv/vector/v_initval.c   |  22 ++
 .../selftests/riscv/vector/v_initval_nolibc.c      |  68 -----
 .../selftests/riscv/vector/vstate_exec_nolibc.c    |  20 +-
 .../testing/selftests/riscv/vector/vstate_prctl.c  | 295 ++++++++++++---------
 45 files changed, 1319 insertions(+), 341 deletions(-)
---
base-commit: 4cece764965020c22cff7665b18a012006359095
change-id: 20240411-dev-charlie-support_thead_vector_6_9-1591fc2a431d

Comments

Evan Green May 3, 2024, 9:02 p.m. UTC | #1
On Fri, May 3, 2024 at 11:18 AM Charlie Jenkins <charlie@rivosinc.com> wrote:
>
> Separate vendor extensions out into one struct per vendor
> instead of adding vendor extensions onto riscv_isa_ext.
>
> Add a hidden config RISCV_ISA_VENDOR_EXT to conditionally include this
> code.
>
> The xtheadvector vendor extension is added using these changes.
>
> Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>

Reviewed-by: Evan Green <evan@rivosinc.com>
Evan Green May 3, 2024, 9:03 p.m. UTC | #2
On Fri, May 3, 2024 at 11:18 AM Charlie Jenkins <charlie@rivosinc.com> wrote:
>
> All of the supported vendor extensions that have been listed in
> riscv_isa_vendor_ext_list can be exported through /proc/cpuinfo.
>
> Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>

Reviewed-by: Evan Green <evan@rivosinc.com>
Conor Dooley May 7, 2024, 5:03 p.m. UTC | #3
On Fri, May 03, 2024 at 11:18:21AM -0700, Charlie Jenkins wrote:
> All of the supported vendor extensions that have been listed in
> riscv_isa_vendor_ext_list can be exported through /proc/cpuinfo.
> 
> Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>

This seems fine, thanks for updating this interface :)

Reviewed-by: Conor Dooley <conor.dooley@microchip.com>

Cheers,
Conor.
Conor Dooley May 7, 2024, 5:21 p.m. UTC | #4
On Fri, May 03, 2024 at 11:18:15AM -0700, Charlie Jenkins wrote:
> This patch series ended up much larger than expected, please bear with
> me! The goal here is to support vendor extensions, starting at probing
> the device tree and ending with reporting to userspace.

I think I've reviewed all the bits I care about, so thanks for all of
your updates. It'd be nice if Andy could look at the actual vector parts
of it, so I'm adding him to CC.

Cheers,
Conor.
Conor Dooley May 9, 2024, 8:17 a.m. UTC | #5
Hey Charlie,

Just me being a pain again...

On Fri, May 03, 2024 at 11:18:18AM -0700, Charlie Jenkins wrote:
> @@ -671,6 +713,11 @@ void __init riscv_fill_hwcap(void)
>  			pr_info("Falling back to deprecated \"riscv,isa\"\n");
>  			riscv_fill_hwcap_from_isa_string(isa2hwcap);
>  		}
> +
> +		if (elf_hwcap & COMPAT_HWCAP_ISA_V && has_riscv_homogeneous_vlenb() < 0) {
> +			pr_warn("Unsupported heterogeneous vlen detected, vector extension disabled.\n");
> +			elf_hwcap &= ~COMPAT_HWCAP_ISA_V;
> +		}
>  	}

After replying to Andy this morning about doing some dependency checking
for vector-reliant extensions I realised that we're actually doing this
check too late to be useful for that case.
Say for Zvkb we want to check if vector has been enabled before marking
the extension available, and without Clement's series that re-works the
riscv_isa_extension_check(), we need to ensure that vector support isn't
gonna get turned after we've already marked Zvkb as usable. If we could
move the riscv_has_homogeneous_vlenb() call before probing extensions we
could then examine the result in riscv_isa_extension_check() for V and
make sure it doesn't get enabled in the first place, rather than
clearing it after the fact.

There's a whole load of moving pieces between different series here at
the moment though, I wish some of it would land so that I could send
some cleanup patches for what I think is inconsistency on top :) I
wouldn't mind if this landed as-is, it's still an improvement and the
user I mention above doesn't actually exist yet.

Reviewed-by: Conor Dooley <conor.dooley@microchip.com>

Cheers,
Conor.
Conor Dooley May 9, 2024, 8:18 a.m. UTC | #6
On Fri, May 03, 2024 at 11:18:20AM -0700, Charlie Jenkins wrote:
> Separate vendor extensions out into one struct per vendor
> instead of adding vendor extensions onto riscv_isa_ext.
> 
> Add a hidden config RISCV_ISA_VENDOR_EXT to conditionally include this
> code.
> 
> The xtheadvector vendor extension is added using these changes.
> 
> Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>

I thought that I had reviewed this the other day,
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>

Cheers,
Conor.
Conor Dooley May 10, 2024, 8:50 p.m. UTC | #7
On Tue, May 07, 2024 at 06:03:19PM +0100, Conor Dooley wrote:
> On Fri, May 03, 2024 at 11:18:21AM -0700, Charlie Jenkins wrote:
> > All of the supported vendor extensions that have been listed in
> > riscv_isa_vendor_ext_list can be exported through /proc/cpuinfo.
> > 
> > Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
> 
> This seems fine, thanks for updating this interface :)
> 
> Reviewed-by: Conor Dooley <conor.dooley@microchip.com>

Hmm, actually the automation on patchwork is complaining a bunch about
the series, but I think that's mostly false positives except for this
patch. The nommu defconfigs are prob the easiest way to reproduce this:
  /build/tmp.QPMRM3oUNu/arch/riscv/kernel/vendor_extensions.c:41:55: error: 'struct riscv_isa_vendor_ext_data_list' has no member named 'vendor_bitmap'
  /build/tmp.QPMRM3oUNu/arch/riscv/kernel/vendor_extensions.c:42:60: error: 'struct riscv_isa_vendor_ext_data_list' has no member named 'per_hart_vendor_bitmap'; did you mean 'per_hart_isa_bitmap'?
  /build/tmp.QPMRM3oUNu/arch/riscv/kernel/vendor_extensions.c:43:60: error: 'struct riscv_isa_vendor_ext_data_list' has no member named 'bitmap_size'

Cheers,
Conor.
Charlie Jenkins May 10, 2024, 9:25 p.m. UTC | #8
On Fri, May 10, 2024 at 09:50:32PM +0100, Conor Dooley wrote:
> On Tue, May 07, 2024 at 06:03:19PM +0100, Conor Dooley wrote:
> > On Fri, May 03, 2024 at 11:18:21AM -0700, Charlie Jenkins wrote:
> > > All of the supported vendor extensions that have been listed in
> > > riscv_isa_vendor_ext_list can be exported through /proc/cpuinfo.
> > > 
> > > Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
> > 
> > This seems fine, thanks for updating this interface :)
> > 
> > Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
> 
> Hmm, actually the automation on patchwork is complaining a bunch about
> the series, but I think that's mostly false positives except for this
> patch. The nommu defconfigs are prob the easiest way to reproduce this:
>   /build/tmp.QPMRM3oUNu/arch/riscv/kernel/vendor_extensions.c:41:55: error: 'struct riscv_isa_vendor_ext_data_list' has no member named 'vendor_bitmap'
>   /build/tmp.QPMRM3oUNu/arch/riscv/kernel/vendor_extensions.c:42:60: error: 'struct riscv_isa_vendor_ext_data_list' has no member named 'per_hart_vendor_bitmap'; did you mean 'per_hart_isa_bitmap'?
>   /build/tmp.QPMRM3oUNu/arch/riscv/kernel/vendor_extensions.c:43:60: error: 'struct riscv_isa_vendor_ext_data_list' has no member named 'bitmap_size'
> 
> Cheers,
> Conor.

The false negatives always throw me off. The errors are also offset by
one patch. This was actually introduced in the following patch "riscv:
Introduce vendor variants of extension helpers" because I accidentally
fixed this issue in the patch "riscv: cpufeature: Extract common
elements from extension checking" instead of the one it was introduced
in.

- Charlie
Conor Dooley May 10, 2024, 9:32 p.m. UTC | #9
On Fri, May 10, 2024 at 02:25:26PM -0700, Charlie Jenkins wrote:
> On Fri, May 10, 2024 at 09:50:32PM +0100, Conor Dooley wrote:
> > On Tue, May 07, 2024 at 06:03:19PM +0100, Conor Dooley wrote:
> > > On Fri, May 03, 2024 at 11:18:21AM -0700, Charlie Jenkins wrote:
> > > > All of the supported vendor extensions that have been listed in
> > > > riscv_isa_vendor_ext_list can be exported through /proc/cpuinfo.
> > > > 
> > > > Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
> > > 
> > > This seems fine, thanks for updating this interface :)
> > > 
> > > Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
> > 
> > Hmm, actually the automation on patchwork is complaining a bunch about
> > the series, but I think that's mostly false positives except for this
> > patch. The nommu defconfigs are prob the easiest way to reproduce this:
> >   /build/tmp.QPMRM3oUNu/arch/riscv/kernel/vendor_extensions.c:41:55: error: 'struct riscv_isa_vendor_ext_data_list' has no member named 'vendor_bitmap'
> >   /build/tmp.QPMRM3oUNu/arch/riscv/kernel/vendor_extensions.c:42:60: error: 'struct riscv_isa_vendor_ext_data_list' has no member named 'per_hart_vendor_bitmap'; did you mean 'per_hart_isa_bitmap'?
> >   /build/tmp.QPMRM3oUNu/arch/riscv/kernel/vendor_extensions.c:43:60: error: 'struct riscv_isa_vendor_ext_data_list' has no member named 'bitmap_size'
> > 
> > Cheers,
> > Conor.
> 
> The false negatives always throw me off.

Aye, it's pretty frustrating for me trying to report anything. Any time
a bunch of headers change produces a bunch of file rebuilds and
therefore warnings. That should in theory be caught by the fact that we
apply the patch & build, jump back to HEAD~1, build that & grab the
"before" warning state and then jump forward, rebuild the patch and
gather the "after" state. The idea is that that is an apples:apples
comparison as the same files will need to be rebuilt for both but it is
falling over somewhere. Maybe I'll have time to look into that soonTM.

> The errors are also offset by
> one patch.

Ye, that's my bad I think. In a rush off to another patch before the
thought I had on it left my brain and just pressed reply on the wrong
email. Sorry bout that :)

> This was actually introduced in the following patch "riscv:
> Introduce vendor variants of extension helpers" because I accidentally
> fixed this issue in the patch "riscv: cpufeature: Extract common
> elements from extension checking" instead of the one it was introduced
> in.
Charlie Jenkins May 10, 2024, 9:47 p.m. UTC | #10
On Fri, May 10, 2024 at 10:32:19PM +0100, Conor Dooley wrote:
> On Fri, May 10, 2024 at 02:25:26PM -0700, Charlie Jenkins wrote:
> > On Fri, May 10, 2024 at 09:50:32PM +0100, Conor Dooley wrote:
> > > On Tue, May 07, 2024 at 06:03:19PM +0100, Conor Dooley wrote:
> > > > On Fri, May 03, 2024 at 11:18:21AM -0700, Charlie Jenkins wrote:
> > > > > All of the supported vendor extensions that have been listed in
> > > > > riscv_isa_vendor_ext_list can be exported through /proc/cpuinfo.
> > > > > 
> > > > > Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
> > > > 
> > > > This seems fine, thanks for updating this interface :)
> > > > 
> > > > Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
> > > 
> > > Hmm, actually the automation on patchwork is complaining a bunch about
> > > the series, but I think that's mostly false positives except for this
> > > patch. The nommu defconfigs are prob the easiest way to reproduce this:
> > >   /build/tmp.QPMRM3oUNu/arch/riscv/kernel/vendor_extensions.c:41:55: error: 'struct riscv_isa_vendor_ext_data_list' has no member named 'vendor_bitmap'
> > >   /build/tmp.QPMRM3oUNu/arch/riscv/kernel/vendor_extensions.c:42:60: error: 'struct riscv_isa_vendor_ext_data_list' has no member named 'per_hart_vendor_bitmap'; did you mean 'per_hart_isa_bitmap'?
> > >   /build/tmp.QPMRM3oUNu/arch/riscv/kernel/vendor_extensions.c:43:60: error: 'struct riscv_isa_vendor_ext_data_list' has no member named 'bitmap_size'
> > > 
> > > Cheers,
> > > Conor.
> > 
> > The false negatives always throw me off.
> 
> Aye, it's pretty frustrating for me trying to report anything. Any time
> a bunch of headers change produces a bunch of file rebuilds and
> therefore warnings. That should in theory be caught by the fact that we
> apply the patch & build, jump back to HEAD~1, build that & grab the
> "before" warning state and then jump forward, rebuild the patch and
> gather the "after" state. The idea is that that is an apples:apples
> comparison as the same files will need to be rebuilt for both but it is
> falling over somewhere. Maybe I'll have time to look into that soonTM.
> 
> > The errors are also offset by
> > one patch.
> 
> Ye, that's my bad I think. In a rush off to another patch before the
> thought I had on it left my brain and just pressed reply on the wrong
> email. Sorry bout that :)
> 

The error message in full is:

Patch 7/17: Test 2/12: .github/scripts/patches/tests/build_rv64_clang_allmodconfig.sh
  Redirect to /build/tmp.OCcmyhkGEw and /build/tmp.TsjyVLqMfy
  Tree base:
  fefb1e9ecc34 ("riscv: Add vendor extensions to /proc/cpuinfo")
  Building the whole tree with the patch
  ../arch/riscv/kernel/vendor_extensions.c:41:42: error: no member named 'vendor_bitmap' in 'struct riscv_isa_vendor_ext_data_list'
  ../arch/riscv/kernel/vendor_extensions.c:42:46: error: no member named 'per_hart_vendor_bitmap' in 'struct riscv_isa_vendor_ext_data_list'; did you mean 'per_hart_isa_bitmap'?
  ../arch/riscv/kernel/vendor_extensions.c:43:47: error: no member named 'bitmap_size' in 'struct riscv_isa_vendor_ext_data_list'


It knows it's on patch 7 but then it prints the title of patch 6.

- Charlie

> > This was actually introduced in the following patch "riscv:
> > Introduce vendor variants of extension helpers" because I accidentally
> > fixed this issue in the patch "riscv: cpufeature: Extract common
> > elements from extension checking" instead of the one it was introduced
> > in.
>
Andy Chiu May 13, 2024, 8:45 a.m. UTC | #11
Hi Charlie,

Sorry, I am late on this. I haven't looked through the entire series
yet, but here is something that I thought worth bringing up sooner.

On Sat, May 4, 2024 at 2:22 AM Charlie Jenkins <charlie@rivosinc.com> wrote:
>
> Use alternatives to add support for xtheadvector vector save/restore
> routines.
>
> Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
> ---
>  arch/riscv/Kconfig.vendor              |  13 ++
>  arch/riscv/include/asm/csr.h           |   6 +
>  arch/riscv/include/asm/switch_to.h     |   2 +-
>  arch/riscv/include/asm/vector.h        | 247 ++++++++++++++++++++++++++-------
>  arch/riscv/kernel/cpufeature.c         |   2 +-
>  arch/riscv/kernel/kernel_mode_vector.c |   8 +-
>  arch/riscv/kernel/process.c            |   4 +-
>  arch/riscv/kernel/signal.c             |   6 +-
>  arch/riscv/kernel/vector.c             |  13 +-
>  9 files changed, 233 insertions(+), 68 deletions(-)
>
> diff --git a/arch/riscv/Kconfig.vendor b/arch/riscv/Kconfig.vendor
> index aa5a191e659e..edf49f3065ac 100644
> --- a/arch/riscv/Kconfig.vendor
> +++ b/arch/riscv/Kconfig.vendor
> @@ -13,6 +13,19 @@ config RISCV_ISA_VENDOR_EXT_THEAD
>           extensions. Without this option enabled, T-Head vendor extensions will
>           not be detected at boot and their presence not reported to userspace.
>
> +         If you don't know what to do here, say Y.
> +
> +config RISCV_ISA_XTHEADVECTOR
> +       bool "xtheadvector extension support"
> +       depends on RISCV_ISA_VENDOR_EXT_THEAD
> +       depends on RISCV_ISA_V
> +       depends on FPU
> +       default y
> +       help
> +         Say N here if you want to disable all xtheadvector related procedure
> +         in the kernel. This will disable vector for any T-Head board that
> +         contains xtheadvector rather than the standard vector.
> +
>           If you don't know what to do here, say Y.
>  endmenu
>
> diff --git a/arch/riscv/include/asm/csr.h b/arch/riscv/include/asm/csr.h
> index e5a35efd56e0..13657d096e7d 100644
> --- a/arch/riscv/include/asm/csr.h
> +++ b/arch/riscv/include/asm/csr.h
> @@ -30,6 +30,12 @@
>  #define SR_VS_CLEAN    _AC(0x00000400, UL)
>  #define SR_VS_DIRTY    _AC(0x00000600, UL)
>
> +#define SR_VS_THEAD            _AC(0x01800000, UL) /* xtheadvector Status */
> +#define SR_VS_OFF_THEAD                _AC(0x00000000, UL)
> +#define SR_VS_INITIAL_THEAD    _AC(0x00800000, UL)
> +#define SR_VS_CLEAN_THEAD      _AC(0x01000000, UL)
> +#define SR_VS_DIRTY_THEAD      _AC(0x01800000, UL)
> +
>  #define SR_XS          _AC(0x00018000, UL) /* Extension Status */
>  #define SR_XS_OFF      _AC(0x00000000, UL)
>  #define SR_XS_INITIAL  _AC(0x00008000, UL)
> diff --git a/arch/riscv/include/asm/switch_to.h b/arch/riscv/include/asm/switch_to.h
> index 7efdb0584d47..ada6b5cf2d94 100644
> --- a/arch/riscv/include/asm/switch_to.h
> +++ b/arch/riscv/include/asm/switch_to.h
> @@ -78,7 +78,7 @@ do {                                                  \
>         struct task_struct *__next = (next);            \
>         if (has_fpu())                                  \
>                 __switch_to_fpu(__prev, __next);        \
> -       if (has_vector())                                       \
> +       if (has_vector() || has_xtheadvector())         \
>                 __switch_to_vector(__prev, __next);     \
>         ((last) = __switch_to(__prev, __next));         \
>  } while (0)
> diff --git a/arch/riscv/include/asm/vector.h b/arch/riscv/include/asm/vector.h
> index 731dcd0ed4de..db851dc81870 100644
> --- a/arch/riscv/include/asm/vector.h
> +++ b/arch/riscv/include/asm/vector.h
> @@ -18,6 +18,27 @@
>  #include <asm/cpufeature.h>
>  #include <asm/csr.h>
>  #include <asm/asm.h>
> +#include <asm/vendorid_list.h>
> +#include <asm/vendor_extensions.h>
> +#include <asm/vendor_extensions/thead.h>
> +
> +#define __riscv_v_vstate_or(_val, TYPE) ({                             \
> +       typeof(_val) _res = _val;                                       \
> +       if (has_xtheadvector()) \
> +               _res = (_res & ~SR_VS_THEAD) | SR_VS_##TYPE##_THEAD;    \
> +       else                                                            \
> +               _res = (_res & ~SR_VS) | SR_VS_##TYPE;                  \
> +       _res;                                                           \
> +})
> +
> +#define __riscv_v_vstate_check(_val, TYPE) ({                          \
> +       bool _res;                                                      \
> +       if (has_xtheadvector()) \
> +               _res = ((_val) & SR_VS_THEAD) == SR_VS_##TYPE##_THEAD;  \
> +       else                                                            \
> +               _res = ((_val) & SR_VS) == SR_VS_##TYPE;                \
> +       _res;                                                           \
> +})
>
>  extern unsigned long riscv_v_vsize;
>  int riscv_v_setup_vsize(void);
> @@ -40,39 +61,62 @@ static __always_inline bool has_vector(void)
>         return riscv_has_extension_unlikely(RISCV_ISA_EXT_v);
>  }
>
> +static __always_inline bool has_xtheadvector_no_alternatives(void)
> +{
> +       if (IS_ENABLED(CONFIG_RISCV_ISA_XTHEADVECTOR))
> +               return riscv_isa_vendor_extension_available(THEAD_VENDOR_ID, XTHEADVECTOR);
> +       else
> +               return false;
> +}
> +
> +static __always_inline bool has_xtheadvector(void)
> +{
> +       if (IS_ENABLED(CONFIG_RISCV_ISA_XTHEADVECTOR))
> +               return riscv_has_vendor_extension_unlikely(THEAD_VENDOR_ID,
> +                                                          RISCV_ISA_VENDOR_EXT_XTHEADVECTOR);
> +       else
> +               return false;
> +}
> +
>  static inline void __riscv_v_vstate_clean(struct pt_regs *regs)
>  {
> -       regs->status = (regs->status & ~SR_VS) | SR_VS_CLEAN;
> +       regs->status = __riscv_v_vstate_or(regs->status, CLEAN);
>  }
>
>  static inline void __riscv_v_vstate_dirty(struct pt_regs *regs)
>  {
> -       regs->status = (regs->status & ~SR_VS) | SR_VS_DIRTY;
> +       regs->status = __riscv_v_vstate_or(regs->status, DIRTY);
>  }
>
>  static inline void riscv_v_vstate_off(struct pt_regs *regs)
>  {
> -       regs->status = (regs->status & ~SR_VS) | SR_VS_OFF;
> +       regs->status = __riscv_v_vstate_or(regs->status, OFF);
>  }
>
>  static inline void riscv_v_vstate_on(struct pt_regs *regs)
>  {
> -       regs->status = (regs->status & ~SR_VS) | SR_VS_INITIAL;
> +       regs->status = __riscv_v_vstate_or(regs->status, INITIAL);
>  }
>
>  static inline bool riscv_v_vstate_query(struct pt_regs *regs)
>  {
> -       return (regs->status & SR_VS) != 0;
> +       return !__riscv_v_vstate_check(regs->status, OFF);
>  }
>
>  static __always_inline void riscv_v_enable(void)
>  {
> -       csr_set(CSR_SSTATUS, SR_VS);
> +       if (has_xtheadvector())
> +               csr_set(CSR_SSTATUS, SR_VS_THEAD);
> +       else
> +               csr_set(CSR_SSTATUS, SR_VS);
>  }
>
>  static __always_inline void riscv_v_disable(void)
>  {
> -       csr_clear(CSR_SSTATUS, SR_VS);
> +       if (has_xtheadvector())
> +               csr_clear(CSR_SSTATUS, SR_VS_THEAD);
> +       else
> +               csr_clear(CSR_SSTATUS, SR_VS);
>  }
>
>  static __always_inline void __vstate_csr_save(struct __riscv_v_ext_state *dest)
> @@ -81,10 +125,47 @@ static __always_inline void __vstate_csr_save(struct __riscv_v_ext_state *dest)
>                 "csrr   %0, " __stringify(CSR_VSTART) "\n\t"
>                 "csrr   %1, " __stringify(CSR_VTYPE) "\n\t"
>                 "csrr   %2, " __stringify(CSR_VL) "\n\t"
> -               "csrr   %3, " __stringify(CSR_VCSR) "\n\t"
> -               "csrr   %4, " __stringify(CSR_VLENB) "\n\t"
>                 : "=r" (dest->vstart), "=r" (dest->vtype), "=r" (dest->vl),
> -                 "=r" (dest->vcsr), "=r" (dest->vlenb) : :);
> +               "=r" (dest->vcsr) : :);
> +
> +       if (has_xtheadvector()) {
> +               u32 tmp_vcsr;
> +               bool restore_fpu = false;
> +               unsigned long status = csr_read(CSR_SSTATUS);
> +
> +               /*
> +                * CSR_VCSR is defined as
> +                * [2:1] - vxrm[1:0]
> +                * [0] - vxsat
> +                * The earlier vector spec implemented by T-Head uses separate
> +                * registers for the same bit-elements, so just combine those
> +                * into the existing output field.
> +                *
> +                * Additionally T-Head cores need FS to be enabled when accessing
> +                * the VXRM and VXSAT CSRs, otherwise ending in illegal instructions.
> +                * Though the cores do not implement the VXRM and VXSAT fields in the
> +                * FCSR CSR that vector-0.7.1 specifies.
> +                */
> +               if ((status & SR_FS) == SR_FS_OFF) {
> +                       csr_set(CSR_SSTATUS, (status & ~SR_FS) | SR_FS_CLEAN);
> +                       restore_fpu = true;
> +               }
> +
> +               asm volatile (
> +                       "csrr   %[tmp_vcsr], " __stringify(VCSR_VXRM) "\n\t"
> +                       "slliw  %[vcsr], %[tmp_vcsr], " __stringify(VCSR_VXRM_SHIFT) "\n\t"
> +                       "csrr   %[tmp_vcsr], " __stringify(VCSR_VXSAT) "\n\t"
> +                       "or     %[vcsr], %[vcsr], %[tmp_vcsr]\n\t"
> +                       : [vcsr] "=r" (dest->vcsr), [tmp_vcsr] "=&r" (tmp_vcsr));
> +
> +               if (restore_fpu)
> +                       csr_set(CSR_SSTATUS, status);

vlenb is on ABI and ptrace needs that to recover the width of the
register. So we should probably save Xtheadvector's vlenb into vstate
if we meant to maintain ABI compatibility between V and Xtheadvector
from the kernel.

> +       } else {
> +               asm volatile (
> +                       "csrr   %[vcsr], " __stringify(CSR_VCSR) "\n\t"
> +                       "csrr   %[vlenb], " __stringify(CSR_VLENB) "\n\t"
> +                       : [vcsr] "=r" (dest->vcsr), [vlenb] "=r" (dest->vlenb));
> +       }
>  }
>
>  static __always_inline void __vstate_csr_restore(struct __riscv_v_ext_state *src)
> @@ -95,9 +176,37 @@ static __always_inline void __vstate_csr_restore(struct __riscv_v_ext_state *src
>                 "vsetvl  x0, %2, %1\n\t"
>                 ".option pop\n\t"
>                 "csrw   " __stringify(CSR_VSTART) ", %0\n\t"
> -               "csrw   " __stringify(CSR_VCSR) ", %3\n\t"
> -               : : "r" (src->vstart), "r" (src->vtype), "r" (src->vl),
> -                   "r" (src->vcsr) :);
> +               : : "r" (src->vstart), "r" (src->vtype), "r" (src->vl));
> +
> +       if (has_xtheadvector()) {
> +               u32 tmp_vcsr;
> +               bool restore_fpu = false;
> +               unsigned long status = csr_read(CSR_SSTATUS);
> +
> +               /*
> +                * Similar to __vstate_csr_save above, restore values for the
> +                * separate VXRM and VXSAT CSRs from the vcsr variable.
> +                */
> +               if ((status & SR_FS) == SR_FS_OFF) {
> +                       csr_set(CSR_SSTATUS, (status & ~SR_FS) | SR_FS_CLEAN);
> +                       restore_fpu = true;
> +               }
> +
> +               asm volatile (
> +                       "srliw  %[tmp_vcsr], %[vcsr], " __stringify(VCSR_VXRM_SHIFT) "\n\t"
> +                       "andi   %[tmp_vcsr], %[tmp_vcsr], " __stringify(VCSR_VXRM_MASK) "\n\t"
> +                       "csrw   " __stringify(VCSR_VXRM) ", %[tmp_vcsr]\n\t"
> +                       "andi   %[tmp_vcsr], %[vcsr], " __stringify(VCSR_VXSAT_MASK) "\n\t"
> +                       "csrw   " __stringify(VCSR_VXSAT) ", %[tmp_vcsr]\n\t"
> +                       : [tmp_vcsr] "=&r" (tmp_vcsr) : [vcsr] "r" (src->vcsr));
> +
> +               if (restore_fpu)
> +                       csr_set(CSR_SSTATUS, status);
> +       } else {
> +               asm volatile (
> +                       "csrw   " __stringify(CSR_VCSR) ", %[vcsr]\n\t"
> +                       : : [vcsr] "r" (src->vcsr));
> +       }
>  }
>
>  static inline void __riscv_v_vstate_save(struct __riscv_v_ext_state *save_to,
> @@ -107,19 +216,33 @@ static inline void __riscv_v_vstate_save(struct __riscv_v_ext_state *save_to,
>
>         riscv_v_enable();
>         __vstate_csr_save(save_to);
> -       asm volatile (
> -               ".option push\n\t"
> -               ".option arch, +v\n\t"
> -               "vsetvli        %0, x0, e8, m8, ta, ma\n\t"
> -               "vse8.v         v0, (%1)\n\t"
> -               "add            %1, %1, %0\n\t"
> -               "vse8.v         v8, (%1)\n\t"
> -               "add            %1, %1, %0\n\t"
> -               "vse8.v         v16, (%1)\n\t"
> -               "add            %1, %1, %0\n\t"
> -               "vse8.v         v24, (%1)\n\t"
> -               ".option pop\n\t"
> -               : "=&r" (vl) : "r" (datap) : "memory");
> +       if (has_xtheadvector()) {
> +               asm volatile (
> +                       "mv t0, %0\n\t"
> +                       THEAD_VSETVLI_T4X0E8M8D1
> +                       THEAD_VSB_V_V0T0
> +                       "add            t0, t0, t4\n\t"
> +                       THEAD_VSB_V_V0T0
> +                       "add            t0, t0, t4\n\t"
> +                       THEAD_VSB_V_V0T0
> +                       "add            t0, t0, t4\n\t"
> +                       THEAD_VSB_V_V0T0
> +                       : : "r" (datap) : "memory", "t0", "t4");
> +       } else {
> +               asm volatile (
> +                       ".option push\n\t"
> +                       ".option arch, +v\n\t"
> +                       "vsetvli        %0, x0, e8, m8, ta, ma\n\t"
> +                       "vse8.v         v0, (%1)\n\t"
> +                       "add            %1, %1, %0\n\t"
> +                       "vse8.v         v8, (%1)\n\t"
> +                       "add            %1, %1, %0\n\t"
> +                       "vse8.v         v16, (%1)\n\t"
> +                       "add            %1, %1, %0\n\t"
> +                       "vse8.v         v24, (%1)\n\t"
> +                       ".option pop\n\t"
> +                       : "=&r" (vl) : "r" (datap) : "memory");
> +       }
>         riscv_v_disable();
>  }
>
> @@ -129,55 +252,77 @@ static inline void __riscv_v_vstate_restore(struct __riscv_v_ext_state *restore_
>         unsigned long vl;
>
>         riscv_v_enable();
> -       asm volatile (
> -               ".option push\n\t"
> -               ".option arch, +v\n\t"
> -               "vsetvli        %0, x0, e8, m8, ta, ma\n\t"
> -               "vle8.v         v0, (%1)\n\t"
> -               "add            %1, %1, %0\n\t"
> -               "vle8.v         v8, (%1)\n\t"
> -               "add            %1, %1, %0\n\t"
> -               "vle8.v         v16, (%1)\n\t"
> -               "add            %1, %1, %0\n\t"
> -               "vle8.v         v24, (%1)\n\t"
> -               ".option pop\n\t"
> -               : "=&r" (vl) : "r" (datap) : "memory");
> +       if (has_xtheadvector()) {
> +               asm volatile (
> +                       "mv t0, %0\n\t"
> +                       THEAD_VSETVLI_T4X0E8M8D1
> +                       THEAD_VLB_V_V0T0
> +                       "add            t0, t0, t4\n\t"
> +                       THEAD_VLB_V_V0T0
> +                       "add            t0, t0, t4\n\t"
> +                       THEAD_VLB_V_V0T0
> +                       "add            t0, t0, t4\n\t"
> +                       THEAD_VLB_V_V0T0
> +                       : : "r" (datap) : "memory", "t0", "t4");
> +       } else {
> +               asm volatile (
> +                       ".option push\n\t"
> +                       ".option arch, +v\n\t"
> +                       "vsetvli        %0, x0, e8, m8, ta, ma\n\t"
> +                       "vle8.v         v0, (%1)\n\t"
> +                       "add            %1, %1, %0\n\t"
> +                       "vle8.v         v8, (%1)\n\t"
> +                       "add            %1, %1, %0\n\t"
> +                       "vle8.v         v16, (%1)\n\t"
> +                       "add            %1, %1, %0\n\t"
> +                       "vle8.v         v24, (%1)\n\t"
> +                       ".option pop\n\t"
> +                       : "=&r" (vl) : "r" (datap) : "memory");
> +       }
>         __vstate_csr_restore(restore_from);
>         riscv_v_disable();
>  }
>
>  static inline void __riscv_v_vstate_discard(void)
>  {
> -       unsigned long vl, vtype_inval = 1UL << (BITS_PER_LONG - 1);
> +       unsigned long vtype_inval = 1UL << (BITS_PER_LONG - 1);
>
>         riscv_v_enable();
> +       if (has_xtheadvector())
> +               asm volatile (THEAD_VSETVLI_X0X0E8M8D1);
> +       else
> +               asm volatile (
> +                       ".option push\n\t"
> +                       ".option arch, +v\n\t"
> +                       "vsetvli        x0, x0, e8, m8, ta, ma\n\t"
> +                       ".option pop\n\t");
> +
>         asm volatile (
>                 ".option push\n\t"
>                 ".option arch, +v\n\t"
> -               "vsetvli        %0, x0, e8, m8, ta, ma\n\t"
>                 "vmv.v.i        v0, -1\n\t"
>                 "vmv.v.i        v8, -1\n\t"
>                 "vmv.v.i        v16, -1\n\t"
>                 "vmv.v.i        v24, -1\n\t"
> -               "vsetvl         %0, x0, %1\n\t"
> +               "vsetvl         x0, x0, %0\n\t"
>                 ".option pop\n\t"
> -               : "=&r" (vl) : "r" (vtype_inval) : "memory");
> +               : : "r" (vtype_inval));
> +
>         riscv_v_disable();
>  }
>
>  static inline void riscv_v_vstate_discard(struct pt_regs *regs)
>  {
> -       if ((regs->status & SR_VS) == SR_VS_OFF)
> -               return;
> -
> -       __riscv_v_vstate_discard();
> -       __riscv_v_vstate_dirty(regs);
> +       if (riscv_v_vstate_query(regs)) {
> +               __riscv_v_vstate_discard();
> +               __riscv_v_vstate_dirty(regs);
> +       }
>  }
>
>  static inline void riscv_v_vstate_save(struct __riscv_v_ext_state *vstate,
>                                        struct pt_regs *regs)
>  {
> -       if ((regs->status & SR_VS) == SR_VS_DIRTY) {
> +       if (__riscv_v_vstate_check(regs->status, DIRTY)) {
>                 __riscv_v_vstate_save(vstate, vstate->datap);
>                 __riscv_v_vstate_clean(regs);
>         }
> @@ -186,7 +331,7 @@ static inline void riscv_v_vstate_save(struct __riscv_v_ext_state *vstate,
>  static inline void riscv_v_vstate_restore(struct __riscv_v_ext_state *vstate,
>                                           struct pt_regs *regs)
>  {
> -       if ((regs->status & SR_VS) != SR_VS_OFF) {
> +       if (riscv_v_vstate_query(regs)) {
>                 __riscv_v_vstate_restore(vstate, vstate->datap);
>                 __riscv_v_vstate_clean(regs);
>         }
> @@ -195,7 +340,7 @@ static inline void riscv_v_vstate_restore(struct __riscv_v_ext_state *vstate,
>  static inline void riscv_v_vstate_set_restore(struct task_struct *task,
>                                               struct pt_regs *regs)
>  {
> -       if ((regs->status & SR_VS) != SR_VS_OFF) {
> +       if (riscv_v_vstate_query(regs)) {
>                 set_tsk_thread_flag(task, TIF_RISCV_V_DEFER_RESTORE);
>                 riscv_v_vstate_on(regs);
>         }
> diff --git a/arch/riscv/kernel/cpufeature.c b/arch/riscv/kernel/cpufeature.c
> index 2993318b8ea2..44f0017a98d5 100644
> --- a/arch/riscv/kernel/cpufeature.c
> +++ b/arch/riscv/kernel/cpufeature.c
> @@ -781,7 +781,7 @@ void __init riscv_fill_hwcap(void)
>                 elf_hwcap &= ~COMPAT_HWCAP_ISA_F;
>         }
>
> -       if (elf_hwcap & COMPAT_HWCAP_ISA_V) {
> +       if (elf_hwcap & COMPAT_HWCAP_ISA_V || has_xtheadvector_no_alternatives()) {
>                 riscv_v_setup_vsize();
>                 /*
>                  * ISA string in device tree might have 'v' flag, but
> diff --git a/arch/riscv/kernel/kernel_mode_vector.c b/arch/riscv/kernel/kernel_mode_vector.c
> index 6afe80c7f03a..99972a48e86b 100644
> --- a/arch/riscv/kernel/kernel_mode_vector.c
> +++ b/arch/riscv/kernel/kernel_mode_vector.c
> @@ -143,7 +143,7 @@ static int riscv_v_start_kernel_context(bool *is_nested)
>
>         /* Transfer the ownership of V from user to kernel, then save */
>         riscv_v_start(RISCV_PREEMPT_V | RISCV_PREEMPT_V_DIRTY);
> -       if ((task_pt_regs(current)->status & SR_VS) == SR_VS_DIRTY) {
> +       if (__riscv_v_vstate_check(task_pt_regs(current)->status, DIRTY)) {
>                 uvstate = &current->thread.vstate;
>                 __riscv_v_vstate_save(uvstate, uvstate->datap);
>         }
> @@ -160,7 +160,7 @@ asmlinkage void riscv_v_context_nesting_start(struct pt_regs *regs)
>                 return;
>
>         depth = riscv_v_ctx_get_depth();
> -       if (depth == 0 && (regs->status & SR_VS) == SR_VS_DIRTY)
> +       if (depth == 0 && __riscv_v_vstate_check(regs->status, DIRTY))
>                 riscv_preempt_v_set_dirty();
>
>         riscv_v_ctx_depth_inc();
> @@ -208,7 +208,7 @@ void kernel_vector_begin(void)
>  {
>         bool nested = false;
>
> -       if (WARN_ON(!has_vector()))
> +       if (WARN_ON(!(has_vector() || has_xtheadvector())))
>                 return;
>
>         BUG_ON(!may_use_simd());
> @@ -236,7 +236,7 @@ EXPORT_SYMBOL_GPL(kernel_vector_begin);
>   */
>  void kernel_vector_end(void)
>  {
> -       if (WARN_ON(!has_vector()))
> +       if (WARN_ON(!(has_vector() || has_xtheadvector())))
>                 return;
>
>         riscv_v_disable();
> diff --git a/arch/riscv/kernel/process.c b/arch/riscv/kernel/process.c
> index 92922dbd5b5c..eabca86fc3c0 100644
> --- a/arch/riscv/kernel/process.c
> +++ b/arch/riscv/kernel/process.c
> @@ -178,7 +178,7 @@ void flush_thread(void)
>  void arch_release_task_struct(struct task_struct *tsk)
>  {
>         /* Free the vector context of datap. */
> -       if (has_vector())
> +       if (has_vector() || has_xtheadvector())
>                 riscv_v_thread_free(tsk);
>  }
>
> @@ -225,7 +225,7 @@ int copy_thread(struct task_struct *p, const struct kernel_clone_args *args)
>                 p->thread.s[0] = 0;
>         }
>         p->thread.riscv_v_flags = 0;
> -       if (has_vector())
> +       if (has_vector() || has_xtheadvector())
>                 riscv_v_thread_alloc(p);
>         p->thread.ra = (unsigned long)ret_from_fork;
>         p->thread.sp = (unsigned long)childregs; /* kernel sp */
> diff --git a/arch/riscv/kernel/signal.c b/arch/riscv/kernel/signal.c
> index 501e66debf69..5d3ba8e46807 100644
> --- a/arch/riscv/kernel/signal.c
> +++ b/arch/riscv/kernel/signal.c
> @@ -188,7 +188,7 @@ static long restore_sigcontext(struct pt_regs *regs,
>
>                         return 0;
>                 case RISCV_V_MAGIC:
> -                       if (!has_vector() || !riscv_v_vstate_query(regs) ||
> +                       if (!(has_vector() || has_xtheadvector()) || !riscv_v_vstate_query(regs) ||
>                             size != riscv_v_sc_size)
>                                 return -EINVAL;
>
> @@ -210,7 +210,7 @@ static size_t get_rt_frame_size(bool cal_all)
>
>         frame_size = sizeof(*frame);
>
> -       if (has_vector()) {
> +       if (has_vector() || has_xtheadvector()) {
>                 if (cal_all || riscv_v_vstate_query(task_pt_regs(current)))
>                         total_context_size += riscv_v_sc_size;
>         }
> @@ -283,7 +283,7 @@ static long setup_sigcontext(struct rt_sigframe __user *frame,
>         if (has_fpu())
>                 err |= save_fp_state(regs, &sc->sc_fpregs);
>         /* Save the vector state. */
> -       if (has_vector() && riscv_v_vstate_query(regs))
> +       if ((has_vector() || has_xtheadvector()) && riscv_v_vstate_query(regs))
>                 err |= save_v_state(regs, (void __user **)&sc_ext_ptr);
>         /* Write zero to fp-reserved space and check it on restore_sigcontext */
>         err |= __put_user(0, &sc->sc_extdesc.reserved);
> diff --git a/arch/riscv/kernel/vector.c b/arch/riscv/kernel/vector.c
> index e04586cdb7f0..c12ea4547da6 100644
> --- a/arch/riscv/kernel/vector.c
> +++ b/arch/riscv/kernel/vector.c
> @@ -63,7 +63,7 @@ int riscv_v_setup_vsize(void)
>
>  void __init riscv_v_setup_ctx_cache(void)
>  {
> -       if (!has_vector())
> +       if (!(has_vector() || has_xtheadvector()))
>                 return;
>
>         riscv_v_user_cachep = kmem_cache_create_usercopy("riscv_vector_ctx",
> @@ -184,7 +184,8 @@ bool riscv_v_first_use_handler(struct pt_regs *regs)
>         u32 insn = (u32)regs->badaddr;
>
>         /* Do not handle if V is not supported, or disabled */
> -       if (!(ELF_HWCAP & COMPAT_HWCAP_ISA_V))
> +       if (!(ELF_HWCAP & COMPAT_HWCAP_ISA_V) &&
> +           !(has_xtheadvector() && riscv_v_vstate_ctrl_user_allowed()))
>                 return false;
>
>         /* If V has been enabled then it is not the first-use trap */
> @@ -223,7 +224,7 @@ void riscv_v_vstate_ctrl_init(struct task_struct *tsk)
>         bool inherit;
>         int cur, next;
>
> -       if (!has_vector())
> +       if (!(has_vector() || has_xtheadvector()))
>                 return;
>
>         next = riscv_v_ctrl_get_next(tsk);
> @@ -245,7 +246,7 @@ void riscv_v_vstate_ctrl_init(struct task_struct *tsk)
>
>  long riscv_v_vstate_ctrl_get_current(void)
>  {
> -       if (!has_vector())
> +       if (!(has_vector() || has_xtheadvector()))
>                 return -EINVAL;
>
>         return current->thread.vstate_ctrl & PR_RISCV_V_VSTATE_CTRL_MASK;
> @@ -256,7 +257,7 @@ long riscv_v_vstate_ctrl_set_current(unsigned long arg)
>         bool inherit;
>         int cur, next;
>
> -       if (!has_vector())
> +       if (!(has_vector() || has_xtheadvector()))
>                 return -EINVAL;
>
>         if (arg & ~PR_RISCV_V_VSTATE_CTRL_MASK)
> @@ -306,7 +307,7 @@ static struct ctl_table riscv_v_default_vstate_table[] = {
>
>  static int __init riscv_v_sysctl_init(void)
>  {
> -       if (has_vector())
> +       if (has_vector() || has_xtheadvector())
>                 if (!register_sysctl("abi", riscv_v_default_vstate_table))
>                         return -EINVAL;
>         return 0;
>
> --
> 2.44.0
>
>
> _______________________________________________
> linux-riscv mailing list
> linux-riscv@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-riscv

Cheers,
Andy
Charlie Jenkins May 13, 2024, 4:56 p.m. UTC | #12
On Mon, May 13, 2024 at 04:45:18PM +0800, Andy Chiu wrote:
> Hi Charlie,
> 
> Sorry, I am late on this. I haven't looked through the entire series
> yet, but here is something that I thought worth bringing up sooner.
> 
> On Sat, May 4, 2024 at 2:22 AM Charlie Jenkins <charlie@rivosinc.com> wrote:
> >
> > Use alternatives to add support for xtheadvector vector save/restore
> > routines.
> >
> > Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
> > ---
> >  arch/riscv/Kconfig.vendor              |  13 ++
> >  arch/riscv/include/asm/csr.h           |   6 +
> >  arch/riscv/include/asm/switch_to.h     |   2 +-
> >  arch/riscv/include/asm/vector.h        | 247 ++++++++++++++++++++++++++-------
> >  arch/riscv/kernel/cpufeature.c         |   2 +-
> >  arch/riscv/kernel/kernel_mode_vector.c |   8 +-
> >  arch/riscv/kernel/process.c            |   4 +-
> >  arch/riscv/kernel/signal.c             |   6 +-
> >  arch/riscv/kernel/vector.c             |  13 +-
> >  9 files changed, 233 insertions(+), 68 deletions(-)
> >
> > diff --git a/arch/riscv/Kconfig.vendor b/arch/riscv/Kconfig.vendor
> > index aa5a191e659e..edf49f3065ac 100644
> > --- a/arch/riscv/Kconfig.vendor
> > +++ b/arch/riscv/Kconfig.vendor
> > @@ -13,6 +13,19 @@ config RISCV_ISA_VENDOR_EXT_THEAD
> >           extensions. Without this option enabled, T-Head vendor extensions will
> >           not be detected at boot and their presence not reported to userspace.
> >
> > +         If you don't know what to do here, say Y.
> > +
> > +config RISCV_ISA_XTHEADVECTOR
> > +       bool "xtheadvector extension support"
> > +       depends on RISCV_ISA_VENDOR_EXT_THEAD
> > +       depends on RISCV_ISA_V
> > +       depends on FPU
> > +       default y
> > +       help
> > +         Say N here if you want to disable all xtheadvector related procedure
> > +         in the kernel. This will disable vector for any T-Head board that
> > +         contains xtheadvector rather than the standard vector.
> > +
> >           If you don't know what to do here, say Y.
> >  endmenu
> >
> > diff --git a/arch/riscv/include/asm/csr.h b/arch/riscv/include/asm/csr.h
> > index e5a35efd56e0..13657d096e7d 100644
> > --- a/arch/riscv/include/asm/csr.h
> > +++ b/arch/riscv/include/asm/csr.h
> > @@ -30,6 +30,12 @@
> >  #define SR_VS_CLEAN    _AC(0x00000400, UL)
> >  #define SR_VS_DIRTY    _AC(0x00000600, UL)
> >
> > +#define SR_VS_THEAD            _AC(0x01800000, UL) /* xtheadvector Status */
> > +#define SR_VS_OFF_THEAD                _AC(0x00000000, UL)
> > +#define SR_VS_INITIAL_THEAD    _AC(0x00800000, UL)
> > +#define SR_VS_CLEAN_THEAD      _AC(0x01000000, UL)
> > +#define SR_VS_DIRTY_THEAD      _AC(0x01800000, UL)
> > +
> >  #define SR_XS          _AC(0x00018000, UL) /* Extension Status */
> >  #define SR_XS_OFF      _AC(0x00000000, UL)
> >  #define SR_XS_INITIAL  _AC(0x00008000, UL)
> > diff --git a/arch/riscv/include/asm/switch_to.h b/arch/riscv/include/asm/switch_to.h
> > index 7efdb0584d47..ada6b5cf2d94 100644
> > --- a/arch/riscv/include/asm/switch_to.h
> > +++ b/arch/riscv/include/asm/switch_to.h
> > @@ -78,7 +78,7 @@ do {                                                  \
> >         struct task_struct *__next = (next);            \
> >         if (has_fpu())                                  \
> >                 __switch_to_fpu(__prev, __next);        \
> > -       if (has_vector())                                       \
> > +       if (has_vector() || has_xtheadvector())         \
> >                 __switch_to_vector(__prev, __next);     \
> >         ((last) = __switch_to(__prev, __next));         \
> >  } while (0)
> > diff --git a/arch/riscv/include/asm/vector.h b/arch/riscv/include/asm/vector.h
> > index 731dcd0ed4de..db851dc81870 100644
> > --- a/arch/riscv/include/asm/vector.h
> > +++ b/arch/riscv/include/asm/vector.h
> > @@ -18,6 +18,27 @@
> >  #include <asm/cpufeature.h>
> >  #include <asm/csr.h>
> >  #include <asm/asm.h>
> > +#include <asm/vendorid_list.h>
> > +#include <asm/vendor_extensions.h>
> > +#include <asm/vendor_extensions/thead.h>
> > +
> > +#define __riscv_v_vstate_or(_val, TYPE) ({                             \
> > +       typeof(_val) _res = _val;                                       \
> > +       if (has_xtheadvector()) \
> > +               _res = (_res & ~SR_VS_THEAD) | SR_VS_##TYPE##_THEAD;    \
> > +       else                                                            \
> > +               _res = (_res & ~SR_VS) | SR_VS_##TYPE;                  \
> > +       _res;                                                           \
> > +})
> > +
> > +#define __riscv_v_vstate_check(_val, TYPE) ({                          \
> > +       bool _res;                                                      \
> > +       if (has_xtheadvector()) \
> > +               _res = ((_val) & SR_VS_THEAD) == SR_VS_##TYPE##_THEAD;  \
> > +       else                                                            \
> > +               _res = ((_val) & SR_VS) == SR_VS_##TYPE;                \
> > +       _res;                                                           \
> > +})
> >
> >  extern unsigned long riscv_v_vsize;
> >  int riscv_v_setup_vsize(void);
> > @@ -40,39 +61,62 @@ static __always_inline bool has_vector(void)
> >         return riscv_has_extension_unlikely(RISCV_ISA_EXT_v);
> >  }
> >
> > +static __always_inline bool has_xtheadvector_no_alternatives(void)
> > +{
> > +       if (IS_ENABLED(CONFIG_RISCV_ISA_XTHEADVECTOR))
> > +               return riscv_isa_vendor_extension_available(THEAD_VENDOR_ID, XTHEADVECTOR);
> > +       else
> > +               return false;
> > +}
> > +
> > +static __always_inline bool has_xtheadvector(void)
> > +{
> > +       if (IS_ENABLED(CONFIG_RISCV_ISA_XTHEADVECTOR))
> > +               return riscv_has_vendor_extension_unlikely(THEAD_VENDOR_ID,
> > +                                                          RISCV_ISA_VENDOR_EXT_XTHEADVECTOR);
> > +       else
> > +               return false;
> > +}
> > +
> >  static inline void __riscv_v_vstate_clean(struct pt_regs *regs)
> >  {
> > -       regs->status = (regs->status & ~SR_VS) | SR_VS_CLEAN;
> > +       regs->status = __riscv_v_vstate_or(regs->status, CLEAN);
> >  }
> >
> >  static inline void __riscv_v_vstate_dirty(struct pt_regs *regs)
> >  {
> > -       regs->status = (regs->status & ~SR_VS) | SR_VS_DIRTY;
> > +       regs->status = __riscv_v_vstate_or(regs->status, DIRTY);
> >  }
> >
> >  static inline void riscv_v_vstate_off(struct pt_regs *regs)
> >  {
> > -       regs->status = (regs->status & ~SR_VS) | SR_VS_OFF;
> > +       regs->status = __riscv_v_vstate_or(regs->status, OFF);
> >  }
> >
> >  static inline void riscv_v_vstate_on(struct pt_regs *regs)
> >  {
> > -       regs->status = (regs->status & ~SR_VS) | SR_VS_INITIAL;
> > +       regs->status = __riscv_v_vstate_or(regs->status, INITIAL);
> >  }
> >
> >  static inline bool riscv_v_vstate_query(struct pt_regs *regs)
> >  {
> > -       return (regs->status & SR_VS) != 0;
> > +       return !__riscv_v_vstate_check(regs->status, OFF);
> >  }
> >
> >  static __always_inline void riscv_v_enable(void)
> >  {
> > -       csr_set(CSR_SSTATUS, SR_VS);
> > +       if (has_xtheadvector())
> > +               csr_set(CSR_SSTATUS, SR_VS_THEAD);
> > +       else
> > +               csr_set(CSR_SSTATUS, SR_VS);
> >  }
> >
> >  static __always_inline void riscv_v_disable(void)
> >  {
> > -       csr_clear(CSR_SSTATUS, SR_VS);
> > +       if (has_xtheadvector())
> > +               csr_clear(CSR_SSTATUS, SR_VS_THEAD);
> > +       else
> > +               csr_clear(CSR_SSTATUS, SR_VS);
> >  }
> >
> >  static __always_inline void __vstate_csr_save(struct __riscv_v_ext_state *dest)
> > @@ -81,10 +125,47 @@ static __always_inline void __vstate_csr_save(struct __riscv_v_ext_state *dest)
> >                 "csrr   %0, " __stringify(CSR_VSTART) "\n\t"
> >                 "csrr   %1, " __stringify(CSR_VTYPE) "\n\t"
> >                 "csrr   %2, " __stringify(CSR_VL) "\n\t"
> > -               "csrr   %3, " __stringify(CSR_VCSR) "\n\t"
> > -               "csrr   %4, " __stringify(CSR_VLENB) "\n\t"
> >                 : "=r" (dest->vstart), "=r" (dest->vtype), "=r" (dest->vl),
> > -                 "=r" (dest->vcsr), "=r" (dest->vlenb) : :);
> > +               "=r" (dest->vcsr) : :);
> > +
> > +       if (has_xtheadvector()) {
> > +               u32 tmp_vcsr;
> > +               bool restore_fpu = false;
> > +               unsigned long status = csr_read(CSR_SSTATUS);
> > +
> > +               /*
> > +                * CSR_VCSR is defined as
> > +                * [2:1] - vxrm[1:0]
> > +                * [0] - vxsat
> > +                * The earlier vector spec implemented by T-Head uses separate
> > +                * registers for the same bit-elements, so just combine those
> > +                * into the existing output field.
> > +                *
> > +                * Additionally T-Head cores need FS to be enabled when accessing
> > +                * the VXRM and VXSAT CSRs, otherwise ending in illegal instructions.
> > +                * Though the cores do not implement the VXRM and VXSAT fields in the
> > +                * FCSR CSR that vector-0.7.1 specifies.
> > +                */
> > +               if ((status & SR_FS) == SR_FS_OFF) {
> > +                       csr_set(CSR_SSTATUS, (status & ~SR_FS) | SR_FS_CLEAN);
> > +                       restore_fpu = true;
> > +               }
> > +
> > +               asm volatile (
> > +                       "csrr   %[tmp_vcsr], " __stringify(VCSR_VXRM) "\n\t"
> > +                       "slliw  %[vcsr], %[tmp_vcsr], " __stringify(VCSR_VXRM_SHIFT) "\n\t"
> > +                       "csrr   %[tmp_vcsr], " __stringify(VCSR_VXSAT) "\n\t"
> > +                       "or     %[vcsr], %[vcsr], %[tmp_vcsr]\n\t"
> > +                       : [vcsr] "=r" (dest->vcsr), [tmp_vcsr] "=&r" (tmp_vcsr));
> > +
> > +               if (restore_fpu)
> > +                       csr_set(CSR_SSTATUS, status);
> 
> vlenb is on ABI and ptrace needs that to recover the width of the
> register. So we should probably save Xtheadvector's vlenb into vstate
> if we meant to maintain ABI compatibility between V and Xtheadvector
> from the kernel.

We can pull the vlenb from riscv_v_vsize which gets populated from the
devicetree for xtheadvector. I will change that in the next version!

- Charlie

> 
> > +       } else {
> > +               asm volatile (
> > +                       "csrr   %[vcsr], " __stringify(CSR_VCSR) "\n\t"
> > +                       "csrr   %[vlenb], " __stringify(CSR_VLENB) "\n\t"
> > +                       : [vcsr] "=r" (dest->vcsr), [vlenb] "=r" (dest->vlenb));
> > +       }
> >  }
> >
> >  static __always_inline void __vstate_csr_restore(struct __riscv_v_ext_state *src)
> > @@ -95,9 +176,37 @@ static __always_inline void __vstate_csr_restore(struct __riscv_v_ext_state *src
> >                 "vsetvl  x0, %2, %1\n\t"
> >                 ".option pop\n\t"
> >                 "csrw   " __stringify(CSR_VSTART) ", %0\n\t"
> > -               "csrw   " __stringify(CSR_VCSR) ", %3\n\t"
> > -               : : "r" (src->vstart), "r" (src->vtype), "r" (src->vl),
> > -                   "r" (src->vcsr) :);
> > +               : : "r" (src->vstart), "r" (src->vtype), "r" (src->vl));
> > +
> > +       if (has_xtheadvector()) {
> > +               u32 tmp_vcsr;
> > +               bool restore_fpu = false;
> > +               unsigned long status = csr_read(CSR_SSTATUS);
> > +
> > +               /*
> > +                * Similar to __vstate_csr_save above, restore values for the
> > +                * separate VXRM and VXSAT CSRs from the vcsr variable.
> > +                */
> > +               if ((status & SR_FS) == SR_FS_OFF) {
> > +                       csr_set(CSR_SSTATUS, (status & ~SR_FS) | SR_FS_CLEAN);
> > +                       restore_fpu = true;
> > +               }
> > +
> > +               asm volatile (
> > +                       "srliw  %[tmp_vcsr], %[vcsr], " __stringify(VCSR_VXRM_SHIFT) "\n\t"
> > +                       "andi   %[tmp_vcsr], %[tmp_vcsr], " __stringify(VCSR_VXRM_MASK) "\n\t"
> > +                       "csrw   " __stringify(VCSR_VXRM) ", %[tmp_vcsr]\n\t"
> > +                       "andi   %[tmp_vcsr], %[vcsr], " __stringify(VCSR_VXSAT_MASK) "\n\t"
> > +                       "csrw   " __stringify(VCSR_VXSAT) ", %[tmp_vcsr]\n\t"
> > +                       : [tmp_vcsr] "=&r" (tmp_vcsr) : [vcsr] "r" (src->vcsr));
> > +
> > +               if (restore_fpu)
> > +                       csr_set(CSR_SSTATUS, status);
> > +       } else {
> > +               asm volatile (
> > +                       "csrw   " __stringify(CSR_VCSR) ", %[vcsr]\n\t"
> > +                       : : [vcsr] "r" (src->vcsr));
> > +       }
> >  }
> >
> >  static inline void __riscv_v_vstate_save(struct __riscv_v_ext_state *save_to,
> > @@ -107,19 +216,33 @@ static inline void __riscv_v_vstate_save(struct __riscv_v_ext_state *save_to,
> >
> >         riscv_v_enable();
> >         __vstate_csr_save(save_to);
> > -       asm volatile (
> > -               ".option push\n\t"
> > -               ".option arch, +v\n\t"
> > -               "vsetvli        %0, x0, e8, m8, ta, ma\n\t"
> > -               "vse8.v         v0, (%1)\n\t"
> > -               "add            %1, %1, %0\n\t"
> > -               "vse8.v         v8, (%1)\n\t"
> > -               "add            %1, %1, %0\n\t"
> > -               "vse8.v         v16, (%1)\n\t"
> > -               "add            %1, %1, %0\n\t"
> > -               "vse8.v         v24, (%1)\n\t"
> > -               ".option pop\n\t"
> > -               : "=&r" (vl) : "r" (datap) : "memory");
> > +       if (has_xtheadvector()) {
> > +               asm volatile (
> > +                       "mv t0, %0\n\t"
> > +                       THEAD_VSETVLI_T4X0E8M8D1
> > +                       THEAD_VSB_V_V0T0
> > +                       "add            t0, t0, t4\n\t"
> > +                       THEAD_VSB_V_V0T0
> > +                       "add            t0, t0, t4\n\t"
> > +                       THEAD_VSB_V_V0T0
> > +                       "add            t0, t0, t4\n\t"
> > +                       THEAD_VSB_V_V0T0
> > +                       : : "r" (datap) : "memory", "t0", "t4");
> > +       } else {
> > +               asm volatile (
> > +                       ".option push\n\t"
> > +                       ".option arch, +v\n\t"
> > +                       "vsetvli        %0, x0, e8, m8, ta, ma\n\t"
> > +                       "vse8.v         v0, (%1)\n\t"
> > +                       "add            %1, %1, %0\n\t"
> > +                       "vse8.v         v8, (%1)\n\t"
> > +                       "add            %1, %1, %0\n\t"
> > +                       "vse8.v         v16, (%1)\n\t"
> > +                       "add            %1, %1, %0\n\t"
> > +                       "vse8.v         v24, (%1)\n\t"
> > +                       ".option pop\n\t"
> > +                       : "=&r" (vl) : "r" (datap) : "memory");
> > +       }
> >         riscv_v_disable();
> >  }
> >
> > @@ -129,55 +252,77 @@ static inline void __riscv_v_vstate_restore(struct __riscv_v_ext_state *restore_
> >         unsigned long vl;
> >
> >         riscv_v_enable();
> > -       asm volatile (
> > -               ".option push\n\t"
> > -               ".option arch, +v\n\t"
> > -               "vsetvli        %0, x0, e8, m8, ta, ma\n\t"
> > -               "vle8.v         v0, (%1)\n\t"
> > -               "add            %1, %1, %0\n\t"
> > -               "vle8.v         v8, (%1)\n\t"
> > -               "add            %1, %1, %0\n\t"
> > -               "vle8.v         v16, (%1)\n\t"
> > -               "add            %1, %1, %0\n\t"
> > -               "vle8.v         v24, (%1)\n\t"
> > -               ".option pop\n\t"
> > -               : "=&r" (vl) : "r" (datap) : "memory");
> > +       if (has_xtheadvector()) {
> > +               asm volatile (
> > +                       "mv t0, %0\n\t"
> > +                       THEAD_VSETVLI_T4X0E8M8D1
> > +                       THEAD_VLB_V_V0T0
> > +                       "add            t0, t0, t4\n\t"
> > +                       THEAD_VLB_V_V0T0
> > +                       "add            t0, t0, t4\n\t"
> > +                       THEAD_VLB_V_V0T0
> > +                       "add            t0, t0, t4\n\t"
> > +                       THEAD_VLB_V_V0T0
> > +                       : : "r" (datap) : "memory", "t0", "t4");
> > +       } else {
> > +               asm volatile (
> > +                       ".option push\n\t"
> > +                       ".option arch, +v\n\t"
> > +                       "vsetvli        %0, x0, e8, m8, ta, ma\n\t"
> > +                       "vle8.v         v0, (%1)\n\t"
> > +                       "add            %1, %1, %0\n\t"
> > +                       "vle8.v         v8, (%1)\n\t"
> > +                       "add            %1, %1, %0\n\t"
> > +                       "vle8.v         v16, (%1)\n\t"
> > +                       "add            %1, %1, %0\n\t"
> > +                       "vle8.v         v24, (%1)\n\t"
> > +                       ".option pop\n\t"
> > +                       : "=&r" (vl) : "r" (datap) : "memory");
> > +       }
> >         __vstate_csr_restore(restore_from);
> >         riscv_v_disable();
> >  }
> >
> >  static inline void __riscv_v_vstate_discard(void)
> >  {
> > -       unsigned long vl, vtype_inval = 1UL << (BITS_PER_LONG - 1);
> > +       unsigned long vtype_inval = 1UL << (BITS_PER_LONG - 1);
> >
> >         riscv_v_enable();
> > +       if (has_xtheadvector())
> > +               asm volatile (THEAD_VSETVLI_X0X0E8M8D1);
> > +       else
> > +               asm volatile (
> > +                       ".option push\n\t"
> > +                       ".option arch, +v\n\t"
> > +                       "vsetvli        x0, x0, e8, m8, ta, ma\n\t"
> > +                       ".option pop\n\t");
> > +
> >         asm volatile (
> >                 ".option push\n\t"
> >                 ".option arch, +v\n\t"
> > -               "vsetvli        %0, x0, e8, m8, ta, ma\n\t"
> >                 "vmv.v.i        v0, -1\n\t"
> >                 "vmv.v.i        v8, -1\n\t"
> >                 "vmv.v.i        v16, -1\n\t"
> >                 "vmv.v.i        v24, -1\n\t"
> > -               "vsetvl         %0, x0, %1\n\t"
> > +               "vsetvl         x0, x0, %0\n\t"
> >                 ".option pop\n\t"
> > -               : "=&r" (vl) : "r" (vtype_inval) : "memory");
> > +               : : "r" (vtype_inval));
> > +
> >         riscv_v_disable();
> >  }
> >
> >  static inline void riscv_v_vstate_discard(struct pt_regs *regs)
> >  {
> > -       if ((regs->status & SR_VS) == SR_VS_OFF)
> > -               return;
> > -
> > -       __riscv_v_vstate_discard();
> > -       __riscv_v_vstate_dirty(regs);
> > +       if (riscv_v_vstate_query(regs)) {
> > +               __riscv_v_vstate_discard();
> > +               __riscv_v_vstate_dirty(regs);
> > +       }
> >  }
> >
> >  static inline void riscv_v_vstate_save(struct __riscv_v_ext_state *vstate,
> >                                        struct pt_regs *regs)
> >  {
> > -       if ((regs->status & SR_VS) == SR_VS_DIRTY) {
> > +       if (__riscv_v_vstate_check(regs->status, DIRTY)) {
> >                 __riscv_v_vstate_save(vstate, vstate->datap);
> >                 __riscv_v_vstate_clean(regs);
> >         }
> > @@ -186,7 +331,7 @@ static inline void riscv_v_vstate_save(struct __riscv_v_ext_state *vstate,
> >  static inline void riscv_v_vstate_restore(struct __riscv_v_ext_state *vstate,
> >                                           struct pt_regs *regs)
> >  {
> > -       if ((regs->status & SR_VS) != SR_VS_OFF) {
> > +       if (riscv_v_vstate_query(regs)) {
> >                 __riscv_v_vstate_restore(vstate, vstate->datap);
> >                 __riscv_v_vstate_clean(regs);
> >         }
> > @@ -195,7 +340,7 @@ static inline void riscv_v_vstate_restore(struct __riscv_v_ext_state *vstate,
> >  static inline void riscv_v_vstate_set_restore(struct task_struct *task,
> >                                               struct pt_regs *regs)
> >  {
> > -       if ((regs->status & SR_VS) != SR_VS_OFF) {
> > +       if (riscv_v_vstate_query(regs)) {
> >                 set_tsk_thread_flag(task, TIF_RISCV_V_DEFER_RESTORE);
> >                 riscv_v_vstate_on(regs);
> >         }
> > diff --git a/arch/riscv/kernel/cpufeature.c b/arch/riscv/kernel/cpufeature.c
> > index 2993318b8ea2..44f0017a98d5 100644
> > --- a/arch/riscv/kernel/cpufeature.c
> > +++ b/arch/riscv/kernel/cpufeature.c
> > @@ -781,7 +781,7 @@ void __init riscv_fill_hwcap(void)
> >                 elf_hwcap &= ~COMPAT_HWCAP_ISA_F;
> >         }
> >
> > -       if (elf_hwcap & COMPAT_HWCAP_ISA_V) {
> > +       if (elf_hwcap & COMPAT_HWCAP_ISA_V || has_xtheadvector_no_alternatives()) {
> >                 riscv_v_setup_vsize();
> >                 /*
> >                  * ISA string in device tree might have 'v' flag, but
> > diff --git a/arch/riscv/kernel/kernel_mode_vector.c b/arch/riscv/kernel/kernel_mode_vector.c
> > index 6afe80c7f03a..99972a48e86b 100644
> > --- a/arch/riscv/kernel/kernel_mode_vector.c
> > +++ b/arch/riscv/kernel/kernel_mode_vector.c
> > @@ -143,7 +143,7 @@ static int riscv_v_start_kernel_context(bool *is_nested)
> >
> >         /* Transfer the ownership of V from user to kernel, then save */
> >         riscv_v_start(RISCV_PREEMPT_V | RISCV_PREEMPT_V_DIRTY);
> > -       if ((task_pt_regs(current)->status & SR_VS) == SR_VS_DIRTY) {
> > +       if (__riscv_v_vstate_check(task_pt_regs(current)->status, DIRTY)) {
> >                 uvstate = &current->thread.vstate;
> >                 __riscv_v_vstate_save(uvstate, uvstate->datap);
> >         }
> > @@ -160,7 +160,7 @@ asmlinkage void riscv_v_context_nesting_start(struct pt_regs *regs)
> >                 return;
> >
> >         depth = riscv_v_ctx_get_depth();
> > -       if (depth == 0 && (regs->status & SR_VS) == SR_VS_DIRTY)
> > +       if (depth == 0 && __riscv_v_vstate_check(regs->status, DIRTY))
> >                 riscv_preempt_v_set_dirty();
> >
> >         riscv_v_ctx_depth_inc();
> > @@ -208,7 +208,7 @@ void kernel_vector_begin(void)
> >  {
> >         bool nested = false;
> >
> > -       if (WARN_ON(!has_vector()))
> > +       if (WARN_ON(!(has_vector() || has_xtheadvector())))
> >                 return;
> >
> >         BUG_ON(!may_use_simd());
> > @@ -236,7 +236,7 @@ EXPORT_SYMBOL_GPL(kernel_vector_begin);
> >   */
> >  void kernel_vector_end(void)
> >  {
> > -       if (WARN_ON(!has_vector()))
> > +       if (WARN_ON(!(has_vector() || has_xtheadvector())))
> >                 return;
> >
> >         riscv_v_disable();
> > diff --git a/arch/riscv/kernel/process.c b/arch/riscv/kernel/process.c
> > index 92922dbd5b5c..eabca86fc3c0 100644
> > --- a/arch/riscv/kernel/process.c
> > +++ b/arch/riscv/kernel/process.c
> > @@ -178,7 +178,7 @@ void flush_thread(void)
> >  void arch_release_task_struct(struct task_struct *tsk)
> >  {
> >         /* Free the vector context of datap. */
> > -       if (has_vector())
> > +       if (has_vector() || has_xtheadvector())
> >                 riscv_v_thread_free(tsk);
> >  }
> >
> > @@ -225,7 +225,7 @@ int copy_thread(struct task_struct *p, const struct kernel_clone_args *args)
> >                 p->thread.s[0] = 0;
> >         }
> >         p->thread.riscv_v_flags = 0;
> > -       if (has_vector())
> > +       if (has_vector() || has_xtheadvector())
> >                 riscv_v_thread_alloc(p);
> >         p->thread.ra = (unsigned long)ret_from_fork;
> >         p->thread.sp = (unsigned long)childregs; /* kernel sp */
> > diff --git a/arch/riscv/kernel/signal.c b/arch/riscv/kernel/signal.c
> > index 501e66debf69..5d3ba8e46807 100644
> > --- a/arch/riscv/kernel/signal.c
> > +++ b/arch/riscv/kernel/signal.c
> > @@ -188,7 +188,7 @@ static long restore_sigcontext(struct pt_regs *regs,
> >
> >                         return 0;
> >                 case RISCV_V_MAGIC:
> > -                       if (!has_vector() || !riscv_v_vstate_query(regs) ||
> > +                       if (!(has_vector() || has_xtheadvector()) || !riscv_v_vstate_query(regs) ||
> >                             size != riscv_v_sc_size)
> >                                 return -EINVAL;
> >
> > @@ -210,7 +210,7 @@ static size_t get_rt_frame_size(bool cal_all)
> >
> >         frame_size = sizeof(*frame);
> >
> > -       if (has_vector()) {
> > +       if (has_vector() || has_xtheadvector()) {
> >                 if (cal_all || riscv_v_vstate_query(task_pt_regs(current)))
> >                         total_context_size += riscv_v_sc_size;
> >         }
> > @@ -283,7 +283,7 @@ static long setup_sigcontext(struct rt_sigframe __user *frame,
> >         if (has_fpu())
> >                 err |= save_fp_state(regs, &sc->sc_fpregs);
> >         /* Save the vector state. */
> > -       if (has_vector() && riscv_v_vstate_query(regs))
> > +       if ((has_vector() || has_xtheadvector()) && riscv_v_vstate_query(regs))
> >                 err |= save_v_state(regs, (void __user **)&sc_ext_ptr);
> >         /* Write zero to fp-reserved space and check it on restore_sigcontext */
> >         err |= __put_user(0, &sc->sc_extdesc.reserved);
> > diff --git a/arch/riscv/kernel/vector.c b/arch/riscv/kernel/vector.c
> > index e04586cdb7f0..c12ea4547da6 100644
> > --- a/arch/riscv/kernel/vector.c
> > +++ b/arch/riscv/kernel/vector.c
> > @@ -63,7 +63,7 @@ int riscv_v_setup_vsize(void)
> >
> >  void __init riscv_v_setup_ctx_cache(void)
> >  {
> > -       if (!has_vector())
> > +       if (!(has_vector() || has_xtheadvector()))
> >                 return;
> >
> >         riscv_v_user_cachep = kmem_cache_create_usercopy("riscv_vector_ctx",
> > @@ -184,7 +184,8 @@ bool riscv_v_first_use_handler(struct pt_regs *regs)
> >         u32 insn = (u32)regs->badaddr;
> >
> >         /* Do not handle if V is not supported, or disabled */
> > -       if (!(ELF_HWCAP & COMPAT_HWCAP_ISA_V))
> > +       if (!(ELF_HWCAP & COMPAT_HWCAP_ISA_V) &&
> > +           !(has_xtheadvector() && riscv_v_vstate_ctrl_user_allowed()))
> >                 return false;
> >
> >         /* If V has been enabled then it is not the first-use trap */
> > @@ -223,7 +224,7 @@ void riscv_v_vstate_ctrl_init(struct task_struct *tsk)
> >         bool inherit;
> >         int cur, next;
> >
> > -       if (!has_vector())
> > +       if (!(has_vector() || has_xtheadvector()))
> >                 return;
> >
> >         next = riscv_v_ctrl_get_next(tsk);
> > @@ -245,7 +246,7 @@ void riscv_v_vstate_ctrl_init(struct task_struct *tsk)
> >
> >  long riscv_v_vstate_ctrl_get_current(void)
> >  {
> > -       if (!has_vector())
> > +       if (!(has_vector() || has_xtheadvector()))
> >                 return -EINVAL;
> >
> >         return current->thread.vstate_ctrl & PR_RISCV_V_VSTATE_CTRL_MASK;
> > @@ -256,7 +257,7 @@ long riscv_v_vstate_ctrl_set_current(unsigned long arg)
> >         bool inherit;
> >         int cur, next;
> >
> > -       if (!has_vector())
> > +       if (!(has_vector() || has_xtheadvector()))
> >                 return -EINVAL;
> >
> >         if (arg & ~PR_RISCV_V_VSTATE_CTRL_MASK)
> > @@ -306,7 +307,7 @@ static struct ctl_table riscv_v_default_vstate_table[] = {
> >
> >  static int __init riscv_v_sysctl_init(void)
> >  {
> > -       if (has_vector())
> > +       if (has_vector() || has_xtheadvector())
> >                 if (!register_sysctl("abi", riscv_v_default_vstate_table))
> >                         return -EINVAL;
> >         return 0;
> >
> > --
> > 2.44.0
> >
> >
> > _______________________________________________
> > linux-riscv mailing list
> > linux-riscv@lists.infradead.org
> > http://lists.infradead.org/mailman/listinfo/linux-riscv
> 
> Cheers,
> Andy
Andy Chiu May 16, 2024, 1:11 p.m. UTC | #13
On Sat, May 4, 2024 at 2:21 AM Charlie Jenkins <charlie@rivosinc.com> wrote:
>
> If vlenb is provided in the device tree, prefer that over reading the
> vlenb csr.
>
> Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>

I agree with Conor that we need a mechanism to turn off v and all
depending extensions with has_riscv_homogeneous_vlenb(). And that can
come after this.

Thanks for adding the homogeneous vlen checking!

Reviewed-by: Andy Chiu <andy.chiu@sifive.com>

> ---
>  arch/riscv/include/asm/cpufeature.h |  2 ++
>  arch/riscv/kernel/cpufeature.c      | 47 +++++++++++++++++++++++++++++++++++++
>  arch/riscv/kernel/vector.c          | 12 +++++++++-
>  3 files changed, 60 insertions(+), 1 deletion(-)
>
> diff --git a/arch/riscv/include/asm/cpufeature.h b/arch/riscv/include/asm/cpufeature.h
> index 347805446151..0c4f08577015 100644
> --- a/arch/riscv/include/asm/cpufeature.h
> +++ b/arch/riscv/include/asm/cpufeature.h
> @@ -31,6 +31,8 @@ DECLARE_PER_CPU(struct riscv_cpuinfo, riscv_cpuinfo);
>  /* Per-cpu ISA extensions. */
>  extern struct riscv_isainfo hart_isa[NR_CPUS];
>
> +extern u32 riscv_vlenb_of;
> +
>  void riscv_user_isa_enable(void);
>
>  #if defined(CONFIG_RISCV_MISALIGNED)
> diff --git a/arch/riscv/kernel/cpufeature.c b/arch/riscv/kernel/cpufeature.c
> index 3ed2359eae35..6c143ea9592b 100644
> --- a/arch/riscv/kernel/cpufeature.c
> +++ b/arch/riscv/kernel/cpufeature.c
> @@ -35,6 +35,8 @@ static DECLARE_BITMAP(riscv_isa, RISCV_ISA_EXT_MAX) __read_mostly;
>  /* Per-cpu ISA extensions. */
>  struct riscv_isainfo hart_isa[NR_CPUS];
>
> +u32 riscv_vlenb_of;
> +
>  /**
>   * riscv_isa_extension_base() - Get base extension word
>   *
> @@ -648,6 +650,46 @@ static int __init riscv_isa_fallback_setup(char *__unused)
>  early_param("riscv_isa_fallback", riscv_isa_fallback_setup);
>  #endif
>
> +static int has_riscv_homogeneous_vlenb(void)
> +{
> +       int cpu;
> +       u32 prev_vlenb = 0;
> +       u32 vlenb;
> +
> +       /* Ignore vlenb if vector is not enabled in the kernel */
> +       if (!IS_ENABLED(CONFIG_RISCV_ISA_V))
> +               return 0;
> +
> +       for_each_possible_cpu(cpu) {
> +               struct device_node *cpu_node;
> +
> +               cpu_node = of_cpu_device_node_get(cpu);
> +               if (!cpu_node) {
> +                       pr_warn("Unable to find cpu node\n");
> +                       return -ENOENT;
> +               }
> +
> +               if (of_property_read_u32(cpu_node, "riscv,vlenb", &vlenb)) {
> +                       of_node_put(cpu_node);
> +
> +                       if (prev_vlenb)
> +                               return -ENOENT;
> +                       continue;
> +               }
> +
> +               if (prev_vlenb && vlenb != prev_vlenb) {
> +                       of_node_put(cpu_node);
> +                       return -ENOENT;
> +               }
> +
> +               prev_vlenb = vlenb;
> +               of_node_put(cpu_node);
> +       }
> +
> +       riscv_vlenb_of = vlenb;
> +       return 0;
> +}
> +
>  void __init riscv_fill_hwcap(void)
>  {
>         char print_str[NUM_ALPHA_EXTS + 1];
> @@ -671,6 +713,11 @@ void __init riscv_fill_hwcap(void)
>                         pr_info("Falling back to deprecated \"riscv,isa\"\n");
>                         riscv_fill_hwcap_from_isa_string(isa2hwcap);
>                 }
> +
> +               if (elf_hwcap & COMPAT_HWCAP_ISA_V && has_riscv_homogeneous_vlenb() < 0) {
> +                       pr_warn("Unsupported heterogeneous vlen detected, vector extension disabled.\n");
> +                       elf_hwcap &= ~COMPAT_HWCAP_ISA_V;
> +               }
>         }
>
>         /*
> diff --git a/arch/riscv/kernel/vector.c b/arch/riscv/kernel/vector.c
> index 6727d1d3b8f2..e04586cdb7f0 100644
> --- a/arch/riscv/kernel/vector.c
> +++ b/arch/riscv/kernel/vector.c
> @@ -33,7 +33,17 @@ int riscv_v_setup_vsize(void)
>  {
>         unsigned long this_vsize;
>
> -       /* There are 32 vector registers with vlenb length. */
> +       /*
> +        * There are 32 vector registers with vlenb length.
> +        *
> +        * If the riscv,vlenb property was provided by the firmware, use that
> +        * instead of probing the CSRs.
> +        */
> +       if (riscv_vlenb_of) {
> +               this_vsize = riscv_vlenb_of * 32;
> +               return 0;
> +       }
> +
>         riscv_v_enable();
>         this_vsize = csr_read(CSR_VLENB) * 32;
>         riscv_v_disable();
>
> --
> 2.44.0
>
>
> _______________________________________________
> linux-riscv mailing list
> linux-riscv@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-riscv
Andy Chiu May 16, 2024, 2 p.m. UTC | #14
Sorry Charlie, I forgot to include the mailing list. Here is the same as
what I sent in the private message.

On Sat, May 4, 2024 at 2:21 AM Charlie Jenkins <charlie@rivosinc.com> wrote:
>
> If vlenb is provided in the device tree, prefer that over reading the
> vlenb csr.
>
> Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
> ---
>  arch/riscv/include/asm/cpufeature.h |  2 ++
>  arch/riscv/kernel/cpufeature.c      | 47 +++++++++++++++++++++++++++++++++++++
>  arch/riscv/kernel/vector.c          | 12 +++++++++-
>  3 files changed, 60 insertions(+), 1 deletion(-)
>
> diff --git a/arch/riscv/include/asm/cpufeature.h b/arch/riscv/include/asm/cpufeature.h
> index 347805446151..0c4f08577015 100644
> --- a/arch/riscv/include/asm/cpufeature.h
> +++ b/arch/riscv/include/asm/cpufeature.h
> @@ -31,6 +31,8 @@ DECLARE_PER_CPU(struct riscv_cpuinfo, riscv_cpuinfo);
>  /* Per-cpu ISA extensions. */
>  extern struct riscv_isainfo hart_isa[NR_CPUS];
>
> +extern u32 riscv_vlenb_of;
> +
>  void riscv_user_isa_enable(void);
>
>  #if defined(CONFIG_RISCV_MISALIGNED)
> diff --git a/arch/riscv/kernel/cpufeature.c b/arch/riscv/kernel/cpufeature.c
> index 3ed2359eae35..6c143ea9592b 100644
> --- a/arch/riscv/kernel/cpufeature.c
> +++ b/arch/riscv/kernel/cpufeature.c
> @@ -35,6 +35,8 @@ static DECLARE_BITMAP(riscv_isa, RISCV_ISA_EXT_MAX) __read_mostly;
>  /* Per-cpu ISA extensions. */
>  struct riscv_isainfo hart_isa[NR_CPUS];
>
> +u32 riscv_vlenb_of;
> +
>  /**
>   * riscv_isa_extension_base() - Get base extension word
>   *
> @@ -648,6 +650,46 @@ static int __init riscv_isa_fallback_setup(char *__unused)
>  early_param("riscv_isa_fallback", riscv_isa_fallback_setup);
>  #endif
>
> +static int has_riscv_homogeneous_vlenb(void)
> +{
> +       int cpu;
> +       u32 prev_vlenb = 0;
> +       u32 vlenb;
> +
> +       /* Ignore vlenb if vector is not enabled in the kernel */
> +       if (!IS_ENABLED(CONFIG_RISCV_ISA_V))
> +               return 0;
> +
> +       for_each_possible_cpu(cpu) {
> +               struct device_node *cpu_node;
> +
> +               cpu_node = of_cpu_device_node_get(cpu);
> +               if (!cpu_node) {
> +                       pr_warn("Unable to find cpu node\n");
> +                       return -ENOENT;
> +               }
> +
> +               if (of_property_read_u32(cpu_node, "riscv,vlenb", &vlenb)) {
> +                       of_node_put(cpu_node);
> +
> +                       if (prev_vlenb)
> +                               return -ENOENT;
> +                       continue;
> +               }
> +
> +               if (prev_vlenb && vlenb != prev_vlenb) {
> +                       of_node_put(cpu_node);
> +                       return -ENOENT;
> +               }
> +
> +               prev_vlenb = vlenb;
> +               of_node_put(cpu_node);
> +       }
> +
> +       riscv_vlenb_of = vlenb;
> +       return 0;
> +}
> +
>  void __init riscv_fill_hwcap(void)
>  {
>         char print_str[NUM_ALPHA_EXTS + 1];
> @@ -671,6 +713,11 @@ void __init riscv_fill_hwcap(void)
>                         pr_info("Falling back to deprecated \"riscv,isa\"\n");
>                         riscv_fill_hwcap_from_isa_string(isa2hwcap);
>                 }
> +
> +               if (elf_hwcap & COMPAT_HWCAP_ISA_V && has_riscv_homogeneous_vlenb() < 0) {
> +                       pr_warn("Unsupported heterogeneous vlen detected, vector extension disabled.\> +                       elf_hwcap &= ~COMPAT_HWCAP_ISA_V;
> +               }

We only touch COMPAT_HWCAP_ISA_V and the failed case only turns off the
rectified V. So here we have nothing to do with the Xtheadvector.

However, I am still confused because I think Xtheadvector would also
need to call into this check, so as to setup vlenb.

Apart from that, it seems like some vendor stating Xtheadvector is
actually vector-0.7. Please correct me if I speak anything wrong. One
thing I noticed is that Xtheadvector wouldn't trap on reading
th.vlenb but vector-0.7 would. If that is the case, should we require
Xtheadvector to specify `riscv,vlenb` on the device tree?

>         }
>
>         /*
> diff --git a/arch/riscv/kernel/vector.c b/arch/riscv/kernel/vector.c
> index 6727d1d3b8f2..e04586cdb7f0 100644
> --- a/arch/riscv/kernel/vector.c
> +++ b/arch/riscv/kernel/vector.c
> @@ -33,7 +33,17 @@ int riscv_v_setup_vsize(void)
>  {
>         unsigned long this_vsize;
>
> -       /* There are 32 vector registers with vlenb length. */
> +       /*
> +        * There are 32 vector registers with vlenb length.
> +        *
> +        * If the riscv,vlenb property was provided by the firmware, use that
> +        * instead of probing the CSRs.
> +        */
> +       if (riscv_vlenb_of) {
> +               this_vsize = riscv_vlenb_of * 32;
> +               return 0;
> +       }
> +
>         riscv_v_enable();
>         this_vsize = csr_read(CSR_VLENB) * 32;
>         riscv_v_disable();
>
> --
> 2.44.0
>
>
> _______________________________________________
> linux-riscv mailing list
> linux-riscv@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-riscv

Thanks,
Andy
Conor Dooley May 16, 2024, 4:24 p.m. UTC | #15
On Thu, May 16, 2024 at 10:00:12PM +0800, Andy Chiu wrote:
> On Sat, May 4, 2024 at 2:21 AM Charlie Jenkins <charlie@rivosinc.com> wrote:

> > +               if (elf_hwcap & COMPAT_HWCAP_ISA_V && has_riscv_homogeneous_vlenb() < 0) {
> > +                       pr_warn("Unsupported heterogeneous vlen detected, vector extension disabled.\
> > +                       elf_hwcap &= ~COMPAT_HWCAP_ISA_V;
> > +               }
> 
> We only touch COMPAT_HWCAP_ISA_V and the failed case only turns off the
> rectified V. So here we have nothing to do with the Xtheadvector.

There's nothing t-head related in the tree at this point, so doing
anything with it would cause build issues.

> However, I am still confused because I think Xtheadvector would also
> need to call into this check, so as to setup vlenb.


> Apart from that, it seems like some vendor stating Xtheadvector is
> actually vector-0.7.

The T-Head implementation is 0.7.x, but I am not really sure what you
mean by this comment.

> Please correct me if I speak anything wrong. One
> thing I noticed is that Xtheadvector wouldn't trap on reading
> th.vlenb but vector-0.7 would. If that is the case, should we require
> Xtheadvector to specify `riscv,vlenb` on the device tree?

In the world of Linux, "vector-0.7" isn't a thing. There's only 1.0, and
after this patchset, "xtheadvector". My understanding, from discussion
on earlier versions of this series the trap is actually accessing
th.vlenb register, despite the documentation stating that it is
unprivileged:
https://github.com/T-head-Semi/thead-extension-spec/blob/master/xtheadvector.adoc
I assume Charlie tried it but was trapping, as v1 had a comment:
+		 * Although xtheadvector states that th.vlenb exists and
+		 * overlaps with the vector 1.0 extension overlaps, an illegal
+		 * instruction is raised if read. These systems all currently
+		 * have a fixed vector length of 128, so hardcode that value.

Cheers,
Conor.
Charlie Jenkins May 16, 2024, 8:28 p.m. UTC | #16
On Thu, May 16, 2024 at 05:24:25PM +0100, Conor Dooley wrote:
> On Thu, May 16, 2024 at 10:00:12PM +0800, Andy Chiu wrote:
> > On Sat, May 4, 2024 at 2:21 AM Charlie Jenkins <charlie@rivosinc.com> wrote:
> 
> > > +               if (elf_hwcap & COMPAT_HWCAP_ISA_V && has_riscv_homogeneous_vlenb() < 0) {
> > > +                       pr_warn("Unsupported heterogeneous vlen detected, vector extension disabled.\
> > > +                       elf_hwcap &= ~COMPAT_HWCAP_ISA_V;
> > > +               }
> > 
> > We only touch COMPAT_HWCAP_ISA_V and the failed case only turns off the
> > rectified V. So here we have nothing to do with the Xtheadvector.
> 
> There's nothing t-head related in the tree at this point, so doing
> anything with it would cause build issues.
> 
> > However, I am still confused because I think Xtheadvector would also
> > need to call into this check, so as to setup vlenb.
> 
> 
> > Apart from that, it seems like some vendor stating Xtheadvector is
> > actually vector-0.7.
> 
> The T-Head implementation is 0.7.x, but I am not really sure what you
> mean by this comment.

Andy, the idea of this patch was to be able to support this binding on
more than just xtheadvector.

You are correct though Andy, this is a problem that a later patch in
this series doesn't disable xtheadvector when vlenb is not homogeneous.
I am going to wait to send out any more versions until after this merge
window but I will fix this in the next version. Thank you! 

> 
> > Please correct me if I speak anything wrong. One
> > thing I noticed is that Xtheadvector wouldn't trap on reading
> > th.vlenb but vector-0.7 would. If that is the case, should we require
> > Xtheadvector to specify `riscv,vlenb` on the device tree?
> 
> In the world of Linux, "vector-0.7" isn't a thing. There's only 1.0, and
> after this patchset, "xtheadvector". My understanding, from discussion
> on earlier versions of this series the trap is actually accessing
> th.vlenb register, despite the documentation stating that it is
> unprivileged:
> https://github.com/T-head-Semi/thead-extension-spec/blob/master/xtheadvector.adoc
> I assume Charlie tried it but was trapping, as v1 had a comment:
> +		 * Although xtheadvector states that th.vlenb exists and
> +		 * overlaps with the vector 1.0 extension overlaps, an illegal
> +		 * instruction is raised if read. These systems all currently
> +		 * have a fixed vector length of 128, so hardcode that value.

On my board with a c906 attempting to read th.vlenb (which is supposed
to have the same encoding as vlenb) raises an illegal instruction
exception from S-mode even though the documentation states that it
shouldn't. Because the documentation states that vlenb is available, I
haven't made it required for xtheadvector, I am not sure the proper
solution for that.

- Charlie

> 
> Cheers,
> Conor.
Conor Dooley May 16, 2024, 8:31 p.m. UTC | #17
On Thu, May 16, 2024 at 01:28:45PM -0700, Charlie Jenkins wrote:
> On Thu, May 16, 2024 at 05:24:25PM +0100, Conor Dooley wrote:
> > On Thu, May 16, 2024 at 10:00:12PM +0800, Andy Chiu wrote:
> > > On Sat, May 4, 2024 at 2:21 AM Charlie Jenkins <charlie@rivosinc.com> wrote:
> > 
> > > > +               if (elf_hwcap & COMPAT_HWCAP_ISA_V && has_riscv_homogeneous_vlenb() < 0) {
> > > > +                       pr_warn("Unsupported heterogeneous vlen detected, vector extension disabled.\
> > > > +                       elf_hwcap &= ~COMPAT_HWCAP_ISA_V;
> > > > +               }
> > > 
> > > We only touch COMPAT_HWCAP_ISA_V and the failed case only turns off the
> > > rectified V. So here we have nothing to do with the Xtheadvector.
> > 
> > There's nothing t-head related in the tree at this point, so doing
> > anything with it would cause build issues.
> > 
> > > However, I am still confused because I think Xtheadvector would also
> > > need to call into this check, so as to setup vlenb.
> > 
> > 
> > > Apart from that, it seems like some vendor stating Xtheadvector is
> > > actually vector-0.7.
> > 
> > The T-Head implementation is 0.7.x, but I am not really sure what you
> > mean by this comment.
> 
> Andy, the idea of this patch was to be able to support this binding on
> more than just xtheadvector.
> 
> You are correct though Andy, this is a problem that a later patch in
> this series doesn't disable xtheadvector when vlenb is not homogeneous.
> I am going to wait to send out any more versions until after this merge
> window but I will fix this in the next version. Thank you! 

Agreed on all counts :)

> > > Please correct me if I speak anything wrong. One
> > > thing I noticed is that Xtheadvector wouldn't trap on reading
> > > th.vlenb but vector-0.7 would. If that is the case, should we require
> > > Xtheadvector to specify `riscv,vlenb` on the device tree?
> > 
> > In the world of Linux, "vector-0.7" isn't a thing. There's only 1.0, and
> > after this patchset, "xtheadvector". My understanding, from discussion
> > on earlier versions of this series the trap is actually accessing
> > th.vlenb register, despite the documentation stating that it is
> > unprivileged:
> > https://github.com/T-head-Semi/thead-extension-spec/blob/master/xtheadvector.adoc
> > I assume Charlie tried it but was trapping, as v1 had a comment:
> > +		 * Although xtheadvector states that th.vlenb exists and
> > +		 * overlaps with the vector 1.0 extension overlaps, an illegal
> > +		 * instruction is raised if read. These systems all currently
> > +		 * have a fixed vector length of 128, so hardcode that value.
> 
> On my board with a c906 attempting to read th.vlenb (which is supposed
> to have the same encoding as vlenb) raises an illegal instruction
> exception from S-mode even though the documentation states that it
> shouldn't. Because the documentation states that vlenb is available, I
> haven't made it required for xtheadvector, I am not sure the proper
> solution for that.

Would you mind raising an issue on the T-Head extension spec repo about
this?

Thanks,
Conor.