[RFC,Linux] powerpc: add documentation for HWCAPs

Message ID 20220520051528.98097-1-npiggin@gmail.com
State New

Commit Message

Nicholas Piggin May 20, 2022, 5:15 a.m. UTC
This takes the arm64 file and adjusts it for powerpc. Feature
descriptions are vaguely handwaved by me.
---

Anybody care to expand on or correct the meaning of these entries or
bikeshed the wording of the intro? Many of them are no longer used
anywhere by upstream kernels and even where they are it's not always
quite clear what the exact intent was, a lot of them are old history
and I don't know what or where they are used.

I may try to get these descriptions pushed into the ABI doc after a
time, but for now they can live in the kernel tree.

Thanks,
Nick

 Documentation/powerpc/elf_hwcaps.rst | 192 +++++++++++++++++++++++++++
 1 file changed, 192 insertions(+)
 create mode 100644 Documentation/powerpc/elf_hwcaps.rst

Comments

Michael Ellerman May 20, 2022, 9:21 a.m. UTC | #1
Nicholas Piggin via Libc-alpha <libc-alpha@sourceware.org> writes:
> This takes the arm64 file and adjusts it for powerpc. Feature
> descriptions are vaguely handwaved by me.
> ---

Thanks for attempting to document this.

> Anybody care to expand on or correct the meaning of these entries or
> bikeshed the wording of the intro? Many of them are no longer used
> anywhere by upstream kernels and even where they are it's not always
> quite clear what the exact intent was, a lot of them are old history
> and I don't know what or where they are used.
>
> I may try to get these descriptions pushed into the ABI doc after a
> time, but for now they can live in the kernel tree.
>
> Thanks,
> Nick
>
>  Documentation/powerpc/elf_hwcaps.rst | 192 +++++++++++++++++++++++++++
>  1 file changed, 192 insertions(+)
>  create mode 100644 Documentation/powerpc/elf_hwcaps.rst
>
> diff --git a/Documentation/powerpc/elf_hwcaps.rst b/Documentation/powerpc/elf_hwcaps.rst
> new file mode 100644
> index 000000000000..d712aae8b867
> --- /dev/null
> +++ b/Documentation/powerpc/elf_hwcaps.rst
> @@ -0,0 +1,192 @@
> +.. _elf_hwcaps_index:
> +
> +==================
> +POWERPC ELF hwcaps
> +==================
> +
> +This document describes the usage and semantics of the powerpc ELF hwcaps.
> +
> +
> +1. Introduction
> +---------------
> +
> +Some hardware or software features are only available on some CPU
> +implementations, and/or with certain kernel configurations, but have no
> +architected discovery mechanism available to userspace code. The kernel

By "no architected discovery mechanism" you mean nothing in the ISA, but
I think a reader might not understand that. After all HWCAP is an
"architected discovery mechanism", architected by the kernel and libc.

Maybe just say "no other discovery mechanism".

> +exposes the presence of these features to userspace through a set
> +of flags called hwcaps, exposed in the auxiliary vector.
>
> +
> +Userspace software can test for features by acquiring the AT_HWCAP or
> +AT_HWCAP2 entry of the auxiliary vector, and testing whether the relevant
> +flags are set, e.g.::
> +
> +	bool floating_point_is_present(void)
> +	{
> +		unsigned long hwcaps = getauxval(AT_HWCAP);
> +		if (hwcaps & PPC_FEATURE_HAS_FPU)
> +			return true;
> +
> +		return false;
> +	}
> +
> +Where software relies on a feature described by a hwcap, it should check
> +the relevant hwcap flag to verify that the feature is present before
> +attempting to make use of the feature.
> +
> +Features cannot be probed reliably through other means. When a feature
> +is not available, attempting to use it may result in unpredictable
> +behaviour, and is not guaranteed to result in any reliable indication
> +that the feature is unavailable, such as a SIGILL.

I'd just drop the "such as a SIGILL", don't give people ideas :)
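
A minimal sketch of the check-before-use pattern in the quoted text, assuming
glibc's getauxval() from <sys/auxv.h>. The helper names (pick_sum, sum_scalar,
sum_vector) are hypothetical, and the fallback #define repeats the value I
believe the powerpc uapi header uses for PPC_FEATURE_HAS_ALTIVEC so that the
fragment also builds on other architectures; on powerpc the real definition
should come from the system headers.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/auxv.h>	/* getauxval(), AT_HWCAP */

#ifndef PPC_FEATURE_HAS_ALTIVEC
#define PPC_FEATURE_HAS_ALTIVEC 0x10000000	/* assumed value, see above */
#endif

typedef long (*sum_fn)(const int *v, size_t n);

/* Portable fallback, always safe to call. */
static long sum_scalar(const int *v, size_t n)
{
	long s = 0;

	for (size_t i = 0; i < n; i++)
		s += v[i];
	return s;
}

/* Stand-in for a vector-accelerated routine (hypothetical). */
static long sum_vector(const int *v, size_t n)
{
	return sum_scalar(v, n);
}

/* Choose an implementation from the HWCAP word alone; nothing is
 * probed by executing an instruction and waiting for a signal. */
static sum_fn pick_sum(unsigned long hwcap)
{
	return (hwcap & PPC_FEATURE_HAS_ALTIVEC) ? sum_vector : sum_scalar;
}

static long sum(const int *v, size_t n)
{
	static sum_fn impl;	/* resolved on first use */

	assert(v != NULL || n == 0);
	if (!impl)
		impl = pick_sum(getauxval(AT_HWCAP));
	return impl(v, n);
}
```

Resolving the pointer once keeps the HWCAP test out of the hot path, and the
scalar routine remains available when the flag is clear.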

> +2. hwcap allocation
> +-------------------
> +
> +HWCAPs are allocated as described in Power Architecture 64-Bit ELF V2 ABI

Are we calling them hwcaps or HWCAPs?

> +Specification (which will be reflected in the kernel's uapi headers).
> +
> +3. The hwcaps exposed in AT_HWCAP
> +---------------------------------
> +
> +PPC_FEATURE_32
> +    32-bit CPU
> +
> +PPC_FEATURE_64
> +    64-bit CPU (userspace may be running in 32-bit mode).
> +
> +PPC_FEATURE_601_INSTR
> +    The processor is PowerPC 601

Unused in the kernel since:
  f0ed73f3fa2c ("powerpc: Remove PowerPC 601")

> +PPC_FEATURE_HAS_ALTIVEC
> +    Vector (aka Altivec, VSX) facility is available.
> +
> +PPC_FEATURE_HAS_FPU
> +    Floating point facility is available.
> +
> +PPC_FEATURE_HAS_MMU
> +    Memory management unit is present.
> +
> +PPC_FEATURE_HAS_4xxMAC
> +    ?

First appeared in v2.4.9.2, as part of "Paul Mackerras: PPC update (big re-org)":

  https://github.com/mpe/linux-fullhistory/commit/dccd38599dad0588f4fb254c0a188b7c70af02e1

No extra context I can see.

I think all our 4xx (40x or 44x) CPUs have that set, so seems like it
just means "is a 40x or 44x".

> +PPC_FEATURE_UNIFIED_CACHE
> +    ?

Unused in the kernel since:
  39c8bf2b3cc1 ("powerpc: Retire e200 core (mpc555x processor)")

> +PPC_FEATURE_HAS_SPE
> +    ?

AFAIK means the CPU supports SPE (Signal Processing Engine) instructions.

They were documented in ISA v2.07 Book I chapter 8.

Not to be confused with the Cell SPEs.

I think GCC has dropped support for SPE, so at some point we may want to
drop the kernel support also, as it will be increasingly untested.

> +PPC_FEATURE_HAS_EFP_SINGLE
> +    ?

Seems to be SPE related, only set on CPUs that also have SPE.

> +PPC_FEATURE_HAS_EFP_DOUBLE
> +    ?

As above.

> +PPC_FEATURE_NO_TB
> +    The timebase facility (mftb instruction) is not available.
> +

Unused in the kernel since:
  f0ed73f3fa2c ("powerpc: Remove PowerPC 601")

> +PPC_FEATURE_POWER4
> +    The processor is POWER4.

We dropped Power4 support in:

  471d7ff8b51b ("powerpc/64s: Remove POWER4 support")

But that bit is still set for PPC970/FX/MP.

> +PPC_FEATURE_POWER5
> +    The processor is POWER5.
> +
> +PPC_FEATURE_POWER5_PLUS
> +    The processor is POWER5+.
> +
> +PPC_FEATURE_CELL
> +    The processor is Cell.
> +
> +PPC_FEATURE_BOOKE
> +    The processor implements the BookE architecture.
> +
> +PPC_FEATURE_SMT
> +    The processor implements SMT.
> +
> +PPC_FEATURE_ICACHE_SNOOP
> +    The processor icache is coherent with the dcache, and instruction storage
> +    can be made consistent with data storage for the purpose of executing
> +    instructions with the instruction sequence:
> +        sync
> +        icbi (to any address)
> +        isync

Where did you get that from, the ISA?

> +PPC_FEATURE_ARCH_2_05
> +    The processor supports the v2.05 userlevel architecture. Processors
> +    supporting later architectures also set this feature.
> +
> +PPC_FEATURE_PA6T
> +    The processor is PA6T.
> +
> +PPC_FEATURE_HAS_DFP
> +    DFP facility is available.
> +
> +PPC_FEATURE_POWER6_EXT
> +    The processor is POWER6.
> +
> +PPC_FEATURE_ARCH_2_06
> +    The processor supports the v2.06 userlevel architecture. Processors
> +    supporting later architectures also set this feature.
> +
> +PPC_FEATURE_HAS_VSX
> +    VSX facility is available.
> +
> +PPC_FEATURE_PSERIES_PERFMON_COMPAT

Explanation in:
  0f4733147520 ("powerpc: Add PPC_FEATURE_PSERIES_PERFMON_COMPAT")

But AFAIK only oprofile ever used that, not perf, or maybe perfmon2 uses it?

> +PPC_FEATURE_TRUE_LE
> +    Reserved, do not use

No it's not reserved, you read the comment wrong :)

/* Reserved - do not use		0x00000004 */
#define PPC_FEATURE_TRUE_LE		0x00000002
#define PPC_FEATURE_PPC_LE		0x00000001

It's 4 that's reserved.
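
To make the bit layout concrete, here is a small sketch that decodes the two
little-endian flags and refuses to test the reserved bit. The flag values are
the ones in the header excerpt above; HWCAP_RESERVED_BIT, hwcap_test() and
le_support() are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdbool.h>

#ifndef PPC_FEATURE_TRUE_LE
#define PPC_FEATURE_TRUE_LE	0x00000002
#endif
#ifndef PPC_FEATURE_PPC_LE
#define PPC_FEATURE_PPC_LE	0x00000001
#endif
#define HWCAP_RESERVED_BIT	0x00000004	/* hypothetical name; unnamed in the header */

enum le_mode { LE_NONE, LE_PPC, LE_TRUE };

/* Test exactly one flag; testing the reserved bit is a caller bug. */
static bool hwcap_test(unsigned long hwcap, unsigned long flag)
{
	assert(flag != 0 && (flag & (flag - 1)) == 0);
	assert(flag != HWCAP_RESERVED_BIT);
	return (hwcap & flag) != 0;
}

/* Report the strongest little-endian mode advertised in AT_HWCAP. */
static enum le_mode le_support(unsigned long hwcap)
{
	if (hwcap_test(hwcap, PPC_FEATURE_TRUE_LE))
		return LE_TRUE;
	if (hwcap_test(hwcap, PPC_FEATURE_PPC_LE))
		return LE_PPC;
	return LE_NONE;
}
```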

> +PPC_FEATURE_PPC_LE
> +    Reserved, do not use

There's some discussion of the two LE properties here:

  fab5db97e44f ("[PATCH] powerpc: Implement support for setting little-endian mode via prctl")

But it doesn't really explain the difference.

And this commit:

  651d765d0b2c ("[PATCH] Add a prctl to change the endianness of a process.")

Added the prctl flags:

# define PR_ENDIAN_LITTLE	1	/* True little endian mode */
# define PR_ENDIAN_PPC_LITTLE	2	/* "PowerPC" pseudo little endian */

Which matches my recollection that PPC_LE is somehow not proper little
endian, but I've forgotten why. Someone older than me will remember :)
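
For reference, a hedged sketch of querying the current mode with the prctl
interface those commits added. It assumes PR_GET_ENDIAN and the PR_ENDIAN_*
constants come from <sys/prctl.h> (via <linux/prctl.h>); endian_name() and
current_endian_mode() are hypothetical helpers. On kernels or architectures
without support, the call is expected to fail and the helper returns -1.

```c
#include <assert.h>
#include <sys/prctl.h>	/* prctl(), PR_GET_ENDIAN, PR_ENDIAN_* */

/* Map a PR_ENDIAN_* value to a short description. */
static const char *endian_name(int mode)
{
	switch (mode) {
	case PR_ENDIAN_BIG:
		return "big";
	case PR_ENDIAN_LITTLE:
		return "true little";
	case PR_ENDIAN_PPC_LITTLE:
		return "PowerPC pseudo little";
	default:
		return "unknown";
	}
}

/* Return the PR_ENDIAN_* mode of the calling process, or -1 when the
 * kernel rejects the query. */
static int current_endian_mode(void)
{
	int mode = -1;

	if (prctl(PR_GET_ENDIAN, (unsigned long)&mode, 0, 0, 0) != 0)
		return -1;
	assert(mode >= PR_ENDIAN_BIG && mode <= PR_ENDIAN_PPC_LITTLE);
	return mode;
}
```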

> +3. The hwcaps exposed in AT_HWCAP2
> +----------------------------------
> +
> +PPC_FEATURE2_ARCH_2_07
> +    The processor supports the v2.07 userlevel architecture. Processors
> +    supporting later architectures also set this feature.
> +
> +PPC_FEATURE2_HTM
> +    Transactional Memory feature is available.
> +
> +PPC_FEATURE2_DSCR
> +    DSCR facility is available.
> +
> +PPC_FEATURE2_EBB
> +    EBB facility is available.
> +
> +PPC_FEATURE2_ISEL
> +    isel instruction is available. This is superseded by ARCH_2_07 and
> +    later.
> +
> +PPC_FEATURE2_TAR
> +    VSX facility is available.

Typo?

It means the CPU has the "tar" register. I suspect it's never used.

> +PPC_FEATURE2_VEC_CRYPTO
> +    v2.07 crypto instructions are available.
> +
> +PPC_FEATURE2_HTM_NOSC
> +    System calls fail if called in a transactional state, see
> +    Documentation/powerpc/syscall64-abi.rst
> +
> +PPC_FEATURE2_ARCH_3_00
> +    The processor supports the v3.0B / v3.0C userlevel architecture. Processors
> +    supporting later architectures also set this feature.
> +
> +PPC_FEATURE2_HAS_IEEE128
> +    IEEE 128 is available? What instructions/data?
> +
> +PPC_FEATURE2_DARN
> +    darn instruction is available.
> +
> +PPC_FEATURE2_SCV
> +    scv instruction is available.
> +
> +PPC_FEATURE2_HTM_NO_SUSPEND
> +    A limited Transactional Memory facility that does not support suspend is
> +    available, see Documentation/powerpc/transactional_memory.rst.
> +
> +PPC_FEATURE2_ARCH_3_1
> +    The processor supports the v3.1 userlevel architecture. Processors
> +    supporting later architectures also set this feature.
> +
> +PPC_FEATURE2_MMA
> +    MMA facility is available.


cheers
Nicholas Piggin May 20, 2022, 12:06 p.m. UTC | #2
Excerpts from Michael Ellerman's message of May 20, 2022 7:21 pm:
> Nicholas Piggin via Libc-alpha <libc-alpha@sourceware.org> writes:
>> This takes the arm64 file and adjusts it for powerpc. Feature
>> descriptions are vaguely handwaved by me.
>> ---
> 
> Thanks for attempting to document this.

It was mainly copy and paste from two existing files :)

>> +1. Introduction
>> +---------------
>> +
>> +Some hardware or software features are only available on some CPU
>> +implementations, and/or with certain kernel configurations, but have no
>> +architected discovery mechanism available to userspace code. The kernel
> 
> By "no architected discovery mechanism" you mean nothing in the ISA, but
> I think a reader might not understand that. After all HWCAP is an
> "architected discovery mechanism", architected by the kernel and libc.
> 
> Maybe just say "no other discovery mechanism".

Good point, I reworded that.

>> +Features cannot be probed reliably through other means. When a feature
>> +is not available, attempting to use it may result in unpredictable
>> +behaviour, and is not guaranteed to result in any reliable indication
>> +that the feature is unavailable, such as a SIGILL.
> 
> I'd just drop the "such as a SIGILL", don't give people ideas :)

Yep.

>> +2. hwcap allocation
>> +-------------------
>> +
>> +HWCAPs are allocated as described in Power Architecture 64-Bit ELF V2 ABI
> 
> Are we calling them hwcaps or HWCAPs?

arm64 was mixed. We'll go with HWCAP.

>> +Specification (which will be reflected in the kernel's uapi headers).
>> +
>> +3. The hwcaps exposed in AT_HWCAP
>> +---------------------------------
>> +
>> +PPC_FEATURE_32
>> +    32-bit CPU
>> +
>> +PPC_FEATURE_64
>> +    64-bit CPU (userspace may be running in 32-bit mode).
>> +
>> +PPC_FEATURE_601_INSTR
>> +    The processor is PowerPC 601
> 
> Unused in the kernel since:
>   f0ed73f3fa2c ("powerpc: Remove PowerPC 601")
> 
>> +PPC_FEATURE_HAS_ALTIVEC
>> +    Vector (aka Altivec, VSX) facility is available.
>> +
>> +PPC_FEATURE_HAS_FPU
>> +    Floating point facility is available.
>> +
>> +PPC_FEATURE_HAS_MMU
>> +    Memory management unit is present.
>> +
>> +PPC_FEATURE_HAS_4xxMAC
>> +    ?
> 
> First appeared in v2.4.9.2, as part of "Paul Mackerras: PPC update (big re-org)":
> 
>   https://github.com/mpe/linux-fullhistory/commit/dccd38599dad0588f4fb254c0a188b7c70af02e1
> 
> No extra context I can see.
> 
> I think all our 4xx (40x or 44x) CPUs have that set, so seems like it
> just means "is a 40x or 44x".
> 
>> +PPC_FEATURE_UNIFIED_CACHE
>> +    ?
> 
> Unused in the kernel since:
>   39c8bf2b3cc1 ("powerpc: Retire e200 core (mpc555x processor)")
> 
>> +PPC_FEATURE_HAS_SPE
>> +    ?
> 
> AFAIK means the CPU supports SPE (Signal Processing Engine) instructions.
> 
> They were documented in ISA v2.07 Book I chapter 8.
> 
> Not to be confused with the Cell SPEs.

Okay.

> 
> I think GCC has dropped support for SPE, so at some point we may want to
> drop the kernel support also, as it will be increasingly untested.
> 
>> +PPC_FEATURE_HAS_EFP_SINGLE
>> +    ?
> 
> Seems to be SPE related, only set on CPUs that also have SPE.

I may have found some docs on it. By the looks of it, these were some
operations additional to the SPE facility.

> 
>> +PPC_FEATURE_HAS_EFP_DOUBLE
>> +    ?
> 
> As above.
> 
>> +PPC_FEATURE_NO_TB
>> +    The timebase facility (mftb instruction) is not available.
>> +
> 
> Unused in the kernel since:
>   f0ed73f3fa2c ("powerpc: Remove PowerPC 601")
> 
>> +PPC_FEATURE_POWER4
>> +    The processor is POWER4.
> 
> We dropped Power4 support in:
> 
>   471d7ff8b51b ("powerpc/64s: Remove POWER4 support")
> 
> But that bit is still set for PPC970/FX/MP.

Ah good catch.

> 
>> +PPC_FEATURE_POWER5
>> +    The processor is POWER5.
>> +
>> +PPC_FEATURE_POWER5_PLUS
>> +    The processor is POWER5+.
>> +
>> +PPC_FEATURE_CELL
>> +    The processor is Cell.
>> +
>> +PPC_FEATURE_BOOKE
>> +    The processor implements the BookE architecture.
>> +
>> +PPC_FEATURE_SMT
>> +    The processor implements SMT.
>> +
>> +PPC_FEATURE_ICACHE_SNOOP
>> +    The processor icache is coherent with the dcache, and instruction storage
>> +    can be made consistent with data storage for the purpose of executing
>> +    instructions with the instruction sequence:
>> +        sync
>> +        icbi (to any address)
>> +        isync
> 
> Where did you get that from, the ISA?

User manuals. I can't find it in the ISA, but I'd argue it should have
some note or reference to coherent implementations, seeing as all
POWER CPUs have had it for years.
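
A rough sketch of how userspace might act on this bit after writing new
instructions, under several assumptions: the fallback value for
PPC_FEATURE_ICACHE_SNOOP is what I believe the uapi header defines,
flush_icache() is a hypothetical helper, and the generic path relies on the
GCC/Clang builtin __builtin___clear_cache(). The inline assembly is only
compiled on powerpc and follows the sync / icbi / isync sequence quoted above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#ifndef PPC_FEATURE_ICACHE_SNOOP
#define PPC_FEATURE_ICACHE_SNOOP 0x00002000	/* assumed value, see above */
#endif

/* Make freshly written instructions in [start, start + len) safe to
 * execute. Returns true when the short coherent-icache sequence ran. */
static bool flush_icache(void *start, size_t len, unsigned long hwcap)
{
	assert(start != NULL && len > 0);
#if defined(__powerpc__)
	if (hwcap & PPC_FEATURE_ICACHE_SNOOP) {
		/* One icbi to any address is enough on a snooping icache. */
		__asm__ volatile("sync\n\ticbi 0,%0\n\tisync"
				 : : "r"(start) : "memory");
		return true;
	}
#else
	(void)hwcap;	/* only consulted on powerpc */
#endif
	/* Generic path: let the compiler runtime handle the whole range. */
	__builtin___clear_cache((char *)start, (char *)start + len);
	return false;
}
```

Without the flag the helper falls back to the range-based flush, so callers
do not need to know the cache line size themselves.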

>> +PPC_FEATURE_ARCH_2_05
>> +    The processor supports the v2.05 userlevel architecture. Processors
>> +    supporting later architectures also set this feature.
>> +
>> +PPC_FEATURE_PA6T
>> +    The processor is PA6T.
>> +
>> +PPC_FEATURE_HAS_DFP
>> +    DFP facility is available.
>> +
>> +PPC_FEATURE_POWER6_EXT
>> +    The processor is POWER6.
>> +
>> +PPC_FEATURE_ARCH_2_06
>> +    The processor supports the v2.06 userlevel architecture. Processors
>> +    supporting later architectures also set this feature.
>> +
>> +PPC_FEATURE_HAS_VSX
>> +    VSX facility is available.
>> +
>> +PPC_FEATURE_PSERIES_PERFMON_COMPAT
> 
> Explanation in:
>   0f4733147520 ("powerpc: Add PPC_FEATURE_PSERIES_PERFMON_COMPAT")
> 
> But AFAIK only oprofile ever used that, not perf, or maybe perfmon2 uses it?

Seems to be the architected PMU events?

> 
>> +PPC_FEATURE_TRUE_LE
>> +    Reserved, do not use
> 
> No it's not reserved, you read the comment wrong :)
> 
> /* Reserved - do not use		0x00000004 */
> #define PPC_FEATURE_TRUE_LE		0x00000002
> #define PPC_FEATURE_PPC_LE		0x00000001
> 
> It's 4 that's reserved.

Ah yep.

> 
>> +PPC_FEATURE_PPC_LE
>> +    Reserved, do not use
> 
> There's some discussion of the two LE properties here:
> 
>   fab5db97e44f ("[PATCH] powerpc: Implement support for setting little-endian mode via prctl")
> 
> But it doesn't really explain the difference.
> 
> And this commit:
> 
>   651d765d0b2c ("[PATCH] Add a prctl to change the endianness of a process.")
> 
> Added the prctl flags:
> 
> # define PR_ENDIAN_LITTLE	1	/* True little endian mode */
> # define PR_ENDIAN_PPC_LITTLE	2	/* "PowerPC" pseudo little endian */
> 
> Which matches my recollection that PPC_LE is somehow not proper little
> endian, but I've forgotten why. Someone older than me will remember :)

Looked it up and found it's "address munging", some strange mode that looks
like little endian to one's own loads and stores, but stores to memory
in some entirely different format that doesn't even match the address!
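
A toy simulation may help show what that means. It assumes the rule as I
understand it from the classic 32-bit PowerPC description: memory stays in
big-endian byte order, and for an aligned access of 1, 2, 4 or 8 bytes the
low three address bits are XORed with 7, 6, 4 or 0 respectively. The names
munge(), store() and load() are hypothetical, and this models only aligned
accesses within one doubleword.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* XOR the low three address bits with (8 - size): 7, 6, 4 or 0. */
static size_t munge(size_t ea, size_t size)
{
	assert(size == 1 || size == 2 || size == 4 || size == 8);
	assert(ea % size == 0);		/* aligned accesses only */
	return ea ^ (8 - size);
}

/* Store val at the munged address, in big-endian byte order. */
static void store(uint8_t *mem, size_t ea, uint64_t val, size_t size)
{
	size_t pa = munge(ea, size);

	for (size_t i = 0; i < size; i++)
		mem[pa + i] = (uint8_t)(val >> (8 * (size - 1 - i)));
}

/* Load size bytes from the munged address, big-endian byte order. */
static uint64_t load(const uint8_t *mem, size_t ea, size_t size)
{
	size_t pa = munge(ea, size);
	uint64_t val = 0;

	for (size_t i = 0; i < size; i++)
		val = (val << 8) | mem[pa + i];
	return val;
}
```

Storing the word 0x11223344 at address 0 and then loading the byte at
address 0 yields 0x44, as a little-endian program expects, yet the bytes
actually sit at offsets 4 to 7 of the doubleword in big-endian order. Another
agent reading that memory without the same munging sees neither a
big-endian nor a little-endian word at address 0.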

> 
>> +3. The hwcaps exposed in AT_HWCAP2
>> +----------------------------------
>> +
>> +PPC_FEATURE2_ARCH_2_07
>> +    The processor supports the v2.07 userlevel architecture. Processors
>> +    supporting later architectures also set this feature.
>> +
>> +PPC_FEATURE2_HTM
>> +    Transactional Memory feature is available.
>> +
>> +PPC_FEATURE2_DSCR
>> +    DSCR facility is available.
>> +
>> +PPC_FEATURE2_EBB
>> +    EBB facility is available.
>> +
>> +PPC_FEATURE2_ISEL
>> +    isel instruction is available. This is superseded by ARCH_2_07 and
>> +    later.
>> +
>> +PPC_FEATURE2_TAR
>> +    VSX facility is available.
> 
> Typo?
> 
> It means the CPU has the "tar" register. I suspect it's never used.

Yeah typo.

>> +PPC_FEATURE2_VEC_CRYPTO
>> +    v2.07 crypto instructions are available.
>> +
>> +PPC_FEATURE2_HTM_NOSC
>> +    System calls fail if called in a transactional state, see
>> +    Documentation/powerpc/syscall64-abi.rst
>> +
>> +PPC_FEATURE2_ARCH_3_00
>> +    The processor supports the v3.0B / v3.0C userlevel architecture. Processors
>> +    supporting later architectures also set this feature.
>> +
>> +PPC_FEATURE2_HAS_IEEE128
>> +    IEEE 128 is available? What instructions/data?
>> +
>> +PPC_FEATURE2_DARN
>> +    darn instruction is available.
>> +
>> +PPC_FEATURE2_SCV
>> +    scv instruction is available.
>> +
>> +PPC_FEATURE2_HTM_NO_SUSPEND
>> +    A limited Transactional Memory facility that does not support suspend is
>> +    available, see Documentation/powerpc/transactional_memory.rst.
>> +
>> +PPC_FEATURE2_ARCH_3_1
>> +    The processor supports the v3.1 userlevel architecture. Processors
>> +    supporting later architectures also set this feature.
>> +
>> +PPC_FEATURE2_MMA
>> +    MMA facility is available.

How's this?

---
 Documentation/powerpc/elf_hwcaps.rst | 209 +++++++++++++++++++++++++++
 1 file changed, 209 insertions(+)
 create mode 100644 Documentation/powerpc/elf_hwcaps.rst

diff --git a/Documentation/powerpc/elf_hwcaps.rst b/Documentation/powerpc/elf_hwcaps.rst
new file mode 100644
index 000000000000..ac0d8983717b
--- /dev/null
+++ b/Documentation/powerpc/elf_hwcaps.rst
@@ -0,0 +1,209 @@
+.. _elf_hwcaps_index:
+
+==================
+POWERPC ELF HWCAPs
+==================
+
+This document describes the usage and semantics of the powerpc ELF HWCAPs.
+
+
+1. Introduction
+---------------
+
+Some hardware or software features are only available on some CPU
+implementations, and/or with certain kernel configurations, but have no other
+discovery mechanism available to userspace code. The kernel exposes the
+presence of these features to userspace through a set of flags called HWCAPs,
+exposed in the auxiliary vector.
+
+Userspace software can test for features by acquiring the AT_HWCAP or
+AT_HWCAP2 entry of the auxiliary vector, and testing whether the relevant
+flags are set, e.g.::
+
+	bool floating_point_is_present(void)
+	{
+		unsigned long hwcaps = getauxval(AT_HWCAP);
+		if (hwcaps & PPC_FEATURE_HAS_FPU)
+			return true;
+
+		return false;
+	}
+
+Where software relies on a feature described by a HWCAP, it should check the
+relevant HWCAP flag to verify that the feature is present before attempting to
+make use of the feature.
+
+Features should not be probed through other means. When a feature is not
+available, attempting to use it may result in unpredictable behaviour, and
+may not be guaranteed to result in any reliable indication that the feature
+is unavailable.
+
+2. HWCAP allocation
+-------------------
+
+HWCAPs are allocated as described in Power Architecture 64-Bit ELF V2 ABI
+Specification (which will be reflected in the kernel's uapi headers).
+
+3. The HWCAPs exposed in AT_HWCAP
+---------------------------------
+
+PPC_FEATURE_32
+    32-bit CPU
+
+PPC_FEATURE_64
+    64-bit CPU (userspace may be running in 32-bit mode).
+
+PPC_FEATURE_601_INSTR
+    The processor is PowerPC 601.
+    Unused in the kernel since:
+      f0ed73f3fa2c ("powerpc: Remove PowerPC 601")
+
+PPC_FEATURE_HAS_ALTIVEC
+    Vector (aka AltiVec, VMX) facility is available.
+
+PPC_FEATURE_HAS_FPU
+    Floating point facility is available.
+
+PPC_FEATURE_HAS_MMU
+    Memory management unit is present.
+
+PPC_FEATURE_HAS_4xxMAC
+    The processor is 40x or 44x family.
+
+PPC_FEATURE_UNIFIED_CACHE
+    The processor has a unified L1 cache for instructions and data, as
+    found in the NXP e200.
+    Unused in the kernel since:
+      39c8bf2b3cc1 ("powerpc: Retire e200 core (mpc555x processor)")
+
+PPC_FEATURE_HAS_SPE
+    Signal Processing Engine facility is available.
+
+PPC_FEATURE_HAS_EFP_SINGLE
+    Embedded Floating Point single precision operations are available.
+
+PPC_FEATURE_HAS_EFP_DOUBLE
+    Embedded Floating Point double precision operations are available.
+
+PPC_FEATURE_NO_TB
+    The timebase facility (mftb instruction) is not available.
+    This is a 601 specific HWCAP, so if it is known that the processor
+    running is not a 601, via other HWCAPs or other means, it is not
+    required to test this bit before using the timebase.
+    Unused in the kernel since:
+      f0ed73f3fa2c ("powerpc: Remove PowerPC 601")
+
+PPC_FEATURE_POWER4
+    The processor is POWER4 or PPC970/FX/MP.
+    POWER4 support dropped from the kernel since:
+      471d7ff8b51b ("powerpc/64s: Remove POWER4 support")
+
+PPC_FEATURE_POWER5
+    The processor is POWER5.
+
+PPC_FEATURE_POWER5_PLUS
+    The processor is POWER5+.
+
+PPC_FEATURE_CELL
+    The processor is Cell.
+
+PPC_FEATURE_BOOKE
+    The processor implements the BookE architecture.
+
+PPC_FEATURE_SMT
+    The processor implements SMT.
+
+PPC_FEATURE_ICACHE_SNOOP
+    The processor icache is coherent with the dcache, and instruction storage
+    can be made consistent with data storage for the purpose of executing
+    instructions with the sequence (as described in, e.g., POWER9 Processor
+    User's Manual, 4.6.2.2 Instruction Cache Block Invalidate (icbi)):
+        sync
+        icbi (to any address)
+        isync
+
+PPC_FEATURE_ARCH_2_05
+    The processor supports the v2.05 userlevel architecture. Processors
+    supporting later architectures also set this feature.
+
+PPC_FEATURE_PA6T
+    The processor is PA6T.
+
+PPC_FEATURE_HAS_DFP
+    DFP facility is available.
+
+PPC_FEATURE_POWER6_EXT
+    The processor is POWER6.
+
+PPC_FEATURE_ARCH_2_06
+    The processor supports the v2.06 userlevel architecture. Processors
+    supporting later architectures also set this feature.
+
+PPC_FEATURE_HAS_VSX
+    VSX facility is available.
+
+PPC_FEATURE_PSERIES_PERFMON_COMPAT
+    The processor supports architected PMU events in the range 0xE0-0xFF.
+
+PPC_FEATURE_TRUE_LE
+    The processor supports true little-endian mode.
+
+PPC_FEATURE_PPC_LE
+    The processor supports "PowerPC Little-Endian", that uses address
+    munging to make storage access appear to be little-endian, but the
+    data is stored in a different format that is unsuitable to be
+    accessed by other agents not running in this mode.
+
+4. The HWCAPs exposed in AT_HWCAP2
+----------------------------------
+
+PPC_FEATURE2_ARCH_2_07
+    The processor supports the v2.07 userlevel architecture. Processors
+    supporting later architectures also set this feature.
+
+PPC_FEATURE2_HTM
+    Transactional Memory feature is available.
+
+PPC_FEATURE2_DSCR
+    DSCR facility is available.
+
+PPC_FEATURE2_EBB
+    EBB facility is available.
+
+PPC_FEATURE2_ISEL
+    isel instruction is available. This is superseded by ARCH_2_07 and
+    later.
+
+PPC_FEATURE2_TAR
+    TAR facility is available.
+
+PPC_FEATURE2_VEC_CRYPTO
+    v2.07 crypto instructions are available.
+
+PPC_FEATURE2_HTM_NOSC
+    System calls fail if called in a transactional state, see
+    Documentation/powerpc/syscall64-abi.rst
+
+PPC_FEATURE2_ARCH_3_00
+    The processor supports the v3.0B / v3.0C userlevel architecture. Processors
+    supporting later architectures also set this feature.
+
+PPC_FEATURE2_HAS_IEEE128
+    IEEE 128 is available? What instructions/data?
+
+PPC_FEATURE2_DARN
+    darn instruction is available.
+
+PPC_FEATURE2_SCV
+    scv instruction is available.
+
+PPC_FEATURE2_HTM_NO_SUSPEND
+    A limited Transactional Memory facility that does not support suspend is
+    available, see Documentation/powerpc/transactional_memory.rst.
+
+PPC_FEATURE2_ARCH_3_1
+    The processor supports the v3.1 userlevel architecture. Processors
+    supporting later architectures also set this feature.
+
+PPC_FEATURE2_MMA
+    MMA facility is available.
Paul E Murphy May 20, 2022, 2:21 p.m. UTC | #3
On 5/20/22 12:15 AM, Nicholas Piggin via Gcc wrote:
> This takes the arm64 file and adjusts it for powerpc. Feature
> descriptions are vaguely handwaved by me.
> ---
> 
> Anybody care to expand on or correct the meaning of these entries or
> bikeshed the wording of the intro? Many of them are no longer used
> anywhere by upstream kernels and even where they are it's not always
> quite clear what the exact intent was, a lot of them are old history
> and I don't know what or where they are used.
> 
> I may try to get these descriptions pushed into the ABI doc after a
> time, but for now they can live in the kernel tree.
> 
> Thanks,
> Nick

Thanks, this is really helpful.  I've been caught off-guard by some of 
the subtleties in the meanings of these bits at times.  I think it would 
be helpful to share what is implied by the usage of the word "facility" 
below.  It would resolve some of my questions below.



> +PPC_FEATURE_HAS_ALTIVEC
> +    Vector (aka Altivec, VSX) facility is available.

I think "(aka Altivec, VSX)" might be more accurately stated as "(aka 
Altivec)"?


> +PPC_FEATURE_HAS_DFP
> +    DFP facility is available.

Maybe something like "Decimal floating point instructions are available 
to userspace.  Individual instruction availability is dependent on the
reported architecture version."?


> +PPC_FEATURE_HAS_VSX
> +    VSX facility is available.

A small reminder that the features also depend on the architecture version
might be helpful here too.


> +PPC_FEATURE2_TAR
> +    VSX facility is available.

Was manipulating the tar SPR once a privileged instruction? Is this
a hint that userspace can use the related instructions?


> +
> +PPC_FEATURE2_HAS_IEEE128
> +    IEEE 128 is available? What instructions/data?

Maybe something like "IEEE 128 binary floating point instructions are 
supported.  Individual instruction availability is dependent on the
reported architecture version."?


> +PPC_FEATURE2_SCV
> +    scv instruction is available.

I think it might be clearer to say "This kernel supports syscalls using 
the scv instruction".


> +PPC_FEATURE2_MMA
> +    MMA facility is available.

Maybe another note that specific instruction availability may depend on 
the reported architecture version?
Peter Bergner May 20, 2022, 4:58 p.m. UTC | #4
On 5/20/22 12:15 AM, Nicholas Piggin via Gcc wrote:
> +PPC_FEATURE_HAS_ALTIVEC
> +    Vector (aka Altivec, VSX) facility is available.

Slight typo.  s/VSX/VMX/


Peter
Segher Boessenkool May 20, 2022, 5:42 p.m. UTC | #5
On Fri, May 20, 2022 at 09:21:43AM -0500, Paul E Murphy wrote:
> >+PPC_FEATURE_HAS_ALTIVEC
> >+    Vector (aka Altivec, VSX) facility is available.
> 
> I think "(aka Altivec, VSX)" might be more accurately stated as "(aka 
> Altivec)"?

"Also known as AltiVec or VMX", yes.

> >+PPC_FEATURE_HAS_DFP
> >+    DFP facility is available.
> 
> Maybe something like "Decimal floating point instructions are available 
> to userspace.  Individual instruction availability is dependent on the
> reported architecture version."?

That is true for *all* facilities, and even the base architecture!  This
is not only hypothetical, either.


Segher
Nicholas Piggin May 21, 2022, 12:11 a.m. UTC | #6
Excerpts from Paul E Murphy's message of May 21, 2022 12:21 am:
> 
> 
> On 5/20/22 12:15 AM, Nicholas Piggin via Gcc wrote:
>> This takes the arm64 file and adjusts it for powerpc. Feature
>> descriptions are vaguely handwaved by me.
>> ---
>> 
>> Anybody care to expand on or correct the meaning of these entries or
>> bikeshed the wording of the intro? Many of them are no longer used
>> anywhere by upstream kernels and even where they are it's not always
>> quite clear what the exact intent was, a lot of them are old history
>> and I don't know what or where they are used.
>> 
>> I may try to get these descriptions pushed into the ABI doc after a
>> time, but for now they can live in the kernel tree.
>> 
>> Thanks,
>> Nick
> 
> Thanks, this is really helpful.  I've been caught off-guard by some of 
> the subtleties in the meanings of these bits at times.  I think it would 
> be helpful to share what is implied by the usage of the word "facility" 
> below.  It would resolve some of my questions below.

Yeah that's probably a good point. In the introduction we can explain
that the facility is a class of instructions, registers, and behaviour,
but that the specifics depend on the ISA version.
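
As an illustration of that point, a sketch that pairs a facility bit with the
reported ISA level. The fallback #defines are the values I believe the uapi
header uses and should be treated as assumptions; enum isa_level,
isa_level() and hwcap_facility_usable() are hypothetical names. The helper
only handles facility flags that live in AT_HWCAP.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed values; on powerpc the system headers provide these. */
#ifndef PPC_FEATURE_ARCH_2_05
#define PPC_FEATURE_ARCH_2_05	0x00001000
#endif
#ifndef PPC_FEATURE_ARCH_2_06
#define PPC_FEATURE_ARCH_2_06	0x00000100
#endif
#ifndef PPC_FEATURE_HAS_VSX
#define PPC_FEATURE_HAS_VSX	0x00000080
#endif
#ifndef PPC_FEATURE2_ARCH_2_07
#define PPC_FEATURE2_ARCH_2_07	0x80000000
#endif
#ifndef PPC_FEATURE2_ARCH_3_00
#define PPC_FEATURE2_ARCH_3_00	0x00800000
#endif
#ifndef PPC_FEATURE2_ARCH_3_1
#define PPC_FEATURE2_ARCH_3_1	0x00040000
#endif

enum isa_level { ISA_UNKNOWN, ISA_2_05, ISA_2_06, ISA_2_07, ISA_3_00, ISA_3_1 };

/* Later architectures also set the earlier bits, so test newest first. */
static enum isa_level isa_level(unsigned long hwcap, unsigned long hwcap2)
{
	if (hwcap2 & PPC_FEATURE2_ARCH_3_1)
		return ISA_3_1;
	if (hwcap2 & PPC_FEATURE2_ARCH_3_00)
		return ISA_3_00;
	if (hwcap2 & PPC_FEATURE2_ARCH_2_07)
		return ISA_2_07;
	if (hwcap & PPC_FEATURE_ARCH_2_06)
		return ISA_2_06;
	if (hwcap & PPC_FEATURE_ARCH_2_05)
		return ISA_2_05;
	return ISA_UNKNOWN;
}

/* The facility bit says the class is present; the ISA level says which
 * instructions of that class exist. Both must be satisfied. */
static bool hwcap_facility_usable(unsigned long hwcap, unsigned long hwcap2,
				  unsigned long facility, enum isa_level need)
{
	assert(facility != 0);
	return (hwcap & facility) == facility &&
	       isa_level(hwcap, hwcap2) >= need;
}
```

For example, code that wants VSX instructions introduced in v3.0 would pass
PPC_FEATURE_HAS_VSX together with ISA_3_00 rather than testing the VSX bit
alone.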

> 
> 
> 
>> +PPC_FEATURE_HAS_ALTIVEC
>> +    Vector (aka Altivec, VSX) facility is available.
> 
> I think "(aka Altivec, VSX)" might be more accurately stated as "(aka 
> Altivec)"?

Yes, VSX is a thinko; it should be VMX, as pointed out.

>> +PPC_FEATURE_HAS_DFP
>> +    DFP facility is available.
> 
> Maybe something like "Decimal floating point instructions are available 
> to userspace.  Individual instruction availability is dependent on the
> reported architecture version."?

Yep, we can cover all these with a note in the intro.

>> +PPC_FEATURE_HAS_VSX
>> +    VSX facility is available.
> A small reminder the features are also dependent on architecture version 
> too might be helpful here too.
> 
> 
>> +PPC_FEATURE2_TAR
>> +    VSX facility is available.
> 
> Was manipulating the tar SPR once a privileged instruction? Is this
> a hint that userspace can use the related instructions?

It can be disabled with facility control, and I guess there was
some consideration for how it might be used, e.g., "system software"
could use it for its own purpose then clear the bit for the application.

In practice I don't really know what makes use of this or whether
anything sanely can, it's marked reserved in the ABI. Would be 
interesting to know whether there is much benefit to use it in the
compiler. The kernel could actually use it for something nifty if we
were able to prevent userspace from accessing it entirely...

>> +
>> +PPC_FEATURE2_HAS_IEEE128
>> +    IEEE 128 is available? What instructions/data?
> 
> Maybe something like "IEEE 128 binary floating point instructions are 
> supported.  Individual instruction availability is dependent on the
> reported architecture version."?

Right, I just didn't know what architectural class of instructions
those are. Is it just VSX in general or are there some specific
things we can name?

>> +PPC_FEATURE2_SCV
>> +    scv instruction is available.
> 
> I think it might be clearer to say "This kernel supports syscalls using 
> the scv instruction".

Yeah good point.

>> +PPC_FEATURE2_MMA
>> +    MMA facility is available.
> 
> Maybe another note that specific instruction availability may depend on 
> the reported architecture version?

Thanks,
Nick
Paul E Murphy May 23, 2022, 2:19 p.m. UTC | #7
On 5/20/22 7:11 PM, Nicholas Piggin wrote:
> Excerpts from Paul E Murphy's message of May 21, 2022 12:21 am:
>>
>>
>> On 5/20/22 12:15 AM, Nicholas Piggin via Gcc wrote:
>>> +PPC_FEATURE2_TAR
>>> +    VSX facility is available.
>>
>> Was manipulating the tar spr once a privileged instruction? Is this
>> a hint userspace can use the related instructions?
> 
> It can be disabled with facility control, and I guess there was
> some consideration for how it might be used, e.g., "system software"
> could use it for its own purpose then clear the bit for the application.
> 
> In practice I don't really know what makes use of this or whether
> anything sanely can, it's marked reserved in the ABI. Would be
> interesting to know whether there is much benefit to use it in the
> compiler. The kernel could actually use it for something nifty if we
> were able to prevent userspace from accessing it entirely...

It might be useful as a scratch register for indirect branches in some 
odd cases, such as golang's preemptive userspace threading.  Though, it 
seems more trouble than it's worth for a very limited benefit.

> 
>>> +
>>> +PPC_FEATURE2_HAS_IEEE128
>>> +    IEEE 128 is available? What instructions/data?
>>
>> Maybe something like "IEEE 128 binary floating point instructions are
>> supported.  Individual instruction availability is dependent on the
>> reported architecture version."?
> 
> Right, I just didn't know what architectural class of instructions
> those are. Is it just VSX in general or are there some specific
> things we can name?

I think ISA 3.1 buckets this into an OpenPOWER Linux Optional Feature 
for "Quad-precision floating-point (QFP)".  I guess ISA 3.0 predates 
those categorizations.


>>> +PPC_FEATURE2_MMA
>>> +    MMA facility is available.
>>
>> Maybe another note that specific instruction availability may depend on
>> the reported architecture version?
Yep. I wonder if it would help to note how these align (or don't) with 
the various OpenPOWER features.

Patch

diff --git a/Documentation/powerpc/elf_hwcaps.rst b/Documentation/powerpc/elf_hwcaps.rst
new file mode 100644
index 000000000000..d712aae8b867
--- /dev/null
+++ b/Documentation/powerpc/elf_hwcaps.rst
@@ -0,0 +1,192 @@ 
+.. _elf_hwcaps_index:
+
+==================
+POWERPC ELF hwcaps
+==================
+
+This document describes the usage and semantics of the powerpc ELF hwcaps.
+
+
+1. Introduction
+---------------
+
+Some hardware or software features are only available on some CPU
+implementations, and/or with certain kernel configurations, but have no
+architected discovery mechanism available to userspace code. The kernel
+exposes the presence of these features to userspace through a set
+of flags called hwcaps, exposed in the auxiliary vector.
+
+Userspace software can test for features by acquiring the AT_HWCAP or
+AT_HWCAP2 entry of the auxiliary vector, and testing whether the relevant
+flags are set, e.g.::
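+
+	#include <stdbool.h>
+	#include <sys/auxv.h>	/* getauxval(), AT_HWCAP, PPC_FEATURE_* */
+
+	bool floating_point_is_present(void)
+	{
+		/* A set flag means the kernel reports the facility as usable. */
+		return (getauxval(AT_HWCAP) & PPC_FEATURE_HAS_FPU) != 0;
+	}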
+
+Where software relies on a feature described by a hwcap, it should check
+the relevant hwcap flag to verify that the feature is present before
+attempting to make use of the feature.
+
+Features cannot be probed reliably through other means. When a feature
+is not available, attempting to use it may result in unpredictable
+behaviour, and is not guaranteed to result in any reliable indication
+that the feature is unavailable, such as a SIGILL.
+
+2. hwcap allocation
+-------------------
+
+HWCAPs are allocated as described in the Power Architecture 64-Bit ELF V2 ABI
+Specification (which will be reflected in the kernel's uapi headers).
+
+3. The hwcaps exposed in AT_HWCAP
+---------------------------------
+
+PPC_FEATURE_32
+    32-bit CPU
+
+PPC_FEATURE_64
+    64-bit CPU (userspace may be running in 32-bit mode).
+
+PPC_FEATURE_601_INSTR
+    The processor is PowerPC 601
+
+PPC_FEATURE_HAS_ALTIVEC
+    Vector (aka Altivec, VMX) facility is available.
+
+PPC_FEATURE_HAS_FPU
+    Floating point facility is available.
+
+PPC_FEATURE_HAS_MMU
+    Memory management unit is present.
+
+PPC_FEATURE_HAS_4xxMAC
+    ?
+
+PPC_FEATURE_UNIFIED_CACHE
+    ?
+
+PPC_FEATURE_HAS_SPE
+    ?
+
+PPC_FEATURE_HAS_EFP_SINGLE
+    ?
+
+PPC_FEATURE_HAS_EFP_DOUBLE
+    ?
+
+PPC_FEATURE_NO_TB
+    The timebase facility (mftb instruction) is not available.
+
+PPC_FEATURE_POWER4
+    The processor is POWER4.
+
+PPC_FEATURE_POWER5
+    The processor is POWER5.
+
+PPC_FEATURE_POWER5_PLUS
+    The processor is POWER5+.
+
+PPC_FEATURE_CELL
+    The processor is Cell.
+
+PPC_FEATURE_BOOKE
+    The processor implements the BookE architecture.
+
+PPC_FEATURE_SMT
+    The processor implements SMT.
+
+PPC_FEATURE_ICACHE_SNOOP
+    The processor icache is coherent with the dcache, and instruction storage
+    can be made consistent with data storage for the purpose of executing
+    instructions with the instruction sequence:
+        sync
+        icbi (to any address)
+        isync
+
+PPC_FEATURE_ARCH_2_05
+    The processor supports the v2.05 userlevel architecture. Processors
+    supporting later architectures also set this feature.
+
+PPC_FEATURE_PA6T
+    The processor is PA6T.
+
+PPC_FEATURE_HAS_DFP
+    DFP facility is available.
+
+PPC_FEATURE_POWER6_EXT
+    The processor is POWER6.
+
+PPC_FEATURE_ARCH_2_06
+    The processor supports the v2.06 userlevel architecture. Processors
+    supporting later architectures also set this feature.
+
+PPC_FEATURE_HAS_VSX
+    VSX facility is available.
+
+PPC_FEATURE_PSERIES_PERFMON_COMPAT
+
+PPC_FEATURE_TRUE_LE
+    Reserved, do not use
+
+PPC_FEATURE_PPC_LE
+    Reserved, do not use
+
+4. The hwcaps exposed in AT_HWCAP2
+----------------------------------
+
+PPC_FEATURE2_ARCH_2_07
+    The processor supports the v2.07 userlevel architecture. Processors
+    supporting later architectures also set this feature.
+
+PPC_FEATURE2_HTM
+    Transactional Memory feature is available.
+
+PPC_FEATURE2_DSCR
+    DSCR facility is available.
+
+PPC_FEATURE2_EBB
+    EBB facility is available.
+
+PPC_FEATURE2_ISEL
+    isel instruction is available. This is superseded by ARCH_2_07 and
+    later.
+
+PPC_FEATURE2_TAR
+    TAR facility is available.
+
+PPC_FEATURE2_VEC_CRYPTO
+    v2.07 crypto instructions are available.
+
+PPC_FEATURE2_HTM_NOSC
+    System calls fail if called in a transactional state, see
+    Documentation/powerpc/syscall64-abi.rst
+
+PPC_FEATURE2_ARCH_3_00
+    The processor supports the v3.0B / v3.0C userlevel architecture. Processors
+    supporting later architectures also set this feature.
+
+PPC_FEATURE2_HAS_IEEE128
+    IEEE 128 is available? What instructions/data?
+
+PPC_FEATURE2_DARN
+    darn instruction is available.
+
+PPC_FEATURE2_SCV
+    scv instruction is available.
+
+PPC_FEATURE2_HTM_NO_SUSPEND
+    A limited Transactional Memory facility that does not support suspend is
+    available, see Documentation/powerpc/transactional_memory.rst.
+
+PPC_FEATURE2_ARCH_3_1
+    The processor supports the v3.1 userlevel architecture. Processors
+    supporting later architectures also set this feature.
+
+PPC_FEATURE2_MMA
+    MMA facility is available.
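
Several review comments above ask for a reminder that a facility flag
should be read together with the reported architecture version. The
following is a minimal sketch of that idea, not taken from the patch: a
helper that succeeds only when every requested flag is set in both
AT_HWCAP and AT_HWCAP2. The names struct hwcap_req, hwcaps_satisfy() and
cpu_satisfies() are hypothetical and exist only for illustration;
getauxval(), AT_HWCAP and AT_HWCAP2 are assumed to come from glibc's
<sys/auxv.h>.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/auxv.h>	/* getauxval(), AT_HWCAP, AT_HWCAP2 */

/* Hypothetical: the flags a code path needs from each hwcap word. */
struct hwcap_req {
	unsigned long hwcap;	/* bits required in AT_HWCAP */
	unsigned long hwcap2;	/* bits required in AT_HWCAP2 */
};

/* True only if every requested bit is set in the given hwcap words. */
bool hwcaps_satisfy(unsigned long hwcap, unsigned long hwcap2,
		    const struct hwcap_req *req)
{
	assert(req != NULL);
	return (hwcap & req->hwcap) == req->hwcap &&
	       (hwcap2 & req->hwcap2) == req->hwcap2;
}

/* Same check against the running process's auxiliary vector. */
bool cpu_satisfies(const struct hwcap_req *req)
{
	return hwcaps_satisfy(getauxval(AT_HWCAP), getauxval(AT_HWCAP2), req);
}
```

On powerpc, a caller wanting MMA might fill the request with
.hwcap2 = PPC_FEATURE2_ARCH_3_1 | PPC_FEATURE2_MMA and take a generic
code path when cpu_satisfies() returns false. Keeping the pure
hwcaps_satisfy() separate from the getauxval() wrapper lets the flag
logic be exercised with arbitrary values.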