mbox series

[v3,00/10] powerpc/pseries: New character devices for system parameters and VPD

Message ID 20231025-papr-sys_rtas-vs-lockdown-v3-0-5eb04559e7d8@linux.ibm.com (mailing list archive)
Headers show
Series powerpc/pseries: New character devices for system parameters and VPD | expand

Message

Nathan Lynch via B4 Relay Oct. 26, 2023, 3:24 a.m. UTC
Add character devices that expose PAPR-specific system parameters and
VPD to user space.

The problem: important platform features are enabled on Linux VMs
through the powerpc-specific rtas() syscall in combination with
writeable mappings of /dev/mem. In typical usage, this is encapsulated
behind APIs provided by the librtas library. This paradigm is
incompatible with lockdown, which prohibits /dev/mem access. It also
is too low-level in many cases: a single logical operation may require
multiple sys_rtas() calls in succession to complete. This carries the
risk that a process may exit while leaving an operation unfinished. It
also means that callers must coordinate their use of the syscall for
functions that cannot tolerate multiple concurrent clients, such as
ibm,get-vpd.

The solution presented here is to add a pair of small pseries-specific
"drivers," one for VPD and one for system parameters. The new drivers
expose these facilities to user space in ways that are compatible with
lockdown and require no coordination between their clients.

Since the ibm,get-vpd call sequence performed by the papr-vpd driver
must be serialized against all other uses of the function, the series
begins by adding some new APIs to the core RTAS support code for this
purpose.

Both drivers could potentially support poll() methods to notify
clients of changes to parameters or VPD that happen due to partition
migration and other events. But that should be safe to leave for
later, assuming there's any interest.

I have made changes to librtas to prefer the new interfaces and
verified that existing clients work correctly with the new code. A
draft PR for that work is here:

https://github.com/ibm-power-utilities/librtas/pull/36

The user-space ABI has not changed since v1 of this series.

I expect to propose at least one more small driver in this style for
platform dump retrieval in a separate submission in the future.

---
Changes in v3:
- Add new rtas_function_lock()/unlock() APIs and convert existing code
  to use them.
- Convert papr-vpd to use rtas_function_lock()/unlock() instead of
  having sys_rtas() obtain a driver-private mutex.
- Rebase on current powerpc/next.
- Link to v2: https://lore.kernel.org/r/20231013-papr-sys_rtas-vs-lockdown-v2-0-ead01ce01722@linux.ibm.com

Changes in v2:
- Fix unused-but-set variable warning in papr-sysparm code.
- Rebase on powerpc/next branch.
- Link to v1: https://lore.kernel.org/r/20231006-papr-sys_rtas-vs-lockdown-v1-0-3a36bfb66e2e@linux.ibm.com

Changes in v1 vs initial RFC:
- Add papr-sysparm driver and tests.
- Add a papr-miscdev.h uapi header.
- Prevent sys_rtas() from interfering with papr-vpd call sequences.
- Handle -4 ("VPD changed") status in papr-vpd.
- Include string_helpers.h in papr-vpd.c, per Michal Suchánek
- Link to RFC: https://lore.kernel.org/r/20230822-papr-sys_rtas-vs-lockdown-v1-0-932623cf3c7b@linux.ibm.com

---
Nathan Lynch (10):
      powerpc/rtas: Factor out function descriptor lookup
      powerpc/rtas: Facilitate high-level call sequences
      powerpc/rtas: Serialize firmware activation sequences
      powerpc/rtas: Warn if per-function lock isn't held
      powerpc/uapi: Export papr-miscdev.h header
      powerpc/pseries: Add papr-vpd character driver for VPD retrieval
      powerpc/pseries/papr-sysparm: Validate buffer object lengths
      powerpc/pseries/papr-sysparm: Expose character device to user space
      powerpc/selftests: Add test for papr-vpd
      powerpc/selftests: Add test for papr-sysparm

 Documentation/userspace-api/ioctl/ioctl-number.rst |   4 +
 arch/powerpc/include/asm/papr-sysparm.h            |  17 +-
 arch/powerpc/include/asm/rtas.h                    |   2 +
 arch/powerpc/include/uapi/asm/papr-miscdev.h       |   9 +
 arch/powerpc/include/uapi/asm/papr-sysparm.h       |  58 +++
 arch/powerpc/include/uapi/asm/papr-vpd.h           |  22 +
 arch/powerpc/kernel/rtas.c                         | 157 ++++++-
 arch/powerpc/platforms/pseries/Makefile            |   1 +
 arch/powerpc/platforms/pseries/papr-sysparm.c      | 201 +++++++-
 arch/powerpc/platforms/pseries/papr-vpd.c          | 522 +++++++++++++++++++++
 tools/testing/selftests/powerpc/Makefile           |   2 +
 .../selftests/powerpc/papr_sysparm/.gitignore      |   1 +
 .../selftests/powerpc/papr_sysparm/Makefile        |  12 +
 .../selftests/powerpc/papr_sysparm/papr_sysparm.c  | 164 +++++++
 .../testing/selftests/powerpc/papr_vpd/.gitignore  |   1 +
 tools/testing/selftests/powerpc/papr_vpd/Makefile  |  12 +
 .../testing/selftests/powerpc/papr_vpd/papr_vpd.c  | 352 ++++++++++++++
 17 files changed, 1503 insertions(+), 34 deletions(-)
---
base-commit: 36e826b568e412f61d68fedc02a67b4d8b7583cc
change-id: 20230817-papr-sys_rtas-vs-lockdown-5c54505db792

Best regards,

Comments

Nathan Lynch Oct. 26, 2023, 11:56 p.m. UTC | #1
Nathan Lynch via B4 Relay <devnull+nathanl.linux.ibm.com@kernel.org>
writes:
> I have made changes to librtas to prefer the new interfaces and
> verified that existing clients work correctly with the new code.

Unfortunately I made a mistake in testing this time and introduced a
boot-time oops:

BUG: Kernel NULL pointer dereference on read at 0x00000018
Faulting instruction address: 0xc00000000004223c
Oops: Kernel access of bad area, sig: 7 [#1]
LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA pSeries
Modules linked in:
CPU: 0 PID: 0 Comm: swapper Tainted: G        W          6.6.0-rc2+ #129
NIP:  c00000000004223c LR: c000000000042238 CTR: 0000000000000000
REGS: c000000002c579d0 TRAP: 0300   Tainted: G        W           (6.6.0-rc2+)
MSR:  8000000000001033 <SF,ME,IR,DR,RI,LE>  CR: 28000284  XER: 00000000
CFAR: c000000000042008 DAR: 0000000000000018 DSISR: 00080000 IRQMASK: 3 
GPR00: c000000000042238 c000000002c57c70 c000000001f5eb00 0000000000000000 
GPR04: c00000000294cd08 0000000000000002 c000000002c579b4 0000000000000000 
GPR08: 0000000000000000 0000000000000002 c000000002c0da80 0000000000000000 
GPR12: 0000000000000000 c000000005e40000 0000000000000000 0000000002097728 
GPR16: 0000000000001111 0000000000000001 0000000002097b80 00000000020975b8 
GPR20: 00000000020976f0 00000000020974e8 00000000030feb00 00000000030feb00 
GPR24: 0000000000002008 0000000000000000 0000000000000001 c0000000028f3d70 
GPR28: 0000000002d31020 c000000002cac268 c000000002d31020 0000000000000000 
NIP [c00000000004223c] do_enter_rtas+0xcc/0x460
LR [c000000000042238] do_enter_rtas+0xc8/0x460
Call Trace:
[c000000002c57c70] [c000000000042238] do_enter_rtas+0xc8/0x460 (unreliable)
[c000000002c57cc0] [c000000000042e34] rtas_call+0x434/0x490
[c000000002c57d20] [c0000000000fd584] papr_sysparm_get+0xe4/0x230
[c000000002c57db0] [c0000000020267d0] pSeries_probe+0x2f0/0x5fc
[c000000002c57e80] [c00000000200a318] setup_arch+0x11c/0x524
[c000000002c57f10] [c00000000200418c] start_kernel+0xcc/0xc1c
[c000000002c57fe0] [c00000000000e788] start_here_common+0x1c/0x20

This was introduced by patch #4 "powerpc/rtas: Warn if per-function lock
isn't held": __do_enter_rtas() is now attempting token -> descriptor
lookups unconditionally, before the xarray for that has been initialized.

With that change reverted, the series tests OK.
Michal Suchánek Nov. 13, 2023, 9:16 a.m. UTC | #2
Hello,

On Thu, Oct 26, 2023 at 06:56:36PM -0500, Nathan Lynch wrote:
> Nathan Lynch via B4 Relay <devnull+nathanl.linux.ibm.com@kernel.org>
> writes:
> > I have made changes to librtas to prefer the new interfaces and
> > verified that existing clients work correctly with the new code.
> 
> Unfortunately I made a mistake in testing this time and introduced a
> boot-time oops:
> 
> BUG: Kernel NULL pointer dereference on read at 0x00000018
> Faulting instruction address: 0xc00000000004223c
> Oops: Kernel access of bad area, sig: 7 [#1]
> LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA pSeries
> Modules linked in:
> CPU: 0 PID: 0 Comm: swapper Tainted: G        W          6.6.0-rc2+ #129
> NIP:  c00000000004223c LR: c000000000042238 CTR: 0000000000000000
> REGS: c000000002c579d0 TRAP: 0300   Tainted: G        W           (6.6.0-rc2+)
> MSR:  8000000000001033 <SF,ME,IR,DR,RI,LE>  CR: 28000284  XER: 00000000
> CFAR: c000000000042008 DAR: 0000000000000018 DSISR: 00080000 IRQMASK: 3 
> GPR00: c000000000042238 c000000002c57c70 c000000001f5eb00 0000000000000000 
> GPR04: c00000000294cd08 0000000000000002 c000000002c579b4 0000000000000000 
> GPR08: 0000000000000000 0000000000000002 c000000002c0da80 0000000000000000 
> GPR12: 0000000000000000 c000000005e40000 0000000000000000 0000000002097728 
> GPR16: 0000000000001111 0000000000000001 0000000002097b80 00000000020975b8 
> GPR20: 00000000020976f0 00000000020974e8 00000000030feb00 00000000030feb00 
> GPR24: 0000000000002008 0000000000000000 0000000000000001 c0000000028f3d70 
> GPR28: 0000000002d31020 c000000002cac268 c000000002d31020 0000000000000000 
> NIP [c00000000004223c] do_enter_rtas+0xcc/0x460
> LR [c000000000042238] do_enter_rtas+0xc8/0x460
> Call Trace:
> [c000000002c57c70] [c000000000042238] do_enter_rtas+0xc8/0x460 (unreliable)
> [c000000002c57cc0] [c000000000042e34] rtas_call+0x434/0x490
> [c000000002c57d20] [c0000000000fd584] papr_sysparm_get+0xe4/0x230
> [c000000002c57db0] [c0000000020267d0] pSeries_probe+0x2f0/0x5fc
> [c000000002c57e80] [c00000000200a318] setup_arch+0x11c/0x524
> [c000000002c57f10] [c00000000200418c] start_kernel+0xcc/0xc1c
> [c000000002c57fe0] [c00000000000e788] start_here_common+0x1c/0x20
> 
> This was introduced by patch #4 "powerpc/rtas: Warn if per-function lock
> isn't held": __do_enter_rtas() is now attempting token -> descriptor
> lookups unconditionally, before the xarray for that has been initialized.
> 
> With that change reverted, the series tests OK.

What's the status here?

Can this move on with the 4th patch skipped, or is new revision
expected?

Thanks

Michal
Nathan Lynch Nov. 13, 2023, 1:44 p.m. UTC | #3
Michal Suchánek <msuchanek@suse.de> writes:
> What's the status here?
>
> Can this move on with the 4th patch skipped, or is new revision
> expected?

I would like to get some feedback on the idea for coarse per-function
locking in patch #2 "Facilitate high-level call sequences" since that is
a core change to the RTAS subsystem and new in this revision. But I
intend to send a new version (without a boot bug) this week regardless.