[RFC,00/16] KVM: PPC: Book3S HV: add XIVE native exploitation mode

Message ID 20180423164341.15767-1-clg@kaod.org

Cédric Le Goater April 23, 2018, 4:43 p.m. UTC
Hello,

On the POWER9 processor, the XIVE interrupt controller can control
interrupt sources using MMIO to trigger events, to EOI or to turn off
the sources. Priority management and interrupt acknowledgment are also
controlled by MMIO in the CPU presenter subengine.

PowerNV/baremetal Linux runs natively under XIVE but sPAPR guests need
special support from the hypervisor to do the same. This is called the
XIVE native exploitation mode and today, it can be activated under the
PowerPC Hypervisor, pHyp. However, Linux/KVM lacks XIVE native support
and still offers the old interrupt mode interface using a
XICS-over-XIVE glue which implements the XICS hcalls.

The following series is a proposal to add the same support under KVM.


* KVM XIVE for sPAPR

A new KVM device is introduced for the XIVE native exploitation
mode. It reuses most of the XICS-over-XIVE glue implementation
structures which are internal to KVM but it has a completely different
interface. A set of Hypervisor calls configures the sources and the
event queues and from there, all control is done by the guest through
MMIOs.

These MMIO regions (ESB and TIMA) are exposed to guests in QEMU,
similarly to VFIO, and the associated VMAs are populated dynamically
with the appropriate pages using a fault handler. This is implemented
with a couple of KVM device ioctls.
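
As a rough userspace illustration of the flow above, the sketch below
asks a KVM device for a file descriptor backing one MMIO region and
then mmap()s it. KVM_GET_DEVICE_ATTR and struct kvm_device_attr are
the generic KVM device interface; the group and attribute numbers, the
helper name, and the convention of returning the fd through 'addr' are
assumptions made for the example, not the interface defined by this
series.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <linux/kvm.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical numbers: placeholders, not values from the series. */
#define XIVE_GRP_CTRL_HYPOTHETICAL    1
#define XIVE_GET_ESB_FD_HYPOTHETICAL  1

/*
 * Ask the KVM device 'dev_fd' for an fd backing an MMIO region and
 * map 'len' bytes of it. Returns MAP_FAILED on any error.
 * Assumption: the device writes the region fd to the int at 'addr'.
 */
static void *map_region_from_device(int dev_fd, uint32_t group,
                                    uint64_t attr, size_t len)
{
    int region_fd = -1;
    struct kvm_device_attr da = {
        .group = group,
        .attr  = attr,
        .addr  = (uint64_t)(uintptr_t)&region_fd,
    };
    void *p;

    assert(len > 0);

    if (ioctl(dev_fd, KVM_GET_DEVICE_ATTR, &da) < 0 || region_fd < 0)
        return MAP_FAILED;

    /* Pages are not present yet; they would be faulted in on access. */
    p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, region_fd, 0);

    /* The mapping holds its own reference to the file. */
    close(region_fd);
    return p;
}
```

With an invalid device fd the ioctl fails and the helper returns
MAP_FAILED without attempting the mapping.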

On a POWER9 sPAPR machine, the Client Architecture Support (CAS)
negotiation process determines whether the guest operates with an
interrupt controller using the XICS legacy model, as found on POWER8,
or in XIVE exploitation mode. This means that the KVM interrupt
device should be created at runtime, after the machine has started.
This requires extra KVM support to create/destroy KVM devices. The
last patches are an experimental attempt to solve that problem.
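
To make the ordering problem concrete, here is a small standalone
model of the decision: the interrupt mode is only known once CAS has
completed, so a device created at machine start may have to be
destroyed and replaced. All names (irq_mode, machine, cas_pick_mode,
cas_apply) are hypothetical and the counters merely stand in for real
device creation and destruction; this is a sketch of the sequencing,
not code from the series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model; none of these names come from the series. */
enum irq_mode { IRQ_MODE_NONE, IRQ_MODE_XICS, IRQ_MODE_XIVE };

struct machine {
    enum irq_mode active; /* interrupt device currently instantiated */
    int creates;          /* stands in for KVM device creations */
    int destroys;         /* stands in for KVM device destructions */
};

/* XIVE is chosen only when both the guest and the host support it. */
static enum irq_mode cas_pick_mode(bool guest_wants_xive, bool host_has_xive)
{
    return (guest_wants_xive && host_has_xive) ? IRQ_MODE_XIVE
                                               : IRQ_MODE_XICS;
}

/* Apply the CAS outcome: replace the device only if the mode changes. */
static void cas_apply(struct machine *m, enum irq_mode wanted)
{
    assert(m != NULL);

    if (m->active == wanted)
        return;

    if (m->active != IRQ_MODE_NONE)
        m->destroys++; /* the old device has to go away first */

    m->active = wanted;
    m->creates++;
}
```

A machine that boots with a XICS device and then negotiates XIVE goes
through one destroy and a second create, which is the case the
create/destroy support is meant to cover.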

Migration also raises a couple of issues. The patchset provides the
necessary accessor routines to capture and restore the state of the
different structures used by KVM, OPAL and HW. But as the migration is
sequenced by QEMU, we might not have enough quiescence points to
correctly capture all HW state.


* Caveats

 - VMs migrate under load, but this is still work in progress.

 - resetting the KVM device has some bad consequences on the MMU. Needs
   more care.

 - not much attention given to pass-through


* Github
 
QEMU

  https://github.com/legoater/qemu/commits/xive

Linux/KVM

  https://github.com/legoater/linux/commits/xive

Thanks,

C.

Cédric Le Goater (16):
  powerpc/xive: export flags for the XIVE native exploitation mode
    hcalls
  powerpc/xive: add OPAL extensions for the XIVE native exploitation
    support
  KVM: PPC: Book3S HV: check the IRQ controller type
  KVM: PPC: Book3S HV: export services for the XIVE native exploitation
    device
  KVM: PPC: Book3S HV: add a new KVM device for the XIVE native
    exploitation mode
  KVM: PPC: Book3S HV: add a SET_SOURCE control to the XIVE native
    device
  KVM: PPC: Book3S HV: add a GET_ESB_FD control to the XIVE native
    device
  KVM: PPC: Book3S HV: add a GET_TIMA_FD control to XIVE native device
  KVM: PPC: Book3S HV: add a VC_BASE control to the XIVE native device
  KVM: PPC: Book3S HV: add a EISN attribute to kvmppc_xive_irq_state
  KVM: PPC: Book3S HV: add support for the XIVE native exploitation mode
    hcalls
  powerpc/xive: update HW definitions
  powerpc/xive: record guest queue page address
  KVM: PPC: Book3S HV: add support for XIVE native migration
  KVM: introduce a KVM_DESTROY_DEVICE ioctl
  KVM: PPC: Book3S HV: disconnect vCPU from IRQ device

 arch/powerpc/include/asm/kvm_host.h            |    2 +
 arch/powerpc/include/asm/kvm_ppc.h             |   78 +-
 arch/powerpc/include/asm/opal-api.h            |   13 +-
 arch/powerpc/include/asm/opal.h                |   12 +
 arch/powerpc/include/asm/xive-regs.h           |   45 +
 arch/powerpc/include/asm/xive.h                |   43 +
 arch/powerpc/include/uapi/asm/kvm.h            |   22 +
 arch/powerpc/kvm/Makefile                      |    4 +-
 arch/powerpc/kvm/book3s.c                      |   53 +-
 arch/powerpc/kvm/book3s_hv.c                   |   31 +
 arch/powerpc/kvm/book3s_hv_builtin.c           |  196 ++++
 arch/powerpc/kvm/book3s_hv_rm_xive_native.c    |   47 +
 arch/powerpc/kvm/book3s_hv_rmhandlers.S        |   52 +
 arch/powerpc/kvm/book3s_xics.c                 |    5 +-
 arch/powerpc/kvm/book3s_xive.c                 |  106 +-
 arch/powerpc/kvm/book3s_xive.h                 |   74 ++
 arch/powerpc/kvm/book3s_xive_native.c          | 1257 ++++++++++++++++++++++++
 arch/powerpc/kvm/book3s_xive_native_template.c |  381 +++++++
 arch/powerpc/kvm/powerpc.c                     |   52 +-
 arch/powerpc/platforms/powernv/opal-wrappers.S |    4 +
 arch/powerpc/sysdev/xive/native.c              |  107 ++
 arch/powerpc/sysdev/xive/spapr.c               |   28 +-
 include/uapi/linux/kvm.h                       |    5 +
 virt/kvm/kvm_main.c                            |   40 +
 24 files changed, 2585 insertions(+), 72 deletions(-)
 create mode 100644 arch/powerpc/kvm/book3s_hv_rm_xive_native.c
 create mode 100644 arch/powerpc/kvm/book3s_xive_native.c
 create mode 100644 arch/powerpc/kvm/book3s_xive_native_template.c