From patchwork Thu Feb 14 05:49:20 2013
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Scott Wood
X-Patchwork-Id: 220362
From: Scott Wood
To: Alexander Graf
CC: , , Scott Wood
Subject: [RFC PATCH 6/6] kvm/ppc/mpic: in-kernel MPIC emulation
Date: Wed, 13 Feb 2013 23:49:20 -0600
Message-ID: <1360820960-12537-7-git-send-email-scottwood@freescale.com>
X-Mailer: git-send-email 1.7.9.5
In-Reply-To: <1360820960-12537-1-git-send-email-scottwood@freescale.com>
References: <1360820960-12537-1-git-send-email-scottwood@freescale.com>
X-OriginatorOrg: freescale.com
Sender: kvm-ppc-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: kvm-ppc@vger.kernel.org

Hook the MPIC code up to the KVM interfaces, add locking, etc.

TODO: irqfd support

Signed-off-by: Scott Wood
---
 Documentation/virtual/kvm/devices/mpic.txt |   36 ++
 arch/powerpc/include/asm/kvm_host.h        |    9 +-
 arch/powerpc/include/asm/kvm_ppc.h         |    4 +
 arch/powerpc/kvm/Kconfig                   |    5 +
 arch/powerpc/kvm/Makefile                  |    2 +
 arch/powerpc/kvm/booke.c                   |   10 +-
 arch/powerpc/kvm/mpic.c                    |  875 +++++++++++++++++++++++-----
 arch/powerpc/kvm/powerpc.c                 |   12 +-
 include/linux/kvm_host.h                   |    4 +-
 include/uapi/linux/kvm.h                   |   17 +-
 virt/kvm/kvm_main.c                        |   12 +
 11 files changed, 822 insertions(+), 164 deletions(-)
 create mode 100644 Documentation/virtual/kvm/devices/mpic.txt

diff --git a/Documentation/virtual/kvm/devices/mpic.txt b/Documentation/virtual/kvm/devices/mpic.txt
new file mode 100644
index 0000000..1ef30f0
--- /dev/null
+++ b/Documentation/virtual/kvm/devices/mpic.txt
@@ -0,0 +1,36 @@
+MPIC interrupt controller
+=========================
+
+Device types supported:
+  KVM_DEV_TYPE_FSL_MPIC_20     Freescale MPIC v2.0
+  KVM_DEV_TYPE_FSL_MPIC_42     Freescale MPIC v4.2
+
+Only one MPIC instance, of any type, may be instantiated. The created
+MPIC will act as the system interrupt controller, connecting to each
+vcpu's interrupt inputs.
+
+Groups:
+  KVM_DEV_MPIC_GRP_MISC
+  Attributes:
+    KVM_DEV_MPIC_BASE_ADDR (rw, 64-bit)
+      Base address of the 256 KiB MPIC register space. Must be
+      naturally aligned. A value of zero disables the mapping.
+      Reset value is zero.
+
+  KVM_DEV_MPIC_GRP_REGISTER (rw, 32-bit)
+    Access MPIC register state. "attr" is the byte offset into
+    the MPIC register space. Accesses must be 4-byte aligned.
+
+    MSIs may be signaled by using this attribute group to write
+    to the relevant MSIIR.
+
+  KVM_DEV_MPIC_GRP_IRQ_ACTIVE (rw, 32-bit)
+    IRQ input line for each standard openpic source. 0 is inactive and 1
+    is active, regardless of interrupt sense.
+
+    For edge-triggered interrupts: Writing 1 is considered an activating
+    edge, and writing 0 is ignored. Reading returns 1 if a previously
+    signaled edge has not been acknowledged, and 0 otherwise.
+
+    "attr" is the IRQ number. IRQ numbers for standard sources are the
+    byte offset of the relevant IVPR from EIVPR0, divided by 32.
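
Note on the mpic.txt text above: the two numeric rules it states (natural alignment of the 256 KiB register space, and IRQ number = IVPR byte offset from EIVPR0 divided by 32) can be sketched in plain C as follows. This is only an illustration for readers of the documentation, not code from the patch. The macro and function names are hypothetical, and the constants are assumptions derived from the documented 256 KiB size and 32-byte IVPR spacing.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names for illustration; they do not appear in the patch. */
#define MPIC_REG_SPACE_SIZE 0x40000u /* 256 KiB register space */
#define MPIC_IVPR_STRIDE    32u      /* bytes between successive IVPRs */

/*
 * "Naturally aligned" means the base is a multiple of the region size.
 * Zero passes this check too; per mpic.txt it disables the mapping.
 */
static bool mpic_base_addr_valid(uint64_t base)
{
	return (base & (MPIC_REG_SPACE_SIZE - 1)) == 0;
}

/* IRQ number = byte offset of the IVPR from EIVPR0, divided by 32. */
static uint32_t mpic_irq_from_ivpr_offset(uint32_t offset_from_eivpr0)
{
	return offset_from_eivpr0 / MPIC_IVPR_STRIDE;
}
```

For example, an IVPR located 0x200 bytes past EIVPR0 would correspond to IRQ 16 under this rule, and a base of 0xe0041000 would be rejected because it is not a multiple of 0x40000.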
diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h index 8a72d59..be81c7a 100644 --- a/arch/powerpc/include/asm/kvm_host.h +++ b/arch/powerpc/include/asm/kvm_host.h @@ -256,6 +256,7 @@ struct kvm_arch { #ifdef CONFIG_PPC_BOOK3S_64 struct list_head spapr_tce_tables; #endif + void *irqchip_priv; }; /* @@ -359,6 +360,11 @@ struct kvmppc_slb { #define KVMPPC_BOOKE_MAX_IAC 4 #define KVMPPC_BOOKE_MAX_DAC 2 +/* KVMPPC_EPR_USER takes precedence over KVMPPC_EPR_KERNEL */ +#define KVMPPC_EPR_NONE 0 /* EPR not supported */ +#define KVMPPC_EPR_USER 1 /* exit to userspace to fill EPR */ +#define KVMPPC_EPR_KERNEL 2 /* in-kernel irqchip */ + struct kvmppc_booke_debug_reg { u32 dbcr0; u32 dbcr1; @@ -520,7 +526,7 @@ struct kvm_vcpu_arch { u8 sane; u8 cpu_type; u8 hcall_needed; - u8 epr_enabled; + u8 epr_flags; /* KVMPPC_EPR_xxx */ u8 epr_needed; u32 cpr0_cfgaddr; /* holds the last set cpr0_cfgaddr */ @@ -587,5 +593,6 @@ struct kvm_vcpu_arch { #define KVM_MMIO_REG_FQPR 0x0060 #define __KVM_HAVE_ARCH_WQP +#define __KVM_HAVE_CREATE_DEVICE #endif /* __POWERPC_KVM_HOST_H__ */ diff --git a/arch/powerpc/include/asm/kvm_ppc.h b/arch/powerpc/include/asm/kvm_ppc.h index 44a657a..d46504d 100644 --- a/arch/powerpc/include/asm/kvm_ppc.h +++ b/arch/powerpc/include/asm/kvm_ppc.h @@ -165,6 +165,8 @@ extern int kvmppc_prepare_to_enter(struct kvm_vcpu *vcpu); extern int kvm_vm_ioctl_get_htab_fd(struct kvm *kvm, struct kvm_get_htab_fd *); +int kvm_vcpu_ioctl_interrupt(struct kvm_vcpu *vcpu, struct kvm_interrupt *irq); + /* * Cuts out inst bits with ordering according to spec. * That means the leftmost bit is zero. All given bits are included. 
@@ -271,6 +273,8 @@ static inline void kvmppc_set_epr(struct kvm_vcpu *vcpu, u32 epr) #endif } +void kvmppc_mpic_set_epr(struct kvm_vcpu *vcpu); + int kvm_vcpu_ioctl_config_tlb(struct kvm_vcpu *vcpu, struct kvm_config_tlb *cfg); int kvm_vcpu_ioctl_dirty_tlb(struct kvm_vcpu *vcpu, diff --git a/arch/powerpc/kvm/Kconfig b/arch/powerpc/kvm/Kconfig index 4730c95..18d5e72 100644 --- a/arch/powerpc/kvm/Kconfig +++ b/arch/powerpc/kvm/Kconfig @@ -151,6 +151,11 @@ config KVM_E500MC If unsure, say N. +config KVM_MPIC + bool "KVM in-kernel MPIC emulation" + depends on KVM + + source drivers/vhost/Kconfig endif # VIRTUALIZATION diff --git a/arch/powerpc/kvm/Makefile b/arch/powerpc/kvm/Makefile index b772ede..4a2277a 100644 --- a/arch/powerpc/kvm/Makefile +++ b/arch/powerpc/kvm/Makefile @@ -103,6 +103,8 @@ kvm-book3s_32-objs := \ book3s_32_mmu.o kvm-objs-$(CONFIG_KVM_BOOK3S_32) := $(kvm-book3s_32-objs) +kvm-objs-$(CONFIG_KVM_MPIC) += mpic.o + kvm-objs := $(kvm-objs-m) $(kvm-objs-y) obj-$(CONFIG_KVM_440) += kvm.o diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c index 020923e..8483cb2 100644 --- a/arch/powerpc/kvm/booke.c +++ b/arch/powerpc/kvm/booke.c @@ -347,7 +347,7 @@ static int kvmppc_booke_irqprio_deliver(struct kvm_vcpu *vcpu, keep_irq = true; } - if ((priority == BOOKE_IRQPRIO_EXTERNAL) && vcpu->arch.epr_enabled) + if ((priority == BOOKE_IRQPRIO_EXTERNAL) && vcpu->arch.epr_flags) update_epr = true; switch (priority) { @@ -428,8 +428,12 @@ static int kvmppc_booke_irqprio_deliver(struct kvm_vcpu *vcpu, set_guest_esr(vcpu, vcpu->arch.queued_esr); if (update_dear == true) set_guest_dear(vcpu, vcpu->arch.queued_dear); - if (update_epr == true) - kvm_make_request(KVM_REQ_EPR_EXIT, vcpu); + if (update_epr == true) { + if (vcpu->arch.epr_flags & KVMPPC_EPR_USER) + kvm_make_request(KVM_REQ_EPR_EXIT, vcpu); + else if (vcpu->arch.epr_flags & KVMPPC_EPR_KERNEL) + kvmppc_mpic_set_epr(vcpu); + } new_msr &= msr_mask; #if defined(CONFIG_64BIT) diff --git 
a/arch/powerpc/kvm/mpic.c b/arch/powerpc/kvm/mpic.c index 1df67ae..27040e4 100644 --- a/arch/powerpc/kvm/mpic.c +++ b/arch/powerpc/kvm/mpic.c @@ -23,6 +23,18 @@ * THE SOFTWARE. */ +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include "iodev.h" + #define MAX_CPU 32 #define MAX_SRC 256 #define MAX_TMR 4 @@ -89,6 +101,7 @@ static struct fsl_mpic_info fsl_mpic_42 = { #define ILR_INTTGT_INT 0x00 #define ILR_INTTGT_CINT 0x01 /* critical */ #define ILR_INTTGT_MCP 0x02 /* machine check */ +#define NUM_OUTPUTS 3 #define MSIIR_OFFSET 0x140 #define MSIIR_SRS_SHIFT 29 @@ -98,18 +111,14 @@ static struct fsl_mpic_info fsl_mpic_42 = { static int get_current_cpu(void) { - CPUState *cpu_single_cpu; - - if (!cpu_single_env) - return -1; - - cpu_single_cpu = ENV_GET_CPU(cpu_single_env); - return cpu_single_cpu->cpu_index; + struct kvm_vcpu *vcpu = current->thread.kvm_vcpu; + return vcpu ? vcpu->vcpu_id : -1; } -static uint32_t openpic_cpu_read_internal(void *opaque, gpa_t addr, int idx); -static void openpic_cpu_write_internal(void *opaque, gpa_t addr, - uint32_t val, int idx); +static int openpic_cpu_write_internal(struct kvm_io_device *this, gpa_t addr, + u32 val, int idx); +static int openpic_cpu_read_internal(struct kvm_io_device *this, gpa_t addr, + u32 *ptr, int idx); enum irq_type { IRQ_TYPE_NORMAL = 0, @@ -131,7 +140,7 @@ struct irq_source { uint32_t idr; /* IRQ destination register */ uint32_t destmask; /* bitmap of CPU destinations */ int last_cpu; - int output; /* IRQ level, e.g. OPENPIC_OUTPUT_INT */ + int output; /* IRQ level, e.g. 
ILR_INTTGT_INT */ int pending; /* TRUE if IRQ is pending */ enum irq_type type; bool level:1; /* level-triggered */ @@ -158,16 +167,35 @@ struct irq_source { #define IDR_CI 0x40000000 /* critical interrupt */ struct irq_dest { + struct kvm_vcpu *vcpu; + int32_t ctpr; /* CPU current task priority */ struct irq_queue raised; struct irq_queue servicing; - qemu_irq *irqs; /* Count of IRQ sources asserting on non-INT outputs */ - uint32_t outputs_active[OPENPIC_OUTPUT_NB]; + uint32_t outputs_active[NUM_OUTPUTS]; +}; + +struct openpic; + +struct sub_region { + struct kvm_io_device iodev; + struct openpic *opp; + gpa_t base; + int size; }; struct openpic { + struct kvm_device dev; + struct kvm *kvm; + gpa_t reg_base; + spinlock_t lock; + struct notifier_block vcpu_notifier; + + struct sub_region sub_io_mem[6]; + int sub_count; + /* Behavior control */ struct fsl_mpic_info *fsl; uint32_t model; @@ -208,6 +236,51 @@ struct openpic { uint32_t irq_msi; }; + +static void mpic_irq_raise(struct openpic *opp, struct irq_dest *dst, + int output) +{ + struct kvm_interrupt irq = { + .irq = KVM_INTERRUPT_SET_LEVEL, + }; + + if (!dst->vcpu) { + pr_debug("%s: destination cpu %d does not exist\n", + __func__, dst - &opp->dst[0]); + return; + } + + pr_debug("%s: cpu %d output %d\n", __func__, dst->vcpu->vcpu_id, + output); + + if (output != ILR_INTTGT_INT) /* TODO */ + return; + + kvm_vcpu_ioctl_interrupt(dst->vcpu, &irq); +} + +static void mpic_irq_lower(struct openpic *opp, struct irq_dest *dst, + int output) +{ + struct kvm_interrupt irq = { + .irq = KVM_INTERRUPT_UNSET, + }; + + if (!dst->vcpu) { + pr_debug("%s: destination cpu %d does not exist\n", + __func__, dst - &opp->dst[0]); + return; + } + + pr_debug("%s: cpu %d output %d\n", __func__, dst->vcpu->vcpu_id, + output); + + if (output != ILR_INTTGT_INT) /* TODO */ + return; + + kvmppc_core_dequeue_external(dst->vcpu, &irq); +} + static inline void IRQ_setbit(struct irq_queue *q, int n_IRQ) { set_bit(n_IRQ, q->queue); @@ -268,7 
+341,7 @@ static void IRQ_local_pipe(struct openpic *opp, int n_CPU, int n_IRQ, pr_debug("%s: IRQ %d active %d was %d\n", __func__, n_IRQ, active, was_active); - if (src->output != OPENPIC_OUTPUT_INT) { + if (src->output != ILR_INTTGT_INT) { pr_debug("%s: output %d irq %d active %d was %d count %d\n", __func__, src->output, n_IRQ, active, was_active, dst->outputs_active[src->output]); @@ -282,14 +355,14 @@ static void IRQ_local_pipe(struct openpic *opp, int n_CPU, int n_IRQ, dst->outputs_active[src->output]++ == 0) { pr_debug("%s: Raise OpenPIC output %d cpu %d irq %d\n", __func__, src->output, n_CPU, n_IRQ); - qemu_irq_raise(dst->irqs[src->output]); + mpic_irq_raise(opp, dst, src->output); } } else { if (was_active && --dst->outputs_active[src->output] == 0) { pr_debug("%s: Lower OpenPIC output %d cpu %d irq %d\n", __func__, src->output, n_CPU, n_IRQ); - qemu_irq_lower(dst->irqs[src->output]); + mpic_irq_lower(opp, dst, src->output); } } @@ -322,8 +395,7 @@ static void IRQ_local_pipe(struct openpic *opp, int n_CPU, int n_IRQ, } else { pr_debug("%s: Raise OpenPIC INT output cpu %d irq %d/%d\n", __func__, n_CPU, n_IRQ, dst->raised.next); - qemu_irq_raise(opp->dst[n_CPU]. - irqs[OPENPIC_OUTPUT_INT]); + mpic_irq_raise(opp, dst, ILR_INTTGT_INT); } } else { IRQ_get_next(opp, &dst->servicing); @@ -338,8 +410,7 @@ static void IRQ_local_pipe(struct openpic *opp, int n_CPU, int n_IRQ, pr_debug("%s: IRQ %d inactive, current prio %d/%d, CPU %d\n", __func__, n_IRQ, dst->ctpr, dst->servicing.priority, n_CPU); - qemu_irq_lower(opp->dst[n_CPU]. 
- irqs[OPENPIC_OUTPUT_INT]); + mpic_irq_lower(opp, dst, ILR_INTTGT_INT); } } } @@ -415,8 +486,8 @@ static void openpic_set_irq(void *opaque, int n_IRQ, int level) struct irq_source *src; if (n_IRQ >= MAX_IRQ) { - pr_err("%s: IRQ %d out of range\n", __func__, n_IRQ); - abort(); + WARN_ONCE(1, "%s: IRQ %d out of range\n", __func__, n_IRQ); + return; } src = &opp->src[n_IRQ]; @@ -433,7 +504,7 @@ static void openpic_set_irq(void *opaque, int n_IRQ, int level) openpic_update_irq(opp, n_IRQ); } - if (src->output != OPENPIC_OUTPUT_INT) { + if (src->output != ILR_INTTGT_INT) { /* Edge-triggered interrupts shouldn't be used * with non-INT delivery, but just in case, * try to make it do something sane rather than @@ -446,15 +517,14 @@ static void openpic_set_irq(void *opaque, int n_IRQ, int level) } } -static void openpic_reset(DeviceState *d) +static void openpic_reset(struct openpic *opp) { - struct openpic *opp = FROM_SYSBUS(typeof(*opp), SYS_BUS_DEVICE(d)); int i; opp->gcr = GCR_RESET; + /* Initialise controller registers */ opp->frr = ((opp->nb_irqs - 1) << FRR_NIRQ_SHIFT) | - ((opp->nb_cpus - 1) << FRR_NCPU_SHIFT) | (opp->vid << FRR_VID_SHIFT); opp->pir = 0; @@ -504,7 +574,7 @@ static inline uint32_t read_IRQreg_idr(struct openpic *opp, int n_IRQ) static inline uint32_t read_IRQreg_ilr(struct openpic *opp, int n_IRQ) { if (opp->flags & OPENPIC_FLAG_ILR) - return output_to_inttgt(opp->src[n_IRQ].output); + return opp->src[n_IRQ].output; return 0xffffffff; } @@ -539,7 +609,7 @@ static inline void write_IRQreg_idr(struct openpic *opp, int n_IRQ, __func__); } - src->output = OPENPIC_OUTPUT_CINT; + src->output = ILR_INTTGT_CINT; src->nomask = true; src->destmask = 0; @@ -550,7 +620,7 @@ static inline void write_IRQreg_idr(struct openpic *opp, int n_IRQ, src->destmask |= 1UL << i; } } else { - src->output = OPENPIC_OUTPUT_INT; + src->output = ILR_INTTGT_INT; src->nomask = false; src->destmask = src->idr & normal_mask; } @@ -565,7 +635,7 @@ static inline void 
write_IRQreg_ilr(struct openpic *opp, int n_IRQ, if (opp->flags & OPENPIC_FLAG_ILR) { struct irq_source *src = &opp->src[n_IRQ]; - src->output = inttgt_to_output(val & ILR_INTTGT_MASK); + src->output = val & ILR_INTTGT_MASK; pr_debug("Set ILR %d to 0x%08x, output %d\n", n_IRQ, src->idr, src->output); @@ -614,34 +684,77 @@ static inline void write_IRQreg_ivpr(struct openpic *opp, int n_IRQ, static void openpic_gcr_write(struct openpic *opp, uint64_t val) { +#if 0 bool mpic_proxy = false; +#endif if (val & GCR_RESET) { - openpic_reset(&opp->busdev.qdev); + openpic_reset(opp); return; } opp->gcr &= ~opp->mpic_mode_mask; opp->gcr |= val & opp->mpic_mode_mask; - +#if 0 /* Set external proxy mode */ if ((val & opp->mpic_mode_mask) == GCR_MODE_PROXY) mpic_proxy = true; ppce500_set_mpic_proxy(mpic_proxy); +#endif } -static void openpic_gbl_write(void *opaque, gpa_t addr, uint64_t val, - unsigned len) +static int openpic_get_val32(int len, const void *ptr, u32 *val) { - struct openpic *opp = opaque; + if (len != 4) { + pr_debug("%s: bad length %d\n", __func__, len); + return -EINVAL; + } + + memcpy(val, ptr, min(len, 4)); + return 0; +} + +static int openpic_put_val32(int len, void *ptr, u32 val) +{ + /* + * Technically only 32-bit accesses are allowed, but be nice + * to people dumping registers -- it works in real hardware + * (reads only, not writes). 
+ */ + if (len > 4) { + pr_debug("%s: bad length %d\n", __func__, len); + return -EINVAL; + } + + memcpy(ptr, &val, min(len, 4)); + return 0; +} + +static int openpic_gbl_write(struct kvm_io_device *this, gpa_t addr, + int len, const void *ptr) +{ + struct sub_region *sub = container_of(this, struct sub_region, iodev); + struct openpic *opp = sub->opp; +#if 0 struct irq_dest *dst; - int idx; +#endif + u32 val; + int ret, idx; + + addr -= sub->base; + if (addr > sub->size) + return 1; - pr_debug("%s: addr %#" HWADDR_PRIx " <= %08" PRIx64 "\n", - __func__, addr, val); + ret = openpic_get_val32(len, ptr, &val); + if (ret < 0) + return 1; + + pr_debug("%s: addr %#llx <= %08x\n", __func__, addr, val); if (addr & 0xF) - return; + return 0; + + spin_lock_irq(&opp->lock); switch (addr) { case 0x00: /* Block Revision Register1 (BRR1) is Readonly */ @@ -654,7 +767,7 @@ static void openpic_gbl_write(void *opaque, gpa_t addr, uint64_t val, case 0x90: case 0xA0: case 0xB0: - openpic_cpu_write_internal(opp, addr, val, get_current_cpu()); + openpic_cpu_write_internal(this, addr, val, get_current_cpu()); break; case 0x1000: /* FRR */ break; @@ -668,14 +781,18 @@ static void openpic_gbl_write(void *opaque, gpa_t addr, uint64_t val, if ((val & (1 << idx)) && !(opp->pir & (1 << idx))) { pr_debug("Raise OpenPIC RESET output for CPU %d\n", idx); +#if 0 dst = &opp->dst[idx]; - qemu_irq_raise(dst->irqs[OPENPIC_OUTPUT_RESET]); + mpic_irq_raise(opp, dst, OPENPIC_OUTPUT_RESET); +#endif } else if (!(val & (1 << idx)) && (opp->pir & (1 << idx))) { pr_debug("Lower OpenPIC RESET output for CPU %d\n", idx); +#if 0 dst = &opp->dst[idx]; - qemu_irq_lower(dst->irqs[OPENPIC_OUTPUT_RESET]); + mpic_irq_lower(opp, dst, OPENPIC_OUTPUT_RESET); +#endif } } opp->pir = val; @@ -695,21 +812,34 @@ static void openpic_gbl_write(void *opaque, gpa_t addr, uint64_t val, default: break; } + + spin_unlock_irq(&opp->lock); + return 0; } -static uint64_t openpic_gbl_read(void *opaque, gpa_t addr, unsigned len) 
+static int openpic_gbl_read(struct kvm_io_device *this, gpa_t addr, + int len, void *ptr) { - struct openpic *opp = opaque; - uint32_t retval; + struct sub_region *sub = container_of(this, struct sub_region, iodev); + struct openpic *opp = sub->opp; + u32 retval; + int ret; + + addr -= sub->base; + if (addr > sub->size) + return 1; - pr_debug("%s: addr %#" HWADDR_PRIx "\n", __func__, addr); + pr_debug("%s: addr %#llx\n", __func__, addr); retval = 0xFFFFFFFF; if (addr & 0xF) - return retval; + goto out; + + spin_lock_irq(&opp->lock); switch (addr) { case 0x1000: /* FRR */ retval = opp->frr; + retval |= (opp->nb_cpus - 1) << FRR_NCPU_SHIFT; break; case 0x1020: /* GCR */ retval = opp->gcr; @@ -731,8 +861,8 @@ static uint64_t openpic_gbl_read(void *opaque, gpa_t addr, unsigned len) case 0x90: case 0xA0: case 0xB0: - retval = - openpic_cpu_read_internal(opp, addr, get_current_cpu()); + openpic_cpu_read_internal(this, addr, + &retval, get_current_cpu()); break; case 0x10A0: /* IPI_IVPR */ case 0x10B0: @@ -750,33 +880,51 @@ static uint64_t openpic_gbl_read(void *opaque, gpa_t addr, unsigned len) default: break; } + + spin_unlock_irq(&opp->lock); +out: pr_debug("%s: => 0x%08x\n", __func__, retval); - return retval; + ret = openpic_put_val32(len, ptr, retval); + if (ret < 0) + return 1; + + return 0; } -static void openpic_tmr_write(void *opaque, gpa_t addr, uint64_t val, - unsigned len) +static int openpic_tmr_write(struct kvm_io_device *this, gpa_t addr, + int len, const void *ptr) { - struct openpic *opp = opaque; - int idx; + struct sub_region *sub = container_of(this, struct sub_region, iodev); + struct openpic *opp = sub->opp; + u32 val; + int ret, idx; + + addr -= sub->base; + if (addr > sub->size) + return 1; + + ret = openpic_get_val32(len, ptr, &val); + if (ret < 0) + return 1; addr += 0x10f0; - pr_debug("%s: addr %#" HWADDR_PRIx " <= %08" PRIx64 "\n", - __func__, addr, val); + pr_debug("%s: addr %#llx <= %08x\n", __func__, addr, val); if (addr & 0xF) - 
return; + return 0; if (addr == 0x10f0) { /* TFRR */ opp->tfrr = val; - return; + return 0; } idx = (addr >> 6) & 0x3; addr = addr & 0x30; + spin_lock_irq(&opp->lock); + switch (addr & 0x30) { case 0x00: /* TCCR */ break; @@ -795,15 +943,25 @@ static void openpic_tmr_write(void *opaque, gpa_t addr, uint64_t val, write_IRQreg_idr(opp, opp->irq_tim0 + idx, val); break; } + + spin_unlock_irq(&opp->lock); + + return 0; } -static uint64_t openpic_tmr_read(void *opaque, gpa_t addr, unsigned len) +static int openpic_tmr_read(struct kvm_io_device *this, gpa_t addr, + int len, void *ptr) { - struct openpic *opp = opaque; + struct sub_region *sub = container_of(this, struct sub_region, iodev); + struct openpic *opp = sub->opp; uint32_t retval = -1; - int idx; + int ret, idx; + + addr -= sub->base; + if (addr > sub->size) + return 1; - pr_debug("%s: addr %#" HWADDR_PRIx "\n", __func__, addr); + pr_debug("%s: addr %#llx\n", __func__, addr); if (addr & 0xF) goto out; @@ -813,6 +971,9 @@ static uint64_t openpic_tmr_read(void *opaque, gpa_t addr, unsigned len) retval = opp->tfrr; goto out; } + + spin_lock_irq(&opp->lock); + switch (addr & 0x30) { case 0x00: /* TCCR */ retval = opp->timers[idx].tccr; @@ -828,24 +989,40 @@ static uint64_t openpic_tmr_read(void *opaque, gpa_t addr, unsigned len) break; } + spin_unlock_irq(&opp->lock); out: pr_debug("%s: => 0x%08x\n", __func__, retval); - return retval; + ret = openpic_put_val32(len, ptr, retval); + if (ret < 0) + return 1; + + return 0; } -static void openpic_src_write(void *opaque, gpa_t addr, uint64_t val, - unsigned len) +static int openpic_src_write(struct kvm_io_device *this, gpa_t addr, + int len, const void *ptr) { - struct openpic *opp = opaque; - int idx; + struct sub_region *sub = container_of(this, struct sub_region, iodev); + struct openpic *opp = sub->opp; + u32 val; + int ret, idx; - pr_debug("%s: addr %#" HWADDR_PRIx " <= %08" PRIx64 "\n", - __func__, addr, val); + addr -= sub->base; + if (addr > sub->size) + return 
1; + + ret = openpic_get_val32(len, ptr, &val); + if (ret < 0) + return 1; + + pr_debug("%s: addr %#llx <= %08x\n", __func__, addr, val); addr = addr & 0xffff; idx = addr >> 5; + spin_lock_irq(&opp->lock); + switch (addr & 0x1f) { case 0x00: write_IRQreg_ivpr(opp, idx, val); @@ -857,20 +1034,32 @@ static void openpic_src_write(void *opaque, gpa_t addr, uint64_t val, write_IRQreg_ilr(opp, idx, val); break; } + + + spin_unlock_irq(&opp->lock); + return 0; } -static uint64_t openpic_src_read(void *opaque, uint64_t addr, unsigned len) +static int openpic_src_read(struct kvm_io_device *this, uint64_t addr, + int len, void *ptr) { - struct openpic *opp = opaque; + struct sub_region *sub = container_of(this, struct sub_region, iodev); + struct openpic *opp = sub->opp; uint32_t retval; - int idx; + int ret, idx; + + addr -= sub->base; + if (addr > sub->size) + return 1; - pr_debug("%s: addr %#" HWADDR_PRIx "\n", __func__, addr); + pr_debug("%s: addr %#llx\n", __func__, addr); retval = 0xFFFFFFFF; addr = addr & 0xffff; idx = addr >> 5; + spin_lock_irq(&opp->lock); + switch (addr & 0x1f) { case 0x00: retval = read_IRQreg_ivpr(opp, idx); @@ -883,21 +1072,38 @@ static uint64_t openpic_src_read(void *opaque, uint64_t addr, unsigned len) break; } + spin_unlock_irq(&opp->lock); pr_debug("%s: => 0x%08x\n", __func__, retval); - return retval; + + ret = openpic_put_val32(len, ptr, retval); + if (ret < 0) + return 1; + + return 0; } -static void openpic_msi_write(void *opaque, gpa_t addr, uint64_t val, - unsigned size) +static int openpic_msi_write(struct kvm_io_device *this, gpa_t addr, + int len, const void *ptr) { - struct openpic *opp = opaque; + struct sub_region *sub = container_of(this, struct sub_region, iodev); + struct openpic *opp = sub->opp; + u32 val; int idx = opp->irq_msi; - int srs, ibs; + int srs, ibs, ret; + + addr -= sub->base; + if (addr > sub->size) + return 1; + + ret = openpic_get_val32(len, ptr, &val); + if (ret < 0) + return 1; - pr_debug("%s: addr %#" 
HWADDR_PRIx " <= 0x%08" PRIx64 "\n", - __func__, addr, val); + pr_debug("%s: addr %#llx <= 0x%08x\n", __func__, addr, val); if (addr & 0xF) - return; + return 0; + + spin_lock_irq(&opp->lock); switch (addr) { case MSIIR_OFFSET: @@ -911,20 +1117,31 @@ static void openpic_msi_write(void *opaque, gpa_t addr, uint64_t val, /* most registers are read-only, thus ignored */ break; } + + spin_unlock_irq(&opp->lock); + return 0; } -static uint64_t openpic_msi_read(void *opaque, gpa_t addr, unsigned size) +static int openpic_msi_read(struct kvm_io_device *this, gpa_t addr, + int len, void *ptr) { - struct openpic *opp = opaque; - uint64_t r = 0; - int i, srs; + struct sub_region *sub = container_of(this, struct sub_region, iodev); + struct openpic *opp = sub->opp; + uint32_t r = 0; + int i, srs, ret; + + addr -= sub->base; + if (addr > sub->size) + return 1; - pr_debug("%s: addr %#" HWADDR_PRIx "\n", __func__, addr); + pr_debug("%s: addr %#llx\n", __func__, addr); if (addr & 0xF) - return -1; + return 1; srs = addr >> 4; + spin_lock_irq(&opp->lock); + switch (addr) { case 0x00: case 0x10: @@ -945,45 +1162,76 @@ static uint64_t openpic_msi_read(void *opaque, gpa_t addr, unsigned size) break; } - return r; + spin_unlock_irq(&opp->lock); + pr_debug("%s: => 0x%08x\n", __func__, r); + + ret = openpic_put_val32(len, ptr, r); + if (ret < 0) + return 1; + + return 0; } -static uint64_t openpic_summary_read(void *opaque, gpa_t addr, unsigned size) +static int openpic_summary_read(struct kvm_io_device *this, gpa_t addr, + int len, void *ptr) { - uint64_t r = 0; + struct sub_region *sub = container_of(this, struct sub_region, iodev); + uint32_t r = 0; + int ret; - pr_debug("%s: addr %#" HWADDR_PRIx "\n", __func__, addr); + addr -= sub->base; + if (addr > sub->size) + return 1; + + pr_debug("%s: addr %#llx\n", __func__, addr); /* TODO: EISR/EIMR */ - return r; + ret = openpic_put_val32(len, ptr, r); + if (ret < 0) + return 1; + + return 0; } -static void openpic_summary_write(void 
*opaque, gpa_t addr, uint64_t val, - unsigned size) +static int openpic_summary_write(struct kvm_io_device *this, gpa_t addr, + int len, const void *ptr) { - pr_debug("%s: addr %#" HWADDR_PRIx " <= 0x%08" PRIx64 "\n", - __func__, addr, val); + struct sub_region *sub = container_of(this, struct sub_region, iodev); + int ret; + uint32_t val; + + addr -= sub->base; + if (addr > sub->size) + return 1; + + ret = openpic_get_val32(len, ptr, &val); + if (ret < 0) + return 1; + + pr_debug("%s: addr %#llx <= 0x%08x\n", __func__, addr, val); /* TODO: EISR/EIMR */ + return 0; } -static void openpic_cpu_write_internal(void *opaque, gpa_t addr, - uint32_t val, int idx) +static int openpic_cpu_write_internal(struct kvm_io_device *this, gpa_t addr, + u32 val, int idx) { - struct openpic *opp = opaque; + struct sub_region *sub = container_of(this, struct sub_region, iodev); + struct openpic *opp = sub->opp; struct irq_source *src; struct irq_dest *dst; int s_IRQ, n_IRQ; - pr_debug("%s: cpu %d addr %#" HWADDR_PRIx " <= 0x%08x\n", __func__, idx, + pr_debug("%s: cpu %d addr %#llx <= 0x%08x\n", __func__, idx, addr, val); if (idx < 0) - return; + return 0; if (addr & 0xF) - return; + return 0; dst = &opp->dst[idx]; addr &= 0xFF0; @@ -1008,11 +1256,11 @@ static void openpic_cpu_write_internal(void *opaque, gpa_t addr, if (dst->raised.priority <= dst->ctpr) { pr_debug("%s: Lower OpenPIC INT output cpu %d due to ctpr\n", __func__, idx); - qemu_irq_lower(dst->irqs[OPENPIC_OUTPUT_INT]); + mpic_irq_lower(opp, dst, ILR_INTTGT_INT); } else if (dst->raised.priority > dst->servicing.priority) { pr_debug("%s: Raise OpenPIC INT output cpu %d irq %d\n", __func__, idx, dst->raised.next); - qemu_irq_raise(dst->irqs[OPENPIC_OUTPUT_INT]); + mpic_irq_raise(opp, dst, ILR_INTTGT_INT); } break; @@ -1043,18 +1291,38 @@ static void openpic_cpu_write_internal(void *opaque, gpa_t addr, IVPR_PRIORITY(src->ivpr) > dst->servicing.priority)) { pr_debug("Raise OpenPIC INT output cpu %d irq %d\n", idx, n_IRQ); - 
qemu_irq_raise(opp->dst[idx].irqs[OPENPIC_OUTPUT_INT]);
+			mpic_irq_raise(opp, dst, ILR_INTTGT_INT);
 		}
 		break;
 	default:
 		break;
 	}
+
+	return 0;
 }
 
-static void openpic_cpu_write(void *opaque, gpa_t addr, uint64_t val,
-			      unsigned len)
+static int openpic_cpu_write(struct kvm_io_device *this, gpa_t addr,
+			     int len, const void *ptr)
 {
-	openpic_cpu_write_internal(opaque, addr, val, (addr & 0x1f000) >> 12);
+	struct sub_region *sub = container_of(this, struct sub_region, iodev);
+	struct openpic *opp = sub->opp;
+	u32 val;
+	int ret;
+
+	addr -= sub->base;
+	if (addr > sub->size)
+		return 1;
+
+	ret = openpic_get_val32(len, ptr, &val);
+	if (ret < 0)
+		return 1;
+
+	spin_lock_irq(&opp->lock);
+	ret = openpic_cpu_write_internal(this, addr, val,
+					 (addr & 0x1f000) >> 12);
+
+	spin_unlock_irq(&opp->lock);
+	return ret;
 }
 
 static uint32_t openpic_iack(struct openpic *opp, struct irq_dest *dst,
@@ -1064,7 +1332,7 @@ static uint32_t openpic_iack(struct openpic *opp, struct irq_dest *dst,
 	int retval, irq;
 
 	pr_debug("Lower OpenPIC INT output\n");
-	qemu_irq_lower(dst->irqs[OPENPIC_OUTPUT_INT]);
+	mpic_irq_lower(opp, dst, ILR_INTTGT_INT);
 
 	irq = IRQ_get_next(opp, &dst->raised);
 	pr_debug("IACK: irq=%d\n", irq);
@@ -1107,20 +1375,37 @@ static uint32_t openpic_iack(struct openpic *opp, struct irq_dest *dst,
 	return retval;
 }
 
-static uint32_t openpic_cpu_read_internal(void *opaque, gpa_t addr, int idx)
+void kvmppc_mpic_set_epr(struct kvm_vcpu *vcpu)
 {
-	struct openpic *opp = opaque;
+	struct kvm *kvm = vcpu->kvm;
+	struct openpic *opp = kvm->arch.irqchip_priv;
+	int cpu = vcpu->vcpu_id;
+	unsigned long flags;
+
+	spin_lock_irqsave(&opp->lock, flags);
+
+	if ((opp->gcr & opp->mpic_mode_mask) == GCR_MODE_PROXY)
+		kvmppc_set_epr(vcpu, openpic_iack(opp, &opp->dst[cpu], cpu));
+
+	spin_unlock_irqrestore(&opp->lock, flags);
+}
+
+static int openpic_cpu_read_internal(struct kvm_io_device *this, gpa_t addr,
+				     u32 *ptr, int idx)
+{
+	struct sub_region *sub = container_of(this, struct sub_region, iodev);
+	struct openpic *opp = sub->opp;
 	struct irq_dest *dst;
 	uint32_t retval;
 
-	pr_debug("%s: cpu %d addr %#" HWADDR_PRIx "\n", __func__, idx, addr);
+	pr_debug("%s: cpu %d addr %#llx\n", __func__, idx, addr);
 	retval = 0xFFFFFFFF;
 
 	if (idx < 0)
-		return retval;
+		goto out;
 
 	if (addr & 0xF)
-		return retval;
+		goto out;
 
 	dst = &opp->dst[idx];
 	addr &= 0xFF0;
@@ -1142,12 +1427,35 @@ static uint32_t openpic_cpu_read_internal(void *opaque, gpa_t addr, int idx)
 	}
 	pr_debug("%s: => 0x%08x\n", __func__, retval);
 
-	return retval;
+out:
+	*ptr = retval;
+	return 0;
 }
 
-static uint64_t openpic_cpu_read(void *opaque, gpa_t addr, unsigned len)
+static int openpic_cpu_read(struct kvm_io_device *this, gpa_t addr,
+			    int len, void *ptr)
 {
-	return openpic_cpu_read_internal(opaque, addr, (addr & 0x1f000) >> 12);
+	struct sub_region *sub = container_of(this, struct sub_region, iodev);
+	struct openpic *opp = sub->opp;
+	int ret;
+	u32 val;
+
+	addr -= sub->base;
+	if (addr > sub->size)
+		return 1;
+
+	spin_lock_irq(&opp->lock);
+	ret = openpic_cpu_read_internal(this, addr, &val,
+					(addr & 0x1f000) >> 12);
+	spin_unlock_irq(&opp->lock);
+	if (ret < 0)
+		return 1;
+
+	ret = openpic_put_val32(len, ptr, val);
+	if (ret < 0)
+		return 1;
+
+	return 0;
 }
 
 static const struct kvm_io_device_ops openpic_glb_ops_be = {
@@ -1205,11 +1513,10 @@ static void fsl_common_init(struct openpic *opp)
 	opp->irq_tim0 = virq;
 	virq += MAX_TMR;
 
-	assert(virq <= MAX_IRQ);
+	BUG_ON(virq > MAX_IRQ);
 
 	opp->irq_msi = 224;
 
-	msi_supported = true;
 	for (i = 0; i < opp->fsl->max_ext; i++)
 		opp->src[i].level = false;
 
@@ -1226,39 +1533,55 @@ static void fsl_common_init(struct openpic *opp)
 	}
 }
 
-static void map_list(struct openpic *opp, const struct mem_reg *list,
-		     int *count)
+static void map_list(struct openpic *opp, const struct mem_reg *list)
 {
+	mutex_lock(&opp->kvm->slots_lock);
+
 	while (list->name) {
-		assert(*count < ARRAY_SIZE(opp->sub_io_mem));
+		struct sub_region *sub;
+
+		BUG_ON(opp->sub_count >= ARRAY_SIZE(opp->sub_io_mem));
 
-		memory_region_init_io(&opp->sub_io_mem[*count], list->ops, opp,
-				      list->name, list->size);
+		sub = &opp->sub_io_mem[opp->sub_count];
+		sub->opp = opp;
+		sub->base = opp->reg_base + list->start_addr;
+		sub->size = list->size;
 
-		memory_region_add_subregion(&opp->mem, list->start_addr,
-					    &opp->sub_io_mem[*count]);
+		kvm_iodevice_init(&sub->iodev, list->ops);
 
-		(*count)++;
+		kvm_io_bus_register_dev(opp->kvm, KVM_MMIO_BUS,
+					opp->reg_base + list->start_addr, list->size,
+					&sub->iodev);
+
+		opp->sub_count++;
 		list++;
 	}
+
+	mutex_unlock(&opp->kvm->slots_lock);
+}
+
+static void unmap_all(struct openpic *opp)
+{
+	int i;
+
+	mutex_lock(&opp->kvm->slots_lock);
+
+	for (i = 0; i < opp->sub_count; i++) {
+		kvm_io_bus_unregister_dev(opp->kvm, KVM_MMIO_BUS,
+					  &opp->sub_io_mem[i].iodev);
+	}
+
+	mutex_unlock(&opp->kvm->slots_lock);
+
+	opp->sub_count = 0;
 }
 
-static int openpic_init(SysBusDevice *dev)
+static int set_base_addr(struct kvm *kvm, struct kvm_device *dev,
+			 struct kvm_device_attr *attr)
 {
-	struct openpic *opp = FROM_SYSBUS(typeof(*opp), dev);
-	int i, j;
-	int list_count = 0;
-	static const struct mem_reg list_le[] = {
-		{"glb", &openpic_glb_ops_le,
-		 OPENPIC_GLB_REG_START, OPENPIC_GLB_REG_SIZE},
-		{"tmr", &openpic_tmr_ops_le,
-		 OPENPIC_TMR_REG_START, OPENPIC_TMR_REG_SIZE},
-		{"src", &openpic_src_ops_le,
-		 OPENPIC_SRC_REG_START, OPENPIC_SRC_REG_SIZE},
-		{"cpu", &openpic_cpu_ops_le,
-		 OPENPIC_CPU_REG_START, OPENPIC_CPU_REG_SIZE},
-		{NULL}
-	};
+	struct openpic *opp = container_of(dev, struct openpic, dev);
+	u64 base;
+
 	static const struct mem_reg list_be[] = {
 		{"glb", &openpic_glb_ops_be,
 		 OPENPIC_GLB_REG_START, OPENPIC_GLB_REG_SIZE},
@@ -1278,11 +1601,239 @@ static int openpic_init(SysBusDevice *dev)
 		{NULL}
 	};
 
-	memory_region_init(&opp->mem, "openpic", 0x40000);
+	if (copy_from_user(&base, (u64 __user *)(long)attr->addr, sizeof(u64)))
+		return -EFAULT;
+
+	if (base & 0x3ffff) {
+		pr_debug("kvm mpic %s: KVM_DEV_MPIC_BASE_ADDR %08llx not aligned\n",
+			 __func__, base);
+		return -EINVAL;
+	}
+
+	if (base == opp->reg_base)
+		return 0;
+
+	unmap_all(opp);
+	opp->reg_base = base;
+
+	pr_debug("kvm mpic %s: KVM_DEV_MPIC_BASE_ADDR %08llx\n",
+		 __func__, base);
+
+	if (base == 0)
+		return 0;
 
 	switch (opp->model) {
-	case OPENPIC_MODEL_FSL_MPIC_20:
+	case KVM_DEV_TYPE_FSL_MPIC_20:
+		map_list(opp, list_be);
+		map_list(opp, list_fsl);
+
+		break;
+
+	case KVM_DEV_TYPE_FSL_MPIC_42:
+		map_list(opp, list_be);
+		map_list(opp, list_fsl);
+
+		break;
+	default:
+		WARN_ON_ONCE(1);
+	}
+
+	return 0;
+}
+
+#define ATTR_SET		0
+#define ATTR_GET		1
+
+static int access_reg(struct openpic *opp, gpa_t addr, u32 *val, int type)
+{
+	int ret;
+
+	if (!opp->sub_count)
+		return -EPERM;
+
+	if (addr & 3)
+		return -ENXIO;
+
+	if (addr > 0x40000)
+		return -ENXIO;
+
+	addr += opp->reg_base;
+
+	mutex_lock(&opp->kvm->slots_lock);
+
+	if (type == ATTR_SET)
+		ret = kvm_io_bus_write(opp->kvm, KVM_MMIO_BUS, addr, 4, val);
+	else
+		ret = kvm_io_bus_read(opp->kvm, KVM_MMIO_BUS, addr, 4, val);
+
+	mutex_unlock(&opp->kvm->slots_lock);
+
+	pr_debug("%s: type %d addr %llx val %x\n", __func__, type, addr, *val);
+
+	return ret;
+}
+
+static int mpic_set_attr(struct kvm *kvm, struct kvm_device *dev,
+			 struct kvm_device_attr *attr)
+{
+	struct openpic *opp = container_of(dev, struct openpic, dev);
+	u32 attr32;
+
+	switch (attr->group) {
+	case KVM_DEV_MPIC_GRP_MISC:
+		switch (attr->attr) {
+		case KVM_DEV_MPIC_BASE_ADDR:
+			return set_base_addr(kvm, dev, attr);
+		}
+
+		break;
+
+	case KVM_DEV_MPIC_GRP_REGISTER:
+		if (copy_from_user(&attr32, (u32 __user *)(long)attr->addr,
+				   sizeof(u32)))
+			return -EFAULT;
+
+		return access_reg(opp, attr->attr, &attr32, ATTR_SET);
+
+	case KVM_DEV_MPIC_GRP_IRQ_ACTIVE:
+		if (attr->attr > MAX_SRC)
+			return -EINVAL;
+
+		if (copy_from_user(&attr32, (u32 __user *)(long)attr->addr,
+				   sizeof(u32)))
+			return -EFAULT;
+
+		if (attr32 != 0 && attr32 != 1)
+			return -EINVAL;
+
+		spin_lock_irq(&opp->lock);
+		openpic_set_irq(opp, attr->attr, attr32);
+		spin_unlock_irq(&opp->lock);
+		return 0;
+	}
+
+	return -ENXIO;
+}
+
+static int mpic_get_attr(struct kvm *kvm, struct kvm_device *dev,
+			 struct kvm_device_attr *attr)
+{
+	struct openpic *opp = container_of(dev, struct openpic, dev);
+	u64 attr64;
+	u32 attr32;
+	int ret;
+
+	switch (attr->group) {
+	case KVM_DEV_MPIC_GRP_MISC:
+		switch (attr->attr) {
+		case KVM_DEV_MPIC_BASE_ADDR:
+			attr64 = opp->reg_base;
+
+			if (copy_to_user((u64 __user *)(long)attr->addr,
+					 &attr64, sizeof(u64)))
+				return -EFAULT;
+
+			return 0;
+		}
+
+		break;
+
+	case KVM_DEV_MPIC_GRP_REGISTER:
+		ret = access_reg(opp, attr->attr, &attr32, ATTR_GET);
+		if (ret)
+			return ret;
+
+		if (copy_to_user((u32 __user *)(long)attr->addr, &attr32,
+				 sizeof(u32)))
+			return -EFAULT;
+
+		return 0;
+
+	case KVM_DEV_MPIC_GRP_IRQ_ACTIVE:
+		if (attr->attr > MAX_SRC)
+			return -EINVAL;
+
+		attr32 = opp->src[attr->attr].pending;
+
+		if (copy_to_user((u32 __user *)(long)attr->addr, &attr32,
+				 sizeof(u32)))
+			return -EFAULT;
+
+		return 0;
+	}
+
+	return -ENXIO;
+}
+
+static void mpic_destroy(struct kvm *kvm, struct kvm_device *dev)
+{
+	struct openpic *opp = container_of(dev, struct openpic, dev);
+
+	blocking_notifier_chain_unregister(&kvm->vcpu_notifier,
+					   &opp->vcpu_notifier);
+
+	unmap_all(opp);
+	kfree(opp);
+}
+
+static int add_cpu(struct openpic *opp, struct kvm_vcpu *vcpu)
+{
+	u32 id = vcpu->vcpu_id;
+
+	if (id < 0 || id >= MAX_CPU)
+		return -EPERM;
+
+	spin_lock_irq(&opp->lock);
+
+	WARN_ON(opp->dst[id].vcpu);
+	opp->dst[id].vcpu = vcpu;
+	opp->nb_cpus = max(opp->nb_cpus, id + 1);
+
+	spin_unlock_irq(&opp->lock);
+
+	if (opp->mpic_mode_mask == GCR_MODE_PROXY)
+		vcpu->arch.epr_flags |= KVMPPC_EPR_KERNEL;
+
+	return 0;
+}
+
+static int kvm_mpic_vcpu_notifier(struct notifier_block *nb,
+				  unsigned long create, void *v)
+{
+	struct openpic *opp = container_of(nb, struct openpic, vcpu_notifier);
+	struct kvm_vcpu *vcpu = v;
+	int ret;
+
+	if (create) {
+		ret = add_cpu(opp, vcpu);
+		if (ret < 0)
+			return notifier_from_errno(ret);
+	}
+
+	return NOTIFY_OK;
+}
+
+int kvm_create_mpic(struct kvm *kvm, u32 type, struct kvm_device **dev)
+{
+	struct openpic *opp;
+	struct kvm_vcpu *vcpu;
+	int ret, i;
+
+	if (kvm->arch.irqchip_priv)
+		return -EEXIST;
+
+	opp = kzalloc(sizeof(struct openpic), GFP_KERNEL);
+	if (!opp)
+		return -ENOMEM;
+
+	kvm->arch.irqchip_priv = opp;
+	opp->kvm = kvm;
+	opp->model = type;
+	spin_lock_init(&opp->lock);
+
+	switch (opp->model) {
+	case KVM_DEV_TYPE_FSL_MPIC_20:
 		opp->fsl = &fsl_mpic_20;
 		opp->brr1 = 0x00400200;
 		opp->flags |= OPENPIC_FLAG_IDR_CRIT;
@@ -1290,12 +1841,10 @@ static int openpic_init(SysBusDevice *dev)
 		opp->mpic_mode_mask = GCR_MODE_MIXED;
 
 		fsl_common_init(opp);
-		map_list(opp, list_be, &list_count);
-		map_list(opp, list_fsl, &list_count);
 
 		break;
 
-	case OPENPIC_MODEL_FSL_MPIC_42:
+	case KVM_DEV_TYPE_FSL_MPIC_42:
 		opp->fsl = &fsl_mpic_42;
 		opp->brr1 = 0x00400402;
 		opp->flags |= OPENPIC_FLAG_ILR;
@@ -1303,11 +1852,39 @@ static int openpic_init(SysBusDevice *dev)
 		opp->mpic_mode_mask = GCR_MODE_PROXY;
 
 		fsl_common_init(opp);
-		map_list(opp, list_be, &list_count);
-		map_list(opp, list_fsl, &list_count);
 
 		break;
+
+	default:
+		ret = -ENODEV;
+		goto err;
 	}
 
+	openpic_reset(opp);
+
+	opp->dev.type = type;
+	opp->dev.set_attr = mpic_set_attr;
+	opp->dev.get_attr = mpic_get_attr;
+	opp->dev.destroy = mpic_destroy;
+	*dev = &opp->dev;
+
+	kvm_for_each_vcpu(i, vcpu, kvm) {
+		ret = add_cpu(opp, vcpu);
+		if (ret < 0)
+			goto err;
+	}
+
+	opp->vcpu_notifier.notifier_call = kvm_mpic_vcpu_notifier;
+
+	/* FIXME: register notifier for subsequently created vcpus */
+	ret = blocking_notifier_chain_register(&kvm->vcpu_notifier,
+					       &opp->vcpu_notifier);
+	if (ret < 0)
+		goto err;
+
 	return 0;
+
+err:
+	kfree(opp);
+	return ret;
 }
diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
index 61989f4..e3d09f7 100644
--- a/arch/powerpc/kvm/powerpc.c
+++ b/arch/powerpc/kvm/powerpc.c
@@ -317,6 +317,7 @@ int kvm_dev_ioctl_check_extension(long ext)
 	case KVM_CAP_ENABLE_CAP:
 	case KVM_CAP_ONE_REG:
 	case KVM_CAP_IOEVENTFD:
+	case KVM_CAP_DEVICE_CTRL:
 		r = 1;
 		break;
 #ifndef CONFIG_KVM_BOOK3S_64_HV
@@ -781,7 +782,10 @@ static int kvm_vcpu_ioctl_enable_cap(struct kvm_vcpu *vcpu,
 		break;
 	case KVM_CAP_PPC_EPR:
 		r = 0;
-		vcpu->arch.epr_enabled = cap->args[0];
+		if (cap->args[0])
+			vcpu->arch.epr_flags |= KVMPPC_EPR_USER;
+		else
+			vcpu->arch.epr_flags &= ~KVMPPC_EPR_USER;
 		break;
 #ifdef CONFIG_BOOKE
 	case KVM_CAP_PPC_BOOKE_WATCHDOG:
@@ -927,6 +931,7 @@ static int kvm_vm_ioctl_get_pvinfo(struct kvm_ppc_pvinfo *pvinfo)
 long kvm_arch_vm_ioctl(struct file *filp,
                        unsigned int ioctl, unsigned long arg)
 {
+	struct kvm *kvm __maybe_unused = filp->private_data;
 	void __user *argp = (void __user *)arg;
 	long r;
 
@@ -945,7 +950,6 @@ long kvm_arch_vm_ioctl(struct file *filp,
 #ifdef CONFIG_PPC_BOOK3S_64
 	case KVM_CREATE_SPAPR_TCE: {
 		struct kvm_create_spapr_tce create_tce;
-		struct kvm *kvm = filp->private_data;
 
 		r = -EFAULT;
 		if (copy_from_user(&create_tce, argp, sizeof(create_tce)))
@@ -957,7 +961,6 @@ long kvm_arch_vm_ioctl(struct file *filp,
 
 #ifdef CONFIG_KVM_BOOK3S_64_HV
 	case KVM_ALLOCATE_RMA: {
-		struct kvm *kvm = filp->private_data;
 		struct kvm_allocate_rma rma;
 
 		r = kvm_vm_ioctl_allocate_rma(kvm, &rma);
@@ -967,7 +970,6 @@ long kvm_arch_vm_ioctl(struct file *filp,
 	}
 
 	case KVM_PPC_ALLOCATE_HTAB: {
-		struct kvm *kvm = filp->private_data;
 		u32 htab_order;
 
 		r = -EFAULT;
@@ -984,7 +986,6 @@ long kvm_arch_vm_ioctl(struct file *filp,
 	}
 
 	case KVM_PPC_GET_HTAB_FD: {
-		struct kvm *kvm = filp->private_data;
 		struct kvm_get_htab_fd ghf;
 
 		r = -EFAULT;
@@ -997,7 +998,6 @@ long kvm_arch_vm_ioctl(struct file *filp,
 
 #ifdef CONFIG_PPC_BOOK3S_64
 	case KVM_PPC_GET_SMMU_INFO: {
-		struct kvm *kvm = filp->private_data;
 		struct kvm_ppc_smmu_info info;
 
 		memset(&info, 0, sizeof(info));
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 3d28037..48342a6 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -1059,5 +1059,7 @@ static inline bool kvm_vcpu_eligible_for_directed_yield(struct kvm_vcpu *vcpu)
 }
 
 #endif /* CONFIG_HAVE_KVM_CPU_RELAX_INTERCEPT */
 
-#endif
+int kvm_create_mpic(struct kvm *kvm, u32 type, struct kvm_device **dev);
+
+#endif
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 1f348e0..1048a03 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -910,10 +910,19 @@ struct kvm_device_attr {
 #define KVM_DEV_ATTR_COMMON		0
 #define  KVM_DEV_ATTR_TYPE		0 /* 32-bit */
 
-#define KVM_CREATE_DEVICE	_IOWR(KVMIO, 0xac, struct kvm_create_device)
-#define KVM_SET_DEVICE_ATTR	_IOW(KVMIO, 0xad, struct kvm_device_attr)
-#define KVM_GET_DEVICE_ATTR	_IOW(KVMIO, 0xae, struct kvm_device_attr)
-#define KVM_HAS_DEVICE_ATTR	_IOW(KVMIO, 0xaf, struct kvm_device_attr)
+#define KVM_DEV_TYPE_FSL_MPIC_20	1
+#define KVM_DEV_TYPE_FSL_MPIC_42	2
+
+#define KVM_DEV_MPIC_GRP_MISC		1
+#define  KVM_DEV_MPIC_BASE_ADDR		0 /* 64-bit */
+
+#define KVM_DEV_MPIC_GRP_REGISTER	2 /* 32-bit */
+#define KVM_DEV_MPIC_GRP_IRQ_ACTIVE	3 /* 32-bit */
+
+#define KVM_CREATE_DEVICE	_IOWR(KVMIO, 0xab, struct kvm_create_device)
+#define KVM_SET_DEVICE_ATTR	_IOW(KVMIO, 0xac, struct kvm_device_attr)
+#define KVM_GET_DEVICE_ATTR	_IOW(KVMIO, 0xad, struct kvm_device_attr)
+#define KVM_HAS_DEVICE_ATTR	_IOW(KVMIO, 0xae, struct kvm_device_attr)
 
 /*
  * ioctls for vcpu fds
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index dd4c78d..db0c2b3 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -2210,6 +2210,18 @@ static int kvm_ioctl_create_device(struct kvm *kvm,
 	}
 
 	switch (cd->type) {
+#ifdef CONFIG_KVM_MPIC
+	case KVM_DEV_TYPE_FSL_MPIC_20:
+	case KVM_DEV_TYPE_FSL_MPIC_42: {
+		if (test) {
+			r = 0;
+			break;
+		}
+
+		r = kvm_create_mpic(kvm, cd->type, &dev);
+		break;
+	}
+#endif
 	default:
 		r = -ENODEV;
 		goto out;