From patchwork Thu Sep 29 06:45:06 2011
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: David Gibson
X-Patchwork-Id: 116904
From: David Gibson
To: agraf@suse.de
Cc: qemu-devel@nongnu.org
Date: Thu, 29 Sep 2011 16:45:06 +1000
Subject: [Qemu-devel] [PATCH 3/3] pseries: Use Book3S-HV TCE acceleration capabilities
Message-Id: <1317278706-16105-4-git-send-email-david@gibson.dropbear.id.au>
X-Mailer: git-send-email 1.7.6.3
In-Reply-To: <1317278706-16105-1-git-send-email-david@gibson.dropbear.id.au>
References: <1317278706-16105-1-git-send-email-david@gibson.dropbear.id.au>

The pseries machine of qemu implements the TCE mechanism used as a
virtual IOMMU for the PAPR defined virtual IO devices.  Because the
PAPR spec only defines a small DMA address space, the guest VIO
drivers need to update TCE mappings very frequently - the virtual
network device is particularly bad.  This means many slow exits to
qemu to emulate the H_PUT_TCE hypercall.

Sufficiently recent kernels allow this to be mitigated by implementing
H_PUT_TCE in the host kernel.  To make use of this, however, qemu
needs to initialize the necessary TCE tables and map them into its own
address space, so that the VIO device implementations can retrieve the
mappings when they access guest memory (which is treated as a virtual
DMA operation).

This patch adds the necessary calls to use the KVM TCE acceleration.
If the kernel does not support acceleration, or if creating the
accelerated TCE table fails for any other reason, qemu falls back to
the full userspace TCE implementation.

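The fallback described above can be sketched in isolation as follows.
This is an illustration only, not the code in this patch: the names
tce_table_bytes, try_accelerated_table and create_tce_table are
hypothetical, the 4096-byte TCE page and 8-byte entry size are assumed
values for SPAPR_VIO_TCE_PAGE_SIZE and sizeof(VIOsPAPR_RTCE), and the
accelerated path is stubbed to always fail, as it would on a kernel
without the capability.  The size helper also shows one way the table
length could be rounded up to the host page size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumed values, standing in for SPAPR_VIO_TCE_PAGE_SIZE and
 * sizeof(VIOsPAPR_RTCE); check the real headers before relying on them. */
#define TCE_PAGE_SIZE  4096u
#define TCE_ENTRY_SIZE sizeof(uint64_t)

/* Bytes needed for a table covering window_size bytes of DMA space,
 * rounded up to a whole number of host pages. */
static size_t tce_table_bytes(uint32_t window_size, size_t host_page_size)
{
    size_t raw;

    assert(host_page_size != 0);
    raw = ((size_t)window_size / TCE_PAGE_SIZE) * TCE_ENTRY_SIZE;
    return (raw + host_page_size - 1) / host_page_size * host_page_size;
}

/* Hypothetical stand-in for the accelerated path.  It always reports
 * failure, which is what a host without kernel TCE support would do. */
static void *try_accelerated_table(uint32_t window_size, int *fd)
{
    (void)window_size;
    *fd = -1;
    return NULL;
}

/* Try the accelerated table first; on failure, allocate a zeroed
 * userspace table instead.  *accelerated tells the caller which path
 * was taken, so it knows how to release the table later. */
static void *create_tce_table(uint32_t window_size, int *fd, int *accelerated)
{
    void *table = try_accelerated_table(window_size, fd);

    *accelerated = (table != NULL);
    if (!table) {
        *fd = -1;
        table = calloc(1, tce_table_bytes(window_size, 4096));
    }
    return table;
}
```

A caller would pass the device's DMA window size and keep the returned
flag alongside the table pointer: an accelerated table would need to be
unmapped and its fd closed, while a fallback table is simply freed.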
Signed-off-by: David Gibson
---
 hw/spapr_vio.c       |    8 ++++++-
 hw/spapr_vio.h       |    1 +
 target-ppc/kvm.c     |   54 ++++++++++++++++++++++++++++++++++++++++++++++++++
 target-ppc/kvm_ppc.h |   14 +++++++++++++
 4 files changed, 76 insertions(+), 1 deletions(-)

diff --git a/hw/spapr_vio.c b/hw/spapr_vio.c
index 35818e1..1da3032 100644
--- a/hw/spapr_vio.c
+++ b/hw/spapr_vio.c
@@ -165,7 +165,13 @@ static void rtce_init(VIOsPAPRDevice *dev)
         * sizeof(VIOsPAPR_RTCE);

     if (size) {
-        dev->rtce_table = g_malloc0(size);
+        dev->rtce_table = kvmppc_create_spapr_tce(dev->reg,
+                                                  dev->rtce_window_size,
+                                                  &dev->kvmtce_fd);
+
+        if (!dev->rtce_table) {
+            dev->rtce_table = g_malloc0(size);
+        }
     }
 }

diff --git a/hw/spapr_vio.h b/hw/spapr_vio.h
index 4fe5f74..a325a5f 100644
--- a/hw/spapr_vio.h
+++ b/hw/spapr_vio.h
@@ -57,6 +57,7 @@ typedef struct VIOsPAPRDevice {
     target_ulong signal_state;
     uint32_t rtce_window_size;
     VIOsPAPR_RTCE *rtce_table;
+    int kvmtce_fd;
     VIOsPAPR_CRQ crq;
 } VIOsPAPRDevice;

diff --git a/target-ppc/kvm.c b/target-ppc/kvm.c
index 37ee902..866cf7f 100644
--- a/target-ppc/kvm.c
+++ b/target-ppc/kvm.c
@@ -28,6 +28,7 @@
 #include "kvm_ppc.h"
 #include "cpu.h"
 #include "device_tree.h"
+#include "hw/sysbus.h"
 #include "hw/spapr.h"
 #include "hw/sysbus.h"

@@ -58,6 +59,7 @@ static int cap_ppc_smt = 0;
 #ifdef KVM_CAP_PPC_RMA
 static int cap_ppc_rma = 0;
 #endif
+static int cap_spapr_tce = false;

 /* XXX We have a race condition where we actually have a level triggered
  *     interrupt, but the infrastructure can't expose that yet, so the guest
@@ -87,6 +89,9 @@ int kvm_arch_init(KVMState *s)
 #ifdef KVM_CAP_PPC_RMA
     cap_ppc_rma = kvm_check_extension(s, KVM_CAP_PPC_RMA);
 #endif
+#ifdef KVM_CAP_SPAPR_TCE
+    cap_spapr_tce = kvm_check_extension(s, KVM_CAP_SPAPR_TCE);
+#endif

     if (!cap_interrupt_level) {
         fprintf(stderr, "KVM: Couldn't find level irq capability. Expect the "
@@ -792,6 +797,55 @@ off_t kvmppc_alloc_rma(const char *name)
 #endif
 }

+void *kvmppc_create_spapr_tce(target_ulong liobn, uint32_t window_size, int *pfd)
+{   struct kvm_create_spapr_tce args = {
+        .liobn = liobn,
+        .window_size = window_size,
+    };
+    long len;
+    int fd;
+    void *table;
+
+    if (!cap_spapr_tce) {
+        return NULL;
+    }
+
+    fd = kvm_vm_ioctl(kvm_state, KVM_CREATE_SPAPR_TCE, &args);
+    if (fd < 0) {
+        return NULL;
+    }
+
+    len = (window_size / SPAPR_VIO_TCE_PAGE_SIZE) * sizeof(VIOsPAPR_RTCE);
+    /* FIXME: round this up to page size */
+
+    table = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
+    if (table == MAP_FAILED) {
+        close(fd);
+        return NULL;
+    }
+
+    *pfd = fd;
+    return table;
+}
+
+int kvmppc_remove_spapr_tce(void *table, int fd, uint32_t window_size)
+{
+    long len;
+
+    if (fd < 0)
+        return -1;
+
+    len = (window_size / SPAPR_VIO_TCE_PAGE_SIZE)*sizeof(VIOsPAPR_RTCE);
+    if ((munmap(table, len) < 0) ||
+        (close(fd) < 0)) {
+        fprintf(stderr, "KVM: Unexpected error removing KVM SPAPR TCE "
+                "table: %s", strerror(errno));
+        /* Leak the table */
+    }
+
+    return 0;
+}
+
 bool kvm_arch_stop_on_emulation_error(CPUState *env)
 {
     return true;
diff --git a/target-ppc/kvm_ppc.h b/target-ppc/kvm_ppc.h
index ad9903c..b78579a 100644
--- a/target-ppc/kvm_ppc.h
+++ b/target-ppc/kvm_ppc.h
@@ -20,6 +20,8 @@ int kvmppc_set_interrupt(CPUState *env, int irq, int level);
 void kvmppc_set_papr(CPUState *env);
 int kvmppc_smt_threads(void);
 off_t kvmppc_alloc_rma(const char *name);
+void *kvmppc_create_spapr_tce(target_ulong liobn, uint32_t window_size, int *fd);
+int kvmppc_remove_spapr_tce(void *table, int pfd, uint32_t window_size);

 #else

@@ -57,6 +59,18 @@ static inline off_t kvmppc_alloc_rma(const char *name)
     return 0;
 }

+static inline void *kvmppc_create_spapr_tce(target_ulong liobn,
+                                            uint32_t window_size, int *fd)
+{
+    return NULL;
+}
+
+static inline int kvmppc_remove_spapr_tce(void *table, int pfd,
+                                          uint32_t window_size)
+{
+    return -1;
+}
+
 #endif

 #ifndef CONFIG_KVM