From patchwork Thu Jan 29 09:27:24 2015
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Alexey Kardashevskiy
X-Patchwork-Id: 434466
From: Alexey Kardashevskiy
To: qemu-devel@nongnu.org
Date: Thu, 29 Jan 2015 20:27:24 +1100
Message-Id: <1422523650-2888-13-git-send-email-aik@ozlabs.ru>
X-Mailer: git-send-email 2.0.0
In-Reply-To: <1422523650-2888-1-git-send-email-aik@ozlabs.ru>
References: <1422523650-2888-1-git-send-email-aik@ozlabs.ru>
Cc: Alexey Kardashevskiy, Alex Williamson, qemu-ppc@nongnu.org, Alexander Graf, David Gibson
Subject: [Qemu-devel] [PATCH v4 12/18] spapr_rtas: Add Dynamic DMA windows (DDW) RTAS handlers

This adds support for the Dynamic DMA Windows (DDW) option defined by
the SPAPR specification, which allows additional DMA window(s) that
can support page sizes other than 4K.

The existing implementation of DDW in the guest tries to create one huge
DMA window with 64K or 16MB pages and map the entire guest RAM to it.
If it succeeds, the guest switches to dma_direct_ops and never calls
TCE hypercalls (H_PUT_TCE,...) again. This enables VFIO devices to use
the entire RAM and not waste time on map/unmap later.

This adds 4 RTAS handlers:
* ibm,query-pe-dma-window
* ibm,create-pe-dma-window
* ibm,remove-pe-dma-window
* ibm,reset-pe-dma-window
These are registered from the type_init() callback.

These RTAS handlers are implemented in a separate file to avoid polluting
spapr_iommu.c with PHB-specific code.

Signed-off-by: Alexey Kardashevskiy
---
Changes:
v3:
* added ibm,reset-pe-dma-window

v2:
* double loop squashed to spapr_iommu_fixmask() helper
* added @ddw_num counter to PHB; it is used to generate the LIOBN for a new
window and is reset on a ddw-reset event
* added ULL to constants used in shift operations
* rtas_ibm_reset_pe_dma_window() and rtas_ibm_remove_pe_dma_window()
do not remove windows anymore; the PHB callback has to, as it will reuse
the same code in case of guest reboot as well
---
 hw/ppc/Makefile.objs    |   3 +
 hw/ppc/spapr_rtas_ddw.c | 297 ++++++++++++++++++++++++++++++++++++++++++++++++
 trace-events            |   4 +
 3 files changed, 304 insertions(+)
 create mode 100644 hw/ppc/spapr_rtas_ddw.c

diff --git a/hw/ppc/Makefile.objs b/hw/ppc/Makefile.objs
index 19d9920..d7fe4fb 100644
--- a/hw/ppc/Makefile.objs
+++ b/hw/ppc/Makefile.objs
@@ -7,6 +7,9 @@ obj-$(CONFIG_PSERIES) += spapr_pci.o
 ifeq ($(CONFIG_PCI)$(CONFIG_PSERIES)$(CONFIG_LINUX), yyy)
 obj-y += spapr_pci_vfio.o
 endif
+ifeq ($(CONFIG_PCI)$(CONFIG_PSERIES), yy)
+obj-y += spapr_rtas_ddw.o
+endif
 # PowerPC 4xx boards
 obj-y += ppc405_boards.o ppc4xx_devs.o ppc405_uc.o ppc440_bamboo.o
 obj-y += ppc4xx_pci.o
diff --git a/hw/ppc/spapr_rtas_ddw.c b/hw/ppc/spapr_rtas_ddw.c
new file mode 100644
index 0000000..af70601
--- /dev/null
+++ b/hw/ppc/spapr_rtas_ddw.c
@@ -0,0 +1,297 @@
+/*
+ * QEMU sPAPR Dynamic DMA windows support
+ *
+ * Copyright (c) 2014 Alexey Kardashevskiy, IBM Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License,
+ * or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, see .
+ */
+
+#include "hw/ppc/spapr.h"
+#include "hw/pci-host/spapr.h"
+#include "trace.h"
+
+static uint32_t spapr_iommu_fixmask(struct ppc_one_seg_page_size *sps,
+                                    uint32_t query_mask)
+{
+    int i, j;
+    uint32_t mask = 0;
+    const struct { int shift; uint32_t mask; } masks[] = {
+        { 12, DDW_PGSIZE_4K },
+        { 16, DDW_PGSIZE_64K },
+        { 24, DDW_PGSIZE_16M },
+        { 25, DDW_PGSIZE_32M },
+        { 26, DDW_PGSIZE_64M },
+        { 27, DDW_PGSIZE_128M },
+        { 28, DDW_PGSIZE_256M },
+        { 34, DDW_PGSIZE_16G },
+    };
+
+    for (i = 0; i < PPC_PAGE_SIZES_MAX_SZ; i++) {
+        for (j = 0; j < ARRAY_SIZE(masks); ++j) {
+            if ((sps[i].page_shift == masks[j].shift) &&
+                    (query_mask & masks[j].mask)) {
+                mask |= masks[j].mask;
+            }
+        }
+    }
+
+    return mask;
+}
+
+static void rtas_ibm_query_pe_dma_window(PowerPCCPU *cpu,
+                                         sPAPREnvironment *spapr,
+                                         uint32_t token, uint32_t nargs,
+                                         target_ulong args,
+                                         uint32_t nret, target_ulong rets)
+{
+    CPUPPCState *env = &cpu->env;
+    sPAPRPHBState *sphb;
+    sPAPRPHBClass *spc;
+    uint64_t buid;
+    uint32_t avail, addr, pgmask = 0;
+    uint32_t windows_supported = 0, page_size_mask = 0, dma32_window_size = 0;
+    uint64_t dma64_window_size = 0;
+    unsigned current;
+    long ret;
+
+    if ((nargs != 3) || (nret != 5)) {
+        goto param_error_exit;
+    }
+
+    buid = ((uint64_t)rtas_ld(args, 1) << 32) | rtas_ld(args, 2);
+    addr = rtas_ld(args, 0);
+    sphb = spapr_pci_find_phb(spapr, buid);
+    if (!sphb || !sphb->ddw_enabled) {
+        goto param_error_exit;
+    }
+
+    spc = SPAPR_PCI_HOST_BRIDGE_GET_CLASS(sphb);
+    if (!spc->ddw_query) {
+        goto hw_error_exit;
+    }
+
+    ret = spc->ddw_query(sphb, &windows_supported, &page_size_mask,
+                         &dma32_window_size, &dma64_window_size);
+    trace_spapr_iommu_ddw_query(buid, addr, windows_supported,
+                                page_size_mask, pgmask, ret);
+    if (ret) {
+        goto hw_error_exit;
+    }
+
+    current = spapr_phb_get_win_num(sphb);
+    avail = (windows_supported > current) ? (windows_supported - current) : 0;
+
+    /* Work out supported page masks */
+    pgmask = spapr_iommu_fixmask(env->sps.sps, page_size_mask);
+
+    rtas_st(rets, 0, RTAS_OUT_SUCCESS);
+    rtas_st(rets, 1, avail);
+
+    /*
+     * This is "Largest contiguous block of TCEs allocated specifically
+     * for (that is, are reserved for) this PE".
+     * Return the maximum number as if all RAM was in 4K pages.
+     */
+    rtas_st(rets, 2, dma64_window_size >> SPAPR_TCE_PAGE_SHIFT);
+    rtas_st(rets, 3, pgmask);
+    rtas_st(rets, 4, 0); /* DMA migration mask, not supported */
+    return;
+
+hw_error_exit:
+    rtas_st(rets, 0, RTAS_OUT_HW_ERROR);
+    return;
+
+param_error_exit:
+    rtas_st(rets, 0, RTAS_OUT_PARAM_ERROR);
+}
+
+static void rtas_ibm_create_pe_dma_window(PowerPCCPU *cpu,
+                                          sPAPREnvironment *spapr,
+                                          uint32_t token, uint32_t nargs,
+                                          target_ulong args,
+                                          uint32_t nret, target_ulong rets)
+{
+    sPAPRPHBState *sphb;
+    sPAPRPHBClass *spc;
+    sPAPRTCETable *tcet = NULL;
+    uint32_t addr, page_shift, window_shift, liobn;
+    uint64_t buid;
+    long ret;
+    uint32_t windows_supported = 0, page_size_mask = 0, dma32_window_size = 0;
+    uint64_t dma64_window_size = 0;
+
+    if ((nargs != 5) || (nret != 4)) {
+        goto param_error_exit;
+    }
+
+    buid = ((uint64_t)rtas_ld(args, 1) << 32) | rtas_ld(args, 2);
+    addr = rtas_ld(args, 0);
+    sphb = spapr_pci_find_phb(spapr, buid);
+    if (!sphb || !sphb->ddw_enabled) {
+        goto param_error_exit;
+    }
+
+    spc = SPAPR_PCI_HOST_BRIDGE_GET_CLASS(sphb);
+    if (!spc->ddw_create || !spc->ddw_query) {
+        goto hw_error_exit;
+    }
+
+    ret = spc->ddw_query(sphb, &windows_supported, &page_size_mask,
+                         &dma32_window_size, &dma64_window_size);
+    if (ret || (spapr_phb_get_win_num(sphb) >= windows_supported)) {
+        goto hw_error_exit;
+    }
+
+    page_shift = rtas_ld(args, 3);
+    window_shift = rtas_ld(args, 4);
+    /* Default 32bit window#0 is always there so +1 */
+    liobn = SPAPR_PCI_LIOBN(sphb->index, spapr_phb_get_win_num(sphb));
+
+    ret = spc->ddw_create(sphb, liobn, page_shift, window_shift, &tcet);
+    trace_spapr_iommu_ddw_create(buid, addr, 1ULL << page_shift,
+                                 1ULL << window_shift,
+                                 tcet ? tcet->bus_offset : 0xbaadf00d,
+                                 liobn, ret);
+    if (ret || !tcet) {
+        goto hw_error_exit;
+    }
+
+    rtas_st(rets, 0, RTAS_OUT_SUCCESS);
+    rtas_st(rets, 1, liobn);
+    rtas_st(rets, 2, tcet->bus_offset >> 32);
+    rtas_st(rets, 3, tcet->bus_offset & ((uint32_t) -1));
+
+    object_unref(OBJECT(tcet));
+    return;
+
+hw_error_exit:
+    rtas_st(rets, 0, RTAS_OUT_HW_ERROR);
+    return;
+
+param_error_exit:
+    rtas_st(rets, 0, RTAS_OUT_PARAM_ERROR);
+}
+
+static void rtas_ibm_remove_pe_dma_window(PowerPCCPU *cpu,
+                                          sPAPREnvironment *spapr,
+                                          uint32_t token, uint32_t nargs,
+                                          target_ulong args,
+                                          uint32_t nret, target_ulong rets)
+{
+    sPAPRPHBState *sphb;
+    sPAPRPHBClass *spc;
+    sPAPRTCETable *tcet;
+    uint32_t liobn;
+    long ret;
+
+    if ((nargs != 1) || (nret != 1)) {
+        goto param_error_exit;
+    }
+
+    liobn = rtas_ld(args, 0);
+    tcet = spapr_tce_find_by_liobn(liobn);
+    if (!tcet) {
+        goto param_error_exit;
+    }
+
+    sphb = SPAPR_PCI_HOST_BRIDGE(OBJECT(tcet)->parent);
+    if (!sphb || !sphb->ddw_enabled) {
+        goto param_error_exit;
+    }
+
+    spc = SPAPR_PCI_HOST_BRIDGE_GET_CLASS(sphb);
+    if (!spc->ddw_remove) {
+        goto hw_error_exit;
+    }
+
+    ret = spc->ddw_remove(sphb, tcet);
+    trace_spapr_iommu_ddw_remove(liobn, ret);
+    if (ret) {
+        goto hw_error_exit;
+    }
+
+    rtas_st(rets, 0, RTAS_OUT_SUCCESS);
+    return;
+
+hw_error_exit:
+    rtas_st(rets, 0, RTAS_OUT_HW_ERROR);
+    return;
+
+param_error_exit:
+    rtas_st(rets, 0, RTAS_OUT_PARAM_ERROR);
+}
+
+static void rtas_ibm_reset_pe_dma_window(PowerPCCPU *cpu,
+                                         sPAPREnvironment *spapr,
+                                         uint32_t token, uint32_t nargs,
+                                         target_ulong args,
+                                         uint32_t nret, target_ulong rets)
+{
+    sPAPRPHBState *sphb;
+    sPAPRPHBClass *spc;
+    uint64_t buid;
+    uint32_t addr;
+    long ret;
+
+    if ((nargs != 3) || (nret != 1)) {
+        goto param_error_exit;
+    }
+
+    buid = ((uint64_t)rtas_ld(args, 1) << 32) | rtas_ld(args, 2);
+    addr = rtas_ld(args, 0);
+    sphb = spapr_pci_find_phb(spapr, buid);
+    if (!sphb || !sphb->ddw_enabled) {
+        goto param_error_exit;
+    }
+
+    spc = SPAPR_PCI_HOST_BRIDGE_GET_CLASS(sphb);
+    if (!spc->ddw_reset) {
+        goto hw_error_exit;
+    }
+
+    ret = spc->ddw_reset(sphb);
+    trace_spapr_iommu_ddw_reset(buid, addr, ret);
+    if (ret) {
+        goto hw_error_exit;
+    }
+
+    rtas_st(rets, 0, RTAS_OUT_SUCCESS);
+
+    return;
+
+hw_error_exit:
+    rtas_st(rets, 0, RTAS_OUT_HW_ERROR);
+    return;
+
+param_error_exit:
+    rtas_st(rets, 0, RTAS_OUT_PARAM_ERROR);
+}
+
+static void spapr_rtas_ddw_init(void)
+{
+    spapr_rtas_register(RTAS_IBM_QUERY_PE_DMA_WINDOW,
+                        "ibm,query-pe-dma-window",
+                        rtas_ibm_query_pe_dma_window);
+    spapr_rtas_register(RTAS_IBM_CREATE_PE_DMA_WINDOW,
+                        "ibm,create-pe-dma-window",
+                        rtas_ibm_create_pe_dma_window);
+    spapr_rtas_register(RTAS_IBM_REMOVE_PE_DMA_WINDOW,
+                        "ibm,remove-pe-dma-window",
+                        rtas_ibm_remove_pe_dma_window);
+    spapr_rtas_register(RTAS_IBM_RESET_PE_DMA_WINDOW,
+                        "ibm,reset-pe-dma-window",
+                        rtas_ibm_reset_pe_dma_window);
+}
+
+type_init(spapr_rtas_ddw_init)
diff --git a/trace-events b/trace-events
index 04f5df2..9af53d9 100644
--- a/trace-events
+++ b/trace-events
@@ -1285,6 +1285,10 @@ spapr_iommu_indirect(uint64_t liobn, uint64_t ioba, uint64_t tce, uint64_t iobaN
 spapr_iommu_stuff(uint64_t liobn, uint64_t ioba, uint64_t tce_value, uint64_t npages, uint64_t ret) "liobn=%"PRIx64" ioba=0x%"PRIx64" tcevalue=0x%"PRIx64" npages=%"PRId64" ret=%"PRId64
 spapr_iommu_xlate(uint64_t liobn, uint64_t ioba, uint64_t tce, unsigned perm, unsigned pgsize) "liobn=%"PRIx64" 0x%"PRIx64" -> 0x%"PRIx64" perm=%u mask=%x"
 spapr_iommu_new_table(uint64_t liobn, void *tcet, void *table, int fd) "liobn=%"PRIx64" tcet=%p table=%p fd=%d"
+spapr_iommu_ddw_query(uint64_t buid, uint32_t cfgaddr, uint32_t wa, uint32_t pgz, uint32_t pgz_fixed, long ret) "buid=%"PRIx64" addr=%"PRIx32", %u windows available, sizes %"PRIx32", fixed %"PRIx32", ret = %ld"
+spapr_iommu_ddw_create(uint64_t buid, uint32_t cfgaddr, unsigned long long pg_size, unsigned long long req_size, uint64_t start, uint32_t liobn, long ret) "buid=%"PRIx64" addr=%"PRIx32", page size=0x%llx, requested=0x%llx, start addr=%"PRIx64", liobn=%"PRIx32", ret = %ld"
+spapr_iommu_ddw_remove(uint32_t liobn, long ret) "liobn=%"PRIx32", ret = %ld"
+spapr_iommu_ddw_reset(uint64_t buid, uint32_t cfgaddr, long ret) "buid=%"PRIx64" addr=%"PRIx32", ret = %ld"
 
 # hw/ppc/ppc.c
 ppc_tb_adjust(uint64_t offs1, uint64_t offs2, int64_t diff, int64_t seconds) "adjusted from 0x%"PRIx64" to 0x%"PRIx64", diff %"PRId64" (%"PRId64"s)"
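
Reviewer aid, not part of the patch: the handlers above repeat a few small
pieces of arithmetic (assembling the 64-bit BUID from two 32-bit RTAS
arguments, splitting the 64-bit bus offset into two 32-bit return cells,
computing how many windows are still available, and filtering the page size
mask against the CPU's page shifts). The standalone sketch below mirrors that
logic so it can be compiled and checked outside QEMU. All sk_* names and the
SK_PGSIZE_* bit values are hypothetical and chosen only for illustration; the
real DDW_PGSIZE_* constants are defined elsewhere in the series and may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit values for illustration; NOT the real DDW_PGSIZE_*. */
#define SK_PGSIZE_4K   0x01u
#define SK_PGSIZE_64K  0x02u
#define SK_PGSIZE_16M  0x04u

struct sk_shift_mask {
    int shift;       /* log2 of the page size */
    uint32_t mask;   /* bit reported to the guest for that size */
};

static const struct sk_shift_mask sk_masks[] = {
    { 12, SK_PGSIZE_4K },
    { 16, SK_PGSIZE_64K },
    { 24, SK_PGSIZE_16M },
};

/*
 * Keep only those bits of query_mask whose page shift also appears in
 * cpu_shifts, mirroring the double loop in spapr_iommu_fixmask().
 */
static uint32_t sk_fixmask(const int *cpu_shifts, size_t n,
                           uint32_t query_mask)
{
    uint32_t mask = 0;
    size_t i, j;

    for (i = 0; i < n; i++) {
        for (j = 0; j < sizeof(sk_masks) / sizeof(sk_masks[0]); j++) {
            if (cpu_shifts[i] == sk_masks[j].shift &&
                (query_mask & sk_masks[j].mask)) {
                mask |= sk_masks[j].mask;
            }
        }
    }
    return mask;
}

/* BUID arrives as two 32-bit cells: high word first, then low word. */
static uint64_t sk_make_buid(uint32_t hi, uint32_t lo)
{
    return ((uint64_t)hi << 32) | lo;
}

/* A 64-bit value goes back to the guest as two 32-bit cells. */
static void sk_split64(uint64_t v, uint32_t *hi, uint32_t *lo)
{
    *hi = (uint32_t)(v >> 32);
    *lo = (uint32_t)(v & 0xffffffffu);
}

/* Never report a negative (wrapped) number of free windows. */
static uint32_t sk_windows_available(uint32_t supported, uint32_t current)
{
    return supported > current ? supported - current : 0;
}

/* Small usage example exercising each helper. */
static void sk_example(void)
{
    const int shifts[] = { 12, 16 };   /* CPU supports 4K and 64K only */
    uint32_t hi, lo;

    /* 16M is requested by the PHB but dropped: the CPU lacks shift 24. */
    assert(sk_fixmask(shifts, 2, SK_PGSIZE_64K | SK_PGSIZE_16M)
           == SK_PGSIZE_64K);

    sk_split64(sk_make_buid(0x08000000u, 0x20000000u), &hi, &lo);
    assert(hi == 0x08000000u && lo == 0x20000000u);

    assert(sk_windows_available(2, 1) == 1);
    assert(sk_windows_available(1, 2) == 0);
}
```

The clamp in sk_windows_available() corresponds to the "avail" computation in
rtas_ibm_query_pe_dma_window(): with unsigned arithmetic, a plain subtraction
would wrap around if more windows exist than the PHB reports as supported.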