From patchwork Sun Aug 29 22:08:23 2010
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Eduard - Gabriel Munteanu
X-Patchwork-Id: 62979
From: Eduard - Gabriel Munteanu
To: mst@redhat.com
Cc: kvm@vger.kernel.org, joro@8bytes.org, qemu-devel@nongnu.org,
 blauwirbel@gmail.com, yamahata@valinux.co.jp, paul@codesourcery.com,
 Eduard - Gabriel Munteanu, avi@redhat.com
Date: Mon, 30 Aug 2010 01:08:23 +0300
Message-Id: <1283119703-9781-1-git-send-email-eduard.munteanu@linux360.ro>
Subject: [Qemu-devel] [PATCH 2/7] pci: memory access API and IOMMU support

PCI devices should access memory through pci_memory_*() instead of
cpu_physical_memory_*(). This also provides support for translation and
access checking in case an IOMMU is emulated.

Memory maps are treated as remote IOTLBs (that is, translation caches
belonging to the IOMMU-aware device itself). Clients (devices) must
provide callbacks for map invalidation in case these maps are
persistent beyond the current I/O context, e.g. AIO DMA transfers.

Signed-off-by: Eduard - Gabriel Munteanu
---
 hw/pci.c           |  191 +++++++++++++++++++++++++++++++++++++++++++++++++++-
 hw/pci.h           |   69 +++++++++++++++++++
 hw/pci_internals.h |   12 +++
 qemu-common.h      |    1 +
 4 files changed, 272 insertions(+), 1 deletions(-)

diff --git a/hw/pci.c b/hw/pci.c
index 2dc1577..afcb33c 100644
--- a/hw/pci.c
+++ b/hw/pci.c
@@ -158,6 +158,19 @@ static void pci_device_reset(PCIDevice *dev)
     pci_update_mappings(dev);
 }
 
+static int pci_no_translate(PCIDevice *iommu,
+                            PCIDevice *dev,
+                            pcibus_t addr,
+                            target_phys_addr_t *paddr,
+                            target_phys_addr_t *len,
+                            unsigned perms)
+{
+    *paddr = addr;
+    *len = -1;
+
+    return 0;
+}
+
 static void pci_bus_reset(void *opaque)
 {
     PCIBus *bus = opaque;
@@ -220,7 +233,10 @@ void pci_bus_new_inplace(PCIBus *bus, DeviceState *parent,
 {
     qbus_create_inplace(&bus->qbus, &pci_bus_info, parent, name);
     assert(PCI_FUNC(devfn_min) == 0);
-    bus->devfn_min = devfn_min;
+
+    bus->devfn_min = devfn_min;
+    bus->iommu = NULL;
+    bus->translate = pci_no_translate;
 
     /* host bridge */
     QLIST_INIT(&bus->child);
@@ -1789,3 +1805,176 @@ static char *pcibus_get_dev_path(DeviceState *dev)
 
     return strdup(path);
 }
+void pci_register_iommu(PCIDevice *iommu,
+                        PCITranslateFunc *translate)
+{
+    iommu->bus->iommu = iommu;
+    iommu->bus->translate = translate;
+}
+
+void pci_memory_rw(PCIDevice *dev,
+                   pcibus_t addr,
+                   uint8_t *buf,
+                   pcibus_t len,
+                   int is_write)
+{
+    int err;
+    unsigned perms;
+    PCIDevice *iommu = dev->bus->iommu;
+    target_phys_addr_t paddr, plen;
+
+    perms = is_write ? IOMMU_PERM_WRITE : IOMMU_PERM_READ;
+
+    while (len) {
+        err = dev->bus->translate(iommu, dev, addr, &paddr, &plen, perms);
+        if (err) {
+            return;
+        }
+
+        /* The translation might be valid for larger regions. */
+        if (plen > len) {
+            plen = len;
+        }
+
+        cpu_physical_memory_rw(paddr, buf, plen, is_write);
+
+        len -= plen;
+        addr += plen;
+        buf += plen;
+    }
+}
+
+static void pci_memory_register_map(PCIDevice *dev,
+                                    pcibus_t addr,
+                                    pcibus_t len,
+                                    target_phys_addr_t paddr,
+                                    PCIInvalidateMapFunc *invalidate,
+                                    void *invalidate_opaque)
+{
+    PCIMemoryMap *map;
+
+    map = qemu_malloc(sizeof(PCIMemoryMap));
+    map->addr = addr;
+    map->len = len;
+    map->paddr = paddr;
+    map->invalidate = invalidate;
+    map->invalidate_opaque = invalidate_opaque;
+
+    QLIST_INSERT_HEAD(&dev->memory_maps, map, list);
+}
+
+static void pci_memory_unregister_map(PCIDevice *dev,
+                                      target_phys_addr_t paddr,
+                                      target_phys_addr_t len)
+{
+    PCIMemoryMap *map;
+
+    QLIST_FOREACH(map, &dev->memory_maps, list) {
+        if (map->paddr == paddr && map->len == len) {
+            QLIST_REMOVE(map, list);
+            free(map);
+        }
+    }
+}
+
+void pci_memory_invalidate_range(PCIDevice *dev,
+                                 pcibus_t addr,
+                                 pcibus_t len)
+{
+    PCIMemoryMap *map;
+
+    QLIST_FOREACH(map, &dev->memory_maps, list) {
+        if (ranges_overlap(addr, len, map->addr, map->len)) {
+            map->invalidate(map->invalidate_opaque);
+            QLIST_REMOVE(map, list);
+            free(map);
+        }
+    }
+}
+
+void *pci_memory_map(PCIDevice *dev,
+                     PCIInvalidateMapFunc *cb,
+                     void *opaque,
+                     pcibus_t addr,
+                     target_phys_addr_t *len,
+                     int is_write)
+{
+    int err;
+    unsigned perms;
+    PCIDevice *iommu = dev->bus->iommu;
+    target_phys_addr_t paddr, plen;
+
+    perms = is_write ? IOMMU_PERM_WRITE : IOMMU_PERM_READ;
+
+    plen = *len;
+    err = dev->bus->translate(iommu, dev, addr, &paddr, &plen, perms);
+    if (err) {
+        return NULL;
+    }
+
+    /*
+     * If this is true, the virtual region is contiguous,
+     * but the translated physical region isn't. We just
+     * clamp *len, much like cpu_physical_memory_map() does.
+     */
+    if (plen < *len) {
+        *len = plen;
+    }
+
+    /* We treat maps as remote TLBs to cope with stuff like AIO. */
+    if (cb) {
+        pci_memory_register_map(dev, addr, *len, paddr, cb, opaque);
+    }
+
+    return cpu_physical_memory_map(paddr, len, is_write);
+}
+
+void pci_memory_unmap(PCIDevice *dev,
+                      void *buffer,
+                      target_phys_addr_t len,
+                      int is_write,
+                      target_phys_addr_t access_len)
+{
+    cpu_physical_memory_unmap(buffer, len, is_write, access_len);
+    pci_memory_unregister_map(dev, (target_phys_addr_t) buffer, len);
+}
+
+#define DEFINE_PCI_LD(suffix, size)                                      \
+uint##size##_t pci_ld##suffix(PCIDevice *dev, pcibus_t addr)             \
+{                                                                        \
+    int err;                                                             \
+    target_phys_addr_t paddr, plen;                                      \
+                                                                         \
+    err = dev->bus->translate(dev->bus->iommu, dev,                      \
+                              addr, &paddr, &plen, IOMMU_PERM_READ);     \
+    if (err || (plen < size / 8)) {                                      \
+        return 0;                                                        \
+    }                                                                    \
+                                                                         \
+    return ld##suffix##_phys(paddr);                                     \
+}
+
+#define DEFINE_PCI_ST(suffix, size)                                      \
+void pci_st##suffix(PCIDevice *dev, pcibus_t addr, uint##size##_t val)   \
+{                                                                        \
+    int err;                                                             \
+    target_phys_addr_t paddr, plen;                                      \
+                                                                         \
+    err = dev->bus->translate(dev->bus->iommu, dev,                      \
+                              addr, &paddr, &plen, IOMMU_PERM_WRITE);    \
+    if (err || (plen < size / 8)) {                                      \
+        return;                                                          \
+    }                                                                    \
+                                                                         \
+    st##suffix##_phys(paddr, val);                                       \
+}
+
+DEFINE_PCI_LD(ub, 8)
+DEFINE_PCI_LD(uw, 16)
+DEFINE_PCI_LD(l, 32)
+DEFINE_PCI_LD(q, 64)
+
+DEFINE_PCI_ST(b, 8)
+DEFINE_PCI_ST(w, 16)
+DEFINE_PCI_ST(l, 32)
+DEFINE_PCI_ST(q, 64)
diff --git a/hw/pci.h b/hw/pci.h
index c551f96..c95863a 100644
--- a/hw/pci.h
+++ b/hw/pci.h
@@ -172,6 +172,8 @@ struct PCIDevice {
     char *romfile;
     ram_addr_t rom_offset;
     uint32_t rom_bar;
+
+    QLIST_HEAD(, PCIMemoryMap) memory_maps;
 };
 
 PCIDevice *pci_register_device(PCIBus *bus, const char *name,
@@ -391,4 +393,71 @@ static inline int ranges_overlap(uint64_t first1, uint64_t len1,
     return !(last2 < first1 || last1 < first2);
 }
 
+/*
+ * Memory I/O and PCI IOMMU definitions.
+ */
+
+#define IOMMU_PERM_READ     (1 << 0)
+#define IOMMU_PERM_WRITE    (1 << 1)
+#define IOMMU_PERM_RW       (IOMMU_PERM_READ | IOMMU_PERM_WRITE)
+
+typedef int PCIInvalidateMapFunc(void *opaque);
+typedef int PCITranslateFunc(PCIDevice *iommu,
+                             PCIDevice *dev,
+                             pcibus_t addr,
+                             target_phys_addr_t *paddr,
+                             target_phys_addr_t *len,
+                             unsigned perms);
+
+void pci_memory_rw(PCIDevice *dev,
+                   pcibus_t addr,
+                   uint8_t *buf,
+                   pcibus_t len,
+                   int is_write);
+void *pci_memory_map(PCIDevice *dev,
+                     PCIInvalidateMapFunc *cb,
+                     void *opaque,
+                     pcibus_t addr,
+                     target_phys_addr_t *len,
+                     int is_write);
+void pci_memory_unmap(PCIDevice *dev,
+                      void *buffer,
+                      target_phys_addr_t len,
+                      int is_write,
+                      target_phys_addr_t access_len);
+void pci_register_iommu(PCIDevice *dev, PCITranslateFunc *translate);
+void pci_memory_invalidate_range(PCIDevice *dev, pcibus_t addr, pcibus_t len);
+
+#define DECLARE_PCI_LD(suffix, size)                                     \
+uint##size##_t pci_ld##suffix(PCIDevice *dev, pcibus_t addr);
+
+#define DECLARE_PCI_ST(suffix, size)                                     \
+void pci_st##suffix(PCIDevice *dev, pcibus_t addr, uint##size##_t val);
+
+DECLARE_PCI_LD(ub, 8)
+DECLARE_PCI_LD(uw, 16)
+DECLARE_PCI_LD(l, 32)
+DECLARE_PCI_LD(q, 64)
+
+DECLARE_PCI_ST(b, 8)
+DECLARE_PCI_ST(w, 16)
+DECLARE_PCI_ST(l, 32)
+DECLARE_PCI_ST(q, 64)
+
+static inline void pci_memory_read(PCIDevice *dev,
+                                   pcibus_t addr,
+                                   uint8_t *buf,
+                                   pcibus_t len)
+{
+    pci_memory_rw(dev, addr, buf, len, 0);
+}
+
+static inline void pci_memory_write(PCIDevice *dev,
+                                    pcibus_t addr,
+                                    const uint8_t *buf,
+                                    pcibus_t len)
+{
+    pci_memory_rw(dev, addr, (uint8_t *) buf, len, 1);
+}
+
 #endif
diff --git a/hw/pci_internals.h b/hw/pci_internals.h
index e3c93a3..fb134b9 100644
--- a/hw/pci_internals.h
+++ b/hw/pci_internals.h
@@ -33,6 +33,9 @@ struct PCIBus {
        Keep a count of the number of devices with raised IRQs.  */
     int nirq;
     int *irq_count;
+
+    PCIDevice *iommu;
+    PCITranslateFunc *translate;
 };
 
 struct PCIBridge {
@@ -44,4 +47,13 @@ struct PCIBridge {
     const char *bus_name;
 };
 
+struct PCIMemoryMap {
+    pcibus_t addr;
+    pcibus_t len;
+    target_phys_addr_t paddr;
+    PCIInvalidateMapFunc *invalidate;
+    void *invalidate_opaque;
+    QLIST_ENTRY(PCIMemoryMap) list;
+};
+
 #endif /* QEMU_PCI_INTERNALS_H */
diff --git a/qemu-common.h b/qemu-common.h
index d735235..8b060e8 100644
--- a/qemu-common.h
+++ b/qemu-common.h
@@ -218,6 +218,7 @@ typedef struct SMBusDevice SMBusDevice;
 typedef struct PCIHostState PCIHostState;
 typedef struct PCIExpressHost PCIExpressHost;
 typedef struct PCIBus PCIBus;
+typedef struct PCIMemoryMap PCIMemoryMap;
 typedef struct PCIDevice PCIDevice;
 typedef struct PCIBridge PCIBridge;
 typedef struct SerialState SerialState;
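
Illustration only, not part of the patch: a minimal standalone sketch of the
chunked translate-and-copy loop that pci_memory_rw() performs, assuming a toy
page-granular IOMMU. Every toy_* name, the bus_addr_t/phys_addr_t typedefs and
the 16-byte page size are hypothetical stand-ins so the fragment compiles
outside QEMU. Unlike pci_memory_rw(), which returns void, the sketch returns
the number of bytes transferred so a fault part-way through is observable.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for pcibus_t and target_phys_addr_t. */
typedef uint64_t bus_addr_t;
typedef uint64_t phys_addr_t;

#define TOY_PAGE_SIZE 16u
#define TOY_NUM_PAGES 4u

/* Toy "guest RAM" and a toy page table: bus page i -> physical page. */
static uint8_t toy_ram[TOY_PAGE_SIZE * TOY_NUM_PAGES];
static const unsigned toy_table[TOY_NUM_PAGES] = { 2, 0, 3, 1 };

/*
 * Same contract as the patch's translate callback, minus the device
 * arguments and permissions: return 0 on success, store the translated
 * address in *paddr and the number of bytes the translation covers in *len.
 */
static int toy_translate(bus_addr_t addr, phys_addr_t *paddr, phys_addr_t *len)
{
    bus_addr_t page = addr / TOY_PAGE_SIZE;
    bus_addr_t off = addr % TOY_PAGE_SIZE;

    if (page >= TOY_NUM_PAGES) {
        return -1;                      /* unmapped bus address: fault */
    }
    *paddr = (phys_addr_t)toy_table[page] * TOY_PAGE_SIZE + off;
    *len = TOY_PAGE_SIZE - off;         /* valid up to the end of the page */
    return 0;
}

/* Translate one chunk at a time, clamp it to what is left, then copy. */
static size_t toy_memory_rw(bus_addr_t addr, uint8_t *buf, size_t len,
                            int is_write)
{
    size_t done = 0;

    while (len) {
        phys_addr_t paddr, plen;

        if (toy_translate(addr, &paddr, &plen)) {
            break;                      /* stop at the first fault */
        }
        /* The translation might be valid for a larger region. */
        if (plen > len) {
            plen = len;
        }
        if (is_write) {
            memcpy(&toy_ram[paddr], buf, (size_t)plen);
        } else {
            memcpy(buf, &toy_ram[paddr], (size_t)plen);
        }
        len -= (size_t)plen;
        addr += plen;
        buf += plen;
        done += (size_t)plen;
    }
    return done;
}
```

A 20-byte write to bus address 10 is split into two chunks here: 6 bytes land
in physical page 2 and the remaining 14 in physical page 0, while the caller
sees a single contiguous bus range.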