[qemu,v18,3/5] vfio: Add host side DMA window capabilities

Message ID 1466471645-5396-4-git-send-email-aik@ozlabs.ru
State New

Commit Message

Alexey Kardashevskiy June 21, 2016, 1:14 a.m. UTC
There are going to be multiple IOMMUs per container. This moves
the single host IOMMU parameter set to a list of VFIOHostDMAWindow.

This should cause no behavioral change and will be used later by
the SPAPR TCE IOMMU v2 which will also add a vfio_host_win_del() helper.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
---
Changes:
v18:
* vfio_host_win_add() checks for non-overlapping windows instead of calling
vfio_host_win_lookup() which checks for inclusion
* inlined vfio_host_win_lookup() as I ended up using it just once
* put VFIOHostDMAWindow::max_iova in new line in include/hw/vfio/vfio-common.h

v17:
* vfio_host_win_add() uses vfio_host_win_lookup() for overlap check and
aborts if any found instead of returning an error (as recovery is not
possible anyway)
* hw_error() when overlapped iommu is detected

v16:
* adjusted commit log with changes from v15

v15:
* s/vfio_host_iommu_add/vfio_host_win_add/
* s/VFIOHostIOMMU/VFIOHostDMAWindow/
---
 hw/vfio/common.c              | 60 +++++++++++++++++++++++++++++++------------
 include/hw/vfio/vfio-common.h | 10 ++++++--
 2 files changed, 52 insertions(+), 18 deletions(-)
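
As a rough illustration of the window list described in the commit message and changelog, the standalone sketch below separates the two checks involved: an overlap test when a window is added and a containment test when a region is looked up. It is not the QEMU code. `HostWin`, `win_overlaps()`, `host_win_add()` and `host_win_find()` are hypothetical names, a plain singly linked list stands in for QLIST, and the add function returns false on overlap where the patch calls hw_error().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for VFIOHostDMAWindow; both bounds are inclusive. */
typedef struct HostWin {
    uint64_t min_iova;
    uint64_t max_iova;
    uint64_t iova_pgsizes;
    struct HostWin *next;
} HostWin;

/* True if [a_min, a_max] and [b_min, b_max] share at least one address.
 * Comparing inclusive bounds avoids computing a length, which would wrap
 * to 0 for the full 64-bit range. */
static bool win_overlaps(uint64_t a_min, uint64_t a_max,
                         uint64_t b_min, uint64_t b_max)
{
    return a_min <= b_max && b_min <= a_max;
}

/* Add a window at the head of the list. Returns false, adding nothing,
 * if it overlaps an existing window or allocation fails. */
static bool host_win_add(HostWin **list, uint64_t min_iova,
                         uint64_t max_iova, uint64_t iova_pgsizes)
{
    HostWin *w;

    assert(min_iova <= max_iova);
    for (w = *list; w; w = w->next) {
        if (win_overlaps(w->min_iova, w->max_iova, min_iova, max_iova)) {
            return false;
        }
    }

    w = calloc(1, sizeof(*w));
    if (!w) {
        return false;
    }
    w->min_iova = min_iova;
    w->max_iova = max_iova;
    w->iova_pgsizes = iova_pgsizes;
    w->next = *list;
    *list = w;
    return true;
}

/* Return the window that fully contains [iova, end], or NULL if none does. */
static const HostWin *host_win_find(const HostWin *list,
                                    uint64_t iova, uint64_t end)
{
    for (; list; list = list->next) {
        if (list->min_iova <= iova && end <= list->max_iova) {
            return list;
        }
    }
    return NULL;
}
```

A region that straddles two adjacent windows is contained in neither, so the lookup rejects it even though it overlaps both. That difference is why the add path needs an overlap test and not the inclusion test used for lookups.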

Comments

David Gibson June 21, 2016, 6:50 a.m. UTC | #1
On Tue, Jun 21, 2016 at 11:14:03AM +1000, Alexey Kardashevskiy wrote:
> There are going to be multiple IOMMUs per a container. This moves
> the single host IOMMU parameter set to a list of VFIOHostDMAWindow.
> 
> This should cause no behavioral change and will be used later by
> the SPAPR TCE IOMMU v2 which will also add a vfio_host_win_del() helper.
> 
> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
> Reviewed-by: David Gibson <david@gibson.dropbear.id.au>

Looks ok to me.  Again, Alex, your tree or mine?

One minor point..
[snip]
> @@ -878,17 +908,14 @@ static int vfio_connect_container(VFIOGroup *group, AddressSpace *as)
>           * existing Type1 IOMMUs generally support any IOVA we're
>           * going to actually try in practice.
>           */
> -        container->min_iova = 0;
> -        container->max_iova = (hwaddr)-1;
> -
> -        /* Assume just 4K IOVA page size */
> -        container->iova_pgsizes = 0x1000;
>          info.argsz = sizeof(info);
>          ret = ioctl(fd, VFIO_IOMMU_GET_INFO, &info);
>          /* Ignore errors */
> -        if ((ret == 0) && (info.flags & VFIO_IOMMU_INFO_PGSIZES)) {
> -            container->iova_pgsizes = info.iova_pgsizes;
> +        if (ret || !(info.flags & VFIO_IOMMU_INFO_PGSIZES)) {
> +            /* Assume 4k IOVA page size */
> +            info.iova_pgsizes = 4096;
>          }
> +        vfio_host_win_add(container, 0, (hwaddr)-1, info.iova_pgsizes);

I don't think it needs to hold this patch up, but at some point we
should work out the real range covered by the x86 IOMMU tables and put
that in here.  I'm pretty sure it won't actually be 2^64-1.

Alex Williamson June 22, 2016, 5:03 p.m. UTC | #2
On Tue, 21 Jun 2016 16:50:17 +1000
David Gibson <david@gibson.dropbear.id.au> wrote:

> On Tue, Jun 21, 2016 at 11:14:03AM +1000, Alexey Kardashevskiy wrote:
> > There are going to be multiple IOMMUs per a container. This moves
> > the single host IOMMU parameter set to a list of VFIOHostDMAWindow.
> > 
> > This should cause no behavioral change and will be used later by
> > the SPAPR TCE IOMMU v2 which will also add a vfio_host_win_del() helper.
> > 
> > Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
> > Reviewed-by: David Gibson <david@gibson.dropbear.id.au>  
> 
> Looks ok to me.  Again, Alex, your tree or mine?

I gave the previous patch a nak, it needs a respin, but this one looks
ok.  I don't currently have anything pending that would conflict with
this, afaik, so it's ok with me if you want to pull it through your
tree.  I'll ack the respin.
 
> One minor point..
> [snip]
> > @@ -878,17 +908,14 @@ static int vfio_connect_container(VFIOGroup *group, AddressSpace *as)
> >           * existing Type1 IOMMUs generally support any IOVA we're
> >           * going to actually try in practice.
> >           */
> > -        container->min_iova = 0;
> > -        container->max_iova = (hwaddr)-1;
> > -
> > -        /* Assume just 4K IOVA page size */
> > -        container->iova_pgsizes = 0x1000;
> >          info.argsz = sizeof(info);
> >          ret = ioctl(fd, VFIO_IOMMU_GET_INFO, &info);
> >          /* Ignore errors */
> > -        if ((ret == 0) && (info.flags & VFIO_IOMMU_INFO_PGSIZES)) {
> > -            container->iova_pgsizes = info.iova_pgsizes;
> > +        if (ret || !(info.flags & VFIO_IOMMU_INFO_PGSIZES)) {
> > +            /* Assume 4k IOVA page size */
> > +            info.iova_pgsizes = 4096;
> >          }
> > +        vfio_host_win_add(container, 0, (hwaddr)-1, info.iova_pgsizes);  
> 
> I don't think it needs to hold this patch up, but at some point we
> should work out the real range covered by the x86 IOMMU tables and put
> that in here.  I'm pretty sure it won't actually be 2^64-1.

Between this patch, some work that Eric is doing that would allow us to
exclude the MSI range, and the capability chains that we can add to the
IOMMU_GET_INFO ioctl to describe both the extent and the reserved MSI
area, I think we're getting close to being able to do that.  On AMD I
think we do have a full 64bit address space, but VT-d is definitely
not.  Thanks,

Alex
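
To make the point about the real IOVA range concrete: if the host IOMMU's address width were known, the max_iova passed to vfio_host_win_add() could be derived from it in place of (hwaddr)-1. The sketch below is only an assumption about how that might look. `iova_max_for_width()` is a hypothetical helper, and the thread does not say what width VT-d actually provides or how QEMU would obtain it.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: highest IOVA addressable with the given number of
 * address bits. The 64-bit case is handled separately because shifting a
 * 64-bit value by 64 is undefined behaviour in C. */
static uint64_t iova_max_for_width(unsigned bits)
{
    assert(bits >= 1 && bits <= 64);
    return bits == 64 ? UINT64_MAX : (UINT64_C(1) << bits) - 1;
}
```

With a 64-bit width this returns the same value as (hwaddr)-1, so the current behaviour is the special case of a full address space.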

Patch

diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 22be48b..b53a1db 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -28,6 +28,7 @@ 
 #include "exec/memory.h"
 #include "hw/hw.h"
 #include "qemu/error-report.h"
+#include "qemu/range.h"
 #include "sysemu/kvm.h"
 #ifdef CONFIG_KVM
 #include "linux/kvm.h"
@@ -241,6 +242,29 @@  static int vfio_dma_map(VFIOContainer *container, hwaddr iova,
     return -errno;
 }
 
+static void vfio_host_win_add(VFIOContainer *container,
+                              hwaddr min_iova, hwaddr max_iova,
+                              uint64_t iova_pgsizes)
+{
+    VFIOHostDMAWindow *hostwin;
+
+    QLIST_FOREACH(hostwin, &container->hostwin_list, hostwin_next) {
+        if (ranges_overlap(hostwin->min_iova,
+                           hostwin->max_iova - hostwin->min_iova + 1,
+                           min_iova,
+                           max_iova - min_iova + 1)) {
+            hw_error("%s: Overlapping IOMMU windows are not supported", __func__);
+        }
+    }
+
+    hostwin = g_malloc0(sizeof(*hostwin));
+
+    hostwin->min_iova = min_iova;
+    hostwin->max_iova = max_iova;
+    hostwin->iova_pgsizes = iova_pgsizes;
+    QLIST_INSERT_HEAD(&container->hostwin_list, hostwin, hostwin_next);
+}
+
 static bool vfio_listener_skipped_section(MemoryRegionSection *section)
 {
     return (!memory_region_is_ram(section->mr) &&
@@ -329,6 +353,8 @@  static void vfio_listener_region_add(MemoryListener *listener,
     Int128 llend, llsize;
     void *vaddr;
     int ret;
+    VFIOHostDMAWindow *hostwin;
+    bool hostwin_found;
 
     if (vfio_listener_skipped_section(section)) {
         trace_vfio_listener_region_add_skip(
@@ -354,7 +380,15 @@  static void vfio_listener_region_add(MemoryListener *listener,
     }
     end = int128_get64(int128_sub(llend, int128_one()));
 
-    if ((iova < container->min_iova) || (end > container->max_iova)) {
+    hostwin_found = false;
+    QLIST_FOREACH(hostwin, &container->hostwin_list, hostwin_next) {
+        if (hostwin->min_iova <= iova && end <= hostwin->max_iova) {
+            hostwin_found = true;
+            break;
+        }
+    }
+
+    if (!hostwin_found) {
         error_report("vfio: IOMMU container %p can't map guest IOVA region"
                      " 0x%"HWADDR_PRIx"..0x%"HWADDR_PRIx,
                      container, iova, end);
@@ -369,10 +403,6 @@  static void vfio_listener_region_add(MemoryListener *listener,
 
         trace_vfio_listener_region_add_iommu(iova, end);
         /*
-         * FIXME: We should do some checking to see if the
-         * capabilities of the host VFIO IOMMU are adequate to model
-         * the guest IOMMU
-         *
          * FIXME: For VFIO iommu types which have KVM acceleration to
          * avoid bouncing all map/unmaps through qemu this way, this
          * would be the right place to wire that up (tell the KVM
@@ -878,17 +908,14 @@  static int vfio_connect_container(VFIOGroup *group, AddressSpace *as)
          * existing Type1 IOMMUs generally support any IOVA we're
          * going to actually try in practice.
          */
-        container->min_iova = 0;
-        container->max_iova = (hwaddr)-1;
-
-        /* Assume just 4K IOVA page size */
-        container->iova_pgsizes = 0x1000;
         info.argsz = sizeof(info);
         ret = ioctl(fd, VFIO_IOMMU_GET_INFO, &info);
         /* Ignore errors */
-        if ((ret == 0) && (info.flags & VFIO_IOMMU_INFO_PGSIZES)) {
-            container->iova_pgsizes = info.iova_pgsizes;
+        if (ret || !(info.flags & VFIO_IOMMU_INFO_PGSIZES)) {
+            /* Assume 4k IOVA page size */
+            info.iova_pgsizes = 4096;
         }
+        vfio_host_win_add(container, 0, (hwaddr)-1, info.iova_pgsizes);
     } else if (ioctl(fd, VFIO_CHECK_EXTENSION, VFIO_SPAPR_TCE_IOMMU) ||
                ioctl(fd, VFIO_CHECK_EXTENSION, VFIO_SPAPR_TCE_v2_IOMMU)) {
         struct vfio_iommu_spapr_tce_info info;
@@ -948,11 +975,12 @@  static int vfio_connect_container(VFIOGroup *group, AddressSpace *as)
             }
             goto listener_release_exit;
         }
-        container->min_iova = info.dma32_window_start;
-        container->max_iova = container->min_iova + info.dma32_window_size - 1;
 
-        /* Assume just 4K IOVA pages for now */
-        container->iova_pgsizes = 0x1000;
+        /* The default table uses 4K pages */
+        vfio_host_win_add(container, info.dma32_window_start,
+                          info.dma32_window_start +
+                          info.dma32_window_size - 1,
+                          0x1000);
     } else {
         error_report("vfio: No available IOMMU models");
         ret = -EINVAL;
diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
index 405c3b2..b1f3e92 100644
--- a/include/hw/vfio/vfio-common.h
+++ b/include/hw/vfio/vfio-common.h
@@ -82,9 +82,8 @@  typedef struct VFIOContainer {
      * contiguous IOVA window.  We may need to generalize that in
      * future
      */
-    hwaddr min_iova, max_iova;
-    uint64_t iova_pgsizes;
     QLIST_HEAD(, VFIOGuestIOMMU) giommu_list;
+    QLIST_HEAD(, VFIOHostDMAWindow) hostwin_list;
     QLIST_HEAD(, VFIOGroup) group_list;
     QLIST_ENTRY(VFIOContainer) next;
 } VFIOContainer;
@@ -97,6 +96,13 @@  typedef struct VFIOGuestIOMMU {
     QLIST_ENTRY(VFIOGuestIOMMU) giommu_next;
 } VFIOGuestIOMMU;
 
+typedef struct VFIOHostDMAWindow {
+    hwaddr min_iova;
+    hwaddr max_iova;
+    uint64_t iova_pgsizes;
+    QLIST_ENTRY(VFIOHostDMAWindow) hostwin_next;
+} VFIOHostDMAWindow;
+
 typedef struct VFIODeviceOps VFIODeviceOps;
 
 typedef struct VFIODevice {