diff mbox

[v5,03/18] vfio: allow to notify unmap for very large region

Message ID 1485253571-19058-4-git-send-email-peterx@redhat.com
State New

Commit Message

Peter Xu Jan. 24, 2017, 10:25 a.m. UTC
The Linux vfio driver supports VFIO_IOMMU_UNMAP_DMA on a very big
region. This can be leveraged by QEMU's IOMMU implementation to clean
up existing page mappings for an entire iova address space (by
notifying with an IOTLB entry with an extremely large addr_mask).
However, the current vfio_iommu_map_notify() does not allow that: it
makes sure that the translated address in the IOTLB entry falls into
a RAM range.

The check makes sense, but it is only needed for map operations; it
means little for unmap operations.

This patch moves the check into the map path only, so that unmap
handling becomes faster (no need to translate again), and we can
better support unmapping a very big region even when it covers
non-RAM or nonexistent ranges.

Signed-off-by: Peter Xu <peterx@redhat.com>
---
 hw/vfio/common.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)
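For illustration, here is a minimal, self-contained sketch of the control flow the patch produces. The types and helpers below are simplified stand-ins, not QEMU's real definitions; counters replace the real vfio_dma_map()/vfio_dma_unmap() calls so the two paths can be observed. The point is that translation (vfio_get_vaddr() in the real code) now happens only on the map path, so an unmap notification never translates:

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-ins for QEMU's types -- not the real definitions. */
typedef enum { IOMMU_NONE = 0, IOMMU_RO = 1, IOMMU_WO = 2, IOMMU_RW = 3 } IOMMUAccessFlags;

typedef struct {
    uint64_t iova;
    uint64_t addr_mask;      /* region size is addr_mask + 1 */
    IOMMUAccessFlags perm;   /* IOMMU_NONE means "unmap" */
} FakeIOTLBEntry;

/* Counters so a caller can observe which path ran. */
static int translate_calls;
static int map_calls;
static int unmap_calls;

/* Stand-in for vfio_get_vaddr(): pretend every iova translates fine. */
static bool fake_get_vaddr(const FakeIOTLBEntry *iotlb)
{
    (void)iotlb;
    translate_calls++;
    return true;
}

/* Control flow after the patch: translate only when mapping. */
static void notify(const FakeIOTLBEntry *iotlb)
{
    if ((iotlb->perm & IOMMU_RW) != IOMMU_NONE) {
        if (!fake_get_vaddr(iotlb)) {
            return;
        }
        map_calls++;      /* vfio_dma_map() in the real code */
    } else {
        unmap_calls++;    /* vfio_dma_unmap() in the real code */
    }
}
```

With this shape, an unmap covering the whole address space (addr_mask of all ones) goes straight to the unmap path without any per-page validation.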

Comments

Alex Williamson Jan. 24, 2017, 4:32 p.m. UTC | #1
On Tue, 24 Jan 2017 18:25:56 +0800
Peter Xu <peterx@redhat.com> wrote:

> The Linux vfio driver supports VFIO_IOMMU_UNMAP_DMA on a very big
> region. This can be leveraged by QEMU's IOMMU implementation to clean
> up existing page mappings for an entire iova address space (by
> notifying with an IOTLB entry with an extremely large addr_mask).
> However, the current vfio_iommu_map_notify() does not allow that: it
> makes sure that the translated address in the IOTLB entry falls into
> a RAM range.
> 
> The check makes sense, but it is only needed for map operations; it
> means little for unmap operations.
> 
> This patch moves the check into the map path only, so that unmap
> handling becomes faster (no need to translate again), and we can
> better support unmapping a very big region even when it covers
> non-RAM or nonexistent ranges.
> 
> Signed-off-by: Peter Xu <peterx@redhat.com>
> ---
>  hw/vfio/common.c | 7 +++----
>  1 file changed, 3 insertions(+), 4 deletions(-)
> 
> diff --git a/hw/vfio/common.c b/hw/vfio/common.c
> index ce55dff..4d90844 100644
> --- a/hw/vfio/common.c
> +++ b/hw/vfio/common.c
> @@ -354,11 +354,10 @@ static void vfio_iommu_map_notify(IOMMUNotifier *n, IOMMUTLBEntry *iotlb)
>          return;
>      }
>  
> -    if (!vfio_get_vaddr(iotlb, &vaddr, &read_only)) {
> -        return;
> -    }
> -
>      if ((iotlb->perm & IOMMU_RW) != IOMMU_NONE) {
> +        if (!vfio_get_vaddr(iotlb, &vaddr, &read_only)) {
> +            return;
> +        }


David, is SPAPR going to freak out if it sees unmaps to ranges that
extend beyond individual mappings, or perhaps include no mappings?
This effectively allows unmapping the entire address space of the iommu
in one pass, without validating or translating the backing.

>          ret = vfio_dma_map(container, iova,
>                             iotlb->addr_mask + 1, vaddr,
>                             read_only);
David Gibson Jan. 31, 2017, 3:35 a.m. UTC | #2
On Tue, Jan 24, 2017 at 09:32:07AM -0700, Alex Williamson wrote:
> On Tue, 24 Jan 2017 18:25:56 +0800
> Peter Xu <peterx@redhat.com> wrote:
> 
> > The Linux vfio driver supports VFIO_IOMMU_UNMAP_DMA on a very big
> > region. This can be leveraged by QEMU's IOMMU implementation to clean
> > up existing page mappings for an entire iova address space (by
> > notifying with an IOTLB entry with an extremely large addr_mask).
> > However, the current vfio_iommu_map_notify() does not allow that: it
> > makes sure that the translated address in the IOTLB entry falls into
> > a RAM range.
> > 
> > The check makes sense, but it is only needed for map operations; it
> > means little for unmap operations.
> > 
> > This patch moves the check into the map path only, so that unmap
> > handling becomes faster (no need to translate again), and we can
> > better support unmapping a very big region even when it covers
> > non-RAM or nonexistent ranges.
> > 
> > Signed-off-by: Peter Xu <peterx@redhat.com>
> > ---
> >  hw/vfio/common.c | 7 +++----
> >  1 file changed, 3 insertions(+), 4 deletions(-)
> > 
> > diff --git a/hw/vfio/common.c b/hw/vfio/common.c
> > index ce55dff..4d90844 100644
> > --- a/hw/vfio/common.c
> > +++ b/hw/vfio/common.c
> > @@ -354,11 +354,10 @@ static void vfio_iommu_map_notify(IOMMUNotifier *n, IOMMUTLBEntry *iotlb)
> >          return;
> >      }
> >  
> > -    if (!vfio_get_vaddr(iotlb, &vaddr, &read_only)) {
> > -        return;
> > -    }
> > -
> >      if ((iotlb->perm & IOMMU_RW) != IOMMU_NONE) {
> > +        if (!vfio_get_vaddr(iotlb, &vaddr, &read_only)) {
> > +            return;
> > +        }
> 
> 
> David, is SPAPR going to freak out if it sees unmaps to ranges that
> extend beyond individual mappings, or perhaps include no mappings?
> This effectively allows unmapping the entire address space of the iommu
> in one pass, without validating or translating the backing.

Extending beyond an individual mapping will be fine.  However, if the
unmap extends beyond the extent of IOVAs mapped by a single TCE table,
then the unmap will fail (with ENXIO or EINVAL depending on whether
there's a problem with origin or only size).
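A sketch of the rule described here, for concreteness: an unmap must stay inside the IOVA window covered by a single TCE table, failing with ENXIO when the origin is outside the window and EINVAL when only the size overruns it. The struct and function names below are illustrative, not the kernel's:

```c
#include <stdint.h>
#include <errno.h>

/* Illustrative model of a single TCE table's IOVA window. */
typedef struct {
    uint64_t start;   /* first IOVA covered by the table */
    uint64_t size;    /* bytes covered by the table */
} TceWindow;

/* Validate an unmap request against one window, mirroring the
 * error split described above: ENXIO for a bad origin, EINVAL
 * when the origin is fine but the size runs past the table. */
static int check_unmap_range(const TceWindow *w, uint64_t iova, uint64_t size)
{
    if (iova < w->start || iova >= w->start + w->size) {
        return -ENXIO;    /* origin outside the table */
    }
    if (size > w->start + w->size - iova) {
        return -EINVAL;   /* size extends beyond the table */
    }
    return 0;
}
```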

With holidays I've lost the context of this thread, so I can't easily
find the whole patch series this relates to.  From your brief
description above it sounds like it should be ok - a range covering
just the IOVA space (as long as that's actually correct for spapr tce)
should be ok.

> 
> >          ret = vfio_dma_map(container, iova,
> >                             iotlb->addr_mask + 1, vaddr,
> >                             read_only);
>
Peter Xu Feb. 3, 2017, 7:30 a.m. UTC | #3
On Tue, Jan 31, 2017 at 02:35:00PM +1100, David Gibson wrote:
> On Tue, Jan 24, 2017 at 09:32:07AM -0700, Alex Williamson wrote:
> > On Tue, 24 Jan 2017 18:25:56 +0800
> > Peter Xu <peterx@redhat.com> wrote:
> > 
> > > The Linux vfio driver supports VFIO_IOMMU_UNMAP_DMA on a very big
> > > region. This can be leveraged by QEMU's IOMMU implementation to clean
> > > up existing page mappings for an entire iova address space (by
> > > notifying with an IOTLB entry with an extremely large addr_mask).
> > > However, the current vfio_iommu_map_notify() does not allow that: it
> > > makes sure that the translated address in the IOTLB entry falls into
> > > a RAM range.
> > > 
> > > The check makes sense, but it is only needed for map operations; it
> > > means little for unmap operations.
> > > 
> > > This patch moves the check into the map path only, so that unmap
> > > handling becomes faster (no need to translate again), and we can
> > > better support unmapping a very big region even when it covers
> > > non-RAM or nonexistent ranges.
> > > 
> > > Signed-off-by: Peter Xu <peterx@redhat.com>
> > > ---
> > >  hw/vfio/common.c | 7 +++----
> > >  1 file changed, 3 insertions(+), 4 deletions(-)
> > > 
> > > diff --git a/hw/vfio/common.c b/hw/vfio/common.c
> > > index ce55dff..4d90844 100644
> > > --- a/hw/vfio/common.c
> > > +++ b/hw/vfio/common.c
> > > @@ -354,11 +354,10 @@ static void vfio_iommu_map_notify(IOMMUNotifier *n, IOMMUTLBEntry *iotlb)
> > >          return;
> > >      }
> > >  
> > > -    if (!vfio_get_vaddr(iotlb, &vaddr, &read_only)) {
> > > -        return;
> > > -    }
> > > -
> > >      if ((iotlb->perm & IOMMU_RW) != IOMMU_NONE) {
> > > +        if (!vfio_get_vaddr(iotlb, &vaddr, &read_only)) {
> > > +            return;
> > > +        }
> > 
> > 
> > David, is SPAPR going to freak out if it sees unmaps to ranges that
> > extend beyond individual mappings, or perhaps include no mappings?
> > This effectively allows unmapping the entire address space of the iommu
> > in one pass, without validating or translating the backing.
> 
> Extending beyond an individual mapping will be fine.  However, if the
> unmap extends beyond the extent of IOVAs mapped by a single TCE table,
> then the unmap will fail (with ENXIO or EINVAL depending on whether
> there's a problem with origin or only size).
> 
> With holidays I've lost the context of this thread, so I can't easily
> find the whole patch series this relates to.  From your brief
> description above it sounds like it should be ok - a range covering
> just the IOVA space (as long as that's actually correct for spapr tce)
> should be ok.

Thanks for the confirmation! Then I'll move ahead for the next spin.

Thanks,

-- peterx

Patch

diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index ce55dff..4d90844 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -354,11 +354,10 @@ static void vfio_iommu_map_notify(IOMMUNotifier *n, IOMMUTLBEntry *iotlb)
         return;
     }
 
-    if (!vfio_get_vaddr(iotlb, &vaddr, &read_only)) {
-        return;
-    }
-
     if ((iotlb->perm & IOMMU_RW) != IOMMU_NONE) {
+        if (!vfio_get_vaddr(iotlb, &vaddr, &read_only)) {
+            return;
+        }
         ret = vfio_dma_map(container, iova,
                            iotlb->addr_mask + 1, vaddr,
                            read_only);
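With the translation check confined to the map path, a caller can describe an unmap covering an entire power-of-two IOVA space with a single notification. A small sketch of how such an entry's fields relate (the struct and helper names are hypothetical stand-ins, not QEMU's API; the invariant is that addr_mask + 1 equals the size passed to the unmap):

```c
#include <stdint.h>

/* Illustrative stand-in for an IOTLB entry describing a full-space unmap. */
typedef struct {
    uint64_t iova;       /* 0: start of the space */
    uint64_t addr_mask;  /* size - 1, so addr_mask + 1 is the size */
    int perm;            /* 0 == IOMMU_NONE, i.e. an unmap notification */
} FakeIOTLBEntry;

/* addr_mask for a power-of-two IOVA space of addr_bits bits. */
static uint64_t full_space_addr_mask(unsigned addr_bits)
{
    return (addr_bits >= 64) ? UINT64_MAX : ((1ULL << addr_bits) - 1);
}

/* One entry that unmaps the whole space in a single pass. */
static FakeIOTLBEntry full_space_unmap(unsigned addr_bits)
{
    FakeIOTLBEntry e = { 0, full_space_addr_mask(addr_bits), 0 };
    return e;
}
```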