[1/3] exec: add page_mask for address_space_do_translate

Message ID 1496404254-17429-2-git-send-email-peterx@redhat.com
State New

Commit Message

Peter Xu June 2, 2017, 11:50 a.m. UTC
The function was originally written for address_space_translate(), where
what we care about most is the (xlat, plen) range. For IOTLB requests,
however, we don't really care about "plen" but about the size of the
page that "xlat" is located on, and plen cannot carry that information.

A simple example to show why "plen" is not good for IOTLB translations:

E.g., with huge pages, the guest may have mapped a 1G huge page on the
device side that covers this GPA range:

  0x100000000 - 0x13fffffff

Then let's say we want to translate one IOVA that is finally mapped to
GPA 0x13ffffe00 (which is located on this 1G huge page). Here we'll
get:

  (xlat, plen) = (0x13ffffe00, 0x200)

So the IOTLB entry would cover only a very small range, since from
"plen" (which is 0x200 bytes) we cannot tell the size of the page.

Actually we do know that this is a huge page - we just throw the
information away in address_space_do_translate().
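
The arithmetic in this example can be checked with a small standalone
sketch. It is not QEMU code: hwaddr is re-declared locally as a 64-bit
integer, clamp_plen_to_page() and page_size_from_mask() are hypothetical
helpers, and the 1G mask 0x3fffffff is assumed from the huge page size.
Only the clamp expression itself mirrors the MIN() line used in
address_space_do_translate().

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t hwaddr;   /* local stand-in for QEMU's hwaddr */

/*
 * Hypothetical helper: clamp a requested length so that it does not
 * cross the end of the IOMMU page that contains addr.  addr_mask has
 * the low "offset" bits set (e.g. 0x3fffffff for a 1G page).
 */
static hwaddr clamp_plen_to_page(hwaddr addr, hwaddr addr_mask, hwaddr plen)
{
    hwaddr to_page_end;

    /* The mask is expected to have the form 2^n - 1. */
    assert((addr_mask & (addr_mask + 1)) == 0);

    /* Bytes from addr up to and including the last byte of the page. */
    to_page_end = (addr | addr_mask) - addr + 1;
    return plen < to_page_end ? plen : to_page_end;
}

/* Hypothetical helper: size of the page described by addr_mask. */
static hwaddr page_size_from_mask(hwaddr addr_mask)
{
    return addr_mask + 1;
}
```

With addr = 0x13ffffe00, addr_mask = 0x3fffffff and a requested length
of 0x1000 (an arbitrary value chosen for illustration), the clamp gives
0x200, while the mask alone still says the page is 0x40000000 bytes
(1G). That second value is what plen cannot express.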

This patch introduces an optional "page_mask" parameter to capture that
page mask info. Also, I made "plen" an optional parameter as well, and
added some comments for the whole function.

No functional change yet.

Signed-off-by: Peter Xu <peterx@redhat.com>
---
 exec.c | 46 ++++++++++++++++++++++++++++++++++++++++------
 1 file changed, 40 insertions(+), 6 deletions(-)

Comments

Michael S. Tsirkin June 2, 2017, 4:45 p.m. UTC | #1
On Fri, Jun 02, 2017 at 07:50:52PM +0800, Peter Xu wrote:
> The function is originally used for address_space_translate() and what
> we care about most is (xlat, plen) range. However for iotlb requests, we
> don't really care about "plen", but the size of the page that "xlat" is
> located on. While, plen cannot really contain this information.
> 
> A simple example to show why "plen" is not good for IOTLB translations:
> 
> E.g., for huge pages, it is possible that guest mapped 1G huge page on
> device side that used this GPA range:
> 
>   0x100000000 - 0x13fffffff
> 
> Then let's say we want to translate one IOVA that finally mapped to GPA
> 0x13ffffe00 (which is located on this 1G huge page). Then here we'll
> get:
> 
>   (xlat, plen) = (0x13fffe00, 0x200)
> 
> So the IOTLB would be only covering a very small range since from
> "plen" (which is 0x200 bytes) we cannot tell the size of the page.
> 
> Actually we can really know that this is a huge page - we just throw the
> information away in address_space_do_translate().
> 
> This patch introduced "page_mask" optional parameter to capture that
> page mask info. Also, I made "plen" an optional parameter as well, with
> some comments for the whole function.
> 
> No functional change yet.
> 
> Signed-off-by: Peter Xu <peterx@redhat.com>
> ---
>  exec.c | 46 ++++++++++++++++++++++++++++++++++++++++------
>  1 file changed, 40 insertions(+), 6 deletions(-)
> 
> diff --git a/exec.c b/exec.c
> index 8fc0e78..63a3ff0 100644
> --- a/exec.c
> +++ b/exec.c
> @@ -465,21 +465,45 @@ address_space_translate_internal(AddressSpaceDispatch *d, hwaddr addr, hwaddr *x
>      return section;
>  }
>  
> -/* Called from RCU critical section */
> +/**
> + * address_space_do_translate - translate an address in AddressSpace
> + *
> + * @as: the address space that we want to translate on
> + * @addr: the address to be translated in above address space
> + * @xlat: the translated address offset within memory region. It
> + *        cannot be @NULL.
> + * @plen_out: valid read/write length of the translated address. It
> + *            can be @NULL when we don't care about it.
> + * @page_mask_out: page mask for the translated address. This
> + *            should only be meaningful for IOMMU translated
> + *            addresses, since there may be huge pages that this bit
> + *            would tell. It can be @NULL if we don't care about it.

Why do we need plen or mask at all? It seems MemoryRegionSection
has address and length already. So if you want to find out
distance to section end, do section.size - xlat and you are done.


> + * @is_write: whether the translation operation is for write
> + * @is_mmio: whether this can be MMIO, set true if it can
> + *
> + * This function is called from RCU critical section
> + */
>  static MemoryRegionSection address_space_do_translate(AddressSpace *as,
>                                                        hwaddr addr,
>                                                        hwaddr *xlat,
> -                                                      hwaddr *plen,
> +                                                      hwaddr *plen_out,
> +                                                      hwaddr *page_mask_out,
>                                                        bool is_write,
>                                                        bool is_mmio)
>  {
>      IOMMUTLBEntry iotlb;
>      MemoryRegionSection *section;
>      MemoryRegion *mr;
> +    hwaddr page_mask = TARGET_PAGE_MASK;
> +    hwaddr plen = (hwaddr)(-1);
> +
> +    if (plen_out) {
> +        plen = *plen_out;
> +    }
>  
>      for (;;) {
>          AddressSpaceDispatch *d = atomic_rcu_read(&as->dispatch);
> -        section = address_space_translate_internal(d, addr, &addr, plen, is_mmio);
> +        section = address_space_translate_internal(d, addr, &addr, &plen, is_mmio);
>          mr = section->mr;
>  
>          if (!mr->iommu_ops) {
> @@ -490,7 +514,8 @@ static MemoryRegionSection address_space_do_translate(AddressSpace *as,
>                                           IOMMU_WO : IOMMU_RO);
>          addr = ((iotlb.translated_addr & ~iotlb.addr_mask)
>                  | (addr & iotlb.addr_mask));
> -        *plen = MIN(*plen, (addr | iotlb.addr_mask) - addr + 1);
> +        page_mask = iotlb.addr_mask;
> +        plen = MIN(plen, (addr | iotlb.addr_mask) - addr + 1);
>          if (!(iotlb.perm & (1 << is_write))) {
>              goto translate_fail;
>          }
> @@ -500,6 +525,14 @@ static MemoryRegionSection address_space_do_translate(AddressSpace *as,
>  
>      *xlat = addr;
>  
> +    if (page_mask_out) {
> +        *page_mask_out = page_mask;
> +    }
> +
> +    if (plen_out) {
> +        *plen_out = plen;
> +    }
> +
>      return *section;
>  
>  translate_fail:
> @@ -518,7 +551,7 @@ IOMMUTLBEntry address_space_get_iotlb_entry(AddressSpace *as, hwaddr addr,
>  
>      /* This can never be MMIO. */
>      section = address_space_do_translate(as, addr, &xlat, &plen,
> -                                         is_write, false);
> +                                         NULL, is_write, false);
>  
>      /* Illegal translation */
>      if (section.mr == &io_mem_unassigned) {
> @@ -560,7 +593,8 @@ MemoryRegion *address_space_translate(AddressSpace *as, hwaddr addr,
>      MemoryRegionSection section;
>  
>      /* This can be MMIO, so setup MMIO bit. */
> -    section = address_space_do_translate(as, addr, xlat, plen, is_write, true);
> +    section = address_space_do_translate(as, addr, xlat, plen,
> +                                         NULL, is_write, true);
>      mr = section.mr;
>  
>      if (xen_enabled() && memory_access_is_direct(mr, is_write)) {
> -- 
> 2.7.4
Peter Xu June 5, 2017, 2:52 a.m. UTC | #2
On Fri, Jun 02, 2017 at 07:45:05PM +0300, Michael S. Tsirkin wrote:
> On Fri, Jun 02, 2017 at 07:50:52PM +0800, Peter Xu wrote:
> > The function is originally used for address_space_translate() and what
> > we care about most is (xlat, plen) range. However for iotlb requests, we
> > don't really care about "plen", but the size of the page that "xlat" is
> > located on. While, plen cannot really contain this information.
> > 
> > A simple example to show why "plen" is not good for IOTLB translations:
> > 
> > E.g., for huge pages, it is possible that guest mapped 1G huge page on
> > device side that used this GPA range:
> > 
> >   0x100000000 - 0x13fffffff
> > 
> > Then let's say we want to translate one IOVA that finally mapped to GPA
> > 0x13ffffe00 (which is located on this 1G huge page). Then here we'll
> > get:
> > 
> >   (xlat, plen) = (0x13fffe00, 0x200)
> > 
> > So the IOTLB would be only covering a very small range since from
> > "plen" (which is 0x200 bytes) we cannot tell the size of the page.
> > 
> > Actually we can really know that this is a huge page - we just throw the
> > information away in address_space_do_translate().
> > 
> > This patch introduced "page_mask" optional parameter to capture that
> > page mask info. Also, I made "plen" an optional parameter as well, with
> > some comments for the whole function.
> > 
> > No functional change yet.
> > 
> > Signed-off-by: Peter Xu <peterx@redhat.com>
> > ---
> >  exec.c | 46 ++++++++++++++++++++++++++++++++++++++++------
> >  1 file changed, 40 insertions(+), 6 deletions(-)
> > 
> > diff --git a/exec.c b/exec.c
> > index 8fc0e78..63a3ff0 100644
> > --- a/exec.c
> > +++ b/exec.c
> > @@ -465,21 +465,45 @@ address_space_translate_internal(AddressSpaceDispatch *d, hwaddr addr, hwaddr *x
> >      return section;
> >  }
> >  
> > -/* Called from RCU critical section */
> > +/**
> > + * address_space_do_translate - translate an address in AddressSpace
> > + *
> > + * @as: the address space that we want to translate on
> > + * @addr: the address to be translated in above address space
> > + * @xlat: the translated address offset within memory region. It
> > + *        cannot be @NULL.
> > + * @plen_out: valid read/write length of the translated address. It
> > + *            can be @NULL when we don't care about it.
> > + * @page_mask_out: page mask for the translated address. This
> > + *            should only be meaningful for IOMMU translated
> > + *            addresses, since there may be huge pages that this bit
> > + *            would tell. It can be @NULL if we don't care about it.
> 
> Why do we need plen or mask at all? It seems MemoryRegionSection
> has address and length already. So if you want to find out
> distance to section end, do section.size - xlat and you are done.

Hi, Michael,

When you say:

  section.size - xlat

Do you really mean this?

  section.offset_within_address_space + section.size - xlat

Otherwise it does not make much sense to me.

Anyway, I don't know whether it would be okay to remove the plen...

In address_space_do_translate(), the logic is basically:

1. do internal translation (basically to find the section info from
   current address space)
2. do IOMMU translation if the MR is IOMMU typed
3. goto 1.

Along the way (1 -> 2 -> 3 -> 1 -> ...), until the translation has
finished (I don't really know whether we'll have cases of nested
IOMMU translation, but we do have a loop there, so assume the loop
can be executed many times), plen can keep shrinking, either by this
in address_space_translate_internal():

    *plen = int128_get64(int128_min(diff, int128_make64(*plen)));

Or this in address_space_do_translate():

    *plen = MIN(*plen, (addr | iotlb.addr_mask) - addr + 1);

And I don't know whether using only the final section.size to decide
plen would be enough.
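
The shrinking described above can be modelled with a toy loop. This is
only a sketch under assumptions: struct stage, min_hwaddr() and
translate_len() are hypothetical names, and each stage stands in for one
pass of "section lookup, then IOMMU translation". Only the two clamp
steps follow the expressions quoted above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t hwaddr;   /* local stand-in for QEMU's hwaddr */

/* Hypothetical description of one pass through the loop. */
struct stage {
    hwaddr section_remaining; /* bytes left until the section ends ("diff") */
    hwaddr translated_addr;   /* page-aligned address the IOMMU returns */
    hwaddr addr_mask;         /* 2^n - 1 mask of the IOMMU page */
};

static hwaddr min_hwaddr(hwaddr a, hwaddr b)
{
    return a < b ? a : b;
}

/* Run addr through every stage; return the surviving length. */
static hwaddr translate_len(const struct stage *stages, size_t n,
                            hwaddr addr, hwaddr plen, hwaddr *xlat)
{
    assert(xlat != NULL);

    for (size_t i = 0; i < n; i++) {
        const struct stage *s = &stages[i];

        /* Step 1: clamp to the end of the section. */
        plen = min_hwaddr(plen, s->section_remaining);

        /* Step 2: IOMMU translation keeps the offset within the page... */
        addr = (s->translated_addr & ~s->addr_mask) | (addr & s->addr_mask);
        /* ...and clamps the length to the end of that page. */
        plen = min_hwaddr(plen, (addr | s->addr_mask) - addr + 1);
    }

    *xlat = addr;
    return plen;
}
```

For instance, take a first stage with a 4K page (addr_mask 0xfff) and a
second stage with a 1G page and 0x800 bytes left in its section. An
access at 0x1e00 with length 0x4000 is already cut to 0x200 by the first
stage, so the final section size alone (0x800 here) would not give the
right plen.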

Also, for the page_mask information - I'm not quite sure
MemoryRegionSection can express that info. Again, huge pages are one
example: MemoryRegionSection doesn't really contain huge page
information, while MemoryRegionIOMMUOps.translate() does contain that
information (via the addr_mask field).

(I see that you would like the IOTLB to use arbitrary lengths rather
 than page masks. Maybe we can first decide which would be the best
 interface for IOTLB. I'll reply in that context later.)

Thanks,
Patch

diff --git a/exec.c b/exec.c
index 8fc0e78..63a3ff0 100644
--- a/exec.c
+++ b/exec.c
@@ -465,21 +465,45 @@  address_space_translate_internal(AddressSpaceDispatch *d, hwaddr addr, hwaddr *x
     return section;
 }
 
-/* Called from RCU critical section */
+/**
+ * address_space_do_translate - translate an address in AddressSpace
+ *
+ * @as: the address space that we want to translate on
+ * @addr: the address to be translated in above address space
+ * @xlat: the translated address offset within memory region. It
+ *        cannot be @NULL.
+ * @plen_out: valid read/write length of the translated address. It
+ *            can be @NULL when we don't care about it.
+ * @page_mask_out: page mask for the translated address. This
+ *            should only be meaningful for IOMMU translated
+ *            addresses, since there may be huge pages that this bit
+ *            would tell. It can be @NULL if we don't care about it.
+ * @is_write: whether the translation operation is for write
+ * @is_mmio: whether this can be MMIO, set true if it can
+ *
+ * This function is called from RCU critical section
+ */
 static MemoryRegionSection address_space_do_translate(AddressSpace *as,
                                                       hwaddr addr,
                                                       hwaddr *xlat,
-                                                      hwaddr *plen,
+                                                      hwaddr *plen_out,
+                                                      hwaddr *page_mask_out,
                                                       bool is_write,
                                                       bool is_mmio)
 {
     IOMMUTLBEntry iotlb;
     MemoryRegionSection *section;
     MemoryRegion *mr;
+    hwaddr page_mask = TARGET_PAGE_MASK;
+    hwaddr plen = (hwaddr)(-1);
+
+    if (plen_out) {
+        plen = *plen_out;
+    }
 
     for (;;) {
         AddressSpaceDispatch *d = atomic_rcu_read(&as->dispatch);
-        section = address_space_translate_internal(d, addr, &addr, plen, is_mmio);
+        section = address_space_translate_internal(d, addr, &addr, &plen, is_mmio);
         mr = section->mr;
 
         if (!mr->iommu_ops) {
@@ -490,7 +514,8 @@  static MemoryRegionSection address_space_do_translate(AddressSpace *as,
                                          IOMMU_WO : IOMMU_RO);
         addr = ((iotlb.translated_addr & ~iotlb.addr_mask)
                 | (addr & iotlb.addr_mask));
-        *plen = MIN(*plen, (addr | iotlb.addr_mask) - addr + 1);
+        page_mask = iotlb.addr_mask;
+        plen = MIN(plen, (addr | iotlb.addr_mask) - addr + 1);
         if (!(iotlb.perm & (1 << is_write))) {
             goto translate_fail;
         }
@@ -500,6 +525,14 @@  static MemoryRegionSection address_space_do_translate(AddressSpace *as,
 
     *xlat = addr;
 
+    if (page_mask_out) {
+        *page_mask_out = page_mask;
+    }
+
+    if (plen_out) {
+        *plen_out = plen;
+    }
+
     return *section;
 
 translate_fail:
@@ -518,7 +551,7 @@  IOMMUTLBEntry address_space_get_iotlb_entry(AddressSpace *as, hwaddr addr,
 
     /* This can never be MMIO. */
     section = address_space_do_translate(as, addr, &xlat, &plen,
-                                         is_write, false);
+                                         NULL, is_write, false);
 
     /* Illegal translation */
     if (section.mr == &io_mem_unassigned) {
@@ -560,7 +593,8 @@  MemoryRegion *address_space_translate(AddressSpace *as, hwaddr addr,
     MemoryRegionSection section;
 
     /* This can be MMIO, so setup MMIO bit. */
-    section = address_space_do_translate(as, addr, xlat, plen, is_write, true);
+    section = address_space_do_translate(as, addr, xlat, plen,
+                                         NULL, is_write, true);
     mr = section.mr;
 
     if (xen_enabled() && memory_access_is_direct(mr, is_write)) {