diff mbox series

[2/2] memory: Split zones when do coalesced_io_del()

Message ID 20190817093237.27967-3-peterx@redhat.com
State New
Headers show
Series [1/2] memory: Replace has_coalesced_range with add/del flags | expand

Commit Message

Peter Xu Aug. 17, 2019, 9:32 a.m. UTC
It is a workaround of current KVM's KVM_UNREGISTER_COALESCED_MMIO
interface.  The kernel interface only allows to unregister an mmio
device with exactly the zone size when registered, or any smaller zone
that is included in the device mmio zone.  It does not support the
userspace to specify a very large zone to remove all the small mmio
devices within the zone covered.

Logically speaking it would be nicer to fix this from KVM side, though
in all cases we still need to coop with old kernels so let's do this.

This patch has nothing to do with 3ac7d43a6fbb5d4a3 because this is
probably broken from the very beginning when the
KVM_UNREGISTER_COALESCED_MMIO interface is introduced in kernel.
However to make the backport to stables easier, I'm still using the
commit 3ac7d43a6fbb5d4a3 to track this problem because this will
depend on that otherwise even additions of mmio devices won't work.

Fixes: 3ac7d43a6fbb5d4a3
Signed-off-by: Peter Xu <peterx@redhat.com>
---
 memory.c | 30 +++++++++++++++++++++++++++---
 1 file changed, 27 insertions(+), 3 deletions(-)

Comments

Paolo Bonzini Aug. 19, 2019, 2:24 p.m. UTC | #1
On 17/08/19 11:32, Peter Xu wrote:
> It is a workaround of current KVM's KVM_UNREGISTER_COALESCED_MMIO
> interface.  The kernel interface only allows to unregister an mmio
> device with exactly the zone size when registered, or any smaller zone
> that is included in the device mmio zone.  It does not support the
> userspace to specify a very large zone to remove all the small mmio
> devices within the zone covered.
> 
> Logically speaking it would be nicer to fix this from KVM side, though
> in all cases we still need to coop with old kernels so let's do this.
> 
> This patch has nothing to do with 3ac7d43a6fbb5d4a3 because this is
> probably broken from the very beginning when the
> KVM_UNREGISTER_COALESCED_MMIO interface is introduced in kernel.
> However to make the backport to stables easier, I'm still using the
> commit 3ac7d43a6fbb5d4a3 to track this problem because this will
> depend on that otherwise even additions of mmio devices won't work.
> 
> Fixes: 3ac7d43a6fbb5d4a3
> Signed-off-by: Peter Xu <peterx@redhat.com>
> ---
>  memory.c | 30 +++++++++++++++++++++++++++---
>  1 file changed, 27 insertions(+), 3 deletions(-)

This is still messy because memory_region_add_coalescing and
memory_region_clear_coalescing modify fr->mr->colesced.

It's not hard to fix it, but not trivial either.  Probably it is
sufficient to replace memory_region_update_coalesced_range and
memory_region_update_coalesced_range_as with two pairs:

- memory_region_add_coalesced_range and
memory_region_add_coalesced_range_as, which call a new function
flat_range_coalesced_io_add_one to call the listener only on the
newly-added range (and set coalesced_mmio_add_done).
memory_region_add_coalescing then can call
memory_region_add_coalesced_range_as

- memory_region_clear_coalesced_ranges and
memory_region_clear_coalesced_ranges_as, which call
flat_range_coalesced_io_del.  Now memory_region_clear_coalescing can
call memory_region_clear_coalesced_ranges *before* emptying the list, or
exit immediately if it is empty.

Thanks,

Paolo

> diff --git a/memory.c b/memory.c
> index 1a2b465a96..b24cdd13cf 100644
> --- a/memory.c
> +++ b/memory.c
> @@ -864,6 +864,9 @@ static void address_space_update_ioeventfds(AddressSpace *as)
>  
>  static void flat_range_coalesced_io_del(FlatRange *fr, AddressSpace *as)
>  {
> +    CoalescedMemoryRange *cmr;
> +    AddrRange tmp;
> +
>      if (QTAILQ_EMPTY(&fr->mr->coalesced)) {
>          return;
>      }
> @@ -874,9 +877,30 @@ static void flat_range_coalesced_io_del(FlatRange *fr, AddressSpace *as)
>  
>      fr->coalesced_mmio_del_done = true;
>  
> -    MEMORY_LISTENER_UPDATE_REGION(fr, as, Reverse, coalesced_io_del,
> -                                  int128_get64(fr->addr.start),
> -                                  int128_get64(fr->addr.size));
> +    /*
> +     * We split the big region into smaller ones to satisfy KVM's
> +     * KVM_UNREGISTER_COALESCED_MMIO interface, where it does not
> +     * allow to specify a large region to unregister all the devices
> +     * under that zone instead it only accepts exact zones or even a
> +     * smaller zone of previously registered mmio device.  Logically
> +     * speaking we should better fix KVM to allow the userspace to
> +     * unregister multiple mmio devices within a large requested zone,
> +     * but in all cases we'll still need to live with old kernels.  So
> +     * let's simply break the zones into exactly the small pieces when
> +     * we do coalesced_io_add().
> +     */
> +    QTAILQ_FOREACH(cmr, &fr->mr->coalesced, link) {
> +        tmp = addrrange_shift(cmr->addr,
> +                              int128_sub(fr->addr.start,
> +                                         int128_make64(fr->offset_in_region)));
> +        if (!addrrange_intersects(tmp, fr->addr)) {
> +            continue;
> +        }
> +        tmp = addrrange_intersection(tmp, fr->addr);
> +        MEMORY_LISTENER_UPDATE_REGION(fr, as, Reverse, coalesced_io_del,
> +                                      int128_get64(tmp.start),
> +                                      int128_get64(tmp.size));
> +    }
>  }
>  
>  static void flat_range_coalesced_io_add(FlatRange *fr, AddressSpace *as)
>
diff mbox series

Patch

diff --git a/memory.c b/memory.c
index 1a2b465a96..b24cdd13cf 100644
--- a/memory.c
+++ b/memory.c
@@ -864,6 +864,9 @@  static void address_space_update_ioeventfds(AddressSpace *as)
 
 static void flat_range_coalesced_io_del(FlatRange *fr, AddressSpace *as)
 {
+    CoalescedMemoryRange *cmr;
+    AddrRange tmp;
+
     if (QTAILQ_EMPTY(&fr->mr->coalesced)) {
         return;
     }
@@ -874,9 +877,30 @@  static void flat_range_coalesced_io_del(FlatRange *fr, AddressSpace *as)
 
     fr->coalesced_mmio_del_done = true;
 
-    MEMORY_LISTENER_UPDATE_REGION(fr, as, Reverse, coalesced_io_del,
-                                  int128_get64(fr->addr.start),
-                                  int128_get64(fr->addr.size));
+    /*
+     * We split the big region into smaller ones to satisfy KVM's
+     * KVM_UNREGISTER_COALESCED_MMIO interface, where it does not
+     * allow to specify a large region to unregister all the devices
+     * under that zone instead it only accepts exact zones or even a
+     * smaller zone of previously registered mmio device.  Logically
+     * speaking we should better fix KVM to allow the userspace to
+     * unregister multiple mmio devices within a large requested zone,
+     * but in all cases we'll still need to live with old kernels.  So
+     * let's simply break the zones into exactly the small pieces when
+     * we do coalesced_io_add().
+     */
+    QTAILQ_FOREACH(cmr, &fr->mr->coalesced, link) {
+        tmp = addrrange_shift(cmr->addr,
+                              int128_sub(fr->addr.start,
+                                         int128_make64(fr->offset_in_region)));
+        if (!addrrange_intersects(tmp, fr->addr)) {
+            continue;
+        }
+        tmp = addrrange_intersection(tmp, fr->addr);
+        MEMORY_LISTENER_UPDATE_REGION(fr, as, Reverse, coalesced_io_del,
+                                      int128_get64(tmp.start),
+                                      int128_get64(tmp.size));
+    }
 }
 
 static void flat_range_coalesced_io_add(FlatRange *fr, AddressSpace *as)