
KVM: use store-release to mark dirty pages as harvested

Message ID 20220902001936.108645-1-pbonzini@redhat.com
State New
Series KVM: use store-release to mark dirty pages as harvested

Commit Message

Paolo Bonzini Sept. 2, 2022, 12:19 a.m. UTC
The following scenario can happen if QEMU sets more RESET flags while
the KVM_RESET_DIRTY_RINGS ioctl is ongoing on another host CPU:

    CPU0                     CPU1               CPU2
    ------------------------ ------------------ ------------------------
                                                fill gfn0
                                                store-rel flags for gfn0
                                                fill gfn1
                                                store-rel flags for gfn1
    load-acq flags for gfn0
    set RESET for gfn0
    load-acq flags for gfn1
    set RESET for gfn1
    do ioctl! ----------->
                             ioctl(RESET_RINGS)
                                                fill gfn2
                                                store-rel flags for gfn2
    load-acq flags for gfn2
    set RESET for gfn2
                             process gfn0
                             process gfn1
                             process gfn2
    do ioctl!
    etc.

The three load-acquires in CPU0 synchronize with the three store-releases
in CPU2, but CPU0 and CPU1 are only synchronized up to gfn1; CPU1 may
therefore miss gfn2's fields other than the flags.

The kernel must be able to cope with invalid values of the fields, and
userspace *will* invoke the ioctl once more.  However, once the RESET flag
is cleared on gfn2 it is lost forever; therefore, in the above scenario,
CPU1 must read the correct values of gfn2's fields.

Therefore, RESET must be set with a store-release, which will synchronize
with KVM's load-acquire in CPU1.
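
For illustration, here is a minimal sketch of the harvesting step on CPU0
with the fix applied.  It condenses dirty_gfn_is_dirtied() and
dirty_gfn_set_collected() from accel/kvm/kvm-all.c into one function;
harvest_one() and mark_page_dirty() are hypothetical names, not the actual
QEMU code:

    static void harvest_one(struct kvm_dirty_gfn *gfn)
    {
        /*
         * Pairs with KVM's store-release when the vCPU (CPU2) fills the
         * entry, so gfn->slot and gfn->offset below are the values the
         * vCPU wrote.
         */
        if (!(qatomic_load_acquire(&gfn->flags) & KVM_DIRTY_GFN_F_DIRTY)) {
            return;
        }

        mark_page_dirty(gfn->slot, gfn->offset);   /* hypothetical helper */

        /*
         * Must be a store-release, not a plain store: the kernel's reset
         * path (CPU1) reads the flags with a load-acquire and must also
         * observe the vCPU's writes to gfn->slot and gfn->offset, which
         * this CPU has only read, never written.
         */
        qatomic_store_release(&gfn->flags, KVM_DIRTY_GFN_F_RESET);
    }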

Cc: Gavin Shan <gshan@redhat.com>
Cc: Peter Xu <peterx@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 accel/kvm/kvm-all.c | 18 +++++++++++++++++-
 1 file changed, 17 insertions(+), 1 deletion(-)

Comments

Peter Xu Sept. 2, 2022, 2:10 p.m. UTC | #1
On Fri, Sep 02, 2022 at 02:19:36AM +0200, Paolo Bonzini wrote:
> The following scenario can happen if QEMU sets more RESET flags while
> the KVM_RESET_DIRTY_RINGS ioctl is ongoing on another host CPU:
> 
>     CPU0                     CPU1               CPU2
>     ------------------------ ------------------ ------------------------
>                                                 fill gfn0
>                                                 store-rel flags for gfn0
>                                                 fill gfn1
>                                                 store-rel flags for gfn1
>     load-acq flags for gfn0
>     set RESET for gfn0
>     load-acq flags for gfn1
>     set RESET for gfn1
>     do ioctl! ----------->
>                              ioctl(RESET_RINGS)
>                                                 fill gfn2
>                                                 store-rel flags for gfn2
>     load-acq flags for gfn2
>     set RESET for gfn2
>                              process gfn0
>                              process gfn1
>                              process gfn2
>     do ioctl!
>     etc.
> 
> The three load-acquires in CPU0 synchronize with the three store-releases
> in CPU2, but CPU0 and CPU1 are only synchronized up to gfn1; CPU1 may
> therefore miss gfn2's fields other than the flags.
> 
> The kernel must be able to cope with invalid values of the fields, and
> userspace *will* invoke the ioctl once more.  However, once the RESET flag
> is cleared on gfn2 it is lost forever; therefore, in the above scenario,
> CPU1 must read the correct values of gfn2's fields.
> 
> Therefore, RESET must be set with a store-release, which will synchronize
> with KVM's load-acquire in CPU1.
> 
> Cc: Gavin Shan <gshan@redhat.com>
> Cc: Peter Xu <peterx@redhat.com>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Reviewed-by: Peter Xu <peterx@redhat.com>

Thanks (again)!
Gavin Shan Sept. 7, 2022, 5:42 a.m. UTC | #2
On 9/2/22 10:19 AM, Paolo Bonzini wrote:
> The following scenario can happen if QEMU sets more RESET flags while
> the KVM_RESET_DIRTY_RINGS ioctl is ongoing on another host CPU:
> 
>      CPU0                     CPU1               CPU2
>      ------------------------ ------------------ ------------------------
>                                                  fill gfn0
>                                                  store-rel flags for gfn0
>                                                  fill gfn1
>                                                  store-rel flags for gfn1
>      load-acq flags for gfn0
>      set RESET for gfn0
>      load-acq flags for gfn1
>      set RESET for gfn1
>      do ioctl! ----------->
>                               ioctl(RESET_RINGS)
>                                                  fill gfn2
>                                                  store-rel flags for gfn2
>      load-acq flags for gfn2
>      set RESET for gfn2
>                               process gfn0
>                               process gfn1
>                               process gfn2
>      do ioctl!
>      etc.
> 
> The three load-acquires in CPU0 synchronize with the three store-releases
> in CPU2, but CPU0 and CPU1 are only synchronized up to gfn1; CPU1 may
> therefore miss gfn2's fields other than the flags.
> 
> The kernel must be able to cope with invalid values of the fields, and
> userspace *will* invoke the ioctl once more.  However, once the RESET flag
> is cleared on gfn2 it is lost forever; therefore, in the above scenario,
> CPU1 must read the correct values of gfn2's fields.
> 
> Therefore, RESET must be set with a store-release, which will synchronize
> with KVM's load-acquire in CPU1.
> 
> Cc: Gavin Shan <gshan@redhat.com>
> Cc: Peter Xu <peterx@redhat.com>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>   accel/kvm/kvm-all.c | 18 +++++++++++++++++-
>   1 file changed, 17 insertions(+), 1 deletion(-)
> 

Reviewed-by: Gavin Shan <gshan@redhat.com>

> diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
> index 136c8eaed3..7c8ce18bdd 100644
> --- a/accel/kvm/kvm-all.c
> +++ b/accel/kvm/kvm-all.c
> @@ -728,7 +728,23 @@ static bool dirty_gfn_is_dirtied(struct kvm_dirty_gfn *gfn)
>   
>   static void dirty_gfn_set_collected(struct kvm_dirty_gfn *gfn)
>   {
> -    gfn->flags = KVM_DIRTY_GFN_F_RESET;
> +    /*
> +     * Use a store-release so that the CPU that executes KVM_RESET_DIRTY_RINGS
> +     * sees the full content of the ring:
> +     *
> +     * CPU0                     CPU1                         CPU2
> +     * ------------------------------------------------------------------------------
> +     *                                                       fill gfn0
> +     *                                                       store-rel flags for gfn0
> +     * load-acq flags for gfn0
> +     * store-rel RESET for gfn0
> +     *                          ioctl(RESET_RINGS)
> +     *                            load-acq flags for gfn0
> +     *                            check if flags have RESET
> +     *
> +     * The synchronization goes from CPU2 to CPU0 to CPU1.
> +     */
> +    qatomic_store_release(&gfn->flags, KVM_DIRTY_GFN_F_RESET);
>   }
>   
>   /*
>

Patch

diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index 136c8eaed3..7c8ce18bdd 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -728,7 +728,23 @@  static bool dirty_gfn_is_dirtied(struct kvm_dirty_gfn *gfn)
 
 static void dirty_gfn_set_collected(struct kvm_dirty_gfn *gfn)
 {
-    gfn->flags = KVM_DIRTY_GFN_F_RESET;
+    /*
+     * Use a store-release so that the CPU that executes KVM_RESET_DIRTY_RINGS
+     * sees the full content of the ring:
+     *
+     * CPU0                     CPU1                         CPU2
+     * ------------------------------------------------------------------------------
+     *                                                       fill gfn0
+     *                                                       store-rel flags for gfn0
+     * load-acq flags for gfn0
+     * store-rel RESET for gfn0
+     *                          ioctl(RESET_RINGS)
+     *                            load-acq flags for gfn0
+     *                            check if flags have RESET
+     *
+     * The synchronization goes from CPU2 to CPU0 to CPU1.
+     */
+    qatomic_store_release(&gfn->flags, KVM_DIRTY_GFN_F_RESET);
 }
 
 /*
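
For completeness, here is a sketch of the kernel-side check that this
store-release pairs with.  It assumes, as the comment in the patch
describes, that the KVM_RESET_DIRTY_RINGS path reads the flags with a
load-acquire; the function name is illustrative, not the literal kernel
code:

    static bool dirty_gfn_harvested(struct kvm_dirty_gfn *gfn)
    {
        /*
         * Load-acquire pairs with QEMU's store-release of RESET (CPU0).
         * Because that store-release was itself ordered after QEMU's
         * load-acquire of KVM's store-release (CPU2), the chain
         * CPU2 -> CPU0 -> CPU1 guarantees that gfn->slot and gfn->offset
         * are valid by the time RESET is observed here.
         */
        return smp_load_acquire(&gfn->flags) & KVM_DIRTY_GFN_F_RESET;
    }

With a plain store on the QEMU side, this load-acquire could still observe
RESET, but nothing would order that observation after the vCPU's writes to
the other fields, which is exactly the gfn2 case in the commit message.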