
[2/8] qemu-thread-posix: cleanup, fix, document QemuEvent

Message ID 20230303171939.237819-3-pbonzini@redhat.com
State New
Series Fix missing memory barriers on ARM

Commit Message

Paolo Bonzini March 3, 2023, 5:19 p.m. UTC
QemuEvent is currently broken on ARM due to missing memory barriers
after qatomic_*().  Apart from adding the memory barrier, a closer look
reveals some unpaired memory barriers too.  Document more clearly what
is going on, and remove optimizations that I couldn't quite prove to
be correct.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 util/qemu-thread-posix.c | 64 ++++++++++++++++++++++++++++------------
 1 file changed, 45 insertions(+), 19 deletions(-)
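
For context, here is a minimal sketch (not part of the patch) of the caller-side handshake that these barriers protect.  Only the qemu_event_*() and qatomic_*() APIs are real QEMU interfaces; done_ev, work_done and the two functions are made up for illustration:

    #include "qemu/osdep.h"
    #include "qemu/atomic.h"
    #include "qemu/thread.h"

    static QemuEvent done_ev;   /* initialized elsewhere with qemu_event_init(&done_ev, false) */
    static int work_done;       /* the "condition" the new comments refer to */

    /* Producer: publish the data, then wake any waiter. */
    static void producer(void)
    {
        qatomic_set(&work_done, 1);
        /* Release semantics: must not be reordered before the store above. */
        qemu_event_set(&done_ev);
    }

    /* Consumer: reset, re-check the condition, sleep only if still not done. */
    static void consumer(void)
    {
        qemu_event_reset(&done_ev);
        /* The barrier at the end of qemu_event_reset() orders this load after the reset. */
        if (!qatomic_read(&work_done)) {
            qemu_event_wait(&done_ev);
        }
    }

If qemu_event_set() could be reordered before the producer's store, or the consumer's re-check before the reset, the consumer could sleep forever even though work_done is 1; that is the pairing the comments in the patch spell out.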

Comments

Richard Henderson March 5, 2023, 7:11 p.m. UTC | #1
On 3/3/23 09:19, Paolo Bonzini wrote:
> QemuEvent is currently broken on ARM due to missing memory barriers
> after qatomic_*().  Apart from adding the memory barrier, a closer look
> reveals some unpaired memory barriers too.  Document more clearly what
> is going on, and remove optimizations that I couldn't quite prove to
> be correct.
> 
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>   util/qemu-thread-posix.c | 64 ++++++++++++++++++++++++++++------------
>   1 file changed, 45 insertions(+), 19 deletions(-)

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>


r~
David Hildenbrand March 6, 2023, 1:28 p.m. UTC | #2
> -             * Leave the event reset and tell qemu_event_set that there
> -             * are waiters.  No need to retry, because there cannot be
> -             * a concurrent busy->free transition.  After the CAS, the
> -             * event will be either set or busy.
> +             * Leave the event reset and tell qemu_event_set that there are
> +             * waiters.  No need to retry, because there cannot be a concurrent
> +             * busy->free transition.  After the CAS, the event will be either
> +             * set or busy.
> +             *
> +             * Neither the load nor the store of this cmpxchg have particular
> +             * ordering requirements.  The reasoning for the load is the same
> +             * as qatomic_read() above; while moving the store earlier can only
> +             * cause qemu_event_set() to issue _more_ wakeups.

IIUC, the qatomic_read(&ev->value) is mostly an optimization then, to 
not do an unconditional qatomic_cmpxchg(). That's why we don't care 
about the order in particular.

>                */
>               if (qatomic_cmpxchg(&ev->value, EV_FREE, EV_BUSY) == EV_SET) {
>                   return;
>               }
>           }
> +
> +        /*
> +         * This is the final check for a concurrent set, so it does need
> +         * a smp_mb() pairing with the second barrier of qemu_event_set().
> +         * The barrier is inside the FUTEX_WAIT system call.
> +         */
>           qemu_futex_wait(ev, EV_BUSY);
>       }
>   }


Skipping back and forth between the Linux and QEMU memory model is a pain :D

Reviewed-by: David Hildenbrand <david@redhat.com>
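
To illustrate David's point about the qatomic_read() fast path, here is a hypothetical variant of the wait entry path with that fast path dropped.  It is written as if it lived next to qemu_event_wait() in util/qemu-thread-posix.c, reusing the file-internal EV_* constants and qemu_futex_wait(); it is not part of the patch:

    static void qemu_event_wait_no_fastpath(QemuEvent *ev)
    {
        assert(ev->initialized);

        /*
         * Unconditional free->busy transition.  Functionally this matches
         * the patched qemu_event_wait(), but every call now pays for an RMW
         * even when the event is already set or already has waiters.
         */
        if (qatomic_cmpxchg(&ev->value, EV_FREE, EV_BUSY) != EV_SET) {
            /* Pairs with the second barrier of qemu_event_set(), via FUTEX_WAIT. */
            qemu_futex_wait(ev, EV_BUSY);
        }
    }

The qatomic_read() in the patched code only skips this RMW on the fast path; a stale value merely sends the caller down the slow path, which is why its ordering does not matter, exactly as the new comment says.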

Patch

diff --git a/util/qemu-thread-posix.c b/util/qemu-thread-posix.c
index 93d250579741..06d1bff63bb7 100644
--- a/util/qemu-thread-posix.c
+++ b/util/qemu-thread-posix.c
@@ -384,13 +384,21 @@  void qemu_event_destroy(QemuEvent *ev)
 
 void qemu_event_set(QemuEvent *ev)
 {
-    /* qemu_event_set has release semantics, but because it *loads*
+    assert(ev->initialized);
+
+    /*
+     * Pairs with memory barrier in qemu_event_reset.
+     *
+     * qemu_event_set has release semantics, but because it *loads*
      * ev->value we need a full memory barrier here.
      */
-    assert(ev->initialized);
     smp_mb();
     if (qatomic_read(&ev->value) != EV_SET) {
-        if (qatomic_xchg(&ev->value, EV_SET) == EV_BUSY) {
+        int old = qatomic_xchg(&ev->value, EV_SET);
+
+        /* Pairs with memory barrier in kernel futex_wait system call.  */
+        smp_mb__after_rmw();
+        if (old == EV_BUSY) {
             /* There were waiters, wake them up.  */
             qemu_futex_wake(ev, INT_MAX);
         }
@@ -399,18 +407,19 @@  void qemu_event_set(QemuEvent *ev)
 
 void qemu_event_reset(QemuEvent *ev)
 {
-    unsigned value;
-
     assert(ev->initialized);
-    value = qatomic_read(&ev->value);
-    smp_mb_acquire();
-    if (value == EV_SET) {
-        /*
-         * If there was a concurrent reset (or even reset+wait),
-         * do nothing.  Otherwise change EV_SET->EV_FREE.
-         */
-        qatomic_or(&ev->value, EV_FREE);
-    }
+
+    /*
+     * If there was a concurrent reset (or even reset+wait),
+     * do nothing.  Otherwise change EV_SET->EV_FREE.
+     */
+    qatomic_or(&ev->value, EV_FREE);
+
+    /*
+     * Order reset before checking the condition in the caller.
+     * Pairs with the first memory barrier in qemu_event_set().
+     */
+    smp_mb__after_rmw();
 }
 
 void qemu_event_wait(QemuEvent *ev)
@@ -418,20 +427,37 @@  void qemu_event_wait(QemuEvent *ev)
     unsigned value;
 
     assert(ev->initialized);
+
+    /*
+     * This read does not have any particular ordering requirements;
+     * if it moves earlier, we might miss qemu_event_set() and go down the
+     * slow path unnecessarily, but ultimately the memory barrier in
+     * qemu_futex_wait() will ensure the check is done correctly.
+     */
     value = qatomic_read(&ev->value);
-    smp_mb_acquire();
     if (value != EV_SET) {
         if (value == EV_FREE) {
             /*
-             * Leave the event reset and tell qemu_event_set that there
-             * are waiters.  No need to retry, because there cannot be
-             * a concurrent busy->free transition.  After the CAS, the
-             * event will be either set or busy.
+             * Leave the event reset and tell qemu_event_set that there are
+             * waiters.  No need to retry, because there cannot be a concurrent
+             * busy->free transition.  After the CAS, the event will be either
+             * set or busy.
+             *
+             * Neither the load nor the store of this cmpxchg have particular
+             * ordering requirements.  The reasoning for the load is the same
+             * as qatomic_read() above; while moving the store earlier can only
+             * cause qemu_event_set() to issue _more_ wakeups.
              */
             if (qatomic_cmpxchg(&ev->value, EV_FREE, EV_BUSY) == EV_SET) {
                 return;
             }
         }
+
+        /*
+         * This is the final check for a concurrent set, so it does need
+         * a smp_mb() pairing with the second barrier of qemu_event_set().
+         * The barrier is inside the FUTEX_WAIT system call.
+         */
         qemu_futex_wait(ev, EV_BUSY);
     }
 }