atomics: add explicit compiler fence in __atomic memory barriers

Message ID 1433334080-14912-1-git-send-email-pbonzini@redhat.com
State New

Commit Message

Paolo Bonzini June 3, 2015, 12:21 p.m. UTC
__atomic_thread_fence does not include a compiler barrier; in the
C++11 memory model, fences take effect in combination with other
atomic operations.  GCC implements this by making __atomic_load and
__atomic_store access memory as if the pointer was volatile, and
leaves no trace whatsoever of acquire and release fences in the
compiler's intermediate representation.

In QEMU, we want memory barriers to act on all memory, but at the same
time we would like to use __atomic_thread_fence for portability reasons.
Add compiler barriers manually around the __atomic_thread_fence.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 include/qemu/atomic.h | 12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)

Comments

Peter Maydell June 3, 2015, 12:25 p.m. UTC | #1
On 3 June 2015 at 13:21, Paolo Bonzini <pbonzini@redhat.com> wrote:
> __atomic_thread_fence does not include a compiler barrier; in the
> C++11 memory model, fences take effect in combination with other
> atomic operations.  GCC implements this by making __atomic_load and
> __atomic_store access memory as if the pointer was volatile, and
> leaves no trace whatsoever of acquire and release fences in the
> compiler's intermediate representation.
>
> In QEMU, we want memory barriers to act on all memory, but at the same
> time we would like to use __atomic_thread_fence for portability reasons.
> Add compiler barriers manually around the __atomic_thread_fence.
>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  include/qemu/atomic.h | 12 +++++++++---
>  1 file changed, 9 insertions(+), 3 deletions(-)
>
> diff --git a/include/qemu/atomic.h b/include/qemu/atomic.h
> index 98e05ca..bd2c075 100644
> --- a/include/qemu/atomic.h
> +++ b/include/qemu/atomic.h
> @@ -99,7 +99,13 @@
>
>  #ifndef smp_wmb
>  #ifdef __ATOMIC_RELEASE
> -#define smp_wmb()   __atomic_thread_fence(__ATOMIC_RELEASE)
> +/* __atomic_thread_fence does not include a compiler barrier; instead,
> + * the barrier is part of __atomic_load/__atomic_store's "volatile-like"
> + * semantics. If smp_wmb() is a no-op, absence of the barrier means that
> + * the compiler is free to reorder stores on each side of the barrier.
> + * Add one here, and similarly in smp_rmb() and smp_read_barrier_depends().
> + */
> +#define smp_wmb()   ({ barrier(); __atomic_thread_fence(__ATOMIC_RELEASE); barrier(); })

The comment says "add one" but the patch is adding two.
An explanation of why you need a barrier on both sides and
can't manage with just one might be helpful.

thanks
-- PMM
Paolo Bonzini June 3, 2015, 12:31 p.m. UTC | #2
On 03/06/2015 14:25, Peter Maydell wrote:
>> > +/* __atomic_thread_fence does not include a compiler barrier; instead,
>> > + * the barrier is part of __atomic_load/__atomic_store's "volatile-like"
>> > + * semantics. If smp_wmb() is a no-op, absence of the barrier means that
>> > + * the compiler is free to reorder stores on each side of the barrier.
>> > + * Add one here, and similarly in smp_rmb() and smp_read_barrier_depends().
>> > + */
>> > +#define smp_wmb()   ({ barrier(); __atomic_thread_fence(__ATOMIC_RELEASE); barrier(); })
> The comment says "add one" but the patch is adding two.
> An explanation of why you need a barrier on both sides and
> can't manage with just one might be helpful.

Well, the reason is mostly that I wasn't sure if one is enough.

We want the fence to stay where it is, and a barrier on each side is
enough to pin it there.  If the fence is a no-op, "barrier();
barrier();" is the same as a single compiler barrier.

Paolo
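
To make the concern in the commit message concrete, here is a minimal writer-side sketch. It is not taken from QEMU: the barrier() definition is an assumed GCC/Clang-style compiler barrier (QEMU defines its own in its headers), and payload, ready and publish() are hypothetical names used only for illustration. Both variables are deliberately plain ints, since those are the accesses the patch wants the barrier to order.

```c
#include <assert.h>   /* only needed for the checks in the usage note */

/* Assumed definition: an empty asm with a "memory" clobber emits no
 * instruction, but the compiler may not move memory accesses across it. */
#define barrier()   __asm__ __volatile__("" ::: "memory")

/* Same shape as the macro introduced by the patch. */
#define smp_wmb()   ({ barrier(); __atomic_thread_fence(__ATOMIC_RELEASE); barrier(); })

static int payload;   /* hypothetical plain (non-atomic) data */
static int ready;     /* hypothetical plain (non-atomic) flag */

/* Store the data, then the flag.  According to the commit message, the
 * fence alone may not stop the compiler from reordering these two plain
 * stores; the surrounding barrier() calls are what forbid that. */
static void publish(int value)
{
    payload = value;
    smp_wmb();
    ready = 1;
}
```

Calling publish(42) from one thread leaves payload == 42 and ready == 1. The sketch only shows where the barriers sit; it does not by itself demonstrate a reordering, which depends on compiler version and optimization level.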
Patch

diff --git a/include/qemu/atomic.h b/include/qemu/atomic.h
index 98e05ca..bd2c075 100644
--- a/include/qemu/atomic.h
+++ b/include/qemu/atomic.h
@@ -99,7 +99,13 @@ 
 
 #ifndef smp_wmb
 #ifdef __ATOMIC_RELEASE
-#define smp_wmb()   __atomic_thread_fence(__ATOMIC_RELEASE)
+/* __atomic_thread_fence does not include a compiler barrier; instead,
+ * the barrier is part of __atomic_load/__atomic_store's "volatile-like"
+ * semantics. If smp_wmb() is a no-op, absence of the barrier means that
+ * the compiler is free to reorder stores on each side of the barrier.
+ * Add one here, and similarly in smp_rmb() and smp_read_barrier_depends().
+ */
+#define smp_wmb()   ({ barrier(); __atomic_thread_fence(__ATOMIC_RELEASE); barrier(); })
 #else
 #define smp_wmb()   __sync_synchronize()
 #endif
@@ -107,7 +113,7 @@ 
 
 #ifndef smp_rmb
 #ifdef __ATOMIC_ACQUIRE
-#define smp_rmb()   __atomic_thread_fence(__ATOMIC_ACQUIRE)
+#define smp_rmb()   ({ barrier(); __atomic_thread_fence(__ATOMIC_ACQUIRE); barrier(); })
 #else
 #define smp_rmb()   __sync_synchronize()
 #endif
@@ -115,7 +121,7 @@ 
 
 #ifndef smp_read_barrier_depends
 #ifdef __ATOMIC_CONSUME
-#define smp_read_barrier_depends()   __atomic_thread_fence(__ATOMIC_CONSUME)
+#define smp_read_barrier_depends()   ({ barrier(); __atomic_thread_fence(__ATOMIC_CONSUME); barrier(); })
 #else
 #define smp_read_barrier_depends()   barrier()
 #endif
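
The reader side that pairs with smp_wmb() can be sketched in the same way. Again this is an illustration under assumptions, not QEMU code: barrier() is an assumed GCC/Clang-style definition, and struct mailbox and try_receive() are hypothetical names. A real concurrent reader would normally also read the flag through an atomic or volatile access so that the compiler cannot cache it across loop iterations.

```c
#include <assert.h>   /* only needed for the checks in the usage note */

/* Assumed definition of a compiler-only barrier (no instruction emitted). */
#define barrier()   __asm__ __volatile__("" ::: "memory")

/* Same shape as the patched smp_rmb(). */
#define smp_rmb()   ({ barrier(); __atomic_thread_fence(__ATOMIC_ACQUIRE); barrier(); })

/* Hypothetical shared structure with plain (non-atomic) fields. */
struct mailbox {
    int payload;
    int ready;
};

/* Returns 1 and copies the payload to *out if the flag is set,
 * otherwise returns 0 and leaves *out untouched.  The barrier keeps
 * the compiler from moving the payload load above the flag load. */
static int try_receive(const struct mailbox *box, int *out)
{
    if (!box->ready) {
        return 0;
    }
    smp_rmb();
    *out = box->payload;
    return 1;
}
```

With a mailbox initialized to { 7, 1 }, try_receive() returns 1 and stores 7; with ready == 0 it returns 0 without reading the payload.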