[v2] atomics: add explicit compiler fence in __atomic memory barriers

Message ID 1433334891-20306-1-git-send-email-pbonzini@redhat.com
State New

Commit Message

Paolo Bonzini June 3, 2015, 12:34 p.m. UTC
__atomic_thread_fence does not include a compiler barrier; in the
C++11 memory model, fences take effect in combination with other
atomic operations.  GCC implements this by making __atomic_load and
__atomic_store access memory as if the pointer was volatile, and
leaves no trace whatsoever of acquire and release fences in the
compiler's intermediate representation.

In QEMU, we want memory barriers to act on all memory, but at the same
time we would like to use __atomic_thread_fence for portability reasons.
Add compiler barriers manually around the __atomic_thread_fence.

Thanks to Kevin Wolf for the analysis.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
        v1->v2: improved comment [Peter]
---
 include/qemu/atomic.h | 14 +++++++++++---
 1 file changed, 11 insertions(+), 3 deletions(-)
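
As a rough illustration of the pattern the commit message describes, the sketch below wraps __atomic_thread_fence in compiler barriers. It is not the QEMU header itself: the barrier() definition is the usual GCC/Clang empty-asm idiom and is assumed here, and the names example_smp_wmb, example_smp_rmb, example_publish and example_consume are hypothetical. It uses do/while(0) instead of the statement expression in the patch, which should behave the same for a macro used as a statement.

```c
#include <assert.h>

/* Assumed definition: a compiler-only barrier (GCC/Clang empty asm with a
 * "memory" clobber). It emits no instruction, but the compiler may not
 * move memory accesses across it. */
#define barrier()   __asm__ __volatile__("" ::: "memory")

/* Hypothetical stand-ins for the patched smp_wmb()/smp_rmb(): the hardware
 * fence is sandwiched between two compiler barriers, so plain (non-atomic)
 * accesses on either side cannot be reordered past it by the compiler. */
#define example_smp_wmb() \
    do { barrier(); __atomic_thread_fence(__ATOMIC_RELEASE); barrier(); } while (0)
#define example_smp_rmb() \
    do { barrier(); __atomic_thread_fence(__ATOMIC_ACQUIRE); barrier(); } while (0)

int example_payload;   /* data written before the flag */
int example_ready;     /* flag that announces the data */

/* Writer side: store the data, then the barrier, then the flag. */
void example_publish(int value)
{
    example_payload = value;
    example_smp_wmb();          /* keep the payload store before the flag store */
    example_ready = 1;
}

/* Reader side: check the flag, then the barrier, then load the data.
 * Returns 1 and fills *out if the flag was set, 0 otherwise. */
int example_consume(int *out)
{
    if (!example_ready) {
        return 0;
    }
    example_smp_rmb();          /* keep the payload load after the flag load */
    *out = example_payload;
    return 1;
}
```

Called from a single thread, example_consume() returns 0 before example_publish(42) and returns 1 with the value 42 afterwards. That only exercises the code path and does not demonstrate the ordering guarantee, which matters when writer and reader run on different threads. Plain int accesses are used on purpose to mirror the "act on all memory" requirement; code written strictly against the C11 model would use atomic accesses for the flag.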

Comments

Stefan Hajnoczi June 5, 2015, 1:18 p.m. UTC | #1
On Wed, Jun 03, 2015 at 02:34:51PM +0200, Paolo Bonzini wrote:
> __atomic_thread_fence does not include a compiler barrier; in the
> C++11 memory model, fences take effect in combination with other
> atomic operations.  GCC implements this by making __atomic_load and
> __atomic_store access memory as if the pointer was volatile, and
> leaves no trace whatsoever of acquire and release fences in the
> compiler's intermediate representation.
> 
> In QEMU, we want memory barriers to act on all memory, but at the same
> time we would like to use __atomic_thread_fence for portability reasons.
> Add compiler barriers manually around the __atomic_thread_fence.
> 
> Thanks to Kevin Wolf for the analysis.
> 
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>         v1->v2: improved comment [Peter]
> ---
>  include/qemu/atomic.h | 14 +++++++++++---
>  1 file changed, 11 insertions(+), 3 deletions(-)

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>

Patch

diff --git a/include/qemu/atomic.h b/include/qemu/atomic.h
index 98e05ca..1586375 100644
--- a/include/qemu/atomic.h
+++ b/include/qemu/atomic.h
@@ -99,7 +99,15 @@ 
 
 #ifndef smp_wmb
 #ifdef __ATOMIC_RELEASE
-#define smp_wmb()   __atomic_thread_fence(__ATOMIC_RELEASE)
+/* __atomic_thread_fence does not include a compiler barrier; instead,
+ * the barrier is part of __atomic_load/__atomic_store's "volatile-like"
+ * semantics. If smp_wmb() is a no-op, absence of the barrier means that
+ * the compiler is free to reorder stores on each side of the smp_wmb().
+ * So add it here, and similarly in smp_rmb() and
+ * smp_read_barrier_depends(); placing barriers on both sides of the
+ * __atomic_thread_fence keeps it firmly in place.
+ */
+#define smp_wmb()   ({ barrier(); __atomic_thread_fence(__ATOMIC_RELEASE); barrier(); })
 #else
 #define smp_wmb()   __sync_synchronize()
 #endif
@@ -107,7 +115,7 @@ 
 
 #ifndef smp_rmb
 #ifdef __ATOMIC_ACQUIRE
-#define smp_rmb()   __atomic_thread_fence(__ATOMIC_ACQUIRE)
+#define smp_rmb()   ({ barrier(); __atomic_thread_fence(__ATOMIC_ACQUIRE); barrier(); })
 #else
 #define smp_rmb()   __sync_synchronize()
 #endif
@@ -115,7 +123,7 @@ 
 
 #ifndef smp_read_barrier_depends
 #ifdef __ATOMIC_CONSUME
-#define smp_read_barrier_depends()   __atomic_thread_fence(__ATOMIC_CONSUME)
+#define smp_read_barrier_depends()   ({ barrier(); __atomic_thread_fence(__ATOMIC_CONSUME); barrier(); })
 #else
 #define smp_read_barrier_depends()   barrier()
 #endif