[RFC] Fix SPARC atomic_write_barrier.
diff mbox

Message ID 1414703220.10085.209.camel@triegel.csb
State New
Headers show

Commit Message

Torvald Riegel Oct. 30, 2014, 9:07 p.m. UTC
This patch changes SPARC write barriers to be just release barriers.  I
haven't tested this, so this is based on my understanding of the SPARC
memory model (TSO).

For a release barrier, we have code like
  foo = 1;
  atomic_write_barrier ();
  release_flag = 1;
and release_flag could also be a spinlock unlock, for example.

So we want to prevent store/store and load/store reordering, so that the
release_flag assigment is the "last" thing.

Previously, the write barriers prevented -- AFAIU SPARC assembly --
store/load reordering, which is what you'd need a full barrier for on
TSO (e.g., consider Dekker synchronization).  The other kinds of
reordering barriers (e.g., load/store) are implicit on TSO (as on x86).
Thus, the change doesn't fix a correctness issue (at least if using
TSO!) but just a performance issue.

Could someone who cares about SPARC please review and test this?

Comments

David Miller Oct. 30, 2014, 11:08 p.m. UTC | #1
From: Torvald Riegel <triegel@redhat.com>
Date: Thu, 30 Oct 2014 22:07:00 +0100

> This patch changes SPARC write barriers to be just release barriers.  I
> haven't tested this, so this is based on my understanding of the SPARC
> memory model (TSO).

The sparc memory model is variable.

It can be TSO, PSO, or RMO.

It is controlled by a bit in the processor state register, and not
all cpus support all memory models.

Linux happens to run all threads in TSO currently, but this is not
something GLIBC or any userland code should really depend upon.

So we should use whatever memory barriers are necessary in the least
strict memory model possible, which is RMO.
Torvald Riegel Oct. 31, 2014, 12:22 a.m. UTC | #2
On Thu, 2014-10-30 at 19:08 -0400, David Miller wrote:
> From: Torvald Riegel <triegel@redhat.com>
> Date: Thu, 30 Oct 2014 22:07:00 +0100
> 
> > This patch changes SPARC write barriers to be just release barriers.  I
> > haven't tested this, so this is based on my understanding of the SPARC
> > memory model (TSO).
> 
> The sparc memory model is variable.
> 
> It can be TSO, PSO, or RMO.
> 
> It is controlled by a bit in the processor state register, and not
> all cpus support all memory models.
> 
> Linux happens to run all threads in TSO currently, but this is not
> something GLIBC or any userland code should really depend upon.
> 
> So we should use whatever memory barriers are necessary in the least
> strict memory model possible, which is RMO.

The patch doesn't rely on TSO.  If you consider RMO, then it could be
considered a bug fix because LoadStore was missing from the release
barrier.

With "my understanding of the SPARC memory model" I was referring to the
assumption that on TSO, acquire and release barriers are no-ops,
effectively, because the reordering constraints that make up the
barriers are implicitly provided.  And StoreLoad is not, but was part of
the release barrier, so this seemed to be a mistake.

Do you have an opinion on the actual patch, rather than my description?
David Miller Oct. 31, 2014, 1:58 a.m. UTC | #3
From: Torvald Riegel <triegel@redhat.com>
Date: Fri, 31 Oct 2014 01:22:16 +0100

> Do you have an opinion on the actual patch, rather than my description?

Sorry for getting distracted :-) I'll review your patch as soon as I
can.
David Miller Oct. 31, 2014, 3:38 a.m. UTC | #4
From: David Miller <davem@davemloft.net>
Date: Thu, 30 Oct 2014 21:58:34 -0400 (EDT)

> From: Torvald Riegel <triegel@redhat.com>
> Date: Fri, 31 Oct 2014 01:22:16 +0100
> 
>> Do you have an opinion on the actual patch, rather than my description?
> 
> Sorry for getting distracted :-) I'll review your patch as soon as I
> can.

It looks good, and passes the tests I've done so far.

Please feel free to commit it.

Thanks!

Patch
diff mbox

commit 5adee7234e1ca9ea7b8998a096c369cddeea47c8
Author: Torvald Riegel <triegel@redhat.com>
Date:   Wed Oct 29 19:14:14 2014 +0100

    Fix SPARC atomic_write_barrier.

diff --git a/sysdeps/sparc/sparc32/bits/atomic.h b/sysdeps/sparc/sparc32/bits/atomic.h
index 1b4175d..2ae2eaa 100644
--- a/sysdeps/sparc/sparc32/bits/atomic.h
+++ b/sysdeps/sparc/sparc32/bits/atomic.h
@@ -346,8 +346,8 @@  extern uint64_t _dl_hwcap __attribute__((weak));
 #define atomic_write_barrier()						\
   do {									\
      if (__atomic_is_v9)						\
-       /* membar  #StoreLoad | #StoreStore */				\
-       __asm __volatile (".word 0x8143e00a" : : : "memory");		\
+       /* membar  #LoadStore | #StoreStore */				\
+       __asm __volatile (".word 0x8143e00c" : : : "memory");		\
      else								\
        __asm __volatile ("" : : : "memory");				\
   } while (0)
diff --git a/sysdeps/sparc/sparc32/sparcv9/bits/atomic.h b/sysdeps/sparc/sparc32/sparcv9/bits/atomic.h
index 8441de3..7644796 100644
--- a/sysdeps/sparc/sparc32/sparcv9/bits/atomic.h
+++ b/sysdeps/sparc/sparc32/sparcv9/bits/atomic.h
@@ -99,4 +99,4 @@  typedef uintmax_t uatomic_max_t;
 #define atomic_read_barrier() \
   __asm __volatile ("membar #LoadLoad | #LoadStore" : : : "memory")
 #define atomic_write_barrier() \
-  __asm __volatile ("membar #StoreLoad | #StoreStore" : : : "memory")
+  __asm __volatile ("membar #LoadStore | #StoreStore" : : : "memory")
diff --git a/sysdeps/sparc/sparc64/bits/atomic.h b/sysdeps/sparc/sparc64/bits/atomic.h
index ccb7319..2bca42b 100644
--- a/sysdeps/sparc/sparc64/bits/atomic.h
+++ b/sysdeps/sparc/sparc64/bits/atomic.h
@@ -120,4 +120,4 @@  typedef uintmax_t uatomic_max_t;
 #define atomic_read_barrier() \
   __asm __volatile ("membar #LoadLoad | #LoadStore" : : : "memory")
 #define atomic_write_barrier() \
-  __asm __volatile ("membar #StoreLoad | #StoreStore" : : : "memory")
+  __asm __volatile ("membar #LoadStore | #StoreStore" : : : "memory")