[RESEND,tip/locking/core,v5,1/6] powerpc: atomic: Make _return atomics and *{cmp}xchg fully ordered

Message ID 1445854536-22645-1-git-send-email-boqun.feng@gmail.com (mailing list archive)
State Superseded

Commit Message

Boqun Feng Oct. 26, 2015, 10:15 a.m. UTC
This patch fixes two problems to make value-returning atomics and
{cmp}xchg fully ordered on PPC.

According to memory-barriers.txt:

> Any atomic operation that modifies some state in memory and returns
> information about the state (old or new) implies an SMP-conditional
> general memory barrier (smp_mb()) on each side of the actual
> operation ...

which means these operations should be fully ordered. However on PPC,
PPC_ATOMIC_ENTRY_BARRIER is the barrier before the actual operation,
which is currently "lwsync" if SMP=y. The leading "lwsync" cannot
guarantee fully ordered atomics, according to Paul McKenney:

https://lkml.org/lkml/2015/10/14/970
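
For illustration, here is a minimal store-buffering sketch of an outcome
that fully ordered atomics must forbid (the variables x and y, assumed
initially zero, and the result names r0 and r1 are illustrative, not from
the patch):

	CPU 0				CPU 1
	=====				=====
	WRITE_ONCE(x, 1);		WRITE_ONCE(y, 1);
	r0 = xchg(&y, 2);		r1 = xchg(&x, 2);

	BUG_ON(r0 == 0 && r1 == 0);	/* forbidden if fully ordered */

With a full memory barrier on each side of xchg(), at least one CPU must
observe the other's plain store. A leading "lwsync", however, does not
order the prior store against the load performed by the subsequent lwarx,
so both xchg() calls may read the initial value and the forbidden outcome
becomes observable.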

To fix this, we define PPC_ATOMIC_ENTRY_BARRIER as "sync" to guarantee
the fully-ordered semantics.

This also makes futex atomics fully ordered, which can avoid possible
memory ordering problems if userspace code relies on futex system call
for fully ordered semantics.

Another thing to fix is that xchg, cmpxchg and their atomic{,64}_
versions are currently RELEASE+ACQUIRE, which are not fully ordered.

So also replace PPC_RELEASE_BARRIER and PPC_ACQUIRE_BARRIER with
PPC_ATOMIC_ENTRY_BARRIER and PPC_ATOMIC_EXIT_BARRIER in
__{cmp,}xchg_{u32,u64} respectively to guarantee fully ordered semantics
of atomic{,64}_{cmp,}xchg() and {cmp,}xchg(), as a complement of commit
b97021f85517 ("powerpc: Fix atomic_xxx_return barrier semantics").
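
Taking __xchg_u32 as an example, the barrier placement changes roughly as
follows (a simplified sketch assuming SMP=y, with the PPC405_ERR77
workaround omitted; the old trailing "isync" is subject to runtime
patching):

	RELEASE+ACQUIRE (old):		fully ordered (new):

	lwsync				sync
	1: lwarx   %0,0,%2		1: lwarx   %0,0,%2
	   stwcx.  %3,0,%2		   stwcx.  %3,0,%2
	   bne-    1b			   bne-    1b
	isync				sync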

Cc: <stable@vger.kernel.org> # 3.4+
Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
---

Michael, I also changed PPC_ATOMIC_ENTRY_BARRIER to "sync" if SMP=y in this
version, which differs from the previous one, so I'm requesting a new ack.
Thank you ;-)

 arch/powerpc/include/asm/cmpxchg.h | 16 ++++++++--------
 arch/powerpc/include/asm/synch.h   |  2 +-
 2 files changed, 9 insertions(+), 9 deletions(-)

Comments

Michael Ellerman Oct. 27, 2015, 2:33 a.m. UTC | #1
On Mon, 2015-10-26 at 10:15:36 UTC, Boqun Feng wrote:
> This patch fixes two problems to make value-returning atomics and
> {cmp}xchg fully ordered on PPC.

Hi Boqun,

Can you please split this into two patches. One that does the cmpxchg change
and one that changes PPC_ATOMIC_ENTRY_BARRIER.

Also given how pervasive this change is I'd like to take it via the powerpc
next tree, so can you please send this patch (which will be two after you split
it) as powerpc patches. And the rest can go via tip?

cheers
Boqun Feng Oct. 27, 2015, 3:06 a.m. UTC | #2
On Tue, Oct 27, 2015 at 01:33:47PM +1100, Michael Ellerman wrote:
> On Mon, 2015-10-26 at 10:15:36 UTC, Boqun Feng wrote:
> > This patch fixes two problems to make value-returning atomics and
> > {cmp}xchg fully ordered on PPC.
> 
> Hi Boqun,
> 
> Can you please split this into two patches. One that does the cmpxchg change
> and one that changes PPC_ATOMIC_ENTRY_BARRIER.
> 

OK, makes sense ;-)

> Also given how pervasive this change is I'd like to take it via the powerpc
> next tree, so can you please send this patch (which will be two after you split
> it) as powerpc patches. And the rest can go via tip?
> 

One problem is that patch 5 will remove __xchg_u32 and __xchg_u64
entirely, which are modified in this patch (patch 1), so I think there will
be some conflicts if the two branches get merged.

An alternative is for the whole series to go via the powerpc next tree, as
most of the patches it depends on are already there. I just need to remove
the inc/dec related code and resend when appropriate. Besides, I can pull
patch 2 out and send it as a tip patch, because it's generic code and
nothing else in this series depends on it.

To summarize:

patches 1 (split in two), 3, 4 (with the inc/dec implementation removed), 5
and 6 sent as powerpc patches for powerpc next; patch 2 (unmodified) sent
as a tip patch for locking/core.

Peter and Michael, does this work for you both?

Regards,
Boqun Feng Oct. 30, 2015, 12:56 a.m. UTC | #3
On Tue, Oct 27, 2015 at 11:06:52AM +0800, Boqun Feng wrote:
> [...]
> 
> To summarize:
> 
> patches 1 (split in two), 3, 4 (with the inc/dec implementation removed),
> 5 and 6 sent as powerpc patches for powerpc next; patch 2 (unmodified)
> sent as a tip patch for locking/core.
> 
> Peter and Michael, does this work for you both?
> 

Thoughts? ;-)

Regards,
Boqun
Boqun Feng Nov. 2, 2015, 1:22 a.m. UTC | #4
On Fri, Oct 30, 2015 at 08:56:33AM +0800, Boqun Feng wrote:
> On Tue, Oct 27, 2015 at 11:06:52AM +0800, Boqun Feng wrote:
> > [...]
> > 
> > To summarize:
> > 
> > patches 1 (split in two), 3, 4 (with the inc/dec implementation
> > removed), 5 and 6 sent as powerpc patches for powerpc next; patch 2
> > (unmodified) sent as a tip patch for locking/core.
> > 
> > Peter and Michael, does this work for you both?
> > 
> Thoughts? ;-)
> 

Peter and Michael, I will split patch 1 in two and send the results as
patches for powerpc next first. The rest can wait until we are on the same
page about where they should go.

Regards,
Boqun
Boqun Feng Nov. 4, 2015, 1:22 a.m. UTC | #5
On Mon, Nov 02, 2015 at 09:22:40AM +0800, Boqun Feng wrote:
> > On Tue, Oct 27, 2015 at 11:06:52AM +0800, Boqun Feng wrote:
> > > [...]
> > > 
> > > To summarize:
> > > 
> > > patches 1 (split in two), 3, 4 (with the inc/dec implementation
> > > removed), 5 and 6 sent as powerpc patches for powerpc next; patch 2
> > > (unmodified) sent as a tip patch for locking/core.
> > > 
> > > Peter and Michael, does this work for you both?
> > > 
> > 
> > Thoughts? ;-)
> > 
> 
> Peter and Michael, I will split patch 1 in two and send the results as
> patches for powerpc next first. The rest can wait until we are on the
> same page about where they should go.
> 

I'm about to send patch 2 (adding trivial tests) as a patch for the tip
tree, and the rest of this series will be sent as patches for powerpc next.

Will, AFAIK you are currently working on the variants for arm64, right? I
wonder whether you depend on patch 3 (allow architectures to provide
self-defined __atomic_op_*); if so, I can also send patch 3 as a patch for
the tip tree and wait until it is merged into powerpc next before sending
the rest.

Thanks and Best Regards,
Boqun
Will Deacon Nov. 4, 2015, 10:15 a.m. UTC | #6
On Wed, Nov 04, 2015 at 09:22:13AM +0800, Boqun Feng wrote:
> Will, AFAIK you are currently working on the variants for arm64, right? I
> wonder whether you depend on patch 3 (allow architectures to provide
> self-defined __atomic_op_*); if so, I can also send patch 3 as a patch for
> the tip tree and wait until it is merged into powerpc next before sending
> the rest.

The arm64 patches are all queued in the arm64 tree and have been sitting
in -next for a while. They don't depend on anything else.

Will

Patch

diff --git a/arch/powerpc/include/asm/cmpxchg.h b/arch/powerpc/include/asm/cmpxchg.h
index ad6263c..d1a8d93 100644
--- a/arch/powerpc/include/asm/cmpxchg.h
+++ b/arch/powerpc/include/asm/cmpxchg.h
@@ -18,12 +18,12 @@  __xchg_u32(volatile void *p, unsigned long val)
 	unsigned long prev;
 
 	__asm__ __volatile__(
-	PPC_RELEASE_BARRIER
+	PPC_ATOMIC_ENTRY_BARRIER
 "1:	lwarx	%0,0,%2 \n"
 	PPC405_ERR77(0,%2)
 "	stwcx.	%3,0,%2 \n\
 	bne-	1b"
-	PPC_ACQUIRE_BARRIER
+	PPC_ATOMIC_EXIT_BARRIER
 	: "=&r" (prev), "+m" (*(volatile unsigned int *)p)
 	: "r" (p), "r" (val)
 	: "cc", "memory");
@@ -61,12 +61,12 @@  __xchg_u64(volatile void *p, unsigned long val)
 	unsigned long prev;
 
 	__asm__ __volatile__(
-	PPC_RELEASE_BARRIER
+	PPC_ATOMIC_ENTRY_BARRIER
 "1:	ldarx	%0,0,%2 \n"
 	PPC405_ERR77(0,%2)
 "	stdcx.	%3,0,%2 \n\
 	bne-	1b"
-	PPC_ACQUIRE_BARRIER
+	PPC_ATOMIC_EXIT_BARRIER
 	: "=&r" (prev), "+m" (*(volatile unsigned long *)p)
 	: "r" (p), "r" (val)
 	: "cc", "memory");
@@ -151,14 +151,14 @@  __cmpxchg_u32(volatile unsigned int *p, unsigned long old, unsigned long new)
 	unsigned int prev;
 
 	__asm__ __volatile__ (
-	PPC_RELEASE_BARRIER
+	PPC_ATOMIC_ENTRY_BARRIER
 "1:	lwarx	%0,0,%2		# __cmpxchg_u32\n\
 	cmpw	0,%0,%3\n\
 	bne-	2f\n"
 	PPC405_ERR77(0,%2)
 "	stwcx.	%4,0,%2\n\
 	bne-	1b"
-	PPC_ACQUIRE_BARRIER
+	PPC_ATOMIC_EXIT_BARRIER
 	"\n\
 2:"
 	: "=&r" (prev), "+m" (*p)
@@ -197,13 +197,13 @@  __cmpxchg_u64(volatile unsigned long *p, unsigned long old, unsigned long new)
 	unsigned long prev;
 
 	__asm__ __volatile__ (
-	PPC_RELEASE_BARRIER
+	PPC_ATOMIC_ENTRY_BARRIER
 "1:	ldarx	%0,0,%2		# __cmpxchg_u64\n\
 	cmpd	0,%0,%3\n\
 	bne-	2f\n\
 	stdcx.	%4,0,%2\n\
 	bne-	1b"
-	PPC_ACQUIRE_BARRIER
+	PPC_ATOMIC_EXIT_BARRIER
 	"\n\
 2:"
 	: "=&r" (prev), "+m" (*p)
diff --git a/arch/powerpc/include/asm/synch.h b/arch/powerpc/include/asm/synch.h
index e682a71..c508686 100644
--- a/arch/powerpc/include/asm/synch.h
+++ b/arch/powerpc/include/asm/synch.h
@@ -44,7 +44,7 @@  static inline void isync(void)
 	MAKE_LWSYNC_SECTION_ENTRY(97, __lwsync_fixup);
 #define PPC_ACQUIRE_BARRIER	 "\n" stringify_in_c(__PPC_ACQUIRE_BARRIER)
 #define PPC_RELEASE_BARRIER	 stringify_in_c(LWSYNC) "\n"
-#define PPC_ATOMIC_ENTRY_BARRIER "\n" stringify_in_c(LWSYNC) "\n"
+#define PPC_ATOMIC_ENTRY_BARRIER "\n" stringify_in_c(sync) "\n"
 #define PPC_ATOMIC_EXIT_BARRIER	 "\n" stringify_in_c(sync) "\n"
 #else
 #define PPC_ACQUIRE_BARRIER