
[RFC,08/26] locking: Remove spin_unlock_wait() generic definitions

Message ID 1498780894-8253-8-git-send-email-paulmck@linux.vnet.ibm.com
State RFC
Delegated to: Pablo Neira

Commit Message

Paul E. McKenney June 30, 2017, 12:01 a.m. UTC
There is no agreed-upon definition of spin_unlock_wait()'s semantics,
and it appears that all callers could do just as well with a lock/unlock
pair.  This commit therefore removes spin_unlock_wait() and related
definitions from core code.
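
As a hedged illustration of that caller-side conversion (the lock and the
two helpers below are made-up names, not taken from any real caller in
this series):

#include <linux/spinlock.h>

static DEFINE_SPINLOCK(my_lock);

/* Before: wait for any current holder of my_lock to release it. */
static void wait_for_holder_old(void)
{
	spin_unlock_wait(&my_lock);
}

/*
 * After: an empty critical section.  spin_lock() cannot return until the
 * previous holder has released the lock, and the lock/unlock pair orders
 * this code against both earlier and later critical sections on my_lock.
 */
static void wait_for_holder_new(void)
{
	spin_lock(&my_lock);
	spin_unlock(&my_lock);
}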

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: Andrea Parri <parri.andrea@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
---
 include/asm-generic/qspinlock.h |  14 -----
 include/linux/spinlock.h        |  31 -----------
 include/linux/spinlock_up.h     |   6 ---
 kernel/locking/qspinlock.c      | 117 ----------------------------------------
 4 files changed, 168 deletions(-)

Comments

Will Deacon June 30, 2017, 9:19 a.m. UTC | #1
On Thu, Jun 29, 2017 at 05:01:16PM -0700, Paul E. McKenney wrote:
> There is no agreed-upon definition of spin_unlock_wait()'s semantics,
> and it appears that all callers could do just as well with a lock/unlock
> pair.  This commit therefore removes spin_unlock_wait() and related
> definitions from core code.
> 
> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> Cc: Arnd Bergmann <arnd@arndb.de>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: Will Deacon <will.deacon@arm.com>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Alan Stern <stern@rowland.harvard.edu>
> Cc: Andrea Parri <parri.andrea@gmail.com>
> Cc: Linus Torvalds <torvalds@linux-foundation.org>
> ---
>  include/asm-generic/qspinlock.h |  14 -----
>  include/linux/spinlock.h        |  31 -----------
>  include/linux/spinlock_up.h     |   6 ---
>  kernel/locking/qspinlock.c      | 117 ----------------------------------------
>  4 files changed, 168 deletions(-)

[...]

> diff --git a/kernel/locking/qspinlock.c b/kernel/locking/qspinlock.c
> index b2caec7315af..64a9051e4c2c 100644
> --- a/kernel/locking/qspinlock.c
> +++ b/kernel/locking/qspinlock.c
> @@ -267,123 +267,6 @@ static __always_inline u32  __pv_wait_head_or_lock(struct qspinlock *lock,
>  #define queued_spin_lock_slowpath	native_queued_spin_lock_slowpath
>  #endif
>  
> -/*
> - * Various notes on spin_is_locked() and spin_unlock_wait(), which are
> - * 'interesting' functions:
> - *
> - * PROBLEM: some architectures have an interesting issue with atomic ACQUIRE
> - * operations in that the ACQUIRE applies to the LOAD _not_ the STORE (ARM64,
> - * PPC). Also qspinlock has a similar issue per construction, the setting of
> - * the locked byte can be unordered acquiring the lock proper.
> - *
> - * This gets to be 'interesting' in the following cases, where the /should/s
> - * end up false because of this issue.
> - *
> - *
> - * CASE 1:
> - *
> - * So the spin_is_locked() correctness issue comes from something like:
> - *
> - *   CPU0				CPU1
> - *
> - *   global_lock();			local_lock(i)
> - *     spin_lock(&G)			  spin_lock(&L[i])
> - *     for (i)				  if (!spin_is_locked(&G)) {
> - *       spin_unlock_wait(&L[i]);	    smp_acquire__after_ctrl_dep();
> - *					    return;
> - *					  }
> - *					  // deal with fail
> - *
> - * Where it is important CPU1 sees G locked or CPU0 sees L[i] locked such
> - * that there is exclusion between the two critical sections.
> - *
> - * The load from spin_is_locked(&G) /should/ be constrained by the ACQUIRE from
> - * spin_lock(&L[i]), and similarly the load(s) from spin_unlock_wait(&L[i])
> - * /should/ be constrained by the ACQUIRE from spin_lock(&G).
> - *
> - * Similarly, later stuff is constrained by the ACQUIRE from CTRL+RMB.

Might be worth keeping this comment about spin_is_locked, since we're not
removing that guy just yet!

Will
Paul E. McKenney June 30, 2017, 12:38 p.m. UTC | #2
On Fri, Jun 30, 2017 at 10:19:29AM +0100, Will Deacon wrote:
> On Thu, Jun 29, 2017 at 05:01:16PM -0700, Paul E. McKenney wrote:
> > There is no agreed-upon definition of spin_unlock_wait()'s semantics,
> > and it appears that all callers could do just as well with a lock/unlock
> > pair.  This commit therefore removes spin_unlock_wait() and related
> > definitions from core code.
> > 
> > Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> > Cc: Arnd Bergmann <arnd@arndb.de>
> > Cc: Ingo Molnar <mingo@redhat.com>
> > Cc: Will Deacon <will.deacon@arm.com>
> > Cc: Peter Zijlstra <peterz@infradead.org>
> > Cc: Alan Stern <stern@rowland.harvard.edu>
> > Cc: Andrea Parri <parri.andrea@gmail.com>
> > Cc: Linus Torvalds <torvalds@linux-foundation.org>
> > ---
> >  include/asm-generic/qspinlock.h |  14 -----
> >  include/linux/spinlock.h        |  31 -----------
> >  include/linux/spinlock_up.h     |   6 ---
> >  kernel/locking/qspinlock.c      | 117 ----------------------------------------
> >  4 files changed, 168 deletions(-)
> 
> [...]
> 
> > diff --git a/kernel/locking/qspinlock.c b/kernel/locking/qspinlock.c
> > index b2caec7315af..64a9051e4c2c 100644
> > --- a/kernel/locking/qspinlock.c
> > +++ b/kernel/locking/qspinlock.c
> > @@ -267,123 +267,6 @@ static __always_inline u32  __pv_wait_head_or_lock(struct qspinlock *lock,
> >  #define queued_spin_lock_slowpath	native_queued_spin_lock_slowpath
> >  #endif
> >  
> > -/*
> > - * Various notes on spin_is_locked() and spin_unlock_wait(), which are
> > - * 'interesting' functions:
> > - *
> > - * PROBLEM: some architectures have an interesting issue with atomic ACQUIRE
> > - * operations in that the ACQUIRE applies to the LOAD _not_ the STORE (ARM64,
> > - * PPC). Also qspinlock has a similar issue per construction, the setting of
> > - * the locked byte can be unordered acquiring the lock proper.
> > - *
> > - * This gets to be 'interesting' in the following cases, where the /should/s
> > - * end up false because of this issue.
> > - *
> > - *
> > - * CASE 1:
> > - *
> > - * So the spin_is_locked() correctness issue comes from something like:
> > - *
> > - *   CPU0				CPU1
> > - *
> > - *   global_lock();			local_lock(i)
> > - *     spin_lock(&G)			  spin_lock(&L[i])
> > - *     for (i)				  if (!spin_is_locked(&G)) {
> > - *       spin_unlock_wait(&L[i]);	    smp_acquire__after_ctrl_dep();
> > - *					    return;
> > - *					  }
> > - *					  // deal with fail
> > - *
> > - * Where it is important CPU1 sees G locked or CPU0 sees L[i] locked such
> > - * that there is exclusion between the two critical sections.
> > - *
> > - * The load from spin_is_locked(&G) /should/ be constrained by the ACQUIRE from
> > - * spin_lock(&L[i]), and similarly the load(s) from spin_unlock_wait(&L[i])
> > - * /should/ be constrained by the ACQUIRE from spin_lock(&G).
> > - *
> > - * Similarly, later stuff is constrained by the ACQUIRE from CTRL+RMB.
> 
> Might be worth keeping this comment about spin_is_locked, since we're not
> removing that guy just yet!

Ah, all the examples had spin_unlock_wait() in them.  So what I need to
do is to create a spin_unlock_wait()-free example to illustrate the
text starting with "The load from spin_is_locked(", correct?

I also need to check all uses of spin_is_locked().  There might no
longer be any that rely on any particular ordering...

								Thanx, Paul

Will Deacon June 30, 2017, 1:13 p.m. UTC | #3
On Fri, Jun 30, 2017 at 05:38:15AM -0700, Paul E. McKenney wrote:
> On Fri, Jun 30, 2017 at 10:19:29AM +0100, Will Deacon wrote:
> > On Thu, Jun 29, 2017 at 05:01:16PM -0700, Paul E. McKenney wrote:
> > > There is no agreed-upon definition of spin_unlock_wait()'s semantics,
> > > and it appears that all callers could do just as well with a lock/unlock
> > > pair.  This commit therefore removes spin_unlock_wait() and related
> > > definitions from core code.
> > > 
> > > Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> > > Cc: Arnd Bergmann <arnd@arndb.de>
> > > Cc: Ingo Molnar <mingo@redhat.com>
> > > Cc: Will Deacon <will.deacon@arm.com>
> > > Cc: Peter Zijlstra <peterz@infradead.org>
> > > Cc: Alan Stern <stern@rowland.harvard.edu>
> > > Cc: Andrea Parri <parri.andrea@gmail.com>
> > > Cc: Linus Torvalds <torvalds@linux-foundation.org>
> > > ---
> > >  include/asm-generic/qspinlock.h |  14 -----
> > >  include/linux/spinlock.h        |  31 -----------
> > >  include/linux/spinlock_up.h     |   6 ---
> > >  kernel/locking/qspinlock.c      | 117 ----------------------------------------
> > >  4 files changed, 168 deletions(-)
> > 
> > [...]
> > 
> > > diff --git a/kernel/locking/qspinlock.c b/kernel/locking/qspinlock.c
> > > index b2caec7315af..64a9051e4c2c 100644
> > > --- a/kernel/locking/qspinlock.c
> > > +++ b/kernel/locking/qspinlock.c
> > > @@ -267,123 +267,6 @@ static __always_inline u32  __pv_wait_head_or_lock(struct qspinlock *lock,
> > >  #define queued_spin_lock_slowpath	native_queued_spin_lock_slowpath
> > >  #endif
> > >  
> > > -/*
> > > - * Various notes on spin_is_locked() and spin_unlock_wait(), which are
> > > - * 'interesting' functions:
> > > - *
> > > - * PROBLEM: some architectures have an interesting issue with atomic ACQUIRE
> > > - * operations in that the ACQUIRE applies to the LOAD _not_ the STORE (ARM64,
> > > - * PPC). Also qspinlock has a similar issue per construction, the setting of
> > > - * the locked byte can be unordered acquiring the lock proper.
> > > - *
> > > - * This gets to be 'interesting' in the following cases, where the /should/s
> > > - * end up false because of this issue.
> > > - *
> > > - *
> > > - * CASE 1:
> > > - *
> > > - * So the spin_is_locked() correctness issue comes from something like:
> > > - *
> > > - *   CPU0				CPU1
> > > - *
> > > - *   global_lock();			local_lock(i)
> > > - *     spin_lock(&G)			  spin_lock(&L[i])
> > > - *     for (i)				  if (!spin_is_locked(&G)) {
> > > - *       spin_unlock_wait(&L[i]);	    smp_acquire__after_ctrl_dep();
> > > - *					    return;
> > > - *					  }
> > > - *					  // deal with fail
> > > - *
> > > - * Where it is important CPU1 sees G locked or CPU0 sees L[i] locked such
> > > - * that there is exclusion between the two critical sections.
> > > - *
> > > - * The load from spin_is_locked(&G) /should/ be constrained by the ACQUIRE from
> > > - * spin_lock(&L[i]), and similarly the load(s) from spin_unlock_wait(&L[i])
> > > - * /should/ be constrained by the ACQUIRE from spin_lock(&G).
> > > - *
> > > - * Similarly, later stuff is constrained by the ACQUIRE from CTRL+RMB.
> > 
> > Might be worth keeping this comment about spin_is_locked, since we're not
> > removing that guy just yet!
> 
> Ah, all the examples had spin_unlock_wait() in them.  So what I need to
> do is to create a spin_unlock_wait()-free example to illustrate the
> text starting with "The load from spin_is_locked(", correct?

Yeah, I think so.

> I also need to check all uses of spin_is_locked().  There might no
> longer be any that rely on any particular ordering...

Right. I think we're looking for the "insane case" as per 38b850a73034
(which was apparently used by ipc/sem.c at the time, but no longer).

There's a usage in kernel/debug/debug_core.c, but it doesn't fill me with
joy.

Will
Paul E. McKenney June 30, 2017, 10:18 p.m. UTC | #4
On Fri, Jun 30, 2017 at 02:13:39PM +0100, Will Deacon wrote:
> On Fri, Jun 30, 2017 at 05:38:15AM -0700, Paul E. McKenney wrote:
> > On Fri, Jun 30, 2017 at 10:19:29AM +0100, Will Deacon wrote:
> > > On Thu, Jun 29, 2017 at 05:01:16PM -0700, Paul E. McKenney wrote:
> > > > There is no agreed-upon definition of spin_unlock_wait()'s semantics,
> > > > and it appears that all callers could do just as well with a lock/unlock
> > > > pair.  This commit therefore removes spin_unlock_wait() and related
> > > > definitions from core code.
> > > > 
> > > > Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> > > > Cc: Arnd Bergmann <arnd@arndb.de>
> > > > Cc: Ingo Molnar <mingo@redhat.com>
> > > > Cc: Will Deacon <will.deacon@arm.com>
> > > > Cc: Peter Zijlstra <peterz@infradead.org>
> > > > Cc: Alan Stern <stern@rowland.harvard.edu>
> > > > Cc: Andrea Parri <parri.andrea@gmail.com>
> > > > Cc: Linus Torvalds <torvalds@linux-foundation.org>
> > > > ---
> > > >  include/asm-generic/qspinlock.h |  14 -----
> > > >  include/linux/spinlock.h        |  31 -----------
> > > >  include/linux/spinlock_up.h     |   6 ---
> > > >  kernel/locking/qspinlock.c      | 117 ----------------------------------------
> > > >  4 files changed, 168 deletions(-)
> > > 
> > > [...]
> > > 
> > > > diff --git a/kernel/locking/qspinlock.c b/kernel/locking/qspinlock.c
> > > > index b2caec7315af..64a9051e4c2c 100644
> > > > --- a/kernel/locking/qspinlock.c
> > > > +++ b/kernel/locking/qspinlock.c
> > > > @@ -267,123 +267,6 @@ static __always_inline u32  __pv_wait_head_or_lock(struct qspinlock *lock,
> > > >  #define queued_spin_lock_slowpath	native_queued_spin_lock_slowpath
> > > >  #endif
> > > >  
> > > > -/*
> > > > - * Various notes on spin_is_locked() and spin_unlock_wait(), which are
> > > > - * 'interesting' functions:
> > > > - *
> > > > - * PROBLEM: some architectures have an interesting issue with atomic ACQUIRE
> > > > - * operations in that the ACQUIRE applies to the LOAD _not_ the STORE (ARM64,
> > > > - * PPC). Also qspinlock has a similar issue per construction, the setting of
> > > > - * the locked byte can be unordered acquiring the lock proper.
> > > > - *
> > > > - * This gets to be 'interesting' in the following cases, where the /should/s
> > > > - * end up false because of this issue.
> > > > - *
> > > > - *
> > > > - * CASE 1:
> > > > - *
> > > > - * So the spin_is_locked() correctness issue comes from something like:
> > > > - *
> > > > - *   CPU0				CPU1
> > > > - *
> > > > - *   global_lock();			local_lock(i)
> > > > - *     spin_lock(&G)			  spin_lock(&L[i])
> > > > - *     for (i)				  if (!spin_is_locked(&G)) {
> > > > - *       spin_unlock_wait(&L[i]);	    smp_acquire__after_ctrl_dep();
> > > > - *					    return;
> > > > - *					  }
> > > > - *					  // deal with fail
> > > > - *
> > > > - * Where it is important CPU1 sees G locked or CPU0 sees L[i] locked such
> > > > - * that there is exclusion between the two critical sections.
> > > > - *
> > > > - * The load from spin_is_locked(&G) /should/ be constrained by the ACQUIRE from
> > > > - * spin_lock(&L[i]), and similarly the load(s) from spin_unlock_wait(&L[i])
> > > > - * /should/ be constrained by the ACQUIRE from spin_lock(&G).
> > > > - *
> > > > - * Similarly, later stuff is constrained by the ACQUIRE from CTRL+RMB.
> > > 
> > > Might be worth keeping this comment about spin_is_locked, since we're not
> > > removing that guy just yet!
> > 
> > Ah, all the examples had spin_unlock_wait() in them.  So what I need to
> > do is to create a spin_unlock_wait()-free example to illustrate the
> > text starting with "The load from spin_is_locked(", correct?
> 
> Yeah, I think so.
> 
> > I also need to check all uses of spin_is_locked().  There might no
> > longer be any that rely on any particular ordering...
> 
> Right. I think we're looking for the "insane case" as per 38b850a73034
> (which was apparently used by ipc/sem.c at the time, but no longer).
> 
> There's a usage in kernel/debug/debug_core.c, but it doesn't fill me with
> joy.

That is indeed an interesting one...  But my first round will be to see what
semantics the implementations seem to provide:

Acquire courtesy of TSO: s390, sparc, x86.
Acquire: ia64 (in reality fully ordered).
Control dependency: alpha, arc, arm, blackfin, hexagon, m32r, mn10300, tile,
	xtensa.
Control dependency plus leading full barrier: arm64, powerpc.
UP-only: c6x, cris, frv, h8300, m68k, microblaze, nios2, openrisc, um, unicore32.

Special cases:
	metag: Acquire if !CONFIG_METAG_SMP_WRITE_REORDERING.
	       Otherwise control dependency?
	mips: Control dependency, acquire if CONFIG_CPU_CAVIUM_OCTEON.
	parisc: Acquire courtesy of TSO, but why barrier in smp_load_acquire?
	sh: Acquire if one of SH4A, SH5, or J2, otherwise acquire?  UP-only?

Are these correct, or am I missing something with any of them?

							Thanx, Paul

Will Deacon July 3, 2017, 1:15 p.m. UTC | #5
On Fri, Jun 30, 2017 at 03:18:40PM -0700, Paul E. McKenney wrote:
> On Fri, Jun 30, 2017 at 02:13:39PM +0100, Will Deacon wrote:
> > On Fri, Jun 30, 2017 at 05:38:15AM -0700, Paul E. McKenney wrote:
> > > I also need to check all uses of spin_is_locked().  There might no
> > > longer be any that rely on any particular ordering...
> > 
> > Right. I think we're looking for the "insane case" as per 38b850a73034
> > (which was apparently used by ipc/sem.c at the time, but no longer).
> > 
> > There's a usage in kernel/debug/debug_core.c, but it doesn't fill me with
> > joy.
> 
> That is indeed an interesting one...  But my first round will be to see what
> semantics the implementations seem to provide:
> 
> Acquire courtesy of TSO: s390, sparc, x86.
> Acquire: ia64 (in reality fully ordered).
> Control dependency: alpha, arc, arm, blackfin, hexagon, m32r, mn10300, tile,
> 	xtensa.
> Control dependency plus leading full barrier: arm64, powerpc.
> UP-only: c6x, cris, frv, h8300, m68k, microblaze, nios2, openrisc, um, unicore32.
> 
> Special cases:
> 	metag: Acquire if !CONFIG_METAG_SMP_WRITE_REORDERING.
> 	       Otherwise control dependency?
> 	mips: Control dependency, acquire if CONFIG_CPU_CAVIUM_OCTEON.
> 	parisc: Acquire courtesy of TSO, but why barrier in smp_load_acquire?
> 	sh: Acquire if one of SH4A, SH5, or J2, otherwise acquire?  UP-only?
> 
> Are these correct, or am I missing something with any of them?

That looks about right but, at least on ARM, I think we have to consider
the semantics of spin_is_locked with respect to the other spin_* functions,
rather than in isolation.

For example, ARM only has a control dependency, but spin_lock has a trailing
smp_mb() and spin_unlock has both leading and trailing smp_mb().

Will
Paul E. McKenney July 3, 2017, 4:18 p.m. UTC | #6
On Mon, Jul 03, 2017 at 02:15:14PM +0100, Will Deacon wrote:
> On Fri, Jun 30, 2017 at 03:18:40PM -0700, Paul E. McKenney wrote:
> > On Fri, Jun 30, 2017 at 02:13:39PM +0100, Will Deacon wrote:
> > > On Fri, Jun 30, 2017 at 05:38:15AM -0700, Paul E. McKenney wrote:
> > > > I also need to check all uses of spin_is_locked().  There might no
> > > > longer be any that rely on any particular ordering...
> > > 
> > > Right. I think we're looking for the "insane case" as per 38b850a73034
> > > (which was apparently used by ipc/sem.c at the time, but no longer).
> > > 
> > > There's a usage in kernel/debug/debug_core.c, but it doesn't fill me with
> > > joy.
> > 
> > That is indeed an interesting one...  But my first round will be to see what
> > semantics the implementations seem to provide:
> > 
> > Acquire courtesy of TSO: s390, sparc, x86.
> > Acquire: ia64 (in reality fully ordered).
> > Control dependency: alpha, arc, arm, blackfin, hexagon, m32r, mn10300, tile,
> > 	xtensa.
> > Control dependency plus leading full barrier: arm64, powerpc.
> > UP-only: c6x, cris, frv, h8300, m68k, microblaze, nios2, openrisc, um, unicore32.
> > 
> > Special cases:
> > 	metag: Acquire if !CONFIG_METAG_SMP_WRITE_REORDERING.
> > 	       Otherwise control dependency?
> > 	mips: Control dependency, acquire if CONFIG_CPU_CAVIUM_OCTEON.
> > 	parisc: Acquire courtesy of TSO, but why barrier in smp_load_acquire?
> > 	sh: Acquire if one of SH4A, SH5, or J2, otherwise acquire?  UP-only?
> > 
> > Are these correct, or am I missing something with any of them?
> 
> That looks about right but, at least on ARM, I think we have to consider
> the semantics of spin_is_locked with respect to the other spin_* functions,
> rather than in isolation.
> 
> For example, ARM only has a control dependency, but spin_lock has a trailing
> smp_mb() and spin_unlock has both leading and trailing smp_mb().

Agreed, and my next step is to look at spin_lock() followed by
spin_is_locked(), not necessarily the same lock.
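
For reference, a condensed sketch of that pattern, essentially CASE 1 from
the comment this patch removes (G and L are the global and local locks from
that example; this is illustration only, not code from any real caller):

#include <linux/spinlock.h>
#include <linux/types.h>

static DEFINE_SPINLOCK(G);	/* global lock */
static DEFINE_SPINLOCK(L);	/* one of the local locks */

/*
 * CPU1 side of CASE 1: the load in spin_is_locked(&G) is assumed to be
 * ordered after the store that acquired L, so that the global-lock
 * holder is guaranteed to observe L held.
 */
static bool local_fast_path(void)
{
	spin_lock(&L);
	if (!spin_is_locked(&G)) {
		smp_acquire__after_ctrl_dep();
		return true;		/* proceed, still holding L */
	}
	spin_unlock(&L);
	return false;			/* global lock held, take the slow path */
}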

							Thanx, Paul

Linus Torvalds July 3, 2017, 4:40 p.m. UTC | #7
On Mon, Jul 3, 2017 at 9:18 AM, Paul E. McKenney
<paulmck@linux.vnet.ibm.com> wrote:
>
> Agreed, and my next step is to look at spin_lock() followed by
> spin_is_locked(), not necessarily the same lock.

Hmm. Most (all?) "spin_is_locked()" really should be about the same
thread that took the lock (ie it's about asserts and lock debugging).

The optimistic ABBA avoidance pattern for spinlocks *should* be

    spin_lock(inner)
    ...
    if (!try_lock(outer)) {
           spin_unlock(inner);
           .. do them in the right order ..

so I don't think spin_is_locked() should have any memory barriers.
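
Spelled out a little more, as an illustrative sketch (made-up lock names,
canonical order outer-then-inner):

#include <linux/spinlock.h>

static DEFINE_SPINLOCK(outer);
static DEFINE_SPINLOCK(inner);

static void lock_both(void)
{
	spin_lock(&inner);
	/* ... work that needs only "inner" ... */
	if (!spin_trylock(&outer)) {
		/*
		 * Someone may already hold "outer" and be spinning on
		 * "inner": back off and take the locks in the canonical
		 * outer-then-inner order.
		 */
		spin_unlock(&inner);
		spin_lock(&outer);
		spin_lock(&inner);
	}
	/* both locks held here, no spin_is_locked() and no barriers needed */
}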

In fact, the core function for spin_is_locked() is arguably
arch_spin_value_unlocked() which doesn't even do the access itself.

                       Linus
Will Deacon July 3, 2017, 5:13 p.m. UTC | #8
On Mon, Jul 03, 2017 at 09:40:22AM -0700, Linus Torvalds wrote:
> On Mon, Jul 3, 2017 at 9:18 AM, Paul E. McKenney
> <paulmck@linux.vnet.ibm.com> wrote:
> >
> > Agreed, and my next step is to look at spin_lock() followed by
> > spin_is_locked(), not necessarily the same lock.
> 
> Hmm. Most (all?) "spin_is_locked()" really should be about the same
> thread that took the lock (ie it's about asserts and lock debugging).
> 
> The optimistic ABBA avoidance pattern for spinlocks *should* be
> 
>     spin_lock(inner)
>     ...
>     if (!try_lock(outer)) {
>            spin_unlock(inner);
>            .. do them in the right order ..
> 
> so I don't think spin_is_locked() should have any memory barriers.
> 
> In fact, the core function for spin_is_locked() is arguably
> arch_spin_value_unlocked() which doesn't even do the access itself.

Yeah, but there's some spaced-out stuff going on in kgdb_cpu_enter where
it looks to me like raw_spin_is_locked is used for synchronization. My
eyes are hurting looking at it, though.

Will
Paul E. McKenney July 3, 2017, 9:10 p.m. UTC | #9
On Mon, Jul 03, 2017 at 09:40:22AM -0700, Linus Torvalds wrote:
> On Mon, Jul 3, 2017 at 9:18 AM, Paul E. McKenney
> <paulmck@linux.vnet.ibm.com> wrote:
> >
> > Agreed, and my next step is to look at spin_lock() followed by
> > spin_is_locked(), not necessarily the same lock.
> 
> Hmm. Most (all?) "spin_is_locked()" really should be about the same
> thread that took the lock (ie it's about asserts and lock debugging).

Good to know, that does make things easier.  ;-)

I am not certain that it is feasible to automatically recognize
non-assert/non-debugging use cases of spin_is_locked(), but there is
always manual inspection.

> The optimistic ABBA avoidance pattern for spinlocks *should* be
> 
>     spin_lock(inner)
>     ...
>     if (!try_lock(outer)) {
>            spin_unlock(inner);
>            .. do them in the right order ..
> 
> so I don't think spin_is_locked() should have any memory barriers.
> 
> In fact, the core function for spin_is_locked() is arguably
> arch_spin_value_unlocked() which doesn't even do the access itself.

OK, so we should rework any cases where people are relying on acquisition
of one spin_lock() being ordered with a later spin_is_locked() on some
other lock by that same thread.
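
One way such a rework might look, assuming the cross-lock ordering really
is needed, is to make the barrier explicit rather than relying on
spin_is_locked() to provide it (illustrative names only):

#include <linux/spinlock.h>
#include <linux/types.h>

static DEFINE_SPINLOCK(this_lock);
static DEFINE_SPINLOCK(other_lock);

static bool other_side_active(void)
{
	bool locked;

	spin_lock(&this_lock);
	/*
	 * spin_lock() is only an ACQUIRE, so it need not order the store
	 * acquiring this_lock against the later load of other_lock.  If
	 * that ordering matters, say so explicitly.
	 */
	smp_mb();
	locked = spin_is_locked(&other_lock);
	spin_unlock(&this_lock);

	return locked;
}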

							Thanx, Paul

Paul E. McKenney July 3, 2017, 10:30 p.m. UTC | #10
On Mon, Jul 03, 2017 at 06:13:38PM +0100, Will Deacon wrote:
> On Mon, Jul 03, 2017 at 09:40:22AM -0700, Linus Torvalds wrote:
> > On Mon, Jul 3, 2017 at 9:18 AM, Paul E. McKenney
> > <paulmck@linux.vnet.ibm.com> wrote:
> > >
> > > Agreed, and my next step is to look at spin_lock() followed by
> > > spin_is_locked(), not necessarily the same lock.
> > 
> > Hmm. Most (all?) "spin_is_locked()" really should be about the same
> > thread that took the lock (ie it's about asserts and lock debugging).
> > 
> > The optimistic ABBA avoidance pattern for spinlocks *should* be
> > 
> >     spin_lock(inner)
> >     ...
> >     if (!try_lock(outer)) {
> >            spin_unlock(inner);
> >            .. do them in the right order ..
> > 
> > so I don't think spin_is_locked() should have any memory barriers.
> > 
> > In fact, the core function for spin_is_locked() is arguably
> > arch_spin_value_unlocked() which doesn't even do the access itself.
> 
> Yeah, but there's some spaced-out stuff going on in kgdb_cpu_enter where
> it looks to me like raw_spin_is_locked is used for synchronization. My
> eyes are hurting looking at it, though.

That certainly is one interesting function, isn't it?  I wonder what
happens if you replace the raw_spin_is_locked() calls with an
unlock under a trylock check?  ;-)
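
Purely to illustrate the question, not as a recommendation, something along
these lines (dbg_lock is a stand-in name):

#include <linux/spinlock.h>
#include <linux/types.h>

/*
 * "Unlock under a trylock check": briefly take and release the lock
 * instead of merely sampling its state with raw_spin_is_locked().
 */
static bool lock_momentarily_free(raw_spinlock_t *dbg_lock)
{
	if (raw_spin_trylock(dbg_lock)) {
		raw_spin_unlock(dbg_lock);
		return true;	/* nobody held it at that instant */
	}
	return false;		/* some CPU currently holds it */
}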

							Thanx, Paul

Linus Torvalds July 3, 2017, 10:49 p.m. UTC | #11
On Mon, Jul 3, 2017 at 3:30 PM, Paul E. McKenney
<paulmck@linux.vnet.ibm.com> wrote:
>
> That certainly is one interesting function, isn't it?  I wonder what
> happens if you replace the raw_spin_is_locked() calls with an
> unlock under a trylock check?  ;-)

Deadlock due to interrupts again?

Didn't your spin_unlock_wait() patches teach you anything? Checking
state is fundamentally different from taking the lock. Even a trylock.

I guess you could try with the irqsave versions. But no, we're not doing that.

                Linus
Paul E. McKenney July 4, 2017, 12:39 a.m. UTC | #12
On Mon, Jul 03, 2017 at 03:49:42PM -0700, Linus Torvalds wrote:
> On Mon, Jul 3, 2017 at 3:30 PM, Paul E. McKenney
> <paulmck@linux.vnet.ibm.com> wrote:
> >
> > That certainly is one interesting function, isn't it?  I wonder what
> > happens if you replace the raw_spin_is_locked() calls with an
> > unlock under a trylock check?  ;-)
> 
> Deadlock due to interrupts again?

Unless I am missing something subtle, the kgdb_cpu_enter() function in
question has a local_irq_save() over the "interesting" portion of its
workings, so interrupt-handler self-deadlock should not happen.

> Didn't your spin_unlock_wait() patches teach you anything? Checking
> state is fundamentally different from taking the lock. Even a trylock.

That was an embarrassing bug, no two ways about it.  :-/

> I guess you could try with the irqsave versions. But no, we're not doing that.

Again, no need in this case.

But I agree with Will's assessment of this function...

The raw_spin_is_locked() looks to be asking if -any- CPU holds the
dbg_slave_lock, and the answer could of course change immediately
on return from raw_spin_is_locked().  Perhaps the theory is that
if other CPU holds the lock, this CPU is supposed to be subjected to
kgdb_roundup_cpus().  Except that the CPU that held dbg_slave_lock might
be just about to release that lock.  Odd.

Seems like there should be a get_online_cpus() somewhere, but maybe
that constraint is to be manually enforced.

							Thanx, Paul

Paul E. McKenney July 4, 2017, 12:54 a.m. UTC | #13
On Mon, Jul 03, 2017 at 05:39:36PM -0700, Paul E. McKenney wrote:
> On Mon, Jul 03, 2017 at 03:49:42PM -0700, Linus Torvalds wrote:
> > On Mon, Jul 3, 2017 at 3:30 PM, Paul E. McKenney
> > <paulmck@linux.vnet.ibm.com> wrote:
> > >
> > > That certainly is one interesting function, isn't it?  I wonder what
> > > happens if you replace the raw_spin_is_locked() calls with an
> > > unlock under a trylock check?  ;-)
> > 
> > Deadlock due to interrupts again?
> 
> Unless I am missing something subtle, the kgdb_cpu_enter() function in
> question has a local_irq_save() over the "interesting" portion of its
> workings, so interrupt-handler self-deadlock should not happen.
> 
> > Didn't your spin_unlock_wait() patches teach you anything? Checking
> > state is fundamentally different from taking the lock. Even a trylock.
> 
> That was an embarrassing bug, no two ways about it.  :-/
> 
> > I guess you could try with the irqsave versions. But no, we're not doing that.
> 
> Again, no need in this case.
> 
> But I agree with Will's assessment of this function...
> 
> The raw_spin_is_locked() looks to be asking if -any- CPU holds the
> dbg_slave_lock, and the answer could of course change immediately
> on return from raw_spin_is_locked().  Perhaps the theory is that
> if some other CPU holds the lock, this CPU is supposed to be subjected to
> kgdb_roundup_cpus().  Except that the CPU that held dbg_slave_lock might
> be just about to release that lock.  Odd.
> 
> Seems like there should be a get_online_cpus() somewhere, but maybe
> that constraint is to be manually enforced.

Except that invoking get_online_cpus() from an exception handler would
of course be a spectacularly bad idea.  I would feel better if the
num_online_cpus() check were under the local_irq_save(), but perhaps this
code is relying on the stop_machine().  Except that it appears we could
deadlock with the offline path waiting for stop_machine() to complete and
kgdb waiting for all CPUs to report, including those in stop_machine().

Looks like the current situation is "Don't use kgdb if there is any
possibility of CPU-hotplug operations."  Not necessarily an unreasonable
restriction.

But I need to let my eyes heal a bit before looking at this more.

							Thanx, Paul


Patch

diff --git a/include/asm-generic/qspinlock.h b/include/asm-generic/qspinlock.h
index 9f0681bf1e87..66260777d644 100644
--- a/include/asm-generic/qspinlock.h
+++ b/include/asm-generic/qspinlock.h
@@ -22,17 +22,6 @@ 
 #include <asm-generic/qspinlock_types.h>
 
 /**
- * queued_spin_unlock_wait - wait until the _current_ lock holder releases the lock
- * @lock : Pointer to queued spinlock structure
- *
- * There is a very slight possibility of live-lock if the lockers keep coming
- * and the waiter is just unfortunate enough to not see any unlock state.
- */
-#ifndef queued_spin_unlock_wait
-extern void queued_spin_unlock_wait(struct qspinlock *lock);
-#endif
-
-/**
  * queued_spin_is_locked - is the spinlock locked?
  * @lock: Pointer to queued spinlock structure
  * Return: 1 if it is locked, 0 otherwise
@@ -41,8 +30,6 @@  extern void queued_spin_unlock_wait(struct qspinlock *lock);
 static __always_inline int queued_spin_is_locked(struct qspinlock *lock)
 {
 	/*
-	 * See queued_spin_unlock_wait().
-	 *
 	 * Any !0 state indicates it is locked, even if _Q_LOCKED_VAL
 	 * isn't immediately observable.
 	 */
@@ -135,6 +122,5 @@  static __always_inline bool virt_spin_lock(struct qspinlock *lock)
 #define arch_spin_trylock(l)		queued_spin_trylock(l)
 #define arch_spin_unlock(l)		queued_spin_unlock(l)
 #define arch_spin_lock_flags(l, f)	queued_spin_lock(l)
-#define arch_spin_unlock_wait(l)	queued_spin_unlock_wait(l)
 
 #endif /* __ASM_GENERIC_QSPINLOCK_H */
diff --git a/include/linux/spinlock.h b/include/linux/spinlock.h
index d9510e8522d4..ef018a6e4985 100644
--- a/include/linux/spinlock.h
+++ b/include/linux/spinlock.h
@@ -130,12 +130,6 @@  do {								\
 #define smp_mb__before_spinlock()	smp_wmb()
 #endif
 
-/**
- * raw_spin_unlock_wait - wait until the spinlock gets unlocked
- * @lock: the spinlock in question.
- */
-#define raw_spin_unlock_wait(lock)	arch_spin_unlock_wait(&(lock)->raw_lock)
-
 #ifdef CONFIG_DEBUG_SPINLOCK
  extern void do_raw_spin_lock(raw_spinlock_t *lock) __acquires(lock);
 #define do_raw_spin_lock_flags(lock, flags) do_raw_spin_lock(lock)
@@ -369,31 +363,6 @@  static __always_inline int spin_trylock_irq(spinlock_t *lock)
 	raw_spin_trylock_irqsave(spinlock_check(lock), flags); \
 })
 
-/**
- * spin_unlock_wait - Interpose between successive critical sections
- * @lock: the spinlock whose critical sections are to be interposed.
- *
- * Semantically this is equivalent to a spin_lock() immediately
- * followed by a spin_unlock().  However, most architectures have
- * more efficient implementations in which the spin_unlock_wait()
- * cannot block concurrent lock acquisition, and in some cases
- * where spin_unlock_wait() does not write to the lock variable.
- * Nevertheless, spin_unlock_wait() can have high overhead, so if
- * you feel the need to use it, please check to see if there is
- * a better way to get your job done.
- *
- * The ordering guarantees provided by spin_unlock_wait() are:
- *
- * 1.  All accesses preceding the spin_unlock_wait() happen before
- *     any accesses in later critical sections for this same lock.
- * 2.  All accesses following the spin_unlock_wait() happen after
- *     any accesses in earlier critical sections for this same lock.
- */
-static __always_inline void spin_unlock_wait(spinlock_t *lock)
-{
-	raw_spin_unlock_wait(&lock->rlock);
-}
-
 static __always_inline int spin_is_locked(spinlock_t *lock)
 {
 	return raw_spin_is_locked(&lock->rlock);
diff --git a/include/linux/spinlock_up.h b/include/linux/spinlock_up.h
index 0d9848de677d..612fb530af41 100644
--- a/include/linux/spinlock_up.h
+++ b/include/linux/spinlock_up.h
@@ -26,11 +26,6 @@ 
 #ifdef CONFIG_DEBUG_SPINLOCK
 #define arch_spin_is_locked(x)		((x)->slock == 0)
 
-static inline void arch_spin_unlock_wait(arch_spinlock_t *lock)
-{
-	smp_cond_load_acquire(&lock->slock, VAL);
-}
-
 static inline void arch_spin_lock(arch_spinlock_t *lock)
 {
 	lock->slock = 0;
@@ -73,7 +68,6 @@  static inline void arch_spin_unlock(arch_spinlock_t *lock)
 
 #else /* DEBUG_SPINLOCK */
 #define arch_spin_is_locked(lock)	((void)(lock), 0)
-#define arch_spin_unlock_wait(lock)	do { barrier(); (void)(lock); } while (0)
 /* for sched/core.c and kernel_lock.c: */
 # define arch_spin_lock(lock)		do { barrier(); (void)(lock); } while (0)
 # define arch_spin_lock_flags(lock, flags)	do { barrier(); (void)(lock); } while (0)
diff --git a/kernel/locking/qspinlock.c b/kernel/locking/qspinlock.c
index b2caec7315af..64a9051e4c2c 100644
--- a/kernel/locking/qspinlock.c
+++ b/kernel/locking/qspinlock.c
@@ -267,123 +267,6 @@  static __always_inline u32  __pv_wait_head_or_lock(struct qspinlock *lock,
 #define queued_spin_lock_slowpath	native_queued_spin_lock_slowpath
 #endif
 
-/*
- * Various notes on spin_is_locked() and spin_unlock_wait(), which are
- * 'interesting' functions:
- *
- * PROBLEM: some architectures have an interesting issue with atomic ACQUIRE
- * operations in that the ACQUIRE applies to the LOAD _not_ the STORE (ARM64,
- * PPC). Also qspinlock has a similar issue per construction, the setting of
- * the locked byte can be unordered acquiring the lock proper.
- *
- * This gets to be 'interesting' in the following cases, where the /should/s
- * end up false because of this issue.
- *
- *
- * CASE 1:
- *
- * So the spin_is_locked() correctness issue comes from something like:
- *
- *   CPU0				CPU1
- *
- *   global_lock();			local_lock(i)
- *     spin_lock(&G)			  spin_lock(&L[i])
- *     for (i)				  if (!spin_is_locked(&G)) {
- *       spin_unlock_wait(&L[i]);	    smp_acquire__after_ctrl_dep();
- *					    return;
- *					  }
- *					  // deal with fail
- *
- * Where it is important CPU1 sees G locked or CPU0 sees L[i] locked such
- * that there is exclusion between the two critical sections.
- *
- * The load from spin_is_locked(&G) /should/ be constrained by the ACQUIRE from
- * spin_lock(&L[i]), and similarly the load(s) from spin_unlock_wait(&L[i])
- * /should/ be constrained by the ACQUIRE from spin_lock(&G).
- *
- * Similarly, later stuff is constrained by the ACQUIRE from CTRL+RMB.
- *
- *
- * CASE 2:
- *
- * For spin_unlock_wait() there is a second correctness issue, namely:
- *
- *   CPU0				CPU1
- *
- *   flag = set;
- *   smp_mb();				spin_lock(&l)
- *   spin_unlock_wait(&l);		if (!flag)
- *					  // add to lockless list
- *					spin_unlock(&l);
- *   // iterate lockless list
- *
- * Which wants to ensure that CPU1 will stop adding bits to the list and CPU0
- * will observe the last entry on the list (if spin_unlock_wait() had ACQUIRE
- * semantics etc..)
- *
- * Where flag /should/ be ordered against the locked store of l.
- */
-
-/*
- * queued_spin_lock_slowpath() can (load-)ACQUIRE the lock before
- * issuing an _unordered_ store to set _Q_LOCKED_VAL.
- *
- * This means that the store can be delayed, but no later than the
- * store-release from the unlock. This means that simply observing
- * _Q_LOCKED_VAL is not sufficient to determine if the lock is acquired.
- *
- * There are two paths that can issue the unordered store:
- *
- *  (1) clear_pending_set_locked():	*,1,0 -> *,0,1
- *
- *  (2) set_locked():			t,0,0 -> t,0,1 ; t != 0
- *      atomic_cmpxchg_relaxed():	t,0,0 -> 0,0,1
- *
- * However, in both cases we have other !0 state we've set before to queue
- * ourseves:
- *
- * For (1) we have the atomic_cmpxchg_acquire() that set _Q_PENDING_VAL, our
- * load is constrained by that ACQUIRE to not pass before that, and thus must
- * observe the store.
- *
- * For (2) we have a more intersting scenario. We enqueue ourselves using
- * xchg_tail(), which ends up being a RELEASE. This in itself is not
- * sufficient, however that is followed by an smp_cond_acquire() on the same
- * word, giving a RELEASE->ACQUIRE ordering. This again constrains our load and
- * guarantees we must observe that store.
- *
- * Therefore both cases have other !0 state that is observable before the
- * unordered locked byte store comes through. This means we can use that to
- * wait for the lock store, and then wait for an unlock.
- */
-#ifndef queued_spin_unlock_wait
-void queued_spin_unlock_wait(struct qspinlock *lock)
-{
-	u32 val;
-
-	for (;;) {
-		val = atomic_read(&lock->val);
-
-		if (!val) /* not locked, we're done */
-			goto done;
-
-		if (val & _Q_LOCKED_MASK) /* locked, go wait for unlock */
-			break;
-
-		/* not locked, but pending, wait until we observe the lock */
-		cpu_relax();
-	}
-
-	/* any unlock is good */
-	while (atomic_read(&lock->val) & _Q_LOCKED_MASK)
-		cpu_relax();
-
-done:
-	smp_acquire__after_ctrl_dep();
-}
-EXPORT_SYMBOL(queued_spin_unlock_wait);
-#endif
-
 #endif /* _GEN_PV_LOCK_SLOWPATH */
 
 /**