Patchwork 64-bit ppc rwsem

Submitter Benjamin Herrenschmidt
Date Aug. 19, 2010, 5:23 a.m.
Message ID <1282195403.22370.296.camel@pasglop>
Permalink /patch/62110/
State Accepted
Commit 529b7307d804f649839b5b65b303442140266d26

Comments

Benjamin Herrenschmidt - Aug. 19, 2010, 5:23 a.m.
On Tue, 2010-08-17 at 22:28 -0700, David Miller wrote:
> From: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> Date: Wed, 18 Aug 2010 15:03:23 +1000
> 
> > I tried various tricks but so far they didn't work. I'll have another
> > look tomorrow, but I may end up having to keep all the crap typecasts.
> 
> The casts are pretty much unavoidable.
> 
> Here's what I'm going to end up using on sparc64:

Similar here, but using atomic_long_t instead so it works for 32-bit too
for me. I suppose we could make that part common indeed.

What about asm-generic/rwsem-atomic.h  or rwsem-cmpxchg.h ?

Below is my current patch, seems to boot fine here so far.

Cheers,
Ben

Subject: [PATCH] powerpc: Make rwsem use "long" type

This makes the 64-bit kernel use 64-bit signed integers for the counter
(effectively supporting 32 bits of active count in the semaphore), thus
avoiding things like overflow of the mmap_sem if you use a really crazy
number of threads.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
---
 arch/powerpc/include/asm/rwsem.h |   64 ++++++++++++++++++++++----------------
 1 files changed, 37 insertions(+), 27 deletions(-)
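
To make the counter layout from the commit message concrete, here is a
minimal userspace sketch, not kernel code. It models the 32-bit layout
(16 bits of active count) and the 64-bit layout (32 bits of active count)
with fixed-width types so it behaves the same on any host. All names
(ACTIVE_MASK_32, active_count_64, check_layout, ...) are hypothetical and
only mirror the RWSEM_* constants in the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Model of the 32-bit layout: low 16 bits hold the active count. */
#define ACTIVE_MASK_32   INT32_C(0x0000ffff)
#define WAITING_BIAS_32  (-ACTIVE_MASK_32 - 1)

/* Model of the 64-bit layout from the patch: low 32 bits hold the active count. */
#define ACTIVE_MASK_64   INT64_C(0xffffffff)
#define WAITING_BIAS_64  (-ACTIVE_MASK_64 - 1)

#define ACTIVE_BIAS      1
#define WRITE_BIAS_32    (WAITING_BIAS_32 + ACTIVE_BIAS)
#define WRITE_BIAS_64    (WAITING_BIAS_64 + ACTIVE_BIAS)

/* Active holders as seen through the mask, for each layout. */
static inline int64_t active_count_32(int32_t count)
{
	return count & ACTIVE_MASK_32;
}

static inline int64_t active_count_64(int64_t count)
{
	return count & ACTIVE_MASK_64;
}

/* Sanity checks on the derived biases (assumed values, not taken from a build). */
static inline void check_layout(void)
{
	assert(WAITING_BIAS_32 == -65536);
	assert(WAITING_BIAS_64 == -4294967296LL);
}
```

With the 16-bit active field, 65536 simultaneous readers carry out of the
mask and the active count reads as zero; with the 32-bit field the same
number of readers is still counted correctly.
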
David Miller - Aug. 19, 2010, 5:29 a.m.
From: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Date: Thu, 19 Aug 2010 15:23:23 +1000

> Similar here, but using atomic_long_t instead so it works for 32-bit too
> for me. I suppose we could make that part common indeed.
> 
> What about asm-generic/rwsem-atomic.h  or rwsem-cmpxchg.h ?

Using rwsem-cmpxchg.h sounds best I guess.
Benjamin Herrenschmidt - Aug. 19, 2010, 10:24 a.m.
On Wed, 2010-08-18 at 22:29 -0700, David Miller wrote:
> From: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> Date: Thu, 19 Aug 2010 15:23:23 +1000
> 
> > Similar here, but using atomic_long_t instead so it works for 32-bit too
> > for me. I suppose we could make that part common indeed.
> > 
> > What about asm-generic/rwsem-atomic.h  or rwsem-cmpxchg.h ?
> 
> Using rwsem-cmpxchg.h sounds best I guess.

Ok, I'll send a new patch tomorrow that does that.

Cheers,
Ben.
Arnd Bergmann - Aug. 23, 2010, 1:44 p.m.
On Thursday 19 August 2010, David Miller wrote:
> From: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> Date: Thu, 19 Aug 2010 15:23:23 +1000
> 
> > Similar here, but using atomic_long_t instead so it works for 32-bit too
> > for me. I suppose we could make that part common indeed.
> > 
> > What about asm-generic/rwsem-atomic.h  or rwsem-cmpxchg.h ?
> 
> Using rwsem-cmpxchg.h sounds best I guess.

The implementation looks good for asm-generic, but there is now an asymmetry
between the spinlock and the atomic_long_t based version.

Maybe we can make them both do the same thing, either of

1. create include/linux/rwsem-cmpxchg.h and add an
   #elif defined(CONFIG_RWSEM_GENERIC_ATOMIC) to include/linux/rwsem.h

2. move include/linux/rwsem-spinlock.h to include/asm-generic/ and
   include that from all architectures that want the spinlock based version.

Further comments:

* Alpha has an optimization for the uniprocessor case, where the atomic
instructions get turned into nonatomic additions. The spinlock based
version uses no locks on UP but disables interrupts for reasons I don't
understand (nothing running at interrupt time should try to access an rwsem).
Should the generic version do the same as Alpha?

* Is there any architecture that would still benefit from having a separate
rwsem implementation? AFAICT all the remaining ones are just variations of
the same concept of using cmpxchg (or xadd in case of x86), which is what
atomics typically end up doing anyway.

	Arnd
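
The cmpxchg-based scheme mentioned above can be sketched in userspace with
C11 atomics. This is only an illustration under stated assumptions: it uses
the 32-bit layout values (write bias of -0xffff), the function names are
hypothetical, and the C11 compare-exchange calls stand in for the kernel's
cmpxchg(); it is not the code from the patch.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define SKETCH_ACTIVE_READ_BIAS   1L
#define SKETCH_ACTIVE_WRITE_BIAS  (-0xffffL)	/* waiting bias + one active holder */

/* Take a read lock only if no writer is active or waiting (count >= 0). */
static inline bool down_read_trylock_sketch(atomic_long *count)
{
	long old = atomic_load_explicit(count, memory_order_relaxed);

	while (old >= 0) {
		/* On failure, 'old' is refreshed with the current value and we retry. */
		if (atomic_compare_exchange_weak_explicit(count, &old,
				old + SKETCH_ACTIVE_READ_BIAS,
				memory_order_acquire, memory_order_relaxed))
			return true;
	}
	return false;
}

/* Take the write lock only if the semaphore is completely unlocked (count == 0). */
static inline bool down_write_trylock_sketch(atomic_long *count)
{
	long expected = 0;

	return atomic_compare_exchange_strong_explicit(count, &expected,
			SKETCH_ACTIVE_WRITE_BIAS,
			memory_order_acquire, memory_order_relaxed);
}
```

Both paths reduce to "read the counter, compute the new value, swap it in
only if nobody changed it meanwhile", which is the common shape the
remaining per-architecture variants share.
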
Benjamin Herrenschmidt - Aug. 23, 2010, 10:01 p.m.
On Mon, 2010-08-23 at 15:44 +0200, Arnd Bergmann wrote:
> 
> * Alpha has an optimization for the uniprocessor case, where the atomic
> instructions get turned into nonatomic additions. The spinlock based
> version uses no locks on UP but disables interrupts for reasons I don't
> understand (nothing running at interrupt time should try to access an rwsem).
> Should the generic version do the same as Alpha?

I've seen drivers in the past do trylocks at interrupt time ... tho I
agree it sucks.

> * Is there any architecture that would still benefit from having a separate
> rwsem implementation? AFAICT all the remaining ones are just variations of
> the same concept of using cmpxchg (or xadd in case of x86), which is what
> atomics typically end up doing anyway.

It depends how sensitive rwsems are. 

The "generic" variant based on atomic's and cmpxchg on powerpc is
sub-optimal in the sense that it has stronger memory barriers than would
be necessary (atomic_inc_return for example has both acquire and
release).

But that vs. one more pile of inline asm, we decided it wasn't hot
enough a spot for us to care back then.

Cheers,
Ben.
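
The barrier-strength point can be illustrated with C11 memory orders as an
analogy. This is a userspace sketch with hypothetical function names; C11
orderings do not map one-to-one onto powerpc barrier instructions or onto
the kernel's atomic primitives, so treat it as a picture of the trade-off
rather than as what either implementation does.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Fully ordered increment: roughly the strength the generic atomics give. */
static inline bool down_read_fast_full(atomic_long *count)
{
	/* fetch_add returns the old value, so add 1 to get the "inc_return" result. */
	long new = atomic_fetch_add_explicit(count, 1, memory_order_seq_cst) + 1;

	return new > 0;		/* <= 0 would mean falling back to the slow path */
}

/* Acquire-only increment: what a lock operation strictly needs. */
static inline bool down_read_fast_acquire(atomic_long *count)
{
	long new = atomic_fetch_add_explicit(count, 1, memory_order_acquire) + 1;

	return new > 0;
}

/* Release-only decrement for the unlock side; returns the new counter value. */
static inline long up_read_release(atomic_long *count)
{
	return atomic_fetch_sub_explicit(count, 1, memory_order_release) - 1;
}
```

The fully ordered and acquire-only variants return the same result; they
differ only in how much reordering the hardware and compiler are allowed
around the operation, which is where the extra cost of the generic version
would come from.
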
David Miller - Aug. 23, 2010, 10:18 p.m.
From: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Date: Tue, 24 Aug 2010 08:01:25 +1000

> On Mon, 2010-08-23 at 15:44 +0200, Arnd Bergmann wrote:
>> 
>> * Alpha has an optimization for the uniprocessor case, where the atomic
>> instructions get turned into nonatomic additions. The spinlock based
>> version uses no locks on UP but disables interrupts for reasons I don't
>> understand (nothing running at interrupt time should try to access an rwsem).
>> Should the generic version do the same as Alpha?
> 
> I've seen drivers in the past do trylocks at interrupt time ... tho I
> agree it sucks.

Recently there was a thread where this was declared absolutely illegal.

Maybe it was allowed, or sort-of worked before, and that's why it's
accounted for with IRQ disables in some implementations.  I don't
know.
Benjamin Herrenschmidt - Aug. 24, 2010, 1:31 a.m.
On Mon, 2010-08-23 at 15:18 -0700, David Miller wrote:
> > I've seen drivers in the past do trylocks at interrupt time ... tho I
> > agree it sucks.
> 
> Recently there was a thread where this was declared absolutely
> illegal.
> 
> Maybe it was allowed, or sort-of worked before, and that's why it's
> accounted for with IRQ disables in some implementations.  I don't
> know. 

Ok, I'm happy to say it's a big no-no then.

Arnd, do you want to take over the moving to asm-generic and take care
of the spinlock case as well ? I can send Linus the first patch that
changes powerpc to use atomic_long now along with a few other things I
have pending, then you can pickup from there. Or do you want me to
continue pushing my patch as-is and we can look at cleaning up the
spinlock case separately ?

Cheers,
Ben.
Arnd Bergmann - Aug. 24, 2010, 12:06 p.m.
On Tuesday 24 August 2010, Benjamin Herrenschmidt wrote:
> On Mon, 2010-08-23 at 15:18 -0700, David Miller wrote:
> > > I've seen drivers in the past do trylocks at interrupt time ... tho I
> > > agree it sucks.
> > 
> > Recently there was a thread where this was declared absolutely
> > illegal.
> > 
> > Maybe it was allowed, or sort-of worked before, and that's why it's
> > accounted for with IRQ disables in some implementations.  I don't
> > know. 
> 
> Ok, I'm happy to say it's a big no-no then.
> 
> Arnd, do you want to take over the moving to asm-generic and take care
> of the spinlock case as well ? I can send Linus the first patch that
> changes powerpc to use atomic_long now along with a few other things I
> have pending, then you can pickup from there. Or do you want me to
> continue pushing my patch as-is and we can look at cleaning up the
> spinlock case separately ?

I'm currently doing too many things at once, please push in your existing
patch for now, we can continue from there.

For the asm-generic patch:
Acked-by: Arnd Bergmann <arnd@arndb.de>

Patch

diff --git a/arch/powerpc/include/asm/rwsem.h b/arch/powerpc/include/asm/rwsem.h
index 24cd928..8447d89 100644
--- a/arch/powerpc/include/asm/rwsem.h
+++ b/arch/powerpc/include/asm/rwsem.h
@@ -21,15 +21,20 @@ 
 /*
  * the semaphore definition
  */
-struct rw_semaphore {
-	/* XXX this should be able to be an atomic_t  -- paulus */
-	signed int		count;
-#define RWSEM_UNLOCKED_VALUE		0x00000000
-#define RWSEM_ACTIVE_BIAS		0x00000001
-#define RWSEM_ACTIVE_MASK		0x0000ffff
-#define RWSEM_WAITING_BIAS		(-0x00010000)
+#ifdef CONFIG_PPC64
+# define RWSEM_ACTIVE_MASK		0xffffffffL
+#else
+# define RWSEM_ACTIVE_MASK		0x0000ffffL
+#endif
+
+#define RWSEM_UNLOCKED_VALUE		0x00000000L
+#define RWSEM_ACTIVE_BIAS		0x00000001L
+#define RWSEM_WAITING_BIAS		(-RWSEM_ACTIVE_MASK-1)
 #define RWSEM_ACTIVE_READ_BIAS		RWSEM_ACTIVE_BIAS
 #define RWSEM_ACTIVE_WRITE_BIAS		(RWSEM_WAITING_BIAS + RWSEM_ACTIVE_BIAS)
+
+struct rw_semaphore {
+	long			count;
 	spinlock_t		wait_lock;
 	struct list_head	wait_list;
 #ifdef CONFIG_DEBUG_LOCK_ALLOC
@@ -43,9 +48,13 @@  struct rw_semaphore {
 # define __RWSEM_DEP_MAP_INIT(lockname)
 #endif
 
-#define __RWSEM_INITIALIZER(name) \
-	{ RWSEM_UNLOCKED_VALUE, __SPIN_LOCK_UNLOCKED((name).wait_lock), \
-	  LIST_HEAD_INIT((name).wait_list) __RWSEM_DEP_MAP_INIT(name) }
+#define __RWSEM_INITIALIZER(name)				\
+{								\
+	RWSEM_UNLOCKED_VALUE,					\
+	__SPIN_LOCK_UNLOCKED((name).wait_lock),			\
+	LIST_HEAD_INIT((name).wait_list)			\
+	__RWSEM_DEP_MAP_INIT(name)				\
+}
 
 #define DECLARE_RWSEM(name)		\
 	struct rw_semaphore name = __RWSEM_INITIALIZER(name)
@@ -70,13 +79,13 @@  extern void __init_rwsem(struct rw_semaphore *sem, const char *name,
  */
 static inline void __down_read(struct rw_semaphore *sem)
 {
-	if (unlikely(atomic_inc_return((atomic_t *)(&sem->count)) <= 0))
+	if (unlikely(atomic_long_inc_return((atomic_long_t *)&sem->count) <= 0))
 		rwsem_down_read_failed(sem);
 }
 
 static inline int __down_read_trylock(struct rw_semaphore *sem)
 {
-	int tmp;
+	long tmp;
 
 	while ((tmp = sem->count) >= 0) {
 		if (tmp == cmpxchg(&sem->count, tmp,
@@ -92,10 +101,10 @@  static inline int __down_read_trylock(struct rw_semaphore *sem)
  */
 static inline void __down_write_nested(struct rw_semaphore *sem, int subclass)
 {
-	int tmp;
+	long tmp;
 
-	tmp = atomic_add_return(RWSEM_ACTIVE_WRITE_BIAS,
-				(atomic_t *)(&sem->count));
+	tmp = atomic_long_add_return(RWSEM_ACTIVE_WRITE_BIAS,
+				     (atomic_long_t *)&sem->count);
 	if (unlikely(tmp != RWSEM_ACTIVE_WRITE_BIAS))
 		rwsem_down_write_failed(sem);
 }
@@ -107,7 +116,7 @@  static inline void __down_write(struct rw_semaphore *sem)
 
 static inline int __down_write_trylock(struct rw_semaphore *sem)
 {
-	int tmp;
+	long tmp;
 
 	tmp = cmpxchg(&sem->count, RWSEM_UNLOCKED_VALUE,
 		      RWSEM_ACTIVE_WRITE_BIAS);
@@ -119,9 +128,9 @@  static inline int __down_write_trylock(struct rw_semaphore *sem)
  */
 static inline void __up_read(struct rw_semaphore *sem)
 {
-	int tmp;
+	long tmp;
 
-	tmp = atomic_dec_return((atomic_t *)(&sem->count));
+	tmp = atomic_long_dec_return((atomic_long_t *)&sem->count);
 	if (unlikely(tmp < -1 && (tmp & RWSEM_ACTIVE_MASK) == 0))
 		rwsem_wake(sem);
 }
@@ -131,17 +140,17 @@  static inline void __up_read(struct rw_semaphore *sem)
  */
 static inline void __up_write(struct rw_semaphore *sem)
 {
-	if (unlikely(atomic_sub_return(RWSEM_ACTIVE_WRITE_BIAS,
-			      (atomic_t *)(&sem->count)) < 0))
+	if (unlikely(atomic_long_sub_return(RWSEM_ACTIVE_WRITE_BIAS,
+				 (atomic_long_t *)&sem->count) < 0))
 		rwsem_wake(sem);
 }
 
 /*
  * implement atomic add functionality
  */
-static inline void rwsem_atomic_add(int delta, struct rw_semaphore *sem)
+static inline void rwsem_atomic_add(long delta, struct rw_semaphore *sem)
 {
-	atomic_add(delta, (atomic_t *)(&sem->count));
+	atomic_long_add(delta, (atomic_long_t *)&sem->count);
 }
 
 /*
@@ -149,9 +158,10 @@  static inline void rwsem_atomic_add(int delta, struct rw_semaphore *sem)
  */
 static inline void __downgrade_write(struct rw_semaphore *sem)
 {
-	int tmp;
+	long tmp;
 
-	tmp = atomic_add_return(-RWSEM_WAITING_BIAS, (atomic_t *)(&sem->count));
+	tmp = atomic_long_add_return(-RWSEM_WAITING_BIAS,
+				     (atomic_long_t *)&sem->count);
 	if (tmp < 0)
 		rwsem_downgrade_wake(sem);
 }
@@ -159,14 +169,14 @@  static inline void __downgrade_write(struct rw_semaphore *sem)
 /*
  * implement exchange and add functionality
  */
-static inline int rwsem_atomic_update(int delta, struct rw_semaphore *sem)
+static inline long rwsem_atomic_update(long delta, struct rw_semaphore *sem)
 {
-	return atomic_add_return(delta, (atomic_t *)(&sem->count));
+	return atomic_long_add_return(delta, (atomic_long_t *)&sem->count);
 }
 
 static inline int rwsem_is_locked(struct rw_semaphore *sem)
 {
-	return (sem->count != 0);
+	return sem->count != 0;
 }
 
 #endif	/* __KERNEL__ */