Message ID | 20040.65109.439941.444670@pilspetsen.it.uu.se |
---|---|
State | Superseded |
Delegated to: | David Miller |
On Mon, Aug 15, 2011 at 01:09:09PM +0200, Mikael Pettersson wrote:
> The sparc32 version of arch_write_unlock() is just a plain assignment.
> Unfortunately this allows the compiler to schedule side-effects in a
> protected region to occur after the HW-level unlock, which is broken.
> E.g., the following trivial test case gets miscompiled:
>
> #include <linux/spinlock.h>
> rwlock_t lock;
> int counter;
> void foo(void) { write_lock(&lock); ++counter; write_unlock(&lock); }
>
> Fixed by adding a compiler memory barrier to arch_write_unlock(). The
> sparc64 version combines the barrier and assignment into a single asm(),
> so that's what I did here as well.
>
> Compiled-tested with a sparc32 SMP kernel.
>
> Signed-off-by: Mikael Pettersson <mikpe@it.uu.se>
> ---
> --- linux-3.1-rc2/arch/sparc/include/asm/spinlock_32.h.~1~	2011-07-22 12:01:08.000000000 +0200
> +++ linux-3.1-rc2/arch/sparc/include/asm/spinlock_32.h	2011-08-15 11:43:49.000000000 +0200
> @@ -131,6 +131,15 @@ static inline void arch_write_lock(arch_
>  	*(volatile __u32 *)&lp->lock = ~0U;
>  }
>
> +static void inline arch_write_unlock(arch_rwlock_t *lock)
> +{
> +	__asm__ __volatile__(
> +"	st		%%g0, [%0]"
> +	: /* no outputs */
> +	: "r" (lock)
> +	: "memory");
> +}
> +
>  static inline int arch_write_trylock(arch_rwlock_t *rw)
>  {
>  	unsigned int val;
> @@ -175,7 +184,7 @@ static inline int __arch_read_trylock(ar
>  	res; \
>  })
>
> -#define arch_write_unlock(rw)	do { (rw)->lock = 0; } while(0)
> +#define arch_write_unlock(rw)	arch_write_unlock(rw)
>
>  #define arch_spin_lock_flags(lock, flags) arch_spin_lock(lock)
>  #define arch_read_lock_flags(rw, flags) arch_read_lock(rw)

Why keep the tautological define? Just wondering.
Josip Rodin writes:
 > On Mon, Aug 15, 2011 at 01:09:09PM +0200, Mikael Pettersson wrote:
 > > The sparc32 version of arch_write_unlock() is just a plain assignment.
 > > Unfortunately this allows the compiler to schedule side-effects in a
 > > protected region to occur after the HW-level unlock, which is broken.
 > > E.g., the following trivial test case gets miscompiled:
 > >
 > > #include <linux/spinlock.h>
 > > rwlock_t lock;
 > > int counter;
 > > void foo(void) { write_lock(&lock); ++counter; write_unlock(&lock); }
 > >
 > > Fixed by adding a compiler memory barrier to arch_write_unlock(). The
 > > sparc64 version combines the barrier and assignment into a single asm(),
 > > so that's what I did here as well.
 > >
 > > Compiled-tested with a sparc32 SMP kernel.
 > >
 > > Signed-off-by: Mikael Pettersson <mikpe@it.uu.se>
 > > ---
 > > --- linux-3.1-rc2/arch/sparc/include/asm/spinlock_32.h.~1~	2011-07-22 12:01:08.000000000 +0200
 > > +++ linux-3.1-rc2/arch/sparc/include/asm/spinlock_32.h	2011-08-15 11:43:49.000000000 +0200
 > > @@ -131,6 +131,15 @@ static inline void arch_write_lock(arch_
 > >  	*(volatile __u32 *)&lp->lock = ~0U;
 > >  }
 > >
 > > +static void inline arch_write_unlock(arch_rwlock_t *lock)
 > > +{
 > > +	__asm__ __volatile__(
 > > +"	st		%%g0, [%0]"
 > > +	: /* no outputs */
 > > +	: "r" (lock)
 > > +	: "memory");
 > > +}
 > > +
 > >  static inline int arch_write_trylock(arch_rwlock_t *rw)
 > >  {
 > >  	unsigned int val;
 > > @@ -175,7 +184,7 @@ static inline int __arch_read_trylock(ar
 > >  	res; \
 > >  })
 > >
 > > -#define arch_write_unlock(rw)	do { (rw)->lock = 0; } while(0)
 > > +#define arch_write_unlock(rw)	arch_write_unlock(rw)
 > >
 > >  #define arch_spin_lock_flags(lock, flags) arch_spin_lock(lock)
 > >  #define arch_read_lock_flags(rw, flags) arch_read_lock(rw)
 >
 > Why keep the tautological define? Just wondering.

Only because sparc64 does it that way. I now see that no other
arch has the #define, so perhaps that bit should be deleted (from
both sparc64 and sparc32).
-- To unsubscribe from this list: send the line "unsubscribe sparclinux" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
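For readers wondering why the "tautological" define compiles at all: a function-like macro that expands to its own name is legal C, because the preprocessor does not re-expand a macro name inside its own expansion. The only observable effect is that the name becomes visible to #ifdef. The sketch below is a stand-alone illustration, not kernel code; demo_unlock, DEMO_HAS_UNLOCK and demo_use are hypothetical names invented for it.

```c
#include <assert.h>

/* Hypothetical stand-in for an inline unlock helper. */
static inline int demo_unlock(int *lock)
{
	*lock = 0;
	return 1;
}

/*
 * Self-referential define: inside its own expansion the macro name is
 * not expanded again, so calls still reach the function above.
 */
#define demo_unlock(l) demo_unlock(l)

/* The define's one visible effect: the name can be tested with #ifdef. */
#ifdef demo_unlock
#define DEMO_HAS_UNLOCK 1
#else
#define DEMO_HAS_UNLOCK 0
#endif

static int demo_use(void)
{
	int lock = 1;
	int called = demo_unlock(&lock);	/* expands once, calls the function */

	assert(lock == 0);
	return called && DEMO_HAS_UNLOCK;
}
```

Dropping the define changes nothing for callers unless some other code tests for the name with #ifdef.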
Hi Mikael.

> > > @@ -175,7 +184,7 @@ static inline int __arch_read_trylock(ar
> > >  	res; \
> > >  })
> > >
> > > -#define arch_write_unlock(rw)	do { (rw)->lock = 0; } while(0)
> > > +#define arch_write_unlock(rw)	arch_write_unlock(rw)
> > >
> > >  #define arch_spin_lock_flags(lock, flags) arch_spin_lock(lock)
> > >  #define arch_read_lock_flags(rw, flags) arch_read_lock(rw)
> >
> > Why keep the tautological define? Just wondering.
>
> Only because sparc64 does it that way. I now see that no other
> arch has the #define, so perhaps that bit should be deleted (from
> both sparc64 and sparc32).

Please kill the extra define in both sparc32 and sparc64.
Preferably in two separate patches.

For the subject of the patch please use:

[PATCH] sparc32: bla bla

Because when davem applies the patch everything in [] is zapped,
and it is good to see in the shortlog that this is a sparc32 specific patch.

For sparc64 we sometimes use "sparc64: bla bla", and sometimes
use "sparc: bla bla". I prefer the first but no strong feelings.

And btw thanks for looking at this! Looks like a very difficult bug to nail.

	Sam
--- linux-3.1-rc2/arch/sparc/include/asm/spinlock_32.h.~1~	2011-07-22 12:01:08.000000000 +0200
+++ linux-3.1-rc2/arch/sparc/include/asm/spinlock_32.h	2011-08-15 11:43:49.000000000 +0200
@@ -131,6 +131,15 @@ static inline void arch_write_lock(arch_
 	*(volatile __u32 *)&lp->lock = ~0U;
 }
 
+static void inline arch_write_unlock(arch_rwlock_t *lock)
+{
+	__asm__ __volatile__(
+"	st		%%g0, [%0]"
+	: /* no outputs */
+	: "r" (lock)
+	: "memory");
+}
+
 static inline int arch_write_trylock(arch_rwlock_t *rw)
 {
 	unsigned int val;
@@ -175,7 +184,7 @@ static inline int __arch_read_trylock(ar
 	res; \
 })
 
-#define arch_write_unlock(rw)	do { (rw)->lock = 0; } while(0)
+#define arch_write_unlock(rw)	arch_write_unlock(rw)
 
 #define arch_spin_lock_flags(lock, flags) arch_spin_lock(lock)
 #define arch_read_lock_flags(rw, flags) arch_read_lock(rw)
The sparc32 version of arch_write_unlock() is just a plain assignment.
Unfortunately this allows the compiler to schedule side-effects in a
protected region to occur after the HW-level unlock, which is broken.
E.g., the following trivial test case gets miscompiled:

#include <linux/spinlock.h>
rwlock_t lock;
int counter;
void foo(void) { write_lock(&lock); ++counter; write_unlock(&lock); }

Fixed by adding a compiler memory barrier to arch_write_unlock(). The
sparc64 version combines the barrier and assignment into a single asm(),
so that's what I did here as well.

Compiled-tested with a sparc32 SMP kernel.

Signed-off-by: Mikael Pettersson <mikpe@it.uu.se>
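The reordering described in the commit message can be illustrated outside the kernel. The following is a minimal user-space sketch, assuming GCC or Clang extended asm; it is not the patch itself. Where the patch folds the store and the barrier into one sparc asm statement, this portable variant places a separate empty asm with a "memory" clobber before a plain store. All names (demo_rwlock, demo_barrier, demo_write_lock, demo_write_unlock, demo_foo) are hypothetical, and the "lock" here neither spins nor uses atomics, so it only shows where the compiler barrier goes.

```c
#include <assert.h>

/* Hypothetical stand-in for arch_rwlock_t; not the kernel's definition. */
struct demo_rwlock {
	volatile unsigned int lock;
};

static struct demo_rwlock demo_lock;
static int demo_counter;

/*
 * Compiler-only barrier: emits no instruction, but the "memory" clobber
 * stops GCC/Clang from moving memory accesses across this point.
 */
#define demo_barrier() __asm__ __volatile__("" : : : "memory")

static inline void demo_write_lock(struct demo_rwlock *rw)
{
	rw->lock = ~0U;		/* illustration only: no spinning, no atomics */
	demo_barrier();
}

static inline void demo_write_unlock(struct demo_rwlock *rw)
{
	demo_barrier();		/* protected accesses are emitted before the store */
	rw->lock = 0;
}

/* Mirrors the foo() test case from the commit message. */
static int demo_foo(void)
{
	int seen;

	demo_write_lock(&demo_lock);
	assert(demo_lock.lock == ~0U);
	seen = ++demo_counter;
	demo_write_unlock(&demo_lock);
	return seen;
}
```

Without the barrier in demo_write_unlock(), the compiler would be free to emit the store to demo_counter after the store that clears the lock word, since demo_counter is not volatile. This is the same hazard the patch closes for arch_write_unlock().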