Message ID | 20170406192342.GC3093@worktop (mailing list archive) |
---|---|
State | Not Applicable |
Headers | show |
On Thu, Apr 6, 2017 at 12:23 PM, Peter Zijlstra <peterz@infradead.org> wrote: > > Something like so then. According to the SDM mwait is a no-op if we do > not execute monitor first. So this variant should get the first > iteration without expensive instructions. No, the problem is that we *would* have executed a prior monitor that could still be pending - from a previous invocation of smp_cond_load_acquire(). Especially with spinlocks, these things can very much happen back-to-back. And it would be pending with a different address (the previous spinlock) that might not have changed since then (and might not be changing), so now we might actually be pausing in mwait waiting for that *other* thing to change. So it would probably need to do something complicated like #define smp_cond_load_acquire(ptr, cond_expr) \ ({ \ typeof(ptr) __PTR = (ptr); \ typeof(*ptr) VAL; \ do { \ VAL = READ_ONCE(*__PTR); \ if (cond_expr) \ break; \ for (;;) { \ ___monitor(__PTR, 0, 0); \ VAL = READ_ONCE(*__PTR); \ if (cond_expr) break; \ ___mwait(0xf0 /* C0 */, 0); \ } \ } while (0) \ smp_acquire__after_ctrl_dep(); \ VAL; \ }) which might just generate nasty enough code to not be worth it. I dunno Linus
diff --git a/arch/x86/include/asm/barrier.h b/arch/x86/include/asm/barrier.h index bfb28ca..2b5d5a2 100644 --- a/arch/x86/include/asm/barrier.h +++ b/arch/x86/include/asm/barrier.h @@ -80,6 +80,36 @@ do { \ #define __smp_mb__before_atomic() barrier() #define __smp_mb__after_atomic() barrier() +static inline void ___monitor(const void *eax, unsigned long ecx, + unsigned long edx) +{ + /* "monitor %eax, %ecx, %edx;" */ + asm volatile(".byte 0x0f, 0x01, 0xc8;" + :: "a" (eax), "c" (ecx), "d"(edx)); +} + +static inline void ___mwait(unsigned long eax, unsigned long ecx) +{ + /* "mwait %eax, %ecx;" */ + asm volatile(".byte 0x0f, 0x01, 0xc9;" + :: "a" (eax), "c" (ecx)); +} + +#define smp_cond_load_acquire(ptr, cond_expr) \ +({ \ + typeof(ptr) __PTR = (ptr); \ + typeof(*ptr) VAL; \ + for (;;) { \ + VAL = READ_ONCE(*__PTR); \ + if (cond_expr) \ + break; \ + ___mwait(0xf0 /* C0 */, 0); \ + ___monitor(__PTR, 0, 0); \ + } \ + smp_acquire__after_ctrl_dep(); \ + VAL; \ +}) + #include <asm-generic/barrier.h> #endif /* _ASM_X86_BARRIER_H */