Message ID | 1416912710.1771.176.camel@triegel.csb |
---|---|
State | New |
Ping.

On Tue, 2014-11-25 at 11:51 +0100, Torvald Riegel wrote:
> On Wed, 2014-10-29 at 22:54 +0000, Joseph S. Myers wrote:
> > On Wed, 29 Oct 2014, Torvald Riegel wrote:
> >
> > > So, mfence seems to have been introduced with SSE2. Should I try to
> > > test for SSE2 specifically, or rather assume SSE2 support for i786?
> >
> > I think the i786 directories should be removed; config.guess will never
> > return such a processor name for GNU/Linux at least (I don't know what it
> > returns on Hurd). The comment in sysdeps/i386/i786/Implies suggests it
> > was for PII, but PII was still family 6 (and family 15 came after family
> > 6, I don't think there were any x86 processors with family numbers 7 to
> > 14).
> >
> > So, anything conditional on SSE2 should test for __SSE2__.
>
> How does this updated patch look? The non-SSE full barrier is what,
> AFAIU, GCC emits.
Ping.

On Mon, 2014-12-08 at 17:38 +0100, Torvald Riegel wrote:
> Ping.
>
> On Tue, 2014-11-25 at 11:51 +0100, Torvald Riegel wrote:
> > On Wed, 2014-10-29 at 22:54 +0000, Joseph S. Myers wrote:
> > > On Wed, 29 Oct 2014, Torvald Riegel wrote:
> > >
> > > > So, mfence seems to have been introduced with SSE2. Should I try to
> > > > test for SSE2 specifically, or rather assume SSE2 support for i786?
> > >
> > > I think the i786 directories should be removed; config.guess will never
> > > return such a processor name for GNU/Linux at least (I don't know what it
> > > returns on Hurd). The comment in sysdeps/i386/i786/Implies suggests it
> > > was for PII, but PII was still family 6 (and family 15 came after family
> > > 6, I don't think there were any x86 processors with family numbers 7 to
> > > 14).
> > >
> > > So, anything conditional on SSE2 should test for __SSE2__.
> >
> > How does this updated patch look? The non-SSE full barrier is what,
> > AFAIU, GCC emits.
On 25 Nov 2014 11:51, Torvald Riegel wrote:
> --- a/sysdeps/i386/i486/bits/atomic.h
> +++ b/sysdeps/i386/i486/bits/atomic.h
> @@ -535,3 +535,12 @@ typedef uintmax_t uatomic_max_t;
>  #define atomic_or(mem, mask) __arch_or_body (LOCK_PREFIX, mem, mask)
>
>  #define catomic_or(mem, mask) __arch_or_body (__arch_cprefix, mem, mask)
> +
> +#ifdef __SSE2__
> +# define atomic_full_barrier() __asm ("mfence" ::: "memory")
> +#else
> +# define atomic_full_barrier() \
> +    __asm __volatile (LOCK_PREFIX "orl $0, (%%esp)" ::: "memory")
> +#endif

so this will kick in only when glibc itself is compiled with -msse2/etc...
support. then again, these barriers only get used by glibc internal code,
so i guess this is the best answer. plus it only impacts x86, and it's not
like anyone really cares about that anymore ;).

lgtm
-mike
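To make the point in the reply above concrete: the choice between the two barrier variants is made by the preprocessor when the translation unit is compiled, based on whether the compiler predefines __SSE2__ (for example under -msse2, or by default on x86_64), not by the CPU at run time. The following is only a minimal sketch of that selection, not the glibc code. All sketch_* names and SKETCH_LOCK_PREFIX are hypothetical stand-ins invented here, and the last branch (a GCC __atomic fence) is an assumption added so the sketch also builds on non-x86 targets.

```c
#include <assert.h>

/* Hypothetical stand-in for glibc's LOCK_PREFIX macro.  */
#define SKETCH_LOCK_PREFIX "lock;"

#if (defined (__x86_64__) || defined (__i386__)) && defined (__SSE2__)
/* SSE2 enabled at compile time: mfence is available.  */
# define sketch_full_barrier() __asm__ __volatile__ ("mfence" ::: "memory")
# define SKETCH_BARRIER_KIND "mfence"
#elif defined (__i386__)
/* No SSE2: a locked read-modify-write on the stack acts as a full barrier.  */
# define sketch_full_barrier() \
    __asm__ __volatile__ (SKETCH_LOCK_PREFIX "orl $0, (%%esp)" ::: "memory")
# define SKETCH_BARRIER_KIND "lock orl"
#else
/* Assumption for portability only; not part of the patch.  */
# define sketch_full_barrier() __atomic_thread_fence (__ATOMIC_SEQ_CST)
# define SKETCH_BARRIER_KIND "compiler fence builtin"
#endif

/* Report which variant the preprocessor selected for this build.  */
static const char *
sketch_barrier_kind (void)
{
  return SKETCH_BARRIER_KIND;
}

/* Store to *flag, then load *other, with a full barrier in between so
   the load cannot be satisfied before the store is globally visible.  */
static int
sketch_publish_then_read (int *flag, int *other)
{
  assert (flag && other);
  __atomic_store_n (flag, 1, __ATOMIC_RELAXED);
  sketch_full_barrier ();
  return __atomic_load_n (other, __ATOMIC_RELAXED);
}
```

Building the same file for 32-bit x86 with and without -msse2 and comparing what sketch_barrier_kind () returns is one way to observe the compile-time nature of the selection.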
commit 055ecbc51899f9f2c560545b183d8cf01df3de94
Author: Torvald Riegel <triegel@redhat.com>
Date:   Wed Oct 29 10:34:36 2014 +0100

    Fix atomic_full_barrier on x86 and x86_64.

    	[BZ #17403]
    	* sysdeps/x86_64/bits/atomic.h: (atomic_full_barrier,
    	atomic_read_barrier, atomic_write_barrier): Define.
    	* sysdeps/i386/i486/bits/atomic.h (atomic_full_barrier,
    	atomic_read_barrier, atomic_write_barrier): Define.

diff --git a/sysdeps/i386/i486/bits/atomic.h b/sysdeps/i386/i486/bits/atomic.h
index 739d384..c77fe2e 100644
--- a/sysdeps/i386/i486/bits/atomic.h
+++ b/sysdeps/i386/i486/bits/atomic.h
@@ -535,3 +535,12 @@ typedef uintmax_t uatomic_max_t;
 #define atomic_or(mem, mask) __arch_or_body (LOCK_PREFIX, mem, mask)
 
 #define catomic_or(mem, mask) __arch_or_body (__arch_cprefix, mem, mask)
+
+#ifdef __SSE2__
+# define atomic_full_barrier() __asm ("mfence" ::: "memory")
+#else
+# define atomic_full_barrier() \
+    __asm __volatile (LOCK_PREFIX "orl $0, (%%esp)" ::: "memory")
+#endif
+#define atomic_read_barrier() __asm ("" ::: "memory")
+#define atomic_write_barrier() __asm ("" ::: "memory")
diff --git a/sysdeps/x86_64/bits/atomic.h b/sysdeps/x86_64/bits/atomic.h
index 99dfb50..7e67427 100644
--- a/sysdeps/x86_64/bits/atomic.h
+++ b/sysdeps/x86_64/bits/atomic.h
@@ -472,3 +472,7 @@ typedef uintmax_t uatomic_max_t;
 #define atomic_or(mem, mask) __arch_or_body (LOCK_PREFIX, mem, mask)
 
 #define catomic_or(mem, mask) __arch_or_body (__arch_cprefix, mem, mask)
+
+#define atomic_full_barrier() __asm ("mfence" ::: "memory")
+#define atomic_read_barrier() __asm ("" ::: "memory")
+#define atomic_write_barrier() __asm ("" ::: "memory")
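For readers wondering why the read and write barriers in the patch expand to an empty asm with only a "memory" clobber: for ordinary write-back memory, x86 does not reorder loads with other loads or stores with other stores, so stopping the compiler from reordering is enough for those two cases, and only the store-then-load case needs a real fence instruction. The sketch below illustrates the usual publish/consume pairing under that assumption. It is not taken from glibc; the sketch_* names and struct sketch_mailbox are hypothetical, and the non-x86 branch using GCC __atomic fences is an added assumption so the example stays meaningful elsewhere.

```c
#include <assert.h>

#if defined (__x86_64__) || defined (__i386__)
/* Compiler-only barriers, mirroring the shape used in the patch.  */
# define sketch_read_barrier()  __asm__ __volatile__ ("" ::: "memory")
# define sketch_write_barrier() __asm__ __volatile__ ("" ::: "memory")
#else
/* Assumption: weaker architectures need real fences here.  */
# define sketch_read_barrier()  __atomic_thread_fence (__ATOMIC_ACQUIRE)
# define sketch_write_barrier() __atomic_thread_fence (__ATOMIC_RELEASE)
#endif

/* Hypothetical one-slot mailbox shared between a producer and a consumer.  */
struct sketch_mailbox
{
  int payload;
  int ready;
};

/* Producer: write the payload first, then set the flag.  The write
   barrier keeps the two stores in program order.  */
static void
sketch_publish (struct sketch_mailbox *box, int value)
{
  assert (box);
  box->payload = value;
  sketch_write_barrier ();
  __atomic_store_n (&box->ready, 1, __ATOMIC_RELAXED);
}

/* Consumer: check the flag first, then read the payload.  The read
   barrier keeps the two loads in program order.  Returns 1 and fills
   *out if a value was available, 0 otherwise.  */
static int
sketch_try_consume (struct sketch_mailbox *box, int *out)
{
  assert (box && out);
  if (!__atomic_load_n (&box->ready, __ATOMIC_RELAXED))
    return 0;
  sketch_read_barrier ();
  *out = box->payload;
  return 1;
}
```

A pattern where one thread stores to its own flag and then loads the other thread's flag would not be covered by these two barriers; that is the case atomic_full_barrier () exists for.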