Patchwork [1/2] Detect and use GCC atomic builtins for locking

login
register
mail settings
Submitter Loïc Minier
Date Feb. 19, 2010, 9:02 p.m.
Message ID <1266613360-23069-1-git-send-email-lool@dooz.org>
Download mbox | patch
Permalink /patch/45882/
State New
Headers show

Comments

Loïc Minier - Feb. 19, 2010, 9:02 p.m.
---
 configure   |   17 +++++++++++++++++
 qemu-lock.h |   13 +++++++++++++
 2 files changed, 30 insertions(+), 0 deletions(-)
malc - Feb. 19, 2010, 9:42 p.m.
On Fri, 19 Feb 2010, Lo?c Minier wrote:

Please look up "gcc 4.1 implements compiler builtins for atomic ops"
thread for reasons why this might not be the best idea.

[..snip..]
Loïc Minier - Feb. 20, 2010, 8:16 a.m.
NB: Addition of these builtins was prompted by qemu failing to build on
 armel in Ubuntu; this is because we default to Thumb 2 mode which
 doesn't have the assembly instructions in question.
 http://launchpadlibrarian.net/38837077/buildlog_ubuntu-lucid-armel.qemu-kvm_0.12.2-0ubuntu6_FAILEDTOBUILD.txt.gz
[...]
  CC    arm-softmmu/syborg_virtio.o
  CC    arm-softmmu/exec.o
/tmp/cc24C9yx.s: Assembler messages:
/tmp/cc24C9yx.s:5392: Error: selected processor does not support `swp r4,r4,[r3]'
/tmp/cc24C9yx.s:6599: Error: selected processor does not support `swp r6,r6,[r3]'
make[2]: *** [exec.o] Error 1
make[1]: *** [subdir-arm-softmmu] Error 2
[...]

On Sat, Feb 20, 2010, malc wrote:
> Please look up "gcc 4.1 implements compiler builtins for atomic ops"
> thread for reasons why this might not be the best idea.

 I found a very old thred (2005) on libc-alpha with this subject; is
 this the one you mean?

 People participating in the libc-alpha were not really constructive and
 were presenting some bogus arguments; let me try to go over the various
 arguments (long):
 - some people wanted atomic builtins to be inline for performance and
   others wanted them to be library calls to allow changing their
   behavior later (e.g. to support a new CPU); the implementation
   actually uses both: inline assembly when supported on the
   architecture, or calls into libgcc which will call into the kernel
   otherwise (or direct calls into the kernel when building statically
   obviously), such as when the ISA doesn't offer sufficient primitives.
   Because *any* instruction might be gotten wrong in hardware, I don't
   see a need to special case locking inline assembly.
 - userspace apps abusing atomic builtins for locking; this is actually
   the case of qemu, but I think using gcc primitives will actually make
   it easier to get it right and will allow coverage of more
   architectures; in my opinion, there's no need to maintain
   hand-written assembly for locks in 2010.
 - someone claimed that modern architectures can do these operations in
   assembly without calling into a library; that's what the atomic
   builtins do, and actually some modern architectures can't do all
   operations in assembly.
 - there were arguments over where such functions belong and the
   semantics of each call; these are in my eyes purely political and
   don't relate to qemu.

 Which are your concerns with atomic builtins and are these in the above
 list?
malc - Feb. 20, 2010, 8:28 a.m.
On Sat, 20 Feb 2010, Lo?c Minier wrote:

>  NB: Addition of these builtins was prompted by qemu failing to build on
>  armel in Ubuntu; this is because we default to Thumb 2 mode which
>  doesn't have the assembly instructions in question.
>  http://launchpadlibrarian.net/38837077/buildlog_ubuntu-lucid-armel.qemu-kvm_0.12.2-0ubuntu6_FAILEDTOBUILD.txt.gz
> [...]
>   CC    arm-softmmu/syborg_virtio.o
>   CC    arm-softmmu/exec.o
> /tmp/cc24C9yx.s: Assembler messages:
> /tmp/cc24C9yx.s:5392: Error: selected processor does not support `swp r4,r4,[r3]'
> /tmp/cc24C9yx.s:6599: Error: selected processor does not support `swp r6,r6,[r3]'
> make[2]: *** [exec.o] Error 1
> make[1]: *** [subdir-arm-softmmu] Error 2
> [...]
> 
> On Sat, Feb 20, 2010, malc wrote:
> > Please look up "gcc 4.1 implements compiler builtins for atomic ops"
> > thread for reasons why this might not be the best idea.
> 
>  I found a very old thred (2005) on libc-alpha with this subject; is
>  this the one you mean?

Yes.

> 
>  People participating in the libc-alpha were not really constructive and
>  were presenting some bogus arguments; let me try to go over the various
>  arguments (long):
>  - some people wanted atomic builtins to be inline for performance and
>    others wanted them to be library calls to allow changing their
>    behavior later (e.g. to support a new CPU); the implementation
>    actually uses both: inline assembly when supported on the
>    architecture, or calls into libgcc which will call into the kernel
>    otherwise (or direct calls into the kernel when building statically
>    obviously), such as when the ISA doesn't offer sufficient primitives.
>    Because *any* instruction might be gotten wrong in hardware, I don't
>    see a need to special case locking inline assembly.
>  - userspace apps abusing atomic builtins for locking; this is actually
>    the case of qemu, but I think using gcc primitives will actually make
>    it easier to get it right and will allow coverage of more
>    architectures; in my opinion, there's no need to maintain
>    hand-written assembly for locks in 2010.
>  - someone claimed that modern architectures can do these operations in
>    assembly without calling into a library; that's what the atomic
>    builtins do, and actually some modern architectures can't do all
>    operations in assembly.
>  - there were arguments over where such functions belong and the
>    semantics of each call; these are in my eyes purely political and
>    don't relate to qemu.
> 
>  Which are your concerns with atomic builtins and are these in the above
>  list?

For instance this:
http://sources.redhat.com/ml/libc-alpha/2005-06/msg00112.html

The builtins are too coarse grained and will do more stuff than strictly
necessary.
Loïc Minier - Feb. 20, 2010, 9:10 a.m.
On Sat, Feb 20, 2010, malc wrote:
> For instance this:
> http://sources.redhat.com/ml/libc-alpha/2005-06/msg00112.html
> 
> The builtins are too coarse grained and will do more stuff than strictly
> necessary.

 Is this the case of the builtins I'm proposing to use?  We could ask
 for new ones without the drawbacks if any.

 Do you have another option to implement locking on thumb-2?
malc - Feb. 20, 2010, 5:57 p.m.
On Sat, 20 Feb 2010, Lo?c Minier wrote:

> On Sat, Feb 20, 2010, malc wrote:
> > For instance this:
> > http://sources.redhat.com/ml/libc-alpha/2005-06/msg00112.html
> > 
> > The builtins are too coarse grained and will do more stuff than strictly
> > necessary.
> 
>  Is this the case of the builtins I'm proposing to use?  We could ask
>  for new ones without the drawbacks if any.
> 
>  Do you have another option to implement locking on thumb-2?

No, i'm against using locking GCC builtins for all the other targets (well
PPC), if they provide satisfactory results for thumb-2 please do submit an 
updated patch which uses them only for this host.
Loïc Minier - Feb. 22, 2010, 10:25 a.m.
On Sat, Feb 20, 2010, malc wrote:
> No, i'm against using locking GCC builtins for all the other targets (well
> PPC)

 Do you have benchmarks with/without GCC atomic builtins?
malc - Feb. 22, 2010, 10:47 a.m.
On Mon, 22 Feb 2010, Lo?c Minier wrote:

> On Sat, Feb 20, 2010, malc wrote:
> > No, i'm against using locking GCC builtins for all the other targets (well
> > PPC)
> 
>  Do you have benchmarks with/without GCC atomic builtins?

No.
Paul Brook - Feb. 28, 2010, 12:26 a.m.
>diff --git a/qemu-lock.h b/qemu-lock.h

This code isn't used in any interesting ways, it should be removed..

Paul

Patch

diff --git a/configure b/configure
index 0a84b0e..b33f045 100755
--- a/configure
+++ b/configure
@@ -1823,6 +1823,20 @@  if compile_prog "" ""; then
 fi
 
 ##########################################
+# check if we have gcc atomic built-ins
+gcc_atomic_builtins=no
+cat > $TMPC << EOF
+int main(void) {
+    int i;
+    __sync_lock_test_and_set(&i, 1);
+    __sync_lock_release(&i);
+}
+EOF
+if compile_prog "" ""; then
+    gcc_atomic_builtins=yes
+fi
+
+##########################################
 # check if we have fdatasync
 
 fdatasync=no
@@ -2168,6 +2182,9 @@  fi
 if test "$gcc_attribute_warn_unused_result" = "yes" ; then
   echo "CONFIG_GCC_ATTRIBUTE_WARN_UNUSED_RESULT=y" >> $config_host_mak
 fi
+if test "$gcc_atomic_builtins" = "yes" ; then
+  echo "CONFIG_GCC_ATOMIC_BUILTINS=y" >> $config_host_mak
+fi
 if test "$fdatasync" = "yes" ; then
   echo "CONFIG_FDATASYNC=y" >> $config_host_mak
 fi
diff --git a/qemu-lock.h b/qemu-lock.h
index 9a3e6ac..5c8eb34 100644
--- a/qemu-lock.h
+++ b/qemu-lock.h
@@ -33,6 +33,14 @@ 
 
 #else
 
+#ifdef CONFIG_GCC_ATOMIC_BUILTINS
+typedef int spinlock_t;
+
+#define SPIN_LOCK_UNLOCKED 0
+
+#define resetlock(p) __sync_lock_release((p))
+#else /* CONFIG_GCC_ATOMIC_BUILTINS */
+
 #if defined(__hppa__)
 
 typedef int spinlock_t[4];
@@ -56,7 +64,11 @@  static inline void resetlock (spinlock_t *p)
 }
 
 #endif
+#endif /* !CONFIG_GCC_ATOMIC_BUILTINS */
 
+#ifdef CONFIG_GCC_ATOMIC_BUILTINS
+#define testandset(p) __sync_lock_test_and_set((p), 1)
+#else /* CONFIG_GCC_ATOMIC_BUILTINS */
 #if defined(_ARCH_PPC)
 static inline int testandset (int *p)
 {
@@ -213,6 +225,7 @@  static inline int testandset (int *p)
 #else
 #error unimplemented CPU support
 #endif
+#endif /* !CONFIG_GCC_ATOMIC_BUILTINS */
 
 #if defined(CONFIG_USER_ONLY)
 static inline void spin_lock(spinlock_t *lock)