diff mbox

RFC: powerpc libgomp locking

Message ID 20111117115647.GB14325@bubble.grove.modra.org
State New
Headers show

Commit Message

Alan Modra Nov. 17, 2011, 11:56 a.m. UTC
This is part of work in progress getting locking back into shape on
powerpc.  (If we ever were in shape, which is doubtful.)  Using the
ia64 version means we have a needless sync at the start of
gomp_mutex_lock (courtesy of __sync_val_compare_and_swap) and a
needless isync at the end of gomp_mutex_unlock (slightly odd use
of __sync_lock_test_and_set to release a lock).  It would be nice to
get rid of the __sync_val_compare_and_swap in config/linux/mutex.c
too.

	* config/linux/powerpc/mutex.h: Rewrite.

Comments

Richard Henderson Nov. 18, 2011, 12:56 a.m. UTC | #1
On 11/17/2011 01:56 AM, Alan Modra wrote:
> This is part of work in progress getting locking back into shape on
> powerpc.  (If we ever were in shape, which is doubtful.)  Using the
> ia64 version means we have a needless sync at the start of
> gomp_mutex_lock (courtesy of __sync_val_compare_and_swap) and a
> needless isync at the end of gomp_mutex_unlock (slightly odd use
> of __sync_lock_test_and_set to release a lock).  It would be nice to
> get rid of the __sync_val_compare_and_swap in config/linux/mutex.c
> too.

We should be able to use this new file as a replacement for all
targets, not just PPC.

And, yes, updating all of the __sync uses in libgomp would be a
good thing.

> +  __atomic_compare_exchange_4 (mutex, &oldval, 1, false,
> +			       MEMMODEL_ACQUIRE, MEMMODEL_RELAXED);
> +  if (__builtin_expect (oldval != 0, 0))
> +    gomp_mutex_lock_slow (mutex, oldval);

__atomic_compare_exchange_4 returns a boolean success:

  if (__builtin_expect (!__atomic_compare_exchange_4
		        (mutex, &oldval, 1, false,
			 MEMMODEL_ACQUIRE, MEMMODEL_RELAXED), 0))

In theory that should avoid an extra comparison on some targets.


r~
Alan Modra Nov. 18, 2011, 1:47 a.m. UTC | #2
On Thu, Nov 17, 2011 at 02:56:31PM -1000, Richard Henderson wrote:
> On 11/17/2011 01:56 AM, Alan Modra wrote:
> > This is part of work in progress getting locking back into shape on
> > powerpc.  (If we ever were in shape, which is doubtful.)  Using the
> > ia64 version means we have a needless sync at the start of
> > gomp_mutex_lock (courtesy of __sync_val_compare_and_swap) and a
> > needless isync at the end of gomp_mutex_unlock (slightly odd use
> > of __sync_lock_test_and_set to release a lock).  It would be nice to
> > get rid of the __sync_val_compare_and_swap in config/linux/mutex.c
> > too.
> 
> We should be able to use this new file as a replacement for all
> targets, not just PPC.
> 
> And, yes, updating all of the __sync uses in libgomp would be a
> good thing.
> 
> > +  __atomic_compare_exchange_4 (mutex, &oldval, 1, false,
> > +			       MEMMODEL_ACQUIRE, MEMMODEL_RELAXED);
> > +  if (__builtin_expect (oldval != 0, 0))
> > +    gomp_mutex_lock_slow (mutex, oldval);
> 
> __atomic_compare_exchange_4 returns a boolean success:
> 
>   if (__builtin_expect (!__atomic_compare_exchange_4
> 		        (mutex, &oldval, 1, false,
> 			 MEMMODEL_ACQUIRE, MEMMODEL_RELAXED), 0))
> 
> In theory that should avoid an extra comparison on some targets.

That was the way I wrote it at first.  This way gives three fewer
insns on powerpc, because insns setting up the boolean are elided as
is the comparison of the boolean against zero.  An oldval != 0
comparison insn is also not needed because __atomic_compare_exchange
already did that for us.

BTW, we're left with dead stores into oldval's stack slot.  Is it
legal for the first two parameters of __atomic_compare_exchange to
alias?  I'm guessing that the stores might disappear if we tell gcc
that they can't alias.
Richard Henderson Nov. 18, 2011, 5:40 p.m. UTC | #3
On 11/17/2011 03:47 PM, Alan Modra wrote:
> BTW, we're left with dead stores into oldval's stack slot.  Is it
> legal for the first two parameters of __atomic_compare_exchange to
> alias?  I'm guessing that the stores might disappear if we tell gcc
> that they can't alias.

No they cannot alias.

Macleod and I have discussed several ways to get rid of the dead
address slot, and havn't had time to do any of them yet.  I've
been trying to decide if this counts as a regression from gcc 4.6...


r~
diff mbox

Patch

Index: libgomp/config/linux/powerpc/mutex.h
===================================================================
--- libgomp/config/linux/powerpc/mutex.h	(revision 181425)
+++ libgomp/config/linux/powerpc/mutex.h	(working copy)
@@ -1,2 +1,75 @@ 
-/* On PowerPC __sync_lock_test_and_set isn't a full barrier.  */
-#include "config/linux/ia64/mutex.h"
+/* Copyright (C) 2011 Free Software Foundation, Inc.
+   Contributed by Alan Modra <amodra@au1.ibm.com>
+
+   This file is part of the GNU OpenMP Library (libgomp).
+
+   Libgomp is free software; you can redistribute it and/or modify it
+   under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3, or (at your option)
+   any later version.
+
+   Libgomp is distributed in the hope that it will be useful, but WITHOUT ANY
+   WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
+   FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+   more details.
+
+   Under Section 7 of GPL version 3, you are granted additional
+   permissions described in the GCC Runtime Library Exception, version
+   3.1, as published by the Free Software Foundation.
+
+   You should have received a copy of the GNU General Public License and
+   a copy of the GCC Runtime Library Exception along with this program;
+   see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+   <http://www.gnu.org/licenses/>.  */
+
+/* This is a Linux specific implementation of a mutex synchronization
+   mechanism for libgomp.  This type is private to the library.  This
+   implementation uses atomic instructions and the futex syscall.  */
+
+#ifndef GOMP_MUTEX_H
+#define GOMP_MUTEX_H 1
+
+typedef int gomp_mutex_t;
+extern void gomp_mutex_lock_slow (gomp_mutex_t *mutex, int);
+extern void gomp_mutex_unlock_slow (gomp_mutex_t *mutex);
+
+#define GOMP_MUTEX_INIT_0 1
+
+enum memmodel
+{
+  MEMMODEL_RELAXED = 0,
+  MEMMODEL_CONSUME = 1,
+  MEMMODEL_ACQUIRE = 2,
+  MEMMODEL_RELEASE = 3,
+  MEMMODEL_ACQ_REL = 4,
+  MEMMODEL_SEQ_CST = 5,
+  MEMMODEL_LAST = 6
+};
+
+static inline void gomp_mutex_init (gomp_mutex_t *mutex)
+{
+  *mutex = 0;
+}
+
+static inline void gomp_mutex_destroy (gomp_mutex_t *mutex)
+{
+}
+
+static inline void gomp_mutex_lock (gomp_mutex_t *mutex)
+{
+  int oldval = 0;
+
+  __atomic_compare_exchange_4 (mutex, &oldval, 1, false,
+			       MEMMODEL_ACQUIRE, MEMMODEL_RELAXED);
+  if (__builtin_expect (oldval != 0, 0))
+    gomp_mutex_lock_slow (mutex, oldval);
+}
+
+static inline void gomp_mutex_unlock (gomp_mutex_t *mutex)
+{
+  int oldval = __atomic_exchange_4 (mutex, 0, MEMMODEL_RELEASE);
+
+  if (__builtin_expect (oldval > 1, 0))
+    gomp_mutex_unlock_slow (mutex);
+}
+#endif /* GOMP_MUTEX_H */