Patchwork Use atomics in libgomp mutex

login
register
mail settings
Submitter Alan Modra
Date Nov. 27, 2011, 10:57 p.m.
Message ID <20111127225720.GB7827@bubble.grove.modra.org>
Download mbox | patch
Permalink /patch/127905/
State New
Headers show

Comments

Alan Modra - Nov. 27, 2011, 10:57 p.m.
This is the mutex part.  Depends on
http://gcc.gnu.org/ml/gcc-patches/2011-11/msg02454.html for MEMMODEL_*
values.

	* config/linux/mutex.h: Use atomic rather than sync builtins.
	* config/linux/mutex.c: Likewise.  Comment.  Use -1 for waiting state.
	* config/linux/omp-lock.h: Comment fix.
	* config/linux/arm/mutex.h: Delete.
	* config/linux/powerpc/mutex.h: Delete.
	* config/linux/ia64/mutex.h: Delete.
	* config/linux/mips/mutex.h: Delete.
Alan Modra - Nov. 28, 2011, 12:07 a.m.
On Mon, Nov 28, 2011 at 09:27:20AM +1030, Alan Modra wrote:
> This is the mutex part.  Depends on
> http://gcc.gnu.org/ml/gcc-patches/2011-11/msg02454.html for MEMMODEL_*
> values.

Arrgh, I posted the wrong patch.  I know it needs work elsewhere in
libgomp to comply with the OMP spec, which says of flush:

14 A flush region without a list is implied at the following locations:
15 • During a barrier region.
16 • At entry to and exit from parallel, critical, and ordered regions.
[snip]

Yet COMP_critical_start simply calls gomp_mutex_lock, relying on the
lock acquisition to do the flush.  This won't necessarily happen with
MEMMODEL_ACQUIRE.

This bit in gomp_mutex_lock
> +  __atomic_compare_exchange_4 (mutex, &oldval, 1, false,
> +			       MEMMODEL_ACQUIRE, MEMMODEL_RELAXED);
should be
  __atomic_compare_exchange_4 (mutex, &oldval, 1, false,
			       MEMMODEL_ACQ_REL, MEMMODEL_ACQ_REL);

and once you do that it's hard to justify the patch for stage3 as
fixing a bug (unnecessary sync instructions emitted by __sync
builtins).
Jakub Jelinek - Nov. 28, 2011, 8:15 a.m.
On Mon, Nov 28, 2011 at 10:37:09AM +1030, Alan Modra wrote:
> On Mon, Nov 28, 2011 at 09:27:20AM +1030, Alan Modra wrote:
> > This is the mutex part.  Depends on
> > http://gcc.gnu.org/ml/gcc-patches/2011-11/msg02454.html for MEMMODEL_*
> > values.
> 
> Arrgh, I posted the wrong patch.  I know it needs work elsewhere in
> libgomp to comply with the OMP spec, which says of flush:
> 
> 14 A flush region without a list is implied at the following locations:
> 15 • During a barrier region.
> 16 • At entry to and exit from parallel, critical, and ordered regions.
> [snip]
> 
> Yet COMP_critical_start simply calls gomp_mutex_lock, relying on the
> lock acquisition to do the flush.  This won't necessarily happen with
> MEMMODEL_ACQUIRE.
> 
> This bit in gomp_mutex_lock
> > +  __atomic_compare_exchange_4 (mutex, &oldval, 1, false,
> > +			       MEMMODEL_ACQUIRE, MEMMODEL_RELAXED);
> should be
>   __atomic_compare_exchange_4 (mutex, &oldval, 1, false,
> 			       MEMMODEL_ACQ_REL, MEMMODEL_ACQ_REL);
> 
> and once you do that it's hard to justify the patch for stage3 as
> fixing a bug (unnecessary sync instructions emitted by __sync
> builtins).

If it is just GOMP_critical_start, couldn't it use an extra barrier
before resp. after it calls mutex_lock?

	Jakub
Alan Modra - Nov. 28, 2011, 9:35 a.m.
On Mon, Nov 28, 2011 at 09:15:02AM +0100, Jakub Jelinek wrote:
> On Mon, Nov 28, 2011 at 10:37:09AM +1030, Alan Modra wrote:
> > 14 A flush region without a list is implied at the following locations:
> > 15 • During a barrier region.
> > 16 • At entry to and exit from parallel, critical, and ordered regions.
> > [snip]

> If it is just GOMP_critical_start, couldn't it use an extra barrier
> before resp. after it calls mutex_lock?

I believe gomp_ordered_sync needs a barrier, and I'm unsure about
GOMP_parallel_start but I think one is needed there too.  Yes, putting
the barriers where they are actually needed is the best solution, but
I'm far from an OpenMP expert so am unsure whether these are the only
places requiring a barrier.
Richard Henderson - Nov. 29, 2011, 1:33 a.m.
On 11/27/2011 02:57 PM, Alan Modra wrote:
> This is the mutex part.  Depends on
> http://gcc.gnu.org/ml/gcc-patches/2011-11/msg02454.html for MEMMODEL_*
> values.
> 
> 	* config/linux/mutex.h: Use atomic rather than sync builtins.
> 	* config/linux/mutex.c: Likewise.  Comment.  Use -1 for waiting state.
> 	* config/linux/omp-lock.h: Comment fix.
> 	* config/linux/arm/mutex.h: Delete.
> 	* config/linux/powerpc/mutex.h: Delete.
> 	* config/linux/ia64/mutex.h: Delete.
> 	* config/linux/mips/mutex.h: Delete.

Looks good modulo _4/_n and using the success return of __atomic_compare_exchange.


r~

Patch

Index: libgomp/config/linux/mutex.h
===================================================================
--- libgomp/config/linux/mutex.h	(revision 181718)
+++ libgomp/config/linux/mutex.h	(working copy)
@@ -33,39 +33,35 @@  typedef int gomp_mutex_t;
 
 #define GOMP_MUTEX_INIT_0 1
 
-static inline void gomp_mutex_init (gomp_mutex_t *mutex)
+extern void gomp_mutex_lock_slow (gomp_mutex_t *mutex, int);
+extern void gomp_mutex_unlock_slow (gomp_mutex_t *mutex);
+
+static inline void
+gomp_mutex_init (gomp_mutex_t *mutex)
 {
   *mutex = 0;
 }
 
-extern void gomp_mutex_lock_slow (gomp_mutex_t *mutex, int);
-static inline void gomp_mutex_lock (gomp_mutex_t *mutex)
+static inline void
+gomp_mutex_destroy (gomp_mutex_t *mutex)
 {
-  int oldval = __sync_val_compare_and_swap (mutex, 0, 1);
-  if (__builtin_expect (oldval, 0))
-    gomp_mutex_lock_slow (mutex, oldval);
 }
 
-extern void gomp_mutex_unlock_slow (gomp_mutex_t *mutex);
-static inline void gomp_mutex_unlock (gomp_mutex_t *mutex)
+static inline void
+gomp_mutex_lock (gomp_mutex_t *mutex)
 {
-  /* Warning: By definition __sync_lock_test_and_set() does not have
-     proper memory barrier semantics for a mutex unlock operation.
-     However, this default implementation is written assuming that it
-     does, which is true for some targets.
-
-     Targets that require additional memory barriers before
-     __sync_lock_test_and_set to achieve the release semantics of
-     mutex unlock, are encouraged to include
-     "config/linux/ia64/mutex.h" in a target specific mutex.h instead
-     of using this file.  */
-  int val = __sync_lock_test_and_set (mutex, 0);
-  if (__builtin_expect (val > 1, 0))
-    gomp_mutex_unlock_slow (mutex);
+  int oldval = 0;
+  __atomic_compare_exchange_4 (mutex, &oldval, 1, false,
+			       MEMMODEL_ACQUIRE, MEMMODEL_RELAXED);
+  if (__builtin_expect (oldval != 0, 0))
+    gomp_mutex_lock_slow (mutex, oldval);
 }
 
-static inline void gomp_mutex_destroy (gomp_mutex_t *mutex)
+static inline void
+gomp_mutex_unlock (gomp_mutex_t *mutex)
 {
+  int wait = __atomic_exchange_4 (mutex, 0, MEMMODEL_RELEASE);
+  if (__builtin_expect (wait < 0, 0))
+    gomp_mutex_unlock_slow (mutex);
 }
-
 #endif /* GOMP_MUTEX_H */
Index: libgomp/config/linux/mutex.c
===================================================================
--- libgomp/config/linux/mutex.c	(revision 181718)
+++ libgomp/config/linux/mutex.c	(working copy)
@@ -34,25 +34,33 @@  long int gomp_futex_wait = FUTEX_WAIT | 
 void
 gomp_mutex_lock_slow (gomp_mutex_t *mutex, int oldval)
 {
+  /* First loop spins a while.  */
   while (oldval == 1)
     {
       if (do_spin (mutex, 1))
 	{
-	  oldval = __sync_lock_test_and_set (mutex, 2);
+	  /* Spin timeout, nothing changed.  Set waiting flag.  */
+	  oldval = __atomic_exchange_4 (mutex, -1, MEMMODEL_ACQUIRE);
 	  if (oldval == 0)
 	    return;
-	  futex_wait (mutex, 2);
+	  futex_wait (mutex, -1);
 	  break;
 	}
       else
 	{
-	  oldval = __sync_val_compare_and_swap (mutex, 0, 1);
+	  /* Something changed.  If now unlocked, we're good to go.  */
+	  oldval = 0;
+	  __atomic_compare_exchange_4 (mutex, &oldval, 1, false,
+				       MEMMODEL_ACQUIRE, MEMMODEL_RELAXED);
 	  if (oldval == 0)
 	    return;
 	}
     }
-  while ((oldval = __sync_lock_test_and_set (mutex, 2)))
-    do_wait (mutex, 2);
+
+  /* Second loop waits until mutex is unlocked.  We always exit this
+     loop with wait flag set, so next unlock will awaken a thread.  */
+  while ((oldval = __atomic_exchange_4 (mutex, -1, MEMMODEL_ACQUIRE)))
+    do_wait (mutex, -1);
 }
 
 void
Index: libgomp/config/linux/omp-lock.h
===================================================================
--- libgomp/config/linux/omp-lock.h	(revision 181718)
+++ libgomp/config/linux/omp-lock.h	(working copy)
@@ -3,8 +3,8 @@ 
    structures without polluting the namespace.
 
    When using the Linux futex primitive, non-recursive locks require
-   only one int.  Recursive locks require we identify the owning task
-   and so require one int and a pointer.  */
+   one int.  Recursive locks require we identify the owning task
+   and so require in addition one int and a pointer.  */
 
 typedef int omp_lock_t;
 typedef struct { int lock, count; void *owner; } omp_nest_lock_t;
Index: libgomp/config/linux/arm/mutex.h
===================================================================
--- libgomp/config/linux/arm/mutex.h	(revision 181718)
+++ libgomp/config/linux/arm/mutex.h	(working copy)
@@ -1,28 +0,0 @@ 
-/* Copyright (C) 2010 Free Software Foundation, Inc.
-   Contributed by ARM Ltd.
-
-   This file is part of the GNU OpenMP Library (libgomp).
-
-   Libgomp is free software; you can redistribute it and/or modify it
-   under the terms of the GNU General Public License as published by
-   the Free Software Foundation; either version 3, or (at your option)
-   any later version.
-
-   Libgomp is distributed in the hope that it will be useful, but WITHOUT ANY
-   WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
-   FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
-   more details.
-
-   Under Section 7 of GPL version 3, you are granted additional
-   permissions described in the GCC Runtime Library Exception, version
-   3.1, as published by the Free Software Foundation.
-
-   You should have received a copy of the GNU General Public License and
-   a copy of the GCC Runtime Library Exception along with this program;
-   see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
-   <http://www.gnu.org/licenses/>.  */
-
-/* ARM needs the same correct usage of __sync_synchronize and
-   __sync_lock_test_and_set as ia64.  So we just use its mutex.h.  */
-
-#include "config/linux/ia64/mutex.h"
Index: libgomp/config/linux/powerpc/mutex.h
===================================================================
--- libgomp/config/linux/powerpc/mutex.h	(revision 181718)
+++ libgomp/config/linux/powerpc/mutex.h	(working copy)
@@ -1,2 +0,0 @@ 
-/* On PowerPC __sync_lock_test_and_set isn't a full barrier.  */
-#include "config/linux/ia64/mutex.h"
Index: libgomp/config/linux/ia64/mutex.h
===================================================================
--- libgomp/config/linux/ia64/mutex.h	(revision 181718)
+++ libgomp/config/linux/ia64/mutex.h	(working copy)
@@ -1,66 +0,0 @@ 
-/* Copyright (C) 2005, 2008, 2009, 2011 Free Software Foundation, Inc.
-   Contributed by Richard Henderson <rth@redhat.com>.
-
-   This file is part of the GNU OpenMP Library (libgomp).
-
-   Libgomp is free software; you can redistribute it and/or modify it
-   under the terms of the GNU General Public License as published by
-   the Free Software Foundation; either version 3, or (at your option)
-   any later version.
-
-   Libgomp is distributed in the hope that it will be useful, but WITHOUT ANY
-   WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
-   FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
-   more details.
-
-   Under Section 7 of GPL version 3, you are granted additional
-   permissions described in the GCC Runtime Library Exception, version
-   3.1, as published by the Free Software Foundation.
-
-   You should have received a copy of the GNU General Public License and
-   a copy of the GCC Runtime Library Exception along with this program;
-   see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
-   <http://www.gnu.org/licenses/>.  */
-
-/* This is a Linux specific implementation of a mutex synchronization
-   mechanism for libgomp.  This type is private to the library.  This
-   implementation uses atomic instructions and the futex syscall.  */
-
-#ifndef GOMP_MUTEX_H
-#define GOMP_MUTEX_H 1
-
-typedef int gomp_mutex_t;
-
-#define GOMP_MUTEX_INIT_0 1
-
-static inline void gomp_mutex_init (gomp_mutex_t *mutex)
-{
-  *mutex = 0;
-}
-
-extern void gomp_mutex_lock_slow (gomp_mutex_t *mutex, int);
-static inline void gomp_mutex_lock (gomp_mutex_t *mutex)
-{
-  int oldval = __sync_val_compare_and_swap (mutex, 0, 1);
-  if (__builtin_expect (oldval, 0))
-    gomp_mutex_lock_slow (mutex, oldval);
-}
-
-extern void gomp_mutex_unlock_slow (gomp_mutex_t *mutex);
-
-/* IA64 needs a __sync_synchronize call before __sync_lock_test_and_set
-   because __sync_lock_test_and_set is not a full memory fence.  */
-static inline void gomp_mutex_unlock (gomp_mutex_t *mutex)
-{
-  int val;
-  __sync_synchronize ();
-  val = __sync_lock_test_and_set (mutex, 0);
-  if (__builtin_expect (val > 1, 0))
-    gomp_mutex_unlock_slow (mutex);
-}
-
-static inline void gomp_mutex_destroy (gomp_mutex_t *mutex)
-{
-}
-
-#endif /* GOMP_MUTEX_H */
Index: libgomp/config/linux/mips/mutex.h
===================================================================
--- libgomp/config/linux/mips/mutex.h	(revision 181718)
+++ libgomp/config/linux/mips/mutex.h	(working copy)
@@ -1,27 +0,0 @@ 
-/* Copyright (C) 2009 Free Software Foundation, Inc.
-
-   This file is part of the GNU OpenMP Library (libgomp).
-
-   Libgomp is free software; you can redistribute it and/or modify it
-   under the terms of the GNU General Public License as published by
-   the Free Software Foundation; either version 3, or (at your option)
-   any later version.
-
-   Libgomp is distributed in the hope that it will be useful, but WITHOUT ANY
-   WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
-   FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
-   more details.
-
-   Under Section 7 of GPL version 3, you are granted additional
-   permissions described in the GCC Runtime Library Exception, version
-   3.1, as published by the Free Software Foundation.
-
-   You should have received a copy of the GNU General Public License and
-   a copy of the GCC Runtime Library Exception along with this program;
-   see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
-   <http://www.gnu.org/licenses/>.  */
-
-/* MIPS needs the same correct usage of __sync_synchronize and
-   __sync_lock_test_and_set as ia64.  So we just use its mutex.h.  */
-
-#include "config/linux/ia64/mutex.h"