
__sync_swap* with acq/rel/full memory barrier semantics

Message ID 4DDAE516.4010307@redhat.com
State New

Commit Message

Aldy Hernandez May 23, 2011, 10:52 p.m. UTC
This is a patch implementing builtins for an atomic exchange with full,
acquire, and release memory barrier semantics.  It is similar to
__sync_lock_test_and_set(), but the target does not have the option of
providing reduced functionality that only stores the value 1.  Also,
unlike __sync_lock_test_and_set(), all three memory barrier variants
are available.

The compiler will fall back to a full barrier if the user requests an
acquire or release variant that is not available on the target.  If no
exchange variant is available at all, we fall back to a compare-and-swap
loop with a full barrier at the end.
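
For illustration, here is a minimal usage sketch (the builtin names are
the ones added by this patch; the scenario itself is just an example):

  int shared;

  int
  publish (int new_val)
  {
    /* Full-barrier exchange: stores NEW_VAL into SHARED and returns the
       previous contents.  The _acq and _rel variants work the same way
       with acquire/release semantics, falling back as described above
       when the target lacks a matching pattern.  */
    return __sync_swap_full (&shared, new_val);
  }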

The real reason for this patch is to implement atomic stores in the C++
runtime library, whose current implementation can incorrectly let prior
stores move past an atomic store, thus invalidating the happens-before
guarantee of the sequentially consistent model.  I am attaching the
corresponding libstdc++ patch to show how I intend to use the builtin.
This is not an official submission for the C++ library bits, as I have
not yet fully tested the library; I will do so separately.

In a follow-up patch I will implement acq/rel/full variants for all the
__sync_* builtins, which we can then use for atomic loads and for some
of the OpenMP atomics Jakub has been working on.

Oh yeah, I would gladly accept patterns/patches for other architectures :).

Tested on x86-64 Linux.

OK for mainline?

	* c-family/c-common.c (resolve_overloaded_builtin): Add
	BUILT_IN_SWAP_*_N variants.
	* doc/extend.texi: Document the __sync_swap_* variants.
	* libgcc-std.ver: Add __sync_swap_*.
	* optabs.h: Add DOI_sync_swap*.
	Define sync_swap*_optab.
	* optabs.c (expand_sync_swap): New.
	* genopinit.c: Add sync_swap_{acq,rel,full}.
	* config/i386/sync.md ("sync_lock_test_and_set_full<mode>"): New.
	* config/i386/i386.md: Add UNSPECV_SWAP_FULL.
	* builtins.c (expand_builtin_swap): New.
	(expand_builtin): Add cases for BUILT_IN_SWAP_*.
	* sync-builtins.def (BUILT_IN_SWAP_*): New.
	* expr.h (enum membar_mode): New.
	(expand_sync_swap): Protoize.
	(expand_builtin_synchronize): Same.
	* include/bits/atomic_2.h (_ITp<>::store): Use __sync_swap_full.
	(_ITp<>::store volatile): Same.
	(_PTp<>::store): Same.
	(_PTp<>::store volatile): Same.

Index: include/bits/atomic_2.h
===================================================================
--- include/bits/atomic_2.h	(revision 173831)
+++ include/bits/atomic_2.h	(working copy)
@@ -249,14 +249,12 @@ namespace __atomic2
 	__glibcxx_assert(__m != memory_order_acq_rel);
 	__glibcxx_assert(__m != memory_order_consume);
 
-	if (__m == memory_order_relaxed)
-	  _M_i = __i;
+	if (__m == memory_order_seq_cst)
+	  (void)__sync_swap_full (&_M_i, __i);
 	else
 	  {
 	    // write_mem_barrier();
 	    _M_i = __i;
-	    if (__m == memory_order_seq_cst)
-	      __sync_synchronize();
 	  }
       }
 
@@ -267,14 +265,12 @@ namespace __atomic2
 	__glibcxx_assert(__m != memory_order_acq_rel);
 	__glibcxx_assert(__m != memory_order_consume);
 
-	if (__m == memory_order_relaxed)
-	  _M_i = __i;
+	if (__m == memory_order_seq_cst)
+	  (void)__sync_swap_full (&_M_i, __i);
 	else
 	  {
 	    // write_mem_barrier();
 	    _M_i = __i;
-	    if (__m == memory_order_seq_cst)
-	      __sync_synchronize();
 	  }
       }
 
@@ -540,14 +536,12 @@ namespace __atomic2
 	__glibcxx_assert(__m != memory_order_acq_rel);
 	__glibcxx_assert(__m != memory_order_consume);
 
-	if (__m == memory_order_relaxed)
-	  _M_p = __p;
+	if (__m == memory_order_seq_cst)
+	  __sync_swap_full (&_M_p, __p);
 	else
 	  {
 	    // write_mem_barrier();
 	    _M_p = __p;
-	    if (__m == memory_order_seq_cst)
-	      __sync_synchronize();
 	  }
       }
 
@@ -559,14 +553,12 @@ namespace __atomic2
 	__glibcxx_assert(__m != memory_order_acq_rel);
 	__glibcxx_assert(__m != memory_order_consume);
 
-	if (__m == memory_order_relaxed)
-	  _M_p = __p;
+	if (__m == memory_order_seq_cst)
+	  __sync_swap_full (&_M_p, __p);
 	else
 	  {
 	    // write_mem_barrier();
 	    _M_p = __p;
-	    if (__m == memory_order_seq_cst)
-	      __sync_synchronize();
 	  }
       }

Comments

Joseph Myers May 23, 2011, 11:05 p.m. UTC | #1
On Mon, 23 May 2011, Aldy Hernandez wrote:

> This is a patch implementing builtins for an atomic exchange with full,
> acquire, and release memory barrier semantics.  It is similar to
> __sync_lock_test_and_set(), but the target does not have the option of
> implementing a reduced functionality of only implementing a store of 1.  Also,
> unlike __sync_lock_test_and_set(), we have all three memory barrier variants.

What's the reason you've implemented three variants, rather than six (the 
C1X/C++0X atomics have six memory order values) or one built-in function 
taking a memory order parameter?  More generally, what is the underlying 
design here for how built-in functions should cover the whole of the new 
atomics functionality in C1X and C++0X?

Adding functions to libgcc-std.ver seems premature in the absence of any 
library implementations of them.
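
For reference, the six memory order values in the C++0X draft's
std::memory_order (C1X defines matching memory_order_* constants in
<stdatomic.h>) are:

  enum memory_order
  {
    memory_order_relaxed,  // no ordering constraints
    memory_order_consume,  // data-dependency ordering on the load
    memory_order_acquire,  // later reads/writes stay after the load
    memory_order_release,  // earlier reads/writes stay before the store
    memory_order_acq_rel,  // acquire and release, for read-modify-write
    memory_order_seq_cst   // single total order over all seq_cst operations
  };
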
Andrew MacLeod May 30, 2011, 8:07 p.m. UTC | #2
On 05/23/2011 07:05 PM, Joseph S. Myers wrote:
> On Mon, 23 May 2011, Aldy Hernandez wrote:
>
>> This is a patch implementing builtins for an atomic exchange with full,
>> acquire, and release memory barrier semantics.  It is similar to
>> __sync_lock_test_and_set(), but the target does not have the option of
>> implementing a reduced functionality of only implementing a store of 1.  Also,
>> unlike __sync_lock_test_and_set(), we have all three memory barrier variants.
> What's the reason you've implemented three variants, rather than six (the
> C1X/C++0X atomics have six memory order values) or one built-in function
> taking a memory order parameter?  More generally, what is the underlying
> design here for how built-in functions should cover the whole of the new
> atomics functionality in C1X and C++0X?

Aldy was just too excited about working on memory model I think :-)

I've been looking at this, and I propose we go this way :

http://gcc.gnu.org/wiki/Atomic/GCCMM/CodeGen

Please feel free to criticize, comment on,  or ask for clarification.  I 
usually miss something I meant to get across.


Andrew
Jakub Jelinek May 31, 2011, 10:38 a.m. UTC | #3
On Mon, May 30, 2011 at 04:07:09PM -0400, Andrew MacLeod wrote:
> On 05/23/2011 07:05 PM, Joseph S. Myers wrote:
> >On Mon, 23 May 2011, Aldy Hernandez wrote:
> >
> >>This is a patch implementing builtins for an atomic exchange with full,
> >>acquire, and release memory barrier semantics.  It is similar to
> >>__sync_lock_test_and_set(), but the target does not have the option of
> >>implementing a reduced functionality of only implementing a store of 1.  Also,
> >>unlike __sync_lock_test_and_set(), we have all three memory barrier variants.
> >What's the reason you've implemented three variants, rather than six (the
> >C1X/C++0X atomics have six memory order values) or one built-in function
> >taking a memory order parameter?  More generally, what is the underlying
> >design here for how built-in functions should cover the whole of the new
> >atomics functionality in C1X and C++0X?
> 
> Aldy was just too excited about working on memory model I think :-)
> 
> I've been looking at this, and I propose we go this way :
> 
> http://gcc.gnu.org/wiki/Atomic/GCCMM/CodeGen
> 
> Please feel free to criticize, comment on,  or ask for
> clarification.  I usually miss something I meant to get across.

I think the addition of new __sync_* builtins for the different models
is preferable and would generally be more usable, even for users other
than C++ atomics.  On some targets any atomic insn will act as a full
barrier, while on others it could generate different insns or code
sequences that way.  For OpenMP atomics, having a 'none' variant (in
addition to full/acq/rel) would be useful; I think #pragma omp atomic
doesn't impose any ordering on memory accesses other than the memory
being atomically read/written/changed.  I haven't read the C++0x
standard in enough detail to see why it has 6 memory order modes instead
of just 4, but if 6 are really needed (probably even for 4), having new
builtins with just one constant extra argument which says the memory
ordering mode would be best.

	Jakub
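
To make the shape of that suggestion concrete, a call under such a
scheme might look roughly like this (the builtin name and the order
constant here are purely hypothetical, not part of this patch):

  extern int shared;

  int
  swap_acquire (int new_val)
  {
    /* Hypothetical single-builtin form: the last constant argument
       selects the memory ordering mode.  */
    return __sync_swap_explicit (&shared, new_val, __SYNC_MEM_ACQ);
  }
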
Andrew MacLeod May 31, 2011, 1:11 p.m. UTC | #4
On 05/31/2011 06:38 AM, Jakub Jelinek wrote:
>
>> Aldy was just too excited about working on memory model I think :-)
>>
>> I've been looking at this, and I propose we go this way :
>>
>> http://gcc.gnu.org/wiki/Atomic/GCCMM/CodeGen
>>
>> Please feel free to criticize, comment on,  or ask for
>> clarification.  I usually miss something I meant to get across.
> I think the addition of new __sync_* builtins for the different models
> is preferable and would generally be more usable, even for users other
> than C++ atomics.  On some targets any atomic insn will act as a full
> barrier, while on others it could generate different insns or code
> sequences that way.  For OpenMP atomics, having a 'none' variant (in
> addition to full/acq/rel) would be useful; I think #pragma omp atomic
> doesn't impose any ordering on memory accesses other than the memory
> being atomically read/written/changed.  I haven't read the C++0x
> standard in enough detail to see why it has 6 memory order modes
> instead of just 4, but if 6 are really needed (probably even for 4),
> having new builtins with just one constant extra argument which says
> the memory ordering mode would be best.
>
>
I'm not sure if you are agreeing or not, or how much :-)

There are still only the basics of relaxed, consume, release/acquire,
and seq-cst, so there are 4 modes.  C++ gives you two more by separating
release and acquire for loads and stores: loads using 'acquire' mode,
stores using 'release'.  I guess it allows slightly finer control over
instructions that can be loads and/or stores.  It looks like the optimal
powerpc sequence for cmpxchg is slightly more efficient when it's just
an acquire or just a release rather than an acquire/release, for
instance (and all 3 sequences are slightly different).

The table is more or less complete... i.e., a store can't have an
'acquire' mode... and I presume that a consumer which doesn't break
release/acquire down into its component parts would use the 'release'
version of the store as 'release/acquire' mode.

I presume a single builtin with a parameter is the most efficient way
to build them, but that's just an implementation detail.  Presumably you
have each builtin in the table with each of those possible modes as a
valid parameter.  The one thing I would care about is that I would like
to see the relaxed version be 'just an insn' rather than a builtin, if
that's possible...  My understanding is that relaxed (as far as C++ is
concerned) has no synchronization at all, so you can treat it like a
normal operation as far as optimization goes.  That seems the same for
OpenMP; it's just that it's an atomic operation.  So it would be
preferable if we can avoid a builtin in the optimizers for that.  That's
why I left it out of the table.  If all the atomic operations are
already builtins, well, then I guess it doesn't matter :-P

It would be nice to say something like emit_atomic_fetch_add
(memory_order), and if it's relaxed, emit the atomic fetch_add insn (or
builtin, if that's what it is), and if it's something else, emit the
appropriate builtin.  That would make bits/libstdc++v2/atomic_2.h even
easier too.

I think maybe we are more or less saying the same thing? :-)

Andrew
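
As a concrete sketch of that dispatch (illustrative only, assuming the
memory_order values listed earlier), a store could pick between a plain
store and a barriered exchange based on the order argument:

  struct __atomic_int_sketch
  {
    int _M_i;

    void
    store (int __i, memory_order __m)
    {
      // acquire, consume and acq_rel are not valid orders for a store.
      if (__m == memory_order_relaxed)
        _M_i = __i;                       // plain store, no synchronization
      else if (__m == memory_order_release)
        (void) __sync_swap_rel (&_M_i, __i);
      else                                // memory_order_seq_cst
        (void) __sync_swap_full (&_M_i, __i);
    }
  };

The attached libstdc++ patch takes the simpler route: a plain store for
everything except seq_cst, which uses __sync_swap_full.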

Patch

Index: doc/extend.texi
===================================================================
--- doc/extend.texi	(revision 173831)
+++ doc/extend.texi	(working copy)
@@ -6719,6 +6719,22 @@  speculated to) before the builtin, but p
 be globally visible yet, and previous memory loads may not yet be
 satisfied.
 
+@item @var{type} __sync_swap_full (@var{type} *ptr, @var{type} value, ...)
+@itemx @var{type} __sync_swap_acq (@var{type} *ptr, @var{type} value, ...)
+@itemx @var{type} __sync_swap_rel (@var{type} *ptr, @var{type} value, ...)
+@findex __sync_swap_full
+@findex __sync_swap_acq
+@findex __sync_swap_rel
+These builtins implement an atomic exchange operation.  They write
+@var{value} into @code{*@var{ptr}}, and return the previous contents
+of @code{*@var{ptr}}.  The different variants provide a full barrier,
+an acquire barrier, or a release barrier, respectively, depending on
+the suffix.
+
+If the acquire or release variants of these operations are not
+available on the given target, the compiler will fall back to a full
+barrier.
+
 @item void __sync_lock_release (@var{type} *ptr, ...)
 @findex __sync_lock_release
 This builtin releases the lock acquired by @code{__sync_lock_test_and_set}.
Index: c-family/c-common.c
===================================================================
--- c-family/c-common.c	(revision 173831)
+++ c-family/c-common.c	(working copy)
@@ -9035,6 +9035,9 @@  resolve_overloaded_builtin (location_t l
     case BUILT_IN_VAL_COMPARE_AND_SWAP_N:
     case BUILT_IN_LOCK_TEST_AND_SET_N:
     case BUILT_IN_LOCK_RELEASE_N:
+    case BUILT_IN_SWAP_FULL_N:
+    case BUILT_IN_SWAP_ACQ_N:
+    case BUILT_IN_SWAP_REL_N:
       {
 	int n = sync_resolve_size (function, params);
 	tree new_function, first_param, result;
Index: optabs.c
===================================================================
--- optabs.c	(revision 173831)
+++ optabs.c	(working copy)
@@ -6988,6 +6988,70 @@  expand_sync_lock_test_and_set (rtx mem, 
 
   return NULL_RTX;
 }
+
+/* This function expands an atomic exchange operation: atomically store
+   VAL in MEM and return the previous value in MEM.
+
+   TARGET is an optional place to stick the return value.
+   MBMODE is the memory barrier type to use for the operation.  */
+
+rtx
+expand_sync_swap (rtx mem, rtx val, rtx target, enum membar_mode mbmode)
+{
+  enum machine_mode mode = GET_MODE (mem);
+  enum insn_code icode;
+  direct_optab op;
+
+  switch (mbmode)
+    {
+    case MEMBAR_MODE_ACQUIRE:
+      op = sync_swap_acq_optab;
+      break;
+    case MEMBAR_MODE_RELEASE:
+      op = sync_swap_rel_optab;
+      break;
+    case MEMBAR_MODE_FULL:
+      op = sync_swap_full_optab;
+      break;
+    default:
+      gcc_unreachable ();
+    }
+  /* Fall back to the full barrier if the variant is unavailable.  */
+  if (direct_optab_handler (op, mode) == CODE_FOR_nothing)
+    op = sync_swap_full_optab;
+
+  /* If the target supports the swap directly, great.  */
+  icode = direct_optab_handler (op, mode);
+  if (icode != CODE_FOR_nothing)
+    {
+      struct expand_operand ops[3];
+
+      create_output_operand (&ops[0], target, mode);
+      create_fixed_operand (&ops[1], mem);
+      /* VAL may have been promoted to a wider mode.  Shrink it if so.  */
+      create_convert_operand_to (&ops[2], val, mode, true);
+      if (maybe_expand_insn (icode, 3, ops))
+	return ops[0].value;
+    }
+
+  /* Otherwise, use a compare-and-swap loop for the exchange.  */
+  if (direct_optab_handler (sync_compare_and_swap_optab, mode)
+      != CODE_FOR_nothing)
+    {
+      if (!target || !register_operand (target, mode))
+	target = gen_reg_rtx (mode);
+      if (GET_MODE (val) != VOIDmode && GET_MODE (val) != mode)
+	val = convert_modes (mode, GET_MODE (val), val, 1);
+      if (expand_compare_and_swap_loop (mem, target, val, NULL_RTX))
+	{
+	  /* Issue a full barrier.  */
+	  expand_builtin_synchronize ();
+	  return target;
+	}
+    }
+
+  return NULL_RTX;
+}
 
 /* Return true if OPERAND is suitable for operand number OPNO of
    instruction ICODE.  */
Index: optabs.h
===================================================================
--- optabs.h	(revision 173831)
+++ optabs.h	(working copy)
@@ -669,9 +669,19 @@  enum direct_optab_index
   /* Atomic compare and swap.  */
   DOI_sync_compare_and_swap,
 
-  /* Atomic exchange with acquire semantics.  */
+  /* Atomic exchange with acquire semantics.  Exchange not fully
+     guaranteed.  Some targets may only support a store of 1.  */
   DOI_sync_lock_test_and_set,
 
+  /* Atomic exchange with acquire semantics.  */
+  DOI_sync_swap_acq,
+
+  /* Atomic exchange with release semantics.  */
+  DOI_sync_swap_rel,
+
+  /* Atomic exchange with full barrier semantics.  */
+  DOI_sync_swap_full,
+
   /* Atomic clear with release semantics.  */
   DOI_sync_lock_release,
 
@@ -720,6 +730,12 @@  typedef struct direct_optab_d *direct_op
   (&direct_optab_table[(int) DOI_sync_compare_and_swap])
 #define sync_lock_test_and_set_optab \
   (&direct_optab_table[(int) DOI_sync_lock_test_and_set])
+#define sync_swap_acq_optab \
+  (&direct_optab_table[(int) DOI_sync_swap_acq])
+#define sync_swap_rel_optab \
+  (&direct_optab_table[(int) DOI_sync_swap_rel])
+#define sync_swap_full_optab \
+  (&direct_optab_table[(int) DOI_sync_swap_full])
 #define sync_lock_release_optab \
   (&direct_optab_table[(int) DOI_sync_lock_release])
 
Index: genopinit.c
===================================================================
--- genopinit.c	(revision 173831)
+++ genopinit.c	(working copy)
@@ -239,6 +239,9 @@  static const char * const optabs[] =
   "set_direct_optab_handler (sync_new_nand_optab, $A, CODE_FOR_$(sync_new_nand$I$a$))",
   "set_direct_optab_handler (sync_compare_and_swap_optab, $A, CODE_FOR_$(sync_compare_and_swap$I$a$))",
   "set_direct_optab_handler (sync_lock_test_and_set_optab, $A, CODE_FOR_$(sync_lock_test_and_set$I$a$))",
+  "set_direct_optab_handler (sync_swap_acq_optab, $A, CODE_FOR_$(sync_swap_acq$I$a$))",
+  "set_direct_optab_handler (sync_swap_rel_optab, $A, CODE_FOR_$(sync_swap_rel$I$a$))",
+  "set_direct_optab_handler (sync_swap_full_optab, $A, CODE_FOR_$(sync_swap_full$I$a$))",
   "set_direct_optab_handler (sync_lock_release_optab, $A, CODE_FOR_$(sync_lock_release$I$a$))",
   "set_optab_handler (vec_set_optab, $A, CODE_FOR_$(vec_set$a$))",
   "set_optab_handler (vec_extract_optab, $A, CODE_FOR_$(vec_extract$a$))",
Index: builtins.c
===================================================================
--- builtins.c	(revision 173831)
+++ builtins.c	(working copy)
@@ -5682,9 +5682,35 @@  expand_builtin_lock_test_and_set (enum m
   return expand_sync_lock_test_and_set (mem, val, target);
 }
 
+/* Expand the __sync_swap_* intrinsics.
+
+   EXP is the CALL_EXPR.
+   TARGET is an optional place for us to store the results.
+   MBMODE is the memory barrier mode to use.  */
+
+static rtx
+expand_builtin_swap (enum machine_mode mode, tree exp, rtx target,
+		     enum membar_mode mbmode)
+{
+  rtx val, mem;
+  enum machine_mode old_mode;
+
+  /* Expand the operands.  */
+  mem = get_builtin_sync_mem (CALL_EXPR_ARG (exp, 0), mode);
+  val = expand_expr (CALL_EXPR_ARG (exp, 1), NULL_RTX, mode, EXPAND_NORMAL);
+  /* If VAL is promoted to a wider mode, convert it back to MODE.  Take care
+     of CONST_INTs, where we know the old_mode only from the call argument.  */
+  old_mode = GET_MODE (val);
+  if (old_mode == VOIDmode)
+    old_mode = TYPE_MODE (TREE_TYPE (CALL_EXPR_ARG (exp, 1)));
+  val = convert_modes (mode, old_mode, val, 1);
+
+  return expand_sync_swap (mem, val, target, mbmode);
+}
+
 /* Expand the __sync_synchronize intrinsic.  */
 
-static void
+void
 expand_builtin_synchronize (void)
 {
   gimple x;
@@ -6495,6 +6521,39 @@  expand_builtin (tree exp, rtx target, rt
 	return target;
       break;
 
+    case BUILT_IN_SWAP_ACQ_1:
+    case BUILT_IN_SWAP_ACQ_2:
+    case BUILT_IN_SWAP_ACQ_4:
+    case BUILT_IN_SWAP_ACQ_8:
+    case BUILT_IN_SWAP_ACQ_16:
+      mode = get_builtin_sync_mode (fcode - BUILT_IN_SWAP_ACQ_1);
+      target = expand_builtin_swap (mode, exp, target, MEMBAR_MODE_ACQUIRE);
+      if (target)
+	return target;
+      break;
+
+    case BUILT_IN_SWAP_REL_1:
+    case BUILT_IN_SWAP_REL_2:
+    case BUILT_IN_SWAP_REL_4:
+    case BUILT_IN_SWAP_REL_8:
+    case BUILT_IN_SWAP_REL_16:
+      mode = get_builtin_sync_mode (fcode - BUILT_IN_SWAP_REL_1);
+      target = expand_builtin_swap (mode, exp, target, MEMBAR_MODE_RELEASE);
+      if (target)
+	return target;
+      break;
+
+    case BUILT_IN_SWAP_FULL_1:
+    case BUILT_IN_SWAP_FULL_2:
+    case BUILT_IN_SWAP_FULL_4:
+    case BUILT_IN_SWAP_FULL_8:
+    case BUILT_IN_SWAP_FULL_16:
+      mode = get_builtin_sync_mode (fcode - BUILT_IN_SWAP_FULL_1);
+      target = expand_builtin_swap (mode, exp, target, MEMBAR_MODE_FULL);
+      if (target)
+	return target;
+      break;
+
     case BUILT_IN_LOCK_TEST_AND_SET_1:
     case BUILT_IN_LOCK_TEST_AND_SET_2:
     case BUILT_IN_LOCK_TEST_AND_SET_4:
Index: sync-builtins.def
===================================================================
--- sync-builtins.def	(revision 173831)
+++ sync-builtins.def	(working copy)
@@ -235,6 +235,63 @@  DEF_SYNC_BUILTIN (BUILT_IN_LOCK_TEST_AND
 DEF_SYNC_BUILTIN (BUILT_IN_LOCK_TEST_AND_SET_16, "__sync_lock_test_and_set_16",
 		  BT_FN_I16_VPTR_I16, ATTR_NOTHROW_LEAF_LIST)
 
+DEF_SYNC_BUILTIN (BUILT_IN_SWAP_ACQ_N,
+		  "__sync_swap_acq",
+		  BT_FN_VOID_VAR, ATTR_NOTHROW_LEAF_LIST)
+DEF_SYNC_BUILTIN (BUILT_IN_SWAP_ACQ_1,
+		  "__sync_swap_acq_1",
+		  BT_FN_I1_VPTR_I1, ATTR_NOTHROW_LEAF_LIST)
+DEF_SYNC_BUILTIN (BUILT_IN_SWAP_ACQ_2,
+		  "__sync_swap_acq_2",
+		  BT_FN_I2_VPTR_I2, ATTR_NOTHROW_LEAF_LIST)
+DEF_SYNC_BUILTIN (BUILT_IN_SWAP_ACQ_4,
+		  "__sync_swap_acq_4",
+		  BT_FN_I4_VPTR_I4, ATTR_NOTHROW_LEAF_LIST)
+DEF_SYNC_BUILTIN (BUILT_IN_SWAP_ACQ_8,
+		  "__sync_swap_acq_8",
+		  BT_FN_I8_VPTR_I8, ATTR_NOTHROW_LEAF_LIST)
+DEF_SYNC_BUILTIN (BUILT_IN_SWAP_ACQ_16,
+		  "__sync_swap_acq_16",
+		  BT_FN_I16_VPTR_I16, ATTR_NOTHROW_LEAF_LIST)
+
+DEF_SYNC_BUILTIN (BUILT_IN_SWAP_REL_N,
+		  "__sync_swap_rel",
+		  BT_FN_VOID_VAR, ATTR_NOTHROW_LEAF_LIST)
+DEF_SYNC_BUILTIN (BUILT_IN_SWAP_REL_1,
+		  "__sync_swap_rel_1",
+		  BT_FN_I1_VPTR_I1, ATTR_NOTHROW_LEAF_LIST)
+DEF_SYNC_BUILTIN (BUILT_IN_SWAP_REL_2,
+		  "__sync_swap_rel_2",
+		  BT_FN_I2_VPTR_I2, ATTR_NOTHROW_LEAF_LIST)
+DEF_SYNC_BUILTIN (BUILT_IN_SWAP_REL_4,
+		  "__sync_swap_rel_4",
+		  BT_FN_I4_VPTR_I4, ATTR_NOTHROW_LEAF_LIST)
+DEF_SYNC_BUILTIN (BUILT_IN_SWAP_REL_8,
+		  "__sync_swap_rel_8",
+		  BT_FN_I8_VPTR_I8, ATTR_NOTHROW_LEAF_LIST)
+DEF_SYNC_BUILTIN (BUILT_IN_SWAP_REL_16,
+		  "__sync_swap_rel_16",
+		  BT_FN_I16_VPTR_I16, ATTR_NOTHROW_LEAF_LIST)
+
+DEF_SYNC_BUILTIN (BUILT_IN_SWAP_FULL_N,
+		  "__sync_swap_full",
+		  BT_FN_VOID_VAR, ATTR_NOTHROW_LEAF_LIST)
+DEF_SYNC_BUILTIN (BUILT_IN_SWAP_FULL_1,
+		  "__sync_swap_full_1",
+		  BT_FN_I1_VPTR_I1, ATTR_NOTHROW_LEAF_LIST)
+DEF_SYNC_BUILTIN (BUILT_IN_SWAP_FULL_2,
+		  "__sync_swap_full_2",
+		  BT_FN_I2_VPTR_I2, ATTR_NOTHROW_LEAF_LIST)
+DEF_SYNC_BUILTIN (BUILT_IN_SWAP_FULL_4,
+		  "__sync_swap_full_4",
+		  BT_FN_I4_VPTR_I4, ATTR_NOTHROW_LEAF_LIST)
+DEF_SYNC_BUILTIN (BUILT_IN_SWAP_FULL_8,
+		  "__sync_swap_full_8",
+		  BT_FN_I8_VPTR_I8, ATTR_NOTHROW_LEAF_LIST)
+DEF_SYNC_BUILTIN (BUILT_IN_SWAP_FULL_16,
+		  "__sync_swap_full_16",
+		  BT_FN_I16_VPTR_I16, ATTR_NOTHROW_LEAF_LIST)
+
 DEF_SYNC_BUILTIN (BUILT_IN_LOCK_RELEASE_N, "__sync_lock_release",
 		  BT_FN_VOID_VAR, ATTR_NOTHROW_LEAF_LIST)
 DEF_SYNC_BUILTIN (BUILT_IN_LOCK_RELEASE_1, "__sync_lock_release_1",
Index: expr.h
===================================================================
--- expr.h	(revision 173831)
+++ expr.h	(working copy)
@@ -161,6 +161,14 @@  enum optab_methods
   OPTAB_MUST_WIDEN
 };
 
+/* Memory barrier type.  */
+enum membar_mode
+{
+  MEMBAR_MODE_RELEASE,
+  MEMBAR_MODE_ACQUIRE,
+  MEMBAR_MODE_FULL
+};
+
 /* Generate code for a simple binary or unary operation.  "Simple" in
    this case means "can be unambiguously described by a (mode, code)
    pair and mapped to a single optab."  */
@@ -217,6 +225,7 @@  rtx expand_bool_compare_and_swap (rtx, r
 rtx expand_sync_operation (rtx, rtx, enum rtx_code);
 rtx expand_sync_fetch_operation (rtx, rtx, enum rtx_code, bool, rtx);
 rtx expand_sync_lock_test_and_set (rtx, rtx, rtx);
+rtx expand_sync_swap (rtx, rtx, rtx, enum membar_mode);
 
 /* Functions from expmed.c:  */
 
@@ -248,6 +257,7 @@  extern void expand_builtin_setjmp_receiv
 extern rtx expand_builtin_saveregs (void);
 extern void expand_builtin_trap (void);
 extern rtx builtin_strncpy_read_str (void *, HOST_WIDE_INT, enum machine_mode);
+extern void expand_builtin_synchronize (void);
 
 /* Functions from expr.c:  */
 
Index: config/i386/i386.md
===================================================================
--- config/i386/i386.md	(revision 173831)
+++ config/i386/i386.md	(working copy)
@@ -250,6 +250,7 @@ 
   UNSPECV_MWAIT
   UNSPECV_CMPXCHG
   UNSPECV_XCHG
+  UNSPECV_SWAP_FULL
   UNSPECV_LOCK
   UNSPECV_PROLOGUE_USE
   UNSPECV_CLD
Index: config/i386/sync.md
===================================================================
--- config/i386/sync.md	(revision 173831)
+++ config/i386/sync.md	(working copy)
@@ -232,6 +232,15 @@ 
   return "lock{%;} add{<imodesuffix>}\t{%1, %0|%0, %1}";
 })
 
+(define_insn "sync_swap_full<mode>"
+  [(set (match_operand:SWI 0 "register_operand" "=<r>")
+	(unspec_volatile:SWI
+	  [(match_operand:SWI 1 "memory_operand" "+m")] UNSPECV_SWAP_FULL))
+   (set (match_dup 1)
+	(match_operand:SWI 2 "register_operand" "0"))]
+  ""
+  "xchg{<imodesuffix>}\t{%1, %0|%0, %1}")
+
 ;; Recall that xchg implicitly sets LOCK#, so adding it again wastes space.
 (define_insn "sync_lock_test_and_set<mode>"
   [(set (match_operand:SWI 0 "register_operand" "=<r>")
Index: libgcc-std.ver
===================================================================
--- libgcc-std.ver	(revision 173831)
+++ libgcc-std.ver	(working copy)
@@ -1919,3 +1919,26 @@  GCC_4.6.0 {
   __morestack_initial_sp
   __splitstack_find
 }
+
+%inherit GCC_4.7.0 GCC_4.6.0
+GCC_4.7.0 {
+  __sync_swap_acq_1
+  __sync_swap_rel_1
+  __sync_swap_full_1
+
+  __sync_swap_acq_2
+  __sync_swap_rel_2
+  __sync_swap_full_2
+
+  __sync_swap_acq_4
+  __sync_swap_rel_4
+  __sync_swap_full_4
+
+  __sync_swap_acq_8
+  __sync_swap_rel_8
+  __sync_swap_full_8
+
+  __sync_swap_acq_16
+  __sync_swap_rel_16
+  __sync_swap_full_16
+}