
[1/3,ARM,PR,target/65697] Strengthen memory barriers for __sync builtins

Message ID 5587DA07.10307@arm.com

Commit Message

Matthew Wahab June 22, 2015, 9:48 a.m. UTC
This is the ARM version of the patches to strengthen memory barriers for the
__sync builtins on ARMv8 targets
(https://gcc.gnu.org/ml/gcc-patches/2015-05/msg01989.html).

The problem is that the barriers generated for the __sync builtins on
ARMv8 targets are too weak. This affects the full barriers in the __sync
fetch-and-op and compare-and-swap functions and the acquire barrier in
__sync_lock_test_and_set.

This patch series changes the generated code to strengthen the barriers:
the initial load-acquire becomes a plain load, and a final memory barrier
is added to prevent later memory accesses from being hoisted above the
operation (a sketch of the reordering this forbids follows the list below).

- Full barriers:  __sync_fetch_and_op, __sync_op_and_fetch
   __sync_*_compare_and_swap

   [load-acquire; code; store-release]
   becomes
   [load; code; store-release; barrier]

- Acquire barriers:  __sync_lock_test_and_set

   [load-acquire; code; store]
   becomes
   [load; code; store; barrier]
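
To see why the trailing barrier matters, consider a Dekker-style sketch
(hypothetical; not part of the patch, and the names are illustrative).
The __sync builtins are documented as full barriers, so the outcome
r1 == 0 && r2 == 0 below must be impossible. With the old
[load-acquire; code; store-release] expansion, the plain load that
follows the builtin may be reordered before the store-release, making
that outcome architecturally possible on ARMv8:

  #include <pthread.h>
  #include <stdio.h>

  static int x, y;
  static int r1, r2;

  static void *
  thread1 (void *arg)
  {
    __sync_fetch_and_add (&x, 1);  /* documented as a full barrier */
    r1 = y;                        /* must not move above the add */
    return arg;
  }

  static void *
  thread2 (void *arg)
  {
    __sync_fetch_and_add (&y, 1);
    r2 = x;
    return arg;
  }

  int
  main (void)
  {
    pthread_t t1, t2;
    pthread_create (&t1, NULL, thread1, NULL);
    pthread_create (&t2, NULL, thread2, NULL);
    pthread_join (t1, NULL);
    pthread_join (t2, NULL);
    /* With correct full barriers, "r1=0 r2=0" cannot be printed.  */
    printf ("r1=%d r2=%d\n", r1, r2);
    return 0;
  }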

This patch changes the code generated for __sync_fetch_and_<op> and
__sync_<op>_and_fetch builtins.
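
For reference, a minimal (hypothetical) use of the two affected
builtins; the function and variable names are illustrative. On an ARMv8
target the expansion of each changes as described above: the initial
LDAEX (load-acquire exclusive) becomes a plain LDREX, the STLEX
(store-release exclusive) is kept, and a trailing full barrier
(DMB ISH) is added:

  static int counter;

  int
  old_value (void)
  {
    /* Returns counter's value from before the increment.  */
    return __sync_fetch_and_add (&counter, 1);
  }

  int
  new_value (void)
  {
    /* Returns counter's value from after the increment.  */
    return __sync_add_and_fetch (&counter, 1);
  }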

Tested as part of a series for arm-none-linux-gnueabihf with check-gcc.

Ok for trunk?
Matthew

gcc/
2015-06-22  Matthew Wahab  <matthew.wahab@arm.com>

	PR Target/65697
	* config/arm/arm.c (arm_split_atomic_op): For ARMv8, replace an
	initial acquire barrier with a final full barrier.

Comments

Ramana Radhakrishnan June 29, 2015, 1:07 p.m. UTC | #1
On Mon, Jun 22, 2015 at 10:48 AM, Matthew Wahab <matthew.wahab@arm.com> wrote:
> [...]
> gcc/
> 2015-06-22  Matthew Wahab  <matthew.wahab@arm.com>
>
>         PR Target/65697

s/Target/target

>         * config/arm/arm.c (arm_split_atomic_op): For ARMv8, replace an
>         initial acquire barrier with a final full barrier.

Otherwise OK.

Ramana

Patch

From 3e9f71c04dba20ba66b5c9bae284fcac5fdd91ec Mon Sep 17 00:00:00 2001
From: Matthew Wahab <matthew.wahab@arm.com>
Date: Fri, 22 May 2015 13:31:58 +0100
Subject: [PATCH 1/3] [ARM] Strengthen barriers for sync-fetch-op builtin.

Change-Id: I18f5af5ba4b2e74b5866009d3a090e251eff4a45
---
 gcc/config/arm/arm.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index e79a369..94118f4 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -27668,6 +27668,8 @@  arm_split_atomic_op (enum rtx_code code, rtx old_out, rtx new_out, rtx mem,
   rtx_code_label *label;
   rtx x;
 
+  bool is_armv8_sync = arm_arch8 && is_mm_sync (model);
+
   bool use_acquire = TARGET_HAVE_LDACQ
                      && !(is_mm_relaxed (model) || is_mm_consume (model)
 			  || is_mm_release (model));
@@ -27676,6 +27678,11 @@  arm_split_atomic_op (enum rtx_code code, rtx old_out, rtx new_out, rtx mem,
                      && !(is_mm_relaxed (model) || is_mm_consume (model)
 			  || is_mm_acquire (model));
 
+  /* For ARMv8, a load-acquire is too weak for __sync memory orders.  Instead,
+     a full barrier is emitted after the store-release.  */
+  if (is_armv8_sync)
+    use_acquire = false;
+
   /* Checks whether a barrier is needed and emits one accordingly.  */
   if (!(use_acquire || use_release))
     arm_pre_atomic_barrier (model);
@@ -27746,7 +27753,8 @@  arm_split_atomic_op (enum rtx_code code, rtx old_out, rtx new_out, rtx mem,
   emit_unlikely_jump (gen_cbranchsi4 (x, cond, const0_rtx, label));
 
   /* Checks whether a barrier is needed and emits one accordingly.  */
-  if (!(use_acquire || use_release))
+  if (is_armv8_sync
+      || !(use_acquire || use_release))
     arm_post_atomic_barrier (model);
 }
 
-- 
1.9.1