Message ID | 5587DA07.10307@arm.com
---|---
State | New
On Mon, Jun 22, 2015 at 10:48 AM, Matthew Wahab <matthew.wahab@arm.com> wrote:
> This is the ARM version of the patches to strengthen memory barriers for
> the __sync builtins on ARMv8 targets
> (https://gcc.gnu.org/ml/gcc-patches/2015-05/msg01989.html).
>
> The problem is that the barriers generated for the __sync builtins on
> ARMv8 targets are too weak. This affects the full and the acquire
> barriers in the __sync fetch-and-op and compare-and-swap functions and
> in __sync_lock_test_and_set.
>
> This patch series strengthens the barriers by replacing the initial
> load-acquire with a simple load and adding a final memory barrier to
> prevent code hoisting.
>
> - Full barriers: __sync_fetch_and_op, __sync_op_and_fetch,
>   __sync_*_compare_and_swap
>
>   [load-acquire; code; store-release]
>   becomes
>   [load; code; store-release; barrier].
>
> - Acquire barriers: __sync_lock_test_and_set
>
>   [load-acquire; code; store]
>   becomes
>   [load; code; store; barrier]
>
> This patch changes the code generated for the __sync_fetch_and_<op> and
> __sync_<op>_and_fetch builtins.
>
> Tested as part of a series for arm-none-linux-gnueabihf with check-gcc.
>
> Ok for trunk?
> Matthew
>
> gcc/
> 2015-06-22  Matthew Wahab  <matthew.wahab@arm.com>
>
>         PR Target/65697

s/Target/target

>         * config/arm/arm.c (arm_split_atomic_op): For ARMv8, replace an
>         initial acquire barrier with a final full barrier.

Otherwise OK.

Ramana
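As a concrete illustration of the full-barrier case, here is a minimal C use of one of the affected builtins, with comments sketching the kind of AArch32 sequences involved before and after the change. This is a sketch only: the function name is invented, and the exact registers and scheduling in real GCC output will differ.

```c
#include <stdint.h>

/* __sync_fetch_and_add is a full-barrier builtin.  Before the patch,
   an ARMv8 target expanded it roughly as

       loop:  ldaex  r0, [r1]        @ load-acquire exclusive
              add    r2, r0, #1
              stlex  r3, r2, [r1]    @ store-release exclusive
              cmp    r3, #0
              bne    loop

   A store-release only orders the accesses that precede it, so later
   memory accesses could still be hoisted above the stlex.  After the
   patch the expansion uses a plain exclusive load and a trailing full
   barrier instead:

       loop:  ldrex  r0, [r1]        @ plain load exclusive
              add    r2, r0, #1
              stlex  r3, r2, [r1]    @ store-release exclusive
              cmp    r3, #0
              bne    loop
              dmb    ish             @ final full barrier blocks hoisting  */

int32_t
fetch_add_one (int32_t *counter)
{
  return __sync_fetch_and_add (counter, 1);
}
```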
```diff
From 3e9f71c04dba20ba66b5c9bae284fcac5fdd91ec Mon Sep 17 00:00:00 2001
From: Matthew Wahab <matthew.wahab@arm.com>
Date: Fri, 22 May 2015 13:31:58 +0100
Subject: [PATCH 1/3] [ARM] Strengthen barriers for sync-fetch-op builtin.

Change-Id: I18f5af5ba4b2e74b5866009d3a090e251eff4a45
---
 gcc/config/arm/arm.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index e79a369..94118f4 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -27668,6 +27668,8 @@ arm_split_atomic_op (enum rtx_code code, rtx old_out, rtx new_out, rtx mem,
   rtx_code_label *label;
   rtx x;
 
+  bool is_armv8_sync = arm_arch8 && is_mm_sync (model);
+
   bool use_acquire = TARGET_HAVE_LDACQ
                      && !(is_mm_relaxed (model) || is_mm_consume (model)
                           || is_mm_release (model));
@@ -27676,6 +27678,11 @@ arm_split_atomic_op (enum rtx_code code, rtx old_out, rtx new_out, rtx mem,
                      && !(is_mm_relaxed (model) || is_mm_consume (model)
                           || is_mm_acquire (model));
 
+  /* For ARMv8, a load-acquire is too weak for __sync memory orders.  Instead,
+     a full barrier is emitted after the store-release.  */
+  if (is_armv8_sync)
+    use_acquire = false;
+
   /* Checks whether a barrier is needed and emits one accordingly.  */
   if (!(use_acquire || use_release))
     arm_pre_atomic_barrier (model);
@@ -27746,7 +27753,8 @@ arm_split_atomic_op (enum rtx_code code, rtx old_out, rtx new_out, rtx mem,
   emit_unlikely_jump (gen_cbranchsi4 (x, cond, const0_rtx, label));
 
   /* Checks whether a barrier is needed and emits one accordingly.  */
-  if (!(use_acquire || use_release))
+  if (is_armv8_sync
+      || !(use_acquire || use_release))
     arm_post_atomic_barrier (model);
 }
 
-- 
1.9.1
```
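For context on the new guard: is_mm_sync (model) is true for the GCC-internal MEMMODEL_SYNC_* models, which (assuming the usual mapping in GCC of that era) are used when expanding the __sync builtins, and false for the C11 models used by the __atomic builtins. A minimal sketch of the two paths, with invented function names:

```c
#include <stdint.h>

/* Assumed mapping (GCC internals): the __sync form expands with a
   MEMMODEL_SYNC_* model, so on ARMv8 it now takes the plain-load plus
   trailing-dmb path above; the __atomic form uses a plain C11 model
   and keeps the acquire/release expansion.  */

int32_t
sync_add (int32_t *p)          /* is_mm_sync (model) == true  */
{
  return __sync_fetch_and_add (p, 1);
}

int32_t
atomic_add (int32_t *p)        /* is_mm_sync (model) == false */
{
  return __atomic_fetch_add (p, 1, __ATOMIC_SEQ_CST);
}
```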