diff mbox

CRIS atomics revisited 4/4: give up on alignment of atomic data, RFC for is_lock_free hook

Message ID 201207160349.q6G3n06W019752@ignucius.se.axis.com
State New
Headers show

Commit Message

Hans-Peter Nilsson July 16, 2012, 3:49 a.m. UTC
Well, give up by default that is, and fix it up in a helper
function in glibc to hold a global byte-sized atomic lock for
the duration.  (Sorry!)  Yes, this means that
fold_builtin_atomic_always_lock_free is wrong.  It knows about
alignment in general but doesn't handle the case where the
default alignment of the underlying type is too small for atomic
accesses, and should probably be augmented by a target hook,
alternatively, change the allow_libcall argument in the call to
can_compare_and_swap_p to false.  I guess I should open a PR for
this and add a test-case.  Later.

Too many library API writers don't cater for the possibility
that atomic (lockless) data may need to have certain properties
that may not be matched by the basic underlying data type,
specifically alignment, and fixing the failing instances by hand
is...challenging.  About half the cases have the atomic data
defined in the proximity of the atomic operations, and are
easily locally fixable.  The other half are of increasing
complexity; may have the data defined elsewhere, where the need
for atomicity is surprising and fixing it would be a kludge.
(But with a proper API could be easily handled, e.g. if a
data-type defined specific for the purpose was used; one
different than the underlying type or other common derivated
type used in the library.)

So, we'll change things for cris*-linux.  By default, call a
helper function.  Users can change the default at the caller
site where atomic alignment is known good or where there is
interest in fixing it up when failure is seen; executing a trap
insn was the old default.

Regarding changing fold_builtin_atomic_always_lock_free or
adding a hook, I posit that a better default is to assume atomic
data has to be naturally aligned on *all* existing GCC targets
to accomplish lockless operations, at least for those that don't
just punt to a system call, so maybe
fold_builtin_atomic_always_lock_free should be changed to check
that, by default.  Now, it just assumes that the default
type-alignment is ok and that only a smaller alignment is not
always atomic.  People with counter-examples are asked to please
explain how the counter-example handles data straddling a page
boundary.  Yes, it can be done, but how does the kernel-equivalent
accomplish atomicity; are the pages locked while the instruction
(assumed to cause an exception), is emulated, or is kernel
re-entrance impossible or what?

The default remains the same for non-*-linux-* cris-* and
crisv32-* subtargets, since the code compiled for those targets
is expected to have a different focus, one where fixing
non-aligned data definitions is feasible and desirable.

I deliberately make it optional and use weasel-wording whether
the library functions are actually called or an atomic insn
sequence emitted when suitable; when GCC knows the alignment of
the data (for example, for local static data or through
deliberate attributes) it should be allowed to emit the atomic
instruction sequence even without alignment checks.  Right now
(or maybe it was just the 4.7 branch), GCC's handling of
alignment is so poor that the emitted alignment checks (those
that conditionally execute a trap insn) aren't eliminated for
atomic data with explicit large-enough attribute declarations.
From what I (with limited tree-foo) could see, IIRC basically
everything about alignment for the specific data is discarded
and the underlying default type alignment is reported.  (Right,
a PR is in order; I know I've entered a PR for the related
__alignof__.)

gcc:
	* config/cris/cris.c (cris_init_libfuncs): Handle initialization
	of library functions for basic atomic compare-and-swap.
	* config/cris/cris.h (TARGET_ATOMICS_MAY_CALL_LIBFUNCS): New macro.
	* config/cris/cris.opt (munaligned-atomic-may-use-library): New
	option.
	* config/cris/sync.md ("atomic_fetch_<atomic_op_name><mode>")
	("cris_atomic_fetch_<atomic_op_name><mode>_1")
	("atomic_compare_and_swap<mode>")
	("cris_atomic_compare_and_swap<mode>_1"): Make
	conditional on TARGET_ATOMICS_MAY_CALL_LIBFUNCS for
	sizes larger than byte.

gcc/testsuite:
	* gcc.target/cris/sync-2i.c, gcc.target/cris/sync-2s.c,
	gcc.target/cris/sync-3i.c, gcc.target/cris/sync-3s.c,
	gcc.target/cris/sync-4i.c, gcc.target/cris/sync-4s.c,
	gcc.target/cris/sync-1-v10.c,
	gcc.target/cris/sync-1-v32.c: For cris*-*-linux*, also
	pass -mno-unaligned-atomic-may-use-library.
	* gcc.target/cris/sync-xchg-1.c: New test.


brgds, H-P

Comments

Andrew MacLeod July 17, 2012, 12:24 p.m. UTC | #1
On 07/15/2012 11:49 PM, Hans-Peter Nilsson wrote:
> Well, give up by default that is, and fix it up in a helper
> function in glibc to hold a global byte-sized atomic lock for
> the duration.  (Sorry!)  Yes, this means that
> fold_builtin_atomic_always_lock_free is wrong.  It knows about
> alignment in general but doesn't handle the case where the
> default alignment of the underlying type is too small for atomic
> accesses, and should probably be augmented by a target hook,
> alternatively, change the allow_libcall argument in the call to
> can_compare_and_swap_p to false.  I guess I should open a PR for
> this and add a test-case.  Later.

I set it up to eventually handle alignment issues, but didn't really 
know any details of what to do when, or how.  Input was rather lacking 
at the time and there was a time crunch, so I just punted on it in 4.7 
at the tree level and targets did their own thing.  The library idea was 
new enough that the standards guys couldnt/wouldn't give me a straight 
answer since it hadn't really been thought through I don't think.  My 
apologies if I misrepresent anyone there :-)

The basic problem you are dealing with is a 4 byte object that is not 
aligned properly, so the  lock-free instructions don't work on it., but 
'always_lock_free' says "yes, that type is lock free". Then in the 
library, its implemented via a lock?  Is that approximately it?  or is 
there an additional issue?

The original intention was that  __atomic{is,always}_lock_free takes the 
size and a pointer to the object, and the alignment of the pointer 
object would be checked at compile time to see if it is always 
compatible with size. (so a char pointer to a 4 byte object would fail 
that test). which would means 'always_lock_free' should return false, 
and in turn a call to 'is_lock_free' would generate a library call and 
check the alignment at runtime of the object to determine what to do.    
The library implementation of all the other __atomic operations would 
check 'is_lock_free' to see whether they need to use a lock, or whether 
they can use available lock-free sequences/syscalls.

Some confusion around the standard surfaced and the intention from the 
standard point of view appeared to be that 'is_lock_free' should be true 
or false ALL the time for a given type.    This possibly is related to 
the separation of gcc's built-ins from the implementation details of the 
standard  (ie the standard uses the built-ins but only with aligned 
objects, but those built-ins can also still be used outside the standard 
in more chaotic ways)

So in the end, i wasn't sure what to do and ran out of time.  Making it 
better for 4.8 would be goodness.  Is there a fatal flaw in the original 
(unimplemented) intention?  if so, can it be fixed without changing the API?

Any PR's you open related this this, copy me on them and I'll try to get 
them addressed.

Andrew
diff mbox

Patch

diff --git gcc/config/cris/cris.c gcc/config/cris/cris.c
index 22b254f..e4c11fd 100644
--- gcc/config/cris/cris.c
+++ gcc/config/cris/cris.c
@@ -3130,6 +3176,16 @@  cris_init_libfuncs (void)
   set_optab_libfunc (udiv_optab, SImode, "__Udiv");
   set_optab_libfunc (smod_optab, SImode, "__Mod");
   set_optab_libfunc (umod_optab, SImode, "__Umod");
+
+  /* Atomic data being unaligned is unfortunately a reality.
+     Deal with it.  */
+  if (TARGET_ATOMICS_MAY_CALL_LIBFUNCS)
+    {
+      set_optab_libfunc (sync_compare_and_swap_optab, SImode,
+			 "__cris_atcmpxchgr32");
+      set_optab_libfunc (sync_compare_and_swap_optab, HImode,
+			 "__cris_atcmpxchgr16");
+    }
 }
 
 /* The INIT_EXPANDERS worker sets the per-function-data initializer and
diff --git gcc/config/cris/cris.h gcc/config/cris/cris.h
index dee9d07..6bb457d 100644
--- gcc/config/cris/cris.h
+++ gcc/config/cris/cris.h
@@ -324,6 +324,11 @@  extern int cris_cpu_version;
 #define TARGET_TRAP_USING_BREAK8 \
  (cris_trap_using_break8 == 2 ? TARGET_HAS_BREAK : cris_trap_using_break8)
 
+/* Call library functions by default for GNU/Linux.  */
+#define TARGET_ATOMICS_MAY_CALL_LIBFUNCS		\
+ (cris_atomics_calling_libfunc == 2			\
+  ? TARGET_LINUX : cris_atomics_calling_libfunc)
+
 /* The < v10 atomics turn off interrupts, so they don't need alignment.
    Incidentally, by default alignment is off there causing variables to
    be default unaligned all over, so we'd have to make support
diff --git gcc/config/cris/cris.opt gcc/config/cris/cris.opt
index 2d12a5c..cd58004 100644
--- gcc/config/cris/cris.opt
+++ gcc/config/cris/cris.opt
@@ -183,6 +183,10 @@  mtrap-unaligned-atomic
 Target Report Var(cris_trap_unaligned_atomic) Init(2)
 Emit checks causing \"break 8\" instructions to execute when applying atomic builtins on misaligned memory
 
+munaligned-atomic-may-use-library
+Target Report Var(cris_atomics_calling_libfunc) Init(2)
+Handle atomic builtins that may be applied to unaligned data by calling library functions. Overrides -mtrap-unaligned-atomic.
+
 ; TARGET_SVINTO: Currently this just affects alignment.  FIXME:
 ; Redundant with TARGET_ALIGN_BY_32, or put machine stuff here?
 ; This and the others below could just as well be variables and
diff --git gcc/config/cris/sync.md gcc/config/cris/sync.md
index 558afbf..c465ce5 100644
--- gcc/config/cris/sync.md
+++ gcc/config/cris/sync.md
@@ -110,3 +125,3 @@ 
    (clobber (match_scratch:SI 3 "=&r"))]
-  ""
+  "<MODE>mode == QImode || !TARGET_ATOMICS_MAY_CALL_LIBFUNCS"
 {
@@ -189,3 +205,3 @@ 
    (match_operand 7)]
-  ""
+  "<MODE>mode == QImode || !TARGET_ATOMICS_MAY_CALL_LIBFUNCS"
 {
@@ -217,3 +233,3 @@ 
 	 CRIS_UNSPEC_ATOMIC_SWAP_MEM))]
-  ""
+  "<MODE>mode == QImode || !TARGET_ATOMICS_MAY_CALL_LIBFUNCS"
 {
diff --git gcc/testsuite/gcc.target/cris/sync-2i.c gcc/testsuite/gcc.target/cris/sync-2i.c
index e43aa53..d491d3c 100644
--- gcc/testsuite/gcc.target/cris/sync-2i.c
+++ gcc/testsuite/gcc.target/cris/sync-2i.c
@@ -2,6 +2,7 @@ 
 /* { dg-do compile } */
 /* { dg-options "-O2 -Dop -Dtype=int" } */
 /* { dg-additional-options "-mtrap-using-break8 -mtrap-unaligned-atomic" { target cris-*-elf } } */
+/* { dg-additional-options "-mno-unaligned-atomic-may-use-library" { target cris*-*-linux* } } */
 /* { dg-final { scan-assembler "\tbreak 8" } } */
 /* { dg-final { scan-assembler "\tbtstq \\(2-1\\)," } } */
 /* { dg-final { scan-assembler-not "\tand" } } */
diff --git gcc/testsuite/gcc.target/cris/sync-2s.c gcc/testsuite/gcc.target/cris/sync-2s.c
index 9be7dc6..06ff98a 100644
--- gcc/testsuite/gcc.target/cris/sync-2s.c
+++ gcc/testsuite/gcc.target/cris/sync-2s.c
@@ -2,6 +2,7 @@ 
 /* { dg-do compile } */
 /* { dg-options "-O2 -Dop -Dtype=short" } */
 /* { dg-additional-options "-mtrap-using-break8 -mtrap-unaligned-atomic" { target cris-*-elf } } */
+/* { dg-additional-options "-mno-unaligned-atomic-may-use-library" { target cris*-*-linux* } } */
 /* { dg-final { scan-assembler "\tbreak 8" } } */
 /* { dg-final { scan-assembler "\tbtstq \\(1-1\\)," } } */
 /* { dg-final { scan-assembler-not "\tand" } } */
diff --git gcc/testsuite/gcc.target/cris/sync-3i.c gcc/testsuite/gcc.target/cris/sync-3i.c
index 114e0a8..9e67d61 100644
--- gcc/testsuite/gcc.target/cris/sync-3i.c
+++ gcc/testsuite/gcc.target/cris/sync-3i.c
@@ -4,6 +4,7 @@ 
 /* { dg-do compile } */
 /* { dg-options "-O2 -Dxchg -Dtype=int" } */
 /* { dg-additional-options "-mtrap-using-break8 -mtrap-unaligned-atomic" { target cris-*-elf } } */
+/* { dg-additional-options "-mno-unaligned-atomic-may-use-library" { target cris*-*-linux* } } */
 /* { dg-final { scan-assembler "\tbreak 8" } } */
 /* { dg-final { scan-assembler "\tbtstq \\(2-1\\)," { xfail *-*-* } } } */
 /* { dg-final { scan-assembler-not "\tand" { xfail *-*-* } } } */
diff --git gcc/testsuite/gcc.target/cris/sync-3s.c gcc/testsuite/gcc.target/cris/sync-3s.c
index facbb39..8e87a3b 100644
--- gcc/testsuite/gcc.target/cris/sync-3s.c
+++ gcc/testsuite/gcc.target/cris/sync-3s.c
@@ -4,6 +4,7 @@ 
 /* { dg-do compile } */
 /* { dg-options "-O2 -Dxchg -Dtype=short" } */
 /* { dg-additional-options "-mtrap-using-break8 -mtrap-unaligned-atomic" { target cris-*-elf } } */
+/* { dg-additional-options "-mno-unaligned-atomic-may-use-library" { target cris*-*-linux* } } */
 /* { dg-final { scan-assembler "\tbreak 8" } } */
 /* { dg-final { scan-assembler "\tbtstq \\(1-1\\)," { xfail *-*-* } } } */
 /* { dg-final { scan-assembler-not "\tand" { xfail *-*-* } } } */
diff --git gcc/testsuite/gcc.target/cris/sync-4i.c gcc/testsuite/gcc.target/cris/sync-4i.c
index 9756c69..78a7012 100644
--- gcc/testsuite/gcc.target/cris/sync-4i.c
+++ gcc/testsuite/gcc.target/cris/sync-4i.c
@@ -1,6 +1,7 @@ 
 /* Check that we don't get alignment-checking code, int.  */
 /* { dg-do compile } */
 /* { dg-options "-O2 -Dtype=int -mno-trap-unaligned-atomic" } */
+/* { dg-additional-options "-mno-unaligned-atomic-may-use-library" { target cris*-*-linux* } } */
 /* { dg-final { scan-assembler-not "\tbreak\[ \t\]" } } */
 /* { dg-final { scan-assembler-not "\tbtstq\[ \t\]\[^5\]" } } */
 /* { dg-final { scan-assembler-not "\tand" } } */
diff --git gcc/testsuite/gcc.target/cris/sync-4s.c gcc/testsuite/gcc.target/cris/sync-4s.c
index 2d64430..6691a48 100644
--- gcc/testsuite/gcc.target/cris/sync-4s.c
+++ gcc/testsuite/gcc.target/cris/sync-4s.c
@@ -1,6 +1,7 @@ 
 /* Check that we don't get alignment-checking code, short.  */
 /* { dg-do compile } */
 /* { dg-options "-O2 -Dtype=short -mno-trap-unaligned-atomic" } */
+/* { dg-additional-options "-mno-unaligned-atomic-may-use-library" { target cris*-*-linux* } } */
 /* { dg-final { scan-assembler-not "\tbreak\[ \t\]" } } */
 /* { dg-final { scan-assembler-not "\tbtstq\[ \t\]\[^5\]" } } */
 /* { dg-final { scan-assembler-not "\tand" } } */
diff --git gcc/testsuite/gcc.target/cris/sync-1-v10.c gcc/testsuite/gcc.target/cris/sync-1-v10.c
index 640098a..6c8dd1a 100644
--- gcc/testsuite/gcc.target/cris/sync-1-v10.c
+++ gcc/testsuite/gcc.target/cris/sync-1-v10.c
@@ -1,4 +1,5 @@ 
 /* Check that we can assemble both base atomic variants.  */
 /* { dg-do assemble } */
 /* { dg-options "-O2 -march=v10" } */
+/* { dg-additional-options "-mno-unaligned-atomic-may-use-library" { target cris*-*-linux* } } */
 #include "sync-1.c"
diff --git gcc/testsuite/gcc.target/cris/sync-1-v32.c gcc/testsuite/gcc.target/cris/sync-1-v32.c
index 644d922..3c1d076 100644
--- gcc/testsuite/gcc.target/cris/sync-1-v32.c
+++ gcc/testsuite/gcc.target/cris/sync-1-v32.c
@@ -1,4 +1,5 @@ 
 /* Check that we can assemble both base atomic variants.  */
 /* { dg-do assemble } */
 /* { dg-options "-O2 -march=v32" } */
+/* { dg-additional-options "-mno-unaligned-atomic-may-use-library" { target cris*-*-linux* } } */
 #include "sync-1.c"
Index: gcc.target/cris/sync-xchg-1.c
===================================================================
--- gcc.target/cris/sync-xchg-1.c	(revision 0)
+++ gcc.target/cris/sync-xchg-1.c	(revision 0)
@@ -0,0 +1,21 @@ 
+/* Check that the basic library call variant is sane; no other calls, no
+   checks compares or branches.  */
+/* { dg-do compile } */
+/* { dg-options "-O2 -munaligned-atomic-may-use-library" } */
+/* { dg-final { scan-assembler-not "\tdi" } } */
+/* { dg-final { scan-assembler-not "\tbtstq" } } */
+/* { dg-final { scan-assembler-not "\tand" } } */
+/* { dg-final { scan-assembler-not "\tclearf" } } */
+/* { dg-final { scan-assembler-not "\tmove.d" } } */
+/* { dg-final { scan-assembler-not "\tcmp" } } */
+/* { dg-final { scan-assembler-not "\tb\[^s\]" } } */
+/* { dg-final { scan-assembler-times "\t\[JjBb\]sr" 1 } } */
+
+#ifndef type
+#define type int
+#endif
+
+type svcsw (type *ptr, type oldval, type newval)
+{
+  return __sync_val_compare_and_swap (ptr, oldval, newval);
+}