[libitm] Add SPARC bits

Message ID 201202122115.26506.ebotcazou@adacore.com
State New

Commit Message

Eric Botcazou Feb. 12, 2012, 8:15 p.m. UTC
Hi,

this adds a SPARC (V9 only) port to the library.  Straightforward, except that 
the unusual setup of _ITM_beginTransaction's frame consistently crashes GDB if 
it is fully described in the CFI.  Since backtracing through that frame isn't 
very likely to happen in real life, I have left it as-is.

Tested on SPARC/Solaris 8 & 9 (with the other patch) and on SPARC64/Linux, both 
in 32-bit and 64-bit mode.  OK for the mainline?


2012-02-12 Eric Botcazou  <ebotcazou@adacore.com>

	* configure.tgt (target_cpu): Handle sparc and sparc64 & sparcv9.
	* config/sparc/cacheline.h: New file.
	* config/sparc/target.h: Likewise.
	* config/sparc/sjlj.S: Likewise.
	* config/linux/sparc/futex_bits.h: Likewise.

Comments

Richard Henderson Feb. 13, 2012, 8:33 p.m. UTC | #1
On 02/12/2012 12:15 PM, Eric Botcazou wrote:
> 2012-02-12 Eric Botcazou  <ebotcazou@adacore.com>
> 
> 	* configure.tgt (target_cpu): Handle sparc and sparc64 & sparcv9.
> 	* config/sparc/cacheline.h: New file.
> 	* config/sparc/target.h: Likewise.
> 	* config/sparc/sjlj.S: Likewise.
> 	* config/linux/sparc/futex_bits.h: Likewise.

Ok.

Thanks for this.


r~

David Miller Feb. 14, 2012, 1:02 a.m. UTC | #2
From: Eric Botcazou <ebotcazou@adacore.com>
Date: Sun, 12 Feb 2012 21:15:26 +0100

> +	load	[%o1 + OFFSET (JB_CFA)], %fp
> +	cfi_def_cfa(%fp, 0)
> +#if STACK_BIAS
> +	sub	%fp, STACK_BIAS, %fp
> +	cfi_def_cfa_offset(STACK_BIAS)
> +#endif

I think you really need to put the proper value into the %fp register
atomically here.

If an interrupt comes in before you STACK_BIAS adjust the %fp, a
debugger or similar could see a corrupt frame pointer.

David Miller Feb. 14, 2012, 1:14 a.m. UTC | #3
From: Eric Botcazou <ebotcazou@adacore.com>
Date: Sun, 12 Feb 2012 21:15:26 +0100

> +static inline void
> +cpu_relax (void)
> +{
> +  __asm volatile ("" : : : "memory");
> +}

We probably want to do some nop'ish thing here which will yield the
cpu thread on Niagara cpus, I'd recommend something along the lines of
"rd %ccr, %g0" or "rd %y, %g0"

Eric Botcazou Feb. 14, 2012, 8:40 a.m. UTC | #4
> I think you really need to put the proper value into the %fp register
> atomically here.
>
> If an interrupt comes in before you STACK_BIAS adjust the %fp, a
> debugger or similar could see a corrupt frame pointer.

OK, I'm going to make the change.

Eric Botcazou Feb. 14, 2012, 8:42 a.m. UTC | #5
> We probably want to do some nop'ish thing here which will yield the
> cpu thread on Niagara cpus, I'd recommend something along the lines of
> "rd %ccr, %g0" or "rd %y, %g0"

I'm going for the former, thanks.

Eric Botcazou Feb. 27, 2012, 9:42 a.m. UTC | #6
> We probably want to do some nop'ish thing here which will yield the
> cpu thread on Niagara cpus, I'd recommend something along the lines of
> "rd %ccr, %g0" or "rd %y, %g0"

libgomp has its own idea about cpu_relax:

static inline void
cpu_relax (void)
{
#if defined __arch64__ || defined  __sparc_v9__
  __asm volatile ("membar #LoadLoad" : : : "memory");
#else
  __asm volatile ("" : : : "memory");
#endif
}

Maybe we can come up with a single implementation for libgomp and libitm?

And defined(__sparc_v9__) doesn't work on Solaris so we should drop the 
condition altogether in libgomp.

David Miller Feb. 27, 2012, 7:03 p.m. UTC | #7
From: Eric Botcazou <ebotcazou@adacore.com>
Date: Mon, 27 Feb 2012 10:42:17 +0100

>> We probably want to do some nop'ish thing here which will yield the
>> cpu thread on Niagara cpus, I'd recommend something along the lines of
>> "rd %ccr, %g0" or "rd %y, %g0"
> 
> libgomp has its own idea about cpu_relax:
> 
> static inline void
> cpu_relax (void)
> {
> #if defined __arch64__ || defined  __sparc_v9__
>   __asm volatile ("membar #LoadLoad" : : : "memory");
> #else
>   __asm volatile ("" : : : "memory");
> #endif
> }
> 
> Maybe we can come up with a single implementation for libgomp and libitm?

I think that's a great idea.

We need a reliable way to test for v9/v8plus/whatever properly because
nobody is testing current gcc with real 32-bit pre-v9 sparc hardware and
not providing atomics and proper cpu_relax implementations is just silly.

Eric Botcazou Feb. 27, 2012, 8:41 p.m. UTC | #8
> We need a reliable way to test for v9/v8plus/whatever properly because
> nobody is testing current gcc with real 32-bit pre-v9 sparc hardware and
> not providing atomics and proper cpu_relax implementations is just silly.

Both libgomp and libitm already force -mcpu=v9 though (and simply aren't built 
if it cannot be forced) so I don't think we should care about pre-v9 here.

I think the issue is just how we unify the two cpu_relax implementations:

static inline void
cpu_relax (void)
{
#if defined __arch64__ || defined  __sparc_v9__
  __asm volatile ("membar #LoadLoad" : : : "memory");
#else
  __asm volatile ("" : : : "memory");
#endif
}

for libgomp and:

static inline void
cpu_relax (void)
{
  __asm volatile ("rd %%ccr, %%g0" : : : "memory");
}

for libitm.


Would

static inline void
cpu_relax (void)
{
  __asm volatile ("membar #LoadLoad" : : : "memory");
}

be good enough?

David Miller Feb. 27, 2012, 8:45 p.m. UTC | #9
From: Eric Botcazou <ebotcazou@adacore.com>
Date: Mon, 27 Feb 2012 21:41:23 +0100

> I think the issue is just how we unify the two cpu_relax implementations:
> 
> static inline void
> cpu_relax (void)
> {
> #if defined __arch64__ || defined  __sparc_v9__
>   __asm volatile ("membar #LoadLoad" : : : "memory");
> #else
>   __asm volatile ("" : : : "memory");
> #endif
> }
> 
> for libgomp and:
> 
> static inline void
> cpu_relax (void)
> {
>   __asm volatile ("rd %%ccr, %%g0" : : : "memory");
> }
> 
> for libitm.
> 
> 
> Would
> 
> static inline void
> cpu_relax (void)
> {
>   __asm volatile ("membar #LoadLoad" : : : "memory");
> }
> 
> be good enough?

I'm not sure, because I'm pretty sure this type of membar acts as a
NOP on basically every sparc v9 chip ever made.

I would prefer to see the rd %%ccr used in both places if possible,
as we know that forces a cpu thread yield.

Eric Botcazou Feb. 27, 2012, 8:54 p.m. UTC | #10
> I'm not sure, because I'm pretty sure this type of membar acts as a
> NOP on basically every sparc v9 chip ever made.

OK, I was inferring from the libgomp implementation that the #LoadLoad variant
might also have yielding properties.

> I would prefer to see the rd %%ccr used in both places if possible,
> as we know that forces a cpu thread yield.

Fine with me.  I'll do some testing and then change libgomp, thanks.
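
For reference, the outcome of this exchange can be sketched as a single helper usable by both libgomp and libitm. This is only a sketch under assumptions: it relies on both libraries always being built with -mcpu=v9, as noted earlier in the thread; the non-SPARC fallback branch and the `spin_until_set` helper are hypothetical and exist only so the example compiles and can be exercised on other hosts.

```c
#include <stdatomic.h>

/* Sketch of a cpu_relax shared by libgomp and libitm.  On SPARC it reads
   %ccr into %g0, which according to the discussion above yields the
   hardware thread on Niagara CPUs.  The #else branch is an assumption for
   non-SPARC hosts: a plain compiler barrier.  */
static inline void
cpu_relax (void)
{
#if defined __sparc__
  __asm volatile ("rd %%ccr, %%g0" : : : "memory");
#else
  __asm volatile ("" : : : "memory");
#endif
}

/* Hypothetical helper, not part of either library: spin until *FLAG
   becomes nonzero or MAX_SPINS iterations have elapsed.  Returns the
   number of iterations spent waiting.  */
static inline unsigned
spin_until_set (atomic_int *flag, unsigned max_spins)
{
  unsigned spins = 0;

  while (atomic_load_explicit (flag, memory_order_acquire) == 0
	 && spins < max_spins)
    {
      cpu_relax ();
      spins++;
    }
  return spins;
}
```

The "memory" clobber is kept in both branches so that the compiler reloads the polled location on every iteration, whichever instruction ends up being emitted.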

Patch

Index: configure.tgt
===================================================================
--- configure.tgt	(revision 183864)
+++ configure.tgt	(working copy)
@@ -66,6 +66,34 @@  case "${target_cpu}" in
 
   sh*)		ARCH=sh ;;
 
+  sparc)
+	case " ${CC} ${CFLAGS} " in
+	  *" -m64 "*)
+	    ;;
+	  *)
+	    if test -z "$with_cpu"; then
+	      XCFLAGS="${XCFLAGS} -mcpu=v9"
+	    fi
+	esac
+	ARCH=sparc
+	;;
+
+  sparc64|sparcv9)
+	case " ${CC} ${CFLAGS} " in
+	  *" -m32 "*)
+	    XCFLAGS="${XCFLAGS} -mcpu=v9"
+	    ;;
+	  *" -m64 "*)
+	    ;;
+	  *)
+	    if test "x$with_cpu" = xv8; then
+	      XCFLAGS="${XCFLAGS} -mcpu=v9"
+	    fi
+	    ;;
+	esac
+	ARCH=sparc
+	;;
+
   x86_64)
 	case " ${CC} ${CFLAGS} " in
 	  *" -m32 "*)
Index: config/linux/sparc/futex_bits.h
===================================================================
--- config/linux/sparc/futex_bits.h	(revision 0)
+++ config/linux/sparc/futex_bits.h	(revision 0)
@@ -0,0 +1,62 @@ 
+/* Copyright (C) 2012 Free Software Foundation, Inc.
+
+   This file is part of the GNU Transactional Memory Library (libitm).
+
+   Libitm is free software; you can redistribute it and/or modify it
+   under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   Libitm is distributed in the hope that it will be useful, but WITHOUT ANY
+   WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
+   FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+   more details.
+
+   Under Section 7 of GPL version 3, you are granted additional
+   permissions described in the GCC Runtime Library Exception, version
+   3.1, as published by the Free Software Foundation.
+
+   You should have received a copy of the GNU General Public License and
+   a copy of the GCC Runtime Library Exception along with this program;
+   see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#include <sys/syscall.h>
+
+static inline long
+sys_futex0 (std::atomic<int> *addr, int op, int val)
+{
+  register long int g1 __asm__ ("g1");
+  register long int o0 __asm__ ("o0");
+  register long int o1 __asm__ ("o1");
+  register long int o2 __asm__ ("o2");
+  register long int o3 __asm__ ("o3");
+  long res;
+
+  g1 = SYS_futex;
+  o0 = (long) addr;
+  o1 = op;
+  o2 = val;
+  o3 = 0;
+
+#ifdef __arch64__
+  __asm volatile ("ta 0x6d"
+#else
+  __asm volatile ("ta 0x10"
+#endif
+		  : "=r"(g1), "=r"(o0)
+		  : "0"(g1), "1"(o0), "r"(o1), "r"(o2), "r"(o3)
+		  : "g2", "g3", "g4", "g5", "g6",
+		    "f0", "f1", "f2", "f3", "f4", "f5", "f6", "f7",
+		    "f8", "f9", "f10", "f11", "f12", "f13", "f14", "f15",
+		    "f16", "f17", "f18", "f19", "f20", "f21", "f22", "f23",
+		    "f24", "f25", "f26", "f27", "f28", "f29", "f30", "f31",
+		    "f32", "f34", "f36", "f38", "f40", "f42", "f44", "f46",
+		    "f48", "f50", "f52", "f54", "f56", "f58", "f60", "f62",
+		    "cc", "memory");
+
+  res = o0;
+  if (__builtin_expect ((unsigned long) res >= -515UL, 0))
+    res = -res;
+  return res;
+}
Index: config/sparc/cacheline.h
===================================================================
--- config/sparc/cacheline.h	(revision 0)
+++ config/sparc/cacheline.h	(revision 0)
@@ -0,0 +1,41 @@ 
+/* Copyright (C) 2012 Free Software Foundation, Inc.
+
+   This file is part of the GNU Transactional Memory Library (libitm).
+
+   Libitm is free software; you can redistribute it and/or modify it
+   under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   Libitm is distributed in the hope that it will be useful, but WITHOUT ANY
+   WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
+   FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+   more details.
+
+   Under Section 7 of GPL version 3, you are granted additional
+   permissions described in the GCC Runtime Library Exception, version
+   3.1, as published by the Free Software Foundation.
+
+   You should have received a copy of the GNU General Public License and
+   a copy of the GCC Runtime Library Exception along with this program;
+   see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#ifndef LIBITM_SPARC_CACHELINE_H
+#define LIBITM_SPARC_CACHELINE_H 1
+
+// A cacheline is the smallest unit with which locks are associated.
+// The current implementation of the _ITM_[RW] barriers assumes that
+// all data types can fit (aligned) within a cacheline, which means
+// in practice sizeof(complex long double) is the smallest cacheline size.
+// It ought to be small enough for efficient manipulation of the
+// modification mask, below.
+#ifdef __arch64__
+# define CACHELINE_SIZE 64
+#else
+# define CACHELINE_SIZE 32
+#endif
+
+#include "config/generic/cacheline.h"
+
+#endif // LIBITM_SPARC_CACHELINE_H
Index: config/sparc/sjlj.S
===================================================================
--- config/sparc/sjlj.S	(revision 0)
+++ config/sparc/sjlj.S	(revision 0)
@@ -0,0 +1,96 @@ 
+/* Copyright (C) 2012 Free Software Foundation, Inc.
+
+   This file is part of the GNU Transactional Memory Library (libitm).
+
+   Libitm is free software; you can redistribute it and/or modify it
+   under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   Libitm is distributed in the hope that it will be useful, but WITHOUT ANY
+   WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
+   FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+   more details.
+
+   Under Section 7 of GPL version 3, you are granted additional
+   permissions described in the GCC Runtime Library Exception, version
+   3.1, as published by the Free Software Foundation.
+
+   You should have received a copy of the GNU General Public License and
+   a copy of the GCC Runtime Library Exception along with this program;
+   see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#include "asmcfi.h"
+
+#ifdef __arch64__
+# define WORD_SIZE 8
+# define MIN_FRAME_SIZE 176
+# define STACK_BIAS 2047
+# define load  ldx
+# define store stx
+#else
+# define WORD_SIZE 4
+# define MIN_FRAME_SIZE 96
+# define STACK_BIAS 0
+# define load  ld
+# define store st
+#endif
+
+/* Fields of the JmpBuf structure.  */
+#define JB_CFA 0
+#define JB_PC  1
+#define OFFSET(FIELD) ((FIELD) * WORD_SIZE)
+
+/* The frame size must be a multiple of the double-word size.  */
+#define FRAME_SIZE (MIN_FRAME_SIZE + 2 * WORD_SIZE)
+#define JB_OFFSET  (STACK_BIAS + MIN_FRAME_SIZE)
+
+	.text
+	.align 4
+	.globl	_ITM_beginTransaction
+	.type	_ITM_beginTransaction, #function
+	.proc	016
+_ITM_beginTransaction:
+	cfi_startproc
+	add	%sp, STACK_BIAS, %g1
+	sub	%sp, FRAME_SIZE, %sp
+	cfi_def_cfa_offset(STACK_BIAS + FRAME_SIZE)
+	store	%g1, [%sp + JB_OFFSET + OFFSET (JB_CFA)]
+	store	%o7, [%sp + JB_OFFSET + OFFSET (JB_PC)]
+	/* ??? This triggers an internal error in GDB.  */
+	cfi_offset(%o7, -WORD_SIZE)
+	call	GTM_begin_transaction
+	 add	%sp, JB_OFFSET, %o1
+	load	[%sp + JB_OFFSET + OFFSET (JB_PC)], %o7
+	jmp	%o7+8
+	 add	%sp, FRAME_SIZE, %sp
+	cfi_def_cfa_offset(STACK_BIAS)
+	cfi_endproc
+	.size _ITM_beginTransaction, . - _ITM_beginTransaction
+
+	.align 4
+	.globl	GTM_longjmp
+#ifdef HAVE_ATTRIBUTE_VISIBILITY
+	.hidden	GTM_longjmp
+#endif
+	.type	GTM_longjmp, #function
+	.proc	016
+GTM_longjmp:
+	cfi_startproc
+	flushw
+	load	[%o1 + OFFSET (JB_CFA)], %fp
+	cfi_def_cfa(%fp, 0)
+#if STACK_BIAS
+	sub	%fp, STACK_BIAS, %fp
+	cfi_def_cfa_offset(STACK_BIAS)
+#endif
+	load	[%o1 + OFFSET (JB_PC)], %o7
+	jmp	%o7+8
+	 restore %g0, %o0, %o0
+	cfi_endproc
+	.size GTM_longjmp, . - GTM_longjmp
+
+#ifdef __linux__
+	.section .note.GNU-stack, "", @progbits
+#endif
Index: config/sparc/target.h
===================================================================
--- config/sparc/target.h	(revision 0)
+++ config/sparc/target.h	(revision 0)
@@ -0,0 +1,46 @@ 
+/* Copyright (C) 2012 Free Software Foundation, Inc.
+
+   This file is part of the GNU Transactional Memory Library (libitm).
+
+   Libitm is free software; you can redistribute it and/or modify it
+   under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   Libitm is distributed in the hope that it will be useful, but WITHOUT ANY
+   WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
+   FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+   more details.
+
+   Under Section 7 of GPL version 3, you are granted additional
+   permissions described in the GCC Runtime Library Exception, version
+   3.1, as published by the Free Software Foundation.
+
+   You should have received a copy of the GNU General Public License and
+   a copy of the GCC Runtime Library Exception along with this program;
+   see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+   <http://www.gnu.org/licenses/>.  */
+
+namespace GTM HIDDEN {
+
+typedef struct gtm_jmpbuf
+{
+  void *cfa;
+  unsigned long pc;
+} gtm_jmpbuf;
+
+/* UltraSPARC processors generally use a fixed page size of 8K.  */
+#define PAGE_SIZE	8192
+#define FIXED_PAGE_SIZE	1
+
+/* The size of one line in hardware caches (in bytes).  We use the primary
+   cache line size documented for the UltraSPARC T1/T2.  */
+#define HW_CACHELINE_SIZE 16
+
+static inline void
+cpu_relax (void)
+{
+  __asm volatile ("" : : : "memory");
+}
+
+} // namespace GTM
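
The frame-layout constants in config/sparc/sjlj.S and the gtm_jmpbuf structure in config/sparc/target.h have to agree with each other. The following host-side sketch recomputes the derived values from the constants in the patch. The `struct layout` type and the `make_layout` and `jb_field_offset` helpers are hypothetical names introduced only for illustration, and reading "double-word size" as 2 * WORD_SIZE is an assumption.

```c
#include <stddef.h>

/* Mirrors gtm_jmpbuf from config/sparc/target.h.  */
typedef struct gtm_jmpbuf
{
  void *cfa;
  unsigned long pc;
} gtm_jmpbuf;

/* Hypothetical container for the constants of config/sparc/sjlj.S.  */
struct layout
{
  long word_size;
  long min_frame_size;
  long stack_bias;
  long frame_size;	/* MIN_FRAME_SIZE + 2 * WORD_SIZE */
  long jb_offset;	/* STACK_BIAS + MIN_FRAME_SIZE */
};

/* Build the layout for 64-bit (ARCH64 nonzero) or 32-bit mode, using the
   values from the #ifdef __arch64__ block of the patch.  */
static struct layout
make_layout (int arch64)
{
  struct layout l;

  l.word_size = arch64 ? 8 : 4;
  l.min_frame_size = arch64 ? 176 : 96;
  l.stack_bias = arch64 ? 2047 : 0;
  l.frame_size = l.min_frame_size + 2 * l.word_size;
  l.jb_offset = l.stack_bias + l.min_frame_size;
  return l;
}

/* Equivalent of the OFFSET (FIELD) macro: byte offset of a JmpBuf field.  */
static long
jb_field_offset (const struct layout *l, int field)
{
  return field * l->word_size;
}
```

With these values the frame is 192 bytes in 64-bit mode and 104 bytes in 32-bit mode, multiples of 16 and 8 bytes respectively, and the jump buffer sits at %sp + 2223 and %sp + 96.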