Patchwork [trans-mem] Beginning of refactoring

login
register
mail settings
Submitter Torvald Riegel
Date June 30, 2011, 11:15 a.m.
Message ID <1309432557.3609.798.camel@triegel.csb>
Download mbox | patch
Permalink /patch/102743/
State New
Headers show

Comments

Torvald Riegel - June 30, 2011, 11:15 a.m.
On Wed, 2011-06-29 at 16:01 -0700, Richard Henderson wrote:
> On 05/25/2011 02:10 PM, Torvald Riegel wrote:
> > Patch 2: _ITM_dropReferences is not sufficiently defined in the ABI. It
> > seems to target some form of open nesting for txnal wrappers, but the
> > prose in the ABI specification is unclear. Thus, disable this for now
> > (aka fatal runtime error), and expect the related tests to fail. Pick it
> > up again once that the ABI has been improved and the use cases are
> > clear.
> 
> Sure, but please actually delete the code rather than just comment it out.

Fixed in updated patch2.

> > Patch 3: The actual change in how ABI calls are dispatched. Also,
> > removed method-readonly (broken, will in a similar form reappear in the
> > family of globallock-based algorithms), and disabled method-wbetl (needs
> > larger refactoring, will be revived/remerged later).
> 
> > +CREATE_DISPATCH_FUNCTIONS_T_MEMCPY(M256, GTM::abi_disp()->_, )
> > +CREATE_DISPATCH_FUNCTIONS_T_MEMCPY(M64, GTM::abi_disp()->_, )
> > +CREATE_DISPATCH_FUNCTIONS_T_MEMCPY(M128, GTM::abi_disp()->_, )
> > +CREATE_DISPATCH_FUNCTIONS(GTM::abi_disp()->_, )
> 
> What's the point of using "GTM::abi_disp()->_" as a mandatory argument?
> 
> Further, that second "M2" argument is universally empty.  What's that?

This prepares us to be later able to instantiate ABI mem access
functions without having to go through the vtable dispatch (thus,
avoiding the overhead of the vtable dispatch / indirection). This will
become useful as soon as GCC can create several code paths with
different instrumentation (ie, calling different sets of _ITM_*
functions) for each transaction. Together with LTO, this will allow us
to have the TM implementation in the library and still be as fast as
direct TM support in the compiler.

How it works is that in a TM-method-specific dispatch class we would add
  CREATE_DISPATCH_METHODS(static, _static)
in addition to the 
  CREATE_DISPATCH_METHODS(virtual, )
that is currently there. In barrier.cc we can then do
  CREATE_DISPATCH_FUNCTIONS(GTM::serial_dispatch::_, _static)
and thus bypass the virtual functions, but still keep one implementation
of the TM method (in the static template functions). This example would
create functions that just use the serial-mode accesses.

Right now, CREATE_DISPATCH_FUNCTIONS will always create the same set of
functions (with names as specified in the ABI), but this is because we
haven't defined yet how those we would separate different
instrumentations (eg, name mangling vs. sth. else).

The trailing underscore at the end of the first param is to make the
preprocessor happen. Is there a better way to let it accept
"GTM::abi_disp()->" as argument?

> 
> > +// Creates memcpy/memmove/memset methods.
> > +#define CREATE_DISPATCH_METHODS_MEM()  \
> > +virtual void _memtransfer(void *dst, const void* src, size_t size,     \
> > +bool may_overlap, ls_modifier dst_mod, ls_modifier src_mod)           \
> > +{                                                                     \
> > +  memtransfer_static(dst, src, size, may_overlap, dst_mod, src_mod);  \
> > +}                                                                     \
> > +virtual void _memset(void *dst, int c, size_t size, ls_modifier mod)   \
> > +{                                                                     \
> > +  memset_static(dst, c, size, mod);                                   \
> > +}
> 
> Why are the memtransfer and memset virtuals distinguished from the statics?
> For the patch as written it would seem to be ok to merge them.

For the same reason as above. We seem to need two entry points (virtual
and static) to make that work. In the updated patch3, I add leading
underscores to make the _static version above actually work.

> > +#define ITM_MEMTRANSFER_DEF(TARGET, M2, NAME, READ, WRITE) \
> > +void ITM_REGPARM _ITM_memcpy##NAME(void *dst, const void *src, size_t size)  \
> > +{                                                                            \
> > +  TARGET##memtransfer##M2 (dst, src, size,                                   \
> > +             false, GTM::abi_dispatch::WRITE, GTM::abi_dispatch::READ);      \
> > +}                                                                            \
> > +void ITM_REGPARM _ITM_memmove##NAME(void *dst, const void *src, size_t size) \
> > +{                                                                            \
> > +  if (GTM::abi_dispatch::NONTXNAL == GTM::abi_dispatch::WRITE ||             \
> > +      GTM::abi_dispatch::NONTXNAL == GTM::abi_dispatch::READ)                \
> > +    {                                                                        \
> > +      if (((uintptr_t)dst <= (uintptr_t)src ?                                \
> > +          (uintptr_t)dst + size > (uintptr_t)src :                           \
> > +          (uintptr_t)src + size > (uintptr_t)dst))                           \
> > +        GTM::GTM_fatal("_ITM_memmove overlapping and t/nt is not allowed");  \
> > +      else                                                                   \
> > +        TARGET##memtransfer##M2 (dst, src, size,                             \
> > +                false, GTM::abi_dispatch::WRITE, GTM::abi_dispatch::READ);   \
> > +    }                                                                        \
> > +  TARGET##memtransfer##M2 (dst, src, size,                                   \
> > +              true, GTM::abi_dispatch::WRITE, GTM::abi_dispatch::READ);      \
> > +}
> 
> Ok, I realize we need macros to generate the ABI names both here and in
> CREATE_DISPATCH_FUNCTIONS, but can we limit the code within macros to as
> little as absolutely possible?

The check is now in one of abi_dispatch's methods (implemented in
barrier.cc).

> Missing return/else in there?  Surely not two calls to memtransfer...

Thanks, fixed as well.

> > +protected:
> > +  /// Transactional load. Will be called from the dispatch methods
> > +  /// created below.
> > +  template <typename V> static V load(const V* addr, ls_modifier mod)
> > +  {
> > +    return *addr;
> > +  }
> > +  /// Transactional store. Will be called from the dispatch methods
> > +  /// created below.
> > +  template <typename V> static void store(V* addr, const V value,
> > +      ls_modifier mod)
> > +  {
> > +    *addr = value;
> > +  }
> 
> Why are these here in the base class?  I'm not sure I like having static
> functions that are required to be overridden at each instance, wherein if
> you forget to implement them the build silently succeeds, but of course
> does the wrong thing.

I added macros to generate pure-virtual methods for the base class.
Write/read-through access got moved to the serial_dispatch class.

> What's with the 3 /// comments?  Can we stick with existing gnu standards?

Old doxygen habits. Removed.

> > -#define UNUSED		__attribute__((unused))
> > -#define ALWAYS_INLINE	__attribute__((always_inline))
> > -#ifdef HAVE_ATTRIBUTE_VISIBILITY
> > -# define HIDDEN		__attribute__((visibility("hidden")))
> > -#else
> > -# define HIDDEN
> > -#endif
> > -
> > -#define likely(X)	__builtin_expect((X) != 0, 1)
> > -#define unlikely(X)	__builtin_expect((X), 0)
> > +#include "common.h"
> 
> Why?  We're already in a header that's clearly marked "internal".

I moved these macros to a common header so libitm_i.h doesn't have to be
included everywhere. There was some circular dependency after the
changes (I think in via the barrier definitions for SSE etc.). What's
the issue that you see with this change?

Are the attached updated patches okay for the branch?


Torvald
commit f00a5faa853c268c6d17f2c458f9aa53dfbff9dc
Author: Torvald Riegel <triegel@redhat.com>
Date:   Tue May 24 18:30:28 2011 +0200

    Do not support _ITM_dropReferences anymore
    
    	* alloc_c.cc (_ITM_dropReferences): Don't support it anymore.
    	* testsuite/libitm.c++/dropref.C: _ITM_dropReferences is expected to fail.
    	* testsuite/libitm.c/dropref-2.c: Same.
    	* testsuite/libitm.c/dropref.c: Same.
commit be07f99348ea793529230c5a2238625b826d77ae
Author: Torvald Riegel <triegel@redhat.com>
Date:   Wed May 25 17:21:53 2011 +0200

    New dispatch class/functions. Remove/disable old methods (readonly, wbetl).
    
    	* libitm_i.h: Move parts to common.h and dispatch.h.
    	* common.h: New file.
    	* dispatch.h: New file, new dispatch class.
    	Rename GTM::abi_dispatch::lock_type to ls_modifier.
    	RenameGTM::abi_dispatch::NOLOCK to NONTXNAL.
    	* beginend.cc (GTM::gtm_transaction::begin_transaction): Delegate mode
    	decision to retry.cc.
    	* retry.cc (GTM::gtm_transaction::decide_retry_strategy): Use serial mode
    	only.
    	(GTM::gtm_transaction::decide_begin_dispatch): Same.
    	* method-serial.cc: Adapt to new dispatch. Add serial mode with undo
    	logging.
    	* barrier.cc: Use new barriers definitions.
    	* config/x86/x86_sse.cc: Same.
    	* config/x86/x86_avx.cc: Same.
    	* Makefile.am: Don't build readonly and wbetl methods, memset.cc and
    	memcpy.cc.
    	* Makefile.in: Rebuild.
    	* method-readonly.cc: Remove.
    	* method-wbetl.cc: Rename GTM::abi_dispatch::lock_type to ls_modifier.
    	Rename GTM::abi_dispatch::NOLOCK to NONTXNAL.

diff --git a/libitm/Makefile.am b/libitm/Makefile.am
index 8ad4e5f..bf0180d 100644
--- a/libitm/Makefile.am
+++ b/libitm/Makefile.am
@@ -41,9 +41,9 @@ libitm_la_LDFLAGS = $(libitm_version_info) $(libitm_version_script) \
 
 libitm_la_SOURCES = \
 	aatree.cc alloc.cc alloc_c.cc alloc_cpp.cc barrier.cc beginend.cc \
-	clone.cc cacheline.cc cachepage.cc eh_cpp.cc local.cc memcpy.cc \
-	memset.cc query.cc retry.cc rwlock.cc useraction.cc util.cc \
-	sjlj.S tls.cc method-serial.cc method-readonly.cc method-wbetl.cc
+	clone.cc cacheline.cc cachepage.cc eh_cpp.cc local.cc \
+	query.cc retry.cc rwlock.cc useraction.cc util.cc \
+	sjlj.S tls.cc method-serial.cc
 
 if ARCH_X86
 libitm_la_SOURCES += x86_sse.cc x86_avx.cc
diff --git a/libitm/Makefile.in b/libitm/Makefile.in
index ad03c27..8fc3106 100644
--- a/libitm/Makefile.in
+++ b/libitm/Makefile.in
@@ -93,17 +93,15 @@ LTLIBRARIES = $(toolexeclib_LTLIBRARIES)
 libitm_la_LIBADD =
 am__libitm_la_SOURCES_DIST = aatree.cc alloc.cc alloc_c.cc \
 	alloc_cpp.cc barrier.cc beginend.cc clone.cc cacheline.cc \
-	cachepage.cc eh_cpp.cc local.cc memcpy.cc memset.cc query.cc \
+	cachepage.cc eh_cpp.cc local.cc query.cc \
 	retry.cc rwlock.cc useraction.cc util.cc sjlj.S tls.cc \
-	method-serial.cc method-readonly.cc method-wbetl.cc x86_sse.cc \
-	x86_avx.cc
+	method-serial.cc x86_sse.cc 86_avx.cc
 @ARCH_X86_TRUE@am__objects_1 = x86_sse.lo x86_avx.lo
 am_libitm_la_OBJECTS = aatree.lo alloc.lo alloc_c.lo alloc_cpp.lo \
 	barrier.lo beginend.lo clone.lo cacheline.lo cachepage.lo \
-	eh_cpp.lo local.lo memcpy.lo memset.lo query.lo retry.lo \
+	eh_cpp.lo local.lo query.lo retry.lo \
 	rwlock.lo useraction.lo util.lo sjlj.lo tls.lo \
-	method-serial.lo method-readonly.lo method-wbetl.lo \
-	$(am__objects_1)
+	method-serial.lo $(am__objects_1)
 libitm_la_OBJECTS = $(am_libitm_la_OBJECTS)
 DEFAULT_INCLUDES = -I.@am__isrc@
 depcomp = $(SHELL) $(top_srcdir)/../depcomp
@@ -342,6 +340,7 @@ AM_CPPFLAGS = $(addprefix -I, $(search_path))
 AM_CFLAGS = $(XCFLAGS)
 AM_CXXFLAGS = -std=gnu++0x -funwind-tables -fno-exceptions -fno-rtti \
 	$(XCFLAGS) $(abi_version)
+
 AM_CCASFLAGS = $(XCFLAGS)
 AM_LDFLAGS = $(XLDFLAGS) $(SECTION_LDFLAGS) $(OPT_LDFLAGS)
 toolexeclib_LTLIBRARIES = libitm.la
@@ -358,9 +357,9 @@ libitm_la_LDFLAGS = $(libitm_version_info) $(libitm_version_script) \
 
 libitm_la_SOURCES = aatree.cc alloc.cc alloc_c.cc alloc_cpp.cc \
 	barrier.cc beginend.cc clone.cc cacheline.cc cachepage.cc \
-	eh_cpp.cc local.cc memcpy.cc memset.cc query.cc retry.cc \
+	eh_cpp.cc local.cc query.cc retry.cc \
 	rwlock.cc useraction.cc util.cc sjlj.S tls.cc method-serial.cc \
-	method-readonly.cc method-wbetl.cc $(am__append_1)
+	$(am__append_1)
 all: config.h
 	$(MAKE) $(AM_MAKEFLAGS) all-recursive
 
@@ -470,11 +469,7 @@ distclean-compile:
 @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/clone.Plo@am__quote@
 @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/eh_cpp.Plo@am__quote@
 @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/local.Plo@am__quote@
-@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/memcpy.Plo@am__quote@
-@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/memset.Plo@am__quote@
-@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/method-readonly.Plo@am__quote@
 @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/method-serial.Plo@am__quote@
-@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/method-wbetl.Plo@am__quote@
 @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/query.Plo@am__quote@
 @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/retry.Plo@am__quote@
 @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/rwlock.Plo@am__quote@
diff --git a/libitm/barrier.cc b/libitm/barrier.cc
index 52e5a08..10424d0 100644
--- a/libitm/barrier.cc
+++ b/libitm/barrier.cc
@@ -1,4 +1,4 @@
-/* Copyright (C) 2008, 2009 Free Software Foundation, Inc.
+/* Copyright (C) 2008, 2009, 2011 Free Software Foundation, Inc.
    Contributed by Richard Henderson <rth@redhat.com>.
 
    This file is part of the GNU Transactional Memory Library (libitm).
@@ -23,15 +23,23 @@
    <http://www.gnu.org/licenses/>.  */
 
 #include "libitm_i.h"
-#include "barrier.tpl"
-
-ITM_BARRIERS(U1)
-ITM_BARRIERS(U2)
-ITM_BARRIERS(U4)
-ITM_BARRIERS(U8)
-ITM_BARRIERS(F)
-ITM_BARRIERS(D)
-ITM_BARRIERS(E)
-ITM_BARRIERS(CF)
-ITM_BARRIERS(CD)
-ITM_BARRIERS(CE)
+
+using namespace GTM;
+
+bool abi_dispatch::memmove_overlap_check(void *dst, const void *src,
+    size_t size, ls_modifier dst_mod, ls_modifier src_mod)
+{
+  if (dst_mod == NONTXNAL || src_mod == NONTXNAL)
+    {
+      if (((uintptr_t)dst <= (uintptr_t)src ?
+          (uintptr_t)dst + size > (uintptr_t)src :
+          (uintptr_t)src + size > (uintptr_t)dst))
+        GTM::GTM_fatal("_ITM_memmove overlapping and t/nt is not allowed");
+      return false;
+    }
+  return true;
+}
+
+CREATE_DISPATCH_FUNCTIONS(GTM::abi_disp()->_, )
+//CREATE_DISPATCH_FUNCTIONS(GTM::abi_dispatch::_, _static)
+
diff --git a/libitm/beginend.cc b/libitm/beginend.cc
index 37e23e2..7ed5073 100644
--- a/libitm/beginend.cc
+++ b/libitm/beginend.cc
@@ -1,4 +1,4 @@
-/* Copyright (C) 2008, 2009 Free Software Foundation, Inc.
+/* Copyright (C) 2008, 2009, 2011 Free Software Foundation, Inc.
    Contributed by Richard Henderson <rth@redhat.com>.
 
    This file is part of the GNU Transactional Memory Library (libitm).
@@ -128,13 +128,27 @@ GTM::gtm_transaction::begin_transaction (uint32_t prop, const gtm_jmpbuf *jb)
 
   set_gtm_tx (tx);
 
+  // ??? pr_undoLogCode is not properly defined in the ABI. Are barriers
+  // omitted because they are not necessary (e.g., a transaction on thread-
+  // local data) or because the compiler thinks that some kind of global
+  // synchronization might perform better?
+  if (unlikely(prop & pr_undoLogCode))
+    GTM_fatal("pr_undoLogCode not supported");
+
   if ((prop & pr_doesGoIrrevocable) || !(prop & pr_instrumentedCode))
+    tx->state = (STATE_SERIAL | STATE_IRREVOCABLE);
+
+  else
+    disp = tx->decide_begin_dispatch ();
+
+  if (tx->state & STATE_SERIAL)
     {
       serial_lock.write_lock ();
 
-      tx->state = (STATE_SERIAL | STATE_IRREVOCABLE);
-
-      disp = dispatch_serial ();
+      if (tx->state & STATE_IRREVOCABLE)
+        disp = dispatch_serialirr ();
+      else
+        disp = dispatch_serial ();
 
       ret = a_runUninstrumentedCode;
       if ((prop & pr_multiwayCode) == pr_instrumentedCode)
@@ -143,14 +157,6 @@ GTM::gtm_transaction::begin_transaction (uint32_t prop, const gtm_jmpbuf *jb)
   else
     {
       serial_lock.read_lock ();
-
-      // ??? Probably want some environment variable to choose the default
-      // STM implementation once we have more than one implemented.
-      if (prop & pr_readOnly)
-	disp = dispatch_readonly ();
-      else
-	disp = dispatch_wbetl ();
-
       ret = a_runInstrumentedCode | a_saveLiveVariables;
     }
 
diff --git a/libitm/common.h b/libitm/common.h
new file mode 100644
index 0000000..d94ca94
--- /dev/null
+++ b/libitm/common.h
@@ -0,0 +1,43 @@
+/* Copyright (C) 2008, 2009 Free Software Foundation, Inc.
+   Contributed by Richard Henderson <rth@redhat.com>.
+
+   This file is part of the GNU Transactional Memory Library (libitm).
+
+   Libitm is free software; you can redistribute it and/or modify it
+   under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   Libitm is distributed in the hope that it will be useful, but WITHOUT ANY
+   WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
+   FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+   more details.
+
+   Under Section 7 of GPL version 3, you are granted additional
+   permissions described in the GCC Runtime Library Exception, version
+   3.1, as published by the Free Software Foundation.
+
+   You should have received a copy of the GNU General Public License and
+   a copy of the GCC Runtime Library Exception along with this program;
+   see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+   <http://www.gnu.org/licenses/>.  */
+
+/* The following are internal implementation functions and definitions.
+   To distinguish them from those defined by the Intel ABI, they all
+   begin with GTM/gtm.  */
+
+#ifndef COMMON_H
+#define COMMON_H 1
+
+#define UNUSED		__attribute__((unused))
+#define ALWAYS_INLINE	__attribute__((always_inline))
+#ifdef HAVE_ATTRIBUTE_VISIBILITY
+# define HIDDEN		__attribute__((visibility("hidden")))
+#else
+# define HIDDEN
+#endif
+
+#define likely(X)	__builtin_expect((X) != 0, 1)
+#define unlikely(X)	__builtin_expect((X), 0)
+
+#endif // COMMON_H
diff --git a/libitm/config/x86/x86_avx.cc b/libitm/config/x86/x86_avx.cc
index ee073d5..f81e3b8 100644
--- a/libitm/config/x86/x86_avx.cc
+++ b/libitm/config/x86/x86_avx.cc
@@ -1,4 +1,4 @@
-/* Copyright (C) 2009 Free Software Foundation, Inc.
+/* Copyright (C) 2009, 2011 Free Software Foundation, Inc.
    Contributed by Richard Henderson <rth@redhat.com>.
 
    This file is part of the GNU Transactional Memory Library (libitm).
@@ -23,14 +23,16 @@
    <http://www.gnu.org/licenses/>.  */
 
 #include "libitm_i.h"
-#include "barrier.tpl"
+#include "dispatch.h"
 
-ITM_BARRIERS(M256)
+// ??? Use memcpy for now, until we have figured out how to best instantiate
+// these loads/stores.
+CREATE_DISPATCH_FUNCTIONS_T_MEMCPY(M256, GTM::abi_disp()->_, )
 
 void ITM_REGPARM
 _ITM_LM256 (const _ITM_TYPE_M256 *ptr)
 {
-  GTM_LB (ptr, sizeof (*ptr));
+  GTM::GTM_LB (ptr, sizeof (*ptr));
 }
 
 // Helpers for re-aligning two 128-bit values.
diff --git a/libitm/config/x86/x86_sse.cc b/libitm/config/x86/x86_sse.cc
index 5bb9766..a8585e4 100644
--- a/libitm/config/x86/x86_sse.cc
+++ b/libitm/config/x86/x86_sse.cc
@@ -1,4 +1,4 @@
-/* Copyright (C) 2009 Free Software Foundation, Inc.
+/* Copyright (C) 2009, 2011 Free Software Foundation, Inc.
    Contributed by Richard Henderson <rth@redhat.com>.
 
    This file is part of the GNU Transactional Memory Library (libitm).
@@ -23,21 +23,23 @@
    <http://www.gnu.org/licenses/>.  */
 
 #include "libitm_i.h"
-#include "barrier.tpl"
+#include "dispatch.h"
 
-ITM_BARRIERS(M64)
-ITM_BARRIERS(M128)
+// ??? Use memcpy for now, until we have figured out how to best instantiate
+// these loads/stores.
+CREATE_DISPATCH_FUNCTIONS_T_MEMCPY(M64, GTM::abi_disp()->_, )
+CREATE_DISPATCH_FUNCTIONS_T_MEMCPY(M128, GTM::abi_disp()->_, )
 
 void ITM_REGPARM
 _ITM_LM64 (const _ITM_TYPE_M64 *ptr)
 {
-  GTM_LB (ptr, sizeof (*ptr));
+  GTM::GTM_LB (ptr, sizeof (*ptr));
 }
 
 void ITM_REGPARM
 _ITM_LM128 (const _ITM_TYPE_M128 *ptr)
 {
-  GTM_LB (ptr, sizeof (*ptr));
+  GTM::GTM_LB (ptr, sizeof (*ptr));
 }
 
 // Helpers for re-aligning two 128-bit values.
diff --git a/libitm/dispatch.h b/libitm/dispatch.h
new file mode 100644
index 0000000..955cfab
--- /dev/null
+++ b/libitm/dispatch.h
@@ -0,0 +1,283 @@
+/* Copyright (C) 2011 Free Software Foundation, Inc.
+   Contributed by Torvald Riegel <triegel@redhat.com>.
+
+   This file is part of the GNU Transactional Memory Library (libitm).
+
+   Libitm is free software; you can redistribute it and/or modify it
+   under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   Libitm is distributed in the hope that it will be useful, but WITHOUT ANY
+   WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
+   FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+   more details.
+
+   Under Section 7 of GPL version 3, you are granted additional
+   permissions described in the GCC Runtime Library Exception, version
+   3.1, as published by the Free Software Foundation.
+
+   You should have received a copy of the GNU General Public License and
+   a copy of the GCC Runtime Library Exception along with this program;
+   see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#ifndef DISPATCH_H
+#define DISPATCH_H 1
+
+#include "libitm.h"
+#include "common.h"
+
+// Creates ABI load/store methods (can be made virtual or static using M,
+// use M2 to create separate methods names for virtual and static)
+// The _PV variants are for the pure-virtual methods in the base class.
+#define ITM_READ_M(T, LSMOD, M, M2)                                         \
+  M _ITM_TYPE_##T ITM_REGPARM _ITM_##LSMOD##T##M2 (const _ITM_TYPE_##T *ptr)\
+  {                                                                         \
+    return load(ptr, abi_dispatch::LSMOD);                                  \
+  }
+
+#define ITM_READ_M_PV(T, LSMOD, M, M2)                                      \
+  M _ITM_TYPE_##T ITM_REGPARM _ITM_##LSMOD##T##M2 (const _ITM_TYPE_##T *ptr)\
+  = 0;
+
+#define ITM_WRITE_M(T, LSMOD, M, M2)                         \
+  M void ITM_REGPARM _ITM_##LSMOD##T##M2 (_ITM_TYPE_##T *ptr,\
+                                         _ITM_TYPE_##T val)  \
+  {                                                          \
+    store(ptr, val, abi_dispatch::LSMOD);                    \
+  }
+
+#define ITM_WRITE_M_PV(T, LSMOD, M, M2)                      \
+  M void ITM_REGPARM _ITM_##LSMOD##T##M2 (_ITM_TYPE_##T *ptr,\
+                                         _ITM_TYPE_##T val)  \
+  = 0;
+
+// Creates ABI load/store methods for all load/store modifiers for a particular
+// type.
+#define CREATE_DISPATCH_METHODS_T(T, M, M2) \
+  ITM_READ_M(T, R, M, M2)                \
+  ITM_READ_M(T, RaR, M, M2)              \
+  ITM_READ_M(T, RaW, M, M2)              \
+  ITM_READ_M(T, RfW, M, M2)              \
+  ITM_WRITE_M(T, W, M, M2)               \
+  ITM_WRITE_M(T, WaR, M, M2)             \
+  ITM_WRITE_M(T, WaW, M, M2)
+#define CREATE_DISPATCH_METHODS_T_PV(T, M, M2) \
+  ITM_READ_M_PV(T, R, M, M2)                \
+  ITM_READ_M_PV(T, RaR, M, M2)              \
+  ITM_READ_M_PV(T, RaW, M, M2)              \
+  ITM_READ_M_PV(T, RfW, M, M2)              \
+  ITM_WRITE_M_PV(T, W, M, M2)               \
+  ITM_WRITE_M_PV(T, WaR, M, M2)             \
+  ITM_WRITE_M_PV(T, WaW, M, M2)
+
+// Creates ABI load/store methods for all types.
+// See CREATE_DISPATCH_FUNCTIONS for comments.
+#define CREATE_DISPATCH_METHODS(M, M2)  \
+  CREATE_DISPATCH_METHODS_T (U1, M, M2) \
+  CREATE_DISPATCH_METHODS_T (U2, M, M2) \
+  CREATE_DISPATCH_METHODS_T (U4, M, M2) \
+  CREATE_DISPATCH_METHODS_T (U8, M, M2) \
+  CREATE_DISPATCH_METHODS_T (F, M, M2)  \
+  CREATE_DISPATCH_METHODS_T (D, M, M2)  \
+  CREATE_DISPATCH_METHODS_T (E, M, M2)  \
+  CREATE_DISPATCH_METHODS_T (CF, M, M2) \
+  CREATE_DISPATCH_METHODS_T (CD, M, M2) \
+  CREATE_DISPATCH_METHODS_T (CE, M, M2)
+#define CREATE_DISPATCH_METHODS_PV(M, M2)  \
+  CREATE_DISPATCH_METHODS_T_PV (U1, M, M2) \
+  CREATE_DISPATCH_METHODS_T_PV (U2, M, M2) \
+  CREATE_DISPATCH_METHODS_T_PV (U4, M, M2) \
+  CREATE_DISPATCH_METHODS_T_PV (U8, M, M2) \
+  CREATE_DISPATCH_METHODS_T_PV (F, M, M2)  \
+  CREATE_DISPATCH_METHODS_T_PV (D, M, M2)  \
+  CREATE_DISPATCH_METHODS_T_PV (E, M, M2)  \
+  CREATE_DISPATCH_METHODS_T_PV (CF, M, M2) \
+  CREATE_DISPATCH_METHODS_T_PV (CD, M, M2) \
+  CREATE_DISPATCH_METHODS_T_PV (CE, M, M2)
+
+// Creates memcpy/memmove/memset methods.
+#define CREATE_DISPATCH_METHODS_MEM()  \
+virtual void _memtransfer(void *dst, const void* src, size_t size,    \
+    bool may_overlap, ls_modifier dst_mod, ls_modifier src_mod)       \
+{                                                                     \
+  _memtransfer_static(dst, src, size, may_overlap, dst_mod, src_mod); \
+}                                                                     \
+virtual void _memset(void *dst, int c, size_t size, ls_modifier mod)  \
+{                                                                     \
+  _memset_static(dst, c, size, mod);                                  \
+}
+
+#define CREATE_DISPATCH_METHODS_MEM_PV()  \
+virtual void _memtransfer(void *dst, const void* src, size_t size,       \
+    bool may_overlap, ls_modifier dst_mod, ls_modifier src_mod) = 0;     \
+virtual void _memset(void *dst, int c, size_t size, ls_modifier mod) = 0;
+
+
+// Creates ABI load/store functions that can target either a class or an
+// object.
+#define ITM_READ(T, LSMOD, TARGET, M2)                                 \
+  _ITM_TYPE_##T ITM_REGPARM _ITM_##LSMOD##T (const _ITM_TYPE_##T *ptr) \
+  {                                                                    \
+    return TARGET##ITM_##LSMOD##T##M2(ptr);                            \
+  }
+
+#define ITM_WRITE(T, LSMOD, TARGET, M2)                                    \
+  void ITM_REGPARM _ITM_##LSMOD##T (_ITM_TYPE_##T *ptr, _ITM_TYPE_##T val) \
+  {                                                                        \
+    TARGET##ITM_##LSMOD##T##M2(ptr, val);                                  \
+  }
+
+// Creates ABI load/store functions for all load/store modifiers for a
+// particular type.
+#define CREATE_DISPATCH_FUNCTIONS_T(T, TARGET, M2) \
+  ITM_READ(T, R, TARGET, M2)                \
+  ITM_READ(T, RaR, TARGET, M2)              \
+  ITM_READ(T, RaW, TARGET, M2)              \
+  ITM_READ(T, RfW, TARGET, M2)              \
+  ITM_WRITE(T, W, TARGET, M2)               \
+  ITM_WRITE(T, WaR, TARGET, M2)             \
+  ITM_WRITE(T, WaW, TARGET, M2)
+
+// Creates ABI memcpy/memmove/memset functions.
+#define ITM_MEMTRANSFER_DEF(TARGET, M2, NAME, READ, WRITE) \
+void ITM_REGPARM _ITM_memcpy##NAME(void *dst, const void *src, size_t size)  \
+{                                                                            \
+  TARGET##memtransfer##M2 (dst, src, size,                                   \
+             false, GTM::abi_dispatch::WRITE, GTM::abi_dispatch::READ);      \
+}                                                                            \
+void ITM_REGPARM _ITM_memmove##NAME(void *dst, const void *src, size_t size) \
+{                                                                            \
+  TARGET##memtransfer##M2 (dst, src, size,                                   \
+      GTM::abi_dispatch::memmove_overlap_check(dst, src, size,               \
+          GTM::abi_dispatch::WRITE, GTM::abi_dispatch::READ),                \
+      GTM::abi_dispatch::WRITE, GTM::abi_dispatch::READ);                    \
+}
+
+#define ITM_MEMSET_DEF(TARGET, M2, WRITE) \
+void ITM_REGPARM _ITM_memset##WRITE(void *dst, int c, size_t size) \
+{                                                                  \
+  TARGET##memset##M2 (dst, c, size, GTM::abi_dispatch::WRITE);     \
+}                                                                  \
+
+
+// ??? The number of virtual methods is large (7*4 for integers, 7*6 for FP,
+// 7*3 for vectors). Is the cache footprint so costly that we should go for
+// a small table instead (i.e., only have two virtual load/store methods for
+// each supported type)? Note that this doesn't affect custom code paths at
+// all because these use only direct calls.
+// A large cache footprint could especially decrease HTM performance (due
+// to HTM capacity). We could add the modifier (RaR etc.) as parameter, which
+// would give us just 4*2+6*2+3*2 functions (so we'd just need one line for
+// the integer loads/stores), but then the modifier can be checked only at
+// runtime.
+// For memcpy/memmove/memset, we just have two virtual methods (memtransfer
+// and memset).
+#define CREATE_DISPATCH_FUNCTIONS(TARGET, M2)  \
+  CREATE_DISPATCH_FUNCTIONS_T (U1, TARGET, M2) \
+  CREATE_DISPATCH_FUNCTIONS_T (U2, TARGET, M2) \
+  CREATE_DISPATCH_FUNCTIONS_T (U4, TARGET, M2) \
+  CREATE_DISPATCH_FUNCTIONS_T (U8, TARGET, M2) \
+  CREATE_DISPATCH_FUNCTIONS_T (F, TARGET, M2)  \
+  CREATE_DISPATCH_FUNCTIONS_T (D, TARGET, M2)  \
+  CREATE_DISPATCH_FUNCTIONS_T (E, TARGET, M2)  \
+  CREATE_DISPATCH_FUNCTIONS_T (CF, TARGET, M2) \
+  CREATE_DISPATCH_FUNCTIONS_T (CD, TARGET, M2) \
+  CREATE_DISPATCH_FUNCTIONS_T (CE, TARGET, M2) \
+  ITM_MEMTRANSFER_DEF(TARGET, M2, RnWt,     NONTXNAL, W)      \
+  ITM_MEMTRANSFER_DEF(TARGET, M2, RnWtaR,   NONTXNAL, WaR)    \
+  ITM_MEMTRANSFER_DEF(TARGET, M2, RnWtaW,   NONTXNAL, WaW)    \
+  ITM_MEMTRANSFER_DEF(TARGET, M2, RtWn,     R,      NONTXNAL) \
+  ITM_MEMTRANSFER_DEF(TARGET, M2, RtWt,     R,      W)        \
+  ITM_MEMTRANSFER_DEF(TARGET, M2, RtWtaR,   R,      WaR)      \
+  ITM_MEMTRANSFER_DEF(TARGET, M2, RtWtaW,   R,      WaW)      \
+  ITM_MEMTRANSFER_DEF(TARGET, M2, RtaRWn,   RaR,    NONTXNAL) \
+  ITM_MEMTRANSFER_DEF(TARGET, M2, RtaRWt,   RaR,    W)        \
+  ITM_MEMTRANSFER_DEF(TARGET, M2, RtaRWtaR, RaR,    WaR)      \
+  ITM_MEMTRANSFER_DEF(TARGET, M2, RtaRWtaW, RaR,    WaW)      \
+  ITM_MEMTRANSFER_DEF(TARGET, M2, RtaWWn,   RaW,    NONTXNAL) \
+  ITM_MEMTRANSFER_DEF(TARGET, M2, RtaWWt,   RaW,    W)        \
+  ITM_MEMTRANSFER_DEF(TARGET, M2, RtaWWtaR, RaW,    WaR)      \
+  ITM_MEMTRANSFER_DEF(TARGET, M2, RtaWWtaW, RaW,    WaW)      \
+  ITM_MEMSET_DEF(TARGET, M2, W)   \
+  ITM_MEMSET_DEF(TARGET, M2, WaR) \
+  ITM_MEMSET_DEF(TARGET, M2, WaW)
+
+
+// Creates ABI load/store functions that delegate to a transactional memcpy.
+#define ITM_READ_MEMCPY(T, LSMOD, TARGET, M2)                         \
+  _ITM_TYPE_##T ITM_REGPARM _ITM_##LSMOD##T (const _ITM_TYPE_##T *ptr)\
+  {                                                                   \
+    _ITM_TYPE_##T v;                                                  \
+    TARGET##memtransfer##M2(&v, ptr, sizeof(_ITM_TYPE_##T), false,    \
+        GTM::abi_dispatch::NONTXNAL, GTM::abi_dispatch::LSMOD);       \
+    return v;                                                         \
+  }
+
+#define ITM_WRITE_MEMCPY(T, LSMOD, TARGET, M2)                            \
+  void ITM_REGPARM _ITM_##LSMOD##T (_ITM_TYPE_##T *ptr, _ITM_TYPE_##T val)\
+  {                                                                       \
+    TARGET##memtransfer##M2(ptr, &val, sizeof(_ITM_TYPE_##T), false,      \
+        GTM::abi_dispatch::LSMOD, GTM::abi_dispatch::NONTXNAL);           \
+  }
+
+#define CREATE_DISPATCH_FUNCTIONS_T_MEMCPY(T, TARGET, M2) \
+  ITM_READ_MEMCPY(T, R, TARGET, M2)                \
+  ITM_READ_MEMCPY(T, RaR, TARGET, M2)              \
+  ITM_READ_MEMCPY(T, RaW, TARGET, M2)              \
+  ITM_READ_MEMCPY(T, RfW, TARGET, M2)              \
+  ITM_WRITE_MEMCPY(T, W, TARGET, M2)               \
+  ITM_WRITE_MEMCPY(T, WaR, TARGET, M2)             \
+  ITM_WRITE_MEMCPY(T, WaW, TARGET, M2)
+
+
+namespace GTM HIDDEN {
+
+// This pass-through method is the basis for other methods.
+// It can be used for serial-irrevocable mode.
+struct abi_dispatch
+{
+public:
+  enum ls_modifier { NONTXNAL, R, RaR, RaW, RfW, W, WaR, WaW };
+
+private:
+  // Disallow copies
+  abi_dispatch(const abi_dispatch &) = delete;
+  abi_dispatch& operator=(const abi_dispatch &) = delete;
+
+public:
+  virtual bool trycommit() = 0;
+  virtual void rollback() = 0;
+  virtual void reinit() = 0;
+
+  // Use fini instead of dtor to support a static subclasses that uses
+  // a unique object and so we don't want to destroy it from common code.
+  virtual void fini() = 0;
+
+  bool read_only () const { return m_read_only; }
+  bool write_through() const { return m_write_through; }
+
+  static void *operator new(size_t s) { return xmalloc (s); }
+  static void operator delete(void *p) { free (p); }
+
+public:
+  static bool memmove_overlap_check(void *dst, const void *src, size_t size,
+      ls_modifier dst_mod, ls_modifier src_mod);
+
+  // Creates the ABI dispatch methods for loads and stores.
+  // ??? Should the dispatch table instead be embedded in the dispatch object
+  // to avoid the indirect lookup in the vtable?
+  CREATE_DISPATCH_METHODS_PV(virtual, )
+  // Creates the ABI dispatch methods for memcpy/memmove/memset.
+  CREATE_DISPATCH_METHODS_MEM_PV()
+
+protected:
+  const bool m_read_only;
+  const bool m_write_through;
+  abi_dispatch(bool ro, bool wt) : m_read_only(ro), m_write_through(wt) { }
+};
+
+}
+
+#endif // DISPATCH_H
diff --git a/libitm/libitm_i.h b/libitm/libitm_i.h
index 3291f1b..d323d45 100644
--- a/libitm/libitm_i.h
+++ b/libitm/libitm_i.h
@@ -1,4 +1,4 @@
-/* Copyright (C) 2008, 2009 Free Software Foundation, Inc.
+/* Copyright (C) 2008, 2009, 2011 Free Software Foundation, Inc.
    Contributed by Richard Henderson <rth@redhat.com>.
 
    This file is part of the GNU Transactional Memory Library (libitm).
@@ -38,16 +38,7 @@
 #include <unwind.h>
 #include <type_traits>
 
-#define UNUSED		__attribute__((unused))
-#define ALWAYS_INLINE	__attribute__((always_inline))
-#ifdef HAVE_ATTRIBUTE_VISIBILITY
-# define HIDDEN		__attribute__((visibility("hidden")))
-#else
-# define HIDDEN
-#endif
-
-#define likely(X)	__builtin_expect((X) != 0, 1)
-#define unlikely(X)	__builtin_expect((X), 0)
+#include "common.h"
 
 namespace GTM HIDDEN {
 
@@ -82,57 +73,12 @@ extern void * xrealloc (void *p, size_t s) __attribute__((malloc, nothrow));
 #include "cacheline.h"
 #include "cachepage.h"
 #include "stmlock.h"
+#include "dispatch.h"
 
 namespace GTM HIDDEN {
 
 // A dispatch table parameterizes the implementation of the STM.
-struct abi_dispatch
-{
- public:
-  enum lock_type { NOLOCK, R, RaR, RaW, RfW, W, WaR, WaW };
-
-  struct mask_pair
-  {
-    gtm_cacheline *line;
-    gtm_cacheline_mask *mask;
-
-    mask_pair() = default;
-    mask_pair(gtm_cacheline *l, gtm_cacheline_mask *m) : line(l), mask(m) { }
-  };
-
- private:
-  // Disallow copies
-  abi_dispatch(const abi_dispatch &) = delete;
-  abi_dispatch& operator=(const abi_dispatch &) = delete;
-
- public:
-  // The default version of these is pass-through.  This merely gives the
-  // a single location to instantiate the base class vtable.
-  virtual const gtm_cacheline *read_lock(const gtm_cacheline *, lock_type);
-  virtual mask_pair write_lock(gtm_cacheline *, lock_type);
-
-  virtual bool trycommit() = 0;
-  virtual void rollback() = 0;
-  virtual void reinit() = 0;
-  virtual bool trydropreference(void *, size_t) = 0;
-
-  // Use fini instead of dtor to support a static subclasses that uses
-  // a unique object and so we don't want to destroy it from common code.
-  virtual void fini() = 0;
-
-  bool read_only () const { return m_read_only; }
-  bool write_through() const { return m_write_through; }
-
-  static void *operator new(size_t s) { return xmalloc (s); }
-  static void operator delete(void *p) { free (p); }
-
- protected:
-  const bool m_read_only;
-  const bool m_write_through;
-  abi_dispatch(bool ro, bool wt) : m_read_only(ro), m_write_through(wt) { }
-
-  static gtm_cacheline_mask mask_sink;
-};
+struct abi_dispatch;
 
 // These values are given to GTM_restart_transaction and indicate the
 // reason for the restart.  The reason is used to decide what STM
@@ -256,6 +202,7 @@ struct gtm_transaction
 
   // In retry.cc
   void decide_retry_strategy (gtm_restart_reason);
+  abi_dispatch* decide_begin_dispatch ();
 
   // In method-serial.cc
   void serialirr_mode ();
@@ -287,9 +234,8 @@ extern void GTM_error (const char *fmt, ...)
 extern void GTM_fatal (const char *fmt, ...)
 	__attribute__((noreturn, format (printf, 1, 2)));
 
-extern abi_dispatch *dispatch_wbetl();
-extern abi_dispatch *dispatch_readonly();
 extern abi_dispatch *dispatch_serial();
+extern abi_dispatch *dispatch_serialirr();
 
 extern gtm_cacheline_mask gtm_mask_stack(gtm_cacheline *, gtm_cacheline_mask);
 
diff --git a/libitm/method-readonly.cc b/libitm/method-readonly.cc
deleted file mode 100644
index ed573e2..0000000
--- a/libitm/method-readonly.cc
+++ /dev/null
@@ -1,124 +0,0 @@
-/* Copyright (C) 2009 Free Software Foundation, Inc.
-   Contributed by Richard Henderson <rth@redhat.com>.
-
-   This file is part of the GNU Transactional Memory Library (libitm).
-
-   Libitm is free software; you can redistribute it and/or modify it
-   under the terms of the GNU General Public License as published by
-   the Free Software Foundation; either version 3 of the License, or
-   (at your option) any later version.
-
-   Libitm is distributed in the hope that it will be useful, but WITHOUT ANY
-   WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
-   FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
-   more details.
-
-   Under Section 7 of GPL version 3, you are granted additional
-   permissions described in the GCC Runtime Library Exception, version
-   3.1, as published by the Free Software Foundation.
-
-   You should have received a copy of the GNU General Public License and
-   a copy of the GCC Runtime Library Exception along with this program;
-   see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
-   <http://www.gnu.org/licenses/>.  */
-
-#include "libitm_i.h"
-
-namespace {
-
-using namespace GTM;
-
-class readonly_dispatch : public abi_dispatch
-{
- private:
-  gtm_version m_start;
-
- public:
-  readonly_dispatch();
-
-  virtual const gtm_cacheline *read_lock(const gtm_cacheline *, lock_type);
-  virtual mask_pair write_lock(gtm_cacheline *, lock_type);
-  virtual bool trycommit();
-  virtual void rollback();
-  virtual void reinit();
-  virtual void fini();
-  virtual bool trydropreference (void *ptr, size_t size) { return true; }
-};
-
-inline
-readonly_dispatch::readonly_dispatch()
-  : abi_dispatch(true, true), m_start(gtm_get_clock ())
-{ }
-
-
-const gtm_cacheline *
-readonly_dispatch::read_lock(const gtm_cacheline *line, lock_type lock)
-{
-  switch (lock)
-    {
-    case NOLOCK:
-    case R:
-    case RaR:
-      return line;
-
-    case RfW:
-      gtm_tx()->restart (RESTART_NOT_READONLY);
-
-    case RaW:
-    default:
-      abort ();
-    }
-}
-
-abi_dispatch::mask_pair
-readonly_dispatch::write_lock(gtm_cacheline *line, lock_type lock)
-{
-  switch (lock)
-    {
-    case NOLOCK:
-      {
-	abi_dispatch::mask_pair pair;
-	pair.line = line;
-	pair.mask = &mask_sink;
-	return pair;
-      }
-
-    case WaW:
-      abort ();
-
-    default:
-      gtm_tx()->restart (RESTART_NOT_READONLY);
-    }
-}
-
-bool
-readonly_dispatch::trycommit ()
-{
-  return gtm_get_clock () == m_start;
-}
-
-void
-readonly_dispatch::rollback ()
-{
-  /* Nothing to do.  */
-}
-
-void
-readonly_dispatch::reinit ()
-{
-  m_start = gtm_get_clock ();
-}
-
-void
-readonly_dispatch::fini ()
-{
-  delete this;
-}
-
-} // anon namespace
-
-abi_dispatch *
-GTM::dispatch_readonly ()
-{
-  return new readonly_dispatch();
-}
diff --git a/libitm/method-serial.cc b/libitm/method-serial.cc
index b46de4d..71208ec 100644
--- a/libitm/method-serial.cc
+++ b/libitm/method-serial.cc
@@ -24,24 +24,6 @@
 
 #include "libitm_i.h"
 
-namespace GTM HIDDEN {
-
-gtm_cacheline_mask abi_dispatch::mask_sink;
-
-const gtm_cacheline *
-abi_dispatch::read_lock(const gtm_cacheline *addr, lock_type)
-{
-  return addr;
-}
-
-abi_dispatch::mask_pair
-abi_dispatch::write_lock(gtm_cacheline *addr, lock_type)
-{
-  return mask_pair (addr, &mask_sink);
-}
-
-} // namespace GTM
-
 // Avoid a dependency on libstdc++ for the pure virtuals in abi_dispatch.
 extern "C" void HIDDEN
 __cxa_pure_virtual ()
@@ -58,25 +40,110 @@ class serial_dispatch : public abi_dispatch
  public:
   serial_dispatch() : abi_dispatch(false, true) { }
 
-  // The read_lock and write_lock methods are implented by the base class.
+ protected:
+  // Transactional loads and stores simply access memory directly.
+  // These methods are static to avoid indirect calls, and will be used by the
+  // virtual ABI dispatch methods or by static direct-access methods created
+  // below.
+  template <typename V> static V load(const V* addr, ls_modifier mod)
+  {
+    return *addr;
+  }
+  template <typename V> static void store(V* addr, const V value,
+      ls_modifier mod)
+  {
+    *addr = value;
+  }
+
+  static void _memtransfer_static(void *dst, const void* src, size_t size,
+      bool may_overlap, ls_modifier dst_mod, ls_modifier src_mod)
+  {
+    if (!may_overlap)
+      ::memcpy(dst, src, size);
+    else
+      ::memmove(dst, src, size);
+  }
+
+  static void _memset_static(void *dst, int c, size_t size, ls_modifier mod)
+  {
+    ::memset(dst, c, size);
+  }
+
+  CREATE_DISPATCH_METHODS(virtual, )
+  CREATE_DISPATCH_METHODS_MEM()
 
+ public:
   virtual bool trycommit() { return true; }
   virtual void rollback() { abort(); }
   virtual void reinit() { }
   virtual void fini() { }
-  virtual bool trydropreference (void *ptr, size_t size) { return true; }
+};
+
+class serial_dispatch_ul : public serial_dispatch
+{
+protected:
+  static void log(const void *addr, size_t len)
+  {
+    // TODO Ensure that this gets inlined: Use internal log interface and LTO.
+    GTM_LB(addr, len);
+  }
+
+  template <typename V> static V load(const V* addr, ls_modifier mod)
+  {
+    return *addr;
+  }
+  template <typename V> static void store(V* addr, const V value,
+      ls_modifier mod)
+  {
+    if (mod != WaW)
+      log(addr, sizeof(V));
+    *addr = value;
+  }
+
+public:
+  static void memtransfer_static(void *dst, const void* src, size_t size,
+      bool may_overlap, ls_modifier dst_mod, ls_modifier src_mod)
+  {
+    if (dst_mod != WaW && dst_mod != NONTXNAL)
+      log(dst, size);
+    if (!may_overlap)
+      memcpy(dst, src, size);
+    else
+      memmove(dst, src, size);
+  }
+
+  static void memset_static(void *dst, int c, size_t size, ls_modifier mod)
+  {
+    if (mod != WaW)
+      log(dst, size);
+    memset(dst, c, size);
+  }
+
+  // Local undo will handle this.
+  // trydropreference() need not be changed either.
+  virtual void rollback() { }
+
+  CREATE_DISPATCH_METHODS(virtual, )
+  CREATE_DISPATCH_METHODS_MEM()
 };
 
 } // anon namespace
 
 static const serial_dispatch o_serial_dispatch;
+static const serial_dispatch_ul o_serial_dispatch_ul;
 
 abi_dispatch *
-GTM::dispatch_serial ()
+GTM::dispatch_serialirr ()
 {
   return const_cast<serial_dispatch *>(&o_serial_dispatch);
 }
 
+abi_dispatch *
+GTM::dispatch_serial ()
+{
+  return const_cast<serial_dispatch_ul *>(&o_serial_dispatch_ul);
+}
+
 // Put the transaction into serial-irrevocable mode.
 
 void
diff --git a/libitm/method-wbetl.cc b/libitm/method-wbetl.cc
index 7ce317e..aff6397 100644
--- a/libitm/method-wbetl.cc
+++ b/libitm/method-wbetl.cc
@@ -83,8 +83,8 @@ class wbetl_dispatch : public abi_dispatch
  public:
   wbetl_dispatch();
 
-  virtual const gtm_cacheline *read_lock(const gtm_cacheline *, lock_type);
-  virtual mask_pair write_lock(gtm_cacheline *, lock_type);
+  virtual const gtm_cacheline *read_lock(const gtm_cacheline *, ls_modifier);
+  virtual mask_pair write_lock(gtm_cacheline *, ls_modifier);
 
   virtual bool trycommit();
   virtual void rollback();
@@ -375,11 +375,11 @@ wbetl_dispatch::do_read_lock (const gtm_cacheline *addr, bool after_read)
 }
 
 const gtm_cacheline *
-wbetl_dispatch::read_lock (const gtm_cacheline *addr, lock_type ltype)
+wbetl_dispatch::read_lock (const gtm_cacheline *addr, ls_modifier ltype)
 {
   switch (ltype)
     {
-    case NOLOCK:
+    case NONTXNAL:
       return addr;
     case R:
       return do_read_lock (addr, false);
@@ -395,13 +395,13 @@ wbetl_dispatch::read_lock (const gtm_cacheline *addr, lock_type ltype)
 }
 
 abi_dispatch::mask_pair
-wbetl_dispatch::write_lock (gtm_cacheline *addr, lock_type ltype)
+wbetl_dispatch::write_lock (gtm_cacheline *addr, ls_modifier ltype)
 {
   gtm_cacheline *line;
 
   switch (ltype)
     {
-    case NOLOCK:
+    case NONTXNAL:
       return mask_pair (addr, &mask_sink);
     case W:
     case WaR:
diff --git a/libitm/retry.cc b/libitm/retry.cc
index a25b282..8e586de 100644
--- a/libitm/retry.cc
+++ b/libitm/retry.cc
@@ -42,6 +42,10 @@ GTM::gtm_transaction::decide_retry_strategy (gtm_restart_reason r)
       // write lock is not yet held, grab it.  Don't do this with
       // an upgrade, since we've no need to preserve the state we
       // acquired with the read.
+      // FIXME this might be dangerous if we use serial mode to change TM
+      // meta data (e.g., reallocate the lock array). Likewise, for
+      // privatization, we must get rid of old references (that is, abort)
+      // or let privatizers know we're still there by not releasing the lock.
       if ((this->state & STATE_SERIAL) == 0)
 	{
 	  this->state |= STATE_SERIAL;
@@ -65,17 +69,23 @@ GTM::gtm_transaction::decide_retry_strategy (gtm_restart_reason r)
     }
   else
     {
-      if (r == RESTART_NOT_READONLY)
-	{
-	  assert ((this->prop & pr_readOnly) == 0);
-	  if (disp->read_only ())
-	    {
-	      disp->fini ();
-	      disp = dispatch_wbetl ();
-	      set_abi_disp (disp);
-	      return;
-	    }
-	}
-      disp->reinit ();
+      GTM_fatal("internal error: unsupported mode");
     }
 }
+
+
+// Decides which TM method should be used on the first attempt to run this
+// transaction, setting this->disp accordingly.
+// serial_lock will not have been acquired if this is the outer-most
+// transaction. If the state is set to STATE_SERIAL, the caller will set the
+// dispatch.
+GTM::abi_dispatch*
+GTM::gtm_transaction::decide_begin_dispatch ()
+{
+  // ??? Probably want some environment variable to choose the default
+  // STM implementation once we have more than one implemented.
+  state = STATE_SERIAL;
+  if (prop & pr_hasNoAbort)
+    state |= STATE_IRREVOCABLE;
+  return 0;
+}
Richard Henderson - June 30, 2011, 2:59 p.m.
On 06/30/2011 04:15 AM, Torvald Riegel wrote:
> On Wed, 2011-06-29 at 16:01 -0700, Richard Henderson wrote:
>> On 05/25/2011 02:10 PM, Torvald Riegel wrote:
>>> Patch 2: _ITM_dropReferences is not sufficiently defined in the ABI. It
>>> seems to target some form of open nesting for txnal wrappers, but the
>>> prose in the ABI specification is unclear. Thus, disable this for now
>>> (aka fatal runtime error), and expect the related tests to fail. Pick it
>>> up again once that the ABI has been improved and the use cases are
>>> clear.
>>
>> Sure, but please actually delete the code rather than just comment it out.
> 
> Fixed in updated patch2.

This patch is ok.

> How it works is that in a TM-method-specific dispatch class we would add
>   CREATE_DISPATCH_METHODS(static, _static)
> in addition to the 
>   CREATE_DISPATCH_METHODS(virtual, )
> that is currently there. In barrier.cc we can then do
>   CREATE_DISPATCH_FUNCTIONS(GTM::serial_dispatch::_, _static)
> and thus bypass the virtual functions, but still keep one implementation
> of the TM method (in the static template functions). This example would
> create functions that just use the serial-mode accesses.

Ah yes, I remember discussing this.

> The trailing underscore at the end of the first param is to make the
> preprocessor happen. Is there a better way to let it accept
> "GTM::abi_disp()->" as argument?

It depends on what else you're passing to the other methods above.

It does seem odd, though, that you're passing some bit of global
state to the virtual dispatch methods (the result of abi_disp())
whereas you're arranging to pass nothing but a name concatenation
to the others.

I'm not sure what to make of your first two examples above, but
let's suppose that the third example is the real one.  In which
case you don't actually need concatenation.  Remove the ## from
the macro and then you can pass in

  GTM::abi_disp()->
and
  GTM::serial_dispatch::

which require no more than adjacency.

>> Why?  We're already in a header that's clearly marked "internal".
> 
> I moved these macros to a common header so libitm_i.h doesn't have to be
> included everywhere. There was some circular dependency after the
> changes (I think in via the barrier definitions for SSE etc.). What's
> the issue that you see with this change?

Not really an issue, it just seemed odd.

> Are the attached updated patches okay for the branch?

Yeah, we can iterate on the points above without re-sending
the entire patch3 again.


r~
Torvald Riegel - June 30, 2011, 4:25 p.m.
On Thu, 2011-06-30 at 07:59 -0700, Richard Henderson wrote:
> I'm not sure what to make of your first two examples above, but
> let's suppose that the third example is the real one.  In which
> case you don't actually need concatenation.  Remove the ## from
> the macro and then you can pass in
> 
>   GTM::abi_disp()->
> and
>   GTM::serial_dispatch::
> 
> which require no more than adjacency.

Good point, thanks. Fixed it, and will commit this and the others to the
branch.

Torvald

Patch

diff --git a/libitm/alloc_c.cc b/libitm/alloc_c.cc
index 66d1109..b87b304 100644
--- a/libitm/alloc_c.cc
+++ b/libitm/alloc_c.cc
@@ -1,4 +1,4 @@ 
-/* Copyright (C) 2009 Free Software Foundation, Inc.
+/* Copyright (C) 2009, 2011 Free Software Foundation, Inc.
    Contributed by Richard Henderson <rth@redhat.com>.
 
    This file is part of the GNU Transactional Memory Library (libitm).
@@ -63,11 +63,10 @@  __attribute__((transaction_pure))
 void ITM_REGPARM
 _ITM_dropReferences (void *ptr, size_t len)
 {
-  gtm_transaction *tx = gtm_tx();
-  if (!abi_disp()->trydropreference (ptr, len))
-    tx->restart (RESTART_VALIDATE_READ);
-  tx->drop_references_local (ptr, len);
-  tx->drop_references_allocations (ptr);
+  // The semantics of _ITM_dropReferences are not sufficiently defined in the
+  // ABI specification, so it does not make sense to support it right now. See
+  // the libitm documentation for details.
+  GTM_fatal("_ITM_dropReferences is not supported");
 }
 
 } // extern "C"
diff --git a/libitm/testsuite/libitm.c++/dropref.C b/libitm/testsuite/libitm.c++/dropref.C
index 92c8812..ee4f1bb 100644
--- a/libitm/testsuite/libitm.c++/dropref.C
+++ b/libitm/testsuite/libitm.c++/dropref.C
@@ -1,3 +1,4 @@ 
+/* { dg-xfail-run-if "unsupported" { *-*-* } } */
 #include <libitm.h>
 
 char *pp;
diff --git a/libitm/testsuite/libitm.c/dropref-2.c b/libitm/testsuite/libitm.c/dropref-2.c
index ddd4eae..2386b18 100644
--- a/libitm/testsuite/libitm.c/dropref-2.c
+++ b/libitm/testsuite/libitm.c/dropref-2.c
@@ -1,3 +1,4 @@ 
+/* { dg-xfail-run-if "unsupported" { *-*-* } } */
 #include <stdlib.h>
 #include <libitm.h>
 
diff --git a/libitm/testsuite/libitm.c/dropref.c b/libitm/testsuite/libitm.c/dropref.c
index 92c8812..ee4f1bb 100644
--- a/libitm/testsuite/libitm.c/dropref.c
+++ b/libitm/testsuite/libitm.c/dropref.c
@@ -1,3 +1,4 @@ 
+/* { dg-xfail-run-if "unsupported" { *-*-* } } */
 #include <libitm.h>
 
 char *pp;