diff mbox

libitm: Optimize undo log.

Message ID 1325808936.7636.3531.camel@triegel.csb
State New
Headers show

Commit Message

Torvald Riegel Jan. 6, 2012, 12:15 a.m. UTC
This fixes a known, obvious performance problem in libitm's undo log:
Before, each write was put into a separately allocated buffer, which
kills performance.  With the patch, everything goes into a single
buffer: Undo data first, then the length of the data, then the address.
We only need this data when rolling it back, starting from the most
recent undo.  Address and length are gtm_word-sized anyway
(==uintptr_t), and the undo data is adjusted to the same length so that
we don't have to bother with unaligned accesses.

Additionally, the undolog logging code (which is called on each store)
is moved into a header to allow easy inlining.  With that, the memcpy's
used to copy the data can also be optimized, because the length is
known.  This reduces overheads further, especially for doubles it seems
(tested on x86_64).

STAMP apps seem to work too, and run much faster. Surprise! :)
I'll try to get some more testing on ppc tomorrow or so.

OK for trunk?
commit 93c95a6d55e79ce61eaf4aec4265974ded967442
Author: Torvald Riegel <triegel@redhat.com>
Date:   Thu Jan 5 21:49:48 2012 +0100

    libitm: Optimize undo log.
    
    	libitm/
    	* local.cc (GTM_LB): Use GTM::gtm_undolog.
    	(GTM::gtm_thread::drop_references_undolog): Remove.
    	(GTM::gtm_thread::commit_undolog,
    	GTM::gtm_thread::rollback_undolog): Move to ...
    	* libitm_i.h (GTM::gtm_undolog): ...here. New.
    	(GTM::gtm_undolog_entry): Remove.
    	(GTM::gtm_thread): Adapt.
    	* beginend.cc (GTM::gtm_thread::rollback): Adapt.
    	(GTM::gtm_thread::trycommit): Adapt.
    	* method-serial.cc (serial_dispatch::log): Adapt.
    	* method-gl.cc (gl_wt_dispatch::pre_write): Adapt.
    	(gl_wt_dispatch::store): Fix likely/unlikely.
    	* containers.h (GTM::vector::resize): Add additional_capacity
    	parameter and handle it.
    	(GTM::vector::resize_noinline): New/adapt.
    	(GTM::vector::push): New.

Comments

Richard Henderson Jan. 6, 2012, 9:38 p.m. UTC | #1
On 01/06/2012 11:15 AM, Torvald Riegel wrote:
>     libitm: Optimize undo log.
>     
>     	libitm/
>     	* local.cc (GTM_LB): Use GTM::gtm_undolog.
>     	(GTM::gtm_thread::drop_references_undolog): Remove.
>     	(GTM::gtm_thread::commit_undolog,
>     	GTM::gtm_thread::rollback_undolog): Move to ...
>     	* libitm_i.h (GTM::gtm_undolog): ...here. New.
>     	(GTM::gtm_undolog_entry): Remove.
>     	(GTM::gtm_thread): Adapt.
>     	* beginend.cc (GTM::gtm_thread::rollback): Adapt.
>     	(GTM::gtm_thread::trycommit): Adapt.
>     	* method-serial.cc (serial_dispatch::log): Adapt.
>     	* method-gl.cc (gl_wt_dispatch::pre_write): Adapt.
>     	(gl_wt_dispatch::store): Fix likely/unlikely.
>     	* containers.h (GTM::vector::resize): Add additional_capacity
>     	parameter and handle it.
>     	(GTM::vector::resize_noinline): New/adapt.
>     	(GTM::vector::push): New.

Ok.


r~
diff mbox

Patch

diff --git a/libitm/beginend.cc b/libitm/beginend.cc
index 7975481..fe14f32 100644
--- a/libitm/beginend.cc
+++ b/libitm/beginend.cc
@@ -1,4 +1,4 @@ 
-/* Copyright (C) 2008, 2009, 2011 Free Software Foundation, Inc.
+/* Copyright (C) 2008, 2009, 2011, 2012 Free Software Foundation, Inc.
    Contributed by Richard Henderson <rth@redhat.com>.
 
    This file is part of the GNU Transactional Memory Library (libitm).
@@ -327,7 +327,7 @@  GTM::gtm_thread::rollback (gtm_transaction_cp *cp, bool aborting)
   // data. Because of the latter, we have to roll it back before any
   // dispatch-specific rollback (which handles synchronization with other
   // transactions).
-  rollback_undolog (cp ? cp->undolog_size : 0);
+  undolog.rollback (cp ? cp->undolog_size : 0);
 
   // Perform dispatch-specific rollback.
   abi_disp()->rollback (cp);
@@ -470,7 +470,7 @@  GTM::gtm_thread::trycommit ()
       // We can commit the undo log after dispatch-specific commit and after
       // making the transaction inactive because we only have to reset
       // gtm_thread state.
-      commit_undolog ();
+      undolog.commit ();
       // Reset further transaction state.
       cxa_catch_count = 0;
       cxa_unthrown = NULL;
diff --git a/libitm/containers.h b/libitm/containers.h
index e8aa94b..394b6f2 100644
--- a/libitm/containers.h
+++ b/libitm/containers.h
@@ -1,4 +1,4 @@ 
-/* Copyright (C) 2011 Free Software Foundation, Inc.
+/* Copyright (C) 2011, 2012 Free Software Foundation, Inc.
    Contributed by Torvald Riegel <triegel@redhat.com>.
 
    This file is part of the GNU Transactional Memory Library (libitm).
@@ -70,17 +70,24 @@  class vector
   }
   ~vector<T, alloc_separate_cl>() { if (m_capacity) free(entries); }
 
-  void resize()
+  void resize(size_t additional_capacity)
   {
-    if (m_capacity >= default_resize_max)
-      m_capacity = m_capacity + default_resize_max;
+    size_t target = m_capacity + additional_capacity;
+    if (target > default_resize_max)
+      m_capacity = ((target - 1 + default_resize_max) / default_resize_max)
+        * default_resize_max;
     else
-      m_capacity = m_capacity * 2;
+      while (m_capacity < target)
+        m_capacity = m_capacity * 2;
     if (m_capacity < default_resize_min)
       m_capacity = default_resize_min;
     entries = (T*) xrealloc(entries, sizeof(T) * m_capacity, alloc_separate_cl);
   }
-  void resize_noinline() __attribute__((noinline)) { resize(); }
+  void resize_noinline() __attribute__((noinline)) { resize(1); }
+  void resize_noinline(size_t elements) __attribute__((noinline))
+  {
+    resize(elements);
+  }
 
   size_t size() const { return m_size; }
   size_t capacity() const { return this->capacity; }
@@ -93,6 +100,15 @@  class vector
     return &entries[m_size++];
   }
 
+  iterator push(size_t elements)
+  {
+    // We don't want inlining here since push() is often on the fast path.
+    if (unlikely(m_size + elements > m_capacity)) resize_noinline(elements);
+    iterator it = &entries[m_size];
+    m_size += elements;
+    return it;
+  }
+
   iterator pop() {
     if (likely(m_size > 0))
       {
diff --git a/libitm/libitm_i.h b/libitm/libitm_i.h
index b53792a..f922d22 100644
--- a/libitm/libitm_i.h
+++ b/libitm/libitm_i.h
@@ -1,4 +1,4 @@ 
-/* Copyright (C) 2008, 2009, 2011 Free Software Foundation, Inc.
+/* Copyright (C) 2008, 2009, 2011, 2012 Free Software Foundation, Inc.
    Contributed by Richard Henderson <rth@redhat.com>.
 
    This file is part of the GNU Transactional Memory Library (libitm).
@@ -93,9 +93,6 @@  struct gtm_alloc_action
   bool allocated;
 };
 
-// This type is private to local.c.
-struct gtm_undolog_entry;
-
 struct gtm_thread;
 
 // A transaction checkpoint: data that has to saved and restored when doing
@@ -121,6 +118,29 @@  struct gtm_transaction_cp
   void commit(gtm_thread* tx);
 };
 
+// An undo log for writes.
+struct gtm_undolog
+{
+  vector<gtm_word> undolog;
+
+  // Log the previous value at a certain address.
+  // The easiest way to inline this is to just define this here.
+  void log(const void *ptr, size_t len)
+  {
+    size_t words = (len + sizeof(gtm_word) - 1) / sizeof(gtm_word);
+    gtm_word *undo = undolog.push(words + 2);
+    memcpy(undo, ptr, len);
+    undo[words] = len;
+    undo[words + 1] = (gtm_word) ptr;
+  }
+
+  void commit () { undolog.clear(); }
+  size_t size() const { return undolog.size(); }
+
+  // In local.cc
+  void rollback (size_t until_size = 0);
+};
+
 // Contains all thread-specific data required by the entire library.
 // This includes all data relevant to a single transaction. Because most
 // thread-specific data is about the current transaction, we also refer to
@@ -148,7 +168,7 @@  struct gtm_thread
   gtm_jmpbuf jb;
 
   // Data used by local.c for the undo log for both local and shared memory.
-  vector<gtm_undolog_entry*> undolog;
+  gtm_undolog undolog;
 
   // Data used by alloc.c for the malloc/free undo log.
   aa_tree<uintptr_t, gtm_alloc_action> alloc_actions;
@@ -254,11 +274,6 @@  struct gtm_thread
   // In eh_cpp.cc
   void revert_cpp_exceptions (gtm_transaction_cp *cp = 0);
 
-  // In local.cc
-  void commit_undolog (void);
-  void rollback_undolog (size_t until_size = 0);
-  void drop_references_undolog (const void *, size_t);
-
   // In retry.cc
   // Must be called outside of transactions (i.e., after rollback).
   void decide_retry_strategy (gtm_restart_reason);
diff --git a/libitm/local.cc b/libitm/local.cc
index 4f47ff2..39b6da3 100644
--- a/libitm/local.cc
+++ b/libitm/local.cc
@@ -1,4 +1,4 @@ 
-/* Copyright (C) 2008, 2009, 2011 Free Software Foundation, Inc.
+/* Copyright (C) 2008, 2009, 2011, 2012 Free Software Foundation, Inc.
    Contributed by Richard Henderson <rth@redhat.com>.
 
    This file is part of the GNU Transactional Memory Library (libitm).
@@ -26,29 +26,9 @@ 
 
 namespace GTM HIDDEN {
 
-struct gtm_undolog_entry
-{
-  void *addr;
-  size_t len;
-  char saved[];
-};
-
 
 void
-gtm_thread::commit_undolog ()
-{
-  size_t i, n = undolog.size();
-
-  if (n > 0)
-    {
-      for (i = 0; i < n; ++i)
-	free (undolog[i]);
-      this->undolog.clear();
-    }
-}
-
-void
-gtm_thread::rollback_undolog (size_t until_size)
+gtm_undolog::rollback (size_t until_size)
 {
   size_t i, n = undolog.size();
 
@@ -56,36 +36,11 @@  gtm_thread::rollback_undolog (size_t until_size)
     {
       for (i = n; i-- > until_size; )
 	{
-	  gtm_undolog_entry *u = *undolog.pop();
-	  if (u)
-	    {
-	      memcpy (u->addr, u->saved, u->len);
-	      free (u);
-	    }
-	}
-    }
-}
-
-/* Forget any references to PTR in the local log.  */
-
-void
-gtm_thread::drop_references_undolog (const void *ptr, size_t len)
-{
-  size_t i, n = undolog.size();
-
-  if (n > 0)
-    {
-      for (i = n; i > 0; i--)
-	{
-	  gtm_undolog_entry *u = undolog[i];
-	  /* ?? Do we need such granularity, or can we get away with
-	     just comparing PTR and LEN. ??  */
-	  if ((const char *)u->addr >= (const char *)ptr
-	      && ((const char *)u->addr + u->len <= (const char *)ptr + len))
-	    {
-	      free (u);
-	      undolog[i] = NULL;
-	    }
+          void *ptr = (void *) undolog[i--];
+          size_t len = undolog[i];
+          size_t words = (len + sizeof(gtm_word) - 1) / sizeof(gtm_word);
+          i -= words;
+          __builtin_memcpy (ptr, &undolog[i], len);
 	}
     }
 }
@@ -94,16 +49,7 @@  void ITM_REGPARM
 GTM_LB (const void *ptr, size_t len)
 {
   gtm_thread *tx = gtm_thr();
-  gtm_undolog_entry *undo;
-
-  undo = (gtm_undolog_entry *)
-      xmalloc (sizeof (struct gtm_undolog_entry) + len);
-  undo->addr = (void *) ptr;
-  undo->len = len;
-
-  tx->undolog.push()[0] = undo;
-
-  memcpy (undo->saved, ptr, len);
+  tx->undolog.log(ptr, len);
 }
 
 } // namespace GTM
diff --git a/libitm/method-gl.cc b/libitm/method-gl.cc
index e678da7..d6d050d 100644
--- a/libitm/method-gl.cc
+++ b/libitm/method-gl.cc
@@ -1,4 +1,4 @@ 
-/* Copyright (C) 2011 Free Software Foundation, Inc.
+/* Copyright (C) 2011, 2012 Free Software Foundation, Inc.
    Contributed by Torvald Riegel <triegel@redhat.com>.
 
    This file is part of the GNU Transactional Memory Library (libitm).
@@ -120,8 +120,7 @@  protected:
 	tx->shared_state.store(gl_mg::set_locked(now), memory_order_release);
       }
 
-    // TODO Ensure that this gets inlined: Use internal log interface and LTO.
-    GTM_LB(addr, len);
+    tx->undolog.log(addr, len);
   }
 
   static void validate()
@@ -181,7 +180,7 @@  protected:
   template <typename V> static void store(V* addr, const V value,
       ls_modifier mod)
   {
-    if (unlikely(mod != WaW))
+    if (likely(mod != WaW))
       pre_write(addr, sizeof(V));
     // FIXME We would need an atomic store here but we can't just forge an
     // atomic load for nonatomic data because this might not work on all
diff --git a/libitm/method-serial.cc b/libitm/method-serial.cc
index bf79826..bdecd7b 100644
--- a/libitm/method-serial.cc
+++ b/libitm/method-serial.cc
@@ -1,4 +1,4 @@ 
-/* Copyright (C) 2008, 2009, 2011 Free Software Foundation, Inc.
+/* Copyright (C) 2008, 2009, 2011, 2012 Free Software Foundation, Inc.
    Contributed by Richard Henderson <rth@redhat.com>.
 
    This file is part of the GNU Transactional Memory Library (libitm).
@@ -107,8 +107,8 @@  class serial_dispatch : public abi_dispatch
 protected:
   static void log(const void *addr, size_t len)
   {
-    // TODO Ensure that this gets inlined: Use internal log interface and LTO.
-    GTM_LB(addr, len);
+    gtm_thread *tx = gtm_thr();
+    tx->undolog.log(addr, len);
   }
 
   template <typename V> static V load(const V* addr, ls_modifier mod)