Patchwork libitm: Fix race condition in dispatch choice at transaction begin.

login
register
mail settings
Submitter Torvald Riegel
Date Feb. 14, 2012, 12:48 p.m.
Message ID <1329223733.24846.950.camel@triegel.csb>
Download mbox | patch
Permalink /patch/141111/
State New
Headers show

Comments

Torvald Riegel - Feb. 14, 2012, 12:48 p.m.
On Mon, 2012-02-13 at 15:12 -0800, Richard Henderson wrote:
> On 02/13/2012 03:03 PM, Torvald Riegel wrote:
> > -// The default TM method used when starting a new transaction.
> > -static GTM::abi_dispatch* default_dispatch = 0;
> > +// The default TM method used when starting a new transaction.  Initialized
> > +// in number_of_threads_changed() below.
> > +static std::atomic<GTM::abi_dispatch*> default_dispatch;
> 
> I see nothing but memory_order_relaxed uses of default_dispatch?

That is fine based on the assumptions we already had about when dispatch
objects are constructed.  I add several comments to the patch to explain
this, updated version is attached.  Better?

I also addressed another corner case in decide_retry_strategy(): after
method group reinit, if another thread would change default dispatch to
something like serialirr, the code wasn't properly reacquiring serial
mode before (and not setting tx->state properly).  Delegating to the
decide_begin_dispatch() as the current patch does should fix this.

BTW, why can we assume that the dispatch objects are initialized before
we run any transactions (e.g., from within a constructor of another
static object)?  Or is this the library initialization problem that we
still have to deal with at some point?
commit 026b304137e9ef5c9c9f5625a2215f75541fbdfe
Author: Torvald Riegel <triegel@redhat.com>
Date:   Mon Feb 13 23:49:55 2012 +0100

    libitm: Fix race condition in dispatch choice at transaction begin.
    
    	libitm/
    	* beginend.cc (GTM::gtm_thread::begin_transaction): Move serial lock
    	acquisition to ...
    	* retry.cc (GTM::gtm_thread::decide_begin_dispatch): ... here.
    	(default_dispatch): Make atomic.
    	(GTM::gtm_thread::set_default_dispatch): Access atomically.
    	(GTM::gtm_thread::decide_retry_strategy): Access atomically and
    	use decide_begin_dispatch() if default_dispatch might have changed.
    	(GTM::gtm_thread::number_of_threads_changed): Initialize
    	default_dispatch here.
Richard Henderson - Feb. 14, 2012, 7:07 p.m.
[ bkoz, see <bkoz> below ]

On 02/14/2012 04:48 AM, Torvald Riegel wrote:
> BTW, why can we assume that the dispatch objects are initialized before
> we run any transactions (e.g., from within a constructor of another
> static object)?  Or is this the library initialization problem that we
> still have to deal with at some point?

ELF shared libraries are initialized in dependency order.  Static libraries
simply must be careful to be linked in the proper order on the command line.

It should be fine unless the user really tries to do something wrong.

>     	libitm/
>     	* beginend.cc (GTM::gtm_thread::begin_transaction): Move serial lock
>     	acquisition to ...
>     	* retry.cc (GTM::gtm_thread::decide_begin_dispatch): ... here.
>     	(default_dispatch): Make atomic.
>     	(GTM::gtm_thread::set_default_dispatch): Access atomically.
>     	(GTM::gtm_thread::decide_retry_strategy): Access atomically and
>     	use decide_begin_dispatch() if default_dispatch might have changed.
>     	(GTM::gtm_thread::number_of_threads_changed): Initialize
>     	default_dispatch here.

I discussed this with Torvald at length on IRC.  For the record, I think the
change to use atomic<> obscures the actual bug fix, that of rearranging
decide_begin_dispatch.

The change to use atomic<> is at best a theoretical bug fix, for some 
embedded target that doesn't load/store pointers atomically by default.
Or for consumption by some static race detector.

I'd have been happier with the one interesting reference to default_dispatch
being changed to use __atomic_load.  Rather than having special markup for
all of the non-interesting references, and spending cycles figuring out
whether the use of memory_order_relaxed is really correct at each instance.

All that said the patch is ok, Torvald.

<bkoz>
I wonder if the atomic<> template wouldn't have been better with another
(defaulted) template argument to set the default memory access mode for the
object.  In this case we could have used

  static std::atomic<GTM::abi_dispatch*, memory_order_relaxed> default_dispatch;

and then not have had to worry about ugly-ing up the source with .load/.store
accessors to change all of the accesses back to relaxed.

Is a change like this something that might get us in trouble somehow with the
c++ standard, or cause us ABI problems vs 4.7 if we changed this in 4.8?  Or
is the ABI thing a non-issue since we expose the entire template to the user?
</bkoz>



r~

Patch

diff --git a/libitm/beginend.cc b/libitm/beginend.cc
index 08c2174..e6a84de 100644
--- a/libitm/beginend.cc
+++ b/libitm/beginend.cc
@@ -233,16 +233,6 @@  GTM::gtm_thread::begin_transaction (uint32_t prop, const gtm_jmpbuf *jb)
     {
       // Outermost transaction
       disp = tx->decide_begin_dispatch (prop);
-      if (disp == dispatch_serialirr() || disp == dispatch_serial())
-	{
-	  tx->state = STATE_SERIAL;
-	  if (disp == dispatch_serialirr())
-	    tx->state |= STATE_IRREVOCABLE;
-	  serial_lock.write_lock ();
-	}
-      else
-	serial_lock.read_lock (tx);
-
       set_abi_disp (disp);
     }
 
diff --git a/libitm/retry.cc b/libitm/retry.cc
index d57bba0..1920363 100644
--- a/libitm/retry.cc
+++ b/libitm/retry.cc
@@ -27,8 +27,16 @@ 
 #include <ctype.h>
 #include "libitm_i.h"
 
-// The default TM method used when starting a new transaction.
-static GTM::abi_dispatch* default_dispatch = 0;
+// The default TM method used when starting a new transaction.  Initialized
+// in number_of_threads_changed() below.
+// Access to this variable is always synchronized with help of the serial
+// lock, except one read access that happens in decide_begin_dispatch() before
+// a transaction has become active (by acquiring the serial lock in read or
+// write mode).  The default_dispatch is only changed and initialized in
+// serial mode.  Transactions stay active when they restart (see beginend.cc),
+// thus decide_retry_strategy() can expect default_dispatch to be unmodified.
+// See decide_begin_dispatch() for further comments.
+static std::atomic<GTM::abi_dispatch*> default_dispatch;
 // The default TM method as requested by the user, if any.
 static GTM::abi_dispatch* default_dispatch_user = 0;
 
@@ -57,20 +65,24 @@  GTM::gtm_thread::decide_retry_strategy (gtm_restart_reason r)
 	  // given that re-inits should be very infrequent.
 	  serial_lock.read_unlock(this);
 	  serial_lock.write_lock();
-	  if (disp->get_method_group() == default_dispatch->get_method_group())
+	  if (disp->get_method_group()
+	      == default_dispatch.load(memory_order_relaxed)
+	      ->get_method_group())
 	    // Still the same method group.
 	    disp->get_method_group()->reinit();
 	  serial_lock.write_unlock();
-	  serial_lock.read_lock(this);
-	  if (disp->get_method_group() != default_dispatch->get_method_group())
-	    {
-	      disp = default_dispatch;
-	      set_abi_disp(disp);
-	    }
+	  // Also, we're making the transaction inactive, so when we become
+	  // active again, some other thread might have changed the default
+	  // dispatch, so we run the same code as for the first execution
+	  // attempt.
+	  disp = decide_begin_dispatch(prop);
+	  set_abi_disp(disp);
 	}
       else
 	// We are a serial transaction already, which makes things simple.
 	disp->get_method_group()->reinit();
+
+      return;
     }
 
   bool retry_irr = (r == RESTART_SERIAL_IRR);
@@ -124,48 +136,89 @@  GTM::gtm_thread::decide_retry_strategy (gtm_restart_reason r)
 
 
 // Decides which TM method should be used on the first attempt to run this
-// transaction.
+// transaction.  Acquires the serial lock and sets transaction state
+// according to the chosen TM method.
 GTM::abi_dispatch*
 GTM::gtm_thread::decide_begin_dispatch (uint32_t prop)
 {
+  abi_dispatch* dd;
   // TODO Pay more attention to prop flags (eg, *omitted) when selecting
   // dispatch.
+  // ??? We go irrevocable eagerly here, which is not always good for
+  // performance.  Don't do this?
   if ((prop & pr_doesGoIrrevocable) || !(prop & pr_instrumentedCode))
-    return dispatch_serialirr();
-
-  // If we might need closed nesting and the default dispatch has an
-  // alternative that supports closed nesting, use it.
-  // ??? We could choose another TM method that we know supports closed
-  // nesting but isn't the default (e.g., dispatch_serial()). However, we
-  // assume that aborts that need closed nesting are infrequent, so don't
-  // choose a non-default method until we have to actually restart the
-  // transaction.
-  if (!(prop & pr_hasNoAbort) && !default_dispatch->closed_nesting()
-      && default_dispatch->closed_nesting_alternative())
-    return default_dispatch->closed_nesting_alternative();
-
-  // No special case, just use the default dispatch.
-  return default_dispatch;
+    dd = dispatch_serialirr();
+
+  else
+    {
+      // Load the default dispatch.  We're not an active transaction and so it
+      // can change concurrently but will still be some valid dispatch.
+      // Relaxed memory order is okay because we expect each dispatch to be
+      // constructed properly already (at least that its closed_nesting() and
+      // closed_nesting_alternatives() will return sensible values).  It is
+      // harmless if we incorrectly chose the serial or serialirr methods, and
+      // for all other methods we will acquire the serial lock in read mode
+      // and load the default dispatch again.
+      abi_dispatch* dd_orig = default_dispatch.load(memory_order_relaxed);
+      dd = dd_orig;
+
+      // If we might need closed nesting and the default dispatch has an
+      // alternative that supports closed nesting, use it.
+      // ??? We could choose another TM method that we know supports closed
+      // nesting but isn't the default (e.g., dispatch_serial()). However, we
+      // assume that aborts that need closed nesting are infrequent, so don't
+      // choose a non-default method until we have to actually restart the
+      // transaction.
+      if (!(prop & pr_hasNoAbort) && !dd->closed_nesting()
+	  && dd->closed_nesting_alternative())
+	dd = dd->closed_nesting_alternative();
+
+      if (dd != dispatch_serial() && dd != dispatch_serialirr())
+	{
+	  // The current dispatch is supposedly a non-serial one.  Become an
+	  // active transaction and verify this.  Relaxed memory order is fine
+	  // because the serial lock itself will have established
+	  // happens-before for any change to the selected dispatch.
+	  serial_lock.read_lock (this);
+	  if (default_dispatch.load(memory_order_relaxed) == dd_orig)
+	    return dd;
+
+	  // If we raced with a concurrent modification of default_dispatch,
+	  // just fall back to serialirr.  The dispatch choice might not be
+	  // up-to-date anymore, but this is harmless.
+	  serial_lock.read_unlock (this);
+	  dd = dispatch_serialirr();
+	}
+    }
+
+  // We are some kind of serial transaction.
+  serial_lock.write_lock();
+  if (dd == dispatch_serialirr())
+    state = STATE_SERIAL | STATE_IRREVOCABLE;
+  else
+    state = STATE_SERIAL;
+  return dd;
 }
 
 
 void
 GTM::gtm_thread::set_default_dispatch(GTM::abi_dispatch* disp)
 {
-  if (default_dispatch == disp)
+  abi_dispatch* dd = default_dispatch.load(memory_order_relaxed);
+  if (dd == disp)
     return;
-  if (default_dispatch)
+  if (dd)
     {
       // If we are switching method groups, initialize and shut down properly.
-      if (default_dispatch->get_method_group() != disp->get_method_group())
+      if (dd->get_method_group() != disp->get_method_group())
 	{
-	  default_dispatch->get_method_group()->fini();
+	  dd->get_method_group()->fini();
 	  disp->get_method_group()->init();
 	}
     }
   else
     disp->get_method_group()->init();
-  default_dispatch = disp;
+  default_dispatch.store(disp, memory_order_relaxed);
 }
 
 
@@ -233,6 +286,7 @@  GTM::gtm_thread::number_of_threads_changed(unsigned previous, unsigned now)
 	{
 	  initialized = true;
 	  // Check for user preferences here.
+	  default_dispatch = 0;
 	  default_dispatch_user = parse_default_method();
 	}
     }