From patchwork Thu Oct 13 21:02:30 2011 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Torvald Riegel X-Patchwork-Id: 119613 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) by ozlabs.org (Postfix) with SMTP id 6FE91B71BC for ; Fri, 14 Oct 2011 08:02:55 +1100 (EST) Received: (qmail 9857 invoked by alias); 13 Oct 2011 21:02:52 -0000 Received: (qmail 9847 invoked by uid 22791); 13 Oct 2011 21:02:51 -0000 X-SWARE-Spam-Status: No, hits=-6.9 required=5.0 tests=AWL, BAYES_00, RCVD_IN_DNSWL_HI, RP_MATCHES_RCVD, SPF_HELO_PASS, TW_TX X-Spam-Check-By: sourceware.org Received: from mx1.redhat.com (HELO mx1.redhat.com) (209.132.183.28) by sourceware.org (qpsmtpd/0.43rc1) with ESMTP; Thu, 13 Oct 2011 21:02:34 +0000 Received: from int-mx12.intmail.prod.int.phx2.redhat.com (int-mx12.intmail.prod.int.phx2.redhat.com [10.5.11.25]) by mx1.redhat.com (8.14.4/8.14.4) with ESMTP id p9DL2YLA010027 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Thu, 13 Oct 2011 17:02:34 -0400 Received: from [10.36.5.42] (vpn1-5-42.ams2.redhat.com [10.36.5.42]) by int-mx12.intmail.prod.int.phx2.redhat.com (8.14.4/8.14.4) with ESMTP id p9DL2Vl4018120; Thu, 13 Oct 2011 17:02:32 -0400 Subject: Re: [trans-mem] Add gl_wt TM method. From: Torvald Riegel To: GCC Patches Cc: Richard Henderson In-Reply-To: <1314657235.14756.1465.camel@triegel.csb> References: <1314657235.14756.1465.camel@triegel.csb> Date: Thu, 13 Oct 2011 23:02:30 +0200 Message-ID: <1318539750.3265.1683.camel@triegel.csb> Mime-Version: 1.0 X-IsSubscribed: yes Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org On Tue, 2011-08-30 at 00:33 +0200, Torvald Riegel wrote: > The attached patches are several changes required for a new TM method, > gl_wt (global lock, write-through), which is added by the last patch > > patch1: Add TM-method-specific begin code. All time-based TMs need to > know at which point in time they start working. Initializing lazily on > the first txnal load or store would be unnecessary overhead. > > patch2: A small fix for serial mode. This change should have been > included in the previous renaming of the serial mode dispatchs. > > patch3: We can't free transaction-local memory during nested commits > unless we also go through the undo and redo logs and remove all > references to the to-be-freed memory (otherwise, we'll undo/redo to > privatized memory...). I guess going trough the logs is higher overhead > than just keeping the allocations around. If we see transactions in > practice that have large malloc/free cycles embedded in nested txns that > are not flattened, we can still add special handling for this case. > > patch4: We sometimes need to re-initialize method groups (e.g., to avoid > overflow of counters etc.). TM methods can request this using a special > restart reason. > > patch5: The undo log is used for both thread-local and shared data > (which are separate). Maintaining two undo logs does not provide any > advantages. However, we have to perform undo actions to shared data > before dispatch-specific rollback (e.g., where we release locks). > > patch6: Add support for quiescence-based privatization safety (using > gtm_thread::shared_state as the value of the current (snapshot) time of > a transaction). Currently, this uses just spinning, but it should > eventually be changed to block using cond vars / futexes if necessary. > This requires more thought and tuning however, as it should probably be > integrated with the serial lock (and it poses similar challenges, such > as having to minimize the number of wait/wakeup calls, number of cache > misses, etc.). Therefore, this should be addressed in a future patch. > > patch7: Finally, the new TM method, gl_wt (global lock, write trough). > This is a simple algorithm that uses a global versioned lock (aka > ownership record or orec) together with write-through / undolog-style > txnal writes. This has a lot of similarities to undolog-style TM methods > that use several locks (e.g., privatization safety has to be ensured), > but has less overhead. If update txns are frequent, it obviously won't > scale. With the current code base, gl_wt performs better than > serialirr_onwrite but probably mostly due spinning and restarts when the > global lock is acquired instead of falling back to heavyweight waiting > via futex wait/wakeup calls. > gl_wt is in the globallock method group, to which at least another > write-back, value-based-validation TM method will be added. The attached two patches fix one bug in gl_wt (patch8) and one unrelated bug that affected gl_wt (patch9). patch8: The previous code did not handle transitions to serial mode properly. On such a transition, a method's trycommit() and/or rollback() functions are called when the serial lock is already acquired. This acquisition is handled through (and stored in) gtm_thread::shared_state, which the gl_wt method uses too. Thus, blindly overwriting shared_state in commit/rollback doesn't work. The fix handles now detects this case and makes gl_wt not interfere with serial mode anymore. patch9: The previous custom TLS slot read accesses were reordered by the compiler across non-inlined function calls. For gl_wt, this caused restarts to operate on a stale value for abi_disp() (because this call was moved to before the decide_retry_strategy() call, which would call set_abi_disp). In turn, this broke gl_wt because it was being used together with serial mode, which does not work because it tries to use shared_state too. Ok for branch (patches 1-9)? commit 08d3f16c5fffa2bd3acb0c27af4010c72c75b23d Author: Torvald Riegel Date: Thu Oct 13 18:16:39 2011 +0200 Fix gl_wt commit/rollback when serial lock has been acquired. * method-gl.cc (gl_wt_dispatch::trycommit): Fix interaction with gtm_thread::shared_state when the serial lock is acquired. (gl_wt_dispatch::rollback): Same. commit c6eda0b9b471abf853af85c08eca3f4023f60bc7 Author: Torvald Riegel Date: Thu Oct 13 22:51:46 2011 +0200 Fix TLS read accesses on Linux/x86. * config/linux/x86/tls.h (abi_disp): Make TLS slot read volatile. (gtm_thr): Same. diff --git a/libitm/config/linux/x86/tls.h b/libitm/config/linux/x86/tls.h index 3d247e3..2525f36 100644 --- a/libitm/config/linux/x86/tls.h +++ b/libitm/config/linux/x86/tls.h @@ -72,7 +72,7 @@ namespace GTM HIDDEN { static inline struct gtm_thread *gtm_thr(void) { struct gtm_thread *r; - asm (SEG_READ(10) : "=r"(r)); + asm volatile (SEG_READ(10) : "=r"(r)); return r; } @@ -84,7 +84,7 @@ static inline void set_gtm_thr(struct gtm_thread *x) static inline struct abi_dispatch *abi_disp(void) { struct abi_dispatch *r; - asm (SEG_DECODE_READ(11) : "=r"(r)); + asm volatile (SEG_DECODE_READ(11) : "=r"(r)); return r; } diff --git a/libitm/method-gl.cc b/libitm/method-gl.cc index 17a2b9f..1dc700a 100644 --- a/libitm/method-gl.cc +++ b/libitm/method-gl.cc @@ -72,6 +72,12 @@ static gl_mg o_gl_mg; // validate that no other update transaction comitted before we acquired the // orec, so we have the most recent timestamp and no other transaction can // commit until we have committed). +// However, we therefore cannot use this method for a serial transaction +// (because shared_state needs to remain at ~0) and we have to be careful +// when switching to serial mode (see the special handling in trycommit() and +// rollback()). +// ??? This sharing adds some complexity wrt. serial mode. Just use a separate +// state variable? class gl_wt_dispatch : public abi_dispatch { protected: @@ -202,10 +208,22 @@ public: virtual bool trycommit(gtm_word& priv_time) { - // Release the orec but do not reset shared_state, which will be modified - // by the serial lock right after our commit anyway. gtm_thread* tx = gtm_thr(); gtm_word v = tx->shared_state; + + // Special case: If shared_state is ~0, then we have acquired the + // serial lock (tx->state is not updated yet). In this case, the previous + // value isn't available anymore, so grab it from the global lock, which + // must have a meaningful value because no other transactions are active + // anymore. In particular, if it is locked, then we are an update + // transaction, which is all we care about for commit. + if (v == ~(typeof v)0) + v = o_gl_mg.orec; + + // Release the orec but do not reset shared_state, which will be modified + // by the serial lock right after our commit anyway. Also, resetting + // shared state here would interfere with the serial lock's use of this + // location. if (gl_mg::is_locked(v)) { // Release the global orec, increasing its version number / timestamp. @@ -228,11 +246,20 @@ public: if (cp != 0) return; + gtm_thread *tx = gtm_thr(); + gtm_word v = tx->shared_state; + // Special case: If shared_state is ~0, then we have acquired the + // serial lock (tx->state is not updated yet). In this case, the previous + // value isn't available anymore, so grab it from the global lock, which + // must have a meaningful value because no other transactions are active + // anymore. In particular, if it is locked, then we are an update + // transaction, which is all we care about for rollback. + if (v == ~(typeof v)0) + v = o_gl_mg.orec; + // Release lock and increment version number to prevent dirty reads. // Also reset shared state here, so that begin_or_restart() can expect a // value that is correct wrt. privatization safety. - gtm_thread *tx = gtm_thr(); - gtm_word v = tx->shared_state; if (gl_mg::is_locked(v)) { // Release the global orec, increasing its version number / timestamp. @@ -242,7 +269,11 @@ public: o_gl_mg.orec = v; // Also reset the timestamp published via shared_state. - tx->shared_state = v; + // Special case: Only do this if we are not a serial transaction + // because otherwise, we would interfere with the serial lock. + if (tx->shared_state != ~(typeof tx->shared_state)0) + tx->shared_state = v; + // We need a store-load barrier after this store to prevent it // from becoming visible after later data loads because the // previous value of shared_state has been higher than the actual