From patchwork Fri Jan  6 00:15:36 2012
X-Patchwork-Submitter: Torvald Riegel
X-Patchwork-Id: 134558
Subject: [patch] libitm: Optimize undo log.
From: Torvald Riegel
To: GCC Patches
Cc: Richard Henderson, Aldy Hernandez
Date: Fri, 06 Jan 2012 01:15:36 +0100
Message-ID: <1325808936.7636.3531.camel@triegel.csb>

This fixes a known, obvious performance problem in libitm's undo log:
before, each write was put into a separately allocated buffer, which kills
performance.  With the patch, everything goes into a single buffer: undo
data first, then the length of the data, then the address.  We only need
this data when rolling back, starting from the most recent entry.  Address
and length are gtm_word-sized anyway (== uintptr_t), and the undo data is
padded to a multiple of that size so that we don't have to bother with
unaligned accesses.

Additionally, the undo-log logging code (which is called on each store) is
moved into a header to allow easy inlining.  With that, the memcpy calls
used to copy the data can also be optimized, because the length is known.
This reduces overheads further, especially for doubles, it seems (tested
on x86_64).

The STAMP apps seem to work too, and run much faster.  Surprise! :)
I'll try to get some more testing on ppc tomorrow or so.

OK for trunk?

commit 93c95a6d55e79ce61eaf4aec4265974ded967442
Author: Torvald Riegel
Date:   Thu Jan 5 21:49:48 2012 +0100

    libitm: Optimize undo log.

	libitm/
	* local.cc (GTM_LB): Use GTM::gtm_undolog.
	(GTM::gtm_thread::drop_references_undolog): Remove.
	(GTM::gtm_thread::commit_undolog,
	GTM::gtm_thread::rollback_undolog): Move to ...
	* libitm_i.h (GTM::gtm_undolog): ...here.  New.
	(GTM::gtm_undolog_entry): Remove.
	(GTM::gtm_thread): Adapt.
	* beginend.cc (GTM::gtm_thread::rollback): Adapt.
	(GTM::gtm_thread::trycommit): Adapt.
	* method-serial.cc (serial_dispatch::log): Adapt.
	* method-gl.cc (gl_wt_dispatch::pre_write): Adapt.
	(gl_wt_dispatch::store): Fix likely/unlikely.
	* containers.h (GTM::vector::resize): Add additional_capacity
	parameter and handle it.
	(GTM::vector::resize_noinline): New/adapt.
	(GTM::vector::push): New.

diff --git a/libitm/beginend.cc b/libitm/beginend.cc
index 7975481..fe14f32 100644
--- a/libitm/beginend.cc
+++ b/libitm/beginend.cc
@@ -1,4 +1,4 @@
-/* Copyright (C) 2008, 2009, 2011 Free Software Foundation, Inc.
+/* Copyright (C) 2008, 2009, 2011, 2012 Free Software Foundation, Inc.
    Contributed by Richard Henderson.
 
    This file is part of the GNU Transactional Memory Library (libitm).
@@ -327,7 +327,7 @@ GTM::gtm_thread::rollback (gtm_transaction_cp *cp, bool aborting)
   // data. Because of the latter, we have to roll it back before any
   // dispatch-specific rollback (which handles synchronization with other
   // transactions).
-  rollback_undolog (cp ? cp->undolog_size : 0);
+  undolog.rollback (cp ? cp->undolog_size : 0);
 
   // Perform dispatch-specific rollback.
   abi_disp()->rollback (cp);
@@ -470,7 +470,7 @@ GTM::gtm_thread::trycommit ()
   // We can commit the undo log after dispatch-specific commit and after
   // making the transaction inactive because we only have to reset
   // gtm_thread state.
-  commit_undolog ();
+  undolog.commit ();
   // Reset further transaction state.
   cxa_catch_count = 0;
   cxa_unthrown = NULL;
diff --git a/libitm/containers.h b/libitm/containers.h
index e8aa94b..394b6f2 100644
--- a/libitm/containers.h
+++ b/libitm/containers.h
@@ -1,4 +1,4 @@
-/* Copyright (C) 2011 Free Software Foundation, Inc.
+/* Copyright (C) 2011, 2012 Free Software Foundation, Inc.
    Contributed by Torvald Riegel.
 
    This file is part of the GNU Transactional Memory Library (libitm).
@@ -70,17 +70,24 @@ class vector
   }
   ~vector() { if (m_capacity) free(entries); }
 
-  void resize()
+  void resize(size_t additional_capacity)
   {
-    if (m_capacity >= default_resize_max)
-      m_capacity = m_capacity + default_resize_max;
+    size_t target = m_capacity + additional_capacity;
+    if (target > default_resize_max)
+      m_capacity = ((target - 1 + default_resize_max) / default_resize_max)
+          * default_resize_max;
     else
-      m_capacity = m_capacity * 2;
+      while (m_capacity < target)
+        m_capacity = m_capacity * 2;
     if (m_capacity < default_resize_min)
       m_capacity = default_resize_min;
     entries = (T*) xrealloc(entries, sizeof(T) * m_capacity,
        alloc_separate_cl);
   }
-  void resize_noinline() __attribute__((noinline)) { resize(); }
+  void resize_noinline() __attribute__((noinline)) { resize(1); }
+  void resize_noinline(size_t elements) __attribute__((noinline))
+  {
+    resize(elements);
+  }
 
   size_t size() const { return m_size; }
   size_t capacity() const { return this->capacity; }
@@ -93,6 +100,15 @@ class vector
     return &entries[m_size++];
   }
 
+  iterator push(size_t elements)
+  {
+    // We don't want inlining here since push() is often on the fast path.
+    if (unlikely(m_size + elements > m_capacity)) resize_noinline(elements);
+    iterator it = &entries[m_size];
+    m_size += elements;
+    return it;
+  }
+
   iterator pop()
   {
     if (likely(m_size > 0))
       {
diff --git a/libitm/libitm_i.h b/libitm/libitm_i.h
index b53792a..f922d22 100644
--- a/libitm/libitm_i.h
+++ b/libitm/libitm_i.h
@@ -1,4 +1,4 @@
-/* Copyright (C) 2008, 2009, 2011 Free Software Foundation, Inc.
+/* Copyright (C) 2008, 2009, 2011, 2012 Free Software Foundation, Inc.
    Contributed by Richard Henderson.
 
    This file is part of the GNU Transactional Memory Library (libitm).
@@ -93,9 +93,6 @@ struct gtm_alloc_action
   bool allocated;
 };
 
-// This type is private to local.c.
-struct gtm_undolog_entry;
-
 struct gtm_thread;
 
 // A transaction checkpoint: data that has to be saved and restored when doing
@@ -121,6 +118,29 @@ struct gtm_transaction_cp
   void commit(gtm_thread* tx);
 };
 
+// An undo log for writes.
+struct gtm_undolog
+{
+  vector<gtm_word> undolog;
+
+  // Log the previous value at a certain address.
+  // The easiest way to inline this is to just define this here.
+  void log(const void *ptr, size_t len)
+  {
+    size_t words = (len + sizeof(gtm_word) - 1) / sizeof(gtm_word);
+    gtm_word *undo = undolog.push(words + 2);
+    memcpy(undo, ptr, len);
+    undo[words] = len;
+    undo[words + 1] = (gtm_word) ptr;
+  }
+
+  void commit () { undolog.clear(); }
+  size_t size() const { return undolog.size(); }
+
+  // In local.cc
+  void rollback (size_t until_size = 0);
+};
+
 // Contains all thread-specific data required by the entire library.
 // This includes all data relevant to a single transaction. Because most
 // thread-specific data is about the current transaction, we also refer to
@@ -148,7 +168,7 @@ struct gtm_thread
   gtm_jmpbuf jb;
 
   // Data used by local.c for the undo log for both local and shared memory.
-  vector<gtm_undolog_entry*> undolog;
+  gtm_undolog undolog;
 
   // Data used by alloc.c for the malloc/free undo log.
   aa_tree<uintptr_t, gtm_alloc_action> alloc_actions;
@@ -254,11 +274,6 @@ struct gtm_thread
   // In eh_cpp.cc
   void revert_cpp_exceptions (gtm_transaction_cp *cp = 0);
 
-  // In local.cc
-  void commit_undolog (void);
-  void rollback_undolog (size_t until_size = 0);
-  void drop_references_undolog (const void *, size_t);
-
   // In retry.cc
   // Must be called outside of transactions (i.e., after rollback).
   void decide_retry_strategy (gtm_restart_reason);
diff --git a/libitm/local.cc b/libitm/local.cc
index 4f47ff2..39b6da3 100644
--- a/libitm/local.cc
+++ b/libitm/local.cc
@@ -1,4 +1,4 @@
-/* Copyright (C) 2008, 2009, 2011 Free Software Foundation, Inc.
+/* Copyright (C) 2008, 2009, 2011, 2012 Free Software Foundation, Inc.
    Contributed by Richard Henderson.
 
    This file is part of the GNU Transactional Memory Library (libitm).
@@ -26,29 +26,9 @@
 namespace GTM HIDDEN {
 
-struct gtm_undolog_entry
-{
-  void *addr;
-  size_t len;
-  char saved[];
-};
-
 void
-gtm_thread::commit_undolog ()
-{
-  size_t i, n = undolog.size();
-
-  if (n > 0)
-    {
-      for (i = 0; i < n; ++i)
-	free (undolog[i]);
-      this->undolog.clear();
-    }
-}
-
-void
-gtm_thread::rollback_undolog (size_t until_size)
+gtm_undolog::rollback (size_t until_size)
 {
   size_t i, n = undolog.size();
 
@@ -56,36 +36,11 @@ gtm_thread::rollback_undolog (size_t until_size)
     {
       for (i = n; i-- > until_size; )
	{
-	  gtm_undolog_entry *u = *undolog.pop();
-	  if (u)
-	    {
-	      memcpy (u->addr, u->saved, u->len);
-	      free (u);
-	    }
-	}
-    }
-}
-
-/* Forget any references to PTR in the local log.  */
-
-void
-gtm_thread::drop_references_undolog (const void *ptr, size_t len)
-{
-  size_t i, n = undolog.size();
-
-  if (n > 0)
-    {
-      for (i = n; i > 0; i--)
-	{
-	  gtm_undolog_entry *u = undolog[i];
-	  /* ?? Do we need such granularity, or can we get away with
-	     just comparing PTR and LEN. ?? */
-	  if ((const char *)u->addr >= (const char *)ptr
-	      && ((const char *)u->addr + u->len <= (const char *)ptr + len))
-	    {
-	      free (u);
-	      undolog[i] = NULL;
-	    }
+	  void *ptr = (void *) undolog[i--];
+	  size_t len = undolog[i];
+	  size_t words = (len + sizeof(gtm_word) - 1) / sizeof(gtm_word);
+	  i -= words;
+	  __builtin_memcpy (ptr, &undolog[i], len);
	}
     }
 }
@@ -94,16 +49,7 @@ void ITM_REGPARM
 GTM_LB (const void *ptr, size_t len)
 {
   gtm_thread *tx = gtm_thr();
-  gtm_undolog_entry *undo;
-
-  undo = (gtm_undolog_entry *)
-    xmalloc (sizeof (struct gtm_undolog_entry) + len);
-  undo->addr = (void *) ptr;
-  undo->len = len;
-
-  tx->undolog.push()[0] = undo;
-
-  memcpy (undo->saved, ptr, len);
+  tx->undolog.log(ptr, len);
 }
 
 } // namespace GTM
diff --git a/libitm/method-gl.cc b/libitm/method-gl.cc
index e678da7..d6d050d 100644
--- a/libitm/method-gl.cc
+++ b/libitm/method-gl.cc
@@ -1,4 +1,4 @@
-/* Copyright (C) 2011 Free Software Foundation, Inc.
+/* Copyright (C) 2011, 2012 Free Software Foundation, Inc.
    Contributed by Torvald Riegel.
 
    This file is part of the GNU Transactional Memory Library (libitm).
@@ -120,8 +120,7 @@ protected:
	  tx->shared_state.store(gl_mg::set_locked(now), memory_order_release);
	}
 
-      // TODO Ensure that this gets inlined: Use internal log interface and LTO.
-      GTM_LB(addr, len);
+      tx->undolog.log(addr, len);
     }
 
   static void validate()
@@ -181,7 +180,7 @@ protected:
   template <typename V> static void store(V* addr, const V value,
       ls_modifier mod)
   {
-    if (unlikely(mod != WaW))
+    if (likely(mod != WaW))
       pre_write(addr, sizeof(V));
     // FIXME We would need an atomic store here but we can't just forge an
     // atomic load for nonatomic data because this might not work on all
diff --git a/libitm/method-serial.cc b/libitm/method-serial.cc
index bf79826..bdecd7b 100644
--- a/libitm/method-serial.cc
+++ b/libitm/method-serial.cc
@@ -1,4 +1,4 @@
-/* Copyright (C) 2008, 2009, 2011 Free Software Foundation, Inc.
+/* Copyright (C) 2008, 2009, 2011, 2012 Free Software Foundation, Inc.
    Contributed by Richard Henderson.
 
    This file is part of the GNU Transactional Memory Library (libitm).
@@ -107,8 +107,8 @@ class serial_dispatch : public abi_dispatch
 protected:
   static void log(const void *addr, size_t len)
   {
-    // TODO Ensure that this gets inlined: Use internal log interface and LTO.
-    GTM_LB(addr, len);
+    gtm_thread *tx = gtm_thr();
+    tx->undolog.log(addr, len);
   }
 
   template <typename V> static V load(const V* addr, ls_modifier mod)