Subject: [trans-mem] Add futex-based serial lock
From: Torvald Riegel
To: GCC Patches
Cc: Richard Henderson
Date: Sat, 20 Aug 2011 17:51:57 +0200
Message-ID: <1313855517.3533.2476.camel@triegel.csb>

This adds a futex-based version of the serial lock for use on Linux.
The futex code is basically old libitm code (it was removed in SVN rev 157758) with one fix for sys_futex0 on x86_64 and one change that makes futex_wake return the number of woken processes.

The gtm_rwlock is similar in concept to the mutex-based version, but adapted to futexes. It performs better than the mutex-based version. The numbers are not really great yet, but there is no spinning yet, so on contention we'll always have the overhead of waiting via the futexes.

Again, RBTree with 1/2/4/6 threads, with different update %:

  0%:  49     89    120   120
  1%:  48     65    75    80
 20%:  35     10    10    9
100%:  16.5   2.5   2.5   2.5

For comparison, the mutex-based version:

  0%:  49 / 90 / 120
  1%:  47 / 59 / 27
 20%:  34 / 6 / 3
100%:  15 / 1 / 1

OK for branch?

commit 0b95e53c6da549032ebf7533a4dfea75a7ccb1b2
Author: Torvald Riegel
Date:   Fri Aug 19 15:56:43 2011 +0200

    Add futex-based serial lock.

    * config/linux/rwlock.h: New file.
    * config/linux/rwlock.cc: New file.
    * configure.ac: Reenable futex support (undo SVN rev 157758).
    * Makefile.am: Same.
    * configure.tgt: Same.
    * config/linux/alpha/futex_bits.h: Same.
    * config/linux/futex.h: Same. Return number of woken processes.
    * config/linux/futex.cc: Same.
    (futex_wait): Remove spinning.
    * config/linux/x86/futex_bits.h: Same. Set futex timeout to zero.
    * aclocal.m4: Include generic futex checks.
    * configure: Rebuild.
    * Makefile.in: Rebuild.
    * testsuite/Makefile.in: Rebuild.
    * beginend.cc: Include pthread.h.
    * config/posix/cachepage.cc: Same.
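
Note for reviewers less familiar with the raw interface: the two properties the lock relies on are that a wait returns immediately when the futex word no longer holds the expected value (futex_wait in this patch treats -EWOULDBLOCK as benign), and that a wake reports how many waiters it actually woke. The following is only a minimal sketch of those two properties, assuming Linux and the generic syscall() wrapper rather than the per-architecture sys_futex0 inline assembly in the patch. The sketch_* names are hypothetical and are not part of libitm.

```cpp
// Sketch only: goes through syscall(2) instead of the inline assembly in
// futex_bits.h.  All sketch_* names are hypothetical, not libitm API.
#include <cassert>
#include <cerrno>
#include <linux/futex.h>
#include <sys/syscall.h>
#include <unistd.h>

// Block while *addr == val.  Returns 0 after a wake-up, or -errno on
// failure (-EAGAIN if *addr already differed from val).
static long sketch_futex_wait(int *addr, int val)
{
  long res = syscall(SYS_futex, addr, FUTEX_WAIT | FUTEX_PRIVATE_FLAG, val,
                     nullptr, nullptr, 0);
  return res == -1 ? -errno : res;
}

// Wake at most count waiters blocked on addr.  Returns the number of
// waiters woken, or -errno on failure.
static long sketch_futex_wake(int *addr, int count)
{
  long res = syscall(SYS_futex, addr, FUTEX_WAKE | FUTEX_PRIVATE_FLAG, count,
                     nullptr, nullptr, 0);
  return res == -1 ? -errno : res;
}

// Single-threaded check of the two properties described above.
static void sketch_futex_selfcheck()
{
  int word = 0;
  // Nobody is waiting, so the wake reports zero woken waiters.
  assert(sketch_futex_wake(&word, 1) == 0);
  // word is 0 but we claim to expect 1, so the kernel refuses to block.
  assert(sketch_futex_wait(&word, 1) == -EAGAIN);
}
```

On Linux EAGAIN and EWOULDBLOCK are the same value, which is why the sketch checks for -EAGAIN while futex.cc tests for -EWOULDBLOCK. The zero return from the wake is the case write_unlock() uses to decide that it must wake readers as well.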
diff --git a/libitm/Makefile.am b/libitm/Makefile.am index 09c21ff..ee1822b 100644 --- a/libitm/Makefile.am +++ b/libitm/Makefile.am @@ -51,6 +51,9 @@ x86_sse.lo : XCFLAGS += -msse x86_avx.lo : XCFLAGS += -mavx endif +if ARCH_FUTEX +libitm_la_SOURCES += futex.cc +endif # Automake Documentation: # If your package has Texinfo files in many directories, you can use the diff --git a/libitm/Makefile.in b/libitm/Makefile.in index 57f76f6..524753e 100644 --- a/libitm/Makefile.in +++ b/libitm/Makefile.in @@ -37,6 +37,7 @@ build_triplet = @build@ host_triplet = @host@ target_triplet = @target@ @ARCH_X86_TRUE@am__append_1 = x86_sse.cc x86_avx.cc +@ARCH_FUTEX_TRUE@am__append_2 = futex.cc subdir = . DIST_COMMON = $(am__configure_deps) $(srcdir)/../config.guess \ $(srcdir)/../config.sub $(srcdir)/../depcomp \ @@ -49,6 +50,7 @@ ACLOCAL_M4 = $(top_srcdir)/aclocal.m4 am__aclocal_m4_deps = $(top_srcdir)/../config/acx.m4 \ $(top_srcdir)/../config/depstand.m4 \ $(top_srcdir)/../config/enable.m4 \ + $(top_srcdir)/../config/futex.m4 \ $(top_srcdir)/../config/lead-dot.m4 \ $(top_srcdir)/../config/mmap.m4 \ $(top_srcdir)/../config/multi.m4 \ @@ -95,12 +97,14 @@ am__libitm_la_SOURCES_DIST = aatree.cc alloc.cc alloc_c.cc \ alloc_cpp.cc barrier.cc beginend.cc clone.cc cacheline.cc \ cachepage.cc eh_cpp.cc local.cc query.cc retry.cc rwlock.cc \ useraction.cc util.cc sjlj.S tls.cc method-serial.cc \ - x86_sse.cc x86_avx.cc + x86_sse.cc x86_avx.cc futex.cc @ARCH_X86_TRUE@am__objects_1 = x86_sse.lo x86_avx.lo +@ARCH_FUTEX_TRUE@am__objects_2 = futex.lo am_libitm_la_OBJECTS = aatree.lo alloc.lo alloc_c.lo alloc_cpp.lo \ barrier.lo beginend.lo clone.lo cacheline.lo cachepage.lo \ eh_cpp.lo local.lo query.lo retry.lo rwlock.lo useraction.lo \ - util.lo sjlj.lo tls.lo method-serial.lo $(am__objects_1) + util.lo sjlj.lo tls.lo method-serial.lo $(am__objects_1) \ + $(am__objects_2) libitm_la_OBJECTS = $(am_libitm_la_OBJECTS) DEFAULT_INCLUDES = -I.@am__isrc@ depcomp = $(SHELL) $(top_srcdir)/../depcomp 
@@ -369,7 +373,8 @@ libitm_la_LDFLAGS = $(libitm_version_info) $(libitm_version_script) \ libitm_la_SOURCES = aatree.cc alloc.cc alloc_c.cc alloc_cpp.cc \ barrier.cc beginend.cc clone.cc cacheline.cc cachepage.cc \ eh_cpp.cc local.cc query.cc retry.cc rwlock.cc useraction.cc \ - util.cc sjlj.S tls.cc method-serial.cc $(am__append_1) + util.cc sjlj.S tls.cc method-serial.cc $(am__append_1) \ + $(am__append_2) # Automake Documentation: # If your package has Texinfo files in many directories, you can use the @@ -499,6 +504,7 @@ distclean-compile: @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/cachepage.Plo@am__quote@ @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/clone.Plo@am__quote@ @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/eh_cpp.Plo@am__quote@ +@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/futex.Plo@am__quote@ @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/local.Plo@am__quote@ @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/method-serial.Plo@am__quote@ @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/query.Plo@am__quote@ diff --git a/libitm/aclocal.m4 b/libitm/aclocal.m4 index ff6ddad..6dcccdf 100644 --- a/libitm/aclocal.m4 +++ b/libitm/aclocal.m4 @@ -993,6 +993,7 @@ AC_SUBST([am__untar]) m4_include([../config/acx.m4]) m4_include([../config/depstand.m4]) m4_include([../config/enable.m4]) +m4_include([../config/futex.m4]) m4_include([../config/lead-dot.m4]) m4_include([../config/mmap.m4]) m4_include([../config/multi.m4]) diff --git a/libitm/beginend.cc b/libitm/beginend.cc index 7863042..e53ea6c 100644 --- a/libitm/beginend.cc +++ b/libitm/beginend.cc @@ -23,6 +23,7 @@ . */ #include "libitm_i.h" +#include using namespace GTM; diff --git a/libitm/config/linux/alpha/futex_bits.h b/libitm/config/linux/alpha/futex_bits.h new file mode 100644 index 0000000..5a7a474 --- /dev/null +++ b/libitm/config/linux/alpha/futex_bits.h @@ -0,0 +1,56 @@ +/* Copyright (C) 2008, 2009 Free Software Foundation, Inc. + Contributed by Richard Henderson . 
+ + This file is part of the GNU Transactional Memory Library (libitm). + + Libitm is free software; you can redistribute it and/or modify it + under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 3 of the License, or + (at your option) any later version. + + Libitm is distributed in the hope that it will be useful, but WITHOUT ANY + WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS + FOR A PARTICULAR PURPOSE. See the GNU General Public License for + more details. + + Under Section 7 of GPL version 3, you are granted additional + permissions described in the GCC Runtime Library Exception, version + 3.1, as published by the Free Software Foundation. + + You should have received a copy of the GNU General Public License and + a copy of the GCC Runtime Library Exception along with this program; + see the files COPYING3 and COPYING.RUNTIME respectively. If not, see + . */ + +/* Provide target-specific access to the futex system call. */ + +#ifndef SYS_futex +#define SYS_futex 394 +#endif + +static inline long +sys_futex0 (int *addr, long op, long val) +{ + register long sc_0 __asm__("$0"); + register long sc_16 __asm__("$16"); + register long sc_17 __asm__("$17"); + register long sc_18 __asm__("$18"); + register long sc_19 __asm__("$19"); + long res; + + sc_0 = SYS_futex; + sc_16 = (long) addr; + sc_17 = op; + sc_18 = val; + sc_19 = 0; + __asm volatile ("callsys" + : "=r" (sc_0), "=r"(sc_19) + : "0"(sc_0), "r" (sc_16), "r"(sc_17), "r"(sc_18), "1"(sc_19) + : "$1", "$2", "$3", "$4", "$5", "$6", "$7", "$8", + "$22", "$23", "$24", "$25", "$27", "$28", "memory"); + + res = sc_0; + if (__builtin_expect (sc_19, 0)) + res = -res; + return res; +} diff --git a/libitm/config/linux/futex.cc b/libitm/config/linux/futex.cc new file mode 100644 index 0000000..9585108 --- /dev/null +++ b/libitm/config/linux/futex.cc @@ -0,0 +1,82 @@ +/* Copyright (C) 2008, 2009, 2011 Free Software Foundation, Inc. 
+ Contributed by Richard Henderson . + + This file is part of the GNU Transactional Memory Library (libitm). + + Libitm is free software; you can redistribute it and/or modify it + under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 3 of the License, or + (at your option) any later version. + + Libitm is distributed in the hope that it will be useful, but WITHOUT ANY + WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS + FOR A PARTICULAR PURPOSE. See the GNU General Public License for + more details. + + Under Section 7 of GPL version 3, you are granted additional + permissions described in the GCC Runtime Library Exception, version + 3.1, as published by the Free Software Foundation. + + You should have received a copy of the GNU General Public License and + a copy of the GCC Runtime Library Exception along with this program; + see the files COPYING3 and COPYING.RUNTIME respectively. If not, see + . */ + +/* Provide access to the futex system call. 
*/ + +#include "libitm_i.h" +#include "futex.h" +#include + +namespace GTM HIDDEN { + +#define FUTEX_WAIT 0 +#define FUTEX_WAKE 1 +#define FUTEX_PRIVATE_FLAG 128L + + +static long int gtm_futex_wait = FUTEX_WAIT | FUTEX_PRIVATE_FLAG; +static long int gtm_futex_wake = FUTEX_WAKE | FUTEX_PRIVATE_FLAG; + + +void +futex_wait (int *addr, int val) +{ + long res; + + res = sys_futex0 (addr, gtm_futex_wait, val); + if (__builtin_expect (res == -ENOSYS, 0)) + { + gtm_futex_wait = FUTEX_WAIT; + gtm_futex_wake = FUTEX_WAKE; + res = sys_futex0 (addr, FUTEX_WAIT, val); + } + if (__builtin_expect (res < 0, 0)) + { + if (res == -EWOULDBLOCK || res == -ETIMEDOUT) + ; + else if (res == -EFAULT) + GTM_fatal ("futex failed (EFAULT %p)", addr); + else + GTM_fatal ("futex failed (%s)", strerror(-res)); + } +} + + +long +futex_wake (int *addr, int count) +{ + long res = sys_futex0 (addr, gtm_futex_wake, count); + if (__builtin_expect (res == -ENOSYS, 0)) + { + gtm_futex_wait = FUTEX_WAIT; + gtm_futex_wake = FUTEX_WAKE; + res = sys_futex0 (addr, FUTEX_WAKE, count); + } + if (__builtin_expect (res < 0, 0)) + GTM_fatal ("futex failed (%s)", strerror(-res)); + else + return res; +} + +} // namespace GTM diff --git a/libitm/config/linux/futex.h b/libitm/config/linux/futex.h new file mode 100644 index 0000000..3471119 --- /dev/null +++ b/libitm/config/linux/futex.h @@ -0,0 +1,39 @@ +/* Copyright (C) 2008, 2009 Free Software Foundation, Inc. + Contributed by Richard Henderson . + + This file is part of the GNU Transactional Memory Library (libitm). + + Libitm is free software; you can redistribute it and/or modify it + under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 3 of the License, or + (at your option) any later version. + + Libitm is distributed in the hope that it will be useful, but WITHOUT ANY + WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS + FOR A PARTICULAR PURPOSE. 
See the GNU General Public License for + more details. + + Under Section 7 of GPL version 3, you are granted additional + permissions described in the GCC Runtime Library Exception, version + 3.1, as published by the Free Software Foundation. + + You should have received a copy of the GNU General Public License and + a copy of the GCC Runtime Library Exception along with this program; + see the files COPYING3 and COPYING.RUNTIME respectively. If not, see + . */ + +/* Provide access to the futex system call. */ + +#ifndef GTM_FUTEX_H +#define GTM_FUTEX_H 1 + +namespace GTM HIDDEN { + +#include "futex_bits.h" + +extern void futex_wait (int *addr, int val); +extern long futex_wake (int *addr, int count); + +} + +#endif /* GTM_FUTEX_H */ diff --git a/libitm/config/linux/rwlock.cc b/libitm/config/linux/rwlock.cc new file mode 100644 index 0000000..ea5427d --- /dev/null +++ b/libitm/config/linux/rwlock.cc @@ -0,0 +1,235 @@ +/* Copyright (C) 2011 Free Software Foundation, Inc. + Contributed by Torvald Riegel . + + This file is part of the GNU Transactional Memory Library (libitm). + + Libitm is free software; you can redistribute it and/or modify it + under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 3 of the License, or + (at your option) any later version. + + Libitm is distributed in the hope that it will be useful, but WITHOUT ANY + WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS + FOR A PARTICULAR PURPOSE. See the GNU General Public License for + more details. + + Under Section 7 of GPL version 3, you are granted additional + permissions described in the GCC Runtime Library Exception, version + 3.1, as published by the Free Software Foundation. + + You should have received a copy of the GNU General Public License and + a copy of the GCC Runtime Library Exception along with this program; + see the files COPYING3 and COPYING.RUNTIME respectively. If not, see + . 
*/
+
+#include "libitm_i.h"
+#include "futex.h"
+#include
+
+namespace GTM HIDDEN {
+
+// Acquire a RW lock for reading.
+
+void
+gtm_rwlock::read_lock (gtm_thread *tx)
+{
+  for (;;)
+    {
+      // Fast path: first announce our intent to read, then check for
+      // conflicting intents to write. The barrier makes sure that this
+      // happens in exactly this order.
+      tx->shared_state = 0;
+      __sync_synchronize();
+      if (likely(writers == 0))
+        return;
+
+      // There seems to be an active, waiting, or confirmed writer, so enter
+      // the futex-based slow path.
+
+      // Before waiting, we clear our read intent and check whether there are
+      // any writers that might potentially wait for readers. If so, wake
+      // them. We need the barrier here for the same reason that we need it
+      // in read_unlock().
+      // TODO Potentially too many wake-ups. See comments in read_unlock().
+      tx->shared_state = ~(typeof tx->shared_state)0;
+      __sync_synchronize();
+      if (writer_readers > 0)
+        {
+          writer_readers = 0;
+          futex_wake(&writer_readers, 1);
+        }
+
+      // Signal that there are waiting readers and wait until there is no
+      // writer anymore.
+      // TODO Spin here on writers for a while. Consider whether we woke
+      // any writers before?
+      while (writers)
+        {
+          // An active writer. Wait until it has finished. To avoid lost
+          // wake-ups, we need to use Dekker-like synchronization.
+          // Note that we cannot reset readers to zero when we see that there
+          // are no writers anymore after the barrier because this pending
+          // store could then lead to lost wake-ups at other readers.
+          readers = 1;
+          __sync_synchronize();
+          if (writers)
+            futex_wait(&readers, 1);
+        }
+
+      // And we try again to acquire a read lock.
+    }
+}
+
+
+// Acquire a RW lock for writing. Generic version that also works for
+// upgrades.
+// Note that an upgrade might fail (and thus waste previous work done during
+// this transaction) if there is another thread that tried to go into serial
+// mode earlier (i.e., upgrades do not have higher priority than pure writers).
+// However, this seems rare enough to not consider it further as we need both
+// a non-upgrade writer and a writer to happen to switch to serial mode
+// concurrently. If we'd want to handle this, a writer waiting for readers
+// would have to coordinate with later arriving upgrades and hand over the
+// lock to them, including the reader-waiting state. We can try to support
+// this if this will actually happen often enough in real workloads.
+
+bool
+gtm_rwlock::write_lock_generic (gtm_thread *tx)
+{
+  // Try to acquire the write lock.
+  unsigned int w;
+  if (unlikely((w = __sync_val_compare_and_swap(&writers, 0, 1)) != 0))
+    {
+      // If this is an upgrade, we must not wait for other writers or
+      // upgrades.
+      if (tx != 0)
+        return false;
+
+      // There is already a writer. If there are no other waiting writers,
+      // switch to contended mode.
+      // Note that this is actually an atomic exchange, not a TAS. Also,
+      // it's only guaranteed to have acquire semantics, whereas we need a
+      // full barrier to make the Dekker-style synchronization work. However,
+      // we rely on the xchg being a full barrier on the architectures that we
+      // consider here.
+      // ??? Use C++0x atomics as soon as they are available.
+      if (w != 2)
+        w = __sync_lock_test_and_set(&writers, 2);
+      while (w != 0)
+        {
+          futex_wait(&writers, 2);
+          w = __sync_lock_test_and_set(&writers, 2);
+        }
+    }
+
+  // We have acquired the writer side of the R/W lock. Now wait for any
+  // readers that might still be active.
+  // We don't need an extra barrier here because the CAS and the xchg
+  // operations have full barrier semantics already.
+
+  // If this is an upgrade, we are not a reader anymore. This is only safe to
+  // do after we have acquired the writer lock.
+  // TODO In the worst case, this requires one wait/wake pair for each
+  // active reader. Reduce this!
+  if (tx != 0)
+    tx->shared_state = ~(typeof tx->shared_state)0;
+
+  for (gtm_thread *it = gtm_thread::list_of_threads; it != 0;
+      it = it->next_thread)
+    {
+      // Use a loop here to check reader flags again after waiting.
+      while (it->shared_state != ~(typeof it->shared_state)0)
+        {
+          // An active reader. Wait until it has finished. To avoid lost
+          // wake-ups, we need to use Dekker-like synchronization.
+          // Note that we can reset writer_readers to zero when we see after
+          // the barrier that the reader has finished in the meantime;
+          // however, this is only possible because we are the only writer.
+          // TODO Spin for a while on this reader flag.
+          writer_readers = 1;
+          __sync_synchronize();
+          if (it->shared_state != ~(typeof it->shared_state)0)
+            futex_wait(&writer_readers, 1);
+          else
+            writer_readers = 0;
+        }
+    }
+
+  return true;
+}
+
+// Acquire a RW lock for writing.
+
+void
+gtm_rwlock::write_lock ()
+{
+  write_lock_generic (0);
+}
+
+
+// Upgrade a RW lock that has been locked for reading to a writing lock.
+// Do this without possibility of another writer incoming. Return false
+// if this attempt fails (i.e. another thread also upgraded).
+
+bool
+gtm_rwlock::write_upgrade (gtm_thread *tx)
+{
+  return write_lock_generic (tx);
+}
+
+
+// Release a RW lock from reading.
+
+void
+gtm_rwlock::read_unlock (gtm_thread *tx)
+{
+  tx->shared_state = ~(typeof tx->shared_state)0;
+
+  // If there is a writer waiting for readers, wake it up. We need the barrier
+  // to avoid lost wake-ups.
+  // ??? We might not be the last active reader, so the wake-up might happen
+  // too early. How do we avoid this without slowing down readers too much?
+  // Each reader could scan the list of txns for other active readers but
+  // this can result in many cache misses. Use combining instead?
+  // TODO Sends out one wake-up for each reader in the worst case.
+  __sync_synchronize();
+  if (unlikely(writer_readers > 0))
+    {
+      writer_readers = 0;
+      futex_wake(&writer_readers, 1);
+    }
+}
+
+
+// Release a RW lock from writing.
+
+void
+gtm_rwlock::write_unlock ()
+{
+  // This is supposed to be a full barrier.
+  if (__sync_fetch_and_sub(&writers, 1) == 2)
+    {
+      // There might be waiting writers, so wake them.
+      writers = 0;
+      if (futex_wake(&writers, 1) == 0)
+        {
+          // If we did not wake any waiting writers, we might indeed be the
+          // last writer (this can happen because write_lock_generic()
+          // exchanges 0 or 1 to 2 and thus might go to contended mode even if
+          // no other thread holds the write lock currently). Therefore, we
+          // have to wake up readers here as well.
+          futex_wake(&readers, INT_MAX);
+        }
+      return;
+    }
+  // No waiting writers, so wake up all waiting readers.
+  // Because the fetch_and_sub is a full barrier already, we don't need
+  // another barrier here (as in read_unlock()).
+  if (readers > 0)
+    {
+      readers = 0;
+      futex_wake(&readers, INT_MAX);
+    }
+}
+
+} // namespace GTM
diff --git a/libitm/config/linux/rwlock.h b/libitm/config/linux/rwlock.h
new file mode 100644
index 0000000..7e6229b
--- /dev/null
+++ b/libitm/config/linux/rwlock.h
@@ -0,0 +1,66 @@
+/* Copyright (C) 2011 Free Software Foundation, Inc.
+   Contributed by Torvald Riegel .
+
+   This file is part of the GNU Transactional Memory Library (libitm).
+
+   Libitm is free software; you can redistribute it and/or modify it
+   under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   Libitm is distributed in the hope that it will be useful, but WITHOUT ANY
+   WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
+   FOR A PARTICULAR PURPOSE. See the GNU General Public License for
+   more details.
+ + Under Section 7 of GPL version 3, you are granted additional + permissions described in the GCC Runtime Library Exception, version + 3.1, as published by the Free Software Foundation. + + You should have received a copy of the GNU General Public License and + a copy of the GCC Runtime Library Exception along with this program; + see the files COPYING3 and COPYING.RUNTIME respectively. If not, see + . */ + +#ifndef GTM_RWLOCK_H +#define GTM_RWLOCK_H + +#include "common.h" + +namespace GTM HIDDEN { + +struct gtm_thread; + +// This datastructure is the blocking, futex-based version of the Dekker-style +// reader-writer lock used to provide mutual exclusion between active and +// serial transactions. +// See libitm's documentation for further details. +// +// In this implementation, writers are given highest priority access but +// read-to-write upgrades do not have a higher priority than writers. + +class gtm_rwlock +{ + // TODO Put futexes on different cachelines? + int writers; // Writers' futex. + int writer_readers; // A confirmed writer waits here for readers. + int readers; // Readers wait here for writers (iff true). + + public: + gtm_rwlock() : writers(0), writer_readers(0), readers(0) {}; + + void read_lock (gtm_thread *tx); + void read_unlock (gtm_thread *tx); + + void write_lock (); + void write_unlock (); + + bool write_upgrade (gtm_thread *tx); + + protected: + bool write_lock_generic (gtm_thread *tx); +}; + +} // namespace GTM + +#endif // GTM_RWLOCK_H diff --git a/libitm/config/linux/x86/futex_bits.h b/libitm/config/linux/x86/futex_bits.h new file mode 100644 index 0000000..9362107 --- /dev/null +++ b/libitm/config/linux/x86/futex_bits.h @@ -0,0 +1,82 @@ +/* Copyright (C) 2008, 2009 Free Software Foundation, Inc. + Contributed by Richard Henderson . + + This file is part of the GNU Transactional Memory Library (libitm). 
+ + Libitm is free software; you can redistribute it and/or modify it + under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 3 of the License, or + (at your option) any later version. + + Libitm is distributed in the hope that it will be useful, but WITHOUT ANY + WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS + FOR A PARTICULAR PURPOSE. See the GNU General Public License for + more details. + + Under Section 7 of GPL version 3, you are granted additional + permissions described in the GCC Runtime Library Exception, version + 3.1, as published by the Free Software Foundation. + + You should have received a copy of the GNU General Public License and + a copy of the GCC Runtime Library Exception along with this program; + see the files COPYING3 and COPYING.RUNTIME respectively. If not, see + . */ + +#ifdef __LP64__ +# ifndef SYS_futex +# define SYS_futex 202 +# endif + +static inline long +sys_futex0 (int *addr, long op, long val) +{ + register long r10 __asm__("%r10") = 0; + long res; + + __asm volatile ("syscall" + : "=a" (res) + : "0" (SYS_futex), "D" (addr), "S" (op), "d" (val), "r" (r10) + : "r11", "rcx", "memory"); + + return res; +} + +#else +# ifndef SYS_futex +# define SYS_futex 240 +# endif + +# ifdef __PIC__ + +static inline long +sys_futex0 (int *addr, int op, int val) +{ + long res; + + __asm volatile ("xchgl\t%%ebx, %2\n\t" + "int\t$0x80\n\t" + "xchgl\t%%ebx, %2" + : "=a" (res) + : "0"(SYS_futex), "r" (addr), "c"(op), + "d"(val), "S"(0) + : "memory"); + return res; +} + +# else + +static inline long +sys_futex0 (int *addr, int op, int val) +{ + long res; + + __asm volatile ("int $0x80" + : "=a" (res) + : "0"(SYS_futex), "b" (addr), "c"(op), + "d"(val), "S"(0) + : "memory"); + return res; +} + +# endif /* __PIC__ */ +#endif /* __LP64__ */ diff --git a/libitm/config/posix/cachepage.cc b/libitm/config/posix/cachepage.cc index 240d09f..3965a9e 100644 --- 
a/libitm/config/posix/cachepage.cc +++ b/libitm/config/posix/cachepage.cc @@ -23,6 +23,7 @@ . */ #include "libitm_i.h" +#include // // We have three possibilities for alloction: mmap, memalign, posix_memalign diff --git a/libitm/configure b/libitm/configure index fe7f15b..6d18d87 100755 --- a/libitm/configure +++ b/libitm/configure @@ -601,6 +601,8 @@ ac_subst_vars='am__EXEEXT_FALSE am__EXEEXT_TRUE LTLIBOBJS LIBOBJS +ARCH_FUTEX_FALSE +ARCH_FUTEX_TRUE ARCH_X86_FALSE ARCH_X86_TRUE link_itm @@ -762,6 +764,7 @@ enable_fast_install with_gnu_ld enable_libtool_lock enable_maintainer_mode +enable_linux_futex enable_tls enable_symvers ' @@ -1412,6 +1415,7 @@ Optional Features: --disable-libtool-lock avoid locking (might break parallel builds) --enable-maintainer-mode enable make rules and dependencies not useful (and sometimes confusing) to the casual installer + --enable-linux-futex use the Linux futex system call [default=default] --enable-tls Use thread-local storage [default=yes] --enable-symvers=STYLE enables symbol versioning of the shared library [default=yes] @@ -11811,7 +11815,7 @@ else lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2 lt_status=$lt_dlunknown cat > conftest.$ac_ext <<_LT_EOF -#line 11814 "configure" +#line 11818 "configure" #include "confdefs.h" #if HAVE_DLFCN_H @@ -11917,7 +11921,7 @@ else lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2 lt_status=$lt_dlunknown cat > conftest.$ac_ext <<_LT_EOF -#line 11920 "configure" +#line 11924 "configure" #include "confdefs.h" #if HAVE_DLFCN_H @@ -19139,6 +19143,113 @@ $as_echo "#define HAVE_BROKEN_POSIX_SEMAPHORES 1" >>confdefs.h ;; esac + # Check whether --enable-linux-futex was given. 
+if test "${enable_linux_futex+set}" = set; then : + enableval=$enable_linux_futex; + case "$enableval" in + yes|no|default) ;; + *) as_fn_error "Unknown argument to enable/disable linux-futex" "$LINENO" 5 ;; + esac + +else + enable_linux_futex=default +fi + + +case "$target" in + *-linux*) + case "$enable_linux_futex" in + default) + # If headers don't have gettid/futex syscalls definition, then + # default to no, otherwise there will be compile time failures. + # Otherwise, default to yes. If we don't detect we are + # compiled/linked against NPTL and not cross-compiling, check + # if programs are run by default against NPTL and if not, issue + # a warning. + enable_linux_futex=no + cat confdefs.h - <<_ACEOF >conftest.$ac_ext +/* end confdefs.h. */ +#include + int lk; +int +main () +{ +syscall (SYS_gettid); syscall (SYS_futex, &lk, 0, 0, 0); + ; + return 0; +} +_ACEOF +if ac_fn_c_try_link "$LINENO"; then : + save_LIBS="$LIBS" + LIBS="-lpthread $LIBS" + cat confdefs.h - <<_ACEOF >conftest.$ac_ext +/* end confdefs.h. */ +#ifndef _GNU_SOURCE + #define _GNU_SOURCE 1 + #endif + #include + pthread_t th; void *status; +int +main () +{ +pthread_tryjoin_np (th, &status); + ; + return 0; +} +_ACEOF +if ac_fn_c_try_link "$LINENO"; then : + enable_linux_futex=yes +else + if test x$cross_compiling = xno; then + if getconf GNU_LIBPTHREAD_VERSION 2>/dev/null \ + | LC_ALL=C grep -i NPTL > /dev/null 2>/dev/null; then :; else + { $as_echo "$as_me:${as_lineno-$LINENO}: WARNING: The kernel might not support futex or gettid syscalls. +If so, please configure with --disable-linux-futex" >&5 +$as_echo "$as_me: WARNING: The kernel might not support futex or gettid syscalls. 
+If so, please configure with --disable-linux-futex" >&2;} + fi + fi + enable_linux_futex=yes +fi +rm -f core conftest.err conftest.$ac_objext \ + conftest$ac_exeext conftest.$ac_ext + LIBS="$save_LIBS" +fi +rm -f core conftest.err conftest.$ac_objext \ + conftest$ac_exeext conftest.$ac_ext + ;; + yes) + cat confdefs.h - <<_ACEOF >conftest.$ac_ext +/* end confdefs.h. */ +#include + int lk; +int +main () +{ +syscall (SYS_gettid); syscall (SYS_futex, &lk, 0, 0, 0); + ; + return 0; +} +_ACEOF +if ac_fn_c_try_link "$LINENO"; then : + +else + as_fn_error "SYS_gettid and SYS_futex required for --enable-linux-futex" "$LINENO" 5 +fi +rm -f core conftest.err conftest.$ac_objext \ + conftest$ac_exeext conftest.$ac_ext + ;; + esac + ;; + *) + enable_linux_futex=no + ;; +esac +if test x$enable_linux_futex = xyes; then + : +fi + + # See if we support thread-local storage. @@ -20078,6 +20189,14 @@ else ARCH_X86_FALSE= fi + if test $enable_linux_futex = yes; then + ARCH_FUTEX_TRUE= + ARCH_FUTEX_FALSE='#' +else + ARCH_FUTEX_TRUE='#' + ARCH_FUTEX_FALSE= +fi + ac_config_files="$ac_config_files Makefile testsuite/Makefile libitm.spec" @@ -20223,6 +20342,10 @@ if test -z "${ARCH_X86_TRUE}" && test -z "${ARCH_X86_FALSE}"; then as_fn_error "conditional \"ARCH_X86\" was never defined. Usually this means the macro was only invoked conditionally." "$LINENO" 5 fi +if test -z "${ARCH_FUTEX_TRUE}" && test -z "${ARCH_FUTEX_FALSE}"; then + as_fn_error "conditional \"ARCH_FUTEX\" was never defined. +Usually this means the macro was only invoked conditionally." "$LINENO" 5 +fi : ${CONFIG_STATUS=./config.status} ac_write_fail=0 diff --git a/libitm/configure.ac b/libitm/configure.ac index c6892d6..f2c7ce6 100644 --- a/libitm/configure.ac +++ b/libitm/configure.ac @@ -196,6 +196,8 @@ case "$host" in ;; esac +GCC_LINUX_FUTEX(:) + # See if we support thread-local storage. 
GCC_CHECK_TLS @@ -255,6 +257,7 @@ fi AC_SUBST(link_itm) AM_CONDITIONAL([ARCH_X86], [test "$ARCH" = x86]) +AM_CONDITIONAL([ARCH_FUTEX], [test $enable_linux_futex = yes]) AC_CONFIG_FILES(Makefile testsuite/Makefile libitm.spec) AC_OUTPUT diff --git a/libitm/configure.tgt b/libitm/configure.tgt index 68c81d4..853c082 100644 --- a/libitm/configure.tgt +++ b/libitm/configure.tgt @@ -89,6 +89,12 @@ config_path="$ARCH posix generic" # Other system configury case "${target}" in + *-*-linux*) + if test $enable_linux_futex = yes; then + config_path="linux/$ARCH linux $config_path" + fi + ;; + *-*-hpux11*) # HPUX v11.x requires -lrt to resolve sem_init in libgomp.la XLDFLAGS="${XLDFLAGS} -lrt" diff --git a/libitm/testsuite/Makefile.in b/libitm/testsuite/Makefile.in index 88ee292..ed1f314 100644 --- a/libitm/testsuite/Makefile.in +++ b/libitm/testsuite/Makefile.in @@ -40,6 +40,7 @@ ACLOCAL_M4 = $(top_srcdir)/aclocal.m4 am__aclocal_m4_deps = $(top_srcdir)/../config/acx.m4 \ $(top_srcdir)/../config/depstand.m4 \ $(top_srcdir)/../config/enable.m4 \ + $(top_srcdir)/../config/futex.m4 \ $(top_srcdir)/../config/lead-dot.m4 \ $(top_srcdir)/../config/mmap.m4 \ $(top_srcdir)/../config/multi.m4 \
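
A comment in write_lock_generic() mentions moving to C++0x atomics once they are available. As a rough illustration of the announce-then-check (Dekker-style) handshake that read_lock() and write_lock_generic() perform, here is a minimal sketch written with std::atomic. It makes several simplifying assumptions that do not hold for the patch itself: there is a single reader flag instead of the per-thread shared_state list, the functions give up instead of blocking on a futex, and there is no contended writer state. The sketch_rwlock type and its members are hypothetical names.

```cpp
// Sketch only: a non-blocking, single-reader model of the handshake.
// Nothing here is libitm API; all names are hypothetical.
#include <atomic>
#include <cassert>

struct sketch_rwlock
{
  std::atomic<int> writers{0};             // 0 = no writer, 1 = writer present
  std::atomic<bool> reader_active{false};  // stands in for tx->shared_state

  // Reader: announce intent first, then check for writers.  The default
  // seq_cst store and load play the role of the store, __sync_synchronize(),
  // load sequence in the patch.
  bool try_read_lock()
  {
    reader_active.store(true);
    if (writers.load() == 0)
      return true;
    // A writer is present: withdraw our intent so it is not held up.
    reader_active.store(false);
    return false;
  }

  void read_unlock() { reader_active.store(false); }

  // Writer: claim the writer slot first, then check for an active reader.
  bool try_write_lock()
  {
    int expected = 0;
    if (!writers.compare_exchange_strong(expected, 1))
      return false;                        // another writer got there first
    if (!reader_active.load())
      return true;
    // A reader is still active; the patch would wait here, the sketch
    // simply releases the writer slot and reports failure.
    writers.store(0);
    return false;
  }

  void write_unlock() { writers.store(0); }
};

// Single-threaded walk through the state transitions.
static void sketch_rwlock_selfcheck()
{
  sketch_rwlock l;
  assert(l.try_read_lock());    // no writer: reader gets in
  assert(!l.try_write_lock());  // reader active: writer must back off
  l.read_unlock();
  assert(l.try_write_lock());   // reader gone: writer gets in
  assert(!l.try_read_lock());   // writer present: reader must back off
  l.write_unlock();
}
```

Because each side stores its own flag before loading the other side's flag, and both accesses are sequentially consistent, a reader and a writer racing against each other cannot both observe the other's flag as clear. The patch relies on the same property, and layers the futex waits on top of it so that the losing side sleeps instead of returning false.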