From patchwork Sun Nov 27 22:53:43 2011 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alan Modra X-Patchwork-Id: 127904 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) by ozlabs.org (Postfix) with SMTP id 1D1F2B6F6F for ; Mon, 28 Nov 2011 09:54:12 +1100 (EST) Received: (qmail 6552 invoked by alias); 27 Nov 2011 22:54:09 -0000 Received: (qmail 6544 invoked by uid 22791); 27 Nov 2011 22:54:07 -0000 X-SWARE-Spam-Status: No, hits=-2.6 required=5.0 tests=AWL, BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, FREEMAIL_FROM, RCVD_IN_DNSWL_LOW X-Spam-Check-By: sourceware.org Received: from mail-iy0-f175.google.com (HELO mail-iy0-f175.google.com) (209.85.210.175) by sourceware.org (qpsmtpd/0.43rc1) with ESMTP; Sun, 27 Nov 2011 22:53:51 +0000 Received: by iahk25 with SMTP id k25so8757556iah.20 for ; Sun, 27 Nov 2011 14:53:50 -0800 (PST) Received: by 10.50.168.65 with SMTP id zu1mr47452223igb.42.1322434430732; Sun, 27 Nov 2011 14:53:50 -0800 (PST) Received: from bubble.grove.modra.org ([115.187.252.19]) by mx.google.com with ESMTPS id g16sm37422210ibs.8.2011.11.27.14.53.48 (version=TLSv1/SSLv3 cipher=OTHER); Sun, 27 Nov 2011 14:53:50 -0800 (PST) Received: by bubble.grove.modra.org (Postfix, from userid 1000) id 899E8170C2BF; Mon, 28 Nov 2011 09:23:43 +1030 (CST) Date: Mon, 28 Nov 2011 09:23:43 +1030 From: Alan Modra To: Jakub Jelinek Cc: gcc-patches@gcc.gnu.org Subject: Re: Fix libgomp semaphores Message-ID: <20111127225343.GA7827@bubble.grove.modra.org> Mail-Followup-To: Jakub Jelinek , gcc-patches@gcc.gnu.org References: <20111125000328.GD5085@bubble.grove.modra.org> <20111125073839.GN27242@tyan-ft48-01.lab.bos.redhat.com> MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: <20111125073839.GN27242@tyan-ft48-01.lab.bos.redhat.com> User-Agent: Mutt/1.5.20 (2009-06-14) X-IsSubscribed: yes Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org On Fri, Nov 25, 2011 at 08:38:39AM +0100, Jakub Jelinek wrote: > Furthermore, I'd prefer if the patch could be split into smaller parts, > e.g. for bisecting purposes. One patch would do the mutex changes > to use new atomics, remove extra mutex.h headers and start using 0/1/-1 > instead of 0/1/2. And another patch would rewrite the semaphores. As requested, this is the semaphore rewrite. The code is modelleled on the existing mutex implementation, using a single int to hold 31 bits of count info and a wait flag bit. Like the mutexes, these semaphores have the nice property that no syscalls are made unless threads are waiting on the semaphore, and if we reach the contended case, there is an orderly transition back to uncontended. Also, I believe this implementation uses the bare minimum of synchronization instructions. Unlike the old semaphore implementation, I do not try to wake multiple threads in the sem_post_slow futex_wake call. That's because it is necessary to wake further threads if possible on exiting from sem_wait_slow, and given that is necessary, then waking multiple threads just leads to unnecessary futex_wake syscalls. You can see why the wake in sem_wait_slow is necessary by considering a semaphore set up to allow two threads to work on some job. If there were four threads trying to get the semaphore, two would succeed, A and B, and two, C and D, end up waiting in gomp_set_wait_slow. When A finishes its job, it will call gomp_sem_post to increment the semaphore, see and clear the wait flag, and call futex_wake on one thread. If B also finishes while this is happening, but before C has woken, then B's call to gomp_sem_post may not see the wait flag set, leaving D asleep. Incidentally, the regression I mentioned in http://gcc.gnu.org/ml/gcc-patches/2011-11/msg02393.html wasn't real. (I'd been testing for failures with a script that repeatedly hammered selected libgomp tests with while LD_LIBRARY_PATH=blahblah ./sometest; do :; done because a single gcc testsuite run often doesn't catch locking failures. The regression was due to a typo on LD_LIBRARY_PATH..) Bootstrapped and regression tested powerpc-linux. OK to apply? PR libgomp/51249 * config/linux/sem.h: Rewrite. * config/linux/sem.c: Rewrite. Index: libgomp/config/linux/sem.h =================================================================== --- libgomp/config/linux/sem.h (revision 181718) +++ libgomp/config/linux/sem.h (working copy) @@ -24,34 +24,74 @@ /* This is a Linux specific implementation of a semaphore synchronization mechanism for libgomp. This type is private to the library. This - implementation uses atomic instructions and the futex syscall. */ + counting semaphore implementation uses atomic instructions and the + futex syscall, and a single 32-bit int to store semaphore state. + The low 31 bits are the count, the top bit is a flag set when some + threads may be waiting. */ #ifndef GOMP_SEM_H #define GOMP_SEM_H 1 typedef int gomp_sem_t; -static inline void gomp_sem_init (gomp_sem_t *sem, int value) +enum memmodel +{ + MEMMODEL_RELAXED = 0, + MEMMODEL_CONSUME = 1, + MEMMODEL_ACQUIRE = 2, + MEMMODEL_RELEASE = 3, + MEMMODEL_ACQ_REL = 4, + MEMMODEL_SEQ_CST = 5, + MEMMODEL_LAST = 6 +}; + +extern void gomp_sem_wait_slow (gomp_sem_t *, int); +extern void gomp_sem_post_slow (gomp_sem_t *); + +static inline void +gomp_sem_init (gomp_sem_t *sem, int value) { *sem = value; } -extern void gomp_sem_wait_slow (gomp_sem_t *); -static inline void gomp_sem_wait (gomp_sem_t *sem) +static inline void +gomp_sem_destroy (gomp_sem_t *sem) { - if (!__sync_bool_compare_and_swap (sem, 1, 0)) - gomp_sem_wait_slow (sem); } -extern void gomp_sem_post_slow (gomp_sem_t *); -static inline void gomp_sem_post (gomp_sem_t *sem) +static inline void +gomp_sem_wait (gomp_sem_t *sem) { - if (!__sync_bool_compare_and_swap (sem, 0, 1)) - gomp_sem_post_slow (sem); + int count = *sem; + + while ((count & 0x7fffffff) != 0) + { + int oldval = count; + __atomic_compare_exchange_4 (sem, &oldval, count - 1, + false, MEMMODEL_ACQUIRE, MEMMODEL_RELAXED); + if (__builtin_expect (oldval == count, 1)) + return; + count = oldval; + } + gomp_sem_wait_slow (sem, count); } -static inline void gomp_sem_destroy (gomp_sem_t *sem) +static inline void +gomp_sem_post (gomp_sem_t *sem) { -} + int count = *sem; + + while (1) + { + int oldval = count; + __atomic_compare_exchange_4 (sem, &oldval, (count + 1) & 0x7fffffff, + false, MEMMODEL_RELEASE, MEMMODEL_RELAXED); + if (__builtin_expect (oldval == count, 1)) + break; + count = oldval; + } + if (__builtin_expect (count & 0x80000000, 0)) + gomp_sem_post_slow (sem); +} #endif /* GOMP_SEM_H */ Index: libgomp/config/linux/sem.c =================================================================== --- libgomp/config/linux/sem.c (revision 181718) +++ libgomp/config/linux/sem.c (working copy) @@ -28,34 +28,69 @@ #include "wait.h" - void -gomp_sem_wait_slow (gomp_sem_t *sem) +gomp_sem_wait_slow (gomp_sem_t *sem, int count) { - while (1) + int oldval, newval; + + /* First loop spins a while. */ + while (count == 0) { - int val = __sync_val_compare_and_swap (sem, 0, -1); - if (val > 0) + if (do_spin (sem, 0)) + { + /* Spin timeout, nothing changed. Set waiting flag. */ + oldval = 0; + __atomic_compare_exchange_4 (sem, &oldval, 0x80000000, false, + MEMMODEL_ACQUIRE, MEMMODEL_RELAXED); + count = oldval; + if (oldval == 0) + { + futex_wait (sem, 0x80000000); + count = *sem; + } + break; + } + /* Something changed. If positive, we're good to go. */ + else if ((count = *sem) > 0) { - if (__sync_bool_compare_and_swap (sem, val, val - 1)) + oldval = count; + __atomic_compare_exchange_4 (sem, &oldval, count - 1, false, + MEMMODEL_ACQUIRE, MEMMODEL_RELAXED); + if (oldval == count) return; + count = oldval; } - do_wait (sem, -1); + } + + /* Second loop waits until semaphore is posted. We always exit this + loop with wait flag set, so next post will awaken a thread. */ + while (1) + { + oldval = count; + newval = 0x80000000; + if ((count & 0x7fffffff) != 0) + newval |= count - 1; + __atomic_compare_exchange_4 (sem, &oldval, newval, false, + MEMMODEL_ACQUIRE, MEMMODEL_RELAXED); + if (oldval == count) + { + if ((count & 0x7fffffff) != 0) + { + /* If we can wake more threads, do so now. */ + if ((count & 0x7fffffff) > 1) + gomp_sem_post_slow (sem); + break; + } + do_wait (sem, 0x80000000); + count = *sem; + } + else + count = oldval; } } void gomp_sem_post_slow (gomp_sem_t *sem) { - int old, tmp = *sem, wake; - - do - { - old = tmp; - wake = old > 0 ? old + 1 : 1; - tmp = __sync_val_compare_and_swap (sem, old, wake); - } - while (old != tmp); - - futex_wake (sem, wake); + futex_wake (sem, 1); }