From patchwork Wed Oct 25 14:10:02 2017
X-Patchwork-Submitter: Waiman Long
X-Patchwork-Id: 830296
From: Waiman Long
To: GLIBC Devel
Cc: Torvald Riegel, Carlos O'Donell, Thomas Gleixner, Waiman Long
Subject: [RFC] [PATCH 0/2] nptl: Support TP futexes in pthread mutex & rwlock
Date: Wed, 25 Oct 2017 10:10:02 -0400
Message-Id: <1508940604-7233-1-git-send-email-longman@redhat.com>

Throughput-Optimized (TP) futex is a new futex type that is being
proposed to be merged into the Linux kernel:

  v6: https://lkml.org/lkml/2017/3/22/654

This patchset enhances the pthread_mutex and pthread_rwlock APIs to
support the new TP futexes when they are available in the running
kernel.

For mutex, a user can designate the use of TP futexes by using

  pthread_mutexattr_setprotocol(attr, PTHREAD_THROUGHPUT_NP);

For rwlock, a user can designate the use of TP futexes by using

  pthread_rwlockattr_setkind_np(attr, PTHREAD_RWLOCK_USE_TP_FUTEX_NP);

A locking microbenchmark was run on a 2-socket 40-core Intel Gold 6148
system (HT on) for 5s with a 4.14 based Linux kernel (with TP futex
patch). A modified glibc (v2.26) with TP futex support was used for
measuring locking performance with standard mutex and rwlock versus
the TP futex versions.
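
As an illustration of the two attribute calls described above, here is a
minimal sketch of how an application might request TP futexes and still
build and run against a glibc without these patches. The guard macro
HAVE_PTHREAD_TP_FUTEX and the helper names init_tp_mutex and
init_tp_rwlock are hypothetical and not part of the patchset. The sketch
also assumes that an unsupported setting is reported through a nonzero
return value from the attribute setter or the init call; the cover
letter does not say how the patched glibc signals that case.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* exposes the *_np rwlock attribute calls in glibc */
#endif
#include <pthread.h>

/*
 * HAVE_PTHREAD_TP_FUTEX is a hypothetical build-time macro. Define it
 * only when building against a glibc that carries this patchset; the
 * two *_NP constants below do not exist in stock glibc.
 */

/* Initialize a mutex, preferring a TP futex and falling back to the
   default mutex if the request is rejected (assumed error path). */
int init_tp_mutex(pthread_mutex_t *mutex)
{
#ifdef HAVE_PTHREAD_TP_FUTEX
    pthread_mutexattr_t attr;
    if (pthread_mutexattr_init(&attr) == 0) {
        int err = pthread_mutexattr_setprotocol(&attr, PTHREAD_THROUGHPUT_NP);
        if (err == 0)
            err = pthread_mutex_init(mutex, &attr);
        pthread_mutexattr_destroy(&attr);
        if (err == 0)
            return 0;
        /* Otherwise fall through to the default initialization. */
    }
#endif
    return pthread_mutex_init(mutex, NULL);
}

/* Same idea for an rwlock, using the rwlock kind attribute. */
int init_tp_rwlock(pthread_rwlock_t *rwlock)
{
#ifdef HAVE_PTHREAD_TP_FUTEX
    pthread_rwlockattr_t attr;
    if (pthread_rwlockattr_init(&attr) == 0) {
        int err = pthread_rwlockattr_setkind_np(&attr,
                                                PTHREAD_RWLOCK_USE_TP_FUTEX_NP);
        if (err == 0)
            err = pthread_rwlock_init(rwlock, &attr);
        pthread_rwlockattr_destroy(&attr);
        if (err == 0)
            return 0;
        /* Otherwise fall through to the default initialization. */
    }
#endif
    return pthread_rwlock_init(rwlock, NULL);
}
```

Callers then use the usual pthread_mutex_lock/unlock and
pthread_rwlock_rdlock/wrlock/unlock calls unchanged. Without the guard
macro, both helpers reduce to default initialization.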

For mutex, the total locking ops (in 5s) with various numbers of
locking threads and critical section loads were as follows:

  threads/Load  pthread-mutex  pthread-tp-mutex  % change
  ------------  -------------  ----------------  --------
     10/10        41,758,248      60,517,142       44.9%
     20/10        53,331,622      60,349,448       13.2%
     40/10        47,683,927      52,437,558       10.0%
     80/10        30,632,089      56,019,758       82.9%
     10/50        27,155,380      42,259,331       55.6%
     20/50        34,292,402      36,979,265        7.8%
     40/50        30,510,507      37,317,158       22.3%
     80/50        17,800,497      41,251,263      131.7%

The TP version of the mutex outperformed the standard mutex in every
configuration tested, with gains ranging from 7.8% to 131.7%.

For rwlock, with mixed locking threads doing an equal number of read
and write locks, the total locking ops (in 5s) with various numbers of
locking threads and a fixed critical section load of 10 were as follows:

  threads  pthread-rwlock  pthread-rwlock  pthread-tp-rwlock
             (prefer-R)      (prefer-W)
  -------  --------------  --------------  -----------------
    10       25,365,336         994,356        59,150,680
    20        8,055,288       1,012,284        59,891,268
    40       17,750,188       1,269,368        56,844,898
    80       12,267,932       1,667,678        57,855,304

With separate reader and writer locking threads, the total locking
ops (in 5s) were as follows:

  threads     pthread-rwlock  pthread-rwlock  pthread-tp-rwlock
                (prefer-R)      (prefer-W)
  ----------  --------------  --------------  -----------------
   5 readers    27,450,747         100,971           668,412
   5 writers       168,557      12,456,028        32,128,358
  10 readers    34,341,123           4,151         3,428,028
  10 writers        15,880      15,030,632        30,132,666
  20 readers    34,935,993              20         7,707,514
  20 writers            26      18,557,431         8,983,738
  40 readers    25,636,457              41         6,500,411
  40 writers            40      22,634,557         5,584,316

Lock starvation happened for the standard glibc rwlock when the number
of contending threads was 40 or more (the 20-reader/20-writer rows and
beyond). The TP futex version of the rwlock, however, was fairer and
hence suffered somewhat in throughput when the number of contending
threads was large.

Waiman Long (2):
  nptl: Enable pthread mutex to use the TP futex
  nptl: Enable pthread rwlock to use the TP futex

 ChangeLog                                    |  26 +++
 nptl/pthreadP.h                              |  18 ++
 nptl/pthread_mutex_init.c                    |  27 +++
 nptl/pthread_mutex_lock.c                    |  49 +++++-
 nptl/pthread_mutex_timedlock.c               |  52 +++++-
 nptl/pthread_mutex_trylock.c                 |  20 ++-
 nptl/pthread_mutex_unlock.c                  |  20 ++-
 nptl/pthread_mutexattr_setprotocol.c         |   1 +
 nptl/pthread_rwlock_rdlock.c                 |   5 +-
 nptl/pthread_rwlock_timedrdlock.c            |   5 +-
 nptl/pthread_rwlock_timedwrlock.c            |   5 +-
 nptl/pthread_rwlock_tp.c                     | 235 +++++++++++++++++++++++++++
 nptl/pthread_rwlock_tryrdlock.c              |   5 +
 nptl/pthread_rwlock_trywrlock.c              |   5 +
 nptl/pthread_rwlock_unlock.c                 |  14 ++
 nptl/pthread_rwlock_wrlock.c                 |   5 +-
 nptl/pthread_rwlockattr_setkind_np.c         |  21 ++-
 sysdeps/nptl/pthread.h                       |   4 +
 sysdeps/unix/sysv/linux/hppa/pthread.h       |   3 +-
 sysdeps/unix/sysv/linux/lowlevellock-futex.h |   9 +
 20 files changed, 502 insertions(+), 27 deletions(-)
 create mode 100644 nptl/pthread_rwlock_tp.c