From patchwork Thu Sep 22 14:14:11 2016
X-Patchwork-Submitter: John David Anglin
X-Patchwork-Id: 673424
From: John David Anglin
Date: Thu, 22 Sep 2016 10:14:11 -0400
Subject: [PATCH] hppa: Optimize atomic_compare_and_exchange_val_acq
Cc: deller@kernel.org, Carlos O'Donell, Mike Frysinger, Aurelien Jarno
To: GNU C Library
Message-Id: <58B70052-B987-4C41-B603-F3AAB2FDE34B@bell.net>

The attached patch replaces the conditional branch tests in
atomic_compare_and_exchange_val_acq with conditional instruction
nullification.  This avoids the stalls associated with conditional
branches and the resulting code is shorter.  There are no branches in
the fast path when the operation is successful.

The change was intended as an optimization, but tst-stack4 now passes.

Please install.

Thanks,
Dave
---
John David Anglin	dave.anglin@bell.net

2016-09-22  John David Anglin

	* sysdeps/unix/sysv/linux/hppa/atomic-machine.h: Don't include
	abort-instr.h.
	(EFAULT): Remove conditional define.
	(ENOSYS): Likewise.
	(atomic_compare_and_exchange_val_acq): Use instruction nullification
	instead of conditional branch instructions.

Index: glibc-2.23/sysdeps/unix/sysv/linux/hppa/atomic-machine.h
===================================================================
--- glibc-2.23.orig/sysdeps/unix/sysv/linux/hppa/atomic-machine.h
+++ glibc-2.23/sysdeps/unix/sysv/linux/hppa/atomic-machine.h
@@ -17,13 +17,6 @@
    .  */
 
 #include <stdint.h> /* Required for type definitions e.g. uint8_t.  */
-#include <abort-instr.h> /* Required for ABORT_INSTRUCTIUON.  */
-
-/* We need EFAULT, ENONSYS */
-#if !defined EFAULT && !defined ENOSYS
-#define EFAULT	14
-#define ENOSYS	251
-#endif
 
 #ifndef _ATOMIC_MACHINE_H
 #define _ATOMIC_MACHINE_H	1
@@ -62,7 +55,7 @@ typedef uintmax_t uatomic_max_t;
 #define _ASM_EDEADLOCK "-45"
 
 /* The only basic operation needed is compare and exchange.  The mem
-   pointer must be word aligned.  */
+   pointer must be word aligned.  We no longer loop on deadlock.  */
 #define atomic_compare_and_exchange_val_acq(mem, newval, oldval)	\
   ({									\
      register long lws_errno asm("r21");				\
@@ -74,20 +67,15 @@ typedef uintmax_t uatomic_max_t;
 	"0:					\n\t"		\
 	"ble	" _LWS "(%%sr2, %%r0)		\n\t"		\
 	"ldi	" _LWS_CAS ", %%r20		\n\t"		\
-	"ldi	" _ASM_EAGAIN ", %%r20		\n\t"		\
-	"cmpb,=,n %%r20, %%r21, 0b		\n\t"		\
-	"nop					\n\t"		\
-	"ldi	" _ASM_EDEADLOCK ", %%r20	\n\t"		\
-	"cmpb,=,n %%r20, %%r21, 0b		\n\t"		\
-	"nop					\n\t"		\
+	"cmpiclr,<> " _ASM_EAGAIN ", %%r21, %%r0\n\t"		\
+	"b,n 0b					\n\t"		\
+	"cmpclr,= %%r0, %%r21, %%r0		\n\t"		\
+	"iitlbp %%r0,(%%sr0, %%r0)		\n\t"		\
 	: "=r" (lws_ret), "=r" (lws_errno)			\
 	: "r" (lws_mem), "r" (lws_old), "r" (lws_new)		\
 	: _LWS_CLOBBER						\
 	);							\
 								\
-	if (lws_errno == -EFAULT || lws_errno == -ENOSYS)	\
-		ABORT_INSTRUCTION;				\
-								\
 	(__typeof (oldval)) lws_ret;				\
    })
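
For readers less familiar with PA-RISC nullification, the control flow of
the new sequence can be sketched in portable C.  This is only an
illustration under stated assumptions, not the glibc implementation:
mock_lws_cas, cas_val_acq_sketch and the busy counter are hypothetical
names that stand in for the kernel LWS call, and abort () stands in for
the trapping iitlbp instruction.  As the sketch reads the patch, the call
is retried only while the kernel reports EAGAIN, falls straight through
when the error code is zero, and traps on any other error (which would
now include EDEADLOCK).

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for the kernel LWS compare-and-swap helper.
   It reports -EAGAIN for the first *busy calls, then performs the
   compare-and-swap and reports success.  */
static long
mock_lws_cas (int *mem, int oldval, int newval, long *lws_errno, int *busy)
{
  if (*busy > 0)
    {
      --*busy;
      *lws_errno = -EAGAIN;
      return 0;
    }

  int prev = *mem;
  if (prev == oldval)
    *mem = newval;
  *lws_errno = 0;
  return prev;
}

/* Assumed control flow of the patched sequence, written as plain C.  */
static int
cas_val_acq_sketch (int *mem, int newval, int oldval, int *busy)
{
  long lws_errno;
  long lws_ret;

  /* cmpiclr,<> nullifies the "b,n 0b" unless the error is EAGAIN,
     so the call is retried only in that case.  */
  do
    lws_ret = mock_lws_cas (mem, oldval, newval, &lws_errno, busy);
  while (lws_errno == -EAGAIN);

  /* cmpclr,= nullifies the trapping instruction when the error is 0;
     abort () stands in for iitlbp here.  */
  if (lws_errno != 0)
    abort ();

  return (int) lws_ret;
}
```

With a word holding 5 and two simulated EAGAIN results, calling
cas_val_acq_sketch (&v, 7, 5, &busy) retries twice, stores 7 and returns
the previous value 5.  A later call with a stale oldval returns the
current value and leaves the word unchanged.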