From patchwork Thu Nov 1 21:46:46 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 992092
From: Richard Henderson
To: gcc-patches@gcc.gnu.org
Cc: ramana.radhakrishnan@arm.com, agraf@suse.de, marcus.shawcroft@arm.com,
 james.greenhalgh@arm.com
Subject: [PATCH, AArch64, v3 4/6] aarch64: Add out-of-line functions for LSE atomics
Date: Thu, 1 Nov 2018 21:46:46 +0000
Message-Id: <20181101214648.29432-5-richard.henderson@linaro.org>
In-Reply-To: <20181101214648.29432-1-richard.henderson@linaro.org>
References: 
 <20181101214648.29432-1-richard.henderson@linaro.org>

This is the libgcc part of the interface -- providing the functions.
Rationale is provided at the top of libgcc/config/aarch64/lse.S.

	* config/aarch64/lse-init.c: New file.
	* config/aarch64/lse.S: New file.
	* config/aarch64/t-lse: New file.
	* config.host: Add t-lse to all aarch64 tuples.
---
 libgcc/config/aarch64/lse-init.c |  45 ++++++
 libgcc/config.host               |   4 +
 libgcc/config/aarch64/lse.S      | 238 +++++++++++++++++++++++++++++++
 libgcc/config/aarch64/t-lse      |  44 ++++++
 4 files changed, 331 insertions(+)
 create mode 100644 libgcc/config/aarch64/lse-init.c
 create mode 100644 libgcc/config/aarch64/lse.S
 create mode 100644 libgcc/config/aarch64/t-lse

diff --git a/libgcc/config/aarch64/lse-init.c b/libgcc/config/aarch64/lse-init.c
new file mode 100644
index 00000000000..03b4e1e8ea8
--- /dev/null
+++ b/libgcc/config/aarch64/lse-init.c
@@ -0,0 +1,45 @@
+/* Out-of-line LSE atomics for AArch64 architecture, Init.
+   Copyright (C) 2018 Free Software Foundation, Inc.
+   Contributed by Linaro Ltd.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+for more details.
+
+Under Section 7 of GPL version 3, you are granted additional
+permissions described in the GCC Runtime Library Exception, version
+3.1, as published by the Free Software Foundation.
+
+You should have received a copy of the GNU General Public License and
+a copy of the GCC Runtime Library Exception along with this program;
+see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+.  */
+
+/* Define the symbol gating the LSE implementations.  */
+_Bool __aa64_have_atomics
+  __attribute__((visibility("hidden"), nocommon));
+
+/* Disable initialization of __aa64_have_atomics during bootstrap.  */
+#ifndef inhibit_libc
+# include <sys/auxv.h>
+
+/* Disable initialization if the system headers are too old.  */
+# if defined(AT_HWCAP) && defined(HWCAP_ATOMICS)
+
+static void __attribute__((constructor))
+init_have_atomics (void)
+{
+  unsigned long hwcap = getauxval (AT_HWCAP);
+  __aa64_have_atomics = (hwcap & HWCAP_ATOMICS) != 0;
+}
+
+# endif /* HWCAP */
+#endif /* inhibit_libc */
diff --git a/libgcc/config.host b/libgcc/config.host
index 029f6569caf..7e9a8b6bc8f 100644
--- a/libgcc/config.host
+++ b/libgcc/config.host
@@ -340,23 +340,27 @@ aarch64*-*-elf | aarch64*-*-rtems*)
 	extra_parts="$extra_parts crtbegin.o crtend.o crti.o crtn.o"
 	extra_parts="$extra_parts crtfastmath.o"
 	tmake_file="${tmake_file} ${cpu_type}/t-aarch64"
+	tmake_file="${tmake_file} ${cpu_type}/t-lse t-slibgcc-libgcc"
 	tmake_file="${tmake_file} ${cpu_type}/t-softfp t-softfp t-crtfm"
 	md_unwind_header=aarch64/aarch64-unwind.h
 	;;
 aarch64*-*-freebsd*)
 	extra_parts="$extra_parts crtfastmath.o"
 	tmake_file="${tmake_file} ${cpu_type}/t-aarch64"
+	tmake_file="${tmake_file} ${cpu_type}/t-lse t-slibgcc-libgcc"
 	tmake_file="${tmake_file} ${cpu_type}/t-softfp t-softfp t-crtfm"
 	md_unwind_header=aarch64/freebsd-unwind.h
 	;;
 aarch64*-*-fuchsia*)
 	tmake_file="${tmake_file} ${cpu_type}/t-aarch64"
+	tmake_file="${tmake_file} ${cpu_type}/t-lse t-slibgcc-libgcc"
 	tmake_file="${tmake_file} ${cpu_type}/t-softfp t-softfp"
 	;;
 aarch64*-*-linux*)
 	extra_parts="$extra_parts crtfastmath.o"
 	md_unwind_header=aarch64/linux-unwind.h
 	tmake_file="${tmake_file} ${cpu_type}/t-aarch64"
+	tmake_file="${tmake_file} ${cpu_type}/t-lse t-slibgcc-libgcc"
 	tmake_file="${tmake_file} ${cpu_type}/t-softfp t-softfp t-crtfm"
 	;;
 alpha*-*-linux*)
diff --git a/libgcc/config/aarch64/lse.S b/libgcc/config/aarch64/lse.S
new file mode 100644
index 00000000000..3e42a6569af
--- /dev/null
+++ b/libgcc/config/aarch64/lse.S
@@ -0,0 +1,238 @@
+/* Out-of-line LSE atomics for AArch64 architecture.
+   Copyright (C) 2018 Free Software Foundation, Inc.
+   Contributed by Linaro Ltd.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+for more details.
+
+Under Section 7 of GPL version 3, you are granted additional
+permissions described in the GCC Runtime Library Exception, version
+3.1, as published by the Free Software Foundation.
+
+You should have received a copy of the GNU General Public License and
+a copy of the GCC Runtime Library Exception along with this program;
+see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+.  */
+
+/*
+ * The problem that we are trying to solve is operating system deployment
+ * of ARMv8.1-Atomics, also known as Large System Extensions (LSE).
+ *
+ * There are a number of potential solutions for this problem which have
+ * been proposed and rejected for various reasons.  To recap:
+ *
+ * (1) Multiple builds.  The dynamic linker will examine /lib64/atomics/
+ * if HWCAP_ATOMICS is set, allowing entire libraries to be overwritten.
+ * However, not all Linux distributions are happy with multiple builds,
+ * and anyway it has no effect on main applications.
+ *
+ * (2) IFUNC.  We could put these functions into libgcc_s.so, and have
+ * a single copy of each function for all DSOs.  However, ARM is concerned
+ * that the branch-to-indirect-branch that is implied by using a PLT,
+ * as required by IFUNC, is too much overhead for smaller cpus.
+ *
+ * (3) Statically predicted direct branches.  This is the approach that
+ * is taken here.  These functions are linked into every DSO that uses them.
+ * All of the symbols are hidden, so that the functions are called via a
+ * direct branch.  The choice of LSE vs non-LSE is done via one byte load
+ * followed by a well-predicted direct branch.  The functions are compiled
+ * separately to minimize code size.
+ */
+
+/* Tell the assembler to accept LSE instructions.  */
+	.arch armv8-a+lse
+
+/* Declare the symbol gating the LSE implementations.  */
+	.hidden	__aa64_have_atomics
+
+/* Turn size and memory model defines into mnemonic fragments.  */
+#if SIZE == 1
+# define S     b
+# define MASK  , uxtb
+#elif SIZE == 2
+# define S     h
+# define MASK  , uxth
+#elif SIZE == 4 || SIZE == 8 || SIZE == 16
+# define S
+# define MASK
+#else
+# error
+#endif
+
+#if MODEL == 1
+# define SUFF  _relax
+# define A
+# define L
+#elif MODEL == 2
+# define SUFF  _acq
+# define A     a
+# define L
+#elif MODEL == 3
+# define SUFF  _rel
+# define A
+# define L     l
+#elif MODEL == 4
+# define SUFF  _acq_rel
+# define A     a
+# define L     l
+#else
+# error
+#endif
+
+/* Concatenate symbols.  */
+#define glue2_(A, B)		A ## B
+#define glue2(A, B)		glue2_(A, B)
+#define glue3_(A, B, C)		A ## B ## C
+#define glue3(A, B, C)		glue3_(A, B, C)
+#define glue4_(A, B, C, D)	A ## B ## C ## D
+#define glue4(A, B, C, D)	glue4_(A, B, C, D)
+
+/* Select the size of a register, given a regno.  */
+#define x(N)			glue2(x, N)
+#define w(N)			glue2(w, N)
+#if SIZE < 8
+# define s(N)			w(N)
+#else
+# define s(N)			x(N)
+#endif
+
+#define NAME(BASE)		glue4(__aa64_, BASE, SIZE, SUFF)
+#define LDXR			glue4(ld, A, xr, S)
+#define STXR			glue4(st, L, xr, S)
+
+/* Temporary registers used.  Other than these, only the return value
+   register (x0) and the flags are modified.  */
+#define tmp0	16
+#define tmp1	17
+#define tmp2	15
+
+/* Start and end a function.  */
+.macro	STARTFN name
+	.text
+	.balign	16
+	.globl	\name
+	.hidden	\name
+	.type	\name, %function
+\name:
+.endm
+
+.macro	ENDFN name
+	.size	\name, . - \name
+.endm
+
+/* Branch to LABEL if LSE is enabled.
+   The branch should be easily predicted, in that it will, after constructors,
+   always branch the same way.  The expectation is that systems that implement
+   ARMv8.1-Atomics are "beefier" than those that omit the extension.
+   By arranging for the fall-through path to use load-store-exclusive insns,
+   we aid the branch predictor of the smallest cpus.  */
+.macro	JUMP_IF_LSE label
+	adrp	x(tmp0), __aa64_have_atomics
+	ldrb	w(tmp0), [x(tmp0), :lo12:__aa64_have_atomics]
+	cbnz	w(tmp0), \label
+.endm
+
+#ifdef L_cas
+
+STARTFN	NAME(cas)
+	JUMP_IF_LSE	8f
+
+#if SIZE < 16
+#define CAS	glue4(cas, A, L, S)
+
+	mov	s(tmp0), s(0)
+0:	LDXR	s(0), [x2]
+	cmp	s(0), s(tmp0) MASK
+	bne	1f
+	STXR	w(tmp1), s(1), [x2]
+	cbnz	w(tmp1), 0b
+1:	ret
+
+8:	CAS	s(0), s(1), [x2]
+	ret
+
+#else
+#define LDXP	glue3(ld, A, xp)
+#define STXP	glue3(st, L, xp)
+#define CASP	glue3(casp, A, L)
+
+	mov	x(tmp0), x0
+	mov	x(tmp1), x1
+0:	LDXP	x0, x1, [x4]
+	cmp	x0, x(tmp0)
+	ccmp	x1, x(tmp1), #0, eq
+	bne	1f
+	STXP	w(tmp2), x2, x3, [x4]
+	cbnz	w(tmp2), 0b
+1:	ret
+
+8:	CASP	x0, x1, x2, x3, [x4]
+	ret
+
+#endif
+
+ENDFN	NAME(cas)
+#endif
+
+#ifdef L_swp
+#define SWP	glue4(swp, A, L, S)
+
+STARTFN	NAME(swp)
+	JUMP_IF_LSE	8f
+
+	mov	s(tmp0), s(0)
+0:	LDXR	s(0), [x1]
+	STXR	w(tmp1), s(tmp0), [x1]
+	cbnz	w(tmp1), 0b
+	ret
+
+8:	SWP	s(0), s(0), [x1]
+	ret
+
+ENDFN	NAME(swp)
+#endif
+
+#if defined(L_ldadd) || defined(L_ldclr) \
+    || defined(L_ldeor) || defined(L_ldset)
+
+#ifdef L_ldadd
+#define LDNM	ldadd
+#define OP	add
+#elif defined(L_ldclr)
+#define LDNM	ldclr
+#define OP	bic
+#elif defined(L_ldeor)
+#define LDNM	ldeor
+#define OP	eor
+#elif defined(L_ldset)
+#define LDNM	ldset
+#define OP	orr
+#else
+#error
+#endif
+#define LDOP	glue4(LDNM, A, L, S)
+
+STARTFN	NAME(LDNM)
+	JUMP_IF_LSE	8f
+
+	mov	s(tmp0), s(0)
+0:	LDXR	s(0), [x1]
+	OP	s(tmp1), s(0), s(tmp0)
+	STXR	w(tmp2), s(tmp1), [x1]
+	cbnz	w(tmp2), 0b
+	ret
+
+8:	LDOP	s(0), s(0), [x1]
+	ret
+
+ENDFN	NAME(LDNM)
+#endif
diff --git a/libgcc/config/aarch64/t-lse b/libgcc/config/aarch64/t-lse
new file mode 100644
index 00000000000..c7f4223cd45
--- /dev/null
+++ b/libgcc/config/aarch64/t-lse
@@ -0,0 +1,44 @@
+# Out-of-line LSE atomics for AArch64 architecture.
+# Copyright (C) 2018 Free Software Foundation, Inc.
+# Contributed by Linaro Ltd.
+#
+# This file is part of GCC.
+#
+# GCC is free software; you can redistribute it and/or modify it
+# under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3, or (at your option)
+# any later version.
+#
+# GCC is distributed in the hope that it will be useful, but
+# WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+# General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with GCC; see the file COPYING3.  If not see
+# .
+
+# Compare-and-swap has 5 sizes and 4 memory models.
+S0 := $(foreach s, 1 2 4 8 16, $(addsuffix _$(s), cas))
+O0 := $(foreach m, 1 2 3 4, $(addsuffix _$(m)$(objext), $(S0)))
+
+# Swap, Load-and-operate have 4 sizes and 4 memory models
+S1 := $(foreach s, 1 2 4 8, $(addsuffix _$(s), swp ldadd ldclr ldeor ldset))
+O1 := $(foreach m, 1 2 3 4, $(addsuffix _$(m)$(objext), $(S1)))
+
+LSE_OBJS := $(O0) $(O1)
+
+libgcc-objects += $(LSE_OBJS) lse-init$(objext)
+
+empty =
+space = $(empty) $(empty)
+PAT_SPLIT = $(subst _,$(space),$(*F))
+PAT_BASE = $(word 1,$(PAT_SPLIT))
+PAT_N = $(word 2,$(PAT_SPLIT))
+PAT_M = $(word 3,$(PAT_SPLIT))
+
+lse-init$(objext): $(srcdir)/config/aarch64/lse-init.c
+	$(gcc_compile) -c $<
+
+$(LSE_OBJS): $(srcdir)/config/aarch64/lse.S
+	$(gcc_compile) -DL_$(PAT_BASE) -DSIZE=$(PAT_N) -DMODEL=$(PAT_M) -c $<
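
For readers who want to try the dispatch scheme described as approach (3) in
lse.S on an arbitrary host, the following is a minimal C sketch. It is an
illustration only and not part of the patch. The names have_atomics,
init_flag and ldadd4_acq_rel are hypothetical stand-ins for
__aa64_have_atomics, init_have_atomics and __aa64_ldadd4_acq_rel, and the GCC
__atomic builtins stand in for the LSE instruction and for the
load/store-exclusive retry loop.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__linux__)
# include <sys/auxv.h>	/* getauxval, AT_HWCAP; HWCAP_ATOMICS on AArch64.  */
#endif

/* Hypothetical stand-in for the hidden flag __aa64_have_atomics.  */
static bool have_atomics;

/* Runs before main, like init_have_atomics in lse-init.c.  On hosts
   that do not define HWCAP_ATOMICS the flag simply stays false.  */
static void __attribute__((constructor))
init_flag (void)
{
#if defined(AT_HWCAP) && defined(HWCAP_ATOMICS)
  have_atomics = (getauxval (AT_HWCAP) & HWCAP_ATOMICS) != 0;
#endif
}

/* Model of a 4-byte acq_rel fetch-and-add helper.  As in lse.S, the
   value comes first and the pointer second; the old value is returned.  */
static uint32_t
ldadd4_acq_rel (uint32_t val, uint32_t *ptr)
{
  assert (ptr != NULL);

  if (have_atomics)
    /* Stand-in for the single-instruction path (label 8: in lse.S).  */
    return __atomic_fetch_add (ptr, val, __ATOMIC_ACQ_REL);

  /* Stand-in for the fall-through load/store-exclusive retry loop.  */
  uint32_t old = __atomic_load_n (ptr, __ATOMIC_RELAXED);
  while (!__atomic_compare_exchange_n (ptr, &old, old + val, true,
				       __ATOMIC_ACQ_REL, __ATOMIC_RELAXED))
    ;	/* On failure OLD is refreshed with the current value; retry.  */
  return old;
}
```

Both paths give the same result: with v == 5, ldadd4_acq_rel (3, &v) returns 5
and leaves v == 8 whichever way the flag is set. The flag is written once at
startup and only read afterwards, which is what makes the branch cheap to
predict.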