From patchwork Fri Jan 15 11:30:50 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Daniel Engel
X-Patchwork-Id: 1426917
From: Daniel Engel
To: gcc-patches@gcc.gnu.org
Cc: Richard.Earnshaw@foss.arm.com
Subject: [PATCH v5 22/33] Import integer multiplication from the CM0 library
Date: Fri, 15 Jan 2021 03:30:50 -0800
Message-Id: <15701b463680f1f6d985b9aad13987dc39681b5a.1610709584.git.gnu@danielengel.com>
X-Mailer: git-send-email 2.25.1

gcc/libgcc/ChangeLog:
2021-01-07 Daniel Engel

	* config/arm/eabi/lmul.S: New file for __muldi3(), __mulsidi3(), and
	__umulsidi3().
	* config/arm/lib1funcs.S: #include eabi/lmul.S (v6m only).
	* config/arm/t-elf: Add the new objects to LIB1ASMFUNCS.
---
 libgcc/config/arm/eabi/lmul.S | 218 ++++++++++++++++++++++++++++++++++
 libgcc/config/arm/lib1funcs.S |   1 +
 libgcc/config/arm/t-elf       |  13 +-
 3 files changed, 230 insertions(+), 2 deletions(-)
 create mode 100644 libgcc/config/arm/eabi/lmul.S

diff --git a/libgcc/config/arm/eabi/lmul.S b/libgcc/config/arm/eabi/lmul.S
new file mode 100644
index 00000000000..9fec4364a26
--- /dev/null
+++ b/libgcc/config/arm/eabi/lmul.S
@@ -0,0 +1,218 @@
+/* lmul.S: Thumb-1 optimized 64-bit integer multiplication
+
+   Copyright (C) 2018-2021 Free Software Foundation, Inc.
+   Contributed by Daniel Engel, Senva Inc (gnu@danielengel.com)
+
+   This file is free software; you can redistribute it and/or modify it
+   under the terms of the GNU General Public License as published by the
+   Free Software Foundation; either version 3, or (at your option) any
+   later version.
+
+   This file is distributed in the hope that it will be useful, but
+   WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   General Public License for more details.
+
+   Under Section 7 of GPL version 3, you are granted additional
+   permissions described in the GCC Runtime Library Exception, version
+   3.1, as published by the Free Software Foundation.
+
+   You should have received a copy of the GNU General Public License and
+   a copy of the GCC Runtime Library Exception along with this program;
+   see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+   .  */
+
+
+#ifdef L_muldi3
+
+// long long __aeabi_lmul(long long, long long)
+// Returns the least significant 64 bits of a 64 bit multiplication.
+// Expects the two multiplicands in $r1:$r0 and $r3:$r2.
+// Returns the product in $r1:$r0 (does not distinguish signed types).
+// Uses $r4 and $r5 as scratch space.
+// Same parent section as __umulsidi3() to keep tail call branch within range.
+FUNC_START_SECTION muldi3 .text.sorted.libgcc.lmul.muldi3
+
+#ifndef __symbian__
+  FUNC_ALIAS aeabi_lmul muldi3
+#endif
+
+    CFI_START_FUNCTION
+
+        // $r1:$r0 = 0xDDDDCCCCBBBBAAAA
+        // $r3:$r2 = 0xZZZZYYYYXXXXWWWW
+
+        // The following operations that only affect the upper 64 bits
+        //  can be safely discarded:
+        //   DDDD * ZZZZ
+        //   DDDD * YYYY
+        //   DDDD * XXXX
+        //   CCCC * ZZZZ
+        //   CCCC * YYYY
+        //   BBBB * ZZZZ
+
+        // MAYBE: Test for multiply by ZERO on implementations with a 32-cycle
+        //  'muls' instruction, and skip over the operation in that case.
+
+        // (0xDDDDCCCC * 0xXXXXWWWW), free $r1
+        muls    xxh,    yyl
+
+        // (0xZZZZYYYY * 0xBBBBAAAA), free $r3
+        muls    yyh,    xxl
+        adds    yyh,    xxh
+
+        // Put the parameters in the correct form for umulsidi3().
+        movs    xxh,    yyl
+        b       LLSYM(__mul_overflow)
+
+    CFI_END_FUNCTION
+FUNC_END muldi3
+
+#ifndef __symbian__
+  FUNC_END aeabi_lmul
+#endif
+
+#endif /* L_muldi3 */
+
+
+// The following implementation of __umulsidi3() integrates with __muldi3()
+//  above to allow the fast tail call while still preserving the extra
+//  hi-shifted bits of the result.  However, these extra bits add a few
+//  instructions not otherwise required when using only __umulsidi3().
+// Therefore, this block configures __umulsidi3() for compilation twice.
+// The first version is a minimal standalone implementation, and the second
+//  version adds the hi bits of __muldi3().  The standalone version must
+//  be declared WEAK, so that the combined version can supersede it and
+//  provide both symbols in programs that multiply 64-bit integers.
+// This means '_umulsidi3' should appear before '_muldi3' in LIB1ASMFUNCS.
+#if defined(L_muldi3) || defined(L_umulsidi3)
+
+#ifdef L_umulsidi3
+// unsigned long long __umulsidi3(unsigned int, unsigned int)
+// Returns all 64 bits of a 32 bit multiplication.
+// Expects the two multiplicands in $r0 and $r1.
+// Returns the product in $r1:$r0.
+// Uses $r3, $r4 and $ip as scratch space.
+WEAK_START_SECTION umulsidi3 .text.sorted.libgcc.lmul.umulsidi3
+    CFI_START_FUNCTION
+
+#else /* L_muldi3 */
+FUNC_ENTRY umulsidi3
+    CFI_START_FUNCTION
+
+        // 32x32 multiply with 64 bit result.
+        // Expand the multiply into 4 parts, since muls only returns 32 bits.
+        //         (a16h * b16h / 2^32)
+        //       + (a16h * b16l / 2^48) + (a16l * b16h / 2^48)
+        //       + (a16l * b16l / 2^64)
+
+        // MAYBE: Test for multiply by 0 on implementations with a 32-cycle
+        //  'muls' instruction, and skip over the operation in that case.
+
+        eors    yyh,    yyh
+
+    LLSYM(__mul_overflow):
+        mov     ip,     yyh
+
+#endif /* !L_muldi3 */
+
+        // a16h * b16h
+        lsrs    r2,     xxl,    #16
+        lsrs    r3,     xxh,    #16
+        muls    r2,     r3
+
+      #ifdef L_muldi3
+        add     ip,     r2
+      #else
+        mov     ip,     r2
+      #endif
+
+        // a16l * b16h; save a16h first!
+        lsrs    r2,     xxl,    #16
+      #if (__ARM_ARCH >= 6)
+        uxth    xxl,    xxl
+      #else /* __ARM_ARCH < 6 */
+        lsls    xxl,    #16
+        lsrs    xxl,    #16
+      #endif
+        muls    r3,     xxl
+
+        // a16l * b16l
+      #if (__ARM_ARCH >= 6)
+        uxth    xxh,    xxh
+      #else /* __ARM_ARCH < 6 */
+        lsls    xxh,    #16
+        lsrs    xxh,    #16
+      #endif
+        muls    xxl,    xxh
+
+        // a16h * b16l
+        muls    xxh,    r2
+
+        // Distribute intermediate results.
+        eors    r2,     r2
+        adds    xxh,    r3
+        adcs    r2,     r2
+        lsls    r3,     xxh,    #16
+        lsrs    xxh,    #16
+        lsls    r2,     #16
+        adds    xxl,    r3
+        adcs    xxh,    r2
+
+        // Add in the high bits.
+        add     xxh,    ip
+
+        RET
+
+    CFI_END_FUNCTION
+FUNC_END umulsidi3
+
+#endif /* L_muldi3 || L_umulsidi3 */
+
+
+#ifdef L_mulsidi3
+
+// long long __mulsidi3(int, int)
+// Returns all 64 bits of a 32 bit signed multiplication.
+// Expects the two multiplicands in $r0 and $r1.
+// Returns the product in $r1:$r0.
+// Uses $r3, $r4 and $rT as scratch space.
+FUNC_START_SECTION mulsidi3 .text.sorted.libgcc.lmul.mulsidi3
+    CFI_START_FUNCTION
+
+        // Push registers for function call.
+        push    { rT, lr }
+                .cfi_remember_state
+                .cfi_adjust_cfa_offset 8
+                .cfi_rel_offset rT, 0
+                .cfi_rel_offset lr, 4
+
+        // Save signs of the arguments.
+        asrs    r3,     r0,     #31
+        asrs    rT,     r1,     #31
+
+        // Absolute value of the arguments.
+        eors    r0,     r3
+        eors    r1,     rT
+        subs    r0,     r3
+        subs    r1,     rT
+
+        // Save sign of the result.
+        eors    rT,     r3
+
+        bl      SYM(__umulsidi3) __PLT__
+
+        // Apply sign of the result.
+        eors    xxl,    rT
+        eors    xxh,    rT
+        subs    xxl,    rT
+        sbcs    xxh,    rT
+
+        pop     { rT, pc }
+                .cfi_restore_state
+
+    CFI_END_FUNCTION
+FUNC_END mulsidi3
+
+#endif /* L_mulsidi3 */
+
diff --git a/libgcc/config/arm/lib1funcs.S b/libgcc/config/arm/lib1funcs.S
index 97dd9f12e31..dc34ea76b15 100644
--- a/libgcc/config/arm/lib1funcs.S
+++ b/libgcc/config/arm/lib1funcs.S
@@ -1578,6 +1578,7 @@ LSYM(Lover12):
 #define PEDANTIC_DIV0 (1)
 #include "eabi/idiv.S"
 #include "eabi/ldiv.S"
+#include "eabi/lmul.S"
 #endif /* NOT_ISA_TARGET_32BIT */
 
 /* ------------------------------------------------------------------------ */
diff --git a/libgcc/config/arm/t-elf b/libgcc/config/arm/t-elf
index 4d430325fa1..eb1acd8d5a2 100644
--- a/libgcc/config/arm/t-elf
+++ b/libgcc/config/arm/t-elf
@@ -27,6 +27,13 @@ LIB1ASMFUNCS += \
 	_paritysi2 \
 	_popcountsi2 \
 
+ifeq (__ARM_ARCH_ISA_THUMB 1,$(ARM_ISA)$(THUMB1_ISA))
+# Group 0B: WEAK overridable function objects built for v6m only.
+LIB1ASMFUNCS += \
+	_muldi3 \
+
+endif
+
 # Group 1: Integer function objects.
 LIB1ASMFUNCS += \
@@ -51,11 +58,13 @@ LIB1ASMFUNCS += \
 
 
 ifeq (__ARM_ARCH_ISA_THUMB 1,$(ARM_ISA)$(THUMB1_ISA))
-# Group 1B: Integer functions built for v6m only.
+# Group 1B: Integer function objects built for v6m only.
 LIB1ASMFUNCS += \
 	_divdi3 \
 	_udivdi3 \
-
+	_mulsidi3 \
+	_umulsidi3 \
+
 endif
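
Reviewer note (not part of the patch): the arithmetic in lmul.S may be easier to check against a host-side model. The C sketch below is an illustration only, assuming the decomposition described in the patch comments: 16-bit partial products for the 32x32 case, a wrapped cross-term sum for the 64x64 case, and the absolute-value and sign-restore trick for the signed case. The names model_umulsidi3, model_muldi3, model_mulsidi3 and model_selfcheck are hypothetical and do not exist in libgcc, and the hi_extra parameter stands in for the value the assembly carries in ip.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of __umulsidi3(): 32x32 -> 64 built from four
   16x16 products.  hi_extra models the word that __muldi3() passes in
   through ip; the standalone routine corresponds to hi_extra == 0.  */
static uint64_t model_umulsidi3(uint32_t a, uint32_t b, uint32_t hi_extra)
{
    uint32_t ah = a >> 16, al = a & 0xFFFFu;
    uint32_t bh = b >> 16, bl = b & 0xFFFFu;

    uint32_t hh = ah * bh;              /* weight 2^32 */
    uint32_t lh = al * bh;              /* weight 2^16 */
    uint32_t hl = ah * bl;              /* weight 2^16 */
    uint32_t ll = al * bl;              /* weight 2^0  */

    /* Sum the two middle products; the carry out has weight 2^48.  */
    uint32_t mid = hl + lh;
    uint32_t mid_carry = (mid < hl) ? 1u : 0u;

    /* Distribute the middle sum across the low and high words.  */
    uint32_t lo = ll + (mid << 16);
    uint32_t lo_carry = (lo < ll) ? 1u : 0u;
    uint32_t hi = hh + hi_extra + (mid >> 16) + (mid_carry << 16) + lo_carry;

    return ((uint64_t)hi << 32) | lo;
}

/* Hypothetical model of __muldi3(): only the cross terms that land in
   the upper word are computed, then the low halves are multiplied.  */
static uint64_t model_muldi3(uint64_t x, uint64_t y)
{
    uint32_t xl = (uint32_t)x, xh = (uint32_t)(x >> 32);
    uint32_t yl = (uint32_t)y, yh = (uint32_t)(y >> 32);
    uint32_t cross = xh * yl + yh * xl; /* wraps modulo 2^32 by design */

    return model_umulsidi3(xl, yl, cross);
}

/* Hypothetical model of __mulsidi3(): multiply magnitudes, then negate
   the result if the signs differ.  The final cast assumes the usual
   two's-complement conversion.  */
static int64_t model_mulsidi3(int32_t a, int32_t b)
{
    uint32_t sa = (a < 0) ? 0xFFFFFFFFu : 0u;
    uint32_t sb = (b < 0) ? 0xFFFFFFFFu : 0u;
    uint32_t ua = ((uint32_t)a ^ sa) - sa;      /* |a| as unsigned */
    uint32_t ub = ((uint32_t)b ^ sb) - sb;      /* |b| as unsigned */
    uint64_t s = (sa ^ sb) ? UINT64_MAX : 0u;   /* sign mask of result */
    uint64_t p = model_umulsidi3(ua, ub, 0u);

    return (int64_t)((p ^ s) - s);
}

/* Compare each model against the compiler's native 64-bit arithmetic.  */
static int model_selfcheck(void)
{
    uint64_t x = 0x0123456789ABCDEFull, y = 0xFEDCBA9876543210ull;

    assert(model_umulsidi3(0xFFFFFFFFu, 0xFFFFFFFFu, 0u)
           == (uint64_t)0xFFFFFFFFu * 0xFFFFFFFFu);
    assert(model_muldi3(x, y) == x * y);
    assert(model_mulsidi3(-7, 6) == -42);
    assert(model_mulsidi3(INT32_MIN, 1) == (int64_t)INT32_MIN);
    return 1;
}
```

Calling model_selfcheck() from a test harness on the build host exercises all three models. It says nothing about register allocation, CFI, or the WEAK symbol ordering in LIB1ASMFUNCS, which still need to be checked on a v6m target.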